Verified commit bec629f2 authored by Gärber, Florian

fix: Deprecate `num.trees` param of `getTreeranger()`

- Improved documentation
- Deprecated `num.trees` argument (now defaults to num.trees of the `RF` argument)
- Documented internal `getsingletree`
- Shortened `getsingletree` by consolidating separate operations
parent 67276f15
#' Get a list of structured trees for ranger
#' Get a list of structured trees from a ranger object.
#'
#' This function creates a list of trees for ranger objects, similar to what the getTree function does for randomForest objects.
#'
#' @param RF random forest object created by ranger (with keep.inbag=TRUE)
#' @param num.trees number of trees
#' @return a list with trees. Each row of the list elements corresponds to a node of the respective tree and the columns correspond to:
#' \itemize{
#' \item nodeID: ID of the respective node (important for left and right daughters in the next columns)
#' \item leftdaughter: ID of the left daughter of this node
#' \item rightdaughter: ID of the right daughter of this node
#' \item splitvariable: ID of the split variable
#' \item splitpoint: splitpoint of the split variable (for categorical variables this is a comma separated lists of values, representing the factor levels (in the original order) going to the right)
#' \item status: "0" for terminal and "1" for non-terminal
#' }
#' @param RF A [`ranger::ranger`] object which was created with `keep.inbag = TRUE`.
#' @param num.trees (Deprecated) Number of trees to convert (Default: `RF$num.trees`).
#'
#' @returns A list of tree data frames of length `RF$num.trees`.
#' Each row of the tree data frames corresponds to a node of the respective tree and the columns correspond to:
#' * `nodeID`: ID of the respective node (important for left and right daughters in the next columns)
#' * `leftdaughter`: ID of the left daughter of this node
#' * `rightdaughter`: ID of the right daughter of this node
#' * `splitvariable`: ID of the split variable
#' * `splitpoint`: Split point of the split variable.
#' For categorical variables this is a comma-separated list of values, representing the factor levels (in the original order) going to the right.
#' * `status`: `0` for terminal (`splitpoint` is `NA`) and `1` for non-terminal.
#'
#' @export
getTreeranger <- function(RF, num.trees) {
trees <- lapply(1:num.trees, getsingletree, RF = RF)
return(trees)
getTreeranger <- function(RF, num.trees = RF$num.trees) {
lapply(1:num.trees, getsingletree, RF = RF)
}
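
The new signature relies on R's lazy evaluation of default arguments: `RF$num.trees` is only looked up when the caller omits `num.trees`. The commit marks the argument as deprecated in the documentation only. As a minimal sketch (not part of this commit), the following shows how a warning could additionally be raised when a caller still passes the argument. The function `convert_trees` and the object `mock_rf` are hypothetical stand-ins, so the example runs without the ranger package.

```r
# Hypothetical stand-in for getTreeranger(); not the package's actual code.
convert_trees <- function(RF, num.trees = RF$num.trees) {
  # missing() is TRUE only when the caller did not supply num.trees,
  # even though the argument has a default value.
  if (!missing(num.trees)) {
    warning("`num.trees` is deprecated and may be removed in a future version.",
            call. = FALSE)
  }
  # seq_len() avoids the 1:0 pitfall should num.trees ever be 0.
  # The anonymous function is a placeholder for the per-tree conversion.
  lapply(seq_len(num.trees), function(k) paste0("tree_", k))
}

# Mock object: only the num.trees field matters for this sketch.
mock_rf <- list(num.trees = 3)

trees <- convert_trees(mock_rf)  # default is taken from mock_rf$num.trees
length(trees)
```

Calling `convert_trees(mock_rf, num.trees = 2)` would still work in this sketch, but it would emit the deprecation warning.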
#' getsingletree
#'
#' This is an internal function
#'
#' @param RF A [`ranger::ranger`] object.
#' @param k Tree index to convert.
#'
#' @returns A tree data frame for the `k`th tree in `RF`.
#' Each row of the tree data frame corresponds to a node of the tree and the columns correspond to:
#' * `nodeID`: ID of the respective node (important for left and right daughters in the next columns)
#' * `leftdaughter`: ID of the left daughter of this node
#' * `rightdaughter`: ID of the right daughter of this node
#' * `splitvariable`: ID of the split variable
#' * `splitpoint`: Split point of the split variable.
#' For categorical variables this is a comma-separated list of values, representing the factor levels (in the original order) going to the right.
#' * `status`: `0` for terminal (`splitpoint` is `NA`) and `1` for non-terminal.
#'
#' @keywords internal
getsingletree <- function(RF, k = 1) {
# Use ranger's treeInfo() to extract the tree; an earlier version did this with a self-implemented function.
tree.ranger <- ranger::treeInfo(RF, tree = k)
ktree <- data.frame(
as.numeric(tree.ranger$nodeID + 1),
as.numeric(tree.ranger$leftChild + 1),
as.numeric(tree.ranger$rightChild + 1),
as.numeric(tree.ranger$splitvarID + 1),
tree.ranger$splitval,
tree.ranger$terminal
nodeID = as.numeric(tree.ranger$nodeID + 1),
leftdaughter = as.numeric(tree.ranger$leftChild + 1),
rightdaughter = as.numeric(tree.ranger$rightChild + 1),
splitvariable = as.numeric(tree.ranger$splitvarID + 1),
splitpoint = tree.ranger$splitval,
status = as.numeric(!tree.ranger$terminal)
)
if (is.factor(ktree[, 5])) {
ktree[, 5] <- as.character(levels(ktree[, 5]))[ktree[, 5]]
if (is.factor(ktree[, "splitpoint"])) {
ktree[, "splitpoint"] <- as.character(levels(ktree[, "splitpoint"]))[ktree[, "splitpoint"]]
}
ktree[, 6] <- as.numeric(ktree[, 6] == FALSE)
for (i in 2:4) {
ktree[, i][is.na(ktree[, i])] <- 0
}
colnames(ktree) <- c("nodeID", "leftdaughter", "rightdaughter", "splitvariable", "splitpoint", "status")
ktree[, 2:4][is.na(ktree[, 2:4])] <- 0
return(ktree)
}
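
The consolidation in `getsingletree` replaces the per-column loop over columns 2 to 4 with a single matrix-indexed assignment. A small sketch on a made-up node table (the values are illustrative and not produced by ranger) suggests the two forms give the same result, and that `splitpoint` stays `NA` for terminal nodes as documented.

```r
# Toy node table mimicking the columns built in getsingletree();
# the values are invented for illustration.
ktree <- data.frame(
  nodeID        = c(1, 2, 3),
  leftdaughter  = c(2, NA, NA),
  rightdaughter = c(3, NA, NA),
  splitvariable = c(4, NA, NA),
  splitpoint    = c(0.5, NA, NA),
  status        = c(1, 0, 0)
)

# Old form: replace NA column by column.
loop_version <- ktree
for (i in 2:4) {
  loop_version[, i][is.na(loop_version[, i])] <- 0
}

# New form: is.na() on a data frame returns a logical matrix,
# which can index all three columns in one assignment.
vec_version <- ktree
vec_version[, 2:4][is.na(vec_version[, 2:4])] <- 0

# Both approaches should agree; splitpoint is deliberately left untouched.
all.equal(loop_version, vec_version)
vec_version
```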
......@@ -2,24 +2,26 @@
% Please edit documentation in R/getTreeranger.R
\name{getTreeranger}
\alias{getTreeranger}
\title{Get a list of structured trees for ranger}
\title{Get a list of structured trees from a ranger object.}
\usage{
getTreeranger(RF, num.trees)
getTreeranger(RF, num.trees = RF$num.trees)
}
\arguments{
\item{RF}{random forest object created by ranger (with keep.inbag=TRUE)}
\item{RF}{A \code{\link[ranger:ranger]{ranger::ranger}} object which was created with \code{keep.inbag = TRUE}.}
\item{num.trees}{number of trees}
\item{num.trees}{(Deprecated) Number of trees to convert (Default: \code{RF$num.trees}).}
}
\value{
a list with trees. Each row of the list elements corresponds to a node of the respective tree and the columns correspond to:
A list of tree data frames of length \code{RF$num.trees}.
Each row of the tree data frames corresponds to a node of the respective tree and the columns correspond to:
\itemize{
\item nodeID: ID of the respective node (important for left and right daughters in the next columns)
\item leftdaughter: ID of the left daughter of this node
\item rightdaughter: ID of the right daughter of this node
\item splitvariable: ID of the split variable
\item splitpoint: splitpoint of the split variable (for categorical variables this is a comma separated lists of values, representing the factor levels (in the original order) going to the right)
\item status: "0" for terminal and "1" for non-terminal
\item \code{nodeID}: ID of the respective node (important for left and right daughters in the next columns)
\item \code{leftdaughter}: ID of the left daughter of this node
\item \code{rightdaughter}: ID of the right daughter of this node
\item \code{splitvariable}: ID of the split variable
\item \code{splitpoint}: Split point of the split variable.
For categorical variables this is a comma-separated list of values, representing the factor levels (in the original order) going to the right.
\item \code{status}: \code{0} for terminal (\code{splitpoint} is \code{NA}) and \code{1} for non-terminal.
}
}
\description{
......
......@@ -6,6 +6,24 @@
\usage{
getsingletree(RF, k = 1)
}
\arguments{
\item{RF}{A \code{\link[ranger:ranger]{ranger::ranger}} object.}
\item{k}{Tree index to convert.}
}
\value{
A tree data frame for the \code{k}th tree in \code{RF}.
Each row of the tree data frame corresponds to a node of the tree and the columns correspond to:
\itemize{
\item \code{nodeID}: ID of the respective node (important for left and right daughters in the next columns)
\item \code{leftdaughter}: ID of the left daughter of this node
\item \code{rightdaughter}: ID of the right daughter of this node
\item \code{splitvariable}: ID of the split variable
\item \code{splitpoint}: Split point of the split variable.
For categorical variables this is a comma-separated list of values, representing the factor levels (in the original order) going to the right.
\item \code{status}: \code{0} for terminal (\code{splitpoint} is \code{NA}) and \code{1} for non-terminal.
}
}
\description{
This is an internal function
}
......