Verified Commit 2684c94c authored by Gärber, Florian

refactor: `addLayer` to use `mclapply`

parent 9384d120
Type: Package
Package: RFSurrogates
Title: Surrogate Minimal Depth Variable Importance
-Version: 0.3.3.9000
+Version: 0.3.3.9001
Authors@R: c(
person("Stephan", "Seifert", , "stephan.seifert@uni-hamburg.de", role = c("aut", "cre"),
comment = c(ORCID = "0000-0003-2567-5728")),
......
# RFSurrogates (development version)
+* [`addLayer`]: Refactor for-loop to lapply.
+* Add `num.threads` param to enable parallelization using [`parallel::mclapply`]. It defaults to 1 for backward compatibility.
# RFSurrogates 0.3.3
* Fixed `meanAdjAgree()` bug which caused mean adjusted agreement pairings to be set to NA incorrectly when `variables` was a subset or differently ordered than `candidates`.
......
#' Add layer information to a forest that was created by getTreeranger
#'
#' This function adds the layer information to each node in a list with trees that was obtained by getTreeranger.
+#' You should use [`getTreeranger()`] with `add_layer = TRUE` instead.
+#'
+#' @param trees The output of [`getTreeranger()`].
+#' @param num.threads (Default: 1) Number of threads to spawn for parallelization.
+#'
+#' @returns A list of tree data frames of length `RF$num.trees`.
+#' Each row of the tree data frames corresponds to a node of the respective tree and the columns correspond to:
+#' * `nodeID`: ID of the respective node (important for left and right daughters in the next columns)
+#' * `leftdaughter`: ID of the left daughter of this node
+#' * `rightdaughter`: ID of the right daughter of this node
+#' * `splitvariable`: ID of the split variable
+#' * `splitpoint`: Split point of the split variable.
+#' For categorical variables this is a comma separated lists of values, representing the factor levels (in the original order) going to the right.
+#' * `status`: `0` for terminal (`splitpoint` is `NA`) and `1` for non-terminal.
+#' * `layer`: Tree layer depth information, starting at 0 (root node) and incremented for each layer.
 #'
-#' @param trees list of trees created by getTreeranger
-#' @return a list with trees. Each row of the list elements corresponds to a node of the respective tree and the columns correspond to:
-#' \itemize{
-#' \item nodeID: ID of the respective node (important for left and right daughters in the next columns)
-#' \item leftdaughter: ID of the left daughter of this node
-#' \item rightdaughter: ID of the right daughter of this node
-#' \item splitvariable: ID of the split variable
-#' \item splitpoint: splitpoint of the split variable
-#' \item status: "0" for terminal and "1" for non-terminal
-#' \item layer: layer information (0 means root node, 1 means 1 layer below root, etc)
-#' }
#' @export
-addLayer <- function(trees) {
-  # This function adds the respective layer to the different nodes in a tree. The tree has to be prepared by getTree function
-  tree.layer <- list()
-  num.trees <- length(trees)
-  for (i in 1:num.trees) {
-    tree <- trees[[i]]
-    layer <- rep(NA, nrow(tree))
-    layer[1] <- 0
-    t <- 1
-    while (anyNA(layer)) {
-      r <- unlist(tree[which(layer == (t - 1)), 2:3])
-      layer[r] <- t
-      t <- t + 1
-    }
-    tree <- cbind(tree, layer)
-    tree <- tree[order(as.numeric(tree[, "layer"])), ]
-    tree.layer[[i]] <- tree
+addLayer <- function(trees, num.threads = 1) {
+  parallel::mclapply(trees, add_layer_to_tree, mc.cores = num.threads)
+}
+#' Internal function
+#'
+#' This function adds the respective layer to the different nodes in a tree.
+#' The tree has to be prepared by getTree function.
+#'
+#' @param tree A tree data frame from [getTreeranger()].
+#'
+#' @returns A tree data frame with `layer` added.
+#'
+#' @seealso [addLayer()]
+#'
+#' @keywords internal
+add_layer_to_tree <- function(tree) {
+  layer <- rep(NA, nrow(tree))
+  layer[1] <- 0
+  t <- 1
+  while (anyNA(layer)) {
+    r <- unlist(tree[which(layer == (t - 1)), 2:3])
+    layer[r] <- t
+    t <- t + 1
   }
-  return(tree.layer)
+  tree <- cbind(tree, layer)
+  tree <- tree[order(as.numeric(tree[, "layer"])), ]
+  return(tree)
 }
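To make the per-tree step concrete, here is a minimal sketch that runs the same layer-assignment loop on a hand-built toy tree. The name `layer_sketch` and the toy data are hypothetical and not part of RFSurrogates. The sketch also assumes that terminal nodes store `0` in both daughter columns, which the diff does not state.

```r
# Hypothetical stand-in for the per-tree helper, so the example runs
# without RFSurrogates installed.
layer_sketch <- function(tree) {
  layer <- rep(NA, nrow(tree))
  layer[1] <- 0                      # row 1 is assumed to be the root
  t <- 1
  while (anyNA(layer)) {
    # Daughters (columns 2:3) of every node placed in the previous layer.
    r <- unlist(tree[which(layer == (t - 1)), 2:3])
    layer[r] <- t                    # zero indices (assumed terminal marker) are ignored
    t <- t + 1
  }
  tree <- cbind(tree, layer)
  tree[order(as.numeric(tree[, "layer"])), ]
}

# Toy tree: node 1 splits into 2 and 5, node 2 splits into 3 and 4.
toy <- data.frame(
  nodeID        = 1:5,
  leftdaughter  = c(2, 3, 0, 0, 0),
  rightdaughter = c(5, 4, 0, 0, 0),
  status        = c(1, 1, 0, 0, 0)
)

result <- layer_sketch(toy)
print(result[, c("nodeID", "layer")])
```

Node 5 is a direct daughter of the root, so it is sorted ahead of nodes 3 and 4 even though its ID is larger.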
......
@@ -4,23 +4,28 @@
\alias{addLayer}
\title{Add layer information to a forest that was created by getTreeranger}
\usage{
-addLayer(trees)
+addLayer(trees, num.threads = 1)
}
\arguments{
-\item{trees}{list of trees created by getTreeranger}
+\item{trees}{The output of \code{\link[=getTreeranger]{getTreeranger()}}.}
+
+\item{num.threads}{(Default: 1) Number of threads to spawn for parallelization.}
}
\value{
-a list with trees. Each row of the list elements corresponds to a node of the respective tree and the columns correspond to:
+A list of tree data frames of length \code{RF$num.trees}.
+Each row of the tree data frames corresponds to a node of the respective tree and the columns correspond to:
 \itemize{
-\item nodeID: ID of the respective node (important for left and right daughters in the next columns)
-\item leftdaughter: ID of the left daughter of this node
-\item rightdaughter: ID of the right daughter of this node
-\item splitvariable: ID of the split variable
-\item splitpoint: splitpoint of the split variable
-\item status: "0" for terminal and "1" for non-terminal
-\item layer: layer information (0 means root node, 1 means 1 layer below root, etc)
+\item \code{nodeID}: ID of the respective node (important for left and right daughters in the next columns)
+\item \code{leftdaughter}: ID of the left daughter of this node
+\item \code{rightdaughter}: ID of the right daughter of this node
+\item \code{splitvariable}: ID of the split variable
+\item \code{splitpoint}: Split point of the split variable.
+For categorical variables this is a comma separated lists of values, representing the factor levels (in the original order) going to the right.
+\item \code{status}: \code{0} for terminal (\code{splitpoint} is \code{NA}) and \code{1} for non-terminal.
+\item \code{layer}: Tree layer depth information, starting at 0 (root node) and incremented for each layer.
}
}
\description{
This function adds the layer information to each node in a list with trees that was obtained by getTreeranger.
+You should use \code{\link[=getTreeranger]{getTreeranger()}} with \code{add_layer = TRUE} instead.
}
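As a rough illustration of what the refactor changes for callers, the sketch below applies a per-tree function with `lapply` and with `parallel::mclapply` and checks that both give the same list. The helper `layer_sketch` and the toy forest are hypothetical stand-ins, not RFSurrogates code, and terminal nodes are assumed to store `0` as their daughter IDs. `mc.cores = 1` is used so the example runs on any platform; `mclapply` relies on forking, so values above 1 are not supported on Windows.

```r
# Hypothetical per-tree helper mirroring the loop body shown in the diff.
layer_sketch <- function(tree) {
  layer <- rep(NA, nrow(tree))
  layer[1] <- 0
  t <- 1
  while (anyNA(layer)) {
    r <- unlist(tree[which(layer == (t - 1)), 2:3])
    layer[r] <- t
    t <- t + 1
  }
  tree <- cbind(tree, layer)
  tree[order(as.numeric(tree[, "layer"])), ]
}

# Toy forest: three copies of the same three-node tree (root plus two leaves).
toy_tree <- data.frame(
  nodeID        = 1:3,
  leftdaughter  = c(2, 0, 0),
  rightdaughter = c(3, 0, 0)
)
forest <- list(toy_tree, toy_tree, toy_tree)

# Sequential version, equivalent to the old for-loop.
sequential <- lapply(forest, layer_sketch)

# mclapply version; raise mc.cores on Unix-alikes to use several processes.
forked <- parallel::mclapply(forest, layer_sketch, mc.cores = 1)

print(identical(sequential, forked))
```

Because each tree is processed independently, the per-tree function needs no shared state, which is what makes the switch from a loop to `mclapply` a drop-in change.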