Verified Commit 8ba5d1ed authored by Gärber, Florian

feat: Add round_digits parameter to MAA

Refs: #15
parent d55825b5
Type: Package Type: Package
Package: RFSurrogates Package: RFSurrogates
Title: Surrogate Minimal Depth Variable Importance Title: Surrogate Minimal Depth Variable Importance
Version: 0.4.1 Version: 0.4.2
Authors@R: c( Authors@R: c(
person("Stephan", "Seifert", , "stephan.seifert@uni-hamburg.de", role = c("aut", "cre"), person("Stephan", "Seifert", , "stephan.seifert@uni-hamburg.de", role = c("aut", "cre"),
comment = c(ORCID = "0000-0003-2567-5728")), comment = c(ORCID = "0000-0003-2567-5728")),
......
# RFSurrogates 0.4.2
- Added new optional parameter to `MeanAdjustedAgreement` and `meanAdjAgree`: `round_digits` defaulting to 2. This offers the same behaviour as before, but allows rounding to different amounts of decimal places if desired.
# RFSurrogates 0.4.1 # RFSurrogates 0.4.1
- Fix `SurrogateMinimalDepth`'s result `$selected` returning modified names (#13) - Fix `SurrogateMinimalDepth`'s result `$selected` returning modified names (#13)
......
...@@ -11,6 +11,7 @@ ...@@ -11,6 +11,7 @@
#' @param candidates Vector of variable names that **are candidates to be related to the variables**. (Default: All variables used to create the random forest.) #' @param candidates Vector of variable names that **are candidates to be related to the variables**. (Default: All variables used to create the random forest.)
#' @param related (Default: TRUE) Whether related variables should be identified. #' @param related (Default: TRUE) Whether related variables should be identified.
#' @param num.threads (Default: 1) Number of threads used for determination of relations. #' @param num.threads (Default: 1) Number of threads used for determination of relations.
#' @param round_digits (Default: 2) Round mean adjusted agreement to this many digits.
#' #'
#' @return A `MeanAdjustedAgreement` list object: #' @return A `MeanAdjustedAgreement` list object:
#' * `RFS`: The original [RandomForestSurrogates()] object. #' * `RFS`: The original [RandomForestSurrogates()] object.
...@@ -48,8 +49,8 @@ MeanAdjustedAgreement <- function( ...@@ -48,8 +49,8 @@ MeanAdjustedAgreement <- function(
variables = RFS$ranger$forest$independent.variable.names, variables = RFS$ranger$forest$independent.variable.names,
candidates = RFS$ranger$forest$independent.variable.names, candidates = RFS$ranger$forest$independent.variable.names,
related = TRUE, related = TRUE,
num.threads = 1 num.threads = 1,
) { round_digits = 2) {
if (!inherits(RFS, "RandomForestSurrogates")) { if (!inherits(RFS, "RandomForestSurrogates")) {
stop("`RFS` must be a `RandomForestSurrogates` object.") stop("`RFS` must be a `RandomForestSurrogates` object.")
} }
...@@ -73,10 +74,11 @@ MeanAdjustedAgreement <- function( ...@@ -73,10 +74,11 @@ MeanAdjustedAgreement <- function(
t = t, t = t,
s.a = s$s.a, s.a = s$s.a,
select.var = related, select.var = related,
num.threads = num.threads num.threads = num.threads,
round_digits = round_digits
) )
results = list( results <- list(
RFS = RFS, RFS = RFS,
relations = maa$surr.res, relations = maa$surr.res,
threshold = maa$threshold threshold = maa$threshold
......
...@@ -10,6 +10,7 @@ ...@@ -10,6 +10,7 @@
#' @param s.a average number of surrogate variables (ideally calculated by count.surrogates function). #' @param s.a average number of surrogate variables (ideally calculated by count.surrogates function).
#' @param select.var set False if only relations should be calculated and no related variables should be selected. #' @param select.var set False if only relations should be calculated and no related variables should be selected.
#' @param num.threads number of threads used for parallel execution. Default is number of CPUs available. #' @param num.threads number of threads used for parallel execution. Default is number of CPUs available.
#' @param round_digits (Default: 2) Round mean adjusted agreement to this many digits in [mean.index].
#' #'
#' @returns A list containing: #' @returns A list containing:
#' * `variables`: the variables to which relations are investigated #' * `variables`: the variables to which relations are investigated
...@@ -18,7 +19,7 @@ ...@@ -18,7 +19,7 @@
#' * `surr.var`: binary matrix showing if the variables are related (1) or non-related (0) with variables in rows and candidates in columns. #' * `surr.var`: binary matrix showing if the variables are related (1) or non-related (0) with variables in rows and candidates in columns.
#' #'
#' @export #' @export
meanAdjAgree <- function(trees, variables, allvariables, candidates, t, s.a, select.var, num.threads = NULL) { meanAdjAgree <- function(trees, variables, allvariables, candidates, t, s.a, select.var, num.threads = NULL, round_digits = 2) {
num.trees <- length(trees) num.trees <- length(trees)
index.variables <- match(variables, allvariables) index.variables <- match(variables, allvariables)
index.candidates <- match(candidates, allvariables) index.candidates <- match(candidates, allvariables)
...@@ -39,7 +40,8 @@ meanAdjAgree <- function(trees, variables, allvariables, candidates, t, s.a, sel ...@@ -39,7 +40,8 @@ meanAdjAgree <- function(trees, variables, allvariables, candidates, t, s.a, sel
1:length(index.variables), 1:length(index.variables),
mean.index, mean.index,
list.res, list.res,
index.variables index.variables,
round_digits = round_digits
)), )),
ncol = length(candidates), nrow = length(variables), byrow = TRUE ncol = length(candidates), nrow = length(variables), byrow = TRUE
) )
...@@ -64,9 +66,9 @@ meanAdjAgree <- function(trees, variables, allvariables, candidates, t, s.a, sel ...@@ -64,9 +66,9 @@ meanAdjAgree <- function(trees, variables, allvariables, candidates, t, s.a, sel
#' This is an internal function #' This is an internal function
#' #'
#' @keywords internal #' @keywords internal
mean.index <- function(i, list.res, index.variables) { mean.index <- function(i, list.res, index.variables, round_digits = 2) {
list <- list.res[which(names(list.res) == index.variables[i])] list <- list.res[which(names(list.res) == index.variables[i])]
mean.list <- round(Reduce("+", list) / length(list), 2) mean.list <- round(Reduce("+", list) / length(list), digits = round_digits)
if (length(mean.list) > 0) { if (length(mean.list) > 0) {
return(mean.list) return(mean.list)
} else { } else {
......
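The effect of the new argument can be sketched outside the package. The snippet below is a minimal, base-R illustration of the averaging-then-rounding step shown in the hunk above. `mean_rounded` and `per_tree` are hypothetical names invented for this example, not part of RFSurrogates, and the input values are made up.

```r
# Hypothetical stand-in for the averaging step in the internal helper;
# this is an illustration, not the package's code.
mean_rounded <- function(values, round_digits = 2) {
  # Element-wise sum of equally sized numeric vectors, divided by their count,
  # then rounded to the requested number of decimal places.
  round(Reduce("+", values) / length(values), digits = round_digits)
}

# Made-up adjusted agreement values from two trees for two candidates.
per_tree <- list(
  c(0.1234, 0.5678),
  c(0.2344, 0.4310)
)

# Default of 2 digits, matching the previously hard-coded behaviour.
default_result <- mean_rounded(per_tree)

# More digits keep finer differences between candidates visible.
precise_result <- mean_rounded(per_tree, round_digits = 4)

print(default_result)
print(precise_result)
```

Keeping the default at 2 means existing callers see unchanged results, while callers who need finer resolution can opt in through `round_digits`.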
...@@ -10,34 +10,33 @@ MeanAdjustedAgreement( ...@@ -10,34 +10,33 @@ MeanAdjustedAgreement(
variables = RFS$ranger$forest$independent.variable.names, variables = RFS$ranger$forest$independent.variable.names,
candidates = RFS$ranger$forest$independent.variable.names, candidates = RFS$ranger$forest$independent.variable.names,
related = TRUE, related = TRUE,
num.threads = 1 num.threads = 1,
round_digits = 2
) )
} }
\arguments{ \arguments{
\item{RFS}{A \code{\link[=RandomForestSurrogates]{RandomForestSurrogates()}} object.} \item{RFS}{A [RandomForestSurrogates()] object.}
\item{t}{(Default: 5) Used to calculate threshold.} \item{t}{(Default: 5) Used to calculate threshold.}
\item{variables}{Vector of variable names for \strong{which related variables should be searched}. (Default: All variables used to create the random forest.)} \item{variables}{Vector of variable names for **which related variables should be searched**. (Default: All variables used to create the random forest.)}
\item{candidates}{Vector of variable names that \strong{are candidates to be related to the variables}. (Default: All variables used to create the random forest.)} \item{candidates}{Vector of variable names that **are candidates to be related to the variables**. (Default: All variables used to create the random forest.)}
\item{related}{(Default: TRUE) Whether related variables should be identified.} \item{related}{(Default: TRUE) Whether related variables should be identified.}
\item{num.threads}{(Default: 1) Number of threads used for determination of relations.} \item{num.threads}{(Default: 1) Number of threads used for determination of relations.}
\item{round_digits}{(Default: 2) Round mean adjusted agreement to this many digits.}
} }
\value{ \value{
A \code{MeanAdjustedAgreement} list object: A `MeanAdjustedAgreement` list object:
\itemize{ * `RFS`: The original [RandomForestSurrogates()] object.
\item \code{RFS}: The original \code{\link[=RandomForestSurrogates]{RandomForestSurrogates()}} object. * `relations`: Matrix with mean adjusted agreement values
\item \code{relations}: Matrix with mean adjusted agreement values * Rows: `variables`.
\itemize{ * Columns: `candidates`.
\item Rows: \code{variables}. * `threshold`: the threshold used to select related variables.
\item Columns: \code{candidates}. * `related`: A list of vectors for each `variable` containing related `candidates`. Only present if `related = TRUE` (Default).
}
\item \code{threshold}: the threshold used to select related variables.
\item \code{related}: A list of vectors for each \code{variable} containing related \code{candidates}. Only present if \code{related = TRUE} (Default).
}
} }
\description{ \description{
This function uses the mean adjusted agreement to select variables that are related to a defined variable using a threshold T. This function uses the mean adjusted agreement to select variables that are related to a defined variable using a threshold T.
......
...@@ -4,7 +4,7 @@ ...@@ -4,7 +4,7 @@
\alias{mean.index} \alias{mean.index}
\title{mean.index} \title{mean.index}
\usage{ \usage{
\method{mean}{index}(i, list.res, index.variables) \method{mean}{index}(i, list.res, index.variables, round_digits = 2)
} }
\description{ \description{
This is an internal function This is an internal function
......
...@@ -12,11 +12,12 @@ meanAdjAgree( ...@@ -12,11 +12,12 @@ meanAdjAgree(
t, t,
s.a, s.a,
select.var, select.var,
num.threads = NULL num.threads = NULL,
round_digits = 2
) )
} }
\arguments{ \arguments{
\item{trees}{list of trees created by \code{\link[=getTreeranger]{getTreeranger()}}, \code{\link[=addLayer]{addLayer()}} and \code{\link[=addSurrogates]{addSurrogates()}}.} \item{trees}{list of trees created by [getTreeranger()], [addLayer()] and [addSurrogates()].}
\item{variables}{vector of variable names.} \item{variables}{vector of variable names.}
...@@ -24,22 +25,22 @@ meanAdjAgree( ...@@ -24,22 +25,22 @@ meanAdjAgree(
\item{candidates}{vector of variable names (strings) that are candidates to be related to the variables (has to be contained in allvariables)} \item{candidates}{vector of variable names (strings) that are candidates to be related to the variables (has to be contained in allvariables)}
\item{t}{variable to calculate threshold. Used if \code{select.var = TRUE}.} \item{t}{variable to calculate threshold. Used if `select.var = TRUE`.}
\item{s.a}{average number of surrogate variables (ideally calculated by count.surrogates function).} \item{s.a}{average number of surrogate variables (ideally calculated by count.surrogates function).}
\item{select.var}{set False if only relations should be calculated and no related variables should be selected.} \item{select.var}{set False if only relations should be calculated and no related variables should be selected.}
\item{num.threads}{number of threads used for parallel execution. Default is number of CPUs available.} \item{num.threads}{number of threads used for parallel execution. Default is number of CPUs available.}
\item{round_digits}{(Default: 2) Round mean adjusted agreement to this many digits in [mean.index].}
} }
\value{ \value{
A list containing: A list containing:
\itemize{ * `variables`: the variables to which relations are investigated
\item \code{variables}: the variables to which relations are investigated * `surr.res`: matrix with mean adjusted agreement values and variables investigated in rows and candidate variables in columns
\item \code{surr.res}: matrix with mean adjusted agreement values and variables investigated in rows and candidate variables in columns * `threshold`: the threshold used to create surr.var from surr.res
\item \code{threshold}: the threshold used to create surr.var from surr.res * `surr.var`: binary matrix showing if the variables are related (1) or non-related (0) with variables in rows and candidates in columns.
\item \code{surr.var}: binary matrix showing if the variables are related (1) or non-related (0) with variables in rows and candidates in columns.
}
} }
\description{ \description{
This is the main function of var.relations function. This is the main function of var.relations function.
......