Arguments
- self
a 'tidyFit' R6 class.
- data
a data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr).
Details
Hyperparameters:
- ntree (number of trees)
- mtry (number of variables randomly sampled at each split)
Important method arguments (passed to m)
The function provides a wrapper for randomForest::randomForest. See ?randomForest for more details.
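As a rough illustration of what the two hyperparameters control, the wrapped function can be called directly. This is a minimal sketch, not tidyfit's internal code: the built-in mtcars data set and the values chosen for ntree and mtry are assumptions made purely for the example.

```r
# Illustrative only: direct call to the wrapped function on a built-in data set
library(randomForest)

set.seed(42)  # forests are grown from random samples; fix the seed to reproduce

rf_fit <- randomForest(
  mpg ~ .,         # regress mpg on all remaining columns
  data  = mtcars,
  ntree = 200,     # number of trees
  mtry  = 3        # number of variables randomly sampled at each split
)

rf_fit$ntree  # number of trees grown
rf_fit$mtry   # number of variables tried at each split
```

When going through tidyfit, the same settings would normally be supplied as arguments to m() rather than to randomForest() directly.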
Implementation
The random forest is always fit with importance = TRUE. The feature importance values are extracted using coef().
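To see what importance = TRUE adds to a fit, the following sketch calls randomForest directly and inspects the stored importance measures. The mtcars data set is an assumption used only for illustration, and the code does not reproduce how tidyfit extracts or renames these values.

```r
# Illustrative only: importance measures stored by the underlying package
library(randomForest)

set.seed(42)  # fix the seed so the fit can be reproduced

rf_fit <- randomForest(mpg ~ ., data = mtcars, importance = TRUE)

imp <- importance(rf_fit)     # matrix with one row per predictor
colnames(imp)                 # "%IncMSE" "IncNodePurity" for a regression forest

imp_sd <- rf_fit$importanceSD # standard errors of the permutation importance
length(imp_sd)                # one value per predictor
```

For a regression forest these are the permutation-based measure, the node-purity measure, and the standard error of the former, which appear to correspond to the importance, IncNodePurity and importanceSD columns in the example output of this page.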
References
Liaw, A. and Wiener, M. (2002). Classification and Regression by randomForest. R News 2(3), 18–22.
See also
.fit.svm, .fit.boost and m methods
Examples
# Load data
data <- tidyfit::Factor_Industry_Returns
data <- dplyr::filter(data, Industry == "HiTec")
data <- dplyr::select(data, -Date, -Industry)
# Stand-alone function
fit <- m("rf", Return ~ ., data)
fit
#> # A tibble: 1 × 5
#>   estimator_fct              `size (MB)` grid_id  model_object settings
#>   <chr>                            <dbl> <chr>    <list>       <list>
#> 1 randomForest::randomForest        8.47 #0010000 <tidyFit>    <tibble>
# Within 'regress' function
fit <- regress(data, Return ~ ., m("rf"))
explain(fit)
#> Warning: using explain package 'randomForest'
#> # A tibble: 7 × 5
#> # Groups:   model [1]
#>   model term        importance IncNodePurity importanceSD
#>   <chr> <chr>            <dbl>         <dbl>        <dbl>
#> 1 rf    (Intercept)      0                0        0
#> 2 rf    Mkt-RF          38.2          14401.       0.448
#> 3 rf    SMB              1.18          2106.       0.101
#> 4 rf    HML              3.53          3198.       0.161
#> 5 rf    RMW              2.16          2527.       0.136
#> 6 rf    CMA              5.43          4345.       0.239
#> 7 rf    RF               0.583         1152.       0.0823