Multiple Imputation using Chained Random Forests
Convert variables to factors
Generate missing (completely at random) cells in a data set
Perform multiple imputation using the empirical error distributions an...
Perform multiple imputation based on the conditional distribution form...
Perform multiple imputation based on the conditional distribution form...
Univariate sampler function for mixed types of variables for predictio...
Univariate sampler function for mixed types of variables for node-base...
Univariate sampler function for categorical variables for prediction-b...
Univariate sampler function for continuous variables using the empiric...
Univariate sampler function for continuous variables for prediction-ba...
Identify corresponding observation indexes under the terminal nodes f...
Identify corresponding observed values for the response variable under...
Remove unnecessary arguments for ranger function
Get regression estimates for pooled object
RfEmpImp: Multiple Imputation using Chained Random Forests
An R package for multiple imputation using chained random forests. The implemented methods handle missing data in mixed types of variables by using prediction-based or node-based conditional distributions constructed with random forests. For prediction-based imputation of continuous variables, two methods are provided: one based on the empirical distribution of out-of-bag prediction errors of random forests, and one that assumes normally distributed prediction errors. Still within prediction-based imputation, categorical variables are imputed using predicted probabilities. For node-based imputation, two methods are provided: one based on the conditional distribution formed by the predicting nodes of random forests, and one based on proximity measures of random forests. The statistical methods are described in more detail in Hong et al. (2020) <arXiv:2004.14823>.
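The sampling ideas named in the description can be illustrated with a minimal, language-neutral sketch. It is written in Python with the standard library only and is not the package's R interface: the function names, the toy out-of-bag errors, the class probabilities, and the node donor values are all hypothetical. It also assumes the random forest has already been fitted, so the prediction, the out-of-bag errors, and the node membership are simply given as inputs.

```python
import random


def draw_continuous(prediction, oob_errors, rng):
    """Prediction-based draw (assumed form): the forest prediction plus one
    error sampled from the empirical out-of-bag error distribution."""
    return prediction + rng.choice(oob_errors)


def draw_categorical(class_probs, rng):
    """Prediction-based draw for a categorical variable: sample one label
    with probability equal to its predicted class probability."""
    labels = list(class_probs)
    weights = [class_probs[label] for label in labels]
    return rng.choices(labels, weights=weights, k=1)[0]


def draw_from_node(node_values, rng):
    """Node-based draw (assumed form): sample one observed value from the
    observations that fall in the same predicting node."""
    return rng.choice(node_values)


rng = random.Random(0)  # fixed seed so repeated runs give the same draws

# Hypothetical inputs standing in for the output of a fitted forest.
oob_errors = [-1.5, -0.5, 0.0, 0.5, 1.5]
class_probs = {"low": 0.2, "mid": 0.5, "high": 0.3}
node_values = [9.8, 10.1, 10.4]

# Five draws for one missing cell, one per imputed data set.
continuous_draws = [draw_continuous(10.0, oob_errors, rng) for _ in range(5)]
categorical_draws = [draw_categorical(class_probs, rng) for _ in range(5)]
node_draws = [draw_from_node(node_values, rng) for _ in range(5)]

print(continuous_draws)
print(categorical_draws)
print(node_draws)
```

Each missing cell receives a random draw rather than a single point prediction, which is why the imputed data sets differ from one another and why the variation across them can reflect imputation uncertainty.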
Useful links