Simulate data
Simulate data to illustrate the performance of prediction intervals for random forests.
sim_data(n = 500, p = 10, rho = 0.6, predictor_dist = "correlated", mean_function = "nonlinear-interaction", error_dist = "homoscedastic")
n
: Sample size
p
: Number of features
rho
: Correlation between predictors
predictor_dist
: Distribution of predictors: "uncorrelated" or "correlated"
mean_function
: Mean function: "linear", "nonlinear", or "nonlinear-interaction"
error_dist
: Distribution of errors: "homoscedastic", "heteroscedastic", or "heavy-tailed"

Returns a data.frame of simulated data.
train_data <- sim_data(n = 500, p = 10)
test_data <- sim_data(n = 500, p = 10)
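
This page does not say how sim_data() generates its output, so the following is only a rough sketch of what arguments such as n, p, rho, and the "correlated", "nonlinear-interaction", and "heteroscedastic" options could correspond to. The function sim_sketch(), the equicorrelation structure, the form of the mean function, and the error model are all assumptions made for illustration; none of them are taken from the package.

```r
# Hypothetical stand-in, NOT the package's sim_data(): it only illustrates
# one plausible meaning of n, p, rho and the correlated/heteroscedastic options.
sim_sketch <- function(n = 500, p = 10, rho = 0.6, correlated = TRUE,
                       heteroscedastic = FALSE) {
  # Assumed equicorrelation: every pair of predictors has correlation rho.
  Sigma <- diag(p)
  if (correlated) Sigma[Sigma == 0] <- rho
  # If Z has independent N(0, 1) entries, Z %*% chol(Sigma) has covariance Sigma.
  X <- matrix(rnorm(n * p), nrow = n) %*% chol(Sigma)
  colnames(X) <- paste0("x", seq_len(p))
  # Assumed nonlinear mean with one interaction term (needs p >= 3).
  mu <- sin(X[, 1]) + X[, 2] * X[, 3]
  # Error standard deviation: constant, or growing with |x1| (assumed form).
  sd_err <- if (heteroscedastic) 1 + abs(X[, 1]) else rep(1, n)
  data.frame(y = mu + rnorm(n, sd = sd_err), X)
}

set.seed(42)  # make the sketch reproducible
train_sketch <- sim_sketch(n = 500, p = 10)
test_sketch  <- sim_sketch(n = 500, p = 10)
dim(train_sketch)  # rows = n, columns = p predictors plus the response y
```

Drawing the training and test sets with two separate calls, as in the example above, keeps them independent, which is what an out-of-sample check of prediction-interval coverage needs.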