sim_data function

Simulate data

Simulate data to illustrate the performance of prediction intervals for random forests.

sim_data(n = 500, p = 10, rho = 0.6, predictor_dist = "correlated", mean_function = "nonlinear-interaction", error_dist = "homoscedastic")

Arguments

  • n: Sample size
  • p: Number of features
  • rho: Correlation between predictors
  • predictor_dist: Distribution of the predictors: "uncorrelated" or "correlated"
  • mean_function: Mean function: "linear", "nonlinear", or "nonlinear-interaction"
  • error_dist: Distribution of the errors: "homoscedastic", "heteroscedastic", or "heavy-tailed"

Returns

A data.frame of simulated data.

Examples

train_data <- sim_data(n = 500, p = 10)
test_data <- sim_data(n = 500, p = 10)
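
This page does not show how sim_data() generates its output, so the following is only a rough sketch of what arguments such as rho, predictor_dist, and error_dist could control. The function name sketch_sim_data, the equicorrelated normal predictors, the linear mean in the first two predictors, the particular error distributions, and the output column names are all assumptions made for illustration. None of them are taken from the package.

```r
# Hypothetical stand-in for illustration only; NOT the package's sim_data().
sketch_sim_data <- function(n = 500, p = 10, rho = 0.6,
                            predictor_dist = c("correlated", "uncorrelated"),
                            error_dist = c("homoscedastic", "heteroscedastic",
                                           "heavy-tailed")) {
  predictor_dist <- match.arg(predictor_dist)
  error_dist <- match.arg(error_dist)
  stopifnot(p >= 2)

  # Assumption: standard normal predictors, with every pair sharing
  # correlation rho when predictor_dist is "correlated".
  sigma <- diag(p)
  if (predictor_dist == "correlated") {
    sigma <- matrix(rho, nrow = p, ncol = p)
    diag(sigma) <- 1
  }

  # chol() returns an upper-triangular R with t(R) %*% R equal to sigma,
  # so the rows of Z %*% R have covariance sigma.
  x <- matrix(rnorm(n * p), nrow = n, ncol = p) %*% chol(sigma)
  colnames(x) <- paste0("x", seq_len(p))

  # Assumption: a simple linear mean in the first two predictors.
  mu <- x[, 1] + x[, 2]

  # Assumption: one plausible error distribution per option.
  eps <- switch(error_dist,
    "homoscedastic"   = rnorm(n),                         # constant variance
    "heteroscedastic" = rnorm(n, sd = sqrt(1 + abs(mu))), # variance grows with |mean|
    "heavy-tailed"    = rt(n, df = 3)                     # Student t, 3 df
  )

  data.frame(y = mu + eps, x)
}

set.seed(1)
d <- sketch_sim_data(n = 200, p = 5, rho = 0.6)
dim(d)  # 200 rows, 6 columns (y plus 5 predictors)
```

Calling set.seed() before each simulation makes the draws reproducible, which is useful when comparing interval methods on the same training and test sets.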