regressionImp function

Regression Imputation

Regression Imputation

Impute missing values based on a regression model.

regressionImp( formula, data, family = "AUTO", robust = FALSE, imp_var = TRUE, imp_suffix = "imp", mod_cat = FALSE )

Arguments

  • formula: model formula to impute one variable
  • data: A data.frame containing the data
  • family: family argument for glm(). "AUTO" (the default) tries to choose automatically and is the only really tested option!!!
  • robust: TRUE/FALSE if robust regression should be used. See details.
  • imp_var: TRUE/FALSE if a TRUE/FALSE variables for each imputed variable should be created show the imputation status
  • imp_suffix: suffix used for TF imputation variables
  • mod_cat: TRUE/FALSE if TRUE for categorical variables the level with the highest prediction probability is selected, otherwise it is sampled according to the probabilities.

Returns

the imputed data set.

Details

lm() is used for family "normal" and glm() for all other families. (robust=TRUE: lmrob(), glmrob())

Examples

data(sleep) sleepImp1 <- regressionImp(Dream+NonD~BodyWgt+BrainWgt,data=sleep) sleepImp2 <- regressionImp(Sleep+Gest+Span+Dream+NonD~BodyWgt+BrainWgt,data=sleep) data(testdata) imp_testdata1 <- regressionImp(b1+b2~x1+x2,data=testdata$wna) imp_testdata3 <- regressionImp(x1~x2,data=testdata$wna,robust=TRUE)

References

A. Kowarik, M. Templ (2016) Imputation with R package VIM. Journal of Statistical Software, 74(7), 1-16.

See Also

Other imputation methods: hotdeck(), impPCA(), irmi(), kNN(), matchImpute(), medianSamp(), rangerImpute(), sampleCat()

Author(s)

Alexander Kowarik