Intuitive Missing Data Imputation Framework
Missing data spike-in in various missing data patterns
Dataframe cleaning for missing data handling
Extraction of metadata from dataframes
Missing data imputation with various methods
Imputation algorithm tester on simulated data
Missing data spike-in in MAP pattern
Missing data spike-in in MAR pattern
Missing data spike-in in MCAR pattern
'missCompare': Missing Data Imputation Comparison Framework
Missing data spike-in in MNAR pattern
Post-imputation diagnostics
Simulation of matrix with no missingness
Testing the 'Amelia II' missing data imputation algorithm
Testing the 'Hmisc' aregImpute missing data imputation algorithm
Testing the 'VIM' kNN missing data imputation algorithm
Testing the mean imputation algorithm
Testing the median imputation algorithm
Testing the 'mi' missing data imputation algorithm
Testing the 'mice' mixed missing data imputation algorithm
Testing the 'missForest' missing data imputation algorithm
Testing the 'missMDA' EM missing data imputation algorithm
Testing the 'missMDA' regularized missing data imputation algorithm
Testing the 'pcaMethods' BPCA missing data imputation algorithm
Testing the 'pcaMethods' NIPALS missing data imputation algorithm
Testing the 'pcaMethods' NLPCA missing data imputation algorithm
Testing the 'pcaMethods' PPCA missing data imputation algorithm
Testing the 'pcaMethods' svdImpute missing data imputation algorithm
Testing the random replacement imputation algorithm
Offers a convenient pipeline to test and compare various missing data imputation algorithms on simulated and real data. These include simpler methods, such as mean and median imputation and random replacement, as well as more sophisticated algorithms already implemented in popular R packages, such as 'mi', described by Su et al. (2011) <doi:10.18637/jss.v045.i02>; 'mice', described by van Buuren and Groothuis-Oudshoorn (2011) <doi:10.18637/jss.v045.i03>; 'missForest', described by Stekhoven and Buhlmann (2012) <doi:10.1093/bioinformatics/btr597>; 'missMDA', described by Josse and Husson (2016) <doi:10.18637/jss.v070.i01>; and 'pcaMethods', described by Stacklies et al. (2007) <doi:10.1093/bioinformatics/btm069>. The central assumption behind 'missCompare' is that structurally different datasets (e.g. larger datasets with many correlated variables vs. smaller datasets with uncorrelated variables) will benefit differently from different missing data imputation algorithms. 'missCompare' takes measurements of your dataset and sets up a sandbox in which a curated list of standard and sophisticated missing data imputation algorithms is tried and compared under custom missingness patterns. Once the best performing algorithm has been selected in the simulations, 'missCompare' will also impute your real-life dataset. The package also provides various post-imputation diagnostics and visualizations to help you assess imputation performance.
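The simulate, spike-in, impute and compare idea described above can be illustrated with a minimal base R sketch. It deliberately does not call any 'missCompare' function: every name in it (X_full, mask, imputers, impute, rmse) is hypothetical and chosen for illustration only, the sample size, correlation and missingness fraction are arbitrary assumptions, and only the three simple methods (mean, median, random replacement) under an MCAR pattern are covered.

```r
# Illustrative base R sketch only; this is NOT the 'missCompare' implementation.
set.seed(42)

# 1. Simulate a complete matrix of correlated, normally distributed variables.
n <- 200                      # rows (assumed)
p <- 4                        # columns (assumed)
rho <- 0.6                    # common pairwise correlation (assumed)
sigma <- matrix(rho, p, p)
diag(sigma) <- 1
X_full <- matrix(rnorm(n * p), n, p) %*% chol(sigma)

# 2. Spike in missingness completely at random (MCAR): each cell is
#    masked independently with probability na_fraction.
na_fraction <- 0.2
mask <- matrix(runif(n * p) < na_fraction, n, p)
X_miss <- X_full
X_miss[mask] <- NA

# 3. Simple imputers: each receives the observed values of one column and
#    the number of missing cells, and returns the replacement values.
imputers <- list(
  mean   = function(obs, k) rep(mean(obs), k),
  median = function(obs, k) rep(median(obs), k),
  random = function(obs, k) sample(obs, k, replace = TRUE)
)

# Apply an imputer column by column.
impute <- function(X, fill) {
  for (j in seq_len(ncol(X))) {
    miss <- is.na(X[, j])
    X[miss, j] <- fill(X[!miss, j], sum(miss))
  }
  X
}

# 4. Compare methods by RMSE between imputed and true values,
#    evaluated only at the masked positions.
rmse <- function(X_imp) sqrt(mean((X_imp[mask] - X_full[mask])^2))
scores <- sapply(imputers, function(f) rmse(impute(X_miss, f)))
print(round(scores, 3))
```

A lower RMSE indicates that the imputed values lie closer to the values that were masked. Repeating the comparison over several iterations and over other missingness patterns (MAR, MNAR) would give a more stable ranking than a single run.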