Tools and Tests for Experiments with Partially Synthetic Data Sets
Confidence intervals and standard errors of multiple imputation for a ...
Checking for equality in the features of two data sets.
Checking for logical consistency between two categorical variables in ...
Confidence intervals and standard errors for one synthetic categorical...
Calculates perturbation rates of overall data set and specific variabl...
Confidence intervals and standard errors for the cross-tabulation of t...
A set of functions to support experimentation with the utility of partially synthetic data sets. Every function compares an observed data set to one or more partially synthetic data sets derived from it in order to (1) check that the data sets have identical attributes, (2) calculate overall and variable-specific perturbation rates, (3) check for potential logical inconsistencies, and (4) calculate confidence intervals and standard errors for variables of interest across multiply imputed data sets. The confidence interval and standard error formulas offer options for either synthetic data sets or multiply imputed data sets. For more information on the formulas and methods used, see Reiter & Raghunathan (2007) <doi:10.1198/016214507000000932>.
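
The two numerical tasks in the description, perturbation rates and combined standard errors, can be illustrated with a minimal sketch. It is written in plain Python and is not the package's API: the names `perturbation_rate` and `combine_estimates` and the `kind` argument are hypothetical, chosen only for this example. The sketch assumes the standard combining rules for m data sets, where the total variance is ū + b/m for partially synthetic data and ū + (1 + 1/m)b for multiply imputed data (ū is the mean within-data-set variance and b is the between-data-set variance). The package's own formulas may differ in detail.

```python
import math
from statistics import mean, variance


def perturbation_rate(observed, synthetic):
    """Share of records whose value differs between two aligned columns."""
    if len(observed) != len(synthetic):
        raise ValueError("columns must have the same length")
    changed = sum(o != s for o, s in zip(observed, synthetic))
    return changed / len(observed)


def combine_estimates(estimates, variances, kind="synthetic"):
    """Combine per-data-set estimates into (point estimate, SE, df).

    estimates: one point estimate per synthetic or imputed data set
    variances: the matching within-data-set variance estimates
    kind: "synthetic" (partially synthetic rule) or "imputed"
          (multiple imputation rule); a hypothetical switch for this sketch
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two estimates with matching variances")

    q_bar = mean(estimates)   # combined point estimate
    u_bar = mean(variances)   # mean within-data-set variance
    b = variance(estimates)   # between-data-set variance (divisor m - 1)

    if kind == "synthetic":
        between = b / m
    elif kind == "imputed":
        between = (1 + 1 / m) * b
    else:
        raise ValueError("kind must be 'synthetic' or 'imputed'")

    total = u_bar + between
    # Degrees of freedom of the t reference distribution; infinite when
    # the estimates do not vary across data sets.
    df = (m - 1) * (1 + u_bar / between) ** 2 if between > 0 else math.inf
    return q_bar, math.sqrt(total), df
```

A confidence interval then follows as the point estimate plus or minus the standard error times the appropriate t quantile on `df` degrees of freedom. The quantile is left to a statistics library because the Python standard library does not provide one.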