dat: The microdata set. A numeric matrix or data frame containing the data.
keyVars: Indices or names of categorical key variables. They must, of course, match with the columns of dat .
numVars: Index or names of continuous key variables.
pramVars: Indices or names of categorical variables considered to be pramed.
ghostVars: if specified a list which each element being a list of exactly two elements. The first element must be a character vector specifying exactly one variable name that was also specified as a categorical key variable (keyVars), while the second element is a character vector of valid variable names (that must not be listed as keyVars). If localSuppression or kAnon was applied, the resulting suppression pattern for each key-variable is transferred to the depending variables.
weightVar: Indices or name determining the vector of sampling weights.
hhId: Index or name of the cluster ID (if available).
strataVar: Indices or names of stratification variables.
sensibleVar: Indices or names of sensible variables (for l-diversity)
excludeVars: which variables of dat should not be included in result-object? Users may specify a vector of variable-names available in dat
that were not specified in either keyVars, numVars, pramVars, ghostVars, hhId, strataVar or sensibleVar.
options: additional options (if specified, a list must be used as input)
seed: (numeric) number specifiying the seed which will be set to allow for reproducablity. The number will be rounded and saved as element seed in slot options.
randomizeRecords: (logical) if TRUE, the order of observations in the input microdata set will be randomized.
alpha: numeric between 0 and 1 specifying the fraction on how much keys containing NAs should contribute to the frequency calculation which is also crucial for risk-estimation.
object: a sdcMicroObj-class object
value: NULL or a character vector of length 1 specifying a valid variable name
Returns
a sdcMicroObj-class object
an object of class sdcMicroObj with modified slot @strataVar
Objects from the Class
Objects can be created by calls of the form new("sdcMicroObj", ...).
Examples
## we can also specify ghost (linked) variables## these variables are linked to some categorical key variables## and have the sampe suppression pattern as the variable that they## are linked to after \code{\link{localSuppression}} has been applieddata(testdata)testdata$electcon2 <- testdata$electcon
testdata$electcon3 <- testdata$electcon
testdata$water2 <- testdata$water
keyVars <- c("urbrur","roof","walls","water","electcon","relat","sex")numVars <- c("expend","income","savings")w <-"sampling_weight"## we want to make sure that some variables not used as key-variables## have the same suppression pattern as variables that have been## selected as key variables. Thus, we are using 'ghost'-variables.ghostVars <- list()## we want variables 'electcon2' and 'electcon3' to be linked## to key-variable 'electcon'ghostVars[[1]]<- list()ghostVars[[1]][[1]]<-"electcon"ghostVars[[1]][[2]]<- c("electcon2","electcon3")## donttest because Examples with CPU time > 2.5 times elapsed time## we want variable 'water2' to be linked to key-variable 'water'ghostVars[[2]]<- list()ghostVars[[2]][[1]]<-"water"ghostVars[[2]][[2]]<-"water2"## create the sdcMicroObjobj <- createSdcObj(testdata, keyVars=keyVars, numVars=numVars, w=w, ghostVars=ghostVars)## apply 3-anonymity to selected key variablesobj <- kAnon(obj, k=3); obj
## check, if the suppression patterns are identicalmanipGhostVars <- get.sdcMicroObj(obj,"manipGhostVars")manipKeyVars <- get.sdcMicroObj(obj,"manipKeyVars")all(is.na(manipKeyVars$electcon)== is.na(manipGhostVars$electcon2))all(is.na(manipKeyVars$electcon)== is.na(manipGhostVars$electcon3))all(is.na(manipKeyVars$water)== is.na(manipGhostVars$water2))## exclude some variablesobj <- createSdcObj(testdata, keyVars=c("urbrur","roof","walls"), numVars="savings", weightVar=w, excludeVars=c("relat","electcon","hhcivil","ori_hid","expend"))colnames(get.sdcMicroObj(obj,"origData"))
References
Templ, M. and Meindl, B. and Kowarik, A.: Statistical Disclosure Control for Micro-Data Using the R Package sdcMicro, Journal of Statistical Software, 67 (4), 1--36, 2015. tools:::Rd_expr_doi("10.18637/jss.v067.i04")
Author(s)
Bernhard Meindl, Alexander Kowarik, Matthias Templ, Elias Rut