freqCalc function

Frequencies calculation for risk estimation

Computation and estimation of the sample and population frequency counts.

freqCalc(x, keyVars, w = NULL, alpha = 1)

Arguments

  • x: data frame or matrix
  • keyVars: key variables
  • w: column index of the weight variable. Should be set to NULL if the data represent a population rather than a sample.
  • alpha: numeric value between 0 and 1 specifying how much keys that contain missing values (NAs) contribute to the calculation of fk and Fk. With the default value of 1, the behavior is unchanged from prior versions: every wildcard match is counted. With alpha=0, keys with missing values are essentially ignored.

Returns

An object of class freqCalc with the following components:

  • freqCalc: data set

  • keyVars: variables used for frequency calculation

  • w: index of weight vector. NULL if you do not have a sample.

  • alpha: value of parameter alpha

  • fk: for each observation, the number of observations in the sample that share its combination of values in the key variables.

  • Fk: estimated frequency in the population

  • n1: number of observations with fk=1

  • n2: number of observations with fk=2
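
To make the returned quantities concrete, the following base R sketch computes fk, Fk, n1 and n2 by hand for complete data. It does not call freqCalc(); the toy data, the column names and the assumption that Fk is estimated as the sum of the sampling weights within each key are illustrative only.

```r
# Toy sample: two categorical key variables and a sampling weight.
# All values are made up for illustration.
toy <- data.frame(
  region = c("A", "A", "B", "B", "B", "C"),
  sex    = c("m", "m", "f", "f", "m", "f"),
  weight = c(10, 20, 5, 15, 30, 40)
)

# One factor level per observed combination of key values.
key <- interaction(toy$region, toy$sex, drop = TRUE)

# fk: number of sample records that share the record's key.
fk <- ave(rep(1, nrow(toy)), key, FUN = sum)

# Fk (assumption): sum of the sampling weights over records sharing the key.
Fk <- ave(toy$weight, key, FUN = sum)

# n1, n2: number of records whose key occurs once or twice in the sample.
n1 <- sum(fk == 1)
n2 <- sum(fk == 2)

cbind(toy, fk, Fk)
```

Records with fk=1 are sample uniques; these are the records that downstream risk measures typically treat as most exposed.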

Details

The function handles missing values in the data: a missing value stands for any of the possible categories of the variable considered. The function can be applied to large data sets with many (categorical) key variables, since the computation is done in C.
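
The wildcard interpretation of missing values can be illustrated with a small base R sketch. It is a naive pairwise comparison written for readability, not the package's C implementation, and it covers only the case where every wildcard match is counted in full; the toy data and the helper name matches are hypothetical.

```r
# Toy key variables with missing values (made-up data).
keys <- data.frame(
  region = c("A", "A", NA,  "B", "B"),
  sex    = c("m", NA,  "m", "f", "f"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE if records i and j agree on every key
# variable, treating NA as compatible with any category.
matches <- function(i, j) {
  a <- unlist(keys[i, ])
  b <- unlist(keys[j, ])
  all(is.na(a) | is.na(b) | a == b)
}

# For each record, count the records it matches (including itself).
n <- nrow(keys)
fk_wild <- sapply(seq_len(n), function(i) {
  sum(sapply(seq_len(n), function(j) matches(i, j)))
})

fk_wild
# 3 3 3 2 2
```

In this toy data, record 2 (region "A", sex missing) matches record 1 and also record 3, whose region is missing, so all three of these records end up with a count of 3. The counts freqCalc() reports for real data may differ from this simplified scheme.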

freqCalc() does not support sdcMicro S4 class objects.

Examples

data(francdat)
f <- freqCalc(francdat, keyVars = c(2, 4, 5, 6), w = 8)
f
f$freqCalc
f$fk
f$Fk

## with missings:
x <- francdat
x[3, 5] <- NA
x[4, 2] <- x[4, 4] <- NA
x[5, 6] <- NA
x[6, 2] <- NA
f2 <- freqCalc(x, keyVars = c(2, 4, 5, 6), w = 8)
cbind(f2$fk, f2$Fk)

## test parameter 'alpha'
f3a <- freqCalc(x, keyVars = c(2, 4, 5, 6), w = 8, alpha = 1)
f3b <- freqCalc(x, keyVars = c(2, 4, 5, 6), w = 8, alpha = 0.5)
f3c <- freqCalc(x, keyVars = c(2, 4, 5, 6), w = 8, alpha = 0.1)
data.frame(fka = f3a$fk, fkb = f3b$fk, fkc = f3c$fk)
data.frame(Fka = f3a$Fk, Fkb = f3b$Fk, Fkc = f3c$Fk)

References

See, e.g., https://research.cbs.nl/casc/deliv/12d1.pdf

Templ, M. Statistical Disclosure Control for Microdata Using the R-Package sdcMicro, Transactions on Data Privacy, vol. 1, number 2, pp. 67-85, 2008. https://www.tdp.cat/issues/abs.a004a08.php

Templ, M. New Developments in Statistical Disclosure Control and Imputation: Robust Statistics Applied to Official Statistics, Suedwestdeutscher Verlag fuer Hochschulschriften, 2009, ISBN: 3838108280, 264 pages.

Templ, M. Statistical Disclosure Control for Microdata: Methods and Applications in R. Springer International Publishing, 287 pages, 2017. ISBN 978-3-319-50272-4. doi:10.1007/978-3-319-50272-4

Templ, M. and Meindl, B.: Practical Applications in Statistical Disclosure Control Using R, Privacy and Anonymity in Information Management Systems New Techniques for New Practical Problems, Springer, 31-62, 2010, ISBN: 978-1-84996-237-7.

See Also

indivRisk, measure_risk

Author(s)

Bernhard Meindl