Additional primary cells based on risky primary cells
The algorithm uses parent-child relationships found from the model matrix (x).
PrimaryFromRiskyDefault(x, y, risky, candidates, allDims = FALSE)
Arguments
x: The model matrix
y: A vector of numeric values with a length equal to nrow(x)
risky: Indices to columns in x corresponding to primary cells classified as risky (interval limits not reached)
candidates: Indices to columns in x that are candidates for becoming additional primary cells. Higher order cells must be included so that parent-child relationships are seen.
allDims: When TRUE, a primary cell is added for each dimension. Can also be specified as a vector of length length(risky).
Returns
Additional primary cells as indices to columns in x.
Details
For a single risky cell, the algorithm can be formulated as:
1. Consider this cell as a child and identify all parents that are present in candidates.
2. Remove parents who are also parents of other parents (i.e., eliminate higher-level parents).
3. Identify the children of these remaining parents that are included in candidates.
4. Select the child that has the smallest value in the numeric variable (y).
For several risky cells, coordination takes place. See the comment below the examples.
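The single-cell steps above can be illustrated with a small base-R sketch. It is not the package's implementation: the helper name SketchPrimaryFromRisky, the toy 2 x 2 table, and the rule used to detect parents (column j is a parent of column i when j covers every inner cell of i and strictly more) are all assumptions made for illustration. Coordination across several risky cells and the allDims option are left out.

```r
# Hypothetical helper, NOT part of the package: a base-R sketch of the
# single-risky-cell steps, assuming x is a 0/1 dummy (model) matrix.
SketchPrimaryFromRisky <- function(x, y, risky, candidates) {
  x <- (as.matrix(x) != 0) + 0      # force a plain 0/1 numeric matrix
  y <- as.vector(y)
  p <- ncol(x)
  n <- colSums(x)                   # inner cells covered by each column
  shared <- crossprod(x)            # shared[i, j]: inner cells common to i and j
  # Assumed parent rule: is_parent[i, j] is TRUE when column j covers all
  # inner cells of column i (shared[i, j] == n[i]) and strictly more of them.
  is_parent <- (shared == n) & (matrix(n, p, p, byrow = TRUE) > n)

  # Step 1: parents of the risky cell that are present in candidates
  parents <- candidates[is_parent[risky, candidates]]
  # Step 2: drop parents that are themselves parents of another parent
  higher <- vapply(parents, function(j) any(is_parent[parents, j]), logical(1))
  parents <- parents[!higher]
  # Step 3: candidate children of the remaining parents
  is_child <- rowSums(is_parent[candidates, parents, drop = FALSE]) > 0
  children <- setdiff(candidates[is_child], risky)
  # Step 4: the child with the smallest y
  children[which.min(y[children])]
}

# Toy 2 x 2 table with margins; rows of x are the four inner cells.
x <- cbind(Total = c(1, 1, 1, 1),
           a1 = c(1, 1, 0, 0), a2 = c(0, 0, 1, 1),
           b1 = c(1, 0, 1, 0), b2 = c(0, 1, 0, 1),
           a1b1 = c(1, 0, 0, 0), a1b2 = c(0, 1, 0, 0),
           a2b1 = c(0, 0, 1, 0), a2b2 = c(0, 0, 0, 1))
freq <- c(1, 7, 4, 9)
y <- as.vector(crossprod(x, freq))  # cell totals, one per column of x

# Column 6 (a1b1) plays the risky cell; every other column is a candidate.
res <- SketchPrimaryFromRisky(x, y, risky = 6, candidates = c(1:5, 7:9))
res  # 8, i.e. a2b1
```

In this toy table the parents of a1b1 are Total, a1 and b1. Total is dropped in step 2 because it is a parent of both a1 and b1. The remaining parents contribute the candidate children a1b2 (y = 7) and a2b1 (y = 4), so the sketch selects column 8. The real function may treat ties, duplicated columns, and multiple risky cells differently.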
Examples
# Example inspired by suppression with maxN = 5
d1 <- SSBtoolsData("d1")
mm <- SSBtools::ModelMatrix(d1, dimVar = 1:2, crossTable = TRUE)
x <- mm$modelMatrix
y <- Matrix::crossprod(x, d1$freq)

risky <- c(13, 15, 40, 45)
candidates <- c(1:12, 14, 16, 17, 19, 21, 21, 24, 26:37, 39, 42, 44)

info <- rep("", length(y))
info[risky] <- "risky"
info[candidates] <- "c"
cbind(mm$crossTable, y = as.vector(y), info)

PrimaryFromRiskyDefault(x = x, y = y, risky = risky, candidates = candidates)
PrimaryFromRiskyDefault(x = x, y = y, risky = 40, candidates = candidates)

# The last solution (39) is not included in the first (28, 35).
# This is because 39 is not needed when 35 is already included.