rankings: Integer N×n matrix or data frame with full rankings in each row.
topk: Logical: whether the full rankings must be converted into top-k rankings (TRUE) or into partial rankings with missing data in arbitrary positions (FALSE). Defaults to TRUE.
nranked: Integer vector of length N with the desired number of positions to be retained in each partial sequence after censoring. If nranked = NULL (default), the number of positions is randomly generated according to the probabilities in the probs argument.
probs: Numeric vector of the (n−1) probabilities for the random generation of the number of positions to be retained in each partial sequence after censoring (normalization is not necessary). Used only if nranked = NULL. Defaults to equal probabilities.
Returns
A list of two named objects:
part_rankings: Integer N×n matrix with partial (censored) rankings in each row. Missing positions are coded as NA.
nranked: Integer vector of length N with the actual number of items ranked in each partial sequence after censoring.
Details
Both forms of partial rankings can be obtained in two ways: (i) by specifying, in the nranked argument, the number of positions to be retained in each partial ranking; (ii) by setting nranked = NULL (default) and specifying, in the probs argument, the probabilities of retaining respectively 1, 2, ..., (n−1) positions in the partial rankings (recall that a partial sequence with (n−1) observed entries corresponds to a full ranking).
When topk = FALSE, the exact positions to be retained in the partial sequences after censoring are uniformly generated, regardless of the specification of the nranked argument.
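The two censoring schemes described above can be illustrated with a minimal base R sketch. The helper censor_one() is hypothetical and is not the package's implementation; it assumes each row stores the rank assigned to each item, and it does not complete sequences with (n−1) observed entries to full rankings.

```r
# Hypothetical helper for illustration only (not the package's code).
# `rank_vec` is one full ranking: entry j is the rank assigned to item j.
censor_one <- function(rank_vec, k, topk = TRUE) {
  n <- length(rank_vec)
  out <- rep(NA_integer_, n)
  if (topk) {
    # Top-k censoring: keep the items ranked in positions 1, ..., k
    keep <- rank_vec <= k
  } else {
    # Arbitrary censoring: keep k positions drawn uniformly at random
    keep <- rank_vec %in% sample.int(n, k)
  }
  out[keep] <- rank_vec[keep]
  out
}

set.seed(1)
n <- 6
full <- sample.int(n)        # one simulated full ranking
probs <- 1:(n - 1)           # unnormalized weights for retaining 1, ..., n - 1 positions
# sample.int() rescales the weights, so normalization is not required
k <- sample.int(n - 1, size = 1, prob = probs)

top_k <- censor_one(full, k, topk = TRUE)
arbitrary <- censor_one(full, k, topk = FALSE)
```

In both cases exactly k entries are retained and the remaining ones are NA. The difference lies only in which positions survive: the first k for top-k censoring, a uniformly drawn set of k positions otherwise.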
Examples
## Example 1. Censoring the Antifragility dataset into partial top rankings
# Top-3 censoring (assigned number of top positions to be retained)
n <- 7
r_antifrag <- ranks_antifragility[, 1:n]
data_censoring(r_antifrag, topk = TRUE, nranked = rep(3, nrow(r_antifrag)))

# Random top-k censoring with assigned probabilities
set.seed(12345)
data_censoring(r_antifrag, topk = TRUE, probs = 1:(n - 1))

## Example 2. Simulate full rankings from a basic Mallows model with Spearman distance
n <- 10
N <- 100
set.seed(12345)
rankings <- rMSmix(sample_size = N, n_items = n)$samples

# Censoring in arbitrary positions with assigned number of ranks to be retained
set.seed(12345)
nranked <- round(runif(N, 0.5, 1) * n)
set.seed(12345)
arbitr_ranks1 <- data_censoring(rankings, topk = FALSE, nranked = nranked)
arbitr_ranks1
identical(arbitr_ranks1$nranked, nranked)

# Censoring in arbitrary positions with random number of ranks to be retained
set.seed(12345)
probs <- runif(n - 1, 0, 0.5)
set.seed(12345)
arbitr_ranks2 <- data_censoring(rankings, topk = FALSE, probs = probs)
arbitr_ranks2
prop.table(table(arbitr_ranks2$nranked))
round(prop.table(probs), 2)