rMSmix function

Random samples from a mixture of Mallows models with Spearman distance

Draw random samples of full rankings from a mixture of Mallows models with Spearman distance.

rMSmix(
  sample_size = 1,
  n_items,
  n_clust = 1,
  rho = NULL,
  theta = NULL,
  weights = NULL,
  uniform = FALSE,
  mh = TRUE
)

Arguments

  • sample_size: Number of full rankings to be sampled. Defaults to 1.
  • n_items: Number of items.
  • n_clust: Number of mixture components. Defaults to 1.
  • rho: Integer $G \times n$ matrix with the component-specific consensus rankings in each row. Defaults to NULL, meaning that the consensus rankings are randomly generated according to the sampling scheme indicated by the uniform argument. See Details.
  • theta: Numeric vector of $G$ non-negative component-specific precision parameters. Defaults to NULL, meaning that the concentrations are uniformly generated from an interval containing typical values for the precisions. See Details.
  • weights: Numeric vector of $G$ positive mixture weights (normalization is not necessary). Defaults to NULL, meaning that the mixture weights are randomly generated according to the sampling scheme indicated by the uniform argument. See Details.
  • uniform: Logical: whether rho or weights have to be sampled uniformly on their support. When uniform = FALSE they are sampled, respectively, to ensure separation among mixture components and populated weights. Used when $G > 1$ and either rho or weights are NULL (see Details). Defaults to FALSE.
  • mh: Logical: whether the samples must be drawn with the Metropolis-Hastings (MH) scheme implemented in the BayesMallows package, rather than by direct sampling from the Mallows probability distribution. For n_items > 10, the MH scheme is always applied to speed up the sampling procedure. Defaults to TRUE.

Returns

A list of the following named components:

  • samples: Integer $N \times n$ matrix with the sample_size simulated full rankings in each row.

  • rho: Integer $G \times n$ matrix with the component-specific consensus rankings used for the simulation in each row.

  • theta: Numeric vector of the $G$ component-specific precision parameters used for the simulation.

  • weights: Numeric vector of the $G$ mixture weights used for the simulation.

  • classification: Integer vector of the sample_size component membership labels.

Details

When n_items > 10 or mh = TRUE, the random samples are obtained with the Metropolis-Hastings algorithm described in Vitelli et al. (2018) and implemented in the sample_mallows function of the BayesMallows package.
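For contrast with the MH scheme, direct sampling for a single component can be sketched in base R by enumerating all permutations and sampling them with probability proportional to $\exp(-\theta\, d(r, \rho))$, where $d$ is the Spearman distance (sum of squared rank differences). The helpers spearman_dist, all_perms and rmallows_direct are hypothetical names introduced for illustration; this is not necessarily how rMSmix implements mh = FALSE.

```r
# Spearman distance: sum of squared rank differences between two full rankings.
spearman_dist <- function(r, rho) sum((r - rho)^2)

# All permutations of 1:n as rows of a matrix (n! rows, so small n only).
all_perms <- function(n) {
  if (n == 1L) return(matrix(1L, 1, 1))
  prev <- all_perms(n - 1L)
  do.call(rbind, lapply(seq_len(n), function(pos) {
    # Insert item n at position pos of every shorter permutation.
    t(apply(prev, 1, function(p) append(p, n, after = pos - 1L)))
  }))
}

# Hypothetical direct sampler for one Mallows component (assumption: the
# probability of a ranking r is proportional to exp(-theta * d(r, rho))).
rmallows_direct <- function(sample_size, rho, theta) {
  perms <- all_perms(length(rho))
  d <- apply(perms, 1, spearman_dist, rho = rho)
  prob <- exp(-theta * d)
  prob <- prob / sum(prob)
  idx <- sample.int(nrow(perms), size = sample_size, replace = TRUE, prob = prob)
  perms[idx, , drop = FALSE]
}

set.seed(12345)
rmallows_direct(sample_size = 5, rho = 1:4, theta = 0.5)
```

Because the enumeration grows as $n!$, this approach quickly becomes impractical, which is consistent with MH being applied for n_items > 10.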

When theta = NULL, the concentration parameters are randomly generated from a uniform distribution on the interval $(1/n^{2}, 3/n^{1.5})$, which contains typical values for the precisions.
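A minimal base-R sketch of this default, assuming $n$ denotes n_items and that one value is drawn independently per component; draw_theta is a hypothetical helper, not a function of the package.

```r
# Hypothetical helper mirroring the documented default for theta = NULL:
# independent uniform draws on (1/n^2, 3/n^1.5), with n = n_items.
draw_theta <- function(n_clust, n_items) {
  runif(n_clust, min = 1 / n_items^2, max = 3 / n_items^1.5)
}

set.seed(12345)
draw_theta(n_clust = 3, n_items = 9)  # values lie between 1/81 and 3/27
```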

When uniform = FALSE, the mixing weights are sampled from a symmetric Dirichlet distribution with shape parameters all equal to $2G$, to favor populated and balanced clusters, and the consensus parameters are sampled to favor well-separated clusters, i.e., at Spearman distance of at least $\frac{2}{G}\binom{n+1}{3}$ from each other.
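The two schemes can be illustrated in base R. The sketch below draws symmetric Dirichlet weights by normalizing Gamma variates and obtains separated consensus rankings by simple rejection of uniformly drawn permutations. The rejection step is an assumption made for illustration, since the sampling scheme actually used for rho is not described here, and all function names are hypothetical.

```r
# Symmetric Dirichlet draw obtained by normalizing independent Gamma variates.
rdirichlet_sym <- function(n_clust, shape) {
  g <- rgamma(n_clust, shape = shape, rate = 1)
  g / sum(g)
}

# Spearman distance: sum of squared rank differences.
spearman_dist <- function(r, rho) sum((r - rho)^2)

# Documented minimum pairwise distance: (2 / G) * choose(n + 1, 3).
sep_threshold <- function(n_items, n_clust) {
  (2 / n_clust) * choose(n_items + 1, 3)
}

# Assumption: plain rejection sampling of uniform permutations until all
# pairs of consensus rankings reach the threshold.
draw_separated_rho <- function(n_clust, n_items, max_tries = 10000) {
  thr <- sep_threshold(n_items, n_clust)
  for (i in seq_len(max_tries)) {
    rho <- t(replicate(n_clust, sample.int(n_items)))
    if (n_clust < 2) return(rho)
    d <- combn(n_clust, 2, function(ij) spearman_dist(rho[ij[1], ], rho[ij[2], ]))
    if (all(d >= thr)) return(rho)
  }
  stop("no separated configuration found within max_tries")
}

set.seed(12345)
G <- 3
rdirichlet_sym(G, shape = 2 * G)  # weights summing to 1
draw_separated_rho(n_clust = G, n_items = 6)
```

Since the maximum Spearman distance between two rankings of $n$ items is $2\binom{n+1}{3}$, the threshold amounts to a fraction $1/G$ of that maximum.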

Examples

## Example 1. Drawing from a mixture with randomly generated parameters of separated clusters.
set.seed(12345)
rMSmix(sample_size = 50, n_items = 25, n_clust = 5)

## Example 2. Drawing from a mixture with uniformly generated parameters.
set.seed(12345)
rMSmix(sample_size = 100, n_items = 9, n_clust = 3, uniform = TRUE)

## Example 3. Drawing from a mixture with customized parameters.
r_par <- rbind(1:5, c(4, 5, 2, 1, 3))
t_par <- c(0.01, 0.02)
w_par <- c(0.4, 0.6)
set.seed(12345)
rMSmix(sample_size = 50, n_items = 5, n_clust = 2, theta = t_par, rho = r_par, weights = w_par)

References

Vitelli V, Sørensen Ø, Crispino M, Frigessi A and Arjas E (2018). Probabilistic Preference Learning with the Mallows Rank Model. Journal of Machine Learning Research, 18 (158), pages 1--49, ISSN: 1532-4435, https://jmlr.org/papers/v18/15-481.html.

Sørensen Ø, Crispino M, Liu Q and Vitelli V (2020). BayesMallows: An R Package for the Bayesian Mallows Model. The R Journal, 12 (1), pages 324--342, DOI: 10.32614/RJ-2020-026.

Zhong C (2021). Mallows permutation model with L1 and L2 distances I: hit and run algorithms and mixing times. arXiv: 2112.13456.

  • Maintainer: Cristina Mollica
  • License: GPL (>= 3)
  • Last published: 2025-03-25
