sample_mallows function

Random Samples from the Mallows Rank Model

Random Samples from the Mallows Rank Model

Generate random samples from the Mallows Rank Model \insertCite mallows1957BayesMallows with consensus ranking ρ\rho and scale parameter α\alpha. The samples are obtained by running the Metropolis-Hastings algorithm described in Appendix C of \insertCite vitelli2018;textualBayesMallows.

sample_mallows( rho0, alpha0, n_samples, leap_size = max(1L, floor(n_items/5)), metric = "footrule", diagnostic = FALSE, burnin = ifelse(diagnostic, 0, 1000), thinning = ifelse(diagnostic, 1, 1000), items_to_plot = NULL, max_lag = 1000L )

Arguments

  • rho0: Vector specifying the latent consensus ranking in the Mallows rank model.
  • alpha0: Scalar specifying the scale parameter in the Mallows rank model.
  • n_samples: Integer specifying the number of random samples to generate. When diagnostic = TRUE, this number must be larger than 1.
  • leap_size: Integer specifying the step size of the leap-and-shift proposal distribution.
  • metric: Character string specifying the distance measure to use. Available options are "footrule" (default), "spearman", "cayley", "hamming", "kendall", and "ulam". See also the rmm function in the PerMallows package \insertCite irurozki2016BayesMallows for sampling from the Mallows model with Cayley, Hamming, Kendall, and Ulam distances.
  • diagnostic: Logical specifying whether to output convergence diagnostics. If TRUE, a diagnostic plot is printed, together with the returned samples.
  • burnin: Integer specifying the number of iterations to discard as burn-in. Defaults to 1000 when diagnostic = FALSE, else to 0.
  • thinning: Integer specifying the number of MCMC iterations to perform between each time a random rank vector is sampled. Defaults to 1000 when diagnostic = FALSE, else to 1.
  • items_to_plot: Integer vector used if diagnostic = TRUE, in order to specify the items to plot in the diagnostic output. If not provided, 5 items are picked at random.
  • max_lag: Integer specifying the maximum lag to use in the computation of autocorrelation. Defaults to 1000L. This argument is passed to stats::acf. Only used when diagnostic = TRUE.

Examples

# Sample 100 random rankings from a Mallows distribution with footrule distance set.seed(1) # Number of items n_items <- 15 # Set the consensus ranking rho0 <- seq(from = 1, to = n_items, by = 1) # Set the scale alpha0 <- 10 # Number of samples n_samples <- 100 # We first do a diagnostic run, to find the thinning and burnin to use # We set n_samples to 1000, in order to run 1000 diagnostic iterations. test <- sample_mallows(rho0 = rho0, alpha0 = alpha0, diagnostic = TRUE, n_samples = 1000, burnin = 1, thinning = 1) # When items_to_plot is not set, 5 items are picked at random. We can change this. # We can also reduce the number of lags computed in the autocorrelation plots test <- sample_mallows(rho0 = rho0, alpha0 = alpha0, diagnostic = TRUE, n_samples = 1000, burnin = 1, thinning = 1, items_to_plot = c(1:3, 10, 15), max_lag = 500) # From the autocorrelation plot, it looks like we should use # a thinning of at least 200. We set thinning = 1000 to be safe, # since the algorithm in any case is fast. The Markov Chain # seems to mix quickly, but we set the burnin to 1000 to be safe. # We now run sample_mallows again, to get the 100 samples we want: samples <- sample_mallows(rho0 = rho0, alpha0 = alpha0, n_samples = 100, burnin = 1000, thinning = 1000) # The samples matrix now contains 100 rows with rankings of 15 items. # A good diagnostic, in order to confirm that burnin and thinning are set high # enough, is to run compute_mallows on the samples model_fit <- compute_mallows( setup_rank_data(samples), compute_options = set_compute_options(nmc = 10000)) # The highest posterior density interval covers alpha0 = 10. burnin(model_fit) <- 2000 compute_posterior_intervals(model_fit, parameter = "alpha")

References

\insertAllCited

See Also

Other rank functions: compute_expected_distance(), compute_observation_frequency(), compute_rank_distance(), create_ranking(), get_mallows_loglik()