(Log-)likelihood for mixtures of Mallows models with Spearman distance
Compute the (log-)likelihood for the parameters of a mixture of Mallows models with Spearman distance on partial rankings. Partial rankings with missing data in arbitrary positions are supported.
rho: Integer G×n matrix with the component-specific consensus rankings in each row.
theta: Numeric vector of G non-negative component-specific precision parameters.
weights: Numeric vector of G positive mixture weights (normalization is not necessary).
rankings: Integer N×n matrix or data frame with partial rankings in each row. Missing positions must be coded as NA.
log: Logical: whether the log-likelihood must be returned. Defaults to TRUE.
Returns
The (log-)likelihood value.
Details
The (log-)likelihood evaluation is performed by augmenting the partial rankings with the set of all compatible full rankings (see data_augmentation), and then the marginal likelihood is computed.
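To make the augmentation step concrete, here is a minimal brute-force sketch in base R for a single (possibly partial) ranking. It is not the package's implementation. It assumes the Spearman distance is the sum of squared rank differences and that each component has probability proportional to exp(-theta * distance). The helper names perms, spearman and lik_sketch are hypothetical. Enumerating all n! rankings is only feasible for very small n.

```r
# All permutations of the vector v, one per row (brute force, small n only).
perms <- function(v) {
  if (length(v) <= 1) return(matrix(v, nrow = 1))
  do.call(rbind, lapply(seq_along(v), function(i) cbind(v[i], perms(v[-i]))))
}

# Assumed definition of the Spearman distance: sum of squared rank differences.
spearman <- function(r, rho) sum((r - rho)^2)

# Marginal likelihood of one ranking (NA = missing position) under a mixture.
lik_sketch <- function(rho, theta, weights, ranking) {
  n <- length(ranking)
  P <- perms(seq_len(n))                 # all n! full rankings
  rho <- matrix(rho, ncol = n)           # G x n matrix, also when G = 1
  weights <- weights / sum(weights)      # weights need not be normalized
  obs <- !is.na(ranking)
  # Augmentation: keep the full rankings that agree with the observed positions.
  keep <- apply(P, 1, function(p) all(p[obs] == ranking[obs]))
  compat <- P[keep, , drop = FALSE]
  total <- 0
  for (g in seq_along(theta)) {
    # Normalizing constant of component g by direct enumeration.
    Z <- sum(exp(-theta[g] * apply(P, 1, spearman, rho = rho[g, ])))
    # Unnormalized mass of the compatible full rankings.
    num <- sum(exp(-theta[g] * apply(compat, 1, spearman, rho = rho[g, ])))
    total <- total + weights[g] * num / Z
  }
  total
}

# Full ranking under the uniform model (theta = 0): equals 1 / factorial(5).
lik_sketch(rho = 1:5, theta = 0, weights = 1, ranking = c(3, 5, 2, 1, 4))

# Partial ranking with two missing positions under a 2-component mixture.
lik_sketch(rho = rbind(1:4, 4:1), theta = c(0.1, 0.3), weights = c(2, 1),
           ranking = c(1, NA, NA, 2))
```

For a sample of N rankings treated as independent, the log-likelihood would be the sum of log(lik_sketch(...)) over the rows.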
When n≤20, the (log-)likelihood is computed exactly. When n>20, the model normalizing constant is not available in exact form and is approximated with the method introduced by Crispino et al. (2023). If n>170, the approximation is also restricted to a fixed grid of values for the Spearman distance to limit the computational burden.
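The dependence on n can be read off the form of the normalizing constant. The following is a sketch in assumed notation (the symbols Z, d, N_d, \mathcal{P}_n and \mathcal{D}_n are not taken from this page), with d the Spearman distance taken as the sum of squared rank differences:

```latex
Z(\theta)
  = \sum_{\pi \in \mathcal{P}_n} e^{-\theta\, d(\pi, \rho)}
  = \sum_{d \in \mathcal{D}_n} N_d \, e^{-\theta d},
\qquad
\mathcal{D}_n \subseteq \left\{ 0, 2, 4, \dots, \tfrac{n(n^2-1)}{3} \right\}
```

Here \mathcal{P}_n is the set of all n! full rankings and N_d is the number of rankings at distance d from a fixed reference ranking. Because the distance is invariant to relabelling of the items, Z does not depend on \rho. The second sum has far fewer terms than the first, but it requires the counts N_d, which is presumably why exact evaluation is tied to an upper bound on n.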
Examples
## Example 1. Likelihood of a full ranking of n=5 items under the uniform (null) model.
likMSmix(rho = 1:5, theta = 0, weights = 1, rankings = c(3, 5, 2, 1, 4), log = FALSE)
# corresponds to...
1 / factorial(5)

## Example 2. Simulate rankings from a 2-component mixture of Mallows models
## with Spearman distance.
set.seed(12345)
d_sim <- rMSmix(sample_size = 75, n_items = 8, n_clust = 2)
str(d_sim)
# Fit the true model.
rankings <- d_sim$samples
fit <- fitMSmix(rankings = rankings, n_clust = 2, n_start = 10)
# Compare log-likelihood values of the true parameter values and the MLE.
likMSmix(rho = d_sim$rho, theta = d_sim$theta, weights = d_sim$weights,
         rankings = d_sim$samples)
likMSmix(rho = fit$mod$rho, theta = fit$mod$theta, weights = fit$mod$weights,
         rankings = d_sim$samples)

## Example 3. Simulate rankings from a basic Mallows model with Spearman distance.
set.seed(12345)
d_sim <- rMSmix(sample_size = 25, n_items = 6)
str(d_sim)
# Censor data to be partial top-3 rankings.
rankings <- d_sim$samples
rankings[rankings > 3] <- NA
# Fit the true model with data augmentation.
set.seed(12345)
fit <- fitMSmix(rankings = rankings, n_clust = 1, n_start = 10)
# Compare log-likelihood values of the true parameter values and the MLEs.
likMSmix(rho = d_sim$rho, theta = d_sim$theta, weights = d_sim$weights,
         rankings = d_sim$samples)
likMSmix(rho = fit$mod$rho, theta = fit$mod$theta, weights = fit$mod$weights,
         rankings = d_sim$samples)
References
Crispino M, Mollica C and Modugno L (2025+). MSmix: An R Package for clustering partial rankings via mixtures of Mallows Models with Spearman distance. (submitted)
Crispino M, Mollica C, Astuti V and Tardella L (2023). Efficient and accurate inference for mixtures of Mallows models with Spearman distance. Statistics and Computing, 33 (98), DOI: 10.1007/s11222-023-10266-8.