Update a Bayesian Mallows model estimated using the Metropolis-Hastings algorithm in compute_mallows() using the sequential Monte Carlo algorithm described in \insertCite steinSequentialInferenceMallows2023;textualBayesMallows.
update_mallows(model, new_data,...)## S3 method for class 'BayesMallowsPriorSamples'update_mallows( model, new_data, model_options = set_model_options(), smc_options = set_smc_options(), compute_options = set_compute_options(), priors = model$priors, pfun_estimate =NULL,...)## S3 method for class 'BayesMallows'update_mallows( model, new_data, model_options = set_model_options(), smc_options = set_smc_options(), compute_options = set_compute_options(), priors = model$priors,...)## S3 method for class 'SMCMallows'update_mallows(model, new_data,...)
Arguments
model: A model object of class "BayesMallows" returned from compute_mallows(), an object of class "SMCMallows" returned from this function, or an object of class "BayesMallowsPriorSamples" returned from sample_prior().
new_data: An object of class "BayesMallowsData" returned from setup_rank_data(). The object should contain the new data being provided.
...: Optional arguments. Currently not used.
model_options: An object of class "BayesMallowsModelOptions" returned from set_model_options().
smc_options: An object of class "SMCOptions" returned from set_smc_options().
compute_options: An object of class "BayesMallowsComputeOptions" returned from set_compute_options().
priors: An object of class "BayesMallowsPriors" returned from set_priors(). Defaults to the priors used in model.
pfun_estimate: Object returned from estimate_partition_function(). Defaults to NULL, and will only be used for footrule, Spearman, or Ulam distances when the cardinalities are not available, cf. get_cardinalities(). Only used by the specialization for objects of type "BayesMallowsPriorSamples".
Returns
An updated model, of class "SMCMallows".
Examples
## Not run:set.seed(1)# UPDATING A MALLOWS MODEL WITH NEW COMPLETE RANKINGS# Assume we first only observe the first four rankings in the potato_visual# datasetdata_first_batch <- potato_visual[1:4,]# We start by fitting a model using Metropolis-Hastingsmod_init <- compute_mallows( data = setup_rank_data(data_first_batch), compute_options = set_compute_options(nmc =10000))# Convergence seems good after no more than 2000 iterationsassess_convergence(mod_init)burnin(mod_init)<-2000# Next, assume we receive four more observationsdata_second_batch <- potato_visual[5:8,]# We can now update the model using sequential Monte Carlomod_second <- update_mallows( model = mod_init, new_data = setup_rank_data(rankings = data_second_batch), smc_options = set_smc_options(resampler ="systematic"))# This model now has a collection of particles approximating the posterior# distribution after the first and second batch# We can use all the posterior summary functions as we do for the model# based on compute_mallows():plot(mod_second)plot(mod_second, parameter ="rho", items =1:4)compute_posterior_intervals(mod_second)# Next, assume we receive the third and final batch of data. We can update# the model againdata_third_batch <- potato_visual[9:12,]mod_final <- update_mallows( model = mod_second, new_data = setup_rank_data(rankings = data_third_batch))# We can plot the same things as beforeplot(mod_final)compute_consensus(mod_final)# UPDATING A MALLOWS MODEL WITH NEW OR UPDATED PARTIAL RANKINGS# The sequential Monte Carlo algorithm works for data with missing ranks as# well. This both includes the case where new users arrive with partial ranks,# and when previously seen users arrive with more complete data than they had# previously.# We illustrate for top-k rankings of the first 10 users in potato_visualpotato_top_10 <- ifelse(potato_visual[1:10,]>10,NA_real_, potato_visual[1:10,])potato_top_12 <- ifelse(potato_visual[1:10,]>12,NA_real_, potato_visual[1:10,])potato_top_14 <- ifelse(potato_visual[1:10,]>14,NA_real_, potato_visual[1:10,])# We need the rownames as user IDs(user_ids <-1:10)# First, users provide top-10 rankingsmod_init <- compute_mallows( data = setup_rank_data(rankings = potato_top_10, user_ids = user_ids), compute_options = set_compute_options(nmc =10000))# Convergence seems fine. We set the burnin to 2000.assess_convergence(mod_init)burnin(mod_init)<-2000# Next assume the users update their rankings, so we have top-12 instead.mod1 <- update_mallows( model = mod_init, new_data = setup_rank_data(rankings = potato_top_12, user_ids = user_ids), smc_options = set_smc_options(resampler ="stratified"))plot(mod1)# Then, assume we get even more data, this time top-14 rankings:mod2 <- update_mallows( model = mod1, new_data = setup_rank_data(rankings = potato_top_14, user_ids = user_ids))plot(mod2)# Finally, assume a set of new users arrive, who have complete rankings.potato_new <- potato_visual[11:12,]# We need to update the user IDs, to show that these users are different(user_ids <-11:12)mod_final <- update_mallows( model = mod2, new_data = setup_rank_data(rankings = potato_new, user_ids = user_ids))plot(mod_final)# We can also update models with pairwise preferences# We here start by running MCMC on the first 20 assessors of the beach data# A realistic application should run a larger number of iterations than we# do in this example.set.seed(3)dat <- subset(beach_preferences, assessor <=20)mod <- compute_mallows( data = setup_rank_data( preferences = beach_preferences), compute_options = set_compute_options(nmc =3000, burnin =1000))# Next we provide assessors 21 to 24 one at a time. Note that we sample the# initial augmented rankings in each particle for each assessor from 200# different topological sorts consistent with their transitive closure.for(i in21:24){ mod <- update_mallows( model = mod, new_data = setup_rank_data( preferences = subset(beach_preferences, assessor == i), user_ids = i), smc_options = set_smc_options(latent_sampling_lag =0, max_topological_sorts =200))}# Compared to running full MCMC, there is a downward bias in the scale# parameter. This can be alleviated by increasing the number of particles,# MCMC steps, and the latent sampling lag.plot(mod)compute_consensus(mod)## End(Not run)
See Also
Other modeling: burnin(), burnin<-(), compute_mallows(), compute_mallows_mixtures(), compute_mallows_sequentially(), sample_prior()