Conditional posterior predictive check for Bayesian mixtures of Plackett-Luce models
Performs conditional posterior predictive checks to assess the goodness of fit of Bayesian mixtures of Plackett-Luce models with different numbers of components.
pi_inv: An object of class top_ordering, collecting the numeric $N \times K$ data matrix of partial orderings, or an object that can be coerced with as.top_ordering.
seq_G: Numeric vector with the number of components of the Plackett-Luce mixtures to be assessed.
MCMCsampleP: List of size length(seq_G), whose generic element is a numeric $L \times (G \cdot K)$ matrix with the MCMC samples of the component-specific support parameters.
MCMCsampleW: List of size length(seq_G), whose generic element is a numeric $L \times G$ matrix with the MCMC samples of the mixture weights.
top1: Logical: whether the posterior predictive p-value based on the top item frequencies has to be computed. Default is TRUE.
paired: Logical: whether the posterior predictive p-value based on the paired comparison frequencies has to be computed. Default is TRUE.
parallel: Logical: whether parallelization should be used. Default is FALSE.
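The shapes required for MCMCsampleP and MCMCsampleW can be sketched in base R as follows. This is only an illustration of the expected dimensions: the helpers fake_P and fake_W are hypothetical, and the matrices hold random placeholder values rather than real MCMC output.

```r
set.seed(123)

K <- 4          # number of items
L <- 10         # number of retained MCMC draws (placeholder value)
seq_G <- 1:2    # numbers of mixture components to assess

# Hypothetical helper: an L x (G*K) matrix of positive placeholder values
# standing in for draws of the component-specific support parameters.
fake_P <- function(L, G, K) {
  matrix(rgamma(L * G * K, shape = 1), nrow = L, ncol = G * K)
}

# Hypothetical helper: an L x G matrix of placeholder mixture weights,
# normalized so that each row (one draw) sums to 1.
fake_W <- function(L, G) {
  W <- matrix(rgamma(L * G, shape = 1), nrow = L, ncol = G)
  W / rowSums(W)
}

# One list element per entry of seq_G, in the same order.
MCMCsampleP <- lapply(seq_G, function(G) fake_P(L, G, K))
MCMCsampleW <- lapply(seq_G, function(G) fake_W(L, G))

# Dimensions of each element: L x (G*K) and L x G.
lapply(MCMCsampleP, dim)
lapply(MCMCsampleW, dim)

# With real posterior samples and a top_ordering object, the call would take
# the form (not run here):
# ppcheckPLMIX_cond(pi_inv = pi_inv, seq_G = seq_G,
#                   MCMCsampleP = MCMCsampleP, MCMCsampleW = MCMCsampleW)
```

The key constraint is that both lists have length(seq_G) elements and that the G implied by each matrix's column count matches the corresponding entry of seq_G.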
Returns
A list with a named element:
post_pred_pvalue_cond: Numeric length(seq_G) $\times$ 2 matrix of posterior predictive p-values based on the top item and paired comparison frequencies. If either the top1 or the paired argument is FALSE, the corresponding matrix entries are NA.
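As a reading aid, a posterior predictive p-value is commonly defined as the proportion of posterior draws for which the discrepancy computed on replicated data is at least as large as the discrepancy computed on the observed data. The sketch below illustrates only that final step; the vectors disc_obs and disc_rep are hypothetical placeholders filled with random numbers, and the package's internal computation may differ in detail.

```r
set.seed(1)

L <- 200  # number of posterior draws (placeholder value)

# Placeholder discrepancies, one per posterior draw. In a real check these
# would come from the observed data and from data replicated under each draw.
disc_obs <- rchisq(L, df = 3)
disc_rep <- rchisq(L, df = 3)

# Proportion of draws where the replicated discrepancy is at least as large
# as the observed one.
p_value <- mean(disc_rep >= disc_obs)
p_value
```

Under this definition, values close to 0 indicate that the observed data are more discrepant than the model typically reproduces.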
Details
The ppcheckPLMIX_cond function returns two posterior predictive p-values based on two chi-squared discrepancy variables involving: (i) the top item frequencies and (ii) the paired comparison frequencies. In the presence of partial sequences in the pi_inv matrix, the same missingness patterns observed in the dataset (i.e., the number of items ranked by each sample unit) are reproduced on the replicated datasets from the posterior predictive distribution. Unlike the ppcheckPLMIX function, the conditional discrepancy measures are obtained by summing the chi-squared discrepancies computed on subsamples of observations with the same number of ranked items.
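The summation over subsamples can be illustrated with a minimal base R sketch for the top item frequencies and a single parameter draw. It assumes that unranked positions are coded as 0, and the data, the weights w, the support parameters p and the helper top1_chisq are all hypothetical. It does not reproduce the package's implementation, the paired comparison discrepancy, or the comparison with replicated datasets.

```r
# Toy top-ordering matrix: each row lists item labels in rank order;
# 0 marks an unranked position (assumption of this sketch).
pi_inv <- rbind(
  c(1, 2, 3, 4),
  c(2, 1, 0, 0),
  c(1, 3, 0, 0),
  c(3, 1, 2, 4),
  c(1, 0, 0, 0),
  c(2, 0, 0, 0)
)

# Hypothetical single draw for a G = 2 mixture over K = 4 items.
w <- c(0.6, 0.4)                      # mixture weights (sum to 1)
p <- rbind(c(0.4, 0.3, 0.2, 0.1),     # support parameters, one row per
           c(0.1, 0.2, 0.3, 0.4))     # component, each normalized to 1

# Chi-squared discrepancy between observed and expected top item counts.
# Under a Plackett-Luce mixture with normalized supports, the probability
# that item i is ranked first is sum_g w_g * p_gi.
top1_chisq <- function(orderings, w, p) {
  observed <- tabulate(orderings[, 1], nbins = ncol(p))
  expected <- nrow(orderings) * as.vector(w %*% p)
  sum((observed - expected)^2 / expected)
}

# Number of items ranked by each sample unit.
n_ranked <- rowSums(pi_inv > 0)

# Conditional discrepancy: compute the discrepancy separately within each
# subsample sharing the same number of ranked items, then sum.
groups <- split(seq_len(nrow(pi_inv)), n_ranked)
cond_disc <- sum(sapply(groups, function(idx) {
  top1_chisq(pi_inv[idx, , drop = FALSE], w, p)
}))

# Unconditional counterpart on the pooled sample, for comparison.
pooled_disc <- top1_chisq(pi_inv, w, p)

c(conditional = cond_disc, pooled = pooled_disc)
```

Splitting by the number of ranked items keeps units with different missingness patterns from being pooled into a single frequency table, which is the distinction from the unconditional check drawn above.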