NMixMCMC function

MCMC estimation of (multivariate) normal mixtures with possibly censored data.

MCMC estimation of (multivariate) normal mixtures with possibly censored data.

This function runs MCMC for a model in which unknown density is specified as a normal mixture with either known or unknown number of components. With a prespecified number of components, MCMC is implemented through Gibbs sampling (see Diebolt and Robert, 1994) and dimension of the data can be arbitrary. With unknown number of components, currently only univariate case is implemented using the reversible jump MCMC (Richardson and Green, 1997).

Further, the data are allowed to be censored in which case additional Gibbs step is used within the MCMC algorithm

Details

See accompanying paper (Komárek, 2009). In the rest of the helpfile, the same notation is used as in the paper, namely, nn denotes the number of observations, pp is dimension of the data, KK is the number of mixture components, w[1],,w[K]w[1],\ldots,w[K] are mixture weights, mu[1],,mu[K]mu[1],\ldots,mu[K]

are mixture means, Sigma1,,SigmaKSigma_1,\ldots,Sigma_K

are mixture variance-covariance matrices, Q1,,QKQ_1,\ldots,Q_K are their inverses.

For the data y[1],,y[n]y[1],\ldots,y[n] the following g(y)g(y) density is assumed

gy(y)=S1j=1Kwjφ(S1(ymμj,Σj)), g_y(\boldsymbol{y}) = |\boldsymbol{S}|^{-1} \sum_{j=1}^K w_j\varphi\bigl(\boldsymbol{S}^{-1}(\boldsymbol{y} - \boldsymbol{m}\,|\,\boldsymbol{\mu}_j,\,\boldsymbol{\Sigma}_j)\bigr),%g(y) = |S|^(-1) * sum[j=1]^K w[j] phi(S^(-1)*(y - m) | mu[j], Sigma[j]),

where phi(.mu,Sigma)phi(. | mu, Sigma) denotes a density of the (multivariate) normal distribution with mean mumu and a~variance-covariance matrix Σ\Sigma. Finally, SS is a pre-specified diagonal scale matrix and mm is a pre-specified shift vector. Sometimes, by setting mm to sample means of components of yy and diagonal of SS to sample standard deviations of yy (considerable) improvement of the MCMC algorithm is achieved.

NMixMCMC(y0, y1, censor, x_w, scale, prior, init, init2, RJMCMC, nMCMC = c(burn = 10, keep = 10, thin = 1, info = 10), PED, keep.chains = TRUE, onlyInit = FALSE, dens.zero = 1e-300, parallel = FALSE, cltype) ## S3 method for class 'NMixMCMC' print(x, dic, ...) ## S3 method for class 'NMixMCMClist' print(x, ped, dic, ...)

Arguments

  • y0: numeric vector of length nn or nxpn x p

    matrix with observed data. It contains exactly observed, right-censored, left-censored data and lower limits for interval-censored data.

  • y1: numeric vector of length nn or nxpn x p

    matrix with upper limits for interval-censored data. Elements corresponding to exactly observed, right-censored or left-censored data are ignored and can be filled arbitrarily (by NA's) as well.

    It does not have to be supplied if there are no interval-censored data.

  • censor: numeric vector of length nn or nxpn x p

    matrix with censoring indicators. The following values indicate:

    • 0: right-censored observation,
    • 1: exactly observed value,
    • 2: left-censored observation,
    • 3: interval-censored observation.

    If it is not supplied then it is assumed that all values are exactly observed.

  • x_w: optional vector providing a categorical covariate that may influence the mixture weights. Internally, it is converted into a factor.

    Added in version 4.0 (03/2015).

  • scale: a list specifying how to scale the data before running MCMC. It should have two components:

    • shift: a vector of length 1 or pp specifying shift vector mm,
    • scale: a vector of length 1 or pp specifying diagonal of the scaling matrix SS.

    If there is no censoring, and argument scale is missing then the data are scaled to have zero mean and unit variances, i.e., scale(y0) is used for MCMC. In the case there is censoring and scale is missing, scale$shift is taken to be a sample mean of init$y and scale$scale are sample standard deviations of columns of init$y.

    If you do not wish to scale the data before running MCMC, specify scale=list(shift=0, scale=1).

  • prior: a list with the parameters of the prior distribution. It should have the following components (for some of them, the program can assign default values and the user does not have to specify them if he/she wishes to use the defaults):

    • priorK: a character string which specifies the type of the prior for KK (the number of mixture components). It should have one of the following values:

        $\cr$
       
        ‘fixed’ $\cr$
       
       Number of mixture components is assumed to be fixed to $K[max]$. This is a default value.
       
        $\cr$
       
        ‘uniform’ $\cr$
       
       A priori $K ~ Unif{1,\ldots,K[max]}.$
       
        $\cr$
       
        ‘tpoisson’ $\cr$
       
       A priori c("$K ~\n$", "$\t  truncated-Poiss(lambda, K[max]).$")
      
    • priormuQ: a character string which specifies the type of the prior for c("mu[1],ldots,\nmu[1], \\ldots,\n", "\tmu[K[max]]\tmu[K[max]]") (mixture means) and c("Q[1],ldots,\nQ[1], \\ldots,\n", "Q[K[max]] Q[K[max]]") (inverted mixture covariance matrices). It should have one of the following values:

        $\cr$
       
        ‘independentC’ $\cr$
       
        $=$ independent conjugate prior (this is a default value). That is, a priori 
      
(μj,Qj)\mboxN(ξj,Dj)×\mboxWishart(ζ,Ξ)(mu[j],Q[j]) N(xi[j],D[j])Wishart(zeta,Xi) (\boldsymbol{\mu}_j,\, \boldsymbol{Q}_j) \sim\mbox{N}(\boldsymbol{\xi}_j,\,\boldsymbol{D}_j)\times\mbox{Wishart}(\zeta,\,\boldsymbol{\Xi})(mu[j], Q[j]) ~ N(xi[j], D[j]) * Wishart(zeta, Xi)
       independently for $j=1,\ldots,K$, where normal means $xi[1],\ldots,xi[K]$, normal variances $D[1],\ldots,D[K]$, and Wishart degrees of freedom $zeta$ are specified further as `xi`, `D`, `zeta` components of the list `prior`.
       
        $\cr$
       
        ‘naturalC’ $\cr$
       
        $=$ natural conjugate prior. That is, a priori 
(μj,Qj)\mboxN(ξj,cj1Qj1)×\mboxWishart(ζ,Ξ)(mu[j],Q[j]) N(xi[j],(c[j]Q[j])(1))Wishart(zeta,Xi) (\boldsymbol{\mu}_j,\, \boldsymbol{Q}_j) \sim\mbox{N}(\boldsymbol{\xi}_j,\,c_j^{-1}\boldsymbol{Q}_j^{-1})\times\mbox{Wishart}(\zeta,\,\boldsymbol{\Xi})(mu[j], Q[j]) ~ N(xi[j], (c[j]Q[j])^(-1)) * Wishart(zeta, Xi)
       independently for $j=1,\ldots,K$, where normal means $xi[1],\ldots,xi[K]$, precisions $c[1],\ldots,c[K]$, and Wishart degrees of freedom $zeta$ are specified further as `xi`, `ce`, `zeta` components of the list `prior`.
       
        $\cr$
       
       For both, independent conjugate and natural conjugate prior, the Wishart scale matrix $Xi$ is assumed to be diagonal with $gamma[1],\ldots,gamma[p]$ on a diagonal. For $gamma[j]^(-1)$ $(j=1,\ldots,K)$ additional gamma hyperprior $G(g[j], h[j])$ is assumed. Values of $g[1],\ldots,g[p]$ and $h[1],\ldots,h[p]$ are further specified as `g` and `h` components of the `prior` list.
- **Kmax**: maximal number of mixture components $K[max]$. It must always be specified by the user.
- **lambda**: parameter $lambda$ for the truncated Poisson prior on $K$. It must be positive and must always be specified if `priorK` is ‘tpoisson’ .
- **delta**: parameter $delta$ for the Dirichlet prior on the mixture weights $w[1],\ldots,w[K].$
       
       It must be positive. Its default value is 1.
- **xi**: a numeric value, vector or matrix which specifies c("$xi[1],\n$", "$\t  \\ldots, xi[K[max]]$") (prior means for the mixture means c("$mu[1], \\ldots,\n$", "$\t    mu[K[max]]$")). Default value is a matrix $K[max] x p$ with midpoints of columns of `init$y` in rows which follows Richardson and Green (1997).
       
        $\cr$
       
       If $p=1$ and `xi`$=xi$ is a single value then $xi[1]=\ldots=xi[K[max]] = xi.$
       
        $\cr$
       
       If $p=1$ and `xi`$=\boldsymbol{\xi}$ is a vector of length $K[max]$ then the $j$-th element of `xi`
       
       gives $xi[j]$ $(j=1,\ldots,K[max]).$
       
        $\cr$
       
       If $p>1$ and `xi`$=xi$ is a vector of length $p$
       
       then $xi[1]=\ldots=xi[K[max]] = xi.$
       
        $\cr$
       
       If $p>1$ and `xi` is a c("$K[max]\n$", "$\t    x p$") matrix then the $j$-th row of `xi`
       
       gives $xi[j]$ $(j=1,\ldots,K[max]).$
- **ce**: a numeric value or vector which specifies prior precision parameters $c[1],\ldots,c[K[max]]$ for the mixture means c("$mu[1], \\ldots,\n$", "$          mu[K[max]]$") when `priormuQ` is ‘naturalC’ . Its default value is a vector of ones which follows Cappe, Robert and Ryden (2003).
       
        $\cr$
       
       If `ce`$=c$ is a single value then $c[1]=\ldots=c[K[max]]=c.$
       
        $\cr$
       
       If `ce`$c$ is a vector of length $K[max]$ then the $j$-th element of `ce`
       
       gives $c[j]$ $(j=1,\ldots,K[max]).$
- **D**: a numeric vector or matrix which specifies c("$D[1],\n$", "$\t\\ldots, D[K[max]]$") (prior variances or covariance matrices of the mixture means c("$mu[1], \\ldots,\n$", "$          mu[K[max]]$") when `priormuQ` is ‘independentC’ .) Its default value is a diagonal matrix with squared ranges of each column of `init$y` on a diagonal.
       
        $\cr$
       
       If $p=1$ and `D`$=d$ is a single value then $d[1]=\ldots=d[K[max]] = d.$
       
        $\cr$
       
       If $p=1$ and `D`$=d$ is a vector of length $K[max]$ then the $j$-th element of `D`
       
       gives $d[j]$ $(j=1,\ldots,K[max]).$
       
        $\cr$
       
       If $p>1$ and `D`$=D$ is a $p x p$ matrix then $D[1]=\ldots=D[K[max]] = D.$
       
        $\cr$
       
       If $p>1$ and `D` is a c("$(K[max]*p)\n$", "$        x p$") matrix then the the first $p$ rows of `D`
       
       give $D[1]$, rows $p+1,\ldots,2p$ of `D` give $D[2]$ etc.
- **zeta**: degrees of freedom $zeta$ for the Wishart prior on the inverted mixture variances c("$Q[1],\n$", "$\t  \\ldots,Q[K[max]].$"). It must be higher then $p-1$. Its default value is $p + 1$.
- **g**: a value or a vector of length $p$ with the shape parameters $g[1],\ldots,g[p]$ for the Gamma hyperpriors on $gamma[1],\ldots,gamma[p]$. It must be positive. Its default value is a vector $(0.2,\ldots,0.2)'$.
- **h**: a value or a vector of length $p$ with the rate parameters $h[1],\ldots,h[p]$
       
       for the Gamma hyperpriors on $gamma[1],\ldots,gamma[p]$. It must be positive. Its default value is a vector containing $10/R[l]^2$, where $R[l]$ is a range of the $l$-th column of `init$y`. 
  • init: a list with the initial values for the MCMC. All initials can be determined by the program if they are not specified. The list may have the following components:

    • y: a numeric vector or matrix with the initial values for the latent censored observations.
    • K: a numeric value with the initial value for the number of mixture components.
    • w: a numeric vector with the initial values for the mixture weights.
    • mu: a numeric vector or matrix with the initial values for the mixture means.
    • Sigma: a numeric vector or matrix with the initial values for the mixture variances.
    • Li: a numeric vector with the initial values for the Colesky decomposition of the mixture inverse variances.
    • gammaInv: a numeric vector with the initial values for the inverted components of the hyperparameter gammagamma.
    • r: a numeric vector with the initial values for the mixture allocations.
  • init2: a list with the initial values for the second chain needed to estimate the penalized expected deviance of Plummer (2008). The list init2 has the same structure as the list init. All initials in init2 can be determined by the program (differently than the values in init) if they are not specified.

    Ignored when PED is FALSE.

  • RJMCMC: a list with the parameters needed to run reversible jump MCMC for mixtures with varying number of components. It does not have to be specified if the number of components is fixed. Most of the parameters can be determined by the program if they are not specified. The list may have the following components:

    • Paction: probabilities (or proportionalit constants) which are used to choose an action of the sampler within each iteration of MCMC to update the mixture related parameters. Let Paction = c("(p[1],p[2],\n(p[1], p[2],\n", "\tp[3])\t p[3])'"). Then with probability p[1]p[1] only steps assuming fixed kk (number of mixture components) are performed, with probability p[2]p[2] split-combine move is proposed and with probability p[3]p[3] birth-death move is proposed.

       If not specified (default) then in each iteration of MCMC, all sampler actions are performed.
      
    • Psplit: a numeric vector of length prior$Kmax giving conditional probabilities of the split move given kk as opposite to the combine move.

       Default value is $(1, 0.5, \ldots, 0.5, 0)'$.
      
    • Pbirth: a numeric vector of length prior$Kmax giving conditional probabilities of the birth move given kk as opposite to the death move.

       Default value is $(1, 0.5, \ldots, 0.5, 0)'$.
      
    • par.u1: a two component vector with parameters of the beta distribution used to generate an auxiliary value u[1]u[1].

       A default value is `par.u1` = $(2, 2)'$, i.e., $u[1] ~ Beta(2, 2).$
      
    • par.u2: a two component vector (for p=1p=1) or a matrix (for p>1p > 1) with two columns with parameters of the distributions of the auxiliary values u[2,1],,u[2,p]u[2,1],\ldots,u[2,p] in rows.

       A default value leads to c("$%$", "$\n$", "$\t  u[2,d] ~ Unif(-1, 1) (d=1,\\ldots,p-1),$")
       
        $u[2,p] ~ Beta(1, 2p).$
      
    • par.u3: a two component vector (for p=1p=1) or a matrix (for p>1p > 1) with two columns with parameters of the distributions of the auxiliary values u[3,1],,u[3,p]u[3,1],\ldots,u[3,p] in rows.

       A default value leads to c("$%$", "$\n$", "$\t  u[3,d] ~ Unif(0, 1) (d=1,\\ldots,p-1),$")
       
        $u[3,p] ~ Beta(1, p).$ 
      
  • nMCMC: numeric vector of length 4 giving parameters of the MCMC simulation. Its components may be named (ordering is then unimportant) as:

    • burn: length of the burn-in (after discarding the thinned values), can be equal to zero as well.
    • keep: length of the kept chains (after discarding the thinned values), must be positive.
    • thin: thinning interval, must be positive.
    • info: interval in which the progress information is printed on the screen.

    In total c("(M[burn]+\n(M[burn] +\n", "M[keep])M[thin] M[keep]) * M[thin]") MCMC scans are performed.

  • PED: a logical value which indicates whether the penalized expected deviance (see Plummer, 2008 for more details) is to be computed (which requires two parallel chains). If not specified, PED is set to TRUE

    for models with fixed number of components and is set to FALSE for models with numbers of components estimated using RJ-MCMC.

  • keep.chains: logical. If FALSE, only summary statistics are returned in the resulting object. This might be useful in the model searching step to save some memory.

  • onlyInit: logical. If TRUE then the function only determines parameters of the prior distribution, initial values, values of scale and parameters for the reversible jump MCMC.

  • dens.zero: a small value used instead of zero when computing deviance related quantities.

  • x: an object of class NMixMCMC or NMixMCMClist to be printed.

  • dic: logical which indicates whether DIC should be printed. By default, DIC is printed only for models with a fixed number of mixture components.

  • ped: logical which indicates whether PED should be printed. By default, PED is printed only for models with a fixed number of mixture components.

  • parallel: a logical value which indicates whether parallel computation (based on a package parallel) should be used when running two chains for the purpose of PED calculation

  • cltype: optional argument applicable if parallel is TRUE. If cltype is given, it is passed as the type argument into the call to makeCluster.

  • ...: additional arguments passed to the default print method.

Returns

An object of class NMixMCMC or class NMixMCMClist. Object of class NMixMCMC is returned if PED is FALSE. Object of class NMixMCMClist is returned if PED is TRUE.

Object of class NMixMCMC

Objects of class NMixMCMC have the following components:

  • iter: index of the last iteration performed.

  • nMCMC: used value of the argument nMCMC.

  • dim: dimension pp of the distribution of data

  • nx_w: number of levels of a factor covariate on mixture weights (equal to 1 if there were no covariates on mixture weights)

  • prior: a list containing the used value of the argument prior.

  • init: a list containing the used initial values for the MCMC (the first iteration of the burn-in).

  • state.first: a list having the components labeled y, K, w, mu, Li, Q, Sigma, gammaInv, r containing the values of generic parameters at the first stored (after burn-in) iteration of the MCMC.

  • state.last: a list having the components labeled y, K, w, mu, Li, Q, Sigma, gammaInv, r containing the last sampled values of generic parameters.

  • RJMCMC: a list containing the used value of the argument RJMCMC.

  • scale: a list containing the used value of the argument scale.

  • freqK: frequency table of KK based on the sampled chain.

  • propK: posterior distribution of KK based on the sampled chain.

  • DIC: a data.frame having columns labeled DIC, pD, D.bar, D.in.bar containing values used to compute deviance information criterion (DIC). Currently only DIC[3]DIC[3] of Celeux et al. (2006) is implemented.

  • moves: a data.frame which summarizes the acceptance probabilities of different move types of the sampler.

  • K: numeric vector with a chain for KK (number of mixture components).

  • w: numeric vector or matrix with a chain for ww (mixture weights). It is a matrix with KK columns when KK is fixed. Otherwise, it is a vector with weights put sequentially after each other.

  • mu: numeric vector or matrix with a chain for mumu (mixture means). It is a matrix with pKp*K columns when KK is fixed. Otherwise, it is a vector with means put sequentially after each other.

  • Q: numeric vector or matrix with a chain for lower triangles of QQ (mixture inverse variances). It is a matrix with (p(p+1)2)K(p*(p+1)2)*K

     columns when $K$ is fixed. Otherwise, it is a vector with lower triangles of $Q$ matrices put sequentially after each other.
    
  • Sigma: numeric vector or matrix with a chain for lower triangles of SigmaSigma (mixture variances). It is a matrix with (p(p+1)2)K(p*(p+1)2)*K

     columns when $K$ is fixed. Otherwise, it is a vector with lower triangles of $Sigma$ matrices put sequentially after each other.
    
  • Li: numeric vector or matrix with a chain for lower triangles of Cholesky decompositions of QQ matrices. It is a matrix with (p(p+1)2)K(p*(p+1)2)*K

     columns when $K$ is fixed. Otherwise, it is a vector with lower triangles put sequentially after each other.
    
  • gammaInv: matrix with pp columns with a chain for inverses of the hyperparameter gammagamma.

  • order: numeric vector or matrix with order indeces of mixture components related to artificial identifiability constraint defined by a suitable re-labeling algorithm (by default, simple ordering of the first component of the mixture means is used).

     It is a matrix with $K$ columns when $K$ is fixed. Otherwise it is a vector with orders put sequentially after each other.
    
  • rank: numeric vector or matrix with rank indeces of mixture components. related to artificial identifiability constraint defined by a suitable re-labeling algorithm (by default, simple ordering of the first component of the mixture means is used).

     It is a matrix with $K$ columns when $K$ is fixed. Otherwise it is a vector with ranks put sequentially after each other.
    
  • mixture: data.frame with columns labeled y.Mean.*, y.SD.*, y.Corr.*.*, z.Mean.*, z.SD.*, z.Corr.*.* containing the chains for the means, standard deviations and correlations of the distribution of the original (y) and scaled (z) data based on a normal mixture at each iteration.

  • deviance: data.frame with columns labeles LogL0, LogL1, dev.complete, dev.observed

     containing the chains of quantities needed to compute DIC.
    
  • pm.y: a data.frame with pp columns with posterior means for (latent) values of observed data (useful when there is censoring).

  • pm.z: a data.frame with pp columns with posterior means for (latent) values of scaled observed data (useful when there is censoring).

  • pm.indDev: a data.frame with columns labeled LogL0, LogL1, dev.complete, dev.observed, pred.dens containing posterior means of individual contributions to the deviance.

  • pred.dens: a numeric vector with the predictive density of the data based on the MCMC sample evaluated at data points.

     Note that when there is censoring, this is not exactly the predictive density as it is computed as the average of densities at each iteration evaluated at sampled values of latent observations at iterations.
    
  • poster.comp.prob_u: a matrix which is present in the output object if the number of mixture components in the distribution of random effects is fixed and equal to KK. In that case, poster.comp.prob_u is a matrix with KK columns and nn

     rows with estimated posterior component probabilities -- posterior means of the components of the underlying 0/1 allocation vector.
     
      WARNING: By default, the labels of components are based on artificial identifiability constraints based on ordering of the mixture means in the first margin. Very often, such identifiability constraint is not satisfactory!
    
  • poster.comp.prob_b: a matrix which is present in the output object if the number of mixture components in the distribution of random effects is fixed and equal to KK. In that case, poster.comp.prob_b is a matrix with KK columns and nn

     rows with estimated posterior component probabilities -- posterior mean over model parameters.
     
      WARNING: By default, the labels of components are based on artificial identifiability constraints based on ordering of the mixture means in the first margin. Very often, such identifiability constraint is not satisfactory!
    
  • summ.y.Mean: Posterior summary statistics based on chains stored in y.Mean.* columns of the data.frame mixture.

  • summ.y.SDCorr: Posterior summary statistics based on chains stored in y.SD.* and y.Corr.*.* columns of the data.frame mixture.

  • summ.z.Mean: Posterior summary statistics based on chains stored in z.Mean.* columns of the data.frame mixture.

  • summ.z.SDCorr: Posterior summary statistics based on chains stored in z.SD.* and z.Corr.*.* columns of the data.frame mixture.

  • poster.mean.w: a numeric vector with posterior means of mixture weights after re-labeling. It is computed only if KK is fixed and even then I am not convinced that these are useful posterior summary statistics (see label switching problem mentioned above). In any case, they should be used with care.

  • poster.mean.mu: a matrix with posterior means of mixture means after re-labeling. It is computed only if KK is fixed and even then I am not convinced that these are useful posterior summary statistics (see label switching problem mentioned above). In any case, they should be used with care.

  • poster.mean.Q: a list with posterior means of mixture inverse variances after re-labeling. It is computed only if KK is fixed and even then I am not convinced that these are useful posterior summary statistics (see label switching problem mentioned above). In any case, they should be used with care.

  • poster.mean.Sigma: a list with posterior means of mixture variances after re-labeling. It is computed only if KK is fixed and even then I am not convinced that these are useful posterior summary statistics (see label switching problem mentioned above). In any case, they should be used with care.

  • poster.mean.Li: a list with posterior means of Cholesky decompositions of mixture inverse variances after re-labeling. It is computed only if KK is fixed and even then I am not convinced that these are useful posterior summary statistics (see label switching problem mentioned above). In any case, they should be used with care.

  • relabel: a list which specifies the algorithm used to re-label the MCMC output to compute order, rank, poster.comp.prob_u, poster.comp.prob_b, poster.mean.w, poster.mean.mu, poster.mean.Q, poster.mean.Sigma, poster.mean.Li.

  • Cpar: a list with components useful to call underlying C++ functions (not interesting for ordinary users).

Object of class NMixMCMClist

Object of class NMixMCMClist is the list having two components of class NMixMCMC representing two parallel chains and additionally the following components:

  • PED: values of penalized expected deviance and related quantities. It is a vector with five components: D.expect ==

     estimated expected deviance, where the estimate is based on two parallel chains; `popt` $=$ estimated penalty, where the estimate is based on simple MCMC average based on two parallel chains; `PED` $=$ estimated penalized expected deviance $=$ `D.expect` $+$ `popt`; `wpopt` $=$
     
     estimated penalty, where the estimate is based on weighted MCMC average (through importance sampling) based on two parallel chains; `wPED` $=$ estimated penalized expected deviance $=$
     
      `D.expect` $+$ `wpopt`.
    
  • popt: contributions to the unweighted penalty from each observation.

  • wpopt: contributions to the weighted penalty from each observation.

  • inv.D: for each observation, number of iterations (in both chains), where the deviance was in fact equal to infinity (when the corresponding density was lower than dens.zero) and was not taken into account when computing D.expect.

  • inv.popt: for each observation, number of iterations, where the penalty was in fact equal to infinity and was not taken into account when computing popt.

  • inv.wpopt: for each observation, number of iterations, where the importance sampling weight was in fact equal to infinity and was not taken into account when computing wpopt.

  • sumISw: for each observation, sum of importance sampling weights.

References

Celeux, G., Forbes, F., Robert, C. P., and Titterington, D. M. (2006). Deviance information criteria for missing data models. Bayesian Analysis, 1 (4), 651--674.

Cappé, Robert and Rydén (2003). Reversible jump, birth-and-death and more general continuous time Markov chain Monte Carlo samplers. Journal of the Royal Statistical Society, Series B, 65 (3), 679--700.

Diebolt, J. and Robert, C. P. (1994). Estimation of finite mixture distributions through Bayesian sampling. Journal of the Royal Statistical Society, Series B, 56 (2), 363--375.

Jasra, A., Holmes, C. C., and Stephens, D. A. (2005). Markov chain Monte Carlo methods and the label switching problem in Bayesian mixture modelling. Statistical Science, 20 (1), 50--67.

Komárek, A. (2009). A new R package for Bayesian estimation of multivariate normal mixtures allowing for selection of the number of components and interval-censored data. Computational Statistics and Data Analysis, 53 (12), 3932--3947.

Plummer, M. (2008). Penalized loss functions for Bayesian model comparison. Biostatistics, 9 (3), 523--539.

Richardson, S. and Green, P. J. (1997). On Bayesian analysis of mixtures with unknown number of components (with Discussion). Journal of the Royal Statistical Society, Series B, 59 (4), 731--792.

Spiegelhalter, D. J.,Best, N. G., Carlin, B. P., and van der Linde, A. (2002). Bayesian measures of model complexity and fit (with Discussion). Journal of the Royal Statistical Society, Series B, 64 (4), 583--639.

See Also

NMixPredDensMarg, NMixPredDensJoint2.

Author(s)

Arnošt Komárek arnost.komarek@mff.cuni.cz

Examples

## Not run: ## See also additional material available in ## YOUR_R_DIR/library/mixAK/doc/ ## or YOUR_R_DIR/site-library/mixAK/doc/ ## - files Galaxy.R, Faithful.R, Tandmob.R and ## https://www2.karlin.mff.cuni.cz/~komarek/software/mixAK/Galaxy.pdf ## https://www2.karlin.mff.cuni.cz/~komarek/software/mixAK/Faithful.pdf ## https://www2.karlin.mff.cuni.cz/~komarek/software/mixAK/Tandmob.pdf ## ## ============================================== ## Simple analysis of Anderson's iris data ## ============================================== library("colorspace") data(iris, package="datasets") summary(iris) VARS <- names(iris)[1:4] #COLS <- rainbow_hcl(3, start = 60, end = 240) COLS <- c("red", "darkblue", "darkgreen") names(COLS) <- levels(iris[, "Species"]) ### Prior distribution and the length of MCMC Prior <- list(priorK = "fixed", Kmax = 3) nMCMC <- c(burn=5000, keep=10000, thin=5, info=1000) ### Run MCMC set.seed(20091230) fit <- NMixMCMC(y0 = iris[, VARS], prior = Prior, nMCMC = nMCMC) ### Basic posterior summary print(fit) ### Univariate marginal posterior predictive densities ### based on chain #1 pdens1 <- NMixPredDensMarg(fit[[1]], lgrid=150) plot(pdens1) plot(pdens1, main=VARS, xlab=VARS) ### Bivariate (for each pair of margins) predictive densities ### based on chain #1 pdens2a <- NMixPredDensJoint2(fit[[1]]) plot(pdens2a) plot(pdens2a, xylab=VARS) plot(pdens2a, xylab=VARS, contour=TRUE) ### Determine the grid to compute bivariate densities grid <- list(Sepal.Length=seq(3.5, 8.5, length=75), Sepal.Width=seq(1.8, 4.5, length=75), Petal.Length=seq(0, 7, length=75), Petal.Width=seq(-0.2, 3, length=75)) pdens2b <- NMixPredDensJoint2(fit[[1]], grid=grid) plot(pdens2b, xylab=VARS) ### Plot with contours ICOL <- rev(heat_hcl(20, c=c(80, 30), l=c(30, 90), power=c(1/5, 2))) oldPar <- par(mfrow=c(2, 3), bty="n") for (i in 1:3){ for (j in (i+1):4){ NAME <- paste(i, "-", j, sep="") MAIN <- paste(VARS[i], "x", VARS[j]) image(pdens2b$x[[i]], pdens2b$x[[j]], pdens2b$dens[[NAME]], col=ICOL, xlab=VARS[i], ylab=VARS[j], main=MAIN) contour(pdens2b$x[[i]], pdens2b$x[[j]], pdens2b$dens[[NAME]], add=TRUE, col="brown4") } } ### Plot with data for (i in 1:3){ for (j in (i+1):4){ NAME <- paste(i, "-", j, sep="") MAIN <- paste(VARS[i], "x", VARS[j]) image(pdens2b$x[[i]], pdens2b$x[[j]], pdens2b$dens[[NAME]], col=ICOL, xlab=VARS[i], ylab=VARS[j], main=MAIN) for (spec in levels(iris[, "Species"])){ Data <- subset(iris, Species==spec) points(Data[,i], Data[,j], pch=16, col=COLS[spec]) } } } ### Set the graphical parameters back to their original values par(oldPar) ### Clustering based on posterior summary statistics of component allocations ### or on the posterior distribution of component allocations ### (these are two equivalent estimators of probabilities of belonging ### to each mixture components for each observation) p1 <- fit[[1]]$poster.comp.prob_u p2 <- fit[[1]]$poster.comp.prob_b ### Clustering based on posterior summary statistics of mixture weight, means, variances p3 <- NMixPlugDA(fit[[1]], iris[, VARS]) p3 <- p3[, paste("prob", 1:3, sep="")] ### Observations from "setosa" species (all would be allocated in component 1) apply(p1[1:50,], 2, quantile, prob=seq(0, 1, by=0.1)) apply(p2[1:50,], 2, quantile, prob=seq(0, 1, by=0.1)) apply(p3[1:50,], 2, quantile, prob=seq(0, 1, by=0.1)) ### Observations from "versicolor" species (almost all would be allocated in component 2) apply(p1[51:100,], 2, quantile, prob=seq(0, 1, by=0.1)) apply(p2[51:100,], 2, quantile, prob=seq(0, 1, by=0.1)) apply(p3[51:100,], 2, quantile, prob=seq(0, 1, by=0.1)) ### Observations from "virginica" species (all would be allocated in component 3) apply(p1[101:150,], 2, quantile, prob=seq(0, 1, by=0.1)) apply(p2[101:150,], 2, quantile, prob=seq(0, 1, by=0.1)) apply(p3[101:150,], 2, quantile, prob=seq(0, 1, by=0.1)) ## End(Not run)