complh function

Complete Likelihood Frequency Method for Label Switching

`complh' addresses the label switching problem by maximizing the complete likelihood (Yao, 2015). The method uses the latent component labels, that is, the component memberships the user used to generate the sample, so the true labels must be known, as in a simulation study. The function supports one-dimensional data (with equal or unequal variances) and multi-dimensional data (with equal variances).
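The idea can be illustrated with a minimal sketch for univariate data, assuming the labeling is chosen by evaluating the complete log-likelihood under every permutation of the estimated components and keeping the best one. This is not the package's implementation; the helper names perms, complete_loglik and relabel_by_complete_lik are hypothetical, and only base R is used.

```r
# Hypothetical helper: all permutations of the vector v, one per row.
perms <- function(v) {
  if (length(v) <= 1) return(matrix(v, nrow = 1))
  do.call(rbind, lapply(seq_along(v), function(i) cbind(v[i], perms(v[-i]))))
}

# Complete log-likelihood of a univariate normal mixture:
# sum over components j and observations i of lat[j, i] * log(prop[j] * f(x[i]; mu[j], sigma[j])).
complete_loglik <- function(x, lat, mu, sigma, prop) {
  total <- 0
  for (j in seq_along(mu)) {
    total <- total + sum(lat[j, ] * (log(prop[j]) + dnorm(x, mu[j], sigma[j], log = TRUE)))
  }
  total
}

# Try every relabeling of the estimate and keep the one with the largest
# complete log-likelihood (brute force, so only sensible for small C).
relabel_by_complete_lik <- function(x, lat, mu, sigma, prop) {
  P <- perms(seq_along(mu))
  ll <- apply(P, 1, function(s) complete_loglik(x, lat, mu[s], sigma[s], prop[s]))
  best <- P[which.max(ll), ]
  list(mu = mu[best], sigma = sigma[best], pi = prop[best])
}

# Toy data: the first three observations come from component 1, the rest from component 2.
x <- c(-0.2, 0.1, 0.3, 2.9, 3.1, 3.2)
lat <- rbind(c(1, 1, 1, 0, 0, 0),
             c(0, 0, 0, 1, 1, 1))

# An estimate whose labels are switched relative to lat.
fixed <- relabel_by_complete_lik(x, lat, mu = c(3, 0), sigma = c(1, 1), prop = c(0.6, 0.4))
fixed$mu
```

In this toy case the component with mean 0 is moved back to the first position, since that ordering agrees with the latent labels. Enumerating all C! permutations is only practical for a small number of components.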

complh(est, lat)

Arguments

  • est: a list with four elements representing the estimated mixture model, which can be obtained using the mixnorm function. It has the form list(mu, sigma, pi, p), where C is the number of mixture components, n is the number of observations, and the dimension of the data is also denoted p. Here mu is a C by p matrix of estimated component means; sigma is a p by p matrix of the estimated common standard deviation for all components (when the data is multi-dimensional) or a C-dimensional vector of estimated component standard deviations (when the data is one-dimensional); pi is a C-dimensional vector of mixing proportions; and the list element p is a C by n matrix of classification probabilities, whose (i, j)th element is the probability that the jth observation belongs to the ith component.
  • lat: a C by n zero-one matrix representing the latent component labels for all observations, where C is the number of components in the mixture model and n is the number of observations. If the (i, j)th cell is 1, it indicates that the jth observation belongs to the ith component.
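The shapes of the two arguments can be sketched in base R as follows. The label vector, data values and parameter values are made up for illustration, and whether complh accepts a hand-built list exactly like this one (rather than the output of mixnorm) is an assumption; the sketch only shows the dimensions described above.

```r
# Integer labels (1..C) recording which component generated each observation.
labels <- c(1, 1, 2, 3, 2)
C <- 3

# C by n zero-one matrix: entry (i, j) is 1 when observation j belongs to component i.
lat <- 1 * outer(seq_len(C), labels, "==")

# A hand-built univariate estimate with the four elements mu, sigma, pi, p.
x <- c(-0.3, 0.2, 1.9, 4.1, 2.2)
mu <- matrix(c(0, 2, 4), ncol = 1)  # C by p matrix, here p = 1
sigma <- c(1, 1, 1)                 # C-dimensional vector (one-dimensional data)
prop <- c(0.4, 0.4, 0.2)            # mixing proportions

# Classification probabilities by Bayes' rule:
# p[i, j] is proportional to prop[i] * f(x[j]; mu[i], sigma[i]).
dens <- sapply(x, function(xj) prop * dnorm(xj, mean = mu[, 1], sd = sigma))
p <- sweep(dens, 2, colSums(dens), "/")

est <- list(mu = mu, sigma = sigma, pi = prop, p = p)
```

Each column of lat contains exactly one 1, and each column of est$p sums to 1, so both matrices are C by n with observations in columns.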

Returns

The estimation results, adjusted to account for potential label switching, are returned as a list containing the following elements:

  • mu: C by p matrix of estimated component means.

  • sigma: C-dimensional vector of estimated component standard deviations (for univariate data) or p by p matrix of estimated component variance (for multivariate data).

  • pi: C-dimensional vector of estimated mixing proportions.

Examples

#-----------------------------------------------------------------------------------------#
# Example 1: Two-component Univariate Normal Mixture
#-----------------------------------------------------------------------------------------#
# Simulate the data
set.seed(827)
n = 200
prop = 0.3
n1 = rbinom(1, n, prop)
mudif = 1.5
x1 = rnorm(n1, 0, 1)
x2 = rnorm(n - n1, mudif, 1)
x = c(x1, x2)
pm = c(2, 1, 3, 5, 4)

# Use the `mixnorm' function to get the MLE and the estimated classification probabilities
out = mixnorm(x, 2)

# Prepare latent component label
lat = rbind(rep(c(1, 0), times = c(n1, n - n1)),
            rep(c(0, 1), times = c(n1, n - n1)))

# Fit the complh/distlat function
clhest = complh(out, lat)
clhest
# Result:
# mean of the first component: -0.1037359,
# mean of the second component: 1.6622397,
# sigma is 0.8137515 for both components, and
# the proportions for the two components are
# 0.3945660 and 0.6054340, respectively.
distlatest = distlat(out, lat)

#-----------------------------------------------------------------------------------------#
# Example 2: Two-component Multivariate Normal Mixture
#-----------------------------------------------------------------------------------------#
# Simulate the data
n = 400
prop = 0.3
n1 = rbinom(1, n, prop)
pi = c(prop, 1 - prop)
mu1 = 0.5
mu2 = 0.5
mu = matrix(c(0, mu1, 0, mu2), ncol = 2)
pm = c(2, 1, 4, 3, 6, 5)
sigma = diag(c(1, 1))
ini = list(sigma = sigma, mu = mu, pi = pi)
x1 = mvtnorm::rmvnorm(n1, c(0, 0), ini$sigma)
x2 = mvtnorm::rmvnorm(n - n1, c(mu1, mu2), ini$sigma)
x = rbind(x1, x2)

# Use the `mixnorm' function to get the MLE and the estimated classification probabilities
out = mixnorm(x, 2)

# Prepare latent component label
lat = rbind(rep(c(1, 0), times = c(n1, n - n1)),
            rep(c(0, 1), times = c(n1, n - n1)))

# Fit the complh/distlat function
clhest = complh(out, lat)
distlatest = distlat(out, lat)

#-----------------------------------------------------------------------------------------#
# Example 3: Three-component Multivariate Normal Mixture
#-----------------------------------------------------------------------------------------#
# Simulate the data
n = 100
pi = c(0.2, 0.3, 0.5)
ns = stats::rmultinom(1, n, pi)
n1 = ns[1]; n2 = ns[2]; n3 = ns[3]
mu1 = 1
mu2 = 1
mu = matrix(c(0, mu1, 2 * mu1, 0, mu2, 2 * mu2), ncol = 2)
sigma = diag(c(1, 1))
ini = list(sigma = sigma, mu = mu, pi = pi)
x1 = mvtnorm::rmvnorm(n1, c(0, 0), ini$sigma)
x2 = mvtnorm::rmvnorm(n2, c(mu1, mu2), ini$sigma)
x3 = mvtnorm::rmvnorm(n3, c(2 * mu1, 2 * mu2), ini$sigma)
x = rbind(x1, x2, x3)

# Use the `mixnorm' function to get the MLE and the estimated classification probabilities
out = mixnorm(x, 3)

# Prepare latent component label
lat = rbind(rep(c(1, 0), times = c(n1, n - n1)),
            rep(c(0, 1, 0), times = c(n1, n2, n3)),
            rep(c(0, 1), times = c(n - n3, n3)))

# Fit the complh/distlat function
clhest = complh(out, lat)
distlatest = distlat(out, lat)

References

Yao, W. (2015). Label switching and its solutions for frequentist mixture models. Journal of Statistical Computation and Simulation, 85(5), 1000-1012.

See Also

distlat, mixnorm

  • Maintainer: Suyeon Kang
  • License: GPL (>= 2)
  • Last published: 2023-09-20
