x: A matrix of observations where each row correspond to an observation and each columns to a feature/variable.
theta: A list of parameters of class theta as described in rtheta. Optional. If not provided m should be given.
m: numeric. The number of components if theta is not supplied.
eps: The maximal required difference in successive likelihoods to establish convergence.
max.ite: The maximum number of iterations.
trace.theta: Logical. If TRUE, all estimates are stored and returned. Default is FALSE.
verbose: Set to TRUE for verbose output. Default is FALSE.
Returns
A list of length 3 with elements: - theta: A list of the estimated parameters as described in rtheta.
loglik.tr: A numeric vector of the log-likelihood trace.
kappa: A matrix where kappa[i,j] is the probability that x[i, ] is realized from the j'th component.
Details
Though not as versatile, the algorithm can be a faster alternative to Mclust in the mclust-package. If theta is not given, a k-means clustering is used to determine the initial theta.
Examples
set.seed(3)true.theta <- rtheta(d =2, m =3, method ="old")true.theta$sigma <- lapply(true.theta$sigma, cov2cor)# Scale## Not run:plot(true.theta, nlevels =20, add.ellipses =TRUE)## End(Not run)data <- SimulateGMCMData(n =1000, theta = true.theta)start.theta <- rtheta(d =2, m =3)start.theta$mu <- t(kmeans(data$z,3)$centers)# More sensible location estimatesstart.theta <- as.theta(start.theta)# Coerce the matrix to a listres <- GMCM:::EMAlgorithm(data$z, theta = start.theta)par(mfrow = c(1,2))plot(data$z, cex =0.5, pch =16, main ="Simulated data", col = rainbow(3)[data$K])plot(data$z, cex =0.5, pch =16, main ="GMM clustering", col = rainbow(3)[apply(res$kappa,1,which.max)])