Log likelihood calculation for a Poisson mixture model
Log likelihood calculation for a Poisson mixture model
Functions to calculate the log likelihood for a Poisson mixture model, the difference in log likelihoods for two different sets of parameters of a Poisson mixture model or the log-likelihood for each observation.
y: (n x q) matrix of observed counts for n observations and q variables
mean: List of length g containing the (n x q) matrices of conditional mean expression for all observations, as calculated by the PoisMixMean function, where g represents the number of clusters
mean.new: List of length g containing the (n x q) matrices of conditional mean expression for all observations for one set of parameters, as calculated by the PoisMixMean function, where g represents the number of clusters
mean.old: List of length g containing the (n x q) matrices of conditional mean expression for all observations for another set of parameters, as calculated by the PoisMixMean function, where g represents the number of clusters
pi.new: Vector of length g containing one estimate for π^
pi.old: Vector of length g containing another estimate for π^
pi: Vector of length g containing estimate for π^
conds: Vector of length q defining the condition (treatment group) for each variable (column) in y
s: Estimate of normalized per-variable library size
lambda: (d x g) matrix containing the current estimate of lambda, where d is the number of conditions (treatment groups) and g is the number of clusters
Details
The logLikePoisMixDiff function is used to calculate the difference in log likelihood for two different sets of parameters in a Poisson mixture model; it is used to determine convergence in the EM algorithm run by the PoisMixClus function. The logLikePoisMix function (taken largely from the mylogLikePoisMix function from the poisson.glm.mix R package) calculates the log likelihood for a given set of parameters in a Poisson mixture model and is used in the PoisMixClus function for the calculation of the BIC and ICL. The mylogLikePoisMixObs function calculates the log likelihood per observation for a given set of parameters in a Poisson mixture model.
Returns
ll: (Depending on the context), the log likelihood, difference in log likelihoods for two different sets of parameters, or per-observation log-likelihood
Note
In the logLikePoisMixDiff function, we make use of the alternative mass function for a Poisson density proposed by Loader (2000) to avoid computational difficulties. The logLikePoisMixDiff function returns a default value of 100 if one or both of the log likelihoods associated with the two parameter sets takes on a value of −∞.
Rau, A., Maugis-Rabusseau, C., Martin-Magniette, M.-L., Celeux G. (2015). Co-expression analysis of high-throughput transcriptome sequencing data with Poisson mixture models. Bioinformatics, 31(9):1420-1427.
Rau, A., Celeux, G., Martin-Magniette, M.-L., Maugis-Rabusseau, C. (2011) Clustering high-throughput sequencing data with Poisson mixture models. Inria Research Report 7786. Available at https://inria.hal.science/inria-00638082.
Author(s)
Andrea Rau
See Also
PoisMixClus for Poisson mixture model estimation and model selection; PoisMixMean to calculate the per-cluster conditional mean of each observation
Examples
set.seed(12345)## Simulate data as shown in Rau et al. (2011)## Library size setting "A", low cluster separation## n = 200 observationssimulate <- PoisMixSim(n =200, libsize ="A", separation ="low")y <- simulate$y
conds <- simulate$conditions
w <- rowSums(y)## Estimate of wr <- table(conds)## Number of replicates per conditiond <- length(unique(conds))## Number of conditionss <- colSums(y)/ sum(y)## TC estimate of lib sizes.dot <- rep(NA, d)## Summing lib size within conditionsfor(j in1:d) s.dot[j]<- sum(s[which(conds == unique(conds)[j])]);## Initial guess for pi and lambdag.true <-4pi.guess <- simulate$pi
## Recalibrate so that (s.dot * lambda.guess) = 1lambda.sim <- simulate$lambda
lambda.guess <- matrix(NA, nrow = d, ncol = g.true)for(k in1:g.true){ tmp <- lambda.sim[,k]/sum(lambda.sim[,k]) lambda.guess[,k]<- tmp/s.dot
}## Run the PMM-II model for g = 4## with EM algorithm and "TC" library size parameterrun <- PoisMixClus(y, g =4, norm ="TC", conds = conds)pi.est <- run$pi
lambda.est <- run$lambda
## Mean values for each of the parameter setsmean.guess <- PoisMixMean(y,4, conds, s, lambda.guess)mean.est <- PoisMixMean(y,4, conds, s, lambda.est)## Difference in log likelihoods LL.diff <- logLikePoisMixDiff(y, mean.guess, pi.guess, mean.est, pi.est)LL.diff ## -12841.11