calc_estimate function

Calculate estimate

Calculates the SAVER estimate.

Usage

calc.estimate(
  x, x.est, cutoff = 0, coefs = NULL, sf, scale.sf,
  pred.gene.names, pred.cells, null.model, nworkers,
  calc.maxcor, estimates.only
)

calc.estimate.mean(x, sf, scale.sf, mu, nworkers, estimates.only)

calc.estimate.null(x, sf, scale.sf, nworkers, estimates.only)

Arguments

  • x: An expression count matrix. The rows correspond to genes and the columns correspond to cells.
  • x.est: The log-normalized predictor matrix. The rows correspond to cells and the columns correspond to genes.
  • cutoff: Maximum absolute correlation to determine whether a gene should be predicted.
  • coefs: Coefficients of a linear fit of log-squared ratio of largest lambda to lambda of lowest cross-validation error. Used to estimate model with lowest cross-validation error.
  • sf: Normalized size factor.
  • scale.sf: Scale of size factor.
  • pred.gene.names: Names of genes to perform regression prediction.
  • pred.cells: Index of cells to perform regression prediction.
  • null.model: Whether to use mean gene expression as prediction.
  • nworkers: Number of cores registered to parallel backend.
  • calc.maxcor: Whether to calculate maximum absolute correlation.
  • estimates.only: Only return SAVER estimates. Default is FALSE.
  • mu: Matrix of prior means.
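
The matrix layouts described above can be illustrated in base R. This is a minimal sketch, not the package's own preprocessing: it assumes the size factor is each cell's library size divided by the mean library size, and that the predictor matrix is log(normalized count + 1). Neither convention is stated on this page, and the toy matrix and its gene and cell names are made up for illustration.

```r
# Toy count matrix: 4 genes (rows) x 5 cells (columns). Values are illustrative.
set.seed(1)
x <- matrix(rpois(20, lambda = 3), nrow = 4,
            dimnames = list(paste0("gene", 1:4), paste0("cell", 1:5)))

# Assumed size-factor convention: library sizes rescaled to have mean 1.
lib.size <- colSums(x)
sf <- lib.size / mean(lib.size)

# Assumed predictor matrix: log of size-normalized counts plus 1,
# transposed so that rows are cells and columns are genes (the x.est layout).
x.est <- t(log(sweep(x, 2, sf, "/") + 1))

dim(x)      # genes x cells
dim(x.est)  # cells x genes
```

Note that x and x.est have opposite orientations: x is genes by cells, while x.est is cells by genes.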

Returns

A list with the following components:

  • est: Recovered (normalized) expression.
  • se: Standard error of estimates.
  • maxcor: Maximum absolute correlation for each gene. 2 if not calculated.
  • lambda.max: Smallest value of lambda which gives the null model.
  • lambda.min: Value of lambda from which the prediction model is used.
  • sd.cv: Difference in the number of standard deviations in deviance between the model with lowest cross-validation error and the null model.
  • ct: Time taken to generate predictions.
  • vt: Time taken to estimate variance.

Details

The SAVER method starts by estimating the prior mean and variance of the true expression level for each gene and cell. The prior mean is obtained from the predictions of a LASSO Poisson regression fitted for each gene with the glmnet package. The prior variance is then estimated by maximum likelihood under one of three per-gene variance structures: constant variance, constant Fano factor, or constant coefficient of variation. Finally, the posterior distribution is calculated and the posterior mean is reported as the SAVER estimate.
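
The final step, combining the prior with the observed counts, can be illustrated for a single gene. The sketch below assumes a Poisson likelihood with a gamma prior, which is a conjugate pair, so the posterior is also gamma. This page does not name the prior family, so treat that as an assumption. The function posterior_sketch and its arguments are hypothetical and not part of the package, and the single prior.var value stands in for whatever variance structure is selected.

```r
# Hypothetical helper, not part of SAVER: posterior mean and standard
# deviation for one gene, assuming y ~ Poisson(sf * lambda) and a gamma
# prior on lambda with mean mu and variance prior.var.
posterior_sketch <- function(y, sf, mu, prior.var) {
  # Gamma parameters from mean and variance:
  # shape = mean^2 / variance, rate = mean / variance.
  alpha <- mu^2 / prior.var
  beta <- mu / prior.var
  # Conjugate update: the shape gains the count, the rate gains the size factor.
  post.alpha <- alpha + y
  post.beta <- beta + sf
  list(est = post.alpha / post.beta,       # posterior mean
       se = sqrt(post.alpha) / post.beta)  # posterior standard deviation
}

y <- c(0, 2, 5)          # observed counts for one gene in three cells
sf <- c(0.8, 1.0, 1.2)   # size factors
mu <- c(1.5, 2.0, 3.0)   # prior means, e.g. from a regression prediction
out <- posterior_sketch(y, sf, mu, prior.var = 1)
out$est
out$se
```

Under these assumptions the estimate for each cell lies between the normalized observed count y / sf and the prior mean mu, and it moves toward mu as prior.var shrinks.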

  • Maintainer: Mo Huang
  • License: GPL-2
  • Last published: 2019-11-13