mr_mvcML function

Multivariable constrained maximum likelihood method

The mr_mvcML function performs multivariable Mendelian randomization via the constrained maximum likelihood method, which is robust to both correlated and uncorrelated pleiotropy.

mr_mvcML(
  object,
  n,
  DP = TRUE,
  rho_mat = diag(ncol(object@betaX) + 1),
  K_vec = 0:(ceiling(nrow(object@betaX)/2)),
  random_start = 0,
  num_pert = 100,
  min_theta_range = -0.5,
  max_theta_range = 0.5,
  maxit = 100,
  alpha = 0.05,
  seed = 314159265
)

## S4 method for signature 'MRMVInput'
mr_mvcML(
  object,
  n,
  DP = TRUE,
  rho_mat = diag(ncol(object@betaX) + 1),
  K_vec = 0:(ceiling(nrow(object@betaX)/2)),
  random_start = 0,
  num_pert = 200,
  min_theta_range = -0.5,
  max_theta_range = 0.5,
  maxit = 100,
  alpha = 0.05,
  seed = 314159265
)

Arguments

  • object: An MRMVInput object.
  • n: Sample size. The smallest sample size among all (both exposures and outcome) GWAS used in the analysis is recommended.
  • DP: Whether data perturbation is applied or not. Default is TRUE.
  • rho_mat: The correlation matrix among the exposure and outcome GWAS estimates, which can be estimated from the intercept terms of bivariate LDSC. See the reference for further discussion. Default is the identity matrix, which is appropriate, for example, when the GWAS datasets have no overlapping samples.
  • K_vec: Set of candidate values of K, the constraint parameter representing the number of invalid IVs. It can range from 0 up to #IV - (#exposure + 1). Default is from 0 to (#IV/2).
  • random_start: Number of random starting points for MVMRcML, default is 0.
  • num_pert: Number of perturbations when DP is TRUE, default is 200.
  • min_theta_range: The lower bound of the uniform distribution from which each initial value of theta is generated, default is -0.5.
  • max_theta_range: The upper bound of the uniform distribution from which each initial value of theta is generated, default is 0.5.
  • maxit: Maximum number of iterations for each optimization. Default is 100.
  • alpha: Significance level for the confidence intervals of the estimates, default is 0.05.
  • seed: The random seed to use when generating the perturbed samples (for reproducibility). The default value is 314159265. If set to NA, the random seed will not be set (for example, if the function is used as part of a larger simulation).
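
The defaults and bounds described for rho_mat and K_vec can be worked through by hand. The following is a minimal sketch in base R, assuming hypothetical dimensions (20 IVs, 3 exposures); the helper is_valid_rho is not part of the package and only illustrates basic checks one might run on a user-supplied correlation matrix.

```r
# Hypothetical dimensions, for illustration only.
n_iv <- 20        # number of IVs (rows of betaX)
n_exposure <- 3   # number of exposures (columns of betaX)

# Default correlation matrix: identity of dimension (#exposure + 1).
rho_default <- diag(n_exposure + 1)

# Default candidate set for K, and the largest K the documentation allows.
K_default <- 0:ceiling(n_iv / 2)
K_max <- n_iv - (n_exposure + 1)

# Hypothetical helper (not a package function): basic sanity checks
# for a correlation matrix among exposure and outcome GWAS estimates.
is_valid_rho <- function(rho, n_exposure) {
  all(dim(rho) == n_exposure + 1) &&
    isSymmetric(rho) &&
    all(diag(rho) == 1) &&
    all(eigen(rho, symmetric = TRUE, only.values = TRUE)$values > 0)
}

# Toy correlation matrix for 3 exposures plus 1 outcome.
rho_toy <- matrix(c(1, -0.1, 0.2, 0,
                    -0.1, 1, -0.3, 0,
                    0.2, -0.3, 1, 0,
                    0, 0, 0, 1), ncol = 4)
rho_ok <- is_valid_rho(rho_toy, n_exposure)
```

With these dimensions the default K_vec stops at 10, while the documented upper limit is 16, so the default covers only part of the admissible range.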

Returns

The output from the function is an MVMRcML object containing:

  • Exposure: A character vector with the names given to the exposures.

  • Outcome: A character string with the name given to the outcome.

  • Estimate: A vector of causal estimates.

  • StdError: A vector of standard errors of the causal estimates.

  • CILower: The lower bounds of the confidence intervals for the causal estimates, based on the estimated standard errors and the significance level provided.

  • CIUpper: The upper bounds of the confidence intervals for the causal estimates, based on the estimated standard errors and the significance level provided.

  • Alpha: The significance level used when calculating the confidence intervals.

  • Pvalue: The p-values associated with the estimates, from a Wald test that compares Estimate/StdError with a standard normal distribution.

  • BIC_invalid: Set of invalid IVs selected by MVMRcML-BIC.

  • K_hat: The number of invalid IVs selected by MVMRcML-BIC, or a vector giving that number for each data perturbation in MVMRcML-DP.

  • eff_DP_B: The number of data perturbations with successful convergence in MVMRcML-DP.

  • SNPs: The number of genetic variants (SNPs) included in the analysis.
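
The relationship between Estimate, StdError, Alpha, the confidence bounds and Pvalue can be illustrated in base R. This is a minimal sketch of the standard Wald construction, assuming hypothetical estimates and standard errors; it is not the package's internal code.

```r
# Hypothetical estimates and standard errors, for illustration only.
estimate <- c(0.40, -0.10, 0.25)
std_error <- c(0.10, 0.08, 0.12)
alpha <- 0.05

# Wald statistic and two-sided p-value from the standard normal distribution.
z <- estimate / std_error
p_value <- 2 * pnorm(-abs(z))

# Normal-based confidence interval at level 1 - alpha.
crit <- qnorm(1 - alpha / 2)
ci_lower <- estimate - crit * std_error
ci_upper <- estimate + crit * std_error

wald_table <- data.frame(estimate, std_error, ci_lower, ci_upper, p_value)
```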

Details

Multivariable MRcML (MVMRcML) is an extension of MRcML that handles multiple exposures of interest. Like its univariable version, it is robust to both correlated and uncorrelated pleiotropy.

In practice, the data perturbation (DP) version is preferred for more robust inference, as it accounts for the uncertainty in model selection. However, it may take longer to run, especially when the number of IVs is large (so the range of K_vec can be large too). One strategy is to try a small range of K (the number of invalid IVs) first, with a small num_pert, and then widen the range if the number of selected invalid IVs falls close to the boundary. You can also use other methods, e.g. mr_mvlasso, to get a rough sense of the number of invalid IVs.
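
The boundary check in that strategy can be sketched as a small helper. The function k_near_boundary and its margin argument are hypothetical and not part of the package; the commented workflow assumes that the fitted object exposes K_hat as a slot (fit@K_hat) and that mv_input is an MRMVInput object created elsewhere.

```r
# Hypothetical helper (not a package function): TRUE when any selected K
# lies within `margin` of the largest candidate in K_vec.
k_near_boundary <- function(K_hat, K_vec, margin = 1) {
  any(K_hat >= max(K_vec) - margin)
}

K_vec_small <- 0:5
inside <- k_near_boundary(K_hat = c(2, 3, 2), K_vec = K_vec_small)
at_edge <- k_near_boundary(K_hat = c(4, 5, 3), K_vec = K_vec_small)

# Sketch of the workflow with the package (not run here; slot access is an assumption):
# fit <- mr_mvcML(mv_input, n = 17723, K_vec = K_vec_small, num_pert = 5)
# if (k_near_boundary(fit@K_hat, K_vec_small)) {
#   fit <- mr_mvcML(mv_input, n = 17723, K_vec = 0:10, num_pert = 5)
# }
```

Once a suitable range is found, the final run should use a larger num_pert for a stable result.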

As with mr_cML, multiple random starting points (random_start) can be used to improve the chance of finding the global minimum.
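
The idea behind random starting points can be shown with a generic multi-start optimization in base R. This is a minimal sketch on a toy objective, not the MVMRcML likelihood or the package's optimizer; the use of a zero vector as the first start and the function names are assumptions made for illustration. The argument names mirror those of mr_mvcML only to show their roles.

```r
# Toy objective with several local minima (hypothetical, for illustration only).
toy_objective <- function(theta) sum(theta^2 + 0.3 * sin(8 * theta))

multi_start <- function(fn, n_par, random_start = 5,
                        min_theta_range = -0.5, max_theta_range = 0.5,
                        maxit = 100, seed = 314159265) {
  # Setting the seed makes the random starts reproducible; NA skips it.
  if (!is.na(seed)) set.seed(seed)
  # Assumed convention: one start at zero, plus uniformly drawn starts.
  starts <- c(
    list(rep(0, n_par)),
    replicate(random_start,
              runif(n_par, min_theta_range, max_theta_range),
              simplify = FALSE)
  )
  # Optimize from every start and keep the fit with the smallest objective.
  fits <- lapply(starts, function(s) {
    optim(s, fn, method = "BFGS", control = list(maxit = maxit))
  })
  values <- vapply(fits, function(f) f$value, numeric(1))
  fits[[which.min(values)]]
}

best <- multi_start(toy_objective, n_par = 3)
zero_fit <- optim(rep(0, 3), toy_objective, method = "BFGS")
```

Because the zero start is always among the candidates, the multi-start result can never be worse than the single-start fit; extra starts only add runtime.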

Examples

# Perform MVMRcML-DP:
mr_mvcML(mr_mvinput(bx = cbind(ldlc, hdlc, trig),
                    bxse = cbind(ldlcse, hdlcse, trigse),
                    by = chdlodds, byse = chdloddsse),
         n = 17723, num_pert = 5, random_start = 5)
# num_pert is set to 5 to reduce runtime for the mr_mvcML method.
# At least 100 perturbations should be used and more is preferred for a stable result.

rho_mat = matrix(c(1, -0.1, 0.2, 0,
                   -0.1, 1, -0.3, 0,
                   0.2, -0.3, 1, 0,
                   0, 0, 0, 1), ncol = 4)  ## Toy example of rho_mat
mr_mvcML(mr_mvinput(bx = cbind(ldlc, hdlc, trig),
                    bxse = cbind(ldlcse, hdlcse, trigse),
                    by = chdlodds, byse = chdloddsse),
         n = 17723, num_pert = 5, rho_mat = rho_mat)

# Perform MVMRcML-BIC:
mr_mvcML(mr_mvinput(bx = cbind(ldlc, hdlc, trig),
                    bxse = cbind(ldlcse, hdlcse, trigse),
                    by = chdlodds, byse = chdloddsse),
         n = 17723, DP = FALSE)

References

Lin, Z., Xue, H., & Pan, W. (2023). Robust multivariable Mendelian randomization based on constrained maximum likelihood. The American Journal of Human Genetics, 110(4), 592-605.

  • Maintainer: Stephen Burgess
  • License: GPL-2 | GPL-3
  • Last published: 2024-04-12
