SimulateData function

Simulate data under certain graphs

Simulate data under certain graphs

This function simulates data using linear models for several graphs: the five basic topologies and three topologies that are common in biology, namely the multi-parent graph, the star graph and the layered graph. See details in Badsha and Fu (2019) and Badsha et al. (2021).

SimulateData(N, p, model, b0.1, b1.1, b1.2, b1.3, sd.1)

Arguments

  • N: The number of observations.
  • p: Population frequency of the reference allele. Real number between 0 to 1, which is the number of a particular allele is present.
  • model: The model for which data will be simulated. For example, if you want to generate data for model 0 you would type 'model0' into the function.
  • b0.1: Intercept of b0.1 + b1.1P1 + b1.2P2 + b1.3*P3, where P1, P2, and P3 are the parents of the corresponding node.
  • b1.1: Slope of P1 for b0.1 + b1.1P1 + b1.2P2 + b1.3*P3, where P1, P2, and P3 are the parents of the corresponding node.
  • b1.2: Slope of P2 for b0.1 + b1.1P1 + b1.2P2 + b1.3*P3, where P1, P2, and P3 are the parents of the corresponding node.
  • b1.3: Slope of P3 for b0.1 + b1.1P1 + b1.2P2 + b1.3*P3, where P1, P2, and P3 are the parents of the corresponding node.
  • sd.1: Standard deviation for corresponding data generated nodes.

Details

The first column of the input matrix is a genetic variant and the remaining columns are gene expression nodes.

Returns

Matrix

Author(s)

Md Bahadur Badsha (mbbadshar@gmail.com)

References

  1. Badsha MB and Fu AQ (2019). Learning causal biological networks with the principle of Mendelian randomization. Frontiers in Genetics, 10:460.

  2. Badsha MB, Martin EA and Fu AQ (2021). MRPC: An R package for inference of causal graphs. Frontiers in Genetics, 10:651812.

See Also

MRPC ; SimulateDataNP , which simulates data for a node with no parent; SimulateData1P for a node with one parent; SimulateData2P for a node with two parents.

Examples

# When there is one genetic variant, the 1st column of # the simulated data matrix will be the variant and the remaining # columns are the gene expression nodes. ## Model 0 simu_data_M0 <- SimulateData(N = 10^3, p = 0.45, 'model0', b0.1 = 0, b1.1 = 1, b1.2 = 1, b1.3 = 1, sd.1 = 1) ## Model 1 simu_data_M1 <- SimulateData(N = 10^3, p = 0.45, 'model1', b0.1 = 0, b1.1 = 1, b1.2 = 1, b1.3 = 1, sd.1 = 1) ## Model 2 simu_data_M2 <- SimulateData(N = 10^3, p = 0.45, 'model2', b0.1 = 0, b1.1 = 1, b1.2 = 1, b1.3 = 1, sd.1 = 1) ## Model 3 simu_data_M3 <- SimulateData(N = 10^3, p = 0.45, 'model3', b0.1 = 0, b1.1 = 1, b1.2 = 1, b1.3 = 1, sd.1 = 1) ## Model 4 simu_data_M4 <- SimulateData(N = 10^3, p = 0.45, 'model4', b0.1 = 0, b1.1 = 1, b1.2 = 1, b1.3 = 1, sd.1 = 1) ## Multiple Parent Model simu_data_multiparent <- SimulateData(N = 10^3, p = 0.45, 'multiparent', b0.1 = 0, b1.1 = 1, b1.2 = 1, b1.3 = 1, sd.1 = 1) ## Star Model simu_data_starshaped <- SimulateData(N = 10^3, p = 0.45, 'starshaped', b0.1 = 0, b1.1 = 1, b1.2 = 1, b1.3 = 1, sd.1 = 1) ## Layered Model simu_data_layered <- SimulateData(N = 10^3, p = 0.45, 'layered', b0.1 = 0, b1.1 = 1, b1.2 = 1, b1.3 = 1, sd.1 = 1)
  • Maintainer: Audrey Fu
  • License: GPL (>= 2)
  • Last published: 2022-04-11

Useful links