simuDataREM function

Data Simulation Under the Random-Effects Mixture Model

Function simuDataREM simulates data under the Ornstein-Uhlenbeck (OU) (or Brownian Motion; BM) process-based random-effects mixture (REM) model.

simuDataREM(pars.mtx, dt, T, ntime, nrep, nsize, times, method = c("eigen", "svd", "chol"), model = c("OU", "BM"))

Arguments

  • pars.mtx: A K × 8 matrix, where K is the number of clusters. Each row contains 8 parameters: the standard deviations of within-cluster variability, of variability across time points, and of variability across replicates, respectively; the mean and standard deviation of the value at the first time point; and the overall mean, standard deviation and mean-reverting rate of the OU process.
  • dt: Increment in times.
  • T: Maximum time.
  • ntime: Number of time points to simulate data for. Must equal the length of the vector times.
  • nrep: Number of replicates.
  • nsize: An integer vector containing sizes of simulated clusters.
  • times: Vector of length ntime indicating at which time points to simulate data.
  • method: Method to compute the determinant of the covariance matrix in the calculation of the multivariate normal density. Required. Method choices are: "chol" for Choleski decomposition, "eigen" for eigenvalue decomposition, and "svd" for singular value decomposition.
  • model: Model to generate realizations of the mean vector of a mixture component. Required. Choices are: "OU" for an Ornstein-Uhlenbeck process (a.k.a. the mean-reverting process) and "BM" for a Brownian motion (without drift).
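
To make the model choice concrete, here is a minimal sketch of how one mean trajectory could be drawn at the requested times under either process. It is a generic textbook discretization in base R, not the package's internal code. The helper name simulate_mean_path and its arguments x0, mu, sigma and beta are hypothetical, and reading the last three entries of a pars.mtx row as the long-run mean, the diffusion standard deviation and the mean-reverting rate is an assumption.

```r
# Hypothetical helper (not part of the package): draw one mean trajectory
# at the given time points.
# x0    : value at the first time point
# mu    : long-run mean of the OU process (ignored for BM)
# sigma : standard deviation (diffusion) parameter
# beta  : mean-reverting rate (ignored for BM)
simulate_mean_path <- function(times, x0, mu, sigma, beta,
                               model = c("OU", "BM")) {
  model <- match.arg(model)
  n <- length(times)
  x <- numeric(n)
  x[1] <- x0
  if (n < 2) return(x)
  for (i in 2:n) {
    h <- times[i] - times[i - 1]  # gap between consecutive time points
    if (model == "OU") {
      # Exact OU transition: pull toward mu, plus Gaussian noise
      decay <- exp(-beta * h)
      step_sd <- sigma * sqrt((1 - decay^2) / (2 * beta))
      x[i] <- mu + (x[i - 1] - mu) * decay + rnorm(1, 0, step_sd)
    } else {
      # Brownian motion without drift: variance grows linearly in h
      x[i] <- x[i - 1] + rnorm(1, 0, sigma * sqrt(h))
    }
  }
  x
}

set.seed(1)
times <- c(0, 5, 10, 15, 20, 25, 30)
ou_path <- simulate_mean_path(times, x0 = 0, mu = 2.5, sigma = 0.4,
                              beta = 0.15, model = "OU")
bm_path <- simulate_mean_path(times, x0 = 0, mu = 0, sigma = 0.4,
                              beta = 0, model = "BM")
```

Under "OU" the path drifts toward mu at a speed set by beta, whereas under "BM" it wanders with no pull toward any level.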

Returns

  • means: A matrix of ntime columns. The number of rows is the same as that of pars.mtx, which is the number of clusters. Each row contains the true mean vector of the corresponding cluster.

  • data: A matrix of N rows and ntime*nrep+1 columns, where N is the sum of the cluster sizes in nsize. The first column contains the true cluster membership of the corresponding item. The remaining columns in each row are ordered as follows: values for replicates 1 through nrep at time 1; values for replicates 1 through nrep at time 2; and so on.
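
The column layout of data can be sketched as below. The sketch uses a small mock matrix built in base R in the documented layout, not real output from simuDataREM. The helper col_index and the object names mock, membership and values are hypothetical.

```r
ntime <- 3
nrep  <- 2
nsize <- c(2, 1)        # two clusters with 2 and 1 items
N     <- sum(nsize)

# Mock matrix in the documented layout (not real simuDataREM output):
# column 1 = cluster membership, then nrep values per time point.
mock <- cbind(rep(seq_along(nsize), times = nsize),
              matrix(seq_len(N * ntime * nrep), nrow = N))

# Column holding replicate r at time point t
col_index <- function(t, r, nrep) 1 + (t - 1) * nrep + r

membership <- mock[, 1]

# Replicates vary fastest, so fill an items x replicates x times array
values <- array(mock[, -1], dim = c(N, nrep, ntime),
                dimnames = list(NULL, paste0("rep", 1:nrep),
                                paste0("time", 1:ntime)))
```

With this arrangement, values[i, r, t] is the same number as mock[i, col_index(t, r, nrep)], which makes per-time or per-replicate summaries straightforward.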

References

Fu, A. Q., Russell, S., Bray, S. and Tavare, S. (2013) Bayesian clustering of replicated time-course gene expression data with weak signals. The Annals of Applied Statistics. 7(3) 1334-1361.

Author(s)

Audrey Q. Fu

See Also

plotSimulation for plotting simulated data.

outputData for writing simulated data and parameter values used in simulation into external files.

DIRECT for clustering the data.

Examples

## Not run: 
# Simulate replicated time-course gene expression profiles
# from OU processes

# Simulation parameters
times = c(0,5,10,15,20,25,30,35,40,50,60,70,80,90,100,110,120,150)
ntime = length (times)
nrep = 4

nclust = 6
npars = 8
pars.mtx = matrix (0, nrow=nclust, ncol=npars)
# late weak upregulation or downregulation
pars.mtx[1,] = c(0.05, 0.1, 0.5, 0, 0.16, 0.1, 0.4, 0.05)
# repression
pars.mtx[2,] = c(0.05, 0.1, 0.5, 1, 0.16, -1.0, 0.1, 0.05)
# early strong upregulation
pars.mtx[3,] = c(0.05, 0.5, 0.2, 0, 0.16, 2.5, 0.4, 0.15)
# strong repression
pars.mtx[4,] = c(0.05, 0.5, 0.2, 1, 0.16, -1.5, 0.4, 0.1)
# low upregulation
pars.mtx[5,] = c(0.05, 0.3, 0.3, -0.5, 0.16, 0.5, 0.2, 0.08)
# late strong upregulation
pars.mtx[6,] = c(0.05, 0.3, 0.3, -0.5, 0.16, 0.1, 1, 0.1)

nsize = rep(40, nclust)

# Generate data
simudata = simuDataREM (pars=pars.mtx, dt=1, T=150,
    ntime=ntime, nrep=nrep, nsize=nsize, times=times,
    method="svd", model="OU")

# Display simulated data
plotSimulation (simudata, times=times,
    nsize=nsize, nrep=nrep, lty=1, ylim=c(-4,4),
    type="l", col="black")

# Write simulation parameters and simulated data
# to external files
outputData (datafilename= "simu_test.dat",
    parfilename= "simu_test.par",
    meanfilename= "simu_test_mean.dat",
    simudata=simudata, pars=pars.mtx,
    nitem=sum(nsize), ntime=ntime, nrep=nrep)

## End(Not run)

  • Maintainer: Audrey Q. Fu
  • License: GPL (>= 2)
  • Last published: 2023-09-07
