genCorrelatedData() R function from rockchalk

Generates a data frame for regression analysis

genCorrelatedData(
  N = 100,
  means = c(50, 50),
  sds = c(10, 10),
  rho = 0,
  stde = 1,
  beta = c(0, 0.2, 0.2, 0)
)

Arguments

N: Number of cases desired
means: 2-vector of means for x1 and x2
sds: 2-vector of standard deviations for x1 and x2
rho: Correlation coefficient for x1 and x2
stde: Standard deviation of the error term (e) in the data generating equation
beta: Vector of at most 4 coefficients: intercept (beta1), slopes for x1 and x2 (beta2, beta3), and the x1 * x2 interaction (beta4)

Description

The output is a data frame (x1, x2, y) with user-specified correlation between x1 and x2. The y (output) variable is created according to the equation

y = beta1 + beta2 * x1 + beta3 * x2 + beta4 * x1 * x2 + e.

The arguments determine the scales of the X matrix, the random error, and the slope coefficients.
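As a quick illustration of the equation above, the following sketch calls genCorrelatedData() with a nonzero rho and a nonzero interaction coefficient, then checks the sample correlation and refits the generating model. The seed, the sample size, and the beta values are arbitrary choices made for this example; they do not come from the package documentation.

```r
library(rockchalk)

set.seed(2024)  # arbitrary seed, used only to make this illustration repeatable

# Arbitrary illustrative values: correlated predictors plus an interaction
dat <- genCorrelatedData(N = 2000, means = c(50, 50), sds = c(10, 10),
                         rho = 0.4, stde = 5,
                         beta = c(1, 0.5, -0.3, 0.02))

# Sample correlation of x1 and x2; should be near rho, not exactly equal to it
cor(dat$x1, dat$x2)

# Refit the generating equation; y ~ x1 * x2 expands to x1 + x2 + x1:x2
m <- lm(y ~ x1 * x2, data = dat)
coef(m)
```

With a sample this large, the estimated coefficients should fall reasonably close to the beta vector that was supplied, though they will not match it exactly.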

Details

The vector (x1, x2) is drawn from a multivariate normal distribution whose expected value is given by the argument means. The covariance matrix of X is built from the standard deviations (sds) and the specified correlation between x1 and x2 (rho). The standard deviation of the error term (stde) and the coefficients of the regression equation (beta) complete the specification of the data generating process.
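The steps described above can be sketched in a few lines of R. This is a hypothetical re-implementation for illustration only, not the rockchalk source: the function name sketch_gen_correlated is invented here, MASS::mvrnorm is assumed as the multivariate normal sampler, and the error term is assumed to be normal with mean zero (the documentation states only its standard deviation).

```r
library(MASS)  # provides mvrnorm(); assumed sampler, not confirmed as rockchalk's

# Hypothetical illustration of the data generating process described above
sketch_gen_correlated <- function(N = 100, means = c(50, 50), sds = c(10, 10),
                                  rho = 0, stde = 1,
                                  beta = c(0, 0.2, 0.2, 0)) {
  # Covariance matrix: variances on the diagonal, rho * sd1 * sd2 off the diagonal
  off_diag <- rho * sds[1] * sds[2]
  sigma <- matrix(c(sds[1]^2, off_diag,
                    off_diag, sds[2]^2), nrow = 2)

  # Draw (x1, x2); matrix() keeps a 2-column shape even when N = 1
  x <- matrix(mvrnorm(n = N, mu = means, Sigma = sigma), ncol = 2)
  x1 <- x[, 1]
  x2 <- x[, 2]

  # Assumption: the error term is normal with mean 0 and standard deviation stde
  e <- rnorm(N, mean = 0, sd = stde)

  y <- beta[1] + beta[2] * x1 + beta[3] * x2 + beta[4] * x1 * x2 + e
  data.frame(x1 = x1, x2 = x2, y = y)
}

set.seed(1234)  # arbitrary seed for a repeatable illustration
dat <- sketch_gen_correlated(N = 1000, rho = 0.5)
cor(dat$x1, dat$x2)  # sample correlation, near 0.5 but not exactly equal
```

Building the covariance matrix from sds and rho, rather than asking for it directly, keeps the interface in terms of quantities that are easy to reason about when designing a simulation.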

Examples


library(rockchalk)

## 1000 observations of uncorrelated x1 and x2 with no
## interaction between x1 and x2
dat <- genCorrelatedData(N = 1000, rho = 0, beta = c(1, 1.0, -1.1, 0.0))

## 3D scatterplot of the generated data
mcGraph1(dat$x1, dat$x2, dat$y, theta = 20, phi = 8,
         ticktype = "detailed", nticks = 10)

## Fit the additive model and draw the fitted regression plane
m1 <- lm(y ~ x1 + x2, data = dat)
plotPlane(m1, plotx1 = "x1", plotx2 = "x2")

rockchalk package

Maintainer: Paul E. Johnson
License: GPL (>= 3.0)
Last published: 2022-08-06
