ResidualPredictionTest function

Residual prediction test.

Tests the null hypothesis that Y and E are independent given X.

ResidualPredictionTest(Y, E, X, alpha = 0.05, verbose = FALSE, degree = 4,
  basis = c("nystrom", "nystrom_poly", "fourier", "polynomial", "provided")[1],
  resid_type = "OLS", XBasis = NULL, noiseMat = NULL,
  getnoiseFct = function(n, ...) { rnorm(n) }, argsGetNoiseFct = NULL,
  nSim = 100, funcOfRes = function(x) { abs(x) }, useX = TRUE,
  returnXBasis = FALSE, nSub = ceiling(NROW(X)/4), ntree = 100,
  nodesize = 5, maxnodes = NULL)

Arguments

  • Y: An n-dimensional vector.
  • E: An n-dimensional vector, or a matrix or data frame with n rows and q columns.
  • X: A matrix or data frame with n rows and p columns.
  • alpha: Significance level. Defaults to 0.05.
  • verbose: If TRUE, intermediate output is provided. Defaults to FALSE.
  • degree: Degree of polynomial to use if basis="polynomial" or basis="nystrom_poly". Defaults to 4.
  • basis: Can be one of "nystrom","nystrom_poly","fourier","polynomial","provided". Defaults to "nystrom".
  • resid_type: Can be "Lasso" or "OLS". Defaults to "OLS".
  • XBasis: Basis if basis="provided". Defaults to NULL.
  • noiseMat: Matrix with simulated noise. Defaults to NULL in which case the simulation is performed inside the function.
  • getnoiseFct: Function to use to generate the noise matrix. Defaults to function(n, ...){rnorm(n)}.
  • argsGetNoiseFct: Arguments for getnoiseFct. Defaults to NULL.
  • nSim: Number of simulations to use. Defaults to 100.
  • funcOfRes: Function of residuals to use in addition to predicting the conditional mean. Defaults to function(x){abs(x)}.
  • useX: Set to TRUE if the predictors in X should also be used when predicting the scaled residuals with E. Defaults to TRUE.
  • returnXBasis: Set to TRUE if basis expansion should be returned. Defaults to FALSE.
  • nSub: Number of random features to use if basis is one of "nystrom","nystrom_poly" or "fourier". Defaults to ceiling(NROW(X)/4).
  • ntree: Random forest parameter: Number of trees to grow. Defaults to 100.
  • nodesize: Random forest parameter: Minimum size of terminal nodes. Defaults to 5.
  • maxnodes: Random forest parameter: Maximum number of terminal nodes trees in the forest can have. Defaults to NULL.
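
As a rough illustration of how getnoiseFct and argsGetNoiseFct might fit together, the sketch below swaps the default Gaussian noise for Student-t noise. The names tNoise, noiseArgs and runWithTNoise, the df argument, and the idea that extra arguments are supplied as a named list are assumptions made for illustration; they are not taken from this page. The wrapper also assumes that ResidualPredictionTest is provided by an installed package named CondIndTests.

```r
# Hypothetical noise generator: Student-t noise instead of the default rnorm.
# The signature mirrors the documented default, function(n, ...).
tNoise <- function(n, df = 5, ...) {
  rt(n, df = df)
}

# Assumption: extra arguments for the generator are given as a named list.
noiseArgs <- list(df = 5)

# Stand-alone check of the generator, independent of the test itself.
set.seed(1)
draws <- do.call(tNoise, c(list(n = 10), noiseArgs))

# Hypothetical wrapper; defining it does not call the package.
# Assumes ResidualPredictionTest lives in a package named CondIndTests.
runWithTNoise <- function(Y, E, X) {
  CondIndTests::ResidualPredictionTest(
    Y, E, X,
    getnoiseFct = tNoise,
    argsGetNoiseFct = noiseArgs
  )
}
```

With the package installed, runWithTNoise(Y, as.factor(E), X) would run the test on data like that in the examples at the end of this page. Whether heavier-tailed noise is appropriate depends on the assumed error distribution.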

Returns

A list with the following entries:

  • pvalue: The p-value for the null hypothesis that Y and E are independent given X.
  • XBasis: Basis expansion if returnXBasis was set to TRUE.
  • fctBasisExpansion: Function used to create basis expansion if basis is not "provided".
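
One way the returned list might be used is sketched below: the pvalue entry is compared with a chosen significance level. The helper decide and the object mockResult are hypothetical; mockResult is a stand-in with a made-up p-value so that the snippet runs without the package, not an actual result of the test.

```r
# Hypothetical helper: turn the returned list into a reject / do-not-reject decision.
# `result` is assumed to be a list with a numeric `pvalue` entry, as described above.
decide <- function(result, alpha = 0.05) {
  stopifnot(is.list(result), is.numeric(result$pvalue))
  list(
    pvalue = result$pvalue,
    reject = result$pvalue < alpha
  )
}

# Stand-in for a real return value (illustrative number only).
mockResult <- list(pvalue = 0.012)
decision <- decide(mockResult, alpha = 0.05)
decision$reject  # TRUE, since 0.012 < 0.05
```

Rejecting here means the data provide evidence against the null hypothesis that Y and E are independent given X at the chosen level.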

Examples

# Example 1
n <- 100
E <- rbinom(n, size = 1, prob = 0.2)
X <- 4 + 2 * E + rnorm(n)
Y <- 3 * (X)^2 + rnorm(n)
ResidualPredictionTest(Y, as.factor(E), X)

# Example 2
E <- rbinom(n, size = 1, prob = 0.2)
X <- 4 + 2 * E + rnorm(n)
Y <- 3 * E + rnorm(n)
ResidualPredictionTest(Y, as.factor(E), X)

# not run:
# # Example 3
# E <- rnorm(n)
# X <- 4 + 2 * E + rnorm(n)
# Y <- 3 * (X)^2 + rnorm(n)
# ResidualPredictionTest(Y, E, X)
# ResidualPredictionTest(Y, X, E)