Residual prediction test.
Tests the null hypothesis that Y and E are independent given X.
ResidualPredictionTest(Y, E, X, alpha = 0.05, verbose = FALSE, degree = 4,
  basis = c("nystrom", "nystrom_poly", "fourier", "polynomial", "provided")[1],
  resid_type = "OLS", XBasis = NULL, noiseMat = NULL,
  getnoiseFct = function(n, ...) { rnorm(n) }, argsGetNoiseFct = NULL,
  nSim = 100, funcOfRes = function(x) { abs(x) }, useX = TRUE,
  returnXBasis = FALSE, nSub = ceiling(NROW(X)/4),
  ntree = 100, nodesize = 5, maxnodes = NULL)
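Several defaults in the signature are ordinary R expressions, so they can be evaluated on their own to see what the function starts from. The sketch below uses base R only and does not call ResidualPredictionTest itself. The sample size `n`, the toy predictor matrix `X` and the names ending in `_default` are made up for illustration.

```r
# Toy input, assumed for illustration only
n <- 100
X <- matrix(rnorm(n * 2), nrow = n, ncol = 2)

# basis: indexing the candidate vector with [1] selects the first entry
basis_default <- c("nystrom", "nystrom_poly", "fourier", "polynomial", "provided")[1]

# nSub: a quarter of the number of rows of X, rounded up
nSub_default <- ceiling(NROW(X) / 4)

# getnoiseFct: n standard normal draws; any extra arguments are ignored
getnoiseFct <- function(n, ...) { rnorm(n) }
noise <- getnoiseFct(n)

# funcOfRes: absolute value, applied elementwise
funcOfRes <- function(x) { abs(x) }
abs_res <- funcOfRes(c(-1.5, 0, 2))

print(basis_default)
print(nSub_default)
print(length(noise))
print(abs_res)
```

A replacement for getnoiseFct or funcOfRes would presumably need to keep the same calling shape as these defaults: a function of `n` returning `n` values, and a function of a numeric vector, respectively.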
Y
: An n-dimensional vector.
E
: An n-dimensional vector or an n x q matrix or data frame.
X
: A matrix or data frame with n rows and p columns.
alpha
: Significance level. Defaults to 0.05.
verbose
: If TRUE, intermediate output is provided. Defaults to FALSE.
degree
: Degree of polynomial to use if basis="polynomial" or basis="nystrom_poly". Defaults to 4.
basis
: Can be one of "nystrom", "nystrom_poly", "fourier", "polynomial", "provided". Defaults to "nystrom".
resid_type
: Can be "Lasso" or "OLS". Defaults to "OLS".
XBasis
: Basis if basis="provided". Defaults to NULL.
noiseMat
: Matrix with simulated noise. Defaults to NULL, in which case the simulation is performed inside the function.
getnoiseFct
: Function to use to generate the noise matrix. Defaults to function(n, ...){rnorm(n)}.
argsGetNoiseFct
: Arguments for getnoiseFct. Defaults to NULL.
nSim
: Number of simulations to use. Defaults to 100.
funcOfRes
: Function of residuals to use in addition to predicting the conditional mean. Defaults to function(x){abs(x)}.
useX
: Set to TRUE if the predictors in X should also be used when predicting the scaled residuals with E. Defaults to TRUE.
returnXBasis
: Set to TRUE if basis expansion should be returned. Defaults to FALSE.
nSub
: Number of random features to use if basis is one of "nystrom", "nystrom_poly" or "fourier". Defaults to ceiling(NROW(X)/4).
ntree
: Random forest parameter: Number of trees to grow. Defaults to 100.
nodesize
: Random forest parameter: Minimum size of terminal nodes. Defaults to 5.
maxnodes
: Random forest parameter: Maximum number of terminal nodes that trees in the forest can have. Defaults to NULL.

A list with the following entries:
pvalue
The p-value for the null hypothesis that Y and E are independent given X.
XBasis
Basis expansion if returnXBasis was set to TRUE.
fctBasisExpansion
Function used to create basis expansion if basis is not "provided".

# Example 1: Y depends on E only through X, so the null hypothesis holds
n <- 100
E <- rbinom(n, size = 1, prob = 0.2)
X <- 4 + 2 * E + rnorm(n)
Y <- 3 * (X)^2 + rnorm(n)
ResidualPredictionTest(Y, as.factor(E), X)

# Example 2: Y depends on E directly, so the null hypothesis is violated
E <- rbinom(n, size = 1, prob = 0.2)
X <- 4 + 2 * E + rnorm(n)
Y <- 3 * E + rnorm(n)
ResidualPredictionTest(Y, as.factor(E), X)

# not run:
# # Example 3: continuous E
# E <- rnorm(n)
# X <- 4 + 2 * E + rnorm(n)
# Y <- 3 * (X)^2 + rnorm(n)
# ResidualPredictionTest(Y, E, X)
# ResidualPredictionTest(Y, X, E)
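
The examples print the returned list. A minimal sketch of working with that list follows, reusing the data-generating model of Example 2. It assumes that ResidualPredictionTest is provided by the CondIndTests package, which this page does not state, so the namespace may need adjusting. The variable names `res` and `reject` and the fixed seed are illustrative choices.

```r
# Assumption: ResidualPredictionTest lives in the CondIndTests package.
# The call is skipped when that package is not installed.
set.seed(1)
n <- 100
alpha <- 0.05

# Same model as Example 2: Y depends directly on E
E <- rbinom(n, size = 1, prob = 0.2)
X <- 4 + 2 * E + rnorm(n)
Y <- 3 * E + rnorm(n)

reject <- NA
if (requireNamespace("CondIndTests", quietly = TRUE)) {
  res <- CondIndTests::ResidualPredictionTest(
    Y, as.factor(E), X,
    alpha = alpha,
    returnXBasis = TRUE  # also return the basis expansion of X
  )

  # Reject conditional independence when the p-value falls below alpha
  reject <- res$pvalue < alpha
  cat("p-value:", res$pvalue, " reject:", reject, "\n")

  # Inspect the returned basis expansion
  str(res$XBasis)
} else {
  message("CondIndTests is not installed; skipping the call.")
}
```

Comparing `res$pvalue` with `alpha` by hand keeps the decision rule explicit in downstream code.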