KCI function

Kernel conditional independence test.

Tests the null hypothesis that Y and E are independent given X. Under the null hypothesis, the test statistic is distributed as an infinite weighted sum of chi-squared variables. This distribution can be approximated either by a gamma distribution or by a Monte Carlo approach. This version also includes an implementation that chooses the hyperparameters by Gaussian Process regression.
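The two approximations mentioned above can be illustrated in base R. This is a minimal sketch and not the internals of KCI: the weights w are hypothetical stand-ins for the eigenvalue-based weights, the observed statistic is a placeholder, and the gamma fit by matching the first two moments is an assumption about how the approximation is made.

```r
set.seed(42)

# Hypothetical weights of the chi-squared mixture (assumed, for illustration only)
w <- c(0.5, 0.3, 0.1, 0.05, 0.05)
alpha <- 0.05
nRepBs <- 5000

# Monte Carlo approach: draw chi-squared(1) variables and form the weighted sum
z2 <- matrix(rchisq(nRepBs * length(w), df = 1), nrow = nRepBs)
draws <- as.vector(z2 %*% w)
crit_mc <- unname(quantile(draws, 1 - alpha))

# Gamma approximation: match mean and variance of the weighted sum
m <- sum(w)            # mean of sum_i w_i * chi^2_1
v <- 2 * sum(w^2)      # variance of sum_i w_i * chi^2_1
shape <- m^2 / v
scale <- v / m
crit_gamma <- qgamma(1 - alpha, shape = shape, scale = scale)

# p-values for a placeholder observed statistic under both approximations
stat <- 2.0
p_mc <- mean(draws >= stat)
p_gamma <- pgamma(stat, shape = shape, scale = scale, lower.tail = FALSE)
```

With a few thousand draws the two critical values are usually close. The gamma route avoids sampling altogether, while the Monte Carlo route makes no assumption about the shape of the distribution.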

KCI(Y, E, X, width = 0, alpha = 0.05, unbiased = FALSE, gammaApprox = TRUE, GP = TRUE, nRepBs = 5000, lambda = 0.001, thresh = 1e-05, numEig = NROW(Y), verbose = FALSE)

Arguments

  • Y: A vector of length n or a matrix or dataframe with n rows and p columns.
  • E: A vector of length n or a matrix or dataframe with n rows and p columns.
  • X: A matrix or dataframe with n rows and p columns.
  • width: Kernel width; if it is set to zero, the width is chosen automatically (default: 0).
  • alpha: Significance level (default: 0.05).
  • unbiased: A boolean variable that indicates whether a bias correction should be applied (default: FALSE).
  • gammaApprox: A boolean variable that indicates whether the null distribution is approximated by a Gamma distribution. If it is FALSE, a Monte Carlo approach is used (default: TRUE).
  • GP: A boolean variable that indicates whether Gaussian Process regression is used to choose the hyperparameters (default: TRUE).
  • nRepBs: Number of draws for the Monte Carlo approach (default: 5000).
  • lambda: Regularization parameter (default: 1e-03).
  • thresh: Threshold for eigenvalues. Whenever eigenvalues are computed, they are set to zero if they are smaller than thresh times the maximum eigenvalue (default: 1e-05).
  • numEig: Number of eigenvalues computed (only relevant for computing the distribution under the hypothesis of conditional independence) (default: NROW(Y)).
  • verbose: If TRUE, intermediate output is provided (default: FALSE).
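The thresh rule described above can be sketched in base R. The Gaussian kernel form, the fixed width of 1, and the centering step are assumptions made for illustration; KCI may parametrize its kernels differently.

```r
set.seed(1)
n <- 50
x <- rnorm(n)

# Gaussian kernel matrix with an assumed width of 1
K <- exp(-as.matrix(dist(x))^2 / 2)

# Center the kernel matrix: Kc = H K H with H = I - (1/n) 11'
H <- diag(n) - matrix(1 / n, n, n)
Kc <- H %*% K %*% H

# Eigenvalues of the symmetric centered kernel matrix
ev <- eigen(Kc, symmetric = TRUE, only.values = TRUE)$values

# Set eigenvalues below thresh times the maximum eigenvalue to zero
thresh <- 1e-05
ev_kept <- ifelse(ev < thresh * max(ev), 0, ev)

# Number of eigenvalues that survive the threshold
n_kept <- sum(ev_kept > 0)
```

Because the threshold is relative to the largest eigenvalue, the rule does not depend on the overall scale of the kernel matrix.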

Returns

A list with the following entries:

  • testStatistic: The test statistic Tr(K_{\ddot{Y}|X} K_{E|X}).
  • criticalValue: The critical value at significance level alpha. It is obtained by a Monte Carlo approach if gammaApprox = FALSE, otherwise by the Gamma approximation.
  • pvalue: The p-value for the null hypothesis that Y and E are independent given X. It is obtained by a Monte Carlo approach if gammaApprox = FALSE, otherwise by the Gamma approximation.
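As a sketch of how the returned list might be used, the snippet below turns it into a reject/accept decision. The helper kci_decision is hypothetical and not part of the package, and the list res holds placeholder values that mimic the documented entries rather than real output of KCI.

```r
# Hypothetical helper: decide whether to reject conditional independence
kci_decision <- function(res, alpha = 0.05) {
  list(
    rejectByPvalue = res$pvalue < alpha,
    rejectByCriticalValue = res$testStatistic > res$criticalValue
  )
}

# Placeholder list with the documented entry names (values are made up)
res <- list(testStatistic = 3.1, criticalValue = 2.4, pvalue = 0.02)

decision <- kci_decision(res, alpha = 0.05)
```

In practice res would be the value returned by a call such as KCI(Y, E, X), and alpha should match the significance level passed to that call so that criticalValue and the p-value comparison refer to the same level.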

Examples

# Example 1
n <- 100
E <- rnorm(n)
X <- 4 + 2 * E + rnorm(n)
Y <- 3 * (X)^2 + rnorm(n)
KCI(Y, E, X)
KCI(Y, X, E)