collinear function

Filter to reduce collinearity in predictors

This function identifies predictors whose squared correlation (r^2) with another predictor exceeds a given cut-off, and returns the indices of the predictors to remove. It takes a matrix or data.frame of predictors whose columns must be ordered by importance: within any pair of correlated columns, the first column is retained, and later columns that correlate with it above the cut-off are flagged for removal.

collinear(x, rsq_cutoff = 0.9, rsq_method = "pearson", verbose = FALSE)
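The selection rule described above can be sketched in base R. This is only an illustration, not the package's source code: `collinear_sketch` is a hypothetical name, the data are simulated, and the sketch assumes a column is flagged whenever its r^2 with any earlier column exceeds the cut-off, even if that earlier column was itself flagged.

```r
# Hypothetical base-R sketch of the rule; not the package's implementation.
collinear_sketch <- function(x, rsq_cutoff = 0.9, rsq_method = "pearson") {
  # Squared pairwise correlations between all columns of x
  rsq <- cor(x, method = rsq_method)^2
  # Keep only pairs (i, j) with i < j, so that each column is compared
  # solely against the columns that come before it
  rsq[lower.tri(rsq, diag = TRUE)] <- NA
  # which() treats NA as FALSE; the "col" entry is the later column of a pair
  flagged <- which(rsq > rsq_cutoff, arr.ind = TRUE)
  sort(unique(unname(flagged[, "col"])))
}

# Simulated predictors, most important column first
set.seed(1)
a <- rnorm(50)
x <- data.frame(
  a = a,
  b = a + rnorm(50, sd = 0.01),  # near-duplicate of a
  c = rnorm(50)                  # generated independently of a and b
)

remove_idx <- collinear_sketch(x)
```

Because `b` comes after `a` in the column order, it is `b` that is flagged here. Swapping the two columns would flag `a` instead, which is why the column order matters.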

Arguments

  • x: A matrix or data.frame of values. The order of columns is used to determine which columns to retain, so the columns in x should be sorted with the most important columns first.
  • rsq_cutoff: Cut-off value for r-squared. Columns whose r-squared with an earlier column exceeds this value are flagged for removal.
  • rsq_method: character string indicating which correlation coefficient is to be computed. One of "pearson" (default), "kendall", or "spearman". See cor().
  • verbose: Logical; whether to print details.

Returns

Integer vector of the indices of columns in x to remove due to collinearity
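The returned indices are typically used to drop columns from x. The sketch below shows one way this might be done; `drop_columns` is a hypothetical helper, the data are made up, and the index values stand in for what collinear() could return. The guard matters because negative indexing with an empty vector selects no columns at all in R, rather than leaving x unchanged.

```r
# Hypothetical helper: remove the flagged columns, handling the case
# where nothing was flagged (an empty integer vector).
drop_columns <- function(x, remove_idx) {
  if (length(remove_idx) == 0) return(x)
  x[, -remove_idx, drop = FALSE]
}

# Made-up predictors for illustration
x <- data.frame(a = 1:5, b = 1:5 + 0.1, c = c(2, 7, 1, 8, 3))

kept <- drop_columns(x, 2L)               # stand-in for a result flagging column 2
unchanged <- drop_columns(x, integer(0))  # stand-in for a result flagging nothing
```

Using `drop = FALSE` keeps the result a data.frame or matrix even when only one column remains.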

  • Maintainer: Myles Lewis
  • License: MIT + file LICENSE
  • Last published: 2025-03-10