collinear 2.0.0 package

Automated Multicollinearity Management

add_white_noise: Add White Noise to Encoded Predictor
case_weights: Case Weights for Unbalanced Binomial or Categorical Responses
collinear-package: collinear
collinear: Automated multicollinearity management
cor_clusters: Hierarchical Clustering from a Pairwise Correlation Matrix
cor_cramer_v: Bias Corrected Cramer's V
cor_df: Pairwise Correlation Data Frame
cor_matrix: Pairwise Correlation Matrix
cor_select: Automated Multicollinearity Filtering with Pairwise Correlations
drop_geometry_column: Removes geometry column in sf data frames
encoded_predictor_name: Name of Target-Encoded Predictor
f_auc: Association Between a Binomial Response and a Continuous Predictor
f_auto_rules: Rules to Select Default f Argument to Compute Preference Order
f_auto: Select Function to Compute Preference Order
f_functions: Data Frame of Preference Functions
f_r2_counts: Association Between a Count Response and a Continuous Predictor
f_r2: Association Between a Continuous Response and a Continuous Predictor
f_v_rf_categorical: Association Between a Categorical Response and a Categorical or Numeri...
f_v: Association Between a Categorical Response and a Categorical Predictor
identify_predictors_categorical: Identify Valid Categorical Predictors
identify_predictors_numeric: Identify Valid Numeric Predictors
identify_predictors_type: Identify Predictor Types
identify_predictors_zero_variance: Identify Zero and Near-Zero Variance Predictors
identify_predictors: Identify Numeric and Categorical Predictors
identify_response_type: Identify Response Type
model_formula: Generate Model Formulas
performance_score_auc: Area Under the Curve of Binomial Observations vs Probabilistic Model P...
performance_score_r2: Pearson's R-squared of Observations vs Predictions
performance_score_v: Cramer's V of Observations vs Predictions
preference_order_collinear: Preference Order Argument in collinear()
preference_order: Quantitative Variable Prioritization for Multicollinearity Filtering
target_encoding_lab: Target Encoding Lab: Transform Categorical Variables to Numeric
target_encoding_methods: Target Encoding Methods
validate_data_cor: Validate Data for Correlation Analysis
validate_data_vif: Validate Data for VIF Analysis
validate_df: Validate Argument df
validate_encoding_arguments: Validates Arguments of target_encoding_lab()
validate_predictors: Validate Argument predictors
validate_preference_order: Validate Argument preference_order
validate_response: Validate Argument response
vif_df: Variance Inflation Factor
vif_select: Automated Multicollinearity Filtering with Variance Inflation Factors

Effortless multicollinearity management in data frames with both numeric and categorical variables for statistical and machine learning applications. The package simplifies multicollinearity analysis by combining four robust methods: 1) target encoding for categorical variables (Micci-Barreca, D. 2001 <doi:10.1145/507533.507538>); 2) automated feature prioritization to prevent key variable loss during filtering; 3) pairwise correlation for all variable combinations (numeric-numeric, numeric-categorical, categorical-categorical); and 4) fast computation of variance inflation factors.
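
The ideas behind three of the methods named above (target encoding, pairwise correlation, and variance inflation factors) can be illustrated with a small language-agnostic sketch. It is written in plain Python for brevity and is not the package's R API: the function names, the toy data, and the 0.75 threshold are all hypothetical, and the package's own encoding methods and filtering logic may differ. The sketch uses mean-based target encoding and the two-predictor identity VIF = 1 / (1 - r^2).

```python
from collections import defaultdict
from math import sqrt


def target_encode_mean(categories, response):
    """Replace each category label with the mean response observed for it."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, response):
        totals[cat] += y
        counts[cat] += 1
    return [totals[cat] / counts[cat] for cat in categories]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF when only two predictors are present.

    Regressing one predictor on the other gives R^2 = r^2,
    so VIF = 1 / (1 - r^2). Undefined for perfectly correlated inputs.
    """
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical toy data: a numeric response, one categorical and one
# numeric predictor.
response = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
soil = ["clay", "clay", "sand", "sand", "loam", "loam"]
temperature = [1.0, 2.0, 2.5, 4.5, 5.0, 6.5]

# 1) Target encoding turns the categorical predictor into a numeric one.
soil_encoded = target_encode_mean(soil, response)

# 2) Pairwise correlation between the encoded and the numeric predictor.
r = pearson(soil_encoded, temperature)

# 3) Variance inflation factor for the same pair.
vif = vif_two_predictors(soil_encoded, temperature)

# Hypothetical filtering rule: flag the pair when |r| exceeds a threshold.
MAX_COR = 0.75  # assumed value, for illustration only
print(soil_encoded)
print(abs(r) > MAX_COR, vif)
```

Once the categorical predictor is encoded, it can be compared with numeric predictors on the same scale, which is what makes a single correlation or VIF threshold applicable to mixed-type data frames.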

  • Maintainer: Blas M. Benito
  • License: MIT + file LICENSE
  • Last published: 2024-11-08