Machine Learning Tools
Alien test dataset
Alien training dataset
Area Under the ROC Curve
Map a vector of numeric values into bins
Date Factor
Empirical Cumulative Distribution Function
Explore Dataset
Exponential Weight
Cross Validation Folds
Geometric Weight
Gini Impurities
Gini Impurity
Matthews Correlation Coefficient
Mean Square Error
Mean Square Logarithmic Error
One Hot Encode
Relative Position
Replace NA Values
Root Mean Square Error
Root Mean Square Logarithmic Error
ROC Scores
Set Factor
Skewness
Sparsify
A collection of machine learning helper functions, aimed mainly at the exploratory data analysis phase. The package makes heavy use of the 'data.table' package for speed and memory efficiency. Highlights include bin_data() for mapping a vector of numeric values into bins, sparsify() for converting a data.table to sparse matrix format with one-hot encoding, fast evaluation metrics, and empirical_cdf() for calculating empirical multivariate cumulative distribution functions.
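To make a few of the listed quantities concrete, here is a minimal sketch of their textbook definitions. It is written in plain Python purely for illustration and is not the package's implementation. The function names, argument orders, and the zero-denominator convention for the Matthews correlation coefficient are assumptions chosen for this example, not the package's API.

```python
import math
from collections import Counter


def rmse(preds, actuals):
    """Root mean square error: sqrt(mean((pred - actual)^2))."""
    n = len(preds)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(preds, actuals)) / n)


def rmsle(preds, actuals):
    """Root mean square logarithmic error, using log(1 + x) on both inputs."""
    n = len(preds)
    sq = ((math.log1p(p) - math.log1p(a)) ** 2 for p, a in zip(preds, actuals))
    return math.sqrt(sum(sq) / n)


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Assumption for this sketch: return 0.0 when the denominator is zero.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def gini_impurity(labels):
    """Gini impurity: 1 - sum over classes of (class share)^2."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def empirical_cdf(points, ubounds):
    """Empirical multivariate CDF.

    For each upper-bound tuple, return the fraction of points whose every
    coordinate is less than or equal to the matching bound.
    """
    n = len(points)
    return [
        sum(all(x <= u for x, u in zip(pt, ub)) for pt in points) / n
        for ub in ubounds
    ]


if __name__ == "__main__":
    print(rmse([1, 2, 3], [1, 2, 5]))
    print(mcc(tp=2, fp=0, tn=2, fn=0))
    print(gini_impurity(["a", "a", "b", "b"]))
    print(empirical_cdf([(1, 1), (2, 3), (3, 2)], [(2, 3), (3, 3)]))
```

The empirical CDF here scans every point for every bound, which is quadratic in the worst case. It is meant to show the definition, not to match the speed the package description refers to.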