A Unified Tidy Interface to R's Machine Learning Ecosystem
Augment Data with DBSCAN Cluster Assignments
Augment Data with Hierarchical Cluster Assignments
Augment Data with K-Means Cluster Assignments
Augment Data with PAM Cluster Assignments
Augment Original Data with PCA Scores
Calculate Cluster Validation Metrics
Calculate Within-Cluster Sum of Squares for Different k
Compare Multiple Clustering Results
Compare Distance Methods
Create Summary Dashboard
Explore DBSCAN Parameters
Filter Rules by Item
Find Related Items
Get PCA Loadings in Wide Format
Get Variance Explained Summary
Inspect Association Rules
Find Optimal Number of Clusters
Determine Optimal Number of Clusters for Hierarchical Clustering
Pipe operator
Create Cluster Comparison Plot
Plot Cluster Size Distribution
Plot Clusters in 2D Space
Plot Dendrogram with Cluster Highlights
Create Distance Heatmap
Create Elbow Plot for K-Means
Plot Gap Statistic
Plot k-NN Distances
Plot MDS Configuration
Plot Silhouette Analysis
Plot Variance Explained (PCA)
Plot EDA results
Plot method for tidylearn models
Predict using a tidylearn model
Predict from stratified models
Predict with transfer learning model
Print Method for tidy_apriori
Print Method for tidy_dbscan
Print Method for tidy_gap
Print Method for tidy_hclust
Print Method for tidy_kmeans
Print Method for tidy_mds
Print Method for tidy_pam
Print Method for tidy_pca
Print Method for tidy_silhouette
Print auto ML results
Print EDA results
Print method for tidylearn models
Print a tidylearn pipeline
Generate Product Recommendations
Standardize Data
Suggest eps Parameter for DBSCAN
Summarize Association Rules
Summary method for tidylearn models
Summarize a tidylearn pipeline
Tidy Apriori Algorithm
Tidy CLARA (Clustering Large Applications)
Cut Hierarchical Clustering Tree
Tidy DBSCAN Clustering
Plot Dendrogram
Tidy Distance Matrix Computation
Tidy Gap Statistic
Gower Distance Calculation
Tidy Hierarchical Clustering
Tidy K-Means Clustering
Compute k-NN Distances
Classical (Metric) MDS
Kruskal's Non-metric MDS
Sammon Mapping
SMACOF MDS (Metric or Non-metric)
Tidy Multidimensional Scaling
Tidy PAM (Partitioning Around Medoids)
Create PCA Biplot
Create PCA Scree Plot
Tidy Principal Component Analysis
Convert Association Rules to Tidy Tibble
Silhouette Analysis Across Multiple k Values
Tidy Silhouette Analysis
Classification Functions for tidylearn
tidylearn: A Unified Tidy Interface to R's Machine Learning Ecosystem
Deep Learning for tidylearn
Advanced Diagnostics Functions for tidylearn
Interaction Analysis Functions for tidylearn
Metrics Functionality for tidylearn
Model Selection Functions for tidylearn
Neural Networks for tidylearn
Model Pipeline Functions for tidylearn
Regression Functions for tidylearn
Regularization Functions for tidylearn
Support Vector Machines for tidylearn
Tree-based Methods for tidylearn
Hyperparameter Tuning Functions for tidylearn
Visualization Functions for tidylearn
XGBoost Functions for tidylearn
Cluster-Based Features
Anomaly-Aware Supervised Learning
Find important interactions automatically
High-Level Workflows for Common Machine Learning Patterns
Calculate classification metrics
Calculate the area under the precision-recall curve
Check model assumptions
Compare models using cross-validation
Compare models from a pipeline
Cross-validation for tidylearn models
Create interactive visualization dashboard for a model
Create pre-defined parameter grids for common models
Detect outliers in the data
Create a comprehensive diagnostic dashboard
Evaluate metrics at different thresholds
Evaluate a tidylearn model
Exploratory Data Analysis Workflow
Extract importance from a regularized regression model
Extract importance from a tree-based model
Fit a gradient boosting model
Fit a deep learning model
Fit an Elastic Net regression model
Fit a random forest model
Fit a Lasso regression model
Fit a linear regression model
Fit a logistic regression model
Fit a neural network model
Fit a polynomial regression model
Fit a regularized regression model (Ridge, Lasso, or Elastic Net)
Fit a Ridge regression model
Fit a support vector machine model
Fit a decision tree model
Fit an XGBoost model
Get the best model from a pipeline
Calculate influence measures for a linear model
Calculate partial effects based on a model with interactions
Load a pipeline from disk
Create a tidylearn model
Create a modeling pipeline
Plot actual vs predicted values for a regression model
Plot calibration curve for a classification model
Plot confusion matrix for a classification model
Plot comparison of cross-validation results
Plot cross-validation results
Plot deep learning model architecture
Plot deep learning model training history
Plot diagnostics for a regression model
Plot gain chart for a classification model
Plot feature importance across multiple models
Plot variable importance for a regularized regression model
Plot variable importance for tree-based models
Plot influence diagnostics
Plot interaction effects
Create confidence and prediction interval plots
Plot lift chart for a classification model
Plot model comparison
Plot neural network architecture
Plot neural network training history
Plot partial dependence for tree-based models
Plot precision-recall curve for a classification model
Plot cross-validation results for a regularized regression model
Plot regularization path for a regularized regression model
Plot residuals for a regression model
Plot ROC curve for a classification model
Plot SVM decision boundary
Plot SVM tuning results
Plot a decision tree
Plot hyperparameter tuning results
Plot feature importance for an XGBoost model
Plot SHAP dependence for a specific feature
Plot SHAP summary for XGBoost model
Plot XGBoost tree visualization
Predict using a gradient boosting model
Predict using a deep learning model
Predict using an Elastic Net regression model
Predict using a random forest model
Predict using a Lasso regression model
Predict using a linear regression model
Predict using a logistic regression model
Predict using a neural network model
Make predictions using a pipeline
Predict using a polynomial regression model
Predict using a regularized regression model
Predict using a Ridge regression model
Predict using a support vector machine model
Predict using a decision tree model
Predict using an XGBoost model
Data Preprocessing for tidylearn
Integration Functions: Combining Supervised and Unsupervised Learning
Run a tidylearn pipeline
Save a pipeline to disk
Semi-Supervised Learning via Clustering
Split data into train and test sets
Perform stepwise selection on a linear model
Stratified Features via Clustering
Test for significant interactions between variables
Perform statistical comparison of models using cross-validation
Transfer Learning Workflow
Tune a deep learning model
Tune hyperparameters for a model using grid search
Tune a neural network model
Tune hyperparameters for a model using random search
Tune XGBoost hyperparameters
Get tidylearn version information
Generate SHAP values for XGBoost model interpretation
Visualize Association Rules
Provides a unified tidyverse-compatible interface to R's machine learning packages. Wraps established implementations from 'glmnet', 'randomForest', 'xgboost', 'e1071', 'rpart', 'gbm', 'nnet', 'cluster', 'dbscan', and others, providing consistent function signatures, tidy tibble output, and unified 'ggplot2'-based visualization. The underlying algorithms are unchanged; 'tidylearn' simply makes them easier to use together. Raw model objects remain accessible via the $fit slot for package-specific functionality. Methods include random forests Breiman (2001) <doi:10.1023/A:1010933404324>, LASSO regression Tibshirani (1996) <doi:10.1111/j.2517-6161.1996.tb02080.x>, elastic net Zou and Hastie (2005) <doi:10.1111/j.1467-9868.2005.00503.x>, support vector machines Cortes and Vapnik (1995) <doi:10.1007/BF00994018>, and gradient boosting Friedman (2001) <doi:10.1214/aos/1013203451>.
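A minimal sketch of the unified workflow the description promises. The function names below (`tl_model()`, `tl_predict()`) and the `method` argument are illustrative assumptions inferred from the topic titles above ("Create a tidylearn model", "Predict using a tidylearn model"), not verified exports of the package; consult the individual help topics for the actual signatures.

```r
# Hypothetical sketch; tl_model() and tl_predict() are assumed names,
# not confirmed API. The pattern shown -- one consistent signature,
# tidy tibble output, raw engine object under $fit -- is what the
# description states.
library(tidylearn)

# One consistent formula/data signature regardless of the backing engine:
fit <- tl_model(mpg ~ ., data = mtcars, method = "random_forest")

# Predictions returned as a tidy tibble rather than an
# engine-specific object:
preds <- tl_predict(fit, new_data = mtcars)

# The underlying randomForest object stays accessible for
# package-specific functionality, per the description:
raw <- fit$fit
```

The same two-call pattern would apply to the other fit/predict pairs listed above (linear, logistic, Lasso, Ridge, SVM, XGBoost, and so on), with `method` selecting the engine.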