A Statistically Sound 'data.frame' Processor/Conditioner
k-fold cross validation stratified on y, a splitFunction in the sense ...
k-fold cross validation stratified with replacement on y, a splitFunct...
Make a categorical input custom coder.
Make a numeric input custom coder.
Build a k-fold cross validation splitter, respecting (never splitting)...
Run categorical cross-frame experiment.
Function to build multi-outcome vtreat cross frame and treatment plan.
Run a numeric cross frame experiment.
vtreat multinomial parameters.
vtreat regression parameters.
Apply a treatment plan using rqdatatable.
vtreat unsupervised parameters.
Stateful object for designing and applying unsupervised treatments.
Prepare a simple treatment.
Apply treatments and restrict to useful variables.
Print treatmentplan.
Print treatmentplan.
Print treatmentplan.
Print treatmentplan.
Center and scale a set of variables.
Check if appPlan is a good carve-up of 1:nRows into nSplits groups.
vtreat classification parameters.
Read application labels off a split plan.
k-fold cross validation, a splitFunction in the sense of vtreat::build...
Transform second argument by first.
Convert vtreatment plans into a sequence of rquery operations.
Stateful object for designing and applying binomial outcome treatments...
Build set carve-up for out-of-sample evaluation.
Design a simple treatment plan to indicate missingness and perform ...
Build all treatments for a data frame to predict a categorical outcome...
Build all treatments for a data frame to predict a numeric outcome.
Design variable treatments with no outcome variable.
Compute weighted mean.
Fit first argument to data in second argument.
Fit and prepare in a cross-validated manner.
Fit and transform in a cross-validated manner.
Flatten a list of functions onto d.
Display treatment plan.
Return feasible feature names.
Return score frame from vps.
Return underlying transform from vps.
Stateful object for designing and applying multinomial outcome treatme...
Report new/novel appearances of character values.
Stateful object for designing and applying numeric outcome treatments.
One way holdout, a splitFunction in the sense of vtreat::buildEvalSets...
Patch columns into data.frame.
Pre-computed cross-plan (so same split happens each time).
Function to apply mkCrossFrameMExperiment treatments.
Apply treatments and restrict to useful variables.
Materialize a treated data frame remotely.
Solve as piecewise linear problem, numeric target.
Solve as piecewise logit problem, categorical target.
Spline variable numeric target.
Spline variable categorical target.
Build a square windows variable, numeric target.
Build a square windows variable, categorical target.
Track unique character values for variables.
Value variables for predicting a categorical outcome.
Value variables for predicting a numeric outcome.
Return variable evaluations.
New treated variable names from a treatmentplan$treatment item.
Original variable name from a treatmentplan$treatment item.
vtreat: A Statistically Sound 'data.frame' Processor/Conditioner
A 'data.frame' processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. 'vtreat' prepares variables so that data has fewer exceptional cases, making it easier to safely use models in production. Common problems 'vtreat' defends against: 'Inf', 'NA', too many categorical levels, rare categorical levels, and new categorical levels (levels seen during application, but not during training). Reference: "'vtreat': a data.frame Processor for Predictive Modeling", Zumel, Mount, 2016, <DOI:10.5281/zenodo.1173313>.
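The defenses described above can be illustrated with a small sketch. It assumes the 'vtreat' package is installed, and the toy data frames `d_train` and `d_app` are hypothetical, invented only for this illustration. It designs an outcome-free treatment plan with `designTreatmentsZ()` and applies it with `prepare()` to data containing a missing value and a categorical level not seen during training.

```r
library(vtreat)

# Hypothetical training data: a numeric column with an NA
# and a categorical column with an NA.
d_train <- data.frame(
  x = c(1, 2, NA, 4, 5, 6, 7, 8),
  z = c("a", "a", "b", "b", NA, "a", "b", "a"),
  stringsAsFactors = FALSE
)

# Design treatments without reference to any outcome variable.
plan <- designTreatmentsZ(d_train, c("x", "z"), verbose = FALSE)

# Hypothetical application data: a missing numeric value and a
# level ("c") that never appeared during training.
d_app <- data.frame(
  x = c(NA, 3),
  z = c("c", "a"),
  stringsAsFactors = FALSE
)

# Apply the plan; the result should be an all-numeric data frame.
treated <- prepare(plan, d_app)
print(treated)
```

When an outcome column is available, the outcome-aware designers listed above (for example the numeric and categorical treatment builders) would normally be used instead, ideally through a cross-frame experiment so that the same rows are not used both to design the treatments and to fit the downstream model.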