Splits data into training and test sets in a cross-validation fashion based on a user-provided categorical vector. This vector can be passed during instantiation either as an arbitrary factor f with the same length as task$nrow, or as a single string col naming a column in the task.
An alternative but equivalent approach using leave-one-out resampling is showcased in the examples of mlr_resamplings_loo.
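The two ways of supplying the grouping vector can be sketched as follows. This is a minimal illustration, assuming the mlr3 package is installed, that the built-in "penguins" task is available (it relies on the palmerpenguins package), and that the dictionary key is "custom_cv". The object names are chosen for illustration only.

```r
library(mlr3)

# Small task: the first 10 rows of the built-in penguins task (assumed available)
task = tsk("penguins")
task$filter(1:10)

# Grouping vector: three groups of three rows, plus one row with a missing group
f = factor(c(rep(letters[1:3], each = 3), NA))

# Variant 1: pass the grouping vector directly via `f`
custom_cv = rsmp("custom_cv")
custom_cv$instantiate(task, f = f)

custom_cv$iters                    # one fold per distinct non-missing value of f
test_1 = custom_cv$test_set(1)     # row ids of the first group
train_1 = custom_cv$train_set(1)   # row ids of the remaining groups
intersect(train_1, test_1)         # train and test sets share no row ids

# Variant 2: name a task column via `col` (here the feature "island")
task_full = tsk("penguins")
by_island = rsmp("custom_cv")
by_island$instantiate(task_full, col = "island")
by_island$iters                    # one fold per distinct island
```

In the first variant the row with the missing group value appears in neither the training nor the test sets.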
Dictionary
This Resampling can be instantiated via the dictionary mlr_resamplings or with the associated sugar function rsmp().
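A minimal sketch of both routes, assuming the mlr3 package is installed and that this resampling is registered under the key "custom_cv" (the variable names r1 and r2 are illustrative):

```r
library(mlr3)

# Retrieve a fresh, uninstantiated object from the dictionary ...
r1 = mlr_resamplings$get("custom_cv")

# ... or, equivalently, through the sugar function
r2 = rsmp("custom_cv")

# Both objects belong to the same class
class(r1)[1]
class(r2)[1]
```

Either object still has to be instantiated on a task before its splits can be used.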
Method instantiate()
Instantiate this Resampling as cross-validation with custom splits.
Usage
ResamplingCustomCV$instantiate(task, f = NULL, col = NULL)
Arguments
task: Task
Used to extract row ids.
f: (factor() | character())
Vector of type factor or character with the same length as `task$nrow`. Row ids are split on this vector, and each distinct value results in a fold. Empty factor levels are dropped, and row ids corresponding to missing values are removed (cf. `split()`).
col: (character(1))
Name of the task column to use for splitting. Alternative to, and mutually exclusive with, providing the grouping vector via parameter `f`.
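The fold construction described for `f` can be illustrated with base R alone. The sketch below mirrors the documented behaviour (empty levels dropped, missing values removed) and is not the package's actual implementation. The row ids, the grouping vector, and the helper functions test_set() and train_set() are hypothetical.

```r
# Hypothetical row ids and grouping vector, for illustration only
row_ids = 101:110
f = factor(
  c("a", "a", "b", "b", "b", "c", "c", "c", NA, NA),
  levels = c("a", "b", "c", "unused")
)

# drop = TRUE discards the empty level "unused";
# row ids whose group is NA are omitted by split()
folds = split(row_ids, f, drop = TRUE)

# Fold i serves as the i-th test set; the training set is the union of the others
test_set = function(i) folds[[i]]
train_set = function(i) unlist(folds[-i], use.names = FALSE)

names(folds)    # "a" "b" "c"
test_set(2)     # 103 104 105
train_set(2)    # 101 102 106 107 108
```

Row ids 109 and 110 have no group and therefore never appear in any fold.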
Method clone()
The objects of this class are cloneable with this method.