mlr_resamplings_custom_cv function

Custom Cross-Validation

Splits data into training and test sets in a cross-validation fashion based on a user-provided categorical vector. This vector can be passed during instantiation either via an arbitrary factor f with the same length as task$nrow, or via a single string col referring to a column in the task.

An alternative but equivalent approach using leave-one-out resampling is showcased in the examples of mlr_resamplings_loo.

Dictionary

This Resampling can be instantiated via the dictionary mlr_resamplings or with the associated sugar function rsmp():

mlr_resamplings$get("custom_cv")
rsmp("custom_cv")

Examples

# Create a task with 10 observations
task = tsk("penguins")
task$filter(1:10)

# Instantiate Resampling:
custom_cv = rsmp("custom_cv")
f = factor(c(rep(letters[1:3], each = 3), NA))
custom_cv$instantiate(task, f = f)
custom_cv$iters # 3 folds

# Individual sets:
custom_cv$train_set(1)
custom_cv$test_set(1)

# Disjunct sets:
intersect(custom_cv$train_set(1), custom_cv$test_set(1))
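
The example above passes the grouping through f. As a minimal sketch of the col variant, the following assumes that mlr3 is attached and uses the "island" column of the penguins task, chosen here purely for illustration, to define the folds:

```r
library(mlr3)

# Full penguins task; "island" is one of its columns (illustrative choice)
task = tsk("penguins")

# Let the values of the "island" column define the folds
custom_cv = rsmp("custom_cv")
custom_cv$instantiate(task, col = "island")

# Number of folds: one per distinct value in the column
custom_cv$iters

# Islands present in the test and training set of the first fold
test_islands = unique(task$data(rows = custom_cv$test_set(1), cols = "island")$island)
train_islands = unique(task$data(rows = custom_cv$train_set(1), cols = "island")$island)

# The two sets should not share an island
intersect(as.character(test_islands), as.character(train_islands))
```

Referring to a column by name avoids having to keep a separate vector aligned with the row order of the task.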

See Also

Other Resampling: Resampling, mlr_resamplings, mlr_resamplings_bootstrap, mlr_resamplings_custom, mlr_resamplings_cv, mlr_resamplings_holdout, mlr_resamplings_insample, mlr_resamplings_loo, mlr_resamplings_repeated_cv, mlr_resamplings_subsampling

Super class

mlr3::Resampling -> ResamplingCustomCV

Active bindings

  • iters: (integer(1))

     Returns the number of resampling iterations, depending on the values stored in the `param_set`.
    

Methods

Public methods

Method new()

Creates a new instance of this R6 class.

Usage

ResamplingCustomCV$new()

Method instantiate()

Instantiate this Resampling as cross-validation with custom splits.

Usage

ResamplingCustomCV$instantiate(task, f = NULL, col = NULL)

Arguments

  • task: Task

     Used to extract row ids.
    
  • f: (factor() | character())

     Vector of type factor or character with the same length as `task$nrow`. Row ids are split on this vector; each distinct value results in a fold. Empty factor levels are dropped and row ids corresponding to missing values are removed, cf. `split()`.
    
  • col: (character(1))

     Name of the task column to use for splitting. Mutually exclusive with parameter `f`: provide the grouping either as a vector via `f` or as a column name via `col`, not both.
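
The splitting behavior described for `f` can be sketched in base R. This is only an illustration of the semantics of `split()`, not the package's actual implementation; the helper functions `train_set()` and `test_set()` below are hypothetical stand-ins that assume fold i serves as the test set and the remaining folds form the training set:

```r
# Row ids 1..10 and a grouping factor with one missing value
# and one unused level ("d")
row_ids = 1:10
f = factor(c(rep(letters[1:3], each = 3), NA), levels = letters[1:4])

# split() omits row ids whose group is missing;
# drop = TRUE additionally drops the empty level "d"
folds = split(row_ids, f, drop = TRUE)
names(folds)

# Hypothetical helpers: fold i is the test set,
# all remaining folds together are the training set
test_set = function(i) folds[[i]]
train_set = function(i) unlist(folds[-i], use.names = FALSE)

test_set(1)
train_set(1)
```

In this sketch, row id 10 has a missing group and therefore appears in neither the training nor the test set of any fold.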
    

Method clone()

The objects of this class are cloneable with this method.

Usage

ResamplingCustomCV$clone(deep = FALSE)

Arguments

  • deep: Whether to make a deep clone.