Splits data using Leave-Location-Out (LLO), Leave-Time-Out (LTO), and Leave-Location-and-Time-Out (LLTO) partitioning. See the upstream implementation, CreateSpacetimeFolds() in package CAST, and Meyer et al. (2018) for further information.
Details
LLO predicts on unknown locations, i.e., complete locations are left out of the training sets. The "space" role in Task$col_roles identifies the spatial units. If stratify is TRUE, the target distribution is kept similar across folds. This is useful, for example, in land cover classification when the observations are polygons: LLO with stratification holds back complete polygons while keeping a similar target distribution in each fold.
LTO leaves out complete temporal units, which are identified by the "time" role in Task$col_roles. LLTO leaves out both spatial and temporal units. See the examples.
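The idea behind Leave-Location-Out can be illustrated with a minimal base R sketch. It is not the CAST or mlr3spatiotempcv implementation: the helper assign_unit_folds() and the toy location vector are hypothetical names introduced here, and stratification is not covered. The sketch only shows that folds are assigned per spatial unit rather than per observation, so a held-out location never appears in the training set.

```r
# Hypothetical helper for illustration only; not part of CAST or mlr3spatiotempcv.
# Assigns every observation to a fold based on the unit (e.g. location) it belongs to.
assign_unit_folds = function(units, folds, seed = 1L) {
  set.seed(seed)
  unique_units = unique(units)
  stopifnot(folds <= length(unique_units))
  # Shuffle the units, then deal them round-robin into the folds
  shuffled = unique_units[sample.int(length(unique_units))]
  unit_fold = rep_len(seq_len(folds), length(shuffled))
  # Every observation inherits the fold of its unit
  unit_fold[match(units, shuffled)]
}

# Toy data (assumed): 6 locations with 4 observations each
location = rep(paste0("site", 1:6), each = 4)
fold = assign_unit_folds(location, folds = 3)

# Leave-Location-Out split for fold 1: complete locations are held back
test_set = which(fold == 1)
train_set = which(fold != 1)

# No location contributes observations to both sets
shared_locations = intersect(location[train_set], location[test_set])
length(shared_locations) == 0
```

Applying the same helper to a vector of temporal units (for example, dates) instead of locations would give a Leave-Time-Out split under the same assumptions.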
Parameters
folds (integer(1))
Number of folds.
stratify (logical(1))
If TRUE, stratify on the target column.
repeats (integer(1))
Number of repeats.
Examples
library(mlr3)
# Provides the "cookfarm_mlr3" task and the "repeated_sptcv_cstf" resampling
library(mlr3spatiotempcv)

task = tsk("cookfarm_mlr3")
task$set_col_roles("SOURCEID", roles = "space")
task$set_col_roles("Date", roles = "time")

# Instantiate Resampling
rcv = rsmp("repeated_sptcv_cstf", folds = 5, repeats = 2)
rcv$instantiate(task)

### Individual sets:
# rcv$train_set(1)
# rcv$test_set(1)

# check that no obs are in both sets
intersect(rcv$train_set(1), rcv$test_set(1)) # good!

# Internal storage:
# rcv$instance # table