create_timefolds() R function from [splitTools]

Creates Folds for Time Series Data

This function provides a list with in- and out-of-sample indices per fold, for use in time series k-fold cross-validation; see Details.


create_timefolds(y, k = 5L, use_names = TRUE, type = c("extending", "moving"))

Arguments

y: Any vector of the same length as the data to be split.
k: Number of folds.
use_names: Should folds be named? Default is TRUE.
type: Should in-sample data be "extending" over the folds (default) or consist of a single block ("moving")? See Details.

Returns

A nested list with in-sample and out-of-sample indices per fold.

Details

The data is first partitioned into $k+1$ sequential blocks $B_1$ to $B_{k+1}$. Each fold consists of two index vectors: one with in-sample row numbers, the other with out-of-sample row numbers. The first fold uses $B_1$ as in-sample and $B_2$ as out-of-sample data. The second one uses either $B_2$ (if type = "moving") or $\{B_1, B_2\}$ (if type = "extending") as in-sample, and $B_3$ as out-of-sample data, etc. Finally, the $k$th fold uses $\{B_1, ..., B_k\}$ ("extending") or $B_k$ ("moving") as in-sample data, and $B_{k+1}$ as out-of-sample data. This makes sure that out-of-sample data always follows in-sample data.
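
The block scheme described above can be sketched in base R. This is only an illustration under stated assumptions: `sketch_timefolds()` is a hypothetical helper, not part of splitTools, and the element names `insample` and `outsample` as well as the rule used to cut near-equal blocks are choices made for this sketch. The package may place block boundaries differently.

```r
# Hypothetical helper (not part of splitTools): base-R sketch of the scheme.
sketch_timefolds <- function(n, k = 5L, type = c("extending", "moving")) {
  type <- match.arg(type)
  stopifnot(k >= 1L, n >= k + 1L)
  idx <- seq_len(n)
  # Assign each row to one of k + 1 sequential blocks of near-equal size.
  # This cutting rule is an assumption of the sketch.
  block_id <- ceiling(idx * (k + 1L) / n)
  blocks <- split(idx, block_id)
  lapply(seq_len(k), function(i) {
    # "extending": blocks 1..i are in-sample; "moving": only block i.
    in_blocks <- if (type == "extending") seq_len(i) else i
    list(
      insample = unlist(blocks[in_blocks], use.names = FALSE),
      outsample = blocks[[i + 1L]]
    )
  })
}

# 12 rows and k = 2 give three blocks: 1:4, 5:8, 9:12.
ext <- sketch_timefolds(12, k = 2, type = "extending")
mov <- sketch_timefolds(12, k = 2, type = "moving")
ext[[2]]$insample  # rows 1 to 8 (blocks B_1 and B_2)
mov[[2]]$insample  # rows 5 to 8 (block B_2 only)
```

In both variants the out-of-sample rows of fold $i$ are block $B_{i+1}$, so they always come after every in-sample row of that fold.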

Examples


y <- runif(100)
create_timefolds(y)
create_timefolds(y, use_names = FALSE)
create_timefolds(y, use_names = FALSE, type = "moving")
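
As a rough sketch of how such folds might be consumed, the following evaluates a simple linear trend model fold by fold. `timefold_rmse()` is a hypothetical helper, not part of splitTools. It assumes only what Details states: each fold holds two index vectors, in-sample first and out-of-sample second, so it accesses them by position. The folds below are built by hand, with illustrative element names, so that the code runs without the package.

```r
# Hypothetical helper (not part of splitTools): out-of-sample RMSE per fold
# for a linear trend model. Assumes fold[[1]] holds in-sample row numbers
# and fold[[2]] holds out-of-sample row numbers.
timefold_rmse <- function(folds, y) {
  dat <- data.frame(t = seq_along(y), y = y)
  vapply(folds, function(fold) {
    fit <- lm(y ~ t, data = dat[fold[[1]], ])            # fit on in-sample rows
    pred <- predict(fit, newdata = dat[fold[[2]], ])     # predict later rows
    sqrt(mean((dat$y[fold[[2]]] - pred)^2))
  }, numeric(1))
}

# Hand-made folds in the assumed shape ("extending" scheme on 12 rows).
y_trend <- 2 * (1:12) + 1
manual_folds <- list(
  list(insample = 1:4, outsample = 5:8),
  list(insample = 1:8, outsample = 9:12)
)
rmse <- timefold_rmse(manual_folds, y_trend)
```

If the object returned by create_timefolds(y) has this shape, it could be passed as the `folds` argument in place of the hand-made list.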
