create_timefolds function

Creates Folds for Time Series Data

This function returns a list of in-sample and out-of-sample indices per fold for time series k-fold cross-validation; see Details.

create_timefolds(y, k = 5L, use_names = TRUE, type = c("extending", "moving"))

Arguments

  • y: Any vector of the same length as the data to be split.
  • k: Number of folds.
  • use_names: Should folds be named? Default is TRUE.
  • type: Should in-sample data be "extending" over the folds (default) or consist of a single block ("moving")? See Details.

Returns

A nested list with in-sample and out-of-sample indices per fold.

Details

The data is first partitioned into $k+1$ sequential blocks $B_1$ to $B_{k+1}$. Each fold consists of two index vectors: one with in-sample row numbers, the other with out-of-sample row numbers. The first fold uses $B_1$ as in-sample and $B_2$ as out-of-sample data. The second one uses either $B_2$ (if type = "moving") or $\{B_1, B_2\}$ (if type = "extending") as in-sample, and $B_3$ as out-of-sample data, and so on. Finally, the $k$th fold uses $\{B_1, ..., B_k\}$ ("extending") or $B_k$ ("moving") as in-sample data, and $B_{k+1}$ as out-of-sample data. This ensures that out-of-sample data always follows in-sample data.
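The block scheme described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper name sketch_timefolds, the element names insample and outsample, and the rule used to cut the rows into near-equal blocks are all assumptions made for illustration, so exact block boundaries may differ from those of create_timefolds().

```r
# Hypothetical helper for illustration only; not part of the package.
sketch_timefolds <- function(n, k = 5L, type = c("extending", "moving")) {
  type <- match.arg(type)
  stopifnot(k >= 1, n >= k + 1)

  # Assign each row 1..n to one of k + 1 sequential blocks of near-equal size
  # (assumed splitting rule).
  block_id <- ceiling(seq_len(n) * (k + 1) / n)
  blocks <- split(seq_len(n), block_id)

  lapply(seq_len(k), function(i) {
    # "extending": blocks 1..i are in-sample; "moving": only block i.
    in_blocks <- if (type == "extending") seq_len(i) else i
    list(
      insample = unlist(blocks[in_blocks], use.names = FALSE),
      outsample = blocks[[i + 1L]]  # the block that immediately follows
    )
  })
}

ext <- sketch_timefolds(12, k = 3, type = "extending")
mov <- sketch_timefolds(12, k = 3, type = "moving")
str(ext[[2]])  # in-sample rows 1-6, out-of-sample rows 7-9
str(mov[[2]])  # in-sample rows 4-6, out-of-sample rows 7-9
```

With 12 rows and k = 3, the rows fall into four blocks of three, so the two types differ only in how far back the in-sample window reaches; the out-of-sample block is the same in both cases.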

Examples

y <- runif(100)
create_timefolds(y)
create_timefolds(y, use_names = FALSE)
create_timefolds(y, use_names = FALSE, type = "moving")

See Also

partition(), create_folds()

  • Maintainer: Michael Mayer
  • License: GPL (>= 2)
  • Last published: 2023-06-06