train_test_split function

Splitting the data sets into train and test subsets

A function to split the train data of class "hhsmmdata" into train and test subsets, with an option to right-trim the sequences.

train_test_split(train, train.ratio = 0.7, trim = FALSE, trim.ratio = NULL)

Arguments

  • train: the train data of class "hhsmmdata"
  • train.ratio: a number in (0,1] that determines the proportion of the data placed in the train subset. It can be set to 1 when the test set should be equal to the train set and only right-trimming of the sequences is needed
  • trim: logical. If TRUE, the sequences are right-trimmed by random lengths
  • trim.ratio: a vector of trim ratios with the same length as train$N, or a single trim ratio applied to all sequences. If NULL, random trim ratios are used
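
The three accepted forms of trim.ratio (NULL, a single value, or one value per sequence) can be illustrated with a small base R sketch. The helper resolve_trim_ratio is hypothetical and not part of hhsmm, and drawing the random ratios from a uniform distribution is an assumption, since the argument description does not say which distribution is used.

```r
# Hypothetical helper (not part of hhsmm): expand trim.ratio to one value per sequence.
resolve_trim_ratio <- function(trim.ratio, n.seq) {
  if (is.null(trim.ratio)) {
    # Assumption: random ratios are drawn uniformly on (0, 1)
    return(runif(n.seq))
  }
  if (length(trim.ratio) == 1) {
    # A single ratio is recycled for every sequence
    return(rep(trim.ratio, n.seq))
  }
  if (length(trim.ratio) != n.seq) {
    stop("trim.ratio must have length 1 or one entry per sequence")
  }
  trim.ratio
}

resolve_trim_ratio(0.3, 4)          # same ratio for all four sequences
resolve_trim_ratio(c(0.1, 0.5), 2)  # one ratio per sequence
resolve_trim_ratio(NULL, 3)         # three random ratios
```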

Returns

a list containing:

  • train: the randomly selected train subset, of class "hhsmmdata"
  • test: the randomly selected test subset, of class "hhsmmdata"
  • trimmed: the right-trimmed test subset, if trim=TRUE, with trim ratios equal to trim.ratio
  • trimmed.count: the number of observations trimmed from the right of each sequence of the test subset, if trim=TRUE

Details

This function splits the sample into train and test samples and trims the test sample from the right, in order to provide a sample for evaluating the prediction tools. In reliability applications, hhsmm models are often left-to-right and the modeling aims to predict future states. In such cases, the test sequences are right-trimmed and the goal is to predict the residual useful lifetime (RUL) of a new sequence.
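
The idea of splitting by sequence and then right-trimming the test sequences can be sketched in base R on toy data. This is an illustration under stated assumptions, not the package's implementation: the data layout (sequences stacked row-wise in x, with lengths in N) is inferred from the reference to train$N, and the floor() rounding and the rule of keeping at least one observation per sequence are choices made for this sketch only.

```r
# Toy data: 10 sequences stacked row-wise in x, with N holding their lengths
# (assumed layout, for illustration only).
set.seed(42)
N <- c(5, 8, 6, 7, 4, 9, 5, 6, 7, 8)
x <- matrix(rnorm(sum(N)), ncol = 1)

# Row indices belonging to each sequence
ends   <- cumsum(N)
starts <- ends - N + 1
rows   <- Map(seq, starts, ends)

# Randomly assign 70% of the sequences (not rows) to the train subset
train.ratio <- 0.7
n.seq       <- length(N)
train.ids   <- sort(sample(n.seq, size = round(train.ratio * n.seq)))
test.ids    <- setdiff(seq_len(n.seq), train.ids)

train <- list(x = x[unlist(rows[train.ids]), , drop = FALSE], N = N[train.ids])
test  <- list(x = x[unlist(rows[test.ids]),  , drop = FALSE], N = N[test.ids])

# Right-trim each test sequence by dropping its last floor(ratio * length)
# observations, while keeping at least one observation (sketch assumptions).
trim.ratio    <- 0.4
trimmed.count <- pmin(floor(trim.ratio * test$N), test$N - 1)
keep.N        <- test$N - trimmed.count

test.ends   <- cumsum(test$N)
test.starts <- test.ends - test$N + 1
keep.rows   <- unlist(Map(function(s, k) seq(s, length.out = k),
                          test.starts, keep.N))
trimmed     <- list(x = test$x[keep.rows, , drop = FALSE], N = keep.N)

trimmed.count  # observations removed from the end of each test sequence
```

In a run-to-failure setting, the number of observations removed from each test sequence can serve as the reference value against which a predicted RUL is compared.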

Examples

data(CMAPSS)
tt = train_test_split(CMAPSS$train, train.ratio = 0.7, trim = TRUE)

Author(s)

Morteza Amini, morteza.amini@ut.ac.ir

  • Maintainer: Morteza Amini
  • License: GPL-3
  • Last published: 2024-09-04
