A function to split the train data of class "hhsmmdata" into train and test subsets, with an option to right trim the sequences
train_test_split(train, train.ratio = 0.7, trim = FALSE, trim.ratio = NULL)
Arguments
train: the train data of class "hhsmmdata"
train.ratio: a number in (0,1] which determines the ratio of the train subset. It can be equal to 1 if the test set should be equal to the train set and only right trimming of the sequences is needed
trim: logical. If TRUE, the sequences will be right trimmed with random lengths
trim.ratio: a vector of trim ratios with a length equal to that of train$N, or a single trim ratio for all sequences. If it is NULL, then random trim ratios will be used
Returns
a list containing:
train: the randomly selected subset of train data of class "hhsmmdata"
test: the randomly selected subset of test data of class "hhsmmdata"
trimmed: the right trimmed test subset, if trim=TRUE, with trim ratios equal to trim.ratio
trimmed.count: the number of right trimmed individuals in each sequence of the test subset, if trim=TRUE
Details
This function splits the sample into train and test samples and trims the test sample from the right, in order to provide a sample for examining the prediction tools. In reliability applications, hhsmm models are often left-to-right and the modeling aims to predict future states. In such cases, the test sets are right trimmed and the aim is to predict the residual useful lifetime (RUL) of a new sequence.
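The idea of right trimming can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper name right_trim_sketch is hypothetical, the data layout (a matrix x of stacked observations plus a vector N of sequence lengths) is assumed, and the rule for turning a trim ratio into a count (floor of length times ratio, keeping at least one observation) is an assumption made for illustration only.

```r
# Hypothetical helper, NOT part of hhsmm: right-trims sequences stored in a
# concatenated layout (x = stacked observations, N = sequence lengths).
right_trim_sketch <- function(x, N, trim.ratio = NULL) {
  x <- as.matrix(x)
  stopifnot(sum(N) == nrow(x))
  n_seq <- length(N)
  # Draw random ratios when none are supplied; recycle a single ratio.
  if (is.null(trim.ratio)) trim.ratio <- runif(n_seq)
  trim.ratio <- rep_len(trim.ratio, n_seq)
  # Assumed rule: drop floor(N * ratio) trailing rows, keep at least one row.
  trimmed.count <- pmin(floor(N * trim.ratio), N - 1)
  # First row index of each sequence in the stacked matrix.
  starts <- cumsum(c(1, N[-n_seq]))
  keep <- unlist(lapply(seq_len(n_seq), function(i) {
    seq(starts[i], length.out = N[i] - trimmed.count[i])
  }))
  list(x = x[keep, , drop = FALSE],
       N = N - trimmed.count,
       trimmed.count = trimmed.count)
}

# Toy data: three sequences of lengths 5, 4 and 3, stacked in one column.
toy_x <- matrix(seq_len(12), ncol = 1)
toy_N <- c(5, 4, 3)
out <- right_trim_sketch(toy_x, toy_N, trim.ratio = 0.5)
```

In this sketch, the dropped tail of each sequence plays the role of the unseen future: a prediction made from the trimmed sequence can be compared with trimmed.count, which is the link to RUL prediction described above.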
Examples
data(CMAPSS)
tt = train_test_split(CMAPSS$train, train.ratio = 0.7, trim = TRUE)