Perform automatic data cleaning of time series data
Returns a matrix or a list of matrices in which missing values and outliers have been imputed. The function automates the use of the functions model_missing_data, detect_outliers and impute_modelled_data. It is designed for numerical data only.
data: an input vector, matrix or data frame of dimension nobs x nvars containing missing values; each column is a variable.
S: a number or vector describing the seasonalities (S_1, ..., S_K) in the data, e.g. c(24, 168) if the data consists of 24 observations per day and there is a weekly seasonality in the data.
tau: the quantile(s) of the missing values to be estimated in the quantile regression. Tau accepts all values in (0,1). If NULL, then the weighted lasso regression is performed.
no.of.last.indices.to.fix: a number of observations in the tail of the data to be fixed, by default set to S.
indices.to.fix: indices of the data to be fixed. If NULL, then it is calculated based on the no.of.last.indices.to.fix parameter. Otherwise, the no.of.last.indices.to.fix parameter is ignored.
model.missing.pars: named list containing additional arguments for the model_missing_data function.
detect.outliers.pars: named list containing additional arguments for the detect_outliers function.
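The relationship between S, no.of.last.indices.to.fix and indices.to.fix can be illustrated in base R. This is only a sketch: the series length, the value of n_last and the explicit index ranges are hypothetical, and reading "the tail of the data" as the final n_last row positions is an assumption rather than something taken from the package source.

```r
# Hypothetical setup: four weeks of hourly observations.
nobs <- 4 * 168
S <- c(24, 168)   # daily and weekly seasonality, as in the description of S
n_last <- 48      # hypothetical value for no.of.last.indices.to.fix

# Assumption: the tail of the data means the final n_last row positions.
tail_indices <- seq(nobs - n_last + 1, nobs)

# Alternative: choose the positions explicitly and pass them as indices.to.fix,
# in which case no.of.last.indices.to.fix is ignored.
explicit_indices <- c(100:110, 500:520)

range(tail_indices)
length(explicit_indices)
```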
Returns
A list which contains a matrix or a list of matrices with imputed missing values or outliers, the indices of the data that were modelled, and the given quantile values.
Details
The function first calls model_missing_data to impute the missing values, then calls detect_outliers to detect outliers, removes the detected outliers, and finally applies model_missing_data again to impute the removed values. For details see the respective help pages of these functions.
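The impute, detect, remove, impute-again sequence can be sketched in base R without the package. The two helpers below, fill_seasonal and flag_outliers, are hypothetical stand-ins: they use seasonal medians and a MAD threshold, not the quantile or lasso regression and the detection method that model_missing_data and detect_outliers actually implement. The synthetic series and the threshold k are likewise assumptions chosen for illustration.

```r
set.seed(1)
S <- 24
nobs <- 10 * S
x <- 10 + 5 * sin(2 * pi * seq_len(nobs) / S) + rnorm(nobs, sd = 0.3)
x[c(30, 31, 100)] <- NA   # synthetic missing values
x[150] <- 60              # one synthetic gross outlier

# Stand-in for the modelling step: replace each NA with the median of the
# observations that share its seasonal phase.
fill_seasonal <- function(y, S) {
  phase <- (seq_along(y) - 1) %% S
  med <- ave(y, phase, FUN = function(v) median(v, na.rm = TRUE))
  miss <- is.na(y)
  y[miss] <- med[miss]
  y
}

# Stand-in for the detection step: flag points whose deviation from the
# seasonal median exceeds k median absolute deviations.
flag_outliers <- function(y, S, k = 5) {
  phase <- (seq_along(y) - 1) %% S
  res <- y - ave(y, phase, FUN = median)
  which(abs(res) > k * mad(res))
}

step1 <- fill_seasonal(x, S)       # 1. impute missing values
out <- flag_outliers(step1, S)     # 2. detect outliers
step2 <- step1
step2[out] <- NA                   # 3. remove the detected outliers
cleaned <- fill_seasonal(step2, S) # 4. impute again

# Positions that were either missing or flagged, analogous in spirit to
# the indices reported by the function.
replaced <- sort(unique(c(which(is.na(x)), out)))
```

Running the imputation twice matters because an outlier left in place during the first pass would still be present in the output; removing it and imputing again treats it like any other missing value.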
Examples
## Not run:
autoclean <- auto_data_cleaning(
  data = GBload[, -1],
  S = c(48, 7 * 48),
  no.of.last.indices.to.fix = dim(GBload)[1],
  model.missing.pars = list(consider.as.missing = 0, min.val = 0)
)
autoclean$replaced.indices
## End(Not run)