Perform automatic data cleaning of time series data
Returns a matrix or a list of matrices in which missing values and outliers have been imputed. The function automates the use of the functions model_missing_data, detect_outliers and impute_modelled_data. It is designed for numerical data only.
data: an input vector, matrix or data frame of dimension nobs x nvars containing missing values; each column is a variable.
S: a number or vector describing the seasonalities (S_1, ..., S_K) in the data, e.g. c(24, 168) if the data consists of 24 observations per day and there is a weekly seasonality in the data.
tau: the quantile(s) of the missing values to be estimated in the quantile regression. tau accepts all values in (0,1). If NULL, the weighted lasso regression is performed.
no.of.last.indices.to.fix: the number of observations in the tail of the data to be fixed, by default set to S.
indices.to.fix: indices of the data to be fixed. If NULL, they are calculated from the no.of.last.indices.to.fix parameter. Otherwise, the no.of.last.indices.to.fix parameter is ignored.
model.missing.pars: a named list containing additional arguments for the model_missing_data function.
detect.outliers.pars: a named list containing additional arguments for the detect_outliers function.
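As a small illustration of how the seasonality vector S relates to the sampling frequency, the sketch below builds it from the number of observations per day. The variable names obs_per_day, days_per_week and S_hourly are illustrative only and are not part of the package.

```r
# Illustrative names only; none of these variables belong to the package.
obs_per_day   <- 24   # hourly data: 24 observations per day
days_per_week <- 7

# Daily and weekly seasonal periods, expressed in number of observations.
S_hourly <- c(obs_per_day, days_per_week * obs_per_day)
print(S_hourly)
```

For half-hourly data the same arithmetic gives c(48, 7*48).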
A list containing a matrix (or a list of matrices) in which missing values and outliers have been imputed, the indices of the data that were modelled, and the given quantile values.
The function calls model_missing_data to impute the missing values, then detect_outliers to detect outliers, removes the detected outliers, and finally applies model_missing_data again. For details, see the respective help pages of these functions.
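The three-step flow described above (impute, detect and remove outliers, impute again) can be sketched on a toy series in base R. This is only a conceptual sketch: the helpers impute_seasonal_median and flag_outliers_mad are hypothetical stand-ins invented for this illustration, and they use a seasonal median and a MAD rule rather than the quantile or lasso regression and the outlier detection method implemented in the package.

```r
# Hypothetical stand-ins; NOT the tsrobprep implementations.

# Replace each NA with the median of the observed values that share
# the same position within the seasonal period S.
impute_seasonal_median <- function(x, S) {
  pos <- (seq_along(x) - 1) %% S
  med <- tapply(x, pos, median, na.rm = TRUE)
  na  <- which(is.na(x))
  x[na] <- as.numeric(med[as.character(pos[na])])
  x
}

# Flag points lying more than `threshold` MADs away from the median.
flag_outliers_mad <- function(x, threshold = 5) {
  abs(x - median(x, na.rm = TRUE)) > threshold * mad(x, na.rm = TRUE)
}

# Toy series with period 4, one missing value and one gross outlier.
S <- 4
x <- rep(c(10, 20, 30, 40), times = 6)
x[7]  <- NA
x[14] <- 500

missing_idx <- which(is.na(x))

# Step 1: impute the missing values.
step1 <- impute_seasonal_median(x, S)

# Step 2: detect the outliers and remove them (set them to NA).
outlier_idx <- which(flag_outliers_mad(step1))
step2 <- step1
step2[outlier_idx] <- NA

# Step 3: impute again, now filling the removed outliers.
cleaned <- impute_seasonal_median(step2, S)

replaced_idx <- sort(c(missing_idx, outlier_idx))
print(replaced_idx)
print(cleaned)
```

Running the imputation a second time is what lets the removed outliers be filled from values that are no longer distorted by them.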
## Not run:
autoclean <- auto_data_cleaning(
  data = GBload[,-1], S = c(48, 7*48),
  no.of.last.indices.to.fix = dim(GBload)[1],
  model.missing.pars = list(consider.as.missing = 0, min.val = 0)
)
autoclean$replaced.indices
## End(Not run)