OcpPewma calculates the anomalies of a dataset using an optimized version of the classical processing Probabilistic-EWMA (PEWMA) algorithm. It is an optimized implementation of the CpPewma algorithm using environmental variables. On long datasets it has been shown to reduce runtime by up to 50%. PEWMA is a probabilistic variant of EWMA that dynamically adjusts its parameterization based on the probability of the given observation. This produces dynamic, data-driven anomaly thresholds that are robust to abrupt transient changes, yet adjust quickly to long-term distributional shifts.
OcpPewma(data, alpha0 = 0.2, beta = 0, n.train = 5, l = 3)
Arguments
data: Numerical vector with training and test datasets.
alpha0: Maximal weighting parameter.
beta: Weight placed on the probability of the given observation.
n.train: Number of points of the dataset that correspond to the training set.
l: Control limit multiplier.
Returns
A dataset composed of the following columns:
is.anomaly: 1 if the value is anomalous, 0 otherwise.
ucl: Upper control limit.
lcl: Lower control limit.
Details
data must be a numerical vector without NA values. alpha0 must be a numeric value where 0 < alpha0 < 1. If a faster adjustment to the initial shift is desirable, lowering alpha0 will suffice. beta is the weight placed on the probability of the given observation. It must be a numeric value where 0 <= beta <= 1. Note that if beta equals 0, PEWMA reduces to a standard EWMA. Finally, l is the multiplier that determines the width of the control limits. By default, 3 is used.
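The recursion these parameters control can be sketched in base R. This is a minimal illustration that follows the formulation in Carter and Streilein (2012), not the package's actual implementation: pewma_sketch is a hypothetical helper, its alpha argument is the weight placed on the history as in the paper (how alpha0 maps onto it inside OcpPewma is not assumed here), and the choice to skip flagging during the training period is also an assumption.

```r
# Hypothetical helper illustrating the PEWMA recursion; NOT the otsad code.
pewma_sketch <- function(x, alpha = 0.8, beta = 0.1, n.train = 5, l = 3) {
  n <- length(x)
  ucl <- lcl <- rep(NA_real_, n)   # no limits exist for the first point
  is.anomaly <- integer(n)
  s1 <- x[1]                       # running estimate of E[X]
  s2 <- x[1]^2                     # running estimate of E[X^2]
  for (t in 2:n) {
    mu <- s1
    sigma <- sqrt(max(s2 - s1^2, 0))
    ucl[t] <- mu + l * sigma
    lcl[t] <- mu - l * sigma
    # Assumption: points inside the training period are never flagged.
    if (t > n.train && (x[t] > ucl[t] || x[t] < lcl[t])) {
      is.anomaly[t] <- 1L
    }
    # Standard normal density of the standardized observation.
    p <- if (sigma > 0) dnorm((x[t] - mu) / sigma) else 0
    # Training: plain running mean. Afterwards: probability-adjusted weight.
    a <- if (t <= n.train) 1 - 1 / t else alpha * (1 - beta * p)
    s1 <- a * s1 + (1 - a) * x[t]
    s2 <- a * s2 + (1 - a) * x[t]^2
  }
  data.frame(is.anomaly = is.anomaly, ucl = ucl, lcl = lcl)
}

res <- pewma_sketch(c(10, 12, 11, 9, 10, 11, 10, 50))
```

With beta = 0 the weight a stays fixed at alpha after training, which is the standard EWMA case mentioned above. On the short vector used here, only the final point (50) falls outside its control limits.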
Examples
## Generate data
set.seed(100)
n <- 180
x <- sample(1:100, n, replace = TRUE)
x[70:90] <- sample(110:115, 21, replace = TRUE)
x[25] <- 200
x[150] <- 170
df <- data.frame(timestamp = 1:n, value = x)

## Calculate anomalies
result <- OcpPewma(
  data = df$value,
  n.train = 5,
  alpha0 = 0.8,
  beta = 0.1,
  l = 3
)

## Plot results
res <- cbind(df, result)
PlotDetections(res, title = "PEWMA ANOMALY DETECTOR")
References
Kevin M. Carter and William W. Streilein. Probabilistic reasoning for streaming anomaly detection. 2012 IEEE Statistical Signal Processing Workshop (SSP), pp. 377-380, Aug 2012.