OcpPewma function

Optimized Classic Processing Probabilistic-EWMA (PEWMA).

OcpPewma calculates the anomalies of a dataset using an optimized version of the classic processing Probabilistic-EWMA algorithm. It is an optimized implementation of the CpPewma algorithm using environmental variables. It has been shown that on long datasets it can reduce runtime by up to 50%. This algorithm is a probabilistic variant of EWMA that dynamically adjusts the parameterization based on the probability of the given observation. This method produces dynamic, data-driven anomaly thresholds which are robust to abrupt transient changes, yet quickly adjust to long-term distributional shifts.
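The idea can be illustrated with a minimal base-R sketch. It follows the PEWMA recurrences as described in the cited Carter and Streilein paper and is not the package's source code. The name pewma_sketch, the handling of the first observation, and the choice never to flag training points are assumptions made for this illustration.

```r
# Hypothetical helper, NOT part of the package: a plain-R sketch of the
# PEWMA recurrences, written only to illustrate the description above.
pewma_sketch <- function(data, alpha0 = 0.2, beta = 0, n.train = 5, l = 3) {
  n <- length(data)
  is.anomaly <- integer(n)
  ucl <- numeric(n)
  lcl <- numeric(n)
  s1 <- data[1]     # running estimate of the mean
  s2 <- data[1]^2   # running estimate of the second moment
  std <- 0          # running estimate of the standard deviation
  for (i in seq_len(n)) {
    x <- data[i]
    # Control limits come from the estimates made before seeing x.
    ucl[i] <- s1 + l * std
    lcl[i] <- s1 - l * std
    # Assumption: points inside the training window are never flagged.
    is.anomaly[i] <- as.integer(i > n.train && (x > ucl[i] || x < lcl[i]))
    # Probability of the observation under the current estimates.
    z <- if (std > 0) (x - s1) / std else 0
    p <- dnorm(z)
    # Training uses 1 - 1/i (a running mean); afterwards the weight on
    # history is alpha0 reduced according to the probability p.
    alpha <- if (i <= n.train) 1 - 1 / i else alpha0 * (1 - beta * p)
    s1 <- alpha * s1 + (1 - alpha) * x
    s2 <- alpha * s2 + (1 - alpha) * x^2
    std <- sqrt(max(s2 - s1^2, 0))  # guard against tiny negative values
  }
  data.frame(is.anomaly = is.anomaly, ucl = ucl, lcl = lcl)
}

# Made-up toy series with one obvious spike at position 8.
toy <- c(10, 11, 9, 10, 11, 10, 9, 50, 10, 11)
pewma_sketch(toy, alpha0 = 0.8, beta = 0.1, n.train = 5, l = 3)
```

Because an unlikely observation has a low probability, it leaves the weight on history close to alpha0, so a single spike moves the estimates less than a typical point would.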

OcpPewma(data, alpha0 = 0.2, beta = 0, n.train = 5, l = 3)

Arguments

  • data: Numerical vector with training and test datasets.
  • alpha0: Maximal weighting parameter.
  • beta: Weight placed on the probability of the given observation.
  • n.train: Number of points of the dataset that correspond to the training set.
  • l: Control limit multiplier.

Returns

A dataset made up of the following columns:

  • is.anomaly: 1 if the value is anomalous, 0 otherwise.

  • ucl: Upper control limit.

  • lcl: Lower control limit.

Details

data must be a numerical vector without NA values. alpha0 must be a numeric value where 0 < alpha0 < 1. If a faster adjustment to the initial shift is desirable, simply lowering alpha0 will suffice. beta is the weight placed on the probability of the given observation. It must be a numeric value where 0 <= beta <= 1. Note that if beta equals 0, PEWMA reduces to a standard EWMA. Finally, l is the parameter that determines the control limits. By default, 3 is used.
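The role of beta can be seen in a small sketch of the weighting step. It assumes the weight on past estimates takes the form alpha0 * (1 - beta * p) given in the cited paper, where p is the standard normal density of the observation's z-score. The helper name pewma_weight is hypothetical and not part of the package.

```r
# Hypothetical helper, NOT part of the package: weight placed on the
# previous estimates for an observation with z-score z (assumed form).
pewma_weight <- function(z, alpha0, beta) {
  p <- dnorm(z)             # standard normal density of the z-score
  alpha0 * (1 - beta * p)   # weight on history
}

z <- c(0, 1, 3)
w_ewma  <- pewma_weight(z, alpha0 = 0.8, beta = 0)    # identical for every z
w_pewma <- pewma_weight(z, alpha0 = 0.8, beta = 0.5)  # grows toward alpha0 as |z| grows
print(w_ewma)
print(w_pewma)
```

With beta = 0 the weight no longer depends on the observation, which is the standard EWMA case mentioned above. With beta > 0 the weight never exceeds alpha0, consistent with alpha0 being the maximal weighting parameter.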

Examples

## Generate data
set.seed(100)
n <- 180
x <- sample(1:100, n, replace = TRUE)
x[70:90] <- sample(110:115, 21, replace = TRUE)
x[25] <- 200
x[150] <- 170
df <- data.frame(timestamp = 1:n, value = x)

## Calculate anomalies
result <- OcpPewma(
  data = df$value,
  n.train = 5,
  alpha0 = 0.8,
  beta = 0.1,
  l = 3
)

## Plot results
res <- cbind(df, result)
PlotDetections(res, title = "PEWMA ANOMALY DETECTOR")

References

Kevin M. Carter and William W. Streilein. Probabilistic reasoning for streaming anomaly detection. 2012 IEEE Statistical Signal Processing Workshop (SSP), pp. 377-380, Aug 2012.

  • Maintainer: Alaiñe Iturria
  • License: AGPL (>= 3)
  • Last published: 2019-09-06