Classic processing KNN based Conformal Anomaly Detector (KNN-CAD)
CpKnnCad calculates the anomalies of a dataset using classical processing based on the KNN-CAD algorithm. KNN-CAD is a model-free anomaly detection method for univariate time-series which adapts itself to non-stationarity in the data stream and provides probabilistic abnormality scores based on the conformal prediction paradigm.
CpKnnCad(data, n.train, threshold = 1, l = 19, k = 27, ncm.type = "ICAD", reducefp = TRUE)
Arguments
data: Numerical vector with training and test dataset.
n.train: Number of points of the dataset that correspond to the training set.
threshold: Anomaly threshold.
l: Window length.
k: Number of neighbours to take into account.
ncm.type: Non-conformity measure to use: "ICAD" or "LDCD".
reducefp: If TRUE, reduces false positives.
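The constraints that the Details section places on data, threshold and k can be checked before calling the detector. The following is a minimal sketch in base R; check_knn_cad_args is a hypothetical helper that is not part of the package, and the requirement that n.train does not exceed the length of data is an assumption inferred from the argument description.

```r
# Hypothetical pre-flight check; not part of the package.
check_knn_cad_args <- function(data, n.train, threshold, k) {
  stopifnot(
    # data: numerical vector without NA values
    is.numeric(data), !anyNA(data),
    # threshold: single numeric value between 0 and 1
    is.numeric(threshold), length(threshold) == 1,
    threshold >= 0, threshold <= 1,
    # n.train: assumed to be at most the number of points in data
    is.numeric(n.train), length(n.train) == 1, n.train <= length(data),
    # k: numerical value less than n.train
    is.numeric(k), length(k) == 1, k < n.train
  )
  invisible(TRUE)
}

# Passes silently; any violated condition raises an error instead
check_knn_cad_args(data = c(5, 7, 6, 8, 40, 6), n.train = 4, threshold = 1, k = 3)
```

Because stopifnot reports the first failing expression, the error message points at the argument that needs attention.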
Returns
Dataset with the following columns:
is.anomaly: 1 if the value is anomalous, 0 otherwise.
anomaly.score: Probability of anomaly.
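As an illustration of how the two columns might be used, the sketch below works on a small hand-made data frame rather than on real detector output. The values are invented for the example; only the column names come from the description above.

```r
# Hypothetical output with the two documented columns; values are made up.
result <- data.frame(
  is.anomaly    = c(0, 0, 1, 0, 1),
  anomaly.score = c(0.10, 0.35, 1.00, 0.20, 1.00)
)

# Positions of the observations flagged as anomalous
anomalous.idx <- which(result$is.anomaly == 1)

# Scores of the flagged observations
flagged.scores <- result$anomaly.score[anomalous.idx]

# Share of observations flagged as anomalous
anomaly.rate <- mean(result$is.anomaly == 1)
```

Since the rows follow the order of the input vector, anomalous.idx can be used directly to index the original data.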
Details
data must be a numerical vector without NA values. threshold must be a numeric value between 0 and 1. If the anomaly score obtained for an observation is greater than the threshold, the observation is considered abnormal. l must be a numerical value between 1 and 1/n; n being the length of the training data. Note that the value of l has a direct impact on the computational cost, so very high values will increase the execution time. k must be a numerical value less than n.train. ncm.type determines the non-conformity measure to be used: ICAD calculates dissimilarity as the sum of the distances to the k nearest neighbours, and LDCD as their average.
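The difference between the two non-conformity measures can be sketched in a few lines of base R. This is an illustration only, not the package's implementation: knn_ncm is a hypothetical helper, the reference windows are assumed to be stored one per row of a matrix, and Euclidean distance is an assumed metric that the detector itself may not use.

```r
# Hypothetical helper, not part of the package: non-conformity of one window
# relative to a matrix of reference windows (one window per row).
knn_ncm <- function(window, reference, k, ncm.type = c("ICAD", "LDCD")) {
  ncm.type <- match.arg(ncm.type)
  stopifnot(is.matrix(reference),
            ncol(reference) == length(window),
            k <= nrow(reference))
  # Euclidean distance from the window to every reference window (assumed metric)
  d <- sqrt(rowSums(sweep(reference, 2, window)^2))
  # Distances to the k nearest reference windows
  nearest <- sort(d)[seq_len(k)]
  # ICAD: sum of the k distances; LDCD: their average
  if (ncm.type == "ICAD") sum(nearest) else mean(nearest)
}

reference <- rbind(c(0, 0), c(3, 4), c(6, 8), c(30, 40))
w <- c(0, 0)

# The three nearest distances are 0, 5 and 10
icad <- knn_ncm(w, reference, k = 3, ncm.type = "ICAD")  # 15
ldcd <- knn_ncm(w, reference, k = 3, ncm.type = "LDCD")  # 5
```

For a fixed k the two measures differ only by the factor k, so they rank windows identically; the choice mainly affects the scale of the raw dissimilarity.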
Examples
## Generate data
set.seed(100)
n <- 350
x <- sample(1:100, n, replace = TRUE)
x[70:90] <- sample(110:115, 21, replace = TRUE)
x[25] <- 200
x[320] <- 170
df <- data.frame(timestamp = 1:n, value = x)

## Set parameters
params.KNN <- list(threshold = 1, n.train = 50, l = 19, k = 17)

## Calculate anomalies
result <- CpKnnCad(
  data = df$value,
  n.train = params.KNN$n.train,
  threshold = params.KNN$threshold,
  l = params.KNN$l,
  k = params.KNN$k,
  ncm.type = "ICAD",
  reducefp = TRUE
)

## Plot results
res <- cbind(df, result)
PlotDetections(res, title = "KNN-CAD ANOMALY DETECTOR")
References
V. Ishimtsev, I. Nazarov, A. Bernstein and E. Burnaev. Conformal k-NN Anomaly Detector for Univariate Data Streams. arXiv e-prints, June 2017.