CWindowCluster function

Window-Level Time Series Clustering

Cluster time series at a window level, based on Algorithm 2 of Ciampi et al. (2010).

CWindowCluster( X, Alpha = NULL, Beta = NULL, Delta = NULL, Theta = 0.8, p, w, s, Epsilon = 1 )

Arguments

  • X: a matrix of time series observed within a slide (time series in columns).
  • Alpha: lower limit of the time-series domain, passed to CSlideCluster.
  • Beta: upper limit of the time-series domain, passed to CSlideCluster.
  • Delta: closeness parameter passed to CSlideCluster.
  • Theta: connectivity parameter passed to CSlideCluster.
  • p: number of layers (time-series observations) in each slide.
  • w: number of slides in each window.
  • s: step to shift a window, calculated in the number of slides. The recommended values are 1 (overlapping windows) or equal to w (non-overlapping windows).
  • Epsilon: a real value in [0,1] used to identify each pair of time series that are clustered together over at least w*Epsilon slides within a window; see Definition 7 by Ciampi et al. (2010). Default is 1.
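
The role of Epsilon can be illustrated with a minimal base-R sketch. It is not the package's internal code: the slide-level labels, the `together` matrix, and the `linked()` helper are all hypothetical names invented for illustration. The sketch counts, for each pair of series, the number of slides in which they share a cluster label, and then applies the w*Epsilon threshold described above.

```r
# Hypothetical slide-level labels: rows are slides (w = 4), columns are series.
slide_labels <- rbind(
  c(1, 1, 2, 2),
  c(1, 1, 1, 2),
  c(1, 1, 2, 2),
  c(1, 2, 2, 2)
)
colnames(slide_labels) <- paste0("TS", 1:4)
w <- nrow(slide_labels)
N <- ncol(slide_labels)

# together[i, j] = number of slides in which series i and j share a label.
together <- matrix(0, N, N,
                   dimnames = list(colnames(slide_labels), colnames(slide_labels)))
for (k in seq_len(w)) {
  together <- together + outer(slide_labels[k, ], slide_labels[k, ], "==")
}

# Hypothetical helper: pairs clustered together in at least w*Epsilon slides.
linked <- function(Epsilon) together >= w * Epsilon

linked(1)     # only the diagonal: no pair agrees in all 4 slides
linked(0.75)  # TS1-TS2 and TS3-TS4 agree in at least 3 of 4 slides
```

With Epsilon = 1, a pair must be clustered together in every slide of the window; lowering Epsilon relaxes that requirement.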

Returns

A vector (if X contains only one window) or matrix with cluster labels for each time series (columns) and window (rows).
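
As a rough illustration of working with this output, the sketch below uses a made-up label matrix (not real CWindowCluster output) with two windows and five series. The object names `labels`, `sizes`, and `members` are hypothetical.

```r
# Hypothetical output: 2 windows (rows) and 5 time series (columns).
labels <- rbind(c(1, 1, 2, 1, 3),
                c(1, 2, 2, 1, 1))
colnames(labels) <- paste0("TS", 1:5)

# If only one window was returned (a plain vector), coerce it to a
# one-row matrix so that the same code handles both cases.
if (is.null(dim(labels))) {
  labels <- matrix(labels, nrow = 1, dimnames = list(NULL, names(labels)))
}

# Cluster sizes in each window.
sizes <- lapply(seq_len(nrow(labels)), function(i) table(labels[i, ]))

# Names of the series assigned to cluster 1 in window 1.
members <- colnames(labels)[labels[1, ] == 1]
members
```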

Details

This is the upper-level function for time series clustering. It exploits the function CSlideCluster to cluster time series within each slide based on closeness and homogeneity measures. Then, it uses slide-level cluster assignments to cluster time series within each window.

The total length of time series (number of levels, i.e., nrow(X)) should be divisible by p.
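
The following sketch shows one way to check this requirement and to see which rows of X fall into each window. The helper `window_rows()` is hypothetical and not part of the package; it assumes that windows start at slide 1 and advance by s slides, which matches the row indexing used in the example below.

```r
# Hypothetical helper: row indices of X covered by each window.
# n = nrow(X), p = layers per slide, w = slides per window, s = step in slides.
window_rows <- function(n, p, w, s) {
  stopifnot(n %% p == 0)       # the series length must be divisible by p
  n_slides <- n / p
  stopifnot(n_slides >= w)     # at least one full window is needed
  starts <- seq(1, n_slides - w + 1, by = s)  # first slide of each window
  lapply(starts, function(k) ((k - 1) * p + 1):((k + w - 1) * p))
}

# Two years of weekly data in 4-week slides and non-overlapping 1-year windows.
rows <- window_rows(n = 104, p = 4, w = 13, s = 13)
length(rows)       # 2 windows
range(rows[[2]])   # rows 53 to 104
```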

Examples

# For example, weekly data come in slides of 4 weeks
p <- 4 # number of layers in each slide (data come in a slide)

# We want to analyze the trend clusters within a window of 1 year
w <- 13 # number of slides in each window
s <- w  # step to shift a window

# Simulate 26 autoregressive time series with two years of weekly data (52*2 weeks),
# with a 'burn-in' period of 300.
N <- 26
T <- 2*p*w

set.seed(123)
phi <- c(0.5) # parameter of autoregression
X <- sapply(1:N, function(x) arima.sim(n = T + 300,
     list(order = c(length(phi), 0, 0), ar = phi)))[301:(T + 300),]
colnames(X) <- paste("TS", c(1:dim(X)[2]), sep = "")

tmp <- CWindowCluster(X, Delta = NULL, Theta = 0.8, p = p, w = w, s = s, Epsilon = 1)

# Time series were simulated with the same parameters, but based on the clustering parameters,
# not all time series join the same cluster. We can plot the main cluster for each window, and
# time series out of the cluster:
par(mfrow = c(2, 2))
ts.plot(X[c(1:(p*w)), tmp[1,] == 1], ylim = c(-4, 4),
        main = "Time series cluster 1 in window 1")
ts.plot(X[c(1:(p*w)), tmp[1,] != 1], ylim = c(-4, 4),
        main = "The rest of the time series in window 1")
ts.plot(X[c(1:(p*w)) + s*p, tmp[2,] == 1], ylim = c(-4, 4),
        main = "Time series cluster 1 in window 2")
ts.plot(X[c(1:(p*w)) + s*p, tmp[2,] != 1], ylim = c(-4, 4),
        main = "The rest of the time series in window 2")

References

Ciampi A, Appice A, Malerba D (2010). "Discovering trend-based clusters in spatially distributed data streams." In International Workshop of Mining Ubiquitous and Social Environments, 107-122.

See Also

CSlideCluster and BICC

Author(s)

Vyacheslav Lyubchich

  • Maintainer: Vyacheslav Lyubchich
  • License: GPL (>= 2)
  • Last published: 2023-03-21
