Cluster time series at a window level, based on Algorithm 2 of Ciampi et al. (2010).
X: a matrix of time series observed over one or more windows (time series in columns).
Alpha: lower limit of the time-series domain, passed to CSlideCluster.
Beta: upper limit of the time-series domain, passed to CSlideCluster.
Delta: closeness parameter passed to CSlideCluster.
Theta: connectivity parameter passed to CSlideCluster.
p: number of layers (time-series observations) in each slide.
w: number of slides in each window.
s: step by which the window is shifted, measured in slides. The recommended values are 1 (overlapping windows) or w (non-overlapping windows).
Epsilon: a real value in [0,1] used to identify each pair of time series that are clustered together over at least w*Epsilon slides within a window; see Definition 7 by Ciampi et al. (2010). Default is 1.
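The Epsilon threshold can be illustrated with a small base-R sketch. It is not the package's internal code: the label matrix slide_labels and the objects together and linked are hypothetical names, and the slide-level labels are made up rather than produced by CSlideCluster. The sketch only shows how a pair of series can be checked against the w*Epsilon rule.

```r
# Hypothetical slide-level labels for one window:
# w = 4 slides (rows), 3 time series (columns). Values are invented.
slide_labels <- rbind(c(1, 1, 2),
                      c(1, 1, 1),
                      c(1, 1, 2),
                      c(1, 2, 2))
colnames(slide_labels) <- c("TS1", "TS2", "TS3")

w <- nrow(slide_labels)
Epsilon <- 0.75

# For each pair of series, count the slides in which both carry the same label
n <- ncol(slide_labels)
together <- matrix(0, n, n,
                   dimnames = list(colnames(slide_labels), colnames(slide_labels)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    together[i, j] <- sum(slide_labels[, i] == slide_labels[, j])
  }
}

# A pair qualifies when it is clustered together over at least w * Epsilon slides
linked <- together >= w * Epsilon
print(linked)
```

With these made-up labels, only TS1 and TS2 share a cluster in at least 3 of the 4 slides, so only that pair passes the threshold. With the default Epsilon = 1, a pair must share a cluster in every slide of the window.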
Returns
A vector (if X contains only one window) or matrix with cluster labels for each time series (columns) and window (rows).
Details
This is the upper-level function for time series clustering. It exploits the function CSlideCluster to cluster time series within each slide based on closeness and homogeneity measures. Then, it uses slide-level cluster assignments to cluster time series within each window.
The total length of the time series (number of layers, i.e., nrow(X)) should be divisible by p.
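As a rough check of how p, w, and s partition the rows of X, the following base-R sketch computes the slide and window boundaries. It assumes that windows start at slide 1 and advance by s slides until fewer than w slides remain. That assumption matches the row offsets used in the Examples but is not taken from the package source, and the names n_rows, n_slides, first_slide, and window_rows are hypothetical.

```r
p <- 4    # layers per slide
w <- 13   # slides per window
s <- 13   # shift between windows, in slides

n_rows <- 2 * p * w            # stands in for nrow(X); 104 here
stopifnot(n_rows %% p == 0)    # the series length must be divisible by p

n_slides <- n_rows / p         # 26 slides

# Index of the first slide of each window (assumed layout, see the lead-in)
first_slide <- seq(1, n_slides - w + 1, by = s)

# Rows of X covered by each window
window_rows <- lapply(first_slide,
                      function(k) ((k - 1) * p + 1):((k - 1 + w) * p))

length(window_rows)            # number of windows
sapply(window_rows, range)     # first and last row of each window
```

For these values the sketch gives two non-overlapping windows covering rows 1 to 52 and 53 to 104. Setting s <- 1 instead gives 14 overlapping windows.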
Examples
# For example, weekly data come in slides of 4 weeks
p <- 4 # number of layers in each slide (data come in a slide)

# We want to analyze the trend clusters within a window of 1 year
w <- 13 # number of slides in each window
s <- w  # step to shift a window

# Simulate 26 autoregressive time series with two years of weekly data (52*2 weeks),
# with a 'burn-in' period of 300.
N <- 26
T <- 2*p*w

set.seed(123)
phi <- c(0.5) # parameter of autoregression
X <- sapply(1:N, function(x) arima.sim(n = T + 300,
     list(order = c(length(phi), 0, 0), ar = phi)))[301:(T + 300), ]
colnames(X) <- paste("TS", c(1:dim(X)[2]), sep = "")

tmp <- CWindowCluster(X, Delta = NULL, Theta = 0.8, p = p, w = w, s = s, Epsilon = 1)

# Time series were simulated with the same parameters, but based on the clustering parameters,
# not all time series join the same cluster. We can plot the main cluster for each window, and
# time series out of the cluster:
par(mfrow = c(2, 2))
ts.plot(X[c(1:(p*w)), tmp[1, ] == 1], ylim = c(-4, 4),
        main = "Time series cluster 1 in window 1")
ts.plot(X[c(1:(p*w)), tmp[1, ] != 1], ylim = c(-4, 4),
        main = "The rest of the time series in window 1")
ts.plot(X[c(1:(p*w)) + s*p, tmp[2, ] == 1], ylim = c(-4, 4),
        main = "Time series cluster 1 in window 2")
ts.plot(X[c(1:(p*w)) + s*p, tmp[2, ] != 1], ylim = c(-4, 4),
        main = "The rest of the time series in window 2")