This function calculates an improved lower bound (LB) on the Dynamic Time Warp (DTW) distance between two time series. It uses a Sakoe-Chiba constraint.
y: A time series with the same length as x (query).
window.size: Window size for envelope calculation. See details.
norm: Vector norm. Either "L1" for Manhattan distance or "L2" for Euclidean.
lower.env: Optionally, a pre-computed lower envelope for ‘y’ can be provided (non-proxy version only). See compute_envelope().
upper.env: Optionally, a pre-computed upper envelope for ‘y’ can be provided (non-proxy version only). See compute_envelope().
force.symmetry: If TRUE, a second lower bound is calculated by swapping x and y, and whichever result has a higher distance value is returned. The proxy version can only work if a square matrix is obtained, but use carefully.
error.check: Logical indicating whether the function should try to detect inconsistencies and give more informative errors messages. Also used internally to avoid repeating checks.
Returns
The improved lower bound for the DTW distance.
Details
The reference time series should go in x, whereas the query time series should go in y.
If the envelopes are provided, they should be provided together. If either one is missing, both will be computed.
The windowing constraint uses a centered window. The calculations expect a value in window.size that represents the distance between the point considered and one of the edges of the window. Therefore, if, for example, window.size = 10, the warping for an observation xi considers the points between xi−10 and xi+10, resulting in 10(2) + 1 = 21 observations falling within the window.
Proxy version
The version registered with proxy::dist() is custom (loop = FALSE in proxy::pr_DB ). The custom function handles multi-threaded parallelization directly with RcppParallel . It uses all available threads by default (see RcppParallel::defaultNumThreads()), but this can be changed by the user with RcppParallel::setThreadOptions().
An exception to the above is when it is called within a foreach parallel loop made by dtwclust . If the parallel workers do not have the number of threads explicitly specified, this function will default to 1 thread per worker. See the parallelization vignette for more information - browseVignettes("dtwclust")
Note
The lower bound is only defined for time series of equal length and is not symmetric.
If you wish to calculate the lower bound between several time series, it would be better to use the version registered with the proxy package, since it includes some small optimizations. The convention mentioned above for references and queries still holds. See the examples.
The proxy version of force.symmetry should only be used when only x is provided or both x
and y are identical. It compares the lower and upper triangular of the resulting distance matrix and forces symmetry in such a way that the tightest lower bound is obtained.
Examples
# Sample datadata(uciCT)# Lower bound distance between two seriesd.lbi <- lb_improved(CharTraj[[1]], CharTraj[[2]], window.size =20)# Corresponding true DTW distanced.dtw <- dtw(CharTraj[[1]], CharTraj[[2]], window.type ="sakoechiba", window.size =20)$distance
d.lbi <= d.dtw
# Calculating the LB between several time series using the 'proxy' package# (notice how both argments must be lists)D.lbi <- proxy::dist(CharTraj[1], CharTraj[2:5], method ="LB_Improved", window.size =20, norm ="L2")# Corresponding true DTW distanceD.dtw <- proxy::dist(CharTraj[1], CharTraj[2:5], method ="dtw_basic", norm ="L2", window.size =20)D.lbi <= D.dtw
References
Lemire D (2009). ``Faster retrieval with a two-pass dynamic-time-warping lower bound .'' Pattern Recognition, 42 (9), pp. 2169 - 2180. ISSN 0031-3203, tools:::Rd_expr_doi("10.1016/j.patcog.2008.11.030") , https://www.sciencedirect.com/science/article/pii/S0031320308004925.