ccf_boot() R function from [funtimes]

Cross-Correlation of Autocorrelated Time Series

Account for possible autocorrelation of time series when assessing the statistical significance of their cross-correlation. A sieve bootstrap approach is used to generate multiple copies of the time series with the same autoregressive dependence, under the null hypothesis of the two time series under investigation being uncorrelated. The significance of cross-correlation coefficients is assessed based on the distribution of their bootstrapped counterparts. Both Pearson and Spearman types of coefficients are obtained, but a plot is provided for only one type, with significant correlations shown using filled circles (see Examples).
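The procedure described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the funtimes implementation: the helper sieve_copy() is hypothetical, the AR order is selected by AIC through stats::ar() (the package estimates the dependence with ARest instead), and the burn-in length and B = 200 are arbitrary choices made for speed.

```r
# Sieve bootstrap sketch for cross-correlation (base R only; NOT the funtimes code)
set.seed(1)

# Hypothetical helper: one bootstrap copy of z with the same fitted AR dependence
sieve_copy <- function(z, burn = 50) {
  fit <- ar(z, aic = TRUE, order.max = 5)      # assumption: AR order chosen by AIC
  res <- as.numeric(na.omit(fit$resid))
  res <- res - mean(res)                       # center residuals before resampling
  e <- sample(res, length(z) + burn, replace = TRUE)
  if (fit$order > 0) {
    # Rebuild the series recursively from resampled innovations
    e <- as.numeric(stats::filter(e, filter = as.numeric(fit$ar),
                                  method = "recursive"))
  }
  e[-seq_len(burn)] + fit$x.mean               # drop burn-in, restore the mean
}

x <- as.numeric(arima.sim(list(ar = 0.5), n = 100))
y <- as.numeric(arima.sim(list(ar = 0.3), n = 100))
lag.max <- 10
B <- 200
level <- 0.95

# Observed cross-correlations
r_obs <- as.numeric(ccf(x, y, lag.max = lag.max, plot = FALSE)$acf)

# Bootstrapped cross-correlations under the null: the copies of x and y are
# generated independently, so they are uncorrelated by construction
r_boot <- replicate(B, {
  as.numeric(ccf(sieve_copy(x), sieve_copy(y),
                 lag.max = lag.max, plot = FALSE)$acf)
})  # matrix with one row per lag and one column per bootstrap replicate

# Two-sided empirical bounds at the chosen confidence level, lag by lag
alpha <- 1 - level
bounds <- apply(r_boot, 1, quantile, probs = c(alpha / 2, 1 - alpha / 2))

out <- data.frame(Lag = -lag.max:lag.max, r = r_obs,
                  lower = bounds[1, ], upper = bounds[2, ])
out$significant <- out$r < out$lower | out$r > out$upper
head(out)
```

Because both series are resampled separately, any cross-correlation in the bootstrapped pairs arises by chance alone, which is what makes the quantiles usable as critical values for the observed coefficients.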


ccf_boot(
  x,
  y,
  lag.max = NULL,
  plot = c("Pearson", "Spearman", "none"),
  level = 0.95,
  B = 1000,
  smooth = FALSE,
  cl = 1L,
  ...
)

Arguments

x, y: univariate numeric time-series objects or numeric vectors for which to compute cross-correlation. Different time attributes in ts objects are taken into account (see Example 2 below).
lag.max: maximum lag at which to calculate the cross-correlation. Will be automatically limited as in ccf.
plot: choose whether to plot results for Pearson correlation (default, or use plot = "Pearson"), Spearman correlation (use plot = "Spearman"), or suppress plotting (use plot = "none"). Both Pearson's and Spearman's results are given in the output, regardless of the plot setting.
level: confidence level, from 0 to 1. Default is 0.95, that is, 95% confidence.
B: number of bootstrap simulations to obtain empirical critical values. Default is 1000.
smooth: logical value indicating whether the bootstrap confidence bands should be smoothed across lags. Default is FALSE meaning no smoothing.
cl: parameter specifying the computer cluster for bootstrapping, passed to the package parallel (the default cl = 1 means no cluster is used). Possible values are:
- cluster object (list) produced by makeCluster. In this case, a new cluster is neither started nor stopped;
- NULL. In this case, the function will detect available cores (see detectCores) and, if there are multiple cores (> 1), a cluster will be started with makeCluster. If started, the cluster will be stopped after the computations are finished;
- positive integer defining the number of cores to start a cluster. If cl = 1 (default), no attempt to create a cluster will be made. If cl > 1, a cluster will be started (using makeCluster) and stopped afterward (using stopCluster).
...: other parameters passed to the function ARest to control how autoregressive dependencies are estimated. The same set of parameters is used separately on x and y.
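A short usage sketch for the cl and ... arguments follows. It assumes that funtimes is installed (the code is skipped otherwise) and that ARest accepts the arguments ar.order, ar.method, and ic; these names are taken from memory of the ARest help page and should be checked there. The number of workers and B = 200 are arbitrary.

```r
if (requireNamespace("funtimes", quietly = TRUE)) {
  set.seed(1)
  x <- rnorm(100)
  y <- rnorm(100)

  # Option 1: pass an existing cluster; ccf_boot() neither starts nor stops it,
  # so the caller is responsible for stopping it afterward
  my_cl <- parallel::makeCluster(2)
  res_cl <- funtimes::ccf_boot(x, y, plot = "none", B = 200, cl = my_cl)
  parallel::stopCluster(my_cl)

  # Option 2: let the function start and stop a 2-core cluster itself
  res_int <- funtimes::ccf_boot(x, y, plot = "none", B = 200, cl = 2L)

  # Arguments forwarded through ... to ARest (assumed names, see lead-in);
  # the same settings are applied separately to x and y
  res_ar <- funtimes::ccf_boot(x, y, plot = "none", B = 200,
                               ar.order = 2, ar.method = "yw", ic = "none")
}
```

Passing a cluster object is convenient when several calls share the same workers, since the start-up cost is paid only once.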

Returns

A data frame with the following columns:

Lag: lags for which the following values were obtained.
r_P: observed Pearson correlations.
lower_P, upper_P: lower and upper confidence bounds (for the confidence level set by level) for Pearson correlations.
r_S: observed Spearman correlations.
lower_S, upper_S: lower and upper confidence bounds (for the confidence level set by level) for Spearman correlations.
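As an illustration of working with this output, the sketch below flags the lags whose observed correlation falls outside the bounds. The data frame is filled with hypothetical values that only mimic the documented column layout; it is not real ccf_boot() output, and treating "outside the bounds" as the significance rule is an assumption consistent with the description of the plot.

```r
# Hypothetical values in the documented column layout (NOT real ccf_boot() output)
res <- data.frame(
  Lag = -2:2,
  r_P = c(0.05, -0.31, 0.10, 0.42, -0.02),
  lower_P = rep(-0.20, 5),
  upper_P = rep(0.20, 5),
  r_S = c(0.04, -0.18, 0.08, 0.39, -0.01),
  lower_S = rep(-0.21, 5),
  upper_S = rep(0.21, 5)
)

# A correlation is flagged when it lies outside its bootstrap bounds
sig_P <- with(res, r_P < lower_P | r_P > upper_P)
sig_S <- with(res, r_S < lower_S | r_S > upper_S)

res$Lag[sig_P]  # lags -1 and 1 for Pearson
res$Lag[sig_S]  # lag 1 for Spearman
```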

Details

Note that the smoothing of confidence bands is implemented purely for visual appearance. This smoothing is different from the smoothing methods that can be applied to adjust bootstrap performance (De Angelis and Young 1992). For correlations close to the significance bounds, the setting of smooth might affect the decision on statistical significance. In this case, it is recommended to keep smooth = FALSE and set a higher B.
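The sensitivity near the bounds can be illustrated with a small base R sketch. All numbers are hypothetical, and the smoother (loess with an arbitrary span) is an assumption chosen for illustration; the smoother used inside the package may differ.

```r
set.seed(1)
lags <- -15:15

# Hypothetical per-lag upper critical values: a flat level plus Monte Carlo noise,
# as would arise from a finite number of bootstrap replicates
upper_raw <- 0.20 + rnorm(length(lags), sd = 0.015)

# Cosmetic smoothing across lags (assumption: loess with span = 0.5)
upper_smooth <- predict(loess(upper_raw ~ lags, span = 0.5))

# Hypothetical observed correlations sitting right at the nominal bound
r <- rep(0.20, length(lags))

# Lags where the significance decision differs between raw and smoothed bounds
flipped <- (r > upper_raw) != (r > upper_smooth)
lags[flipped]
```

Increasing B reduces the Monte Carlo noise in the raw bounds, which is why a higher B with smooth = FALSE is the safer choice for borderline correlations.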

Examples


## Not run:

# Fix seed for reproducible simulations:
set.seed(1)

# Example 1
# Simulate independent normal time series of same lengths
x <- rnorm(100)
y <- rnorm(100)
# Default CCF with parametric confidence band
ccf(x, y)
# CCF with bootstrap
tmp <- ccf_boot(x, y)
# One can extract results for both Pearson and Spearman correlations
tmp$r_P
tmp$r_S

# Example 2
# Simulated ts objects of different lengths and starts (incomplete overlap)
x <- arima.sim(list(order = c(1, 0, 0), ar = 0.5), n = 30)
x <- ts(x, start = 2001)
y <- arima.sim(list(order = c(2, 0, 0), ar = c(0.5, 0.2)), n = 40)
y <- ts(y, start = 2020)
# Show how x and y are aligned
ts.plot(x, y, col = 1:2, lty = 1:2)
# The usual CCF
ccf(x, y)
# CCF with bootstrap confidence intervals
ccf_boot(x, y, plot = "Spearman")
# Notice that only ±7 lags can be calculated in both cases because of the small
# overlap of the time series. If we saved these time series as plain vectors, the time
# information would be lost, and the time series would be misaligned.
ccf(as.numeric(x), as.numeric(y))

# Example 3
# Box & Jenkins time series of sales and a leading indicator, see ?BJsales
plot.ts(cbind(BJsales.lead, BJsales))
# Each of the BJ time series appears to have a stochastic linear trend, so apply differences
plot.ts(cbind(diff(BJsales.lead), diff(BJsales)))
# Get cross-correlation of the differenced series
ccf_boot(diff(BJsales.lead), diff(BJsales), plot = "Spearman")
# The leading indicator "stands out" with significant correlations at negative lags,
# showing it can be used to predict the sales 2-3 time steps ahead (that is,
# diff(BJsales.lead) at times t-2 and t-3 is strongly correlated with diff(BJsales) at
# current time t).
## End(Not run)

Author(s)

Vyacheslav Lyubchich

funtimes package

Maintainer: Vyacheslav Lyubchich
License: GPL (>= 2)
Last published: 2023-03-21
