Hall, Horowitz, and Jing (1995) "HHJ" Algorithm to Select the Optimal Block-Length
Perform the Hall, Horowitz, and Jing (1995) "HHJ" cross-validation algorithm to select the optimal block-length for a bootstrap on dependent data (block-bootstrap). Dependent data such as stationary time series are suitable for usage with the HHJ algorithm.
series: a numeric vector or time series giving the original data for which to find the optimal block-length.
nb: an integer value, number of bootstrapped series to compute.
n_iter: an integer value, maximum number of iterations for the HHJ algorithm to compute.
pilot_block_length: a numeric value, the block-length (l∗ in HHJ) for which to perform initial block bootstraps.
sub_sample: a numeric value, the length of each overlapping subsample, m in HHJ.
k: a character string, either "bias/variance", "one-sided", or "two-sided" depending on the desired object of estimation. If the desired bootstrap statistic is bias or variance then select "bias/variance" which sets k=3 per HHJ. If the object of estimation is the one-sided or two-sided distribution function, then set k = "one-sided" or k = "two-sided" which sets k=4 and k=5, respectively. For the purpose of generating symmetric confidence intervals around an unknown parameter, k = "two-sided" (the default) should be used.
bofb: a numeric value, length of the basic blocks in the block-of-blocks bootstrap; see m = for tsbootstrap and Kunsch (1989).
search_grid: a numeric value, the range of solutions around l∗ to evaluate within the MSE function after the first iteration. The first iteration searches through all possible block-lengths unless grid_step = specifies otherwise.
grid_step: a numeric value or vector of at most length 2, the number of steps to increment over the subsample block-lengths when evaluating the MSE function. If grid_step = 1 then each block-length will be evaluated in the MSE function. If grid_step > 1, the MSE function will search over the sequence of block-lengths from 1 to m by grid_step. If grid_step is a vector of length 2, the first iteration will step by the first element of grid_step and subsequent iterations will step by the second element.
cl: a cluster object, created by package parallel, doParallel, or snow. If NULL, no parallelization will be used.
verbose: a logical value, if set to FALSE then no interim messages are output to the console. Error messages will still be output. Default is TRUE.
plots: a logical value, if set to FALSE then no interim plots are drawn. Default is TRUE.
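As an illustration of the grid_step argument described above, the candidate block-lengths "from 1 to m by grid_step" can be sketched in base R. The values of m and grid_step here are hypothetical and chosen only for the example; this is not the internal code of hhj().

```r
# Hypothetical values, for illustration only
m <- 10          # subsample length, as passed to sub_sample
grid_step <- 3   # increment between candidate block-lengths

# Candidate block-lengths from 1 to m in steps of grid_step
candidates <- seq(from = 1, to = m, by = grid_step)
print(candidates)
```

With grid_step = 1 the same call would list every block-length from 1 to m.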
Returns
an object of class 'hhj'
Details
The HHJ algorithm is computationally intensive as it relies on a cross-validation process using a type of subsampling to estimate the mean squared error (MSE) incurred by the bootstrap at various block-lengths.
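The subsampling idea can be sketched in base R as follows. This is a minimal illustration, not the implementation used by hhj(): the variable names, the value chosen for the subsample-optimal block-length l_m, and the rounding are assumptions made for the example. The rescaling step assumes the rule from HHJ that a block-length found optimal for subsamples of length m is scaled to the full series length n by the factor (n/m)^(1/k).

```r
set.seed(1)
x <- as.numeric(stats::arima.sim(list(ar = 0.5), n = 500))
n <- length(x)
m <- 10   # subsample length (sub_sample)

# Build all overlapping subsamples of length m; row i holds x[i:(i + m - 1)]
starts <- seq_len(n - m + 1)
subsamples <- t(sapply(starts, function(i) x[i:(i + m - 1)]))
dim(subsamples)

# Rescale a subsample-optimal block-length to the full series length.
# l_m is a hypothetical value; k = 5 corresponds to k = "two-sided".
k <- 5
l_m <- 2
l_n <- round((n / m)^(1 / k) * l_m)
l_n
```

Each candidate block-length is evaluated on every one of these subsamples, which is why the run time grows quickly with the length of the series and the size of the search grid.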
Under the hood, hhj() uses tsbootstrap (see Trapletti and Hornik (2020)) to perform the moving block-bootstrap, or the block-of-blocks bootstrap when bofb > 1, according to Kunsch (1989).
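For readers unfamiliar with the moving block-bootstrap, the following base-R sketch shows the basic resampling step. The helper mbb_resample() is hypothetical and written only for this illustration; it is not the tsbootstrap implementation and ignores the block-of-blocks case.

```r
# Hypothetical helper: draw one moving block-bootstrap resample of x
mbb_resample <- function(x, block_length) {
  n <- length(x)
  n_blocks <- ceiling(n / block_length)
  # Overlapping blocks may start at any index from 1 to n - block_length + 1
  starts <- sample.int(n - block_length + 1, n_blocks, replace = TRUE)
  idx <- unlist(lapply(starts, function(s) s:(s + block_length - 1)))
  # Trim the concatenated blocks back to the original series length
  x[idx[seq_len(n)]]
}

set.seed(42)
x <- as.numeric(stats::arima.sim(list(ar = 0.5), n = 500))

# Bootstrap estimate of the variance of the sample mean, block-length 8
boot_means <- replicate(200, mean(mbb_resample(x, block_length = 8)))
var(boot_means)
```

The block-length of 8 is arbitrary here; choosing it in a data-driven way is the problem that the HHJ algorithm addresses.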
References
Adrian Trapletti and Kurt Hornik (2020). tseries: Time Series Analysis and Computational Finance. R package version 0.10-48.
Kunsch, H. (1989) The Jackknife and the Bootstrap for General Stationary Observations. The Annals of Statistics, 17(3), 1217-1241. Retrieved February 16, 2021, from doi:10.1214/aos/1176347265
Peter Hall, Joel L. Horowitz, Bing-Yi Jing, On blocking rules for the bootstrap with dependent data, Biometrika, Volume 82, Issue 3, September 1995, Pages 561-574, doi:10.1093/biomet/82.3.561
Examples
# Generate AR(1) time series
sim <- stats::arima.sim(list(order = c(1, 0, 0), ar = 0.5),
                        n = 500, innov = rnorm(500))

# Calculate optimal block length for series
hhj(sim, sub_sample = 10)

# Use parallel computing
library(parallel)

# Make cluster object with 2 cores
cl <- makeCluster(2)

# Calculate optimal block length for series
hhj(sim, cl = cl)

# Release the worker processes when finished
stopCluster(cl)