Generate Maximum Entropy Bootstrapped Time Series Ensemble Specifying Rank Correlation
Generates maximum entropy bootstrap replicates for dependent data, specifying the Spearman rank correlation coefficient between the replicate series and the original series. (See details.)
x: vector of data, ts object or pdata.frame object.
reps: number of replicates to generate.
setSpearman: the desired Spearman rank correlation. The default setting, setSpearman=1, generates replicates that are perfectly dependent on the original time series. setSpearman<1 admits less than perfect dependence, which is more realistic for some purposes.
drift: logical; TRUE default preserves the drift of the original series.
trim: the trimming proportion.
xmin: the lower limit for left tail.
xmax: the upper limit for right tail.
reachbnd: logical. If TRUE potentially reached bounds (xmin = smallest value - trimmed mean and xmax=largest value + trimmed mean) are given when the random draw happens to be equal to 0 and 1, respectively.
expand.sd: logical. If TRUE the standard deviation in the ensemble is expanded. See expand.sd.
force.clt: logical. If TRUE the ensemble is forced to satisfy the central limit theorem. See force.clt.
scl.adjustment: logical. If TRUE scale adjustment is performed to ensure that the population variance of the transformed series equals the variance of the data.
sym: logical. If TRUE an adjustment is performed to ensure that the ME density is symmetric.
elaps: logical. If TRUE elapsed time during computations is displayed.
colsubj: the column in x that contains the individual index. It is ignored if the input data x is not a pdata.frame object.
coldata: the column in x that contains the data of the variable to create the ensemble. It is ignored if the input data x is not a pdata.frame object.
coltimes: an optional argument indicating the column that contains the times at which the observations for each individual are observed. It is ignored if the input data x is not a pdata.frame object.
...: possible argument fiv to be passed to expand.sd.
Details
Seven-step algorithm:
Sort the original data in increasing order and store the ordering index vector.
Compute intermediate points on the sorted series.
Compute lower limit for left tail (xmin) and upper limit for right tail (xmax). This is done by computing the trimmed mean (e.g. with a 10% trimming proportion) of the deviations between consecutive observations.
Compute the mean of the maximum entropy density within each interval in such a way that the mean preserving constraint is satisfied. (Denoted as mt in the reference paper.) The first and last interval means have distinct formulas. See Theil and Laitinen (1980) for details.
Generate random numbers from the [0,1] uniform interval and compute sample quantiles at those points.
Apply to the sample quantiles the correct order to keep the dependence relationships of the observed data.
Repeat the previous steps several times (e.g. 999).
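The seven steps above can be sketched in base R as follows. This is a simplified illustration, not the code used by the package: the function name me_boot_sketch is hypothetical, the deviations are assumed to be absolute differences between consecutive observations, the interval means use the 0.25/0.50/0.25 weights (0.75/0.25 at the ends) as an assumed form of the mean-preserving constraint, and none of the optional adjustments (expand.sd, force.clt, scl.adjustment, sym, setSpearman<1) are applied.

```r
# Simplified sketch of the seven steps; NOT the meboot implementation.
me_boot_sketch <- function(x, reps = 5, trim = 0.10) {
  x <- as.numeric(x)
  n <- length(x)
  stopifnot(n >= 3)

  # Step 1: sort the data and keep the ordering index.
  ord <- order(x)
  xx <- x[ord]

  # Step 2: intermediate points between consecutive order statistics.
  z <- (xx[-1] + xx[-n]) / 2

  # Step 3: tail limits from the trimmed mean of the absolute consecutive
  # deviations (assumed definition of the deviations).
  dvtrim <- mean(abs(diff(x)), trim = trim)
  xmin <- xx[1] - dvtrim
  xmax <- xx[n] + dvtrim
  breaks <- c(xmin, z, xmax)  # n intervals, each carrying probability 1/n

  # Step 4: interval means under the (assumed) mean-preserving weights.
  m <- c(0.75 * xx[1] + 0.25 * xx[2],
         0.25 * xx[1:(n - 2)] + 0.50 * xx[2:(n - 1)] + 0.25 * xx[3:n],
         0.25 * xx[n - 1] + 0.75 * xx[n])

  one_replicate <- function() {
    # Step 5: uniform draws mapped to quantiles of the piecewise-uniform
    # density, then shifted so each interval is centred on its mean m[k].
    u <- runif(n)
    k <- pmin(pmax(ceiling(u * n), 1), n)  # interval index of each draw
    q <- breaks[k] + (u * n - (k - 1)) * (breaks[k + 1] - breaks[k])
    q <- q + m[k] - (breaks[k] + breaks[k + 1]) / 2
    # Step 6: put the sorted quantiles back in the original time order.
    out <- numeric(n)
    out[ord] <- sort(q)
    out
  }

  # Step 7: repeat to build the ensemble, one column per replicate.
  replicate(reps, one_replicate())
}

set.seed(123)
x <- as.numeric(AirPassengers)
ens <- me_boot_sketch(x, reps = 20)
dim(ens)                                 # 144 rows (times), 20 columns (replicates)
range(cor(ens, x, method = "spearman"))  # close to 1, apart from ties in x
```

Because step 6 reuses the ordering index of the original data, every replicate in this sketch has the same rank order as x, which corresponds to the setSpearman=1 case.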
The scale and symmetry adjustments are described in Vinod (2013) referenced below.
In some applications, the ensembles must be non-negative. Setting trim$xmin = 0 ensures non-negative values in the ensembles. It also requires force.clt = FALSE and expand.sd = FALSE; these arguments are set to FALSE if trim$xmin = 0 is defined, and a warning reports that their values were overwritten. Note: the choice of xmin and xmax cannot be arbitrary and should take range(x) of the data into account. If there are observations outside those bounds, the limits set by these arguments may not be met. If the user is concerned only with the trimming proportion, it can be passed simply as trim = 0.1, and the default values for xmin and xmax will be used.
setSpearman<1 is implemented with a grid search near the desired value of the rank correlation coefficient, as suggested by Fred Viole, a Ph.D. student at Fordham University and author of the R package NNS.
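To give a feel for what a grid search toward a target rank correlation can look like, here is a hypothetical base R illustration. It is not the procedure used by mebootSpear: the function name reorder_to_spearman, the mixing weight w, and the idea of blending the original ranks with a random ordering are all assumptions made for this example.

```r
# Hypothetical illustration of a grid search over a mixing weight;
# this is NOT the code used by mebootSpear.
reorder_to_spearman <- function(x, values, target,
                                grid = seq(0, 1, by = 0.01)) {
  stopifnot(length(x) == length(values))
  x_rank <- rank(x, ties.method = "first")
  noise_rank <- rank(runif(length(x)))  # one random ordering, fixed over the grid
  sorted_values <- sort(values)

  best <- NULL
  best_gap <- Inf
  for (w in grid) {
    # Blend the original ranks with the random ranks: w = 1 keeps the
    # ordering of x, w = 0 uses the random ordering.
    score <- w * x_rank + (1 - w) * noise_rank
    candidate <- sorted_values[rank(score, ties.method = "first")]
    gap <- abs(cor(candidate, x, method = "spearman") - target)
    if (gap < best_gap) {
      best_gap <- gap
      best <- candidate
    }
  }
  best  # the reordering whose rank correlation with x is nearest the target
}

set.seed(42)
x <- as.numeric(AirPassengers)
y <- reorder_to_spearman(x, values = x + rnorm(length(x)), target = 0.5)
cor(y, x, method = "spearman")  # should land near the 0.5 target
```

The search only reorders the supplied values, so their marginal distribution is unchanged; what varies across the grid is how closely their time ordering follows that of x.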
Returns
x: original data provided as input.
ensemble: maximum entropy bootstrap replicates.
xx: sorted order stats (xx[1] is minimum value).
z: class intervals limits.
dv: deviations of consecutive data values.
dvtrim: trimmed mean of dv.
xmin: lower limit for the ensemble, xx[1]-dvtrim.
xmax: upper limit for the ensemble, xx[n]+dvtrim.
desintxb: desired interval means.
ordxx: ordered x values.
kappa: scale adjustment to the variance of ME density.
References
Vinod, H.D. (2006), Maximum Entropy Ensembles for Time Series Inference in Economics, Journal of Asian Economics, 17 (6), pp. 955-978.
Vinod, H.D. (2004), Ranking mutual funds using unconventional utility theory and stochastic dominance, Journal of Empirical Finance, 11 (3), pp. 353-377.
Examples
## Ensemble for the AirPassengers time series data
library(meboot)
set.seed(345)
out <- mebootSpear(x = AirPassengers, reps = 100, xmin = 0, setSpearman = 0)
cor(out$rowAvg, AirPassengers, method = "spearman")  # rank correlation should be close to 0