dataFile: .csv format data file name. The first column must be a time index or time values unless noTime is TRUE. The first row must be column names.
dataFrame: input data.frame. The first column must be a time index or time values unless noTime is TRUE. The columns must be named.
lib: a 2-column matrix, data.frame, 2-element vector or string of row index pairs, where each pair specifies the first and last rows of the time series used to create the library.
pred: same format as lib, but specifies the sections of the time series to forecast.
D: multivariate dimension.
E: embedding dimension.
Tp: prediction horizon (number of time column rows).
knn: number of nearest neighbors. If knn=0, knn is set to E+1.
tau: lag of time delay embedding specified as number of time column rows.
columns: string of whitespace-separated column name(s), or vector of column names, used to create the library. If individual column names contain whitespace, place the names in a vector or append ',' to each name.
target: column name used for prediction.
multiview: number of multiview ensembles to average for the final prediction estimate.
exclusionRadius: number of adjacent observation vector rows to exclude as nearest neighbors in prediction.
trainLib: logical to use in-sample (lib=pred) projections for the ranking of column combinations.
excludeTarget: logical to exclude embedded target column from combinations.
parameterList: logical to add list of invoked parameters.
verbose: logical to produce additional console reporting.
numThreads: number of CPU threads to use in multiview processing.
showPlot: logical to plot results.
noTime: logical to allow input data with no time column.
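A minimal call might look like the sketch below. It assumes the rEDM package is installed and uses its bundled block_3sp example data with columns x_t, y_t and z_t; the argument values are illustrative only, and the call is skipped if rEDM is not available.

```r
# Sketch only: assumes rEDM provides Multiview() and the block_3sp data.frame.
if (requireNamespace("rEDM", quietly = TRUE)) {
  data("block_3sp", package = "rEDM")

  result <- rEDM::Multiview(
    dataFrame = block_3sp,
    lib       = "1 100",        # rows 1 to 100 form the library
    pred      = "101 190",      # rows 101 to 190 are forecast
    E         = 2,              # embedding dimension
    columns   = "x_t y_t z_t",  # whitespace-separated column names
    target    = "x_t"           # column to predict
  )

  # Inspect the ranked column combinations and the averaged predictions.
  str(result$View)
  str(result$Predictions)
}
```

The string form of lib and pred is used here; per the argument descriptions, a 2-element vector such as c(1, 100) should be an equivalent way to specify the same row range.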
Returns
Named list with data.frames [[View, Predictions]].
data.frame View columns:
Col_1, ..., Col_D: column index
rho: Pearson correlation
MAE: mean absolute error
RMSE: root mean square error
name_1, ..., name_D: column name
If parameterList = TRUE, a named list "parameters" is added.
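The three skill statistics reported in View can be reproduced from any pair of observed and predicted vectors. The sketch below uses base R only; the observations and predictions vectors are made-up values for illustration, not output from Multiview.

```r
# Hypothetical observed and predicted values, for illustration only.
observations <- c(1.0, 2.0, 3.0, 4.0, 5.0)
predictions  <- c(1.1, 1.9, 3.2, 3.8, 5.3)

# Keep only rows where both values are present.
ok  <- complete.cases(observations, predictions)
err <- predictions[ok] - observations[ok]

rho  <- cor(observations[ok], predictions[ok])  # Pearson correlation
MAE  <- mean(abs(err))                          # mean absolute error
RMSE <- sqrt(mean(err^2))                       # root mean square error

print(c(rho = rho, MAE = MAE, RMSE = RMSE))
```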
References
Ye H., and G. Sugihara, 2016. Information leverage in interconnected ecosystems: Overcoming the curse of dimensionality. Science 353:922-925.
Details
Multiview embedding is a method to identify variables in a multivariate dynamical system that are most likely to contribute to the observed dynamics. It is a multistep algorithm: forecasts are computed for combinations of D columns, the combinations are ranked by forecast skill, predictions are computed with the top-ranked combinations, and these predictions are averaged into the final multiview estimate.
If E>1, all variables are embedded to dimension E. If trainLib is TRUE, initial forecasts and ranking are done in-sample (lib=pred), and predictions using the top-ranked combinations use the specified lib and pred. If trainLib is FALSE, initial forecasts and ranking use the specified lib and pred, and the step of computing predictions of the top combinations is skipped.
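The procedure can be sketched in base R as follows, for the trainLib = TRUE case. This is a simplified illustration and not the package's implementation: embed_columns and knn_forecast are hypothetical helpers, the data are synthetic, the nearest-neighbor forecaster is a stand-in for the package's projection method, and the default of one more neighbor than columns is an assumption.

```r
# Synthetic data: three hypothetical series, for illustration only.
set.seed(1)
n  <- 200
df <- data.frame(x = sin(seq_len(n) / 5),
                 y = cos(seq_len(n) / 7),
                 z = rnorm(n))

# Time-delay embed every column to dimension E with lag tau (in rows).
embed_columns <- function(df, E, tau = 1) {
  keep <- ((E - 1) * tau + 1):nrow(df)     # rows that have a full set of lags
  out  <- list()
  for (name in names(df)) {
    for (k in 0:(E - 1)) {
      out[[paste0(name, "_lag", k * tau)]] <- df[[name]][keep - k * tau]
    }
  }
  as.data.frame(out)
}

# Forecast `target` Tp rows ahead from the columns in `cols`, using a
# distance-weighted average over the knn nearest library rows.
knn_forecast <- function(emb, target, cols, lib, pred, Tp = 1,
                         knn = length(cols) + 1) {
  X    <- as.matrix(emb[, cols, drop = FALSE])
  y    <- emb[[target]]
  lib  <- lib[lib + Tp <= nrow(emb)]
  pred <- pred[pred + Tp <= nrow(emb)]
  yhat <- vapply(pred, function(i) {
    cand <- setdiff(lib, i)                # never use the point itself
    d    <- sqrt(rowSums(sweep(X[cand, , drop = FALSE], 2, X[i, ])^2))
    nn   <- order(d)[seq_len(knn)]
    w    <- exp(-d[nn] / max(d[nn][1], 1e-12))
    sum(w * y[cand[nn] + Tp]) / sum(w)
  }, numeric(1))
  list(obs = y[pred + Tp], pred = yhat)
}

E <- 2; D <- 2; target <- "x_lag0"
emb    <- embed_columns(df, E)
lib    <- 1:100
pred   <- 101:190
combos <- combn(names(emb), D, simplify = FALSE)  # all D-column combinations

# Forecast in-sample (lib = pred) with every combination and rank by rho.
rho <- vapply(combos, function(cols) {
  f <- knn_forecast(emb, target, cols, lib, lib)
  cor(f$obs, f$pred)
}, numeric(1))
ranked <- combos[order(rho, decreasing = TRUE)]

# Predict out-of-sample with the top-ranked combinations and average them.
n_views <- 3                               # plays the role of `multiview`
top <- lapply(ranked[seq_len(n_views)], function(cols)
  knn_forecast(emb, target, cols, lib, pred)$pred)
multiview_pred <- Reduce(`+`, top) / n_views
```

With three series embedded to E = 2 there are six candidate columns, so D = 2 yields choose(6, 2) combinations to rank. The sketch does not implement excludeTarget or exclusionRadius.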