Predicting and calculating sequential design and optimization statistics at new design points (i.e., active learning heuristics) for dynamic tree models
## S3 method for class 'dynaTree'
predict(object, XX, yy = NULL, quants = TRUE, ei = FALSE, verb = 0, ...)
## S3 method for class 'dynaTree'
coef(object, XX, verb = 0, ...)
Arguments
object: a "dynaTree"-class object built by dynaTree
XX: a design matrix of predictive locations (where ncol(XX) == ncol(X))
yy: an optional vector of true responses at the XX predictive locations, for which the log posterior probability is to be reported
quants: a scalar logical indicating whether predictive quantiles are desired; calculating them is expensive, so set quants = FALSE when predictions are not used for visualization, e.g., in active learning
ei: a scalar logical indicating if the expected improvement statistic (for optimization) should be calculated and returned
verb: a non-negative scalar integer giving the number of predictive locations (iterations) after which a progress statement is printed to the console; the default, verb = 0, is quiet
...: to comply with the generic predict method -- currently unused
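The ei argument above requests an expected improvement (EI) statistic, but this page does not state the formula used. Purely as a point of reference, the sketch below shows the textbook closed form of EI for minimization under a Gaussian predictive distribution. The function name ei_normal and its arguments (mu, sigma, fmin) are hypothetical, and the Gaussian assumption is ours; the package's own calculation may differ.

```r
# Textbook EI for minimization, ASSUMING a Gaussian predictive distribution.
# ei_normal, mu, sigma and fmin are illustrative names, not part of dynaTree.
ei_normal <- function(mu, sigma, fmin) {
  stopifnot(all(sigma > 0))          # closed form needs a positive std. dev.
  z <- (fmin - mu) / sigma           # standardized improvement
  (fmin - mu) * pnorm(z) + sigma * dnorm(z)
}

mu    <- c(-0.5, 0.0, 1.0)           # made-up predictive means
sigma <- c(0.2, 1.0, 0.5)            # made-up predictive standard deviations
fmin  <- 0                           # best (lowest) response seen so far
ei    <- ei_normal(mu, sigma, fmin)
print(ei)
```

Locations with a low predictive mean or a large predictive spread receive a larger EI, which is why the statistic is used to pick the next design point in optimization.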
Details
predict returns predictive summary statistics by averaging over samples from the posterior predictive distribution obtained from each of the particles in the cloud pointed to by object.
coef returns a matrix of regression coefficients used in the linear model leaves (model = "linear"), averaged over all particles, for each XX location. For other models it prints a warning and defaults to predict.
The value(s) calculated are appended to object; the new fields are described below.
Note that ALC calculations have been moved to the alc.dynaTree function(s).
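A minimal usage sketch of the two methods, assuming the dynaTree package is installed (the block is skipped otherwise). The toy data, the particle count N = 200, and the variable names are made up for illustration; only the predict and coef arguments come from the usage shown on this page.

```r
set.seed(42)
# Toy 1-d regression data (invented for this sketch)
X  <- matrix(seq(0, 10, length.out = 100), ncol = 1)
y  <- sin(X[, 1]) + rnorm(nrow(X), sd = 0.1)
XX <- matrix(seq(0, 10, length.out = 25), ncol = 1)  # ncol(XX) == ncol(X)

fit <- NULL
if (requireNamespace("dynaTree", quietly = TRUE)) {
  # Linear-leaf model so that coef() has coefficients to average
  fit <- dynaTree::dynaTree(X, y, N = 200, model = "linear", verb = 0)
  # Skip the expensive quantiles; request the EI statistic instead
  fit <- predict(fit, XX, quants = FALSE, ei = TRUE)
  # Particle-averaged leaf coefficients at each XX location
  fit <- coef(fit, XX)
  str(fit[c("mean", "var", "ei")])
  print(dim(fit$coef))
}
```

Because both methods return the object with new fields appended, reassigning the result to fit keeps the predictions and the coefficients together in one object.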
Returns
The object returned is of class "dynaTree", which includes a copy of the list elements from the object passed in, with the following (predictive) additions depending on whether object$model is for regression ("constant" or "linear") or classification ("class").
For regression:
mean: a vector containing an estimate of the predictive mean at the XX locations
vmean: a vector containing an estimate of the variance of the predictive mean at the XX locations
var: a vector containing an estimate of the predictive variance (average variance plus variance of mean) at the XX locations
df: a vector containing the average degrees of freedom at the XX locations
q1: a vector containing an estimate of the 5% quantile of the predictive distribution at the XX locations, unless quants = FALSE
q2: a vector containing an estimate of the 95% quantile of the predictive distribution at the XX locations, unless quants = FALSE
yypred: if yy != NULL then this contains the predictive probability of the true yy values at the XX locations
ei: a vector containing an estimate of the EI statistic, unless ei = FALSE
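The var field is described above as the average variance plus the variance of the mean. The sketch below illustrates that decomposition at a single location using simulated per-particle summaries. The per-particle values (mu, s2) and the 1/N normalization are assumptions for illustration; this is not the package's internal code.

```r
set.seed(1)
n_particles <- 500
# Hypothetical per-particle predictive means and variances at one XX location
mu <- rnorm(n_particles, mean = 2, sd = 0.3)
s2 <- rgamma(n_particles, shape = 5, rate = 10)

pred_mean <- mean(mu)                    # analogue of the mean field
vmean     <- mean((mu - pred_mean)^2)    # variance of the mean across particles
avg_var   <- mean(s2)                    # average within-particle variance
pred_var  <- avg_var + vmean             # analogue of the var field

print(c(mean = pred_mean, vmean = vmean, var = pred_var))
```

The second term grows when particles disagree about the mean, so the total predictive variance is never smaller than the average within-particle variance.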
For classification:
p: a nrow(XX)-by-max(object$y) matrix of mean class probabilities for each of max(object$y) classes at the predictive data locations
entropy: a nrow(XX) vector of predictive entropies at the predictive data locations
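To show how the two classification fields relate, the sketch below computes a row-wise entropy from a small hand-made matrix of class probabilities. It assumes the usual Shannon entropy with natural logarithms; the page does not state the base or the exact formula the package uses, and the matrix p here is invented.

```r
# Made-up class probabilities: 3 predictive locations, 3 classes
p <- rbind(c(0.98, 0.01, 0.01),   # confident prediction
           c(1/3,  1/3,  1/3),    # maximally uncertain
           c(0.50, 0.50, 0.00))   # split between two classes

# Shannon entropy per row, treating 0 * log(0) as 0
row_entropy <- function(pr) {
  pr <- pr[pr > 0]
  -sum(pr * log(pr))
}
entropy <- apply(p, 1, row_entropy)
print(entropy)
```

High entropy marks locations where the class prediction is least certain, which is what makes it useful as an active learning heuristic.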
For coef a new XXc field is created so as not to trample on XXs that may have been used in a previous predict, plus
coef: a nrow(XX)-by-m+icept matrix of particle-averaged regression coefficients
References
Taddy, M.A., Gramacy, R.B., and Polson, N. (2011). Dynamic trees for learning and design. Journal of the American Statistical Association, 106(493), pp. 109-123; arXiv:0912.1586