Retire (i.e. remove) data from a dynaTree model
Description
Allows the removal (or "retiring") of X-y pairs from a "dynaTree"-class object to facilitate online learning; retired pairs are absorbed into the leaf prior(s).
Usage
## S3 method for class 'dynaTree'
retire(object, indices, lambda = 1, verb = 0)
Arguments
object: a "dynaTree"-class object built by dynaTree
indices: a vector of positive integers in 1:nrow(object$X) indicating which X-y pairs to retire; must have length(indices) <= nrow(object$X)
lambda: a scalar proportion (forgetting factor) used to downweight the previous prior summary statistics
verb: a nonzero scalar causes information about the retired indices, i.e., their X-y values, to be printed to the screen as they are retired
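As a brief illustration of these arguments, the sketch below fits a small model and retires ten pairs with a forgetting factor. The simulated data, the object name fit, and the values lambda = 0.9 and verb = 1 are illustrative assumptions rather than recommendations, and the code assumes the dynaTree package is installed.

```r
library(dynaTree)

## simulated 1-d regression data (illustrative only)
set.seed(1)
n <- 40
x <- runif(n, -3, 3)
y <- x + x^2 + rnorm(n, 0, 0.2)

## linear leaves require icept = "augmented" for retirement
fit <- dynaTree(x, y, model = "linear", icept = "augmented", verb = 0)

## retire the first ten pairs; lambda < 1 downweights the existing
## prior summary statistics, and verb = 1 prints the retired X-y values
idx <- 1:10
fit <- retire(fit, indices = idx, lambda = 0.9, verb = 1)

## the pairs that were not retired remain in fit$X and fit$y
n_left <- nrow(fit$X)

## free the particle cloud when finished
deletecloud(fit)
```

Passing indices as a vector retires several pairs in one call; in a streaming setting one would more typically retire one pair for each new pair added with update.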
Details
Primarily for use in online learning contexts. After retiring, the predictive distribution remains unchanged, because the sufficient statistics of the removed pairs enter the prior in the leaves of the tree of each particle. Further update.dynaTree calls (adding data) may cause changes to the posterior predictive, as grow moves cannot retain the retired pairs; see the paper listed under References for more details. In many ways, retire.dynaTree is the opposite of update.dynaTree, except that the loss of information upon retiring is not complete.
Drifting regression or classification relationships may be modeled with a forgetting factor lambda < 1.
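To make the role of a forgetting factor concrete, here is a minimal base-R sketch of downweighting accumulated summary statistics before new retired data are absorbed. It is a conceptual illustration only and not dynaTree's internal code: the helper absorb_retired and the choice of count, sum, and sum of squares as the summaries are hypothetical.

```r
## Hypothetical helper: fold retired responses into running summary
## statistics, first downweighting the old summaries by lambda.
absorb_retired <- function(stats, y_retired, lambda = 1) {
  stopifnot(lambda > 0, lambda <= 1)
  list(
    n   = lambda * stats$n   + length(y_retired),
    sy  = lambda * stats$sy  + sum(y_retired),
    syy = lambda * stats$syy + sum(y_retired^2)
  )
}

## start with no accumulated information
empty <- list(n = 0, sy = 0, syy = 0)

## two batches of retired responses at different levels (a drift)
old_batch <- c(1, 1, 1, 1)
new_batch <- c(3, 3)

## with forgetting (lambda = 0.5) the older batch counts for less
forget <- absorb_retired(empty, old_batch, lambda = 0.5)
forget <- absorb_retired(forget, new_batch, lambda = 0.5)

## without forgetting (lambda = 1) every retired response counts equally
keep <- absorb_retired(empty, old_batch, lambda = 1)
keep <- absorb_retired(keep, new_batch, lambda = 1)

## implied means: the forgetting version sits closer to the recent level
mean_forget <- forget$sy / forget$n
mean_keep   <- keep$sy / keep$n
```

With lambda = 1 nothing is forgotten, which matches the default in the usage above; smaller values let older retired information fade so the summaries track a drifting relationship.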
The alcX.dynaTree function provides a good, and computationally efficient, heuristic for choosing which points to retire for regression models, and likewise entropyX.dynaTree for classification models.
Note that classification models (model = "class") are not supported, and implicit intercepts (icept = "implicit") with linear models (model = "linear") are not supported at this time.
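The ALC heuristic mentioned above might be used along the following lines. This is a sketch under assumptions, not a verified recipe: it assumes that alcX returns the object with an added alcX vector holding one value per row of object$X, and that the pair with the smallest value is the least informative and therefore the one to retire. The data and object names are illustrative; demo("online") is the authoritative example.

```r
library(dynaTree)

## simulated 1-d regression data (illustrative only)
set.seed(2)
n <- 40
x <- runif(n, -3, 3)
y <- x + x^2 + rnorm(n, 0, 0.2)

fit <- dynaTree(x, y, model = "linear", icept = "augmented", verb = 0)

## assumption: alcX() adds an alcX vector to the returned object,
## with one ALC value per row of fit$X
fit <- alcX(fit)

## assumption: the smallest ALC value marks the least informative pair
w <- which.min(fit$alcX)
fit <- retire(fit, w)

## one pair fewer remains after retirement
n_left <- nrow(fit$X)

## free the particle cloud when finished
deletecloud(fit)
```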
Note
To use model = "linear" with dynaTree and retirement, one must also specify icept = "augmented", which automatically appends an extra column of ones to the input X design matrix/matrices. The retire function only supports this icept case.
Returns
returns a "dynaTree"-class object with updated attributes
References
Anagnostopoulos, C., Gramacy, R.B. (2013) Information-Theoretic Data Discarding for Dynamic Trees on Data Streams. Entropy, 15(12), 5510-5535; arXiv:1201.5568
Examples
n <- 100
Xp <- runif(n, -3, 3)
XX <- seq(-3, 3, length=200)
Yp <- Xp + Xp^2 + rnorm(n, 0, .2)
rect <- c(-3, 3)
out <- dynaTree(Xp, Yp, model="linear", icept="augmented")

## predict and plot
out <- predict(out, XX)
plot(out, main="parabola data", lwd=2)

## randomly remove half of the data points
out <- retire(out, sample(1:n, n/2, replace=FALSE))

## predict and add to plot -- shouldn't change anything
out <- predict(out, XX)
plot(out, add=TRUE, col=3)
points(out$X[,-1], out$y, col=3)

## now illustrating rejuvenation, which should result
## in a change to the predictive surface
out <- rejuvenate(out)
out <- predict(out, XX)
plot(out, add=TRUE, col=4)
legend("top", c("original", "retired", "rejuvenated"), col=2:4, lty=1)

## clean up
deletecloud(out)

## see demo("online") for an online learning example
## where ALC is used for retirement