Predicting new observations with a previously fitted BART model
BART is a Bayesian sum-of-trees model.
For a numeric response y, we have y = f(x) + e, where e ~ N(0, sigma^2).
f is the sum of many tree models. The goal is to have very flexible inference for the unknown function f.
In the spirit of ensemble models, each tree is constrained by a prior to be a weak learner so that it contributes only a small amount to the overall fit.
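For reference, the sum-of-trees form can be written out explicitly; the notation below follows the usual BART literature rather than anything defined on this page:

f(x) = g(x; T_1, M_1) + g(x; T_2, M_2) + ... + g(x; T_m, M_m)

where T_j denotes the j-th binary regression tree, M_j its set of terminal-node values, and m the number of trees in the sum.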
Usage
## S3 method for class 'recurbart'
predict(object, newdata, mc.cores=1, openmp=(mc.cores.openmp()>0), ...)
Arguments
object: object returned from previous BART fit with recur.bart or mc.recur.bart.
newdata: Matrix of covariates to predict the distribution of t.
mc.cores: Number of threads to utilize.
openmp: Logical value dictating whether OpenMP is used for parallel processing. Of course, this depends on whether OpenMP is available on your system, which, by default, is verified with mc.cores.openmp.
...: Other arguments which will be passed on to pwbart.
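A minimal sketch of the calling pattern (illustrative only: here post stands for a fit previously returned by recur.bart or mc.recur.bart, and x.test for a covariate matrix laid out like the training matrix; neither is defined at this point on the page):

## 'post' and 'x.test' are placeholders for a previous fit and a test matrix
pred <- predict(post, newdata=x.test, mc.cores=2)    ## threaded if OpenMP is available
pred <- predict(post, newdata=x.test, openmp=FALSE)  ## force serial prediction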
Details
BART is a Bayesian MCMC method. At each MCMC iteration, we produce a draw from the joint posterior (f, sigma) | (x, y) in the numeric y case and just f in the binary y case.
Thus, unlike many other modelling methods in R, we do not produce a single model object from which fits and summaries may be extracted. The output consists of values f*(x) (and sigma* in the numeric case), where * denotes a particular draw. The x is either a row from the training data (x.train) or the test data (x.test).
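Because the output is a collection of posterior draws rather than a single fitted object, summaries are computed by the user, typically column-wise over the draw matrices. A minimal sketch, assuming pred was produced by predict() as in the example below (so that pred$haz.test has one row per MCMC draw and one column per row of newdata):

haz.mean <- apply(pred$haz.test, 2, mean)                           ## posterior mean per row of newdata
haz.ci <- apply(pred$haz.test, 2, quantile, probs=c(0.025, 0.975))  ## pointwise 95% credible intervals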
Returns
Returns an object of type recurbart with predictions corresponding to newdata.
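An illustrative way to inspect what the returned object contains (to be run after the predict() call in the example below; the full component list is not enumerated on this page):

names(pred)          ## components of the recurbart prediction object
dim(pred$haz.test)   ## MCMC draws by rows of newdata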
Examples

## load 20 percent random sample
data(xdm20.train)
data(xdm20.test)
data(ydm20.train)

## test BART with token run to ensure installation works
## with current technology even a token run will violate CRAN policy
## set.seed(99)
## post <- recur.bart(x.train=xdm20.train, y.train=ydm20.train,
##                    nskip=1, ndpost=1, keepevery=1)

## Not run:
set.seed(99)
post <- recur.bart(x.train=xdm20.train, y.train=ydm20.train)
## larger data sets can take some time so, if parallel processing
## is available, submit this statement instead
## post <- mc.recur.bart(x.train=xdm20.train, y.train=ydm20.train,
##                       mc.cores=8, seed=99)

require(rpart)
require(rpart.plot)

dss <- rpart(post$yhat.train.mean~xdm20.train)
rpart.plot(dss)
## for the 20 percent sample, notice that the top splits
## involve cci_pvd and n
## for the full data set, notice that all splits
## involve ca, cci_pud, cci_pvd, ins270 and n
## (except one at the bottom involving a small group)

## compare patients treated with insulin (ins270=1) vs
## not treated with insulin (ins270=0)
N.train <- 50
N.test <- 50
K <- post$K ## 798 unique time points

## only testing set, i.e., remove training set
xdm20.test. <- xdm20.test[N.train*K+(1:(N.test*K)), ]
xdm20.test. <- rbind(xdm20.test., xdm20.test.)
xdm20.test.[ , 'ins270'] <- rep(0:1, each=N.test*K)

## multiple threads will be utilized if available
pred <- predict(post, xdm20.test., mc.cores=8)

## create Friedman's partial dependence function for the
## intensity/hazard by time and ins270
NK.test <- N.test*K
M <- nrow(pred$haz.test) ## number of MCMC samples, typically 1000

RI <- matrix(0, M, K)
for(i in 1:N.test)
    RI <- RI+(pred$haz.test[ , (N.test+i-1)*K+1:K]/
              pred$haz.test[ , (i-1)*K+1:K])/N.test

RI.lo <- apply(RI, 2, quantile, probs=0.025)
RI.mu <- apply(RI, 2, mean)
RI.hi <- apply(RI, 2, quantile, probs=0.975)

plot(post$times, RI.hi, type='l', lty=2, log='y',
     ylim=c(min(RI.lo, 1/RI.hi), max(1/RI.lo, RI.hi)),
     xlab='t', ylab='RI(t, x)',
     sub='insulin(ins270=1) vs. no insulin(ins270=0)',
     main='Relative intensity of hospital admissions for diabetics')
lines(post$times, RI.mu)
lines(post$times, RI.lo, lty=2)
lines(post$times, rep(1, K), col='darkgray')

## RI for insulin therapy seems fairly constant with time
mean(RI.mu)

## End(Not run)
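For reference, the loop above implements Friedman's partial dependence function for the relative intensity at each of the K time points: in the notation of this page, RI(t) = (1/N.test) * sum over i of haz(t, x_i, ins270=1) / haz(t, x_i, ins270=0), i.e., the posterior draws of the hazard ratio averaged over the N.test test patients. RI.lo, RI.mu and RI.hi are then its pointwise 2.5%, mean and 97.5% posterior summaries, plotted against post$times.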