Fit Bayesian Cox Model for Interval Censored Survival Data
Fit Bayesian Cox Model for Interval Censored Survival Data
Fit Bayesian Cox model with time-independent, time-varying or dynamic covariate coefficient. The fit is done within a Gibbs sampling framework. The reversible jump algorithm is employed for the dynamic coefficient model. The baseline hazards are allowed to be either time-varying or dynamic.
bayesCox( formula, data, grid =NULL, out =NULL, model = c("TimeIndep","TimeVarying","Dynamic"), base.prior = list(), coef.prior = list(), gibbs = list(), control = list(),...)
Arguments
formula: A formula object, with the response on the left of a '~' operator, and the terms on the right. The response must be a survival object as returned by the function Surv with type = "interval2". help(Surv) for details.
data: A data.frame in which to interpret the variables named in the formula.
grid: Vector of pre-specified time grid points for model fitting. It will be automatically set up from data if it is left unspecified in the function call. By default, it consists of all the unique finite endpoints (rounded to two significant numbers) of the censoring intervals after time zero. The grid specified in the function call determines the location of possible jumps. It should be sorted increasingly and cover all the finite non-zero endpoints of the censoring intervals. Inappropriate grid specified will be taken care by the function internally.
out: An optional character string specifying the name of Markov chain Monte Carlo (MCMC) samples output file. By default, the MCMC samples will be output to a temporary directory set by tempdir and saved in the returned bayesCox object after burning and thinning. If out is specified, the MCMC samples will be preserved in the specified text file.
model: Model type to fit. Available options are "TimeIndep", "TimeVarying", and "Dynamic". Partial matching on the name is allowed.
base.prior: List of options for prior of baseline lambda. Use list(type = "Gamma", shape = 0.1, rate = 0.1) for all models; list(type = "Const", value = 1) for Dynamic model when intercept = TRUE.
coef.prior: List of options for prior of coefficient beta. Use list(type = "Normal", mean = 0, sd = 1) for TimeIndep
...: Other arguments that are for futher extension.
Returns
An object of S3 class bayesCox representing the fit.
Details
To use default hyper parameters in the specification of either base.prior or coef.prior, one only has to supply the name of the prior, e.g., list(type = "Gamma"), list(type = "HAR1").
The gibbs argument is a list of components:
iter:: Number of iterations, default 3000;
burn:: Number of burning, default 500;
thin:: Number of thinning, default 1;
verbose:: A logical value, default TRUE. If TRUE, print the iteration;
nReport:: Print frequency, default 100.
The control argument is a list of components:
intercept:: A logical value, default FALSE. If TRUE, the model will estimate the intercept, which is the log of baseline hazards. If TRUE, please remember to turn off the direct estimation of baseline hazards, i.e., base.prior = list(type = "Const")
a0:: Multiplier for initial variance in time-varying or dynamic models, default 100;
eps0:: Size of auxiliary uniform latent variable in dynamic model, default 1.
For users interested in extracting MCMC sampling information from the
output files, the detail of the output files is presented as follows: Let
k denote the number of time points (excluding time zero) specified
in grid, ck equal 1 for model with time-invariant coefficients;
ck equal k otherwise, and p denote the number of
covariates. Then the each sample saved in each row consists of the
following possible parts.
Part 1:: The first k numbers represent the jump size of baseline hazard function at each time grid. If we take the column mean of the first k columns of the output file, we will get the same numbers with obj$est$lambda, where obj is the bayesCox
object returned by the function.
Part 2:: The sequence from (k+1)to(k+ck∗p)
represent the coefficients of covariates at the time grid. The first $k$ numbers in the sequence are the coefficients for the first covariate at the time grid; The second $k$ numbers' sub-sequence are the coefficients for the second covariate and so on.
Part 3:: The sequence from (k+ck∗p+1) to (k+ck∗p+p) represents the sampled latent variance of coefficients.
Part 4:: The sequence from (k+ck∗p+p+1) to (k+2∗ck∗p+p) represents the indicator of whether there is a jump of the covariate coefficients at the time grid. Similar with Part 2, the first k numbers' sub-sequence is for the first covariate, the second k numbers' sub-sequence is for the second covariate, and so on.
For the model with time-independent coefficients, the output file only
has Part 1 and Part 2 in each row; For time-varying coefficient model,
the output file has Part 1, 2, and 3; The output file for the dynamic
model has all the four parts. Note that the dynamic baseline hazard will
be taken as one covariate. So p needs being replaced with
(p+1) for model with dynamic baseline hazard rate.
No function in the package actually needs the Part 1 from the output file
now; The Part 2 is used by function coef and survCurve;
The Part 3 is needed by function nu; Function jump extracts
the Part 4.
Examples
## Not run:library(dynsurv)set.seed(1216)## breast cancer datadata(bcos)mydata <- bcos
myformula <- Surv(left, right, type ="interval2")~ trt
### Fit time-independent coefficient modelfit0 <- bayesCox(myformula, mydata, model ="TimeIndep", base.prior = list(type ="Gamma", shape =0.1, rate =0.1), coef.prior = list(type ="Normal", mean =0, sd =1), gibbs = list(iter =100, burn =20, thin =1, verbose =FALSE))## plot coefficient estimatesplotCoef(coef(fit0, level =0.9))## Plot the estimated survival function for given new datanewDat <- data.frame(trt = c("Rad","RadChem"))row.names(newDat)<- unique(newDat$trt)plotSurv(survCurve(fit0, newDat), conf.int =TRUE)### Fit time-varying coefficient modelfit1 <- bayesCox(myformula, mydata, model ="TimeVary", base.prior = list(type ="Gamma", shape =0.1, rate =0.1), coef.prior = list(type ="AR1", sd =1), gibbs = list(iter =100, burn =20, thin =1, verbose =TRUE, nReport =20))plotCoef(coef(fit1))plotSurv(survCurve(fit1), conf.int =TRUE)### Fit dynamic coefficient model with time-varying baseline hazardsfit2 <- bayesCox(myformula, mydata, model ="Dynamic", base.prior = list(type ="Gamma", shape =0.1, rate =0.1), coef.prior = list(type ="HAR1", shape =2, scale =1), gibbs = list(iter =100, burn =20, thin =1, verbose =TRUE, nReport =20))plotCoef(coef(fit2))plotJumpTrace(jump(fit2))plotJumpHist(jump(fit2))plotNu(nu(fit2))plotSurv(survCurve(fit2), conf.int =TRUE)## Plot the coefficient estimates from three models togetherplotCoef(rbind(coef(fit0), coef(fit1), coef(fit2)))### Fit dynamic coefficient model with dynamic hazards (in log scales)fit3 <- bayesCox(myformula, mydata, model ="Dynamic", base.prior = list(type ="Const"), coef.prior = list(type ="HAR1", shape =2, scale =1), gibbs = list(iter =100, burn =20, thin =1, verbose =TRUE, nReport =20), control = list(intercept =TRUE))plotCoef(coef(fit3))plotJumpTrace(jump(fit3))plotJumpHist(jump(fit3))plotNu(nu(fit3))plotSurv(survCurve(fit3), conf.int =TRUE)## Plot the estimated survival function and the differenceplotSurv(survCurve(fit3, newdata = newDat, type ="survival"), legendName ="Treatment", conf.int =TRUE)plotSurv(survDiff(fit3, newdata = newDat, type ="survival"), legendName ="Treatment", conf.int =TRUE, smooth =TRUE)## extract MCMC samplesmcmc_list <- bayesCoxMcmc(fit3, part ="coef")posterior_coef <- mcmc_list$coef
## posterior probabilities of hazard ratio of RadChem (vs. Rad)## greater than 1 at time 10posterior_coef[covariate =="trtRadChem"& time ==10, mean(exp(coef)>1)]## End(Not run)
References
X. Wang, M.-H. Chen, and J. Yan (2013). Bayesian dynamic regression models for interval censored survival data with application to children dental health. Lifetime data analysis, 19(3), 297--316.
X. Wang, X. Sinha, J. Yan, and M.-H. Chen (2014). Bayesian inference of interval-censored survival data. In: D. Chen, J. Sun, and K. Peace, Interval-censored time-to-event data: Methods and applications, 167--195.
X. Wang, M.-H. Chen, and J. Yan (2011). Bayesian dynamic regression models for interval censored survival data. Technical Report 13, Department of Statistics, University of Connecticut.
D. Sinha, M.-H. Chen, and S.K. Ghosh (1999). Bayesian analysis and model selection for interval-censored survival data. Biometrics 55(2), 585--590.
See Also
coef.bayesCox, jump.bayesCox, nu.bayesCox, plotCoef, plotJumpTrace, plotNu, survCurve, survDiff, and plotSurv.