SurvCART function

Survival CART with time to event response via binary partitioning

Recursive partitioning for survival (time-to-event) data per the SurvCART algorithm, based on baseline partitioning variables (Kundu, 2020).

SurvCART(data, patid, timevar, censorvar, gvars, tgvars, time.dist="exponential", cens.dist="NA", event.ind=1, alpha=0.05, minsplit=40, minbucket=20, quantile=0.50, print=FALSE)

Arguments

  • data: name of the dataset. It must contain the variable specified for patid (the subject id), the variables specified for timevar and censorvar, and the baseline partitioning variables.
  • patid: name of the subject id variable.
  • timevar: name of the variable with follow-up times.
  • censorvar: name of the variable with censoring status.
  • gvars: list of partitioning variables of interest. The values of these variables should not change over time. Categorical variables must be numerically coded. For nominal categorical variables or factors, first create the corresponding dummy variable(s) and then pass those through gvars.
  • tgvars: types (categorical or continuous) of the partitioning variables specified in gvars. Specify 1 for each continuous partitioning variable and 0 for each categorical partitioning variable. The length of tgvars must match the length of gvars.
  • time.dist: name of time-to-event distribution. It can be one of the following distributions: "exponential", "weibull", "lognormal" or "normal".
  • cens.dist: name of censoring distribution. It can be one of the following distributions: "exponential", "weibull", "lognormal", "normal" or "NA". If "NA" is specified, the parameter instability test corresponding to the censoring distribution is not performed.
  • event.ind: value of the censoring variable indicating event.
  • alpha: significance level (i.e., nominal type I error) for the parameter instability test.
  • minsplit: the minimum number of observations that must exist in a node in order for a split to be attempted.
  • minbucket: the minimum number of observations in any terminal node.
  • quantile: the quantile to be displayed in the visualization of the tree through plot.SurvCART() or plot().
  • print: if TRUE, a summary (number of subjects at risk, number of events, median event time and median censoring time) will be printed for each node.
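The coding rules for gvars and tgvars can be sketched in base R as follows. This is only an illustration: the data frame toy and the column names age, site, site_B and site_C are hypothetical and not part of the package or its example data.

```r
# Toy data: one continuous covariate and one nominal factor with three levels.
# All names here (toy, age, site, site_B, site_C) are hypothetical.
toy <- data.frame(
  age  = c(34, 51, 47, 62, 29, 58),
  site = factor(c("A", "B", "C", "A", "C", "B"))
)

# model.matrix() expands the factor into 0/1 dummy columns;
# the first column is the intercept, which is dropped here.
dummies <- model.matrix(~ site, data = toy)[, -1, drop = FALSE]
colnames(dummies) <- c("site_B", "site_C")
toy <- cbind(toy, dummies)

# One entry in tgvars per entry in gvars: 1 = continuous, 0 = categorical.
gvars  <- c("age", "site_B", "site_C")
tgvars <- c(1, 0, 0)

# Basic consistency checks before passing these on to a tree-fitting call.
stopifnot(length(gvars) == length(tgvars), all(gvars %in% names(toy)))
```

The vectors gvars and tgvars built this way have matching lengths and refer only to numerically coded columns, which is what the argument descriptions above ask for.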

Details

Construct survival tree based on heterogeneity in time-to-event and censoring distributions.
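As a rough illustration of what heterogeneity in both distributions means, the sketch below simulates right-censored data for two subgroups and estimates an exponential rate separately for the event times and for the censoring times. It is not the package's parameter instability test. The simulated data, the variable group and the helper exp_rate are hypothetical.

```r
set.seed(1)
n <- 200
group <- rep(c(0, 1), each = n / 2)   # hypothetical binary partitioning variable

# Event times differ by group; censoring times share a common rate.
event_time <- rexp(n, rate = ifelse(group == 1, 0.10, 0.05))
cens_time  <- rexp(n, rate = 0.03)
time   <- pmin(event_time, cens_time)
status <- as.numeric(event_time <= cens_time)   # 1 = event, 0 = censored

# Maximum likelihood estimate of an exponential rate under right censoring:
# number of observed occurrences divided by total follow-up time.
exp_rate <- function(time, indicator) sum(indicator) / sum(time)

idx <- seq_len(n)
# Time-to-event distribution: events are the observed occurrences.
event_rate <- tapply(idx, group, function(i) exp_rate(time[i], status[i]))
# Censoring distribution: roles are flipped, censorings are the occurrences.
cens_rate  <- tapply(idx, group, function(i) exp_rate(time[i], 1 - status[i]))

print(rbind(event_rate, cens_rate))
```

A split on group would be motivated here by the difference in estimated event rates, while the censoring rates should look similar across the two groups.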

Exponential distribution: f(t) = lambda * exp(-lambda * t)

Weibull distribution: f(t) = alpha * lambda * t^(alpha-1) * exp(-lambda * t^alpha)

Lognormal distribution: f(t) = (1/t) * (1/sqrt(2*pi*sigma^2)) * exp[-(1/2) * {(log(t)-mu)/sigma}^2]

Normal distribution: f(t) = (1/sqrt(2*pi*sigma^2)) * exp[-(1/2) * {(t-mu)/sigma}^2]
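The parameterizations above can be checked against base R's density functions. The sketch below writes each density by hand, using the standard squared standardized residual in the lognormal and normal exponents, and compares it with dexp, dweibull, dlnorm and dnorm. The Weibull form here uses a rate-type parameter lambda, so it corresponds to R's scale = lambda^(-1/alpha). The function names f_exp, f_weib, f_lnorm and f_norm and the parameter values are illustrative only.

```r
# Densities written in the parameterization used in this section.
f_exp   <- function(t, lambda) lambda * exp(-lambda * t)
f_weib  <- function(t, alpha, lambda) {
  alpha * lambda * t^(alpha - 1) * exp(-lambda * t^alpha)
}
f_lnorm <- function(t, mu, sigma) {
  (1 / t) * (1 / sqrt(2 * pi * sigma^2)) * exp(-0.5 * ((log(t) - mu) / sigma)^2)
}
f_norm  <- function(t, mu, sigma) {
  (1 / sqrt(2 * pi * sigma^2)) * exp(-0.5 * ((t - mu) / sigma)^2)
}

# Arbitrary evaluation points and parameter values for the comparison.
t <- c(0.5, 1, 2.5)
alpha <- 1.7; lambda <- 0.4; mu <- 0.3; sigma <- 0.8

# Compare each hand-written density with the base R equivalent.
ok <- c(
  exponential = isTRUE(all.equal(f_exp(t, lambda), dexp(t, rate = lambda))),
  weibull     = isTRUE(all.equal(
    f_weib(t, alpha, lambda),
    dweibull(t, shape = alpha, scale = lambda^(-1 / alpha))
  )),
  lognormal   = isTRUE(all.equal(f_lnorm(t, mu, sigma),
                                 dlnorm(t, meanlog = mu, sdlog = sigma))),
  normal      = isTRUE(all.equal(f_norm(t, mu, sigma),
                                 dnorm(t, mean = mu, sd = sigma)))
)
print(ok)
```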

Returns

  • Treeout: summary information from the tree fit for each terminal and non-terminal node. Its columns are: ID, the (unique) node number, following a binary ordering indexed by node depth; n, the number of subjects reaching the node; D, the number of events at the node; median.T, the median survival time at the node; median.C, the median censoring time at the node; var, the splitting variable; index, the cut-off value of the splitting variable for binary partitioning; p (Instability), the p-value of the parameter instability test for the splitting variable; loglik, the log-likelihood at the node; AIC, the AIC at the node; improve, the improvement in deviance given by this split; and Terminal, an indicator (True or False) of whether the node is terminal.

  • logLik.tree: log-likelihood of the tree-structured model, based on Cox model including sub-groups as covariates

  • logLik.root: log-likelihood at the root node (i.e., without tree structure), based on Cox model without any covariate

  • AIC.tree: AIC of the tree-structured model, based on Cox model including sub-groups as covariates

  • AIC.root: AIC at the root node (i.e., without tree structure), based on Cox model without any covariate

  • nodelab: List of subgroups or terminal nodes with their description

  • varnam: List of splitting variables

  • ds: the dataset originally supplied

  • event.ind: value of the censoring variable indicating event.

  • timevar: name of the variable with follow-up times

  • censorvar: name of the variable with censoring status

  • frame: rpart compatible object

  • splits: rpart compatible object

  • cptable: rpart compatible object

  • functions: rpart compatible object
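The tree-level and root-level quantities described above (logLik.tree, logLik.root, AIC.tree, AIC.root) are stated to be based on Cox models with and without the sub-groups as covariates. A comparison of that kind can be sketched with the survival package as follows. Whether SurvCART computes these values in exactly this way is an assumption. The simulated data and the node labels node2 and node3 are hypothetical.

```r
library(survival)

set.seed(42)
n <- 300
# Hypothetical terminal-node membership for each subject.
node <- factor(sample(c("node2", "node3"), n, replace = TRUE))

# Simulated right-censored data with a higher event rate in node3.
event_time <- rexp(n, rate = ifelse(node == "node3", 0.08, 0.04))
cens_time  <- rexp(n, rate = 0.03)
d <- data.frame(
  time   = pmin(event_time, cens_time),
  status = as.numeric(event_time <= cens_time),
  node   = node
)

# Cox model with the sub-groups as covariates.
fit <- coxph(Surv(time, status) ~ node, data = d)

# coxph stores two partial log-likelihoods:
# [1] at the initial values (no covariate effect), [2] at the fitted coefficients.
loglik_root <- fit$loglik[1]
loglik_tree <- fit$loglik[2]

# AIC = -2 * log-likelihood + 2 * number of estimated coefficients.
n_par    <- length(coef(fit))
aic_root <- -2 * loglik_root
aic_tree <- -2 * loglik_tree + 2 * n_par

print(c(logLik.root = loglik_root, logLik.tree = loglik_tree,
        AIC.root = aic_root, AIC.tree = aic_tree))
```

A lower AIC for the model with sub-groups than for the root model suggests that the sub-grouping explains some of the variation in survival.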

Author(s)

Madan Gopal Kundu madan_g.kundu@yahoo.com

References

Kundu, M. G., and Ghosh, S. (2021). Survival trees based on heterogeneity in time-to-event and censoring distributions using parameter instability test. Statistical Analysis and Data Mining: The ASA Data Science Journal, 14(5), 466-483.

See Also

plot, KMPlot, text, StabCat.surv, StabCont.surv

Examples

# SurvCART(), KMPlot() and the GBSG2 data are assumed to come from the LongCART package
library(LongCART)

#--- Get the data
data(GBSG2)

# Numeric coding of categorical (factor) variables
GBSG2$horTh1 <- as.numeric(GBSG2$horTh)
GBSG2$tgrade1 <- as.numeric(GBSG2$tgrade)
GBSG2$menostat1 <- as.numeric(GBSG2$menostat)

# Add subject id
GBSG2$subjid <- 1:nrow(GBSG2)

#--- Run SurvCART() with time-to-event distribution: exponential, censoring distribution: none
out <- SurvCART(data=GBSG2, patid="subjid", censorvar="cens", timevar="time",
                gvars=c('horTh1', 'age', 'menostat1', 'tsize', 'tgrade1', 'pnodes', 'progrec', 'estrec'),
                tgvars=c(0,1,0,1,0,1,1,1),
                event.ind=1, alpha=0.05, minsplit=80, minbucket=40, print=TRUE)

#--- Plot tree
par(xpd = TRUE)
plot(out, compress = TRUE)
text(out, use.n = TRUE)

# Plot KM curves for the sub-groups identified by the tree
KMPlot(out, xscale=365.25, type=1)
KMPlot(out, xscale=365.25, type=2, overlay=FALSE, mfrow=c(2,2), xlab="Year", ylab="Survival prob.")

#--- Run SurvCART() with time-to-event distribution: weibull, censoring distribution: none
out2 <- SurvCART(data=GBSG2, patid="subjid", censorvar="cens", timevar="time",
                 gvars=c('horTh1', 'age', 'menostat1', 'tsize', 'tgrade1', 'pnodes', 'progrec', 'estrec'),
                 tgvars=c(0,1,0,1,0,1,1,1),
                 time.dist="weibull", event.ind=1, alpha=0.05, minsplit=80, minbucket=40, print=TRUE)

#--- Run SurvCART() with time-to-event distribution: weibull, censoring distribution: exponential
out3 <- SurvCART(data=GBSG2, patid="subjid", censorvar="cens", timevar="time",
                 gvars=c('horTh1', 'age', 'menostat1', 'tsize', 'tgrade1', 'pnodes', 'progrec', 'estrec'),
                 tgvars=c(0,1,0,1,0,1,1,1),
                 time.dist="weibull", cens.dist="exponential", event.ind=1, alpha=0.05,
                 minsplit=80, minbucket=40, print=TRUE)

  • Maintainer: Madan G Kundu
  • License: GPL (>= 2)
  • Last published: 2022-05-17