SurvCART() R function from [LongCART]

Survival CART with time to event response via binary partitioning

Recursive partitioning for survival data per the SurvCART algorithm, based on baseline partitioning variables (Kundu, 2020).


SurvCART(data, patid, timevar, censorvar, gvars, tgvars, 
         time.dist="exponential", cens.dist="NA", event.ind=1, 
         alpha=0.05, minsplit=40, minbucket=20, quantile=0.50, print=FALSE)

Arguments

data: name of the dataset. It must contain the variable specified for patid (indicating subject id), the variables specified for timevar and censorvar, and the baseline partitioning variables.
patid: name of the subject id variable.
timevar: name of the variable with follow-up times.
censorvar: name of the variable with censoring status.
gvars: list of partitioning variables of interest. The values of these variables should not change over time. Categorical variables must be numerically coded. For nominal categorical variables or factors, first create the corresponding dummy variable(s) and then pass them through gvars.
tgvars: types (categorical or continuous) of the partitioning variables specified in gvars. Specify 1 for each continuous partitioning variable and 0 for each categorical partitioning variable. The length of tgvars must match the length of gvars.
time.dist: name of time-to-event distribution. It can be one of the following distributions: "exponential", "weibull", "lognormal" or "normal".
cens.dist: name of the censoring distribution. It can be one of the following: "exponential", "weibull", "lognormal", "normal" or "NA". If "NA" is specified, the parameter instability test for the censoring distribution is not performed.
event.ind: value of the censoring variable indicating event.
alpha: alpha (i.e., nominal type I error) level for the parameter instability test.
minsplit: the minimum number of observations that must exist in a node in order for a split to be attempted.
minbucket: the minimum number of observations in any terminal node.
quantile: the quantile to be displayed when the tree is visualized through plot.SurvCART() or plot().
print: if TRUE, a summary (number of subjects at risk, number of events, median event time and median censoring time) is printed for each node.
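
The coding rules for gvars and tgvars can be sketched in base R as follows. The toy data frame d and its column names are made up for illustration; only the convention (1 = continuous, 0 = categorical) and the advice to dummy-code nominal variables come from the argument descriptions above.

```r
# Toy baseline data (hypothetical; not part of the package)
d <- data.frame(
  subjid = 1:6,
  age    = c(45, 52, 61, 38, 70, 55),
  site   = factor(c("A", "B", "C", "A", "B", "C"))  # nominal, 3 levels
)

# Expand the nominal factor into 0/1 dummy columns (intercept column dropped)
dummies <- model.matrix(~ site, data = d)[, -1, drop = FALSE]
d <- cbind(d, dummies)

# Partitioning variables and their types: 1 = continuous, 0 = categorical
gvars  <- c("age", colnames(dummies))
tgvars <- c(1, rep(0, ncol(dummies)))

# The two vectors must have the same length
stopifnot(length(gvars) == length(tgvars))
```

The resulting gvars and tgvars vectors could then be passed to SurvCART() together with the augmented data frame.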

Details

Construct survival tree based on heterogeneity in time-to-event and censoring distributions.

Exponential distribution: f(t) = lambda*exp(-lambda*t)

Weibull distribution: f(t) = alpha*lambda*t^(alpha-1)*exp(-lambda*t^alpha)

Lognormal distribution: f(t) = (1/t)*(1/sqrt(2*pi*sigma^2))*exp[-(1/2)*{(log(t)-mu)/sigma}^2]

Normal distribution: f(t) = (1/sqrt(2*pi*sigma^2))*exp[-(1/2)*{(t-mu)/sigma}^2]
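
As a sanity check on the parameterisations, the sketch below writes the four densities in their standard forms and compares them with base R's dexp(), dweibull(), dlnorm() and dnorm(). The function names f_exp, f_weib, f_lnorm and f_norm and the parameter values are illustrative assumptions, not part of LongCART. Note that dweibull() uses a shape/scale parameterisation, so the Weibull rate lambda used here corresponds to scale = lambda^(-1/alpha).

```r
# Densities in the (alpha, lambda) and (mu, sigma) parameterisations
f_exp   <- function(t, lambda) lambda * exp(-lambda * t)
f_weib  <- function(t, alpha, lambda) {
  alpha * lambda * t^(alpha - 1) * exp(-lambda * t^alpha)
}
f_lnorm <- function(t, mu, sigma) {
  (1 / t) * (1 / sqrt(2 * pi * sigma^2)) * exp(-0.5 * ((log(t) - mu) / sigma)^2)
}
f_norm  <- function(t, mu, sigma) {
  (1 / sqrt(2 * pi * sigma^2)) * exp(-0.5 * ((t - mu) / sigma)^2)
}

# Arbitrary evaluation points and parameter values (illustrative only)
t <- c(0.5, 1, 2, 5)
alpha <- 1.5; lambda <- 0.3; mu <- 0.2; sigma <- 0.8

# Compare with the base R densities
checks <- c(
  exponential = isTRUE(all.equal(f_exp(t, lambda), dexp(t, rate = lambda))),
  weibull     = isTRUE(all.equal(f_weib(t, alpha, lambda),
                                 dweibull(t, shape = alpha,
                                          scale = lambda^(-1 / alpha)))),
  lognormal   = isTRUE(all.equal(f_lnorm(t, mu, sigma),
                                 dlnorm(t, meanlog = mu, sdlog = sigma))),
  normal      = isTRUE(all.equal(f_norm(t, mu, sigma),
                                 dnorm(t, mean = mu, sd = sigma)))
)
print(checks)
```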

Returns

Treeout: summary information from tree fitting for each terminal and non-terminal node. Columns of Treeout include: ID, the (unique) node number, following a binary ordering indexed by node depth; n, the number of subjects reaching the node; D, the number of events at the node; median.T, the median survival time at the node; median.C, the median censoring time at the node; var, the splitting variable; index, the cut-off value of the splitting variable for binary partitioning; p (Instability), the p-value of the parameter instability test for the splitting variable; loglik, the log-likelihood at the node; AIC, the AIC at the node; improve, the improvement in deviance given by this split; and Terminal, an indicator (True or False) of whether the node is terminal.
logLik.tree: log-likelihood of the tree-structured model, based on Cox model including sub-groups as covariates
logLik.root: log-likelihood at the root node (i.e., without tree structure), based on Cox model without any covariate
AIC.tree: AIC of the tree-structured model, based on Cox model including sub-groups as covariates
AIC.root: AIC at the root node (i.e., without tree structure), based on Cox model without any covariate
nodelab: List of subgroups or terminal nodes with their description
varnam: List of splitting variables
ds: the dataset originally supplied
event.ind: value of the censoring variable indicating event.
timevar: name of the variable with follow-up times
censorvar: name of the variable with censoring status
frame: rpart compatible object
splits: rpart compatible object
cptable: rpart compatible object
functions: rpart compatible object
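
One simple use of the returned values is to compare the tree-structured model with the root-only model. The helper below is a minimal sketch, not part of LongCART: compare_to_root is a hypothetical name, and it only assumes that its argument is a list carrying the AIC.tree, AIC.root, logLik.tree and logLik.root elements described above. The mock_fit object uses made-up numbers purely to show the call; with a real fit one would pass the object returned by SurvCART().

```r
# Summarise how much the tree improves on the root-only model.
# `fit` is assumed to be a list with elements AIC.tree, AIC.root,
# logLik.tree and logLik.root (names taken from the list above).
compare_to_root <- function(fit) {
  c(delta_AIC    = fit$AIC.root - fit$AIC.tree,        # > 0 favours the tree
    delta_logLik = fit$logLik.tree - fit$logLik.root)  # gain in log-likelihood
}

# Stand-in object with made-up numbers, used only to demonstrate the call
mock_fit <- list(AIC.tree = 980, AIC.root = 1000,
                 logLik.tree = -486, logLik.root = -500)
res <- compare_to_root(mock_fit)
print(res)
```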

Author(s)

Madan Gopal Kundu madan_g.kundu@yahoo.com

References

Kundu, M. G., and Ghosh, S. (2021). Survival trees based on heterogeneity in time-to-event and censoring distributions using parameter instability test. Statistical Analysis and Data Mining: The ASA Data Science Journal, 14(5), 466-483.

Examples


#--- Load the package and get the data
library(LongCART)
data(GBSG2)

#--- Numeric coding of categorical variables
GBSG2$horTh1<- as.numeric(GBSG2$horTh)
GBSG2$tgrade1<- as.numeric(GBSG2$tgrade)
GBSG2$menostat1<- as.numeric(GBSG2$menostat)

#--- Add subject id
GBSG2$subjid<- 1:nrow(GBSG2)

#--- Run SurvCART() with time-to-event distribution: exponential, censoring distribution: None  
out<- SurvCART(data=GBSG2, patid="subjid", censorvar="cens", timevar="time", 
        gvars=c('horTh1', 'age', 'menostat1', 'tsize', 'tgrade1', 'pnodes', 'progrec', 'estrec'),  
        tgvars=c(0,1,0,1,0,1, 1,1),          
        event.ind=1,  alpha=0.05, minsplit=80, minbucket=40, print=TRUE)

#--- Plot tree
par(xpd = TRUE)
plot(out, compress = TRUE)
text(out, use.n = TRUE)

#--- Kaplan-Meier plots for the sub-groups identified by the tree
KMPlot(out, xscale=365.25, type=1)
KMPlot(out, xscale=365.25, type=2, overlay=FALSE, mfrow=c(2,2), xlab="Year", ylab="Survival prob.")

#--- Run SurvCART() with time-to-event distribution: weibull, censoring distribution: None
out2<- SurvCART(data=GBSG2, patid="subjid", censorvar="cens", timevar="time",  
        gvars=c('horTh1', 'age', 'menostat1', 'tsize', 'tgrade1', 'pnodes', 'progrec', 'estrec'),  
        tgvars=c(0,1,0,1,0,1, 1,1),          
        time.dist="weibull", event.ind=1, alpha=0.05, minsplit=80, minbucket=40, print=TRUE)

#--- Run SurvCART() with time-to-event distribution: weibull, censoring distribution: exponential
out<- SurvCART(data=GBSG2, patid="subjid", censorvar="cens", timevar="time",  
        gvars=c('horTh1', 'age', 'menostat1', 'tsize', 'tgrade1', 'pnodes', 'progrec', 'estrec'),  
        tgvars=c(0,1,0,1,0,1, 1,1),          
        time.dist="weibull", cens.dist="exponential", event.ind=1, 
        alpha=0.05, minsplit=80, minbucket=40, print=TRUE)

LongCART package

Maintainer: Madan G Kundu
License: GPL (>= 2)
Last published: 2022-05-17