Given cumulative transition hazards, sample paths through the multi-state model.
mssample(Haz, trans, history = list(state = 1, time = 0, tstate = NULL), beta.state = NULL, clock = c("forward", "reset"), output = c("state", "path", "data"), tvec, cens = NULL, M = 10, do.trace = NULL)
Arguments
Haz: Cumulative hazards to be sampled from. These should be given as a data frame with columns time, Haz, trans, for instance as the Haz list element given by msfit.
trans: Transition matrix describing the multi-state model. See trans in msprep for more detailed information.
history: A list with elements state, specifying the starting state(s), time, the starting time(s), and tstate, a numeric vector of length the number of states, specifying at what times states have been visited, if appropriate. The default of tstate is NULL; more information can be found under Details.
The elements state and time may either be scalars or vectors, in which case different sampled paths may start from different states or at different times. By default, all sampled paths start from state 1 at time 0.
beta.state: A matrix of dimension (number of states) x (number of transitions) specifying estimated effects of the times at which earlier states were reached on subsequent transitions. If such effects are not in the model, the value NULL (default) suffices; more information can be found under Details.
clock: Character argument, either "forward" (default) or "reset", specifying whether the time-scale of the cumulative hazards is forward time ("forward") or duration in the present state ("reset").
output: One of "state", "path", or "data", specifying whether states, paths, or data should be output.
tvec: A numeric vector of time points at which the states or paths should be evaluated. Ignored if output="data".
cens: An independent censoring distribution, given as a data frame with columns time and Haz.
M: The number of sampled trajectories through the multi-state model. The default is 10, since the procedure can become quite time-consuming.
do.trace: An integer, specifying that the replication number should be written to the console every do.trace replications. The default is NULL, in which case no output is written to the console during the simulation.
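The description of Haz above says only that it should be a data frame with columns time, Haz and trans, with the output of msfit given as one possible source. As a minimal sketch of that layout, the following builds such a data frame by hand for an illness-death model. The constant rates, the time grid and the object names rates, tt and res are hypothetical choices made for illustration only, and the call to mssample is skipped if the mstate package is not installed.

```r
# Hypothetical constant transition rates for an illness-death model
# (transition 1: healthy -> ill, 2: healthy -> dead, 3: ill -> dead).
rates <- c(0.10, 0.05, 0.20)
tt <- seq(0, 10, by = 0.5)

# The cumulative hazard of a constant rate r is r * t. Stack the three
# transitions in long format with the columns time, Haz and trans.
Haz <- data.frame(
  time  = rep(tt, times = length(rates)),
  Haz   = as.vector(outer(tt, rates)),
  trans = rep(seq_along(rates), each = length(tt))
)

# Run the sampler only if the mstate package is available.
if (requireNamespace("mstate", quietly = TRUE)) {
  tmat <- mstate::trans.illdeath()
  set.seed(1234)
  res <- mstate::mssample(Haz = Haz, trans = tmat, tvec = tt, M = 100)
  print(head(res))
}
```

With the default output="state", res should hold one row per element of tvec; a larger M gives smoother estimates at the cost of run time.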
Returns
M simulated paths through the multi-state model given by trans and Haz. It is either a data frame with columns time, pstate1, ..., pstateS for S states when output="state", or with columns time, ppath1,..., ppathP for the P paths specified in paths(trans) when output="path". When output="data", the sampled paths are stored in an "msdata" object, a data frame in long format such as that obtained by msprep. This may be useful for (semi-)parametric bootstrap procedures, in which case cens may be used as censoring distribution (assumed to be independent of all transition times and independent of any covariates).
Details
The procedure is described in detail in Fiocco, Putter & van Houwelingen (2008). The argument beta.state and the element tstate from the argument history are meant to incorporate situations where the time at which some previous states were visited may affect future transition rates. The relation between time of visit of state s and transition k is assumed to be linear on the log-hazards; the corresponding regression coefficient is to be supplied as the (s,k)-element of beta.state, which is 0 if no such effect has been included in the model. If no such effects are present, then beta.state=NULL (default) suffices.
In the tstate element of history, the s-th element is the time at which state s was visited. This is only relevant for states which have been visited prior to the beginning of sampling, i.e. before the time element of history; the elements of tstate are internally updated when new states are visited in the sampling process (only if beta.state is not NULL, to avoid unnecessary computations).
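The log-linear relation described above can be written out as a formula. The notation here is introduced for illustration and is not taken from the original documentation: \alpha_k^0(t) is the hazard of transition k without any visit-time effects, t_s is the time at which state s was visited, \beta_{sk} is the (s,k)-element of beta.state, and \mathcal{V} is the set of states visited so far. The sketch also assumes that the effects of several visited states add up on the log scale.

```latex
\log \alpha_k(t \mid \text{history})
  = \log \alpha_k^{0}(t) + \sum_{s \in \mathcal{V}} \beta_{sk}\, t_s
```

Setting every \beta_{sk} to 0 recovers \alpha_k^0(t), which corresponds to the beta.state=NULL case.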
Examples
# packages providing coxph/Surv and the multi-state functions
library(survival)
library(mstate)
# transition matrix for illness-death model
tmat <- trans.illdeath()
# data in wide format, for transition 1 this is dataset E1 of
# Therneau & Grambsch (T&G)
tg <- data.frame(illt=c(1,1,6,6,8,9),ills=c(1,0,1,1,0,1),
        dt=c(5,1,9,7,8,12),ds=c(1,1,1,1,1,1),
        x1=c(1,1,1,0,0,0),x2=c(6:1))
# data in long format using msprep
tglong <- msprep(time=c(NA,"illt","dt"),status=c(NA,"ills","ds"),
        data=tg,keep=c("x1","x2"),trans=tmat)
# expanded covariates
tglong <- expand.covs(tglong,c("x1","x2"))
# Cox model with different covariate
cx <- coxph(Surv(Tstart,Tstop,status)~x1.1+x2.2+strata(trans),
        data=tglong,method="breslow")
# new data, to check whether results are the same for transition 1 as T&G
newdata <- data.frame(trans=1:3,x1.1=c(0,0,0),x2.2=c(0,1,0),strata=1:3)
fit <- msfit(cx,newdata,trans=tmat)
tv <- unique(fit$Haz$time)
# mssample
set.seed(1234)
mssample(Haz=fit$Haz,trans=tmat,tvec=tv,M=100)
set.seed(1234)
paths(tmat)
mssample(Haz=fit$Haz,trans=tmat,tvec=tv,M=100,output="path")
set.seed(1234)
mssample(Haz=fit$Haz,trans=tmat,tvec=tv,M=100,output="data",do.trace=25)
References
Fiocco M, Putter H, van Houwelingen HC (2008). Reduced-rank proportional hazards regression and simulation-based prediction for multi-state models. Statistics in Medicine 27, 4340-4358.