tsegest function

The Two-Stage Estimation (TSE) Method Using g-estimation for Treatment Switching

Obtains the causal parameter estimate of the logistic regression switching model and the hazard ratio estimate of the Cox model to adjust for treatment switching.

tsegest(
  data, id = "id", stratum = "", tstart = "tstart", tstop = "tstop",
  event = "event", treat = "treat", censor_time = "censor_time",
  pd = "pd", pd_time = "pd_time", swtrt = "swtrt", swtrt_time = "swtrt_time",
  swtrt_time_upper = "", base_cov = "", conf_cov = "",
  low_psi = -3, hi_psi = 3, n_eval_z = 100,
  strata_main_effect_only = TRUE, firth = FALSE, flic = FALSE,
  recensor = TRUE, admin_recensor_only = TRUE, swtrt_control_only = TRUE,
  alpha = 0.05, ties = "efron", tol = 1e-06, offset = 1,
  boot = TRUE, n_boot = 1000, seed = NA
)

Arguments

  • data: The input data frame that contains the following variables:

    • id: The id to identify observations belonging to the same subject for counting process data with time-dependent covariates.
    • stratum: The stratum.
    • tstart: The starting time of the time interval for counting-process data with time-dependent covariates.
    • tstop: The stopping time of the time interval for counting-process data with time-dependent covariates.
    • event: The event indicator, 1=event, 0=no event.
    • treat: The randomized treatment indicator, 1=treatment, 0=control.
    • censor_time: The administrative censoring time. It should be provided for all subjects including those who had events.
    • pd: The disease progression indicator, 1=PD, 0=no PD.
    • pd_time: The time from randomization to PD.
    • swtrt: The treatment switch indicator, 1=switch, 0=no switch.
    • swtrt_time: The time from randomization to treatment switch.
    • swtrt_time_upper: The upper bound of treatment switching time.
    • base_cov: The baseline covariates (excluding treat).
    • conf_cov: The confounding variables for predicting treatment switching (excluding treat).
  • id: The name of the id variable in the input data.

  • stratum: The name(s) of the stratum variable(s) in the input data.

  • tstart: The name of the tstart variable in the input data.

  • tstop: The name of the tstop variable in the input data.

  • event: The name of the event variable in the input data.

  • treat: The name of the treatment variable in the input data.

  • censor_time: The name of the censor_time variable in the input data.

  • pd: The name of the pd variable in the input data.

  • pd_time: The name of the pd_time variable in the input data.

  • swtrt: The name of the swtrt variable in the input data.

  • swtrt_time: The name of the swtrt_time variable in the input data.

  • swtrt_time_upper: The name of the swtrt_time_upper variable in the input data.

  • base_cov: The names of baseline covariates (excluding treat) in the input data for the Cox model.

  • conf_cov: The names of confounding variables (excluding treat) in the input data for the logistic regression switching model.

  • low_psi: The lower limit of the causal parameter.

  • hi_psi: The upper limit of the causal parameter.

  • n_eval_z: The number of points between low_psi and hi_psi (inclusive) at which to evaluate the Wald statistics for the coefficient of the counterfactual in the logistic regression switching model.

  • strata_main_effect_only: Whether to include only the strata main effects in the logistic regression switching model. Defaults to TRUE; otherwise, all possible strata combinations will be considered in the switching model.

  • firth: Whether Firth's bias-reducing penalized likelihood should be used. The default is FALSE.

  • flic: Whether to apply intercept correction to obtain more accurate predicted probabilities. The default is FALSE.

  • recensor: Whether to apply recensoring to counterfactual survival times. Defaults to TRUE.

  • admin_recensor_only: Whether to apply recensoring to administrative censoring times only. Defaults to TRUE. If FALSE, recensoring will be applied to the actual censoring times for dropouts.

  • swtrt_control_only: Whether treatment switching occurred only in the control group. The default is TRUE.

  • alpha: The significance level to calculate confidence intervals. The default value is 0.05.

  • ties: The method for handling ties in the Cox model, either "breslow" or "efron" (default).

  • tol: The desired accuracy (convergence tolerance) for psi.

  • offset: The offset to calculate the time to event, PD, and treatment switch. Set offset to 1 (default), 1/30.4375, or 1/365.25 if the time unit is day, month, or year, respectively.

  • boot: Whether to use bootstrap to obtain the confidence interval for hazard ratio. Defaults to TRUE.

  • n_boot: The number of bootstrap samples.

  • seed: The seed to reproduce the bootstrap results. The default is NA (missing), in which case the seed from the environment will be used.
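As an illustration of the counting-process layout described for data, the following is a minimal sketch of a toy input with one row per subject-interval. All values are made up, and ecog is a hypothetical time-dependent confounder, not a variable the function requires. The argument descriptions do not say whether intervals must be split at PD or at the switch time, so this layout is an assumption rather than a statement of what the package expects.

```r
# Toy counting-process data: one row per subject-interval.
# All values are invented for illustration; "ecog" is a hypothetical
# time-dependent confounder.
dat <- data.frame(
  id          = c(1, 1, 2, 2, 2, 3),
  tstart      = c(0, 30, 0, 30, 60, 0),
  tstop       = c(30, 55, 30, 60, 90, 40),
  event       = c(0, 1, 0, 0, 0, 1),       # 1 = event on the closing row
  treat       = c(0, 0, 0, 0, 0, 1),       # randomized arm
  censor_time = c(120, 120, 90, 90, 90, 100),
  pd          = c(1, 1, 1, 1, 1, 0),
  pd_time     = c(30, 30, 30, 30, 30, NA),
  swtrt       = c(1, 1, 0, 0, 0, 0),
  swtrt_time  = c(40, 40, NA, NA, NA, NA),
  ecog        = c(0, 1, 1, 1, 2, 0)        # changes over time within subject
)

# Helper: is a variable constant within each subject?
const_within_id <- function(x, id) {
  all(tapply(x, id, function(v) length(unique(v)) == 1))
}

# Basic consistency checks on the layout
stopifnot(all(dat$tstart < dat$tstop))
stopifnot(const_within_id(dat$treat, dat$id))
stopifnot(const_within_id(dat$censor_time, dat$id))

# Because the column names match the defaults, a call could look like this
# (not run: the toy data are far too small for a real fit):
# fit <- tsegest(data = dat, conf_cov = "ecog", boot = FALSE)
```

Subject-level variables such as treat, censor_time, pd_time, and swtrt_time repeat on every row of a subject, while the time-dependent confounder varies from row to row.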

Returns

A list with the following components:

  • psi: The estimated causal parameter for the control group.

  • psi_CI: The confidence interval for psi.

  • psi_CI_type: The type of confidence interval for psi, i.e., "logistic model" or "bootstrap".

  • logrank_pvalue: The two-sided p-value of the log-rank test for an intention-to-treat (ITT) analysis.

  • cox_pvalue: The two-sided p-value for treatment effect based on the Cox model.

  • hr: The estimated hazard ratio from the Cox model.

  • hr_CI: The confidence interval for hazard ratio.

  • hr_CI_type: The type of confidence interval for hazard ratio, either "Cox model" or "bootstrap".

  • analysis_switch: A list of data and analysis results related to treatment switching.

    • data_switch: The list of input data for the time from secondary baseline to switch by treatment group. The variables include id, stratum (if applicable), swtrt, and swtrt_time. If swtrt == 0, then swtrt_time is censored at the time from secondary baseline to either death or censoring.

    • km_switch: The list of Kaplan-Meier plots for the time from secondary baseline to switch by treatment group.

    • eval_z: The list of data by treatment group containing the Wald statistics for the coefficient of the counterfactual in the logistic regression switching model, evaluated at a sequence of psi values. It can be plotted to check whether the search range for psi and for the confidence limits of psi needs to be modified.

    • data_nullcox: The list of input data for counterfactual survival times for the null Cox model by treatment group.

    • fit_nullcox: The list of fitted null Cox models for counterfactual survival times by treatment group, which contains the martingale residuals.

    • data_logis: The list of input data for pooled logistic regression models for treatment switching using g-estimation.

    • fit_logis: The list of fitted pooled logistic regression models for treatment switching using g-estimation.

  • data_outcome: The input data for the outcome Cox model.

  • fit_outcome: The fitted outcome Cox model.

  • settings: A list with the following components:

    • low_psi: The lower limit of the causal parameter.
    • hi_psi: The upper limit of the causal parameter.
    • n_eval_z: The number of points between low_psi and hi_psi (inclusive) at which to evaluate the Wald statistics for the coefficient for the counterfactual in the logistic regression switching model.
    • strata_main_effect_only: Whether to only include the strata main effects in the logistic regression switching model.
    • firth: Whether Firth's penalized likelihood is used.
    • flic: Whether to apply intercept correction.
    • recensor: Whether to apply recensoring to counterfactual survival times.
    • admin_recensor_only: Whether to apply recensoring to administrative censoring times only.
    • swtrt_control_only: Whether treatment switching occurred only in the control group.
    • alpha: The significance level to calculate confidence intervals.
    • ties: The method for handling ties in the Cox model.
    • tol: The desired accuracy (convergence tolerance) for psi.
    • offset: The offset to calculate the time to event, PD, and treatment switch.
    • boot: Whether to use bootstrap to obtain the confidence interval for hazard ratio.
    • n_boot: The number of bootstrap samples.
    • seed: The seed to reproduce the bootstrap results.
  • psi_trt: The estimated causal parameter for the experimental group if swtrt_control_only is FALSE.

  • psi_trt_CI: The confidence interval for psi_trt if swtrt_control_only is FALSE.

  • hr_boots: The bootstrap hazard ratio estimates if boot is TRUE.

  • psi_boots: The bootstrap psi estimates if boot is TRUE.

  • psi_trt_boots: The bootstrap psi_trt estimates if boot is TRUE and swtrt_control_only is FALSE.
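The check that eval_z is meant to support can be sketched as follows. The internal structure of the eval_z elements is not described above, so this sketch uses a stand-in data frame with assumed column names psi and z and a made-up Wald curve. It looks for the point where the curve crosses zero (the point estimate) and the points where it crosses the normal critical values (the confidence limits). A result of NA would suggest that the range set by low_psi and hi_psi is too narrow.

```r
# Hypothetical stand-in for one element of analysis_switch$eval_z.
# Column names (psi, z) and the curve itself are assumptions for illustration.
alpha <- 0.05
ez <- data.frame(psi = seq(-3, 3, length.out = 100))
ez$z <- 2.5 * (ez$psi - 0.4)   # made-up monotone Wald curve

# Linear interpolation for the psi at which z first crosses a target value;
# returns NA when the curve never crosses the target inside the grid.
cross_at <- function(psi, z, target) {
  s <- sign(z - target)
  k <- which(diff(s) != 0)[1]
  if (is.na(k)) return(NA_real_)
  psi[k] + (target - z[k]) * (psi[k + 1] - psi[k]) / (z[k + 1] - z[k])
}

zcrit   <- qnorm(1 - alpha / 2)
psi_hat <- cross_at(ez$psi, ez$z, 0)
# sort() makes the limits independent of whether z increases or decreases in psi
psi_ci  <- sort(c(cross_at(ez$psi, ez$z, -zcrit),
                  cross_at(ez$psi, ez$z,  zcrit)))

# A curve that stays on one side of zero signals that the range is too narrow
no_root <- cross_at(ez$psi, ez$z + 100, 0)
```

Plotting z against psi with horizontal lines at 0 and at plus or minus zcrit gives the same information visually.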

Details

We use the following steps to obtain the hazard ratio estimate and confidence interval had there been no treatment switching:

  • Use a pooled logistic regression switching model to estimate the causal parameter $\psi$ based on the patients in the control group who had disease progression:

    $$\textrm{logit}(p(E_{ik})) = \alpha U_{i,\psi} + \sum_{j} \beta_j x_{ijk}$$

    where $E_{ik}$ is the observed switch indicator for individual $i$ at observation $k$,

    $$U_{i,\psi} = T_{C_i} + e^{\psi} T_{E_i}$$

    is the counterfactual survival time for individual $i$ given a specific value for $\psi$, and $x_{ijk}$ are the confounders for individual $i$ at observation $k$. When applied from a secondary baseline, $U_{i,\psi}$ refers to post-secondary baseline counterfactual survival, where $T_{C_i}$ corresponds to the time spent after the secondary baseline on control treatment, and $T_{E_i}$ corresponds to the time spent after the secondary baseline on the experimental treatment.

  • Search for $\psi$ such that the estimate of $\alpha$ is close to zero. This will be the estimate of the causal parameter. The confidence limits for $\psi$ are the values of $\psi$ at which the two-sided p-value for testing $H_0: \alpha = 0$ in the switching model equals the nominal significance level.
  • Derive the counterfactual survival times for control patients had there been no treatment switching.
  • Fit the Cox proportional hazards model to the observed survival times for the experimental group and the counterfactual survival times for the control group to obtain the hazard ratio estimate.
  • If bootstrapping is used, the confidence interval and corresponding p-value for hazard ratio are calculated based on a t-distribution with n_boot - 1 degrees of freedom.
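The steps above can be sketched in base R on simulated data. This is a simplified illustration under stated assumptions, not the package's implementation: there is one record per patient instead of pooled observations, switchers are assumed to switch exactly at the secondary baseline, there is no censoring or recensoring, and psi_true, the confounder x, and all rates are made up. The helper boot_t_ci additionally assumes that the bootstrap interval is formed on the log hazard ratio scale with the bootstrap standard deviation as the standard error, which the text does not spell out.

```r
set.seed(123)
n        <- 500
psi_true <- -0.5                           # made-up value, used only to simulate

x  <- rbinom(n, 1, 0.5)                    # a single confounder
u0 <- rexp(n, rate = 0.01 * exp(0.5 * x))  # untreated survival after secondary baseline
e  <- rbinom(n, 1, plogis(-0.5 + x))       # switching depends on x only

# Simplification: switchers spend all post-baseline time on experimental treatment
t_c <- ifelse(e == 1, 0, u0)
t_e <- ifelse(e == 1, u0 * exp(-psi_true), 0)

# Wald z statistic for alpha in logit(p(E)) = alpha * U_psi + beta * x
wald_z <- function(psi) {
  u   <- t_c + exp(psi) * t_e              # counterfactual time at candidate psi
  fit <- suppressWarnings(glm(e ~ u + x, family = binomial()))
  coef(summary(fit))["u", "z value"]
}

# g-estimation: find psi at which the counterfactual no longer predicts switching
psi_hat <- uniroot(wald_z, lower = -2, upper = 1, tol = 1e-6)$root

# Counterfactual survival times had there been no switching
u_hat <- t_c + exp(psi_hat) * t_e

# t-based interval and p-value from bootstrap hazard ratios (assumed log scale)
boot_t_ci <- function(hr, hr_boots, alpha = 0.05) {
  n_boot <- length(hr_boots)
  se     <- sd(log(hr_boots))
  tcrit  <- qt(1 - alpha / 2, df = n_boot - 1)
  list(
    ci      = exp(log(hr) + c(-1, 1) * tcrit * se),
    p_value = 2 * pt(-abs(log(hr) / se), df = n_boot - 1)
  )
}

# Made-up bootstrap replicates, for illustration only
hr_boots_demo <- exp(rnorm(1000, mean = log(0.7), sd = 0.15))
res <- boot_t_ci(0.7, hr_boots_demo)
```

In a real analysis, the counterfactual times for the control group would be combined with the observed times for the experimental group in a Cox model, and the bootstrap replicates would come from refitting the whole procedure on resampled subjects.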

Examples

# Example 1: one-way treatment switching (control to active)
sim1 <- tsegestsim(
  n = 500, allocation1 = 2, allocation2 = 1, pbprog = 0.5,
  trtlghr = -0.5, bprogsl = 0.3, shape1 = 1.8,
  scale1 = 0.000025, shape2 = 1.7, scale2 = 0.000015,
  pmix = 0.5, admin = 5000, pcatnotrtbprog = 0.5,
  pcattrtbprog = 0.25, pcatnotrt = 0.2, pcattrt = 0.1,
  catmult = 0.5, tdxo = 1, ppoor = 0.1, pgood = 0.04,
  ppoormet = 0.4, pgoodmet = 0.2, xomult = 1.4188308,
  milestone = 546, swtrt_control_only = TRUE,
  outputRawDataset = 1, seed = 2000)

fit1 <- tsegest(
  data = sim1$paneldata, id = "id",
  tstart = "tstart", tstop = "tstop", event = "died",
  treat = "trtrand", censor_time = "censor_time",
  pd = "progressed", pd_time = "timePFSobs",
  swtrt = "xo", swtrt_time = "xotime",
  swtrt_time_upper = "xotime_upper",
  base_cov = "bprog", conf_cov = "bprog*catlag",
  low_psi = -3, hi_psi = 3, strata_main_effect_only = TRUE,
  recensor = TRUE, admin_recensor_only = TRUE,
  swtrt_control_only = TRUE, alpha = 0.05, ties = "efron",
  tol = 1.0e-6, boot = FALSE)

c(fit1$hr, fit1$hr_CI)

# Example 2: two-way treatment switching
sim2 <- tsegestsim(
  n = 500, allocation1 = 2, allocation2 = 1, pbprog = 0.5,
  trtlghr = -0.5, bprogsl = 0.3, shape1 = 1.8,
  scale1 = 0.000025, shape2 = 1.7, scale2 = 0.000015,
  pmix = 0.5, admin = 5000, pcatnotrtbprog = 0.5,
  pcattrtbprog = 0.25, pcatnotrt = 0.2, pcattrt = 0.1,
  catmult = 0.5, tdxo = 1, ppoor = 0.1, pgood = 0.04,
  ppoormet = 0.4, pgoodmet = 0.2, xomult = 1.4188308,
  milestone = 546, swtrt_control_only = FALSE,
  outputRawDataset = 1, seed = 2000)

fit2 <- tsegest(
  data = sim2$paneldata, id = "id",
  tstart = "tstart", tstop = "tstop", event = "died",
  treat = "trtrand", censor_time = "censor_time",
  pd = "progressed", pd_time = "timePFSobs",
  swtrt = "xo", swtrt_time = "xotime",
  swtrt_time_upper = "xotime_upper",
  base_cov = "bprog", conf_cov = "bprog*catlag",
  low_psi = -3, hi_psi = 3, strata_main_effect_only = TRUE,
  recensor = TRUE, admin_recensor_only = TRUE,
  swtrt_control_only = FALSE, alpha = 0.05, ties = "efron",
  tol = 1.0e-6, boot = FALSE)

c(fit2$hr, fit2$hr_CI)

References

NR Latimer, IR White, K Tilling, and U Siebert. Improved two-stage estimation to adjust for treatment switching in randomised trials: g-estimation to address time-dependent confounding. Statistical Methods in Medical Research. 2020;29(10):2900-2918.

Author(s)

Kaifeng Lu, kaifenglu@gmail.com

  • Maintainer: Kaifeng Lu
  • License: GPL (>= 2)
  • Last published: 2025-03-20
