vim function

Estimate a variable importance measure (VIM) of the requested type, such as the AUC VIM.

vim(
  type,
  time,
  event,
  X,
  landmark_times = stats::quantile(time[event == 1], probs = c(0.25, 0.5, 0.75)),
  restriction_time = max(time[event == 1]),
  approx_times = NULL,
  large_feature_vector,
  small_feature_vector,
  conditional_surv_preds = NULL,
  large_oracle_preds = NULL,
  small_oracle_preds = NULL,
  conditional_surv_generator = NULL,
  conditional_surv_generator_control = NULL,
  large_oracle_generator = NULL,
  large_oracle_generator_control = NULL,
  small_oracle_generator = NULL,
  small_oracle_generator_control = NULL,
  cf_folds = NULL,
  cf_fold_num = 5,
  sample_split = TRUE,
  ss_folds = NULL,
  robust = TRUE,
  scale_est = FALSE,
  alpha = 0.05,
  verbose = FALSE
)

Arguments

  • type: Type of VIM to compute. Options include "accuracy", "AUC", "Brier", "R-squared", "C-index", and "survival_time_MSE".

  • time: n x 1 numeric vector of observed follow-up times. If there is censoring, these are the minimum of the event and censoring times.

  • event: n x 1 numeric vector of status indicators of whether an event was observed.

  • X: n x p data.frame of observed covariate values.

  • landmark_times: Numeric vector of length J1 giving landmark times at which to estimate VIM ("accuracy", "AUC", "Brier", "R-squared").

  • restriction_time: Maximum follow-up time for calculation of "C-index" and "survival_time_MSE".

  • approx_times: Numeric vector of length J2 giving times at which to approximate integrals. Defaults to a grid of 100 timepoints, evenly spaced on the quantile scale of the distribution of observed event times.

  • large_feature_vector: Numeric vector giving indices of features to include in the 'large' prediction model.

  • small_feature_vector: Numeric vector giving indices of features to include in the 'small' prediction model. Must be a subset of large_feature_vector.

  • conditional_surv_preds: User-provided estimates of the conditional survival functions of the event and censoring variables given the full covariate vector (if not using the vim() function to compute these nuisance estimates). Must be a named list of lists with elements S_hat, S_hat_train, G_hat, and G_hat_train. Each of these is itself a list of length K, where K is the number of cross-fitting folds. Each element of these lists is a matrix with J2 columns and number of rows equal to either the number of samples in the kth fold (for S_hat or G_hat) or the number of samples used to compute the nuisance estimator for the kth fold.

  • large_oracle_preds: User-provided estimates of the oracle prediction function using large_feature_vector. Must be a named list of lists with elements f_hat and f_hat_train. Each of these is itself a list of length K. Each element of these lists is a matrix with J1 columns (for landmark time VIMs) or 1 column (for "C-index" and "survival_time_MSE").

  • small_oracle_preds: User-provided estimates of the oracle prediction function using small_feature_vector. Must be a named list of lists with elements f_hat and f_hat_train. Each of these is itself a list of length K. Each element of these lists is a matrix with J1 columns (for landmark time VIMs) or 1 column (for "C-index" and "survival_time_MSE").

  • conditional_surv_generator: A user-written function to estimate the conditional survival functions of the event and censoring variables. Must take arguments time, event, folds (cross-fitting fold identifiers), and newtimes (times at which to generate predictions).

  • conditional_surv_generator_control: A list of arguments to pass to conditional_surv_generator.

  • large_oracle_generator: A user-written function to estimate the oracle prediction function using large_feature_vector. Must take arguments time, event, and folds (cross-fitting fold identifiers).

  • large_oracle_generator_control: A list of arguments to pass to large_oracle_generator.

  • small_oracle_generator: A user-written function to estimate the oracle prediction function using small_feature_vector. Must take arguments time, event, and folds (cross-fitting fold identifiers).

  • small_oracle_generator_control: A list of arguments to pass to small_oracle_generator.

  • cf_folds: Numeric vector of length n giving cross-fitting fold identifiers.

  • cf_fold_num: The number of cross-fitting folds to use if cf_folds is not provided.

  • sample_split: Logical indicating whether or not to use sample splitting.

  • ss_folds: Numeric vector of length n giving sample-splitting fold identifiers.

  • robust: Logical, whether or not to use the doubly-robust debiasing approach. This option is meant for illustration purposes only --- it should be left as TRUE.

  • scale_est: Logical, whether or not to force the VIM estimate to be nonnegative.

  • alpha: The level at which to compute confidence intervals and hypothesis tests. Defaults to 0.05.

  • verbose: Whether to print progress messages.
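The default time-related arguments and a fold vector can be sketched in base R as follows. The landmark_times and restriction_time expressions are copied from the usage signature. The approx_times grid and the cf_folds assignment are only one plausible reading of the descriptions above, not necessarily what vim() does internally, and the simulated data are purely illustrative.

```r
# Simulated right-censored data (illustrative only)
set.seed(1)
n <- 200
event_time <- rexp(n, rate = 0.2)
cens_time <- rexp(n, rate = 0.1)
time <- pmin(event_time, cens_time)
event <- as.numeric(event_time <= cens_time)

# Defaults as written in the usage signature
landmark_times <- stats::quantile(time[event == 1], probs = c(0.25, 0.5, 0.75))
restriction_time <- max(time[event == 1])

# ASSUMPTION: one way to build a grid of 100 timepoints that is evenly
# spaced on the quantile scale of the observed event times
approx_times <- unique(
  stats::quantile(time[event == 1], probs = seq(0, 1, length.out = 100))
)

# ASSUMPTION: a balanced random fold assignment that could be passed as cf_folds
cf_fold_num <- 5
cf_folds <- sample(rep(seq_len(cf_fold_num), length.out = n))

print(landmark_times)
print(table(cf_folds))
```

Passing cf_folds explicitly, rather than relying on cf_fold_num, keeps the fold assignment reproducible across repeated calls.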

Returns

Named list with the following elements:

  • result: Data frame giving results. See the documentation of the individual vim_* functions for details.

  • folds: A named list giving the cross-fitting fold IDs (cf_folds) and sample-splitting fold IDs (ss_folds).

  • approx_times: A vector of times used to approximate integrals appearing in the form of the VIM estimator.

  • conditional_surv_preds: A named list containing the estimated conditional event and censoring survival functions.

  • large_oracle_preds: A named list containing the estimated large oracle prediction function.

  • small_oracle_preds: A named list containing the estimated small oracle prediction function.

Examples

# This is a small simulation example
set.seed(123)
n <- 100
X <- data.frame(X1 = rnorm(n), X2 = rbinom(n, size = 1, prob = 0.5))
T <- rexp(n, rate = exp(-2 + X[,1] - X[,2] + .5 * X[,1] * X[,2]))
C <- rexp(n, exp(-2 -.5 * X[,1] - .25 * X[,2] + .5 * X[,1] * X[,2]))
C[C > 15] <- 15
time <- pmin(T, C)
event <- as.numeric(T <= C)

# landmark times for AUC
landmark_times <- c(3)

output <- vim(type = "AUC",
              time = time,
              event = event,
              X = X,
              landmark_times = landmark_times,
              large_feature_vector = 1:2,
              small_feature_vector = 2,
              conditional_surv_generator_control = list(SL.library = c("SL.mean", "SL.glm")),
              large_oracle_generator_control = list(SL.library = c("SL.mean", "SL.glm")),
              small_oracle_generator_control = list(SL.library = c("SL.mean", "SL.glm")),
              cf_fold_num = 2,
              sample_split = FALSE,
              scale_est = TRUE)
print(output$result)
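When supplying precomputed nuisance estimates instead of generators, the nested list layout described for conditional_surv_preds and large_oracle_preds might look like the base-R sketch below. All numeric values are placeholders rather than real estimates. The toy dimensions, the helper names (placeholder_surv, n_test, n_train), and the choice of "training" rows as the observations outside fold k are assumptions made for illustration. The row counts of the oracle prediction matrices are likewise assumed to mirror those of the survival matrices.

```r
# Toy dimensions (hypothetical): n observations split into K folds
n <- 12
K <- 2
cf_folds <- rep(seq_len(K), length.out = n)

approx_times <- c(1, 2, 3, 4)   # J2 = 4 integral-approximation times
landmark_times <- c(2, 3)       # J1 = 2 landmark times
J2 <- length(approx_times)
J1 <- length(landmark_times)

# Placeholder survival curve exp(-rate * t), repeated in every row
placeholder_surv <- function(n_rows, rate) {
  matrix(exp(-rate * approx_times), nrow = n_rows, ncol = J2, byrow = TRUE)
}

# ASSUMPTION: the training rows for fold k are the observations outside fold k
n_test <- function(k) sum(cf_folds == k)
n_train <- function(k) sum(cf_folds != k)

# Named list of lists; each inner list has length K, each matrix has J2 columns
conditional_surv_preds <- list(
  S_hat       = lapply(seq_len(K), function(k) placeholder_surv(n_test(k), 0.2)),
  S_hat_train = lapply(seq_len(K), function(k) placeholder_surv(n_train(k), 0.2)),
  G_hat       = lapply(seq_len(K), function(k) placeholder_surv(n_test(k), 0.1)),
  G_hat_train = lapply(seq_len(K), function(k) placeholder_surv(n_train(k), 0.1))
)

# Oracle predictions: J1 columns for landmark-time VIMs (placeholder value 0.5)
large_oracle_preds <- list(
  f_hat       = lapply(seq_len(K), function(k) matrix(0.5, nrow = n_test(k), ncol = J1)),
  f_hat_train = lapply(seq_len(K), function(k) matrix(0.5, nrow = n_train(k), ncol = J1))
)

# Inspect the nesting without printing every matrix
str(conditional_surv_preds, max.level = 2)
```

For "C-index" and "survival_time_MSE", the oracle prediction matrices would have a single column instead of J1 columns.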

See Also

vim_accuracy, vim_AUC, vim_brier, vim_cindex, vim_rsquared, vim_survival_time_mse