wpl() R function from [multipleNCC]

Weighted partial likelihood for nested case-control data

Fits Cox proportional hazards models for nested case-control data with a weigthed partial likelihood. Matching between cases and controls is broken which enables the controls to be reused for other endpoints. It handles competing risks (with simple survival data with one endpoint being a special case) and cases and controls from one endpoint are being used as additional controls for another endpoint. There are four choices of weights; Samuelsen (1997) KM, estimated with logistic regression (glm), logistic genralized additive model (gam) and local averaging (Chen, 2001) (Chen). KM, glm and gam handle additional matching, while all of them handle left-truncation.


wpl(x, data, samplestat, m = 1, weight.method = "KM", no.intervals = 10, 
variance = "robust", no.intervals.left = c(3, 4), match.var = 0, match.int = 0)

Arguments

x: A formula object, with the response on the left of a ~ operator, and the terms on the right. The response must be a survival object as returned by the Surv function. The status variable going in to Surv is not actually used but should have 1 for cases and zero for controls and non-sampled subjetcs. All elements going into the formula should have lenght equal to the number of subjects in the cohort. Generally some of the covariates are not known for all subjects in the cohort (due to the NCC-sampling). The covariate values for those subjects should just be given some value e.g. 0 (not NA). Which value choosen is not important as the values are never used.
data: data.frame in which to interpret the variables named in the formula.
samplestat: A vector containing sampling and status information: 0 represents non-sampled subjects in the cohort, 1: sampled controls, 2,3,... indicate different events. Cohort dimension.
m: Number of sampled controls. A scalar if equal number of controls for all cases. If unequal number of controls per case: A vector of length number of cases. The vector must be in the same order as the cases in the samplestat-vector.
weight.method: Which weigths should be used, possibilities "KM", "gam", "glm", "Chen"
no.intervals: Number of intervals for censoring times for Chen-weights with only right censoring
variance: Default is robust variances, but model based variance (only for KM-weights), "Modelbased" and variance based on stratified case-cohort "Poststrat" (only for Chen-weights) is also possible. Pseudo-variance and Strat-variance will appear under "est.se(coef)" in the output.
no.intervals.left: Number of intervals for Chen-weights with left-truncation. A vector on the form [number of intervals for left truncated time, number of intervals for survival time].
match.var: If the controls are matched to the cases (on other variables than time), match.var is the vector or matrix of matching variables. Cohort dimension.
match.int: A vector of length 2*number of matching variables. For caliper matching (matched on value pluss/minus epsilon) match.int should consist of c(-epsilon,epsilon). For exact matching match.int should consist of c(0,0).

Returns

An object of class wpl representing the fit. Objects of this class have methods for the functions print and summary. The wpl-object consists of the following elements which are repeated for each endpoint. Unfortunately only the values for the first endpoint can be reached by $-operator(ex. fit$ coefficients only return the coefficients for the first endpoint)

coefficients: The vector of coefficients.
var: Robust or estimated variance
weighted.loglik: A vector of length 2 containing the log-likelihood with the initial values and with the final values of the coefficients.
iter: Number of iterations used
linear.predictors: The vector of linear predictors, one per subject. Note that this vector has been centered, see predict.coxph for more details
residuals: The martingale residuals
means: Vector of column means of the X matrix
method: The computation method used
n: The number of observations used in the fit
nevent: The number of events (usually deaths) used in the fit
naive.var: naive.var
rscore: The robust log-rank statistic
wald.test: The Wald test of whether the final coefficients differ from the initial values
y: Inclusion time and event/censoring time
weights: The vector of weights, which are inverse sampling probabilities
est.var: Estimated variance (T) or robust variance (F)

References

Samuelsen SO. A pseudolikelihood approach to analysis of nested case-control studies. Biometrika, 84(2):379-394, 1997.

Stoer NC and Samuelsen SO (2013): Inverse probability weighting in nested case-control studies with additional matching - a simulation study. Statistics in Medicine, 32(30), 5328-5339.

Author(s)

Nathalie C. Stoer

Examples


data(CVD_Accidents)
wpl(Surv(agestart,agestop,dead24)~factor(smoking3gr)+bmi+factor(sex),data=CVD_Accidents,
samplestat=CVD_Accidents$samplestat,weight.method="gam")

wpl(Surv(agestart,agestop,dead24)~factor(smoking3gr)+bmi+factor(sex),data=CVD_Accidents,
samplestat=CVD_Accidents$samplestat,m=1,match.var=cbind(CVD_Accidents$sex,
CVD_Accidents$bmi),match.int=c(0,0,-2,2),weight.method="glm")

## The function is currently defined as
function (x, data, samplestat, m = 1, weight.method = "KM", no.intervals = 10, 
    variance = "robust", no.intervals.left = c(3, 4), match.var = 0, 
    match.int = 0) 
{
    UseMethod("wpl")
  }

multipleNCC package Read PDF manual

Maintainer: Nathalie C. Stoer
License: GPL-2
Last published: 2024-02-12

Useful links

wpl function