An internal function called by the tmle function to obtain an initial estimate of the Q portion of the likelihood. The estimate is based on a user-supplied matrix of predicted values for the counterfactual outcomes Q(0,W) and Q(1,W), on a user-supplied regression formula, or on a data-adaptively selected SuperLearner fit. In the absence of user-supplied values, a user-supplied regression formula takes precedence over data-adaptive super-learning. The default is to return cross-validated predictions.
estimateQ(Y, Z, A, W, Delta, Q, Qbounds, Qform, maptoYstar, SL.library, cvQinit, family, id, V, verbose, discreteSL, obsWeights)
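The precedence described above can be sketched in a few lines of base R. This is only an illustration: `choose_Q_source` is a hypothetical helper, not a function in tmle, and the real function does considerably more than select a source.

```r
# Hypothetical helper (not part of tmle): reports which source the initial
# Q estimate would come from, following the precedence in the description.
choose_Q_source <- function(Q = NULL, Qform = NULL) {
  if (!is.null(Q)) {
    "user-supplied Q"        # a supplied matrix of predicted values wins
  } else if (!is.null(Qform)) {
    "regression formula"     # Qform is used before super-learning
  } else {
    "SuperLearner"           # data-adaptive fallback
  }
}

choose_Q_source(Q = matrix(0.5, nrow = 4, ncol = 3), Qform = Y ~ A + W)
choose_Q_source(Qform = Y ~ A + W)
choose_Q_source()
```

The three calls return "user-supplied Q", "regression formula", and "SuperLearner" respectively.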
Arguments
Y: continuous or binary outcome variable
Z: optional binary indicator for intermediate covariate for controlled direct effect estimation
A: binary treatment indicator, 1 - treatment, 0 - control
W: vector, matrix, or dataframe containing baseline covariates
Delta: indicator of missing outcome, 1 - observed, 0 - missing
Q: optional user-supplied matrix of predicted values for Q(0,W), Q(1,W)
Qbounds: Bounds on predicted values for Q, set to alpha for logistic fluctuation, or range(Y) if not user-supplied
Qform: regression formula of the form Y~A+W
maptoYstar: if TRUE indicates continuous Y values should be shifted and scaled to fall between (0,1)
SL.library: specification of prediction algorithms, default is (SL.glm, SL.glmnet, tmle.SL.dbarts2). In practice, including more prediction algorithms in the library generally improves results.
cvQinit: logical, whether or not to estimate cross-validated values for initial Q, default=TRUE
family: family specification for regressions, generally gaussian for continuous outcomes, binomial for binary outcomes
id: subject identifier
V: Number of cross-validation folds for Super Learning
verbose: status message printed if set to TRUE
discreteSL: if TRUE, returns discrete SL estimates, otherwise ensemble estimates. Ignored when SL is not used.
obsWeights: sampling weights
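As a rough illustration of the maptoYstar and Qbounds arguments, the sketch below shifts and scales a simulated continuous outcome onto the unit interval using range(Y), then truncates the values away from 0 and 1. The variable names and the value of `alpha` are assumptions made for this example, and the internal implementation of estimateQ may differ.

```r
# Illustrative only; estimateQ's internal implementation may differ.
set.seed(1)
Y <- rnorm(10, mean = 50, sd = 10)   # simulated continuous outcome

# Shift and scale Y onto the unit interval using its observed range
Qbounds <- range(Y)
Ystar <- (Y - Qbounds[1]) / diff(Qbounds)

# Keep values away from 0 and 1 by a small alpha (hypothetical value)
alpha <- 0.005
Ystar_bounded <- pmin(pmax(Ystar, alpha), 1 - alpha)

range(Ystar)          # 0 1
range(Ystar_bounded)  # 0.005 0.995
```

Multiplying by diff(Qbounds) and adding Qbounds[1] maps values on the unit scale back to the original scale of Y.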
Returns
Q: nx3 matrix, columns contain the initial estimate of [Q(A,W)=E(Y∣A=a,W),Q(0,W)=E(Y∣A=0,W),Q(1,W)=E(Y∣A=1,W)]. For controlled direct effect estimation, an nx5 matrix containing E(Y∣Z,A,W) evaluated at (z,a),(0,0),(0,1),(1,0),(1,1), on the scale of the linear predictors
Qfamily: binomial for targeting with logistic fluctuation, gaussian for linear fluctuation
coef: coefficients for each term in the working model used for initial estimation of Q, if glm is used.
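To make the shape of the returned Q matrix concrete, the following sketch builds an nx3 matrix from a plain glm fit of a formula of the form Y~A+W on simulated data. It uses base R only and is not the code path inside estimateQ; the simulated data, the column names, and the choice of the linear predictor scale (type = "link") are assumptions made for this example.

```r
# Simulated data; all names and values here are illustrative.
set.seed(42)
n <- 200
W <- rnorm(n)
A <- rbinom(n, 1, plogis(0.3 * W))
Y <- rbinom(n, 1, plogis(-0.5 + A + 0.8 * W))
dat <- data.frame(Y = Y, A = A, W = W)

# Working model of the form Y ~ A + W
fit <- glm(Y ~ A + W, family = binomial, data = dat)

# Columns: Q(A,W), Q(0,W), Q(1,W), each on the linear predictor scale
Q <- cbind(
  QAW = predict(fit, newdata = dat, type = "link"),
  Q0W = predict(fit, newdata = transform(dat, A = 0), type = "link"),
  Q1W = predict(fit, newdata = transform(dat, A = 1), type = "link")
)

dim(Q)      # 200 3
coef(fit)   # one coefficient per term in the working model
```

The first column equals the third for treated subjects and the second for controls, which gives a quick consistency check on a matrix assembled this way.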