gee function

Generalized Estimating Equations

gee estimates the parameters of a restricted mean model using standard GEEs with an independent working correlation matrix. For clustered data, cluster-robust standard errors are calculated. When cond=TRUE, cluster-specific intercepts are assumed.

gee(formula, link = c("identity", "log", "logit"), data, subset, cond = FALSE, clusterid, clusterid.vcov, rootFinder = findRoots, ...)

Arguments

  • formula: An expression or formula representing the expected outcome conditional on covariates.
  • link: A character string naming the link function to use. Has to be "identity", "log" or "logit". Default is "identity".
  • data: A data frame or environment containing the variables appearing in formula. If missing, the variables are expected to be found in the environment of the formula argument.
  • subset: An optional vector defining a subset of the data to be used.
  • cond: A logical value indicating whether cluster-specific intercepts should be assumed. Requires a clusterid argument.
  • clusterid: A cluster-defining variable or a character string naming a cluster-defining variable in the data argument. If it is not found in the data argument, it will be searched for in the calling frame. If missing, each observation will be considered to be a separate cluster. This argument is required when cond = TRUE.
  • clusterid.vcov: A cluster-defining variable or a character string naming a cluster-defining variable in the data argument to be used for adding contributions from the same cluster. These clusters can be different from the clusters defined by clusterid. However, each cluster defined by clusterid needs to be contained in exactly one cluster defined by clusterid.vcov. This variable is useful when the clusters are hierarchical.
  • rootFinder: A function to solve a system of nonlinear equations. Default is findRoots.
  • ...: Further arguments to be passed to the function rootFinder.
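
A minimal usage sketch of these arguments, assuming gee is the function exported by the drgee package (the package name is inferred from the summary.drgee method named on this page). The data frame sim and its columns y, x and id are hypothetical simulated data, and the model calls are skipped when the package is not installed.

```r
# Hypothetical clustered data: 200 clusters of 5 observations each
set.seed(1)
n_clusters <- 200
m <- 5
sim <- data.frame(id = rep(seq_len(n_clusters), each = m))
u <- rnorm(n_clusters)[sim$id]          # cluster-level effect shared within a cluster
sim$x <- rnorm(nrow(sim)) + 0.5 * u     # covariate correlated with the cluster effect
sim$y <- 1 + 0.5 * sim$x + u + rnorm(nrow(sim))

# Assumption: gee() comes from the drgee package
if (requireNamespace("drgee", quietly = TRUE)) {
  # Marginal model (cond = FALSE): cluster-robust standard errors
  fit_marg <- drgee::gee(y ~ x, link = "identity", data = sim,
                         clusterid = "id")

  # Conditional model (cond = TRUE): cluster-specific intercepts,
  # so clusterid is required and no common intercept is estimated
  fit_cond <- drgee::gee(y ~ x, link = "identity", data = sim,
                         cond = TRUE, clusterid = "id")

  print(coef(fit_marg))
  print(coef(fit_cond))
}
```

Because x is simulated to depend on the cluster effect, the marginal and conditional estimates of the x coefficient can differ; the conditional fit removes cluster-level confounding through the cluster-specific intercepts.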

Details

Estimates parameters in a regression model, defined by formula. When cond=FALSE, the estimated coefficients are identical to those obtained with glm, but since no distributional assumptions are made, a robust variance is calculated. When cond=TRUE and link is "identity" or "log", the coefficients will be calculated using conditional estimating equations as described in Goetgeluk and Vansteelandt (2008). When cond=TRUE and link="logit", the coefficients will be calculated by conditional logistic regression (with robust standard errors).
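
To illustrate what a robust variance means in the cond=FALSE case, the following base-R sketch fits the model with glm and then computes a cluster-robust sandwich variance by hand for the identity link. It is an illustration of the general technique under simulated, hypothetical data, not a reproduction of the internals of gee, and it applies no small-sample correction.

```r
# Hypothetical clustered data with a shared cluster effect
set.seed(1)
n_clusters <- 200
m <- 5
id <- rep(seq_len(n_clusters), each = m)
u <- rep(rnorm(n_clusters), each = m)
x <- rnorm(n_clusters * m)
y <- 1 + 0.5 * x + u + rnorm(n_clusters * m)

# Point estimates: the same estimating equations as an ordinary glm fit
fit <- glm(y ~ x, family = gaussian())
X <- model.matrix(fit)
r <- y - fitted(fit)                   # residuals of the estimating equations

# Sandwich variance: A^{-1} B A^{-1}
A <- crossprod(X)                      # "bread": minus the derivative of the summed score
U <- rowsum(X * r, group = id)         # score contributions summed within each cluster
B <- crossprod(U)                      # "meat": sum over clusters of U_i U_i'
robust_vcov <- solve(A) %*% B %*% solve(A)

# Compare robust and model-based (naive) standard errors
print(cbind(robust = sqrt(diag(robust_vcov)),
            naive  = sqrt(diag(vcov(fit)))))
```

Summing the score contributions within clusters before forming the outer product is what allows observations in the same cluster to be correlated; with one observation per cluster the same formula reduces to the usual heteroskedasticity-robust variance.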

Returns

gee returns an object of class gee containing:

  • coefficients: Estimates of the parameters.

  • vcov: Robust variance of the estimates.

  • call: The matched call.

  • y: The outcome vector.

  • x: The design matrix. For conditional methods there is no column corresponding to the intercept.

  • optim.object: An estimation object returned from the function specified in the rootFinder, if this function is called.

  • res: The residuals from the estimating equations.

  • d.res: The derivative of the residuals from the estimating equations.

  • data: The original data object, if given as an input argument.

  • formula: The original formula object, if given as an input argument.

The class methods coef and vcov can be used to extract the estimated parameters and their covariance matrix from a gee object. To obtain the 'naive' variance, i.e. the variance obtained from maximum likelihood estimation assuming a correct parametric model and no clustering, use the class method naiveVcov. The class method summary.drgee produces a summary of the calculations.
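
As a sketch of how the coef and vcov methods might be combined, the example below builds Wald-type 95% confidence intervals from a fitted object. It assumes gee is provided by the drgee package and uses hypothetical simulated data; the normal-approximation interval is a standard construction, not something this page prescribes.

```r
# Hypothetical clustered binary outcome: 100 clusters of 4 observations
set.seed(2)
sim <- data.frame(id = rep(1:100, each = 4))
sim$x <- rnorm(nrow(sim))
cluster_effect <- rnorm(100)[sim$id]
sim$y <- rbinom(nrow(sim), size = 1,
                prob = plogis(-0.5 + sim$x + cluster_effect))

# Assumption: gee() comes from the drgee package
if (requireNamespace("drgee", quietly = TRUE)) {
  fit <- drgee::gee(y ~ x, link = "logit", data = sim, clusterid = "id")

  est <- coef(fit)                     # estimated parameters
  se  <- sqrt(diag(vcov(fit)))         # cluster-robust standard errors

  # Wald-type 95% confidence intervals based on the robust variance
  z  <- qnorm(0.975)
  ci <- cbind(estimate = est, lower = est - z * se, upper = est + z * se)
  print(ci)
}
```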

See Also

glm

Author(s)

Johan Zetterqvist, Arvid

References

Goetgeluk S., & Vansteelandt S. (2008), Conditional generalized estimating equations for the analysis of clustered and longitudinal data. Biometrics, 64 (3), pp. 772-780.

  • Maintainer: Johan Zetterqvist
  • License: GPL-2 | GPL-3
  • Last published: 2020-01-09
