predictCI function

Calculate a predicted value matrix (fit, lwr, upr) for a regression, either lm or glm, on either link or response scale.

Calculate a predicted value matrix (fit, lwr, upr) for a regression, either lm or glm, on either link or response scale.

This adapts code from predict.glm and predict.lm. I eliminated type = "terms" from consideration.

predictCI( object, newdata = NULL, type = c("response", "link"), interval = c("none", "confidence", "prediction"), dispersion = NULL, scale = NULL, na.action = na.pass, level = 0.95, ... )

Arguments

  • object: Regression object, class must include glm or lm.
  • newdata: Data frame including focal values for predictors
  • type: One of c("response", "link"), defaults to former.
  • interval: One of c("none", "confidence", "prediction"). "prediction" is defined only for lm objects, not for glm.
  • dispersion: Will be estimated if not provided. The variance coefficient of the glm, same as scale squared. Dispersion is allowed as an argument in predict.glm.
  • scale: The square root of dispersion. In an lm, this is the RMSE, called sigma in summary.lm.
  • na.action: What to do with missing values
  • level: Optional. Default = 0.95. Specify whatever confidence level one desires.
  • ...: Other arguments to be passed to predict

Returns

c(fit, lwr, upr), and possibly more.

Details

R's predict.glm does not have an interval argument. There are about 50 methods to calculate CIs for predicted values of GLMs, that's a major worry. This function takes the simplest route, calculating the (fit, lwr, upr) in the linear predictor scale, and then if type= "response", those 3 columns are put through linkinv(). This is the same method that SAS manuals suggest they use, same as Ben Bolker suggests in r-help (2010). I'd rather use one of the fancy tools like Edgeworth expansion, but that R code is not available (but is promised).

Use predict.lm with se.fit = TRUE to calculate fit and se.fit. Then calculate lwr and upr as fit +/- tval * se.fit. If model is lm, the model df.residual will be used to get tval. If glm, this is a normal approximation, so we thugishly assert tval = 1.98.

There's some confusing term translation. I wish R lm and glm would be brought into line. For lm, residual.scale = sigma. For glm, residual.scale = sqrt(dispersion)

  • Maintainer: Paul E. Johnson
  • License: GPL (>= 3.0)
  • Last published: 2022-08-06

Useful links