irglm function

Fit a robust generalized linear model

Fit a robust GLM where the loss function is the composite function cfun∘dfun, that is, a concave function cfun applied to a convex component dfun.

## S3 method for class 'formula'
irglm(formula, data, weights, offset=NULL, contrasts=NULL, cfun="ccave", dfun=gaussian(), s=NULL, delta=0.1, fk=NULL, init.family=NULL, iter=10, reltol=1e-5, theta, x.keep=FALSE, y.keep=TRUE, trace=FALSE, ...)

Arguments

  • formula: symbolic description of the model, see details.

  • data: argument controlling formula processing via model.frame.

  • weights: optional numeric vector of weights.

  • x: input matrix, of dimension nobs x nvars; each row is an observation vector

  • y: response variable. Quantitative for dfun=1 and -1/1 for classification.

  • contrasts: the contrasts corresponding to levels from the respective models

  • offset: this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector of length equal to the number of cases. Currently only one offset term can be included in the formula.

  • cfun: character, type of convex cap (concave) function.

    Valid options are:

    • "hcave"
    • "acave"
    • "bcave"
    • "ccave"
    • "dcave"
    • "ecave"
    • "gcave"
    • "tcave"
  • dfun: character, type of convex component.

    Valid options are:

    • gaussian()
    • binomial()
    • poisson()
  • init.family: character value for the initial family, one of "clossR", "closs", "gloss", "qloss". It can be used to derive an initial estimator if the selection is different from the default value.

  • s: tuning parameter of cfun. s > 0, except that s can equal 0 for cfun="tcave". If s is too close to 0 for cfun="acave", "bcave" or "ccave", the calculated weights can become 0 for all observations, which crashes the program.

  • delta: a small positive number, provided by the user only if cfun="gcave" and 0 < s < 1

  • fk: predicted values at an iteration in the IRGLM algorithm

  • iter: number of iterations in the IRGLM algorithm

  • reltol: convergence criterion in the IRGLM algorithm

  • theta: an overdispersion scaling parameter for family=negbin()

  • x.keep, y.keep: logical values indicating whether the model matrix and the response vector used in the fitting process should be returned as components of the returned value. x is a design matrix of dimension n * p, and y is a vector of observations of length n.

  • trace: if TRUE, fitting progress is reported

  • ...: other arguments passed to irglm
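
The warning about small values of s can be illustrated in base R. This is a sketch under assumptions, not the package's code: the weight function exp(-z / s^2) and the loss values z are hypothetical, chosen only to show how weights of this exponential type underflow to exactly 0 in double precision once s is very small.

```r
# Hypothetical per-observation losses (not taken from any fitted model).
z <- c(0.02, 0.5, 2, 50)

# Assumed exponential-type weight function, for illustration only;
# it is not claimed to match the package's definition of any cfun.
weight_fun <- function(z, s) exp(-z / s^2)

w_moderate <- weight_fun(z, s = 1)      # weights stay strictly positive
w_tiny     <- weight_fun(z, s = 0.001)  # z / s^2 is huge, exp() underflows

print(w_moderate)
print(all(w_tiny == 0))  # every observation ends up with zero weight
```

With all weights equal to 0, a weighted fit has no information left to work with, which is consistent with the failure described for s.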

Details

A robust linear, logistic or Poisson regression model is fit by the iteratively reweighted GLM (IRGLM) algorithm. The output weights_update is a useful diagnostic of the outlier status of the observations.
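
The reweighting idea can be sketched in base R. This is a minimal illustration, not the package's implementation. It assumes a Gaussian loss z = (y - f)^2 / 2 and an exponential-type concave function g(z) = s^2 * (1 - exp(-z / s^2)), chosen for illustration, so that each weight is g'(z) = exp(-z / s^2). The simulated data, the value of s and the stopping rule are also assumptions.

```r
# Simulated data with five contaminated responses (hypothetical example).
set.seed(1)
n <- 100
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n, sd = 0.5)
y[1:5] <- y[1:5] + 20

s <- 1           # assumed tuning parameter
w <- rep(1, n)   # start from equal weights

for (k in 1:10) {
  fit <- lm(y ~ x, weights = w)        # weighted fit with current weights
  z <- (y - fitted(fit))^2 / 2         # Gaussian loss per observation
  w_new <- exp(-z / s^2)               # derivative of the assumed concave g
  if (max(abs(w_new - w)) < 1e-5) break  # simple stopping rule (assumption)
  w <- w_new
}

print(coef(fit))
print(round(w[1:5], 4))  # final weights of the contaminated observations
```

Observations with a large loss receive a weight near 0 and stop influencing the next weighted fit, which is why the final weights are informative about outliers.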

Returns

An object with S3 class "irglm", "glm" for various types of models.

  • call: the call that produced the model fit

  • weights: original weights used in the model

  • weights_update: weights in the final iteration of the IRGLM algorithm

  • cfun, s, dfun: original input arguments

  • is.offset: is offset used?
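
One way to use weights_update as an outlier diagnostic is sketched below. The helper flag_outliers and its cutoff of 0.1 are hypothetical and not part of the package; the sketch also assumes that irglm is available from the mpath package and that the final weights lie between 0 and 1 for cfun="ccave". The fitting step is skipped when the package is not installed.

```r
# Hypothetical helper: indices of observations whose final weight is small.
flag_outliers <- function(w, cutoff = 0.1) which(w < cutoff)

# The helper on made-up weights: the second and fourth are below the cutoff.
print(flag_outliers(c(0.9, 0.01, 0.5, 0.05)))

# Assumes irglm() is provided by the mpath package.
if (requireNamespace("mpath", quietly = TRUE)) {
  set.seed(1)
  x <- matrix(rnorm(100 * 3), 100, 3)
  y <- drop(x %*% c(1, -1, 0.5)) + rnorm(100)
  y[1:3] <- y[1:3] + 15                 # contaminate three responses
  dat <- data.frame(y = y, x)
  fit <- mpath::irglm(y ~ ., data = dat, s = 1, cfun = "ccave",
                      dfun = gaussian())
  print(flag_outliers(fit$weights_update))
}
```

The cutoff is a judgment call; inspecting a sorted plot of the weights is a reasonable alternative to a fixed threshold.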

References

Zhu Wang (2024) Unified Robust Estimation, Australian & New Zealand Journal of Statistics. 66(1):77-102.

Author(s)

Zhu Wang zwang145@uthsc.edu

See Also

print, predict, coef.

Examples

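library(mpath)  # irglm() is provided by the mpath package
# simulate 100 observations with 20 predictors and a -1/1 response
x=matrix(rnorm(100*20),100,20)
g2=sample(c(-1,1),100,replace=TRUE)
# robust fit with the "ccave" concave function and Gaussian convex component
fit=irglm(g2~x,data=data.frame(cbind(x, g2)), s=1, cfun="ccave", dfun=gaussian())
# weights from the final IRGLM iteration
fit$weights_update
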
  • Maintainer: Zhu Wang
  • License: GPL-2
  • Last published: 2024-06-27