fit a GLM with lasso (or elastic net), snet or mnet regularization
Fit a generalized linear model via penalized maximum likelihood. The regularization path is computed for the lasso (or elastic net) penalty, the scad (or snet) penalty and the mcp (or mnet) penalty, at a grid of values for the regularization parameter lambda. Fits linear, logistic, Poisson and negative binomial (fixed scale parameter) regression models.
## S3 method for class 'formula'
glmreg(formula, data, weights, offset=NULL, contrasts=NULL,
       x.keep=FALSE, y.keep=TRUE, ...)

## S3 method for class 'matrix'
glmreg(x, y, weights, offset=NULL, ...)

## Default S3 method:
glmreg(x, ...)
Arguments
formula: symbolic description of the model, see details.
data: argument controlling formula processing via model.frame.
weights: optional numeric vector of weights. If standardize=TRUE, weights are renormalized to weights/sum(weights). If standardize=FALSE, weights are kept as supplied.
offset: this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector of length equal to the number of cases. Currently only one offset term can be included in the formula.
x: input matrix, of dimension nobs x nvars; each row is an observation vector
y: response variable. Quantitative for family="gaussian". Non-negative counts for family="poisson" or family="negbin". For family="binomial" should be either a factor with two levels or a vector of proportions.
contrasts: the contrasts corresponding to levels from the respective models
...: other arguments passed to glmreg_fit
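As a rough illustration of the arguments above, the sketch below first shows the weight renormalization described under weights using base R only, then calls the formula and matrix interfaces on simulated data. The data and variable names are made up for illustration, and the calls assume that family is forwarded through ... to glmreg_fit and that glmreg is provided by the mpath package; they are skipped if that package is not installed.

```r
# Weight renormalization as described for standardize=TRUE (base R only).
w <- c(1, 2, 2, 5)
w_norm <- w / sum(w)   # renormalized weights sum to 1

# Simulated count data (hypothetical; not from the package).
set.seed(1)
n <- 50
x <- matrix(rnorm(n * 3), n, 3, dimnames = list(NULL, c("x1", "x2", "x3")))
y <- rpois(n, lambda = exp(0.3 * x[, 1]))
dat <- data.frame(y = y, x)

# Assumption: glmreg comes from the mpath package and 'family' is passed
# through ... to glmreg_fit. Skipped when mpath is not available.
if (requireNamespace("mpath", quietly = TRUE)) {
  # Formula interface
  fit_f <- mpath::glmreg(y ~ x1 + x2 + x3, data = dat, family = "poisson")
  # Matrix interface
  fit_m <- mpath::glmreg(x, y, family = "poisson")
  print(dim(fit_f$beta))   # nvars x length(lambda)
}
```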
Details
The sequence of models implied by lambda is fit by coordinate descent. For family="gaussian" this is the lasso, mcp or scad sequence if alpha=1, else it is the enet, mnet or snet sequence. For the other families, this is a lasso (mcp, scad) or elastic net (mnet, snet) regularization path for the generalized linear model, obtained by maximizing the appropriate penalized log-likelihood. Note that the objective function for "gaussian" is
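$$\frac{1}{2}\,\mathrm{weights}\cdot\mathrm{RSS} + \lambda\cdot\mathrm{penalty},$$
if standardize=FALSE and
$$\frac{1}{2}\,\frac{\mathrm{weights}}{\sum(\mathrm{weights})}\cdot\mathrm{RSS} + \lambda\cdot\mathrm{penalty},$$
if standardize=TRUE. For the other models it is
$$-\sum(\mathrm{weights}\cdot\mathrm{loglik}) + \lambda\cdot\mathrm{penalty}$$
if standardize=FALSE and
$$-\frac{\mathrm{weights}}{\sum(\mathrm{weights})}\cdot\mathrm{loglik} + \lambda\cdot\mathrm{penalty}$$
if standardize=TRUE.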
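To make the two scalings concrete, here is a minimal base R sketch of the Gaussian objective under both settings, together with a textbook coordinate descent loop for the elastic net case. It reads weights∗RSS as the weighted sum of squared residuals and assumes the penalty has the elastic net form alpha*sum(|beta|) + (1-alpha)/2*sum(beta^2); neither is spelled out above. The helpers enet_penalty, gaussian_objective, soft_threshold and cd_enet are hypothetical names, and the loop is the standard soft-thresholding update, not necessarily the package's own implementation (which also covers mcp and scad).

```r
# Hypothetical helpers for illustration; not part of the package.
# Assumed elastic net penalty: alpha * sum|b| + (1 - alpha)/2 * sum(b^2).
enet_penalty <- function(beta, alpha = 1) {
  alpha * sum(abs(beta)) + (1 - alpha) / 2 * sum(beta^2)
}

# Gaussian objective, reading "weights * RSS" as the weighted sum of
# squared residuals (an assumption about the notation).
gaussian_objective <- function(x, y, b0, beta, lambda, weights,
                               alpha = 1, standardize = TRUE) {
  if (standardize) weights <- weights / sum(weights)
  res <- y - (b0 + drop(x %*% beta))
  0.5 * sum(weights * res^2) + lambda * enet_penalty(beta, alpha)
}

soft_threshold <- function(z, t) sign(z) * pmax(abs(z) - t, 0)

# Cyclic coordinate descent for the standardize=TRUE objective above.
cd_enet <- function(x, y, lambda, weights, alpha = 1, n_iter = 200) {
  w <- weights / sum(weights)
  beta <- rep(0, ncol(x))
  b0 <- sum(w * y)
  for (it in seq_len(n_iter)) {
    for (j in seq_len(ncol(x))) {
      # Partial residual with coordinate j left out
      r_j <- y - b0 - drop(x[, -j, drop = FALSE] %*% beta[-j])
      z <- sum(w * x[, j] * r_j)
      beta[j] <- soft_threshold(z, lambda * alpha) /
        (sum(w * x[, j]^2) + lambda * (1 - alpha))
    }
    b0 <- sum(w * (y - drop(x %*% beta)))   # weights sum to 1
  }
  list(b0 = b0, beta = beta)
}

set.seed(1)
x <- matrix(rnorm(40), 20, 2)
y <- drop(1 + x %*% c(2, 0) + rnorm(20, sd = 0.1))
w <- rep(1, 20)

obj_std <- gaussian_objective(x, y, 1, c(2, 0), lambda = 0.1, weights = w)
obj_raw <- gaussian_objective(x, y, 1, c(2, 0), lambda = 0.1, weights = w,
                              standardize = FALSE)

fit <- cd_enet(x, y, lambda = 0.1, weights = w)
obj_fit  <- gaussian_objective(x, y, fit$b0, fit$beta, 0.1, w)
obj_null <- gaussian_objective(x, y, sum(w * y) / sum(w), c(0, 0), 0.1, w)
```

With unit weights the loss term under standardize=FALSE is n times the loss term under standardize=TRUE, so the same lambda penalizes relatively less in the unstandardized form. Each coordinate update minimizes the objective in one coordinate, so obj_fit should not exceed obj_null.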
Returns
An object with S3 class "glmreg" for the various types of models.
call: the call that produced this object
b0: Intercept sequence of length length(lambda)
beta: An nvars x length(lambda) matrix of coefficients.
lambda: The actual sequence of lambda values used
offset: the offset vector used.
resdev: The computed deviance (for "gaussian", this is the R-square). The deviance calculations incorporate weights if present in the model. The deviance is defined to be 2*(loglike_sat - loglike), where loglike_sat is the log-likelihood for the saturated model (a model with a free parameter per observation).
nulldev: Null deviance (per observation). This is defined to be 2*(loglike_sat - loglike(Null)), where the null model is the intercept-only model.
nobs: number of observations
pll: penalized log-likelihood values for standardized coefficients in the IRLS iterations. For family="gaussian", not implemented yet.
pllres: penalized log-likelihood value for the estimated model on the original scale of coefficients
fitted.values: the fitted mean values, obtained by transforming the linear predictors by the inverse of the link function.
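The deviance definition used for resdev and nulldev can be checked by hand. The sketch below does so for an unpenalized Poisson fit from base R's glm, which stands in for a fitted model purely for illustration: glmreg itself is not called, and any per-observation scaling it applies is not reproduced. It also shows fitted values as the inverse link applied to the linear predictor. The simulated data are an assumption.

```r
set.seed(2)
n <- 100
x <- rnorm(n)
y <- rpois(n, lambda = exp(0.5 + 0.4 * x))

fit  <- glm(y ~ x, family = poisson)   # unpenalized stand-in for a fitted model
null <- glm(y ~ 1, family = poisson)   # intercept-only (null) model

# Log-likelihoods; the saturated model sets each mean equal to its observation.
loglik_sat  <- sum(dpois(y, lambda = y, log = TRUE))
loglik_fit  <- sum(dpois(y, lambda = fitted(fit), log = TRUE))
loglik_null <- sum(dpois(y, lambda = fitted(null), log = TRUE))

# Deviance = 2 * (loglike_sat - loglike)
dev_fit  <- 2 * (loglik_sat - loglik_fit)
dev_null <- 2 * (loglik_sat - loglik_null)

# Fitted means: inverse link (exp for the log link) of the linear predictor.
mu_hat <- exp(predict(fit, type = "link"))
```

Here dev_fit and dev_null should agree with deviance(fit) and fit$null.deviance up to numerical tolerance, and mu_hat with fitted(fit).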
References
Breheny, P. and Huang, J. (2011) Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection. Ann. Appl. Statist., 5 : 232-253.
Wang, Z., Ma, S., Zappitelli, M., Parikh, C., Wang, C.-Y. and Devarajan, P. (2014) Penalized count data regression with application to hospital stay after pediatric cardiac surgery. Statistical Methods in Medical Research. Epub ahead of print, 17 April 2014.