Instrumental Variable Estimation with Selection on the exogenous Variables by Lasso
This function estimates the coefficient of an endogenous variable using instrumental variables in a setting where the exogenous variables are high-dimensional, so that selection on the exogenous variables is required. The function returns an object of class rlassoIVselectX.
rlassoIVselectX(x, ...)
## Default S3 method:
rlassoIVselectX(x, d, y, z, post = TRUE, ...)
## S3 method for class 'formula'
rlassoIVselectX(formula, data, post = TRUE, ...)
Arguments
x: exogenous variables in the structural equation (matrix)
...: arguments passed to the function rlasso
d: endogenous variables in the structural equation (vector or matrix)
y: outcome or dependent variable in the structural equation (vector or matrix)
z: set of potential instruments for the endogenous variables.
post: logical. If TRUE, post-lasso estimation is conducted.
formula: An object of class Formula of the form "y ~ x + d | x + z", with y the outcome variable, d the endogenous variable, z the instrumental variables, and x the exogenous variables.
data: An optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which rlassoIVselectX is called.
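As an illustration of how the arguments above fit together, here is a minimal sketch that calls the default method on simulated data. The data-generating process, the sample sizes and the object name fit are assumptions made for this example only; the argument names and the coefficients component are taken from this page.

```r
library(hdm)

set.seed(1)
n <- 200   # number of observations (illustrative choice)
p <- 50    # number of exogenous controls (illustrative choice)

x <- matrix(rnorm(n * p), n, p)   # exogenous variables (matrix)
z <- rnorm(n)                     # one potential instrument
u <- rnorm(n)                     # unobserved confounder that makes d endogenous

# Simulated structural equations (assumed for this sketch only)
d <- z + x[, 1] + u + rnorm(n)
y <- 0.5 * d + x[, 1] - x[, 2] + u + rnorm(n)

# Default S3 method with post-lasso refitting switched on
fit <- rlassoIVselectX(x = x, d = d, y = y, z = z, post = TRUE)

# Estimated coefficient on the endogenous variable
fit$coefficients
```

Further arguments supplied through ... are passed on to rlasso, so tuning options for the Lasso steps would be given in the same call.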
Returns
An object of class rlassoIVselectX containing at least the following components:
- coefficients: estimated parameter vector
The implementation is a special case of the approach in Chernozhukov et al. (2015). The option post=TRUE conducts post-lasso estimation for the Lasso estimations, i.e. a refit of the model with the selected variables. Exogenous variables x are automatically used as instruments and added to the instrument set z.
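The post-lasso idea mentioned above (select variables with Lasso, then refit using only the selected variables) can be sketched for a single regression as follows. This is an illustrative sketch on simulated data, not the internal code of rlassoIVselectX; the variable names and the data-generating process are assumptions, and the refit here uses lm on the columns flagged in the index component of the rlasso fit.

```r
library(hdm)

set.seed(2)
n <- 200   # number of observations (illustrative choice)
p <- 30    # number of candidate regressors (illustrative choice)

x <- matrix(rnorm(n * p), n, p)
colnames(x) <- paste0("x", seq_len(p))

# Only the first two regressors matter in this simulated example
y <- 2 * x[, 1] - 1.5 * x[, 2] + rnorm(n)

# Step 1: Lasso selection without the post-lasso refit
lasso_fit <- rlasso(x, y, post = FALSE)
selected <- which(lasso_fit$index)   # positions of the selected columns

# Step 2: refit by least squares on the selected columns only
x_sel <- x[, selected, drop = FALSE]
refit <- lm(y ~ x_sel)

coef(refit)
```

Setting post = TRUE asks rlasso to perform this kind of refit itself, which removes the shrinkage that the Lasso penalty imposes on the selected coefficients.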
Examples
library(hdm)
data(AJR); y = AJR$GDP; d = AJR$Exprop; z = AJR$logMort
x = model.matrix(~ -1 + (Latitude + Latitude2 + Africa + Asia + Namer + Samer)^2, data=AJR)
dim(x)
#AJR.Xselect = rlassoIV(x=x, d=d, y=y, z=z, select.X=TRUE, select.Z=FALSE)
AJR.Xselect = rlassoIV(GDP ~ Exprop + (Latitude + Latitude2 + Africa + Asia + Namer + Samer)^2 | logMort + (Latitude + Latitude2 + Africa + Asia + Namer + Samer)^2, data=AJR, select.X=TRUE, select.Z=FALSE)
summary(AJR.Xselect)
confint(AJR.Xselect)
References
Chernozhukov, V., Hansen, C. and M. Spindler (2015). Post-Selection and Post-Regularization Inference in Linear Models with Many Controls and Instruments. American Economic Review, Papers and Proceedings 105(5), 486-490.