Two-Stage Least Squares (2SLS) Instrumental Variable Regression
Performs a two-stage least squares (2SLS) regression on a single equation with endogenous regressors Y and exogenous regressors X on the right-hand side. The endogenous regressors Y are named in endog; all remaining regressors X are assumed to be exogenous and are therefore automatically included in the instrument set in the first stage of the 2SLS. Do not list these variables in the iv argument: iv takes only instrumental variables that do not appear in the equation under consideration.
ivr(formula, data = list(), endog, iv, contrasts = NULL, details = FALSE, ...)
Arguments
formula: model formula.
data: name of the data frame used. Must be specified if the variables are not stored in the environment.
endog: character vector of endogenous (to be instrumented) regressors.
iv: character vector of predetermined/exogenous instrumental variables NOT already included in the model formula.
contrasts: an optional list. See the contrasts.arg of model.matrix.default.
details: logical value indicating whether details should be printed out by default.
...: further arguments that lm.fit() understands.
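The relationship between endog, iv and the remaining regressors can be illustrated with a minimal base-R sketch on simulated data. It does not call ivr() and is not necessarily how ivr() is implemented internally; all variable names (z, x, y2, y1) and the simulated coefficients are assumptions made for illustration only.

```r
# Base-R sketch of 2SLS on simulated data (all names are hypothetical).
set.seed(1)
n  <- 500
z  <- rnorm(n)                                # outside instrument (would go in `iv`)
x  <- rnorm(n)                                # exogenous regressor in the equation
u  <- rnorm(n)                                # structural error
y2 <- 0.8 * z + 0.5 * x + 0.6 * u + rnorm(n)  # endogenous regressor (would go in `endog`)
y1 <- 1 + 2 * y2 - 1 * x + u                  # response; true coefficient of y2 is 2

# First stage: regress the endogenous regressor on ALL exogenous variables,
# i.e. the exogenous regressor x from the equation plus the outside instrument z.
stage1 <- lm(y2 ~ x + z)
y2_hat <- fitted(stage1)

# Second stage: replace the endogenous regressor by its first-stage fit.
stage2 <- lm(y1 ~ y2_hat + x)

# Closed-form 2SLS estimator for comparison:
# b = (X' P_Z X)^(-1) X' P_Z y, with P_Z the projection onto the instrument set.
X  <- cbind(1, y2, x)                         # regressors of the structural equation
Z  <- cbind(1, x, z)                          # full instrument set (constant, x, z)
PZ <- Z %*% solve(crossprod(Z), t(Z))
b_2sls <- solve(t(X) %*% PZ %*% X, t(X) %*% PZ %*% y1)

# OLS on the structural equation, for comparison with the 2SLS estimates.
b_ols <- coef(lm(y1 ~ y2 + x))

print(cbind(two_step = coef(stage2), closed_form = as.vector(b_2sls), ols = b_ols))
```

In this sketch x enters the first stage without being listed as an outside instrument, which mirrors the rule that only z would be passed to iv.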
Returns
A list object including:
adj.r.squ
adjusted coefficient of determination (adj. R-squared).
coefficients
IV-estimators of model parameters.
data/model
matrix of the variables' data used.
data.name
name of the data frame used.
df
degrees of freedom in the model (number of observations minus rank).
exogenous
exogenous regressors.
f.hausman
exogeneity test: F-value for simultaneous significance of all instrument parameters. If H0: "Regressors specified in endog are exogenous" is rejected, using IV regression instead of OLS can be justified.
f.instr
weak instrument test: F-value for significance of instrument parameter in first stage of 2SLS regression. If H0: "Instrument is weak" is rejected, instruments are usually considered sufficiently strong.
fitted.values
fitted values of the IV-regression.
fsd
first stage diagnostics (weakness of instruments).
has.const
logical value indicating whether model has a constant (internal purposes).
instrumented
names of the instrumented regressors.
instruments
names of the instruments.
model.matrix
the model (design) matrix.
ncoef
integer, giving the rank of the model (number of coefficients estimated).
nobs
number of observations.
p.hausman
corresponding p-value of the exogeneity test.
p.instr
corresponding p-value of the weak instruments test.
p.values
vector of p-values of single parameter significance tests.
r.squ
coefficient of determination (R-squared).
residuals
residuals in the IV-regression.
response
the endogenous (response) variable.
shea
Shea's partial R-squared quantifying the ability to explain the endogenous regressors.
sig.squ
estimated error variance (sigma-squared).
ssr
sum of squared residuals.
std.err
vector of standard errors of the parameter estimators.
t.values
vector of t-values of single parameter significance tests.
ucov
the (unscaled) variance-covariance matrix of the model's estimators.
vcov
the (scaled) variance-covariance matrix of the model's estimators.
modform
the model's regression R-formula.
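The two diagnostics listed above (weak instrument test and exogeneity test) can be approximated by hand in base R. The following is a sketch under stated assumptions: the data are simulated, the object names f_instr, p_instr, f_hausman and p_hausman merely echo the return-value names, and the regression-based form of the exogeneity test used here may differ from what ivr() computes.

```r
# Simulated data (hypothetical names and coefficients).
set.seed(2)
n  <- 500
z1 <- rnorm(n)                                # outside instrument 1
z2 <- rnorm(n)                                # outside instrument 2
x  <- rnorm(n)                                # exogenous regressor
u  <- rnorm(n)                                # structural error
y2 <- 0.7 * z1 + 0.4 * z2 + 0.5 * x + 0.6 * u + rnorm(n)  # endogenous regressor
y1 <- 1 + 2 * y2 - x + u                      # response

# Weak instrument check: F-test of the outside instruments in the first stage.
# The restricted model omits z1 and z2; the unrestricted model includes them.
fs_restricted   <- lm(y2 ~ x)
fs_unrestricted <- lm(y2 ~ x + z1 + z2)
weak_test <- anova(fs_restricted, fs_unrestricted)
f_instr <- weak_test$F[2]
p_instr <- weak_test$`Pr(>F)`[2]

# Regression-based (Wu-Hausman type) exogeneity check: add the first-stage
# residuals to the structural equation and test their significance.
v_hat <- resid(fs_unrestricted)
ols_fit <- lm(y1 ~ y2 + x)
aug_fit <- lm(y1 ~ y2 + x + v_hat)
hausman_test <- anova(ols_fit, aug_fit)
f_hausman <- hausman_test$F[2]
p_hausman <- hausman_test$`Pr(>F)`[2]

print(c(f_instr = f_instr, p_instr = p_instr,
        f_hausman = f_hausman, p_hausman = p_hausman))
```

A large first-stage F-value speaks against weak instruments, and a small p-value in the second test speaks against exogeneity of the instrumented regressor.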
Examples
## Numerical Illustration 20.1 in Auer (2023)
ivr(contr ~ score, endog = "score", iv = "contrprev", data = data.insurance, details = TRUE)

## Replicating an example of Ani Katchova (econometric academy)
## (https://www.youtube.com/watch?v=lm3UvcDa2Hc)
## on U.S. Women's Labor-Force Participation (data from Wooldridge 2013)
if (requireNamespace("wooldridge", quietly = TRUE)) {
  library("wooldridge")
  data(mroz)

  # Select only working women
  mroz = mroz[mroz$"inlf" == 1, ]
  mroz = mroz[, c("lwage", "educ", "exper", "expersq", "fatheduc", "motheduc")]
  attach(mroz)

  # Regular ols of lwage on educ, where educ is suspected to be endogenous
  # hence estimators are biased
  ols(lwage ~ educ, data = mroz)

  # Manual calculation of ols coeff
  Sxy(educ, lwage)/Sxy(educ)

  # Manual calculation of iv regression coeff
  # with fatheduc as instrument for educ
  Sxy(fatheduc, lwage)/Sxy(fatheduc, educ)

  # Calculation with 2SLS
  educ_hat = ols(educ ~ fatheduc)$fitted
  ols(lwage ~ educ_hat)

  # Verify that educ_hat is completely determined by values of fatheduc
  head(cbind(educ, fatheduc, educ_hat), 10)

  # Calculation with ivr()
  ivr(lwage ~ educ, endog = "educ", iv = "fatheduc", data = mroz, details = TRUE)

  # Multiple regression model with 1 endogenous regressor (educ)
  # and two exogenous regressors (exper, expersq)

  # Biased ols estimation
  ols(lwage ~ educ + exper + expersq, data = mroz)

  # Unbiased 2SLS estimation with fatheduc and motheduc as instruments
  # for the endogenous regressor educ
  ivr(lwage ~ educ + exper + expersq, endog = "educ", iv = c("fatheduc", "motheduc"), data = mroz)

  # Manual 2SLS
  # First stage: Regress endog. regressor on all exogen. regressors
  # and instruments -> get exogenous part of educ
  stage1.mod = ols(educ ~ exper + expersq + fatheduc + motheduc)
  educ_hat = stage1.mod$fitted

  # Second stage: Replace endog regressor with predicted value educ_hat
  # See the uncorrected standard errors!
  stage2.mod = ols(lwage ~ educ_hat + exper + expersq, data = mroz)

  ## Simple test for endogeneity of educ:
  ## Include endogenous part of educ into model and see if it is signif.
  ## (is signif. at 10% level)
  uhat = ols(educ ~ exper + expersq + fatheduc + motheduc)$resid
  ols(lwage ~ educ + exper + expersq + uhat)

  detach(mroz)
} else {
  message("Package 'wooldridge' not available.")
}
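The manual second stage above reports uncorrected standard errors because its residuals are computed with the fitted regressor instead of the original one. A minimal base-R sketch of the usual correction follows. It uses simulated data with hypothetical names, and the degrees-of-freedom choice (n minus number of coefficients) is an assumption, so the result need not match the std.err component returned by ivr() exactly.

```r
# Simulated data (hypothetical names and coefficients).
set.seed(3)
n  <- 500
z  <- rnorm(n)                                # outside instrument
x  <- rnorm(n)                                # exogenous regressor
u  <- rnorm(n)                                # structural error with variance 1
y2 <- 0.8 * z + 0.5 * x + 0.6 * u + rnorm(n)  # endogenous regressor
y1 <- 1 + 2 * y2 - x + u                      # response

# Manual two-step estimation.
y2_hat <- fitted(lm(y2 ~ x + z))
stage2 <- lm(y1 ~ y2_hat + x)
b      <- coef(stage2)

# Naive standard errors: based on the residuals y1 - Xhat %*% b.
se_naive <- sqrt(diag(vcov(stage2)))

# Corrected standard errors: the residuals use the ORIGINAL regressor y2,
# while the cross-product matrix still uses the fitted regressor y2_hat.
X      <- cbind(1, y2, x)
Xhat   <- cbind(1, y2_hat, x)
e      <- as.vector(y1 - X %*% b)
sigma2 <- sum(e^2) / (n - ncol(X))            # assumed df correction: n - k
se_2sls <- sqrt(diag(sigma2 * solve(crossprod(Xhat))))

print(cbind(estimate = b, se_naive = se_naive, se_2sls = se_2sls))
```

The point estimates are identical in both cases; only the estimate of the error variance, and hence the scaling of the standard errors, changes.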
References
Wooldridge, J.M. (2013): Introductory Econometrics: A Modern Approach, 5th Edition, Cengage Learning. Datasets available for download at Cengage Learning.