Today
- Multivariate IV: Two Stage Least Squares
- Testing IV Assumptions
Multiple Instruments
- Is there a multivariate version of IV like there is for regression?
- Yes, but now 2 types of multiple variables
- Multiple instruments \(Z\)
- Multiple endogenous regressors
- To start: multiple instruments
- Allows us to weaken random assignment of instrument to conditional random assignment
- Any case where control strategy will let us estimate causal effect of instrument, can do modified IV
- Deal with constant effects linear model case only
IV model with multiple instruments
- Let \(Z\) be an \(m+1\times 1\) vector of instruments
- Include constant term \(Z_0=1\)
- Regressor of interest, call it \(Y_2\), still endogenous scalar (correlated with residual)
- Allow first \(k+1<m+1\) instruments to affect \(Y_1\) directly
- These act as control variables
- Still need at least one excluded instrument with no direct effect
- Model becomes \[Y_{1i}=\beta_0+\beta_1Y_{2i} + \beta_2Z_{1i} + \beta_3Z_{2i} + ... + \beta_{k+1}Z_{ki} + u_{i}\]
- Endogeneity means \[E[Y_2u]\neq 0 \]
- For convenience call \(X=(1,Y_2,Z_1,\ldots,Z_k)^\prime\) \[Y_{1i}=X_i^{\prime}\beta+u_i\]
Interpretation: Causal Graph (code)
#Library to create and analyze causal graphs
library(dagitty)
library(ggdag) #library to plot causal graphs
#create graph
iv2graph<-dagify(Y1~Y2+Z1+W,Y2~Z1+Z2+W,Z2~Z1)
#Set position of nodes
coords<-list(x=c(Y2 = 0, W = 1, Y1 = 2, Z1=-1, Z2=-1),
y=c(Y2 = 0, W = 0.1, Y1 = 0, Z1=0.1,Z2=0))
coords_df<-coords2df(coords)
coordinates(iv2graph)<-coords2list(coords_df)
#Plot causal graph
ggdag(iv2graph)+theme_dag_blank()+labs(title="Multivariate IV")
Interpretation: Causal Graph
- Want effect of endogenous regressor \(Y2\) on outcome \(Y1\)
- Unobserved confounder \(W\) prevents using regression
- Even when adjusting for observed confounders \(Z1\)
- Excluded instruments \(Z2\) directly affect \(Y2\) but not \(Y1\)
- By adjusting for included instruments \(Z1\), effect of excluded instrument \(Z2\) on \(Y2\) can be estimated
- Effect of \(Z2\) on \(Y1\) can also be estimated by controlling for \(Z1\)
- Like univariate IV, but now conditional random assignment
- Can have multiple included instruments \(Z1\) or excluded instruments \(Z2\)
IV assumptions
- Exclusion restriction is now
- \(E[Z_ju]= 0\) for all instruments \(j=0\ldots m\)
- Let’s look at special case where there is exactly one instrument not included in the structural equation, \(m=k+1\)
- Model and exogeneity give us system of k+2 equations in k+2 unknowns \[E[Z_{ji}(Y_{1i}-\beta_0-\beta_1Y_{2i}-\beta_2Z_{1i} - ... - \beta_{k+1}Z_{ki})]=0\ \forall j=0\ldots m\]
- Estimate by sample means in place of expectations
- Multivariate IV estimator
- Exactly standard IV in case where k=0 (no controls)
- Relevance condition now says that this system has a unique solution
- Requires \(Z\) to be related to \(Y_2\)
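A minimal sketch of the just-identified estimator on simulated data. The data-generating process, variable names, and coefficient values are assumptions for illustration, not part of the lecture. Replacing expectations by sample means turns the \(k+2\) moment conditions into the linear system \((Z^{\prime}X)\beta=Z^{\prime}Y_1\), which can be solved directly:

```r
# Simulated example: one endogenous regressor y2, one included instrument z1,
# one excluded instrument z2 (so k = 1, m = 2). All values are illustrative.
set.seed(1)
n  <- 5000
w  <- rnorm(n)                       # unobserved confounder
z1 <- rnorm(n)                       # included instrument (control)
z2 <- 0.5 * z1 + rnorm(n)            # excluded instrument, related to z1
y2 <- 1 + 0.8 * z2 + 0.5 * z1 + w + rnorm(n)
y1 <- 2 + 1.5 * y2 - 1 * z1 + w + rnorm(n)   # true beta = (2, 1.5, -1)

X <- cbind(1, y2, z1)                # regressors in the structural equation
Z <- cbind(1, z1, z2)                # instruments, including the constant

# Sample analogue of E[Z (Y1 - X'beta)] = 0: solve (Z'X) beta = Z'y1
beta_iv  <- drop(solve(crossprod(Z, X), crossprod(Z, y1)))
beta_ols <- unname(coef(lm(y1 ~ y2 + z1)))

print(round(cbind(IV = beta_iv, OLS = beta_ols), 3))
```

Because `w` enters both `y2` and `y1`, the OLS coefficient on `y2` should sit above the true value 1.5 in this design, while the IV coefficient should be close to it.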
Many excluded instruments
- If \(m>k+1\), exclusion restrictions give more equations than unknowns
- Model said to be overidentified
- Can drop any subset of Zs to just use \(k+2\) of them
- So long as relevance still holds
- Or use any \(k+2\) linear combinations of \(Z\)
- Let \(\tilde{Z}\) be vector of \(k+2\) linear combinations of elements of \(Z\)
- Use subsets or weighted averages of instruments
- Then \[E[\tilde{Z}_i(Y_{1i}-X_i^{\prime}\beta)]=0\] holds
- Solving for \(\beta\) and replacing mean with sample average gives \(\hat{\beta}^{IV}\)
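Written out, the step from the moment condition to the estimator looks as follows, assuming the \((k+2)\times(k+2)\) matrix \(E[\tilde{Z}_iX_i^{\prime}]\) is invertible (the relevance condition for this choice of \(\tilde{Z}\)):

\[
\begin{aligned}
E[\tilde{Z}_i(Y_{1i}-X_i^{\prime}\beta)]=0
&\;\Longrightarrow\; E[\tilde{Z}_iX_i^{\prime}]\,\beta = E[\tilde{Z}_iY_{1i}] \\
&\;\Longrightarrow\; \beta = \left(E[\tilde{Z}_iX_i^{\prime}]\right)^{-1}E[\tilde{Z}_iY_{1i}], \\
\hat{\beta}^{IV} &= \left(\frac{1}{n}\sum_{i=1}^{n}\tilde{Z}_iX_i^{\prime}\right)^{-1}\frac{1}{n}\sum_{i=1}^{n}\tilde{Z}_iY_{1i}.
\end{aligned}
\]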
Example: Incarceration and crime
- Suppose we have \(m+1\) judges, and an indicator for each
- \(Y_1\) is recidivism, \(Y_2\) is sentence length
- With no controls, multivariate IV is just univariate IV with a particular pair of judges as IV
- Can use linear combinations to get more precise estimates: compare average of judges who give lots of time to average of those who give little
- With controls, get effect of incarceration on crime conditioning on (exogenous) characteristics
- Adding controls can also help ensure instrument uncorrelated with residual
- Include any omitted variable which is correlated with instrument and affects outcome
- E.g., if judges assigned randomly conditional on district, and district correlated with crime, include district as an included instrument
Finding the “right” linear combination
- Intuitively, by combining the multiple valid IV estimators, should get better estimate (at least if model right)
- Do this by choosing “right” linear combination of instruments
- Under some assumptions, there is a choice which gives smallest variance
- Called Two Stage Least Squares
- First, regress \(Y_2\) on \(Z\) to get predicted value \(\hat{Y}_2=\phi_0 +\phi_1Z_1+\phi_2 Z_2+\ldots+\phi_mZ_m\)
- Use this and \(Z_0\ldots Z_k\) as k+2 elements of \(\tilde{Z}\)
Implementing 2SLS
- Equivalent to regressing \(Y_2\) on \(Z\) to get predicted value \(\hat{Y}_2=\phi_0 +\phi_1Z_1+\phi_2 Z_2+\ldots+\phi_mZ_m\)
- Then replacing \(Y_2\) with \(\hat{Y}_2\) in structural function and running regression \[Y_{1i}=\beta_0+\beta_1\hat{Y}_{2i} + \beta_2Z_{1i} + \beta_3Z_{2i} + ... + \beta_{k+1}Z_{ki} + u_{i}\]
- Denote vector of regressors here \(\hat{X}_i\)
- Coefficient on \(\hat{Y}_{2i}\) is \(\hat{\beta}_1^{2SLS}\)
- Need at least one excluded instrument with non-zero first stage coefficient for \(\hat{Y}_{2i}\) to not be a linear combination of other regressors
- This is relevance or no multicollinearity condition
- Caveat: Don’t do this in R by running OLS twice!
- Why? Standard errors will be wrong
- Don’t take into account first stage uncertainty
- Use \(ivreg\) command in package \(AER\)
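To see what goes wrong with running OLS twice, here is a sketch on simulated data (all names and parameter values are assumptions for illustration). The two-step point estimates are the 2SLS estimates, but the standard errors reported by the second-stage `lm` are built from residuals \(Y_1-\hat{X}^{\prime}\hat{\beta}\), whereas the usual homoskedastic 2SLS formula uses residuals \(Y_1-X^{\prime}\hat{\beta}\) computed with the actual \(Y_2\):

```r
# Simulated data; all names and parameter values are illustrative assumptions.
set.seed(2)
n  <- 2000
w  <- rnorm(n)                                # unobserved confounder
z1 <- rnorm(n)                                # included instrument
z2 <- rnorm(n); z3 <- rnorm(n)                # two excluded instruments
y2 <- 0.5 * z1 + 0.6 * z2 + 0.4 * z3 + w + rnorm(n)
y1 <- 1 + 2 * y2 + 0.5 * z1 - w + rnorm(n)    # true coefficient on y2 is 2

# Stage 1: predicted value of y2 from ALL instruments
y2hat  <- fitted(lm(y2 ~ z1 + z2 + z3))
# Stage 2: replace y2 by its prediction
stage2 <- lm(y1 ~ y2hat + z1)
beta   <- coef(stage2)

# Naive SEs from stage 2 are based on residuals y1 - Xhat'beta
se_naive <- sqrt(diag(vcov(stage2)))

# Homoskedastic 2SLS SEs are based on residuals y1 - X'beta (actual y2)
X    <- cbind(1, y2, z1)
Xhat <- cbind(1, y2hat, z1)
u    <- drop(y1 - X %*% beta)
se_2sls <- sqrt(diag(sum(u^2) / (n - ncol(X)) * solve(crossprod(Xhat))))

print(round(rbind(estimate = beta, se_naive = se_naive, se_2sls = se_2sls), 4))
```

In this particular design the naive standard errors come out too large; in other designs they can be too small. In practice `ivreg` handles this, so the manual formula is only meant to show where the difference comes from.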
2SLS assumptions
- (2SLS1) Linear Model \[Y_{1i}=\beta_0+\beta_1Y_{2i} + \beta_2Z_{1i} + \beta_3Z_{2i} + ... + \beta_{k+1}Z_{ki} + u_{i}\]
- (2SLS2) Random sampling: \((Y_{1i},Y_{2i},Z_i)\) drawn i.i.d. from population satisfying linear model assumptions
- (2SLS3) Relevance
- Need at least as many instruments correlated with \(X\) as parameters
- (2SLS4) Exogeneity \[E[Z_{ji}u_i]=0\ \forall j=0\ldots m\]
- To perform inference, sometimes assume (2SLS5) Homoskedasticity \[E[u^2|Z]=\sigma^2\] (\(\sigma^2\) a finite nonzero constant)
Results
- Under (1-4), 2SLS consistent
- Relevance condition more complicated
- Necessary conditions:
- At least one instrument is not included as a control: \(m\geq k+1\)
- First stage regression of \(Y_2\) on \(Z\) has nonzero coefficient on an excluded instrument
- Otherwise it is a linear combination of included \(Z\), and so there is multicollinearity
- Under (1-4), asymptotically normal inference is possible
- Under (1-5), 2SLS is also choice of linear combination of instruments with smallest asymptotic variance
- As usual, can use Robust SEs if (5) fails
Example: Cigarette Demand with controls (Code)
#Load library containing IV command 'ivreg'
library(AER)
# Load data on cigarette prices and quantities
data("CigarettesSW")
#Use real prices as X
CigarettesSW$rprice <- with(CigarettesSW,
price/cpi)
#Use real sales tax per pack (total tax minus excise tax)
# as supply curve shifting instrument
CigarettesSW$tdiff <- with(CigarettesSW,
(taxs - tax)/cpi)
#data from different states in 1995
c1995 <- subset(CigarettesSW,
year == "1995")
Example: Cigarette Demand with controls
- Predict cigarette demand controlling for income
- Income may affect sales, but also state tax policy
- Use same data as last class
- \(Y_1\) log Quantity Demanded
- \(Y_2\) log Price
- \(Z_1\) (included) income level
- \(Z_2\) (excluded) state cigarette tax rates \[Y_1=\beta_0+\beta_1Y_2+\beta_2Z_1 + u\]
- To run 2SLS, use \(ivreg\) in \(AER\) library, with syntax
- ivreg(y1 ~ y2 + z1 + … + zk | z1 + … + zm)
Results (Code 1)
#To get IV estimate of effect of x on y using
# z as instrument syntax is
# ivreg(y1 ~ y2 + z1 + ... + zk | z1 + ... + zm)
# Effect of log(price) on log(quantity)
# controlling for income
# Elasticity
fm_ivreg <- ivreg(log(packs) ~ log(rprice) + income
| tdiff + income, data = c1995)
#Obtain (robust) standard errors
ivresults<-coeftest(fm_ivreg,
vcov = vcovHC(fm_ivreg, type = "HC0"))
Results (Code 2)
#Compare to simple IV (no controls),
#first stage, and reduced form
fm_ols<-lm(log(packs) ~ log(rprice) + income,
data=c1995)
fm_simpleIV<-ivreg(log(packs) ~ log(rprice)
| tdiff, data = c1995)
fm_firststage<-lm(log(rprice)~tdiff + income,
data = c1995)
fm_reducedform<-lm(log(packs)~tdiff + income,
data = c1995)
Results (Code 3)
#Obtain robust standard errors for each
fm_ols.coef<-coeftest(fm_ols,
vcov = vcovHC(fm_ols, type = "HC0"))
fm_simpleIV.coef<-coeftest(fm_simpleIV,
vcov = vcovHC(fm_simpleIV, type = "HC0"))
fm_firststage.coef<-coeftest(fm_firststage,
vcov = vcovHC(fm_firststage, type = "HC0"))
fm_reducedform.coef<-coeftest(fm_reducedform,
vcov = vcovHC(fm_reducedform, type = "HC0"))
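A short sketch of how the first stage and reduced form relate to the 2SLS estimate. With exactly one excluded instrument, the 2SLS coefficient on the endogenous regressor equals the ratio of the reduced-form coefficient to the first-stage coefficient on that instrument (indirect least squares). The block reloads the data so that it runs on its own; the object names `first_stage`, `reduced_form`, `tsls`, and `ils` are introduced here for illustration:

```r
library(AER)   # provides ivreg and the CigarettesSW data
data("CigarettesSW")
CigarettesSW$rprice <- with(CigarettesSW, price / cpi)
CigarettesSW$tdiff  <- with(CigarettesSW, (taxs - tax) / cpi)
c1995 <- subset(CigarettesSW, year == "1995")

first_stage  <- lm(log(rprice) ~ tdiff + income, data = c1995)
reduced_form <- lm(log(packs)  ~ tdiff + income, data = c1995)
tsls         <- ivreg(log(packs) ~ log(rprice) + income | tdiff + income,
                      data = c1995)

# Ratio of reduced-form to first-stage coefficient on the excluded instrument
ils <- coef(reduced_form)["tdiff"] / coef(first_stage)["tdiff"]
print(c(ratio = unname(ils), tsls = unname(coef(tsls)["log(rprice)"])))
```

The two printed numbers should agree up to numerical precision; with more than one excluded instrument there is no single such ratio, which is where the 2SLS weighting comes in.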
Results (Code 4)
library(stargazer)
#Columns in order: 2SLS, OLS, simple IV
stargazer(ivresults,fm_ols.coef,fm_simpleIV.coef,
type="html",header=FALSE,no.space=TRUE,
title="2SLS, OLS, Simple IV",
column.labels=c("2SLS","OLS","Simple IV"),
omit.table.layout="nl")
Results
2SLS, OLS, Simple IV

| | 2SLS | OLS | Simple IV |
|---|---|---|---|
| | (1) | (2) | (3) |
| log(rprice) | -0.919*** | -1.107*** | -1.084*** |
| | (0.312) | (0.187) | (0.312) |
| income | -0.000* | -0.000 | |
| | (0.000) | (0.000) | |
| Constant | 8.974*** | 9.865*** | 9.720*** |
| | (1.487) | (0.894) | (1.496) |
Multiple endogenous variables
Linear model with endogeneity \[Y_{1i}=\beta_0+\beta_1Y_{2i} + \beta_2Y_{3i} + \ldots + \beta_{\ell-1}Y_{\ell i} + \beta_{\ell}Z_{1i} + \ldots + \beta_{k+\ell-1}Z_{ki} + u_{i}\]
\(Y_{-1}=(Y_2\ldots Y_\ell)\) are endogenous regressors
\[E[Y_{-1i}u_i]\neq 0\]
- More compact notation
- \(X_i=(1,Y_{2i},\ldots,Y_{\ell i},Z_{1i},\ldots,Z_{ki})^{\prime}\)
- \(Y_{1i}=X_i^{\prime}\beta + u_{i}\)
- Instruments \(Z=(Z_0,Z_1,\ldots,Z_k,Z_{k+1},\ldots,Z_m)^{\prime}\) \[E[Z_{i}u_i]=0\]
- \((Z_0,Z_1,\ldots,Z_k)\) are exogenous regressors, or included instruments
- \((Z_{k+1},\ldots,Z_m)\) are excluded instruments
Alternate Representation of 2SLS
- Consider one endogenous regressor case
- Can interpret first stage as regressing ALL variables in \(X\) on \(Z\)
- \(Y_{2i}= Z_i^\prime\phi_{Y_2}+e_i\)
- \(Z_{0i}= Z_i^\prime\phi_{Z_0}+e_i\)
- \(\ldots\)
- \(Z_{ki}= Z_i^\prime\phi_{Z_k}+e_i\)
Then in second stage \(\hat{X}_i=(\hat{Z}_{0i},\hat{Y}_{2i},\hat{Z}_{1i},\ldots,\hat{Z}_{ki})\)
- Note that \(\hat{Z}_{ji}=Z_{ji}\) since \(Z\) predicts itself perfectly
- So this yields exactly the same second stage regressors \(\hat{X}_i\), and so the same estimator
We can do this also in multiple endogenous variables case
Full 2SLS
- First stage
- Regress each element of \(X_i\) on \(Z_i\) by OLS
- Compute predicted values \(\hat{X}_i=(\hat{Z}_{0i},\hat{Y}_{2i},\ldots,\hat{Y}_{\ell i},\hat{Z}_{1i},\ldots,\hat{Z}_{ki})^\prime\)
- Second stage
- Regress \(Y_{1i}\) on \(\hat{X}_i\) by OLS
- Need at least \(\ell-1\) excluded instruments with nonzero coefficients in at least some first stage regressions to avoid multicollinearity
- Second stage coefficients are Two Stage Least Squares estimator
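In matrix form the two stages collapse into one expression. The notation here is an assumption for illustration: \(X\) is the \(n\times(k+\ell)\) matrix with rows \(X_i^{\prime}\), \(Z\) the \(n\times(m+1)\) matrix with rows \(Z_i^{\prime}\), \(Y_1\) the \(n\times 1\) outcome vector, and \(P_Z\) the projection onto the columns of \(Z\):

\[
\hat{X} = Z(Z^{\prime}Z)^{-1}Z^{\prime}X = P_ZX,\qquad
\hat{\beta}^{2SLS} = (\hat{X}^{\prime}\hat{X})^{-1}\hat{X}^{\prime}Y_1 = (X^{\prime}P_ZX)^{-1}X^{\prime}P_ZY_1
\]

The last equality uses that \(P_Z\) is symmetric and idempotent, so \(\hat{X}^{\prime}\hat{X}=X^{\prime}P_ZX\). The inverse exists only if \(\hat{X}\) has full column rank, which is the no-multicollinearity requirement above.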
2SLS assumptions
- (2SLS1) Linear Model \[Y_{1i}=\beta_0+\beta_1Y_{2i} + \beta_2Y_{3i} + \ldots + \beta_{\ell-1}Y_{\ell i} + \beta_{\ell}Z_{1i} + \ldots + \beta_{k+\ell-1}Z_{ki} + u_{i}\]
- (2SLS2) Random sampling: \((Y_{1i},\ldots,Y_{\ell i},Z_i)\) drawn i.i.d. from population satisfying linear model assumptions
- (2SLS3) Relevance
- There are no exact linear relationships among the variables in \(\hat{X}_i=(1,\hat{Y}_{2i},\ldots,\hat{Y}_{\ell i},Z_{1i},\ldots,Z_{ki})\)
- (2SLS4) Exogeneity \[E[Z_{ji}u_i]=0\ \forall j=0\ldots m\]
- Sometimes also assume (2SLS5) Homoskedasticity \[E[u^2|Z]=\sigma^2\] (\(\sigma^2\) a finite nonzero constant)
Results
- Under (1-4), 2SLS consistent
- Relevance condition more complicated
- Necessary conditions:
- At least \(\ell-1\) instruments not included as a control: \(m\geq k+\ell-1\)
- First stage regressions of \(Y_{-1}\) on \(Z\) have nonzero coefficients on at least \(\ell-1\) excluded instruments
- Otherwise it is a linear combination of included \(Z\), and so there is multicollinearity in second stage
- Under (1-4), asymptotically normal inference is possible
- Under (1-5), 2SLS is also choice of linear combination of instruments with smallest asymptotic variance
- As usual, inference possible under heteroskedasticity (5 false)
- Unlike OLS, 2SLS not necessarily unbiased
2SLS inference
- Test values of individual coefficients using a Z test
- Test multiple coefficients with a Wald test
- Can also test some model assumptions (command is \(diagnostics=TRUE\) option in \(summary\) after \(ivreg\))
- Relevance
- Endogeneity
- Exclusion restrictions
Failure of relevance
- What happens if relevance, (IV3) \(Cov(Z,X)\neq 0\) or (2SLS3), fails or is close to failing?
- \(\hat{\beta}_1^{IV}\) gives division by 0, or close to it
- 2SLS estimator can’t solve system of equations
- Irrelevant instrument gives undefined limit
- \(Z\) just doesn’t have any effect
- If \(Z\) irrelevant or nearly so, IV divides by a quantity that is close to 0 and varies from sample to sample, so estimates can be huge and sometimes positive, sometimes negative
- Results in large standard errors and bias, finite-sample distribution with many outliers (infinite variance)
- If only one endogenous regressor, can test relevance using F test for the excluded instruments in first stage regression of \(X\) on \(Z\)
- Rule of thumb (Homoskedastic 1 variable case): F<10, estimate may be unreliable, even if n large
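A sketch of the first-stage F test for the excluded instruments, on simulated data with one strong and one weak design. All names and coefficient values are assumptions for illustration, and `first_stage_F` is a hypothetical helper. The F statistic compares the first stage with and without the excluded instruments, keeping the included instrument in both:

```r
# Simulated data; coefficient values are illustrative assumptions.
set.seed(3)
n  <- 500
w  <- rnorm(n)
z1 <- rnorm(n)                       # included instrument
z2 <- rnorm(n); z3 <- rnorm(n)       # excluded instruments

# Hypothetical helper: F test that the excluded instruments have zero
# coefficients in the first stage
first_stage_F <- function(y2) {
  unrestricted <- lm(y2 ~ z1 + z2 + z3)
  restricted   <- lm(y2 ~ z1)        # drops the excluded instruments
  anova(restricted, unrestricted)$F[2]
}

y2_strong <- 0.5 * z1 + 0.5 * z2 + 0.5 * z3 + w + rnorm(n)
y2_weak   <- 0.5 * z1 + 0.02 * z2 + 0.02 * z3 + w + rnorm(n)

F_strong <- first_stage_F(y2_strong)
F_weak   <- first_stage_F(y2_weak)
print(c(strong = F_strong, weak = F_weak))
```

Under these assumed coefficients the strong design should clear the F>10 rule of thumb comfortably and the weak design should not. This version of the F test assumes homoskedasticity, matching the case the rule of thumb is stated for.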
Testing endogeneity
- With # excluded instruments= # endogenous regressors, exclusion restriction not testable
- Regardless of true joint distribution of \((Y,Z)\), can always find \(\beta\) satisfying the moment conditions
- If we believe IV assumptions (exclusion and relevance), we can at least test whether IV is doing any good
- If \(E[Y_{-1}u]=0\), so \(Y_{-1}\) exogenous, IV and OLS will both give consistent estimates
- But IV estimator will have larger variance, since instrument uses only part of variation in \(Y_{-1}\) to find \(\beta\)
- (Durbin-Wu-)Hausman test uses difference between IV and OLS to test null that IV not needed against alternative that it is.
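One common way to carry out this comparison is the regression-based version of the test, sketched here on simulated data (names and values are assumptions for illustration). Add the first-stage residual to the OLS regression and test whether its coefficient is zero; a significant coefficient is evidence that \(Y_2\) is endogenous. The diagnostics reported by `ivreg` are closely related, but this block is not claimed to reproduce them exactly:

```r
# Simulated data; all names and values are illustrative assumptions.
set.seed(4)
n  <- 2000
w  <- rnorm(n)                                # unobserved confounder
z1 <- rnorm(n); z2 <- rnorm(n)                # included and excluded instrument
y2 <- 0.5 * z1 + 0.8 * z2 + w + rnorm(n)      # endogenous: shares w with y1
y1 <- 1 + 2 * y2 + 0.5 * z1 + w + rnorm(n)

vhat <- resid(lm(y2 ~ z1 + z2))               # first-stage residual
aug  <- lm(y1 ~ y2 + z1 + vhat)               # OLS augmented with residual

# t test on vhat: null hypothesis is that y2 is exogenous (IV not needed)
hausman_row <- summary(aug)$coefficients["vhat", ]
print(round(hausman_row, 4))
```

A side effect of this construction is that the coefficient on `y2` in the augmented regression coincides with the 2SLS estimate, so the test is effectively asking whether OLS and IV differ.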
Testing IV Assumptions: Exclusion
- With multiple excluded instruments, can compare IV estimate computed using different subsets of \(Z\)
- If they differ more than expected due to sampling variation, something in assumptions is wrong
- Either (at least one) exclusion restriction \(E[Z_j u]=0\) is wrong
- Or constant effects assumption is wrong
- In LATE setting, different IV’s will result in different groups of compliers
- 2SLS assumptions valid for linear model only if treatment effect on all subgroups identical
- Can test by Sargan (or J) test: essentially a Wald test of the exclusion restrictions
- Not rejecting does NOT mean model assumptions all valid
- Use \(diagnostics=TRUE\) option in \(summary\) after ivreg: gives Sargan test, Hausman test, and F test for weak instruments
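A sketch of the diagnostics call on the cigarette data. The Sargan test needs more excluded instruments than endogenous regressors, so this example adds the real excise tax `I(tax / cpi)` as a second excluded instrument; that choice, and the name `fm_over`, are assumptions for illustration rather than the lecture's specification:

```r
library(AER)   # provides ivreg and the CigarettesSW data
data("CigarettesSW")
CigarettesSW$rprice <- with(CigarettesSW, price / cpi)
CigarettesSW$tdiff  <- with(CigarettesSW, (taxs - tax) / cpi)
c1995 <- subset(CigarettesSW, year == "1995")

# Two excluded instruments (tdiff and real excise tax) for one endogenous
# regressor, so the model is overidentified and the Sargan test is available
fm_over <- ivreg(log(packs) ~ log(rprice) + income
                 | tdiff + I(tax / cpi) + income, data = c1995)

diag_out <- summary(fm_over, diagnostics = TRUE)
print(diag_out$diagnostics)   # weak instruments F, Wu-Hausman, Sargan
```

Each row reports the degrees of freedom, test statistic, and p-value for one test. As noted above, failing to reject in the Sargan row does not establish that the instruments are valid.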
Conclusions
- Multivariate IV model permits use of conditionally random instrument to generate conditionally random variation in endogenous regressor
- 2SLS can be used to estimate
- First stage OLS of \(X\) on \(Z\)
- Second stage OLS of \(Y_1\) on \(\hat{X}\)
- Requires exclusion and relevance, which can (sometimes) be tested
- Relevance condition testable by first stage F test
- Validity testable if more excluded instruments than endogenous variables
- Endogeneity testable if IV valid
- Next class
- More use cases for IV
- Begin Panel Data