Today

Inference with minimal assumptions

  1. In the population, \(y=\beta_0+\beta_{1}x_{1}+\beta_{2}x_{2}+\ldots+\beta_{k}x_{k}+u\)
  2. \(\{(y_i,\mathbf{x}_i^\prime):i=1 \ldots n\}\) are an independent random sample of observations following 1
  3. There are no exact linear relationships among the variables \(x_0 \ldots x_k\)

Heteroskedasticity

Data exhibiting linearity but heteroskedasticity (code)

#Generate a data set
x<-runif(1000, min=1, max=7)
u<-rnorm(1000)*(4*x) #u is a function of x
y<-1+4*x+u
#Fit linear regression
hetreg<-lm(y ~ x)
#Plot points and OLS best fit line
plot(x,y,xlab = "x", ylab = "y",
     main = "Heteroskedastic Linear Relationship")
abline(hetreg, col = "blue", lwd=2)

Data exhibiting linearity but heteroskedasticity
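
As a rough numerical companion to the plot, the sketch below regenerates data from the same design and compares the spread of the OLS residuals for low and high values of x. The seed, the cutoff at x = 4, and the names `res`, `sd_low`, and `sd_high` are illustrative assumptions, not part of the original slides.

```r
# Illustrative check (assumed seed and cutoff): residual spread grows with x
set.seed(1)
x <- runif(1000, min = 1, max = 7)
u <- rnorm(1000) * (4 * x)        # error standard deviation is 4*x
y <- 1 + 4 * x + u
fit <- lm(y ~ x)

res <- residuals(fit)
sd_low  <- sd(res[x < 4])         # residual spread where x is small
sd_high <- sd(res[x >= 4])        # residual spread where x is large
print(c(sd_low = sd_low, sd_high = sd_high))

# Residual plot: a fan shape widening in x indicates heteroskedasticity
plot(x, res, xlab = "x", ylab = "OLS residual",
     main = "Residuals vs. x")
abline(h = 0, col = "blue", lwd = 2)
```

Under homoskedasticity the two standard deviations would be roughly equal; here the design makes the error variance increase with x, so the second should be clearly larger.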

Data exhibiting nonlinear conditional expectation (code)

#Generate a data set
x<-runif(1000, min=1, max=7)
u<-6*rnorm(1000) #u is not a function of x
y<-1+4*x+5*cos(3*x)+u
#Fit linear regression
hetreg2<-lm(y ~ x)
#Plot points and OLS best fit line
plot(x,y,xlab = "x", ylab = "y",
     main = "Misspecified Relationship")
abline(hetreg2, col = "blue", lwd=2)
curve(1+4*x+5*cos(3*x),add=TRUE,col="red",lwd=2)

Data exhibiting nonlinear conditional expectation
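
One way to see the misspecification numerically is to compare the linear fit with a regression that also includes the omitted term. The sketch below assumes we know the true form `cos(3*x)`, which is only possible because the data are simulated; the seed and the names `fit_linear` and `fit_full` are hypothetical.

```r
# Illustrative comparison (assumed seed): linear fit vs. fit with the cos(3x) term
set.seed(2)
x <- runif(1000, min = 1, max = 7)
u <- 6 * rnorm(1000)                       # homoskedastic error
y <- 1 + 4 * x + 5 * cos(3 * x) + u

fit_linear <- lm(y ~ x)                    # misspecified: omits cos(3x)
fit_full   <- lm(y ~ x + cos(3 * x))       # matches the simulated conditional mean

print(coef(fit_full))                      # cos coefficient should be near 5
print(c(r2_linear = summary(fit_linear)$r.squared,
        r2_full   = summary(fit_full)$r.squared))

# Residuals of the linear fit retain the omitted wave pattern in x
plot(x, residuals(fit_linear), xlab = "x", ylab = "Residual",
     main = "Linear fit residuals")
curve(5 * cos(3 * x), add = TRUE, col = "red", lwd = 2)
```

The linear fit is still the best linear approximation to the conditional expectation; the residual plot simply shows the part of the mean it cannot capture.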

Implications of heteroskedasticity

Limit Distribution under Heteroskedasticity: Univariate Case

Eicker-Huber-White Robust Standard Errors

Wage predictions with and without robust SEs (Code 1)

# Load library for reading Stata (.dta) files such as the textbook data sets
library(foreign) 
#Load library to make pretty table
library(stargazer)
# Import data set of education and wages
wage1<-read.dta(
  "http://fmwww.bc.edu/ec-p/data/wooldridge/wage1.dta")
# Regress log wage on years of education and experience
wageregression2 <- lm(formula = lwage ~ educ + 
                  exper, data = wage1)
#Load library to calculate Robust SEs
library(sandwich)
# Calculate heteroskedasticity-consistent (HC) estimate 
Sigmahat<-vcovHC(wageregression2,type="HC0") 
#type chooses the finite-sample scaling: HC0 applies no degrees of
#freedom correction, HC1-HC3 do; exact choice doesn't matter in large samples
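
What the `HC0` estimate computes can be sketched by hand in base R. The example below is a minimal illustration on simulated data (the seed and the names `bread`, `meat`, and `V_hc0` are assumptions for this sketch, not taken from the slides); it builds the variance estimate \((X'X)^{-1}\left(\sum_i \hat{u}_i^2 \mathbf{x}_i\mathbf{x}_i'\right)(X'X)^{-1}\) and sets it beside the usual homoskedastic formula.

```r
# Hand-rolled HC0 sandwich estimate on simulated heteroskedastic data
set.seed(42)
n <- 1000
x <- runif(n, min = 1, max = 7)
u <- rnorm(n) * (4 * x)            # error variance depends on x
y <- 1 + 4 * x + u
fit <- lm(y ~ x)

X <- model.matrix(fit)             # n x 2 design matrix (intercept, x)
e <- residuals(fit)                # OLS residuals

bread <- solve(crossprod(X))       # (X'X)^{-1}
meat  <- crossprod(X * e)          # sum_i e_i^2 x_i x_i' (row i of X scaled by e_i)
V_hc0 <- bread %*% meat %*% bread  # sandwich: no degrees-of-freedom correction

se_robust  <- sqrt(diag(V_hc0))
se_classic <- sqrt(diag(vcov(fit)))  # homoskedastic formula
print(cbind(se_classic, se_robust))
```

With the `sandwich` package loaded, `vcovHC(fit, type = "HC0")` should reproduce `V_hc0` up to numerical error.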

Wage predictions with and without robust SEs (Code 2)

#Load library to implement tests
library(lmtest)
# Test coefs using robust s.e.s
robustwagereg<-coeftest(wageregression2,
              df=Inf,vcov=Sigmahat)
# df=Inf means use critical values from the Normal 
# distribution (equivalent to a t distribution with 
# infinite degrees of freedom)

#Display comparison table
stargazer(wageregression2,robustwagereg,
    type="html",       
    header=FALSE,omit.stat=c("adj.rsq","ser","F"),
    font.size="tiny",digits=5,column.labels = 
    c("Homoskedastic Error Formula", 
      "Heteroskedastic Error Formula"),
    model.names=FALSE, 
    title="Wage Regression with SE Estimates")

Wage predictions with and without robust SEs

Wage Regression with SE Estimates

Dependent variable: lwage

               Homoskedastic Error Formula   Heteroskedastic Error Formula
                           (1)                            (2)
educ                   0.09794***                     0.09794***
                       (0.00762)                      (0.00807)
exper                  0.01035***                     0.01035***
                       (0.00156)                      (0.00160)
Constant               0.21685**                      0.21685*
                       (0.10860)                      (0.11418)
Observations           526
R2                     0.24934

Note: *p<0.1; **p<0.05; ***p<0.01

Hypothesis tests under heteroskedasticity
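
The robust test carried out by `coeftest` with `df=Inf` can be sketched by hand: divide each coefficient by its robust standard error and compare with the standard Normal distribution. The example below is a minimal illustration on simulated data, with an assumed seed and hypothetical names (`V_robust`, `z`, `p`); it tests the null that each coefficient equals zero.

```r
# Robust z-tests by hand, using an HC0 variance estimate and Normal critical values
set.seed(7)
n <- 1000
x <- runif(n, min = 1, max = 7)
u <- rnorm(n) * (4 * x)                 # heteroskedastic error
y <- 1 + 4 * x + u
fit <- lm(y ~ x)

X <- model.matrix(fit)
e <- residuals(fit)
bread    <- solve(crossprod(X))
V_robust <- bread %*% crossprod(X * e) %*% bread   # HC0 sandwich estimate
se_robust <- sqrt(diag(V_robust))

z <- coef(fit) / se_robust              # test statistic for H0: coefficient = 0
p <- 2 * pnorm(-abs(z))                 # two-sided p-value from the Normal distribution

print(cbind(Estimate = coef(fit), `Robust SE` = se_robust,
            z = z, `Pr(>|z|)` = p))
```

Using Normal rather than t critical values reflects that the robust standard errors are justified by a large-sample approximation, not by exact finite-sample distribution theory.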

Causality

Education Example again

Notation

Causal Models

A Randomized Experiment

Identifying Causal Effects

Going from identification to estimation

What we can’t learn from experiments

Non-experimental data

Special case: binary treatment

Learning about treatment effects

Bias in naïve estimate of causal effects

\[E[Y_i|X_i=1]-E[Y_i|X_i=0]=E[Y_i^1|X_i=1]-E[Y_i^0|X_i=0]\]

Add and subtract \(E[Y_i^0|X_i=1]\):

\[=\stackrel{\text{ATT}}{E[Y_i^1-Y_i^0|X_i=1]}+\stackrel{\text{"selection bias"}}{(E[Y_i^0|X_i=1]-E[Y_i^0|X_i=0])}\]
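
The decomposition can be checked in a simulation where both potential outcomes are observed, which is never possible with real data. The sketch below uses an entirely hypothetical data-generating process (the variable `ability`, the effect sizes, and the seed are illustrative assumptions): individuals with higher ability are more likely to take the treatment and also have higher untreated outcomes.

```r
# Simulated potential outcomes: naive difference = ATT + selection bias
set.seed(123)
n <- 100000
ability <- rnorm(n)                          # hypothetical confounder
y0 <- 10 + 2 * ability + rnorm(n)            # outcome without treatment
y1 <- y0 + 1 + 0.5 * ability                 # outcome with treatment
x  <- as.integer(ability + rnorm(n) > 0)     # self-selected treatment
y  <- ifelse(x == 1, y1, y0)                 # observed outcome

naive     <- mean(y[x == 1]) - mean(y[x == 0])
att       <- mean(y1[x == 1] - y0[x == 1])
selection <- mean(y0[x == 1]) - mean(y0[x == 0])
print(c(naive = naive, att = att, selection = selection,
        att_plus_selection = att + selection))

# Under random assignment, the selection term is close to zero
x_rand <- rbinom(n, size = 1, prob = 0.5)
selection_rand <- mean(y0[x_rand == 1]) - mean(y0[x_rand == 0])
print(selection_rand)
```

In the sample the identity holds exactly, since it is the same add-and-subtract step applied to sample means; only the size of the selection term depends on how treatment was assigned.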

Example: Job Training and Earnings

Experiments and treatment effects

Random coefficients

Relating to standard linear model

Estimation

Next class

(Bonus 1): Standard errors under heteroskedasticity

(Bonus 2) Limiting variance

(Bonus 3) Sandwich formula

(Bonus 4) Wald Test Formula