Econometrics

  • The art of learning about economic phenomena using data
  • Neither pure economics nor statistics applied to economic data

Goals of econometrics

The Econometric Approach

Challenges

This Course

  1. Causal inference in linear models
  2. Nonlinear models and time series

Syllabus

https://canvas.cmu.edu/

R

Example: Wage and Education Data: Code

# Load the foreign package, which reads Stata (.dta) files
# such as the data sets used in our textbook
library(foreign)
# Import data set of education and wages
wage1<-read.dta(
  "http://fmwww.bc.edu/ec-p/data/wooldridge/wage1.dta")
# Scatter plot
plot(wage1$educ,wage1$lwage, xlab = "Years of Education",
      ylab = "Log Wage", main = "Wage vs Education")

Example: Wage and Education Data

Statistical Question: Do people with more education earn higher wages?

Economic Question

Statistics

Ordinary Least Squares Review

Let \(\{(x_i,y_i):i=1,\ldots,n\}\) be a sample of measured education levels and log wages from a population of individuals. The OLS estimator solves
\[(\hat{\beta_0},\hat{\beta_1})=\arg\min_{\beta_0,\beta_1} \sum_{i=1}^{n}(y_i-\beta_0-\beta_{1}x_i)^2\]
giving the formulas
\[\hat{\beta_1}=\frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^2}\]
\[\hat{\beta_0}=\bar{y}-\hat{\beta_1}\bar{x}\]
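The closed-form formulas above can be checked against R's built-in `lm`. This is a minimal sketch on simulated data: the sample size, the parameter values 0.5 and 0.08, and the variable names are illustrative assumptions, not the wage1 data.

```r
# Simulated data (hypothetical values, NOT the wage1 data set)
set.seed(1)
n <- 200
x <- rnorm(n, mean = 12, sd = 3)          # stand-in for years of education
y <- 0.5 + 0.08 * x + rnorm(n, sd = 0.5)  # stand-in for log wage

# Slope and intercept computed directly from the OLS formulas
beta1_hat <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
beta0_hat <- mean(y) - beta1_hat * mean(x)

# The same estimates from lm()
fit <- lm(y ~ x)
print(c(intercept = beta0_hat, slope = beta1_hat))
print(coef(fit))
```

The two printed vectors should agree up to floating-point error, since `lm` minimizes the same sum of squared residuals.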

Regression as statistical model

  1. In population, \(y=\beta_0+\beta_{1}x+u\)
  2. \(\{(x_i,y_i):i=1,\ldots,n\}\) are an independent random sample of observations following (1)
  3. \(\{x_i : i=1,\ldots,n\}\) are not all identical
  4. \(E(u|x)=0\)
  5. \(Var(u|x)=\sigma^2\) a constant \(>0\)

Regression properties

  1. Consistent: \(Pr(|\hat{\beta_1}-\beta_1|>e)\rightarrow 0\) as \(n\rightarrow\infty\) for any \(e>0\)
  2. Unbiased: \(E(\hat{\beta_1})=\beta_1\)
  3. Asymptotically normal: \[Pr\left(\frac{\sqrt{n}(\hat{\beta_1}-\beta_1)}{\sqrt{\frac{\sigma^2}{\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2}}}<t\right)\rightarrow Pr(Z<t)\] for any \(t\), where \(Z \sim N(0,1)\)
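The properties above can be illustrated by simulation. The sketch below repeatedly draws samples from a model satisfying the five assumptions, then checks that the slope estimates average out near the true slope and that the standardized statistic behaves like a standard normal. The "true" parameter values, the sample size, and the number of replications are assumptions chosen only for illustration.

```r
set.seed(42)
beta0 <- 0.5; beta1 <- 0.08; sigma <- 0.5   # assumed "true" parameters
n <- 500                                    # sample size per replication
reps <- 2000                                # number of simulated samples

# One replication: draw a sample, return the slope estimate and
# the standardized statistic from the asymptotic normality result
one_draw <- function() {
  x <- rnorm(n, mean = 12, sd = 3)
  u <- rnorm(n, sd = sigma)                 # E(u|x) = 0, Var(u|x) = sigma^2
  y <- beta0 + beta1 * x + u
  b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
  z <- sqrt(n) * (b1 - beta1) / sqrt(sigma^2 / mean((x - mean(x))^2))
  c(b1 = b1, z = z)
}

draws <- t(replicate(reps, one_draw()))     # reps x 2 matrix with columns b1, z

# Unbiasedness: the average estimate should be close to beta1
print(mean(draws[, "b1"]))
# Asymptotic normality: Pr(statistic < 1.96) should be close to pnorm(1.96)
print(c(simulated = mean(draws[, "z"] < 1.96), normal = pnorm(1.96)))
```

Rerunning with a larger `n` should also shrink the spread of the `b1` draws around `beta1`, which is what consistency predicts.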

Regression Results: Table (Code)

# Load the foreign package, which reads Stata (.dta) files
# such as the data sets used in our textbook
library(foreign)
#Load library to make pretty table
suppressWarnings(suppressMessages(library(stargazer))) 
# Import data set of education and wages
wage1<-read.dta(
  "http://fmwww.bc.edu/ec-p/data/wooldridge/wage1.dta")
# Regress log wage on years of education 
wageregoutput <- lm(formula = lwage ~ educ, data = wage1)
stargazer(wageregoutput,header=FALSE,type="html",
  omit.stat=c("adj.rsq"),font.size="tiny",
  title="Regression of Log Wage on Years of Education")

Regression Results: Table

Regression of Log Wage on Years of Education

                       Dependent variable: lwage
educ                   0.083*** (0.008)
Constant               0.584*** (0.097)
Observations           526
R2                     0.186
Residual Std. Error    0.480 (df = 524)
F Statistic            119.582*** (df = 1; 524)
Note: *p<0.1; **p<0.05; ***p<0.01
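Because the dependent variable is in logs, the slope in the table above can be read as an approximate percentage difference in wages per additional year of education. A minimal sketch of that arithmetic, using only the reported coefficient 0.083 (the variable names are illustrative, and the result describes an association in this sample, not necessarily a causal effect):

```r
b_educ <- 0.083                          # slope on educ, copied from the table
pct_approx <- 100 * b_educ               # log approximation, in percent
pct_exact  <- 100 * (exp(b_educ) - 1)    # exact percent difference implied by the log model
print(c(approx = pct_approx, exact = pct_exact))
```

The two numbers are close because \(e^{b}-1 \approx b\) when \(b\) is small.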

Regression Results: Plot (Code)

# Scatter plot with regression line
plot(wage1$educ,wage1$lwage, xlab = "Years of Education",
      ylab = "Log Wage", main = "Wage vs Education")
abline(wageregoutput,col="red")

Regression Results: Plot

Note on using R

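# Fit the regression; data = wage1 tells lm where to find lwage and educ
regression<-lm(formula = lwage ~ educ, data = wage1)
# Show coefficient estimates, standard errors, and fit statistics
summary(regression)
# Open the help page for a function
?plot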

Interpretation

Causal inference strategies

Nonlinear methods (code)

#Load Libraries for nonparametric estimation
#Install "np" library 
install.packages("np", 
            repos="http://cran.us.r-project.org")
library(np)
wagebw<-npregbw(ydat=wage1$lwage,xdat=wage1$educ,
    regtype="ll",bws=c(2),
    bandwidth.compute=FALSE, ckertype="uniform")
#Pretty up Names
wagebw$xnames<-"Years of Education"
wagebw$ynames<-"Log Wage"
plot(wagebw,ylim=c(min(wage1$lwage),max(wage1$lwage)), 
    main="Wage vs Education, Linear and Nonlinear Fits")
points(wage1$educ,wage1$lwage,pch=1, col="blue")
abline(wageregoutput, col="red")

Nonlinear methods

Assignments and readings