Today

Binary outcome/Discrete Choice Models

Probit and Logit as MLE

MLE Results

Interpretation of Nonlinear Model Parameters

Heterogeneity in effects

Probit vs. Logit vs. LPM

Bertrand and Mullainathan (2004) resume study (Code)

#Load library with data
library(AER)
#Load Bertrand and Mullainathan data
data("ResumeNames")
#Load library with marginal effects commands
library(mfx)
# Convert outcome variable to numeric so it reads as 0/1
ResumeNames$callback<-as.numeric(ResumeNames$call)-1
#Load library to make pretty table
library(stargazer) 

Example: Bertrand and Mullainathan (2004) resume study

Results (Code 1)

#Run Probit & Logit by glm command, LPM by lm()
disc.probit<-glm(formula=callback~ethnicity
  +experience+quality,family=
  binomial(link="probit"), data=ResumeNames)
disc.logit<-glm(formula=callback~ethnicity
  +experience+quality,family=
  binomial(link="logit"), data=ResumeNames)
disc.lpm<-lm(formula=callback~ethnicity
  +experience+quality,data=ResumeNames)

Results (Code 2)

#Display table of coefficients
stargazer(disc.probit,disc.logit,
      disc.lpm,header=FALSE,
      omit.stat=c("aic","rsq","adj.rsq"),
      font.size="tiny",
      title="Binary Choice Output")

Results

Binary Choice Output

                      Dependent variable: callback
                      probit        logistic      OLS
                      (1)           (2)           (3)
ethnicityafam         -0.217***     -0.439***     -0.032***
                      (0.053)       (0.108)       (0.008)
experience            0.020***      0.038***      0.003***
                      (0.005)       (0.009)       (0.001)
qualityhigh           0.072         0.154         0.011
                      (0.053)       (0.107)       (0.008)
Constant              -1.500***     -2.631***     0.066***
                      (0.059)       (0.117)       (0.009)
Observations          4,870         4,870         4,870
Log Likelihood        -1,345.500    -1,345.572
Residual Std. Error                               0.271 (df = 4866)
F Statistic                                       12.484*** (df = 3; 4866)
Note: *p<0.1; **p<0.05; ***p<0.01

Partial effects at average: Logit (Code)

#Estimate partial effects at average
disc.lmfx<-logitmfx(formula=callback~ethnicity
    +experience+quality, data=ResumeNames)
#Display them
(disc.lmfx)

Partial effects at average: Logit

## Call:
## logitmfx(formula = callback ~ ethnicity + experience + quality, 
##     data = ResumeNames)
## 
## Marginal Effects:
##                     dF/dx   Std. Err.       z     P>|z|    
## ethnicityafam -0.03153069  0.00767404 -4.1087 3.978e-05 ***
## experience     0.00271059  0.00065543  4.1356 3.540e-05 ***
## qualityhigh    0.01105134  0.00761716  1.4508    0.1468    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## dF/dx is for discrete change for the following variables:
## 
## [1] "ethnicityafam" "qualityhigh"

Partial effects at average: Probit (Code)

#Estimate partial effects at average
disc.pmfx<-probitmfx(formula=callback~ethnicity
  +experience+quality, data=ResumeNames)
#Display them
(disc.pmfx)

Partial effects at average: Probit

## Call:
## probitmfx(formula = callback ~ ethnicity + experience + quality, 
##     data = ResumeNames)
## 
## Marginal Effects:
##                     dF/dx   Std. Err.       z     P>|z|    
## ethnicityafam -0.03174445  0.00772256 -4.1106 3.946e-05 ***
## experience     0.00285682  0.00069971  4.0829 4.448e-05 ***
## qualityhigh    0.01052741  0.00771039  1.3654    0.1721    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## dF/dx is for discrete change for the following variables:
## 
## [1] "ethnicityafam" "qualityhigh"
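
The partial effects at the average reported above can be reproduced by hand from a fitted glm object. The sketch below is only an illustration on simulated data (the variables x, d, and y are hypothetical, not the resume data), and it assumes the mfx defaults evaluate effects at the sample means of the regressors. For a continuous regressor the effect is the coefficient times the density of the link evaluated at the mean index; for a dummy it is the discrete change in predicted probability from 0 to 1 with the other regressors held at their means.

```r
# Sketch: partial effects at the average (PEA) computed by hand for a logit.
# All data below are simulated; variable names are hypothetical.
set.seed(1)
n <- 2000
x <- rnorm(n)                      # continuous regressor
d <- rbinom(n, 1, 0.5)             # binary (dummy) regressor
y <- rbinom(n, 1, plogis(-2 + 0.5 * x - 0.4 * d))

fit <- glm(y ~ x + d, family = binomial(link = "logit"))
b <- coef(fit)

# Index evaluated at the sample means of the regressors
xbar <- c(1, mean(x), mean(d))
index <- sum(xbar * b)

# Continuous regressor: coefficient times the logistic density at the index
pea_x <- unname(b["x"]) * dlogis(index)

# Dummy regressor: discrete change from d = 0 to d = 1, with x at its mean
index0 <- b["(Intercept)"] + b["x"] * mean(x)
pea_d <- unname(plogis(index0 + b["d"]) - plogis(index0))

print(c(pea_x = pea_x, pea_d = pea_d))
```

For a probit, the same steps apply with dnorm() and pnorm() in place of dlogis() and plogis(). As a rough check against the tables above, the logit partial effect for experience (0.0027) divided by its rounded coefficient (0.038) is about 0.07, which is in line with a logistic density evaluated where the callback probability is below 10 percent.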

Probit and Logit as Empirical Risk Minimizers

Alternative Loss Functions

Incorporating Nonlinearities

Multiple Outcomes: J>2 Multinomial Logit

Example: Travel Mode Choice (Code)

#Load data on travel decisions
#travel<-read.csv(
#  "http://www.stern.nyu.edu/~wgreene
# /Text/Edition7/TableF18-2.csv")
#Install multinomial logit library if not installed yet 
# install.packages("mlogit",
# repos = "http://cran.us.r-project.org")
library(mlogit) # Load Library
data("TravelMode", package = "AER") #Load data
travelchoice<-mlogit(choice~wait|income, TravelMode, 
        shape = "long", chid.var = "individual", 
        alt.var="mode", choice = "choice")

Results: Coefficients (Code)

#Display table of coefficients
stargazer(travelchoice,
    type="html",
    header=FALSE,omit.stat=c("rsq"),
    font.size="tiny",
    title="Multinomial Logit Travel Choice Estimates")

Results: Coefficients

Multinomial Logit Travel Choice Estimates

                      Dependent variable: choice
train:(intercept)     -0.489
                      (0.574)
bus:(intercept)       -1.876***
                      (0.670)
car:(intercept)       -5.983***
                      (0.808)
wait                  -0.098***
                      (0.011)
train:income          -0.058***
                      (0.015)
bus:income            -0.024
                      (0.016)
car:income            0.006
                      (0.012)
Observations          210
Log Likelihood        -192.425
LR Test               182.668*** (df = 7)
Note: *p<0.1; **p<0.05; ***p<0.01
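
To read these coefficients, it can help to convert them into choice probabilities. The sketch below plugs the reported estimates into the multinomial logit formula, with air as the reference alternative (its intercept and income coefficient are normalized to zero, which is why it has no rows in the table). The wait times and income value are hypothetical numbers chosen only for illustration, not taken from any particular observation.

```r
# Sketch: choice probabilities implied by the reported estimates.
# Coefficients are copied from the table; air is the reference alternative.
# The wait times and income below are hypothetical illustration values.
intercept <- c(air = 0, train = -0.489, bus = -1.876, car = -5.983)
b_income  <- c(air = 0, train = -0.058, bus = -0.024, car = 0.006)
b_wait    <- -0.098

wait   <- c(air = 69, train = 34, bus = 35, car = 0)  # hypothetical
income <- 35                                          # hypothetical

# Systematic utility of each alternative
v <- intercept + b_wait * wait + b_income * income

# Multinomial logit probabilities (softmax), shifted by max(v) for stability
p <- exp(v - max(v)) / sum(exp(v - max(v)))
print(round(p, 3))
```

Because wait enters with a single coefficient shared across alternatives, only differences in waiting time across modes matter; income, by contrast, shifts each mode's utility relative to air through its alternative-specific coefficient.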

Alternative Models for Multinomial Choice

More Alternative Models for Multinomial Choice

Integer Outcomes: Poisson Regression and Extensions

Mixed discrete and continuous outcomes: Tobit Model

Extramarital Affairs (Fair 1978) (Code)

#Can download from online
#affairs<-read.csv(
#  "http://www.stern.nyu.edu/~wgreene
# /Text/Edition7/TableF17-2.csv")

# Or use version in AER library
data("Affairs",package = "AER")

Example: Extramarital Affairs (Fair 1978)

Fair’s Affairs: Tobit, Poisson, and Quasipoisson (Code 1)

fair.tobit <- tobit(affairs ~ age + yearsmarried + 
 religiousness  + occupation + rating, data = Affairs)
fair.pois <- glm(affairs ~ age + yearsmarried + 
 religiousness + occupation + rating, 
 family=poisson, data = Affairs)
fair.qpois <- glm(affairs ~ age + yearsmarried +
  religiousness + occupation + rating, 
  family=quasipoisson, data = Affairs)

Fair’s Affairs: Tobit, Poisson, and Quasipoisson (Code 2)

#Display table of coefficients
stargazer(fair.tobit, fair.pois, 
    fair.qpois,type="html",header=FALSE,
    omit.stat=c("rsq","aic"),
    font.size="tiny",title=
    "Predicting Extramarital Affairs")

Fair’s Affairs: Tobit, Poisson, and Quasipoisson

Predicting Extramarital Affairs

                      Dependent variable: affairs
                      Tobit         Poisson       glm: quasipoisson
                                                  link = log
                      (1)           (2)           (3)
age                   -0.179**      -0.032***     -0.032**
                      (0.079)       (0.006)       (0.015)
yearsmarried          0.554***      0.116***      0.116***
                      (0.135)       (0.010)       (0.026)
religiousness         -1.686***     -0.354***     -0.354***
                      (0.404)       (0.031)       (0.081)
occupation            0.326         0.080***      0.080
                      (0.254)       (0.019)       (0.051)
rating                -2.285***     -0.409***     -0.409***
                      (0.408)       (0.027)       (0.072)
Constant              8.174***      2.534***      2.534***
                      (2.741)       (0.197)       (0.519)
Observations          601           601           601
Log Likelihood        -705.576      -1,427.037
Wald Test             67.707*** (df = 5)
Note: *p<0.1; **p<0.05; ***p<0.01
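
In the table, the Poisson and quasipoisson columns share identical coefficients, while the quasipoisson standard errors are roughly 2.5 to 2.7 times larger. That pattern is what one would expect if the quasipoisson fit simply rescales the Poisson standard errors by the square root of an estimated dispersion parameter. The sketch below illustrates the relationship on simulated overdispersed counts (not the Affairs data; all variable names are hypothetical).

```r
# Sketch on simulated overdispersed counts (not the Affairs data).
set.seed(42)
n <- 500
x <- rnorm(n)
# Negative binomial draws: variance exceeds the mean
y <- rnbinom(n, size = 0.5, mu = exp(0.2 + 0.3 * x))

fit_pois  <- glm(y ~ x, family = poisson)
fit_qpois <- glm(y ~ x, family = quasipoisson)

se_pois  <- summary(fit_pois)$coefficients[, "Std. Error"]
se_qpois <- summary(fit_qpois)$coefficients[, "Std. Error"]
phi_hat  <- summary(fit_qpois)$dispersion  # Pearson chi-square / residual df

# Same point estimates; standard errors scaled by sqrt(phi_hat)
print(all.equal(coef(fit_pois), coef(fit_qpois)))
print(all.equal(se_qpois, se_pois * sqrt(phi_hat)))
print(phi_hat)
```

Applied to the fitted objects above, summary(fair.qpois)$dispersion would report the corresponding estimate for the Affairs data; the standard error ratios in the table suggest a value well above 1, which is why occupation loses significance in column (3).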

Summary