Motivation

Problem Setup: (Sharp) Regression Discontinuity

Identification

Illustration of continuity argument

Intuition

Aside: Local randomization

Estimation: Local regression

Estimation: Parametric Models

A happy medium: Local Polynomials

Tuning parameters

Analysis: Rates

Implementation Choices

Inference

Impossibility/Optimality of Inference

Alternate Perspective: Bayes

Visualization

House Election Wins lead to more wins for party

Eyeballing it: pros and cons

Features of an RD plot that indicate a questionable fit

Comparison of methods: Lee (2008) Incumbency Effect

#Load RD package containing Lee (2008) incumbency data 
suppressWarnings(suppressMessages(library(rddtools)))
data(house) #Lee data
#Load another package for RD estimation
suppressWarnings(suppressMessages(library(rdrobust))) #Calonico et al. (2014) robust CIs
suppressWarnings(suppressMessages(library(RDHonest))) #Armstrong-Kolesár honest CIs
# Specify y, x, and cutpoint = 0 (x is the Dem-Rep vote share margin)
house_rdd<-rdd_data(y=house$y,x=house$x,cutpoint=0)
# Specify same for format of RDHonest package
LEEframe<-data.frame(y=house$y,x=house$x)
# OLS estimate in rddtools library, linear with different slope on each side
(reg_para <- rdd_reg_lm(rdd_object=house_rdd))
## ### RDD regression: parametric ###
##  Polynomial order:  1 
##  Slopes:  separate 
##  Number of obs: 6558 (left: 2740, right: 3818)
## 
##  Coefficient:
##    Estimate Std. Error t value  Pr(>|t|)    
## D 0.1182314  0.0056799  20.816 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Estimate MSE-optimal bandwidth on Lee data by the Imbens-Kalyanaraman procedure
LEEbw<-rdd_bw_ik(house_rdd)
# Local linear RD estimate, different slope on each side
(LEEnp<-rdd_reg_np(rdd_object=house_rdd,bw=LEEbw))
## ### RDD regression: nonparametric local linear###
##  Bandwidth:  0.2938561 
##  Number of obs: 3200 (left: 1594, right: 1606)
## 
##  Coefficient:
##   Estimate Std. Error z value  Pr(>|z|)    
## D 0.079924   0.009465  8.4443 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Calonico Cattaneo Titiunik bias corrected estimates and CIs
summary(rdrobust(y=house$y,x=house$x,c=0,all = TRUE))
## [1] "Mass points detected in the running variable."
## Call: rdrobust
## 
## Number of Obs.                 6558
## BW type                       mserd
## Kernel                   Triangular
## VCE method                       NN
## 
## Number of Obs.                 2740         3818
## Eff. Number of Obs.             789          817
## Order est. (p)                    1            1
## Order bias  (q)                   2            2
## BW est. (h)                   0.136        0.136
## BW bias (b)                   0.240        0.240
## rho (h/b)                     0.565        0.565
## Unique Obs.                    2108         2581
## 
## =============================================================================
##         Method     Coef. Std. Err.         z     P>|z|      [ 95% C.I. ]       
## =============================================================================
##   Conventional     0.064     0.011     5.815     0.000     [0.042 , 0.085]     
## Bias-Corrected     0.059     0.011     5.418     0.000     [0.038 , 0.081]     
##         Robust     0.059     0.013     4.738     0.000     [0.035 , 0.084]     
## =============================================================================
# RD plot with binned scatter and global quartic polynomial fits
rdplot(y=house$y,x=house$x,c=0,masspoints="off")
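
As a rough illustration of what the local linear estimates above compute, the following base-R sketch fits a kernel-weighted linear regression with separate slopes on each side of the cutoff and reads off the jump. The simulated data, the function name rd_local_linear, and the bandwidth h = 0.3 are assumptions made for illustration; this is not the Lee data and not the internals of rddtools or rdrobust.

```r
# Simulated data (hypothetical, not the Lee data): true jump of 0.08 at x = 0
set.seed(1)
n <- 5000
x <- runif(n, -1, 1)
d <- as.numeric(x >= 0)                      # treatment indicator
y <- 0.5 + 0.3 * x + 0.08 * d + rnorm(n, sd = 0.1)

# Local linear RD estimate with a triangular kernel and bandwidth h
rd_local_linear <- function(y, x, h, cutoff = 0) {
  xc <- x - cutoff                           # center running variable at cutoff
  w <- pmax(0, 1 - abs(xc) / h)              # triangular kernel weights
  keep <- w > 0                              # drop observations outside the window
  dat <- data.frame(y = y[keep], xc = xc[keep],
                    d = as.numeric(xc[keep] >= 0), w = w[keep])
  # Weighted least squares: intercept shift d, separate slopes via d:xc
  fit <- lm(y ~ d * xc, data = dat, weights = w)
  unname(coef(fit)["d"])
}

est_local  <- rd_local_linear(y, x, h = 0.3)    # uses only data near the cutoff
est_global <- unname(coef(lm(y ~ d * x))["d"])  # global linear fit, all data
```

Because the simulated conditional mean is exactly linear, both estimates should land near the true jump here; with a curved conditional mean the global fit can be biased while the local fit trades that bias against variance through h.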

# Armstrong-Kolesar CI with bound M of 0.4 on second derivative (sclass T),
# bandwidth optimized for smallest "Fixed Length Confidence Interval" 
RDHonest(y~x,data=LEEframe,kern="triangular",M=0.4,opt.criterion = "FLCI",sclass="T")
## Call:
## RDHonest(formula = y ~ x, data = LEEframe, M = 0.4, kern = "triangular", 
##     opt.criterion = "FLCI", sclass = "T")
## 
## 
## Inference by se.method:
##      Estimate Maximum Bias  Std. Error
## nn 0.07810269  0.005018494 0.008338368
## 
## Confidence intervals:
## nn    (0.05919947, 0.09700591), (0.0593688, Inf), (-Inf, 0.09683658)
## 
## Bandwidth: 0.2638011
## Number of effective observations: 645.7353
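
The intervals printed above can be reproduced by hand from the reported estimate, maximum bias, and standard error. This is a minimal sketch, assuming the two-sided interval has the form estimate ± cv × se, where cv is the 0.95 quantile of |N(t, 1)| with t = maximum bias / se, as described in Armstrong and Kolesár (2020). The variable names are illustrative.

```r
# Numbers copied from the RDHonest output above
est  <- 0.07810269
bias <- 0.005018494   # worst-case bias under the smoothness bound M
se   <- 0.008338368

# Critical value: 0.95 quantile of |N(t, 1)|, computed through the
# noncentral chi-square distribution with 1 df and noncentrality t^2
t_ratio <- bias / se
cv <- sqrt(qchisq(0.95, df = 1, ncp = t_ratio^2))

# Two-sided bias-aware interval
ci_two_sided <- est + c(-1, 1) * cv * se

# One-sided bounds: shift by the worst-case bias, then use a normal quantile
lower_one_sided <- est - bias - qnorm(0.95) * se
upper_one_sided <- est + bias + qnorm(0.95) * se
```

The resulting values should agree with the printed intervals up to rounding. The critical value exceeds the usual 1.96 because it accounts for the possible bias rather than assuming it away.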

Tests and Diagnostics

Balance test

Density Estimation and Manipulation Test

# Run McCrary test
# Estimates density left and right of cutoff
# Null is no difference in density
LEEtest<-dens_test(LEEnp,plot=FALSE)
pvalue<-LEEtest$p.value # extract p-value by name
# Run again to display the full test output
dens_test(LEEnp)

## 
##  McCrary Test for no discontinuity of density around cutpoint
## 
## data:  LEEnp
## z-val = 1.2952, p-value = 0.1952
## alternative hypothesis: Density is discontinuous around cutpoint
## sample estimates:
## Discontinuity 
##     0.1035008
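
A cruder check in the same spirit, sketched here in base R, compares raw counts of observations in a small window on each side of the cutoff using a binomial test. This is not the McCrary procedure, which fits local linear regressions to a binned histogram; it is a simpler alternative closer to the local randomization view. The simulated running variable, the function name window_count_test, and the window width h = 0.05 are assumptions for illustration.

```r
# Hypothetical running variable with no manipulation: uniform around 0
set.seed(2)
x <- runif(4000, -1, 1)

# Compare counts just left and just right of the cutoff
window_count_test <- function(x, h, cutoff = 0) {
  n_left  <- sum(x >= cutoff - h & x < cutoff)
  n_right <- sum(x >= cutoff & x <= cutoff + h)
  # Under no sorting, an observation in the window is equally likely
  # to fall on either side, so n_right ~ Binomial(n_left + n_right, 0.5)
  test <- binom.test(n_right, n_left + n_right, p = 0.5)
  list(n_left = n_left, n_right = n_right, p_value = test$p.value)
}

res <- window_count_test(x, h = 0.05)
```

A small p-value would suggest bunching on one side of the cutoff. The result depends on the window width, and the equal-probability null is only reasonable when the density is roughly flat near the cutoff.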

Fuzzy Regression Discontinuity

Fuzzy RD: Example

Estimates

Bandwidth choice for Fuzzy Regression Discontinuity

Bandwidths

Extensions and Alternatives

Conclusions

References

Armstrong, Timothy B, and Michal Kolesár. 2018. “Optimal Inference in a Class of Regression Models.” Econometrica 86 (2): 655–83.
———. 2020. “Simple and Honest Confidence Intervals in Nonparametric Regression.” Quantitative Economics 11 (1): 1–39.
Bertanha, Marinho, and Marcelo J Moreira. 2020. “Impossible Inference in Econometrics: Theory and Applications.” Journal of Econometrics 218 (2): 247–70.
Branson, Zach, Maxime Rischard, Luke Bornn, and Luke W Miratrix. 2019. “A Nonparametric Bayesian Methodology for Regression Discontinuity Designs.” Journal of Statistical Planning and Inference 202: 14–30.
Calonico, Sebastian, Matias D Cattaneo, and Rocio Titiunik. 2014. “Robust Nonparametric Confidence Intervals for Regression-Discontinuity Designs.” Econometrica 82 (6): 2295–2326.
Cattaneo, Matias D., Richard K. Crump, Max H. Farrell, and Yingjie Feng. 2021. “On Binscatter.” http://arxiv.org/abs/1902.09608.
Cattaneo, Matias D., Brigham R. Frandsen, and Rocío Titiunik. 2015. “Randomization Inference in the Regression Discontinuity Design: An Application to Party Advantages in the U.S. Senate.” Journal of Causal Inference 3 (1): 1–24. https://doi.org/10.1515/jci-2013-0010.
Ganong, Peter, and Pascal Noel. 2020. “Liquidity Versus Wealth in Household Debt Obligations: Evidence from Housing Policy in the Great Recession.” American Economic Review 110 (10): 3100–3138. https://doi.org/10.1257/aer.20181243.
Gelman, Andrew, and Guido Imbens. 2019. “Why High-Order Polynomials Should Not Be Used in Regression Discontinuity Designs.” Journal of Business & Economic Statistics 37 (3): 447–56.
Imbens, Guido, and Karthik Kalyanaraman. 2012. “Optimal Bandwidth Choice for the Regression Discontinuity Estimator.” The Review of Economic Studies 79 (3): 933–59.
Kleven, Henrik Jacobsen. 2016. “Bunching.” Annual Review of Economics 8: 435–64.
Lee, David S. 2008. “Randomized Experiments from Non-Random Selection in US House Elections.” Journal of Econometrics 142 (2): 675–97.
Lee, David S, and Thomas Lemieux. 2010. “Regression Discontinuity Designs in Economics.” Journal of Economic Literature 48 (2): 281–355.
McCrary, Justin. 2008. “Manipulation of the Running Variable in the Regression Discontinuity Design: A Density Test.” Journal of Econometrics 142 (2): 698–714.
Thistlethwaite, Donald L, and Donald T Campbell. 1960. “Regression-Discontinuity Analysis: An Alternative to the Ex Post Facto Experiment.” Journal of Educational Psychology 51 (6): 309.
Tsybakov, Alexandre B. 2009. “Introduction to Nonparametric Estimation.” Springer Series in Statistics. Springer.
Vaart, Aad W van der, and J Harry van Zanten. 2008. “Rates of Contraction of Posterior Distributions Based on Gaussian Process Priors.” The Annals of Statistics 36 (3): 1435–63.
Williams, Christopher K, and Carl Edward Rasmussen. 2006. Gaussian Processes for Machine Learning. Cambridge, MA: MIT Press.
Zimmerman, Seth D. 2014. “The Returns to College Admission for Academically Marginal Students.” Journal of Labor Economics 32 (4): 711–54.