Regression Discontinuity Estimation
rd_est estimates both sharp and fuzzy RDDs using parametric and non-parametric (local linear) models. It is based on the RDestimate function in the "rdd" package. Sharp RDDs (both parametric and non-parametric) are estimated using lm in the stats package. Fuzzy RDDs (both parametric and non-parametric) are estimated by two-stage least squares using ivreg in the AER package. For non-parametric models, Imbens-Kalyanaraman optimal bandwidths can be used.
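As a rough illustration of what a non-parametric (local linear) sharp RD fit with lm can look like, the following base R sketch weights observations with a triangular kernel and reads the discontinuity off the treatment coefficient. It is not the package's internal code: the simulated data, the hand-picked bandwidth h, and the variable names xc, tr, and w are all assumptions made for this example.

```r
# Hypothetical illustration in base R; not the package's internal code.
set.seed(1)
n <- 2000
x <- runif(n, -1, 1)                        # running variable
y <- 1 + 2 * x + 5 * (x >= 0) + rnorm(n)    # simulated outcome, true jump of 5

cutpoint <- 0
h <- 0.5                                    # bandwidth picked by hand (assumption)

xc <- x - cutpoint                          # center the running variable
tr <- as.numeric(xc >= 0)                   # sharp treatment indicator
w  <- pmax(0, 1 - abs(xc) / h)              # triangular kernel weights

# Separate slopes on each side of the cutpoint; the coefficient on `tr`
# is the estimated discontinuity at the cutpoint.
d   <- data.frame(y, xc, tr, w)
fit <- lm(y ~ tr + xc + tr:xc, data = d, weights = w, subset = w > 0)
coef(fit)["tr"]
```

Observations farther than h from the cutpoint receive zero weight and are dropped, so only data near the cutpoint inform the estimate.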
Usage
rd_est(
formula,
data,
subset = NULL,
cutpoint = NULL,
bw = NULL,
kernel = "triangular",
se.type = "HC1",
cluster = NULL,
verbose = FALSE,
less = FALSE,
est.cov = FALSE,
est.itt = FALSE,
t.design = NULL
)
Arguments
- formula
The formula of the RDD; a symbolic description of the model to be fitted. This is supplied in the format of y ~ x for a simple sharp RDD or y ~ x | c1 + c2 for a sharp RDD with two covariates. A fuzzy RDD may be specified as y ~ x + z, where x is the running variable and z is the endogenous treatment variable. Covariates are included in the same manner as in a sharp RDD.
- data
An optional data frame containing the variables in the model. If not found in data, the variables are taken from environment(formula).
- subset
An optional vector specifying a subset of observations to be used in the fitting process.
- cutpoint
A numeric value containing the cutpoint at which assignment to the treatment is determined. The default is 0.
- bw
A vector specifying the bandwidths at which to estimate the RD. Possible values are "IK09", "IK12", or a user-specified non-negative numeric vector of bandwidths. The default is "IK12". If bw is "IK12", the bandwidth is calculated using the Imbens-Kalyanaraman 2012 method. If bw is "IK09", the bandwidth is calculated using the Imbens-Kalyanaraman 2009 method. In either case, the RD is then estimated at that bandwidth, half that bandwidth, and twice that bandwidth. If only a single numeric value is passed into the function, the RD will similarly be estimated at that bandwidth, half that bandwidth, and twice that bandwidth.
- kernel
A string indicating which kernel to use. Options are "triangular" (default and recommended), "rectangular", "epanechnikov", "quartic", "triweight", "tricube", and "cosine".
- se.type
The robust standard error calculation method to use, from the "sandwich" package. Options are, as in vcovHC, "HC3", "const", "HC", "HC0", "HC1", "HC2", "HC4", "HC4m", and "HC5". The default is "HC1". This option is overridden by cluster.
- cluster
An optional vector of length n specifying clusters within which the errors are assumed to be correlated. When supplied, cluster-robust standard errors are reported. This option overrides anything specified in se.type. It is suggested that data with a discrete running variable be clustered by each unique value of the running variable (Lee and Card, 2008).
- verbose
A logical value indicating whether to print additional information to the terminal. The default is FALSE.
- less
Logical. If TRUE, return only the linear and optimal-bandwidth estimates. If FALSE, return the linear, quadratic, cubic, optimal, half-optimal, and double-optimal estimates. The default is FALSE.
- est.cov
Logical. If TRUE, the estimates of covariates will be included. If FALSE, the estimates of covariates will not be included. The default is FALSE. This option is not applicable if method is "front".
- est.itt
Logical. If TRUE, the estimates of ITT will be returned. The default is FALSE.
- t.design
A string specifying the treatment option according to design. Options are "g" (treatment is assigned if x is greater than its cutoff), "geq" (treatment is assigned if x is greater than or equal to its cutoff), "l" (treatment is assigned if x is less than its cutoff), and "leq" (treatment is assigned if x is less than or equal to its cutoff).
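The kernel, bw, and t.design arguments can be pictured with a few small base R helpers. This is only a sketch: kernel_weight and assign_treatment are hypothetical names that do not belong to the package, the kernel formulas are the usual textbook definitions on [-1, 1] (the package's own scaling may differ), and the bandwidth 0.4 is an arbitrary stand-in for an optimal bandwidth.

```r
# Hypothetical helpers for illustration; these names are not part of the package.

# Textbook kernel functions evaluated at u = (x - cutpoint) / bandwidth.
kernel_weight <- function(u, kernel = "triangular") {
  inside <- abs(u) <= 1
  k <- switch(kernel,
    triangular   = 1 - abs(u),
    rectangular  = rep(0.5, length(u)),
    epanechnikov = 0.75 * (1 - u^2),
    quartic      = 15 / 16 * (1 - u^2)^2,
    triweight    = 35 / 32 * (1 - u^2)^3,
    tricube      = 70 / 81 * (1 - abs(u)^3)^3,
    cosine       = pi / 4 * cos(pi / 2 * u),
    stop("unknown kernel: ", kernel)
  )
  ifelse(inside, k, 0)                      # zero weight outside the bandwidth
}

# Treatment assignment rule implied by each t.design option.
assign_treatment <- function(x, cutpoint = 0, t.design = "geq") {
  switch(t.design,
    g   = x >  cutpoint,
    geq = x >= cutpoint,
    l   = x <  cutpoint,
    leq = x <= cutpoint,
    stop("unknown t.design: ", t.design)
  )
}

# One bandwidth expands to the bandwidth itself, half of it, and twice it.
h <- 0.4                                    # arbitrary value (assumption)
bw_grid <- c(opt = h, half = h / 2, double = 2 * h)

x <- c(-0.5, -0.1, 0, 0.1, 0.5)
kernel_weight(x / bw_grid["opt"])           # weights at the "opt" bandwidth
assign_treatment(x, t.design = "g")         # x == 0 is untreated under "g"
assign_treatment(x, t.design = "leq")       # x == 0 is treated under "leq"
```

The only difference between "g" and "geq" (or "l" and "leq") is how observations exactly at the cutpoint are classified, which matters mainly when the running variable is discrete.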
Value
rd_est returns an object of class "rd". The functions summary and plot are used to obtain and print a summary and plot of the estimated regression discontinuity. The object of class rd is a list containing the following components:
- type
A string denoting either "sharp" or "fuzzy" RDD.
- est
Numeric vector of the estimate of the discontinuity in the outcome under a sharp RDD or the Wald estimator in the fuzzy RDD, for each corresponding bandwidth.
- se
Numeric vector of the standard error for each corresponding bandwidth.
- z
Numeric vector of the z statistic for each corresponding bandwidth.
- p
Numeric vector of the p-value for each corresponding bandwidth.
- ci
The matrix of the 95% confidence interval for each corresponding bandwidth.
- d
Numeric vector of the effect size (Cohen's d) for each estimate.
- cov
The names of covariates.
- bw
Numeric vector of each bandwidth used in estimation.
- obs
Vector of the number of observations within the corresponding bandwidth.
- call
The matched call.
- na.action
The number of observations removed from fitting due to missingness.
- impute
A logical value indicating whether multiple imputation is used or not.
- model
For a sharp design, a list of the lm objects is returned. For a fuzzy design, a list of lists is returned, each with two elements: firststage, the first stage lm object, and iv, the ivreg object. A model is returned for each corresponding bandwidth.
- frame
Returns the dataframe used in fitting the model.
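The est, se, z, p, and ci components are related in the usual way for a normal-approximation test. The sketch below shows that relationship for a single estimate in base R. It assumes a standard normal reference distribution and a conventional (non-robust) standard error from vcov; this page does not spell out the package's exact computation, so treat this as an illustration of the quantities rather than a reproduction of the returned values.

```r
# Illustration only: simulated data and a plain lm fit, not rd_est output.
set.seed(2)
x  <- runif(500, -1, 1)                     # running variable
tr <- as.numeric(x >= 0)                    # sharp treatment indicator
y  <- 2 * x + 3 * tr + rnorm(500)           # simulated outcome, true jump of 3

fit <- lm(y ~ tr + x)

est <- unname(coef(fit)["tr"])              # estimated discontinuity
se  <- unname(sqrt(diag(vcov(fit)))["tr"])  # conventional SE (assumption)

z  <- est / se                              # z statistic
p  <- 2 * pnorm(-abs(z))                    # two-sided p-value, normal reference
ci <- est + c(-1, 1) * qnorm(0.975) * se    # 95% confidence interval

c(est = est, se = se, z = z, p = p)
ci
```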
References
Lee, D. S., Lemieux, T. (2010). Regression Discontinuity Designs in Economics. Journal of Economic Literature, 48(2), 281-355. doi:10.1257/jel.48.2.281.
Imbens, G., Lemieux, T. (2008). Regression discontinuity designs: A guide to practice. Journal of Econometrics, 142(2), 615-635. doi:10.1016/j.jeconom.2007.05.001.
Lee, D. S., Card, D. (2008). Regression discontinuity inference with specification error. Journal of Econometrics, 142(2), 655-674. doi:10.1016/j.jeconom.2007.05.003.
Angrist, J. D., Pischke, J.-S. (2009). Mostly harmless econometrics: An empiricist's companion. Princeton, NJ: Princeton University Press.
Drew Dimmery (2016). rdd: Regression Discontinuity Estimation. R package version 0.57. https://CRAN.R-project.org/package=rdd
Imbens, G., Kalyanaraman, K. (2009). Optimal bandwidth choice for the regression discontinuity estimator (Working Paper No. 14726). National Bureau of Economic Research. https://www.nber.org/papers/w14726.
Imbens, G., Kalyanaraman, K. (2012). Optimal bandwidth choice for the regression discontinuity estimator. The Review of Economic Studies, 79(3), 933-959. https://academic.oup.com/restud/article/79/3/933/1533189.
Examples
set.seed(12345)
x <- runif(1000, -1, 1)
cov <- rnorm(1000)
y <- 3 + 2 * x + 3 * cov + 10 * (x >= 0) + rnorm(1000)
rd_est(y ~ x, t.design = "geq")
#>
#> Call:
#> rd_est(formula = y ~ x, t.design = "geq")
#>
#> Coefficients:
#>     Linear  Quadratic      Cubic        Opt   Half-Opt Double-Opt
#>      10.23      10.67      10.66      10.47      10.57      10.30
#>
# Efficiency gains can be made by including covariates (review SEs in "summary" output).
rd_est(y ~ x | cov, t.design = "geq")
#>
#> Call:
#> rd_est(formula = y ~ x | cov, t.design = "geq")
#>
#> Coefficients:
#>     Linear  Quadratic      Cubic        Opt   Half-Opt Double-Opt
#>     10.049     10.068      9.829     10.047      9.896     10.052
#>
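The examples above cover only sharp designs. For a fuzzy design, the Wald estimator mentioned under Value can be sketched in base R as the jump in the outcome at the cutpoint divided by the jump in treatment take-up. Everything here is an assumption made for illustration: the simulated data, the take-up rates of 0.2 and 0.8, the global linear specification, and the variable names a and z. It does not call ivreg or rd_est.

```r
# Illustration only: a hand-rolled Wald ratio on simulated fuzzy RD data.
set.seed(3)
n <- 5000
x <- runif(n, -1, 1)                          # running variable
a <- as.numeric(x >= 0)                       # assignment at the cutpoint
z <- rbinom(n, 1, ifelse(a == 1, 0.8, 0.2))   # treatment actually received
y <- 1 + 2 * x + 4 * z + rnorm(n)             # simulated outcome, true effect 4

# Jump in the outcome at the cutpoint (intention-to-treat effect).
itt <- coef(lm(y ~ a + x + a:x))["a"]

# Jump in the probability of treatment at the cutpoint (first stage).
first <- coef(lm(z ~ a + x + a:x))["a"]

# Wald estimator: ratio of the two discontinuities.
wald <- unname(itt / first)
wald

# Following the formula description above, the corresponding fuzzy call
# would presumably be: rd_est(y ~ x + z, t.design = "geq")
```

Because take-up jumps by roughly 0.6 rather than 1, the outcome discontinuity is scaled up to recover the effect of treatment itself.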