Abstract
In this paper, we use data from an evaluation of the Benefit Offset National Demonstration (BOND) to assess how well comparative regression discontinuity (CRD) and regression discontinuity (RD) designs perform relative to a randomized controlled trial (RCT). BOND is a large demonstration intended to promote return to work among people with disabilities who receive Social Security Disability Insurance (DI). RD is known as a relatively rigorous non-experimental method, but it produces imprecise results that apply to small populations. CRD is a promising enhancement that addresses these issues. The CRD and RD methods are potentially attractive because they can be used in contexts in which RCTs are challenging or infeasible. However, the bias of findings from CRD and RD studies is unknown in the context of DI. We estimate CRD and RD models using simulated assignment to the BOND treatment group based on the duration of DI receipt at the start of BOND, and we compare the resulting estimates to the RCT estimates. While the findings are not intended to revise the well-established evidence evaluating BOND, they can help interpret the results from CRD and RD studies of other income support interventions for people with disabilities and inform future study designs.
Our paper has two key limitations. First, our RD models are far from ideal, which limits the degree to which our RD results generalize to what would be found with state-of-the-art RD models. Second, our results may not generalize to other populations. Our analysis is based on BOND beneficiaries who were representative of the larger DI population at the time of BOND random assignment but may not reflect the DI population in more recent years.
The paper found that:
- Average bias from CRD and RD is generally below 0.02 standard deviations in absolute size for the groups of bias estimates we analyzed.
- Given the precision that may be needed to evaluate interventions like BOND, the standard deviation of bias (after accounting for sampling error) is nontrivial, generally between 0.02 and 0.07 standard deviations for the groups of bias estimates we analyzed.
The policy implications of the findings are:
- When designing CRD and RD evaluations and interpreting their results, it is important to note that both methods produce biased estimates, which suggests that their results should be interpreted with more caution than those from an RCT with similar standard errors.
- For CRD, this bias appears to be larger when there are major non-linearities in the relationship between the running variable and the lagged outcome.