APSA Annual Meeting & Exhibition: How Much Should We Trust Modern Difference-in-Differences Estimates?

How Much Should We Trust Modern Difference-in-Differences Estimates?

In Event: Challenges in Learning from Panel Data

Fri, September 6, 12:00 to 1:30pm, Pennsylvania Convention Center (PCC), 110B

Abstract

What makes modern difference-in-differences methods viable for studying policy effects in the 50-unit U.S. state panel? Modern estimators for causal effects in panel data address parallel trends violations, heterogeneous treatment effects, and other challenging features common to observational settings. These methods, however, tend to produce larger standard errors than classic methods do, constraining statistical power. In this paper, I investigate the statistical power of modern and conventional two-way fixed effects-style estimators in a familiar “natural experiment” setting in the American politics literature: time-series cross-sectional analysis of state policy effects. In particular, using a Monte Carlo approach, I estimate statistical power and a range of other diagnostic statistics for eight modern estimators that rely on some type of parallel trends assumption, as well as for the classic two-way fixed effects estimator. I test data-generating processes with N = 50 units, T ∈ {2, ..., 30} time periods, half of the units eventually entering treatment, and effect sizes of 0, 0.2, 0.5, and 0.8 standard deviations.

Under an optimistic data-generating process that satisfies all the assumptions of classic two-way fixed effects, and with up to 30 time periods, I find that no estimator in my study becomes well-powered at an effect size of 0.2 standard deviations. The two-way fixed effects estimator becomes well-powered at an effect size of 0.5, but more robust estimators, such as the Callaway-Sant’Anna and de Chaisemartin-D’Haultfœuille estimators, do not become well-powered at effect sizes below 0.8.
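To make the Monte Carlo power calculation concrete, here is a minimal Python sketch for the classic two-way fixed effects estimator only. It is not the paper's code. The following are all assumptions made for illustration: the function names, the staggered adoption timing, the unit-variance normal errors (so the effect is expressed in error standard deviations), the unit-clustered standard errors, and the 1.96 critical value.

```python
import numpy as np

def simulate_panel(rng, n_units=50, n_periods=10, effect=0.5):
    """Draw one balanced panel: y = unit FE + period FE + effect * D + noise."""
    treated = np.zeros(n_units, dtype=bool)
    treated[: n_units // 2] = True  # half of the units eventually enter treatment
    # Assumption: adoption period drawn uniformly from 1..T-1 (staggered timing).
    adopt = rng.integers(1, n_periods, size=n_units)
    periods = np.arange(n_periods)
    d = (treated[:, None] & (periods[None, :] >= adopt[:, None])).astype(float)
    unit_fe = rng.normal(size=(n_units, 1))
    period_fe = rng.normal(size=(1, n_periods))
    noise = rng.normal(size=(n_units, n_periods))  # assumption: unit-variance errors
    return unit_fe + period_fe + effect * d + noise, d

def double_demean(a):
    """Remove unit and period means (the within transform for a balanced panel)."""
    return a - a.mean(axis=1, keepdims=True) - a.mean(axis=0, keepdims=True) + a.mean()

def twfe_estimate(y, d):
    """Return the TWFE coefficient on D and a unit-clustered standard error."""
    x, y_dm = double_demean(d), double_demean(y)
    sxx = (x ** 2).sum()
    beta = (x * y_dm).sum() / sxx
    resid = y_dm - beta * x
    scores = (x * resid).sum(axis=1)  # one score per unit (cluster)
    g = y.shape[0]
    var = g / (g - 1) * (scores ** 2).sum() / sxx ** 2
    return beta, np.sqrt(var)

def estimate_power(effect, n_periods, n_sims=500, seed=0):
    """Share of simulated panels in which |t| exceeds 1.96."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sims):
        y, d = simulate_panel(rng, n_periods=n_periods, effect=effect)
        beta, se = twfe_estimate(y, d)
        rejections += int(abs(beta / se) > 1.96)
    return rejections / n_sims

if __name__ == "__main__":
    for effect in (0.0, 0.2, 0.5, 0.8):
        print(effect, estimate_power(effect, n_periods=10))
```

With an effect of 0, the rejection rate approximates the test's size rather than its power, which gives a quick check on the standard errors. Because the error scale and adoption timing here are illustrative choices, the numbers this sketch produces need not match the paper's results.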

In the paper, I also estimate power and other diagnostic statistics for data-generating processes with dynamic treatment effects, heterogeneous treatment effects, conditional parallel trends, and violated parallel trends. The paper concludes with suggestions for applied researchers interested in using difference-in-differences methods to study American state policy effects.

Author

Amanda Weiss, Yale University