Individual Submission Summary

The Steroid Era in Social Science? Statistical Significance in 1970s-2010s

Thu, September 5, 4:00 to 5:30pm, Pennsylvania Convention Center (PCC), 112A

Abstract

In the social sciences, statistically significant results have long been treated as a prime marker of discovery. Later inquiries revealed that publication bias, the file drawer problem, and p-hacking could be part of the story. Given advances in computing power and analytic tools, to what extent might researchers be mass-producing or cherry-picking statistically significant results during social science’s “steroid era”? To examine the prevalence of statistically significant results in the social sciences, we analyze academic papers in political science and economics over the past 50 years, drawing random samples from four populations of published papers: the American National Election Studies (ANES) bibliography, the Panel Study of Income Dynamics (PSID) bibliography, and the general-interest top journals in political science and in economics.

We find that descriptive studies have declined while the number of statistically significant tests has increased dramatically. Surprisingly, the ratio of significant tests to total tests per paper has remained largely constant over time. We further analyze these dynamics by categorizing tests by analytic purpose (e.g., the variable of interest; whether significance is desired). Even the ratio of significant tests among those conducted for main findings (“main-wanted tests”) has stayed roughly constant over time. Yet the proportion of residual tests (e.g., controls) relative to main-wanted tests has increased. Our work carries several implications for reliable science, including the roles of institutions (e.g., the journal review process) and evolving research practices (e.g., preregistration, robustness tests).

Authors