Replication is a critical component of scientific credibility. Replication increases our confidence in the reliability of knowledge generated by original research. Yet replication is the exception rather than the rule in economics. Few replications are completed and even fewer are published. Indeed, in the last 6 years, only 11 replication studies were published in the top 11 (empirical) economics journals. In this paper, we examine why so little replication is done and propose changes to the incentives to replicate in order to address this problem.
Our study focuses on code replication, which seeks to replicate the results in the original paper using the same data as the original study. Specifically, these studies seek to replicate exactly the same analyses performed by the authors. The objective is to verify that the analysis code is correct and confirm that there are no coding errors. This is usually done in a two-step process. The first step is to reconstruct the sample and variables used in the analysis from the raw data. The second step is to confirm that the analysis code (i.e., the code that fits a statistical model to the data) reproduces the reported results. By definition, the results reported in the original paper are replicated if this two-step procedure is successful. The threat of code replication gives authors an incentive to put more effort into writing error-free code and an incentive not to purposely misreport results.
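The second step of the procedure described above can be illustrated with a minimal sketch. It assumes that the reported and the reproduced estimates are available as name-to-value mappings and that agreement is judged up to a rounding tolerance. The function name `compare_results`, the tolerance, and the numbers are hypothetical and are not taken from any paper.

```python
import math

def compare_results(reported, reproduced, tol=5e-4):
    """Return the names of reported estimates that the rerun fails to match.

    An estimate counts as mismatched if it is missing from the rerun or
    differs from the reported value by more than `tol` in absolute terms.
    The default tolerance is an assumption (rounding to three decimals).
    """
    mismatches = []
    for name, reported_value in reported.items():
        value = reproduced.get(name)
        if value is None or not math.isclose(
            value, reported_value, rel_tol=0.0, abs_tol=tol
        ):
            mismatches.append(name)
    return mismatches

# Hypothetical values for illustration only.
reported = {"treatment": 0.123, "age": -0.045}
reproduced = {"treatment": 0.1231, "age": -0.0449}
print(compare_results(reported, reproduced))
```

An empty list means every reported estimate was reproduced within the tolerance; any name in the list points to a result that would need to be resolved with the authors.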
We analyze the effectiveness of code replication in the context of a model that has three characteristics:
1. Unbiased: there is no “overturn bias”, i.e., the model does not create incentives to find or claim mistakes in the original analysis.
2. Fair: all papers have some probability of being replicated, the sample of papers replicated is representative, and the likelihood that a paper is replicated is independent of author identity, topic, and results.
3. Low cost: the model should provide the right incentives at low cost in order to be efficient.
These characteristics are necessary to establish a credible threat of valid replication that authors take seriously enough to modify their behavior. Replication needs to be low cost for researchers to undertake it, fair so that all studies face some positive probability of being replicated, and unbiased so that the original authors have reason to participate and the profession believes the replication results.
We believe the current model for code replication lacks many of the desired characteristics and fails to provide the proper incentives to authors. We first show that researchers have weak incentives to perform replication studies and that there is substantial “overturn bias” among editors. This is reflected in the replication studies published in economics. Since 2011, only 11 replication studies have been published in top journals, all of which overturned the results of the original paper. We also show poor author compliance with journal policies that require post-acceptance posting of data and code, which raises the cost of replication. Together, these findings mean that the probability of a paper being replicated is very low, that overturn bias lowers confidence in replication results, and that authors have little incentive to facilitate replication. As a result, the current model of replication fails to provide confidence in the integrity of published results.
We outline a simple proposal to improve replication. The core of the proposal is to have journals perform the replication exercise post-acceptance but pre-publication. Specifically, authors submit their data and code after a conditional acceptance. Journals then verify that the code and data reproduce the results in the paper. For a random sample of papers, the journal attempts to reconstruct the code from scratch or searches the code for errors. This process can iterate until authors and editors reach agreement. If the results change, the editors can choose to re-review the paper.
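The random selection step in this proposal can be sketched as follows. This is only an illustration under stated assumptions: the function name `select_for_audit`, the audit share, and the seed are hypothetical, and the draw uses manuscript identifiers alone so that selection cannot depend on author identity, topic, or results.

```python
import random

def select_for_audit(manuscript_ids, share, seed):
    """Draw a uniform random subset of manuscripts for in-depth code review.

    Every manuscript has the same chance of selection. `share` (the
    fraction audited) and `seed` (recorded so the draw can be checked
    later) are assumptions of this sketch, not part of the proposal.
    """
    rng = random.Random(seed)
    pool = sorted(manuscript_ids)  # fixed order makes the draw reproducible
    n_audits = round(len(pool) * share)
    return sorted(rng.sample(pool, n_audits))

# Hypothetical identifiers for ten conditionally accepted papers.
accepted = [f"MS-{i:03d}" for i in range(1, 11)]
audited = select_for_audit(accepted, share=0.2, seed=2024)
print(audited)
```

Recording the seed lets authors and editors confirm after the fact that the selection was not influenced by the content of any paper.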
This simple procedure has three desirable properties. First, it is unbiased since there are no overturn bias incentives for the parties involved (editors/researchers). Second, it is fair because all papers have an equal probability of being replicated. Third, it is low-cost: there is little cost associated with having a research associate perform “push button exercises”, authors have strong incentives to cooperate pre-publication, and there are no adversarial feelings. Such a mechanism would create a strong incentive not to misreport findings and to ensure that code is free of errors.
CC-By Attribution 4.0 International