What dung beetles and empirical research have in common
This article was written in the course of the summer academy «Science Behind The Scenes» in 2018 and was editorially supervised by «reatch – research and technology in switzerland».
«People respond to incentives» is a phrase students regularly come across in introductory economics courses. It is a simple but powerful and well-examined predictor of human behaviour, as many studies demonstrate: teachers manipulate test scores if their rewards depend on their pupils’ performance, immunisation rates in developing countries can increase if a small amount of food is provided along with the vaccination, and even sumo wrestlers, members of a profession guided by a stringent code of honour, manipulate bouts if the price is right. Naturally, this does not generalise to all individuals. But whenever there are strong incentives (monetary or otherwise) to «misbehave», chances are that at least some will.
Economists have traditionally been prolific in pointing fingers at malpractice resulting from bad incentives in other areas. Those addressing incentive problems within economic research itself, however, are harder to find. Yet there is much finger-pointing to be done here as well. One issue especially deserves our attention: the assembly and treatment of the data underlying empirical studies.
Incentives are stacked against meticulous data scrutiny
The basic incentive problem has three main ingredients. First, a researcher’s career prospects hinge on her ability to produce a paper that will be published in one of the top peer-reviewed journals. Second, the peer review process will scrutinise the methodology of the study, but as reviewers are time-constrained, the details of how exactly the data were handled before the analysis are usually beyond their attention. Third, there is scant credit for the meticulous work needed to gather and curate data.
Given this incentive structure, it is natural to put the main emphasis on the methodology and on a good story in order to produce a well-polished manuscript, while keeping the time and attention devoted to menial data cleaning, checking, and assembly to a minimum, or even outsourcing this part to junior research assistants altogether. In addition, there are incentives to disclose as little as possible about how the data were curated. Full transparency only marginally increases credit for one’s work, yet it invites scrutiny and allows other researchers to «piggyback» on the efforts made to attain a rich dataset.
The upshot is a large body of empirical papers (not only in economics but in other social science disciplines as well) in which the diligence of the researchers cannot be assessed from the publicly available information. This hinders scientific progress.
Small mistakes can have large consequences
Even academics with the best of intentions can commit errors or simply be unaware of certain issues. To provide a personal example: during my work on the effects of police militarisation in the US, I noticed that the existing publications were all based on data containing a large number of incorrect entries caused by a database migration. It is hard to phrase this discovery in an interesting fashion, and indeed much of the work involved in figuring out such details is not particularly exciting. But in this case, accounting for the imperfections ended up drastically weakening the conclusions of the existing literature (which had found clear crime-reducing effects of militarising police forces). Similarly problematic but better known are the simple but consequential spreadsheet errors found in a very influential economics paper that policy makers across the globe used to justify austerity policies.
In other words: details matter, sometimes a lot. Importantly, this issue is not identical to the problem of reproducibility. It goes much deeper. All the results of the police militarisation papers were completely replicable. Both the code and the data used by existing publications were readily available online and multiple authors came to similar conclusions using different approaches. But all of this is rendered moot when the data are unreliable. If incentives remain such that researchers do not get rewarded for checking and double-checking the soundness of the raw data and their use of it, there can be plenty of situations where researchers are doing nothing more than creating elaborate structures of «data dung». Some of them might be fancier than others and receive more attention, but the key ingredient remains the same: dung. So, how do we prevent researchers from becoming scientific dung beetles?
How to get rid of bad incentives
There are several ways to diminish incentives detrimental to the quality of empirical research. Publishing datasets and code along with the manuscripts, and making both the data and the article citable, is one practice that more and more social science journals are adopting. This also enables other researchers and students to easily use the data for their own work and can help to detect mistakes similar to the examples mentioned above. In parallel, citable datasets provide additional metrics for academics to demonstrate the scientific impact of their work. This is in line with the San Francisco Declaration on Research Assessment (DORA), which calls for a more holistic assessment of research in all scientific disciplines.
Because peer reviewers are the gatekeepers of a researcher's academic prospects, the review process also needs to put more emphasis on examining how the data were handled and give appropriate credit to diligent and transparent data management. This will likely require more information, and more structured information, from authors about how they obtained the dataset used in their analysis.
If it is possible to establish a culture where the diligence in data acquisition and curation is regularly scrutinised and readily «scrutinisable», there will be far greater incentives to go the extra mile in the laborious process of data generation. Scientific recognition should not be limited to a catchy story and solid methodology, but should also extend to reliable data management. At the same time, there needs to be more tolerance for honest mistakes. Some unintentional lapses are bound to be picked up in an environment with greater scrutiny and transparency. If such lapses carry severe consequences, we will be back in an environment where obscurity trumps openness in the eyes of the individual academic. That cannot be the research environment we strive for. Let us be gold miners, not dung beetles.
This blog post reflects the personal opinion of the authors and does not necessarily correspond to that of reatch or its members.