Biomedical Science Studies Are Shockingly Hard to Reproduce

Limited access to research details and a culture that emphasizes breakthroughs are undermining the credibility of science

Seeking transparency in the scientific literature. urfinguss/iStock

It's hard to argue against the power of science. From studies that evaluate the latest dietary trend to experiments that illuminate predictors of happiness, people have increasingly come to regard scientific results as concrete, reliable facts that can govern how we think and act.

But over the past several years, a growing contingent of scientists has begun to question the accepted veracity of published research—even after it has cleared the hurdles of peer review and appeared in widely respected journals. The problem is a pervasive inability, across numerous disciplines, to replicate a large proportion of published results.

In 2005, for instance, John Ioannidis, a professor of medicine at Stanford University, used several simulations to show that scientific claims are more likely to be false than true. And this past summer Brian Nosek, a professor of psychology at the University of Virginia, attempted to replicate the findings of 100 psychology studies and found that only 39 percent of the results held up under rigorous re-testing.
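
The core of Ioannidis's 2005 argument is simple arithmetic: when only a small fraction of the hypotheses being tested are actually true, false positives can outnumber genuine discoveries even if every study clears the conventional statistical bar. The short Python sketch below illustrates the idea; the share of true hypotheses, the statistical power and the significance threshold are illustrative assumptions, not figures from Ioannidis's paper.

    # Back-of-the-envelope sketch of why "significant" findings can still be mostly wrong.
    # All parameter values below are illustrative assumptions.
    prior_true = 0.10   # assumed fraction of tested hypotheses that are actually true
    alpha = 0.05        # conventional false-positive (significance) threshold
    power = 0.50        # assumed chance that a study detects a true effect

    true_positives = prior_true * power          # true effects that come out significant
    false_positives = (1 - prior_true) * alpha   # null effects that come out significant anyway

    # Positive predictive value: the chance that a given significant finding is real
    ppv = true_positives / (true_positives + false_positives)
    print(f"Share of significant findings that are true: {ppv:.0%}")  # about 53% here

Under these assumptions roughly half of the positive results are spurious, and factoring in even modest bias or selective reporting drives the share of true findings lower still, which is the kind of scenario Ioannidis modeled.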

“There are multiple lines of evidence, both theoretical and empirical, that have begun to bring the reproducibility of a substantial segment of scientific literature into question,” says Ioannidis. “We are getting millions of papers that go nowhere.”

These preliminary findings have spawned an entirely new field called meta-research—the scientific study of science.

This week, the biology arm of the Public Library of Science (PLOS), a nonprofit publisher and advocacy organization, launched a new section solely dedicated to meta-research. The section will explore issues such as transparency in research, methodological standards, sources of bias, data sharing, funding and incentive structures.

To kick things off, Ioannidis and his colleagues evaluated a random sample of 441 biomedical articles published between 2000 and 2014. They checked whether these papers provided public access to raw data and experimental protocols, were replicated in subsequent studies, had their results integrated into systematic reviews of a subject area and included documentation of funding sources and other potential conflicts of interest.

Their results were troubling, to say the least. For instance, only one study provided full experimental protocols, and none provided directly available raw data.

“These are two basic pillars of reproducibility,” says Ioannidis. “Unless data and the full protocol are available, one cannot really reproduce anything.” After all, without that key information, how can another team know exactly what to do and how their results differ from those in the original experiment?

The team also found that the claims of just eight of the surveyed articles were later confirmed by subsequent studies. And even though many of the studies claimed to have novel findings, the results of only 16 articles were included in later review articles, which serve as a litmus test for the true impact of a study on a particular subject.

“The numbers that we get are pretty scary,” says Ioannidis. “But you can see that as a baseline of where we are now, and there is plenty of room for improvement.”

However, not all the results were discouraging. The percentage of articles without a conflict of interest statement decreased from 94.4 percent in 2000 to 34.6 percent in 2014—likely a result of a growing awareness of the pernicious effects of bias on research outcomes.

In a second meta-research study, a German team analyzed how the loss of animal subjects during pre-clinical trials might contribute to the widespread inability to translate laboratory findings into useful clinical drugs.

Research animals might drop out of a study randomly—for instance, because an animal dies—or through subtly biased actions, such as being removed from the trial to eliminate data that undermines the expected results. The team demonstrated that the biased removal of animal subjects can skew results and significantly increase the likelihood of a false positive—when a new drug is thought to work but actually does not.
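
A toy simulation makes the mechanism concrete. The sketch below is a rough illustration of the general idea, not the German team's actual model: it runs many small trials of a drug that truly does nothing and compares how often a t-test declares success when the data are kept intact versus when the two weakest responders in the treated group are quietly dropped. The group sizes, the drop rule and the use of SciPy's t-test are assumptions chosen for illustration.

    import numpy as np
    from scipy import stats

    # Toy simulation: a drug with NO real effect, tested in many small animal trials.
    rng = np.random.default_rng(0)
    n_trials, n_per_group = 10_000, 10

    def false_positive_rate(drop_worst: bool) -> float:
        hits = 0
        for _ in range(n_trials):
            control = rng.normal(0.0, 1.0, n_per_group)
            treated = rng.normal(0.0, 1.0, n_per_group)  # same distribution: no true effect
            if drop_worst:
                # Biased attrition: discard the two lowest-responding treated animals,
                # as if they had simply been "lost" from the study.
                treated = np.sort(treated)[2:]
            if stats.ttest_ind(treated, control).pvalue < 0.05:
                hits += 1
        return hits / n_trials

    print("No removal     :", false_positive_rate(drop_worst=False))  # close to 0.05
    print("Biased removal :", false_positive_rate(drop_worst=True))   # noticeably higher

Even though the drug is inert in both versions, the biased one comes out “significant” far more often than the nominal 5 percent, which is the kind of inflation the researchers describe.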

In a separate analysis of pre-clinical studies on stroke and cancer, the same researchers found that most papers did not adequately report the loss of animal subjects, and that the positive effects of many drugs being tested may be greatly overestimated.

So why is this crisis in transparency and reproducibility happening in the first place?

While some issues may lie in conscious or unconscious research biases, it's likely that most studies that reach publication remain one of a kind, never repeated, because of the current incentive structure in science.

In the cutthroat world of academia, the primary measure of success is the number of studies a researcher publishes in prestigious journals. As a result, scientists are under pressure to spend the majority of their time obtaining the kinds of breakthrough results that are most likely to get published.

“While we value reproducibility in concept, we don't really value it in practice,” says Nosek, who is also co-director of the Center for Open Science, a nonprofit technology startup that works to foster transparency and reproducibility in scientific research.

“The real incentives driving my behavior as a scientist are to innovate, make new discoveries and break new ground—not to repeat what others have done. That's the boring part of science.”

Scientists also see few incentives to provide the information necessary for others to replicate their work, which is one of the primary reasons why the claims of so many studies remain unverified.

“I am not rewarded for making my data available or spelling out my methodology in any more depth than what is required to get into a publication,” says Nosek.

Many journals do ask scientists to provide a detailed explanation of their methods and to share data, but these policies are rarely enforced and there are no universal publication standards.

“If I knew there were never going to be any cops on the roads, would I always stick to the speed limit? No—it's human nature,” says Ivan Oransky, co-founder of Retraction Watch, an organization that promotes accountability and transparency by tracking retractions in scientific literature. “If you know nobody is going to sanction you, then you are not going to share data.”

Scientists who want to conduct replication work and are able to obtain experimental details are then unlikely to find funding from public agencies such as the NIH, which primarily judges grant applications on novelty and innovation.

“The odds are clearly against replication,” says Ioannidis.

That's where the emerging field of meta-research can step in. Organizations like the Center for Open Science and the Meta-Research Innovation Center at Stanford (METRICS) are working to help realign the reward system and set stringent universal standards that will encourage more widespread transparency and reproducibility practices.

“If the funding levels or promotion depended on what happened to your prior research—if it was replicable, if people could make sense of it, if people could translate it to something useful rather than just how many papers did you publish—that would be a very strong incentive toward changing research to become more reproducible,” says Ioannidis, who is co-director of METRICS.

“I am hopeful that these indicators will improve,” he adds. “And for some of them, there is no other possibility but to go up, because we start from zero.”
