Want More Trust in Medical Science? Embrace Uncertainty and Cut the Hype – John Mandrola, MD – April 2017
…in this week of righteous celebration of science, certainty will be favored over uncertainty, as will acceptance over skepticism.
This, I believe, is a core reason science has a trust problem. An old mentor warned me of the danger of hubris. Hubris, he said, was the doctor’s greatest enemy.
I see a lot of overconfidence in medical science.
Lack of Skepticism Among Clinicians
At the bedside, clinicians—myself included—underestimate harms and overestimate benefits of medical intervention. These inaccuracies have many causes.
One is a lack of skepticism. It was 10 years into practice before I learned that most of a study’s bias comes in its planning, in the questions it asks.
Rarely do I hear a practicing colleague or speaker at a medical meeting cite Dr John Ioannidis’s famous 2005 paper “Why Most Published Research Findings Are False.”
Ioannidis, a Stanford epidemiologist, argues that
- small sample sizes,
- tiny treatment effects,
- “flexible” study designs (which can transform “negative” into “positive” results),
- prestudy biases, and
- conflicts of interest
are the root causes of false research findings. Research findings, he argues, may simply be an accurate measure of the prevailing bias.
In that same year, he published a paper in JAMA comparing the results of 49 highly cited articles against later studies of larger sample sizes and similar design. He found that
- only 44% of the study results were replicated;
- 16% were contradicted, and
- another 16% had overestimated treatment effects.
Richard Horton, the editor of the Lancet, agrees with Ioannidis. In 2015, he wrote that “much of the scientific literature, perhaps half, may simply be untrue.”
One of the (many) reasons for this crisis, Horton adds, is that “in their quest for telling a compelling story, scientists too often sculpt data to fit their preferred theory of the world.”
Worship of the P value is also a problem.
Pharmacologist and statistician David Colquhoun warns that “if you use P=0.05 to suggest that you have made a discovery, you will be wrong at least 30% of the time.”
He adds: “If, as is often the case, experiments are underpowered, you will be wrong most of the time.” This year, Nobel laureate Daniel Kahneman (author of Thinking, Fast and Slow) admitted placing too much faith in underpowered studies.
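The arithmetic behind Colquhoun's warning can be sketched in a few lines. This is a minimal illustration, not his published method: the function name and the inputs (a 10% prior chance that a tested hypothesis is true, and power of 80% or 20%) are assumptions chosen for the example.

```python
def false_discovery_rate(prior, power, alpha=0.05):
    """Share of 'significant' results that are false positives.

    prior: assumed fraction of tested hypotheses that are really true
    power: chance a study detects a real effect
    alpha: significance threshold (P = 0.05 here)
    """
    false_positives = alpha * (1 - prior)   # null is true, test says "significant"
    true_positives = power * prior          # effect is real, test detects it
    return false_positives / (false_positives + true_positives)


# Assumed inputs for illustration: 10% of tested ideas are true.
for power in (0.8, 0.2):
    fdr = false_discovery_rate(prior=0.1, power=power)
    print(f"power={power:.1f}: {fdr:.0%} of 'positive' findings are false")
```

Under these assumed inputs, a well-powered study is wrong about a third of the time it claims a discovery, and an underpowered one is wrong more often than it is right. Different priors give different numbers, which is part of the point.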
Then there is the vast divide between statistically significant and clinically significant.
The most recent blockbuster, the proprotein convertase subtilisin/kexin type 9 (PCSK9) inhibitor evolocumab (Repatha, Amgen), highlights this issue. If it takes over 27,500 patients for this lipid-lowering drug to show a 1.5% reduction in nonfatal events, and no mortality benefit, is this clinically important?
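One way to weigh that question is the number needed to treat (NNT), the reciprocal of the absolute risk reduction. The sketch below assumes the 1.5% figure is an absolute reduction over the trial period; the function name is hypothetical.

```python
def number_needed_to_treat(absolute_risk_reduction):
    """Patients who must be treated for one of them to avoid an event."""
    return 1 / absolute_risk_reduction


# Assumption: the 1.5% reduction quoted above is absolute, not relative.
arr = 0.015
nnt = number_needed_to_treat(arr)
print(f"About {nnt:.0f} patients treated to prevent one nonfatal event")
```

On that assumption, roughly 67 patients take the drug for the length of the trial so that one of them avoids a nonfatal event, and none of them lives longer.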
Publication bias is another trust-in-the-evidence problem I learned of later in my career.
In a highly cited study published in the New England Journal of Medicine, Dr Erick Turner and colleagues showed that positive drug trials (of antidepressant medications) were much more likely to be published. Of 74 FDA-registered studies, they found that 37 of 38 positive trials were published, while negative or questionable results were, with three exceptions, either not published (22 studies) or published in a way that conveyed a positive outcome (11 studies).
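The counts above are enough to see how much rosier the published record looks than the full registry. The tally below uses only the numbers quoted in this paragraph; the variable names are mine.

```python
# Counts quoted above from Turner and colleagues.
registered_total = 74
positive_total = 38
positive_published = 37
negative_published_as_negative = 3   # the "three exceptions"
negative_unpublished = 22
negative_published_as_positive = 11

# Sanity check: the negative or questionable trials account for the rest.
negative_total = (negative_published_as_negative
                  + negative_unpublished
                  + negative_published_as_positive)
assert positive_total + negative_total == registered_total

published = (positive_published
             + negative_published_as_negative
             + negative_published_as_positive)
appears_positive = positive_published + negative_published_as_positive

literature_share = appears_positive / published
registry_share = positive_total / registered_total
print(f"Published trials that read as positive: {literature_share:.0%}")
print(f"Registered trials that were positive:   {registry_share:.0%}")
```

A reader of the journals alone would see 48 of 51 trials looking positive, while the registry holds 38 positive trials out of 74.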
Ross and colleagues analyzed more than 600 NIH-funded trials and reported that a third of trials remained unpublished more than 4 years after completion. Not surprising is that nonpublication of trials is more common among those with industry funding. Selective publication deserves attention because not knowing about unpublished negative studies leads to overconfidence in our therapies.
The Trust Problem With Overselling Medical Science
Bias and hype from medical researchers are also a problem.
Science does not do itself. Humans—bent on having a successful academic career—do science. This means positive results can become the goal rather than the pursuit of scientific truth.
Statistician Thomas Fleming noted that the “psychological influence of the driving goal to establish benefit is sufficiently subtle that it often is not recognized by investigators who have this goal.” He pointed to a review of trial protocols in which bias came through in the language of the written objectives.
Rather than write that a study’s objectives are “to determine whether the experimental intervention is safe and effective,” many authors, he observed, write objectives in this way: “to establish the experimental intervention is safe and effective.”
Many of the same medical scientists who garner fame doing clinical trials also write practice guidelines.
Guidelines are too often infected with hubris.
The sordid story of dronedarone’s (Multaq, Sanofi) rise in atrial-fibrillation treatment guidelines exposes serious reliability issues with these important documents. In a study published in JAMA Internal Medicine, Italian researchers applied the Grading of Recommendations Assessment, Development and Evaluation (GRADE) method to the evidence base of dronedarone.
This assessment of the evidence revealed excess deaths, adverse effects, and inefficacy of arrhythmia suppression with dronedarone. The Italian investigators also found that the current AF treatment guidelines were lacking in five of eight quality domains set out by the Institute of Medicine.
Screening evangelists are another group of academics who threaten the public trust.
Screening is precarious because it puts doctors close to breaking the golden rule—first, do no harm. Doing things to people-without-complaints and promoting the slogan “screening saves lives” should require clearing the highest bar of evidence. The truth, though, is that the evidence does not support such zealous advocacy.
A systematic review of meta-analyses and randomized clinical trials that studied screening of asymptomatic adults for 19 diseases (39 tests, including mammography) found that reductions in disease-specific mortality were uncommon and reductions in all-cause mortality were very rare or nonexistent.
The late Dr David Sackett, the father of evidence-based medicine, famously wrote that preventive medicine displayed all three elements of arrogance: it was aggressively assertive, presumptuous, and overbearing. It’s hardly controversial to posit that less arrogance would improve confidence in medical science.
Can Journal Editors Help?
Three areas in need of improvement in medical journals are
- attention to the limitations of studies,
- peer review, and
- consistent reporting of results.
I would propose that authors have to list in the abstract—before the conclusions—a paragraph on limitations. Too many readers of the scientific literature never get to the full paper; as such, they don’t see a study’s limitations. Press releases from journals should also include the limitations of a study.
Peer review is another area in need of improvement. A Google search of “peer review” churns out a litany of articles describing how it is broken. Among the many problems, anonymous peer reviewers now have no incentive, other than goodwill or malice, to do a good job. As a nonacademic, I don’t claim to have the answer to fix peer review, but it’s hard to see how a more transparent process could not help.
Another easy improvement editors could insist on is a consistent presentation of results.
It crushes trust when reading a study that expresses benefits in relative terms and harms in absolute terms. Clinicians caring for patients deserve to read about both benefits and harms of therapies in absolute—not relative—terms.
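A small sketch shows why mixed framing misleads. The event rates below are hypothetical, chosen only to make the arithmetic easy, and the function name is mine.

```python
def risk_summary(control_rate, treated_rate):
    """Describe one treatment effect in relative and absolute terms."""
    absolute_reduction = control_rate - treated_rate
    return {
        "relative_risk_reduction": absolute_reduction / control_rate,
        "absolute_risk_reduction": absolute_reduction,
        "number_needed_to_treat": 1 / absolute_reduction,
    }


# Hypothetical trial: events fall from 2% on placebo to 1% on the drug.
benefit = risk_summary(control_rate=0.02, treated_rate=0.01)
print(f"Relative: {benefit['relative_risk_reduction']:.0%} fewer events")
print(f"Absolute: {benefit['absolute_risk_reduction']:.1%} fewer events")
print(f"Number needed to treat: {benefit['number_needed_to_treat']:.0f}")
```

In this made-up case, "cuts events in half" and "helps 1 patient in 100" describe the same result. A paper that reports the benefit the first way and a 1% harm the second way makes the drug look far better than a like-for-like comparison would.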
My view of medical science is similar to my view of patient care.
- Slow and conservative is better.
- Don’t oversell.
- Share the data more than the interpretation of the data.
Consumers of medical science, patients and clinicians alike, will make better decisions with more data and less hype.
We can handle the truth: science is supposed to move slowly.