Truths, lies, and statistics

Truths, lies, and statistics – free full-text article /PMC5723807/ – Oct 2017

I’ve long noticed that research on pain and opioids has become ridiculously biased to support the “opioids are evil” narrative, measuring milligrams of opioid instead of patient outcomes. (See Opioids Blamed for Side-Effects of Chronic Pain)

While there is evidence of ongoing research misconduct in all its forms, it is challenging to identify the actual occurrence of research misconduct, which is especially true for misconduct in clinical trials.

Research misconduct is challenging to measure and there are few studies reporting the prevalence or underlying causes of research misconduct among biomedical researchers.

There have been efforts to measure the prevalence of research misconduct; however, the relatively few published studies are not freely comparable because of varying characterizations of research misconduct and the methods used for data collection

There are existing resources to assist in ensuring appropriate statistical methods and preventing other types of research fraud. These include

  • the “Statistical Analyses and Methods in the Published Literature”, also known as the SAMPL guidelines, which help scientists determine the appropriate method of reporting various statistical methods;
  • the “Strengthening Analytical Thinking for Observational Studies”, or STRATOS, which emphasizes execution and interpretation of results; and
  • the Committee on Publication Ethics (COPE), which was created in 1997 to deliver guidance about publication ethics. COPE has a sequence of views and strategies grounded in the values of honesty and accuracy.

The goal of medical research should be to determine scientific truth regarding a treatment, exposure, or outcome.

Clinicians rely upon peer-reviewed, published literature to improve patient care and continue to make informed treatment decisions while considering the increasing complexity in medical care including new treatments, procedures, guidelines and related concerns

Research, surgical or otherwise, must have sound design, execution, and analysis to be considered quality. Study design and execution can often have shortcomings, which impact the quality of research. Examples of these include the use of a poor comparison treatment, lack of blinding, poor randomization, and small sample size.

Many criteria can assist in the identification of quality evidence, which are employed when creating treatment guidelines

This was definitely not how the CDC guidelines were created. Its authors made strong recommendations from very little and very weak evidence.

The development of a statistical analysis plan,

  • adherence to that plan,
  • the specific statistical analysis used,
  • decisions made during analysis,
  • assumptions made regarding the data, and
  • subsequent results

are influenced by factors including

  • quality of data,
  • appropriate choice and implementation of statistical analysis methods,
  • rigorous evaluation of the data, and
  • truthful interpretations of the results

“Truthful” is no longer an American value like it used to be. Much of science is now indirectly profit-driven due to the potential financial windfall when the results of some study can swing medical product stock values by millions of dollars.

Biomedical research has become increasingly complex, particularly in surgical clinical research

It is still relatively common to find incorrect statistical evaluations performed for the given study design and/or type of data. Basic parametric tests continue to be used often, even though most data are not normally distributed
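To make the normality point concrete, here is a minimal sketch of a check a reader could run before trusting a parametric test. It uses the Jarque-Bera statistic (my choice for illustration, not something the article prescribes) because it needs only the Python standard library. The function name and the simulated data are assumptions, and the p-value is an asymptotic approximation that is unreliable for small samples.

```python
import math
import random


def jarque_bera(values):
    """Return (JB statistic, asymptotic p-value) for a sample."""
    n = len(values)
    mean = sum(values) / n
    # Central moments in population form, as the JB statistic requires.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)
    # Under normality JB is approximately chi-square with 2 degrees of
    # freedom, whose survival function is exactly exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


# Hypothetical, simulated right-skewed data (for illustration only).
rng = random.Random(0)
skewed_sample = [rng.lognormvariate(0.0, 1.0) for _ in range(500)]
jb_stat, p = jarque_bera(skewed_sample)
print(f"JB = {jb_stat:.1f}, p = {p:.3g}")
```

A small p-value here suggests the data are not normally distributed, in which case a rank-based (non-parametric) test or a transformation is usually the safer choice.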

A review of 91 published comparative surgical papers found that most (78%) contained potentially meaningful errors in the application of analytical statistics. Common errors included not performing a test for significance when indicated, providing p-values without reference to a specific comparison, and inappropriate application of basic statistical methods

Another study assessing 100 orthopedic surgery papers reported

  • 17% of the results did not support the overstated conclusions and
  • 39% performed the incorrect analysis altogether

I think these “overstated conclusions” are usually motivated by the financial potential of some new discovery: a new drug or device.

Reviews of other peer-reviewed literature found that approximately half of clinical research papers have one or more statistical errors, a few of which influenced results and interpretation of study findings

The goal of this paper is to describe the common statistical errors in published literature and how to avoid and detect these errors.

Research misconduct

Research misconduct, often referred to as fraud, can encompass a range of activities including:

  • fabrication of data or results,
  • plagiarism of ideas or text without giving appropriate credit,
  • falsification of research methods or results (e.g., omitting data or significantly deviating from the research protocol), and
  • manipulation of the peer review process

Research misconduct generally does not include unintentional errors, but rather intentional misrepresentations of research data, processes, and/or findings

Research practices that are indicators of potential research misconduct, if not research misconduct itself:

Study design and data collection   

  • Inappropriate design
  • Distortion of design
  • Carelessness or incompetence
  • Fabrication of data
  • Not following protocol for safety or informed consent
  • Not obtaining institutional review board approval prior to starting the study

Analytical methods   

  • Falsification of data
  • Improper analysis
  • Multiple comparisons not reported or adjusted
  • Misrepresentation of statistical methods/analysis
  • Post Hoc analyses not identified
  • Exclusion of outliers


  • Selective reporting
  • Failure to publish/agreement not to publish
  • Over interpretation of results
  • Study weaknesses not described
  • Duplicate or near identical publications
  • Undisclosed or meaningful conflicts of interest
  • Will not provide raw data

It is difficult to identify the true prevalence of research misconduct, particularly in clinical trials.

There have been some attempts to quantify the prevalence of research misconduct; however, the different studies are not readily comparable due to varying definitions of research misconduct.

It is logical to believe that individuals at certain career levels may have a higher likelihood of committing research misconduct (e.g., to publish in order to obtain tenure).

The largest problem in identifying the prevalence of research misconduct is the response bias toward under-reporting, even in anonymous surveys.

The overall prevalence of non-self-reported research misconduct was slightly higher than 14%.

Detection of misrepresentations, including inadvertent error and intentional misconduct, should result in a retraction of the published article(s). This is uncommon, with retracted articles estimated at 0.07%.

Many of these retracted articles were then cited by 1,893 subsequent articles after the retraction was made, with nearly all citations being either explicitly (14.5%) or implicitly positive (77.9%) and only a small proportion (7.5%) acknowledging the retraction

A recent study evaluating retractions specifically in orthopedic research found that the most common reasons for retraction were fraud (26.4%) and plagiarism (22.7%).

There are multiple studies which demonstrate that retraction rates are increasing. An article evaluating retractions between 2001 and 2011 reported a 10 fold increase in retractions for that time period

Unintentional statistical misrepresentations

In many specific instances it may be difficult or impossible to identify whether or not an issue with a study was intentional and therefore considered research misconduct, or unintentional and attributed to an honest error, difference of opinion, or other benign cause

Detecting and avoiding statistical untruths

There are multiple resources to assist in avoiding inappropriate statistical procedures and presentation.

One of these is the “Statistical Analyses and Methods in the Published Literature”, also known as the SAMPL guidelines, which assist researchers in properly reporting various statistical methods

Clinical research requires self-policing and holding peers to high rigorous standards in order to maintain credibility

Identification of research misconduct often comes from many sources including: research peers, reviewers, IRB auditors, and even study participants.

There are also some common statistical signs which may indicate potential research misconduct

Statistical signs indicating potential research misconduct

  • Employing incorrect statistical test
  • Oversimplification of analyses
  • Exclusion of data
  • Exploratory analyses
  • Multiple tests performed but few statistically significant
  • Pattern of effect size inconsistent
  • P values not adjusted for multiple comparisons
  • Low statistical power/high type II error
  • Incorrect identification of the study design
  • Did not follow an a priori analysis plan
  • Performing only one-sided tests for statistical significance without justification
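One item on this list, P values not adjusted for multiple comparisons, is easy to check for yourself. Below is a minimal sketch of the Holm (step-down Bonferroni) adjustment. The article does not name a specific method, so this one is my assumption, and the raw p-values are made up for illustration.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, etc.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical p-values from four comparisons in one study.
raw = [0.01, 0.04, 0.03, 0.005]
for p_raw, p_adj in zip(raw, holm_adjust(raw)):
    print(f"raw p = {p_raw:.3f} -> Holm-adjusted p = {p_adj:.3f}")
```

In this made-up example all four raw values are below 0.05, but only two remain below 0.05 after adjustment, which is exactly the kind of difference an unadjusted report hides.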

An evaluation of 240 surgical peer-reviewed publications found that meaningful proportions of studies had one or more signs of potential research misconduct.

Of these publications,

  • 60% used rudimentary parametric statistics with no test for normality reported,
  • 21% did not report a measure of central tendency (mean, median or mode) for the primary measures, and
  • 10% did not identify the type of statistical test used to calculate a P value

Exclusion of data and treatment of outliers

Protocol failure, testing error, lab error, or equipment failure are unfortunately common in research and are all reasonable reasons to exclude data, given that they are documented sources of erroneous data

It is wise to have a set definition for what constitutes an outlier (e.g., more than 3 standard deviations from the mean, or more than 1.5 times the IQR beyond the quartiles) before analysis and to clearly include this information in the methods section
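As a rough sketch of why the rule has to be fixed in advance, here are the two definitions mentioned above applied to the same made-up data. The function names, the thresholds as defaults, and the choice of the "inclusive" quartile method are my assumptions. The functions only flag values for review; they do not justify excluding anything.

```python
import statistics


def flag_outliers_iqr(values, k=1.5):
    """Flag values more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in values if x < low or x > high]


def flag_outliers_sd(values, k=3.0):
    """Flag values more than k sample standard deviations from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [x for x in values if abs(x - mean) > k * sd]


# Hypothetical measurements with one suspicious value.
measurements = [10, 12, 11, 13, 12, 11, 10, 12, 13, 40]
print("1.5 x IQR rule flags:", flag_outliers_iqr(measurements))
print("3 SD rule flags:     ", flag_outliers_sd(measurements))
```

On this data the IQR rule flags the value 40 while the 3 SD rule flags nothing, because the extreme value inflates the standard deviation. A researcher who picks the rule after seeing the data can get whichever answer is convenient.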

Analysis plans are becoming more commonly available through secondary sources or are requested by journal editors prior to publishing an article

Appropriate explanation for excluding each datum should result from a documented protocol deviation or lab error, not merely data points which are “beyond what is expected” (e.g., 2 standard deviations above the mean).

A study looking at research misconduct concluded that more than a third (33.7%) of surveyed researchers admitted to poor research methodologies indicative of research misconduct, including exclusion of one or more data points because of a “gut feeling that they were inaccurate” and misleading or selective reporting of study design, data or results

Detection of fraud

There are three distinct types of fraud detection: oversight by trial committees, on-site monitoring, and central statistical monitoring.

  1. The oversight by trial committee members is best for preventing flaws in the study design as well as the interpretation of results
  2. On-site monitoring is useful for ensuring that no procedural errors occur during data collection at participating centers
  3. Lastly, the statistical monitoring is essential for eliminating data errors as well as incidence of faulty equipment or sloppiness.


Research misconduct may originate from many sources. It is often difficult to detect and little is known regarding the prevalence or underlying causes of research misconduct among biomedical researchers.

And below is an overview of another similar article, of which I found many on PubMed. This issue is clearly a problem and I think it’s definitely corrupting most studies about pain and opioids.

Guidelines for standardizing and increasing the transparency in the reporting of biomedical research – free full-text article /PMC5594115/ – Aug 2017

“Because what is ultimately at stake is public trust in science”

I think that’s been destroyed by the myriad financial interests in the outcomes of studies and by cultural myths, which together create tremendous bias that taints most studies.

As with any kind of trust, it is easily broken and almost impossible to recover completely.

Scientific scrutiny of the ‘evidence’ in medical research, in order to improve its scientific reliability, shows that most medical research findings and their interpretations are false and lead to a colossal waste of research resources.

Over the years of my research for blogging I’ve seen how vague measurements of pain and the inappropriate use of opioid dosages as endpoints have been twisted to accommodate the “opioids are evil” myth.

Among all the research practices suggested to improve the reliability of medical research, improving and standardizing reporting is one of the key elements.

Efforts have been made to standardize reporting of the studies conducted, in user-friendly checklist formats, to increase reliability in reporting research. These guidelines have now been adopted by many medical journals and funding organizations, resulting in increased transparency of clinical study reporting in journals that have adopted them

The current article is based primarily on these guidelines to inform and empower the clinical researchers about biomedical research reporting.

Guidelines to standardize and increase transparency in reporting biomedical literature

The “Enhancing the Quality And Transparency of health Research” (EQUATOR) network has published more than 350 guidelines for Health Research Reporting


It has been reported that around one third of the Randomized Controlled Trials (RCT) did not report the method of randomization

Selection of study subjects, or allocating study subjects to a particular group, is an important step to eliminate ‘selection bias’ in any biomedical research. It is important to identify and report the target population and then the sampling or random allocation technique. On the other hand, the term randomization should not be used loosely. It should be mentioned only if some randomization technique, such as computer-generated random numbers, is used.
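For readers wondering what "computer-generated random numbers" looks like in practice, here is a minimal sketch of permuted-block randomization, one common technique (my choice for illustration; the article does not specify one). The arm names, block size, and seed are assumptions. A real trial would also need allocation concealment, which code alone cannot provide.

```python
import random


def block_randomize(n_participants, arms=("treatment", "control"),
                    block_size=4, seed=2017):
    """Assign participants to arms in shuffled, balanced blocks."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    # A fixed seed makes the allocation sequence reproducible and reportable.
    rng = random.Random(seed)
    allocation = []
    while len(allocation) < n_participants:
        # Each block contains every arm equally often, in random order.
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        allocation.extend(block)
    return allocation[:n_participants]


schedule = block_randomize(20)
print(schedule)
print("treatment:", schedule.count("treatment"),
      "control:", schedule.count("control"))
```

Reporting the method, the block size, and how the sequence was generated is what lets a reader verify that "randomized" really means randomized.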

So many “random” samples are unintentionally selected based on some other unanticipated variable:

From a 1970s observational study we learned that “oat bran reduces cholesterol levels”. But no one considered that people who were eating oat bran were probably healthier and more active in general, and those factors may have influenced their cholesterol levels much more than “eating oat bran”.

Clinical versus statistical significance

The P value tells us about statistical significance, whereas the effect size reveals clinical significance. A smaller P value does not mean that there is a high correlation or strong association. Similarly, a study with a large sample size may find a small effect size to be statistically significant. Misinterpreting and misreporting P values, and placing too much emphasis on P values less than 0.05 when the effect size is clinically meaningless, should be avoided. Effect sizes should be reported along with their 95% confidence intervals.
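Here is a small sketch of how sample size alone can turn a clinically trivial difference into a "significant" one. All numbers are made up: a 0.1-point difference on a 0 to 10 pain scale with a standard deviation of 2. The function name is hypothetical, and it uses a simple large-sample z approximation with equal group sizes and a common SD, which is an assumption rather than a recommended analysis.

```python
import math
from statistics import NormalDist


def compare_means(mean_a, mean_b, sd, n_per_group):
    """Difference in means with 95% CI, two-sided p-value, and Cohen's d."""
    diff = mean_a - mean_b
    # Standard error of a difference between two equal-sized groups.
    se = sd * math.sqrt(2.0 / n_per_group)
    z = diff / se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    z_crit = NormalDist().inv_cdf(0.975)
    ci95 = (diff - z_crit * se, diff + z_crit * se)
    cohens_d = diff / sd  # standardized effect size
    return {"diff": diff, "ci95": ci95, "p": p_value, "d": cohens_d}


# Same hypothetical 0.1-point difference, two very different sample sizes.
for n in (50, 20000):
    r = compare_means(5.0, 4.9, sd=2.0, n_per_group=n)
    print(f"n = {n:>5} per group: diff = {r['diff']:.2f}, "
          f"95% CI = ({r['ci95'][0]:.2f}, {r['ci95'][1]:.2f}), "
          f"p = {r['p']:.2g}, d = {r['d']:.2f}")
```

The effect size is identical in both runs; only the P value changes. That is why a P value on its own says nothing about whether a patient would notice the difference.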

Misinterpretation of the results

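Researchers frequently interpret odds ratios (OR) as risk ratios (RR) and report them as such. OR and RR are two different entities. The OR always lies further from 1 than the RR, so reading an OR as an RR exaggerates the association, especially when the outcome is common; the two are close only when the outcome is rare. OR are reported in case-control studies and logistic regression models, while RR are reported in cohort studies.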
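The difference between the two measures is easy to see with a 2x2 table. This is a minimal sketch with made-up counts; the function and argument names are my own.

```python
def risk_ratio(a, b, c, d):
    """a, b: exposed with/without outcome; c, d: unexposed with/without."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Same 2x2 layout as risk_ratio."""
    return (a * d) / (b * c)


# Hypothetical common outcome: 60 of 100 exposed vs 30 of 100 unexposed.
common = (60, 40, 30, 70)
# Hypothetical rare outcome: 2 of 1000 exposed vs 1 of 1000 unexposed.
rare = (2, 998, 1, 999)

for label, table in (("common outcome", common), ("rare outcome", rare)):
    print(f"{label}: RR = {risk_ratio(*table):.2f}, "
          f"OR = {odds_ratio(*table):.2f}")
```

In both tables the risk in the exposed group is exactly double, but with the common outcome the odds ratio is 3.5. Reporting that 3.5 as if it were a risk ratio would make the exposure look far more dangerous than it is.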


Comprehensive and transparent reporting makes research credible and reproducible, and helps reduce research wastage.
