Wednesday, August 29, 2018

Scientific Fake News: what is it exactly?



Improper information has always existed. By becoming a popular term, the expression "fake news" has alerted people and brought some useful skepticism, which is not a natural feature of the human mind.

Evolutionarily speaking, the human mind has evolved over 200,000 years of history to believe. In fact, evolutionary psychology claims that our unique ability to fantasise about abstract phenomena is what allowed our species to prevail.

As Francis Bacon once stated, "the human mind is more excited by affirmatives than negatives". And this was recently demonstrated on Twitter, in a study published in Science: "fake news spreads faster than true news".

Although we have evolved technologically, and science is at the core of this evolution, the human mind has not had enough time to evolve from fantasy to skepticism. The last 500 years were not enough to outrun 200,000 years of evolution. Biologically, we are believers.

The root of scientific thinking is skepticism. In science, we must have a method to overcome our predisposition to believe. This method is called the null hypothesis: we start by not believing and only switch to the alternative of belief after strong evidence, not explained by chance or bias, rejects the null. Being skeptical is tiresome and sometimes boring.

This is at the center of a scientific problem: the lack of reproducibility, well described by Ioannidis in his popular PLoS Medicine article: "most published research findings are false". And we believe them.

The term "fake news" became popular two years ago and served as an alert for people, before becoming unpopular for political reasons. 

With a correct understanding of its meaning, the term "scientific fake news" can help against the problem of scientific reproducibility. But first, we must differentiate "scientific fake news" from "fake news".

Fake news is created by a person or a small group of people with a common interest. Scientific fake news is created by a defective system: the creators are not alone; peer reviewers, editors, societies and readers have to approve it and then spread the message with enthusiasm. And they may do it with good intentions.

Fake news has a creator who knows the news is fake. In scientific fake news, the creator believes in the message, a belief reinforced by his or her confirmation bias.

Fake news has a creator with poor personal integrity. In scientific fake news, the creator's failure is one of scientific integrity, mediated biologically by cognitive bias.

Fake news does not have empirical evidence; scientific fake news has experimental evidence that falsely suggests credibility.

Fake news is easily dismissed. Scientific fake news may take years to dismiss. It is responsible for the phenomenon of medical reversal, in which improper information drives medical behaviour for years, only to be reversed after true and stronger evidence emerges. That was the case for medical therapies that were incorporated into practice, such as Xigris for sepsis, hypothermia after cardiac arrest, beta-blockers for non-cardiac surgery, and so on...

In his seminal article on medical reversal, Vinay Prasad wrote that "we must raise the bar" before adopting medical technologies.

And the last difference: Donald Trump loves the term "fake news", but has no idea what "scientific fake news" means.

Well, it is not that scientific fake news is totally naive; conflicts of interest also mediate it. But the main conflict comes from positivism bias, meaning that authors, editors and readers all prefer positive studies over negative studies.

Following Kahneman and Tversky's description of the human mind's cognitive biases under uncertainty, Richard Thaler came up with the solution of nudging human behaviour. A nudge is an intervention that changes behaviour unconsciously, which may be more effective than rational arguments.

For example, to keep people from cheating on tax returns, instead of explaining how important it is to pay taxes, a nudge would simply state that "most people fill in their returns accurately". This was among the most effective interventions to improve behaviour in the UK.

In the case of science, the expression "fake news" is so strong that it may act as a nudge towards scientific integrity. Yes, it may sound politically incorrect, but it is a disruptive nudge. Merely speaking about bias and chance has not been enough, as Marcia Angell recently implied: it is "no longer possible to believe much of the clinical research that is published".

Maybe we are not in a crisis of scientific integrity. Actually, I think this type of discussion is becoming more frequent, and we should be optimistic.

But a nudge may accelerate the process: before reading any article, we should make a critical appraisal of our internal beliefs and ask ourselves: on this specific subject, am I especially vulnerable to believing "scientific fake news"?

Monday, August 27, 2018

SCOT-HEART Trial: how to spot scientific fake news at a glance




A great example of scientific fake news was just presented at the ESC Congress and simultaneously published in the NEJM: the SCOT-HEART trial.

I use this example to show that the reading of an article starts before the traditional process. A pre-reading should bring us the critical spirit necessary for the reading process. During pre-reading we begin to develop a vision of the whole, as if we were looking at a city from the airplane window.

Then we'll land the plane and start reading to assess details.

The pre-reading of an article is composed of two questions: first, does the hypothesis make sense and should this study have been carried out? (pre-test probability of the idea = plausibility + previous studies); second, is the result too good to be true (effect size)?

In the pre-reading process, we should avoid flooding our heads with details. We only need to identify the tested hypothesis and the main result. By reading just the conclusion of the article we get this information, which should be accompanied by a look at the line of the results that presents the main numbers, in order to get a notion of the effect size (it takes 30 seconds).

In the case of SCOT-HEART trial:

 "CTA in addition to standard care in patients with stable chest pain resulted in a significantly lower rate of death from coronary heart disease or nonfatal myocardial infarction at 5 years than standard care alone.

The 5-year rate of the primary end point was lower in the CTA group than in the standard care group (2.3% [48 patients] vs. 3.9% [81 patients]; hazard ratio, 0.59; 95% confidence interval [CI ], 0.41 to 0.84, P = 0.004)."

From these two sentences, we identify the hypothesis tested: that the use of tomography in patients with stable chest pain reduces cardiovascular events. What is the pre-test probability of this idea?

There is some plausibility, to the extent that anatomical information can modify the therapeutic behaviour of physicians and thereby modify outcomes. Regarding previous evidence, the PROMISE study randomised 10,000 patients to tomography versus noninvasive functional evaluation and was negative for cardiovascular outcomes. The PROMISE control group is not exactly the same as SCOT-HEART's, but indirectly the result of that study lowers the pre-test probability of the SCOT-HEART hypothesis. Therefore, I would say that the pre-test probability is low, but not zero, which keeps the study justified.

Then comes the second question: is the effect size too good to be true? Note that the CT scan promoted a 41% relative hazard reduction. This magnitude of effect is typical of beneficial treatments. It is important to note that the effect size of a test will always be much smaller than that of a treatment, since with a test there are many more steps between the intervention and the outcome.
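To make the numbers explicit, the 41% figure and its confidence interval are simple arithmetic on the hazard ratio quoted above; here is a minimal sketch in Python, just restating the reported result:

    # Relative hazard reduction implied by the reported result (HR 0.59, 95% CI 0.41 to 0.84)
    hr, ci_low, ci_high = 0.59, 0.41, 0.84
    relative_reduction = 1 - hr                 # 0.41, i.e. the "41% relative hazard reduction"
    reduction_ci = (1 - ci_high, 1 - ci_low)    # (0.16, 0.59), i.e. 16% to 59%
    print(relative_reduction, reduction_ci)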

In the case of a clinical trial testing efficacy of a test, the following steps are necessary before the benefit occurs:

The examination is done on all patients; a portion of them has a result that may prompt the physician to improve the patient's treatment; in a sub-portion of these patients the physician actually enhances the treatment; and a sub-sub-portion of those patients benefits from the treatment improvement. Therefore, we should expect the magnitude of the clinical effect of a test to be much lower than that of a treatment.
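As a toy illustration of why this cascade shrinks the effect, the expected proportion of tested patients who can benefit is the product of the probabilities along the chain. The numbers below are invented purely for illustration and are not taken from SCOT-HEART:

    # Invented, purely illustrative probabilities along the test-to-benefit cascade
    p_result_changes_plan = 0.30   # the test result suggests a change in management
    p_plan_is_followed    = 0.50   # the physician actually changes the treatment
    p_benefit_from_change = 0.20   # the patient benefits from the treatment change
    p_benefit_from_test = p_result_changes_plan * p_plan_is_followed * p_benefit_from_change
    print(p_benefit_from_test)     # 0.03: only 3% of tested patients could possibly benefit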

In this way, we conclude that the SCOT-HEART result is too good to be true. It would be extraordinary for a test to promote such an effect size. As Carl Sagan said, "extraordinary claims require extraordinary evidence". Is the quality of this trial extraordinary?

Now let's read the article, looking for problems that could explain such an unusual finding: a 41% relative reduction in hazard from performing an exam.

The first point that draws attention is the minimal difference in treatment modification promoted by the CT scan versus the control group. There was no difference in revascularisation procedures. Regarding preventive therapies such as statins or aspirin, the difference between the two groups was only 4 percentage points (19% versus 15%).

The CT scan group had 2,073 patients; 2,073 × 4% ≈ 83, so the CT scan group had roughly 83 additional patients with improved therapy relative to the control group.

The number of events prevented in the CT group (relative to the control group) was 33.

Thus, enhancing the drug therapy of 83 patients prevented 33 clinical outcomes. If we were to evaluate the treatment performed at the end of the cascade, the NNT would be about 2.5: something unprecedented, which almost no real treatment is able to achieve, let alone a test.
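A minimal sketch of this back-of-the-envelope calculation, using only the figures already quoted above:

    # Implied NNT for the "improved therapy" subgroup, from the numbers quoted in the text
    n_cta_group      = 2073                  # patients in the CTA group
    extra_treated    = n_cta_group * 0.04    # ~83 extra patients on preventive therapy (19% vs 15%)
    events_prevented = 81 - 48               # 33 fewer primary-endpoint events in the CTA group
    implied_nnt      = extra_treated / events_prevented
    print(round(extra_treated), events_prevented, round(implied_nnt, 1))   # 83 33 2.5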

This is definitely a false result.

Continuing the reading will serve to understand the mechanisms that generated this false result.

"There were no trial-specific visits, and all follow-up information was obtained from data collected routinely by the Information and Statistics Division and the electronic Data Research and In- novation Service of the National Health Service (NHS) Scotland. These data include diagnostic codes from discharge records, which were classified according to the International Classification of Dis- eases, 10th Revision. There was no formal event adjudication, and end points were classified primarily on the basis of diagnostic codes."

So the outcomes were obtained through review of electronic records, based on ICD codes and without adjudication by the authors. Second, the study was open-label, so ascertainment bias can happen. For example, knowledge of a normal CT scan may lead the doctor who records the ICD code to interpret a symptom as innocent, while in another patient, whose physician is unaware of the anatomy, the same symptom may prompt troponin measurement and a subsequent diagnosis of nonfatal infarction. This is just a potential explanation, which serves as an example.

In fact, we are never able to open the black box of the exact mechanism that prevailed in generating a bias. However, it should be borne in mind that the combination of an open-label study with an inaccurate method of outcome measurement leads to a high risk of bias.

One technique to explore the possibility of ascertainment bias is to compare the cause-specific death outcome (subject to ascertainment bias because of its subjectivity) with death from any cause (immune to this bias). Even though neither is a primary or statistically significant outcome, the comparison is worthwhile as an exploratory analysis. It is interesting to note that the hazard ratio is 0.46 for cardiovascular death and 1.02 (null) for death from any cause. In the absence of a substantial increase in non-cardiovascular death, this discrepancy suggests that the study is especially subject to ascertainment bias for subjective outcomes.

In addition, the study presents a high risk of random error, since it is underpowered. In fact, the sample size calculation was based on the premise of a 13% incidence of the outcome in the control group, but only 3.9% occurred. By my calculation, this reduced the intended statistical power of 80% to about 32%. As we know, small studies are more predisposed to false-positive results because of their imprecision.
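To illustrate the mechanism (not to reproduce the exact 32%, since the trial's original design assumptions are not given here), below is a rough two-proportion power sketch. The 20% relative risk reduction is a placeholder assumption; the group size of 2,073 is the CTA group size quoted above. Under these assumptions, power falls from roughly 74% at the planned 13% control event rate to roughly 28% at the observed 3.9%:

    from scipy.stats import norm

    def two_proportion_power(p_control, rel_reduction, n_per_group, alpha=0.05):
        # Approximate power of a two-sided two-proportion z-test
        p_treat = p_control * (1 - rel_reduction)
        p_bar = (p_control + p_treat) / 2
        se = (2 * p_bar * (1 - p_bar) / n_per_group) ** 0.5
        z_alpha = norm.ppf(1 - alpha / 2)
        return norm.cdf(abs(p_control - p_treat) / se - z_alpha)

    # Planned (13%) versus observed (3.9%) control event rates, same assumed effect
    for p_ctrl in (0.13, 0.039):
        print(p_ctrl, round(two_proportion_power(p_ctrl, rel_reduction=0.20, n_per_group=2073), 2))
    # prints approximately: 0.13 0.74 and 0.039 0.28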

This imprecision not only increases the probability of a type I error, but also prevents the study from measuring the size of the effect with any precision. That is, the 41% relative reduction in hazard came with a confidence interval ranging from 16% to 59%.

Finally, if we considered the information true, an analysis of applicability would be worthwhile. The hypothesis tested here is of a pragmatic nature: an intervention is done at the beginning, and we expect the physician to react in a way that benefits the patient. However, the protocol was designed to systematically influence physicians' behaviour.

"When there was evidence of nonobstructive (10 to 70%) cross-sectional luminal stenosis or obstructive coronary artery disease on the CTA, or when a patient had an ASSIGN score of 20 or higher, the attending clinician and primary care physician were prompted by the trial coordinating center to prescribe preventive therapies. "

This methodology reduces the external validity of the study, because we do not know whether, in the absence of this protocol-driven induction, doctors would act the same way. If the benefit were true, in practice it would be of smaller magnitude.

For studies of insufficient quality we should keep uncertainty in mind. But SCOT-HEART goes further: this study is certainly false. A great example of scientific fake news.

Sunday, August 26, 2018

Dear Aseem: nutrition has hijacked evidence-based medicine



My friend Aseem, 

early this morning you shared with me sensationalistic news of a "Harvard professor" claiming that coconut oil is poison, and my first reaction was "let me look at the evidence, I know nothing about this oil". Actually, I am too lazy to look for this evidence. I already know the answer: coconut oil does not matter, neither for good nor for bad.

Apart from eating enough calories to become obese or drinking alcohol pathologically, food specifics should not be a major cardiovascular concern. Especially coconut. In a scientific sense, the value of food is overestimated. This usually happens with medical opinion: we tend to overestimate the value of beneficial interventions and the harmful effect of risk factors (JAMA Internal Medicine). That is how the human mind evolved, not well calibrated for "value". I think that is what happened with the so-called "Harvard professor".

Quality scientific research either fails to show effects of specific diets on clinical outcomes or shows very small effects. First, in a peculiar systematic review by Ioannidis exploring the tiniest effect sizes in the literature, nutrition was the most prevalent field; second, when randomised trials adjust for calorie intake and confounding variables, the type of diet does not show much impact on weight or clinical outcomes (PLOS ONE systematic review). The DIETFITS trial, recently published in JAMA, is one example of such evidence.

So we should not invest our time discussing the efficacy of diets, efficacy meaning an explanatory concept or intrinsic property.

I know, it seems frustrating; skepticism is boring. But I am not a boring skeptic. I have just found a solution to make dieting an interesting issue again: Aseem, let's shift the discussion to effectiveness.

Efficacy is tested by randomized trials, in which allocation to the intervention takes nothing into account (it is random). Thus, a randomized trial does not evaluate the effect of preferences on outcomes; it is a purely explanatory concept.

In the real world, preferences may be taken into account in dieting decisions. In these circumstances, if preference and choice match, effectiveness tends to be superior to efficacy, because patients are motivated to perform well with the intervention they prefer. Thus, if a type of diet is an easier match with general preferences, it should be a more effective diet.

For example, in my non-scientific experience, I have a sense that people on low-carb diets are happier with the experience and the results obtained, compared with other diets. I claim this hypothesis should be tested. Studies should be planned to test effectiveness.

Here are some ideas:
  • Pragmatic randomized trials, in which just a general recommendation is given and we let people develop their eating habits and their meals in a pragmatic way, just as a long-term diet takes place in the real world (habit). In this pragmatic circumstance, I suspect low-carb individuals will eat fewer calories, be happier and lose more weight.
  • Observational studies with statistical adjustment for outcome predisposition, as opposed to propensity scores. The traditional propensity score should not be used to test effectiveness, because it adjusts for the very preference that improves results in a preference-matched choice. Observational studies lose the opportunity to test effectiveness by focusing on the nonsense of pursuing the efficacy of lifestyle through adjustment for propensity.
  • Cross-over studies, to measure the outcome of happiness at the end of each intervention. Happiness improves effectiveness, and if people are happier with a certain diet, the result will be better.

Science is more about the right question than about a platonic answer. Sometimes we ask the wrong questions in unfruitful debates.

I am glad that you woke me up (considering our time difference) with your provocation on coconut oil. I know that, as a cardiologist, your concern is not coconut oil (maybe as a chef it is). You are really concerned with the “miscarriage of science” performed by fake scientific news of this type. To say that any specific food is poison is non-scientific. To use the title of a “Harvard professor” to make such a statement is a good example of eminence-based medicine, as opposed to evidence-based medicine. 

I think advocates have a role in using the dieting issue to clarify the different values of these two complementary concepts: efficacy and effectiveness. The important issue here is science, not coconut oil.

Meanwhile, I have to confess: personally, I love the subject of dieting and I have my own preferences. Professionally, I should stick with evidence. We need effectiveness evidence. 

Sunday, August 19, 2018

Low-carb and mortality: an unnecessary controversy




This week an observational study in The Lancet Public Health caused controversy by claiming a U-shaped effect of low-carbohydrate diets on all-cause mortality. This controversy gives us the opportunity to promote a more general scientific discussion: the fallacies of all-cause mortality and of U-shaped phenomena when they are used to suggest causation in observational studies.

All-cause mortality implies an infinite number of causes, making this outcome especially vulnerable to the typical confounding of observational studies. Characteristics of low-carb or high-carb individuals may influence deaths independently of cardiovascular events.

During summer, people tend to eat more ice cream, with lots of sugar. At the same time, it is during summer that most drowning deaths take place. Then, not knowing the specific cause of death and suffering from confirmation bias, one may conclude that the sugar in summer diets increases cardiovascular death. Thus, it is imperative to show cause-specific mortality in observational studies.

Second, a dose-response relationship is an important criterion for causation. Thus, its opposite, the U-shaped curve, should be an argument against causation and in favor of a confounding effect.

Unless the intervention is prone to becoming a poison at high doses (the case for certain drugs), a U-shaped phenomenon indicates bias due to the observational nature of the study design. A carbohydrate-restricted subject is different from a moderate-carb individual, who is different from a high-carb person. The extremes tend to be associated with more atypical behavior and unpredictable influences, especially on all-cause mortality.

Finally, observational studies should be avoided as tools to explore efficacy concepts. They only cause confusion and unnecessary controversies. Observational approaches should be valued as a means to test the real-world effectiveness of concepts first proved by randomized clinical trials (efficacy).

Regarding dieting (or anything else), both sides of the aisle should avoid weak observational arguments. It should be more about good science and less about specific diets. With this approach, together and slowly, with no sensationalism, we will get to the right concepts to guide individual decisions.

The Lancet publication should be disregarded. It is better to talk about more important concepts.

Wednesday, August 15, 2018

Why Unbox Evidence-Based Medicine?




I have been teaching evidence-based medicine with enthusiasm for over 15 years, to undergraduate medical students at my university and to medical doctors in extramural activities. As time passes, my unease with this discipline has increased to the point that I have come to the conclusion that evidence-based medicine should not be a discipline at all. Nowadays, I finish my course every semester confessing to my students that I have a dream: one day my discipline will cease to exist, because evidence-based medicine will be recognized simply as medicine.

In fact, I recently looked into the etymology of the word "medicine". It originates from the Latin "mederi", which means "to know the best way". I then realized that medicine is not about knowing the certain way, because "medicine is a science of uncertainty", as stated by William Osler in the first half of the 20th century. The best choice does not mean the right choice; it only provides the best chance. Only afterwards, when we see the outcome, will we learn about right or wrong.

Maybe we should consider Osler as the father of evidence-based medicine. Once he recognized the uncertainty of medicine, he suggested a solution: “medicine is the art of probability”. He proposed probability-based medicine, as empirical evidence is the way to assess diagnostic, prognostic and treatment probabilities. In fact, “evidence” is just a means to an end: the end of critical thinking based on uncertainty. 

First, we need to unbox evidence-based medicine into uncertainty-based medicine. Second, we should not consider evidence-based medicine a medical field or a particular way of practicing medicine. It is just medicine as it should be.

I have a feeling that evidence-based medicine has been presented in too methodological a package, one that creates a gap between the real-world physician and the scientific way of thinking. Evidence-based medicine should be presented with more sensibility and grace. It is not about the evidence; it is about the patient.

Evidence is the means of acquiring the probability to be used in an individual decision, a decision that takes into account the clinical aspects of the patient as well as her values and preferences. Evidence-based medicine is the art of medicine, the art of probability. And now I understand why Osler said the art of probability instead of the science of probability: because we need sensibility to apply probabilities to a unique patient, taking into account her clinical, mental and spiritual particularities.

Unboxing evidence-based medicine means presenting it to the consumer of science, as opposed to the researcher. Evidence-based medicine is not the field of trialists or systematic reviewers; it is the field of the caretaker. This caretaker should know how to judge the quality of evidence and how to use the knowledge from a systematic review, but does not need to learn how to build one. He needs to develop attitude more than knowledge. The knowledge needed to read an article is easy to acquire, but skepticism and a critical attitude are developed over the years. Again, it takes sensibility and affection for the matter.

Unboxing evidence-based medicine means putting emphasis on the general concept of chance in determining outcomes, one step beyond P values or confidence intervals. It means understanding that the real world is full of biases that promote illusions. Concepts have to be created in a controlled environment and then applied with art in the real world.

The intention of this blog is not to make a supposedly difficult discipline easy. It is to make it interesting. In fact, medicine is not supposed to be easy, but it can be very interesting and fulfilling. That is what evidence-based medicine should be about.
