Statistical correlations, roosters and placebo pills


 

Cause-and-effect analysis is something none of us can escape. Regardless of what you study, where you live or when you grew up, you make causal inferences every day of your life. From the African plains to Twitter feeds, we run thought (or actual) experiments before we act, both intentionally and unconsciously, and we do it almost seamlessly. The question is: why?

 

 

“Correlation is not causation” but… What is causation?

Anyone who has taken a Statistics course knows and lives by this mantra. No one can dispute its veracity or relevance, and it is a guiding light for data analysts. However, these words tell us what causation is not, and in doing so they obscure what causation actually means in modern scientific investigation.

 

 

What is causation? Have you ever asked yourself that after hearing or repeating “correlation is not causation”? If correlation is not causation, how do you know when you are looking at a causal relationship? How do you know when you have found one?

 

 

Indulge me by engaging in a thought experiment:

 

 

Imagine you stopped for a takeaway chicken wrap on your way home. You are about to reach your place when a cosmic accident throws you into a parallel reality.

 

 

You arrive at a primitive Earth populated by farmers. They have developed statistical knowledge because of its usefulness in crop and cattle management, but they have no clue about Physics, Biology or Geography. Of course, this civilization is highly dependent on agriculture and, therefore, on the Sun.

 

 

These people are convinced that the roosters’ crow makes the Sun rise and, therefore, worship roosters as deities. Remember that chicken wrap in your bag? Guess what…? They found it! Naturally, you now face death, accused of the ultimate form of blasphemy in this world.

 

 

Your first reaction is, of course, to tell them that they got it mixed up. However, they quickly refute your claims by showing you their “Big data”. For the last two millennia they have carefully collected detailed datasets, and they proudly present beautiful plots with almost perfect correlations between the rooster’s crow and the Sun rising.

 

 

To save your life, you would rightly dismiss their data as ludicrous, 0.99 correlation and all, and immediately propose an experiment: blindfold or mute the roosters and show that the Sun rises anyway.

 

 

My example shows that we are inclined to agree that a simple, carefully designed experiment can unveil cause-effect relationships more effectively than large and immaculate datasets.
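For readers who like to see numbers, here is a minimal sketch of the farmers' story in Python. Everything in it is an assumption made up for illustration: a hidden "dawn" time drives both the crow and the sunrise, the rooster crows within a few minutes of first light, and the Sun rises 30 minutes after dawn whether or not anybody crows. The function names (`simulate_day`, `observational_correlation`, `muting_experiment`) are hypothetical, not taken from any library.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_day(rng, muted=False):
    """Return (crow_time, sunrise_time) in minutes after midnight.

    Assumption: a seasonal dawn time drives both events. The Sun never
    listens to the rooster; the rooster reacts to the first light.
    """
    dawn = rng.uniform(300, 480)      # hidden common cause
    sunrise = dawn + 30               # happens regardless of crowing
    crow = None if muted else dawn + rng.gauss(0, 5)
    return crow, sunrise


def observational_correlation(days=2000, seed=1):
    """What the farmers measured: crow time versus sunrise time."""
    rng = random.Random(seed)
    records = [simulate_day(rng) for _ in range(days)]
    crows = [crow for crow, _ in records]
    sunrises = [sunrise for _, sunrise in records]
    return pearson(crows, sunrises)


def muting_experiment(days=2000, seed=2):
    """Mute the roosters on a coin flip; return the difference in mean
    sunrise time between muted days and crowing days."""
    rng = random.Random(seed)
    muted_sunrises, crowing_sunrises = [], []
    for _ in range(days):
        muted = rng.random() < 0.5
        _, sunrise = simulate_day(rng, muted=muted)
        (muted_sunrises if muted else crowing_sunrises).append(sunrise)
    return statistics.fmean(muted_sunrises) - statistics.fmean(crowing_sunrises)


print("crow/sunrise correlation:", round(observational_correlation(), 3))
print("sunrise shift on muted days (minutes):", round(muting_experiment(), 2))
```

Under these assumptions the correlation comes out close to 1, yet muting the roosters barely moves the sunrise. The dataset and the experiment answer different questions, and only the second one is about cause.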

 

 

Not convinced yet? – Do you want a real example?

What has been the biggest debunked myth in Biology?

 

 

I propose spontaneous generation (living creatures arising from non-living matter, as in maggots from rotten meat). There was a time when challenging this idea could bring one professional, social and even legal problems. Today, even the most recalcitrant creationists would agree that “maggots come from flies, not from rotten meat”.

 

 

How did this happen? A very simple counterfactual was enough (what if I isolate the rotten meat?). Perhaps you remember Francesco Redi’s experiment from high school: to test spontaneous generation, he placed one piece of rotten meat in a closed jar and another in an open jar. Redi shook his contemporaries’ central belief about life by finding that maggots only appeared on the meat in the open jar.

 

 

Again, imagine that you are asked to prove the same idea, “maggots come from flies, not from rotten meat”, without using counterfactuals, relying only on the statistical methods you typically read about in your scientific journal of choice. How would you do it?

 

 

This is where post hoc ergo propter hoc (Latin: “after this, therefore because of this”) comes into play. It is a fallacy that can make us draw wild conclusions…

 

 

The rooster crowed; the Sun came up. Therefore, the rooster made the Sun come up.

 

 

Now swap the nouns…

I took a pill, I got better. Therefore, the pill made me better.

 

I hope you are convinced by now that the temporal association between two events, or the frequency with which two events are seen together, says very little about causality.

 

 

As the saying goes, “There is no causation without manipulation”. Unless you manipulate variables, you can’t say there is a causal link between them. Of course, it is not always possible, feasible or ethical to manipulate variables in medical research. Still, keep this mantra in mind the next time you read a click-baity headline about medical news.
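The pill example can be put through the same kind of manipulation. The sketch below is a toy simulation, not real clinical data: it assumes an illness that clears up on its own 80% of the time and a pill that does nothing at all. The recovery rate, the function names (`recovers`, `randomized_trial`) and the sample size are all invented for illustration.

```python
import random


def recovers(rng, took_real_pill):
    """Return True if the patient gets better.

    Assumption: natural recovery happens 80% of the time and the pill
    has no effect, so `took_real_pill` is deliberately ignored.
    """
    natural_recovery_rate = 0.8
    return rng.random() < natural_recovery_rate


def randomized_trial(patients=10000, seed=3):
    """Assign each patient to pill or placebo by coin flip and return
    the recovery rate observed in each arm."""
    rng = random.Random(seed)
    outcomes = {"pill": [], "placebo": []}
    for _ in range(patients):
        arm = "pill" if rng.random() < 0.5 else "placebo"
        outcomes[arm].append(recovers(rng, took_real_pill=(arm == "pill")))
    return {arm: sum(results) / len(results) for arm, results in outcomes.items()}


rates = randomized_trial()
print("recovered after the pill:   ", round(rates["pill"], 3))
print("recovered after the placebo:", round(rates["placebo"], 3))
```

Looking only at the pill arm, roughly four out of five people "took a pill and got better", which is exactly the story the fallacy feeds on. The placebo arm is the counterfactual that shows, in this made-up world, that they would have recovered anyway.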



Adrian Soto Mota

*The views and opinions expressed herein are those of the author and do not necessarily reflect the views of MDLingo.com, its affiliates, or its employees.