Correlation Does Not Imply Causation


Correlation tests for a relationship between two variables. However, seeing two variables moving together does not necessarily mean we know whether one variable causes the other to occur. This is why we commonly say “correlation does not imply causation.”

A strong correlation might indicate causality, but there could easily be other explanations:

  • It may be the result of random chance, where the variables appear to be related, but there is no true underlying relationship.
  • There may be a third, lurking variable that makes the relationship appear stronger (or weaker) than it actually is.
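The random-chance explanation can be illustrated with a small simulation. This is a minimal sketch, not part of the original article: the helper names (`pearson`, `count_chance_correlations`), the sample size of 10, and the cutoff of 0.632 (roughly the conventional 5% significance cutoff for samples of that size) are all assumptions chosen for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def count_chance_correlations(trials=1000, n=10, threshold=0.632, seed=0):
    """Count independent random pairs whose |r| exceeds the threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # xs and ys are drawn independently: there is no true relationship.
        xs = [rng.gauss(0, 1) for _ in range(n)]
        ys = [rng.gauss(0, 1) for _ in range(n)]
        if abs(pearson(xs, ys)) > threshold:
            hits += 1
    return hits


hits = count_chance_correlations()
print(f"{hits} of 1000 unrelated pairs look strongly correlated")
```

Even though every pair is unrelated by construction, some pairs should clear the cutoff, which is why a single strong correlation in a small sample is weak evidence on its own.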

For observational data, correlations can’t confirm causation...

Correlations between variables show us that there is a pattern in the data: that the variables we have tend to move together. However, correlations alone don’t show us whether or not the data are moving together because one variable causes the other.

It’s possible to find a statistically significant and reliable correlation for two variables that are actually not causally linked at all. In fact, such correlations are common! Often, this is because both variables are associated with a different causal variable, which tends to co-occur with the data that we’re measuring.
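A simulated version of this situation, offered as a sketch under stated assumptions: the hidden variable `z`, the noise levels, and the helper names are invented for illustration. Neither `x` nor `y` influences the other, yet they correlate because both depend on `z`. The partial correlation formula then removes the linear effect of `z`.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x and y after removing the linear effect of z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


rng = random.Random(1)
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]      # hypothetical common cause
x = [zi + rng.gauss(0, 1) for zi in z]       # x depends on z only
y = [zi + rng.gauss(0, 1) for zi in z]       # y depends on z only

r_xy = pearson(x, y)
r_xy_given_z = partial_correlation(r_xy, pearson(x, z), pearson(y, z))

print(f"raw correlation of x and y:       {r_xy:.2f}")
print(f"correlation after adjusting for z: {r_xy_given_z:.2f}")
```

In this setup the raw correlation should be clearly positive while the adjusted one should sit near zero. The adjustment only works here because `z` was measured; with real observational data the common cause is often unknown.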

More
https://www.jmp.com/en_us/statistics-knowledge-portal/what-is-correlation/correlation-vs-causation.html






The phrase "correlation does not imply causation" refers to the inability to legitimately deduce a cause-and-effect relationship between two events or variables solely on the basis of an observed association or correlation between them.[1][2] The idea that "correlation implies causation" is an example of a questionable-cause logical fallacy, in which two events occurring together are taken to have established a cause-and-effect relationship. This fallacy is also known by the Latin phrase cum hoc ergo propter hoc ("with this, therefore because of this"). This differs from the fallacy known as post hoc ergo propter hoc ("after this, therefore because of this"), in which an event following another is seen as a necessary consequence of the former event, and from conflation, the errant merging of two events, ideas, databases, etc., into one.

As with any logical fallacy, identifying that the reasoning behind an argument is flawed does not necessarily imply that the resulting conclusion is false. Statistical methods have been proposed that use correlation as the basis for hypothesis tests for causality, including the Granger causality test and convergent cross mapping.

Usage

In logic, the technical use of the word "implies" means "is a sufficient condition for".[3] This is the meaning intended by statisticians when they say causation is not certain. Indeed, "p implies q" has the technical meaning of the material conditional "if p then q", symbolized as p → q. That is, "if circumstance p is true, then q follows." In this sense, it is always correct to say "Correlation does not imply causation." In casual use, the word "implies" loosely means suggests rather than requires.
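The material conditional can be written out as a truth table. The sketch below is illustrative only: the `implies` helper and the list of observations are hypothetical. It shows that one case of correlation without causation is enough to make the strict claim "correlation implies causation" false.

```python
def implies(p, q):
    """Material conditional p -> q: false only when p is true and q is false."""
    return (not p) or q


# Truth table for p -> q.
for p in (True, False):
    for q in (True, False):
        print(f"p={p!s:5} q={q!s:5} p->q={implies(p, q)}")

# Hypothetical observations: each records whether two variables were
# correlated and whether one actually caused the other.
observations = [
    {"correlated": True, "causal": True},
    {"correlated": True, "causal": False},   # e.g. a shared common cause
    {"correlated": False, "causal": False},
]

# The strict claim must hold for every observation to be true.
claim_holds = all(implies(o["correlated"], o["causal"]) for o in observations)
print("correlation implies causation:", claim_holds)
```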

Where there is causation, there is correlation, but also a sequence in time from cause to effect, a plausible mechanism, and sometimes common and intermediate causes. While correlation is often used when inferring causation because it is a necessary condition, it is not a sufficient condition.

In a widely studied example of the difficulty this fallacy poses for deciding cause, numerous epidemiological studies showed that women taking combined hormone replacement therapy (HRT) also had a lower-than-average incidence of coronary heart disease (CHD), leading doctors to propose that HRT was protective against CHD. But later randomized controlled trials (RCTs) showed that use of HRT led to a small but statistically significant increase in the risk of CHD. Reanalysis of the data from the epidemiological studies showed that women undertaking HRT were more likely to be from higher socioeconomic groups (ABC1), with better-than-average diet and exercise regimens. Thus the use of HRT and decreased incidence of coronary heart disease were coincident effects of a common cause (i.e., the benefits associated with a higher socioeconomic status), rather than one being a direct cause of the other, as had been supposed.[4] Despite the widely held (but mistaken) belief that RCTs provide stronger causal evidence than observational studies, the latter continued to consistently show benefits, and subsequent analyses and follow-up studies have demonstrated a significant benefit for CHD risk in healthy women initiating oestrogen therapy soon after the onset of menopause.[5]
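The arithmetic behind such a reversal can be sketched with made-up numbers. Every probability below is invented for illustration and is not taken from the HRT studies. In this toy model HRT raises CHD risk by one percentage point within each socioeconomic group, yet the pooled comparison makes HRT users look healthier because they come disproportionately from the lower-risk group.

```python
# All probabilities below are invented for illustration only.
P_HIGH_SES = 0.5                          # share of women in the high-SES group
P_HRT = {"high": 0.6, "low": 0.2}         # P(takes HRT | SES group)
P_CHD = {                                 # P(CHD | SES group, takes HRT)
    ("high", False): 0.05, ("high", True): 0.06,
    ("low", False): 0.15, ("low", True): 0.16,
}


def crude_chd_rate(takes_hrt):
    """P(CHD | HRT status), pooling both socioeconomic groups."""
    joint_chd = 0.0
    joint_group = 0.0
    for ses, p_ses in (("high", P_HIGH_SES), ("low", 1 - P_HIGH_SES)):
        p_treat = P_HRT[ses] if takes_hrt else 1 - P_HRT[ses]
        joint_group += p_ses * p_treat
        joint_chd += p_ses * p_treat * P_CHD[(ses, takes_hrt)]
    return joint_chd / joint_group


print(f"pooled CHD rate, HRT users: {crude_chd_rate(True):.3f}")
print(f"pooled CHD rate, non-users: {crude_chd_rate(False):.3f}")
for ses in ("high", "low"):
    print(f"{ses}-SES: HRT {P_CHD[(ses, True)]:.2f} "
          f"vs no HRT {P_CHD[(ses, False)]:.2f}")
```

Comparing within each group, rather than across the pooled population, is what exposes the direction of the effect in this toy model.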

Causal analysis

Causal analysis is the field of experimental design and statistics pertaining to establishing cause and effect.[6][7] For any two correlated events, A and B, their possible relationships include:

  • A causes B (direct causation);
  • B causes A (reverse causation);
  • A and B are both caused by C (common cause);
  • A causes B and B causes A (bidirectional or cyclic causation);
  • There is no connection between A and B; the correlation is a coincidence.

Thus there can be no conclusion made regarding the existence or the direction of a cause-and-effect relationship only from the fact that A and B are correlated. Determining whether there is an actual cause-and-effect relationship requires further investigation, even when the relationship between A and B is statistically significant, a large effect size is observed, or a large part of the variance is explained.
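To see why the correlation alone cannot settle the question, here is a minimal simulation of three of the listed structures. It is a sketch under stated assumptions: the `simulate` function, the noise levels, and the seed are hypothetical, with the noise chosen so that each structure yields a correlation near 0.7.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(structure, n=5000, seed=2):
    """Generate A and B under a given causal structure; return their r."""
    rng = random.Random(seed)
    a, b = [], []
    for _ in range(n):
        if structure == "A causes B":
            ai = rng.gauss(0, 1)
            bi = ai + rng.gauss(0, 1)
        elif structure == "B causes A":
            bi = rng.gauss(0, 1)
            ai = bi + rng.gauss(0, 1)
        elif structure == "C causes both":
            ci = rng.gauss(0, 1)           # unobserved common cause
            ai = ci + rng.gauss(0, 0.64)
            bi = ci + rng.gauss(0, 0.64)
        else:
            raise ValueError(f"unknown structure: {structure}")
        a.append(ai)
        b.append(bi)
    return pearson(a, b)


for structure in ("A causes B", "B causes A", "C causes both"):
    print(f"{structure:14} r = {simulate(structure):.2f}")
```

The three data-generating processes are different, but an analyst who sees only the correlation between A and B has nothing to tell them apart.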

In philosophy and physics

The nature of causality is systematically investigated in several academic disciplines, including philosophy and physics.

In academia, there are a significant number of theories on causality; The Oxford Handbook of Causation (Beebee, Hitchcock & Menzies 2009) encompasses 770 pages. Among the more influential theories within philosophy are Aristotle's Four causes and Al-Ghazali's occasionalism.[8] David Hume argued that beliefs about causality are based on experience, and that experience is similarly based on the assumption that the future models the past, which in turn can only be based on experience – leading to circular logic. In conclusion, he asserted that causality is not based on actual reasoning: only correlation can actually be perceived.[9] Immanuel Kant, according to Beebee, Hitchcock & Menzies (2009), held that "a causal principle according to which every event has a cause, or follows according to a causal law, cannot be established through induction as a purely empirical claim, since it would then lack strict universality, or necessity".

Outside the field of philosophy, theories of causation can be identified in classical mechanics, statistical mechanics, quantum mechanics, spacetime theories, biology, social sciences, and law.[8] To establish a correlation as causal within physics, it is normally understood that the cause and the effect must connect through a local mechanism (cf. for instance the concept of impact) or a nonlocal mechanism (cf. the concept of field), in accordance with known laws of nature.

From the point of view of thermodynamics, universal properties of causes as compared to effects have been identified through the second law of thermodynamics, confirming the ancient, medieval and Cartesian[10] view that "the cause is greater than the effect" for the particular case of thermodynamic free energy. This, in turn, is challenged by popular interpretations of the concepts of nonlinear systems and the butterfly effect, in which small events cause large effects due to, respectively, unpredictability and an unlikely triggering of large amounts of potential energy.


https://en.wikipedia.org/wiki/Correlation_does_not_imply_causation

