Understanding Pearson Correlation
The Pearson correlation coefficient (often denoted as $r$) is a statistical measure of the strength and direction of the linear relationship between two variables. It is widely used in the sciences and in statistics to assess whether a change in one variable is associated with a change in another.
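For paired observations $(x_i, y_i)$ with means $\bar{x}$ and $\bar{y}$, the sample coefficient is commonly written as $r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}}$. The following is a minimal sketch of that formula in plain Python, not the code behind any particular calculator. The function name `pearson_r` and the `hours` and `scores` values are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Sample Pearson correlation of two equal-length numeric sequences."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two paired observations")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Deviations of each observation from its own mean.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]

    # Numerator: sum of products of paired deviations.
    cross = sum(a * b for a, b in zip(dx, dy))
    # Denominator terms: sums of squared deviations.
    ss_x = sum(a * a for a in dx)
    ss_y = sum(b * b for b in dy)

    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when either variable is constant")
    return cross / math.sqrt(ss_x * ss_y)


# Illustrative (invented) data: hours studied vs. test score.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 70, 72]
r = pearson_r(hours, scores)
print(f"r = {r:.3f}")
```

The sketch raises an error when either variable is constant, because the denominator is then zero and $r$ is undefined.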
Need to analyze the dispersion of a single dataset instead of comparing two? Check out our Sample Variance Calculator.
Interpreting the $r$ Value
The Pearson correlation coefficient always lies between -1 and +1, inclusive:
- $r = 1$: Perfect positive correlation. Every data point lies exactly on a straight line with positive slope, so Y increases as X increases.
- $r = -1$: Perfect negative correlation. Every data point lies exactly on a straight line with negative slope, so Y decreases as X increases.
- $r = 0$: No linear correlation. The variables show no linear trend together, although a nonlinear relationship may still exist.
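As a rough illustration of the three reference values listed above, the sketch below builds small made-up datasets: one on a rising line, one on a falling line, and one where Y depends on X through a symmetric curve. The helper `pearson_r` and all variable names are hypothetical and exist only for this example.

```python
import math


def pearson_r(xs, ys):
    """Sample Pearson correlation (illustrative helper, not a library API)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cross = sum(a * b for a, b in zip(dx, dy))
    return cross / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


x = [-2, -1, 0, 1, 2]

y_rising = [2 * v + 3 for v in x]    # exact line, positive slope
y_falling = [10 - 3 * v for v in x]  # exact line, negative slope
y_curve = [v * v for v in x]         # Y is fully determined by X, but not linearly

r_rising = pearson_r(x, y_rising)    # approximately +1
r_falling = pearson_r(x, y_falling)  # approximately -1
r_curve = pearson_r(x, y_curve)      # approximately 0

print(r_rising, r_falling, r_curve)
```

Note that the rising line has a nonzero intercept, so Y is not proportional to X, yet $r$ is still 1. The curved case shows that $r = 0$ rules out a linear trend only, not dependence in general.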
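As a rule of thumb, values above 0.7 (or below -0.7) are considered "strong", values between 0.4 and 0.7 in absolute value are "moderate", and values below 0.4 in absolute value are "weak". These cutoffs are conventions rather than fixed rules, and they vary between fields.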
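The rule of thumb above can be sketched as a small labeling function. The name `describe_strength` is hypothetical, and the handling of the exact boundary values (0.7 counted as "moderate", 0.4 counted as "moderate") is an assumption, since the text does not say which side they fall on.

```python
def describe_strength(r):
    """Label a Pearson r using the 0.4 / 0.7 rule-of-thumb cutoffs."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")

    magnitude = abs(r)  # strength ignores the sign; the sign gives direction
    if magnitude > 0.7:
        return "strong"
    if magnitude >= 0.4:  # assumption: boundary values count as moderate
        return "moderate"
    return "weak"


for value in (0.85, -0.85, 0.5, -0.1):
    direction = "positive" if value > 0 else "negative"
    print(value, describe_strength(value), direction)
```

Taking the absolute value first means a correlation of -0.85 is labeled just as strong as one of 0.85; only the direction differs.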
Correlation vs. Causation
One of the most important rules in statistics is that correlation does not imply causation.
A high Pearson correlation between X and Y does not, by itself, mean that X causes Y. Both could be influenced by a hidden third variable (a confounder), the causation could run from Y to X, or the correlation could simply be a statistical coincidence.
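The hidden-third-variable case can be sketched with a small simulation. Everything here is invented for illustration: the variable names (`temperature`, `ice_cream`, `sunburns`), the noise levels, and the seed. Neither simulated quantity influences the other; both are generated from the same underlying driver.

```python
import math
import random


def pearson_r(xs, ys):
    """Sample Pearson correlation (illustrative helper, not a library API)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cross = sum(a * b for a, b in zip(dx, dy))
    return cross / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 2000

# Hidden third variable that drives both observed quantities.
temperature = [rng.gauss(0, 1) for _ in range(n)]

# Each observed variable is the driver plus its own independent noise.
ice_cream = [t + rng.gauss(0, 0.5) for t in temperature]
sunburns = [t + rng.gauss(0, 0.5) for t in temperature]

r_observed = pearson_r(ice_cream, sunburns)

# Remove the driver's contribution (possible here only because the
# simulation defines exactly how it enters each variable).
ice_cream_noise = [a - t for a, t in zip(ice_cream, temperature)]
sunburns_noise = [b - t for b, t in zip(sunburns, temperature)]
r_adjusted = pearson_r(ice_cream_noise, sunburns_noise)

print(f"observed r: {r_observed:.2f}")
print(f"r after removing the shared driver: {r_adjusted:.2f}")
```

In this setup the observed correlation comes out high even though neither variable affects the other, and it largely disappears once the shared driver is subtracted out. In real data the confounder is usually unknown or unmeasured, so it cannot be removed this cleanly.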
For a deeper dive into the mathematical formulas driving this coefficient, you can visit the Wikipedia page on the Pearson correlation coefficient.