PickMyTest

Pearson Correlation

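The Pearson correlation (also called the Pearson product-moment correlation) measures the strength and direction of the linear relationship between two continuous variables. The correlation coefficient r ranges from -1 to +1: -1 indicates a perfect negative linear relationship, 0 indicates no linear relationship, and +1 indicates a perfect positive linear relationship.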

When to Use

Use the Pearson correlation when:

  • You want to quantify the linear relationship between two variables
  • Both variables are metric (continuous)
  • The data are approximately normally distributed
  • You are interested in the direction and strength of the relationship

Assumptions

  • Both variables are measured on a metric scale
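  • Approximately normal distribution of both variables (can be checked with, e.g., the Shapiro-Wilk test); this matters mainly for the significance test, less for r as a descriptive measure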
  • Linear relationship between the variables (check with scatterplot)
  • No significant outliers
  • Independence of observation pairs

Formula

The Pearson correlation coefficient is calculated as:

$$r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \cdot \sum_{i=1}^{n}(y_i - \bar{y})^2}}$$

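To test the null hypothesis that there is no linear correlation in the population, use the test statistic:

$$t = \frac{r \sqrt{n-2}}{\sqrt{1-r^2}}$$

which, under the null hypothesis, follows a t-distribution with $n - 2$ degrees of freedom.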
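
The two formulas above can be sketched in plain Python as follows. This is a minimal illustration using only the standard library; the function names `pearson_r` and `t_statistic` and the sample data are assumptions made for this example, not part of any particular package.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 3:
        raise ValueError("at least 3 pairs are needed for the t-test (n - 2 > 0)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Numerator: sum of cross-products of deviations from the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Denominator: square root of the product of the sums of squared deviations
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has zero variance")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t value for H0 'no linear correlation', with n - 2 degrees of freedom.

    Not defined for |r| = 1, where the denominator is zero.
    """
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# Hypothetical sample: study hours and exam points for six students
hours = [2, 4, 5, 7, 8, 10]
points = [50, 58, 65, 70, 78, 85]

r = pearson_r(hours, points)
t = t_statistic(r, len(hours))
print(f"r = {r:.3f}, t = {t:.2f}, df = {len(hours) - 2}")
```

Turning t into a p-value requires the t-distribution, which the standard library does not provide; a statistics package would normally be used for that step.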

Example

Practical Example: Study Time and Exam Results

A lecturer wants to investigate whether there is a relationship between study time (in hours) and exam results (in points). She collects data from 50 students.

  • Variable X: Study time in hours per week
  • Variable Y: Exam result (0–100 points)

The Pearson correlation yields r = 0.72, p < 0.001. There is a strong positive linear relationship: The more hours studied, the higher the exam result tends to be.
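
As a check on the reported significance, the values from this example (r = 0.72, n = 50) can be inserted into the test statistic from the Formula section. The arithmetic below is a worked illustration, rounded to two decimals:

$$t = \frac{0.72 \sqrt{50 - 2}}{\sqrt{1 - 0.72^2}} = \frac{0.72 \cdot 6.928}{\sqrt{0.4816}} \approx \frac{4.988}{0.694} \approx 7.19$$

With 50 - 2 = 48 degrees of freedom, a t value of this size is consistent with the reported p < 0.001.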

Effect Size

The correlation coefficient r is itself a measure of effect size. The coefficient of determination r² indicates the proportion of explained variance:

$$r^2 = \text{proportion of explained variance}$$

| Effect Size | \|r\| |
|---|---|
| Small | 0.10 |
| Medium | 0.30 |
| Large | 0.50 |
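
The benchmarks in the table can be turned into a small helper, sketched below. Two points are assumptions of this sketch rather than statements from the table: each benchmark is treated as the lower bound of its band, and values below 0.10 are labeled "below small". The function names are hypothetical.

```python
def effect_size_label(r):
    """Classify |r| using the benchmarks 0.10 / 0.30 / 0.50."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    magnitude = abs(r)  # the sign gives direction, not strength
    if magnitude >= 0.50:
        return "large"
    if magnitude >= 0.30:
        return "medium"
    if magnitude >= 0.10:
        return "small"
    return "below small"  # assumption: the table does not name this range


def explained_variance(r):
    """Coefficient of determination r squared."""
    return r ** 2


r = 0.72  # value from the study time example
print(effect_size_label(r), round(explained_variance(r), 4))  # large 0.5184
```

For the study time example this gives a large effect, with roughly 52% of the variance in exam results linearly accounted for by study time.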

Important: Correlation does not imply causation. A high correlation coefficient says nothing about the cause-and-effect relationship.

Further Reading

  • Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.). Lawrence Erlbaum Associates.
  • Field, A. (2018). Discovering Statistics Using IBM SPSS Statistics (5th ed.). SAGE.