PickMyTest

Partial Correlation

Measures the linear association between two variables while statistically controlling for one or more confounding variables. Allows isolation of the true relationship.

Partial correlation measures the linear association between two variables $X$ and $Y$ after statistically removing the influence of one or more control variables $Z$. It is an essential tool for uncovering spurious correlations and isolating the "true" relationship between two variables. When a third variable influences both $X$ and $Y$, the simple Pearson correlation can paint a misleading picture; partial correlation adjusts for this confounding effect.
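One way to read "statistically removing the influence" of $Z$ is to regress $X$ on $Z$ and $Y$ on $Z$, then correlate the two sets of residuals. The sketch below illustrates that reading with simple least-squares fits. It is not taken from this article: the function names and the small data set are hypothetical and chosen only for illustration.

```python
import math


def mean(values):
    return sum(values) / len(values)


def pearson(a, b):
    """Pearson correlation of two equally long sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def residuals(target, control):
    """Residuals of a least-squares line target ~ intercept + slope * control."""
    mt, mc = mean(target), mean(control)
    slope = (sum((c - mc) * (t - mt) for c, t in zip(control, target))
             / sum((c - mc) ** 2 for c in control))
    intercept = mt - slope * mc
    return [t - (intercept + slope * c) for t, c in zip(target, control)]


def partial_corr_residuals(x, y, z):
    """Correlate what is left of x and y after removing the linear effect of z."""
    return pearson(residuals(x, z), residuals(y, z))


# Hypothetical data, invented for illustration only.
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9, 8.3, 8.8]
y = [1.2, 2.5, 2.9, 4.4, 4.9, 6.3, 6.8, 8.1]

print("raw correlation:    ", pearson(x, y))
print("partial correlation:", partial_corr_residuals(x, y, z))
```

With a single control variable, this residual-based value agrees with the one obtained from the three pairwise correlations, up to floating-point error.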

When to Use

  • You suspect the association between $X$ and $Y$ may be distorted by a third variable $Z$
  • You want to check whether an observed correlation is spurious
  • You want to report the adjusted association between two variables
  • All variables are metric (interval or ratio scaled)
  • You want to control for a confounding variable without running a full regression analysis

Assumptions

  • Metric scale level for all variables (X, Y, and Z)
  • Linear relationship between all variable pairs
  • Approximate normality of all variables (this matters mainly for the significance test)
  • Independent observations
  • No perfect multicollinearity between variables

Formula

The first-order partial correlation (controlling for one variable $Z$) is computed from the three bivariate Pearson correlations:

$$r_{XY \cdot Z} = \frac{r_{XY} - r_{XZ} \cdot r_{YZ}}{\sqrt{(1 - r_{XZ}^2)(1 - r_{YZ}^2)}}$$

Here, $r_{XY}$ is the correlation between $X$ and $Y$, $r_{XZ}$ is the correlation between $X$ and $Z$, and $r_{YZ}$ is the correlation between $Y$ and $Z$.
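The formula translates directly into a few lines of code. The following is a minimal sketch; the function name and the input values are hypothetical, and the range check is an added safeguard for the case where the denominator would be zero (perfect multicollinearity with $Z$).

```python
import math


def partial_corr_first_order(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of X and Y, controlling for Z."""
    # If Z is perfectly correlated with X or Y the denominator is zero,
    # so such inputs are rejected explicitly.
    if not (-1.0 < r_xz < 1.0 and -1.0 < r_yz < 1.0):
        raise ValueError("r_xz and r_yz must lie strictly between -1 and 1")
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    return numerator / denominator


# Hypothetical input correlations, chosen only for illustration.
print(round(partial_corr_first_order(0.50, 0.40, 0.30), 3))
```

The function takes correlations rather than raw data, so it can be applied to a published correlation matrix when the underlying data are unavailable.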

Significance is tested using a t-test:

$$t = \frac{r_{XY \cdot Z} \cdot \sqrt{n - 3}}{\sqrt{1 - r_{XY \cdot Z}^2}}, \quad df = n - 3$$

When controlling for $k$ variables, $df = n - 2 - k$, and $\sqrt{n - 2 - k}$ replaces $\sqrt{n - 3}$ in the numerator.
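The test statistic can be sketched in code as follows. To keep the example free of third-party dependencies, the two-sided p-value is approximated by integrating the t density numerically with Simpson's rule; in practice a statistics library would normally be used for this step. All function names, the `steps` parameter, and the input values are hypothetical.

```python
import math


def partial_corr_t_stat(r_partial: float, n: int, k: int = 1):
    """Return (t, df) for a partial correlation with k control variables."""
    df = n - 2 - k
    if df <= 0:
        raise ValueError("need n > k + 2 observations")
    if not -1.0 < r_partial < 1.0:
        raise ValueError("r_partial must lie strictly between -1 and 1")
    t = r_partial * math.sqrt(df) / math.sqrt(1.0 - r_partial ** 2)
    return t, df


def t_density(x: float, df: int) -> float:
    """Probability density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p_value(t: float, df: int, steps: int = 2000) -> float:
    """Approximate two-sided p-value via Simpson's rule on [0, |t|]."""
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central_mass = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central_mass)


# Hypothetical values: a partial correlation of 0.30 from 40 observations.
t, df = partial_corr_t_stat(0.30, n=40, k=1)
print(t, df, two_sided_p_value(t, df))
```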

Example

Practical Example: Ice Cream Sales, Drowning, and Temperature

A study finds a high positive correlation between ice cream sales ($X$) and the number of drowning incidents ($Y$) at public pools. Does eating more ice cream increase the risk of drowning? Of course not: temperature ($Z$) is the common cause driving both variables.

Correlations (n = 50 summer days):

  • $r_{XY} = 0.83$ (ice cream sales ↔ drowning incidents)
  • $r_{XZ} = 0.90$ (ice cream sales ↔ temperature)
  • $r_{YZ} = 0.88$ (drowning incidents ↔ temperature)

Calculating the partial correlation:

$$r_{XY \cdot Z} = \frac{0.83 - 0.90 \cdot 0.88}{\sqrt{(1 - 0.90^2)(1 - 0.88^2)}} = \frac{0.83 - 0.792}{\sqrt{0.19 \cdot 0.2256}} = \frac{0.038}{0.207} \approx 0.18$$

Interpretation: The originally strong correlation of $r = 0.83$ drops to $r_{XY \cdot Z} = 0.18$ after controlling for temperature, a small and likely non-significant association. The observed correlation was largely spurious, produced by the shared confounding variable temperature.
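The "likely non-significant" reading can be checked with the t-test from the Formula section. The following is a worked sketch using the rounded value $r_{XY \cdot Z} = 0.18$ and $n = 50$. The critical value is the standard two-sided 5% threshold of the t-distribution with 47 degrees of freedom, which is not part of the example itself.

$$t = \frac{0.18 \cdot \sqrt{50 - 3}}{\sqrt{1 - 0.18^2}} = \frac{0.18 \cdot 6.856}{0.984} \approx 1.25, \quad df = 47$$

Since $|t| \approx 1.25$ is below the critical value of about $2.01$, the partial correlation is not significantly different from zero at the 5% level.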

Effect Size

The partial correlation $r_{XY \cdot Z}$ is itself an effect size measure and is interpreted using the same conventions as Pearson's $r$:

| $\lvert r_{XY \cdot Z} \rvert$ | Interpretation |
|---|---|
| 0.10 | Small effect |
| 0.30 | Medium effect |
| 0.50 | Large effect |

Additionally, the squared partial correlation gives the share of variance that $X$ explains once $Z$ has been accounted for:

$$R^2_{\text{partial}} = r_{XY \cdot Z}^2$$

This value indicates how much of the variance in $Y$ that $Z$ leaves unexplained is accounted for by $X$. In the example above, ice cream sales explain only $0.18^2 \approx 3.2\%$ of the variance in drowning incidents that remains after controlling for temperature.
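The conventions in the table and the squared partial correlation can be wrapped in two small helpers. This is a sketch under stated assumptions: the table values are treated as lower bounds of each category, and the "negligible" label for values below 0.10 is not taken from the table. The function names are hypothetical.

```python
def interpret_partial_r(r_partial: float) -> str:
    """Map |r| to the conventional effect size labels from the table."""
    size = abs(r_partial)
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "medium"
    if size >= 0.10:
        return "small"
    # Assumption: the table gives no label below 0.10.
    return "negligible"


def squared_partial(r_partial: float) -> float:
    """Squared partial correlation."""
    return r_partial ** 2


# Value from the ice cream example.
r = 0.18
print(interpret_partial_r(r), f"{squared_partial(r):.1%}")
```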

Further Reading

  • Cohen, J., Cohen, P., West, S. G. & Aiken, L. S. (2003). Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences (3rd ed.). Lawrence Erlbaum Associates.
  • Field, A. (2018). Discovering Statistics Using IBM SPSS Statistics (5th ed.). SAGE Publications.
  • Agresti, A. & Finlay, B. (2009). Statistical Methods for the Social Sciences (4th ed.). Pearson.