Effect Sizes#
Effect size quantifies the magnitude of an effect regardless of sample size. While the p-value only indicates whether an effect is statistically significant, the effect size tells you how large the effect is.
Why Effect Sizes Matter#
A statistically significant result can be practically meaningless. Conversely, a non-significant result may reflect a substantial effect that went undetected due to a small sample size.
Why p-values alone are not enough
Two studies examine the effect of a medication:
- Study A (n = 20): Mean difference = 8 points, p = 0.12, d = 0.72
- Study B (n = 5,000): Mean difference = 0.3 points, p < 0.001, d = 0.04
Study B is significant, but the effect is tiny. Study A shows a substantial effect that did not reach significance because of the small sample.
Key Effect Size Measures#
Cohen's d – For Mean Comparisons#
Cohen's d measures the difference between two means in units of standard deviation.
For independent samples:

$$ d = \frac{M_1 - M_2}{s_{\text{pooled}}} $$

with the pooled standard deviation:

$$ s_{\text{pooled}} = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}} $$

For paired samples (Cohen's d_z), the mean of the difference scores is divided by their standard deviation:

$$ d_z = \frac{M_{\text{diff}}}{s_{\text{diff}}} $$
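The independent-samples formula above can be sketched in plain Python (function name `cohens_d` is mine, not from the text):

```python
import math

def cohens_d(x, y):
    """Cohen's d for two independent samples, using the pooled SD."""
    n1, n2 = len(x), len(y)
    m1, m2 = sum(x) / n1, sum(y) / n2
    # Sample variances (denominator n - 1)
    var1 = sum((v - m1) ** 2 for v in x) / (n1 - 1)
    var2 = sum((v - m2) ** 2 for v in y) / (n2 - 1)
    # Pooled standard deviation
    s_pooled = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (m1 - m2) / s_pooled
```

The sign of the result depends on which group is passed first; only the magnitude matters for the benchmarks below.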
| Interpretation | Cohen's d |
|---|---|
| Small | 0.2 |
| Medium | 0.5 |
| Large | 0.8 |
Eta-Squared (η²) – For ANOVA#
Eta-squared indicates the proportion of variance explained by the effect relative to the total variance:

$$ \eta^2 = \frac{SS_{\text{effect}}}{SS_{\text{total}}} $$
| Interpretation | η² |
|---|---|
| Small | 0.01 |
| Medium | 0.06 |
| Large | 0.14 |
Note: η² systematically overestimates the population effect size. Partial η² or omega-squared (ω²) is therefore often preferred.
Partial Eta-Squared (η²_p)#
In factorial ANOVAs, partial η² relates the effect only to the relevant error variance:

$$ \eta^2_p = \frac{SS_{\text{effect}}}{SS_{\text{effect}} + SS_{\text{error}}} $$
Omega-Squared (ω²) – A Less Biased Estimator#

Omega-squared corrects the positive bias of η² and gives a more accurate estimate of the population effect size:

$$ \omega^2 = \frac{SS_{\text{effect}} - df_{\text{effect}} \cdot MS_{\text{error}}}{SS_{\text{total}} + MS_{\text{error}}} $$
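The ANOVA-based measures can be computed directly from sums of squares. A minimal sketch for a one-way design (function name `anova_effect_sizes` is mine), returning η² = SS_between / SS_total and ω² = (SS_between − df_between · MS_within) / (SS_total + MS_within):

```python
def anova_effect_sizes(groups):
    """eta-squared and omega-squared for a one-way ANOVA."""
    all_vals = [v for g in groups for v in g]
    n = len(all_vals)
    grand_mean = sum(all_vals) / n
    ss_total = sum((v - grand_mean) ** 2 for v in all_vals)
    # Between-groups sum of squares: weighted squared deviations of group means
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = ss_total - ss_between
    df_between = len(groups) - 1
    df_within = n - len(groups)
    ms_within = ss_within / df_within
    eta_sq = ss_between / ss_total
    omega_sq = (ss_between - df_between * ms_within) / (ss_total + ms_within)
    return eta_sq, omega_sq
```

On the same data, ω² is always a bit smaller than η², reflecting the bias correction.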
Correlation Coefficient r#
Pearson's r is itself an effect size measure for the linear relationship between two variables.
| Interpretation | \|r\| |
|---|---|
| Small | 0.10 |
| Medium | 0.30 |
| Large | 0.50 |
Converting Between Effect Size Measures#

For two groups of equal size, Cohen's d and r can be converted into one another:

$$ d = \frac{2r}{\sqrt{1 - r^2}}, \qquad r = \frac{d}{\sqrt{d^2 + 4}} $$
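Assuming the standard equal-group-size conversions d = 2r / √(1 − r²) and r = d / √(d² + 4), a minimal sketch (function names are mine):

```python
import math

def r_to_d(r):
    """Convert a correlation r to Cohen's d (equal group sizes assumed)."""
    return 2 * r / math.sqrt(1 - r ** 2)

def d_to_r(d):
    """Convert Cohen's d back to a correlation r."""
    return d / math.sqrt(d ** 2 + 4)
```

The two functions are exact inverses of each other, so a round trip recovers the original value.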
Cramér's V – For Categorical Data#
Cramér's V is the effect size measure for the chi-square test:

$$ V = \sqrt{\frac{\chi^2}{n\,(k - 1)}} $$

where n is the total sample size and k is the minimum of the number of rows and columns.
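A short sketch (function name `cramers_v` is mine) that computes the chi-square statistic from a contingency table and converts it to V:

```python
import math

def cramers_v(table):
    """Cramér's V from a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    # Pearson chi-square: sum of (observed - expected)^2 / expected
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            chi2 += (obs - expected) ** 2 / expected
    k = min(len(table), len(table[0]))
    return math.sqrt(chi2 / (n * (k - 1)))
```

V ranges from 0 (no association) to 1 (perfect association), regardless of table size.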
Reporting Effect Sizes#
The APA (American Psychological Association) recommends always reporting effect sizes and confidence intervals.
Correct APA-style reporting
"The independent t-test showed a significant difference between the experimental group (M = 24.3, SD = 4.1) and the control group (M = 20.1, SD = 3.8), t(38) = 3.28, p = .002, d = 1.06, 95% CI [0.38, 1.73]."
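A confidence interval like the one in this report can be approximated with the large-sample standard error of d, SE(d) ≈ √((n₁+n₂)/(n₁n₂) + d²/(2(n₁+n₂))), a common approximation from meta-analysis. A sketch under that assumption (exact CIs use the noncentral t distribution and differ slightly; the helper name is mine):

```python
import math

def d_with_ci(d, n1, n2, z=1.96):
    """Approximate 95% CI for Cohen's d via the large-sample normal SE."""
    se = math.sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    return d - z * se, d + z * se
```

For d = 1.06 with two groups of 20, this gives roughly [0.40, 1.72], close to the interval reported above.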
Practical Interpretation#
Cohen's benchmarks (small, medium, large) are general guidelines. Practical significance depends on context:
- In medicine, a "small" effect (d = 0.2) can save thousands of lives
- In educational research, a "medium" effect (d = 0.5) is already noteworthy
- In basic research, large effects are not uncommon
Common Misconceptions#
"A significant p-value means a large effect." No. Significance and effect size are independent concepts. Significance depends heavily on sample size.
"Cohen's benchmarks are universal." The benchmarks (0.2 / 0.5 / 0.8) are conventions. In some research areas, d = 0.2 is already a meaningful effect; in others, d = 0.8 is relatively small.
"Negative effect sizes mean bad results." The sign only indicates the direction. A d = -0.5 is just as large as d = +0.5, only in the opposite direction.
Further Reading#
- Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.). Lawrence Erlbaum Associates.
- Field, A. (2018). Discovering Statistics Using IBM SPSS Statistics (5th ed.). SAGE.