ANCOVA (Analysis of Covariance)

The analysis of covariance (ANCOVA) is a statistical method that combines analysis of variance (ANOVA) with linear regression. It compares group means while statistically removing the influence of one or more covariates (control variables). This allows for more precise estimation of group differences and increased statistical power.

When to Use

You want to compare group means while controlling for the influence of a continuous confounding variable
There is a relevant baseline variable (e.g., pretest scores) that could bias the group comparison
You want to reduce error variance to increase the power of your test
The covariate was measured before the experimental manipulation, so it cannot have been influenced by the independent variable (IV)
Full randomization was not possible and you want to control for pre-existing differences

Assumptions

Normality of residuals (Shapiro-Wilk test)
Homogeneity of variances (Levene's test)
Independence of observations
Continuous dependent variable (DV) and continuous covariate
Linear relationship between covariate and DV (check with scatterplot)
Homogeneity of regression slopes (IV × covariate interaction not significant)
Covariate independent of IV (measured before manipulation)

Note: Homogeneity of regression slopes is a critical assumption. Test it by fitting a model that includes the IV × covariate interaction term. If the interaction is significant, the covariate's effect differs between groups, so a single adjusted group difference is not meaningful and standard ANCOVA is not appropriate. Consider the Johnson-Neyman technique or a moderated regression model instead.
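The slope check described in the note can be sketched without any statistics package. The following is a minimal illustration, assuming a single covariate; the function name `slope_homogeneity_f` and the toy data are hypothetical and not part of the original example. It compares the residual sum of squares of a common-slope model against a model with one slope per group, which is equivalent to testing the IV × covariate interaction.

```python
def slope_homogeneity_f(groups):
    """F statistic for equal covariate slopes across groups (one covariate).

    groups: dict mapping a group label to a list of (covariate, outcome) pairs.
    Returns (per-group slopes, F statistic, numerator df, denominator df).
    """
    slopes = {}
    sxx_w = sxy_w = syy_w = 0.0  # pooled within-group sums of squares/products
    sse_separate = 0.0           # residual SS when each group has its own slope
    n_total = 0
    for label, pairs in groups.items():
        n = len(pairs)
        mean_x = sum(x for x, _ in pairs) / n
        mean_y = sum(y for _, y in pairs) / n
        sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
        syy = sum((y - mean_y) ** 2 for _, y in pairs)
        slopes[label] = sxy / sxx
        sse_separate += syy - sxy ** 2 / sxx
        sxx_w += sxx
        sxy_w += sxy
        syy_w += syy
        n_total += n

    k = len(groups)
    # Residual SS when all groups share one (pooled) slope
    sse_common = syy_w - sxy_w ** 2 / sxx_w
    df_num, df_den = k - 1, n_total - 2 * k
    f_stat = ((sse_common - sse_separate) / df_num) / (sse_separate / df_den)
    return slopes, f_stat, df_num, df_den


# Hypothetical toy data: (covariate, outcome) pairs per group
toy = {"A": [(1, 2), (2, 4), (3, 3)], "B": [(1, 1), (2, 4), (3, 4)]}
slopes, f_stat, df_num, df_den = slope_homogeneity_f(toy)
print(slopes, round(f_stat, 3), df_num, df_den)
```

The resulting F value would be compared against an F distribution with the returned degrees of freedom. In practice a statistics package would report the corresponding p-value directly.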

Formula

The ANCOVA model for observation $j$ in group $i$, with outcome $Y_{ij}$:

$$
Y_{ij} = \mu + \tau_i + \beta(X_{ij} - \bar{X}) + \varepsilon_{ij}
$$

Where:

$\mu$ is the grand mean
$\tau_i$ is the effect of group $i$
$\beta$ is the regression coefficient of the covariate, assumed to be the same in every group
$X_{ij} - \bar{X}$ is the centered covariate, where $\bar{X}$ is the grand mean of the covariate
$\varepsilon_{ij}$ is the residual error

The F-statistic for the group effect:

$$
F = \frac{MS_{group(adj)}}{MS_{error(adj)}}
$$
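For orientation, the mean squares in this ratio are the adjusted sums of squares divided by their degrees of freedom. As a sketch, assuming a single covariate and writing $k$ for the number of groups and $N$ for the total sample size (symbols introduced here for illustration):

$$
MS_{group(adj)} = \frac{SS_{group(adj)}}{k - 1}, \qquad MS_{error(adj)} = \frac{SS_{error(adj)}}{N - k - 1}
$$

Compared with a one-way ANOVA, the covariate costs one error degree of freedom. With $k = 2$ and $N = 60$, the test has $(1, 57)$ degrees of freedom.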

The adjusted group means estimate what each group's mean would be if every group had a covariate mean equal to the grand mean of the covariate:

$$
\bar{Y}_{i(adj)} = \bar{Y}_i - \hat{\beta}(\bar{X}_i - \bar{X})
$$
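The formulas in this section can be traced through a small from-scratch calculation. The sketch below assumes a single covariate and uses the classical sums-of-squares approach, in which the adjusted group sum of squares is the adjusted total minus the adjusted error. The function name `ancova_one_covariate` and the toy data are hypothetical and chosen only for illustration.

```python
from statistics import fmean


def ancova_one_covariate(groups):
    """One-way ANCOVA with one covariate.

    groups: dict mapping a group label to a list of (covariate, outcome) pairs.
    Returns the pooled slope, adjusted group means, F statistic, and its df.
    """
    all_x = [x for pairs in groups.values() for x, _ in pairs]
    all_y = [y for pairs in groups.values() for _, y in pairs]
    n_total, k = len(all_x), len(groups)
    grand_x, grand_y = fmean(all_x), fmean(all_y)

    # Within-group sums of squares and cross-products, pooled over groups
    sxx_w = sxy_w = syy_w = 0.0
    group_means = {}
    for label, pairs in groups.items():
        mean_x = fmean([x for x, _ in pairs])
        mean_y = fmean([y for _, y in pairs])
        group_means[label] = (mean_x, mean_y)
        sxx_w += sum((x - mean_x) ** 2 for x, _ in pairs)
        sxy_w += sum((x - mean_x) * (y - mean_y) for x, y in pairs)
        syy_w += sum((y - mean_y) ** 2 for _, y in pairs)

    # Total sums of squares and cross-products (ignoring group membership)
    sxx_t = sum((x - grand_x) ** 2 for x in all_x)
    sxy_t = sum((x - grand_x) * (y - grand_y) for x, y in zip(all_x, all_y))
    syy_t = sum((y - grand_y) ** 2 for y in all_y)

    beta = sxy_w / sxx_w                         # pooled within-group slope
    ss_error_adj = syy_w - sxy_w ** 2 / sxx_w    # adjusted error SS
    ss_total_adj = syy_t - sxy_t ** 2 / sxx_t    # adjusted total SS
    ss_group_adj = ss_total_adj - ss_error_adj   # adjusted group SS
    df_group, df_error = k - 1, n_total - k - 1
    f_stat = (ss_group_adj / df_group) / (ss_error_adj / df_error)

    # Adjusted mean: group mean shifted to the grand mean of the covariate
    adjusted_means = {
        label: mean_y - beta * (mean_x - grand_x)
        for label, (mean_x, mean_y) in group_means.items()
    }
    return {
        "beta": beta,
        "adjusted_means": adjusted_means,
        "f": f_stat,
        "df": (df_group, df_error),
    }


# Hypothetical toy data: (covariate, outcome) pairs per group
toy = {"A": [(1, 2), (2, 4), (3, 3)], "B": [(3, 5), (4, 7), (5, 6)]}
result = ancova_one_covariate(toy)
print(result)
```

In the toy data the raw group means differ by 3 points, while the adjusted means differ by only 2, because group B also has the higher covariate mean.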

Example

Practical Example: Therapy Comparison Controlling for Baseline

A clinical psychologist compares two therapy types (CBT vs. EMDR) for depression patients. Because the groups differ in baseline depression severity (BDI pre-score) despite randomization, this variable is included as a covariate.

IV: Therapy type (CBT, EMDR), DV: BDI post-score, Covariate: BDI pre-score
N = 60 (30 per group)

Results:

Without covariate (ANOVA): $F(1, 58) = 2.14$, $p = .149$ (no significant difference)
With covariate (ANCOVA): $F(1, 57) = 7.83$, $p = .007$, $\eta_p^2 = .12$ (significant difference)

The ANCOVA reveals that EMDR leads to significantly lower depression scores after controlling for baseline. The adjusted means are 12.3 (CBT) vs. 8.7 (EMDR). Without controlling for baseline scores, this effect was masked by pre-existing group differences.

Effect Size

The standard effect size measure is partial eta-squared ($\eta_p^2$):

$$
\eta_p^2 = \frac{SS_{group(adj)}}{SS_{group(adj)} + SS_{error(adj)}}
$$

Effect Size	$\eta_p^2$
Small	0.01
Medium	0.06
Large	0.14

Important: The effect size refers to the adjusted sums of squares, i.e., after removing the covariate's influence. Partial eta-squared thus represents the proportion of remaining variance explained by the group difference.
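When only the F-statistic and its degrees of freedom are reported, partial eta-squared can be recovered from them, because the ratio of the adjusted sums of squares equals $F \cdot df_{effect} / df_{error}$. The helper below is a minimal sketch of that conversion (the function name is hypothetical), applied to the ANCOVA result from the therapy example.

```python
def partial_eta_squared_from_f(f_stat, df_effect, df_error):
    """Partial eta-squared from an F statistic and its degrees of freedom.

    Follows from F = (SS_effect / df_effect) / (SS_error / df_error).
    """
    return (f_stat * df_effect) / (f_stat * df_effect + df_error)


# Values reported in the therapy example: F(1, 57) = 7.83
eta_p2 = partial_eta_squared_from_f(7.83, 1, 57)
print(round(eta_p2, 2))  # 0.12
```

The result agrees with the reported $\eta_p^2 = .12$, which lies between the medium and large benchmarks in the table.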