Linear Mixed Models (LMM)

If you would normally reach for RM-ANOVA but have missing data, unequal time intervals, or nested structures, linear mixed models (LMM) are the tool to use. They are not a complicated specialist method but a natural extension of ANOVA; in fact, RM-ANOVA is a special case of the LMM.

When to Use

You have repeated measures but some data are missing: LMM uses every available observation instead of excluding entire participants (the estimates are valid when the data are missing at random)
Your measurements have unequal time intervals (e.g., assessments at 1, 3, and 12 months)
Your data are nested (e.g., students in classrooms, patients in clinics)
You need more flexibility than a traditional ANOVA provides — such as different covariance structures
You want to model individual trajectories (each person can have their own starting point and rate of change)

Fixed and Random Effects: Simply Explained

Imagine you are studying learning progress of students across different classrooms:

Fixed effects answer your research question: "Does the new teaching method improve performance?" — these are the effects you want to generalize to the population.
Random effects model the structure of your data: "Students in the same classroom are more similar to each other than students from different classrooms." — they capture variation between clusters (e.g., classrooms).

Key insight: Fixed effects = what you care about scientifically. Random effects = the nesting in your data.

Assumptions

Linearity — the relationship between predictors and outcome is linear
Normality of residuals (not the raw data!)
Normality of random effects
Independence at the highest level (e.g., classrooms are independent of one another); observations within the same cluster are allowed to be correlated

Tip: LMMs are fairly robust to mild violations of normality, especially with larger samples. Check the residuals with a QQ plot. Unlike RM-ANOVA, an LMM does not require the sphericity assumption.
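
The QQ check mentioned in the tip can be sketched with the Python standard library alone. This is a minimal illustration rather than a prescribed workflow: the residuals are simulated here (in practice they would come from the fitted model), and `qq_points` and `pearson` are hypothetical helper names. A correlation close to 1 between theoretical and observed quantiles corresponds to points lying near the QQ line.

```python
import random
from statistics import NormalDist, mean


def qq_points(residuals):
    """Return (theoretical, observed) quantiles for a normal QQ plot."""
    observed = sorted(residuals)
    n = len(observed)
    # Plotting positions (i - 0.5) / n avoid the infinite quantiles at 0 and 1.
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return theoretical, observed


def pearson(x, y):
    """Plain Pearson correlation, used here as a rough straightness measure."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(0)

# Hypothetical residuals: one roughly normal set, one clearly skewed set.
residuals = [rng.gauss(0.0, 2.0) for _ in range(200)]
skewed = [rng.expovariate(1.0) for _ in range(200)]

theo, obs = qq_points(residuals)
qq_corr = pearson(theo, obs)

theo_s, obs_s = qq_points(skewed)
skew_corr = pearson(theo_s, obs_s)

print(f"QQ correlation, normal residuals: {qq_corr:.3f}")
print(f"QQ correlation, skewed residuals: {skew_corr:.3f}")
```

Plotting `theo` against `obs` gives the usual QQ plot; the skewed set should bend away from a straight line and show a visibly lower correlation.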

Formula

The core idea is simple — just add a random component:

$Y_{ij} = (\beta_0 + b_{0j}) + (\beta_1 + b_{1j}) \cdot X_{ij} + \varepsilon_{ij}$

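Where $i$ indexes observations and $j$ indexes groups (persons or clusters):

$Y_{ij}$ = outcome for observation $i$ in group $j$
$\beta_0$ = fixed intercept (expected outcome at $X_{ij} = 0$ for an average group)
$b_{0j}$ = random deviation of the intercept for group $j$ (random intercept)
$\beta_1$ = fixed effect of the predictor (e.g., time)
$b_{1j}$ = random deviation of the slope for group $j$ (random slope)
$\varepsilon_{ij}$ = residual for observation $i$ in group $j$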

In plain language: Each person (or cluster) gets their own starting point and their own rate of change, but all are drawn from a shared distribution.
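
To make the formula concrete, the following sketch generates data from it. All parameter values and the function name `simulate_lmm` are hypothetical, and the random intercept and slope are drawn independently for simplicity (in general the two may be correlated).

```python
import random


def simulate_lmm(n_groups=5, n_obs=4, beta0=50.0, beta1=-2.0,
                 sd_intercept=4.0, sd_slope=1.0, sd_resid=2.0, seed=1):
    """Simulate Y_ij = (beta0 + b0j) + (beta1 + b1j) * X_ij + eps_ij."""
    rng = random.Random(seed)
    rows = []
    for j in range(n_groups):
        # Each group draws its own deviations from a shared distribution.
        b0j = rng.gauss(0.0, sd_intercept)  # random intercept deviation
        b1j = rng.gauss(0.0, sd_slope)      # random slope deviation
        for i in range(n_obs):
            x = float(i)                     # predictor, e.g. time
            eps = rng.gauss(0.0, sd_resid)   # residual
            y = (beta0 + b0j) + (beta1 + b1j) * x + eps
            rows.append({"group": j, "x": x, "y": y, "b0": b0j, "b1": b1j})
    return rows


rows = simulate_lmm()
for row in rows[:4]:
    print(row["group"], row["x"], round(row["y"], 2))
```

Within a group, `b0` and `b1` stay constant across observations, which is what makes observations from the same group more alike than observations from different groups.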

RM-ANOVA is a special case of the LMM — with a balanced design, complete data, and compound symmetry covariance structure, both yield identical results.
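
One way to see the compound-symmetry connection: with a random intercept only (no random slope), the formula above implies that any two measurements from the same person share the covariance $\sigma^2_{b_0}$, while each measurement has variance $\sigma^2_{b_0} + \sigma^2_e$. The sketch below builds that implied covariance matrix; the variance values and the function name `implied_covariance` are hypothetical.

```python
def implied_covariance(n_times, var_intercept, var_resid):
    """Covariance matrix of one person's repeated measures
    under a random-intercept-only LMM."""
    return [
        [var_intercept + (var_resid if r == c else 0.0) for c in range(n_times)]
        for r in range(n_times)
    ]


cov = implied_covariance(4, var_intercept=16.0, var_resid=9.0)
for row in cov:
    print(row)
```

Every diagonal entry is the same (25.0) and every off-diagonal entry is the same (16.0), which is exactly the compound symmetry pattern. Adding a random slope makes variances and covariances depend on time, which is the extra flexibility the LMM offers.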

Example

Practical Example: Stress Reduction With Missing Data

60 patients participate in an 8-week stress reduction program. Stress levels are measured at 4 time points (weeks 0, 2, 5, and 8 — unequal intervals). 12 patients miss at least one appointment.

With RM-ANOVA: 12 patients are excluded entirely → only 48 patients → loss of information and potential bias.

With LMM: All 60 patients are included. The model uses whatever data each person provides.

Model: Stress ~ Time + (1 + Time | Person)

Results:

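Fixed effect of time: $\beta_1 = -2.34$, $SE = 0.41$, $p < .001$. Stress decreases by an average of 2.34 points per unit of Time (here, per measurement occasion; to reflect the unequal spacing, Time would be coded in weeks, and the coefficient would then be a per-week change)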
Random intercept SD: 4.12 — individuals start at different stress levels
Random slope SD: 1.08 — the rate of stress reduction varies across individuals
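
A model of this form could be fit in Python roughly as follows. This is a sketch under several assumptions: `statsmodels`, `pandas`, and `numpy` are installed; the data are simulated with hypothetical parameters, so the output will not reproduce the results reported above; and Time is coded in weeks so that the unequal spacing enters the model. In `statsmodels`, `re_formula="~Time"` requests a random intercept and a random slope for Time, which corresponds to `(1 + Time | Person)`.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
weeks = [0.0, 2.0, 5.0, 8.0]   # unequal intervals, as in the example
n_persons = 60

rows = []
for person in range(n_persons):
    b0 = rng.normal(0.0, 4.0)   # hypothetical random intercept deviation
    b1 = rng.normal(0.0, 0.5)   # hypothetical random slope deviation
    for t in weeks:
        stress = (30.0 + b0) + (-1.0 + b1) * t + rng.normal(0.0, 2.0)
        rows.append({"Person": person, "Time": t, "Stress": stress})
data = pd.DataFrame(rows)

# Mimic missed appointments: 12 persons lose their week-5 measurement.
missed = data[(data["Person"] < 12) & (data["Time"] == 5.0)].index
data = data.drop(index=missed).reset_index(drop=True)

# Long format: one row per available measurement, no person is dropped.
model = smf.mixedlm("Stress ~ Time", data, groups=data["Person"],
                    re_formula="~Time")
result = model.fit()

print(result.fe_params)   # fixed intercept and fixed effect of Time
print(result.cov_re)      # covariance matrix of the random effects
```

The key point is the long data format: persons with a missing visit simply contribute fewer rows, so all 60 remain in the analysis.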

Effect Size

For LMM, two $R^2$ measures are reported (following Nakagawa & Schielzeth):

$R^2_{\text{marginal}} = \frac{\sigma^2_f}{\sigma^2_f + \sigma^2_r + \sigma^2_e}$

$R^2_{\text{conditional}} = \frac{\sigma^2_f + \sigma^2_r}{\sigma^2_f + \sigma^2_r + \sigma^2_e}$

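Marginal $R^2$: Variance explained by fixed effects alone
Conditional $R^2$: Variance explained by fixed + random effects

Here $\sigma^2_f$ is the variance of the fixed-effects predictions, $\sigma^2_r$ the variance attributable to the random effects, and $\sigma^2_e$ the residual variance.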

Measure	Interpretation
$R^2_m$ = 0.13, $R^2_c$ = 0.55	Fixed effects explain 13%, with random effects 55% — cluster structure matters a lot
$R^2_m \approx R^2_c$	Little variation between clusters — a simpler model may suffice

Tip: When $R^2_m$ and $R^2_c$ diverge strongly, the random effects are important and a simple regression model would be inadequate.
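
The two formulas translate directly into code. In this sketch the variance components are hypothetical values chosen to reproduce the first row of the table, and `r2_nakagawa` is an illustrative function name, not a library call.

```python
def r2_nakagawa(var_fixed, var_random, var_resid):
    """Marginal and conditional R^2 from the three variance components."""
    total = var_fixed + var_random + var_resid
    r2_marginal = var_fixed / total
    r2_conditional = (var_fixed + var_random) / total
    return r2_marginal, r2_conditional


# Hypothetical variance components (fixed, random, residual).
r2_m, r2_c = r2_nakagawa(var_fixed=13.0, var_random=42.0, var_resid=45.0)
print(f"marginal R2 = {r2_m:.2f}, conditional R2 = {r2_c:.2f}")
# prints: marginal R2 = 0.13, conditional R2 = 0.55
```

The gap between the two values (0.55 versus 0.13) comes entirely from `var_random`; setting it to zero makes the two measures coincide.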