Test Assumptions#
Every statistical test relies on certain assumptions (requirements). When these assumptions are violated, results can be biased, p-values unreliable, or conclusions invalid.
Overview of Key Assumptions#
1. Level of Measurement#
The type of variable determines the class of possible tests:
- Metric (interval/ratio) → t-test, ANOVA, regression
- Ordinal → Mann-Whitney, Kruskal-Wallis, Spearman
- Nominal → Chi-square, Fisher's exact test
2. Independence of Observations#
Measurements from different participants must not influence each other. This assumption applies to virtually all statistical tests.
Violation: When students from the same classroom are tested, observations are not independent (cluster effect). Solution: Multilevel models or averaging at the cluster level.
Example: Violation of independence
A researcher examines the academic performance of 200 students from 10 classrooms. Students in the same class share the same teacher and influence each other.
Problem: A simple t-test ignores the cluster structure and produces p-values that are too small. Solution: A multilevel model accounts for the nesting of students within classrooms.
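A minimal numpy/scipy sketch of the problem (all numbers here are hypothetical: 10 classrooms of 20 students, an assumed cluster effect, and a fixed random seed). It contrasts the naive t-test over all 200 students with the simple remedy mentioned above, averaging at the cluster level:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical simulation: 10 classrooms, 20 students each.
# Each classroom gets a shared random effect, so students within
# a class are correlated -- the independence assumption is violated.
n_classes, n_per_class = 10, 20
class_effects = rng.normal(0, 5, size=n_classes)  # cluster effect
scores = np.concatenate([
    70 + u + rng.normal(0, 3, size=n_per_class) for u in class_effects
])
class_ids = np.repeat(np.arange(n_classes), n_per_class)

# First 5 classrooms form group A, last 5 group B (no true difference).
group = class_ids < 5

# Naive t-test on all 200 students ignores the clustering.
t_naive, p_naive = stats.ttest_ind(scores[group], scores[~group])

# Remedy from the text: aggregate to one mean per classroom,
# then test on the 10 cluster means (5 vs. 5).
class_means = np.array([scores[class_ids == c].mean() for c in range(n_classes)])
t_agg, p_agg = stats.ttest_ind(class_means[:5], class_means[5:])
```

The aggregated test uses the correct unit of analysis (the classroom), at the cost of statistical power; a multilevel model would keep the student-level data while modelling the nesting.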
3. Normal Distribution#
Many parametric tests assume that certain distributions are normal:
| Test | What must be normally distributed? |
|---|---|
| t-test (independent) | Data in each group |
| t-test (paired) | Differences of paired values |
| ANOVA | Residuals (equivalently, the data within each group) |
| Regression | Residuals |
How to check:
- Graphical: Histogram, Q-Q plot
- Statistical: Shapiro-Wilk test (n < 50), Kolmogorov-Smirnov test (n ≥ 50)
- Descriptive measures: Skewness and kurtosis
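The statistical and descriptive checks listed above are all available in `scipy.stats`. A short sketch on simulated data (the sample itself is hypothetical, drawn with a fixed seed):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(loc=100, scale=15, size=40)  # hypothetical data, n < 50

# Shapiro-Wilk test (recommended above for n < 50)
w_stat, p_shapiro = stats.shapiro(sample)

# Descriptive measures: skewness and excess kurtosis
skew = stats.skew(sample)
kurt = stats.kurtosis(sample)  # Fisher definition: 0 for a normal distribution

# Q-Q plot coordinates; plot osm against osr (e.g. with matplotlib)
# to inspect normality graphically. r close to 1 indicates a good fit.
(osm, osr), (slope, intercept, r) = stats.probplot(sample, dist="norm")
```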
4. Homogeneity of Variance (Homoscedasticity)#
The spread of the dependent variable should be equal across all groups.
How to check:
- Levene's test: Most commonly used, robust against non-normality
- Bartlett's test: More sensitive, assumes normality
- Rule of thumb: If the ratio of the largest to the smallest variance is < 3:1, the assumption is considered met
When violated:
- t-test → Welch's t-test (adjusts degrees of freedom)
- ANOVA → Welch's ANOVA or Brown-Forsythe test
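Both the check (Levene's test, variance-ratio rule of thumb) and the remedy (Welch's t-test) can be sketched with `scipy.stats`; the two groups below are simulated with deliberately unequal spread:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical groups with clearly unequal spread (sd 5 vs. sd 15)
group_a = rng.normal(50, 5, size=30)
group_b = rng.normal(55, 15, size=30)

# Levene's test: a significant p-value suggests unequal variances
lev_stat, p_levene = stats.levene(group_a, group_b)

# Rule of thumb from the text: largest-to-smallest variance ratio < 3
var_a, var_b = group_a.var(ddof=1), group_b.var(ddof=1)
ratio = max(var_a, var_b) / min(var_a, var_b)

# Welch's t-test: equal_var=False adjusts the degrees of freedom
t_welch, p_welch = stats.ttest_ind(group_a, group_b, equal_var=False)
```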
5. Sphericity (for Repeated Measures)#
In repeated-measures ANOVAs, the variances of differences between all pairs of levels must be equal.
How to check: Mauchly's test
When violated:
- Greenhouse-Geisser correction (more conservative)
- Huynh-Feldt correction (more liberal)
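Both corrections rescale the degrees of freedom by an estimate of sphericity, epsilon. A minimal numpy sketch of the Greenhouse-Geisser epsilon (the repeated-measures data are simulated; Mauchly's test itself is more involved and omitted here):

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical repeated-measures data: 25 subjects x 4 conditions
data = rng.normal(0, 1, size=(25, 4)) + np.array([0.0, 0.2, 0.4, 0.6])
k = data.shape[1]

# Greenhouse-Geisser epsilon from the double-centered covariance matrix:
#   epsilon = trace(S)^2 / ((k - 1) * sum(S_ij^2))
# epsilon is bounded by 1/(k-1) (maximal violation) and 1 (perfect sphericity);
# the ANOVA degrees of freedom are multiplied by epsilon.
S = np.cov(data, rowvar=False)
S = S - S.mean(axis=0, keepdims=True) - S.mean(axis=1, keepdims=True) + S.mean()
epsilon = np.trace(S) ** 2 / ((k - 1) * np.sum(S ** 2))
```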
6. Linearity#
For correlation and regression, the relationship between variables must be linear.
How to check: Scatter plot, residual plot
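A residual plot makes a linearity violation visible as a systematic pattern. In this hypothetical numpy sketch the true relationship is quadratic, so the residuals of a straight-line fit show a clear U-shape (quantified here by correlating them with the curved part of x²):

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(0, 10, 100)
y = 2 + 0.5 * x ** 2 + rng.normal(0, 1, size=100)  # deliberately nonlinear

# Fit a straight line and inspect the residuals: a systematic pattern
# (here U-shaped) signals a violation of linearity.
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (intercept + slope * x)

# Crude numeric check: correlate the residuals with the part of x^2
# that a straight line cannot capture.
x2_curved = x ** 2 - np.polyval(np.polyfit(x, x ** 2, 1), x)
curvature = np.corrcoef(x2_curved, residuals)[0, 1]
```

With a truly linear relationship the residuals would scatter randomly around zero and `curvature` would be close to 0.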
7. No (Multi-)Collinearity#
In multiple regression, predictors should not be too strongly correlated with each other.
How to check: Variance Inflation Factor (VIF). A VIF > 10 indicates problematic collinearity.
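The VIF for predictor j is 1 / (1 − R²), where R² comes from regressing predictor j on all other predictors. A self-contained numpy sketch (the data and the near-duplicate predictor are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)  # nearly a copy of x1 -> collinear
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

def vif(X, j):
    """VIF_j = 1 / (1 - R^2) from regressing predictor j on the others."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(y)), others])  # add intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1 - resid.var() / y.var()
    return 1.0 / (1.0 - r2)

vifs = [vif(X, j) for j in range(X.shape[1])]
# x1 and x2 exceed the VIF > 10 threshold; the independent x3 does not.
```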
Which Test Requires Which Assumptions?#
| Assumption | t-test | ANOVA | Mann-Whitney | Kruskal-Wallis | Chi² |
|---|---|---|---|---|---|
| Metric DV | Yes | Yes | No | No | No |
| Independence | Yes | Yes | Yes | Yes | Yes |
| Normal distribution | Yes | Yes | No | No | No |
| Homogeneity of variance | Yes | Yes | No* | No* | No |
| Expected frequencies ≥ 5 | – | – | – | – | Yes |
*Mann-Whitney and Kruskal-Wallis assume similar distribution shapes when comparing medians.
How Important Are Assumptions Really?#
Not all violations are equally serious:
Robust against violation:
- t-test and ANOVA are robust against moderate normality violations with equal group sizes
- With n > 30 per group, mild deviations from normality are usually unproblematic
Sensitive to violation:
- Violation of independence is almost always problematic
- Unequal group sizes + variance heterogeneity is a critical combination
- Sphericity violation in repeated-measures ANOVA can substantially inflate error rates
Decision Path When Assumptions Are Violated#
- Check the assumption (graphically and/or statistically)
- Assess the severity of the violation (mild → often tolerable, severe → problematic)
- Choose an alternative:
- Non-parametric test
- Robust test (e.g., Welch's t-test)
- Data transformation
- Bootstrapping
Practical decision tree#
You want to compare two groups (metric DV):
1. Check normality → Shapiro-Wilk
   - Yes → Continue to step 2
   - No → Mann-Whitney U test
2. Check homogeneity of variance → Levene's test
   - Yes → t-test
   - No → Welch's t-test
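The decision tree above can be sketched as a small helper function using `scipy.stats`. The function name and the alpha threshold for the assumption checks are our own choices, not a standard API:

```python
import numpy as np
from scipy import stats

def choose_two_group_test(a, b, alpha=0.05):
    """Sketch of the decision path: Shapiro -> Levene -> test choice."""
    # Step 1: normality in each group (Shapiro-Wilk)
    if stats.shapiro(a).pvalue < alpha or stats.shapiro(b).pvalue < alpha:
        result = stats.mannwhitneyu(a, b)
        return "Mann-Whitney U", result.pvalue
    # Step 2: homogeneity of variance (Levene)
    if stats.levene(a, b).pvalue < alpha:
        result = stats.ttest_ind(a, b, equal_var=False)
        return "Welch's t-test", result.pvalue
    result = stats.ttest_ind(a, b)
    return "Student's t-test", result.pvalue

# Hypothetical usage with simulated samples
rng = np.random.default_rng(5)
name, p = choose_two_group_test(rng.normal(0, 1, 40), rng.normal(0.5, 1, 40))
```

As the misconceptions below note, such automated checks should be combined with graphical inspection rather than applied mechanically.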
Common Misconceptions#
"If the Shapiro-Wilk test is significant, I must use a non-parametric test." Not necessarily. With large samples, the Shapiro-Wilk test is almost always significant, even for negligible deviations. Combine the test with graphical inspection.
"Non-parametric tests have no assumptions." Wrong. Non-parametric tests also have assumptions, such as independence and (for median comparisons) similar distribution shapes.
"I must perfectly meet all assumptions." Statistics in practice is rarely perfect. Many tests are robust against mild violations. What matters is knowing about violations and reporting them transparently.
Further Reading#
- Field, A. (2018). Discovering Statistics Using IBM SPSS Statistics (5th ed.). SAGE.
- Bortz, J. & Schuster, C. (2010). Statistik für Human- und Sozialwissenschaftler (7th ed.). Springer.