Test Assumptions#
Every statistical test relies on certain assumptions (requirements). When these assumptions are violated, results can be biased, p-values unreliable, or conclusions invalid.
Overview of Key Assumptions#
1. Level of Measurement#
The type of variable determines the class of possible tests:
- Metric (interval/ratio) → t-test, ANOVA, regression
- Ordinal → Mann-Whitney, Kruskal-Wallis, Spearman
- Nominal → Chi-square, Fisher's exact test
2. Independence of Observations#
Measurements from different participants must not influence each other. This assumption applies to virtually all statistical tests.
Violation: When students from the same classroom are tested, observations are not independent (cluster effect). Solution: Multilevel models or averaging at the cluster level.
Example: Violation of independence
A researcher examines the academic performance of 200 students from 10 classrooms. Students in the same class share the same teacher and influence each other.
Problem: A simple t-test ignores the cluster structure and produces p-values that are too small. Solution: A multilevel model accounts for the nesting of students within classrooms.
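A minimal numpy/scipy sketch of the problem (all numbers here are hypothetical: 10 classrooms of 20 students, an assumed cluster effect, and a fixed random seed). It contrasts the naive t-test over all 200 students with the simple remedy mentioned above, averaging at the cluster level:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical simulation: 10 classrooms, 20 students each.
# Each classroom gets a shared random effect, so students within
# a class are correlated -- the independence assumption is violated.
n_classes, n_per_class = 10, 20
class_effects = rng.normal(0, 5, size=n_classes)  # cluster effect
scores = np.concatenate([
    70 + u + rng.normal(0, 3, size=n_per_class) for u in class_effects
])
class_ids = np.repeat(np.arange(n_classes), n_per_class)

# First 5 classrooms form group A, last 5 group B (no true difference).
group = class_ids < 5

# Naive t-test on all 200 students ignores the clustering.
t_naive, p_naive = stats.ttest_ind(scores[group], scores[~group])

# Remedy from the text: aggregate to one mean per classroom,
# then test on the 10 cluster means (5 vs. 5).
class_means = np.array([scores[class_ids == c].mean() for c in range(n_classes)])
t_agg, p_agg = stats.ttest_ind(class_means[:5], class_means[5:])
```

The aggregated test uses the correct unit of analysis (the classroom), at the cost of statistical power; a multilevel model would keep the student-level data while modelling the nesting.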
3. Normal Distribution#
Many parametric tests assume that certain distributions are normal:
| Test | What must be normally distributed? |
|---|---|
| t-test (independent) | Data in each group |
| t-test (paired) | Differences of paired values |
| ANOVA | Residuals (equivalently, the data within each group) |
| Regression | Residuals |
How to check:
- Graphical: Histogram, Q-Q plot
- Statistical: Shapiro-Wilk test (n < 50), Kolmogorov-Smirnov test (n ≥ 50)
- Descriptive measures: Skewness and kurtosis
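The statistical and descriptive checks listed above are all available in `scipy.stats`. A short sketch on simulated data (the sample itself is hypothetical, drawn with a fixed seed):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(loc=100, scale=15, size=40)  # hypothetical data, n < 50

# Shapiro-Wilk test (recommended above for n < 50)
w_stat, p_shapiro = stats.shapiro(sample)

# Descriptive measures: skewness and excess kurtosis
skew = stats.skew(sample)
kurt = stats.kurtosis(sample)  # Fisher definition: 0 for a normal distribution

# Q-Q plot coordinates; plot osm against osr (e.g. with matplotlib)
# to inspect normality graphically. r close to 1 indicates a good fit.
(osm, osr), (slope, intercept, r) = stats.probplot(sample, dist="norm")
```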
4. Homogeneity of Variance (Homoscedasticity)#
The spread of the dependent variable should be equal across all groups.
How to check:
- Levene's test: Most commonly used, robust against non-normality
- Bartlett's test: More sensitive, assumes normality
- Rule of thumb: If the ratio of the largest to the smallest variance is < 3:1, the assumption is considered met
When violated:
- t-test → Welch's t-test (adjusts degrees of freedom)
- ANOVA → Welch's ANOVA or Brown-Forsythe test
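Both the check (Levene's test, variance-ratio rule of thumb) and the remedy (Welch's t-test) can be sketched with `scipy.stats`; the two groups below are simulated with deliberately unequal spread:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical groups with clearly unequal spread (sd 5 vs. sd 15)
group_a = rng.normal(50, 5, size=30)
group_b = rng.normal(55, 15, size=30)

# Levene's test: a significant p-value suggests unequal variances
lev_stat, p_levene = stats.levene(group_a, group_b)

# Rule of thumb from the text: largest-to-smallest variance ratio < 3
var_a, var_b = group_a.var(ddof=1), group_b.var(ddof=1)
ratio = max(var_a, var_b) / min(var_a, var_b)

# Welch's t-test: equal_var=False adjusts the degrees of freedom
t_welch, p_welch = stats.ttest_ind(group_a, group_b, equal_var=False)
```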
5. Sphericity (for Repeated Measures)#
In repeated-measures ANOVAs, the variances of differences between all pairs of levels must be equal.
How to check: Mauchly's test
When violated:
- Greenhouse-Geisser correction (more conservative)
- Huynh-Feldt correction (more liberal)
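Both corrections rescale the degrees of freedom by an estimate of sphericity, epsilon. A minimal numpy sketch of the Greenhouse-Geisser epsilon (the repeated-measures data are simulated; Mauchly's test itself is more involved and omitted here):

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical repeated-measures data: 25 subjects x 4 conditions
data = rng.normal(0, 1, size=(25, 4)) + np.array([0.0, 0.2, 0.4, 0.6])
k = data.shape[1]

# Greenhouse-Geisser epsilon from the double-centered covariance matrix:
#   epsilon = trace(S)^2 / ((k - 1) * sum(S_ij^2))
# epsilon is bounded by 1/(k-1) (maximal violation) and 1 (perfect sphericity);
# the ANOVA degrees of freedom are multiplied by epsilon.
S = np.cov(data, rowvar=False)
S = S - S.mean(axis=0, keepdims=True) - S.mean(axis=1, keepdims=True) + S.mean()
epsilon = np.trace(S) ** 2 / ((k - 1) * np.sum(S ** 2))
```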
6. Linearity#
For correlation and regression, the relationship between variables must be linear.
How to check: Scatter plot, residual plot
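A residual plot makes a linearity violation visible as a systematic pattern. In this hypothetical numpy sketch the true relationship is quadratic, so the residuals of a straight-line fit show a clear U-shape (quantified here by correlating them with the curved part of x²):

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(0, 10, 100)
y = 2 + 0.5 * x ** 2 + rng.normal(0, 1, size=100)  # deliberately nonlinear

# Fit a straight line and inspect the residuals: a systematic pattern
# (here U-shaped) signals a violation of linearity.
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (intercept + slope * x)

# Crude numeric check: correlate the residuals with the part of x^2
# that a straight line cannot capture.
x2_curved = x ** 2 - np.polyval(np.polyfit(x, x ** 2, 1), x)
curvature = np.corrcoef(x2_curved, residuals)[0, 1]
```

With a truly linear relationship the residuals would scatter randomly around zero and `curvature` would be close to 0.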
7. No (Multi-)Collinearity#
In multiple regression, predictors should not be too strongly correlated with each other.
How to check: Variance Inflation Factor (VIF). A VIF > 10 indicates problematic collinearity.
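The VIF for predictor j is 1 / (1 − R²), where R² comes from regressing predictor j on all other predictors. A self-contained numpy sketch (the data and the near-duplicate predictor are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)  # nearly a copy of x1 -> collinear
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

def vif(X, j):
    """VIF_j = 1 / (1 - R^2) from regressing predictor j on the others."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(y)), others])  # add intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1 - resid.var() / y.var()
    return 1.0 / (1.0 - r2)

vifs = [vif(X, j) for j in range(X.shape[1])]
# x1 and x2 exceed the VIF > 10 threshold; the independent x3 does not.
```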
Which Test Requires Which Assumptions?#
| Assumption | t-test | ANOVA | Mann-Whitney | Kruskal-Wallis | Chi² |
|---|---|---|---|---|---|
| Metric DV | Yes | Yes | No | No | No |
| Independence | Yes | Yes | Yes | Yes | Yes |
| Normal distribution | Yes | Yes | No | No | No |
| Homogeneity of variance | Yes | Yes | No* | No* | No |
| Expected frequencies ≥ 5 | – | – | – | – | Yes |
*Mann-Whitney and Kruskal-Wallis assume similar distribution shapes when comparing medians.
How Important Are Assumptions Really?#
Not all violations are equally serious:
Robust against violation:
- t-test and ANOVA are robust against moderate normality violations with equal group sizes
- With n > 30 per group, mild deviations from normality are usually unproblematic
Sensitive to violation:
- Violation of independence is almost always problematic
- Unequal group sizes + variance heterogeneity is a critical combination
- Sphericity violation in repeated-measures ANOVA can substantially inflate error rates
Decision Path When Assumptions Are Violated#
- Check the assumption (graphically and/or statistically)
- Assess the severity of the violation (mild → often tolerable, severe → problematic)
- Choose an alternative:
- Non-parametric test
- Robust test (e.g., Welch's t-test)
- Data transformation
- Bootstrapping
Practical decision tree#
You want to compare two groups (metric DV):
1. Check normality → Shapiro-Wilk
   - Yes → Continue to step 2
   - No → Mann-Whitney U test
2. Check homogeneity of variance → Levene's test
   - Yes → t-test
   - No → Welch's t-test
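The decision tree above can be sketched as a small helper function using `scipy.stats`. The function name and the alpha threshold for the assumption checks are our own choices, not a standard API:

```python
import numpy as np
from scipy import stats

def choose_two_group_test(a, b, alpha=0.05):
    """Sketch of the decision path: Shapiro -> Levene -> test choice."""
    # Step 1: normality in each group (Shapiro-Wilk)
    if stats.shapiro(a).pvalue < alpha or stats.shapiro(b).pvalue < alpha:
        result = stats.mannwhitneyu(a, b)
        return "Mann-Whitney U", result.pvalue
    # Step 2: homogeneity of variance (Levene)
    if stats.levene(a, b).pvalue < alpha:
        result = stats.ttest_ind(a, b, equal_var=False)
        return "Welch's t-test", result.pvalue
    result = stats.ttest_ind(a, b)
    return "Student's t-test", result.pvalue

# Hypothetical usage with simulated samples
rng = np.random.default_rng(5)
name, p = choose_two_group_test(rng.normal(0, 1, 40), rng.normal(0.5, 1, 40))
```

As the misconceptions below note, such automated checks should be combined with graphical inspection rather than applied mechanically.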
Common Misconceptions#
"If the Shapiro-Wilk test is significant, I must use a non-parametric test." Not necessarily. With large samples, the Shapiro-Wilk test is almost always significant, even for negligible deviations. Combine the test with graphical inspection.
"Non-parametric tests have no assumptions." Wrong. Non-parametric tests also have assumptions, such as independence and (for median comparisons) similar distribution shapes.
"I must perfectly meet all assumptions." Statistics in practice is rarely perfect. Many tests are robust against mild violations. What matters is knowing about violations and reporting them transparently.
Further Reading#
- Field, A. (2018). Discovering Statistics Using IBM SPSS Statistics (5th ed.). SAGE.
- Bortz, J. & Schuster, C. (2010). Statistik für Human- und Sozialwissenschaftler (7th ed.). Springer.