Mann-Whitney U Test
The Mann-Whitney U test (also: Wilcoxon rank-sum test) is a nonparametric test for comparing two independent groups. It tests whether the distributions of both groups differ systematically and is the nonparametric alternative to the independent samples t-test.
When to Use
Use the Mann-Whitney U test when:
- You want to compare two independent groups
- The dependent variable is at least ordinally scaled
- The normality assumption of the t-test is violated
- The sample is small and the distribution shape is unclear
The test is based on ranks rather than raw values and therefore does not require any specific distributional form.
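To make the rank-based idea concrete, here is a minimal sketch of joint ranking in plain Python. The helper `average_ranks` and the sample values are hypothetical and chosen only for illustration. Tied values share the mean of the ranks they occupy, which is the usual convention.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

# Hypothetical illustrative values, not taken from the text above
group_a = [3.1, 4.5, 2.8, 5.0]
group_b = [4.5, 6.2, 5.9]

combined = group_a + group_b
ranks = average_ranks(combined)
rank_sum_a = sum(ranks[:len(group_a)])
rank_sum_b = sum(ranks[len(group_a):])
print(rank_sum_a, rank_sum_b)  # 11.5 16.5

# A strictly increasing transformation (here: log) leaves every rank unchanged
log_ranks = average_ranks([math.log(v) for v in combined])
print(log_ranks == ranks)  # True
```

Because only the ordering of the values enters the ranks, rescaling or transforming the data monotonically does not change the result. In practice, `scipy.stats.rankdata` with its default averaging method produces the same ranks.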
Assumptions
- Independence of observations (both between and within groups)
- At least ordinal scale of the dependent variable
- Similar distribution shape in both groups (for interpretation as a median comparison)
- The variable is continuous enough that ties are rare
Note: Strictly speaking, the Mann-Whitney U test evaluates whether a randomly chosen value from Group 1 is equally likely to be greater or smaller than a randomly chosen value from Group 2. It can only be interpreted as a median comparison when the distribution shapes are similar.
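The pairwise interpretation in the note can be checked directly by comparing every value of one group with every value of the other. The sketch below uses a hypothetical helper `pairwise_comparison` and made-up data in which a single extreme value pulls the mean of Group 1 upward even though most of its values are smaller.

```python
from itertools import product

def pairwise_comparison(group_1, group_2):
    """Share of all (x, y) pairs with x > y, x < y, and x == y."""
    pairs = list(product(group_1, group_2))
    n_pairs = len(pairs)
    greater = sum(x > y for x, y in pairs) / n_pairs
    smaller = sum(x < y for x, y in pairs) / n_pairs
    tied = sum(x == y for x, y in pairs) / n_pairs
    return greater, smaller, tied

# Hypothetical data: one outlier in group_1
group_1 = [1, 2, 3, 4, 100]
group_2 = [5, 6, 7, 8, 9]

greater, smaller, tied = pairwise_comparison(group_1, group_2)
print(greater, smaller, tied)  # 0.2 0.8 0.0

mean_1 = sum(group_1) / len(group_1)  # 22.0
mean_2 = sum(group_2) / len(group_2)  # 7.0
print(mean_1 > mean_2)  # True, although group_1 values are usually smaller
```

A randomly chosen value from Group 1 is smaller than one from Group 2 in 80% of the pairs here, which is the kind of systematic difference the test is sensitive to, regardless of what the means suggest.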
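Formula
The U statistic is calculated for both groups. First, all values are combined and ranked. Then:

$$U_1 = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1, \qquad U_2 = n_1 n_2 + \frac{n_2 (n_2 + 1)}{2} - R_2$$

where:
- $n_1$ and $n_2$ are the sample sizes of the two groups
- $R_1$ and $R_2$ are the rank sums of the respective groups
It always holds that:

$$U_1 + U_2 = n_1 n_2$$

The test statistic is $U = \min(U_1, U_2)$.
For large samples, a z-approximation can be used:

$$z = \frac{U - \frac{n_1 n_2}{2}}{\sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}}$$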
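The calculation can be sketched in a few lines of Python. This assumes the common textbook form in which each U value equals n1·n2 plus n(n+1)/2 for that group minus its rank sum, and a z-approximation with mean n1·n2/2 and variance n1·n2·(n1+n2+1)/12 without tie correction. The function name `mann_whitney_u` and the ranks are hypothetical.

```python
import math

def mann_whitney_u(rank_sum_1, rank_sum_2, n1, n2):
    """Compute U1, U2, U = min(U1, U2) and the normal approximation z."""
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - rank_sum_1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - rank_sum_2
    u = min(u1, u2)
    mean_u = n1 * n2 / 2
    # Standard deviation of U under the null hypothesis, assuming no ties
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    return u1, u2, u, z

# Hypothetical joint ranks for two tiny groups. The z value is shown only
# to keep the arithmetic checkable; the approximation is meant for large samples.
ranks_1 = [1, 2, 4, 5]
ranks_2 = [3, 6, 7]

u1, u2, u, z = mann_whitney_u(sum(ranks_1), sum(ranks_2), len(ranks_1), len(ranks_2))
print(u1, u2, u)                          # 10.0 2.0 2.0
print(u1 + u2 == len(ranks_1) * len(ranks_2))  # True
print(round(z, 3))                        # -1.414
```

The check that the two U values add up to n1·n2 is a cheap way to catch ranking mistakes in a hand calculation.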
Example
Practical Example: Patient Satisfaction
A hospital wants to compare patient satisfaction between two wards. Satisfaction is measured on a Likert scale (1 to 5), so the data are ordinal and not normally distributed.
- Ward A (n=25): Patient satisfaction scores
- Ward B (n=30): Patient satisfaction scores
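Since the data are ordinally scaled and the normality assumption is not met, the Mann-Whitney U test is used instead of the t-test. All 55 values are combined into a joint ranking, and the rank sums of the two groups are compared. Because a 5-point scale produces many tied values, tied observations receive the average of the ranks they occupy.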
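A run of this kind of comparison might look as follows with SciPy's `scipy.stats.mannwhitneyu`. The scores are randomly generated from made-up response probabilities purely for illustration; they are not the hospital's data, and the variable names are assumptions.

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(seed=0)
scores = np.arange(1, 6)  # Likert categories 1..5

# Synthetic responses: the probabilities below are invented for this sketch
ward_a = rng.choice(scores, size=25, p=[0.05, 0.10, 0.25, 0.35, 0.25])
ward_b = rng.choice(scores, size=30, p=[0.15, 0.25, 0.30, 0.20, 0.10])

result = mannwhitneyu(ward_a, ward_b, alternative="two-sided")

n1, n2 = len(ward_a), len(ward_b)
# Smaller of the two U values, matching the min(U1, U2) convention
u_min = min(result.statistic, n1 * n2 - result.statistic)

print(f"U (first sample) = {result.statistic}")
print(f"min(U1, U2)      = {u_min}")
print(f"p-value          = {result.pvalue:.4f}")
```

In recent SciPy releases, `statistic` is the U value for the first sample rather than the smaller of the two, so the minimum is derived from it here using the fact that both U values sum to n1·n2.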
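Effect Size
The rank-biserial correlation coefficient serves as a measure of effect size:

$$r = 1 - \frac{2U}{n_1 n_2}$$

Alternatively, $r$ can be calculated from the z-statistic:

$$r = \frac{|z|}{\sqrt{N}}$$

where $N = n_1 + n_2$ is the total sample size.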
| Effect Size | r |
|---|---|
| Small | 0.1 |
| Medium | 0.3 |
| Large | 0.5 |
Tip: When results are significant, always report the effect size alongside the p-value. When the normality assumption is violated, for example by heavy-tailed data or outliers, the Mann-Whitney U test often has higher statistical power than the t-test.
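Both effect size variants are easy to compute once U or z is known. The sketch below assumes the usual definitions, r = 1 − 2U/(n1·n2) for the rank-biserial correlation and r = |z|/√(n1+n2) for the z-based version, and classifies the result with the thresholds from the table. The function names, the input values, and the label for values under 0.1 are hypothetical.

```python
import math

def rank_biserial(u, n1, n2):
    """Rank-biserial correlation from U = min(U1, U2)."""
    return 1 - 2 * u / (n1 * n2)

def r_from_z(z, n1, n2):
    """Effect size r from the z-statistic and the total sample size."""
    return abs(z) / math.sqrt(n1 + n2)

def effect_label(r):
    """Classify |r| using the thresholds 0.1, 0.3 and 0.5."""
    r = abs(r)
    if r >= 0.5:
        return "large"
    if r >= 0.3:
        return "medium"
    if r >= 0.1:
        return "small"
    return "below small"  # label chosen for this sketch only

# Hypothetical inputs for two groups of 25 and 30 observations
n1, n2 = 25, 30
r_rb = rank_biserial(250, n1, n2)
r_z = r_from_z(-2.113, n1, n2)

print(round(r_rb, 3), effect_label(r_rb))  # 0.333 medium
print(round(r_z, 3), effect_label(r_z))    # 0.285 small
```

The two measures are on different scales and need not fall into the same category, so it helps to state which one is being reported.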
Further Reading
- Mann, H. B. & Whitney, D. R. (1947). On a test of whether one of two random variables is stochastically larger than the other. Annals of Mathematical Statistics, 18(1), 50–60.
- Field, A. (2018). Discovering Statistics Using IBM SPSS Statistics (5th ed.). SAGE.