Mann-Whitney U Test

Nonparametric comparison of two independent groups

The Mann-Whitney U test (also known as the Wilcoxon rank-sum test) is a nonparametric test for comparing two independent groups. It tests whether values in one group tend to be systematically larger or smaller than values in the other, and it is the nonparametric alternative to the independent samples t-test.

When to Use

Use the Mann-Whitney U test when:

  • You want to compare two independent groups
  • The dependent variable is at least ordinally scaled
  • The normality assumption of the t-test is violated
  • The sample is small and the distribution shape is unclear

The test is based on ranks rather than raw values and therefore does not require any specific distributional form.

Assumptions

  • Independence of observations (both between and within groups)
  • At least ordinal scale of the dependent variable
  • Similar distribution shape in both groups (for interpretation as a median comparison)
  • The variable is continuous enough that ties are rare

Note: Strictly speaking, the Mann-Whitney U test evaluates whether a randomly chosen value from Group 1 is equally likely to be greater or smaller than a randomly chosen value from Group 2. It can only be interpreted as a median comparison when the distribution shapes are similar.
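The interpretation in the note can be made concrete by comparing every value in Group 1 with every value in Group 2. The following is a minimal sketch, not a reference implementation; the function name `prob_superiority` and the sample values are hypothetical and chosen only for illustration. Ties are counted as half a win, which is a common convention and an assumption here.

```python
def prob_superiority(group1, group2):
    """Estimate P(X > Y) + 0.5 * P(X = Y) by comparing every pair (x, y)."""
    wins = 0.0
    for x in group1:
        for y in group2:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5  # assumption: a tie counts as half a win
    return wins / (len(group1) * len(group2))


# Hypothetical illustration data, not taken from the text
group1 = [3, 5, 8]
group2 = [1, 4, 6, 7]
p = prob_superiority(group1, group2)
print(p)
```

A value of 0.5 means that neither group tends to produce larger values; values far from 0.5 indicate that one group tends to exceed the other.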

Formula

The U statistic is calculated for both groups. First, all values are combined and ranked. Then:

$$U_1 = n_1 n_2 + \frac{n_1(n_1 + 1)}{2} - R_1$$

$$U_2 = n_1 n_2 + \frac{n_2(n_2 + 1)}{2} - R_2$$

where:

  • $n_1$ and $n_2$ are the sample sizes of the two groups
  • $R_1$ and $R_2$ are the rank sums of the respective groups

It always holds that: $U_1 + U_2 = n_1 \cdot n_2$

The test statistic is $U = \min(U_1, U_2)$.
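The ranking step and the two U formulas can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions: the helper names `midranks` and `u_statistics` are hypothetical, tied values are given the average of the ranks they span, and the sample values are made up for the demonstration.

```python
def midranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values equal to values[order[i]]
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def u_statistics(group1, group2):
    """Compute U1, U2 and U = min(U1, U2) from the joint ranking."""
    n1, n2 = len(group1), len(group2)
    ranks = midranks(list(group1) + list(group2))
    r1 = sum(ranks[:n1])  # rank sum of group 1
    r2 = sum(ranks[n1:])  # rank sum of group 2
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return u1, u2, min(u1, u2)


# Hypothetical illustration data, not taken from the text
group1 = [3, 5, 8]
group2 = [1, 4, 6, 7]
u1, u2, u = u_statistics(group1, group2)
print(u1, u2, u)
```

For these values the rank sums are 13 and 15, so the sketch yields U1 = 5 and U2 = 7, which add up to n1 · n2 = 12 as the identity requires.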

For large samples ($n_1, n_2 > 20$), a z-approximation can be used:

$$z = \frac{U - \frac{n_1 n_2}{2}}{\sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}}$$
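The z-approximation translates directly into a few lines of Python. The sketch below follows the formula as given, without a continuity correction or tie correction; the function name `z_approximation` is hypothetical, the two-sided p-value from the standard normal distribution is an added convenience, and the value U = 250 is invented purely to show the call.

```python
import math


def z_approximation(u, n1, n2):
    """Normal approximation for U (no continuity or tie correction)."""
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    # Two-sided p-value: 2 * (1 - Phi(|z|)) expressed via erfc
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


# Group sizes as in a 25 vs. 30 comparison; U = 250 is a hypothetical value
z, p = z_approximation(250, 25, 30)
print(z, p)
```

Because U is the smaller of U1 and U2, z computed this way is never positive; the two-sided p-value depends only on its absolute value.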

Example

Practical Example: Patient Satisfaction

A hospital wants to compare patient satisfaction between two wards. Satisfaction is measured on a Likert scale (1 to 5), so the data are ordinal and cannot be assumed to be normally distributed.

  • Ward A (n=25): Patient satisfaction scores
  • Ward B (n=30): Patient satisfaction scores

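Since the data are ordinally scaled and the normality assumption is not met, the Mann-Whitney U test is used instead of the t-test. All 55 values are combined into a joint ranking, and the rank sums of the two groups are compared. Because a five-point scale inevitably produces many tied values, tied observations receive the average of the ranks they span.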
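With only five possible scores, ties affect the spread of U, and the large-sample approximation is commonly used with a tie-corrected standard deviation. The sketch below shows that standard correction; it is an addition for illustration, the function name `tie_corrected_sd` is hypothetical, and the ward scores are made-up values, not data from the example.

```python
import math
from collections import Counter


def tie_corrected_sd(group1, group2):
    """Standard deviation of U under the null hypothesis, corrected for ties."""
    n1, n2 = len(group1), len(group2)
    n = n1 + n2
    # Size of each group of tied values in the combined sample
    tie_counts = Counter(list(group1) + list(group2)).values()
    tie_term = sum(t**3 - t for t in tie_counts) / (n * (n - 1))
    return math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term))


# Hypothetical Likert scores (1 to 5), invented for illustration only
ward_a = [4, 5, 3, 4, 5]
ward_b = [2, 3, 3, 1, 4]
sd_corrected = tie_corrected_sd(ward_a, ward_b)
sd_uncorrected = math.sqrt(5 * 5 * (5 + 5 + 1) / 12)
print(sd_corrected, sd_uncorrected)
```

When there are no ties the correction term is zero and the result equals the uncorrected denominator of the z-formula; with ties the corrected value is smaller.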

Effect Size

The rank-biserial correlation coefficient $r_{rb}$ serves as a measure of effect size:

$$r_{rb} = 1 - \frac{2U}{n_1 n_2}$$

Alternatively, $r$ can be calculated from the z-statistic:

$$r = \frac{z}{\sqrt{N}}$$

where $N = n_1 + n_2$ is the total sample size.

Common benchmarks for interpreting $r$:

Effect Size   r
Small         0.1
Medium        0.3
Large         0.5
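Both effect size formulas and the benchmark table can be sketched in Python as follows. The function names are hypothetical, the input values are invented for illustration, and treating the benchmarks as lower bounds on the absolute value of r (with anything below 0.1 labeled "negligible") is an assumption rather than something stated in the table.

```python
import math


def rank_biserial(u, n1, n2):
    """Rank-biserial correlation: r_rb = 1 - 2U / (n1 * n2)."""
    return 1 - 2 * u / (n1 * n2)


def r_from_z(z, n_total):
    """Effect size r = z / sqrt(N), with N the total sample size."""
    return z / math.sqrt(n_total)


def label_effect(r):
    """Map |r| to a benchmark label (thresholds used as lower bounds)."""
    size = abs(r)
    if size >= 0.5:
        return "large"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "small"
    return "negligible"  # assumption: label for values below the table


# Hypothetical inputs: U = 5 with n1 = 3, n2 = 4; z = -2.0 with N = 100
r_rb = rank_biserial(5, 3, 4)
r = r_from_z(-2.0, 100)
print(r_rb, label_effect(r_rb))
print(r, label_effect(r))
```

The sign of r follows the sign of z, so the label is based on the absolute value.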

Tip: When results are significant, the effect size should always be reported alongside the p-value. The Mann-Whitney U test often has higher statistical power than the t-test when the normality assumption is violated, for example with heavy-tailed or strongly skewed data.

Further Reading

  • Mann, H. B. & Whitney, D. R. (1947). On a test of whether one of two random variables is stochastically larger than the other. Annals of Mathematical Statistics, 18(1), 50–60.
  • Field, A. (2018). Discovering Statistics Using IBM SPSS Statistics (5th ed.). SAGE.