Normal Distribution#
The normal distribution (also called Gaussian distribution or bell curve) is the most important distribution in statistics. Many statistical tests assume that data are normally distributed, and numerous natural phenomena approximately follow this distribution.
What Is the Normal Distribution?#
A normally distributed variable is symmetric around its mean. The distribution is completely described by two parameters:

X ~ N(μ, σ²)

where:
- μ is the mean (expected value)
- σ² is the variance
- σ is the standard deviation

The probability density function is:

f(x) = (1 / (σ√(2π))) · exp(−(x − μ)² / (2σ²))
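As a quick check on the density formula, here is a minimal stdlib-only sketch (the helper name `normal_pdf` is ours, not from any library):

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x: exp(-(x-mu)^2 / (2*sigma^2)) / (sigma*sqrt(2*pi))."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# The density at the mean of a standard normal is 1/sqrt(2*pi)
print(round(normal_pdf(0.0, 0.0, 1.0), 4))  # → 0.3989
```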
The 68-95-99.7 Rule#
In a normal distribution:
- 68.3% of values fall within μ ± 1σ
- 95.4% of values fall within μ ± 2σ
- 99.7% of values fall within μ ± 3σ
Example: IQ Distribution
IQ is normally distributed with μ = 100 and σ = 15.
- 68% of people have an IQ between 85 and 115
- 95% have an IQ between 70 and 130
- 99.7% have an IQ between 55 and 145
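The three percentages can be recovered from the error function. A stdlib-only sketch (the helper `within_k_sigma` is ours):

```python
import math

def within_k_sigma(k):
    """P(|X - μ| < k·σ) for a normal X, via the error function."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within ±{k}σ: {within_k_sigma(k):.1%}")
```

This reproduces the 68.3 / 95.4 / 99.7 figures above.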
The Standard Normal Distribution#
Any normal distribution can be converted to the standard normal distribution via the z-transformation:

z = (x − μ) / σ

The standard normal distribution has μ = 0 and σ = 1.
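The z-transformation is a one-liner; a small sketch using the IQ figures from the example above:

```python
def z_score(x, mu, sigma):
    """Convert a raw value to the standard normal scale."""
    return (x - mu) / sigma

# An IQ of 130 (with μ = 100, σ = 15) lies two standard deviations above the mean
print(z_score(130, 100, 15))  # → 2.0
```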
Why Is It So Important?#
Central Limit Theorem#
The central limit theorem states: Regardless of the distribution of the population, the distribution of the sample mean is approximately normal for a sufficiently large sample size.
This means: even if individual measurements are not normally distributed, the mean of many such measurements is approximately normally distributed. A common rule of thumb is n ≥ 30.
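A small stdlib simulation sketch illustrates this: sample means of size n = 30 drawn from a clearly non-normal population cluster tightly around the population mean (the exponential population here is our arbitrary choice of a skewed distribution):

```python
import random
import statistics

random.seed(42)

# Population: exponential with mean 1.0 — strongly right-skewed, not normal
draw = lambda: random.expovariate(1.0)

n, reps = 30, 2000
sample_means = [statistics.mean(draw() for _ in range(n)) for _ in range(reps)]

# The means cluster around the population mean (1.0) with a standard
# deviation close to σ/√n = 1/√30 ≈ 0.18, as the theorem predicts
print(round(statistics.mean(sample_means), 2))
print(round(statistics.stdev(sample_means), 2))
```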
Assumption of Many Tests#
Parametric tests such as t-tests, ANOVA, and Pearson correlation assume normality. When this assumption is violated, non-parametric alternatives are available.
Testing for Normality#
Graphical Methods#
- Histogram — Shows the shape of the distribution
- Q-Q plot (Quantile-Quantile plot) — Points should fall along a straight line
- Box plot — Identifies asymmetries and outliers
Statistical Tests#
| Test | Suitable for | Notes |
|---|---|---|
| Shapiro-Wilk | n < 50 | Most recommended for small samples |
| Kolmogorov-Smirnov | n ≥ 50 | Less sensitive than Shapiro-Wilk |
| Anderson-Darling | All n | Emphasizes tails of the distribution |
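All three tests from the table are available in `scipy.stats`; a sketch assuming SciPy is installed (the simulated sample is ours):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(loc=100, scale=15, size=40)  # small sample → Shapiro-Wilk is the usual choice

# Shapiro-Wilk
w, p_shapiro = stats.shapiro(x)

# Kolmogorov-Smirnov against a normal with parameters estimated from the data
# (note: estimating the parameters from the same data makes the p-value optimistic)
p_ks = stats.kstest(x, "norm", args=(x.mean(), x.std(ddof=1))).pvalue

# Anderson-Darling: returns a statistic to compare against critical values
ad = stats.anderson(x, dist="norm")

print(f"Shapiro-Wilk:       p = {p_shapiro:.3f}")
print(f"Kolmogorov-Smirnov: p = {p_ks:.3f}")
print(f"Anderson-Darling:   A² = {ad.statistic:.3f}")
```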
Descriptive Measures#
- Skewness: Should be close to 0. Values between -1 and +1 are considered acceptable.
- Kurtosis: Should be close to 3 (or excess kurtosis close to 0). Values between -2 and +2 are considered acceptable.
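Both measures can be computed with `scipy.stats`; a sketch checking the rules of thumb on simulated normal data (the sample itself is our assumption):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(size=500)

skewness = stats.skew(x)
excess_kurtosis = stats.kurtosis(x)         # Fisher definition: 0 for a normal
kurtosis = stats.kurtosis(x, fisher=False)  # Pearson definition: 3 for a normal

print(f"skewness        = {skewness:+.2f}  (rule of thumb: within ±1)")
print(f"excess kurtosis = {excess_kurtosis:+.2f}  (rule of thumb: within ±2)")
print(f"kurtosis        = {kurtosis:.2f}  (≈ 3 for normal data)")
```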
What to Do When Normality Is Violated#
- Use non-parametric tests — e.g., Mann-Whitney instead of t-test
- Transform the data — Log transformation, square root transformation, or Box-Cox transformation
- Increase sample size — The central limit theorem provides robustness with large samples
- Bootstrapping — Distribution-free method for estimating confidence intervals
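The bootstrap option can be sketched with the stdlib alone; a minimal percentile bootstrap for the mean of a skewed sample (the function `bootstrap_ci_mean` and the simulated data are ours):

```python
import random
import statistics

random.seed(7)
data = [random.expovariate(0.5) for _ in range(50)]  # skewed sample, population mean 2.0

def bootstrap_ci_mean(sample, reps=5000, alpha=0.05):
    """Percentile bootstrap confidence interval for the mean."""
    means = sorted(
        statistics.mean(random.choices(sample, k=len(sample)))
        for _ in range(reps)
    )
    return means[int(reps * alpha / 2)], means[int(reps * (1 - alpha / 2))]

low, high = bootstrap_ci_mean(data)
print(f"95% bootstrap CI for the mean: [{low:.2f}, {high:.2f}]")
```

No normality assumption enters anywhere: the interval comes entirely from resampling the observed data.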
Common Misconceptions#
"The raw data themselves must be normally distributed." For many tests, it is sufficient if the residuals are normally distributed (regression) or the differences (paired t-test). It is not always about the raw data.
"The Shapiro-Wilk test is not significant, so the data are normally distributed." A non-significant result only means that we cannot reject the null hypothesis (normality). With small samples, the test has low power and can easily miss deviations.
"Normality must be perfect." Parametric tests are often robust to mild violations of the normality assumption, especially with larger samples (n > 30).
Further Reading#
- Field, A. (2018). Discovering Statistics Using IBM SPSS Statistics (5th ed.). SAGE.
- Bortz, J. & Schuster, C. (2010). Statistik für Human- und Sozialwissenschaftler (7th ed.). Springer.