Central Limit Theorem

The most important theorem in statistics: why sums of random variables tend to look like a bell curve.

1 The Theorem

Let \( X_1, X_2, \dots, X_n \) be i.i.d. random variables with mean \( \mu \) and finite variance \( \sigma^2 \). Consider the standardized sum (or standardized sample mean):

$$ Z_n = \frac{\sum_{i=1}^n X_i - n\mu}{\sigma\sqrt{n}} = \frac{\overline{X}_n - \mu}{\sigma/\sqrt{n}} $$

The Central Limit Theorem (CLT) states that as \( n \to \infty \), \( Z_n \) converges in distribution to the Standard Normal Distribution \( \mathcal{N}(0, 1) \).

$$ \lim_{n \to \infty} P(Z_n \le z) = \Phi(z) $$

Key Insight: It does not matter what the original distribution of \( X_i \) is (as long as variance is finite). The sum will eventually look Normal.
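To see this convergence numerically, here is a minimal sketch (the function name `standardized_mean` and all parameter values are illustrative) that draws many copies of \( Z_n \) starting from a deliberately skewed distribution, Exponential(1), which has \( \mu = 1 \) and \( \sigma = 1 \). By the CLT, the simulated \( Z_n \) values should have mean near 0 and standard deviation near 1 even though the underlying draws are far from Normal.

```python
import random
import statistics

def standardized_mean(n, trials=20000, seed=0):
    """Sample Z_n = (X̄_n - μ) / (σ / √n) repeatedly,
    with X_i ~ Exponential(1), so μ = 1 and σ = 1."""
    rng = random.Random(seed)
    mu, sigma = 1.0, 1.0
    zs = []
    for _ in range(trials):
        xbar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        zs.append((xbar - mu) / (sigma / n ** 0.5))
    return zs

zs = standardized_mean(n=50)
print(round(statistics.mean(zs), 3), round(statistics.stdev(zs), 3))
```

Even at \( n = 50 \), the sample mean and standard deviation of the simulated \( Z_n \) sit close to 0 and 1; increasing \( n \) also smooths out the residual skew inherited from the exponential.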

2 Interactive Simulation

Sum of Uniform Random Variables

Let \( X_i \sim \text{Uniform}(0, 1) \). We calculate \( S_n = \sum_{i=1}^n X_i \).
Adjust \( n \) (sample size) and observe the shape of the histogram of \( S_n \).
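A non-interactive version of the same experiment can be sketched in a few lines (the helper name `simulate_sums` and the trial counts are illustrative choices). For \( X_i \sim \text{Uniform}(0,1) \), theory gives \( E[S_n] = n/2 \) and \( \operatorname{Var}(S_n) = n/12 \), and the histogram of \( S_n \) morphs from flat (\( n = 1 \)) through triangular (\( n = 2 \)) toward a Normal shape:

```python
import random
import statistics

def simulate_sums(n, trials=10000, seed=1):
    """Draw `trials` realizations of S_n = X_1 + ... + X_n,
    with X_i ~ Uniform(0, 1)."""
    rng = random.Random(seed)
    return [sum(rng.random() for _ in range(n)) for _ in range(trials)]

for n in (1, 2, 10, 30):
    s = simulate_sums(n)
    # Compare against the theoretical values n/2 and n/12.
    print(n, round(statistics.mean(s), 2), round(statistics.variance(s), 3))
```

Plotting each list of sums as a histogram (e.g. with matplotlib) reproduces the shapes the interactive simulation shows as \( n \) is adjusted.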

3 Applications

Approximating Probabilities

If \( Y = \sum_{i=1}^n X_i \) with \( n \) large, we can approximate \( P(a \le Y \le b) \) using the Normal distribution, since

$$ Z = \frac{Y - n\mu}{\sigma\sqrt{n}} $$

is approximately \( \mathcal{N}(0,1) \), so \( P(a \le Y \le b) \approx \Phi\!\left(\frac{b - n\mu}{\sigma\sqrt{n}}\right) - \Phi\!\left(\frac{a - n\mu}{\sigma\sqrt{n}}\right) \).
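As a worked example (the numbers here are illustrative, not from the text): let \( Y \) be the total of \( n = 100 \) fair dice rolls, so each roll has \( \mu = 3.5 \) and \( \sigma^2 = 35/12 \). The sketch below computes the Normal approximation to \( P(330 \le Y \le 370) \), using \( \Phi(z) = \tfrac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right) \), and cross-checks it with a Monte Carlo estimate:

```python
import math
import random

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))

# Y = sum of n fair dice rolls: per-roll mean 3.5, variance 35/12.
n, mu, var = 100, 3.5, 35.0 / 12.0
sigma_n = math.sqrt(n * var)   # standard deviation of Y
a, b = 330, 370

# CLT approximation: P(a <= Y <= b) ≈ Φ((b - nμ)/σ√n) - Φ((a - nμ)/σ√n)
approx = phi((b - n * mu) / sigma_n) - phi((a - n * mu) / sigma_n)

# Monte Carlo cross-check of the same probability.
rng = random.Random(2)
trials = 20000
hits = sum(a <= sum(rng.randint(1, 6) for _ in range(n)) <= b
           for _ in range(trials))
print(round(approx, 3), round(hits / trials, 3))
```

The two estimates land close together; the small remaining gap is mostly because \( Y \) is discrete, and a continuity correction (using \( a - 0.5 \) and \( b + 0.5 \)) would shrink it further.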

Why Normal is Everywhere

Measurement errors, heights, test scores, and many natural phenomena are sums of many small independent factors, leading to a Normal distribution via CLT.