Central Limit Theorem

A Most Magical Theorem ✨. It says two things:

  1. The mean of IID random variables $X_i$ sampled from any distribution is approximately Normally distributed:

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \sim N\left(\mu, \frac{\sigma^2}{n}\right) \text{ as } n \to \infty$$

  2. The sum of IID random variables $X_i$ from any distribution is also approximately Normally distributed:

$$n\bar{X} = \sum_{i=1}^{n} X_i \sim N(n\mu, n\sigma^2) \text{ as } n \to \infty$$
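A quick simulation makes #1 concrete. Here's a minimal sketch (the distribution choice and sample sizes are illustrative, not from the original): draw many sample means from an Exponential(1) distribution, which is heavily skewed and very much not Normal, and check that the means behave like $N(\mu, \sigma^2/n)$.

```python
import numpy as np

rng = np.random.default_rng(42)

# Draw 10,000 sample means, each the average of n = 500 IID
# Exponential(1) draws. Exponential(1) has mu = 1 and sigma^2 = 1.
n, trials = 500, 10_000
samples = rng.exponential(scale=1.0, size=(trials, n))
means = samples.mean(axis=1)

# CLT predicts the means are ~ N(mu, sigma^2 / n) = N(1, 1/500).
print(means.mean())  # should land close to mu = 1
print(means.var())   # should land close to sigma^2 / n = 0.002
```

Despite the skewed source distribution, a histogram of `means` comes out bell-shaped, centered at $\mu$ with variance $\sigma^2/n$.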

How nice. The underlying distribution doesn't matter!

People like expressing #1 as the Standard Normal $Z$, a beautiful little bell-curve with its mean at $0$ and Standard Deviation $1$. Here's the general form:

$$Z = \frac{X - \mu}{\sigma}$$

and a form relevant to what we're talking about. The denominator is the standard error of the mean, $\sigma/\sqrt{n}$; dividing by it rescales $\bar{X}$'s deviation from $\mu$ so that $Z$ has mean $0$ and Standard Deviation $1$.

$$Z = \frac{\bar{X} - \mu}{\frac{\sigma}{\sqrt{n}}}$$
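We can check this standardization numerically, too. A small sketch (the Uniform(0, 1) choice and sizes are mine for illustration): standardize sample means of Uniform draws, where $\mu = 1/2$ and $\sigma^2 = 1/12$, and verify the result looks like a Standard Normal.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample means of Uniform(0, 1): mu = 0.5, sigma^2 = 1/12.
n, trials = 200, 10_000
mu, sigma = 0.5, np.sqrt(1 / 12)
means = rng.uniform(0, 1, size=(trials, n)).mean(axis=1)

# Standardize with the standard error sigma / sqrt(n).
z = (means - mu) / (sigma / np.sqrt(n))

# If the CLT holds, z is ~ N(0, 1): mean near 0, std near 1.
print(z.mean(), z.std())
```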

Why the Normal?

Because of the Law of Large Numbers¹ plus the fact that the Normal is the only stable distribution under addition among distributions with finite variance. When you add up a bunch of independent things, the heres and theres, ups and downs, lefts and rights tend to cancel out. The only thing that survives is the mean and the variance, and those two numbers are exactly what a Normal is parameterized by. (TODO This is akin to asking "Why is TV noise Gaussian?")

Footnotes

  1. When $n$ is sufficiently large and the random variables $X_i$ are IID, the sample mean $\bar{X}$ approaches ("converges to") the true, population mean $\mu$.
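This convergence is visible in a one-liner of simulation. A sketch under assumed parameters (Poisson(3) is my choice, not from the original): watch the running mean of IID draws settle onto $\mu$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Law of Large Numbers: the running mean of IID Poisson(3) draws
# converges to mu = 3 as n grows.
draws = rng.poisson(lam=3, size=100_000)
running_mean = draws.cumsum() / np.arange(1, draws.size + 1)

print(running_mean[99])   # still noisy at n = 100
print(running_mean[-1])   # close to mu = 3 at n = 100,000
```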