Central Limit Theorem

One property of the normal distribution is that the sum of n independent random variables ${\displaystyle X_{1},X_{2},\dots ,X_{n}}$ with a Normal distribution is itself normally distributed. This property holds exactly for any value of n. If the random variables ${\displaystyle X_{1},X_{2},\dots ,X_{n}}$ are not normally distributed, the property no longer holds exactly, but it remains approximately correct for large n.

Let ${\displaystyle X_{1},X_{2},\dots ,X_{n}}$ be independent and identically distributed random variables with E(${\displaystyle X_{i}}$) = ${\displaystyle \mu }$ and Var(${\displaystyle X_{i}}$) = ${\displaystyle \sigma ^{2}>0}$ for i = 1, ${\displaystyle \dots }$, n. Then for large n the sum of these random variables is approximately normally distributed. Since E(${\displaystyle X_{1}+X_{2}+\dots +X_{n}}$) = n${\displaystyle \mu }$ and Var(${\displaystyle X_{1}+X_{2}+\dots +X_{n}}$) = n${\displaystyle \sigma ^{2}}$, we have ${\displaystyle X_{1}+X_{2}+\dots +X_{n}\approx }$ N(n${\displaystyle \mu }$, n${\displaystyle \sigma ^{2}}$), where ${\displaystyle \approx }$ means "approximately, for large n".
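The claim about the sum can be checked with a short simulation. The sketch below (illustrative only, not part of the original text) draws repeated sums of n i.i.d. Exponential(1) variables, for which ${\displaystyle \mu =1}$ and ${\displaystyle \sigma ^{2}=1}$, and confirms that the simulated mean and variance of the sum are close to n${\displaystyle \mu }$ and n${\displaystyle \sigma ^{2}}$:

```python
import random
import statistics

random.seed(42)

n = 50          # number of summands
reps = 20000    # number of simulated sums

# Each sum S = X_1 + ... + X_n of i.i.d. Exponential(1) variables
# (mu = 1, sigma^2 = 1) should be approximately N(n*mu, n*sigma^2).
sums = [sum(random.expovariate(1.0) for _ in range(n)) for _ in range(reps)]

mean_of_sums = statistics.fmean(sums)
var_of_sums = statistics.pvariance(sums)

print(mean_of_sums)  # close to n * mu = 50
print(var_of_sums)   # close to n * sigma^2 = 50
```

The choice of the exponential distribution is arbitrary; any parent distribution with finite variance would give the same qualitative result.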
Under the same assumptions, the mean of these random variables is for large n also approximately normally distributed. Since E${\displaystyle \left({\frac {1}{n}}(X_{1}+X_{2}+\dots +X_{n})\right)}$ = E${\displaystyle \left({\overline {x}}\right)}$ = ${\displaystyle \mu }$ and Var${\displaystyle \left({\overline {x}}\right)}$ = ${\displaystyle {\frac {\sigma ^{2}}{n}}}$, we have ${\displaystyle {\overline {x}}\approx }$ N(${\displaystyle \mu }$, ${\displaystyle {\frac {\sigma ^{2}}{n}}}$), where ${\displaystyle \approx }$ again means "approximately, for large n". This result requires that none of the random variables is responsible for most of the variance. The approximating distributions depend on the number of summands n; for the sum, N(n${\displaystyle \mu }$, n${\displaystyle \sigma ^{2}}$) would have infinite mean and infinite variance as n ${\displaystyle \rightarrow \infty }$. The meaning of the theorem can be stated more clearly using standardized sums of random variables.
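The scaling Var(${\displaystyle {\overline {x}}}$) = ${\displaystyle \sigma ^{2}/n}$ can also be verified empirically. A minimal sketch (assuming a Uniform(0, 1) parent, so ${\displaystyle \mu =0.5}$ and ${\displaystyle \sigma ^{2}=1/12}$) compares the simulated variance of the sample mean against the theoretical value for several sample sizes:

```python
import random
import statistics

random.seed(0)
mu, sigma2 = 0.5, 1.0 / 12.0   # Uniform(0, 1): mu = 0.5, sigma^2 = 1/12
reps = 20000

observed = {}
for n in (5, 20, 80):
    # simulate `reps` sample means, each from n uniform draws
    means = [statistics.fmean(random.random() for _ in range(n)) for _ in range(reps)]
    observed[n] = statistics.pvariance(means)
    print(n, observed[n], sigma2 / n)   # simulated vs. theoretical sigma^2 / n
```

Quadrupling n should cut the variance of the mean by a factor of four, which the printed pairs reflect.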
Let ${\displaystyle X_{1},\dots ,X_{n}}$ be independent and identically distributed random variables with E(${\displaystyle X_{i}}$) = ${\displaystyle \mu }$ and Var(${\displaystyle X_{i}}$) = ${\displaystyle \sigma ^{2}}$ > 0. Then the distribution function ${\displaystyle F_{n}(z)=P(Z_{n}\leq z)}$ of the standardized mean ${\displaystyle Z_{n}={\frac {\sum \limits _{i=1}^{n}{\frac {X_{i}}{n}}-\mu }{\sqrt {\sigma ^{2}/n}}}={\frac {1}{\sqrt {n}}}\sum _{i=1}^{n}{\frac {X_{i}-\mu }{\sigma }}}$ converges as n ${\displaystyle \rightarrow \infty }$ to the standard normal distribution: ${\displaystyle \lim _{n\rightarrow \infty }F_{n}(z)=\Phi (z)}$. The "standardized" random variable ${\displaystyle Z_{n}}$ is therefore approximately distributed as a standard Normal distribution: ${\displaystyle Z_{n}\approx N(0,1)}$.

The following example illustrates the principle of the Central Limit Theorem. Consider continuous random variables ${\displaystyle X_{1},X_{2},\dots }$ which are independently and identically uniformly distributed on the interval ${\displaystyle [-0.5,0.5]}$: ${\displaystyle f(x)=\left\{{\begin{array}{ll}1\quad &{\text{for}}\ -0.5\leq x\leq 0.5\\0\quad &{\text{otherwise.}}\end{array}}\right.}$ The expected value and the variance are: ${\displaystyle E(X)={\frac {a+b}{2}}={\frac {-0.5+0.5}{2}}=0}$ and ${\displaystyle Var(X)={\frac {(b-a)^{2}}{12}}={\frac {[0.5-(-0.5)]^{2}}{12}}={\frac {1}{12}}\,.}$ Consider the sequence of sums of these variables, where the index of the variable ${\displaystyle Y}$ denotes the number of observations in the sample: ${\displaystyle Y_{n}=\sum _{i=1}^{n}X_{i}\qquad n=1,2,3,\dots \,.}$ For example, for ${\displaystyle n=1}$, ${\displaystyle n=2}$, and ${\displaystyle n=3}$ we get: ${\displaystyle Y_{1}=X_{1}}$, ${\displaystyle Y_{2}=X_{1}+X_{2}}$, ${\displaystyle Y_{3}=X_{1}+X_{2}+X_{3}}$.
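The convergence ${\displaystyle F_{n}(z)\rightarrow \Phi (z)}$ can be illustrated numerically for the uniform example. The sketch below (an illustration added here, not from the original text) simulates ${\displaystyle Z_{n}}$ for n = 30 draws from Uniform(${\displaystyle -0.5,0.5}$) and compares the empirical probability P(${\displaystyle Z_{n}\leq 1}$) with ${\displaystyle \Phi (1)\approx 0.8413}$:

```python
import math
import random

random.seed(1)
mu = 0.0
sigma = math.sqrt(1.0 / 12.0)   # standard deviation of Uniform(-0.5, 0.5)
n, reps = 30, 50000

def z_n():
    # standardized mean of n uniform draws
    xbar = sum(random.uniform(-0.5, 0.5) for _ in range(n)) / n
    return (xbar - mu) / (sigma / math.sqrt(n))

# empirical P(Z_n <= 1) versus the standard normal CDF Phi(1)
hits = sum(z_n() <= 1.0 for _ in range(reps))
phi_1 = 0.5 * (1.0 + math.erf(1.0 / math.sqrt(2.0)))
print(hits / reps, phi_1)   # both close to 0.8413
```

The standard normal CDF is expressed here through the error function, ${\displaystyle \Phi (z)={\tfrac {1}{2}}(1+\operatorname {erf} (z/{\sqrt {2}}))}$.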
and the densities: ${\displaystyle f(y_{1})=\left\{{\begin{array}{ll}1\quad &{\text{for}}\ -0.5\leq y_{1}\leq 0.5\\0\quad &{\text{otherwise}}\end{array}}\right.}$ ${\displaystyle f(y_{2})=\left\{{\begin{array}{ll}1+y_{2}\quad &{\text{for}}\ -1\leq y_{2}\leq 0\\1-y_{2}\quad &{\text{for}}\ 0\leq y_{2}\leq 1\\0\quad &{\text{otherwise}}\end{array}}\right.}$ ${\displaystyle f(y_{3})=\left\{{\begin{array}{ll}0.5(1.5+y_{3})^{2}\quad &{\text{for}}\ -1.5\leq y_{3}\leq -0.5\\0.5+(0.5+y_{3})(0.5-y_{3})\quad &{\text{for}}\ -0.5\leq y_{3}\leq 0.5\\0.5(1.5-y_{3})^{2}\quad &{\text{for}}\ 0.5\leq y_{3}\leq 1.5\\0\quad &{\text{otherwise}}\end{array}}\right.}$ All these densities are plotted in the following figure, which also contains a plot of a ${\displaystyle N(0,1)}$ density for comparison:
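As a sanity check on the piecewise density of ${\displaystyle Y_{3}}$, one can verify numerically that it integrates to 1 and that its second moment equals Var(${\displaystyle Y_{3}}$) = 3 ${\displaystyle \cdot }$ 1/12 = 0.25 (since E(${\displaystyle Y_{3}}$) = 0). A small sketch using the trapezoidal rule:

```python
def f_y3(y):
    # piecewise density of Y_3 = X_1 + X_2 + X_3, X_i ~ Uniform(-0.5, 0.5)
    if -1.5 <= y <= -0.5:
        return 0.5 * (1.5 + y) ** 2
    if -0.5 <= y <= 0.5:
        return 0.5 + (0.5 + y) * (0.5 - y)
    if 0.5 <= y <= 1.5:
        return 0.5 * (1.5 - y) ** 2
    return 0.0

# trapezoidal rule over the support [-1.5, 1.5]
steps = 3000
h = 3.0 / steps
grid = [-1.5 + i * h for i in range(steps + 1)]
area = sum(h * 0.5 * (f_y3(a) + f_y3(b)) for a, b in zip(grid, grid[1:]))
second_moment = sum(h * 0.5 * (a * a * f_y3(a) + b * b * f_y3(b))
                    for a, b in zip(grid, grid[1:]))
print(area)           # close to 1
print(second_moment)  # close to Var(Y_3) = 3/12 = 0.25
```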
The convergence of these distributions towards a Normal density can be seen clearly: as the number of observations increases, the distribution becomes more similar to a Normal distribution, and for ${\displaystyle n\geq 30}$ we can hardly see any differences. The (Lindeberg–Lévy) Central Limit Theorem is the main reason that the Normal distribution is so commonly used. Its practical usefulness derives from the fact that the mean of a sample of independent, identically distributed random variables has an approximately Normal distribution as the sample size increases (usually ${\displaystyle n\geq 30}$). The theorem becomes particularly important when deriving the sampling distribution of statistics. The convergence towards the Normal distribution is very quick if the distribution of the random variables is symmetric; if the distribution is not symmetric, the convergence is much slower. The Central Limit Theorem has various generalizations (e.g. the Lyapunov CLT for independent, but not identically distributed, random variables). Furthermore, there are also limit theorems that describe convergence towards other sorts of distributions.
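The difference in convergence speed between symmetric and skewed parent distributions can be made visible by comparing the sample skewness of simulated means. In the sketch below (an illustration under assumed parent distributions, not from the original text), means of a symmetric Uniform(${\displaystyle -0.5,0.5}$) parent have skewness near 0, while means of a skewed Exponential(1) parent retain a residual skewness of about ${\displaystyle 2/{\sqrt {n}}}$:

```python
import math
import random
import statistics

random.seed(7)
n, reps = 30, 20000

def sample_skewness(data):
    # third standardized moment of a sample
    m = statistics.fmean(data)
    s = statistics.pstdev(data)
    return statistics.fmean(((x - m) / s) ** 3 for x in data)

# standardized means for a symmetric (uniform) and a skewed (exponential) parent
unif_means = [statistics.fmean(random.uniform(-0.5, 0.5) for _ in range(n))
              for _ in range(reps)]
expo_means = [statistics.fmean(random.expovariate(1.0) for _ in range(n))
              for _ in range(reps)]

skew_unif = sample_skewness(unif_means)
skew_expo = sample_skewness(expo_means)
print(skew_unif)  # near 0
print(skew_expo)  # near 2 / sqrt(30) ~ 0.37
```

The residual skewness of the exponential case shows why a larger n is needed before the Normal approximation is adequate for asymmetric distributions.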