# Normal Distribution


A continuous random variable X is normally distributed with parameters ${\displaystyle \mu }$ and ${\displaystyle \sigma ,}$ denoted ${\displaystyle X\sim N(\mu ,\sigma ),}$ if and only if its density function is:

${\displaystyle f_{NV}(x;\mu ,\sigma )={\frac {1}{\sigma {\sqrt {2\pi }}}}e^{-(x-\mu )^{2}/2\sigma ^{2}},\quad -\infty <x<+\infty ,\ \sigma >0}$

The distribution function is:

${\displaystyle F_{NV}(x;\mu ,\sigma )={\frac {1}{\sigma {\sqrt {2\pi }}}}\int \limits _{-\infty }^{x}e^{-(t-\mu )^{2}/2\sigma ^{2}}\,dt}$

The Normal distribution depends on two parameters ${\displaystyle \mu }$ and ${\displaystyle \sigma }$, which are the expected value and the standard deviation of the random variable X.

Expected value, variance and standard deviation:

${\displaystyle E(X)=\mu =\int \limits _{-\infty }^{+\infty }xf(x)\,dx,\quad Var(X)={\sigma }^{2}=\int \limits _{-\infty }^{+\infty }(x-\mu )^{2}f(x)\,dx,\quad \sigma ={\sqrt {{\sigma }^{2}}}}$

Two important properties of Normal random variables:

• Linear transformation

Let X be normally distributed, ${\displaystyle X\sim N(\mu ,\sigma )}$, and let Y be a linear transformation of X: ${\displaystyle Y=a+bX,\ b\neq 0}$. Then the random variable Y is also normally distributed:

${\displaystyle Y\sim N(a+b\mu ,\,|b|\cdot \sigma )}$

The values of the parameters of the transformed random variable follow from the rules for calculating with expected values and variances:

E(a + bX) = a + b ${\displaystyle \cdot }$ E(X)

Var(a + bX) = ${\displaystyle b^{2}}$ Var(X) = ${\displaystyle b^{2}{\sigma }^{2}}$.
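The linear transformation rule can be checked numerically. The following is a minimal sketch that assumes Python's standard-library `statistics.NormalDist`; the values of `mu`, `sigma`, `a` and `b` are arbitrary illustration values and are not taken from the text.

```python
from statistics import NormalDist

# Illustrative (hypothetical) parameter values.
mu, sigma = 100.0, 10.0
a, b = 5.0, -2.0

X = NormalDist(mu, sigma)

# NormalDist supports shifting and scaling by constants,
# which corresponds to Y = a + bX.
Y = a + b * X

# Parameters predicted by the rules in the text.
expected_mean = a + b * mu       # E(a + bX) = a + b * E(X)
expected_sd = abs(b) * sigma     # sqrt(b^2 * sigma^2) = |b| * sigma

print(Y.mean, expected_mean)
print(Y.stdev, expected_sd)
```

Note that the standard deviation uses the absolute value of `b`, so a negative scaling factor still yields a positive scale parameter.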

• Reproduction property

Let us consider n random variables ${\displaystyle X_{1},X_{2}\dots ,X_{n}}$ with Normal distributions: ${\displaystyle X_{i}\sim N(\mu _{i},\sigma _{i}),E(X_{i})=\mu _{i},Var(X_{i})=\sigma _{i}^{2}.}$

A linear combination of independent, normally distributed random variables ${\displaystyle X_{1},\dots ,X_{n}}$, i.e.

${\displaystyle Y=a_{1}X_{1}+a_{2}X_{2}+\dots +a_{n}X_{n}}$ with ${\displaystyle a_{i}\neq 0}$ for at least one i, is again normally distributed:

${\displaystyle Y=\sum \limits _{i=1}^{n}a_{i}X_{i}\sim N\left(\sum \limits _{i=1}^{n}a_{i}\mu _{i},{\sqrt {\sum \limits _{i=1}^{n}a_{i}^{2}\sigma _{i}^{2}}}\right)}$
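As a rough illustration of the reproduction property, the sketch below compares the formula for the parameters of Y with the result of combining `statistics.NormalDist` objects from Python's standard library, whose addition assumes independent summands. The three component distributions and the weights are hypothetical values chosen only for this example.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical independent components (mu_i, sigma_i) and weights a_i.
params = [(1.0, 2.0), (-3.0, 1.0), (0.5, 0.5)]
weights = [2.0, 1.0, -4.0]

X1, X2, X3 = (NormalDist(m, s) for m, s in params)
a1, a2, a3 = weights

# Adding NormalDist objects treats them as independent random variables.
Y = a1 * X1 + a2 * X2 + a3 * X3

# Parameters according to the formula in the text.
mean_formula = sum(a * m for a, (m, _) in zip(weights, params))
sd_formula = sqrt(sum(a**2 * s**2 for a, (_, s) in zip(weights, params)))

print(Y.mean, mean_formula)
print(Y.stdev, sd_formula)
```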

The following diagrams display the density and the distribution function of an N(2;1) random variable.

Density of N(2;1):

Distribution function of N(2;1):

Standardized random variable:

${\displaystyle Z={\frac {X-\mu }{\sigma }}}$

The random variable Z denotes a standardized random variable, which has been centred at its mean and scaled by its standard deviation. If X is normally distributed, then Z also has a Normal distribution.

Standardized Normal distribution: The distribution of Z is usually called the standardized Normal distribution and denoted N(0;1).

The density function of a standardized Normal distribution:

${\displaystyle \varphi (z)={\frac {1}{\sqrt {2\pi }}}e^{-{\frac {z^{2}}{2}}}}$

The distribution function of a standardized Normal distribution:

${\displaystyle \Phi (z)={\frac {1}{\sqrt {2\pi }}}\int \limits _{-\infty }^{z}e^{-v^{2}/2}\,dv}$

Expected value and variance of the standardized Normal distribution:

${\displaystyle E(Z)=0,\quad Var(Z)=1}$

The density and distribution function of a standardized normal random variable are plotted in the following figures.

Density of N(0;1)

Distribution function of N(0;1)
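The density ${\displaystyle \varphi }$ and the distribution function ${\displaystyle \Phi }$ can also be evaluated directly instead of being read from a plot or a table. The sketch below is one possible way to do this in Python. It assumes the well-known identity ${\displaystyle \Phi (z)={\tfrac {1}{2}}\left(1+\operatorname {erf} (z/{\sqrt {2}})\right)}$, which is not stated in the text, and the function names `phi`, `Phi` and `normal_cdf` are chosen here for illustration.

```python
from math import erf, exp, pi, sqrt

def phi(z):
    """Density of the standardized Normal distribution N(0;1)."""
    return exp(-z * z / 2.0) / sqrt(2.0 * pi)

def Phi(z):
    """Distribution function of N(0;1), expressed via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    """Distribution function of N(mu, sigma) obtained by standardizing x."""
    z = (x - mu) / sigma
    return Phi(z)

print(phi(0.0))                        # height of the density at its centre
print(Phi(0.0))                        # half of the probability mass lies below 0
print(normal_cdf(125.0, 100.0, 10.0))  # same as Phi(2.5)
```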

The relation between the distribution N(${\displaystyle \mu ,\sigma }$) and the standardized Normal distribution:

${\displaystyle x=\mu +z\cdot \sigma ,\quad z={\frac {x-\mu }{\sigma }}}$

which implies:

${\displaystyle F_{NV}(x;\mu ,\sigma )=P(X\leq x)=P\left({\frac {X-\mu }{\sigma }}\leq {\frac {x-\mu }{\sigma }}\right)=P(Z\leq z)=\Phi (z)}$

Confidence interval: A confidence interval for the random variable X is the interval with boundaries ${\displaystyle x_{u}}$ and ${\displaystyle x_{o}}$ ${\displaystyle (x_{u}\leq x_{o})}$ which contains the value of the random variable X with probability ${\displaystyle 1-\alpha }$, i.e. ${\displaystyle (1-\alpha )\cdot 100\%}$ of all values of X fall in this interval and ${\displaystyle \alpha \cdot 100\%}$ fall outside it. ${\displaystyle 1-\alpha }$ is usually referred to as the confidence level.

For a known expected value ${\displaystyle \mu }$ of X, the interval is constructed so that X falls into each of the two regions outside the interval with probability ${\displaystyle \alpha /2}$. We call the interval ${\displaystyle [x_{u},x_{o}]=[\mu -k,\mu +k]}$ the (symmetric) confidence interval with confidence level ${\displaystyle P(x_{u}\leq X\leq x_{o})=1-\alpha }$.

To stress the importance of the standard deviation as the scale parameter, the deviation of X from its expected value ${\displaystyle \mu }$ is often measured in multiples of ${\displaystyle \sigma }$. The confidence interval then has the form:

${\displaystyle [\mu -c\sigma \leq X\leq \mu +c\sigma ]}$

If the random variable X is N(${\displaystyle \mu ,\sigma }$), then for ${\displaystyle x=\mu +c\sigma }$ the following holds:

${\displaystyle z={\frac {x-\mu }{\sigma }}={\frac {\mu +c\sigma -\mu }{\sigma }}=c}$

and ${\displaystyle P(Z\leq z)=\Phi (z)=1-\alpha /2}$. The critical value ${\displaystyle z_{1-\alpha /2}}$ for the probability ${\displaystyle 1-\alpha /2}$ can be obtained from the tabulated values of the standardized Normal distribution. Using these values, we obtain the confidence interval for a normally distributed random variable:

${\displaystyle [\mu -z_{1-\alpha /2}\sigma \leq X\leq \mu +z_{1-\alpha /2}\sigma ]}$

and the probability of this interval:

${\displaystyle P(\mu -z_{1-\alpha /2}\sigma \leq X\leq \mu +z_{1-\alpha /2}\sigma )=1-\alpha }$

The confidence interval for a normally distributed random variable:

We have ${\displaystyle P(-z\leq Z\leq z)=P(Z\leq z)-P(Z\leq -z)=P(Z\leq z)-[1-P(Z\leq z)]=2P(Z\leq z)-1}$, which implies that ${\displaystyle P(\mu -z_{1-\alpha /2}\sigma \leq X\leq \mu +z_{1-\alpha /2}\sigma )=2\Phi (z_{1-\alpha /2})-1}$. For a given z we can calculate the confidence level of the interval:

${\displaystyle P(\mu -z\sigma \leq X\leq \mu +z\sigma )={\begin{cases}0.6827&{\text{for }}z=1\\0.9545&{\text{for }}z=2\\0.9973&{\text{for }}z=3\end{cases}}}$
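The three confidence levels can be reproduced from the relation ${\displaystyle 2\Phi (z)-1}$. The following short sketch assumes Python's standard-library `statistics.NormalDist` for ${\displaystyle \Phi }$; the helper name `central_probability` is introduced here only for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standardized Normal distribution N(0;1)

def central_probability(z):
    """P(mu - z*sigma <= X <= mu + z*sigma) = 2*Phi(z) - 1."""
    return 2.0 * Z.cdf(z) - 1.0

levels = {z: round(central_probability(z), 4) for z in (1, 2, 3)}
print(levels)  # {1: 0.6827, 2: 0.9545, 3: 0.9973}
```

The result does not depend on ${\displaystyle \mu }$ or ${\displaystyle \sigma }$, because the interval is expressed in multiples of the standard deviation.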

On the other hand, we could also find the value z that produces the desired confidence level ${\displaystyle 1-\alpha }$, e.g. for ${\displaystyle P(\mu -z_{1-\alpha /2}\sigma \leq X\leq \mu +z_{1-\alpha /2}\sigma )=0.95}$ we obtain ${\displaystyle z=1.96}$.

The Normal distribution is described by two parameters which determine its:

• shape
• location and
• scale (variance)

In this interactive example, you can choose different values of these parameters and observe their effect on the density function of a normal random variable. We recommend that you change only one parameter at a time to better observe its effect on the density function. The density function of a standard Normal distribution is presented (in black) to provide a further reference point. In addition, you can also calculate the probability that ${\displaystyle X}$ falls in some interval.

Let us consider a random variable ${\displaystyle X}$ with Normal distribution ${\displaystyle N(100;\,10)}$.

1.  We want to compute ${\displaystyle P(X\leq x)}$ for ${\displaystyle x=125}$: ${\displaystyle z=(x-\mu )/\sigma =(125-100)/10=2.5}$ ${\displaystyle P(X\leq 125)=F(125)=\Phi \left({\frac {125-100}{10}}\right)=\Phi (2.5)=0.99379}$

There is a 99.38% probability that the random variable ${\displaystyle X}$ is smaller than 125.

2.  We want to calculate the probability ${\displaystyle P(X\geq x)}$ for ${\displaystyle x=115.6}$: ${\displaystyle z=(x-\mu )/\sigma =(115.6-100)/10=1.56}$ ${\displaystyle {\begin{aligned}P(X\geq 115.6)&=1-P(X\leq 115.6)=1-F(115.6)\\&=1-\Phi \left({\frac {115.6-100}{10}}\right)=1-\Phi (1.56)\\&=1-0.94062=0.05938\end{aligned}}}$

There is a 5.94% probability that the random variable ${\displaystyle X}$ is greater than 115.6.

3.  Let us calculate the probability ${\displaystyle P(X\leq x)}$ for ${\displaystyle x=80}$: ${\displaystyle z=(x-\mu )/\sigma =(80-100)/10=-2}$ ${\displaystyle P(X\leq 80)=F(80)=\Phi \left({\frac {80-100}{10}}\right)=\Phi (-2)=1-\Phi (2)=1-0.97725=0.02275}$

The random variable ${\displaystyle X}$ is smaller than 80 with a probability of 2.275%.

4.  Let us compute ${\displaystyle P(X\geq x)}$ for ${\displaystyle x=94.8}$: ${\displaystyle z=(x-\mu )/\sigma =(94.8-100)/10=-0.52}$ ${\displaystyle {\begin{aligned}P(X\geq 94.8)&=1-P(X\leq 94.8)=1-F(94.8)\\&=1-\Phi \left({\frac {94.8-100}{10}}\right)=1-\Phi (-0.52)\\&=1-(1-\Phi (0.52))=\Phi (0.52)=0.698468\end{aligned}}}$

The probability that the random variable ${\displaystyle X}$ is greater than 94.8 is 69.85%.

5.  We compute the probability ${\displaystyle P(x_{u}\leq X\leq x_{o})}$ for ${\displaystyle x_{u}=88.8}$ and ${\displaystyle x_{o}=132}$: ${\displaystyle z_{u}=(x_{u}-\mu )/\sigma =(88.8-100)/10=-1.12}$ ${\displaystyle z_{o}=(x_{o}-\mu )/\sigma =(132-100)/10=3.2}$ ${\displaystyle {\begin{aligned}P(88.8\leq X\leq 132)&=P(X\leq 132)-P(X\leq 88.8)\\&=F(132)-F(88.8)\\&=\Phi (3.2)-\Phi (-1.12)\\&=\Phi (3.2)-(1-\Phi (1.12))\\&=0.999313+0.868643-1\\&=0.867956\end{aligned}}}$

The random variable ${\displaystyle X}$ falls in the interval ${\displaystyle [88.8\,;\,132]}$ with probability 86.8%.

6.  Let us calculate ${\displaystyle P(x_{u}\leq X\leq x_{o})}$ for ${\displaystyle x_{u}=80.4}$ and ${\displaystyle x_{o}=119.6}$ (centered probability interval): ${\displaystyle z_{u}=(x_{u}-\mu )/\sigma =(80.4-100)/10=-1.96}$ ${\displaystyle z_{o}=(x_{o}-\mu )/\sigma =(119.6-100)/10=1.96}$ ${\displaystyle {\begin{aligned}P(80.4\leq X\leq 119.6)&=P(X\leq 119.6)-P(X\leq 80.4)\\&=F(119.6)-F(80.4)\\&=\Phi (1.96)-\Phi (-1.96)\\&=\Phi (1.96)-(1-\Phi (1.96))\\&=2\Phi (1.96)-1\\&=2\cdot 0.975-1=0.95\end{aligned}}}$

The random variable ${\displaystyle X}$ falls into the interval ${\displaystyle [80.4\,;\,119.6]}$ with probability 95%.
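The probabilities in examples 1 to 6 can be cross-checked without tables. The sketch below assumes Python's standard-library `statistics.NormalDist` and evaluates the distribution function of ${\displaystyle N(100;\,10)}$ directly; the variable names `p1` to `p6` are chosen here to match the example numbers.

```python
from statistics import NormalDist

X = NormalDist(mu=100, sigma=10)

p1 = X.cdf(125)                  # example 1: P(X <= 125)
p2 = 1 - X.cdf(115.6)            # example 2: P(X >= 115.6)
p3 = X.cdf(80)                   # example 3: P(X <= 80)
p4 = 1 - X.cdf(94.8)             # example 4: P(X >= 94.8)
p5 = X.cdf(132) - X.cdf(88.8)    # example 5: P(88.8 <= X <= 132)
p6 = X.cdf(119.6) - X.cdf(80.4)  # example 6: P(80.4 <= X <= 119.6)

for number, p in enumerate((p1, p2, p3, p4, p5, p6), start=1):
    print(f"example {number}: {p:.5f}")
```

Small differences from the tabulated results in the last decimal places are expected, since the tables round ${\displaystyle \Phi (z)}$.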

7.  We want to calculate an interval, which is symmetric around the expected value, such that it will contain 99% of the realizations of ${\displaystyle X}$: ${\displaystyle {\begin{aligned}P(x_{u}\leq X\leq x_{o})&=0.99\\&=P\left({\frac {x_{u}-100}{10}}\leq Z\leq {\frac {x_{o}-100}{10}}\right)\\&=P(-z\leq Z\leq z)=2\Phi (z)-1\\\Phi (z)&={\frac {1.99}{2}}=0.995\end{aligned}}}$

For the value (the probability) 0.995 we find in the tables of the distribution function of the standard Normal distribution that ${\displaystyle z=2.58}$. This implies: ${\displaystyle x_{o}=\mu +z\sigma =100+2.58\cdot 10=125.8}$ ${\displaystyle x_{u}=\mu -z\sigma =100-2.58\cdot 10=74.2}$ so that ${\displaystyle P(74.2\leq X\leq 125.8)=0.99}$.

The random variable ${\displaystyle X}$ falls into the interval ${\displaystyle [74.2\,;\,125.8]}$ with a 99% probability.
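The symmetric interval of example 7 can also be obtained from the inverse distribution function. The following sketch assumes Python's standard-library `statistics.NormalDist` and its `inv_cdf` method; it yields a critical value of about 2.576, which the table lookup rounds to 2.58, so the boundaries differ slightly from the tabulated result.

```python
from statistics import NormalDist

X = NormalDist(mu=100, sigma=10)
level = 0.99  # desired probability 1 - alpha

# Phi(z) = (1 + level) / 2 = 0.995, solved for z with the inverse cdf.
z = NormalDist().inv_cdf((1 + level) / 2)

x_u = X.mean - z * X.stdev  # lower boundary
x_o = X.mean + z * X.stdev  # upper boundary

print(round(z, 3), round(x_u, 1), round(x_o, 1))
print(X.cdf(x_o) - X.cdf(x_u))  # probability of the interval, about 0.99
```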

8.  Let us find an ${\displaystyle x}$ such that 76.11% of the realizations of ${\displaystyle X}$ are smaller than ${\displaystyle x}$: ${\displaystyle {\begin{aligned}P(X\leq x)&=0.7611\\&=P\left(Z\leq {\frac {x-100}{10}}\right)=P(Z\leq z)\end{aligned}}}$

For the value 0.7611 we obtain from the standard Normal distribution tables that ${\displaystyle z=0.71}$. Hence: ${\displaystyle x=\mu +z\sigma =100+0.71\cdot 10=107.1}$ so that ${\displaystyle P(X\leq 107.1)=0.7611}$.

There is a 76.11% probability that the random variable ${\displaystyle X}$ will be smaller than 107.1.

9.  We calculate ${\displaystyle x}$ such that 3.6% of the realizations of ${\displaystyle X}$ are greater than ${\displaystyle x}$: ${\displaystyle {\begin{aligned}P(X\geq x)&=0.036\\&=P\left(Z\geq {\frac {x-100}{10}}\right)=P(Z\geq z)\end{aligned}}}$

Since ${\displaystyle P(Z\leq z)=1-P(Z\geq z)=1-0.036=0.964}$, the standard Normal distribution tables give the value ${\displaystyle z=1.8}$ for the probability 0.964. Hence, ${\displaystyle x=\mu +z\sigma =100+1.8\cdot 10=118}$ so that ${\displaystyle P(X\geq 118)=0.036}$.

There is a 3.6% probability that the random variable ${\displaystyle X}$ is greater than 118.
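Examples 8 and 9 ask for quantiles rather than probabilities. As a minimal sketch, again assuming Python's standard-library `statistics.NormalDist`, the inverse distribution function `inv_cdf` replaces the table lookup followed by ${\displaystyle x=\mu +z\sigma }$; the names `x8` and `x9` refer to the example numbers.

```python
from statistics import NormalDist

X = NormalDist(mu=100, sigma=10)

# Example 8: P(X <= x) = 0.7611
x8 = X.inv_cdf(0.7611)

# Example 9: P(X >= x) = 0.036 is equivalent to P(X <= x) = 1 - 0.036 = 0.964
x9 = X.inv_cdf(1 - 0.036)

print(round(x8, 1))  # close to the tabulated result 107.1
print(round(x9, 1))  # close to the tabulated result 118
```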

The Normal distribution is one of the most important continuous distributions because:

• approximate normality can be assumed in many applications
• it can be used to approximate other distributions
• sums and means of a large number of independent observations are approximately normally distributed (central limit theorem)

A random variable with a Normal distribution can take all values between ${\displaystyle -\infty }$ and ${\displaystyle +\infty }$. The Normal distribution is also sometimes referred to as a Gaussian distribution. The density of the Normal distribution is sometimes called the bell curve.

The formulas for the density (or the distribution function) imply that a Normal distribution depends on the parameters ${\displaystyle \mu {\text{ and }}\sigma }$. By varying these parameters we can obtain a range of distributions. The following diagram shows 5 normal densities with various parameters ${\displaystyle \mu {\text{ and }}\sigma }$.

The parameter ${\displaystyle \mu }$ specifies the location of the distribution. If we change the parameter ${\displaystyle \mu }$, the location of the distribution will shift but its shape remains the same. By increasing or decreasing the parameter ${\displaystyle \sigma }$, the density "spreads" or "concentrates". Large values of ${\displaystyle \sigma }$ produce flatter and wider densities. Small values of ${\displaystyle \sigma }$ produce distributions that are narrow and tight.

Other properties of the Normal distribution:

•  the density has global maximum (the mode) at point ${\displaystyle x=\mu }$
•  the density is symmetric around the point ${\displaystyle x=\mu }$. The symmetry implies that the median is ${\displaystyle x_{0.5}=\mu }$.
•  the density has inflexion points at ${\displaystyle x_{1}=\mu -\sigma }$ and ${\displaystyle x_{2}=\mu +\sigma }$
•  the density is asymptotically equal to 0 as ${\displaystyle x\rightarrow -\infty }$ or ${\displaystyle x\rightarrow \infty }$.
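These properties can be checked numerically for a concrete case. The sketch below uses the ${\displaystyle N(2;1)}$ distribution and Python's standard-library `statistics.NormalDist`; the grid search for the mode and the finite-difference approximation of the second derivative are illustrative choices made here, not methods from the text.

```python
from statistics import NormalDist

mu, sigma = 2.0, 1.0
X = NormalDist(mu, sigma)

# Mode: search a grid around mu for the largest density value.
grid = [mu + k * 0.01 for k in range(-500, 501)]
mode = max(grid, key=X.pdf)

# Symmetry around mu, and the median x_0.5 = mu.
symmetric = all(
    abs(X.pdf(mu + d) - X.pdf(mu - d)) < 1e-12 for d in (0.3, 1.0, 2.5)
)
median = X.inv_cdf(0.5)

def second_derivative(x, h=1e-3):
    """Central finite-difference approximation of f''(x)."""
    return (X.pdf(x + h) - 2 * X.pdf(x) + X.pdf(x - h)) / h**2

# Inflexion points: the curvature changes sign at mu - sigma and mu + sigma.
concave_inside = second_derivative(mu + 0.9 * sigma) < 0
convex_outside = second_derivative(mu + 1.1 * sigma) > 0

# Tails: the density is practically 0 far away from mu.
tail_value = X.pdf(mu + 10 * sigma)

print(mode, median, symmetric, concave_inside, convex_outside, tail_value)
```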

The diagram contains a plot of a ${\displaystyle N(2;1)}$ distribution.

Standard Normal distribution: Tabulating the distribution function of the Normal distribution for all values of ${\displaystyle \mu }$ and ${\displaystyle \sigma }$ is not possible. However, since we can transform a Normal random variable to obtain another Normal random variable, we need only tabulate one distribution. The obvious choice is the Normal distribution with expected value ${\displaystyle E(X)=\mu =0}$ and standard deviation ${\displaystyle \sigma =1}$. This distribution is called the standard Normal distribution, denoted ${\displaystyle N(0,1)}$. The corresponding random variables are usually denoted by the letter ${\displaystyle Z}$.

The random variable ${\displaystyle Z}$ is the random variable ${\displaystyle X}$ centered at its mean and divided by its standard deviation. Hence ${\displaystyle E(Z)=0}$ and ${\displaystyle Var(Z)=1}$. If ${\displaystyle X}$ is normally distributed, then ${\displaystyle Z}$ also has a (standard) Normal distribution. The standard Normal distribution is important because each random variable ${\displaystyle X}$ with an arbitrary Normal distribution can be linearly transformed to a random variable ${\displaystyle Z}$ with standard Normal distribution.

In most tables for the density and distribution function of the standard Normal distribution, you can find only positive values of ${\displaystyle Z}$. Tables of the standard Normal distribution for negative ${\displaystyle Z}$ are unnecessary since the Normal distribution is symmetric:

${\displaystyle \Phi (-z)=P(Z\leq -z)=1-P(Z\leq z)=1-\Phi (z)}$