Parameters

A random variable is completely described by its density and distribution functions. However, some important aspects of the probability distribution can be characterized by a small number of parameters, the most important of which are the location and scale parameters of the random variable.

Expected value

The expected value of a random variable $X$, denoted by $E(X)$ or $\mu$, corresponds to the arithmetic mean of an empirical frequency distribution. It is the value that we expect to obtain, on average, as the outcome of the experiment: if the experiment is repeated many times, the average of all the outcomes approaches $E(X)$.

Definition: Let $X$ be a discrete random variable with outcomes $x_{i}$ and corresponding probabilities $f(x_{i})$. Then the expression

$E(X)=\mu =\sum \limits _{i}x_{i}f(x_{i})$

defines the expected value of the random variable $X$. For a continuous random variable $X$ with density $f(x)$, we define the expected value as

$E(X)=\mu =\int \limits _{-\infty }^{+\infty }x\cdot f(x)\,dx$

Properties of the expected value:

Let $X$ and $Y$ be two random variables with the expected values $E(X)$ and $E(Y)$ . Then:

• for $Y=a+bX$ with any constants $a,b$: $E(Y)=E(a+bX)=a+bE(X)$
• for $Z=X+Y$: $E(Z)=E(X+Y)=E(X)+E(Y)$
• for independent random variables $X,Y$: $E(XY)=E(X)E(Y)$

Variance

Definition: The variance, usually denoted by $Var(X)$ or $\sigma ^{2}$, is defined as the expected value of the squared difference between a random variable and its expected value:

$Var(X)=E[(X-E(X))^{2}]=E(X^{2})-[E(X)]^{2}$

For a discrete random variable we obtain

$Var(X)=\sigma ^{2}=\sum \limits _{i}[x_{i}-E(X)]^{2}\cdot f(x_{i})=\sum \limits _{i}x_{i}^{2}f(x_{i})-[E(X)]^{2}$

and for a continuous random variable the variance is defined as

$Var(X)=\sigma ^{2}=\int \limits _{-\infty }^{+\infty }[x-E(X)]^{2}\cdot f(x)\,dx=\int \limits _{-\infty }^{+\infty }x^{2}f(x)\,dx-[E(X)]^{2}$

Properties of the variance:

Assume that $X$ and $Y$ are two random variables with the variances $Var(X)$ and $Var(Y)$ . Then:

• for $Y=a+bX$, where $a$ and $b$ are constants: $Var(Y)=Var(a+bX)=b^{2}Var(X)$
• for independent random variables $X,Y$ and $Z=X+Y$: $Var(Z)=Var(X)+Var(Y)$ and $\sigma _{Z}=\sigma _{X+Y}={\sqrt {\sigma _{X}^{2}+\sigma _{Y}^{2}}}$

Standard deviation

The standard deviation $\sigma$ is the square root of the variance; it summarizes the spread of the distribution. Large values of the standard deviation mean that the random variable $X$ is likely to vary in a large neighbourhood around the expected value. Smaller values of the standard deviation indicate that the values of $X$ will be concentrated around the expected value.
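
The properties of the expected value, the variance and the standard deviation listed above can be checked exactly on a small example. The following is only a sketch: the two probability functions `pmf_x` and `pmf_y`, the constants `a` and `b`, and all helper names are hypothetical choices made for illustration, and $X$ and $Y$ are assumed to be independent, so that their joint probabilities are products.

```python
from fractions import Fraction as F
from itertools import product
from math import isclose, sqrt

# Hypothetical probability functions (value -> probability), for illustration only.
pmf_x = {0: F(1, 4), 1: F(1, 2), 2: F(1, 4)}
pmf_y = {1: F(1, 3), 4: F(2, 3)}


def expect(pmf, g=lambda v: v):
    """E[g(V)] for a discrete probability function {value: probability}."""
    return sum(g(v) * p for v, p in pmf.items())


def var(pmf):
    """Var(V) = E(V^2) - [E(V)]^2."""
    return expect(pmf, lambda v: v * v) - expect(pmf) ** 2


def transform(pmf, g):
    """Probability function of g(V); probabilities of equal images are added."""
    out = {}
    for v, p in pmf.items():
        out[g(v)] = out.get(g(v), 0) + p
    return out


def combine_independent(pmf_a, pmf_b, g):
    """Probability function of g(A, B), assuming A and B are independent."""
    out = {}
    for (u, pu), (v, pv) in product(pmf_a.items(), pmf_b.items()):
        out[g(u, v)] = out.get(g(u, v), 0) + pu * pv
    return out


a, b = 3, -2  # arbitrary constants for Y = a + bX
lin = transform(pmf_x, lambda v: a + b * v)
total = combine_independent(pmf_x, pmf_y, lambda u, v: u + v)
prod = combine_independent(pmf_x, pmf_y, lambda u, v: u * v)

# Expected value: E(a + bX), E(X + Y), E(XY) under independence.
assert expect(lin) == a + b * expect(pmf_x)
assert expect(total) == expect(pmf_x) + expect(pmf_y)
assert expect(prod) == expect(pmf_x) * expect(pmf_y)

# Variance: Var(a + bX) and Var(X + Y) under independence.
assert var(lin) == b ** 2 * var(pmf_x)
assert var(total) == var(pmf_x) + var(pmf_y)

# Standard deviation of the sum: sqrt(sigma_X^2 + sigma_Y^2).
sigma_x, sigma_y = sqrt(var(pmf_x)), sqrt(var(pmf_y))
assert isclose(sqrt(var(total)), sqrt(sigma_x ** 2 + sigma_y ** 2))
```

Exact fractions are used so that the identities hold with `==` rather than only up to rounding; with floating-point probabilities a tolerance-based comparison would be needed instead.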

Standardization

Sometimes, it is useful to transform a random variable in order to obtain a distribution that does not depend on any (unknown) parameters. It is easy to show that the standardized random variable $Z={\frac {X-E(X)}{\sigma _{X}}}$ has expected value $E(Z)=0$ and variance $Var(Z)=1$ .
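
The claim follows from the rules for linear transformations: $E(Z)=(E(X)-E(X))/\sigma _{X}=0$ and $Var(Z)=Var(X)/\sigma _{X}^{2}=1$. A minimal numerical sketch of the same fact is given below; the probability function `pmf` is a hypothetical example chosen only for illustration.

```python
from math import sqrt

# Hypothetical probability function (value -> probability), for illustration only.
pmf = {1: 0.2, 2: 0.5, 5: 0.3}

mu = sum(x * p for x, p in pmf.items())
sigma = sqrt(sum((x - mu) ** 2 * p for x, p in pmf.items()))

# Z = (X - mu) / sigma takes the value (x - mu) / sigma
# with the same probability with which X takes the value x.
pmf_z = {(x - mu) / sigma: p for x, p in pmf.items()}

mean_z = sum(z * p for z, p in pmf_z.items())
var_z = sum((z - mean_z) ** 2 * p for z, p in pmf_z.items())

# Up to floating-point rounding, mean_z is 0 and var_z is 1.
print(mean_z, var_z)
```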

Chebyshev’s inequality

Chebyshev’s inequality provides a bound on the probability that a random variable falls within some interval around its expected value. This inequality only requires us to know the expected value and the variance of the distribution; we do not have to know the distribution itself. The inequality is based on the interval $[\mu -k\cdot \sigma ;\mu +k\cdot \sigma ]$, which is centered around $\mu$.

Definition: Consider the random variable $X$ with expected value $\mu$ and variance $\sigma ^{2}$. Then, for any $k>0$, we have

$P(\mu -k\cdot \sigma \leq X\leq \mu +k\cdot \sigma )\geq 1-{\frac {1}{k^{2}}}$

Denoting $k\cdot \sigma =a$, we obtain

$P(\mu -a\leq X\leq \mu +a)\geq 1-{\frac {\sigma ^{2}}{a^{2}}}$

We can also use the inequality to obtain a bound for the complementary event that the random variable $X$ falls outside the interval, i.e. $\{|X-\mu |>k\cdot \sigma \}$:

$P(|X-\mu |>k\cdot \sigma )<1/k^{2}$

and for $k\cdot \sigma =a$

$P(|X-\mu |>a)<\sigma ^{2}/a^{2}{\text{.}}$

Note that the exact probabilities of the events $\{|X-\mu |>k\cdot \sigma \}$ and $\{|X-\mu |\leq k\cdot \sigma \}$ depend on the specific distribution of $X$.
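
To see how the bound compares with an exact probability, the inequality can be evaluated for a concrete distribution. The sketch below uses a fair six-sided die as an assumed example (it is not part of the text above), and the function name `chebyshev_check` is hypothetical.

```python
from math import sqrt


def chebyshev_check(pmf, k):
    """Return (exact, bound) for the interval [mu - k*sigma, mu + k*sigma].

    exact is P(|X - mu| <= k*sigma) computed from the probability function,
    bound is the Chebyshev lower bound 1 - 1/k^2.
    """
    mu = sum(x * p for x, p in pmf.items())
    sigma = sqrt(sum((x - mu) ** 2 * p for x, p in pmf.items()))
    exact = sum(p for x, p in pmf.items() if abs(x - mu) <= k * sigma)
    bound = 1 - 1 / k ** 2
    return exact, bound


# Assumed example distribution: a fair die with outcomes 1..6.
die = {x: 1 / 6 for x in range(1, 7)}

for k in (1, 1.2, 1.5, 2, 3):
    exact, bound = chebyshev_check(die, k)
    print(f"k={k}: exact={exact:.4f}, lower bound={bound:.4f}")
```

For this distribution the exact probability is well above the bound for every $k$ tried, which illustrates that the inequality holds for any distribution with the given mean and variance but can be quite conservative for a particular one.
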

Let $X$ be a continuous random variable with the density

$f(x)=\left\{{\begin{array}{ll}0.25x-0.5&{\text{for}}\ 2<x\leq 4\\-0.25x+1.5&{\text{for}}\ 4<x\leq 6\\0&{\text{otherwise.}}\end{array}}\right.$

We calculate the expected value of $X$:

${\begin{aligned}E(X)=\mu &=\int _{-\infty }^{\infty }xf(x)\,dx\\&=\int _{2}^{4}x(0.25x-0.5)\,dx+\int _{4}^{6}x(-0.25x+1.5)\,dx\\&=\int _{2}^{4}(0.25x^{2}-0.5x)\,dx+\int _{4}^{6}(-0.25x^{2}+1.5x)\,dx\\&=\left[0.25\cdot {\frac {1}{3}}x^{3}-0.5\cdot {\frac {1}{2}}x^{2}\right]_{2}^{4}+\left[-0.25\cdot {\frac {1}{3}}x^{3}+1.5\cdot {\frac {1}{2}}x^{2}\right]_{4}^{6}\\&=4\end{aligned}}$

Now we calculate the variance:

${\begin{aligned}Var(X)=\sigma ^{2}&=\int _{-\infty }^{\infty }x^{2}f(x)\,dx-[E(X)]^{2}\\&=\int _{2}^{4}x^{2}(0.25x-0.5)\,dx+\int _{4}^{6}x^{2}(-0.25x+1.5)\,dx-4^{2}\\&=\int _{2}^{4}(0.25x^{3}-0.5x^{2})\,dx+\int _{4}^{6}(-0.25x^{3}+1.5x^{2})\,dx-4^{2}\\&=\left[0.25\cdot {\frac {1}{4}}x^{4}-0.5\cdot {\frac {1}{3}}x^{3}\right]_{2}^{4}+\left[-0.25\cdot {\frac {1}{4}}x^{4}+1.5\cdot {\frac {1}{3}}x^{3}\right]_{4}^{6}-16\\&=0.6667\,.\end{aligned}}$

The standard deviation is equal to $\sigma =0.8165$. This continuous random variable therefore has expected value 4 and standard deviation $0.8165$.

Let the random variable $X$ denote the number of traffic accidents occurring at an intersection during a week. From long-term records, we know the following frequency distribution of $X$:

$x_{i}$ 0 1 2 3 4 5
$f(x_{i})$ 0.08 0.18 0.32 0.22 0.14 0.06

The expected value of $X$ , i.e. the expected number of crashes, can be computed as follows:

$x_{i}$ 0 1 2 3 4 5
$f(x_{i})$ 0.08 0.18 0.32 0.22 0.14 0.06
$x_{i}f(x_{i})$ 0 0.18 0.64 0.66 0.56 0.30

This gives $E(X)=\mu =\sum x_{i}f(x_{i})=2.34\,.$ Of course, this value can never actually be observed, since we cannot have 2.34 accidents during a week. The value $E(X)=2.34$ merely indicates the center of the probability function of the random variable $X$. Now we calculate the standard deviation:

$x_{i}^{2}$ 0 1 4 9 16 25
$x_{i}^{2}f(x_{i})$ 0 0.18 1.28 1.98 2.24 1.50

$Var(X)=\sigma ^{2}=\sum x_{i}^{2}f(x_{i})-\mu ^{2}=7.18-2.34^{2}=1.7044\Rightarrow \sigma =1.306\,.$ The distribution of the number of accidents at this intersection therefore has a mean of 2.34 and a standard deviation of $1.306$.
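
Both worked examples can be reproduced numerically. The sketch below takes the accident probabilities from the table and the piecewise density from the integrands used in the continuous example. The midpoint-rule integrator `integrate` and the other names are hypothetical helpers, so the continuous results are approximations rather than exact values.

```python
from math import sqrt

# Discrete example: number of accidents per week and its probabilities.
xs = [0, 1, 2, 3, 4, 5]
fs = [0.08, 0.18, 0.32, 0.22, 0.14, 0.06]

mu_d = sum(x * f for x, f in zip(xs, fs))
var_d = sum(x * x * f for x, f in zip(xs, fs)) - mu_d ** 2
sd_d = sqrt(var_d)


# Continuous example: density that rises on (2, 4] and falls on (4, 6].
def density(x):
    if 2 < x <= 4:
        return 0.25 * x - 0.5
    if 4 < x <= 6:
        return -0.25 * x + 1.5
    return 0.0


def integrate(g, lo, hi, n=40_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))


mu_c = integrate(lambda x: x * density(x), 2, 6)
var_c = integrate(lambda x: x * x * density(x), 2, 6) - mu_c ** 2
sd_c = sqrt(var_c)

print(f"discrete:   mean={mu_d:.2f}, variance={var_d:.4f}, sd={sd_d:.3f}")
print(f"continuous: mean={mu_c:.4f}, variance={var_c:.4f}, sd={sd_c:.4f}")
```

The printed values should agree with the hand calculations above up to rounding. The integration range is restricted to $[2,6]$ because the density is zero elsewhere.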