From MM*Stat International


A random variable is completely described by its density and distribution functions. However, some important aspects of the probability distribution can be characterized by a small number of parameters, the most important of which are the location and scale parameters of a random variable.

Expected value

The expected value of a random variable $X$, denoted by $E(X)$ or $\mu$, corresponds to the arithmetic mean of an empirical frequency distribution. The expected value is the value that we, on average, expect to obtain as an outcome of the experiment: by repeating the experiment many times, the average of all the outcomes approaches the expected value.

Definition: Consider a discrete random variable $X$ with outcomes $x_1, x_2, \ldots, x_n$ and corresponding probabilities $p_1, p_2, \ldots, p_n$. Then the expression $$E(X) = \sum_{i=1}^{n} x_i p_i$$ defines the expected value of the random variable $X$. For a continuous random variable $X$, with density $f(x)$, we define the expected value as $$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx.$$
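As a small sketch in Python (the function name and the die example are illustrative, not from the original text), the discrete definition is just a probability-weighted sum:

```python
# Sketch: expected value of a discrete random variable, E(X) = sum of x_i * p_i.
def expected_value(outcomes, probs):
    """Probability-weighted average of the outcomes."""
    assert abs(sum(probs) - 1.0) < 1e-9, "probabilities must sum to 1"
    return sum(x * p for x, p in zip(outcomes, probs))

# A fair six-sided die: E(X) = (1 + 2 + ... + 6) / 6 = 3.5
die_mean = expected_value([1, 2, 3, 4, 5, 6], [1 / 6] * 6)
```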

Properties of the expected value:

Let $X$ and $Y$ be two random variables with the expected values $E(X)$ and $E(Y)$. Then:

  • $E(a + bX) = a + b\,E(X)$ for any constants $a$ and $b$

  • $E(X + Y) = E(X) + E(Y)$

  • $E(XY) = E(X)\,E(Y)$ for independent random variables $X$ and $Y$
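These properties can be checked by brute-force enumeration; the following sketch (my own illustration, using two independent fair dice with joint probability 1/36 per outcome pair) verifies the linearity and independence rules:

```python
# Sketch: verifying E(a + bX) = a + b*E(X) and E(XY) = E(X)*E(Y)
# for two independent fair dice, by enumerating all outcomes.
from itertools import product

faces = [1, 2, 3, 4, 5, 6]
E_X = sum(x / 6 for x in faces)                       # 3.5
E_lin = sum((2 + 3 * x) / 6 for x in faces)           # E(2 + 3X)
E_XY = sum(x * y / 36 for x, y in product(faces, faces))

lin_ok = abs(E_lin - (2 + 3 * E_X)) < 1e-9    # linearity
prod_ok = abs(E_XY - E_X * E_X) < 1e-9        # independence
```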


Variance

Definition: The variance, which is usually denoted by $\text{Var}(X)$ or $\sigma^2$, is defined as the expected value of the squared difference between a random variable and its expected value: $$\text{Var}(X) = E\left[(X - E(X))^2\right].$$ For a discrete random variable we obtain $$\text{Var}(X) = \sum_{i} (x_i - E(X))^2 p_i,$$ and for a continuous random variable the variance is defined as $$\text{Var}(X) = \int_{-\infty}^{\infty} (x - E(X))^2 f(x)\,dx.$$
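The discrete formula translates directly into code; this sketch (illustrative names, fair-die example assumed by me) evaluates it:

```python
# Sketch: variance of a discrete random variable, Var(X) = sum of (x_i - mu)^2 * p_i.
def variance(outcomes, probs):
    mu = sum(x * p for x, p in zip(outcomes, probs))
    return sum((x - mu) ** 2 * p for x, p in zip(outcomes, probs))

# Fair six-sided die: Var(X) = 35/12
die_var = variance([1, 2, 3, 4, 5, 6], [1 / 6] * 6)
```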

The properties of the variance:

Assume that $X$ and $Y$ are two random variables with the variances $\text{Var}(X)$ and $\text{Var}(Y)$. Then:

  • $\text{Var}(a + bX) = b^2\,\text{Var}(X)$, where $a$ and $b$ are constants

  • $\text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y)$ for independent random variables $X$ and $Y$
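Both rules can likewise be confirmed by enumeration; this sketch (my own example with fair dice) checks $\text{Var}(2 + 3X) = 9\,\text{Var}(X)$ and the additivity for a sum of two independent dice:

```python
# Sketch: checking Var(a + bX) = b^2 * Var(X) and Var(X1 + X2) = 2 * Var(X)
# for independent fair dice, by direct enumeration.
from itertools import product

faces = [1, 2, 3, 4, 5, 6]

def var_from(values, probs):
    mu = sum(v * p for v, p in zip(values, probs))
    return sum((v - mu) ** 2 * p for v, p in zip(values, probs))

var_X = var_from(faces, [1 / 6] * 6)                     # 35/12
var_Y = var_from([2 + 3 * x for x in faces], [1 / 6] * 6)  # Var(2 + 3X)

sums = [a + b for a, b in product(faces, faces)]
var_S = var_from(sums, [1 / 36] * 36)                    # Var(X1 + X2)
```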

Standard deviation

The standard deviation $\sigma = \sqrt{\text{Var}(X)}$ is the square root of the variance and summarizes the spread of the distribution. Large values of the standard deviation mean that the random variable is likely to vary in a large neighbourhood around the expected value; smaller values indicate that the values of $X$ will be concentrated around the expected value.


Sometimes, it is useful to transform a random variable in order to obtain a distribution that does not depend on any (unknown) parameters. It is easy to show that the standardized random variable $$Z = \frac{X - E(X)}{\sigma}$$ has expected value $0$ and variance $1$.
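A quick numerical check of this claim (my own sketch, standardizing a fair die by enumeration rather than any example from the original):

```python
# Sketch: standardize a fair die X and verify that Z = (X - E(X)) / sigma
# has expected value 0 and variance 1.
faces = [1, 2, 3, 4, 5, 6]
mu = sum(v / 6 for v in faces)
sigma = sum((v - mu) ** 2 / 6 for v in faces) ** 0.5

z = [(v - mu) / sigma for v in faces]          # standardized outcomes
z_mean = sum(v / 6 for v in z)
z_var = sum((v - z_mean) ** 2 / 6 for v in z)
```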

Chebyshev’s inequality

Chebyshev’s inequality provides a bound on the probability that a random variable falls within some interval around its expected value. This inequality only requires us to know the expected value and the variance of the distribution; we do not have to know the distribution itself. The inequality is based on an interval centered around $E(X)$.

Definition: Consider the random variable $X$ with expected value $\mu = E(X)$ and variance $\sigma^2$. Then, for any $k > 0$, we have $$P(|X - \mu| < k\sigma) \geq 1 - \frac{1}{k^2}.$$ Denoting $\varepsilon = k\sigma$, we obtain $$P(|X - \mu| < \varepsilon) \geq 1 - \frac{\sigma^2}{\varepsilon^2}.$$ We can also use the inequality to obtain a bound for the complementary event that the random variable falls outside the interval, i.e. $$P(|X - \mu| \geq k\sigma) \leq \frac{1}{k^2}$$ and, for $\varepsilon = k\sigma$, $$P(|X - \mu| \geq \varepsilon) \leq \frac{\sigma^2}{\varepsilon^2}.$$ Note that the exact probabilities depend on the specific distribution of $X$.

Example: Let $X$ be a continuous random variable with a given density $f(x)$. The expected value is computed as $E(X) = \int x f(x)\,dx$, the variance as $\text{Var}(X) = \int (x - E(X))^2 f(x)\,dx$, and the standard deviation as $\sigma = \sqrt{\text{Var}(X)}$. For the continuous random variable considered here, these calculations give an expected value of $4$.

Example: Let the random variable $X$ denote the number of traffic accidents occurring at an intersection during a week. From long-term records, we know the following frequency distribution of $X$:

  x_i         0     1     2     3     4     5
  P(X=x_i)   0.08  0.18  0.32  0.22  0.14  0.06

The expected value of $X$, i.e. the expected number of crashes, can be computed as follows:

  x_i             0     1     2     3     4     5
  P(X=x_i)       0.08  0.18  0.32  0.22  0.14  0.06
  x_i·P(X=x_i)   0     0.18  0.64  0.66  0.56  0.30

This gives $E(X) = 0 + 0.18 + 0.64 + 0.66 + 0.56 + 0.30 = 2.34$. This number of traffic accidents is, of course, not possible, since we cannot have 2.34 accidents during a week; the value just shows the center of the probability function of the random variable $X$. Now we calculate the standard deviation:

  x_i²              0     1     4     9    16    25
  x_i²·P(X=x_i)     0    0.18  1.28  1.98  2.24  1.50

This gives $E(X^2) = 0 + 0.18 + 1.28 + 1.98 + 2.24 + 1.50 = 7.18$, so $\text{Var}(X) = E(X^2) - [E(X)]^2 = 7.18 - 2.34^2 = 1.7044$ and $\sigma = \sqrt{1.7044} \approx 1.31$. We can expect that the distribution of accidents at this intersection has a mean of 2.34 and a standard deviation of approximately 1.31.
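The calculations above, together with an exact check of Chebyshev’s inequality on the same distribution, can be sketched in Python (variable names are my own, not from the original):

```python
# Sketch: mean, variance, and standard deviation of the weekly accident count,
# plus an exact check of Chebyshev's bound P(|X - mu| >= k*sigma) <= 1/k^2.
x = [0, 1, 2, 3, 4, 5]
p = [0.08, 0.18, 0.32, 0.22, 0.14, 0.06]

mean = sum(xi * pi for xi, pi in zip(x, p))               # 2.34
second_moment = sum(xi ** 2 * pi for xi, pi in zip(x, p))  # 7.18
var = second_moment - mean ** 2                            # 1.7044
sd = var ** 0.5                                            # ~1.31

def tail_prob(eps):
    """Exact P(|X - mean| >= eps) for this discrete distribution."""
    return sum(pi for xi, pi in zip(x, p) if abs(xi - mean) >= eps)

# The exact tail probability never exceeds Chebyshev's bound 1/k^2.
chebyshev_holds = all(tail_prob(k * sd) <= 1 / k ** 2 for k in (1.5, 2, 2.5))
```

Because the full distribution is known here, the exact tail probabilities can be compared with the distribution-free Chebyshev bound, which is typically much looser.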