Probability Concepts

From MM*Stat International



Probability is a measure P(\bullet) which quantifies the degree of (un)certainty associated with an event. We will discuss three common approaches to probability.

Classical Probability

Laplace’s classical definition of probability is based on equally likely outcomes. He postulates the following properties of events:

  • the sample space is composed of a finite number of basic outcomes
  • the random process generates exactly one basic outcome and hence one elementary event
  • the elementary events are equally likely, i.e. occur with the same probability

Accepting these assumptions, the probability of any event A (a subset of the sample space S) can be computed as

P(A)=\frac{\#\left(\text{basic outcomes in }A\right)}{\#\left(\text{basic outcomes in }S\right)}=\frac{\#\left(\text{elementary events comprising }A\right)}{\#\left(\text{elementary events comprising }S\right)}

Properties:

  • 0\leq P(A) \leq1
  • P(\emptyset)=0
  • P(S)=1

Example: Rolling a six-sided die

  • Sample space: S=\{1,2,3,4,5,6\}
  • Define event A = ‘even number’
  • Elementary events in A: \{2\},\{4\},\{6\}
  • P(A)=\frac{3}{6}=0.5
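The counting rule above can be sketched in a few lines of Python. This is a minimal illustration, assuming a finite sample space of equally likely outcomes; the helper name classical_probability is hypothetical and not part of the original text.

```python
from fractions import Fraction


def classical_probability(event, sample_space):
    """Laplace probability: outcomes in the event divided by all outcomes.

    Assumes a finite sample space whose outcomes are equally likely.
    """
    event = set(event)
    sample_space = set(sample_space)
    if not event <= sample_space:
        raise ValueError("event must be a subset of the sample space")
    return Fraction(len(event), len(sample_space))


# Six-sided die from the example above.
S = {1, 2, 3, 4, 5, 6}
A = {x for x in S if x % 2 == 0}  # event 'even number'

p_A = classical_probability(A, S)
print(p_A)  # 1/2
```

Exact fractions are used instead of floats so that the three properties listed above (values between 0 and 1, P(\emptyset)=0, P(S)=1) can be checked without rounding error.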

Statistical Probability

Richard von Mises originated the relative frequency approach to probability: the probability P(A) of an event A is defined as the limit of the relative frequency of A, i.e. the value to which the relative frequency converges if the experiment is repeated an infinite number of times. It is assumed that replications are independent of each other.

Let h_{n}(A) denote the absolute frequency of A occurring in n repetitions. The relative frequency of A is then defined as

f_{n}(A)=\frac{h_{n}(A)}{n}

According to the statistical concept of probability we have

P(A)=\lim_{n\rightarrow\infty}f_{n}(A)

Since 0\leq f_{n}(A)\leq1, it follows that 0\leq P(A)\leq1.

Example: Flipping a coin

Denote by A the event ‘a head appears’. Absolute and relative frequencies of A after n trials are listed in the table below. This particular sample displays a non-monotonic convergence to 0.5, the theoretical probability of a head occurring in repeated flips of a ‘fair’ coin.

n h_{n}(A) f_{n}(A)
10 7 0.700
20 11 0.550
40 17 0.425
60 24 0.400
80 34 0.425
100 47 0.470
200 92 0.460
400 204 0.510
600 348 0.580
800 404 0.505
1000 492 0.492
2000 1010 0.505
3000 1530 0.510
4000 2032 0.508
5000 2515 0.503

Visualizing the sequence of relative frequencies f_{n}\left(  A\right)  as a function of sample size provides some intuition into the character of the convergence.

En folimg535.gif
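A table of this kind can be reproduced by simulation. The following is a rough sketch, assuming a fair coin modelled with Python's pseudo-random generator; the function name relative_frequencies, the seed and the chosen checkpoints are illustrative assumptions, and the simulated counts will differ from the sample shown in the table.

```python
import random


def relative_frequencies(checkpoints, p_head=0.5, seed=0):
    """Simulate independent coin flips and record (n, h_n, f_n) at each checkpoint."""
    rng = random.Random(seed)  # fixed seed so that the run is reproducible
    rows = []
    heads = 0
    flips = 0
    for n in sorted(checkpoints):
        while flips < n:
            heads += rng.random() < p_head  # True counts as 1, False as 0
            flips += 1
        rows.append((n, heads, heads / n))
    return rows


rows = relative_frequencies([10, 100, 1000, 5000])
for n, h, f in rows:
    print(f"{n:5d} {h:5d} {f:.3f}")
```

As in the table, the relative frequency typically fluctuates for small n and settles near 0.5 as n grows, without approaching it monotonically.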

A central objective of statistics is to estimate or approximate probabilities of events using observed data. These estimates can then be used to make probabilistic statements about the process generating the data (e.g., confidence intervals, which we will study later), to test propositions about the process and to predict the likelihood of future events.

Axiomatic Foundation of Probability

P is a probability measure. It is a function which assigns a number P(A) to each event A of the sample space S.

  • Axiom 1: P(A) is real-valued with P(A)\geq0.
  • Axiom 2: P(S)=1.
  • Axiom 3: If two events A and B are mutually exclusive (A\cap B=\emptyset), then P(A\cup B)=P(A)+P(B).

Some basic properties of probability

Let A,B,A_{1},A_{2},\ldots\subset S be events and P(\bullet) a probability measure. Then the following properties follow from the above three axioms:

Properties

  1. P(\overline{A})=1-P(A)
  2. \left(A\cap B=\emptyset\right) \Rightarrow P(A\cap B)=P(\emptyset)=0
  3. If A_{i}\cap A_{j}=\emptyset for i\neq j, then P(A_{1}\cup A_{2}\cup\ldots)=P(A_{1})+P(A_{2})+\ldots
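To make the axioms concrete, the sketch below builds a probability measure on a small finite sample space and checks the three axioms and Property 1 by brute force over all events. The loaded-die weights and the helper names P and events are hypothetical choices made for illustration only.

```python
from fractions import Fraction
from itertools import chain, combinations

# Hypothetical loaded die: these weights are illustrative, not from the text.
weights = {
    1: Fraction(1, 12), 2: Fraction(1, 12),
    3: Fraction(1, 6), 4: Fraction(1, 6),
    5: Fraction(1, 4), 6: Fraction(1, 4),
}
S = frozenset(weights)


def P(event):
    """Probability of an event as the sum of the weights of its outcomes."""
    return sum((weights[w] for w in event), Fraction(0))


def events(sample_space):
    """All subsets (events) of a finite sample space."""
    items = list(sample_space)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]


all_events = events(S)

axiom_1 = all(P(A) >= 0 for A in all_events)
axiom_2 = P(S) == 1
# Axiom 3 is checked only for mutually exclusive pairs (empty intersection).
axiom_3 = all(
    P(A | B) == P(A) + P(B)
    for A in all_events for B in all_events if not (A & B)
)
# Property 1: the complement of A within S is S - A.
property_1 = all(P(S - A) == 1 - P(A) for A in all_events)

print(axiom_1, axiom_2, axiom_3, property_1)
```

A check of this kind only confirms the statements for one particular finite measure; it does not replace the general proofs.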

Addition Rule of Probability

Let A and B be any two events. Then

P\left(A\cup B\right)=P\left(A\right)+P\left(B\right)-P\left(A\cap B\right)

En folnode7 c 02.gif

Extension to three events A, B, C:

P(A\cup B\cup C)=P(A)+P(B)+P(C)-P(A\cap B)-P(A\cap C)-P(B\cap C)+P(A\cap B\cap C)

En folnode7 c k 1 1.gif
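The three-event formula can be checked numerically on a fair six-sided die. The events A, B and C below are hypothetical and were chosen only so that all pairwise intersections are non-empty; the probabilities follow the classical (counting) definition.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}

# Hypothetical events, chosen for illustration only.
A = {2, 4, 6}
B = {4, 5, 6}
C = {1, 2, 6}


def P(event):
    """Classical probability on the fair die."""
    return Fraction(len(event), len(S))


lhs = P(A | B | C)
rhs = (P(A) + P(B) + P(C)
       - P(A & B) - P(A & C) - P(B & C)
       + P(A & B & C))

print(lhs, rhs)  # 5/6 5/6
```

The last term is added back because the outcome 6, which lies in all three events, is counted three times by the single probabilities and subtracted three times by the pairwise intersections.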

Assume you have shuffled a standard deck of 52 playing cards. You are interested in the probability of a randomly drawn card being a queen or a heart, i.e. the probability of the event \left\{\text{Queen}\right\}\cup\left\{\text{Heart}\right\}. Following Laplace’s notion of probability, we proceed as follows: there are 4 queens and 13 hearts in the deck. Hence,

  • P\left(  \left\{  \text{Queen}\right\}  \right)  =\frac{4}{52}
  • P\left(  \left\{  \text{Heart}\right\}  \right)  =\frac{13}{52}

But there is also one card which is both a queen and a heart. As this card is included in both counts, we would overstate the probability of either a queen or a heart appearing if we simply added both probabilities. In fact, the addition rule of probability requires one to deduct the probability of this joint event:

P\left(A\cup B\right)=P\left(A\right)+P\left(B\right)-P\left(A\cap B\right)

Here,

  • P\left(A\cap B\right)=P\left(\left\{\text{Queen}\right\}\cap\left\{\text{Heart}\right\}\right)=\frac{1}{52}

Thus,

P\left(\left\{\text{Queen}\right\}\cup\left\{\text{Heart}\right\}\right)=P\left(\left\{\text{Queen}\right\}\right)+P\left(\left\{\text{Heart}\right\}\right)-P\left(\left\{\text{Queen}\right\}\cap\left\{\text{Heart}\right\}\right)=\frac{4}{52}+\frac{13}{52}-\frac{1}{52}=\frac{16}{52}

The probability of drawing a queen and/or a heart is 16/52.
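The card example can be verified by enumerating the deck. This is a small sketch under the classical definition of probability; the rank and suit labels are an assumed encoding of a standard 52-card deck.

```python
from fractions import Fraction
from itertools import product

# Assumed encoding of a standard deck: 13 ranks in each of 4 suits.
ranks = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = set(product(ranks, suits))

queens = {card for card in deck if card[0] == "Q"}
hearts = {card for card in deck if card[1] == "hearts"}


def P(event):
    """Classical probability of drawing a card that belongs to the event."""
    return Fraction(len(event), len(deck))


direct = P(queens | hearts)                                # count the union
by_rule = P(queens) + P(hearts) - P(queens & hearts)       # addition rule

print(direct, by_rule)  # 4/13 4/13
```

Both values equal 16/52, which Fraction reduces to 4/13. Adding P(queens) and P(hearts) without the correction would give 17/52, because the queen of hearts would be counted twice.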

Proof of the addition rule:

  1. The event B can be rewritten as a union of two disjoint sets A \cap B and \bar A \cap B as follows: B = (A \cap B) \cup (\bar A \cap B)

    as illustrated on the Venn diagram below:

    En folnode7 c mi 2 1k.gif

    The probability P(B) is, according to axiom 3, P(B)=P[(A\cap B)\cup(\bar{A}\cap B)]=P(A\cap B)+P(\bar{A}\cap B) which implies P(\bar{A}\cap B)=P(B)-P(A\cap B)

  2. We rewrite the event A \cup B as a union of two disjoint sets A and \bar A \cap B so that A \cup B = A \cup(\bar A \cap B)

    En folnode7 c mi 2 1l.gif

    The probability P(A \cup B) follows from axiom 3: P(A \cup B) = P[A \cup(\bar A \cap B)] = P(A) + P(\bar A \cap B). Now we obtain the desired result by substituting P(\bar A \cap B) = P(B) - P(A \cap B) from part one: P(A \cup B) = P(A) + P(B) - P(A \cap B).

    En folnode7 c mi 2 1m.gif

Proof of Property 5: Let us show that for A\subset B it follows that P(A)\leq P(B). The event B can be rewritten as B = A \cup(B \setminus A), where A and B \setminus A are disjoint sets. According to axiom 3 we have P(B) = P(A) + P(B \setminus A). Nonnegativity of the probability (axiom 1), P(B \setminus A) \geq0, implies that P(B) \geq P(A). This rule can be illustrated using a Venn diagram:

En folnode7 c mi 01.gif

Proof of Property 7: Let us prove that P(A\setminus B)=P(A)-P(A\cap B). We have A\setminus B=A\cap\bar{B} and A=(A\cap B)\cup(A\cap\bar{B}), where (A\cap B) and (A\cap\bar{B}) are clearly disjoint. Using axiom 3, the probability of A can be calculated as P(A)=P[(A\cap B)\cup(A\cap\bar{B})]=P(A\cap B)+P(A\cap\bar{B})=P(A\cap B)+P(A\setminus B). Subtracting P(A\cap B) from both sides gives P(A\setminus B)=P(A)-P(A\cap B). This result is displayed on the following Venn diagram:

En folnode7 c mi 02.gif
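The two properties proved above can also be checked exhaustively on a small example. The sketch below assumes a fair six-sided die with the classical probability and loops over all pairs of events; the helper names P and events are hypothetical.

```python
from fractions import Fraction
from itertools import chain, combinations

S = frozenset({1, 2, 3, 4, 5, 6})


def P(event):
    """Classical probability on the fair die."""
    return Fraction(len(event), len(S))


def events(sample_space):
    """All subsets (events) of a finite sample space."""
    items = list(sample_space)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]


all_events = events(S)

# Property 5: if A is a subset of B, then P(A) <= P(B).
property_5 = all(
    P(A) <= P(B) for A in all_events for B in all_events if A <= B
)

# Property 7: P(A \ B) = P(A) - P(A and B) for any two events.
property_7 = all(
    P(A - B) == P(A) - P(A & B) for A in all_events for B in all_events
)

print(property_5, property_7)
```

For instance, with A = {2, 4, 6} and B = {4, 5, 6} the difference A \setminus B is {2}, and 3/6 - 2/6 = 1/6 as Property 7 requires.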