# Conditional Probability and Independent Events


## Conditional Probability

Let ${\displaystyle A}$ and ${\displaystyle B}$ be two events defined on the sample space ${\displaystyle S}$. The conditional probability of ${\displaystyle A}$ given ${\displaystyle B}$ is defined as ${\displaystyle P(A|B)={\frac {P(A\cap B)}{P(B)}},{\text{ for }}P(B)>0.}$ The conditional probability assumes that ${\displaystyle B}$ has occurred and asks for the probability that ${\displaystyle A}$ has also occurred. By assuming that ${\displaystyle B}$ has occurred, we have defined a new sample space ${\displaystyle S=B}$ and a new probability measure ${\displaystyle P(\cdot |B)}$. If ${\displaystyle B=A_{2}\cap A_{3}}$, then we may write ${\displaystyle P(A_{1}|A_{2}\cap A_{3})={\frac {P(A_{1}\cap A_{2}\cap A_{3})}{P(A_{2}\cap A_{3})}},{\text{ for }}P(A_{2}\cap A_{3})>0.}$ We may also define the conditional probability of ${\displaystyle B}$ given ${\displaystyle A}$: ${\displaystyle P(B|A)={\frac {P(A\cap B)}{P(A)}},{\text{ for }}P(A)>0.}$
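As a quick numerical check, the definition can be applied directly to frequencies over a finite sample space of equally likely outcomes. The counts below are hypothetical, chosen only for illustration:

```python
# Conditional probability P(A|B) = P(A ∩ B) / P(B), computed from
# hypothetical counts over n equally likely outcomes.
n = 100          # total number of outcomes (assumed)
n_B = 40         # outcomes in B
n_A_and_B = 10   # outcomes in A ∩ B

P_B = n_B / n
P_A_and_B = n_A_and_B / n

# The definition requires P(B) > 0.
P_A_given_B = P_A_and_B / P_B
print(P_A_given_B)  # 0.25
```

Equivalently, within the reduced sample space ${\displaystyle S=B}$ this is just 10 favourable outcomes out of 40.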

## Multiplication Rule

By rearranging the definition of conditional probability, we can extract a formula for the probability of both ${\displaystyle A}$ and ${\displaystyle B}$ occurring: ${\displaystyle P(A\cap B)=P(A)\cdot P(B|A)=P(B)\cdot P(A|B)}$ and, in analogous fashion, ${\displaystyle P(A_{1}\cap A_{2}\cap A_{3})=P(A_{1})\cdot P(A_{2}|A_{1})\cdot P(A_{3}|A_{1}\cap A_{2})}$ Generalisation for events ${\displaystyle A_{1},A_{2},\ldots ,A_{n}}$: ${\displaystyle P(A_{1}\cap \ldots \cap A_{n})=P(A_{1})\cdot P(A_{2}|A_{1})\cdot P(A_{3}|A_{1}\cap A_{2})\cdot \ldots \cdot P(A_{n}|A_{1}\cap \ldots \cap A_{n-1})}$
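A standard illustration of the multiplication rule (not from the text above, but a common one) is drawing three aces in a row from a 52-card deck without replacement: each factor is the conditional probability of the next draw given the previous ones.

```python
from fractions import Fraction

# Chain rule: P(A1 ∩ A2 ∩ A3) = P(A1) · P(A2|A1) · P(A3|A1 ∩ A2).
# A1, A2, A3 = "an ace on the 1st, 2nd, 3rd draw" (without replacement).
p = Fraction(4, 52) * Fraction(3, 51) * Fraction(2, 50)
print(p)  # 1/5525
```

Using exact fractions avoids the rounding error that accumulates when multiplying small floating-point probabilities.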

## Independent Events

The notion underlying the concept of conditional probability is that a priori information concerning the occurrence of events does, in general, influence the probabilities of other events. (For example, if one knows that someone is a smoker, then one would assign a higher probability to that individual contracting lung cancer.) In general, one would therefore expect ${\displaystyle P(A)\neq P(A|B)}$. The case ${\displaystyle P(A)=P(A|B)}$ has an important interpretation. If the probability of ${\displaystyle A}$ occurring remains the same whether or not ${\displaystyle B}$ has occurred, we say that the two events are statistically (or stochastically) independent. (For example, knowing whether an individual is tall or short does not affect one's assessment of that individual developing lung cancer.) We define stochastic independence of two events ${\displaystyle A}$ and ${\displaystyle B}$ by the condition ${\displaystyle P(A\cap B)=P(A)\cdot P(B)}$, which implies (whenever the conditioning events have positive probability) that the following conditions hold: ${\displaystyle P(A)=P(A|B)}$, ${\displaystyle P(B)=P(B|A)}$, ${\displaystyle P(A|B)=P(A|{\overline {B}})}$ and ${\displaystyle P(B|A)=P(B|{\overline {A}})}$. The multiplication condition defining stochastic independence of two events also holds for ${\displaystyle n}$ independent events: ${\displaystyle P(A_{1}\cap \ldots \cap A_{n})=P(A_{1})\cdot \ldots \cdot P(A_{n})}$ To establish statistical independence of ${\displaystyle n}$ events, one must ensure that the multiplication rule holds for every subset of two or more of the events. That is, ${\displaystyle P\left(A_{i_{1}}\cap \ldots \cap A_{i_{m}}\right)=P\left(A_{i_{1}}\right)\cdot \ldots \cdot P\left(A_{i_{m}}\right),{\text{ for }}i_{1},\ldots ,i_{m}{\text{ distinct integers.}}}$ It is important not to confuse stochastic independence with mutual exclusivity.
For example, if two events ${\displaystyle A}$ and ${\displaystyle B}$ with ${\displaystyle P(A)>0}$ and ${\displaystyle P(B)>0}$ are mutually exclusive, then ${\displaystyle A\cap B=\emptyset }$ and hence ${\displaystyle P(A\cap B)=P(\emptyset )=0}$. In that case ${\displaystyle P(A\cap B)=0\neq P(A)\cdot P(B)}$, so mutually exclusive events with positive probabilities can never be independent.
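The distinction can be checked concretely on a single roll of a fair die. The events below are hypothetical choices made purely to illustrate the two notions:

```python
from fractions import Fraction

# One roll of a fair die with equally likely outcomes.
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # "even"
B = {1, 2}      # "at most two"
C = {1, 3, 5}   # "odd" -- mutually exclusive with A

def P(E):
    """Probability of event E under equally likely outcomes."""
    return Fraction(len(E), len(S))

# Independent: P(A ∩ B) = 1/6 = 1/2 · 1/3 = P(A) · P(B)
print(P(A & B) == P(A) * P(B))   # True

# Mutually exclusive but NOT independent: P(A ∩ C) = 0, yet P(A) · P(C) = 1/4
print(P(A & C) == P(A) * P(C))   # False
```

So ${\displaystyle A}$ and ${\displaystyle B}$ are independent despite overlapping, while ${\displaystyle A}$ and ${\displaystyle C}$ are mutually exclusive and therefore dependent.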

## Two-Way Cross-Tabulation

In many applications the researcher is interested in associations between two categorical variables. The simplest case arises if one observes two binary variables, i.e. there are two variables, each with two possible outcomes. For example, suppose that for a randomly selected individual we observe whether or not they smoke and whether or not they have emphysema. Let ${\displaystyle A}$ be the outcome that the individual smokes and ${\displaystyle B}$ be the outcome that they have emphysema. We can construct separate sample spaces ${\displaystyle \left\{A,{\overline {A}}\right\}}$ and ${\displaystyle \left\{B,{\overline {B}}\right\}}$ for each of the two variables. Alternatively, we can construct the sample space of ordered pairs: ${\displaystyle S=\left\{\left(A,B\right),\left(A,{\overline {B}}\right),\left({\overline {A}},B\right),\left({\overline {A}},{\overline {B}}\right)\right\}}$ In tabulating data of this type, we simply count the number of individuals corresponding to each of the four basic outcomes. No information is lost regarding the two variables individually, because we can always obtain frequencies for both categories of either variable by summing over the two categories of the other variable. For example, to calculate the number of individuals who have emphysema, we add up all those who smoke and have emphysema (i.e., ${\displaystyle (A,B)}$) and all those who do not smoke and have emphysema (i.e., ${\displaystyle ({\overline {A}},B)}$). Relative frequencies for categories of the individual variables are called marginal relative frequencies. Relative frequencies arising from bivariate categorical data are usually displayed by cross-tabulating the categories of the two variables. Marginal frequencies are included as sums of the columns/rows representing the categories of each of the variables.
The resulting matrix is called an ${\displaystyle (r\times c)}$-contingency table, where ${\displaystyle r}$ and ${\displaystyle c}$ denote the number of categories observed for each variable. In our example with two categories for each variable, we have a ${\displaystyle (2\times 2)}$-contingency table. We may summarize the probabilities associated with each basic outcome in a similar table:

|  | ${\displaystyle B}$ | ${\displaystyle {\overline {B}}}$ | Sum |
|---|---|---|---|
| ${\displaystyle A}$ | ${\displaystyle P(A\cap B)}$ | ${\displaystyle P(A\cap {\overline {B}})}$ | ${\displaystyle P(A)}$ |
| ${\displaystyle {\overline {A}}}$ | ${\displaystyle P({\overline {A}}\cap B)}$ | ${\displaystyle P({\overline {A}}\cap {\overline {B}})}$ | ${\displaystyle P({\overline {A}})}$ |
| Sum | ${\displaystyle P(B)}$ | ${\displaystyle P({\overline {B}})}$ | ${\displaystyle P(S)=1}$ |
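The summing-over-categories step is easy to make concrete. With hypothetical counts for the smoking/emphysema example (the numbers below are invented for illustration), the marginal frequencies are obtained by summing the rows and columns of the count table:

```python
# Hypothetical counts for a (2 × 2) contingency table:
# rows = smoker (A) / non-smoker (not A),
# columns = emphysema (B) / no emphysema (not B).
counts = [[40, 160],   # A:     (A ∩ B), (A ∩ not B)
          [10, 790]]   # not A: (not A ∩ B), (not A ∩ not B)

row_sums = [sum(row) for row in counts]        # marginal counts for A, not A
col_sums = [sum(col) for col in zip(*counts)]  # marginal counts for B, not B
n = sum(row_sums)                              # total sample size

print(row_sums)  # [200, 800]
print(col_sums)  # [50, 950]
print(n)         # 1000
```

Dividing each cell and each marginal by `n` yields the relative-frequency analogue of the probability table above.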

The structure of this table is particularly helpful in checking for independence between events. Recall that the joint probability of two independent events can be calculated as the product of the probabilities of the two individual events. In this case, we want to verify whether the joint probabilities in the main body of the table are equal to the products of the marginal probabilities. If they are not, then the events are not independent. For example, under independence, we would have ${\displaystyle P(A)\,P(B)=P(A\cap B)}$. If one replaces the probabilities in the above table with their sample frequencies, then independence implies that the estimated joint probabilities should be approximately equal to the products of the estimated marginal probabilities. Formal procedures for testing independence will be discussed later. Joint probabilities of two binary variables are arranged in the contingency table below. Are the variables represented by the events ${\displaystyle \left\{A,{\overline {A}}\right\}}$ and ${\displaystyle \left\{B,{\overline {B}}\right\}}$ (mutually) independent?

|  | ${\displaystyle B}$ | ${\displaystyle {\overline {B}}}$ | Sum |
|---|---|---|---|
| ${\displaystyle A}$ | ${\displaystyle 1/3}$ | ${\displaystyle 1/6}$ | ${\displaystyle 1/2}$ |
| ${\displaystyle {\overline {A}}}$ | ${\displaystyle 1/3}$ | ${\displaystyle 1/6}$ | ${\displaystyle 1/2}$ |
| Sum | ${\displaystyle 2/3}$ | ${\displaystyle 1/3}$ | ${\displaystyle 1}$ |

For the multiplication condition of independence to be satisfied, the inner cells of the contingency table must equal the product of their corresponding marginal probabilities. This is true for all four cells:

|  | ${\displaystyle B}$ | ${\displaystyle {\overline {B}}}$ | Sum |
|---|---|---|---|
| ${\displaystyle A}$ | ${\displaystyle 1/3=1/2\cdot 2/3}$ | ${\displaystyle 1/6=1/2\cdot 1/3}$ | ${\displaystyle 1/2}$ |
| ${\displaystyle {\overline {A}}}$ | ${\displaystyle 1/3=1/2\cdot 2/3}$ | ${\displaystyle 1/6=1/2\cdot 1/3}$ | ${\displaystyle 1/2}$ |
| Sum | ${\displaystyle 2/3}$ | ${\displaystyle 1/3}$ | ${\displaystyle 1}$ |
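This cell-by-cell check mechanizes naturally. A minimal sketch, using the joint probabilities from the table above (the string labels are just keys chosen for this example):

```python
from fractions import Fraction

# Joint probabilities from the (2 × 2) contingency table.
joint = {
    ("A", "B"): Fraction(1, 3), ("A", "notB"): Fraction(1, 6),
    ("notA", "B"): Fraction(1, 3), ("notA", "notB"): Fraction(1, 6),
}

# Marginal probabilities: sum over the other variable's categories.
P_row = {a: sum(p for (x, _), p in joint.items() if x == a)
         for a in ("A", "notA")}
P_col = {b: sum(p for (_, y), p in joint.items() if y == b)
         for b in ("B", "notB")}

# Multiplication condition must hold in every cell.
independent = all(joint[(a, b)] == P_row[a] * P_col[b]
                  for a in P_row for b in P_col)
print(independent)  # True
```

Exact fractions matter here: with floats, `1/3` times `1/2` need not compare equal to a stored `1/6`.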

In this very special example with two binary variables it is, however, not necessary to verify the validity of the multiplication rule for each of the four cells. As we have already seen, stochastic independence of two events implies stochastic independence of their complements. Consequently, if the multiplication condition holds for one of the four cells, it must hold for the other three. This is only true because the only two events to be considered for each variable are complements.

A master and his apprentice produce hand-made screws. The following data were collected over the course of the year 1998:

- Total production: 2000 screws
- Group 1 (the master): 1400 screws
  - 1162 good screws
  - 238 faulty screws
- Group 2 (the apprentice): 600 screws
  - 378 good screws
  - 222 faulty screws

What is the probability that a randomly selected screw is not faulty, given that it was produced by the master? In order to calculate this probability, we will use the following notation: ${\displaystyle A}$ = {screw is good}, ${\displaystyle B}$ = {screw produced by master}, ${\displaystyle C}$ = {screw produced by apprentice}.

We would like to calculate ${\displaystyle P(A|B)}$. This conditional probability is defined as ${\displaystyle P(A|B)=P(A\cap B)/P(B)}$. The event ${\displaystyle A\cap B}$ corresponds to the selection of a good screw produced by the master. In order to calculate ${\displaystyle P(A\cap B)}$, we divide the number of screws with this property by the total number of screws: ${\displaystyle P(A\cap B)=1162/2000}$. The probability ${\displaystyle P(B)}$ can be calculated as the ratio of the number of screws produced by the master to the total production: ${\displaystyle P(B)=1400/2000}$. Thus we obtain ${\displaystyle P(A|B)=1162/1400=0.83\,.}$

We now want to show that for any pair of independent events ${\displaystyle A}$ and ${\displaystyle B}$ (with ${\displaystyle P(B)>0}$) we have ${\displaystyle P(A)=P(A|B)}$. Assume that the events ${\displaystyle A}$ and ${\displaystyle B}$ are independent. Then we have ${\displaystyle P(A|B)={\frac {P(A\cap B)}{P(B)}}={\frac {P(A)\,P(B)}{P(B)}}=P(A).}$
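The screw computation above can be reproduced directly from the production counts:

```python
from fractions import Fraction

# Production data from the example: B = "made by master", A = "screw is good".
total = 2000
master_total = 1400
master_good = 1162

P_A_and_B = Fraction(master_good, total)   # P(A ∩ B) = 1162/2000
P_B = Fraction(master_total, total)        # P(B)     = 1400/2000

P_A_given_B = P_A_and_B / P_B              # = 1162/1400
print(float(P_A_given_B))                  # 0.83
```

Note that the common denominator 2000 cancels, so the result is simply the proportion of good screws among the master's output.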

Similarly, we can show that ${\displaystyle P(B|A)=P(B)}$. Next suppose that ${\displaystyle P(A)=P(A|B)}$; we want to show that this implies the multiplication rule, i.e., that ${\displaystyle A}$ and ${\displaystyle B}$ are independent: ${\displaystyle {\begin{aligned}P(A|B)&={\frac {P(A\cap B)}{P(B)}}=P(A)\\P(A\cap B)&=P(A)\cdot P(B)\end{aligned}}}$ Indeed, stochastic independence can be defined equivalently in a number of ways.
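One further equivalence, used earlier for the contingency table, is that independence carries over to complements. The claim follows in one short derivation from the definition:

```latex
% If P(A \cap B) = P(A) P(B), then A and \bar{B} are also independent:
\begin{aligned}
P(A\cap \overline{B}) &= P(A) - P(A\cap B) \\
                      &= P(A) - P(A)\,P(B) \\
                      &= P(A)\bigl(1 - P(B)\bigr) \\
                      &= P(A)\,P(\overline{B})
\end{aligned}
```

Applying the same argument again shows that ${\displaystyle {\overline {A}}}$ and ${\displaystyle {\overline {B}}}$ are independent as well, which is exactly why checking one cell of a ${\displaystyle (2\times 2)}$-table suffices.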