Explain and perform calculations concerning joint probability functions and cumulative distribution functions for discrete random variables only

Discrete Joint Probability Distributions

In probability and statistics, many experiments involve two or more random quantities that are observed at the same time. For example:

  • An experimenter tossing a fair die twice is interested in the event of getting, say, a 5 on the first toss and a 6 on the second.
  • A researcher records both the height and the weight of each person in a sample.

These quantities may be related to each other, so we need a way to describe how they behave together. A sound understanding of joint probability functions, probability density functions, and cumulative distribution functions is therefore essential, and these concepts are a key focus area for SOA Exam P.

Definition

Let \(X\) and \(Y\) be two discrete random variables defined on a two-dimensional discrete space, \(S\).

The joint probability mass function of \(X\) and \(Y\) is defined as:

$$ f\left(x, y\right)= P \left(X = x, Y = y\right) $$

In other words, \(f\left(x, y\right)\) is the probability that the random variable \(X\) takes the value \(x\) and the random variable \(Y\) takes the value \(y\) simultaneously.

Properties of a Joint Discrete Distribution

  1. \(0\le f\left(x,y\right)\le 1\): This property states that the probability assigned to any possible outcome in the joint distribution must always be a value between 0 and 1.
  2. \(\sum_{x}\sum_{y}f\left(x,y\right)=1\), where the sum runs over all \(\left(x,y\right)\in S\): This condition requires the probabilities over the entire space \(S\) to sum to 1.
  3. \( P\left[\left(X, Y\right)\in A\right]=\sum\sum_{\left(x,y\right)\in A}f\left(x,y\right)\), where \(A\) is a subset of the space \(S\): This property states that to find the probability of the event \(A\), you sum the joint probabilities \(f\left(x, y\right)\) over all pairs \(\left(x, y\right)\) in \(A\).

Conditional Probabilities under the Joint Distribution

The conditional probability function of \(X\), given that \(Y = y\), is given by:

$$ P\left(X=x\middle| Y=y\right)=p\left(x\middle| y\right)=\frac{p\left(x,y\right)}{p_Y\left(y\right)} $$

Example 1: Discrete Conditional Distribution

An insurance company collects data on accidents in different regions. The table below gives the joint probabilities of the region and the number of accidents, \(X\):

$$ \begin{array}{c|c|c|c}
\text{Region} & \bf{X=0} & \bf{X=1} & \bf{X=2} \\ \hline
A & 0.23 & 0.10 & 0.13 \\ \hline
B & 0.10 & 0.15 & 0.02 \\ \hline
C & 0.01 & 0.18 & 0.08
\end{array} $$

Find the conditional distribution of \(X|\text{Region } A\).

Solution:

To determine the conditional distribution of \(X\) given Region A, denoted as \(P(X=x | \text{Region } A)\), we will use the conditional probability formula:

$$ P(X=x |\text{Region } A)= \frac{P\left(X=x \text{ and Region } A\right)}{P\left(\text{Region } A\right)} $$

First, we need to find \(P(\text{Region } A)\), which we can directly calculate from the table:

$$ P\left(\text{Region } A\right)=0.23+0.10+0.13=0.46 $$

Therefore, we can now calculate the conditional probabilities for each value, \(x\):

$$ \begin{align*}
P\left(X=0\middle| \text{Region } A\right) & = \frac{0.23}{0.46} = 0.5 \\
P\left(X=1\middle| \text{Region } A\right) & = \frac{0.10}{0.46} =0.2174 \\
P(X=2 |\text{Region } A) & =\frac{0.13}{0.46}= 0.2826 \end{align*} $$

So, the conditional distribution of \(X\) given Region \(A\) is as follows:

$$ \begin{array}{c|c}
X|\text{Region } A & P(X|\text{Region } A) \\ \hline
0 & 0.5 \\ \hline
1 & 0.2174 \\ \hline
2 & 0.2826
\end{array} $$

Note that,

$$ \text{Conditional probability}=\frac{\text{Joint probability}}{\text{Marginal probability}} $$

We will discuss this in detail in the next reading.
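
The calculation in Example 1 can be reproduced with a short script. This is a minimal sketch: the dictionary layout and the function name `conditional_given_region` are illustrative choices, not part of the original example.

```python
# Joint probabilities P(Region = r, X = x), copied from the table in Example 1.
joint = {
    "A": {0: 0.23, 1: 0.10, 2: 0.13},
    "B": {0: 0.10, 1: 0.15, 2: 0.02},
    "C": {0: 0.01, 1: 0.18, 2: 0.08},
}


def conditional_given_region(joint, region):
    """Return P(X = x | Region = region) as a dict keyed by x."""
    marginal = sum(joint[region].values())  # P(Region = region)
    # Conditional probability = joint probability / marginal probability.
    return {x: p / marginal for x, p in joint[region].items()}


cond_a = conditional_given_region(joint, "A")
for x, p in cond_a.items():
    print(x, round(p, 4))
```

Passing `"B"` or `"C"` instead of `"A"` gives the conditional distribution for the other regions; in each case the conditional probabilities sum to 1.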

Example 2: Discrete Joint Distribution

Suppose a certain local bank had three deposit or withdrawal counters. Two investors arrive at the counters at different times when the counters are serving no other customers. Each investor chooses a counter at random, independently of the other.

Let \(X\) be the number of investors who select counter 1, and let \(Y\) be the number of investors who select counter 2.

  1. Determine the joint probability function of \(X\) and \(Y\).
  2. Find \(P \left(X=2, Y=0 \text{ or } 1\right)\).
  3. Find \(P\left(Y=2\right)\).

Solution

  1. First, we have to consider the sample space associated with the experiment.

    Let the pair \(\left\{i,j\right\}\) represent the simple event that the first investor selects counter \(i\) and the second investor selects counter \(j\), where \(i, j=1, 2, 3\).

    By using the mn rule, the sample space consists of:

    $$ 3\times3=9 \text{ sample points} $$

    Since each investor chooses a counter at random, the sample points are equally likely, and each has a probability of \(\frac{1}{9}\).

    Thus, the sample space for the experiment is given as:

    $$ S=\left[\left\{1,1\right\},\left\{1,2\right\},\left\{1,3\right\},\left\{2,1\right\},\left\{2,2\right\},\left\{2,3\right\},\left\{3,1\right\},\left\{3,2\right\},\left\{3,3\right\}\right] $$

    We know that:

    $$ f \left(x, y\right)= P \left(X =x, Y = y\right) $$

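    Counting the sample points that produce each pair \(\left(x, y\right)\) gives the joint probability function. For example, only \(\left\{3,3\right\}\) gives \(\left(x,y\right)=\left(0,0\right)\), so \(f\left(0,0\right)=\frac{1}{9}\), while both \(\left\{2,3\right\}\) and \(\left\{3,2\right\}\) give \(\left(x,y\right)=\left(0,1\right)\), so \(f\left(0,1\right)=\frac{2}{9}\). The full joint probability function of \(X\) and \(Y\) is:

    $$ \begin{array}{c|c|c|c} {\begin{matrix} X \\ \huge{\diagdown} \\ Y \end{matrix}} & {0} & {1} & {2} \\ \hline 0 & \frac{1}{9} & \frac{2}{9} & \frac{1}{9} \\ \hline 1 & \frac{2}{9} & \frac{2}{9} & 0 \\ \hline 2 & \frac{1}{9} & 0 & 0 \end{array} $$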

  2. We know that,

    $$ f \left(x, y\right)= P \left(X = x, Y = y\right) $$

    Thus,

    $$ \begin{align*} P\left(X=2, Y=0 \text{ or } 1\right) & =P\left(X=2, Y=0\right)+P\left(X=2, Y=1\right) \\
    & =\frac{1}{9}+0=\frac{1}{9} \end{align*} $$

  3. We are required to find \(P\left(Y=2\right)\), and since it does not depend on the value of \(X\), it is the same as finding \(P\left(Y=2, X=0,1,2\right)\). That is, we are summing over all the possible values of \(X\).

    Thus,

    $$ P\left(Y=2\right)=\frac{1}{9}+0+0=\frac{1}{9} $$
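
The three parts of Example 2 can be checked by enumerating the nine sample points directly. This is a minimal sketch using exact fractions; the variable names (`outcomes`, `joint`, `p_part2`, `p_part3`) are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

counters = (1, 2, 3)
# All ordered pairs (i, j): the first investor picks i, the second picks j.
outcomes = list(product(counters, repeat=2))
weight = Fraction(1, len(outcomes))  # each of the 9 sample points has probability 1/9

joint = Counter()
for i, j in outcomes:
    x = (i == 1) + (j == 1)  # number of investors who select counter 1
    y = (i == 2) + (j == 2)  # number of investors who select counter 2
    joint[(x, y)] += weight

# Part 2: P(X = 2, Y = 0 or 1). Pairs that never occur have probability 0.
p_part2 = joint[(2, 0)] + joint[(2, 1)]

# Part 3: P(Y = 2), summing over every value of X.
p_part3 = sum(p for (x, y), p in joint.items() if y == 2)

print(dict(joint))
print(p_part2, p_part3)
```

Building the table by enumeration rather than by hand makes it easy to confirm that the joint probabilities sum to 1.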

Example 3: Discrete Joint Distribution

An insurance company collects data on the number of claims made by male and female policyholders. Let \(X\) be the number of claims from males, and \(Y\) be the number of claims from females. \(X\) and \(Y\) have the following joint probability distribution:

$$ f\left(x,y\right)=\frac{y}{9x},\ \ \ \text{for } x=1, 2;\ y=1, 2, 3 $$

Calculate \(P\left(X+\frac{Y}{2}=2\right)\).

Solution

We first determine the pairs \(\left(x,y\right)\) which satisfy the condition that \(x+\frac{y}{2}=2\).

\(x+\frac{y}{2}=2\) only for the pair (1, 2).

Now, we can proceed to calculate the required probability:

$$ P\left(X+\frac{Y}{2}=2\right)=\frac{2}{9\times1}=\frac{2}{9} $$
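
When the joint probability function is given as a formula, the probability of an event is the sum of \(f\left(x,y\right)\) over the pairs in that event. The sketch below applies this to Example 3; the helper `event_probability` is a hypothetical name, and the condition \(x+\frac{y}{2}=2\) is rewritten as \(2x+y=4\) to keep the arithmetic in integers.

```python
from fractions import Fraction


def f(x, y):
    """Joint pmf from Example 3: f(x, y) = y / (9x)."""
    return Fraction(y, 9 * x)


# Support stated in the example: x = 1, 2 and y = 1, 2, 3.
support = [(x, y) for x in (1, 2) for y in (1, 2, 3)]


def event_probability(predicate):
    """Sum f(x, y) over the support points where predicate(x, y) is true."""
    return sum(f(x, y) for x, y in support if predicate(x, y))


total = event_probability(lambda x, y: True)             # should equal 1
answer = event_probability(lambda x, y: 2 * x + y == 4)  # P(X + Y/2 = 2)

print(total, answer)
```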

Example 4: Discrete Joint Distribution

An analyst is concerned about the annual number of tsunamis in two countries, \(M\) and \(N\).

Let \(X\) and \(Y\) be the annual number of tsunamis in countries \(M\) and \(N\), respectively.

The analyst determines that \(X\) and \(Y\) are jointly distributed as below:

$$ f\left(x,y\right)=\frac{xy}{10},\ \ \ \text{for } \ x=0, 1;y=0, 1, 2, 3, 4 $$

Calculate \(P\left(X+Y \lt 3\right)\).

Solution

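The condition \(x+y \lt 3\) holds for the pairs \(\left(0,0\right)\), \(\left(0,1\right)\), \(\left(0,2\right)\), \(\left(1,0\right)\), and \(\left(1,1\right)\). Of these, only \(\left(1,1\right)\) has a non-zero probability, since \(f\left(x,y\right)=0\) whenever \(x=0\) or \(y=0\).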

Therefore,

$$ \begin{align*} P\left(X+Y \lt 3\right) & =\frac{0}{10}+\frac{0}{10}+\frac{0}{10}+\frac{0}{10}+\frac{1}{10} \\
& =\frac{1}{10} \end{align*} $$

Joint Discrete Cumulative Distribution Functions

Definition

The joint cumulative distribution function, \(F_{XY}\left(x,y\right)\) of two discrete random variables, \(X\) and \(Y\), is defined as the probability that the random variable \(X\) is less than or equal to a specified value of \(x\) and that the random variable \(Y\) is less than or equal to a specified value of \(y\), namely,

$$ F_{XY}\left(x,y\right)=P\left(X\le x, Y\le y\right) $$

Now, consider \(n\) discrete random variables, \(X_1, X_2,\ldots, X_n\), with joint probability function \(f\). The joint cumulative distribution function of \(X_1, X_2,\ldots, X_n\) is given by:

$$ F\left(x_1, x_2,\ldots, x_n\right)=\sum_{w_1\le x_1}\sum_{w_2\le x_2}{\ldots\sum_{w_n\le x_n}f\left(w_1, w_2, \ldots, w_n\right)} $$

The following result holds for two random variables, \(X\) and \(Y\):

$$ P\left(x_1 \lt X\le x_2, {y}_1 \lt Y\le y_2\right)= F\left(x_2,y_2\right)+ F\left(x_1, y_1\right)- F\left(x_1, y_2\right)- F\left(x_2,y_1\right) $$

The above result holds for any \(x_1\lt x_2\) and \(y_1 \lt y_2\).

To prove the above result, suppose we have two discrete random variables, \(X\) and \(Y\), and we wish to find the probability that \(x_1 \lt X\le x_2\) and \(y_1 \lt Y\le y_2\). This can be expressed as:

$$ P\left(x_1 \lt X\le x_2, { y}_1 \lt Y\le y_2\right) $$

Now, the event \(\left\{X\le x_2, Y\le y_2\right\}\) can be broken into four mutually exclusive cases, depending on whether \(X\) is less than or equal to \(x_1\) and whether \(Y\) is less than or equal to \(y_1\):

$$ \begin{align*} P\left(X\le x_2, Y\le y_2\right) = & P\left(X\le x_1, Y\le y_1\right)+ P\left(X\le x_1, y_1 \lt Y\le y_2\right) \\ + & P\left(x_1 \lt X\le x_2, Y \le y_1\right)+ P\left(x_1 \lt X\le x_2, y_1 \lt Y\le y_2\right) \end{align*} $$

From the definition of the joint cumulative distribution function,

$$ P\left(X \le x_2, Y \le y_2\right)= F\left(x_2, y_2\right) $$

$$ P\left(X \le x_1, Y \le y_1\right)= F\left(x_1, y_1\right) $$

$$ P\left(X\le x_1, y_1 \lt Y\le y_2\right)= F\left(x_1, y_2\right)- F\left(x_1, y_1\right) $$

$$ P\left(x_1 \lt X\le x_2, Y\le y_1\right)= F\left(x_2, y_1\right)- F\left(x_1, y_1\right) $$

Substituting these into the equation above and solving for the last term, we get:

$$ \begin{align*} P\left(x_1 \lt X\le x_2, { y}_1 \lt Y\le y_2\right) & = F\left(x_2,y_2\right)- F\left(x_1, y_1\right)-\left[F\left(x_1, y_2\right)- F\left(x_1, y_1\right)\right]-\left[F\left(x_2, y_1\right)- F\left(x_1, y_1\right)\right] \\ & = F\left(x_2,y_2\right)+ F\left(x_1, y_1\right)- F\left(x_1, y_2\right)- F\left(x_2,y_1\right) \end{align*} $$
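
The rectangle formula can be checked numerically against a direct sum of the probability function. The sketch below does this for the joint distribution from Example 2; the function names `cdf`, `rectangle_direct`, and `rectangle_from_cdf` are illustrative, and the choice \(x_1=0\), \(x_2=2\), \(y_1=0\), \(y_2=1\) is arbitrary.

```python
from fractions import Fraction

# Joint pmf from Example 2, keyed by (x, y); pairs not listed have probability 0.
pmf = {
    (0, 0): Fraction(1, 9), (0, 1): Fraction(2, 9), (0, 2): Fraction(1, 9),
    (1, 0): Fraction(2, 9), (1, 1): Fraction(2, 9),
    (2, 0): Fraction(1, 9),
}


def cdf(x, y):
    """F(x, y) = P(X <= x, Y <= y)."""
    return sum(p for (a, b), p in pmf.items() if a <= x and b <= y)


def rectangle_direct(x1, x2, y1, y2):
    """P(x1 < X <= x2, y1 < Y <= y2), summed directly from the pmf."""
    return sum(p for (a, b), p in pmf.items() if x1 < a <= x2 and y1 < b <= y2)


def rectangle_from_cdf(x1, x2, y1, y2):
    """The same probability, computed from four values of the joint CDF."""
    return cdf(x2, y2) + cdf(x1, y1) - cdf(x1, y2) - cdf(x2, y1)


direct = rectangle_direct(0, 2, 0, 1)
via_cdf = rectangle_from_cdf(0, 2, 0, 1)
print(direct, via_cdf)
```

Both routes give the same value, as the result requires for any \(x_1 \lt x_2\) and \(y_1 \lt y_2\).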

Example 5: Joint Discrete Cumulative Distribution Function

An actuary is conducting an analysis of the number of days of sickness and the number of medical appointments for a group of policyholders. Let \(X\) be the random variable representing the number of days of sickness, and \(Y\) be the random variable representing the number of medical appointments.

The joint probability mass function (pmf) for \(X\) and \(Y\) is given in the table below:

$$ \begin{array}{c|c|c|c} {\begin{matrix} X \\ \huge{\diagdown} \\ Y \end{matrix}} & {0} & {1} & {2} \\ \hline 0 & \frac{1}{8} & \frac{1}{6} & \frac{1}{4} \\ \hline 1 & \frac{1}{6} & \frac{1}{8} & \frac{1}{6} \end{array} $$

Find \(F_{XY}\left(0.5,1\right)\)

Solution

By definition of a joint cumulative distribution function,

$$ \begin{align*} F_{XY}\left(0.5,1\right) & =P\left(X\le 0.5, Y\le 1\right) \\
& =P_{XY}\left(0,0\right)+P_{XY}\left(0,1\right)=\frac{1}{8}+\frac{1}{6}=\frac{7}{24} \\
\therefore F_{XY}\left(0.5,1\right) & =\frac{7}{24} \end{align*} $$
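
A joint CDF of discrete variables is a step function, so it can be evaluated at non-integer points such as \(x=0.5\). The sketch below evaluates it for Example 5, assuming the rows of the table index \(X\) and the columns index \(Y\); the name `joint_cdf` is illustrative.

```python
from fractions import Fraction

# pmf from Example 5, keyed by (x, y).
# Assumption: table rows are values of X (0, 1), columns are values of Y (0, 1, 2).
pmf = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(1, 6), (0, 2): Fraction(1, 4),
    (1, 0): Fraction(1, 6), (1, 1): Fraction(1, 8), (1, 2): Fraction(1, 6),
}


def joint_cdf(x, y):
    """F_XY(x, y) = P(X <= x, Y <= y); x and y need not be integers."""
    return sum(p for (a, b), p in pmf.items() if a <= x and b <= y)


value = joint_cdf(0.5, 1)
print(value)
```

Because \(X\) takes no value between 0 and 0.5, `joint_cdf(0.5, 1)` equals `joint_cdf(0, 1)`.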

Example 6: Joint Discrete Cumulative Distribution Function

An insurance company operates in two neighboring cities, \(A\) and \(B\). In June 2022, they collected data on the number of road accidents in each city. Let \(X\) represent the number of accidents in city \(A\), and \(Y\) represent the number of accidents in city \(B\). \(X\) and \(Y\) have the following joint cumulative distribution function:

$$ F\left(x,y\right)=\left({0.8}^x\right)\left({0.2}^y\right), \text{ for } x=0, 1, 2\ldots \text{ and } y=0, 1,2\ldots $$

Find the probability that, in June 2022, there are exactly 3 accidents in city \(A\) and exactly 3 accidents in city \(B\).

Solution:

We wish to find \(P\left(X=3, Y=3\right)\).

We know that,

$$ F\left(x,y\right)=P\left(X\le x, Y\le y\right) $$

We also know that,

$$ P\left(x_1 \lt X\le x_2, { y}_1 \lt Y\le y_2\right)= F\left(x_2,y_2\right)+ F\left(x_1, y_1\right)- F\left(x_1, y_2\right)- F\left(x_2,y_1\right) $$

Since \(X\) and \(Y\) take integer values, the event \(\left\{X=3, Y=3\right\}\) is the same as \(\left\{2 \lt X\le 3, 2 \lt Y\le 3\right\}\). Setting \(x_1=y_1=2\) and \(x_2=y_2=3\):

$$ \begin{align*}
P\left(X=3, Y=3\right) & =F\left(3,3\right)+F\left(2,2\right)-F\left(2,3\right)-F\left(3,2\right)\\
& =\left({0.8}^3\right)\left({0.2}^3\right)+\left({0.8}^2\right)\left({0.2}^2\right)-\left({0.8}^2\right)\left({0.2}^3\right)-\left({0.8}^3\right)\left({0.2}^2\right) \\
& =0.004096 \end{align*} $$
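
The arithmetic above can be checked with exact fractions. This is a minimal sketch that takes \(F\left(x,y\right)\) exactly as stated in Example 6; the function names are illustrative.

```python
from fractions import Fraction


def F(x, y):
    """F(x, y) = 0.8^x * 0.2^y, exactly as stated in Example 6."""
    return Fraction(4, 5) ** x * Fraction(1, 5) ** y


def point_probability(x, y):
    """P(X = x, Y = y) from the rectangle formula with x1 = x - 1 and y1 = y - 1."""
    return F(x, y) + F(x - 1, y - 1) - F(x - 1, y) - F(x, y - 1)


result = point_probability(3, 3)
print(result, float(result))
```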

Question

A clinical trial is testing a new medication that either improves a patient’s condition (represented by 1) or has no effect (represented by 0). Let \(X\) represent the actual effect of the medication on a patient, and let \(Y\) represent the observed effect as reported by the patient. The joint probability function of \(X\) and \(Y\) is given by:

  • \(P[X = 0, Y = 0] = 0.700\)
  • \(P[X = 1, Y = 0] = 0.100\)
  • \(P[X = 0, Y = 1] = 0.050\)
  • \(P[X = 1, Y = 1] = 0.150\)

Calculate the variance of the observed effect given that the actual effect is positive, \(\text{Var}\left(Y\middle| X=1\right)\).

  A. 0.12
  B. 0.21
  C. 0.24
  D. 0.35
  E. 0.42

Solution

The correct answer is C.

First, we need to calculate the conditional probabilities \(P\left(Y=0\middle| X=1\right)\) and \(P\left(Y=1\middle| X=1\right)\)

The conditional probability \(P(Y=0|X=1)\) is calculated as:

$$ P\left(Y=0\middle| X=1\right)=\frac{P\left(X=1,Y=0\right)}{P\left(X=1\right)} $$

We know that \(P\left(X=1\right)=P\left(X=1,Y=0\right)+P\left(X=1,Y=1\right)=0.100+0.150=0.250\)

Now, we calculate \(P(Y=0|X=1)\):

$$ P\left(Y=0\middle| X=1\right)=\frac{0.100}{0.250}=0.4 $$

Similarly, we calculate \(P\left(Y=1\middle| X=1\right)\):

$$ P\left(Y=1\middle| X=1\right)=\frac{P\left(X=1,Y=1\right)}{P\left(X=1\right)}=\frac{0.150}{0.250}=0.6 $$

Given \(X=1\), \(Y\) takes only the values 0 and 1, so it is a Bernoulli random variable, and its conditional variance is:

$$ \text{Var}\left(Y\middle| X=1\right)=p\left(1-p\right) $$

where \(p=P\left(Y=1\middle| X=1\right)\). Substituting the calculated value of \(p\):

$$ \text{Var}\left(Y\middle| X=1\right)=0.600\left(1-0.600\right)=0.600\left(0.400\right)=0.240 $$

Note: A Bernoulli random variable is a discrete random variable that takes the value 1 with probability \(p\) and the value 0 with probability \(1-p\). Here, the observed effect \(Y\) fits this description because each patient's report has only two possible outcomes.
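
The same answer follows from the general definition of variance, without using the Bernoulli shortcut. The sketch below computes the conditional distribution of \(Y\) given \(X=1\) and then its variance; the function name `conditional_variance_of_y` is illustrative.

```python
# Joint probabilities from the question, keyed by (x, y).
joint = {(0, 0): 0.700, (1, 0): 0.100, (0, 1): 0.050, (1, 1): 0.150}


def conditional_variance_of_y(joint, x_value):
    """Var(Y | X = x_value), computed as E[Y^2 | X] - (E[Y | X])^2."""
    # Marginal probability P(X = x_value).
    p_x = sum(p for (x, _), p in joint.items() if x == x_value)
    # Conditional pmf of Y given X = x_value.
    cond = {y: p / p_x for (x, y), p in joint.items() if x == x_value}
    mean = sum(y * p for y, p in cond.items())
    second_moment = sum(y ** 2 * p for y, p in cond.items())
    return second_moment - mean ** 2


variance = conditional_variance_of_y(joint, 1)
print(round(variance, 3))
```

Calling the function with `x_value=0` gives the variance of the observed effect when the medication has no actual effect.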

Learning Outcome

Topic 3. a: Multivariate Random Variables – Explain and perform calculations concerning joint probability functions, probability density functions, and cumulative distribution function.
