In the previous readings, we introduced the concept of conditional distribution functions for random variable \(X\) given \(Y=y\) and the conditional distribution of \(Y\) given \(X=x\). We defined the conditional distribution function of \(X\), given that \(Y=y\) as:
$$ g\left(x\middle|y\right)=\frac{f\left(x,y\right)}{f_Y\left(y\right)},\ \ \ \text{ provided that } f_Y\left(y\right) \gt 0 $$
Similarly, the conditional distribution function of \(Y\), given that \(X=x\), is defined as:
$$ h\left(y\middle|x\right)=\frac{f\left(x,y\right)}{f_X\left(x\right)},\ \ \ \ \text{ provided that } f_X\left(x\right) \gt 0 $$
Building upon this foundation, we can now extend our discussion into the computation of conditional variance and conditional standard deviation.
The conditional variance of \(X\), given that \(Y=y\), is defined by:
$$ Var(X|Y=y)=E\left(X^2|Y=y\right)-\left[E(X|Y=y)\right]^2 $$
Where:
$$ \begin{align*}
E\left(X^2|Y=y\right) & =\sum_{x}{x^2.g(x|Y=y)} \\
E\left(X|Y=y\right) & =\sum_{x}{x.g(x|Y=y)}
\end{align*} $$
Note that this is analogous to the variance of a single random variable.
Furthermore, since the standard deviation is simply the square root of variance, we can define the conditional standard deviation of \(X\), given that \(Y=y\) as:
$$ \sigma_{X|Y=y}=\sqrt{Var\left(X\middle|Y=y\right)} $$
Similarly, the conditional variance of \(Y\) given \(X=x\) is defined by:
$$ Var(Y|X=x)=E\left(Y^2|X=x\right)-\left[E(Y|X=x)\right]^2 $$
Where
$$ E\left(Y^2|X=x\right)=\sum_{y}{y^2\times h(y|X=x)} $$
and
$$ E(Y|X=x)=\sum_{y}{y\times h(y|X=x)} $$
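The definitions above translate almost line for line into code. The following is a minimal sketch, assuming the joint pmf is stored as a dictionary keyed by `(x, y)` pairs; the function name `conditional_variance` and the small `toy_joint` table are hypothetical and serve only to exercise the formula.

```python
from fractions import Fraction

def conditional_variance(joint, y):
    """Var(X | Y = y) for a joint pmf given as {(x, y): probability}."""
    # Marginal f_Y(y): sum the joint pmf over all x for the fixed y.
    f_y = sum(p for (_, yy), p in joint.items() if yy == y)
    if f_y == 0:
        raise ValueError("conditioning event has zero probability")
    # Conditional pmf g(x | y) = f(x, y) / f_Y(y).
    g = {x: p / f_y for (x, yy), p in joint.items() if yy == y}
    mean = sum(x * p for x, p in g.items())              # E(X | Y = y)
    second_moment = sum(x**2 * p for x, p in g.items())  # E(X^2 | Y = y)
    return second_moment - mean**2

# Hypothetical joint pmf, used only to exercise the function.
toy_joint = {
    (0, 0): Fraction(1, 4), (1, 0): Fraction(1, 4),
    (0, 1): Fraction(1, 8), (1, 1): Fraction(3, 8),
}
print(conditional_variance(toy_joint, 1))  # 3/16
```

Swapping the roles of the two coordinates gives \(Var(Y|X=x)\) in the same way.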
Let \(X\) and \(Y\) be the numbers of days of hospitalization for two individuals, \(P\) and \(Q\), with joint probability mass function:
$$ f\left(x,y\right)=\frac{x+y}{21},\ \ \ x=1,2,3\ \ \ y=1,2 $$
Calculate \(Var(X|Y=1)\).
Solution
$$ g\left(x\middle|y\right)=\frac{f\left(x,y\right)}{f_Y\left(y\right)} $$
Where,
$$ \begin{align*}
f_Y\left(y\right) & =\sum_{\text{all } x}{f\left(x,y\right)}=P\left(Y=y\right),\ \ \ y\in S_y \\
\Rightarrow f_Y\left(y\right) & =\frac{1+y}{21}+\frac{2+y}{21}+\frac{3+y}{21}=\frac{6+3y}{21} \end{align*} $$
Therefore, we have,
$$ g\left(x\middle|y\right)=\frac{f\left(x,y\right)}{f_Y\left(y\right)}=\frac{\frac{x+y}{21}}{\frac{6+3y}{21}}=\frac{x+y}{3y+6} $$
We can now go ahead to find the conditional variance:
$$ Var\left(X\middle|Y=y\right)=\sigma_{X|Y=y}^2 =E\left(X^2|Y=y\right)-\left[E(X|Y=y)\right]^2 $$
We need:
$$ Var(X|Y=1)=E\left(X^2|Y=1\right)-\left[E(X|Y=1)\right]^2 $$
Now,
$$ \begin{align*}
E\left(X\middle|Y=1\right) & =\sum_{x=1}^{3}{x.g\left(x\middle|y=1\right)} \\
& =\sum_{x=1}^{3}{x\frac{\left(x+\left(1\right)\right)}{3\left(1\right)+6}} \\
& =\left(1\right)\frac{\left(1+\left(1\right)\right)}{3\left(1\right)+6}+2\frac{\left(2+\left(1\right)\right)}{3\left(1\right)+6}+3\frac{\left(3+\left(1\right)\right)}{3\left(1\right)+6} \\
& =1\left(\frac{2}{9}\right)+2\left(\frac{1}{3}\right)+3\left(\frac{4}{9}\right)=\frac{20}{9}
\end{align*} $$
We also need,
$$ \begin{align*}
E\left(X^2|Y=1\right) & =\sum_{x=1}^{3}{x^2.g\left(x\middle|y=1\right)}\\
& =\sum_{x=1}^{3}{x^2\frac{\left(x+\left(1\right)\right)}{3\left(1\right)+6}} \\
& =\left(1^2\right)\frac{\left(1+\left(1\right)\right)}{3\left(1\right)+6}+2^2\frac{\left(2+\left(1\right)\right)}{3\left(1\right)+6}+3^2\frac{\left(3+\left(1\right)\right)}{3\left(1\right)+6} \\
& =1\left(\frac{2}{9}\right)+4\left(\frac{1}{3}\right)+9\left(\frac{4}{9}\right)=\frac{50}{9} \\
\Rightarrow Var\left(X\middle|Y=1\right) &=E\left(X^2|Y=1\right)-\left[E\left(X\middle|Y=1\right)\right]^2 \\
& =\frac{50}{9}-\left(\frac{20}{9}\right)^2=\frac{50}{81}\approx 0.6173
\end{align*} $$
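As a quick check on the arithmetic, the conditional pmf \(g(x|y)=\frac{x+y}{3y+6}\) derived above can be evaluated with exact fractions. This is only an illustrative sketch; the helper name `g` mirrors the notation in the text, and the other variable names are arbitrary.

```python
from fractions import Fraction

# Conditional pmf derived in the solution: g(x | y) = (x + y) / (3y + 6).
def g(x, y):
    return Fraction(x + y, 3 * y + 6)

xs = [1, 2, 3]
y = 1
total = sum(g(x, y) for x in xs)            # a valid pmf sums to 1
mean = sum(x * g(x, y) for x in xs)         # E(X | Y = 1)
second = sum(x**2 * g(x, y) for x in xs)    # E(X^2 | Y = 1)
variance = second - mean**2                 # Var(X | Y = 1)
print(total, mean, second, variance)        # 1 20/9 50/9 50/81
```

Checking that the conditional probabilities sum to 1 is a cheap way to catch a wrong marginal before computing any moments.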
The annual numbers of accidents in two regions, Region \(A\) and Region \(B\), are jointly distributed as follows:
$$ \begin{array}{c|c|cccc}
& & \text{Number of} & \text{accidents in} & \text{Region B} \\ \hline
& & 0 & 1 & 2 & 3 \\ \hline
\text{Number of} & 0 & 0.13 & 0.06 & 0.01 & 0.06 \\
\text{accidents in} & 1 & 0.12 & 0.15 & 0.12 & 0.02 \\
\text{Region A} & 2 & 0.04 & 0.16 & 0.09 & 0.04
\end{array} $$
Calculate the conditional variance of the annual number of accidents in Region B, given that there are no accidents in Region \(A\).
Solution:
First, we need to find the conditional distribution for the annual number of accidents in Region \(B\), given that there are no accidents in Region \(A\).
Let the number of accidents in Region \(A\) be \(X\) and in Region \(B\) be \(Y\).
We wish to find \(Var(Y|X=0)\).
We know that \(h\left(y\middle|x\right)=\frac{f\left(x,y\right)}{f_X\left(x\right)},\ \ \ \text{ provided that } f_X\left(x\right) \gt 0\)
$$ \Rightarrow P\left(Y=y\middle| X=0\right)=\frac{P(X=0, Y=y)}{P(X=0)} $$
But,
$$ P\left(X=0\right)=0.13+0.06+0.01+0.06=0.26 $$
Now,
$$ \begin{align*}
P\left(Y=0\middle| X=0\right) & =\frac{P(X=0,\ Y=0)}{P(X=0)}=\frac{0.13}{0.26}=\frac{1}{2} \\
P\left(Y=1\middle| X=0\right) & =\frac{P(X=0,\ Y=1)}{P(X=0)}=\frac{0.06}{0.26}=\frac{3}{13} \\
P\left(Y=2\middle| X=0\right) & =\frac{P(X=0,\ Y=2)}{P(X=0)}=\frac{0.01}{0.26}=\frac{1}{26} \\
P\left(Y=3\middle| X=0\right) & =\frac{P(X=0,\ Y=3)}{P(X=0)}=\frac{0.06}{0.26}=\frac{3}{13}
\end{align*} $$
$$ \begin{array}{c|c|c|c|c}
Y|X=0 & 0 & 1 & 2 & 3 \\ \hline
P\left(Y=y\middle| X=0\right) & \frac{1}{2} & \frac{3}{13} & \frac{1}{26} & \frac{3}{13}
\end{array} $$
Now,
$$ \begin{align*}
E(Y|X=0) & =\sum_{y}{y\cdot P(Y=y|X=0)} \\
\Rightarrow E\left(Y\middle|X=0\right) & =0\times \frac{1}{2}+1\times \frac{3}{13}+2\times \frac{1}{26}+3\times \frac{3}{13}=1 \\
E\left(Y^2\middle|X=0\right) & =0^2\times \frac{1}{2}+1^2\times \frac{3}{13}+2^2\times \frac{1}{26}+3^2\times \frac{3}{13}=\frac{32}{13} \\
\Rightarrow Var\left(Y\middle|X=0\right) &=\frac{32}{13}-1^2=\frac{19}{13}\approx 1.4615
\end{align*} $$
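When the joint distribution is given as a table, conditioning on \(X=0\) amounts to taking the \(X=0\) row and dividing it by its own sum. The sketch below assumes the table is stored as a list of rows (one row per value of \(X\)); the variable names are illustrative, and the decimals are parsed as exact fractions to avoid rounding noise.

```python
from fractions import Fraction

# Joint table from the example: rows are X (Region A) = 0, 1, 2;
# columns are Y (Region B) = 0, 1, 2, 3.
table = [
    ["0.13", "0.06", "0.01", "0.06"],
    ["0.12", "0.15", "0.12", "0.02"],
    ["0.04", "0.16", "0.09", "0.04"],
]
joint = [[Fraction(p) for p in row] for row in table]

row = joint[0]                      # the X = 0 row
p_x0 = sum(row)                     # P(X = 0)
cond = [p / p_x0 for p in row]      # P(Y = y | X = 0) for y = 0, 1, 2, 3

mean = sum(y * p for y, p in enumerate(cond))       # E(Y | X = 0)
second = sum(y**2 * p for y, p in enumerate(cond))  # E(Y^2 | X = 0)
var_y_given_x0 = second - mean**2
print(mean, second, var_y_given_x0)  # 1 32/13 19/13
```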
Recall that if \(X\) and \(Y\) are discrete random variables with joint probability mass function \(f(x, y)\) defined on the space \(S\), then the marginal distribution functions of \(X\) and \(Y\) are given by:
$$ f_X\left(x\right)=\sum_{y}{f\left(x,y\right)}=P\left(X=x\right),\ \ \ x\in S_x $$
and,
$$ f_Y\left(y\right)=\sum_{x}{f\left(x,y\right)}=P\left(Y=y\right),\ \ \ y\in S_y $$
Once we have the marginal distribution functions of \(X\) and \(Y\), we can now go ahead to find individual variances and standard deviations for \(X\) and \(Y\).
The variance for the random variable \(X\) is given by:
$$ Var\left(X\right)=E\left(X^2\right)-\left[E(X)\right]^2 $$
Where
$$ E\left(X^2\right)=\sum_{x}{x^2\times P(X=x)} $$
and
$$ E\left(X\right)=\sum_{x}{x\times P(X=x)} $$
Similarly, the variance of the random variable \(Y\) is given by:
$$ Var\left(Y\right)=E\left(Y^2\right)-\left[E(Y)\right]^2 $$
Where
$$ E\left(Y^2\right)=\sum_{y}{y^2\times P(Y=y)} $$
and
$$ E\left(Y\right)=\sum_{y}{y\times P(Y=y)} $$
The standard deviation for \(X\) and \(Y\) is the square root of their respective variances.
$$ SD(X)=\sqrt{E\left(X^2\right)-\left[E(X)\right]^2} $$
and,
$$ SD(Y)=\sqrt{E\left(Y^2\right)-\left[E(Y)\right]^2} $$
The annual numbers of accidents in two regions, Region \(A\) and Region \(B\), are jointly distributed as follows:
$$ \begin{array}{c|c|cccc}
& & \text{Number of} & \text{accidents in} & \text{Region B} \\ \hline
& & 0 & 1 & 2 & 3 \\ \hline
\text{Number of} & 0 & 0.13 & 0.06 & 0.01 & 0.06 \\
\text{accidents in} & 1 & 0.12 & 0.15 & 0.12 & 0.02 \\
\text{Region A} & 2 & 0.04 & 0.16 & 0.09 & 0.04
\end{array} $$
Calculate the variance of the annual number of accidents in Region \(A\).
Solution:
Let the number of accidents in Region \(A\) be \(X\) and in Region \(B\) be \(Y\).
We wish to find \(Var(X)\). As such, we should first find the marginal distribution of \(X\) which is given by:
$$ \begin{align*}
f_X\left(x\right) & =\sum_{\text{all } y}f\left(x,y\right),\ \ x\in S_x \\
P\left(X=0\right) & =0.13+0.06+0.01+0.06=0.26 \\
P\left(X=1\right) & =0.12+0.15+0.12+0.02=0.41 \\
P\left(X=2\right) & =0.04+0.16+0.09+0.04=0.33
\end{align*} $$
$$ \begin{array}{c|c|c|c}
X & 0 & 1 & 2 \\ \hline
P(X=x) & 0.26 & 0.41 & 0.33
\end{array} $$
So that,
$$ \begin{align*}
E\left(X\right) & =0\times 0.26+1\times 0.41+2\times 0.33=1.07 \\
E\left(X^2\right) & =0^2\times 0.26+1^2\times 0.41+2^2\times 0.33=1.73 \\
\Rightarrow Var\left(X\right) &=1.73-{1.07}^2=0.5851
\end{align*} $$
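The same table layout makes the marginal calculation a matter of row sums. The following sketch assumes one row per value of \(X\), as in the example; the names `f_x` and `var_x` are illustrative only.

```python
from fractions import Fraction

# Rows are X (Region A) = 0, 1, 2; columns are Y (Region B) = 0, 1, 2, 3.
table = [
    ["0.13", "0.06", "0.01", "0.06"],
    ["0.12", "0.15", "0.12", "0.02"],
    ["0.04", "0.16", "0.09", "0.04"],
]
joint = [[Fraction(p) for p in row] for row in table]

# Marginal pmf of X: sum each row over all values of y.
f_x = [sum(row) for row in joint]
assert sum(f_x) == 1                 # sanity check on the table

mean = sum(x * p for x, p in enumerate(f_x))       # E(X)
second = sum(x**2 * p for x, p in enumerate(f_x))  # E(X^2)
var_x = second - mean**2
print([float(p) for p in f_x])       # [0.26, 0.41, 0.33]
print(float(var_x))                  # 0.5851
```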
Let \(X\) and \(Y\) be the numbers of days of sickness for two individuals, \(A\) and \(B\), with joint probability mass function:
$$ f\left(x,y\right)=\frac{x+y}{21},\ \ \ x=1,2,3\ \ \ \ \ y=1,2 $$
Calculate the variance and the standard deviation of \(X\).
Solution
We know that,
$$ Var\left(X\right)=E\left(X^2\right)-\left[E(X)\right]^2 $$
First, we find the marginal probability mass function of \(X\), which is given by:
$$ \begin{align*}
f_X\left(x\right) & =\sum_{\text{all } y}f\left(x,y\right),\ \ x\in S_x \\ & =\frac{x+1}{21}+\frac{x+2}{21}=\frac{2x+3}{21},\ \ \text{for } x=1, 2, 3 \end{align*} $$
Then,
$$ \begin{align*} E\left(X\right) & =\sum_{x=1}^{3}{xf_X\left(x\right)} =\sum_{x=1}^{3}{x\frac{2x+3}{21}} \\ & =\left(1\right)\frac{2\left(1\right)+3}{21}+\left(2\right)\frac{2\left(2\right)+3}{21}+\left(3\right)\frac{2\left(3\right)+3}{21}\\ & =1\left(\frac{5}{21}\right)+2\left(\frac{7}{21}\right)+3\left(\frac{9}{21}\right)=\frac{46}{21} \end{align*} $$
and
$$ \begin{align*} E\left(X^2\right) & =\sum_{\text{all } x}{x^2f_X\left(x\right)}\\ & =\left(1\right)^2\left(\frac{5}{21}\right)+\left(2\right)^2\left(\frac{7}{21}\right)+\left(3\right)^2\left(\frac{9}{21}\right) \\ & =\frac{114}{21}=\frac{38}{7} \end{align*} $$
Thus,
$$ Var\left(X\right)=\frac{38}{7}-\left(\frac{46}{21}\right)^2=\frac{278}{441}\approx 0.6304 $$
We know that the standard deviation of \(X\) is the square root of its variance.
Therefore,
$$ \sigma_X=\sqrt{Var\left(X\right)} =\sqrt{0.6304}\approx 0.7940 $$
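When the joint pmf is given by a formula rather than a table, the marginal can be built by summing the formula over \(y\). The sketch below does this for \(f(x,y)=\frac{x+y}{21}\) with exact fractions and then takes the square root for the standard deviation; the function and variable names are illustrative.

```python
from fractions import Fraction
from math import sqrt

# Joint pmf from the example: f(x, y) = (x + y) / 21, x = 1, 2, 3; y = 1, 2.
def f(x, y):
    return Fraction(x + y, 21)

xs, ys = [1, 2, 3], [1, 2]
# Marginal pmf of X: sum over y, which gives (2x + 3) / 21.
f_x = {x: sum(f(x, y) for y in ys) for x in xs}

mean = sum(x * p for x, p in f_x.items())       # E(X)
second = sum(x**2 * p for x, p in f_x.items())  # E(X^2)
var_x = second - mean**2
sd_x = sqrt(var_x)                              # converted to float here
print(mean, second, var_x)   # 46/21 38/7 278/441
print(f"{sd_x:.4f}")         # 0.7940
```

Keeping the variance as a fraction until the final square root avoids carrying a rounded value such as 0.6304 into the standard deviation.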
A laptop dealer specializes in two brands of laptops, HP and Lenovo. Let \(X\) be the number of HP laptops sold in a day, and let \(Y\) be the number of Lenovo laptops sold in a day. The dealer has determined that the numbers of laptops sold in a day are jointly distributed as in the table below:
$$ \begin{array}{c|c|c|c} {\quad \text X }& {1} & {2} & {3} \\ {\Huge \diagdown } & & & \\ {\text Y \quad} & & & \\ \hline
0 & \frac{1}{6} & \frac{1}{8} & \frac{1}{6} \\ \hline 1 & \frac{1}{3} & \frac{1}{12} & \frac{1}{8} \\ \end{array} $$
Calculate the variance of \(X\).
Solution
We need to find the marginal distribution function of \(X\) first:
We know that,
$$ f_X\left(x\right)=\sum_{y}{f\left(x,y\right)}=P\left(X=x\right),\ \ \ \ x\in S_x $$
Now,
$$ \begin{align*}
P\left(X=1\right) & =\frac{1}{6}+\frac{1}{3}=\frac{1}{2} \\
P\left(X=2\right) & =\frac{1}{8}+\frac{1}{12}=\frac{5}{24} \\
P\left(X=3\right) & =\frac{1}{6}+\frac{1}{8}=\frac{7}{24}
\end{align*} $$
Therefore,
$$ E\left(X\right)=\sum_{x=1}^{3}{xf_X\left(x\right)}=1\times \frac{1}{2}+2\times \frac{5}{24}+3\times \frac{7}{24}=\frac{43}{24} $$
and,
$$ E\left(X^2\right)=\sum_{x=1}^{3}{x^2f_X\left(x\right)}=1^2\times \frac{1}{2}+2^2\times \frac{5}{24}+3^2\times \frac{7}{24}=\frac{95}{24} $$
Thus,
$$ Var\left(X\right)=E\left(X^2\right)-\left[E\left(X\right)\right]^2=\frac{95}{24}-\left(\frac{43}{24}\right)^2=\frac{431}{576}\approx 0.7483 $$
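In this table \(X\) runs along the columns, so its marginal pmf comes from column sums rather than row sums. The following sketch reproduces the calculation with exact fractions; the list layout and the alias `F` are assumptions made for brevity.

```python
from fractions import Fraction as F

# Joint table from the example: rows are Y = 0, 1; columns are X = 1, 2, 3.
joint = [
    [F(1, 6), F(1, 8), F(1, 6)],    # Y = 0
    [F(1, 3), F(1, 12), F(1, 8)],   # Y = 1
]
x_values = [1, 2, 3]

# Marginal pmf of X: transpose the table and sum each column over y.
f_x = [sum(col) for col in zip(*joint)]

mean = sum(x * p for x, p in zip(x_values, f_x))       # E(X)
second = sum(x**2 * p for x, p in zip(x_values, f_x))  # E(X^2)
var_x = second - mean**2
print(mean, second, var_x)      # 43/24 95/24 431/576
print(round(float(var_x), 4))   # 0.7483
```

A variance must be non-negative and cannot exceed the squared range of the variable (here \((3-1)^2=4\)), and for a variable taking only the values 1, 2, and 3 it is at most 1, which is a useful sanity check on a hand calculation.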
Note:
We can calculate the variance and the standard deviation of \(Y\) in a similar manner.
Exam tips:
Let \(\alpha\) and \(\beta\) be non-zero constants. Then, it can be proven that:
Question
An online streaming service is analyzing viewer behavior regarding the number of movies watched (X) and the number of series episodes watched (Y) on weekends. The joint probability mass function of \(X\) and \(Y\) is given by:
$$ f_{XY}\left(x,y\right)=\frac{2x+y}{21} $$
for \(x\) = 1, 2, and \(y\) = 1, 2, 3
Calculate the variance and standard deviation of the number of movies watched given that 2 series episodes were watched.
- A. Variance = 0.24, Standard deviation = 0.49
- B. Variance = 1.6, Standard deviation = 1.26
- C. Variance = 2.8, Standard deviation = 1.67
- D. Variance = 0.49, Standard deviation = 0.7
- E. Variance = 0.4, Standard deviation = 0.63
Solution
The correct answer is A.
First, we need to determine the conditional probability mass function \(f_{X|Y}\left(x\middle| y=2\right)=\frac{f_{XY}\left(x,2\right)}{f_Y(2)}\).
We find \(f_Y(y=2)\) by summing the joint pmf over all values of \(X\):
$$ \begin{align*} f_Y\left(y=2\right) & =\sum_{x}{f_{XY}(x, y=2)} \\
& =f_{XY}\left(1,2\right)+f_{XY}\left(2,2\right) \\
& =\frac{2\left(1\right)+2}{21}+\frac{2\left(2\right)+2}{21}=\frac{10}{21}
\end{align*} $$
Now, we calculate the conditional probabilities:
$$ \begin{align*}
f_{X|Y}\left(1\middle|2\right) & =\frac{f_{XY}(1,2)}{f_Y(2)}=\frac{\frac{2\left(1\right)+2}{21}}{\frac{10}{21}}=\frac{4}{10} \\
f_{X|Y}\left(2\middle|2\right) & =\frac{f_{XY}(2,2)}{f_Y(2)}=\frac{\frac{2\left(2\right)+2}{21}}{\frac{10}{21}}=\frac{6}{10}
\end{align*} $$
With the conditional probabilities, we can now calculate the conditional expectation:
$$ \begin{align*}
E\left[X\middle| Y=2\right] & =\sum_{x}{x\cdot f_{X|Y}\left(x\middle|2\right)} \\
& =1\cdot \frac{4}{10}+2\cdot \frac{6}{10}=1.6 \end{align*} $$
The variance \(Var(X|Y=2)\) is calculated by:
$$ Var\left(X\middle| Y=2\right)=E\left[X^2\middle| Y=2\right]-\left(E\left[X\middle| Y=2\right]\right)^2 $$
First, we calculate \(E\left[X^2\middle| Y=2\right]\):
$$ \begin{align*}
E\left[X^2\middle| Y=2\right] & =\sum_{x}{x^2\cdot f_{X|Y}\left(x\middle|2\right)} \\
& =1^2\cdot \frac{4}{10}+2^2\cdot \frac{6}{10}=2.8
\end{align*} $$
Now, we find the variance:
$$ Var\left(X\middle| Y=2\right)=2.8-{1.6}^2=0.24 $$
$$ \text{Standard deviation} =\sqrt{0.24}\approx 0.49 $$
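The solution can be reproduced in a few lines with exact fractions. This is an illustrative sketch only; the function name `f` and the variable names are assumptions, not part of the question.

```python
from fractions import Fraction
from math import sqrt

# Joint pmf as given in the question: f(x, y) = (2x + y) / 21,
# for x = 1, 2 and y = 1, 2, 3.
def f(x, y):
    return Fraction(2 * x + y, 21)

xs = [1, 2]
f_y2 = sum(f(x, 2) for x in xs)             # f_Y(2) = 10/21
cond = {x: f(x, 2) / f_y2 for x in xs}      # f_{X|Y}(x | 2)

mean = sum(x * p for x, p in cond.items())       # E[X | Y = 2]
second = sum(x**2 * p for x, p in cond.items())  # E[X^2 | Y = 2]
var = second - mean**2
print(cond[1], cond[2])            # 2/5 3/5
print(var, f"{sqrt(var):.2f}")     # 6/25 0.49
```

Because the conditional pmf is a ratio of joint probabilities with the same denominator, the constant \(\frac{1}{21}\) cancels and does not affect the result.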
Learning Outcome
Topic 3. d: Multivariate Random Variables – Calculate variance and standard deviation for conditional and marginal probability distributions for discrete random variables only.