Calculate variance, standard deviation for conditional and marginal probability distributions

Variance and Standard Deviation for Conditional Discrete Distributions

In the previous readings, we introduced the concept of conditional distribution functions for random variable \(X\) given \(Y=y\) and the conditional distribution of \(Y\) given \(X=x\). We defined the conditional distribution function of \(X\), given that \(Y=y\) as:

$$ g\left(x\middle|y\right)=\frac{f\left(x,y\right)}{f_Y\left(y\right)},\ \ \ \text{ provided that } f_Y\left(y\right) \gt 0 $$

Similarly, the conditional distribution function of \(Y\), given that \(X=x\), is defined as:

$$ h\left(y\middle|x\right)=\frac{f\left(x,y\right)}{f_X\left(x\right)},\ \ \ \ \text{ provided that } f_X\left(x\right) \gt 0 $$

Building upon this foundation, we can now extend our discussion into the computation of conditional variance and conditional standard deviation.

The conditional variance of \(X\), given that \(Y=y\), is defined by:

$$ Var(X|Y=y)=E\left(X^2|Y=y\right)-\left[E(X|Y=y)\right]^2 $$

Where:

$$ \begin{align*}
E\left(X^2|Y=y\right) & =\sum_{x}{x^2.g(x|Y=y)} \\
E\left(X|Y=y\right) & =\sum_{x}{x.g(x|Y=y)}
\end{align*} $$

Note that this is analogous to the variance of a single random variable.

Furthermore, since the standard deviation is simply the square root of variance, we can define the conditional standard deviation of \(X\), given that \(Y=y\) as:

$$ \sigma_{X|Y=y}=\sqrt{Var\left(X\middle|Y=y\right)} $$

Similarly, the conditional variance of \(Y\) given \(X=x\) is defined by:

$$ Var(Y|X=x)=E\left(Y^2|X=x\right)-\left[E(Y|X=x)\right]^2 $$

Where

$$ E\left(Y^2|X=x\right)=\sum_{y}{y^2\times h(y|X=x)} $$

and

$$ E(Y|X=x)=\sum_{y}{y.h(y|X=x)} $$
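
The definitions above translate directly into a few lines of code. The following is a minimal sketch, assuming the joint pmf is stored as a Python dictionary keyed by `(x, y)`; the function name `conditional_variance` and the toy pmf are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from math import sqrt

def conditional_variance(joint, y):
    """Return Var(X | Y = y) for a joint pmf given as {(x, y): probability}."""
    # Marginal probability P(Y = y): sum the joint pmf over all x.
    p_y = sum(p for (_, yj), p in joint.items() if yj == y)
    if p_y <= 0:
        raise ValueError("P(Y = y) must be positive")
    # Conditional pmf g(x | y) = f(x, y) / f_Y(y).
    g = {xi: p / p_y for (xi, yj), p in joint.items() if yj == y}
    mean = sum(xi * p for xi, p in g.items())
    second_moment = sum(xi**2 * p for xi, p in g.items())
    return second_moment - mean**2

# Hypothetical toy pmf, used only for illustration.
toy = {(0, 0): Fraction(1, 4), (1, 0): Fraction(1, 4),
       (0, 1): Fraction(1, 8), (2, 1): Fraction(3, 8)}

var_given_y1 = conditional_variance(toy, 1)
print(var_given_y1, sqrt(var_given_y1))  # conditional variance and standard deviation
```

Swapping the roles of the two coordinates gives the conditional variance of \(Y\) given \(X=x\) in the same way.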

Example 1: Conditional Variance for Discrete Random Variables

The number of days of hospitalization for two individuals, \(P\) and \(Q\), is jointly distributed as follows:

$$ f\left(x,y\right)=\frac{x+y}{21},\ \ \ x=1,2,3\ \ \ y=1,2 $$

  1. Find \(Var\left(X\middle|Y=1\right)\).
  2. Find \(SD\left(X\middle|Y=1\right)\).

Solution

  1. Since we wish to find \(Var\left(X\middle|Y=1\right)\), we first need to determine the conditional distribution of \(X\) given \(Y=1\), namely,

    $$ g\left(x\middle|y\right)=\frac{f\left(x,y\right)}{f_Y\left(y\right)} $$

    Where,

    $$ \begin{align*}
    f_Y\left(y\right) & =\sum_{\text{all } x}{f\left(x,y\right)}=P\left(Y=y\right),\ \ \ y\in S_y \\
    {\Rightarrow f}_Y\left(y\right) & =\frac{\left(1\right)+y}{21}+\frac{\left(2\right)+y}{21}+\frac{\left(3\right)+y}{21}=\frac{6+3y}{21} \end{align*} $$

    Therefore, we have,

    $$ g\left(x\middle|y\right)=\frac{f\left(x,y\right)}{f_Y\left(y\right)}=\frac{\frac{x+y}{21}}{\frac{6+3y}{21}}=\frac{x+y}{3y+6} $$

    We can now go ahead to find the conditional variance:

    $$ Var\left(X\middle|Y=y\right)=\sigma_{X|Y=y}^2 =E\left(X^2|Y=y\right)-\left[E(X|Y=y)\right]^2 $$

    We need:

    $$ Var(X|Y=1)=E\left(X^2|Y=1\right)-\left[E(X|Y=1)\right]^2 $$

    Now,

    $$ \begin{align*}
    E\left(X\middle|Y=1\right) & =\sum_{x=1}^{3}{x.g\left(x\middle|y=1\right)} \\
    & =\sum_{x=1}^{3}{x\frac{\left(x+\left(1\right)\right)}{3\left(1\right)+6}} \\
    & =\left(1\right)\frac{\left(1+\left(1\right)\right)}{3\left(1\right)+6}+2\frac{\left(2+\left(1\right)\right)}{3\left(1\right)+6}+3\frac{\left(3+\left(1\right)\right)}{3\left(1\right)+6} \\
    & =1\left(\frac{2}{9}\right)+2\left(\frac{1}{3}\right)+3\left(\frac{4}{9}\right)=\frac{20}{9}
    \end{align*} $$

    We also need,

    $$ \begin{align*}
    E\left(X^2|Y=1\right) & =\sum_{x=1}^{3}{x^2.g\left(x\middle|y=1\right)}\\
    & =\sum_{x=1}^{3}{x^2\frac{\left(x+\left(1\right)\right)}{3\left(1\right)+6}} \\
    & =\left(1^2\right)\frac{\left(1+\left(1\right)\right)}{3\left(1\right)+6}+2^2\frac{\left(2+\left(1\right)\right)}{3\left(1\right)+6}+3^2\frac{\left(3+\left(1\right)\right)}{3\left(1\right)+6} \\
    & =1\left(\frac{2}{9}\right)+4\left(\frac{1}{3}\right)+9\left(\frac{4}{9}\right)=\frac{50}{9} \\
    \Rightarrow Var\left(X\middle|Y=1\right) &=E\left(X^2|Y=1\right)-\left[E\left(X\middle|Y=1\right)\right]^2 \\
    & =\frac{50}{9}-\left(\frac{20}{9}\right)^2=\frac{50}{81} \\
    V\left(X\middle|Y=1\right) &=\frac{50}{81}
    \end{align*} $$

  2. We know that standard deviation is simply the square root of variance.
    $$ \therefore SD(X|Y=1)=\sqrt{V\left(X\middle|Y=1\right)}=\sqrt{\frac{50}{81}}\approx 0.7857 $$
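
As a cross-check, the steps of Example 1 can be reproduced with exact fractions. This is only a sketch; the variable names are our own and are not part of the example.

```python
from fractions import Fraction
from math import sqrt

x_values = (1, 2, 3)
y_values = (1, 2)

# Joint pmf f(x, y) = (x + y) / 21 on the stated support.
f = {(x, y): Fraction(x + y, 21) for x in x_values for y in y_values}

f_Y1 = sum(f[(x, 1)] for x in x_values)        # marginal P(Y = 1)
g = {x: f[(x, 1)] / f_Y1 for x in x_values}    # conditional pmf g(x | Y = 1)

mean = sum(x * p for x, p in g.items())
second_moment = sum(x**2 * p for x, p in g.items())
variance = second_moment - mean**2

print(mean, second_moment, variance)
print(round(sqrt(variance), 4))
```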

Example 2: Conditional Variance for Discrete Random Variables

The annual numbers of accidents in two regions, Region \(A\) and Region \(B\), are jointly distributed as follows:

$$ \begin{array}{c|c|cccc}
& & \text{Number of} & \text{accidents in} & \text{Region B} \\ \hline
& & 0 & 1 & 2 & 3 \\ \hline
\text{Number of} & 0 & 0.13 & 0.06 & 0.01 & 0.06 \\
\text{accidents in} & 1 & 0.12 & 0.15 & 0.12 & 0.02 \\
\text{Region A} & 2 & 0.04 & 0.16 & 0.09 & 0.04
\end{array} $$

Calculate the conditional variance of the annual number of accidents in Region B, given that there are no accidents in Region \(A\).

Solution:

First, we need to find the conditional distribution for the annual number of accidents in Region \(B\), given that there are no accidents in Region \(A\).

Let the number of accidents in Region \(A\) be \(X\) and in Region \(B\) be \(Y\).

We wish to find \(Var(Y|X=0)\).

We know that \(h\left(y\middle|x\right)=\frac{f\left(x,y\right)}{f_X\left(x\right)},\ \ \ \text{ provided that } f_X\left(x\right) \gt 0\)

$$ \Rightarrow P\left(Y=y\middle| X=0\right)=\frac{P(X=0, Y=y)}{P(X=0)} $$

But,

$$ P\left(X=0\right)=0.13+0.06+0.01+0.06=0.26 $$

Now,

$$ \begin{align*}
P\left(Y=0\middle| X=0\right) & =\frac{P(X=0,\ Y=0)}{P(X=0)}=\frac{0.13}{0.26}=\frac{1}{2} \\
P\left(Y=1\middle| X=0\right) & =\frac{P(X=0,\ Y=1)}{P(X=0)}=\frac{0.06}{0.26}=\frac{3}{13} \\
P\left(Y=2\middle| X=0\right) & =\frac{P(X=0,\ Y=2)}{P(X=0)}=\frac{0.01}{0.26}=\frac{1}{26} \\
P\left(Y=3\middle| X=0\right) & =\frac{P(X=0,\ Y=3)}{P(X=0)}=\frac{0.06}{0.26}=\frac{3}{13}
\end{align*} $$

$$ \begin{array}{c|c|c|c|c}
Y|X=0 & 0 & 1 & 2 & 3 \\ \hline
P\left(Y=y\middle| X=0\right) & \frac{1}{2} & \frac{3}{13} & \frac{1}{26} & \frac{3}{13}
\end{array} $$

Now,

$$ \begin{align*}
E(Y|X=0) & =\sum_{y}{y.P(Y=y|X=0)} \\
\Rightarrow E\left(Y\middle|X=0\right) & =0\times \frac{1}{2}+1\times \frac{3}{13}+2\times \frac{1}{26}+3\times \frac{3}{13}=1 \\
E\left(Y^2\middle|X=0\right) & =0^2\times \frac{1}{2}+1^2\times \frac{3}{13}+2^2\times \frac{1}{26}+3^2\times \frac{3}{13}=\frac{32}{13} \\
\Rightarrow Var\left(Y\middle|X=0\right) &=\frac{32}{13}-1^2=\frac{19}{13}\approx 1.4615
\end{align*} $$
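
When the joint distribution is given as a table, the same calculation amounts to normalising one row. A short sketch, assuming the table is stored as a list of rows (the layout and names are our own choice):

```python
# Joint pmf table: rows are X (Region A) = 0, 1, 2;
# columns are Y (Region B) = 0, 1, 2, 3.
table = [
    [0.13, 0.06, 0.01, 0.06],
    [0.12, 0.15, 0.12, 0.02],
    [0.04, 0.16, 0.09, 0.04],
]
y_values = [0, 1, 2, 3]

row = table[0]                    # joint probabilities with X = 0
p_x0 = sum(row)                   # marginal P(X = 0)
h = [p / p_x0 for p in row]       # conditional pmf h(y | X = 0)

mean = sum(y * p for y, p in zip(y_values, h))
second_moment = sum(y**2 * p for y, p in zip(y_values, h))
cond_var_y = second_moment - mean**2

print(round(mean, 4), round(cond_var_y, 4))
```

Because the table entries are decimals, the results are floating-point approximations of \(1\) and \(\frac{19}{13}\).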

Variance and Standard Deviation for Marginal Discrete Distributions

Recall that if \(X\) and \(Y\) are discrete random variables with joint probability mass function \(f(x, y)\) defined on the space \(S\), then the marginal distribution functions of \(X\) and \(Y\) are given by:

$$ f_X\left(x\right)=\sum_{y}{f\left(x,y\right)=P\left(X=x\right),\ \ \ x\epsilon S_x} $$

and,

$$ f_Y\left(y\right)=\sum_{x}{f\left(x,y\right)=P\left(Y=y\right),\ \ \ \ y\epsilon S_y} $$

Once we have the marginal distribution functions of \(X\) and \(Y\), we can now go ahead to find individual variances and standard deviations for \(X\) and \(Y\).

The variance for the random variable \(X\) is given by:

$$ Var\left(X\right)=E\left(X^2\right)-\left[E(X)\right]^2 $$

Where

$$ E\left(X^2\right)=\sum_{x}{x^2\times P(X=x)} $$

and

$$ E\left(X\right)=\sum_{x}{x\times P(X=x)} $$

Similarly, the variance of the random variable \(Y\) is given by:

$$ Var\left(Y\right)=E\left(Y^2\right)-\left[E(Y)\right]^2 $$

Where

$$ E\left(Y^2\right)=\sum_{y}{y^2\times P(Y=y)} $$

and

$$ E\left(Y\right)=\sum_{y}{y\times P(Y=y)} $$

The standard deviation for \(X\) and \(Y\) is the square root of their respective variances.

$$ SD(X)=\sqrt{E\left(X^2\right)-\left[E(X)\right]^2} $$

and,

$$ SD(Y)=\sqrt{E\left(Y^2\right)-\left[E(Y)\right]^2} $$
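
The marginal case can be sketched in code as well: sum the joint pmf over the other variable, then apply the usual variance formula. The helper `marginal_variance`, its `axis` argument, and the toy pmf below are hypothetical, introduced only for illustration.

```python
from fractions import Fraction
from math import sqrt

def marginal_variance(joint, axis):
    """Variance of X (axis=0) or Y (axis=1) for a joint pmf {(x, y): probability}."""
    marginal = {}
    for pair, p in joint.items():
        value = pair[axis]
        # Summing over the other variable gives the marginal pmf.
        marginal[value] = marginal.get(value, 0) + p
    mean = sum(v * p for v, p in marginal.items())
    second_moment = sum(v**2 * p for v, p in marginal.items())
    return second_moment - mean**2

# Hypothetical toy pmf, used only for illustration.
toy = {(0, 0): Fraction(1, 4), (1, 0): Fraction(1, 4),
       (0, 1): Fraction(1, 8), (2, 1): Fraction(3, 8)}

var_x = marginal_variance(toy, axis=0)
var_y = marginal_variance(toy, axis=1)
print(var_x, sqrt(var_x))  # Var(X) and SD(X)
print(var_y, sqrt(var_y))  # Var(Y) and SD(Y)
```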

Example 3

The annual numbers of accidents in two regions, Region \(A\) and Region \(B\), are jointly distributed as follows:

$$ \begin{array}{c|c|cccc}
& & \text{Number of} & \text{accidents in} & \text{Region B} \\ \hline
& & 0 & 1 & 2 & 3 \\ \hline
\text{Number of} & 0 & 0.13 & 0.06 & 0.01 & 0.06 \\
\text{accidents in} & 1 & 0.12 & 0.15 & 0.12 & 0.02 \\
\text{Region A} & 2 & 0.04 & 0.16 & 0.09 & 0.04
\end{array} $$

Calculate the variance of the annual number of accidents in Region \(A\).

Solution:

Let the number of accidents in Region \(A\) be \(X\) and in Region \(B\) be \(Y\).

We wish to find \(Var(X)\). As such, we should first find the marginal distribution of \(X\) which is given by:

$$ \begin{align*}
f_X\left(x\right) & =\sum_{\text{all } y}f\left(x,y\right)\ \ x\epsilon S_x \\
P\left(X=0\right) & =0.13+0.06+0.01+0.06=0.26 \\
P\left(X=1\right) & =0.12+0.15+0.12+0.02=0.41 \\
P\left(X=2\right) & =0.04+0.16+0.09+0.04=0.33
\end{align*} $$

$$ \begin{array}{c|c|c|c}
X & 0 & 1 & 2 \\ \hline
P(X=x) & 0.26 & 0.41 & 0.33
\end{array} $$

So that,

$$ \begin{align*}
E\left(X\right) & =0\times 0.26+1\times 0.41+2\times 0.33=1.07 \\
E\left(X^2\right) & =0^2\times 0.26+1^2\times 0.41+2^2\times 0.33=1.73 \\
\Rightarrow Var\left(X\right) &=1.73-{1.07}^2=0.5851
\end{align*} $$
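
The row sums and the variance in this example can be checked with a few lines of code. This is a sketch under the assumption that the table is stored row by row, with one row per value of \(X\):

```python
# Joint pmf table: rows are X (Region A) = 0, 1, 2;
# columns are Y (Region B) = 0, 1, 2, 3.
table = [
    [0.13, 0.06, 0.01, 0.06],
    [0.12, 0.15, 0.12, 0.02],
    [0.04, 0.16, 0.09, 0.04],
]
x_values = [0, 1, 2]

# Marginal pmf of X: sum each row over all values of Y.
marginal_x = [sum(row) for row in table]

mean_x = sum(x * p for x, p in zip(x_values, marginal_x))
second_moment_x = sum(x**2 * p for x, p in zip(x_values, marginal_x))
var_x = second_moment_x - mean_x**2

print([round(p, 2) for p in marginal_x])
print(round(mean_x, 2), round(second_moment_x, 2), round(var_x, 4))
```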

Example 4: Variance and Standard Deviation for Marginal Discrete Random Variables

Let \(X\) and \(Y\) be the numbers of days of sickness for two individuals, \(A\) and \(B\), with joint probability mass function:

$$ f\left(x,y\right)=\frac{x+y}{21},\ \ \ x=1,2,3\ \ \ \ \ y=1,2 $$

Calculate the variance and the standard deviation of \(X\).

Solution

We know that,

$$ Var\left(X\right)=E\left(X^2\right)-\left[E(X)\right]^2 $$

First, we find the marginal probability mass function of \(X\), which is given by:

$$ \begin{align*}
f_X\left(x\right) & =\sum_{\text{all } y}f\left(x,y\right)\ \ x\epsilon S_x \\ & =\frac{x+\left(1\right)}{21}+\frac{x+\left(2\right)}{21}=\frac{2x+3}{21},\ \ \text{for } x=1, 2, 3 \end{align*} $$

Then,

$$ \begin{align*} E\left(X\right) & =\sum_{x=1}^{3}{xP_X\left(x\right) =\sum_{x=1}^{3}{x\frac{2x+3}{21}}} \\ & =\left(1\right)\frac{2\left(1\right)+3}{21}+\left(2\right)\frac{2\left(2\right)+3}{21}+\left(3\right)\frac{2\left(3\right)+3}{21}\\ & =1\left(\frac{5}{21}\right)+2\left(\frac{7}{21}\right)+3\left(\frac{9}{21}\right)=\frac{46}{21} \end{align*} $$

and

$$ \begin{align*} E\left(X^2\right) & =\sum_{\text{all } x}{x^2P_X\left(x\right)}\\ & =\left(1\right)^2\left(\frac{5}{21}\right)+\left(2\right)^2\left(\frac{7}{21}\right)+\left(3\right)^2\left(\frac{9}{21}\right) \\ & =\frac{114}{21}=\frac{38}{7} \end{align*} $$

Thus,

$$ Var\left(X\right)=\frac{38}{7}-\left(\frac{46}{21}\right)^2=\frac{278}{441}\approx 0.6304 $$

We know that the standard deviation of \(X\) is the square root of its variance.

Therefore,

$$ \sigma_X=\sqrt{Var\left(X\right)} =\sqrt{\frac{278}{441}}\approx 0.7940 $$
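
Working with exact fractions makes it easy to confirm each intermediate value in this example. The sketch below builds the marginal pmf of \(X\) from the joint pmf and then applies the variance formula; the variable names are our own.

```python
from fractions import Fraction
from math import sqrt

x_values = (1, 2, 3)
y_values = (1, 2)

# Marginal pmf of X: sum f(x, y) = (x + y) / 21 over all y.
f_X = {x: sum(Fraction(x + y, 21) for y in y_values) for x in x_values}

mean_x = sum(x * p for x, p in f_X.items())
second_moment_x = sum(x**2 * p for x, p in f_X.items())
var_x = second_moment_x - mean_x**2
sd_x = sqrt(var_x)

print(mean_x, second_moment_x, var_x)
print(round(float(var_x), 4), round(sd_x, 4))
```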

Example 5: Variance and Standard Deviation for Marginal Discrete Random Variables

A laptop dealer specializes in two brands of laptops, HP and Lenovo. Let \(X\) be the number of HP laptops sold in a day, and let \(Y\) be the number of Lenovo laptops sold in a day. The dealer has determined that the numbers of laptops sold in a day are jointly distributed as in the table below:

$$ \begin{array}{c|c|c|c}
Y \backslash X & 1 & 2 & 3 \\ \hline
0 & \frac{1}{6} & \frac{1}{8} & \frac{1}{6} \\ \hline
1 & \frac{1}{3} & \frac{1}{12} & \frac{1}{8}
\end{array} $$

Calculate the variance of \(X\).

Solution

We need to find the marginal distribution function of \(X\) first:

We know that,

$$ f_X\left(x\right)=\sum_{y}{f\left(x,y\right)=P\left(X=x\right),\ \ \ \ x\epsilon S_x} $$

Now,

$$ \begin{align*}
P\left(X=1\right) & =\frac{1}{6}+\frac{1}{3}=\frac{1}{2} \\
P\left(X=2\right) & =\frac{1}{8}+\frac{1}{12}=\frac{5}{24} \\
P\left(X=3\right) & =\frac{1}{6}+\frac{1}{8}=\frac{7}{24}
\end{align*} $$

Therefore,

$$ E\left(X\right)=\sum_{x=1}^{3}{xP_X\left(x\right)=1\times \frac{1}{2}+2\times \frac{5}{24}+3\times \frac{7}{24}=\frac{43}{24}} $$

and,

$$ E\left(X^2\right)=\sum_{x=1}^{3}{x^2P_X\left(x\right)=1^2\times \frac{1}{2}+2^2\times \frac{5}{24}+3^2\times \frac{7}{24}=\frac{95}{24}} $$

Thus,

$$ Var\left(X\right)=E\left(X^2\right)-\left[E\left(X\right)\right]^2=\frac{95}{24}-\left(\frac{43}{24}\right)^2=\frac{431}{576}\approx 0.7483 $$
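
The arithmetic in this example is easy to get wrong by hand, so a quick check with exact fractions is worthwhile. The sketch below assumes the table is keyed as `(x, y)`, with \(X\) along the columns and \(Y\) along the rows.

```python
from fractions import Fraction as F

# Joint pmf from the table, keyed as (x, y).
joint = {
    (1, 0): F(1, 6), (2, 0): F(1, 8), (3, 0): F(1, 6),
    (1, 1): F(1, 3), (2, 1): F(1, 12), (3, 1): F(1, 8),
}
assert sum(joint.values()) == 1  # a valid pmf sums to one

# Marginal pmf of X: add the probabilities in each column.
f_X = {}
for (x, _), p in joint.items():
    f_X[x] = f_X.get(x, 0) + p

mean_x = sum(x * p for x, p in f_X.items())
second_moment_x = sum(x**2 * p for x, p in f_X.items())
var_x = second_moment_x - mean_x**2

print(f_X)
print(mean_x, second_moment_x, var_x, round(float(var_x), 4))
```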

Note:

We can calculate the variance and the standard deviation of \(Y\) in a similar manner.

Exam tips:

Let \(\alpha\) and \(\beta\) be non-zero constants. Then, it can be proven that:

  1. \(Var\left(\alpha\right)=0\)
  2. \(Var\left(\alpha X\right)=\alpha^2\times Var(X)\)
  3. \(Var\left(\alpha X+\beta\right)=\alpha^2\times Var(X)\)
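
These three properties can be checked numerically on any pmf. The sketch below uses the marginal pmf of \(X\) from Example 5 and the arbitrarily chosen (hypothetical) constants \(\alpha=3\) and \(\beta=5\); it is an illustration, not a proof.

```python
from fractions import Fraction

def variance(pmf):
    """Variance of a discrete random variable given as {value: probability}."""
    mean = sum(v * p for v, p in pmf.items())
    return sum(v**2 * p for v, p in pmf.items()) - mean**2

pmf_x = {1: Fraction(1, 2), 2: Fraction(5, 24), 3: Fraction(7, 24)}
alpha, beta = 3, 5  # hypothetical constants, chosen only for illustration

pmf_const = {alpha: Fraction(1)}                              # the constant alpha
pmf_scaled = {alpha * x: p for x, p in pmf_x.items()}         # alpha * X
pmf_shifted = {alpha * x + beta: p for x, p in pmf_x.items()} # alpha * X + beta

print(variance(pmf_const) == 0)
print(variance(pmf_scaled) == alpha**2 * variance(pmf_x))
print(variance(pmf_shifted) == alpha**2 * variance(pmf_x))
```

All three comparisons print `True`: scaling by \(\alpha\) multiplies the variance by \(\alpha^2\), while adding \(\beta\) shifts every value equally and leaves the spread unchanged.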

Question

An online streaming service is analyzing viewer behavior regarding the number of movies watched (X) and the number of series episodes watched (Y) on weekends. The joint probability mass function of \(X\) and \(Y\) is given by:

$$ f_{XY}\left(x,y\right)=\frac{2x+y}{30} $$

for \(x\) = 1, 2, and \(y\) = 1, 2, 3

Calculate the variance and standard deviation of the number of movies watched given that 2 series episodes were watched.

  1. Variance = 0.24, Standard deviation = 0.49
  2. Variance = 1.6, Standard deviation = 1.26
  3. Variance = 2.8, Standard deviation = 1.67
  4. Variance = 0.49, Standard deviation = 0.7
  5. Variance = 0.4, Standard deviation = 0.63

Solution

The correct answer is option 1.

First, we need to determine the conditional probability mass function \(f_{X|Y}\left(x\middle| y=2\right)=\frac{f_{XY}\left(x,2\right)}{f_Y(2)}\).

We find \(f_Y(2)\) by summing the joint pmf over all values of \(X\):

$$ \begin{align*} f_Y\left(2\right) & =\sum_{x}{f_{XY}(x, 2)} \\
& =f_{XY}\left(1,2\right)+f_{XY}\left(2,2\right) \\
& =\frac{2\left(1\right)+2}{30}+\frac{2\left(2\right)+2}{30}=\frac{10}{30}
\end{align*} $$

Now, we calculate the conditional probabilities:

$$ \begin{align*}
f_{X|Y}\left(1\middle|2\right) & =\frac{f_{XY}(1,2)}{f_Y(2)}=\frac{\frac{2\left(1\right)+2}{30}}{\frac{10}{30}}=\frac{4}{10} \\
f_{X|Y}\left(2\middle|2\right) & =\frac{f_{XY}(2,2)}{f_Y(2)}=\frac{\frac{2\left(2\right)+2}{30}}{\frac{10}{30}}=\frac{6}{10}
\end{align*} $$

With the conditional probabilities, we can now calculate the conditional expectation:

$$ \begin{align*}
E\left[X\middle| Y=2\right] & =\sum_{x}{x\cdot f_{X|Y}\left(x\middle|2\right)} \\
& =1\cdot \frac{4}{10}+2\cdot \frac{6}{10}=1.6 \end{align*} $$

The variance \(Var(X|Y=2)\) is calculated by:

$$ Var\left(X\middle| Y=2\right)=E\left[X^2\middle| Y=2\right]-\left(E\left[X\middle| Y=2\right]\right)^2 $$

First, we calculate \(E\left[X^2\middle| Y=2\right]\):

$$ \begin{align*}
E\left[X^2\middle| Y=2\right] & =\sum_{x}{x^2\cdot f_{X|Y}\left(x\middle|2\right)} \\
& =1^2\cdot \frac{4}{10}+2^2\cdot \frac{6}{10}=2.8
\end{align*} $$

Now, we find the variance:

$$ Var\left(X\middle| Y=2\right)=2.8-{1.6}^2=0.24 $$

$$ \text{Standard deviation} =\sqrt{0.24}\approx 0.49 $$
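
The solution can also be verified in code. Since the conditional pmf is a ratio of joint probabilities with the same denominator, the normalising constant cancels, so this sketch works with the numerators \(2x+y\) only. The variable names are our own.

```python
from fractions import Fraction
from math import sqrt

x_values = (1, 2)
y_given = 2

# Numerators 2x + y of the joint pmf at y = 2; the common denominator
# cancels when we divide by their sum to form the conditional pmf.
weights = {x: 2 * x + y_given for x in x_values}
total = sum(weights.values())
cond_pmf = {x: Fraction(w, total) for x, w in weights.items()}

cond_mean = sum(x * p for x, p in cond_pmf.items())
cond_second_moment = sum(x**2 * p for x, p in cond_pmf.items())
cond_var = cond_second_moment - cond_mean**2
cond_sd = sqrt(cond_var)

print(cond_pmf)
print(float(cond_mean), float(cond_second_moment), float(cond_var), round(cond_sd, 2))
```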

Learning Outcome

Topic 3. d: Multivariate Random Variables – Calculate variance and standard deviation for conditional and marginal probability distributions for discrete random variables only.
