Calculate moments for joint, conditional, and marginal random variables

In this learning objective we revisit concepts explained and worked through in past sections, exploring more of their properties and practicing their development.

Moments of a Probability Mass Function

Probability mass functions have infinitely many moments, but we will not cover most of them in this text. In most texts the fourth moment is the last one studied; anything beyond it is reserved for more advanced work, and the underlying intuition becomes harder to grasp. In this text the following moments will be considered:

  • First Moment: Mean
  • Second Moment: Variance
  • Third Moment: Skewness
  • Fourth Moment: Kurtosis

Properties of Expectation

Recall that the expected value of a random variable \(X\) is defined by

$$ E[X] = \sum_{x} xp(x) $$

where \(X\) is a discrete random variable with probability mass function \(p(x)\), and by

$$ E[X] = \int_{-\infty}^{\infty} xf(x)dx $$

when \(X\) is a continuous random variable with probability density function \(f(x)\). Since \(E[X]\) is a weighted average of the possible values of \(X\) (for readers unfamiliar with the idea, a weighted mean has the form \(\bar{x} = (w_1x_1 + \cdots + w_nx_n)/(w_1 + \cdots + w_n)\)), it follows that if \(X\) must lie between \(a\) and \(b\), then so must its expected value.


That is, if

$$ P(a\leq X \leq b) = 1, $$

then

$$ a \leq E[X] \leq b. $$

To prove this, suppose that \(X\) is a discrete random variable for which \(P(a \leq X \leq b) = 1\). Since this implies that \(p(x) = 0\) for all \(x\) outside the interval \([a,b]\), we deduce that \begin{align} E[X] & = \sum_{x:p(x)>0} xp(x)\\ & \geq \sum_{x:p(x)>0} ap(x)\\ & = a \sum_{x:p(x)>0} p(x)\\ & = a \end{align} Similarly, for \(b\): \begin{align} E[X] & = \sum_{x:p(x)>0} xp(x)\\ & \leq \sum_{x:p(x)>0} bp(x)\\ & = b \sum_{x:p(x)>0} p(x)\\ & = b \end{align} This establishes the result. The proof is similar in the continuous case.
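As a quick numerical illustration (using a fair six-sided die, our own example rather than one from the text), the bound can be checked directly for a pmf supported on \([a,b] = [1,6]\):

```python
from fractions import Fraction

# A fair six-sided die: p(x) = 1/6 for x in {1, ..., 6}, so P(1 <= X <= 6) = 1.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# E[X] as the probability-weighted sum of the outcomes.
mean = sum(x * p for x, p in pmf.items())

# The mean (7/2) must lie between the smallest and largest outcomes.
assert min(pmf) <= mean <= max(pmf)
print(mean)  # 7/2
```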

The expected value of the sum/difference of two random variables is equal to the sum/difference of their expectations:

$$E(X + Y) = E(X) + E(Y)$$


$$E(X - Y) = E(X) - E(Y).$$

The proof is done by defining the function \(g\) as \(g(X,Y) = X + Y\) or \(g(X,Y) = X - Y\) and applying the definition of expectation to \(g\).
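Linearity can also be verified by brute force for any joint pmf. The sketch below uses the pmf \(f(x,y) = (x^2+3y)/96\) that appears later in this section, with exact rational arithmetic:

```python
from fractions import Fraction

# Joint pmf f(x, y) = (x^2 + 3y)/96 for x = 1..4, y = 1, 2.
f = {(x, y): Fraction(x**2 + 3*y, 96) for x in range(1, 5) for y in (1, 2)}
assert sum(f.values()) == 1  # valid pmf: probabilities sum to one

def E(g):
    """Expected value of g(X, Y): a probability-weighted sum over the support."""
    return sum(g(x, y) * p for (x, y), p in f.items())

EX, EY = E(lambda x, y: x), E(lambda x, y: y)
assert E(lambda x, y: x + y) == EX + EY  # E(X + Y) = E(X) + E(Y)
assert E(lambda x, y: x - y) == EX - EY  # E(X - Y) = E(X) - E(Y)
```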

We will not go into depth on marginal random variables, since this is rarely necessary: most of the time, when we have a joint distribution, we calculate the mean of one of the variables. By the definition of the marginal pmf, this computation is exactly a moment of a marginal (or conditional) random variable.

Moments of joint random variables

Let \(X,Y\) be a pair of jointly distributed random variables with joint probability function \(f(x,y)\) on the space \(S\), and let \(g(X,Y)\) be a function of the pair. If the sum

$$ E[g(X,Y)] = \sum_{(x,y) \in S} g(x,y)f(x,y) $$

exists, it is called the mathematical expectation (or expected value) of \(g(X,Y)\).

This mathematical expectation is known as the first moment of joint random variables, or mean.

The second moment about the mean is the variance of \(g(X,Y)\), which can be computed from the first moment via the shortcut:

$$ \operatorname{Var}(g(X,Y)) = E[g(X,Y)^2] - (E[g(X,Y)])^2 $$
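This shortcut is just the usual expansion of the second central moment; writing \(\mu = E[g(X,Y)]\):

\begin{align} \operatorname{Var}(g(X,Y)) & = E\big[(g(X,Y) - \mu)^2\big]\\ & = E[g(X,Y)^2] - 2\mu E[g(X,Y)] + \mu^2\\ & = E[g(X,Y)^2] - (E[g(X,Y)])^2 \end{align}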

As we said before, in most cases we work with the marginal pmfs rather than the mean of the joint distribution itself. For illustration, we will show at least one example and be explicit about \(g(X, Y)\).

Example 1: Let \(X\) and \(Y\) have the following pmf:

$$ f(x,y) = \frac{x^2 + 3y}{96} \qquad x = 1,2,3,4\quad y=1,2. $$

Find the expected value with \(g(X,Y) = XY\).

The possible values for this distribution are:

$$ f(1,1)=\tfrac{4}{96},\quad f(1,2)=\tfrac{7}{96},\quad f(2,1)=\tfrac{7}{96},\quad f(2,2)=\tfrac{10}{96},\quad f(3,1)=\tfrac{12}{96},\quad f(3,2)=\tfrac{15}{96},\quad f(4,1)=\tfrac{19}{96},\quad f(4,2)=\tfrac{22}{96}, $$

which sum to 1, as a pmf must.

Then we proceed to calculate,

\begin{align} E[XY] & = \sum_{(x,y) \in S} g(x,y) f(x,y)\\ & = \sum_{(x,y) \in S} (xy) \frac{x^2 + 3y}{96}\\ & = (1)\frac{1+3}{96} + (2)\frac{1+6}{96} + (2)\frac{4+3}{96} + (4)\frac{4+6}{96} + \\ & \quad (3)\frac{9+3}{96} + (6)\frac{9+6}{96} + (4)\frac{16+3}{96} + (8)\frac{16+6}{96}\\ & = \frac{4}{96} + \frac{14}{96} + \frac{14}{96} + \frac{40}{96} + \frac{36}{96} + \frac{90}{96} + \frac{76}{96} +\frac{176}{96}\\ & = \frac{450}{96} = \frac{75}{16} \end{align}
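This weighted sum is easy to verify by brute force (note that the denominator 96 is the one for which the probabilities sum to one); a minimal sketch:

```python
from fractions import Fraction

# Joint pmf f(x, y) = (x^2 + 3y)/96 for x = 1..4, y = 1, 2, with g(X, Y) = XY.
f = {(x, y): Fraction(x**2 + 3*y, 96) for x in range(1, 5) for y in (1, 2)}
assert sum(f.values()) == 1  # the probabilities sum to one

# E[XY] = sum over the support of (x*y) * f(x, y).
E_XY = sum(x * y * p for (x, y), p in f.items())
print(E_XY)  # 75/16
```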

This process is very similar to calculating the mean of any mass function, univariate or multivariate: one chooses the \(g\) function and then computes a weighted mean as usual.

Example 2: Let \(X\) and \(Y\) be random variables with joint pdf: \begin{equation*} f(x,y) = \begin{cases} 2(x+y) & 0 < x\leq y\leq 1\\ 0 & \text{otherwise} \end{cases} \end{equation*} (One can check that this density integrates to 1 over the triangular support.) The mean of \(X\) is found by computing as we have done before for continuous joint random variables:

\begin{align} E(X) & = \int_{0}^{1} x \int_{x}^{1} 2(x+y)\,dy\,dx \\ & = \int_{0}^{1} x \big( 2x(1-x) + (1 - x^2)\big)\,dx\\ & = \int_{0}^{1} \big( x + 2x^2 - 3x^3 \big)\,dx\\ & = \frac{x^2}{2} + \frac{2x^3}{3} - \frac{3x^4}{4} \bigg|_{0}^{1} = \frac{5}{12} \end{align}

The inner integral gives the marginal pdf of \(X\); the outer integral then computes the expected value. Note that \(5/12\) lies in \([0,1]\), as the bound on expectations requires.
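For continuous examples, a midpoint Riemann sum gives a quick numerical sanity check. The sketch below normalizes a density proportional to \(x+y\) on the triangle \(0 < x \le y \le 1\) (recovering the normalizing constant numerically) and approximates \(E(X)\); the grid size \(n\) is an arbitrary choice of ours:

```python
# Midpoint Riemann sum over the triangle 0 < x <= y <= 1 for a density
# proportional to (x + y); recovers the normalizing constant and E(X).
n = 400
h = 1.0 / n
raw = mean_x = 0.0
for i in range(n):
    x = (i + 0.5) * h
    for j in range(n):
        y = (j + 0.5) * h
        if x <= y:  # restrict to the triangular support
            w = (x + y) * h * h  # unnormalized probability weight
            raw += w
            mean_x += x * w

c = 1.0 / raw      # normalizing constant, close to 2
mean_x *= c        # E(X), close to 5/12 ~ 0.4167
print(round(c, 2), round(mean_x, 3))
```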

Example 3: Let \(X\) and \(Y\) have the following pmf:

$$ f(x,y) = \frac{x^2 + 3y}{96} \qquad x = 1,2,3,4\quad y=1,2. $$

Find the expected values.

\begin{align} f_x(x) & = \frac{x^2 + 3(1)}{96} + \frac{x^2 + 3(2)}{96}\\ & = \frac{x^2 + 3}{96} + \frac{x^2 + 6}{96}\\ & = \frac{x^2 + x^2 + 3 + 6}{96} = \frac{2x^2+ 9}{96} \end{align}


\begin{align} f_y(y) & = \frac{1+3y}{96} + \frac{4+3y}{96} + \frac{9+3y}{96} + \frac{16+3y}{96}\\ & = \frac{12y + 30}{96} \end{align}

Then we can calculate the means for each of the variables:

\begin{align} E(x) & = \sum_{x=1}^{4} xf_x \\ & = \sum_{x=1}^{4} x \frac{2x^2+ 9}{96}\\ & = (1) \frac{2(1)^2+ 9}{96} + (2) \frac{2(2)^2+ 9}{96} + (3) \frac{2(3)^2+ 9}{96} +(4) \frac{2(4)^2+ 9}{96} \\ & = (1)\frac{11}{96} + (2)\frac{17}{96} + (3)\frac{27}{96} + (4)\frac{41}{96} = \frac{145}{48} \end{align}

And the mean for y:

\begin{align} E(y) & = \sum_{y=1}^{2}yf_y\\ & = \sum_{y=1}^{2}y\frac{12y + 30}{96}\\ & = (1) \frac{12(1) + 30}{96} + (2) \frac{12(2) + 30}{96} \\ & = (1) \frac{42}{96} + (2) \frac{54}{96} = \frac{25}{16} \end{align}
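Both marginal means can be cross-checked by summing directly over the joint pmf, without forming the marginals first; a sketch in exact arithmetic:

```python
from fractions import Fraction

# Joint pmf f(x, y) = (x^2 + 3y)/96 for x = 1..4, y = 1, 2.
f = {(x, y): Fraction(x**2 + 3*y, 96) for x in range(1, 5) for y in (1, 2)}

# Summing x (or y) against the joint pmf agrees with using the marginal pmf.
EX = sum(x * p for (x, y), p in f.items())
EY = sum(y * p for (x, y), p in f.items())
print(EX, EY)  # 145/48 25/16
```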

With these past examples in hand, we can show how to find the variance for the corresponding variables.

Finding the variance is straightforward if we have worked thoroughly through the mean: the corresponding probabilities were already found there, so a few steps can be omitted:

\begin{align} V(X) & = \sum_{x=1}^{4} x^2 f_x - [E(X)]^2\\ & = \sum_{x=1}^{4} x^2 \frac{2x^2 + 9}{96} - \bigg(\frac{145}{48}\bigg)^2\\ & = (1)^2 \frac{11}{96} + (2)^2 \frac{17}{96} + (3)^2 \frac{27}{96} + (4)^2\frac{41}{96} - \bigg(\frac{145}{48}\bigg)^2= \frac{163}{16} - \bigg(\frac{145}{48}\bigg)^2 = 1.062 \end{align}

And for \(Y\):

\begin{align} V(Y) &= \sum_{y=1}^{2} y^2f_y - [E(Y)]^2\\ & = \sum_{y=1}^{2} y^2 \frac{12y +30}{96} - \bigg(\frac{25}{16}\bigg)^2\\ & = (1)^2 \frac{42}{96} + (2)^2 \frac{54}{96} - \frac{625}{256}= \frac{43}{16} - \frac{625}{256} = \frac{63}{256} \end{align}
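The shortcut \(\operatorname{Var} = E[\cdot^2] - (E[\cdot])^2\) is easy to automate; a sketch that reproduces both variances exactly:

```python
from fractions import Fraction

# Joint pmf f(x, y) = (x^2 + 3y)/96 for x = 1..4, y = 1, 2.
f = {(x, y): Fraction(x**2 + 3*y, 96) for x in range(1, 5) for y in (1, 2)}

def var(g):
    """Var(g(X, Y)) via the shortcut E[g^2] - (E[g])^2."""
    Eg = sum(g(x, y) * p for (x, y), p in f.items())
    Eg2 = sum(g(x, y)**2 * p for (x, y), p in f.items())
    return Eg2 - Eg**2

VX = var(lambda x, y: x)  # 163/16 - (145/48)^2 = 2447/2304, about 1.062
VY = var(lambda x, y: y)  # 43/16 - (25/16)^2 = 63/256
print(VX, VY)
```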

This is not the only way of finding the variance, but if the expected value has already been computed, most of the work is done: the same probability-weighted terms are reused with \(x^2\) in place of \(x\). Higher moments follow the same pattern with higher powers.

Moments for conditional random variables

As in the last section, the moments for conditional and marginal random variables are nothing more than:

  • 1st moment: Mean
  • 2nd moment: Variance

This is because conditional and marginal probability functions satisfy all the properties of a probability function.

The conditional mean of \(Y\), given that \(X=x\), is defined by

$$ \mu_{Y|x} = E[Y|x] = \sum_{y} y\, h(y|x), $$

and the conditional variance of \(Y\), given that \(X=x\), is defined by

$$ \sigma_{Y|x}^2 = E\big[(Y - E[Y|x])^2 \mid x\big] = \sum_{y}(y-E[Y|x])^2 h(y|x). $$

This simplifies to:

$$ \sigma_{Y|x}^2 = E[Y^2|x] - (E[Y|x])^2 $$

For conditional distributions, the process of finding the mean and variance is similar to the previous case: we first find the conditional function (the analogue of finding the marginal function), and once we have it we can compute the mean. The only procedural difference is that we must find the conditional mean before computing the conditional variance. All of this is shown in the next couple of examples.

Examples: The reader should already be familiar with this function:

$$ f(x,y) = \frac{x^2 + 3y}{96} \qquad x=1,2,3,4,\quad y=1,2. $$

and its marginal functions:

$$ f_x= \frac{2x^2+9}{96} \qquad f_y = \frac{12y+30}{96} $$

As seen before, each conditional function is obtained by dividing the joint pmf by the corresponding marginal:

$$ g(x|y) = \frac{x^2+3y}{12y+30} \qquad h(y|x)=\frac{x^2+3y}{2x^2+9} $$

We usually do not need both conditional functions, but for the purposes of these examples we will compute a mean from each one.

We will find \(E(X|Y=1)\) and \(E(Y|X=3)\), and then the variance \(V(X|Y=1)\).

Example 1: Calculating our first proposed exercise:

\begin{align} E(X|Y=1) & = \sum_{x=1}^{4} x\,g(x|y=1)\\ & = \sum_{x=1}^{4} x \frac{x^2+3(1)}{12(1)+30}\\ & = (1)\frac{1^2+3(1)}{12(1)+30} + (2)\frac{2^2+3(1)}{12(1)+30} + (3)\frac{3^2+3(1)}{12(1)+30} + (4)\frac{4^2+3(1)}{12(1)+30}\\ & = (1)\frac{4}{42} + (2)\frac{7}{42} + (3)\frac{12}{42} + (4)\frac{19}{42} = \frac{65}{21} \end{align}

Example 2:

\begin{align} E(Y|X=3)& = \sum_{y=1}^{2} y\,h(y|x=3)\\ & = \sum_{y=1}^{2} y\frac{3^2+3y}{2(3)^2+9}\\ & = (1)\frac{3^2+3(1)}{2(3)^2+9} + (2)\frac{3^2+3(2)}{2(3)^2+9}\\ & = (1)\frac{12}{27}+(2)\frac{15}{27}=\frac{42}{27}=\frac{14}{9} \end{align}

Then, using the result from Example 1, we can calculate the variance we proposed:

Example 3:

\begin{align} V(X|Y=1) & = E[(X-E(X|Y=1))^2|Y=1]\\ & = \sum_{x=1}^{4}(x-E(X|Y=1))^2g(x|y=1)\\ & = \sum_{x=1}^{4}\bigg(x-\frac{65}{21}\bigg)^2\frac{x^2+3(1)}{12(1)+30}\\ & = \bigg(1-\frac{65}{21}\bigg)^2 \frac{4}{42} + \bigg(2-\frac{65}{21}\bigg)^2 \frac{7}{42} + \bigg(3-\frac{65}{21}\bigg)^2 \frac{12}{42} + \bigg(4-\frac{65}{21}\bigg)^2 \frac{19}{42}\\ & = \bigg(\frac{1936}{441}\bigg)\frac{4}{42} + \bigg(\frac{529}{441}\bigg)\frac{7}{42} + \bigg(\frac{4}{441}\bigg)\frac{12}{42} + \bigg(\frac{361}{441}\bigg)\frac{19}{42} = 0.9909 \end{align}
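All three conditional moments can be verified with the conditional pmfs \(g(x|y)\) and \(h(y|x)\) found above; a sketch in exact arithmetic:

```python
from fractions import Fraction

def g(x, y):
    """Conditional pmf g(x | y) = (x^2 + 3y)/(12y + 30)."""
    return Fraction(x**2 + 3*y, 12*y + 30)

def h(y, x):
    """Conditional pmf h(y | x) = (x^2 + 3y)/(2x^2 + 9)."""
    return Fraction(x**2 + 3*y, 2*x**2 + 9)

# Conditional pmfs must sum to one over their own variable.
assert sum(g(x, 1) for x in range(1, 5)) == 1
assert sum(h(y, 3) for y in (1, 2)) == 1

E_X_given_1 = sum(x * g(x, 1) for x in range(1, 5))  # 65/21
E_Y_given_3 = sum(y * h(y, 3) for y in (1, 2))       # 14/9
V_X_given_1 = sum((x - E_X_given_1)**2 * g(x, 1) for x in range(1, 5))
print(E_X_given_1, E_Y_given_3, float(V_X_given_1))  # variance ~ 0.9909
```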

We see that the mean is needed to find the variance, and at the same time having it saves us some calculations (such as evaluating the \(g\) or \(h\) function at each value, since we already know these from the mean). Now let us check how these calculations look in the continuous case:

Example 4: We will show how to work through a continuous conditional function. We already did this in an earlier chapter, but we will redo the computation and add a new one:


$$ f(x,y) = \frac{4}{3}(1-xy) \qquad 0\leq x\leq1,\quad 0\leq y\leq1 $$

We choose to find \(E(x|y)\); for this we write:

$$ E(x|y) = \int_{0}^{1}xg(x|y)dx $$

For this we must find \(g\); since we are familiar with the procedure, we compute it as for any continuous distribution:

\begin{align} g(x|y)& = \frac{f(x,y)}{f_y(y)} = \frac{\frac{4}{3}(1-xy)}{\int_{0}^{1}\frac{4}{3}(1-xy)dx}=\frac{\frac{4}{3}(1-xy)}{\frac{4}{3}(x-\frac{x^2y}{2})|_{x=0}^{x=1}}=\frac{\frac{4}{3}(1-xy)}{\frac{4}{3}(1-\frac{y}{2})} \end{align}

Returning to the expectation, we have:

\begin{align} E(x|y)& =\int_{0}^{1} x \frac{(1-xy)}{1-\frac{y}{2}}dx\\ & = \frac{1}{1-\frac{y}{2}} \int_{0}^{1}x(1-xy)dx\\ & = \frac{1}{1-\frac{y}{2}} \int_{0}^{1}(x-x^2y)dx\\\\ & = \frac{1}{1-\frac{y}{2}} \bigg(\frac{x^2}{2}-\frac{x^3y}{3}\bigg)\bigg|_{x=0}^{x=1}\\ & = \frac{1}{1-\frac{y}{2}} \bigg(\frac{1}{2}-\frac{y}{3}\bigg) \end{align}

With this we have a function that yields the desired expected value; for example, if we wanted \(E(x|y=1)\) we would just substitute:

$$ E(x|y=1) = \frac{1}{1-\frac{1}{2}} \bigg(\frac{1}{2}-\frac{1}{3}\bigg) = \frac{1}{3} $$

Note that to find the variance for this type of function, we would integrate over the same region with \(x^2\) in place of \(x\) and subtract \([E(x|y)]^2\) from that integral.
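The continuous conditional moments can likewise be checked numerically. The sketch below approximates \(E(x|y=1)\) and the conditional variance by a midpoint Riemann sum; `cond_moment` is a helper of our own:

```python
# Conditional pdf from the example: g(x | y) = (1 - x*y)/(1 - y/2), 0 <= x <= 1.
def g(x, y):
    return (1 - x * y) / (1 - y / 2)

def cond_moment(k, y, n=10000):
    """Midpoint Riemann sum of x^k * g(x|y) over 0 <= x <= 1."""
    h = 1.0 / n
    return sum(((i + 0.5) * h) ** k * g((i + 0.5) * h, y) * h for i in range(n))

m1 = cond_moment(1, 1.0)           # E(x | y=1), close to 1/3
v = cond_moment(2, 1.0) - m1 ** 2  # E(x^2 | y=1) - E(x | y=1)^2
print(round(m1, 4), round(v, 4))
```

Analytically, \(E(x^2|y=1) = 2\int_0^1 (x^2 - x^3)\,dx = 1/6\), so the conditional variance is \(1/6 - (1/3)^2 = 1/18 \approx 0.0556\), matching the numerical value.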


Learning Outcome

Topic 3.c: Multivariate Random Variables – Calculate moments for joint, conditional, and marginal random variables.