
Let \(X_1\) and \(X_2\) be discrete random variables with joint probability mass function \(f_{X_1,X_2}(x_1,x_2)\) defined on a two-dimensional set \(A\). Define the following functions:

$$ y_1 =g_1 (x_1, x_2)$$

and

$$y_2 =g_2(x_1,x_2)$$

Remember: \(y_2=g_2(x_1,x_2)\) is sometimes called a **dummy transformation**; you have to make it up!

Together, these functions define a bivariate one-to-one transformation from the two-dimensional set \(A\) to the two-dimensional set \(B\). Assume that the equations \(y_1=g_1(x_1,x_2)\) and \(y_2=g_2(x_1,x_2)\) can be solved uniquely and in closed form for \(x_1\) and \(x_2\) as:

$$\begin{align}x_{1}&=g_{1}^{-1}(y_{1},y_{2})\\ x_{2}&=g_{2}^{-1}(y_{1},y_{2})\end{align}$$

which maps \(B\) back to \(A\).

The joint probability mass function of \(Y_1=g_1(X_1,X_2)\) and \(Y_2=g_2(X_1,X_2)\) is:

$$f_{Y_1,Y_2}\left(y_1,y_2\right)=f_{X_1,X_2}\left(g_1^{-1}\left(y_1,y_2\right),g_2^{-1}\left(y_1,y_2\right)\right),\quad \left(y_1,y_2\right)\in B$$

Let \(X_1 \sim Poisson(\lambda_1)\) and \(X_2 \sim Poisson(\lambda_2)\) be independent random variables.

Find the pmf of \(X_1+X_2\).

**Solution **

We need:

$$Y_1=g_1\left(X_1,X_2\right)=X_1+X_2$$

But we have to make up \(Y_2\). In this case, we choose:

$$Y_2=g_2\left(X_1,X_2\right)=X_2$$

Then, we follow these steps:

**Step 1: **Determine the support of \((X_1,X_2)\), which in this case is:

$$\left\{\left(x_1,x_2\right)\middle| x_1=0,1,2,\ldots;\ x_2=0,1,2,\ldots\right\}$$

**Step 2: **Find the pmf of \(X_1\) and \(X_2\). We know that:

$$\begin{align} f(x_1)&=\frac{\lambda_{1}^{x_1}e^{-\lambda_{1}}}{x_{1}!}\\ f(x_2)&=\frac{\lambda_{2}^{x_2}e^{-\lambda_{2}}}{x_{2}!}\end{align}$$

Since \(X_1\) and \(X_2\) are independent random variables (they are not identically distributed unless \(\lambda_1=\lambda_2\)), the joint pmf \(f_{X_1,X_2}(x_1,x_2)\) is the product of the two pmfs:

$$f_{X_1,X_2}(x_1,x_2)=\frac{\lambda_{1}^{x_1}e^{-\lambda_{1}}}{x_{1}!}\cdot \frac{\lambda_{2}^{x_2}e^{-\lambda_{2}}}{x_{2}!}$$

**Step 3: **Find the inverse functions by solving the set of equations for \(X_1\) and \(X_2\). We know that:

$$Y_1=g_1\left(X_1,X_2\right)=X_1+X_2$$

and

$$Y_2=g_2\left(X_1,X_2\right)=X_2$$

Solving for \(X_1\) and \(X_2\) we have:

$$\begin{align}X_1&=g_1^{-1}(Y_1,Y_2)=Y_1-Y_2\\ X_2&=g_2^{-1}(Y_1,Y_2)=Y_2\end{align}$$

**Step 4: **Find the set \(B\), which in this case is:

$$B=\left\{\left(y_1,y_2\right)\middle| y_1=0,1,2,\ldots;\ y_2=0,1,2,\ldots, y_1\right\}$$

**Step 5: **Find the joint pmf of \(Y_1\) and \(Y_2\) using the transformation technique:

$$\begin{align}f_{Y_1, Y_2}(y_1,y_2)&=f_{X_1,X_2}\left(g_1^{-1}(y_1,y_2),g_2^{-1}(y_1, y_2)\right)\\&= \frac{\lambda_1^{y_1-y_2}e^{-\lambda_1}}{(y_1-y_2)!}\cdot \frac{\lambda_2^{y_2}e^{-\lambda_2}}{y_2!},\quad (y_1,y_2)\in B\end{align}$$

The question asks for the pmf of \(Y_1=X_1+X_2\), so we need the marginal distribution of \(Y_1\), obtained by summing the joint pmf over \(y_2\):

$$\begin{align}f_{Y_1}(y_1)&=\sum_{y_2}{f_{Y_1, Y_2}(y_1,y_2)} \\ &=\sum_{y_2=0}^{y_1}{\frac{\lambda_1^{y_1-y_2}e^{-\lambda_1}}{(y_1-y_2)!}\cdot \frac{\lambda_2^{y_2}e^{-\lambda_2}}{y_2!}}\\ &=\frac{e^{-(\lambda_1+\lambda_2)}}{y_1!}\sum_{y_2=0}^{y_1}\binom{y_1}{y_2}\lambda_1^{y_1-y_2}\lambda_2^{y_2}\\ &=\frac{(\lambda_1 +\lambda_2)^{y_1}e^{-(\lambda_1+\lambda_2)}}{y_1 !},\quad y_1=0,1,2,\ldots\end{align}$$

The last step uses the binomial theorem. Therefore, \(Y_1 \sim Poisson(\lambda_1+\lambda_2)\).
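The marginalization step can be checked numerically. The following is a minimal sketch, assuming illustrative rates \(\lambda_1=1.5\) and \(\lambda_2=2.5\) (these values and the function names are not from the text). It sums the joint pmf over \(y_2=0,\ldots,y_1\) and compares the result with the \(Poisson(\lambda_1+\lambda_2)\) pmf.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return lam**k * exp(-lam) / factorial(k)


def sum_pmf(y1, lam1, lam2):
    """Marginal pmf of Y1 = X1 + X2: sum the joint pmf over y2 = 0, ..., y1."""
    return sum(
        poisson_pmf(y1 - y2, lam1) * poisson_pmf(y2, lam2)
        for y2 in range(y1 + 1)
    )


# Illustrative rates (an assumption made for this sketch).
lam1, lam2 = 1.5, 2.5
for y1 in range(6):
    # The two printed columns should agree up to floating-point error.
    print(y1, sum_pmf(y1, lam1, lam2), poisson_pmf(y1, lam1 + lam2))
```

The agreement holds for any positive rates, since the derivation does not depend on the particular values chosen here.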

Consider the transformation of one random variable \(X\) with pdf \(f(x)\). In the continuous case, let \(Y = u(X)\) be a strictly increasing or strictly decreasing function of \(X\), with inverse \(X = v(Y)\). The pdf of \(Y\) is:

$$ g(y) = |v'(y)|f[v(y)],\quad c < y < d, $$

where the interval \(c < y < d\) corresponds to the support of \(X\), say, \(a < x < b\), through the transformation \(x = v(y)\).
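As a small illustration of this formula, the sketch below assumes \(X\) is exponential with rate 1 and \(Y=\sqrt{X}\), so that \(v(y)=y^2\) and \(v'(y)=2y\). Both the distribution and the transformation are choices made for this example only. It checks that the resulting \(g(y)=2ye^{-y^2}\) integrates to 1 and that \(P(Y\le 1)=P(X\le 1)\).

```python
from math import exp


def f(x):
    """pdf of X: exponential with rate 1 (an illustrative choice)."""
    return exp(-x) if x > 0 else 0.0


def v(y):
    """Inverse of the increasing map y = u(x) = sqrt(x)."""
    return y * y


def v_prime(y):
    """Derivative of the inverse map."""
    return 2.0 * y


def g(y):
    """pdf of Y = u(X), computed as |v'(y)| * f(v(y))."""
    return abs(v_prime(y)) * f(v(y)) if y > 0 else 0.0


def midpoint_integral(func, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))


total = midpoint_integral(g, 0.0, 10.0)    # should be very close to 1
prob_y = midpoint_integral(g, 0.0, 1.0)    # P(Y <= 1)
prob_x = 1.0 - exp(-1.0)                   # P(X <= 1), since sqrt(X) <= 1 iff X <= 1
print(total, prob_y, prob_x)
```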

There is one remark we must pay attention to: If the function \(Y = u(X)\) does not have a single-valued inverse, the determination of the distribution of \(Y\) will not be as simple.

In the bivariate case with a single-valued inverse, the rule is essentially the same as in the one-variable case, with the derivative replaced by the Jacobian (the determinant of the matrix of first-order partial derivatives of the inverse transformation).

That is, if \(X_1\) and \(X_2\) are two continuous-type random variables with joint pdf \(f(x_1,x_2)\), and \(Y_1 = u_1(X_1,X_2)\), \(Y_2 = u_2(X_1,X_2)\) have the single-valued inverse \(X_1 = v_1(Y_1,Y_2)\), \(X_2 = v_2(Y_1,Y_2)\), then the joint pdf of \(Y_1\) and \(Y_2\) is:

$$g(y_1,y_2) = |J|f[v_1(y_1,y_2),v_2(y_1,y_2)],\quad (y_1,y_2)\in S_Y,$$

where the Jacobian \(J\) is the determinant

\begin{equation*} J =\text{det} \begin{bmatrix} \frac{\partial x_1}{\partial y_1} & \frac{\partial x_1}{\partial y_2}\\ \frac{\partial x_2}{\partial y_1} & \frac{\partial x_2}{\partial y_2} \end{bmatrix} = \frac{\partial x_1}{\partial y_1} \frac{\partial x_2}{\partial y_2} - \frac{\partial x_1}{\partial y_2} \frac{\partial x_2}{\partial y_1} \ne 0 \end{equation*}

We find the support \(S_Y\) of \(Y_1,Y_2\) by considering the mapping of the support \(S_X\) of \(X_1,X_2\) under the transformation \(y_1 = u_1(x_1,x_2)\), \(y_2 = u_2(x_1,x_2)\). This method of finding the distribution of \(Y_1\) and \(Y_2\) is called the **change-of-variables technique**.

It is often the mapping of the support \(S_X\) of \(X_1,X_2\) into that (say, \(S_Y\)) of \(Y_1,Y_2\) which causes the biggest challenge. That is, in most cases, it is easy to solve for \(x_1\) and \(x_2\) in terms of \(y_1\) and \(y_2\), say, $$ x_1=v_1(y_1,y_2),\quad x_2=v_2(y_1,y_2),$$ and then compute the Jacobian

\begin{equation*} J = \text{det} \begin{bmatrix} \frac{\partial v_1(y_1,y_2)}{\partial y_1} & \frac{\partial v_1(y_1,y_2)}{\partial y_2}\\ \frac{\partial v_2(y_1,y_2)}{\partial y_1} & \frac{\partial v_2(y_1,y_2)}{\partial y_2} \end{bmatrix} \end{equation*}

Describing the region \(S_Y\) correctly usually takes more care than either of these two steps.

Let \(X_1\) and \(X_2\) have a joint pdf \(f(x_1,x_2)\). Let \(Y_1 = X_1 + X_2\), \(Y_2 = X_1 - X_2\) be a transformation of \(X_1,X_2\).

Find the joint density function of \(Y_1\) and \(Y_2\) in terms of \(f_{X_1,X_2}\).

**Solution**

To apply the change-of-variables technique, we first solve for \(X_1\) and \(X_2\) in terms of \(Y_1\) and \(Y_2\):

\begin{align} Y_1 & = X_1 + X_2\\ X_1 & = Y_1 - X_2\\ & \text{Then, since } X_2 = X_1 - Y_2, \nonumber\\ X_1 & = Y_1 + Y_2 - X_1\\ 2X_1 & = Y_1 + Y_2\\ X_1 & =v_1\left(y_1,y_2\right) =\frac{Y_1 + Y_2}{2} \end{align}

And for \(X_2\),

\begin{align} Y_2 & = X_1 - X_2\\ X_2 & = X_1 - Y_2\\ & \text{Then, since } X_1 = Y_1 - X_2,\nonumber \\ X_2 & = Y_1 - X_2 - Y_2\\ 2X_2 & = Y_1 - Y_2\\ X_2 & =v_2\left(y_1,y_2\right)= \frac{Y_1 - Y_2}{2} \end{align}

Now we can compute the Jacobian:

\begin{equation*} J =\text{det} \begin{bmatrix}\frac{\partial v_1\left(y_1,y_2\right)}{\partial y_1}&\frac{\partial v_1\left(y_1,y_2\right)}{\partial y_2}\\ \frac{\partial v_2\left(y_1,y_2\right)}{\partial y_1}&\frac{\partial v_2\left(y_1,y_2\right)}{\partial y_2}\\ \end{bmatrix}= \text{det} \begin{bmatrix} \frac{1}{2} & \frac{1}{2}\\ \frac{1}{2} & -\frac{1}{2} \end{bmatrix} =\frac{1}{2}\bigg(-\frac{1}{2}\bigg) - \frac{1}{2}\cdot\frac{1}{2} = -\frac{1}{2} \end{equation*}

Applying the formula, only the absolute value of the Jacobian matters:

$$g\left(y_1,y_2\right)=\left|J\right|f\left[v_1\left(y_1,\ y_2\right),\ v_2\left(y_1,\ y_2\right)\right]=\left|-\frac{1}{2}\right|f\left(x_1,\ x_2\right)=\frac{1}{2}f\left(\frac{y_1+y_2}{2},\frac{y_1-y_2}{2}\right)$$
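The Jacobian of this inverse transformation can also be checked numerically. The sketch below approximates the four partial derivatives by central differences and then evaluates \(g(y_1,y_2)=|J|f[v_1,v_2]\). The choice of \(f\) as the joint pdf of two independent standard normal variables is an assumption made only to have something concrete to evaluate; the helper names are likewise hypothetical.

```python
from math import exp, pi


def v1(y1, y2):
    """Inverse transformation: x1 in terms of (y1, y2)."""
    return (y1 + y2) / 2.0


def v2(y1, y2):
    """Inverse transformation: x2 in terms of (y1, y2)."""
    return (y1 - y2) / 2.0


def jacobian_det(y1, y2, h=1e-6):
    """Determinant of the 2x2 matrix of partials of (v1, v2), by central differences."""
    d11 = (v1(y1 + h, y2) - v1(y1 - h, y2)) / (2 * h)
    d12 = (v1(y1, y2 + h) - v1(y1, y2 - h)) / (2 * h)
    d21 = (v2(y1 + h, y2) - v2(y1 - h, y2)) / (2 * h)
    d22 = (v2(y1, y2 + h) - v2(y1, y2 - h)) / (2 * h)
    return d11 * d22 - d12 * d21


def f_normal(x1, x2):
    """Joint pdf of two independent standard normals (illustrative assumption)."""
    return exp(-(x1 * x1 + x2 * x2) / 2.0) / (2.0 * pi)


def g(y1, y2, f):
    """Joint pdf of (Y1, Y2) via the change-of-variables formula."""
    return abs(jacobian_det(y1, y2)) * f(v1(y1, y2), v2(y1, y2))


print(jacobian_det(0.3, -1.2))   # close to -0.5 at any point, since the map is linear
print(g(0.0, 0.0, f_normal))     # close to 1 / (4 * pi) for this choice of f
```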

Given a specific joint pdf for \(X_1,X_2\), we would only need to substitute it into this expression and adjust the support if needed. From the resulting joint pdf, we could then determine the mean, variance, and other moments of \(Y_1\) and \(Y_2\).

**Note:** In some other reference materials, you might come across an expression like \(g(y_1,y_2) = f_{X_{1},X_{2}}(x_1,x_2)|J(x_1,x_2)|^{-1}\). This leads to the same calculations as above, and it can be proved that \(|J(x_1,x_2)|^{-1} = |J(y_1,y_2)|\). In this case, the Jacobian is computed from the forward transformation:

\begin{equation*} J(x_1,x_2) = \text{det} \begin{bmatrix} \frac{\partial g_1}{\partial x_1} & \frac{\partial g_1}{\partial x_2}\\ \frac{\partial g_2}{\partial x_1} & \frac{\partial g_2}{\partial x_2} \end{bmatrix} \end{equation*}

Let \(X_1\) and \(X_2\) be independent exponential random variables, both having mean \(\lambda>0\). Find the joint probability density function of \(U_1=U\) and \(U_2=X_1+X_2\) in terms of \(u_1\) and \(u_2\), where:

$$U=\frac{X_1}{X_1+X_2}$$

**Solution**

Note that the density functions of \(X_1\) and \(X_2\) are given by:

$$f\left(x_1\right)=\begin{cases}\frac{1}{\lambda}e^{-\frac{x_1}{\lambda}},& x_1 > 0\\ 0, &\text{otherwise}\end{cases} $$

And

$$f\left(x_2\right)=\begin{cases}\frac{1}{\lambda}e^{-\frac{x_2}{\lambda}},& x_2 > 0\\ 0, &\text{otherwise}\end{cases} $$

Since \(X_1\) and \(X_2\) are independent, then the joint distribution is given by:

$$f_{X_1,X_2}\left(x_1,x_2\right)=\begin{cases}\frac{1}{\lambda^2}e^{-\frac{{(x}_1+x_2)}{\lambda}},& x_1 > 0, x_2 > 0\\ 0, &\text{otherwise}\end{cases} $$

Let

$$u_1=g_1\left(x_1,x_2\right)=\frac{x_1}{x_1+x_2}$$

And

$$u_2=g_2\left(x_1,x_2\right)=x_1+x_2$$

Solving these two equations for \(x_1\) and \(x_2\) gives:

$$x_1=v_1\left(u_1,\ u_2\right)=u_1u_2$$

And

$$x_2=v_2\left(u_1,\ u_2\right)=u_2(1-u_1)$$

Thus the Jacobian of the transformation is given by:

$$J=\text{det}\begin{bmatrix}\frac{\partial x_1}{\partial u_1}&\frac{\partial x_1}{\partial u_2}\\\frac{\partial x_2}{\partial u_1}&\frac{\partial x_2}{\partial u_2}\\ \end{bmatrix}=\text{det}\begin{bmatrix}u_2&u_1\\-u_2&1-u_1\\ \end{bmatrix}=u_2(1-u_1)+u_1u_2=u_2$$

As such,

$$\begin{align}f_{U_1, U_2}(u_1,u_2)&=|J|f_{X_1,X_2}[v_1(u_1,u_2),v_2(u_1,u_2)]\\ &=\frac{1}{\lambda^2}|u_2|e^{-\frac{u_1u_2+u_2(1-u_1)}{\lambda}}\end{align}$$

For the support, \(x_2 > 0\) requires:

$$u_2\left(1-u_1\right)=u_2-u_1u_2 > 0$$

Together with \(x_1=u_1u_2 > 0\), this implies that:

$$ 0<u_1u_2<u_2 \Leftrightarrow 0 < u_1<1$$

Also, since \(u_1 > 0\),

$$u_1u_2 > 0 \Leftrightarrow u_2 > 0$$

Since the exponent simplifies through \(u_1u_2+u_2(1-u_1)=u_2\), the joint density function of \(U_1\) and \(U_2\) can be written as:

$$f_{U_1,U_2}\left(u_1,u_2\right)=\begin{cases}\frac{1}{\lambda^2}u_2e^{-\frac{u_2}{\lambda}}, & 0 < u_1 < 1,\ u_2 > 0\\ 0, &\text{otherwise}\end{cases}$$
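The joint density of \(U_1\) and \(U_2\) does not depend on \(u_1\): it is the product of a uniform density on \((0,1)\) and a gamma density with shape 2 and scale \(\lambda\). This suggests a quick simulation check: the sample mean of \(U_1\) should be near \(\frac{1}{2}\) and that of \(U_2\) near \(2\lambda\). The sketch below assumes \(\lambda=3\), a fixed seed, and hypothetical function names, none of which come from the text.

```python
import random
from math import exp


def joint_pdf(u1, u2, lam):
    """Joint pdf of U1 = X1/(X1+X2) and U2 = X1+X2 for exponential X's with mean lam."""
    if 0.0 < u1 < 1.0 and u2 > 0.0:
        return u2 * exp(-u2 / lam) / lam**2
    return 0.0


def simulate(mean, n, seed=0):
    """Draw n pairs (U1, U2) from independent exponentials with the given mean."""
    rng = random.Random(seed)
    u1s, u2s = [], []
    for _ in range(n):
        # random.expovariate takes the rate, which is 1 / mean.
        x1 = rng.expovariate(1.0 / mean)
        x2 = rng.expovariate(1.0 / mean)
        u1s.append(x1 / (x1 + x2))
        u2s.append(x1 + x2)
    return u1s, u2s


lam = 3.0  # illustrative mean (an assumption for this sketch)
u1s, u2s = simulate(lam, 100_000)
mean_u1 = sum(u1s) / len(u1s)
mean_u2 = sum(u2s) / len(u2s)
print(mean_u1, mean_u2)  # expected to be near 0.5 and 2 * lam
```

A simulation only supports the result; it does not replace the derivation. It is nonetheless a cheap way to catch a wrong Jacobian or a wrong support.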

**Order statistics** are the observations of the random sample, arranged from the smallest to the largest. In recent years, the importance of order statistics has increased because of the more frequent use of nonparametric inferences and robust procedures. However, order statistics have always been prominent because, among other things, they are needed to determine rather simple statistics such as the sample median, the sample range, and the empirical cdf.

For studying purposes, we will assume that the *n* independent observations come from a continuous-type distribution. This means, among other things, that the probability of any two observations being equal is zero. That is, the probability that the observations can be ordered from smallest to largest without having two equal values is 1. Of course, in practice, we frequently observe *ties*; but if the probability of a tie is small, the distribution theory that follows will hold approximately.

If \(X_1,X_2,\cdots,X_n\) are observations of a random sample of size *n* from a continuous-type distribution, we let the random variables:

$$ Y_1 < Y_2 < \cdots < Y_n $$

denote the order statistics of that sample. That is,

\begin{align*} Y_1 & = \text{smallest of } X_1,X_2,\cdots,X_n\\ Y_2 &= \text{second smallest of } X_1,X_2,\cdots,X_n\\ & \vdots\\ Y_n & =\text{largest of } X_1,X_2,\cdots,X_n \end{align*}

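The joint density function of the order statistics is obtained by noting that the order statistics \(Y_1,\cdots, Y_n\) will take values \(y_1 \leq y_2 \leq \cdots \leq y_n\) if and only if, for some permutation \((i_1,i_2,\cdots,i_n)\) of \((1,2,\cdots,n)\),

$$X_1 = y_{i_1}, X_2=y_{i_2},\cdots,X_n = y_{i_n}$$

Since there are \(n!\) such permutations and the observations are independent with common pdf \(f\), the joint density of the order statistics is \(n!\,f(y_1)f(y_2)\cdots f(y_n)\) for \(y_1 < y_2 < \cdots < y_n\).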

There is a very simple procedure for determining the cdf of the \(r\)th order statistic, \(Y_r\), and it relies mainly on the binomial distribution.

For order statistics, we can compute things like the sample median, sample range, and other such statistics. Let’s see one simple example:

In an experiment with \(n= 5\) data points, we have:

$$ x_1 = 0.34 , x_2 = 0.54 , x_3 = 0.43, x_4 = 0.67, x_5 = 0.14 $$

each drawn from a distribution with pdf \(f(x) = 3x^2, 0 < x < 1\) and 0 elsewhere.

Determine the sample median and sample range.

**Solution**

The order statistics are:

$$y_1 = 0.14 < y_2 = 0.34 < y_3 = 0.43 < y_4 = 0.54 < y_5 = 0.67$$

It is simple enough to note that \(y_3\) is the middle statistic and this is equal to the sample median.

The sample range is: \(y_5 - y_1 =0.67-0.14 =0.53\).
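The same computation takes only a few lines of code. This minimal sketch sorts the five data points to obtain the order statistics, then reads off the sample median and the sample range. The variable names are illustrative.

```python
import statistics

# The five observations from the example.
sample = [0.34, 0.54, 0.43, 0.67, 0.14]

# Order statistics: the sample arranged from smallest to largest.
order_stats = sorted(sample)

# With n = 5, the median is the third order statistic.
median = statistics.median(sample)

# Range: largest order statistic minus smallest.
sample_range = order_stats[-1] - order_stats[0]

print(order_stats, median, sample_range)
```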

Now, let \(Y_1 < \cdots < Y_5\) be the order statistics of five independent observations from this distribution, with values not yet observed. The event \(Y_4 \le \frac{1}{3}\) occurs exactly when at least four of the five observations are at most \(\frac{1}{3}\), because \(Y_1\), \(Y_2\), and \(Y_3\) are no larger than \(Y_4\). This can be treated as a binomial experiment with five trials, where a success is the event \(X_i \le \frac{1}{3}\). The probability of success is:

$$P\left(X_i \le \frac{1}{3}\right) = \int_{0}^{1/3}3x^{2}dx = \frac{1}{27} $$

We must have at least four successes, so that:

$$P\left(Y_4\le\frac{1}{3}\right)=\binom{5}{4}\left(\frac{1}{27}\right)^4\left(\frac{26}{27}\right)+\left(\frac{1}{27}\right)^5=\frac{131}{27^5}\approx 0.00000913$$
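The binomial calculation can be reproduced exactly with rational arithmetic. The sketch below computes the probability of at least four successes in five trials with success probability \(\frac{1}{27}\), which evaluates to \(\frac{131}{27^5}\), roughly \(9.13\times 10^{-6}\). The helper name `at_least` is hypothetical.

```python
from fractions import Fraction
from math import comb


def at_least(r, n, p):
    """P(at least r successes in n independent trials with success probability p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(r, n + 1))


# Success probability: P(X_i <= 1/3) for the pdf 3x^2 on (0, 1).
p = Fraction(1, 27)

# P(Y_4 <= 1/3): at least four of the five observations are at most 1/3.
prob = at_least(4, 5, p)
print(prob, float(prob))
```

Using `Fraction` keeps the result exact, which helps when the probability is this small and rounding could hide an arithmetic slip.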

By the same reasoning, we can find the cdf of \(Y_4\), which we denote by \(G\left(y\right)\), for \(0 < y < 1\):

$$G\left(y\right)=P\left(Y_4\le y\right)$$

For a single observation,

$$P\left(X_i\le y\right)=\int_{0}^{y}3x^2\ dx=\left[x^3\right]_0^y=y^3$$

Then,

$$\begin{align}G\left(y\right)&=P\left(Y_4\le y\right)\\ &=\binom{5}{4}\left(y^3\right)^4\left(1-y^3\right)+\left(y^3\right)^5\end{align}$$

Here \(y^3\) plays the role of the success probability, just as \(\frac{1}{27}\) did in the example above. Differentiating \(G(y)\) gives the pdf of \(Y_4\) for \(0 < y < 1\):

$$\begin{align} g\left(y\right)&=G^\prime\left(y\right)\\ &=\frac{5!}{3!\,1!}\left(y^3\right)^3\left(1-y^3\right)3y^2 \end{align}$$

From these examples, we can generalize the results for order statistics:

Let \(Y_1 < Y_2<\ldots<Y_n \) be the order statistics of \(n\) independent observations from a distribution of the continuous type with cdf \(F(x)\) and pdf \(F^\prime\left(x\right)=f\left(x\right)\), where \(0<F\left(x\right)<1\) for \(a < x < b\) and \(F\left(a\right)=0\), \(F\left(b\right)=1\). (It is possible that \(a=-\infty\) and/or \(b=+\infty\).)

The event that the \(r\)th order statistic \(Y_r\) is at most \(y\), \(\{Y_r\leq y\}\), can occur if and only if at least \(r\) of the \(n\) observations are less than or equal to \(y\). That is, the probability of “success” on each trial is \(F(y)\), and we must have at least \(r\) successes. Thus,

$$ G_r(y) = P(Y_r\leq y) = \sum_{k=r}^{n}\binom{n}{k}[F(y)]^k[1-F(y)]^{n-k}$$

Rewriting this, we have

$$G_r(y) = \sum_{k=r}^{n-1}\binom{n}{k}[F(y)]^k[1-F(y)]^{n-k}+[F(y)]^n$$

Hence, the pdf of \(Y_r\) is:

\begin{align} g_r(y) = G_r'(y) = & \sum_{k=r}^{n-1}\binom{n}{k}(k)[F(y)]^{k-1}f(y)[1-F(y)]^{n-k}\\ & + \sum_{k=r}^{n-1}\binom{n}{k}[F(y)]^k(n-k)[1-F(y)]^{n-k-1}[-f(y)]\\ & + n[F(y)]^{n-1}f(y). \end{align}

But,

$$\binom{n}{k}k=\frac{n!}{(k-1)!(n-k)!}\quad\text{and}\quad\binom{n}{k}(n-k)=\frac{n!}{k!(n-k-1)!},$$

so substituting into the pdf of \(Y_r\) gives:

$$g_r(y)=\frac{n!}{(r-1)!(n-r)!}[F(y)]^{r-1}[1-F(y)]^{n-r}f(y),\quad a < y < b,$$

which is the first term of the first summation in \(g_r(y)=G_r'(y)\). The remaining terms in \(g_r(y) = G_r'(y)\) sum to zero because the second term of the first summation (when \(k = r + 1\)) equals the negative of the first term in the second summation (when \(k=r\)), and so on. Finally, the last term of the second summation equals the negative of \(n[F(y)]^{n-1}f(y)\).
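The telescoping argument can be checked numerically by comparing the closed-form pdf \(g_r(y)\) with a finite-difference derivative of the cdf \(G_r(y)\). The sketch below assumes the distribution from the earlier example, \(F(y)=y^3\) and \(f(y)=3y^2\) on \((0,1)\), with \(n=5\) and \(r=4\). The function names and the evaluation point \(y=0.8\) are illustrative choices.

```python
from math import comb, factorial


def cdf_order_stat(y, r, n, F):
    """G_r(y): probability that at least r of n observations are <= y."""
    p = F(y)
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(r, n + 1))


def pdf_order_stat(y, r, n, F, f):
    """g_r(y) = n! / ((r-1)! (n-r)!) * F^(r-1) * (1-F)^(n-r) * f."""
    p = F(y)
    coef = factorial(n) // (factorial(r - 1) * factorial(n - r))
    return coef * p ** (r - 1) * (1 - p) ** (n - r) * f(y)


def F(y):
    """cdf of one observation (assumed: pdf 3y^2 on (0, 1))."""
    return y**3


def f(y):
    """pdf of one observation."""
    return 3 * y**2


y, h = 0.8, 1e-6
# Central-difference approximation of G_r'(y).
numeric = (cdf_order_stat(y + h, 4, 5, F) - cdf_order_stat(y - h, 4, 5, F)) / (2 * h)
closed = pdf_order_stat(y, 4, 5, F, f)
print(numeric, closed)  # the two values should agree closely
```

Setting `r=1` or `r=n` in `pdf_order_stat` reproduces the special cases for the minimum and maximum.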

Recall that for the order statistics \(Y_1,\ Y_2,\ \ldots,\ Y_n\) of a sample \(X_1,\ X_2,\ldots,\ X_n\), \(Y_1\) is the smallest and \(Y_n\) is the largest of the observations. This can be stated as:

$$Y_1=\min(X_1, X_2,\ \ldots,\ X_n)$$

is the minimum order statistic and

$$Y_n=\max(X_1, X_2,\ \ldots,\ X_n)$$

is the maximum order statistic.

Setting \(r=1\) in the general formula, the pdf of the smallest (minimum) order statistic is:

$$g_1\left(y\right)=n\left[1-F\left(y\right)\right]^{n-1}f\left(y\right),\ a < y < b $$

Setting \(r=n\), the pdf of the largest (maximum) order statistic is:

$$g_n\left(y\right)=n\left[F\left(y\right)\right]^{n-1}f\left(y\right),\ a < y < b.$$

Two machines in the manufacturing industry each have an operating life (in years) \(Y\) with a pdf given by:

$$f(y)=\begin{cases}\frac{1}{200}e^{-\frac{y}{200}}, & y > 0\\ 0, &\text{otherwise}\end{cases}$$

The machines operate independently, but if one machine breaks down, the manufacturing process must be stopped.

Find the pdf of \(X\), the length of time of the manufacturing process until one machine breaks.

**Solution**

Since manufacturing stops when one machine fails, then \(X\) must be:

$$X=\min(Y_1,Y_2)$$

where \(Y_1\) and \(Y_2\) are independent random variables, each with the pdf given above. Since \(X\) is the minimum of \(n=2\) observations, with \(a=0\) and \(b=\infty\), its pdf is:

$$g_X\left(y\right)=n\left[1-F\left(y\right)\right]^{n-1}f\left(y\right),\ a < y < b$$

Now,

$$F\left(y\right)=\int_{0}^{y}\frac{1}{200}e^{-\frac{t}{200}}dt=1-e^{-\frac{y}{200}}$$

Thus,

$$\begin{align}g_{X}(y)&=2\left[1-\left(1-e^{-\frac{y}{200}}\right)\right]^{2-1}\cdot \frac{1}{200}e^{-\frac{y}{200}}\\ &=\frac{1}{100}e^{-\frac{y}{200}}\cdot e^{-\frac{y}{200}}\\ &=\frac{1}{100}e^{-\frac{y}{100}}\end{align}$$

More precisely,

$$g_{X}(y)=\begin{cases}\frac{1}{100}e^{-\frac{y}{100}}, &y > 0\\ 0, &\text{otherwise} \end{cases}$$

That is, \(X\) is exponentially distributed with a mean of 100 years.
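As a check on this result, the sketch below evaluates the minimum-order-statistic formula \(n[1-F(y)]^{n-1}f(y)\) with \(n=2\) for the exponential lifetime with mean 200, and compares it with the density of an exponential distribution with mean 100, \(\frac{1}{100}e^{-y/100}\). The function names and the test points are illustrative choices.

```python
from math import exp

MEAN = 200.0  # mean operating life of each machine, in years


def f(y):
    """pdf of one machine's operating life."""
    return exp(-y / MEAN) / MEAN if y > 0 else 0.0


def F(y):
    """cdf of one machine's operating life."""
    return 1.0 - exp(-y / MEAN) if y > 0 else 0.0


def pdf_min(y, n=2):
    """pdf of the minimum of n independent lifetimes: n * (1 - F)^(n-1) * f."""
    return n * (1.0 - F(y)) ** (n - 1) * f(y)


def closed_form(y):
    """Exponential density with mean 100, the simplified form of pdf_min for n = 2."""
    return exp(-y / 100.0) / 100.0 if y > 0 else 0.0


for y in (1.0, 50.0, 100.0, 400.0):
    # The two columns should agree up to floating-point error.
    print(y, pdf_min(y), closed_form(y))
```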

**Learning Outcome**

**Topic 3.g: Multivariate Random Variables – Determine the distribution of a transformation of jointly distributed random variables. Determine the distribution of order statistics from a set of independent random variables.**