One of the most important results in probability theory is the central limit theorem. According to the theorem, the sum of a large number of independent random variables is approximately normally distributed. The theorem provides a simple method for calculating probabilities for sums of independent random variables, and it explains why many natural populations exhibit bell-shaped (normal) distributions.
Suppose that \(X_{1}, X_{2}, \ldots, X_{n}\) are independent random variables, each normally distributed with mean \(\mu\) and variance \(\sigma^{2}\).
The sample mean \(\overline{\text{X}}=\frac{\text{X}_{1}+\text{X}_{2}+\cdots+\text{X}_{\text{n}}}{\text{n}}\) has mean \(\text{E}(\overline{\text{X}})=\mu\) and variance \(\operatorname{Var}(\overline{\text{X}})=\frac{\sigma^{2}}{\text{n}}\).
As \(n\) increases, the variance of \(\overline{\text{X}}\) decreases. Consequently, the distribution of \(\overline{\text{X}}\) depends on \(n\), and we see that we are dealing with sequences of distributions.
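To make this concrete, here is a minimal simulation sketch (assuming NumPy is available; the values of \(\mu\), \(\sigma\), and \(n\) below are illustrative, not from the text) that checks \(E(\overline{X})=\mu\) and \(\operatorname{Var}(\overline{X})=\sigma^{2}/n\) empirically:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
mu, sigma, n = 5.0, 2.0, 25      # illustrative parameters
num_samples = 100_000            # number of simulated samples of size n

# Each row is one sample of size n; average across rows to get sample means.
samples = rng.normal(loc=mu, scale=sigma, size=(num_samples, n))
xbar = samples.mean(axis=1)

print(xbar.mean())   # close to mu            (theory: E(X-bar) = mu)
print(xbar.var())    # close to sigma**2 / n  (theory: Var(X-bar) = sigma^2 / n)
```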
Now let \(X_{1}, X_{2}, \ldots, X_{n}\) be independent normal random variables with means \(\mu_{i}\) and variances \(\sigma_{i}^{2}\), respectively.
Let \(Y=c_{1} X_{1}+c_{2} X_{2}+\cdots+c_{n} X_{n}\) be a linear combination of the random variables \(X_{1}, X_{2}, \ldots, X_{n}\). It can be shown that \(Y\) is normally distributed as
$$
N\left(c_{1} \mu_{1}+c_{2} \mu_{2}+\cdots+c_{n} \mu_{n},\; c_{1}^{2} \sigma_{1}^{2}+c_{2}^{2} \sigma_{2}^{2}+\cdots+c_{n}^{2} \sigma_{n}^{2}\right)=N\left(\sum_{i=1}^{n} c_{i} \mu_{i}, \sum_{i=1}^{n} c_{i}^{2} \sigma_{i}^{2}\right)
$$
We can apply the moment generating function technique to prove the above results.
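As a quick sanity check of the linear-combination result, the following sketch (again assuming NumPy; the coefficients and parameters are made up for illustration) compares the simulated mean and variance of \(Y\) with \(\sum c_{i}\mu_{i}\) and \(\sum c_{i}^{2}\sigma_{i}^{2}\):

```python
import numpy as np

rng = np.random.default_rng(seed=1)
c = np.array([2.0, -1.0, 0.5])          # illustrative coefficients c_i
mus = np.array([1.0, 3.0, -2.0])        # means mu_i
sigmas = np.array([1.0, 0.5, 2.0])      # standard deviations sigma_i
num_samples = 200_000

# Columns are X_1, X_2, X_3; each row is one independent draw.
X = rng.normal(loc=mus, scale=sigmas, size=(num_samples, len(c)))
Y = X @ c                               # Y = c1*X1 + c2*X2 + c3*X3

print(Y.mean(), c @ mus)                # both close to sum of c_i * mu_i
print(Y.var(), (c**2) @ (sigmas**2))    # both close to sum of c_i^2 * sigma_i^2
```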
An important observation is that as \(n\) increases, \(\overline{X}\) converges to \(\mu\); that is, \(\overline{X}-\mu\) converges to 0 in a probability sense.
Now define
$$
\text{W}=\frac{\sqrt{\text{n}}}{\sigma}(\overline{\text{X}}-\mu)=\frac{\overline{\text{X}}-\mu}{\sigma / \sqrt{\text{n}}}=\frac{\text{Y}-\text{n} \mu}{\sqrt{\text{n}} \sigma},
$$
where \(Y=X_{1}+X_{2}+\cdots+X_{n}\) is the sum of a random sample of size \(n\) from some distribution with mean \(\mu\) and variance \(\sigma^{2}\).
Then, for each positive integer \(n\),
$$
E(W)=E\left[\frac{\overline{X}-\mu}{\sigma / \sqrt{n}}\right]=\frac{E(\overline{X})-\mu}{\sigma / \sqrt{n}}=\frac{\mu-\mu}{\sigma / \sqrt{n}}=0
$$
and,
$$
\operatorname{Var}(\text{W})=\text{E}\left(\text{W}^{2}\right)=\text{E}\left[\frac{(\overline{\text{X}}-\mu)^{2}}{\frac{\sigma^{2}}{\text{n}}}\right]=\frac{\text{E}\left[(\overline{\text{X}}-\mu)^{2}\right]}{\frac{\sigma^{2}}{\text{n}}}=\frac{\frac{\sigma^{2}}{\text{n}}}{\frac{\sigma^{2}}{\text{n}}}=1
$$
What happens to the distribution of \(W\) as \(n\) increases? Although \(\overline{X}-\mu\) tends to 0, the multiplying factor \(\frac{\sqrt{n}}{\sigma}\) in \(\sqrt{n}(\overline{X}-\mu) / \sigma\) grows with \(n\), spreading the probability out enough to prevent the distribution of \(W\) from degenerating to 0. If the sample comes from a normal distribution, then we know that \(\overline{X}\) is \(N\left(\mu, \frac{\sigma^{2}}{n}\right)\), and hence \(W\) is exactly \(N(0,1)\) for every positive \(n\). So if the limiting distribution of \(W\) does not depend on the underlying distribution, that limit must be \(N(0,1)\). This is precisely what the central limit theorem asserts:
If \(\overline{\text{X}}\) is the mean of a random sample \(\text{X}_{1}, \text{X}_{2}, \ldots, \text{X}_{\text{n}}\) of size \(\text{n}\) from a distribution with a finite mean \(\mu\) and a finite positive variance \(\sigma^{2}\), then the distribution of
$$
\text{W}=\frac{\overline{\text{X}}-\mu}{\frac{\sigma}{\sqrt{\text{n}}}}=\frac{\sum_{\text{i}=1}^{\text{n}} \text{X}_{\text{i}}-\text{n} \mu}{\sqrt{\text{n}} \sigma}
$$
is \(\text{N}(0,1)\) in the limit as \(\text{n} \rightarrow \infty\).
When \(n\) is “sufficiently large,” a practical use of the central limit theorem is to approximate the CDF of \(W\) by the standard normal CDF:
$$
\text{P}(\text{W} \leq \text{w}) \approx \int_{-\infty}^{\text{w}} \frac{1}{\sqrt{2 \pi}} \text{e}^{-\frac{\text{z}^{2}}{2}} \text{dz}=\Phi(\text{w})
$$
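The convergence of the CDF of \(W\) to \(\Phi\) can be seen numerically. The sketch below (assuming NumPy and SciPy are available) draws samples from an exponential distribution, which is far from normal, standardizes the sample means, and compares the empirical CDF of \(W\) with \(\Phi(w)\):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(seed=2)
n, num_samples = 50, 100_000

# Exponential(1) has mean mu = 1 and variance sigma^2 = 1, but is skewed.
mu, sigma = 1.0, 1.0
xbar = rng.exponential(scale=1.0, size=(num_samples, n)).mean(axis=1)
W = (xbar - mu) / (sigma / np.sqrt(n))  # standardized sample mean

# Empirical CDF of W versus the standard normal CDF Phi(w).
for w in (-1.5, 0.0, 1.5):
    print(w, (W <= w).mean(), norm.cdf(w))
```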
An interesting feature of the central limit theorem is that it does not matter what the distribution of the \(X_{i}\)'s is: the \(X_{i}\)'s can be discrete, continuous, or mixed random variables.
For example, suppose the \(X_{i}\)'s are Bernoulli\((p)\) random variables, so that \(E(X_{i})=p\) and \(\operatorname{Var}(X_{i})=p(1-p)\). Then \(Y_{n}=X_{1}+X_{2}+\cdots+X_{n}\) has a Binomial\((n, p)\) distribution, and by the central limit theorem,
$$
Z_{n}=\frac{Y_{n}-n p}{\sqrt{n p(1-p)}}
$$
is approximately standard normal for large \(n\).
In this example, \(Z_{n}\) is a discrete random variable, so mathematically it has a PMF, not a PDF. This is why the central limit theorem is stated in terms of convergence of the CDF of \(Z_{n}\) to the standard normal CDF, rather than convergence of a density.
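For the Bernoulli/Binomial case this convergence is easy to check directly, since the exact binomial CDF is available. A small sketch (assuming SciPy; the values of \(n\) and \(p\) are illustrative):

```python
import numpy as np
from scipy.stats import binom, norm

n, p = 100, 0.3                           # illustrative values
mu, sd = n * p, np.sqrt(n * p * (1 - p))  # mean and SD of Y_n

# Exact Binomial CDF versus the CLT (normal) approximation at a few points.
# A continuity correction, norm.cdf((y + 0.5 - mu) / sd), usually improves it.
for y in (20, 30, 40):
    print(y, round(binom.cdf(y, n, p), 4), round(norm.cdf((y - mu) / sd), 4))
```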
A common question is how large \(n\) should be before the normal approximation can be used. The answer generally depends on the distribution of the \(X_{i}\)'s; however, a widely used rule of thumb is that the normal approximation applies when \(n \geq 30\). The following steps summarize how to apply the central limit theorem (a worked sketch in code follows the list):
i. Write the random variable of interest, \(Y\), as the sum of \(n\) independent random variables \(X_{i}\):
$$
\text{Y}=\text{X}_{1}+\text{X}_{2}+\cdots+\text{X}_{\text{n}}
$$
ii. Compute \(\text{E}(\text{Y})\) and \(\operatorname{Var}(\text{Y})\) by noting that
$$
\text{E}(\text{Y})=\text{n} \mu, \quad \text { and } \quad \operatorname{Var}(\text{Y})=\text{n} \sigma^{2}
$$
where \(\mu=E(X_{i})\) and \(\sigma^{2}=\operatorname{Var}(X_{i})\).
iii. By the central limit theorem, conclude that
$$
\frac{Y-E(Y)}{\sqrt{\operatorname{Var}(Y)}}=\frac{Y-n \mu}{\sqrt{n}\, \sigma}
$$
is approximately standard normal; hence, to find \(P\left(y_{1} \leq Y \leq y_{2}\right)\), we can write
$$
\begin{align}
\text{P}\left(\text{y}_{1} \leq \text{Y} \leq \text{y}_{2}\right)&=\text{P}\left(\frac{\text{y}_{1}-\text{n} \mu}{\sqrt{\text{n}} \sigma} \leq \frac{\text{Y}-\text{n} \mu}{\sqrt{\text{n}} \sigma} \leq \frac{\text{y}_{2}-\text{n} \mu}{\sqrt{\text{n}} \sigma}\right) \\&
\approx \Phi\left(\frac{\text{y}_{2}-\text{n} \mu}{\sqrt{\text{n}} \sigma}\right)-\Phi\left(\frac{\text{y}_{1}-\text{n} \mu}{\sqrt{\text{n}} \sigma}\right)
\end{align}
$$
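The three steps above translate directly into a small helper function. This is a sketch under the stated assumptions (NumPy and SciPy available; `clt_interval_prob` is a name chosen here, not from the text):

```python
import numpy as np
from scipy.stats import norm

def clt_interval_prob(y1, y2, mu, sigma2, n):
    """Approximate P(y1 <= Y <= y2) for Y = X_1 + ... + X_n via the CLT,
    where mu = E(X_i) and sigma2 = Var(X_i)."""
    mean_y = n * mu                 # step ii: E(Y) = n * mu
    sd_y = np.sqrt(n * sigma2)      # step ii: SD(Y) = sqrt(n) * sigma
    # step iii: standardize the endpoints and apply Phi
    return norm.cdf((y2 - mean_y) / sd_y) - norm.cdf((y1 - mean_y) / sd_y)
```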
Example 1. Suppose a random sample of size \(n=30\) is taken from a distribution with mean \(\mu=18\) and variance \(\sigma^{2}=3\).
Find \(\text{P}(17.4<\overline{\text{X}}<18.5)\)
From the information given, the central limit theorem implies that \(\overline{X}\) is approximately \(N\left(18, \frac{3}{30}\right)\). Then:
$$
\begin{aligned}
P(17.4<\bar{X}<18.5)&=P\left(\frac{17.4-18}{\sqrt{\frac{3}{30}}}<\frac{\bar{X}-18}{\sqrt{\frac{3}{30}}}<\frac{18.5-18}{\sqrt{\frac{3}{30}}}\right) \\
&\approx \Phi(1.5811)-\Phi(-1.8974)=0.9430-0.0289=0.9141
\end{aligned}
$$
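Example 1 can be verified in a couple of lines (assuming SciPy); the result matches the table-based answer up to rounding:

```python
import numpy as np
from scipy.stats import norm

mu, var_x, n = 18, 3, 30
sd_xbar = np.sqrt(var_x / n)              # standard deviation of X-bar

z_lo = (17.4 - mu) / sd_xbar              # about -1.8974
z_hi = (18.5 - mu) / sd_xbar              # about  1.5811
print(norm.cdf(z_hi) - norm.cdf(z_lo))    # about 0.9142
```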
Example 2. Let \(X_{1}, X_{2}, \ldots, X_{15}\) be a random sample of size 15 from some distribution with \(E(X_{i})=\frac{1}{4}\) and \(\operatorname{Var}(X_{i})=\frac{1}{24}\) for \(i=1,2, \ldots, 15\). Let \(Y=X_{1}+X_{2}+\cdots+X_{15}\).
Find \(\text{P}(\text{Y}<4.11)\)
$$
\begin{align}
P(Y<4.11)&=P\left(\frac{Y-15\left(\frac{1}{4}\right)}{\sqrt{\frac{15}{24}}}<\frac{4.11-3.75}{\sqrt{\frac{15}{24}}}\right)=P(Z<0.4554) \\&
=\Phi(0.4554) \approx 0.6755
\end{align}
$$
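Likewise for Example 2 (assuming SciPy), this time standardizing the sum \(Y\) rather than the sample mean:

```python
import numpy as np
from scipy.stats import norm

n, mu_i, var_i = 15, 1 / 4, 1 / 24
mean_y, sd_y = n * mu_i, np.sqrt(n * var_i)  # E(Y) = 3.75, SD(Y) = sqrt(0.625)

z = (4.11 - mean_y) / sd_y                   # about 0.4554
print(norm.cdf(z))                           # about 0.6755
```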
Note that the standardization in Example 1 differs from that in Example 2, even though both follow from the central limit theorem. Example 1 works with the sample mean \(\overline{X}\), so we standardize by dividing by \(\sigma / \sqrt{n}\) (the left-hand form of \(W\)). Example 2 works with the sum \(Y\), so we subtract \(n \mu\) and divide by \(\sqrt{n}\, \sigma\) (the right-hand form).
Example 3. A company sets up a fund that will pay 10,000 upon the death of each of its 200 employees during the coming year. Each employee survives the year with probability \(98.9\%\). The fund manager wants a \(99\%\) probability that the fund will be able to cover the payouts.
Calculate the smallest amount of money that the company should put into the fund.
Let \(P\) be the total payments and \(X\) the number of deaths, so \(P=10{,}000\, X\), where \(X \sim \operatorname{Bin}(200,\, 1-0.989)\).
$$
\begin{gathered}
E(P)=10,000\, E(X)=10,000 \cdot np=10,000(200)(1-0.989)=22,000 \\\\
\operatorname{Var}(P)=10,000^{2} \operatorname{Var}(X)=10,000^{2} \cdot np(1-p)=10,000^{2}(200)(1-0.989)(0.989)=217,580,000 \\\\
\Rightarrow \operatorname{SD}(P)=\sqrt{\operatorname{Var}(P)} \approx 14,750.59
\end{gathered}
$$
Since the fund must cover the payouts with probability \(0.99\), we need the smallest fund amount \(F\) with \(\Pr(P \leq F)=0.99\). Standardizing \(P\) with the central limit theorem:
$$\begin{align}
&\Pr\left(Z \leq \frac{F-22,000}{14,750.59}\right)=0.99 \\
&\Rightarrow \Phi\left(\frac{F-22,000}{14,750.59}\right)=0.99 \\
&\Rightarrow\frac{F-22,000}{14,750.59}=\Phi^{-1}(0.99)=2.326 \\
&\therefore F=22,000+14,750.59(2.326) \approx 56,310
\end{align}$$
The value \(2.326\) is \(\Phi^{-1}(0.99)\), the \(99^{\text{th}}\) percentile of the standard normal distribution. In words: under the normal approximation, the required fund equals the expected payout plus \(2.326\) standard deviations, which places the fund at the \(99^{\text{th}}\) percentile of the approximate distribution of total payouts.
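The fund calculation can be reproduced as follows (assuming SciPy; `norm.ppf(0.99)` returns the more precise percentile 2.3263, so the result differs slightly from the table value 2.326 used above):

```python
import numpy as np
from scipy.stats import norm

n, q = 200, 1 - 0.989        # q = probability of death = 0.011
benefit = 10_000

mean_p = benefit * n * q                    # E(P) = 22,000
sd_p = benefit * np.sqrt(n * q * (1 - q))   # SD(P) ~ 14,750.59
fund = mean_p + norm.ppf(0.99) * sd_p       # 99th percentile of total payouts
print(round(fund, 2))                       # about 56,315
```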
(A standard normal table is required for this reading.)
Learning Outcome
Topic 3.i: Multivariate Random Variables – Apply the Central Limit Theorem to calculate probabilities for linear combinations of independent and identically distributed random variables