*This chapter assumes some knowledge of the normal distribution and of how to use a table for the normal distribution.*

The central limit theorem is one of the most important results in probability theory. It states that the sum of a large number of independent random variables has a distribution that is approximately normal. It not only provides a simple method for computing approximate probabilities for sums of independent random variables but also helps explain the remarkable fact that the empirical frequencies of so many natural populations exhibit bell-shaped (or normal) curves.

After analyzing the moment generating technique, we have found that the mean \(\bar{X}\) of a random sample of size *n* from a distribution with mean \(\mu\) and variance \(\sigma^2 > 0\) is a random variable with the properties that

$$ E(\bar{X}) = \mu \quad\text{and}\quad Var(\bar{X}) = \frac{\sigma^2}{n}.$$

As *n* increases, the variance of \(\bar{X}\) decreases. Consequently, the distribution of \(\bar{X}\) clearly depends on *n*, and we see that we are dealing with sequences of distributions.
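The two properties above can be checked empirically. The following is a minimal sketch, assuming an Exponential(1) population (so \(\mu = 1\) and \(\sigma^2 = 1\)); the population, the function name `sample_mean_stats`, the number of trials, and the seed are illustrative choices, not part of the original text.

```python
import random
import statistics


def sample_mean_stats(n, trials=5000, seed=0):
    """Simulate `trials` sample means of n Exponential(1) draws.

    Returns the average and the variance of the simulated sample means.
    Exponential(1) is an assumed example population with mean 1, variance 1.
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(trials)
    ]
    return statistics.fmean(means), statistics.pvariance(means)


for n in (5, 20, 80):
    m, v = sample_mean_stats(n)
    # The average of X-bar should stay near mu = 1,
    # while its variance should track sigma^2 / n = 1 / n.
    print(f"n={n:3d}  mean={m:.3f}  var={v:.4f}  sigma^2/n={1 / n:.4f}")
```

The simulated variance should shrink roughly in proportion to \(1/n\), while the simulated mean stays close to \(\mu\).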

If \(X_1, X_2, \cdots, X_n\) are mutually independent normal variables with means \(\mu_1, \mu_2, \cdots, \mu_n\) and variances \(\sigma_1^2, \sigma_2^2, \cdots, \sigma_n^2\), respectively, then the linear function

$$ Y = \sum_{i=1}^{n}c_iX_i$$

has the normal distribution

$$ N\bigg(\sum_{i=1}^{n}c_i\mu_i,\sum_{i=1}^{n}c_{i}^2\sigma_{i}^2\bigg). $$

This can be proved by applying the moment generating technique to the linear function.
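As a sketch of that argument, recall that the moment generating function of \(X_i \sim N(\mu_i, \sigma_i^2)\) is \(M_{X_i}(t) = \exp(\mu_i t + \sigma_i^2 t^2/2)\). Using independence to factor the expectation,

\begin{align} M_Y(t) & = E\Big[e^{t\sum_{i=1}^{n}c_iX_i}\Big] = \prod_{i=1}^{n}E\big[e^{(c_it)X_i}\big] = \prod_{i=1}^{n}M_{X_i}(c_it)\\ & = \prod_{i=1}^{n}\exp\bigg(\mu_ic_it + \frac{\sigma_i^2c_i^2t^2}{2}\bigg) = \exp\bigg(\Big(\sum_{i=1}^{n}c_i\mu_i\Big)t + \Big(\sum_{i=1}^{n}c_i^2\sigma_i^2\Big)\frac{t^2}{2}\bigg), \end{align}

which is the moment generating function of a normal distribution with mean \(\sum c_i\mu_i\) and variance \(\sum c_i^2\sigma_i^2\).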

From these results we can see that, as \(n\) increases, the probability becomes concentrated in a small interval centered at \(\mu\). That is, as \(n\) increases, \(\bar{X}\) tends to converge to \(\mu\), or \((\bar{X} - \mu)\) tends to converge to 0, in a probability sense.

In general, if we let

$$ W = \frac{\sqrt{n}}{\sigma}(\bar{X}-\mu) = \frac{\bar{X}-\mu}{\sigma/\sqrt{n}}=\frac{Y - n\mu}{\sqrt{n}\sigma},$$

where Y is the sum of a random sample of size \(n\) from some distribution with mean \(\mu\) and variance \(\sigma^2\), then, for each positive integer \(n\),

$$ E(W) = E\bigg[\frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \bigg]= \frac{E(\bar{X})-\mu}{\sigma/\sqrt{n}}=\frac{\mu - \mu}{\sigma/\sqrt{n}}=0$$

and

$$Var(W)= E(W^2) = E\bigg[\frac{(\bar{X}-\mu)^2}{\sigma^2/n} \bigg]= \frac{E\big[(\bar{X}-\mu)^2\big]}{\sigma^2/n}=\frac{\sigma^2/n}{\sigma^2/n}=1.$$

As \(\bar{X}-\mu\) shrinks toward 0, the factor \(\sqrt{n}/\sigma\) in \(W = \sqrt{n}(\bar{X}-\mu)/\sigma\) grows just fast enough to keep the distribution of \(W\) from collapsing to a single point. So what happens to the distribution of \(W\) as \(n\) increases? If the sample comes from a normal distribution, then \(\bar{X}\) is \(N(\mu,\sigma^2/n)\), and hence \(W\) is \(N(0,1)\) for every positive integer \(n\); in that case the limiting distribution of \(W\) is necessarily \(N(0,1)\). Therefore, if the limiting distribution does not depend on the underlying distribution, it must be \(N(0,1)\).

With this we can give the following theorem:

**Central Limit Theorem** If \(\bar{X}\) is the mean of a random sample \(X_1,X_2,\cdots,X_n\) of size \(n\) from a distribution with a finite mean \(\mu\) and a finite positive variance \(\sigma^2\), then the distribution of

$$ W = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{\sum_{i=1}^{n}X_i - n\mu}{\sqrt{n}\sigma}$$

is \(N(0,1)\) in the limit as \(n \to \infty\). When \(n\) is “sufficiently large”, a practical use of the central limit theorem is approximating the cdf of \(W\):

$$P(W \leq w) \approx \int_{-\infty}^{w}\frac{1}{\sqrt{2\pi}}e^{-z^{2}/2}dz = \Phi(w).$$

The proof of the Central Limit Theorem is based on moment generating functions, applied to the scaled terms \((X_i-\mu)/(\sigma\sqrt{n})\) whose sum is \(W\). The proof is not part of this text, but the interested reader can look it up; it is not too complex and can be followed with a working knowledge of calculus.

Now let’s see some examples:

**Example 1:** Let \(\bar{X}\) be the mean of a random sample of size \(n = 30\) from a distribution with mean \(\mu = 18\) and variance \(\sigma^2 = 3\). Then \(\bar{X}\) has an approximate \(N(18,3/30)\) distribution, and one can compute probabilities such as:

\begin{align} P(17.4 < \bar{X} < 18.5) & = P\bigg(\frac{17.4-18}{\sqrt{3/30}} < \frac{\bar{X}-18}{\sqrt{3/30}} < \frac{18.5-18}{\sqrt{3/30}}\bigg)\\ & \approx \Phi(1.581) - \Phi(-1.897) = 0.9431-0.0289 = 0.9142 \end{align}
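The probability in Example 1 can also be evaluated numerically rather than from a table. This is a small sketch using `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Example 1: mu = 18, sigma^2 = 3, n = 30, so X-bar is approximately N(18, 3/30).
mu, sigma2, n = 18, 3, 30
xbar_dist = NormalDist(mu=mu, sigma=sqrt(sigma2 / n))

# Standardized endpoints, as they would be looked up in a normal table.
z_low = (17.4 - mu) / sqrt(sigma2 / n)
z_high = (18.5 - mu) / sqrt(sigma2 / n)

# P(17.4 < X-bar < 18.5) under the normal approximation.
prob = xbar_dist.cdf(18.5) - xbar_dist.cdf(17.4)
print(f"z_low={z_low:.3f}  z_high={z_high:.3f}  prob={prob:.4f}")
```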

Simulating the Central Limit Theorem is instructive: it shows what the theorem actually says and why it is so important in probability. We recommend that the reader look up some simulations and try a few independently.
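One possible simulation is sketched below. It assumes an Exponential(1) population (a deliberately skewed choice, with \(\mu = \sigma = 1\)) and compares the empirical cdf of \(W\) with \(\Phi\) at a few points. The helper names `simulate_w` and `empirical_cdf`, the trial count, and the seed are hypothetical choices for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def simulate_w(n, trials=5000, seed=1):
    """Return `trials` simulated values of W = (X-bar - mu) / (sigma / sqrt(n)).

    Each X-bar is the mean of n draws from Exponential(1), an assumed
    example population with mu = 1 and sigma = 1.
    """
    rng = random.Random(seed)
    mu, sigma = 1.0, 1.0
    return [
        (fmean(rng.expovariate(1.0) for _ in range(n)) - mu) / (sigma / sqrt(n))
        for _ in range(trials)
    ]


def empirical_cdf(values, w):
    """Fraction of simulated values that are <= w."""
    return sum(v <= w for v in values) / len(values)


points = (-1.0, 0.0, 1.0)
for n in (1, 5, 30):
    ws = simulate_w(n)
    row = "  ".join(f"{empirical_cdf(ws, w):.3f}" for w in points)
    print(f"n={n:2d}: {row}")

phi = NormalDist().cdf
print("Phi : " + "  ".join(f"{phi(w):.3f}" for w in points))
```

For \(n = 1\) the empirical values are far from \(\Phi\) because the population is skewed; as \(n\) grows, the rows should move toward the last line.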

**Example 2:** Let \(X_1,X_2,\cdots, X_{15}\) be a random sample of size 15 from a distribution with \(E(X_i) = 1/4\) and \(Var(X_i) = 1/24\) for \(i=1,2,\cdots, 15\). If \(Y = X_1 + X_2 + \cdots + X_{15}\), then

\begin{align} P(Y \leq 4.11) & = P\bigg(\frac{Y - 15(1/4)}{\sqrt{15/24}}\leq \frac{4.11 - 3.75}{\sqrt{15/24}}\bigg) = P(W \leq 0.455)\\ & \approx \Phi(0.455) = 0.676 \end{align}
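The same calculation for Example 2 can be sketched in a few lines, this time standardizing the sum \(Y\) rather than the sample mean. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Example 2: n = 15, E(X_i) = 1/4, Var(X_i) = 1/24.
n, mu, sigma2 = 15, 1 / 4, 1 / 24

# Standardize the sum: W = (Y - n*mu) / sqrt(n * sigma^2).
w = (4.11 - n * mu) / sqrt(n * sigma2)

# Normal approximation to P(Y <= 4.11).
prob = NormalDist().cdf(w)
print(f"w={w:.3f}  prob={prob:.3f}")
```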

Notice that the formula used in Example 1 differs from the one used in Example 2. The statement of the central limit theorem explains why. Example 1 concerns the sample mean \(\bar{X}\), so we standardize with the left-hand expression, \((\bar{X}-\mu)/(\sigma/\sqrt{n})\). Example 2 concerns the sum \(Y = \sum_{i=1}^{n}X_i\), so we standardize with the right-hand expression, \((Y - n\mu)/(\sqrt{n}\sigma)\). The two expressions are algebraically equal.

**Example 3:** A company pays a death benefit of \(10000\) for each of its \(200\) employees. The probability of survival for each employee is \(98.9\%\). The person who built the fund wants a probability of at least \(0.99\) that the fund will be able to cover the payout. Calculate the smallest amount of money that the company should put into the fund.

Let \(P\) be the total payment and \(X\) the number of deaths, so that \(P = 10000X\), where \( X \sim Bin(200,1-0.989)\).

$$E(P) = E(10000X)= 10000\,E(X) =10000(200)(1-0.989)=22000 $$

$$Var(P) = Var(10000X) = 10000^2\,Var(X)= 10000^2(200)(1-0.989)(0.989)=217,580,000$$

$$Sd(P)=\sqrt{Var(P)}\approx 14750.59$$

Then the smallest sufficient fund, which is the 99th percentile of \(P\), is approximately \(22000+14750.59(2.326)=56309.87\).

The value \(2.326\) is the 99th percentile of the standard normal distribution, \(\Phi^{-1}(0.99)\). Since \(X\) is a sum of \(200\) independent Bernoulli variables, the Central Limit Theorem lets us treat \(P\) as approximately normal, so its 99th percentile is its mean plus \(2.326\) standard deviations.
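The calculation in Example 3 can be reproduced as follows. This is a sketch under the same normal approximation; it uses `NormalDist().inv_cdf` from the Python standard library to obtain the percentile instead of a table, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Example 3: 200 employees, benefit 10000 each, death probability 1 - 0.989.
benefit, n, p_death = 10_000, 200, 1 - 0.989

# Mean and standard deviation of the total payment P = 10000 * X,
# where X ~ Bin(n, p_death).
mean_p = benefit * n * p_death
sd_p = benefit * sqrt(n * p_death * (1 - p_death))

# 99th percentile of the standard normal, i.e. the inverse of Phi at 0.99.
z99 = NormalDist().inv_cdf(0.99)

# Normal-approximation 99th percentile of P.
fund = mean_p + z99 * sd_p
print(f"z99={z99:.3f}  mean={mean_p:.0f}  sd={sd_p:.2f}  fund={fund:.2f}")
```

Because the code uses the unrounded percentile rather than the table value \(2.326\), the resulting amount is slightly larger than the hand calculation.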

**Learning Outcome**

**Topic 3.i: Multivariate Random Variables – State and apply the Central Limit Theorem.**