
State and apply the Central Limit Theorem

28 Jun 2019

For this learning objective, a working knowledge of the normal distribution and of how to use the Z-table is assumed.

The central limit theorem is one of the most important results in probability theory. It states that the sum of a large number of independent random variables has an approximately normal distribution. It provides a simple method for computing approximate probabilities for sums of independent random variables and helps explain the remarkable fact that the empirical frequencies of so many natural populations exhibit bell-shaped (or normal) curves.

Using the moment generating technique, we find that the mean $\bar{X}$ of a random sample of size n from a distribution with mean $\mu$ and variance $\sigma^2 > 0$ is a random variable with the properties:

$$ E(\bar{X}) = \mu \quad\text{and}\quad Var(\bar{X}) = \frac{\sigma^2}{n}$$

As n increases, the variance of $\bar{X}$ decreases. Consequently, the distribution of $\bar{X}$ clearly depends on n, and we see that we are dealing with sequences of distributions.
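The two properties above can be checked with a small simulation. The sketch below is only an illustration: it assumes a Uniform(0, 1) population (so $\mu = 1/2$ and $\sigma^2 = 1/12$), and the function name `sample_mean_moments`, the number of trials, and the seed are arbitrary choices, not part of the original text.

```python
import random
import statistics

def sample_mean_moments(n, trials=10_000, seed=0):
    """Simulate `trials` sample means of n Uniform(0, 1) draws.

    Returns the empirical mean and variance of those sample means.
    """
    rng = random.Random(seed)
    means = [statistics.fmean(rng.random() for _ in range(n)) for _ in range(trials)]
    return statistics.fmean(means), statistics.pvariance(means)

# Uniform(0, 1) has mu = 1/2 and sigma^2 = 1/12 (assumed population).
for n in (5, 20, 80):
    m, v = sample_mean_moments(n)
    print(f"n={n:3d}  mean={m:.4f}  var={v:.5f}  sigma^2/n={1 / 12 / n:.5f}")
```

The empirical mean should stay near 0.5 for every n, while the empirical variance should shrink roughly in proportion to 1/n.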

If we consider n mutually independent normal variables $X_1, X_2, \ldots, X_n$ with means $\mu_1, \mu_2, \ldots, \mu_n$ and variances $\sigma_1^2, \sigma_2^2, \ldots, \sigma_n^2$, then the linear function:

$$ Y = \sum_{i=1}^{n}c_iX_i$$

has the normal distribution:

$$ N\bigg(\sum_{i=1}^{n}c_i\mu_i,\sum_{i=1}^{n}c_{i}^2\sigma_{i}^2\bigg) $$

This can be proved by applying the moment generating technique to the linear function.
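As a sketch of that argument, recall that a $N(\mu_i,\sigma_i^2)$ variable has moment generating function $M_{X_i}(t)=\exp\left(\mu_i t+\sigma_i^2t^2/2\right)$. Independence lets the expectation of the product factor, so:

$$ M_Y(t) = E\bigg[e^{t\sum_{i=1}^{n}c_iX_i}\bigg] = \prod_{i=1}^{n}M_{X_i}(c_it) = \prod_{i=1}^{n}\exp\bigg(\mu_ic_it+\frac{\sigma_i^2c_i^2t^2}{2}\bigg) = \exp\bigg(t\sum_{i=1}^{n}c_i\mu_i+\frac{t^2}{2}\sum_{i=1}^{n}c_i^2\sigma_i^2\bigg) $$

This is the moment generating function of a normal distribution with mean $\sum_{i=1}^{n}c_i\mu_i$ and variance $\sum_{i=1}^{n}c_i^2\sigma_i^2$.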

Applying this result to $\bar{X}$ (take $c_i = 1/n$ for every i), we can note that as n increases, the probability becomes concentrated in a small interval centered at $\mu$. This means that, as n increases, $\bar{X}$ tends to converge to $\mu$, or $(\bar{X} - \mu)$ tends to converge to 0, in a probability sense.

More generally, let:

$$W=\frac{\sqrt n}{\sigma}\left(\ \bar{X}-\mu\right)=\frac{\ \bar{X}-\mu}{\sigma/\sqrt n}=\frac{Y-n\mu}{\sqrt n\sigma}$$

where $Y$ is the sum of a random sample of size n from some distribution with mean $\mu$ and variance $\sigma^2$, then, for each positive integer n,

$$ E(W) = E\bigg[\frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \bigg]= \frac{E(\bar{X})-\mu}{\sigma/\sqrt{n}}=\frac{\mu - \mu}{\sigma/\sqrt{n}}=0$$

and

$$Var(W)= E(W^2) = E\bigg[\frac{(\bar{X}-\mu)^2}{\sigma^2/n} \bigg]= \frac{E\big[(\bar{X}-\mu)^2\big]}{\sigma^2/n}=\frac{\sigma^2/n}{\sigma^2/n}=1$$

Thus, while $\bar{X}-\mu$ tends to “reduce” to 0, the factor $\sqrt{n}/\sigma$ in $\sqrt{n}(\bar{X}-\mu)/\sigma$ spreads the probability out enough to prevent this “reduction”: $W$ has mean 0 and variance 1 for every n.

But what happens to the distribution of $W$ when n increases? If the sample comes from a normal distribution, then we know that $\bar{X}$ is $N(\mu,\sigma^2/n)$, and hence $W$ is $N(0,1)$ for each positive integer n. So, in this case, the limiting distribution of $W$ is necessarily $N(0,1)$. Circling back to the original question: if the limiting distribution does not depend on the underlying distribution, the answer must be $N(0,1)$.

Now, we can state the following theorem:

The Central Limit Theorem

If $\bar{X}$ is the mean of a random sample $X_1,X_2,\cdots,X_n$ of size n from a distribution with a finite mean $\mu$ and a finite positive variance $\sigma^2$, then the distribution of:

$$ W = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{\sum_{i=1}^{n}X_i - n\mu}{\sqrt{n}\sigma}$$

is $N(0,1)$ in the limit as $n \to \infty$. When n is “sufficiently large”, a practical use of the central limit theorem is approximating the cdf of $W$:

$$P(W \leq w) \approx \int_{-\infty}^{w}\frac{1}{\sqrt{2\pi}}e^{-z^{2}/2}dz = \Phi(w).$$
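The approximation can be illustrated by simulation. The following sketch assumes an Exponential(1) population (for which $\mu = \sigma = 1$), a deliberately skewed choice. The function name `empirical_cdf_of_w`, the trial count, and the seed are illustrative assumptions.

```python
import random
from statistics import NormalDist

def empirical_cdf_of_w(w, n, trials=10_000, seed=1):
    """Estimate P(W <= w) for samples of size n from Exponential(1)."""
    rng = random.Random(seed)
    mu, sigma = 1.0, 1.0  # mean and standard deviation of Exponential(1)
    hits = 0
    for _ in range(trials):
        xbar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        if (xbar - mu) / (sigma / n ** 0.5) <= w:
            hits += 1
    return hits / trials

phi_1 = NormalDist().cdf(1.0)  # Phi(1), the limiting value
for n in (2, 10, 50):
    print(f"n={n:2d}  empirical={empirical_cdf_of_w(1.0, n):.3f}  Phi(1)={phi_1:.3f}")
```

As n grows, the empirical estimate of $P(W \leq 1)$ should settle close to $\Phi(1)$, up to simulation noise.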

An interesting thing about the Central Limit Theorem is that, as long as the mean and variance are finite, it does not matter what the distribution of the $X_i$'s is; the $X_i$'s can be discrete, continuous, or mixed random variables.

For example, assume that the $X_i$'s are Bernoulli (p) random variables. Then $E[X_i]=p,\ Var\left(X_i\right)=p(1-p)$. Also, $Y_n=X_1+X_2+\ldots+X_n$ has a Binomial (n,p) distribution. Thus,

$$Z_n=\frac{Y_n-np}{\sqrt{np\left(1-p\right)}}$$

where $Y_n\sim Binomial\ \left(n,p\right)$.

In the example, $Z_n$ is a discrete random variable; thus, mathematically, we refer to it as having a PMF and not a PDF. This is the reason why the Central Limit Theorem states that the CDF, and not the PDF, of $Z_n$ converges to the standard normal CDF.
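To see the CDF convergence in the Bernoulli case, the exact Binomial CDF can be compared with the normal approximation. This is a minimal sketch: the values $n = 100$ and $p = 0.3$ are illustrative assumptions, the helper names are hypothetical, and no continuity correction is applied.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_cdf(k, n, p):
    """Exact P(Y_n <= k) for Y_n ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))

def clt_cdf(k, n, p):
    """CLT approximation Phi((k - np) / sqrt(np(1 - p))), no continuity correction."""
    return NormalDist().cdf((k - n * p) / sqrt(n * p * (1 - p)))

n, p = 100, 0.3  # illustrative values, not from the text
for k in (25, 30, 35):
    print(f"k={k}  exact={binom_cdf(k, n, p):.4f}  normal approx={clt_cdf(k, n, p):.4f}")
```

The two columns differ by a few percentage points at this n; the gap comes from approximating a step-function CDF with a smooth one.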

A common question is how large n should be before the normal approximation can be used. The answer generally depends on the distribution of the $X_i$'s. However, a commonly stated rule of thumb is that the normal approximation applies if $n\geq30$.

Steps on How to Apply the Central Limit Theorem (CLT)

Step 1: Write the random variable of interest, $Y$, as the sum of n independent random variables $X_i$'s:

$$Y=X_1+X_2+\ldots+X_n$$

Step 2: Compute $E(Y)$ and $Var(Y)$ by noting that:

$$ E\left(Y\right)=n\mu, \text{ and } Var\left(Y\right)=n\sigma^2$$

where $\mu=E(X_i)$ and $\sigma^2=Var(X_i)$.

Step 3: As per the Central Limit Theorem, conclude that $\frac{Y-E(Y)}{\sqrt{Var\left(Y\right)}}=\frac{Y-n\mu}{\sqrt n\sigma}$ is approximately standard normal.

Hence, to find $P\left(y_1\le Y\le y_2\right)$, we can write,

$$P\left(y_1\le Y\le y_2\right)=P\left(\frac{y_1-n\mu}{\sqrt n\sigma}\le\frac{Y-n\mu}{\sqrt n\sigma}\le\frac{y_2-n\mu}{\sqrt n\sigma}\right)$$

which is approximately:

$$P\left(y_1\le Y\le y_2\right)\approx\Phi\left(\frac{y_2-n\mu}{\sqrt n\sigma}\right)-\Phi\left(\frac{y_1-n\mu}{\sqrt n\sigma}\right)$$
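The three steps can be wrapped in a small helper. This is a sketch under the assumptions of the theorem (independent, identically distributed terms with finite variance); the function name `clt_interval_prob` and the sample inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def clt_interval_prob(y1, y2, n, mu, sigma2):
    """Approximate P(y1 <= Y <= y2) for Y = X_1 + ... + X_n via the CLT."""
    mean_y = n * mu            # Step 2: E(Y) = n * mu
    sd_y = sqrt(n * sigma2)    # Step 2: sqrt(Var(Y)) = sqrt(n) * sigma
    z = NormalDist()           # Step 3: the standardized sum is approximately N(0, 1)
    return z.cdf((y2 - mean_y) / sd_y) - z.cdf((y1 - mean_y) / sd_y)

# Hypothetical inputs: 50 terms, each with mean 2 and variance 9.
print(round(clt_interval_prob(90, 110, n=50, mu=2, sigma2=9), 4))
```

Passing the variance $\sigma^2$ rather than $\sigma$ mirrors how the examples in this section state their inputs.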

Example: Central Limit Theorem #1

Let $\bar{X}$ be the mean of a random sample of size $n = 30$ from a distribution with mean $\mu = 18$ and variance $\sigma^2 = 3$. Approximate $P(17.4 < \bar{X} < 18.5)$.

Solution

From the information given, $\bar{X}$ has an approximate $N(18,3/30)$ distribution. We can compute probabilities such as:

\begin{align} P(17.4 < \bar{X} < 18.5) & = P\bigg(\frac{17.4-18}{\sqrt{3/30}} < \frac{\bar{X}-18}{\sqrt{3/30}} < \frac{18.5-18}{\sqrt{3/30}}\bigg)\\ & \approx \Phi(1.58) - \Phi(-1.90) = 0.94295-0.02872 = 0.9142 \end{align}
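The arithmetic in this example can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist` in place of a Z-table; the variable names are illustrative. The standardized bounds come out near $-1.897$ and $1.581$, which round to the table entries $-1.90$ and $1.58$.

```python
from math import sqrt
from statistics import NormalDist

mu, var, n = 18, 3, 30
se = sqrt(var / n)                    # standard deviation of the sample mean
z_lo = (17.4 - mu) / se               # lower standardized bound
z_hi = (18.5 - mu) / se               # upper standardized bound
prob = NormalDist().cdf(z_hi) - NormalDist().cdf(z_lo)
print(round(z_lo, 3), round(z_hi, 3), round(prob, 4))
```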

Example: Central Limit Theorem #2

Let $X_1,X_2,\cdots, X_{15}$ be a random sample of size 15 from a distribution with $E(X_i) =\frac{1}{4}$ and $Var(X_i) = \frac{1}{24}$ for $i=1,2,\cdots, 15$.

If $Y = X_1 + X_2 + \cdots + X_{15}$, approximate $P(Y \leq 4.11)$.

Solution

\begin{align} P(Y \leq 4.11) & = P\bigg(\frac{Y - 15(1/4)}{\sqrt{15/24}}\leq \frac{4.11 - 3.75}{\sqrt{15/24}}\bigg) = P(W \leq 0.455)\\ & \approx \Phi(0.455) = 0.676 \end{align}
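As a quick numerical check of this calculation, the sum form of $W$ can be evaluated directly. This is a minimal sketch using the standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, mu, var = 15, 1 / 4, 1 / 24
w = (4.11 - n * mu) / sqrt(n * var)   # standardized value of the sum Y
prob = NormalDist().cdf(w)
print(round(w, 3), round(prob, 3))
```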

Notice how the formula in Example 1 is different from the formula in Example 2. In Example 1, the quantity of interest is the sample mean $\bar{X}$, so we use the left side of the expression for $W$, $(\bar{X}-\mu)/(\sigma/\sqrt{n})$. In Example 2, the quantity of interest is the sum $Y$ of the n observations, so we use the right side, $(Y-n\mu)/(\sqrt{n}\sigma)$. The two forms are algebraically equivalent.

Example: Central Limit Theorem #3

A company pays a benefit of 10,000 for each of its 200 employees who dies. The probability of death for each employee is 1.1%, and deaths are assumed to be independent. The fund is to be built so that there is a 99% probability that it will handle the payouts.

Calculate the smallest amount of money that the company should put into the fund.

Solution

Let $P$ be the total payment and $X$ the number of deaths, so that $P = 10,000X$, where $X \sim Bin(200,0.011)$.

$$E(P) = E(10,000X)= 10,000\cdot n\cdot p =10,000(200)(0.011)=22,000 $$

$$Var(P) = Var(10,000X) = 10,000^2\cdot n\cdot p \cdot (1-p)= 10,000^2(200)(0.011)(1-0.011)=217,580,000$$

$$Sd(P)=\sqrt{Var(P)}= 14,750.60$$

Let $F$ denote the amount in the fund. Since there must be a probability of at least 0.99 that the fund will be able to handle the payout, we need $Pr(P\le F)=0.99$. Standardizing $P$ gives:

$$Pr\left(Z\le\frac{F-22,000}{14,750.60}\right)=0.99$$

Thus,

$$\Rightarrow\Phi\left(\frac{F-22,000}{14,750.60}\right)=0.99$$

Equivalently,

$$\frac{F-22,000}{14,750.60}=\Phi^{-1}(0.99)$$

$$\therefore F=22,000+14,750.60\left(2.326\right)=56,309.90$$

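The value $2.326$ is $\Phi^{-1}(0.99)$, the 99th percentile of the standard normal distribution. Using it here is where the Central Limit Theorem is applied: the standardized total payment is treated as approximately standard normal.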
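The same calculation can be reproduced in a few lines. In this sketch the variable names are illustrative, and `NormalDist().inv_cdf(0.99)` stands in for the Z-table lookup. Because it returns the percentile to more decimal places than the table value 2.326, the resulting fund size differs slightly from the hand calculation.

```python
from math import sqrt
from statistics import NormalDist

benefit, n, p = 10_000, 200, 0.011
mean_pay = benefit * n * p                  # expected total payment
sd_pay = benefit * sqrt(n * p * (1 - p))    # standard deviation of total payment
z99 = NormalDist().inv_cdf(0.99)            # 99th percentile of N(0, 1)
fund = mean_pay + z99 * sd_pay              # smallest fund covering payouts w.p. 0.99
print(round(mean_pay, 2), round(sd_pay, 2), round(z99, 3), round(fund, 2))
```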

Learning Outcome

Topic 3.i: Multivariate Random Variables – State and apply the Central Limit Theorem.
