Understanding Test Statistics
A test statistic is a standardized value computed from sample information when testing a hypothesis about a population parameter. The distribution of such a statistic under the null hypothesis is often justified by the central limit theorem.
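For instance (a standard illustration, not specific to this reading), when testing \(H_0: \mu = \mu_0\) with a known population standard deviation σ, a common test statistic is:
$$ z = \frac{\bar X - \mu_0}{\sigma / \sqrt{n}} $$
which, by the central limit theorem discussed next, is approximately standard normal when n is large.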
The central limit theorem asserts that if we take simple random samples, each of size n, from a population with mean μ and variance \(\sigma^2\), the sample mean \(\bar X\) is approximately normally distributed with mean μ and variance \(\sigma^2/n\) as n (the sample size) becomes large.
Suppose we have a sequence of independent and identically distributed variables \(X_1, X_2, X_3, \ldots, X_n\) with finite mean μ and non-zero variance \(\sigma^2\). Then the distribution of \( \frac{\bar X - \mu}{\sigma / \sqrt{n}} \) approaches the standard normal distribution as n approaches infinity, i.e., as \(n \rightarrow \infty\).
Remember that \(\bar X = \frac{1}{n} \sum_{i=1}^{n} X_i\).
The central limit theorem provides very useful normal approximations to some common distributions including the binomial and Poisson distributions.
Note that while \(\bar X\) is approximately normally distributed with mean μ and variance \(\sigma^2/n\), \(\sum X_i\) is approximately normally distributed with mean \(n\mu\) and variance \(n\sigma^2\). In fact, we can show that both the mean and the variance are exact; it is only the shape of the distribution that is approximate.
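As a quick illustration, here is a minimal simulation sketch (my own, not part of the reading): for i.i.d. draws from a skewed exponential population, the simulated sample means have mean close to μ and variance close to \(\sigma^2/n\), and the standardized means behave approximately like a standard normal variable.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Exponential population with scale 2: mean mu = 2, variance sigma^2 = 4 (skewed, non-normal).
mu, sigma2 = 2.0, 4.0
n, trials = 30, 100_000      # sample size and number of simulated samples

samples = rng.exponential(scale=2.0, size=(trials, n))
sample_means = samples.mean(axis=1)

print("mean of sample means:    ", sample_means.mean())   # close to mu = 2.0 (exact in theory)
print("variance of sample means:", sample_means.var())    # close to sigma^2 / n = 0.1333
z = (sample_means - mu) / np.sqrt(sigma2 / n)
print("P(Z <= 1.96) in simulation:", np.mean(z <= 1.96))  # close to 0.975 under N(0, 1)
```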
How large must n be for the approximation to hold? Although the widely accepted threshold is n ≥ 30, this answer may be too simple. In truth, the required value of n depends on the shape of the population involved, i.e., the distribution of the \(X_i\) and its skewness.
For a non-normal but fairly symmetric distribution, n = 10 can be considered large enough. For a very skewed distribution, n may need to be 50 or even more.
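The following rough sketch (my own illustration, with arbitrarily chosen distributions) makes this concrete: with n = 10, simulated sample means from a fairly symmetric uniform population show almost no skew, while means from a heavily skewed lognormal population remain clearly skewed.

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(seed=0)
n, trials = 10, 100_000

# Fairly symmetric population: uniform on [0, 1].
uniform_means = rng.uniform(0.0, 1.0, size=(trials, n)).mean(axis=1)
# Heavily skewed population: lognormal with sigma = 1.5.
lognormal_means = rng.lognormal(mean=0.0, sigma=1.5, size=(trials, n)).mean(axis=1)

print("skewness of means, uniform population:  ", skew(uniform_means))    # near 0
print("skewness of means, lognormal population:", skew(lognormal_means))  # still clearly positive
```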
Suppose we have a set of independent and identically distributed variables \(X_i\), \(i = 1, \ldots, n\), such that:
\(\sum X_i \sim \text{binomial}(n,\theta)\). Applying the CLT, for large n:
\(\bar X \sim N \left( \mu, \frac{\sigma^2}{n} \right)\) and \(\sum X_i \sim N(n\mu, n\sigma^2)\)
Additionally, note that a binomial random variable is the sum of a sequence of Bernoulli variables, each with mean θ and variance θ(1 − θ).
Therefore, applying the CLT:
\(\sum X_i \sim N(n\theta, n\theta(1 - \theta))\), which is the normal approximation to the binomial distribution. Here, n = 10 is generally considered large enough for the CLT to apply.
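As a quick numerical check of this approximation (my own sketch; the values of n, θ, and k are arbitrary, and the 0.5 continuity correction is a standard refinement not covered in the reading), we can compare the exact binomial CDF with the approximating normal CDF:

```python
from scipy.stats import binom, norm

n, theta = 40, 0.5        # hypothetical parameters chosen for illustration
k = 23

exact = binom.cdf(k, n, theta)
# Normal approximation N(n*theta, n*theta*(1 - theta)), with a 0.5 continuity correction.
approx = norm.cdf(k + 0.5, loc=n * theta, scale=(n * theta * (1 - theta)) ** 0.5)

print("exact binomial P(sum <= 23):     ", exact)
print("normal approximation to the same:", approx)
```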
Under the Poisson distribution, \(\mu = \sigma^2 = \lambda\).
Therefore, applying the CLT:
$$ \sum X_i \sim N(n\lambda, n \lambda) $$
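A similar quick check can be made for the Poisson case (again my own sketch with arbitrary parameters), using the fact that the sum of n independent Poisson(λ) variables is exactly Poisson(nλ):

```python
from scipy.stats import norm, poisson

n, lam = 30, 2.0          # hypothetical parameters; the sum of 30 i.i.d. Poisson(2) variables is Poisson(60)
k = 65

exact = poisson.cdf(k, mu=n * lam)
# Normal approximation N(n*lambda, n*lambda), with a 0.5 continuity correction.
approx = norm.cdf(k + 0.5, loc=n * lam, scale=(n * lam) ** 0.5)

print("exact Poisson P(sum <= 65):      ", exact)
print("normal approximation to the same:", approx)
```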
The central limit theorem is a very useful tool, especially in the construction of confidence intervals and the testing of hypotheses. As long as n is “sufficiently large,” the sampling distribution of the mean (or sum) drawn from just about any non-normal population can be approximated as normal.
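As a final sketch (my own, with an arbitrarily chosen exponential population), the simulation below illustrates the confidence-interval use mentioned above: a CLT-based 95% interval, \(\bar X \pm 1.96\, s/\sqrt{n}\), covers the true mean close to 95% of the time even though the population is skewed.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
true_mean, n, trials = 2.0, 50, 10_000   # hypothetical population mean, sample size, and repetitions

covered = 0
for _ in range(trials):
    x = rng.exponential(scale=true_mean, size=n)      # skewed, non-normal population
    half_width = 1.96 * x.std(ddof=1) / np.sqrt(n)    # CLT-based 95% margin of error
    if x.mean() - half_width <= true_mean <= x.mean() + half_width:
        covered += 1

print("empirical coverage of the 95% confidence interval:", covered / trials)  # close to 0.95
```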
Reading 10 LOS 10e:
Explain the central limit theorem and its importance.