The central limit theorem asserts that “given a population described by any probability distribution having mean \(\mu\) and finite variance \(\sigma^2\), the sampling distribution of the sample mean \(\bar{X}\) computed from random samples of size \(n\) from this population will be approximately normal with mean \(\mu\) (the population mean) and variance \(\frac{\sigma^2}{n}\) (the population variance divided by \(n\)) when the sample size \(n\) is large”.
How large must \(n\) be for this approximation to hold? There is no single answer. A widely used rule of thumb is \(n \geq 30\), but the required value of \(n\) depends on the shape of the population involved, i.e., the distribution of \(X_i\) and, in particular, its skewness.
For a non-normal but fairly symmetric distribution, \(n = 10\) can be considered large enough. For a very skewed distribution, \(n\) may need to be 50 or even more.
Remember that from the central limit theorem, the variance of the sample mean distribution is given by:
$$ \sigma_{\bar{X}}^2= \frac{\sigma^2}{n} $$
Where \(n\) is the sample size.
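The variance result can be checked numerically. The sketch below is only an illustration, not part of the original material: it assumes a skewed Exponential(1) population (mean 1, variance 1), a sample size of 30, and a hypothetical helper named `simulate_sample_means`. It compares the variance of the simulated sample means with \(\frac{\sigma^2}{n}\).

```python
import random
import statistics


def simulate_sample_means(n, trials=5000, seed=0):
    """Draw `trials` samples of size n from an Exponential(1) population
    (an assumed, skewed population) and return their sample means."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    means = []
    for _ in range(trials):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


n = 30
means = simulate_sample_means(n)

observed_mean = statistics.fmean(means)     # should be near mu = 1
observed_var = statistics.pvariance(means)  # should be near sigma^2 / n
theoretical_var = 1.0 / n                   # sigma^2 = 1 for Exponential(1)

print(f"mean of sample means:     {observed_mean:.4f}")
print(f"variance of sample means: {observed_var:.4f}")
print(f"sigma^2 / n:              {theoretical_var:.4f}")
```

Rerunning with a smaller `n` on the same skewed population is one way to see why a heavily skewed distribution needs a larger sample before the normal approximation becomes reasonable.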
The standard error is the standard deviation of a statistic, in this case the sample mean. Formally, for a sample mean \(\bar{X}\) computed from a sample of size \(n\) drawn from a population with standard deviation \(\sigma\), the standard error of the sample mean is given by:
$$ \sigma_{\bar{X}}=\frac{\sigma}{\sqrt n} $$
Where \(\sigma\) = Known population standard deviation.
When the population standard deviation, \(\sigma\), is unknown, the standard error of the sample mean is estimated from the sample standard deviation. The estimate is denoted \(s_{\bar{X}}\):
$$ s_{\bar{X}}=\frac{s}{\sqrt n} $$
Where: \(s\) = Sample standard deviation.
The sample standard deviation is the square root of the sample variance, \(s^2\), given by:
$$ \begin{align*}
s^2 & =\frac{\sum_{i=1}^{n}\left(X_i-\bar{X}\right)^2}{n-1} \\
\Rightarrow s & =\sqrt{\frac{\sum_{i=1}^{n}\left(X_i-\bar{X}\right)^2}{n-1}} \end{align*} $$
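As a minimal sketch of the estimate above, the following computes \(s\) with the \(n-1\) denominator and divides by \(\sqrt{n}\). The function name `standard_error` and the five wage observations are hypothetical and chosen purely for illustration.

```python
import math
import statistics


def standard_error(sample):
    """Estimate the standard error of the sample mean as s / sqrt(n),
    where s is the sample standard deviation (n - 1 denominator)."""
    n = len(sample)
    if n < 2:
        raise ValueError("at least two observations are needed to compute s")
    s = statistics.stdev(sample)  # sample standard deviation, divides by n - 1
    return s / math.sqrt(n)


# Hypothetical hourly wages, not taken from the text.
wages = [10.0, 12.0, 14.0, 11.0, 13.0]

print(f"sample mean:    {statistics.fmean(wages):.2f}")
print(f"sample std dev: {statistics.stdev(wages):.4f}")
print(f"standard error: {standard_error(wages):.4f}")
```

Note that `statistics.stdev` uses the \(n-1\) denominator, whereas `statistics.pstdev` divides by \(n\) and would be the wrong choice for this estimate.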
The standard error of the sample mean estimates the variation that would occur if you took multiple samples from the same population. While the standard deviation measures variation within one sample, the standard error estimates variation across many samples. So, standard deviation and standard error are distinct concepts.
The standard error of the sample mean gives analysts an idea of how precisely the sample mean estimates the population mean. A lower standard error value indicates a more precise estimation of the population mean. On the other hand, a larger standard error value indicates a less precise estimate of the population mean.
It is also important to note that the standard error becomes smaller as the sample size increases. This follows directly from the formula, where \(\sqrt{n}\) appears in the denominator: the means of larger samples cluster more tightly around the true value of the population mean. Because the relationship involves a square root, the sample size must be quadrupled to halve the standard error.
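The effect of sample size can be tabulated directly from \(\frac{\sigma}{\sqrt n}\). The sketch below assumes a known population standard deviation of 3 and an arbitrary set of sample sizes. The helper name `se_known_sigma` is hypothetical.

```python
import math


def se_known_sigma(sigma, n):
    """Standard error of the sample mean when sigma is known: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sigma / math.sqrt(n)


sigma = 3.0  # assumed population standard deviation

# The standard error shrinks as n grows, but only with the square root of n.
for n in (10, 30, 120, 480):
    print(f"n = {n:>3}: standard error = {se_known_sigma(sigma, n):.4f}")
```

Each step from 30 to 120 to 480 multiplies the sample size by four and halves the standard error.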
Example 1
In a certain property investment company with an international presence, workers have a mean hourly wage of $12 with a population standard deviation of $3. Given a sample size of 30, the standard error of the sample mean is closest to:
$$ \begin{align*}
\sigma_{\bar{X}} & =\frac{\sigma}{\sqrt n} \\
& =\frac{3}{\sqrt{30}} \approx \$0.55 \end{align*} $$
If we were to draw many samples of size 30 from the employee population and construct a sampling distribution of the sample means, its mean would be approximately $12 and its standard deviation (the standard error) approximately $0.55.
Example 2
A sample of the 30 latest returns on XYZ stock reveals a mean return of $4 with a sample standard deviation of $0.13. The standard error of the sample mean is closest to:
$$ \begin{align*}
s_{\bar{X}} & =\frac{s}{\sqrt n} \\
& =\frac{0.13}{\sqrt{30}} \approx \$0.02 \end{align*} $$
If we were to draw many more samples of size 30 from the population of yearly returns on XYZ stock and construct a sampling distribution of the sample means, its mean would be approximately $4 and its standard error approximately $0.02.
Question
Emma Johnson wants to know how finance analysts performed last year. Johnson assumes that the population cross-sectional standard deviation of finance analyst returns is 8 percent and that the returns are independent across analysts.
The random sample size that Johnson needs if she wants the standard deviation of the sample means to be 2% is closest to:
- A. 4.
- B. 16.
- C. 72.
Solution
The correct answer is B.
Recall that:
$$ \begin{align*}
\sigma_{\bar{X}} & =\frac{\sigma}{\sqrt n} \\
\Rightarrow 0.02 & =\frac{0.08}{\sqrt n} \\
\Rightarrow \sqrt n & =\frac{0.08}{0.02}=4 \\
\therefore n & =16
\end{align*} $$
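The same rearrangement, \(n = \left(\frac{\sigma}{\sigma_{\bar{X}}}\right)^2\), can be sketched in code. The function name `required_sample_size` is hypothetical. Rounding up to a whole observation is an added assumption for cases where the ratio is not an integer.

```python
import math


def required_sample_size(sigma, target_se):
    """Smallest n for which sigma / sqrt(n) is at most target_se.

    Rearranges sigma_xbar = sigma / sqrt(n) into n = (sigma / sigma_xbar)^2
    and rounds up to a whole number of observations.
    """
    if sigma <= 0 or target_se <= 0:
        raise ValueError("sigma and target_se must be positive")
    n_exact = (sigma / target_se) ** 2
    # Round first so tiny floating-point noise does not push ceil up by one.
    return math.ceil(round(n_exact, 9))


# Values from the question: sigma = 8%, target standard error = 2%.
print(required_sample_size(0.08, 0.02))
```

Both arguments must be in the same units, so passing 8 and 2 (in percent) gives the same result as 0.08 and 0.02.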