A confidence interval (C.I) is a range of values within which statisticians believe the actual value of a population parameter lies. It differs from a point estimate, which is a single, specific numerical value.
Breaking down Confidence Interval
When constructing confidence intervals, we must specify the probability that the interval contains the true value of the parameter of interest. This probability is represented by (1 – α), where α is the level of significance. In statistical terminology, 1 – α is called the degree of confidence or certainty.
We define a 100(1 – α)% confidence interval for a given parameter, say θ, by specifying two random variables θ’_{1}(X) and θ’_{2}(X) such that P{θ’_{1}(X) < θ < θ’_{2}(X)} = 1 – α.
It happens that α = 0.05 is the most common case not just in the exam but in practice. This leads to a 95% confidence interval.
Consequently, P{ θ’_{1}(X) < θ < θ’_{2}(X)} = 0.95 specifies { θ’_{1}(X), θ’_{2}(X)} as a 95% C.I for θ. The main task for candidates lies in being able to construct and interpret a confidence interval. Thus, the C.I for θ above could be interpreted to mean that if we were to construct similar intervals using samples of equal sizes from the same population, 95% of the intervals would contain the true parameter value and just 5% would not contain it, hence, the phrase “confidence” interval.
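The repeated-sampling interpretation can be checked with a small simulation. The sketch below is illustrative only: the population mean, standard deviation, sample size and seed are assumed values that do not come from the text, and it uses the known-variance interval described later in this reading.

```python
import random
from statistics import NormalDist, fmean

def coverage(mu=50.0, sigma=10.0, n=25, conf=0.95, trials=4000, seed=1):
    """Fraction of simulated intervals that contain the true mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # reliability factor z_{alpha/2}
    half_width = z * sigma / n ** 0.5             # sigma is treated as known
    hits = 0
    for _ in range(trials):
        # Draw a fresh sample of size n from the same normal population.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / trials

# Hypothetical population: mean 50, standard deviation 10, samples of 25.
print(coverage())
```

With these assumptions the printed fraction should sit close to 0.95, which is what the "95% of the intervals" statement describes.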
Constructing Confidence Intervals
To construct a confidence interval, one must come up with an appropriate value that will be subtracted and added to a point estimate. A confidence interval appears as follows:
C.I = point estimate ± reliability factor * standard error
Where:
Point estimate refers to a calculated value of the sample statistic, such as the sample mean, x̄.
Reliability factor is a value that depends on the sampling distribution involved and on (1 – α), the probability that the C.I contains the true value of the parameter.
Standard error = standard error of the point estimate.
Different Scenarios

Normal distribution with a known variance:
We can calculate the C.I for the mean as,
x̄ ± z_{α/2} * σ/√n
Here, the reliability factor is z_{α/2}, the z-score that leaves a probability of α/2 in the upper tail (right-hand tail) of the standard normal distribution.
The following table lists the reliability factors from the standard normal distribution most commonly used by analysts.
Degree of confidence    Level of significance, α (two-tailed)    z_{α/2}
90%                     10%                                      1.645
95%                     5%                                       1.960
99%                     1%                                       2.575
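The reliability factors in the table can be reproduced from the standard normal distribution. The following is a minimal sketch using Python's standard library. The function names and the inputs to `z_interval` (sample mean 100, σ = 15, n = 36) are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

def z_reliability_factor(confidence):
    """z-score leaving alpha/2 in the upper tail, where alpha = 1 - confidence."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

def z_interval(xbar, sigma, n, confidence=0.95):
    """C.I for the mean of a normal population with known sigma."""
    half_width = z_reliability_factor(confidence) * sigma / n ** 0.5
    return xbar - half_width, xbar + half_width

for conf in (0.90, 0.95, 0.99):
    print(f"{conf:.0%}: z = {z_reliability_factor(conf):.3f}")

# Hypothetical inputs, not taken from the text.
print(z_interval(100, 15, 36))
```

The computed 99% factor is about 2.576. Printed tables often show 2.575, which is a rounding convention rather than a different quantity.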

Normal distribution with unknown variance:
When the variance is unknown, we construct the C.I for the mean by replacing the z-score in the first scenario with the t-score. Similarly, we replace the unknown σ with S, the sample standard deviation. The t-distribution is used because estimating σ from the sample adds uncertainty, and its fatter tails account for that.
Thus,
C.I = x̄ ± t_{α/2} * S/√n
t_{α/2} is the t-score that leaves a probability of α/2 in the upper tail of the t-distribution. The number of degrees of freedom is determined by the sample size such that d.f = n – 1.
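For reference, the same interval can be written out in full. This sketch assumes the usual n – 1 divisor for the sample standard deviation, which the text implies through its degrees of freedom but does not spell out.

```latex
\bar{x} \pm t_{\alpha/2,\,n-1}\,\frac{S}{\sqrt{n}},
\qquad
S = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^{2}}
```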

Confidence Interval of the population mean when variance is unknown and the sample size is large enough (any type of distribution):
Thanks to the Central Limit Theorem, the sampling distribution of the sample mean is approximately normal for just about any type of non-normal population, provided the sample size is large (n ≥ 30). Therefore, we can use the relevant z-score when constructing a confidence interval for the population mean. However, some analysts may advocate the use of the t-distribution in scenarios where the distribution is non-normal and the population variance is unknown, even if n ≥ 30. Nonetheless, the use of the z-statistic would still be justified under such circumstances, provided the central limit theorem is applied correctly.
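The large-sample case can be illustrated with a simulation. The sketch below rests on assumptions that are not in the text: the population is exponential with rate 1 (so its true mean is 1), the sample size is 50, and the seed is arbitrary. It builds a z-based interval from the sample standard deviation and counts how often the interval contains the true mean.

```python
import random
from statistics import NormalDist, fmean, stdev

def large_sample_coverage(n=50, conf=0.95, trials=3000, seed=7):
    """Coverage of x̄ ± z * s/√n for a skewed (exponential) population."""
    rng = random.Random(seed)
    true_mean = 1.0                                  # mean of Exp(rate=1)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    hits = 0
    for _ in range(trials):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        xbar = fmean(sample)
        half_width = z * stdev(sample) / n ** 0.5    # s replaces the unknown sigma
        if xbar - half_width < true_mean < xbar + half_width:
            hits += 1
    return hits / trials

print(large_sample_coverage())
```

For a skewed population like this one, the observed coverage tends to land near the nominal 95%, though often slightly below it. That gap is one reason some analysts prefer the more conservative t-score.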
Example
A teacher draws a sample of five 12-year-old children from the school’s population and records their heights as follows:
{124, 124, 128, 130, 127}
Assume that the heights have a normal distribution where both μ and σ are unknown. Calculate a two-tailed 95% confidence interval for the mean height of 12-year-olds.
Solution:
Since the variance is unknown and the sample size is less than 30, we should use the t-score as opposed to the z-score, even though the distribution is normal. Thus, the C.I for the mean will take the form,
C.I = x̄ ± t_{α/2} * S/√n
From the data, x̄ = 126.6 and S^{2} = 27.2/4 = 6.8, so S = 2.608.
You can read off the t-score from the t-distribution table, where you will find that,
0.95 = P(–2.776 < t_{4} < 2.776) i.e. t_{4, 0.025} = 2.776
Therefore, C.I = 126.6 ± 2.776 * 2.608/√5
= 126.6 ± 3.238
Thus, our confidence interval for μ is (123.4, 129.8)
Question:
Use the data from the example above to calculate a two-tailed 99% confidence interval for the population mean.
A. (123.4, 129.8)
B. (121.2, 132.0)
C. (126.6, 132.0)
Solution
The correct answer is B.
C.I = x̄ ± t_{α/2} * S/√n
t_{0.005, 4} = 4.604
The other inputs remain the same as in the example above.
Therefore, C.I = 126.6 ± 4.604 * 2.608/√5
= 126.6 ± 5.370
The C.I for the mean is (121.2, 132.0).
As you might have observed, the interval widens as the level of confidence increases.
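The example and the question can be recomputed directly from the raw heights. The sketch below uses only the standard library and takes the two t-scores quoted above as given, since the standard library has no t-distribution. The helper name `t_interval` is hypothetical. It recalculates the sample mean and sample standard deviation from the data rather than taking them as inputs.

```python
from statistics import mean, stdev

heights = [124, 124, 128, 130, 127]

# t-scores for 4 degrees of freedom, as quoted from the t-table in the text.
T_CRITICAL_DF4 = {0.95: 2.776, 0.99: 4.604}

def t_interval(data, t_crit):
    """C.I for the mean: x̄ ± t * S/√n, with S computed using the n - 1 divisor."""
    n = len(data)
    xbar = mean(data)
    s = stdev(data)                      # sample standard deviation
    half_width = t_crit * s / n ** 0.5
    return xbar - half_width, xbar + half_width

for conf, t_crit in T_CRITICAL_DF4.items():
    lo, hi = t_interval(heights, t_crit)
    print(f"{conf:.0%} C.I: ({lo:.1f}, {hi:.1f})")
```

Running both confidence levels side by side makes the widening of the interval easy to see.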
Reading 11 LOS 11j:
Calculate and interpret a confidence interval for a population mean, given a normal distribution with 1) a known population variance, 2) an unknown population variance, or 3) an unknown variance and a large sample size.