Covariance and Correlation
Sampling error is the statistical error that arises because an analyst observes a sample rather than the entire population, so the sample never represents the population perfectly. In other words, it is the difference between the observed value of a sample statistic (such as the mean, variance, or standard deviation) and the actual but unknown population parameter. For example, we calculate the sampling error for the variance as:
$$ \text{Sampling error for the variance} = \text{Sample variance} - \text{Population variance} = s^2 - \sigma^2 $$
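The formula can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the population is synthetic (10,000 normally distributed values with an assumed mean of 50 and standard deviation of 10), and the helper name `variance_sampling_error` is hypothetical. In practice the population variance is unknown, which is exactly why the sampling error cannot normally be computed this way.

```python
import random
import statistics


def variance_sampling_error(population, sample_size, seed=0):
    """Return s^2 - sigma^2 for one simple random sample (hypothetical helper)."""
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)  # draw without replacement
    sample_var = statistics.variance(sample)  # s^2, uses the n - 1 denominator
    population_var = statistics.pvariance(population)  # sigma^2, uses N
    return sample_var - population_var


# Synthetic population, assumed purely for illustration.
rng = random.Random(42)
population = [rng.gauss(50, 10) for _ in range(10_000)]

error = variance_sampling_error(population, sample_size=100, seed=1)
print(f"Sampling error for the variance: {error:.3f}")
```

Changing `seed` draws a different sample and therefore produces a different sampling error, which shows that the error is a property of the particular sample drawn.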
A sampling exercise that involves the selection of a few elements to represent an entire population is always susceptible to both sampling errors and non-sampling errors. Sampling errors arise because only part of the population is observed. Non-sampling errors, by contrast, are unrelated to the choice of sample: they stem from how the data are collected, recorded, or processed, and they are often systematic. A good example of a non-sampling error during the administration of a questionnaire is asking “leading” questions, that is, phrasing a question in a manner that steers the respondent toward a particular response.
Let us take a look at an example of sampling error in practice. Assume that the producers of a particular Mexican soap opera that airs twice a week wish to determine the percentage of Mexican viewers who watch both episodes of the program every week. The producers would have to come up with a sample that is representative of the various classes of viewers, based on characteristics such as age. For instance, young people between the ages of 14 and 18 generally have a lot of time on their hands, and most of them would easily manage to make time for the two weekly episodes.
On the other hand, viewers in higher age brackets, for instance, between 18 and 35 years, may not watch the episodes regularly because they are more likely to have tighter work schedules and other engagements that leave them with little time for TV. Therefore, a sample that over-represents or under-represents some age groups relative to their share of the population would produce an estimate that differs from the actual population parameter.
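The effect of ignoring age can be sketched with a simulation. Every number below is an assumption made up for illustration and does not come from any survey: a population of 100,000 viewers, 20% of whom are aged 14 to 18 and watch both episodes with probability 0.80, while the remaining 80% are aged 18 to 35 and do so with probability 0.30. The sketch compares a sample that over-represents young viewers with one that mirrors the population's age mix.

```python
import random

rng = random.Random(7)


def make_group(label, size, watch_rate):
    """Build (age_group, watches_both_episodes) records for one age group."""
    return [(label, rng.random() < watch_rate) for _ in range(size)]


# Hypothetical population; group sizes and watch rates are assumptions.
young = make_group("14-18", 20_000, 0.80)
older = make_group("18-35", 80_000, 0.30)
population = young + older


def share_watching(people):
    """Fraction of people who watch both weekly episodes."""
    return sum(watches for _, watches in people) / len(people)


true_share = share_watching(population)

# Disproportionate sample: 70% young viewers, although they are 20% of the population.
skewed_sample = rng.sample(young, 700) + rng.sample(older, 300)

# Proportional sample: 20% young and 80% older, matching the population.
proportional_sample = rng.sample(young, 200) + rng.sample(older, 800)

skewed_error = share_watching(skewed_sample) - true_share
proportional_error = share_watching(proportional_sample) - true_share

print(f"True share watching both episodes: {true_share:.3f}")
print(f"Error of the skewed sample:        {skewed_error:+.3f}")
print(f"Error of the proportional sample:  {proportional_error:+.3f}")
```

Under these assumptions, the skewed sample overstates viewership by a wide margin, while the proportional sample lands close to the true share. The remaining gap in the proportional case is the ordinary sampling error that persists even in a well-designed sample.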
Other factors analysts consider when selecting samples for different purposes may include gender, education level, and socioeconomic status.
The sampling error for a given sample is usually unknown because the true population parameter is itself unknown. However, analysts can use analytical measures, such as the standard error of a statistic, to estimate how much variation sampling error introduces.
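One common way to gauge this variation, offered here as an illustration rather than as the only method, is the standard error of the sample mean, estimated from a single sample as $s / \sqrt{n}$. The sketch below uses an assumed synthetic population and compares that single-sample estimate with the spread of sample means actually observed over many repeated samples. The repeated sampling is possible only because the population is simulated.

```python
import math
import random
import statistics

rng = random.Random(11)

# Synthetic population, assumed purely for illustration.
population = [rng.gauss(50, 10) for _ in range(10_000)]

# Standard error of the mean, estimated from one sample: s / sqrt(n).
sample = rng.sample(population, 100)
standard_error = statistics.stdev(sample) / math.sqrt(len(sample))

# For comparison: the spread of sample means over many repeated samples.
repeated_means = [
    statistics.mean(rng.sample(population, 100)) for _ in range(500)
]
observed_spread = statistics.stdev(repeated_means)

print(f"Estimated standard error (one sample): {standard_error:.3f}")
print(f"Observed spread of 500 sample means:   {observed_spread:.3f}")
```

The two figures should be of similar size, which is why a single sample can be used to estimate the typical magnitude of the sampling error even though the error of that particular sample remains unknown.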