You will recall that simple random sampling, stratified random sampling, and cluster sampling are types of probability sampling techniques. On the other hand, convenience sampling and judgmental sampling are types of non-probability sampling techniques.
Simple random sampling involves the selection of a sample from an entire population such that each member or element of the population has an equal probability of being picked. The method attempts to come up with a sample that represents a population in an unbiased manner.
However, simple random sampling is not appropriate when there are glaring differences within a population. Differences within a population prompt statisticians to divide the members of a population into different, distinctive categories. That is where stratified random sampling comes in.
Note that simple random sampling is preferred when the population data is homogeneous.
Imagine that we wish to come up with a sample of 50 level I candidates out of a total of 100,000 level I candidates.
One approach may involve numbering each of the 100,000 candidates, placing the numbers in a basket, and shaking the basket to jumble them up. Next, we would randomly draw 50 numbers from the basket, one after the other, without replacement.
A more scientific approach may also involve the use of random numbers where all the 100,000 candidates are numbered in a sequence (from 1 to 100,000). We may then use a computer to randomly generate 50 numbers between 1 and 100,000, where a given number represents a particular candidate who can be identified by their name or admission number.
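The computer-based approach above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function name `draw_simple_random_sample` and the `seed` parameter are assumptions introduced here so that the draw can be repeated.

```python
import random

POPULATION_SIZE = 100_000  # candidates numbered 1 to 100,000
SAMPLE_SIZE = 50


def draw_simple_random_sample(population_size, sample_size, seed=None):
    """Return `sample_size` distinct candidate numbers from 1..population_size.

    Hypothetical helper: every candidate number has the same chance of
    being picked, and no number can be picked twice.
    """
    rng = random.Random(seed)
    # random.Random.sample draws without replacement from the sequence.
    return rng.sample(range(1, population_size + 1), sample_size)


sample_ids = draw_simple_random_sample(POPULATION_SIZE, SAMPLE_SIZE, seed=42)
print(sorted(sample_ids))
```

Each number in `sample_ids` would then be matched to a candidate through their name or admission number.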
The underlying feature in random sampling is that all elements in the population must have equal chances of being chosen.
In stratified random sampling, analysts subdivide the population into separate groups known as strata (singular – stratum). Each stratum is composed of elements that have a common characteristic (attribute) that distinguishes them from all the others. The method is most appropriate for large populations that are heterogeneous in nature.
A simple random sample is then drawn from within each stratum, and these samples are combined to form the overall, final sample that takes heterogeneity into account. The number of members chosen from any one stratum depends on its size relative to the population as a whole.
An advertising firm wants to determine the extent to which it needs to invigorate television advertisements in a district. The company decides to carry out a survey aimed at estimating the mean number of hours households spend watching TV per week. The district has three distinct towns: A and B, which are urbanized, and C, which is located in a rural area. Town A is adjacent to a major factory where most residents work, and most households there have children of school-going age. Town B is mainly home to retirees, while most people in town C practice agriculture.
There are 160 households in town A, 60 in town B, and 80 in C. Given the differences in the composition of each region, the firm decides to draw a sample of 50 households, taking into account the total number of families in each.
What is the number of homes that have been sampled in each region?
Solution
We have three strata: A, B, and C. We use the following formula to determine the number of households from each region to be included in the sample:
$$ \text{Number of households in sample} = \left( \cfrac {\text{Number of households in region}}{ \text {Total number of households} }\right) × \text {Required sample size} $$
Therefore, the number of households to be sampled in A = \(\frac {160}{300} × 50 = 27\) (approximately).
Similarly, the number of households to be sampled in B = \(\frac {60}{300} × 50 = 10\).
Finally, the firm would need \( \frac {80}{300} × 50 = 13\) households in town C (approximately). As a check, the three allocations add up to the required sample size: \(27 + 10 + 13 = 50\).
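The proportional allocation in this example can be reproduced with a short Python sketch. The function names and the household numbering (1 to the stratum size) are assumptions made for illustration. Rounding each allocation to the nearest whole household happens to sum to 50 here, but that is not guaranteed for other inputs.

```python
import random

households = {"A": 160, "B": 60, "C": 80}  # households per town (stratum)
SAMPLE_SIZE = 50


def proportional_allocation(strata_sizes, sample_size):
    """Allocate the sample to each stratum in proportion to its size."""
    total = sum(strata_sizes.values())
    return {
        name: round(size / total * sample_size)
        for name, size in strata_sizes.items()
    }


def draw_stratified_sample(strata_sizes, sample_size, seed=None):
    """Draw a simple random sample of household numbers within each stratum."""
    rng = random.Random(seed)
    allocation = proportional_allocation(strata_sizes, sample_size)
    return {
        # Households in each town are assumed to be numbered 1..size.
        name: rng.sample(range(1, strata_sizes[name] + 1), n)
        for name, n in allocation.items()
    }


allocation = proportional_allocation(households, SAMPLE_SIZE)
print(allocation)  # {'A': 27, 'B': 10, 'C': 13}

stratified_sample = draw_stratified_sample(households, SAMPLE_SIZE, seed=7)
```

If rounding ever produced a total different from the required sample size, the allocation would need a manual adjustment, for example by adding or removing a household in the largest stratum.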
In cluster sampling, all population elements are categorized into mutually exclusive and exhaustive groups called clusters. A simple random sample of clusters is selected, and the elements in each selected cluster are subsequently sampled.
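A rough sketch of this two-stage procedure in Python is given below. The cluster names, the element labels, and the choice to sample a fixed fraction of each selected cluster are all hypothetical and serve only to illustrate the idea.

```python
import random


def two_stage_cluster_sample(clusters, n_clusters, fraction, seed=None):
    """Pick clusters at random, then sample a fraction of each picked cluster.

    `clusters` maps a cluster name to the list of its elements. Clusters
    that are not picked in the first stage contribute nothing to the sample.
    """
    rng = random.Random(seed)
    # Stage 1: simple random sample of clusters.
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = {}
    for name in chosen:
        members = clusters[name]
        # Stage 2: simple random sample of elements within the cluster.
        k = max(1, round(len(members) * fraction))
        sample[name] = rng.sample(members, k)
    return sample


# Hypothetical population: 10 clusters with 8 elements each.
clusters = {
    f"cluster_{i}": [f"c{i}_e{j}" for j in range(8)] for i in range(10)
}
picked = two_stage_cluster_sample(clusters, n_clusters=3, fraction=0.5, seed=1)
for name, elements in picked.items():
    print(name, elements)
```

Unlike stratified sampling, where every stratum is represented, only the clusters chosen in the first stage appear in the final sample.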
Non-probability samples are selected on the basis of judgment or the convenience of accessing data. As such, non-probability sampling depends largely on the researcher's sample selection skills. There are two types of non-probability sampling methods: convenience sampling and judgmental sampling.
In convenience sampling, elements are selected because the researcher can access them easily.
Judgmental sampling is preferred when only a restricted number of people in the population possess the qualities that the researcher expects from the target population.
$$
\begin{array}{l|l|l}
\textbf { Method } & \textbf { Strengths } & \textbf { Weaknesses } \\
\hline \textbf { Probability Sampling } & & \\
\hline \text { Simple random sampling } & \text { Easy to use } & \begin{array}{c}
\text { Lower precision; no assurance } \\
\text { of representativeness }
\end{array} \\
\hline \text { Stratified sampling } & \begin{array}{c}
\text { Higher precision relative to } \\
\text { simple random sampling }
\end{array} & \begin{array}{c}
\text { Difficult to choose relevant } \\
\text { stratification; expensive }
\end{array} \\
\hline \text { Cluster sampling } & \text { Cost effective and efficient } & \text { Lower precision } \\
\hline \textbf { Non-probability Sampling } & & \\
\hline \text { Convenience sampling } & \begin{array}{c}
\text { Cost effective and saves time; } \\
\text { easy to use }
\end{array} & \begin{array}{c}
\text { Selection bias, sample may not } \\
\text { accurately represent population }
\end{array} \\
\hline \text { Judgmental sampling } & \begin{array}{c}
\text { Cost effective, convenient, less } \\
\text { time consuming }
\end{array} & \begin{array}{c}
\text { Subjective method. } \\
\text { Selection bias, sample may not } \\
\text { accurately represent population }
\end{array} \\
\end{array}
$$
Question 1
An analyst is analyzing the spending habits of people belonging to different annual income categories. In his analysis, he creates the following groups according to annual family income: less than $30,000, $31,000 to $40,000, $41,000 to $50,000, and $51,000 to $60,000. He then selects a sample from each distinct group to form a whole sample. The sampling method used by the analyst is most likely:
A. cluster sampling.
B. stratified sampling.
C. simple random sampling.
Solution
The correct answer is B.
Dividing the population into different strata (groups) and then selecting a sample from each group is called stratified sampling.
A is incorrect. In cluster sampling, each cluster is considered a sampling unit, and only selected clusters are sampled.
C is incorrect. Simple random sampling involves the selection of a sample from an entire population such that each member or element of the population has an equal probability of being picked.
Question 2
A PhD student is conducting research related to her thesis and, for this purpose, she uses some students from her university to constitute a sample. The sampling method used by the student is most likely:
A. simple random sampling.
B. convenience sampling.
C. judgmental sampling.
Solution
The correct answer is B.
The researcher has selected the students from her university because she can conveniently access them.
A is incorrect. Simple random sampling involves the selection of a sample from an entire population such that each member or element of the population has an equal probability of being picked.
C is incorrect. Judgmental sampling involves handpicking elements from the population based on the researcher’s knowledge and expertise.
Question 3
An analyst wants to estimate the downtime of ABC Bank’s ATMs in a city for the last 6 months. For this purpose, he selects 20 locations or areas within the city and then selects 50% of the ATMs in each area. The sampling method used by the analyst is most likely:
A. cluster sampling.
B. stratified random sampling.
C. simple random sampling.
Solution
The correct answer is A.
In cluster sampling, all population elements are categorized into mutually exclusive and exhaustive groups called clusters. A simple random sample of clusters is selected, and then the elements in each selected cluster are sampled. Here, the areas of the city are the clusters, and the ATMs are sampled within the 20 selected areas.
B is incorrect. In stratified random sampling, analysts subdivide the population into separate groups known as strata (singular – stratum), and each stratum is composed of elements that have a common characteristic (attribute) that distinguishes them from all the others.
C is incorrect. Simple random sampling involves the selection of a sample from an entire population such that each member or element of the population has an equal probability of being picked.