Provide brief answers to the following questions. Use specific situations and examples where applicable (in your own words).
In statistics, sampling is the process of selecting a representative segment of a larger population, which is then used to conduct a study.
Sampling comes in two forms — probability sampling and non-probability sampling.
When each entity of the population has a definite, non-zero probability of being incorporated into the sample, the sample is known as a probability sample.
Probability samples are selected in such a way as to be representative of the population. They provide the most valid or credible results because they reflect the characteristics of the population from which they are selected.
The Methods of Probability Sampling :
There are several types of probability sampling. Below we explain the basics of each and address their advantages and disadvantages.
Simple Random Sampling :
How : The entire process of sampling is done in a single step, with each subject selected independently of the other members of the population and every member having an equal chance of selection. The term random has a very precise meaning: collecting responses from whoever happens to pass by on the street does not produce a random sample.
When : The population is very large and it is impractical to study every member, but a list of members (a sampling frame) is available to draw from.
Use Case : Suppose you want to study rice consumption patterns across rural India. Covering every household may not be possible, but you could draw meaningful insights from a sample of households selected at random across districts or villages (depending on the scope).
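As a rough illustration of simple random sampling, the sketch below draws households from a list without replacement using Python's standard `random` module. The household list, the sample size of 50, and the function name `simple_random_sample` are assumptions made up for this example, not part of the original question.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct units; every subset of size n is equally likely."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(population, n)

# Hypothetical sampling frame: a numbered list of 1,000 households.
households = [f"household_{i}" for i in range(1, 1001)]

sample = simple_random_sample(households, 50, seed=42)
print(len(sample), "households selected, all distinct:",
      len(set(sample)) == len(sample))
```

Note that this only works when a full list of the population is available to draw from.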
Systematic Sampling :
How : In a systematic sample, you decide the sample size, arrange the elements of the population in some order, and then select elements from the list at regular intervals.
When : The given population is logically homogeneous.
Use Case : Suppose a supermarket wants to study the buying habits of its customers. Using systematic sampling, it can choose every 10th or 15th customer entering the supermarket and conduct the study on this sample.
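The "every 10th customer" idea can be sketched in a few lines of Python. The random starting point, the customer list of 200 entries, and the name `systematic_sample` are assumptions for illustration. A random start is a common convention, not something the text above requires.

```python
import random

def systematic_sample(ordered_units, k, seed=None):
    """Pick a random start among the first k units, then take every k-th unit."""
    if k < 1:
        raise ValueError("interval k must be at least 1")
    rng = random.Random(seed)
    start = rng.randrange(k)      # index 0 .. k-1
    return ordered_units[start::k]

# Hypothetical data: customers numbered in order of arrival.
customers = list(range(1, 201))

sample = systematic_sample(customers, 10, seed=1)
print(len(sample), "customers selected")
```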
Stratified Sampling :
How : A stratified sample, in essence, tries to recreate the statistical features of the population on a smaller scale. Before sampling, the population is divided into strata by characteristics of importance for the research, such as gender, social class, education level, or religion. The population is then randomly sampled within each stratum, in proportion to that stratum's share of the population.
When : You can divide your population by characteristics of importance for the research.
Use Case : If 38% of the population is college-educated and 62% has not been to college, then 38% of the sample is randomly selected from the college-educated subset of the population and 62% is randomly selected from the subset that has not been to college. Maintaining the ratios while selecting a randomized sample is key to stratified sampling.
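The 38% / 62% example can be sketched as proportional allocation in Python. The population of 1,000 people, the stratum names, and the function `stratified_sample` are hypothetical. The rounding step is a simplification: with less convenient proportions, the rounded stratum sizes may not add up exactly to the requested total.

```python
import random

def stratified_sample(strata, n, seed=None):
    """Sample each stratum at random, in proportion to its population share."""
    rng = random.Random(seed)
    total = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        k = round(n * len(units) / total)  # simplification: may not sum to n
        sample[name] = rng.sample(units, k)
    return sample

# Hypothetical population: 38% college-educated, 62% not.
population = {
    "college": [f"c{i}" for i in range(380)],
    "no_college": [f"n{i}" for i in range(620)],
}

sample = stratified_sample(population, 100, seed=7)
for name, units in sample.items():
    print(name, len(units))
```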
Cluster (Area) Random Sampling :
How : In cluster random sampling, the initial research identifies boundaries that divide the population into clusters. The researcher then randomly selects a number of those clusters, giving each one an equal chance of being selected. Finally, the researcher either includes all of the individuals within the selected clusters or uses simple random selection to pick subjects from within them.
When : The population is too large to perform simple random sampling effectively.
Use Case : Consider investigating dietary trends among the entire population of Africa, which is too large to manage as a single list. The boundaries would be the various countries in Africa: the researcher randomly selects some countries, with each country having an equal chance of selection, and then studies everyone in the selected countries or a simple random sample drawn from them.
Note : The primary advantages of cluster random sampling are its convenience and ease of use. Its most notable disadvantage is that if the clusters are not similar to one another, researchers will end up with less accurate data.
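A minimal sketch of one-stage cluster sampling, where whole clusters are drawn at random and everyone inside them is included. The ten regions of twenty people each and the name `cluster_sample` are invented for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Select whole clusters at random, each with an equal chance,
    and keep every individual inside the selected clusters."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical population: 10 regions with 20 people each.
clusters = {
    f"region_{i}": [f"r{i}_person_{j}" for j in range(20)]
    for i in range(10)
}

selected = cluster_sample(clusters, 3, seed=3)
print(sorted(selected), sum(len(m) for m in selected.values()), "people")
```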
Multi-Stage Sampling :
How : Multi-stage sampling combines two or more of the probability sampling methods outlined above, applied at successive stages of the research.
When : In more advanced research, where a single form of probability sampling does not provide the randomization needed for confidence in the results.
Use Case : By combining various probability sampling techniques at different stages of a research initiative, researchers can be more confident that they are mitigating biases as much as possible.
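One possible combination is sketched below: clusters are chosen at random in the first stage, and a simple random sample is drawn inside each chosen cluster in the second. The district data and the function `two_stage_sample` are hypothetical. Other combinations, such as stratifying first, are equally valid.

```python
import random

def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: pick clusters at random.
    Stage 2: draw a simple random sample inside each picked cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        name: rng.sample(clusters[name],
                         min(n_per_cluster, len(clusters[name])))
        for name in chosen
    }

# Hypothetical population: 8 districts with 50 households each.
districts = {
    f"district_{i}": [f"d{i}_household_{j}" for j in range(50)]
    for i in range(8)
}

sample = two_stage_sample(districts, n_clusters=3, n_per_cluster=5, seed=11)
for name, units in sample.items():
    print(name, len(units))
```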
2. Non-Probability Sampling Techniques :
In these techniques, the units that make up the sample are collected with no specific probability structure in mind. The selection is not completely randomized, and hence the resultant sample may not be truly representative of the population.
The Methods of Non Probability Sampling :
Non-probability sampling techniques include convenience sampling, snowball sampling, and quota sampling.
Convenience Sampling :
How : As the name suggests, the elements of such a sample are picked only on the basis of convenience in terms of availability, reach and accessibility.
When : During preliminary research efforts.
Use Case : This method is often used to get a rough estimate of the results without incurring the cost or time required to select a random sample.
Snowball Sampling :
How : Just as a snowball rolls and gathers mass, a sample constructed in this way grows in size as you move through the process of conducting a survey. In this technique, you rely on your initial respondents to refer you to the next respondents, whom you may then contact for the purpose of your survey.
When : You can rely on your initial respondents to refer you to the next respondents.
Use Case : Snowball sampling can be useful when you need the sample to reflect certain features that are difficult to find. To conduct a survey of people who go jogging in a certain park every morning, for example, snowball sampling would be a quick, practical way to create the sample.
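The referral process can be sketched as a walk over a "who refers whom" mapping, starting from a few initial respondents. The names, the referral mapping, and the function `snowball_sample` are made up for this example. Real referrals would come from the respondents themselves.

```python
from collections import deque

def snowball_sample(referrals, seeds, max_size):
    """Grow the sample by following referrals from the initial respondents."""
    sample, seen = [], set()
    queue = deque(seeds)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        if person in seen:
            continue                      # already surveyed
        seen.add(person)
        sample.append(person)
        queue.extend(referrals.get(person, []))  # ask for further contacts
    return sample

# Hypothetical referral data: each jogger names other joggers they know.
referrals = {
    "ana": ["ben", "cy"],
    "ben": ["dee"],
    "cy": ["ana", "eli"],
    "dee": [],
    "eli": ["fay"],
}

sample = snowball_sample(referrals, ["ana"], max_size=5)
print(sample)
```

Because every respondent is reached through someone already in the sample, the result depends heavily on who the initial respondents are.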
Quota Sampling :
How : Quota sampling is the non-probability equivalent of stratified sampling that we discussed earlier. It starts with characterizing the population based on certain desired features and assigns a quota to each subset of the population.
When : You can characterize the population based on certain desired features.
Use Case : If a survey requires a sample of fifty men and fifty women, a quota sample will survey respondents until the right number of each type has been surveyed. Unlike stratified sampling, the sample isn’t necessarily randomized.
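The fifty men and fifty women example might look like the sketch below, which accepts respondents in the order they turn up until each quota is full. The respondent stream, the `gender` field, and the function `quota_sample` are assumptions for illustration. Note that nothing here is randomized, which is the point of contrast with stratified sampling.

```python
def quota_sample(respondents, quotas, key):
    """Accept respondents in arrival order until every quota is filled."""
    counts = {group: 0 for group in quotas}
    sample = []
    for person in respondents:
        group = key(person)
        if group in counts and counts[group] < quotas[group]:
            sample.append(person)
            counts[group] += 1
        if counts == quotas:
            break                         # all quotas met, stop surveying
    return sample

# Hypothetical stream of respondents in the order they were approached.
stream = [{"id": i, "gender": "man" if i % 3 else "woman"}
          for i in range(400)]

sample = quota_sample(stream, {"man": 50, "woman": 50},
                      key=lambda p: p["gender"])
print(len(sample), "respondents surveyed")
```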
Note : Probability sampling techniques generally give more reliable results, but the costs can be prohibitive. For the initial stages of a study, non-probability sampling techniques might be sufficient to give you a sense of what you are dealing with. For detailed insights and results that you can rely on, move on to the more sophisticated techniques as the study gathers pace and takes a more concrete structure.