What is "sampling bias"? Explain using proper terminology and craft your example to explain how it can affect the outcome of a statistical study. This should be several paragraphs long.
Sampling bias means that the samples of a stochastic variable, collected to determine its distribution, are selected incorrectly: for non-random reasons, they do not represent the true distribution. In other words, the samples systematically favor some outcomes over others.
Example:
Telephone sampling is a common example of sampling bias in marketing surveys. A simple random sample may be chosen from a sampling frame consisting of a list of telephone numbers of people in the area being surveyed. This method does involve taking a simple random sample, but it is not a simple random sample of the target population (consumers in the area being surveyed). It will miss people who do not have a phone. It may also miss people who only have a cell phone with an area code outside the region being surveyed. It will also miss people who do not wish to be surveyed, including those who screen calls with an answering machine and do not answer calls from telephone surveyors. Thus the method systematically excludes certain types of consumers in the area.
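The telephone example can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the function name simulate_phone_survey, the 80% phone ownership rate, and the spending figures (mean 120 for reachable consumers, 60 for the rest) are hypothetical and chosen purely to make the effect visible.

```python
import random
import statistics


def simulate_phone_survey(pop_size=100_000, sample_size=1_000, seed=0):
    """Compare a phone-list sample with a true simple random sample.

    All numbers are illustrative assumptions, not real survey data.
    """
    rng = random.Random(seed)

    # Hypothetical population: (reachable by phone?, monthly spend).
    population = []
    for _ in range(pop_size):
        has_phone = rng.random() < 0.8          # assumed 80% are on the phone list
        mean_spend = 120 if has_phone else 60   # assumed difference between groups
        population.append((has_phone, rng.gauss(mean_spend, 20)))

    true_mean = statistics.fmean([spend for _, spend in population])

    # Sampling frame = phone list only: random within the frame,
    # but not random with respect to the target population.
    frame = [spend for has_phone, spend in population if has_phone]
    phone_mean = statistics.fmean(rng.sample(frame, sample_size))

    # Simple random sample drawn from the whole target population.
    full_sample = [spend for _, spend in rng.sample(population, sample_size)]
    full_mean = statistics.fmean(full_sample)

    return true_mean, phone_mean, full_mean


true_mean, phone_mean, full_mean = simulate_phone_survey()
print(f"population mean:    {true_mean:.1f}")
print(f"phone-list sample:  {phone_mean:.1f}")
print(f"full random sample: {full_mean:.1f}")
```

Under these assumptions the phone-list estimate lands noticeably above the population mean, while the estimate from the full random sample stays close to it, even though both samples were drawn at random from their respective frames.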
Problems:
Sampling bias is problematic because a statistic computed from the sample may be systematically erroneous, i.e. the estimate is biased because the sample favors some members of the population over others. This can lead to a systematic over- or under-estimation of the corresponding population parameter, and hence to wrong interpretations of the result. Some degree of sampling bias occurs in most sampling schemes, as it is practically impossible to ensure perfect randomness in sampling.
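The idea of systematic over- or under-estimation can be written down using the standard definition of estimator bias. The symbols here are notation introduced for illustration and are not part of the answer above: \theta is the population parameter and \hat{\theta} is the statistic computed from the sample.

```latex
\operatorname{Bias}(\hat{\theta}) = \mathbb{E}[\hat{\theta}] - \theta,
\qquad
\operatorname{MSE}(\hat{\theta}) = \operatorname{Var}(\hat{\theta}) + \operatorname{Bias}(\hat{\theta})^{2}
```

A positive bias corresponds to systematic over-estimation and a negative bias to under-estimation. The second identity separates the random part of the error (the variance) from the systematic part (the squared bias).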
Some techniques to reduce the sampling bias:
Take many different samples and average the results obtained. Increasing the sample size also tends to decrease the sampling error, although a larger sample drawn from the same flawed sampling frame remains biased.
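A rough sketch of what sample size does and does not fix is given here. The setup is hypothetical: the function biased_estimates, the frame mean of 120, and the "true" population mean of 108 are assumptions for illustration only.

```python
import random
import statistics

TRUE_MEAN = 108.0  # assumed mean of the whole target population


def biased_estimates(sample_size, repeats=200, seed=1):
    """Repeatedly sample from a flawed frame and summarize the estimates.

    The frame is assumed to contain only consumers whose spend averages 120,
    so every sample is drawn from the wrong distribution.
    Returns (average of the estimates, standard deviation of the estimates).
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(repeats):
        sample = [rng.gauss(120, 20) for _ in range(sample_size)]
        estimates.append(statistics.fmean(sample))
    return statistics.fmean(estimates), statistics.stdev(estimates)


for n in (10, 100, 1000):
    avg, spread = biased_estimates(n)
    print(f"n={n:5d}  offset from true mean={avg - TRUE_MEAN:6.2f}  spread={spread:5.2f}")
```

In this sketch the spread of the estimates shrinks as the sample size grows, which is the reduction in sampling error. The offset from the assumed true mean stays roughly the same, because it comes from the sampling frame rather than from the sample size.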