Sampling and sampling distributions are the basis of statistical analysis, and they are also important parts of applied business statistics. Therefore, designing an effective sampling plan is crucial to get representative samples of the population under study.
Explain the role that a sample plays in making statistical inferences about the population. Give an example of a sampling survey for a business study, and briefly discuss how the projected sampling results could have an effect on the business’ pricing strategies and price competitions.
The use of randomization in sampling allows results to be analyzed using the methods of statistical inference. Statistical inference is based on the laws of probability, and it allows analysts to draw conclusions about a given population from results observed in a random sample. Two of the key terms in statistical inference are parameter and statistic:
A parameter is a number describing a population, such as a percentage or proportion.
A statistic is a number which may be computed from the data observed in a random sample without requiring the use of any unknown parameters, such as a sample mean.
However, there are many ways to draw a sample. Each one affects how far the resulting inferences generalize, and most also affect which methods of inference are appropriate.
Often, the way a sample was drawn rules out valid inference altogether, or permits it only under very strong assumptions.
Example
Suppose an analyst wishes to determine the percentage of defective items produced by a factory over the course of a week. Since the factory produces thousands of items per week, the analyst takes a sample of 300 items and observes that 15 of these are defective. Based on these results, the analyst computes the statistic p̂ = 15/300 = 0.05 as an estimate of the parameter p, the true proportion of defective items in the entire population.
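The arithmetic in this example can be sketched in a few lines of Python. The variable names are illustrative only. The standard error line goes one step beyond the original example: it uses the usual formula for a sample proportion and assumes the 300 items were drawn as a simple random sample.

```python
import math

n = 300          # sample size from the example
defective = 15   # defective items observed in the sample

# The statistic: the sample proportion of defective items
p_hat = defective / n

# Estimated standard error of the sample proportion
# (assumption: simple random sample from a much larger population)
std_error = math.sqrt(p_hat * (1 - p_hat) / n)

print(f"sample proportion = {p_hat:.2f}")
print(f"estimated standard error = {std_error:.4f}")
```

The standard error gives a rough sense of how much the sample proportion would vary from one sample of 300 to the next.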
Suppose the analyst takes 200 samples, each of size 300, from the same group of items, and obtains the following results:
Number of Samples    Percentage of Defective Items
20                   3
30                   4
50                   5
45                   6
35                   7
30                   8
The histogram corresponding to these results is shown below:
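As a rough text stand-in for the figure, the tabulated counts can be drawn as a simple bar chart. This is a minimal sketch in Python with illustrative variable names. The counts are copied exactly as listed in the table, and as listed they sum to 210 rather than the stated 200 samples, so the code works with the listed counts.

```python
# (percentage of defective items, number of samples), copied from the table
table = [(3, 20), (4, 30), (5, 50), (6, 45), (7, 35), (8, 30)]

total_samples = sum(count for _, count in table)

# Mean percentage of defective items, weighted by the number of samples
mean_pct = sum(pct * count for pct, count in table) / total_samples

for pct, count in table:
    bar = "#" * (count // 5)  # one '#' per 5 samples
    print(f"{pct}% | {bar} ({count})")

print(f"total samples = {total_samples}")
print(f"mean percentage defective = {mean_pct:.2f}")
```

The weighted mean falls between 5 and 6 percent, and the bars peak at 5 percent and taper off on either side.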
These results approximate the sampling distribution of the statistic p̂, that is, the distribution of values taken by the statistic in all possible samples of size 300 from the population of factory items. The distribution appears to be approximately normal, with a mean between 0.05 and 0.06. With more repeated samples, the observed distribution would more closely approximate a normal distribution, although it would remain discrete because of the granularity caused by rounding to percentage points.
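The idea of repeated sampling can be illustrated with a small simulation. The sketch below is not the analyst's actual procedure. It assumes a hypothetical true proportion of 0.055 (a value chosen only because it lies between 0.05 and 0.06) and treats each item as independently defective with that probability, which is reasonable when the population is much larger than the sample.

```python
import random
from collections import Counter

random.seed(0)  # fixed seed so repeated runs give the same output

TRUE_P = 0.055       # hypothetical population proportion (assumption)
SAMPLE_SIZE = 300    # items per sample, as in the example
NUM_SAMPLES = 200    # number of repeated samples, as in the example

def sample_proportion(p, n):
    """Simulate one sample of n items; return the fraction that are defective."""
    defective = sum(1 for _ in range(n) if random.random() < p)
    return defective / n

p_hats = [sample_proportion(TRUE_P, SAMPLE_SIZE) for _ in range(NUM_SAMPLES)]
mean_p_hat = sum(p_hats) / NUM_SAMPLES

# Tally the results after rounding to whole percentage points
tally = Counter(round(100 * p) for p in p_hats)
for pct in sorted(tally):
    print(f"{pct:2d}% | {'#' * tally[pct]}")

print(f"mean of the sample proportions = {mean_p_hat:.4f}")
```

Increasing NUM_SAMPLES smooths the shape of the printed tally, while the rounding to whole percentage points keeps the displayed distribution discrete.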