Explain the gist of statistical hypothesis testing. Why do hypotheses need to be about population means when the actual information used is from sample means? How is it possible to make a decision on population means based on sample means?
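Statistical hypothesis testing is a core tool of inferential statistics. It draws conclusions about a population from the information contained in a sample. The hypotheses are stated about the population mean because that is the unknown quantity of interest; the sample mean is already known exactly once the data are collected, so there is nothing to test about it.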
This works because the sample is drawn from the population and is therefore representative of it. To guard against bias, the sample is drawn at random, so that there is no clustering effect or any other sort of sampling bias. The sample size is also chosen to be sufficiently large, which does not remove bias by itself but reduces the chance variation of the sample mean around the population mean.
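The effect of random sampling and sample size described above can be checked with a small simulation. This is only a sketch under stated assumptions: the population is made up (100,000 values from an exponential distribution with mean 50), and `sample_mean_spread` is a hypothetical helper name, not something from the question.

```python
import random
import statistics

def sample_mean_spread(population, sample_size, n_samples, rng):
    """Draw n_samples random samples (without replacement) and return
    the mean and standard deviation of their sample means."""
    means = [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]
    return statistics.fmean(means), statistics.stdev(means)

rng = random.Random(0)  # fixed seed so repeated runs give the same numbers

# Assumed, made-up population: skewed on purpose, true mean close to 50.
population = [rng.expovariate(1 / 50) for _ in range(100_000)]
print(f"population mean = {statistics.fmean(population):.2f}")

for n in (10, 100, 1000):
    centre, spread = sample_mean_spread(population, n, 500, rng)
    print(f"n={n:5d}  mean of sample means={centre:7.2f}  spread={spread:6.2f}")
```

The sample means should centre on the population mean for every sample size, while their spread should shrink roughly in proportion to 1/sqrt(n). That shrinking spread is what makes a single large random sample informative about the population.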
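The sample mean is an unbiased estimator of the population mean (and an efficient one when the population is normal). Hence, a sample drawn by the methods described above can be used to estimate the population mean. Similarly, the population standard deviation can be estimated by the sample standard deviation, and dividing that by the square root of the sample size gives the standard error, which measures how far the sample mean is likely to fall from the population mean. Together these tell us how sample means would be distributed if the hypothesized population mean were true. Using the p-value and an appropriate significance level, we can then decide, based on the sample, whether the population mean differs from a hypothesized value, whether another distribution differs from the population, whether another sample belongs to the same population, and so on.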
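The decision step can be sketched as a one-sample test of a hypothesized population mean. The code below is an illustration, not a prescribed method: it assumes a large sample so that the normal approximation (a z-test) is reasonable, whereas a small sample would normally call for a t-test. The function name `one_sample_z_test`, the hypothesized mean of 100, and the simulated data are all assumptions made for the example.

```python
import math
import random
import statistics
from statistics import NormalDist

def one_sample_z_test(sample, mu0, alpha=0.05):
    """Two-sided large-sample test of H0: population mean == mu0.
    Returns (sample mean, standard error, z statistic, p-value, reject H0?)."""
    n = len(sample)
    xbar = statistics.fmean(sample)
    s = statistics.stdev(sample)       # sample standard deviation
    se = s / math.sqrt(n)              # standard error of the sample mean
    z = (xbar - mu0) / se              # distance from mu0 in standard errors
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return xbar, se, z, p_value, p_value < alpha

rng = random.Random(42)
# Made-up data: 200 observations from a population whose true mean is 103.
sample = [rng.gauss(103, 15) for _ in range(200)]

xbar, se, z, p_value, reject = one_sample_z_test(sample, mu0=100)
print(f"sample mean={xbar:.2f}  standard error={se:.2f}  z={z:.2f}  p={p_value:.4f}")
print("reject H0" if reject else "fail to reject H0")
```

The p-value is the probability, if the population mean really were `mu0`, of seeing a sample mean at least this far from it. A p-value below the significance level `alpha` means the sample is hard to reconcile with the hypothesized population mean, so the null hypothesis is rejected.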