In statistics we are able to make inferences because statistics have a predictable distribution called a sampling distribution. One method of inference we have discussed is the idea of hypothesis testing. Explain briefly how the sampling distribution is used in the process of hypothesis testing.
Hypothesis testing consists of the following steps:
1. calculating some test statistic
2. comparing that statistic to an underlying distribution to determine how likely it would be to occur by chance
For example, in the coin-tossing example: let the random variable X be the number of heads obtained; then the underlying distribution was the binomial distribution with P = 0.50 and N = 18.
More generally, any inferential test we’ll learn about involves comparing a test statistic to an underlying probability distribution. The underlying distribution of a statistic obtained from sample values is called its sampling distribution.
The sampling distribution of a statistic gives all the possible values that the statistic can take, and the probability of each value occurring by chance.
The sampling distribution tells us what values we might expect to obtain for a particular statistic if some predefined conditions are true (e.g., when the null hypothesis is taken to be true).
For example, if H0 is that P(head) = 0.50, then the sampling distribution of the statistic tells us how likely we would be to obtain 6 heads when we flip the coin 18 times.
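As a minimal sketch of the coin example above, the exact sampling distribution under H0 can be tabulated from the binomial formula. The helper name `binomial_pmf` and the variable names are illustrative assumptions, not part of the original answer; only N = 18 and P = 0.50 come from the example.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p). Hypothetical helper for illustration."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p0 = 18, 0.50  # number of flips and P(head) under H0, from the example

# Sampling distribution of the number of heads under H0:
# every possible value (0..18) paired with its probability.
sampling_dist = {k: binomial_pmf(k, n, p0) for k in range(n + 1)}

prob_six_heads = sampling_dist[6]
print(f"P(X = 6 | H0) = {prob_six_heads:.4f}")  # prints P(X = 6 | H0) = 0.0708
```

So if the coin is fair, exactly 6 heads in 18 flips would occur about 7% of the time.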
The distribution is derived by assuming that we repeat that experiment (infinitely) many times. For example, the first time we do the experiment, we might get 10 heads; the next time, we might get 6 heads. If we do this many, many times, the relative frequencies of the outcomes trace out the sampling distribution of the statistic.
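The "repeat the experiment many times" idea can be approximated by simulation. The sketch below assumes a finite number of repetitions (100,000, an arbitrary choice) in place of infinitely many, and the function name `simulate_heads_counts` is hypothetical.

```python
import random
from collections import Counter

def simulate_heads_counts(n_flips, p_heads, n_experiments, seed=0):
    """Repeat the coin-flipping experiment and return the relative
    frequency of each possible number of heads (0..n_flips)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    counts = Counter()
    for _ in range(n_experiments):
        # One experiment: flip the coin n_flips times and count the heads.
        heads = sum(rng.random() < p_heads for _ in range(n_flips))
        counts[heads] += 1
    return {k: counts[k] / n_experiments for k in range(n_flips + 1)}

# Assumed settings: 18 flips with P(head) = 0.50, as in the example.
approx_dist = simulate_heads_counts(n_flips=18, p_heads=0.50, n_experiments=100_000)
print(f"Simulated P(X = 6) = {approx_dist[6]:.3f}")
```

With more repetitions, the simulated frequencies should move closer to the exact binomial probabilities.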
The sampling distribution assumes that the null hypothesis is true. When we compare an obtained test statistic to the sampling distribution, we’re asking how likely it is that we would get that statistic if we were sampling from a population that has the null hypothesis characteristics (e.g., P = 0.50).
If it’s very unlikely (i.e., the probability of a result at least this extreme is less than alpha), we conclude that we must be sampling from a population whose characteristics are different from the null hypothesis population (e.g., P > 0.50). That is, we reject the null hypothesis.
The sampling distribution that is appropriate depends on the statistical test we use.
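The decision rule can be sketched for the coin example with the one-sided alternative P > 0.50. The observed count of 14 heads and alpha = 0.05 are hypothetical values chosen for illustration; neither is given in the original answer.

```python
from math import comb

def upper_tail_p_value(observed, n, p0):
    """P(X >= observed) for X ~ Binomial(n, p0): the one-sided p-value
    when the alternative hypothesis is P > p0."""
    return sum(comb(n, k) * p0**k * (1 - p0)**(n - k)
               for k in range(observed, n + 1))

n, p0 = 18, 0.50      # from the coin-tossing example
alpha = 0.05          # assumed significance level
observed_heads = 14   # hypothetical experimental result

p_value = upper_tail_p_value(observed_heads, n, p0)
reject_h0 = p_value < alpha
print(f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
# prints p-value = 0.0154, reject H0: True
```

Here 14 or more heads would be rare enough under H0 that, at this assumed alpha, the null hypothesis is rejected.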