Step 2 of hypothesis testing involves reviewing the assumptions of the two-sample t-test. Discuss the three assumptions of the t-test. Provide an example of an assumption that is not robust to violations and a situation in which that assumption is violated.
Two-Sample T-Test Assumptions
The assumptions of the two-sample t-test are:
1. The data are continuous (not discrete).
2. The data follow the normal probability distribution.
3. The variances of the two populations are equal. (If not, the Aspin-Welch Unequal-Variance test is used.)
4. The two samples are independent. There is no relationship between the individuals in one sample as compared to the other (as there is in the paired t-test).
5. Both samples are simple random samples from their respective populations. Each individual in the population has an equal probability of being selected in the sample.
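Assumption 3 changes how the test statistic is computed. As a minimal sketch of the unequal-variance (Aspin-Welch) version, the code below computes the t statistic and the Welch-Satterthwaite degrees of freedom using only the Python standard library. The function name welch_t and the two small samples are hypothetical and are used only for illustration.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Unequal-variance t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard error contributed by each sample (s^2 / n)
    v1 = variance(a) / len(a)
    v2 = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation; generally not an integer
    df = (v1 + v2) ** 2 / (v1 ** 2 / (len(a) - 1) + v2 ** 2 / (len(b) - 1))
    return t, df


# Hypothetical samples with visibly different spreads (illustration only)
group_a = [5.1, 4.9, 5.6, 5.8, 6.0]
group_b = [4.2, 6.9, 3.1, 7.7, 5.0, 8.3]

t_stat, df = welch_t(group_a, group_b)
print(f"Welch t = {t_stat:.3f}, df = {df:.2f}")
```

When the two sample variances are similar and the group sizes are equal, this statistic is close to the pooled (equal-variance) one; the two diverge as the variances and sample sizes become more unequal.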
Example: comparing the effects of two specific drugs.
Obs.   Drug 1   Drug 2
 1       0.7      1.9
 2      −1.6      0.8
 3      −0.2      1.1
 4      −1.2      0.1
 5      −0.1     −0.1
 6       3.4      4.4
 7       3.7      5.5
 8       0.8      1.6
 9       0.0      4.6
10       2.0      3.4
Here the data are taken to satisfy all of the above assumptions of the two-sample t-test.
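As a rough sketch of what the equal-variance (pooled) two-sample t-test computes for the drug data above, the following uses only the Python standard library. The helper name pooled_t is an assumption made for illustration; the two drug columns are treated as independent samples, as the answer assumes.

```python
from math import sqrt
from statistics import mean, variance

# Drug 1 and Drug 2 columns from the table above
drug1 = [0.7, -1.6, -0.2, -1.2, -0.1, 3.4, 3.7, 0.8, 0.0, 2.0]
drug2 = [1.9, 0.8, 1.1, 0.1, -0.1, 4.4, 5.5, 1.6, 4.6, 3.4]


def pooled_t(a, b):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    se = sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / se, df


t_stat, df = pooled_t(drug1, drug2)
print(f"t = {t_stat:.4f}, df = {df}")  # t = -1.8608, df = 18
```

The statistic would then be compared with a t distribution on 18 degrees of freedom to obtain a p-value.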
The following data do not satisfy the 2nd and 3rd assumptions (normality and equal variances). In this case the t-test should not be used, so a nonparametric test (the Mann-Whitney U test) is used instead.
Consider a Phase II clinical trial designed to investigate the effectiveness of a new drug to reduce symptoms of asthma in children. A total of n = 10 participants are randomized to receive either the new drug or a placebo. Participants are asked to record the number of episodes of shortness of breath over a 1-week period following receipt of the assigned treatment. The data are shown below.
Placebo  | 7 | 5 | 6 | 4 | 12 |
New Drug | 3 | 6 | 4 | 2 | 1  |
Is there a difference in the number of episodes of shortness of breath over a 1-week period in participants receiving the new drug as compared to those receiving the placebo? By inspection, it appears that participants receiving the placebo have more episodes of shortness of breath, but is this statistically significant?
In this example the outcome is a count, and in this sample the data do not follow a normal distribution. Hence the t-test should not be used.
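For the asthma data, the Mann-Whitney U statistic can be worked out by ranking all ten observations together and summing the ranks within each group. The sketch below does this with the Python standard library only; the helper names average_ranks and mann_whitney_u are assumptions for illustration, and tied values receive the average of the ranks they occupy. Textbooks differ on which group's U is labelled U1, but the smaller of the two values is the same under either convention.

```python
placebo = [7, 5, 6, 4, 12]
new_drug = [3, 6, 4, 2, 1]


def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U, U_x, U_y), where U is the smaller of the two group statistics."""
    ranks = average_ranks(list(x) + list(y))
    r_x = sum(ranks[:len(x)])
    r_y = sum(ranks[len(x):])
    u_x = r_x - len(x) * (len(x) + 1) / 2
    u_y = r_y - len(y) * (len(y) + 1) / 2
    return min(u_x, u_y), u_x, u_y


u, u_placebo, u_drug = mann_whitney_u(placebo, new_drug)
print(u, u_placebo, u_drug)  # 3.0 22.0 3.0
```

The rank sums here are 37 for placebo and 18 for the new drug, giving U = 3. That value is then compared with the critical value from a Mann-Whitney table for n1 = n2 = 5 at the chosen significance level.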