In the following data set, the columns indicate young adults' smoking habit, while the rows indicate their exercise status. Please conduct a hypothesis test to determine whether smoking habit and exercise status are associated. Choose α = 0.05. (Please make sure to check the assumptions; if the assumptions are not met, you may stop.)
This question is for a biostatistics subject.
Smoking Habit | | | | |
Exercise Status | Frequent | Some | None | Total |
Never | 98 | 86 | 35 | 219 |
Occasion | 29 | 47 | 23 | 99 |
Regular | 17 | 9 | 17 | 43 |
Heavy | 9 | 7 | 19 | 35 |
Total | 153 | 149 | 94 | 396 |
To check whether smoking habit and exercise status are associated, we will use a two-way ANOVA. Since there are two sources of variation, rows and columns, the hypotheses are as follows:
NULL HYPOTHESES:
H0R: there is no significant difference between the exercise statuses.
H0C: there is no significant difference between the smoking habits.
ALTERNATIVE HYPOTHESES:
H1R: at least two of the exercise statuses differ significantly.
H1C: at least two of the smoking habits differ significantly.
Smoking habit | | | | | |
Exercise status | Frequent | Some | None | Total (ti.) | ti.^2 |
Never | 98 | 86 | 35 | 219 | 47961 |
Occasion | 29 | 47 | 23 | 99 | 9801 |
Regular | 17 | 9 | 17 | 43 | 1849 |
Heavy | 9 | 7 | 19 | 35 | 1225 |
Total (t.j) | 153 | 149 | 94 | Grand total G = 396 | Sum of ti.^2 = 60836 |
t.j^2 | 23409 | 22201 | 8836 | Sum of t.j^2 = 54446 | |
N = 4*3 = 12 (number of cells)
Raw sum of squares (sum of the squared cell counts) = 22954
Correction Factor (CF) = G^2 / N = 396^2 / 12 = 13068
Total sum of squares (TSS) = Raw sum of squares - CF = 9886
Row sum of squares (SSR) = (sum of ti.^2 / 3) - CF = 7210.667
Column sum of squares (SSC) = (sum of t.j^2 / 4) - CF = 543.5
Error sum of squares (SSE) = TSS - SSR - SSC = 2131.833
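The sums of squares above can be re-derived with a short script. This is only a sketch of the arithmetic, assuming the 4 x 3 table of counts is treated as one observation per cell; the variable names are illustrative and not part of the original answer.

```python
# Cell counts from the table: one list per row (Never, Occasion, Regular, Heavy),
# one entry per column (Frequent, Some, None).
counts = [
    [98, 86, 35],
    [29, 47, 23],
    [17, 9, 17],
    [9, 7, 19],
]

n_rows = len(counts)
n_cols = len(counts[0])
n_cells = n_rows * n_cols                        # N = 4 * 3 = 12

row_totals = [sum(row) for row in counts]        # ti.
col_totals = [sum(col) for col in zip(*counts)]  # t.j
grand_total = sum(row_totals)                    # G

raw_ss = sum(x * x for row in counts for x in row)
correction = grand_total ** 2 / n_cells          # CF = G^2 / N
total_ss = raw_ss - correction
row_ss = sum(t * t for t in row_totals) / n_cols - correction
col_ss = sum(t * t for t in col_totals) / n_rows - correction
error_ss = total_ss - row_ss - col_ss

print(f"raw SS = {raw_ss}, CF = {correction:.0f}, total SS = {total_ss:.0f}")
print(f"row SS = {row_ss:.3f}, column SS = {col_ss:.1f}, error SS = {error_ss:.3f}")
```

Running it should reproduce the figures listed above (22954, 13068, 9886, 7210.667, 543.5 and 2131.833).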
ANOVA TABLE
Sources of Variation | S.S. (1) | d.f. (2) | M.S.S. (3) = (1)/(2) | Variance Ratio (4) = (3) / Error M.S.S. |
Between columns | 543.5 | 3-1 = 2 | 271.75 | 0.7648 |
Between rows | 7210.667 | 4-1 = 3 | 2403.556 | 6.7648 |
Error | 2131.833 | (3-1)*(4-1) = 6 | 355.306 | |
Total | 9886 | 12-1 = 11 | | |
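As a check on the table, the mean squares and variance ratios follow directly from the sums of squares and degrees of freedom. The snippet below is a minimal sketch that takes the sums of squares from the table as given; the variable names are illustrative.

```python
# Sums of squares taken from the ANOVA table.
ss_cols, ss_rows, ss_error = 543.5, 7210.667, 2131.833

# Degrees of freedom for 3 columns and 4 rows with one value per cell.
df_cols = 3 - 1
df_rows = 4 - 1
df_error = df_cols * df_rows

# Mean square = sum of squares / degrees of freedom.
ms_cols = ss_cols / df_cols
ms_rows = ss_rows / df_rows
ms_error = ss_error / df_error

# Variance ratio = mean square of the factor / error mean square.
f_cols = ms_cols / ms_error
f_rows = ms_rows / ms_error

print(f"columns: MS = {ms_cols:.3f}, F = {f_cols:.4f} on ({df_cols}, {df_error}) d.f.")
print(f"rows:    MS = {ms_rows:.3f}, F = {f_rows:.4f} on ({df_rows}, {df_error}) d.f.")
```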
The given level of significance is α = 0.05.
The tabulated F0.05(2, 6) is 5.14 and the calculated value is 0.765, which is much less than the tabulated value. The result is not significant, so we fail to reject H0C at the 5% level of significance. Hence there is no significant difference between the smoking habits.
The tabulated F0.05(3, 6) is 4.76 and the calculated value is 6.765, which is greater than the tabulated value, so we reject H0R at the 5% level of significance and conclude that there is a significant difference between the exercise statuses.
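The question asks about association between two categorical variables and mentions checking assumptions. A test often used for that wording is Pearson's chi-square test of independence, which is a different technique from the two-way ANOVA used in this answer. The sketch below is offered only as a cross-check, under two assumptions that do not come from the answer itself: the usual rule of thumb that every expected count should be at least 5, and the standard tabulated critical value of 12.592 for 6 degrees of freedom at α = 0.05.

```python
# Observed counts: rows Never, Occasion, Regular, Heavy; columns Frequent, Some, None.
observed = [
    [98, 86, 35],
    [29, 47, 23],
    [17, 9, 17],
    [9, 7, 19],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: (row total * column total) / grand total.
expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]

# Assumption check (rule of thumb, not stated in the answer): all expected counts >= 5.
min_expected = min(min(row) for row in expected)
assumption_ok = min_expected >= 5

# Pearson chi-square statistic: sum over cells of (O - E)^2 / E.
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

critical = 12.592  # tabulated chi-square critical value for df = 6, alpha = 0.05
print(f"min expected = {min_expected:.2f}, assumption met: {assumption_ok}")
print(f"chi2 = {chi2:.2f} on {df} d.f., exceeds critical value: {chi2 > critical}")
```

If the statistic exceeds the critical value, this test would reject independence, that is, it would indicate an association between the two variables.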