The paper “Cigarette Tar Yields in Relation to Mortality from Lung Cancer in the Cancer Prevention Study II Prospective Cohort” (British Medical Journal [2004]: 72-79) included the accompanying data on the tar level of cigarettes smoked for a sample of male smokers who subsequently died of lung cancer.
Tar Level | Frequency |
0-7 mg | 103 |
8-14 mg | 378 |
15-21 mg | 563 |
≥ 22 mg | 150 |
Assume it is reasonable to regard the sample as representative of male smokers who die of lung cancer. Is there convincing evidence that the proportion of male smoker lung cancer deaths is not the same for the four given tar level categories?
Null hypothesis: Ho: the proportion of male smoker lung cancer deaths is the same for the four given tar level categories (each proportion equals 0.25).
Alternate hypothesis: Ha: the proportion of male smoker lung cancer deaths is not the same for the four given tar level categories.
Degrees of freedom = categories - 1 = 4 - 1 = | 3 |
For 3 df and a 0.05 level of significance, critical value χ2 = | 7.815 |
Applying the chi-square goodness-of-fit test:
Category | Hypothesized relative frequency p | Observed Oi | Expected Ei = total × p | Residual Ri = (Oi-Ei)/√Ei | Chi-square R2i = (Oi-Ei)2/Ei |
0-7 mg | 0.250 | 103 | 298.50 | -11.32 | 128.041 |
8-14 mg | 0.250 | 378 | 298.50 | 4.60 | 21.173 |
15-21 mg | 0.250 | 563 | 298.50 | 15.31 | 234.373 |
≥ 22 mg | 0.250 | 150 | 298.50 | -8.60 | 73.877 |
Total | 1.000 | 1194 | 1194.00 | | 457.464 |
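The table can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the variable names (`observed`, `p0`, `chi_square`, and so on) are illustrative choices and do not come from the original solution.

```python
import math

# Observed counts from the tar-level table (0-7, 8-14, 15-21, >=22 mg).
labels = ["0-7", "8-14", "15-21", ">=22"]
observed = [103, 378, 563, 150]

# Under H0 each of the four categories has proportion 1/4.
p0 = [0.25] * 4
n = sum(observed)
expected = [n * p for p in p0]

# Standardized residuals and per-category chi-square contributions.
residuals = [(o - e) / math.sqrt(e) for o, e in zip(observed, expected)]
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(contributions)

for row in zip(labels, observed, expected, residuals, contributions):
    print("%-6s %5d %8.2f %7.2f %9.3f" % row)
print("total n = %d, chi-square = %.3f" % (n, chi_square))
```

Each expected count is 1194 × 0.25 = 298.5, comfortably above the usual minimum of 5, so the chi-square approximation is reasonable here.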
Test statistic: X2 = 457.464
P-value ≈ 0.0000 (less than 0.0001)
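The p-value and the critical value can be checked without statistical tables. The sketch below assumes the closed-form upper-tail probability for a chi-square variable with exactly 3 degrees of freedom, P(X ≥ x) = erfc(√(x/2)) + √(2x/π)·e^(-x/2); the helper name `chi2_sf_3df` is hypothetical and the formula does not apply to other degrees of freedom.

```python
import math

def chi2_sf_3df(x):
    """Upper-tail probability P(X >= x) for a chi-square variable with 3 df."""
    # Closed form that is valid only for 3 degrees of freedom.
    tail = math.erfc(math.sqrt(x / 2))
    correction = math.sqrt(2 * x / math.pi) * math.exp(-x / 2)
    return tail + correction

critical_value = 7.815  # critical value for alpha = 0.05 and 3 df
chi_square = 457.464    # test statistic from the table

# Tail area at the critical value should be close to 0.05.
print("P(X >= %.3f) = %.4f" % (critical_value, chi2_sf_3df(critical_value)))

# Tail area at the test statistic is the p-value.
print("p-value = %.3e" % chi2_sf_3df(chi_square))
```

The tail area at 7.815 comes out close to 0.05, which confirms the critical value, and the p-value for 457.464 is far below 0.0001, which is why it prints as 0.0000 when rounded to four decimals.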
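Because the test statistic (457.464) is far larger than the critical value (7.815), and the p-value is below the 0.05 level of significance, we reject the null hypothesis and conclude that there is convincing evidence that the proportion of male smoker lung cancer deaths is not the same for the four given tar level categories.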