A particular paper included the accompanying data on the tar level of cigarettes smoked for a sample of male smokers who subsequently died of lung cancer. Assume it is reasonable to regard the sample as representative of male smokers who die of lung cancer. Is there convincing evidence that the proportion of male smoker lung cancer deaths is not the same for the four given tar level categories at the α = .05 level? (Use 2 decimal places.)
Tar Level    Frequency
0-7          115
8-14         350
15-21        516
> 22         178
χ² =
Null hypothesis H0: The proportion of male smoker lung cancer deaths is the same for the four given tar level categories.
Alternative hypothesis Ha: The proportion of male smoker lung cancer deaths is not the same for the four given tar level categories.
Observed frequencies Oi = 115, 350, 516, 178
Total frequency, n = 115 + 350 + 516 + 178 = 1159
If the null hypothesis is true, the proportion of male smoker lung cancer deaths in each tar level category is p = 1/4 = 0.25
Expected frequency for each category, E = n * p = 1159 * 0.25 = 289.75
χ² = Σ (Oi - E)² / E
= (115 - 289.75)² / 289.75 + (350 - 289.75)² / 289.75 + (516 - 289.75)² / 289.75 + (178 - 289.75)² / 289.75
= 105.39 + 12.53 + 176.67 + 43.10
= 337.69
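The arithmetic above can be checked with a few lines of code. The following is a minimal sketch in plain Python; the variable names are illustrative and are not part of the original solution.

```python
# Observed counts for the four tar level categories (from the problem statement)
observed = [115, 350, 516, 178]

n = sum(observed)      # total sample size
k = len(observed)      # number of categories
expected = n / k       # equal proportions under H0, so E = n * 0.25

# Pearson chi-square statistic: sum of (O - E)^2 / E over all categories
chi_square = sum((o - expected) ** 2 / expected for o in observed)

print(n, expected, round(chi_square, 2))  # 1159 289.75 337.69
```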
Degrees of freedom = k - 1 = 4 - 1 = 3
P-value = P(χ² > 337.69) ≈ 0.0000 (less than 0.0001, with 3 degrees of freedom)
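The P-value can also be computed numerically. The sketch below assumes the closed-form upper-tail probability of a chi-square distribution with exactly 3 degrees of freedom, P(X > x) = erfc(sqrt(x/2)) + sqrt(2x/π) * exp(-x/2), so it needs only the standard library. The function name chi2_sf_df3 is hypothetical, and the formula does not apply to other degrees of freedom.

```python
import math

def chi2_sf_df3(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 3 df.

    Uses the closed form that holds only for 3 degrees of freedom.
    """
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

chi_square = 337.69    # test statistic from the calculation above
alpha = 0.05           # significance level given in the problem

p_value = chi2_sf_df3(chi_square)

# The P-value is far below alpha, so H0 is rejected
print(p_value < alpha)  # True
```

With a statistics library available, a general-purpose routine such as scipy.stats.chi2.sf(337.69, 3) would be the more usual choice; the closed form is used here only to keep the example free of dependencies.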
Since the P-value is less than the 0.05 significance level, we reject the null hypothesis H0 and conclude that there is convincing evidence that the proportion of male smoker lung cancer deaths is not the same for the four given tar level categories.