A retrospective study conducted in Japan in 1975 investigated the relationship between Smoking and Lung Cancer. After the study was done, the 2000 people in the sample were classified by whether they had died from Lung Cancer (LC) or not (NLC) and by whether they had been smokers (S) or non-smokers (NS) during their lifetimes. The final table produced from the sample looked like this:
      | S   | NS   | TOTAL
LC    | 350 | 150  | 500
NLC   | 624 | 876  | 1500
TOTAL | 974 | 1026 | 2000
(3a) In this sample, of those who died from Lung Cancer, what percentage were Smokers?
(3b) In this sample, of those who did not die from Lung Cancer, what percentage were Smokers?
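As a quick arithmetic check for parts (3a) and (3b), the row percentages can be computed directly from the observed counts. This is only an illustrative sketch; the names `observed` and `percent_smokers` are made up here and do not come from the original problem.

```python
# Observed counts from the sample table (rows: LC / NLC, columns: S / NS).
observed = {
    "LC": {"S": 350, "NS": 150},
    "NLC": {"S": 624, "NS": 876},
}

def percent_smokers(row):
    """Percentage of smokers within one row of the table."""
    counts = observed[row]
    return 100 * counts["S"] / sum(counts.values())

pct_lc = percent_smokers("LC")    # smokers among those who died from lung cancer
pct_nlc = percent_smokers("NLC")  # smokers among those who did not
print(f"(3a) {pct_lc:.1f}%  (3b) {pct_nlc:.1f}%")
```

Each percentage is conditional on the row, so the denominators are the row totals (500 and 1500), not the full sample of 2000.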
(3c) If the presence of Lung Cancer were independent of Smoking, fill in the four expected cell counts in the table below (subject to the marginal constraints given).
      | S   | NS   | TOTAL
LC    |     |      | 500
NLC   |     |      | 1500
TOTAL | 974 | 1026 | 2000
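For part (3c), the usual rule is that under independence each expected cell count equals (row total × column total) / grand total. The sketch below applies that rule to the marginal totals given in the table; the variable names are illustrative assumptions.

```python
# Marginal totals taken from the table (held fixed, as the problem states).
row_totals = {"LC": 500, "NLC": 1500}
col_totals = {"S": 974, "NS": 1026}
grand_total = 2000

# Under independence: expected count = row total * column total / grand total.
expected = {
    row: {col: r * c / grand_total for col, c in col_totals.items()}
    for row, r in row_totals.items()
}

for row, cells in expected.items():
    print(row, cells)
```

Expected counts need not be whole numbers, and each row and column of the filled-in table should still add up to the given margins.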
(3d) The expected cell counts found in part (3c) are rather different from the values actually observed when the sample was taken. Perform a χ² test of Independence on these data vs. the alternative that Lung Cancer victims are more likely to be Smokers. Report the X² value, the degrees of freedom, and the approximate P-value. Show explicit work and/or tell what R commands you used to obtain the results.
(3e) Give a brief statement (2 or 3 complete English sentences) explaining your conclusions from the test which you applied in part (3d).
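One way to check the arithmetic for part (3d) by hand is sketched below using only the Python standard library. It computes the Pearson statistic without a continuity correction, so it should correspond to R's `chisq.test` called with `correct = FALSE` on the same 2×2 matrix (R applies Yates' correction to 2×2 tables by default). Halving the P-value for the directional alternative is an assumption made here, not something the problem prescribes.

```python
import math

observed = [[350, 150], [624, 876]]  # rows: LC, NLC; columns: S, NS

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pearson statistic: sum over cells of (O - E)^2 / E, no continuity correction.
x2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n  # expected count under independence
        x2 += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)

# For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
p_two_sided = math.erfc(math.sqrt(x2 / 2))

# Assumption: for a directional alternative in a 2x2 table, halve the P-value.
# This is only appropriate because the observed difference (350/500 > 624/1500)
# points in the hypothesized direction.
p_one_sided = p_two_sided / 2

print(f"X2 = {x2:.2f}, df = {df}, one-sided P ~ {p_one_sided:.1e}")
```

The statistic comes out near 121 on 1 degree of freedom, so the P-value is far below any conventional significance level whether or not it is halved.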