Beryllium disease: Beryllium is an extremely lightweight metal that is used in many industries, such as aerospace and electronics. Long-term exposure to beryllium can cause people to become sensitized. Once an individual is sensitized, continued exposure can result in chronic beryllium disease, which involves scarring of the lungs. In a study of the effects of exposure to beryllium, workers were categorized by their duration of exposure (in years) and by their disease status (diseased, sensitized, or normal). The results were as follows:
Disease status    Duration of exposure (years)
                  <1       1 to <5    ≥5
Diseased          12       14         23
Sensitized        12       18         12
Normal            64       130        205
(a) Compute the expected frequencies under the null hypothesis.
(b) Compute the value of the chi-square statistic.
(c) How many degrees of freedom are there?
(d) Test the hypothesis of independence. Use the α = 0.01 level of significance. What do you conclude?
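This is a chi-square test of independence between the two variables, duration of exposure and disease status. The null hypothesis is that the two variables are independent.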
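a) Under the null hypothesis, the expected frequency for the cell in row i and column j is computed as:
E_ij = (total of row i) * (total of column j) / (grand total)
The row totals are 49, 42 and 399, the column totals are 88, 162 and 240, and the grand total is 490. Applying the formula to each of the 9 cells gives the expected frequencies:
                  <1          1 to <5     ≥5
Diseased          8.8000      16.2000     24.0000
Sensitized        7.5429      13.8857     20.5714
Normal            71.6571     131.9143    195.4286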
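The expected-frequency calculation in part (a) can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `expected_frequencies` and the variable names are assumptions made for this example, not part of the original solution.

```python
# Observed counts: rows = disease status, columns = duration of exposure.
observed = [
    [12, 14, 23],    # diseased
    [12, 18, 12],    # sensitized
    [64, 130, 205],  # normal
]

def expected_frequencies(table):
    """Return E[i][j] = (row i total) * (column j total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

expected = expected_frequencies(observed)
for row in expected:
    print([round(e, 4) for e in row])
```

Each expected row sums to the same total as the matching observed row, which is a quick check that the totals were computed correctly.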
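b) The contribution of each cell to the chi-square test statistic is computed as:
(O - E)^2 / E
where O is the observed count and E is the expected frequency of that cell. The contributions of the 9 cells are:
                  <1          1 to <5     ≥5
Diseased          1.1636      0.2988      0.0417
Sensitized        2.6338      1.2190      3.5714
Normal            0.8182      0.0278      0.4688
The chi-square test statistic is the sum of these contributions:
χ² = 1.1636 + 0.2988 + 0.0417 + 2.6338 + 1.2190 + 3.5714 + 0.8182 + 0.0278 + 0.4688 = 10.2431
Therefore 10.2431 is the value of the chi-square test statistic.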
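As a cross-check on part (b), the statistic can be recomputed directly from the observed counts. The sketch below is self-contained and uses only the standard library; the names `contributions` and `chi_square` are illustrative choices for this example.

```python
# Observed counts: rows = disease status, columns = duration of exposure.
observed = [
    [12, 14, 23],    # diseased
    [12, 18, 12],    # sensitized
    [64, 130, 205],  # normal
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency of each cell under independence.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Per-cell contribution (O - E)^2 / E, then the sum over all 9 cells.
contributions = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]
chi_square = sum(sum(row) for row in contributions)

print(round(chi_square, 4))  # agrees with the 10.2431 reported above
```

The largest single contribution comes from the sensitized workers with at least 5 years of exposure, where the observed count of 12 is well below the expected value of about 20.57.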
c) The degrees of freedom are computed as:
df = (number of rows - 1) * (number of columns - 1) = (3 - 1) * (3 - 1) = 4
Therefore there are 4 degrees of freedom.
d) For 4 degrees of freedom, the p-value is the upper-tail probability of the chi-square distribution beyond the test statistic:
p = P(χ² with 4 df > 10.2431) = 0.036526
Since the p-value of 0.036526 is greater than the 0.01 level of significance, the test is not significant and we cannot reject the null hypothesis. Therefore we do not have sufficient evidence at the 0.01 level that duration of exposure and disease status are associated.
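The degrees of freedom, p-value and decision in parts (c) and (d) can be checked without statistical tables. The sketch below relies on a standard closed form for the chi-square upper tail that holds only when the degrees of freedom are even; the helper name `chi_square_sf_even_df` is an assumption made for this example, and a statistics library would normally be used instead.

```python
import math

def chi_square_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with even df.

    For df = 2m the tail probability equals
    exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    half = x / 2.0
    m = df // 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(m))

n_rows, n_cols = 3, 3                 # disease status x duration of exposure
df = (n_rows - 1) * (n_cols - 1)      # degrees of freedom from part (c)
chi_square = 10.2431                  # test statistic reported in part (b)
alpha = 0.01                          # level of significance

p_value = chi_square_sf_even_df(chi_square, df)
reject_null = p_value < alpha

print(df, round(p_value, 4), reject_null)  # prints: 4 0.0365 False
```

Because the p-value is larger than 0.01, `reject_null` is False, matching the conclusion above. The same data would give a significant result at the 0.05 level, so the conclusion depends on the chosen level of significance.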