Distance from dump (miles) of cancer patient
0.5
0.7
0.95
1.3
1.55
1.7
1.9
2.15
2.25
2.8
3.2
4.2
4.35
4.45
5.25
6.35
7.1
8.2
8.25
9.35
10.1
12.15
13.95
15.15
16.6
16.95
17.2
17.45
19.15
You suspect that townsfolk near Gloomsville are getting cancer because of a new toxic waste dump built in town. So, suspecting this is in the water, you look at cancer rates up to 20 miles downstream from the dump site. Is cancer evenly distributed along those 20 miles? The data to answer this question are in the table above.
I have to solve this using Excel. Specifically, what should I do, and which functions could I use? For the question, I have to:
1. State a null and an alternative hypothesis, as appropriate, for each hypothesis tested. This may involve several hypothesis tests.
2. Choose the most appropriate test. Explain how you have met the assumptions of the test or why the test is robust to violations of the assumptions.
3. State explicitly what test(s) you are using.
4. If you fail to reject the null hypothesis, calculate the power of the test.
1)
H0: Cancer cases are evenly (uniformly) distributed along those 20 miles
Ha: Cancer cases are not evenly distributed along those 20 miles
2)
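We can use a chi-square goodness-of-fit test. Split the 20 miles into five equal 4-mile intervals; under H0 each interval is expected to hold a proportion 0.2 of the cases.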
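The observed counts used in the test come from binning the 29 distances. As a rough cross-check outside Excel, the following Python sketch does that binning. It assumes five equal 4-mile intervals with each distance assigned to the interval whose lower bound it meets or exceeds; the variable names are illustrative only. In Excel, the same counts could be obtained with FREQUENCY or COUNTIFS.

```python
# Distances (miles) copied from the table in the question.
distances = [
    0.5, 0.7, 0.95, 1.3, 1.55, 1.7, 1.9, 2.15, 2.25, 2.8,
    3.2, 4.2, 4.35, 4.45, 5.25, 6.35, 7.1, 8.2, 8.25, 9.35,
    10.1, 12.15, 13.95, 15.15, 16.6, 16.95, 17.2, 17.45, 19.15,
]

BIN_WIDTH = 4.0  # assumption: five equal 4-mile intervals covering 0 to 20 miles
N_BINS = 5

# Count how many patients fall in each interval.
observed = [0] * N_BINS
for d in distances:
    idx = min(int(d // BIN_WIDTH), N_BINS - 1)  # clamp a value of exactly 20 into the last bin
    observed[idx] += 1

# Under H0 every interval expects the same share of the total.
expected = [len(distances) / N_BINS] * N_BINS

print(observed)  # [11, 6, 4, 3, 5]
print(expected)
```

These counts match the Oi column used in the formulas.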
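3)
Excel formulas (columns A to E; the cell references imply headers in row 1, the five intervals in rows 2 to 6, totals in row 8, and the test statistic in E9):
Interval (miles) | Expected proportion | Oi | Ei | (Oi-Ei)^2/Ei
0 to 4 | 0.2 | 11 | =$C$8*B2 | =(C2-D2)^2/D2
4 to 8 | 0.2 | 6 | =$C$8*B3 | =(C3-D3)^2/D3
8 to 12 | 0.2 | 4 | =$C$8*B4 | =(C4-D4)^2/D4
12 to 16 | 0.2 | 3 | =$C$8*B5 | =(C5-D5)^2/D5
16 to 20 | 0.2 | 5 | =$C$8*B6 | =(C6-D6)^2/D6
Total | =SUM(B2:B6) | =SUM(C2:C6) | =SUM(D2:D6) |
TS (test statistic) | =SUM(E2:E7)
critical value | =CHISQ.INV(0.95,4)
p-value | =CHISQ.DIST.RT(E9,4)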
Every expected frequency is 29 × 0.2 = 5.8, which is at least 5, so the expected-count assumption of the chi-square test is satisfied.
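To double-check the spreadsheet, here is a minimal Python sketch of the same calculation. It assumes the observed counts 11, 6, 4, 3, 5 and equal expected proportions of 0.2. For 4 degrees of freedom the upper-tail chi-square probability has the closed form exp(-x/2)(1 + x/2), so no statistics library is needed; that shortcut is specific to df = 4 and is not a general replacement for CHISQ.DIST.RT.

```python
import math

observed = [11, 6, 4, 3, 5]           # Oi column from the table
n = sum(observed)                     # total number of patients
expected = [n * 0.2] * len(observed)  # Ei = total * expected proportion

# Test statistic: sum of (Oi - Ei)^2 / Ei over the five intervals.
ts = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

df = len(observed) - 1  # 5 categories - 1 = 4 degrees of freedom

# Upper-tail probability for chi-square with df = 4 (closed form, valid for df = 4 only).
p_value = math.exp(-ts / 2) * (1 + ts / 2)

print(f"TS = {ts:.4f}")  # TS = 6.6897
print(f"p-value = {p_value:.3f}")
```

The p-value printed here should agree with the CHISQ.DIST.RT result to about three decimal places.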
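4)
p-value = 0.1533
Since the p-value is greater than alpha = 0.05 (equivalently, the test statistic is below the critical value),
we fail to reject the null hypothesis: the data do not give sufficient evidence that cancer is unevenly distributed along the 20 miles.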
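Because the null hypothesis is not rejected, the question also asks for the power of the test. One possible approach, sketched below, is a post hoc power calculation from the noncentral chi-square distribution. It rests on assumptions that are not in the original answer: the noncentrality parameter is set equal to the observed test statistic (about 6.6897), the critical value 9.487729 is the value of CHISQ.INV(0.95,4), and the function names are hypothetical. The noncentral distribution is written as a Poisson-weighted mixture of central chi-squares with even degrees of freedom, which have a closed-form upper tail.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a central chi-square with even df."""
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))

def noncentral_power(crit, df, lam, terms=60):
    """P(noncentral chi-square(df, lam) > crit) as a Poisson mixture.

    `terms` truncates the infinite sum; 60 terms is ample for small lam.
    """
    total = 0.0
    for j in range(terms):
        weight = math.exp(-lam / 2) * (lam / 2) ** j / math.factorial(j)
        total += weight * chi2_sf_even_df(crit, df + 2 * j)
    return total

CRIT = 9.487729  # assumption: value returned by CHISQ.INV(0.95, 4)
LAM = 6.6897     # assumption: noncentrality taken as the observed test statistic

power = noncentral_power(CRIT, 4, LAM)
print(f"post hoc power = {power:.3f}")
```

With a noncentrality of zero the function returns the significance level 0.05, which is a quick sanity check. A power based on the observed effect should be reported with that caveat, since it is a direct function of the p-value.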