Nine samples were taken from two streams, four from one and five from the other, and the following data were obtained.
Pollution level in stream 1 (ppm) | Pollution level in stream 2 (ppm) |
16 | 9 |
12 | 10 |
14 | 8 |
11 | 6 |
(no sample) | 5 |
Someone claimed that the data proved that stream 2 is less polluted than stream 1. Someone else asked the following questions: When were the data collected? Were the data collected all in one day or on different days? Were the data collected during the same time period for the two streams? What was the temperature in the two streams when the data were collected? At what points in the two streams were the data collected?
Why do you think the second person asked these questions? Are there other questions that should be asked? Is there any set of answers to these questions, and others that you can think of, that would justify the use of a confidence interval formula to draw conclusions from these data?
Two-sample t-test assuming equal variances:
Statistic | Stream 1 | Stream 2 |
Mean | 13.25 | 7.6 |
Variance | 4.916667 | 4.3 |
Observations | 4 | 5 |
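The summary values can be checked against the raw samples. The following is a minimal Python sketch, assuming the nine readings split between the streams as 16, 12, 14, 11 (stream 1) and 9, 10, 8, 6, 5 (stream 2); the variable names are illustrative only.

```python
import statistics

# Raw readings in ppm; the split between streams is assumed from the data table.
stream_1 = [16, 12, 14, 11]
stream_2 = [9, 10, 8, 6, 5]

# statistics.variance uses the sample (n - 1) denominator.
mean_1 = statistics.mean(stream_1)
mean_2 = statistics.mean(stream_2)
var_1 = statistics.variance(stream_1)
var_2 = statistics.variance(stream_2)

print(f"Stream 1: n = {len(stream_1)}, mean = {mean_1}, variance = {var_1:.6f}")
print(f"Stream 2: n = {len(stream_2)}, mean = {mean_2}, variance = {var_2:.6f}")
```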
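Hypotheses (assuming equal population variances): $H_0: \mu_1 = \mu_2$ against $H_1: \mu_1 > \mu_2$, where $\mu_1$ and $\mu_2$ are the mean pollution levels of streams 1 and 2.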
Degrees of freedom: $df = n_1 + n_2 - 2 = 4 + 5 - 2 = 7$
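The test statistic:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{13.25 - 7.6}{\sqrt{4.564286\left(\frac{1}{4} + \frac{1}{5}\right)}} \approx 3.942$$
Pooled variance:
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{3(4.916667) + 4(4.3)}{7} \approx 4.564286$$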
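The pooled variance and test statistic follow directly from the summary table. The sketch below uses only the reported means, variances, and sample sizes; the variable names are illustrative, not part of the original answer.

```python
import math

# Summary statistics as reported in the table.
n1, mean1, var1 = 4, 13.25, 4.916667
n2, mean2, var2 = 5, 7.6, 4.3

df = n1 + n2 - 2

# Pooled variance: the two sample variances weighted by their degrees of freedom.
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df

# Standard error of the difference in means under the equal-variance assumption.
std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

t_stat = (mean1 - mean2) / std_err

print(f"df = {df}, pooled variance = {pooled_var:.4f}, t = {t_stat:.3f}")
```

The pooled variance comes out near 4.56 and the test statistic near 3.94.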
P-value (one-tailed): 0.002793
The test statistic is significant, so H0 is rejected at the 5% level. There is sufficient evidence to support the claim that the mean pollution level in stream 2 is lower than in stream 1.
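The reported p-value can be reproduced without a statistics package by numerically integrating the t density. This is only a sketch: the helper names `t_pdf` and `upper_tail_p` are hypothetical, and Simpson's rule is one convenient choice, not necessarily how the original value was computed.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail_p(t_stat, df, steps=2000):
    """P(T > t_stat) for t_stat >= 0, using composite Simpson's rule on [0, t_stat].

    steps must be even.
    """
    h = t_stat / steps
    total = t_pdf(0.0, df) + t_pdf(t_stat, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    # The density is symmetric, so the area to the right of 0 is 0.5.
    return 0.5 - area_0_to_t

# Summary statistics as reported in the table.
n1, mean1, var1 = 4, 13.25, 4.916667
n2, mean2, var2 = 5, 7.6, 4.3
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
t_stat = (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

p_one_tailed = upper_tail_p(t_stat, df)
print(f"one-tailed p = {p_one_tailed:.6f}, two-tailed p = {2 * p_one_tailed:.6f}")
```

The one-tailed value agrees with the reported 0.002793 to about five decimal places; the two-tailed value is twice that, which is still below 0.05.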
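95% confidence interval for $\mu_1 - \mu_2$:
$$(\bar{x}_1 - \bar{x}_2) \pm t_{0.025,\,7}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = 5.65 \pm 2.365 \times 1.4332 \approx (2.26,\ 9.04)$$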
The confidence interval does not contain zero, so the difference between the two means is statistically significant.
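The interval itself can be computed from the summary table as follows. This sketch assumes the two-sided 95% critical value for 7 degrees of freedom is taken from a standard t table (about 2.3646); the constant name `T_CRIT` is illustrative.

```python
import math

# Summary statistics as reported in the table.
n1, mean1, var1 = 4, 13.25, 4.916667
n2, mean2, var2 = 5, 7.6, 4.3

# Assumed: two-sided 95% critical value for df = 7, read from a t table.
T_CRIT = 2.3646

df = n1 + n2 - 2
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

diff = mean1 - mean2
margin = T_CRIT * std_err
ci_low, ci_high = diff - margin, diff + margin

print(f"difference = {diff:.2f} ppm, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
print("interval excludes zero:", ci_low > 0 or ci_high < 0)
```

The interval runs from roughly 2.26 to 9.04 ppm, so it lies entirely above zero.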