Using the following data (already sorted), use a goodness of fit test to test whether it comes from an exponential distribution. The exponential distribution has one parameter, its mean, μ (which is also its standard deviation). The exponential distribution is a continuous distribution that takes on only positive values in the interval (0,∞). Probabilities for the exponential distribution can be found based on the following probability expression:
$P(X \le x) = 1 - e^{-x/\mu}$, for $x > 0$.
Use 10 equally likely cells for your goodness of fit test.
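Ten equally likely cells each carry probability 0.1 under the hypothesized distribution, so their boundaries are the deciles of the exponential distribution, found by inverting its CDF: $x_i = \mu \ln\left(\frac{10}{10 - i}\right)$. The following is a minimal Python sketch of that calculation, assuming $\mu$ is taken to be the sample mean 9.974 from the descriptive statistics; the function name is illustrative only.

```python
import math

def equiprobable_boundaries(mu, cells=10):
    """Boundaries of `cells` equally likely cells for an exponential with mean mu."""
    # Solve 1 - exp(-x/mu) = i/cells for x, for i = 0 .. cells-1.
    inner = [mu * math.log(cells / (cells - i)) for i in range(cells)]
    return inner + [math.inf]  # the last cell is unbounded above

# Assumption: mu is the sample mean reported in the descriptive statistics.
boundaries = equiprobable_boundaries(9.974)
for lo, hi in zip(boundaries, boundaries[1:]):
    print(f"{lo:7.3f} to {hi:7.3f}")
```

With 250 observations, each such cell would have an expected frequency of 250 / 10 = 25.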
Data Display
0.2 0.4 0.5 0.5 0.7 0.8 1.0 1.2 1.2 1.2
1.4 1.5 1.5 1.6 1.7 1.7 1.7 1.8 1.8 1.9
2.0 2.3 2.6 2.7 2.7 2.8 2.8 2.8 2.8 2.8
2.8 2.9 2.9 3.0 3.0 3.0 3.2 3.2 3.2 3.4
3.4 3.5 3.6 3.6 3.7 3.8 3.9 3.9 3.9 4.0
4.1 4.1 4.2 4.3 4.5 4.5 4.5 4.6 4.7 4.8
4.8 4.9 4.9 4.9 5.0 5.0 5.1 5.1 5.1 5.3
5.3 5.3 5.3 5.4 5.4 5.4 5.4 5.5 5.5 5.5
5.6 5.6 5.6 5.7 5.7 5.8 5.8 5.8 5.9 5.9
6.0 6.0 6.2 6.2 6.2 6.3 6.3 6.3 6.3 6.4
6.6 6.6 6.6 6.6 6.7 6.8 6.9 6.9 6.9 7.0
7.0 7.1 7.2 7.3 7.3 7.4 7.5 7.5 7.6 7.6
7.7 7.8 7.8 7.9 8.0 8.0 8.0 8.1 8.1 8.1
8.2 8.3 8.4 8.4 8.4 8.5 8.5 8.6 8.6 8.7
8.7 8.8 9.0 9.1 9.1 9.2 9.3 9.4 9.5 9.6
9.6 9.6 9.8 9.9 9.9 9.9 10.0 10.1 10.2 10.5
10.6 10.7 10.7 10.8 10.9 10.9 11.0 11.0 11.4 11.5
11.7 11.8 11.8 11.9 12.0 12.0 12.1 12.1 12.3 12.3
12.3 12.3 12.6 12.9 13.1 13.3 13.3 13.4 13.5 13.6
13.9 14.0 14.2 14.2 14.3 14.3 14.4 15.0 15.0 15.2
15.6 15.6 15.7 15.7 15.7 15.9 16.0 16.3 16.4 16.5
16.5 16.6 16.6 16.7 17.2 17.3 17.3 17.4 17.7 17.9
18.6 18.8 19.9 19.9 19.9 20.0 20.1 20.3 20.4 21.0
21.3 21.5 22.2 23.3 23.5 23.9 24.3 24.8 25.5 25.5
25.6 25.8 27.5 28.2 30.9 35.7 36.3 37.2 40.9 52.8
Descriptive Statistics:
Variable N Mean
Exp? 250 9.974
We test the hypothesis,
Null Hypothesis : The given data comes from an exponential distribution, versus
Alternative Hypothesis : The given data does not come from an exponential distribution.
We need to form a frequency table to solve the problem. Here, I have divided the observations into ten class intervals of equal width 5.5 and written the corresponding frequencies. Each interval includes its lower limit and excludes its upper limit.
Class Interval | Frequency |
0 - 5.5 | 77 |
5.5 - 11 | 89 |
11 - 16.5 | 43 |
16.5 - 22 | 23 |
22 - 27.5 | 10 |
27.5 - 33 | 3 |
33 - 38.5 | 3 |
38.5 - 44 | 1 |
44 - 49.5 | 0 |
49.5 - 55 | 1 |
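The counting behind this table can be sketched in Python as follows. This is an illustration under stated assumptions: the helper name `bin_counts` is hypothetical, only the last row of the data display is used as sample input, and a value equal to a class limit is counted in the interval that starts there (which is how the frequencies above come out).

```python
def bin_counts(values, width, n_bins):
    """Count values falling in [k*width, (k+1)*width) for k = 0 .. n_bins-1."""
    counts = [0] * n_bins
    for x in values:
        k = int(x // width)  # index of the interval containing x
        if 0 <= k < n_bins:
            counts[k] += 1
    return counts

# Sample input: the last row of the data display only.
last_row = [25.6, 25.8, 27.5, 28.2, 30.9, 35.7, 36.3, 37.2, 40.9, 52.8]
sample_counts = bin_counts(last_row, width=5.5, n_bins=10)
print(sample_counts)
```

Passing all 250 observations instead of `last_row` would give the full frequency column.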
If the parameter of the distribution were not provided, we would have to estimate it from the data. Here the parameter is already given in the descriptive statistics: the mean is $\mu = 9.974$.
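If X follows an exponential distribution with mean $\mu$, then the pdf of X is given by
$f(x) = \frac{1}{\mu} e^{-x/\mu}, \quad x > 0.$
Then,
$P(a < X < b) = \int_a^b \frac{1}{\mu} e^{-x/\mu} \, dx = e^{-a/\mu} - e^{-b/\mu}.$
Hence, with $\mu = 9.974$,
$P(a < X < b) = e^{-a/9.974} - e^{-b/9.974}.$
Expected frequency,
$E = N \times P(a < X < b) = 250 \times P(a < X < b).$
The integral can be evaluated with a calculator or from the closed form above.
Here $a$ and $b$ are the lower and upper limits of the class interval, respectively.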
Class Interval | Observed Frequency, $O$ | $P(a < X < b)$ | Expected Frequency, $E$ | $(O - E)^2 / E$ |
0 - 5.5 | 77 | 0.424 | 106 | 7.934 |
5.5 - 11 | 89 | 0.244 | 61 | 12.852 |
11 - 16.5 | 43 | 0.141 | 36 | 1.3611 |
16.5 - 22 | 23 | 0.081 | 20 | 0.45 |
22 - 27.5 | 10 | 0.047 | 12 | 0.333 |
27.5 - 33 | 3 | 0.027 | 7 | 3.2667 |
33 - 38.5 | 3 | 0.016 | 4 | |
38.5 - 44 | 1 | 0.009 | 2 | |
44 - 49.5 | 0 | 0.005 | 1 | |
49.5 - 55 | 1 | 0.003 | 1 | |
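The probability and expected-frequency columns of the table above can be reproduced with a short Python sketch. It assumes the exponential interval probability $e^{-a/\mu} - e^{-b/\mu}$ with $\mu = 9.974$ and $N = 250$; the names `interval_prob`, `edges`, `probs`, and `expected` are illustrative.

```python
import math

MU = 9.974    # mean from the descriptive statistics
N = 250       # number of observations
WIDTH = 5.5   # class width used in the table

def interval_prob(a, b, mu=MU):
    """P(a < X < b) for an exponential distribution with mean mu."""
    return math.exp(-a / mu) - math.exp(-b / mu)

edges = [WIDTH * k for k in range(11)]  # 0, 5.5, ..., 55
probs = [interval_prob(a, b) for a, b in zip(edges, edges[1:])]
expected = [N * p for p in probs]       # unrounded expected frequencies

for a, b, p, e in zip(edges, edges[1:], probs, expected):
    print(f"{a:4.1f} - {b:4.1f}: P = {p:.3f}, E = {e:.2f}")
```

The sketch prints unrounded expected frequencies, whereas the table reports them as whole numbers, so small differences are to be expected.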
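If the expected frequencies are less than five, the chi-square test does not work well. Hence, we pool the classes with small expected frequencies so that the pooled class has an expected frequency of at least 5. Here the last five classes (27.5 - 55) are pooled into one.
For the pooled values, $O = 3 + 3 + 1 + 0 + 1 = 8$ and $E = 7 + 4 + 2 + 1 + 1 = 15$, so $(O - E)^2 / E = (8 - 15)^2 / 15 = 3.2667$.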
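The test statistic under the null hypothesis is
$U = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i},$
which approximately follows a chi-square distribution with k - r - 1 degrees of freedom.
From the above table, U = 26.1968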
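As a check on the arithmetic, the statistic can be recomputed from the observed and expected columns of the table. The sketch below is in Python and assumes, as the table's single pooled entry suggests, that the last five classes are merged into one; `pool_tail` is a hypothetical helper name.

```python
# Observed and (rounded) expected frequencies, copied from the table.
observed = [77, 89, 43, 23, 10, 3, 3, 1, 0, 1]
expected = [106, 61, 36, 20, 12, 7, 4, 2, 1, 1]

def pool_tail(freqs, n_tail):
    """Merge the last n_tail classes into a single class."""
    return freqs[:-n_tail] + [sum(freqs[-n_tail:])]

# Assumption: the last five classes are pooled, leaving k = 6 classes.
obs_pooled = pool_tail(observed, 5)
exp_pooled = pool_tail(expected, 5)

# Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E over classes.
U = sum((o - e) ** 2 / e for o, e in zip(obs_pooled, exp_pooled))
print(obs_pooled, exp_pooled, round(U, 4))
```

Computed without rounding the individual terms, the sum comes to about 26.1975; the value 26.1968 quoted above differs only because each term in the table is rounded before adding.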
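Critical value: $\chi^2_{\alpha,\,k-r-1} = \chi^2_{0.05,\,5} = 11.0705$ (from the chi-square table),
where $\alpha$ is the level of significance = 0.05.
Degrees of freedom = k - r - 1 = 5, where k is the number of classes after pooling (i.e., k = 6) and r is the number of parameters estimated (i.e., r = 0). If the mean 9.974 is instead treated as estimated from the sample, then r = 1, the degrees of freedom are 4, and the critical value is 9.488.
We reject the null hypothesis if $U > \chi^2_{\alpha,\,k-r-1}$.
Here, $U = 26.1968 > 11.0705$.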
Hence, we reject the null hypothesis and conclude that the exponential distribution is not a good fit for the above data.
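The same decision can be expressed through a p-value instead of a table lookup. The Python sketch below uses the closed-form upper-tail probability of the chi-square distribution for integer degrees of freedom, relying only on the standard library; `chi2_sf` is a hypothetical helper written for this illustration, not a library function.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(chi-square with integer df > x)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df = 2m: exp(-x/2) * sum_{j=0}^{m-1} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df = 2m+1: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2)
    #                * sum_{j=0}^{m-1} x^j / (1*3*...*(2j+1))
    term, total = 1.0, 0.0
    for j in range(df // 2):
        if j > 0:
            term *= x / (2 * j + 1)
        total += term
    tail = math.sqrt(2 * x / math.pi) * math.exp(-half) * total
    return math.erfc(math.sqrt(half)) + tail

U = 26.1968   # test statistic from the table
df = 5        # k - r - 1 with k = 6, r = 0
p_value = chi2_sf(U, df)
print(f"p-value = {p_value:.6f}; reject H0 at 0.05: {p_value < 0.05}")
```

The p-value is well below 0.05, which agrees with the conclusion reached from the critical value.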