A study is conducted to determine the relationship between a driver's age and the number of accidents that he or she has had over a 1-year period.
The data from this study is in the table:
Age | 16 | 24 | 18 | 17 | 23 | 27 | 32 |
Accidents | 3 | 2 | 5 | 2 | 0 | 1 | 1 |
The correlation coefficient for this problem is
There is / is not evidence of significant correlation at the 5% significance level.
The best estimate for the number of accidents a 20-year-old driver has in 1 year is
Observation table -
Sr. No. | Age(x) | Accidents(y) | (x-x_bar) | (y-y_bar) | (x-x_bar)(y-y_bar) | (x-x_bar)^2 | (y-y_bar)^2 |
1 | 16 | 3 | -6.4286 | 1 | -6.4286 | 41.3269 | 1 |
2 | 24 | 2 | 1.5714 | 0 | 0 | 2.4693 | 0 |
3 | 18 | 5 | -4.4286 | 3 | -13.2858 | 19.6125 | 9 |
4 | 17 | 2 | -5.4286 | 0 | 0 | 29.4697 | 0 |
5 | 23 | 0 | 0.5714 | -2 | -1.1428 | 0.3265 | 4 |
6 | 27 | 1 | 4.5714 | -1 | -4.5714 | 20.8977 | 1 |
7 | 32 | 1 | 9.5714 | -1 | -9.5714 | 91.6117 | 1 |
Total | 157 | 14 | - | - | -35 | 205.7143 | 16 |
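The formula for the correlation coefficient is -
$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}}$$
Calculations -
$\bar{x} = 157/7 = 22.4286$, $\bar{y} = 14/7 = 2$
$$r = \frac{-35}{\sqrt{205.7143 \times 16}} = \frac{-35}{57.3710} = -0.6101$$
Hence, the correlation coefficient between age and accidents is -0.6101.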
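As a cross-check on the hand calculation, the correlation coefficient can be recomputed directly from the raw data. The following is a minimal sketch using only the Python standard library; the names `ages`, `accidents`, and `pearson_r` are illustrative and not part of the original problem.

```python
import math

# Data copied from the table in the problem statement.
ages = [16, 24, 18, 17, 23, 27, 32]
accidents = [3, 2, 5, 2, 0, 1, 1]


def pearson_r(x, y):
    """Pearson correlation computed from deviations about the means."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums that correspond to the "Total" row of the observation table.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)


r = pearson_r(ages, accidents)
print(f"r = {r:.4f}")  # prints "r = -0.6101"
```

The negative sign indicates that, in this sample, older drivers tended to have fewer accidents.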
Next, we test whether the correlation coefficient is significant, using the t-test for the significance of a correlation coefficient.
Null hypothesis - H0 : The correlation coefficient is not significant, i.e. $\rho = 0$.
Alternative hypothesis - H1 : The correlation coefficient is significant, i.e. $\rho \neq 0$.
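Test statistic -
$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$$
Under H0, this statistic follows a t-distribution with n - 2 degrees of freedom.
Test criterion -
Reject H0 if $t \geq t_{\alpha/2,\,n-2}$ or $t \leq -t_{\alpha/2,\,n-2}$.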
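Calculations -
r = -0.6101, n = 7
$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} = \frac{-0.6101\sqrt{7-2}}{\sqrt{1-(-0.6101)^2}} = \frac{-1.3642}{0.7923} \approx -1.72$$
Critical value -
$t_{\alpha/2,\,n-2} = t_{0.05/2,\,7-2} = t_{0.025,\,5} = 2.5706$
Conclusion -
Since $-2.5706 < t \approx -1.72 < 2.5706$, the test statistic does not fall in the rejection region, so we fail to reject H0 at the 5% level of significance.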
Result -
There is no significant correlation between age & accidents.
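The significance test can also be recomputed from the raw data. This is a rough sketch under stated assumptions: the critical value 2.5706 is taken from the text rather than computed, and the names `correlation_t_test` and `T_CRIT` are hypothetical helpers introduced only for illustration.

```python
import math

ages = [16, 24, 18, 17, 23, 27, 32]
accidents = [3, 2, 5, 2, 0, 1, 1]

# Two-tailed critical value t_{0.025, 5}, copied from the text (assumption:
# not computed here, since the standard library has no t-distribution).
T_CRIT = 2.5706


def correlation_t_test(x, y):
    """Return (r, t) where t = r * sqrt(n - 2) / sqrt(1 - r^2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    r = s_xy / math.sqrt(s_xx * s_yy)
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
    return r, t


r, t = correlation_t_test(ages, accidents)
reject_h0 = abs(t) >= T_CRIT  # two-tailed rejection rule
print(f"t = {t:.2f}, reject H0: {reject_h0}")  # prints "t = -1.72, reject H0: False"
```

Because |t| is below the critical value, the sketch reaches the same result as stated above: no significant correlation at the 5% level.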
Next, we find the best estimate for the number of accidents a 20-year-old driver has in 1 year -
Let the linear regression equation be -
Y = a + bX
Where,
$$b = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}, \qquad a = \bar{y} - b\bar{x}$$
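Calculations -
$b = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{-35}{205.7143} = -0.1701$
$a = \bar{y} - b\bar{x} = 2 - (22.4286)(-0.1701) \approx 2 + 3.8159 = 5.8159$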
The linear regression equation is -
Y = a + bX = 5.8159 - 0.1701X
The best estimate for the number of accidents a 20-year-old driver has in 1 year is -
Y = 5.8159 - 0.1701X = 5.8159 - 0.1701(20) = 5.8159 - 3.402 = 2.4139 ≈ 2
Hence, the best estimate for the number of accidents a 20-year-old driver has in 1 year is 2.
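The least-squares fit and the prediction at age 20 can be checked with a short script. This is a minimal sketch, assuming the same simple linear model Y = a + bX; the function name `fit_line` is illustrative only.

```python
ages = [16, 24, 18, 17, 23, 27, 32]
accidents = [3, 2, 5, 2, 0, 1, 1]


def fit_line(x, y):
    """Least-squares intercept a and slope b for Y = a + bX."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx          # slope
    a = y_bar - b * x_bar    # intercept
    return a, b


a, b = fit_line(ages, accidents)
estimate = a + b * 20  # predicted accidents for a 20-year-old driver
print(f"a = {a:.4f}, b = {b:.4f}, estimate = {estimate:.4f}")
```

Carrying full precision gives a ≈ 5.8160 and an estimate of about 2.4132. These differ from the hand calculation only in the last digits, because of intermediate rounding, and round to the same answer of 2 accidents.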
Note: The t-table is provided below.