country | stork pairs (10^2) | birth rate (10^4/year) |
Albania | 1 | 8 |
Austria | 3 | 9 |
Greece | 25 | 11 |
Portugal | 15 | 12 |
Romania | 50 | 35 |
Spain | 80 | 45 |
g) Calculate the average error for the predicted birth rates
h) Calculate the proportion of variance accounted for and explain it verbally.
Let X = stork pairs (10^2) and Y = birth rate (10^4/year). We fit the linear regression model Y = a + bX, where a is the intercept and b is the slope (regression coefficient).
Sl. no. | X | Y | (X - mean X)^2 | (Y - mean Y)^2 | (X - mean X)(Y - mean Y) |
1 | 1 | 8 | 784 | 144 | 336 |
2 | 3 | 9 | 676 | 121 | 286 |
3 | 25 | 11 | 16 | 81 | 36 |
4 | 15 | 12 | 196 | 64 | 112 |
5 | 50 | 35 | 441 | 225 | 315 |
6 | 80 | 45 | 2601 | 625 | 1275 |
Total | 174 | 120 | 4714 | 1260 | 2360 |
mean | 29 | 20 |
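The slope is b = sum of (X - mean X)(Y - mean Y) / sum of (X - mean X)^2 = 2360/4714 = 0.5006 (approx.), and the intercept is a = mean Y - b (mean X) = 20 - (2360/4714)(29) = 5.4815 (approx.).
Hence, Y = 5.4815 + 0.5006 X
Sl. no. | X | Y | Y(predicted) | (Y(predicted) - Y)^2 |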
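As a quick check on the fitted line, the least-squares slope and intercept can be recomputed from the raw data. This is only a minimal sketch in plain Python; the helper name `fit_line` is an assumption made for illustration and does not come from the original answer.

```python
# Data from the table above: stork pairs (X) and birth rates (Y)
x = [1, 3, 25, 15, 50, 80]
y = [8, 9, 11, 12, 35, 45]


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line Y = a + bX."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Sum of squared deviations of X, and sum of cross products of deviations
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b = sxy / sxx              # slope
    a = y_mean - b * x_mean    # intercept
    return a, b


a, b = fit_line(x, y)
print(f"Y = {a:.4f} + {b:.4f} X")
```

Run as is, this should reproduce the coefficients quoted above up to rounding.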
1 | 1 | 8 | 5.9821 | 4.07192041 |
2 | 3 | 9 | 6.9833 | 4.06707889 |
3 | 25 | 11 | 17.9965 | 48.95101225 |
4 | 15 | 12 | 12.9905 | 0.98109025 |
5 | 50 | 35 | 30.5115 | 20.14663225 |
6 | 80 | 45 | 45.5295 | 0.28037025 |
sum of squared errors (SSE) | 78.4981043 |
average squared error (SSE/n) | 13.08301738 |
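Hence g) the average squared error for the predicted birth rates = SSE/n = 78.4981/6 = 13.083 (approximately). Taking the square root gives a typical prediction error of about 3.617 in the units of Y.
If the two estimated parameters are taken into account, MSE = SSE/(n-2) = 78.4981/(6-2) = 19.6245.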
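The error calculations in part g) can be reproduced with a few lines of Python. The sketch below deliberately uses the rounded coefficients quoted in the text (5.4815 and 0.5006), so the figures should match the table. The variable names are illustrative assumptions.

```python
# Data from the table above
x = [1, 3, 25, 15, 50, 80]
y = [8, 9, 11, 12, 35, 45]

# Rounded coefficients as quoted in the text (Y = 5.4815 + 0.5006 X)
a, b = 5.4815, 0.5006

# Predicted birth rate and squared error for each country
predicted = [a + b * xi for xi in x]
squared_errors = [(yp - yi) ** 2 for yp, yi in zip(predicted, y)]

n = len(x)
sse = sum(squared_errors)   # sum of squared errors
avg_sq_err = sse / n        # average squared error, SSE/n
mse = sse / (n - 2)         # SSE divided by the degrees of freedom n - 2

for xi, yi, yp, se in zip(x, y, predicted, squared_errors):
    print(f"{xi:>3} {yi:>3} {yp:8.4f} {se:12.8f}")
print(f"SSE = {sse:.4f}, SSE/n = {avg_sq_err:.4f}, SSE/(n-2) = {mse:.4f}")
```

Dividing by n gives the plain average of the squared errors. Dividing by n - 2 instead accounts for the two parameters (a and b) estimated from the same data.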
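h) Note that, for a simple linear regression problem, the coefficient of determination R^2 equals the square of the correlation coefficient r:
$$r = \frac{\sum (X-\bar{X})(Y-\bar{Y})}{\sqrt{\sum (X-\bar{X})^2 \, \sum (Y-\bar{Y})^2}} = \frac{2360}{\sqrt{4714 \times 1260}} \approx 0.968$$
and
$$R^2 = r^2 = \frac{2360^2}{4714 \times 1260} \approx 0.9377$$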
Converting r^2 into a percentage gives 93.77%, so about 93.77% of the total variation in the response variable (birth rate) is explained by the simple linear regression on stork pairs. This is a very high value and indicates a good fit to the observations. R^2 measures how well changes in the response variable can be explained by changes in the predictor variable.
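The proportion of variance accounted for can be checked in two equivalent ways: as the squared correlation coefficient, and as 1 - SSE/SST, where SST is the total sum of squares of Y about its mean. The following is a minimal sketch in plain Python with illustrative variable names. It uses unrounded coefficients, so small differences from hand-rounded figures are expected.

```python
import math

# Data from the table above
x = [1, 3, 25, 15, 50, 80]
y = [8, 9, 11, 12, 35, 45]

n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n

# Sums of squares and cross products of deviations from the means
sxx = sum((xi - x_mean) ** 2 for xi in x)
syy = sum((yi - y_mean) ** 2 for yi in y)   # total sum of squares (SST)
sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))

# Route 1: square of the correlation coefficient
r = sxy / math.sqrt(sxx * syy)
r_squared = r ** 2

# Route 2: 1 - SSE/SST, using the least-squares line fitted to the same data
b = sxy / sxx
a = y_mean - b * x_mean
sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
r_squared_alt = 1 - sse / syy

print(f"r = {r:.4f}, r^2 = {r_squared:.4f}, 1 - SSE/SST = {r_squared_alt:.4f}")
```

Both routes should agree for a least-squares fit, which is why R^2 can be read as the share of the variation in Y that the line accounts for.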