The table below shows the life expectancy for an individual born in
the United States in certain years.
Year of Birth | Life Expectancy |
---|---|
1930 | 59.7 |
1940 | 62.9 |
1950 | 70.2 |
1965 | 69.7 |
1973 | 71.4 |
1982 | 74.5 |
1987 | 75 |
1992 | 75.7 |
2010 | 78.7 |
1. Find the estimated life expectancy for an individual born in 1973.
2. Use the two points in part (e) to plot the least squares line on your graph from part (b).
3. Are there any outliers in the data?
   - Yes, 1930 and 2010 are outliers.
   - Yes, 1930 and 1950 are outliers.
   - Yes, 1950 is an outlier.
   - No, there are no outliers.
4. Using the least squares line, find the estimated life
expectancy for an individual born in 1870. (Round your answer to
one decimal place.)
Does the least squares line give an accurate estimate for that
year? Explain why or why not.
   - Yes, because the estimate is over 50 years.
   - No, because 1870 is outside the domain of the least squares line.
1) -
To find the estimated life expectancy for the year 1973, we first have to find the equation of the regression line -
Equation of regression line - Y = b0 + b1X
Where,
b1 = Σ(x - x_bar)(y - y_bar) / Σ(x - x_bar)^2
b0 = y_bar - b1 * x_bar
Observation table -
Year (x) | Life expectancy (y) | (x - x_bar) | (y - y_bar) | (x - x_bar)(y - y_bar) | (x - x_bar)^2 | (y - y_bar)^2 |
---|---|---|---|---|---|---|
1930 | 59.7 | -39.889 | -11.1667 | 445.4284963 | 1591.132321 | 124.6951889 |
1940 | 62.9 | -29.889 | -7.9667 | 238.1166963 | 893.352321 | 63.46830889 |
1950 | 70.2 | -19.889 | -0.6667 | 13.2599963 | 395.572321 | 0.44448889 |
1965 | 69.7 | -4.889 | -1.1667 | 5.7039963 | 23.902321 | 1.36118889 |
1973 | 71.4 | 3.111 | 0.5333 | 1.6590963 | 9.678321 | 0.28440889 |
1982 | 74.5 | 12.111 | 3.6333 | 44.0028963 | 146.676321 | 13.20086889 |
1987 | 75 | 17.111 | 4.1333 | 70.7248963 | 292.786321 | 17.08416889 |
1992 | 75.7 | 22.111 | 4.8333 | 106.8690963 | 488.896321 | 23.36078889 |
2010 | 78.7 | 40.111 | 7.8333 | 314.2014963 | 1608.892321 | 61.36058889 |
Total: 17729 | 637.8 | | | 1239.966667 | 5450.888889 | 305.26 |
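Calculations -
x_bar = 17729 / 9 = 1969.889
y_bar = 637.8 / 9 = 70.8667
b1 = 1239.966667 / 5450.888889 = 0.2275
b0 = y_bar - b1 * x_bar = 70.8667 - (0.2275)(1969.889) = -377.283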
Hence, the equation of regression line is -
Y = -377.283 + 0.2275X
Estimated value of life expectancy for year 1973 -
Y = -377.283 + 0.2275X = -377.283 + (0.2275)(1973) = -377.283 + 448.8575 = 71.5745
The estimated life expectancy for an individual born in 1973 is 71.5745 years, or about 71.6 years.
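The hand calculation in part 1 can be cross-checked with a short script. This is a minimal sketch, assuming the usual least squares formulas (slope = Sxy / Sxx, intercept = y_bar - slope * x_bar); the names `fit_line`, `years`, and `life` are illustrative and not part of the original solution.

```python
# Data from the table: year of birth (x) and life expectancy (y).
years = [1930, 1940, 1950, 1965, 1973, 1982, 1987, 1992, 2010]
life = [59.7, 62.9, 70.2, 69.7, 71.4, 74.5, 75.0, 75.7, 78.7]


def fit_line(x, y):
    """Return (b0, b1) for the least squares line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = fit_line(years, life)
estimate_1973 = b0 + b1 * 1973

print(f"b0 = {b0:.3f}, b1 = {b1:.4f}")
print(f"Estimate for 1973: {estimate_1973:.2f}")
```

Because the script keeps the slope unrounded, its intercept comes out near -377.24 rather than -377.283; the written solution rounds the slope to 0.2275 before computing the intercept. The 1973 estimate agrees with the hand calculation to about three decimal places.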
2) -
Least squares regression line on the graph -
[Figure: least squares line Y = -377.283 + 0.2275X drawn on the graph from part (b)]
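A straight line is fixed by any two of its points, so plotting it only requires evaluating the fitted equation at two x-values. The sketch below assumes the two points are taken at the ends of the data range (1930 and 2010); the points actually used in part (e) are not shown in this solution, and `line_endpoints` is a hypothetical helper name.

```python
# Rounded coefficients of the fitted line from part 1.
B0, B1 = -377.283, 0.2275


def line_endpoints(x_min, x_max, b0=B0, b1=B1):
    """Return two (x, y) points on the line y = b0 + b1 * x."""
    return (x_min, b0 + b1 * x_min), (x_max, b0 + b1 * x_max)


# Assumed choice: the earliest and latest birth years in the table.
p1, p2 = line_endpoints(1930, 2010)
print(p1, p2)
```

Joining the two points with a ruler (or passing their coordinates to a plotting library) gives the least squares line over the range of the data.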
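3) -
To find outliers, we compute the standardized residual for each observation. If the absolute value of any standardized residual is greater than 3, that observation is an outlier.
Formula for standardized residual is -
Standardized residual = ei / s, with s = sqrt(Σ ei^2 / (n - k))
Where,
ei = Y - y_hat = residual of the i-th observation
n = total number of observations = 9
k = number of parameters estimated = 2 (b0 and b1)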
Observation table -
Year (x) | Life expectancy (y) | y_hat | ei = Y - y_hat | ei^2 |
---|---|---|---|---|
1930 | 59.7 | 61.792 | -2.092 | 4.376464 |
1940 | 62.9 | 64.067 | -1.167 | 1.361889 |
1950 | 70.2 | 66.342 | 3.858 | 14.884164 |
1965 | 69.7 | 69.7545 | -0.0545 | 0.00297025 |
1973 | 71.4 | 71.5745 | -0.1745 | 0.03045025 |
1982 | 74.5 | 73.622 | 0.878 | 0.770884 |
1987 | 75 | 74.7595 | 0.2405 | 0.05784025 |
1992 | 75.7 | 75.897 | -0.197 | 0.038809 |
2010 | 78.7 | 79.992 | -1.292 | 1.669264 |
Total | | | | 23.19273475 |
Calculations -
s = sqrt(Σ ei^2 / (n - k)) = sqrt(23.19273475 / 7) = sqrt(3.3132) = 1.8202
Standardized residual = ei / s
Observation table for standardized residuals -
ei = Y - y_hat | Standardized residual |
---|---|
-2.092 | -1.1493 |
-1.167 | -0.6411 |
3.858 | 2.1195 |
-0.0545 | -0.0299 |
-0.1745 | -0.0959 |
0.878 | 0.4824 |
0.2405 | 0.1321 |
-0.197 | -0.1082 |
-1.292 | -0.7098 |
No standardized residual exceeds 3 in absolute value (the largest belongs to 1950), so there are no outliers.
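The outlier check can be reproduced in a few lines. This is a sketch under stated assumptions: it uses the rounded line Y = -377.283 + 0.2275X from part 1 and the simple definition of a standardized residual, the residual divided by s = sqrt(SSE / (n - k)), which ignores leverage. The function name `standardized_residuals` is illustrative, and small numerical differences from hand-computed values are possible.

```python
import math

years = [1930, 1940, 1950, 1965, 1973, 1982, 1987, 1992, 2010]
life = [59.7, 62.9, 70.2, 69.7, 71.4, 74.5, 75.0, 75.7, 78.7]
B0, B1 = -377.283, 0.2275  # rounded coefficients from part 1


def standardized_residuals(x, y, b0, b1, k=2):
    """Return residual / s for each point, where s = sqrt(SSE / (n - k))."""
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)
    s = math.sqrt(sse / (len(x) - k))
    return [e / s for e in residuals]


std_res = standardized_residuals(years, life, B0, B1)

# Flag any year whose standardized residual exceeds 3 in absolute value.
outliers = [yr for yr, r in zip(years, std_res) if abs(r) > 3]

for yr, r in zip(years, std_res):
    print(yr, round(r, 4))
print("Outliers:", outliers)
```

The list `outliers` stays empty for this data, which matches the conclusion that no year is an outlier under the cutoff of 3.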
4) -
We have to estimate the life expectancy for the year 1870 -
Y = 0.2275X - 377.283 = (0.2275)(1870) - 377.283 = 425.425 - 377.283 = 48.142 ≈ 48.1
No, the least squares line does not give an accurate estimate, because 1870 is outside the domain of the least squares line. The data cover only 1930 to 2010, and the linear trend may not hold that far before the earliest observation.
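The extrapolation issue can be made explicit by checking whether a year falls inside the range of the observed data before trusting the estimate. This is a minimal sketch; `predict` is a hypothetical helper, and it uses the rounded coefficients from part 1.

```python
years = [1930, 1940, 1950, 1965, 1973, 1982, 1987, 1992, 2010]
B0, B1 = -377.283, 0.2275  # rounded coefficients from part 1


def predict(year, x_data, b0=B0, b1=B1):
    """Return (estimate, in_range); in_range is False when extrapolating."""
    estimate = b0 + b1 * year
    in_range = min(x_data) <= year <= max(x_data)
    return estimate, in_range


est_1870, ok_1870 = predict(1870, years)
est_1973, ok_1973 = predict(1973, years)

print(round(est_1870, 1), ok_1870)
print(round(est_1973, 1), ok_1973)
```

For 1870 the flag is False because the year lies before 1930, the earliest birth year in the table, so the estimate is an extrapolation. For 1973 the flag is True.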