Using the data in the Excel file under the tab Problem1, develop a regression equation for a country’s lung cancer death rate based on cigarette consumption.
Country | Cigarette Consumption | Lung Cancer Death Rate |
Austria | 455 | 170 |
Canada | 510 | 150 |
Denmark | 380 | 165 |
Finland | 1115 | 350 |
Great Britain | 1145 | 465 |
Holland | 460 | 245 |
Iceland | 220 | 58 |
Norway | 250 | 90 |
Sweden | 310 | 115 |
Switzerland | 530 | 250 |
USA | 1280 | 190 |
All calculations were done in MS Excel; the analysis is shown below.
a.
The coefficient of determination (the square of the correlation coefficient) is 0.549, i.e. the model explains only 54.9% of the total variability in the lung cancer death rate. Hence the model is a moderately good fit.
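The fit can also be reproduced outside Excel. The following is a minimal sketch in plain Python that applies the ordinary least-squares formulas to the data from the question; the function name `fit_line` and the variable names are illustrative, not part of the original worksheet.

```python
# Data copied from the table in the question (same country order).
cigarettes = [455, 510, 380, 1115, 1145, 460, 220, 250, 310, 530, 1280]
death_rate = [170, 150, 165, 350, 465, 245, 58, 90, 115, 250, 190]

def fit_line(x, y):
    """Return (intercept, slope, r_squared) for simple least-squares regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For one predictor, R^2 equals the squared correlation coefficient.
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared

intercept, slope, r_squared = fit_line(cigarettes, death_rate)
print(f"intercept = {intercept:.4f}, slope = {slope:.4f}, R^2 = {r_squared:.4f}")
```

The printed values should agree with the Intercept, Cigarette Consumption, and R Square entries in the Excel summary output.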
b.
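The fitted regression model is
Lung cancer death rate (y) = 65.75 + 0.23 * Cigarette consumption (x)
Substituting x = 725 gives the predicted lung cancer death rate.
For a cigarette consumption of 725, the predicted lung cancer death rate is 65.75 + 0.23 * 725 = 232.5 (about 231.86 if the unrounded coefficients 65.7489 and 0.2291 are used).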
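As a quick check of the substitution, here is a small sketch that evaluates the fitted line at x = 725 with both the rounded coefficients and the full-precision coefficients from the Excel output. The helper name `predict_death_rate` is hypothetical.

```python
# Full-precision coefficients as reported in the Excel regression output.
INTERCEPT = 65.74885702
SLOPE = 0.229115338

def predict_death_rate(consumption, intercept=INTERCEPT, slope=SLOPE):
    """Evaluate the fitted line y = intercept + slope * x."""
    return intercept + slope * consumption

# Prediction with the coefficients rounded to two decimals.
rounded = predict_death_rate(725, intercept=65.75, slope=0.23)
# Prediction with the unrounded coefficients.
full = predict_death_rate(725)

print(f"rounded coefficients: {rounded:.2f}")
print(f"full precision:       {full:.2f}")
```

The small gap between the two results comes only from rounding the slope from 0.2291 to 0.23.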
c.
Finland is observation 4 (x = 1115, y = 350). The model predicts 65.75 + 0.2291 * 1115 = 321.21, so the residual for Finland is 350 - 321.21 = 28.79.
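A residual is the observed value minus the value predicted by the fitted line. The sketch below computes it for every country, assuming the coefficients from the Excel output; the dictionary layout and the function name `residual` are illustrative choices.

```python
# Data copied from the table in the question: (consumption, death rate).
data = {
    "Austria": (455, 170),
    "Canada": (510, 150),
    "Denmark": (380, 165),
    "Finland": (1115, 350),
    "Great Britain": (1145, 465),
    "Holland": (460, 245),
    "Iceland": (220, 58),
    "Norway": (250, 90),
    "Sweden": (310, 115),
    "Switzerland": (530, 250),
    "USA": (1280, 190),
}

# Coefficients as reported in the Excel regression output.
INTERCEPT = 65.74885702
SLOPE = 0.229115338

def residual(consumption, observed_rate):
    """Observed death rate minus the rate predicted by the fitted line."""
    predicted = INTERCEPT + SLOPE * consumption
    return observed_rate - predicted

residuals = {country: residual(x, y) for country, (x, y) in data.items()}
print(f"Finland residual: {residuals['Finland']:.2f}")
```

For Finland this gives roughly 28.79, in line with observation 4 in the Excel residual output.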
d.
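Applying the 1.5 * IQR rule to cigarette consumption (Q1 = 345, Q3 = 822.5, IQR = 477.5) gives a lower bound of -371.25 and an upper bound of 1538.75. Every country's consumption lies within these bounds, so there is no outlier in the given data, as the analysis below shows.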
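The outlier check can be sketched in a few lines. This assumes the worksheet applied the 1.5 * IQR rule to cigarette consumption using Excel-style inclusive quartiles (linear interpolation between sorted values), which is consistent with the Q1, Q3, and bound values in the output. The helper `quantile_inclusive` is a hypothetical name.

```python
# Cigarette consumption copied from the table in the question.
cigarettes = [455, 510, 380, 1115, 1145, 460, 220, 250, 310, 530, 1280]

def quantile_inclusive(values, q):
    """Quantile by linear interpolation at position (n - 1) * q in the sorted data."""
    data = sorted(values)
    pos = (len(data) - 1) * q
    lower = int(pos)
    frac = pos - lower
    if lower + 1 < len(data):
        return data[lower] + frac * (data[lower + 1] - data[lower])
    return data[lower]

q1 = quantile_inclusive(cigarettes, 0.25)
q3 = quantile_inclusive(cigarettes, 0.75)
iqr = q3 - q1
lower_bound = q1 - 1.5 * iqr
upper_bound = q3 + 1.5 * iqr

# Values outside [lower_bound, upper_bound] are flagged as outliers.
outliers = [x for x in cigarettes if x < lower_bound or x > upper_bound]
print(q1, q3, iqr, lower_bound, upper_bound, outliers)
```

Note that this rule looks only at the consumption values; it does not examine the regression residuals.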
Quartiles and outlier bounds (Cigarette Consumption)
Statistic | Value |
Q1 | 345 |
Q3 | 822.5 |
IQR | 477.5 |
Lower bound | -371.25 |
Upper bound | 1538.75 |
SUMMARY OUTPUT
Regression Statistics
Statistic | Value |
Multiple R | 0.740972337 |
R Square | 0.549040004 |
Adjusted R Square | 0.498933338 |
Standard Error | 84.12962829 |
Observations | 11 |
ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 1 | 77554.39625 | 77554.39625 | 10.95742 | 0.009081 |
Residual | 9 | 63700.1492 | 7077.794356 | | |
Total | 10 | 141254.5455 | | | |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
Intercept | 65.74885702 | 48.95871046 | 1.342945033 | 0.212173 | -45.0034 | 176.5012 | -45.0034 | 176.5012 |
Cigarette Consumption | 0.229115338 | 0.069214952 | 3.310200048 | 0.009081 | 0.07254 | 0.38569 | 0.07254 | 0.38569 |
Predicted death rate at x = 725 | 232.5 |
RESIDUAL OUTPUT
Observation | Predicted Lung Cancer Death Rate | Residuals | Standard Residuals |
1 | 169.9963357 | 0.0036643 | 4.59114E-05 |
2 | 182.5976793 | -32.59767928 | -0.408428893 |
3 | 152.8126854 | 12.18731463 | 0.152699564 |
4 | 321.2124586 | 28.78754138 | 0.360690207 |
5 | 328.0859188 | 136.9140812 | 1.715449318 |
6 | 171.1419124 | 73.85808761 | 0.92539646 |
7 | 116.1542313 | -58.15423133 | -0.728636789 |
8 | 123.0276915 | -33.02769146 | -0.413816682 |
9 | 136.7746117 | -21.77461173 | -0.272822507 |
10 | 187.179986 | 62.82001397 | 0.787096179 |
11 | 359.0164893 | -169.0164893 | -2.117672768 |
PROBABILITY OUTPUT
Percentile | Lung Cancer Death Rate |
4.545455 | 58 |
13.63636 | 90 |
22.72727 | 115 |
31.81818 | 150 |
40.90909 | 165 |
50 | 170 |
59.09091 | 190 |
68.18182 | 245 |
77.27273 | 250 |
86.36364 | 350 |
95.45455 | 465 |