Question

Using the data in the Excel file under the tab Problem1, develop a regression equation for a country’s lung cancer death rate based on cigarette consumption.

  a. Is this a good regression model? Why or why not?
  b. If Cigarette Consumption was 725, what would you predict the Lung Cancer Death Rate to be?
  c. Using your regression model, what is the Residual Value for Finland?
  d. Do you think there are any outliers in the model? If so, what would you do with them?

| Country | Cigarette Consumption | Lung Cancer Death Rate |
|---|---|---|
| Austria | 455 | 170 |
| Canada | 510 | 150 |
| Denmark | 380 | 165 |
| Finland | 1115 | 350 |
| Great Britain | 1145 | 465 |
| Holland | 460 | 245 |
| Iceland | 220 | 58 |
| Norway | 250 | 90 |
| Sweden | 310 | 115 |
| Switzerland | 530 | 250 |
| USA | 1280 | 190 |

Solutions

Expert Solution

All calculations were done in MS Excel; the analysis output is shown below.

a.

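Here the coefficient of determination (R Square, the square of the correlation coefficient) is 0.549, i.e. the model explains only 54.9% of the total variability in the lung cancer death rate. The slope is statistically significant (p = 0.009), so the model is moderately good, but it leaves a large share of the variation unexplained.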
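As a cross-check on the Excel figures, the fit and R Square can be reproduced with a few lines of plain Python. This is only a sketch: the function name `fit_simple_ols` and the variable names are illustrative, not part of the original solution.

```python
# Data from the question, in the order Austria ... USA.
consumption = [455, 510, 380, 1115, 1145, 460, 220, 250, 310, 530, 1280]
death_rate = [170, 150, 165, 350, 465, 245, 58, 90, 115, 250, 190]


def fit_simple_ols(x, y):
    """Return (intercept, slope, r_squared) for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # In simple regression, R^2 equals the squared correlation coefficient.
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


intercept, slope, r_squared = fit_simple_ols(consumption, death_rate)
print(f"intercept = {intercept:.4f}, slope = {slope:.4f}, R^2 = {r_squared:.4f}")
```

The printed values should agree with the Intercept, Cigarette Consumption coefficient, and R Square entries in the Excel summary output.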

b.

Here, the regression model is,

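Lung cancer death rate (y) = 65.75 + 0.23 * Cigarette consumption (x)

So, we substitute x = 725 to get the predicted lung cancer death rate.

For a cigarette consumption of 725, the predicted lung cancer death rate is 65.75 + 0.23 * 725 = 232.5 (about 231.86 if the unrounded coefficients 65.7489 and 0.2291 are used).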
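The effect of rounding the coefficients before predicting can be seen in a short sketch. The constants are copied from the Excel coefficients table; the helper name `predict` is an assumption made for illustration.

```python
# Coefficients copied from the Excel coefficients table.
INTERCEPT = 65.74885702
SLOPE = 0.229115338


def predict(x, intercept=INTERCEPT, slope=SLOPE):
    """Predicted lung cancer death rate for cigarette consumption x."""
    return intercept + slope * x


full_precision = predict(725)            # unrounded coefficients
rounded = predict(725, 65.75, 0.23)      # coefficients rounded to 2 decimals

print(f"unrounded coefficients: {full_precision:.2f}")
print(f"rounded coefficients:   {rounded:.2f}")
```

The two results differ by roughly 0.6, which is small relative to the model's standard error of about 84, so either figure is a reasonable answer as long as the rounding is stated.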

c.

Using the model, the residual for Finland is the observed value minus the predicted value: 350 - 321.21 = 28.79.
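A residual is the observed value minus the model's prediction. A minimal sketch for Finland (observation 4 in the residual output), using the coefficients from the Excel output; the variable names are illustrative:

```python
# Coefficients copied from the Excel coefficients table.
INTERCEPT = 65.74885702
SLOPE = 0.229115338

# Finland's row from the data table.
finland_consumption = 1115
finland_death_rate = 350

predicted = INTERCEPT + SLOPE * finland_consumption
residual = finland_death_rate - predicted  # observed minus predicted

print(f"predicted = {predicted:.2f}, residual = {residual:.2f}")
```

A positive residual means Finland's observed death rate lies above the fitted line.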

d.

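By the 1.5 * IQR rule applied to cigarette consumption, no value falls outside the bounds (-371.25 to 1538.75), so the analysis shown below finds no outlier by that criterion. The USA does have a standardized residual of -2.12, the only one beyond 2 in absolute value, so it is worth examining before deciding whether to keep it in the model.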
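Both screens can be sketched in Python. Two assumptions are made here: quartiles use the inclusive method (which matches the Q1 and Q3 values in the Excel sheet), and an absolute standardized residual above 2 is treated as a rule-of-thumb flag, not a definitive test. Standardized residuals are computed as the residual divided by the sample standard deviation of the residuals, which reproduces the Excel column.

```python
import statistics

countries = ["Austria", "Canada", "Denmark", "Finland", "Great Britain",
             "Holland", "Iceland", "Norway", "Sweden", "Switzerland", "USA"]
consumption = [455, 510, 380, 1115, 1145, 460, 220, 250, 310, 530, 1280]
death_rate = [170, 150, 165, 350, 465, 245, 58, 90, 115, 250, 190]

# Screen 1: 1.5 * IQR fences on cigarette consumption.
q1, _, q3 = statistics.quantiles(consumption, n=4, method="inclusive")
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
x_outliers = [c for c, x in zip(countries, consumption)
              if x < lower or x > upper]

# Screen 2: standardized residuals from the least-squares fit.
mean_x = statistics.fmean(consumption)
mean_y = statistics.fmean(death_rate)
sxx = sum((x - mean_x) ** 2 for x in consumption)
sxy = sum((x - mean_x) * (y - mean_y)
          for x, y in zip(consumption, death_rate))
slope = sxy / sxx
intercept = mean_y - slope * mean_x
residuals = [y - (intercept + slope * x)
             for x, y in zip(consumption, death_rate)]
resid_sd = statistics.stdev(residuals)  # sample standard deviation (n - 1)
standardized = {c: r / resid_sd for c, r in zip(countries, residuals)}
flagged = [c for c, z in standardized.items() if abs(z) > 2]

print(f"IQR fences: ({lower}, {upper}); outside fences: {x_outliers}")
print(f"|standardized residual| > 2: {flagged}")
```

The IQR screen looks only at the predictor, while the residual screen looks at how far each country sits from the fitted line, so the two can disagree. A flagged point is a prompt to check the data and refit with and without it, not an automatic reason to delete it.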

Outlier bounds for Cigarette Consumption (1.5 * IQR rule)

| Quantity | Value |
|---|---|
| Q1 | 345 |
| Q3 | 822.5 |
| IQR | 477.5 |
| Lower bound | -371.25 |
| Upper bound | 1538.75 |

Predicted death rate for Cigarette Consumption = 725: 232.5

SUMMARY OUTPUT

Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.740972337 |
| R Square | 0.549040004 |
| Adjusted R Square | 0.498933338 |
| Standard Error | 84.12962829 |
| Observations | 11 |

ANOVA

| Source | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 1 | 77554.39625 | 77554.39625 | 10.95742 | 0.009081 |
| Residual | 9 | 63700.1492 | 7077.794356 | | |
| Total | 10 | 141254.5455 | | | |

| Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
|---|---|---|---|---|---|---|---|---|
| Intercept | 65.74885702 | 48.95871046 | 1.342945033 | 0.212173 | -45.0034 | 176.5012 | -45.0034 | 176.5012 |
| Cigarette Consumption | 0.229115338 | 0.069214952 | 3.310200048 | 0.009081 | 0.07254 | 0.38569 | 0.07254 | 0.38569 |

RESIDUAL OUTPUT

| Observation | Country | Predicted Lung Cancer Death Rate | Residuals | Standard Residuals |
|---|---|---|---|---|
| 1 | Austria | 169.9963357 | 0.0036643 | 4.59114E-05 |
| 2 | Canada | 182.5976793 | -32.59767928 | -0.408428893 |
| 3 | Denmark | 152.8126854 | 12.18731463 | 0.152699564 |
| 4 | Finland | 321.2124586 | 28.78754138 | 0.360690207 |
| 5 | Great Britain | 328.0859188 | 136.9140812 | 1.715449318 |
| 6 | Holland | 171.1419124 | 73.85808761 | 0.92539646 |
| 7 | Iceland | 116.1542313 | -58.15423133 | -0.728636789 |
| 8 | Norway | 123.0276915 | -33.02769146 | -0.413816682 |
| 9 | Sweden | 136.7746117 | -21.77461173 | -0.272822507 |
| 10 | Switzerland | 187.179986 | 62.82001397 | 0.787096179 |
| 11 | USA | 359.0164893 | -169.0164893 | -2.117672768 |

PROBABILITY OUTPUT

| Percentile | Lung Cancer Death Rate |
|---|---|
| 4.545455 | 58 |
| 13.63636 | 90 |
| 22.72727 | 115 |
| 31.81818 | 150 |
| 40.90909 | 165 |
| 50 | 170 |
| 59.09091 | 190 |
| 68.18182 | 245 |
| 77.27273 | 250 |
| 86.36364 | 350 |
| 95.45455 | 465 |
