The National Science Foundation collects data on research and development spending by universities and colleges in the United States. Here are the data for the years 2008 - 2011:
Year | Spending (billions of dollars)
2008 | 51.9
2009 | 54.9
2010 | 58.4
2011 | 62.0
(a) Draw a scatterplot that shows the increase in research and development spending over time. Try to be as accurate as you can with your scatterplot. You may use Excel to help guide you. Does the pattern suggest that the spending is increasing linearly over time?
(b) Find the equation of the least-squares regression line for predicting spending from year. Use Excel to help you. Add this line to your scatterplot.
(c) For each of the four years, find the residual.
(d) Write the regression model for this setting. What are your estimates for the slope and y-intercept in this model?
(e) Use the least-squares regression equation to predict research and development spending for the year 2013 (assume year 2008 is x = 1, year 2009 is x = 2, etc.). The actual spending for 2013 was $63.4 billion. Add this point to your plot, and comment on why your equation performed so poorly.
a)
Yes, the pattern suggests that spending is increasing linearly over time: it rises by a roughly constant 3.0 to 3.6 billion dollars each year.
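A quick numeric check of the linear pattern, offered as a rough sketch rather than a replacement for the scatterplot: if the points lie near a straight line, the year-over-year increases should be about equal. The variable names are illustrative only.

```python
# Spending in billions of dollars for 2008-2011, copied from the table in the question.
years = [2008, 2009, 2010, 2011]
spending = [51.9, 54.9, 58.4, 62.0]

# Year-over-year increases; a roughly constant step suggests a linear trend.
increases = [round(later - earlier, 1) for earlier, later in zip(spending, spending[1:])]

for year, step in zip(years[1:], increases):
    print(f"{year - 1} to {year}: +{step} billion")
```

The three increases are close to one another, which is consistent with a straight-line pattern over these four years.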
b)
Regression Statistics
Multiple R | 0.9991
R Square | 0.9983
Adjusted R Square | 0.9974
Standard Error | 0.2214
Observations | 4
ANOVA
Source | df | SS | MS | F | Significance F
Regression | 1 | 57.1 | 57.1 | 1165.76 | 0.0009
Residual | 2 | 0.1 | 0.0490 | |
Total | 3 | 57.2 | | |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95%
Intercept | 48.3500 | 0.271 | 178.342 | 0.0000 | 47.1835 | 49.52
X | 3.3800 | 0.099 | 34.143 | 0.0009 | 2.9541 | 3.8059
Ŷ = 48.350 + 3.380 * x, where x = 1 for 2008, x = 2 for 2009, etc.
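For readers without Excel, the summary statistics in the output can be reproduced by hand. The following is a minimal sketch in plain Python that takes the coefficients 48.35 and 3.38 as given and recomputes R Square, the standard error, and F. The variable names are my own, not Excel's.

```python
x = [1, 2, 3, 4]                      # 2008 coded as 1, 2009 as 2, and so on
y = [51.9, 54.9, 58.4, 62.0]          # spending in billions of dollars
b0, b1 = 48.35, 3.38                  # coefficients reported in the Excel output

n = len(x)
y_bar = sum(y) / n
fitted = [b0 + b1 * xi for xi in x]

sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # residual sum of squares
sst = sum((yi - y_bar) ** 2 for yi in y)                # total sum of squares
ssr = sst - sse                                         # regression sum of squares

r_squared = ssr / sst
std_error = (sse / (n - 2)) ** 0.5    # residual df = n - 2 for a simple regression
f_stat = ssr / (sse / (n - 2))        # regression df = 1, so MSR equals SSR

print(round(r_squared, 4), round(std_error, 4), round(f_stat, 2))
```

The printed values should agree with the R Square, Standard Error, and F entries in the tables, up to rounding.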
c)
x | y | Ŷ | residual, e_i = y - Ŷ
1 | 51.9 | 51.73 | 0.17
2 | 54.9 | 55.11 | -0.21
3 | 58.4 | 58.49 | -0.09
4 | 62.0 | 61.87 | 0.13
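The residual table can be recomputed directly from the fitted equation. This is a small sketch assuming the coefficients 48.35 and 3.38 from part (b). It also checks that the residuals sum to zero, which always holds for a least-squares line with an intercept.

```python
x = [1, 2, 3, 4]
y = [51.9, 54.9, 58.4, 62.0]
b0, b1 = 48.35, 3.38                  # intercept and slope from part (b)

# Fitted value and residual (observed minus fitted) for each year, rounded to 2 decimals.
fitted = [round(b0 + b1 * xi, 2) for xi in x]
residuals = [round(yi - fi, 2) for yi, fi in zip(y, fitted)]

for xi, yi, fi, ei in zip(x, y, fitted, residuals):
    print(f"x={xi}  y={yi}  fitted={fi}  residual={ei}")

print("sum of residuals:", round(sum(residuals), 2))
```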
d)
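regression model: y = β0 + β1 * x + ε, where ε is the random deviation of spending from the line
estimated slope, β1 = SSxy/SSxx = 16.9 / 5.000 = 3.38
estimated intercept, β0 = y̅ - β1 * x̄ = 56.8 - 3.38 * 2.5 = 48.35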
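The slope and intercept formulas above can be worked through step by step. The sketch below computes SSxx and SSxy from the raw data in plain Python. The names ss_xx and ss_xy are just labels for the sums used in the formulas.

```python
x = [1, 2, 3, 4]
y = [51.9, 54.9, 58.4, 62.0]

n = len(x)
x_bar = sum(x) / n                    # mean of x
y_bar = sum(y) / n                    # mean of y

# Sums of squares and cross-products about the means.
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

slope = ss_xy / ss_xx
intercept = y_bar - slope * x_bar

print(round(ss_xy, 2), round(ss_xx, 2), round(slope, 2), round(intercept, 2))
```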
e)
Predicted Y at X = 6 (year 2013) is
Ŷ = 48.3500 + 3.3800 * 6 = 68.63 billion dollars
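The equation performed poorly because x = 6 lies outside the range of the data used to fit the line (x = 1 to 4). This is extrapolation: the linear trend seen in 2008 - 2011 need not continue in later years, so the prediction can be far off. Here the predicted 68.63 overshoots the actual 63.4 by about 5.2 billion dollars.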
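To make the size of the miss concrete, here is a short sketch that evaluates the fitted line for 2013 and compares it with the actual figure given in the question. The helper predict is hypothetical, written only for this illustration.

```python
b0, b1 = 48.35, 3.38                  # intercept and slope from part (b)
first_year = 2008                     # coded as x = 1
fitted_x_range = (1, 4)               # x values used to fit the line

def predict(year):
    """Hypothetical helper: predicted spending (billions) for a calendar year."""
    x = year - first_year + 1
    return b0 + b1 * x

x_2013 = 2013 - first_year + 1
predicted_2013 = predict(2013)
actual_2013 = 63.4                    # actual spending stated in part (e)
error = predicted_2013 - actual_2013

# True when the prediction relies on x outside the fitted range.
is_extrapolation = not (fitted_x_range[0] <= x_2013 <= fitted_x_range[1])

print(round(predicted_2013, 2), round(error, 2), is_extrapolation)
```

For comparison, the largest residual inside the fitted range is 0.21 in absolute value, so the error at x = 6 is far larger than any error on the original four points.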