Part 1. Consider the dataset below. You will perform a series of regressions and data transformations. Be sure to keep a record of all your computer results. First, please perform a simple linear regression. Predict Y if X = 40. To avoid rounding errors in ALL your calculations, please perform your calculations on your spreadsheet referencing data from your regression output.
X | Y |
54 | 6 |
42 | 16 |
28 | 33 |
38 | 18 |
25 | 41 |
70 | 3 |
48 | 10 |
41 | 14 |
20 | 45 |
52 | 9 |
65 | 5 |
14.0363
14.1891
17.2164
21.5627
None of the above
Part 2.
Please perform a polynomial regression of Y against X and X-squared. What is the coefficient of the curvature term?
92.8725
-2.7222
0.0208
0.9851
None of the above
Part 3.
With other OLS regression conditions satisfied, can we utilize the estimated equation of this model to predict Y?
Yes. The regression is statistically significant and the coefficient of determination is reasonably high.
No. Although the regression is significant, there is a high likelihood that multicollinearity is a problem due to the inclusion of the X-squared term.
No. The regression model is nonlinear and cannot therefore be utilized to make a forecast.
No. The standard error of the regression is too high, indicating that the unexplained variation (error term) exhibits heteroscedasticity.
None of the above
Part 4.
Based on the parameter estimates of the quadratic model, predict Y if X = 40.
14.0363
14.1891
17.2164
21.5627
19.1523
Part 5.
Perform a logarithmic transformation of only Y on the original dataset. That is, ln(Y) = B0 + B1(X) + e. Then predict Y if X = 40.
21.5627
17.2164
2.7885
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.932145801 |
R Square | 0.868895795 |
Adjusted R Square | 0.854328661 |
Standard Error | 5.64255772 |
Observations | 11 |
ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 1 | 1899.090245 | 1899.090245 | 59.64768355 | 2.92936E-05 |
Residual | 9 | 286.5461186 | 31.83845762 |
Total | 10 | 2185.636364 |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% |
Intercept | 56.15733314 | 5.203079582 | 10.79309518 | 1.88916E-06 | 44.3871494 |
X | -0.8648668 | 0.111983087 | -7.72319128 | 2.92936E-05 | -1.118190142 |
part 1)
y^ = 56.1573 - 0.8649 * X
Substituting X = 40 with the unrounded coefficients from the regression output:
y^ = 56.15733314 - 0.8648668 * 40
= 21.5627
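The same fit can be cross-checked outside the spreadsheet. The block below is a minimal sketch in plain Python, not the method used for the output above; the helper name `fit_line` and the variable names are illustrative assumptions. It computes the least-squares intercept and slope from the Part 1 data and evaluates the line at X = 40.

```python
# Data copied from the table in Part 1.
x = [54, 42, 28, 38, 25, 70, 48, 41, 20, 52, 65]
y = [6, 16, 33, 18, 41, 3, 10, 14, 45, 9, 5]


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line ys ~ xs.

    Hypothetical helper for illustration only.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    sxx = sum((a - x_bar) ** 2 for a in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


b0, b1 = fit_line(x, y)
y_hat_linear = b0 + b1 * 40  # prediction at X = 40
print(round(b0, 4), round(b1, 4), round(y_hat_linear, 4))
```

If the data were entered correctly, the printed intercept, slope, and prediction should agree with the Coefficients column and the Part 1 answer to about four decimal places.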
part 2)
Y | X | x^2 |
6 | 54 | 2916 |
16 | 42 | 1764 |
33 | 28 | 784 |
18 | 38 | 1444 |
41 | 25 | 625 |
3 | 70 | 4900 |
10 | 48 | 2304 |
14 | 41 | 1681 |
45 | 20 | 400 |
9 | 52 | 2704 |
5 | 65 | 4225 |
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.994003876 |
R Square | 0.988043706 |
Adjusted R Square | 0.985054633 |
Standard Error | 1.807349945 |
Observations | 11 |
ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 2 | 2159.504253 | 1079.752 | 330.5518 | 2.04E-08 |
Residual | 8 | 26.1321106 | 3.266514 |
Total | 10 | 2185.636364 |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% |
Intercept | 92.87251293 | 4.436918394 | 20.93176 | 2.85E-08 | 82.64096 |
X | -2.722212931 | 0.211088773 | -12.8961 | 1.24E-06 | -3.20898 |
x^2 | 0.020770253 | 0.002326226 | 8.928735 | 1.96E-05 | 0.015406 |
coefficient of the curvature term = 0.0208
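As a rough cross-check of the quadratic fit, the sketch below solves the normal equations for Y against X and X-squared in plain Python. It is an illustration under stated assumptions, not the spreadsheet procedure used above; the helper `solve` and all variable names are hypothetical.

```python
# Data copied from the table in Part 1.
x = [54, 42, 28, 38, 25, 70, 48, 41, 20, 52, 65]
y = [6, 16, 33, 18, 41, 3, 10, 14, 45, 9, 5]


def solve(a, b):
    """Solve a * z = b by Gaussian elimination with partial pivoting.

    Hypothetical helper for illustration only.
    """
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - s) / m[i][i]
    return z


# Design matrix columns: 1, X, X^2.
rows = [[1.0, float(xi), float(xi) ** 2] for xi in x]
# Normal equations: (A^T A) beta = A^T y.
ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
aty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
c0, c1, c2 = solve(ata, aty)  # intercept, X coefficient, X^2 coefficient
print(round(c0, 4), round(c1, 4), round(c2, 4))
```

Here `c2` plays the role of the curvature term. Solving the normal equations directly is acceptable for a problem this small; a dedicated least-squares routine would be the more robust choice for larger or badly scaled data.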
part 3)
A) Yes. The regression is statistically significant and the coefficient of determination is reasonably high.
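The two claims in this answer (significance and a high coefficient of determination) can be traced back to the ANOVA table of the quadratic model. The sketch below recomputes R Square, Adjusted R Square, and F from the reported sums of squares; the variable names are illustrative assumptions, and the inputs are copied from the output above.

```python
# Sums of squares copied from the quadratic model's ANOVA table.
ss_regression = 2159.504253
ss_residual = 26.1321106
n_obs = 11  # observations
k = 2       # predictors: X and X^2

ss_total = ss_regression + ss_residual
r_squared = ss_regression / ss_total
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - k - 1)
# F = mean square regression / mean square residual.
f_stat = (ss_regression / k) / (ss_residual / (n_obs - k - 1))
print(round(r_squared, 4), round(adj_r_squared, 4), round(f_stat, 2))
```

The printed values should line up with the R Square, Adjusted R Square, and F entries in the quadratic SUMMARY OUTPUT.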
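part 4)
y^ = 92.8725 - 2.7222 * X + 0.0208 * X^2
Substituting X = 40 with the unrounded coefficients from the quadratic regression output:
y^ = 92.87251293 - 2.722212931 * 40 + 0.020770253 * 40^2
= 17.2164 (17.21640086 at full spreadsheet precision)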
part 5)
ln (Y) | Y | X |
1.791759 | 6 | 54 |
2.772589 | 16 | 42 |
3.496508 | 33 | 28 |
2.890372 | 18 | 38 |
3.713572 | 41 | 25 |
1.098612 | 3 | 70 |
2.302585 | 10 | 48 |
2.639057 | 14 | 41 |
3.806662 | 45 | 20 |
2.197225 | 9 | 52 |
1.609438 | 5 | 65 |
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.991254363 |
R Square | 0.982585213 |
Adjusted R Square | 0.980650236 |
Standard Error | 0.122438958 |
Observations | 11 |
ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 1 | 7.612614045 | 7.612614 | 507.8022 | 3.16E-09 |
Residual | 9 | 0.134921686 | 0.014991 |
Total | 10 | 7.747535731 |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
Intercept | 4.978748604 | 0.112902636 | 44.09772 | 7.92E-12 | 4.723345 | 5.234152 | 4.723345 | 5.234152 |
X | -0.054757465 | 0.002429943 | -22.5345 | 3.16E-09 | -0.06025 | -0.04926 | -0.06025 | -0.04926 |
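ln y^ = 4.9787 - 0.0548 * X
Substituting X = 40 with the unrounded coefficients from the regression output:
ln y^ = 4.978748604 - 0.054757465 * 40
= 2.7885
y^ = e^2.7885
= 16.25580413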
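The log-transformed fit and the back-transformation can also be sketched in plain Python. This is an illustrative cross-check rather than the spreadsheet workflow used above; `fit_line` and the variable names are hypothetical. It regresses ln(Y) on X, predicts ln(Y) at X = 40, and exponentiates to return to the original scale of Y.

```python
import math

# Data copied from the table in Part 1.
x = [54, 42, 28, 38, 25, 70, 48, 41, 20, 52, 65]
y = [6, 16, 33, 18, 41, 3, 10, 14, 45, 9, 5]


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line ys ~ xs.

    Hypothetical helper for illustration only.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    sxx = sum((a - x_bar) ** 2 for a in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


ln_y = [math.log(v) for v in y]   # natural log of each Y value
a0, a1 = fit_line(x, ln_y)        # ln(Y) = a0 + a1 * X
ln_y_hat = a0 + a1 * 40           # predicted ln(Y) at X = 40
y_hat_log = math.exp(ln_y_hat)    # back-transform to the scale of Y
print(round(ln_y_hat, 4), round(y_hat_log, 4))
```

Note that 2.7885 is the prediction on the log scale; the prediction for Y itself is obtained only after exponentiating.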