A realtor has examined the impact of lot size on housing prices for homes in a lakeside resort. The output appears on the next page. One of his colleagues believes that the realtor should also have included whether the house was lakeside as another independent variable. The realtor added this dummy variable to the regression (1 = house is lakeside; 0 = not). The new regression is also presented on the next page.
a) (3 points) What is the regression equation? (I want the specific estimates for the coefficients for the regression including the lakeside variable.)
b) (3 points) What is your interpretation of the slope coefficient on lakeside?
c) (8 points) Explain the interpretation of the “significance F” value in the regression output. Specifically, what are the hypotheses being tested? What is the purpose of the test? What can you conclude about the result of the test given the estimate of the “significance F” output?
d) (6 points) Should the lakeside variable be included in the regression? Why or why not? Be sure to comment on the significance of the lakeside dummy variable and the improvement of the regression fit.
Multiple R | 0.303524708
R Square | 0.092127248
Adjusted R Square | 0.07647427
Standard Error | 43290.16017
Observations | 60
Source | df | SS | MS | F | Significance F
Regression | 1 | 11029847233 | 11029847233 | 5.885605 | 0.018396055
Residual | 58 | 1.09E+11 | 1874037967 | |
Total | 59 | 1.20E+11 | | |
Variable | Coefficients | Standard Error | t Stat | P-value | Lower 95%
Intercept | 37645.54569 | 20983.51574 | 1.794053301 | 0.078017 | -4357.52465
Lot Size | 1362.741742 | 561.7175628 | 2.426026587 | 0.018396 | 238.3418752
New Regression Output
Multiple R | 0.304263563
R Square | 0.092576315
Adjusted R Square | 0.060736888
Standard Error | 43657.44605
Observations | 60
Source | df | SS | MS | F | Significance F
Regression | 2 | 11083611361 | 5541805681 | 2.9076 | 0.032746304
Residual | 57 | 1.09E+11 | 1905972596 | |
Total | 59 | 1.20E+11 | | |
Variable | Coefficients | Standard Error | t Stat | P-value | Lower 95%
Intercept | 37081.45017 | 21426.42167 | 1.730641296 | 0.088927 | -5824.219139
Lot Size | 1386.834292 | 567.6436065 | 2.411432589 | 0.019135 | 232.1475765
Lake | 2954.769374 | 17592.82614 | 0.167953082 | 0.867215 | -32274.25731
a) The regression equation for the new regression is
price = 37081.45017 + 1386.834292 * Lot_size + 2954.769374 * Dummy_lake    (1)
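b) The coefficient on Lake, 2954.769374, is the estimated difference in price between a lakeside house and a non-lakeside house with the same lot size. Holding lot size constant, a lakeside house is predicted to sell for about 2,955 more (in the price units of the data). Because Lake is a dummy variable, it shifts the intercept of the fitted line rather than changing the slope on lot size.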
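As a quick check on parts a) and b), the fitted equation can be evaluated for a pair of houses that differ only in the lakeside dummy. This is a minimal sketch: the coefficients are copied from the new regression output, while the function name `predict_price` and the lot size of 40 are hypothetical choices made only for illustration.

```python
# Coefficients copied from the new regression output.
INTERCEPT = 37081.45017
B_LOT_SIZE = 1386.834292
B_LAKE = 2954.769374


def predict_price(lot_size, lakeside):
    """Fitted price; lakeside is 1 for a lakeside house and 0 otherwise."""
    return INTERCEPT + B_LOT_SIZE * lot_size + B_LAKE * lakeside


# Hypothetical lot size, in whatever units the Lot Size column uses.
example_lot_size = 40

# Two houses with the same lot size, differing only in the dummy.
premium = predict_price(example_lot_size, 1) - predict_price(example_lot_size, 0)
print(round(premium, 2))  # 2954.77
```

Whatever lot size is chosen, the difference equals the Lake coefficient, which is what "holding lot size constant" means here.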
c) The "Significance F" value is the p-value of the F-test for the overall significance of the model in equation (1).
Specifically, the hypotheses being tested are
H0: beta_1 = beta_2 = 0 (neither lot size nor lakeside helps explain price)
Ha: at least one of beta_1, beta_2 is not 0
The intercept beta_0 is not part of this test.
The purpose of the test is to check whether the independent variables, taken together, explain a statistically significant share of the variation in price.
The reported Significance F is 0.0327 (F = 2.9076 with 2 and 57 degrees of freedom). Since 0.0327 < 0.05, we reject H0 at the 5% level of significance (95% confidence) and conclude that the model as a whole is significant.
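The F statistic behind this test can be rebuilt from R Square and the degrees of freedom in the output. The sketch below does that and then applies the decision rule to the Significance F exactly as reported in the output. The function name `overall_f` and the choice of alpha = 0.05 are assumptions for illustration.

```python
def overall_f(r_squared, n_obs, n_slopes):
    """Overall F statistic for a regression with an intercept and n_slopes regressors."""
    df_model = n_slopes
    df_resid = n_obs - n_slopes - 1
    f_stat = (r_squared / df_model) / ((1 - r_squared) / df_resid)
    return f_stat, df_model, df_resid


# R Square and Observations from the new regression output (two slopes).
f_stat, df_model, df_resid = overall_f(0.092576315, 60, 2)
print(round(f_stat, 4), df_model, df_resid)

# Decision rule using the Significance F reported in the output.
reported_significance_f = 0.032746304
alpha = 0.05  # assumed significance level
reject_h0 = reported_significance_f < alpha
print(reject_h0)
```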
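d) The lakeside variable should not be included in the regression. The reasons are as follows:
i) R Square barely changes (0.0921 in the first regression, 0.0926 in the second), while Adjusted R Square falls from 0.0765 to 0.0607 and the standard error of the regression rises from 43,290 to 43,657. Once the extra variable is penalized, the fit does not improve.
ii) The Significance F rises from 0.0184 in the first regression to 0.0327 in the second, so the evidence for the overall model is weaker after Lake is added.
iii) The lakeside dummy is not significant at the 5% level of significance: its t statistic is 0.168 and its p-value is 0.867, well above 0.05.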
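The fit comparison in d) can be reproduced from the two outputs. The sketch below recomputes Adjusted R Square for both regressions and the t statistic for Lake, using only figures from the tables. The helper name `adjusted_r_squared` is hypothetical.

```python
def adjusted_r_squared(r_squared, n_obs, n_slopes):
    """R Square penalized for the number of regressors (intercept not counted)."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_slopes - 1)


# First regression: lot size only. Second: lot size plus the lakeside dummy.
adj_lot_only = adjusted_r_squared(0.092127248, 60, 1)
adj_with_lake = adjusted_r_squared(0.092576315, 60, 2)
print(round(adj_lot_only, 4), round(adj_with_lake, 4))

# t statistic for Lake: coefficient divided by its standard error.
t_lake = 2954.769374 / 17592.82614
print(round(t_lake, 3))
```

Adjusted R Square is the better yardstick here because plain R Square never decreases when a regressor is added, even an irrelevant one.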