a) What is the p-value for the test for βSq. Ft? What is the p-value for the test for βBedrooms?
b) Are these significant? Why or why not?
c) Are Sq. Ft and Bedrooms highly correlated? How does one know? Please justify.
d) What should we do? (Bedrooms is dropped here, but why?) How do we know which variable to leave out of the model?
e) What is the final proposed model?
Obs. | Age | Price ($K) | Sq. Ft | Bedrooms |
1 | 21 | 240 | 1185 | 3 |
2 | 22 | 359 | 2400 | 4 |
3 | 1 | 219.9 | 1450 | 3 |
4 | 25 | 239.9 | 1308 | 3 |
5 | 20 | 259.9 | 1688 | 4 |
6 | 13 | 519 | 2944 | 5 |
7 | 9 | 614 | 3721 | 4 |
8 | 29 | 289 | 2050 | 3 |
9 | 18 | 259.9 | 1504 | 3 |
10 | 12 | 449.9 | 2877 | 4 |
11 | 4 | 735 | 3992 | 5 |
12 | 11 | 725 | 3750 | 5 |
13 | 13 | 394.9 | 2889 | 5 |
14 | 17 | 349.9 | 1960 | 3 |
15 | 8 | 525 | 2625 | 3 |
16 | 10 | 429 | 2437 | 4 |
17 | 6 | 620 | 3190 | 4 |
18 | 20 | 329.9 | 1898 | 3 |
19 | 25 | 319.5 | 1757 | 3 |
20 | 5 | 375 | 1800 | 3 |
21 | 15 | 390 | 2544 | 4 |
Solution:
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.955019 |
R Square | 0.912061 |
Adjusted R Square | 0.90229 |
Standard Error | 49.12255 |
Observations | 21 |
ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 2 | 450481.8 | 225240.9 | 93.34379 | 3.15E-10 |
Residual | 18 | 43434.45 | 2413.025 | | |
Total | 20 | 493916.2 | | | |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
Intercept | 51.03018 | 55.51726 | 0.919177 | 0.370157 | -65.6073 | 167.6676 |
Sq. Ft | 0.205729 | 0.02172 | 9.471812 | 2.04E-08 | 0.160096 | 0.251361 |
Bedrooms | -34.7178 | 22.99898 | -1.50954 | 0.148516 | -83.0369 | 13.60126 |
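The coefficients, standard errors and t statistics in the summary output can be reproduced from the 21 observations. The following is a minimal sketch in plain Python, assuming ordinary least squares with an intercept and the two predictors Sq. Ft and Bedrooms. The helper `centered_cross` and all variable names are illustrative and not part of the original solution.

```python
import math

# Data copied from the table above (Age is not used in this model).
sqft = [1185, 2400, 1450, 1308, 1688, 2944, 3721, 2050, 1504, 2877, 3992,
        3750, 2889, 1960, 2625, 2437, 3190, 1898, 1757, 1800, 2544]
beds = [3, 4, 3, 3, 4, 5, 4, 3, 3, 4, 5, 5, 5, 3, 3, 4, 4, 3, 3, 3, 4]
price = [240, 359, 219.9, 239.9, 259.9, 519, 614, 289, 259.9, 449.9, 735,
         725, 394.9, 349.9, 525, 429, 620, 329.9, 319.5, 375, 390]
n = len(price)


def mean(v):
    return sum(v) / len(v)


def centered_cross(u, v):
    """Sum of (u_i - mean(u)) * (v_i - mean(v))."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


# Centered sums of squares and cross-products.
s11 = centered_cross(sqft, sqft)
s22 = centered_cross(beds, beds)
s12 = centered_cross(sqft, beds)
s1y = centered_cross(sqft, price)
s2y = centered_cross(beds, price)
syy = centered_cross(price, price)

# Solve the 2x2 normal equations for the two slopes, then the intercept.
det = s11 * s22 - s12 ** 2
b_sqft = (s22 * s1y - s12 * s2y) / det
b_beds = (s11 * s2y - s12 * s1y) / det
b0 = mean(price) - b_sqft * mean(sqft) - b_beds * mean(beds)

# Residual variance uses n - 3 degrees of freedom (intercept + 2 slopes).
sse = syy - b_sqft * s1y - b_beds * s2y
mse = sse / (n - 3)
se_sqft = math.sqrt(mse * s22 / det)
se_beds = math.sqrt(mse * s11 / det)
t_sqft = b_sqft / se_sqft
t_beds = b_beds / se_beds
r_squared = 1 - sse / syy

print(b0, b_sqft, b_beds)
print(t_sqft, t_beds, r_squared)
```

Working with centered sums keeps the arithmetic small and avoids inverting the full 3x3 matrix; a statistics package would normally be used instead.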
Now,
(a)->
p-value for the test for βSq. Ft = 0.0000000204 (2.04E-08), which is effectively zero.
p-value for the test for βBedrooms = 0.1485
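These p-values are two-sided tail probabilities of a t distribution with 18 residual degrees of freedom, evaluated at the t Stat column. As a rough check, the sketch below recovers them by numerically integrating the t density with Simpson's rule. The function names `t_pdf` and `two_sided_p` and the step count are assumptions made for illustration; statistical software uses more refined routines.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t_stat, df, steps=20000):
    """Two-sided p-value: 1 - 2 * (area under the density from 0 to |t|)."""
    upper = abs(t_stat)
    h = upper / steps  # steps must be even for Simpson's rule
    terms = [t_pdf(0.0, df), t_pdf(upper, df)]
    terms += [(4 if i % 2 else 2) * t_pdf(i * h, df) for i in range(1, steps)]
    area = math.fsum(terms) * h / 3
    return 1 - 2 * area


df_resid = 18  # residual degrees of freedom from the ANOVA table
p_sqft = two_sided_p(9.471812, df_resid)   # t Stat for Sq. Ft
p_beds = two_sided_p(-1.50954, df_resid)   # t Stat for Bedrooms
print(p_sqft, p_beds)
```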
(b)->
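Sq. Ft is significant: its p-value (2.04E-08) is far below 0.05, so we reject H0: βSq. Ft = 0. Bedrooms is not significant at the 5% level: its p-value (0.1485) is greater than 0.05, so we cannot reject H0: βBedrooms = 0 when Sq. Ft is already in the model. The model as a whole is significant, since the Significance F value in the ANOVA table (3.15E-10) is almost zero.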
(c)->
Variable | Price ($K) | Sq. Ft | Bedrooms |
Price ($K) | 1 | | |
Sq. Ft | 0.949173 | 1 | |
Bedrooms | 0.688301 | 0.792896 | 1 |
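The correlation between Sq. Ft and Bedrooms is 0.792896, which is a strong positive correlation, so the two predictors carry largely overlapping information. Another sign of this overlap is that the Bedrooms coefficient in the regression is negative (-34.7178) even though Bedrooms is positively correlated with Price (0.688301).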
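The entries of the correlation matrix are ordinary Pearson correlations and can be checked directly from the data. A minimal sketch in plain Python follows; `pearson_r` is an illustrative helper name, not something from the original solution.

```python
import math

# Data copied from the table above.
sqft = [1185, 2400, 1450, 1308, 1688, 2944, 3721, 2050, 1504, 2877, 3992,
        3750, 2889, 1960, 2625, 2437, 3190, 1898, 1757, 1800, 2544]
beds = [3, 4, 3, 3, 4, 5, 4, 3, 3, 4, 5, 5, 5, 3, 3, 4, 4, 3, 3, 3, 4]
price = [240, 359, 219.9, 239.9, 259.9, 519, 614, 289, 259.9, 449.9, 735,
         725, 394.9, 349.9, 525, 429, 620, 329.9, 319.5, 375, 390]


def pearson_r(u, v):
    """Pearson correlation coefficient between two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


r_sqft_beds = pearson_r(sqft, beds)
r_price_sqft = pearson_r(price, sqft)
r_price_beds = pearson_r(price, beds)
print(r_sqft_beds, r_price_sqft, r_price_beds)
```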
(d)->
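The p-value for the test of the Bedrooms coefficient is 0.1485, which is greater than 0.05, so we cannot say that Bedrooms has a significant relationship with Price once Sq. Ft is in the model. Therefore we drop Bedrooms. Of the two correlated predictors, we keep Sq. Ft because its coefficient is significant and it is more strongly correlated with Price (0.949173 versus 0.688301).
So, if the p-value of the regression coefficient of a variable is greater than 5%, we drop that variable at the 5% level of significance.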
(e)->
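Hence the final proposed model uses Sq. Ft as the only predictor. The coefficients must be re-estimated after Bedrooms is dropped; the values 51.03018 and 0.205729 belong to the two-predictor fit and do not carry over. Refitting Price on Sq. Ft alone for the 21 observations gives approximately:
Price ($K) = -16.06 + 0.1797 * Sq. Ft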
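When a predictor is dropped, the remaining coefficients are normally re-estimated rather than copied from the larger model. The sketch below refits Price on Sq. Ft alone by simple least squares in plain Python. The names `slope`, `intercept` and `predict_price` are illustrative assumptions, and the 2,000 sq. ft house in the usage line is a made-up input.

```python
# Data copied from the table above.
sqft = [1185, 2400, 1450, 1308, 1688, 2944, 3721, 2050, 1504, 2877, 3992,
        3750, 2889, 1960, 2625, 2437, 3190, 1898, 1757, 1800, 2544]
price = [240, 359, 219.9, 239.9, 259.9, 519, 614, 289, 259.9, 449.9, 735,
         725, 394.9, 349.9, 525, 429, 620, 329.9, 319.5, 375, 390]

n = len(price)
x_bar = sum(sqft) / n
y_bar = sum(price) / n

# Centered sums of squares and cross-products.
sxx = sum((x - x_bar) ** 2 for x in sqft)
syy = sum((y - y_bar) ** 2 for y in price)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(sqft, price))

slope = sxy / sxx                      # change in price ($K) per extra sq. ft
intercept = y_bar - slope * x_bar
r_squared = sxy ** 2 / (sxx * syy)     # squared correlation of Price and Sq. Ft


def predict_price(square_feet):
    """Predicted price in $K from the one-predictor model."""
    return intercept + slope * square_feet


print(intercept, slope, r_squared)
print(predict_price(2000))  # hypothetical 2,000 sq. ft house
```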