In: Statistics and Probability
The variables in the file are
Price -Average selling price of houses
Location -A code to indicate the location of the house
Condition -A code to indicate the physical condition of the house
Bedrooms Number of bedrooms in the house
Bathrooms Number of bathrooms in the house
Other Rooms Number of other rooms in the house
(a) Run a regression of Price on Location, Condition, Bedrooms, Bathrooms and Other Rooms. Please attach your Excel file.
(b) What variables seem to be important for buyers of houses? Please explain.
(c) Based on your regression results, if you wanted a low selling price on a house you wanted to purchase, what would you look for? Please explain.
(d) Run a regression of Price on Bedroom, Bathrooms and Other Rooms. Please attach your Excel file
(e) Compare your regression results in (i) to the regression results in (ii). Which would you consider to be a better model and why?
Price Location Condition
Bedrooms Bathrooms Other Rooms
67000 2 2 2
1 2
68000 2 2 3
1 3
68000 2 2 3
1 3
69000 2 3 2
1 3
72000 2 2 4
2 5
75000 3 4 2
1 3
76000 2 3 2
1 2
76900 2 3 3
1 3
77000 2 3 2
3 5
78000 3 2 2
1 2
79000 2 3 3
2 3
80000 2 3 3
1.5 2
80000 2 3 3
1 2
81000 3 3 2
1 3
82000 2 3 3
1.5 3
83000 2 3 3
1 3
84000 2 2 3
1 3
84000 2 3 3
1.5 3
86250 1 4 4
2 3
87000 3 3 3
2 2
89500 3 2 3
2 2
90400 2 4 4
2 4
90500 3 3 3
1.5 3
91000 3 3 3
2 3
91500 3 1 4
2 3
91500 3 1 4
2 3
92500 3 3 3
1.5 4
93500 2 3 3
2 3
93500 2 3 4
2 2
94000 1 2 3
1.5 3
95500 3 3 3
2 2
96000 2 4 3
2 3
96000 2 3 3
2 3
97900 3 4 3
2 3
98000 3 4 3
2 3
98000 2 4 3
2 4
98000 3 4 3
2 3
99000 2 3 4
2 4
99000 3 2 4
2 4
99000 3 3 3
2 3
102000 3 3 4
2 3
102000 2 3 3
1.5 3
102000 3 3 4
2 3
102000 3 4 3
1.5 3
103000 3 3 3
2 3
103000 3 2 3
1.5 2
103500 3 2 3
2 5
103500 3 3 3
2 5
105000 3 3 3
2 5
105000 3 4 3
1.5 3
108000 2 4 3
2 3
112000 3 2 4
2 4
112500 3 4 3
2 4
114900 2 2 5
2 3
115500 3 4 4
2 3
120500 4 5 3
2 4
122000 2 2 3
3 4
125500 3 3 4
2.5 3
127000 2 4 3
2.5 4
128000 4 4 3
2 4
129900 3 4 4
2.5 3
130350 3 3 3
2 4
132350 3 4 3
2 4
133000 3 3 3
2 4
134500 4 3 3
2 3
135500 3 3 3
3 3
135500 4 3 3
3 3
136500 4 4 3
2 4
136500 4 3 3
2 4
137400 3 3 4
2.5 4
137400 4 3 4
2.5 4
137500 4 4 3
2 4
139500 3 4 4
2.5 4
144000 4 3 4
2.5 5
145000 4 3 3
2 3
149000 4 4 3
2 2
155000 4 4 4
2 5
154000 4 2 3
2 4
155500 3 5 3
2.5 3
156500 4 5 3
2 3
163000 4 3 4
2 4
165000 5 4 4
2 2
167000 5 4 4
2 2
168700 3 5 3
2.5 5
169900 4 5 4
2.5 4
169900 4 5 3
2.5 5
169900 4 5 3
2.5 5
176000 4 5 4
2.5 4
179000 4 5 4
2.5 5
179000 4 5 4
2.5 5
179500 4 4 3
2.5 3
179500 5 4 3
2.5 3
187500 4 3 4
2.5 4
203000 4 5 4
3 6
220000 5 5 4
3.5 5
222000 5 4 3
3.5 6
250000 5 4 4
2.5 4
250000 5 5 4
2.5 4
255000 5 5 4
2.5 4
255000 5 5 3
2.5 4
The Regression calculation done in Excel is given below
Regression Statistics | ||||||||
Multiple R | 0.9085 | |||||||
R Square | 0.8253 | |||||||
Adjusted R Square | 0.8161 | |||||||
Standard Error | 19116.0451 | |||||||
Observations | 100 | |||||||
ANOVA | ||||||||
df | SS | MS | F | Significance F | ||||
Regression | 5 | 1.62325E+11 | 3.2465E+10 | 88.84234686 | 4.58807E-34 | |||
Residual | 94 | 34349778889 | 365423180 | |||||
Total | 99 | 1.96675E+11 | ||||||
Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% | |
Intercept | -64558.5034 | 12387.5357 | -5.2116 | 0.0000 | -89154.2459 | -39962.7608 | -89154.2459 | -39962.7608 |
Location | 25533.3487 | 2474.4568 | 10.3188 | 0.0000 | 20620.2568 | 30446.4406 | 20620.2568 | 30446.4406 |
Condition | 10124.5722 | 2340.8940 | 4.3251 | 0.0000 | 5476.6723 | 14772.4721 | 5476.6723 | 14772.4721 |
Bedrooms | 8842.6631 | 3572.1941 | 2.4754 | 0.0151 | 1749.9881 | 15935.3382 | 1749.9881 | 15935.3382 |
Bathrooms | 17202.5553 | 5030.2405 | 3.4198 | 0.0009 | 7214.8949 | 27190.2157 | 7214.8949 | 27190.2157 |
Other Rooms | 3173.6549 | 2464.2430 | 1.2879 | 0.2009 | -1719.1573 | 8066.4671 | -1719.1573 | 8066.4671 |
a) The regression line is
Price = -64558 + 25533 * Location + 10124 * Condition + 8842 * Bedrooms + 17202 * Bathroom + 3173 * Other Rooms
b) The top two critical factors for buyer are Location and Bathroom
c) If I have to purchase a house with low selling price I will ignore the coefficients which increase the price of the house. The top two critical factors explain above, will be ignored and make the value = 0. So the price of the house will automatically come down. So I will only consider Other Rooms, then Bedrooms and lastly Condition and ignore location and bathroom
d)
SUMMARY OUTPUT | ||||||||
Regression Statistics | ||||||||
Multiple R | 0.701814009 | |||||||
R Square | 0.492542903 | |||||||
Adjusted R Square | 0.476684868 | |||||||
Standard Error | 32243.24002 | |||||||
Observations | 100 | |||||||
ANOVA | ||||||||
df | SS | MS | F | Significance F | ||||
Regression | 3 | 9.69E+10 | 3.23E+10 | 31.05952 | 4.03954E-14 | |||
Residual | 96 | 9.98E+10 | 1.04E+09 | |||||
Total | 99 | 1.97E+11 | ||||||
Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% | |
Intercept | -19468.36015 | 19564.65 | -0.99508 | 0.3222 | -58303.87522 | 19367.15493 | -58303.87522 | 19367.15493 |
Bedrooms | 9141.840464 | 5978.095 | 1.529223 | 0.129498 | -2724.5851 | 21008.26603 | -2724.5851 | 21008.26603 |
Bathrooms | 45704.33258 | 7678.179 | 5.952497 | 4.31E-08 | 30463.26841 | 60945.39675 | 30463.26841 | 60945.39675 |
Other Rooms | 6072.784488 | 4118.138 | 1.474643 | 0.143581 | -2101.655472 | 14247.22445 | -2101.655472 | 14247.22445 |
The new Regression line is
Price = -19648 + 9141 * Bedrooms + 45704 * Bathrooms + 6072 * OtherRooms
e) The Regression line in 1 better, as the R2 value is closer to 1 in regression equation 1 showing a better fit of the data points.