Price (in K) | Sqft | Age | Features | CornerCODE | Corner_Label |
310.0 | 2650 | 13 | 7 | 0 | NO |
313.0 | 2600 | 9 | 4 | 0 | NO |
320.0 | 2664 | 6 | 5 | 0 | NO |
320.0 | 2921 | 3 | 6 | 0 | NO |
304.9 | 2580 | 4 | 4 | 0 | NO |
295.0 | 2580 | 4 | 4 | 0 | NO |
285.0 | 2774 | 2 | 4 | 0 | NO |
261.0 | 1920 | 1 | 5 | 0 | NO |
250.0 | 2150 | 2 | 4 | 0 | NO |
249.9 | 1710 | 1 | 3 | 0 | NO |
242.5 | 1837 | 4 | 5 | 0 | NO |
232.0 | 1880 | 8 | 6 | 0 | NO |
230.0 | 2150 | 15 | 3 | 0 | NO |
228.5 | 1894 | 14 | 5 | 0 | NO |
222.0 | 1928 | 18 | 8 | 0 | NO |
223.0 | 1830 | 16 | 3 | 0 | NO |
220.5 | 1767 | 16 | 4 | 0 | NO |
216.0 | 1630 | 15 | 3 | 1 | YES |
218.9 | 1680 | 17 | 4 | 1 | YES |
204.5 | 1725 | 13 | 3 | 0 | NO |
204.5 | 1500 | 15 | 4 | 0 | NO |
202.5 | 1430 | 10 | 3 | 0 | NO |
202.5 | 1360 | 12 | 4 | 0 | NO |
195.0 | 1400 | 16 | 2 | 1 | YES |
201.0 | 1573 | 17 | 6 | 0 | NO |
191.0 | 1385 | 22 | 2 | 0 | NO |
274.5 | 2931 | 28 | 3 | 1 | YES |
260.3 | 2200 | 28 | 4 | 0 | NO |
230.0 | 2277 | 30 | 4 | 0 | NO |
235.0 | 2000 | 37 | 3 | 0 | NO |
207.0 | 1478 | 53 | 3 | 1 | YES |
207.0 | 1713 | 30 | 4 | 1 | YES |
197.2 | 1326 | 25 | 4 | 0 | NO |
197.5 | 1050 | 22 | 2 | 1 | YES |
194.9 | 1464 | 34 | 2 | 0 | NO |
190.0 | 1190 | 41 | 1 | 0 | NO |
192.6 | 1156 | 37 | 1 | 0 | NO |
194.0 | 1746 | 30 | 2 | 0 | NO |
192.0 | 1280 | 28 | 1 | 0 | NO |
175.0 | 1215 | 43 | 3 | 0 | NO |
177.0 | 1121 | 46 | 4 | 0 | NO |
177.0 | 1050 | 48 | 1 | 0 | NO |
179.9 | 1733 | 43 | 6 | 0 | NO |
178.1 | 1299 | 40 | 6 | 0 | NO |
177.5 | 1140 | 36 | 3 | 1 | YES |
172.0 | 1181 | 37 | 4 | 0 | NO |
320.0 | 2848 | 4 | 6 | 0 | NO |
264.9 | 2440 | 11 | 5 | 0 | NO |
240.0 | 2253 | 23 | 4 | 0 | NO |
234.9 | 2743 | 25 | 5 | 1 | YES |
230.0 | 2180 | 17 | 4 | 1 | YES |
228.9 | 1706 | 14 | 4 | 0 | NO |
225.0 | 1948 | 10 | 4 | 0 | NO |
217.5 | 1710 | 16 | 4 | 0 | NO |
215.0 | 1657 | 15 | 4 | 0 | NO |
213.0 | 2200 | 26 | 4 | 0 | NO |
210.0 | 1680 | 13 | 4 | 0 | NO |
209.9 | 1900 | 34 | 3 | 0 | NO |
200.5 | 1565 | 19 | 3 | 0 | NO |
198.4 | 1543 | 20 | 3 | 0 | NO |
192.5 | 1173 | 6 | 4 | 0 | NO |
193.9 | 1549 | 5 | 4 | 0 | NO |
190.5 | 1900 | 3 | 3 | 0 | NO |
188.5 | 1560 | 8 | 5 | 1 | YES |
186.0 | 1365 | 10 | 2 | 0 | NO |
185.5 | 1258 | 7 | 4 | 1 | YES |
184.9 | 1314 | 5 | 2 | 0 | NO |
180.0 | 1338 | 2 | 3 | 1 | YES |
180.9 | 997 | 4 | 4 | 0 | NO |
180.5 | 1275 | 8 | 5 | 0 | NO |
180.0 | 1030 | 4 | 1 | 0 | NO |
178.0 | 1027 | 5 | 3 | 0 | NO |
177.9 | 1007 | 19 | 6 | 0 | NO |
176.0 | 1083 | 22 | 4 | 0 | NO |
182.3 | 1320 | 18 | 5 | 0 | NO |
174.0 | 1348 | 15 | 2 | 0 | NO |
172.0 | 1350 | 12 | 2 | 0 | NO |
166.9 | 837 | 13 | 2 | 0 | NO |
234.5 | 3750 | 10 | 4 | 1 | YES |
202.5 | 1500 | 7 | 3 | 1 | YES |
198.9 | 1428 | 40 | 2 | 0 | NO |
187.0 | 1375 | 28 | 1 | 0 | NO |
183.0 | 1080 | 20 | 3 | 0 | NO |
182.0 | 900 | 23 | 3 | 0 | NO |
175.0 | 1505 | 16 | 2 | 1 | YES |
167.0 | 1480 | 19 | 4 | 0 | NO |
159.0 | 1142 | 10 | 0 | 0 | NO |
212.0 | 1464 | 7 | 2 | 0 | NO |
315.0 | 2116 | 25 | 3 | 0 | NO |
177.5 | 1280 | 14 | 3 | 0 | NO |
171.0 | 1159 | 23 | 0 | 0 | NO |
165.0 | 1198 | 10 | 4 | 0 | NO |
163.0 | 1051 | 15 | 2 | 0 | NO |
289.4 | 2250 | 40 | 6 | 0 | NO |
263.0 | 2563 | 17 | 2 | 0 | NO |
174.9 | 1400 | 45 | 1 | 1 | YES |
238.0 | 1850 | 5 | 5 | 1 | YES |
221.0 | 1720 | 5 | 4 | 0 | NO |
215.9 | 1740 | 4 | 3 | 0 | NO |
217.9 | 1700 | 6 | 4 | 0 | NO |
210.0 | 1620 | 6 | 4 | 0 | NO |
209.5 | 1630 | 6 | 4 | 0 | NO |
210.0 | 1920 | 8 | 4 | 0 | NO |
207.0 | 1606 | 5 | 4 | 0 | NO |
205.0 | 1535 | 7 | 5 | 1 | YES |
208.0 | 1540 | 6 | 2 | 1 | YES |
202.5 | 1739 | 13 | 3 | 0 | NO |
200.0 | 1715 | 8 | 3 | 0 | NO |
199.0 | 1305 | 5 | 3 | 0 | NO |
197.0 | 1415 | 7 | 4 | 0 | NO |
199.5 | 1580 | 9 | 3 | 0 | NO |
192.4 | 1236 | 3 | 4 | 0 | NO |
192.2 | 1229 | 6 | 3 | 0 | NO |
192.0 | 1273 | 4 | 4 | 0 | NO |
191.9 | 1165 | 7 | 4 | 0 | NO |
181.6 | 1200 | 7 | 4 | 1 | YES |
178.9 | 970 | 4 | 4 | 1 | YES |
1.) Make a multiple regression model using these potential numerical predictor variables and, at most, one categorical dummy variable.
2.) Write the sample multiple regression equation for the “final best” model you have developed.
3.) Look at the set of residual plots, cut and paste them into the report, and briefly comment on the appropriateness of your fitted model.
(1) Regression Analysis |
R² | 0.739 |
Adjusted R² | 0.729 | n | 117 |
R | 0.859 | k | 4 |
Std. Error | 19.795 | Dep. Var. | Price (in K) |
ANOVA table |
Source | SS | df | MS | F | p-value |
Regression | 124,004.2121 | 4 | 31,001.0530 | 79.12 | 9.90E-32 |
Residual | 43,885.2558 | 112 | 391.8326 |
Total | 167,889.4679 | 116 |
Regression output (with 95% confidence interval) |
variables | coefficients | std. error | t (df=112) | p-value | 95% lower | 95% upper | std. coeff. |
Intercept | 111.6367 | 0.000 |
Sqft | 0.0586 | 0.0038 | 15.292 | 3.46E-29 | 0.0510 | 0.0662 | 0.807 |
Age | -0.2079 | 0.1517 | -1.371 | 0.1732 | -0.5084 | 0.0926 | -0.068 |
Features | 2.2579 | 1.4453 | 1.562 | 0.1210 | -0.6057 | 5.1215 | 0.083 |
CornerCODE | -10.2273 | 4.7006 | -2.176 | 0.0317 | -19.5409 | -0.9138 | -0.105 |
(2) The regression equation is Price (in K) = 111.6367 + 0.0586 Sqft - 0.2079 Age + 2.2579 Features - 10.2273 CornerCODE
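To make the equation concrete, here is a minimal Python sketch that evaluates it for a single house. The coefficients are copied from the regression output; the function name `predict_price_k` and the example house are hypothetical and are not part of the original answer.

```python
# Coefficients copied from the four-predictor regression output.
FULL_MODEL = {
    "Intercept": 111.6367,
    "Sqft": 0.0586,
    "Age": -0.2079,
    "Features": 2.2579,
    "CornerCODE": -10.2273,
}


def predict_price_k(sqft, age, features, corner_code):
    """Return the fitted Price (in K) for one house.

    corner_code is the dummy variable: 1 for a corner lot, 0 otherwise.
    """
    return (
        FULL_MODEL["Intercept"]
        + FULL_MODEL["Sqft"] * sqft
        + FULL_MODEL["Age"] * age
        + FULL_MODEL["Features"] * features
        + FULL_MODEL["CornerCODE"] * corner_code
    )


# Hypothetical house for illustration only (not a row from the data set).
interior_lot = predict_price_k(sqft=2000, age=10, features=4, corner_code=0)
corner_lot = predict_price_k(sqft=2000, age=10, features=4, corner_code=1)

print(round(interior_lot, 2))
# The gap between the two predictions equals the CornerCODE coefficient.
print(round(corner_lot - interior_lot, 4))
```

Holding the other predictors fixed, switching the dummy from 0 to 1 shifts the fitted price by exactly the CornerCODE coefficient, which is how a dummy variable is read in this kind of model.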
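The normal probability plot of the residuals follows an approximately straight line, so the normality assumption for the errors of the multiple regression model appears to be satisfied. This plot checks normality only; linearity and constant variance are judged from the plot of residuals against fitted values.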
R² = 0.739 means the model accounts for 73.9% of the variation in Price (in K) on the basis of the predictor variables (adjusted R² = 0.729). The overall F test (F = 79.12, p-value = 9.90E-32) is highly significant, so the model is a good fit.
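The fit statistics in the output can be recomputed from the ANOVA sums of squares, which is a quick way to tell R² and adjusted R² apart. The sketch below uses only numbers from the ANOVA table; the variable names are illustrative.

```python
# Values copied from the ANOVA table of the four-predictor model.
ss_regression = 124004.2121
ss_residual = 43885.2558
ss_total = 167889.4679
n = 117  # number of observations
k = 4    # number of predictors

df_residual = n - k - 1  # 112

# Share of total variation explained by the model.
r_squared = ss_regression / ss_total

# Adjusted R² penalizes the fit for the number of predictors used.
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / df_residual

# Overall F statistic: mean square regression over mean square residual.
f_stat = (ss_regression / k) / (ss_residual / df_residual)

# Standard error of the estimate: square root of mean square residual.
std_error = (ss_residual / df_residual) ** 0.5

print(round(r_squared, 3), round(adj_r_squared, 3))
print(round(f_stat, 2), round(std_error, 3))
```

The results should agree with the reported R², adjusted R², F, and Std. Error up to rounding.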
Age and Features have p-values greater than 0.05 (0.1732 and 0.1210) and are therefore not significant at the 5% level. Dropping them from the model gives the final regression equation: Price (in K) = 111.3012 + 0.0617 Sqft - 11.0245 CornerCODE
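Because two predictors are dropped at once, their joint contribution can also be checked with a partial F test. This check is not part of the original answer; it is an added sketch that uses only the residual sums of squares reported in the two ANOVA tables.

```python
# Residual sums of squares and degrees of freedom from the two ANOVA tables.
sse_full = 43885.2558     # model with Sqft, Age, Features, CornerCODE
df_full = 112
sse_reduced = 45902.8068  # model with Sqft and CornerCODE only
df_reduced = 114

# Number of predictors removed (Age and Features).
dropped = df_reduced - df_full

# Partial F: extra unexplained variation per dropped predictor,
# relative to the mean square residual of the full model.
partial_f = ((sse_reduced - sse_full) / dropped) / (sse_full / df_full)

print(dropped, round(partial_f, 3))
```

The statistic is compared with an F distribution on 2 and 112 degrees of freedom, whose 5% critical value is roughly 3.08. A value below that threshold is consistent with removing Age and Features together.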
Regression Analysis |
R² | 0.727 |
Adjusted R² | 0.722 | n | 117 |
R | 0.852 | k | 2 |
Std. Error | 20.066 | Dep. Var. | Price (in K) |
ANOVA table |
Source | SS | df | MS | F | p-value |
Regression | 121,986.6611 | 2 | 60,993.3305 | 151.48 | 7.92E-33 |
Residual | 45,902.8068 | 114 | 402.6562 |
Total | 167,889.4679 | 116 |
Regression output (with 95% confidence interval) |
variables | coefficients | std. error | t (df=114) | p-value | 95% lower | 95% upper | std. coeff. |
Intercept | 111.3012 | 0.000 |
Sqft | 0.0617 | 0.0036 | 17.330 | 9.90E-34 | 0.0546 | 0.0688 | 0.849 |
CornerCODE | -11.0245 | 4.7516 | -2.320 | 0.0221 | -20.4374 | -1.6115 | -0.114 |
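The columns of the coefficient table are linked: the t statistic is the coefficient divided by its standard error, and the 95% interval is centered on the coefficient. The sketch below checks this for the CornerCODE row of the final model, using only values printed in the table; the variable names are illustrative.

```python
# CornerCODE row of the final (two-predictor) model.
coefficient = -11.0245
std_error = 4.7516
ci_lower = -20.4374
ci_upper = -1.6115

# t statistic: how many standard errors the estimate lies from zero.
t_stat = coefficient / std_error

# The interval is coefficient +/- t_crit * std_error, so its midpoint
# recovers the coefficient and its half-width recovers t_crit.
midpoint = (ci_lower + ci_upper) / 2
half_width = (ci_upper - ci_lower) / 2
implied_t_crit = half_width / std_error

print(round(t_stat, 3))
print(round(midpoint, 4), round(implied_t_crit, 3))
```

Because the whole interval lies below zero, it leads to the same conclusion as the p-value of 0.0221: the corner-lot effect is significant at the 5% level.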