Price | Bedroom | Bathroom | Cars | SQ FT |
298,000 | 3 | 2.5 | 0 | 1,566 |
319,900 | 3 | 2.5 | 0 | 2,000 |
354,000 | 3 | 2 | 2 | 0 |
374,900 | 4 | 2.5 | 0 | 2,816 |
385,000 | 4 | 2 | 0 | 0 |
389,000 | 3 | 2.5 | 0 | 2,248 |
399,000 | 4 | 3 | 0 | 2,215 |
415,000 | 3 | 2.5 | 0 | 3,188 |
444,900 | 3 | 2 | 0 | 2,530 |
450,000 | 3 | 2 | 0 | 1,967 |
465,000 | 4 | 3 | 0 | 2,564 |
340,000 | 4 | 2.5 | 0 | 2,293 |
275,000 | 3 | 2.5 | 2 | 1,353 |
425,000 | 3 | 2 | 0 | 1,834 |
250,000 | 3 | 2.5 | 0 | 5,837 |
450,000 | 3 | 2.5 | 0 | 9,060 |
390,000 | 3 | 3.5 | 0 | 1,002 |
269,000 | 3 | 2.5 | 0 | 1,680 |
425,000 | 3 | 2.5 | 2 | 4,356 |
425,000 | 2 | 2.5 | 2 | 2,993 |
425,000 | 3 | 3 | 0 | 4,356 |
429,900 | 5 | 3.5 | 1 | 2,154 |
400,000 | 3 | 2.5 | 2 | 1,846 |
399,900 | 3 | 2 | 1 | 2,018 |
388,990 | 4 | 4 | 0 | 2,295 |
Construct and interpret a correlation matrix for your data, and explain the relationships.
Variable | Price | Bedroom | Bathroom | Cars | SQ FT |
Price | 1 | 0.116311 | 0.069295 | -0.01194 | 0.182509 |
Bedroom | 0.116311 | 1 | 0.454391 | -0.27731 | -0.16482 |
Bathroom | 0.069295 | 0.454391 | 1 | -0.14388 | 0.0818 |
Cars | -0.01194 | -0.27731 | -0.14388 | 1 | -0.15225 |
SQ FT | 0.182509 | -0.16482 | 0.0818 | -0.15225 | 1 |
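The correlation matrix shows that Bedroom (r = 0.116), Bathroom (r = 0.069) and SQ FT (r = 0.183) each have only a weak positive linear relationship with Price, while Cars (r = -0.012) has essentially no linear relationship with Price. The strongest correlation in the matrix is between two predictors, Bedroom and Bathroom (r = 0.454).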
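As a cross-check on the matrix above, the following is a minimal sketch that recomputes the Pearson correlations from the 25 listings in the data table using only the Python standard library. The names `ROWS`, `NAMES`, `pearson` and `corr` are illustrative choices, not part of the original solution.

```python
from math import sqrt

# (price, bedroom, bathroom, cars, sq_ft) rows copied from the data table
ROWS = [
    (298000, 3, 2.5, 0, 1566), (319900, 3, 2.5, 0, 2000), (354000, 3, 2, 2, 0),
    (374900, 4, 2.5, 0, 2816), (385000, 4, 2, 0, 0), (389000, 3, 2.5, 0, 2248),
    (399000, 4, 3, 0, 2215), (415000, 3, 2.5, 0, 3188), (444900, 3, 2, 0, 2530),
    (450000, 3, 2, 0, 1967), (465000, 4, 3, 0, 2564), (340000, 4, 2.5, 0, 2293),
    (275000, 3, 2.5, 2, 1353), (425000, 3, 2, 0, 1834), (250000, 3, 2.5, 0, 5837),
    (450000, 3, 2.5, 0, 9060), (390000, 3, 3.5, 0, 1002), (269000, 3, 2.5, 0, 1680),
    (425000, 3, 2.5, 2, 4356), (425000, 2, 2.5, 2, 2993), (425000, 3, 3, 0, 4356),
    (429900, 5, 3.5, 1, 2154), (400000, 3, 2.5, 2, 1846), (399900, 3, 2, 1, 2018),
    (388990, 4, 4, 0, 2295),
]
NAMES = ["Price", "Bedroom", "Bathroom", "Cars", "SQ FT"]


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Transpose the rows into one tuple per variable, then correlate every pair
columns = list(zip(*ROWS))
corr = [[pearson(ci, cj) for cj in columns] for ci in columns]

for name, row in zip(NAMES, corr):
    print(f"{name:<9}" + " ".join(f"{v:9.6f}" for v in row))
```

The first row (or column) of `corr` holds the correlations with Price, which is the part that matters for choosing predictors.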
Show the stepwise process of determining the best regression model to predict PRICE. Explain the process as you move from the full model to your final model.
The regression output for the full model (all four predictors) is:
R² | 0.060 |
Adjusted R² | 0.000 |
R | 0.245 |
Std. Error | 63973.056 |
n | 25 |
k | 4 |
Dep. Var. | Price |
ANOVA table
Source | SS | df | MS | F | p-value |
Regression | 5,222,651,649.4394 | 4 | 1,305,662,912.3598 | 0.32 | .8619 |
Residual | 81,851,038,446.5607 | 20 | 4,092,551,922.3280 |
Total | 87,073,690,096.0000 | 24 |
Regression output, with 95% confidence interval
variables | coefficients | std. error | t (df=20) | p-value | 95% lower | 95% upper |
Intercept | 309,980.8034 |
Bedroom | 17,945.0381 | 25,574.4249 | 0.702 | .4910 | -35,402.2774 | 71,292.3537 |
Bathroom | -2,595.9465 | 28,990.2834 | -0.090 | .9295 | -63,068.6180 | 57,876.7249 |
Cars | 5,114.6350 | 16,897.8149 | 0.303 | .7653 | -30,133.5893 | 40,362.8593 |
SQ FT | 7.3638 | 7.4722 | 0.985 | .3362 | -8.2230 | 22.9505 |
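The summary statistics in the output follow directly from the ANOVA sums of squares. The sketch below recomputes them from the SS values, n and k reported above; the variable names are illustrative. Note that the textbook adjusted R² formula gives a slightly negative value here, so the 0.000 in the output is presumably the software flooring it at zero (an assumption, since the software is not named).

```python
from math import sqrt

# Sums of squares and sizes copied from the full-model ANOVA table
ss_regression = 5_222_651_649.4394
ss_residual = 81_851_038_446.5607
n, k = 25, 4  # observations, predictors

ss_total = ss_regression + ss_residual
df_regression = k
df_residual = n - k - 1

# Mean squares are sums of squares divided by their degrees of freedom
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

r_squared = ss_regression / ss_total
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / df_residual
f_stat = ms_regression / ms_residual
std_error = sqrt(ms_residual)  # standard error of the estimate

print(f"R^2          = {r_squared:.3f}")
print(f"Adjusted R^2 = {adj_r_squared:.3f}")
print(f"F            = {f_stat:.2f}")
print(f"Std. Error   = {std_error:.3f}")
```

A small F with a large p-value, as here, means the four predictors together explain little more of the variation in Price than the mean alone.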
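None of the four predictors is individually significant at the 5% level (every p-value exceeds .33), and the overall F-test is not significant either (F = 0.32, p = .8619); the model explains only about 6% of the variation in Price. The full model, used here as the final model to predict PRICE, is:
PRICE = 309,980.8034 + 17,945.0381 Bedroom - 2,595.9465 Bathroom + 5,114.6350 Cars + 7.3638 SQ FT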
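The full-model coefficients can be reproduced from the data table by ordinary least squares. The sketch below solves the normal equations (X'X)b = X'y with a small Gaussian-elimination helper, using only the standard library. This is one possible way to obtain the fit, not necessarily how the original software computed it; `ROWS`, `solve` and `coef` are illustrative names.

```python
# (price, bedroom, bathroom, cars, sq_ft) rows copied from the data table
ROWS = [
    (298000, 3, 2.5, 0, 1566), (319900, 3, 2.5, 0, 2000), (354000, 3, 2, 2, 0),
    (374900, 4, 2.5, 0, 2816), (385000, 4, 2, 0, 0), (389000, 3, 2.5, 0, 2248),
    (399000, 4, 3, 0, 2215), (415000, 3, 2.5, 0, 3188), (444900, 3, 2, 0, 2530),
    (450000, 3, 2, 0, 1967), (465000, 4, 3, 0, 2564), (340000, 4, 2.5, 0, 2293),
    (275000, 3, 2.5, 2, 1353), (425000, 3, 2, 0, 1834), (250000, 3, 2.5, 0, 5837),
    (450000, 3, 2.5, 0, 9060), (390000, 3, 3.5, 0, 1002), (269000, 3, 2.5, 0, 1680),
    (425000, 3, 2.5, 2, 4356), (425000, 2, 2.5, 2, 2993), (425000, 3, 3, 0, 4356),
    (429900, 5, 3.5, 1, 2154), (400000, 3, 2.5, 2, 1846), (399900, 3, 2, 1, 2018),
    (388990, 4, 4, 0, 2295),
]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


y = [row[0] for row in ROWS]
X = [[1.0, *row[1:]] for row in ROWS]  # leading 1.0 is the intercept column

p = len(X[0])
xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
coef = solve(xtx, xty)

for name, b in zip(["Intercept", "Bedroom", "Bathroom", "Cars", "SQ FT"], coef):
    print(f"{name:<9} {b:14.4f}")
```

Dropping a column from `X` and refitting is the basic move in a backward-elimination pass, so the same helper could be reused to compare reduced models.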
Make three regression models, based on Cars, Bedroom and Bathroom respectively.
The regression model based on Cars is:
r² | 0.000 |
r | -0.012 |
Std. Error | 61524.573 |
n | 25 |
k | 1 |
Dep. Var. | Price |
ANOVA table
Source | SS | df | MS | F | p-value |
Regression | 12,407,863.4877 | 1 | 12,407,863.4877 | 0.00 | .9548 |
Residual | 87,061,282,232.5123 | 23 | 3,785,273,140.5440 |
Total | 87,073,690,096.0000 | 24 |
Regression output, with 95% confidence interval
variables | coefficients | std. error | t (df=23) | p-value | 95% lower | 95% upper |
Intercept | 383,919.1626 |
Cars | -874.0887 | 15,267.0666 | -0.057 | .9548 | -32,456.4221 | 30,708.2448 |
The estimated regression equation is:
PRICE = 383,919.1626 - 874.0887 Cars
The regression model based on Bedroom is:
r² | 0.014 |
r | 0.116 |
Std. Error | 61111.349 |
n | 25 |
k | 1 |
Dep. Var. | Price |
ANOVA table
Source | SS | df | MS | F | p-value |
Regression | 1,177,959,510.1593 | 1 | 1,177,959,510.1593 | 0.32 | .5798 |
Residual | 85,895,730,585.8407 | 23 | 3,734,596,981.9931 |
Total | 87,073,690,096.0000 | 24 |
Regression output, with 95% confidence interval
variables | coefficients | std. error | t (df=23) | p-value | 95% lower | 95% upper |
Intercept | 346,057.9646 |
Bedroom | 11,415.1327 | 20,325.3324 | 0.562 | .5798 | -30,631.0207 | 53,461.2862 |
The estimated regression equation is:
PRICE = 346,057.9646 + 11,415.1327 Bedroom
The regression model based on Bathroom is:
r² | 0.005 |
r | 0.069 |
Std. Error | 61381.057 |
n | 25 |
k | 1 |
Dep. Var. | Price |
ANOVA table
Source | SS | df | MS | F | p-value |
Regression | 418,105,299.9432 | 1 | 418,105,299.9432 | 0.11 | .7421 |
Residual | 86,655,584,796.0568 | 23 | 3,767,634,121.5677 |
Total | 87,073,690,096.0000 | 24 |
Regression output, with 95% confidence interval
variables | coefficients | std. error | t (df=23) | p-value | 95% lower | 95% upper |
Intercept | 362,547.9653 |
Bathroom | 8,120.7886 | 24,377.5318 | 0.333 | .7421 | -42,307.9781 | 58,549.5553 |
The estimated regression equation is:
PRICE = 362,547.9653 + 8,120.7886 Bathroom
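Each of the three single-predictor models can be recomputed from the data table with the closed-form simple regression formulas (slope = Sxy / Sxx, intercept = mean of y minus slope times mean of x). The sketch below is a minimal standard-library version; `ROWS`, `simple_ols` and `models` are illustrative names.

```python
from math import sqrt

# (price, bedroom, bathroom, cars, sq_ft) rows copied from the data table
ROWS = [
    (298000, 3, 2.5, 0, 1566), (319900, 3, 2.5, 0, 2000), (354000, 3, 2, 2, 0),
    (374900, 4, 2.5, 0, 2816), (385000, 4, 2, 0, 0), (389000, 3, 2.5, 0, 2248),
    (399000, 4, 3, 0, 2215), (415000, 3, 2.5, 0, 3188), (444900, 3, 2, 0, 2530),
    (450000, 3, 2, 0, 1967), (465000, 4, 3, 0, 2564), (340000, 4, 2.5, 0, 2293),
    (275000, 3, 2.5, 2, 1353), (425000, 3, 2, 0, 1834), (250000, 3, 2.5, 0, 5837),
    (450000, 3, 2.5, 0, 9060), (390000, 3, 3.5, 0, 1002), (269000, 3, 2.5, 0, 1680),
    (425000, 3, 2.5, 2, 4356), (425000, 2, 2.5, 2, 2993), (425000, 3, 3, 0, 4356),
    (429900, 5, 3.5, 1, 2154), (400000, 3, 2.5, 2, 1846), (399900, 3, 2, 1, 2018),
    (388990, 4, 4, 0, 2295),
]


def simple_ols(x, y):
    """Fit y = intercept + slope * x by least squares and return key statistics."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    std_error = sqrt(sse / (n - 2))      # standard error of the estimate
    se_slope = std_error / sqrt(sxx)     # standard error of the slope
    return {
        "intercept": intercept,
        "slope": slope,
        "r2": 1 - sse / syy,
        "std_error": std_error,
        "se_slope": se_slope,
        "t": slope / se_slope,
    }


price = [row[0] for row in ROWS]
predictors = {"Bedroom": 1, "Bathroom": 2, "Cars": 3}  # column index in ROWS
models = {name: simple_ols([row[i] for row in ROWS], price)
          for name, i in predictors.items()}

for name, m in models.items():
    print(f"{name:<9} PRICE = {m['intercept']:,.4f} + ({m['slope']:,.4f}) {name}"
          f"   r2 = {m['r2']:.3f}   t = {m['t']:.3f}")
```

All three slopes have t statistics well below 1 in absolute value, which is consistent with the large p-values in the outputs above.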
Using your final model, select values for the independent variables and predict the house’s sales price.
Suppose a house has 3 bedrooms, 2 bathrooms, space for 8 cars and an area of 3,000 SQ FT.
The predicted sales PRICE is obtained by substituting these values into the full model:
PRICE = 309,980.8034 + 17,945.0381 Bedroom - 2,595.9465 Bathroom + 5,114.6350 Cars + 7.3638 SQ FT
PRICE = 309,980.8034 + 17,945.0381*3 - 2,595.9465*2 + 5,114.6350*8 + 7.3638*3,000
PRICE ≈ $421,632
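The substitution can be checked with a few lines of Python. This sketch uses the rounded coefficients as printed in the full-model output, so the result can differ from the reported figure by a few cents; `COEF` and `predict_price` are illustrative names.

```python
# Coefficients copied (as printed, rounded) from the full-model regression output
COEF = {
    "Intercept": 309_980.8034,
    "Bedroom": 17_945.0381,
    "Bathroom": -2_595.9465,
    "Cars": 5_114.6350,
    "SQ FT": 7.3638,
}


def predict_price(bedroom, bathroom, cars, sq_ft):
    """Evaluate the fitted full-model equation for one house."""
    return (COEF["Intercept"]
            + COEF["Bedroom"] * bedroom
            + COEF["Bathroom"] * bathroom
            + COEF["Cars"] * cars
            + COEF["SQ FT"] * sq_ft)


price = predict_price(bedroom=3, bathroom=2, cars=8, sq_ft=3000)
print(f"Predicted PRICE: ${price:,.2f}")
```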