You work for a Realtor agency and part of the agency's job is to advise clients on the asking price. Based on your experience you know that buyers tend to focus on the size of the house, the number of bedrooms, number of bathrooms, and the age of house. You take a sample from the hundreds of houses that the agency sold over the past six months and plan to run a regression analysis in order to find out which variables significantly impact price and how.
(1) State the null and alternative hypothesis for each independent variable.
(2) Explain how you expect each independent variable to impact the dependent variable.
(3) Conduct regression analysis, find the final model in which only significant IVs remain.
(4) Recommend an asking price for a client whose house is 1,800 square feet, has 3 bedrooms, 2 bathrooms, and was constructed in 1980.
Price | Size | Bedrooms | Baths | Age |
$235,000.00 | 1,530 | 3 | 2 | 6 |
$375,000.00 | 2,380 | 4 | 3 | 43 |
$199,950.00 | 720 | 2 | 1 | 2 |
$258,000.00 | 1,040 | 2 | 2 | 40 |
$96,500.00 | 484 | 1 | 1 | 43 |
$237,000.00 | 1,584 | 3 | 3 | 23 |
$829,000.00 | 2,701 | 5 | 3 | 7 |
$200,000.00 | 952 | 2 | 2 | 18 |
$328,500.00 | 1,098 | 3 | 3 | 75 |
$365,000.00 | 2,004 | 3 | 2 | 35 |
$116,000.00 | 640 | 2 | 1 | 41 |
$885,000.00 | 3,849 | 6 | 4 | 5 |
$250,000.00 | 2,010 | 3 | 2 | 84 |
$165,000.00 | 575 | 2 | 1 | 32 |
$159,900.00 | 984 | 2 | 2 | 43 |
$275,000.00 | 864 | 3 | 2 | 70 |
$144,900.00 | 1,330 | 3 | 2 | 73 |
$165,000.00 | 575 | 2 | 1 | 32 |
$175,000.00 | 575 | 2 | 1 | 32 |
$184,500.00 | 768 | 3 | 1 | 60 |
$115,000.00 | 575 | 1 | 1 | 40 |
$180,000.00 | 1,044 | 3 | 1 | 42 |
$140,000.00 | 720 | 2 | 1 | 40 |
$165,000.00 | 1,144 | 2 | 1 | 88 |
$149,900.00 | 831 | 3 | 1.5 | 71 |
$109,900.00 | 637 | 1 | 1 | 42 |
$218,400.00 | 1,248 | 3 | 2 | 30 |
$285,000.00 | 1,588 | 3 | 3 | 60 |
$345,000.00 | 1,322 | 4 | 2 | 11 |
$165,000.00 | 575 | 2 | 1 | 32 |
$175,000.00 | 906 | 2 | 2 | 43 |
$270,000.00 | 1,116 | 3 | 2 | 43 |
$165,000.00 | 402 | 1 | 1 | 32 |
$409,000.00 | 2,200 | 3 | 3 | 40 |
$175,000.00 | 937 | 1 | 2 | 40 |
$450,000.00 | 1,341 | 4 | 3 | 6 |
$170,000.00 | 852 | 3 | 2 | 69 |
$429,000.00 | 1,532 | 3 | 1 | 80 |
$380,000.00 | 1,116 | 3 | 1 | 61 |
$195,000.00 | 1,168 | 2 | 2 | 21 |
$320,000.00 | 1,130 | 2 | 1 | 50 |
(2)
From the scatter plots, Price and Size are strongly positively correlated: as size increases, price increases. Price and the number of bedrooms are likewise strongly positively correlated. Price and the number of bathrooms are also positively related, but the relationship is weaker. Price and Age appear to be uncorrelated.
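The correlations described in (2) can be checked numerically. The following is a minimal Python sketch, assuming the 41 rows are transcribed from the data table as (price, size, bedrooms, baths, age). The `pearson` helper and the variable names are illustrative and not part of the original answer.

```python
from math import sqrt

# Rows transcribed from the data table: (price, size, bedrooms, baths, age)
houses = [
    (235000, 1530, 3, 2, 6), (375000, 2380, 4, 3, 43), (199950, 720, 2, 1, 2),
    (258000, 1040, 2, 2, 40), (96500, 484, 1, 1, 43), (237000, 1584, 3, 3, 23),
    (829000, 2701, 5, 3, 7), (200000, 952, 2, 2, 18), (328500, 1098, 3, 3, 75),
    (365000, 2004, 3, 2, 35), (116000, 640, 2, 1, 41), (885000, 3849, 6, 4, 5),
    (250000, 2010, 3, 2, 84), (165000, 575, 2, 1, 32), (159900, 984, 2, 2, 43),
    (275000, 864, 3, 2, 70), (144900, 1330, 3, 2, 73), (165000, 575, 2, 1, 32),
    (175000, 575, 2, 1, 32), (184500, 768, 3, 1, 60), (115000, 575, 1, 1, 40),
    (180000, 1044, 3, 1, 42), (140000, 720, 2, 1, 40), (165000, 1144, 2, 1, 88),
    (149900, 831, 3, 1.5, 71), (109900, 637, 1, 1, 42), (218400, 1248, 3, 2, 30),
    (285000, 1588, 3, 3, 60), (345000, 1322, 4, 2, 11), (165000, 575, 2, 1, 32),
    (175000, 906, 2, 2, 43), (270000, 1116, 3, 2, 43), (165000, 402, 1, 1, 32),
    (409000, 2200, 3, 3, 40), (175000, 937, 1, 2, 40), (450000, 1341, 4, 3, 6),
    (170000, 852, 3, 2, 69), (429000, 1532, 3, 1, 80), (380000, 1116, 3, 1, 61),
    (195000, 1168, 2, 2, 21), (320000, 1130, 2, 1, 50),
]


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


price = [row[0] for row in houses]
predictors = {"Size": 1, "Bedrooms": 2, "Baths": 3, "Age": 4}

# Correlation of Price with each independent variable
correlations = {
    name: pearson(price, [row[col] for row in houses])
    for name, col in predictors.items()
}

for name, r in correlations.items():
    print(f"corr(Price, {name}) = {r:.3f}")
```

A value near +1 or -1 indicates a strong linear relationship and a value near 0 indicates little or none, which is the numerical counterpart of reading the scatter plots.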
(3)
Minitab output:
Regression Analysis: Price versus Size, Bedrooms, Baths, Age
The regression equation is
Price = 10012 + 146 Size + 60790 Bedrooms - 21401 Baths - 1073 Age
Predictor | Coef | SE Coef | T | P |
Constant | 10012 | 45888 | 0.22 | 0.829 |
Size | 145.63 | 35.53 | 4.10 | 0.000 |
Bedrooms | 60790 | 20552 | 2.96 | 0.005 |
Baths | -21401 | 24084 | -0.89 | 0.380 |
Age | -1072.9 | 549.9 | -1.95 | 0.059 |
S = 76956.7 R-Sq = 80.6% R-Sq(adj) = 78.4%
Analysis of Variance
Source | DF | SS | MS | F | P |
Regression | 4 | 8.85394E+11 | 2.21348E+11 | 37.38 | 0.000 |
Residual Error | 36 | 2.13204E+11 | 5922330990 | | |
Total | 40 | 1.09860E+12 | | | |
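The summary statistics in the Minitab output follow directly from the sums of squares in the ANOVA table. As a rough check, this Python sketch recomputes F, R-Sq, R-Sq(adj), and S from the reported SS and DF values; the variable names are illustrative, and small differences from Minitab's figures come from the rounding of the printed sums of squares.

```python
from math import sqrt

# Values as printed in the ANOVA table
ss_regression = 8.85394e11
ss_error = 2.13204e11
df_regression = 4
df_error = 36

ss_total = ss_regression + ss_error
n = df_regression + df_error + 1           # number of observations (41)

ms_regression = ss_regression / df_regression
ms_error = ss_error / df_error

f_stat = ms_regression / ms_error           # overall F statistic
r_sq = ss_regression / ss_total             # coefficient of determination
r_sq_adj = 1 - (1 - r_sq) * (n - 1) / df_error
s = sqrt(ms_error)                          # residual standard error

print(f"F = {f_stat:.2f}")
print(f"R-Sq = {100 * r_sq:.1f}%")
print(f"R-Sq(adj) = {100 * r_sq_adj:.1f}%")
print(f"S = {s:.1f}")
```

The results agree with the reported F = 37.38, R-Sq = 80.6%, R-Sq(adj) = 78.4%, and S = 76956.7.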
From the output above, the p-values for Size (0.000) and Bedrooms (0.005) are less than 0.05, while the p-values for Baths (0.380) and Age (0.059) are greater than 0.05. So Size and Bedrooms are the only significant predictors. The intercept term is also insignificant, since its p-value (0.829) is greater than 0.05.
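The same significance decisions can be reached from the T column. This sketch recomputes each t statistic as Coef / SE Coef and compares its absolute value with the two-sided 5% critical value for 36 degrees of freedom. The critical value of about 2.028 is taken from a standard t-table and is an assumption, since it does not appear in the Minitab output.

```python
# (Coef, SE Coef) pairs as printed in the Minitab coefficient table
coefficients = {
    "Constant": (10012, 45888),
    "Size": (145.63, 35.53),
    "Bedrooms": (60790, 20552),
    "Baths": (-21401, 24084),
    "Age": (-1072.9, 549.9),
}

# Assumption: two-sided 5% critical value of the t distribution with
# 36 residual degrees of freedom, read from a t-table (about 2.028).
T_CRITICAL = 2.028

# t statistic for H0: coefficient = 0
t_stats = {name: coef / se for name, (coef, se) in coefficients.items()}

# A term is significant at the 5% level when |t| exceeds the critical value
significant = [name for name, t in t_stats.items() if abs(t) > T_CRITICAL]

for name, t in t_stats.items():
    verdict = "significant" if name in significant else "not significant"
    print(f"{name:9s} t = {t:6.2f}  {verdict}")
```

Only Size and Bedrooms exceed the critical value. Age is close to the cutoff (|t| of about 1.95) but does not reach it at the 5% level.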
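Hence the final model is: Price = 145.63 Size + 60790 Bedrooms. Note that these coefficients are carried over from the full model; refitting the regression with only Size and Bedrooms would generally give somewhat different estimates.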
(4)
The estimated price for a client whose house is 1,800 square feet with 3 bedrooms (the number of bathrooms and the year of construction do not enter the final model) is 145.63*1800 + 60790*3 = 262,134 + 182,370 = $444,504.
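The arithmetic can be verified by evaluating the final model from (3) at Size = 1800 and Bedrooms = 3. This is a minimal sketch; the function name `predict_price` is illustrative.

```python
def predict_price(size_sqft, bedrooms):
    """Final model from step (3): Price = 145.63 Size + 60790 Bedrooms."""
    return 145.63 * size_sqft + 60790 * bedrooms


estimate = predict_price(1800, 3)
print(f"${estimate:,.0f}")  # $444,504
```

The size term contributes 262,134 and the bedroom term 182,370, for a total of 444,504.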