In: Statistics and Probability
The following data gives the selling price, square footage, number of bedrooms, and the age of a house in years. These houses have been sold in a specific neighborhood over the last six months.
Selling Price ($) |
Square Footage |
Bedrooms |
Age (years) |
84,000 |
1,670 |
2 |
30 |
79,000 |
1,339 |
2 |
25 |
91,500 |
1,712 |
3 |
30 |
120,000 |
1,840 |
3 |
40 |
127,500 |
2,300 |
3 |
18 |
132,500 |
2,234 |
3 |
30 |
145,000 |
2,311 |
3 |
19 |
164,000 |
2,377 |
3 |
7 |
155,000 |
2,736 |
4 |
10 |
168,000 |
2,500 |
3 |
1 |
172,500 |
2,500 |
4 |
3 |
174,500 |
2,479 |
3 |
3 |
175,000 |
2,400 |
3 |
1 |
177,500 |
3,124 |
4 |
0 |
184,000 |
2,500 |
3 |
2 |
195,500 |
4,062 |
4 |
10 |
195,000 |
2,854 |
3 |
3 |
a) Using square footage develop a model to predict the selling price of the house. How well does the model fit the data? What percentage of the selling price is explained by the model?
b) Using the number of bedrooms develop a model to predict the selling price of the house. How well does the model fit the data? What percentage of the selling price is explained by the model?
c) Using the age of the house develop a model to predict the selling price of the house. How well does the model fit the data? What percentage of the selling price is explained by the model?
d) Which of the models estimated in parts a – d best fits the data? Why did you select that model? Model using selling price & X 3 = Age of house fits the data best since it explains maximum variability in selling price.
The linear model (regression equation) can be formed using excel. So, just the put the data into spreadsheet first.
a) Prediction of selling price using square footage:
Step 1: Go to data --> data analysis --> regression
Step 2: Select the data ( Selling price column and Square footage column) and press ok
The regression equation for the square footage and selling price is as follows:
y = 26547.72 + 51.03X1
The value of r2 = 0.699 suggests that this model fits the data well and 69% of the variation in the selling price has been explained by the square footage.
b) bedrooms and selling price
The regression output can be created using the same steps as in (a)
Y = 20392.85 + 41392.85X2
The value of r2 = 0.43 suggests that this model fits the data well and 43% of the variation in the selling price has been explained by the number of bedrooms.
c) Age of the house and selling price
Y = 182560.76 - 2426.866 X3
The value of r2 = 0.703 suggests that this model fits the data well and 70% of the variation in the selling price has been explained by the age of the house.
d) This answer is already correct.
Model using selling price & X 3 = Age of house fits the data best since it explains maximum variability in selling price.