A realtor is interested in modeling the selling price of houses (Y) based on the square footage (X1), the age of the house (X2), and the number of bedrooms (X3). The data (below) were collected in the two largest cities in Arkansas and are given in an Excel file. Follow the Minitab instructions on Blackboard to answer the questions below.
1. Check the model assumptions
a. Does the plot of Residuals vs. Fitted Values indicate that the assumption of constant variance is valid? Explain your reasoning.
b. Does the Normal Probability Plot indicate that the assumption of normality is valid? Explain your reasoning.
c. What is the sum of the residuals? Does this value indicate that the assumption E(ε) = 0 is valid?
2. Determine if any higher order terms are needed in the model by creating the scatter plots of Y vs X1, Y vs X2, and Y vs. X3. What higher order terms, if any, are needed in the model?
Here is the data given for the problem.
Y | X1 | X2 | X3 |
28,000 | 775 | 37 | 4 |
34,000 | 700 | 49 | 4 |
34,500 | 720 | 54 | 4 |
39,900 | 864 | 37 | 5 |
40,000 | 650 | 35 | 3 |
41,500 | 780 | 79 | 5 |
42,500 | 900 | 48 | 6 |
53,500 | 816 | 35 | 8 |
57,000 | 1800 | 17 | 14 |
59,000 | 1340 | 66 | 10 |
59,500 | 1800 | 18 | 12 |
62,000 | 1124 | 34 | 9 |
68,500 | 2880 | 24 | 16 |
72,500 | 1480 | 75 | 11 |
70,000 | 1652 | 94 | 13 |
73,112 | 2088 | 71 | 15 |
76,780 | 1700 | 34 | 12 |
77,350 | 1262 | 78 | 9 |
85,590 | 1500 | 54 | 10 |
79,900 | 1200 | 35 | 13 |
48,100 | 650 | 45 | 4 |
The answers below are based on output from Minitab.
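For readers without Minitab, the first-order fit can be sketched in plain Python. This is only an illustration, assuming the model Y = b0 + b1*X1 + b2*X2 + b3*X3 and ordinary least squares solved through the normal equations; the helper names `solve` and `fit_ols` are made up for this sketch and are not part of Minitab or any library.

```python
# Data transcribed from the table in the question: (Y, X1, X2, X3)
DATA = [
    (28000, 775, 37, 4), (34000, 700, 49, 4), (34500, 720, 54, 4),
    (39900, 864, 37, 5), (40000, 650, 35, 3), (41500, 780, 79, 5),
    (42500, 900, 48, 6), (53500, 816, 35, 8), (57000, 1800, 17, 14),
    (59000, 1340, 66, 10), (59500, 1800, 18, 12), (62000, 1124, 34, 9),
    (68500, 2880, 24, 16), (72500, 1480, 75, 11), (70000, 1652, 94, 13),
    (73112, 2088, 71, 15), (76780, 1700, 34, 12), (77350, 1262, 78, 9),
    (85590, 1500, 54, 10), (79900, 1200, 35, 13), (48100, 650, 45, 4),
]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (hypothetical helper)."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows):
    """Least-squares fit of Y on an intercept, X1, X2 and X3 (hypothetical helper)."""
    X = [[1.0, x1, x2, x3] for _, x1, x2, x3 in rows]
    y = [float(r[0]) for r in rows]
    k = len(X[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(xi[i] * xi[j] for xi in X) for j in range(k)] for i in range(k)]
    xty = [sum(xi[i] * yi for xi, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, xi)) for xi in X]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    return beta, fitted, residuals


beta, fitted, residuals = fit_ols(DATA)
print("coefficients (b0, b1, b2, b3):", [round(b, 4) for b in beta])
for f, e in zip(fitted, residuals):
    print(f"fitted={f:10.1f}  residual={e:10.1f}")
```

Plotting the printed residuals against the fitted values gives the plot used in part (a), and the residual column can be compared with Minitab's RESI1 output.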
1. Check the model assumptions
a. Does the plot of Residuals vs. Fitted Values indicate that the assumption of constant variance is valid? Explain your reasoning.
The constant variance assumption appears valid: the points in the Residuals vs. Fitted Values plot are scattered randomly around zero, with no funnel shape or other pattern in their spread.
b. Does the Normal Probability Plot indicate that the assumption of normality is valid? Explain your reasoning.
The normal probability plots of the residuals and of Y both support normality: all points fall within the 95% confidence bands, so the normality assumption appears valid.
c. What is the sum of the residuals? Does this value indicate that the assumption E(ε) = 0 is valid?
The residuals computed from the fitted model for the observed data are:
RESI1
-9667.3
-6309.8
-6354.3
-1542.2
6021.2
-7252.3
-4968.3
-2383.4
-13903.2
-5397.4
-2072.5
4872.7
-1222.6
3510.7
-9498.4
-7630.5
11684.1
15019.0
24758.3
4479.8
7856.4
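The residuals listed above sum to 0 (to one decimal place).
This value is consistent with the assumption E(ε) = 0. Note, however, that least-squares residuals from a model with an intercept always sum to zero by construction, so the sum alone cannot confirm the assumption; the size of the residual variance has no bearing on it.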
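The sum in part (c) can be checked directly from the RESI1 column. The short sketch below adds the residuals exactly as listed; the variable names are illustrative only. Because the values are rounded to one decimal place, a sum that differs from zero by a few tenths would still be rounding noise.

```python
import math

# Residuals copied from the RESI1 column above
RESI1 = [
    -9667.3, -6309.8, -6354.3, -1542.2, 6021.2, -7252.3, -4968.3,
    -2383.4, -13903.2, -5397.4, -2072.5, 4872.7, -1222.6, 3510.7,
    -9498.4, -7630.5, 11684.1, 15019.0, 24758.3, 4479.8, 7856.4,
]

# math.fsum avoids accumulating floating-point rounding error
total = math.fsum(RESI1)
mean = total / len(RESI1)

print("number of residuals:", len(RESI1))
print("sum of residuals (rounded):", abs(round(total, 1)))
print("mean residual (rounded):", abs(round(mean, 2)))
```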
2. Determine if any higher order terms are needed in the model by creating the scatter plots of Y vs X1, Y vs X2, and Y vs. X3. What higher order terms, if any, are needed in the model?
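From the scatter plots, Y vs. X3 is the candidate for a higher order term (for example X3²). Note that Y increasing with X3 only shows a positive association; a higher order term is justified when the plot shows curvature, such as Y leveling off at larger values of X3.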
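As a rough numerical companion to the scatter plots, one can compare a straight-line fit of Y on each predictor with a fit that also includes a squared term. The sketch below is an assumption-laden illustration, not the Minitab procedure: the helpers `solve`, `standardize` and `r_squared` are hypothetical, and each predictor is standardized first purely for numerical stability. Adding a term can never lower R², so only a clear gain hints at curvature, and the plots remain the primary evidence.

```python
# Data transcribed from the table in the question: (Y, X1, X2, X3)
DATA = [
    (28000, 775, 37, 4), (34000, 700, 49, 4), (34500, 720, 54, 4),
    (39900, 864, 37, 5), (40000, 650, 35, 3), (41500, 780, 79, 5),
    (42500, 900, 48, 6), (53500, 816, 35, 8), (57000, 1800, 17, 14),
    (59000, 1340, 66, 10), (59500, 1800, 18, 12), (62000, 1124, 34, 9),
    (68500, 2880, 24, 16), (72500, 1480, 75, 11), (70000, 1652, 94, 13),
    (73112, 2088, 71, 15), (76780, 1700, 34, 12), (77350, 1262, 78, 9),
    (85590, 1500, 54, 10), (79900, 1200, 35, 13), (48100, 650, 45, 4),
]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (hypothetical helper)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def standardize(x):
    """Center and scale a predictor; this does not change the fitted curve."""
    mean = sum(x) / len(x)
    sd = (sum((v - mean) ** 2 for v in x) / (len(x) - 1)) ** 0.5
    return [(v - mean) / sd for v in x]


def r_squared(columns, y):
    """R^2 of a least-squares fit of y on an intercept plus the given columns."""
    X = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(X[0])
    xtx = [[sum(xi[i] * xi[j] for xi in X) for j in range(k)] for i in range(k)]
    xty = [sum(xi[i] * yi for xi, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, xi)) for xi in X]
    ybar = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - ybar) ** 2 for yi in y)
    return 1.0 - sse / sst


y = [float(row[0]) for row in DATA]
results = {}
for name, idx in (("X1", 1), ("X2", 2), ("X3", 3)):
    z = standardize([float(row[idx]) for row in DATA])
    z_sq = [v * v for v in z]
    # Store (R^2 of linear fit, R^2 of linear + squared fit)
    results[name] = (r_squared([z], y), r_squared([z, z_sq], y))
    lin, quad = results[name]
    print(f"{name}: linear R^2 = {lin:.3f}, with squared term R^2 = {quad:.3f}")
```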