Selling Price (Y) | Square Footage (X1) | Bedrooms (X2) | Age (X3) |
84,000 | 1670 | 2 | 30 |
79,000 | 1339 | 2 | 25 |
91,500 | 1712 | 3 | 30 |
120,000 | 1840 | 3 | 40 |
127,500 | 2300 | 3 | 18 |
132,500 | 2234 | 3 | 30 |
145,000 | 2311 | 3 | 19 |
164,000 | 2377 | 3 | 7 |
155,000 | 2736 | 4 | 10 |
168,000 | 2500 | 3 | 1 |
172,500 | 2500 | 4 | 3 |
174,000 | 2479 | 3 | 3 |
Using Excel
Data -> Data analysis -> correlation
Correlation matrix
Variable | Square Footage (X1) | Bedrooms (X2) | Age (X3) |
Square Footage (X1) | 1 | | |
Bedrooms (X2) | 0.79204215 | 1 | |
Age (X3) | -0.74713851 | -0.483045892 | 1 |
Variable | Selling Price (Y) | Square Footage (X1) | Bedrooms (X2) | Age (X3) |
Selling Price (Y) | 1 | | | |
Square Footage (X1) | 0.924978077 | 1 | | |
Bedrooms (X2) | 0.714005914 | 0.79204215 | 1 | |
Age (X3) | -0.811400487 | -0.74713851 | -0.483045892 | 1 |
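The correlation between Bedrooms and Square Footage is 0.79, and the correlation between Age and Square Footage is -0.747.
Both are large in magnitude, so the independent variables are strongly correlated with each other (multicollinearity).
Square Footage has the highest correlation with the dependent variable, Selling Price (0.925),
followed by Age (-0.811) and then Bedrooms (0.714).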
Data -> Data analysis -> regression
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.942708618 |
R Square | 0.888699539 |
Adjusted R Square | 0.846961865 |
Standard Error | 13587.43836 |
Observations | 12 |
ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 3 | 11792968818 | 3930989606 | 21.29250324 | 0.000360426 |
Residual | 8 | 1476947849 | 184618481.1 | | |
Total | 11 | 13269916667 | | | |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% |
Intercept | 19710.49459 | 38520.88914 | 0.511683272 | 0.6226939 | -69118.83506 |
Square Footage (X1) | 56.57849379 | 21.64759505 | 2.613615677 | 0.030955932 | 6.659050083 |
Bedrooms (X2) | 1830.337576 | 11551.09912 | 0.158455707 | 0.87802463 | -24806.54476 |
Age (X3) | -742.3415246 | 488.0666793 | -1.520983825 | 0.166756036 | -1867.825305 |
Here only X1 is significant at the 0.05 level, as its p-value = 0.031 < 0.05; the p-values for X2 (0.878) and X3 (0.167) are above 0.05.
So we keep only X1 and refit the model.
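The three-variable fit can be reproduced without Excel. The sketch below is one possible way to do it, using ordinary least squares via the normal equations in plain Python; the helper `invert` and all variable names are illustrative, not Excel's internals. Because the standard library has no t-distribution, significance is judged by comparing |t| with the two-tailed 5% critical value for 8 degrees of freedom (about 2.306) instead of computing p-values.

```python
import math

price = [84000, 79000, 91500, 120000, 127500, 132500,
         145000, 164000, 155000, 168000, 172500, 174000]
sqft = [1670, 1339, 1712, 1840, 2300, 2234,
        2311, 2377, 2736, 2500, 2500, 2479]
bedrooms = [2, 2, 3, 3, 3, 3, 3, 3, 4, 3, 4, 3]
age = [30, 25, 30, 40, 18, 30, 19, 7, 10, 1, 3, 3]


def invert(m):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Design matrix with a leading column of ones for the intercept.
X = [[1.0, s, b, a] for s, b, a in zip(sqft, bedrooms, age)]
k = len(X[0])

# Normal equations: beta = (X'X)^-1 X'y
xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
xty = [sum(row[i] * y for row, y in zip(X, price)) for i in range(k)]
xtx_inv = invert(xtx)
beta = [sum(xtx_inv[i][j] * xty[j] for j in range(k)) for i in range(k)]

# Residual variance, standard errors, and t statistics.
residuals = [y - sum(b * v for b, v in zip(beta, row)) for row, y in zip(X, price)]
df_resid = len(price) - k  # 12 observations - 4 coefficients = 8
mse = sum(e * e for e in residuals) / df_resid
std_err = [math.sqrt(mse * xtx_inv[i][i]) for i in range(k)]
t_stats = [b / s for b, s in zip(beta, std_err)]

T_CRIT = 2.306  # two-tailed 5% critical value of t with 8 degrees of freedom
terms = ["Intercept", "Square Footage (X1)", "Bedrooms (X2)", "Age (X3)"]
significant = [abs(t) > T_CRIT for t in t_stats]
for name, b, s, t, sig in zip(terms, beta, std_err, t_stats, significant):
    print(f"{name}: coef={b:.4f}, se={s:.4f}, t={t:.4f}, significant={sig}")
```

Comparing |t| with the critical value leads to the same decision as comparing the p-value with 0.05, so only Square Footage should be flagged as significant.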
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.924978077 |
R Square | 0.855584443 |
Adjusted R Square | 0.841142887 |
Standard Error | 13843.34645 |
Observations | 12 |
ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 1 | 11353534258 | 11353534258 | 59.24461739 | 1.64854E-05 |
Residual | 10 | 1916382409 | 191638240.9 | | |
Total | 11 | 13269916667 | | | |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% |
Intercept | -29786.79311 | 21704.35788 | -1.372387669 | 0.19993853 | -78147.11616 |
Square Footage (X1) | 75.79204236 | 9.846891681 | 7.697052513 | 1.64854E-05 | 53.85180044 |
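Here the model is still significant (Significance F = 1.64854E-05 < 0.05), and Adjusted R Square barely changes (0.841 versus 0.847 for the three-variable model).
Hence we choose the simple linear model with Square Footage as the independent variable: Selling Price = -29786.79 + 75.79 × Square Footage.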