In: Statistics and Probability
The data below id from 128 recent sales in Mid City. For each sale, it shows the neighborhood (1, 2, or 3) of the house, # of offers made on the house, sq. footage, whether house is primarily made of brick, # of bathrooms, # of bedrooms, & selling price. Â Neghborhoods 1 & 2 are traditional, 3 is newer/more prestigious. Use regression to estimate & interpret the pricing structure of houses in Mid City.
Q2. Predict Price using SqFt as the independent variable. Is the relationship between these variables statistically significant? YES or NO
Home | Nbhd | Offers | Sq Ft | Brick | Bedrooms | Bathrooms | Price |
1 | 2 | 2 | 1790 | No | 2 | 2 | 228600 |
2 | 2 | 3 | 2030 | No | 4 | 2 | 228400 |
3 | 2 | 1 | 1740 | No | 3 | 2 | 229600 |
4 | 2 | 3 | 1980 | No | 3 | 2 | 189400 |
5 | 2 | 3 | 2130 | No | 3 | 3 | 239600 |
6 | 1 | 2 | 1780 | No | 3 | 2 | 229200 |
7 | 3 | 3 | 1830 | Yes | 3 | 3 | 303200 |
8 | 3 | 2 | 2160 | No | 4 | 2 | 301400 |
9 | 2 | 3 | 2110 | No | 4 | 2 | 238400 |
10 | 2 | 3 | 1730 | No | 3 | 3 | 208000 |
11 | 2 | 3 | 2030 | Yes | 3 | 2 | 265000 |
12 | 2 | 2 | 1870 | Yes | 2 | 2 | 246000 |
13 | 1 | 4 | 1910 | No | 3 | 2 | 205200 |
14 | 1 | 5 | 2150 | Yes | 3 | 3 | 252600 |
15 | 3 | 4 | 2590 | No | 4 | 3 | 353600 |
16 | 3 | 1 | 1780 | No | 4 | 2 | 291600 |
17 | 2 | 4 | 2190 | Yes | 3 | 3 | 294200 |
18 | 1 | 4 | 1990 | No | 3 | 3 | 167200 |
19 | 2 | 1 | 1700 | Yes | 2 | 2 | 222800 |
20 | 3 | 2 | 1920 | Yes | 3 | 3 | 334400 |
21 | 2 | 3 | 1790 | No | 3 | 2 | 232400 |
22 | 1 | 4 | 2000 | No | 3 | 2 | 227600 |
23 | 1 | 3 | 1690 | No | 3 | 2 | 183400 |
24 | 1 | 3 | 1820 | Yes | 3 | 2 | 212200 |
25 | 2 | 2 | 2210 | Yes | 4 | 3 | 312800 |
26 | 1 | 3 | 2290 | No | 4 | 3 | 298600 |
27 | 3 | 3 | 2000 | No | 4 | 2 | 274000 |
28 | 2 | 2 | 1700 | No | 3 | 2 | 198600 |
29 | 1 | 3 | 1600 | No | 2 | 2 | 138200 |
30 | 3 | 1 | 2040 | Yes | 4 | 3 | 376000 |
31 | 3 | 3 | 2250 | Yes | 4 | 3 | 364000 |
32 | 1 | 2 | 1930 | Yes | 2 | 2 | 224600 |
33 | 2 | 3 | 2250 | Yes | 3 | 3 | 270000 |
34 | 2 | 4 | 2280 | Yes | 5 | 3 | 279200 |
35 | 1 | 3 | 2000 | No | 2 | 2 | 235600 |
36 | 1 | 3 | 2080 | No | 3 | 3 | 234200 |
37 | 1 | 2 | 1880 | No | 2 | 2 | 235000 |
38 | 3 | 4 | 2420 | No | 4 | 3 | 294000 |
39 | 3 | 1 | 1720 | No | 3 | 2 | 262600 |
40 | 1 | 2 | 1740 | No | 3 | 2 | 216400 |
41 | 2 | 1 | 1560 | No | 2 | 2 | 213200 |
42 | 3 | 2 | 1840 | No | 4 | 3 | 267200 |
43 | 2 | 3 | 1990 | No | 2 | 2 | 211200 |
44 | 2 | 1 | 1920 | Yes | 3 | 2 | 308000 |
45 | 3 | 2 | 1940 | Yes | 3 | 3 | 333000 |
46 | 2 | 3 | 1810 | No | 3 | 2 | 206400 |
47 | 1 | 2 | 1990 | No | 2 | 3 | 259600 |
48 | 1 | 6 | 2050 | No | 3 | 2 | 180600 |
49 | 2 | 2 | 1980 | No | 2 | 2 | 231800 |
50 | 1 | 3 | 1700 | Yes | 3 | 2 | 215000 |
51 | 2 | 3 | 2100 | Yes | 3 | 2 | 302200 |
52 | 1 | 3 | 1860 | No | 2 | 2 | 182200 |
53 | 1 | 4 | 2150 | No | 2 | 3 | 234800 |
54 | 1 | 3 | 2100 | No | 3 | 2 | 261600 |
55 | 1 | 3 | 1650 | No | 3 | 2 | 162600 |
56 | 2 | 2 | 1720 | Yes | 2 | 2 | 251400 |
57 | 2 | 3 | 2190 | Yes | 3 | 2 | 281800 |
58 | 3 | 3 | 2240 | No | 4 | 3 | 304600 |
59 | 3 | 1 | 1840 | No | 3 | 3 | 276200 |
60 | 3 | 1 | 2090 | No | 4 | 2 | 310800 |
61 | 3 | 1 | 2200 | No | 3 | 3 | 361800 |
62 | 1 | 2 | 1610 | No | 2 | 2 | 201800 |
63 | 3 | 2 | 2220 | No | 4 | 3 | 322600 |
64 | 2 | 2 | 1910 | No | 2 | 3 | 241000 |
65 | 3 | 2 | 1860 | No | 3 | 2 | 260600 |
66 | 1 | 1 | 1450 | Yes | 2 | 2 | 222200 |
67 | 1 | 4 | 2210 | No | 3 | 3 | 252400 |
68 | 2 | 3 | 2040 | No | 4 | 3 | 303800 |
69 | 1 | 4 | 2140 | No | 3 | 2 | 187200 |
70 | 3 | 3 | 2080 | No | 4 | 3 | 331200 |
71 | 3 | 3 | 1950 | Yes | 3 | 3 | 333400 |
72 | 3 | 1 | 2160 | No | 4 | 2 | 315200 |
73 | 1 | 3 | 1650 | No | 3 | 2 | 214600 |
74 | 2 | 2 | 2040 | No | 3 | 3 | 251400 |
75 | 3 | 3 | 2140 | No | 3 | 3 | 288400 |
76 | 1 | 2 | 1900 | No | 2 | 2 | 213800 |
77 | 3 | 2 | 1930 | No | 3 | 2 | 259600 |
78 | 3 | 3 | 2280 | Yes | 4 | 3 | 353000 |
79 | 1 | 3 | 2130 | No | 3 | 2 | 242600 |
80 | 3 | 1 | 1780 | No | 4 | 2 | 287200 |
81 | 2 | 4 | 2190 | Yes | 3 | 3 | 286800 |
82 | 3 | 2 | 2140 | Yes | 4 | 3 | 368600 |
83 | 3 | 1 | 2050 | Yes | 2 | 2 | 329600 |
84 | 2 | 2 | 2410 | No | 3 | 3 | 295400 |
85 | 1 | 3 | 1520 | No | 2 | 2 | 181000 |
86 | 3 | 2 | 2250 | Yes | 4 | 3 | 376600 |
87 | 1 | 4 | 1900 | No | 4 | 2 | 205400 |
88 | 3 | 1 | 1880 | Yes | 3 | 3 | 345000 |
89 | 1 | 2 | 1930 | No | 3 | 3 | 255400 |
90 | 1 | 4 | 2010 | No | 2 | 2 | 195600 |
91 | 3 | 2 | 1920 | No | 4 | 2 | 286200 |
92 | 2 | 2 | 2150 | No | 3 | 2 | 233000 |
93 | 3 | 2 | 2110 | No | 3 | 2 | 285200 |
94 | 2 | 2 | 2080 | No | 3 | 3 | 314200 |
95 | 3 | 3 | 2150 | Yes | 4 | 3 | 321200 |
96 | 3 | 1 | 1970 | Yes | 2 | 2 | 305000 |
97 | 2 | 3 | 2440 | No | 3 | 3 | 266600 |
98 | 2 | 1 | 2000 | Yes | 2 | 2 | 253600 |
99 | 3 | 1 | 2060 | No | 3 | 2 | 291000 |
100 | 3 | 2 | 2080 | Yes | 3 | 3 | 342000 |
101 | 1 | 5 | 2010 | No | 3 | 2 | 206400 |
102 | 2 | 5 | 2260 | No | 3 | 3 | 246200 |
103 | 2 | 4 | 2410 | No | 3 | 3 | 273600 |
104 | 3 | 3 | 2440 | Yes | 4 | 3 | 422400 |
105 | 2 | 4 | 1910 | No | 3 | 2 | 164600 |
106 | 3 | 4 | 2530 | No | 4 | 3 | 293800 |
107 | 1 | 4 | 2130 | No | 3 | 2 | 217000 |
108 | 2 | 1 | 1890 | Yes | 3 | 2 | 268000 |
109 | 2 | 3 | 1990 | Yes | 3 | 3 | 234000 |
110 | 2 | 3 | 2110 | No | 3 | 2 | 217400 |
111 | 1 | 1 | 1710 | No | 2 | 2 | 223200 |
112 | 1 | 2 | 1740 | No | 2 | 2 | 229800 |
113 | 2 | 2 | 1940 | Yes | 2 | 2 | 247200 |
114 | 1 | 3 | 2000 | Yes | 3 | 2 | 231400 |
115 | 2 | 2 | 2010 | No | 4 | 3 | 249000 |
116 | 1 | 3 | 1900 | No | 3 | 3 | 205000 |
117 | 3 | 1 | 2290 | Yes | 5 | 4 | 399000 |
118 | 1 | 2 | 1920 | No | 3 | 2 | 235600 |
119 | 1 | 3 | 1950 | Yes | 3 | 2 | 300400 |
120 | 1 | 4 | 1920 | No | 2 | 2 | 219400 |
121 | 1 | 3 | 1930 | No | 2 | 3 | 220800 |
122 | 2 | 3 | 1930 | No | 3 | 3 | 211200 |
123 | 2 | 1 | 2060 | Yes | 2 | 2 | 289600 |
124 | 2 | 3 | 1900 | Yes | 3 | 3 | 239400 |
125 | 2 | 3 | 2160 | Yes | 4 | 3 | 295800 |
126 | 1 | 2 | 2070 | No | 2 | 2 | 227000 |
127 | 3 | 1 | 2020 | No | 3 | 3 | 299800 |
128 | 1 | 4 | 2250 | No | 3 | 3 | 249200 |
using excel>data analysis>regression
independent variable =sq ft
dependent varible = Price
Regression Statistics | ||||||
Multiple R | 0.5530 | |||||
R Square | 0.3058 | |||||
Adjusted R Square | 0.3003 | |||||
Standard Error | 44951.0673 | |||||
Observations | 128 | |||||
ANOVA | ||||||
df | SS | MS | F | Significance F | ||
Regression | 1 | 112145452218.5470 | 112145452218.5470 | 55.5011 | 0.0000 | |
Residual | 126 | 254595404968.9530 | 2020598452.1346 | |||
Total | 127 | 366740857187.5000 | ||||
Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | |
Intercept | -20182.2598 | 37932.2080 | -0.5321 | 0.5956 | -95248.9843 | 54884.4647 |
Sq Ft | 140.4526 | 18.8529 | 7.4499 | 0.0000 | 103.1432 | 177.7620 |
for every one unit increase in sq ft there is a $140.4526 increase in price.
since the p-value of f statistic is less than 0.05 so the relationship between these variables statistically significant