In: Statistics and Probability
The data below id from 128 recent sales in Mid City. For each sale, it shows the neighborhood (1, 2, or 3) of the house, # of offers made on the house, sq. footage, whether house is primarily made of brick, # of bathrooms, # of bedrooms, & selling price. Neghborhoods 1 & 2 are traditional, 3 is newer/more prestigious. Use regression to estimate & interpret the pricing structure of houses in Mid City.
Q1. Predict Price (dependent variable) as a function of Nbhd (independent variable). Is the relationship statistically significant i.e. can we reject the null hypothesis that the mean price is the same across all neighborhoods? YES or NO
| Home | Nbhd | Offers | Sq Ft | Brick | Bedrooms | Bathrooms | Price | 
| 1 | 2 | 2 | 1790 | No | 2 | 2 | 228600 | 
| 2 | 2 | 3 | 2030 | No | 4 | 2 | 228400 | 
| 3 | 2 | 1 | 1740 | No | 3 | 2 | 229600 | 
| 4 | 2 | 3 | 1980 | No | 3 | 2 | 189400 | 
| 5 | 2 | 3 | 2130 | No | 3 | 3 | 239600 | 
| 6 | 1 | 2 | 1780 | No | 3 | 2 | 229200 | 
| 7 | 3 | 3 | 1830 | Yes | 3 | 3 | 303200 | 
| 8 | 3 | 2 | 2160 | No | 4 | 2 | 301400 | 
| 9 | 2 | 3 | 2110 | No | 4 | 2 | 238400 | 
| 10 | 2 | 3 | 1730 | No | 3 | 3 | 208000 | 
| 11 | 2 | 3 | 2030 | Yes | 3 | 2 | 265000 | 
| 12 | 2 | 2 | 1870 | Yes | 2 | 2 | 246000 | 
| 13 | 1 | 4 | 1910 | No | 3 | 2 | 205200 | 
| 14 | 1 | 5 | 2150 | Yes | 3 | 3 | 252600 | 
| 15 | 3 | 4 | 2590 | No | 4 | 3 | 353600 | 
| 16 | 3 | 1 | 1780 | No | 4 | 2 | 291600 | 
| 17 | 2 | 4 | 2190 | Yes | 3 | 3 | 294200 | 
| 18 | 1 | 4 | 1990 | No | 3 | 3 | 167200 | 
| 19 | 2 | 1 | 1700 | Yes | 2 | 2 | 222800 | 
| 20 | 3 | 2 | 1920 | Yes | 3 | 3 | 334400 | 
| 21 | 2 | 3 | 1790 | No | 3 | 2 | 232400 | 
| 22 | 1 | 4 | 2000 | No | 3 | 2 | 227600 | 
| 23 | 1 | 3 | 1690 | No | 3 | 2 | 183400 | 
| 24 | 1 | 3 | 1820 | Yes | 3 | 2 | 212200 | 
| 25 | 2 | 2 | 2210 | Yes | 4 | 3 | 312800 | 
| 26 | 1 | 3 | 2290 | No | 4 | 3 | 298600 | 
| 27 | 3 | 3 | 2000 | No | 4 | 2 | 274000 | 
| 28 | 2 | 2 | 1700 | No | 3 | 2 | 198600 | 
| 29 | 1 | 3 | 1600 | No | 2 | 2 | 138200 | 
| 30 | 3 | 1 | 2040 | Yes | 4 | 3 | 376000 | 
| 31 | 3 | 3 | 2250 | Yes | 4 | 3 | 364000 | 
| 32 | 1 | 2 | 1930 | Yes | 2 | 2 | 224600 | 
| 33 | 2 | 3 | 2250 | Yes | 3 | 3 | 270000 | 
| 34 | 2 | 4 | 2280 | Yes | 5 | 3 | 279200 | 
| 35 | 1 | 3 | 2000 | No | 2 | 2 | 235600 | 
| 36 | 1 | 3 | 2080 | No | 3 | 3 | 234200 | 
| 37 | 1 | 2 | 1880 | No | 2 | 2 | 235000 | 
| 38 | 3 | 4 | 2420 | No | 4 | 3 | 294000 | 
| 39 | 3 | 1 | 1720 | No | 3 | 2 | 262600 | 
| 40 | 1 | 2 | 1740 | No | 3 | 2 | 216400 | 
| 41 | 2 | 1 | 1560 | No | 2 | 2 | 213200 | 
| 42 | 3 | 2 | 1840 | No | 4 | 3 | 267200 | 
| 43 | 2 | 3 | 1990 | No | 2 | 2 | 211200 | 
| 44 | 2 | 1 | 1920 | Yes | 3 | 2 | 308000 | 
| 45 | 3 | 2 | 1940 | Yes | 3 | 3 | 333000 | 
| 46 | 2 | 3 | 1810 | No | 3 | 2 | 206400 | 
| 47 | 1 | 2 | 1990 | No | 2 | 3 | 259600 | 
| 48 | 1 | 6 | 2050 | No | 3 | 2 | 180600 | 
| 49 | 2 | 2 | 1980 | No | 2 | 2 | 231800 | 
| 50 | 1 | 3 | 1700 | Yes | 3 | 2 | 215000 | 
| 51 | 2 | 3 | 2100 | Yes | 3 | 2 | 302200 | 
| 52 | 1 | 3 | 1860 | No | 2 | 2 | 182200 | 
| 53 | 1 | 4 | 2150 | No | 2 | 3 | 234800 | 
| 54 | 1 | 3 | 2100 | No | 3 | 2 | 261600 | 
| 55 | 1 | 3 | 1650 | No | 3 | 2 | 162600 | 
| 56 | 2 | 2 | 1720 | Yes | 2 | 2 | 251400 | 
| 57 | 2 | 3 | 2190 | Yes | 3 | 2 | 281800 | 
| 58 | 3 | 3 | 2240 | No | 4 | 3 | 304600 | 
| 59 | 3 | 1 | 1840 | No | 3 | 3 | 276200 | 
| 60 | 3 | 1 | 2090 | No | 4 | 2 | 310800 | 
| 61 | 3 | 1 | 2200 | No | 3 | 3 | 361800 | 
| 62 | 1 | 2 | 1610 | No | 2 | 2 | 201800 | 
| 63 | 3 | 2 | 2220 | No | 4 | 3 | 322600 | 
| 64 | 2 | 2 | 1910 | No | 2 | 3 | 241000 | 
| 65 | 3 | 2 | 1860 | No | 3 | 2 | 260600 | 
| 66 | 1 | 1 | 1450 | Yes | 2 | 2 | 222200 | 
| 67 | 1 | 4 | 2210 | No | 3 | 3 | 252400 | 
| 68 | 2 | 3 | 2040 | No | 4 | 3 | 303800 | 
| 69 | 1 | 4 | 2140 | No | 3 | 2 | 187200 | 
| 70 | 3 | 3 | 2080 | No | 4 | 3 | 331200 | 
| 71 | 3 | 3 | 1950 | Yes | 3 | 3 | 333400 | 
| 72 | 3 | 1 | 2160 | No | 4 | 2 | 315200 | 
| 73 | 1 | 3 | 1650 | No | 3 | 2 | 214600 | 
| 74 | 2 | 2 | 2040 | No | 3 | 3 | 251400 | 
| 75 | 3 | 3 | 2140 | No | 3 | 3 | 288400 | 
| 76 | 1 | 2 | 1900 | No | 2 | 2 | 213800 | 
| 77 | 3 | 2 | 1930 | No | 3 | 2 | 259600 | 
| 78 | 3 | 3 | 2280 | Yes | 4 | 3 | 353000 | 
| 79 | 1 | 3 | 2130 | No | 3 | 2 | 242600 | 
| 80 | 3 | 1 | 1780 | No | 4 | 2 | 287200 | 
| 81 | 2 | 4 | 2190 | Yes | 3 | 3 | 286800 | 
| 82 | 3 | 2 | 2140 | Yes | 4 | 3 | 368600 | 
| 83 | 3 | 1 | 2050 | Yes | 2 | 2 | 329600 | 
| 84 | 2 | 2 | 2410 | No | 3 | 3 | 295400 | 
| 85 | 1 | 3 | 1520 | No | 2 | 2 | 181000 | 
| 86 | 3 | 2 | 2250 | Yes | 4 | 3 | 376600 | 
| 87 | 1 | 4 | 1900 | No | 4 | 2 | 205400 | 
| 88 | 3 | 1 | 1880 | Yes | 3 | 3 | 345000 | 
| 89 | 1 | 2 | 1930 | No | 3 | 3 | 255400 | 
| 90 | 1 | 4 | 2010 | No | 2 | 2 | 195600 | 
| 91 | 3 | 2 | 1920 | No | 4 | 2 | 286200 | 
| 92 | 2 | 2 | 2150 | No | 3 | 2 | 233000 | 
| 93 | 3 | 2 | 2110 | No | 3 | 2 | 285200 | 
| 94 | 2 | 2 | 2080 | No | 3 | 3 | 314200 | 
| 95 | 3 | 3 | 2150 | Yes | 4 | 3 | 321200 | 
| 96 | 3 | 1 | 1970 | Yes | 2 | 2 | 305000 | 
| 97 | 2 | 3 | 2440 | No | 3 | 3 | 266600 | 
| 98 | 2 | 1 | 2000 | Yes | 2 | 2 | 253600 | 
| 99 | 3 | 1 | 2060 | No | 3 | 2 | 291000 | 
| 100 | 3 | 2 | 2080 | Yes | 3 | 3 | 342000 | 
| 101 | 1 | 5 | 2010 | No | 3 | 2 | 206400 | 
| 102 | 2 | 5 | 2260 | No | 3 | 3 | 246200 | 
| 103 | 2 | 4 | 2410 | No | 3 | 3 | 273600 | 
| 104 | 3 | 3 | 2440 | Yes | 4 | 3 | 422400 | 
| 105 | 2 | 4 | 1910 | No | 3 | 2 | 164600 | 
| 106 | 3 | 4 | 2530 | No | 4 | 3 | 293800 | 
| 107 | 1 | 4 | 2130 | No | 3 | 2 | 217000 | 
| 108 | 2 | 1 | 1890 | Yes | 3 | 2 | 268000 | 
| 109 | 2 | 3 | 1990 | Yes | 3 | 3 | 234000 | 
| 110 | 2 | 3 | 2110 | No | 3 | 2 | 217400 | 
| 111 | 1 | 1 | 1710 | No | 2 | 2 | 223200 | 
| 112 | 1 | 2 | 1740 | No | 2 | 2 | 229800 | 
| 113 | 2 | 2 | 1940 | Yes | 2 | 2 | 247200 | 
| 114 | 1 | 3 | 2000 | Yes | 3 | 2 | 231400 | 
| 115 | 2 | 2 | 2010 | No | 4 | 3 | 249000 | 
| 116 | 1 | 3 | 1900 | No | 3 | 3 | 205000 | 
| 117 | 3 | 1 | 2290 | Yes | 5 | 4 | 399000 | 
| 118 | 1 | 2 | 1920 | No | 3 | 2 | 235600 | 
| 119 | 1 | 3 | 1950 | Yes | 3 | 2 | 300400 | 
| 120 | 1 | 4 | 1920 | No | 2 | 2 | 219400 | 
| 121 | 1 | 3 | 1930 | No | 2 | 3 | 220800 | 
| 122 | 2 | 3 | 1930 | No | 3 | 3 | 211200 | 
| 123 | 2 | 1 | 2060 | Yes | 2 | 2 | 289600 | 
| 124 | 2 | 3 | 1900 | Yes | 3 | 3 | 239400 | 
| 125 | 2 | 3 | 2160 | Yes | 4 | 3 | 295800 | 
| 126 | 1 | 2 | 2070 | No | 2 | 2 | 227000 | 
| 127 | 3 | 1 | 2020 | No | 3 | 3 | 299800 | 
| 128 | 1 | 4 | 2250 | No | 3 | 3 | 249200 | 
let us consider Price (dependent variable) as a function of Nbhd (independent variable)
using excel>data analysis >regression analysis
| Regression Statistics | ||||||
| Multiple R | 0.7323 | |||||
| R Square | 0.5363 | |||||
| Adjusted R Square | 0.5326 | |||||
| Standard Error | 36737.2912 | |||||
| Observations | 128 | |||||
| ANOVA | ||||||
| df | SS | MS | F | Significance F | ||
| Regression | 1 | 196687658463.0920 | 196687658463.0920 | 145.7347 | 0.0000 | |
| Residual | 126 | 170053198724.4080 | 1349628561.3048 | |||
| Total | 127 | 366740857187.5000 | ||||
| Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | |
| Intercept | 165283.9702 | 8556.7453 | 19.3162 | 0.0000 | 148350.4227 | 182217.5176 | 
| Nbhd | 48737.2582 | 4037.1945 | 12.0721 | 0.0000 | 40747.7691 | 56726.7474 | 
since the p value of f statistic is less than 0.05 ,so the relationship is statistically significant.
yes we can reject the null hypothesis that the mean price is the same across all neighborhoods