In: Statistics and Probability
In this problem, we will use linear regression and residual analysis to study the relationship between square footage of a house and the home sales price.
(a) Go to the course webpage and under Datasets, download the CSV file “homes.csv” and follow the accompanying Minitab instructions. Copy and paste the Fitted Line Plots and the Residual Plots in a blank document. Print these out and attach them to your homework.
(b) Based on the fitted line and residual plots for the non-transformed Sales Price, which assumption for linear regression has been violated?
(c) After we have taken a log transform of the response, i.e. our response is now log(Sales Price), what happens? Compare the subsequent fitted line plot and the residual plots for the model with the log-transformed response with the fitted line plot and residual plots for the model without the transformation.
(d) Using the fitted equation in the Fitted Line Plot “LogSalePrice versus SqrFeet” and then an inverse transform 10logyi (should be yi hat), determine the expected mean sales price if a house has 3,500 square footage.
homes.csv
SqrFeet,SalePrice,LogSalePrice
3032,360,2.556302501
2058,340,2.531478917
1780,250,2.397940009
1638,205.5,2.312811826
2196,275.5,2.440121603
1966,248,2.394451681
2216,229.9,2.361538971
1597,150,2.176091259
1622,195,2.290034611
1976,160,2.204119983
2812,190,2.278753601
2791,559,2.747411808
3381,535,2.728353782
3459,525,2.720159303
2090,299.9,2.476976466
3232,527,2.721810615
1502,169.9,2.230193379
2747,335.25,2.525368787
2890,323.9,2.510410948
1825,200,2.301029996
1578,211,2.324282455
1763,212,2.326335861
2517,245,2.389166084
1872,140.4,2.147367108
3266,295,2.469822016
2020,170.9,2.232742063
2164,229,2.359835482
2080,218.5,2.339451441
2208,160,2.204119983
3048,259,2.413299764
1460,164.5,2.216165902
2540,280,2.447158031
2208,154,2.187520721
2560,272,2.434568904
2061,180,2.255272505
1980,157.5,2.197280558
3308,242.5,2.384711743
2616,182,2.260071388
1460,178,2.250420002
1580,171.9,2.235275877
1460,165.5,2.218797998
1540,183.5,2.263636069
1388,135,2.130333768
1624,175,2.243038049
1580,149.5,2.174641193
1820,177.5,2.249198357
1733,155,2.190331698
1896,145,2.161368002
2038,178,2.250420002
1436,156,2.193124598
1690,159,2.201397124
1496,160,2.204119983
1668,112,2.049218023
980,84,1.924279286
2562,155,2.190331698
2304,360,2.556302501
1268,104,2.017033339
2283,420,2.62324929
2060,355,2.550228353
2087,165,2.217483944
2081,244,2.387389826
1696,179.9,2.255031163
2222,253,2.403120521
2110,200,2.301029996
1774,200,2.301029996
1592,147.7,2.169380495
1748,188.7,2.2757719
1985,177,2.247973266
2558,585,2.767155866
4000,549.9,2.74028372
3942,675,2.829303773
3889,830,2.919078092
3857,920,2.963787827
4756,855,2.931966115
3302,585.5,2.767526899
2629,399,2.600972896
4418,790,2.897627091
4746,665,2.822821645
3242,725,2.860338007
2464,647,2.810904281
4419,780,2.892094603
3877,657.5,2.817895757
3808,578,2.761927838
3376,500,2.698970004
2940,484.53,2.685320672
2654,609,2.784617293
2672,635,2.802773725
3386,519,2.715167358
3648,625.1,2.795949499
3455,585.444,2.767485359
3321,399.9,2.601951404
2817,389.9,2.590953235
3210,649,2.812244697
3588,535,2.728353782
2705,640,2.806179974
2344,600,2.77815125
4264,582.5,2.76529593
2615,545,2.736396502
3608,480,2.681241237
2479,595,2.774516966
3251,610,2.785329835
2547,570,2.755874856
5032,479,2.680335513
4973,545,2.736396502
2582,335,2.525044807
3139,629,2.798650645
3516,505.5,2.70372116
2936,370,2.568201724
3351,529,2.723455672
3136,490,2.69019608
3525,535,2.728353782
3917,612,2.786751422
2702,442.5,2.645913275
3644,500,2.698970004
3072,539,2.731588765
3233,545.5,2.736794755
2918,424,2.627365857
3266,325,2.511883361
2940,367,2.564666064
3430,470,2.672097858
2472,393,2.59439255
2878,530,2.72427587
2537,400,2.602059991
3858,403.5,2.605843539
2742,550,2.740362689
3460,380,2.579783597
3836,500,2.698970004
4453,465,2.667452953
4080,451.5,2.654657755
3301,336,2.526339277
3828,550,2.740362689
2973,450,2.653212514
2821,440,2.643452676
2950,515,2.711807229
2362,415,2.618048097
3092,380,2.579783597
2866,489.5,2.689752696
3369,478,2.679427897
3068,460,2.662757832
2380,379.9,2.579669294
2225,390,2.591064607
2655,338,2.5289167
3855,675,2.829303773
2892,440,2.643452676
3832,500,2.698970004
3164,470,2.672097858
2620,317.5,2.50174373
3076,430,2.633468456
4022,430,2.633468456
3377,475,2.67669361
2858,389,2.589949601
3540,400,2.602059991
3045,395,2.596597096
4150,395,2.596597096
1778,296,2.471291711
2412,387.5,2.588271707
2668,353,2.547774705
2274,350,2.544068044
2936,437.632,2.64110907
2526,447.5,2.65079304
2449,318.5,2.503109437
3131,352,2.546542663
2536,295,2.469822016
3314,313.5,2.496237545
2230,330,2.51851394
2631,400,2.602059991
2638,325,2.511883361
2756,340,2.531478917
3262,399.9,2.601951404
1936,249.9,2.397766256
3148,389,2.589949601
2616,364.5,2.561697533
3630,357.5,2.553276046
1954,295,2.469822016
1926,274.5,2.438542349
2556,259,2.413299764
2282,415,2.618048097
3314,443,2.646403726
2001,249,2.396199347
2607,330,2.51851394
2840,291,2.463892989
3036,418,2.621176282
2240,320,2.505149978
1788,264,2.421603927
2620,381,2.580924976
1480,250,2.397940009
2588,360,2.556302501
3138,369.5,2.567614443
2460,285.4,2.455453969
3566,409,2.611723308
2692,333,2.522444234
2958,362,2.558708571
3164,387.5,2.588271707
2058,239,2.378397901
2717,299.9,2.476976466
2920,335,2.525044807
2554,275,2.439332694
2805,328,2.515873844
2736,333,2.522444234
3516,397,2.598790507
3536,374.8,2.573799582
2138,520,2.716003344
2718,325,2.511883361
2178,295,2.469822016
3152,415,2.618048097
2611,224.9,2.351989455
2060,265,2.423245874
2448,299.9,2.476976466
4050,390,2.591064607
2414,271,2.432969291
3072,330,2.51851394
2525,350,2.544068044
2866,310,2.491361694
3246,340,2.531478917
2707,307,2.487138375
2300,304,2.482873584
1860,275.9,2.4407517
3636,315,2.498310554
1910,295,2.469822016
2280,251.01,2.399691024
3386,335,2.525044807
2324,343.5,2.535926741
1970,297,2.472756449
2062,281,2.44870632
2617,235,2.371067862
2612,237,2.374748346
2472,274.9,2.43917474
1922,229.9,2.361538971
1852,259,2.413299764
2239,245,2.389166084
2068,208,2.318063335
2101,421,2.624282096
2200,320,2.505149978
1972,256,2.408239965
2007,275,2.439332694
2612,222,2.346352974
2124,249.9,2.397766256
2612,273.5,2.436957331
2548,218.5,2.339451441
2767,377,2.57634135
2025,220,2.342422681
1956,192.9,2.285332228
2460,298.75,2.475307914
2764,315,2.498310554
2004,315,2.498310554
2852,465.9,2.66829271
2096,239.5,2.379305518
2330,276,2.440909082
2520,226,2.354108439
2528,235,2.371067862
2030,247,2.392696953
2208,182,2.260071388
1500,180,2.255272505
2078,249,2.396199347
2442,260,2.414973348
2612,219.9,2.342225229
2268,295,2.469822016
2734,290,2.462397998
2228,300,2.477121255
3000,354.9,2.550105999
2404,261,2.416640507
1840,174.5,2.241795431
2107,244.9,2.388988785
2788,274.9,2.43917474
2416,214,2.330413773
2430,283,2.451786436
1584,177.9,2.250175948
1873,237.5,2.375663614
1644,202.15,2.305673746
2400,235,2.371067862
2094,217,2.336459734
2561,285,2.45484486
1752,217.5,2.337459261
1738,210,2.322219295
2068,183.34,2.263257227
2428,252,2.401400541
2090,237,2.374748346
1820,205,2.311753861
3219,285,2.45484486
2654,210,2.322219295
1802,280,2.447158031
1765,207,2.315970345
2786,221,2.344392274
1794,257,2.409933123
2768,274,2.437750563
2288,262,2.418301291
2028,204.4,2.310480891
2620,254.9,2.406369835
1644,244,2.387389826
1888,213,2.328379603
2116,240,2.380211242
2313,235,2.371067862
1824,206,2.31386722
1942,237,2.374748346
2184,274,2.437750563
2578,275,2.439332694
2036,218.4,2.339252634
1384,156,2.193124598
1826,220,2.342422681
1681,171.5,2.234264124
1726,180,2.255272505
1910,204,2.309630167
2664,307,2.487138375
2116,265,2.423245874
2030,209.9,2.322012439
1940,173,2.238046103
1676,189,2.276461804
2120,222.5,2.347330015
2152,265,2.423245874
1984,264.67,2.422704717
1575,200.75,2.302655554
1798,227.9,2.357744325
2017,255,2.40654018
1904,208.5,2.319106059
1718,226.9,2.355834496
1776,215,2.33243846
2609,222.95,2.348207477
2226,239.9,2.380030248
1556,176,2.245512668
1764,228,2.357934847
1626,204.9,2.311541958
2012,258,2.411619706
2090,241.85,2.383546092
2192,198.5,2.297760511
2228,243,2.385606274
1825,187,2.271841607
2132,233,2.367355921
2160,205,2.311753861
1974,205,2.311753861
1696,189,2.276461804
2132,204.9,2.311541958
1814,239,2.378397901
1796,193,2.285557309
2268,260,2.414973348
1719,188,2.274157849
1704,190.5,2.27989498
2142,230,2.361727836
1705,240,2.380211242
1752,235,2.371067862
2554,275,2.439332694
1650,205,2.311753861
2816,280,2.447158031
1919,190,2.278753601
2080,232.5,2.366422957
2108,259.5,2.414137362
2480,275,2.439332694
1746,183.9,2.264581729
2703,290,2.462397998
1640,217.95,2.338356873
2196,220,2.342422681
1701,185,2.267171728
2250,288,2.459392488
2502,197.5,2.2955671
1762,179.975,2.255212182
2016,195,2.290034611
1904,228.4,2.3586961
1652,194.75,2.289477466
2042,195,2.290034611
2019,210,2.322219295
2791,239.55,2.379396175
2514,242,2.383815366
1746,185,2.267171728
1930,175,2.243038049
1552,165,2.217483944
1566,185,2.267171728
1669,173.194,2.238532843
1814,205.15,2.312071521
1794,214.2,2.330819466
1691,182.5,2.261262869
1834,205,2.311753861
1984,208,2.318063335
1966,225,2.352182518
1669,170,2.230448921
2132,216,2.334453751
2007,180,2.255272505
1964,169.2,2.228400359
2325,213,2.328379603
2196,210,2.322219295
2061,185,2.267171728
1828,179.9,2.255031163
1956,196,2.292256071
1852,219.9,2.342225229
1795,159.9,2.203848464
1580,170,2.230448921
1708,169.9,2.230193379
1700,189.5,2.277609214
1742,195,2.290034611
1890,215,2.33243846
1512,171,2.23299611
1840,179.9,2.255031163
1060,120,2.079181246
1540,175,2.243038049
1550,232.9,2.367169489
2787,229.9,2.361538971
1528,174.9,2.242789809
1928,168.9,2.22762965
2329,229.5,2.36078269
1940,236,2.372912003
2114,205.5,2.312811826
1799,212,2.326335861
1864,205,2.311753861
1581,193,2.285557309
1652,180,2.255272505
1592,184,2.264817823
1520,144.9,2.161068385
1792,255,2.40654018
1464,137,2.136720567
1702,178,2.250420002
2180,296,2.471291711
1486,186.5,2.270678836
1364,170,2.230448921
1540,219,2.340444115
1608,188,2.274157849
1668,195.25,2.290591043
1944,175,2.243038049
1883,215,2.33243846
2215,197.5,2.2955671
1916,249.9,2.397766256
1508,180,2.255272505
1809,174.9,2.242789809
1958,189.9,2.278524965
1592,154,2.187520721
1636,150,2.176091259
1800,189.9,2.278524965
1600,157,2.195899652
1550,182,2.260071388
1578,187.65,2.273348569
1644,175,2.243038049
1556,189.9,2.278524965
1672,175,2.243038049
1650,159.9,2.203848464
1676,184.9,2.266936911
1960,174.9,2.242789809
1649,143,2.155336037
1728,164.9,2.217220656
1586,173.5,2.239299479
1592,161.8,2.208978517
1514,148,2.170261715
1952,177,2.247973266
1550,149.9,2.175801633
1544,170,2.230448921
1566,142,2.152288344
1650,186.9,2.271609301
1392,152.9,2.184407485
2981,350,2.544068044
1412,130,2.113943352
2180,167.9,2.225050696
1704,184.9,2.266936911
1600,178,2.250420002
1276,111,2.045322979
1666,207,2.315970345
1760,190,2.278753601
1836,230,2.361727836
1636,165,2.217483944
1748,210,2.322219295
2556,226,2.354108439
1511,149.9,2.175801633
1524,155,2.190331698
1821,219.9,2.342225229
1596,132,2.120573931
2392,195,2.290034611
1748,155.9,2.192846115
1384,119.9,2.078819183
1628,175,2.243038049
1911,304,2.482873584
1624,190,2.278753601
1956,229.1,2.360025089
2012,187.5,2.273001272
1590,173,2.238046103
1687,170,2.230448921
1816,179,2.252853031
1622,162.5,2.210853365
1604,172,2.235528447
1592,153.2,2.185258765
1922,220,2.342422681
1892,174,2.240549248
1628,200,2.301029996
1644,161,2.206825876
1450,135,2.130333768
1592,190,2.278753601
1654,153.8,2.186956335
1388,144.9,2.161068385
1670,165,2.217483944
1953,186,2.269512944
2008,167,2.222716471
1650,189.6,2.277838333
1578,170,2.230448921
1618,189.5,2.277609214
1642,153.5,2.18610838
2008,159.9,2.203848464
1604,158,2.198657087
2035,175,2.243038049
1534,147,2.167317335
1624,155,2.190331698
1700,150,2.176091259
1630,165,2.217483944
1526,147,2.167317335
1672,146.25,2.165095875
1588,177.5,2.249198357
1752,153.65,2.186532565
1674,199.5,2.2999429
1980,186,2.269512944
1396,139.9,2.145817714
1178,160,2.204119983
1263,125,2.096910013
2377,359.9,2.556181847
1304,184.5,2.26599637
1340,155,2.190331698
1559,150,2.176091259
1412,146,2.164352856
1198,129,2.11058971
1424,145,2.161368002
1370,200,2.301029996
1584,149.9,2.175801633
1567,132,2.120573931
1409,136.9,2.136403448
1655,137,2.136720567
1944,185,2.267171728
1922,133.5,2.125481266
1480,124,2.093421685
1184,95.5,1.980003372
Result:
In this problem, we will use linear regression and residual analysis to study the relationship between square footage of a house and the home sales price.
(a) Go to the course webpage and under Datasets, download the CSV file “homes.csv” and follow the accompanying Minitab instructions. Copy and paste the Fitted Line Plots and the Residual Plots in a blank document. Print these out and attach them to your homework.
Regression Analysis: SalePrice versus SqrFeet
Analysis of Variance
Source |
DF |
Adj SS |
Adj MS |
F-Value |
P-Value |
Regression |
1 |
6666447 |
6666447 |
1087.53 |
0.000 |
SqrFeet |
1 |
6666447 |
6666447 |
1087.53 |
0.000 |
Error |
519 |
3181416 |
6130 |
||
Lack-of-Fit |
412 |
2848387 |
6914 |
2.22 |
0.000 |
Pure Error |
107 |
333029 |
3112 |
||
Total |
520 |
9847862 |
Model Summary
S |
R-sq |
R-sq(adj) |
R-sq(pred) |
78.2936 |
67.69% |
67.63% |
67.31% |
Coefficients
Term |
Coef |
SE Coef |
T-Value |
P-Value |
VIF |
Constant |
-82.3 |
11.4 |
-7.19 |
0.000 |
|
SqrFeet |
0.15909 |
0.00482 |
32.98 |
0.000 |
1.00 |
Regression Equation
SalePrice |
= |
-82.3 + 0.15909 SqrFeet |
(b) Based on the fitted line and residual plots for the non-transformed Sales Price, which assumption for linear regression has been violated?
The residual plots shows that normality and equality assumptions of regressions are violated.
(c) After we have taken a log transform of the response, i.e. our response is now log(Sales Price), what happens? Compare the subsequent fitted line plot and the residual plots for the model with the log-transformed response with the fitted line plot and residual plots for the model without the transformation.
Regression Analysis: LogSalePrice versus SqrFeet
Analysis of Variance
Source |
DF |
Adj SS |
Adj MS |
F-Value |
P-Value |
Regression |
1 |
12.9274 |
12.9274 |
1270.91 |
0.000 |
SqrFeet |
1 |
12.9274 |
12.9274 |
1270.91 |
0.000 |
Error |
519 |
5.2791 |
0.0102 |
||
Lack-of-Fit |
412 |
4.5368 |
0.0110 |
1.59 |
0.002 |
Pure Error |
107 |
0.7423 |
0.0069 |
||
Total |
520 |
18.2065 |
Model Summary
S |
R-sq |
R-sq(adj) |
R-sq(pred) |
0.100855 |
71.00% |
70.95% |
70.72% |
Coefficients
Term |
Coef |
SE Coef |
T-Value |
P-Value |
VIF |
Constant |
1.8988 |
0.0147 |
128.93 |
0.000 |
|
SqrFeet |
0.000222 |
0.000006 |
35.65 |
0.000 |
1.00 |
Regression Equation
LogSalePrice |
= |
1.8988 + 0.000222 SqrFeet |
Now the residual plots shows that there is no violation of assumptions.
(d) Using the fitted equation in the Fitted Line Plot “LogSalePrice versus SqrFeet” and then an inverse transform 10logyi (should be yi hat), determine the expected mean sales price if a house has 3,500 square footage.
When square feet is 3500,
Predicted LogSalePrice=1.8988 + 0.000222*3500
=2.6758
expected mean sales price = 10 2.6758
=474.0236