Question

In: Statistics and Probability

In this problem, we will use linear regression and residual analysis to study the relationship between...

In this problem, we will use linear regression and residual analysis to study the relationship between square footage of a house and the home sales price.

(a) Go to the course webpage and under Datasets, download the CSV file “homes.csv” and follow the accompanying Minitab instructions. Copy and paste the Fitted Line Plots and the Residual Plots in a blank document. Print these out and attach them to your homework.

(b) Based on the fitted line and residual plots for the non-transformed Sales Price, which assumption for linear regression has been violated?

(c) After we have taken a log transform of the response, i.e. our response is now log(Sales Price), what happens? Compare the subsequent fitted line plot and the residual plots for the model with the log-transformed response with the fitted line plot and residual plots for the model without the transformation.

(d) Using the fitted equation in the Fitted Line Plot “LogSalePrice versus SqrFeet” and then an inverse transform 10logyi (should be yi hat), determine the expected mean sales price if a house has 3,500 square footage.

homes.csv

SqrFeet,SalePrice,LogSalePrice

3032,360,2.556302501

2058,340,2.531478917

1780,250,2.397940009

1638,205.5,2.312811826

2196,275.5,2.440121603

1966,248,2.394451681

2216,229.9,2.361538971

1597,150,2.176091259

1622,195,2.290034611

1976,160,2.204119983

2812,190,2.278753601

2791,559,2.747411808

3381,535,2.728353782

3459,525,2.720159303

2090,299.9,2.476976466

3232,527,2.721810615

1502,169.9,2.230193379

2747,335.25,2.525368787

2890,323.9,2.510410948

1825,200,2.301029996

1578,211,2.324282455

1763,212,2.326335861

2517,245,2.389166084

1872,140.4,2.147367108

3266,295,2.469822016

2020,170.9,2.232742063

2164,229,2.359835482

2080,218.5,2.339451441

2208,160,2.204119983

3048,259,2.413299764

1460,164.5,2.216165902

2540,280,2.447158031

2208,154,2.187520721

2560,272,2.434568904

2061,180,2.255272505

1980,157.5,2.197280558

3308,242.5,2.384711743

2616,182,2.260071388

1460,178,2.250420002

1580,171.9,2.235275877

1460,165.5,2.218797998

1540,183.5,2.263636069

1388,135,2.130333768

1624,175,2.243038049

1580,149.5,2.174641193

1820,177.5,2.249198357

1733,155,2.190331698

1896,145,2.161368002

2038,178,2.250420002

1436,156,2.193124598

1690,159,2.201397124

1496,160,2.204119983

1668,112,2.049218023

980,84,1.924279286

2562,155,2.190331698

2304,360,2.556302501

1268,104,2.017033339

2283,420,2.62324929

2060,355,2.550228353

2087,165,2.217483944

2081,244,2.387389826

1696,179.9,2.255031163

2222,253,2.403120521

2110,200,2.301029996

1774,200,2.301029996

1592,147.7,2.169380495

1748,188.7,2.2757719

1985,177,2.247973266

2558,585,2.767155866

4000,549.9,2.74028372

3942,675,2.829303773

3889,830,2.919078092

3857,920,2.963787827

4756,855,2.931966115

3302,585.5,2.767526899

2629,399,2.600972896

4418,790,2.897627091

4746,665,2.822821645

3242,725,2.860338007

2464,647,2.810904281

4419,780,2.892094603

3877,657.5,2.817895757

3808,578,2.761927838

3376,500,2.698970004

2940,484.53,2.685320672

2654,609,2.784617293

2672,635,2.802773725

3386,519,2.715167358

3648,625.1,2.795949499

3455,585.444,2.767485359

3321,399.9,2.601951404

2817,389.9,2.590953235

3210,649,2.812244697

3588,535,2.728353782

2705,640,2.806179974

2344,600,2.77815125

4264,582.5,2.76529593

2615,545,2.736396502

3608,480,2.681241237

2479,595,2.774516966

3251,610,2.785329835

2547,570,2.755874856

5032,479,2.680335513

4973,545,2.736396502

2582,335,2.525044807

3139,629,2.798650645

3516,505.5,2.70372116

2936,370,2.568201724

3351,529,2.723455672

3136,490,2.69019608

3525,535,2.728353782

3917,612,2.786751422

2702,442.5,2.645913275

3644,500,2.698970004

3072,539,2.731588765

3233,545.5,2.736794755

2918,424,2.627365857

3266,325,2.511883361

2940,367,2.564666064

3430,470,2.672097858

2472,393,2.59439255

2878,530,2.72427587

2537,400,2.602059991

3858,403.5,2.605843539

2742,550,2.740362689

3460,380,2.579783597

3836,500,2.698970004

4453,465,2.667452953

4080,451.5,2.654657755

3301,336,2.526339277

3828,550,2.740362689

2973,450,2.653212514

2821,440,2.643452676

2950,515,2.711807229

2362,415,2.618048097

3092,380,2.579783597

2866,489.5,2.689752696

3369,478,2.679427897

3068,460,2.662757832

2380,379.9,2.579669294

2225,390,2.591064607

2655,338,2.5289167

3855,675,2.829303773

2892,440,2.643452676

3832,500,2.698970004

3164,470,2.672097858

2620,317.5,2.50174373

3076,430,2.633468456

4022,430,2.633468456

3377,475,2.67669361

2858,389,2.589949601

3540,400,2.602059991

3045,395,2.596597096

4150,395,2.596597096

1778,296,2.471291711

2412,387.5,2.588271707

2668,353,2.547774705

2274,350,2.544068044

2936,437.632,2.64110907

2526,447.5,2.65079304

2449,318.5,2.503109437

3131,352,2.546542663

2536,295,2.469822016

3314,313.5,2.496237545

2230,330,2.51851394

2631,400,2.602059991

2638,325,2.511883361

2756,340,2.531478917

3262,399.9,2.601951404

1936,249.9,2.397766256

3148,389,2.589949601

2616,364.5,2.561697533

3630,357.5,2.553276046

1954,295,2.469822016

1926,274.5,2.438542349

2556,259,2.413299764

2282,415,2.618048097

3314,443,2.646403726

2001,249,2.396199347

2607,330,2.51851394

2840,291,2.463892989

3036,418,2.621176282

2240,320,2.505149978

1788,264,2.421603927

2620,381,2.580924976

1480,250,2.397940009

2588,360,2.556302501

3138,369.5,2.567614443

2460,285.4,2.455453969

3566,409,2.611723308

2692,333,2.522444234

2958,362,2.558708571

3164,387.5,2.588271707

2058,239,2.378397901

2717,299.9,2.476976466

2920,335,2.525044807

2554,275,2.439332694

2805,328,2.515873844

2736,333,2.522444234

3516,397,2.598790507

3536,374.8,2.573799582

2138,520,2.716003344

2718,325,2.511883361

2178,295,2.469822016

3152,415,2.618048097

2611,224.9,2.351989455

2060,265,2.423245874

2448,299.9,2.476976466

4050,390,2.591064607

2414,271,2.432969291

3072,330,2.51851394

2525,350,2.544068044

2866,310,2.491361694

3246,340,2.531478917

2707,307,2.487138375

2300,304,2.482873584

1860,275.9,2.4407517

3636,315,2.498310554

1910,295,2.469822016

2280,251.01,2.399691024

3386,335,2.525044807

2324,343.5,2.535926741

1970,297,2.472756449

2062,281,2.44870632

2617,235,2.371067862

2612,237,2.374748346

2472,274.9,2.43917474

1922,229.9,2.361538971

1852,259,2.413299764

2239,245,2.389166084

2068,208,2.318063335

2101,421,2.624282096

2200,320,2.505149978

1972,256,2.408239965

2007,275,2.439332694

2612,222,2.346352974

2124,249.9,2.397766256

2612,273.5,2.436957331

2548,218.5,2.339451441

2767,377,2.57634135

2025,220,2.342422681

1956,192.9,2.285332228

2460,298.75,2.475307914

2764,315,2.498310554

2004,315,2.498310554

2852,465.9,2.66829271

2096,239.5,2.379305518

2330,276,2.440909082

2520,226,2.354108439

2528,235,2.371067862

2030,247,2.392696953

2208,182,2.260071388

1500,180,2.255272505

2078,249,2.396199347

2442,260,2.414973348

2612,219.9,2.342225229

2268,295,2.469822016

2734,290,2.462397998

2228,300,2.477121255

3000,354.9,2.550105999

2404,261,2.416640507

1840,174.5,2.241795431

2107,244.9,2.388988785

2788,274.9,2.43917474

2416,214,2.330413773

2430,283,2.451786436

1584,177.9,2.250175948

1873,237.5,2.375663614

1644,202.15,2.305673746

2400,235,2.371067862

2094,217,2.336459734

2561,285,2.45484486

1752,217.5,2.337459261

1738,210,2.322219295

2068,183.34,2.263257227

2428,252,2.401400541

2090,237,2.374748346

1820,205,2.311753861

3219,285,2.45484486

2654,210,2.322219295

1802,280,2.447158031

1765,207,2.315970345

2786,221,2.344392274

1794,257,2.409933123

2768,274,2.437750563

2288,262,2.418301291

2028,204.4,2.310480891

2620,254.9,2.406369835

1644,244,2.387389826

1888,213,2.328379603

2116,240,2.380211242

2313,235,2.371067862

1824,206,2.31386722

1942,237,2.374748346

2184,274,2.437750563

2578,275,2.439332694

2036,218.4,2.339252634

1384,156,2.193124598

1826,220,2.342422681

1681,171.5,2.234264124

1726,180,2.255272505

1910,204,2.309630167

2664,307,2.487138375

2116,265,2.423245874

2030,209.9,2.322012439

1940,173,2.238046103

1676,189,2.276461804

2120,222.5,2.347330015

2152,265,2.423245874

1984,264.67,2.422704717

1575,200.75,2.302655554

1798,227.9,2.357744325

2017,255,2.40654018

1904,208.5,2.319106059

1718,226.9,2.355834496

1776,215,2.33243846

2609,222.95,2.348207477

2226,239.9,2.380030248

1556,176,2.245512668

1764,228,2.357934847

1626,204.9,2.311541958

2012,258,2.411619706

2090,241.85,2.383546092

2192,198.5,2.297760511

2228,243,2.385606274

1825,187,2.271841607

2132,233,2.367355921

2160,205,2.311753861

1974,205,2.311753861

1696,189,2.276461804

2132,204.9,2.311541958

1814,239,2.378397901

1796,193,2.285557309

2268,260,2.414973348

1719,188,2.274157849

1704,190.5,2.27989498

2142,230,2.361727836

1705,240,2.380211242

1752,235,2.371067862

2554,275,2.439332694

1650,205,2.311753861

2816,280,2.447158031

1919,190,2.278753601

2080,232.5,2.366422957

2108,259.5,2.414137362

2480,275,2.439332694

1746,183.9,2.264581729

2703,290,2.462397998

1640,217.95,2.338356873

2196,220,2.342422681

1701,185,2.267171728

2250,288,2.459392488

2502,197.5,2.2955671

1762,179.975,2.255212182

2016,195,2.290034611

1904,228.4,2.3586961

1652,194.75,2.289477466

2042,195,2.290034611

2019,210,2.322219295

2791,239.55,2.379396175

2514,242,2.383815366

1746,185,2.267171728

1930,175,2.243038049

1552,165,2.217483944

1566,185,2.267171728

1669,173.194,2.238532843

1814,205.15,2.312071521

1794,214.2,2.330819466

1691,182.5,2.261262869

1834,205,2.311753861

1984,208,2.318063335

1966,225,2.352182518

1669,170,2.230448921

2132,216,2.334453751

2007,180,2.255272505

1964,169.2,2.228400359

2325,213,2.328379603

2196,210,2.322219295

2061,185,2.267171728

1828,179.9,2.255031163

1956,196,2.292256071

1852,219.9,2.342225229

1795,159.9,2.203848464

1580,170,2.230448921

1708,169.9,2.230193379

1700,189.5,2.277609214

1742,195,2.290034611

1890,215,2.33243846

1512,171,2.23299611

1840,179.9,2.255031163

1060,120,2.079181246

1540,175,2.243038049

1550,232.9,2.367169489

2787,229.9,2.361538971

1528,174.9,2.242789809

1928,168.9,2.22762965

2329,229.5,2.36078269

1940,236,2.372912003

2114,205.5,2.312811826

1799,212,2.326335861

1864,205,2.311753861

1581,193,2.285557309

1652,180,2.255272505

1592,184,2.264817823

1520,144.9,2.161068385

1792,255,2.40654018

1464,137,2.136720567

1702,178,2.250420002

2180,296,2.471291711

1486,186.5,2.270678836

1364,170,2.230448921

1540,219,2.340444115

1608,188,2.274157849

1668,195.25,2.290591043

1944,175,2.243038049

1883,215,2.33243846

2215,197.5,2.2955671

1916,249.9,2.397766256

1508,180,2.255272505

1809,174.9,2.242789809

1958,189.9,2.278524965

1592,154,2.187520721

1636,150,2.176091259

1800,189.9,2.278524965

1600,157,2.195899652

1550,182,2.260071388

1578,187.65,2.273348569

1644,175,2.243038049

1556,189.9,2.278524965

1672,175,2.243038049

1650,159.9,2.203848464

1676,184.9,2.266936911

1960,174.9,2.242789809

1649,143,2.155336037

1728,164.9,2.217220656

1586,173.5,2.239299479

1592,161.8,2.208978517

1514,148,2.170261715

1952,177,2.247973266

1550,149.9,2.175801633

1544,170,2.230448921

1566,142,2.152288344

1650,186.9,2.271609301

1392,152.9,2.184407485

2981,350,2.544068044

1412,130,2.113943352

2180,167.9,2.225050696

1704,184.9,2.266936911

1600,178,2.250420002

1276,111,2.045322979

1666,207,2.315970345

1760,190,2.278753601

1836,230,2.361727836

1636,165,2.217483944

1748,210,2.322219295

2556,226,2.354108439

1511,149.9,2.175801633

1524,155,2.190331698

1821,219.9,2.342225229

1596,132,2.120573931

2392,195,2.290034611

1748,155.9,2.192846115

1384,119.9,2.078819183

1628,175,2.243038049

1911,304,2.482873584

1624,190,2.278753601

1956,229.1,2.360025089

2012,187.5,2.273001272

1590,173,2.238046103

1687,170,2.230448921

1816,179,2.252853031

1622,162.5,2.210853365

1604,172,2.235528447

1592,153.2,2.185258765

1922,220,2.342422681

1892,174,2.240549248

1628,200,2.301029996

1644,161,2.206825876

1450,135,2.130333768

1592,190,2.278753601

1654,153.8,2.186956335

1388,144.9,2.161068385

1670,165,2.217483944

1953,186,2.269512944

2008,167,2.222716471

1650,189.6,2.277838333

1578,170,2.230448921

1618,189.5,2.277609214

1642,153.5,2.18610838

2008,159.9,2.203848464

1604,158,2.198657087

2035,175,2.243038049

1534,147,2.167317335

1624,155,2.190331698

1700,150,2.176091259

1630,165,2.217483944

1526,147,2.167317335

1672,146.25,2.165095875

1588,177.5,2.249198357

1752,153.65,2.186532565

1674,199.5,2.2999429

1980,186,2.269512944

1396,139.9,2.145817714

1178,160,2.204119983

1263,125,2.096910013

2377,359.9,2.556181847

1304,184.5,2.26599637

1340,155,2.190331698

1559,150,2.176091259

1412,146,2.164352856

1198,129,2.11058971

1424,145,2.161368002

1370,200,2.301029996

1584,149.9,2.175801633

1567,132,2.120573931

1409,136.9,2.136403448

1655,137,2.136720567

1944,185,2.267171728

1922,133.5,2.125481266

1480,124,2.093421685

1184,95.5,1.980003372

Solutions

Expert Solution

Result:

In this problem, we will use linear regression and residual analysis to study the relationship between square footage of a house and the home sales price.

(a) Go to the course webpage and under Datasets, download the CSV file “homes.csv” and follow the accompanying Minitab instructions. Copy and paste the Fitted Line Plots and the Residual Plots in a blank document. Print these out and attach them to your homework.

Regression Analysis: SalePrice versus SqrFeet

Analysis of Variance

Source

DF

Adj SS

Adj MS

F-Value

P-Value

Regression

1

6666447

6666447

1087.53

0.000

SqrFeet

1

6666447

6666447

1087.53

0.000

Error

519

3181416

6130

Lack-of-Fit

412

2848387

6914

2.22

0.000

Pure Error

107

333029

3112

Total

520

9847862

Model Summary

S

R-sq

R-sq(adj)

R-sq(pred)

78.2936

67.69%

67.63%

67.31%

Coefficients

Term

Coef

SE Coef

T-Value

P-Value

VIF

Constant

-82.3

11.4

-7.19

0.000

SqrFeet

0.15909

0.00482

32.98

0.000

1.00

Regression Equation

SalePrice

=

-82.3 + 0.15909 SqrFeet

(b) Based on the fitted line and residual plots for the non-transformed Sales Price, which assumption for linear regression has been violated?

The residual plots shows that normality and equality assumptions of regressions are violated.

(c) After we have taken a log transform of the response, i.e. our response is now log(Sales Price), what happens? Compare the subsequent fitted line plot and the residual plots for the model with the log-transformed response with the fitted line plot and residual plots for the model without the transformation.

Regression Analysis: LogSalePrice versus SqrFeet

Analysis of Variance

Source

DF

Adj SS

Adj MS

F-Value

P-Value

Regression

1

12.9274

12.9274

1270.91

0.000

SqrFeet

1

12.9274

12.9274

1270.91

0.000

Error

519

5.2791

0.0102

Lack-of-Fit

412

4.5368

0.0110

1.59

0.002

Pure Error

107

0.7423

0.0069

Total

520

18.2065

Model Summary

S

R-sq

R-sq(adj)

R-sq(pred)

0.100855

71.00%

70.95%

70.72%

Coefficients

Term

Coef

SE Coef

T-Value

P-Value

VIF

Constant

1.8988

0.0147

128.93

0.000

SqrFeet

0.000222

0.000006

35.65

0.000

1.00

Regression Equation

LogSalePrice

=

1.8988 + 0.000222 SqrFeet

Now the residual plots shows that there is no violation of assumptions.

(d) Using the fitted equation in the Fitted Line Plot “LogSalePrice versus SqrFeet” and then an inverse transform 10logyi (should be yi hat), determine the expected mean sales price if a house has 3,500 square footage.

When square feet is 3500,

Predicted LogSalePrice=1.8988 + 0.000222*3500

=2.6758

expected mean sales price = 10 2.6758

=474.0236


Related Solutions

What is the difference between a linear relationship and a curvilinear relationship in linear regression?
What is the difference between a linear relationship and a curvilinear relationship in linear regression?
From the table below, use a simple linear regression analysis to establish the relationship that may...
From the table below, use a simple linear regression analysis to establish the relationship that may exist between a) number of confirmed cases and deaths; b) number of confirmed cases and number of tests performed; c) number of confirmed cases and number of recoveries; and d) number of deaths and number of recoveries. Briefly discuss these relations. Date Total confirmed Death Recoveries Test 1-Apr 195 5 3 12046 2-Apr 204 5 3 12046 3-Apr 205 5 3 12046 4-Apr 214...
Finally, the researcher considers using regression analysis to establish a linear relationship between the two variables...
Finally, the researcher considers using regression analysis to establish a linear relationship between the two variables – hours worked per week and yearly income. a) What is the dependent variable and independent variable for this analysis? Why? b) Use an appropriate plot to investigate the relationship between the two variables. Display the plot. On the same plot, fit a linear trend line including the equation and the coefficient of determination R2 . c) Estimate a simple linear regression model and...
Finally, the researcher considers using regression analysis to establish a linear relationship between the two variables...
Finally, the researcher considers using regression analysis to establish a linear relationship between the two variables – hours worked per week and income earned per year. c) Estimate a simple linear regression model and present the estimated linear equation. Display the regression summary table and interpret the intercept and slope coefficient estimates of the linear model.                                                            Yearly Income ('000's) Hours Per Week 43.8 18 44.5 13 44.8 18 46.0 25.5 41.4 11.6 43.3 18 43.6 16 46.2 27 46.8...
Question No. 01: Linear Regression Analysis in SPSS Statistics a. Assume a case study to use...
Question No. 01: Linear Regression Analysis in SPSS Statistics a. Assume a case study to use simple linear regression for analysis and precisely interpret the results of your study. Also, use Y=aX + b to predict the results. b. Suppose another case study to use multiple linear regression, Interpret the results tactfully. Also, use Z=aX+bY+c to predict the results. (Use screenshots as required).
Multiple regression analysis was used to study the relationship between a dependent variable, y, and four...
Multiple regression analysis was used to study the relationship between a dependent variable, y, and four independent variables; x1, x2, x3, and x4. The following is a partial result of the regression analysis involving 31 observations. Coefficients Standard Error Intercept 18.00 6.00 x1 12.00 8.00 x2 24.00 48.00 x3 -36.00 36.00 x4 16.00 2.00 ANOVA df SS MS F Regression 125 Error Total 760 a) Compute the multiple coefficient of determination. b) Perform a t test and determine whether or...
Number 1 Say that you want to use simple linear regression to model the relationship between...
Number 1 Say that you want to use simple linear regression to model the relationship between the explanatory variable x= the number of alcoholic drinks consumed per week, and the response variable y= number of hours of exercise per week. You collect data on n=52 people, and find the regression line ŷ=10-0.3x. (2 pts) What is the theoretical model underlying your regression analysis? (4 pts) Say that Lorenzo has six alcoholic drinks a week. Predict how many hours he will...
We performed a linear regression analysis between number of times on phone per drive and number...
We performed a linear regression analysis between number of times on phone per drive and number of near accidents. The equation is Y= 0.320 + 0.943X, where Y is the number of times on phone per drive and X is the number of near accidents. calculate the p-value and give a conclusion. number of times on phone per dr near accidents 0 0 0 1 1 1 2 1 3 1 1 2 2 2 3 2 4 2 2...
In this problem we use Regression Analysis to test the hypothesis that Dividends per share significantly...
In this problem we use Regression Analysis to test the hypothesis that Dividends per share significantly affects price per share of a company. Coefficients(a) Model Unstandardized Coefficients Standardized Coefficients t Sig. B Std. Error Beta B Std. Error 1 (Constant) 26.805 3.922 6.835 .000 Dividends Per Share Paid 2.408 .328 .811 7.345 .000 a Dependent Variable: Price Per Share of Company Stock                                  Model Summary Model R R Square Adjusted R Square Std. Error of the Estimate 1 .811(a) .658...
Let's say that we use 16 mice in a study that seeks a relationship between the...
Let's say that we use 16 mice in a study that seeks a relationship between the concentration (x) of a toxic substance and the mortality time of mice (y), and the findings given in the table below are obtained. Is there a relationship between concentration and death time? What is the direction and strength of the relationship? Find it with correlation analysis. Concentration (microgram) Death Time (minutes) 20 20 25 18 30 18 35 16 40 17 45 16 50...
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT