Question


In this project, you will collect real-world data to construct a multiple regression model. The resulting model will be used for prediction. For example, suppose you are interested in the sales price of houses. In a multiple regression model, this is called the “response variable”. Many important factors affect house prices.

Those factors include size (square feet), number of bedrooms, number of bathrooms, age of the house, and distance to a major grocery store. The factors (or variables) used in a multiple regression model are called “explanatory variables” (or “independent variables”). A good choice of explanatory variables is one of the most important steps in constructing a good multiple regression model. Zillow (www.zillow.com), one of the most recognized real-estate websites in the United States, provides predicted house prices (“Zestimate”). The goal of the project is to construct your own prediction model of house prices. The first step is to decide which explanatory variables you will use. Please find at least four explanatory variables.
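As a rough sketch of the model form described above, a multiple regression with k explanatory variables can be written as follows. The symbols (y for the response, x for the explanatory variables, β for the coefficients, ε for the error term) are notation assumed here for illustration and do not appear in the assignment text.

```latex
y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + \varepsilon_i,
\qquad i = 1, \dots, n
```

For the house example, y_i would be the sales price of house i, and x_{i1}, ..., x_{ik} would be values such as its size, number of bedrooms, number of bathrooms, and age. The project requires k ≥ 4 and n ≥ 100.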

The next step is data collection. You are required to collect at least 100 observations (samples); otherwise, you will not receive full credit. Each observation must include the sales value and the values of all the explanatory variables you chose. For example, if your explanatory variables are size, number of bedrooms, number of bathrooms, and age of the house, then the data set must be of the following form

Solutions

Expert Solution

The dependent variable is the sale price of the house (Sales).

The independent variables are:

  • MSSubClass: The building class
  • LotFrontage: Linear feet of street connected to property
  • LotArea: Lot size in square feet
  • LandSlope: Slope of property
  • Condition1: Proximity to main road or railroad
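Two of these variables (LandSlope and Condition1) are categorical, and LotFrontage contains missing values recorded as NA, so some preparation is needed before a regression can be fitted. The following is a minimal sketch of one common approach, not the method used in this solution: missing values are parsed to None, and each categorical variable is converted to 0/1 indicator (dummy) columns with one baseline level omitted. The function names `parse_number` and `dummy_code` are hypothetical, and the sample values are the last three observations of the data listed in this solution.

```python
def parse_number(text):
    """Return a float, or None when the value is recorded as NA."""
    text = text.strip()
    return None if text.upper() == "NA" else float(text)


def dummy_code(values, baseline):
    """Return (levels, rows) of 0/1 indicators, omitting the baseline level.

    Labels are compared case-insensitively, so "mod" and "Mod" count as
    the same level; the first spelling seen is kept as the column name.
    """
    canonical = {}
    for value in values:
        canonical.setdefault(value.strip().lower(), value.strip())
    levels = [name for key, name in canonical.items() if key != baseline.lower()]
    rows = [
        [1 if value.strip().lower() == name.lower() else 0 for name in levels]
        for value in values
    ]
    return levels, rows


# Last three observations from the data set (assumed subset for illustration).
frontage = [parse_number(x) for x in ["65", "NA", "66"]]
slope_levels, slope_rows = dummy_code(["Mod", "Gtl", "mod"], baseline="Gtl")

print(frontage)
print(slope_levels, slope_rows)
```

Rows with a missing LotFrontage would then have to be dropped or imputed before fitting. MSSubClass is described as a building class, so it may also be more appropriate to treat it as categorical rather than as a number.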
Sales        MSSubClass  LotArea  LotFrontage  LandSlope  Condition1
169277.0525  20          11622    80           Gtl        Feedr
187758.394   20          14267    81           Gtl        Norm
183583.6836  60          13830    74           Gtl        Norm
179317.4775  60          9978     78           Gtl        Norm
150730.08    120         5005     43           Gtl        Norm
177150.9892  60          10000    75           Gtl        Norm
172070.6592  20          7980     NA           Gtl        Norm
175110.9565  60          8402     63           Gtl        Norm
162011.6988  20          10176    85           Gtl        Norm
160726.2478  20          8400     70           Gtl        Norm
157933.2795  120         5858     26           Gtl        Norm
145291.245   160         1680     21           Gtl        Norm
159672.0176  160         1680     21           Gtl        Norm
164167.5183  160         2280     24           Gtl        Norm
150891.6382  120         2280     24           Gtl        Norm
179460.9652  60          12858    102          Gtl        Norm
185034.6289  20          12883    94           Gtl        Norm
182352.1926  20          11520    90           Gtl        PosN
183053.4582  20          14122    79           Gtl        Norm
187823.3393  20          14300    110          Mod        Norm
186544.1143  60          13650    105          Gtl        Norm
158230.7752  120         7132     41           Gtl        Norm
190552.8293  20          18494    100          Gtl        Norm
147183.6749  120         3203     43           Gtl        Norm
185855.3009  80          13300    67           Gtl        Norm
174350.4707  60          8577     63           Gtl        Norm
201740.6207  60          17433    60           Gtl        Norm
162986.3789  20          8987     73           Gtl        Norm
162330.1991  20          9215     92           Gtl        Norm
165845.9386  20          10440    84           Gtl        Norm
180929.6229  60          11920    70           Gtl        Norm
163481.5015  30          9800     70           Gtl        Feedr
187798.0767  20          15410    39           Gtl        RRNe
198822.1989  60          13143    85           Gtl        Norm
194868.4099  60          11134    88           Gtl        Norm
152605.2986  120         4835     25           Gtl        Norm
147797.7028  160         3515     39           Gtl        Norm
150521.969   160         3215     30           Gtl        Norm
146991.6302  160         2544     24           Gtl        Norm
150306.3078  160         2544     24           Gtl        Norm
151164.3725  160         2980     NA           Gtl        Norm
151133.707   160         2403     NA           Gtl        Norm
156214.0425  20          12853    57           Gtl        Norm
171992.7607  60          7379     68           Gtl        Norm
173214.9125  20          8000     80           Gtl        Norm
192429.1873  20          10456    NA           Gtl        Norm
190878.6951  60          10791    80           Gtl        Norm
194542.5441  50          18837    NA           Gtl        Norm
191849.4391  60          9600     80           Gtl        Norm
176363.7739  20          9600     80           Gtl        Norm
176954.1854  20          9900     90           Gtl        Norm
176521.217   20          9680     88           Gtl        Norm
179436.7048  80          10600    NA           Gtl        Norm
220079.7568  90          13260    98           Gtl        Norm
175502.9181  50          9724     68           Gtl        Norm
188321.0738  50          17360    120          Gtl        Artery
163276.3245  85          11380    75           Gtl        Norm
185911.3663  90          8267     70           Gtl        Feedr
171392.831   20          8197     70           Gtl        Norm
174418.207   20          8050     NA           Gtl        Norm
179682.7096  20          10725    87           Gtl        Norm
179423.7516  20          10032    80           Gtl        Norm
171756.9181  50          8382     60           Gtl        Norm
166849.6382  20          10950    60           Gtl        Norm
181122.1687  20          10895    119          Gtl        Norm
170934.4627  190         13587    70           Gtl        Norm
159738.2926  30          7898     65           Gtl        Norm
174445.7596  50          8064     60           Gtl        Artery
174706.3637  20          7635     81           Gtl        Norm
164507.6725  20          9760     80           Mod        Norm
163602.5122  50          4800     60           Gtl        Norm
154126.2702  30          4485     56           Gtl        Artery
171104.8535  20          5805     69           Mod        Norm
167735.3927  45          6900     50           Gtl        Norm
183003.6133  50          11851    69           Gtl        Artery
172580.3812  50          8239     NA           Gtl        Artery
165407.8891  30          9656     68           Gtl        Norm
176363.7739  70          9600     60           Gtl        Norm
175182.9509  70          9000     50           Gtl        Norm
190757.1778  190         9045     100          Gtl        Norm
167186.9958  70          10560    60           Gtl        Norm
167839.3768  50          5830     53           Gtl        Feedr
173912.4212  75          7793     NA           Gtl        Norm
154034.9174  30          5000     50           Gtl        Feedr
156002.9558  50          6000     50           Gtl        Norm
168173.9433  50          6000     50           Gtl        Norm
168882.4371  50          6360     53           Gtl        Feedr
168173.9433  50          6000     50           Gtl        Norm
157580.1776  50          6240     52           Gtl        Norm
181922.1526  50          6240     52           Gtl        Norm
155134.2278  30          6120     51           Gtl        Artery
188885.5733  50          8094     57           Gtl        Norm
183963.193   70          12900    60           Gtl        Norm
161298.7623  70          3068     52           Gtl        Norm
188613.6676  20          15263    100          Gtl        Feedr
175080.1118  50          10632    72           Gtl        Norm
174744.4003  190         9900     60           Gtl        Norm
168175.9113  50          6001     65           Mod        Norm
182333.4726  70          6449     NA           Gtl        Norm
158307.2067  80          7556     66           Mod        Norm
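The solution lists the data but does not show a fitted model. As an illustration only, the sketch below fits an ordinary least squares regression of Sales on two of the numeric predictors (LotArea and LotFrontage) using the first ten observations listed above, with observation 7 dropped because its LotFrontage is NA. The function names `solve_linear_system` and `fit_ols`, the choice of predictors, and the rescaling to thousands are assumptions made for this example. A full analysis would use all 100 observations, include the categorical predictors, and would normally be run in statistical software.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Observations 1-6 and 8-10 from the data set (observation 7 has LotFrontage = NA).
sales = [169277.0525, 187758.394, 183583.6836, 179317.4775, 150730.08,
         177150.9892, 175110.9565, 162011.6988, 160726.2478]
lot_area = [11622, 14267, 13830, 9978, 5005, 10000, 8402, 10176, 8400]
lot_frontage = [80, 81, 74, 78, 43, 75, 63, 85, 70]

# Design matrix: intercept, LotArea in thousands of square feet, LotFrontage in feet.
X = [[1.0, area / 1000.0, front] for area, front in zip(lot_area, lot_frontage)]
y = [s / 1000.0 for s in sales]  # Sales in thousands

beta = fit_ols(X, y)
fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# Coefficient of determination: share of variation in Sales explained by the model.
mean_y = sum(y) / len(y)
ss_res = sum(r * r for r in residuals)
ss_tot = sum((yi - mean_y) ** 2 for yi in y)
r_squared = 1.0 - ss_res / ss_tot

print("coefficients (intercept, LotArea/1000, LotFrontage):", beta)
print("R^2:", r_squared)
```

Rescaling LotArea and Sales to thousands keeps the normal equations numerically well behaved; the fitted coefficients are then interpreted per thousand square feet and in thousands of dollars.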
