Question

In: Statistics and Probability

1.Develop a multiple linear regression model to predict the price of a house using the square...

1.Develop a multiple linear regression model to predict the price of a house using the square feet of living area, number of bedrooms, and number of bathrooms as the predictor variables

  1.     Write the reqression equation.
  2.      Discuss the statistical significance of the model as a whole using the appropriate regression statistic at a 95% level of confidence.
  3. Discuss the statistical significance of the coefficient for each independent variable using the appropriate regression statistics at a 95% level of confidence.
  4. Interpret the coefficient for each independent variable.
  5.   What percentage of the observed variation in housing prices is explained by the model?
  6. Determine the prediction interval of a house with 3,000 square feet of living area, 3 bedrooms, and 2.5 bathrooms, and comment on the prediction interval.

    Prepare a single Microsoft Excel file to document your regression analyses. Prepare a single Microsoft Word document that outlines your responses for each portion of the case study.

    Selling Price        Living Area (Sq Feet)      No. Bathrooms No Bedrooms    Age (Years)

    $240,000              2,022     2.5          3              20

    $235,000              1,578     2              3              20

    $500,075              3,400     3              3              20

    $240,000              1,744     2.5          3              20

    $270,000              2,560     2.5          3              20

    $225,000              1,398     2.5          3              20

    $280,000              2,494     2.5          3              20

    $225,000              2,208     2.5          4              20

    $248,220              2,550     2.5          3              20

    $275,000              1,812     2.5          2              20

    $137,000              1,290     1              2              20

    $150,000              1,172     2              2              20

    $649,000              4,128     3.5          3              20

    $195,000              1,816     2.5          3              97

    $373,200              2,628     2.5          4              20

    $169,450              1,254     2.5          3              20

    $144,200              1,660     1.5          4              20

    $189,900              1,850     1.5          3              20

    $166,000              1,258     2              3              20

    $160,000              1,219     2              3              20

    $327,355              1,850     2.5          3              20

    $247,000              2,103     2.5          3              20

    $318,000              1,806     2.5          3              20

    $341,000              1,674     1.5          2              17

    $288,650              2,242     2.5          3              20

    $157,000              1,408     1.5          3              20

    $449,000              3,457     2.5          3              21

    $142,000              1,728     1.5          3              21

    $389,000              2,354     2.5          3              21

    $476,000              2,246     2.5          3              21

    $249,230              1,902     2.5          2              21

    $139,900              1,178     1              3              21

    $301,900              2,896     3.5          4              21

    $425,000              2,457     3              3              41

    $121,000              936         1              3              50

    $150,000              934         1              2              21

    $138,000              1,279     1              3              21

    $199,900              1,888     2              3              26

    $145,000              1,686     1.5          4              21

    $465,000              2,310     3              2              21

    $158,000              1,200     1.5          3              21

Solutions

Expert Solution

This is a simple problem related ot tracing the multi variable regression equation and to check its significance.

The data set for the given regression is

Selling price($) Living Area (sq Ft) Bathroom(No.) Bedroom(No.) Age (years)
240000 2022 2.5 3 20
235000 1578 2 3 20
500075 3400 3 3 20
240000 1744 2.5 3 20
270000 2560 2.5 3 20
225000 1398 2.5 3 20
280000 2494 2.5 3 20
225000 2208 2.5 4 20
248220 2550 2.5 3 20
275000 1812 2.5 2 20
137000 1290 1 2 20
150000 1172 2 2 20
649000 4128 3.5 3 20
195000 1816 2.5 3 97
373200 2628 2.5 4 20
169450 1254 2.5 3 20
144200 1660 1.5 4 20
189900 1850 1.5 3 20
166000 1258 2 3 20
160000 1219 2 3 20
327355 1850 2.5 3 20
247000 2103 2.5 3 20
318000 1806 2.5 3 20
341000 1674 1.5 2 17
288650 2242 2.5 3 20
157000 1408 1.5 3 20
449000 3457 2.5 3 21
142000 1728 1.5 3 21
389000 2354 2.5 3 21
476000 2246 2.5 3 21
249230 1902 2.5 2 21
139900 1178 1 3 21
301900 2896 3.5 4 21
425000 2457 3 3 41
121000 936 1 3 50
150000 934 1 2 21
138000 1279 1 3 21
199900 1888 2 3 26
145000 1686 1.5 4 21
465000 2310 3 2 21
158000 1200 1.5 3 21

Now as per the question we need to plot the multi variable regression equation only in terms of three given independent variables

Area of the house

Number of bedrooms

Number of bathrooms

So we plot the regression equation in the MS-Excel by recalling the function regression in the data analysis function.

This gives us the following output.

Now what do we infer from the given regression analysis ?

We infer that :-

The regression equation is given by

Y(Selling price)=84829.70+130.84*(living Area)+38670.33*(Bathroom numbers)-54806*(Bedroom numbers)

1). Since the F test significance value is <0.05, we can say that the overall multi variable regression equation is significant with 95% confidence .

2). Since the p values of the intercept of the equation and the variable (Bathroom numbers) is >0.05 we can infer that these factors do NOT hold significance in the regression equation .

3). Since the p values of the independent variables Living area (in sq ft). and the variable (Bedroom numbers) is <0.05 ,we can infer that these factors do hold significance in the regression equation .

4). coefficients of the variables Bathroom numbers, living area and the intercept are positive indicating that selling price is directly dependnent on them ehile the coefficient of bedroom number is negative which means that as the number of bedrooms increase the selling prices arel ikely to reduce.

5). Now to understand the percentage of variation of output that is explained by the independent input variables we look for the value of R-Squared and Adjusted R- Squared. The R-Squared tells us as to how well the model or regression line “fits” the data.

It tell us about the proportion of variance in the dependent variable (Y) that is explained by the independent variable (X). The R-Squared indicates the percentage of variation in the dependent variable that is explained by the independent variables.

But our regression equation is a multi variable regression equation . so we will seek adjusted R -Squared ?

But Why?

Because Adjusted R-Squared is used when analyzing multiple regression output because When we have more than one independent variable in our analysis, the computation process inflates the R-squared. As the name indicates, the Adjusted R-Squared is the R-Square adjusted for this inflation when performing multiple regression.

Now look at the Adjusted R -Squared value? what is it?

Adjusted R -Squared =0.7756

This means that only 77.56% of all output variations can be explained by our regression model.

Prediction interval

We can predict the selling price by using our regression model

For living space =3000 sq ft

Bedroom =3 number

bathroom =2.5 numbers

The predicted selling price will be

Y(Selling price)=84829.70+130.84*(3000)+38670.33*(2.5)-54806*(3)

=$409607


Related Solutions

            Develop a simple linear regression model to predict the price of a house based upon...
            Develop a simple linear regression model to predict the price of a house based upon the living area (square feet) using a 95% level of confidence.             Write the reqression equation             Discuss the statistical significance of the model as a whole using the appropriate regression statistic at a 95% level of confidence.              Discuss the statistical significance of the coefficient for the independent variable using the appropriate regression statistic at a 95% level of confidence.             Interpret the...
Use the following data to develop a multiple regression model to predict from and . Discuss...
Use the following data to develop a multiple regression model to predict from and . Discuss the output, including comments about the overall strength of the model, the significance of the regression coefficients, and other indicators of model fit. y x1 x2 198 29 1.64 214 71 2.81 211 54 2.22 219 73 2.70 184 67 1.57 167 32 1.63 201 47 1.99 204 43 2.14 190 60 2.04 222 32 2.93 197 34 2.15 Appendix A Statistical Tables *(Round...
Develop a simple linear regression model to predict a person’s income (INCOME) based on their age...
Develop a simple linear regression model to predict a person’s income (INCOME) based on their age (AGE) using a 95% level of confidence. a. Write the regression equation. Discuss the statistical significance of the model as whole using the appropriate regression statistic at a 95% level of confidence. Discuss the statistical significance of the coefficient for the independent variable using the appropriate regression statistic at a 95% level of confidence. Interpret the coefficient for the independent variable. What percentage of...
. Develop a simple linear regression model to predict a person’s income (INCOME) based upon their...
. Develop a simple linear regression model to predict a person’s income (INCOME) based upon their years of education (EDUC) using a 95% level of confidence. a. Write the reqression equation. b. Discuss the statistical significance of the model as a whole using the appropriate regression statistic at a 95% level of confidence. c. Discuss the statistical significance of the coefficient for the independent variable using the appropriate regression statistic at a 95% level of confidence. d. Interpret the coefficient...
Develop a simple linear regression model to predict the Cost of Living Index based upon Restaurant...
Develop a simple linear regression model to predict the Cost of Living Index based upon Restaurant Price Index using a 95% level of confidence. Write the reqression equation. Discuss the statistical significance of the model as a whole using the appropriate regression statistic at a 95% level of confidence. Discuss the statistical significance of the coefficient for the independent variable using the appropriate regression statistic at a 95% level of confidence. Interpret the coefficient for the independent variable. What percentage...
Develop the best logistic regression model that can predict the wage by using the combination of...
Develop the best logistic regression model that can predict the wage by using the combination of any following variables: total unit (X2), constructed unit (X3), equipment used (X4), city location (X5) and total cost of a project (X6). Make sure that you partition your data with 60% training test, 40% validation test, and default seed of 12345 before running the logistic regression (15 points) Wage - X1 Total Unit - X2 Contracted Units - X3 Equipment Used - X4 City...
Question 1: Using 1 variable regression house price vs. number of rooms predict the price of a house with 4 rooms for your client.
Question 1: Using 1 variable regression house price vs. number of rooms predict the price of a house with 4 rooms for your client.You enhance the model by adding 2 more variables:Question 2: What is the predicted house price for a home based on the new model with 3 variables, given that the house has 5 rooms is 10 years old and is 2,000 Square feet?Question 3: Which coefficients are statistically significant at alpha (α) 5% in the model with...
A multiple linear regression was performed using birth weight and gestational age to predict thyroid volume....
A multiple linear regression was performed using birth weight and gestational age to predict thyroid volume. Gestational age was included in the model because it is highly related to both thyroid volume and birth weight. The overall model was significant (p=.0001). Interpret the following regression results. Be sure to talk about all three parameters. Parameter Estimates Source DF Parameter Estimates Standard Error t-value p-value Intercept 1 336.31 168.35 2.00 0.0506 Birth Weight 1 0.1555 0.046 3.41 0.0012 Gestational Age 1...
When we estimate a linear multiple regression model (including a linear simple regression model), it appears...
When we estimate a linear multiple regression model (including a linear simple regression model), it appears that the calculation of the coefficient of determination, R2, for this model can be accomplished by using the squared sample correlation coefficient between the original values and the predicted values of the dependent variable of this model. Is this statement true? If yes, why? If not, why not? Please use either matrix algebra or algebra to support your reasoning.
Discuss the underlying assumptions of a simple linear regression model; multiple regression model; and polynomial regression.
Discuss the underlying assumptions of a simple linear regression model; multiple regression model; and polynomial regression.
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT