In: Math
Prepare a single Microsoft Excel file, using a separate worksheet for each regression model, to document your regression analyses. Prepare a single Microsoft Word document that outlines your responses for each portions of the case study.
Selling Price Age (Years) Living Area (Sq Feet) No. Bathrooms No Bedrooms
$92,000 18 1,527 2 4
$211,002 0 2,195 2.5 4
$115,000 14 1,480 1.5 3
$113,000 53 1,452 2 3
$216,300 0 2,360 2.5 4
$145,000 32 1,440 1 3
$114,000 14 1,480 2.5 2
$139,050 125 1,879 2.5 3
$104,000 14 1,480 1.5 3
$169,900 11 1,792 2.5 3
$177,900 2 1,386 2.5 3
$133,000 14 1,676 2 2
$185,000 0 768 2 4
$115,000 16 1,560 1.5 3
$100,000 91 1,000 1 3
$117,000 15 1,676 1.5 4
$150,000 11 1,656 1.5 3
$187,500 11 2,300 1.5 3
$107,000 25 1,712 1 3
$126,900 26 1,350 1.5 3
$147,000 15 1,676 2.5 3
$62,000 103 1,317 1.5 3
$101,000 30 1,056 2 3
$143,500 13 912 1 3
$113,400 18 1,232 2 2
$112,000 36 1,280 1 3
$112,500 43 1,232 1 3
$97,000 45 1,406 1.5 3
$121,000 6 1,164 2 3
$65,720 123 1,198 1 3
$225,000 10 2,206 2.5 4
Develop a simple linear regression model to predict the price of a house based upon the living area (square feet) using a 95% level of confidence
Dependent variable = price
Independent variable = area
1. Put the values in excel as shown below.
2. We use the regression option under the Data analysis tab.
3. Input the data as shown below.
4. The output will be generated as follows
5. We formulate the regression equation using the output
(highlighted in yellow).
Write the reqression
equation
Price = 39438.54 + 61.8185 Area
Discuss the statistical significance of the model as a whole using the appropriate regression statistic at a 95% level of confidence.
For this we look that pvalue of the Anova. The Pvalue of anova
(highlighted in green)
Since the pvalue is less than 0.05, the model is significant.
Discuss the statistical significance of the coefficient for
the independent variable using the appropriate regression statistic
at a 95% level of confidence.
For this, we look that pvalue of the coefficient . The Pvalue of
the coefficient (highlighted in orange)
Since the pvalue is less than 0.05, hence the variable is
significant in predicting the dependent variable.
Interpret the coefficient for the independent
variable.
One unit increase in the area, increases the price by 61.81
dollars
What percentage of the observed variation in housing
prices is explained by the model?
Variation explained by the model is given by the Rsquare
(highlighted in blue) which 0.3444
Note - Higher the number, better the model is.
Predict the value of a house with 3,000 square feet of
living area.
Price = 39438.54 + 61.8185 Area
Price = 39438.54 + 61.8185 (3000) =224894.30
---------------------------------------------------------
Follow the similar steps for the next variable. The screenshot are provided in case you get stuck.
Develop a simple linear regression model to predict the price of a house based upon the number of bedrooms using a 95% level of confidence.
Dependent variable = price
Independent variable = bedrooms
Write the reqression equation
Price = 30701.86 + 32988.68 Bedroom
Discuss the statistical significance of the model as a
whole using the appropriate regression statistic at a 95% level of
confidence.
For this we look that pvalue of the Anova. The Pvalue of anova
(highlighted in green)
Since the pvalue is less than 0.05, the model is significant.
Discuss the statistical significance of the coefficient for
the independent variable using the appropriate regression statistic
at a 95% level of confidence.
For this we look that pvalue of the coefficient . The Pvalue of the
coefficient (highlighted in orange)
Since the pvalue is less than 0.05, hence the variable is
significant in predicting the dependent variable.
Interpret the coefficient for the independent
variable.
One unit increase in the bedroom increases the price by 32988.68
dollars
What percentage of the observed variation in housing
prices is explained by the model?
Variation explained by the model is given by the Rsquare
(highlighted in blue) which 0.1873
Note - Higher the number, better the model is.
Predict the value of a house with 3
bedrooms.
Price = 30701.86 + 32988.68 Bedroom
Price = 30701.86 + 32988.68 (3)=129667.93
--------------------------
Develop a simple linear regression model to predict the
price of a house based upon the number of bathrooms using a 95%
level of confidence.
Dependent variable = price
Independent variable = bathroom
Write the reqression equation
Price = 63247.78 + 39596.15 bathroom
Discuss the statistical significance of the model as a
whole using the appropriate regression statistic at a 95% level of
confidence.
For this we look that pvalue of the Anova. The Pvalue of anova
(highlighted in green)
Since the pvalue is less than 0.05, the model is significant.
Discuss the statistical significance of the coefficient for
the independent variable using the appropriate regression statistic
at a 95% level of confidence.
For this we look that pvalue of the coefficient . The Pvalue of the
coefficient (highlighted in orange)
Since the pvalue is less than 0.05, hence the variable is
significant in predicting the dependent variable.
Interpret the coefficient for the independent
variable.
One unit increase in the bathroom increases the price by 39596.157
dollars
What percentage of the observed variation in housing
prices is explained by the model?
Variation explained by the model is given by the Rsquare
(highlighted in blue) which 0.2924
Note - Higher the number, better the model is.
Predict the value of a house with 2.5
bathrooms.
Price = 63247.78 + 39596.15 bathroom
Price = 63247.78 + 39596.15 (2.5)=162238.18
-------------------------
Develop a simple linear regression model to predict the price of a house based upon its age using a 95% level of confidence.
Dependent variable = price
Independent variable = age
Write the reqression equation
Price = 152772.22 -660.88 age
Discuss the statistical significance of the model as a
whole using the appropriate regression statistic at a 95% level of
confidence.
For this we look that pvalue of the Anova. The Pvalue of anova
(highlighted in green)
Since the pvalue is less than 0.05, the model is significant.
Discuss the statistical significance of the coefficient for
the independent variable using the appropriate regression statistic
at a 95% level of confidence.
For this we look that pvalue of the coefficient . The Pvalue of the
coefficient (highlighted in orange)
Since the pvalue is less than 0.05, hence the variable is
significant in predicting the dependent variable.
Interpret the coefficient for the independent
variable.
One unit increase in the age decreases the price by
660.8855
dollars
What percentage of the observed variation in housing
prices is explained by the model?
Variation explained by the model is given by the Rsquare
(highlighted in blue) which 0.3066
Note - Higher the number, better the model is.
Predict the value of a house that is 50 years
old.
Price = 152772.22 -660.88 age
Price = 152772.22 -660.88 (50)= 119727.95
Compare the preceding four simple linear regression
models to determine which model is the preferred model. Use the
Significance F values, p-values for independent variable
coefficients, R-squared or Adjusted R-squared values (as
appropriate), and standard errors to explain your selection.
Calculate the predicted sale price of a 50 year old house with
3,000 square feet of living area, 3 bedrooms, and 2.5 bathrooms
using your preferred regression model from part 5.
This is a summary of all the model. We see that all the model are significant and the variable are significant predictors of the dependent variable.
However we see that the Rsquare for model with Area is higher
compared to other. This indicates that the model is able to explain
a higher percentage of the variability of y.
Hence I would prefer the model with Area.
Price = 39438.54 + 61.8185 Area
Price = 39438.54 + 61.8185 (3000) =224894.30