Question

Using the data from boston_housing.xls (accessible online)

a.) do the appropriate multiple linear regression model procedures to obtain a final model

Solutions

Expert Solution

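Run the following program in Python. The %matplotlib inline line is a Jupyter notebook magic and should be removed when running the code as a plain script.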

import pandas as pd
import numpy as np
import seaborn as sns
from matplotlib import pyplot as plt
%matplotlib inline
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn import metrics
from sklearn.metrics import r2_score

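# load_boston was deprecated in scikit-learn 1.0 and removed in 1.2, so this cell needs an older version.
# With a newer version, read boston_housing.xls with pd.read_excel instead.
from sklearn.datasets import load_boston
boston = load_boston()
boston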

print(boston['DESCR'])

boston['target']

x=pd.DataFrame(boston['data'],columns=boston['feature_names'])

y=boston.target

x.isnull().sum()

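# Hold out 25% of the rows for testing; no random_state is set, so the split and the metrics below change on every run
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=.25)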

x_train.shape

x_test.shape

lr = LinearRegression()

lr.fit(x_train,y_train)

pred=lr.predict(x_test)

print('MAE', metrics.mean_absolute_error(y_test, pred))
print('MSE', metrics.mean_squared_error(y_test, pred))
print('RMSE', np.sqrt(metrics.mean_squared_error(y_test, pred)))
print('R squared', r2_score(y_test, pred))
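
For reference, the four metrics printed above can also be computed by hand. The following is a minimal sketch in plain Python; the values in y_true and y_hat are made-up illustration numbers, not Boston housing results, and the function name regression_metrics is hypothetical.

```python
import math

def regression_metrics(y_true, y_hat):
    """Return MAE, MSE, RMSE and R squared for two equal-length sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_hat)]
    mae = sum(abs(e) for e in errors) / n        # mean absolute error
    mse = sum(e * e for e in errors) / n         # mean squared error
    rmse = math.sqrt(mse)                        # root mean squared error
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                   # coefficient of determination
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}

# Illustration values only (assumed, not taken from the data set)
y_true = [3.0, 5.0, 7.0, 9.0]
y_hat = [2.0, 5.0, 8.0, 9.0]
result = regression_metrics(y_true, y_hat)
print(result)
```

R squared compares the residual sum of squares with the spread of the observed values around their mean, so it is a goodness-of-fit measure rather than an error: a value of 1 means a perfect fit, and 0 means the model does no better than always predicting the mean.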

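Conclusion

The R-squared value is 0.73589, which is reasonably close to 1, so the model fits the data fairly well. Because the train/test split is random, this value will differ somewhat from run to run.
The accuracy of the model can be improved further using various techniques.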
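
Before trying such techniques, it helps to have a more stable estimate of R squared than a single random split gives. k-fold cross-validation is one way to get it. The following is a minimal sketch, assuming NumPy is available; it uses synthetic data rather than the Boston data, fits ordinary least squares with np.linalg.lstsq, and the helper names fit_ols, predict, r2 and kfold_r2 are hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients; the intercept is stored in position 0."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    """Predictions from coefficients returned by fit_ols."""
    return coef[0] + X @ coef[1:]

def r2(y_true, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_hat) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def kfold_r2(X, y, k=5, seed=0):
    """Return one out-of-fold R squared score per fold."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), k)
    scores = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        coef = fit_ols(X[train_idx], y[train_idx])
        scores.append(r2(y[test_idx], predict(coef, X[test_idx])))
    return np.array(scores)

# Synthetic data (assumed for illustration): 200 rows, 3 predictors, known coefficients plus noise
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
true_coef = np.array([2.0, -1.0, 0.5])
y = 1.5 + X @ true_coef + rng.normal(scale=0.5, size=200)

scores = kfold_r2(X, y, k=5)
print("R squared per fold:", np.round(scores, 3))
print("Mean R squared:", round(scores.mean(), 3))
```

The spread of the per-fold scores shows how much a single-split R squared can move around, which is useful context when comparing a modified model against the baseline above.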

