Using the boston_housing.xls data, do the following. (The file can be found online by searching Google for "bostonhousing.xl"; the data can't be uploaded here on Chegg.)
a.) Use the appropriate regression procedure
Possible regression procedures: coefficient of multiple determination criterion, adjusted R-squared criterion, Mallows' Cp statistic criterion, prediction sum of squares (PRESS) criterion, backward elimination procedure, forward selection procedure, stepwise selection procedure.
b.) Obtain a final model
Please run the program in Python.
import pandas as pd
import numpy as np
import seaborn as sns
from matplotlib import pyplot as plt
# Jupyter/IPython magic; remove this line when running as a plain script
%matplotlib inline
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn import metrics
from sklearn.metrics import r2_score
# Note: load_boston was removed in scikit-learn 1.2, so this requires scikit-learn < 1.2
from sklearn.datasets import load_boston
boston = load_boston()
boston
print(boston['DESCR'])
boston['target']
x=pd.DataFrame(boston['data'],columns=boston['feature_names'])
y=boston.target
x.isnull().sum()
x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=.25)
x_train.shape
x_test.shape
lr = LinearRegression()
lr.fit(x_train,y_train)
pred=lr.predict(x_test)
print('MAE', metrics.mean_absolute_error(y_test, pred))
print('MSE', metrics.mean_squared_error(y_test, pred))
print('RMSE', np.sqrt(metrics.mean_squared_error(y_test, pred)))
# R squared is a goodness-of-fit score (higher is better), not an error
print('R squared', r2_score(y_test, pred))
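The question also lists the adjusted R-squared criterion, which penalises R squared for the number of predictors. A minimal sketch of that formula is given here; the helper name adjusted_r2 is hypothetical (it is not part of scikit-learn), and it assumes an intercept is fitted and that n > p + 1.

```python
import numpy as np

def adjusted_r2(y_true, y_pred, n_features):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1).

    n is the number of observations and p (n_features) the number of
    predictors, not counting the intercept. Hypothetical helper.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n = y_true.shape[0]
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)
```

With the variables from the code above it could be called as adjusted_r2(y_test, pred, x_test.shape[1]). The criterion is usually evaluated on the data used for fitting, so a value computed on the test set is only a rough comparison.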
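Conclusion
The R-squared value on the test set is 0.73589, which is reasonably close to 1, so the linear model fits the data fairly well. Because the train/test split is random, the exact value changes from run to run.
The model could be improved further with other techniques, for example the variable selection procedures listed in the question (forward selection, backward elimination, or stepwise selection).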
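As an illustration of one procedure named in the question, here is a minimal sketch of forward selection using adjusted R-squared as the criterion. It is not the method used in the answer above. The helper names fit_adj_r2 and forward_select are hypothetical, and the data is synthetic so that the sketch runs without the Boston file.

```python
import numpy as np

def fit_adj_r2(X, y):
    """Fit OLS with an intercept by least squares; return adjusted R^2."""
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])          # add intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    ss_res = resid @ resid
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

def forward_select(X, y, names):
    """Greedily add the predictor that most improves adjusted R^2.

    Stops when no remaining predictor improves the criterion.
    """
    selected, remaining = [], list(range(X.shape[1]))
    best = -np.inf
    while remaining:
        # Score every candidate model that adds one more predictor
        scores = [(fit_adj_r2(X[:, selected + [j]], y), j) for j in remaining]
        score, j = max(scores)
        if score <= best:
            break
        best = score
        selected.append(j)
        remaining.remove(j)
    return [names[j] for j in selected], best

# Synthetic data (an assumption for illustration): only f0 and f2 drive y
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(scale=0.5, size=200)
names = ["f0", "f1", "f2", "f3", "f4"]

chosen, score = forward_select(X, y, names)
print("selected predictors:", chosen)
print("adjusted R squared:", round(score, 3))
```

On the housing data the same function could be called as forward_select(x_train.to_numpy(), y_train, list(x_train.columns)), and the final model refitted on the selected columns only. Adjusted R-squared is a fairly lenient criterion, so weak predictors may still be picked; Mallows' Cp or PRESS could be substituted in fit_adj_r2.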