In Python:
Create a class defined for Regression. Class attributes are data points for x, y, the slope and the intercept for the regression line. Define an instance method to find the regression line parameters (slope and intercept). Plot all data points on the graph. Plot the regression line on the same plot.
Python Code:
Input dataset:
Index | YearsExperience | Salary |
---|---|---|
0 | 1.1 | 39343.0 |
1 | 1.3 | 46205.0 |
2 | 1.5 | 37731.0 |
3 | 2.0 | 43525.0 |
4 | 2.2 | 39891.0 |
5 | 2.9 | 56642.0 |
6 | 3.0 | 60150.0 |
7 | 3.2 | 54445.0 |
8 | 3.2 | 64445.0 |
9 | 3.7 | 57189.0 |
10 | 3.9 | 63218.0 |
11 | 4.0 | 55794.0 |
12 | 4.0 | 56957.0 |
13 | 4.1 | 57081.0 |
14 | 4.5 | 61111.0 |
15 | 4.9 | 67938.0 |
16 | 5.1 | 66029.0 |
17 | 5.3 | 83088.0 |
18 | 5.9 | 81363.0 |
19 | 6.0 | 93940.0 |
20 | 6.8 | 91738.0 |
21 | 7.1 | 98273.0 |
22 | 7.9 | 101302.0 |
23 | 8.2 | 113812.0 |
24 | 8.7 | 109431.0 |
25 | 9.0 | 105582.0 |
26 | 9.5 | 116969.0 |
27 | 9.6 | 112635.0 |
28 | 10.3 | 122391.0 |
29 | 10.5 | 121872.0 |
# Information on the employees of a company (30 employees).
# We have each employee's years of experience and salary. We need to understand
# the correlation between the two columns, predict salaries from the
# years of experience an employee has, and compare the predictions
# with the actual salaries.
import pandas as pd
import matplotlib.pyplot as plt
dataset = pd.read_csv('Salary_Data.csv')
dataset.head()
Index | YearsExperience | Salary |
---|---|---|
0 | 1.1 | 39343.0 |
1 | 1.3 | 46205.0 |
2 | 1.5 | 37731.0 |
3 | 2.0 | 43525.0 |
4 | 2.2 | 39891.0 |
x = dataset.iloc[:, :-1].values
y = dataset.iloc[:, 1].values
# Splitting into training and test sets
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=1/3, random_state=0)
# With test_size = 1/3, the 30 rows are divided into 20 for training and 10 for testing (about 67/33).
# We don't need to apply feature scaling here. LinearRegression does not
# scale the feature itself; an ordinary least-squares fit simply does not
# need it, because rescaling the feature leaves the predictions unchanged.
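Feature scaling can be skipped for this model because a least-squares line gives the same predictions whatever units the feature is measured in. The sketch below illustrates that in plain Python on the first five rows of the dataset, once with experience in years and once in months. `fit_line` is a hypothetical helper written for this illustration, not a scikit-learn function.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# First five rows of the dataset shown above.
years = [1.1, 1.3, 1.5, 2.0, 2.2]
salary = [39343.0, 46205.0, 37731.0, 43525.0, 39891.0]

# The same feature expressed in months instead of years.
months = [12 * x for x in years]

slope_y, intercept_y = fit_line(years, salary)
slope_m, intercept_m = fit_line(months, salary)

# Predict the salary at 3 years (= 36 months) with both fits.
pred_years = slope_y * 3.0 + intercept_y
pred_months = slope_m * 36.0 + intercept_m

# The two predictions agree up to floating-point rounding.
print(pred_years, pred_months)
```

Only the slope changes (it is divided by 12 when the feature is multiplied by 12); the intercept and the predictions stay the same.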
# Fit simple linear regression to the training set.
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(x_train, y_train)
# The regressor is the model that has learned from the training data.
# It can now be used to make predictions.
# Predicting test set results
y_pred = regressor.predict(x_test)
# To retrieve the intercept:
print("Intercept of regression model :", regressor.intercept_)
# To retrieve the slope:
print("Slope of regression model :", regressor.coef_)
Output (intercept and slope):
Intercept of regression model : 26816.19224403119
Slope of regression model : [9345.94244312]
# Bar chart comparing the actual test-set salaries with the values predicted by the regression model
df = pd.DataFrame({'Actual Value': y_test.flatten(), 'Predicted Value': y_pred.flatten()})
df1 = df.head(25)
df1.plot(kind='bar', figsize=(12, 7))
plt.grid(which='major', linestyle='-', linewidth=0.5, color='green')
plt.grid(which='minor', linestyle=':', linewidth=0.5, color='black')
plt.show()
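The bar chart compares actual and predicted salaries visually. A numeric summary of the same comparison could look like the sketch below, which computes the mean absolute error and the R² score in plain Python. `compare_predictions` is a hypothetical helper, and the numbers are made up for illustration; they are not the real test-set values.

```python
def compare_predictions(actual, predicted):
    """Return (mean absolute error, R^2) for two equal-length sequences."""
    n = len(actual)
    mean_actual = sum(actual) / n
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / n
    # Residual sum of squares and total sum of squares.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return mae, 1.0 - ss_res / ss_tot

# Illustrative numbers only (assumption), not the real test-set values.
actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]

mae, r2 = compare_predictions(actual, predicted)
print("Mean absolute error:", mae)
print("R^2:", r2)
```

In the script above, the same helper could be called as `compare_predictions(y_test, y_pred)`. scikit-learn also provides `mean_absolute_error` and `r2_score` in `sklearn.metrics` for the same purpose.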
# Visualizing the training set results
plt.scatter(x_train, y_train, color='red', label='Data Points')
plt.plot(x_train, regressor.predict(x_train), color='blue', label="Regression Line")
plt.title('Salary vs Experience (Training Set)')
plt.legend()
plt.xlabel('Years of Experience')
plt.ylabel('Salary')
plt.show()
# Visualizing the test set results
plt.scatter(x_test, y_test, color='red', label='Data Points')
# The regression line comes from the same fitted model, so it is drawn over the training inputs.
plt.plot(x_train, regressor.predict(x_train), color='blue', label="Regression Line")
plt.title('Salary vs Experience (Test Set)')
plt.legend()
plt.xlabel('Years of Experience')
plt.ylabel('Salary')
plt.show()
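The code above uses scikit-learn directly and does not define the class the question asks for. One possible shape for it is sketched below, with some assumptions: the "class attributes" are stored as instance attributes, the method names `fit` and `plot` are my own choice, and the slope and intercept come from the closed-form least-squares formulas rather than from `LinearRegression`. The data is typed in from the table above so that the block runs without `Salary_Data.csv`.

```python
class Regression:
    """Simple linear regression y = slope * x + intercept (least squares)."""

    def __init__(self, x, y):
        self.x = [float(v) for v in x]   # data points for x
        self.y = [float(v) for v in y]   # data points for y
        self.slope = None                # set by fit()
        self.intercept = None            # set by fit()

    def fit(self):
        """Compute and store the slope and intercept of the regression line."""
        n = len(self.x)
        mean_x = sum(self.x) / n
        mean_y = sum(self.y) / n
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(self.x, self.y))
        sxx = sum((xi - mean_x) ** 2 for xi in self.x)
        self.slope = sxy / sxx
        self.intercept = mean_y - self.slope * mean_x
        return self.slope, self.intercept

    def plot(self):
        """Plot all data points and the regression line on the same graph."""
        # Imported here so that fit() can be used without matplotlib installed.
        import matplotlib.pyplot as plt
        if self.slope is None:
            self.fit()
        line_x = [min(self.x), max(self.x)]
        line_y = [self.slope * v + self.intercept for v in line_x]
        plt.scatter(self.x, self.y, color='red', label='Data Points')
        plt.plot(line_x, line_y, color='blue', label='Regression Line')
        plt.title('Salary vs Experience')
        plt.xlabel('Years of Experience')
        plt.ylabel('Salary')
        plt.legend()
        plt.show()


# The 30 rows of the input dataset shown above.
years = [1.1, 1.3, 1.5, 2.0, 2.2, 2.9, 3.0, 3.2, 3.2, 3.7,
         3.9, 4.0, 4.0, 4.1, 4.5, 4.9, 5.1, 5.3, 5.9, 6.0,
         6.8, 7.1, 7.9, 8.2, 8.7, 9.0, 9.5, 9.6, 10.3, 10.5]
salary = [39343.0, 46205.0, 37731.0, 43525.0, 39891.0, 56642.0,
          60150.0, 54445.0, 64445.0, 57189.0, 63218.0, 55794.0,
          56957.0, 57081.0, 61111.0, 67938.0, 66029.0, 83088.0,
          81363.0, 93940.0, 91738.0, 98273.0, 101302.0, 113812.0,
          109431.0, 105582.0, 116969.0, 112635.0, 122391.0, 121872.0]

model = Regression(years, salary)
slope, intercept = model.fit()
print("Slope:", slope)
print("Intercept:", intercept)
```

Calling `model.plot()` afterwards draws the scatter plot and the regression line on one figure. Because this sketch fits all 30 rows instead of the 20-row training split, its slope and intercept will differ somewhat from the scikit-learn values printed earlier.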