kNN Function: Create a function called predictKNN(). Your function will return the classification of your data point. In addition to any parameters you see fit, your function should accept:
If your data does not have a classification column, use the results from your unsupervised learning as the classification.
   sl_no  gender  ssc_p    ssc_b  hsc_p    hsc_b     hsc_s  degree_p   degree_t  workex  etest_p  specialisation  mba_p      status    salary
0      1       M  67.00   Others  91.00   Others  Commerce     58.00   Sci&Tech      No     55.0          Mkt&HR  58.80      Placed  270000.0
1      2       M  79.33  Central  78.33   Others   Science     77.48   Sci&Tech     Yes     86.5         Mkt&Fin  66.28      Placed  200000.0
2      3       M  65.00  Central  68.00  Central      Arts     64.00  Comm&Mgmt      No     75.0         Mkt&Fin  57.80      Placed  250000.0
3      4       M  56.00  Central  52.00  Central   Science     52.00   Sci&Tech      No     66.0          Mkt&HR  59.43  Not Placed       NaN
4      5       M  85.80  Central  73.60  Central  Commerce     73.30  Comm&Mgmt      No     96.8         Mkt&Fin  55.50      Placed  425000.0
SOLUTION:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report, confusion_matrix
# Pass os.chdir() the path of the working directory where your dataset is stored.
import os
os.chdir(r"C:\Users\LENOVO\Documents\Data Science\DataSets")
def predictKNN(k, dataframe, target="status"):
    # Load the dataset from the CSV path passed in `dataframe`
    data = pd.read_csv(dataframe)
    # Divide the data into independent variables (X) and the target variable (y).
    # Assumption: `status` is the classification column. `sl_no` is only a row ID
    # and `salary` is NaN for students who were not placed, so both are excluded.
    X = data.drop(columns=[target, "sl_no", "salary"], errors="ignore")
    y = data[target].values
    # One-hot encode the text columns so they can be scaled and compared by distance
    X = pd.get_dummies(X).astype(float).values
    # Divide the dataset into training and testing sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20)
    # Standardize the features using statistics from the training set only
    scaler = StandardScaler()
    X_train = scaler.fit_transform(X_train)
    X_test = scaler.transform(X_test)
    # Apply KNN with the requested number of neighbours
    classifier = KNeighborsClassifier(n_neighbors=k)
    classifier.fit(X_train, y_train)
    y_pred = classifier.predict(X_test)
    # Check the performance
    print(confusion_matrix(y_test, y_pred))
    print(classification_report(y_test, y_pred))
    # Return the predicted classes for the test rows
    return y_pred
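The function above scores a train/test split of the whole file, while the question asks for the classification of a single data point. As a rough illustration of what kNN does for one point, here is a minimal from-scratch sketch using only the standard library. The names `standardize`, `predict_knn_point` and `query` are hypothetical and not part of the original solution. It assumes every feature is numeric, so it uses only the five percentage columns (ssc_p, hsc_p, degree_p, etest_p, mba_p) from the sample rows in the question.

```python
import math
from collections import Counter


def standardize(train_rows, rows):
    """Scale rows using the mean and standard deviation of train_rows."""
    cols = list(zip(*train_rows))
    means = [sum(c) / len(c) for c in cols]
    # Population standard deviation; fall back to 1.0 for constant columns.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def predict_knn_point(k, train_rows, train_labels, point):
    """Return the majority label among the k training rows nearest to point."""
    scaled_train = standardize(train_rows, train_rows)
    scaled_point = standardize(train_rows, [point])[0]
    # Euclidean distance from the query point to every training row.
    distances = sorted(
        (math.dist(row, scaled_point), label)
        for row, label in zip(scaled_train, train_labels)
    )
    # Majority vote among the k closest rows.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Numeric columns of the five sample rows: ssc_p, hsc_p, degree_p, etest_p, mba_p
train_rows = [
    [67.00, 91.00, 58.00, 55.0, 58.80],
    [79.33, 78.33, 77.48, 86.5, 66.28],
    [65.00, 68.00, 64.00, 75.0, 57.80],
    [56.00, 52.00, 52.00, 66.0, 59.43],
    [85.80, 73.60, 73.30, 96.8, 55.50],
]
train_labels = ["Placed", "Placed", "Placed", "Not Placed", "Placed"]

# Hypothetical student to classify (made-up values, for illustration only)
query = [60.0, 55.0, 54.0, 65.0, 59.0]
prediction = predict_knn_point(1, train_rows, train_labels, query)
```

Five rows are far too few for a meaningful prediction. With the full dataset you would pass every row, and encode the text columns as numbers before computing distances.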