Question

For a from-scratch implementation of kNN classification in Python, I am not sure whether I am calculating the Euclidean distance correctly. Please help with the Python code.

import pandas as pd
import numpy as np
import math
import operator
from collections import Counter

class KNN:

    # no self parameter is used, so declare this as a static method
    @staticmethod
    def calculate_distance(x1, x2, length):
        e_distance = 0
        for x in range(length):
            e_distance += pow((x1[x] - x2[x]), 2)
        return math.sqrt(e_distance)

    def __init__(self, k=5, p=2):
        self.k = k
        self.p = p  # stored for later use; not used by calculate_distance

Solutions

Expert Solution

import pandas as pd
import numpy as np
import operator

# loading data file into the program. give the location of your csv file
dataset = pd.read_csv("E:/input/iris.csv")
print(dataset.head()) # prints first five tuples of your data.

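# function for calculating the Euclidean distance
def E_Distance(x1, x2, length):
    # x1: one-row test DataFrame; x2: one training row (a pandas Series)
    distance = 0
    for x in range(length):
        # .iloc selects by position, so named CSV columns are handled
        distance += np.square(x1[x] - x2.iloc[x])
    return np.sqrt(distance)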

# function defining the k-NN model

def knn(trainingSet, testInstance, k):
    distances = {}
    length = testInstance.shape[1]  # number of features
    # distance from the test instance to every training row
    for x in range(len(trainingSet)):
        dist = E_Distance(testInstance, trainingSet.iloc[x], length)
        distances[x] = dist[0]
    # sort row indices by distance and keep the k nearest
    sortdist = sorted(distances.items(), key=operator.itemgetter(1))
    neighbors = []
    for x in range(k):
        neighbors.append(sortdist[x][0])
    Count = {}  # votes per class among the k nearest rows
    for x in range(len(neighbors)):
        # the class label is assumed to be the last column
        response = trainingSet.iloc[neighbors[x]].iloc[-1]
        if response in Count:
            Count[response] += 1
        else:
            Count[response] = 1
    sortcount = sorted(Count.items(), key=operator.itemgetter(1), reverse=True)
    return (sortcount[0][0], neighbors)
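
The loop in the question's calculate_distance and the loop in E_Distance above both take the square root of the summed squared differences, which is the usual Euclidean distance. One way to confirm this is to compare a loop-based result with np.linalg.norm. A minimal sketch; the name euclidean_loop and the sample points are illustrative and not part of the original code:

```python
import math

import numpy as np


def euclidean_loop(x1, x2):
    """Square root of the summed squared differences (hypothetical helper)."""
    total = 0.0
    for a, b in zip(x1, x2):
        total += (a - b) ** 2
    return math.sqrt(total)


# Illustrative points, not taken from the original data
p = [0.0, 0.0, 0.0]
q = [1.0, 2.0, 2.0]

loop_result = euclidean_loop(p, q)
numpy_result = float(np.linalg.norm(np.array(p) - np.array(q)))

print(loop_result, numpy_result)  # 3.0 3.0
assert math.isclose(loop_result, numpy_result)
```

If the two values agree on a few hand-checked points like this, the distance part is probably fine and any remaining problem is more likely in how the method is called.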

# making test data set
testSet = [[6.8, 3.4, 4.8, 2.4]]
test = pd.DataFrame(testSet)

# assigning different values to k
k = 1
k1 = 3
k2 = 11

# supplying test data to the model
result, neigh = knn(dataset, test, k)
result1, neigh1 = knn(dataset, test, k1)
result2, neigh2 = knn(dataset, test, k2)

# printing output prediction

print(result)
print(neigh)
print(result1)
print(neigh1)
print(result2)
print(neigh2)
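
The same steps (distance to every row, take the k nearest, majority vote) can also be written without the row-by-row pandas loop. The following is only a sketch under assumptions: knn_predict is a hypothetical name, the features are passed separately from the labels, and the toy data is made up rather than taken from iris.csv.

```python
from collections import Counter

import numpy as np


def knn_predict(X_train, y_train, x_test, k):
    """Return (predicted label, indices of the k nearest training rows)."""
    X_train = np.asarray(X_train, dtype=float)
    x_test = np.asarray(x_test, dtype=float)
    # Euclidean distance from x_test to every training row at once
    dists = np.sqrt(((X_train - x_test) ** 2).sum(axis=1))
    # indices of the k smallest distances (stable sort keeps ties in row order)
    nearest = np.argsort(dists, kind="stable")[:k]
    # majority vote among the labels of the nearest rows
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0], nearest.tolist()


# Toy data, made up for illustration (not the iris file)
X = [[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.9], [4.5, 5.5]]
y = ["a", "a", "b", "b", "b"]

label, idx = knn_predict(X, y, [5.0, 5.0], k=3)
print(label, idx)  # b [2, 3, 4]
```

With the DataFrame used above, and assuming the class label is the last column, the features and labels could be passed as dataset.iloc[:, :-1].to_numpy() and dataset.iloc[:, -1].to_numpy().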

