Problem 4 (20 pt.)
Given the following dataset:
? | ? | ? | Class
2.5 | 1.5 | 3.5 | -
0.5 | 1.0 | 1.5 | +
0.5 | 0.5 | 1.0 | +
2.0 | 2.5 | 2.5 | +
1.0 | 2.0 | 3.0 | -
2.0 | 3.0 | 1.5 | -
Suppose that you want to classify an observation z = (?.?, ?.?, ?.?) using K-Nearest Neighbors with Euclidean distance as the proximity metric. Answer the following questions:
(8 pts.) What is the distance between z and every observation in the dataset?
(3 pts.) What is the predicted class label for z if K = 1?
(3 pts.) What is the predicted class label for z if K = 2?
(3 pts.) What is the predicted class label for z if K = 3?
(3 pts.) What is the predicted class label for z if K = 4?
1) Distance between z and every observation in the dataset:
(i) Point 1:
(ii) Point 2:
(iii) Point 3:
(iv) Point 4:
(v) Point 5:
(vi) Point 6:
2) For K = 1, the closest point is V, so the predicted class label for z is '-'.
3) For K = 2, the closest points are IV and V, which have different class labels ('+' and '-'), so the vote is tied. The predicted class label for z is taken to be '-'.
4) For K = 3, the closest points are I, IV, and V. The most frequent label is '-', so the predicted class label for z is '-'.
5) For K = 4, the closest points are I, IV, V, and VI. The most frequent label is '-', so the predicted class label for z is '-'.
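The distances asked for in part 1 come from the Euclidean formula, d(z, x) = sqrt((z1 - x1)^2 + (z2 - x2)^2 + (z3 - x3)^2). A minimal Python sketch of that computation over the six observations follows. The coordinates of z did not survive in the problem statement, so `z_example` below is a hypothetical placeholder (the origin), not the observation from the problem, and the function name `euclidean_distances` is only illustrative.

```python
import math

# The six observations from the dataset, in order (Points I to VI).
POINTS = [
    (2.5, 1.5, 3.5),
    (0.5, 1.0, 1.5),
    (0.5, 0.5, 1.0),
    (2.0, 2.5, 2.5),
    (1.0, 2.0, 3.0),
    (2.0, 3.0, 1.5),
]

def euclidean_distances(z, points=POINTS):
    """Return the Euclidean distance from z to each point, in dataset order."""
    return [math.dist(z, p) for p in points]

# Hypothetical query, NOT the z from the problem statement.
z_example = (0.0, 0.0, 0.0)
for i, d in enumerate(euclidean_distances(z_example), start=1):
    print(f"Point {i}: {d:.3f}")
```

Substituting the actual z for `z_example` would give the six values for Points I to VI.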
For question 3 (K = 2), the two nearest neighbors have different labels, so whether the result is '+' or '-' depends on how the implementation breaks ties.
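To make the tie-breaking point concrete, here is a small sketch of K-Nearest Neighbors by majority vote on this dataset. The tie-break rule used here (prefer the label of the nearest neighbor among the tied labels) is an assumption chosen for illustration; other implementations may resolve ties differently. The name `knn_predict` and the query `z_example` are hypothetical, and `z_example` is not the z from the problem.

```python
import math
from collections import Counter

# Dataset from the problem: (features, class label), Points I to VI.
DATA = [
    ((2.5, 1.5, 3.5), "-"),
    ((0.5, 1.0, 1.5), "+"),
    ((0.5, 0.5, 1.0), "+"),
    ((2.0, 2.5, 2.5), "+"),
    ((1.0, 2.0, 3.0), "-"),
    ((2.0, 3.0, 1.5), "-"),
]

def knn_predict(z, k, data=DATA):
    """Majority vote among the k nearest points to z."""
    # Sort by Euclidean distance; sorted() is stable, so points at equal
    # distance keep their dataset order.
    neighbors = sorted(data, key=lambda row: math.dist(z, row[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    top = max(votes.values())
    tied = {label for label, n in votes.items() if n == top}
    # Assumed tie-break: return the label of the nearest neighbor whose
    # label is among the top-voted labels.
    for _, label in neighbors:
        if label in tied:
            return label

# Hypothetical query, NOT the z from the problem statement.
z_example = (0.0, 0.0, 0.0)
for k in (1, 2, 3, 4):
    print(f"K = {k}: {knn_predict(z_example, k)}")
```

With an even K such as 2 or 4, a split vote is possible, which is why the K = 2 answer depends on the rule chosen. Using an odd K avoids ties in a two-class problem.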