In: Computer Science
Which of the following methods can achieve zero training error on any linearly separable dataset?
(A) Support vector machines
(B) 3-Nearest Neighbor
(C) Linear perceptron
(D) Logistic regression
Please answer with an explanation, for each option, of why it may or may not achieve zero training error on any linearly separable dataset.
Ans: SVM and Linear Perceptron
Support Vector Machines: The "hard margin" SVM seeks a hyperplane that separates the two classes perfectly (possibly in a feature space induced by the kernel function) and, among all such hyperplanes, maximizes the margin (the distance from the plane to the closest points on either side). Its constraints require every training point to be on the correct side of the plane, so on any linearly separable dataset the hard-margin SVM achieves zero training error; maximizing the margin additionally controls the generalization error.
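As a minimal sketch of this (assuming scikit-learn and NumPy are available; the toy dataset is made up for illustration). scikit-learn's SVC is a soft-margin implementation, so the hard margin is approximated here with a very large C, which makes margin violations prohibitively expensive:

```python
import numpy as np
from sklearn.svm import SVC

# Toy linearly separable dataset in 2-D.
X = np.array([[-2, -1], [-1, -2], [-1, -1],   # class 0
              [ 1,  2], [ 2,  1], [ 1,  1]])  # class 1
y = np.array([0, 0, 0, 1, 1, 1])

# scikit-learn has no explicit hard-margin mode; a very large C
# penalizes margin violations so heavily that the fit behaves like one.
clf = SVC(kernel="linear", C=1e10).fit(X, y)

print("training accuracy:", clf.score(X, y))  # 1.0 on separable data
```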
3-Nearest Neighbor: Linear separability does not guarantee anything for 3-NN. When training error is measured, a point's three nearest neighbors are the point itself plus its two closest other points, and those two can both belong to the opposite class; the majority vote then misclassifies the point. So 3-NN is not guaranteed to achieve zero training error (see the sketch below for a concrete counterexample).

Linear Perceptron: By the perceptron convergence theorem, on any linearly separable dataset the perceptron's mistake-driven updates terminate after finitely many mistakes with a weight vector that classifies every training point correctly, i.e. zero training error. (This guarantee is about the training set; it does not imply zero error on a test set.)
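Below is a small sketch of both behaviors on the same linearly separable 1-D dataset (assuming scikit-learn and NumPy; the data are invented for illustration): 3-NN misclassifies the lone class-1 point, while a plain perceptron update loop reaches zero training error.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# 1-D linearly separable data: class 1 left of x = 0.5, class 0 right of it.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1, 0, 0, 0])

# 3-NN: the lone class-1 point's three nearest training neighbors are
# itself plus two class-0 points, so the majority vote misclassifies it.
knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print("3-NN training accuracy:", knn.score(X, y))  # 0.75, not 1.0

# Plain perceptron on the same data, with labels recoded to {-1, +1}
# and a bias feature appended. The convergence theorem guarantees this
# loop terminates on linearly separable data.
Xb = np.hstack([X, np.ones((len(X), 1))])
t = np.where(y == 1, 1, -1)
w = np.zeros(Xb.shape[1])
converged = False
while not converged:
    converged = True
    for xi, ti in zip(Xb, t):
        if ti * (w @ xi) <= 0:   # misclassified (or on the boundary)
            w += ti * xi         # perceptron update
            converged = False
print("perceptron training accuracy:", np.mean(np.sign(Xb @ w) == t))  # 1.0
```

Note that when k-NN is scored on its own training set, each point counts as its own nearest neighbor (distance zero), which is the convention under which the counterexample above works.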
Logistic regression: Logistic regression is also a linear classifier, i.e. it draws a line (a hyperplane in higher dimensions) and classifies by which side a point falls on (one side is class 0, the other is class 1). However, on a linearly separable dataset the unregularized maximum-likelihood objective has no finite minimizer: the likelihood keeps improving as the weights grow without bound. Practical solvers therefore regularize or stop after finitely many iterations, and the resulting finite-weight boundary is not guaranteed to classify every training point correctly. So logistic regression is not guaranteed to achieve zero training error.
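As a small sketch of this failure mode (assuming scikit-learn; the tiny C, i.e. heavy regularization, and the class imbalance are chosen purely for illustration):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, Perceptron

# Linearly separable but imbalanced 1-D data: class 1 is a small minority.
X = np.vstack([np.full((20, 1), -1.0),   # 20 class-0 points at x = -1
               np.full((2, 1), 1.0)])    # 2 class-1 points at x = +1
y = np.array([0] * 20 + [1] * 2)

# Heavy regularization (small C) shrinks the weight toward zero, so the
# intercept alone dominates and favors the majority class; the two
# minority points are misclassified even though the data are separable.
lr = LogisticRegression(C=1e-4).fit(X, y)
print("logistic regression training accuracy:", lr.score(X, y))  # ~0.91

# The perceptron has no pull toward small weights and separates perfectly.
p = Perceptron().fit(X, y)
print("perceptron training accuracy:", p.score(X, y))  # 1.0
```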