6) True/False with explanations:
a. In KNN, the complexity of the model space can be tuned to account for more complex decision boundaries by decreasing K.
b. If Margin(Model 1) > Margin(Model 2) on the same data, Model 1 will perform better on unseen data.
c. Perceptron minimizes training error when the data is linearly separable.
6-a) In KNN, the complexity of the model space can be tuned to account for more complex decision boundaries by decreasing K.
ANS :- TRUE
Explanation :- Training a kNN classifier simply consists of determining k and preprocessing the documents. If we preselect some value for k and do no preprocessing, then kNN requires no training at all. In practice, we have to perform preprocessing steps like tokenization, and it makes more sense to preprocess the training documents once, as part of the training phase, rather than repeatedly every time we classify a new test document.
Test time for kNN is Θ(|D|), i.e. linear in the size of the training set, because we need to compute the distance of each training document from the test document. Test time is independent of the number of classes J; kNN therefore has a potential advantage for problems with large J.
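As a rough illustration of this linear test-time cost, here is a minimal brute-force kNN prediction sketch in Python (the function name knn_predict and the use of Euclidean distance are choices made for the example, not part of the question): it computes exactly one distance per training example and takes a majority vote among the k nearest.

import numpy as np

def knn_predict(X_train, y_train, x_test, k):
    # One distance per training example: test time grows linearly with |D|.
    dists = np.linalg.norm(X_train - x_test, axis=1)
    nearest = np.argsort(dists)[:k]                    # indices of the k closest training points
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]                   # majority vote among the k neighbours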
In short, a small K lets the prediction depend on only a few nearest neighbours, so the decision boundary can follow local structure in the data, while a large K averages over many neighbours and smooths the boundary (the sketch below illustrates this):
Large value of K = simple model = underfit = low variance & high bias
Small value of K = complex model = overfit = high variance & low bias
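A small demonstration of this trade-off, using scikit-learn's KNeighborsClassifier on a synthetic two-moons data set (the data set, noise level and the particular K values are arbitrary choices for illustration): K = 1 fits the training set almost perfectly but generalises worse, while a very large K underfits both.

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Noisy, non-linearly-separable toy data.
X, y = make_moons(n_samples=300, noise=0.3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for k in (1, 15, 101):
    clf = KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_tr)
    print(f"K={k:3d}  train acc={clf.score(X_tr, y_tr):.2f}  test acc={clf.score(X_te, y_te):.2f}")

# Typical outcome: K=1 gives ~1.00 training accuracy (complex, overfit boundary),
# while K=101 smooths the boundary so much that both accuracies drop (underfit).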
6-b) If Margin(Model 1) > Margin(Model 2) on the same data, Model 1 will perform better on unseen data.
ANS :- FALSE
Explanation :-
Generalisation error in statistics is the out-of-sample error, i.e. a measure of how accurately a model can predict values for previously unseen data. A larger margin on the training data only yields a tighter generalisation bound; it is not a guarantee of lower out-of-sample error, because the unseen data may happen to fall where the larger-margin model makes more mistakes. Hence the statement is false (a toy counterexample follows).
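Here is a hypothetical two-point counterexample in Python (the weight vectors, the training points and the unseen point are all invented for illustration): Model 1 has the larger margin on the training data, yet it is the one that misclassifies the unseen point.

import numpy as np

# Two linearly separable training points, labels in {-1, +1}.
X_train = np.array([[-1.0, 0.0], [1.0, 0.0]])
y_train = np.array([-1, 1])

# Two unit-norm linear models, decision rule sign(w . x).
w1 = np.array([1.0, 0.0])    # Model 1
w2 = np.array([0.8, -0.6])   # Model 2

def margin(w, X, y):
    return np.min(y * (X @ w))        # geometric margin for a unit-norm w

print(margin(w1, X_train, y_train))   # 1.0 -> Margin(Model 1)
print(margin(w2, X_train, y_train))   # 0.8 -> Margin(Model 2)

# One unseen point whose true label is -1.
x_new, y_new = np.array([0.3, 2.0]), -1
print(np.sign(w1 @ x_new) == y_new)   # False: the larger-margin model is wrong here
print(np.sign(w2 @ x_new) == y_new)   # True:  the smaller-margin model is right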
6-c) Perceptron minimizes training error when the data is linearly separable.
ANS:- TRUE
Explanation :-
The open (closed) positive half-space associated with the n-dimensional weight vector w is the set of all points x ∈ ℝ^n for which w·x > 0 (w·x ≥ 0). The open (closed) negative half-space associated with w is the set of all points x ∈ ℝ^n for which w·x < 0 (w·x ≤ 0).
We omit the adjectives “closed” or “open” whenever it is clear from the context which kind of linear separation is being used. Let P and N stand for two finite sets of points in ℝ^n which we want to separate linearly. A weight vector is sought so that the points in P belong to its associated positive half-space and the points in N to the negative half-space. The error of a perceptron with weight vector w is the number of incorrectly classified points, and the learning algorithm must minimize this error function E(w). One possible strategy is a local greedy algorithm: compute the error of the perceptron for a given weight vector, look for a direction in weight space in which to move, and update the weight vector by selecting new weights in the chosen search direction; we can visualize this strategy by looking at its effect in weight space. When the data are linearly separable, the perceptron convergence theorem guarantees that this procedure terminates with a weight vector that classifies every training point correctly, so E(w) = 0, the minimum possible training error. Hence the statement is true (a sketch of the update rule follows).
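A minimal sketch of the classic perceptron update rule in Python, assuming labels in {-1, +1} and no bias term (append a constant feature to x if an offset is needed); the function name and the toy data are invented for the example. On linearly separable data it stops as soon as no point is misclassified, i.e. E(w) = 0.

import numpy as np

def perceptron_train(X, y, max_epochs=1000):
    # Classic perceptron rule for labels in {-1, +1}.
    w = np.zeros(X.shape[1])
    for _ in range(max_epochs):
        errors = 0
        for x_i, y_i in zip(X, y):
            if y_i * (w @ x_i) <= 0:    # point misclassified (or on the boundary)
                w += y_i * x_i          # move w toward the correct half-space
                errors += 1
        if errors == 0:                 # E(w) = 0: training error is minimized
            return w
    return w                            # no guarantee if the data are not separable

# Example: P = {(2, 1)} and N = {(-1, -2)} are linearly separable.
X = np.array([[2.0, 1.0], [-1.0, -2.0]])
y = np.array([1, -1])
w = perceptron_train(X, y)
print(np.all(np.sign(X @ w) == y))      # True: zero misclassified points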