Question

Machine Learning - multivariate methods

Let us say that, in two dimensions, we have two classes with exactly the same mean. What types of boundaries can be defined? Show a picture of the options.

Solutions

Expert Solution

Is least-squares error a suitable choice for classification?

No, it is not. One problem with the least-squares solution is that it lacks robustness to outliers. Consider the figure given below: on the left-hand side there are two classes, shown in red and blue, separated by a decision boundary. The green boundary is the solution obtained by logistic regression, and the magenta boundary is the least-squares solution. On the right-hand side, after more data points are added to the training set, the magenta boundary shifts and misclassifies a few data points, while the green boundary is essentially unmoved. This shows that the least-squares solution penalizes even predictions that are "too correct", that is, points that lie far from the decision boundary on the correct side.
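The effect described above can be reproduced on a small one-dimensional toy problem. The following is only a sketch under stated assumptions: the data points, the function names `least_squares_boundary` and `logistic_boundary`, and the learning-rate and step settings are made up for illustration and are not taken from the figure. Logistic regression is fitted here with plain gradient descent rather than with a library solver.

```python
import numpy as np

def least_squares_boundary(x, t):
    """Fit t ~ w*x + b by least squares (targets are -1/+1).

    Returns the decision boundary, i.e. the x where w*x + b = 0.
    """
    A = np.column_stack([x, np.ones_like(x)])
    (w, b), *_ = np.linalg.lstsq(A, t, rcond=None)
    return -b / w

def logistic_boundary(x, t, lr=0.5, steps=2000):
    """Fit logistic regression by gradient descent on the cross-entropy error.

    lr and steps are illustrative choices, not tuned values.
    Returns the x where the predicted probability equals 0.5.
    """
    y = (t + 1) / 2                 # map -1/+1 labels to 0/1
    w, b = 0.0, 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(w * x + b)))
        w -= lr * np.mean((p - y) * x)
        b -= lr * np.mean(p - y)
    return -b / w

# Hypothetical toy data: two symmetric clusters around x = 0.
x_base = np.array([-2.0, -1.5, -1.0, 1.0, 1.5, 2.0])
t_base = np.array([-1.0, -1.0, -1.0, 1.0, 1.0, 1.0])

# Extra points of the positive class, far on the *correct* side.
x_more = np.concatenate([x_base, [8.0, 9.0, 10.0]])
t_more = np.concatenate([t_base, [1.0, 1.0, 1.0]])

ls_before = least_squares_boundary(x_base, t_base)
ls_after = least_squares_boundary(x_more, t_more)
lr_before = logistic_boundary(x_base, t_base)
lr_after = logistic_boundary(x_more, t_more)

print(f"least squares: {ls_before:.3f} -> {ls_after:.3f}")
print(f"logistic:      {lr_before:.3f} -> {lr_after:.3f}")
```

On this toy data both boundaries start at 0 by symmetry. After the far-away points are added, the least-squares boundary moves to about 0.80, close to the nearest positive point at x = 1, while the logistic boundary stays near 0. The far points are already classified correctly, yet their large squared residuals still pull the least-squares fit toward them.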

The least-squares solution fails not only on data containing outliers but also in several other cases. This is because the least-squares solution is equivalent to the maximum likelihood solution under the assumption of a Gaussian conditional distribution for the targets (as we will see in later sections). For datasets whose targets do not follow such a distribution, as is the case with discrete class labels, least squares is therefore not a good choice.
There are several alternative error functions for classification, such as the cross-entropy error minimized by logistic regression.
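
The link between least squares and the Gaussian assumption mentioned above can be sketched in a few lines. The notation is an assumption introduced here for illustration: inputs $x_n$, targets $t_n$, weight vector $w$, noise variance $\sigma^2$, and $N$ training points.

```latex
% Assume each target is the linear prediction plus Gaussian noise:
p(t_n \mid x_n, w, \sigma^2) = \mathcal{N}\!\left(t_n \mid w^{\top} x_n, \sigma^2\right)

% Log-likelihood of N independent training points:
\ln p(\mathbf{t} \mid X, w, \sigma^2)
  = -\frac{1}{2\sigma^2} \sum_{n=1}^{N} \left(t_n - w^{\top} x_n\right)^2
    - \frac{N}{2} \ln\!\left(2\pi\sigma^2\right)

% The second term does not depend on w, so maximizing over w
% is the same as minimizing the sum-of-squares error:
E(w) = \frac{1}{2} \sum_{n=1}^{N} \left(t_n - w^{\top} x_n\right)^2
```

Class labels take only a few discrete values, so they are far from Gaussian around the linear prediction, which is why the assumption behind $E(w)$ fits classification poorly.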

