Question

In: Computer Science

  1. What is unsupervised learning, what difficulties does it involve, and what are a few unsupervised algorithms?

  2. What is PCA? When should PCA be used? [Please explain with one or two examples]

  3. How does PCA work? [Please write at least 5 sentences]

  4. What are the different methods for computing PCA? Does every method yield the same result?

  5. What are the advantages and disadvantages of PCA? [Explain with an example]

  6. What is clustering? Explain how the K-Means clustering algorithm works.

  7. What are the advantages and disadvantages of the clustering algorithms discussed in our class (K-Means, hierarchical)?

  8. Which clustering algorithm is better, K-Means or hierarchical clustering? Explain with a proper example.

Solutions

Expert Solution

1. Unsupervised learning is a technique that trains a model on unlabeled data. Unlike in supervised learning, there are no target values, so the model cannot learn to map feature values to a known result; it has to find structure in the data on its own. In the real world, unlabeled data is far more plentiful than labeled data. The following are some of the difficulties in unsupervised learning:

- Training can have a higher time complexity than most supervised learning methods.

- The number of clusters that the model should form is unknown prior to training.

- Data pre-processing is often difficult because no labels are available.

- The model might find patterns in the data that are not relevant to the task.

A few unsupervised algorithms are K-means, hierarchical clustering, and fuzzy C-means.
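To make the points above concrete, here is a minimal sketch of K-means (Lloyd's algorithm) in NumPy. The `kmeans` function, the synthetic two-blob dataset, and the choice of `k=2` are all assumptions made for illustration. Notice that no labels are passed in anywhere, and that the number of clusters has to be chosen before training.

```python
import numpy as np

def kmeans(points, k, n_iters=20, seed=0):
    """Plain Lloyd's algorithm: only the points are used, never any labels."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iters):
        # Assign every point to its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        assignments = dists.argmin(axis=1)
        # Move each centroid to the mean of the points assigned to it.
        for j in range(k):
            members = points[assignments == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    # Final assignment against the final centroids.
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1), centroids

# Synthetic, unlabeled data: two well-separated blobs (illustrative only).
rng = np.random.default_rng(42)
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=[5.0, 5.0], scale=0.5, size=(50, 2))
points = np.vstack([blob_a, blob_b])

# k must be chosen up front; here we happen to know that 2 is sensible.
assignments, centroids = kmeans(points, k=2)
print(centroids)
```

On data this cleanly separated the two blobs end up in different clusters, but which blob gets cluster index 0 is arbitrary, which is one reason unsupervised results need interpretation afterwards.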

2. Real-world datasets often have thousands of features (dimensions). This can reduce the performance and accuracy of some models, because not all the features in the dataset are useful to the model. The number of features can be reduced during data pre-processing using techniques such as principal component analysis (PCA), LDA, and GDA. Apart from improving the performance of the model, reducing the number of features also helps with data visualization.

Example: The popular MNIST dataset contains images of handwritten digits. Each pixel in an image is treated as a feature.

Above is an example of the digit "2" from the MNIST dataset. Notice that all the digits are centered. The white pixels between the border and the handwritten digit are not useful. In such situations, dimensionality reduction can be used to remove the unwanted features.
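The idea that border pixels carry no information can be checked with a small sketch. The code below does not load MNIST; it builds hypothetical 8x8 stand-in images with a blank 2-pixel border, and it uses a plain variance filter rather than PCA itself, purely to show why such pixels are safe to drop.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for MNIST: 200 synthetic 8x8 "images" whose
# 2-pixel border is always blank and whose centre holds random ink.
n_images, side, border = 200, 8, 2
inner = side - 2 * border
images = np.zeros((n_images, side, side))
images[:, border:-border, border:-border] = rng.random((n_images, inner, inner))

# Flatten so that each pixel becomes one feature (one column).
X = images.reshape(n_images, -1)

# A pixel that never changes has zero variance and carries no information.
pixel_variance = X.var(axis=0)
informative = pixel_variance > 0
X_reduced = X[:, informative]

print(X.shape, "->", X_reduced.shape)
```

PCA goes a step further than this filter: besides ignoring directions with no variance, it also merges pixels that vary together into a single component.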

3. PCA is the most popular dimensionality reduction technique. PCA projects the data onto a hyperplane, and the dimensions along which the data has the least variance are removed. For example, consider a 2D dataset which is projected as follows:

Note that the variance is greatest along the solid line, followed by the dotted line and then the dashed line. Thus the solid and dotted axes can be preserved and the dashed axis can be removed.
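The procedure can be sketched in a few lines of NumPy. This is one possible way to compute PCA (via the singular value decomposition), shown on synthetic data that is an assumption for illustration and not the dataset from the figure: centre the data, find the axes ordered by variance, then keep only the highest-variance ones.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic 2D data stretched along the direction (1, 1).
t = rng.normal(scale=3.0, size=300)       # large spread along the main direction
noise = rng.normal(scale=0.3, size=300)   # small spread across it
X = np.column_stack([t + noise, t - noise])

# Step 1: centre the data so that each feature has zero mean.
X_centered = X - X.mean(axis=0)

# Step 2: SVD of the centred data. The rows of Vt are the principal axes,
# ordered from highest to lowest variance.
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Step 3: variance captured by each axis, as a fraction of the total.
explained_variance = S**2 / (len(X) - 1)
explained_ratio = explained_variance / explained_variance.sum()

# Step 4: keep only the first axis, reducing the data from 2D to 1D.
X_projected = X_centered @ Vt[:1].T

print("principal axes:\n", Vt)
print("explained variance ratio:", explained_ratio)
```

Because almost all of the spread lies along one direction here, dropping the second axis loses very little information. The sign of each principal axis is arbitrary, so the first row of `Vt` may point along (1, 1) or (-1, -1).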

5. Advantages of PCA: it increases model performance, removes correlated features, and improves visualization.

Disadvantages: The retained features are turned into principal components, which can be difficult to interpret in some cases. The data also has to be scaled and standardized before PCA is applied.
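The scaling requirement is easy to demonstrate. In the sketch below, the two features (height in metres and income in dollars) and the helper `explained_ratio` are hypothetical and chosen only for illustration. The features are generated independently, so neither should dominate, yet without standardization the first principal component is almost entirely income, simply because of its units.

```python
import numpy as np

def explained_ratio(X):
    """Fraction of total variance captured by each principal component."""
    X_centered = X - X.mean(axis=0)
    singular_values = np.linalg.svd(X_centered, compute_uv=False)
    variances = singular_values**2
    return variances / variances.sum()

rng = np.random.default_rng(2)

# Hypothetical features on very different scales, generated independently.
height_m = rng.normal(loc=1.7, scale=0.1, size=500)
income_usd = rng.normal(loc=50_000, scale=15_000, size=500)
X = np.column_stack([height_m, income_usd])

# Without scaling, income dominates purely because of its units.
ratio_raw = explained_ratio(X)

# Standardize: zero mean and unit variance for each feature.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)
ratio_scaled = explained_ratio(X_scaled)

print("raw:   ", ratio_raw)
print("scaled:", ratio_scaled)
```

After standardization the two components share the variance roughly evenly, which reflects the fact that the features are unrelated.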

