In each of the following cases, identify whether the task
required is supervised or unsupervised learning, and then identify
the appropriate technique—i.e., prediction, classification,
affinity or clustering analysis—that you would use. Assume that an
appropriate dataset is available for your algorithm to learn
from.
a. Deciding whether to issue a loan to an applicant based on
demographic and financial data using a database of similar data on
prior customers.
b. In an online bookstore, making recommendations to customers
concerning additional items to buy based on the buying patterns in
prior transactions.
c. Identifying a network data packet as dangerous (e.g., virus,
hacker attack) based on comparison to other packets whose threat
status is known.
d. Identifying segments of similar customers.
e. Predicting whether a company will go bankrupt based on comparing
its financial data to those of similar bankrupt and nonbankrupt
firms.
f. Estimating the repair time required for an aircraft based on a
trouble ticket.
g. Printing custom discount coupons at the conclusion of a grocery
store checkout based on what you just bought and what others have
bought previously.
a. Supervised (Classification)
The algorithm learns from the records of prior customers and then
classifies each new applicant into one of two classes: issue the
loan or do not issue it.
b. Unsupervised (Affinity)
No outcome label is involved. The task is to find which items tend
to be bought together in prior transactions (association rules) and
to recommend additional items on that basis.
c. Supervised (Classification)
Each new packet is compared with earlier packets that are already
labeled as dangerous or safe, and is assigned to one of those two
classes.
d. Unsupervised (Clustering)
No predefined segment labels are given, so the algorithm must group
customers by similarity on its own. Finding such groups is cluster
analysis.
e. Supervised (Classification)
The company's financial data is compared with the records of firms
already labeled as bankrupt or nonbankrupt, and the company is
assigned to one of those two classes.
f. Supervised (Prediction)
Repair time is a numerical outcome, so the algorithm uses a
prediction method such as regression, learning from previous trouble
tickets.
g. Unsupervised (Affinity)
There is no outcome label. Coupons are chosen from items that other
shoppers frequently bought together with the items in the current
basket, which is association-rule (market basket) analysis.
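
The contrast between a supervised classification task and an unsupervised affinity task can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy packet features, the basket contents, and the function names classify_1nn and recommend are all hypothetical, and a 1-nearest-neighbor rule and simple co-occurrence counting stand in for whatever classifier or association-rule method would be used in practice.

```python
import math
from collections import Counter

# Supervised: every training example carries a known label.
# Hypothetical toy data: (feature vector, threat status).
labeled_packets = [
    ((0.9, 0.8), "dangerous"),
    ((0.8, 0.9), "dangerous"),
    ((0.1, 0.2), "safe"),
    ((0.2, 0.1), "safe"),
]

def classify_1nn(features, training):
    """Return the label of the closest labeled example (1-nearest neighbor)."""
    nearest = min(training, key=lambda ex: math.dist(features, ex[0]))
    return nearest[1]

# Unsupervised (affinity): transactions only, no labels at all.
# Hypothetical toy baskets.
baskets = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "jam"},
]

def recommend(item, transactions):
    """List items bought together with `item`, most frequent first."""
    counts = Counter()
    for basket in transactions:
        if item in basket:
            counts.update(basket - {item})
    return [other for other, _ in counts.most_common()]

if __name__ == "__main__":
    print(classify_1nn((0.85, 0.75), labeled_packets))
    print(recommend("jam", baskets))
```

The classifier cannot run without the labels in labeled_packets, whereas recommend needs nothing beyond the raw baskets. That difference is what separates the supervised cases from the unsupervised ones.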