Question

Is the following an example of a classification, regression, or clustering problem? Why? For a data mining project, a student collects information on income, age, sex, profession, and home zip code for fans of the 9 different New York sports teams. She wants to build a model to predict which team someone roots for.

Solutions

Expert Solution

The given example is a classification problem.

Regression and classification are supervised learning approaches: they learn to map an input to an output from example input-output pairs. Clustering is an unsupervised learning approach: it works on inputs alone, with no example outputs.

Classification predictive modeling is the task of approximating a mapping function (f) from input variables (X) to discrete output variables (y).

Regression predictive modeling is the task of approximating a mapping function (f) from input variables (X) to a continuous output variable (y).

Clustering is the task of partitioning the dataset into groups, called clusters.
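
The three definitions above can be read as a decision rule on the output variable. The following is a minimal sketch, not a general-purpose tool: `task_type` is a hypothetical helper, and it assumes that continuous targets are stored as Python floats while anything else (strings, integer category codes) is discrete. Real data needs more care, since integer targets can also be counts.

```python
def task_type(targets):
    """Guess the learning task from the target column (hypothetical helper).

    targets: a list of target values, or None when no output
    variable was collected at all.
    """
    if targets is None:
        # No output variable: only grouping the inputs is possible.
        return "clustering"
    # Assumption: float targets are continuous, everything else is discrete.
    if all(isinstance(t, float) for t in targets):
        return "regression"
    return "classification"


print(task_type(["Team A", "Team B", "Team A"]))  # discrete labels
print(task_type([52000.0, 61000.5, 48250.0]))     # continuous values
print(task_type(None))                            # no labels
```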

Classification and clustering can look similar because both assign objects to groups based on their features. The difference is where the groups come from. In classification, every training instance carries a predefined label, and the model learns to reproduce those labels. In clustering, no labels are given, and the algorithm has to discover the groups from the features alone.
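
To illustrate the clustering side of that contrast, here is a small sketch of k-means in plain Python. The data points are made up for illustration (age, income in $1000s), seeding the centers with the first k points is an arbitrary simplification, and the features are left unscaled. Note that the function receives only features: the group ids it returns are anonymous numbers, not team names.

```python
import math


def kmeans(points, k, iterations=10):
    """Plain k-means; the first k points seed the centers (arbitrary choice)."""
    centers = [tuple(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest center.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Move each center to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignment


# Made-up rows: (age, income in $1000s). No labels are supplied.
people = [(22, 30), (25, 35), (24, 32), (58, 120), (61, 110), (55, 115)]
groups = kmeans(people, k=2)
print(groups)
```

A classifier, by contrast, would be given a known label alongside each row and would be judged on how well it predicts that label for new rows.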

In our case, the value to be predicted is one of 9 discrete values, one for each of the 9 New York sports teams. It is predicted from the input features income, age, sex, profession, and home zip code.

Since the output variable is discrete, and each collected record already carries a predefined label (the team that fan roots for), it is a classification problem.
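
As a rough sketch of what such a model could look like, the snippet below uses a 1-nearest-neighbor classifier in plain Python. Everything concrete here is an assumption for illustration: the training rows are made up, the team labels are placeholders, and only two numeric features (income and age) are used. The categorical features (sex, profession, zip code) would need to be encoded as numbers first, and the question does not prescribe any particular algorithm.

```python
import math

# Made-up training rows: (income in $1000s, age) -> placeholder team label.
training = [
    ((40, 23), "Team A"),
    ((45, 27), "Team A"),
    ((90, 41), "Team B"),
    ((95, 45), "Team B"),
    ((60, 68), "Team C"),
    ((65, 72), "Team C"),
]


def predict_team(features, rows=training):
    """Return the label of the closest training row (1-nearest-neighbor)."""
    _, label = min(rows, key=lambda row: math.dist(features, row[0]))
    return label


# The prediction is always one of the predefined labels, never a new group.
print(predict_team((42, 25)))
```

The output is always drawn from the fixed set of labels seen in training, which is what makes this classification rather than regression or clustering.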

