Question

Data Mining Techniques

Please discuss whether or not the following problems are data mining tasks. Explain why.

(a). Retrieve students' records from a relational table with grade = "A". [5 points]

(b). From the table of students' information, check if attributes last name and address have any correlations. [5 points]

(c). Find all the documents from the text database containing keywords "data mining". [5 points]

(d). Divide the text database into several groups, each group containing near-duplicate or similar documents. [5 points]

(e). Based on historical stock data, as well as other attributes (e.g., gold price, gas price, etc.) for the past few days, predict the trend of a stock tomorrow. [5 points]

(f). Please provide your own example of a data mining task. [5 points]

Solutions

Expert Solution

ANSWER:

A) Relational Model:

Definition: The relational model organizes data into one or more tables of columns and rows, with a unique key identifying each row. Rows are called "tuples" and represent entities (instances); columns are called "attributes" and describe properties of those instances.

Retrieving students' records from a relational table with grade = "A" is not a data mining task; it is a database management (query) task. The query only looks up rows that are already stored and does not discover any new pattern. Data mining does have a concept called relational data mining, which is used on relational databases, but a simple selection like this one is not an example of it.

The query to retrieve the students' records is:

SELECT * FROM student WHERE grade = 'A'; -- returns every row of the student table whose grade is 'A'
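To make the point about part (a) concrete, here is a minimal sketch that runs the same selection with Python's built-in sqlite3 module. The table contents and the `name` column are made up for illustration; only the `student` table and the `grade` column come from the question.

```python
import sqlite3

# Hypothetical in-memory table; the rows and the `name` column are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, grade TEXT)")
conn.executemany(
    "INSERT INTO student (name, grade) VALUES (?, ?)",
    [("Ann", "A"), ("Bob", "B"), ("Cara", "A")],
)

# Plain retrieval: the result is fully determined by the stored rows,
# so nothing new is discovered about the data.
rows = conn.execute(
    "SELECT * FROM student WHERE grade = ? ORDER BY name", ("A",)
).fetchall()
print(rows)
conn.close()
```

The query returns exactly the rows that match the condition, which is why it counts as a lookup and not as mining.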

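B) Correlations:

Definition:

Checking whether the attributes last name and address are correlated means looking for a statistical association between them. This is generally regarded as a data mining task (correlation analysis), because the answer is a pattern that is not stored explicitly in the table. It should not be confused with a correlated query, which is an SQL concept and not data mining: a correlated query is a subquery that uses a value from the outer query.

Query (a correlated subquery, shown for comparison; it lists students who share both last name and address with at least one other student):

SELECT s.lastname, s.address
FROM student AS s
WHERE 1 < (SELECT COUNT(*)
           FROM student AS t
           WHERE t.lastname = s.lastname
             AND t.address = s.address);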
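Question (b) asks about correlation between two attributes of the table. One possible way to check for an association between two categorical attributes is a chi-square statistic over their contingency table, normalized to Cramér's V. The sketch below is only an illustration under that assumption: the function names and the sample values are hypothetical, and real last names and addresses have so many distinct values that the data would usually need to be grouped first.

```python
from collections import Counter

def chi_square(pairs):
    """Pearson chi-square statistic for a list of (value_a, value_b) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    stat = 0.0
    for a, count_a in left.items():
        for b, count_b in right.items():
            expected = count_a * count_b / n  # count expected under independence
            observed = joint.get((a, b), 0)
            stat += (observed - expected) ** 2 / expected
    return stat

def cramers_v(pairs):
    """Cramér's V: 0 means no association, 1 means perfect association."""
    n = len(pairs)
    k = min(len({a for a, _ in pairs}), len({b for _, b in pairs}))
    if n == 0 or k < 2:
        return 0.0
    return (chi_square(pairs) / (n * (k - 1))) ** 0.5

# Hypothetical (lastname, address) values.
linked = [("Smith", "1 Elm St")] * 2 + [("Lee", "2 Oak Ave")] * 2
mixed = [("Smith", "1 Elm St"), ("Smith", "2 Oak Ave"),
         ("Lee", "1 Elm St"), ("Lee", "2 Oak Ave")]
print(cramers_v(linked), cramers_v(mixed))
```

In the first sample each last name always appears with the same address, so the association is as strong as it can be; in the second sample every combination appears equally often, so there is none.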

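C) I did not have a clear idea about this part of the question. By the same reasoning as in part A, however, finding all documents that contain the keywords "data mining" looks like a plain lookup (information retrieval) rather than a data mining task.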

D) Dividing the text database into groups of near-duplicate or similar documents is a data mining task (clustering). Duplicate record detection is the process of identifying different or multiple records that refer to one unique object. Duplicates are usually detected during the data preparation stage, in which data entries are stored in the database; this stage includes parsing, data transformation, and standardization steps. Because records contain multiple fields, the duplicate detection problem becomes more complicated.

There are two categories of approaches for matching records:

1) Approaches that rely on training data to learn how to match records.

2) Approaches that rely on domain knowledge to match records.

Some techniques used for the matching models are:

1) Bayes decision rule for minimum error.

2) Bayes decision rule for minimum cost.

3) Decision with a reject region.

4) Supervised and unsupervised learning, etc.
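As a rough illustration of part (d), the sketch below groups documents whose word sets are similar. It is not one of the techniques listed above: it swaps in a simple Jaccard similarity with a fixed threshold, and the function names, the threshold of 0.6, and the sample documents are all assumptions chosen for the example.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def group_similar(docs, threshold=0.6):
    """Group document indices whose word sets are similar (transitively)."""
    tokens = [set(d.lower().split()) for d in docs]
    parent = list(range(len(docs)))  # union-find forest over document indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Compare every pair of documents and merge the groups of similar ones.
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if jaccard(tokens[i], tokens[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(docs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Hypothetical documents.
docs = [
    "data mining finds patterns in large data sets",
    "data mining finds patterns in very large data sets",
    "the stock market closed higher today",
]
print(group_similar(docs))
```

The groups are not known in advance and emerge from the data, which is what separates this task from the lookups in parts (a) and (c). Comparing every pair is quadratic in the number of documents, so this sketch only suits small collections.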

