Question

There is a strong linkage between statistical data analysis and data mining. Some people think of data mining as automated and scalable methods for statistical data analysis. Do you agree or disagree with this perception? Present one statistical analysis method that can be automated and/or scaled up nicely by integration with current data mining methodology.

Solutions

Expert Solution

The differences between data mining and statistics can be compared as follows:

Data Mining | Statistics
Explores and gathers data first, then builds models to detect patterns and form theories. | Provides theories that are then tested using statistical methods.
Data used is numeric or non-numeric. | Data used is primarily numeric.
Inductive process (generates new theories from the data). | Deductive process (tests an existing theory or hypothesis against the data rather than generating new ones).
Data collection is less important. | Data collection is more important.
Data cleaning is done as part of data mining. | Clean data is required before a statistical method is applied.
Needs less user interaction to validate a model, hence easier to automate. | Needs user interaction to validate a model, hence harder to automate.
Suitable for large data sets. | Suitable for smaller data sets.
Algorithms learn patterns from the data without explicitly programmed rules. | Formalizes relationships in the data as mathematical equations.
Uses heuristic thinking (rules of thumb used to form judgments and make decisions). | Leaves little scope for heuristic thinking.
Methods: classification, clustering, neural networks, association, estimation, sequence-based analysis, visualization. | Methods: descriptive statistics, inferential statistics.
Applications: financial data analysis, retail industry, telecommunication industry, biological data analysis, certain scientific applications, etc. | Applications: demography, actuarial science, operations research, biostatistics, quality control, etc.

Standard descriptive analytics can be automated to generate insights at once, which supports visualization and decision making. Inferential methods such as the t-test, z-test, and ANOVA have also been automated and can be helpful alongside data mining techniques. Sampling distributions, however, are tough to estimate and automate.
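As a rough illustration of how a descriptive summary and a t-test can be automated and scaled, the sketch below makes a single pass over the data, keeps running means and variances per group, and then computes Welch's t statistic for every feature from those summaries. The row layout, the feature names `spend` and `visits`, and the helper names `RunningStats`, `summarize`, and `welch_t` are assumptions made for this example, not part of the original answer. It uses only the Python standard library and stops at the t statistic and degrees of freedom, without a p-value.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningStats:
    """One-pass (Welford) accumulator for count, mean, and sample variance."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        # Sample variance (n - 1 denominator); undefined for fewer than 2 values.
        return self.m2 / (self.n - 1) if self.n > 1 else float("nan")


def summarize(rows, group_key, features):
    """Stream over the rows once, keeping one accumulator per (feature, group)."""
    stats = {}
    for row in rows:
        group = row[group_key]
        for feature in features:
            stats.setdefault((feature, group), RunningStats()).update(row[feature])
    return stats


def welch_t(a: RunningStats, b: RunningStats):
    """Welch's t statistic and approximate degrees of freedom from two summaries."""
    va, vb = a.variance / a.n, b.variance / b.n
    t = (a.mean - b.mean) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (a.n - 1) + vb ** 2 / (b.n - 1))
    return t, df


# Hypothetical toy data: two groups and two numeric features.
rows = [
    {"group": "A", "spend": 10.0, "visits": 1.0},
    {"group": "A", "spend": 12.0, "visits": 2.0},
    {"group": "A", "spend": 14.0, "visits": 3.0},
    {"group": "B", "spend": 20.0, "visits": 2.0},
    {"group": "B", "spend": 22.0, "visits": 2.0},
    {"group": "B", "spend": 24.0, "visits": 5.0},
]
features = ["spend", "visits"]

stats = summarize(rows, "group", features)
# The same test is applied to every feature without manual intervention.
results = {f: welch_t(stats[(f, "A")], stats[(f, "B")]) for f in features}
for feature, (t, df) in results.items():
    print(f"{feature}: t = {t:.3f}, df = {df:.2f}")
```

Because each accumulator stores only a count, a mean, and a sum of squared deviations, the data never has to fit in memory, which is what makes this kind of test practical on large data sets. Interpreting the resulting statistics still requires checking the test's assumptions, which is the part that is hard to automate.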

