How are exploratory data analysis (EDA) and hypothesis testing different? Explain why EDA could be preferred in data mining, and justify your explanation with a specific example.
It is essential to know a little about the system being explored before forming a hypothesis, and it is very useful to understand the noise and variation in the data before designing an experiment.
Hypothesis testing is confirmatory: it starts from a hypothesis stated in advance and uses the data to decide whether the evidence against it is statistically significant. Exploratory analysis is open-ended: it suggests hypotheses and leads to the design and running of new experiments to test them specifically.
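EDA is an approach to analysing data sets that summarizes their main characteristics, often using visual methods. It is used to see what the data can tell us before any formal calculation or modelling, which makes it a very important tool for analysis.
Hence it is preferred in data mining, where the data set is usually large and collected without a specific hypothesis in mind, so patterns have to be discovered before they can be tested.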
For example, suppose we have a data set of 1 lakh (100,000) observations covering 5 banks over 10 years, with information on debits, credits, loans, and so on. We first ask whether the data follow any pattern, whether there are similarities between the banks, and which factor affects a particular bank the most. All of these questions can be addressed with EDA. The graphs give us an initial idea of the data, and we draw preliminary conclusions on that basis. For further analysis, we then test specific hypotheses and draw formal conclusions.
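The two-stage workflow in the bank example can be sketched in Python. This is only an illustration under stated assumptions: the bank names, the loan figures, and the helper functions summarize and permutation_test are all invented here, and the data are simulated rather than real. The first five years are used for exploration and the last five are held back for the test, so the hypothesis suggested by EDA is checked on data that did not suggest it. A permutation test is used as one possible choice of test; the original answer does not name one.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical data: yearly loan totals (arbitrary units) for 5 banks over 10 years.
# Bank names and underlying means are invented for illustration only.
true_means = {"bank_A": 100, "bank_B": 100, "bank_C": 105, "bank_D": 140, "bank_E": 98}
loans = {
    bank: [random.gauss(mean, 10) for _ in range(10)]
    for bank, mean in true_means.items()
}

# Split each series: first 5 years for exploration, last 5 for confirmation.
explore = {bank: values[:5] for bank, values in loans.items()}
confirm = {bank: values[5:] for bank, values in loans.items()}


def summarize(data):
    """EDA step: per-bank summary statistics, computed before any formal test."""
    return {
        bank: {
            "mean": statistics.mean(values),
            "stdev": statistics.stdev(values),
            "min": min(values),
            "max": max(values),
        }
        for bank, values in data.items()
    }


def permutation_test(x, y, n_resamples=5000, rng=None):
    """Hypothesis-testing step: two-sided permutation test on the difference in means."""
    rng = rng or random.Random(1)
    observed = abs(statistics.mean(x) - statistics.mean(y))
    pooled = list(x) + list(y)
    count = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:len(x)]) - statistics.mean(pooled[len(x):]))
        if diff >= observed:
            count += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (count + 1) / (n_resamples + 1)


# 1. Explore: look at the summaries and notice which banks stand out.
summary = summarize(explore)
for bank, stats in summary.items():
    print(bank, {name: round(value, 1) for name, value in stats.items()})

highest = max(summary, key=lambda b: summary[b]["mean"])
lowest = min(summary, key=lambda b: summary[b]["mean"])

# 2. Confirm: test the hypothesis suggested by EDA on the held-back years.
p_value = permutation_test(confirm[highest], confirm[lowest])
print(f"H0: {highest} and {lowest} have equal mean loans; estimated p = {p_value:.4f}")
```

In practice the summaries would usually be accompanied by plots such as histograms or box plots per bank; they are left out here to keep the sketch free of third-party libraries.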