Distinguish between data analysis, hypothesis testing, modelling and estimation. Give a simple example of each and discuss the appropriate contexts in which each should be used.
Data analysis is a process of inspecting, cleansing, transforming and modeling data with the goal of discovering useful information, informing conclusions and supporting decision-making.
Hypothesis testing is a statistical method for making decisions from experimental data. A hypothesis is an assumption that we make about a population parameter; the test checks whether the sample data are consistent with that assumption.
Estimation statistics is a data analysis framework that uses a combination of effect sizes, confidence intervals, precision planning, and meta-analysis to plan experiments, analyze data and interpret results. It is distinct from null hypothesis significance testing, which proponents of estimation statistics consider less informative.
Example of data analysis:- In customer behavior analytics, the process starts with collecting customer data from sources such as supermarket card data, smartphone app data, geo-localisation data … and it is possible to add other sources of data such as weather data.
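As a rough illustration of the inspect, cleanse, transform and summarise steps described above, the sketch below works on a few made-up loyalty-card records. The field names (`customer`, `spend`, `weather`) and the helper `mean_spend_by_weather` are hypothetical and chosen only for this example; a real project would likely use a dedicated data library.

```python
from collections import defaultdict
import statistics

# Hypothetical loyalty-card records joined with weather data
# (made-up values, for illustration only).
records = [
    {"customer": "A", "spend": "23.50", "weather": "rain"},
    {"customer": "B", "spend": "41.00", "weather": "sun"},
    {"customer": "C", "spend": "", "weather": "sun"},  # missing spend
    {"customer": "D", "spend": "18.25", "weather": "rain"},
    {"customer": "E", "spend": "36.50", "weather": "sun"},
]


def mean_spend_by_weather(rows):
    """Drop rows with missing spend, then average spend per weather type."""
    groups = defaultdict(list)
    for row in rows:
        if not row["spend"]:  # cleansing: skip incomplete records
            continue
        # transforming: convert the text field to a number
        groups[row["weather"]].append(float(row["spend"]))
    # summarising: one mean per weather category
    return {weather: statistics.mean(values) for weather, values in groups.items()}


summary = mean_spend_by_weather(records)
for weather, mean_spend in summary.items():
    print(f"{weather}: mean spend {mean_spend:.2f}")
```

A summary like this only describes the data collected; it does not by itself say whether a difference between groups would hold in the wider population.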
Example of hypothesis testing:- Someone performing experiments on plant growth might report this hypothesis: "If I give a plant an unlimited amount of sunlight, then the plant will grow to its largest possible size." Hypotheses cannot be proven correct from the data obtained in an experiment; instead, they are either supported or refuted by the data collected.
Example of estimation:- The key estimation question is:
"Based on my random sample, what is my estimate of the population parameter?"
There are two types of estimation:-
1.) Point estimates: A single number (for example, a measure of central tendency)
2.) Interval estimates: A range of numbers
Example: You take your car to your local car dealer's service department and you ask the service manager how much it will cost to repair your car.
If the manager says it will cost you $500, then she is providing a point estimate. If the manager says it will cost somewhere between $400 and $600, then she is providing an interval estimate.
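The same distinction can be sketched with a small sample. The code below assumes a set of made-up repair quotes and computes a point estimate (the sample mean) and an interval estimate (an approximate 95% confidence interval for the mean). The function name `estimate_mean` is hypothetical. The interval uses a normal approximation, which is an assumption; for a sample this small, a t-based interval would be somewhat wider.

```python
import math
import statistics

# Hypothetical repair quotes in dollars (made-up data, for illustration only).
quotes = [450, 520, 480, 610, 500, 470, 530, 440]


def estimate_mean(sample, confidence=0.95):
    """Return (point_estimate, (lower, upper)) for the population mean.

    Uses a normal approximation for the interval (an assumption).
    """
    n = len(sample)
    point = statistics.mean(sample)
    # Standard error of the mean from the sample standard deviation.
    std_err = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * std_err
    return point, (point - margin, point + margin)


point, (low, high) = estimate_mean(quotes)
print(f"Point estimate: ${point:.2f}")
print(f"Interval estimate: ${low:.2f} to ${high:.2f}")
```

The point estimate is easier to communicate, while the interval also conveys how uncertain the estimate is.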
Uses of data analysis:- Data analysis is used to evaluate data with statistical tools to discover useful information. A variety of methods are used including data mining, text analytics, business intelligence, combining data sets, and data visualization.
Uses of hypothesis testing:- Hypothesis testing is used to assess the plausibility of a hypothesis by using sample data. The test provides evidence concerning the plausibility of the hypothesis, given the data. Statistical analysts test a hypothesis by measuring and examining a random sample of the population being analyzed.
Uses of estimation:- Estimation, in statistics, is any of numerous procedures used to calculate the value of some property of a population from observations of a sample drawn from the population. A point estimate, for example, is the single number most likely to express the value of the property.
In real life, estimation is part of our everyday experience. When you're shopping in the grocery store and trying to stay within a budget, for example, you estimate the cost of the items you put in your cart to keep a running total in your head.