Part A: You have been hired by a company to build a predictive analytics model to increase their sales. Before building the model, your manager has asked you to start with exploratory data analysis and report the findings. Which visualization tools would you use to display important properties of the data, such as outliers, range, mean, interquartile range (IQR), and distribution? Be specific about the tools and methods that show skewness, similarity of distributions, and whether the data come from a population that is Normally distributed.
Part B: Identify a visualization created within the last year that is a good example of the important properties of data. Make sure to include:
Predictive analytics uses historical data to predict future events. Typically, historical data is used to build a mathematical model that captures important trends. That predictive model is then used on current data to predict what will happen next, or to suggest actions to take for optimal outcomes.
Predictive models use known results to develop (or train) a model that can be used to predict values for different or new data. The model's output is a prediction, often expressed as a probability, of the target variable (for example, revenue), based on the estimated significance of a set of input variables.
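As a rough illustration of the train-then-predict idea described above, here is a minimal sketch that fits a straight line to known results and applies it to a new input. It uses ordinary least squares with a single input variable and returns a point estimate rather than a probability, which is a simplification chosen for brevity. The function names (`fit_line`, `predict`) and the ad-spend and revenue figures are hypothetical and exist only for this example.

```python
from statistics import mean


def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares (one input variable)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict(model, x):
    """Apply a fitted (intercept, slope) pair to a new input value."""
    intercept, slope = model
    return intercept + slope * x


# Hypothetical historical data (known results): ad spend vs. revenue.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
revenue = [3.0, 5.0, 7.0, 9.0, 11.0]

# "Train" on the historical data, then predict for an unseen input.
model = fit_line(ad_spend, revenue)
forecast = predict(model, 6.0)
print(f"intercept={model[0]:.2f}, slope={model[1]:.2f}, forecast={forecast:.2f}")
```

A real sales model would likely use many input variables, hold out some historical data to check accuracy, and report uncertainty around each prediction, but the two steps (fit on known results, apply to new data) stay the same.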
Our eyes are drawn to colors and patterns. We can quickly distinguish red from blue and a square from a circle. Our culture is visual, including everything from art and advertisements to TV and movies.
Data visualization is another form of visual art that grabs our interest and keeps our eyes on the message. When we see a chart, we quickly see trends and outliers. If we can see something, we internalize it quickly. It’s storytelling with a purpose. If you’ve ever stared at a massive spreadsheet of data and couldn’t see a trend, you know how much more effective a visualization can be.