Find and provide three examples of data visualizations that mislead a viewer (at least one should come from a business context).
Include screenshots or pictures of those visualizations and explain why they mislead.
Visualizations need visual integrity to ensure that the data they present can be interpreted correctly. Visual integrity, also called graphical integrity, means ensuring that what's presented accurately represents what's in the data being visualized and that no design choices distort or obfuscate the inherent facts and analytical findings.
Good Visualization Requires:
Include info on data provenance. Not properly citing the sources of the data used in constructing a data visualization leaves out potentially important information about how the data was collected, which can affect the credibility of the message being conveyed.
Clearly define the data variables. A visualization designer may presume that the intended audience will understand the meanings of the data variables incorporated into it. In many cases, though, the visual presentation only makes sense if the accompanying text explains what it involves. All data variables should be unambiguously defined to prevent possible misinterpretation.
Keep visualizations free from visual "noise." A well-designed data visualization shouldn't include icons or other graphics that don't correctly reflect the data being presented. In other cases, designers add editorial commentary to the images that can influence how data is interpreted.
Don't filter out data so it can't be viewed. Sometimes the design of a visualization limits the ways data can be viewed, for example by restricting which data dimensions business executives can drill down into in an interactive dashboard. Visualizations shouldn't prevent viewers from freely looking at all the included data values independent of a predefined set of filtering criteria.
And make sure that it's consistent. Present the data in a way that it can be correctly interpreted by viewers. A common example of how to go wrong is overlaying multiple axes on a single chart and including lines pegged to the different axes, for instance having two different y-axes, one labeled on the left of the chart and one on the right. Doing so implies comparison; however, if there's no relation between the two axes, there should be no inferred relation between their lines. Other examples include inconsistent use of colors, shapes, textures, line types and line thicknesses, which feeds false assumptions about what those elements mean.
Be consistent on the scale, too. The size of the elements in a good data visualization should correspond to what's indicated by the data. In some cases, an enthusiastic designer might try to highlight a finding by scaling the results to make it look more prominent. Doing so is deceptive, especially as the ratio between the size of the graphical element and the actual data value gets bigger.
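One common way to put a number on this kind of scale distortion is Edward Tufte's "lie factor": the size of the effect shown in the graphic divided by the size of the effect in the data. The text above does not name it, so treat the following as a minimal sketch rather than the author's method; the function name and all numbers are hypothetical.

```python
def lie_factor(data_old, data_new, graphic_old, graphic_new):
    """Effect shown in the graphic divided by the effect present in the data.

    A value near 1 means the graphic is roughly faithful to the data;
    values well above 1 mean the graphic exaggerates the change.
    """
    data_effect = (data_new - data_old) / data_old
    graphic_effect = (graphic_new - graphic_old) / graphic_old
    return graphic_effect / data_effect


# Hypothetical example: sales grow from 100 to 120 (a 20% increase),
# but the bar is drawn 50 px tall and then 100 px tall (a 100% increase).
factor = lie_factor(100, 120, 50, 100)
print(factor)  # 5.0
```

Here the graphic shows a change five times larger than the one in the data, which is the kind of gap between element size and data value the point above warns about.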
Graphics that violate good visualization practices include:
Numbers Don’t Add Up. When you draw a pie, stacked-bar or stacked-area chart of shares, the parts should add up to 100 percent. This might sound too silly a mistake to point out, but it does happen. To avoid it, double-check your numbers and use standard charting tools, which generally compute each slice as a share of the total and so will not let you draw a pie that exceeds the whole.
Not Following Conventions. Just as the slices in a pie chart should add up to a hundred percent, a graph that moves up and to the right is expected to represent growth in the numbers. A chart that breaks such a convention, for example by inverting an axis, invites the viewer to read it the wrong way.
Cropped Axes. Axis values provide context to charts. If you crop or otherwise tamper with an axis, the visualization can paint a completely wrong picture.
Not Using Annotations. Annotating is worth doing every time you draw a chart. Sometimes a visual alone doesn’t suffice, and you need to add qualifying text or numbers to the chart to make it more meaningful.
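Improper Bubble Sizes. Bubble charts are very useful for displaying three-dimensional data in two dimensions. Not only do you have x- and y-axes, but you can also depict a third quantity by varying the size of the bubble. The mistake is to scale the bubble's radius or diameter to the value instead of its area, which makes larger values look far bigger than they are.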
Incomplete Data. Based on the above map alone, you are bound to think that ‘ABC’ has a higher market share. But the right answer is that the information is incomplete. Here’s why: we know that ABC leads in more states than XYZ does, but we do not know anything about the volume of sales of either product in each state.
Hard to Compare. A visualization is supposed to make the task of interpreting data easier, not harder. But in this case, it’s very difficult for a reader to compare the values.
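For the bubble-size problem in the list above, a usual remedy is to make the bubble's area, not its radius, proportional to the value, since viewers tend to compare areas. The sketch below shows that scaling; `bubble_radius`, `max_radius` and the sample values are hypothetical names and numbers chosen for illustration.

```python
import math


def bubble_radius(value, max_value, max_radius=40.0):
    """Radius that makes the bubble's AREA proportional to value.

    Area grows with the square of the radius, so the radius must grow
    with the square root of the value.
    """
    return max_radius * math.sqrt(value / max_value)


# Hypothetical values: the second is four times the first.
values = [25, 100]
radii = [bubble_radius(v, max(values)) for v in values]
areas = [math.pi * r ** 2 for r in radii]

print(radii)                # [20.0, 40.0]
print(areas[1] / areas[0])  # about 4, matching the 4x ratio in the data
```

Had the radius been scaled linearly instead (10 and 40 for these values), the larger bubble would cover sixteen times the area for a value only four times as large.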
(A) Distortion
Showing too little or too much data, or emphasizing selected data, could simply be an error that results from choosing the wrong format for the data visualization or from not fully understanding the data. These errors can be unintentional, though some presentations distort the data in ways that appear to be intentional or agenda-driven. Examples include using different scales when graphing different variables, which can exaggerate or de-emphasize differences, and starting the Y axis at a non-zero point, which exaggerates differences in values.
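The effect of a non-zero Y-axis start can be checked with simple arithmetic: a bar is drawn from the axis baseline, so its visible height is the value minus the baseline. The sketch below compares how two bars look relative to each other under different baselines; the function name and the values are hypothetical.

```python
def drawn_height_ratio(a, b, axis_start=0.0):
    """Ratio of the drawn heights of bars b and a when the y-axis starts at axis_start."""
    if min(a, b) <= axis_start:
        raise ValueError("axis_start must be below both values")
    return (b - axis_start) / (a - axis_start)


# Hypothetical values: 52 is only 4% larger than 50.
honest = drawn_height_ratio(50, 52)        # axis starts at 0
cropped = drawn_height_ratio(50, 52, 48)   # axis starts at 48

print(honest)   # 1.04
print(cropped)  # 2.0
```

With the axis cropped at 48, the second bar is drawn twice as tall as the first even though the underlying values differ by only 4 percent.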
In visualization: presenting too much data
Sometimes, showing the big picture can make it hard to identify salient data or stories. The sheer number of lines makes it hard to focus on any one data point or trend. If the designer wanted to obscure some bad news, burying it in a massive amount of information could accomplish that, but it also makes the visualization essentially worthless.
The number of lines in this data visualization makes it hard to isolate any one-factor trend.
In other cases, the trends that appear when an entire data set is visualized are the opposite of trends that appear when subsets of that data are studied separately.
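A trend that reverses between the whole data set and its subsets is commonly known as Simpson's paradox (the text above does not name it). The following is a small sketch with made-up numbers: each group rises on its own, yet a single line fitted through the pooled data falls. The `slope` helper is an ordinary least-squares slope written out by hand, and both groups are hypothetical.

```python
def slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical data: two subsets, each with an upward trend.
group_a = ([1, 2, 3], [10, 11, 12])
group_b = ([6, 7, 8], [2, 3, 4])

slope_a = slope(*group_a)
slope_b = slope(*group_b)
# Pool both groups and fit one line through everything.
slope_all = slope(group_a[0] + group_b[0], group_a[1] + group_b[1])

print(slope_a, slope_b)  # 1.0 1.0
print(slope_all < 0)     # True: the pooled trend points downward
```

A chart that plots only the pooled line would suggest a decline that neither group actually shows, which is why the subsets deserve their own views.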
When viewers need both a big-picture and a detailed visualization, the designer should consider creating a series of data visualizations. News media often do this with large data stories, showing, for example, a national map with broad representations of data by region or state, followed by a series of more narrowly focused visualizations that highlight important trends, outliers, or other information.
(B) Misleading interpretation: the problem and the solution
The graph below is the one most often referenced to dispute global warming. It shows the change in air temperature (in degrees Celsius) from 1998 to 2012.
It is worth mentioning that 1998 was one of the hottest years on record due to an abnormally strong El Niño event. It is also worth noting that, because there is a large degree of variability within the climate system, temperature trends are typically assessed over periods of at least 30 years. The chart below shows the 30-year change in global mean temperatures.
And now have a look at the trend from 1900 to 2012:
While the short window may appear to reflect a plateau, the long-term data clearly paints a picture of gradual warming. Therefore, using the first graph, and only the first graph, to dispute global warming is a perfect example of misleading statistics.
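How much a hand-picked start year can flatten a trend is easy to demonstrate with a toy series. The sketch below uses purely synthetic values, not measured temperatures: a steady rise of 0.02 units per year over 30 years, plus one artificial spike in 1998 to stand in for an unusually hot year. All names and numbers are assumptions made for illustration.

```python
def slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# SYNTHETIC series (not real measurements): steady rise of 0.02 per year.
years = list(range(1983, 2013))
values = [0.02 * (year - 1983) for year in years]

# Add one artificial spike to imitate an abnormally hot year.
spike = years.index(1998)
values[spike] += 0.3

full_slope = slope(years, values)                    # all 30 years
short_slope = slope(years[spike:], values[spike:])   # window starting at the spike

print(round(full_slope, 4), round(short_slope, 4))
print(short_slope < full_slope)  # True
```

Starting the window at the spike pulls the fitted line down to a much shallower slope, even though the underlying rise in this synthetic series never changes.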