What two major purposes does a boxplot serve regarding data?
1. Boxplots are particularly useful for quickly assessing the location, dispersion, and symmetry or skewness of a set of data, and for comparing these features across two or more data sets.
2. A box and whisker plot is a way of summarizing a set of data measured on an interval scale. It is often used in exploratory data analysis. This type of graph shows the shape of the distribution, its central value, and its variability.
Handles Large Data Easily
Because it is built on the five-number summary, a box plot can
present a summary of a large amount of data. The five numbers are
the median, which is the middle value of the ordered data; the
lower and upper quartiles, which are the values below which one
quarter and three quarters of the data fall; and the minimum and
maximum data values. Organizing data in a box plot around these
five values is an efficient way of dealing with data sets too
large for other graphs, such as line plots or stem and leaf
plots.
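As a rough illustration of the five-number summary described above, here is a minimal Python sketch. The function name `five_number_summary` and the sample `scores` are made up for this example, and the quartiles use the "inclusive" convention from Python's `statistics` module; other quartile conventions can give slightly different values for the lower and upper quartiles.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for at least two numbers."""
    ordered = sorted(values)
    # With n=4, statistics.quantiles returns the three quartile cut points.
    # method="inclusive" is one of several quartile conventions (an
    # assumption made for this sketch, not the only valid choice).
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]

scores = [7, 15, 36, 39, 40, 41, 42, 43, 47, 49]  # made-up sample data
print(five_number_summary(scores))  # (7, 36.75, 40.5, 42.75, 49)
```

However many values go in, only five numbers come out, which is why the plot stays readable for large data sets.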
Exact Values Not Retained
The box plot does not keep the exact values and details of the
distribution, which is the price of condensing such large amounts
of data into this graph type. A box plot shows only a simple
summary of the distribution so that you can quickly view it and
compare it with other data. Use a box plot in combination with
another statistical graph, like a histogram, for a more thorough,
more detailed analysis of the data.
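To see why a second graph helps, the sketch below builds two made-up data sets that share the same five-number summary, so their box plots would look identical, and then prints a crude text histogram of each. The helper `five_number_summary`, the variable names, and the data are all hypothetical, and the quartiles again assume the "inclusive" convention of Python's `statistics` module.

```python
import statistics
from collections import Counter

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for at least two numbers."""
    ordered = sorted(values)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]

evenly_spread = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # made-up data
two_clusters = [1, 3, 3, 3, 5, 7, 7, 7, 9]   # made-up data

# Both data sets reduce to the same five numbers.
print(five_number_summary(evenly_spread))
print(five_number_summary(two_clusters))

# A simple text histogram shows the difference the box plot hides.
for name, data in [("evenly_spread", evenly_spread), ("two_clusters", two_clusters)]:
    counts = Counter(data)
    print(name)
    for value in range(min(data), max(data) + 1):
        print(f"{value:2d} | {'#' * counts[value]}")
```

The second data set piles up at 3 and 7, a feature visible in the histogram but not in the summary.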
A Clear Summary
A box plot is a visually effective way of viewing a clear summary
of one or more sets of data. It is particularly useful for quickly
summarizing and comparing sets of results from different
experiments. At a glance, a box plot shows the distribution of
results and indicates whether the data are symmetric or skewed.
Displays Outliers
A box plot is one of the few common statistical graphs that mark
outliers explicitly. A set of data might contain one outlier or
several, and they can fall below the lower quartile, above the
upper quartile, or both. The whiskers extend from the box to the
smallest and largest data values that lie within 1.5 times the
interquartile range of the lower and upper quartiles. Any values
beyond the whiskers are treated as outliers and plotted as
individual points, so they are easy to spot on a box plot.
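The 1.5 times interquartile range rule can be sketched in a few lines of Python. The function name `iqr_outliers`, the parameter `k`, and the sample data are assumptions made for this example, and the quartiles use the "inclusive" convention of Python's `statistics` module, so plotting software that uses a different convention may flag slightly different points.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Split data into whisker ends and outliers using the k * IQR rule."""
    ordered = sorted(values)
    q1, _, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    # Whiskers stop at the most extreme data values inside the fences.
    inside = [v for v in ordered if lower_fence <= v <= upper_fence]
    # Everything beyond the fences is reported as an outlier.
    outliers = [v for v in ordered if v < lower_fence or v > upper_fence]
    return {"whiskers": (inside[0], inside[-1]), "outliers": outliers}

scores = [7, 15, 36, 39, 40, 41, 42, 43, 47, 49]  # made-up sample data
print(iqr_outliers(scores))
# {'whiskers': (36, 49), 'outliers': [7, 15]}
```

Here the quartiles are 36.75 and 42.75, so the interquartile range is 6 and the fences sit at 27.75 and 51.75. The values 7 and 15 fall below the lower fence and would be drawn as individual points.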