04.TR.A:
I have several times heard HCC students who take classes at Laurel College Center
advise other students, “Buy your gas in Laurel. It’s cheaper.”
As of 8 July 2019, prices in Columbia and Laurel for a gallon of gasoline (regular
unleaded) are listed below. Source: http://www.marylandgasprices.com/
Columbia: 2.51 2.65 2.69 2.69 2.71 2.84 2.85 2.89 2.95 2.99 2.99 2.99
Laurel: 2.55 2.62 2.65 2.67 2.69 2.69 2.72 2.74 2.74 2.75 2.75 2.99
(a) Find the five-number summary separately for both locations.
(b) Perform outlier tests on both locations.
(c) Construct boxplots on the same set of axes and label them with the five-number summary.
(d) Comment on the differences in the center and spread between the two boxplots.
(e) Find the mean and standard deviation for both locations.
(f) For each location, which is the better measure of center and spread?
(g) If you commute from HCC to LCC for classes, where will you buy your gas?
Why did you make this choice?
04.TR.B: Exercise
For the data in Exercise A,
(a) Construct histograms side by side for the two locations.
(b) Construct stem-and-leaf plots for the two locations using the same stem and
putting the leaves on opposite sides of the stems.
(c) Discuss relative advantages and disadvantages of a histogram, a stem-and-leaf
plot and a boxplot for comparison of the two locations.
(a)
Variable   Minimum  Q1      Median  Q3      Maximum
Columbia   2.5100   2.6900  2.8450  2.9800  2.9900
Laurel     2.5500   2.6550  2.7050  2.7475  2.9900
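
The quartiles in this table appear consistent with the (n + 1)p position rule; other quartile conventions can give slightly different values for Q1 and Q3. As a minimal sketch, assuming that rule is the intended one, Python's standard `statistics.quantiles` with `method="exclusive"` can be used to check the summary. The helper name `five_number_summary` is hypothetical and not part of the exercise.

```python
from statistics import quantiles

# Prices per gallon from the exercise (8 July 2019).
columbia = [2.51, 2.65, 2.69, 2.69, 2.71, 2.84, 2.85, 2.89, 2.95, 2.99, 2.99, 2.99]
laurel = [2.55, 2.62, 2.65, 2.67, 2.69, 2.69, 2.72, 2.74, 2.74, 2.75, 2.75, 2.99]


def five_number_summary(prices):
    """Return (min, Q1, median, Q3, max).

    Assumption: quartiles are located at position (n + 1) * p with linear
    interpolation, which is what method="exclusive" does.
    """
    q1, median, q3 = quantiles(prices, n=4, method="exclusive")
    return min(prices), q1, median, q3, max(prices)


for name, prices in [("Columbia", columbia), ("Laurel", laurel)]:
    # Round to four decimals to match the table's precision.
    print(name, [round(value, 4) for value in five_number_summary(prices)])
```

Rounded to four decimals, the values agree with the table for both locations.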
(b) Columbia:
Interquartile range (IQR) = Q3 - Q1 = 2.9800 - 2.6900 = 0.29
Lower fence = Q1 - 1.5*IQR = 2.6900 - 1.5*0.29 = 2.255
Upper fence = Q3 + 1.5*IQR = 2.9800 + 1.5*0.29 = 3.415
Any value outside (2.255, 3.415) is an outlier. All Columbia prices lie between 2.51 and 2.99, so there are no outliers.
Laurel:
IQR = Q3 - Q1 = 2.7475 - 2.6550 = 0.0925
Lower fence = 2.6550 - 1.5*0.0925 = 2.51625
Upper fence = 2.7475 + 1.5*0.0925 = 2.88625
Since 2.99 > 2.88625, the price 2.99 is an outlier for this data set. The minimum, 2.55, lies above the lower fence, so there are no low outliers.
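
The 1.5*IQR rule applied in (b) can be sketched in a few lines of Python. This is only an illustration of the arithmetic above: the function names `outlier_limits` and `find_outliers` are hypothetical, and the quartiles are copied from the table in (a) rather than recomputed.

```python
# Prices per gallon from the exercise (8 July 2019).
columbia = [2.51, 2.65, 2.69, 2.69, 2.71, 2.84, 2.85, 2.89, 2.95, 2.99, 2.99, 2.99]
laurel = [2.55, 2.62, 2.65, 2.67, 2.69, 2.69, 2.72, 2.74, 2.74, 2.75, 2.75, 2.99]


def outlier_limits(q1, q3):
    """Return the limits (Q1 - 1.5*IQR, Q3 + 1.5*IQR)."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(prices, q1, q3):
    """Return every price that falls outside the 1.5*IQR limits."""
    lower, upper = outlier_limits(q1, q3)
    return [price for price in prices if price < lower or price > upper]


# Quartiles taken from the five-number summary in (a).
print("Columbia outliers:", find_outliers(columbia, 2.6900, 2.9800))
print("Laurel outliers:", find_outliers(laurel, 2.6550, 2.7475))
```

Columbia yields an empty list and Laurel yields only 2.99, matching the conclusions above.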
(c) [Figure: boxplots for Columbia and Laurel on the same set of axes; not reproduced here.]
(d)
From the boxplots above, the median for Columbia is greater than the median for Laurel (compare the positions of the central lines of the two boxes: 2.845 versus 2.705). The box for Columbia is also longer than the box for Laurel (IQR 0.29 versus 0.0925), so the Columbia prices are more spread out than the Laurel prices.
(e)
Variable   Mean    StDev
Columbia   2.8125  0.1596
Laurel     2.7133  0.1058
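
As a quick check of these figures, the sketch below uses Python's standard `statistics` module. It assumes the table reports the sample standard deviation (divisor n - 1), which is what `statistics.stdev` computes.

```python
from statistics import mean, stdev

# Prices per gallon from the exercise (8 July 2019).
columbia = [2.51, 2.65, 2.69, 2.69, 2.71, 2.84, 2.85, 2.89, 2.95, 2.99, 2.99, 2.99]
laurel = [2.55, 2.62, 2.65, 2.67, 2.69, 2.69, 2.72, 2.74, 2.74, 2.75, 2.75, 2.99]

for name, prices in [("Columbia", columbia), ("Laurel", laurel)]:
    # stdev() is the sample standard deviation (divides by n - 1).
    print(name, round(mean(prices), 4), round(stdev(prices), 4))
```

Rounded to four decimals, both rows of the table are reproduced under that assumption.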