How much time do Americans living in or near cities spend waiting in traffic, and how much does waiting in traffic cost them per year? The data set given includes this cost for 31 cities. For the time Americans living in or near cities spend waiting in traffic and the cost of waiting in traffic per year:
a. Compute the mean, median, first quartile, and third quartile.
b. Compute the range, interquartile range, variance, standard deviation, and coefficient of variation.
c. Construct a boxplot. Are the data skewed? If so, how?
d. Compute the correlation coefficient between the time spent sitting in traffic and the cost of sitting in traffic.
e. Based on the results of (a) through (c), what conclusions might you reach concerning the time spent waiting in traffic and the cost of waiting in traffic?
City | Annual Time Sitting in Traffic (hours) | Cost of Sitting in Traffic ($) |
Boston | 47 | 980 |
New York | 54 | 1126 |
Philadelphia | 42 | 864 |
Washington | 74 | 495 |
Miami | 38 | 785 |
Detroit | 33 | 687 |
Cleveland | 20 | 383 |
Minneapolis | 45 | 916 |
Milwaukee | 27 | 541 |
Chicago | 71 | 1568 |
St. Louis | 30 | 642 |
Nashville | 35 | 722 |
Memphis | 23 | 477 |
Atlanta | 43 | 824 |
New Orleans | 35 | 746 |
Omaha | 21 | 389 |
Wichita | 20 | 379 |
Dallas | 45 | 924 |
Houston | 57 | 1171 |
Denver | 49 | 993 |
Albuquerque | 25 | 525 |
Phoenix | 35 | 821 |
Salt Lake City | 27 | 512 |
Las Vegas | 28 | 512 |
Boise | 19 | 345 |
Seattle | 44 | 942 |
Portland | 37 | 744 |
San Francisco | 50 | 1019 |
San Jose | 37 | 721 |
Los Angeles | 64 | 1334 |
San Diego | 38 | 794 |
a.
(i) Annual Time Sitting in Traffic (hours)
Mean is given as,
Mean = Σx / n
= 1213/31
= 39.13
Median is given as,
Median = 37
Note: The median is the middle value of the sorted data. Since there are 31 data values, the ((31+1)/2) = 16th value is the median.
First Quartile: Q1 = ((n+1)*0.25)th value = 8th term = 27
Third Quartile: Q3 = ((n+1)*0.75)th value = 24th term = 47
(ii) Cost of Sitting in Traffic ($)
Mean is given as,
Mean = Σx / n
= 23881/31
= 770.35
Median is given as,
Median = 746
Note: The median is the middle value of the sorted data. Since there are 31 data values, the ((31+1)/2) = 16th value is the median.
First Quartile: Q1 = ((n+1)*0.25)th value = 8th term = 512
Third Quartile: Q3 = ((n+1)*0.75)th value = 24th term = 942
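The values in part (a) can be checked with a short script. This is a minimal sketch, assuming Python 3.8 or later; the list names `hours` and `cost` and the helper `center_and_quartiles` are made up for illustration. It relies on the default "exclusive" method of `statistics.quantiles`, which places the p-th quantile at position (n+1)*p in the sorted data, the same rule used above.

```python
import statistics

# Data copied from the table above, in table order (Boston ... San Diego).
hours = [47, 54, 42, 74, 38, 33, 20, 45, 27, 71, 30, 35, 23, 43, 35, 21,
         20, 45, 57, 49, 25, 35, 27, 28, 19, 44, 37, 50, 37, 64, 38]
cost = [980, 1126, 864, 495, 785, 687, 383, 916, 541, 1568, 642, 722, 477,
        824, 746, 389, 379, 924, 1171, 993, 525, 821, 512, 512, 345, 942,
        744, 1019, 721, 1334, 794]


def center_and_quartiles(values):
    """Mean (2 decimals), median, Q1 and Q3 using the (n+1)*p position rule."""
    # The default method="exclusive" uses position (n+1)*p; with n = 31 the
    # positions are the 8th, 16th and 24th sorted values, so no interpolation
    # is needed here.
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": round(statistics.mean(values), 2),
        "median": statistics.median(values),
        "q1": q1,
        "q3": q3,
    }


print(center_and_quartiles(hours))  # {'mean': 39.13, 'median': 37, 'q1': 27.0, 'q3': 47.0}
print(center_and_quartiles(cost))   # {'mean': 770.35, 'median': 746, 'q1': 512.0, 'q3': 942.0}
```

Other software may use a different quartile convention (for example the "inclusive" method), which can give slightly different Q1 and Q3 for other sample sizes.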
----------------------------------------------------------------------------------------------------------------------------------------------------------
b.
(i) Annual Time Sitting in Traffic (hours)
Range = Maximum value - Minimum Value
= 74 - 19
= 55
Interquartile range
IQR = Q3 - Q1
= 47 - 27
= 20
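Variance is given as,
s² = Σ(x - Mean)² / (n - 1)
where,
n = 31 and Σ(x - Mean)² = 6311.48
therefore,
s² = 6311.48/30 = 210.38
Standard deviation is given as,
s = √s² = √210.38
= 14.50
Coefficient of variation is given as,
CV = s / Mean = 14.5046/39.129
= 0.3707 (37.07%)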
(ii) Cost of Sitting in Traffic ($)
Range = Maximum value - Minimum Value
= 1568 - 345
= 1223
Interquartile range
IQR = Q3 - Q1
= 942 - 512
= 430
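Variance is given as,
s² = Σ(y - Mean)² / (n - 1)
where,
n = 31 and Σ(y - Mean)² = 2552767.10
therefore,
s² = 2552767.10/30 = 85092.24
Standard deviation is given as,
s = √s² = √85092.24
= 291.71
Coefficient of variation is given as,
CV = s / Mean = 291.706/770.355
= 0.3787 (37.87%)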
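The measures of spread in part (b) can be reproduced the same way. The sketch below is illustrative only: the helper name `spread` is hypothetical, and it assumes the sample formulas (division by n - 1) and the (n+1)*p quartile rule used in this answer.

```python
import statistics

# Data copied from the table above, in table order (Boston ... San Diego).
hours = [47, 54, 42, 74, 38, 33, 20, 45, 27, 71, 30, 35, 23, 43, 35, 21,
         20, 45, 57, 49, 25, 35, 27, 28, 19, 44, 37, 50, 37, 64, 38]
cost = [980, 1126, 864, 495, 785, 687, 383, 916, 541, 1568, 642, 722, 477,
        824, 746, 389, 379, 924, 1171, 993, 525, 821, 512, 512, 345, 942,
        744, 1019, 721, 1334, 794]


def spread(values):
    """Range, IQR, sample variance, sample standard deviation and CV."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # (n+1)*p position rule
    mean = statistics.mean(values)
    variance = statistics.variance(values)  # sample variance, divides by n - 1
    sd = statistics.stdev(values)           # square root of the sample variance
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "variance": variance,
        "sd": sd,
        "cv": sd / mean,  # multiply by 100 to express as a percentage
    }


for name, data in (("hours", hours), ("cost", cost)):
    result = spread(data)
    print(name, {key: round(value, 4) for key, value in result.items()})
```

Because the coefficient of variation is unit-free, it allows the spread of hours and of dollars to be compared directly.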
----------------------------------------------------------------------------------------------------------------------------------------------------------
c.
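(i) Annual Time Sitting in Traffic (hours): boxplot with minimum 19, Q1 27, median 37, Q3 47, maximum 74
(ii) Cost of Sitting in Traffic ($): boxplot with minimum 345, Q1 512, median 746, Q3 942, maximum 1568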
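Yes, both data sets are right-skewed. In each boxplot the upper whisker is much longer than the lower whisker: for time, Maximum - Q3 = 74 - 47 = 27 hours versus Q1 - Minimum = 27 - 19 = 8 hours; for cost, 1568 - 942 = 626 versus 512 - 345 = 167. The mean also exceeds the median in both cases (39.13 vs. 37 hours; $770.35 vs. $746), so the values on the right vary over a much larger range than the values on the left. No value lies beyond Q3 + 1.5 IQR (77 hours; $1587), so neither boxplot shows outliers.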
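The numbers behind a boxplot can be computed without drawing it. The following is a rough sketch rather than a plotting routine: the helper name `boxplot_summary` is hypothetical, quartiles follow the (n+1)*p rule, and the 1.5 × IQR fences are the usual convention for flagging outliers.

```python
import statistics

# Data copied from the table above, in table order (Boston ... San Diego).
hours = [47, 54, 42, 74, 38, 33, 20, 45, 27, 71, 30, 35, 23, 43, 35, 21,
         20, 45, 57, 49, 25, 35, 27, 28, 19, 44, 37, 50, 37, 64, 38]
cost = [980, 1126, 864, 495, 785, 687, 383, 916, 541, 1568, 642, 722, 477,
        824, 746, 389, 379, 924, 1171, 993, 525, 821, 512, 512, 345, 942,
        744, 1019, 721, 1334, 794]


def boxplot_summary(values):
    """Five-number summary, tail lengths and 1.5*IQR outliers."""
    q1, median, q3 = statistics.quantiles(values, n=4)  # (n+1)*p position rule
    low, high = min(values), max(values)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return {
        "five_numbers": (low, q1, median, q3, high),
        "lower_tail": q1 - low,    # distance from the minimum to Q1
        "upper_tail": high - q3,   # distance from Q3 to the maximum
        "outliers": [v for v in values if v < lower_fence or v > upper_fence],
    }


for name, data in (("hours", hours), ("cost", cost)):
    summary = boxplot_summary(data)
    # A longer upper tail than lower tail points to right skew.
    direction = "right" if summary["upper_tail"] > summary["lower_tail"] else "left or none"
    print(name, summary, "skew:", direction)
```

Comparing tail lengths is only a quick heuristic for skewness; it complements, but does not replace, looking at the plot itself.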
----------------------------------------------------------------------------------------------------------------------------------------------------------
d.
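The correlation coefficient is,
r = Σ(x - Mx)(y - My) / √[Σ(x - Mx)² Σ(y - My)²]
where x is the annual time sitting in traffic, y is the cost of sitting in traffic,
Mx = 39.129
My = 770.35
The calculations are given as,
Σ(x - Mx)(y - My) = 101159.58
Σ(x - Mx)² = 6311.48
Σ(y - My)² = 2552767.10
Therefore, the coefficient of correlation, r = 101159.58 / √(6311.48 × 2552767.10) = 0.797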
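The correlation can be recomputed directly from the table using the deviations from Mx and My. This is a minimal sketch with a hypothetical helper `pearson_r`; it uses the data exactly as transcribed in the table above, so any transcription error in the table carries through to the result.

```python
import math

# Data copied from the table above, in table order (Boston ... San Diego).
hours = [47, 54, 42, 74, 38, 33, 20, 45, 27, 71, 30, 35, 23, 43, 35, 21,
         20, 45, 57, 49, 25, 35, 27, 28, 19, 44, 37, 50, 37, 64, 38]
cost = [980, 1126, 864, 495, 785, 687, 383, 916, 541, 1568, 642, 722, 477,
        824, 746, 389, 379, 924, 1171, 993, 525, 821, 512, 512, 345, 942,
        744, 1019, 721, 1334, 794]


def pearson_r(xs, ys):
    """Pearson correlation coefficient from sums of deviations."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))  # co-deviation sum
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


print(round(pearson_r(hours, cost), 3))  # about 0.797 for the table as given
```

A single city that combines a very high time with a low cost (Washington in this table) is enough to pull r noticeably below 1, so it is worth checking such rows against the original data source.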
----------------------------------------------------------------------------------------------------------------------------------------------------------
e.
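Based on the results of (a) through (c), we can conclude that both the annual time sitting in traffic and the cost of sitting in traffic are right-skewed: a few cities have values far above the rest, which pulls each mean above its median. The two variables also have nearly the same relative spread, with coefficients of variation of about 37% and 38%. In addition, there is a positive linear correlation between Annual Time Sitting in Traffic and Cost of Sitting in Traffic ($): spending more time in traffic costs more than spending less time in traffic.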