The average ages of the players on each of the 30 Major League Baseball teams in the 2017 season were as follows:
26.6 27.9 27.9 29.9 29.3 28.1 28.4 28.9 27.7 28.7
30.5 29.8 28.5 27.9 30.9 29.3 28.8 28.6 29.1 31.0
30.7 30.3 29.7 31.0 29.4 29.8 29.4 32.7 34.0 31.8
(a) Construct a stem-and-leaf diagram for these data, by using the first two digits of each number as the stem and the third digit as the leaf. Comment on its features.
(b) Use the stem-and-leaf plot to find the median and the mode of this dataset. Which of these two measures would you prefer as a measure for the center of this dataset? Explain.
(c) Construct a dotplot for this dataset.
(d) Give the five-number summary for this dataset.
(e) Find the range and give a rough estimate for the standard deviation of the dataset.
(a) Construct a stem-and-leaf diagram for these data, by using the first two digits of each number as the stem and the third digit as the leaf. Comment on its features.
Stem-and-leaf diagram (stem = first two digits, leaf = tenths digit; for example, 26 | 6 represents 26.6):
Stem | Leaves |
26 | 6 |
27 | 7 9 9 9 |
28 | 1 4 5 6 7 8 9 |
29 | 1 3 3 4 4 7 8 8 9 |
30 | 3 5 7 9 |
31 | 0 0 8 |
32 | 7 |
33 | |
34 | 0 |
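Comment: The distribution is roughly bell-shaped with a single peak at stem 29, and most of the values fall in stems 28 and 29, close to the median. The upper tail is somewhat longer than the lower one: stem 33 is empty and there is a single high value at 34.0.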
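The construction in part (a) can be checked mechanically. The following is a minimal Python sketch, not part of the original solution; the names `ages` and `stem_and_leaf` are chosen here for illustration. It works in integer tenths so that floating-point rounding does not misplace a leaf.

```python
from collections import defaultdict

# The 30 team average ages from the question.
ages = [
    26.6, 27.9, 27.9, 29.9, 29.3, 28.1, 28.4, 28.9, 27.7, 28.7,
    30.5, 29.8, 28.5, 27.9, 30.9, 29.3, 28.8, 28.6, 29.1, 31.0,
    30.7, 30.3, 29.7, 31.0, 29.4, 29.8, 29.4, 32.7, 34.0, 31.8,
]

def stem_and_leaf(values):
    """Map each stem (first two digits) to its sorted leaves (tenths digit)."""
    rows = defaultdict(list)
    for v in sorted(values):
        # Convert to integer tenths, then split into stem and leaf.
        stem, leaf = divmod(round(v * 10), 10)
        rows[stem].append(leaf)
    # Keep empty stems (such as 33) so that gaps stay visible.
    return {s: rows.get(s, []) for s in range(min(rows), max(rows) + 1)}

plot = stem_and_leaf(ages)
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(map(str, leaves))}")
```

Listing the empty stem 33 explicitly is a deliberate choice: dropping it would hide the gap between 32.7 and 34.0.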
(b) Use the stem-and-leaf plot to find the median and the mode of this dataset. Which of these two measures would you prefer as a measure for the center of this dataset? Explain.
Mode: The mode of a set of data values is the value that appears most often.
The leaf 9 appears three times at stem 27, and no other value appears more than twice.
Therefore, Mode = 27.9
Median: The median is the middle value of the data sorted in ascending order.
Here, n = 30 (even), so the median is the average of the 15th and 16th sorted values.
Median = (15th value + 16th value)/2 = (29.3 + 29.4)/2 = 29.35
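The median is preferred because:
1. The stem-and-leaf plot is roughly bell-shaped, with most of the values clustered around the median of 29.35, and the median is not pulled upward by the unusually high value 34.0.
2. The mode, 27.9, rests on only three repeated values and lies in the lower part of the distribution rather than at its center. For continuous data such as average ages, repeated values are largely a result of rounding, so the mode is an unreliable measure of center here.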
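The median and mode in part (b) can be confirmed with Python's standard `statistics` module. This is a short sketch for checking the arithmetic, not part of the original solution; the variable names are illustrative.

```python
import statistics
from collections import Counter

ages = [
    26.6, 27.9, 27.9, 29.9, 29.3, 28.1, 28.4, 28.9, 27.7, 28.7,
    30.5, 29.8, 28.5, 27.9, 30.9, 29.3, 28.8, 28.6, 29.1, 31.0,
    30.7, 30.3, 29.7, 31.0, 29.4, 29.8, 29.4, 32.7, 34.0, 31.8,
]

# For an even number of values, median() averages the two middle values
# (here the 15th and 16th values of the sorted data).
median = statistics.median(ages)

# mode() returns the most frequent value.
mode = statistics.mode(ages)

# Frequency table, to see how strongly the mode is supported.
counts = Counter(ages)

print(f"median = {median:.2f}")
print(f"mode = {mode} (appears {counts[mode]} times)")
```

Inspecting `counts` shows that the mode is supported by only three observations, which is part of the reason for preferring the median here.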
(c) Construct a dot plot for this dataset. (By using Excel)
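A dot plot can also be sketched in plain text. The following is a minimal Python sketch, not the Excel chart referred to above: it prints one row per tenth of a year between the minimum and maximum, with one dot per team at that value. The names `counts` and `lines` are illustrative.

```python
from collections import Counter

ages = [
    26.6, 27.9, 27.9, 29.9, 29.3, 28.1, 28.4, 28.9, 27.7, 28.7,
    30.5, 29.8, 28.5, 27.9, 30.9, 29.3, 28.8, 28.6, 29.1, 31.0,
    30.7, 30.3, 29.7, 31.0, 29.4, 29.8, 29.4, 32.7, 34.0, 31.8,
]

# Count occurrences in integer tenths to avoid floating-point keys.
counts = Counter(round(v * 10) for v in ages)

# One row per axis position, including positions with no observations,
# so that gaps in the data remain visible.
lines = []
for tenths in range(min(counts), max(counts) + 1):
    row = f"{tenths / 10:4.1f} | {'.' * counts[tenths]}"
    lines.append(row.rstrip())

print("\n".join(lines))
```

The axis runs down the page rather than across it, which keeps the sketch simple; a horizontal layout would carry the same information.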
(d) Give the five-number summary for this dataset. (By using Excel)
Min = 26.6
Q1 = 28.5
Median = 29.35
Q3 = 30.45
Max = 34.0
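Quartile values depend on the interpolation convention used. As a hedged cross-check, the sketch below uses `statistics.quantiles` with `method="inclusive"`, which is assumed here to correspond to the rule behind Excel's QUARTILE.INC; the original solution does not say which Excel function was used. Under this rule Q1 comes out as 28.525, which rounds to the 28.5 listed above, and Q3 as 30.45.

```python
import statistics

ages = [
    26.6, 27.9, 27.9, 29.9, 29.3, 28.1, 28.4, 28.9, 27.7, 28.7,
    30.5, 29.8, 28.5, 27.9, 30.9, 29.3, 28.8, 28.6, 29.1, 31.0,
    30.7, 30.3, 29.7, 31.0, 29.4, 29.8, 29.4, 32.7, 34.0, 31.8,
]

# n=4 requests the three quartile cut points.
# method="inclusive" treats the data as including its own min and max
# and interpolates linearly between neighbouring sorted values.
q1, med, q3 = statistics.quantiles(ages, n=4, method="inclusive")

summary = (min(ages), q1, med, q3, max(ages))

for label, value in zip(("Min", "Q1", "Median", "Q3", "Max"), summary):
    print(f"{label} = {value:.3f}")
```

Other conventions, such as taking the medians of the lower and upper halves, would give Q1 = 28.5 and Q3 = 30.5, so small differences between textbooks and software are expected.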
(e) Find the range and give a rough estimate for the standard deviation of the dataset.
Range = Max - Min = 34 - 26.6 = 7.4
Standard deviation:
Table for calculation (Mean = 886.6/30 ≈ 29.55):
X | (X-Mean)^2 |
26.6 | 8.72 |
27.7 | 3.43 |
27.9 | 2.73 |
27.9 | 2.73 |
27.9 | 2.73 |
28.1 | 2.11 |
28.4 | 1.33 |
28.5 | 1.11 |
28.6 | 0.91 |
28.7 | 0.73 |
28.8 | 0.57 |
28.9 | 0.43 |
29.1 | 0.21 |
29.3 | 0.06 |
29.3 | 0.06 |
29.4 | 0.02 |
29.4 | 0.02 |
29.7 | 0.02 |
29.8 | 0.06 |
29.8 | 0.06 |
29.9 | 0.12 |
30.3 | 0.56 |
30.5 | 0.90 |
30.7 | 1.31 |
30.9 | 1.81 |
31.0 | 2.09 |
31.0 | 2.09 |
31.8 | 5.05 |
32.7 | 9.90 |
34.0 | 19.77 |
Sum | 71.67 |
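The table stops at the sum of squared deviations. A minimal sketch of how the calculation could be completed follows, under two assumptions that the original does not state: that the sample standard deviation (divisor n - 1) is intended, and that the "rough estimate" refers to the common range/4 rule of thumb. Variable names are illustrative.

```python
import math

ages = [
    26.6, 27.9, 27.9, 29.9, 29.3, 28.1, 28.4, 28.9, 27.7, 28.7,
    30.5, 29.8, 28.5, 27.9, 30.9, 29.3, 28.8, 28.6, 29.1, 31.0,
    30.7, 30.3, 29.7, 31.0, 29.4, 29.8, 29.4, 32.7, 34.0, 31.8,
]

n = len(ages)
mean = sum(ages) / n

# Sum of squared deviations from the mean (the "Sum" row of the table).
ss = sum((x - mean) ** 2 for x in ages)

# Assumption: sample standard deviation, dividing by n - 1.
sample_sd = math.sqrt(ss / (n - 1))

# Assumption: rough estimate via the range/4 rule of thumb.
data_range = max(ages) - min(ages)
rough_sd = data_range / 4

print(f"mean = {mean:.2f}, sum of squares = {ss:.2f}")
print(f"sample SD = {sample_sd:.2f}, range/4 estimate = {rough_sd:.2f}")
```

With these assumptions, the sum of squares agrees with the 71.67 in the table, the sample standard deviation is about 1.57, and the range/4 estimate is 7.4/4 = 1.85. The rough estimate is somewhat larger because the single high value 34.0 stretches the range.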