Do you notice any potential outliers? If so, what values are
they? Show your work in how you used the potential outlier formula
to determine whether or not the values might be outliers.
Construct a box plot displaying your data.
Does the middle 50% of the data appear to be concentrated together
or spread apart? Explain how you determined this.
Looking at both the histogram and the box plot, discuss the
distribution of your data.
Number of pencils | Frequency | Cumulative Frequency | Relative Frequency | Cumulative Relative Frequency |
0 | 5 | 5 | 0.125 | 0.125 |
1 | 14 | 19 | 0.35 | 0.475 |
2 | 10 | 29 | 0.25 | 0.725 |
3 | 7 | 36 | 0.175 | 0.90 |
4 | 1 | 37 | 0.025 | 0.925 |
5 | 0 | 37 | 0 | 0.925 |
6 | 0 | 37 | 0 | 0.925 |
7 | 1 | 38 | 0.025 | 0.95 |
8 | 1 | 39 | 0.025 | 0.975 |
9 | 0 | 39 | 0 | 0.975 |
10 | 1 | 40 | 0.025 | 1 |
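The cumulative and relative columns of the table can be rebuilt from the Frequency column alone. The sketch below is only an illustration: the variable names are made up here, and the frequencies are copied from the table.

```python
from itertools import accumulate

# Frequencies copied from the table: index = number of pencils (0 through 10).
freq = [5, 14, 10, 7, 1, 0, 0, 1, 1, 0, 1]

n = sum(freq)                               # total number of observations
cum_freq = list(accumulate(freq))           # running total of the frequencies
rel_freq = [f / n for f in freq]            # share of observations at each value
cum_rel_freq = [c / n for c in cum_freq]    # running share of observations

# Print one row per value in the same column order as the table.
for pencils, (f, cf, rf, crf) in enumerate(zip(freq, cum_freq, rel_freq, cum_rel_freq)):
    print(f"{pencils:>2} | {f:>2} | {cf:>2} | {rf:.3f} | {crf:.3f}")
```

The last cumulative frequency should equal the sample size (40) and the last cumulative relative frequency should equal 1, which is a quick check that the table was filled in correctly.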
Do you notice any potential outliers? If so, what values are they? Show your work in how you used the potential outlier formula to determine whether or not the values might be outliers.
From the table we find where the 25th, 50th and 75th percentiles occur.
We look down the cumulative relative frequency column for the first row where the value reaches 0.25. That happens in the 2nd row (1 pencil, cumulative relative frequency 0.475), hence the 1st quartile = 1.
Similarly, the column first reaches 0.5 in the 3rd row (2 pencils, 0.725) and 0.75 in the 4th row (3 pencils, 0.90), which give the median and the 3rd quartile.
Q1 = 1
Q2 = 2
Q3 = 3
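The quartile lookup can be sketched in code, assuming the rule "take the first value whose cumulative relative frequency reaches the target proportion". The helper name `percentile_value` is hypothetical. Other quartile conventions (for example, averaging two neighbouring values) can give different answers on other data sets, although they agree here.

```python
from itertools import accumulate

# Frequencies copied from the table: index = number of pencils (0 through 10).
freq = [5, 14, 10, 7, 1, 0, 0, 1, 1, 0, 1]
n = sum(freq)
cum_freq = list(accumulate(freq))

def percentile_value(p):
    """Return the smallest value whose cumulative relative frequency is at least p."""
    for value, c in enumerate(cum_freq):
        if c / n >= p:
            return value
    raise ValueError("p must be between 0 and 1")

q1, q2, q3 = (percentile_value(p) for p in (0.25, 0.50, 0.75))
print(q1, q2, q3)  # prints: 1 2 3
```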
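IQR = Q3 - Q1 = 3 - 1 = 2
Upper limit = Q3 + 1.5*IQR = 3 + 1.5*2 = 6
Lower limit = Q1 - 1.5*IQR = 1 - 1.5*2 = -2
Hence any value less than -2 or greater than 6 is a potential outlier. No value is below -2, but the observed values 7, 8 and 10 are greater than 6, so these three are the potential outliers.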
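The same check can be run in a few lines. This is a minimal sketch that takes Q1 = 1 and Q3 = 3 as given above and uses the frequencies from the table; the variable names are illustrative only.

```python
# Frequencies copied from the table: index = number of pencils (0 through 10).
freq = [5, 14, 10, 7, 1, 0, 0, 1, 1, 0, 1]

q1, q3 = 1, 3                      # quartiles found from the table
iqr = q3 - q1                      # interquartile range
lower_limit = q1 - 1.5 * iqr       # anything below this is a potential outlier
upper_limit = q3 + 1.5 * iqr       # anything above this is a potential outlier

# Values that actually occur in the data (frequency greater than zero).
observed = [value for value, f in enumerate(freq) if f > 0]
outliers = [v for v in observed if v < lower_limit or v > upper_limit]

print(lower_limit, upper_limit, outliers)  # prints: -2.0 6.0 [7, 8, 10]
```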
Construct a box plot displaying your data.
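The numbers needed to draw the box plot can be collected as follows. This sketch assumes one common convention, in which each whisker ends at the most extreme value that is not a potential outlier and the outliers are plotted as separate points. Some textbooks instead draw the whiskers out to the minimum and maximum. The variable names are illustrative.

```python
# Frequencies copied from the table: index = number of pencils (0 through 10).
freq = [5, 14, 10, 7, 1, 0, 0, 1, 1, 0, 1]

q1, median, q3 = 1, 2, 3                 # quartiles read from the table
iqr = q3 - q1
lower_limit = q1 - 1.5 * iqr             # -2
upper_limit = q3 + 1.5 * iqr             # 6

# Values that actually occur in the data.
observed = [value for value, f in enumerate(freq) if f > 0]

# Whiskers end at the most extreme observed values inside the limits.
inside = [v for v in observed if lower_limit <= v <= upper_limit]
whisker_low, whisker_high = min(inside), max(inside)

# Everything outside the limits is drawn as an individual point.
outlier_points = [v for v in observed if v < lower_limit or v > upper_limit]

print("box:", q1, median, q3)
print("whiskers:", whisker_low, whisker_high)
print("outlier points:", outlier_points)
```

Under this convention the box spans 1 to 3 with the median line at 2, the whiskers reach 0 and 4, and 7, 8 and 10 are marked individually.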
Does the middle 50% of the data appear to be concentrated together or spread apart? Explain how you determined this.
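Yes, the middle 50% of the data is concentrated together. The box runs from Q1 = 1 to Q3 = 3, so the IQR is only 2, while the data as a whole range from 0 to 10. In the box plot the upper and lower ends of the box are equally far from the median, which lies in the centre of the box, so the central part of the data looks roughly symmetric. Apart from the outliers, the upper and lower whiskers also suggest that the data are approximately normally distributed, and the histogram without the outliers gives the same impression.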
Looking at both the histogram and the box plot, discuss the distribution of your data.
In both the histogram and the box plot we see that there are outliers on the upper side: the values greater than 6, namely 7, 8 and 10. They give the distribution a long tail to the right.
If those values are removed from the data, the remaining data appear to be approximately normally distributed.