Arranging the data values in Ascending Order -
14, 18, 21, 34, 38, 39, 41, 41, 45, 50, 53, 82, 90, 92
The number of observations, n = 14
Answer a:
To compute the percentile of a data value we use the formula -
R = (P / 100) x (n + 1),
where R is the rank order (position) of the data value in the ordered list, P is the percentile and n is the number of observations (here n = 14).
38 is the 5th value in the ordered list, so its rank order is R = 5
Therefore, 5 = (P / 100) x 15
Which implies, P = 500 / 15 = 33.333
Hence, 38 is approximately the 33rd Percentile of the data set
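The calculation in Answer a can be checked with a short Python sketch. The function name percentile_of_value is hypothetical (it is not from the original solution), and the sketch assumes the value occurs in the data and that the rank of its first occurrence is the one wanted.

```python
# Ordered data set from the problem.
data = sorted([14, 18, 21, 34, 38, 39, 41, 41, 45, 50, 53, 82, 90, 92])

def percentile_of_value(sorted_data, value):
    """Solve R = (P / 100) x (n + 1) for P, given a value in the data.

    Hypothetical helper: uses the 1-based rank of the first occurrence.
    """
    n = len(sorted_data)
    rank = sorted_data.index(value) + 1  # 38 is the 5th value, so rank = 5
    return 100 * rank / (n + 1)

p = percentile_of_value(data, 38)
print(round(p, 3))  # 33.333
```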
Answer b:
We use the same formula to find the position of the 90th percentile in the data set -
R = (P / 100) x (n + 1), where the symbols have the same meanings as in Answer a
Here, P = 90
Therefore, R = (90/100) x 15
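Which implies R = 13.5, a position halfway between the 13th observation (90) and the 14th observation (92). Rounding the position down, we take the 13th Observation in the set
Hence, the 90th percentile is 90 (interpolating between the 13th and 14th observations would instead give 91)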
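The position R and the value read off at that position can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes 1 <= R <= n. It shows both the round-down reading used in Answer b and the linear-interpolation reading, since textbooks differ on which convention to apply when R is not a whole number.

```python
import math

data = sorted([14, 18, 21, 34, 38, 39, 41, 41, 45, 50, 53, 82, 90, 92])

def percentile_position(p, n):
    """Position R = (P / 100) x (n + 1), written to avoid rounding error."""
    return p * (n + 1) / 100

def value_round_down(sorted_data, r):
    """Take the observation at the position rounded down (assumes r >= 1)."""
    return sorted_data[math.floor(r) - 1]

def value_interpolated(sorted_data, r):
    """Interpolate linearly between the two neighbouring observations."""
    lower = math.floor(r)
    frac = r - lower
    if frac == 0 or lower >= len(sorted_data):
        return sorted_data[lower - 1]
    below, above = sorted_data[lower - 1], sorted_data[lower]
    return below + frac * (above - below)

r = percentile_position(90, len(data))
print(r)                            # 13.5
print(value_round_down(data, r))    # 90
print(value_interpolated(data, r))  # 91.0
```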
Answer c:
Five number summary of the data set is as follows -
Minimum Value of the data set = 14
Maximum Value of the data set = 92
The Median / Second Quartile is the middle value of the ordered data set. Since there are 14 observations, the median is (7th + 8th observations) / 2 = (41 + 41) / 2 = 41
The Lower Quartile (Q1) is the median of the lower (first) half of the data set.
Median of first 7 observations, Q1 = 4th Observation = 34
The Upper Quartile (Q3) is the median of the upper (second) half of the data set.
Median of the last 7 observations (41, 45, 50, 53, 82, 90, 92), Q3 = 4th Observation of that half
Taking the second 41 (the 8th observation overall) as the first observation and counting from there,
Q3 = 4th obs. = 53
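The five number summary above can be reproduced with a small Python sketch using the standard statistics module. The function name five_number_summary is hypothetical, and it assumes the median-of-halves convention used in this answer (for an odd number of observations the middle value is left out of both halves). Other quartile conventions can give slightly different Q1 and Q3.

```python
from statistics import median

data = [14, 18, 21, 34, 38, 39, 41, 41, 45, 50, 53, 82, 90, 92]

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) by the median-of-halves method."""
    s = sorted(values)
    n = len(s)
    half = n // 2
    lower_half = s[:half]                                   # first 7 values here
    upper_half = s[half:] if n % 2 == 0 else s[half + 1:]   # last 7 values here
    return (s[0], median(lower_half), median(s), median(upper_half), s[-1])

print(five_number_summary(data))  # (14, 34, 41.0, 53, 92)
```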
Answer d:
The Modified Box Plot of the data set is given below -
(The dots represent the outliers in the data set: 82, 90 and 92. With these excluded, the upper whisker ends at the new maximum value, 53, which is also the Upper Quartile of the data set.)
Answer e:
The Inter Quartile Range (IQR) = Upper Quartile - Lower Quartile = 53 - 34 = 19
To check for outliers in the data set we multiply the IQR by 1.5. Let p = 1.5 x IQR; we then subtract p from the lower quartile and add p to the upper quartile. The two values obtained are the limits of the range within which non-outlying observations lie. If all the observations lie inside this range, there are no outliers in the data set; otherwise, the observations that lie outside the range are the outliers of the data set.
Therefore,
1.5 x IQR = p = 1.5 x 19 = 28.5
Q1 - p = 34 - 28.5 = 5.5
Q3 + p = 53 + 28.5 = 81.5
The range is [5.5, 81.5].
Hence, the 3 observations 82, 90 and 92 lie above 81.5 and are potential outliers of the data set. No observation lies below 5.5.
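The outlier check can be sketched in Python as below. The variable names are hypothetical, and Q1 = 34 and Q3 = 53 are taken from Answer c. The last two lines also give the whisker ends a modified box plot would use, namely the smallest and largest observations that are not outliers.

```python
data = [14, 18, 21, 34, 38, 39, 41, 41, 45, 50, 53, 82, 90, 92]

q1, q3 = 34, 53          # quartiles from Answer c
iqr = q3 - q1            # 19
p = 1.5 * iqr            # 28.5
lower_limit = q1 - p     # 5.5
upper_limit = q3 + p     # 81.5

# Observations outside [lower_limit, upper_limit] are potential outliers.
outliers = [x for x in data if x < lower_limit or x > upper_limit]
inside = [x for x in data if lower_limit <= x <= upper_limit]

print(lower_limit, upper_limit)  # 5.5 81.5
print(outliers)                  # [82, 90, 92]
print(min(inside), max(inside))  # 14 53
```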