Because the mean is very sensitive to extreme values, it is not a resistant measure of center. By deleting some low values and high values, the trimmed mean is more resistant. To find the 10% trimmed mean for a data set, first arrange the data in order, then delete the bottom 10% of the values and the top 10% of the values, then calculate the mean of the remaining values. Use the axial loads (pounds) of aluminum cans listed below for cans that are 0.0111 in. thick. Identify any outliers, then compare the median, mean, 10% trimmed mean, and 20% trimmed mean.
248 259 268 272 277 280 282 283 285 285 286 289 290 292 294 295 295 298 310 501
A scatterplot of the untrimmed data shows one potential outlier: the value 501 pounds, which lies far above the other 19 values.
The mean is the sum of the data values divided by the number of values: mean = 5889 / 20 = 294.45. The sorted data values are:
1 | 248 |
2 | 259 |
3 | 268 |
4 | 272 |
5 | 277 |
6 | 280 |
7 | 282 |
8 | 283 |
9 | 285 |
10 | 285 |
11 | 286 |
12 | 289 |
13 | 290 |
14 | 292 |
15 | 294 |
16 | 295 |
17 | 295 |
18 | 298 |
19 | 310 |
20 | 501 |
Mean | 294.45 |
Median | 285.5 |
Since n = 20 is even, the median is the average of the two middle values (the 10th and 11th): (285 + 286) / 2 = 285.5.
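As a quick check of the untrimmed mean and median, here is a minimal Python sketch. It only assumes the 20 axial loads listed above and uses the standard library `statistics` module; the variable names are illustrative.

```python
from statistics import mean, median

# Axial loads (pounds) copied from the data set above.
loads = [248, 259, 268, 272, 277, 280, 282, 283, 285, 285,
         286, 289, 290, 292, 294, 295, 295, 298, 310, 501]

untrimmed_mean = mean(loads)      # sum of the values divided by n
untrimmed_median = median(loads)  # average of the 10th and 11th sorted values

print(untrimmed_mean, untrimmed_median)
```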
10% trimmed
After sorting the data and removing the lowest 10% and the highest 10% of the values (20 × 10% = 2 values from each end), 16 values remain. Their mean is 4571 / 16 = 285.6875:
1 | 268 |
2 | 272 |
3 | 277 |
4 | 280 |
5 | 282 |
6 | 283 |
7 | 285 |
8 | 285 |
9 | 286 |
10 | 289 |
11 | 290 |
12 | 292 |
13 | 294 |
14 | 295 |
15 | 295 |
16 | 298 |
Mean | 285.6875 |
Median | 285.5 |
Since n = 16 is even, the median is the average of the two middle values (the 8th and 9th): (285 + 286) / 2 = 285.5.
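The trimming procedure can be sketched as a small function. This is an illustration, not a standard library routine: `trimmed_mean` is a hypothetical helper, and it assumes the number of values dropped from each end is n × percent / 100 rounded down, which is exact here because 20 × 10% = 2.

```python
def trimmed_mean(values, percent):
    """Mean after dropping `percent`% of the values from each end (hypothetical helper)."""
    ordered = sorted(values)
    k = len(ordered) * percent // 100   # values dropped per end (rounded down)
    kept = ordered[k:len(ordered) - k]  # middle portion that remains
    return sum(kept) / len(kept)

# Axial loads (pounds) copied from the data set above.
loads = [248, 259, 268, 272, 277, 280, 282, 283, 285, 285,
         286, 289, 290, 292, 294, 295, 295, 298, 310, 501]

ten_percent = trimmed_mean(loads, 10)  # drops 248, 259 and 310, 501
print(ten_percent)
```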
20% trimmed
After removing the lowest 20% and the highest 20% of the sorted values (20 × 20% = 4 values from each end), 12 values remain. Their mean is 3438 / 12 = 286.5:
1 | 277 |
2 | 280 |
3 | 282 |
4 | 283 |
5 | 285 |
6 | 285 |
7 | 286 |
8 | 289 |
9 | 290 |
10 | 292 |
11 | 294 |
12 | 295 |
Mean | 286.5 |
Median | 285.5 |
Since n = 12 is even, the median is the average of the two middle values (the 6th and 7th): (285 + 286) / 2 = 285.5.
Comparing median, mean, 10% trimmed mean, and 20% trimmed mean
Data set | Mean | Median |
Untrimmed | 294.45 | 285.5 |
10% trimmed | 285.6875 | 285.5 |
20% trimmed | 286.5 | 285.5 |
The median is the same, 285.5, for all three data sets. Trimming removes the same number of values from each end of the sorted data, so the two middle values (285 and 286) stay in the middle and the median is unchanged. This shows that the median is a resistant measure of center.
The untrimmed mean (294.45) is noticeably larger than both trimmed means because it includes the potential outlier 501, which pulls the mean upward. Once the outlier is removed by trimming, the mean drops to 285.6875 (10% trimmed) and 286.5 (20% trimmed), both close to the median.
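The whole comparison can be reproduced with a short Python sketch. It assumes the same rounded-down trimming rule as the worked solution (2 values per end at 10%, 4 per end at 20%); the names `trim` and `results` are illustrative.

```python
from statistics import mean, median

def trim(values, percent):
    """Sorted values with `percent`% removed from each end (hypothetical helper)."""
    ordered = sorted(values)
    k = len(ordered) * percent // 100  # values dropped per end (rounded down)
    return ordered[k:len(ordered) - k]

# Axial loads (pounds) copied from the data set above.
loads = [248, 259, 268, 272, 277, 280, 282, 283, 285, 285,
         286, 289, 290, 292, 294, 295, 295, 298, 310, 501]

# Map each trim percentage to (mean, median) of the remaining values.
results = {}
for percent in (0, 10, 20):
    kept = trim(loads, percent)
    results[percent] = (mean(kept), median(kept))
    print(percent, len(kept), results[percent])
```

The mean moves from about 294.45 to the mid-280s once 501 is trimmed away, while the median stays at 285.5 in every row.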