Consider your work environment, domain of interest, or the world around you, and discuss when it might be more appropriate to use the mean, median, or mode for measures of central tendency.
The mean has one major disadvantage: it is sensitive to extreme values, or outliers. Outliers are values that are unusually small or large compared with the rest of the data set. When outliers are present, the median is a better measure of central tendency because it is largely unaffected by extreme values.
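As a rough illustration of the point above, the following sketch compares the mean and median before and after adding one extreme value. The salary figures are made up for the example; only the `statistics` module from the Python standard library is used.

```python
from statistics import mean, median

# Hypothetical salaries in thousands; the numbers are invented for illustration.
salaries = [42, 45, 47, 50, 52]
with_outlier = salaries + [400]   # add one extreme value

print("without outlier:", mean(salaries), median(salaries))          # 47.2 47
print("with outlier:   ", mean(with_outlier), median(with_outlier))  # 106 48.5
```

One extreme value more than doubles the mean, while the median moves only from 47 to 48.5.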
We also usually prefer the median over the mean (or mode) when the data are skewed (i.e., the frequency distribution has a long tail to the left or right). Consider the normal distribution, the most frequently used distribution in statistics: for a perfectly normal variable the mean, median and mode are identical, and each represents the most typical value in the data set. As the data become skewed, however, the mean loses its ability to provide the best central location because the values in the long tail drag it away from the typical value. The median retains this position better and is not as strongly influenced by the skewed observations.
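A small sketch of this effect, using invented numbers rather than real data: the first list is symmetric, and the second is the same list with two large values appended to create a right tail.

```python
from statistics import mean, median, mode

# Hypothetical data sets, chosen only to illustrate skew.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
right_skewed = symmetric + [12, 20]   # two large values create a long right tail

for name, data in [("symmetric", symmetric), ("right-skewed", right_skewed)]:
    print(f"{name}: mean={mean(data):.2f}, median={median(data)}, mode={mode(data)}")

# symmetric: mean=3.00, median=3, mode=3
# right-skewed: mean=5.36, median=3, mode=3
```

In the symmetric list all three measures agree. In the skewed list the mean is pulled toward the tail, while the median and mode stay at 3.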
The mode is used for categorical data, where we wish to know which category is the most common. One problem with the mode is that it is not necessarily unique: two or more values can share the highest frequency, which is the case with bimodal or multimodal data.
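The non-uniqueness can be seen in a short sketch with made-up commuting data. It assumes Python 3.8 or later, where `statistics.multimode` is available and `statistics.mode` returns the first mode encountered instead of raising an error.

```python
from statistics import mode, multimode

# Hypothetical survey answers: how people travel to work.
commute = ["bus", "car", "car", "bike", "bus", "train"]

print(multimode(commute))   # ['bus', 'car'] -> two categories tie for most common
print(mode(commute))        # 'bus' -> only the first of the tied modes is reported
```

Reporting a single mode here would hide the fact that "car" is just as common as "bus".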
Another problem with the mode is that it is a poor measure of central tendency when the most common value is far away from the rest of the data. Say 95% of the values lie between 10 and 20, but the value 5 has the highest frequency. Then the mode, 5, is not representative of this data set.
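A sketch of this situation with constructed data: 100 distinct values between 10 and 20 (about 97% of the set, close to the 95% in the example above) plus the value 5 repeated three times. The data are hypothetical and chosen only so that 5 is the sole repeated value.

```python
from statistics import mean, median, mode

# Hypothetical data: 100 distinct values from 10.0 to 19.9,
# plus the value 5 repeated three times.
bulk = [10 + i / 10 for i in range(100)]
data = [5, 5, 5] + bulk

share_in_bulk = len(bulk) / len(data)   # fraction of values between 10 and 20

print("mode:  ", mode(data))     # 5, the only repeated value
print("median:", median(data))   # falls inside the 10 to 20 range
print("mean:  ", mean(data))     # also falls inside the 10 to 20 range
print(f"{share_in_bulk:.0%} of the values lie between 10 and 20")
```

The mode sits outside the range that holds nearly all of the data, while the mean and median both land inside it.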
When you have a normally distributed sample, you can legitimately use either the mean or the median as your measure of central tendency. In fact, in any symmetric, unimodal distribution the mean, median and mode coincide. In this situation, however, the mean is widely preferred because its calculation includes every value in the data set, so a change in any score changes the mean. This is not the case with the median or mode, which often remain unchanged when a single value changes.
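To make the sensitivity argument concrete, this sketch changes one score in a small, invented set of test scores and recomputes all three measures.

```python
from statistics import mean, median, mode

# Hypothetical test scores.
scores = [60, 70, 70, 80, 90]
revised = [60, 70, 70, 80, 95]   # only the top score changes, from 90 to 95

print("original:", mean(scores), median(scores), mode(scores))     # 74 70 70
print("revised: ", mean(revised), median(revised), mode(revised))  # 75 70 70
```

The mean registers the change, while the median and mode stay at 70. This is the same property that makes the mean vulnerable to outliers.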
If you assumed a normal distribution but tests of normality show that the data are non-normal, it is better to use the median instead of the mean. However, this is more a rule of thumb than a strict guideline.