Discuss the three measures of central tendency. Give an example for each that applies the measure to Radar / Speed Enforcement and discuss how the measure is used. What are the advantages and disadvantages of each of the three measures? How do outliers affect each of these three measures? What are some options for handling outliers?
There are three measures of central tendency.
1) MEAN -- The mean is the sum of the values of the observations in a dataset divided by the number of observations. It is also known as the arithmetic mean or arithmetic average. For radar, the mean can be used to average the range of the radar waves. Note that it should not be used for the average speed measured by radar, because for that we use the harmonic mean, not the arithmetic mean. The main advantage of the mean is that it can be used for both continuous and discrete numeric data. Its main disadvantage is that it cannot be calculated for categorical data, because non-numeric values cannot be summed. A second disadvantage is that the mean is a function of every value in the dataset, so it is affected by outliers.
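As a rough illustration of the difference between the arithmetic and the harmonic mean, here is a minimal Python sketch. The two speed values are hypothetical, and the harmonic mean is interpreted under the assumption that the vehicle covers two stretches of equal length at those speeds.

```python
from statistics import mean, harmonic_mean

# Hypothetical speeds in km/h on two equal-length stretches of road
# (illustrative values, not taken from the answer above).
speeds_kmh = [60, 40]

# Arithmetic mean: sum of the values divided by their count.
arithmetic = mean(speeds_kmh)

# Harmonic mean: count divided by the sum of reciprocals, 2 / (1/60 + 1/40).
# Under the equal-distance assumption this is the true average speed.
harmonic = harmonic_mean(speeds_kmh)

print("arithmetic mean:", arithmetic)
print("harmonic mean:", round(harmonic, 2))
```

The arithmetic mean comes out at 50 km/h and the harmonic mean at about 48 km/h, because more time is spent on the slower stretch.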
2) MEDIAN -- The median is the middle value in a dataset when the values are arranged in ascending or descending order. It divides the ordered values in half, so that 50% of the observations lie on either side of it. In a distribution with an odd number of observations, the median is the middle value; with an even number of observations, it is the average of the two middle values. The main advantage of the median is that it is less affected by outliers and skewed data than the mean, and it is usually the preferred measure of central tendency when the distribution is not symmetrical. The disadvantage of the median is that it cannot be identified for categorical nominal data, as such data cannot be logically ordered.
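The odd and even cases can be sketched in Python as follows. The speed readings are hypothetical values chosen only for illustration.

```python
from statistics import median

# Hypothetical radar speed readings in km/h (illustrative values only).
odd_count = [52, 48, 61, 55, 50]        # 5 readings
even_count = [52, 48, 61, 55, 50, 58]   # 6 readings

# Odd count: sorted values are 48 50 52 55 61, so the middle value is 52.
median_odd = median(odd_count)

# Even count: sorted values are 48 50 52 55 58 61, so the median is
# the average of the two middle values, (52 + 55) / 2 = 53.5.
median_even = median(even_count)

print(median_odd)   # 52
print(median_even)  # 53.5
```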
3) MODE -- The mode is the most frequently occurring value in a dataset. For example, in the dataset
54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60
the most commonly occurring value is 54 (it occurs three times, more often than any other value), so the mode of this distribution is 54 years. The mode has an advantage over the median and the mean in that it can be found for both numerical and categorical (non-numerical) data. Its main disadvantage is that in some cases, particularly where the data are continuous, the distribution may have no mode at all (if all values are different) or more than one mode (bimodal or multimodal).
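The example above, together with the categorical, no-mode, and bimodal cases, can be sketched in Python. The `vehicle_types`, `all_different`, and `bimodal` lists are hypothetical data added for illustration, and `statistics.multimode` assumes Python 3.8 or later.

```python
from statistics import mode, multimode

# Dataset from the example above.
values = [54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60]
single_mode = mode(values)
print(single_mode)  # 54

# Categorical data (hypothetical vehicle types): the mode still works.
vehicle_types = ["car", "truck", "car", "motorcycle", "car"]
categorical_mode = mode(vehicle_types)
print(categorical_mode)  # car

# All values different: every value ties, so there is no useful mode.
all_different = multimode([51, 52, 53])
print(all_different)  # [51, 52, 53]

# Two values tie for most frequent: a bimodal dataset.
bimodal = multimode([50, 50, 55, 55, 60])
print(bimodal)  # [50, 55]
```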
Outliers are extreme values that are notably different from the rest of the values in the dataset. It is important to detect outliers within a distribution because they can alter the results of the data analysis. The mean is more sensitive to the existence of outliers than the median or the mode. Despite the existence of outliers in a distribution, the mean can still be an appropriate measure of central tendency, especially if the rest of the data is normally distributed. If an outlier is confirmed as a valid extreme value, it should not be removed from the dataset. Several common regression techniques can help reduce the influence of outliers on the mean value.
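To make the effect of an outlier concrete, the sketch below reuses the numbers from the mode example and appends one hypothetical extreme reading of 140. The 1.5 x IQR rule used for flagging is one common convention and an assumption here. It is not a method named in the answer above.

```python
from statistics import mean, median, mode, quantiles

# Numbers from the mode example, plus one hypothetical extreme value.
readings = [54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60]
with_outlier = readings + [140]

# The mean shifts noticeably; the median and mode stay at 57 and 54.
print("mean:  ", round(mean(readings), 2), "->", round(mean(with_outlier), 2))
print("median:", median(readings), "->", median(with_outlier))
print("mode:  ", mode(readings), "->", mode(with_outlier))

# Flag outliers with the 1.5 * IQR rule (an assumed convention).
q1, _, q3 = quantiles(with_outlier, n=4)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
flagged = [x for x in with_outlier if x < lower or x > upper]
print("flagged:", flagged)
```

Flagging a value is only a prompt to check it. Whether it is then kept, corrected, or excluded depends on whether it turns out to be a valid extreme value.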