What is meant by measures of dispersion? What are they, and for what types of variables are they appropriate?
Measures of central tendency alone are not sufficient to describe a dataset completely; the variability of the data also plays an important role. Measures of dispersion quantify that variability. The common ones are the range, the standard deviation, and the interquartile range.
Range
The range is the difference between the largest and smallest observations in the data. It is easy to calculate, but it uses only the two most extreme observations rather than the whole dataset, and hence it is very sensitive to extreme values.
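To illustrate how sensitive the range is, here is a minimal Python sketch. The data values and the helper name value_range are made up for illustration; they are not from any real dataset or library.

```python
# Hypothetical observations, invented for this example only.
data = [12, 15, 11, 14, 13, 16, 12]


def value_range(values):
    """Return the largest observation minus the smallest."""
    return max(values) - min(values)


range_without_outlier = value_range(data)
# Adding a single extreme observation changes the range drastically.
range_with_outlier = value_range(data + [60])

print(range_without_outlier)  # 5
print(range_with_outlier)     # 49
```

Only the minimum and maximum enter the calculation, so one unusual value is enough to change the result.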
Standard Deviation
The standard deviation (SD) measures the spread of the data about the mean. It is the square root of the sum of squared deviations from the mean divided by the number of observations. (For a sample, the divisor n - 1 is commonly used in place of n.)
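The definition above can be sketched in Python as follows. The data values and the function name population_sd are assumptions chosen for illustration. The standard library's statistics.pstdev divides by the number of observations, as described here, while statistics.stdev divides by n - 1 as is common for samples.

```python
import math
import statistics

# Hypothetical observations, invented for this example only.
data = [2, 4, 4, 4, 5, 5, 7, 9]


def population_sd(values):
    """Square root of the mean squared deviation from the mean."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / n)


sd_manual = population_sd(data)        # divides by n
sd_library = statistics.pstdev(data)   # same definition, from the standard library
sd_sample = statistics.stdev(data)     # divides by n - 1 instead

print(sd_manual)   # 2.0
print(sd_library)
print(sd_sample)
```

For these values the mean is 5 and the squared deviations sum to 32, so the SD with divisor n is the square root of 32 / 8, which is 2.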
The SD can be used along with the mean to detect skewness; for example, for a variable that cannot be negative, an SD that is large relative to the mean suggests a skewed distribution. The SD is also a useful measure of dispersion because, for observations from a normal distribution, approximately 68% lie within (Mean +/- 1 SD), approximately 95% lie within (Mean +/- 2 SD), and approximately 99.7% lie within (Mean +/- 3 SD).
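The 68%, 95% and 99.7% figures can be checked with a short Python sketch using the standard library's statistics.NormalDist. The helper name share_within is an assumption made for this example.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, SD 1.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Fraction of a normal distribution lying within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


coverage = {k: share_within(k) for k in (1, 2, 3)}
for k, share in coverage.items():
    print(f"within {k} SD: {share:.1%}")
```

These proportions hold for normally distributed data; they should not be assumed for skewed data.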
Interquartile Range
The interquartile range (IQR) is the difference between the 75th and 25th percentiles, so it describes the spread of the middle 50% of observations. Because it is not affected by extreme values, it can be used as a measure of variability even when the extreme values are not recorded exactly.
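As a rough illustration, the sketch below computes the interquartile range with the standard library's statistics.quantiles. The data and the helper name interquartile_range are made up, and the "inclusive" method is just one of several quartile conventions; other conventions can give slightly different quartiles.

```python
import statistics

# Hypothetical observations, invented for this example only.
original = [1, 2, 3, 4, 5, 6, 7, 8, 9]
# Same data, but the largest value is replaced by an extreme one.
with_extreme = [1, 2, 3, 4, 5, 6, 7, 8, 900]


def interquartile_range(values):
    """75th percentile minus 25th percentile (inclusive method)."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


iqr_original = interquartile_range(original)
iqr_with_extreme = interquartile_range(with_extreme)

print(iqr_original)      # 4.0
print(iqr_with_extreme)  # 4.0
```

The extreme value changes the range from 8 to 899 but leaves the interquartile range unchanged, because only the middle 50% of observations is used.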
The SD is used as the measure of dispersion when the mean is used as the measure of central tendency (that is, for symmetric numerical data).
The interquartile range is used along with the median for ordinal data or skewed numerical data.
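The following Python sketch shows why the median and interquartile range are preferred for skewed data. The values are hypothetical, and the "inclusive" quartile method is an assumption; other conventions exist.

```python
import statistics

# Hypothetical right-skewed observations: one value is far above the rest.
skewed = [20, 22, 25, 27, 30, 32, 35, 250]

mean = statistics.mean(skewed)
sd = statistics.pstdev(skewed)
median = statistics.median(skewed)
q1, _q2, q3 = statistics.quantiles(skewed, n=4, method="inclusive")
iqr = q3 - q1

# The mean and SD are pulled upward by the single large value,
# while the median and IQR still describe the bulk of the data.
print(f"mean = {mean}, SD = {sd:.1f}")
print(f"median = {median}, IQR = {iqr}")
```

Here the mean (55.125) is larger than every observation except one, whereas the median (28.5) sits in the middle of the typical values.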