Describe a situation when the mean may not be the best descriptive statistic to represent a sample. What statistic might be a better statistic to use in that situation and why? If the mean must be used what are some solutions?
Describe a situation when the mean may not be the best descriptive statistic to represent a sample.
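When the distribution of a sample is not normal and is skewed, the mean may not be the best descriptive statistic to represent the sample. In a skewed distribution the mean is pulled toward the long tail by extreme values, so it no longer sits at the center of the data. Household income is a typical example: a few very high incomes raise the mean well above what most households earn.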
What statistic might be a better statistic to use in that situation and why?
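The median is a better statistic to use in that situation because it is not affected by skewness or extreme values: it is always the middle value, with half of the observations below it and half above. The mean, median, and mode coincide only when the distribution is symmetric and unimodal, as in a Gaussian (normal) distribution.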
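A small sketch can make the difference concrete. The sample below is hypothetical (made-up incomes in thousands, with one very large value), and it uses only Python's standard `statistics` module:

```python
import statistics

# Hypothetical right-skewed sample: incomes in thousands, one extreme value.
incomes = [28, 31, 33, 35, 36, 38, 40, 42, 45, 400]

mean_income = statistics.mean(incomes)      # pulled upward by the value 400
median_income = statistics.median(incomes)  # middle of the sorted values

print(f"mean   = {mean_income}")
print(f"median = {median_income}")
```

In this sample the mean is larger than nine of the ten observations, while the median stays among the typical values.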
If the mean must be used what are some solutions?
Clean the data and remove outliers to the extent that the distribution looks approximately normal. Once the extreme values are gone, the mean moves back toward the center of the data.
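One possible way to do this cleaning is sketched below. It assumes the common 1.5 × IQR rule for flagging outliers, which is a choice made here for illustration and not something the question prescribes; the helper name `remove_outliers_iqr` and the sample data are hypothetical.

```python
import statistics

def remove_outliers_iqr(values, k=1.5):
    """Keep only values within k * IQR of the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

# Hypothetical right-skewed sample: incomes in thousands, one extreme value.
incomes = [28, 31, 33, 35, 36, 38, 40, 42, 45, 400]

cleaned = remove_outliers_iqr(incomes)

print("mean before:", statistics.mean(incomes))
print("mean after: ", statistics.mean(cleaned))
print("median after:", statistics.median(cleaned))
```

After the extreme value is dropped, the mean and median of the remaining data are close to each other. Whether removing a value is justified depends on the data; a genuine observation is not the same as a recording error.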