[4 5 5 2 4 4 6 3 3 7 5 3 6 3 4 4 6 5 4 5 3 7 5 5 4 2 6 5 6 6] This is my dataset
Solution :
Mean, Median and Mode are the measures of central tendency.
The mean measures the average of the data values. The median is the value that separates the bottom 50% of observations from the top 50%. The mode is the most frequently occurring observation in the data.
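As a quick check, these three measures can be computed for the dataset in the question. This is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative and not part of the original solution.

```python
from statistics import mean, median, multimode

# Dataset from the question (30 observations)
data = [4, 5, 5, 2, 4, 4, 6, 3, 3, 7, 5, 3, 6, 3, 4,
        4, 6, 5, 4, 5, 3, 7, 5, 5, 4, 2, 6, 5, 6, 6]

data_mean = mean(data)        # sum of the values divided by their count
data_median = median(data)    # middle value of the sorted data
data_modes = multimode(data)  # list of the most frequent value(s)

print(f"mean   = {data_mean:.4f}")
print(f"median = {data_median}")
print(f"mode   = {data_modes}")
```

For this dataset the mean is about 4.57, and both the median and the mode are 5.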
Variance, standard deviation, coefficient of variation and the range are the measures of dispersion.
Variance and standard deviation measure the spread of the observations around their mean. The coefficient of variation (CV) is the ratio of the standard deviation to the mean. The higher the CV, the greater the dispersion relative to the mean.
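The dispersion measures can be sketched the same way. The solution does not say whether the sample or the population formulas were used, so this example assumes the sample versions (denominator n - 1); the variable names are illustrative.

```python
from statistics import mean, stdev, variance

# Dataset from the question (30 observations)
data = [4, 5, 5, 2, 4, 4, 6, 3, 3, 7, 5, 3, 6, 3, 4,
        4, 6, 5, 4, 5, 3, 7, 5, 5, 4, 2, 6, 5, 6, 6]

# Assumption: sample statistics (n - 1 in the denominator)
sample_var = variance(data)
sample_sd = stdev(data)

# Coefficient of variation: standard deviation relative to the mean
cv = sample_sd / mean(data)

# Range: largest value minus smallest value
data_range = max(data) - min(data)

print(f"variance = {sample_var:.4f}")
print(f"std dev  = {sample_sd:.4f}")
print(f"CV       = {cv:.4f}")
print(f"range    = {data_range}")
```

If the population formulas are wanted instead, `pvariance` and `pstdev` from the same module can be swapped in.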
Chebyshev's rule applies to all types of data sets. It states that the proportion of observations lying within k standard deviations of the mean is at least $1 - 1/k^2$; this holds for k > 1.
For k = 5, we get $1 - 1/5^2 = 1 - 1/25 = 0.96$. Hence the proportion of observations that lie within 5 standard deviations of the mean is at least 0.96.
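The Chebyshev bound is easy to evaluate for any k > 1. The helper below is a hypothetical illustration (the function name is not from the original solution) of the lower bound 1 - 1/k^2.

```python
def chebyshev_lower_bound(k: float) -> float:
    """Minimum proportion of observations within k standard deviations of the mean."""
    if k <= 1:
        # The bound is only informative for k > 1
        raise ValueError("Chebyshev's rule requires k > 1")
    return 1 - 1 / k**2


for k in (2, 3, 5):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.4f}")
```

With k = 5 this gives 0.96, matching the value above. The bound is a guaranteed minimum, so the actual proportion in a given dataset can be higher.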
Consider the following output;
The column "standardized" gives the standardized values (or Z scores) for the data. Since no observation has a Z score outside the interval (-2, 2), we conclude that there are no outliers in the data.
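The Z scores can be recomputed directly from the data to confirm this conclusion. This is a sketch that assumes the sample standard deviation (n - 1) was used for standardizing, which the solution does not state; the variable names are illustrative.

```python
from statistics import mean, stdev

# Dataset from the question (30 observations)
data = [4, 5, 5, 2, 4, 4, 6, 3, 3, 7, 5, 3, 6, 3, 4,
        4, 6, 5, 4, 5, 3, 7, 5, 5, 4, 2, 6, 5, 6, 6]

m = mean(data)
s = stdev(data)  # assumption: sample standard deviation

# Z score: distance from the mean in units of standard deviation
z_scores = [(x - m) / s for x in data]

# Flag any observation whose Z score falls outside the open interval (-2, 2)
outliers = [x for x, z in zip(data, z_scores) if abs(z) >= 2]

print(f"smallest z = {min(z_scores):.3f}")
print(f"largest z  = {max(z_scores):.3f}")
print(f"outliers   = {outliers}")
```

The extreme Z scores are about -1.89 (for the value 2) and 1.79 (for the value 7), so no observation is flagged.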