What is mean, mode, standard deviation, and error?
Solution:
a) Mean: The arithmetic mean is the central value of a discrete set of numbers. The mean (average) of a data set is found by adding all the numbers in the data set and then dividing by the number of values in the data set.
It is also called the 'average'; the analogous quantity for a random variable is its 'expected value'.
Ex: A data set consists of 5 numbers: 1, 2, 3, 4, 5. Then its Mean = (1+2+3+4+5)/5 = 3
---> The mean can be used for both continuous and discrete numeric data.
---> The mean cannot be calculated for categorical data, as the values cannot be summed.
---> As the mean includes every value from the distribution, it is influenced by outliers and skewed distributions.
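1) Mean for ungrouped data:
Mean = (x1 + x2 + ... + xn) / n
where x1, ..., xn = the observations
n = number of observations
2) Mean for grouped data:
Mean (x̄) = Σ(fi * xi) / Σfi
where xi = midpoint (class mark) of the i-th class
fi = frequency of the i-th class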
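As a rough illustration of the ungrouped and grouped means, here is a minimal Python sketch. The grouped classes and the helper name grouped_mean are hypothetical, chosen only for this example; the grouped version assumes each class is represented by its midpoint.

```python
# Ungrouped data: the example set from the text.
values = [1, 2, 3, 4, 5]
ungrouped_mean = sum(values) / len(values)

# The same set with one outlier shows how strongly the mean can shift.
with_outlier = [1, 2, 3, 4, 100]
outlier_mean = sum(with_outlier) / len(with_outlier)

# Grouped data (hypothetical classes, for illustration only):
# each entry is (lower limit, upper limit, frequency).
classes = [(0, 10, 2), (10, 20, 5), (20, 30, 3)]


def grouped_mean(classes):
    """Mean = sum(f * x) / sum(f), where x is the class midpoint."""
    total_freq = sum(f for _, _, f in classes)
    weighted_sum = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
    return weighted_sum / total_freq


print(ungrouped_mean)         # 3.0
print(outlier_mean)           # 22.0
print(grouped_mean(classes))  # 16.0
```

Replacing a single value (5 with 100) moves the mean from 3 to 22, which is the sensitivity to outliers noted above.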
b) Mode: The mode is the most frequently occurring value in a distribution.
Ex: The mode of the data set 1, 1, 2, 3, 3, 3, 3, 4, 5 is 3
---> It has an advantage over the mean and median, as it can be calculated for both numerical and categorical data.
---> The mode may not always reflect the 'centre of the distribution'.
---> A data set can have more than one mode.
If a data set has 1 mode - unimodal data
If a data set has 2 modes - bimodal data
If a data set has more than 2 modes - multimodal data
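The classification by number of modes can be sketched in Python with statistics.multimode from the standard library (Python 3.8 or later). The helper describe_modes and the colour data are hypothetical, used only for illustration.

```python
from statistics import multimode


def describe_modes(data):
    """Return the list of modes and a label based on how many there are."""
    modes = multimode(data)  # all values tied for the highest frequency
    if len(modes) == 0:
        label = "no mode"  # empty data set
    elif len(modes) == 1:
        label = "unimodal"
    elif len(modes) == 2:
        label = "bimodal"
    else:
        label = "multimodal"
    return modes, label


# Numeric example from the text.
print(describe_modes([1, 1, 2, 3, 3, 3, 3, 4, 5]))  # ([3], 'unimodal')

# Categorical data also works, since only frequencies are counted.
print(describe_modes(["red", "blue", "red", "green", "blue"]))  # (['red', 'blue'], 'bimodal')
```

Note that when every value occurs exactly once, multimode returns all of them, so this sketch would label such data multimodal.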
1) Mode for grouped data:
Mode = L + [ (f1-f0) / (2f1-f0-f2) ] * h
where L = lower limit of the modal class
h = class width (size) of the modal class
f0 = frequency of the class preceding the modal class
f1 = frequency of the modal class
f2 = frequency of the class succeeding the modal class
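The grouped-data mode formula can be tried out with a short Python sketch. The class table and the function name grouped_mode are hypothetical. The sketch assumes classes of equal width listed in ascending order, and it treats a missing neighbouring class as having frequency 0, which is a common convention rather than something stated in the text.

```python
def grouped_mode(classes):
    """Mode = L + (f1 - f0) / (2*f1 - f0 - f2) * h for the modal class.

    classes: list of (lower limit, upper limit, frequency) tuples,
    sorted by lower limit and assumed to have equal width.
    """
    freqs = [f for _, _, f in classes]
    i = freqs.index(max(freqs))  # modal class (first one if there is a tie)
    lower, upper, f1 = classes[i]
    h = upper - lower            # class width
    f0 = freqs[i - 1] if i > 0 else 0               # preceding class
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0  # succeeding class
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula is undefined when 2*f1 - f0 - f2 == 0")
    return lower + (f1 - f0) / denominator * h


# Hypothetical frequency table: the modal class is 20-30 with f1 = 12.
classes = [(0, 10, 3), (10, 20, 7), (20, 30, 12), (30, 40, 8), (40, 50, 5)]
print(round(grouped_mode(classes), 2))  # 25.56
```

Here L = 20, h = 10, f0 = 7, f1 = 12 and f2 = 8, so Mode = 20 + (5/9) * 10, which is about 25.56.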
c) Standard deviation: It is a measure of the amount of variation (or dispersion) in a set of values.
---> If a data set has a low standard deviation, the values tend to be close to the mean.
---> If a data set has a high standard deviation, the values are spread out over a wider range.
---> As the standard deviation includes every value from the distribution, it is influenced by outliers and skewed distributions.
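1) Standard deviation for ungrouped data:
σ = sqrt( Σ(xi - x̄)^2 / N )
where x̄ = mean of the data
N = number of values
2) Standard deviation for grouped data:
σ = sqrt( Σ fi * (xi - x̄)^2 / Σfi )
where xi = midpoint of the i-th class
fi = frequency of the i-th class
x̄ = mean of the grouped data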
[ Note: The sample standard deviation is calculated from a random sample drawn from a population. It is denoted by s, and the denominator in its formula is N-1 instead of N. ]
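To make the difference between the N and N-1 denominators concrete, here is a minimal Python sketch of the population, sample, and grouped standard deviations. The function names and the grouped classes are hypothetical; the grouped version assumes each class is represented by its midpoint. The standard-library functions statistics.pstdev and statistics.stdev are used only as a cross-check.

```python
import math
from statistics import pstdev, stdev


def population_sd(data):
    """Square root of the mean squared deviation (denominator N)."""
    n = len(data)
    mean = sum(data) / n
    return math.sqrt(sum((x - mean) ** 2 for x in data) / n)


def sample_sd(data):
    """Same as above, but with denominator N - 1."""
    n = len(data)
    mean = sum(data) / n
    return math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))


def grouped_sd(classes):
    """Population standard deviation from (lower, upper, frequency) classes."""
    total_freq = sum(f for _, _, f in classes)
    mean = sum(f * (lo + hi) / 2 for lo, hi, f in classes) / total_freq
    squared = sum(f * ((lo + hi) / 2 - mean) ** 2 for lo, hi, f in classes)
    return math.sqrt(squared / total_freq)


values = [1, 2, 3, 4, 5]
print(population_sd(values), pstdev(values))  # both approximately 1.414
print(sample_sd(values), stdev(values))       # both approximately 1.581

# Hypothetical grouped data: (lower limit, upper limit, frequency).
classes = [(0, 10, 2), (10, 20, 5), (20, 30, 3)]
print(grouped_sd(classes))                    # 7.0
```

For the same data the sample value is always slightly larger than the population value, because it divides by N-1 rather than N.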
d) Error: Error is the difference between a value obtained from a data collection process and the true value for the population. The greater the error, the less representative the data are of the population.
Data can be strongly affected by two types of errors