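The formula for a sample Standard Deviation is
$s = \sqrt{\dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$
where $\bar{x}$ is the sample mean and $n$ is the sample size.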
Say we want to use standard deviation as a way of comparing the amount of spread present in each of two different distributions.
What is the effect of squaring the deviations? (1 mark)
How does this help us when we compare the spreads of two distributions? (1 mark)
With reference to the formula and the magnitude of data values, explain why introducing an outlier to a dataset affects the Standard Deviation more than it would affect the Inter-Quartile Range.
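1. Squaring makes every deviation non-negative, so positive and negative deviations cannot cancel each other out, and it gives large deviations proportionally more weight. The square of the standard deviation is equal to the variance, i.e. $\mathrm{Var}(x) = \mathrm{SD}(x)^2$.
2. Because the squared deviations cannot cancel, a larger variance (or standard deviation) always indicates more spread, so variance can be used to compare the spread of two distributions or samples.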
F-test to Compare Two Population Variances:
To compare the variances of two quantitative variables, the hypotheses of interest are:
$H_0: \sigma_1^2 = \sigma_2^2$
and
$H_1: \sigma_1^2 \neq \sigma_2^2$
i.e.
H0 : The variances of the two populations are the same.
H1 : The variances of the two populations are not the same.
Test statistic:
$F = \dfrac{s_1^2}{s_2^2}$
where $s_1^2$ and $s_2^2$ are the sample variances. The more this ratio deviates from 1, the stronger the evidence for unequal population variances.
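Critical Region: The hypothesis that the two variances are equal is rejected if $F > F(\alpha/2,\, N_1-1,\, N_2-1)$ or, in the lower tail, if $F < F(1-\alpha/2,\, N_1-1,\, N_2-1)$
where $F(\alpha/2,\, N_1-1,\, N_2-1)$ is the upper critical value of the F distribution with $N_1-1$ and $N_2-1$ degrees of freedom at a significance level of $\alpha$, and $N_1$ and $N_2$ are the two sample sizes.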
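As a rough illustration of the test statistic above, the following Python sketch computes the ratio of two sample variances and the two degrees of freedom using only the standard library. The function name `f_statistic` and the two small datasets are made up for this example. The critical value is not computed here; it would be looked up in an F table (or obtained from a statistics package) for the chosen significance level.

```python
from statistics import variance


def f_statistic(sample1, sample2):
    """Return (F, df1, df2), where F = s1^2 / s2^2 is the ratio of sample variances."""
    s1_sq = variance(sample1)  # sample variance, divisor n - 1
    s2_sq = variance(sample2)
    return s1_sq / s2_sq, len(sample1) - 1, len(sample2) - 1


# Hypothetical data, for illustration only
a = [2, 4, 4, 4, 5, 5, 7, 9]
b = [3, 4, 4, 5, 5, 5, 6, 6]

F, df1, df2 = f_statistic(a, b)
print(f"F = {F:.3f} with {df1} and {df2} degrees of freedom")
# Compare F with the critical values of the F distribution for df1 and df2
# at the chosen significance level to decide whether to reject H0.
```

A ratio close to 1 gives little evidence against H0; how far from 1 the ratio must be depends on the sample sizes through the degrees of freedom.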
3. The formula for IQR is
Interquartile range = Q3 - Q1
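where Q1 is the first quartile (25th percentile) and Q3 is the third quartile (75th percentile) of the ordered data.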
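The IQR is resistant to outliers because it depends only on the order of the values around the middle 50% of the data, so the magnitude of an extreme value never enters the calculation. The standard deviation is sensitive to extreme values because every data value enters the formula, and an outlier's large deviation from the mean is squared, which magnifies its contribution to the sum.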
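The contrast can be checked numerically. The sketch below uses a small invented dataset and adds one extreme value of 100 to it, then compares how much the sample standard deviation and the IQR change. The quartiles come from Python's `statistics.quantiles` with its default method; other quartile conventions would give slightly different IQR values, but the same overall picture.

```python
from statistics import mean, quantiles, stdev


def iqr(data):
    """Interquartile range Q3 - Q1, using the default quartile method."""
    q1, _, q3 = quantiles(data, n=4)
    return q3 - q1


# Hypothetical data, for illustration only
original = [10, 12, 13, 15, 16, 18, 20]
with_outlier = original + [100]

sd_before, sd_after = stdev(original), stdev(with_outlier)
iqr_before, iqr_after = iqr(original), iqr(with_outlier)

# Share of the total sum of squared deviations contributed by the outlier alone
m = mean(with_outlier)
squared_devs = [(x - m) ** 2 for x in with_outlier]
outlier_share = squared_devs[-1] / sum(squared_devs)

print(f"SD:  {sd_before:.2f} -> {sd_after:.2f}")
print(f"IQR: {iqr_before:.2f} -> {iqr_after:.2f}")
print(f"Outlier's share of squared deviations: {outlier_share:.0%}")
```

In this example the single outlier accounts for most of the sum of squared deviations, which is why the standard deviation grows several-fold while the IQR moves only slightly.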