When calculating variance, we square the difference between each data point and the mean in a set of numbers. Why do we do this?
A. The deviation scores will sum to zero and cancel each other out if they are not squared first.
B. We need to even out the numbers to make them easier to handle.
C. Squaring the difference between a data point and the mean is the way to calculate deviance.
D. Larger numbers are easier to calculate than smaller numbers.
The variance is a squared value because squaring is convenient. To calculate it, you first determine the mean of the data, then figure out how far each data point is from that mean, counting points to the right of the mean as positive and points to the left of the mean as negative.
Add them all together and divide by the number of values you have, and you supposedly have a measure of distance from the mean, or "spread".
But here a problem arises. The positive and negative distances exactly balance around the mean, so when you add them all together with their signs, you get zero every time, no matter how many data points you have or how spread out they are.
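A short derivation shows why the signed distances always cancel. The notation is an assumption added for illustration: x_i for the data points, n for their count, and \bar{x} for the mean.

```latex
\[
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
\quad\Longrightarrow\quad
\sum_{i=1}^{n} (x_i - \bar{x})
  = \sum_{i=1}^{n} x_i - n\bar{x}
  = n\bar{x} - n\bar{x}
  = 0
\]
```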
So, we perform a mathematical trick that works out nicely: we square each distance before adding them together, which removes the negative signs. Using absolute values would also remove the signs and might seem like a simpler solution, but squared values are easier to work with mathematically and give extra weight to points that lie far from the mean.
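The steps described so far can be sketched in a few lines of Python. The data values are made up for illustration, and the division by the number of values follows the description above (the population form of the variance).

```python
from statistics import fmean, pvariance

# Illustrative data (an assumption, not from the question); the mean is 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = fmean(data)

# Signed distances from the mean: negative to the left, positive to the right.
deviations = [x - mean for x in data]

# Adding the signed distances always gives zero.
signed_sum = sum(deviations)

# Squaring removes the signs, so the sum no longer cancels.
squared = [d ** 2 for d in deviations]
variance = sum(squared) / len(data)

print(signed_sum)       # 0.0
print(variance)         # 4.0
print(pvariance(data))  # the standard library's population variance agrees
```

Here the signed distances are -3, -1, -1, -1, 0, 0, 2 and 4, which cancel exactly, while their squares add up to 32 and give a variance of 4.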
The benefits of squaring include:
· Squaring always gives a positive value, so the sum will not be zero.
· Squaring emphasizes larger differences: a data point twice as far from the mean contributes four times as much to the variance.
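To illustrate the second point, the sketch below compares two made-up data sets that have the same average absolute distance from the mean but different variances. The helper names `mean_abs_dev` and `population_variance` are hypothetical, chosen only for this example.

```python
from statistics import fmean

def mean_abs_dev(values):
    """Average absolute distance from the mean (hypothetical helper)."""
    m = fmean(values)
    return sum(abs(x - m) for x in values) / len(values)

def population_variance(values):
    """Average squared distance from the mean (hypothetical helper)."""
    m = fmean(values)
    return sum((x - m) ** 2 for x in values) / len(values)

# Both sets are illustrative and have mean 5.
clustered = [3, 3, 7, 7]   # distances from the mean: -2, -2, 2, 2
stretched = [1, 5, 5, 9]   # distances from the mean: -4, 0, 0, 4

print(mean_abs_dev(clustered), mean_abs_dev(stretched))                # 2.0 2.0
print(population_variance(clustered), population_variance(stretched))  # 4.0 8.0
```

The absolute distances cannot tell the two sets apart, while the variance doubles for the set whose points lie farther out.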
Hence option A is the correct answer: it gives the strongest reason for squaring.