How would you explain the analysis of variance, assuming that your audience has not had a statistics class before?
ANOVA is a test that checks whether the classifications you use are useful for explaining or describing some characteristic of a group. In statistics, examples often help a lot in explaining a concept.
Suppose you were studying people's heights and wanted to check which demographic or biological characteristics might be useful for describing them. ANOVA would not explain why people are tall or short, but it would do a good job of showing whether heights cluster around some characteristic.
Imagine you are studying height and do not have good judgment about how to choose a sample. You decide to use a convenience sample instead of a random sample. You are friends with some NBA players: when you are not studying for graduate school, you tend bar near Orlando's Amway Center, and because you know so many of the Magic, they agree to let you measure their heights. Your wife is a kindergarten teacher, so she measures the height and records the sex of each of her students. Finally, you use the bar's security tape to measure the heights of the first ten people entering the bar, estimating them against the door.
There are a few ways you could group the data. First, you could compare men with women. Second, you could compare children with adults. Finally, you could divide the sample into NBA players, bar patrons, and kindergartners. Given the odd sample you chose, these three categories, rather than sex or age, are most likely the most explanatory source of variation. Of course, there is an issue of reverse causality here: people do not join the NBA to get taller, they join the NBA because they are tall. Likewise, people do not go to kindergarten because they are short; they are short because of their age.
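To make the comparison between groupings concrete, here is a minimal sketch in Python. The heights are invented purely for illustration (they are not measurements from the example above), and the helper names sum_sq and explained_share are hypothetical. The sketch computes, for each way of grouping, the share of the total variation in height that disappears once every person is compared with the average of their own group.

```python
from collections import defaultdict

# Hypothetical data, invented for illustration: (height in cm, sex, category).
people = [
    (206, "M", "NBA player"), (211, "M", "NBA player"), (198, "M", "NBA player"),
    (178, "M", "bar patron"), (165, "F", "bar patron"), (172, "F", "bar patron"),
    (112, "M", "kindergartner"), (109, "F", "kindergartner"), (115, "F", "kindergartner"),
]


def sum_sq(values):
    """Sum of squared distances from the mean of the values."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def explained_share(heights, labels):
    """Share of the total variation that goes away once people are grouped by label."""
    groups = defaultdict(list)
    for height, label in zip(heights, labels):
        groups[label].append(height)
    within = sum(sum_sq(group) for group in groups.values())
    return 1 - within / sum_sq(heights)


heights = [p[0] for p in people]
by_sex = explained_share(heights, [p[1] for p in people])
by_category = explained_share(heights, [p[2] for p in people])
print(f"sex explains {by_sex:.0%}, category explains {by_category:.0%}")
```

With made-up numbers like these, the three categories account for far more of the variation than sex does, which is the pattern the example describes.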
ANOVA tells you whether a set of characteristics reduces the amount of unexplained variation when you use the groupings, compared with ignoring the groupings altogether. There are times when it is better to drop the characteristics.
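As a rough sketch of that comparison, the Python below works through a one-way ANOVA by hand. The heights are hypothetical numbers invented for illustration, and the variable names are my own. It measures the variation with the groupings ignored, the variation left over once each person is compared with their own group's average, and the F statistic that ANOVA builds from the two.

```python
# Hypothetical heights in centimeters, invented purely for illustration.
groups = {
    "NBA player": [206, 211, 198],
    "bar patron": [178, 165, 172],
    "kindergartner": [112, 109, 115],
}


def mean(values):
    return sum(values) / len(values)


all_heights = [h for group in groups.values() for h in group]
grand_mean = mean(all_heights)

# Variation in height if the groupings are ignored entirely.
ss_total = sum((h - grand_mean) ** 2 for h in all_heights)
# Variation still unexplained after comparing each person with their own group mean.
ss_within = sum((h - mean(group)) ** 2 for group in groups.values() for h in group)
# Variation accounted for by the groupings.
ss_between = ss_total - ss_within

# Degrees of freedom: number of groups minus 1, and number of people minus number of groups.
df_between = len(groups) - 1
df_within = len(all_heights) - len(groups)

# F statistic: explained variation per degree of freedom over unexplained variation per degree of freedom.
f_stat = (ss_between / df_between) / (ss_within / df_within)
print(f"total={ss_total:.1f} within={ss_within:.1f} F={f_stat:.1f}")
```

A large F value suggests the groupings remove much more variation than chance alone would; a value near 1 suggests the groupings add little and could be dropped. In practice a statistics package would also report a p-value; for example, SciPy's scipy.stats.f_oneway runs the same kind of one-way test.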