We discussed Simpson’s Paradox in class. Describe this paradox and why it occurs. Describe one example (your own, not one presented in class or in your textbook) where this paradox occurs and might lead to poor decision making.
Explain how correlation and regression are related concepts. Why is regression impossible without correlation? How does the strength of a correlation directly influence the outcome of a regression equation? Use your own specific example to illustrate these concepts.
Briefly describe the role of chance and probability in hypothesis testing. What do we mean when we say our results are statistically significant or not? What are we essentially saying about the probability of obtaining our sample results due to chance alone? Use your own, specific example to illustrate these concepts.
Describe the three pitfalls associated with risk statistics as described in your book, using your own unique examples to illustrate the importance of each pitfall.
Researchers found a correlation between taking fish oil supplements and having clear skin. Describe three reasons this correlation might exist.
We discussed Simpson's Paradox in class:
The phenomenon occurs:
1) when two or more distinct groups each show a trend on some measure, but the trend disappears or reverses when the groups are combined and the measure is viewed for everyone together, or
2) when a comparison between groups on a statistical measure (e.g., the mean) reverses once the data are broken down by an additional variable.
In case 1), it happens because the groups carry unequal weights (one is much larger than the other). Even if both groups trend in the same direction, the larger group dominates the combined figure; if it changes much more slowly than the smaller group, or sits at a very different level, the combined trend can point in the opposite direction.
In case 2), the groups are spread unevenly across the levels of the new variable, so aggregating over that variable hides, or even reverses, the difference that is visible within each level.
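The weighting effect described above can be sketched with a small calculation. The counts below are made up purely for illustration (they are not from any study), and the option names X and Y and the helper functions are hypothetical. Option X has the higher success rate inside each group, yet option Y looks better once the groups are pooled, because most of Y's trials fall in the group where success is common.

```python
from fractions import Fraction

# Made-up counts for illustration only: (successes, trials) per option and group.
counts = {
    "group 1": {"X": (90, 100), "Y": (800, 1000)},
    "group 2": {"X": (300, 1000), "Y": (20, 100)},
}


def rate(successes, trials):
    """Exact success rate as a fraction."""
    return Fraction(successes, trials)


def within_group_winners(data):
    """For each group, return the option with the higher success rate."""
    return {
        group: max(options, key=lambda name: rate(*options[name]))
        for group, options in data.items()
    }


def pooled_rates(data):
    """Add successes and trials across groups, then compute one rate per option."""
    totals = {}
    for options in data.values():
        for name, (successes, trials) in options.items():
            prev_s, prev_n = totals.get(name, (0, 0))
            totals[name] = (prev_s + successes, prev_n + trials)
    return {name: rate(s, n) for name, (s, n) in totals.items()}


winners = within_group_winners(counts)
pooled = pooled_rates(counts)
pooled_winner = max(pooled, key=pooled.get)

print("Winner within each group:", winners)
print("Pooled rates:", {name: round(float(r), 3) for name, r in pooled.items()})
print("Winner after pooling:", pooled_winner)
```

With these invented numbers X wins inside both groups (0.90 vs 0.80 and 0.30 vs 0.20), while the pooled rates are about 0.355 for X and 0.745 for Y, so the comparison reverses.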
Example:
One example from sports is the Simpson's paradox that can arise within a set of tennis.
Given the scoring system in tennis, a player can at best win a set by 24 points to 0 (six games won to love). On the other hand, a player can lose a tightly contested set while winning more points than his opponent. If he wins all six of his service games to love and loses each of his opponent's service games after reaching deuce, he is 12 points ahead going into the tie-break; losing the tie-break by two points still leaves him 10 points ahead in a set he has lost.
So, during a live match, betting on a player by looking only at the points trend for the two players, on the assumption that the one with more points is more likely to win, could lead to a poor decision. That assumption can in fact be wrong: the player who has won fewer points could still win the match, which is a case of Simpson's paradox.
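The point arithmetic in the tennis example can be checked with a short sketch. It assumes each lost return game ends at the first deuce (3 points won, 5 lost) and that the tie-break is lost 5-7; both scores and the function name are assumptions chosen for illustration. A longer deuce game or a longer tie-break adds equal points to both sides until the final two-point gap, so the margin does not change.

```python
def set_points(service_games=6, return_games=6, tiebreak=(5, 7)):
    """Total points won by the set loser (A) and the set winner (B).

    Assumptions: A wins every service game to love (4-0), loses every
    return game at the first deuce (3-5), and loses the tie-break by
    the given score (A's points first).
    """
    a_points = service_games * 4 + return_games * 3 + tiebreak[0]
    b_points = service_games * 0 + return_games * 5 + tiebreak[1]
    return a_points, b_points


# Points after twelve games, before the tie-break is played.
before_tiebreak = set_points(tiebreak=(0, 0))

# Points for the whole set, including the lost tie-break.
a_points, b_points = set_points()

print("Before tie-break (A, B):", before_tiebreak)
print("Whole set (A, B):", (a_points, b_points))
print("A's margin in a set A lost:", a_points - b_points)
```

Under these assumptions player A leads 42 points to 30 after twelve games and finishes the set with 47 points to 37, while losing it 6-7.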