| County | Fat Intake (g/day) | Age-adjusted death rate (per 100,000 population) |
| --- | --- | --- |
| Fresno | 140 | 2.1 |
| Kings | 120 | 1.9 |
| Tulare | 119 | 0.8 |
| Madera | 130 | 0.9 |
| Stanislaus | 100 | 1.0 |
| Merced | 121 | 1.2 |
| Kern | 98 | 1.3 |
| San Joaquin | 95 | 1.1 |
The following scatter plot is obtained from the data provided:
Some background on scatterplots: they are bivariate graphical devices, meaning they are constructed to show the type of association between two interval variables X and Y, here Fat Intake (X) and Age-adjusted death rate (Y), as shown above.
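Based on the scatterplot alone, we can say that there is at best a weak positive association between Fat Intake and the Age-adjusted death rate. For these eight counties, Pearson's correlation coefficient is about r = 0.43, and the points are widely scattered around any upward trend.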
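To put a number on what the scatterplot suggests, here is a minimal sketch that computes Pearson's correlation coefficient from the eight county values in the table. The variable names and the helper `pearson_r` are my own choices for illustration, not part of the original question.

```python
from math import sqrt

# County-level values copied from the table above
# (order: Fresno, Kings, Tulare, Madera, Stanislaus, Merced, Kern, San Joaquin).
fat_intake = [140, 120, 119, 130, 100, 121, 98, 95]        # g/day
death_rate = [2.1, 1.9, 0.8, 0.9, 1.0, 1.2, 1.3, 1.1]      # per 100,000


def pearson_r(x, y):
    """Pearson correlation: r = Sxy / sqrt(Sxx * Syy)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r = pearson_r(fat_intake, death_rate)
print(f"r = {r:.2f}")  # prints "r = 0.43"
```

With only eight data points, a coefficient of this size is weak evidence of a linear relationship, which matches the visual impression from the plot.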
This is generally not what one would expect, since the two variables should show a strong positive association. The main culprit for the discrepancy is confounding. Several other variables, including income, lifestyle, area of residence, and occupation, also greatly affect death rates. Until these effects are controlled for, we cannot get a clear picture of the real association between fat intake and death rate. Moreover, because the data are aggregated at the county level, details about individuals are lost, and drawing individual-level conclusions from such aggregates risks the ecological fallacy.
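As a rough illustration of how a confounder can mask an association, the sketch below uses made-up numbers, not the county data. It assumes two hypothetical strata of some confounder (for example, two income levels). Within each stratum the outcome rises perfectly with fat intake, but one stratum has a much lower baseline. The names `stratum_a`, `stratum_b`, and `pearson_r` are illustrative only.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation: r = Sxy / sqrt(Sxx * Syy)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# HYPOTHETICAL data: (fat intake, death rate) pairs in two confounder strata.
stratum_a = ([90, 100, 110], [1.6, 1.8, 2.0])    # lower intake, higher baseline
stratum_b = ([120, 130, 140], [0.8, 1.0, 1.2])   # higher intake, lower baseline

r_a = pearson_r(*stratum_a)
r_b = pearson_r(*stratum_b)

# Pooling the strata ignores the confounder entirely.
pooled_x = stratum_a[0] + stratum_b[0]
pooled_y = stratum_a[1] + stratum_b[1]
r_pooled = pearson_r(pooled_x, pooled_y)

print(f"within A: {r_a:.2f}, within B: {r_b:.2f}, pooled: {r_pooled:.2f}")
```

In this constructed case the association is perfectly positive within each stratum, yet the pooled correlation comes out negative. Real data are rarely this extreme, but the same mechanism can weaken an observed association when a confounder is not controlled for.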