M. S. Kanarek and associates studied the relationship between cancer rates and levels of asbestos in the drinking water, in 722 Census tracts around San Francisco Bay. They tested over 200 relationships – different types of cancer, different demographic groups, different ways of adjusting for possible confounding variables. After adjusting for age and socioeconomic status (but not smoking), they found a “strong relationship” between the rate of lung cancer among white males and the concentration of asbestos fibers in the drinking water: p < .001.
They found that a 100-fold increase in asbestos concentration was associated with a 1.05-fold increase in the lung cancer rate, on average. (This means: If tract B has 100 times the concentration of asbestos fibers in the water as tract A, and the lung cancer rate for white males in tract A is 1 per 1,000 persons per year, then tract B is predicted to have a rate of 1.05 per 1,000 persons per year.)
Are you convinced that asbestos in the drinking water causes lung cancer among white males, to a degree that steps should be taken to reduce it? Please point to at least three distinct flaws in this analysis.
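The results here are statistically significant, but the effect is not practically significant. A 100-fold increase in asbestos concentration is associated with only a 1.05-fold increase in the lung cancer rate: in the example, from 1 to 1.05 cases per 1,000 persons per year, an increase of just 0.05 cases per 1,000. Hence, the results do not justify taking steps to reduce asbestos in the drinking water.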
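The size of the effect can be checked with the numbers from the problem statement. The following is a minimal Python sketch; the variable names are illustrative, and the per-100,000 rescaling is only a convenience for reading the result.

```python
# Rates come from the worked example in the problem statement
# (cases per 1,000 persons per year).
baseline_rate = 1.00   # tract A
rate_ratio = 1.05      # associated with a 100-fold increase in asbestos concentration

predicted_rate = baseline_rate * rate_ratio          # tract B
absolute_increase = predicted_rate - baseline_rate   # extra cases per 1,000 per year
relative_increase_pct = (rate_ratio - 1.0) * 100.0   # percent change in the rate
extra_per_100k = absolute_increase * 100.0           # same increase per 100,000 per year

print(f"Predicted rate in tract B: {predicted_rate:.2f} per 1,000 per year")
print(f"Absolute increase:         {absolute_increase:.2f} per 1,000 per year")
print(f"Relative increase:         {relative_increase_pct:.0f}%")
print(f"Extra cases per 100,000:   {extra_per_100k:.0f} per year")
```

A 100-fold difference in exposure corresponds to a 5% relative increase, or roughly 5 extra cases per 100,000 persons per year. That is a small change for such a large difference in concentration.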
There are several flaws in the analysis.
1. Many hypotheses (over 200) were tested on the same dataset, so it is quite possible for at least one of them to appear significant by random chance alone. To make sure this finding is not just a chance event, the p-value should be corrected for multiple hypothesis testing, for example with the Bonferroni correction.
2. The parameters of the analysis were tweaked to obtain a significant result. In the scientific method, the hypothesis is stated before the data are even collected. Here, the cancer type, demographic group, and adjustment method appear to have been chosen after the fact, with no stated rationale for preferring some choices over others.
3. As pointed out earlier, the difference, though statistically significant (without correcting for multiple hypothesis testing), is not practically significant.
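The multiple-testing concern in the first point can be made concrete with a little arithmetic. The sketch below is a rough illustration under assumptions not stated in the problem: a conventional significance level of 0.05, exactly 200 tests (the problem says "over 200", so this is a lower bound), and independence between tests (which real cancer and demographic comparisons are unlikely to satisfy).

```python
# Assumptions (not from the study itself): alpha = 0.05, independent tests,
# and exactly 200 tests as a lower bound for "over 200 relationships".
n_tests = 200
alpha = 0.05
reported_p_bound = 0.001  # the study reports p < .001

# Expected number of "significant" results if every null hypothesis is true.
expected_false_positives = n_tests * alpha

# Bonferroni correction: each individual test must meet alpha / n_tests.
bonferroni_threshold = alpha / n_tests

# Chance that at least one of n_tests independent null tests gives p < 0.001.
prob_at_least_one = 1.0 - (1.0 - reported_p_bound) ** n_tests

print(f"Expected false positives at alpha={alpha}: {expected_false_positives:.0f}")
print(f"Bonferroni per-test threshold: {bonferroni_threshold:.5f}")
print(f"P(at least one p < {reported_p_bound} by chance): {prob_at_least_one:.3f}")
```

Under these assumptions, about 10 of the 200 tests would be expected to reach p < .05 by chance, and the Bonferroni threshold is 0.00025. The reported "p < .001" does not by itself show that the result clears that threshold, and there is roughly an 18% chance of seeing at least one p-value below .001 among 200 independent null tests.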