14. 8. Read the story below from NPR and then identify the very important concept. How does it relate to correlation and Chi-Square?
Analysis Finds Geographic Overlap In Opioid Use And Trump Support In 2016
June 23, 2018 8:02 AM ET
Paul Chisholm, NPR
In 2016, Donald Trump captured 68 percent of the vote in West Virginia, a state hit hard by opioid overdoses.
BRENDAN SMIALOWSKI/AFP/Getty Images
The fact that rural, economically disadvantaged parts of the country broke heavily for the Republican candidate in the 2016 election is well known. But Medicare data indicate that voters in areas that went for Trump weren't just hurting economically — many of them were receiving prescriptions for opioid painkillers.
The findings were published Friday in the medical journal JAMA Network Open. Researchers found a geographic relationship between support for Trump and prescriptions for opioid painkillers.
It's easy to see similarities between the places hardest hit by the opioid epidemic and a map of Trump strongholds. "When we look at the two maps, there was a clear overlap between counties that had high opioid use ... and the vote for Donald Trump," says Dr. James S. Goodwin, chair of geriatrics at the University of Texas Medical Branch in Galveston and the study's lead author. "There were blogs from various people saying there was this overlap. But we had national data."
Goodwin and his team looked at data from the Census Bureau, the 2016 election and Medicare Part D, a prescription drug program that serves the elderly and disabled.
To estimate the prevalence of opioid use by county, the researchers used the percentage of enrollees who had received prescriptions for a three-month or longer supply of opioids. Goodwin says that prescription opioid use is strongly correlated with illicit opioid use, which can be hard to quantify.
"There are very inexact ways of measuring illegal opioid use," Goodwin says. "All we can really measure with precision is legal opioid use."
Goodwin's team examined how a variety of factors could have influenced each county's rate of chronic opioid prescriptions. After correcting for demographic variables such as age and race, Goodwin found that support for Trump in the 2016 election closely tracked opioid prescriptions.
In counties with higher-than-average rates of chronic opioid prescriptions, 60 percent of the voters went for Trump. In the counties with lower-than-average rates, only 39 percent voted for Trump.
A lot of this disparity could be chalked up to social factors and economic woes. Rural, economically-depressed counties went strongly for Trump in the 2016 election. These are the same places where opioid use is prevalent. As a result, opioid use and support for Trump might not be directly related, but rather two symptoms of the same problem – a lack of economic opportunity.
To test this theory, Goodwin included other county-level factors in the analysis. These included factors such as unemployment rate, median income, how rural they are, education level, and religious service attendance, among others.
These socioeconomic variables accounted for about two-thirds of the link between voter support for Trump and opioid rates, the paper's authors write. However, socioeconomic factors didn't explain all of the correlation seen in the study.
"It very well may be that if you're in a county that is dissolving because of opioids, you're looking around and you're seeing ruin. That can lead to a sense of despair," Goodwin says. "You want something different. You want radical change."
For voters in communities hit hard by the opioid epidemic, the unconventional Trump candidacy may have been the change people were looking for, Goodwin says.
Dr. Nancy E. Morden, associate professor at the Dartmouth Institute for Health Policy and Clinical Practice, agrees. "People who reach for an opioid might also reach for ... near-term fixes," she says. "I think that Donald Trump's campaign was a promise for near-term relief."
Goodwin's study has limitations and can't establish that opioid use was a definitive factor in how people voted.
"With that kind of study design, you have to be cautious in terms of drawing any causal conclusions," cautions Elene Kennedy-Hendricks, an assistant scientist in the Department of Health Policy and Management at the Johns Hopkins Bloomberg School of Public Health. "The directionality is complicated."
Goodwin acknowledges that the study has shortcomings.
"We were not implying causality, that the Trump vote caused opioids or that opioids caused the Trump vote," he cautions. "We're talking about associations."
Still, the study serves as an interesting example highlighting the links between economic opportunity, social issues and political behavior.
"The types of discussions around what drove the '16 election, and the forces that were behind that, should also be included when people are talking about the opioid epidemic," Goodwin says.
The Chi-Square statistic is commonly used for testing relationships between categorical variables. The null hypothesis of the Chi-Square test is that no relationship exists between the categorical variables in the population; they are independent.
The Chi-Square statistic is most commonly used to evaluate Tests of Independence when using a cross tabulation (also known as a bivariate table). Cross tabulation presents the distributions of two categorical variables simultaneously, with the intersections of the categories of the variables appearing in the cells of the table. The Test of Independence assesses whether an association exists between the two variables by comparing the observed pattern of responses in the cells to the pattern that would be expected if the variables were truly independent of each other. Calculating the Chi-Square statistic and comparing it against a critical value from the Chi-Square distribution allows the researcher to assess whether the observed cell counts are significantly different from the expected cell counts.
The calculation of the Chi-Square statistic is quite straightforward and intuitive:
$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
where
$f_o$ = the observed frequency (the observed counts in the cells) and
$f_e$ = the expected frequency if NO relationship existed between the variables
As depicted in the formula, the Chi-Square statistic is based on the difference between what is actually observed in the data and what would be expected if there was truly no relationship between the variables.
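As a rough illustration of that calculation, the sketch below sums (observed - expected)^2 / expected over all cells. The function name `chi_square_statistic` and the cell counts are made up for the example; they are not taken from the study.

```python
def chi_square_statistic(observed, expected):
    """Sum (fo - fe)^2 / fe over all cells, given flat lists of cell values."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


# Hypothetical cell counts for a 2 x 2 table, listed row by row.
# These numbers are invented for illustration only.
observed = [30, 20, 20, 30]
expected = [25.0, 25.0, 25.0, 25.0]

print(chi_square_statistic(observed, expected))  # 4.0
```

Each cell here is 5 away from its expected count of 25, so each contributes 25 / 25 = 1 to the total.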
| Opioid Use | Trump | Other | Total |
|---|---|---|---|
| Yes | N11 | N12 | N11 + N12 |
| No | N21 | N22 | N21 + N22 |
| Total | N11 + N21 | N12 + N22 | N |
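To make the layout of the cross tabulation concrete, here is a small sketch that computes the row totals, column totals, and grand total N from the four inner cells. The helper name `add_margins` and the counts are hypothetical and are not data from the article.

```python
def add_margins(table):
    """Return (row totals, column totals, grand total) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return row_totals, col_totals, sum(row_totals)


# Hypothetical counts: rows = opioid use (Yes, No), columns = vote (Trump, Other).
table = [[30, 20],
         [20, 30]]

row_totals, col_totals, n = add_margins(table)
print(row_totals, col_totals, n)  # [50, 50] [50, 50] 100
```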
The first step is to state the null hypothesis and an alternative hypothesis.
H0: Opioid use and voting preference are independent.
Ha: Opioid use and voting preference are not independent.
For this analysis, the significance level is 0.05. Using sample data, we can conduct a chi-square test for independence.
The degrees of freedom (DF) is equal to:
DF = (r - 1) * (c - 1)
where r is the number of levels for one categorical variable, and c is the number of levels for the other categorical variable.
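For the 2 x 2 table above (opioid use Yes/No by vote Trump/Other), this gives DF = (2 - 1) * (2 - 1) = 1. A minimal sketch, with a hypothetical helper name and made-up counts:

```python
def degrees_of_freedom(table):
    """DF = (r - 1) * (c - 1) for a table with r rows and c columns."""
    r = len(table)     # levels of the row variable
    c = len(table[0])  # levels of the column variable
    return (r - 1) * (c - 1)


# Hypothetical 2 x 2 table of counts (the values do not affect DF).
print(degrees_of_freedom([[30, 20], [20, 30]]))  # 1
```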
The expected frequency counts are computed separately for each level of one categorical variable at each level of the other categorical variable. Compute r * c expected frequencies, according to the following formula.
$$f_{r,c} = \frac{n_r \cdot n_c}{n}$$
where $f_{r,c}$ is the expected frequency count for level r of Variable A and level c of Variable B, $n_r$ is the total number of sample observations at level r of Variable A, $n_c$ is the total number of sample observations at level c of Variable B, and $n$ is the total sample size.
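The expected-count formula can be sketched in a few lines: each expected cell is its row total times its column total, divided by the sample size. The function name `expected_frequencies` and the counts are assumptions made for this example only.

```python
def expected_frequencies(table):
    """Expected count for each cell under independence: (row total * column total) / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[nr * nc / n for nc in col_totals] for nr in row_totals]


# Hypothetical counts: rows = opioid use (Yes, No), columns = vote (Trump, Other).
print(expected_frequencies([[30, 20], [20, 30]]))  # [[25.0, 25.0], [25.0, 25.0]]
```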
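We decide whether to reject or fail to reject the null hypothesis by comparing the p-value of this test with the significance level. If the p-value is less than 0.05, we reject the null hypothesis and conclude that opioid use and voting preference are not independent; otherwise, we fail to reject it. Either way, the test only speaks to association, not to causation.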
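Putting the steps together, the sketch below runs the whole test on a 2 x 2 table using only the standard library. It relies on the fact that, for 1 degree of freedom, the upper-tail Chi-Square probability equals erfc(sqrt(x / 2)), so it does not apply to larger tables. The function name `chi_square_test_2x2` and the counts are hypothetical; the study itself used county-level data and a different kind of analysis.

```python
import math


def chi_square_test_2x2(table, alpha=0.05):
    """Chi-Square test of independence for a 2 x 2 table (DF = 1 only).

    Returns (statistic, p_value, reject_null).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / n  # expected count under independence
            statistic += (fo - fe) ** 2 / fe

    # Upper-tail probability of a Chi-Square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value, p_value < alpha


# Hypothetical counts: rows = opioid use (Yes, No), columns = vote (Trump, Other).
statistic, p_value, reject = chi_square_test_2x2([[30, 20], [20, 30]])
print(statistic, round(p_value, 4), reject)  # 4.0 0.0455 True
```

With these invented counts the p-value falls just under 0.05, so the null hypothesis of independence would be rejected. For tables larger than 2 x 2, a statistics library would be the more practical choice.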