A researcher intends to determine which variables are more likely to predict public assaults against the police in Kentucky. This variable is measured as the 2012-2016 average assault rate per 100,000 people. The following correlation matrix shows the relationship between assaults against the police in the state and the 2006-2010 average homicide rates per 100,000, and other socio-demographic variables at the county level in KY, based on 2010 US Census data. Relationships among the independent variables are included as well. Table 1 includes the Pearson correlation coefficients corresponding to bivariate relationships among five variables. The unit of analysis is the county (N = 120). Note: The analysis is based on real data. [15 points]
Table 1. Assaults against the police in Kentucky (N = 120): Intercorrelation matrix
| Measures | Police Assault | Poverty | Homicide Rate | AA% | FHH% |
|---|---|---|---|---|---|
| Assault against POLICE rate | 1 | | | | |
| Poverty level | -.297 | 1 | | | |
| Homicide rate | -.176 | .452 | 1 | | |
| % African Americans (AA%) | .316 | -.292 | .040 | 1 | |
| % Female Head of Households (FHH) | .325 | .291 | .199 | .408 | 1 |
a) Identify the dependent (DV) and the independent (IV) variables
b) Formulate the null and alternative hypotheses [Note: a null hypothesis should be formulated for each relationship the DV has with the selected IVs]. You should have 4 null hypotheses [see column 1 of the correlation matrix].
c) Knowing that N = 120 (the number of counties in KY), test the null hypotheses you’ve formulated [see the critical values of Pearson’s r, at the corresponding df and p< .05, 2-tail test]. You should list the value of the critical Pearson’s r.
d) Reach a statistical conclusion for each test [it should be 4 tests] that examines the bivariate relationship between the dependent variable and the selected independent variables and interpret your findings. What characteristics appear to have those counties that registered a higher incidence of public assaults against the police in KY? What variables are significantly related to assault against the police and what is the direction of the relationship?
e) What percentage of the variation in assaults against the police is explained by the percentage of female-headed households at the county level? [Hint: calculate the coefficient of determination to answer the question].
For extra-credit [up to 4 points]
f) The table also shows the relationship between percent African Americans at the county level and the homicide rate in the county. Based on the results included in the matrix, with an increase in the percentage of blacks would homicide rate decrease or increase? Is the relationship statistically significant? Justify your answer.
g) What characteristics have KY counties that have a higher percentage of female-headed households?
In the given problem, we are asked to determine which variables are more likely to predict public assaults against the police in Kentucky, the possible predictors being Poverty level, Homicide rate, % African Americans and % Female Head of Households:
a) The variable that depends on all the given factors (the probable predictors) is public assaults against the police in Kentucky. The variables that may influence it are the predictors.
Hence, the dependent variable is public assaults against the police in Kentucky. The independent variables are: Poverty level, Homicide rate, % African Americans and % Female Head of Households.
b) In order to attain our objective, we must first determine whether a relationship exists between the dependent variable and each predictor. Since all the given variables are continuous, the appropriate measure of correlation here is Pearson's correlation coefficient, which measures the strength and direction of a linear relationship.
Let $\rho_1, \rho_2, \rho_3, \rho_4$ denote the population Pearson correlation coefficients between public assaults against the police in Kentucky and each of the 4 predictors: Poverty level, Homicide rate, % African Americans and % Female Head of Households, respectively.
We may confirm a significant relationship between two variables if the correlation coefficient between them is significantly different from zero. By definition, r lies between -1 and 1, with negative and positive values representing negative and positive relationships respectively. Values close to ±1 represent a strong relationship, and values close to zero represent little or no linear correlation.
Hence, we may test:
1. $H_0: \rho_1 = 0$ vs $H_1: \rho_1 \neq 0$ (Poverty level)
2. $H_0: \rho_2 = 0$ vs $H_1: \rho_2 \neq 0$ (Homicide rate)
3. $H_0: \rho_3 = 0$ vs $H_1: \rho_3 \neq 0$ (% African Americans)
4. $H_0: \rho_4 = 0$ vs $H_1: \rho_4 \neq 0$ (% Female Head of Households)
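c)
The significance of a correlation coefficient r is tested using the following t-statistic:
$t = \dfrac{r\sqrt{n-2}}{\sqrt{1-r^2}}$
where the critical value of t can be obtained from the t table for n - 2 degrees of freedom. Here df = 120 - 2 = 118, and the two-tailed critical value at the 5% level is approximately $t_{crit} = 1.980$.
Solving the t-statistic for r gives the critical value of Pearson's r:
$r_{crit} = \dfrac{t_{crit}}{\sqrt{t_{crit}^2 + df}} = \dfrac{1.980}{\sqrt{1.980^2 + 118}} \approx 0.179$
Because the critical value depends only on n and the significance level, the same critical value, 0.179, applies to the correlation between public assaults against the police in Kentucky and each of the four predictors (Poverty level, Homicide rate, % African Americans and % Female Head of Households).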
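The critical value of 0.179 can be checked numerically. The following is a minimal sketch, not part of the original solution: it assumes a tabulated two-tailed critical t of about 1.980 for df = 118 at the 5% level, and the helper name `critical_r` is hypothetical.

```python
import math

N = 120            # number of KY counties (from the problem statement)
DF = N - 2         # degrees of freedom for testing a Pearson correlation
T_CRIT = 1.980     # ASSUMPTION: approximate two-tailed t-table value, alpha = .05


def critical_r(t_crit: float, df: int) -> float:
    """Convert a critical t value into the equivalent critical Pearson r.

    Obtained by solving t = r * sqrt(df) / sqrt(1 - r^2) for r.
    """
    return t_crit / math.sqrt(t_crit ** 2 + df)


r_crit = critical_r(T_CRIT, DF)
print(f"df = {DF}, critical r = {r_crit:.3f}")
```

Any observed correlation whose absolute value exceeds this threshold is significant at the 5% level for a two-tailed test.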
d) Comparing the observed r values with the critical value, we reject a null hypothesis if |r| > r_crit = 0.179:
The coefficients for Poverty level, % African Americans and % Female Head of Households have absolute values |r| = 0.297, 0.316 and 0.325 respectively, all of which lie in the rejection (critical) region. For Homicide rate, |r| = 0.176 < 0.179, so we fail to reject the null hypothesis.
We may therefore conclude that public assaults against the police in Kentucky are significantly correlated with Poverty level, % African Americans and % Female Head of Households, but not with Homicide rate.
% African Americans (r = 0.316) and % Female Head of Households (r = 0.325) are significantly positively correlated with public assaults against the police in Kentucky, whereas poverty level (r = -0.297) shows a significant negative correlation. In other words, counties that registered a higher incidence of assaults against the police tend to have a larger percentage of African Americans, a larger percentage of female-headed households and lower poverty levels.
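The four decisions can be reproduced with a short script. This is an illustrative sketch only: the correlations come from column 1 of Table 1, the critical value 0.179 is the one used in the solution, and the names `police_assault_r`, `t_statistic` and `decisions` are hypothetical.

```python
import math

N = 120
R_CRIT = 0.179  # critical Pearson r used in the solution (df = 118, p < .05, 2-tail)

# Correlations of each predictor with the police assault rate (Table 1, column 1)
police_assault_r = {
    "Poverty level": -0.297,
    "Homicide rate": -0.176,
    "% African Americans": 0.316,
    "% Female Head of Households": 0.325,
}


def t_statistic(r: float, n: int) -> float:
    """t-statistic for testing H0: rho = 0 from a sample correlation r."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# True means "reject H0" (significant), False means "fail to reject H0"
decisions = {name: abs(r) > R_CRIT for name, r in police_assault_r.items()}

for name, r in police_assault_r.items():
    direction = "positive" if r > 0 else "negative"
    verdict = "reject H0" if decisions[name] else "fail to reject H0"
    print(f"{name}: r = {r:+.3f}, t = {t_statistic(r, N):+.2f}, "
          f"{direction}, {verdict}")
```

Comparing |r| with the critical r is equivalent to comparing |t| with the critical t, so either form of the test leads to the same decision.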
e) The percentage of variation explained can be computed using the coefficient of determination, which is the square of the correlation coefficient.
For the percentage of the variation in assaults against the police that is explained by the percentage of female-headed households at the county level,
$r^2 = (0.325)^2 = 0.1056$
Hence, the percentage of female-headed households at the county level explains about 10.6% of the variation in assaults against the police.
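As a quick arithmetic check, the coefficient of determination can be computed directly from the Table 1 correlation. The sketch below is illustrative only; the variable names are hypothetical.

```python
# Correlation between % female-headed households and the police assault rate (Table 1)
r_fhh = 0.325

# Coefficient of determination: the share of variance in one variable
# that is linearly associated with the other
r_squared = r_fhh ** 2
pct_explained = r_squared * 100

print(f"r^2 = {r_squared:.4f}, about {pct_explained:.1f}% of the variation")
```

The same squaring step applies to any other coefficient in the matrix if the share of variation explained by another predictor is needed.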
f) Based on the given data, Pearson's correlation coefficient between % African Americans at the county level and the homicide rate is 0.040. As the coefficient is positive, the homicide rate would tend to increase slightly as % African Americans increases. But since |r| = 0.040 < 0.179, we conclude that this correlation is not significant at the 5% level. Hence, the relationship stated above cannot be established.
g) Based on the given data, KY counties that have a higher percentage of female-headed households tend to have higher police assault rates (r = 0.325 > 0.179), a higher percentage of African Americans (r = 0.408 > 0.179), higher homicide rates (r = 0.199 > 0.179) and higher levels of poverty (r = 0.291 > 0.179). All four correlations are positive and significant at the 5% level.
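The checks in parts f) and g) amount to comparing entries of Table 1 with the same critical value. A minimal sketch follows, assuming the critical r of 0.179 from part c); the short variable labels and the names `pairs`, `significant` and `non_significant` are hypothetical.

```python
R_CRIT = 0.179  # critical Pearson r from part c) (df = 118, p < .05, 2-tail)

# Lower triangle of Table 1, keyed by (row variable, column variable)
pairs = {
    ("Poverty", "Police Assault"): -0.297,
    ("Homicide", "Police Assault"): -0.176,
    ("Homicide", "Poverty"): 0.452,
    ("AA%", "Police Assault"): 0.316,
    ("AA%", "Poverty"): -0.292,
    ("AA%", "Homicide"): 0.040,
    ("FHH%", "Police Assault"): 0.325,
    ("FHH%", "Poverty"): 0.291,
    ("FHH%", "Homicide"): 0.199,
    ("FHH%", "AA%"): 0.408,
}

# Split the pairs by whether |r| exceeds the critical value
significant = {p: r for p, r in pairs.items() if abs(r) > R_CRIT}
non_significant = {p: r for p, r in pairs.items() if abs(r) <= R_CRIT}

# Part g): every correlation involving FHH%
fhh_row = {p: r for p, r in pairs.items() if "FHH%" in p}

for (a, b), r in fhh_row.items():
    flag = "significant" if (a, b) in significant else "not significant"
    print(f"{a} vs {b}: r = {r:+.3f} ({flag})")
```

Under this threshold, only the pairs listed in `non_significant` fail to reach significance, which includes the AA% and homicide rate pair discussed in part f).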