A researcher at the University of Vermont was interested in studying the relationship between attorneys' decisions to prosecute cases and victims' credibility. She collected data on 4205 cases. Her two variables were victim's credibility (coded as 0 = low, 1 = medium, 2 = high) and case prosecuted (coded as 0 = no, 1 = yes). Of the 2713 cases prosecuted, the victim had high credibility in 1025 cases, medium credibility in 790 cases, and low credibility in 898 cases. Of the 1492 cases that were not prosecuted, the victim had high credibility in 541 cases, medium credibility in 745 cases, and low credibility in 206 cases.
Observed counts (columns: victim's credibility):
Case prosecuted | Low | Med | High
No | 206 | 745 | 541
Yes | 898 | 790 | 1025
a. What are the independent and dependent variable?
b. Is there a statistically significant relationship between victim's credibility and the prosecutor's decision to prosecute? (Be sure to write out the six steps.)
c. What type of error might you be making, and how could you reduce the likelihood of making that type of error?
d. What percentage of cases were prosecuted in this sample?
e. For those cases where the victims had high credibility, what percentage were prosecuted?
a:
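Independent variable: Victim's credibility
Dependent variable: Case prosecuted
(The study asks whether the victim's credibility influences the decision to prosecute, so credibility is the predictor and the prosecution decision is the outcome.)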
b:
Here we need to use the chi-square test of independence.
Hypotheses are:
H0: There is no relationship between victim's credibility and the prosecutor's decision to prosecute (the two variables are independent).
Ha: There is a relationship between victim's credibility and the prosecutor's decision to prosecute (the two variables are not independent).
Let the level of significance be $\alpha = 0.05$.
Degrees of freedom: df = (number of rows - 1)*(number of columns - 1) = (2-1)*(3-1) = 2
The critical value using the Excel function "=CHIINV(0.05,2)" is 5.991.
Rejection region:
If $\chi^2 > 5.991$, reject H0.
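The critical value can also be checked without Excel. The sketch below is not part of the original answer; it relies on the fact that, for exactly 2 degrees of freedom, the chi-square upper-tail probability is exp(-x/2), so the critical value is -2 ln(alpha). The variable names are illustrative, and the shortcut does not hold for other degrees of freedom.

```python
import math

alpha = 0.05       # significance level used in the answer
rows, cols = 2, 3  # prosecuted (no/yes) x credibility (low/med/high)
df = (rows - 1) * (cols - 1)

# Assumption: this closed form is valid only when df == 2, where the
# chi-square survival function is exp(-x / 2).
assert df == 2, "closed form below is valid only for df = 2"
critical_value = -2.0 * math.log(alpha)

print("df =", df)
print("critical value =", round(critical_value, 3))
```

For tables of any other size, a statistics library or a chi-square table would be needed instead of this shortcut.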
The following table shows the row and column totals (columns: victim's credibility):
Case prosecuted | Low | Med | High | Total
No | 206 | 745 | 541 | 1492
Yes | 898 | 790 | 1025 | 2713
Total | 1104 | 1535 | 1566 | 4205
Expected frequencies are calculated as follows:
$E = \dfrac{(\text{row total}) \times (\text{column total})}{\text{grand total}}$
For example, for the No/Low cell: E = (1492 * 1104) / 4205 = 391.717
The following table shows the expected frequencies (columns: victim's credibility):
Case prosecuted | Low | Med | High | Total
No | 391.717 | 544.642 | 555.641 | 1492
Yes | 712.283 | 990.358 | 1010.359 | 2713
Total | 1104 | 1535 | 1566 | 4205
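As a check on the expected-frequency table, here is a minimal sketch that recomputes each cell as (row total * column total) / grand total from the observed counts. The dictionary layout and variable names are illustrative choices, not part of the original answer.

```python
# Observed counts: rows are "case prosecuted", columns are credibility.
observed = {
    "No":  {"Low": 206, "Med": 745, "High": 541},
    "Yes": {"Low": 898, "Med": 790, "High": 1025},
}
levels = ["Low", "Med", "High"]

# Marginal totals and grand total.
row_totals = {r: sum(cells.values()) for r, cells in observed.items()}
col_totals = {c: sum(observed[r][c] for r in observed) for c in levels}
n = sum(row_totals.values())

# Expected count under independence for every cell.
expected = {
    r: {c: row_totals[r] * col_totals[c] / n for c in levels}
    for r in observed
}

for r in observed:
    print(r, [round(expected[r][c], 3) for c in levels])
```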
The following table shows the calculations for the chi-square test statistic:
O | E | (O-E)^2/E
206 | 391.717 | 88.05031206
898 | 712.283 | 48.42289383
745 | 544.642 | 73.70589885
790 | 990.358 | 40.53415852
541 | 555.641 | 0.385786652
1025 | 1010.359 | 0.212161104
Total: 4205 | 4205 | 251.311211
The test statistic is:
$\chi^2 = \sum \dfrac{(O-E)^2}{E} = 251.311$
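The test statistic can be reproduced with a short script. This is a sketch under the same assumptions as the hand calculation; it uses unrounded expected counts, so the result may differ from the table in the later decimal places. The p-value line uses the df = 2 shortcut exp(-x/2), which is an addition here and does not apply to other degrees of freedom.

```python
import math

# Rows: not prosecuted, prosecuted. Columns: low, medium, high credibility.
observed = [[206, 745, 541],
            [898, 790, 1025]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Sum (O - E)^2 / E over all six cells.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n
        chi_square += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)
assert df == 2, "p-value shortcut below is valid only for df = 2"
p_value = math.exp(-chi_square / 2)

print("chi-square =", round(chi_square, 3))
print("p-value =", p_value)
print("reject H0 at alpha = 0.05:", p_value < 0.05)
```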
Conclusion:
Since the test statistic (251.311) exceeds the critical value (5.991), it lies in the rejection region, so we reject the null hypothesis.
That is, we can conclude that there is a statistically significant relationship between victim's credibility and the prosecutor's decision to prosecute.
c)
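Since we rejected the null hypothesis, the error we might be making is a Type I error (rejecting a null hypothesis that is actually true). To reduce the likelihood of a Type I error, we need to use a smaller level of significance, for example 0.01 instead of 0.05.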
d)
The percentage of cases that were prosecuted in this sample is
(2713 / 4205) * 100% = 64.52%
e)
Among cases where the victim had high credibility, the percentage prosecuted is
(1025 / 1566) * 100% = 65.45%
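Both percentages in parts d and e come straight from the observed counts. A minimal sketch (names are illustrative) that computes the overall prosecution rate and the rate within each credibility level:

```python
# Observed counts keyed by credibility level.
counts = {
    "Low":  {"no": 206, "yes": 898},
    "Med":  {"no": 745, "yes": 790},
    "High": {"no": 541, "yes": 1025},
}

total_cases = sum(c["no"] + c["yes"] for c in counts.values())
total_prosecuted = sum(c["yes"] for c in counts.values())

# Part d: share of all cases that were prosecuted.
overall_pct = 100 * total_prosecuted / total_cases

# Part e (and the other levels): share prosecuted within each level.
pct_by_credibility = {
    level: 100 * c["yes"] / (c["no"] + c["yes"])
    for level, c in counts.items()
}

print("overall:", round(overall_pct, 2))
for level, pct in pct_by_credibility.items():
    print(level, round(pct, 2))
```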