A prison administrator is interested in examining the relationships between the type of prison security (maximum, medium, and minimum) and the number of previous offenses committed by an inmate. He believes that maximum security prisons have inmates with many prior offenses, medium security with not so many offenses, and minimum security with almost no prior offenses. He randomly selects eight new inmates from each of the three security levels and compares the number of offenses for which they have ever been charged.
Security Level
Maximum | Medium | Minimum
   8    |   4    |   2
   6    |   4    |   2
   4    |   3    |   3
   5    |   6    |   2
   3    |   5    |   4
   7    |   6    |   1
   6    |   3    |   2
   9    |   3    |   2
 Σ = 48 | Σ = 34 | Σ = 18
a) What are the independent and dependent variables? How are they measured (nominal, ordinal, interval, ratio)?
b) Using a significance level of .01, test the null hypothesis that the groups are equal against the alternative that at least one is different.
SST = SSB + SSW = 56.33 + 45 = 101.33
c) Determine which (if any) differences between means are significant (use a significance level of .01).
Tukey’s Honest Significant Difference Test:
d) How strong is the relationship (if any) between these two variables? Calculate the measure of association for these data. Interpret this value in words.
Ans) a) Here the independent variable is the type of prison security and the dependent variable is the number of previous offenses committed by an inmate. The independent variable is categorical and measured at the ordinal level (maximum, medium, minimum). The dependent variable is a count with a true zero, so it is measured at the ratio level.
b) Here the given significance level is 0.01. We need to test the null hypothesis that the three group means are equal against the alternative that at least one is different, so we apply a one-way ANOVA. Here we can use R software to get the desired results. The R code is given below:
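y <- c(8,6,4,5,3,7,6,9, 4,4,3,6,5,6,3,3, 2,2,3,2,4,1,2,2)  # offenses: maximum, medium, minimum
x <- as.factor(rep(1:3, each = 8))  # 1 = maximum, 2 = medium, 3 = minimum
anova <- aov(y ~ x)  # one-way ANOVA of offenses on security level
summary(anova)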
The summary table is as follows:
            Df Sum Sq Mean Sq F value   Pr(>F)
x            2  56.33  28.167   13.14 0.000199 ***
Residuals   21  45.00   2.143
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
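The critical value of the F statistic with 2 and 21 degrees of freedom is given by: qf(0.99,2,21) = 5.780416. The observed F statistic is 13.14, which is greater than the critical value (equivalently, the p-value 0.000199 is less than 0.01). So we reject the null hypothesis and conclude that at the 1% level of significance at least one group mean is different.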
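As a check, the sums of squares and the F statistic can also be computed by hand in R. This is only a sketch; the names g, ssb, ssw, sst, f_obs and f_crit are our own and are not part of the solution above.

```r
# Offense counts, listed as maximum, medium, minimum (eight inmates each)
y <- c(8,6,4,5,3,7,6,9, 4,4,3,6,5,6,3,3, 2,2,3,2,4,1,2,2)
g <- factor(rep(c("max", "med", "min"), each = 8))

grand_mean  <- mean(y)
group_means <- tapply(y, g, mean)     # one mean per security level
n_per_group <- tapply(y, g, length)   # 8 in each group

ssb <- sum(n_per_group * (group_means - grand_mean)^2)  # between groups
ssw <- sum((y - ave(y, g))^2)                           # within groups
sst <- sum((y - grand_mean)^2)                          # total

df_b <- nlevels(g) - 1           # 2
df_w <- length(y) - nlevels(g)   # 21
f_obs  <- (ssb / df_b) / (ssw / df_w)
f_crit <- qf(0.99, df_b, df_w)

round(c(SSB = ssb, SSW = ssw, SST = sst, F = f_obs, F_crit = f_crit), 2)
# SSB = 56.33, SSW = 45.00, SST = 101.33, F = 13.14, F_crit = 5.78
```

These values agree with SST = SSB + SSW = 56.33 + 45 = 101.33 given in the question.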
c) Now we have to find which groups are different from each other. For this we will apply Tukey's Honest Significant Difference test in R, with 99% confidence intervals to match the 0.01 significance level. The R code is given as:
TukeyHSD(anova,conf.level = 0.99)
The result is as follows:
Fit: aov(formula = y ~ x)
$`x`
     diff       lwr        upr     p adj
2-1 -1.75 -4.137015  0.6370148 0.0650195
3-1 -3.75 -6.137015 -1.3629852 0.0001278
3-2 -2.00 -4.387015  0.3870148 0.0320870
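From the above table we can observe that the only significant difference at the 0.01 level is between group 1 (maximum) and group 3 (minimum): its 99% interval (-6.14, -1.36) does not contain 0 and its adjusted p-value is 0.0001278. The intervals for groups 1 and 2 and for groups 2 and 3 both contain 0, and their adjusted p-values (0.065 and 0.032) are greater than 0.01, so those differences are not significant at this level.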
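Each pair can also be checked by comparing its absolute mean difference with Tukey's critical difference, HSD = q * sqrt(MSW / n). A minimal sketch, assuming equal group sizes of 8 and taking MSW = 45 / 21 from the ANOVA table; the variable names are ours.

```r
ms_within <- 45 / 21                              # mean square within groups
n         <- 8                                    # inmates per group
q_crit    <- qtukey(0.99, nmeans = 3, df = 21)    # studentized range quantile
hsd       <- q_crit * sqrt(ms_within / n)         # critical difference, about 2.39

# Group means from the column sums in the data table
means <- c(max = 48, med = 34, min = 18) / n      # 6.00, 4.25, 2.25

diffs <- c("max-med" = means[["max"]] - means[["med"]],
           "max-min" = means[["max"]] - means[["min"]],
           "med-min" = means[["med"]] - means[["min"]])

abs(diffs) > hsd   # TRUE only for max-min
```

Only the maximum versus minimum difference (3.75) exceeds the critical difference; the other two differences (1.75 and 2.00) fall short of it.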
d) Now we have to find a measure of association between these two variables. Here we calculate the correlation coefficient between them. For this we need to convert the independent variable to a numerical one (we can do this because the security levels are ordered: 1 = maximum, 2 = medium, 3 = minimum). The R code is given by:
cor(as.numeric(x),y)
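The value is -0.7450495. This value indicates that there is a strong association between these two variables. The sign is negative because we have assigned the lowest code (1) to the maximum level: as the code increases from maximum to minimum security, the number of prior offenses decreases. In words, inmates in maximum security prisons tend to have the most prior offenses and inmates in minimum security prisons the fewest.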
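With a one-way ANOVA, a commonly reported measure of association is eta-squared, SSB / SST, the proportion of the variation in the number of offenses that is accounted for by security level. The sketch below (variable names are ours) computes it next to the correlation used above.

```r
y <- c(8,6,4,5,3,7,6,9, 4,4,3,6,5,6,3,3, 2,2,3,2,4,1,2,2)
x <- factor(rep(1:3, each = 8))      # 1 = maximum, 2 = medium, 3 = minimum

sst <- sum((y - mean(y))^2)          # total sum of squares
ssw <- sum((y - ave(y, x))^2)        # within-group sum of squares
eta_sq <- (sst - ssw) / sst          # SSB / SST

r <- cor(as.numeric(x), y)           # correlation with the 1-2-3 coding

round(c(eta_sq = eta_sq, r = r, r_sq = r^2), 3)
# eta_sq = 0.556, r = -0.745, r_sq = 0.555
```

So about 56% of the variation in prior offenses is explained by security level. The squared correlation is almost the same because the group means (6.00, 4.25, 2.25) fall nearly in a straight line across the three levels.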