In a study of the effect of glucose on insulin release, 12 identical specimens of pancreatic tissue were divided into three groups of four specimens each. Three levels (low, medium, high) of glucose concentration were randomly assigned to the three groups, and each specimen within a group was treated with its assigned concentration of glucose. The amounts of insulin released by the tissue samples are as follows (Stata code for inputting the data is provided at the end of this document):

Concentration
Low     Medium  High
1.59    3.36    3.92
1.73    4.01    4.82
3.64    3.49    3.87
1.97    2.89    5.39

Designate the low, medium, and high levels as treatment 1, treatment 2, and treatment 3, respectively.

Calculate the following: the overall mean $\bar{Y}_{..}$; the treatment means $\bar{Y}_{i.}$, $i = 1, 2, 3$; and the treatment variances $s_i^2$, $i = 1, 2, 3$.

Graph the boxplot for the data. Based on the plot, would it be reasonable to assume that the variances are the same for the three populations of insulin release measurements?

Assume that the variances of the populations of insulin release measurements have a common value $\sigma^2$. Calculate an estimate for $\sigma^2$.

Describe the ANOVA assumptions as they pertain to these data.

Suppose that the ANOVA assumptions are satisfied by these data. Calculate the values of MST and MSE and test the null hypothesis $H_0: \mu_1 = \mu_2 = \mu_3$.

Use the Bonferroni, Scheffe, and Tukey pairwise multiple comparison methods, each at the 0.05 level, to compare the mean insulin releases at the three glucose concentrations. Write your conclusions on the basis of the results from each of the multiple comparison methods.
Answer: I solved this problem in R. The code is as follows:
y=c(1.59,3.36,3.92,1.73,4.01,4.82,3.64,3.49,3.87,1.97,2.89,5.39)
y1=c(1.59,1.73,3.64,1.97)
y2=c(3.36,4.01,3.49,2.89)
y3=c(3.92,4.82,3.87,5.39)
x=as.factor(rep(1:3,4))
y_mean=mean(y)
y1_mean=mean(y1)
y2_mean=mean(y2)
y3_mean=mean(y3)
c(y_mean,y1_mean,y2_mean,y3_mean)
y1_var=var(y1)
y2_var=var(y2)
y3_var=var(y3)
c(y1_var,y2_var,y3_var)
From this output, the overall mean is 3.3900; the treatment means for Low, Medium, and High are 2.2325, 3.4375, and 4.5000, respectively; and the treatment variances are 0.9050917, 0.2120917, and 0.5426000, respectively.
The code for the boxplot is boxplot(y~x), and the graph is given below:
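The boxplots for treatments 1, 2, and 3 show clearly different medians, but that is a difference in location, not in spread. The spreads also look different: treatment 1 is the most variable (sample variance 0.905) and treatment 2 the least (0.212), a ratio of about 4.3. With only four observations per group, however, a difference of this size in sample spread can easily arise by chance, so the plot alone does not rule out a common variance for the three populations.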
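As a complement to the visual check, a formal test of equal variances can be run. The sketch below uses Bartlett's test from base R, which is my choice and not part of the original answer; it assumes normal populations and has little power with four observations per group, so it should be read as a rough check only. The names var_ratio and bt are illustrative.

```r
# Data entered row by row from the table, so the treatments cycle 1, 2, 3
y <- c(1.59, 3.36, 3.92, 1.73, 4.01, 4.82, 3.64, 3.49, 3.87, 1.97, 2.89, 5.39)
x <- as.factor(rep(1:3, 4))

# Ratio of the largest to the smallest sample variance
group_var <- tapply(y, x, var)
var_ratio <- max(group_var) / min(group_var)
var_ratio

# Bartlett's test of H0: all three population variances are equal
bt <- bartlett.test(y ~ x)
bt
```

For these data the test should return a p-value well above 0.05, which gives no evidence against equal variances.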
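To apply ANOVA we have to assume that the 12 insulin release measurements are independent (supported here by the random assignment of glucose levels to separate specimens), that insulin release at each glucose level is normally distributed, and that the error variance is the same at all three levels.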
The R code for anova is given below:
anova=aov(y~x)
summary(anova)
The results are as follows:
            Df Sum Sq Mean Sq F value  Pr(>F)
x            2 10.297   5.148   9.305 0.00645 **
Residuals    9  4.979   0.553
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
From this table, the estimate of the common population variance is the residual mean square (MSE), which is 0.553.
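The same estimate can be obtained directly from the three sample variances, as a weighted average with weights equal to the degrees of freedom of each group. A minimal sketch, where n, s2, and pooled_var are illustrative names:

```r
y1 <- c(1.59, 1.73, 3.64, 1.97)  # low
y2 <- c(3.36, 4.01, 3.49, 2.89)  # medium
y3 <- c(3.92, 4.82, 3.87, 5.39)  # high

n  <- c(length(y1), length(y2), length(y3))  # group sizes
s2 <- c(var(y1), var(y2), var(y3))           # sample variances

# Pooled variance: sum of (n_i - 1) * s_i^2 divided by (N - k)
pooled_var <- sum((n - 1) * s2) / (sum(n) - length(n))
pooled_var
```

Because the groups are the same size, this is simply the average of the three sample variances, and it agrees with the residual mean square in the ANOVA table.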
For the given null hypothesis, MST is the treatment mean square, 10.297/2 = 5.148, and MSE is the residual mean square, 4.979/9 = 0.553. Their ratio is the F statistic, 5.148/0.553 = 9.305.
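The treatment and error mean squares can also be computed by hand from their definitions, which is a useful check on the aov() table. A minimal sketch; the variable names (ss_trt, ss_err, and so on) are illustrative:

```r
# Data entered row by row from the table, so the treatments cycle 1, 2, 3
y <- c(1.59, 3.36, 3.92, 1.73, 4.01, 4.82, 3.64, 3.49, 3.87, 1.97, 2.89, 5.39)
x <- as.factor(rep(1:3, 4))

k <- nlevels(x)   # number of treatments
N <- length(y)    # total number of observations

grp_mean <- tapply(y, x, mean)
grp_n    <- tapply(y, x, length)

# Treatment (between-group) and error (within-group) sums of squares
ss_trt <- sum(grp_n * (grp_mean - mean(y))^2)
ss_err <- sum((y - ave(y, x))^2)   # ave() gives each observation its group mean

mst <- ss_trt / (k - 1)
mse <- ss_err / (N - k)

f_stat  <- mst / mse
p_value <- pf(f_stat, k - 1, N - k, lower.tail = FALSE)
c(MST = mst, MSE = mse, F = f_stat, p = p_value)
```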
The p-value, 0.00645, is less than the 0.05 significance level, so we reject the null hypothesis and conclude at the 5% level that the treatment means are not all equal.
Now, to detect which treatment means differ, we conduct Tukey's pairwise multiple comparison test. The R code is:
TukeyHSD(anova)
and the results are :
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = y ~ x)
$`x`
      diff        lwr      upr     p adj
2-1 1.2050 -0.2634743 2.673474 0.1085173
3-1 2.2675  0.7990257 3.735974 0.0049932
3-2 1.0625 -0.4059743 2.530974 0.1629216
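From the result, only treatments 3 and 1 (high versus low) differ significantly in mean: their interval (0.799, 3.736) does not contain 0 and the adjusted p-value is 0.005. The intervals for 2-1 and 3-2 both contain 0 (adjusted p-values 0.109 and 0.163), so those pairs of means are not significantly different at the 0.05 level.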
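The question also asks for the Bonferroni and Scheffe methods, which the output above does not cover. The sketch below is one way to obtain them and is not part of the original answer. Bonferroni uses pairwise.t.test() from base R, which pools the standard deviation across groups by default. Base R has no Scheffe function, so that part is computed by hand from the usual Scheffe statistic for a pairwise contrast, t^2/(k - 1), referred to an F distribution with k - 1 and N - k degrees of freedom. The names bonf, pairs, and scheffe_p are illustrative.

```r
# Data entered row by row from the table, so the treatments cycle 1, 2, 3
y <- c(1.59, 3.36, 3.92, 1.73, 4.01, 4.82, 3.64, 3.49, 3.87, 1.97, 2.89, 5.39)
x <- as.factor(rep(1:3, 4))

# Bonferroni: pooled-SD pairwise t tests, p-values multiplied by the number of pairs
bonf <- pairwise.t.test(y, x, p.adjust.method = "bonferroni")
bonf$p.value

# Scheffe, computed by hand
fit    <- aov(y ~ x)
k      <- nlevels(x)
df_res <- df.residual(fit)
mse    <- sum(residuals(fit)^2) / df_res

means <- tapply(y, x, mean)
n     <- tapply(y, x, length)
pairs <- combn(levels(x), 2)   # columns are the pairs (1,2), (1,3), (2,3)

diffs <- as.numeric(means[pairs[2, ]] - means[pairs[1, ]])
se    <- as.numeric(sqrt(mse * (1 / n[pairs[2, ]] + 1 / n[pairs[1, ]])))

# Scheffe p-value: P(F(k-1, N-k) > t^2 / (k-1))
scheffe_p <- pf((diffs / se)^2 / (k - 1), k - 1, df_res, lower.tail = FALSE)
names(scheffe_p) <- paste(pairs[2, ], pairs[1, ], sep = "-")
scheffe_p
```

For these data both methods should lead to the same conclusion as Tukey: only the high versus low comparison (3-1) is significant at the 0.05 level, while 2-1 and 3-2 are not.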