A researcher wanted to study the effect of type of diet on weight gain in mice. The researcher was interested in 3 specific diets: a standard mouse chow diet, a “junk” food diet, and an organic diet. A total of 54 mice were randomly assigned to receive one of the 3 diets (18 in each group). The outcome (response) variable was weight gain (in grams) over a 1-month period. The 3 data sets provided below are for the same study conducted by different researchers from completely different labs. For the sake of comparison, we will assume that they all followed the exact same procedures and that the only difference is the samples. For each study (A, B, and C), determine whether it would be appropriate to use a one-way ANOVA to analyze the data. If it is, conduct the analysis and any additional analysis that might be necessary. If it does not seem appropriate to use a one-way ANOVA, use an appropriate alternative, including any post hoc tests. Remember, we are interested in determining whether the diet groups differ with respect to weight gain. Use a significance level of .05.
Study A
stand<-c(10.80, 11.19, 8.31, 11.24, 9.89, 9.88, 10.18, 11.28, 8.27, 11.69,
10.50, 12.53, 10.55, 10.24, 8.95, 11.29, 10.83, 9.94)
junk<-c(11.22, 11.27, 11.78, 11.67, 10.91, 11.91, 13.07, 11.85, 10.83, 11.18, 12.68,
11.68, 10.69, 11.40, 11.87, 12.89, 11.85, 12.33)
organic<-c(6.77, 9.23, 10.29, 8.78, 10.43, 10.80, 9.84, 11.24, 9.07, 10.39,
10.40, 9.11, 8.68, 10.03, 9.57, 11.69, 11.23, 10.28)
a. Based on graphs and statistical tests, does it seem reasonable to assume normality?
b. Based on statistical tests, does it seem reasonable to assume equal variances?
c. Given the evidence regarding the ANOVA assumptions, which statistical result would be used to determine whether there is an effect for type of diet?
d. Assuming there is an overall effect, which groups differ, if any?
stand2<-c(10.80, 11.19, 8.31, 11.24, 9.89, 9.88, 10.18, 11.28, 8.27, 11.69,
10.50, 12.53, 10.55, 10.24, 8.95, 11.29, 10.83, 9.94)
junk2<-c(11.22, 11.27, 11.78, 11.67, 10.91, 11.91, 13.07, 11.85, 10.83, 11.18, 12.68,
11.68, 10.69, 11.40, 11.87, 12.89, 11.85, 12.33)
organic2<-c(6.77, 9.23, 10.29, 8.78, 10.43, 10.80, 9.84, 11.24, 9.07, 10.39,
10.40, 9.11, 8.68, 10.03, 9.57, 11.69, 11.23, 10.28)
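# a) Shapiro-Wilk normality test for each diet group
shapiro.test(stand2)
shapiro.test(junk2)
shapiro.test(organic2)
# b) Breusch-Pagan test for equal variances (needs the lmtest package)
if (!requireNamespace("lmtest", quietly = TRUE)) install.packages("lmtest")
library(lmtest)
data <- c(stand2, junk2, organic2)         # stack the three groups into one vector
f <- as.factor(rep(seq(1, 3), each = 18))  # 1 = standard, 2 = junk, 3 = organic
fit <- lm(data ~ f)
bptest(fit)
# c) one-way ANOVA
anova(fit)
# d) Tukey HSD post hoc comparisons
TukeyHSD(aov(data ~ f))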
a)
> shapiro.test(stand2)
Shapiro-Wilk normality test
data: stand2
W = 0.95491, p-value = 0.5071
> shapiro.test(junk2)
Shapiro-Wilk normality test
data: junk2
W = 0.95049, p-value = 0.4328
> shapiro.test(organic2)
Shapiro-Wilk normality test
data: organic2
W = 0.94001, p-value = 0.2899
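The p-value in each case is greater than 0.05, so we fail to reject the null hypothesis of normality for any of the 3 groups.
It is reasonable to assume that all 3 groups are normally distributed.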
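Part (a) also asks for graphs, which the Shapiro-Wilk tests do not cover. A minimal sketch, assuming base R graphics and an illustrative list name `groups` (not part of the original answer), draws a normal Q-Q plot for each diet group and side-by-side boxplots:

```r
# Study A data, copied from the question
stand   <- c(10.80, 11.19, 8.31, 11.24, 9.89, 9.88, 10.18, 11.28, 8.27, 11.69,
             10.50, 12.53, 10.55, 10.24, 8.95, 11.29, 10.83, 9.94)
junk    <- c(11.22, 11.27, 11.78, 11.67, 10.91, 11.91, 13.07, 11.85, 10.83, 11.18, 12.68,
             11.68, 10.69, 11.40, 11.87, 12.89, 11.85, 12.33)
organic <- c(6.77, 9.23, 10.29, 8.78, 10.43, 10.80, 9.84, 11.24, 9.07, 10.39,
             10.40, 9.11, 8.68, 10.03, 9.57, 11.69, 11.23, 10.28)

# `groups` is an illustrative name for the three samples
groups <- list(standard = stand, junk = junk, organic = organic)

# One Q-Q plot per group; points close to the line suggest normality
op <- par(mfrow = c(1, 3))
for (nm in names(groups)) {
  qqnorm(groups[[nm]], main = paste("Normal Q-Q:", nm))
  qqline(groups[[nm]])
}
par(op)

# Boxplots show skewness, outliers, and a rough comparison of spread
boxplot(groups, ylab = "Weight gain (g)")
```

If the points in each Q-Q plot stay close to the reference line, the plots agree with the test results.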
b)
> bptest(fit)
studentized Breusch-Pagan test
data: fit
BP = 2.8334, df = 2, p-value = 0.2425
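The p-value (0.2425) is greater than 0.05, so we fail to reject the null hypothesis of equal variances. It is reasonable to assume homoskedasticity.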
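As a cross-check on the equal-variance conclusion, one could compare the group standard deviations and run Bartlett's test from base R. This is an alternative sketch, not the method used in the answer; the names `weight` and `diet` are illustrative.

```r
# Study A data, copied from the question
stand   <- c(10.80, 11.19, 8.31, 11.24, 9.89, 9.88, 10.18, 11.28, 8.27, 11.69,
             10.50, 12.53, 10.55, 10.24, 8.95, 11.29, 10.83, 9.94)
junk    <- c(11.22, 11.27, 11.78, 11.67, 10.91, 11.91, 13.07, 11.85, 10.83, 11.18, 12.68,
             11.68, 10.69, 11.40, 11.87, 12.89, 11.85, 12.33)
organic <- c(6.77, 9.23, 10.29, 8.78, 10.43, 10.80, 9.84, 11.24, 9.07, 10.39,
             10.40, 9.11, 8.68, 10.03, 9.57, 11.69, 11.23, 10.28)

# Illustrative names: stacked response and a labelled grouping factor
weight <- c(stand, junk, organic)
diet   <- factor(rep(c("standard", "junk", "organic"), each = 18),
                 levels = c("standard", "junk", "organic"))

# Sample standard deviation of weight gain within each diet group
sds <- tapply(weight, diet, sd)
print(sds)

# Bartlett's test: the null hypothesis is that all group variances are equal
bt <- bartlett.test(weight ~ diet)
print(bt)
```

A p-value above 0.05 here would agree with the Breusch-Pagan result. Bartlett's test is sensitive to non-normality, so it is best read together with part (a).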
c)
> anova(fit)
Analysis of Variance Table
Response: data
          Df Sum Sq Mean Sq F value    Pr(>F)
f          2 32.471 16.2354  15.832 4.483e-06 ***
Residuals 51 52.300  1.0255
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
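Since normality and equal variances are both reasonable, the standard one-way ANOVA F test is used.
The p-value (4.483e-06) is less than alpha = 0.05, so there is a significant difference in mean weight gain among the diet groups.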
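The F value and p-value in the ANOVA output can be reproduced by hand from the sums of squares and degrees of freedom it reports. A small sketch of that arithmetic (the variable names are illustrative):

```r
# Values taken from the ANOVA table
ss_between <- 32.471; df_between <- 2    # diet factor (f)
ss_within  <- 52.300; df_within  <- 51   # residuals

# Mean square = sum of squares / degrees of freedom
ms_between <- ss_between / df_between
ms_within  <- ss_within / df_within

# F statistic = between-group mean square / within-group mean square
f_value <- ms_between / ms_within

# Upper-tail probability of the F distribution with (2, 51) degrees of freedom
p_value <- pf(f_value, df_between, df_within, lower.tail = FALSE)

print(c(F = f_value, p = p_value))
```

Up to rounding of the inputs, this gives F of about 15.83 and a p-value well below 0.05, in line with the table.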
d)
> TukeyHSD(aov(data~f))
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = data ~ f)
$f
          diff       lwr        upr     p adj
2-1  1.3066667  0.491812  2.1215213 0.0008900
3-1 -0.5405556 -1.355410  0.2742991 0.2543319
3-2 -1.8472222 -2.662077 -1.0323676 0.0000040
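Groups 1 and 2 (standard vs. junk, p adj = 0.00089) and groups 2 and 3 (junk vs. organic, p adj = 0.000004) differ significantly. Groups 1 and 3 (standard vs. organic) do not differ significantly (p adj = 0.2543, and the interval contains 0). The junk food group gained more weight on average than either of the other two groups.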
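The `diff` column in the Tukey output is just the difference between group means, and with equal group sizes every interval has the same half-width. A sketch that reproduces both from the raw data, assuming the usual Tukey HSD formula for balanced groups (the names `groups`, `mse`, and `half_width` are illustrative):

```r
# Study A data, copied from the question
stand   <- c(10.80, 11.19, 8.31, 11.24, 9.89, 9.88, 10.18, 11.28, 8.27, 11.69,
             10.50, 12.53, 10.55, 10.24, 8.95, 11.29, 10.83, 9.94)
junk    <- c(11.22, 11.27, 11.78, 11.67, 10.91, 11.91, 13.07, 11.85, 10.83, 11.18, 12.68,
             11.68, 10.69, 11.40, 11.87, 12.89, 11.85, 12.33)
organic <- c(6.77, 9.23, 10.29, 8.78, 10.43, 10.80, 9.84, 11.24, 9.07, 10.39,
             10.40, 9.11, 8.68, 10.03, 9.57, 11.69, 11.23, 10.28)

groups <- list(standard = stand, junk = junk, organic = organic)
n      <- 18                       # mice per group
means  <- sapply(groups, mean)     # group means

# Pairwise differences, labelled as in the TukeyHSD output (1, 2, 3)
diffs <- c("2-1" = unname(means["junk"]    - means["standard"]),
           "3-1" = unname(means["organic"] - means["standard"]),
           "3-2" = unname(means["organic"] - means["junk"]))

# Within-group mean square (residual mean square) on 54 - 3 = 51 df
mse <- sum(sapply(groups, function(x) sum((x - mean(x))^2))) / 51

# Half-width of each 95% family-wise interval for balanced groups
half_width <- qtukey(0.95, nmeans = 3, df = 51) * sqrt(mse / n)

print(round(diffs, 4))
print(round(cbind(lwr = diffs - half_width, upr = diffs + half_width), 4))
```

A pair differs significantly at the 5% family-wise level when its interval excludes 0, which is the case for 2-1 and 3-2 but not for 3-1.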