Please answer part a) through part d) of the question below. Thank you.
Question 3
A tobacco refinery has four methods of measuring pH. To test the four methods, a supervisor randomly assigns each of 32 tobacco samples with known pH to one of the four methods, so that each method is applied to exactly eight samples. The difference between measured pH and the known pH is recorded, and the data is below.
Method | Sample | Response |
A | 1 | -0.307 |
A | 2 | -0.294 |
A | 3 | 0.009 |
A | 4 | -0.051 |
A | 5 | -0.136 |
A | 6 | -0.324 |
A | 7 | -0.324 |
A | 8 | -0.164 |
B | 1 | -0.110 |
B | 2 | 0.125 |
B | 3 | -0.013 |
B | 4 | 0.082 |
B | 5 | 0.091 |
B | 6 | 0.459 |
B | 7 | 0.259 |
B | 8 | 0.351 |
C | 1 | 0.137 |
C | 2 | -0.063 |
C | 3 | 0.24 |
C | 4 | -0.05 |
C | 5 | 0.318 |
C | 6 | 0.154 |
C | 7 | 0.099 |
C | 8 | 0.124 |
D | 1 | -0.042 |
D | 2 | 0.69 |
D | 3 | 0.201 |
D | 4 | 0.166 |
D | 5 | 0.219 |
D | 6 | 0.407 |
D | 7 | 0.505 |
D | 8 | 0.311 |
a) Use R to calculate the means and standard deviations for the four methods. Based only on these numbers, do the mean pH differences seem to differ across the methods? Explain.
b) Do the ANOVA conditions hold? Be sure to include your R code, output, appropriate graphs (boxplots, dotplots), and explanations.
c) Regardless of your answer to (b), run ANOVA with R. Set-up your null and alternative hypotheses; provide your test statistic, p-value, and conclusion.
d) Use the Bonferroni adjustment to make a confidence interval for each of the 6 differences between treatment means, using an experiment-wise confidence level of 95%.
#### R command
# Read the data
data=read.csv("data.csv", header=TRUE, sep=",")
## The means and standard deviations for the four methods
aggregate(Response ~Method, data = data, function(x) c(mean = mean(x), sd = sd(x)))
## b)
boxplot(Response ~Method, data = data)
## c) ### ANOVA test
model=aov(Response ~factor(Method), data = data)
summary(model)
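## d) Bonferroni-adjusted pairwise t tests (p-values only; pairwise.t.test does not return confidence intervals)
pairwise.t.test(data$Response, data$Method,
p.adjust.method = "bonferroni", # passing the whole p.adjust.methods vector would select its first entry, "holm"
alternative = "two.sided")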
### End the command
## Run
> # Read the data
> data=read.csv("data.csv", head=T, sep=",")
> ## The means and standard deviations for the four methods
> aggregate(Response ~Method, data = data, function(x) c(mean = mean(x), sd = sd(x)))
  Method Response.mean Response.sd
1      A    -0.1988750   0.1321800
2      B     0.1555000   0.1891409
3      C     0.1198750   0.1297772
4      D     0.3071250   0.2256960
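## Comment: The sample means range from -0.199 (method A) to 0.307 (method D), a gap of about 0.51, which is large relative to the group standard deviations (0.13 to 0.23). Method A is the only method with a negative mean, while B (0.156) and C (0.120) are close to each other. Based on these numbers alone, the means do appear to differ across the methods, with A versus D the clearest contrast.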
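For readers who do not have data.csv, the summary above can be reproduced by typing the table from the question directly into R. This is a minimal sketch: the data frame name `ph` and the helper vectors `group_means` and `group_sds` are illustrative choices, not part of the original answer.

```r
# Rebuild the data set from the table in the question (32 rows, 8 per method).
ph <- data.frame(
  Method = factor(rep(c("A", "B", "C", "D"), each = 8)),
  Sample = rep(1:8, times = 4),
  Response = c(
    -0.307, -0.294,  0.009, -0.051, -0.136, -0.324, -0.324, -0.164,  # method A
    -0.110,  0.125, -0.013,  0.082,  0.091,  0.459,  0.259,  0.351,  # method B
     0.137, -0.063,  0.240, -0.050,  0.318,  0.154,  0.099,  0.124,  # method C
    -0.042,  0.690,  0.201,  0.166,  0.219,  0.407,  0.505,  0.311   # method D
  )
)

# Per-method mean and standard deviation, as named vectors.
group_means <- tapply(ph$Response, ph$Method, mean)
group_sds   <- tapply(ph$Response, ph$Method, sd)

# Side-by-side summary table, rounded for reading.
print(round(cbind(mean = group_means, sd = group_sds), 4))
```

The printed values should agree with the aggregate() output shown above, which is a quick check that the table was typed in correctly.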
> ## b)
>
> boxplot(Response ~Method, data = data)
>
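Comment: The box-plots give no strong indication against normality within each method, although eight observations per method is weak evidence either way. The median of method A is clearly lower than the medians of the other three methods. For the equal-variance condition, the largest sample standard deviation (0.226, method D) is about 1.7 times the smallest (0.130, method C), which is within the common rule of thumb of a ratio below 2. Independence is supported by the random assignment of samples to methods. The ANOVA conditions therefore appear reasonable.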
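Part b) also asks for dotplots, and the conditions can be checked a little more formally than by eye. The sketch below is one possible set of checks, not the only acceptable one: it uses a dotplot, a normal Q-Q plot of the residuals, the standard-deviation ratio, and the Shapiro-Wilk and Bartlett tests from base R. The names `ph`, `fit`, `res`, and `sd_ratio` are illustrative.

```r
# Data from the question, typed in so the block runs without data.csv.
ph <- data.frame(
  Method = factor(rep(c("A", "B", "C", "D"), each = 8)),
  Response = c(
    -0.307, -0.294,  0.009, -0.051, -0.136, -0.324, -0.324, -0.164,  # method A
    -0.110,  0.125, -0.013,  0.082,  0.091,  0.459,  0.259,  0.351,  # method B
     0.137, -0.063,  0.240, -0.050,  0.318,  0.154,  0.099,  0.124,  # method C
    -0.042,  0.690,  0.201,  0.166,  0.219,  0.407,  0.505,  0.311   # method D
  )
)

# Dotplot of the raw responses by method.
stripchart(Response ~ Method, data = ph, vertical = TRUE,
           method = "jitter", pch = 19, ylab = "Measured pH - known pH")

# Residuals from the one-way model (observation minus its group mean).
fit <- aov(Response ~ Method, data = ph)
res <- residuals(fit)

# Normality: Q-Q plot and Shapiro-Wilk test on the pooled residuals.
qqnorm(res)
qqline(res)
print(shapiro.test(res))

# Equal variances: ratio of largest to smallest group SD, plus Bartlett's test.
group_sds <- tapply(ph$Response, ph$Method, sd)
sd_ratio <- max(group_sds) / min(group_sds)
print(sd_ratio)
print(bartlett.test(Response ~ Method, data = ph))
```

With only eight observations per group, the formal tests have little power, so they are best read alongside the plots rather than as a verdict on their own.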
### ANOVA test
# Null hypothesis: The mean pH difference is the same for all four methods (mu_A = mu_B = mu_C = mu_D).
# Alternative hypothesis: At least one method's mean pH difference differs from the others.
> model=aov(Response ~factor(Method), data = data)
> summary(model)
               Df Sum Sq Mean Sq F value   Pr(>F)
factor(Method)  3 1.0851  0.3617   11.95 3.23e-05 ***
Residuals      28 0.8472  0.0303
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#Test statistic: F = 11.95 with 3 and 28 degrees of freedom
#P-value = 3.23e-05
#Conclusion: The p-value is less than the 0.05 level of significance. Hence, we reject the null hypothesis and conclude that at least one method's mean pH difference differs from the others.
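To see where the numbers in the ANOVA table come from, the sums of squares, the F statistic, and the p-value can be computed by hand. This is a sketch of the standard one-way ANOVA arithmetic; the variable names (`ph`, `ss_between`, `ss_within`, `f_stat`, `p_value`) are illustrative.

```r
# Data from the question, typed in so the block runs without data.csv.
ph <- data.frame(
  Method = factor(rep(c("A", "B", "C", "D"), each = 8)),
  Response = c(
    -0.307, -0.294,  0.009, -0.051, -0.136, -0.324, -0.324, -0.164,  # method A
    -0.110,  0.125, -0.013,  0.082,  0.091,  0.459,  0.259,  0.351,  # method B
     0.137, -0.063,  0.240, -0.050,  0.318,  0.154,  0.099,  0.124,  # method C
    -0.042,  0.690,  0.201,  0.166,  0.219,  0.407,  0.505,  0.311   # method D
  )
)

group_means <- tapply(ph$Response, ph$Method, mean)
group_n     <- tapply(ph$Response, ph$Method, length)
grand_mean  <- mean(ph$Response)

# Between-group sum of squares: group size times squared deviation of each
# group mean from the grand mean.
ss_between <- sum(group_n * (group_means - grand_mean)^2)

# Within-group (residual) sum of squares: squared deviation of each
# observation from its own group mean (ave() returns group means by default).
ss_within <- sum((ph$Response - ave(ph$Response, ph$Method))^2)

df_between <- nlevels(ph$Method) - 1         # 4 - 1 = 3
df_within  <- nrow(ph) - nlevels(ph$Method)  # 32 - 4 = 28

# F is the ratio of the two mean squares; the p-value is the upper tail.
f_stat  <- (ss_between / df_between) / (ss_within / df_within)
p_value <- pf(f_stat, df_between, df_within, lower.tail = FALSE)

print(c(SS_between = ss_between, SS_within = ss_within,
        F = f_stat, p = p_value))
```

The results should match the Sum Sq, F value, and Pr(>F) columns of summary(model) up to rounding.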
> pairwise.t.test(data$Response, data$Method,
+ p.adjust.method = p.adjust.methods,
+ alternative = c("two.sided"))
	Pairwise comparisons using t tests with pooled SD

data:  data$Response and data$Method

  A       B      C
B 0.0017  -      -
C 0.0041  0.6852 -
D 1.8e-05 0.1845 0.1202

P value adjustment method: holm
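## Comment: Method A differs significantly from each of methods B, C, and D (adjusted p-values 0.0017, 0.0041, and 1.8e-05). No pair among B, C, and D differs significantly at the 0.05 level (adjusted p-values 0.6852, 0.1845, and 0.1202). Note that the p-values in this run are Holm-adjusted, as the last line of the output states: passing the whole p.adjust.methods vector makes R use its first entry, "holm". Setting p.adjust.method = "bonferroni" gives the adjustment asked for in part d). Bonferroni-adjusted p-values are never smaller than Holm-adjusted ones, so the non-significant pairs remain non-significant. Part d) also asks for confidence intervals, which pairwise.t.test does not report.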
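Since part d) asks for confidence intervals, they can be built directly from the ANOVA fit. The sketch below assumes the usual pooled-variance approach: each of the 6 intervals uses the residual mean square from the ANOVA and the t quantile at 1 - 0.05/(2*6) with 28 degrees of freedom, so that the experiment-wise confidence level is at least 95%. The names `ph`, `fit`, `mse`, `t_crit`, and `ci` are illustrative.

```r
# Data from the question, typed in so the block runs without data.csv.
ph <- data.frame(
  Method = factor(rep(c("A", "B", "C", "D"), each = 8)),
  Response = c(
    -0.307, -0.294,  0.009, -0.051, -0.136, -0.324, -0.324, -0.164,  # method A
    -0.110,  0.125, -0.013,  0.082,  0.091,  0.459,  0.259,  0.351,  # method B
     0.137, -0.063,  0.240, -0.050,  0.318,  0.154,  0.099,  0.124,  # method C
    -0.042,  0.690,  0.201,  0.166,  0.219,  0.407,  0.505,  0.311   # method D
  )
)

fit <- aov(Response ~ Method, data = ph)
mse <- deviance(fit) / df.residual(fit)  # pooled variance (residual mean square)

group_means <- tapply(ph$Response, ph$Method, mean)
group_n     <- tapply(ph$Response, ph$Method, length)

pairs  <- combn(levels(ph$Method), 2)  # one column per pair of methods
m      <- ncol(pairs)                  # number of comparisons, 6
alpha  <- 0.05                         # experiment-wise error rate
t_crit <- qt(1 - alpha / (2 * m), df.residual(fit))  # Bonferroni critical value

# For each pair: difference in means and its Bonferroni-adjusted interval.
ci <- t(apply(pairs, 2, function(p) {
  d  <- group_means[[p[1]]] - group_means[[p[2]]]
  se <- sqrt(mse * (1 / group_n[[p[1]]] + 1 / group_n[[p[2]]]))
  c(diff = d, lower = d - t_crit * se, upper = d + t_crit * se)
}))
rownames(ci) <- paste(pairs[1, ], pairs[2, ], sep = " - ")
print(round(ci, 4))

# Matching Bonferroni-adjusted p-values for the same six comparisons.
print(pairwise.t.test(ph$Response, ph$Method, p.adjust.method = "bonferroni"))
```

An interval that excludes zero indicates a significant difference between that pair of methods at the experiment-wise 0.05 level, which lines up with a Bonferroni-adjusted p-value below 0.05 for the same pair.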