Please show all the steps in detail, along with all the code.
1. Seedling emergence example
For this question you will need to analyze data given in file seedEmergence. For your convenience, the
data is posted as a text file.
Five seed disinfectant treatments were applied to several agricultural plots, where 100 seeds were planted in
each plot. The response variable is "plants that emerged in each plot". The goal is to compare these 5
treatments using an appropriate number of blocking levels, which can be decided by looking at the
data.
(a) Use graphical (e.g., boxplots) and numerical methods (group means, sd, etc.) to describe the differences
in treatments.
(b) What statistical model would you use to analyze this data? Explain.
(c) Construct the analysis of variance table for this problem.
(d) Using alpha = 0.05, is there any evidence that the treatments differ with respect to emerging plants in
each plot?
(e) Give estimates of "all" the parameters in the model.
(f) Analyze the residuals from this experiment. Which assumptions about the model are satisfied and
which are not?
treatment block emergence
Control 1 86
Arasan 1 98
Spergon 1 96
Semesan 1 97
Fermate 1 91
Control 2 90
Arasan 2 94
Spergon 2 90
Semesan 2 95
Fermate 2 93
Control 3 88
Arasan 3 93
Spergon 3 91
Semesan 3 91
Fermate 3 95
Control 4 87
Arasan 4 89
Spergon 4 92
Semesan 4 92
Fermate 4 95
Note:
Hey there! Thank you for the question. As you have posted several subparts together, we have solved the first four subparts for you, according to our policy.
(a)
We have used the MegaStat add-in in Excel to draw the boxplot and to find the group counts, means, standard deviations, variances, minimums, maximums, and ranges for the 5 treatment groups.
Open an Excel worksheet. Enter the data in 5 different columns, each column representing a treatment, and the first row holding the treatment names.
Go to Add-Ins > MegaStat > Descriptive Statistics.
Enter Sheet1!$A$1:$E$5 in Input range.
Tick on Mean, Sample variance and standard deviation, Minimum, maximum, range, and Boxplot.
Click OK.
The following outputs are obtained:
It can be observed that Arasan and Fermate have the same mean of 93.50; the mean of Semesan is 93.75, which is very close to the other two. The mean of Spergon is 92.25, which is slightly smaller than that of Arasan, Fermate, and Semesan. However, the mean of Control is 87.75, which is much smaller than that of the others.
The standard deviation of Fermate (about 1.91) is close to that of the Control group (about 1.71). The standard deviations of Spergon (about 2.63) and Semesan (about 2.75) are close to each other, although somewhat larger than those of the previous two. The standard deviation of Arasan (about 3.70) is the highest of the five.
The boxplots clearly exhibit the difference in the central tendencies and dispersion among the 5 different treatment groups.
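For readers without MegaStat, the same group summaries can be cross-checked with a few lines of Python. This is only a minimal sketch using the standard library; the dictionary name `emergence` and the layout of `summary` are illustrative choices, not part of the original answer, and the sketch does not draw the boxplot.

```python
from statistics import mean, stdev

# Emergence counts per treatment, listed in block order 1-4 (from the data table).
emergence = {
    "Control": [86, 90, 88, 87],
    "Arasan": [98, 94, 93, 89],
    "Spergon": [96, 90, 91, 92],
    "Semesan": [97, 95, 91, 92],
    "Fermate": [91, 93, 95, 95],
}

# summary[treatment] = (mean, sample standard deviation, minimum, maximum, range)
summary = {
    name: (mean(v), stdev(v), min(v), max(v), max(v) - min(v))
    for name, v in emergence.items()
}

for name, (m, s, lo, hi, rng) in summary.items():
    print(f"{name:8s} mean={m:6.2f} sd={s:5.2f} min={lo} max={hi} range={rng}")
```

The `stdev` function uses the n - 1 divisor, which matches the sample standard deviation requested in MegaStat.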
(b)
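Since two factors affect the outcome, namely the five treatments (Control, Arasan, Spergon, Semesan, and Fermate) and the four blocks, a two-way ANOVA (analysis of variance) model is appropriate. Each treatment appears exactly once in each block, so this is a two-way ANOVA without replication, that is, a randomized complete block design with treatment and block main effects and no interaction term.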
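One common way to write this model is sketched below. The symbols are assumptions introduced here for illustration and do not appear in the original answer: y_ij is the emergence count for treatment i in block j, mu the overall mean, tau_i the treatment effect, beta_j the block effect, and epsilon_ij the random error.

```latex
y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij},
\qquad i = 1, \dots, 5, \quad j = 1, \dots, 4,
\qquad \varepsilon_{ij} \overset{\text{iid}}{\sim} N(0, \sigma^2),
\qquad \sum_{i=1}^{5} \tau_i = 0, \quad \sum_{j=1}^{4} \beta_j = 0 .
```

Under this formulation, the hypothesis of no treatment differences corresponds to all tau_i being equal to zero.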
(c)
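We have used the Data Analysis tool (Analysis ToolPak) in Excel to construct the analysis of variance table.
We have arranged the data with one row per treatment and one column per block, with labels in the first row and first column, and entered it as follows: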
Go to Data > Data Analysis > Anova: Two-Factor Without Replication > OK.
In Input Range, enter $A$1:$E$6; tick Labels, enter Alpha as 0.05, and click OK.
The following output is obtained. Note that the analysis of variance table is given under “ANOVA” in the output.
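The quantities in such a table can also be computed directly from the data. The following is a minimal sketch in plain Python, assuming the same layout as the Excel input (rows are treatments, columns are blocks 1 to 4); all variable names are illustrative.

```python
# Rows = treatments, columns = blocks 1-4 (same layout as the Excel input).
data = {
    "Control": [86, 90, 88, 87],
    "Arasan": [98, 94, 93, 89],
    "Spergon": [96, 90, 91, 92],
    "Semesan": [97, 95, 91, 92],
    "Fermate": [91, 93, 95, 95],
}
rows = list(data.values())
n_treat, n_block = len(rows), len(rows[0])

grand_mean = sum(sum(r) for r in rows) / (n_treat * n_block)
treat_means = [sum(r) / n_block for r in rows]
block_means = [sum(col) / n_treat for col in zip(*rows)]

# Sums of squares for a two-way layout without replication.
ss_treat = n_block * sum((m - grand_mean) ** 2 for m in treat_means)
ss_block = n_treat * sum((m - grand_mean) ** 2 for m in block_means)
ss_total = sum((y - grand_mean) ** 2 for r in rows for y in r)
ss_error = ss_total - ss_treat - ss_block

df_treat, df_block = n_treat - 1, n_block - 1
df_error = df_treat * df_block

ms_treat = ss_treat / df_treat
ms_block = ss_block / df_block
ms_error = ss_error / df_error
f_treat = ms_treat / ms_error
f_block = ms_block / ms_error

print(f"{'Source':10s} {'SS':>8s} {'df':>3s} {'MS':>8s} {'F':>6s}")
print(f"{'Treatment':10s} {ss_treat:8.2f} {df_treat:3d} {ms_treat:8.3f} {f_treat:6.3f}")
print(f"{'Block':10s} {ss_block:8.2f} {df_block:3d} {ms_block:8.3f} {f_block:6.3f}")
print(f"{'Error':10s} {ss_error:8.2f} {df_error:3d} {ms_error:8.3f}")
print(f"{'Total':10s} {ss_total:8.2f} {n_treat * n_block - 1:3d}")
```

In the Excel output, the "Treatment" line of this sketch corresponds to "Rows" and the "Block" line to "Columns".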
(d)
In the analysis of variance table above, the “Rows” under “Sources of Variation” correspond to the treatments (as the observations under each treatment are noted along a row), whereas the “Columns” title relates to the block effects.
The P-value for Rows, that is, for the treatments, is 0.0377 (to 4 decimal places).
The level of significance is given as α = 0.05.
In this case, the null hypothesis is that there is no difference between the effects of the 5 treatments, and the alternative hypothesis is that not all the treatments have the same effect.
The rejection rule for a test using the P-value is: Reject the null hypothesis, if P-value ≤ α. Otherwise, fail to reject the null hypothesis.
Here, P-value (= 0.0377) < α (= 0.05), so we reject the null hypothesis.
Thus, there is evidence that the treatments differ with respect to emerging plants in the plots.
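The P-value and the decision can be checked without Excel. The sketch below recomputes the treatment F statistic from the data and evaluates the upper-tail probability of the F distribution. It relies on a closed form that holds only when both degrees of freedom are even (here 4 and 12), where the regularized incomplete beta function reduces to a binomial tail sum. The helper name `f_upper_tail` is hypothetical and introduced only for this illustration.

```python
from math import comb

def f_upper_tail(f, d1, d2):
    """P(F > f) for an F(d1, d2) variable; valid only when d1 and d2 are even."""
    if d1 % 2 or d2 % 2:
        raise ValueError("closed form requires even degrees of freedom")
    a, b = d2 // 2, d1 // 2
    x = d2 / (d2 + d1 * f)
    n = a + b - 1
    # Regularized incomplete beta I_x(a, b) written as a binomial upper tail.
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(a, n + 1))

# Rows = treatments, columns = blocks 1-4.
rows = [
    [86, 90, 88, 87],  # Control
    [98, 94, 93, 89],  # Arasan
    [96, 90, 91, 92],  # Spergon
    [97, 95, 91, 92],  # Semesan
    [91, 93, 95, 95],  # Fermate
]
n_treat, n_block = len(rows), len(rows[0])
grand = sum(map(sum, rows)) / (n_treat * n_block)

ss_treat = n_block * sum((sum(r) / n_block - grand) ** 2 for r in rows)
ss_block = n_treat * sum((sum(c) / n_treat - grand) ** 2 for c in zip(*rows))
ss_total = sum((y - grand) ** 2 for r in rows for y in r)
ss_error = ss_total - ss_treat - ss_block

df_treat = n_treat - 1
df_error = (n_treat - 1) * (n_block - 1)
f_treat = (ss_treat / df_treat) / (ss_error / df_error)

alpha = 0.05
p_value = f_upper_tail(f_treat, df_treat, df_error)
reject = p_value <= alpha

print(f"F = {f_treat:.3f}, P-value = {p_value:.4f}, reject H0: {reject}")
```

The printed P-value should agree with the 0.0377 reported by Excel up to rounding, leading to the same decision to reject the null hypothesis at the 0.05 level.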