The data set "airquality" in the R datasets library has data on ozone concentration, wind speed, temperature, and solar radiation by month and day for May through September in New York. Attach airquality to your workspace and then construct side-by-side boxplots of Wind by Month. Month is a numeric variable in the airquality data frame. You can treat it as a factor by using the "as.factor" function, e.g.,
> plot(Wind ~ as.factor(Month))
Next, do an analysis of variance to determine if wind speed varies significantly by month. Finally, use the "pairwise.t.test" function to pick out which pairs of months are significantly different. Are the answers what you would expect from looking at the boxplots?
In the p-values returned by pairwise.t.test (which are adjusted for multiple comparisons, using the Holm method by default), the pairs month 5-month 7 and month 5-month 8 have p-values below 0.05, so wind speed differs significantly between the months in each of these two pairs. All other pairs have p-values above 0.05 and are not significantly different at the 5% level.
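The comparison against 0.05 can also be done in code by pulling the p-value matrix out of the object that pairwise.t.test returns. The following is a minimal sketch that assumes the default Holm adjustment; the names month, ptt, pvals, sig and result are introduced here for illustration and are not part of the original solution.

```r
# Sketch: list the month pairs whose adjusted p-value is below 0.05.
data(airquality)
month <- as.factor(airquality$Month)

# pairwise.t.test adjusts for multiple comparisons (Holm method by default)
ptt <- pairwise.t.test(airquality$Wind, month)

# p.value is a lower-triangular matrix; cells with no comparison are NA
pvals <- ptt$p.value

# Row/column positions of the cells below the 0.05 threshold (NA cells are skipped)
sig <- which(pvals < 0.05, arr.ind = TRUE)

# One row per significant pair, labelled "earlier month-later month"
result <- data.frame(
  pair    = paste(colnames(pvals)[sig[, "col"]],
                  rownames(pvals)[sig[, "row"]], sep = "-"),
  p.value = pvals[sig]
)
print(result)
```

Changing the threshold or passing a different p.adjust.method to pairwise.t.test can change which pairs are listed.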
In the boxplot, the typical wind speed for month 5 (the median line in the box) is visibly different from that of months 7 and 8, while months 5, 6 and 9 look almost the same. The pairwise t-test gives the same picture, so the answers are what we would expect from looking at the boxplots.
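The visual comparison can be backed up with per-month summaries. This is a small sketch, not part of the original solution: it computes the median (the line drawn inside each box) and the mean (the quantity the ANOVA and t-tests compare) of Wind for each month. The names month, wind_median and wind_mean are illustrative.

```r
# Sketch: numeric summaries behind the boxplot comparison.
data(airquality)
month <- as.factor(airquality$Month)

# Median wind speed per month (the thick line inside each box)
wind_median <- tapply(airquality$Wind, month, median)

# Mean wind speed per month (what the ANOVA and t-tests compare)
wind_mean <- tapply(airquality$Wind, month, mean)

# One row per month, rounded for easier reading
print(round(cbind(median = wind_median, mean = wind_mean), 2))
```

Showing the two columns side by side makes it easy to see whether the medians read off the boxplot and the means used by the tests rank the months in the same way.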
R-code:
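# Load the data set and attach it so Wind and Month can be used directly
data(airquality)
attach(airquality)
# Side-by-side boxplots of Wind by Month
plot(Wind ~ as.factor(Month))
# Treat Month as a factor in the data frame for the ANOVA and the t-tests
airquality$Month <- as.factor(airquality$Month)
# One-way analysis of variance of Wind by Month
airquality.aov <- with(airquality, aov(Wind ~ Month))
anova(airquality.aov)
# Pairwise t-tests between months (p-values are Holm-adjusted by default)
with(airquality, pairwise.t.test(Wind, Month))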
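To answer the analysis of variance part of the question directly, the overall F statistic and its p-value can be read from the ANOVA table in code. This is a minimal sketch under the same model (Wind explained by Month as a factor); the names fit, tab, f_stat and p_value are assumptions made for this illustration.

```r
# Sketch: pull the overall F statistic and p-value out of the one-way ANOVA.
data(airquality)
airquality$Month <- as.factor(airquality$Month)

fit <- aov(Wind ~ Month, data = airquality)

# summary() of an aov fit is a list; its first element is the ANOVA table
tab <- summary(fit)[[1]]

# The first row is the Month term, the second row is the residuals
f_stat  <- tab[["F value"]][1]
p_value <- tab[["Pr(>F)"]][1]

# TRUE if wind speed varies significantly by month at the 5% level
print(p_value < 0.05)
```

A TRUE result here says only that at least one month differs; the pairwise t-tests are what identify which pairs.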