Question

The data set "airquality" in the R datasets library has data on ozone concentration, wind speed, temperature, and solar radiation by month and day for May through September in New York. Attach airquality to your workspace and then construct side-by-side boxplots of Wind by Month. Month is a numeric variable in the airquality data frame. You can treat it as a factor by using the "as.factor" function, e.g.,

> plot(Wind ~ as.factor(Month))

Next, do an analysis of variance to determine if wind speed varies significantly by month. Finally, use the "pairwise.t.test" function to pick out which pairs of months are significantly different. Are the answers what you would expect from looking at the boxplots?

Solutions

Expert Solution

The p-values returned by pairwise.t.test are below 0.05 only for the pairs month 5 vs. month 7 and month 5 vs. month 8, so these two pairs differ significantly in mean wind speed. All other pairs have p-values greater than 0.05 and are not significantly different.
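
The p-values referred to here can be pulled out of the test result directly. The following is a minimal sketch, assuming the default settings of pairwise.t.test (pooled standard deviation and Holm adjustment for multiple comparisons); the names aq, res, pmat, sig and sig_pairs are illustrative and not part of the original solution.

```r
# Work on a local copy so the built-in data frame is left untouched
aq <- transform(airquality, Month = factor(Month))

# Pairwise t-tests of Wind between months (default settings assumed)
res <- pairwise.t.test(aq$Wind, aq$Month)
res$p.adjust.method          # reports which multiplicity correction was applied

# Lower-triangular matrix of adjusted p-values (upper triangle is NA)
pmat <- res$p.value

# Row/column positions of the pairs with adjusted p < 0.05
sig <- which(pmat < 0.05, arr.ind = TRUE)

# Translate positions back into month labels
sig_pairs <- data.frame(
  month_a = colnames(pmat)[sig[, "col"]],
  month_b = rownames(pmat)[sig[, "row"]],
  p_adj   = pmat[sig]
)
print(sig_pairs)
```

Because the p-values are adjusted, a pair can look different in the raw data and still fail the 0.05 threshold here; passing p.adjust.method = "none" would show the unadjusted values for comparison.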

In the boxplots, the typical wind speed in month 5 (the median line of the box) sits clearly above that of months 7 and 8, while months 5, 6 and 9 are at roughly the same level. The pairwise t-tests agree with this visual impression.
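
One caveat when comparing the plot with the tests: a boxplot marks the median of each group, whereas the ANOVA and the t-tests compare means. A quick way to check that both tell the same story is to tabulate them side by side. This is a small sketch; the names wind_by_month and centers are hypothetical.

```r
# Split the Wind measurements into one vector per month
wind_by_month <- split(airquality$Wind, airquality$Month)

# Median (what the boxplot shows) and mean (what the tests compare) per month
centers <- data.frame(
  month  = names(wind_by_month),
  median = sapply(wind_by_month, median),
  mean   = sapply(wind_by_month, mean),
  n      = sapply(wind_by_month, length)
)
print(centers, row.names = FALSE)
```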

R-code:

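# Load the built-in data set and attach it, as the question asks
data(airquality)
attach(airquality)
# Side-by-side boxplots of Wind by Month
plot(Wind ~ as.factor(Month))
# Convert Month to a factor in the data frame (the attached copy is not updated)
airquality$Month <- as.factor(airquality$Month)
# One-way ANOVA of Wind on Month
airquality.aov <- aov(Wind ~ Month, data = airquality)
anova(airquality.aov)
# Pairwise t-tests between months (p-values are Holm-adjusted by default)
with(airquality, pairwise.t.test(Wind, Month))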
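
The question also asks whether wind speed varies significantly by month overall, which is answered by the F-test in the ANOVA table. As a hedged sketch (the names aq, fit, tab, f_stat and p_val are illustrative), the statistic and its p-value can be read off the table like this:

```r
# Fit the same one-way model on a local copy with Month as a factor
aq  <- transform(airquality, Month = factor(Month))
fit <- aov(Wind ~ Month, data = aq)

# anova() returns a data frame; the first row belongs to the Month term
tab    <- anova(fit)
f_stat <- tab[["F value"]][1]
p_val  <- tab[["Pr(>F)"]][1]

cat(sprintf("F(%d, %d) = %.2f, p = %.4f\n",
            tab$Df[1], tab$Df[2], f_stat, p_val))
```

A p-value below 0.05 here indicates that at least one month differs in mean wind speed, and the pairwise tests are then what identify the specific pairs.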

