Question

The data set airquality is one of R’s included data sets. It shows daily measurements of ozone concentration (Ozone), solar radiation (Solar.R), wind speed (Wind), and temperature (Temp) for 5 summer months (May through September) in 1973 in New York City. Some of the observations are missing and are recorded as NA, meaning not available. View an overall summary of the variables in airquality with the command

> summary(airquality)

Ignore the summaries for Month and Day since those variables should be factors, not numeric variables, and their summaries are meaningless. Attach airquality to your workspace

> attach(airquality)

and make boxplots of Ozone, Solar.R, Wind, and Temp. Comment on any noteworthy features.

Solutions

Expert Solution

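#############################
# Attach the data set, as the question asks. The calls below still use
# airquality$... so they work whether or not the data set is attached.
attach(airquality)
airquality   # print the full data frame (153 rows)

# Keep only the four measurement columns and summarise them.
data <- data.frame(airquality$Ozone, airquality$Solar.R,
                   airquality$Wind, airquality$Temp)
summary(data)

###############################
# One boxplot per variable in a 2 x 2 grid; boxplot() drops NA values.
par(mfrow = c(2, 2))
boxplot(airquality$Ozone, main = "Ozone")
boxplot(airquality$Solar.R, main = "Solar.R")
boxplot(airquality$Wind, main = "Wind")
boxplot(airquality$Temp, main = "Temp")
###############################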

output:

> summary(data)
 airquality.Ozone airquality.Solar.R airquality.Wind  airquality.Temp
 Min.   :  1.00   Min.   :  7.0      Min.   : 1.700   Min.   :56.00
 1st Qu.: 18.00   1st Qu.:115.8      1st Qu.: 7.400   1st Qu.:72.00
 Median : 31.50   Median :205.0      Median : 9.700   Median :79.00
 Mean   : 42.13   Mean   :185.9      Mean   : 9.958   Mean   :77.88
 3rd Qu.: 63.25   3rd Qu.:258.8      3rd Qu.:11.500   3rd Qu.:85.00
 Max.   :168.00   Max.   :334.0      Max.   :20.700   Max.   :97.00
 NA's   :37       NA's   :7
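The question notes that Month and Day should really be factors. As a minimal sketch (the names aq and month_counts are hypothetical and not part of the original solution), converting them first makes summary() report counts per level instead of quartiles:

```r
# Work on a copy so the built-in airquality data frame is left untouched.
aq <- transform(airquality, Month = factor(Month), Day = factor(Day))

# Month and Day are now summarised as level counts, not as numbers.
summary(aq)

# Number of daily observations recorded in each month.
month_counts <- table(aq$Month)
month_counts
```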

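The boxplots show outliers in the Ozone and Wind data, while Solar.R and Temp show none. The summary agrees with this for Ozone: its mean (42.13) is well above its median (31.50), which points to a long upper tail. Ozone also has 37 missing values and Solar.R has 7.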
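The outlier comment can be checked numerically rather than by eye. The sketch below is an addition, not part of the original solution, and the names vars, outliers, and na_counts are hypothetical. It uses boxplot.stats(), which applies the same default rule as boxplot(): points more than 1.5 times the interquartile range beyond the hinges are flagged.

```r
# Columns of interest; Month and Day are left out on purpose.
vars <- c("Ozone", "Solar.R", "Wind", "Temp")

# boxplot.stats() omits NAs and returns the flagged points in $out.
outliers <- lapply(airquality[vars], function(x) boxplot.stats(x)$out)
outliers

# Missing values per variable, to compare with the NA's row of summary().
na_counts <- colSums(is.na(airquality[vars]))
na_counts
```

An empty vector for a variable means its boxplot draws no points beyond the whiskers.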

