The data set airquality is one of R’s included data sets. It shows daily measurements of ozone concentration (Ozone), solar radiation (Solar.R), wind speed (Wind), and temperature (Temp) for 5 summer months in 1973 in New York City. Some of the observations are missing and are recorded as NA, meaning not available. View an overall summary of the variables in airquality with the command
> summary(airquality)
Ignore the summaries for Month and Day since those variables should be factors, not numeric variables, and their summaries are meaningless. Attach airquality to your workspace
> attach(airquality)
and make boxplots of Ozone, Solar.R, Wind, and Temp. Comment on any noteworthy features.
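The remark about Month and Day can be checked directly. The following is a minimal sketch, not part of the exercise: it works on a copy named aq (a hypothetical name chosen here) so that the built-in data set is left untouched, and converts the two columns to factors before summarising.

```r
# Work on a copy; `aq` is an illustrative name, not from the exercise.
aq <- transform(airquality,
                Month = factor(Month),
                Day   = factor(Day))

# For factor columns summary() reports counts per level,
# rather than a mean and quartiles.
summary(aq[, c("Month", "Day")])
```

With factors, the summary shows how many days were recorded in each month instead of a meaningless average month number.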
#############################
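attach(airquality)   # requested by the exercise; columns are still referenced explicitly below
head(airquality)     # preview the first rows instead of printing the whole data set
# Keep only the four measurement variables (Month and Day are left out)
data <- data.frame(airquality$Ozone, airquality$Solar.R,
                   airquality$Wind, airquality$Temp)
summary(data)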
###############################
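par(mfrow = c(2, 2))   # 2 x 2 grid, one panel per variable
# boxplot() ignores NA values; titles identify each panel
boxplot(airquality$Ozone, main = "Ozone")
boxplot(airquality$Solar.R, main = "Solar.R")
boxplot(airquality$Wind, main = "Wind")
boxplot(airquality$Temp, main = "Temp")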
###############################
output:
> summary(data)
 airquality.Ozone airquality.Solar.R airquality.Wind  airquality.Temp
 Min.   :  1.00   Min.   :  7.0      Min.   : 1.700   Min.   :56.00
 1st Qu.: 18.00   1st Qu.:115.8      1st Qu.: 7.400   1st Qu.:72.00
 Median : 31.50   Median :205.0      Median : 9.700   Median :79.00
 Mean   : 42.13   Mean   :185.9      Mean   : 9.958   Mean   :77.88
 3rd Qu.: 63.25   3rd Qu.:258.8      3rd Qu.:11.500   3rd Qu.:85.00
 Max.   :168.00   Max.   :334.0      Max.   :20.700   Max.   :97.00
 NA's   :37       NA's   :7
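The boxplots show outliers in the Ozone and Wind data. Ozone is strongly right-skewed (mean 42.13 versus median 31.50), and its largest values, up to 168, lie well above the upper whisker limit of roughly 63.25 + 1.5 * (63.25 - 18.00), about 131. Wind has a few high values, up to 20.7, beyond its limit of about 11.5 + 1.5 * (11.5 - 7.4), about 17.7. Solar.R and Temp have no points beyond the whiskers, although Solar.R is skewed to the left (mean 185.9 below median 205.0). Ozone also has 37 missing values and Solar.R has 7, which boxplot() leaves out.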
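The outlier observation can also be checked numerically rather than read off the plot. The sketch below uses boxplot.stats() from base R, which returns the points lying more than 1.5 times the interquartile range beyond the hinges, the same rule boxplot() uses by default. The names vars and outliers are illustrative, not part of the original answer.

```r
# The four measurement variables from the exercise.
vars <- c("Ozone", "Solar.R", "Wind", "Temp")

# For each variable, collect the values that boxplot() would draw
# as individual points; boxplot.stats() ignores NA values.
outliers <- lapply(airquality[vars], function(x) boxplot.stats(x)$out)

outliers            # flagged values, per variable
lengths(outliers)   # number of flagged values, per variable
```

Variables with a count of zero have no points beyond the whiskers, so only those with a positive count need a closer look.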