The ozone dataset gives a sample of ozone measurements (in partial pressures) taken over the South Pole on September 18, 1997 at altitudes above 10km.
mPa:
1.828, 2.652, 3.307, 3.855, 4.021, 4.173, 4.316, 4.951, 4.787, 4.554, 4.333, 4.039, 4.48, 4.739, 4.791, 5.213, 5.75, 5.79, 5.656, 5.464, 5.092, 3.135, 2.754, 3.14, 5.213, 5.831, 5.736, 5.333, 4.797, 6.704, 6.493, 5.746, 6.31, 6.53, 6.63, 6.071, 5.706, 4.951, 4.392, 4.619, 5.029, 5.302, 5.151, 3.474, 3.285, 3.232, 3.4, 3.503, 3.649, 3.828, 4.235, 4.781, 5.096, 5.262, 5.411, 5.439, 5.08, 4.719, 4.519
1. Compute the five-number summary and sketch the boxplot. Identify any outliers.
2. Compute the mean and standard deviation of the sample.
3. Construct the probability plot of the data. A hypothesis test could be used to compare the mean ozone level on September 18, 1997 to a specified baseline level.
4. Are the assumptions for a hypothesis test of the mean reasonably satisfied?
5. Test to see if the mean ozone amount on September 18 was below 5 mPa, at the 5% level of significance.
Year | Ozone hole size (10^6 km^2)
1979 | 2.23
1980 | 1.88
1981 | 1.70
1982 | 3.77
1983 | 6.24
1984 | 8.66
1985 | 12.57
1986 | 9.58
1987 | 18.18
1988 | 8.75
1989 | 17.75
1990 | 17.86
1991 | 18.13
1992 | 21.28
1993 | 22.81
1994 | 22.82
6. Construct a line graph of the data in the table. Note any trends that you see.
7. Does it make sense to apply inferential methods (of the type we have studied) to the mean size of the ozone hole over time? Explain.
In addition to plotting the ozone concentration versus time, create a linear regression model for it, interpret the slope, and comment on how well the model fits the data using the coefficient of determination (R^2).
(If possible, include R code with the answer to each question.)
Solution-1:
R code:
ozone <- c(1.828, 2.652, 3.307, 3.855, 4.021, 4.173, 4.316, 4.951, 4.787, 4.554, 4.333, 4.039, 4.48, 4.739, 4.791, 5.213, 5.75, 5.79, 5.656, 5.464, 5.092, 3.135, 2.754, 3.14, 5.213, 5.831, 5.736, 5.333, 4.797, 6.704, 6.493, 5.746, 6.31, 6.53, 6.63, 6.071, 5.706, 4.951, 4.392, 4.619, 5.029, 5.302, 5.151, 3.474, 3.285, 3.232, 3.4, 3.503, 3.649, 3.828, 4.235, 4.781, 5.096, 5.262, 5.411, 5.439, 5.08, 4.719, 4.519)
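length(ozone)   # sample size, n = 59
print(ozone)
par(mfrow = c(1, 2))   # split the plotting device into 1 row x 2 columns
boxplot(ozone, main = "Boxplot for ozone sample")
# Overlay the five-number summary as horizontal reference lines
abline(h = min(ozone), col = "blue")
abline(h = max(ozone), col = "yellow")
abline(h = median(ozone), col = "green")
abline(h = quantile(ozone, c(0.25, 0.75)), col = "red")
fivenum(ozone)   # minimum, Q1, median, Q3, maximum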
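[Figure: boxplot of the ozone sample]
Output (five-number summary, in mPa):
minimum = 1.828
Q1 = 4.030
median (Q2) = 4.791
Q3 = 5.425
maximum = 6.704
The boxplot shows one outlier: the minimum, 1.828 mPa. With IQR = 5.425 - 4.030 = 1.395, the lower fence is Q1 - 1.5 × IQR = 4.030 - 2.0925 = 1.9375, and 1.828 lies below it. No value exceeds the upper fence Q3 + 1.5 × IQR = 7.5175.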
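The outlier read off the boxplot can be checked numerically with the usual 1.5 × IQR rule. The following is a minimal sketch, assuming the hinges returned by fivenum() are used as Q1 and Q3 (this matches what boxplot() draws); the names five, iqr, lower_fence, upper_fence and outliers are illustrative and not part of the original solution.

```r
# Ozone sample (mPa) from the question
ozone <- c(1.828, 2.652, 3.307, 3.855, 4.021, 4.173, 4.316, 4.951, 4.787, 4.554,
           4.333, 4.039, 4.48, 4.739, 4.791, 5.213, 5.75, 5.79, 5.656, 5.464,
           5.092, 3.135, 2.754, 3.14, 5.213, 5.831, 5.736, 5.333, 4.797, 6.704,
           6.493, 5.746, 6.31, 6.53, 6.63, 6.071, 5.706, 4.951, 4.392, 4.619,
           5.029, 5.302, 5.151, 3.474, 3.285, 3.232, 3.4, 3.503, 3.649, 3.828,
           4.235, 4.781, 5.096, 5.262, 5.411, 5.439, 5.08, 4.719, 4.519)

five <- fivenum(ozone)             # minimum, lower hinge, median, upper hinge, maximum
iqr  <- five[4] - five[2]          # spread between the hinges

lower_fence <- five[2] - 1.5 * iqr # values below this are flagged
upper_fence <- five[4] + 1.5 * iqr # values above this are flagged

outliers <- ozone[ozone < lower_fence | ozone > upper_fence]
print(c(lower = lower_fence, upper = upper_fence))
print(outliers)

# boxplot.stats() applies the same rule; flagged points are in $out
print(boxplot.stats(ozone)$out)
```

Only the smallest observation, 1.828, falls outside the fences, which agrees with the single point plotted beyond the lower whisker.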
Solution-2:
R code:
mean(ozone)
sd(ozone)
Output:
mean=4.716559
standard deviation=1.073852
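As a cross-check, both statistics can be computed directly from their definitions. This is a small sketch, assuming the sample standard deviation with the n - 1 denominator (the convention sd() uses); the names n, xbar, s and se are illustrative.

```r
# Ozone sample (mPa) from the question
ozone <- c(1.828, 2.652, 3.307, 3.855, 4.021, 4.173, 4.316, 4.951, 4.787, 4.554,
           4.333, 4.039, 4.48, 4.739, 4.791, 5.213, 5.75, 5.79, 5.656, 5.464,
           5.092, 3.135, 2.754, 3.14, 5.213, 5.831, 5.736, 5.333, 4.797, 6.704,
           6.493, 5.746, 6.31, 6.53, 6.63, 6.071, 5.706, 4.951, 4.392, 4.619,
           5.029, 5.302, 5.151, 3.474, 3.285, 3.232, 3.4, 3.503, 3.649, 3.828,
           4.235, 4.781, 5.096, 5.262, 5.411, 5.439, 5.08, 4.719, 4.519)

n    <- length(ozone)                         # sample size
xbar <- sum(ozone) / n                        # sample mean
s    <- sqrt(sum((ozone - xbar)^2) / (n - 1)) # sample standard deviation
se   <- s / sqrt(n)                           # standard error of the mean

print(c(n = n, mean = xbar, sd = s, se = se))
```

The standard error computed here is the quantity that later appears in the denominator of a one-sample t statistic.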
Solution-3:
qqnorm(ozone)
qqline(ozone)
In the normal Q-Q plot the points lie close to the reference line, so the ozone sample is approximately normally distributed.
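How closely the Q-Q points follow a straight line can also be summarised with a single number: the correlation between the theoretical and sample quantiles. The sketch below is an optional supplement rather than part of the original solution; the names qq, r_qq and theo are illustrative, and no particular cutoff for the correlation is assumed.

```r
# Ozone sample (mPa) from the question
ozone <- c(1.828, 2.652, 3.307, 3.855, 4.021, 4.173, 4.316, 4.951, 4.787, 4.554,
           4.333, 4.039, 4.48, 4.739, 4.791, 5.213, 5.75, 5.79, 5.656, 5.464,
           5.092, 3.135, 2.754, 3.14, 5.213, 5.831, 5.736, 5.333, 4.797, 6.704,
           6.493, 5.746, 6.31, 6.53, 6.63, 6.071, 5.706, 4.951, 4.392, 4.619,
           5.029, 5.302, 5.151, 3.474, 3.285, 3.232, 3.4, 3.503, 3.649, 3.828,
           4.235, 4.781, 5.096, 5.262, 5.411, 5.439, 5.08, 4.719, 4.519)

# qqnorm() returns the plotted coordinates without drawing when plot.it = FALSE:
# x = theoretical normal quantiles, y = the data
qq <- qqnorm(ozone, plot.it = FALSE)

# The theoretical quantiles are standard normal quantiles at the plotting positions
theo <- qnorm(ppoints(length(ozone)))

# Correlation between theoretical and sample quantiles; values near 1 mean
# the points lie close to a straight line
r_qq <- cor(qq$x, qq$y)
print(r_qq)
```

A correlation close to 1 is consistent with approximate normality, but the plot itself should still be inspected for curvature or isolated points in the tails.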
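Solution-4:
Yes, the assumptions are reasonably satisfied:
the sample is approximately normally distributed (see the Q-Q plot), and with n = 59 the t-test is robust to mild departures from normality;
the sample is treated as a random sample;
the observations are treated as independent.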
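Question 5 can then be addressed with a one-sample, lower-tailed t-test of H0: mu = 5 against Ha: mu < 5 at the 5% level. The following is a sketch rather than part of the original solution; it assumes the conditions for the t-test hold, and the names res, t_manual, t_crit and reject are illustrative.

```r
# Ozone sample (mPa) from the question
ozone <- c(1.828, 2.652, 3.307, 3.855, 4.021, 4.173, 4.316, 4.951, 4.787, 4.554,
           4.333, 4.039, 4.48, 4.739, 4.791, 5.213, 5.75, 5.79, 5.656, 5.464,
           5.092, 3.135, 2.754, 3.14, 5.213, 5.831, 5.736, 5.333, 4.797, 6.704,
           6.493, 5.746, 6.31, 6.53, 6.63, 6.071, 5.706, 4.951, 4.392, 4.619,
           5.029, 5.302, 5.151, 3.474, 3.285, 3.232, 3.4, 3.503, 3.649, 3.828,
           4.235, 4.781, 5.096, 5.262, 5.411, 5.439, 5.08, 4.719, 4.519)

# H0: mu = 5   versus   Ha: mu < 5   (lower-tailed test)
res <- t.test(ozone, mu = 5, alternative = "less")
print(res)

# The same statistic from its definition: (xbar - mu0) / (s / sqrt(n))
t_manual <- (mean(ozone) - 5) / (sd(ozone) / sqrt(length(ozone)))

# 5% lower-tail critical value with n - 1 degrees of freedom
t_crit <- qt(0.05, df = length(ozone) - 1)

# Reject H0 when the p-value is below the significance level
reject <- res$p.value < 0.05
print(c(t = t_manual, critical = t_crit))
print(reject)
```

Using the values from Solution-2, t = (4.716559 - 5) / (1.073852 / sqrt(59)) is about -2.03 on 58 degrees of freedom, which is below the critical value of about -1.67. H0 is therefore rejected at the 5% level: the data support a mean ozone level below 5 mPa on September 18, 1997.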