The data below were obtained to predict the sound frequency to which a person’s ear will respond based upon the length of time a person has been exposed to a high level of noise. Here “length of exposure” is the amount of time in weeks that a person has been living close to a major airport and the “hearing range” is reported in thousand cycles per second.
Length of exposure (weeks) | Hearing range (thousand cycles per second)
47 | 15.1
56 | 14.1
116 | 13.2
178 | 12.7
19 | 14.6
75 | 13.8
160 | 11.9
31 | 14.8
12 | 15.3
164 | 12.6
43 | 14.7
74 | 14.0
a. Find the correlation coefficient r.
b. Identify x, the explanatory variable, and y, the response variable, and construct a scatter plot of y vs. x.
c. Determine the least squares equation that can be used for predicting a value of y based on a value of x.
d. Does the slope in (c) differ significantly from 0? Use alpha = 0.05.
e. Use your model to predict y when x is i) 5, ii) 100.
f. Provide a 95% confidence interval for the mean in (e, ii).
g. Provide a 95% prediction interval for (e, ii).
h. Comment on your findings in (f) and (g).
i. What were the basic assumptions for the model in (c)?
j. Would you consider the model in (b) to be a good model? Why or why not?
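# code starts here
# x = length of exposure (weeks), y = hearing range (thousand cycles per second)
x <- c(47, 56, 116, 178, 19, 75, 160, 31, 12, 164, 43, 74)
y <- c(15.1, 14.1, 13.2, 12.7, 14.6, 13.8, 11.9, 14.8, 15.3, 12.6, 14.7, 14)
# fit the least squares line y = b0 + b1 * x
model <- lm(y ~ x)
summary(model)
# 95% confidence interval for the mean response at x = 100
predict(model, data.frame(x = 100), interval = "confidence")
# 95% prediction interval for a single new observation at x = 100
predict(model, data.frame(x = 100), interval = "prediction")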
# output from running the code above
> summary(model)

Call:
lm(formula = y ~ x)

Residuals:
     Min       1Q   Median       3Q      Max
-0.62191 -0.21749 -0.00311  0.15810  0.60064

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 15.321834   0.184512  83.040 1.57e-15 ***
x           -0.017499   0.001865  -9.381 2.85e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3645 on 10 degrees of freedom
Multiple R-squared:  0.898, Adjusted R-squared:  0.8878
F-statistic: 88 on 1 and 10 DF,  p-value: 2.847e-06

> predict(model,data.frame(x =100), interval= "confidence" )
       fit      lwr      upr
1 13.57188 13.32482 13.81895
> predict(model,data.frame(x =100), interval= "prediction" )
       fit      lwr      upr
1 13.57188 12.72298 14.42078
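The posted output does not directly show the quantities asked for in parts (a), (b), (d) and (e, i). A minimal sketch of how they could be obtained from the same data follows; the variable names r, slope_row and pred are illustrative and not part of the original answer.

```r
# Data from the question
x <- c(47, 56, 116, 178, 19, 75, 160, 31, 12, 164, 43, 74)
y <- c(15.1, 14.1, 13.2, 12.7, 14.6, 13.8, 11.9, 14.8, 15.3, 12.6, 14.7, 14.0)

model <- lm(y ~ x)

# (a) Pearson correlation coefficient between exposure and hearing range
r <- cor(x, y)
print(r)

# (b) Scatter plot of y vs x with the fitted least squares line
plot(x, y,
     xlab = "Length of exposure (weeks)",
     ylab = "Hearing range (thousand cycles per second)")
abline(model)

# (d) t test for the slope: estimate, standard error, t value, p-value
slope_row <- coef(summary(model))["x", ]
print(slope_row)

# (e) Point predictions at x = 5 and x = 100
pred <- predict(model, newdata = data.frame(x = c(5, 100)))
print(pred)
```

Since R^2 = 0.898 and the slope is negative, r should come out near -0.948, and the prediction at x = 100 should agree with the fitted value 13.57188 reported by predict(). Note that x = 5 lies below the smallest observed exposure (12 weeks), so that prediction is an extrapolation.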
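f)
95% confidence interval for the mean hearing range at x = 100: (13.3248, 13.8190)
g)
95% prediction interval for an individual hearing range at x = 100: (12.7230, 14.4208)
h)
The prediction interval is wider than the confidence interval. Both are centered at the same fitted value, 13.5719, but the confidence interval reflects only the uncertainty in the estimated mean response at x = 100, while the prediction interval also includes the variability of an individual observation around that mean.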
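To see where the difference in width comes from, the two intervals at x = 100 can be rebuilt by hand from the standard simple linear regression formulas. This is a sketch for illustration only; names such as se_mean, se_pred, ci_manual and pi_manual are assumptions, not part of the original answer.

```r
# Data from the question
x <- c(47, 56, 116, 178, 19, 75, 160, 31, 12, 164, 43, 74)
y <- c(15.1, 14.1, 13.2, 12.7, 14.6, 13.8, 11.9, 14.8, 15.3, 12.6, 14.7, 14.0)

model <- lm(y ~ x)

n     <- length(x)
x0    <- 100                          # point at which to predict
s     <- sigma(model)                 # residual standard error
Sxx   <- sum((x - mean(x))^2)         # sum of squared deviations of x
fit   <- sum(coef(model) * c(1, x0))  # fitted value b0 + b1 * x0
tcrit <- qt(0.975, df = n - 2)        # two-sided 95% critical value

# Standard error of the estimated mean response at x0
se_mean <- s * sqrt(1 / n + (x0 - mean(x))^2 / Sxx)

# Standard error for a single new observation at x0:
# the extra "1 +" term is the variance of an individual error
se_pred <- s * sqrt(1 + 1 / n + (x0 - mean(x))^2 / Sxx)

ci_manual <- fit + c(-1, 1) * tcrit * se_mean
pi_manual <- fit + c(-1, 1) * tcrit * se_pred

print(ci_manual)
print(pi_manual)
```

The only difference between the two standard errors is the additional 1 under the square root, which is why the prediction interval is always the wider of the two at any given x.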
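i)
The model in (c) is a simple linear regression, which assumes that the mean of y is a linear function of x and that the errors are independent, normally distributed, and have constant variance.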
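Regression assumptions are usually checked through the residuals of the fitted model. The sketch below shows one possible set of checks (a residuals-vs-fitted plot, a normal Q-Q plot, and a Shapiro-Wilk test); the choice of diagnostics and the names res and sw are assumptions for illustration, not part of the original answer.

```r
# Data from the question
x <- c(47, 56, 116, 178, 19, 75, 160, 31, 12, 164, 43, 74)
y <- c(15.1, 14.1, 13.2, 12.7, 14.6, 13.8, 11.9, 14.8, 15.3, 12.6, 14.7, 14.0)

model <- lm(y ~ x)
res   <- resid(model)

# Linearity and constant variance: residuals should scatter evenly
# around zero with no curve or funnel shape
plot(fitted(model), res,
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Normality: points should lie close to the reference line
qqnorm(res)
qqline(res)

# Formal normality check; with only 12 observations it has little power,
# so it should be read together with the plots
sw <- shapiro.test(res)
print(sw)
```

Independence of the errors cannot be verified from these plots alone; it depends on how the data were collected.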
j)
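Yes. R^2 = 0.898, so the fitted line explains about 90% of the variation in hearing range, which is a strong fit.
The p-value for the slope is 2.85e-06, well below alpha = 0.05,
hence the model is significant.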