The weights (in pounds) and ages (in months) of 35 randomly selected male bears in Yellowstone Park were recorded in a file male_bears.csv. A researcher runs the following code in R to read in the data.
> bears <- read.table(file="male_bears.csv", header=T, sep=",")
The researcher then runs the following code in R to fit a simple linear regression model of the form $y_i = \alpha + \beta x_i + \varepsilon_i$, $i = 1, 2, \ldots, n$. Note that some values in the output have been removed.
> result <- lm(WEIGHT ~ AGE, data=bears)
> summary(result)

Call:
lm(formula = WEIGHT ~ AGE, data = bears)

Residuals:
    Min      1Q  Median      3Q     Max
-204.96  -45.82  -11.69   22.78  174.33

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  73.6415    19.3023   3.815 0.000567 ***
AGE           3.2052     0.3695   8.675 **REMOVED**
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 75.13 on 33 degrees of freedom
Multiple R-squared: 0.6952, Adjusted R-squared: 0.6859
F-statistic: 75.25 on 1 and 33 DF, p-value: **REMOVED**

i) Identify the response and predictor variables in the output above. [2]
ii) Using values from the output, write out the estimated linear regression equation. [2]
iii) Provide an interpretation of the coefficient of the variable AGE in the output. [1]
iv) Write out the hypotheses being tested by the t-statistic circled in the output. [1]
v) What are your conclusions about the hypotheses in (iv)? Use statistical tables to arrive at your conclusion. [3]
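As a cross-check on the coefficient table, the sketch below recomputes each t value as the estimate divided by its standard error and compares it with a critical value of the kind a t table supplies. The variable names are illustrative, the numbers are copied from the `summary(result)` output, and the 5% two-sided level is an assumption, since the question does not fix a significance level.

```r
# Values copied from the summary(result) output
est <- c(Intercept = 73.6415, AGE = 3.2052)   # coefficient estimates
se  <- c(Intercept = 19.3023, AGE = 0.3695)   # standard errors
df_resid <- 33                                # residual degrees of freedom

# Each t value is the estimate divided by its standard error
t_val <- est / se

# Two-sided critical value at the 5% level (assumed level, not from the source)
t_crit <- qt(0.975, df = df_resid)

# Two-sided p-values from the t distribution with 33 degrees of freedom
p_val <- 2 * pt(abs(t_val), df = df_resid, lower.tail = FALSE)

print(round(t_val, 3))
print(round(t_crit, 3))
print(signif(p_val, 3))
```

Because the inputs are the rounded values printed by R, the recomputed t value for AGE can differ from the printed 8.675 in the last decimal place. A printed table will normally list 30 or 40 degrees of freedom rather than 33, so the tabulated critical value will be close to, but not exactly, the `qt()` result.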
The researcher then produced the following analysis of variance for the regression model.
> anova(result)
Analysis of Variance Table

Response: WEIGHT
          Df Sum Sq Mean Sq F value    Pr(>F)
AGE        1 424736  424736  75.251 5.017e-10 ***
Residuals 33 186260    5644
vi) Using values from the analysis of variance output, compute the coefficient of determination for the regression model. [2]
vii) Provide an interpretation of the coefficient of determination. [1]
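The quantities in the analysis of variance table are linked by simple arithmetic, which the following sketch illustrates. The variable names are illustrative and the numbers are copied from the `anova(result)` output; the total sum of squares is taken as the sum of the regression and residual sums of squares.

```r
# Sums of squares and degrees of freedom copied from the anova(result) table
ss_reg <- 424736   # Sum Sq for AGE (regression)
ss_res <- 186260   # Sum Sq for Residuals
df_reg <- 1
df_res <- 33

# Total variation in WEIGHT about its mean
ss_tot <- ss_reg + ss_res

# Coefficient of determination: share of total variation explained by AGE
r_squared <- ss_reg / ss_tot

# F statistic: regression mean square over residual mean square
f_value <- (ss_reg / df_reg) / (ss_res / df_res)

# Residual standard error: square root of the residual mean square
rse <- sqrt(ss_res / df_res)

print(round(r_squared, 4))
print(round(f_value, 2))
print(round(rse, 2))
```

These values can be compared with the Multiple R-squared, F-statistic and residual standard error lines of the `summary(result)` output.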
Some summary statistics for the variable AGE are provided below.
> summary(bears$AGE)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   8.00   16.50   32.00   39.34   53.00  177.00
viii) Is the estimated regression equation in (ii) suitable for predicting the weight of a male bear that is 100 months old? Give a reason for your answer. If your answer is yes, find the predicted weight of a 100-month-old male bear. [3]
ix) Is the estimated regression equation in (ii) suitable for predicting the weight of a female bear who is 100 months old? Give a reason for your answer. [2]
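One way to examine a prediction request is to check whether the new age lies inside the range of ages observed in the sample, and then evaluate the fitted line at that age. The sketch below does this with illustrative variable names, using the estimates from `summary(result)` and the minimum and maximum from `summary(bears$AGE)`. It only checks for extrapolation in AGE; it says nothing about whether the sampled population (male bears) matches the individual being predicted.

```r
# Coefficient estimates copied from the summary(result) output
alpha_hat <- 73.6415   # intercept
beta_hat  <- 3.2052    # slope for AGE

# Observed range of AGE copied from summary(bears$AGE)
age_min <- 8
age_max <- 177

new_age <- 100   # age in months at which a prediction is wanted

# TRUE when the prediction would be an interpolation within the observed ages
in_range <- new_age >= age_min && new_age <= age_max

# Point prediction from the fitted line, in pounds
pred_weight <- alpha_hat + beta_hat * new_age

print(in_range)
print(pred_weight)
```

With the original data frame available, `predict(result, newdata = data.frame(AGE = 100))` would return the same point prediction from the unrounded coefficients.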
The researcher then produced the following diagnostic plots in R.
> qqnorm(result$residuals)
> qqline(result$residuals)
> plot(result$fitted, result$residuals)
x) For each of the plots above, explain the specific purpose of the plot. [2]
xi) State your conclusions made from observing each plot. [2]