The invasive diatom species Didymosphenia geminata has the potential to inflict substantial ecological and economic damage in rivers. An article described an investigation of colonization behavior. One aspect of particular interest was whether y = colony density was related to x = rock surface area. The article contained a scatterplot and summary of a regression analysis. Here are representative data.
x | 50 | 71 | 55 | 50 | 33 | 58 | 79 | 26 |
y | 159 | 1936 | 55 | 29 | 9 | 12 | 42 | 14 |
x | 69 | 44 | 37 | 70 | 20 | 45 | 49 |
y | 276 | 45 | 178 | 20 | 50 | 192 | 32 |
(a) Fit the simple linear regression model to this data. (Round your numerical values to three decimal places.)
y =
Predict colony density when surface area = 70 and calculate the
corresponding residual. (Round your answers to the nearest whole
number.)
colony density | |
corresponding residual |
Predict colony density when surface area = 71 and calculate the
corresponding residual. (Round your answers to the nearest whole
number.)
colony density | |
corresponding residual |
How do the residuals compare?
The residuals for both points are positive.
The residual for the first point is negative, while the residual for the second point is positive.
The residual for the first point is positive, while the residual for the second point is negative.
The residuals for both points are negative.
(b) Calculate the coefficient of determination. (Round your answer
to three decimal places.)
Interpret the coefficient of determination.
The coefficient of determination is the proportion of the total variation in rock surface area that can be explained by a linear regression model with colony density as the predictor.
The coefficient of determination is the increase in rock surface area due to an increase in one unit of rock surface area.
The coefficient of determination is the increase in colony density due to an increase in one unit of colony density.
The coefficient of determination is the probability that the regression model fits the data.
The coefficient of determination is the proportion of the total variation in colony density that can be explained by a linear regression model with rock surface area as the predictor.
(c) The second observation has a very extreme y value (in
the full data set consisting of 72 observations, there were two of
these). This observation may have had a substantial impact on the
fit of the model and subsequent conclusions. Eliminate it and
recalculate the equation of the estimated regression line. (Round
your values to three decimal places.)
y =
a) From the data above:
ŷ = -298.881 + 9.963x
For x = 70:
predicted value = -298.881 + 70(9.963) = 398.529 ≈ 399
residual = 20 - 399 = -379
For x = 71:
predicted value = -298.881 + 71(9.963) = 408.492 ≈ 408
residual = 1936 - 408 = 1528
The residual for the first point is negative, while the residual for the second point is positive.
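The arithmetic in part (a) can be checked with a short script. This is a minimal sketch in plain Python, not the method used in the article; `fit_line` and `predict` are hypothetical helper names introduced here. It assumes, as the worked answer does, that the coefficients are rounded to three decimal places before predicting.

```python
# Data transcribed from the table in the question.
x = [50, 71, 55, 50, 33, 58, 79, 26, 69, 44, 37, 70, 20, 45, 49]
y = [159, 1936, 55, 29, 9, 12, 42, 14, 276, 45, 178, 20, 50, 192, 32]


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line (hypothetical helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in xs)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


b0, b1 = fit_line(x, y)

# Assumption: round the coefficients first, matching the worked answer.
b0_r, b1_r = round(b0, 3), round(b1, 3)


def predict(x_value):
    """Predicted colony density from the rounded coefficients."""
    return b0_r + b1_r * x_value


pred_70 = predict(70)
resid_70 = 20 - pred_70      # observed y at x = 70 is 20
pred_71 = predict(71)
resid_71 = 1936 - pred_71    # observed y at x = 71 is 1936

print(f"y-hat = {b0_r} + {b1_r}x")
print(round(pred_70), round(resid_70))
print(round(pred_71), round(resid_71))
```

If the unrounded coefficients are used instead, the prediction at x = 71 comes out at about 408.51, which rounds to 409, so the whole-number answers depend on when the rounding is applied.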
b)
SST = Σ(yi - ȳ)² = 3310340.933
SSR = Σ(ŷi - ȳ)² = 409533.571
SSE = Σ(yi - ŷi)² = 2900807.3626
Coefficient of determination: R² = SSR/SST = 409533.571/3310340.933 ≈ 0.124
The coefficient of determination is the proportion of the total variation in colony density that can be explained by a linear regression model with rock surface area as the predictor.
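The sums of squares in part (b) can be reproduced as follows. This is a sketch under the same assumptions as the hand calculation (ordinary least squares on the 15 listed points); the variable names are illustrative and not taken from the article.

```python
# Data transcribed from the table in the question.
x = [50, 71, 55, 50, 33, 58, 79, 26, 69, 44, 37, 70, 20, 45, 49]
y = [159, 1936, 55, 29, 9, 12, 42, 14, 276, 45, 178, 20, 50, 192, 32]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope and intercept.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Fitted values, then the three sums of squares.
fitted = [intercept + slope * xi for xi in x]
sst = sum((yi - y_bar) ** 2 for yi in y)                 # total variation
ssr = sum((fi - y_bar) ** 2 for fi in fitted)            # explained by the line
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # left unexplained

r_squared = ssr / sst

print(round(sst, 3), round(ssr, 3), round(sse, 3))
print(round(r_squared, 3))
```

Because SST = SSR + SSE for a least-squares line with an intercept, 1 - SSE/SST gives the same value of R².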
c)
With the second observation (x = 71, y = 1936) removed, the estimated regression line is
ŷ = 41.373 + 0.779x
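The effect of dropping the extreme observation can be checked by refitting on the remaining 14 points. This is a minimal sketch; `fit_line` is a hypothetical helper, and the second observation is identified by its position in the table (index 1 when counting from zero).

```python
# Data transcribed from the table in the question.
x = [50, 71, 55, 50, 33, 58, 79, 26, 69, 44, 37, 70, 20, 45, 49]
y = [159, 1936, 55, 29, 9, 12, 42, 14, 276, 45, 178, 20, 50, 192, 32]


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line (hypothetical helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in xs)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


# Drop the second observation (x = 71, y = 1936), which sits at index 1.
x_trimmed = x[:1] + x[2:]
y_trimmed = y[:1] + y[2:]

full_b0, full_b1 = fit_line(x, y)
trim_b0, trim_b1 = fit_line(x_trimmed, y_trimmed)

print(f"all 15 points:   y-hat = {full_b0:.3f} + {full_b1:.3f}x")
print(f"without (71, 1936): y-hat = {trim_b0:.3f} + {trim_b1:.3f}x")
```

The slope falls from about 9.963 to about 0.779 once the point is removed, which illustrates how strongly this single observation influences the fit.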