A data set, entitled oldfaithful_asst, is provided on the duration and height of eruptions of the Old Faithful geyser in Yellowstone National Park.
The Old Faithful data set used to answer the problem:
Duration | Height |
240 | 140 |
237 | 140 |
250 | 148 |
243 | 130 |
255 | 125 |
120 | 110 |
260 | 136 |
178 | 125 |
259 | 115 |
245 | 120 |
234 | 120 |
213 | 120 |
255 | 150 |
235 | 140 |
250 | 136 |
110 | 120 |
245 | 148 |
269 | 130 |
251 | 130 |
234 | 136 |
252 | 130 |
254 | 115 |
273 | 136 |
266 | 130 |
284 | 138 |
252 | 120 |
269 | 120 |
250 | 120 |
261 | 95 |
253 | 140 |
255 | 125 |
280 | 130 |
270 | 130 |
241 | 110 |
272 | 110 |
294 | 125 |
440 | 250 |
220 | 150 |
253 | 130 |
245 | 120 |
274 | 95 |
a)
b)
First, the data point (440, 250) was removed as an outlier.
Across the remaining points, Duration shows no systematic change as Height increases.
This pattern indicates that there is no clear linear relationship between the two variables.
c)
Pearson correlation of Duration and Height: r = 0.092
The correlation indicates a very weak positive linear relationship between the two variables.
Coefficient of determination: r^2 = 0.092^2 = 0.008464
In terms of the coefficient of determination, less than 1% (about 0.8%) of the variability in Duration is explained by Height.
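
The correlation in part c) can be re-checked with a short script. This is a minimal sketch in plain Python, assuming the table above is the complete data set and that only the point (440, 250) is dropped, as in part b); the names `data`, `kept` and `pearson_r` are chosen here for illustration.

```python
from math import sqrt

# (Duration, Height) pairs copied from the table above, in order.
data = [
    (240, 140), (237, 140), (250, 148), (243, 130), (255, 125), (120, 110),
    (260, 136), (178, 125), (259, 115), (245, 120), (234, 120), (213, 120),
    (255, 150), (235, 140), (250, 136), (110, 120), (245, 148), (269, 130),
    (251, 130), (234, 136), (252, 130), (254, 115), (273, 136), (266, 130),
    (284, 138), (252, 120), (269, 120), (250, 120), (261, 95), (253, 140),
    (255, 125), (280, 130), (270, 130), (241, 110), (272, 110), (294, 125),
    (440, 250), (220, 150), (253, 130), (245, 120), (274, 95),
]

# Drop the single point treated as an outlier in part b).
kept = [pair for pair in data if pair != (440, 250)]


def pearson_r(pairs):
    """Return the Pearson correlation of the two coordinates of `pairs`."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)


r = pearson_r(kept)
print(f"n = {len(kept)}, r = {r:.3f}, r^2 = {r * r:.4f}")
```

Squaring the unrounded r gives a slightly smaller value than squaring the rounded 0.092, which is why the percentage is best quoted as "about 0.8%".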
d)
The fitted regression equation is
Duration = 212.8 + 0.253 Height
The slope coefficient is 0.253: for each one-unit increase in Height, the predicted Duration increases by 0.253 units on average.
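
For reference, the coefficients come from the standard least-squares formulas, written here with D for Duration and H for Height. The sample means used in the check are computed from the 40 retained points and are not quoted in the original output.

$$
b_1 = \frac{\sum_i (H_i - \bar{H})(D_i - \bar{D})}{\sum_i (H_i - \bar{H})^2},
\qquad
b_0 = \bar{D} - b_1 \bar{H}
$$

With $\bar{H} = 127.2$ and $\bar{D} = 245.025$, the intercept is $b_0 \approx 245.025 - 0.253 \times 127.2 \approx 212.8$, which agrees with the equation above.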
f)
The predicted value of Duration when Height = 115 is 212.8 + 0.253 × 115 = 241.895.
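
The same arithmetic as a small helper, in case predictions are needed at other heights. This is only a sketch using the rounded coefficients from part d); the function name `predict_duration` is hypothetical. Height = 115 lies inside the observed range of heights (95 to 150), so this prediction is an interpolation.

```python
# Rounded coefficients of the fitted line reported in part d).
INTERCEPT = 212.8
SLOPE = 0.253


def predict_duration(height):
    """Predicted Duration for a given Height, using the fitted line."""
    return INTERCEPT + SLOPE * height


print(round(predict_duration(115), 3))
```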