I ONLY NEED PART "e" and "g", parts "a" to "d" have been answered and included for reference.
3. Here are the world record times for women in the 10,000-meter run over several years.
Record Year | Time (in seconds)
1979 | 1972.5
1981 | 1950.8
1981 | 1972.5
1981 | 1937.2
1982 | 1895.3
1983 | 1895.0
1983 | 1887.6
1984 | 1873.8
1985 | 1859.4
1986 | 1813.7
The explanatory variable is Record Year and the response variable is Time.
form = linear, direction = negative, strength = strong, outliers = No
y = -23.301 x + 48,100.478
For every additional year, the model predicts that the record Time decreases by about 23.301 seconds.
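For reference, the equation in part (c) can be reproduced with a short least-squares calculation. This is only a sketch in plain Python; the helper name fit_line is my own and is not part of the original answer, and the data are copied from the table in the question.

```python
# Record years and times (seconds), copied from the table in the question.
years = [1979, 1981, 1981, 1981, 1982, 1983, 1983, 1984, 1985, 1986]
times = [1972.5, 1950.8, 1972.5, 1937.2, 1895.3,
         1895.0, 1887.6, 1873.8, 1859.4, 1813.7]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept.

    fit_line is a hypothetical helper name used only for this illustration.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


slope, intercept = fit_line(years, times)
print(f"y = {slope:.3f} x + {intercept:.3f}")
```

The printed coefficients should agree with the equation in part (c) to three decimal places.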
e. What percent of the observed variation in the record times can be explained by your model?
g. The record from 1986 was broken in 1993, with a time of 1771.8 seconds. The record from 1993 was finally broken in 2016, with a time of 1757.5 seconds. Make a new scatterplot of these data, including these two additional points. Describe what you see. Would you use the equation you wrote in part c. to describe this new, bigger data set? Why or why not?
(e)
The percentage of the variation in y that is explained by this model is given by R^2, the coefficient of determination.
Here R^2 = 0.9113402, so about 91.1% of the observed variation in the record times is explained by the linear model.
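If you want to check the R^2 value yourself, here is a minimal sketch in plain Python. It assumes the usual definition R^2 = 1 - SSE/SST for a least-squares line; the variable names are my own. The result should land close to the value quoted above, and small differences in the last digits can come from rounding.

```python
# Record years and times (seconds), copied from the table in the question.
years = [1979, 1981, 1981, 1981, 1982, 1983, 1983, 1984, 1985, 1986]
times = [1972.5, 1950.8, 1972.5, 1937.2, 1895.3,
         1895.0, 1887.6, 1873.8, 1859.4, 1813.7]

n = len(years)
x_bar = sum(years) / n
y_bar = sum(times) / n

# Least-squares slope and intercept.
sxx = sum((x - x_bar) ** 2 for x in years)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, times))
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Total variation in y, and the variation left over after fitting the line.
sst = sum((y - y_bar) ** 2 for y in times)
sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(years, times))

r_squared = 1 - sse / sst
print(f"R^2 = {r_squared:.4f}, i.e. about {100 * r_squared:.1f}% of the variation")
```

Multiplying R^2 by 100 gives the percent that part (e) asks for.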
(g)
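The scatter plot, including the two new observations (in black and purple colour), is shown below.
Observation: The new observations do not follow the same linear pattern as the older observations. They lie well above the line extended from the 1979 to 1986 data: the equation in part (c) predicts about 1661.6 seconds for 1993 and about 1125.7 seconds for 2016, while the actual records were 1771.8 and 1757.5 seconds.
Hence, the equation in part (c) should not be used for the bigger data set. The new observations are high-leverage points (their years are far from the rest of the data), and including them would change the slope of the fitted line considerably.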
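To see this numerically, here is a rough sketch in plain Python. It uses the rounded coefficients from part (c) for the predictions and then refits the line with the two new records added. The helper name fit_slope and the variable names are my own, for illustration only.

```python
# Original ten records from the question.
years = [1979, 1981, 1981, 1981, 1982, 1983, 1983, 1984, 1985, 1986]
times = [1972.5, 1950.8, 1972.5, 1937.2, 1895.3,
         1895.0, 1887.6, 1873.8, 1859.4, 1813.7]

# The two later records given in part (g).
new_points = {1993: 1771.8, 2016: 1757.5}

# Predictions from the part (c) equation, using its rounded coefficients.
predictions = {}
for year, actual in new_points.items():
    predicted = -23.301 * year + 48100.478
    predictions[year] = predicted
    print(f"{year}: predicted {predicted:.1f} s, actual {actual} s, "
          f"residual {actual - predicted:.1f} s")


def fit_slope(xs, ys):
    """Least-squares slope (hypothetical helper for this illustration)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx


old_slope = fit_slope(years, times)
new_slope = fit_slope(years + list(new_points.keys()),
                      times + list(new_points.values()))
print(f"slope with 10 points: {old_slope:.2f} s/year")
print(f"slope with 12 points: {new_slope:.2f} s/year")
```

Both new residuals are large and positive, and the refitted slope is much less steep than the original one, which is why the part (c) equation is a poor description of the bigger data set.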
Hope this was useful. Please leave a comment if you have any questions.