The following data give each team's winning percentage (wins as a
proportion of the 162 games in a season) together with the team's
Earned Run Average, or ERA.
The ERA is a pitching statistic. The lower the ERA, the fewer runs
an opponent scores per game. Smaller ERAs reflect (i) a good
pitching staff and (ii) a good team defense. You are to investigate
the relationship between a team's winning percentage, Y, and its
Earned Run Average (ERA), X.
Winning Proportion - Y | Earned Run Average (ERA) - X |
0.623457 | 3.13 |
0.512346 | 3.97 |
0.635802 | 3.68 |
0.604938 | 3.92 |
0.518519 | 4.00 |
0.580247 | 4.12 |
0.413580 | 4.29 |
0.407407 | 4.62 |
0.462963 | 3.89 |
0.450617 | 5.20 |
0.487654 | 4.36 |
0.456790 | 4.91 |
0.574047 | 3.75 |
(a) Using RStudio, create a scatterplot of the
data. What can you conclude from this scatterplot?
A. There is a negative linear relationship between
a team's winning percentage and its ERA.
B. There is a positive linear relationship between
a team's winning percentage and its ERA.
C. There is not a linear relationship between
a team's winning percentage and its ERA.
(b) Use RStudio to find the least squares
estimate of the linear model that expresses a team's winning
percentage as a linear function of its ERA. Use four decimals in
each of your answers.
$\hat{Y}_i$ = ____ (+ or -) ____ $X_i$
(c) Find the value of the coefficient of
determination, then complete its interpretation.
$r^2$ = ____ (use four decimals)
The percentage of [variation / standard deviation / the mean]
in [a team's winning percentage / a team's earned run average]
that is explained by its linear relationship with
[the team's winning percentage / the team's earned run average]
is ____ %.
(d) Interpret the meaning of the slope term in the
estimate of the linear model, in the context of the data.
As a team's [winning percentage / earned run average]
increases by [one percentage point / one earned run],
the team's [winning percentage / earned run average]
[will increase by an average of / will decrease by an average of /
will increase by / will decrease by] ____. (use four decimals)
(e) A certain professional baseball team had an
earned run average of 3.45 this past season. How many games out of
162 would you expect this team to win? Use two decimals in your
answer.
____ games won
(f) The team mentioned in part (e) won 91 out of 162 games.
Find the residual, using two decimals in your answer.
Solution
> Winning_Proportion=c(0.623457,0.512346,0.635802,0.604938,0.518519,0.580247,0.41358,0.407407,0.462963,0.450617,0.487654,0.45679,0.574047)
> Earned_Run_Average=c(3.13,3.97,3.68,3.92,4,4.12,4.29,4.62,3.89,5.2,4.36,4.91,3.75)
#a scatter plot (plot() returns NULL, so its result is not assigned)
> plot(Earned_Run_Average,Winning_Proportion)
A. There is a negative linear relationship between a team's winning percentage and its ERA.
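As a numerical check on what the scatterplot shows, the sample correlation can be computed from the same two vectors. This is a minimal sketch that was not part of the original solution; the variable `r` is a name introduced here for illustration, and the data are repeated so the block runs on its own.

```r
# Data copied from the table in the question.
Winning_Proportion <- c(0.623457, 0.512346, 0.635802, 0.604938, 0.518519,
                        0.580247, 0.413580, 0.407407, 0.462963, 0.450617,
                        0.487654, 0.456790, 0.574047)
Earned_Run_Average <- c(3.13, 3.97, 3.68, 3.92, 4.00, 4.12, 4.29,
                        4.62, 3.89, 5.20, 4.36, 4.91, 3.75)

# Pearson correlation between ERA and winning proportion.
r <- cor(Earned_Run_Average, Winning_Proportion)
r    # a negative value is consistent with choice A
r^2  # in simple linear regression this equals the coefficient of determination
```

A negative correlation only describes the direction of the linear association; the scatterplot is still needed to judge whether a straight line is a reasonable description of the data.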
#b
> model=lm(Winning_Proportion~Earned_Run_Average)
> model
Call:
lm(formula = Winning_Proportion ~ Earned_Run_Average)
Coefficients:
       (Intercept)  Earned_Run_Average
            0.9621             -0.1073
# Winning_Proportion = 0.9621 - 0.1073 Earned_Run_Average
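The four-decimal values requested in part (b) can also be extracted from the fitted object instead of being read off the printout. The following is a sketch, not part of the original solution; `coefs` is a name introduced for illustration, and the data are repeated so the block is self-contained.

```r
# Data copied from the table in the question.
Winning_Proportion <- c(0.623457, 0.512346, 0.635802, 0.604938, 0.518519,
                        0.580247, 0.413580, 0.407407, 0.462963, 0.450617,
                        0.487654, 0.456790, 0.574047)
Earned_Run_Average <- c(3.13, 3.97, 3.68, 3.92, 4.00, 4.12, 4.29,
                        4.62, 3.89, 5.20, 4.36, 4.91, 3.75)

# Least squares fit of winning proportion on ERA.
model <- lm(Winning_Proportion ~ Earned_Run_Average)

# Named vector: intercept first, then the slope on Earned_Run_Average.
coefs <- round(coef(model), 4)
coefs
```

The slope also addresses part (d): as a team's earned run average increases by one earned run, the team's winning percentage will decrease by an average of 0.1073 (on the proportion scale used in the table).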
#c
> summarymodel=summary(model)
> summarymodel$r.squared
[1] 0.5472422
The percentage of variation in a team's winning percentage that is explained by its linear relationship with the team's earned run average is 54.72%.
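Parts (e) and (f) are not worked in the solution above. One way to approach them, sketched here under the assumption that the fitted line from part (b) is used directly: predict the winning proportion at an ERA of 3.45 with `predict()`, scale it by 162 games, and take the residual as observed wins minus predicted wins. The names `new_team`, `expected_wins`, and `residual_wins` are introduced for illustration only.

```r
# Data copied from the table in the question.
Winning_Proportion <- c(0.623457, 0.512346, 0.635802, 0.604938, 0.518519,
                        0.580247, 0.413580, 0.407407, 0.462963, 0.450617,
                        0.487654, 0.456790, 0.574047)
Earned_Run_Average <- c(3.13, 3.97, 3.68, 3.92, 4.00, 4.12, 4.29,
                        4.62, 3.89, 5.20, 4.36, 4.91, 3.75)

model <- lm(Winning_Proportion ~ Earned_Run_Average)

# (e) Predicted winning proportion at ERA = 3.45, converted to games out of 162.
new_team <- data.frame(Earned_Run_Average = 3.45)
expected_wins <- unname(162 * predict(model, newdata = new_team))

# (f) Residual = observed wins - predicted wins, for a team that won 91 games.
residual_wins <- 91 - expected_wins

round(c(expected = expected_wins, residual = residual_wins), 2)
```

With the rounded coefficients from part (b), the same arithmetic is 162 x (0.9621 - 0.1073 x 3.45) = 162 x 0.591915, or about 95.89 games, which gives a residual of about 91 - 95.89 = -4.89. Because `predict()` uses the unrounded coefficients, its result may differ slightly in the second decimal.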