a) Plot the regression line from the full data set on the scatter plot. The regression equation is Wins = 24.5 + 0.08 Runs; mark it “ALL SEASONS”.
b) Plot the regression line from the data set without the partial seasons on the scatter plot. The regression equation is Wins = 43.3 + 0.05 Runs; mark it “ONLY FULL SEASON”. Do the partial seasons seem to be influential? Explain.
c) Using the linear regression model for “ALL SEASONS” in the Red Sox data, Wins = 24.5 + 0.08 Runs, consider the data for the year 2004 (Runs = 949, Wins = 98). Calculate the residual for this year.
d) The coefficient of determination is 67.2% for the Red Sox data. Find the linear correlation coefficient. Round your answer to 2 decimal places.
YEAR | GAMES PLAYED | RUNS | WINS
2009 | 162 | 872 | 95
2008 | 162 | 845 | 95
2007 | 162 | 867 | 96
2006 | 162 | 820 | 86
2005 | 162 | 910 | 95
2004 | 162 | 949 | 98
2003 | 162 | 961 | 95
2002 | 162 | 859 | 93
2001 | 161 | 772 | 82
2000 | 162 | 792 | 85
1999 | 162 | 836 | 94
1998 | 162 | 876 | 92
1997 | 162 | 851 | 78
1996 | 162 | 928 | 85
1995* | 144 | 791 | 86
1994* | 115 | 552 | 54
1993 | 162 | 686 | 80
1992 | 162 | 599 | 73
1991 | 162 | 731 | 84
1990 | 162 | 699 | 88
The regression lines for the two models, “ALL SEASONS” and “ONLY FULL SEASON”, are plotted in R on the scatterplot of Wins against Runs.
[Screenshots: R code and R output]
a)
The model m1 (“ALL SEASONS”) is defined as
Wins = 24.5 + 0.08 Runs
The regression line is shown in red on the scatterplot.
b)
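The model m2 (“ONLY FULL SEASON”) is defined as
Wins = 43.3 + 0.05 Runs
The regression line is shown in blue on the scatterplot.
Removing the two partial seasons (1994 and 1995, marked * in the table) lowers the slope of the regression line from 0.08 to 0.05 and raises the intercept from 24.5 to 43.3. Because dropping just these two points changes the fitted line this much, the partial seasons are influential. The 1994 season in particular (552 runs, 54 wins) lies far from the rest of the data, especially in wins, so it pulls the “ALL SEASONS” line toward itself.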
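The influence of the partial seasons can also be checked numerically. The following is a minimal sketch in Python rather than the R code used for the plot; it assumes the partial seasons are the two years marked * in the table (1994 and 1995), and the names `SEASONS`, `PARTIAL_YEARS`, and `fit_line` are illustrative only.

```python
# Season data copied from the table: (year, games played, runs, wins).
SEASONS = [
    (2009, 162, 872, 95), (2008, 162, 845, 95), (2007, 162, 867, 96),
    (2006, 162, 820, 86), (2005, 162, 910, 95), (2004, 162, 949, 98),
    (2003, 162, 961, 95), (2002, 162, 859, 93), (2001, 161, 772, 82),
    (2000, 162, 792, 85), (1999, 162, 836, 94), (1998, 162, 876, 92),
    (1997, 162, 851, 78), (1996, 162, 928, 85), (1995, 144, 791, 86),
    (1994, 115, 552, 54), (1993, 162, 686, 80), (1992, 162, 599, 73),
    (1991, 162, 731, 84), (1990, 162, 699, 88),
]

# Assumption: the partial seasons are the years marked * in the table.
PARTIAL_YEARS = {1994, 1995}


def fit_line(points):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


all_points = [(runs, wins) for _, _, runs, wins in SEASONS]
full_points = [
    (runs, wins) for year, _, runs, wins in SEASONS if year not in PARTIAL_YEARS
]

intercept_all, slope_all = fit_line(all_points)
intercept_full, slope_full = fit_line(full_points)

print(f"ALL SEASONS:      Wins = {intercept_all:.2f} + {slope_all:.3f} Runs")
print(f"ONLY FULL SEASON: Wins = {intercept_full:.2f} + {slope_full:.3f} Runs")
```

Comparing the two printed equations shows how much the slope and intercept shift when only two of the twenty seasons are dropped.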
c)
Residual = Actual - Predicted
The actual value of wins = 98,
The predicted is obtained from the regression model for runs = 949,
Wins = 24.5 + 0.08Runs
Wins = 24.5 + 0.08*949
Wins = 100.42
Residual = Actual - Predicted = 98 - 100.42 = -2.42
The negative residual means the model overpredicts the 2004 wins by about 2.4.
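As a quick check of the arithmetic, here is a minimal Python sketch of the residual calculation, using the rounded “ALL SEASONS” coefficients given in the question. The function name `predict_wins` is hypothetical.

```python
def predict_wins(runs, intercept=24.5, slope=0.08):
    """Predicted wins from the rounded ALL SEASONS model in the question."""
    return intercept + slope * runs


actual_wins = 98                    # observed wins in 2004
predicted_wins = predict_wins(949)  # 2004 runs scored

# Residual is always observed minus predicted, in that order.
residual = actual_wins - predicted_wins

print(f"predicted = {predicted_wins:.2f}, residual = {residual:.2f}")
# prints: predicted = 100.42, residual = -2.42
```

Keeping the order as actual minus predicted matters: a negative value means the point lies below the regression line.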
d)
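The coefficient of determination, r^2 = 0.672
The correlation coefficient r is the square root of r^2, with the same sign as the slope of the regression line. The slope (0.08) is positive, so
r = +sqrt(0.672) ≈ 0.82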
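The step from r^2 to r can be sketched in a few lines of Python. This assumes simple linear regression with one predictor, where r has the magnitude sqrt(r^2) and the sign of the fitted slope; the variable names are illustrative.

```python
import math

r_squared = 0.672  # coefficient of determination given in the question
slope = 0.08       # slope of the ALL SEASONS line; only its sign is used

# r has the magnitude sqrt(r^2) and the same sign as the slope.
r = math.copysign(math.sqrt(r_squared), slope)

print(round(r, 2))  # prints: 0.82
```

Had the slope been negative, the same code would return -0.82, which is why the sign has to be taken from the regression line and not from r^2 itself.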