We wish to predict the salary of baseball players (y) from the variables RBI (x1) and HR (x2), using a regression equation of the form ŷ = b0 + b1x1 + b2x2.
The following is a chart of baseball players' salaries and statistics from 2016.
Player Name | RBI's | HR's | Salary (in millions) |
---|---|---|---|
Miguel Cabrera | 108 | 38 | 28.050 |
Yoenis Cespedes | 86 | 31 | 27.500 |
Ryan Howard | 59 | 25 | 25.000 |
Albert Pujols | 119 | 31 | 25.000 |
Robinson Cano | 103 | 39 | 24.050 |
Mark Teixeira | 44 | 15 | 23.125 |
Joe Mauer | 49 | 11 | 23.000 |
Hanley Ramirez | 111 | 30 | 22.750 |
Justin Upton | 87 | 31 | 22.125 |
Adrian Gonzalez | 90 | 18 | 21.857 |
Jason Heyward | 49 | 7 | 21.667 |
Jayson Werth | 70 | 21 | 21.571 |
Matt Kemp | 108 | 35 | 21.500 |
Jacoby Ellsbury | 56 | 9 | 21.143 |
Chris Davis | 84 | 38 | 21.119 |
Buster Posey | 80 | 14 | 20.802 |
Shin-Soo Choo | 17 | 7 | 20.000 |
Troy Tulowitzki | 79 | 24 | 20.000 |
Ryan Braun | 91 | 31 | 20.000 |
Joey Votto | 97 | 29 | 20.000 |
Hunter Pence | 57 | 13 | 18.500 |
Prince Fielder | 44 | 8 | 18.000 |
Adrian Beltre | 104 | 32 | 18.000 |
Victor Martinez | 86 | 27 | 18.000 |
Carlos Gonzalez | 100 | 25 | 17.454 |
Matt Holliday | 62 | 20 | 17.000 |
Brian McCann | 58 | 20 | 17.000 |
Mike Trout | 100 | 29 | 16.083 |
David Ortiz | 127 | 38 | 16.000 |
Adam Jones | 83 | 29 | 16.000 |
Curtis Granderson | 59 | 30 | 16.000 |
Colby Rasmus | 54 | 15 | 15.800 |
Matt Wieters | 66 | 17 | 15.800 |
J.D. Martinez | 68 | 22 | 6.750 |
Brandon Crawford | 84 | 12 | 6.000 |
Rajai Davis | 48 | 12 | 5.950 |
Aaron Hill | 38 | 10 | 12.000 |
Coco Crisp | 55 | 13 | 11.000 |
Ben Zobrist | 76 | 18 | 10.500 |
Justin Turner | 90 | 27 | 5.100 |
Denard Span | 53 | 11 | 5.000 |
Chris Iannetta | 24 | 7 | 4.550 |
Leonys Martin | 47 | 15 | 4.150 |
Justin Smoak | 34 | 14 | 3.900 |
Jorge Soler | 31 | 12 | 3.667 |
Evan Gattis | 72 | 32 | 3.300 |
Logan Forsythe | 52 | 20 | 2.750 |
Jean Segura | 64 | 20 | 2.600 |
a) Use software to find the multiple linear regression equation.
Enter the coefficients rounded to 4 decimal places.
ŷ = ______ + ______ x1 + ______ x2
b) Use the multiple linear regression equation to predict the salary for a baseball player with an RBI of 31 and HR of 20. Round your answer to 1 decimal place; do not convert the result to dollars.
________ millions of dollars
c) Holding all other variables constant, what is the correct interpretation of the coefficient b1=0.111 in the multiple linear regression equation?
d) Holding all other variables constant, what is the correct interpretation of the coefficient b2=0.0371 in the multiple linear regression equation?
Output using Excel:
SUMMARY OUTPUT

Regression Statistics | Value |
---|---|
Multiple R | 0.423071 |
R Square | 0.178989 |
Adjusted R Square | 0.1425 |
Standard Error | 6.964542 |
Observations | 48 |

ANOVA

Source | df | SS | MS | F | Significance F |
---|---|---|---|---|---|
Regression | 2 | 475.8568 | 237.9284 | 4.90525 | 0.011826 |
Residual | 45 | 2182.718 | 48.50485 | | |
Total | 47 | 2658.575 | | | |

Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
---|---|---|---|---|---|---|---|---|
Intercept | 7.061791 | 2.964789 | 2.381886 | 0.021513 | 1.090399 | 13.03318 | 1.090399 | 13.03318 |
RBI's | 0.110976 | 0.068915 | 1.610321 | 0.114321 | -0.02783 | 0.249778 | -0.02783 | 0.249778 |
HR's | 0.03709 | 0.186675 | 0.198688 | 0.843401 | -0.33889 | 0.413074 | -0.33889 | 0.413074 |
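
The summary figures in the Excel output are tied together by standard regression identities, so they can be cross-checked from the ANOVA rows alone. The sketch below is an illustrative check, not part of the original solution; the variable names are arbitrary and the inputs are the rounded values printed in the output, so agreement is only expected to within rounding.

```python
import math

# Values copied from the Excel output (rounded as printed there).
n, k = 48, 2                      # observations, number of predictors
ss_reg, ss_res, ss_total = 475.8568, 2182.718, 2658.575

r_squared = ss_reg / ss_total                              # R Square
multiple_r = math.sqrt(r_squared)                          # Multiple R
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

ms_reg = ss_reg / k                                        # MS for Regression
ms_res = ss_res / (n - k - 1)                              # MS for Residual
f_stat = ms_reg / ms_res                                   # F
std_error = math.sqrt(ms_res)                              # Standard Error

# Each t Stat is the coefficient divided by its standard error.
t_rbi = 0.110976 / 0.068915
t_hr = 0.03709 / 0.186675

for name, value in [("R Square", r_squared), ("Multiple R", multiple_r),
                    ("Adjusted R Square", adj_r_squared), ("F", f_stat),
                    ("Standard Error", std_error),
                    ("t Stat (RBI)", t_rbi), ("t Stat (HR)", t_hr)]:
    print(f"{name}: {value:.6f}")
```

Neither slope has a p-value below 0.05 in the output, which is worth keeping in mind when reading the coefficient interpretations in parts c) and d).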
a) Regression equation:
ŷ = 7.0618 + 0.1110 x1 + 0.0371 x2
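
As an alternative to Excel, the same least-squares fit can be sketched in plain Python by solving the normal equations (XᵀX)b = Xᵀy. This is a minimal illustration, not the method used in the original answer: the helper names `solve` and `fit_ols` are hypothetical, and the data tuples are transcribed from the table above. If the transcription is faithful, the printed coefficients should agree with the Excel values, though that agreement is not asserted here.

```python
# (RBI, HR, salary in millions), transcribed from the table above.
DATA = [
    (108, 38, 28.050), (86, 31, 27.500), (59, 25, 25.000), (119, 31, 25.000),
    (103, 39, 24.050), (44, 15, 23.125), (49, 11, 23.000), (111, 30, 22.750),
    (87, 31, 22.125), (90, 18, 21.857), (49, 7, 21.667), (70, 21, 21.571),
    (108, 35, 21.500), (56, 9, 21.143), (84, 38, 21.119), (80, 14, 20.802),
    (17, 7, 20.000), (79, 24, 20.000), (91, 31, 20.000), (97, 29, 20.000),
    (57, 13, 18.500), (44, 8, 18.000), (104, 32, 18.000), (86, 27, 18.000),
    (100, 25, 17.454), (62, 20, 17.000), (58, 20, 17.000), (100, 29, 16.083),
    (127, 38, 16.000), (83, 29, 16.000), (59, 30, 16.000), (54, 15, 15.800),
    (66, 17, 15.800), (68, 22, 6.750), (84, 12, 6.000), (48, 12, 5.950),
    (38, 10, 12.000), (55, 13, 11.000), (76, 18, 10.500), (90, 27, 5.100),
    (53, 11, 5.000), (24, 7, 4.550), (47, 15, 4.150), (34, 14, 3.900),
    (31, 12, 3.667), (72, 32, 3.300), (52, 20, 2.750), (64, 20, 2.600),
]


def solve(a, b):
    """Solve a·x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(data):
    """Return [b0, b1, b2] for the least-squares fit y = b0 + b1*x1 + b2*x2."""
    rows = [(1.0, x1, x2) for x1, x2, _ in data]   # design matrix with intercept
    ys = [y for _, _, y in data]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve(xtx, xty)


b0, b1, b2 = fit_ols(DATA)
print(f"y-hat = {b0:.4f} + {b1:.4f} x1 + {b2:.4f} x2")
```

Solving the normal equations directly is acceptable for three well-scaled coefficients; for larger or ill-conditioned problems a dedicated statistics library would be the safer choice.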
b) Predicted salary for RBI = 31 and HR = 20:
ŷ = 7.0618 + 0.1110 * 31 + 0.0371 * 20 = 7.0618 + 3.4410 + 0.7420 = 11.2448 ≈ 11.2 million dollars
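
The prediction in part b) is a direct substitution into the fitted equation, which can be sketched as a small function. The name `predict_salary` is hypothetical, and the default coefficients are the rounded values from part a).

```python
def predict_salary(rbi, hr, b0=7.0618, b1=0.1110, b2=0.0371):
    """Predicted salary in millions of dollars from the fitted equation."""
    return b0 + b1 * rbi + b2 * hr


prediction = predict_salary(rbi=31, hr=20)
print(round(prediction, 1))  # 11.2
```

Using the unrounded coefficients from the Excel output (7.061791, 0.110976, 0.03709) gives the same result to 1 decimal place.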
c) Answer: For each additional RBI, holding HR constant, a baseball player's predicted salary increases by 0.111 million dollars.
d) Answer: For each additional HR, holding RBI constant, a baseball player's predicted salary increases by 0.0371 million dollars.