Question

We wish to predict the salary of baseball players (y) from the variables RBI (x1) and HR (x2), so we use a regression equation of the form ŷ = b0 + b1x1 + b2x2.

  • HR - Home runs - hits on which the batter successfully touched all four bases, without the contribution of a fielding error.
  • RBI - Run batted in - number of runners who scored due to a batter's action, except when the batter grounded into a double play or reached on an error.
  • Salary is in millions of dollars.

The following is a chart of baseball players' salaries and statistics from 2016.

Player Name          RBI   HR   Salary (millions of dollars)
Miguel Cabrera       108   38   28.050
Yoenis Cespedes       86   31   27.500
Ryan Howard           59   25   25.000
Albert Pujols        119   31   25.000
Robinson Cano        103   39   24.050
Mark Teixeira         44   15   23.125
Joe Mauer             49   11   23.000
Hanley Ramirez       111   30   22.750
Justin Upton          87   31   22.125
Adrian Gonzalez       90   18   21.857
Jason Heyward         49    7   21.667
Jayson Werth          70   21   21.571
Matt Kemp            108   35   21.500
Jacoby Ellsbury       56    9   21.143
Chris Davis           84   38   21.119
Buster Posey          80   14   20.802
Shin-Soo Choo         17    7   20.000
Troy Tulowitzki       79   24   20.000
Ryan Braun            91   31   20.000
Joey Votto            97   29   20.000
Hunter Pence          57   13   18.500
Prince Fielder        44    8   18.000
Adrian Beltre        104   32   18.000
Victor Martinez       86   27   18.000
Carlos Gonzalez      100   25   17.454
Matt Holliday         62   20   17.000
Brian McCann          58   20   17.000
Mike Trout           100   29   16.083
David Ortiz          127   38   16.000
Adam Jones            83   29   16.000
Curtis Granderson     59   30   16.000
Colby Rasmus          54   15   15.800
Matt Wieters          66   17   15.800
J.D. Martinez         68   22    6.750
Brandon Crawford      84   12    6.000
Rajai Davis           48   12    5.950
Aaron Hill            38   10   12.000
Coco Crisp            55   13   11.000
Ben Zobrist           76   18   10.500
Justin Turner         90   27    5.100
Denard Span           53   11    5.000
Chris Iannetta        24    7    4.550
Leonys Martin         47   15    4.150
Justin Smoak          34   14    3.900
Jorge Soler           31   12    3.667
Evan Gattis           72   32    3.300
Logan Forsythe        52   20    2.750
Jean Segura           64   20    2.600

a) Use software to find the multiple linear regression equation. Enter the coefficients rounded to 4 decimal places.
ŷ = ______ + ______ x1 + ______ x2


b) Use the multiple linear regression equation to predict the salary for a baseball player with 31 RBI and 20 HR. Round your answer to 1 decimal place; do not convert the number to dollars.
      ________ millions of dollars

c) Holding all other variables constant, what is the correct interpretation of the coefficient b1=0.111 in the multiple linear regression equation?

  • For each HR, a baseball player's predicted salary increases by 0.111 million dollars.
  • For each RBI, a baseball player's predicted salary increases by 0.111 million dollars.
  • If the baseball player's salary increases by 0.111 million dollars, then the predicted RBI will increase by one.
  • If the baseball player's salary increases by 0.111 million dollars, then the predicted RBI will increase by 0.0371.

d) Holding all other variables constant, what is the correct interpretation of the coefficient b2=0.0371 in the multiple linear regression equation?

  • If the baseball player's salary increases by 0.0371 million dollars, then the predicted HR will increase by one.
  • If the baseball player's salary increases by 0.0371 million dollars, then the predicted HR will increase by 0.111.
  • For each RBI, a baseball player's predicted salary increases by 0.0371 million dollars.
  • For each HR, a baseball player's predicted salary increases by 0.0371 million dollars.

Solutions

Expert Solution

Output from Excel:

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.423071
R Square             0.178989
Adjusted R Square    0.1425
Standard Error       6.964542
Observations         48

ANOVA
             df   SS         MS         F         Significance F
Regression    2   475.8568   237.9284   4.90525   0.011826
Residual     45   2182.718   48.50485
Total        47   2658.575

            Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept   7.061791       2.964789         2.381886   0.021513   1.090399    13.03318    1.090399      13.03318
RBI's       0.110976       0.068915         1.610321   0.114321   -0.02783    0.249778    -0.02783      0.249778
HR's        0.03709        0.186675         0.198688   0.843401   -0.33889    0.413074    -0.33889      0.413074
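
The entries in a regression summary like this are linked by standard identities, which gives a quick way to catch transcription errors. The sketch below recomputes R Square, Adjusted R Square, the standard error, F, and the t statistics from the other reported numbers. The variable names are illustrative only and are not part of the Excel output.

```python
import math

# Values copied from the Excel summary output.
n, k = 48, 2  # observations, number of predictors
ss_reg, ss_res, ss_tot = 475.8568, 2182.718, 2658.575
coef = {"Intercept": 7.061791, "RBI": 0.110976, "HR": 0.03709}
se = {"Intercept": 2.964789, "RBI": 0.068915, "HR": 0.186675}

r2 = ss_reg / ss_tot                            # R Square = SSR / SST
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)   # Adjusted R Square
ms_reg = ss_reg / k                             # mean square, regression
ms_res = ss_res / (n - k - 1)                   # mean square, residual
f_stat = ms_reg / ms_res                        # overall F statistic
std_err = math.sqrt(ms_res)                     # "Standard Error" of the regression
t_stats = {name: coef[name] / se[name] for name in coef}  # t = coefficient / SE

print(f"R Square          {r2:.6f}")
print(f"Adjusted R Square {adj_r2:.4f}")
print(f"Standard Error    {std_err:.6f}")
print(f"F                 {f_stat:.5f}")
for name, t in t_stats.items():
    print(f"t Stat ({name}) {t:.6f}")
```

Each recomputed value should agree with the corresponding Excel entry up to rounding of the inputs.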

a) Regression equation:

ŷ = 7.0618 + 0.1110 x1 + 0.0371 x2
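
For readers without Excel, here is a minimal sketch of the same least-squares fit in plain Python, using the closed-form solution for two predictors. The function name `fit_two_predictor_ols` is hypothetical, and the rows are transcribed from the salary table in the question, so the printed coefficients should be close to the Excel values only if that table matches the data the solver used.

```python
def fit_two_predictor_ols(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2; returns (b0, b1, b2)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    # Centered sums of squares and cross-products.
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2  # zero only if x1 and x2 are perfectly collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2

# (RBI, HR, salary in millions of dollars), transcribed from the table.
rows = [
    (108, 38, 28.050), (86, 31, 27.500), (59, 25, 25.000), (119, 31, 25.000),
    (103, 39, 24.050), (44, 15, 23.125), (49, 11, 23.000), (111, 30, 22.750),
    (87, 31, 22.125), (90, 18, 21.857), (49, 7, 21.667), (70, 21, 21.571),
    (108, 35, 21.500), (56, 9, 21.143), (84, 38, 21.119), (80, 14, 20.802),
    (17, 7, 20.000), (79, 24, 20.000), (91, 31, 20.000), (97, 29, 20.000),
    (57, 13, 18.500), (44, 8, 18.000), (104, 32, 18.000), (86, 27, 18.000),
    (100, 25, 17.454), (62, 20, 17.000), (58, 20, 17.000), (100, 29, 16.083),
    (127, 38, 16.000), (83, 29, 16.000), (59, 30, 16.000), (54, 15, 15.800),
    (66, 17, 15.800), (68, 22, 6.750), (84, 12, 6.000), (48, 12, 5.950),
    (38, 10, 12.000), (55, 13, 11.000), (76, 18, 10.500), (90, 27, 5.100),
    (53, 11, 5.000), (24, 7, 4.550), (47, 15, 4.150), (34, 14, 3.900),
    (31, 12, 3.667), (72, 32, 3.300), (52, 20, 2.750), (64, 20, 2.600),
]
rbi, hr, salary = zip(*rows)

b0, b1, b2 = fit_two_predictor_ols(rbi, hr, salary)
print(f"ŷ = {b0:.4f} + {b1:.4f} x1 + {b2:.4f} x2")
```

The closed form avoids any third-party dependency. For more than two predictors, a linear-algebra routine or a statistics package would be the more practical choice.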

b) Predicted salary for RBI = 31 and HR = 20:

ŷ = 7.0618 + 0.1110(31) + 0.0371(20) = 7.0618 + 3.4410 + 0.7420 = 11.2448 ≈ 11.2 million dollars
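
The same substitution can be checked in a few lines of Python. This is only a sketch: the function name `predict_salary` is made up for illustration, and the default coefficients are the rounded values from part (a).

```python
def predict_salary(rbi, hr, b0=7.0618, b1=0.1110, b2=0.0371):
    """Predicted salary in millions of dollars from the fitted equation."""
    return b0 + b1 * rbi + b2 * hr

y_hat = predict_salary(31, 20)
print(round(y_hat, 1))  # 11.2
```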

c) Answer: For each RBI, a baseball player's predicted salary increases by 0.111 million dollars.

d) Answer: For each HR, a baseball player's predicted salary increases by 0.0371 million dollars.
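
Both interpretations can be confirmed numerically: raising one predictor by one unit while holding the other fixed changes the prediction by exactly that predictor's coefficient. The sketch below uses the rounded coefficients from part (a); the helper name `predict` and the baseline of 31 RBI and 20 HR are arbitrary choices for illustration.

```python
b0, b1, b2 = 7.0618, 0.1110, 0.0371  # rounded coefficients from part (a)

def predict(rbi, hr):
    """Predicted salary in millions of dollars."""
    return b0 + b1 * rbi + b2 * hr

# One more RBI with HR held at 20: the prediction changes by b1.
rbi_effect = predict(32, 20) - predict(31, 20)
# One more HR with RBI held at 31: the prediction changes by b2.
hr_effect = predict(31, 21) - predict(31, 20)

print(f"{rbi_effect:.4f} {hr_effect:.4f}")  # 0.1110 0.0371
```

Because the model is linear, the result is the same for any baseline values of RBI and HR.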

