Suppose you are a rabid football fan and you get into a discussion about the importance of offense (yards made) versus defense (yards allowed) in terms of winning a game. You decide to look at football statistics to provide evidence of which variable is a stronger predictor of wins.
Part a) Develop a simple linear regression that compares wins to yards made. Perform the following diagnostics on this regression: 1) test of significance on the slope; 2) assess the fit of the line using the appropriate statistics; 3) interpret the slope of the equation if the slope is significant.
Part b) Develop a simple linear regression that compares wins against yards allowed. Perform the following diagnostics on this regression: 1) test of significance on the slope; 2) assess the fit of the line using the appropriate statistics; 3) interpret the slope of the equation if the slope is significant.
Part c) Which explanatory variable provides a better prediction of the response variable? Support your answer briefly by citing the appropriate diagnostics.
Note: Use an alpha of .05 for both tests of significance. Be sure to show ALL steps of the hypothesis testing procedure.
EXCEL DATA TO USE.
Team | Win | Rush | Pass | Yds Allowed | Yds Made |
Arizona Cardinals | 62.50 | 93.40 | 251.00 | 346.40 | 344.40 |
Atlanta Falcons | 56.30 | 117.21 | 223.19 | 348.90 | 340.40 |
Baltimore Ravens | 56.30 | 137.51 | 213.69 | 305.00 | 351.20 |
Buffalo Bills | 37.50 | 116.71 | 157.19 | 340.60 | 273.90 |
Carolina Panthers | 50.00 | 156.16 | 174.94 | 315.80 | 331.10 |
Chicago Bears | 43.80 | 93.24 | 217.06 | 337.80 | 310.30 |
Cincinnati Bengals | 62.50 | 128.48 | 180.63 | 301.40 | 309.10 |
Cleveland Browns | 31.30 | 130.45 | 129.75 | 389.30 | 260.20 |
Dallas Cowboys | 68.80 | 131.46 | 267.94 | 315.90 | 399.40 |
Denver Broncos | 50.00 | 114.71 | 226.69 | 315.00 | 341.40 |
Detroit Lions | 12.50 | 101.00 | 198.00 | 392.10 | 299.00 |
Green Bay Packers | 68.80 | 117.85 | 261.25 | 284.40 | 379.10 |
Houston Texans | 56.30 | 92.23 | 290.88 | 324.90 | 383.10 |
Indianapolis Colts | 87.50 | 80.91 | 282.19 | 339.20 | 363.10 |
Jacksonville Jaguars | 53.80 | 126.85 | 209.75 | 352.30 | 336.60 |
Kansas City Chiefs | 25.00 | 120.58 | 182.63 | 388.20 | 303.20 |
Miami Dolphins | 43.80 | 139.48 | 198.13 | 349.30 | 337.60 |
Minnesota Vikings | 75.00 | 119.85 | 259.75 | 305.50 | 379.60 |
New England Patriots | 62.50 | 120.05 | 277.25 | 320.20 | 397.30 |
New Orleans Saints | 81.30 | 131.61 | 272.19 | 357.80 | 403.80 |
New York Giants | 50.00 | 114.81 | 251.19 | 324.90 | 366.00 |
New York Jets | 56.30 | 172.25 | 148.75 | 252.30 | 321.00 |
Oakland Raiders | 31.30 | 106.29 | 159.81 | 361.90 | 266.10 |
Philadelphia Eagles | 68.80 | 102.34 | 255.56 | 321.10 | 357.90 |
Pittsburgh Steelers | 56.30 | 112.05 | 259.25 | 305.30 | 371.30 |
Saint Louis Rams | 6.30 | 111.50 | 167.88 | 327.00 | 279.38 |
San Diego Chargers | 81.30 | 88.94 | 271.13 | 326.40 | 360.06 |
San Francisco 49ers | 50.00 | 100.00 | 190.75 | 356.40 | 290.75 |
Seattle Seahawks | 31.30 | 97.86 | 218.94 | 372.80 | 316.80 |
Tampa Bay Buccaneers | 18.80 | 101.69 | 185.81 | 365.60 | 287.50 |
Tennessee Titans | 50.00 | 161.96 | 189.44 | 365.60 | 351.40 |
Washington Redskins | 25.00 | 94.38 | 218.13 | 319.70 | 312.50 |
a. Regressing wins on yards made gives the following results.
1) The slope is about 0.386, and the standard error is 0.05838. The null hypothesis is $H_0: \beta_1 = 0$, and the alternative hypothesis is $H_a: \beta_1 \neq 0$, where $\beta_1$ is the population slope.
The t-statistic is $t = 0.386 / 0.05838 \approx 6.61$, which matches the regression output up to rounding. With $n - 2 = 32 - 2 = 30$ degrees of freedom, the two-tailed critical value at $\alpha = 0.05$ is $t_{0.025,\,30} \approx 2.042$, and since $|t| = 6.61 > 2.042$, we reject the null. Hence the slope coefficient is significantly different from zero, which agrees with the p-value of the t-statistic in the regression output being less than 0.05.
2) The fit of the line can be checked with the R-squared (goodness of fit) and the F-statistic (significance of the goodness of fit). We have $R^2 = 0.5931$, meaning that 59.31% of the variation in wins is explained by the yards made variable. The F-statistic of 43.72 has a p-value well below 0.05, so the fit is statistically significant.
3) The slope means that, for each additional yard made, wins (recorded as a winning percentage in this data set) increase by about 0.386 percentage points on average.
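The slope test in part a can be sketched in a few lines of Python. This is a minimal illustration, not the original Excel output: the function name `slope_t_test` is hypothetical, the critical value 2.042 is taken from a standard t-table (two-tailed, alpha = 0.05, 30 degrees of freedom), and n = 32 is the number of teams in the table.

```python
def slope_t_test(slope, std_err, t_crit):
    """Return the t-statistic for H0: slope = 0 and the two-tailed decision."""
    t_stat = slope / std_err
    reject_null = abs(t_stat) > t_crit
    return t_stat, reject_null

# Assumption: critical value read from a t-table, df = 32 - 2 = 30, alpha = 0.05 (two-tailed).
T_CRIT_DF30 = 2.042

# Slope and standard error as quoted in part a.
t_made, reject_made = slope_t_test(0.386, 0.05838, T_CRIT_DF30)
print(f"t = {t_made:.3f}, reject H0: {reject_made}")
```

The same function applies to part b by passing the slope and standard error of the yards allowed regression.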
b. Regressing wins on yards allowed gives the following results.
1) The slope is about -0.3002, and the standard error is 0.1041. The null hypothesis is $H_0: \beta_1 = 0$, and the alternative hypothesis is $H_a: \beta_1 \neq 0$, where $\beta_1$ is the population slope.
The t-statistic is $t = -0.3002 / 0.1041 \approx -2.88$, which matches the regression output up to rounding. The critical value is the same as in part a, $t_{0.025,\,30} \approx 2.042$, and since $|t| = 2.88 > 2.042$ (two-tailed test), we reject the null. Hence the slope coefficient is significantly different from zero, which agrees with the p-value of the t-statistic in the regression output being less than 0.05.
2) The fit of the line can be checked with the R-squared (goodness of fit) and the F-statistic (significance of the goodness of fit). We have $R^2 = 0.2171$, meaning that 21.71% of the variation in wins is explained by the yards allowed variable. The F-statistic of 8.32 has a p-value below the alpha of 0.05, so the fit is statistically significant.
3) The slope means that, for each additional yard allowed, wins (recorded as a winning percentage in this data set) decrease by about 0.3002 percentage points on average.
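As a cross-check on the quoted diagnostics (an addition, not part of the original regression output): in a simple linear regression with a single predictor, the F-statistic equals the square of the slope's t-statistic, and R-squared can be recovered from F and the residual degrees of freedom $n - 2 = 30$. Using the rounded values quoted in parts a and b:

$$
F = t^2, \qquad R^2 = \frac{F}{F + (n - 2)}
$$

$$
\text{Yards made: } t = \frac{0.386}{0.05838} \approx 6.61, \quad t^2 \approx 43.7 \approx F = 43.72, \quad R^2 = \frac{43.72}{43.72 + 30} \approx 0.593
$$

$$
\text{Yards allowed: } t = \frac{-0.3002}{0.1041} \approx -2.88, \quad t^2 \approx 8.3 \approx F = 8.32, \quad R^2 = \frac{8.32}{8.32 + 30} \approx 0.217
$$

Both sets of figures are internally consistent up to rounding.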
c. The yards made variable is better at explaining the wins variable, since its regression has the higher R-squared. Comparing two models by their R-squared is valid only when the dependent variable is the same, which is the case here. Both models have a significant fit, but more of the variation in wins is explained by the yards made variable (59.31%) than by the yards allowed variable (21.71%).
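The comparison in part c can be reproduced from the data table with a short, dependency-free Python sketch. It is an illustration under stated assumptions rather than the original Excel workflow: the helper `simple_ols` and the `DATA` list are hypothetical names, and the tuples are the Win, Yds Allowed, and Yds Made columns copied from the table in team order. If the table was transcribed faithfully, the printed values should be close to the figures quoted in parts a and b.

```python
import math

# (win, yards allowed, yards made) for the 32 teams, in table order.
DATA = [
    (62.50, 346.40, 344.40), (56.30, 348.90, 340.40),
    (56.30, 305.00, 351.20), (37.50, 340.60, 273.90),
    (50.00, 315.80, 331.10), (43.80, 337.80, 310.30),
    (62.50, 301.40, 309.10), (31.30, 389.30, 260.20),
    (68.80, 315.90, 399.40), (50.00, 315.00, 341.40),
    (12.50, 392.10, 299.00), (68.80, 284.40, 379.10),
    (56.30, 324.90, 383.10), (87.50, 339.20, 363.10),
    (53.80, 352.30, 336.60), (25.00, 388.20, 303.20),
    (43.80, 349.30, 337.60), (75.00, 305.50, 379.60),
    (62.50, 320.20, 397.30), (81.30, 357.80, 403.80),
    (50.00, 324.90, 366.00), (56.30, 252.30, 321.00),
    (31.30, 361.90, 266.10), (68.80, 321.10, 357.90),
    (56.30, 305.30, 371.30), (6.30, 327.00, 279.38),
    (81.30, 326.40, 360.06), (50.00, 356.40, 290.75),
    (31.30, 372.80, 316.80), (18.80, 365.60, 287.50),
    (50.00, 365.60, 351.40), (25.00, 319.70, 312.50),
]

def simple_ols(x, y):
    """Least-squares fit of y = b0 + b1 * x with slope test and fit statistics."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    sse = syy - slope * sxy                 # residual sum of squares
    df = n - 2                              # residual degrees of freedom
    se = math.sqrt(sse / df / sxx)          # standard error of the slope
    return {
        "slope": slope,
        "intercept": y_bar - slope * x_bar,
        "se": se,
        "t": slope / se,
        "r2": 1 - sse / syy,
        "f": (syy - sse) / (sse / df),
    }

wins = [row[0] for row in DATA]
made = simple_ols([row[2] for row in DATA], wins)
allowed = simple_ols([row[1] for row in DATA], wins)

for name, fit in (("yards made", made), ("yards allowed", allowed)):
    print(f"{name}: slope={fit['slope']:.4f} se={fit['se']:.4f} "
          f"t={fit['t']:.2f} R2={fit['r2']:.4f} F={fit['f']:.2f}")

better = "yards made" if made["r2"] > allowed["r2"] else "yards allowed"
print("Higher R-squared:", better)
```

Comparing R-squared directly is reasonable here because both fits share the same response variable and the same number of predictors.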