Team | Pts./Game | Number of Wins |
Charlotte | 108.2 | 36 |
Minnesota | 109.5 | 47 |
Houston | 112.4 | 65 |
LA Clippers | 109 | 42 |
Cleveland | 110.9 | 50 |
Milwaukee | 106.5 | 44 |
Phoenix | 103.9 | 21 |
Philadelphia | 109.8 | 52 |
Toronto | 111.7 | 59 |
Brooklyn | 106.6 | 28 |
Okla City | 107.9 | 48 |
Denver | 110 | 46 |
Washington | 106.6 | 43 |
Utah | 104.1 | 48 |
LA Lakers | 108.1 | 35 |
Golden State | 113.5 | 58 |
Memphis | 99.3 | 22 |
Portland | 105.6 | 49 |
Boston | 104 | 55 |
San Antonio | 102.7 | 47 |
New Orleans | 111.7 | 48 |
Atlanta | 103.4 | 24 |
Orlando | 103.4 | 25 |
Miami | 103.4 | 44 |
New York | 104.5 | 29 |
Indiana | 105.6 | 48 |
Detroit | 103.8 | 39 |
Chicago | 102.9 | 27 |
Dallas | 102.3 | 24 |
Sacramento | 98.8 | 27 |
Use your numerical and/or graphical output:
Regression output: Include the appropriate graphs from your output to support your answers.
The graph showing the relationship between Pts/Game and No. of wins is:
The scatter plot suggests a positive linear association: teams that score more points per game tend to win more games. The points are also spread roughly evenly around a central line, so the equal-variance assumption for regression modeling appears reasonable.
The correlation between the explanatory variable (Pts/Game) and the response variable (No. of wins) is r = 0.704. Hence the two variables have a moderately strong positive linear correlation.
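As a cross-check on the reported correlation, here is a minimal sketch that recomputes Pearson's r from the table using only the Python standard library. The helper name `pearson_r` is an illustrative choice, and the two lists are assumed to be in the same team order as the table.

```python
from math import sqrt

# Pts./Game and wins, in the same team order as the table above.
pts = [108.2, 109.5, 112.4, 109.0, 110.9, 106.5, 103.9, 109.8, 111.7, 106.6,
       107.9, 110.0, 106.6, 104.1, 108.1, 113.5, 99.3, 105.6, 104.0, 102.7,
       111.7, 103.4, 103.4, 103.4, 104.5, 105.6, 103.8, 102.9, 102.3, 98.8]
wins = [36, 47, 65, 42, 50, 44, 21, 52, 59, 28,
        48, 46, 43, 48, 35, 58, 22, 49, 55, 47,
        48, 24, 25, 44, 29, 48, 39, 27, 24, 27]

def pearson_r(x, y):
    """Pearson correlation: Sxy / sqrt(Sxx * Syy)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

r = pearson_r(pts, wins)
print(round(r, 3))  # about 0.704
```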
Carrying out the regression in Excel (Data tab -> Data Analysis -> Regression, with No. of wins as the Input Y Range and Pts/Game as the Input X Range), we get the following output:
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.703544 |
R Square | 0.494975 |
Adjusted R Square | 0.476938 |
Standard Error | 8.839378 |
Observations | 30 |
ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 1 | 2144.231 | 2144.230946 | 27.44278075 | 1.44485E-05 |
Residual | 28 | 2187.769 | 78.13460908 | | |
Total | 29 | 4332 | | | |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
Intercept | -199.568 | 45.95067 | -4.343092683 | 0.000166464 | -293.6937043 | -105.4423399 |
Pts./Game | 2.262324 | 0.431858 | 5.238585758 | 1.44485E-05 | 1.377703612 | 3.146944934 |
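The entries in this output are tied together by simple identities, which gives a quick way to sanity-check a table that has been copied by hand. The sketch below uses only numbers from the output above; the variable names are illustrative.

```python
# Values copied from the Excel summary output above.
ss_regression, ss_residual, ss_total = 2144.231, 2187.769, 4332.0
df_regression, df_residual = 1, 28
slope, slope_se = 2.262324, 0.431858

r_squared = ss_regression / ss_total                    # "R Square"
ms_residual = ss_residual / df_residual                 # "MS" in the Residual row
f_stat = (ss_regression / df_regression) / ms_residual  # "F"
std_error = ms_residual ** 0.5                          # "Standard Error" of the regression
t_slope = slope / slope_se                              # "t Stat" for Pts./Game

print(round(r_squared, 3), round(f_stat, 2), round(std_error, 2), round(t_slope, 2))
# 0.495 27.44 8.84 5.24
```

With a single explanatory variable, F equals the square of the slope's t statistic, which is why the slope's P-value and Significance F are the same number.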
The regression equation obtained is: No. of Wins = -199.57 + 2.262 * Pts/Game
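The intercept of -199.57 is the predicted No. of Wins when Pts/Game = 0. A negative number of wins is impossible, and Pts/Game = 0 lies far outside the observed range (98.8 to 113.5), so the intercept has no practical interpretation here; it only fixes the position of the line.
The slope of 2.262 means that each additional point per game is associated with an increase of about 2.26 in the predicted No. of Wins.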
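The intercept and slope can also be reproduced without Excel. The following is a minimal least-squares sketch in plain Python, assuming the data are entered in the table's team order; `fit_line` is a hypothetical helper name, not part of any library.

```python
# Pts./Game and wins, in the same team order as the table above.
pts = [108.2, 109.5, 112.4, 109.0, 110.9, 106.5, 103.9, 109.8, 111.7, 106.6,
       107.9, 110.0, 106.6, 104.1, 108.1, 113.5, 99.3, 105.6, 104.0, 102.7,
       111.7, 103.4, 103.4, 103.4, 104.5, 105.6, 103.8, 102.9, 102.3, 98.8]
wins = [36, 47, 65, 42, 50, 44, 21, 52, 59, 28,
        48, 46, 43, 48, 35, 58, 22, 49, 55, 47,
        48, 24, 25, 44, 29, 48, 39, 27, 24, 27]

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the fitted line passes through the two means
    return intercept, slope

intercept, slope = fit_line(pts, wins)
print(round(intercept, 2), round(slope, 3))  # about -199.57 and 2.262
```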
Prediction for Team A: No. of Wins = -199.57 + 2.262 * 106 ~ 40
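Prediction for Team B: No. of Wins = -199.57 + 2.262 * 96.2 ~ 18 (96.2 is below the lowest observed Pts/Game of 98.8, so this is an extrapolation and should be treated with caution)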
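The two predictions can be reproduced with a small helper that also reports whether the input falls inside the range of Pts/Game seen in the table. This is a sketch using the rounded coefficients from the regression equation; `predict_wins` and the constant names are illustrative.

```python
INTERCEPT, SLOPE = -199.57, 2.262   # rounded coefficients from the regression equation
PTS_MIN, PTS_MAX = 98.8, 113.5      # smallest and largest Pts./Game in the table

def predict_wins(pts_per_game):
    """Return (predicted wins, True if the input lies inside the observed range)."""
    predicted = INTERCEPT + SLOPE * pts_per_game
    in_range = PTS_MIN <= pts_per_game <= PTS_MAX
    return predicted, in_range

for label, x in [("Team A", 106), ("Team B", 96.2)]:
    wins_hat, in_range = predict_wins(x)
    note = "" if in_range else " (extrapolation)"
    print(f"{label}: {wins_hat:.1f}{note}")
# Team A: 40.2
# Team B: 18.0 (extrapolation)
```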
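Correlation coefficient, r = 0.704
Coefficient of determination, r^2 = 0.495, so about 49.5% of the variation in No. of wins is explained by the linear relationship with Pts/Game.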
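A lurking variable is one that is not included as an explanatory variable but may still affect the response variable, No. of wins. A plausible candidate here is points allowed per game (defensive strength): Boston, for example, won 55 games while scoring only 104 points per game. Number of games played is a weaker candidate, since teams in the same league season normally play the same number of games.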