The Golfing Statistics table provides data for a portion of the 2010 professional season for the top 25 golfers.
A) Find the best multiple regression model for predicting earnings/event as a function of the remaining variables.
B) Find the best multiple regression model for predicting average score as a function of the other variables except earnings and events.
Golfing Statistics
Earnings/Event | Events | Avg. Score | GIR (%)* | Driving Distance | Driving Accuracy (%) | Putts/Round |
$239,493.68 | 22 | 70.37 | 67.9 | 288.4 | 60.2 | 31.82 |
$177,249.18 | 28 | 69.43 | 69.4 | 286.9 | 67.9 | 31.30 |
$218,619.18 | 22 | 70.23 | 67.1 | 276.0 | 71.0 | 31.81 |
$186,380.08 | 24 | 70.46 | 68.0 | 308.5 | 56.4 | 31.81 |
$209,511.75 | 20 | 69.78 | 68.3 | 282.9 | 68.5 | 31.43 |
$181,987.29 | 21 | 70.34 | 65.1 | 299.1 | 52.7 | 31.72 |
$162,536.13 | 23 | 69.92 | 66.3 | 287.8 | 65.2 | 31.68 |
$174,534.95 | 21 | 70.25 | 65.3 | 277.0 | 62.4 | 31.52 |
$135,353.70 | 27 | 70.64 | 68.0 | 291.8 | 67.9 | 32.35 |
$212,540.82 | 17 | 69.93 | 68.7 | 294.2 | 61.3 | 31.55 |
$297,079.50 | 12 | 70.26 | 69.3 | 298.7 | 61.3 | 32.31 |
$168,904.45 | 20 | 69.96 | 66.0 | 291.4 | 64.8 | 31.79 |
$135,791.58 | 24 | 70.21 | 68.5 | 309.8 | 55.7 | 31.73 |
$133,695.52 | 23 | 70.53 | 68.2 | 289.1 | 64.8 | 31.86 |
$112,192.04 | 26 | 70.59 | 66.5 | 279.7 | 71.2 | 31.30 |
$215,121.67 | 12 | 70.22 | 66.5 | 292.4 | 60.1 | 32.29 |
$183,922.93 | 14 | 70.86 | 62.9 | 287.2 | 52.0 | 31.99 |
$150,251.76 | 17 | 70.94 | 66.2 | 300.0 | 62.6 | 32.31 |
$183,356.69 | 13 | 71.13 | 66.9 | 291.7 | 67.1 | 32.06 |
$130,274.35 | 17 | 71.53 | 62.5 | 286.8 | 62.7 | 32.47 |
$286,285.40 | 5 | 69.73 | 69.4 | 308.4 | 70.6 | 32.09 |
$72,708.05 | 19 | 70.79 | 61.9 | 292.1 | 56.7 | 31.50 |
$99,597.31 | 13 | 71.07 | 64.1 | 295.8 | 57.2 | 31.52 |
$85,557.56 | 9 | 71.10 | 64.1 | 290.4 | 69.3 | 31.95 |
$46,406.25 | 8 | 71.24 | 61.1 | 289.9 | 65.5 | 32.31 |
*GIR: Greens in Regulation
First, I copied all the data into Excel.
A) Go to Data Analysis -> Regression. For Input Y Range, select the Earnings/Event column; for Input X Range, select all the other columns. Tick Labels, and also tick Residuals and Standardized Residuals. Select an output range, then press OK.
This gives the following output.
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.907422179 |
R Square | 0.82341501 |
Adjusted R Square | 0.764553347 |
Standard Error | 29533.55374 |
Observations | 25 |
ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 6 | 73209748917 | 12201624820 | 13.98898648 | 6.49787E-06 |
Residual | 18 | 15700154337 | 872230796.5 |
Total | 24 | 88909903254 |
Variable | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
Intercept | 1437821.623 | 1347403.762 | 1.067105246 | 0.300028237 | -1392968.633 | 4268611.879 | -1392968.633 | 4268611.879 |
Events | -4747.681755 | 1287.532815 | -3.687425828 | 0.001685195 | -7452.687819 | -2042.675692 | -7452.687819 | -2042.675692 |
Avg. Score | -45049.73715 | 19510.79217 | -2.308965047 | 0.033023094 | -86040.39037 | -4059.083927 | -86040.39037 | -4059.083927 |
GIR (%)* | 22416.76093 | 4698.689378 | 4.770853981 | 0.000152822 | 12545.18087 | 32288.34099 | 12545.18087 | 32288.34099 |
Driving Distance | -3429.311291 | 995.8982298 | -3.443435472 | 0.002898633 | -5521.615828 | -1337.006753 | -5521.615828 | -1337.006753 |
Driving Accuracy (%) | -5413.9034 | 1450.334544 | -3.732865237 | 0.001522986 | -8460.943204 | -2366.863596 | -8460.943204 | -2366.863596 |
Putts/Round | 57949.81073 | 23557.912 | 2.459887393 | 0.024242869 | 8456.474262 | 107443.1472 | 8456.474262 | 107443.1472 |
RESIDUAL OUTPUT
Observation | Predicted Earnings/Event | Residuals | Standard Residuals |
1 | 214353.3044 | 25140.3756 | 0.982936408 |
2 | 195162.1174 | -17912.93735 | -0.700358602 |
3 | 186200.664 | 32418.51596 | 1.267496562 |
4 | 154109.3185 | 32270.76149 | 1.26171967 |
5 | 210720.1049 | -1208.354893 | -0.047244164 |
6 | 155801.2113 | 26186.07871 | 1.023821225 |
7 | 160886.2832 | 1649.846847 | 0.064505581 |
8 | 176021.9942 | -1487.04422 | -0.058140337 |
9 | 158059.8278 | -22706.12781 | -0.887762382 |
10 | 234355.2581 | -21814.43814 | -0.85289917 |
11 | 285287.2656 | 11792.23444 | 0.461051846 |
12 | 162796.8305 | 6107.619451 | 0.23879522 |
13 | 171275.7761 | -35484.1961 | -1.387358281 |
14 | 184136.2119 | -50440.69187 | -1.972126164 |
15 | 94216.33916 | 17975.70084 | 0.702812524 |
16 | 251264.6734 | -36143.00345 | -1.413116278 |
17 | 176537.2319 | 7385.698139 | 0.288765439 |
18 | 149926.8976 | 324.8624437 | 0.012701446 |
19 | 165663.1729 | 17693.5171 | 0.691779726 |
20 | 94403.02563 | 35871.32437 | 1.402494191 |
21 | 248276.4951 | 38008.90487 | 1.486069144 |
22 | 62891.1652 | 9816.884801 | 0.383819782 |
23 | 113843.7961 | -14246.48611 | -0.557007982 |
24 | 109411.4995 | -23853.93946 | -0.93263943 |
25 | 83751.35566 | -37345.10566 | -1.460115975 |
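The summary figures in the Part A output are tied together by standard identities (R Square = SS regression / SS total, Adjusted R Square = 1 - (1 - R Square)(n - 1)/(n - k - 1), F = MS regression / MS residual, Standard Error = sqrt(MS residual)). As a quick sanity check, a minimal Python sketch can recompute them from the ANOVA sums of squares. The variable names are illustrative only and are not part of the Excel output.

```python
import math

# Values copied from the Part A ANOVA table.
n = 25                          # observations
k = 6                           # number of predictors
ss_regression = 73209748917.0
ss_residual = 15700154337.0
ss_total = ss_regression + ss_residual

# Share of the variation in Earnings/Event explained by the model.
r_squared = ss_regression / ss_total

# Penalise R Square for the number of predictors used.
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Mean squares, overall F statistic and residual standard error.
ms_regression = ss_regression / k
ms_residual = ss_residual / (n - k - 1)
f_stat = ms_regression / ms_residual
std_error = math.sqrt(ms_residual)

print(f"R Square          : {r_squared:.8f}")
print(f"Adjusted R Square : {adj_r_squared:.9f}")
print(f"F                 : {f_stat:.8f}")
print(f"Standard Error    : {std_error:.5f}")
```

The printed values can be compared line by line with the Regression Statistics and ANOVA tables; any mismatch would point to a copying error.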
All six predictors have p-values below 0.05, so none of them needs to be dropped. Thus the regression equation is:
Earnings/Event = 1437821.623 - (4747.681755)*Events - (45049.73715)*Avg. Score + (22416.76093)*GIR
- (3429.311291)*Driving Distance - (5413.9034)*Driving Accuracy
+ (57949.81073)*Putts/Round
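For readers without Excel, the same ordinary least squares fit can be sketched in Python. This is an illustration under assumptions, not the method used above: it assumes NumPy is installed, it uses `numpy.linalg.lstsq` in place of Excel's Regression tool, and the variable names are made up for the example. The data rows are copied from the Golfing Statistics table.

```python
import numpy as np

# Columns, in table order: Earnings/Event, Events, Avg. Score, GIR (%),
# Driving Distance, Driving Accuracy (%), Putts/Round.
data = np.array([
    [239493.68, 22, 70.37, 67.9, 288.4, 60.2, 31.82],
    [177249.18, 28, 69.43, 69.4, 286.9, 67.9, 31.30],
    [218619.18, 22, 70.23, 67.1, 276.0, 71.0, 31.81],
    [186380.08, 24, 70.46, 68.0, 308.5, 56.4, 31.81],
    [209511.75, 20, 69.78, 68.3, 282.9, 68.5, 31.43],
    [181987.29, 21, 70.34, 65.1, 299.1, 52.7, 31.72],
    [162536.13, 23, 69.92, 66.3, 287.8, 65.2, 31.68],
    [174534.95, 21, 70.25, 65.3, 277.0, 62.4, 31.52],
    [135353.70, 27, 70.64, 68.0, 291.8, 67.9, 32.35],
    [212540.82, 17, 69.93, 68.7, 294.2, 61.3, 31.55],
    [297079.50, 12, 70.26, 69.3, 298.7, 61.3, 32.31],
    [168904.45, 20, 69.96, 66.0, 291.4, 64.8, 31.79],
    [135791.58, 24, 70.21, 68.5, 309.8, 55.7, 31.73],
    [133695.52, 23, 70.53, 68.2, 289.1, 64.8, 31.86],
    [112192.04, 26, 70.59, 66.5, 279.7, 71.2, 31.30],
    [215121.67, 12, 70.22, 66.5, 292.4, 60.1, 32.29],
    [183922.93, 14, 70.86, 62.9, 287.2, 52.0, 31.99],
    [150251.76, 17, 70.94, 66.2, 300.0, 62.6, 32.31],
    [183356.69, 13, 71.13, 66.9, 291.7, 67.1, 32.06],
    [130274.35, 17, 71.53, 62.5, 286.8, 62.7, 32.47],
    [286285.40, 5, 69.73, 69.4, 308.4, 70.6, 32.09],
    [72708.05, 19, 70.79, 61.9, 292.1, 56.7, 31.50],
    [99597.31, 13, 71.07, 64.1, 295.8, 57.2, 31.52],
    [85557.56, 9, 71.10, 64.1, 290.4, 69.3, 31.95],
    [46406.25, 8, 71.24, 61.1, 289.9, 65.5, 32.31],
])

y = data[:, 0]                                        # response: Earnings/Event
X = np.column_stack([np.ones(len(y)), data[:, 1:]])   # intercept + 6 predictors

# Ordinary least squares: choose coef to minimise the sum of squared residuals.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

fitted = X @ coef
residuals = y - fitted
ss_total = np.sum((y - y.mean()) ** 2)
r_squared = 1 - np.sum(residuals ** 2) / ss_total

names = ["Intercept", "Events", "Avg. Score", "GIR (%)",
         "Driving Distance", "Driving Accuracy (%)", "Putts/Round"]
for name, b in zip(names, coef):
    print(f"{name:22s} {b:16.4f}")
print(f"R Square: {r_squared:.6f}")
print(f"Predicted Earnings/Event for observation 1: {fitted[0]:.2f}")
```

If the data were keyed in correctly, the coefficients and R Square printed here should agree with the Excel Coefficients column and Regression Statistics, and `fitted` and `residuals` should line up with the RESIDUAL OUTPUT table.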
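B) Go to Data Analysis -> Regression. For Input Y Range, select the Avg. Score column; for Input X Range, select all the other columns. Tick Labels, and also tick Residuals and Standardized Residuals. Select an output range, then press OK. Note that the question asks to leave out Earnings/Event and Events, while this run keeps both in the X range; to follow the question as worded, select only GIR (%), Driving Distance, Driving Accuracy (%), and Putts/Round.
This gives the following output.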
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.857899053 |
R Square | 0.735990785 |
Adjusted R Square | 0.647987713 |
Standard Error | 0.313379912 |
Observations | 25 |
ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 6 | 4.927970554 | 0.821328426 | 8.363239717 | 0.000199396 |
Residual | 18 | 1.767725446 | 0.098206969 |
Total | 24 | 6.695696 |
Variable | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
Intercept | 53.49098873 | 7.640887788 | 7.000624825 | 1.55168E-06 | 37.43807919 | 69.54389826 | 37.43807919 | 69.54389826 |
Earnings/Event | -5.07228E-06 | 2.19678E-06 | -2.308965047 | 0.033023094 | -9.68753E-06 | -4.57024E-07 | -9.68753E-06 | -4.57024E-07 |
Events | -0.015690723 | 0.017719097 | -0.885526109 | 0.387550073 | -0.052917164 | 0.021535718 | -0.052917164 | 0.021535718 |
GIR (%)* | -0.01239611 | 0.074970262 | -0.165347028 | 0.870513709 | -0.169902786 | 0.145110566 | -0.169902786 | 0.145110566 |
Driving Distance | -0.012270372 | 0.013299185 | -0.92264086 | 0.368397139 | -0.040210923 | 0.015670179 | -0.040210923 | 0.015670179 |
Driving Accuracy (%) | -0.02112577 | 0.019884223 | -1.0624388 | 0.302083597 | -0.062900973 | 0.020649432 | -0.062900973 | 0.020649432 |
Putts/Round | 0.748378832 | 0.228860567 | 3.270020874 | 0.004253522 | 0.267560623 | 1.229197041 | 0.267560623 | 1.229197041 |
RESIDUAL OUTPUT
Observation | Predicted Avg. Score | Residuals | Standard Residuals |
1 | 70.09218605 | 0.277813953 | 1.023651831 |
2 | 69.76174916 | -0.331749164 | -1.222385111 |
3 | 70.12449473 | 0.105505271 | 0.388751761 |
4 | 70.15513167 | 0.30486833 | 1.123338194 |
5 | 69.87096118 | -0.090961176 | -0.335161622 |
6 | 70.38658676 | -0.046586763 | -0.171656695 |
7 | 70.28363962 | -0.363639618 | -1.339890807 |
8 | 70.33848736 | -0.088487365 | -0.32604645 |
9 | 70.73297297 | -0.092972967 | -0.342574399 |
10 | 70.00096642 | -0.070966418 | -0.261487601 |
11 | 70.15672982 | 0.103270184 | 0.380516212 |
12 | 70.34872736 | -0.388727365 | -1.432330791 |
13 | 70.34449886 | -0.134498863 | -0.495583485 |
14 | 70.53358165 | -0.003581651 | -0.013197191 |
15 | 70.17769894 | 0.412301057 | 1.519191993 |
16 | 70.69483861 | -0.474838611 | -1.749622037 |
17 | 70.8767429 | -0.016742903 | -0.061692019 |
18 | 70.81804045 | 0.121959546 | 0.449380281 |
19 | 70.52389203 | 0.60610797 | 2.233305876 |
20 | 71.244834 | 0.285165996 | 1.05074166 |
21 | 69.83994034 | -0.109940344 | -0.40509353 |
22 | 70.84867676 | -0.058676757 | -0.216204294 |
23 | 70.73816413 | 0.331835867 | 1.22270458 |
24 | 71.00458164 | 0.095418358 | 0.351584849 |
25 | 71.61187656 | -0.371876563 | -1.370241204 |
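The regression equation is:
Avg. Score = 53.49098873 - (5.07228*10^(-6))*Earnings/Event - (0.015690723)*Events - (0.01239611)*GIR
- (0.012270372)*Driving Distance - (0.02112577)*Driving Accuracy
+ (0.748378832)*Putts/Round
In this output only Earnings/Event (p = 0.033) and Putts/Round (p = 0.004) are significant at the 5% level.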
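Part B asks for a model that leaves out Earnings/Event and Events, whereas the output above includes both. A minimal Python sketch of the restricted fit, using only GIR (%), Driving Distance, Driving Accuracy (%) and Putts/Round as predictors, is given here. It assumes NumPy is available and uses `numpy.linalg.lstsq` in place of Excel's Regression tool; the variable names are illustrative. No results are quoted because this fit was not part of the original Excel run.

```python
import numpy as np

# Columns: Avg. Score, GIR (%), Driving Distance, Driving Accuracy (%),
# Putts/Round (copied from the Golfing Statistics table).
data = np.array([
    [70.37, 67.9, 288.4, 60.2, 31.82],
    [69.43, 69.4, 286.9, 67.9, 31.30],
    [70.23, 67.1, 276.0, 71.0, 31.81],
    [70.46, 68.0, 308.5, 56.4, 31.81],
    [69.78, 68.3, 282.9, 68.5, 31.43],
    [70.34, 65.1, 299.1, 52.7, 31.72],
    [69.92, 66.3, 287.8, 65.2, 31.68],
    [70.25, 65.3, 277.0, 62.4, 31.52],
    [70.64, 68.0, 291.8, 67.9, 32.35],
    [69.93, 68.7, 294.2, 61.3, 31.55],
    [70.26, 69.3, 298.7, 61.3, 32.31],
    [69.96, 66.0, 291.4, 64.8, 31.79],
    [70.21, 68.5, 309.8, 55.7, 31.73],
    [70.53, 68.2, 289.1, 64.8, 31.86],
    [70.59, 66.5, 279.7, 71.2, 31.30],
    [70.22, 66.5, 292.4, 60.1, 32.29],
    [70.86, 62.9, 287.2, 52.0, 31.99],
    [70.94, 66.2, 300.0, 62.6, 32.31],
    [71.13, 66.9, 291.7, 67.1, 32.06],
    [71.53, 62.5, 286.8, 62.7, 32.47],
    [69.73, 69.4, 308.4, 70.6, 32.09],
    [70.79, 61.9, 292.1, 56.7, 31.50],
    [71.07, 64.1, 295.8, 57.2, 31.52],
    [71.10, 64.1, 290.4, 69.3, 31.95],
    [71.24, 61.1, 289.9, 65.5, 32.31],
])

y = data[:, 0]                                        # response: Avg. Score
X = np.column_stack([np.ones(len(y)), data[:, 1:]])   # intercept + 4 predictors

# Ordinary least squares fit of the restricted model.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

residuals = y - X @ coef
n, p = X.shape                       # p counts the intercept as well
ss_residual = float(residuals @ residuals)
ss_total = float(np.sum((y - y.mean()) ** 2))
r_squared = 1 - ss_residual / ss_total
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - p)

names = ["Intercept", "GIR (%)", "Driving Distance",
         "Driving Accuracy (%)", "Putts/Round"]
for name, b in zip(names, coef):
    print(f"{name:22s} {b:12.6f}")
print(f"R Square: {r_squared:.6f}, Adjusted R Square: {adj_r_squared:.6f}")
```

Because this model is nested inside the six-predictor one, its R Square cannot exceed the 0.735990785 reported above; comparing Adjusted R Square across candidate subsets is one common way to decide which model is "best".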