Consider the National Football League data in Table B.1.
Use the forward selection algorithm to select a subset regression model.
Use the backward elimination algorithm to select a subset regression
model.
Use stepwise regression to select a subset regression model.
Comment on the final model chosen by these three procedures.
y | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 |
10 | 2113 | 1985 | 38.9 | 64.7 | 4 | 868 | 59.7 | 2205 | 1917 |
11 | 2003 | 2855 | 38.8 | 61.3 | 3 | 615 | 55 | 2096 | 1575 |
11 | 2957 | 1737 | 40.1 | 60 | 14 | 914 | 65.6 | 1847 | 2175 |
13 | 2285 | 2905 | 41.6 | 45.3 | -4 | 957 | 61.4 | 1903 | 2476 |
10 | 2971 | 1666 | 39.2 | 53.8 | 15 | 836 | 66.1 | 1457 | 1866 |
11 | 2309 | 2927 | 39.7 | 74.1 | 8 | 786 | 61 | 1848 | 2339 |
10 | 2528 | 2341 | 38.1 | 65.4 | 12 | 754 | 66.1 | 1564 | 2092 |
11 | 2147 | 2737 | 37 | 78.3 | -1 | 761 | 58 | 1821 | 1909 |
4 | 1689 | 1414 | 42.1 | 47.6 | -3 | 714 | 57 | 2577 | 2001 |
2 | 2566 | 1838 | 42.3 | 54.2 | -1 | 797 | 58.9 | 2476 | 2254 |
7 | 2363 | 1480 | 37.3 | 48 | 19 | 984 | 67.5 | 1984 | 2217 |
10 | 2109 | 2191 | 39.5 | 51.9 | 6 | 700 | 57.2 | 1917 | 1758 |
9 | 2295 | 2229 | 37.4 | 53.6 | -5 | 1037 | 58.8 | 1761 | 2032 |
9 | 1932 | 2204 | 35.1 | 71.4 | 3 | 986 | 58.6 | 1709 | 2025 |
6 | 2213 | 2140 | 38.8 | 58.3 | 6 | 819 | 59.2 | 1901 | 1686 |
5 | 1722 | 1730 | 36.6 | 52.6 | -19 | 791 | 54.4 | 2288 | 1835 |
5 | 1498 | 2072 | 35.3 | 59.3 | -5 | 776 | 49.6 | 2072 | 1914 |
5 | 1873 | 2929 | 41.1 | 55.3 | 10 | 789 | 54.3 | 2861 | 2496 |
6 | 2118 | 2268 | 38.2 | 69.6 | 6 | 582 | 58.7 | 2411 | 2670 |
4 | 1775 | 1983 | 39.3 | 78.3 | 7 | 901 | 51.7 | 2289 | 2202 |
3 | 1904 | 1792 | 39.7 | 38.1 | -9 | 734 | 61.9 | 2203 | 1988 |
3 | 1929 | 1606 | 39.7 | 68.8 | -21 | 627 | 52.7 | 2592 | 2324 |
4 | 2080 | 1492 | 35.5 | 68.8 | -8 | 722 | 57.8 | 2053 | 2550 |
10 | 2301 | 2835 | 35.3 | 74.1 | 2 | 683 | 59.7 | 1979 | 2110 |
6 | 2040 | 2416 | 38.7 | 50 | 0 | 576 | 54.9 | 2048 | 2628 |
8 | 2447 | 1638 | 39.9 | 57.1 | -8 | 848 | 65.3 | 1786 | 1776 |
2 | 1416 | 2649 | 37.4 | 56.3 | -22 | 684 | 43.8 | 2876 | 2524 |
0 | 1503 | 1503 | 39.3 | 47 | -9 | 875 | 53.5 | 2560 | 2241 |
I performed forward selection, backward elimination, and stepwise regression using Minitab. The steps to carry out the procedure are given below.
Open the worksheet containing the data file.
Steps to run stepwise regression in Minitab:
Here, we apply the forward selection method to the given data. Minitab uses t statistics to decide which variables enter the model. In this example we use alpha = 0.25 as the cutoff for entry.
From the analysis of variance table, the regressors selected are x2, x7, x8, and x9, since the p-value for each of them is less than alpha = 0.25.
So the forward selection procedure terminates with,
Yhat = -1.82 + 0.003819 x2 + 0.2169 x7 - 0.00401 x8 - 0.00163 x9
The R-sq value is 80.12%, so the model explains about 80% of the variation in y, which indicates a good fit.
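The forward selection rule described above can be sketched in Python. This is only an illustration under stated assumptions, not Minitab's implementation: the helper names `ols_pvalues` and `forward_selection` are hypothetical, NumPy and SciPy are assumed to be installed, and the demo data are synthetic rather than the NFL table.

```python
import numpy as np
from scipy import stats

def ols_pvalues(X, y):
    """Fit y = b0 + X b by least squares and return the two-sided
    t-test p-value of each column of X (intercept excluded)."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])          # add intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    df = n - A.shape[1]                           # residual degrees of freedom
    sigma2 = resid @ resid / df
    cov = sigma2 * np.linalg.inv(A.T @ A)
    t = beta / np.sqrt(np.diag(cov))
    return (2 * stats.t.sf(np.abs(t), df))[1:]

def forward_selection(X, y, alpha_enter=0.25):
    """Add, one at a time, the candidate with the smallest p-value,
    stopping when no candidate has p-value below alpha_enter."""
    selected = []
    remaining = list(range(X.shape[1]))
    while remaining:
        # p-value of each candidate when added to the current model
        trial = {j: ols_pvalues(X[:, selected + [j]], y)[-1] for j in remaining}
        best = min(trial, key=trial.get)
        if trial[best] >= alpha_enter:
            break
        selected.append(best)
        remaining.remove(best)
    return selected

# Synthetic demo data (an assumption): y depends on columns 0 and 2 only.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(40, 4))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 2] + rng.normal(scale=0.1, size=40)
chosen = forward_selection(X_demo, y_demo)
print("columns selected:", chosen)
```

To try it on the table above, the columns x1 to x9 would go into `X` and the column y into `y`; the result need not match the Minitab output exactly if Minitab's entry rule differs in detail.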
Similarly, you can carry out the backward elimination procedure.
In this run the cutoff value is alpha = 0.10, the default in Minitab.
Minitab uses the p-value to remove variables: a regressor is dropped if its p-value is greater than alpha = 0.10. In the analysis of variance table for the final step, every remaining regressor has a p-value less than alpha = 0.10, so all of them are retained in the regression model.
The fitted regression model is,
Yhat = -1.81 + 0.003598 x2 + 0.1940 x7 - 0.00482 x8
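A minimal sketch of the backward elimination rule in Python follows, starting from the full model and dropping the regressor with the largest p-value until every remaining p-value is at or below the cutoff. The function names `ols_pvalues` and `backward_elimination` are hypothetical, NumPy and SciPy are assumed to be available, and the demo data are synthetic; this is not Minitab's own code.

```python
import numpy as np
from scipy import stats

def ols_pvalues(X, y):
    """Two-sided t-test p-values for each column of X in the
    least-squares fit of y on X with an intercept."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])          # add intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    df = n - A.shape[1]                           # residual degrees of freedom
    sigma2 = resid @ resid / df
    cov = sigma2 * np.linalg.inv(A.T @ A)
    t = beta / np.sqrt(np.diag(cov))
    return (2 * stats.t.sf(np.abs(t), df))[1:]

def backward_elimination(X, y, alpha_remove=0.10):
    """Start with all regressors; repeatedly drop the one with the
    largest p-value while that p-value exceeds alpha_remove."""
    kept = list(range(X.shape[1]))
    while kept:
        p = ols_pvalues(X[:, kept], y)
        worst = int(np.argmax(p))                 # position of least significant term
        if p[worst] <= alpha_remove:
            break                                 # everything left is significant
        kept.pop(worst)
    return kept

# Synthetic demo data (an assumption): y depends on columns 0 and 2 only.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(40, 4))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 2] + rng.normal(scale=0.1, size=40)
kept = backward_elimination(X_demo, y_demo)
print("columns kept:", kept)
```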
For stepwise regression:
Stepwise regression requires two cutoff values, one for entering variables and one for removing them. Here a regressor is kept in the model only if its p-value is less than alpha = 0.15.
The fitted regression model is,
Yhat = -1.81 + 0.003598 x2 + 0.1940 x7 - 0.00482 x8
Backward elimination and stepwise regression therefore give the same three-variable model (x2, x7, x8), while forward selection also includes x9.
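The two-cutoff stepwise procedure can be sketched as a forward step followed by a backward check on each pass. The names `ols_pvalues` and `stepwise`, the `max_steps` safety cap, and the synthetic demo data are all assumptions made for illustration, and NumPy and SciPy are assumed to be installed; Minitab's actual procedure may differ in detail.

```python
import numpy as np
from scipy import stats

def ols_pvalues(X, y):
    """Two-sided t-test p-values for each column of X in the
    least-squares fit of y on X with an intercept."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])          # add intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    df = n - A.shape[1]                           # residual degrees of freedom
    sigma2 = resid @ resid / df
    cov = sigma2 * np.linalg.inv(A.T @ A)
    t = beta / np.sqrt(np.diag(cov))
    return (2 * stats.t.sf(np.abs(t), df))[1:]

def stepwise(X, y, alpha_enter=0.15, alpha_remove=0.15, max_steps=100):
    """Alternate an entry step and a removal step until neither
    changes the model (or max_steps passes have been made)."""
    selected = []
    for _ in range(max_steps):
        changed = False
        remaining = [j for j in range(X.shape[1]) if j not in selected]
        if remaining:
            # entry step: try each candidate added to the current model
            trial = {j: ols_pvalues(X[:, selected + [j]], y)[-1] for j in remaining}
            best = min(trial, key=trial.get)
            if trial[best] < alpha_enter:
                selected.append(best)
                changed = True
        if selected:
            # removal step: re-check every term already in the model
            p = ols_pvalues(X[:, selected], y)
            worst = int(np.argmax(p))
            if p[worst] > alpha_remove:
                selected.pop(worst)
                changed = True
        if not changed:
            break
    return selected

# Synthetic demo data (an assumption): y depends on columns 0 and 2 only.
rng = np.random.default_rng(2)
X_demo = rng.normal(size=(40, 4))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 2] + rng.normal(scale=0.1, size=40)
final = stepwise(X_demo, y_demo)
print("columns in final model:", final)
```

Keeping `alpha_enter` no larger than `alpha_remove` is the usual way to avoid a variable entering and leaving repeatedly; the `max_steps` cap is an extra guard added for this sketch.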