Problem 2. Consider the FLAG data set. The first 10 observations are given for informational purposes.
CONTRACT | COST | DOTEST | STATUS |
1 | 1379.43 | 1386.29 | 1 |
2 | 134.03 | 85.71 | 1 |
3 | 202.33 | 248.89 | 0 |
4 | 397.12 | 467.49 | 0 |
5 | 158.54 | 117.72 | 1 |
6 | 1128.11 | 1008.91 | 1 |
7 | 400.33 | 472.98 | 1 |
8 | 581.64 | 785.39 | 0 |
9 | 353.96 | 370.02 | 0 |
10 | 138.71 | 174.25 | 0 |
b) Calculate a confidence and prediction interval for DOTEST = 110.
c) Interpret the confidence and prediction intervals given in the output. Do you see any problems with the interpretation of the prediction interval in terms of what we are trying to predict?
d) Why are confidence intervals always more narrow than prediction intervals?
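The regression equation is defined as,
$$\hat{Y} = b_0 + b_1 X$$
where Y is COST and X is DOTEST.
The least-squares estimates of the intercept and slope are,
$$b_1 = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - \left(\sum X\right)^2}, \qquad b_0 = \bar{Y} - b_1 \bar{X}$$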
CONTRACT | COST, Y | DOTEST, X | X^2 | XY |
1 | 1379.43 | 1386.29 | 1921800 | 1912290 |
2 | 134.03 | 85.71 | 7346.204 | 11487.71 |
3 | 202.33 | 248.89 | 61946.23 | 50357.91 |
4 | 397.12 | 467.49 | 218546.9 | 185649.6 |
5 | 158.54 | 117.72 | 13858 | 18663.33 |
6 | 1128.11 | 1008.91 | 1017899 | 1138161 |
7 | 400.33 | 472.98 | 223710.1 | 189348.1 |
8 | 581.64 | 785.39 | 616837.5 | 456814.2 |
9 | 353.96 | 370.02 | 136914.8 | 130972.3 |
10 | 138.71 | 174.25 | 30363.06 | 24170.22 |
SUM | 4874.2 | 5117.65 | 4249222 | 4117915 |
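From the data values, with n = 10, the estimates are calculated as,
$$b_1 = \frac{10(4117915) - (5117.65)(4874.2)}{10(4249222) - (5117.65)^2} \approx 0.99588$$
$$b_0 = \frac{4874.2}{10} - b_1 \cdot \frac{5117.65}{10} = 487.42 - b_1(511.765) \approx -22.236$$
so the fitted line is $\hat{Y} = -22.236 + 0.99588X$.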
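The slope and intercept can be checked numerically. The following is a minimal sketch that assumes only the ten observations listed above are used; the variable names (`cost`, `dotest`, `b0`, `b1`) are illustrative and not part of the original solution.

```python
# First 10 observations of the FLAG data set (y = COST, x = DOTEST).
cost = [1379.43, 134.03, 202.33, 397.12, 158.54,
        1128.11, 400.33, 581.64, 353.96, 138.71]
dotest = [1386.29, 85.71, 248.89, 467.49, 117.72,
          1008.91, 472.98, 785.39, 370.02, 174.25]

n = len(cost)

# Column sums used in the least-squares formulas.
sum_x = sum(dotest)
sum_y = sum(cost)
sum_xx = sum(x * x for x in dotest)
sum_xy = sum(x * y for x, y in zip(dotest, cost))

# Least-squares slope and intercept.
b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
b0 = sum_y / n - b1 * (sum_x / n)

print(f"slope b1     = {b1:.5f}")
print(f"intercept b0 = {b0:.3f}")
```

The printed estimates should agree with the fitted line used in the residual table, up to rounding.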
b)
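The confidence interval for the mean COST at DOTEST = $X_0$ is defined as,
$$\hat{Y}_0 \pm t_{\alpha/2,\,n-2}\; s \sqrt{\frac{1}{n} + \frac{(X_0 - \bar{X})^2}{S_{XX}}}$$
where $\hat{Y}_0 = b_0 + b_1 X_0$ and $S_{XX} = \sum X^2 - \left(\sum X\right)^2 / n$.
The standard error of regression is calculated as,
$$s = \sqrt{\frac{\sum (Y - \hat{Y})^2}{n - 2}}$$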
CONTRACT | COST, Y | DOTEST, X | Y-hat=-22.236+0.99588X | (Y-Y-hat) | (Y-Y-hat)^2 |
1 | 1379.43 | 1386.29 | 1358.3411 | 21.0889 | 444.7427 |
2 | 134.03 | 85.71 | 63.1208 | 70.9092 | 5028.1181 |
3 | 202.33 | 248.89 | 225.6283 | -23.2983 | 542.8112 |
4 | 397.12 | 467.49 | 443.3275 | -46.2075 | 2135.1291 |
5 | 158.54 | 117.72 | 94.9989 | 63.5411 | 4037.4762 |
6 | 1128.11 | 1008.91 | 982.5163 | 145.5937 | 21197.5366 |
7 | 400.33 | 472.98 | 448.7948 | -48.4648 | 2348.8401 |
8 | 581.64 | 785.39 | 759.9174 | -178.2774 | 31782.8276 |
9 | 353.96 | 370.02 | 346.2591 | 7.7009 | 59.3034 |
10 | 138.71 | 174.25 | 151.2959 | -12.5859 | 158.4049 |
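SUM | | | | | 67735.1899 |
With n = 10, the standard error of regression is $s = \sqrt{67735.1899 / 8} \approx 92.016$.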
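The residual table can be reproduced with a short script. This sketch assumes the rounded coefficients from the table header (-22.236 and 0.99588), so individual fitted values may differ from the table in the third decimal place; the names `B0`, `B1`, `sse`, and `s` are illustrative.

```python
import math

# First 10 observations of the FLAG data set (y = COST, x = DOTEST).
cost = [1379.43, 134.03, 202.33, 397.12, 158.54,
        1128.11, 400.33, 581.64, 353.96, 138.71]
dotest = [1386.29, 85.71, 248.89, 467.49, 117.72,
          1008.91, 472.98, 785.39, 370.02, 174.25]

# Rounded coefficients as they appear in the table header.
B0, B1 = -22.236, 0.99588

n = len(cost)
fitted = [B0 + B1 * x for x in dotest]
residuals = [y - y_hat for y, y_hat in zip(cost, fitted)]

# Sum of squared residuals and standard error of regression (n - 2 df).
sse = sum(e * e for e in residuals)
s = math.sqrt(sse / (n - 2))

for i, (y_hat, e) in enumerate(zip(fitted, residuals), start=1):
    print(f"{i:2d}  fitted = {y_hat:10.4f}  residual = {e:10.4f}")
print(f"SSE = {sse:.2f}, s = {s:.3f}")
```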
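The prediction interval for the COST of a single contract at DOTEST = $X_0$ is obtained using the formula,
$$\hat{Y}_0 \pm t_{\alpha/2,\,n-2}\; s \sqrt{1 + \frac{1}{n} + \frac{(X_0 - \bar{X})^2}{S_{XX}}}$$
Here $X_0 = 110$, so $\hat{Y}_0 = -22.236 + 0.99588(110) \approx 87.31$, with $\bar{X} = 511.765$ and $S_{XX} = 4249222 - (5117.65)^2/10 \approx 1630188$. The extra 1 under the square root accounts for the variability of an individual observation about its mean, which is why the prediction interval is wider than the confidence interval at the same $X_0$.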
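Both intervals at DOTEST = 110 can be evaluated as follows. The problem statement does not give a confidence level, so this sketch assumes 95% and hard-codes the corresponding critical value $t_{0.025,\,8} \approx 2.306$; a different level would need a different value. All variable names are illustrative.

```python
import math

# First 10 observations of the FLAG data set (y = COST, x = DOTEST).
cost = [1379.43, 134.03, 202.33, 397.12, 158.54,
        1128.11, 400.33, 581.64, 353.96, 138.71]
dotest = [1386.29, 85.71, 248.89, 467.49, 117.72,
          1008.91, 472.98, 785.39, 370.02, 174.25]

n = len(cost)
x_bar = sum(dotest) / n
y_bar = sum(cost) / n
s_xx = sum((x - x_bar) ** 2 for x in dotest)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(dotest, cost))

# Least-squares fit and standard error of regression.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(dotest, cost))
s = math.sqrt(sse / (n - 2))

# ASSUMPTION: 95% confidence level, so t_{0.025, 8} is about 2.306.
T_CRIT = 2.306

x0 = 110.0
y0_hat = b0 + b1 * x0

# Term shared by both intervals: 1/n + (x0 - x_bar)^2 / S_xx.
h0 = 1 / n + (x0 - x_bar) ** 2 / s_xx

ci_half = T_CRIT * s * math.sqrt(h0)       # mean response
pi_half = T_CRIT * s * math.sqrt(1 + h0)   # single new observation

ci = (y0_hat - ci_half, y0_hat + ci_half)
pi = (y0_hat - pi_half, y0_hat + pi_half)

print(f"fitted value at x0 = {y0_hat:.2f}")
print(f"95% CI for mean COST: ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"95% PI for new COST:  ({pi[0]:.2f}, {pi[1]:.2f})")
```

Under the 95% assumption, the confidence interval comes out at roughly (-7, 182) and the prediction interval at roughly (-145, 320). Both have negative lower limits, which matters for the interpretation in part c).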
c)
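The prediction interval gives the range in which the COST of a single new contract with DOTEST = 110 is expected to fall, at the chosen confidence level.
The confidence interval gives the range for the mean COST of all contracts with DOTEST = 110, that is, the mean of the response variable at that value of the input variable.
One problem with the prediction interval is that the fitted cost at DOTEST = 110 is only about 87.3 while s is about 92, so at any conventional confidence level the lower limit falls below zero, and a negative cost has no practical meaning.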