In: Statistics and Probability
Question 1
A soft drink bottler is analyzing the vending machine service routes in his distribution system. He is interested in predicting the amount of time required by the route driver to service the vending machines in an outlet. The industrial engineer responsible for the study has suggested that the two most important variables affecting the delivery time (Y) are the number of cases of product stocked (X1) and the distance walked by the route driver (X2). The engineer has collected 24 observations on the delivery times, which are shown on the Excel file.
Observation | Delivery Time, y | Number of Cases, x1 | Distance, x2 (ft) |
1 | 16.68 | 7 | 560 |
2 | 11.5 | 3 | 220 |
3 | 12.03 | 3 | 340 |
4 | 14.88 | 4 | 80 |
5 | 13.75 | 6 | 150 |
6 | 18.11 | 7 | 330 |
7 | 8 | 2 | 110 |
8 | 17.83 | 7 | 210 |
9 | 21.5 | 5 | 605 |
10 | 40.33 | 16 | 688 |
11 | 21 | 10 | 215 |
12 | 13.5 | 4 | 255 |
13 | 19.75 | 6 | 462 |
14 | 24 | 9 | 448 |
15 | 29 | 10 | 776 |
16 | 15.35 | 6 | 200 |
17 | 19 | 7 | 132 |
18 | 9.5 | 3 | 36 |
19 | 35.1 | 17 | 770 |
20 | 17.9 | 10 | 140 |
21 | 52.32 | 26 | 810 |
22 | 18.75 | 9 | 450 |
23 | 19.83 | 8 | 635 |
24 | 10.75 | 4 | 150 |
a) Fit the model Y=Bo+B1X1+B2X2+E to the delivery time data, and give the least squares function.
b) Find the value of SSE that is minimized by the least squares method.
c) Estimate s, the standard deviation of the model.
d) Conduct the ANOVA F-test for model usefulness at the a = 0.05 significance level (be sure to specify the null and alternative hypotheses).
e) Conduct the individual t-tests for b1 andb2 at the a = 0.05 significance level (be sure to specify the null and alternative hypotheses).
f) Find and interpret the coefficient of determination R2, and the adjusted coefficient of determination Ra2.
g) Which model do you think would be best: a simple linear regression with X1 as the predictor variable, a simple linear regression with X2 as the predictor variable, or the first-order multiple regression model with both X1 and X2?
h) Using the chosen model from part g), predict the delivery time when 10 cases need to be stocked, and the distance to be walked is 500 ft. Give a 95% prediction interval for this estimate.
We can easily fit this regression model in Excel.
Steps to do Regression analysis.
Data->data analysis–>Regression -> put x and y in input range -> click okay.
A)
As we can see that coefficinets are highlighted in yellow colour.
so the fitted model is
B)
SSE is given in ANOVA table which is highlighted in yellow colour(Residual SS)
SSE = 124.0023974
C)
Standard deviation of Model is given by square root of MSE
D)
Now we want to check the that overall model is significant.
So p value corresponding to it is given in the anova table as Significance F = 2.85856*E-14 0 < which implies that we have enough evidenct to reject H0
which suggests that the model is significant.