In a small-scale experimental study of the relation between degree of brand liking (y) and the moisture content (x1) and sweetness (x2) of the product, the following results were obtained from a completely randomized design. Fit a regression model of the form y = beta0 + beta1 x1 + beta2 x2 + e, where e is the random error term, assumed to be normally distributed with mean 0 and constant variance.
y | x1 | x2 |
64 | 4 | 2 |
73 | 4 | 4 |
61 | 4 | 2 |
76 | 4 | 4 |
72 | 6 | 2 |
80 | 6 | 4 |
71 | 6 | 2 |
83 | 6 | 4 |
83 | 8 | 2 |
89 | 8 | 4 |
86 | 8 | 2 |
93 | 8 | 4 |
88 | 10 | 2 |
95 | 10 | 4 |
94 | 10 | 2 |
100 | 10 | 4 |
Construct and interpret a 99% prediction interval for the brand liking score when the moisture content is 5 and the sweetness level is 4.
Which interval (the confidence interval for the mean response or the prediction interval) is wider? Why?
What are the variance inflation factors associated with the two predictor variables? Is the multicollinearity assumption violated? Why or why not?
Construct and interpret a 99% prediction interval for the brand liking score when the moisture content is 5 and the sweetness level is 4.
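The 99% prediction interval for the brand liking score when the moisture content is 5 and the sweetness level is 4 is 68.481 to 86.069. That is, we can be 99% confident that the brand liking score of an individual product with moisture content 5 and sweetness level 4 will fall between 68.481 and 86.069.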
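The interval can be reproduced from the raw data. The following is a minimal sketch, not the software that produced the output: it fits the model with centered sums of squares and uses a critical value of 3.012, which is assumed from a standard t table (upper-tail area 0.005, 13 degrees of freedom). The function names are hypothetical.

```python
import math

# Observations from the data table: (y, x1, x2)
DATA = [
    (64, 4, 2), (73, 4, 4), (61, 4, 2), (76, 4, 4),
    (72, 6, 2), (80, 6, 4), (71, 6, 2), (83, 6, 4),
    (83, 8, 2), (89, 8, 4), (86, 8, 2), (93, 8, 4),
    (88, 10, 2), (95, 10, 4), (94, 10, 2), (100, 10, 4),
]

# Assumption: two-sided 99% critical value of t with 13 df, read from a t table.
T_CRIT = 3.012


def fit(data):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 using centered sums."""
    n = len(data)
    ybar = sum(y for y, _, _ in data) / n
    x1bar = sum(a for _, a, _ in data) / n
    x2bar = sum(b for _, _, b in data) / n
    s11 = sum((a - x1bar) ** 2 for _, a, _ in data)
    s22 = sum((b - x2bar) ** 2 for _, _, b in data)
    s12 = sum((a - x1bar) * (b - x2bar) for _, a, b in data)
    s1y = sum((a - x1bar) * (y - ybar) for y, a, _ in data)
    s2y = sum((b - x2bar) * (y - ybar) for y, _, b in data)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = ybar - b1 * x1bar - b2 * x2bar
    sse = sum((y - (b0 + b1 * a + b2 * b)) ** 2 for y, a, b in data)
    s = math.sqrt(sse / (n - 3))  # residual standard error, df = n - 3
    return {"n": n, "b0": b0, "b1": b1, "b2": b2, "s": s,
            "x1bar": x1bar, "x2bar": x2bar,
            "s11": s11, "s22": s22, "s12": s12, "det": det}


def prediction_interval(m, x1, x2, t_crit):
    """Return (predicted, lower, upper, leverage) for a new observation."""
    d1, d2 = x1 - m["x1bar"], x2 - m["x2bar"]
    # Leverage: 1/n plus the quadratic form in the centered predictors.
    h = 1 / m["n"] + (m["s22"] * d1 ** 2 - 2 * m["s12"] * d1 * d2
                      + m["s11"] * d2 ** 2) / m["det"]
    pred = m["b0"] + m["b1"] * x1 + m["b2"] * x2
    half = t_crit * m["s"] * math.sqrt(1 + h)
    return pred, pred - half, pred + half, h


model = fit(DATA)
pred, lower, upper, leverage = prediction_interval(model, 5, 4, T_CRIT)
print(f"predicted={pred:.3f}  PI=({lower:.3f}, {upper:.3f})  leverage={leverage:.3f}")
```

Because the critical value is rounded to three decimals, the printed limits should agree with the reported interval only up to rounding in the last digit.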
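Which interval (the confidence interval for the mean response or the prediction interval) is wider? Why?
The prediction interval is wider. A prediction interval must account for both the uncertainty in estimating the population mean response and the random variation of an individual value around that mean. At x1 = 5 and x2 = 4, the 99% confidence interval for the mean response is 73.881 to 80.669, while the 99% prediction interval is 68.481 to 86.069.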
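The difference in width can be seen in the standard textbook forms of the two intervals, sketched here with the values from the output (predicted value 77.275, standard error s = 2.693, leverage h_0 = 0.175). The critical value 3.012 is an assumption taken from a standard t table for 13 degrees of freedom; the symbols are not part of the original output.

```latex
\begin{align*}
\text{CI for the mean response:}\quad
  & \hat{y}_0 \pm t_{0.005,\,13}\, s\sqrt{h_0}
    = 77.275 \pm 3.012\,(2.693)\sqrt{0.175} \approx 77.275 \pm 3.39 \\
\text{Prediction interval:}\quad
  & \hat{y}_0 \pm t_{0.005,\,13}\, s\sqrt{1 + h_0}
    = 77.275 \pm 3.012\,(2.693)\sqrt{1.175} \approx 77.275 \pm 8.79
\end{align*}
```

The extra 1 under the square root is the variance of a single new observation, so the prediction interval is wider at every point.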
What are the variance inflation factors associated with the two predictor variables? Is the multicollinearity assumption violated? Why or why not?
The variance inflation factor is 1.000 for each of the two predictor variables. Multicollinearity is therefore not a problem: both values are far below the common rule-of-thumb cutoff of 10, and a VIF of exactly 1 means x1 and x2 are uncorrelated in this sample, as expected because every moisture level is paired equally often with each sweetness level.
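With only two predictors, each VIF reduces to 1 / (1 - r²), where r is the sample correlation between x1 and x2. The sketch below checks this for the design in the data table; the helper names are hypothetical.

```python
import math

# Predictor columns in the same row order as the data table.
x1 = [4] * 4 + [6] * 4 + [8] * 4 + [10] * 4
x2 = [2, 4, 2, 4] * 4


def pearson_r(a, b):
    """Sample correlation coefficient between two equal-length lists."""
    n = len(a)
    abar, bbar = sum(a) / n, sum(b) / n
    sab = sum((u - abar) * (v - bbar) for u, v in zip(a, b))
    saa = sum((u - abar) ** 2 for u in a)
    sbb = sum((v - bbar) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)


def vif_two_predictors(a, b):
    """VIF shared by both predictors in a two-predictor model."""
    r = pearson_r(a, b)
    return 1.0 / (1.0 - r ** 2)


r = pearson_r(x1, x2)
vif = vif_two_predictors(x1, x2)
print(f"r = {r:.3f}, VIF = {vif:.3f}")
```

For models with more than two predictors, each VIF would instead come from regressing that predictor on all the others, so this shortcut applies only to the two-predictor case.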
The output is:
R² | 0.952 |
Adjusted R² | 0.945 |
R | 0.976 |
Std. Error | 2.693 |
n | 16 |
k | 2 |
Dep. Var. | y |
ANOVA table
Source | SS | df | MS | F | p-value |
Regression | 1,872.7000 | 2 | 936.3500 | 129.08 | 2.66E-09 |
Residual | 94.3000 | 13 | 7.2538 | | |
Total | 1,967.0000 | 15 | | | |
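The summary statistics follow directly from the sums of squares in the ANOVA table. A short sketch of that arithmetic, using only values from the output (the variable names are illustrative):

```python
import math

# Values taken from the ANOVA table and summary block.
ss_reg, ss_res = 1872.7, 94.3
n, k = 16, 2

ss_tot = ss_reg + ss_res
df_res = n - k - 1                 # residual degrees of freedom
ms_reg = ss_reg / k                # mean square for regression
ms_res = ss_res / df_res           # mean square error
f_stat = ms_reg / ms_res
r2 = ss_reg / ss_tot
adj_r2 = 1 - (ss_res / df_res) / (ss_tot / (n - 1))
multiple_r = math.sqrt(r2)
std_error = math.sqrt(ms_res)

print(f"F = {f_stat:.2f}, R^2 = {r2:.3f}, adj. R^2 = {adj_r2:.3f}, "
      f"R = {multiple_r:.3f}, Std. Error = {std_error:.3f}")
```

The p-value for F is not recomputed here because it requires the F distribution, which is outside the Python standard library.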
Regression output
variables | coefficients | std. error | t (df=13) | p-value | 99% CI lower | 99% CI upper | VIF |
Intercept | 37.6500 | | | | | | |
x1 | 4.4250 | 0.3011 | 14.695 | 1.78E-09 | 3.5179 | 5.3321 | 1.000 |
x2 | 4.3750 | 0.6733 | 6.498 | 2.01E-05 | 2.3468 | 6.4032 | 1.000 |
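The t statistics and 99% confidence limits in the regression output can be checked from each coefficient and its standard error. This is a rough sketch using the rounded table values and an assumed critical value of 3.012 (standard t table, 13 degrees of freedom), so the results match the output only up to rounding.

```python
# Assumption: two-sided 99% critical value of t with 13 df, from a t table.
T_CRIT = 3.012

# (coefficient, standard error) pairs copied from the regression output.
coefficients = {
    "x1": (4.4250, 0.3011),
    "x2": (4.3750, 0.6733),
}

results = {}
for name, (coef, se) in coefficients.items():
    t_stat = coef / se              # tests H0: coefficient = 0
    half_width = T_CRIT * se        # half-width of the 99% interval
    results[name] = (t_stat, coef - half_width, coef + half_width)
    print(f"{name}: t = {t_stat:.3f}, "
          f"99% CI = ({coef - half_width:.4f}, {coef + half_width:.4f})")
```

Neither interval contains 0, which is consistent with the very small p-values reported for both predictors.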
Predicted values for: y
x1 | x2 | Predicted | 99% CI lower | 99% CI upper | 99% PI lower | 99% PI upper | Leverage |
5 | 4 | 77.275 | 73.881 | 80.669 | 68.481 | 86.069 | 0.175 |