In a small-scale experimental study of the relation between degree of brand liking (y) and the moisture content (x1) and sweetness (x2) of the product, results were obtained from an experiment based on a completely randomized design. The data are in the Quiz3.xlsx file (Tab Experiment). x1 and x2 are indices and have no units of measurement.
Fit a regression model of the form y = β0 + β1 x1 + β2 x2 + e.
Here, e is the random error term which is assumed to be normally distributed with mean 0 and constant variance.
Data Set:
y | x1 | x2 |
64 | 4 | 2 |
73 | 4 | 4 |
61 | 4 | 2 |
76 | 4 | 4 |
72 | 6 | 2 |
80 | 6 | 4 |
71 | 6 | 2 |
83 | 6 | 4 |
83 | 8 | 2 |
89 | 8 | 4 |
86 | 8 | 2 |
93 | 8 | 4 |
88 | 10 | 2 |
95 | 10 | 4 |
94 | 10 | 2 |
100 | 10 | 4 |
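The requested fit can be reproduced by hand from the data set above. The following is a minimal sketch, not the software used for the reported output: it solves the normal equations (X'X) b = X'y in plain Python, and the helper name `solve` is purely illustrative.

```python
# Data from the table above, as (y, x1, x2)
rows = [
    (64, 4, 2), (73, 4, 4), (61, 4, 2), (76, 4, 4),
    (72, 6, 2), (80, 6, 4), (71, 6, 2), (83, 6, 4),
    (83, 8, 2), (89, 8, 4), (86, 8, 2), (93, 8, 4),
    (88, 10, 2), (95, 10, 4), (94, 10, 2), (100, 10, 4),
]

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, a[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

# Design matrix with a leading column of ones for the intercept
X = [(1.0, x1, x2) for _, x1, x2 in rows]
y = [r[0] for r in rows]

# Normal equations: (X'X) b = X'y
xtx = [[sum(xi[j] * xi[k] for xi in X) for k in range(3)] for j in range(3)]
xty = [sum(xi[j] * yi for xi, yi in zip(X, y)) for j in range(3)]

b0, b1, b2 = solve(xtx, xty)
print(f"y-hat = {b0:.4f} + {b1:.4f} x1 + {b2:.4f} x2")
```

The estimates should agree with the coefficients in the regression output (37.6500, 4.4250, 4.3750).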
Construct and interpret a 99% prediction interval for the brand liking score when the moisture content is 5 and the sweetness level is 4.
Which interval (the mean response or the prediction interval) is wider? Why?
Conduct the global test of model significance. List out all the steps in detail.
What are the variance inflation factors associated with the two predictor variables? Is the multicollinearity assumption violated? Why or why not?
Construct and interpret a 99% prediction interval for the brand liking score when the moisture content is 5 and the sweetness level is 4.
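The predicted brand liking score when the moisture content is 5 and the sweetness level is 4 is 77.275, and the 99% prediction interval is 68.481 to 86.069.
Interpretation: we are 99% confident that the brand liking score of an individual product with moisture content 5 and sweetness level 4 will fall between 68.481 and 86.069.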
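The interval can be checked from the raw data with the usual formula y-hat ± t · s · sqrt(1 + h0), where h0 = x0'(X'X)^-1 x0 is the leverage of the new point. This is a rough sketch under two assumptions: the critical value t(0.995, 13) ≈ 3.012 is taken from a t-table rather than computed, and the helper `solve` is an illustrative name, not part of any library.

```python
import math

# Data from the question, as (y, x1, x2)
rows = [
    (64, 4, 2), (73, 4, 4), (61, 4, 2), (76, 4, 4),
    (72, 6, 2), (80, 6, 4), (71, 6, 2), (83, 6, 4),
    (83, 8, 2), (89, 8, 4), (86, 8, 2), (93, 8, 4),
    (88, 10, 2), (95, 10, 4), (94, 10, 2), (100, 10, 4),
]

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, a[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

X = [(1.0, x1, x2) for _, x1, x2 in rows]
y = [r[0] for r in rows]
n, p = len(rows), 3  # p counts the intercept plus two slopes

xtx = [[sum(xi[j] * xi[k] for xi in X) for k in range(p)] for j in range(p)]
xty = [sum(xi[j] * yi for xi, yi in zip(X, y)) for j in range(p)]
beta = solve(xtx, xty)

# Residual standard error s = sqrt(SSE / (n - p))
sse = sum((yi - sum(b * x for b, x in zip(beta, xi))) ** 2 for xi, yi in zip(X, y))
s = math.sqrt(sse / (n - p))

# New point: intercept term, moisture content 5, sweetness 4
x0 = (1.0, 5.0, 4.0)
y_hat = sum(b * x for b, x in zip(beta, x0))

# Leverage h0 = x0' (X'X)^-1 x0, obtained by solving (X'X) v = x0
v = solve(xtx, x0)
h0 = sum(a * b for a, b in zip(x0, v))

T_CRIT = 3.012  # assumed: t(0.995, df=13) read from a t-table
half_width = T_CRIT * s * math.sqrt(1 + h0)
pi_lower, pi_upper = y_hat - half_width, y_hat + half_width
print(f"{y_hat:.3f}  [{pi_lower:.3f}, {pi_upper:.3f}]  leverage={h0:.3f}")
```

Because the critical value is rounded to three decimals, the limits may differ from the reported 68.481 and 86.069 in the last digit.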
Which interval (the mean response or the prediction interval) is wider? Why?
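The prediction interval (68.481 to 86.069) is wider than the confidence interval for the mean response (73.881 to 80.669). A prediction interval must account for both the uncertainty in estimating the mean response and the random variation of an individual observation around that mean, whereas the confidence interval accounts only for the first.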
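The difference in width comes from a single extra term under the square root: the mean-response interval uses s · sqrt(h0), while the prediction interval uses s · sqrt(1 + h0). A small sketch using the values reported in the regression output (s = 2.693, leverage 0.175) and an assumed t-table value of 3.012 for t(0.995, 13):

```python
import math

s = 2.693       # residual standard error from the regression output
h0 = 0.175      # leverage at x1 = 5, x2 = 4 from the regression output
t_crit = 3.012  # assumed: t(0.995, df=13) read from a t-table

# Half-width of the 99% confidence interval for the mean response
ci_half = t_crit * s * math.sqrt(h0)

# Half-width of the 99% prediction interval: the extra "1 +" is the
# variance of a single new observation around the mean
pi_half = t_crit * s * math.sqrt(1 + h0)

print(f"CI half-width: {ci_half:.3f}")
print(f"PI half-width: {pi_half:.3f}")
```

Since 1 + h0 is always larger than h0, the prediction interval is wider at every point, not only at this one.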
Conduct the global test of model significance. List out all the steps in detail.
The hypothesis being tested is:
H0: β1 = β2 = 0
H1: At least one βi ≠ 0
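The test statistic is F = MSR / MSE = 936.3500 / 7.2538 = 129.08, with 2 and 13 degrees of freedom.
The p-value is 2.66E-09 (0.0000 to four decimal places).
Since the p-value is less than the significance level (0.05), we reject the null hypothesis.
Therefore, we conclude that the model is significant: at least one of moisture content and sweetness is useful in predicting brand liking.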
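The steps of the global F test can be traced numerically. The sketch below takes the reported coefficients as given, rebuilds the ANOVA sums of squares from the data, and computes the p-value with a closed form that holds only when the numerator degrees of freedom equal 2: P(F > f) = (1 + 2f / df2)^(-df2 / 2). Variable names are illustrative.

```python
# Data from the question, as (y, x1, x2)
rows = [
    (64, 4, 2), (73, 4, 4), (61, 4, 2), (76, 4, 4),
    (72, 6, 2), (80, 6, 4), (71, 6, 2), (83, 6, 4),
    (83, 8, 2), (89, 8, 4), (86, 8, 2), (93, 8, 4),
    (88, 10, 2), (95, 10, 4), (94, 10, 2), (100, 10, 4),
]

# Coefficients taken from the reported regression output
b0, b1, b2 = 37.65, 4.425, 4.375

n, k = len(rows), 2  # k = number of predictors
y = [r[0] for r in rows]
y_bar = sum(y) / n
fitted = [b0 + b1 * x1 + b2 * x2 for _, x1, x2 in rows]

# ANOVA decomposition: SST = SSR + SSE
sst = sum((yi - y_bar) ** 2 for yi in y)
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
ssr = sst - sse

df_reg, df_res = k, n - k - 1
msr, mse = ssr / df_reg, sse / df_res
f_stat = msr / mse

# Upper-tail probability of F(2, df_res); this closed form is valid
# only because the numerator degrees of freedom are exactly 2
p_value = (1 + 2 * f_stat / df_res) ** (-df_res / 2)

alpha = 0.05
print(f"F = {f_stat:.2f}, p-value = {p_value:.3g}, reject H0: {p_value < alpha}")
```

For a model with a different number of predictors, the p-value would need an F distribution routine or table instead of this shortcut.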
What are the variance inflation factors associated with the two predictor variables? Is the multicollinearity assumption violated? Why or why not?
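The variance inflation factor for each of the two predictor variables is 1.000. Therefore, the assumption of no multicollinearity is not violated: both variance inflation factors are well below the common cutoff of 10. A VIF of exactly 1 means that x1 and x2 are uncorrelated, which is expected here because every moisture level is paired equally often with each sweetness level.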
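With only two predictors, the R² from regressing one predictor on the other is the squared correlation between them, so VIF = 1 / (1 - r²) for both. A minimal sketch of that calculation on the design used in the experiment (the function name `pearson_r` is illustrative):

```python
import math

# Predictor values in the same order as the data table
x1 = [4] * 4 + [6] * 4 + [8] * 4 + [10] * 4
x2 = [2, 4, 2, 4] * 4

def pearson_r(a, b):
    """Sample correlation coefficient between two equal-length sequences."""
    a_bar, b_bar = sum(a) / len(a), sum(b) / len(b)
    sab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    saa = sum((ai - a_bar) ** 2 for ai in a)
    sbb = sum((bi - b_bar) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)

r = pearson_r(x1, x2)

# Two-predictor case: the VIF is the same for x1 and x2
vif = 1.0 / (1.0 - r ** 2)
print(f"r = {r:.3f}, VIF = {vif:.3f}")
```

With three or more predictors, each VIF would instead require the R² from regressing that predictor on all of the others.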
The output is:
R² | 0.952 |
Adjusted R² | 0.945 |
R | 0.976 |
Std. Error | 2.693 |
n | 16 |
k | 2 |
Dep. Var. | y |
ANOVA table
Source | SS | df | MS | F | p-value |
Regression | 1,872.7000 | 2 | 936.3500 | 129.08 | 2.66E-09 |
Residual | 94.3000 | 13 | 7.2538 | | |
Total | 1,967.0000 | 15 | | | |
Regression output
variables | coefficients | std. error | t (df=13) | p-value | 99% CI lower | 99% CI upper | VIF |
Intercept | 37.6500 | | | | | | |
x1 | 4.4250 | 0.3011 | 14.695 | 1.78E-09 | 3.5179 | 5.3321 | 1.000 |
x2 | 4.3750 | 0.6733 | 6.498 | 2.01E-05 | 2.3468 | 6.4032 | 1.000 |
Predicted values for: y
x1 | x2 | Predicted | 99% CI lower | 99% CI upper | 99% PI lower | 99% PI upper | Leverage |
5 | 4 | 77.275 | 73.881 | 80.669 | 68.481 | 86.069 | 0.175 |