Case | taste | Acetic | H2S | Lactic |
1 | 12.3 | 4.543 | 3.135 | 0.86 |
2 | 20.9 | 5.159 | 5.043 | 1.53 |
3 | 39 | 5.366 | 5.438 | 1.57 |
4 | 47.9 | 5.759 | 7.496 | 1.81 |
5 | 5.6 | 4.663 | 3.807 | 0.99 |
6 | 25.9 | 5.697 | 7.601 | 1.09 |
7 | 37.3 | 5.892 | 8.726 | 1.29 |
8 | 21.9 | 6.078 | 7.966 | 1.78 |
9 | 18.1 | 4.898 | 3.85 | 1.29 |
10 | 21 | 5.242 | 4.174 | 1.58 |
11 | 34.9 | 5.74 | 6.142 | 1.68 |
12 | 57.2 | 6.446 | 7.908 | 1.9 |
13 | 0.7 | 4.477 | 2.996 | 1.06 |
14 | 25.9 | 5.236 | 4.942 | 1.3 |
15 | 54.9 | 6.151 | 6.752 | 1.52 |
16 | 40.9 | 6.365 | 9.588 | 1.74 |
17 | 15.9 | 4.787 | 3.912 | 1.16 |
18 | 6.4 | 5.412 | 4.7 | 1.49 |
19 | 18 | 5.247 | 6.174 | 1.63 |
20 | 38.9 | 5.438 | 9.064 | 1.99 |
21 | 14 | 4.564 | 4.949 | 1.15 |
22 | 15.2 | 5.298 | 5.22 | 1.33 |
23 | 32 | 5.455 | 9.242 | 1.44 |
24 | 56.7 | 5.855 | 10.199 | 2.01 |
25 | 16.8 | 5.366 | 3.664 | 1.31 |
26 | 11.6 | 6.043 | 3.219 | 1.46 |
27 | 26.5 | 6.458 | 6.962 | 1.72 |
28 | 0.7 | 5.328 | 3.912 | 1.25 |
29 | 13.4 | 5.802 | 6.685 | 1.08 |
30 | 5.5 | 6.176 | 4.787 | 1.25 |
Please help me with Activity 8; my answers to Activities 6 and 7 are below it.
Activity 8: If you add the proportions of variability in taste that can be explained by each variable individually (your results from Activity 6), you do not get the same result as the proportion of variability that can be explained by the combined model in Activity 7. Why is this? Look at the relationships that the predictor variables have with one another by constructing scatterplots and finding the correlations between hydrogen sulfide and lactic acid, between hydrogen sulfide and acetic acid, and between lactic acid and acetic acid.
Activity 6
What proportion of the variability in taste can be explained by hydrogen sulfide?
r^2 = 0.7558 * 0.7558 = 0.5712
What proportion of the variability in taste can be explained by lactic acid?
r^2 = 0.7042 * 0.7042 = 0.4959
What proportion of the variability in taste can be explained by acetic acid?
r^2 = 0.5495 * 0.5495 = 0.3020
Activity 7
Estimate the equation of the regression line predicting taste score based on all three predictor variables in a single equation.
taste = -28.877 + 0.328 * Acetic + 3.912 * H2S + 19.671 * Lactic
What taste score would you predict for a cheese whose hydrogen sulfide measurement was 5.0, whose acetic acid measurement was 6.1, and whose lactic acid measurement was 0.90?
taste = 10.39
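The predicted score can be checked by plugging the three measurements into the fitted equation. The snippet below is a minimal sketch that uses the rounded coefficients reported above; the names `INTERCEPT`, `COEF`, and `predict_taste` are illustrative and not part of the original activity.

```python
# Rounded coefficients as reported in the Activity 7 regression equation.
INTERCEPT = -28.877
COEF = {"Acetic": 0.328, "H2S": 3.912, "Lactic": 19.671}


def predict_taste(acetic, h2s, lactic):
    """Predicted taste score from the three-predictor equation."""
    return (
        INTERCEPT
        + COEF["Acetic"] * acetic
        + COEF["H2S"] * h2s
        + COEF["Lactic"] * lactic
    )


# Cheese from the question: H2S = 5.0, Acetic = 6.1, Lactic = 0.90.
prediction = predict_taste(acetic=6.1, h2s=5.0, lactic=0.90)
print(f"predicted taste = {prediction:.2f}")
```

Because the coefficients are rounded to three decimals, the result agrees with the reported 10.39 only to about two decimal places.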
What proportion of the variability in taste can be explained by the model using all three predictor variables?
R^2 = 0.6518
For the given data, Activities 6 and 7 show that the proportions of variability in taste explained by each variable individually (the individual r^2 values) do not add up to the proportion of variability explained by the combined model (R^2). To see why, look at the scatterplots and correlations for each pair of predictor variables:
1. Hydrogen sulfide and lactic acid
Correlation (using the Excel function): r = 0.645
2. Hydrogen sulfide and acetic acid
Correlation: r = 0.618
3. Lactic acid and acetic acid
Correlation: r = 0.604
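The pairwise correlations can also be reproduced outside Excel. The sketch below uses only the Python standard library and the 30 rows from the table in the question; `ROWS`, `COLUMNS`, `pearson_r`, and `CORR` are illustrative names, and the correlations with taste are included only as a cross-check against Activity 6.

```python
from itertools import combinations
from math import sqrt

# Data copied from the table in the question:
# (taste, Acetic, H2S, Lactic) for cases 1 to 30.
ROWS = [
    (12.3, 4.543, 3.135, 0.86),
    (20.9, 5.159, 5.043, 1.53),
    (39.0, 5.366, 5.438, 1.57),
    (47.9, 5.759, 7.496, 1.81),
    (5.6, 4.663, 3.807, 0.99),
    (25.9, 5.697, 7.601, 1.09),
    (37.3, 5.892, 8.726, 1.29),
    (21.9, 6.078, 7.966, 1.78),
    (18.1, 4.898, 3.850, 1.29),
    (21.0, 5.242, 4.174, 1.58),
    (34.9, 5.740, 6.142, 1.68),
    (57.2, 6.446, 7.908, 1.90),
    (0.7, 4.477, 2.996, 1.06),
    (25.9, 5.236, 4.942, 1.30),
    (54.9, 6.151, 6.752, 1.52),
    (40.9, 6.365, 9.588, 1.74),
    (15.9, 4.787, 3.912, 1.16),
    (6.4, 5.412, 4.700, 1.49),
    (18.0, 5.247, 6.174, 1.63),
    (38.9, 5.438, 9.064, 1.99),
    (14.0, 4.564, 4.949, 1.15),
    (15.2, 5.298, 5.220, 1.33),
    (32.0, 5.455, 9.242, 1.44),
    (56.7, 5.855, 10.199, 2.01),
    (16.8, 5.366, 3.664, 1.31),
    (11.6, 6.043, 3.219, 1.46),
    (26.5, 6.458, 6.962, 1.72),
    (0.7, 5.328, 3.912, 1.25),
    (13.4, 5.802, 6.685, 1.08),
    (5.5, 6.176, 4.787, 1.25),
]

NAMES = ("taste", "Acetic", "H2S", "Lactic")
# Transpose the rows so each variable name maps to its column of values.
COLUMNS = dict(zip(NAMES, zip(*ROWS)))


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Correlation for every pair of variables, keyed by (name_a, name_b).
CORR = {
    (a, b): pearson_r(COLUMNS[a], COLUMNS[b])
    for a, b in combinations(NAMES, 2)
}

for (a, b), r in CORR.items():
    print(f"{a:>6} vs {b:<6}  r = {r:.3f}")
```

The same `COLUMNS` dictionary could be passed to any plotting library to draw the three predictor scatterplots the activity asks for.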
All three pairs of predictors show a moderate positive relationship with each other. This means the effect of one predictor on taste cannot be separated cleanly from the effects of the others, because the predictors carry overlapping information about taste. This is multicollinearity.
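The overlap also shows up in the Activity 6 results themselves: each predictor on its own explains a nonzero share of the variability in taste, and the three shares sum to more than 1 (0.5712 + 0.4959 + 0.3020 = 1.3691). Since no model can explain more than 100% of the variability, the individual shares must be counting some of the same variability more than once.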
Because the predictors are correlated with one another as well as with taste, the proportions of variability explained by each variable individually do not add up to the proportion explained by the combined model: the effects of correlated predictors are not simply additive. The individual proportions would sum to the combined R^2 only if the predictors were uncorrelated with one another.
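The non-additivity can be made concrete with the correlations alone. For a linear regression, the combined R^2 equals the sum of each standardized regression weight times that predictor's correlation with taste, where the weights solve the system formed by the predictor correlation matrix. The sketch below applies that identity to the rounded correlations reported in this answer; `solve` is a small illustrative helper (plain Gaussian elimination), not a library function.

```python
# Correlations reported in the answer, rounded as shown there.
# Predictor order throughout: Acetic, H2S, Lactic.
r_taste = [0.5495, 0.7558, 0.7042]  # each predictor vs taste (Activity 6)
r_pred = [                          # predictor vs predictor (Activity 8)
    [1.000, 0.618, 0.604],
    [0.618, 1.000, 0.645],
    [0.604, 0.645, 1.000],
]


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


# Standardized regression weights, then R^2 = sum(weight_i * r_i).
betas = solve(r_pred, r_taste)
r2_combined = sum(b * r for b, r in zip(betas, r_taste))

# Naive total obtained by adding the three individual r^2 values.
r2_sum_individual = sum(r * r for r in r_taste)

print(f"sum of individual r^2: {r2_sum_individual:.4f}")
print(f"combined R^2:          {r2_combined:.4f}")
```

With these rounded inputs the naive sum comes out near 1.37 while the combined value lands near 0.65, in line with the R^2 reported in Activity 7. Replacing the off-diagonal entries of `r_pred` with zeros makes the two numbers coincide, which is the uncorrelated case.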