A study was made on the amount of converted sugar in a certain process at various temperatures. The data were coded and recorded as follows:
Temperature, x | converted sugar, y |
1 | 8.1 |
1.1 | 7.8 |
1.2 | 8.5 |
1.3 | 9.8 |
1.4 | 9.5 |
1.5 | 8.9 |
1.6 | 8.6 |
1.7 | 10.2 |
1.8 | 9.3 |
1.9 | 9.2 |
2 | 10.5 |
a) Estimate the linear regression line.
b) Estimate the mean amount of converted sugar produced when the
coded temperature is 1.75.
c) Plot the residuals versus temperature. Comment.
d) Compute SSE and estimate the variance.
e) Construct a 95% confidence interval for the intercept.
f) Construct a 95% confidence interval for the slope.
g) Use an ANOVA approach to test the hypothesis that slope = 0
against the alternative hypothesis slope ≠ 0 at the 0.05 level of
significance.
Using Excel
a) Excel Steps for obtaining the estimates of regression coefficients
1) Go to "DATA". Select "Data Analysis".
2) Go to "Regression" and click on "OK".
3) Enter the Input Y Range and Input X Range by selecting the converted sugar and temperature columns, respectively. Tick "Labels" if the selection includes the header row.
4) Press "OK".
Excel Output
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.707026444 |
R Square | 0.499886392 |
Adjusted R Square | 0.444318214 |
Standard Error | 0.632607239 |
Observations | 11 |
ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 1 | 3.600090909 | 3.600091 | 8.995911 | 0.014972903 |
Residual | 9 | 3.601727273 | 0.400192 | | |
Total | 10 | 7.201818182 | | | |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
Intercept | 6.413636364 | 0.924638018 | 6.936375 | 6.79E-05 | 4.321959853 | 8.505312874 |
Temperature, x | 1.809090909 | 0.603167336 | 2.999318 | 0.014973 | 0.444631602 | 3.173550217 |
The estimated linear regression line is given by
ŷ = 6.4136 + 1.8091x
b) The estimated mean amount of converted sugar produced when the coded temperature is 1.75 is given by
ŷ = 6.4136 + 1.8091(1.75) = 9.5795
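To check the Excel coefficients by hand, here is a minimal Python sketch of the usual least-squares formulas, b1 = Sxy / Sxx and b0 = ȳ - b1·x̄, applied to the data above. The function name fit_line and the variable names are illustrative only and are not part of the original solution.

```python
# Coded data from the problem statement.
x = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0]
y = [8.1, 7.8, 8.5, 9.8, 9.5, 8.9, 8.6, 10.2, 9.3, 9.2, 10.5]


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Corrected sum of squares of x and corrected cross-product of x and y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = fit_line(x, y)
y_hat_175 = b0 + b1 * 1.75  # estimated mean response at coded temperature 1.75

print(f"intercept = {b0:.4f}, slope = {b1:.4f}")
print(f"estimated mean at x = 1.75: {y_hat_175:.4f}")
```

The printed intercept and slope should agree with the "Coefficients" column of the Excel output up to rounding.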
c) Plot the residuals versus temperature
Using Excel
1) Go to "DATA". Select "Data Analysis".
2) Go to "Regression" and click on "OK".
3) Enter the Input Y Range and Input X Range by selecting the converted sugar and temperature columns, respectively. Tick "Labels" if the selection includes the header row.
4) For the residual versus temperature plot, tick "Residual Plots".
5) Click on "OK".
Output
Temperature, x | Residuals |
1 | -0.122727273 |
1.1 | -0.603636364 |
1.2 | -0.084545455 |
1.3 | 1.034545455 |
1.4 | 0.553636364 |
1.5 | -0.227272727 |
1.6 | -0.708181818 |
1.7 | 0.710909091 |
1.8 | -0.37 |
1.9 | -0.650909091 |
2 | 0.468181818 |
Interpretation: The residuals scatter randomly about zero with no evident pattern, which indicates no obvious model defects. A linear model appears adequate for the data.
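The residuals listed above can be reproduced outside Excel. The sketch below is one way to do it, assuming the coefficients are copied from the summary output in part (a); it prints each residual next to its temperature so the signs can be checked for a pattern. The variable names are illustrative.

```python
# Coded data from the problem statement.
x = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0]
y = [8.1, 7.8, 8.5, 9.8, 9.5, 8.9, 8.6, 10.2, 9.3, 9.2, 10.5]

# Coefficients copied from the Excel summary output in part (a).
b0, b1 = 6.413636364, 1.809090909

# Residual = observed value minus fitted value.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

for xi, ei in zip(x, residuals):
    print(f"x = {xi:.1f}   residual = {ei:+.4f}")

# For a least-squares fit with an intercept, the residuals sum to about zero.
residual_sum = sum(residuals)
print(f"sum of residuals = {residual_sum:.6f}")
```

Plotting these pairs with any charting tool gives the same picture as Excel's "Residual Plots" option.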
d) Using the summary output in part (a), we have the ANOVA table
ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 1 | 3.600090909 | 3.600091 | 8.995911 | 0.014972903 |
Residual | 9 | 3.601727273 | 0.400192 | | |
Total | 10 | 7.201818182 | | | |
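Sum of squares for error, SSE = 3.6017 (the Residual SS in the ANOVA table).
The estimate of the variance σ² is given by
s² = SSE / (n - 2)
where n is the sample size.
Hence, the estimate of σ² is given by
s² = 3.6017 / 9 = 0.4002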
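The SSE and the variance estimate can also be computed directly from the data. This is a minimal sketch, assuming the standard definitions SSE = Σ(y - ŷ)² and s² = SSE / (n - 2); the variable names are illustrative.

```python
# Coded data from the problem statement.
x = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0]
y = [8.1, 7.8, 8.5, 9.8, 9.5, 8.9, 8.6, 10.2, 9.3, 9.2, 10.5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope and intercept.
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Error sum of squares and the variance estimate on n - 2 degrees of freedom.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s2 = sse / (n - 2)

print(f"SSE = {sse:.4f}")
print(f"s^2 = {s2:.4f}")
```

The two printed values should match the Residual SS and Residual MS entries of the ANOVA table up to rounding.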
e) Construct a 95% confidence interval for intercept
From the summary output of part (a), we have
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
Intercept | 6.413636364 | 0.924638018 | 6.936375 | 6.79E-05 | 4.321959853 | 8.505313 |
Temperature, x | 1.809090909 | 0.603167336 | 2.999318 | 0.014973 | 0.444631602 | 3.17355 |
The 95% confidence interval for the intercept is (4.3220, 8.5053).
f) Construct a 95% confidence interval for slope.
From the summary output of part (a), we have
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
Intercept | 6.413636364 | 0.924638018 | 6.936375 | 6.79E-05 | 4.321959853 | 8.505313 |
Temperature, x | 1.809090909 | 0.603167336 | 2.999318 | 0.014973 | 0.444631602 | 3.17355 |
The 95% confidence interval for the slope is (0.4446, 3.1736).
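Excel's "Lower 95%" and "Upper 95%" columns follow the form estimate ± t × (standard error). The sketch below reproduces both intervals under that assumption. The critical value T_CRIT = 2.262157 is the tabulated t value for 0.025 in the upper tail with 9 degrees of freedom; it is supplied here as an assumption and does not appear in the original solution.

```python
import math

# Coded data from the problem statement.
x = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0]
y = [8.1, 7.8, 8.5, 9.8, 9.5, 8.9, 8.6, 10.2, 9.3, 9.2, 10.5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Residual standard error on n - 2 degrees of freedom.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))

# Standard errors of the slope and the intercept.
se_b1 = s / math.sqrt(sxx)
se_b0 = s * math.sqrt(sum(xi ** 2 for xi in x) / (n * sxx))

# Assumed table value: t with 9 d.f. and 0.025 in the upper tail.
T_CRIT = 2.262157

ci_intercept = (b0 - T_CRIT * se_b0, b0 + T_CRIT * se_b0)
ci_slope = (b1 - T_CRIT * se_b1, b1 + T_CRIT * se_b1)

print(f"intercept: ({ci_intercept[0]:.4f}, {ci_intercept[1]:.4f})")
print(f"slope:     ({ci_slope[0]:.4f}, {ci_slope[1]:.4f})")
```

Since the interval for the slope does not contain zero, it is consistent with the ANOVA test in part (g).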
g) For testing slope = 0, we have the ANOVA table from part (a)
ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 1 | 3.600090909 | 3.600091 | 8.995911 | 0.014972903 |
Residual | 9 | 3.601727273 | 0.400192 | | |
Total | 10 | 7.201818182 | | | |
The test statistic for testing H0: slope = 0 against H1: slope ≠ 0 is
F = MSRegression / MSResidual = 3.600091 / 0.400192 = 8.9959
Under H0: slope = 0, MSRegression and MSResidual are independently distributed, and their ratio F
follows an F distribution with (1, 9) degrees of freedom.
The decision rule is to reject H0 in favor of H1: slope ≠ 0 if
F calculated > F critical.
The F critical value is 5.12 at d.f. (1, 9) at the 5% level of significance.
Conclusion: Since F = 8.9959 > 5.12 = F critical, we reject H0. Hence slope ≠ 0, and the regressor makes a significant contribution to the model.
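The ANOVA decomposition behind this test can be sketched in a few lines of Python. The example assumes the usual identities SST = Σ(y - ȳ)², SSR = b1 · Sxy and SSE = SST - SSR, and it takes the critical value 5.12 from the solution above rather than computing it. The variable names are illustrative.

```python
# Coded data from the problem statement.
x = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0]
y = [8.1, 7.8, 8.5, 9.8, 9.5, 8.9, 8.6, 10.2, 9.3, 9.2, 10.5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx

# Partition the total variation into regression and residual parts.
sst = sum((yi - y_bar) ** 2 for yi in y)
ssr = b1 * sxy
sse = sst - ssr

# Mean squares: 1 d.f. for regression, n - 2 d.f. for residual.
ms_regression = ssr / 1
ms_residual = sse / (n - 2)
f_stat = ms_regression / ms_residual

F_CRIT = 5.12  # 5% critical value for (1, 9) d.f., as quoted in the solution
reject_h0 = f_stat > F_CRIT

print(f"SSR = {ssr:.4f}, SSE = {sse:.4f}, SST = {sst:.4f}")
print(f"F = {f_stat:.4f}, reject H0: {reject_h0}")
```

In simple linear regression this F statistic equals the square of the slope's t statistic (2.999318² ≈ 8.9959), so the F test and the two-sided t test on the slope lead to the same conclusion.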