The catch basin insert is a device for retrofitting catch basins to improve pollutant removal properties. Consider the following data for one particular type of insert on x = amount filtered (1000s of liters) and y = % total suspended solids removed.
Table 1
x: 23, 45, 68, 91, 114, 205, 228
y: 53, 27, 55, 34, 30, 3, 11
a. Test for a significant relationship using the F test. What is your conclusion? Use α=.05.
b. Show the ANOVA table for these data.
c. Compute the coefficient of determination. Comment on the goodness of fit.
Solution: We can use the Excel Regression data analysis tool to answer the given questions. The Excel steps are:
Enter the data in Excel in two columns, X and Y, with the column names in the first row.
Click on Data > Data Analysis > Regression > OK
Input Y Range: Select all the data in the Y column, including the column name
Input X Range: Select all the data in the X column, including the column name
Check the Labels box
Choose the cell for the output. The Excel output is given below:
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.846411042 |
R Square | 0.716411651 |
Adjusted R Square | 0.659693981 |
Standard Error | 11.34105242 |
Observations | 7 |
ANOVA
| df | SS | MS | F | Significance F |
Regression | 1 | 1624.616936 | 1624.616936 | 12.63118979 | 0.016317416 |
Residual | 5 | 643.0973498 | 128.61947 |
Total | 6 | 2267.714286 |
| Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
Intercept | 53.63524317 | 7.810934725 | 6.866686902 | 0.00100143 | 33.55659626 | 73.71389009 |
X | -0.20987946 | 0.059053794 | -3.554038518 | 0.016317416 | -0.361682072 | -0.058076849 |
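As a cross-check on the Excel output, the same quantities can be computed by hand from the Table 1 data. The following is a minimal Python sketch, not part of the original solution; the variable names (`s_xx`, `s_xy`, `ssr`, `sse`, and so on) are illustrative choices, and it uses the standard least-squares formulas for simple linear regression.

```python
# Data from Table 1
x = [23, 45, 68, 91, 114, 205, 228]  # amount filtered (1000s of liters)
y = [53, 27, 55, 34, 30, 3, 11]      # % total suspended solids removed

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Corrected sum of squares of x and sum of cross-products of x and y
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least-squares estimates of the slope and intercept
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# ANOVA decomposition: SST = SSR + SSE
sst = sum((yi - y_bar) ** 2 for yi in y)  # total sum of squares
ssr = slope * s_xy                        # regression sum of squares
sse = sst - ssr                           # residual sum of squares

# Degrees of freedom, mean squares, F statistic, and R squared
df_reg, df_res = 1, n - 2
msr = ssr / df_reg
mse = sse / df_res
f_stat = msr / mse
r_squared = ssr / sst

print(f"slope = {slope:.6f}, intercept = {intercept:.6f}")
print(f"SSR = {ssr:.4f}, SSE = {sse:.4f}, SST = {sst:.4f}")
print(f"MSR = {msr:.4f}, MSE = {mse:.4f}, F = {f_stat:.4f}")
print(f"R squared = {r_squared:.6f}")
```

The printed values should agree with the Coefficients, ANOVA, and R Square entries in the Excel output up to rounding. The sketch does not compute Significance F, which requires the F distribution.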
a. Test for a significant relationship using the F test. What is your conclusion? Use α=.05.
Answer: The F statistic is 12.631 and the p-value (Significance F) is 0.0163.
Since the p-value is less than the significance level α = .05, we reject the null hypothesis of no linear relationship and conclude that there is a significant relationship between amount filtered and % total suspended solids removed.
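The same conclusion can be reached with the critical value approach. The worked step below is a sketch in standard notation that the original solution does not use: β₁ denotes the slope of the regression line, MSR and MSE are the mean squares from the ANOVA output, and the critical value 6.61 is taken from a standard F table with 1 and 5 degrees of freedom.

```latex
\begin{aligned}
H_0 &: \beta_1 = 0 \qquad H_a : \beta_1 \neq 0 \\
F &= \frac{\mathrm{MSR}}{\mathrm{MSE}} = \frac{1624.6169}{128.6195} \approx 12.63 \\
F_{.05}(1,\,5) &= 6.61 \quad\Rightarrow\quad F = 12.63 > 6.61,\ \text{so reject } H_0
\end{aligned}
```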
b. Show the ANOVA table for these data.
Answer:
Source of variation | df | SS | MS | F | Significance F |
Regression | 1 | 1624.6169 | 1624.6169 | 12.6312 | 0.0163 |
Residual | 5 | 643.0973 | 128.6195 |
Total | 6 | 2267.7143 |
c. Compute the coefficient of determination. Comment on the goodness of fit.
Answer: The coefficient of determination is:
r² = SSR / SST = 1624.6169 / 2267.7143 = 0.7164
That is, about 71.64% of the variation in % total suspended solids removed is explained by the linear relationship with amount filtered. This suggests that the estimated regression line provides a reasonably good fit, although about 28% of the variation remains unexplained.