I am particularly confused about the number of degrees of freedom to use in the t-tests for the intercept and slope coefficients, and in the associated 95% confidence intervals.
We are testing means (there are 6), so should the t distribution have 4 degrees of freedom at a significance level of 0.05?
Are there any good explanations of the relationship between ANOVA and regression models? Are they the same?
The data shown below are the 24-hour dissolution percentage results for a drug product, tested in a dissolution apparatus.
Month | 0 | 3 | 6 | 9 | 12 | 18 |
1st sample | 81.92 | 82.39 | 75.32 | 68.85 | 74.19 | 62.87 |
2nd sample | 81.76 | 82.78 | 78.05 | 76.60 | 77.87 | 56.05 |
3rd sample | 76.22 | 70.33 | 80.02 | 71.50 | 64.06 | 54.20 |
4th sample | 78.95 | 90.10 | 66.69 | 71.68 | 70.13 | 57.72 |
5th sample | 80.05 | 79.23 | 79.00 | 68.23 | 69.45 | 50.90 |
6th sample | 90.55 | 77.04 | 78.00 | 65.05 | 70.82 | 62.23 |
sample mean | 81.58 | 80.31 | 76.18 | 70.32 | 71.09 | 57.5 |
standard deviation | 4.87 | 6.60 | 4.91 | 3.92 | 4.66 | 4.86 |
The data came from a stability monitoring programme: the drugs were stored at 25 degrees C and 60% relative humidity and were measured at 6 different time points or months.
The mean of all 36 values is 72.83.
The total sum of squares is 3089.667.
For hypothesis testing, set the significance level to 0.05.
(a) To test the hypothesis that the "drug's mean dissolution rates at different time points are the same", we can fit a linear regression model to the data:
dissolution = Bo + B1 * month
The estimates of the coefficients are below, along with their standard errors.
Coefficients:
Term | Estimate | Standard Error |
Intercept | 83.3765 | 1.4413 |
Month | -1.3186 | 0.1449 |
(i) Conduct tests on the hypotheses that:
Ho: Bo = 100
Ho: B1 = 0
(ii) What information do we want to get about the drug by testing:
Ho: Bo = 100
Ho: Bo is not equal to 0
(iii) Construct 95% confidence intervals for Bo and B1.
Based on the confidence intervals, will we reject the above two hypotheses or not? Why?
We then conduct an ANOVA test for the regression model.
The following table shows the regression sum of squares and residual sum of squares:
Response: Dissolution
Source | DF | Sum of Squares | Mean Squares | F-Value |
Regression | | 2190.73 | | |
Residual | | 898.94 | | |
(iv) Explain how the regression sum of squares and residual sum of squares are calculated.
(v) Calculate the 5 missing values in the table: two DF values, two mean square values, and the F-value.
What is the null hypothesis for the F-test?
Another way to test the hypothesis that the drug's mean dissolution rates at different time points are the same is to conduct an ANOVA test
(i) Assume the hypothesis Ho: B1=0 is true, should we reject or not reject the hypothesis in the ANOVA test?
(ii) Assume the hypothesis Ho: B1=0 is false, should we reject or not reject the hypothesis in the ANOVA test?
Conduct an ANOVA test on the hypothesis that the drug's mean dissolution rates at different time points are the same.
(i) and (ii)
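H0: Bo = 100
H1: Bo is not equal to 100
The t-tests use n - 2 = 36 - 2 = 34 degrees of freedom, because the regression is fitted to all 36 observations and estimates two coefficients. The number of time points (6) does not enter here.
Value of t test = (83.3765 - 100) / 1.4413 = -11.5337
p-value = 2P(t > 11.5337 | t ~ t34) < 0.0001 < 0.05
So we reject H0 at the 5% level of significance, and hence the y-intercept is significantly different from 100.
For the slope, testing H0: B1 = 0 against H1: B1 is not equal to 0:
Value of t test = (-1.3186 - 0) / 0.1449 = -9.1001
p-value = 2P(t > 9.1001 | t ~ t34) < 0.0001 < 0.05
So we reject H0 at the 5% level of significance, and hence the slope is significantly different from zero: mean dissolution decreases with storage time.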
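The t statistics can be reproduced with a short Python sketch. It only uses the estimates and standard errors given in the question and the critical value t0.025,34 = 2.0322 quoted in this answer; the function and variable names are illustrative assumptions, and the slope hypothesis is taken to be H0: B1 = 0.

```python
# Summary values copied from the question; the model is not refitted here.
N_OBS = 36    # 6 samples at each of 6 time points
N_COEF = 2    # intercept and slope
DF_ERROR = N_OBS - N_COEF   # degrees of freedom for the t-tests

T_CRIT = 2.0322   # t(0.025, 34), the critical value quoted in the answer


def t_statistic(estimate, hypothesized, std_error):
    """Return (estimate - hypothesized value) / standard error."""
    return (estimate - hypothesized) / std_error


t_intercept = t_statistic(83.3765, 100.0, 1.4413)   # H0: Bo = 100
t_slope = t_statistic(-1.3186, 0.0, 0.1449)         # H0: B1 = 0 (assumed)

for name, t in [("intercept vs 100", t_intercept), ("slope vs 0", t_slope)]:
    decision = "reject" if abs(t) > T_CRIT else "fail to reject"
    print(f"{name}: t = {t:.4f}, df = {DF_ERROR}, {decision} H0")
```

Comparing |t| with the critical value gives the same decision as comparing the two-sided p-value with 0.05.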
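(iii) 95% C.I. for Bo is
(83.3765 - t0.025,34 * 1.4413, 83.3765 + t0.025,34 * 1.4413) = (80.4475, 86.3055)
where t0.025,34 = 2.0322
95% C.I. for B1 is
(-1.3186 - t0.025,34 * 0.1449, -1.3186 + t0.025,34 * 0.1449) = (-1.6131, -1.0241)
Since 100 lies outside the interval for Bo and 0 lies outside the interval for B1, a hypothesized intercept of 100 and a hypothesized slope of 0 are both rejected at the 5% level.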
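As a check on the arithmetic, the two intervals can be recomputed in Python. This is a minimal sketch that assumes the critical value 2.0322 quoted above; `confidence_interval` and `contains` are hypothetical helper names.

```python
T_CRIT = 2.0322   # t(0.025, 34) as quoted in the answer


def confidence_interval(estimate, std_error, t_crit=T_CRIT):
    """Return (lower, upper) = estimate -/+ t_crit * std_error."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width


def contains(interval, value):
    """True if value lies inside the closed interval."""
    lower, upper = interval
    return lower <= value <= upper


ci_intercept = confidence_interval(83.3765, 1.4413)
ci_slope = confidence_interval(-1.3186, 0.1449)

print("Bo:", tuple(round(v, 4) for v in ci_intercept),
      "contains 100?", contains(ci_intercept, 100.0))
print("B1:", tuple(round(v, 4) for v in ci_slope),
      "contains 0?", contains(ci_slope, 0.0))
```

A hypothesized value that falls outside the 95% interval is rejected by the corresponding two-sided t-test at the 5% level, so the intervals and the tests lead to the same conclusions.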
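(iv) Regression sum of squares = SSR = sum over all 36 observations of (fitted value - overall mean)^2 = 2190.73, with 1 degree of freedom (one slope).
Residual (Error) sum of squares = SSE = sum over all 36 observations of (observed value - fitted value)^2 = 898.94, with 36 - 2 = 34 degrees of freedom.
The two add up to the total sum of squares: SSR + SSE = 2190.73 + 898.94 = 3089.67.
Note that 2190.73 and 898.94 are sums of squares, not mean squares. The mean squares are obtained by dividing by the degrees of freedom: MSR = SSR / 1 and MSE = SSE / 34.
(v)
Response: Dissolution
Source | DF | Sum of Squares | Mean Squares | F-Value |
Regression | 1 | 2190.73 | 2190.73/1 = 2190.73 | 2190.73/26.44 = 82.86 |
Residual | 34 | 898.94 | 898.94/34 = 26.44 | |
Null hypothesis, H0: regression equation is insignificant, i.e. B1 = 0, vs. alternative hypothesis, H1: regression equation is significant, i.e. B1 is different from zero.
p-value = P(F > 82.86 | F ~ F1,34) < 0.0001 < 0.05, so we reject H0 at the 5% level of significance: dissolution changes with month.
This agrees with the t-test on the slope, since with one regressor F equals t squared: (-1.3186 / 0.1449)^2 = 82.81, which matches 82.86 up to rounding.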
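As a cross-check, and to illustrate how the regression ANOVA relates to the one-way ANOVA asked about in the question, the sketch below computes both F statistics from the summary numbers alone. It treats 2190.73 and 898.94 as sums of squares (they add up to the total of 3089.667), and it assumes the within-group sum of squares can be rebuilt from the six rounded standard deviations, so the one-way figures are approximate. All names are illustrative.

```python
def anova_f(ss_effect, df_effect, ss_error, df_error):
    """Return (mean square effect, mean square error, F ratio)."""
    ms_effect = ss_effect / df_effect
    ms_error = ss_error / df_error
    return ms_effect, ms_error, ms_effect / ms_error


N_OBS = 36
SS_TOTAL = 3089.667

# Regression ANOVA: one slope, so 1 and N_OBS - 2 degrees of freedom.
ss_reg, ss_res = 2190.73, 898.94
ms_reg, ms_res, f_reg = anova_f(ss_reg, 1, ss_res, N_OBS - 2)

# One-way ANOVA: 6 time points treated as unordered groups of 6 samples.
group_sds = [4.87, 6.60, 4.91, 3.92, 4.66, 4.86]   # rounded, from the table
n_per_group = 6
k = len(group_sds)
ss_within = sum((n_per_group - 1) * s ** 2 for s in group_sds)
ss_between = SS_TOTAL - ss_within
ms_between, ms_within, f_oneway = anova_f(
    ss_between, k - 1, ss_within, N_OBS - k
)

print(f"regression: MSR = {ms_reg:.2f}, MSE = {ms_res:.2f}, "
      f"F(1, {N_OBS - 2}) = {f_reg:.2f}")
print(f"one-way:    MSB = {ms_between:.2f}, MSW = {ms_within:.2f}, "
      f"F({k - 1}, {N_OBS - k}) = {f_oneway:.2f}")
```

The two analyses split the same total sum of squares differently. The regression spends 1 degree of freedom on a straight-line trend and leaves 34 for error; the one-way ANOVA spends 5 degrees of freedom on six separate means and leaves 30. Both F values are far above their 5% critical values (2.0322^2 = 4.13 for F1,34, and about 2.53 for F5,30), so both reject the hypothesis that mean dissolution is the same at every time point.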