3. Consider the following data for two variables, x and y.
x 2 3 4 5 7 7 7 8 9
y 4 5 4 6 4 6 9 5 11
a. Does there appear to be a linear relationship between x and y? Explain. (Use the F test for overall significance.)
b. Develop the estimated regression equation relating x and y.
c. Plot the standardized residuals versus ŷ for the estimated regression equation developed in part (b). Do the model assumptions appear to be satisfied? Explain.
d. Perform a logarithmic transformation on the dependent variable y. Develop an estimated regression equation using the transformed dependent variable. Do the model assumptions appear to be satisfied by using the transformed dependent variable? Does a reciprocal transformation work better in this case? Explain.
data
x | y |
2 | 4 |
3 | 5 |
4 | 4 |
5 | 6 |
7 | 4 |
7 | 6 |
7 | 9 |
8 | 5 |
9 | 11 |
a)
Excel regression result
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.620164219 |
R Square | 0.384603659 |
Adjusted R Square | 0.296689895 |
Standard Error | 2.054229935 |
Observations | 9 |
ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 1 | 18.46097561 | 18.46097561 | 4.374783255 | 0.074793318 |
Residual | 7 | 29.53902439 | 4.219860627 | | |
Total | 8 | 48 | | | |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% |
Intercept | 2.32195122 | 1.88710113 | 1.230432849 | 0.258274863 | -2.140333878 |
x | 0.636585366 | 0.304353556 | 2.091598254 | 0.074793318 | -0.083096433 |
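The F test for overall significance tests H0: β1 = 0 against Ha: β1 ≠ 0.
F = 4.3748, p-value = 0.0748 > 0.05 (alpha), hence we fail to reject the null hypothesis.
At the 0.05 level of significance, we cannot conclude that there is a significant linear relationship between x and y.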
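The ANOVA figures above can be cross-checked by hand. The following is a minimal sketch in plain Python, not the Excel procedure itself: the function name `f_test_simple_regression` is illustrative, and the critical value 5.59 for F(0.05; 1, 7) is an assumption taken from a standard F table rather than computed.

```python
def f_test_simple_regression(x, y):
    """Return (SSR, SSE, F) for the simple linear regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)   # total sum of squares
    ssr = sxy ** 2 / sxx                       # regression sum of squares
    sse = sst - ssr                            # residual sum of squares
    f_stat = (ssr / 1) / (sse / (n - 2))       # MSR / MSE with 1 and n-2 df
    return ssr, sse, f_stat


x = [2, 3, 4, 5, 7, 7, 7, 8, 9]
y = [4, 5, 4, 6, 4, 6, 9, 5, 11]

ssr, sse, f_stat = f_test_simple_regression(x, y)
F_CRITICAL = 5.59  # assumption: table value of F(0.05; 1, 7)

print("SSR =", round(ssr, 4), "SSE =", round(sse, 4), "F =", round(f_stat, 4))
print("Reject H0 at alpha = 0.05:", f_stat > F_CRITICAL)
```

Comparing F with the table value leads to the same decision as comparing the p-value with alpha.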
b)
ŷ = 2.32195 + 0.636585x
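The coefficients come from the usual least-squares formulas, b1 = Sxy / Sxx and b0 = ȳ − b1·x̄. A short sketch in plain Python (the helper name `fit_line` is illustrative, not part of the Excel output) that recomputes them from the data:

```python
def fit_line(x, y):
    """Least-squares intercept b0 and slope b1 for y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


x = [2, 3, 4, 5, 7, 7, 7, 8, 9]
y = [4, 5, 4, 6, 4, 6, 9, 5, 11]

b0, b1 = fit_line(x, y)
print("intercept =", round(b0, 6), "slope =", round(b1, 6))
```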
c)
Predicted y | Standard Residuals |
3.595121951 | 0.21070321 |
4.231707317 | 0.39982838 |
4.868292683 | -0.451869534 |
5.504878049 | 0.257667178 |
6.77804878 | -1.445728649 |
6.77804878 | -0.404905565 |
6.77804878 | 1.15632906 |
7.414634146 | -1.256603479 |
8.051219512 | 1.5345794 |
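The spread of the standardized residuals increases as ŷ increases (equivalently, as x increases).
The constant-variance assumption therefore does not appear to be satisfied.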
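The table values can be reproduced with a short script. This is a sketch under one assumption inferred from the numbers: each residual is divided by sqrt(SSE / (n − 1)), which matches the "Standard Residuals" column above. Textbook standardized residuals that also adjust for leverage would differ somewhat. The helper name `standard_residuals` is illustrative.

```python
import math


def standard_residuals(x, y):
    """Return (predicted, scaled residuals) for the regression of y on x.

    Assumption: residuals are scaled by sqrt(SSE / (n - 1)), the scaling
    that reproduces the table above (no leverage adjustment).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    predicted = [b0 + b1 * xi for xi in x]
    residuals = [yi - pi for yi, pi in zip(y, predicted)]
    scale = math.sqrt(sum(r ** 2 for r in residuals) / (n - 1))
    return predicted, [r / scale for r in residuals]


x = [2, 3, 4, 5, 7, 7, 7, 8, 9]
y = [4, 5, 4, 6, 4, 6, 9, 5, 11]

predicted, std_res = standard_residuals(x, y)
for p, r in zip(predicted, std_res):
    print(round(p, 6), round(r, 6))
```

Plotting `std_res` against `predicted` gives the residual plot asked for in part (c).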
d)
Logarithmic transformation: regression of ln y on x
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.621127 |
R Square | 0.385799 |
Adjusted R Square | 0.298056 |
Standard Error | 0.304318 |
Observations | 9 |
ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 1 | 0.407196 | 0.407196 | 4.396925 | 0.074212 |
Residual | 7 | 0.648266 | 0.092609 | | |
Total | 8 | 1.055462 | | | |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% |
Intercept | 1.182238 | 0.279559 | 4.228938 | 0.003893 | 0.521186 |
x | 0.094543 | 0.045088 | 2.096885 | 0.074212 | -0.01207 |
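Estimated ln y = 1.182238 + 0.094543x
No. With the logarithmic transformation the residuals still spread out as x increases, so the model assumptions still do not appear to be satisfied.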
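Reciprocal transformation: fit 1/y = a + bx
With the reciprocal transformation the model assumptions appear to be satisfied, so it works better in this case.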
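To compare the two transformations side by side, both models can be refit and their residuals inspected. The sketch below uses plain Python; the names `fit_line` and `fits` are illustrative, and no output is quoted for the reciprocal model because the solution above does not report one.

```python
import math


def fit_line(x, y):
    """Least-squares intercept b0 and slope b1 for y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


x = [2, 3, 4, 5, 7, 7, 7, 8, 9]
y = [4, 5, 4, 6, 4, 6, 9, 5, 11]

transforms = {"ln y": math.log, "1/y": lambda v: 1.0 / v}
fits = {}
for label, transform in transforms.items():
    t = [transform(v) for v in y]            # transformed dependent variable
    b0, b1 = fit_line(x, t)
    residuals = [ti - (b0 + b1 * xi) for xi, ti in zip(x, t)]
    fits[label] = (b0, b1, residuals)
    print(label, "intercept =", round(b0, 6), "slope =", round(b1, 6))
    print("  residuals:", [round(r, 4) for r in residuals])
```

Plotting each set of residuals against x (or against the fitted values) is the check that supports or contradicts the conclusions stated above.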