In: Economics
Use this information to answer the questions below:
A regression has been run for you using SAS and the following regression equation was obtained:
Y = 30 - 2X + U
The following data are given to you for your use in analyzing the assumptions about the OLS model.
| Obs | Y  | X | (X - X̄) | (Y - Ȳ) | (X - X̄)(Y - Ȳ) | (X - X̄)² | (Y - Ȳ)² | U | U² | u(-1) | X² |
|-----|----|---|----------|----------|------------------|-----------|-----------|---|----|-------|----|
| 1   | 10 | 4 |          |          |                  |           |           |   |    |       |    |
| 2   | 50 | 3 |          |          |                  |           |           |   |    |       |    |
| 3   | 10 | 2 |          |          |                  |           |           |   |    |       |    |
| 4   | 30 | 1 |          |          |                  |           |           |   |    |       |    |
1. Give the problem that is caused when serial correlation is present.
(a) If dU = 1.40 and dL = 0.630, is there a problem with serial correlation?
(b) If there is a problem, explain how you would fix it using first differences. What are the new regression coefficients?
2. Is there heteroscedasticity exhibited in this model when the appropriate F value is 5.00?
(a) Explain the problem of heteroscedasticity.
(b) How would you correct this problem?
3. What is multicollinearity and why should you worry about it?
(a) Why is multicollinearity more of a problem than serial correlation?
(b) How would you test for multicollinearity and what might be a solution to this problem? (Note: you cannot use the data above to answer these questions.)
1. The conventional method for linear regression analysis (such
as OLS) assumes that no serial correlation, or autocorrelation, is
present; however, autocorrelation is common in time-series data.
The no-autocorrelation assumption says that the error terms
(disturbances) corresponding to any two different observations have
zero covariance or correlation. When this assumption is violated,
the coefficient estimates remain unbiased, but they are no longer
efficient, and the conventional OLS formulas understate the true
variances of the coefficients, so the reported standard errors are
smaller than they should be (under positive autocorrelation). Thus,
hypothesis testing becomes unreliable.
(a) If the D-W test statistic is less than dL (< 0.630), we reject
the null hypothesis of no autocorrelation and conclude that positive
autocorrelation is present.
If the D-W statistic lies between dL and dU (between 0.630 and 1.40),
the test is inconclusive about the presence of autocorrelation.
If the D-W statistic is greater than dU (> 1.40) but below 4 - dU
(2.60), we fail to reject the null hypothesis of no autocorrelation;
negative autocorrelation is indicated only when the statistic exceeds
4 - dL (3.37).
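As a check, here is a minimal Python sketch (numpy) that computes the Durbin-Watson statistic from the four observations above, taking the residuals from the fitted equation Y = 30 - 2X, and applies the dL/dU bounds; the variable names are illustrative only.

```python
import numpy as np

# Data from the table above
Y = np.array([10.0, 50.0, 10.0, 30.0])
X = np.array([4.0, 3.0, 2.0, 1.0])

# Residuals from the fitted equation Y-hat = 30 - 2X
u = Y - (30.0 - 2.0 * X)

# Durbin-Watson statistic: sum of squared first differences of the
# residuals divided by the residual sum of squares
dw = np.sum(np.diff(u) ** 2) / np.sum(u ** 2)

dL, dU = 0.630, 1.40
print(f"DW = {dw:.2f}")
if dw < dL:
    print("Reject H0: evidence of positive autocorrelation")
elif dw <= dU:
    print("Inconclusive region")
elif dw < 4 - dU:
    print("Fail to reject H0: no evidence of autocorrelation")
elif dw <= 4 - dL:
    print("Inconclusive region (negative side)")
else:
    print("Reject H0: evidence of negative autocorrelation")
```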
(b) When autocorrelation is present in a data series, we can remove
it with the first-difference method: regress the change in Y on the
change in X, i.e., estimate ΔY_t = βΔX_t + ΔU_t. The intercept drops
out of the differenced equation, and the new slope coefficient is
estimated from the differenced data, as in the sketch below.
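A minimal sketch of the first-difference fix on the data above, again in numpy. The differenced equation is estimated through the origin (the intercept drops out when the autocorrelation coefficient is assumed to be close to one); that specification choice is an assumption of this sketch.

```python
import numpy as np

Y = np.array([10.0, 50.0, 10.0, 30.0])
X = np.array([4.0, 3.0, 2.0, 1.0])

# First differences: dY_t = Y_t - Y_{t-1}, dX_t = X_t - X_{t-1}
dY = np.diff(Y)
dX = np.diff(X)

# OLS through the origin on the differenced data:
# slope = sum(dX * dY) / sum(dX^2); the intercept cancels out
new_slope = np.sum(dX * dY) / np.sum(dX ** 2)
print(f"First-difference slope estimate: {new_slope:.2f}")
```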
2. If the critical F value is less than the F value estimated from the model (i.e., if F_critical < 5.00), then we reject the null hypothesis of homoscedasticity, or constant variance of the error term. The critical F value with numerator degrees of freedom n = 4 and denominator degrees of freedom n - k = 4 - 2 = 2 is 9.24 at the .10 level of significance. Since 9.24 > 5.00, we fail to reject the null of homoscedasticity.
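A short sketch of that comparison, using scipy to look up the critical value at the degrees of freedom used above (numerator 4, denominator 2); depending on which heteroscedasticity test produced the F = 5.00, the appropriate degrees of freedom could differ.

```python
from scipy.stats import f

f_model = 5.00          # F value reported for the model
alpha = 0.10
df_num, df_den = 4, 2   # degrees of freedom used in the answer above

f_crit = f.ppf(1 - alpha, df_num, df_den)   # approximately 9.24
print(f"F critical = {f_crit:.2f}")
if f_model > f_crit:
    print("Reject H0 of homoscedasticity")
else:
    print("Fail to reject H0 of homoscedasticity")
```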
(a) When the variance of the error term, conditional on the independent variables, is not constant, we face the problem of heteroscedasticity. Under heteroscedasticity the coefficient estimates are still unbiased and consistent, but they are inefficient and the usual standard errors are invalid, leading to unreliable hypothesis-testing results.
(b) Heteroscedasticity can be corrected by using weighted least squares (WLS) or, more generally, generalized least squares (GLS) instead of OLS; a sketch follows.
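A minimal WLS sketch with statsmodels. The weighting assumption here (error variance proportional to X², hence weights 1/X²) is purely illustrative and is not established by the problem.

```python
import numpy as np
import statsmodels.api as sm

Y = np.array([10.0, 50.0, 10.0, 30.0])
X = np.array([4.0, 3.0, 2.0, 1.0])

# Assume (for illustration only) Var(u_i) is proportional to X_i^2,
# so the appropriate WLS weights are 1 / X_i^2
weights = 1.0 / X ** 2

design = sm.add_constant(X)          # adds the intercept column
wls_model = sm.WLS(Y, design, weights=weights).fit()
print(wls_model.params)              # WLS intercept and slope
```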
3. When the independent variables in a model are (nearly) linearly dependent on one another, we face the problem of multicollinearity. Multicollinearity is problematic because the variances of the individual coefficient estimates are inflated, making the estimates imprecise and unstable, and with perfect collinearity they cannot be estimated at all; this renders hypothesis tests on individual coefficients largely uninformative.
(a) Multicollinearity is more problematic than autocorrelation because, with autocorrelation, the parameter estimates are at least unbiased and consistent and the problem can be corrected with standard transformations (first differencing, GLS). With multicollinearity, the individual coefficients cannot be estimated reliably, and there is no mechanical transformation of the data that removes the problem.
(b) Multicollinearity can be suspected when the model shows a high R-squared value but low individual significance (insignificant t statistics) for some of the variables. We can also test using the variance inflation factor (VIF): a high VIF (a common rule of thumb is above 10) indicates a multicollinearity problem in the data, as in the sketch below. Possible solutions are to drop a redundant variable, replace the linearly dependent variables with a single combined variable, or increase the sample size.
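A small sketch of the VIF check with statsmodels, on hypothetical data (two deliberately correlated regressors x1 and x2), since the note above says the given data cannot be used for this part.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=n)   # nearly collinear with x1
exog = sm.add_constant(np.column_stack([x1, x2]))

# VIF for each regressor (skip the constant in column 0);
# a common rule of thumb flags VIF > 10 as serious multicollinearity
for j, name in enumerate(["x1", "x2"], start=1):
    print(name, variance_inflation_factor(exog, j))
```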