1. Explain as fully as possible how multicollinearity causes estimation problems in multiple regression. Why is it not a problem in simple regression? How can the multicollinearity be measured?
Type all your answers. Try to write at least one page.
In regression analysis, it is desirable for the predictor variables to be highly correlated with the dependent variable. The problem of multicollinearity, however, arises when the predictor variables are highly correlated with one another. Multicollinearity typically enters a model through redundant independent variables.
Multicollinearity inflates the standard errors of the estimated regression coefficients: because correlated predictors carry overlapping information, the data cannot cleanly separate their individual effects, so the coefficient estimates become unstable and their t-statistics shrink. As a result, an independent variable can appear insignificant in the model even though it would have been significant had the multicollinearity been absent.
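To make this concrete, here is a minimal simulation sketch (hypothetical data; assumes numpy and statsmodels are available) comparing coefficient standard errors when the two predictors are nearly independent against when they are nearly collinear:

```python
# Simulated illustration of multicollinearity inflating standard errors.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200

# Case 1: two nearly uncorrelated predictors.
x1 = rng.normal(size=n)
x2_indep = rng.normal(size=n)

# Case 2: x2 is almost a copy of x1 (severe multicollinearity).
x2_collin = x1 + 0.05 * rng.normal(size=n)

y_indep = 2 * x1 + 3 * x2_indep + rng.normal(size=n)
y_collin = 2 * x1 + 3 * x2_collin + rng.normal(size=n)

for label, x2, y in [("independent", x2_indep, y_indep),
                     ("collinear", x2_collin, y_collin)]:
    X = sm.add_constant(np.column_stack([x1, x2]))
    fit = sm.OLS(y, X).fit()
    # With collinear predictors the standard errors blow up, so the
    # individual t-statistics shrink even though the overall fit is good.
    print(label, "coefficient std errors:", fit.bse[1:].round(3))
```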
In simple regression there is only one independent variable in the model, so there is no second predictor for it to be correlated with; the problem of multicollinearity therefore cannot arise.
The presence of multicollinearity is generally detected with the Variance Inflation Factor (VIF). The VIF of a predictor measures how much the variance of its estimated coefficient is inflated by that predictor's correlation with the other predictors in the model.
The VIF of factor j is defined as VIF_j = 1 / (1 − R_j²), where R_j² is the coefficient of determination obtained by regressing the j-th variable on the remaining independent variables, i.e. using the j-th factor as the dependent variable and all the other predictors as independent variables.
If VIF > 5 (some texts use 10 as the cutoff), multicollinearity is indicated, and the factors with the largest VIFs are generally dropped from the model.
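The formula above can be applied directly. The following sketch (again with hypothetical data) computes VIF_j = 1 / (1 − R_j²) by hand and cross-checks it against the variance_inflation_factor helper that ships with statsmodels:

```python
# Computing VIFs from the definition, then via the library routine.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)   # strongly correlated with x1
x3 = rng.normal(size=n)              # unrelated to the others
X = np.column_stack([x1, x2, x3])

for j in range(X.shape[1]):
    # Regress the j-th predictor on the remaining predictors.
    others = sm.add_constant(np.delete(X, j, axis=1))
    r_sq = sm.OLS(X[:, j], others).fit().rsquared
    vif = 1.0 / (1.0 - r_sq)
    flag = "  <-- drop candidate" if vif > 5 else ""
    print(f"VIF_{j + 1} = {vif:.2f}{flag}")

# Cross-check with statsmodels (it expects a design matrix that
# already includes the constant column).
Xc = sm.add_constant(X)
print([round(variance_inflation_factor(Xc, j), 2) for j in range(1, 4)])
```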
One way to overcome the multicollinearity problem is principal component regression: perform Principal Component Analysis on the predictors and regress the dependent variable on the resulting uncorrelated components instead of on the original variables.
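A minimal sketch of this remedy, assuming scikit-learn is available (the 95% explained-variance cutoff is an illustrative choice, not a rule):

```python
# Principal component regression: the components are orthogonal by
# construction, so the collinearity among the predictors disappears.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)   # nearly collinear with x1
X = np.column_stack([x1, x2])
y = 2 * x1 + 3 * x2 + rng.normal(size=n)

# Keep only the components that explain 95% of the predictor variance.
pcr = make_pipeline(StandardScaler(), PCA(n_components=0.95),
                    LinearRegression())
pcr.fit(X, y)
print("components kept:", pcr.named_steps["pca"].n_components_)
print("R^2 on training data:", round(pcr.score(X, y), 3))
```

With two nearly collinear predictors, the pipeline typically keeps a single component, which illustrates the point of the method: the redundant information is collapsed before the regression is fitted.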