In a study relating consumption expenditure (Y) to income (X2) and wealth (X3), based on 10 observations, the following equation was obtained:
Yhat = 24.337 + 0.08764X2 - 0.0349X3
SE (6.2801) (0.31438) (0.0301)
t (3.875) (2.7726) (-1.1595)
i) In your opinion, what type of problem appears in this result? Explain it.
ii) What are the practical consequences of this problem?
iii) What are the theoretical consequences of this problem?
iv) How to detect this problem?
v) In your judgement, what should be done to remove this problem?
vi) How do we remedy this problem?
i) The problem is multicollinearity.
Explanation: Multicollinearity is a state of very high intercorrelation among the independent variables. It is therefore a problem in the data itself, and when it is present the statistical inferences drawn from the regression may not be reliable. Here income and wealth are highly correlated, which is why the problem arises.
ii) Multicollinearity can result in several practical problems. These problems are as follows:
iii) Theoretical Problem:
iv) How to detect this problem?
There are certain signals which help one to detect the multicollinearity.
One such signal is that the individual coefficients are not statistically significant while the regression as a whole is significant. In this situation one may get a mix of significant and insignificant results, which points to multicollinearity.
A second signal appears when the sample is divided into two parts and the estimated coefficients differ drastically between them. This means the coefficients are unstable, which indicates multicollinearity.
A third signal is a drastic change in the model when a variable is simply added or dropped. This also indicates that multicollinearity is present in the data.
Multicollinearity can also be detected with the help of tolerance and its reciprocal, the Variance Inflation Factor (VIF). If the tolerance is below 0.1 (some authors use 0.2) or, equivalently, the VIF is 10 or above (5 or above), then the multicollinearity is problematic.
A variance inflation factor (VIF) detects multicollinearity in regression analysis. Multicollinearity is a correlation between predictors (i.e. independent variables) in a model; its presence can adversely affect your regression results. The VIF estimates how much the variance of a regression coefficient is inflated due to multicollinearity in the model.
VIFs are usually calculated by the software as part of regression analysis. You’ll see a VIF column as part of the output.
VIFs are calculated by taking a predictor and regressing it against every other predictor in the model. This gives you the R-squared values, which can then be plugged into the VIF formula. “i” is the predictor you’re looking at (e.g. x1 or x2):
VIF_i = 1 / (1 - R_i^2)
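The auxiliary-regression calculation described above can be sketched in a few lines of Python. This is only an illustration, assuming NumPy is available: the helper names r_squared and vif are hypothetical, and the income and wealth figures are made-up numbers, not the data behind the equation in the question.

```python
import numpy as np

def r_squared(y, X):
    """R-squared from an OLS regression of y on X (an intercept is added here)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

def vif(X):
    """VIF for each column of X: regress it on the other columns, then 1 / (1 - R^2)."""
    X = np.asarray(X, dtype=float)
    out = []
    for i in range(X.shape[1]):
        others = np.delete(X, i, axis=1)  # every predictor except the i-th
        r2 = r_squared(X[:, i], others)
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

# Made-up illustrative data (assumption): wealth is roughly ten times income.
income = np.array([80, 100, 120, 140, 160, 180, 200, 220, 240, 260], dtype=float)
wealth = np.array([790, 1020, 1190, 1430, 1580, 1830, 1980, 2230, 2380, 2620], dtype=float)

X = np.column_stack([income, wealth])
vifs = vif(X)
tolerance = 1.0 / vifs  # tolerance is the reciprocal of the VIF

for name, v, t in zip(["income", "wealth"], vifs, tolerance):
    print(f"{name}: VIF = {v:.1f}, tolerance = {t:.5f}")
```

With only two predictors both VIFs are equal, because each auxiliary regression has the same R-squared (the squared correlation between income and wealth). With three or more predictors the values generally differ.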
v) What to do to remove this problem?
If multicollinearity is a problem in your model -- if the VIF for a factor is near or above 5 -- the solution may be relatively simple. Try one of these:
vi) How do we remedy this problem?
The potential solutions include the following: