In: Math
Suppose you perform the following multiple regression: Y = B0 + B1X1 + B2X2 + B3X3. You find that X1 and X3 have a near-perfect correlation. What would you conclude about the utility of your regression result?
(a) This is a problem of multicollinearity which renders the entire regression invalid.
(b) This is a problem of multicollinearity which nevertheless does not invalidate the utility of the model as a whole.
(c) This is NOT a regression problem, and inferences made using the model and the respective coefficients remain valid.
(d) This is a problem of multicollinearity. However, inferences made concerning the individual contribution of the model coefficients remain valid.
To judge the utility of the model, compare the R square value with the adjusted R square value for the given data. There should not be much difference between the two; a difference of 5-10% is acceptable (5% is better). If there is a large difference, the regression model does not fit the data well; if the difference is small, it does.
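Adjusted R square can be computed directly from R square, the sample size n, and the number of predictors p using the standard formula. As a minimal sketch (the sample size n = 7 here is just an assumed value chosen for illustration):

```python
def adjusted_r2(r2, n, p):
    """Standard adjusted R^2: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# With an assumed n = 7 observations and p = 3 predictors,
# an R^2 of 0.8 shrinks to an adjusted R^2 of 0.6.
print(round(adjusted_r2(0.8, 7, 3), 3))
```

Note that adjusted R square penalizes extra predictors: for fixed R square, adding predictors (larger p) pushes the adjusted value down.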
For example,
In the output above, the R square value is 0.8 but the adjusted R square value is 0.6. Since the difference is large, the model does not fit the data well and its utility is low. To fix this, we might add more relevant independent variables, or drop some redundant ones, so as to bring the difference within 5-10% at most. The more the difference can be reduced, the better the model fits the data.
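One way to spot a redundant predictor worth dropping is to inspect the pairwise correlations, as in the question where X1 and X3 are nearly perfectly correlated. A minimal sketch, with entirely made-up data in which X3 is constructed as an almost exact copy of X1:

```python
import numpy as np

# Hypothetical data: X3 is X1 plus a tiny amount of noise, so the
# two predictors are nearly perfectly correlated (multicollinearity).
rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + rng.normal(scale=0.01, size=n)  # near-duplicate of x1

X = np.column_stack([x1, x2, x3])
corr = np.corrcoef(X, rowvar=False)
print(corr[0, 2])  # correlation between X1 and X3, very close to 1
```

Either X1 or X3 could then be dropped, since keeping both adds almost no information while destabilizing the individual coefficient estimates.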
One accepted solution is shown below:
Here the difference between the R square value and the adjusted R square value is very small, so this is an acceptable model.
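The effect of adding or removing predictors on the gap between R square and adjusted R square can be demonstrated numerically. Below is a hedged sketch in plain NumPy; the data, sample sizes, and noise levels are all assumptions made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 30
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 2 * x1 + 3 * x2 + rng.normal(scale=0.5, size=n)

def fit_r2(X, y):
    """Fit OLS with an intercept; return (R^2, adjusted R^2)."""
    n, p = X.shape
    X1 = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    ss_res = resid @ resid
    ss_tot = ((y - y.mean()) ** 2).sum()
    r2 = 1 - ss_res / ss_tot
    adj = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    return r2, adj

# Padding the model with 20 irrelevant noise predictors inflates R^2
# relative to adjusted R^2, widening the gap...
noise = rng.normal(size=(n, 20))
r2_big, adj_big = fit_r2(np.column_stack([x1, x2, noise]), y)

# ...while keeping only the relevant predictors closes the gap.
r2_small, adj_small = fit_r2(np.column_stack([x1, x2]), y)
print(r2_big - adj_big, r2_small - adj_small)
```

With only the two relevant predictors, the two statistics nearly coincide, which is exactly the pattern the answer describes for an acceptable model.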