Suppose you were asked to investigate which predictors explain the number of minutes that 10- to 18-year-old students spend on Twitter. To do so, you build a linear regression model with Twitter usage (Y) measured as the number of minutes per week. The four predictors you include in the model are Height, Weight, Grade Level, and Age of each student. You build four simple linear regression models with Y regressed separately on each predictor, and each predictor is statistically significant. Then you build a multiple linear regression model with Y regressed on all four predictors, but only one predictor, Age, is statistically significant, and the others are not. What is likely going on among the four predictors? If you include more than one of these predictors in the model, what are some problems that can result?
Answer:
A standard assumption in regression problems is that the predictors should not be multicollinear.
When you build a simple linear regression model with Y regressed separately on each predictor (x), each model has only a single predictor with which to predict Y.
In that case there can be no correlation (multicollinearity) with other predictors, because no other predictor is in the model. As a result, every predictor is statistically significant on its own.
On the other hand, when you build a multiple linear regression model with Y regressed on all four predictors (x1, x2, x3, x4), the predictors are correlated with one another: among 10- to 18-year-olds, Height, Weight, and Grade Level all increase with Age. Because they carry largely the same information, their separate effects on Y cannot be distinguished, their standard errors grow, and they lose statistical significance. Age remains significant because it accounts for most of the variation the predictors share.
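The correlation among the four predictors can be illustrated with a small simulation. This is only a sketch on made-up data: the slopes, intercepts, and noise levels below are assumptions chosen for illustration, not values from the question, and `pearson` is a helper written for this example.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 500

# Hypothetical students aged 10 to 18. The other three predictors are
# simulated as noisy linear functions of Age (illustrative numbers only).
age = [rng.uniform(10, 18) for _ in range(n)]
height = [100 + 4.5 * a + rng.gauss(0, 5) for a in age]   # cm
weight = [-10 + 4.0 * a + rng.gauss(0, 4) for a in age]   # kg
grade = [a - 5 + rng.gauss(0, 0.5) for a in age]          # grade level

predictors = {"Age": age, "Height": height, "Weight": weight, "Grade": grade}

# Correlation for every pair of predictors.
names = list(predictors)
corr = {}
for i, first in enumerate(names):
    for second in names[i + 1:]:
        corr[(first, second)] = pearson(predictors[first], predictors[second])
        print(f"{first:>6} vs {second:<6}: r = {corr[(first, second)]:.2f}")
```

With data of this kind every pair of predictors is strongly correlated, which is the situation in which a multiple regression struggles to assign separate effects to each one.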
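If you include more than one of these predictors in the model and they are correlated with each other (multicollinearity), several problems can result: the coefficient estimates become unstable and their standard errors inflate, so predictors that matter can appear insignificant; the individual coefficients become hard to interpret; and the model's adjusted R-squared can decrease, because the redundant predictors add little explanatory power.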
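One common way to quantify this problem is the variance inflation factor (VIF), which is not mentioned in the original answer and is added here only as an illustration. For predictor j, VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the other predictors; the standard error of its coefficient is inflated by the square root of the VIF. The function names and the sample R-squared values below are assumptions for this sketch.

```python
import math


def vif(r_squared):
    """Variance inflation factor for a predictor whose regression on the
    other predictors has the given R-squared (must be below 1)."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must be in [0, 1)")
    return 1.0 / (1.0 - r_squared)


def se_inflation(r_squared):
    """Factor by which the coefficient's standard error grows compared
    with the case of uncorrelated predictors."""
    return math.sqrt(vif(r_squared))


# Illustrative R-squared values, from no overlap to heavy overlap.
for r2 in (0.0, 0.5, 0.9, 0.99):
    print(f"R^2 = {r2:<4}  VIF = {vif(r2):7.2f}  "
          f"SE inflation = {se_inflation(r2):5.2f}")
```

A frequently quoted rule of thumb treats a VIF above roughly 5 to 10 as a sign of problematic multicollinearity, though the threshold is a convention rather than a strict rule.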