In: Statistics and Probability
Your experience tells you that an independent variable is positively correlated with the dependent variable, but a multiple regression model gives it a negative coefficient. What could cause this?
Your judgement is wrong. Statistics don't lie
The software package made an error
The homoscedasticity assumption has been violated
The model may have correlated independent variables
The heteroscedasticity assumption has been violated
We are given that the independent variable is positively correlated with the dependent variable.
However, in multiple regression, intercorrelations between the independent variables can make the partial regression coefficient of an independent variable negative even when that variable's simple correlation with the dependent variable is positive (this is known as multicollinearity).
Multicollinearity occurs when independent variables in a regression model are correlated. This correlation is a problem because independent variables should be independent.
Multicollinearity refers to when your predictor / independent variables are highly correlated with each other. This is an issue, as your regression model will not be able to accurately associate variance in your outcome variable with the correct predictor variable, leading to muddled results and incorrect inferences.
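A small simulation can make this concrete. In this sketch (the data and coefficients are made up for illustration), x2 is nearly a copy of x1, and the true model gives x1 a negative coefficient. The simple correlation between x1 and y comes out strongly positive, yet multiple regression recovers the negative partial coefficient:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# x1 and x2 are almost collinear (highly correlated predictors)
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)

# True model: y = -1*x1 + 3*x2 + noise
y = -1.0 * x1 + 3.0 * x2 + rng.normal(size=n)

# Simple (marginal) correlation between x1 and y is positive...
r = np.corrcoef(x1, y)[0, 1]

# ...but multiple regression recovers the negative partial coefficient
X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

print(f"corr(x1, y)       = {r:.2f}")   # strongly positive
print(f"coefficient on x1 = {beta[1]:.2f}")  # close to -1
```

Because x1 and x2 move almost in lockstep, x1's marginal association with y is dominated by x2's large positive effect, while the regression isolates x1's own negative contribution.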
The assumption of homoscedasticity is that the variance around the regression line is the same for all values of the predictor variable (X), i.e., the error variance is constant across all values of the independent variables.
Heteroscedasticity, by contrast, is often due to the presence of outliers in the data or to the omission of relevant variables from the model.
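To see what this distinction looks like in data, here is a minimal sketch (with made-up numbers) comparing residual spread under constant versus growing error variance:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(1, 10, 2000)

# Homoscedastic: error variance is constant across x
y_homo = 2 * x + rng.normal(scale=1.0, size=x.size)

# Heteroscedastic: error variance grows with x
y_hetero = 2 * x + rng.normal(scale=0.3 * x, size=x.size)

# Compare residual spread in the lower vs upper half of x
for name, y in [("homoscedastic", y_homo), ("heteroscedastic", y_hetero)]:
    resid = y - 2 * x
    lo = resid[x < 5.5].std()
    hi = resid[x >= 5.5].std()
    print(f"{name}: std(low x) = {lo:.2f}, std(high x) = {hi:.2f}")
```

In the homoscedastic case the two spreads are roughly equal; in the heteroscedastic case the residuals fan out as x increases. Neither pattern would flip a coefficient's sign, which is why these assumptions are not the answer here.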
So the given situation is due to neither homoscedasticity nor heteroscedasticity; it arises because the independent variables in the regression model are correlated.
Hence the correct option is:
Option D) The model may have correlated independent variables .