In: Economics
What are the possible consequences of not including a constant in a regression equation? Discuss.
Possible consequences of not including the constant in a regression equation:
The constant term in linear regression analysis is simply the value at which the fitted line crosses the y-axis. It gives the mean response of the dependent variable when all independent variables are set to zero. If you do not include the constant, the regression line is forced to pass through the origin; that is, the predicted response must equal zero whenever all of the predictors are zero. Unless the data actually satisfy this restriction, the estimated regression coefficients, and any forecasts based on them, will be biased.
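The bias from forcing the line through the origin can be seen in a small numerical sketch. The data below are hypothetical, generated so that the true intercept is 10: fitting with a constant recovers the true slope, while the through-origin fit inflates the slope to absorb the omitted intercept.

```python
import numpy as np

# Hypothetical data: true relationship y = 10 + 2x, so the intercept is 10, not 0.
rng = np.random.default_rng(0)
x = np.linspace(1, 20, 50)
y = 10 + 2 * x + rng.normal(0, 1, size=x.size)

# Least squares with a constant: the design matrix includes a column of ones.
X_with = np.column_stack([np.ones_like(x), x])
beta_with, *_ = np.linalg.lstsq(X_with, y, rcond=None)

# Regression through the origin: no column of ones.
beta_origin, *_ = np.linalg.lstsq(x[:, None], y, rcond=None)

print(beta_with)    # intercept near 10, slope near 2
print(beta_origin)  # slope biased upward, absorbing the omitted intercept
```

Here the through-origin slope estimate is roughly 2.75 rather than 2, because the only way the restricted line can get near the data is by tilting upward.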
The coefficients in a regression model are estimated by the method of least squares, which minimizes the mean squared error of the residuals. By a standard identity, the mean squared error equals the variance of the errors plus the square of their mean. Changing the value of the constant in the model shifts the mean of the errors but does not affect their variance. Hence, if the mean squared error is to be minimized, the constant must be chosen so that the mean of the errors is zero.
In a simple regression model, the constant represents the intercept of the regression line, in unstandardized form. In a multiple regression model, the constant represents the value that would be predicted for the dependent variable if all the independent variables were simultaneously equal to zero. This implies a situation that may not be physically or economically viable. Even if you are not particularly interested in what would happen if all the independent variables were simultaneously zero, you should normally leave the constant in the model regardless of its statistical significance. The presence of the constant allows the regression line to provide the best fit to data which may only be locally linear.
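The "locally linear" point can be sketched numerically. In the illustrative example below, the data follow y = sqrt(x) over a range far from the origin, where the curve is nearly linear but its tangent line has a nonzero intercept; the model with a constant fits far better than the line forced through the origin.

```python
import numpy as np

# Data that are only locally linear: y = sqrt(x) over a range far from the origin.
x = np.linspace(100, 120, 30)
y = np.sqrt(x)

# Fit with a constant: picks up the local tangent line, including its intercept.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
sse_with = np.sum((y - X @ beta) ** 2)

# Fit through the origin: cannot match the local linear trend.
b, *_ = np.linalg.lstsq(x[:, None], y, rcond=None)
sse_origin = np.sum((y - x * b[0]) ** 2)

print(sse_with, sse_origin)  # the intercept model has a far smaller error sum
```

The restriction does not bind where the data live, yet the through-origin line must still pass through (0, 0), so its slope is a compromise that fits the observed range poorly.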