ECONOMETRICS 2
1) Consider the following estimated regression equation where the sample size is 78 (quarterly data):
IND-OUTPUT (dependent variable): Industrial Production Index.
PRICE (independent variable): Industrial Price Index.
LOG(IND-OUTPUT) = -76.5 - 0.39 LOG(PRICE)
t statistics: (-1.35) (-0.72)
a) Interpret the coefficient of LOG(PRICE) and test its significance.
Assume that an additional regression was run as:
LOG(IND-OUTPUT) = -33.5 + 0.46 LOG(PRICE) + 0.009 T
t statistics: (-4.63) (2.78) (3.55)
where T is a time trend.
b) Interpret the coefficient of the T variable and conduct a test to decide on its significance. What kind of trend is it (linear, nonlinear…)?
c) What might be a reason for the intercept and the coefficient of LOG(PRICE) to change in the second equation compared to the first one? What is your decision now about the relationship between the IND-OUTPUT and PRICE variables?
d) What is this kind of a specification error called? Which regression equation do you trust more and why? What is the first regression called?
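1. (a) Let the industrial output be $Q$ and the industrial price be $P$. The regression model is estimated as $\widehat{\ln Q} = -76.5 - 0.39 \ln P$, and we have $\frac{d \ln Q}{d \ln P} = -0.39$, or $\frac{dQ/Q}{dP/P} = -0.39$, or $\frac{\%\Delta Q}{\%\Delta P} = -0.39$. The coefficient of log-price is therefore the elasticity of industrial output with respect to the industrial price, i.e. for a one percent increase in the industrial price, industrial output decreases by about 0.39 percent.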
The test of significance has the null hypothesis $H_0: \beta_1 = 0$ and the alternative hypothesis $H_1: \beta_1 \neq 0$, where $\beta_1$ is the coefficient of $\ln P$. The two-tailed t-test has the test statistic $t = \hat{\beta}_1 / \mathrm{se}(\hat{\beta}_1)$, which follows the t-distribution with df = n-k-1 = 78-1-1 = 76, where k is the number of independent variables. The t-statistic is given as $t = -0.72$. The critical value at the 5% significance level is $t_{0.025,\,76} \approx 1.99$. Since $|t| = 0.72 < 1.99$, we fail to reject the null at 5% significance and conclude that the coefficient of log-price is not significant (not statistically different from zero).
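The decision rule in (a) can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names (`t_pdf`, `t_cdf`, `t_critical`) are illustrative and not part of the original answer, and the critical value is obtained by numerically integrating the t density rather than by reading a table.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """CDF via Simpson's rule on [0, |x|], using symmetry about zero."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half = total * h / 3
    return 0.5 + half if x >= 0 else 0.5 - half

def t_critical(alpha, df):
    """Two-tailed critical value: solve t_cdf(c) = 1 - alpha/2 by bisection."""
    lo, hi = 0.0, 50.0
    target = 1 - alpha / 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n, k = 78, 1            # sample size and number of regressors in the first equation
t_stat = -0.72          # reported t statistic of LOG(PRICE)
df = n - k - 1
crit = t_critical(0.05, df)
reject = abs(t_stat) > crit
print(f"df={df}, critical t={crit:.3f}, reject H0: {reject}")
```

Setting `k = 2` and `t_stat = 3.55` applies the same rule to the trend coefficient of the second equation.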
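(b) The new regression model is estimated as $\widehat{\ln Q} = -33.5 + 0.46 \ln P + 0.009\,T$. For the partial regression coefficient of T, we have $\frac{\partial \ln Q}{\partial T} = 0.009$, or $\frac{\partial Q / Q}{\partial T} = 0.009$, or $\frac{\%\Delta Q}{\Delta T} \approx 100 \times 0.009 = 0.9$. The coefficient of T, multiplied by 100, is thus the approximate percentage change in industrial output for a one-unit increase in T (an absolute change in T, not a percentage change), i.e. for each additional unit of time (one quarter), industrial output increases by about 0.9 percent, holding price constant.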
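Since we have $\ln Q = -33.5 + 0.46 \ln P + 0.009\,T$, i.e. $Q = e^{-33.5 + 0.46 \ln P + 0.009 T}$, or $Q = e^{-33.5} P^{0.46} e^{0.009 T}$, or $Q \propto e^{0.009 T}$ for a given price, the trend is linear in $\ln Q$ but nonlinear (exponential) in the level of output. As $Q$ and $T$ are non-linearly associated, we may say that the trend is nonlinear.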
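As a quick check on the size of the trend effect, the sketch below compares the log approximation with the exact growth rate implied by an exponential trend. The variable names are illustrative, and the annual figure assumes four quarters of compounding at a constant price.

```python
import math

b_trend = 0.009  # estimated coefficient of T in the second regression

approx_pct = 100 * b_trend                       # log approximation, percent per quarter
exact_pct = 100 * (math.exp(b_trend) - 1)        # exact percent change per quarter
annual_pct = 100 * (math.exp(4 * b_trend) - 1)   # four quarters compounded

print(f"approximate quarterly growth: {approx_pct:.3f}%")
print(f"exact quarterly growth:       {exact_pct:.3f}%")
print(f"implied annual growth:        {annual_pct:.3f}%")
```

For a coefficient this small the approximation and the exact figure are nearly identical, which is why the coefficient is usually read directly as a growth rate.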
The test of significance has the null hypothesis $H_0: \beta_2 = 0$ and the alternative hypothesis $H_1: \beta_2 \neq 0$, where $\beta_2$ is the coefficient of T. The two-tailed t-test has the test statistic $t = \hat{\beta}_2 / \mathrm{se}(\hat{\beta}_2)$, which follows the t-distribution with df = n-k-1 = 78-2-1 = 75, where k is the number of independent variables. The t-statistic is given as $t = 3.55$. The critical value at the 5% significance level is $t_{0.025,\,75} \approx 1.99$. Since $|t| = 3.55 > 1.99$, we reject the null and conclude that the coefficient of T (the trend variable) is significant (statistically different from zero).
(c) The reason for the change in the intercept and the coefficient of log-price is that the second model includes a new independent variable, T. When independent variables are added or dropped, the other coefficients change, because each coefficient represents the partial effect of its variable on the dependent variable with the other included variables held constant, and these partial effects depend on which variables are included.
In the first case, the association between log-output and log-price is negative, while in the latter case it is positive, i.e. as log-price increases, log-output decreases in the first equation but increases in the second. With the trend included, the coefficient of log-price has t = 2.78, which exceeds the 5% critical value of about 1.99 (df = 75), so the positive relationship between output and price is statistically significant.
(d) This specification error is known as Omitted Variable Bias (OVB). Under OVB, an important independent variable is left out of the estimated model, which violates the OLS assumptions: the error term of the estimated model now contains the effect of the omitted variable on the dependent variable. If the omitted variable is correlated with an included regressor, the coefficient estimates are biased and inconsistent, and the inference based on them is unreliable, a problem that persists even if the sample size is increased.
One should trust the second equation much more than the first. One symptom of OVB is that including the omitted variable changes the sign of the coefficients of some or all of the other variables. The regression model here estimates the relationship (in magnitude and direction, and not necessarily a linear relationship) between industrial output and industrial price, and for an industry supply curve (as opposed to a consumer demand curve) quantity and price should be positively associated. In the first equation, however, the coefficient, and hence the association, is estimated to be negative, which contradicts the underlying theory. In the second equation, the coefficient, and hence the association, is estimated to be positive, which is consistent with the theory.
The first regression would be called a misspecified (underspecified) model, since it omits a relevant variable.
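The sign reversal discussed in (c) and (d) can be reproduced on synthetic data. The sketch below does not use the data from the question: the parameter values (a true price elasticity of 0.5, a downward trend in price, an upward trend in output) and the noise levels are assumptions chosen only to illustrate the mechanism. The coefficient with the trend included is computed by first removing the trend from both variables (the Frisch-Waugh approach), which gives the same slope as the multiple regression.

```python
import random

def mean(v):
    return sum(v) / len(v)

def slope(x, y):
    """OLS slope from regressing y on x with an intercept."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residuals(v, t):
    """Residuals from regressing v on a constant and t."""
    b = slope(t, v)
    a = mean(v) - b * mean(t)
    return [vi - (a + b * ti) for vi, ti in zip(v, t)]

rng = random.Random(0)   # fixed seed so the run is reproducible
T = list(range(1, 79))   # 78 quarters, as in the question

# Hypothetical data-generating process (assumed, not taken from the question):
# price drifts down over time, output rises with price and with the trend.
log_price = [5 - 0.01 * t + rng.gauss(0, 0.05) for t in T]
log_output = [1 + 0.5 * p + 0.02 * t + rng.gauss(0, 0.02)
              for p, t in zip(log_price, T)]

# Trend omitted: the price coefficient absorbs the trend effect.
short_coef = slope(log_price, log_output)

# Trend included: partial out T from both variables, then regress.
long_coef = slope(residuals(log_price, T), residuals(log_output, T))

print(f"price coefficient, trend omitted:  {short_coef:.3f}")
print(f"price coefficient, trend included: {long_coef:.3f}")
```

In this setup the omitted-trend regression yields a negative price coefficient even though the true elasticity is positive, because price and the omitted trend are negatively correlated.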