Question

In: Economics

Explain what assumptions are needed in order for the OLS estimator to be unbiased in a cross-sectional environment.

Explain what assumptions are needed in order for the OLS estimator to be unbiased in a panel data environment.

 

Solutions

Expert Solution

The following is based on simple cross sections; for time series and panel data the required assumptions differ somewhat.

In the population, and therefore in the sample, the model can be written as:

Y = β0 + β1x1 + … + βkxk + u = Xβ + u

This is the linearity assumption (assumption 1), which is sometimes misunderstood: the model must be linear in the parameters, namely the βk. You are free to do whatever you want with the xi themselves (logs, squares, etc.). If the model is not linear in the parameters, it cannot be estimated by OLS; you need some other, nonlinear estimator. The remaining assumptions are:

2. Random sampling (for cross sections). This is needed for inference and for sampling properties; it is somewhat irrelevant for the pure mechanics of OLS.

3. No perfect collinearity. There can be no exact linear relationship among the xi. This assumption ensures that X′X is nonsingular, so that (X′X)⁻¹ exists.

4. Zero conditional mean: E(u∣X) = 0. This means you have properly specified the model: there are no omitted variables, and the functional form you estimated is correct relative to the (unknown) population model. This is always the problematic assumption with OLS, since there is no way to ever know whether it actually holds.

5. Homoskedasticity: the variance of the error term is constant conditional on all the xi, Var(u∣X) = σ². Again this means nothing for the mechanics of OLS, but it ensures that the usual standard errors are valid.

6. Normality: the error term u is independent of the xi and follows u ∼ N(0, σ²). Again this is irrelevant for the mechanics of OLS, but it ensures that the sampling distribution of the β̂k is normal: β̂k ∼ N(βk, Var(β̂k)).

Now for the implications.

Under 1 - 6 (the classical linear model assumptions), OLS is BLUE (best linear unbiased estimator), best in the sense of lowest variance. It is also efficient amongst all linear estimators, as well as all estimators that use some function of the x. More importantly, under 1 - 6 OLS is also the minimum variance unbiased estimator: amongst all unbiased estimators (not just the linear ones), OLS has the smallest variance. OLS is also consistent.

Under 1 - 5 (the Gauss-Markov assumptions), OLS is BLUE and efficient (as described above).

Under 1 - 4, OLS is unbiased and consistent.
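As a quick illustration of the unbiasedness claim, here is a minimal Monte Carlo sketch in NumPy (all parameter values are illustrative, not from the question): averaged over many random samples drawn under assumptions 1 - 4, the OLS estimates center on the true coefficients.

```python
import numpy as np

rng = np.random.default_rng(0)
beta = np.array([1.0, 2.0, -0.5])   # true (intercept, beta1, beta2), chosen arbitrarily
n, reps = 200, 5000

estimates = np.empty((reps, len(beta)))
for r in range(reps):
    # A fresh random sample each replication (assumption 2)
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
    u = rng.normal(size=n)          # E(u|X) = 0 holds by construction (assumption 4)
    y = X @ beta + u
    estimates[r] = np.linalg.lstsq(X, y, rcond=None)[0]

print(estimates.mean(axis=0))       # close to [1.0, 2.0, -0.5]: no bias
```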

Actually OLS is also consistent under a weaker assumption than (4), namely that (1) E(u) = 0 and (2) Cov(xj, u) = 0. The difference from assumption 4 is that, under this weaker condition, you do not need to nail the functional relationship perfectly.
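A sketch of that weaker condition, under a hypothetical setup of my choosing: let y = x² + e with x ∼ N(0, 1). The linear projection of y on (1, x) has intercept 1 and slope 0, and the projection error u = y − 1 satisfies E(u) = 0 and Cov(x, u) = 0 even though E(u∣x) = x² − 1 is not zero. OLS then consistently estimates the projection coefficients, even though the fitted functional form does not match the conditional mean.

```python
import numpy as np

rng = np.random.default_rng(1)
for n in (100, 10_000, 1_000_000):
    x = rng.normal(size=n)
    y = x**2 + rng.normal(size=n)             # true conditional mean is quadratic
    X = np.column_stack([np.ones(n), x])
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    print(n, b)                               # converges to the projection (1, 0)
```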

A comment in another question raised doubts about the importance of the condition E(u∣X) = 0, arguing that it can be corrected by the inclusion of a constant term in the regression specification, and so "it can be easily ignored".

This is not so. The inclusion of a constant term in the regression will absorb a possibly non-zero conditional mean of the error term only if that conditional mean is already a constant and not a function of the regressors. This is the crucial assumption that must be made, independently of whether we include a constant term or not:

E(u∣X) = const.

If this holds, then the non-zero mean becomes a nuisance that we can remove simply by including a constant term.

But if this doesn't hold (i.e. if the conditional mean is neither zero nor a non-zero constant), the inclusion of the constant term does not solve the problem: what it "absorbs" in this case is a magnitude that depends on the specific sample and the realizations of the regressors. The coefficient attached to the series of ones is then not really a constant but a variable, depending on the regressors through the non-constant conditional mean of the error term.

What does this imply? To simplify, assume the simplest case, where E(ui∣X−i) = 0 (i indexes the observations) but E(ui∣xi) = h(xi). That is, the error term is mean-independent of the regressors except its contemporaneous ones (in X we do not include a series of ones).

Assume that we specify the regression with the inclusion of a constant term (a regressor that is a series of ones):

y = a + Xβ + ε

and compacting notation

y = Zγ + ε

where a = (a, a, ..., a)′, Z = [1 : X], γ = (a, β′)′, ε = u − a.

Then the OLS estimator will be

γ̂ = (Z′Z)⁻¹Z′y = γ + (Z′Z)⁻¹Z′ε

For unbiasedness we need E[ε∣Z] = 0. But

E[εi∣xi] = E[ui − a∣xi] = h(xi) − a

which cannot be zero for all i, since we examine the case where h(xi) is not a constant function. So

E[ε∣Z] ≠ 0 ⟹ E(γ̂) ≠ γ

and

If E(ui∣xi) = h(xi) ≠ h(xj) = E(uj∣xj), then even if we include a constant term in the regression, the OLS estimator will not be unbiased, which also means that the Gauss-Markov result on efficiency is lost.
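A minimal simulation sketch of this point (the setup and numbers are illustrative): take h(xi) = xi² with x ∼ N(0.5, 1), so that h(x) is non-constant and correlated with x. With true (a, β) = (1, 2), the OLS estimates, constant term included, center around roughly (1.75, 3) rather than (1, 2).

```python
import numpy as np

rng = np.random.default_rng(2)
a, beta = 1.0, 2.0                   # true coefficients (illustrative)
n, reps = 500, 2000

est = np.empty((reps, 2))
for r in range(reps):
    x = rng.normal(loc=0.5, size=n)
    u = x**2 + rng.normal(size=n)    # E(u|x) = h(x) = x^2: non-constant, correlated with x
    y = a + beta * x + u
    Z = np.column_stack([np.ones(n), x])    # constant term included
    est[r] = np.linalg.lstsq(Z, y, rcond=None)[0]

print(est.mean(axis=0))              # centers near [1.75, 3.0], not [1.0, 2.0]
```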


Moreover, the error term ε has a different mean for each i, and so also a different variance (i.e. it is conditionally heteroskedastic). So its distribution conditional on the regressors differs across the observations i.

But this means that even if the error term ui is assumed normal, the distribution of the sampling error γ̂ − γ will be normal but not zero-mean normal, with unknown bias, and its variance will differ across observations. So

If E(ui∣xi) = h(xi) ≠ h(xj) = E(uj∣xj), then even if we include a constant term in the regression, hypothesis testing is no longer valid.
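Continuing the same hypothetical setup, the usual t-test is badly mis-sized: a nominal 5% two-sided test of the true slope β = 2 rejects almost every time, because the sampling distribution of β̂ is not centered at β.

```python
import numpy as np

rng = np.random.default_rng(3)
a, beta = 1.0, 2.0
n, reps, rejections = 500, 2000, 0

for _ in range(reps):
    x = rng.normal(loc=0.5, size=n)
    y = a + beta * x + x**2 + rng.normal(size=n)   # same non-constant h(x) = x^2
    Z = np.column_stack([np.ones(n), x])
    g, ssr = np.linalg.lstsq(Z, y, rcond=None)[:2]
    s2 = ssr[0] / (n - 2)                          # usual residual variance estimate
    se = np.sqrt(s2 * np.linalg.inv(Z.T @ Z)[1, 1])
    t = (g[1] - beta) / se                         # t-stat against the TRUE slope
    rejections += abs(t) > 1.96                    # ~5% critical value for large df

print(rejections / reps)                           # close to 1, far above 0.05
```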


In other words, "finite-sample" properties are all gone.

We are left only with the option to resort to asymptotically valid inference, for which we will have to make additional assumptions.

So, simply put, strict exogeneity cannot be "easily ignored".

