Suppose that Yi=?0+?1Xi+ui and that E[ui|Xi] = 0 and therefore OLS is an unbiased estimator. a)...

Suppose that

Yi=?0+?1Xi+ui and that E[ui|Xi] = 0 and therefore OLS is an unbiased estimator.

a) Show that Zi=Xi is a valid instrument for Xi , i.e. it is both relevant and exogenous.

b) Show that the 2SLS estimator of ?1 using Xi as an instrument for Xi is exactly equal to the OLS estimator of ?1

c) Let Zi=X2i and assume Xi is normally distributed N(?,?²). Is Zi exogenous? Is Zi relevant? Explain how the answer to these questions depends on the values of (?,?²)

Expert Solution

Three important threats to internal validity are:

Omitted variable bias from a variable that is correlated with X but is unobserved (so cannot be included in the regression) and for which there are inadequate control variables;
Simultaneous causality bias (X causes Y, Y causes X);
Errors-in-variables bias (X is measured with error)

All three problems result in E(u|X) 0.
Instrumental variables regression can eliminate bias when E(u|X) 0 – using an instrumental variable (IV), Z
Y_i = ₀ + ₁X_i + u_i

IV regression breaks X into two parts: a part that might be correlated with u, and a part that is not. By isolating the part that is not correlated with u, it is possible to estimate ₁.
This is done using an instrumental variable, Z_i, which is correlated with X_i but uncorrelated with u_i.

An endogenous variable is one that is correlated with u
An exogenous variable is one that is uncorrelated with u
In IV regression, we focus on the case that X is endogenous and there is an instrument, Z, which is exogenous.

Digression on terminology: “Endogenous” literally means “determined within the system.” If X is jointly determined with Y, then a regression of Y on X is subject to simultaneous causality bias. But this definition of endogeneity is too narrow because IV regression can be used to address OV bias and errors-in-variable bias. Thus we use the broader definition of endogeneity above.
Two Conditions for a Valid Instrument
Y_i = ₀ + ₁X_i + u_i
For an instrumental variable (an “instrument”) Z to be valid, it must satisfy two conditions:
1. Instrument relevance: corr(Z_i,X_i) 0
2. Instrument exogeneity: corr(Z_i,u_i) = 0
Suppose for now that you have such a Z_i
The IV estimator with one X and one Z
Explanation #1: Two Stage Least Squares (TSLS)
As it sounds, TSLS has two stages – two regressions:
(1) Isolate the part of X that is uncorrelated with u by regressing X on Z using OLS:
X_i = ₀ + ₁Z_i + v_i (1)

Because Z_i is uncorrelated with ui, ₀ + ₁Z_i is uncorrelated with ui. We don’t know ₀ or ₁ but we have estimated them, so…
Compute the predicted values of X_i,X_iˆ, where X_iˆ= ˆ₀ + ˆ₁Z_i, i = 1,…,n.

Two Stage Least Squares, ctd.
(2) Replace X_i by Xˆ_i in the regression of interest:

regress Y on Xˆ_i using OLS:
Y_i = ₀ + ₁Xˆ_i + u_i (2)

Because Xˆ_i is uncorrelated with u_i, the first least squares assumption holds for regression (2). (This requires n to be large so that ?0 and ?1 are precisely estimated.)
Thus, in large samples, ₁ can be estimated by OLS using regression (2)
The resulting estimator is called the Two Stage Least Squares (TSLS) estimator, ˆ^TSLS₁

Two Stage Least Squares: Summary
Suppose Z_i, satisfies the two conditions for a valid instrument:
1. Instrument relevance: corr(Z_i,X_i) 0
2. Instrument exogeneity: corr(Z_i,u_i) = 0
Two-stage least squares:
Stage 1: Regress X_i on Z_i (including an intercept), obtain the predicted values Xˆ_i
Stage 2: Regress Y_i on Xˆ_i (including an intercept); the coefficient on Xˆ_i is the TSLS estimator, ˆ^TSLS₁
ˆ^TSLS₁ is a consistent estimator of ?1.
The IV Estimator, one X and one Z, ctd.
Explanation #2: A direct algebraic derivation
Y_i = ₀ + ₁X_i + u_i
Thus,
cov(Yi, Zi) = cov(₀ + ₁X_i + u_i, Z_i)
= cov(₀, Z_i) + cov(₁X_i, Z_i) + cov(u_i, Z_i)
= 0 + cov(₁X_i, Z_i) + 0
= ₁cov(Xi, Zi)
where cov(u_i, Z_i) = 0 by instrument exogeneity; thus
₁ = cov(Y_i , Z_i)/cov(X_i , Z_i)

The IV Estimator, one X and one Z, ctd.
₁ = cov(Y_i , Z_i)/cov(X_i , Z_i)

The IV estimator replaces these population covariances with sample covariances:
ˆ^TSLS₁ = S_YZ/S_XZ
s_YZ and s_XZ are the sample covariances. This is the TSLS estimator – just a different derivation!

Rahul Sunny answered 3 years ago

Show that OLS estimator of variance is an unbiased estimator?

The Gauss-Markov theorem says that the OLS estimator is the best linear unbiased estimator.

The Gauss-Markov theorem says that the OLS estimator is the best linear unbiased estimator. Explain which assumptions are needed in order to verify Gauss-Markov theorem? Consider the Cobb-Douglas production function

There are two OLS regression specification Yi = aSexi + ui (1) Yi = b Malei...

There are two OLS regression specification Yi = aSexi + ui (1) Yi = b Malei + c Femalei + ui(2) a, b, c, are constants. Sexi = 1 if the person is Female, and 0 otherwise. Malei = 1 if Sexi = 0, Femalei = 1 if Sexi = 1, and both are 0 otherwise. Why are neither regression(1) nor regression(2) directly tell you if the difference in Yi between Males and Females is statistically significant? What would be...

1. Consider the model Ci= β0+β1 Yi+ ui. Suppose you run this regression using OLS and...

1. Consider the model Ci= β0+β1 Yi+ ui. Suppose you run this regression using OLS and get the following results: b0=-3.13437; SE(b0)=0.959254; b1=1.46693; SE(b1)=21.0213; R-squared=0.130357; and SER=8.769363. Note that b0 and b1 the OLS estimate of b0 and b1, respectively. The total number of observations is 2950.According to these results the relationship between C and Y is: A. no relationship B. impossible to tell C. positive D. negative 2. Consider the model Ci= β0+β1 Yi+ ui. Suppose you run this...

Consider the model Ci= β0+β1 Yi+ ui. Suppose you run this regression using OLS and get...

Consider the model Ci= β0+β1 Yi+ ui. Suppose you run this regression using OLS and get the following results: b0=-3.13437; SE(b0)=0.959254; b1=1.46693; SE(b1)=0.0697828; R-squared=0.130357; and SER=8.769363. Note that b0 and b1 the OLS estimate of b0 and b1, respectively. The total number of observations is 2950. The following values are relevant for assessing goodness of fit of the estimated model with the exception of A. 0.130357 B. 8.769363 C. 1.46693 D. none of these

1. Consider the model Ci= β0+β1 Yi+ ui. Suppose you run this regression using OLS and...

1. Consider the model Ci= β0+β1 Yi+ ui. Suppose you run this regression using OLS and get the following results: b0=-3.13437; SE(b0)=0.959254; b1=1.46693; SE(b1)=0.0697828; R-squared=0.130357; and SER=8.769363. Note that b0 and b1 the OLS estimate of b0 and b1, respectively. The total number of observations is 2950. The number of degrees of freedom for this regression is A. 2952 B. 2948 C. 2 D. 2950 2. Consider the model Ci= β0+β1 Yi+ ui. Suppose you run this regression using OLS...

Consider the simple linear regression: Yi = β0 + β1Xi + ui whereYi and Xi...

Consider the simple linear regression: Yi = β0 + β1Xi + ui where Yi and Xi are random variables, β0 and β1 are population intercept and slope parameters, respectively, ui is the error term. Suppose the estimated regression equation is given by: Yˆ i = βˆ 0 + βˆ 1Xi where βˆ 0 and βˆ 1 are OLS estimates for β0 and β1. Define residuals ˆui as: uˆi = Yi − Yˆ i Show that: (a) (2 pts.) Pn i=1...

Question

Suppose that Yi=?0+?1Xi+ui and that E[ui|Xi] = 0 and therefore OLS is an unbiased estimator. a)...

Solutions

Expert Solution

Related Solutions

Show that OLS estimator of variance is an unbiased estimator?

The Gauss-Markov theorem says that the OLS estimator is the best linear unbiased estimator.

There are two OLS regression specification Yi = aSexi + ui (1) Yi = b Malei...

1. Consider the model Ci= β0+β1 Yi+ ui. Suppose you run this regression using OLS and...

Consider the model Ci= β0+β1 Yi+ ui. Suppose you run this regression using OLS and get...

1. Consider the model Ci= β0+β1 Yi+ ui. Suppose you run this regression using OLS and...

Consider the simple linear regression: Yi = β0 + β1Xi + ui whereYi and Xi...

Explain what assumptions are needed in order for the OLS estimator to be unbiased in a crosss sectional environment.

Explain what assumptions are needed in order for the OLS estimator to be unbiased in a time series environment.

Draw a scatterplot with a linear regression line for which the condition E(ui|Xi)=0 does not hold...