Question

Suppose that a relevant variable is omitted from a simple regression (i.e., there should be a second explanatory variable in the model, but there is not). Under what conditions would the estimated slope coefficient (i.e., the one usually called β1) be biased downward; under what conditions would the estimated slope coefficient be biased upwards?

Solutions

Expert Solution

Suppose the true cause-and-effect relationship is given by

y = a + bx + cz + u

with parameters a, b, c, dependent variable y, independent variables x and z, and error term u. We wish to know the effect of x itself upon y (that is, we wish to obtain an estimate of b).

Two conditions must hold true for omitted-variable bias to exist in linear regression:

  • the omitted variable must be a determinant of the dependent variable (i.e., its true regression coefficient must not be zero); and
  • the omitted variable must be correlated with an independent variable specified in the regression (i.e., cov(z,x) must not equal zero).

Suppose we omit z from the regression, and suppose the relation between x and z is given by

z = d + fx + e

with parameters d, f and error term e. Substituting the second equation into the first gives

y = (a + cd) + (b + cf)x + (u + ce).

If y is regressed on x only, this last equation is what is estimated, and the regression coefficient on x is actually an estimate of (b + cf). It therefore captures not just the desired direct effect of x upon y (which is b), but the sum of that direct effect and the indirect effect (the effect f of x on z times the effect c of z on y). Thus by omitting the variable z from the regression, we have estimated the total derivative of y with respect to x rather than its partial derivative with respect to x. These differ if both c and f are non-zero.
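The same result can be written in terms of population moments. This is a sketch that assumes both error terms u and e are uncorrelated with x, which the derivation above uses implicitly; the label "short" for the regression that omits z is introduced here only for illustration.

```latex
\operatorname{plim}\,\hat{b}_{\text{short}}
  = \frac{\operatorname{Cov}(x, y)}{\operatorname{Var}(x)}
  = b + c\,\frac{\operatorname{Cov}(x, z)}{\operatorname{Var}(x)}
  = b + cf
```

Written this way, the bias term c Cov(x, z) / Var(x) vanishes exactly when c = 0 or cov(z, x) = 0, which matches the two conditions listed earlier.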

The direction and extent of the bias are both contained in cf, since the effect sought is b but the regression estimates b + cf. The extent of the bias is the absolute value of cf. The bias is upward (toward a more positive or less negative value) if cf > 0, that is, if c and f have the same sign: the effect of z on y runs in the same direction as the relation between x and z. The bias is downward if cf < 0, that is, if c and f have opposite signs. If either c or f is zero, there is no bias.
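The sign rule can be checked with a small simulation. The following is a minimal sketch, not part of the original solution: the function name, the intercepts (a = 1.0, d = 0.5), the standard normal errors, and the sample size are all assumptions chosen for illustration. It generates data from the two equations above, regresses y on x alone, and compares that slope with the coefficient on x when z is included.

```python
import numpy as np


def simulate_short_regression(b, c, f, n=200_000, seed=0):
    """Return (slope of y on x alone, coefficient on x when z is included).

    Data follow y = a + b*x + c*z + u and z = d + f*x + e, with
    illustrative values a = 1.0 and d = 0.5 and standard normal x, u, e.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    e = rng.normal(size=n)  # error in the z equation
    u = rng.normal(size=n)  # error in the y equation
    z = 0.5 + f * x + e
    y = 1.0 + b * x + c * z + u

    # Short regression (z omitted): OLS slope is cov(x, y) / var(x).
    slope_short = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

    # Long regression (z included): least squares on [1, x, z].
    design = np.column_stack([np.ones(n), x, z])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return slope_short, coef[1]


if __name__ == "__main__":
    # c and f share a sign (cf > 0): expect the short slope above b.
    # c and f differ in sign (cf < 0): expect the short slope below b.
    for f in (1.5, -1.5):
        short, full = simulate_short_regression(b=1.0, c=2.0, f=f)
        print(f"f={f:+.1f}  short slope={short:.3f}  "
              f"long slope={full:.3f}  b + cf={1.0 + 2.0 * f:.3f}")
```

With b = 1 and c = 2, the short-regression slope should land near b + cf in each case, while the regression that includes z should recover a value near b.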

