Suppose you are interested in how neighborhood income
around a school attendance zone would affect student test
scores.
A) First, consider a regression where the independent variable is
the neighborhood income around a school attendance zone and the
dependent variable is student test scores (note that this is based
on our research question). Create the variable names on your own
and write down the regression model. What is the likely sign of the
coefficient on the neighborhood income? (2 points)
B) Now, consider a regression where the independent variable is a
measure of violent crime incidents around the school and the
dependent variable is student test scores. Create the variable
names on your own and write down the regression model. Make sure
the variable names are consistent in each part of this question.
What is the likely sign of the coefficient on violent crime? (2
points)
C) Finally, consider a regression where the independent variable is
neighborhood income levels and the dependent variable is the
violent crime incidents. Use the variable names you discussed in
previous questions, and write down the regression model. What is
the likely sign of the coefficient on area income levels? (2
points)
D) Now consider the sign of omitted variable bias in the regression
in part A. What is the sign of omitted variable bias if we omit a
measure of violent crime around a school? Is the estimated
coefficient in the first regression larger or smaller than the true
parameter? (4 points)
A) The dependent variable in this model is student test scores and the independent variable is neighborhood income. Denote student test scores by Stsc (the y variable) and neighborhood income by NI (the x variable).
The regression model can then be written as:
$$Stsc = \beta_0 + \beta_1 NI + \varepsilon$$
This equation describes the relationship between neighborhood
income and student test scores in the nearby school. Here $\beta_0$
is the y-intercept, $\beta_1$ is the slope on the independent
variable NI, and $\varepsilon$ is the residual or error term, which
captures the variation in Stsc that the model leaves unexplained.
In practical terms, as neighborhood income rises, households in the
neighborhood can spend more on test preparation for their children
who attend the neighborhood school, which would likely raise
average test scores. The relationship between NI and Stsc is
therefore presumably positive: as NI goes up, Stsc also goes up,
and vice versa. The likely sign of the coefficient on NI,
$\beta_1$, is positive (+).
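To make the sign argument concrete, here is a minimal sketch of how the slope in the part A) regression could be computed by hand. The numbers are invented purely for illustration (they are not real school data), and the helper name `ols_simple` is hypothetical. It uses the standard result that the simple-regression slope equals the sample covariance of x and y divided by the sample variance of x.

```python
# Hypothetical illustration: the data below are made up, not real school data.

def ols_simple(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = sum of cross-deviations / sum of squared deviations of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# NI in thousands of dollars, Stsc as an average test score (invented values).
NI = [30, 45, 60, 75, 90]
Stsc = [62, 68, 71, 77, 82]

b0, b1 = ols_simple(NI, Stsc)
print(f"intercept = {b0:.2f}, slope on NI = {b1:.3f}")
```

In this toy data set, scores rise with income, so the estimated slope on NI comes out positive, which matches the sign expected above.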
B) In this case, the independent variable is a measure of violent crime incidents around the school and the dependent variable is student test scores. Student test scores are still denoted by Stsc (the y variable), and violent crime incidents around the school are denoted by VCI (the x variable).
The regression model can then be written as:
$$Stsc = \gamma_0 + \gamma_1 VCI + \varepsilon$$
Here $\gamma_0$ is the y-intercept, $\gamma_1$ is the slope on the
independent variable VCI, and $\varepsilon$ is the residual or
error term, which captures the variation in Stsc that the model
leaves unexplained. As the number of violent crime incidents
increases, neighborhood safety falls, creating unstable conditions
that would likely hurt the academic performance, and hence the
test scores, of students in the neighborhood school. The likely
sign of the coefficient on VCI, $\gamma_1$, is therefore negative
(-): as VCI increases, Stsc decreases, and vice versa.
C) In this case, the dependent variable is VCI and the independent variable is NI, so VCI is the y variable and NI is the x variable.
The regression model can then be written as:
$$VCI = \delta_0 + \delta_1 NI + \varepsilon$$
Here $\delta_0$ is the y-intercept, $\delta_1$ is the slope on the
independent variable NI, and $\varepsilon$ is the residual or error
term, which captures the variation in VCI that the model leaves
unexplained. As neighborhood income rises, the number of violent
crimes in the neighborhood is likely to fall, and vice versa. The
likely sign of the coefficient on NI, $\delta_1$, is therefore
negative (-).
D) In the regression from part A), the dependent variable is Stsc
and the only independent variable is NI. Omitted variable bias
arises if VCI belongs in the model but is left out. Suppose the
true model is
$$Stsc = \beta_0 + \beta_1 NI + \beta_2 VCI + u$$
and VCI is related to NI through $VCI = \delta_0 + \delta_1 NI + v$.
If Stsc is regressed on NI alone, the bias in the estimated slope
$\hat{\beta}_1$ is
$$Bias(\hat{\beta}_1) = E(\hat{\beta}_1) - \beta_1 = \beta_2 \delta_1$$
where $\beta_1$ is the true slope coefficient of NI. The sign of
the bias therefore depends on two things: the effect of the omitted
variable on test scores ($\beta_2$) and the relationship between NI
and the omitted variable ($\delta_1$). From part B), violent crime
is likely to lower test scores, so $\beta_2 < 0$. From part C), VCI
is likely to fall as NI rises, so $\delta_1 < 0$. The product of
two negative terms is positive, so the bias is positive (+). This
means the expected value of the estimated coefficient,
$E(\hat{\beta}_1)$, is larger than the true parameter $\beta_1$:
the regression in part A) overstates the effect of neighborhood
income, because NI also picks up part of the benefit of lower
violent crime.
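The direction of the bias can also be checked with a small simulation. The sketch below assumes a data-generating process in which violent crime lowers test scores and falls as neighborhood income rises. All parameter values (`beta1 = 0.30`, `beta2 = -2.0`, `delta1 = -0.10`, and so on) are hypothetical choices for illustration, not estimates from any data, and the helper `ols` is just a thin wrapper around NumPy's least-squares solver. It compares the slope on NI when VCI is omitted with the slope when VCI is included.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Assumed "true" parameters, chosen only for illustration.
beta0, beta1, beta2 = 50.0, 0.30, -2.0   # NI raises scores, VCI lowers them
delta0, delta1 = 20.0, -0.10             # VCI falls as NI rises

# Simulated data (VCI is treated as continuous for simplicity).
NI = rng.normal(60.0, 15.0, n)
VCI = delta0 + delta1 * NI + rng.normal(0.0, 1.0, n)
Stsc = beta0 + beta1 * NI + beta2 * VCI + rng.normal(0.0, 5.0, n)

def ols(y, *regressors):
    """Return OLS coefficients (intercept first) via least squares."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

short = ols(Stsc, NI)        # regression that omits VCI
long_ = ols(Stsc, NI, VCI)   # regression that includes VCI

print(f"slope on NI, VCI omitted:  {short[1]:.3f}")
print(f"slope on NI, VCI included: {long_[1]:.3f}")
print(f"bias implied by beta2 * delta1: {beta2 * delta1:.3f}")
```

Under these assumptions the slope from the regression that omits VCI should sit above the slope from the regression that includes it, by roughly `beta2 * delta1`, which is positive because both factors are negative.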