Suppose you are interested in how neighborhood income
around a school attendance zone would affect student test
scores.
A) First, consider a regression where the independent variable is
the neighborhood income around a school attendance zone and the
dependent variable is student test scores (note that this is based
on our research question). Create the variable names on your own
and write down the regression model. What is the likely sign of the
coefficient on the neighborhood income? (2 points)
B) Now, consider a regression where the independent variable is a
measure of violent crime incidents around the school and the
dependent variable is student test scores. Create the variable
names on your own and write down the regression model. Make sure
the variable names are consistent in each part of this question.
What is the likely sign of the coefficient on violent crime? (2
points)
C) Finally, consider a regression where the independent variable is
neighborhood income levels and the dependent variable is the
violent crime incidents. Use the variable names you discussed in
previous questions, and write down the regression model. What is
the likely sign of the coefficient on area income levels? (2
points)
D) Now consider the sign of omitted variable bias in the regression
in part A. What is the sign of the omitted variable bias if we omit a
measure of violent crime around a school? Is the estimated
coefficient in the first regression larger or smaller than the true
parameter? (4 points)
A) In this model the dependent variable is student test scores and the independent variable is neighborhood income. Let student test scores be denoted Stsc (the y-variable) and neighborhood income be denoted NI (the x-variable).
Therefore, the regression equation can be stated as:
$\text{Stsc} = \beta_0 + \beta_1 \text{NI} + u$
This equation examines the relationship between neighborhood income and student test scores at the nearby school. Here $\beta_0$ is the y-intercept, $\beta_1$ is the slope coefficient on NI, and $u$ is the error term, which captures the variation in Stsc that the model leaves unexplained. From a practical point of view, as neighborhood income rises, households in the neighborhood can spend more on test preparation for their children who attend the neighborhood school, which is likely to raise average test scores. The relationship between NI and Stsc is therefore presumably positive: as NI goes up, Stsc also goes up, and vice versa. The likely sign of the coefficient on neighborhood income, $\beta_1$, is positive (+).
B) In this case, the independent variable is the measure of violent crime incidents around the school and the dependent variable is student test scores. Student test scores are still denoted Stsc (the y-variable), and violent crime incidents around the school are denoted VCI (the x-variable).
Hence, the regression equation can be expressed as:
$\text{Stsc} = \gamma_0 + \gamma_1 \text{VCI} + v$
Here $\gamma_0$ is the y-intercept, $\gamma_1$ is the slope coefficient on VCI, and $v$ is the error term, which captures the variation in Stsc that the model leaves unexplained. As the number of violent crime incidents increases, neighborhood safety falls, creating unstable conditions that are likely to hurt the academic performance, and hence the test scores, of students at the neighborhood school. The likely sign of the coefficient on violent crime, $\gamma_1$, is therefore negative (-): as VCI increases, Stsc decreases, and vice versa.
C) In this case, the dependent variable is VCI (the y-variable) and the independent variable is NI (the x-variable).
Thus, the regression equation can be written as:
$\text{VCI} = \delta_0 + \delta_1 \text{NI} + e$
Here $\delta_0$ is the y-intercept, $\delta_1$ is the slope coefficient on NI, and $e$ is the error term, which captures the variation in VCI that the model leaves unexplained. As neighborhood income rises, the number of violent crimes in the neighborhood is likely to fall, and vice versa. The likely sign of the coefficient on area income levels, $\delta_1$, is therefore negative (-).
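The sign arguments in parts A) to C) can be checked on simulated data. The sketch below is only an illustration: the data-generating process, the sample size, and every numeric value in it are assumptions chosen to match the expected signs, not estimates from real schools. It fits the three simple regressions with `np.polyfit` and reports the sign of each slope.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

# Hypothetical data-generating process; all numbers are assumptions.
ni = rng.normal(60.0, 15.0, n)                    # neighborhood income (NI)
vci = 50.0 - 0.5 * ni + rng.normal(0.0, 5.0, n)   # violent crime incidents (VCI)
stsc = 40.0 + 0.3 * ni - 0.8 * vci + rng.normal(0.0, 5.0, n)  # test scores (Stsc)


def slope(x, y):
    """Return the OLS slope from a simple regression of y on x."""
    return np.polyfit(x, y, 1)[0]


slope_a = slope(ni, stsc)    # Part A: Stsc on NI
slope_b = slope(vci, stsc)   # Part B: Stsc on VCI
slope_c = slope(ni, vci)     # Part C: VCI on NI

for label, value in [("A", slope_a), ("B", slope_b), ("C", slope_c)]:
    sign = "+" if value > 0 else "-"
    print(f"Part {label}: slope = {value:.3f} (sign {sign})")
```

Under these assumed parameters the slopes come out positive, negative, and negative, matching the signs argued in the text.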
D) In the regression from part A), the dependent variable is Stsc and the only independent variable is NI. Omitted variable bias arises when VCI belongs in the model but is left out. Suppose the true model is $\text{Stsc} = \beta_0 + \beta_1 \text{NI} + \beta_2 \text{VCI} + u$, and let $\delta_1$ be the slope from a regression of VCI on NI, as in part C). If $\hat{\beta}_1$ is the slope estimated in part A) with VCI omitted, the bias is $\text{Bias}(\hat{\beta}_1) = E(\hat{\beta}_1) - \beta_1 = \beta_2 \delta_1$. Its sign depends on two things: the effect of the omitted variable on the dependent variable, and the relationship between the omitted variable and the included regressor. From part B), more violent crime lowers test scores, so $\beta_2 < 0$. From part C), higher neighborhood income goes with less violent crime, so $\delta_1 < 0$. The product of two negative terms is positive, so the bias is positive (+). This means $E(\hat{\beta}_1) > \beta_1$: the estimated coefficient in the first regression is, on average, larger than the true parameter. It overstates the effect of neighborhood income because it also picks up the benefit of the lower crime that comes with higher income.
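The direction of the bias can be illustrated with a small simulation. This is a sketch under stated assumptions: the true values `beta1`, `beta2`, and `delta1` are made up, with signs chosen to match parts A) to C), and the data are simulated. It compares the short regression (Stsc on NI only) with the long regression (Stsc on NI and VCI), and checks the in-sample OLS identity that the short slope equals the long slope on NI plus the long slope on VCI times the slope of VCI on NI.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

# Assumed true parameters (hypothetical): income raises scores, crime lowers
# them, and income lowers crime.
beta1, beta2, delta1 = 0.3, -0.8, -0.5

ni = rng.normal(60.0, 15.0, n)
vci = 50.0 + delta1 * ni + rng.normal(0.0, 5.0, n)
stsc = 40.0 + beta1 * ni + beta2 * vci + rng.normal(0.0, 5.0, n)

# Long regression: Stsc on a constant, NI, and VCI.
X_long = np.column_stack([np.ones(n), ni, vci])
b_long, *_ = np.linalg.lstsq(X_long, stsc, rcond=None)

# Short regression (part A): Stsc on a constant and NI only.
X_short = np.column_stack([np.ones(n), ni])
b_short, *_ = np.linalg.lstsq(X_short, stsc, rcond=None)

# Auxiliary regression (part C): VCI on a constant and NI.
d_aux, *_ = np.linalg.lstsq(X_short, vci, rcond=None)

# Short slope implied by the omitted-variable formula.
implied = b_long[1] + b_long[2] * d_aux[1]
bias = b_short[1] - beta1

print(f"long-regression slope on NI : {b_long[1]:.3f}")
print(f"short-regression slope on NI: {b_short[1]:.3f}")
print(f"implied by OVB formula      : {implied:.3f}")
print(f"bias relative to true beta1 : {bias:+.3f}")
```

With these assumed values the short-regression slope exceeds the true `beta1`, which is the upward bias described above.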