Earnings functions, whereby the log of earnings is regressed on years of education, years of on-the-job training, and individual characteristics, have been studied for a variety of reasons. Some studies have focused on the returns to education, others on discrimination, union/non-union differentials, etc. For all these studies, a major concern has been the fact that ability should enter as a determinant of earnings, but that it is close to impossible to measure and therefore represents an omitted variable.
Assume that the coefficient on years of education is the parameter of interest. Given that education is positively correlated with ability, since, for example, more able students attract scholarships and hence receive more years of education, the OLS estimator for the returns to education could be upward biased. To overcome this problem, various authors have used instrumental variable estimation techniques. For each of the potential instruments listed below, briefly discuss instrument validity.
(a) The individual's postal zip code.
(b) The individual's IQ or test score on a work related exam.
(c) Years of education for the individual's mother or father.
(a) In the given model, the dependent variable is the natural log of earnings, and the explanatory variables are years of education, years of on-the-job training, and individual characteristics. Instrumental variables (IVs) are used in econometrics and related disciplines to estimate the effect of an explanatory variable that is correlated with the error term. A valid instrument must satisfy two conditions: it must be correlated with the endogenous explanatory variable (relevance), here years of education, and it must be uncorrelated with the error term of the regression (exogeneity), which here contains the omitted ability. The instrument itself must therefore be exogenous; the variation in education that it induces is what identifies the return to education. In this case, the postal zip code has no direct effect on the years of education an individual obtains, so any correlation between the two is likely to be small. Using zip codes as an instrument would therefore do little to isolate the variation in education that is unrelated to ability.
Hence the postal zip code would at best be a weak instrument, with little or no statistically significant correlation with years of education, and it would not overcome the bias caused by omitting ability. Exogeneity is also doubtful, since the neighbourhood a person lives in tends to reflect family income and school quality, which may be related both to ability and to earnings.
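The two requirements an instrument must meet can be written compactly. The following is a sketch for the simplified case with education as the only regressor; the symbols $Z_i$, $u_i$, $\delta$ and $\varepsilon_i$ are notation introduced here for illustration and do not appear in the question.

```latex
% Earnings function with ability absorbed into the error term
\ln(\text{earnings}_i) = \beta_0 + \beta_1\,\text{educ}_i + u_i,
\qquad u_i = \delta\,\text{ability}_i + \varepsilon_i

% OLS is inconsistent when education and ability are correlated
\hat{\beta}_1^{\text{OLS}} \xrightarrow{p}
  \beta_1 + \delta\,\frac{\operatorname{Cov}(\text{educ}_i,\text{ability}_i)}{\operatorname{Var}(\text{educ}_i)}

% Conditions for a valid instrument Z_i
\operatorname{Cov}(Z_i,\text{educ}_i) \neq 0 \quad \text{(relevance)},
\qquad
\operatorname{Cov}(Z_i,u_i) = 0 \quad \text{(exogeneity)}

% Probability limit of the IV estimator
\hat{\beta}_1^{\text{IV}} \xrightarrow{p}
  \frac{\operatorname{Cov}(Z_i,\ln(\text{earnings}_i))}{\operatorname{Cov}(Z_i,\text{educ}_i)}
  = \beta_1 + \frac{\operatorname{Cov}(Z_i,u_i)}{\operatorname{Cov}(Z_i,\text{educ}_i)}
```

The last line shows why both conditions matter: a weak instrument makes the denominator small, so even a slight correlation between $Z_i$ and $u_i$ produces a large bias.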
(b) An individual's IQ or score on a work-related exam is likely to be correlated with years of education and with years of on-the-job training. IQ may influence the amount of formal schooling obtained (school and college, typically up to the bachelor's degree), and scores on work-related qualifying exams are often a criterion for admission to on-the-job professional or vocational training programs. The relevance condition is therefore plausibly satisfied. Relevance alone does not make an instrument valid, however. IQ and test scores are themselves measures of ability, which is the omitted variable contained in the error term, so they are almost certainly correlated with the error and fail the exogeneity condition. They are therefore not valid instruments; they are better suited to serve as proxy variables for ability, included directly as regressors in the earnings function.
(c) Parents' years of education are likely to influence the years of education the individual obtains: households that value formal education tend to have higher rates of school and college enrollment. The more years of formal education the mother and father have, the more likely the individual is to have more as well, and vice versa. Parental education is therefore plausibly strongly correlated with the individual's years of education, so the relevance condition is satisfied. Validity also requires exogeneity, which is less clear: if ability is partly inherited or shaped by the home environment, or if parental education affects earnings directly, for example through family connections, the instrument is correlated with the error term. Parental education is thus a relevant instrument, but it is valid only under the assumption that it affects earnings solely through the individual's own education.
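A small simulation can illustrate how relevance and exogeneity differ in practice. The sketch below uses made-up data: the coefficients, the variable names, and the instruments `z_valid` (independent of ability by construction) and `z_proxy` (an IQ-like noisy measure of ability) are all assumptions chosen for illustration, not estimates from any study.

```python
import numpy as np


def simulate(n=200_000, beta=0.10, seed=0):
    """Generate hypothetical data in which ability is an omitted variable."""
    rng = np.random.default_rng(seed)
    ability = rng.normal(size=n)            # unobserved by the researcher
    z_valid = rng.normal(size=n)            # instrument independent of ability
    z_proxy = ability + rng.normal(size=n)  # IQ-like score: noisy measure of ability
    # Education depends on the exogenous instrument and on ability.
    educ = 12 + 1.0 * z_valid + 1.0 * ability + rng.normal(size=n)
    # Log earnings depend on education (true return = beta) and on ability.
    log_earn = 1.0 + beta * educ + 0.5 * ability + rng.normal(scale=0.5, size=n)
    return educ, log_earn, z_valid, z_proxy


def ols_slope(x, y):
    """Slope from a simple regression of y on x."""
    c = np.cov(x, y)
    return c[0, 1] / c[0, 0]


def iv_slope(z, x, y):
    """Simple IV estimator: Cov(z, y) / Cov(z, x)."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]


if __name__ == "__main__":
    educ, log_earn, z_valid, z_proxy = simulate()
    print("true return        :", 0.10)
    print("OLS                :", ols_slope(educ, log_earn))
    print("IV, valid instr.   :", iv_slope(z_valid, educ, log_earn))
    print("IV, IQ-like instr. :", iv_slope(z_proxy, educ, log_earn))
```

Under these assumed parameters, OLS overstates the return to education, the instrument that is independent of ability recovers a value close to the true return, and the IQ-like instrument, although strongly correlated with education, gives an estimate that is further from the truth than OLS because it is correlated with the omitted ability.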