y_i = a_0 + a_1·X_1 + a_2·X_2 + a_3·X_3 + u_i
What is the effect of measurement error in Y? How is this different from the effect of measurement error in X?
Classical (mean-zero, independent) measurement error in the dependent variable (Y) is simply absorbed into the regression's error term, so the coefficient estimates remain unbiased and consistent; the only consequence is that the extra noise inflates the standard errors of those estimates.
On the other hand, measurement errors in the regressors (X), so that we observe x_i = x_i* + η_i rather than the true x_i*, lead to attenuation bias in a simple univariate regression model and, in general, to inconsistent coefficient estimates (meaning that the parameter estimates do not tend to the true values even in very large samples).
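To see the contrast concretely, here is a minimal Monte Carlo sketch; the variable names, noise levels, and the use of numpy are illustrative assumptions, not part of the original answer:

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta = 100_000, 2.0

x_star = rng.normal(0.0, 1.0, n)                  # true regressor, variance 1
y = 1.0 + beta * x_star + rng.normal(0.0, 1.0, n)

def ols_slope(x, y):
    # Slope of a simple OLS regression of y on x: Cov[x, y] / Var[x]
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# Noise in Y only: the slope estimate stays near beta = 2.0;
# the extra noise just widens its standard error.
y_noisy = y + rng.normal(0.0, 2.0, n)
print(ols_slope(x_star, y_noisy))                 # ~ 2.0

# Noise in X only: the slope is attenuated by 1 / (1 + 1) = 0.5.
x_noisy = x_star + rng.normal(0.0, 1.0, n)        # sigma_eta^2 = 1
print(ols_slope(x_noisy, y))                      # ~ 1.0
```

With variance-1 truth and variance-1 noise in X, the attenuation factor is 1/(1 + 1) = 0.5, so the second slope comes out near 1.0 even with a very large sample.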
Consider a simple linear regression model of the form

y_t = α + β·x*_t + ε_t,  t = 1, …, T,

where x*_t denotes the true but unobserved regressor. Instead we observe this value with an error:

x_t = x*_t + η_t,

where the measurement error η_t is assumed to be independent from the true value x*_t.
If the y_t's are simply regressed on the x_t's (see simple linear regression), then the estimator for the slope coefficient is

β̂ = Σ_t (x_t − x̄)(y_t − ȳ) / Σ_t (x_t − x̄)²,

which converges as the sample size T increases without bound:

plim β̂ = Cov[x_t, y_t] / Var[x_t] = β·σ²_{x*} / (σ²_{x*} + σ²_η),

where σ²_{x*} = Var[x*_t] and σ²_η = Var[η_t]; the last equality holds because Cov[x_t, y_t] = β·σ²_{x*} and Var[x_t] = σ²_{x*} + σ²_η when η_t is independent of x*_t.
Variances are non-negative, so that in the limit the estimate is smaller in magnitude than the true value of β, an effect which statisticians call attenuation or regression dilution. Thus the ‘naïve’ least squares estimator is inconsistent in this setting. However, the estimator is a consistent estimator of the parameter required for a best linear predictor of y given the observed x: in some applications this may be what is required, rather than an estimate of the ‘true’ regression coefficient β, although that would assume that the variance of the errors in observing x* remains fixed. This follows directly from the result quoted immediately above, and the fact that the regression coefficient relating the y_t's to the actually observed x_t's, in a simple linear regression, is given by

β_x = Cov[x_t, y_t] / Var[x_t].
It is this coefficient, rather than β, that would be required for constructing a predictor of y based on an observed x which is subject to noise.
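A short sketch of that point, under the same illustrative assumptions as above (variance-1 truth and variance-1 noise, so β_x = β/2): for predicting y from the noisy x, the attenuated coefficient β_x actually outperforms the ‘true’ β.

```python
import numpy as np

rng = np.random.default_rng(1)
n, beta = 100_000, 2.0

x_star = rng.normal(0.0, 1.0, n)                  # sigma_{x*}^2 = 1
y = 1.0 + beta * x_star + rng.normal(0.0, 1.0, n)
x = x_star + rng.normal(0.0, 1.0, n)              # observed with noise, sigma_eta^2 = 1

# beta_x = Cov[x, y] / Var[x] ~ beta * 1 / (1 + 1) = 1.0 (the attenuated slope)
beta_x = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
alpha_x = y.mean() - beta_x * x.mean()

# Prediction error of each coefficient when applied to the *noisy* x:
mse_attenuated = np.mean((y - (alpha_x + beta_x * x)) ** 2)
mse_true_beta = np.mean((y - (1.0 + beta * x)) ** 2)
print(mse_attenuated, mse_true_beta)              # ~ 3.0 vs ~ 5.0
```

The ‘true’ β overreacts to the noise component of x, so its prediction error is larger; β_x is exactly the best linear predictor coefficient for y given the observed x.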
It can be argued that almost all existing data sets contain errors of differing nature and magnitude, so that attenuation bias is extremely frequent (although in multivariate regression the direction of the bias is ambiguous). Jerry Hausman sees this as an iron law of econometrics: "The magnitude of the estimate is usually smaller than expected."