Can you please explain why, in a spurious regression, the coefficients between unrelated variables appear highly significant?
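In a spurious regression or spurious correlation, the variables of interest are not directly related to each other, but there is a "lurking" variable that is related to both the response and the predictor(s). Because both move with the lurking variable, they appear to move with each other.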
An easy example: if data are collected, one might find that ice-cream sales show a significant positive correlation with the number of deaths by drowning in a state. This happens because ice-cream sales usually go up during summer, and that is also the time when more people use the local pools, which leads to more drowning accidents. So here the lurking variable is the temperature.
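The ice-cream example can be checked with a small simulation. The sketch below is only an illustration under assumed numbers: the variable names, coefficients, and noise levels are made up and do not come from real data. Temperature drives both series, and neither series affects the other. The raw correlation between them comes out strongly positive, but once temperature is regressed out of both, the correlation between the leftover residuals is close to zero.

```python
import numpy as np

# All numbers below are hypothetical, chosen only to illustrate the idea.
rng = np.random.default_rng(0)
n = 500

# Lurking variable: daily temperature.
temperature = rng.normal(25, 5, n)

# Both series depend on temperature, but not on each other.
ice_cream_sales = 10 * temperature + rng.normal(0, 20, n)
drownings = 0.5 * temperature + rng.normal(0, 2, n)


def residuals(y, x):
    """Return what is left of y after removing a straight-line fit on x."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)


# Correlation when the lurking variable is ignored.
raw_corr = np.corrcoef(ice_cream_sales, drownings)[0, 1]

# Correlation after controlling for temperature (partial correlation).
partial_corr = np.corrcoef(
    residuals(ice_cream_sales, temperature),
    residuals(drownings, temperature),
)[0, 1]

print(f"raw correlation:     {raw_corr:.3f}")
print(f"partial correlation: {partial_corr:.3f}")
```

A regression of drownings on ice-cream sales alone would report a highly significant slope for the same reason: ice-cream sales act as a stand-in for the omitted temperature variable.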
Another example: a student's marks in an undergraduate degree might be found to be negatively correlated with their income after 5 years, at least in India. Here the variables do not seem unrelated, but the direction of the relationship is counter-intuitive. This also happens because of a lurking variable. Most students who score very highly go on to pursue a PhD after a master's degree, whereas many students who scored average marks take a job right away and are paid more than what universities pay their research scholars. Hence the negative correlation. Here the lurking variable is the individual's interest in pursuing a career in academics.