Consider the simple linear regression model
$Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, where the errors $\varepsilon_i$ are independent and identically distributed as $N(0, \sigma^2)$.
(a) If the predictors satisfy $\bar{x} = 0$, show that the least squares estimates $\hat{\beta}_0$ and $\hat{\beta}_1$ are independently distributed.
(b) Let $r$ be the sample correlation coefficient between the predictor and response. Under what conditions will we have $\hat{\beta}_1 = r$?
(c) Suppose that $\hat{\beta}_1 = r$, as in part (b), but make no assumptions on $\bar{x}$. Give the fitted regression equation (in terms of $\bar{x}$, $\bar{Y}$ and $r$) and use it to show that $|\hat{Y}_i - \bar{Y}| \le |x_i - \bar{x}|$. This inequality helps explain the meaning of the term “regression” since the fitted value $\hat{Y}_i$ is closer to the average $\bar{Y}$ than the predictor $x_i$ is to $\bar{x}$, i.e. it regresses toward the mean.
Solution
(a)
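We are told that $\bar{x} = 0$. Start from the general form of the OLS estimators of the intercept ($b_0$) and slope ($b_1$):
$b_0 = \bar{Y} - b_1\bar{x}$ ...Eq 1
$b_1 = \sum (x_i - \bar{x})(Y_i - \bar{Y}) / \sum (x_i - \bar{x})^2$ ...Eq 2
In general $b_0$ depends on $b_1$ through the term $b_1\bar{x}$ in Eq 1: $\mathrm{Cov}(b_0, b_1) = -\sigma^2 \bar{x} / \sum (x_i - \bar{x})^2$, which is nonzero whenever $\bar{x} \neq 0$. So if $b_1$ deviates from the true population slope $B_1$, then $b_0$ tends to deviate from the true population intercept $B_0$ as well.
If $\bar{x} = 0$, then Eq 1 and Eq 2 become
$b_0 = \bar{Y}$
$b_1 = \sum x_i (Y_i - \bar{Y}) / \sum x_i^2 = \sum x_i Y_i / \sum x_i^2$ (since $\bar{Y} \sum x_i = 0$)
Both estimates are linear combinations of the independent normal responses $Y_i$, so $(b_0, b_1)$ is jointly normal. Their covariance is
$\mathrm{Cov}(\bar{Y}, \sum x_i Y_i / \sum x_i^2) = \sigma^2 \sum x_i / (n \sum x_i^2) = 0$,
and jointly normal variables with zero covariance are independent. Hence $b_0$ and $b_1$ are independently distributed when $\bar{x} = 0$.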
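The role of $\bar{x}$ in part (a) can be checked numerically through the covariance. Both estimates are linear in the responses, $b_0 = \sum a_i Y_i$ and $b_1 = \sum c_i Y_i$, so for independent errors with common variance $\sigma^2$ we get $\mathrm{Cov}(b_0, b_1) = \sigma^2 \sum a_i c_i$. The following is a minimal sketch, not part of the original solution; the function name `cov_b0_b1` and the predictor values are illustrative assumptions.

```python
def cov_b0_b1(x, sigma2=1.0):
    """Exact Cov(b0, b1) for fixed predictors x and error variance sigma2."""
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    # b1 = sum(c_i * Y_i) with weights c_i = (x_i - xbar) / Sxx
    c = [(xi - xbar) / sxx for xi in x]
    # b0 = Ybar - b1 * xbar = sum(a_i * Y_i) with a_i = 1/n - xbar * c_i
    a = [1.0 / n - xbar * ci for ci in c]
    # Independent errors with equal variance: Cov = sigma2 * sum(a_i * c_i)
    return sigma2 * sum(ai * ci for ai, ci in zip(a, c))


# Hypothetical predictor values with mean 5, and the same values centered
x_raw = [1.0, 2.0, 4.0, 7.0, 11.0]
x_centered = [xi - 5.0 for xi in x_raw]

cov_raw = cov_b0_b1(x_raw)            # equals -xbar / Sxx = -5/66 here
cov_centered = cov_b0_b1(x_centered)  # zero up to floating-point rounding
print(cov_raw, cov_centered)
```

Centering the predictors leaves the slope weights unchanged and removes the $b_1\bar{x}$ term from the intercept, which is why the covariance vanishes.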
(b)
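We have
$r^2 = b_1^2 \left(\sum x_i^2 / \sum y_i^2\right)$, where $x_i = X_i - \bar{x}$ and $y_i = Y_i - \bar{Y}$
=> $r = b_1 (S_x / S_y)$, where $S_x$ and $S_y$ are the sample standard deviations of X and Y (both positive, so $r$ and $b_1$ have the same sign)
=> If the independent variable (X) has the same sample variance as the dependent variable (Y), then $S_x / S_y = 1$ and $r = b_1$.
i.e., the slope $b_1$ differs from the correlation coefficient $r$ only when the standard deviations of X and Y differ. The one other case is $r = 0$, where $b_1 = 0 = r$ whatever the standard deviations are.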
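The relation $r = b_1 (S_x / S_y)$ can be illustrated with a small numerical check. This is a sketch under assumptions: the helper `slope_r_ratio` and the data values are made up for illustration. The response is rescaled so that its standard deviation matches that of the predictor, after which the slope and the correlation coincide.

```python
import math


def slope_r_ratio(x, y):
    """Return (OLS slope b1, sample correlation r, ratio Sx/Sy)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    # The 1/(n-1) factors cancel in the ratio Sx/Sy = sqrt(Sxx/Syy)
    return sxy / sxx, sxy / math.sqrt(sxx * syy), math.sqrt(sxx / syy)


# Hypothetical data: Y varies about twice as much as X
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b1, r, ratio = slope_r_ratio(x, y)

# Rescale Y so that Sy equals Sx; r is unchanged, the slope shrinks to r
y_scaled = [yi * ratio for yi in y]
b1_s, r_s, ratio_s = slope_r_ratio(x, y_scaled)
print(b1, r, b1_s, r_s)
```

Because correlation is invariant to rescaling while the slope is not, matching the two standard deviations is what forces the slope down to $r$.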
(c)
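For the given population regression function $Y_i = B_0 + B_1 X_i$, the estimated regression is
$\hat{Y}_i = b_0 + b_1 X_i$, where $b_0$ and $b_1$ are the OLS estimators of $B_0$ and $B_1$
$= (\bar{Y} - b_1\bar{x}) + b_1 X_i$ (by substituting the value of $b_0$)
$= \bar{Y} + b_1 (X_i - \bar{x})$
$= \bar{Y} + r (X_i - \bar{x})$ (since $b_1 = r$) ... This is the fitted regression equation in terms of $\bar{Y}$, $r$, $\bar{x}$
=> $\hat{Y}_i - \bar{Y} = r (X_i - \bar{x})$ ..Eq 3
For the inequality, consider the correlation coefficient $r$:
1. $r$ takes values between -1 and 1, so $|r| \le 1$ whether $r$ is positive or negative.
2. If $r$ is negative then X and Y are inversely related, and if $r$ is positive then X and Y are directly related. The sign drops out once absolute values are taken.
Taking absolute values in Eq 3 gives $|\hat{Y}_i - \bar{Y}| = |r| \, |X_i - \bar{x}| \le |X_i - \bar{x}|$, with equality only when $|r| = 1$ or $X_i = \bar{x}$.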
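The inequality in part (c) can also be checked on data. The sketch below uses hypothetical values, rescales the response so that $S_y = S_x$ (which makes $b_1 = r$), fits the line by least squares, and compares each fitted deviation with the corresponding predictor deviation. All names and numbers are illustrative assumptions.

```python
import math

# Hypothetical data; the predictor mean is 6, so xbar is not zero
x = [2.0, 4.0, 5.0, 7.0, 12.0]
y_raw = [3.0, 1.0, 6.0, 5.0, 9.0]

n = len(x)
xbar = sum(x) / n
sxx = sum((xi - xbar) ** 2 for xi in x)

# Rescale the response so its standard deviation equals that of x
ybar_raw = sum(y_raw) / n
syy_raw = sum((yi - ybar_raw) ** 2 for yi in y_raw)
y = [yi * math.sqrt(sxx / syy_raw) for yi in y_raw]

ybar = sum(y) / n
syy = sum((yi - ybar) ** 2 for yi in y)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

b1 = sxy / sxx                  # OLS slope
r = sxy / math.sqrt(sxx * syy)  # sample correlation
b0 = ybar - b1 * xbar           # OLS intercept
fitted = [b0 + b1 * xi for xi in x]

# Each fitted value is at least as close to ybar as x_i is to xbar
holds = all(
    abs(fi - ybar) <= abs(xi - xbar) + 1e-12
    for fi, xi in zip(fitted, x)
)
print(b1, r, holds)
```

The small tolerance in the comparison only guards against floating-point rounding; the inequality itself is exact.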