Consider the following statements about unusual observations in linear regression models and pick the correct one.
A. It can happen that an outlier is neither influential nor has high leverage.
B. It can happen that an observation with high leverage is neither an outlier nor influential.
C. Both statements are correct.
D. Both statements are incorrect.
Consider the following statements about VIFs and choose the incorrect one.
A. If all the predictors are completely uncorrelated with each other, all VIFs would be 1.
B. If a predictor can be totally explained by other remaining predictors, the corresponding VIF would be infinite.
C. The VIF gives a sense of the effect of collinearity: “the standard error for the coefficient of predictor j is sqrt(VIF_j) times larger than it would have been without collinearity.”
D. A VIF of 0.9 indicates that there is a negative correlation between the corresponding predictor and other remaining predictors.
If we observe that the relationship between X and Y is quadratic, we can build the model as Y = b0 + b1*X^2. We don’t have to include b2*X.
True or False
In the model log(Y) = b0 + b1*log(X), the elasticity of Y is the percentage change in Y (the dependent variable), when X (the independent variable) increases by one unit.
True or False
Suppose you are trying to decide whether customers will repay a loan or not. You classify a customer who will default (Y = 1) as a non-defaulter (Y = 0). This is a Type I error.
False or True
log(p/(1-p)) = b0 + b1x means that as x increases by 1 unit:
a) the odds increase by a factor of exp(b1)
b) the natural log of the probability increases by b1
c) the odds increase by roughly b1%
d) the odds increase by exp(b1)
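Q1. Consider the following statements about unusual observations in linear regression models and pick the correct one.
Answer: C (Both statements are correct. Being an outlier, having high leverage, and being influential are three separate properties. An outlier is a point with a large residual; if its X value lies near the mean of X, it has low leverage and removing it barely changes the slope, so it is not influential. Conversely, a point with an extreme X value has high leverage, but if it falls close to the line fitted to the remaining data, its residual is small and it hardly changes the fit. To check a given data set, try building the model with and without the suspect point and compare the slope coefficients.)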
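Both situations in Q1 can be checked numerically. The sketch below uses made-up data (the seed, the noise level, and the positions of the two added points are assumptions chosen only for illustration) and compares the fitted slope with and without each added point.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares intercept and slope for y ~ x."""
    slope, intercept = np.polyfit(x, y, 1)
    return intercept, slope

def leverage(x):
    """Hat-matrix diagonal for a simple regression with an intercept."""
    centered = x - x.mean()
    return 1.0 / len(x) + centered ** 2 / (centered ** 2).sum()

rng = np.random.default_rng(1)
x = np.arange(21, dtype=float)                    # 0..20, mean 10
y = 2.0 + 3.0 * x + rng.normal(scale=1.0, size=x.size)
_, base_slope = fit_line(x, y)

# Statement A: an outlier in Y placed at the mean of X.
xa = np.append(x, 10.0)
ya = np.append(y, 2.0 + 3.0 * 10.0 + 15.0)        # 15 units above the true line
_, slope_a = fit_line(xa, ya)
lev_a = leverage(xa)[-1]                          # smallest possible leverage, 1/n

# Statement B: an extreme X value that lies on the true line.
xb = np.append(x, 60.0)
yb = np.append(y, 2.0 + 3.0 * 60.0)
int_b, slope_b = fit_line(xb, yb)
lev_b = leverage(xb)[-1]                          # large, because X is far from its mean
resid_b = yb[-1] - (int_b + slope_b * xb[-1])     # small, so the point is not an outlier

print(f"base slope {base_slope:.3f}, with outlier {slope_a:.3f}, with leverage point {slope_b:.3f}")
print(f"leverage of outlier {lev_a:.3f}, leverage of far point {lev_b:.3f}")
```

In the first case the slope is unchanged because the added point sits exactly at the mean of X; only the intercept shifts. In the second case the leverage is high, yet the slope moves very little.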
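Q2. Consider the following statements about VIFs and choose the incorrect one.
Answer: D (because VIF_j = 1/(1 - R_j^2), where R_j^2 is the R-squared from regressing predictor j on the remaining predictors and always lies between 0 and 1. The VIF is therefore always at least 1, so a value of 0.9 cannot occur. The VIF also carries no information about the sign of a correlation.)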
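The formula VIF = 1/(1 - R2) can be computed directly by regressing each predictor on the others. The following is a minimal sketch with NumPy; the function name `vif` and the simulated predictors are assumptions for illustration, not part of the question.

```python
import numpy as np

def vif(X):
    """Return the VIF of each column of X (rows = observations)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = []
    for j in range(p):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])   # intercept + remaining predictors
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ coef
        r2 = 1.0 - (resid @ resid) / ((target - target.mean()) ** 2).sum()
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
x2 = rng.normal(size=500)                 # independent of x1
x3 = x1 + 0.1 * rng.normal(size=500)      # nearly a copy of x1

vifs_indep = vif(np.column_stack([x1, x2]))
vifs_collinear = vif(np.column_stack([x1, x2, x3]))
print(vifs_indep)       # both close to 1
print(vifs_collinear)   # large for x1 and x3, close to 1 for x2
```

Whatever the data, no entry can fall below 1, since R2 from a regression with an intercept is never negative.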
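Q3. If we observe that the relationship between X and Y is quadratic, we can build the model as Y = b0 + b1*X^2. We don’t have to include b2*X.
Answer: False. Both y = b0 + b1*X^2 and y = b0 + b1*X^2 + b2*X are quadratic in X, but they are not equivalent. The first forces the turning point of the parabola to sit at X = 0, while the second lets the data decide where the turning point is. Unless there is a specific reason to believe the turning point is at X = 0, the linear term should stay in the model alongside the squared term.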
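To see what dropping the linear term costs, the sketch below fits both forms to noise-free data from a parabola whose turning point is at X = 4. The data-generating coefficients are assumptions made up for this illustration.

```python
import numpy as np

x = np.linspace(0.0, 10.0, 50)
y = 1.0 + 4.0 * x - 0.5 * x ** 2          # turning point at x = 4, no noise

# Restricted model: Y = b0 + b1*X^2 (no linear term)
A_restricted = np.column_stack([np.ones_like(x), x ** 2])
coef_r, *_ = np.linalg.lstsq(A_restricted, y, rcond=None)
sse_restricted = float(((y - A_restricted @ coef_r) ** 2).sum())

# Full model: Y = b0 + b2*X + b1*X^2
A_full = np.column_stack([np.ones_like(x), x, x ** 2])
coef_f, *_ = np.linalg.lstsq(A_full, y, rcond=None)
sse_full = float(((y - A_full @ coef_f) ** 2).sum())

print("restricted SSE:", sse_restricted)
print("full SSE:", sse_full)
print("full coefficients (b0, linear, squared):", coef_f)
```

The full model recovers the generating coefficients and fits essentially exactly, while the restricted model leaves a clearly nonzero residual sum of squares because it cannot move the turning point away from zero.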
Q4. In the model log(Y) = b0 + b1*log(X), the elasticity of Y is the percentage change in Y (the dependent variable), when X (the independent variable) increases by one unit.
Answer: False. In the log-log model, b1 is the elasticity: the percentage change in Y when X increases by one percent, not by one unit. Differentiating the model gives delta(Y)/Y = b1 * (delta(X)/X), so a 1% change in X is associated with approximately a b1% change in Y.
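A quick numerical check of the log-log model makes the difference between "one percent" and "one unit" concrete. The coefficients b0 = 1.0 and b1 = 0.5 below are hypothetical values chosen for the example.

```python
import math

b0, b1 = 1.0, 0.5   # hypothetical coefficients

def y_of(x):
    """Y implied by log(Y) = b0 + b1*log(X)."""
    return math.exp(b0 + b1 * math.log(x))

def pct_change(x_old, x_new):
    """Percentage change in Y when X moves from x_old to x_new."""
    return 100.0 * (y_of(x_new) / y_of(x_old) - 1.0)

starts = (2.0, 20.0, 200.0)
pct_for_one_percent = [pct_change(x, 1.01 * x) for x in starts]
pct_for_one_unit = [pct_change(x, x + 1.0) for x in starts]

print(pct_for_one_percent)   # roughly b1 = 0.5 at every starting value
print(pct_for_one_unit)      # depends strongly on the starting value
```

A 1% increase in X gives about the same percentage change in Y wherever X starts, which is what makes b1 an elasticity. A one-unit increase does not have this property.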
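Q5. Suppose you are trying to decide whether customers will repay a loan or not. You classify a customer who will default (Y = 1) as a non-defaulter (Y = 0). This is a Type I error.
Answer: False. With default (Y = 1) as the positive class, predicting "non-defaulter" for a customer who actually defaults is a False Negative, which is a Type II error. A Type I error (False Positive) would be the opposite mistake: classifying a customer as a defaulter when he is actually not a defaulter.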
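The two kinds of misclassification can be spelled out in a few lines. This is only a sketch: it assumes the convention from the question (1 = default is the positive class), and the labels are made up.

```python
# Convention from the question: 1 = default (positive class), 0 = no default.
actual = [1, 1, 0, 0, 1, 0]      # made-up true outcomes
predicted = [0, 1, 0, 1, 1, 0]   # made-up model predictions

def error_type(actual_label, predicted_label):
    """Name the outcome of a single prediction."""
    if actual_label == predicted_label:
        return "correct"
    if actual_label == 0 and predicted_label == 1:
        return "Type I (false positive)"
    return "Type II (false negative)"

outcomes = [error_type(a, p) for a, p in zip(actual, predicted)]
for a, p, outcome in zip(actual, predicted, outcomes):
    print(f"actual={a} predicted={p}: {outcome}")
```

The first customer in the list is the case described in Q5: an actual defaulter predicted as a non-defaulter.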
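Q6. log(p/(1-p)) = b0 + b1x means that as x increases by 1 unit:
Answer: A (i.e. the odds increase by a factor of exp(b1)). p/(1-p) gives the odds, so if log(odds) = b0 + b1x, then the odds = exp(b0 + b1x). Increasing x by 1 gives exp(b0 + b1(x+1)) = exp(b1) * exp(b0 + b1x), so the odds are multiplied by exp(b1). The change is multiplicative, not additive, which is why option d is wrong.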
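The multiplicative effect on the odds can be verified directly. The coefficients b0 = -1.0 and b1 = 0.7 are hypothetical values used only for this check.

```python
import math

b0, b1 = -1.0, 0.7   # hypothetical coefficients

def prob(x):
    """p implied by log(p/(1-p)) = b0 + b1*x."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

def odds(x):
    p = prob(x)
    return p / (1.0 - p)

starts = (-2.0, 0.0, 3.0)
ratios = [odds(x + 1.0) / odds(x) for x in starts]
differences = [odds(x + 1.0) - odds(x) for x in starts]

print(ratios, math.exp(b1))   # every ratio equals exp(b1)
print(differences)            # the additive change varies with x
```

The ratio of odds is the same at every starting value of x, while the difference in odds is not.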