Give three ways to check for outliers in a regression analysis.
Three ways to check for outliers in a regression analysis:
(i) Hi (leverage):
Leverage (Hi) measures the distance from an observation's x-value to the average of the x-values of all observations in the data set. It is used to identify observations that have unusual predictor values compared to the rest of the data.
(ii) Cook's distance (D):
Cook's distance measures the distance between the fitted values calculated with and without the ith observation. Because it combines an observation's leverage with the size of its residual, it identifies both observations that have unusual predictor values compared to the rest of the data and observations that the model does not fit well.
(iii) DFITS:
DFITS represents approximately the number of standard deviations by which the fitted value for an observation changes when that observation is removed from the data set and the model is refit. It is used to identify observations that have unusual predictor values compared to the rest of the data and observations that the model does not fit well.
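
As a rough illustration of the three measures above, here is a minimal Python sketch for a simple regression with one predictor. The function name `influence_diagnostics`, the data values, and the cutoffs are assumptions made for this example, not part of the original answer. The cutoffs (2p/n for leverage, 4/n for Cook's distance, 2*sqrt(p/n) for DFITS) are common rules of thumb, and other references use different ones. DFITS is also written DFFITS in many textbooks.

```python
import math

def influence_diagnostics(x, y):
    """Return (leverage, Cook's D, DFITS) per observation for y = b0 + b1*x."""
    n = len(x)
    p = 2  # estimated coefficients: intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in resid)
    mse = sse / (n - p)

    rows = []
    for xi, e in zip(x, resid):
        # Leverage: grows with the squared distance of x from the mean of x.
        h = 1 / n + (xi - x_bar) ** 2 / sxx
        # Cook's distance: residual size combined with leverage.
        cooks_d = (e ** 2 / (p * mse)) * h / (1 - h) ** 2
        # Error variance of the model refit without this observation.
        s2_deleted = (sse - e ** 2 / (1 - h)) / (n - p - 1)
        # Externally studentized residual, then DFITS.
        t = e / math.sqrt(s2_deleted * (1 - h))
        dfits = t * math.sqrt(h / (1 - h))
        rows.append((h, cooks_d, dfits))
    return rows

# Hypothetical data: the last point has an extreme x-value and falls
# well below the trend of the other nine points.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 25.0]

rows = influence_diagnostics(x, y)
n, p = len(x), 2
for i, (h, d, f) in enumerate(rows, start=1):
    flags = []
    if h > 2 * p / n:
        flags.append("high leverage")
    if d > 4 / n:
        flags.append("large Cook's D")
    if abs(f) > 2 * math.sqrt(p / n):
        flags.append("large DFITS")
    print(f"obs {i}: Hi={h:.3f}  D={d:.3f}  DFITS={f:.3f}  {', '.join(flags)}")
```

In practice a statistics package would normally report these values directly. The hand computation here is only meant to show how each measure is built from the residuals and the leverages.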