Consider the following multiple regression equation relating a machinist's performance rating on a new machine (RATING) to the following three independent variables.
WKEX - number of years work experience as a machinist
TSCORE - mechanical aptitude score
YEARS - age
Predicted RATING = 12.5 + 0.8 WKEX + 0.32 TSCORE + 0.3 YEARS
a) Explain all of the steps for finding the value of R^2 (coefficient of determination) that is needed to compute the variance inflation factor (VIF) for the independent variable YEARS.
The coefficient of determination, R^2, is a measure of how well a model predicts or explains an outcome in the linear regression setting. More specifically, R^2 is the proportion of the variance in the dependent variable (Y) that is explained by the linear regression on the predictor variables (X), also known as the independent variables.
In general, a high R^2 value indicates that the model fits the data well, although how high is "high enough" depends on the context of the analysis.
It is computed as
R^2 = MSS/TSS = (TSS − RSS)/TSS,
where MSS is the model sum of squares (also known as ESS, or explained sum of squares), the sum of squared differences between the regression predictions and the mean of the outcome; TSS is the total sum of squares, the sum of squared differences between the observed values and their mean; and RSS is the residual sum of squares, the sum of squared differences between the observed values and the regression predictions. The second equality uses MSS = TSS − RSS, which holds for a least-squares fit that includes an intercept.
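As a rough illustration of the three sums of squares, here is a minimal Python sketch using NumPy. The helper name r_squared and the five data points are made up for this example; they are not part of the original problem.

```python
import numpy as np

def r_squared(X, y):
    """Least-squares fit of y on X (intercept included).

    Returns (R^2, MSS, RSS, TSS). Hypothetical helper for illustration only.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X.reshape(-1, 1)
    # Design matrix: a column of ones for the intercept, then the predictors.
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    y_hat = A @ beta
    tss = np.sum((y - y.mean()) ** 2)      # total sum of squares
    rss = np.sum((y - y_hat) ** 2)         # residual sum of squares
    mss = np.sum((y_hat - y.mean()) ** 2)  # model (explained) sum of squares
    return mss / tss, mss, rss, tss

# Made-up data, not taken from the problem statement.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
r2, mss, rss, tss = r_squared(x, y)
print("R^2 =", r2)
print("MSS + RSS - TSS =", mss + rss - tss)  # should be numerically close to zero
```

Because the fit includes an intercept, MSS/TSS and (TSS − RSS)/TSS should agree up to rounding error.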
A variance inflation factor (VIF) detects multicollinearity in regression analysis. Multicollinearity occurs when the predictors (i.e. independent variables) in a model are correlated with one another; its presence can adversely affect your regression results. The VIF estimates how much the variance of a regression coefficient is inflated due to multicollinearity in the model.
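VIFs are calculated by taking a predictor and regressing it against every other predictor in the model. This gives you an R-squared value, which can then be plugged into the VIF formula, where “i” is the predictor you’re looking at (e.g. x1 or x2):
VIF_i = 1 / (1 − R_i^2)
Here the predictor i corresponds to the independent variable YEARS, so the steps are:
1. Treat YEARS as the dependent variable and regress it on the other two predictors, WKEX and TSCORE (RATING is not used in this regression).
2. From that auxiliary regression, compute TSS and RSS for YEARS and obtain R^2 = (TSS − RSS)/TSS.
3. Plug this R^2 into the formula above to get the VIF for YEARS. A value of R^2 close to 1 gives a large VIF, meaning YEARS is nearly a linear combination of WKEX and TSCORE.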
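The procedure for the VIF of YEARS can be sketched in Python with NumPy. Since the problem gives no raw data, the values below are simulated, and the function name vif and all numbers are assumptions made for illustration only.

```python
import numpy as np

def vif(predictors, target_col):
    """Regress one predictor column on all the others (with intercept).

    Returns (R^2 of that auxiliary regression, VIF = 1 / (1 - R^2)).
    Hypothetical helper; it would divide by zero under perfect collinearity.
    """
    P = np.asarray(predictors, dtype=float)
    y = P[:, target_col]                       # predictor being checked
    others = np.delete(P, target_col, axis=1)  # remaining predictors
    A = np.column_stack([np.ones(len(y)), others])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = np.sum((y - A @ beta) ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - rss / tss
    return r2, 1.0 / (1.0 - r2)

# Simulated data (assumption): age tracks work experience plus noise.
rng = np.random.default_rng(0)
n = 200
wkex = rng.uniform(0, 30, n)             # years of work experience
tscore = rng.normal(70, 10, n)           # mechanical aptitude score
years = 20 + wkex + rng.normal(0, 3, n)  # age

X = np.column_stack([wkex, tscore, years])
r2_years, vif_years = vif(X, target_col=2)  # column 2 is YEARS
print("R^2 for YEARS on WKEX and TSCORE:", r2_years)
print("VIF for YEARS:", vif_years)
```

Note that RATING never enters this calculation; the VIF depends only on how the predictors relate to one another.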