1. You are given a regression result for a model with k explanatory variables. Answer the following parts:
a) How do you tell that a certain variable is influential?
b) Suppose theory implies that a linear constraint holds among the coefficients; how do you check whether the constraint holds?
c) Suppose you have two sets of explanatory variables; how do you decide which set is the better one?
d) What is the meaning of R-squared? Should we always look for the model that has the highest R-squared?
e) Suppose the explanatory variables are subject to linear dependence among themselves, what is the correct procedure to estimate the coefficients?
a) We can test the influence (significance) of a variable with a t-test on its coefficient. The hypotheses are H0: bi = 0 versus H1: bi not equal to 0. If the p-value of the t statistic is less than the chosen significance level (conventionally 0.05), we reject the null hypothesis and conclude that the coefficient is different from zero. This means that the variable is influential (statistically significant).
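As a rough illustration of the t-test described in a), here is a minimal sketch for the simplest case of one explanatory variable plus an intercept. The function name `slope_t_test`, the data, and the choice to pass a critical value instead of computing a p-value are all assumptions made for this example; with k variables the same idea applies to each coefficient, using n - k - 1 degrees of freedom when k counts the regressors without the intercept.

```python
import math

def slope_t_test(x, y, t_critical):
    """Two-sided t-test of H0: slope = 0 in y = b0 + b1*x + error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual sum of squares and the standard error of the slope
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    std_error = math.sqrt(rss / (n - 2) / sxx)
    t_stat = slope / std_error
    return slope, std_error, t_stat, abs(t_stat) > t_critical

# Hypothetical data, made up for illustration only.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 20.2]

# 2.306 is the two-sided 5% critical value of the t distribution
# with n - 2 = 8 degrees of freedom.
slope, std_error, t_stat, significant = slope_t_test(x, y, t_critical=2.306)
print(f"slope={slope:.3f}, se={std_error:.4f}, t={t_stat:.1f}, significant={significant}")
```

Comparing |t| with the critical value is equivalent to comparing the p-value with 0.05; statistical packages usually report the p-value directly.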
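b) We can test whether a linear constraint holds by using restricted regression. The model is estimated twice, once with the constraint imposed (restricted) and once without it (unrestricted), and the two residual sums of squares are compared with an F-test. If the F statistic is small, so that its p-value is above the significance level, we do not reject the constraint.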
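One way to sketch the restricted-regression test in b) is to compute the F statistic from the restricted and unrestricted residual sums of squares. The example below assumes a single regressor and a hypothetical constraint "slope = 1"; the helper names `ols_rss` and `f_statistic`, the data, and the parameter `n_params` (number of coefficients estimated in the unrestricted model, intercept included) are assumptions for illustration.

```python
def ols_rss(x, y):
    """Residual sum of squares from an OLS fit of y on an intercept and x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))

def f_statistic(rss_restricted, rss_unrestricted, q, n, n_params):
    """F statistic for q linear restrictions, with n observations."""
    numerator = (rss_restricted - rss_unrestricted) / q
    denominator = rss_unrestricted / (n - n_params)
    return numerator / denominator

# Hypothetical data, made up so that the slope is close to 1.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [1.2, 1.9, 3.1, 4.2, 4.8, 6.1, 7.0, 7.9, 9.2, 10.1]
n = len(x)

# Unrestricted model: intercept and slope both estimated.
rss_u = ols_rss(x, y)

# Restricted model: slope fixed at 1, so y - x = b0 + error
# and only the intercept (the mean of y - x) is estimated.
d = [yi - xi for xi, yi in zip(x, y)]
d_bar = sum(d) / n
rss_r = sum((di - d_bar) ** 2 for di in d)

f_stat = f_statistic(rss_r, rss_u, q=1, n=n, n_params=2)

# 5.32 is approximately the 5% critical value of F(1, 8).
constraint_rejected = f_stat > 5.32
print(f"F={f_stat:.4f}, constraint rejected: {constraint_rejected}")
```

Here the F statistic is far below the critical value, so these made-up data do not reject the constraint.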
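c) Of the two sets of explanatory variables, the better one is chosen on the basis of significance: the set whose variables have the smaller p-values is considered better. When the two sets contain different numbers of variables, adjusted R-squared gives a fairer comparison than R-squared, because it penalizes additional regressors.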
d) R-squared is the coefficient of determination, which measures the goodness of fit of the model. For example, an R-squared of 0.81 means that 81% of the variation in the dependent variable is explained by the independent variables. We should not always seek a high R-squared, because what counts as high depends on the purpose of the study and on the field. For example, in Physics an R-squared of 0.81 might be too low, but in Psychology 0.45 might be a very high R-squared. R-squared also never decreases when another variable is added, so a high value alone does not show that a model is better.
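The definition in d) can be sketched in a few lines. The functions below compute R-squared as 1 - RSS/TSS from observed and fitted values, and the adjusted R-squared that corrects for the number of estimated coefficients. The function names, the data, and the values n = 30 and `n_params` = 6 (intercept included) are assumptions chosen for illustration.

```python
def r_squared(y, fitted):
    """Share of the variation in y explained by the fitted values."""
    y_bar = sum(y) / len(y)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained variation
    tss = sum((yi - y_bar) ** 2 for yi in y)                # total variation
    return 1 - rss / tss

def adjusted_r_squared(r2, n, n_params):
    """Adjusted R-squared; n_params counts all coefficients, intercept included."""
    return 1 - (1 - r2) * (n - 1) / (n - n_params)

# Hypothetical observed and fitted values.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
fitted = [1.1, 1.9, 3.2, 3.8, 5.0]
r2 = r_squared(y, fitted)

# The 0.81 from the answer, with an assumed sample size and model size.
adj = adjusted_r_squared(0.81, n=30, n_params=6)
print(f"R-squared={r2:.3f}, adjusted R-squared for 0.81: {adj:.4f}")
```

Because the adjustment grows with the number of coefficients, adjusted R-squared can fall when a variable with little explanatory power is added, which plain R-squared cannot do.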
e) If the explanatory variables are linearly dependent among themselves, the model suffers from multicollinearity. Under exact linear dependence the OLS coefficients cannot be estimated uniquely; under near dependence they can be estimated, but the OLS estimators perform poorly in terms of variance. So the multicollinearity must be dealt with before estimation, for example by dropping or combining the redundant variables.
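Before choosing a remedy for the multicollinearity in e), it helps to measure it. A common diagnostic is the variance inflation factor (VIF), 1 / (1 - Rj^2), where Rj^2 comes from regressing variable j on the other explanatory variables. The sketch below assumes only two regressors, in which case Rj^2 is simply their squared correlation. The function names, the data, and the rule-of-thumb threshold of 10 are assumptions for illustration, not a fixed standard.

```python
import math

def squared_correlation(a, b):
    """Squared sample correlation between two equally long sequences."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    saa = sum((ai - a_bar) ** 2 for ai in a)
    sbb = sum((bi - b_bar) ** 2 for bi in b)
    sab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    return sab ** 2 / (saa * sbb)

def vif_two_regressors(a, b):
    """VIF shared by both regressors when the model has exactly two of them."""
    r2 = squared_correlation(a, b)
    if r2 >= 1.0:
        return math.inf  # exact linear dependence: OLS has no unique solution
    return 1 / (1 - r2)

# Hypothetical regressors: x2 is almost exactly 2 * x1, x3 is not.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2.1, 3.9, 6.2, 8.1, 9.8, 12.1]
x3 = [3, 1, 4, 1, 5, 9]

vif_collinear = vif_two_regressors(x1, x2)
vif_mild = vif_two_regressors(x1, x3)
print(f"VIF(x1, x2)={vif_collinear:.0f}, VIF(x1, x3)={vif_mild:.2f}")

# A VIF above about 10 is often read as a sign of serious multicollinearity.
drop_candidate = vif_collinear > 10
```

With x1 and x2 in the same model, the very large VIF suggests dropping one of them or combining them into a single variable before estimating the coefficients.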