Question

1. What is multicollinearity?

2. What sample correlation coefficient values between two x's "warn" of a potential problem due to multicollinearity, and what is that problem?

3. Can an independent variable in multiple linear regression be a categorical variable?

4. If not, why not? If yes, how should the categorical variable be worked into the regression?

Solutions

Expert Solution

1) Multicollinearity occurs in multiple linear regression when two or more independent variables are highly correlated with each other. It is a problem because the regression cannot cleanly separate the individual effect of each correlated variable on the response. It does not by itself distort the coefficient of determination or the model's predictions, but it makes the individual coefficient estimates and their significance tests unreliable, which can be misleading.

2) As a general rule of thumb, a correlation coefficient between two independent variables greater than 0.8 in absolute value (above 0.8 or below -0.8) warns of a potential multicollinearity problem. In some cases a cutoff of 0.7 or 0.9 is used instead.
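The rule of thumb above can be checked mechanically. The following is a minimal sketch, assuming the predictors are available as equal-length lists of numbers; the function names, the variable names, and the data are hypothetical and chosen only for illustration. The 0.8 cutoff is the one quoted above and can be changed to 0.7 or 0.9.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Sample Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def flag_correlated_pairs(predictors, threshold=0.8):
    """Return (name_a, name_b, r) for every pair of predictors with |r| > threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(predictors.items(), 2):
        r = pearson_r(a, b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Hypothetical data: the same height recorded in two units, plus an unrelated age.
predictors = {
    "height_cm": [150, 160, 165, 172, 180, 191],
    "height_in": [59.1, 63.0, 65.0, 67.7, 70.9, 75.2],
    "age": [34, 21, 45, 29, 52, 38],
}

flagged = flag_correlated_pairs(predictors)
for name_a, name_b, r in flagged:
    print(f"{name_a} vs {name_b}: r = {r:.3f}")
```

Only the two height columns are flagged here, since they measure the same quantity. Note that pairwise correlations can miss multicollinearity that involves three or more variables at once.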

The following problems can arise if two variables with a correlation above 0.8 in absolute value are both used in a multiple linear regression:

i) The coefficient estimates can swing wildly depending on which other independent variables are present in the model. The coefficients become very sensitive to small changes in the model or the data.

ii) Multicollinearity reduces the precision of the estimated coefficients (their standard errors grow), which weakens the statistical power of the tests on individual coefficients.

iii) The overall fit can look strong (a high coefficient of determination) even though few or none of the individual coefficients are statistically significant, which gives a misleading picture of how much each variable contributes.

iv) The variance inflation factor (VIF) of the affected variables becomes high.
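To put numbers on the loss of precision and the variance inflation factor, here is a minimal sketch for the special case of a model with exactly two predictors. In that case the VIF of either predictor is 1 / (1 - r^2), where r is their sample correlation, and the standard error of each coefficient is multiplied by the square root of the VIF compared with uncorrelated predictors. The function name is hypothetical; with more than two predictors the VIF has to be computed from a regression of each predictor on all the others.

```python
from math import sqrt


def vif_two_predictors(r):
    """VIF of either predictor in a model with exactly two predictors
    whose sample correlation is r: VIF = 1 / (1 - r^2)."""
    if abs(r) >= 1:
        raise ValueError("perfectly collinear predictors: VIF is undefined")
    return 1.0 / (1.0 - r * r)


# Compare a few correlation levels, including the 0.7 / 0.8 / 0.9 cutoffs.
for r in (0.0, 0.7, 0.8, 0.9, 0.99):
    vif = vif_two_predictors(r)
    print(f"r = {r:4.2f}  VIF = {vif:7.2f}  std. error multiplier = {sqrt(vif):5.2f}")
```

With r = 0.8 the VIF is about 2.8, and with r = 0.9 it is about 5.3, so the inflation grows quickly as the correlation approaches 1.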

------------------------------------

3) Yes, a categorical variable can definitely be used as an independent variable in a multiple linear regression.

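4) We can use categorical variables by coding them as numeric indicator (dummy) variables. For example, a categorical variable with two levels can be coded as '1' and '0' for its two levels and used like any other variable in multiple linear regression.

If it has three levels, use two indicator variables, each coded '1' for one level and '0' otherwise, with the third level serving as the reference category (both indicators equal to '0'). In general, a variable with k levels needs k - 1 indicators. Coding three levels as '-1', '0' and '1' in a single variable is appropriate only when the levels are ordered and assumed to be equally spaced, because it forces one common slope across the levels.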
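For a categorical variable whose levels have no natural order, a common approach is indicator (dummy) coding: one 0/1 column per level, leaving out one reference level so that k levels produce k - 1 columns. The sketch below illustrates this; the function name, the `is_` column prefix, and the `region` data are hypothetical.

```python
def dummy_code(values, reference=None):
    """Turn a list of category labels into k - 1 indicator columns.

    The reference level gets no column of its own; it is represented by
    all indicators being 0. Defaults to the first level in sorted order.
    """
    levels = sorted(set(values))
    if reference is None:
        reference = levels[0]
    if reference not in levels:
        raise ValueError(f"reference level {reference!r} not found in data")
    columns = {}
    for level in levels:
        if level == reference:
            continue  # omitted to avoid perfect collinearity with the intercept
        columns[f"is_{level}"] = [1 if v == level else 0 for v in values]
    return columns


# Hypothetical three-level variable, with "north" as the reference level.
region = ["north", "south", "west", "south", "north"]
dummies = dummy_code(region, reference="north")
for name, column in dummies.items():
    print(name, column)
```

Leaving out the reference level matters: including an indicator for every level alongside an intercept makes the columns sum to the intercept column, which is an exact case of multicollinearity.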
