Question

In a study regressing consumption expenditure (Y) on income (X2) and wealth (X3), based on 10 observations, the following equation was obtained:

Yhat = 24.337   + 0.08764 X2  - 0.0349 X3
SE   = (6.2801)   (0.31438)     (0.0301)
t    = (3.875)    (2.7726)      (-1.1595)

i) In your opinion, what type of problem is present in this result? Explain.

ii) What are the practical consequences of this problem?

iii) What are the theoretical consequences of this problem?

iv) How to detect this problem?

v) In your judgement, what should be done to remove this problem?

vi) How do we remedy this problem?

Solutions

Expert Solution

i) The problem is multicollinearity.

Explanation: Multicollinearity is a state of very high intercorrelation among the independent variables. It is a property of the data, and when it is present, statistical inferences about the individual coefficients may not be reliable. Here income and wealth are likely to be highly correlated, since wealthier households tend to have higher incomes. The output is consistent with this: the wealth coefficient is negative, contrary to what one would expect, and it is statistically insignificant (|t| = 1.1595).
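As a quick check on the printed output, each t ratio should equal the coefficient divided by its standard error. The sketch below recomputes the ratios from the figures given in the question and compares them with the two-tailed 5% critical value for 10 - 3 = 7 degrees of freedom (about 2.365, taken from a standard t table). The variable names are illustrative only.

```python
# Coefficients and standard errors exactly as printed in the question
names = ["intercept", "X2 (income)", "X3 (wealth)"]
coef = [24.337, 0.08764, -0.0349]
se = [6.2801, 0.31438, 0.0301]

# Two-tailed 5% critical value for 7 degrees of freedom (from a t table)
T_CRIT = 2.365

# t ratio = coefficient / standard error
t_ratios = [b / s for b, s in zip(coef, se)]

for name, t in zip(names, t_ratios):
    verdict = "significant" if abs(t) > T_CRIT else "not significant"
    print(f"{name:12s} t = {t:8.4f}  ({verdict} at 5%)")
```

With the printed figures, the intercept and wealth ratios match the reported t values, and the wealth coefficient is not significant. The ratio for X2 works out to roughly 0.279 rather than the reported 2.7726, so one of the printed X2 figures may contain a typo.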

ii) Multicollinearity can result in several practical problems. These problems are as follows:

  • Multicollinearity is a feature of the data, not of the model specification. The regression can still be estimated, but the individual coefficient estimates will not be reliable.
  • The standard errors of the coefficients are larger than they would otherwise be, so the estimated coefficients cannot be relied on for forecasting with other datasets.

iii) Theoretical consequences:

  • Because of multicollinearity, the partial regression coefficients may not be estimated precisely; their standard errors are likely to be high.
  • Multicollinearity can cause the signs as well as the magnitudes of the partial regression coefficients to change from one sample to another.
  • Multicollinearity makes it difficult to assess the relative importance of the independent variables in explaining the variation in the dependent variable.

iv) How to detect this problem?

Several signals help to detect multicollinearity:

  • The overall regression is significant, but the individual coefficients are not, or the results are a mix of significant and insignificant coefficients.
  • After splitting the sample into two parts, the estimated coefficients differ drastically between the two parts. This shows that the coefficients are unstable.
  • Adding or dropping a variable changes the estimated model drastically.
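The add-or-drop signal can be illustrated with a small sketch. The data here are hypothetical (generated with a fixed seed, not taken from the study), and `ols` is a helper written for this example rather than a library function. It fits consumption on income and wealth, then on income alone, and compares the standard error of the income coefficient.

```python
import numpy as np

def ols(X, y):
    """OLS with an intercept; returns coefficients and their standard errors."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sigma2 = resid @ resid / (len(y) - A.shape[1])  # residual variance
    cov = sigma2 * np.linalg.inv(A.T @ A)           # coefficient covariance
    return beta, np.sqrt(np.diag(cov))

# Hypothetical data: wealth is almost an exact multiple of income
rng = np.random.default_rng(0)
income = np.linspace(80, 260, 10)
wealth = 10 * income + rng.normal(0, 20, 10)
y = 25 + 0.6 * income + rng.normal(0, 6, 10)

r = np.corrcoef(income, wealth)[0, 1]
b_full, se_full = ols(np.column_stack([income, wealth]), y)
b_drop, se_drop = ols(income, y)

print(f"corr(income, wealth)     = {r:.4f}")
print(f"income coef, both in     = {b_full[1]:.3f} (SE {se_full[1]:.3f})")
print(f"income coef, wealth out  = {b_drop[1]:.3f} (SE {se_drop[1]:.3f})")
```

When the two predictors are this strongly correlated, the standard error of the income coefficient is much larger in the model that also contains wealth, which is the instability described above.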

Multicollinearity can also be detected with tolerance and its reciprocal, the variance inflation factor (VIF). A common rule of thumb treats multicollinearity as problematic when tolerance falls below 0.1 (equivalently, VIF above 10); a stricter rule uses tolerance below 0.2 (VIF above 5).

The VIF estimates how much the variance of a regression coefficient is inflated by correlation among the predictors (the independent variables), relative to the case where the predictors are uncorrelated.

VIFs are usually calculated by the software as part of the regression analysis and reported in a VIF column of the output. To compute the VIF for predictor i (e.g. x1 or x2), regress it on every other predictor in the model and plug the resulting R-squared, R_i^2, into the formula:

VIF_i = 1 / (1 - R_i^2)

Tolerance is the denominator, 1 - R_i^2.
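The procedure described above can be sketched directly with NumPy. This is a minimal illustration, not the routine any particular statistics package uses; the function name `vif` and the data are hypothetical.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (n_obs x n_predictors)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    factors = []
    for i in range(k):
        target = X[:, i]
        others = np.delete(X, i, axis=1)
        # Regress predictor i on the remaining predictors plus an intercept
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ coef
        centered = target - target.mean()
        r2 = 1.0 - (resid @ resid) / (centered @ centered)
        factors.append(1.0 / (1.0 - r2))  # VIF_i = 1 / (1 - R_i^2)
    return np.array(factors)

# Hypothetical data: wealth is almost an exact multiple of income
rng = np.random.default_rng(0)
income = np.linspace(80, 260, 10)
wealth = 10 * income + rng.normal(0, 20, 10)

vifs = vif(np.column_stack([income, wealth]))
tolerance = 1.0 / vifs
print("VIF:      ", vifs)
print("Tolerance:", tolerance)
```

With only two predictors the two VIFs coincide, because each auxiliary regression has the same R-squared (the squared correlation between the two variables).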

v) What to do to remove this problem?

If multicollinearity is a problem in your model (for example, if the VIF for a factor is near or above 5), the solution may be relatively simple. Try one of these:

  • Remove highly correlated predictors from the model. If you have two or more factors with a high VIF, remove one of them. Because they supply redundant information, removing one of the correlated factors usually does not drastically reduce the R-squared. Consider using stepwise regression, best subsets regression, or specialized knowledge of the data set to choose which variables to remove. Among the candidate models, prefer the one with the highest adjusted R-squared, since the plain R-squared never falls when a predictor is added.
  • Use partial least squares (PLS) regression or principal components analysis, methods that replace the predictors with a smaller set of uncorrelated components.

vi) How do we remedy this problem?

The potential solutions include the following:

  • Remove some of the highly correlated independent variables.
  • Linearly combine the independent variables, such as adding them together.
  • Perform an analysis designed for highly correlated variables, such as principal components analysis or partial least squares regression.
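As a rough sketch of the principal components idea, the example below standardizes two highly correlated predictors and rotates them into components that are uncorrelated by construction. The data are hypothetical, and the decomposition uses NumPy's SVD rather than a dedicated PCA routine.

```python
import numpy as np

# Hypothetical data: wealth is almost an exact multiple of income
rng = np.random.default_rng(1)
income = np.linspace(80, 260, 10)
wealth = 10 * income + rng.normal(0, 20, 10)
X = np.column_stack([income, wealth])

# Standardize so both predictors are on the same scale
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Rows of Vt are the principal directions; scores are the new variables
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
scores = Z @ Vt.T
explained = s**2 / np.sum(s**2)  # share of variance per component

print("Share of variance per component:", explained)
print("Correlation between components: ",
      np.corrcoef(scores[:, 0], scores[:, 1])[0, 1])
```

Here nearly all the variance sits in the first component, so a regression of consumption on that single component would use the shared information in income and wealth without the collinearity between them.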
