Question

Data has been gathered to explain executive salaries from a variety of factors.

Perform a Multiple Regression to predict SALARY from EXPerience, EDUCation, GENDER, NUMberSupported and ASSETS.

d. Test for violation of the Model Assumptions. How did you do this?

e. Test for multicollinearity. How did you do this?

SALARY EXP EDUC GENDER NUMSUP ASSETS
93300 12 15 1 240 170
130000 25 14 1 510 160
88200 20 14 0 370 170
74400 3 19 1 170 170
115300 19 12 1 520 150
70400 14 13 0 420 160
114200 18 18 1 290 170
72600 2 17 1 200 180
108600 14 13 1 560 180
68600 4 16 1 230 160
102000 8 18 1 540 150
101400 19 15 1 90 180
149400 23 16 1 560 180
57100 5 15 0 470 150
87400 3 16 1 340 190
131000 22 17 1 70 200
90300 24 14 0 160 180
115600 22 16 1 160 190
102800 13 18 1 110 180
141900 21 16 1 410 180
90900 10 13 1 370 190
73400 11 12 1 180 170
101000 12 19 1 60 200
85400 10 19 1 60 180
138300 26 17 1 110 200
82300 7 15 1 280 190
85500 7 19 1 110 180
75300 10 19 0 300 170
87500 23 14 0 220 170
127100 12 15 1 570 200
80100 6 16 1 240 180
90900 15 16 0 300 150
109600 15 18 1 260 170
70700 8 13 1 150 160
104400 18 19 0 350 160
71200 2 13 1 370 190
85400 13 14 1 150 160
89300 12 17 0 480 190
124800 21 15 1 310 180
42800 3 12 0 340 150
125000 20 16 1 520 160
122200 20 19 1 200 170
107100 20 17 0 490 160
61000 1 15 0 570 180
59800 2 17 1 70 160
95700 9 17 1 300 160
85600 11 17 0 190 160
88900 21 13 0 500 160
143000 20 20 1 390 170
109200 17 16 0 520 180
156700 24 12 1 530 200
65100 2 17 0 590 190
105900 9 13 1 560 170
74300 2 18 0 600 190
79300 13 12 0 390 170
106600 14 18 1 110 170
106400 18 13 1 190 190
77400 10 14 1 110 160
129400 21 13 1 430 190
82600 11 14 0 440 150
126100 26 15 1 210 190
121900 22 18 1 320 160
96200 3 16 1 560 180
128900 17 18 1 450 190
72200 2 16 1 410 180
58800 4 18 0 70 150
79300 8 17 1 90 190
96100 13 15 1 290 160
94900 3 18 1 530 180
89000 13 16 0 420 170
108800 25 19 0 150 200
95300 11 15 1 500 190
71200 2 17 0 430 190
173400 26 17 1 570 190
107000 20 20 1 90 150
100000 19 12 1 340 160
100700 12 13 1 440 170
152800 22 18 1 500 160
95300 13 13 0 570 180
77300 2 15 1 560 190
84600 15 14 1 160 170
92600 12 13 1 390 190
85900 13 19 0 370 200
79400 5 17 1 330 160
80100 8 17 0 560 170
114100 21 20 0 590 180
78500 5 16 1 290 200
87300 9 18 0 440 180
102900 19 15 0 480 190
116300 23 19 1 130 150
51500 3 12 0 440 190
106500 13 19 1 310 150
109000 22 17 0 370 200
66600 9 12 0 180 160
111100 7 19 1 520 200
83100 10 18 0 90 180
159500 25 18 1 590 160
122500 10 19 1 480 200
67300 3 19 1 80 160
97900 16 17 0 380 160

Solutions

Expert Solution

d. We need to check the following assumptions:

1. Homoscedasticity of residuals or equal variance

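R code:

# data is assumed to be a data frame holding the table above
# Fit the full model and show the four standard diagnostic plots in a 2 x 2 grid
par(mfrow = c(2, 2))
mod_1 <- lm(SALARY ~ ., data = data)
plot(mod_1)

Output: (four diagnostic plots: Residuals vs Fitted, Normal Q-Q, Scale-Location, Residuals vs Leverage; figure not reproduced)

In the Residuals vs Fitted plot (top left) and the Scale-Location plot (bottom left), the points show no clear pattern such as a funnel shape. Hence the assumption of homoscedasticity of the residuals appears reasonable.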
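As a numerical complement to the visual check, a Breusch-Pagan style test can be computed in base R. The sketch below is an illustration and not part of the original solution: it uses only the first 15 rows of the table above so that it runs on its own, and the names `mod_bp`, `res_sq`, `aux`, `bp_stat`, `bp_df` and `bp_p` are made up for this example. It regresses the squared residuals on the predictors and compares n * R^2 with a chi-squared distribution (Koenker's studentized form of the test). On the full data set, `bptest(mod_1)` from the lmtest package should perform the same kind of test.

```r
# Illustration only: first 15 rows of the salary table, not the full data set.
data <- read.table(header = TRUE, text = "
SALARY EXP EDUC GENDER NUMSUP ASSETS
93300 12 15 1 240 170
130000 25 14 1 510 160
88200 20 14 0 370 170
74400 3 19 1 170 170
115300 19 12 1 520 150
70400 14 13 0 420 160
114200 18 18 1 290 170
72600 2 17 1 200 180
108600 14 13 1 560 180
68600 4 16 1 230 160
102000 8 18 1 540 150
101400 19 15 1 90 180
149400 23 16 1 560 180
57100 5 15 0 470 150
87400 3 16 1 340 190
")

# Full model, as in the solution
mod_bp <- lm(SALARY ~ ., data = data)

# Auxiliary regression: squared residuals on the same predictors
res_sq <- residuals(mod_bp)^2
aux <- lm(res_sq ~ EXP + EDUC + GENDER + NUMSUP + ASSETS, data = data)

# Test statistic n * R^2, chi-squared with one df per predictor
bp_stat <- nrow(data) * summary(aux)$r.squared
bp_df <- length(coef(aux)) - 1
bp_p <- pchisq(bp_stat, df = bp_df, lower.tail = FALSE)

cat("BP statistic:", bp_stat, " df:", bp_df, " p-value:", bp_p, "\n")
```

A p-value above the chosen significance level (for example 0.05) would mean there is no evidence against equal variance. With only 15 rows the numbers printed here say nothing about the full data set.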

2. No autocorrelation of residuals

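This can be tested with dwtest() (the Durbin-Watson test) from the lmtest package in R. Here the test is run on a simple regression of SALARY (column 1) on each of columns 2 to 5 in turn; ASSETS (column 6) is not included.

R code:

library(lmtest)
# Columns: 1 = SALARY, 2 = EXP, 3 = EDUC, 4 = GENDER, 5 = NUMSUP
dwtest(data[,1] ~ data[,2])
dwtest(data[,1] ~ data[,3])
dwtest(data[,1] ~ data[,4])
dwtest(data[,1] ~ data[,5])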

Output:

> dwtest(data[,1] ~ data[,2])
Durbin-Watson test
data: data[, 1] ~ data[, 2]
DW = 2.2839, p-value = 0.9244
alternative hypothesis: true autocorrelation is greater than 0

> dwtest(data[,1] ~ data[,3])
Durbin-Watson test
data: data[, 1] ~ data[, 3]
DW = 2.3343, p-value = 0.9555
alternative hypothesis: true autocorrelation is greater than 0

> dwtest(data[,1] ~ data[,4])
Durbin-Watson test
data: data[, 1] ~ data[, 4]
DW = 2.1778, p-value = 0.819
alternative hypothesis: true autocorrelation is greater than 0

> dwtest(data[,1] ~ data[,5])
Durbin-Watson test
data: data[, 1] ~ data[, 5]
DW = 2.3297, p-value = 0.9515
alternative hypothesis: true autocorrelation is greater than 0

Since the p-value is greater than 0.05 in each case, we fail to reject the null hypothesis of no positive autocorrelation, so there is no evidence of autocorrelation in the residuals.
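The tests above each use a regression on a single predictor. A minimal sketch of the same check on the residuals of the full model is given below. It is an illustration and not part of the original solution: it uses only the first 15 rows of the table so that it runs on its own, and the names `mod_dw`, `e` and `dw_stat` are made up for this example. The statistic is the sum of squared successive differences of the residuals divided by their sum of squares; values near 2 suggest no first-order autocorrelation. On the full data set, passing the fitted model to dwtest() from lmtest should give the statistic together with a p-value. The result depends on the row order, which may have no natural meaning for cross-sectional data like this.

```r
# Illustration only: first 15 rows of the salary table, not the full data set.
data <- read.table(header = TRUE, text = "
SALARY EXP EDUC GENDER NUMSUP ASSETS
93300 12 15 1 240 170
130000 25 14 1 510 160
88200 20 14 0 370 170
74400 3 19 1 170 170
115300 19 12 1 520 150
70400 14 13 0 420 160
114200 18 18 1 290 170
72600 2 17 1 200 180
108600 14 13 1 560 180
68600 4 16 1 230 160
102000 8 18 1 540 150
101400 19 15 1 90 180
149400 23 16 1 560 180
57100 5 15 0 470 150
87400 3 16 1 340 190
")

# Full model with all five predictors
mod_dw <- lm(SALARY ~ ., data = data)

# Durbin-Watson statistic computed by hand from the residuals, in row order
e <- residuals(mod_dw)
dw_stat <- sum(diff(e)^2) / sum(e^2)

cat("Durbin-Watson statistic (full model):", dw_stat, "\n")
```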

3. Normality of residuals

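This can be checked visually with a normal Q-Q plot of the residuals.

R code:

mod <- lm(SALARY ~ ., data = data)
# which = 2 selects the Normal Q-Q panel of the standard diagnostic plots
plot(mod, which = 2)

Output: (Normal Q-Q plot of the residuals; figure not reproduced)

The Q-Q plot evaluates this assumption. If the points lie exactly on the line, the residuals follow a normal distribution perfectly. In our case the points lie approximately on the line, so we can say that the residuals are approximately normally distributed.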
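A formal test can back up the visual impression. The sketch below applies the Shapiro-Wilk test from base R to the model residuals. It is an illustration and not part of the original solution: it uses only the first 15 rows of the table so that it runs on its own, and the names `mod_sw` and `sw` are made up for this example.

```r
# Illustration only: first 15 rows of the salary table, not the full data set.
data <- read.table(header = TRUE, text = "
SALARY EXP EDUC GENDER NUMSUP ASSETS
93300 12 15 1 240 170
130000 25 14 1 510 160
88200 20 14 0 370 170
74400 3 19 1 170 170
115300 19 12 1 520 150
70400 14 13 0 420 160
114200 18 18 1 290 170
72600 2 17 1 200 180
108600 14 13 1 560 180
68600 4 16 1 230 160
102000 8 18 1 540 150
101400 19 15 1 90 180
149400 23 16 1 560 180
57100 5 15 0 470 150
87400 3 16 1 340 190
")

# Full model with all five predictors
mod_sw <- lm(SALARY ~ ., data = data)

# Shapiro-Wilk test; the null hypothesis is that the residuals are normal
sw <- shapiro.test(residuals(mod_sw))

cat("W =", sw$statistic, " p-value =", sw$p.value, "\n")
```

A p-value above 0.05 would mean the test finds no evidence against normality. The numbers from this 15-row subset do not describe the full data set.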

e. Test for multicollinearity

Using the vif() function from the car package in R, we can check for multicollinearity.

The variance inflation factor (VIF) is computed for every X variable in a linear model as 1 / (1 - R^2), where R^2 comes from regressing that variable on all the other X variables. A high VIF means the information in that variable is already largely explained by the other X variables in the model, so the variable is redundant. The lower the VIF the better; values below 2 indicate very little redundancy.

R Code:

library(car)
mod2 <- lm(SALARY ~ ., data=data)
vif(mod2)

Output:

     EXP     EDUC   GENDER   NUMSUP   ASSETS
1.002071 1.037777 1.063135 1.101590 1.029408

The VIF of every variable is less than 2 (all are close to 1), so there is no evidence of multicollinearity among the predictors.
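To show what the VIF measures, it can also be computed by hand in base R. The sketch below is an illustration and not part of the original solution: it uses only the first 15 rows of the table so that it runs on its own, and the names `predictors` and `vif_by_hand` are made up for this example. Each predictor is regressed on the remaining predictors, and the VIF is 1 / (1 - R^2) of that regression. For a model like this one, where every term is a single numeric column, this should agree with what vif() from car reports.

```r
# Illustration only: first 15 rows of the salary table, not the full data set.
data <- read.table(header = TRUE, text = "
SALARY EXP EDUC GENDER NUMSUP ASSETS
93300 12 15 1 240 170
130000 25 14 1 510 160
88200 20 14 0 370 170
74400 3 19 1 170 170
115300 19 12 1 520 150
70400 14 13 0 420 160
114200 18 18 1 290 170
72600 2 17 1 200 180
108600 14 13 1 560 180
68600 4 16 1 230 160
102000 8 18 1 540 150
101400 19 15 1 90 180
149400 23 16 1 560 180
57100 5 15 0 470 150
87400 3 16 1 340 190
")

predictors <- c("EXP", "EDUC", "GENDER", "NUMSUP", "ASSETS")

# For each predictor: regress it on the others, then VIF = 1 / (1 - R^2)
vif_by_hand <- sapply(predictors, function(v) {
  others <- setdiff(predictors, v)
  fit <- lm(reformulate(others, response = v), data = data)
  1 / (1 - summary(fit)$r.squared)
})

print(round(vif_by_hand, 3))
```

Because only a subset of rows is used, the values printed here will differ from the output shown above for the full data.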

