Question

please use R for solving the questions

(e) Is multicollinearity a potential problem in this model?

(f) Construct a normal probability plot of the residuals. Does there seem to be any problem with the normality assumption?

(g) Construct and interpret a plot of the residuals versus predicted response.

(h) Based on the above analysis, what is your recommended model?

[Hint: Use the lm command in R to fit a regression equation.]

Table B.4

y x1 x2 x3 x4 x5 x6 x7 x8 x9
29.5 5.0208 1 3.531 1.5 2 7 4 62 0
27.9 4.5429 1 2.275 1.175 1 6 3 40 0
25.9 4.5573 1 4.05 1.232 1 6 3 54 0
29.9 5.0597 1 4.455 1.121 1 6 3 42 0
29.9 3.891 1 4.455 0.988 1 6 3 56 0
30.9 5.898 1 5.85 1.24 1 7 3 51 1
28.9 5.6039 1 9.52 1.501 0 6 3 32 0
35.9 5.8282 1 6.435 1.225 2 6 3 32 0
31.5 5.3003 1 4.9883 1.552 1 6 3 30 0
31 6.2712 1 5.52 0.975 1 5 2 30 0
30.9 5.9592 1 6.666 1.121 2 6 3 32 0
30 5.05 1 5 1.02 0 5 2 46 1
36.9 8.2464 1.5 5.15 1.664 2 8 4 50 0
41.9 6.6969 1.5 6.902 1.488 1.5 7 3 22 1
40.5 7.7841 1.5 7.102 1.376 1 6 3 17 0
43.9 9.0384 1 7.8 1.5 1.5 7 3 23 0
37.5 5.9894 1 5.52 1.256 2 6 3 40 1
37.9 7.5422 1.5 5 1.69 1 6 3 22 0
44.5 8.7951 1.5 9.89 1.82 2 8 4 50 1
37.9 6.0831 1.5 6.7265 1.652 1 6 3 44 0
38.9 8.3607 1.5 9.15 1.777 2 8 4 48 1
36.9 8.14 1 8 1.504 2 7 3 3 0
45.8 9.1416 1.5 7.3262 1.831 1.5 8 4 31 0
25.9 4.9176 1 3.472 0.998 1 7 4 42 0

Solution

data_9Nov = read.csv(file.choose(),header = T)

head(data_9Nov)

     y     x1 x2    x3    x4 x5 x6 x7 x8 x9
1 29.5 5.0208  1 3.531 1.500  2  7  4 62  0
2 27.9 4.5429  1 2.275 1.175  1  6  3 40  0
3 25.9 4.5573  1 4.050 1.232  1  6  3 54  0
4 29.9 5.0597  1 4.455 1.121  1  6  3 42  0
5 29.9 3.8910  1 4.455 0.988  1  6  3 56  0
6 30.9 5.8980  1 5.850 1.240  1  7  3 51  1

e)

round(cor(data_9Nov),2)

Lower triangle of the correlation matrix (the matrix is symmetric):

        y    x1    x2    x3    x4    x5    x6    x7    x8    x9
y    1.00
x1   0.88  1.00
x2   0.71  0.65  1.00
x3   0.65  0.69  0.41  1.00
x4   0.71  0.73  0.73  0.57  1.00
x5   0.46  0.46  0.22  0.20  0.36  1.00
x6   0.53  0.64  0.51  0.39  0.68  0.59  1.00
x7   0.28  0.37  0.43  0.15  0.57  0.54  0.87  1.00
x8  -0.40 -0.44 -0.10 -0.35 -0.14 -0.02  0.12  0.31  1.00
x9   0.26  0.15  0.20  0.31  0.11  0.10  0.22  0.00  0.23  1.00

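From the above correlation matrix we can see that x4 is correlated with x1 (0.73), x2 (0.73) and x6 (0.68), and that x6 and x7 are strongly correlated with each other (0.87). Hence multicollinearity is a potential problem if we use these variables together in the model.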
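Pairwise correlations only screen for collinearity between two regressors at a time. A common complement is the variance inflation factor (VIF), which also picks up near-linear dependence among several regressors. The block below is a minimal sketch, not part of the original solution: it rebuilds Table B.4 inline (an assumption made so the code runs without a CSV file) and computes the VIFs as the diagonal of the inverse of the regressor correlation matrix. A frequently quoted rule of thumb treats VIFs above about 10 as a sign of serious multicollinearity.

```r
# Table B.4 typed inline so the example does not depend on a file on disk
data_9Nov <- read.table(header = TRUE, text = "
y x1 x2 x3 x4 x5 x6 x7 x8 x9
29.5 5.0208 1 3.531 1.5 2 7 4 62 0
27.9 4.5429 1 2.275 1.175 1 6 3 40 0
25.9 4.5573 1 4.05 1.232 1 6 3 54 0
29.9 5.0597 1 4.455 1.121 1 6 3 42 0
29.9 3.891 1 4.455 0.988 1 6 3 56 0
30.9 5.898 1 5.85 1.24 1 7 3 51 1
28.9 5.6039 1 9.52 1.501 0 6 3 32 0
35.9 5.8282 1 6.435 1.225 2 6 3 32 0
31.5 5.3003 1 4.9883 1.552 1 6 3 30 0
31 6.2712 1 5.52 0.975 1 5 2 30 0
30.9 5.9592 1 6.666 1.121 2 6 3 32 0
30 5.05 1 5 1.02 0 5 2 46 1
36.9 8.2464 1.5 5.15 1.664 2 8 4 50 0
41.9 6.6969 1.5 6.902 1.488 1.5 7 3 22 1
40.5 7.7841 1.5 7.102 1.376 1 6 3 17 0
43.9 9.0384 1 7.8 1.5 1.5 7 3 23 0
37.5 5.9894 1 5.52 1.256 2 6 3 40 1
37.9 7.5422 1.5 5 1.69 1 6 3 22 0
44.5 8.7951 1.5 9.89 1.82 2 8 4 50 1
37.9 6.0831 1.5 6.7265 1.652 1 6 3 44 0
38.9 8.3607 1.5 9.15 1.777 2 8 4 48 1
36.9 8.14 1 8 1.504 2 7 3 3 0
45.8 9.1416 1.5 7.3262 1.831 1.5 8 4 31 0
25.9 4.9176 1 3.472 0.998 1 7 4 42 0
")

# Regressors only (drop the response y)
X <- data_9Nov[, paste0("x", 1:9)]

# VIF_j is the j-th diagonal element of the inverse correlation matrix,
# which equals 1 / (1 - R_j^2) from regressing x_j on the other regressors
vif <- setNames(diag(solve(cor(X))), colnames(X))
print(round(vif, 2))
```

Computing the VIFs from the correlation matrix keeps the example in base R; packages such as car offer an equivalent helper, but that dependency is not assumed here.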

f)

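model = lm(y ~ ., data = data_9Nov)  # assumed: full model on all nine regressors; `model` is not defined earlier in the solution

hist(residuals(model))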

Since the histogram of the residuals does not resemble a bell curve, the residuals do not appear to be normally distributed. With only 24 observations a histogram is a coarse check, so this conclusion is tentative.
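Part (f) asks for a normal probability plot rather than a histogram. The following is a minimal sketch of that plot, with a Shapiro-Wilk test added as a numerical check. It assumes `model` is the full fit on all nine regressors (the original solution does not say which model it used) and rebuilds Table B.4 inline so that it runs on its own.

```r
# Table B.4 typed inline so the example does not depend on a file on disk
data_9Nov <- read.table(header = TRUE, text = "
y x1 x2 x3 x4 x5 x6 x7 x8 x9
29.5 5.0208 1 3.531 1.5 2 7 4 62 0
27.9 4.5429 1 2.275 1.175 1 6 3 40 0
25.9 4.5573 1 4.05 1.232 1 6 3 54 0
29.9 5.0597 1 4.455 1.121 1 6 3 42 0
29.9 3.891 1 4.455 0.988 1 6 3 56 0
30.9 5.898 1 5.85 1.24 1 7 3 51 1
28.9 5.6039 1 9.52 1.501 0 6 3 32 0
35.9 5.8282 1 6.435 1.225 2 6 3 32 0
31.5 5.3003 1 4.9883 1.552 1 6 3 30 0
31 6.2712 1 5.52 0.975 1 5 2 30 0
30.9 5.9592 1 6.666 1.121 2 6 3 32 0
30 5.05 1 5 1.02 0 5 2 46 1
36.9 8.2464 1.5 5.15 1.664 2 8 4 50 0
41.9 6.6969 1.5 6.902 1.488 1.5 7 3 22 1
40.5 7.7841 1.5 7.102 1.376 1 6 3 17 0
43.9 9.0384 1 7.8 1.5 1.5 7 3 23 0
37.5 5.9894 1 5.52 1.256 2 6 3 40 1
37.9 7.5422 1.5 5 1.69 1 6 3 22 0
44.5 8.7951 1.5 9.89 1.82 2 8 4 50 1
37.9 6.0831 1.5 6.7265 1.652 1 6 3 44 0
38.9 8.3607 1.5 9.15 1.777 2 8 4 48 1
36.9 8.14 1 8 1.504 2 7 3 3 0
45.8 9.1416 1.5 7.3262 1.831 1.5 8 4 31 0
25.9 4.9176 1 3.472 0.998 1 7 4 42 0
")

# Assumed model: y regressed on all nine regressors
model <- lm(y ~ ., data = data_9Nov)
res <- residuals(model)

# Normal Q-Q plot: points close to the reference line support normality
qqnorm(res, main = "Normal Q-Q plot of residuals")
qqline(res)

# Shapiro-Wilk test; H0: the residuals come from a normal distribution
sw <- shapiro.test(res)
print(sw)
```

Marked curvature or heavy tails in the Q-Q plot, or a small Shapiro-Wilk p-value, would point to a problem with the normality assumption; roughly linear points would not.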

g)

plot(model,1)

Ideally, the residual plot will show no fitted pattern. That is, the red line should be approximately horizontal at zero. The presence of a pattern may indicate a problem with some aspect of the linear model.

In our example, there is no pattern in the residual plot. This suggests that we can assume a linear relationship between the predictors and the outcome variable.

h)

My recommended model is:

model2 = lm(y~x1+x2,data=data_9Nov)

summary(model2)

 
Call:
lm(formula = y ~ x1 + x2, data = data_9Nov)
 
Residuals:
    Min      1Q  Median      3Q     Max 
-4.7639 -1.9454 -0.1822  1.8068  5.0423 
 
Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  10.0418     2.9585   3.394  0.00273 ** 
x1            2.7134     0.4849   5.595 1.49e-05 ***
x2            6.1643     3.1864   1.935  0.06663 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
 
Residual standard error: 2.792 on 21 degrees of freedom
Multiple R-squared:  0.8025,  Adjusted R-squared:  0.7837 
F-statistic: 42.67 on 2 and 21 DF,  p-value: 4.007e-08

In this fit x1 is highly significant, x2 is marginal (p = 0.067), and the two regressors together explain about 80% of the variation in y (adjusted R-squared 0.78).

hist(residuals(model2))
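
One way to back up the choice of the smaller model is a partial F-test against the full model, alongside a comparison of adjusted R-squared. The sketch below is an illustration under the assumption that the full model contains all nine regressors; it is not taken from the original solution, and the data are rebuilt inline from Table B.4 so that it runs on its own.

```r
# Table B.4 typed inline so the example does not depend on a file on disk
data_9Nov <- read.table(header = TRUE, text = "
y x1 x2 x3 x4 x5 x6 x7 x8 x9
29.5 5.0208 1 3.531 1.5 2 7 4 62 0
27.9 4.5429 1 2.275 1.175 1 6 3 40 0
25.9 4.5573 1 4.05 1.232 1 6 3 54 0
29.9 5.0597 1 4.455 1.121 1 6 3 42 0
29.9 3.891 1 4.455 0.988 1 6 3 56 0
30.9 5.898 1 5.85 1.24 1 7 3 51 1
28.9 5.6039 1 9.52 1.501 0 6 3 32 0
35.9 5.8282 1 6.435 1.225 2 6 3 32 0
31.5 5.3003 1 4.9883 1.552 1 6 3 30 0
31 6.2712 1 5.52 0.975 1 5 2 30 0
30.9 5.9592 1 6.666 1.121 2 6 3 32 0
30 5.05 1 5 1.02 0 5 2 46 1
36.9 8.2464 1.5 5.15 1.664 2 8 4 50 0
41.9 6.6969 1.5 6.902 1.488 1.5 7 3 22 1
40.5 7.7841 1.5 7.102 1.376 1 6 3 17 0
43.9 9.0384 1 7.8 1.5 1.5 7 3 23 0
37.5 5.9894 1 5.52 1.256 2 6 3 40 1
37.9 7.5422 1.5 5 1.69 1 6 3 22 0
44.5 8.7951 1.5 9.89 1.82 2 8 4 50 1
37.9 6.0831 1.5 6.7265 1.652 1 6 3 44 0
38.9 8.3607 1.5 9.15 1.777 2 8 4 48 1
36.9 8.14 1 8 1.504 2 7 3 3 0
45.8 9.1416 1.5 7.3262 1.831 1.5 8 4 31 0
25.9 4.9176 1 3.472 0.998 1 7 4 42 0
")

model  <- lm(y ~ ., data = data_9Nov)        # full model (assumed)
model2 <- lm(y ~ x1 + x2, data = data_9Nov)  # recommended model

# Partial F-test; H0: the coefficients of x3, ..., x9 are all zero
cmp <- anova(model2, model)
print(cmp)

# Adjusted R-squared penalises the extra regressors in the full model
adj <- c(full    = summary(model)$adj.r.squared,
         reduced = summary(model2)$adj.r.squared)
print(round(adj, 4))
```

A large p-value in the F-test would indicate that dropping x3 through x9 loses little explanatory power, which would support the two-regressor model; a small one would argue for keeping some of them.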

