SALARY | EDUC | EXPER | TIME |
39000 | 12 | 0 | 1 |
40200 | 10 | 44 | 7 |
42900 | 12 | 5 | 30 |
43800 | 8 | 6 | 7 |
43800 | 8 | 8 | 6 |
43800 | 12 | 0 | 7 |
43800 | 12 | 0 | 10 |
43800 | 12 | 5 | 6 |
44400 | 15 | 75 | 2 |
45000 | 8 | 52 | 3 |
45000 | 12 | 8 | 19 |
46200 | 12 | 52 | 3 |
48000 | 8 | 70 | 20 |
48000 | 12 | 6 | 23 |
48000 | 12 | 11 | 12 |
48000 | 12 | 11 | 17 |
48000 | 12 | 63 | 22 |
48000 | 12 | 144 | 24 |
48000 | 12 | 163 | 12 |
48000 | 12 | 228 | 26 |
48000 | 12 | 381 | 1 |
48000 | 16 | 214 | 15 |
49800 | 8 | 318 | 25 |
51000 | 8 | 96 | 33 |
51000 | 12 | 36 | 15 |
51000 | 12 | 59 | 14 |
51000 | 15 | 115 | 1 |
51000 | 15 | 165 | 4 |
51000 | 16 | 123 | 12 |
51600 | 12 | 18 | 12 |
52200 | 8 | 102 | 29 |
52200 | 12 | 127 | 29 |
52800 | 8 | 90 | 11 |
52800 | 8 | 190 | 1 |
52800 | 12 | 107 | 11 |
54000 | 8 | 173 | 34 |
54000 | 8 | 228 | 33 |
54000 | 12 | 26 | 11 |
54000 | 12 | 36 | 33 |
54000 | 12 | 38 | 22 |
54000 | 12 | 82 | 29 |
54000 | 12 | 169 | 27 |
54000 | 12 | 244 | 1 |
54000 | 15 | 24 | 13 |
54000 | 15 | 49 | 27 |
54000 | 15 | 51 | 21 |
54000 | 15 | 122 | 33 |
55200 | 12 | 97 | 17 |
55200 | 12 | 196 | 32 |
55800 | 12 | 133 | 30 |
56400 | 12 | 55 | 9 |
57000 | 12 | 90 | 23 |
57000 | 12 | 117 | 25 |
57000 | 15 | 51 | 17 |
57000 | 15 | 61 | 11 |
57000 | 15 | 241 | 34 |
60000 | 12 | 121 | 30 |
60000 | 15 | 79 | 13 |
61200 | 12 | 209 | 21 |
63000 | 12 | 87 | 33 |
63000 | 15 | 231 | 15 |
46200 | 12 | 12 | 22 |
50400 | 15 | 14 | 3 |
51000 | 12 | 180 | 15 |
51000 | 12 | 315 | 2 |
52200 | 12 | 29 | 14 |
54000 | 12 | 7 | 21 |
54000 | 12 | 38 | 11 |
54000 | 12 | 113 | 3 |
54000 | 15 | 18 | 8 |
54000 | 15 | 359 | 11 |
57000 | 15 | 36 | 5 |
60000 | 8 | 320 | 21 |
60000 | 12 | 24 | 2 |
60000 | 12 | 32 | 17 |
60000 | 12 | 49 | 8 |
60000 | 12 | 56 | 33 |
60000 | 12 | 252 | 11 |
60000 | 12 | 272 | 19 |
60000 | 15 | 25 | 13 |
60000 | 15 | 36 | 32 |
60000 | 15 | 56 | 12 |
60000 | 15 | 64 | 33 |
60000 | 15 | 108 | 16 |
60000 | 16 | 46 | 3 |
63000 | 15 | 72 | 17 |
66000 | 15 | 64 | 16 |
66000 | 15 | 84 | 33 |
66000 | 15 | 216 | 16 |
68400 | 15 | 42 | 7 |
69000 | 12 | 175 | 10 |
69000 | 15 | 132 | 24 |
81000 | 16 | 55 | 33 |
This data set was obtained by collecting information on a randomly selected sample of 93 employees working at a bank.
SALARY - starting annual salary at the time of hire
EDUC - number of years of schooling at the time of hire
EXPER - number of months of previous work experience at the time of hire
TIME - number of months that the employee has been working at the bank until now
2. Use the least squares method to fit a simple linear model that relates salary (dependent variable) to education (independent variable).
a) What is your model? State the hypothesis to be tested, the decision rule, the test statistic, and your decision, using a 5% level of significance.
b) What percentage of the variation in salary is explained by the regression?
c) Provide a 95% confidence interval estimate for the true slope.
d) Based on your model, what is the expected salary of a new hire with 12 years of education?
e) What is the 95% prediction interval for the salary of a new hire with 12 years of education? Use the fact that the distance value = 0.011286.
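For reference, parts (a) through (e) rest on the standard simple linear regression formulas, sketched below. The notation is an assumption and does not come from the problem statement: b_0 and b_1 are the least squares estimates, s_{b_1} is the standard error of the slope, s is the residual standard error, and n = 93. The two-sided alternative in (a) is also an assumption, since the problem does not state a direction.

```latex
\begin{align*}
\text{Model:}\quad & \text{SALARY} = \beta_0 + \beta_1\,\text{EDUC} + \varepsilon \\
\text{(a)}\quad & H_0:\ \beta_1 = 0 \quad\text{vs.}\quad H_a:\ \beta_1 \neq 0,
  \qquad t = \frac{b_1}{s_{b_1}},
  \qquad \text{reject } H_0 \text{ if } |t| > t_{0.025,\,n-2} \\
\text{(b)}\quad & R^2 = 1 - \frac{SSE}{SST} \\
\text{(c)}\quad & b_1 \pm t_{0.025,\,n-2}\; s_{b_1} \\
\text{(d)}\quad & \hat{y} = b_0 + b_1 \cdot 12 \\
\text{(e)}\quad & \hat{y} \pm t_{0.025,\,n-2}\; s\,\sqrt{1 + \text{distance value}}
\end{align*}
```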
> data=read.csv(file.choose())
> data
SALARY EDUC EXPER TIME
1 39000 12 0 1
2 40200 10 44 7
3 42900 12 5 30
4 43800 8 6 7
5 43800 8 8 6
6 43800 12 0 7
7 43800 12 0 10
8 43800 12 5 6
9 44400 15 75 2
10 45000 8 52 3
11 45000 12 8 19
12 46200 12 52 3
13 48000 8 70 20
14 48000 12 6 23
15 48000 12 11 12
16 48000 12 11 17
17 48000 12 63 22
18 48000 12 144 24
19 48000 12 163 12
20 48000 12 228 26
21 48000 12 381 1
22 48000 16 214 15
23 49800 8 318 25
24 51000 8 96 33
25 51000 12 36 15
26 51000 12 59 14
27 51000 15 115 1
28 51000 15 165 4
29 51000 16 123 12
30 51600 12 18 12
31 52200 8 102 29
32 52200 12 127 29
33 52800 8 90 11
34 52800 8 190 1
35 52800 12 107 11
36 54000 8 173 34
37 54000 8 228 33
38 54000 12 26 11
39 54000 12 36 33
40 54000 12 38 22
41 54000 12 82 29
42 54000 12 169 27
43 54000 12 244 1
44 54000 15 24 13
45 54000 15 49 27
46 54000 15 51 21
47 54000 15 122 33
48 55200 12 97 17
49 55200 12 196 32
50 55800 12 133 30
51 56400 12 55 9
52 57000 12 90 23
53 57000 12 117 25
54 57000 15 51 17
55 57000 15 61 11
56 57000 15 241 34
57 60000 12 121 30
58 60000 15 79 13
59 61200 12 209 21
60 63000 12 87 33
61 63000 15 231 15
62 46200 12 12 22
63 50400 15 14 3
64 51000 12 180 15
65 51000 12 315 2
66 52200 12 29 14
67 54000 12 7 21
68 54000 12 38 11
69 54000 12 113 3
70 54000 15 18 8
71 54000 15 359 11
72 57000 15 36 5
73 60000 8 320 21
74 60000 12 24 2
75 60000 12 32 17
76 60000 12 49 8
77 60000 12 56 33
78 60000 12 252 11
79 60000 12 272 19
80 60000 15 25 13
81 60000 15 36 32
82 60000 15 56 12
83 60000 15 64 33
84 60000 15 108 16
85 60000 16 46 3
86 63000 15 72 17
87 66000 15 64 16
88 66000 15 84 33
89 66000 15 216 16
90 68400 15 42 7
91 69000 12 175 10
92 69000 15 132 24
93 81000 16 55 33
> y=data$SALARY
> y
 [1] 39000 40200 42900 43800 43800 43800 43800 43800 44400 45000 45000 46200
[13] 48000 48000 48000 48000 48000 48000 48000 48000 48000 48000 49800 51000
[25] 51000 51000 51000 51000 51000 51600 52200 52200 52800 52800 52800 54000
[37] 54000 54000 54000 54000 54000 54000 54000 54000 54000 54000 54000 55200
[49] 55200 55800 56400 57000 57000 57000 57000 57000 60000 60000 61200 63000
[61] 63000 46200 50400 51000 51000 52200 54000 54000 54000 54000 54000 57000
[73] 60000 60000 60000 60000 60000 60000 60000 60000 60000 60000 60000 60000
[85] 60000 63000 66000 66000 66000 68400 69000 69000 81000
> x=data$EDUC
> x
 [1] 12 10 12  8  8 12 12 12 15  8 12 12  8 12 12 12 12 12 12 12 12 16  8  8 12
[26] 12 15 15 16 12  8 12  8  8 12  8  8 12 12 12 12 12 12 15 15 15 15 12 12 12
[51] 12 12 12 15 15 15 12 15 12 12 15 12 15 12 12 12 12 12 12 15 15 15  8 12 12
[76] 12 12 12 12 15 15 15 15 15 16 15 15 15 15 15 12 15 16
> model=lm(y~x)
> model
Call:
lm(formula = y ~ x)
Coefficients:
(Intercept) x
38186 1281
> summary(model)
Call:
lm(formula = y ~ x)
Residuals:
Min 1Q Median 3Q Max
-14555.9 -4632.5 444.1 3767.5 22320.7
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 38186 3774 10.117 < 2e-16 ***
x 1281 297 4.313 4.08e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: S=6501 on 91 degrees of freedom
Multiple R-squared: 0.1697, Adjusted R-squared: 0.1606
F-statistic: 18.6 on 1 and 91 DF, p-value: 4.077e-05
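The summary output above contains everything needed for parts (a) through (e). The following is a minimal sketch, not the original poster's working. It assumes the rounded values printed by `summary(model)` are accurate enough, a two-sided test of the slope, and the distance value 0.011286 given in part (e). All variable names (`b0`, `b1`, `se_b1`, `t_crit`, and so on) are illustrative and do not appear in the original session.

```r
# Values copied from the printed summary(model) output (rounded as shown).
b0 <- 38186             # intercept estimate
b1 <- 1281              # slope estimate for EDUC
se_b1 <- 297            # standard error of the slope
s <- 6501               # residual standard error
df <- 91                # residual degrees of freedom (n - 2 = 93 - 2)
r2 <- 0.1697            # multiple R-squared
dist_value <- 0.011286  # distance value at EDUC = 12, given in part (e)
alpha <- 0.05           # significance level

# Two-sided critical value from the t distribution with 91 df.
t_crit <- qt(1 - alpha / 2, df)

# (a) Test H0: slope = 0 against Ha: slope != 0 (two-sided is an assumption).
t_stat <- b1 / se_b1               # matches the printed t value of 4.313
reject_h0 <- abs(t_stat) > t_crit  # decision rule: reject if |t| > t_crit

# (b) Percentage of the variation in salary explained by the regression.
pct_explained <- 100 * r2

# (c) 95% confidence interval for the true slope.
slope_ci <- b1 + c(-1, 1) * t_crit * se_b1

# (d) Expected salary for a new hire with 12 years of education.
y_hat <- b0 + b1 * 12              # 38186 + 15372 = 53558

# (e) 95% prediction interval at EDUC = 12, using the given distance value.
pred_int <- y_hat + c(-1, 1) * t_crit * s * sqrt(1 + dist_value)

print(list(t_stat = t_stat, t_crit = t_crit, reject_h0 = reject_h0,
           pct_explained = pct_explained, slope_ci = slope_ci,
           y_hat = y_hat, pred_int = pred_int))
```

Under these assumptions the test rejects H0 at the 5% level, about 17% of the variation in salary is explained, the slope interval comes out at roughly 691 to 1871, and the prediction interval at roughly 40,570 to 66,540. If the fitted `model` object is still in the session, `confint(model)` and `predict(model, data.frame(x = 12), interval = "prediction")` should give the same quantities without the rounding used here.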