In: Statistics and Probability
Assignment on Multiple Linear Regression
The Excel file BankData shows the values of the following variables for randomly selected 93 employees of a bank. This real data set was used in a court lawsuit against discrimination. Let = starting monthly salary in dollars (SALARY), = years of schooling at the time of hire (EDUCAT), = number of months of previous work experience (EXPER), = number of months that the individual was hired (MONTHS), = dummy variable coded 1 for males and 0 for females (MALE). Let = the mean starting salary for all male bank employees, and = the mean starting salary for all female bank employees. Using the t-test studied in Section 10.2, you could find some evidence of and provide some support for a discrimination suit against the employer. It is recognized, however, that a simple comparison of the mean starting salaries might be insufficient to conclude that the female employees have been discriminated against. Obviously there are other factors that affect the starting salary to which the relation might be attributed. These factors have been identified as and defined above. Assume the following regression model, , and apply Regression in Data Analysis of Excel (see pages 312 – 314) to find the estimated regression equation . 1. Clearly show the estimated regression equation. Assuming that the values of and are fixed, what is the estimated average difference between the starting salaries of all male and female employees? 2. What starting salary would you predict for a male employee with 12 years educations, 10 months of previous work experience, and with the time hired equal to 15 months? What starting salary would you predict for a female employee with 12 years educations, 10 months of previous work experience, and with the time hired equal to 15 months? What is the difference between the two predicted salaries? Compare this difference with that found in Task 1. 3. Is there a significant difference in the average starting salaries for male and female employees after accounting for the effects of the three other independent variables? Use a 5% level of significance to answer this question. Clearly show the null and alternative hypotheses to be tested, the value of the test statistic, the p-value of the test, your conclusion and its interpretation; see pages 322 – 323 and 333-335.
SALARY | EDUCAT | EXPER | MONTHS | GENDER |
3900 | 12 | 0 | 1 | 0 |
4020 | 10 | 44 | 7 | 0 |
4290 | 12 | 5 | 30 | 0 |
4380 | 8 | 6 | 7 | 0 |
4380 | 8 | 8 | 6 | 0 |
4380 | 12 | 0 | 7 | 0 |
4380 | 12 | 0 | 10 | 0 |
4380 | 12 | 5 | 6 | 0 |
4440 | 15 | 75 | 2 | 0 |
4500 | 8 | 52 | 3 | 0 |
4500 | 12 | 8 | 19 | 0 |
4620 | 12 | 52 | 3 | 0 |
4800 | 8 | 70 | 20 | 0 |
4800 | 12 | 6 | 23 | 0 |
4800 | 12 | 11 | 12 | 0 |
4800 | 12 | 11 | 17 | 0 |
4800 | 12 | 63 | 22 | 0 |
4800 | 12 | 144 | 24 | 0 |
4800 | 12 | 163 | 12 | 0 |
4800 | 12 | 228 | 26 | 0 |
4800 | 12 | 381 | 1 | 0 |
4800 | 16 | 214 | 15 | 0 |
4980 | 8 | 318 | 25 | 0 |
5100 | 8 | 96 | 33 | 0 |
5100 | 12 | 36 | 15 | 0 |
5100 | 12 | 59 | 14 | 0 |
5100 | 15 | 115 | 1 | 0 |
5100 | 15 | 165 | 4 | 0 |
5100 | 16 | 123 | 12 | 0 |
5160 | 12 | 18 | 12 | 0 |
5220 | 8 | 102 | 29 | 0 |
5220 | 12 | 127 | 29 | 0 |
5280 | 8 | 90 | 11 | 0 |
5280 | 8 | 190 | 1 | 0 |
5280 | 12 | 107 | 11 | 0 |
5400 | 8 | 173 | 34 | 0 |
5400 | 8 | 228 | 33 | 0 |
5400 | 12 | 26 | 11 | 0 |
5400 | 12 | 36 | 33 | 0 |
5400 | 12 | 38 | 22 | 0 |
5400 | 12 | 82 | 29 | 0 |
5400 | 12 | 169 | 27 | 0 |
5400 | 12 | 244 | 1 | 0 |
5400 | 15 | 24 | 13 | 0 |
5400 | 15 | 49 | 27 | 0 |
5400 | 15 | 51 | 21 | 0 |
5400 | 15 | 122 | 33 | 0 |
5520 | 12 | 97 | 17 | 0 |
5520 | 12 | 196 | 32 | 0 |
5580 | 12 | 133 | 30 | 0 |
5640 | 12 | 55 | 9 | 0 |
5700 | 12 | 90 | 23 | 0 |
5700 | 12 | 117 | 25 | 0 |
5700 | 15 | 51 | 17 | 0 |
5700 | 15 | 61 | 11 | 0 |
5700 | 15 | 241 | 34 | 0 |
6000 | 12 | 121 | 30 | 0 |
6000 | 15 | 79 | 13 | 0 |
6120 | 12 | 209 | 21 | 0 |
6300 | 12 | 87 | 33 | 0 |
6300 | 15 | 231 | 15 | 0 |
4620 | 12 | 12 | 22 | 1 |
5040 | 15 | 14 | 3 | 1 |
5100 | 12 | 180 | 15 | 1 |
5100 | 12 | 315 | 2 | 1 |
5220 | 12 | 29 | 14 | 1 |
5400 | 12 | 7 | 21 | 1 |
5400 | 12 | 38 | 11 | 1 |
5400 | 12 | 113 | 3 | 1 |
5400 | 15 | 18 | 8 | 1 |
5400 | 15 | 359 | 11 | 1 |
5700 | 15 | 36 | 5 | 1 |
6000 | 8 | 320 | 21 | 1 |
6000 | 12 | 24 | 2 | 1 |
6000 | 12 | 32 | 17 | 1 |
6000 | 12 | 49 | 8 | 1 |
6000 | 12 | 56 | 33 | 1 |
6000 | 12 | 252 | 11 | 1 |
6000 | 12 | 272 | 19 | 1 |
6000 | 15 | 25 | 13 | 1 |
6000 | 15 | 36 | 32 | 1 |
6000 | 15 | 56 | 12 | 1 |
6000 | 15 | 64 | 33 | 1 |
6000 | 15 | 108 | 16 | 1 |
6000 | 16 | 46 | 3 | 1 |
6300 | 15 | 72 | 17 | 1 |
6600 | 15 | 64 | 16 | 1 |
6600 | 15 | 84 | 33 | 1 |
6600 | 15 | 216 | 16 | 1 |
6840 | 15 | 42 | 7 | 1 |
6900 | 12 | 175 | 10 | 1 |
6900 | 15 | 132 | 24 | 1 |
8100 | 16 | 55 | 33 | 1 |
1) This Is the Regression Output from Excel
Factor |
|
SE | T-Stat | P-Value | |
Intercept | 3526.20 | 327.68 | 10.76 | 0.00 | |
EDUCAT | 90.02 | 24.69 | 3.65 | 0.00 | |
EXPER | 1.27 | 0.59 | 2.16 | 0.03 | |
MONTHS | 23.40 | 5.20 | 4.50 | 0.00 | |
GENDER | 722.38 | 117.81 | 6.13 | 0.00 |
the Regreesion Equation will be
Pred_SALARY=3526.2+ (90.01*EDUCAT)+(1.27*EXPER)+(23.4*MONTHS)+(722.3*GENDER).
1.1) From above Output we have got Predicted salaries and from that we got the estimated average difference between the starting salaries of all male and female employees is 818.02 dollar ( since estimated average starting salary of Male is 5956.87 and for female 5138.85).
2) From above regreesion equation predicted salary for male is 5692.32 dollar provided that 12 years educations, 10 months of previous work experience, and with the time hired equal to 15 months.
and For Female 4970.02 dollar provided that 12 years educations, 10 months of previous work experience, and with the time hired equal to 15 months
2.1) the difference between two salaries is 722.30 dollar
2.2) If we compare this result with Task 1result then the difference for salary in male and female is 264.55 and 168.83 dollar respectively.
3) we will use t test to compare means so our null hypothesis will be
H0: There is no significant difference between actual and pred salary for male.
and H1: There is significant difference between actual and pred salary for male.
so after performing t test we got the t test statistic with p value as shown below
t Stat | 0.00 |
P(T<=t) one-tail | 0.50 |
Here P > alpha (0.05) so we accept H0 .. There is no significant difference between actual and pred salary for male.?
similarly for female same hypothesis
H0: There is no significant difference between actual and pred salary for female.
and H1: There is significant difference between actual and pred salary for female.
t Stat | 0.00 |
P(T<=t) one-tail | 0.50 |
Here P > alpha (0.05) so we accept H0 .... There is no significant difference between actual and pred salary for female.?