Question

In: Statistics and Probability

The Excel file BankData shows the values of the following variables for randomly selected 93 employees...

The Excel file BankData shows the values of the following variables for randomly selected 93 employees of a large bank. This real data set was used in a court lawsuit against discrimination.  

Let

= monthly salary in dollars (SALARY),

= years of schooling at the time of hire (EDUCAT),

= number of months of previous work experience (EXPER),

= number of months that the individual was hired by the bank (MONTHS),

= dummy variable coded 1 for males and 0 for females (MALE).

Using the t-test studied in Section 10.2, you could find some evidence that the mean salary of all male employees is greater than the mean salaries of all female employees, and hence provide some support for a discrimination suit against the employer. It is recognized, however, that a simplecomparison of the mean salaries might be insufficient to conclude that the female employees have been discriminated against. Obviously there are other factors that affect the salary. These factors have been identified as  and  defined above.

Assume the following multiple linear regression model,

,

and apply Regression in Data Analysis of Excel (see pages 312 – 314) to find the estimated regression equation

.

Note. Of course, Input Y Range is A1:A94, Input X range is B1:E94, and Labels should be checked.

1. Clearly show the estimated regression equation. What is the percentage of variation in the salary explained by this equation? Assuming that the values of  and  are fixed, what is the estimated difference between the predicted salaries of male and female employees?

2. What salary would you predict for a male employee with 12 years educations, 10 months of previous work experience, and with the time hired equal to 15 months? What salary would you predict for a female employee with 12 years educations, 10 months of previous work experience, and with the time hired equal to 15 months? What is the difference between the two predicted salaries? Compare this difference with that stated in Task 1.

3. Is there a significant difference in the predicted salaries for male and female employees after accounting for the effects of the three other independent variables? To answer this question, conduct the ttest for the significance of at a 1% level of significance. Clearly show the null and alternative hypotheses to be tested, the value of the test statistic, the p-value of the test, your conclusion and its interpretation; see pages 322 – 323 and 333 – 335.

SALARY EDUCAT EXPER MONTHS MALE
4620 10 12 22 1
5040 8 14 3 1
5100 9 36 15 1
5100 10 55 2 1
5220 12 29 14 1
5400 12 37 21 1
5400 12 38 11 1
5400 12 39 3 1
5400 10 48 8 1
5400 10 60 11 1
5700 15 74 5 1
6000 15 88 21 1
6000 12 98 12 1
6000 12 113 17 1
6000 12 115 14 1
6000 15 123 33 1
6000 14 152 11 1
6000 14 173 19 1
6000 15 150 13 1
6000 15 136 32 1
6000 15 156 12 1
6000 15 180 33 1
6000 15 156 16 1
6000 16 145 13 1
6300 15 220 17 1
6600 15 164 16 1
6600 15 259 33 1
6600 15 216 16 1
6840 15 142 17 1
6900 16 175 20 1
6900 15 132 24 1
8100 16 315 33 1
3900 9 3 1 0
4020 10 12 7 0
4290 12 5 10 0
4380 8 6 7 0
4380 8 8 6 0
4380 12 3 7 0
4380 12 4 10 0
4380 12 5 6 0
4440 10 11 2 0
4500 12 12 3 0
4500 12 8 19 0
4620 12 52 13 0
4800 10 70 20 0
4800 12 52 23 0
4800 12 11 12 0
4800 12 75 17 0
4800 12 63 22 0
4800 12 144 24 0
4800 12 163 12 0
4800 15 228 26 0
4800 12 381 10 0
4800 16 214 15 0
4980 10 318 25 0
5100 10 96 33 0
5100 12 36 15 0
5100 12 59 14 0
5100 10 115 1 0
5100 10 165 4 0
5100 15 123 12 0
5160 12 118 12 0
5220 10 102 29 0
5220 12 127 29 0
5280 10 90 11 0
5280 12 190 31 0
5280 12 107 11 0
5400 10 113 34 0
5400 12 128 33 0
5400 12 126 11 0
5400 12 112 33 0
5400 12 98 22 0
5400 12 82 29 0
5400 12 169 27 0
5400 12 124 31 0
5400 15 94 13 0
5400 15 49 27 0
5400 15 121 21 0
5400 15 122 33 0
5520 12 97 17 0
5520 12 196 32 0
5580 12 133 30 0
5640 12 155 9 0
5700 12 123 23 0
5700 12 117 25 0
5700 15 151 17 0
5700 15 161 11 0
5700 15 241 34 0
6000 12 121 30 0
6000 15 244 22 0
6120 12 209 21 0
6300 15 187 30 0
6300 15 231 33 0

Solutions

Expert Solution

The Excel file BankData shows the values of the following variables for randomly selected 93 employees of a large bank. This real data set was used in a court lawsuit against discrimination.  

Let

Dependent variable y = monthly salary in dollars (SALARY),

There are four independent variables.

x1 = years of schooling at the time of hire (EDUCAT),

x2 = number of months of previous work experience (EXPER),

x3 = number of months that the individual was hired by the bank (MONTHS),

x4 = dummy variable coded 1 for males and 0 for females (MALE).

Using the t-test studied in Section 10.2, you could find some evidence that the mean salary of all male employees is greater than the mean salaries of all female employees, and hence provide some support for a discrimination suit against the employer. It is recognized, however, that asimplecomparison of the mean salaries might be insufficient to conclude that the female employees have been discriminated against. Obviously there are other factors that affect the salary. These factors have been identified as  and  defined above.

Assume the following multiple linear regression model,

,and apply Regression in Data Analysis of Excel (see pages 312 – 314) to find the estimated regression equation

.Note. Of course, Input Y Range is A1:A94, Input X range is B1:E94, and Labels should be checked.

THis is the problem of multiple linear regression.

We can do multiple regression in EXCEL.

steps :

ENTER data into EXCEL sheet --> Data --> Data analysis --> Regression --> ok --> Input y range : Salary --> INput X range : Select all the independent variables --> Labels --> COnfidence level : 99% --> Output range : Select one empty cell --> ok --> ok

1. Clearly show the estimated regression equation. What is the percentage of variation in the salary explained by this equation? Assuming that the values of  and  are fixed, what is the estimated difference between the predicted salaries of male and female employees?

The regression equation is,

y = 3389.11 + 90.84*x1 + 2.97*x2 + 16.54*x3 + 720.44*x4

Rsq = 0.7230 = 72.30%

Interpretation of R square : 72.30% of the variation in the salary explained by variation in independent variables.

2. What salary would you predict for a male employee with 12 years educations, 10 months of previous work experience, and with the time hired equal to 15 months? What salary would you predict for a female employee with 12 years educations, 10 months of previous work experience, and with the time hired equal to 15 months? What is the difference between the two predicted salaries? Compare this difference with that stated in Task 1.

Now we have to predict salary for male (y) when x1 = 12 years

x2 = 10 months

x3 = 15 months

x4 = 1

We can find salary (y) using regression equation.

y = 3389.11 + 90.84*x1 + 2.97*x2 + 16.54*x3 + 720.44*x4

= 3389.11 + 90.84*12 + 2.97*10 + 16.54*15 + 720.44*1

= 5477.42

Now we have to predict salary for female (y) when x1 = 12 years

x2 = 10 months

x3 = 15 months

x4 = 0

We can find salary (y) using regression equation.

y = 3389.11 + 90.84*x1 + 2.97*x2 + 16.54*x3 + 720.44*x4

= 3389.11 + 90.84*12 + 2.97*10 + 16.54*15 + 720.44*0

= 4756.98

DIfference in salaries = salary for male - salary for female

= 5477.42 - 4756.98

= 720.44

3. Is there a significant difference in the predicted salaries for male and female employees after accounting for the effects of the three other independent variables? To answer this question, conduct the ttest for the significance of at a 1% level of significance. Clearly show the null and alternative hypotheses to be tested, the value of the test statistic, the p-value of the test, your conclusion and its interpretation

Here we have to test the hypothesis that,

H0 : B = 0 Vs H1 : B not = 0

where B is population slope for male.

Assume alpha = level of significance = 0.01

The test statistic follows t-distribution with n-2 degrees of freedoms.

From the output the test statistic for male is 8.12 and P-value = 2.7E-12 = 0.0000

P-value < alpha

Reject H0 at 1% level of significance.

COnclusion : The population slope for male is differ than 0.

We get significant result about t-test.


Related Solutions

The Excel file BankData shows the values of the following variables for randomly selected 93 employees...
The Excel file BankData shows the values of the following variables for randomly selected 93 employees of a large bank. (A very similar data set was used in a court lawsuit against discrimination.)   Let = monthly salary in dollars (SALARY), = years of schooling at the time of hire (EDUCAT), = number of months of previous work experience (EXPER), = number of months that the individual was hired by the bank (MONTHS), = dummy variable coded 1 for males and...
The Excel file Salary reports the monthly salaries for 93 randomly and independently selected employees of...
The Excel file Salary reports the monthly salaries for 93 randomly and independently selected employees of a bank; there are 32 salaries of male employees and 61 salaries of female employees.   Let um = the mean monthly salary for all male bank employees, and uf = the mean monthly salary for all female bank employees. Your objective is to find some evidence of um > uf, that is, the female employees are discriminated against. Provide descriptive statistical summaries of the...
Assignment on Multiple Linear Regression The Excel file BankData shows the values of the following variables...
Assignment on Multiple Linear Regression The Excel file BankData shows the values of the following variables for randomly selected 93 employees of a bank. This real data set was used in a court lawsuit against discrimination. Let = starting monthly salary in dollars (SALARY), = years of schooling at the time of hire (EDUCAT), = number of months of previous work experience (EXPER), = number of months that the individual was hired (MONTHS), = dummy variable coded 1 for males...
The following data set is obtained by a randomly selected sample of 93 employees working at...
The following data set is obtained by a randomly selected sample of 93 employees working at a bank. SALARY EDUC EXPER TIME 39000 12 0 1 40200 10 44 7 42900 12 5 30 43800 8 6 7 43800 8 8 6 43800 12 0 7 43800 12 0 10 43800 12 5 6 44400 15 75 2 45000 8 52 3 45000 12 8 19 46200 12 52 3 48000 8 70 20 48000 12 6 23 48000 12...
A survey of 120 randomly selected employees of a large insurance company shows that the sample...
A survey of 120 randomly selected employees of a large insurance company shows that the sample proportion of employees which feel secure about their job is 0.35 and the standard error of this estimate is 0.044. Construct a 99% confidence interval for the proportion of all this company’s employees who feel secure about their jobs. Round your bounds to 2 decimal places.
A manager in the human resources department randomly selected five employees files and recorded the following...
A manager in the human resources department randomly selected five employees files and recorded the following data on X = number of weeks of paid vacation (annually) and Y = number of sick days claimed by the employee in the previous year. An analysis of the linear relationship between X and Y is desired. a) Find the value of b0 and b1 b) Find the regression equation c) Interpret the meaning of the Y intercept b0 d) Predict the number...
Suppose the following data are selected randomly from a population of normally distributed values. According to...
Suppose the following data are selected randomly from a population of normally distributed values. According to the U.S. Bureau of Labor Statistics, the average weekly earnings of a production worker in July 2011 were $657.49. Suppose a labor researcher wants to test to determine whether this figure is still accurate today. The researcher randomly selects 56 production workers from across the United States and obtains a representative earnings statement for one week from each. The resulting sample average is $670.76....
On Excel data shows hours of experience and the corresponding quality scores for employees at a...
On Excel data shows hours of experience and the corresponding quality scores for employees at a plant. The company is interested in studying the relationship between hours of experience and quality scores. a)Compute the correlation coefficient between hours of experience and quality scores. Use Excel’s CORREL function. b)Construct a scatter plot of quality scores vs. hours of experience. Does there appear to be a relationship? What kind? c)Obtain a regression output that could be used to predict quality scores based...
Managers rate employees according to job performance and attitude. The results for several randomly selected employees...
Managers rate employees according to job performance and attitude. The results for several randomly selected employees are given below. Performance (x) / 3 / 2 / 3 / 7 / 4 / 8 / 2 / 2 / 2 / 6 Attitude (y) / 2 / 2 / 1 / 7 / 7 / 6 / 2 / 8 / 5 / 4 Use the given data to find the test statistic for testing linear correlation. Round the final values...
Managers rate employees according to job performance and attitude. The results for several randomly selected employees...
Managers rate employees according to job performance and attitude. The results for several randomly selected employees are given below.    performance 59 63 65 69 58 77 76 69 70 64 attitude 72 67 78 82 75 87 92 83 87 78 b. Does the scatter plot show positive / negative / no correlation? Justify your answer.a. Construct a scatter plot for the given data. c. Find the least-square regression line (Best fit line). Graph the equation on the same...
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT