Question

Data was collected from 40 employees to develop a regression model to predict the employee’s annual salary using their years with the company (Years), their starting salary in thousands (Starting), and their Gender (Male = 0, Female = 1). The level of significance is .01. The results from Excel regression analysis are shown below:

SUMMARY OUTPUT

Regression Statistics

| Statistic | Value |
| --- | --- |
| Multiple R | 0.718714957 |
| R Square | 0.516551189 |
| Adjusted R Square | 0.476263788 |
| Standard Error | 10615.63461 |
| Observations | 40 |

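ANOVA

| Source | df | SS | MS | F | Significance F |
| --- | --- | --- | --- | --- | --- |
| Regression | 3 | 4334682510 | 1444894170 | 12.82165585 | 7.48476E-06 |
| Residual | 36 | 4056901131 | 112691698.1 | | |
| Total | 39 | 8391583641 | | | |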

| Variable | Coefficients | Standard Error | t Stat | P-value |
| --- | --- | --- | --- | --- |
| Intercept | 27946.57894 | 4832.438706 | 5.783121245 | 1.35464E-06 |
| Years | 1665.251558 | 425.0829092 | 3.917474737 | 0.000383313 |
| Starting | 0.266374185 | 0.12610443 | 2.112330112 | 0.041661598 |
| Gender | -3285.541043 | 5617.145392 | -0.584912943 | 0.56225464 |

  1. Is the overall model statistically significant? Why or why not?
  2. Is the slope coefficient on starting salary statistically significant? You must justify your answer.
  3. Find the predicted starting salary for a male employee with 7 years of experience who had a starting salary of $10,000.
  4. Calculate the marginal effect of 3 additional work years on salary.
  5. Do the results provide evidence of gender discrimination? Why or why not?

Solutions

Expert Solution

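Let the regression model being estimated be

$$\text{Salary} = \beta_0 + \beta_1\,\text{Years} + \beta_2\,\text{Starting} + \beta_3\,\text{Gender} + \varepsilon$$

where Salary is the employee's annual salary.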

a) Is the overall model statistically significant? Why or why not?

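The hypotheses to test whether the overall model is statistically significant are

$$H_0: \beta_1 = \beta_2 = \beta_3 = 0 \quad \text{versus} \quad H_a: \text{at least one of } \beta_1, \beta_2, \beta_3 \text{ is not equal to } 0$$

where $\beta_1$, $\beta_2$, and $\beta_3$ are the slope coefficients on Years, Starting, and Gender.

From the ANOVA table in the Excel output, the test statistic is F = 12.82.

The p-value (Significance F) is 7.48E-06, which is 0.0000 when rounded to 4 decimals.

Using the level of significance $\alpha = 0.01$, we reject the null hypothesis if the p-value is less than the level of significance.

Here, the p-value is 0.0000, which is less than 0.01. Hence we reject the null hypothesis.

ans: Reject H0. The overall model is statistically significant because the p-value is less than the level of significance.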
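As a quick check on part a), the F statistic can be recomputed from the sums of squares and degrees of freedom in the ANOVA table. The sketch below is only illustrative: the variable names are assumptions, and the p-value is copied from Excel's Significance F rather than recomputed.

```python
# ANOVA values copied from the Excel output in the question.
ss_regression = 4334682510
ss_residual = 4056901131
df_regression = 3
df_residual = 36

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# The overall F statistic is the ratio of the two mean squares.
f_stat = ms_regression / ms_residual

# R Square is the share of total variation explained by the regression.
r_squared = ss_regression / (ss_regression + ss_residual)

significance_f = 7.48476e-06  # reported by Excel, not recomputed here
alpha = 0.01                  # level of significance given in the question
reject_h0 = significance_f < alpha

print(f"F = {f_stat:.4f}, R Square = {r_squared:.4f}, reject H0: {reject_h0}")
```

The recomputed F and R Square should agree with the values Excel reports, which confirms that the table rows were read correctly.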

b) Is the slope coefficient on starting salary statistically significant? You must justify your answer.

The slope coefficient on starting salary in the population regression model is $\beta_2$, the coefficient on Starting.

To test whether the slope coefficient on starting salary is statistically significant, we test the following hypotheses:

$$H_0: \beta_2 = 0 \quad \text{versus} \quad H_a: \beta_2 \neq 0$$

From the Starting row of the Coefficients table in the Excel output, the test statistic is t = 2.112 and the p-value is 0.0417.

Using the level of significance $\alpha = 0.01$, we reject the null hypothesis if the p-value is less than the level of significance.

Here, the p-value is 0.0417, which is not less than 0.01. Hence we do not reject the null hypothesis.

ans: Fail to reject H0. The slope coefficient on starting salary is not statistically significant because the p-value is not less than the level of significance.
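The t statistic in part b) is the estimated coefficient divided by its standard error. A minimal sketch of that arithmetic, with illustrative variable names and the p-value taken directly from the Excel output:

```python
# Starting row of the Coefficients table in the Excel output.
coef_starting = 0.266374185
se_starting = 0.12610443
p_value_starting = 0.041661598  # reported by Excel, not recomputed here

# t Stat = coefficient / standard error.
t_stat_starting = coef_starting / se_starting

alpha = 0.01  # level of significance given in the question
starting_is_significant = p_value_starting < alpha

print(f"t = {t_stat_starting:.3f}, significant at 0.01: {starting_is_significant}")
```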

c) Find the predicted starting salary for a male employee with 7 years of experience who had a starting salary of $10,000.

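From the Coefficients table in the Excel output, the estimated regression equation is

$$\widehat{\text{Salary}} = 27946.5789 + 1665.2516\,\text{Years} + 0.2664\,\text{Starting} - 3285.5410\,\text{Gender}$$

The predicted annual salary for a male employee (Gender = 0) with 7 years of experience (Years = 7) who had a starting salary of $10,000 (Starting = 10, in thousands) is

$$\widehat{\text{Salary}} = 27946.5789 + 1665.2516(7) + 0.2664(10) - 3285.5410(0) \approx 39606.00$$

ans: the predicted annual salary for a male employee with 7 years of experience who had a starting salary of $10,000 is $39,606.00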
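The prediction in part c) is a direct substitution into the fitted equation. The sketch below uses the unrounded coefficients from the Excel output; the dictionary and the function name `predict_salary` are illustrative, not part of the original solution.

```python
# Coefficients copied from the Excel output; names are illustrative.
COEFFICIENTS = {
    "Intercept": 27946.57894,
    "Years": 1665.251558,
    "Starting": 0.266374185,  # Starting is entered in thousands
    "Gender": -3285.541043,   # Male = 0, Female = 1
}


def predict_salary(years, starting, gender):
    """Return the predicted annual salary from the fitted equation."""
    return (
        COEFFICIENTS["Intercept"]
        + COEFFICIENTS["Years"] * years
        + COEFFICIENTS["Starting"] * starting
        + COEFFICIENTS["Gender"] * gender
    )


# Male employee, 7 years with the company, $10,000 starting salary.
predicted = predict_salary(years=7, starting=10, gender=0)
print(f"Predicted annual salary: ${predicted:,.2f}")  # $39,606.00
```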

Note: the question asks for "the predicted starting salary", but this has been changed to "the predicted annual salary" because, as stated in the question, the regression model predicts the employee’s annual salary using their years with the company (Years), their starting salary in thousands (Starting), and their Gender (Male = 0, Female = 1).

d) Calculate the marginal effect of 3 additional work years on salary?

The estimated slope coefficient of Years is 1665.2516. The positive value tells us that annual salary and Years move in the same direction: for each additional work year, the predicted annual salary increases by $1,665.2516, holding the other variables unchanged. Hence, for 3 additional work years, the predicted salary increases by 3 × 1665.2516 = 4995.75.

ans: the marginal effect of 3 additional work years on salary is $4,995.75
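Because the model is linear in Years, the effect of 3 extra years can also be read as the difference between two predictions that differ only in Years. A small sketch of that check follows; the function name and the baseline values (7 years, Starting = 10, male) are chosen only for illustration.

```python
# Coefficients copied from the Excel output; names are illustrative.
B_INTERCEPT = 27946.57894
B_YEARS = 1665.251558
B_STARTING = 0.266374185
B_GENDER = -3285.541043


def predict_salary(years, starting, gender):
    """Return the predicted annual salary from the fitted equation."""
    return B_INTERCEPT + B_YEARS * years + B_STARTING * starting + B_GENDER * gender


extra_years = 3

# Direct calculation: slope on Years times the number of additional years.
marginal_effect = B_YEARS * extra_years

# Same effect as a difference of predictions, other variables held fixed.
difference = predict_salary(7 + extra_years, 10, 0) - predict_salary(7, 10, 0)

print(f"Marginal effect of {extra_years} years: ${marginal_effect:,.2f}")
print(f"Difference in predictions:    ${difference:,.2f}")
```

The two numbers agree for any baseline, since Starting and Gender cancel in the difference.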

e) Do the results provide evidence of gender discrimination? Why or Why not?

There would be evidence of gender discrimination if the slope coefficient of Gender (which is $\beta_3$ in the population regression model) is not equal to 0.

The hypotheses to test this are

$$H_0: \beta_3 = 0 \quad \text{versus} \quad H_a: \beta_3 \neq 0$$

From the Gender row of the Coefficients table in the Excel output, the test statistic is t = -0.585 and the p-value is 0.5623.

Using the level of significance $\alpha = 0.01$, we reject the null hypothesis if the p-value is less than the level of significance.

Here, the p-value is 0.5623, which is not less than 0.01. Hence we do not reject the null hypothesis.

ans: Fail to reject H0. There is insufficient evidence to conclude that the slope coefficient of Gender is not equal to zero. The results do not provide evidence of gender discrimination.
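The same p-value rule can be applied to every slope at once, which puts the Gender result alongside the other predictors. This is a sketch with illustrative names; the coefficients, standard errors, and p-values are copied from the Excel output rather than recomputed.

```python
# (coefficient, standard error, p-value) for each slope, from the Excel output.
slopes = {
    "Years": (1665.251558, 425.0829092, 0.000383313),
    "Starting": (0.266374185, 0.12610443, 0.041661598),
    "Gender": (-3285.541043, 5617.145392, 0.56225464),
}

alpha = 0.01  # level of significance given in the question

decisions = {}
for name, (coef, std_err, p_value) in slopes.items():
    t_stat = coef / std_err  # t Stat = coefficient / standard error
    decisions[name] = "reject H0" if p_value < alpha else "fail to reject H0"
    print(f"{name:<9} t = {t_stat:7.3f}  p = {p_value:.4f}  {decisions[name]}")
```

At the 0.01 level only Years leads to rejecting H0, which matches the conclusions reached in parts b) and e).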

