QUESTION #2
The set of data below shows a random sample of 14 systems analysts who were surveyed in 1978.
Sampled  Years of    Years of Post-secondary          Annual Pay
Person   Experience  Education                Gender  in $1,000s
A        5.5         4.0                      F       29.9
B        9.0         4.0                      M       35.5
C        4.0         5.0                      F       33.9
D        8.0         4.0                      M       34.0
E        9.5         5.0                      M       32.5
F        3.0         4.0                      F       30.5
G        7.0         3.0                      F       31.0
H        1.5         4.5                      F       27.7
I        8.5         5.0                      M       40.0
J        7.5         6.0                      F       35.0
K        9.5         2.0                      M       31.0
L        6.0         2.0                      F       28.6
M        2.5         4.0                      M       30.0
N        1.5         4.5                      M       27.5
A MULTIPLE REGRESSION WAS RUN. Note that the gender variable was coded as a dummy variable, with Female = 1 and Male = 0.
MULTIPLE REGRESSION ANALYSIS
The same sample of 14 systems analysts, with gender coded as a dummy variable (Female = 1, Male = 0):
Sampled  Years of    Years of Post-secondary          Annual Pay
Person   Experience  Education                Gender  in $1,000s
A        5.5         4.0                      F=1     29.9
B        9.0         4.0                      M=0     35.5
C        4.0         5.0                      F=1     33.9
D        8.0         4.0                      M=0     34.0
E        9.5         5.0                      M=0     32.5
F        3.0         4.0                      F=1     30.5
G        7.0         3.0                      F=1     31.0
H        1.5         4.5                      F=1     27.7
I        8.5         5.0                      M=0     40.0
J        7.5         6.0                      F=1     35.0
K        9.5         2.0                      M=0     31.0
L        6.0         2.0                      F=1     28.6
M        2.5         4.0                      M=0     30.0
N        1.5         4.5                      M=0     27.5
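For readers who want to rerun the regression themselves, here is a minimal sketch that fits PAY on EXPRC, EDUC, and GENDER by ordinary least squares, solving the normal equations in plain Python. The `solve` helper and all variable names are illustrative assumptions, not part of the original printout. If the table was transcribed exactly, the printed coefficients should be comparable with the computer output below.

```python
# Data transcribed from the coded table: (experience, education, gender, pay).
# Gender is coded Female = 1, Male = 0. Pay is in $1,000s.
rows = [
    (5.5, 4.0, 1, 29.9), (9.0, 4.0, 0, 35.5), (4.0, 5.0, 1, 33.9),
    (8.0, 4.0, 0, 34.0), (9.5, 5.0, 0, 32.5), (3.0, 4.0, 1, 30.5),
    (7.0, 3.0, 1, 31.0), (1.5, 4.5, 1, 27.7), (8.5, 5.0, 0, 40.0),
    (7.5, 6.0, 1, 35.0), (9.5, 2.0, 0, 31.0), (6.0, 2.0, 1, 28.6),
    (2.5, 4.0, 0, 30.0), (1.5, 4.5, 0, 27.5),
]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (hypothetical helper)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


# Design matrix with a leading 1 for the intercept, and the response vector.
X = [[1.0, exprc, educ, float(gender)] for exprc, educ, gender, _ in rows]
y = [pay for *_, pay in rows]

# Normal equations: (X'X) beta = X'y.
k = len(X[0])
xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
beta = solve(xtx, xty)

fitted = [sum(b * v for b, v in zip(beta, r)) for r in X]
residuals = [yi - fi for yi, fi in zip(y, fitted)]
mean_y = sum(y) / len(y)
sst = sum((yi - mean_y) ** 2 for yi in y)  # total sum of squares
sse = sum(e ** 2 for e in residuals)       # residual sum of squares
r_squared = 1 - sse / sst

for name, b in zip(["Constant", "EXPRC", "EDUC", "GENDER"], beta):
    print(f"{name:8s} {b:.5f}")
print(f"R-Squared {r_squared:.6f}")
```

Solving the normal equations directly is fine for a small, well-behaved problem like this one; a statistics package would also report the standard errors and p-values.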
RESULTS – COMPUTER OUTPUT
SYSTEMS ANALYSTS ANNUAL PAY
REGRESSION FUNCTION & ANOVA FOR PAY
PAY = 20.8779 + 0.801571 EXPRC + 1.595737 EDUC - 0.382572 GENDER
R-Squared = 0.675011
Adjusted R-Squared = 0.577514
Standard error of estimate = 2.251716
Number of cases used = 14
Analysis of Variance
Source       SS          df   MS         F Value   Sig Prob (p-value)
Regression   105.30990    3   35.10330   6.92342   0.008376
Residual      50.70225   10    5.07022
Total        156.01210   13
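As a quick consistency check on the ANOVA table, the mean squares, the F value, and the standard error of estimate can be recomputed from the sums of squares and degrees of freedom. This is only a sketch using the numbers printed above; the variable names are illustrative.

```python
import math

# Values copied from the ANOVA table in the output.
ss_regression, df_regression = 105.30990, 3
ss_residual, df_residual = 50.70225, 10

ms_regression = ss_regression / df_regression  # mean square for regression
ms_residual = ss_residual / df_residual        # mean square error
f_value = ms_regression / ms_residual          # overall F statistic
std_error_of_estimate = math.sqrt(ms_residual)

print(f"MS regression = {ms_regression:.5f}")
print(f"MS residual   = {ms_residual:.5f}")
print(f"F value       = {f_value:.5f}")
print(f"Std error     = {std_error_of_estimate:.6f}")
```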
SYSTEMS ANALYSTS ANNUAL PAY
REGRESSION COEFFICIENTS FOR PAY
                                                 Two-Sided p-value
Variable   Coefficient   Std Error   t Value     Sig Prob
Constant   20.87790      3.06815      6.80472    0.000047
EXPRC       0.80157      0.22847      3.50845    0.005646
EDUC        1.59574      0.56064      2.84626    0.017361
GENDER     -0.38257      1.28741     -0.29716    0.772423 *
* indicates that the variable is marked for leaving
Standard error of estimate = 2.251716
Durbin-Watson statistic = 2.487978
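Each t value in the coefficient table is the coefficient divided by its standard error. The short sketch below recomputes them from the printed figures; small differences in the last digits are expected because the printout rounds the coefficients and standard errors. The dictionary layout is an illustrative choice.

```python
# (coefficient, standard error) pairs copied from the coefficient table.
coefficients = {
    "Constant": (20.87790, 3.06815),
    "EXPRC": (0.80157, 0.22847),
    "EDUC": (1.59574, 0.56064),
    "GENDER": (-0.38257, 1.28741),
}

# t value = coefficient / standard error
t_values = {name: coef / se for name, (coef, se) in coefficients.items()}

for name, t in t_values.items():
    print(f"{name:8s} t = {t:.4f}")
```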
Use the above computer output to answer the questions below.
What is the estimated multiple regression?
At a level of significance of α = 0.05, which variables are statistically significant and which ones are not statistically significant?
Explain the meaning behind R-Squared in this problem.
Does the output point to gender discrimination in pay? Why or why not?
ANSWERS
What is the estimated multiple regression?
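The estimated multiple regression equation, read from the computer output, is:
Estimated PAY = 20.8779 + 0.801571 * EXPRC + 1.595737 * EDUC - 0.382572 * GENDER
where PAY is annual pay in $1,000s, EXPRC is years of experience, EDUC is years of post-secondary education, and GENDER is 1 for female and 0 for male.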
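To see how the equation is used, here is a small sketch that plugs in the values for sampled person A (5.5 years of experience, 4.0 years of education, female) and compares the estimate with the observed pay of 29.9. The function name `predict_pay` is a hypothetical helper introduced only for this illustration.

```python
def predict_pay(experience, education, female):
    """Estimated annual pay in $1,000s from the fitted equation.

    female is the GENDER dummy: 1 for female, 0 for male.
    """
    return 20.8779 + 0.801571 * experience + 1.595737 * education - 0.382572 * female


# Person A from the data table: 5.5 years, 4.0 years, female, observed pay 29.9.
predicted_a = predict_pay(5.5, 4.0, 1)
residual_a = 29.9 - predicted_a  # observed minus estimated

print(f"Estimated pay for A: {predicted_a:.4f}")
print(f"Residual for A:      {residual_a:.4f}")
```

The estimate is about 31.29 (that is, $31,290), so person A's observed pay is about 1.39 (about $1,390) below what the model predicts.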
At a level of significance of α = 0.05, which variables are statistically significant and which ones are not statistically significant?
The p-value for the coefficient of years of experience (EXPRC) is 0.005646, which is less than α = 0.05, so years of experience is statistically significant.
The p-value for the coefficient of years of post-secondary education (EDUC) is 0.017361, which is less than α = 0.05, so years of post-secondary education is statistically significant.
The p-value for the coefficient of gender (GENDER) is 0.772423, which is greater than α = 0.05, so gender is not statistically significant.
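The decision rule applied above (a variable is significant when its p-value is below α) can be written out as a short sketch. The p-values are copied from the coefficient table; the names are illustrative.

```python
ALPHA = 0.05  # level of significance given in the question

# Two-sided p-values copied from the coefficient table.
p_values = {"EXPRC": 0.005646, "EDUC": 0.017361, "GENDER": 0.772423}

# A variable is statistically significant when its p-value is below ALPHA.
significant = {name: p < ALPHA for name, p in p_values.items()}

for name, is_significant in significant.items():
    verdict = "significant" if is_significant else "not significant"
    print(f"{name:7s} p = {p_values[name]:.6f} -> {verdict}")
```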
Explain the meaning behind R-Squared in this problem
The R-squared, or coefficient of determination, for the given regression model is 0.675011, which means that about 67.5% of the variation in the response variable, annual pay, is explained by the explanatory variables years of experience, years of post-secondary education, and gender. The adjusted R-squared of 0.577514 is lower because it penalizes the model for using three explanatory variables with only 14 cases.
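Both figures can be recovered from the ANOVA table. The sketch below uses the standard formulas R² = SS regression / SS total and adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1), with n = 14 cases and k = 3 explanatory variables; the variable names are illustrative.

```python
# Sums of squares copied from the ANOVA table.
ss_regression = 105.30990
ss_total = 156.01210

n = 14  # number of cases used
k = 3   # explanatory variables: EXPRC, EDUC, GENDER

r_squared = ss_regression / ss_total
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(f"R-Squared          = {r_squared:.6f}")
print(f"Adjusted R-Squared = {adjusted_r_squared:.6f}")
```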
Does the output point to gender discrimination in pay? Why or why not?
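No, the given output does not point to gender discrimination in pay. The GENDER coefficient of -0.38257 suggests that, holding experience and education constant, female analysts earn about $383 less per year than male analysts, but its p-value of 0.772423 is far greater than α = 0.05, so this difference is not statistically distinguishable from zero. With only 14 cases, this means the sample provides no evidence of discrimination, not proof that none exists.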
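One way to see why the gender effect is inconclusive is to look at a rough 95% confidence interval for the GENDER coefficient. The sketch below assumes the two-sided 5% critical value t = 2.228 for 10 residual degrees of freedom, taken from a standard t table rather than from the printout.

```python
# Coefficient and standard error for GENDER, copied from the coefficient table.
coef_gender = -0.38257
se_gender = 1.28741

# Assumption: two-sided 5% critical value of t with 10 degrees of freedom,
# read from a standard t table (not part of the computer output).
T_CRIT = 2.228

lower = coef_gender - T_CRIT * se_gender
upper = coef_gender + T_CRIT * se_gender
contains_zero = lower < 0 < upper

print(f"95% CI for GENDER: ({lower:.4f}, {upper:.4f}) in $1,000s")
print("Interval contains zero:", contains_zero)
```

The interval runs from roughly -3.25 to 2.49 (in $1,000s), so the data are consistent with women earning noticeably less, about the same, or somewhat more than men with the same experience and education.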