Consider the case of XYZ company evaluating the performance (Y) of its 25 newly hired employees after the probation period. During the probation period the new employees were given four tests and their scores (X1, X2, X3, X4) were recorded. The following is the R output from modeling the job performance score (Y) against the four test scores (X1, X2, X3 and X4):
##Call:
##lm(formula = Y ~ ., data = df)
##Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
##(Intercept) -124.38182    9.94106 -12.512 6.48e-11 ***
##X1             0.29573    0.04397   6.725 1.52e-06 ***
##X2             0.04829    0.05662   0.853  0.40383
##X3             1.30601    0.16409   7.959 1.26e-07 ***
##X4             0.51982    0.13194   3.940  0.00081 ***
##---
##Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
##Residual standard error: 4.099 on 20 degrees of freedom
##Multiple R-squared: 0.9629, Adjusted R-squared: 0.9555
##F-statistic: 129.7 on 4 and 20 DF, p-value: 5.262e-14
(a) Fill in the ANOVA table.
(b) Give an interpretation for b3.
(c) Consider a new employee with scores of X1 = 75; X2 = 88; X3 = 80 and X4 = 97. We create a 95% confidence interval for the employee’s expected job performance score, and a 95% prediction interval for the employee’s actual performance score. The two intervals (in random order) are:
Which interval is the confidence interval? Interpret the prediction interval.
(d) Test whether both X3 and X4 can be dropped from the regression model given that X1 and X2 are retained. Use alpha = 0.025. State the alternatives, decision rule, and conclusion. What is the p-value of the test? The R output below may be of use:
##> anova(lm(Y~X1+X2, df), lm(Y~.,df))
##Analysis of Variance Table
##Model 1: Y ~ X1 + X2
##Model 2: Y ~ X1 + X2 + X3 + X4
##  Res.Df    RSS Df Sum of Sq      F    Pr(>F)
##1     22 4851.2
##2     20  336.0  2    4515.2 134.39 2.539e-12 ***
##---
##Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The solutions and interpretations are given below.
a)
Solution:
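The entries of the ANOVA table can be recovered from the R output alone. The sketch below is one way to do that in R, assuming only the values printed in the question (n = 25, four predictors, RSS = 336.0 for the full model, and Multiple R-squared = 0.9629). Because R-squared is rounded to four decimals, the sums of squares are approximate, and the F value comes out near 129.8 rather than the reported 129.7. The variable names are illustrative.

```r
# Summary statistics copied from the R output in the question
n   <- 25       # number of employees
p   <- 4        # number of predictors (X1..X4)
sse <- 336.0    # residual sum of squares of the full model (RSS of Model 2)
r2  <- 0.9629   # Multiple R-squared (rounded in the output)

# Degrees of freedom
df_reg <- p
df_err <- n - p - 1   # 20, matches "on 20 degrees of freedom"
df_tot <- n - 1

# Sums of squares, using R^2 = 1 - SSE/SST
sst <- sse / (1 - r2)
ssr <- sst - sse

# Mean squares and overall F statistic
msr    <- ssr / df_reg
mse    <- sse / df_err   # 16.8, the square of the residual standard error
f_stat <- msr / mse

anova_tab <- data.frame(
  Source = c("Regression", "Error", "Total"),
  Df     = c(df_reg, df_err, df_tot),
  SS     = c(ssr, sse, sst),
  MS     = c(msr, mse, NA),
  F      = c(f_stat, NA, NA)
)
print(anova_tab, digits = 5)
```

The error row is exact (SSE = 336.0, MSE = 16.8); the regression and total rows inherit the rounding of R-squared.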
b)
Solution:
Ho: the coefficient of X3 is zero (the explanatory variable X3 is not significant, given X1, X2 and X4 in the model)
H1: the coefficient of X3 is not zero (the explanatory variable X3 is significant)
The p-value for b3 is 1.26e-07 and alpha = 0.05.
Since the p-value is less than alpha, we reject Ho.
Conclusion: the explanatory variable X3 is significant at the 5% level of significance.
Interpretation: the coefficient b3 = 1.30601 means that a one-unit increase in the X3 test score is associated with an increase of about 1.306 units in the expected performance score, holding X1, X2 and X4 constant.
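The t value and p-value for b3 in the coefficient table can be checked by hand. A minimal R sketch, assuming the estimate, standard error and residual degrees of freedom printed in the R output (the variable names are illustrative):

```r
b3    <- 1.30601   # estimate for X3 from the coefficient table
se_b3 <- 0.16409   # its standard error
df    <- 20        # residual degrees of freedom

t_stat  <- b3 / se_b3                 # about 7.959, as in the table
p_value <- 2 * pt(-abs(t_stat), df)   # two-sided p-value from the t distribution

print(c(t = t_stat, p = p_value))
p_value < 0.05                        # TRUE means Ho is rejected at the 5% level
```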
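(c)
Solution: X1 = 75; X2 = 88; X3 = 80 and X4 = 97.
The predicted performance score is
Y = -124.38182 + 75*0.29573 + 88*0.04829 + 80*1.30601 + 97*0.51982
Y = 56.95079
Both intervals are centred at this fitted value. The narrower of the two intervals is the 95% confidence interval for the expected performance score. The wider one is the 95% prediction interval, because it also includes the variation of an individual employee's score around the mean (residual standard error 4.099).
Interpretation of the prediction interval: we are 95% confident that the actual performance score of a new employee with these four test scores lies within the wider interval.
The 95% confidence interval for the coefficient b3 uses t-critical = t(0.975; 20) = 2.086 and S.E. = 0.16409:
95% Lower limit = coefficient - t-critical*S.E. = 0.9637
95% Upper limit = coefficient + t-critical*S.E. = 1.6483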
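The arithmetic for the fitted value and for a 95% confidence interval for the coefficient b3 can be reproduced in R. The sketch below assumes the coefficients from the R output and the 20 residual degrees of freedom reported there; the variable names are illustrative. The confidence and prediction intervals for the new employee themselves cannot be recomputed from the printed output, since they need the original data.

```r
# Coefficients from the R output: intercept, X1, X2, X3, X4
coefs <- c(-124.38182, 0.29573, 0.04829, 1.30601, 0.51982)
x_new <- c(1, 75, 88, 80, 97)   # the leading 1 multiplies the intercept

y_hat <- sum(coefs * x_new)     # fitted value, about 56.95

# 95% confidence interval for the coefficient of X3
t_crit <- qt(0.975, df = 20)    # about 2.086
ci_b3  <- 1.30601 + c(-1, 1) * t_crit * 0.16409

print(y_hat)
print(ci_b3)

# With the original data frame (called df in the question) one would use:
#   fit <- lm(Y ~ ., data = df)
#   new <- data.frame(X1 = 75, X2 = 88, X3 = 80, X4 = 97)
#   predict(fit, newdata = new, interval = "confidence", level = 0.95)
#   predict(fit, newdata = new, interval = "prediction", level = 0.95)
```

The prediction interval from `predict` is always wider than the confidence interval at the same point, which is how the two intervals in the question can be told apart.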
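d)
Solution: Ho: the coefficients of X3 and X4 are both zero (X3 and X4 can be dropped, given that X1 and X2 are retained)
H1: at least one of the coefficients of X3 and X4 is not zero
Analysis of Variance Table
Model 1: Y ~ X1 + X2 (reduced model), RSS = 4851.2 on 22 degrees of freedom
Model 2: Y ~ X1 + X2 + X3 + X4 (full model), RSS = 336.0 on 20 degrees of freedom
Test statistic: Fstat = [(4851.2 - 336.0)/2] / (336.0/20) = 2257.6/16.8 = 134.38 (R reports 134.39)
Decision rule: with alpha = 0.025, reject Ho if Fstat is greater than F-critical = F(0.975; 2, 20) = 4.46.
Fstat is greater than F-critical, so we reject Ho.
The p-value of the test is 2.539e-12, which is less than alpha = 0.025, so the same decision follows.
Conclusion: X3 and X4 cannot both be dropped from the model; at least one of them is significant at the 0.025 level of significance when X1 and X2 are retained.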
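The partial F test for dropping X3 and X4 can be reproduced from the two residual sums of squares in the anova output. A minimal R sketch, assuming the RSS and degrees of freedom printed there (the variable names are illustrative):

```r
# Values from the anova() output in the question
sse_reduced <- 4851.2; df_reduced <- 22   # Model 1: Y ~ X1 + X2
sse_full    <- 336.0;  df_full    <- 20   # Model 2: Y ~ X1 + X2 + X3 + X4
alpha       <- 0.025

df_num <- df_reduced - df_full            # 2 coefficients tested (X3 and X4)

# Partial F statistic: extra sum of squares per df over the full-model MSE
f_stat <- ((sse_reduced - sse_full) / df_num) / (sse_full / df_full)

f_crit  <- qf(1 - alpha, df_num, df_full)                 # decision threshold
p_value <- pf(f_stat, df_num, df_full, lower.tail = FALSE)

print(c(F = f_stat, F_crit = f_crit, p = p_value))
reject_h0 <- f_stat > f_crit   # TRUE: X3 and X4 cannot both be dropped
print(reject_h0)
```

The statistic computed this way is about 134.38; R reports 134.39 because it works from unrounded sums of squares.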