A personnel specialist with a large accounting firm is interested in determining the effect of seniority (the number of years with the company) on hourly wages for secretaries. She selects a random sample of 10 secretaries and collects the following data on their years with the company (X) and hourly wages (Y):
X Y
0 12
2 13
3 14
6 16
5 15
3 14
4 13
1 12
1 15
2 15
Given that,
X = 0, 2, 3, 6, 5, 3, 4, 1, 1, 2
Y = 12, 13, 14, 16, 15, 14, 13, 12, 15, 15
a) The scatter plot of Y against X is drawn by plot(x,y) in the R code below.
b) The regression slope is 0.4579 and the Y-intercept is 12.6636, so the fitted line is
Y = 0.4579*X + 12.6636
c) The regression line is added to the scatter plot by the abline() call in the R code below.
d) The predicted hourly wage of a randomly selected secretary who has been with the company for 4 years is
0.4579*4 + 12.6636 = 14.4952, or 14.49533 with the unrounded coefficients, i.e. about 14.50 per hour.
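e) The coefficient of determination is
R^2 = SSR/SST = sum((yhat - mean(yhat))^2) / sum((y - mean(y))^2) = 6.731776/16.9 = 0.3983 ~ 0.40, where yhat denotes the fitted values, SSR is the regression sum of squares and SST is the total sum of squares. This means that the model explains only about 40% of the variation in Y through X.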
f) The typical increase in hourly wage for each additional year on the job is the regression slope, 0.4579 (about 0.46 per year).
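As a check on parts b), d) and f), the slope and intercept can also be obtained from the usual least-squares formulas, slope = Sxy/Sxx and intercept = mean(y) - slope*mean(x), without calling lm(). The following is only a minimal sketch; the names Sxx, Sxy, slope, intercept and wage_at_4 are chosen here for illustration and do not appear in the original R code.

```r
# Data from the problem statement
x <- c(0, 2, 3, 6, 5, 3, 4, 1, 1, 2)
y <- c(12, 13, 14, 16, 15, 14, 13, 12, 15, 15)

# Centered sum of squares of x and sum of cross-products of x and y
Sxx <- sum((x - mean(x))^2)                # 32.1
Sxy <- sum((x - mean(x)) * (y - mean(y)))  # 14.7

# Least-squares slope and intercept
slope <- Sxy / Sxx                         # about 0.4579
intercept <- mean(y) - slope * mean(x)     # about 12.6636

# Predicted hourly wage after 4 years with the company
wage_at_4 <- intercept + slope * 4         # about 14.4953
```

The values should agree with the lm() output up to rounding, since lm() solves the same least-squares problem.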
Note: I have done this problem in R, so I am attaching my R code for your reference below:
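# Years with the company (x) and hourly wages (y)
x <- c(0,2,3,6,5,3,4,1,1,2)
y <- c(12,13,14,16,15,14,13,12,15,15)
plot(x, y)                                      # (a) scatter plot
model <- lm(y ~ x)                              # (b) least-squares fit
abline(model)                                   # (c) add the fitted line to the plot
predict_4 <- predict(model, data.frame(x = 4))  # (d) predicted wage at 4 years
y1 <- fitted(model)                             # fitted values at the observed x
SSR <- sum((y1 - mean(y1))^2)                   # regression sum of squares
SST <- sum((y - mean(y))^2)                     # total sum of squares (not the residual sum of squares)
coefficient_determination <- SSR / SST          # (e) R^2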
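
The coefficient of determination in part e) can be cross-checked in a few equivalent ways. The sketch below is self-contained and compares the sum-of-squares ratio with the value reported by summary() and with the squared sample correlation, which coincide for a simple linear regression with an intercept. The names SSE, r2_from_sums, r2_from_lm and r2_from_cor are illustrative and are not part of the code above.

```r
# Data from the problem statement
x <- c(0, 2, 3, 6, 5, 3, 4, 1, 1, 2)
y <- c(12, 13, 14, 16, 15, 14, 13, 12, 15, 15)
model <- lm(y ~ x)

SST <- sum((y - mean(y))^2)       # total sum of squares (16.9)
SSE <- sum(residuals(model)^2)    # residual (error) sum of squares
SSR <- SST - SSE                  # regression sum of squares

r2_from_sums <- SSR / SST                 # ratio used in part e)
r2_from_lm   <- summary(model)$r.squared  # value reported by lm()
r2_from_cor  <- cor(x, y)^2               # squared sample correlation

# All three should be about 0.3983
print(c(r2_from_sums, r2_from_lm, r2_from_cor))
```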