The Human Resources department at a manufacturing company is concerned about unauthorized absences of its employees. As part of its research, the department decided to investigate the relationship between the number of days per year that employees are absent without authorization and the distance (in miles) between home and work. A sample of 10 employees was chosen, and the following data were collected. Perform an appropriate analysis on the data at a 0.01 level of significance. Answer the following questions.
Problem 1a) State clearly what the independent and dependent variables are in the problem.
Problem 1b) Develop a scatter diagram. Based on the scatter diagram, do you expect a relationship between the variables? Indicate the expected relationship type, if any.
Problem 1c) Write down the estimated regression equation (use variable names along with coefficients, not just x and y).
Distance to Work (miles) | Number of Days Absent
1 | 8
3 | 5
4 | 8
6 | 7
8 | 6
10 | 3
12 | 5
14 | 2
14 | 4
18 | 2
Based on the given data,
1a) A change in distance to work is presumed to cause a change in the number of absent days, and not vice versa. The dependent variable is therefore 'No. of absent days' and the independent variable is 'Distance to work'.
1b) Using Excel to plot the number of absent days against distance to work, the scatter plot suggests a negative (inverse) relationship between the two variables: as the distance to work increases, the number of absent days tends to decrease.
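The direction suggested by the scatter plot can be cross-checked numerically. The following is a minimal sketch, not part of the original Excel work, that computes the Pearson correlation coefficient from the table above; the names `distance`, `absent`, and `pearson_r` are chosen here for illustration.

```python
import math

# Data from the table above (assumed pairing: one row per employee)
distance = [1, 3, 4, 6, 8, 10, 12, 14, 14, 18]  # miles
absent = [8, 5, 8, 7, 6, 3, 5, 2, 4, 2]         # days per year


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(distance, absent)
print(f"r = {r:.3f}")  # negative value, consistent with an inverse relationship
```

A coefficient close to -1 indicates a strong negative linear association, which agrees with the pattern described for the scatter plot.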
1c) Running a regression in Excel:
From the regression output, the p-value of the F test is 0.002 < 0.01, which implies that the regression model is significant. The intercept and slope coefficients are also significant, with p-values of 0.000 < 0.01 and 0.002 < 0.01, respectively. The estimated regression equation can thus be expressed as:
No. of absent days = 8.098 - 0.344 (Distance to work)
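The coefficients can be reproduced without Excel using the ordinary least squares formulas, slope = Sxy / Sxx and intercept = mean(y) - slope × mean(x). The sketch below assumes the data pairing shown in the table; the function name `fit_line` is hypothetical.

```python
# Data from the table above (assumed pairing: one row per employee)
distance = [1, 3, 4, 6, 8, 10, 12, 14, 14, 18]  # miles (independent variable)
absent = [8, 5, 8, 7, 6, 3, 5, 2, 4, 2]         # days per year (dependent variable)


def fit_line(x, y):
    """Return (intercept, slope) of the simple least squares line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(distance, absent)
print(f"No. of absent days = {intercept:.3f} + ({slope:.3f}) * Distance to work")
```

Rounded to three decimals, this gives an intercept of 8.098 and a slope of -0.344, matching the equation above.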
From the goodness-of-fit measure R², we may conclude that distance to work explains about 71.1% of the variation in the number of absent days.
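As a rough check on the R² value and the F test, the sums of squares can be computed directly. This is a sketch under the same data assumptions as the table. The critical value 11.26 for F(1, 8) at the 0.01 level comes from a standard F table and is not part of the original Excel output; the variable names are illustrative.

```python
# Data from the table above (assumed pairing: one row per employee)
distance = [1, 3, 4, 6, 8, 10, 12, 14, 14, 18]  # miles
absent = [8, 5, 8, 7, 6, 3, 5, 2, 4, 2]         # days per year

n = len(distance)
mean_x = sum(distance) / n
mean_y = sum(absent) / n

# Least squares coefficients
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distance, absent))
sxx = sum((x - mean_x) ** 2 for x in distance)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Sums of squares: total, error (residual), and regression
sst = sum((y - mean_y) ** 2 for y in absent)
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(distance, absent))
ssr = sst - sse

r_squared = ssr / sst
f_stat = (ssr / 1) / (sse / (n - 2))  # 1 and n - 2 degrees of freedom

# Assumption: critical value taken from a standard F table, alpha = 0.01, df = (1, 8)
F_CRITICAL_01 = 11.26

print(f"R^2 = {r_squared:.3f}")
print(f"F = {f_stat:.2f}, significant at 0.01: {f_stat > F_CRITICAL_01}")
```

Because the F statistic exceeds the tabulated critical value, the conclusion agrees with the p-value comparison used in the answer.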