In: Statistics and Probability
The CEO of an organization read a research article describing the importance of generational influence on work motivation. He is now interested in determining if the distribution of Baby Boomer, Generation X, and Millennial employees is the same across engineering and management positions in his company. Staff in the HR department gather records and provide him with the following information:
Employee Age Group |
Number of Managers |
Number of Engineers |
Baby Boomers |
28 |
18 |
Generation X'ers |
36 |
34 |
Millennials |
21 |
45 |
(a)
Null hypothesis is
Alternative hypothesis is
(b)
We have to perform Chi-squared test for independence of variables.
(c)
Here, number of rows and number of columns
So our test statistic will be
where,
is observed cell frequency of i-th row and j-th column.
is observed cell frequency of i-th row and j-th column.
(d)
The variables are
To draw conclusion about dependency or independency we have to perform Chi-squared test as follows.
Observed frequency | Number of Employees | |||
Managers | Engineers | Row total | ||
Employee Age Group |
Baby Boomer's | 28 | 18 | 46 |
Generation X'ers | 36 | 34 | 70 | |
Millennials | 21 | 45 | 66 | |
Column total | 85 | 97 | 182 |
The contingency table of expected frequencies is as follows.
Expected frequency | Number of Employees | |||
Managers | Engineers | Row total | ||
Employee Age Group |
Baby Boomer's | 85*46/182 = 21.5 | 97*46/182 = 24.5 | 46 |
Generation X'ers | 85*70/182 = 32.7 | 97*70/182 = 37.3 | 70 | |
Millennials | 85*66/182 = 30.8 | 97*66/182 = 35.2 | 66 | |
Column total | 85 | 97 | 182 |
Degrees of freedom
[Using R-code '1-pchisq(10.16118,2)']
We reject our null hypothesis if , level of significance.
We generally test for level of significance 0.10, 0.05, 0.01 or something like these.
So, we reject our null hypothesis.
Hence, based on the given data we can conclude that the variables are not independent.
So, the variables are dependent.
(e)
Degrees of freedom associated with the test statistic is 2.