Stephanie wants to know whether there is a relationship between gender and area of interest. She randomly sampled some students and collected the following data. Her objective is to find out whether gender influences the choice of field of study.
Gender | Science | Arts | Business | Total |
Male | 27 | 28 | 45 | 100 |
Female | 20 | 41 | 54 | 115 |
Total | 47 | 69 | 99 | 215 |
1. How many degrees of freedom should she use for the Chi-Square test?
2. What is the Chi-Squared statistic for this test?
3. What is the P-value for this test? Use Excel's CHISQ.DIST function.
4. Based on the Chi-Squared test for Independence, can you say that the gender and the choice of study field are independent?
Solution:
Given: We have to test whether there is a relationship between gender and area of interest, that is, whether gender influences the choice of field of study.
Thus the hypotheses of the study are:
H0: the gender and the choice of study field are independent
Vs
H1: the gender and the choice of study field are not independent
Gender | Science | Arts | Business | Total |
Male | 27 | 28 | 45 | 100 |
Female | 20 | 41 | 54 | 115 |
Total | 47 | 69 | 99 | 215 |
Part 1. How many degrees of freedom should she use for the Chi-Square test?
df = ( R - 1) X ( C - 1)
R = Number of rows = 2
C = Number of Columns = 3
Thus
df = ( R - 1) X ( C - 1)
df = ( 2 - 1) X ( 3 - 1)
df = ( 1) X ( 2)
df = 2
Thus degrees of freedom = 2
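As a quick check of the count above, a minimal Python sketch can derive the degrees of freedom from the shape of the observed table. The variable names are illustrative only and are not part of the original solution.

```python
# Observed counts from the table above.
# Rows: Male, Female. Columns: Science, Arts, Business.
observed = [
    [27, 28, 45],
    [20, 41, 54],
]

rows = len(observed)        # R = number of rows
cols = len(observed[0])     # C = number of columns

# df = (R - 1) x (C - 1); totals are not counted as a row or column
df = (rows - 1) * (cols - 1)
print("degrees of freedom =", df)
```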
Part 2. What is the Chi-Squared statistic for this test?
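Formula:
Chi-Square = Σ ( Oij² / Eij ) - N
where
Oij = Observed frequency for the ith row and jth column.
Eij = Expected frequency for the ith row and jth column = ( ith row total X jth column total ) / N
N = Total number of observations = 215
Thus we need to make the following table in column format:
Oij | Eij | Oij² / Eij |
27 | 21.8605 | 33.348 |
28 | 32.0930 | 24.429 |
45 | 46.0465 | 43.977 |
20 | 25.1395 | 15.911 |
41 | 36.9070 | 45.547 |
54 | 52.9535 | 55.067 |
N = 215 | | Σ Oij² / Eij = 218.279 |
Thus
Chi-Square = 218.279 - 215
Chi-Square = 3.279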
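The table can be reproduced with a short Python sketch, assuming each expected frequency is (row total X column total) / N. It computes the statistic two ways: with the shortcut sum of Oij² / Eij minus N that matches the table columns, and with the usual definition, the sum of (Oij - Eij)² / Eij. All names are illustrative.

```python
# Observed counts. Rows: Male, Female. Columns: Science, Arts, Business.
observed = [
    [27, 28, 45],
    [20, 41, 54],
]

row_totals = [sum(row) for row in observed]          # 100, 115
col_totals = [sum(col) for col in zip(*observed)]    # 47, 69, 99
n = sum(row_totals)                                  # 215

# Expected frequency under independence: row total * column total / N
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Shortcut form used in the table: sum(O^2 / E) - N
chi_sq_shortcut = sum(
    o * o / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
) - n

# Definition form: sum((O - E)^2 / E); algebraically the same value
chi_sq_definition = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

print(f"Chi-Square (shortcut)   = {chi_sq_shortcut:.3f}")
print(f"Chi-Square (definition) = {chi_sq_definition:.3f}")
```

Both forms should agree up to floating-point rounding, which is a useful check on the hand-computed column.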
Part 3. What is P-value for this test? Use Excel's CHISQ.DIST function.
The Chi-Square test for independence is always a right-tailed test, hence we use the following Excel command:
=CHISQ.DIST.RT( x , df )
=CHISQ.DIST.RT( 3.279 , 2)
=0.1941
Thus P-value = 0.1941
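The Excel result can be cross-checked without Excel. For df = 2 specifically, the right-tail probability of the Chi-Square distribution has the closed form exp( -x / 2 ), so a minimal Python sketch needs only the standard library. This shortcut holds only for df = 2; for other degrees of freedom a statistics library (for example `scipy.stats.chi2.sf`, assuming SciPy is installed) would be needed instead.

```python
import math

chi_sq = 3.279   # statistic from Part 2, rounded to three decimals
df = 2           # degrees of freedom from Part 1

# Closed-form right-tail probability, valid only when df == 2
if df != 2:
    raise ValueError("exp(-x/2) is the right-tail probability only for df = 2")

p_value = math.exp(-chi_sq / 2)
print(f"P-value = {p_value:.4f}")
```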
Part 4. Based on the Chi-Squared test for Independence, can you say that the gender and the choice of study field are independent?
Since P-value = 0.1941 > 0.05 significance level, we fail to reject the null hypothesis H0.
Thus we conclude that there is not sufficient evidence that gender and the choice of study field are related; the data are consistent with independence.