This chapter extends hypothesis testing to analyze differences between population proportions based on two or more samples, and to test the hypothesis of independence in the joint responses to two categorical variables. Can you provide a real-world example of using the chi-square test, along with the expected outcomes?
Chi-Square Test - It is used to test whether two (nominal) categorical variables observed in a sample are likely to be associated in the population the sample was drawn from. The test is also known as Pearson's Chi-Square Test.
It has the following hypotheses:
Null hypothesis (H0) - There is no relationship between the categorical variables in the population.
Alternative hypothesis (H1) - There is a relationship between the categorical variables in the population.
Symbol - χ²
Formula - χ² = Σ (observed value - expected value)² / expected value, where the sum runs over all cells of the table
Example - Suppose we survey people of two zodiac signs (Leo and Aries) about which of two colours (yellow or blue) they prefer. We record the frequency of each combination to find out whether a person's zodiac sign is related to their colour preference. Result Table :
Survey | Yellow | Blue | Total |
Leo | 60 | 30 | 90 |
Aries | 40 | 70 | 110 |
Total | 100 | 100 | 200 |
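Solution: The expected value for each cell = (row total * column total)/total
For the (Leo/yellow) cell - (90*100)/200 = 45
For the (Leo/blue) cell - (90*100)/200 = 45
For the (Aries/yellow) cell - (110*100)/200 = 55
For the (Aries/blue) cell - (110*100)/200 = 55
Calculated - Each cell contributes (observed value - expected value)²/expected value. For the (Leo/yellow) cell this is (60 - 45)²/45 = 5. Summing over all four cells: χ² = (60 - 45)²/45 + (30 - 45)²/45 + (40 - 55)²/55 + (70 - 55)²/55 = 5 + 5 + 4.09 + 4.09 ≈ 18.18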
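The arithmetic for the survey table can be checked with a short script. This is only a minimal sketch in plain Python: the variable names (observed, expected, chi_square) are chosen here for illustration, and the statistic is summed over all four cells of the table rather than taken from the (Leo/yellow) cell alone.

```python
# Observed counts from the survey table (rows: Leo, Aries; columns: Yellow, Blue).
observed = [[60, 30],
            [40, 70]]

row_totals = [sum(row) for row in observed]         # totals for Leo and Aries
col_totals = [sum(col) for col in zip(*observed)]   # totals for Yellow and Blue
grand_total = sum(row_totals)

# Expected count under independence: (row total * column total) / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic: (observed - expected)^2 / expected, summed over every cell.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

print(expected)              # expected counts per cell
print(round(chi_square, 2))  # calculated chi-square statistic
```

Every cell contributes to the statistic, so a single cell's value is not enough to compare against the table.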
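Tabulated - The critical value is looked up in the chi-square table using df (degrees of freedom = (rows - 1)(cols - 1)) and alpha = 0.05 (level of significance). For this 2 x 2 table, df = (2 - 1)(2 - 1) = 1, which gives a critical value of 3.841.
Interpretation - 1) If calculated > tabulated, the sample gives evidence of a relationship, so we reject the null hypothesis
2) If calculated <= tabulated, there is not enough evidence of a relationship, so we fail to reject the null hypothesis
For the example, the calculated χ² summed over all four cells is about 18.18, which exceeds 3.841, so we reject the null hypothesis and conclude that zodiac sign and colour preference are related in this sample.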
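The decision step can be sketched in code as well. The snippet below assumes the example's statistic summed over all four cells (about 18.18) and the tabulated critical value for df = 1 at alpha = 0.05 (3.841). The null hypothesis is rejected when the calculated value exceeds the tabulated one. The p-value shortcut via math.erfc is valid only for df = 1, where a chi-square variable is the square of a standard normal variable.

```python
import math

chi_square = 18.18       # calculated statistic for the example (rounded)
critical_value = 3.841   # tabulated chi-square value for df = 1, alpha = 0.05
alpha = 0.05

# Decision rule: reject H0 when the calculated statistic exceeds the critical value.
reject_null = chi_square > critical_value

# For df = 1 only: P(chi-square > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

if reject_null:
    print("Reject H0: the sample suggests a relationship.")
else:
    print("Fail to reject H0: no evidence of a relationship.")
print(p_value < alpha)
```

Comparing the p-value with alpha leads to the same decision as comparing the statistic with the critical value.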
The chi-square (χ²) test can also be run in SPSS, R, Excel and Python.
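As one possible illustration in Python, the sketch below uses scipy.stats.chi2_contingency on the survey table. It assumes SciPy is installed. Passing correction=False turns off the Yates continuity correction that SciPy applies to 2 x 2 tables by default, so the result matches the uncorrected Pearson formula used in this answer.

```python
from scipy.stats import chi2_contingency

# Observed counts (rows: Leo, Aries; columns: Yellow, Blue).
observed = [[60, 30],
            [40, 70]]

# correction=False gives the plain Pearson statistic (no Yates correction).
chi2_stat, p_value, dof, expected = chi2_contingency(observed, correction=False)

print(round(chi2_stat, 2))  # calculated chi-square statistic
print(dof)                  # degrees of freedom
print(p_value < 0.05)       # True means H0 is rejected at alpha = 0.05
```

With the default correction=True the statistic for a 2 x 2 table comes out smaller than the hand-calculated value, so the setting is worth stating whenever results are reported.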