In: Statistics and Probability
If the chi-square test is used to test categorical variables, what are those numbers in the contingency table? What do they represent? How are they calculated?
Ans:
The Chi-Square test of independence is used to determine whether there is a significant relationship between two nominal (categorical) variables. The frequency of each category of one nominal variable is compared across the categories of the second nominal variable. The data can be displayed in a contingency table where each row represents a category of one variable and each column represents a category of the other variable. Each number in the table is an observed frequency: the count of cases in the sample that fall into that particular combination of row category and column category.
For example, say a researcher wants to examine the relationship between gender (male vs. female) and empathy (high vs. low). The chi-square test of independence can be used to examine this relationship.
The null hypothesis for this test is that there is no relationship between gender and empathy. The alternative hypothesis is that there is a relationship between gender and empathy (e.g. the proportion of high-empathy individuals differs between females and males).
How to calculate the chi-square statistic:
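First, we calculate the expected value for each cell of the contingency table, that is, the count we would expect in that cell if the two nominal variables were independent. The expected value is calculated using this formula:
$$E_{k,i} = \frac{R_k \times C_i}{N}$$
Where
$E_{k,i}$ = expected value for the cell in the kth row and ith column
$C_i$ = sum of the ith column
$R_k$ = sum of the kth row
N = total number of observations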
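As a minimal sketch of the expected-value step, the Python snippet below applies the row-total times column-total over N rule to a 2x2 table. The counts, the variable name `observed`, and the helper `expected_counts` are hypothetical and chosen only for illustration; they are not taken from any real study.

```python
# Hypothetical observed counts (illustrative only).
# Rows: gender (male, female); columns: empathy (high, low).
observed = [
    [20, 30],
    [35, 15],
]


def expected_counts(table):
    """Return the expected count for every cell: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)  # total number of observations
    return [[r * c / n for c in col_totals] for r in row_totals]


expected = expected_counts(observed)
for row in expected:
    print(row)
# prints [27.5, 22.5] twice: both rows have a total of 50,
# and the column totals are 55 and 45 out of N = 100
```

Note that the expected values need not be whole numbers, even though the observed counts always are.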
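After calculating the expected values, we apply the following formula to calculate the value of the Chi-Square test of Independence, summing over every cell (row k, column i) of the table:
$$\chi^2 = \sum_{k} \sum_{i} \frac{(O_{k,i} - E_{k,i})^2}{E_{k,i}}$$
Where
$\chi^2$ = Chi-Square test of Independence
$O_{k,i}$ = observed value of the two nominal variables (the count in row k, column i)
$E_{k,i}$ = expected value of the two nominal variables (for the same cell)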
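The statistic can be sketched in a few lines of Python. This is an illustrative example under stated assumptions: the table `observed` holds hypothetical counts, and `chi_square_statistic` is a helper name introduced here, not a library function.

```python
# Hypothetical observed counts (illustrative only).
# Rows: gender (male, female); columns: empathy (high, low).
observed = [
    [20, 30],
    [35, 15],
]


def chi_square_statistic(table):
    """Sum (observed - expected)^2 / expected over all cells of the table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)  # total number of observations

    statistic = 0.0
    for k, row in enumerate(table):
        for i, obs in enumerate(row):
            exp = row_totals[k] * col_totals[i] / n  # expected count for this cell
            statistic += (obs - exp) ** 2 / exp
    return statistic


stat = chi_square_statistic(observed)
print(round(stat, 3))  # prints 9.091
```

The larger the gap between observed and expected counts, the larger the statistic; a table whose observed counts match the expected counts exactly gives a statistic of 0.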
The degrees of freedom are calculated using the following formula:
DF = (r-1)(c-1)
Where
DF = Degrees of freedom
r = number of rows
c = number of columns
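A small Python sketch of the degrees-of-freedom formula follows, together with one way it is typically used: looking up a critical value to judge significance. The helper names are hypothetical, the 0.05 significance level is an assumption, and the critical values are the standard chi-square table entries for 1 to 4 degrees of freedom.

```python
def degrees_of_freedom(table):
    """DF = (r - 1)(c - 1) for a table with r rows and c columns."""
    r = len(table)
    c = len(table[0])
    return (r - 1) * (c - 1)


# Standard chi-square critical values at the 0.05 level (assumed alpha),
# keyed by degrees of freedom.
CRITICAL_VALUES_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def is_significant(statistic, df):
    """Return True if the statistic exceeds the 0.05 critical value for df."""
    return statistic > CRITICAL_VALUES_05[df]


# Hypothetical 2x2 table (gender by empathy), counts for illustration only.
table = [
    [20, 30],
    [35, 15],
]
df = degrees_of_freedom(table)
print(df)  # prints 1
```

For instance, a hypothetical statistic of 9.09 with DF = 1 exceeds 3.841, so `is_significant(9.09, 1)` returns True and the null hypothesis of no relationship would be rejected at the 0.05 level.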