Cross tabulation is used to test for association between categorical variables, i.e., nominal or ordinal ones. To do so, a chi-square test is performed on a contingency table. The table holds the observed counts. From these we first calculate the expected count for each cell under the null hypothesis of no association, which is (row total × column total) / (grand total). We then compare the observed and expected counts: for each cell we compute (Observed Value − Expected Value)² / (Expected Value), and we sum these values over all cells to get the chi-square test statistic. Finally, at a suitably chosen significance level, we use the p-value to decide whether to reject or fail to reject the null hypothesis.
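As a rough illustration of the calculation just described, here is a minimal Python sketch. The function name chi_square_statistic and the 2x2 table of counts are made up for this example, and the code assumes every row and column total is greater than zero. It computes only the test statistic and the degrees of freedom, not the p-value.

```python
def chi_square_statistic(observed):
    """Return (statistic, degrees_of_freedom, expected) for a table of counts.

    `observed` is a list of rows, each row a list of observed counts.
    Assumes every row total and column total is greater than zero.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # Expected count under no association: row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Sum (observed - expected)^2 / expected over every cell of the table.
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # Degrees of freedom for an r x c table: (r - 1) * (c - 1).
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected


# Hypothetical observed counts for two categorical variables with two levels each.
observed = [[10, 20],
            [20, 10]]
statistic, dof, expected = chi_square_statistic(observed)
print("chi-square statistic:", statistic)
print("degrees of freedom:", dof)
print("expected counts:", expected)
```

The p-value is then the upper-tail probability of a chi-square distribution with the computed degrees of freedom, which can be read from a chi-square table or obtained from statistical software.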
In statistical testing the data should come from a random sample; otherwise the test result cannot be trusted. Independence of the observations is a critical assumption for the chi-square test of association. The test cannot be performed when the categories of a variable overlap, so each observation must fall into one and only one category. The sample should also be large enough that every cell of the table has a reasonable expected count. If the expected counts are too low, the p-value for the test may not be accurate.
Thus, it is important to assess the minimum observed and expected counts in a cross tabulation. The chi-square test relies on a large-sample approximation, and when the expected counts are too small that approximation is poor, so the p-value may be inaccurate. If the p-value is inaccurate, the decision to reject or not reject the null hypothesis is affected, and the conclusion of the chi-square test becomes unreliable.
If the expected count for a category is too low, you may be able to combine that category with adjacent categories to reach the minimum expected count.
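The following Python sketch shows one way such a check and merge might look. The threshold of 5 is a commonly quoted rule of thumb and is an assumption here, since the text above does not give a specific minimum. The function names and the table of counts are also made up for illustration.

```python
def expected_counts(observed):
    """Expected counts under no association for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def merge_adjacent_columns(observed, left):
    """Combine column `left` with column `left + 1` by adding their counts."""
    return [
        row[:left] + [row[left] + row[left + 1]] + row[left + 2:]
        for row in observed
    ]


MIN_EXPECTED = 5  # assumed rule-of-thumb threshold, not taken from the text

# Hypothetical table: the third column is a sparse category.
observed = [[20, 15, 2],
            [25, 10, 1]]
smallest_before = min(min(row) for row in expected_counts(observed))
print("smallest expected count before merging:", smallest_before)

# Merge the sparse third column into the adjacent second column.
merged = merge_adjacent_columns(observed, 1)
smallest_after = min(min(row) for row in expected_counts(merged))
print("merged table:", merged)
print("smallest expected count after merging:", smallest_after)
print("meets threshold:", smallest_after >= MIN_EXPECTED)
```

Merging only makes sense when the combined category is still meaningful, for example neighbouring levels of an ordinal variable. Note that it also reduces the degrees of freedom of the test.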