What is a disadvantage of Cramer’s V when it comes to measuring the strength of the association between two nominal scale variables?
Cramer's V
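One final chi-square-based measure of association that can be used is Cramer’s V. This measure is defined as
$$V = \sqrt{\frac{\chi^2}{n\,t}}$$
where $\chi^2$ is the chi-square statistic for the table, $n$ is the sample size, and $t$ is the smaller of the number of rows minus one and the number of columns minus one.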
By using the information concerning the dimensions of the table, Cramer’s V corrects for the problem that measures of association for tables of different dimensions may be difficult to compare directly. Cramer’s V equals 0 when there is no relationship between the two variables, and generally has a maximum value of 1, regardless of the dimensions of the table or the sample size. This makes it possible to use Cramer’s V to compare the strength of association between any two cross-classification tables. A table with a larger value of Cramer’s V can be considered to have a stronger relationship between the variables, and a table with a smaller value of V a weaker relationship.
Cramer's V (V) is a variation of the phi coefficient. However, you can use this measure with any size table if at least one of the variables in a particular contingency table is nominal. When you calculate this statistic for a 2 by 2 table, the result is the same as the phi coefficient. Thus, the V value for our example is also .78. Similar to phi, its values range between 0 and 1.
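As an illustration of the points above, here is a minimal Python sketch that computes the chi-square statistic, phi, and Cramer's V for a table given as a list of rows. It assumes the standard definitions (phi as the square root of chi-square over n, and V as the square root of chi-square over n times the smaller of rows minus one and columns minus one). The function names and the counts are made up for the illustration and are not the example referred to in the text.

```python
import math

def chi_square(table):
    """Return (Pearson chi-square without continuity correction, sample size n)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat, n

def phi(table):
    """Phi coefficient: sqrt(chi-square / n)."""
    stat, n = chi_square(table)
    return math.sqrt(stat / n)

def cramers_v(table):
    """Cramer's V: sqrt(chi-square / (n * min(rows - 1, cols - 1)))."""
    stat, n = chi_square(table)
    t = min(len(table) - 1, len(table[0]) - 1)
    return math.sqrt(stat / (n * t))

# Hypothetical counts, chosen only for illustration.
two_by_two = [[40, 10], [10, 40]]
perfect_3x3 = [[20, 0, 0], [0, 20, 0], [0, 0, 20]]

print(round(phi(two_by_two), 3), round(cramers_v(two_by_two), 3))  # 0.6 0.6
print(round(cramers_v(perfect_3x3), 3))                            # 1.0
```

For the 2 by 2 table the two measures coincide, and the perfectly related 3 by 3 table reaches the maximum of 1, which is what makes V comparable across tables of different sizes.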
Pearson's Coefficient of Contingency (C)
Pearson's C is more appropriate for larger tables (4 by 4, etc.). Why? Because its upper limit depends on the number of rows and columns, and it moves closer to 1 as the table gets larger. Therefore, the range of values is 0 to something less than 1. In fact, the upper limit for a 2 by 2 table is .71. In our example, the value of this statistic is .61. How do you interpret this value? You could not use the table we gave you because it is based on values ranging from -1.0 to 1.0, while the upper limit for C for a 2 by 2 table is only .71. So, it is difficult to interpret the magnitude (.61) of the statistic. This limitation is a distinct disadvantage of the C measure.
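A slightly different measure of association is the contingency coefficient. This is another chi-square-based measure of association, and one that also adjusts for different sample sizes. The contingency coefficient can be defined as
$$C = \sqrt{\frac{\chi^2}{\chi^2 + n}}$$
where $\chi^2$ is the chi-square statistic for the table and $n$ is the sample size.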
The contingency coefficient has much the same advantages and disadvantages as does φ. When there is no relationship between two variables, C = 0. The contingency coefficient cannot exceed the value C = 1, so that it is constrained more than is φ. But the contingency coefficient may be less than 1 even when two variables are perfectly related to each other. This means that it is not as desirable a measure of association as those which have the range 0 to 1.
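The claim that C can stay below 1 under a perfect relationship can be checked numerically. The sketch below assumes the usual definition of C as the square root of chi-square over (chi-square plus n). It also assumes that, for a k by k table, the largest possible chi-square is n times (k minus 1), which gives an upper limit of the square root of (k minus 1) over k. The function names and counts are hypothetical.

```python
import math

def chi_square(table):
    """Return (Pearson chi-square without continuity correction, sample size n)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat, n

def contingency_coefficient(table):
    """Pearson's C: sqrt(chi-square / (chi-square + n))."""
    stat, n = chi_square(table)
    return math.sqrt(stat / (stat + n))

def max_contingency_coefficient(k):
    """Upper limit of C for a k by k table, assuming chi-square peaks at n * (k - 1)."""
    return math.sqrt((k - 1) / k)

# Perfectly related tables with made-up counts: every case lies on the diagonal.
perfect_2x2 = [[30, 0], [0, 30]]
perfect_4x4 = [[15 if i == j else 0 for j in range(4)] for i in range(4)]

print(round(contingency_coefficient(perfect_2x2), 3))  # 0.707
print(round(contingency_coefficient(perfect_4x4), 3))  # 0.866
```

Even though both tables show a perfect relationship, C reaches only about .71 for the 2 by 2 table and about .87 for the 4 by 4 table, so the same value of C means different things for tables of different sizes.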