It's common in statistical analysis to modify categorical variables by pooling categories together, or to create categorical variables by breaking a numerical variable down into ranges. For example, in the lecture for section 6.2, we pool three categories into one so that our variable has only two categories (success/failure); or, in a question on the midterm, we treated "income" as a categorical variable by grouping observations into ranges such as
For your post this week, propose either:
In either case, propose a research or survey question whose answer might appear differently based on the way that the categorical variables are set up. (For instance, a question on spending habits might change based on how income ranges are grouped together, or a political question might be answered differently based on how education levels are grouped.) What differences might you see based on the grouping? If you start with a numerical variable, are there any special values that might matter depending on which category they are placed in?
Let's say we propose the following research question:
Is political learning from news media moderated by one's education level?
Here, political learning would be measured by a numeric variable, and education level would be an independent variable categorized into different groups. The answer could appear differently depending on how we categorize education level, for example:
1. Education Levels: Literate or Illiterate
2. Education Levels: School Level, Bachelor Level or Postgraduate level
Hence, we could arrive at a different answer depending on how we group this variable. With the first grouping, we could draw a conclusion about how having studied at all affects political learning; with the second, about how the level of an individual's degree affects political learning.
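To make this concrete, here is a minimal sketch in Python. The respondents, scores, and raw level names are all invented for illustration, and it assumes that literacy tracks having any schooling and that "none" is pooled into the school-level group in the second scheme. The helper `group_means` is hypothetical, not something from the course.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical respondents: (raw education level, political-learning score).
# Every value here is made up purely for illustration.
respondents = [
    ("none", 2.0),
    ("school", 4.0),
    ("school", 5.0),
    ("bachelor", 6.0),
    ("bachelor", 7.0),
    ("postgraduate", 9.0),
]

# Grouping 1: two categories (assumes literacy tracks having any schooling).
by_literacy_map = {
    "none": "illiterate",
    "school": "literate",
    "bachelor": "literate",
    "postgraduate": "literate",
}

# Grouping 2: three categories (assumes "none" is pooled into "school").
by_degree_map = {
    "none": "school",
    "school": "school",
    "bachelor": "bachelor",
    "postgraduate": "postgraduate",
}


def group_means(data, mapping):
    """Return the mean score of each pooled category under a mapping."""
    pooled = defaultdict(list)
    for level, score in data:
        pooled[mapping[level]].append(score)
    return {group: mean(scores) for group, scores in pooled.items()}


by_literacy = group_means(respondents, by_literacy_map)
by_degree = group_means(respondents, by_degree_map)
print(by_literacy)
print(by_degree)
```

With these made-up numbers, the two-category grouping shows only one large gap between illiterate and literate respondents, while the three-category grouping shows a step at each degree level. The same data supports a different story depending on the pooling.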
Now let's propose another research question:
Are self-motivation and self-awareness related to an individual's income?
The result for this research question could differ depending on how we group the income variable. We could group it in the following ways:
1. Low, Medium and High
2. Below Poverty, Low, Lower middle, Upper middle and High
Hence, you could get different results by grouping the variable differently. Which grouping to use depends on the kind of result you are looking for.
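As a rough sketch of how a numerical income variable could be broken into these ranges, the Python below bins the same incomes under both schemes. The incomes and cut points are invented for illustration (they are not official poverty or income thresholds), and `bin_incomes` is a hypothetical helper.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical annual incomes; invented for illustration only.
incomes = [8_000, 15_000, 30_000, 30_000, 55_000, 90_000, 150_000]


def bin_incomes(values, cut_points, labels):
    """Assign each value to a labeled range.

    cut_points must be sorted. A value equal to a cut point is placed
    in the higher range (ranges are closed on the left, open on the right).
    """
    if len(labels) != len(cut_points) + 1:
        raise ValueError("need exactly one more label than cut points")
    return [labels[bisect_right(cut_points, v)] for v in values]


# Scheme 1: Low, Medium, High (cut points are assumptions).
three = bin_incomes(incomes, [30_000, 90_000], ["Low", "Medium", "High"])

# Scheme 2: five ranges (cut points are assumptions).
five = bin_incomes(
    incomes,
    [12_000, 30_000, 60_000, 100_000],
    ["Below poverty", "Low", "Lower middle", "Upper middle", "High"],
)

print(Counter(three))
print(Counter(five))
```

Note the two incomes of exactly 30,000: because they sit on a cut point, they land in "Medium" under the first scheme and "Lower middle" under the second, and they would drop into the lower range if the boundary were made inclusive on the other side (for example with `bisect_left`). Values on a boundary are the special values worth checking when choosing ranges.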