You would like to study the following question: “Does exercising in the morning give you more energy during the day?” You asked two of your friends, Chris and Erin, about when they exercise and their energy levels. Let Y_i be the energy level of person i (measured on a 1-10 scale, where 10 is the highest energy level and 1 is the lowest); D_i is an indicator for whether they exercise in the morning.
| Observation | Energy level if exercise in the morning | Energy level if don’t exercise in the morning | Whether exercise in the morning |
| Chris | 10 | 7 | 1 |
| Erin | 5 | 6 | 0 |
What is selection bias? Show it using the potential outcome framework notation, and then explain it in words. Is it positive or negative?
Selection bias - this is the bias that arises when the people being studied or compared are not chosen by a random process, so the groups differ for reasons other than the thing being studied. For example, suppose a researcher wants to study the music listening patterns of young people aged 15-25. Random selection would mean going to a randomly chosen college and picking 10 students in that age group at random. If the researcher instead picks 10 of his own friends aged 15-25, the sample is hand-picked and the results suffer from selection bias.
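In this example, you are studying exercise in the morning. Because you asked two of your friends, Chris and Erin, rather than a random sample, and because each of them chose for themselves whether to exercise in the morning, there is selection bias in your research.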
Potential outcome framework - this framework is associated with the statistician Donald Rubin and is often called the Rubin causal model. Under the model, each person has two potential outcomes, one for each treatment state, but only one of them can ever be observed for the same person. In the example above, the question is about a person's energy level if they exercise in the morning. For the same person we cannot observe both outcomes: if he exercises in the morning, we observe his energy level with exercise and cannot observe what it would have been without it. He can either exercise or not exercise.
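In notation, the framework can be sketched as follows. The symbols are assumptions following common textbook usage: Y_{1i} is person i's energy level with morning exercise, Y_{0i} is the level without it, D_i = 1 if the person exercises in the morning, and Y_i is the level actually observed. The observed difference between the two groups then splits into a causal part and a selection bias part:

```latex
\begin{aligned}
\underbrace{E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0]}_{\text{observed difference}}
&= \underbrace{E[Y_{1i} - Y_{0i} \mid D_i = 1]}_{\text{average effect on the treated}} \\
&\quad + \underbrace{E[Y_{0i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 0]}_{\text{selection bias}}
\end{aligned}
```

The last term compares the no-exercise energy of those who exercise in the morning with that of those who do not; it is zero only if the two groups would have had the same energy without morning exercise.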
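Is it positive or negative - in potential outcome notation, selection bias is E[Y_0i | D_i = 1] - E[Y_0i | D_i = 0], where Y_0i is person i's energy level without morning exercise and D_i = 1 if they exercise in the morning. In words, it is the difference in no-exercise energy between those who exercise in the morning and those who do not. From the table this is 7 - 6 = 1 (Chris minus Erin), so the selection bias is positive: Chris would have more energy than Erin even without morning exercise, and the observed difference of 10 - 6 = 4 overstates his actual gain of 10 - 7 = 3. In real data only one potential outcome per person is observed, so this term cannot be computed directly; because of such practical concerns, the researcher should rely on random assignment or statistical tools such as propensity score matching to correct the sample assignment mechanism.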
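As a minimal sketch, the numbers in the table can be plugged into the potential outcome decomposition (observed difference = effect on the treated + selection bias). The dictionary layout and the names `y1`, `y0`, and `d` are illustrative assumptions; only the values come from the question.

```python
# Potential outcomes from the table in the question.
# y1: energy level if the person exercises in the morning
# y0: energy level if the person does not exercise in the morning
# d:  1 if the person actually exercises in the morning, else 0
# (the key names are hypothetical labels chosen for this sketch)
people = {
    "Chris": {"y1": 10, "y0": 7, "d": 1},
    "Erin": {"y1": 5, "y0": 6, "d": 0},
}


def mean(values):
    """Average of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


treated = [p for p in people.values() if p["d"] == 1]
untreated = [p for p in people.values() if p["d"] == 0]

# What the researcher actually sees: y1 for the treated, y0 for the untreated.
observed_difference = mean(p["y1"] for p in treated) - mean(p["y0"] for p in untreated)

# Average effect of morning exercise on those who do it.
effect_on_treated = mean(p["y1"] - p["y0"] for p in treated)

# Selection bias: gap in no-exercise energy between the two groups.
selection_bias = mean(p["y0"] for p in treated) - mean(p["y0"] for p in untreated)

print(observed_difference, effect_on_treated, selection_bias)  # prints: 4.0 3.0 1.0
```

With only one person in each group, each average is just that person's value, so the decomposition reads 4 = 3 + 1.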