For each of the studies described below, what conclusions can be reached? Are the researchers’ conclusions valid? Why or why not? What alternative explanations, if any, can there be for the research findings? Is the study high or low in internal validity? If you think there are problems with the study or the conclusions reached, how can the study be improved so that there are no flaws or so that alternative explanations can be ruled out? (Note: Some of these studies may not have any serious methodological flaws or alternative explanations.)
In addition to addressing these issues, evaluate each study in terms of its experimental realism, mundane realism, and ethics.
Example I. Taste Test
The owners of a soft-drink company believed that its product, Diet Duff’s, was better than its more popular competitor, Diet Smash. They decided to run a “blind taste test” in which individuals would taste some of each product without knowing which cup contained which drink. Two hundred randomly selected men and women from three different communities participated in the test. Each participant was seated at a table. A cup on the person’s left was labeled “Q” and contained six ounces of Diet Smash. A cup on the person’s right was labeled “M” and contained six ounces of Diet Duff’s. The participants, of course, were not told which drink was in which cup. Half of the time, the participants were told to try the cup on the left first, and half of the time they were told to try the cup on the right first. The drinks in both cups were equally fresh and cold.
The results supported Diet Duff’s hopes: Diet Duff’s was preferred by 105 people, Diet Smash was preferred by 84 people, and 11 people could not indicate a preference between the two drinks. Diet Duff’s began an advertisement campaign stating that in a blind taste test, more people preferred Diet Duff’s than Diet Smash.
I think the primary problem with this experimental design is
that the experiment is not double-blind. If the person running the
test knows which drink is which, they might subtly cue the
participants to "prefer" one drink or the other. Position is also
confounded with the drink: Diet Duff's was always on the right, so
a preference for the right-hand cup would look like a preference
for Diet Duff's. Finally, the labels could be a problem. Subjects
might interpret the labels as being meaningful (e.g. "M" is a more
common letter and might be "preferred" to the choice labeled "Q").
To make the experiment more meaningful, the cups should carry no
label that the subject can see (perhaps the label could be hidden
on the bottom of the cup). The experimenter who deals with the
subjects should not know which drink is which. One experimenter
should pour the drinks and decide where each cup goes, and a second
experimenter, who was not present for the pouring, should deal with
the subjects. The two drinks should be randomly placed on either
the left or the right.
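
The random placement suggested above could be set up along the following lines. This is only a sketch under my own assumptions: the function name `assign_conditions`, the field names, and the choice to balance the four side/order combinations equally are all made up for illustration and are not part of the original study.

```python
import random

def assign_conditions(n_participants, seed=None):
    """Build a balanced, shuffled schedule of cup placement and tasting order.

    Hypothetical helper: each participant gets the side Diet Duff's is
    placed on and the side they are told to taste first.
    """
    rng = random.Random(seed)
    # The four combinations of (side holding Diet Duff's, side tasted first).
    combos = [(side, first)
              for side in ("left", "right")
              for first in ("left", "right")]
    # Repeat the combinations so each is used about equally often,
    # then shuffle so the sequence is unpredictable.
    schedule = [combos[i % len(combos)] for i in range(n_participants)]
    rng.shuffle(schedule)
    return [
        {"participant": i + 1, "duffs_side": side, "taste_first": first}
        for i, (side, first) in enumerate(schedule)
    ]

plan = assign_conditions(200, seed=1)
duffs_on_left = sum(1 for row in plan if row["duffs_side"] == "left")
print(duffs_on_left, "of", len(plan), "participants get Diet Duff's on the left")
```

The experimenter who pours would follow this schedule, while the experimenter who deals with the subjects would never see it.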
The experiment is low in mundane realism, because we don't normally
compare soft drinks side by side; we drink them with food or in
social settings, and in larger amounts. For example, a soda might
taste fine in a 6 oz pour but be terrible after 16 oz. The
experiment is high in experimental realism. People should be able
to identify which drink tastes better with reasonable accuracy,
though the order of tasting may make a difference. The study has no
ethical problems, as there is little potential for harm and no
deception.
Another potential problem is experimenter bias. If people from Diet Duff's (or people hired by them) ran the test, their hopes about the outcome might have influenced the participants in subtle but significant ways. The best way to minimize this problem is to have the person who interacts with the participants not know which drink is in which cup until after each participant has made a selection. One final point is that the difference between 105 people selecting Diet Duff's and 84 people selecting Diet Smash (ignoring the 11 participants who could not indicate a preference) is not statistically significant at the conventional .05 level. Therefore, the preference for Diet Duff's may have been due to chance.
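
The significance claim can be checked with an exact two-sided binomial test on the 189 participants who stated a preference, under the null hypothesis that each drink is equally likely to be chosen. The sketch below uses only the Python standard library; the function name `two_sided_binomial_p` is my own, and choosing a binomial test (rather than, say, a chi-square test) is an assumption about how the comparison was made.

```python
from math import comb

def two_sided_binomial_p(successes, n):
    """Exact two-sided p-value for H0: p = 0.5.

    Because the null distribution is symmetric, the two-sided p-value
    is twice the probability of a split at least this lopsided.
    """
    k = max(successes, n - successes)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2**n
    return min(1.0, 2 * upper_tail)

duffs, smash = 105, 84  # the 11 "no preference" responses are excluded
p_value = two_sided_binomial_p(duffs, duffs + smash)
print(f"two-sided p = {p_value:.3f}")
print("significant at .05" if p_value < 0.05 else "not significant at .05")
```

The p-value comes out well above .05, which agrees with the point above: a 105 to 84 split is the kind of result that could easily arise by chance if people had no real preference.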