Think of an example of a study where randomization is not feasible. Put the example in the Potential Outcome Model Framework. Include the outcome variable and treatment. State the counterfactuals.
According to R.A. Fisher, randomization “relieves the experimenter from the anxiety of considering innumerable causes by which the data may be disturbed.” It has become very popular, in particular because it is said to control for both known and unknown nuisance factors that may considerably challenge the validity of a result. This contribution challenges the received view. First, looking for quantitative support, we study a number of straightforward, mathematically simple models. They all demonstrate that the optimism surrounding randomization is questionable: in small to medium-sized samples, random allocation of units to treatments typically yields a considerable imbalance between the groups, i.e., confounding due to randomization is the rule rather than the exception. In the second part of this contribution, the reasoning is extended to a number of traditional arguments in favour of randomization. This discussion is largely non-technical and sometimes touches on the fundamental Frequentist/Bayesian debate. The result of this analysis, however, turns out to be quite similar: while the contribution of randomization remains doubtful, comparability contributes much to a compelling conclusion. Summing up, classical experimentation based on sound background theory and the systematic construction of exchangeable groups seems advisable.
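The claim about imbalance in small samples can be made concrete with a toy calculation. The sketch below is not taken from the text above; it is a minimal illustration under stated assumptions: two groups of n units each, a single binary nuisance factor that every unit carries independently with probability p, and a tolerance (here 10% of the group size) beyond which the groups are called imbalanced. The function name `prob_imbalance` and the parameter `tol_percent` are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability that a Binomial(n, p) variable equals k."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_imbalance(n, p=0.5, tol_percent=10):
    """Exact P(|X - Y| > tol_percent% of n) for independent
    X, Y ~ Binomial(n, p), the counts of units carrying the
    nuisance factor in the treatment and control groups.

    Assumed toy model: one binary factor, equal group sizes.
    """
    pmf = [binom_pmf(k, n, p) for k in range(n + 1)]
    return sum(
        pmf[x] * pmf[y]
        for x in range(n + 1)
        for y in range(n + 1)
        # Integer comparison avoids floating-point rounding at the threshold.
        if abs(x - y) * 100 > tol_percent * n
    )


if __name__ == "__main__":
    for n in (10, 50, 200):
        print(n, round(prob_imbalance(n), 3))
```

Under these assumptions, with n = 10 per group and p = 0.5, about half of all random allocations differ by more than one unit on this single factor, and the probability shrinks only as the groups grow. The tolerance and the single-factor setup are illustrative choices, so the numbers should be read as an indication of the pattern and not as a result from the source.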
As an example, suppose a unit i is represented by a binary vector a_i = (a_i1, …, a_im). The Hamming distance d(⋅,⋅) between two such vectors is the number of positions at which the corresponding symbols differ; in other words, it is the minimum number of substitutions required to change one vector into the other. Let a_1 = (0,0,1,0), a_2 = (1,1,1,0), and a_3 = (1,1,1,1). Then d(a_1,a_2) = 2, d(a_1,a_3) = 3, d(a_2,a_3) = 1, and d(a_i,a_i) = 0 for every i. Having thus calculated a reasonable number for the “closeness” of two experimental units, one next has to consider what level of deviation from perfect equality may be tolerable.
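The distances in this example can be checked with a few lines of Python. This is a minimal sketch: `hamming_distance` follows the definition given above, while `is_comparable` and its `max_distance` threshold are hypothetical names introduced only to illustrate what a tolerance on the distance might look like.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def is_comparable(a, b, max_distance):
    """Hypothetical tolerance rule: two units count as comparable
    if their Hamming distance does not exceed max_distance."""
    return hamming_distance(a, b) <= max_distance


# The three units from the example.
a1 = (0, 0, 1, 0)
a2 = (1, 1, 1, 0)
a3 = (1, 1, 1, 1)

if __name__ == "__main__":
    print(hamming_distance(a1, a2))  # 2
    print(hamming_distance(a1, a3))  # 3
    print(hamming_distance(a2, a3))  # 1
    print(hamming_distance(a1, a1))  # 0
    print(is_comparable(a2, a3, max_distance=1))  # True
```

Which value of `max_distance` is acceptable is exactly the question of tolerable deviation raised in the text; the code does not answer it, it only makes the rule explicit.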