During WWII, the Navy tried to determine where they needed to armor their aircraft to ensure they came back home. They ran an analysis of where planes had been shot up, and came up with this. Obviously the places that needed to be up-armored are the wingtips, the central body, and the elevators. That’s where the planes were all getting shot up.
Abraham Wald, a statistician, disagreed. He thought they should
better armor the nose area, engines, and mid-body. Which was crazy,
of course. That’s not where the planes were getting shot.
Except Mr. Wald realized what the others didn’t. The planes were
getting shot there too, but they weren’t making it home. What the
Navy thought it had done was analyze where aircraft were suffering
the most damage. What they had actually done was analyze where
aircraft could suffer the most damage without catastrophic failure.
All of the places that weren’t hit? Those planes had been shot
there and crashed. They weren’t looking at the whole sample set,
only the survivors.
Based on the above scenario, critically discuss how managers should use statistical results to avoid this kind of misleading conclusion.
The reason for this excitement is that the aircraft damage is an example of what is known as "survivorship bias." This is a technical term for what we all know well: the dead don't often get to tell their side of the story, and yet sometimes it would be better if they did. The loss is the source of all kinds of misinformation, as the Internet will tell you emphatically, including deceptive practices in selling hot stocks, which may explain much of the buzz.
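The effect is easy to reproduce in a toy simulation. The sketch below is illustrative only: the section names and per-hit loss probabilities in `LOSS_PROB` are invented for the example, not historical data, and the model (every hit lands on a uniformly random section, independently) is an assumption made for simplicity. It compares where hits land on all planes with where they are observed on the planes that come back.

```python
import random

# Hypothetical per-hit probability that a hit on this section downs the plane.
# These numbers are invented for illustration; they are not historical data.
LOSS_PROB = {"engine": 0.6, "cockpit": 0.4, "fuselage": 0.1, "wings": 0.05}


def simulate(n_planes=10000, hits_per_plane=3, seed=0):
    """Return (hits on all planes, hits seen on survivors), keyed by section."""
    rng = random.Random(seed)
    sections = list(LOSS_PROB)
    all_hits = dict.fromkeys(sections, 0)
    survivor_hits = dict.fromkeys(sections, 0)
    for _ in range(n_planes):
        # Assumption: each hit lands on a uniformly random section.
        hits = [rng.choice(sections) for _ in range(hits_per_plane)]
        # The plane returns only if it survives every one of its hits.
        survived = all(rng.random() >= LOSS_PROB[s] for s in hits)
        for s in hits:
            all_hits[s] += 1
            if survived:
                survivor_hits[s] += 1
    return all_hits, survivor_hits


def shares(counts):
    """Convert raw hit counts into fractions of the total."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}


if __name__ == "__main__":
    all_hits, survivor_hits = simulate()
    print("share of hits, all planes:", shares(all_hits))
    print("share of hits, survivors: ", shares(survivor_hits))
```

Under these assumptions the hits are spread evenly over the sections by construction, yet the survivors show comparatively few engine hits and comparatively many wing hits. An analyst who sees only the survivors would be looking at the second line of output, not the first.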
Well, it's gratifying to see a great mathematician become a legend for good reasons, rather than bad. "MATHEMATICAL GENIUS SCORES AGAINST ARMY BRASS!" reads pretty well. After all, publicity about mathematicians typically concentrates on features most of us would rather not think about. But it would be much more gratifying if there were more truth to the story, or at least more reason for believing it. Some of us prefer our history lessons to be taken from the non-fiction shelves.
The story might well be true, and there is certainly, as we shall see, a solid germ of truth in it, but there is very little evidence for the best bits. The capsule biography of Wald is accurate, and although he might not have been the smartest man in the room, he was probably nearly always the most accomplished mathematician in the room, which counted for a lot. But ... most of the rest of the story is--to use a charitable phrase--"plausible reconstruction." There is extremely little source material for what Wald had to say about aircraft damage.
The autobiographical memoir by W. Allen Wallis is the best source--practically the only source--for the operation of the SRG. It is surprisingly entertaining as well as informative, but its coverage of Wald's work at the SRG concentrates on the invention of sequential analysis, for which Wald eventually became deservedly famous. This is a technique for improving quality control in production, say of military ordnance. It was used, apparently with great success, by thousands of wartime production facilities. But it is not exactly great material for Internet headlines: "HEY! ARMY TRUCK TIRE PRODUCTION ROSE 6.37% IN AUGUST 1944!"
To be precise, regarding Wald's work on aircraft damage we have (1) two short and rather vague mentions in Wallis' memoir of work on aircraft vulnerability and (2) the collection of the actual memoranda that Wald wrote on the subject. That's it! Everything not in one of these places must be considered as fiction, not fact. Or at best, as I say, plausible reconstruction. Not to complain too much--the history of mathematics is plagued by the temptation, rarely resisted, to write things as they should have been, rather than as they were. Reality is rarely as logical as one might hope. I should add, though, that it's not only mathematical reality that gets slighted in the Internet versions of this tale--you should be quite amused by the pictures that accompany the Internet headlines. Lots and lots of airplanes with bullet holes scattered all over them. One goes so far as to claim it is showing you Wald's own sketches (and we have no idea at all whether he ever made any). Most show diagrams of aircraft that by no means match what must have been involved--my favorite is of a venerable DC-3, a plane referred to by the military as the C-47. These served as cargo carriers in WWII and rarely saw real combat except by straying from route. "As long as it has motors and flies" seems to be the criterion for the artwork. A few of the web sites show chilling clips of American planes being destroyed in action. These certainly show, in case you might have forgotten, what stakes were ultimately involved in the apparently abstract technology being developed in the comfort of upper Manhattan.
The vague references in Wallis' memoir are particularly interesting, since Wald is not mentioned in them. One of them (p. 323) says in its entirety, "The problem of aircraft vulnerability led SRG to devise a technique for determining vulnerability from damage survived by our own planes ... " The other (p. 324) names Wallis himself as the author of a note titled "Uses for Aircraft Vulnerability Figures." This, however, is one item in a list of randomly selected reports from the SRG, and there might well have been other reports on the same topic. (Do these reports still exist in some deep archive?)
So the only really reliable account of Wald's work is what we find in Wald's own writings.
The true story, or at least part of it
The memoranda by Wald are severely technical. Not much drama at all. In particular, Wald says nothing about what the military should do to improve things. If I understand Wallis correctly, it was the general policy of the SRG to answer just the questions asked and never--well, hardly ever--attempt to offer advice on applications of what they discovered. Military decisions were made by the military.
The memoranda are so technical, in fact, that in the account by Jordan Ellenberg, a photograph of one page of the document is flashed at the reader with an apology for suddenly introducing a topic possibly suitable only for adults. There is, however, a very valuable guide to the memoranda by Marc Mangel and Francisco Samaniego, which appeared at almost the same time as the memoranda were made available to the public by the Center for Naval Analyses.
There are eight items among the memoranda. Five of them deal with a single problem: estimating the probability of an airplane's survival, given that it has already been hit. The outstanding feature of this work is that it offers a way to estimate damage on the planes that never returned. A kind of magic, indeed. One--just one--deals with the problem of vulnerability of different sections of an airplane, and this shares with the other five some impressive estimates. That is, as the Internet fiction suggests, both have to deal with the problem that downed planes aren't around to give evidence.
Consider the first problem. We are given data, such as the number of hits, only on returning aircraft. The question Wald asked--or perhaps the one he was asked to look at--was, "Given these data, what can we say about the probability of surviving a given number of hits?" Not a complicated question, but with a complicated answer. All we know about the planes that didn't return is ... that they didn't return. In truth, there might be a number of reasons for this, since--for example--a number of fatalities in the war were from mechanical failure. Of course Wald had to be very careful. It was in principle possible, one might suppose, that all downed airplanes ran out of gasoline. The point is that this was extremely unlikely. In other words, any answer to the question is complicated by the missing data associated to planes that were downed. Wald could only calculate his probabilities by making certain reasonable assumptions, and being very, very careful about how the assumptions played a role in results. In all his works on statistics, in fact, he was renowned for being very, very careful with assumptions.
His first simplifying assumption is that planes are downed because of enemy fire rather than, say, mechanical failure.
What data did Wald have to work with? This seems to have varied from time to time, but at the least, in so far as this problem was concerned, he was given the number of planes sent out on missions, the number returning, and the number of hits on each plane that came back. In the example treated by Mangel and Samaniego (following Wald):
The $N$ planes on the mission divide into two major groups, the $S$ survivors and the $L$ planes that are downed. These in turn divide into groups according to how many hits they get: $N_i$ is the total number with exactly $i$ hits, similarly $S_i$ and $L_i$. Of course we know all the $S_i$, and know nothing about the $L_i$ except for three simple things: (1) $L = \sum L_i = N - S$, and (2) $L_i + S_i = N_i$, and (3) $L_0 = 0$, because we have assumed that all those that are lost are lost because they have been hit. Let $N_{\ge i}$ be the sum $\sum_{j \ge i} N_j$, etc. Thus
$$N = N_{<i} + N_{\ge i}.$$
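The bookkeeping above can be written down in a few lines. This is only a sketch of the constraints, not Wald's estimation method: the function names are hypothetical, and lists are indexed by the number of hits, so `survivors_by_hits[i]` plays the role of S_i and `lost_by_hits[i]` the role of L_i. Given N and the known S_i, it computes L and checks whether a proposed set of L_i is at least consistent with constraints (1) and (3).

```python
from itertools import zip_longest


def lost_total(n_planes, survivors_by_hits):
    """Constraint (1): L = N - S, where S is the sum of the known S_i."""
    s_total = sum(survivors_by_hits)
    if s_total > n_planes:
        raise ValueError("more survivors than planes sent out")
    return n_planes - s_total


def is_consistent(n_planes, survivors_by_hits, lost_by_hits):
    """Check a candidate list of L_i against constraints (1) and (3)."""
    return (
        all(l >= 0 for l in lost_by_hits)
        # Constraint (3): L_0 = 0, no plane is lost without being hit.
        and (not lost_by_hits or lost_by_hits[0] == 0)
        # Constraint (1): the L_i must add up to N - S.
        and sum(lost_by_hits) == lost_total(n_planes, survivors_by_hits)
    )


def totals_by_hits(survivors_by_hits, lost_by_hits):
    """Constraint (2): N_i = S_i + L_i (missing entries count as zero)."""
    return [s + l for s, l in
            zip_longest(survivors_by_hits, lost_by_hits, fillvalue=0)]


def tail_sum(counts, i):
    """Sum of counts[j] for j >= i, e.g. N_{>=i} computed from the N_j."""
    return sum(counts[i:])


if __name__ == "__main__":
    # Made-up numbers for illustration only; not the data from the memoranda.
    n_planes = 100
    survivors = [60, 20, 10, 5]   # S_0, S_1, S_2, S_3
    candidate = [0, 1, 2, 2]      # one possible guess at L_0 .. L_3
    print("L =", lost_total(n_planes, survivors))
    print("consistent:", is_consistent(n_planes, survivors, candidate))
    print("N_i =", totals_by_hits(survivors, candidate))
```

Many different candidate lists pass this check, which is exactly the difficulty: the constraints alone do not pin down the L_i, so further assumptions are needed to choose among them.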
It seems a little crazy, but what we really want to do is figure out what all the missing numbers $L_i$ are, or at least estimate them in a reasonable way. It looks at first sight as though this is a task for a conjuror rather than a mathematician.