Maddenation
Lies, Damned Lies, and Statistics
I started this entry earlier today, but had a computer glitch and lost it. Why is it that when you spend time writing something and then lose it, it’s such a sinking feeling that you’ll never again produce anything as good as what was lost?
Previously I was commenting that we have too many categories on this website, although I don’t have any suggestions on how to consolidate them. Anyway, I decided “statistics” didn’t need its own category. This could also have been put under “quotes” because the title is part of a Mark Twain quote.
My real observation is about the Iraqi problem and the inability of our troops to find weapons of mass destruction. (Remember that ’60s song by Barry somebody where he said something like, “…and you tell me over and over and over and over again, you don’t believe we’re on the eve of destruction”? Somebody has to write a new song about “and they told us, over and over and over and over again, that Saddam had weapons of mass destruction.”) I heard a week or so ago that the US had identified about 500 sites where WMD might have been stored. So far, we have checked about 200 sites and found nothing. I got to wondering, naturally, what the chances of that were, assuming the WMDs were actually there. The answer is, it depends.
The problem is similar to some I’ve worked out for contaminated sites, where we are trying to determine how many samples it takes to find the “hot spots” where somebody has dumped hazardous chemicals.
The first thing you must assume is that the soldiers are inspecting randomly, which is not likely to be the case. But I don’t know the formulas for the non-random case anyway, so let’s continue. Then you have to know how many sites you would expect to contain WMD. If there’s only 1 site, then it’s a lot harder to find than if there are, say, 10 sites. The problem can be solved in two ways: with a formula, or with the binomial distribution function in Excel (or another spreadsheet).
The formula is one I developed for finding a “hot spot.” It is based on the probability of finding it with just one sample (i.e., the probability of the US finding a WMD site at the first location they inspect). It treats every try as having that same chance, so the chance of missing on all of them is (1 - a) multiplied by itself n times. If n is the number of samples (or inspections), and a is the chance of finding it on the first try, then the probability of finding a “hot spot” in n tries is:
P = 1 - (1 - a)^n
In the example I’m talking about, the chance of finding a WMD site if there are 500 possibilities and only one real site is 1/500 or 0.2%. Then, by the above formula, the probability of finding this site in 200 tries is only 33%. If the number of WMD sites is more like 10, then the initial probability is 10/500 or 2%. On this basis, the probability of finding one of these sites in 200 tries is over 98%. Big difference.
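For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The function names are made up for illustration. The first function is just the formula above. The second is an extra comparison that is not part of the formula: it assumes no site is ever inspected twice, in which case the one-site answer works out to exactly 200/500, or 40%, a bit higher than the formula gives.

```python
from math import comb


def p_find_formula(a: float, n: int) -> float:
    """The formula from the post: P = 1 - (1 - a)^n.

    Treats every inspection as an independent try with the same chance a.
    """
    return 1.0 - (1.0 - a) ** n


def p_find_no_repeats(total: int, real: int, checked: int) -> float:
    """Comparison case (an assumption, not the post's formula).

    Chance of hitting at least one real site when `checked` distinct sites
    are picked at random out of `total`, of which `real` hold something.
    """
    # comb(total - real, checked) counts the ways to pick only empty sites.
    return 1.0 - comb(total - real, checked) / comb(total, checked)


if __name__ == "__main__":
    for real_sites in (1, 10):
        a = real_sites / 500
        print(
            f"{real_sites} real site(s): "
            f"formula = {p_find_formula(a, 200):.3f}, "
            f"no repeats = {p_find_no_repeats(500, real_sites, 200):.3f}"
        )
```

Either way the conclusion is the same: with one real site the odds of having found it are well under half, and with ten they are close to certain.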
I’m not trying to say that Iraq doesn’t have WMD, just that if we’ve identified all the places they could have been hidden (500) and inspected 200 of them, then it’s pretty likely we would have found one of them (assuming there is more than one). On the other hand, we probably don’t know where Saddam might have hidden the weapons. So we’re back to “we don’t know where they are” and it might take a long, long time to find them, unless we have help from the bad guys. And the stuff might be in Syria. So, statistics doesn’t really tell you anything for sure. However, I feel better knowing that if there were only one site, we would only have a 1 in 3 chance of finding it after having checked 200 of 500 possible sites. Don’t you?
By the way, if you want to use Excel’s binomial distribution function to solve the problem, it can be done, but it’s too complicated to explain here. If you want an exercise in math and logic, go ahead and figure it out. Heh, heh.
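For the impatient, here is one way the binomial route might go, sketched in Python rather than a spreadsheet. The idea is that “finding at least one” is the opposite of “finding exactly zero,” so you take the binomial probability of zero hits in n tries and subtract it from 1. The function names are invented for this sketch. In Excel the zero-hit piece should correspond to BINOMDIST(0, n, a, FALSE), though that is my reading of the function and not something worked out above.

```python
from math import comb


def binomial_pmf(k: int, n: int, a: float) -> float:
    """Probability of exactly k hits in n independent tries, each with chance a."""
    return comb(n, k) * a**k * (1.0 - a) ** (n - k)


def p_at_least_one(n: int, a: float) -> float:
    """Chance of one or more hits: 1 minus the chance of zero hits."""
    return 1.0 - binomial_pmf(0, n, a)


if __name__ == "__main__":
    # Same two cases as before: 1 real site and 10 real sites out of 500.
    for a in (1 / 500, 10 / 500):
        print(f"a = {a:.3f}: P(at least one hit in 200 tries) = {p_at_least_one(200, a):.3f}")
```

With zero hits the binomial term collapses to (1 - a)^n, so this gives the same answer as the formula.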
Dad • Observations • 06/04/03 • 2 comments
Comments
Patrick • 06/11/03 • 12:07 AM: What’s worse, from a PR point-of-view at least, is that most of the rest of the world doesn’t even need statistical analysis to believe that there were never any weapons of mass destruction. The propaganda machine in the rest of the world is fed up with the lies the US has used in the past to cover up its self-interested raids on foreign peoples and governments, and even pro-US countries and statesmen don’t believe it anymore; they just go along because they feel like they’ll benefit by being on the big guy’s good side.
The more I see, the more I really believe that the brand of democracy we’ve built up (including freedom to buy cheap clothes and electronic equipment and hamburgers) relies on undemocracy elsewhere. If, for instance, every product that could be made in the US were made there, or companies were required to pay US minimum wage no matter where they did their manufacturing, then we’d all feel it in the wallet and revolt. It’s easy for the leader of the pack to preach free-markets and low tariffs because they benefit him.
Of course, it’s not that simple, but that doesn’t mean that it isn’t at all this way. I don’t claim to know why the US invaded Iraq, and I certainly wouldn’t claim that Saddam was a good leader or deserved to stay in power or anything, but I might ask why, then, the US hasn’t invaded Congo recently. Or why the European Union felt it had to issue its own currency. I would also ask why South American countries can’t find the success that small European countries like Denmark or Belgium do, and, in that case, I would find internal reasons as well as external, starting with the rhetoric of victimization that these people are fed and believe, so that they stop working (not all, mind) to better themselves and wait for the miracle from above.
I also mistrust anyone who talks like our politicians do. Right off the bat. They could be telling me to have a nice day, and I’d still disbelieve them, just because of their slimy rhetoric. If there’s one thing a PhD in English will do for you, it’s super-fine-tune your b.s. sensors.
I don’t claim to have enough information to know whether there were actually WMDs in Iraq, but I’ve heard enough evidence to make me lean against that possibility.
Patrick • 12/08/04 • 11:17 AM: I ran across this quote today in my A Word A Day email. It peripherally relates to this consideration of statistics: