Monday, February 12, 2007

Lies, d*mned lies, and statistics

There are three kinds of lies: lies, damned lies and statistics. -- attributed to Benjamin Disraeli

A week ago I was watching the Big Football Game Whose Name I Am Not Allowed To Say Lest I Get Sued For Trying To Make Money Off the NFL Without Giving Them Their Fair Share Even Though I Don't Make a Cent From This Blog; for short, let's call it the "Stupendous Bowl." Anyway, the Chicago Bears won the coin toss, and the announcers remarked that this was the 10th year in a row that the NFC had won the coin toss, a 1-in-1000 occurrence. This quote implies that the Bears had a 1-in-1000 chance of winning that coin toss. That implication is wrong. Unless the mob has gotten involved and "fixed" the coin toss, the coin doesn't remember from one year to the next which team won the toss. Each year, there is a 50-50 chance that the coin toss will be won by the NFC team.

The statement isn't totally wrong. If I were to ask the question, "What is the likelihood that the next 10 coin tosses will all be won by the NFC?" the answer would be about one in a thousand (more precisely, 1 in 1,024, which is 1/2 multiplied by itself ten times). But the chance that next year's coin toss will be won by the NFC for the 11th year in a row? 50-50.
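For the numerically inclined, here is a minimal sketch of that arithmetic in Python. It assumes a fair coin and independent tosses; the function names are my own inventions for illustration. The point is that the chance of a whole streak shrinks quickly, while the chance of the next toss, given that the streak has already happened, stays at one half.

```python
from fractions import Fraction

# Assumption: a fair coin, so each toss goes to the NFC with probability 1/2.
P_SINGLE = Fraction(1, 2)

def prob_streak(n, p=P_SINGLE):
    """Chance that n independent tosses ALL go to the same pre-chosen side."""
    return p ** n

def prob_next_given_streak(streak_length, p=P_SINGLE):
    """Chance of winning the next toss, given a streak has already happened.

    Computed as P(streak of length + 1) / P(streak of length).
    """
    return prob_streak(streak_length + 1, p) / prob_streak(streak_length, p)

print(prob_streak(10))             # 1/1024
print(prob_next_given_streak(10))  # 1/2
```

The second number does not depend on the length of the streak: try 3, 10, or 50 and the answer is still 1/2.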

People often mess up playing this game of statistics. Let's look at gambling in Las Vegas. Many people who play the slots stand around watching to find a machine that hasn't paid out in a long time, and then they sit down and plop their coins in. Whether they realize it or not, the chance of winning with that machine is no greater than it was on the very first pull after the machine's last payout. By law, those machines can't remember when they last paid out money. Each pull is completely random. Unless the game is fixed (and illegal), take any open seat.

This same game is often played in astronomy. On average, humans see a supernova (an exploding star) in the Milky Way every hundred years. The last supernova in our galaxy was seen in 1604, 403 years ago. So you often hear astronomers say, "we are overdue for another supernova!" But many people who hear this think that we must have one any day now, that somehow the chances of seeing a supernova next year are higher than average because of the long delay. Again, this is not proper thinking about statistics. Stars in the galaxy don't know when other stars explode, and they don't time themselves to go off exactly every 100 years. So, the chance of a bright supernova in the Milky Way this year is the same as the chance of one last year and the chance of one next year: about 1 in 100. (I'll sidestep the point that many supernovae in our galaxy may be nearly invisible from Earth.)
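Here is a rough sketch of the supernova version in Python. It assumes the simplest possible model, which is mine and not a real supernova-rate calculation: each year independently has a 1-in-100 chance of a visible supernova. The function names are made up for illustration.

```python
from fractions import Fraction

# Assumption: each year independently has a 1-in-100 chance of a supernova.
P_PER_YEAR = Fraction(1, 100)

def prob_quiet(years, p=P_PER_YEAR):
    """Chance of seeing no supernova at all for `years` consecutive years."""
    return (1 - p) ** years

def prob_next_year_given_quiet(years, p=P_PER_YEAR):
    """Chance of a supernova next year, given `years` quiet years so far.

    P(quiet for `years`, then a supernova) / P(quiet for `years`).
    """
    return (prob_quiet(years, p) * p) / prob_quiet(years, p)

for gap in (0, 100, 403):
    print(gap, prob_next_year_given_quiet(gap))  # always 1/100

# How unusual is a 403-year gap in this toy model? Uncommon, not impossible.
print(float(prob_quiet(403)))
```

In this model a long dry spell is unlikely before it happens, but once we are already 403 years in, next year's odds are the same 1 in 100 as ever.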

What about earthquakes? On average, the San Andreas Fault produces "The Big One" every 150 years; in San Francisco, the last Big One was 100 years ago. Based on what I've said, does this mean that the chance of "The Big One" next year is 1 in 150, no matter when the last Big One occurred?

Maybe, but probably not. Unlike exploding stars or coin tosses, the San Andreas Fault "remembers" when the last earthquake was. An earthquake releases stress in the rock along the fault, and we have to wait for that stress to build up again before there can be another big earthquake. This doesn't mean that earthquakes go off like clockwork, and it doesn't mean that the San Andreas Fault can't have two big earthquakes in a row. But it does probably mean that, as time goes on, the chances of a big earthquake are increasing.
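To make the difference between "no memory" and "memory" concrete, here is a toy comparison in Python. To be clear, this is not how seismologists forecast earthquakes. I am borrowing a Weibull waiting-time distribution purely as an illustration, and the shape value of 2 is an arbitrary choice of mine. With shape 1 the process has no memory, like the coin or the supernovae; with shape greater than 1 the chance per year grows as the quiet period drags on. Both versions are set to the same 150-year average.

```python
import math

MEAN_INTERVAL = 150.0  # average years between Big Ones, as quoted above

def survival(t, shape, mean=MEAN_INTERVAL):
    """Chance that no quake has happened t years after the last one.

    Toy Weibull model; the scale is chosen so the average wait equals `mean`.
    """
    scale = mean / math.gamma(1.0 + 1.0 / shape)
    return math.exp(-((t / scale) ** shape))

def prob_quake_next_year(years_since, shape):
    """Chance of a quake in the coming year, given none for `years_since` years."""
    return 1.0 - survival(years_since + 1, shape) / survival(years_since, shape)

for years_since in (10, 100, 200):
    memoryless = prob_quake_next_year(years_since, shape=1.0)   # no memory
    with_memory = prob_quake_next_year(years_since, shape=2.0)  # stress builds up
    print(years_since, round(memoryless, 4), round(with_memory, 4))
```

The memoryless column stays put no matter how long it has been, while the other column climbs with every quiet year. Which picture is closer to the real San Andreas Fault is a question for the geologists.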

Clear as mud? Probably. Statistics plays a central role in so many areas of science and even everyday life (just ask an insurance salesman), but it is very difficult to figure out what statistics mean, even when you work with them every day. So, be a bit dubious when you hear odds and statistics thrown around. And stop and think before plunking down a large chunk of money on a bet on next year's Stupendous Bowl coin toss, thinking that the AFC must finally win it. The chances are only 50-50.
