No. 1876:
Bayesian Statistics

Today, we learn how to hedge bets. The University of Houston's College of Engineering presents this series about the machines that make our civilization run, and the people whose ingenuity created them.

Your wife and her friend went out and got you a white dog for your birthday, and you wonder which of them selected it. At first blush, it'd be a fifty-fifty guess. But you know two things: your wife doesn't like white dogs very much, and her friend likes them a lot. So, the friend probably chose the dog.

We can actually do a calculation here, but it's not simple. Start from that fifty-fifty guess. If the likelihood of your wife's picking a white dog is fifteen percent, and her friend's doing so is ninety percent, the probability that her friend chose it turns out to be about eighty-six percent.
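For readers who want to check the arithmetic, here is a minimal sketch of that calculation in Python. It assumes the fifty-fifty starting guess mentioned above and that either your wife or her friend, and no one else, chose the dog. The function name is hypothetical, made up for this illustration.

```python
def posterior_two_hypotheses(prior_a, likelihood_a, likelihood_b):
    """Return P(A | evidence) when exactly one of A or B is true.

    prior_a      -- probability of A before seeing the evidence
    likelihood_a -- probability of the evidence if A is true
    likelihood_b -- probability of the evidence if B is true
    """
    prior_b = 1.0 - prior_a
    joint_a = prior_a * likelihood_a  # A is true and we see the evidence
    joint_b = prior_b * likelihood_b  # B is true and we see the evidence
    return joint_a / (joint_a + joint_b)


# A = the friend chose the dog, B = your wife chose it.
# The 0.5 prior is the assumed fifty-fifty starting guess.
p_friend = posterior_two_hypotheses(prior_a=0.5, likelihood_a=0.90, likelihood_b=0.15)
print(round(p_friend, 3))  # 0.857
```

The result, about 0.857, is simply 0.45 divided by 0.525: the friend's share of all the ways a white dog could have turned up.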

To get that answer, we use something called Bayesian statistics -- named after eighteenth-century nonconformist cleric Thomas Bayes. Bayes' first book, written in 1731, was on Divine Benevolence. Five years later he wrote a second, quite different, book. In it, he defended Newton's calculus against an attack by the British philosopher, Berkeley.

Not long afterward, Bayes was made a member of the Royal Society. He didn't write much more. But, two years after he died, the Royal Society published his paper on The Doctrine of Chances. In it, he suggests using prior knowledge to improve our prediction of outcomes. That's why, given your wife's and her friend's preferences, you're pretty sure who picked that white dog.

Another more serious example: Suppose you're tested for a certain cancer. The test has a five percent error rate. But prior knowledge tells you that only one person in a hundred thousand really has the cancer. That means a positive reading is almost certainly wrong. The test is nearly useless.
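The same kind of check can be sketched for the cancer test. The episode gives only a single "five percent error rate," so this sketch assumes that figure applies both ways: five percent of healthy people test positive, and five percent of sick people test negative. The function name is hypothetical.

```python
def probability_of_disease_given_positive(prevalence, false_positive_rate,
                                          false_negative_rate):
    """Return P(disease | positive test) using Bayes' rule."""
    # Sick people who correctly test positive.
    true_positives = prevalence * (1.0 - false_negative_rate)
    # Healthy people who wrongly test positive.
    false_positives = (1.0 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# One person in a hundred thousand has the cancer (from the text).
# Both error rates are set to 5 percent, which is an assumption.
p_cancer = probability_of_disease_given_positive(
    prevalence=1 / 100_000,
    false_positive_rate=0.05,
    false_negative_rate=0.05,
)
print(f"{p_cancer:.5f}")  # about 0.00019
```

Under those assumptions, fewer than two positive readings in ten thousand belong to someone who actually has the cancer. The rare disease is swamped by the false alarms among the healthy.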

But Bayes' ideas have had tough sledding -- probably because they can be so easily misused. You can misinterpret your friend's preferences. Maybe I mislead myself when I say it's completely unlikely that I have that cancer. You can't really argue with the math, but you have to be very careful with the premises.

All this came to my attention when the New York Times did an article with the title, Subconsciously, Athletes May Play Like Statisticians. A good athlete, it seems, uses a vast wealth of subjective statistics. A tennis player calls up her knowledge of an opponent with a particularly dangerous backhand, and she adjusts the percentages of her own shots to compensate.

The Times article describes a Bayesian thought process as what we use when "uncertainty becomes great enough to give past experience an edge over current observation." Of course that can be very dicey. Think about the roulette player who knows perfectly well that the house odds are stacked against him and still says, "Oh yes, but red likes me."

By now, the obvious usefulness of Bayesian statistics has triumphed over the equally obvious dangers that go with its use. For we know we can get into serious trouble by dropping our guard when we temper our statistics with what we only think is true.

I'm John Lienhard, at the University of Houston, where we're interested in the way inventive minds work.

(Theme music)

For a nice textbook treatment of Bayesian statistics, see: D. P. Bertsekas and J. N. Tsitsiklis, Introduction to Probability. Belmont, MA: Athena Scientific, 2002, Section 1.4. The white dog calculation (above) was made using Bayes' Rule on page 31. (Section 1.3, by the way, includes a nice discussion of the Monty Hall Problem -- see Episode 1577.)

For more on Bayesian statistics see:

I. Hacking, Bayes, Thomas. Dictionary of Scientific Biography (C. C. Gillispie, ed.), New York: Charles Scribner's Sons, 1970-1980, Vol. I, pp. 531-532.

D. Leonhardt, Subconsciously, Athletes May Play Like Statisticians. The New York Times, Science Times, Tuesday, January 20, 2004, pp. 1 and 6.

I am most grateful to three UH colleagues for very helpful counsel on this episode: Charles Peters, Mathematics, as well as Jagannatha Rao & David Zimmerman, Mechanical Engineering. Just for the fun of it, here's another example developed by Dr. Peters:

You've received a Christmas present with no tag on it. It had to've come from either your girlfriend or your brother. Since you know your brother is Christmas-challenged, you'd give 2 to 1 odds that he was the one who forgot to label the package. (This is your prior information.)

You also know that 50% of all presents given by your brother are ugly ties. Your girlfriend has better taste and only 5% of her presents are ugly ties. (This is your model for the outcome, given the state of nature.) Thus it would seem that, if the present is an ugly tie, the likelihood ratio favoring your brother is 0.50/0.05, or ten to one.

You open the present and, sure enough, it is an ugly tie. So now you wonder just what the odds are that the present came from your brother. One formulation of Bayes' rule says that the posterior odds equal the prior odds times the likelihood ratio. In this case, the prior odds were 2/1 and the likelihood ratio is 10/1. Thus, when we combine our prior knowledge with the evidence of the tie, the posterior odds are 2 X 10 = 20/1. In other words, taking the prior information into account has increased the odds of that ugly tie being your brother's gift, from 10/1, to 20/1.
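The odds form of Bayes' rule used in Dr. Peters' example can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and exact fractions are used so that the ratios stay readable.

```python
from fractions import Fraction


def posterior_odds(prior_odds, likelihood_ratio):
    """Odds form of Bayes' rule: posterior odds = prior odds x likelihood ratio."""
    return prior_odds * likelihood_ratio


def odds_to_probability(odds):
    """Convert odds in favor (for example 20 to 1) into a probability."""
    return odds / (1 + odds)


prior = Fraction(2, 1)                          # 2 to 1 that your brother forgot the tag
ratio = Fraction(50, 100) / Fraction(5, 100)    # 0.50 / 0.05 = 10
post = posterior_odds(prior, ratio)

print(post)                       # 20
print(odds_to_probability(post))  # 20/21
```

Odds of 20 to 1 correspond to a probability of 20/21, or a little over 95 percent, that the ugly tie came from your brother.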