Read Against the Gods: The Remarkable Story of Risk Online
Authors: Peter L. Bernstein
After a spell of feverish calculation, he came up with a precisely
correct solution and was able to predict the exact location of Ceres at
any moment. In the process he had developed enough skill in celestial
mechanics to be able to calculate the orbit of a comet in just an hour or
two, a task that took other scientists three or four days.
Gauss took special pride in his achievements in astronomy, feeling
that he was following in the footsteps of Newton, his great hero. Given
his admiration for Newton's discoveries, he grew apoplectic at any reference to the story that the fall of an apple on Newton's head had been
the inspiration for discovering the law of gravity. Gauss characterized
this fable as:
Silly! A stupid, officious man asked Newton how he discovered the
law of gravitation. Seeing that he had to deal with a child intellect,
and wanting to get rid of the bore, Newton answered that an apple
fell and hit him on the nose. The man went away fully satisfied and
completely enlightened.11
Gauss took a dim view of humanity in general, deplored the growing popularity of nationalist sentiments and the glories of war, and
regarded foreign conquest as "incomprehensible madness." His misanthropic attitudes may have been the reason why he stuck so close to
home for so much of his life.12
Gauss had no particular interest in risk management as such. But he
was attracted to the theoretical issues raised by the work in probability,
large numbers, and sampling that Jacob Bernoulli had initiated and that
had been carried forward by de Moivre and Bayes. Despite his lack of
interest in risk management, his achievements in these areas are at the
heart of modern techniques of risk control.
Gauss's earliest attempts to deal with probability appeared in a book
titled Theoria Motus (Theory of Motion), published in 1809, on the motion
of heavenly bodies. In the book Gauss explained how to estimate an
orbit based on the path that appeared most frequently over many separate observations. When Theoria Motus came to Laplace's attention in
1810, he seized upon it with enthusiasm and set about clarifying most of
the ambiguities that Gauss had failed to elucidate.
Gauss's most valuable contribution to probability would come
about as the result of work in a totally unrelated area, geodesic measurement, the use of the curvature of the earth to improve the accuracy
of geographic measurements. Because the earth is round, the distance
between two points on the surface differs from the distance between
those two points as the crow flies. This variance is irrelevant for distances of a few miles, but it becomes significant for distances greater
than about ten miles.
In 1816, Gauss was invited to conduct a geodesic survey of Bavaria
and to link it to measurements already completed by others for
Denmark and northern Germany. This task was probably little fun for an academic stick-in-the-mud like Gauss. He had to work outdoors on
rugged terrain, trying to communicate with civil servants and others he
considered beneath him intellectually, including fellow scientists. In
the end, the study stretched into 1848 and filled sixteen volumes when
the results were published.
Since it is impossible to measure every square inch of the earth's
surface, geodesic measurement consists of making estimates based on
sample distances within the area under study. As Gauss analyzed the
distribution of these estimates, he observed that they varied widely,
but, as the estimates increased in number, they seemed to cluster
around a central point. That central point was the mean (statistical
language for the average) of all the observations; the observations also
distributed themselves into a symmetrical array on either side of the
mean. The more measurements Gauss took, the clearer the picture
became and the more it resembled the bell curve that de Moivre had
come up with 83 years earlier.
The linkage between risk and measuring the curvature of the earth
is closer than it might appear. Day after day Gauss took one geodesic
measurement after another around the hills of Bavaria in an effort to
estimate the curvature of the earth, until he had accumulated a great
many measurements indeed. Just as we review past experience in making a judgment about the probability that matters will resolve themselves in the future in one direction rather than another, Gauss had to
examine the patterns formed by his observations and make a judgment
about how the curvature of the earth affected the distances between
various points in Bavaria. He was able to determine the accuracy of his
observations by seeing how they distributed themselves around the
average of the total number of observations.
The questions he tried to answer were just variations on the kinds
of question we ask when we are making a risky decision. On the average, how many showers can we expect in New York in April, and what
are the odds that we can safely leave our raincoat at home if we go to
New York for a week's vacation? If we are going to drive across the
country, what is the risk of having an automobile accident in the course
of the 3,000-mile trip? What is the risk that the stock market will
decline by more than 10% next year?
The structure Gauss developed for answering such questions is now
so familiar to us that we seldom stop to consider where it came from.
But without that structure, we would have no systematic method for
deciding whether or not to take a certain risk or for evaluating the risks
we face. We would be unable to determine the accuracy of the information in hand. We would have no way of estimating the probability
that an event will occur: rain, the death of a man of 85, a 20% decline
in the stock market, a Russian victory in the Davis Cup matches, a
Democratic Congress, the failure of seatbelts, or the discovery of an oil
well by a wildcatting firm.
The process begins with the bell curve, the main purpose of which
is to indicate not accuracy but error. If every estimate we made were a
precisely correct measurement of what we were measuring, that would
be the end of the story. If every human being, elephant, orchid, and
razor-billed auk were precisely like all the others of its species, life on
this earth would be very different from what it is. But life is a collection of similarities rather than identities; no single observation is a perfect example of generality. By revealing the normal distribution, the
bell curve transforms this jumble into order. Francis Galton, whom we
will meet in the next chapter, rhapsodized over the normal distribution:
[T]he "Law Of Frequency Of Error"... reigns with serenity and in
complete self-effacement amidst the wildest confusion. The huger
the mob ... the more perfect is its sway. It is the supreme law of
Unreason. Whenever a large sample of chaotic elements are taken in
hand ... an unsuspected and most beautiful form of regularity proves
to have been latent all along.13
Most of us first encountered the bell curve during our schooldays.
The teacher would mark papers "on the curve" instead of grading them
on an absolute basis (this is an A paper, this is a C+ paper). Average students would receive an average grade, such as B- or C+ or 80%. Poorer
and better students would receive grades distributed symmetrically
around the average grade. Even if all the papers were excellent or all
were terrible, the best of the lot would receive an A and the worst a D,
with most grades falling in between.
Many natural phenomena, such as the heights of a group of people
or the lengths of their middle fingers, fall into a normal distribution. As Galton suggested, two conditions are necessary for observations to be
distributed normally, or symmetrically, around their average. First,
there must be as large a number of observations as possible. Second,
the observations must be independent, like rolls of the dice. Order is
impossible to find unless disorder is there first.
People can make serious mistakes by sampling data that are not
independent. In 1936, a now-defunct magazine called the Literary
Digest took a straw vote to predict the outcome of the forthcoming
presidential election between Franklin Roosevelt and Alfred Landon.
The magazine sent about ten million ballots in the form of returnable
postcards to names selected from telephone directories and automobile
registrations. A high proportion of the ballots were returned, with 59%
favoring Landon and 41% favoring Roosevelt. On Election Day,
Landon won 39% of the vote and Roosevelt won 61%. People who
had telephones and drove automobiles in the mid-1930s hardly constituted a random sample of American voters: their voting preferences
were all conditioned by an environment that the mass of people at that
time could not afford.
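The arithmetic behind such a miss can be sketched in a few lines. The group sizes and preferences below are invented purely for illustration and are not historical estimates; they are chosen only so that the made-up electorate lands near the percentages quoted above. The point is that polling one group alone reproduces that group's preference, however many ballots come back.

```python
# Hypothetical electorate: (share of all voters, share of the group favoring Roosevelt).
# These figures are assumptions made up for illustration, not historical data.
ELECTORATE = {
    "has_phone_or_car": (0.30, 0.41),
    "has_neither": (0.70, 0.70),
}

def roosevelt_share(electorate, groups):
    """Weighted Roosevelt share among voters belonging to the named groups."""
    weight = sum(electorate[g][0] for g in groups)
    votes = sum(electorate[g][0] * electorate[g][1] for g in groups)
    return votes / weight

# A poll that reaches everyone in proportion to their numbers.
whole_electorate = roosevelt_share(ELECTORATE, list(ELECTORATE))

# A poll drawn only from telephone directories and automobile registrations.
directory_poll = roosevelt_share(ELECTORATE, ["has_phone_or_car"])

print(f"whole electorate: {whole_electorate:.1%} for Roosevelt")
print(f"directory poll:   {directory_poll:.1%} for Roosevelt")
```

Enlarging the directory poll would not move its answer toward the whole-electorate figure, because the missing group never enters the sample.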
Observations that are truly independent provide a great deal of useful information about probabilities. Take rolls of the dice as an example.
Each of the six sides of a die has an equal chance of coming up. If
we plotted a graph showing the probability that each number would
appear on a single toss of a die, we would have a horizontal line set at
one-sixth for each of the six sides. That graph would bear absolutely no
resemblance to a normal curve, nor would a sample of one throw tell us
anything about the die except that it had a particular number imprinted
on it. We would be like one of the blind men feeling the elephant.
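The flat graph described above can be written out directly. This is a minimal sketch that assumes a fair die; the variable names are illustrative only.

```python
from fractions import Fraction

# A fair six-sided die: every face has the same probability.
probabilities = {face: Fraction(1, 6) for face in range(1, 7)}

# The "graph" is a flat line: one bar of identical length per face.
for face, p in probabilities.items():
    print(face, "#" * 10, p)

# The six probabilities must add up to one.
total = sum(probabilities.values())

# The long-run average of a single throw, weighted by probability.
expected_value = sum(face * p for face, p in probabilities.items())
```

Every bar has the same length, so nothing in the picture of a single throw hints at a bell shape.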
Now let us throw the die six times and see what happens. (I asked
my computer to do this for me, to be certain that the numbers were
random.) The first trial of six throws produced four 5s, one 6, and one
4, for an average of exactly 5.0. The second was another hodgepodge,
with three 6s, two 4s, and one 2, for an average of 4.7. Not much
information there.
After ten trials of six throws each, the averages of the six throws
began to cluster around 3.5, which happens to be the average of 1+2+3+4+5+6, or the six faces of the die, and precisely half of the mathematical expectation of throwing two dice. Six of my averages were below 3.5 and four were above. A second set of ten trials was a mixed bag: three of them averaged below 3.0 and four averaged above 4.0; there was one reading each above 4.5 and below 2.5.
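Readers who want to repeat the experiment can do so with a short program. This is a sketch under stated assumptions: it uses Python's standard pseudo-random generator, the seed is an arbitrary choice, and the helper name `six_throw_average` is hypothetical. It will not reproduce the particular throws reported here.

```python
import random

def six_throw_average(rng):
    """Roll a fair die six times and return the average of the six faces."""
    throws = [rng.randint(1, 6) for _ in range(6)]
    return sum(throws) / len(throws)

# A fixed seed (arbitrary) makes the run repeatable.
rng = random.Random(1996)

# Ten trials of six throws each.
averages = [six_throw_average(rng) for _ in range(10)]
for number, average in enumerate(averages, start=1):
    print(f"trial {number:2d}: average {average:.2f}")

# The average of the six faces, which the trial averages cluster around.
face_average = sum(range(1, 7)) / 6
print(f"average of the six faces: {face_average}")
```

Changing the seed produces a different set of ten averages, which is itself a useful reminder of how little ten trials can tell us.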
The next step in the experiment was to figure the averages of the first ten trials of six throws each. Although each of those ten trials had an unusual distribution, the average of the averages came to 3.48! The average was reassuring, but the standard deviation, at 0.82, was wider than I would have liked.*
In other words, seven of the ten trials fell between 3.48 + 0.82 and 3.48 - 0.82, or between 4.30 and 2.66; the rest were further away from the average.
Now I commanded the computer to simulate 256 trials of six throws each. The first 256 trials generated an average almost on target, at 3.49; with the standard deviation now down to 0.69, two-thirds of the trials were between 4.18 and 2.80. Only 10% of the trials averaged below 2.5 or above 4.5, while more than half landed between 3.0 and 4.0.
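The 256-trial run can be sketched the same way. The assumptions here are a fair die, an arbitrary seed, and the population form of the standard deviation; the text does not say which form was used, and the names below are illustrative. The last line computes the value that theory predicts for the standard deviation of a six-throw average.

```python
import math
import random
from statistics import mean, pstdev

def six_throw_average(rng):
    """Average of six throws of a fair die."""
    return sum(rng.randint(1, 6) for _ in range(6)) / 6

rng = random.Random(256)  # arbitrary seed, for repeatability only
averages = [six_throw_average(rng) for _ in range(256)]

sample_mean = mean(averages)
sample_sd = pstdev(averages)  # population form; an assumption

# Share of the 256 trials lying within one standard deviation of the mean.
within_one_sd = sum(
    abs(a - sample_mean) <= sample_sd for a in averages
) / len(averages)

# Theory: one throw has variance 35/12, and averaging six
# independent throws divides that variance by six.
theoretical_sd = math.sqrt(35 / 12 / 6)

print(f"mean {sample_mean:.2f}, standard deviation {sample_sd:.2f}")
print(f"share within one standard deviation: {within_one_sd:.0%}")
print(f"theoretical standard deviation: {theoretical_sd:.3f}")
```

The theoretical figure is about 0.70, close to the 0.69 reported above for 256 trials.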
The computer still whirling, the 256 trials were repeated ten times. When those ten samples of 256 trials each were averaged, the grand average came out to 3.499. (I carry out the result to three decimal places to demonstrate how close I came to exactly 3.5.) But the impressive change was the reduction of the standard deviation to only 0.044. Thus, seven of the ten samples of 256 trials fell between the narrow range of 3.455 and 3.543. Five were below 3.5 and five were above. Close to perfection.
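The shrinking standard deviations are what theory predicts for averages of independent throws. As a worked sketch, assuming a fair die and using symbols that are not in the original (sigma with a subscript for the number of things being averaged):

```latex
\begin{aligned}
\sigma_{1}^{2} &= \frac{1^2 + 2^2 + \cdots + 6^2}{6} - 3.5^2
               = \frac{91}{6} - \frac{49}{4} = \frac{35}{12}, \\
\sigma_{6}     &= \frac{\sigma_{1}}{\sqrt{6}} = \sqrt{\frac{35}{72}} \approx 0.697, \\
\sigma_{256}   &= \frac{\sigma_{6}}{\sqrt{256}} \approx \frac{0.697}{16} \approx 0.044 .
\end{aligned}
```

Here the first line is the variance of a single throw, the second is the standard deviation of the average of six throws, and the third is the standard deviation of the average of 256 such trials. The last two values agree with the 0.69 and 0.044 reported in the experiment.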