CHANCE News 13.03

First digit    Benford's probability
1              30.1%
2              17.6%
3              12.5%
4              9.7%
5              7.9%
6              6.7%
7              5.8%
8              5.1%
9              4.6%
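The probabilities in the table come from Benford's Law, which says the first significant digit is d with probability log10(1 + 1/d). A quick sketch to reproduce the table:

```python
import math

# Benford's Law: P(first digit = d) = log10(1 + 1/d), for d = 1..9
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

for d, p in benford.items():
    print(f"{d}: {100 * p:.1f}%")   # 1: 30.1%, 2: 17.6%, ..., 9: 4.6%
```

Note that the nine probabilities sum to 1, since the product of the terms (1 + 1/d) telescopes to 10.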
For his project, Greg wanted to see if "natural numbers" on the web satisfied Benford's Law. He writes:
I wanted to understand numbers on the World Wide Web in which real live people were actually interested. In particular, I did not want to accidentally include numbers from data sets intended only for data mining purposes. To accomplish this, I included a piece of text in my search. I wanted to choose a natural piece of text, hence (for lack of a better idea) I used the word “nature”. Thus, my Google Numbers are numbers that occur on a web page that also includes the word “nature”.
I wanted my search to produce robust but reasonable numbers of results. This
is because I wanted to leave myself in a position to actually examine the resulting
hits in order to achieve a sense for how the numbers were derived.
A little experimenting led Greg to the conclusion that searches for six-digit numbers and the word "nature" resulted in a reasonable number of hits. So he chose nine random five-digit numbers and for each of these he added all possible leading digits. His first five-digit number was x = 13527, giving him the 9 six-digit numbers 113527, 213527, 313527, ..., 913527. He then searched for each of these numbers and the word "nature" in Google and recorded the number of hits. Here is what he found:
x = 13527    occurrences
113527       136
213527       44
313527       35
413527       30
513527       27
613527       15
713527       9
813527       13
913527       8
He repeated this for his eight other random five-digit numbers and combined the results to obtain:
Leading digit    Count    Empirical percent    Benford
1                645      31.65%               30.1%
2                342      16.78%               17.6%
3                262      12.86%               12.5%
4                181      8.88%                9.7%
5                164      8.05%                7.9%
6                143      7.02%                6.7%
7                115      5.64%                5.8%
8                105      5.15%                5.1%
9                81       3.97%                4.6%
This is a remarkably good fit. Here is his graphical comparison:
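Beyond the graphical comparison, the quality of the fit can be quantified with a chi-square goodness-of-fit statistic against the Benford expectations (our sketch, not part of Greg's project):

```python
import math

counts = [645, 342, 262, 181, 164, 143, 115, 105, 81]  # leading digits 1..9
n = sum(counts)  # 2038 hits in total

# Expected counts under Benford's Law
expected = [n * math.log10(1 + 1 / d) for d in range(1, 10)]

chi2 = sum((o - e) ** 2 / e for o, e in zip(counts, expected))
print(f"chi-square = {chi2:.2f} on 8 degrees of freedom")
# The 5% critical value for 8 degrees of freedom is about 15.51, so a
# statistic well below that is consistent with the Benford distribution.
```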
Greg wondered if he was just lucky or if there was some explanation for such a good fit. Looking for an explanation, he found that many of the numbers he observed could be considered the result of a growth process. As an example of such a growth process, consider the money you have in the bank that is continuously compounded. Then it is easy to check that the percent of time your money has leading digit k for k = 1,2,3,...,9 fits the Benford distribution. Greg remarks:
Hence, we would expect Google numbers to have a Benford distribution if they satisfied two criteria: first that every Google Number behaves like money with interest continuously compounded, and, second that the probability that a Google number is posted on the web is proportional to how long that quantity is meaningful.
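The claim about continuously compounded money can be checked numerically: sample an exponentially growing balance at evenly spaced times spanning whole powers of ten and tally the leading digits (a sketch; the growth rate is an arbitrary choice of ours):

```python
import math

rate = 0.07        # arbitrary continuous growth rate
decades = 10       # watch the balance grow through exactly 10 powers of ten
T = decades * math.log(10) / rate
steps = 100_000

counts = [0] * 10
for i in range(steps):
    t = (i + 0.5) / steps * T
    # Leading digit of e^(rate*t), read off the fractional part of log10
    frac = (rate * t / math.log(10)) % 1.0
    counts[int(10 ** frac)] += 1

for d in range(1, 10):
    print(d, counts[d] / steps)   # close to log10(1 + 1/d)
```

Because the growth is exponential, log10 of the balance advances linearly in time, so the fraction of time spent with leading digit k is exactly the Benford probability log10(1 + 1/k).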
We gave Greg an A on his project but you should read it here yourself. You can also see what Greg did in the Chance Course and some student projects here.
DISCUSSION QUESTION:
Repeat Greg's experiment replacing "nature" by a different word. Do you get similar results?
Chew on.
The New Yorker, 9 Feb. 2004, The Talk of the Town; You Don't Say Dept., p. 22
Ben McGrath

Evaluation of CDs and chewing gum in teaching dental anatomy (abstract).
K. L. Allen et al.
The authors of the study initially set out to compare the effectiveness of attending traditional lectures versus using a CD-ROM to learn dental anatomy. Apparently the gum manufacturer, Wrigley, was willing to support the study, and for some reason (though perhaps one could guess) the researchers modified their design to also compare the effects of gum chewing on learning. Specifically, students were divided into four groups: half attended a standard lecture and lab, half used a commercially available instructional CD and lab, and half of each of these groups were required to chew gum. The procedures were continued for three days, and then the students were given both a machine-graded, multiple-choice test and a practical exam.
The results were mixed. The 29 students who chewed gum had an average score on the written test that was slightly higher than the average of the 27 non-gum-chewing students: 83.6 versus 78.8. The difference on the written test between the CD students (n = 30) and the lecture students (n = 26) was smaller: 83.7 versus 81.3. (Standard deviations were not provided.) According to the study abstract, on the practical exam there were no differences between groups.
DISCUSSION QUESTIONS:
(1) The researchers write that "only the written examination average scores
for the gum vs. no gum chewing groups showed differences which appear to be
educationally meaningful, though not statistically significant." What do
you think this means?
(2) Do you think the study provides sufficient evidence that gum chewing helps
students learn? Why or why not?
(3) Do you think the study provides sufficient evidence that CD-ROMs are as effective as lectures? Why or why not? For this and the previous question, how might a smaller versus a larger exam-score standard deviation affect your answer?
(4) The New Yorker article states that "fifty-six students took part in the pilot study, and, as Allen said, 'We really need a sample size of about two hundred to determine [the results] beyond a reasonable doubt.'" Why do you think Allen chose "about two hundred" as a preferred sample size? Do you think that "beyond a reasonable doubt" is an appropriate phrase to use in this context?
The next article was suggested by Joan Garfield
Is your radio too loud to hear the phone? You messed up a poll.
Wall Street Journal, 12 March 2004
Sharon Begley
Begley begins her article with:
(Math-averse readers are allowed to skip this paragraph.) The sampling error represents the range of possible outcomes from a random, representative slice of the population. For practical purposes, it equals 1 divided by the square root of the number of people surveyed. If you poll 1,600 people, then the sampling error is 1/40, or 2.5%.
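This rule of thumb is easy to verify:

```python
import math

def sampling_error(n):
    """Rough margin of error for a poll of n people: 1 / sqrt(n)."""
    return 1 / math.sqrt(n)

print(f"{sampling_error(1600):.1%}")  # 2.5%, the figure in the article
print(f"{sampling_error(600):.1%}")   # about 4.1% for a 600-person poll
```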
The bulk of the article is devoted to nonsampling errors. Begley remarks that it is well known that polls tend to include too many women, too many whites, and too many older folks. If women are more likely than men to favor a particular candidate, the poll will give a biased result. To avoid this, pollsters adjust for demographic factors by weighting the numbers so they match the census. The Gallup poll has a detailed description of how their polls are carried out available here, where we read:
In most polls, once interviewing has been completed, the data are carefully checked and weighted before analysis begins. The weighting process is a statistical procedure by which the sample is checked against known population parameters to correct for any possible sampling biases on the basis of demographic variables such as age, gender, race, education, or region of country.
Begley remarks that people are reluctant to indicate how wealthy they are, and so this might be more difficult to correct for, though she suggests that zip codes might serve as a proxy for wealth. She continues:
Worse, polls likely undersample or oversample people in categories the census doesn't count, making an adjustment... virtually impossible. Prof. Gelman's favorite example is surly people. They're more likely to treat a pollster as they would a telemarketer, hanging up and therefore not having their views included. But we don't know how many surly people are in the voting population. If surly people lean toward one candidate, then a poll asking "whom are you most likely to vote for?" will underestimate his support.
Fritz Scheuren, at the University of Chicago's National Opinion Research Center and president-elect of the American Statistical Association, considers "no-response" the biggest source of nonsampling error, commenting:
If pollsters really wanted to indicate how good their sample is, they'd skip the plus-or-minus X% and reveal the no-response rate.
Begley says that pollsters don't call cell phones (the owner might be driving) and Scheuren comments:
As more and more people have only a cell, you have a problem. Because no one knows how ditching one's land line correlates with political leanings, pollsters can't tell how omitting the cell-only population distorts reality.
We wondered why cell phones could not be used. An article by John Kamman in the online Arizona Republic, Dec. 30, 2003, addressed this issue. According to this article, pollsters are indeed quite worried about the fact that they cannot contact those whose only phone is a cell phone. Kamman writes:
For a decade, Federal Communications Commission regulations have restricted pollsters from using modern dialing equipment to call cell phones. And even if they dial by hand, another rule prohibits them from phoning anyone who would have to pay for the call. A violation makes the caller vulnerable to a lawsuit with a penalty of at least $500.
This issue was complicated by the 2003 FCC ruling giving customers the right to keep their regular phone number when switching phone companies, making it hard for pollsters to know if they are violating the government rules. The FCC says that the pollsters have ways of tracking changes of numbers from wired to wireless, but pollsters say this is not so easy.
It is thought that the number of people who now rely only on cell phones is not large, but it obviously involves more younger people, which would introduce a bias. It is also pretty obvious that as cell phone usage increases, pollsters will have to find some solution to this problem, as well as to the problem of increased use of answering machines and other call-filtering devices. Earl de Berge, research director for the Behavior Research Center in Phoenix, suggests an old-fashioned remedy. He remarks: "I wouldn't be surprised to see more door-to-door polling."
It was not clear to us what Begley meant by "a random, representative slice of the population" (see first paragraph). This seems related to a question asked on sci.stat.math: "Is there a statistical definition of representative sample?" In this discussion there was little agreement on what a representative sample might mean. Some said that it simply meant a random sample, but others that it meant a stratified random sample. Others thought that there is no proper definition. Here is one response:
J Dumais
Organization: Wanadoo, l'internet avec France Telecom
Date: Fri, 3 Nov 2000 19:18:01 +0100
Once upon a time (up to 1925 or about), there was "representative sampling" meaning "creating a representation of the population in the sample", i.e. quotas. The caricature is that to obtain the so-called representation, you *have to* get those 5 middle-aged, out-of-work people with post-secondary education living in towns of less than 20,000 people, otherwise you don't have the "true" representation.
With Neyman's theory on disproportional allocation to stratified random samples,
"representative sampling" sort of faded away, at least among (mathematical)
statisticians dealing with survey sampling.
I have searched in more than 20 textbooks, never coming up with any satisfying
"modern" definition of "representative sampling". Basically,
(mathematical) statisticians have dropped the notion altogether. When I hear
it, I'll question the speaker (if I can) as to what he/she actually means, and
it more or less boils down to unbiasedness.
One writer said that, in certain situations, a "representative sample" was required by the FDA so we asked Susan Ellenberg at the FDA if they have a definition of a "representative sample". She replied:
"Representative sample" is used primarily to describe material a product manufacturer must supply to the FDA for testing. It is actually defined as follows, in Volume 21, Part 210 in the Code of Federal Regulations:
(21) Representative sample means a sample that consists of a number of units that are drawn based on rational criteria such as random sampling and intended to assure that the sample accurately portrays the material being sampled.
I have to say I don't think this is the world's greatest definition, but the latter part of the sentence conveys the basic intent. This section of the regulations was initially published in 1978 and was most recently revised in 1993, so the wording is somewhere between 11 and 26 years old.
DISCUSSION QUESTION:
Well, Google found "representative sample" about 381,000 times so apparently it is not dead. What do you think "a representative sample" means?
The facts don't matter.
PBS, Ira Glass: This American Life, March 12, 2004, Episode 260, Act 2, 43:06, 15 min
Sarah Koenig
In the last segment of this program Sarah Koenig visits a John Zogby polling operation to get some idea of how the questions are asked in their telephone polls and how they are answered. This poll was described by Zogby as:
Zogby International conducted telephone interviews of a random sampling of 600 likely primary voters statewide over a rolling three-day period. All calls were made from Zogby International headquarters in Utica, N.Y., from Friday, February 13th through Sunday, February 15. The margin of error is +/- 4.1 percentage points. Slight weights were added to age, race, union and gender to more accurately reflect the voting population. Margins of error are higher in subgroups.
You will enjoy listening to this segment of the program. We describe the first few minutes to encourage you to do so.
We hear a Zogby interviewer named Boden carrying out his first interview of the day. Sarah reports that Boden is interviewing a woman, aged 41, union member, separated, college graduate, white, conservative, making between 25 and 50 thousand dollars a year.
Boden: Good afternoon, my name is Boden and I am calling for
Zogby International. Today we are doing a poll of Wisconsin voters for Reuters/MSNBC
news.
Boden: And how likely are you to vote in the national elections: very likely,
somewhat likely, or not likely?
Answer: Very likely
Boden: And the Democratic candidates for 2004 are: Howard Dean, John Edwards, John Kerry, Dennis Kucinich, and Al Sharpton. If the primary were held today, for whom would you vote out of these Democrats?
Answer: Not sure
Boden: O.K. You're sure you're not sure or might you be leaning towards one?
Answer: Not sure
Boden: You're not sure at this point? O.K.
Sarah: Boden entered "undecided" in his computer and his screen comes up with this question, which is basically the same question asked in a different way.
Boden: And if you had to choose today, if you had to choose, which candidate might you just be leaning towards: Dean, Edwards, Kerry, Kucinich or Sharpton? Just the slightest leaning towards one of them if you had to choose today.
Answer: Dean
Boden: O.K. Thank you.
This illustrates how hard they work to get an answer. Sarah reports that almost 10,000 calls had to be made to get the desired 600 responses.
After the election Sarah called a sample of those who responded to the poll to ask them about their experience with the poll and who they finally voted for.
Sarah also discusses what she learned from observing this poll with Daniel Yankelovich, a pioneer in modern polling who has his own polling company and is the author of the well-known book "Coming to Public Judgment".
This would be a great program for students to listen to though it might not increase their faith in political polls.
DISCUSSION QUESTIONS:
(1) What do you think about the interview?
(2) Here are the results of the Zogby poll released February 16, 2004:
And here are the results of the election held on February 17, 2004:
Were the results within the margin of error? If not, why not?
In basketball games, in order to make the bets even, the gambling sites give points to the underdog team. In order to win a bet on the favored team, that team must win by more than the number of points assigned to their opponent. Conversely, a bet on the underdog will win as long as that team either loses by fewer points than it has been given or wins the game outright. The team that represents the winning bet is said to have 'beat the spread.' This point spread is set by people known as bookies. Their job is to set the spread so that about half of the bettors will bet on each side, thereby limiting the gambling sites' exposure (and guaranteeing income for the sites, by way of the vig).
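The settlement rule just described can be sketched as a small function (the names and example numbers are ours):

```python
def bet_result(favorite_margin, spread):
    """Settle a point-spread bet.

    favorite_margin: favorite's score minus underdog's score (negative if
                     the favorite lost outright)
    spread: points given to the underdog
    Returns which side beat the spread, or 'push' on an exact tie.
    """
    if favorite_margin > spread:
        return "favorite covers"
    if favorite_margin < spread:
        return "underdog covers"
    return "push"

print(bet_result(12, 7.5))   # favorite wins by 12, giving 7.5: favorite covers
print(bet_result(5, 7.5))    # wins by only 5: underdog beats the spread
print(bet_result(-3, 7.5))   # favorite loses outright: underdog covers
```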
In 1989, Colin Camerer wrote a paper containing statistical evidence that the bookies overemphasized streaks, in that they tended to assign more points than they should have to teams with long losing streaks, and they tended to assign more points to the opponents of teams with long winning streaks than they should have. For example, in the three years' worth of data that Camerer collected, teams who had won at least three games in a row only managed to beat the spread 45.6% of the time against teams with either smaller winning streaks or any type of losing streak. (The reason for this last proviso is, Camerer argued, that if two teams with winning streaks play each other, the one with the longer streak is the 'hotter' team, in the eyes of the bookies and the betting public.)
Thus, had someone bet against the 'hot' team, because it was thought that the bookies were overvaluing the streak, the bettor would have won 53.4% of the bets. The corresponding results involving 'cold' teams were as follows. Teams with at least threegame losing streaks beat the spread against less 'cold' teams 340 out of 643 games, for a winning percentage of 52.9%.
As usual, we have to check whether the above observations are significant. There were 698 games that formed the first observation, and if the point spreads were set so that each team had a 50% chance of winning in a given game, then one would expect to see about 349 of the 'hot' teams and 349 of their opponents win. In fact, 45.6% of 698 is 318, and the probability that one would see 318 or fewer wins by the 'hot' teams is about 1%. The corresponding p-value for the observation involving 'cold' teams is about 7%.
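These tail probabilities can be reproduced with a normal approximation to the binomial (a sketch using only the standard library; an exact binomial test would do as well):

```python
from statistics import NormalDist

def pvalue_at_most(wins, games):
    """P(X <= wins) for X ~ Binomial(games, 0.5), via a normal
    approximation with continuity correction."""
    mean = games / 2
    sd = (games * 0.25) ** 0.5
    return NormalDist(mean, sd).cdf(wins + 0.5)

# 'Hot' teams beat the spread only 318 times in 698 games (lower tail)
print(f"{pvalue_at_most(318, 698):.3f}")       # about 0.01

# 'Cold' teams beat the spread 340 times in 643 games (upper tail)
print(f"{1 - pvalue_at_most(339, 643):.3f}")   # roughly 0.07-0.08
```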
Your intrepid Chance News editors decided to see what the corresponding data looked like in the current NBA (National Basketball Association) season. The results are not pretty. Suppose, for example, that we bet against the teams with winning streaks of at least three games (if they are playing a team that is not as 'hot' as they are). Through March 24, 2004, we would have lost 128 out of 242 bets, for a winning percentage of 47.1%. If one had bet on teams with at least three-game losing streaks, figuring that the bookies were undervaluing these teams, one would have won 111 out of 243 bets, for a winning percentage of 45.7%. The p-values of these observations are .18 and .09.
What is interesting is that now the teams on winning streaks are beating the spread more than half the time, while the teams on losing streaks are failing to beat the spread more than half the time, which is the reverse of what happened in Camerer's data set.
One could interpret the new data to mean that the bookies have learned over the years not to overemphasize streaks. In fact, one might be tempted to bet in favor of 'hot' teams and against 'cold' teams, based on the current data.
As we were perusing the Web to find out what was happening in the world of sports betting, we came across a website with a fairly impressive record. This website had made a study of professional basketball games and had determined that the average number of 'possessions' that a team has over the course of a season is well correlated with that team's winning percentage. The number of possessions of a team in a game is defined by the following formula: Number of Possessions = Field Goal Attempts + Turnovers - Offensive Rebounds + 0.44 x Free Throws.
Ignoring the .44 for the moment, we can interpret this formula as follows. If a team takes the ball downcourt, one of three things can happen: it can attempt a field goal, it can make a turnover, or there can be a foul called. Once a field goal is attempted, the team keeps the ball only if it misses the field goal attempt and gets an offensive rebound, or it gets fouled. So, for example, if a team takes the ball downcourt and attempts three field goals, getting two offensive rebounds in the process, this sequence contributes 1 to the above sum. The .44 seems to have been computed by a linear regression of the above summands against the team's scores.
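The formula translates directly into code (a sketch; the function name and the box-score numbers are ours):

```python
def possessions(fga, turnovers, off_rebounds, fta):
    """Estimate a team's possessions from box-score totals.

    fga: field goal attempts; fta: free throw attempts.
    The 0.44 factor converts free throw attempts into the possessions
    they end, since a single possession can produce 1, 2, or 3 free throws.
    """
    return fga + turnovers - off_rebounds + 0.44 * fta

# Hypothetical box score: 85 FGA, 14 turnovers, 12 offensive rebounds, 25 FTA
print(possessions(85, 14, 12, 25))  # 98.0
```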
After this was posted, reader Ann Azevedo suggested the following alternative explanation for the .44:
The .44 is only a multiplier for free throws, and the number of free throws attempted in any given possession can vary:
1 free throw: awarded after a made basket, or when the front end of a 1-and-1 free throw is missed;
2 free throws: awarded for a foul on a missed 2-point field goal, for a foul not on a field goal attempt when in the double-bonus (10 or more team fouls), or for making the front end of a 1-and-1 and thus being awarded a second free throw;
3 free throws: awarded for a foul on a missed 3-point field goal.
I would think the 0.44 is a factor for correcting the number of free throws to the number of possessions they represent.
The website uses this information to predict the score of each game. This predicted score can be compared to the point spread published by the bookies at the gambling websites. For example, suppose San Antonio is playing Philadelphia, and the website predicts that the score will be 101-90, with San Antonio winning. If the bookies are giving fewer than 11 points to Philadelphia, the website suggests a bet should be made on San Antonio.
The website also separates out some games that are labeled 'best bets.' These are games in which the website's point spread differs from the bookies' spread by 4.5 points or more. So for example, suppose that the website predicts that Milwaukee will beat New Orleans by 10 points, but the bookies are only giving 5 points to New Orleans. The difference between the two point spreads is 5 points, so this is a 'best bet' on Milwaukee.
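The 'best bet' rule can be sketched the same way (function and variable names are ours; the 4.5-point threshold is the one described above):

```python
def pick(predicted_margin, bookie_spread, threshold=4.5):
    """Compare the site's predicted margin (points by which team A is
    predicted to win) with the bookies' spread (points given to team B).
    Returns the recommended side and whether the gap between the two
    spreads reaches the 'best bet' threshold."""
    edge = predicted_margin - bookie_spread
    side = "A" if edge > 0 else "B"
    return side, abs(edge) >= threshold

# Milwaukee predicted to win by 10, bookies give New Orleans only 5 points:
# the spreads differ by 5 >= 4.5, so this is a 'best bet' on Milwaukee.
print(pick(10, 5))   # ('A', True)
```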
In the 2002-03 season, the website was right on 52.2% of all of the games and on 55.9% of the best bets (there were 295 best bet games, and the website predicted the correct team in 165 of these games). The p-values of these observations (against the null hypothesis that the website does no better than 50%) are .10 and .02. When we discovered this website (in December of 2003), the website's winning percentages for all games and best bet games were about 53% and 58%, respectively. Thus, they were similar to the percentages recorded in the preceding season.
To further the cause of applied statistics, two of us got together and put a small amount of money (each contributed $100) down on the ensuing best bets as predicted by this website. Before we started betting, the website's win-loss record on best bet games was 90-73, for a winning percentage of 55.2%.
What happened next can only be classified as an unmitigated disaster. In the ensuing two months, before we went broke, the best bet record was 45-55, and it's not very hard to figure out what that winning percentage is. In fact, since January 1, 2004, the website's best bet record is 65-87, for a winning percentage of 42.8%, and their overall record is 238-303, for a winning percentage of 44.0%. Luckily, some of us have quite a few years of gainful employment left to make up for our losses in this endeavor.
REFERENCES:
Camerer, C. F. (1989). Does the basketball market believe in the 'hot hand'? American Economic Review, 79, 1257-1261. Available from JSTOR.
Gerry Grossman suggested this article:
1-2-3, NHL playoff teams will be ...
denews.com, MoreSports, Sunday, March 7, 2004
Here is what Steffy and Cheng say about their project:
In the NHL there are 30 teams, which are arranged in five-team divisions. There are two conferences, the East and the West, each containing three divisions. Teams are awarded two points for a win, one point for a tie, one point for an overtime loss, and zero points for a loss. After the regular season is completed, eight teams are chosen from each conference to advance to the tournament. In each conference, the qualifying teams are the leader of each division (the team in the division with the most points) and the next five highest-ranked teams in the conference according to points. Special rules are also employed to break ties between teams who have the same number of points.
We developed a mixed integer program formulation for the following two problems: the Guaranteed Qualification Problem (is a given team guaranteed a place in the finals?) and the Possible Qualification Problem (does a given team still have a chance of qualifying?).
In addition, we modified our formulations to solve the Guaranteed Qualification and Possible Qualification problems for Division Leader Status, Conference Leader Status, and the Presidents' Trophy.
You can see the current standings as computed by their program here and more about their project here.
Of course, this isn't a statistics or probability question, but it could become one if we tried at the same time to estimate the probability that a given team reaches the finals. This might be of greater interest to the bookies.
DISCUSSION QUESTION:
How would you estimate the probability that, at a given time, a particular team would reach the finals? Would the program of Steffy and Cheng help you?
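One simple approach to this question is Monte Carlo: simulate the remaining schedule many times with estimated per-game win probabilities, apply the qualification rules, and count how often the team qualifies. A bare-bones sketch with made-up inputs, in which qualification is crudely reduced to "top eight point totals in the conference" and every game is a coin flip:

```python
import random

def playoff_probability(points, remaining, team, p_win=0.5, trials=10_000):
    """Estimate the chance that `team` finishes in the conference's top eight.

    points: dict of current point totals; remaining: list of (home, away)
    games still to play. Each game gives the winner two points, with the
    home team winning with probability p_win. A real model would use team
    strengths, ties, overtime losses, and the NHL tie-breaking rules.
    """
    qualify = 0
    for _ in range(trials):
        totals = dict(points)
        for home, away in remaining:
            winner = home if random.random() < p_win else away
            totals[winner] += 2
        top8 = sorted(totals, key=totals.get, reverse=True)[:8]
        qualify += team in top8
    return qualify / trials

# Toy 10-team conference, everyone level on points, one game left each
teams = [f"T{i}" for i in range(10)]
pts = {t: 60 for t in teams}
games = [(teams[i], teams[(i + 1) % 10]) for i in range(10)]
print(playoff_probability(pts, games, "T0"))
```

Steffy and Cheng's integer program could sharpen such a simulation by pruning: any team their Guaranteed Qualification test certifies has probability 1, and any team that fails the Possible Qualification test has probability 0, so only the undecided teams need to be simulated.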
Norton Starr brought to our attention the following announcement from the Royal Statistical Society:
This issue of the Journal of the Royal Statistical Society Series A features a collection of short papers on the communication of risk, with guest associate editors D. R. Cox and S. C. Darby:
Title, authors, and page number:
The communication of risk. D. R. Cox and S. C. Darby (203)
Introduction to the papers on 'The communication of risk'. A. F. M. Smith (205)
Human immunodeficiency virus risk: is it possible to dissuade people from having unsafe sex? J. Richens, J. Imrie and H. Weiss (207)
Communicating risk—coronary risk scores. I. M. Graham and E. Clavel (217)
Tobacco—the importance of relevant information on risk. S. C. Darby (225)
Tobacco: public perceptions and the role of the industry. D. Simpson and S. Lee (233)
Communication of risk: health hazards from mobile phones. D. R. Cox (241)
Crime victimization: its extent and communication. P. Wiles, J. Simmons and K. Pease (247)
Accidental fatalities in transport. A. W. Evans (253)
Communicating the risks arising from geohazards. M. S. Rosenbaum and M. G. Culshaw (261)
The RSS comments:
To assist in the wider appreciation of the issues raised, a short commentary on the papers has been commissioned from the well-known science writer, Geoff Watts.
This commentary is available here.
These articles are all very good and are available electronically if your library subscribes to the journal. They are short articles which describe the issues involved and any of them could be the basis for an interesting discussion in a statistics class. We illustrate this by discussing the article on coronary risk scores.
Heart disease is the largest cause of death in adults in their middle years and older in most European countries. Risk factors for heart disease such as smoking, high cholesterol levels, and high blood pressure are well known. A physician is typically not in a position to assess the overall risk when several risk factors are taken into account. To assist with this, a number of charts, score cards, computer programs, etc. have been developed. There is a bewildering array of these tools, using different risk factors and based on different data.
In 1994 the European Society of Cardiology developed a chart from which the probability of coronary heart disease in the next ten years could be determined, given the patient's age, sex, blood pressure, total cholesterol level and smoking status. This was widely used in Europe. However, experience with this chart showed some problems. It was based on a relatively small data set of about 5000 people from the Framingham study, so some risk combinations had to be based on very little data. Also, in some countries the risk of heart disease is lower than in others. One such country is Italy. Researchers in Italy constructed a chart just like the chart that was being used in Europe but based on data from their country. The resulting chart looked quite different from the one they had been using.
So, in 1998 a new chart called SCORE was developed and is now in use. The new chart was designed to estimate the probability of death in the next ten years from any cardiovascular event. It was based on 12 European cohort studies which involved over 200,000 subjects and contained some 3 million person-years of observation and more than 7000 fatal cardiovascular events.
Here is the resulting chart:
Ten-year risk of fatal CVD in high-risk regions of Europe, by gender, age, systolic blood pressure, total cholesterol and smoking status.
You can see a similar chart for low risk European countries here.
Here are the instructions on how to use the chart as given in De Backer et al [2].
So let's see what my (Laurie's) risk is. Laurie is a male, does not smoke, has systolic blood pressure 130 and total cholesterol 226. If Laurie were in his fifties he would have only a 1% chance of dying of a cardiovascular event. If he were in his sixties he would have a 5% chance. However, Laurie is in his 70s, so he is off the chart. Thus he will have to live with (die with?) the 5%. Note that he could probably get it down to 2% by taking more pills to lower his cholesterol.
Armed with this ammunition, Laurie asked his friendly doctor if he should start taking more pills. His doctor drew several sharply rising curves, saying: "The first curve is for the risk of dying of a heart attack as a function of age, the second for dying of cancer, the third for dying of Parkinson's disease, etc. Taking away one of these won't help all that much, and many people consider a heart attack a pretty good way to go!" So Laurie is not taking more pills.
REFERENCES:
[1] Conroy et al., Estimation of ten-year risk of fatal cardiovascular disease in Europe: the SCORE project. European Heart Journal, 2003;24:987-1003.
[2] De Backer et al., European guidelines on cardiovascular disease prevention in clinical practice. European Heart Journal, 2003;24:1601-1610.
DISCUSSION QUESTION:
What do you think of Laurie's doctor's analysis?
Ask Marilyn
Parade Magazine, 28 March, 2004
Marilyn vos Savant
Readers of Marilyn's column now know something about the history of probability from the following multiple-choice question that she provided at the end of this column:
In the 17th century, French mathematicians Pierre de Fermat and Blaise Pascal together developed modern probability theory in the course of what activity?
A. Playing spin-the-bottle with a milkmaid.
B. Predicting the succession of King Louis XIV.
C. Counterfeiting the first scratch-off lottery tickets.
D. Answering a gambler's query about why he lost money on dice.
But, as usual, we ask "Does she have it right?"
In his article "Pascal and the Invention of Probability Theory" [1] Oystein Ore writes:
Most textbooks on probability feel obliged to include a brief
account of the history of the subject. Their descriptions of this process of
initiation usually run somewhat in the following vein: "In the year 1654 a gambler named de Mere proposed to Pascal two problems which he had run across in his experiences at the gaming table".
It is likely that the distinguished Antoine Gombaud chevalier de Mere, sieur
de Baussay, would turn in his grave at such a characterization of his main occupation
in life. He certainly considered himself a model of courtly behavior and taught
his esthetic principles elegantly to the haut monde as one may see
from the frontispiece of his collected works. His writings...have secured him
a permanent niche in the French literature of the seventeenth century.
One of the problems proposed to Pascal was a "dice problem" and the second was "the problem of points", which had a long history going back to the 15th century. Pascal consulted Fermat about these two problems, and their correspondence is considered by many to be the beginning of probability theory. Their letters, translated into English, can be found in F. N. David's book "Games, Gods and Gambling" [2] and are available on the web here as part of the University of York's Materials for the History of Statistics.
An example of what Ore thinks might make de Mere turn in his grave can be found in Grinstead and Snell's book Introduction to Probability. Here we read:
It is said that de Mere had been betting that, in four rolls of a die, at least one six would turn up. He was winning consistently and, to get more people to play, he changed the game to bet that, in 24 rolls of two dice, a pair of sixes would turn up. It is claimed that de Mere lost with 24 and felt that 25 rolls were necessary to make the game favorable.
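The arithmetic behind these two bets can be checked directly. A minimal sketch using exact rational arithmetic (Python's fractions module) shows why the first game was favorable and the second was not:

```python
from fractions import Fraction

# P(at least one six in 4 rolls of one die) = 1 - (5/6)^4
p_one_die = 1 - Fraction(5, 6) ** 4      # 671/1296, just over 1/2

# P(at least one pair of sixes in 24 rolls of two dice) = 1 - (35/36)^24
p_two_dice = 1 - Fraction(35, 36) ** 24  # just under 1/2

print(p_one_die, float(p_one_die))   # 671/1296 ≈ 0.5177
print(float(p_two_dice))             # ≈ 0.4914
```

The two probabilities straddle 1/2 so narrowly that, as Ore argues below, no gambler could have detected the difference at the table.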
But this is called the "de Mere legend" by Maistrov [3] in his history of probability book. He states that the legend is presented in detail in a story by Khinchin and Yaglom titled "The story of the Knight de Mere." Maistrov reproduces this story with some omissions and also provides references to the original story (in Russian). Maistrov argues that de Mere did not turn to Pascal with a problem from an actual gambling experience, but with a purely theoretical question.
So the question that we should ask Marilyn is: did Pascal pose the dice problem because he was a gambler or because he was interested in probability theory?
Since the first letter from Pascal to Fermat is missing, we have to try to determine this from later letters. In Fermat's reply to the missing first letter, he gives his method for determining the chance of getting a six in a specific number of rolls of a die, and seems to suggest that Pascal got it wrong:
Fermat to Pascal
1654 [undated]
Monsieur,
If I undertake to make a point with a single die in eight throws, and if
we agree after the money is put at stake, that I shall not cast the first throw,
it is necessary by my theory that I take 1/6 of the total sum to be impartial
because of the aforesaid first throw.
And if we agree after that that I shall not play the second throw, I should,
for my share, take the sixth of the remainder, that is, 5/36 of the total. If,
after that, we agree that I shall not play the third throw, I should, to recoup
myself, take 1/6 of the remainder which is 25/216 of the total.
And if subsequently, we agree again that I shall not cast the fourth throw,
I should take 1/6 of the remainder or 125/1296 of the total, and I agree with
you that that is the value of the fourth throw supposing that one has already
made the preceding plays.
But you proposed in the last example in your letter (I quote your very terms)
that if I undertake to find the six in eight throws and if I have thrown three
times without getting it, and if my opponent proposes that I should not play
the fourth time, and if he wishes me to be justly treated, it is proper that
I have 125/1296 of the entire sum of our wagers.
This, however, is not true by my theory. For in this case, the three first throws
having gained nothing for the player who holds the die, the total sum thus remaining
at stake, he who holds the die and who agrees to not play his fourth throw should
take 1/6 as his reward.
And if he has played four throws without finding the desired point and if they
agree that he shall not play the fifth time, he will, nevertheless, have 1/6
of the total for his share. Since the whole sum stays in play it not only follows
from the theory, but it is indeed common sense that each throw should be of
equal value.
I urge you therefore to write me that I may know whether we agree in the theory,
as I believe we do, or whether we differ only in its application.
I am, most heartily, etc.,
Fermat.
Fermat is calculating the probability of obtaining a 6 for the first time on the kth toss when you toss a die 8 times. Thus if you roll the die 8 times, the probability that a six comes up in the first three rolls is 1/6 + 5/36 + 25/216 = 91/216. Today we would use the concept of independence to calculate the probability that you don't get a six in the first three rolls as (5/6)^3 = 125/216, and subtract this from 1 to get the probability that you do get a six in the first 3 rolls.
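Fermat's throw-by-throw shares can be checked against the modern complement calculation. A small sketch:

```python
from fractions import Fraction

# Fermat's value of the k-th throw: 1/6 of what then remains,
# i.e. (1/6) * (5/6)^(k-1)
def value_of_throw(k):
    return Fraction(1, 6) * Fraction(5, 6) ** (k - 1)

# Probability of a six within the first three throws, summed Fermat's way
termwise = sum(value_of_throw(k) for k in range(1, 4))  # 1/6 + 5/36 + 25/216

# The modern route: 1 minus the probability of no six in three throws
via_complement = 1 - Fraction(5, 6) ** 3

print(termwise, via_complement)  # both 91/216
```

The two routes agree term by term because each of Fermat's shares is exactly the probability that the first six arrives on that throw.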
In a letter dated Wednesday July 29, 1654, responding to Fermat's letter, Pascal writes:
I have no time to send you the proof of a difficult point
which astonished M. (de Mere) so greatly, for he has ability but he is not a
geometer (which is, as you know, a great defect) and he does not even comprehend
that a mathematical line is infinitely divisible and he is firmly convinced
that it is composed of a finite number of points. I have never been able to
get him out of it. If you could do so, it would make him perfect. He tells me
then that he has found an error in the numbers for this reason:
If one undertakes to throw a six with a die, the advantage of undertaking to
do it in 4 is as 671 is to 625. If one undertakes to throw double sixes with
two dice there is a disadvantage of undertaking it in 24 throws. But nonetheless,
24 is to 36 (which is the number of faces of two dice)^{2} as 4 is to
6 (which is the number of faces of one die).
This is what was his great scandal which made him say haughtily that the theorems
were not consistent and that arithmetic was demented. But you will easily see
the reason by the principles which you have.
^{2}[Clearly, the number of possible ways in which two dice can fall.]
Ore explains de Mere's reasoning as follows:
Pascal does not understand de Mere's reasoning, and the passage also has been unintelligible to the biographers of Pascal. However, de Mere bases his objection upon an ancient gambling rule which Cardano also made use of: one wants to determine the critical number of throws, that is, the number of throws required to have an even chance for at least one success. If in one case there is one chance out of N_{0} in a given trial, and in another one chance out of N_{1}, then the ratio of the corresponding critical numbers n_{0} and n_{1} is as N_{0}:N_{1}. That is, we have
n_{0}:N_{0} = n_{1}:N_{1}.
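Applied to de Mere's two games, this rule gives exactly his ratio: with n_0 = 4 and N_0 = 6 for one die, and N_1 = 36 for two dice, it predicts a critical number of 24 throws, even though the true probability at 24 throws falls just short of an even chance. A numerical sketch:

```python
from fractions import Fraction

# The old gambling rule n0:N0 = n1:N1, with one die's even-chance
# number n0 = 4 and N0 = 6 chances, predicts the critical number for two dice:
n0, N0, N1 = 4, 6, 36
n1 = n0 * N1 // N0          # 24, as de Mere expected

# The exact chance of at least one double six in n throws of two dice
def p_double_six(n):
    return 1 - Fraction(35, 36) ** n

print(n1)                        # 24
print(float(p_double_six(24)))   # ≈ 0.4914, below 1/2
print(float(p_double_six(25)))   # ≈ 0.5055, above 1/2
```

So the rule's prediction of 24 conflicts with the exact calculation, which is precisely the "scandal" de Mere complained of.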
Well, that's all the evidence we have. We leave it to the reader to decide
if de Mere was asking a theoretical question or a question based on his gambling
experience.
REFERENCES
[1] Ore, Oystein. "Pascal and the Invention of Probability Theory." American Mathematical Monthly, v. 67, pp. 409-419, May 1960. Available from JSTOR.
[2] David, F. N. Games, Gods & Gambling: A History of Probability and Statistical Ideas. Dover, 1998.
[3] Maistrov, L. E. Probability Theory: A Historical Sketch. Translated by Samuel Kotz from Teoriia veroiatnostei. Academic Press, New York, 1974.
DISCUSSION QUESTIONS:
(1) Show that betting that at least one six will turn up in 4 rolls of a single die is a favorable bet, while betting that a pair of sixes will turn up in 24 rolls of two dice is an unfavorable bet.
(2) In his article Ore writes:
de Mere believed that the smallest advantageous number of throws should be 24. As the matter has been presented, he turned to Pascal because his own experiences had shown him that 25 throws were required. This is an unreasonable explanation. The difference between the probabilities for 24 and 25 throws is so small, as we have just seen, that to decide experimentally that one of them is less than 1/2 would, according to modern statistical standards, require at least 100 sequences of trials, which in turn would involve several thousand individual throws with the two dice. Besides, the dice would have to be specially made in order to show no bias; the usual bone cubes turned out by the diciers of Paris would be much too inaccurate. To prepare special equipment of this kind and to keep the tedious records involved was evidently contrary to the chevalier's temperament.
Determine, either by simulation or theoretically (or both), that betting that two sixes will come up in 24 rolls of two dice is not a favorable game, while with 25 rolls it is.
(3) Do you think Marilyn got it right?
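The simulation asked for in discussion question (2) can be done in a few lines. A minimal Monte Carlo sketch in Python (the seed and trial count here are arbitrary choices):

```python
import random

def double_six_in(rolls, rng):
    """True if a pair of sixes appears at least once in `rolls` throws of two dice."""
    return any(rng.randint(1, 6) == 6 and rng.randint(1, 6) == 6
               for _ in range(rolls))

rng = random.Random(1654)        # fixed seed so the run is reproducible
trials = 100_000
estimate = {rolls: sum(double_six_in(rolls, rng) for _ in range(trials)) / trials
            for rolls in (24, 25)}
print(estimate)  # roughly 0.49 at 24 rolls and just over 0.50 at 25
```

Note how many trials are needed before the two estimates reliably separate, which is Ore's point about why de Mere could not have settled the question at the gaming table.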
Copyright (c) 2004 Laurie Snell
This work is freely redistributable under the terms of the GNU General
Public License published by the Free
Software Foundation. This work comes with ABSOLUTELY NO WARRANTY.