CHANCE News 7.06

(27 May 1998 to 26 June 1998)


Prepared by J. Laurie Snell, Bill Peterson and Charles Grinstead, with help from Fuxing Hou and Joan Snell.

Please send comments and suggestions for articles to jlsnell@dartmouth.edu

Back issues of Chance News and other materials for teaching a Chance course are available from the Chance web site:


Chance News is distributed under the GNU General Public License (so-called 'copyleft'). See the end of the newsletter for details.

In the long run, all random events have a pattern. If I know the pattern, I can predict how likely something is to happen.
Danny Glover in the video series "Life by the numbers"

DISCUSSION QUESTION: What does Danny Glover's remark mean?




Those who have been following the saga of the moose and wolves on Isle Royale National Park (See Chance News 6.10) will want to check out Rolf Peterson's "1997-8 Ecological Studies of Wolves on Isle Royale" where he discusses the dramatic drop in the wolf population this year.

In Chance News 6.13 we discussed a report by William A. Cassidy to the National Transportation Safety Board relating to the investigation of the TWA flight 800 crash. In his report Cassidy estimated the chance of a plane being hit by a meteorite. Prof. Cassidy has provided us with a copy of this report, which we have put on the Chance Website under "Teaching Aids".


We have added two new videos to our collection of Chance Lectures which can be accessed from the Chance website. These are:

The Bible Code: A series of three talks on Bible Codes given by Brendan McKay, Maya Bar-Hillel and Jeffrey H. Tigay at Princeton University, Tuesday April 28, 1998. McKay and Bar-Hillel give a critical analysis of evidence for Bible Codes. Tigay discusses how the most authentic version of the Bible known today could differ from the original.

The Census 2000: A talk given by Tommy Wright in December 1998 at the Dartmouth Chance Lecture Series. Wright describes how the Bureau of the Census plans to carry out the census 2000.


National Public Radio (NPR) recently had two interesting programs related to chance issues: one on the Bible Codes and another on streaks in sports. Radio programs available on the web provide an interesting source of course material, so we have added links to these on the Chance website under "Teaching Aids." Here is the current list of programs:

NPR Weekend Edition June 13, 1998

A brief but quite good discussion with Keith Devlin of coincidences with special reference to Bible Codes.

NPR Science Friday, May 29, 1998

Richard Harris interviews psychologist Tom Gilovich and Ian Stewart about math in everyday life including discussions of streaks in sports and the birthday problem. There is a particularly interesting discussion between the guests and listeners who call in with their answers to the "birthday problem".

NPR Science Friday, 21 June 1996

A two-hour report from the first World Skeptics Congress. In the first hour the guests are: Paul Kurtz, Philosopher, John Paulos, Mathematician, Milton Rosenberg, Social Psychologist, and Phillip Adams, Broadcaster and Television Producer. They discuss the media's obsession with pseudoscience and, more generally, how radio and television do report science news and how they should report science news.

In the second hour the guests are: Kendrick Frazier, Editor of Skeptical Inquirer, Joe Nickell, Senior Research Fellow for the Committee for the Scientific Investigation of Claims of the Paranormal, Ray Hyman, Psychologist, and Eugenie Scott, Executive Director National Center for Science Education. They discuss what it means to be a critical skeptic. They illustrate the way that professional skeptics study and explain paranormal phenomena.

Car Talk: Week of 5/23/98

The cartalk brothers discuss the infamous two-boys problem: Given that a family with two children has at least one boy, what is the probability that the family has two boys?

Car Talk, Weeks of 10/18/97 and 10/23/97

The cartalk brothers discuss the Monty Hall problem. You will find more about this problem, including a historical discussion and a chance to play the game at Monty Hall Puzzler.


The statistics so far for those who play "Let's Make a Deal" on the cartalk website are:

                      Number    Winning Pct

Initially correct:      7941      34.039%
Stickers:              10625      33.459%
Switchers:             12704      66.821%

Total Trials:          23329

Do these numbers differ significantly from what you would expect?
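One way to approach this question is a one-sample z-test for a proportion, comparing each observed percentage with its theoretical value (1/3 for an initially correct guess and for sticking, 2/3 for switching). The sketch below is only an illustration: it assumes that the trials are independent and that the normal approximation is adequate for samples of this size. The function names are ours, and we take the "initially correct" percentage to be out of all 23329 trials.

```python
import math

def z_for_proportion(p_hat, p0, n):
    """z statistic for an observed proportion p_hat against a hypothesized p0."""
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error

def two_sided_p(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

# (label, number of trials, observed fraction, theoretical fraction)
rows = [
    ("Initially correct", 23329, 0.34039, 1 / 3),
    ("Stickers", 10625, 0.33459, 1 / 3),
    ("Switchers", 12704, 0.66821, 2 / 3),
]

results = {}
for label, n, p_hat, p0 in rows:
    z = z_for_proportion(p_hat, p0, n)
    results[label] = (z, two_sided_p(z))
    print(f"{label:18s} z = {z:6.2f}   two-sided p = {two_sided_p(z):.3f}")
```

On these figures the sticker and switcher percentages lie well within one standard error of 1/3 and 2/3, while the "initially correct" percentage is a little more than two standard errors above 1/3.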


Life by the numbers
Video WQED Pittsburgh
Producer Mary Rawson, Writer David Elisco

This seven-hour video series sets out to show what mathematics is all about and how it affects our everyday life. The series was three years in development and cost 4 million dollars to produce. The producers had first-rate mathematical advisors. This video illustrates the continuing battle between glitz and content that goes on even when material is produced for television. A detailed review of the series by Colm Mulcahy can be found at MAA Online. Keith Devlin has written an accompanying book which sells for $20.97 at the Amazon Bookstore. You can view a video of Devlin discussing the project and showing excerpts from the video series at MSRI - Keith Devlin.

The one hour unit of particular interest to us is "Chances of a Lifetime" on probability and statistics in everyday life. We see a number of familiar faces. The video starts with Hal Stern talking about his interest in the use of statistics in sports. We see Hal rooting for his Iowa Cubs. The Cubs are behind 2 to 1 in the bottom of the sixth with a man on first and no outs. Should the manager signal a bunt? Hal explains what statistics the manager is or should be considering as he makes his decision. He makes the "right decision" and the Cubs go on to win the game. Next George Cobb tells us how statistics got started. He starts with the huge data collection of William the Conqueror in the 11th century and works up to Graunt's use of birth records to estimate the population of London. Then Ed Packel introduces us to gambling and uses the Galton Board simulation to illustrate how the Casino can make so much money at roulette while the rest of us have the excitement of sometimes winning a lot. Then our former Chance Workshop participant Bill Kaigh tells us how he teaches students about the use of polls and how he and his wife actually do it in their business that prepares polls for the El Paso T.V. stations and others. Susan Ellenberg explains the role of statistics in testing drugs using the Salk Vaccine Trials as a basic example. The video ends with Grant Steer, an actuary at Farmer's insurance, explaining how the company developed policy for weddings covering all aspects of the wedding: photographs, dresses, cake, etc.

Of course, having these leading actors assures that, at least in this unit, content competes favorably with glitz.


In his discussion of the series Keith Devlin says that "the intention was never to put any mathematics on the screen -- you'd lose your audience immediately if you did" What do you think about this decision?


Investing it; duffers need not apply.
The New York Times, 31 May 1998, Section 3, p. 1
Adam Bryant

An investment compensation expert, Graef Crystal, carried out a study purporting to show that major companies whose C.E.O.'s had low golf scores also had high-performing stocks. Crystal obtained data on golf scores from the journal Golf Digest and used his own data on the stock market performance of the companies of 51 chief executives. He assigned each company a Stock Rating based on how well investors who held its stock had done, with 100 being the highest rating and 0 the lowest.

It is rare that an article in the New York Times includes the data set, but this article did. Here it is, as sent to us by Bruce King (we have saved it on the Chance Website in the data section of Teaching Aids):

CEO                     Company               Handicap  StockRate

Melvin R. Goodes        Warner-Lambert        11        85
Jerry D. Choate         Allstate              10.1      83
Charles K. Gifford      BankBoston            20        82
Harvey Golub            American Express      21.1      79
John F. Welch Jr.       General Electric      3.8       77
Louis V. Gerstner Jr.   IBM                   13.1      75
Thomas H. O'Brien       PNC Bank              7.1       74
Walter V. Shipley       Chase Manhattan       17.2      73
John S. Reed            Citicorp              13        72
Terrence Murray         Fleet Financial       10.1      67
William T. Esrey        Sprint                10.1      66
Hugh L. McColl Jr.      Nationsbank           11        64
James E. Cayne          Bear Stearns          12.6      64
John R. Stafford        Amer. Home Products   10.9      58
John B. McCoy           Banc One              7.6       58
Frank C. Herringer      Transamerica          10.6      55
Ralph S. Larsen         Johnson & Johnson     16.1      54
Paul Hazen              Wells Fargo           10.9      54
Lawrence A. Bossidy     Allied Signal         12.6      51
Charles R. Shoemate     Bestfoods             17.6      49
James E. Perrella       Ingersoll-Rand        12.8      49
William P. Stiritz      Ralston Purina        13        48
Duane L. Burnham        Abbott Laboratories   15.6      46
Richard C. Notebaert    Ameritech             19.2      45
Raymond W. Smith        Bell Atlantic         13.7      44
Warren E. Buffett       Berkshire Hathaway    22        43
Donald V. Fites         Caterpillar           18.6      41
Vernon R. Louckes Jr.   Baxter International  11.9      40
Michael R. Bonsignore   Honeywell             22        38
Edward E. Whitacre Jr.  SBC Communications    10        37
Peter I. Bijur          Texaco                27.1      35
Mike R. Bowlin          Atlantic Richfield    16.6      35
H. Lawrence Fuller      Amoco                 8         33
Ray R. Irani            Occidental Petroleum  15.5      31
Charles R. Lee          GTE                   14.8      29
John W. Snow            CSX                   12.8      29
Philip M. Condit        Boeing                24.2      25
Joseph T. Gorman        TRW                   18.1      24
H. John Riley Jr.       Cooper Industries     18        22
Richard B. Priory       Duke Energy           10        22
Leland E. Tollett       Tyson Foods           16        20
Bruce E. Ranck          Browning-Ferris       23        15
William H. Joyce        Union Carbide         19        13
Thomas E. Capps         Dominion Resources    18        12
Scott G. McNealy        Sun Microsystems      3.2       97
William H. Gates        Microsoft             23.9      95
Sanford I. Weill        Travelers Group       18        95
Frank V. Cahouet        Mellon Bank           22        92
William C. Steere Jr.   Pfizer                34        89
Donald B. Marron        Paine Webber          25        89
Christopher B. Galvin   Motorola              11.7      3

Crystal regarded the last seven as outliers and threw them out (described in the article as being scientifically sifted out).

Bruce also sent us a letter he wrote to The New York Times about the article. The Times (The New York Times, 14 June 1998, Money and Business/Financial Desk, Sect. 3, p 12.) published several letters making some of the same complaints that Bruce made in his letter. However, we felt that Bruce's letter best described the many problems with the Times article, so we asked him to allow us to include it here.

To the Editor:

There are several reasons why Sunday's CEO golf/ performance study (Money & Business, pp.1,9) did not deserve an inch of column space, much less the 1+ pages you gave it. The study has at least four problems:

(1) The 74 CEOs who reported their golf handicaps probably are different in unknown ways from the CEOs who chose not to reveal their handicaps. You cannot safely generalize to the population of all CEOs the responses of those who volunteer information.

(2) Such an observational study cannot support an inference that A causes B. In particular, the suggestion that "executive wannabees ... spend more time on the links" is foolishness, and makes no more sense than assuming that moving closer to the Canadian border will improve your IQ (a puckish observation attributed to Senator Moynihan, I believe).

(3) One cannot be sure from the published article, but it seems likely that the observed correlation between golf handicap and executive prowess was the result of a fishing expedition. You quote Mr. Crystal as saying "For all the different factors I've tested ... this is certainly one of ... the strongest ...". Well, let's imagine that Mr. Crystal tested for 50 irrelevant factors for a link to executive prowess; it is likely that one or more of the 50 samples would nevertheless show a statistically significant correlation by chance alone. And if that's the only correlation reported, it looks as if it might be important, rather than just a chance occurrence. It is reasonable to wonder whether Mr. Crystal just continued to fish until he finally found a "keeper".

(4) Mr. Crystal's treatment of seven "outliers" seems to be quite arbitrary. First of all, you may note that these seven constitute the six executives with the greatest performance ratings, and the one with the least. But they are NOT outliers in the usual sense (1.5 times the interquartile range beyond the middle 50% of the ratings).

Secondly, outliers are not censored just because they "distort the trend lines". If that was the case, any scatterplot could be pruned to show a significant correlation. The conventional strategy is to seek to learn why an outlier is unusual, and to retain all the data that cannot be rejected for cause. (An outlier, for example, may merely be a data-recording error, and if the error cannot be corrected, there is sufficient cause to reject that observation.) Did you notice that the correlation between golf handicap and executive prowess was only -0.042 when the seven outliers were included, and that deleting the seven changed it to -0.414?

As a long-time Times reader, I depend on it for accurate reporting of the sciences. It is extremely disturbing to see it purveying junk science as "rigorous," however cute it may be.

Bruce King
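Readers who want to check point (4) for themselves can compute the correlation with and without the seven discarded executives, and apply the usual 1.5 x IQR rule to the stock ratings. The sketch below is ours, not Crystal's or King's: the (handicap, rating) pairs are typed in from the data listed above, the function names are our own, and the quartiles are whatever Python's `statistics.quantiles` returns by default, which may differ slightly from other conventions.

```python
import math
import statistics

# (handicap, stock rating) pairs in the order listed in the article.
RETAINED = [
    (11, 85), (10.1, 83), (20, 82), (21.1, 79), (3.8, 77), (13.1, 75),
    (7.1, 74), (17.2, 73), (13, 72), (10.1, 67), (10.1, 66), (11, 64),
    (12.6, 64), (10.9, 58), (7.6, 58), (10.6, 55), (16.1, 54), (10.9, 54),
    (12.6, 51), (17.6, 49), (12.8, 49), (13, 48), (15.6, 46), (19.2, 45),
    (13.7, 44), (22, 43), (18.6, 41), (11.9, 40), (22, 38), (10, 37),
    (27.1, 35), (16.6, 35), (8, 33), (15.5, 31), (14.8, 29), (12.8, 29),
    (24.2, 25), (18.1, 24), (18, 22), (10, 22), (16, 20), (23, 15),
    (19, 13), (18, 12),
]
# The last seven entries, which Crystal set aside as outliers.
SET_ASIDE = [(3.2, 97), (23.9, 95), (18, 95), (22, 92), (34, 89), (25, 89), (11.7, 3)]

def pearson_r(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def iqr_fences(values, k=1.5):
    """Lower and upper fences: k times the IQR beyond the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread

all_pairs = RETAINED + SET_ASIDE
r_all = pearson_r(all_pairs)
r_trimmed = pearson_r(RETAINED)

ratings = [rating for _, rating in all_pairs]
low, high = iqr_fences(ratings)
flagged = [rating for rating in ratings if rating < low or rating > high]

print(f"r with all 51 executives:   {r_all:.3f}")
print(f"r with the seven set aside: {r_trimmed:.3f}")
print(f"ratings outside the 1.5 x IQR fences: {flagged}")
```

The two correlations can be compared with the values of -0.042 and -0.414 quoted in the letter.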



The following amusing textbook "solution" to the birthday problem was circulating on the Isolated Statisticians electronic discussion group. The title of the book is "Developing Creative & Critical Thinking", the author is Robert Boostrom, and the publisher, National Textbook Company. The following is from pp. 102-103:

Suppose that there were only two people--you and one other person in the class. The chance that you and the other person have the same birthday is approximately 1 in 366. ...

Now, add a third person. The chance that the birthday of the third person will match yours or the other person's is 2 in 366. Next, add a fourth person. ...

Continuing in this way, you end up with twenty-nine ratios that can be added in the way that fractions are

1/366 + 2/366 + ... + 29/366.

Therefore, the probability of two people in a group of thirty having the same birthday is 435/366. This very large ratio means that it is almost certain that two people in any group of thirty will share the same birthday. Not only would you not be surprised to find out this was so, you would expect it.

We got the exact reference from Linda Wagner, who encountered the text in the course EDUC W554 (Creative Problem Solving and Metacognition) at Indiana University/Purdue University/Ft. Wayne. The course counts for credit towards high school teaching certification!


(1) What is the probability of at least one birthday match in a group of 30?

(2) Show that the author's calculation of 435/366 is correct for the expected number of birthday matches in a group of thirty. (See Chance News 6.13).

(3) On the NPR program (Science Friday, May 29, 1998) mentioned above, the first caller remarked that his wife was two years younger and they both had the same birthday. He asked: What is the probability of that happening? How would you have answered him?

(4) The second caller remarked that obviously not all days are equally likely for birthdays. He commented that for humans they were probably not very different, but for animals they could be very different; and therefore we should be careful in our claim that 23 was correct. How would you have answered him?
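For questions (1) and (2), a short calculation may help. The sketch below assumes that all birthdays are equally likely and independent, which is exactly the assumption the second caller questions. The function names are ours. The first function multiplies out the chance that no two of n people share a birthday; the second counts the pairs of people and divides by the number of days, which gives the expected number of matching pairs.

```python
from math import comb

def prob_at_least_one_match(n, days=365):
    """Probability that at least two of n people share a birthday."""
    p_no_match = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_no_match *= (days - k) / days
    return 1.0 - p_no_match

def expected_matching_pairs(n, days=365):
    """Expected number of pairs of people with a common birthday."""
    return comb(n, 2) / days

p30 = prob_at_least_one_match(30)
pairs_366 = expected_matching_pairs(30, days=366)

print(f"P(at least one match among 30) = {p30:.4f}")
print(f"Expected matching pairs among 30 (366 days) = {pairs_366:.4f}")
```

With 366 days the second quantity is 435/366, the textbook's figure: a perfectly good expected value, but not a probability.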


Bob Griffin wrote that the next article was discussed on the American Association for Public Opinion Research listserv:

Ask Marilyn
Parade Magazine, 31 May 1998
Marilyn vos Savant

Marilyn received the following letter:

Can you explain how they arrive at the so-called "margin of error" in public opinion polls?

Thomas Miller, Bethlehem PA.

Marilyn replied:

Good polling is a tricky business, but the guiding principle is simple: The larger the sample, the more accurate it is. After much data collection, pollsters have learned their numerical limits of accuracy and call them collectively the "margin of error." The individual numbers are so consistent that they are considered standard. For this reason, the published margin of error on a particular poll merely tells us the size of the sample. It is based on past polls. For example, if a poll has a margin of error of plus or minus 3% this usually tells us that about 1500 people were polled. That is; the margin-of-error percentage is assigned to the poll, not developed from it.

Smaller samples have larger margins, and larger samples have smaller ones, but only slightly. For most purposes, a national sample of 1500 is adequate. In fact, most public-opinion polls use samples ranging in size from only 1000 to 2000 people, but this is amazingly sufficient.


(1) What do you think of Marilyn's explanation?

(2) How does Marilyn's answer compare with the following standard New York Times explanation of the sampling error in a recent poll of size 1,126 on how people felt about the government's antitrust suit against Microsoft:

In theory, in 19 cases out of 20 the results based on such samples will differ by no more than three percentage points in either direction from what would have been obtained by seeking out all American adults.

For smaller subgroups the potential sampling error is larger. For example, for the 315 most sophisticated computer users in the sample it is plus or minus six percentage points.

In addition to sampling error, the practical difficulties of conducting any survey of public opinion may introduce other sources of error into the poll. Variations in question wording or the order of questions, for instance, can lead to somewhat different results.
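To compare these explanations, it helps to see where a margin of error can come from. The sketch below uses the textbook formula for a simple random sample, z * sqrt(p(1-p)/n), with the worst case p = 0.5 and z = 1.96 for 95% confidence. This is an assumption on our part: real polls use more complicated designs and weighting, so published margins may be somewhat larger than this formula gives. The function names are ours.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of an approximate 95% interval for a proportion (simple random sample)."""
    return z * math.sqrt(p * (1 - p) / n)

def sample_size_for(margin, p=0.5, z=1.96):
    """Smallest simple random sample giving at most the requested margin."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

for n in (315, 1126, 1500):
    print(f"n = {n:5d}: margin of error about {100 * margin_of_error(n):.1f} points")

print("Sample size for a 3-point margin:", sample_size_for(0.03))
```

Under these assumptions the margin is computed from the sample size rather than looked up from past polls, and it shrinks like one over the square root of n.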



Speaking of polls, we received the following message from Tom Silver:

I thought that our new web site might be of interest. PollingReport.com (http://www.pollingreport.com) includes national polling data -- from major research organizations, like Gallup, Harris, Yankelovich, and Princeton Survey Research -- on a variety of issues.

It also has articles by pollsters on current public opinion and on issues in survey research. (The article currently featured, by the chairman of Louis Harris & Associates, is titled "Myth and Reality in Reporting Sampling Error: How the Media Confuse and Mislead Readers and Viewers.")

In that article, Taylor discusses the many sources of error in polls that go beyond the "random sampling error". He states that Harris typically includes a strong warning in its polls. Here is what the Harris Poll said in a recent poll, with sample size 1000, on the public perception of political leadership (Harris Poll On Line):

In theory, with a sample of this size, one can say with 95 percent certainty that the results have a statistical precision of plus or minus 3 percentage points of what they would be if the entire adult population had been polled with complete accuracy. Unfortunately, there are several other possible sources of error in all polls or surveys that are probably more serious than theoretical calculations of sampling error. They include refusals to be interviewed (non-response), question wording and question order, interviewer bias, weighting by demographic control data and screening (e.g., for likely voters). It is difficult or impossible to quantify the errors that may result from these factors.


Looking at the Gallup homepage, we find the following methodology explanation at the end of its most recent poll:

The current results are based on telephone interviews with a randomly selected national sample of 1,007 adults, conducted June 5-7, 1998. For results based on a sample of this size, one can say with 95 percent confidence that the error attributable to sampling and other random effects could be plus or minus 3 percentage points. In addition to sampling error, question wording and practical difficulties in conducting surveys can introduce error or bias into the findings of public opinion polls.

(1) What do you think Gallup means by "and other random effects?"

(2) Which of the three methodology explanations, The New York Times, Harris, and Gallup do you prefer? Which one would you use in your newspaper?

(3) Who do you think advised The New York Times on its explanation of how the poll was taken?


Online Voters Get Hankerin' for Anarchy
The Philadelphia Inquirer, 28 May, 1998 F6
Michael J. Himowitz

This article concerns a poll conducted by People magazine on the Internet. The editors of People asked their online readers to vote for Most Beautiful Person. The winner of the poll was to be featured on a future issue of the real magazine. The official ballot included many well-known celebrities and also included a spot to write in a name.

Howard Stern, a radio personality, suggested that his fans write in one of his cohorts on the radio, namely Hank, the Angry, Drunken Dwarf. Stern's fans gave Hank over 230,000 votes, by far the most votes garnered by any one person.

This result points out a weakness of write-in polls. There is no way to control how many times one person votes, and there is no way to guarantee that the people who vote form a random sample of the population. In public opinion polls, such as those conducted by Gallup or Roper, much attention is paid to randomness.


(1) It has been said that in the future, everything will be done on the Internet. Describe how you might try to conduct a survey, a la Gallup, on the Internet. Even if we assume that we have access to all e-mail addresses in the country, what problems would we face?

(2) Who is Hank, the Angry Drunken Dwarf?


Sampling and Census 2000: The Concepts
American Scientist, May-June 1998, pp. 495-524
Tommy Wright

This is a nice article to go with our 2000 Census Chance Lecture by Tommy Wright.

Wright starts by disabusing the readers of the idea that "counting" in the census means a completely accurate result. He uses the following elegant example:

10 people are asked to count the number of persons at a local high-school basketball game during half-time. During half-time, spectators come and go--some leave, some get refreshments, some switch seats--and the players and coaches go to the locker rooms. The ticket count will not do, because some are admitted without tickets, and some who bought tickets do not show. The dynamics of the population of persons in attendance at half-time suggest that some may be counted twice (those who change seats), and even more might be missed (those who were not in their seats when that area of the gym was being counted).

From this example it is easy to see the corresponding uncertainties that arise in a census count. The rest of the article is devoted to the statistical methods to be used, in conjunction with counting, to get an accurate census. This includes a discussion of techniques, such as stratification and use of prior information, that can be used to cut down the variation in the sampling part of the census. Finally, there is a discussion of how the capture-recapture method will be used to estimate the undercount.


What do you suppose it is like to be working on a mammoth project like the census realizing that at any moment Congress or the courts can rule that you have to start all over?


Hidden truths
New Scientist, 23 May, 1998, p 28
Robert Matthews

This article deals with the topic of missing data. Two major subtopics are discussed; one is the capture-recapture method, and the other is bias in clinical studies. As an example of the capture-recapture method, consider the following scenario. Suppose that we are trying to estimate the number of people who live in a certain area. We can pick a random sample from the area and consider the people in the sample to be 'captured'. Then we can pick another random sample from the same area and count how many of the captured people are picked the second time (i.e. 'recaptured'). Suppose that we pick a sample of 100 people the first time, and then we find that, in our second sample of size 100, 10 people have been recaptured. This tells us that the captured set is about 1/10 the size of the population, leading to an estimate of 1000 for the population size. This method has been applied in ecology, national defense, and public health, to estimate sizes of populations that are hard to count directly.
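The arithmetic in this example can be written as a one-line estimator, often called the Lincoln-Petersen estimator (the article does not use that name). The sketch below assumes that both samples are random and that the population does not change between them. The function name is ours.

```python
def lincoln_petersen(first_sample, second_sample, recaptured):
    """Estimate population size from two samples and the overlap between them."""
    if recaptured == 0:
        raise ValueError("no recaptures: the estimate is undefined")
    # The recaptured fraction of the second sample estimates the
    # fraction of the whole population that was captured the first time.
    return first_sample * second_sample / recaptured

estimate = lincoln_petersen(100, 100, 10)
print(f"Estimated population size: {estimate:.0f}")
```

With 100 captured, 100 in the second sample and 10 recaptured, this reproduces the estimate of 1000 given above.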

The second subtopic concerns missing data. Suppose, for example, that a drug has been developed to treat a certain condition. Before it can be introduced into the marketplace, it must be tested to see if it is safe and if it works. It is fairly clear that if one tests a drug sufficiently often, it is probable that at least one of the tests will show that the drug is effective. Unfortunately, there may be many other studies of the same drug that do not show this, and these latter studies are less likely to be published than the former one. The question in this case is whether the fact that these negative studies exist can be discerned from the studies that have been published.

It turns out that the answer to this question is 'sometimes'. One can use what is known as a 'funnel plot' to help determine whether there is missing data. One begins with the data from many small studies concerning the drug's efficacy. Since these studies are small, they tend to vary widely in their predictions concerning the drug. Larger studies are needed to establish statistically significant results. If one compares the results of the published larger studies with the smaller studies, two possibilities exist. The first one is that, while the larger studies tend to cluster more closely to a certain figure than the smaller ones, there is no bias in the larger studies with respect to this figure. The second possibility is that the larger studies are skewed (presumably in the direction that shows the drug is beneficial). This latter situation suggests that some larger studies have been conducted and have remained unpublished because they do not show that the drug is effective at a statistically significant level.

Matthias Egger and some colleagues at Bristol University have used funnel plots in this way to show that, in at least one quarter of 75 published medical studies, there were significant signs of missing data. These results have given more ammunition to medical scientists who have called for access to all study results.
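A small simulation can show why selective publication leaves a visible trace. The sketch below is our own toy model, not the method used by Egger or described by Matthews. It assumes a drug with no true effect, hypothetical study sizes, a standard error of 1/sqrt(n) for each study, and a rule that large studies are always published while small ones are published only if they show a significantly positive effect. All of these choices are assumptions made for illustration.

```python
import math
import random
import statistics

TRUE_EFFECT = 0.0                    # hypothetical: the drug does nothing
SIZES = [25, 50, 100, 400, 1600]     # hypothetical study sizes
LARGE = 400                          # studies this big are always published
Z_CRIT = 1.645                       # one-sided 5% significance threshold

def simulate(n_studies, rng):
    """Return a list of (size, estimated effect, standard error) tuples."""
    studies = []
    for _ in range(n_studies):
        n = rng.choice(SIZES)
        se = 1 / math.sqrt(n)
        studies.append((n, rng.gauss(TRUE_EFFECT, se), se))
    return studies

def is_published(study):
    """Toy publication rule: large studies always, small ones only if 'significant'."""
    n, estimate, se = study
    return n >= LARGE or estimate / se > Z_CRIT

rng = random.Random(1998)
studies = simulate(2000, rng)

small_all = [est for n, est, se in studies if n < LARGE]
small_published = [s[1] for s in studies if s[0] < LARGE and is_published(s)]

mean_small_all = statistics.fmean(small_all)
mean_small_published = statistics.fmean(small_published)

print(f"mean effect, all small studies:       {mean_small_all:.3f}")
print(f"mean effect, published small studies: {mean_small_published:.3f}")
```

In a plot of effect against study size, the published small studies in this toy model all sit on one side of the true value, so the funnel loses one of its lower corners.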


(1) The Census Bureau plans to use the capture-recapture method to determine the undercount in the census 2000. This is done by carrying out two surveys (Mass Enumeration & PES Sample Enumeration) of a population of N blocks at about the same time. The unknown total population is

M = M11 + M10 + M01 + M00

where M11 is the number enumerated on both occasions; M10 the number enumerated on occasion one but not on occasion two, M01 the number enumerated on occasion two but not one, and M00 the number not enumerated on either occasion. Everything but M00 is known here. Assume the two surveys are independent and each individual has the same probability of being counted. Then you can estimate M00 by

M00' = M01M10/M11

and from this obtain an estimate of M as

M' = M11 + M10 + M01 + M00'.

How realistic do you think the assumptions are in this situation?
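As a check on the formulas in question (1), a little algebra shows that this estimate is the same as the simple capture-recapture ratio: the product of the two enumeration totals divided by the number counted both times. Here we write the estimate M' as \hat{M}; the derivation uses only the symbols defined above.

```latex
\begin{align*}
\hat{M} &= M_{11} + M_{10} + M_{01} + \frac{M_{01} M_{10}}{M_{11}} \\
        &= \frac{M_{11}^2 + M_{11} M_{10} + M_{11} M_{01} + M_{01} M_{10}}{M_{11}} \\
        &= \frac{(M_{11} + M_{10})\,(M_{11} + M_{01})}{M_{11}}
\end{align*}
```

For instance, if each survey enumerates 100 people and 10 are enumerated both times, the estimate is 100 x 100 / 10 = 1000.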

(2) Matthews mentions the following problem: Jones, a butterfly researcher, captures butterflies for a period of time m. He captures 5 different species with 4,1,10,20,2 representatives of these 5 species. Jones asks: if I were to return the next day and catch butterflies for a length of time n, how many new species of butterflies should I expect to get?

This is a famous and very difficult estimation problem. There is one elegant solution due to I.J. Good (Biometrika, Vol. 40, 1953, pp. 237-264) which works sometimes. Assume that butterflies of species s enter the net according to a Poisson process with rate v which depends on s. Then the distribution of the number of butterflies of species s caught in time t is Poisson with mean tv. Thus, before he starts his research, the probability that Jones catches no s butterfly on the first day but at least one s butterfly on the second day is

e^(-mv) (1 - e^(-nv))

Using the infinite series expansion for e^x in the second term of this product permits us to write this product as:

P(X = 1)(n/m) - P(X = 2)(n/m)^2 + P(X = 3)(n/m)^3 - + ...

where X is the number of s butterflies caught on the first day. Summing these equivalent expressions over all species shows that

Expected number of new species caught on the second day =

r(1)(n/m) - r(2)(n/m)^2 + r(3)(n/m)^3 - + ...

where r(j) is the expected number of species represented j times in the first day's catch.

(3) Assume that n=m. Estimate the number of new species of butterflies that Jones will capture on the second day.
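For question (3), the alternating series can be evaluated directly from Jones's catch. The sketch below makes one substitution that should be stated plainly: it uses the observed number of species seen j times in place of the expected number r(j), which is the natural estimate but not the same thing. The function name is ours.

```python
from collections import Counter

def expected_new_species(catch_counts, ratio):
    """Evaluate r(1)t - r(2)t^2 + r(3)t^3 - ..., where t = n/m.

    catch_counts lists how many individuals of each species were caught
    on the first day; observed counts stand in for the expected r(j).
    """
    species_seen_j_times = Counter(catch_counts)
    return sum(
        (-1) ** (j + 1) * count * ratio ** j
        for j, count in species_seen_j_times.items()
    )

jones_catch = [4, 1, 10, 20, 2]

estimate_equal_time = expected_new_species(jones_catch, 1.0)   # n = m
estimate_half_time = expected_new_species(jones_catch, 0.5)    # n = m/2

print(f"n = m:   {estimate_equal_time:.3f}")
print(f"n = m/2: {estimate_half_time:.3f}")
```

With n = m the series gives -3 for Jones's data, an impossible value that illustrates why the method only "works sometimes". With so few species the alternating terms are dominated by noise.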

Thisted and Efron (Biometrika 1987, 74, 3, pp. 445-55) used this method to determine if a newly found poem of 429 words had about the right number of new words to be consistent with being written by Shakespeare. They interpreted a species as a word in Shakespeare's total works, discovered or not discovered. They took the time m to be 884647, the number of words in existing Shakespeare works, and n to be 429, the number of words in the new poem. Using the above result they got an estimate of about 7 for the number of new words that should occur in a new work of 429 words, and since the new poem had 9 such new words they concluded that this was consistent with the poem being a work of Shakespeare.

(4) What do you think about applying the Poisson model to this kind of problem?

(5) It would be interesting to check this model on bird data. A natural thing to try would be the Christmas Bird Count (CBC). I'm told the data is available.


Midwives beat doctors in government survey
The Washington Post, 26 May 1998, pZ5.
Sally Squires

The National Center for Health Statistics compared infant mortality risks for babies delivered by physicians with those delivered by midwives. The study looked at 3.9 million vaginal births with 35-43 weeks gestation. After adjusting for a variety of social and medical risk factors, deliveries by midwives showed a 19% lower infant mortality rate. Neonatal mortality--that is, deaths within the first 28 days of life--was 33% lower in the midwife group.

The authors of the study said that the differences may partly be explained by prenatal, labor and delivery care. Nurse midwives spend more time with their patients during prenatal visits, and stay with the women during labor.

See Chance News 6.06 for discussion of another study that showed the same kind of results.


The article does not contain a recommendation that all pregnancies be handled by midwives. Why do you think this is?


Will you please just park?
Dallas Morning News, 9 June 1998

Michael Precker

Richard Cassady, assistant professor of industrial engineering at Mississippi State University, has found himself a celebrity because of an article he wrote with John Kobza of Virginia Tech titled "A Probabilistic Approach to Evaluate Strategies for Selecting a Parking Space" in the current issue of Transportation Science. Everyone from Dateline NBC to the Times of London has called asking about the study. An article in the Richmond Times-Dispatch had the headline: Parking Study Could Save Marriages.

The study compared the strategy of taking the first spot encountered with searching the lot for a space closer to the store. The conclusion: you can get to the store a few seconds sooner if you simply choose a row and take the first available space. "Park in the first available space and you'll be in the door in 61 seconds; maneuver to get closer and it'll take 71 seconds".

Cassady said: "We didn't do any empirical study or data collection, we just used existing relatively simple probability techniques to develop the equations in the paper." According to another article this meant semi-Markov chains.


(1) How do you think they determined the parameters of the model to obtain the estimates for the Wal-Mart with no data?

(2) What about getting back to the car from the store?


A debate is unleashed on cholesterol control
The Boston Globe, 28 May 1998, A3
Richard A. Knox

Under current federal guidelines, 14 million Americans should be taking drugs to lower their cholesterol. These are people who have had a heart attack, have symptoms of heart disease, or have above-average cholesterol levels. But Dr. James Cleeman of the National Cholesterol Education Program reports that only 8.5 million of these people are actually taking the drugs.

A newer study indicates that about 6 million should be added to the recommended treatment group. The new data come from the Air Force/Texas Coronary Atherosclerosis Prevention Study, directed by Dr. Antonio Gotto of Cornell University. According to Dr. Gotto, more aggressive drug treatment would benefit healthy middle-aged men and post-menopausal women with total cholesterol levels of 180-240 milligrams, levels of LDL ("bad cholesterol") of 130-190, and low levels of HDL ("good cholesterol"). Dr. Gotto himself falls in this group and has begun taking a "statin" drug to lower his cholesterol.

Although these findings were widely reported last November, their recent publication in the Journal of the American Medical Association has prompted renewed debate. Dr. Robert Pasternak, director of preventive cardiology at Massachusetts General Hospital, expressed the following concerns: "If you look at the absolute numbers instead of the relative risk, the difference from treatment is really tiny. They had to treat 3000 patients for five years to prevent 60 coronary events. My own opinion is that this is not particularly cost-effective therapy."
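Dr. Pasternak's figures can be turned into an absolute risk reduction and a "number needed to treat" with a few lines of arithmetic. The sketch below uses only the two numbers in his quote. The variable names are ours, and treating the quoted counts as exact is an assumption.

```python
# Figures quoted by Dr. Pasternak; treated here as exact counts (an assumption).
patients_treated = 3000
events_prevented = 60

# Absolute risk reduction: events prevented per patient treated over five years.
absolute_risk_reduction = events_prevented / patients_treated

# Number needed to treat: patients treated for five years per event prevented.
number_needed_to_treat = patients_treated / events_prevented

print(f"Absolute risk reduction: {absolute_risk_reduction:.1%}")
print(f"Number needed to treat: {number_needed_to_treat:.0f}")
```

The relative risk reduction cannot be recovered from these two numbers alone, since it also depends on the event rate in the untreated group.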


What does Dr. Pasternak mean by the distinction between "absolute numbers" and "relative risk"?


Study cast doubt on accuracy of eyewitness recall of suspects
The Boston Globe, 1 June 1998, A3
Associated Press

Eyewitness accounts are used over 75,000 times a year to charge and prosecute crimes in the US. But research appearing in the June edition of the Journal of Applied Psychology calls into question the reliability of eyewitness testimony. It was demonstrated that witnesses who are given positive feedback on their identifications are likely to become much more confident of their testimony.

In the study, 352 people were shown a grainy videotape from a surveillance camera. Captured on the tape was a person who later fatally shot a security guard. The people were then shown either a lineup or photographs and asked to identify the suspect. But the true killer was neither in the lineup nor in the photos.

People who were told they had correctly identified the suspect were more confident of their choice than those who were told the suspect was among the other people shown. Compared to those who received negative feedback or no feedback on their choices, people who received positive feedback recalled having a better view of the suspect, needing less time to make the identification, and being able to make out more details in the video.


(1) Do you think the witnesses were told that the suspect was actually in the line-up? How would this affect the interpretation of the results? What else would you like to know about the design of the study?

(2) Do you think the reported effect comes from positive feedback increasing witnesses' confidence or negative feedback decreasing their confidence? Does it matter?


Christopher Hitchcock and Alexis Manaster Ramer
Glass houses: Greenberg, Ringe, and the Mathematics of Comparative Linguistics
Anthropological Linguistics 38:601-619, 1996

Researchers in linguistics often find similarities between several different languages, suggesting that the languages came from a common source. Of course, a central question in such research is: could these similarities have been the result of chance? In Chance News 4.10 we discussed a paper by Donald Ringe, who discusses mathematical models that he believes can be used to answer this question.

Alexis Ramer believes that his colleagues in this field have some basic misunderstandings of the use of probability and probability models in these kinds of investigations. The first of the two articles deals with questions related to coincidences. The authors start with the following simple birthday problem. You meet two people. What are the chances, knowing nothing else, that they have the same birthday? Answer: 1/365. On the other hand, what is the probability that they both have the same particular day, say Feb. 1, as their birthday? Answer: 1/(365*365) = 1/133,225.
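The two birthday answers can be checked with exact fractions. This is a minimal sketch that assumes 365 equally likely birthdays and independence between the two people; the variable names are ours.

```python
from fractions import Fraction

DAYS = 365  # ignores leap years and assumes all birthdays are equally likely

# Probability that two people share a birthday, whatever the day is:
# the first person's day is free, the second must match it.
same_birthday = Fraction(1, DAYS)

# Probability that both were born on one day named in advance (say Feb. 1):
# each person must independently hit that day.
particular_birthday = Fraction(1, DAYS) ** 2

print(same_birthday)        # 1/365
print(particular_birthday)  # 1/133225

# Summing the particular-day probability over all possible days
# recovers the same-birthday probability.
assert DAYS * particular_birthday == same_birthday
```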

Ringe accused Greenberg of innumeracy, based on two alleged errors. The first concerns the following example given by Greenberg:

Let us assume even that accidental resemblances between two languages can be rather high, say twenty percent. The chance that some single meaningful form will appear with similar sound and meaning is then 1/5. The chance that this same element will appear also in some third language is the square of 1/5, that is 1/25.

Ringe charged that this fails to consider the frequencies of various linguistic elements in the languages under consideration. He wrote:

For example, if the probability that the word for 'ear' will begin with a glottal stop in two languages is an astonishing .2 (20%, or 1/5--Greenberg's hypothetical case), it will be so because the frequency of initial glottal stop in both languages is as high or higher...Let us consider the case where the frequency of initial glottal stop is the square root of .2, or approximately .4472, so that the probability of 'ear' beginning with glottal stop in any two of the three languages is .2...The probability that 'ear' will begin with glottal stop in all three languages is then (.4472)^3, or approximately .0894--not 1/25 (.04), as Greenberg suggests...

Ramer and Hitchcock point out that Ringe is effectively doing a particular-birthday calculation, whereas a same-birthday calculation is called for.
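The figures in the two quoted passages can be reproduced directly. The sketch below only checks the arithmetic in Ringe's hypothetical and Greenberg's 1/25. It does not try to settle which calculation fits the linguistic question, and the variable names are ours.

```python
import math

# Ringe's hypothetical: frequency of initial glottal stop in each language,
# chosen so that a match on glottal stop in two languages has probability .2.
freq = math.sqrt(0.2)

two_language_match = freq ** 2    # glottal stop in two given languages
three_language_match = freq ** 3  # glottal stop in all three (Ringe's figure)

# Greenberg's figure: a 1/5 resemblance extended to a third language.
greenberg_three = (1 / 5) ** 2

print(f"frequency of initial glottal stop: {freq:.4f}")
print(f"Ringe, two languages:       {two_language_match:.4f}")
print(f"Ringe, three languages:     {three_language_match:.4f}")
print(f"Greenberg, three languages: {greenberg_three:.4f}")
```

Ringe's figure fixes the sound in advance (a glottal stop), which is the sense in which it resembles the particular-birthday calculation.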

Still, Ramer and Hitchcock point out that Greenberg's analyses do not get everything correct, either. The controversy concerns how to treat similarities between multiple languages. Examples occur in the article Linguistic Origins of Native Americans by Joseph H. Greenberg and Merritt Ruhlen, Scientific American, November 1992. In a comparison of six languages, Greenberg and Ruhlen sometimes seem to require all six to match to declare a coincidence. But in other places a match on a subset of the languages is highlighted as significant.

Ramer and Hitchcock discuss all this in much more detail and make a fine case for researchers in linguistics really understanding the subtleties of estimating the likelihood of apparent coincidences.


(1) How should one go about trying to determine if similarities on a set of languages could reasonably be attributed to chance?

(2) The Greenberg and Ruhlen example considers six language families which have similar spellings for words in the semantic range "swallow-throat". For example, one family has a language where "melqw" means "throat" and another where "milq" means "to swallow". To estimate the probability for the similarity they disregard vowels as less stable than consonants and calculate the chance that the three consonants will match. They limit both languages to 13 consonants and accept only m for the first consonant, l or r for the second, and k, k', q, q' for the third (k' with an apostrophe means a glottal stop after the k). Then the probability of an accidental match of "melqw" and "milq" is (1/13)(2/13)(4/13) = .004. From this they decide that the probability of such matches in all 6 families is .004^5 which is about 1 in 10 billion. They end the discussion with "So much for accidental resemblances."

Do you see any similarities in this procedure with those used by Drosnin to find Bible Codes?
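The arithmetic in question (2) can be retraced as follows. This is a sketch of the calculation as described in the question, not of Greenberg and Ruhlen's own working. The variable names are ours, and multiplying the per-position chances assumes the three consonant positions are independent.

```python
from fractions import Fraction

CONSONANTS = 13  # each language limited to 13 consonants, as described above

# Number of accepted consonants in each of the three positions.
accepted = [1, 2, 4]  # m; l or r; k, k', q, q'

# Chance of an accidental match between two languages.
single_match = Fraction(1)
for n in accepted:
    single_match *= Fraction(n, CONSONANTS)

# Fifth power of the rounded figure .004, as in the quoted calculation.
all_families_rounded = 0.004 ** 5
# Same power using the unrounded product.
all_families_exact = float(single_match) ** 5

print(single_match, float(single_match))
print(all_families_rounded)
print(all_families_exact)
```

By this arithmetic .004^5 is on the order of 10^-12, so readers may wish to check the quoted "1 in 10 billion" for themselves. Taking a fifth power also treats the matches across families as independent events.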


Experts seek to avert asteroid scares
Chicago Tribune, 7 June 1998, p8
Associated Press

On March 11, front-page newspaper stories reported that Asteroid 1997XF11 was on a course that would bring it within 30,000 miles of Earth in October, 2028. That prediction, made by the International Astronomical Union, led to fears that the asteroid might actually hit the Earth. The depiction of deadly collisions by comets and asteroids in the Hollywood movies "Armageddon" and "Deep Impact" served to further dramatize the possibilities. However, when the trajectory of 1997XF11 was recalculated by experts at NASA's Jet Propulsion Laboratory (JPL), it was found the most likely path would miss the Earth by 600,000 miles (more than twice the distance to the moon).

In order to avoid such sensationalism in the future, scientists are considering better ways to release their findings on asteroids. New discoveries of asteroids headed in our general direction make headlines, but in most cases more careful calculations reveal that the threat is minimal. According to Paul Chodas of JPL, fifteen minutes after he received the XF11 data, his calculations revealed that there was "zero threat".

In April, NASA proposed guidelines for consultation between experts before public announcements. Chodas says that it might take up to 48 hours for such consultation, and NASA recommended an additional 24 hours before any news release. Earthquake expert Alan Lindh of the US Geological Survey urged more openness about discoveries, as long as the uncertainty of the initial observations is clearly explained. "You can't control the flow of news," said Lindh, "but you can be as truthful as possible up front."

The article reports that astronomers have identified 123 potentially hazardous asteroids that could pass within 5 million miles of Earth and have discovered 200 of the estimated 2000 large asteroids that could pass within 30 million miles.

You can find more information about asteroid hazards on the NASA Ames website. The link has up-to-date information on the 1997XF11 discussion. The related links indexed there include an Asteroid and Comet Impact fact sheet, reviews of recent popular books and films, and a discussion of whether a meteor could have downed TWA Flight 800. (You can find earlier references to this story in Chance News 5.11 and 6.13.)


(1) Do you agree with Lindh's position? What differences do you see between earthquake forecasts and asteroid forecasts?

(2) How do you think astronomers estimate that 2000 large asteroids could pass within 30 million miles of Earth, given that only 200 have been discovered?


Unprovoked shark attacks found to increase worldwide
The Boston Globe, 11 June 1998, A27

The University of Florida keeps records on shark attacks worldwide. A recently released study reports that there were 56 attacks in 1997, up from 36 in 1996. The all-time high of 72 occurred in 1995. The US led the world with 34 attacks. The next closest was Australia with 5. Brazil had 4, the Bahamas and South Africa each had 3, while Japan and New Guinea had 2 each. However, in spite of the large number of attacks in the US, none of the 11 fatal attacks occurred here.
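The counts in the preceding paragraph support a few quick comparisons. The sketch below uses only the reported figures. It assumes the 11 fatal attacks are among the 56 attacks recorded for 1997, and the variable names are ours.

```python
# Worldwide attack counts by year, as reported above.
attacks = {1995: 72, 1996: 36, 1997: 56}

us_attacks_1997 = 34
fatal_attacks_1997 = 11  # assumed to be among the 56 attacks; none were in the US

# Attacks outside the US and the US share of the 1997 total.
other_attacks_1997 = attacks[1997] - us_attacks_1997
us_share = us_attacks_1997 / attacks[1997]

# Fatality rate among attacks outside the US (the US rate is zero).
fatality_rate_elsewhere = fatal_attacks_1997 / other_attacks_1997

# Year-to-year changes.
change_1996_to_1997 = attacks[1997] - attacks[1996]
change_1995_to_1997 = attacks[1997] - attacks[1995]

print(f"US share of 1997 attacks: {us_share:.0%}")
print(f"Fatality rate outside the US: {fatality_rate_elsewhere:.0%}")
print(f"Change 1996 to 1997: {change_1996_to_1997:+d}")
print(f"Change 1995 to 1997: {change_1995_to_1997:+d}")
```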

The study's author, Matthew Collahan, said the attacks may partly have been related to warmer temperatures over the past year. He noted that the US has a high rate of tourism, beach and aquatic activities, and also a large amount of coastline.

Nearly half of the 1997 attacks involved surfers, wind-surfers and rafters. Swimmers, waders and divers were attacked more often than kayakers and surf-skiers.

More data on the attacks are available at the University's website: Shark attacks up worldwide after a brief slump


(1) Comment on the headline in light of the attack figures for the last three years.

(2) What do you think of "rate of tourism" and "amount of coastline" as explanations for the large number of attacks in the US?

(3) What reasons can you suggest for the difference in fatality rates for the US attacks as compared to those in other areas?


Dining out in L.A. comes to crunching numbers
The Boston Globe, 22 June 1998
Lynda Gorov

When a 1997 television news investigation revealed unclean conditions at many area restaurants, the Los Angeles County Board of Supervisors responded by requiring restaurants and markets to post the grades they receive from unannounced health inspections. The county has about 21,000 restaurants. Of those inspected from January 16 through May 31, 11,908 got a score of 90% or above, earning a letter grade of A. Among the rest, 6460 earned a B and 2578 earned a C. The 1700 with lower scores did not receive a letter grade. Among those embarrassed was the new Getty Museum, whose three restaurants initially earned two C's and a B (since upgraded to two A's and a B).
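The reported counts can be converted to shares to see how the grades are distributed. In this sketch the shares are taken relative to the sum of the four reported counts rather than to the county's roughly 21,000 restaurants. That choice of denominator, like the variable names, is ours.

```python
# Inspection results for January 16 through May 31, as reported above.
grade_counts = {"A": 11908, "B": 6460, "C": 2578, "no letter": 1700}

# Total establishments with a reported result.
total_inspected = sum(grade_counts.values())

# Share of the total falling in each grade category.
shares = {grade: count / total_inspected for grade, count in grade_counts.items()}

print(f"Total with a reported result: {total_inspected}")
for grade, share in shares.items():
    print(f"{grade:>9}: {share:.1%}")
```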

Some restaurant owners feel the ratings are unfair because problems with food and problems with equipment are combined in the overall score. For example, the Danish Cafe earned a 67% score, too low for a letter grade. But examination of the Cafe's scorecard shows that dented refrigerator doors were the primary problem area. The owners have repaired the doors but must post the old score until the next inspection.

The California Restaurant Association has yet to take a formal position on the rating system. An official at the health department said that restaurateurs assume that a C will mean a drop in business but pointed out that "a B or C doesn't represent an imminent danger that would cause us to discontinue food service. It means that they have problems that need to be addressed."


(1) If your favorite restaurant got a C from the inspectors, would you still want to eat there?

(2) The federal government inspects meat and produce. Do you know the difference between grade A and grade AA eggs?


Coincidence or conspiracy?
Unpublished Correspondence
Charles M. Grinstead

Coincidences occur quite frequently in real life, and it is a very interesting problem to try to estimate the related probabilities. We give here an example that actually happened quite recently.

The owner of a 1985 Dodge Caravan was sent a notice from the Chrysler Corporation stating that the rear latch on the car might be defective. The notice authorized the owner of the car to take it to any Dodge dealer, and the dealer would replace the latch at no charge to the owner.

The owner drove the car, shifting gears in the normal way (a point that will become important momentarily), to the nearest Dodge dealer. His wife drove their second car so that he could get home. They arrived in tandem at the dealership, and the keys to the Caravan were handed to a worker, so that he could put the car in queue.

Shortly after the couple returned home, they received a call from the dealership. The caller stated that three people at the dealership had attempted to put the car into first gear so that they could drive it to its parking spot, but none had been able to do so. In fact, the only gear that worked was reverse. It was later determined (by the dealership) that a cable between the shifter and the transmission had broken. Without this cable, it is impossible to shift the gears.

The owner of the car, being a probabilist, wondered what the relative probabilities were for the following two possible events: 1) the cable broke while the car was under the control of the owner, and 2) the cable broke after the car was under the control of the dealer. Note that the second event can be partitioned into two subevents: a) the cable broke accidentally, and b) some nefarious activity by someone caused the cable to break.

Finally, we note that the owner drove the car into the dealership in first gear, leaving it there when the engine was turned off. Once the cable broke, it would be impossible to shift gears. Since the only gear that worked was reverse, it is logical to assume that the transmission was in reverse when the cable broke. The big question is: Who put it into reverse?


(1) Estimate the probability that the rear latch gets repaired.

(2) Do you think that the above story would convince a judge in small-claims court to find in favor of the owner of the car?

(3) How would you go about estimating the probabilities of the two events given above?

(4) What is the probability that the car owner ever takes his car back to this particular dealer?



Chance News
Copyright © 1998 Laurie Snell

This work is freely redistributable under the terms of the GNU General Public License as published by the Free Software Foundation. This work comes with ABSOLUTELY NO WARRANTY.

