!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
CHANCE News 3.13
(2 Sept to 21 Sept 1994)
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Prepared by J. Laurie Snell, with help from Jeanne Albert, William Peterson and Fuxing Hou, as part of the CHANCE Course Project supported by the National Science Foundation.

Please send comments and suggestions for articles to firstname.lastname@example.org

Back issues of Chance News and other materials for teaching a CHANCE course are available from the Chance Web Data Base in the Multimedia Online Document Library at the Geometry Center (http://geom.umn.edu/) or from their Gopher (geom.umn.edu) in Geometry Center Resources.

=======================================
Statistics are like alienists - they will testify for either side.
Fiorello LaGuardia
========================================

IN THIS NEWS LETTER
>>>>>==========>>
OTHER INTERNET SOURCES

In the last Chance News we discussed the game of frustration solitaire that came up in a recent letter to Marilyn vos Savant. Recall that this game is played as follows: you shuffle a deck of cards and then run through the deck, turning over the cards one at a time as you call out, "Ace, two, three, four, five, six, seven, eight, nine, ten, jack, queen, king, ace, two, three, ...," and so on, so that you end up calling out the thirteen ranks four times each. If the card that comes up ever matches the rank you call out as you turn it over, then you lose.

We promised to provide a solution for the probability of winning that used only very elementary methods. We have done so, and it is available from Peter Doyle's Web Home Page: http://www.geom.umn/people/doyle.html. You will also find there solutions by Peter and friends to other interesting problems. We hope by next time to have added the solution to the much more difficult problem of finding the value of the game of Treize, attempted by Montmort in the early 1700's but not completed.
<<<========<<
FROM OUR READERS
Rating quarterbacks: an amplification
The College Mathematics Journal, September 1994
(Original article in the November 1993 issue.)
Roger W. Johnson
The National Football League (NFL) has established a rating for "All-Time Leading Passers" based on percentage of completions, percentage of touchdown passes, percentage of interceptions and average gain per pass attempt. This rating is used to gauge the relative performance of quarterbacks for a current season. For example, in the September 13 Seattle Times, in discussing the upcoming game between the Seattle Seahawks and the San Diego Chargers, we read that "Just like Seahawk quarterback Rick Mirer, Humphries hasn't thrown an interception. He has the highest quarterback rating in the NFL - 127.2. Mirer is fourth at 115.1."

The formula that is used is not publicized, but Professor Johnson found that a least squares analysis yields the formula

   Rating = [25 + 10(%Compl) + 40(%TDs) - 50(%Inter) + 50(Yards/Att)]/12

which is used in most cases. The National Collegiate Athletic Association (NCAA) uses the simpler rating

   Rating = (%Comp) + 3.3(%TD) - 2(%Int) + 8.4(Yds/Att).

Professor Johnson has provided us with the data he used to obtain these formulas. We have put the NFL data and the NCAA data in the Chance Data Base, under Teachers Aids. Students might enjoy trying to obtain these formulas.

DISCUSSION QUESTIONS

(1) What other factors do you think should be taken into account in rating quarterbacks?

(2) Who do you think has the highest rating recorded so far in the NFL? Who do you think has the second highest rating?
<<<========<<
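For students who want to experiment before fitting the data themselves, the two rating formulas quoted in the item above can be written out directly. What follows is only a minimal sketch: the function names and the sample season line are made up for illustration, and the NFL version is Professor Johnson's fitted formula, which the item says applies in most cases rather than all.

```python
def nfl_rating(pct_comp, pct_td, pct_int, yards_per_att):
    """Johnson's least-squares form of the NFL rating (said to hold in most cases)."""
    return (25 + 10 * pct_comp + 40 * pct_td - 50 * pct_int + 50 * yards_per_att) / 12


def ncaa_rating(pct_comp, pct_td, pct_int, yards_per_att):
    """NCAA rating as quoted in the item."""
    return pct_comp + 3.3 * pct_td - 2 * pct_int + 8.4 * yards_per_att


# Hypothetical season line (not real data): 60% completions, 5% touchdown
# passes, 3% interceptions, 7.5 yards per attempt.
line = dict(pct_comp=60.0, pct_td=5.0, pct_int=3.0, yards_per_att=7.5)

print(round(nfl_rating(**line), 1))   # 87.5
print(round(ncaa_rating(**line), 1))  # 133.5
```

The same made-up line earns very different numbers under the two formulas, a reminder that the NFL and NCAA ratings are on different scales and should not be compared with each other.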
>>>>>==========>>
The reader who suggested the following article has to be listed as anonymous since the name disappeared in transit.

How numbers can trick you; the six deadly sins of statistical misrepresentation.
Technology Review, October 1994, pp 39-45
Barnett proposes six deadly sins of statistical interpretation. We give one example of each. The article provides more discussion of these and other examples. You will find Barnett also quoted in the New York Times article on the statistics of the USAir accidents discussed in the next abstract.

1. Generalizing from non-random samples.
Researchers at the Harvard Medical School found, in interviews with 1,500 people who had suffered heart attacks in the previous few days, that a disproportionate number reported episodes of extreme anger in the two hours preceding the attack. They were led to an estimate that anger was associated with 2.3 times the usual heart attack risk. The Boston Globe generalized this to all of us by simply reporting that anger "can double the chance for heart attack".

2. Look! A Trend.
In 1993 the International Airline Passenger Association began rating airlines in terms of safety. An airline having the fewest deaths over a five-year period might be considered a particularly safe airline. However, the data show that the "safest" airline in one period is apt to be the least safe in another period, suggesting that the apparent trend is normal chance fluctuation having nothing to do with differences between airlines.

3. Unjust Law of "Averages".
In 1987, the Department of Transportation required U.S. airlines to report each month the percentage of their flights into the nation's 30 busiest airports that arrived on time. This information has been used in advertisements, such as Northwest's boast that it is "the number one on-time airline". An airline that has a large percentage of its flights into a city like Seattle, with lousy weather, is at a disadvantage in such a contest against an airline that has a large percentage of its flights into Phoenix.

4. Verbal imprecision.
A statistical study reported that the odds of a death sentence in a white-victim case were 4.3 times the odds in a black-victim case in Georgia. The New York Times reported this as 4.3 times as likely, and the Supreme Court used this interpretation. Under this incorrect interpretation, a 99% probability of a death sentence when the victim is white corresponds to a 23% chance when the victim is black. Under the correct interpretation in terms of odds, the 23% becomes 96%.

5. The Unsound Comparison.
In early 1992, the New York Times reported that a record number of killings occurred in 1991 in four of the nation's ten largest cities: Los Angeles, San Diego, Dallas, and Phoenix. They failed to point out that all four of these cities also reached new highs in population in 1991.

6. The Hidden Defect.
An article in the journal "Risk Analysis" in 1991 reported that a U.S. driver -- age 40, sober, wearing a seat belt and driving a heavier than average car -- has a "slightly less" mortality risk on a 600-mile trip than a person who takes the same trip by air. The analysis began with the overall death rate per mile driven on rural interstate highways. This was multiplied by the risk factors for age, wearing a seat belt, and driving a heavier car. Multiplying these risk factors led to a much smaller final risk than was justified, since the factors are not independent.

DISCUSSION QUESTIONS

(1) Why is the subpopulation of the study in Example 1 not representative of the population as a whole?

(2) Why are the risk factors used in Example 6 not independent?

(3) Can you provide your own examples of the six deadly sins of statistical misrepresentation?
<<<========<<
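The 23% and 96% figures in the fourth example above can be checked with a few lines of arithmetic. The sketch below is ours, not Barnett's; the helper names are made up, and the 99% starting probability is simply the illustrative value used in the item.

```python
def prob_to_odds(p):
    """Convert a probability to odds in favor."""
    return p / (1 - p)


def odds_to_prob(odds):
    """Convert odds in favor back to a probability."""
    return odds / (1 + odds)


p_white = 0.99   # illustrative probability from the item
ratio = 4.3      # reported ratio of odds

# Incorrect reading: "4.3 times as likely" divides the probability itself.
p_black_wrong = p_white / ratio

# Correct reading: the ratio applies to the odds, not the probability.
p_black_right = odds_to_prob(prob_to_odds(p_white) / ratio)

print(f"{p_black_wrong:.2f}")  # 0.23
print(f"{p_black_right:.2f}")  # 0.96
```

When the probabilities involved are small, odds and probabilities are nearly equal and the two readings almost agree; the gap becomes dramatic only when the probability is close to 1, as in this example.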
>>>>>==========>> When is a coincidence too bad to be true?
The New York Times, 11 September 1994, Section 4, Pg. 4
The recent crash of a USAir plane near Pittsburgh has prompted many questions about the safety of air travel and the relative risks between airlines. According to the article, USAir's accident record does look grim: among major commercial carriers, the last three fatal crashes in the United States were USAir planes; the airline has been involved in four out of the last seven major air disasters; and USAir has had five fatal crashes in the past five years. Ms. Kolata asks: "Would it be a rational decision to avoid flying USAir in favor of its competitors? Or, considering the vast number of passengers carried by airlines, can USAir's tragic losing streak be attributed to the vagaries of chance?"

The question of safety seems to arouse little controversy: by all accounts, air travel, as far as major accidents are concerned, carries an extremely low risk. Dr. Arnold Barnett of M.I.T. makes this very clear: "roughly speaking, if you were to board a jet flight at random every day, it would take 26,000 years on average before you succumb to a major crash."

But what about the relative risks between airlines? Here the answer is not so straightforward. Using safety records, Dr. Barnett ranked eight major airlines over several ten-year periods and found that, not only was the first-ranked airline different each time, but this same airline finished in the bottom half of the other rankings. As for USAir, Barnett says that, while there is a 2 to 10 percent chance that the airline's crash record is due to chance alone, at the same time, if you were to board a USAir jet at random in the 1990's, your chances of being killed would be nine times higher than on any other airline.

DISCUSSION QUESTIONS:

1. If you had to fly today, would you choose USAir? What would influence your decision?

2. According to Dr. Barnett, media coverage of fatal air crashes can influence people's perception of the risks of flying. He claims that over a two-year period there was 8,100 times as much coverage per death for commercial jet accidents as there was for cancer. He suggests that such reporting tends to make flying seem more dangerous than it actually is. What do you think of this argument?

3. Dr. Brad Efron of Stanford points out that USAir carries 20 percent of all domestic flights and says this should be taken into account. He calculates that the chance that one major airline has four out of seven fatal crashes, assuming it has a 20 percent market share, is about 10 to 15 percent. What does this mean? How would you explain the difference between this conclusion and Dr. Barnett's 2 to 10 percent figure? Dr. Efron says that his findings are "enough to begin getting suspicious but not enough to hang them." Do you agree?

4. The Dartmouth Math Department softball team makes a lot of errors. Or rather, the players make errors. Say that over the course of a game the team makes 20 errors, and that each time an error is made, it is equally likely to be made by each of the 10 players. How likely do you think it is that at some point in the game, a single player will be responsible for the last three errors made, or for four out of the last seven? How could you get a good idea of what these unknown probabilities are?

5. The following statement appeared in the July 20 issue of "Flight International": "More people died in airline accidents during the first half of 1994 than in the same period of any other year in the last decade-except for the record year of 1985, according to the Flight International Airline Safety Review published this week."
<<<========<<
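One way to get a feel for the softball question above is to simulate many games. The sketch below is only one possible reading of the question: it assumes errors are independent, that each of the 10 players is equally likely to make each error, and that "four out of the last seven" may be checked after every error using whatever errors (up to seven) have occurred so far. The function names, the number of trials, and the seed are our own choices.

```python
import random
from collections import Counter


def has_streak(errors, run=3):
    """True if some player commits `run` consecutive errors."""
    return any(len(set(errors[i:i + run])) == 1
               for i in range(len(errors) - run + 1))


def has_cluster(errors, k=4, window=7):
    """True if, after some error, one player made at least k of the last `window`."""
    for end in range(1, len(errors) + 1):
        recent = errors[max(0, end - window):end]
        if Counter(recent).most_common(1)[0][1] >= k:
            return True
    return False


def estimate(trials=20000, n_errors=20, n_players=10, seed=1994):
    """Estimate both probabilities by simulating `trials` games."""
    rng = random.Random(seed)
    streaks = clusters = 0
    for _ in range(trials):
        errors = [rng.randrange(n_players) for _ in range(n_errors)]
        streaks += has_streak(errors)
        clusters += has_cluster(errors)
    return streaks / trials, clusters / trials


p_streak, p_cluster = estimate()
print(f"one player makes three errors in a row: about {p_streak:.3f}")
print(f"one player makes four of the last seven: about {p_cluster:.3f}")
```

Students can change the number of players or errors to see how quickly such "streaks" become likely when one looks for them anywhere in a sequence and for any player, which is the point of comparing this question with the USAir record.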
>>>>>==========>> In the war against grade inflation, Dartmouth scores a hit.
Wall Street Journal, 8 September 1994
This article discusses grade inflation at American colleges and universities during the past several decades and some recent attempts to reverse the trend. The author, a professor at Rutgers University, claims that, 43 years ago, 13 percent of the grades at Rutgers were A's, and 29 percent were B's. Today these two account for two-thirds of all grades awarded at the university. Private institutions have even higher rates: about 90 percent of all grades at Stanford are A's and B's, and 73 percent at Harvard. Several colleges have abolished the F grade and, at Oberlin, the D grade as well.

What is being done to reduce the number of higher grades? Some universities are reintroducing the F grade. Although a poll of Stanford's undergraduates showed that 48 percent were opposed to the idea, in the 1995-96 academic year the university will include a failing grade--now called "NP", or "not-passed". Dartmouth has taken a different approach. On transcripts this fall the college will include the size and median grade of the class along with a student's grade for the course. Professor Toby claims this will tackle the grade inflation problem on two fronts: by making professors less likely to give so many A's and B's, and by giving students less incentive to choose courses and professors because of their grading policies. Presumably, these two effects would reinforce each other as well.

DISCUSSION QUESTIONS:

1. Professor Toby says that not only do students need grades but that "society--and society includes parents-- also needs grades." Do you agree? In your opinion, is grade inflation a problem?

2. What do you think of Dartmouth's new grading strategy? Do you think including the class size and median grade on a student's transcript will affect grade inflation, and, if so, how? Why do you think the median is being used instead of, say, the mean?
<<<========<<
>>>>>==========>> DNA testing raises questions in county.
The New York Times, 11 September 1994, Pg. WC19
Fay Ellis
The New York Times, 7 September 1994, Pg. B10
Both of these articles focus on the questions raised by the use of DNA testing in criminal court cases. The two most pressing issues are the accuracy of the tests and what role the results of such tests should play in determining guilt or innocence. As the articles point out, the two issues are of course related.

The first article discusses the significant impact that DNA testing has had in recent years: in Westchester County, New York, prosecutors have used DNA analysis to link a suspect to a crime in nearly 100 crimes since 1988. On the other hand, if there is no DNA match and there are other "substantial doubts" about a suspect's involvement in a crime, the case will not go to trial.

The accuracy and reliability of the tests appear to be the more significant issue, especially in relation to the current O.J. Simpson case. The second article focuses primarily on the credibility of Cellmark Diagnostics, the company which performed the DNA tests for this case. According to the article, defense attorneys attack DNA laboratories on their statistical analyses as well as on the quality of their work, and there is ample discussion here of both of these issues. The current examination of Cellmark arose after prosecutors said DNA tests showed that "a sample of Mr. Simpson's blood closely resembled blood recovered by investigators".
<<<========<<
>>>>>==========>> Fierce competition marked fervid race for cancer gene.
The New York Times, 20 September 1994, C1
The race to find the cancer gene called BRCA1 is finally over, with victory going to the team at the University of Utah headed by Dr. Mark H. Skolnick. Dr. Skolnick attributed their success to an extraordinary genetic resource: Utah's large, stable families and the huge genealogical archives of the Mormon church. Others on his team also gave "luck" a lot of credit.

Researchers can now concentrate on studying how the gene works and on developing a screening test to check for mutations in the gene. It is estimated that as many as 5% of all cases of breast cancer might be due to inherited defects in the genes. Frances Visco, president of the National Breast Cancer Coalition, is quoted as saying "Women will have to be very careful. You're talking about giving them a test telling them they have an 85 percent chance of getting a disease that we don't know how to prevent, and for which there is no known cure."
<<<========<<
>>>>>==========>> Regimen of moderate exercise tied to drop in breast cancer.
The New York Times, 21 Sept. 1994, C10
Jane E. Brody
A new study, reported in the current issue of the "Journal of the National Cancer Institute" and involving more than 1000 California women, has found that moderate exercise can reduce a woman's risk of developing premenopausal breast cancer by as much as 60%. Lifetime exercise habits and other relevant factors were determined through personal interviews with 545 women, aged up to 40, with newly diagnosed breast cancer and an equal number of women who did not have cancer but matched those who did in other respects.

If further studies confirm this finding, this will be the first risk factor for breast cancer that women can control. Other risk factors that have been identified are: family history -- risk is lowest among those who do not have a family history of breast cancer; age at onset of menstruation -- risk is lowest among those who start menstruating late; age at first pregnancy -- risk is lowest for those who have their first child by age 20 and who have the largest number of pregnancies; and finally socioeconomic status -- higher status is associated with higher risk.
<<<========<<
>>>>>==========>> Music to operate with.
The New York Times, 21 Sept. 1994, C10
Another triumph for music! An article in the current "Journal of the American Medical Association" reports that surgeons who have background music for their operations are apt to do a better job. The study tested 50 men, 31 to 61 years old, all of whom regularly listened to music when they operated. They were hooked up to a polygraph and asked to count backward by 13's, 27's etc., from a five-digit number. This task was repeated while they were listening to no music, while they were listening to special stress-reduction music, and while they were listening to music of their choice. The quickest, most accurate, and least stressful results were obtained with the music of their choice, and the worst results were obtained with no music.
<<<========<<
Against the odds.
The Economist, 20 Aug. 1994, pp. 59-60
No author given.
Two economists, William Christie and Paul Schultz, have written a forthcoming article in the "Journal of Finance" titled "Why do NASDAQ market makers avoid odd-eighth quotes?" The absence of odd-eighth quotes is illustrated by comparing a histogram of bid/offer spreads for 100 stocks in the NASDAQ with a corresponding histogram for 100 stocks in the NYSE/AMEX. The authors suggest that major dealers in NASDAQ are in collusion to round quotes to the nearest 1/4, keeping the bid/offer spreads at least 1/4 and thereby making their deals more profitable. This has led to lawsuits against these dealers. The dealers have an explanation for the scarcity of 1/8 prices, but, after the Christie-Schultz article, 1/8 bid/offer quotes have become much more common.
<<<========<<
>>>>>==========>> What the polls say--and what they mean.
New York Times, 17 September, 1994, Section 1 Page 23.
Daniel Yankelovich is well-known for his theories of the meaning of polls, and he has explained these theories in his recent book "Coming to Public Judgements." Here he uses the health care issue to show that, while polls faithfully represent what people say, only the most sophisticated tell what they believe.

He suggests that the large public support (average over polls of about 71%) for universal health insurance is misleading. It means only that people do not think anyone should be deprived of health care, and then only if the country can afford it, choice of doctors is not limited, etc. People have not thought through the consequences of their opinions. Yankelovich calls the results of such polls "raw opinions". He feels that, as public debate proceeds, people's opinions will become based on more solid information and the polls will be more meaningful.
<<<========<<
Los Angeles Times, 7 September 1994, Pg. 14
Thomas H. Maugh II
A study of medical board exams--standardized tests taken by second-year medical students nationwide--has shown that, while white males score higher than women and ethnic minorities, the difference is due more to undergraduate achievement and educational background than to sex or race. According to the article, on a test with a mean score of 500, Asian Americans scored 15 to 20 points lower than white males, Latinos scored 60 points lower, and African Americans scored 100 to 120 points lower. On average, women scored 30 points lower than males from the same ethnic group. The pass rates for the exam also varied according to race, with 88% for whites, 84% for Asian Americans, 66% for Latinos, and 49% for African Americans. For all groups except Asian Americans and women, the study found that board exam scores could be predicted from undergraduate GPA's, the number of science courses taken, and MCAT scores. Asian Americans and women "scored lower on the boards than would have been predicted from undergraduate data." The article also addresses the issue of the usefulness of the exam in determining which students are likely to become good doctors, and offers some differing viewpoints. What is not disputed is that many of the students with low scores on the boards do not graduate from medical school or become practising physicians. By contrast, the article notes that it is well-known that, while high MCAT scores predict good performance in medical school, low scores are not so useful, especially for ethnic minorities. In particular, African Americans with a low MCAT score are more likely to succeed than whites with a similar score. <<<========<<
>>>>>==========>> Forget PCB's. Radon. Alar.
The New York Times, 11 September 1994, Sec. 6, Pg. 60
"Throughout the world, many more people die each year from filthy air and dirty water than from asbestos, dioxin, electromagnetic radiation, nuclear wastes, PCB's, pesticide residues, and ultraviolet rays," Mr. Easterbrook writes. Although these problems are "real enough and must be dealt with", the author vehemently asserts that air and water pollution, especially in developing countries, is of primary concern. Citing the World Health Organization and Unicef, Mr. Easterbrook reports that last year, 4 million children under the age of five died from diseases stemming primarily from air pollution, and that 3.8 million under five died from diseases, such as diarrhea, caused mostly by impure water. In the developing world, diarrhea kills far more people than cancer, he writes.

Mr. Easterbrook, a contributing editor to Newsweek and The Atlantic Monthly, takes many Western environmentalists to task for concentrating on problems which he implies are minor compared to the lack of clean air and water in many areas of the world. His data can indeed be alarming: 1.3 billion people in the developing world live in zones of dangerously unsafe air; 1 billion people lack access to drinking water that meets the crudest safety standards; in 1991 there was more toxic water pollution in China alone than in the whole of the Western world, after an estimated 25 billion tons of unfiltered industrial pollutants went directly into the waterways.

The article also contains Mr. Easterbrook's suggestions for solving these problems, mostly in the form of hydroelectric dams, petroleum refining, and high-efficiency power plants for the clean combustion of coal. Many environmental groups oppose these large-scale projects, and there is ample discussion of this issue.
<<<========<<
>>>>>==========>> The Journal of the American Statistical Association has established a new section, "Statistics in Sports". The current issue (September 1994) of JASA has a collection of papers from this section. The introductory remarks of the editor, Donald Guthrie, include "The popularity of statistical analysis of sports offers another opportunity -- education of spectators, particularly young people, in the principles of sound statistical reasoning. A statistical argument presented in the context of one's experience is far more likely to be retained than one presented in the context of a hypothetical situation." We mention two of the articles that we found interesting. The analysis here naturally gets a little technical, but the problems are ones that students might enjoy exploring on their own. <<<========<<
Exploring baseball hitting data.
JASA, September 1994, pp. 1066-1074.
A recent book by Cramer and Dewan ("STATS 1993 player profiles" published in 1992 by STATS Inc.) provides data on aspects of a player's performance, say his batting average, in different "situations". This article uses these data to look for situations that significantly affect a player's batting average.

The author starts with Wade Boggs' performance in 1992, to see how he performed against left- and right-handed pitchers, pitchers that induce mainly groundballs as compared to flyball pitchers, night games as compared to day games, grass as compared to artificial turf, and home games as compared to away games. Some seem to have an effect and others not. This gives a chance to illustrate that if you look at enough features of a data set, just by chance, one or more will seem unusual.

The author then looks at a whole group of players, asking the same kind of questions, and finally looks to see if differences established here carry over to different seasons. Albert found that "variation in batting averages by pitch count is dramatic--batters generally hit 123 points higher when ahead in the count than with 2 strikes." Smaller, but significant, differences appear when facing a pitcher of opposite arm, facing a groundball pitcher rather than a flyball pitcher, and playing at home. These do seem to carry over to different seasons.

Professor Albert has made the data used in this study available by ftp from isds.duke.edu in pub/albert/situation_data
<<<========<<
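The remark in the baseball item above, that looking at enough features of a data set will turn up something unusual by chance, can be made concrete with a one-line formula. The sketch below rests on two assumptions of ours that are not taken from the article: each situational comparison is tested at the 5% level, and the comparisons are independent. Under those assumptions the chance of at least one spurious "effect" among k comparisons is 1 - 0.95^k.

```python
def chance_of_false_alarm(k, alpha=0.05):
    """Chance that at least one of k independent tests looks 'significant' by luck."""
    return 1 - (1 - alpha) ** k


# Five comparisons corresponds to the five situations listed for Wade Boggs.
for k in (1, 5, 10, 20):
    print(k, round(chance_of_false_alarm(k), 2))
# prints:
# 1 0.05
# 5 0.23
# 10 0.4
# 20 0.64
```

With the five situations examined for a single player, there is already roughly a one-in-four chance of a false alarm under these assumptions, which is one reason to check whether an apparent effect carries over to other players and seasons.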
Estimating with selective binomial information.
JASA, September 1994, pp. 1080 to 1089.
George Casella and Roger L. Berger
When Dave Winfield is batting, rather than giving his batting average, the announcer might say "he's really hitting well these days; he's 8 for the last 17." This is selectively reported data and the question is, what can we learn from it about Winfield's true batting average? Of course, we see more serious examples of this problem all the time -- for example, people who do meta-studies may choose only those studies that have been published but then want to make inferences about all studies. The authors show that we cannot do a lot with the small amount of data we have for Dave Winfield, but we can make quite good estimates in similar situations with a larger data set which results from selective reporting.

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
CHANCE News 3.13
(2 Sept to 21 Sept 1994)
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Please send suggestions to: email@example.com

>>>==========>>|<<==========<<<
>>>==========>>|<<==========<<<