Claudis Herrion: Office hours Wed. 11:30-12:30, Chance Lab in Choate House

Laurie Snell: Office hours Mon. 4:00-5:00, Chance Lab in Choate House

**Class 4 JMP Lab and Descriptive Statistics**

**Class 5 Do Prisons Reduce Crime and Standard Deviation**

**Class 6 Birthday Problem and Coincidences**

**Class 9 Polling and Binomial coefficient**

**Class 11 Surveys and Confidence Intervals**

**Class 12 Correlation and Regression**

**Class 13 Regression and Environmental News**

**Class 14 Conditional Probability and False Positives / Finishing Regression**

**Class 16 ESP and Hypothesis Testing**

**Class 18 Prenatal Testing and Card Shuffling**

**Class 19 Shuffling, JMP and Games of Chance**

Welcome to

Topics that might be covered in

- Health risks of electric and magnetic fields;

- Statistics, expert witnesses, and the courts;

- The use of DNA fingerprinting in the courts;

- Randomized clinical trials in assessing risk;

- The role of statistics in the study of the AIDS epidemic;

- Paradoxes in probability and statistics;

- Fallacies in human statistical reasoning;

- The stock market and the random walk hypothesis;

- Demographic variations in recommended medical treatments;

- Informed patient decision making;

- Coincidences;

- The reliability of political polls;

- Card shuffling, lotteries, and other gambling issues;

- Scoring streaks and records in sports.

Every member of each group is expected to take part in these discussions and to make sure that everyone is involved: that everyone is being heard, everyone is listening, that the discussion is not dominated by one person, that everyone understands what is going on, and that the group sticks to the subject.

- Specific assignments that you have been asked to do for your journal. These will
include questions that you are asked to think and write about, related to the
current day's discussion, the
results of computer investigations, etc.

- General comments about the class; things you don't understand; things you finally
do understand. You might describe an
experience of trying to explain material from class to a friend or family member.

- Finding and commenting on news articles about topics relevant
to the course; asking us challenging questions; making
connections between what went on in class and experiences in
your own life; going to a casino and winning a lot of money.

- Anything interesting and imaginative about a chance topic.

We encourage you to cooperate with each other in working on anything in the course, but what you put in your journal should be your own alone. If it is something that has emerged from work with other people, write down who you have worked with. Ideas that come from other people should be given proper attribution. If you have referred to sources other than the texts for the course, cite them.

Journals will be collected and read on these dates:

Thursday 10 October

Thursday 24 October

Thursday 7 November

Thursday 21 November

Tuesday 3 December

- 1. Read the article handed out in class: "Hit the Lotto, Buy a Toaster."
- 2. Form two pairs within your base group.
- Pair 1: Prepare a case for the old (high stakes) form of advertising.
- Pair 2: Prepare a case for the new (low stakes) form of advertising.

- 3. Regroup in pairs within your base group.
- Listen to each other's arguments.
- Come up with a recommendation for the Mayor.

2. Read Chapters 1 and 2 from FPPA. Do the review exercises at the end of Chapter 2 on page 22. (Due Thursday)

- Pair up in your base groups and describe the articles you found in the news.

- Discuss one or two questions that each article raised for you.

- Hand in the titles and sources of your articles.

Read the article "Study Finds Stunted Lungs in Young Smokers".

- Form new groups of three.

- Look at Table 1 Handout on smoking and teenagers, discuss the following:

a. What conclusions would you draw from the table?

b. Is there any relation between maternal smoking and child smoking?

c. Are there significant differences between smokers and non-smokers?

d. Are there significant differences between boys and girls with respect to smoking?

e. What confounding factors might explain differences between boys and girls with respect to smoking and lung problems?

Reminder, due Thursday: Read Chapters 1 and 2 from FPPA. Do the review exercises at the end of Chapter 2 on page 22.

- How to get access to JMP

- Where to find the class survey information

- Basics about JMP

- Some basic statistical terms and ideas

- First look at both MALS Survey and data, Math5 Survey and data.

- String exercise.

- Analyze results.

- Discussion on measurement bias. What other forms might it take?

- What kinds of bias are discussed in the article? Are they measurement biases?

- Read Chapter 3. Do all review exercises (pp. 47-52).

- Read Chapter 4. Do review exercises: 1-3, 5, 7-10, 13, 14 (pp. 70-72).

- Read Chapter 6

Learn how to use JMP.

What kinds of conclusions can you draw from the survey conducted in class?

For class next Tuesday, please bring in a graph--somewhat unusual ones preferred. You can look in books, newspapers, magazines, reports, some document from your work, or anyplace else!

****NOTE**** We will be meeting in the computer classroom in the basement of Kiewit at 5:30 on Tuesday 10/8/96. During the second half we will return to our regular classroom.

- Make sure you know how to get data from Chance folder in Public.

- Play with data from "survey.mals.96" and "survey.jmp.m5"

- What conclusions can you draw from this data?

Review difficult homework problems from last week.

- Average, median and mode. What are the differences? Which contexts lend themselves
to which measures?

- Handout exercise in pairs.

- What is a histogram?

- How do you read a histogram?

- What is the difference between a histogram and a bar graph?

In Chapter 3, you ONLY need to do the following review exercises: 1, 3, 7-10. (Homework for Chapters 4 and 6 is unchanged.)

Read Gould's article "The Median Isn't the Message." Comment on it in your journal.

Thursday is the MALS party (before class) and Kessler talk (after class.)

- What do you think is meant by "measures of central tendency?"

- What are different examples of measures of central tendency?

a. Table 1 (from Statistics Without Tears)

| Method of Transport | Number of Students |
|---|---|
| Bicycle | 15 |
| Foot | 12 |
| Bus | 9 |
| Motorcycle | 6 |
| Car | 5 |
| Train | 3 |
| Total | 50 |

b. Here are two different groups of 5 people's yearly income. Find the median and the mean for each. Which is a better measure of central tendency in this case, and why?

X $30,000 $38,000 $42,000 $57,000 $73,000

Y $30,000 $38,000 $42,000 $57,000 $244,000

c. You are given the ages of five bus passengers as:

Under 12, 22, 48, 54, over 65 years. What measure of central tendency would you use?
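For exercise (b), a short Python sketch can be used to check the two measures by hand. Only the income figures come from the exercise; the variable names are illustrative.

```python
from statistics import mean, median

# Yearly incomes from exercise (b)
group_x = [30_000, 38_000, 42_000, 57_000, 73_000]
group_y = [30_000, 38_000, 42_000, 57_000, 244_000]

for name, incomes in (("X", group_x), ("Y", group_y)):
    # The two groups differ only in their largest income
    print(name, "mean:", mean(incomes), "median:", median(incomes))
```

Comparing the two printed lines shows how much each measure moves when a single income changes.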

- Read "Incarceration Is a Bargain" by Steve Hanke, in the *Wall Street Journal*, Monday, September 23, 1996.

- Discuss the following questions:

- a. How do you think Mr. Leavitt came to the figure that the average criminal would cause $53,900 worth of damage?
- b. Given that most of the people in prison are there for non-violent, non-property crimes (e.g., in 1994 the proportion of federal prisoners who were drug offenders was 62 percent)--sometimes referred to as consensual crimes--does that affect your reading of the article?
- c. How do you think the author comes to the conclusion that "violent crime would be approximately 70% higher today if our prison population had not increased since 1973; and property crime would be almost 50% more frequent."?
- d. What other factors might explain the decrease in violent and property crimes reported in the article?


2. **Standard Deviation**

3. **Video and Discussion** of Stephen Jay Gould's "Why the Death of .400 Hitting Records Improvement of Play" (from his *Full House*, Harmony Books, 1996).

- What is the birthday problem?

- What would you be willing to bet that there is a birthday match in this class?

- Determine if there are any matches.

**2. Small Group Discussion of Birthday Problem** (groups of four).

- What is the probability that there is *not* a match in a group of four?

- What is the probability that there is a match?

**3. The Mathematics of the Birthday Problem.**

- What is the probability of a match in general?

- How many people are needed in order to have at least a 50% probability of a match?

- What is the probability that someone in the room has *your* birthday?
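The questions above can be explored numerically. The sketch below assumes 365 equally likely birthdays (no leap years, no twins) and independence between people; the function names are illustrative.

```python
def p_match(n, days=365):
    """Probability that at least two of n people share a birthday."""
    p_no_match = 1.0
    for i in range(n):
        # The (i+1)-th person must avoid the i birthdays already taken
        p_no_match *= (days - i) / days
    return 1.0 - p_no_match


def p_someone_has_yours(n_others, days=365):
    """Probability that at least one of n_others people has one specific birthday."""
    return 1.0 - ((days - 1) / days) ** n_others


print("group of four:", round(p_match(4), 4))

# Smallest group size with at least a 50% chance of a match
smallest = next(n for n in range(1, 366) if p_match(n) >= 0.5)
print("people needed for 50%:", smallest)

# Chance that one of 22 other people shares *your* birthday
print("someone has yours:", round(p_someone_has_yours(22), 4))
```

The last two lines illustrate why "some match" and "a match with me" are very different questions for the same room.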

**4. Video clip: What is Probability?** (*Against All Odds*)

**5. Coincidences in Air Crashes.**

- Read the two letters from the NYT about meteorites and air crashes.

- Which of the two writers do you think has the right approach?

- What is the relationship between these articles and the birthday problem?

**6. Coin Tossing Experiment.**

- What is the probability of you flipping a head four times in a row?

- What is the probability of someone in the class getting a head four times in a row?
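A small sketch of the two coin-tossing questions, assuming a fair coin and assuming each student makes exactly one attempt of four flips. The class size of 20 is a made-up figure for illustration.

```python
# One student: four heads in a row with a fair coin
p_four_heads = 0.5 ** 4


def p_at_least_one(n_students, p=p_four_heads):
    """Chance that at least one of n_students succeeds, each with probability p."""
    return 1 - (1 - p) ** n_students


class_size = 20  # hypothetical class size
print("one student:", p_four_heads)
print("someone in the class:", round(p_at_least_one(class_size), 3))
```

The contrast between the two numbers is the same one that separates "it happened to me" from "it happened to somebody".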

**Journal assignment:**

Think of coincidences in your own life. What is the likelihood of these being random chance occurrences?

a. Break into groups of four.

b. Identify a member of your group who claims to be able to tell the difference between Pepsi and Coke. (Coke Classic, that is; accept no substitutes!)

c. Design an experiment to test whether this is true. Remember that one swallow doth not a summer make: Don't certify your taste-tester just on the basis of one taste. Write down exactly what data you will collect and what you will do with the data before you start collecting it.

d. What is being tested?

e. Carry out the experiment.

- When you design an experiment like this you should ask several questions. First, what do you want to test? Do you want to test if a person can tell given a single cup whether it contains Coke or Pepsi? Can a person decide which of two cups is Coke and which is Pepsi? Can a person who is given two cups simply decide if they have the same or different drinks? These are all testing slightly different abilities.

- What does your experiment test? Is that what you want it to test?
f. Record your results.

**2. Binomial Distribution.**

- What are the chances of getting these results by chance alone?

- What is a binomial distribution?
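One way to answer "by chance alone" for the taste test is the binomial distribution. The sketch below assumes a hypothetical design of 10 independent tastes with a 50% chance of a correct guess on each; your own experiment may use different numbers.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_tail(k, n, p):
    """Probability of k or more successes in n independent trials."""
    return sum(binom_pmf(j, n, p) for j in range(k, n + 1))


n_trials, p_guess = 10, 0.5  # hypothetical experiment design

for k in range(n_trials + 1):
    print(k, round(binom_pmf(k, n_trials, p_guess), 4))

# Chance that a pure guesser gets 8 or more right out of 10
print("8 or more correct:", binom_tail(8, n_trials, p_guess))
```

Deciding on the cutoff (here 8 of 10) before collecting data is the point of step (c) in the experiment.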

**Homework assignment: (Due Thursday)**

FPPA--

- Chapter 15, odd review problems.
- Chapter 16, all review problems.
- Chapter 17, even review problems.

1. The CNN Tracking Poll for October 19-20 interviewed 732 likely voters. They reported that 55% favored Clinton, 34% favored Dole and 6% favored Perot with a sampling error of + or - 4% (sampling error is also called margin of error).

- What do you think sampling error means?

- Discuss how this fits into your group's understanding of "sampling error". Where do you think the "19 cases out of 20" comes from?

- Can the difference (9 pts vs. 25 pts) be explained by chance?

- What are some other possible explanations?

- Do you think that tracking polls are a good idea?
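As a rough check on the reported sampling error, the sketch below computes two standard errors for each reported share, assuming a simple random sample of 732 voters. Real tracking polls use weighting and other adjustments, so this is only an approximation, and the function name is illustrative.

```python
from math import sqrt


def margin_of_error(p, n):
    """Two standard errors of a sample proportion p from a simple random sample of size n."""
    return 2 * sqrt(p * (1 - p) / n)


n = 732  # likely voters interviewed in the CNN tracking poll
for name, share in (("Clinton", 0.55), ("Dole", 0.34), ("Perot", 0.06)):
    print(name, round(100 * margin_of_error(share, n), 1), "points")
```

Two standard errors corresponds to roughly 95% of samples, which is one way to read "19 cases out of 20".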

(1) Read the NYT article "Misreading the Gender Gap" by Carol Tavris (September 17, 1996). What do you think of her explanation of the gender gap in the current election?

(2) How would you explain "margin of error" to a friend who had not had a statistics course?

**Speaker:** Tami Buhr from Harvard University will speak on her experiences in polling.

**Part II.**

Review of mathematical ideas. Binomial coefficients. Homework problems.

**Homework Assignment:** (*due Thursday, 10/31/96*)

Read Chapter 19, 20, 21. Do exercises:

- Chapter 19, Review Problems 1, 3-6, 10

- Chapter 20, Review Problems 2, 3, 4, 6, 7, 9, 11

- Chapter 21, Review Problems 1, 10.

**Journal assignment:**

Comments and reflections on speaker's talk.

**Please hand in on a separate sheet of paper a description of your plans for your final project. (due next Thursday)**

Review of Mathematics

- Binomial Coefficients

- Additional Math Problems

- What are the chances of something occurring by chance?

**Part II.**

Stephen Jay Gould talk in Cook Auditorium.

**Journal assignment:**

- Find two articles in the news. Summarize them and write at least three questions these articles raise for you.
- For those who attend the Stephen Jay Gould talk, give thoughts and comments about his talk.

**Handouts:**

(1) Project suggestions

(2) Comments on journals

(3) **Standard Error and Normal Approximation** by John Finn (optional reading)

Guest Speaker: Nancy Mathiowetz will speak on Surveys and Data Collection

**Part II.**

Confidence Intervals and Standard Deviation

**Homework Assignment:**

- Chapter 23, Review Problems 1, 3, 6, 12
- Chapter 8, Review Problems 1,2,6,7,10
- Chapter 9, Review Problems 6,7,10,13.

(For those who need a review of how to plot lines, find slopes, etc., read Chapter 7)

- Scatter plots and the correlation coefficient.
- Standard Deviation line.
- Predictions and regression.
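The three ideas in this list can be tied together in a few lines. The sketch below uses made-up data and the convention of dividing by n when computing the SD; the function names are illustrative.

```python
from statistics import mean, pstdev


def correlation(xs, ys):
    """Average product of the x and y values in standard units."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    products = [((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)]
    return mean(products)


def line_slopes(xs, ys):
    """Return (slope of the SD line, slope of the regression line)."""
    r = correlation(xs, ys)
    ratio = pstdev(ys) / pstdev(xs)
    sd_slope = ratio if r >= 0 else -ratio
    return sd_slope, r * ratio


# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

r = correlation(xs, ys)
sd_slope, reg_slope = line_slopes(xs, ys)
print("r =", round(r, 3))
print("SD line slope =", round(sd_slope, 3))
print("regression slope =", round(reg_slope, 3))
```

Both lines pass through the point of averages; the regression line is flatter than the SD line whenever r is strictly between -1 and 1.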

**2. Cookie Experiment.**

*Designing an experiment:* We want to know if there is a correlation between price and quality for chocolate chip cookies. Design an experiment to find out. Decide on as many details as you can, including: how many brands of cookies will be used and how quality and price will be measured.

*Carry out the experiment.*

3. **Class Discussion:** You be the judge: did regression analysis reveal a voting fraud, and was the fraud decisive?

Read "Probability Experts May Decide Pennsylvania Vote" (*The New York Times,* April 11, 1994).

*Discussion Questions:*

- What confounding factors might account for the anomalous election results?
- Was Professor Ashenfelter's method a reasonable way to decide the issue?
- How is a court to determine whether a proven voting fraud was decisive?
- How certain must a court be to take the extreme step of seating the nominal loser?
- How did Professor Ashenfelter arrive at 6% in his calculation of the probability that the anomaly could have occurred by chance?

**Journal Assignment:**

Look for a couple of articles in the news that use statistics or probability. Summarize the article and talk about 2 or 3 questions the article raises for you.

REMEMBER Journals are due this Thursday, November 7.

**2. Guest Speaker:**

Bob Braille, who is an adjunct professor in Environmental Studies and has written for the Boston Globe, will talk about reporting on environmental issues.

**3. More on Regression.**

- What does regression really mean?

**Homework Assignment:**

- Chapter 10, Review Exercises 1,3,4,8.
- Chapter 12, Review Exercises 2,8. (If you have any questions about the Root Mean Square material, read Chapter 11).

**Journal Assignment:**

Read the two articles about Electromagnetic Fields and health risks. Comment on the differences between the two articles.

**Class Discussion:** *HIV Testing and False Positives*

1. In one of Marilyn vos Savant's columns in Parade Magazine the following question was asked.

Suppose we assume that 5% of the people are drug-users. A test is 95% accurate, which we'll say means that if a person is a user, the result is positive 95% of the time; and if she or he isn't, it's negative 95% of the time. A randomly chosen person tests positive. Is the individual highly likely to be a drug-user?

Marilyn's answer was:

Given your conditions, once the person has tested positive, you may as well flip a coin to determine whether she or he is a drug-user. The chances are only 50-50.

How can Marilyn's answer be correct?
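One way to see where the 50-50 figure comes from is to apply the conditional probability rule directly to the numbers in the question. The function and parameter names below are illustrative.

```python
def p_user_given_positive(prevalence, sensitivity, specificity):
    """P(user | positive test) from the definition of conditional probability."""
    true_pos = prevalence * sensitivity               # users who test positive
    false_pos = (1 - prevalence) * (1 - specificity)  # non-users who test positive
    return true_pos / (true_pos + false_pos)


# 5% users, test right 95% of the time for users and for non-users
print(round(p_user_given_positive(0.05, 0.95, 0.95), 3))
```

The two kinds of positive results are equally common here because the small group of users and the small error rate among the large group of non-users happen to balance.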

2. An article in the New York Times some time ago reported that college students are beginning to routinely ask to be tested for the AIDS virus.

The standard test for the HIV virus is the Elisa test that tests for the presence of HIV antibodies. It is estimated that this test has a 99.8% sensitivity and a 99.8% specificity. 99.8% specificity means that, in a large scale screening test, for every 1000 people tested who do not have the virus we can expect 998 people to have a negative test and 2 to have a false positive test. 99.8% sensitivity means that for every 1000 people tested who have the virus we can expect 998 to test positive and 2 to have a false negative test.

The Times article remarks that it is estimated that about 2 in every 1000 college students have the HIV virus. Assume that a large group of randomly chosen college students, say 100,000, are tested by the Elisa test. If a student tests positive, what is the chance this student has the HIV virus? What would this probability be for a population at high risk where 5% of the population has the HIV virus?

If a person tests positive on an Elisa test, then another Elisa test is carried out. If it is positive then one more confirmatory test, called the Western blot test, is carried out. If this is positive the person is assumed to have the HIV virus. In calculating the probability that a person who tests positive on the set of three tests has the disease, is it reasonable to assume that these three tests are independent chance experiments?
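For the single-test screening question, it can help to count people rather than work with probabilities. The sketch below follows 100,000 tested students through one Elisa test using the sensitivity, specificity, and prevalence figures quoted above; it does not address the independence question for repeated tests, and the function name is illustrative.

```python
def screen(population, prevalence, sensitivity, specificity):
    """Expected counts of true and false positives, and P(virus | positive)."""
    infected = population * prevalence
    healthy = population - infected
    true_pos = infected * sensitivity
    false_pos = healthy * (1 - specificity)
    return true_pos, false_pos, true_pos / (true_pos + false_pos)


for prevalence in (0.002, 0.05):
    tp, fp, ppv = screen(100_000, prevalence, 0.998, 0.998)
    print(
        f"prevalence {prevalence:.1%}: {tp:.0f} true positives, "
        f"{fp:.0f} false positives, P(HIV | positive) = {ppv:.3f}"
    )
```

The same test gives very different answers in the two populations because the number of false positives depends on how many uninfected people are tested.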

**Journal Assignment:**

Read and comment on the Manchester, NH Union Leader story "Exit Poll Wrong Call in Senate Race Leaves Anger, Hurt, Red Faces." There are a couple of discussion questions at the end of the article.

*Part 1 -- Guest Speaker*

Jamshed Barucha from the Psychology Department will speak on Judgment under Uncertainty.

*Part 2 -- Streaks in Sports*

Do you believe in streaks?

What do you mean by streaks?

How would you recognize streaky behavior?

**More Discussion:**

What would it take to believe in streaks?

What would it take to not believe in them?

Chapter 26, Review ex. 2, 5.

Chapter 28, Review ex. 2, 3.

Chapter 29, Review ex. 1, 2, 4.

**Journal Assignment:**

Read the Discover article "Decisions, Decisions" and comment on it.

**1. ESP Experiment**

- What is a null hypothesis? What is an alternative hypothesis?
- What is a p-value? What is meant by statistical significance?
- How much evidence would it take to convince you that ESP exists?

**2. More on Streaks**

What is a streak? How would you recognize streaky behavior?

Computer simulations.
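A simulation along these lines might look like the following sketch. It assumes independent attempts with a fixed success probability (no streakiness at all) and tabulates the longest run of successes; the 100 attempts and 50% success rate are made-up settings.

```python
import random


def longest_run(flips, symbol="H"):
    """Length of the longest unbroken run of `symbol` in a sequence."""
    best = current = 0
    for f in flips:
        current = current + 1 if f == symbol else 0
        best = max(best, current)
    return best


def simulate(n_flips=100, p_hit=0.5, trials=10_000, seed=0):
    """Count how often each longest-run length occurs across many simulated sequences."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(trials):
        flips = ["H" if rng.random() < p_hit else "T" for _ in range(n_flips)]
        run = longest_run(flips)
        counts[run] = counts.get(run, 0) + 1
    return counts


results = simulate(trials=2_000)
for run_length in sorted(results):
    print(run_length, results[run_length])
```

Comparing an observed longest run with this table is one way to ask whether a "streak" is longer than chance alone would produce.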

**Journal Assignment:**

Read the article on ESP "They Laughed at Galileo Too", NYT, August 11, 1996 and comment on it in your journal.

**Part I-- Guest Speaker.**

Charlie Lewis from ETS will talk about security problems on SAT exams.

**Part II-- Chi Squared Test**

- What are they for?
- How do you use them?
- Activity on contingency tables.

| | Left Handed | Right Handed | Total |
|---|---|---|---|
| Men | | | |
| Women | | | |
| Total | | | |
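Once the table is filled in, the chi-squared statistic can be computed by hand or with a short sketch like the one below. The counts are made up for illustration and are not class data.

```python
def chi_square(table):
    """Chi-squared statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column classifications were independent
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts: rows are men, women; columns are left handed, right handed
observed = [[9, 43], [4, 44]]
print(round(chi_square(observed), 3))
```

For a 2 x 2 table there is 1 degree of freedom, and a statistic above about 3.84 corresponds to significance at the 5% level.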

**No more homework from the text! Time to really focus on your projects. We will collect your journals at the end of the term instead of next week.**

*Discussion Questions:*

prevention, preparation, and reassurance. How valid are those reasons?

a) Is it irresponsible for a woman to refuse prenatal testing?

b) Is she morally responsible to access all information or only that which is inexpensive? Or only that which poses no risk to the fetus?

a) How is "increased risk" for fetal anomaly like or different from "increased risk" for having some inherited trait such as breast cancer, depression, pattern baldness?

b) What is the relevance of variation determined by diagnostic tests? What are the advantages and disadvantages of gaining information about the fetus?

*Part II -- Card Shuffling*

1) Read the *NYT* article, 1/9/90, *In Shuffling Cards, 7 Is Winning Number.*

2) *Shuffling activity.*

A boring game of solitaire, which I call Yin/Yang, shows that 7 ordinary riffle shuffles, followed by a cut, of a 52-card deck are not enough to make every permutation equally likely.

Hearts and Clubs are called the Yin suits, and Diamonds and Spades are called the Yang suits. We shuffle the deck of cards 7 times, then cut it, and then start removing and revealing each card from the top of the deck, making a new pile of them face-up (so if this were all we did, we'd just have the deck unchanged after going through it once, except that the deck would be lying face-up on the table).

We start the pile for each suit when we discover its ace, and add cards of the same suit to each of these 4 piles, according to the rule that we must add the cards of each suit in order.

Thus a single pass through the deck is not going to accomplish much in the way of completing the 4 piles, so having made this pass, we turn the remaining deck back over, and make another pass.

We continue this until we complete either the two Yin piles (hearts & clubs), or the two Yang piles (diamonds & spades). If the Yin piles get completed first, we call the game a win; it's a loss if the Yang piles get completed first.

If the deck has been thoroughly permuted (by having put the cards through a clothes dryer, say), then the Yin and Yangs will be equally likely to be first to get completed. Thus our expected proportion of wins will be 1/2.

Discuss your reactions to the video, and respond to some of the discussion questions in today's handout.

**Notes on second journal assignments.**

Everyone seemed to appreciate Tami Buhr's talk and learned a lot from it. You had mixed feelings about Stephen Jay Gould's talk. Some felt he was unnecessarily rude and egotistical. Most thought his ideas were interesting and that it was worthwhile hearing his talk.

You raised some interesting questions about coincidences, and you all had plenty of incidents that you think of as coincidences. Much has been written about whether more "coincidences" have happened than should by chance alone, and we are not going to settle this issue easily. However, it is important to bear in mind our earlier discussion about the difference between the probability of a specific event and the probability of this event or a similar event sometime during a longer period, for example, during your lifetime. For example, one of you mentioned that you and your mother both had dreams the same night that involved animals with weird colors. This, by itself, would seem very unlikely, but perhaps having, at some point in a lifetime, a dream that is similar in some weird way to someone else's dream on the same night is not so strange.

It is possible to make models to show more concretely how an event on a particular day can have very small probability but "once in a lifetime" not so small. The famous psychiatrist Jung was one of those who believed that coincidences occurred more often than they could be expected to by chance. He mentioned once that he was struck by hearing references to fish 6 times during a 24 hour period. We can make a model to see how unlikely this would be in a lifetime. We have to specify how often on average you hear a story involving fish. The time between fish stories is random. It might be 2 hours or 2 weeks. Let's assume the average time between fish stories is one week. Then you can write a program to simulate this process where the time between stories is random with average time between stories one week. Then run this program for the equivalent of 40 years and record whether or not there is a 24 hour period that contains 6 or more fish stories. Finally, repeat this many times to estimate the chance that 6 or more fish stories in a single day occurs during a 40 year period.
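A minimal sketch of such a program follows. The paragraph above only says the time between stories is random with a one-week average; the choice of an exponential distribution for the gaps is an added assumption, and the function names are illustrative.

```python
import random


def story_times(mean_gap_days, total_days, rng):
    """Times (in days) of fish stories, with exponential gaps (an assumption)."""
    times = []
    t = rng.expovariate(1.0 / mean_gap_days)
    while t < total_days:
        times.append(t)
        t += rng.expovariate(1.0 / mean_gap_days)
    return times


def has_cluster(times, window_days=1.0, k=6):
    """True if some window of the given length contains k or more events."""
    for i in range(len(times) - k + 1):
        # times is sorted, so compare the i-th event with the (i+k-1)-th
        if times[i + k - 1] - times[i] <= window_days:
            return True
    return False


def estimate(mean_gap_days=7.0, years=40, runs=1000, seed=0):
    """Fraction of simulated lifetimes containing at least one cluster."""
    rng = random.Random(seed)
    total_days = years * 365.25
    hits = sum(
        has_cluster(story_times(mean_gap_days, total_days, rng))
        for _ in range(runs)
    )
    return hits / runs


print(estimate())
```

Changing `mean_gap_days` shows how sensitive the answer is to the assumed frequency of fish stories, which is the hardest number to pin down.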

Amy asked if you could make a map of acquaintances in a small town and then try to see who knew who. This kind of problem is a favorite of mathematicians called "graph theorists." They draw a line between each two people who know each other. Then there is a "path" between two people a and b if you can go from a to b following these lines. It is natural to ask for the smallest number k such that there is a path between x and y of length less than k for every pair of people x and y in the town. If it is a small town you might guess that there is a path of length at most 3 between any two people. In the movie "Six Degrees of Separation" it was suggested that everyone in the world is connected by a path of length at most 6. Just to show you how berserk mathematicians can get over such problems, here are some remarks from my friend David Griffeath's home page.

Apparently, sometime within the past few years, MTV talk show host Jon Stewart devised a parlor game in which contestants are to link any movie star to Kevin Bacon by a chain of films that share performers. Thus we imagine actors as vertices in a large network, with edges between any two who have been in a movie together. The goal is to find the path of minimal length connecting x to Kevin Bacon, that length then being the Bacon number, which we denote here as B(x). Quite a cult has grown up around this pastime, as witnessed by more than a dozen web pages now devoted to the game. My favorite link to Bacon fanatics is The Center of the Hollywood Universe. Now Brett Tjaden and Glenn Wasson at the University of Virginia have automated the calculation of B(x) at their Web site, The Oracle of Bacon. Their program, which makes use of the marvelous Internet Movie Database (IMDB), will compute the Bacon number of any performer you care to specify within a few seconds. Hard-core cultists seem threatened by the power of the Oracle, but the rest of us welcome this automation of what can be an arduous evaluation. For instance, B(Bara, Theda) = 3, and in fact the Oracle confirmed an outstanding conjecture that for any x from the United States, either B(x) is at most 4 or B(x) is infinite.

Mathematicians have had their own version of this story for many years, centered around the Hungarian number theorist and combinatorist Paul Erdos, where the links are formed by joint authorship of research papers.
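Path lengths of this kind, whether between acquaintances, actors, or coauthors, can be found with a breadth-first search. The sketch below is not the Oracle's method, just one standard way to do it; the tiny network and its labels are made up for illustration.

```python
from collections import deque


def path_length(graph, start, goal):
    """Length of the shortest path from start to goal, or None if they are not connected."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, dist = queue.popleft()
        for neighbor in graph.get(person, ()):
            if neighbor == goal:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None  # plays the role of an "infinite" number


# Hypothetical acquaintance network: each person maps to the people they know
acquaintances = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c"],
    "e": [],
}

print(path_length(acquaintances, "a", "d"))
print(path_length(acquaintances, "a", "e"))
```

Breadth-first search visits people in order of distance from the start, so the first time it reaches the goal it has found a shortest path.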

Sue wondered about the coincidence of getting an even dollar amount for the total charge of your groceries. This is an interesting problem. You are adding up several chance events that represent the individual costs for your items. You could make some assumption about the probability distribution for each individual item, for example, the probability that it is 17 cents, etc. It is reasonable to assume that your total bill will be more than a dollar. I think you could then show that the probabilities that the last two digits are 00, 01, 02, ..., 99 are approximately the same, so you could conclude that the chance that your total bill is exactly x dollars for some x is about 1 in one hundred. More interesting is the probability that the leading digit of your bill is 1, 2, 3, ..., 9. (If your bill is $14.89 then the leading digit is 1.) It has been found that the leading digits of numbers in nature are not equally likely but rather follow a logarithmic distribution. Here is an account of how this fact has been used to find people who cheat on their income tax.

Mark Negrini, who teaches accounting at St. Mary's University in Halifax, wrote his PhD thesis on: "The detection of income evasion through an analysis of digital distributions". He has persuaded business and government people to use Benford's law to test suspicious financial records such as bookkeeping checks and tax returns. Benford's law states that the distribution of the leading digit in data sets is typically not equidistributed but rather given by the distribution p(k) = log(k+1) - log(k) for k = 1,2,...,9, where log is the base-10 logarithm. (The leading digit of .0034 is 3, of 243 is 2, etc.) This gives the probabilities .301, .176, .125, .097, .079, .067, .058, .051, .046 for the chance that 1,2,3,4,5,6,7,8, or 9 will be the leading digit.

Numerous explanations for this have been given but perhaps the most persuasive is that Benford's distribution is the unique distribution for the leading digits that is not changed by a change of units, i.e. multiplying the data by a constant c. Negrini's idea is that, if we are honest, the numbers in our tax returns and on our checks should satisfy Benford's law and if they do not there may be some skullduggery.

The article states that "Mr. Negrini has also lent his expertise to federal and state tax authorities, officials in Denmark and the Netherlands and to several companies. He has even put President Clinton's tax returns to the Benford's Law test. When he analyzed the president's returns for the past 13 years he found that 'the returns by Clinton follow Benford's Law quite closely.'"
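The Benford probabilities discussed above are easy to tabulate. The sketch below evaluates the logarithmic formula with base-10 logarithms and includes a small helper for extracting a leading digit; the function names are illustrative.

```python
from math import log10


def benford(k):
    """Benford probability that the leading digit is k (k = 1, ..., 9)."""
    return log10(k + 1) - log10(k)


def leading_digit(x):
    """First nonzero digit of a nonzero number, read off its scientific notation."""
    return int(f"{abs(x):e}"[0])


for k in range(1, 10):
    print(k, round(benford(k), 3))

print(leading_digit(0.0034), leading_digit(243))
```

To test a set of financial records in this spirit, one would tally `leading_digit` over all the amounts and compare the observed proportions with the table.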

Your explanations to your Uncle George of what the "margin of error" means indicate that there is still some confusion about what it means. The margin of error includes only the sampling error and not the other kinds of errors that Tami talked about, such as errors caused by non-response. You will find that it is typically about 1/sqrt(n), corresponding to the estimate of two standard errors that we discussed. For example, for a sample of 200, 1/sqrt(200) = .0707, so this would be reported as a 7 percent margin of error. You also have to be careful not to tell Uncle George that the error will be no more than 7 percent, since even George will realize that it is possible for the poll to really screw up. A couple of you described it in terms of what would happen when you toss a coin, say 100 times, and look at the proportion of heads that comes up. We can say with 95% confidence that the number of heads will be between 40 and 60. Thus if our estimate of the probability of heads is the proportion of heads in the sample, we can be quite certain we are not off by more than 10% from the true probability of heads. I think George would be best served by just giving him the "box" that is included at the end of the New York Times poll reports.

How the Poll Was Conducted

The latest New York Times/CBS News Poll is based on telephone interviews conducted Feb. 22 to 24 with 1,223 adults throughout the United States. The sample of telephone exchanges called was randomly selected by a computer from a complete list of active residential exchanges in the country. The list of more than 36,000 residential exchanges is maintained by Marketing Systems Group of Philadelphia.

Within each exchange, random digits were added to form a complete telephone number, thus permitting access to both listed and unlisted numbers. Within each household, one adult was designated by a random procedure to be the respondent for the survey.

The results have been weighted to take account of household size and number of telephone lines into the residence and to adjust for variations in the sample relating to geographic region, race, sex, age, and education.

In theory, in 19 cases out of 20 the results based on such samples will differ by no more than three percentage points in either direction from what would have been obtained by seeking out all American adults.

For smaller subgroups the potential sampling error is larger. For example, it is plus or minus five percentage points for those who say they are likely to vote in a Republican primary or caucus this year.

In addition to sampling error, the practical difficulties of conducting any survey of public opinion may introduce other sources of error into the poll. Variations in question wording or the order of questions, for instance, can lead to somewhat different results.
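The rule of thumb for the margin of error and the coin-tossing description can both be checked with a few lines. The sketch below assumes simple random sampling and a fair coin; the function names are illustrative.

```python
from math import comb, sqrt


def rule_of_thumb(n):
    """Approximate margin of error (two standard errors) for a sample of size n."""
    return 1 / sqrt(n)


# Sample sizes mentioned above: the 200-person example and the 1,223-adult poll
for n in (200, 1223):
    print(n, round(100 * rule_of_thumb(n), 1), "percentage points")


def p_heads_between(lo, hi, n=100):
    """Exact chance that n fair coin tosses give between lo and hi heads, inclusive."""
    return sum(comb(n, k) for k in range(lo, hi + 1)) / 2**n


print(round(p_heads_between(40, 60), 3))
```

The value for 1,223 adults can be compared with the three percentage points quoted in the box, and the coin-tossing probability comes out slightly above 95%.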

So you see, as usual, your journals raised as many interesting questions as they solved, which is a sign of good journals.

**Part I -- Card Shuffling**

1) Read the NYT article, 1/9/90, *"In Shuffling Cards, 7 Is Winning Number."*

2) Shuffling activity.

A game of solitaire, which we call Yin/Yang, shows that 7 ordinary riffle shuffles, followed by a cut, of a 52-card deck are not enough to make every permutation equally likely.

Hearts and Clubs are called the Yin suits, and Diamonds and Spades are called the Yang suits. We shuffle the deck of cards 7 times, then cut it, and then start removing and revealing each card from the top of the deck, making a new pile of them face-up (so if this were all we did, we'd just have the deck unchanged after going through it once, except that the deck would be lying face-up on the table).

We start the pile for each suit when we discover its ace, and add cards of the same suit to each of these 4 piles, according to the rule that we must add the cards of each suit in order.

Thus a single pass through the deck is not going to accomplish much in the way of completing the 4 piles, so having made this pass, we turn the remaining deck back over, and make another pass.

We continue this until we complete either the two Yin piles (hearts & clubs), or the two Yang piles (diamonds & spades). If the Yin piles get completed first, we call the game a win; it's a loss if the Yang piles get completed first.

If the deck has been thoroughly permuted (by having put the cards through a clothes dryer, say), then the Yin and Yangs will be equally likely to be first to get completed. Thus our expected proportion of wins will be 1/2.
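The game can also be simulated. The sketch below makes two assumptions that are not stated above: the deck starts, top to bottom, as ace through king of hearts, ace through king of clubs, king through ace of diamonds, king through ace of spades; and a riffle shuffle follows the Gilbert-Shannon-Reeds model (binomial cut, then cards dropped from each packet with probability proportional to packet size). All function names are illustrative.

```python
import random

SUITS = "HCDS"  # hearts and clubs are Yin; diamonds and spades are Yang


def new_deck():
    """Assumed starting order, top to bottom (see the assumptions above)."""
    deck = [("H", r) for r in range(1, 14)]
    deck += [("C", r) for r in range(1, 14)]
    deck += [("D", r) for r in range(13, 0, -1)]
    deck += [("S", r) for r in range(13, 0, -1)]
    return deck


def riffle(deck, rng):
    """One riffle shuffle under the Gilbert-Shannon-Reeds model."""
    cut = sum(rng.random() < 0.5 for _ in deck)  # binomial cut point
    left, right = deck[:cut], deck[cut:]
    out, i, j = [], 0, 0
    while i < len(left) or j < len(right):
        n_left, n_right = len(left) - i, len(right) - j
        if rng.random() * (n_left + n_right) < n_left:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    return out


def yin_wins(deck):
    """Play Yin/Yang on a deck listed top to bottom; True if the Yin piles finish first."""
    next_rank = {s: 1 for s in SUITS}  # ace = 1 starts each pile
    remaining = list(deck)
    while True:
        leftover = []
        for suit, rank in remaining:
            if rank == next_rank[suit]:
                next_rank[suit] += 1
                if next_rank["H"] == 14 and next_rank["C"] == 14:
                    return True
                if next_rank["D"] == 14 and next_rank["S"] == 14:
                    return False
            else:
                leftover.append((suit, rank))
        remaining = leftover  # turning the pile back over keeps the order


def win_rate(shuffles, trials, rng):
    """Fraction of wins; shuffles=None means a uniformly random deck."""
    wins = 0
    for _ in range(trials):
        deck = new_deck()
        if shuffles is None:
            rng.shuffle(deck)
        else:
            for _ in range(shuffles):
                deck = riffle(deck, rng)
            k = rng.randrange(len(deck))  # final cut
            deck = deck[k:] + deck[:k]
        wins += yin_wins(deck)
    return wins / trials


rng = random.Random(0)
print("7 riffle shuffles and a cut:", win_rate(7, 1000, rng))
print("thoroughly permuted deck:", win_rate(None, 1000, rng))
```

Under these assumptions the first proportion should come out noticeably above 1/2 and the second close to 1/2, which is the sense in which 7 shuffles are not enough for this game.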

**Part II -- Review of Tests in JMP**

- Single variable
- z-test (known population standard deviation)
- t-test (unknown population statistics, small samples)

- More than one variable
- correlation and regression
- comparing means
- Chi-square test

**Part III- Games of Chance**