Class 9: Coke vs. Pepsi: Analyzing the Results.

When you design an experiment like this, you should ask several questions. First, what do you want to test? Do you want to test whether a person, given a single cup, can tell whether it contains Coke or Pepsi? Can a person decide which of two cups is Coke and which is Pepsi? Can a person given two cups simply decide whether they hold the same or different drinks? Each of these tests a slightly different ability.

You might think of the experiment as trying to settle an argument, say, between Linda and Laurie. Linda claims she can tell the difference between Pepsi and Coke, and Laurie claims she cannot. Now Linda does not claim she can do it every time in a series of tests, but rather that she can get it right more often than she would just by guessing. There are two kinds of errors we can make in our experiment. The first kind, called a type I error, is that Linda establishes her claim when in fact she is just guessing and was lucky. The second kind, called a type II error, occurs if Linda really does have the ability she claims but has a bad day and does not get enough correct to establish her claim. Laurie wants to be sure that the chance of a type I error is small, and Linda wants to be sure that the chance of a type II error is small.

Let's consider first the Group 2 experiment. The experimenters gave a single taster 6 cups known to contain either Pepsi or Coke. The taster was not told how many had Coke and how many had Pepsi. In fact 3 cups contained Coke and 3 contained Pepsi.

Now suppose we required that the taster get all 6 correct to establish the claim.

Since the taster was not told how many cups contained Coke and how many contained Pepsi, it is reasonable to assume that if the taster was just guessing what a single cup contained, they would have a 50% chance of being correct. It is harder to say just what having the ability means. One simple solution is to ask the taster what percentage of the time they would expect to get it right. If the answer is 80% then we might say that if the taster's claim is correct, they would have an 80% chance of being correct on a single cup.

Then the probability of a type I error is (1/2)^6 = 1/64, or about .016. The probability of a type II error is 1 - .8^6 = .74. This shows that requiring all 6 correct is clearly unfair to the taster. Thus we should consider changing the requirement to getting, say, 5 or more correct.
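The two error probabilities for the "all 6 correct" rule, and those for a "5 or more" rule, can be checked with a short binomial calculation. The following is only a sketch in Python. It assumes, as in the discussion above, that each cup is judged independently, with a 50% chance of success for a guesser and an 80% chance for a taster who has the claimed ability. The function names are ours, chosen for illustration.

```python
from math import comb

def prob_at_least(k, n, p):
    """Probability of k or more successes in n independent trials, each with success chance p."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

def error_rates(k, n=6, p_guess=0.5, p_skill=0.8):
    """Type I and type II error chances when k or more correct out of n establishes the claim."""
    type_1 = prob_at_least(k, n, p_guess)      # a guesser passes by luck
    type_2 = 1 - prob_at_least(k, n, p_skill)  # a skilled taster fails to pass
    return type_1, type_2

for k in (6, 5, 4):
    type_1, type_2 = error_rates(k)
    print(f"need {k} of 6: type I = {type_1:.3f}, type II = {type_2:.3f}")
```

Under these assumptions, relaxing the rule from 6 correct to 5 or more lowers the chance of a type II error from about .74 to about .34, while raising the chance of a type I error from about .016 to about .11. Neither side gets everything it wants with only 6 cups.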

We should consider this kind of analysis for all the experiments. Here is the information we have about the experiments that were carried out in class last time.

Group 1: One taster did three sets of three cups. Each set contained one cup of Coke, one of Pepsi, and one of RC. Assume the order was always P, C, RC in each case.


    P, C, RC
    P, C, RC
    P, C, RC
taster reported:
   RC, C, P
   RC, C, P
   P, RC, C

Group 2: A single taster was given 6 cups total: 3 of Coke, 3 of Pepsi. The taster was not told how many cups there were of each.

   P    C    P    P    C    C
taster reported
   P    C    C    P    P    C
Group 3: There were three tasters. Each taster was given 6 cups and not told how many cups contained Coke and how many contained Pepsi. The experimenters deliberately used 4 cups of one drink and 2 of the other. The same cups were used for each taster.

   C    C    P    C    C    P
taster 1:
   P    C    P    C    C    P
taster 2:
   P    C    C    C    C    P
taster 3:
   P    C    P    C    C    P
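As a rough illustration of how the Group 2 and Group 3 results might be scored, the sketch below counts the cups each taster identified correctly and computes the chance of doing at least that well by guessing. It assumes each cup is an independent 50% guess, which is the assumption made earlier for a taster who is not told how many cups hold each drink. The data are copied from the tables above; the function and variable names are ours.

```python
from math import comb

def count_correct(truth, report):
    """Number of cups where the taster's report matches the actual contents."""
    return sum(t == r for t, r in zip(truth.split(), report.split()))

def chance_of_at_least(k, n, p=0.5):
    """Chance of k or more correct out of n when each cup is an independent p-chance guess."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

# Actual contents and the tasters' reports, in cup order.
results = {
    "Group 2": ("P C P P C C", ["P C C P P C"]),
    "Group 3": ("C C P C C P", ["P C P C C P", "P C C C C P", "P C P C C P"]),
}

scores = {}
for group, (truth, reports) in results.items():
    n = len(truth.split())
    scores[group] = [count_correct(truth, r) for r in reports]
    for i, k in enumerate(scores[group], start=1):
        print(f"{group}, taster {i}: {k} of {n} correct; "
              f"chance of doing this well by guessing = {chance_of_at_least(k, n):.3f}")
```

The Group 2 taster got 4 of 6 correct, and the three Group 3 tasters got 5, 4, and 5 of 6 correct. Whether any of these establishes a claim depends on the requirement agreed on before the experiment.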

Group 4:

The taster was given 3 sets of 2 cups.

   P,C    P,C    C,P
   C,P    C,P    C,C

Group 5:

This group had two tasters, each given 3 sets of 2 cups. The content of each cup was determined by picking one piece of paper out of two folded pieces, one of which said Pepsi and the other said Coke (so one pair of cups could consist of two Cokes, two Pepsis, or one of each). The tasters were told how the contents of the cups were picked.

   P,C    C,P    C,P
taster 1:
   P,C    C,P    C,P
taster 2:
   P,C    C,P    C,P

Note: We have decided to hold the Chance Fair, where you present
your project, on the last day of the reading period, Tuesday, May 14,
instead of during the final exam period.

On Monday we will have a guest speaker, John Paulos, author of the
best-selling book "A Mathematician Reads the Newspaper".

Linda's Laborious Solutions to the Discussion Questions from Class 6

I'll try to work the discussion problems and see if that helps you do the first
journal question for Class 7 (which is supposed to be the same idea, with
different numbers).

These written solutions are no substitute for coming to precepts or office
hours or discussing the problems among yourselves, and I encourage you to do
these things, too.

Here is how I work out the discussion problems.


The journal question should be about the same, though you have to figure out some of the numbers to use yourself. The answer should come out somewhere in between 50% and 98%.


Some comments from Laurie:

Recall that P(A|B) means the probability of A given that B is true.

The AIDS example in terms of conditional probability amounts to the following: You are given

P(+ test | HIV positive)

and

P(- test | HIV negative)

(both .998 in our case) and you want to know

P(HIV positive | + test)

In general, P(A|B) is not equal to P(B|A).

For example, consider two tosses of a coin and let A be the event that both tosses are heads and B the event that the first toss is a head. Then P(A|B) = .5 and P(B|A) = 1.
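The coin example can be checked by listing the four equally likely outcomes and counting. This is a minimal sketch in Python; the helper names prob and cond_prob are ours.

```python
from fractions import Fraction
from itertools import product

# All four equally likely outcomes of two tosses: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

def prob(event):
    """Probability of an event, given as a true/false test on an outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def cond_prob(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    return prob(lambda o: event(o) and given(o)) / prob(given)

A = lambda o: o == ("H", "H")   # both tosses are heads
B = lambda o: o[0] == "H"       # the first toss is a head

print("P(A|B) =", cond_prob(A, B))   # 1/2
print("P(B|A) =", cond_prob(B, A))   # 1
```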

To find one of these conditional probabilities from the other you need also to know P(A) and P(B) (actually their ratio is sufficient). In the AIDS example this amounts to knowing the probability the patient is HIV positive before the test is performed.
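To see how much the answer depends on the probability before testing, here is a small sketch of the calculation in Python using Bayes' rule. The two test accuracies are the .998 values from the AIDS example; the prior probabilities in the loop are hypothetical values chosen only for illustration, not figures from the discussion problems.

```python
def prob_hiv_given_positive(prior, sensitivity=0.998, specificity=0.998):
    """P(HIV positive | + test) from Bayes' rule.

    prior       = P(HIV positive) before the test is performed
    sensitivity = P(+ test | HIV positive)
    specificity = P(- test | HIV negative)
    """
    true_pos = sensitivity * prior                 # P(+ test and HIV positive)
    false_pos = (1 - specificity) * (1 - prior)    # P(+ test and HIV negative)
    return true_pos / (true_pos + false_pos)

# Hypothetical priors, used only to show how the answer changes with them.
for prior in (0.0001, 0.001, 0.01, 0.1):
    posterior = prob_hiv_given_positive(prior)
    print(f"prior = {prior}: P(HIV positive | + test) = {posterior:.3f}")
```

With these illustrative priors the probability of being HIV positive after one positive test rises from about 5% to about 98%, even though the test itself is equally accurate in every case.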

In law, mixing these two probabilities up is called the "prosecutor's fallacy".
A prosecutor will often have a reasonable estimate of

P(the evidence | the accused is innocent)

and then incorrectly state this probability as

P(the accused is innocent | the evidence)

because that is what the jury wants to know.

For example, in the Simpson trial the DNA experts gave a very small probability of a DNA match for a person chosen at random, say, in the Los Angeles area. This is

P(match | Simpson is innocent)

but this is not the same as

P(Simpson is innocent | match).

Also, don't forget Linda's three general comments about AIDS testing.

1. These calculations show that if a patient is not from a high-risk group, so that the probability of being HIV positive before testing is small, then a single positive ELISA test will not result in a high probability that the patient is HIV positive. It will if the patient is from a high-risk group.

2. In practice, when a lab has a positive test it carries out another ELISA test and then a Western blot test. If all three are positive, it is reported that the patient is HIV positive.

3. The two ELISA tests cannot be assumed to be independent, since they are usually carried out on the same blood sample and there may be something other than HIV in the blood that results in a positive test. The Western blot test is more specific to HIV, and so it is more reasonable to assume it is independent of the other tests.