Michi’s blog » Response to Heath Raftery

## Response to Heath Raftery

• February 9th, 2007
• 11:57 pm

While I’m still on the subject of writing code with the PFP library, I may as well join in on a discussion that got pulled into the Carnival of Mathematics exposition.

Heath Raftery writes about weird probabilities in dice discussions, a problem very much reminiscent of the Monty Hall problem, both in the amount of controversy it generates when people discuss it, and also, it seems on at least half the people in the discussion so far, in how much the terms use end up confusing people.

So, in a comment to the Carnival post, the suggestion came from Alex, that a simulation be used to settle the whole question. And since I just wrote some things with PFP, I might as well write another!

The question, as posed by Heath, is

Two dice are rolled simultaneously. Given that one die shows a “4”, what is the probability that the total on the uppermost faces of the two dice is “7”?

so in order to settle it, I’m going to let my computer roll a whole lote of dice. Whenever at least one die shows a 4, I’m going to tally the roll, and whenever the sum is 7, I’m going to tally it as a succesful roll. The quotient of these tallies will be the probabilities.

I end up with the following code

import Probability

– A die roll is a uniform random number in [1..6]
die = uniform [1..6::Int]

– Two dice is just a pair of independently rolled die
twoDice = prod die die

tallyOne :: IO (Int, Int)
tallyOne = do
(d1, d2) <- pick twoDice
let (a,b) | d1 == 4 || d2 == 4 = if d1+d2 == 7 then (1,1) else (1,0)
| otherwise = (0,0)
return (a,b)

sumPair (a,b) (c,d) = (a+c,b+d)

simulate n = (fmap (foldr sumPair (0,0)) . sequence . (take n) . repeat)  tallyOne

simulation n = fmap (\(x,y) -> (fromIntegral y)/(fromIntegral x)) $simulate n which pretty much lets me enter a number of tests to perform, and returns the resulting quotient of tallied successes by tallied valid rolls. While playing around with it, I discover that at about 100 000 rolls, the execution outgrows the stack, and so I partition the simulation into pieces of 50 000. This yields me the following *DiceSim> ( fmap (unlines . map show)$ sequence $take 10$ repeat $simulation 50000 ) >>= putStrLn which then — after several minutes execution time, gives a simulated dataset, namely 0.176424464192368 0.1825110045332107 0.18224936651289714 0.18589282208990382 0.18272166287755764 0.18133159268929505 0.18697201689545934 0.17798213004630536 0.1818062827225131 0.18456002616944717  Plugging it into my OpenOffice calculating tool, I retrieve a mean of 0.1822451368729 and a standard deviation of 0.0032384171164. Comparing these, now, to the numerical values of 1/6 = 0.166666666666… 2/11 = 0.18181818181818181818… makes me conclude that sorry, but, the thing that’s within 0.13σ of our experimental mean simply feels much more reliable than that within almost 5σ Thus, in the end, my bets goes with the teacher of Heath: 2/11, and for about the same reasons as the Monty Hall problem. As an invitation to Heath: if you disagree with my model, please propose how to code or pseudocode the model you interpret, so that we can simulate and test it! ### 17 People had this to say... • Dan P • February 10th, 2007 • 0:46 I haven’t tested the code but wouldn’t the strict foldl’ in Data.List solve your stack problem with foldr? Love the simulation program, well done. As I’ve commented back on Alon’s blog, you’ve simulated the alternative question I suggested in my epilogue. Have another read of my original post, and you’ll see that in the last paragraph I describe precisely the scenario you’ve coded – rolls without a 4 are discarded – and agree that the answer to that question is 2/11. My argument is that was not the situation described in the original question, which unfortunately reduces to an argument of ambiguous interpretation. I do enjoy the mathematical argument however – do continue. • Michi • February 10th, 2007 • 10:48 DanP: foldl or even foldl’ make things better. 100 000 tries works without problem. Heath: I’m putting an answer up at Alon’s. • TBone • February 10th, 2007 • 15:13 Why not just simulate the entire probability space in this case. There are only 6^2=36 possible outcomes given two dice. Each outcome has an associated 1/(6^2)=1/36 probability. That’s a heck of a lot less work than what you are doing and it will give you an exact answer. You could generate all outcomes pretty quick with list comprehension, and then just fold the sucker into a counting function. Heck, if you use a strict foldl, it won’t even consume any more space than that required to hold one outcome while doing the calculation. Of course you could just as easily just do the math: X = Outcome of first dice roll Y = Outcome of second dice roll P(X=4&&Y=3 || X=3&&Y=4 given X=4||Y=4) = P(X=4&&Y=3 || X=3&&Y=4)/P(X=4||Y=4) P(X=4||Y=4) = P(X=4) + P(Y=4) – P(X=4&&Y=4) = 2/6-1/36 = 11/36 (these events are not mutually exclusive, so you have to subtract one copy of the combined event as otherwise you have counted it twice because it is included in each of the independent events) P(X=4&&Y=3 || X=3&&Y=4) = P(X=4&&Y=3) + P(X=3&&Y=4) = 2/36 (these two events are mutually exclusive, so nothing has been counted twice — equally, the term you would subtract is P(X=4&&Y=3 && X=3&&Y=4)=0 [being obviously impossible]) Therefore, the answer is 2/36 * 36/11 = 2/11. • Michi • February 10th, 2007 • 15:23 TBone: I’m pretty certain that’s what the probability library does up to the point that I require, with pick, that it involve random selections. And the reason I do it this way instead of by actual probability analysis is that the core of the matter was a disagreement in the analysis of the problem, where the original author, Heath, claimed that 2/11 was invalid. But yeah, the purely mathematical reasoning IS cleaner and better. • TBone • February 10th, 2007 • 16:46 Michl: Yeah, I guess the math argument is not of much use to those who are arguing about the math argument. : ) Interesting post though, I should have a look into that probability library. Sounds like it could be usefull/interesting. For fun, here’s another (slightly less than optimal) proof in the form ghci commands followed by results. — The conditional probability space is: let space = filter (\(x,y) -> x==4 || y==4) [(x,y)|x x+y == 7) space — events = [(3,4),(4,3)] — The probability is therefore (all events in the space being equally likely): let prob = (length events)/(length space) — prob = 2/11 • TBone • February 10th, 2007 • 16:54 Okay, the blog comment system just totally mulched my code. Ah well, it was nothing amazing, just a simple list comprehension to generate the die roll space, followed by a filter to reduce that to the conditional space, and another to reduce that to the events space. Have a good one! • pwyll • February 11th, 2007 • 4:59 The mathematical argument can be made just by counting. There must be at least one 4, so this the sample space: [4 1], [4 2], [4 3], [4 4], [4 5], [4 6], [1 4], [2 4], [3 4] [5 4] [6 4]. There are 11 points in the sample space, only two sum to 7, so the answer is 2/11. The only trick is to not to count [4 4] more than once. If the reason for this is not obvious, think of the rolls as ordered pairs, the dice as red and green, with 1st dice red, 2nd green. • Michi • February 11th, 2007 • 12:18 pwyll: You did notice which side I’m on in the discussion, right? Heath claims the [4 4] should be double counted. I claim it shouldn’t. • Dan P • February 12th, 2007 • 19:14 Related to this probability question is the Sleeping Beauty Problem. It too depends on asking interesting questions about conditional probability. Unlike Monty Hall and its relatives, however, people still debate about what exactly the correct answer is. • Dan P • February 12th, 2007 • 20:28 I had a look at PFP and don’t like it’s handling of fail. Seems to me that you should be able to write lines in the monad like “guard$ x>10″ so that the final outcome is conditioned on x being greater than 10. The ‘pick’ routine should deal with this by renormalising the probability distribution so as to sum to 1. Maybe I haven’t properly thought through the consequences of this and it’s a stupid suggestion. Does it sound reasonable to you?

• Steph
• February 19th, 2007
• 0:45

I tried a million iteration of the following program with Matlab and got
ans =
2.0103 1.0965
ans =
1.9924 1.0867
The first number is /11 while the second is /6.
Without a doubt, 2/11 wins

n4=0;
n7=0;

for i=1:1000000
x=ceil(6*rand(1));
y=ceil(6*rand(1));
if (x==4 & y~=4)
n4=n4+1;
if x+y==7
n7=n7+1;
end
end
if (y==4 & x~=4)
n4=n4+1;
if x+y==7
n7=n7+1;
end
end
if (x==4 & y==4)
n4=n4+1;
end
end

[(n7/n4)*11 (n7/n4)*6]

• John
• March 11th, 2010
• 21:12

It seems to me that simulation is pretty useless. If you read Heath’s page, it is quite obvious that he is not interpreting the problem the way you are setting up your simulation. It is also quite obvious that he would predict an answer of 2/11 for your simulation. So, how is it that you think you are actually resolving anything?
So, editing the program above, here would be a simulation for Heath’s interpretation (which I think is more in line the the wording of the original question).

n4=0;
n7=0;

for i=1:1000000
x=ceil(6*rand(1));
y=ceil(6*rand(1));
z=ceil(2*rand(1));
if (z==2 & y==4)
n4=n4+1;
if x+y==7
n7=n7+1;
end
end
if (z==1 & x==4)
n4=n4+1;
if x+y==7
n7=n7+1;
end
end
end

[(n7/n4)*11 (n7/n4)*6]

In this simulatin z randomly chooses which of the die the “one is a 4″ refers to. But since you randomly assigned the x and y to the dice in the first place, a simpler program would be just as appropriate (and will give the same answer)

n4=0;
n7=0;

for i=1:1000000
x=ceil(6*rand(1));
y=ceil(6*rand(1));
if x==4
n4=n4+1;
if x+y==7
n7=n7+1;
end
end
end

[(n7/n4)*11 (n7/n4)*6]

If the original problem had stated “at least one is a 4″ instead of “one is a 4″, your simulations would be accurate. But how can you possibly think that your simulation can prove that “One is a 4″ means “at least one is a 4″?? Could it be that you couldn’t figure out how to program anything else?

• Michi
• March 11th, 2010
• 22:40

And if you had read the comments on the post, you’d have seen Heath himself making that exact point.

Besides, it’s a 2 year old post – you expect me to still even remember the argument I was making back then?

• John
• March 13th, 2010
• 0:31

I did read the comments, and noticed that Heath made the same point. I also noticed that all comments after his (except possibly for your reply – I don’t know where Alon’s is) ignored him completely. Almost every comment after his seems dedicated to proving the 2/11 answer assuming at least one is a 4. So, my comment was more in response to those comments than to your original post. None of them realized they were wasting their time proving something that everyone already knows.
And it’s been 3 years, and still no one seemed to have been able to come up with a simulation for Heath’s interpretation.

• Michi
• March 13th, 2010
• 1:30

@John
It’s been three years. I wrote the comment as a throw-away little hack, inspired by a blog post I read. It’s not in an area I’m particularly specialized in, nor on a subject I’m particularly invested in.

Pulling it back up again – and especially with language like But how can you possibly think that your simulation can prove that “One is a 4? means “at least one is a 4??? Could it be that you couldn’t figure out how to program anything else? seems a little bit like digging for a dead horse in order to be able to whip it.

I appreciate that you sat down and produced a more accurate simulation; but it seems it might have thrived at least as well in a blog of your own with a link here, not to mention with less randomly thrown insults.

• john
• March 15th, 2010
• 19:50

Sorry, I apparently do not know proper blog etiquette, and apparently don’t quite get the point.