Heath Raftery writes about weird probabilities in dice discussions, a problem very much reminiscent of the Monty Hall problem, both in the amount of controversy it generates when people discuss it and, it seems for at least half the people in the discussion so far, in how much the terms used end up confusing people.
So, in a comment to the Carnival post, Alex suggested that a simulation be used to settle the whole question. And since I just wrote some things with PFP, I might as well write another!
The question, as posed by Heath, is
Two dice are rolled simultaneously. Given that one die shows a "4", what is the probability that the total on the uppermost faces of the two dice is "7"?
so in order to settle it, I'm going to let my computer roll a whole lot of dice. Whenever at least one die shows a 4, I'm going to tally the roll, and whenever, in addition, the sum is 7, I'm going to tally it as a successful roll. The quotient of these two tallies will then estimate the probability.
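Before simulating, it's worth noting that under this reading of the question, the answer can also be enumerated exactly: of the 36 equally likely outcomes of two dice, 11 show at least one 4, and of those, just 2 sum to 7. A quick counting sketch in plain Haskell, independent of PFP:

```haskell
-- All 36 equally likely outcomes of rolling two dice.
outcomes :: [(Int, Int)]
outcomes = [ (a, b) | a <- [1..6], b <- [1..6] ]

-- Outcomes in which at least one die shows a 4: 11 of them
-- (6 with a == 4, plus 6 with b == 4, minus the double-counted (4,4)).
atLeastOneFour :: [(Int, Int)]
atLeastOneFour = [ (a, b) | (a, b) <- outcomes, a == 4 || b == 4 ]

-- Of those, the outcomes summing to 7: just (3,4) and (4,3).
alsoSumSeven :: [(Int, Int)]
alsoSumSeven = [ (a, b) | (a, b) <- atLeastOneFour, a + b == 7 ]

main :: IO ()
main = print (length alsoSumSeven, length atLeastOneFour)  -- prints (2,11)
```

which gives 2/11, about 0.1818, so that is the number the simulation should converge to if this model is the right one.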
I end up with the following code
    import Probability
    import Control.Monad

    -- A die roll is a uniform random number in [1..6]
    die = uniform [1..6::Int]

    -- Two dice is just a pair of independently rolled dice
    twoDice = prod die die

    tallyOne :: IO (Int, Int)
    tallyOne = do
      (d1, d2) <- pick twoDice
      let (a,b) | d1 == 4 || d2 == 4 = if d1+d2 == 7 then (1,1) else (1,0)
                | otherwise          = (0,0)
      return (a,b)

    sumPair (a,b) (c,d) = (a+c, b+d)

    simulate n = (fmap (foldr sumPair (0,0)) . sequence . take n . repeat) tallyOne

    simulation n = fmap (\(x,y) -> fromIntegral y / fromIntegral x) (simulate n)
which pretty much lets me enter a number of tests to perform, and returns the resulting quotient of tallied successes by tallied valid rolls.
While playing around with it, I discover that at about 100 000 rolls the execution outgrows the stack, so I partition the simulation into pieces of 50 000 rolls each. This yields the following invocation
    *DiceSim> (fmap (unlines . map show) $ sequence $ take 10 $ repeat $ simulation 50000) >>= putStrLn
which then, after several minutes of execution time, gives a simulated dataset, namely
    0.176424464192368
    0.1825110045332107
    0.18224936651289714
    0.18589282208990382
    0.18272166287755764
    0.18133159268929505
    0.18697201689545934
    0.17798213004630536
    0.1818062827225131
    0.18456002616944717
Plugging these numbers into OpenOffice, I retrieve a mean of 0.1822451368729 and a standard deviation of 0.0032384171164.
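For the record, the same summary statistics can be computed without leaving Haskell. A minimal sketch, where `mean` and `stdDev` are helpers I'm defining here for illustration (`stdDev` is the sample standard deviation, dividing by n - 1, which is what OpenOffice's STDEV computes):

```haskell
-- Arithmetic mean of a list of samples.
mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

-- Sample standard deviation: divides by n - 1, matching spreadsheet STDEV.
stdDev :: [Double] -> Double
stdDev xs = sqrt (sum [ (x - m) ^ (2 :: Int) | x <- xs ] / fromIntegral (length xs - 1))
  where m = mean xs

main :: IO ()
main = do
  let results = [ 0.176424464192368,   0.1825110045332107
                , 0.18224936651289714, 0.18589282208990382
                , 0.18272166287755764, 0.18133159268929505
                , 0.18697201689545934, 0.17798213004630536
                , 0.1818062827225131,  0.18456002616944717 ]
  print (mean results)    -- about 0.18224513687
  print (stdDev results)
```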
Thus, in the end, my bet goes with Heath's teacher: 2/11, and for about the same reasons as in the Monty Hall problem. As an invitation to Heath: if you disagree with my model, please propose code or pseudocode for the model as you interpret it, so that we can simulate and test that too!