The Monkey - Shakespeare Probability

The probability that a solitary, immortal monkey, typing randomly on a standard, unbreakable keyboard, would eventually produce one of Shakespeare's plays is calculated. The question of what age of the universe, if any, would be large enough to give the monkey a 50% chance of success, is investigated.


1.1 History
    The monkey problem is believed to have originated with a statement attributed to Huxley:
    
    "Six monkeys, set to strum unintelligently on typewriters for millions of years, would be bound in time to write all the books in the British Museum."

and is quoted by J.Jeans in Mysterious Universe, Cambridge University Press, 1930, p.4.  As Jeans was an accomplished physicist, it is surprising he does not either prove or disprove the claim. Perhaps because Huxley did not specify exactly how many millions of years he was thinking of, Jeans let the remark go unchallenged.

1.2 Approximations
    The exact age of the universe one finds in the literature tends to vary as more accurate astronomical data is obtained by scientists.The value we will use is that currently given by NASA. The number of letters in a Shakespearean play is also an approximation. However, only order of magnitude accuracy is required to perform the calculation, so the approximations are reasonable.

1. Age of Universe = 13.7 billion years = 10^17s   (Source http://map.gsfc.nasa.gov/m_uni/uni_101age.html )
2. Number of chars/letters in a Shakespeare Play = 10^5  (Rough approximation)
3. The keyboard contains 44 keys (Again, this varies, depending on the keyboard)
4. Ignore upper and lower case distinctions
5. The monkey types at a rate of one char per second

2.1 Probability Of A Given Sequence Of Chars
    Let the monkey type 10^5 characters.We would like to know that probability that the monkey produced a Shakespeare play. The calculation is a simple and routine application of the Product Rule:

    Suppose event E can occur in m ways,and independent of this event, event
    F can occur in n ways. Then, combinations of E and F can occur in mn ways.


Applying the product rule,we can immediately conclude: The probability that any given sequence of 10^5 characters typed at random
will be equal to a given Shakespearean play is (1/44)^10000 = 10^(-164345)

Proof:
    P(sequence) =( #way of ways to make sequence)/(#of all possible sequences)

To calculate the number of possible sequences of characters, we note we have 44 choices for the 1st char, 44 choices for the second, ..., 44
for the 10^5 and last character. So, applying the product rule, the total number of such sequences is:

            44 * 44 * ... * 44    (10^5 such terms)  =  44^10^5 = 44^100000
            
Similarly, calculating the number of ways to make the given play, we have only 1 choice (ignoring upper and lower case distinctions) for the �1st
character, 1 choice for the second ,..., 1 choice for the 10^5 and last term. Applying the product rule, we have 1^10^5 = 1 way to make the play, and thus

    P(sequence)    =  1/44^100000
                
and use log_1044 = 1.64345 to get  log_10P = (-100000) * 1.64345 so that P(sequence) = 10^(-164345) as claimed.

2.2 Bernoulli Trials
    If an event or experiment can only have two outcomes (success and failure), then independent repeated trials of that event are called Bernoulli trials. A standard example of a Bernoulli trial is the tossing of a coin. The key approximation in our approach to the Monkey-Shakespeare
problem is that every 10^5 characters typed by the monkey will be considered to be a trial.That is,we let the monkey type 10^5 characters (the assumed number of characters in the Shakespeare play) and then ask if the given Shakespearean play was, in fact, produced (success) or not (failure). By approximating the problem as a sequence of Bernoulli trials, we can apply well known and standard results of elementary probability theory to get a heuristic answer to our basic question.
    First, we review the basic facts of Bernoulli trials. Denote the probability of success of a Bernoulli trial by p, and the probability of failure as q, with p + q = 1. Then, we have the following result for the probability of k successes in n trials.        

P (k successes) = ({{n}, {k}}) p^k q^(n - k)

and also

P(1 or more success) =1 - q^n

Example: A coin is tossed 20 times, what is the probability of getting 19 heads?

We have n = 20, k = 19, p = q = 1/2, so

P (19 heads) = ({{20}, {19}}) * (1/2)^19 * (1/2)^1 = 1.907 3 * 10^(-5)

3.1 Calculation of Monkey - Shakespeare Probability
    Now, we can proceed with the calculation.
The number of trials n will be:

    n = (Age of Universe )/(Time For One Trial) =   ( 10^17 )/10^5 = 10^12
    
Also, by Bernoulli we know that P(1 or more Shakespeares) = 1 - q^n
          = 1 - (1 - p)^n
          = 1 - (1 - np + neglible terms)        (neglible for p <<1, which is true here)
          = np
      
Plugging in our values for n and p, we finally get
      
          P(1 or more Shakespeare plays) = 10^12 * 10^(-164345)= 10^(-164333)
          
In other words, even if the monkey had been typing for as long as the universe has been in existence, the probability that it will type a given play of Shakespeares is of the order 10^(-164333).
      
3.2 Age of Universe Required for 50% Chance of Monkey Success
      Setting np = 1/2,we get  n = 1/(2p), so that the number of trials n to give the monkey a 50% chance of success is
      
      n_ (50%) = .5 *  10^164345 = 5 *  10^164344
      
  and so (Age of Universe) _ (50%) = 10^5 * 5 *  10^164344 = 5 * 10^1643349 s


Thus we see that the required age of the universe for a reasonable chance of monkey success is a surprisingly huge number relative to the known age 10^17s  of the universe.

In reality, in this calculation we have cheated a little, because one would not normally model the monkey typing by a series of  Bernoulli trials. It would be possible, for example, for the monkey to have successfully reproduced the play after 10^5 + 1 chars  (i.e., 1 trial plus the first char from the next trial) have been typed.  However, approximating by a series of Bernoulli trials does have the advantage of using well known and elementary counting techinques to make obvious the rather absurd nature of Huxley's original statement.


Created by Mathematica  (August 14, 2007) Valid XHTML 1.1!