The monkey typewriter

by Mithrandir

I bet all of you know the now famous theorem of the infinite monkeys. Starting from this idea someone devised the following problem:

Suppose there is a monkey sitting in front of a terminal consisting of  n keys. The monkey will press any of those keys at random and, when all of them will be pressed, she will receive a banana as a payment for her efforts. We are mainly interested in how long will it take for her to obtain that banana in the following two cases:

  • when a key is pressed it remains pressed
  • when a key is pressed for the second time it will behave as though it was not pressed at all.

Practically, we are interested to see what is the average number of keystrokes for the specific number n of keys.

This is what we’ll be doing here. For the first part of our instalment we will use Numerical Methods to deal with the exact mathematical equation for the average number of keystrokes needed.

The second instalment of this problem will tackle with a Monte-Carlo approach to this problem and (maybe) with reducing the computational cost of Monte-Carlo methods, the main reason behind choosing this problem.

So, we begin. We won’t go into details about Markov Chains and how one can use the transition matrix of such a chain to determine the average number of jumps in the chain before arriving at an absorbing state. We will only publish the main concepts behind, no mathematics for now. I think this function (a simple Octave/Matlab script) is pretty self-explanatory.

function t = Markov(P, na)
    % computes the average times (as a row vector t) needed
    % to reach one state from the na absorbing ones
    % existing in a Markov chain, given by the matrix P - the
    % transition matrix
    n = size(P)(1);    % size of the chain
    Q = P(1: (n - na), 1: (n - na)); % the transient part of P
    N = inv(eye(n - na) - Q); % the expectation matrix
    t = N * ones(n - na, 1); % the average number of jumps
endfunction

Now, we only have to build those P matrices for those two cases. Come to think of it, this is also an easy part after observing that we could define our state by counting the number of keys which are pressed (thus, the only absorbing state would be the one where all n keys are pressed). The following Octave/Matlab snippet builds the P matrix for the first case.

v = (0:n) / n;    % auxiliary vector
P = diag(v) + diag(1 - v(1:n), 1); % the transition matrix

For the second case we have the following, full script this time (including the call and the return value).

function t = f2(n)
    % computes the average number of random keystrokes
    % (from n available keys) needed to fill the table
    % in the second case
    v = (0:n) / n;    % auxiliary vector
    P = diag(v(2: (n + 1)), -1) + diag(1 - v(1:n), 1);
    t = Markov(P, 1); % we have only one absorbant state
    t = t(1);
endfunction

Looking at the results, the first case is a lucky one for the monkey: the average number of keystrokes is in O(1),  as it is shown by this plot. Unfortunately, the second case has an exponential growth (as seen from this plot, taken at a logarithmic scale).

As of now, we have a good view on the complexity of the problem, which  is one of the reasons which just makes it perfect for being used in Monte Carlo simulation studies along with the fact that it is not a trivial problem like the one of numerical integration, instead being a problem with different degrees of complexity.

Thank you for your patience, we will come again soon with the second part.

About these ads