The Binary “Look-and-Say” Sequence
The look-and-say sequence (which I talked about here) is the sequence that you get by starting with the number 1 and constructing the next term in the sequence by “reading” the previous term. So 1 becomes “one one”, or 11. That becomes “two ones”, or 21. That becomes “one two, one one”, or 1211, and so on.
In this post, I am going to investigate the related binary version of the sequence, which starts off 1, 11 much like the regular sequence. But then when reading 11, we read it as “two ones”. Since two in binary is 10, the next term in the sequence is 101. When reading that term, we read it as “one one, one zero, one one”, so the next term is 111011. That term is read as “three ones, one zero, two ones”, and since three is 11 in binary and two is 10 in binary, the next term is 11110101, and so on. In this post we will answer two questions in particular about this sequence:
1) On average, how much longer is the (n+1)th term in the sequence than the nth term in the sequence?
2) On average, what is the ratio of the number of ones to the number of zeroes in the sequence?
Non-Interacting Subsequences
Much like the regular look-and-say sequence, we are able to study this sequence by constructing a “basis” of non-interacting subsequences that every term in the binary look-and-say sequence is made up of. Fortunately, constructing such a family of subsequences for the binary version of the look-and-say sequence is much simpler than it is for the decimal version of the sequence – here we only need ten different basic subsequences (whereas we needed 92 different subsequences for the regular look-and-say sequence!). These ten subsequences, and the subsequences they evolve into, are summarized in the following table.
# | Subsequence | Evolves Into |
---|---|---|
1 | 1 | (2) |
2 | 11 | (3)(1) |
3 | 10 | (5) |
4 | 110 | (3)(4) |
5 | 1110 | (6) |
6 | 11110 | (7)(4) |
7 | 100 | (9) |
8 | 1100 | (3)(8) |
9 | 11100 | (10) |
10 | 111100 | (7)(8) |
So for example, the first term in the sequence, 1, evolves into the subsequence (2), which is 11. That term then evolves into subsequence (3) followed by subsequence (1), or 101. That term then evolves into the subsequence (5) followed by the subsequence (2), or 111011, and so on. The reason that this representation of the sequence is useful is we can use it to describe the evolution of the binary look-and-say sequence entirely within a matrix T. In particular, we let T be the matrix with 1 in its (i,j) entry if the subsequence (i) appears in the evolution rule for subsequence (j), and 0 in its (i,j) entry otherwise:
Now if v is a 10-dimensional vector whose ith entry indicates how many times the subsequence (i) appears in a particular term of the binary look-and-say sequence, it follows that the entries of Tv tell us how many times each subsequence appears in the next term of the binary look-and-say sequence. So it follows from standard theory of linear homogeneous recurrence relations that we can now read off all of the long-term behaviour of the binary look-and-say sequence from the eigenvalues and eigenvectors of T.
Rate of Growth of the Sequence
The asymptotic rate of growth of the number of digits in the terms of the binary look-and-say sequence is simply the magnitude of the largest eigenvalue of the transition matrix T above. Using Maple it is simple to derive this value. If Ln is the number of digits in the nth term of the binary look-and-say sequence, then
This limit is approximately 1.465571, which means that the binary version of this sequence grows much faster than the decimal version of the sequence (recall that the growth rate of the number of digits of the regular look and say sequence is approximately 1.303577). This limit is also the unique real root of the cubic x3 – x2 – 1, which follows from the fact that the characteristic polynomial of T is
Ratio of Number of Ones to Zeroes
If we let Nn denote the number of ones in the nth term of the binary look-and-say sequence, and if we let Zn denote the number of zeroes in the nth term of the sequence, what is
In other words, what is the average ratio of ones to zeroes in this sequence? The following table shows the value of Nn/Zn for n = 3, 4, …, 25, which might give some intuition to the problem:
n | Nn/Zn |
---|---|
3 | 2.000 |
4 | 5.000 |
5 | 3.000 |
6 | 2.000 |
7 | 2.000 |
8 | 2.000 |
9 | 1.786 |
10 | 1.762 |
11 | 1.742 |
12 | 1.717 |
13 | 1.691 |
14 | 1.690 |
15 | 1.680 |
16 | 1.676 |
17 | 1.672 |
18 | 1.671 |
19 | 1.669 |
20 | 1.668 |
21 | 1.667 |
22 | 1.667 |
23 | 1.666 |
24 | 1.666 |
25 | 1.666 |
Based on numerical estimates like those given in the table above, it has been conjectured that the limiting ratio is 5/3 (or some nearby value). We will now show that the limit does indeed exist, but its value is not 5/3 — it just happens to be really close to 5/3.
Much like the maximal eigenvalue of T tells us the overall growth rate of the sequence, the corresponding eigenvector tells us the distribution of the different subsequences that are present in the limit. Once we know the distribution of the individual subsequences, it is not difficult to find out the overall ratio of ones to zeroes by weighing the different subsequences appropriately. So our first step is to find the eigenvector corresponding to the maximal eigenvalue. To this end, it will be convenient to let
α is the same as in the previous section, and β is exactly the growth rate limit that we computed. Then the eigenvector corresponding to the maximal eigenvalue of T is:
What this means is that, in the limit, the fifth subsequence, 1110, is β times as frequently-occurring as the sixth subsequence, 11110 (for example). Now we just weigh each subsequence according to how many zeros and ones they contain, and we find the limiting ratio of ones to zeroes is
In particular, this ratio does not equal 5/3, but rather its decimal expansion begins 1.6657272222676… (which is less than 1/1000 away from 5/3).
Edit (September 14, 2022): Standupmaths has made a nice video about this same topic here.
Recent Comments