A Derivation of Conway’s Degree-71 “Look-and-Say” Polynomial

32 Replies

The look-and-say sequence is the sequence of numbers 1, 11, 21, 1211, 111221, 312211, …, in which each term is constructed by “reading” the previous term in the sequence. For example, the term 1 is read as “one 1”, which becomes the next term: 11. Then 11 is read as “two ones”, which becomes the next term: 21, and so on.

The remarkable thing about this sequence is that even though it seems at first glance to be quite arbitrary and non-mathematical, it has some interesting properties that were unearthed by John Conway. Most notably, he showed that the number of digits in each term of the sequence on average grows by about 30% from one term to the next. A bit more specifically, he showed that if L_n is the number of digits in the n^th term in the sequence, then

where λ is the unique positive real root of the following degree-71 polynomial:

In order to demystify this seemingly bizarre fact, in this post we will show where this polynomial comes from and prove that the above limit does indeed equal its largest root (which happens to be its one and only positive real root).

The Cosmological Theorem

What lets us formally study the look-and-say sequence is a rather ominous-sounding result known as the cosmological theorem, which says that the eighth term and every term after it in the sequence is made up of one or more of 92 “basic” non-interacting subsequences. These 92 basic subsequences are summarized in lexicographical order in the following table. The fourth column in the table says what other subsequence(s) the given subsequence evolves into. For example, the first subsequence, 1112, evolves into the 63rd subsequence: 3112. Similarly, the second subsequence, 1112133, evolves into the 64th subsequence followed by the 62nd subsequence: 31121123.

#	Subsequence	Length	Evolves Into
1	1112	4	(63)
2	1112133	7	(64)(62)
3	111213322112	12	(65)
4	111213322113	12	(66)
5	1113	4	(68)
6	11131	5	(69)
7	111311222112	12	(84)(55)
8	111312	6	(70)
9	11131221	8	(71)
10	1113122112	10	(76)
11	1113122113	10	(77)
12	11131221131112	14	(82)
13	111312211312	12	(78)
14	11131221131211	14	(79)
15	111312211312113211	18	(80)
16	111312211312113221133211322112211213322112	42	(81)(29)(91)
17	111312211312113221133211322112211213322113	42	(81)(29)(90)
18	11131221131211322113322112	26	(81)(30)
19	11131221133112	14	(75)(29)(92)
20	1113122113322113111221131221	28	(75)(32)
21	11131221222112	14	(72)
22	111312212221121123222112	24	(73)
23	111312212221121123222113	24	(74)
24	11132	5	(83)
25	1113222	7	(86)
26	1113222112	10	(87)
27	1113222113	10	(88)
28	11133112	8	(89)(92)
29	12	2	(1)
30	123222112	9	(3)
31	123222113	9	(4)
32	12322211331222113112211	23	(2)(61)(29)(85)
33	13	2	(5)
34	131112	6	(28)
35	13112221133211322112211213322112	32	(24)(33)(61)(29)(91)
36	13112221133211322112211213322113	32	(24)(33)(61)(29)(90)
37	13122112	8	(7)
38	132	3	(8)
39	13211	5	(9)
40	132112	6	(10)
41	1321122112	10	(21)
42	132112211213322112	18	(22)
43	132112211213322113	18	(23)
44	132113	6	(11)
45	1321131112	10	(19)
46	13211312	8	(12)
47	1321132	7	(13)
48	13211321	8	(14)
49	132113212221	12	(15)
50	13211321222113222112	20	(18)
51	1321132122211322212221121123222112	34	(16)
52	1321132122211322212221121123222113	34	(17)
53	13211322211312113211	20	(20)
54	1321133112	10	(6)(61)(29)(92)
55	1322112	7	(26)
56	1322113	7	(27)
57	13221133112	11	(25)(29)(92)
58	1322113312211	13	(25)(29)(67)
59	132211331222113112211	21	(25)(29)(85)
60	13221133122211332	17	(25)(29)(68)(61)(29)(89)
61	22	2	(61)
62	3	1	(33)
63	3112	4	(40)
64	3112112	7	(41)
65	31121123222112	14	(42)
66	31121123222113	14	(43)
67	3112221	7	(38)(39)
68	3113	4	(44)
69	311311	6	(48)
70	31131112	8	(54)
71	3113112211	10	(49)
72	3113112211322112	16	(50)
73	3113112211322112211213322112	28	(51)
74	3113112211322112211213322113	28	(52)
75	311311222	9	(47)(38)
76	311311222112	12	(47)(55)
77	311311222113	12	(47)(56)
78	3113112221131112	16	(47)(57)
79	311311222113111221	18	(47)(58)
80	311311222113111221131221	24	(47)(59)
81	31131122211311122113222	23	(47)(60)
82	3113112221133112	16	(47)(33)(61)(29)(92)
83	311312	6	(45)
84	31132	5	(46)
85	311322113212221	15	(53)
86	311332	6	(38)(29)(89)
87	3113322112	10	(38)(30)
88	3113322113	10	(38)(31)
89	312	3	(34)
90	312211322212221121123222113	27	(36)
91	312211322212221121123222112	27	(35)
92	32112	5	(37)

The important thing about this particular basis of subsequences is that the evolution of any sequence made up of these subsequences is determined entirely by the evolution rule for the subsequences given in the final column of the above table. For example, the eighth term in the look-and-say sequence is 1113213211 = (24)(39). The subsequence (24) evolves into (83) and the subsequence (39) evolves into (9), so the ninth term in the look-and-say sequence is (83)(9), which is 31131211131221.

Computing the Number of Digits in Sequences

Since the evolution of every term in the look-and-say sequence after the eighth can be computed using the table above, we can easily compute the length of every term after the eighth as well. For example, the eighth term in the sequence evolves into (83)(9), so the number of digits of the ninth term in the sequence is 6 + 8 = 14. The subsequence (83) evolves into a subsequence with 10 digits, and (9) evolves into a subsequence with 10 digits, so the tenth term in the look-and-say sequence has 10 + 10 = 20 digits.

All of the information about how the lengths of the 92 subsequences change can be represented in a 92×92 matrix T. In particular, the matrix T has its (i,j) entry equal to C_ij × ℓ_i/ℓ_j, where C_ij is the number of times subsequence (i) appears in the evolution rule for subsequence (j) and ℓ_i is the length of subsequence (i). This matrix is represented in the following image – white squares represent zero entries in the matrix, and black squares represent the number 2, which is the largest value present in the matrix. Shades of grey represent non-zero numbers, with larger numbers being darker.

Then if we represent a term in the look-and-say sequence as a vector v with its i^th entry being c_i × ℓ_i, where c_i is the number of times the subsequence (i) appears in that term, we find that the sum of the entries in v is the total length of that term of the look-and-say sequence. More important, however, is the fact that the sum of the entries in Tv is the length of the next term in the look-and-say sequence. The sum of the entries in T²v is the length of the next term in the look-and-say sequence, and so on. So we have found a degree-92 recurrence relation for the length of terms in the look-and-say sequence, and the corresponding transition matrix is T.

Computing the Limit

It is a basic fact of linear homogeneous recurrence relations that a closed-form solution to the recurrence relation can be written down in terms of the eigenvalues of the transition matrix (see the linked Wikipedia page for specifics). As a corollary of this, the limiting ratio of terms in the sequence is equal to the spectral radius of the transition matrix. Fortunately, the transition matrix in this case is quite sparse, so its characteristic polynomial isn’t too difficult to compute:

Indeed, the degree-71 polynomial that λ is a root of is one of the factors of the characteristic polynomial of the transition matrix T. All that remains to do is to get MATLAB to compute the largest root of that polynomial (i.e., the spectral radius of T):

>> max(abs(eig(T)))

ans =
    1.303577269034287

The matrix T is attached below for those who would like to play with it. Something fun to think about: what do the rational eigenvalues (-1, 0, and 1) of T represent?

Download: Transition matrix [plaintext file]

Update [March 17, 2013]: Entry 91 of the subsequence table has been corrected – thanks to Marcus Stuhr and liuguangxi for the correction.

32 thoughts on “A Derivation of Conway’s Degree-71 “Look-and-Say” Polynomial”

WebsterLin November 1, 2010 at 9:05 pm

That polynomial is a thing of beauty! I have often seen the sequence, 1, 11, 21, 1211, 111221, 312211, used as a fun math puzzle, but I have never seen it examined with such depth. I also did not know that it was called “look-and-say” sequence. Is the Conway mentioned in your title John Horton Conway of Conway’s Game of Life fame?

Reply ↓
Nathaniel Post authorNovember 1, 2010 at 10:23 pm

@WebsterLin – That is indeed the same Conway. Quite a guy 🙂

Reply ↓
Pingback: Matrix67: My Blog » Blog Archive » Conway常数是怎么得来的？
Pingback: Look And Say, Revisited « Programming Praxis
Pingback: Look-and-say y la constante de Conway | Gaussianos
andy July 4, 2011 at 12:43 am

I guess the max root 1.3035… is an irrational number? (afaik MATLAB cuts off the number after the 15th decimal by default, thus suggesting the number to be rational)

Reply ↓
Nathaniel Post authorJuly 4, 2011 at 1:13 am

@andy – Yes, it’s irrational. The rational root theorem says that the only possible rational roots of that degree-71 polynomial are ±1, ±2, ±3, and ±6. Plugging those values into the polynomial reveals that none of them actually are roots, so it has no rational roots.

Reply ↓
epiclolz March 16, 2013 at 9:08 pm

to anyone who may want to use this page for any reason… there is exactly one error in the big table

Reply ↓
Marcus Stuhr March 16, 2013 at 10:19 pm

Entry 91 is incorrect

Reply ↓
liuguangxi March 17, 2013 at 12:48 am

Entry 91 should be 312211322212221121123222112

Reply ↓
Anders March 17, 2013 at 9:17 am

What reason would that be? 😀

Thanks for the blog post and thanks for the correction!

Reply ↓
Mike March 19, 2013 at 7:52 am

Correction to 91 has messed up the ordering. If (and only if!) you wish to restore the lexical ordering, the atoms at entries 90 & 91 should be swapped. Their descendants are respectively 35 and 36, and the descendants of 16 17 35 and 36 would also then need amending to respectively: 80 28 89, 80 28 90, 23 32 60 28 89, 23 32 60 28 90 (omitting the brackets). I’d been using the Conway ordering, so it’s worth checking whether I’ve amended my indexes correctly.

Reply ↓
jpatou March 20, 2013 at 5:10 am

I tried to compute this in Python starting from the 8th term and using a transition matrix at power n-8. My results arre correct until the 27th term and the 28th is false. What do you think can be wrong ?

Reply ↓
jpatou March 20, 2013 at 9:37 am

Sorry for that. I just corrected my mistake. I was thinking that there were only 1 in the transition matrix. I missed that (29) appears twice in the evolution of (60).

Reply ↓
Amic March 21, 2013 at 10:17 pm

Here is a thought the signification of the rational eigenvalues.

Taking a vector of α(90)-α(91)+β(35)-β(36), thinking of it as a real quantities of elements 90,91,35 and 36, it becomes after applying the sequence α(36)-α(35)+β(91)-β(90). So with β=α, we get an eigenvector for the value -1.

Of course (61) is an eigenvector for the value 1…

It requires more work to get an eigenvector for the value 0, or to get the dimension of the eigenspaces.

Reply ↓
Amic March 21, 2013 at 10:23 pm

It also works with β=-α to get an eigenvector for the eigenvalue 1…

Reply ↓
Anders March 23, 2013 at 3:16 am

Empirically it is clear that the growth rate tends to the spectral radius, but how can this be rigorously proven?

Equivalently, how can it be shown that the vector corresponding to the eighth element has a component in the direction of the largest eigenvalue in the Jordan basis? (Remember, it’s not orthogonal in general.)

Reply ↓
Andrew May 29, 2013 at 5:18 am

How were the elements derived? It seems like a rather difficult task..

Reply ↓
Nathaniel Post authorMay 29, 2013 at 8:58 pm

@Andrew – Computer search carried out by Conway. The original code that carried out the search is long gone, but Ekhad and Zeilberger independently re-did the search much later, and their code is available here.

Reply ↓
Pingback: Growth of recursive string substitution | Complex Projective 4-Space
Neil April 6, 2015 at 8:52 am

Where do I find the irrational number of Conway’s Constant? I’ve only been able to find the decimal representation and the polynomial that has it. Has anyone gotten the exact closed form representation for the constant?

Reply ↓
Nathaniel Post authorApril 6, 2015 at 9:19 am

@Neil – There almost definitely *isn’t* an exact closed form representation for it. Roots of polynomials of degree 5 and higher often can’t be expressed exactly using things like plus, minus, multiplication, and roots (this is the Abel-Ruffini theorem), so it would be quite a surprise if this root of a degree-71 polynomial could be.

Reply ↓
Neil April 6, 2015 at 4:57 pm

Oh ok. Thank you Nathaniel.

Reply ↓
Gail Sacatani August 25, 2015 at 8:21 am

What are real life applications of this sequence?

Reply ↓
Planchon August 25, 2015 at 8:48 am

Bonjour ,
je préfère la suite SW ( Say and Write de Planchon)
neuf(9) fois neuf(9) = 81
999 999 999 =81
et 1/81 = 0.012346790123…. période 9

puis 2/81,3/81 etc …
ou
trois(3) fois six(6) = 18
666 = 18
etc a new multiplication table !!!!!
see my blog and leave comments :
http://flechedutemps.blogspot.fr/
The univers is numbers.
Many thanks

Reply ↓
Josh Chapman February 13, 2017 at 7:35 pm

Great article. Anyways, I assume the value 1 represents that if the sequence starts with “22” instead of “1”, then the ratio of the length of each term is 1 instead of 1.303,

Reply ↓
Max May 18, 2017 at 9:06 am

According to oeis.org/A014715, the last decimals of the above value from Matlab are not correct. Following the links, there is further information available.

Reply ↓
Andy Latto April 26, 2020 at 6:15 pm

How is it shown that the successor of a string consisting of a sequence of these atomic strings is itself a sequence of atomic strings?

Denote the seeit-sayit transformation by s, so s(1221) = 112211, etc.

The basic fact that lets pieces of a string evolve independently is that if S(A) = B, S(C) = D, then S(AC) = BD just if the last digit of A is different from the first digit of B. But how do you show that this never happens when starting from a sequence of the 92 atoms? There are atoms that end in 1, and others that start with 1; how do you show that you will never get those adjacent to each other when applying s to a sequence of atoms?

Reply ↓
Nathaniel Post authorApril 29, 2020 at 12:58 pm

@Andy Latto – By brute force computation. You can just track the paths of all 92 subsequences and see that you never end up getting a sequence that ends in 1 followed by a sequence that begins with 1 (for example). This is why where are 4 subsequences starting with “132211331”. That substring alone isn’t included in the list of 92 because how it evolves depends on what digits come after it.

Reply ↓
Pingback: Why is the Conway ‘Look and Say’ sequences constant defined by this polynom? – Math Solution
Stuart McFarlane March 11, 2022 at 5:59 am

One one was a race horse
Two two was one too
One one won one race
Two to won one too

Reply ↓
Clément March 11, 2023 at 12:25 pm

Great article! I just don’t understand why the limiting ratio of the terms of the sequence is necessarily the largest eigenvalue of the transition matrix (even if it seems obvious empirically). I’ve looked at the linked Wikipedia page but can’t seem to find this information on it.

Reply ↓

Nathaniel Johnston

Quantum information theory, cellular automata, and recreational mathematics

A Derivation of Conway’s Degree-71 “Look-and-Say” Polynomial

The Cosmological Theorem

Computing the Number of Digits in Sequences

Computing the Limit

32 thoughts on “A Derivation of Conway’s Degree-71 “Look-and-Say” Polynomial”

Leave a Reply Cancel reply