Well, hello there internet people.

For my entry to the Summer of Math Exposition I'm going to look at some sequences from the OIES. They're all sequences that I found independently, and then confirmed afterwards that the OEIS already had them (for more than 20 years, in some cases). It was nice to see that I was looking at things others had already found. That seemed like a good indicator that I hadn't gone too badly astray while exploring. I'm aiming to give an example of trying stuff, learning things along the way, and seeing where those discoveries lead, and the thinking that guided me towards what I tried next. I seem to have gone a reasonably long way. Am I within shouting range of a proof? Mayyybe. A friend of mine, who has a maths doctorate, was not persuaded. So probably not. But I wanted to talk about the chain of reasoning I followed, even if it fizzled out before the very end. And for this entry I did conscientiously limit myself to just the summer of the competition, and there's more things I want to try that I just haven't had time to do. (That includes making the tree diagrams in this entry bigger and clearer, which has proved really quite tricky. Sorry about that.) As I am looking at a notorious problem, it's not surprising one summer wasn't enough.

The first sequence I'm going to mention is A354236. This is a square array, which means it continues to infinity in both directions.

R_{0} | R_{1} | R_{2} | R_{3} | R_{4} | R_{5} | R_{6} | R_{7} | R_{8} | R_{9} | |
---|---|---|---|---|---|---|---|---|---|---|

1 | 5 | 3 | 17 | 11 | 7 | 9 | 25 | 33 | 43 | … |

2 | 10 | 6 | 34 | 22 | 14 | 18 | 49 | 65 | 86 | … |

4 | 20 | 12 | 35 | 23 | 15 | 19 | 50 | 66 | 87 | … |

8 | 21 | 13 | 68 | 44 | 28 | 36 | 51 | 67 | 89 | … |

16 | 40 | 24 | 69 | 45 | 29 | 37 | 98 | 130 | 172 | … |

32 | 42 | 26 | 70 | 46 | 30 | 38 | 99 | 131 | 173 | … |

64 | 80 | 48 | 75 | 88 | 56 | 72 | 100 | 132 | 174 | … |

128 | 84 | 52 | 136 | 90 | 58 | 74 | 101 | 133 | 177 | … |

256 | 85 | 53 | 138 | 92 | 60 | 76 | 102 | 134 | 178 | … |

512 | 160 | 96 | 140 | 93 | 61 | 77 | 196 | 260 | 179 | … |

… | … | … | … | … | … | … | … | … | … |

Column R_{0} is just the powers of two. In the other columns, for
every number that appears you will also see double that number, but new odd
numbers also start appearing, and then consecutive runs of numbers. What's going
on?

All right, I won't leave you hanging. This sequence is related to the Collatz conjecture. But it doesn't obviously resemble the Collatz tree that most people are used to seeing. How is this sequence generated, and what uses might it have? Before diving in I will give a quick recap of the Collatz problem:

Take any positive integer. If it's even, halve it. If it's odd, multiply by three and add one. Repeat indefinitely.

That's the Collatz procedure. The Collatz conjecture states that if you
follow the Collatz procedure on any positive integer, they all eventually reach
the number one. This has been checked up to numbers past 10^{20}, and no
exceptions have been found, but that still doesn't count as a proof.

I was inspired to look into this after seeing Professor David Eisenbud's Numberphile video on the Collatz Conjecture about eight years ago. (A couple of ad-stripped links to YouTube here: invidious.perennialte.ch and yewtu.be.) About halfway through the video, Brady says, "There does seem to be a very striking feature of it, and this is this line down the middle here." Professor Eisenbud replies, "That's right, this is like the third rail. Once you touch that line you fall right away to one."

The 'striking feature' Brady was talking about is the powers of two, column
R_{0} in table i, and the fact that they're like
a live rail got me thinking. If the third rail is live, then anything connected
to that rail is also live, and so is everything connected to that, and so on,
out to infinity. What are the members of the other rails? So I independently
came up with the table above, and then found it on the OEIS.

How do you generate the table then?

First, start with column R_{0}, the powers of two. ('R' stands for
'Rail'.) We can fill those in ahead of time. Then check each entry in turn,
asking how it can be reached, and collate those numbers. If you find a number
you already know about, skip it.

How do you get to the number 1? There's only one way – you have to descend from the number 2. Nothing new yet. How do you get to 2? Descend from 4. Still nothing new. How do you get to 4? Well, there are two ways. You can descend from 8, and you can also multiply 1 by 3 and add 1, and that gives you 4 as well. But we've seen 1 already, so still nothing new has shown up.

How do you get to 8? Descend from 16.

How do you get to 16? Descend from 32, yes, but also, 5 × 3 + 1 = 16, so 5 is
the first member of R_{1}.

And if 5 is a member of rail one, then so are 10 and 20 and 40 and 80 and 160 and so on, to infinity.

But rail R_{0} isn't finished yet, so continue.

How do you get to 32? Descend from 64.

How do you get to 64? Descend from 128, but also, 21 × 3 + 1 = 64, so there's
the second odd member of rail one; it's 21. And if 21 is a member of
R_{1}, so are 42 and 84 and 168 and so on. Keep going.

How do you get to 128? Descend from 256.

How do you get to 256? Descend from 512, but also, 85 × 3 + 1 = 256, so
there's the third odd member of R_{1}. And if 85 is a member of
R_{1}, so are 170 and 340 and so on.

So there are infinitely many members of R_{1}, just as there were
infinitely many members of R_{0}. R_{1} consists of all the
numbers which are one tripling operation from reaching the 'live rail' and
terminating.

What about column R_{2}?

For R_{2}, you just repeat the same algorithm on each of the members
of R_{1}. This gives the list of all numbers which are two tripling
operations away from reaching the 'live rail'.

For R_{3}, repeat the same algorithm on each of the members of
R_{2}.

And so on to infinity.

Sort all of these numbers on each rail into ascending numerical order, and you get table i above.

So we now have a way to organise the members of the Collatz sequence into
rails. Even though they're different sizes, all the members of a given rail
behave the same. They're all the same number of tripling operations away from
R_{0}, so they are in some sense related, and if you look at them in
binary you will see that their binary structures have features in common. So it
seems helpful to group them together. We also know that if we pick some number
*n* on rail R_{k}, then every time we perform a tripling operation,
we transfer to rail R_{k−1}. That means we have a full list of all the
members of the Collatz tree that don't loop and don't go to infinity. All of
these numbers, by construction, will eventually reach rail R_{0}.

And once you've left rail R_{k}, you can never get back to it. You
have to drop all the way to R_{0}, and only then might you be able to
loop back. So if you could prove that every number eventually shows up on one of
these rails, you would prove the Collatz conjecture. Do these rails really cover
all numbers? They look like they do, every number anyone has ever tested shows
up on a rail, and there's no clear reason they shouldn't, so perhaps somewhere
in here there's a proof by induction waiting to be found.

By the way, OEIS sequence A092893 lists the lowest numbers which are members of each rail, although it doesn't talk about 'rails' because that's new terminology inspired by Professor Eisenbud. And rails one, two, three up to about rail ten are given as sequences A062052, A062053, A062054, A062055, A062056, A062057, A062058, A062059, A062060, A072466 and A072122.

All right. The Collatz conjecture claims that all numbers eventually reach one, and the infinite rail set seems to agree. Every member of that rail set does go to one, but there's no clear way to prove that every number is a member. What if a number doesn't eventually go to one? What would we see?

At a glance there are three possibilities. Maybe a number can wander chaotically forever, without becoming infinite and without repeating. Maybe it can enter a repeating cycle of some kind. Or maybe it can somehow evade ever hitting a power of two and creep up to infinity.

The first of those is the easiest to deal with. That can't happen. If a number stays finite, but wanders forever, there are only a finite number of integers of that size. Therefore it must eventually hit one it's visited before, and then it must repeat. If it repeats, it's in a loop. You can only have a number stay finite but wander forever if you have a non-integer.

The second option is, what if you have a loop? Well, we have one loop already
– the 1, 4, 2, 1 loop. But that's the only one that's been found. It's known
that other loops might be possible, because if you look at the 3n−1 procedure,
that does have three separate loops. The 3n−1 procedure is very like the Collatz
procedure except that you multiply by three and *subtract* one instead of
adding. It behaves in a very similar way, but it has a 1, 2, 1 loop, a 5, 14, 7,
20, 10, 5 loop and a longer loop whose lowest member is 17. We can't immediately
discount the possibility that the Collatz procedure might have something
similar.

Let's assume, then, that there is some number, *N*, that is a member of
a loop other than the 1, 4, 2, 1 loop. I've called it capital *N* because
we've checked all numbers up to 10^{20} and more without finding it, so
if *N* exists it must be large. Now, if it's not connected to the 1, 4, 2,
1 loop then it cannot be a member of any of the rails we've already found,
because by construction they're all connected to R_{0}. So it must be on
some new rail, which I'll call R′_{0}. That works; we can do that
without introducing any contradictions. If *N* is a member of
R′_{0}, then so are 2*N* and 4*N* and 8*N* and 16*N*
and so on. So R′_{0} has infinitely many members, just as R_{0}
does. And then we can use the same generation method as before to produce the
members of rail R′_{1}. That's going to have the same number of members
as rail R_{1}. And then R′_{2}, R′_{3} and so on, to
infinity. In fact you can place these numbers in a one to one correspondence
with the members of R_{0}, R_{1}, R_{2} and the rest, so
in some strong sense, there would be exactly the same number of exceptions as
there are numbers that eventually reach one.

In hindsight, this is pretty obvious and I'm sure it's not a new discovery.
If there is one exception to the Collatz conjecture, then there must be
infinitely many exceptions. It not like finding one number in an infinite
haystack; as many as half of all numbers could be an exception. Which should mean
that you could just pick some super-collossal number, and as long as it's
significantly bigger than *N* and chosen with a bit of care, there could be as
much as a 50–50 chance that it will lead you to *N*. And if it doesn't, just pick
some other colossal number. You should luck out reasonably quickly. Now, you're dealing
with infinities here, so some care is required, and this method will only work
if there is a counter-example that is within reach of current computation
techniques. But at least you don't have to worry about skipping numbers any more.

What if there is some number, *N*, that goes to infinity? That's a
bigger ask, and I think it's impossible. The reasoning is similar to before. If
there is some number *N* that goes to infinity, then it too cannot be a
member of any existing rail set. So we have to assign it its own rail set, which
I will call R″_{0}. And from R″_{0} you can generate
R″_{1}, and from R″_{1} you can generate R″_{2} and so
on, just as you could for a number that loops. But, this number *N* has to
go to infinity. So it has to keep rising. There has to be some rail
R″_{−1} that it links to, and a rail R″_{−2} and so on. You need
infinitely many rails in both directions. Not only that, this sequence can never
repeat any number on any other rail, because if it does it collapses into a loop.

Let's consider some rail with a large negative index. R″_{− a
bajillion}, for example. We can start from that rail and work back towards
rail R″_{0} again. One in three of the even numbers on each rail will
generate an infinite set of numbers on the next rail. If we started at rail
R″_{− a bajillion}, by the time we get back to rail R″_{0} we
ought to have found infinitely more numbers than just *N* and 2*N* and
4*N* and 8*N* and so on. And the argument continues for rail
R″_{1}, R″_{2} and onwards. We must have missed loads of
numbers! That's worrying, but maybe we can fix it, by putting in all the extra
numbers we missed. But that doesn't help, because then we can go back to rail
R″_{−tree(3)} for example, and do the same thing, and find even more
numbers that we must have missed. And more and more and more, the further back
we go. You have to conclude that the numbers found so far are only a tiny
fraction of the total numbers, and that the overwhelming majority of all numbers
must go to infinity. And the more negative the rail number we choose, the worse
it gets, which is a problem as we have to go back infinitely far. So where are
all the numbers that go to infinity then?

However, Professor Terence Tao proved a few years ago that 'almost all' numbers get smaller. See https://arxiv.org/abs/1909.03562. But if one number escapes to infinity, it looks like overwhelmingly many of all numbers would need to escape, and that contradicts Professor Tao's result. That strongly suggests that all these numbers going to infinity can't possibly exist, which hopefully means a proof by contradiction is possible.

(Sorry, this isn't a maths paper. This is an entry to the Summer of Math Exposition. I'm not a trained mathematician; I'm just writing up my explorations of a well-known maths problem. Although I have tried, I haven't found a watertight proof here, just a reasonable argument pointing towards such a proof.)

All right. The next thing I decided to do was to look at *all* the
numbers. I decided not to worry about where they showed up in the Collatz
sequence, and just start at 1 and work up.

My reasoning for doing this was, well, really because the usual Collatz tree, which everyone else has been looking at for years (here's a link to quite a big one) doesn't seem to be very helpful. By the time you've collated anough numbers to see what's going on, you've got too many numbers to see what's going on. If the Collatz tree was any use, it should have helped solve the problem by now. So I decided to ignore the Collatz tree entirely, and simply look at all numbers in ascending order. I know they're all on the tree somewhere, so why not just examine where each number jumps to? The hope was that if I found anything, I could sort out the discovery first, then reassemble the Collatz tree afterwards if needed.

As a time-saver, the even numbers are safe to skip, because every even number gets halved repeatedly until it becomes odd, and it can be dealt with then. That gives the following list, which has all the odd numbers in column A, and those numbers multiplied by 3 and plus 1 in column B. And then if the result is even it's halved in column D, and then E and so on, as many times as necessary. To cover all numbers, infinitely many columns are needed.

A | B | C | D | E | F | G | H | … | |
---|---|---|---|---|---|---|---|---|---|

1 | 4 | 2 | 1 | ||||||

3 | 10 | 5 | |||||||

5 | 16 | 8 | 4 | 2 | 1 | ||||

7 | 22 | 11 | |||||||

9 | 28 | 14 | 7 | ||||||

11 | 34 | 17 | |||||||

13 | 40 | 20 | 10 | 5 | |||||

15 | 46 | 23 | |||||||

17 | 52 | 26 | 13 | ||||||

19 | 58 | 29 | |||||||

21 | 64 | 32 | 16 | 8 | 4 | 2 | 1 | ||

23 | 70 | 35 | |||||||

25 | 76 | 38 | 19 | ||||||

27 | 82 | 41 | |||||||

29 | 88 | 44 | 22 | 11 | |||||

31 | 94 | 47 | |||||||

33 | 100 | 50 | 25 | ||||||

35 | 106 | 53 | |||||||

37 | 112 | 56 | 28 | 14 | 7 | ||||

39 | 118 | 59 | |||||||

41 | 124 | 62 | 31 | ||||||

43 | 130 | 65 | |||||||

45 | 136 | 68 | 34 | 17 | |||||

47 | 142 | 71 | |||||||

49 | 148 | 74 | 37 | ||||||

51 | 154 | 77 | |||||||

53 | 160 | 80 | 40 | 20 | 10 | 5 | |||

55 | 166 | 83 | |||||||

57 | 172 | 86 | 43 | ||||||

59 | 178 | 89 | |||||||

61 | 184 | 92 | 46 | 23 | |||||

… | … | … | … | … | … | … | … | … |

Notice that columns C, E, G and so on are the same except for the different spacing, as are columns D, F, H and so on. After you apply the 3n+1 operation and then halve as necessary, you end up on one of two arithmetic sequences – 2, 5, 8, 11, … and 1, 4, 7, 10, … which alternate forever. So the Collatz procedure, at its heart, breaks down into these two sequences. And they're arithmetic sequences, so they're super predictable. They're just… cleverly interwoven.

However, dealing with a table with infinite rows and infinite columns is a pain. Is there a better way? Well, how about this:

A | B | C | D | E | F | |
---|---|---|---|---|---|---|

1 | 4 | 1 | ||||

3 | 10 | 5 | ||||

5 | 16 | 1 | ||||

7 | 22 | 11 | ||||

9 | 28 | 7 | ||||

11 | 34 | 17 | ||||

13 | 40 | 5 | ||||

15 | 46 | 23 | ||||

17 | 52 | 13 | ||||

19 | 58 | 29 | ||||

21 | 64 | 1 | ||||

23 | 70 | 35 | ||||

25 | 76 | 19 | ||||

27 | 82 | 41 | ||||

29 | 88 | 11 | ||||

31 | 94 | 47 | ||||

33 | 100 | 25 | ||||

35 | 106 | 53 | ||||

37 | 112 | 7 | ||||

39 | 118 | 59 | ||||

41 | 124 | 31 | ||||

43 | 130 | 65 | ||||

45 | 136 | 17 | ||||

47 | 142 | 71 | ||||

… | … | … | … | … | … |

Columns A–D are unchanged, and columns F, G, H and so on all the way to infinity are merged into column E. What is the value in column E? It doesn't look especially predictable, but in fact it's equal to the contents of rows 1, 2, 3, 4, 5, 6, … of the table. Now we've got two arithmetic sequences, which are trivial to calculate, and a final column where row 3 is the same as row 1, row 7 is the same as row 2, row 11 is the same as row 3, row 15 is the same as row 4, etc. So we don't have to calculate anything new, we can just look up earlier rows in the table. Notice also that if you read out columns C, D and E in ascending row order, you get 1, 5, 1, 11, 7, 17, … and that's the same as you get if you just read out column E on its own, which also goes 1, 5, 1, 11, 7, 17, … This is an OEIS sequence as well. It's A075677.

What's happened here is that the entire Collatz tree, all the way to
infinity, has been chopped up into individual 'twigs'. And whenever you jump
from column A to column B, you jump from some rail R_{k} to an earlier
rail R_{k−1}, for all numbers. And that's true even if they are members
of some weird loop, or are on their way off to infinity. Things might change
when R_{0} is reached, but until then the rail numbers simply count down
one step at a time.

At this point, I shall get rid of the numbers altogether and just record what row jumps where. That gives the table below, except I'm temporarily going to ignore column D:

Row | B | C | D |
---|---|---|---|

1 | 1 | ||

2 | 3 | ||

3 | 1 | ||

4 | 6 | ||

5 | 4 | ||

6 | 9 | ||

7 | 3 | ||

8 | 12 | ||

9 | 7 | ||

10 | 15 | ||

11 | 1 | ||

12 | 18 | ||

13 | 10 | ||

14 | 21 | ||

15 | 6 | ||

16 | 24 | ||

17 | 13 | ||

18 | 27 | ||

19 | 4 | ||

… | … | … | … |

Do pause for a moment and confirm that this simpler sequence really does capture the jump pattern that the Collatz numbers follow. You can convert back to Collatz numbers at any time by doubling the row number and subtracting 1. The main advantage of looking at row numbers is that they just count up from row 1 without any gaps.

Note that every number seen in column D is a number already seen in column B
or C – commonly about 3/4 of a table earlier, but sometimes much more. But every
number in column D is repeated infinitely often, and every number in column B or
C eventually jumps to column D. And when a number jumps to column D, it jumps to
a significantly earlier point, because column D is where the row number drops by
the largest amounts. And when it jumps *out* of column D, it will
reappear at some point in column B or C, but at some new, lower number that will
also eventually jump back onto column D. And so on, until the number eventually
reaches 1. Furthermore, every number is also a member of some rail
R_{k}. And no matter whether that number grows or shrinks, it also
always drops to an earlier and earlier rail number with each jump. Eventually it
hits R_{0} (or R′_{0} or R″_{0}, except that for the
Collatz procedure, only one rail set has been found).

Now what I'm going to do is to rewrite the Collatz procedure and simply decree that column D just goes to some row of my choosing. Let's pick, oh, 23. I'll fill that in, and I'll override row 1 as well to be complete, and that gives this:

Row | B | C | D |
---|---|---|---|

1 | 23 | ||

2 | 3 | ||

3 | 23 | ||

4 | 6 | ||

5 | 4 | ||

6 | 9 | ||

7 | 23 | ||

8 | 12 | ||

9 | 7 | ||

10 | 15 | ||

11 | 23 | ||

12 | 18 | ||

13 | 10 | ||

14 | 21 | ||

15 | 23 | ||

16 | 24 | ||

17 | 13 | ||

18 | 27 | ||

19 | 23 | ||

20 | 30 | ||

21 | 16 | ||

22 | 33 | ||

23 | 23 | ||

24 | 36 | ||

25 | 19 | ||

… | … | … | … |

For example, if you start on row 5, you jump to row 4. If you're on row 4, you jump to row 6. If you're on row 6, you jump to row 9. If you're on row 9, you jump to row 7. If you're on row 7, you jump to row 23.

This modified table is always going to end up on row 23, no matter where you start. Even if there was some set of numbers that escape to infinity, nope, now they're all going to row 23. A second loop? Nope. Row 23. Some numbers survive longer than others before they get caught, but every row that isn't row 23 must continually jump to another row. Just like rolling a dice as many times as necessary until you get a 6, you will eventually get a 6. Eventually the path will hit a row number that is 3 modulo 4, (4 modulo 3? Which way round should that be?) and that's the end of it.

Let's change all those row 23s to row 1. What happens now? Exactly the same thing, except everything ends up on row 1. I may have cheated, but on this table, the Collatz conjecture really is true because I've brute-forced it to be. But shortly I'll be covering the fact that the actual (non-forced) Collatz tree can also be filled up with ones up for quite a long way – perhaps long enough that the real Collatz conjecture might be true too.

Row | B | C | D | |
---|---|---|---|---|

1 | 1 | |||

2 | 3 | |||

3 | 1 | |||

4 | 6 | |||

5 | 4 | |||

6 | 9 | |||

7 | 1 | |||

8 | 12 | |||

9 | 7 | |||

10 | 15 | |||

11 | 1 | |||

12 | 18 | |||

13 | 10 | |||

14 | 21 | |||

15 | 1 | |||

16 | 24 | |||

17 | 13 | |||

18 | 27 | |||

19 | 1 | |||

20 | 30 | |||

21 | 16 | |||

22 | 33 | |||

23 | 1 | |||

24 | 36 | |||

25 | 19 | |||

… | … | … | … | … |

What about if we change row D to 1, 2, 3, 4, 5, and so on?

Row | B | C | D | |
---|---|---|---|---|

1 | 1 | |||

2 | 3 | |||

3 | 1 | |||

4 | 6 | |||

5 | 4 | |||

6 | 9 | |||

7 | 2 | |||

8 | 12 | |||

9 | 7 | |||

10 | 15 | |||

11 | 3 | |||

12 | 18 | |||

13 | 10 | |||

14 | 21 | |||

15 | 4 | |||

16 | 24 | |||

17 | 13 | |||

18 | 27 | |||

19 | 5 | |||

20 | 30 | |||

21 | 16 | |||

22 | 33 | |||

23 | 6 | |||

24 | 36 | |||

25 | 19 | |||

… | … | … | … | … |

I *think* you can verify that everything still ends up at row 1. If
you're not at row 1, you will always have to jump to some other row, and the
pattern of jumps is interleaved in such a way that you never hit a loop until
you hit row 1, and also so that you are guaranteed to descend. You will
sometimes jump upwards, but that only happens in column B, and each series of
jumps upwards is only a finite and predictable number of steps. And there's a
simple row-calculating formula (see below).

By the way, as everything up to 10^{20} has been checked and goes to
1, you *could* fill in the entire table with ones all the way up to about
row 5 × 10^{19} in column B. And then you could also fill in the first 5
× 10^{19} entries in columns C, D, E and so on to infinity. And once
those entries are filled with ones, even more of columns B and C can be filled
in to match, allowing yet more entries to be filled in in column D, and so on.
This bootstrap process covers more and more numbers with each iteration. We also
have the equations to let us jump backwards or forwards from any row to the next
or the previous row, so as soon as it's known that some row ends at 1, we can
find infinitely many other numbers that also end at 1. I'm actually beginning to
wonder why Collatz isn't already considered solved. Do we really not have enough
information to prove it?

The concept of rails shows that every number eventually reaches some rail
R_{0}, R′_{0}, R″_{0} or similar, and also that if there
is a second rail set, up to half of all numbers would be members of that rail
set. But if some odd number *n* is a member of some rail, then *4n+1*
(and infinitely many others) must also be members of the same rail. As column D
is recursively filled up, there doesn't seem to be space in the standard Collatz
procedure for a second rail set to exist. If there's no space for a second rail
set, then we can only have one rail set, and the conjecture is true. But by
comparison, if you try the same operation on the 3n−1 procedure, the gaps are
larger and there *is* space, and lo, the 3n−1 procedure *does*
have multiple loops. The 3n−1 procedure acts as a helpful counter-example; it
shows us what to look for to show that the Collatz conjecture is false.

In column B we have row 3, 6, 9, 12, … all the way to infinity. In column C we have row 1, 4, 7, 10, … all the way to infinity. In column D we have row 1, 2, 3, 4, … all the way to infinity. (Yes, this does introduce extra jumps, but check against table iv to confirm that row 3 goes to 1, row 7 goes to 3, row 11 goes to 1, row 15 goes to 6 and so on.)

Let *r* equal some row number.

What's the next row?

- if
*r*mod 2 = 0, then the next row is given by the function*f(r) = 3r / 2* - If
*r*mod 4 = 1, then the next row is given by*g(r) = (3r + 1) / 4* - If
*r*mod 4 = 3, then the next row is given by*h(r) = (r + 1) / 4*

(As an alternative for *h(r)*, you could define *i(r)* to be the sequence
1, 3, 1, 6, 4, 9, 3, 12, 7, … That will have a rather more
complicated definition, but if you plug *r* in you get the correct
destination row out without needing the extra jump(s) that *h(r)* requires.)

And the row after that is … well, just plug in the new value of *r* and repeat
as many times as needed.

Every row, all the way to infinity, goes to row 1. Can we actually
*prove* this? Rather than just trying it and seeing that it always works?

One possible approach would be to prove that there is a second loop, or to prove that there can't be. Escaping to infinity already looks impossible (although that's not rigorously confirmed), but if additional loops could be proved impossible that would probably eliminate escaping to infinity as an option as well.

What is a loop? Obviously, in order to have a loop, you have to start at some
number *r*, apply the row formulae to it in some unknown order, and then
eventually get back to the original number. But the thing about a loop is that
you can go round it in either direction. The forwards direction has been done
to death; as I said, those numbers have been checked past 10^{20}. But
what about in reverse?

Okay, I'll define the inverse row functions as well, then. These equations
are such that, given some row number *r*, they will tell you which other
row or rows you could have reached *r* from. This allows you to work
backwards through the Collatz tree, and with *f(r)*, *g(r)* and
*h(r)* (or *i(r)*), you can traverse the tree freely in either
direction. For example, given *r* = 63, you could have reached that from
*r* = 42. And *r* = 42 could have been reached from *r* = 28. And
so on. I'll write these inverse functions as f^{−1}(r) etc.

- Column B: If
*r*mod 3 = 0 then*f*^{−1}(r) = 2r / 3 - Column C: If
*r*mod 3 = 1 then*g*^{−1}(r) = (4r − 1) / 3 - Column D: For any
*r*,*h*^{−1}(r) = 4r − 1 - Column E: For any
*r*,*i*^{−1}(r) = <some recursive definition, generating infinite entries for each cell, that I won't try to write up here>

Row | B | C | D | E |
---|---|---|---|---|

1 | 1 | 3 | 3,11,43,171,683,… | |

2 | 7 | |||

3 | 2 | 11 | 7,27,107,427,… | |

4 | 5 | 15 | 19,75,299,… | |

5 | 19 | |||

6 | 4 | 23 | 15,59,235,… | |

7 | 9 | 27 | 35,139,555,… | |

8 | 31 | |||

9 | 6 | 35 | 23,91,363,… | |

10 | 13 | 39 | 51,203,811,… | |

11 | 43 | |||

12 | 8 | 47 | 31,123,491,… | |

13 | 16 | 51 | 67,267,1067,… | |

14 | 55 | |||

15 | 10 | 59 | 39,155,619,… | |

16 | 19 | 63 | 83,331,1323,… | |

17 | 67 | |||

18 | 12 | 71 | 47,187,747,… | |

19 | 22 | 75 | 99,395,1579,… | |

20 | 79 | |||

21 | 14 | 83 | 55,219,875,… | |

22 | 25 | 87 | 115,459,1835,… | |

23 | 91 | |||

24 | 16 | 95 | 63,251,1003,… | |

25 | 28 | 99 | 131,523,2091,… | |

… | … | … | … | … |

If you use column E from table viii, you get table ix below. Rows which contain multiples of 3 and have no further connections are coloured red. For example, row 2 contains the number 3, row 11 contains the number 21, etc. These are all 'dead ends' in the Collatz structure; nothing feeds in to them from any higher-numbered rail.

R_{0} | R_{1} | R_{2} | R_{3} | R_{4} | R_{5} | … |
---|---|---|---|---|---|---|

683 | ||||||

… | ||||||

7275 | 4850, 19399, … | |||||

171 | 1819 | 2425, 9699, … | ||||

455 | ||||||

114 | 76, 303, … | |||||

… | ||||||

3627 | 2178, 8711, … | |||||

1 | 43 | 907 | 1209, 4835, … | |||

227 | ||||||

57 | 75, 299, … | |||||

11 | ||||||

… | ||||||

1707 | 1138, 4551, … | |||||

427 | 569, 2275, … | |||||

3 | 107 | |||||

27 | 18, 71, 283, 1131, … | |||||

7 | 9, 35, 139, 555, … | |||||

2 |

For those going cross-eyed trying to decipher this table, I agree, it's
horrible. It's much easier to digest by ignoring the connections entirely and
just treating it as rails R_{0}, R_{1}, R_{2} and so on,
as shown in table i. Don't worry about what number jumps
where, just confirm than every set of numbers links to the next and the previous
rails.

However, column D of table viii can be used to
generate a different Collatz tree, shown below. This version of the tree
contains extra jumps compared to the standard tree, because of the use of
*h ^{−1}(r)* instead of

6 | |||||

9 | |||||

35 | |||||

2 | 7 | ||||

18 | |||||

27 | |||||

107 | |||||

1 | 3 | ||||

38 | |||||

57 | |||||

227 | |||||

11 | 43 | ||||

114 | |||||

171 | |||||

683 |

If you pick some number on the tree, for example 18, then the rest of the subtree for row 18 can be generated. This consists of all the numbers that lead to row 18. Every number links to at least one other number, and therefore an infinite tree can be generated for all numbers. There is no need to care what path 18's descendants have to take in order to reach 18; it's sufficient that they do. The same argument applies for every other number that could be chosen. This means the entire tree to the right of row 18 can be generated, then pruned and discarded. But 27 is 18's parent, so the 18-tree and the 107-tree can also be pruned. So in turn can 27, and then 7, 2, 3, all the way up to the root of the tree, which is 1. Every number in the Collatz tree, all the way to infinity, has 1 as its ultimate ancestor. I'm as sure as I can be that there are no loops, and there are no numbers that escape to infinity.

As a sanity check, I shall also generate the equivalent trees for the 3n−1
procedure. As stated earlier, the 3n−1 procedure does have loops (1, 2, 1), (5,
14, 7, 20, 10, 5) and a longer loop whose lowest member is 17, and consequently
the 3n−1 procedure also has three rail sets – R_{0}, R′_{0} and
R″_{0} as shown below. It's striking how much more quickly the numbers
in each rail set grow for the 3n−1 procedure than they do for Collatz. By
R_{4} we already seem to be past the point where it's reasonable to
expect numbers as small as 5 or 7 to show up, and yet they're still missing. By
about R_{10} or so it looks overwhelmingly likely that they must be
members of their own separate rail set(s). And this is easily confirmed by
simply checking a few of the missing numbers, at which point the R′ and R″ rail
sets are quickly found.

R_{0} | R_{1} | R_{2} | R_{3} | R_{4} | R_{5} | R_{6} | R_{7} | R_{8} | R_{9} |
---|---|---|---|---|---|---|---|---|---|

1 | 3 | 15 | 39 | 53 | 69 | 95 | 127 | 85 | 57 |

2 | 6 | 29 | 77 | 103 | 71 | 189 | 245 | 169 | 113 |

4 | 11 | 30 | 78 | 106 | 138 | 190 | 253 | 170 | 114 |

8 | 12 | 58 | 79 | 206 | 141 | 367 | 254 | 327 | 226 |

16 | 22 | 59 | 154 | 207 | 142 | 378 | 490 | 338 | 227 |

32 | 24 | 60 | 155 | 211 | 275 | 379 | 506 | 339 | 228 |

64 | 43 | 115 | 156 | 212 | 276 | 380 | 507 | 340 | 451 |

128 | 44 | 116 | 158 | 411 | 282 | 734 | 508 | 654 | 452 |

256 | 48 | 118 | 307 | 412 | 283 | 755 | 979 | 675 | 454 |

512 | 86 | 120 | 308 | 414 | 284 | 756 | 980 | 676 | 456 |

… | … | … | … | … | … | … | … | … | … |

R′_{0} | R′_{1} | R′_{2} | R′_{3} | R′_{4} | R′_{5} | R′_{6} | R′_{7} | R′_{8} | R′_{9} |
---|---|---|---|---|---|---|---|---|---|

5 | 7 | 19 | 13 | 9 | 47 | 63 | 167 | 223 | 149 |

10 | 14 | 38 | 26 | 18 | 93 | 125 | 333 | 446 | 298 |

20 | 27 | 75 | 51 | 35 | 94 | 126 | 334 | 447 | 595 |

40 | 28 | 76 | 52 | 36 | 186 | 250 | 335 | 891 | 596 |

80 | 54 | 143 | 102 | 70 | 187 | 251 | 666 | 892 | 1190 |

160 | 56 | 150 | 104 | 72 | 188 | 252 | 667 | 894 | 1192 |

320 | 107 | 152 | 191 | 139 | 371 | 495 | 668 | 1782 | 2379 |

640 | 108 | 285 | 203 | 140 | 372 | 499 | 670 | 1784 | 2380 |

1280 | 112 | 286 | 204 | 144 | 374 | 500 | 1331 | 1787 | 2383 |

2560 | 214 | 299 | 208 | 255 | 376 | 502 | 1332 | 1788 | 2384 |

… | … | … | … | … | … | … | … | … | … |

R″_{0} | R″_{1} | R″_{2} | R″_{3} | R″_{4} | R″_{5} | R″_{6} | R″_{7} | R″_{8} | R″_{9} |
---|---|---|---|---|---|---|---|---|---|

17 | 23 | 31 | 21 | 55 | 37 | 25 | 33 | 45 | 117 |

34 | 46 | 61 | 41 | 109 | 73 | 49 | 66 | 90 | 233 |

68 | 91 | 62 | 42 | 110 | 74 | 50 | 67 | 175 | 234 |

136 | 92 | 122 | 82 | 111 | 146 | 98 | 131 | 179 | 239 |

272 | 182 | 123 | 83 | 218 | 147 | 99 | 132 | 180 | 466 |

544 | 184 | 124 | 84 | 219 | 148 | 100 | 134 | 349 | 467 |

1088 | 363 | 243 | 163 | 220 | 291 | 195 | 262 | 350 | 468 |

2176 | 364 | 244 | 164 | 222 | 292 | 196 | 264 | 358 | 477 |

4352 | 368 | 246 | 166 | 435 | 294 | 198 | 267 | 360 | 478 |

8704 | 726 | 248 | 168 | 436 | 296 | 200 | 268 | 698 | 931 |

… | … | … | … | … | … | … | … | … | … |

For the 3n−1 procedure, the row equations are:

- If
*r*mod 2 = 1,*f(r) = (3r − 1) / 2* - If
*r*mod 4 = 2,*g(r) = 3r / 4* - If
*r*mod 4 = 0,*h(r) = (r + 2) / 4*

The inverse row equations are:

- If
*r*mod 3 = 1,*f*^{−1}(r) = (2r + 1) / 3 - If
*r*mod 3 = 0,*g*^{−1}(r) = 4r / 3 - For any
*r*,*h*^{−1}(r) = 4r − 2

Its row table is as follows:

Row | B | C | D |
---|---|---|---|

1 | 1 | 2 | |

2 | 6 | ||

3 | 4 | 10 | |

4 | 3 | 14 | |

5 | 18 | ||

6 | 8 | 22 | |

7 | 5 | 26 | |

8 | 30 | ||

9 | 12 | 34 | |

10 | 7 | 38 | |

11 | 42 | ||

12 | 16 | 46 | |

13 | 9 | 50 | |

14 | 54 | ||

15 | 20 | 58 | |

16 | 11 | 62 | |

17 | 66 | ||

18 | 24 | 70 | |

19 | 13 | 74 | |

20 | 78 | ||

21 | 28 | 82 | |

22 | 15 | 86 | |

23 | 90 | ||

24 | 32 | 94 | |

25 | 17 | 98 | |

24 | 32 | 94 | |

… | … | … | … |

If you generate the 3n−1 tree starting from row 1, it's slightly lopsided, and the first few branches look like this:

40 | |||||

8 | 30 | ||||

118 | |||||

1 | 2 | 6 | |||

20 | |||||

15 | |||||

58 | |||||

22 | |||||

86 | 342 |

But wait, where's 3? Where's 4? Where's 7? Where's 9?

These numbers never appear. Row 3 is not connected to row 1, and nor is row 9. And nor are infinitely many other rows.

To get to the row 3 family an entire second tree is needed, and a third tree is needed for row 9.

Note than in both cases the row *tables* do contain all numbers, but
the *tree(s)* generated from the tables only contains the set of row
numbers linked to their parent row. The 3n−1 procedure has three loops and needs
three trees, while it looks pretty sure Collatz only has one loop and only needs
one tree. And if that is the case, as it appears to be, then that pesky ol'
conjecture really is true.