Ordinal collapsing function

From Wikipedia the free encyclopedia

In mathematical logic and set theory, an ordinal collapsing function (or projection function) is a technique for defining (notations for) certain recursive large countable ordinals, whose principle is to give names to certain ordinals much larger than the one being defined, perhaps even large cardinals (though they can be replaced with recursively large ordinals at the cost of extra technical difficulty), and then "collapse" them down to a system of notations for the sought-after ordinal. For this reason, ordinal collapsing functions are described as an impredicative manner of naming ordinals.

The details of the definition of ordinal collapsing functions vary, and get more complicated as greater ordinals are being defined, but the typical idea is that whenever the notation system "runs out of fuel" and cannot name a certain ordinal, a much larger ordinal is brought "from above" to give a name to that critical point. An example of how this works will be detailed below, for an ordinal collapsing function defining the Bachmann–Howard ordinal (i.e., defining a system of notations up to the Bachmann–Howard ordinal).

The use and definition of ordinal collapsing functions is inextricably intertwined with the theory of ordinal analysis, since the large countable ordinals defined and denoted by a given collapse are used to describe the ordinal-theoretic strength of certain formal systems, typically[1][2] subsystems of analysis (such as those seen in the light of reverse mathematics), extensions of Kripke–Platek set theory, Bishop-style systems of constructive mathematics or Martin-Löf-style systems of intuitionistic type theory.

Ordinal collapsing functions are typically denoted using some variation of either the Greek letter (psi) or (theta).

An example leading up to the Bachmann–Howard ordinal[edit]

The choice of the ordinal collapsing function given as example below imitates greatly the system introduced by Buchholz[3] but is limited to collapsing one cardinal for clarity of exposition. More on the relation between this example and Buchholz's system will be said below.

Definition[edit]

Let stand for the first uncountable ordinal , or, in fact, any ordinal which is an -number and guaranteed to be greater than all the countable ordinals which will be constructed (for example, the Church–Kleene ordinal is adequate for our purposes; but we will work with because it allows the convenient use of the word countable in the definitions).

We define a function (which will be non-decreasing and continuous), taking an arbitrary ordinal to a countable ordinal , recursively on , as follows:

Assume has been defined for all , and we wish to define .
Let be the set of ordinals generated starting from , , and by recursively applying the following functions: ordinal addition, multiplication and exponentiation and the function , i.e., the restriction of to ordinals . (Formally, we define and inductively for all natural numbers and we let be the union of the for all .)
Then is defined as the smallest ordinal not belonging to .

In a more concise (although more obscure) way:

is the smallest ordinal which cannot be expressed from , , and using sums, products, exponentials, and the function itself (to previously constructed ordinals less than ).

Here is an attempt to explain the motivation for the definition of in intuitive terms: since the usual operations of addition, multiplication and exponentiation are not sufficient to designate ordinals very far, we attempt to systematically create new names for ordinals by taking the first one which does not have a name yet, and whenever we run out of names, rather than invent them in an ad hoc fashion or using diagonal schemes, we seek them in the ordinals far beyond the ones we are constructing (beyond , that is); so we give names to uncountable ordinals and, since in the end the list of names is necessarily countable, will "collapse" them to countable ordinals.

Computation of values of ψ[edit]

To clarify how the function is able to produce notations for certain ordinals, we now compute its first values.

Predicative start[edit]

First consider . It contains ordinals and so on. It also contains such ordinals as . The first ordinal which it does not contain is (which is the limit of , , and so on — less than by assumption). The upper bound of the ordinals it contains is (the limit of , , and so on), but that is not so important. This shows that .

Similarly, contains the ordinals which can be formed from , , , and this time also , using addition, multiplication and exponentiation. This contains all the ordinals up to but not the latter, so . In this manner, we prove that inductively on : the proof works, however, only as long as . We therefore have:

for all , where is the smallest fixed point of .

(Here, the functions are the Veblen functions defined starting with .)

Now but is no larger, since cannot be constructed using finite applications of and thus never belongs to a set for , and the function remains "stuck" at for some time:

for all .

First impredicative values[edit]

Again, . However, when we come to computing , something has changed: since was ("artificially") added to all the , we are permitted to take the value in the process. So contains all ordinals which can be built from , , , , the function up to and this time also itself, using addition, multiplication and exponentiation. The smallest ordinal not in is (the smallest -number after ).

We say that the definition and the next values of the function such as are impredicative because they use ordinals (here, ) greater than the ones which are being defined (here, ).

Values of ψ up to the Feferman–Schütte ordinal[edit]

The fact that equals remains true for all . (Note, in particular, that : but since now the ordinal has been constructed there is nothing to prevent from going beyond this). However, at (the first fixed point of beyond ), the construction stops again, because cannot be constructed from smaller ordinals and by finitely applying the function. So we have .

The same reasoning shows that for all , where enumerates the fixed points of and is the first fixed point of . We then have .

Again, we can see that for some time: this remains true until the first fixed point of , which is the Feferman–Schütte ordinal. Thus, is the Feferman–Schütte ordinal.

Beyond the Feferman–Schütte ordinal[edit]

We have for all where is the next fixed point of . So, if enumerates the fixed points in question (which can also be noted using the many-valued Veblen functions) we have , until the first fixed point of the itself, which will be (and the first fixed point of the functions will be ). In this manner:

  • is the Ackermann ordinal (the range of the notation defined predicatively),
  • is the "small" Veblen ordinal (the range of the notations predicatively using finitely many variables),
  • is the "large" Veblen ordinal (the range of the notations predicatively using transfinitely-but-predicatively-many variables),
  • the limit of , , , etc., is the Bachmann–Howard ordinal: after this our function is constant, and we can go no further with the definition we have given.

Ordinal notations up to the Bachmann–Howard ordinal[edit]

We now explain more systematically how the function defines notations for ordinals up to the Bachmann–Howard ordinal.

A note about base representations[edit]

Recall that if is an ordinal which is a power of (for example itself, or , or ), any ordinal can be uniquely expressed in the form , where is a natural number, are non-zero ordinals less than , and are ordinal numbers (we allow ). This "base representation" is an obvious generalization of the Cantor normal form (which is the case ). Of course, it may quite well be that the expression is uninteresting, i.e., , but in any other case the must all be less than ; it may also be the case that the expression is trivial (i.e., , in which case and ).

If is an ordinal less than , then its base representation has coefficients (by definition) and exponents (because of the assumption ): hence one can rewrite these exponents in base and repeat the operation until the process terminates (any decreasing sequence of ordinals is finite). We call the resulting expression the iterated base representation of and the various coefficients involved (including as exponents) the pieces of the representation (they are all ), or, for short, the -pieces of .

Some properties of ψ[edit]

  • The function is non-decreasing and continuous (this is more or less obvious from its definition).
  • If with then necessarily . Indeed, no ordinal with can belong to (otherwise its image by , which is would belong to — impossible); so is closed by everything under which is the closure, so they are equal.
  • Any value taken by is an -number (i.e., a fixed point of ). Indeed, if it were not, then by writing it in Cantor normal form, it could be expressed using sums, products and exponentiation from elements less than it, hence in , so it would be in , a contradiction.
  • Lemma: Assume is an -number and an ordinal such that for all : then the -pieces (defined above) of any element of are less than . Indeed, let be the set of ordinals all of whose -pieces are less than . Then is closed under addition, multiplication and exponentiation (because is an -number, so ordinals less than it are closed under addition, multiplication and exponentiation). And also contains every for by assumption, and it contains , , , . So , which was to be shown.
  • Under the hypothesis of the previous lemma, (indeed, the lemma shows that ).
  • Any -number less than some element in the range of is itself in the range of (that is, omits no -number). Indeed: if is an -number not greater than the range of , let be the least upper bound of the such that : then by the above we have , but would contradict the fact that is the least upper bound — so .
  • Whenever , the set consists exactly of those ordinals (less than ) all of whose -pieces are less than . Indeed, we know that all ordinals less than , hence all ordinals (less than ) whose -pieces are less than , are in . Conversely, if we assume for all (in other words if is the least possible with ), the lemma gives the desired property. On the other hand, if for some , then we have already remarked and we can replace by the least possible with .

The ordinal notation[edit]

Using the facts above, we can define a (canonical) ordinal notation for every less than the Bachmann–Howard ordinal. We do this by induction on .

If is less than , we use the iterated Cantor normal form of . Otherwise, there exists a largest -number less or equal to (this is because the set of -numbers is closed): if then by induction we have defined a notation for and the base representation of gives one for , so we are finished.

It remains to deal with the case where is an -number: we have argued that, in this case, we can write for some (possibly uncountable) ordinal : let be the greatest possible such ordinal (which exists since is continuous). We use the iterated base representation of : it remains to show that every piece of this representation is less than (so we have already defined a notation for it). If this is not the case then, by the properties we have shown, does not contain ; but then (they are closed under the same operations, since the value of at can never be taken), so , contradicting the maximality of .

Note: Actually, we have defined canonical notations not just for ordinals below the Bachmann–Howard ordinal but also for certain uncountable ordinals, namely those whose -pieces are less than the Bachmann–Howard ordinal (viz.: write them in iterated base representation and use the canonical representation for every piece). This canonical notation is used for arguments of the function (which may be uncountable).

Examples[edit]

For ordinals less than , the canonical ordinal notation defined coincides with the iterated Cantor normal form (by definition).

For ordinals less than , the notation coincides with iterated base notation (the pieces being themselves written in iterated Cantor normal form): e.g., will be written , or, more accurately, . For ordinals less than , we similarly write in iterated base and then write the pieces in iterated base (and write the pieces of that in iterated Cantor normal form): so is written , or, more accurately, . Thus, up to , we always use the largest possible -number base which gives a non-trivial representation.

Beyond this, we may need to express ordinals beyond : this is always done in iterated -base, and the pieces themselves need to be expressed using the largest possible -number base which gives a non-trivial representation.

Note that while is equal to the Bachmann–Howard ordinal, this is not a "canonical notation" in the sense we have defined (canonical notations are defined only for ordinals less than the Bachmann–Howard ordinal).

Conditions for canonicalness[edit]

The notations thus defined have the property that whenever they nest functions, the arguments of the "inner" function are always less than those of the "outer" one (this is a consequence of the fact that the -pieces of , where is the largest possible such that for some -number , are all less than , as we have shown above). For example, does not occur as a notation: it is a well-defined expression (and it is equal to since is constant between and ), but it is not a notation produced by the inductive algorithm we have outlined.

Canonicalness can be checked recursively: an expression is canonical if and only if it is either the iterated Cantor normal form of an ordinal less than , or an iterated base representation all of whose pieces are canonical, for some where is itself written in iterated base representation all of whose pieces are canonical and less than . The order is checked by lexicographic verification at all levels (keeping in mind that is greater than any expression obtained by , and for canonical values the greater always trumps the lesser or even arbitrary sums, products and exponentials of the lesser).

For example, is a canonical notation for an ordinal which is less than the Feferman–Schütte ordinal: it can be written using the Veblen functions as .

Concerning the order, one might point out that (the Feferman–Schütte ordinal) is much more than (because is greater than of anything), and is itself much more than (because is greater than , so any sum-product-or-exponential expression involving and smaller value will remain less than ). In fact, is already less than .

Standard sequences for ordinal notations[edit]

To witness the fact that we have defined notations for ordinals below the Bachmann–Howard ordinal (which are all of countable cofinality), we might define standard sequences converging to any one of them (provided it is a limit ordinal, of course). Actually we will define canonical sequences for certain uncountable ordinals, too, namely the uncountable ordinals of countable cofinality (if we are to hope to define a sequence converging to them...) which are representable (that is, all of whose -pieces are less than the Bachmann–Howard ordinal).

The following rules are more or less obvious, except for the last:

  • First, get rid of the (iterated) base representations: to define a standard sequence converging to , where is either or (or , but see below):
    • if is zero then and there is nothing to be done;
    • if is zero and is successor, then is successor and there is nothing to be done;
    • if is limit, take the standard sequence converging to and replace in the expression by the elements of that sequence;
    • if is successor and is limit, rewrite the last term as and replace the exponent in the last term by the elements of the fundamental sequence converging to it;
    • if is successor and is also, rewrite the last term as and replace the last in this expression by the elements of the fundamental sequence converging to it.
  • If is , then take the obvious as the fundamental sequence for .
  • If then take as fundamental sequence for the sequence
  • If then take as fundamental sequence for the sequence
  • If where is a limit ordinal of countable cofinality, define the standard sequence for to be obtained by applying to the standard sequence for (recall that is continuous and increasing, here).
  • It remains to handle the case where with an ordinal of uncountable cofinality (e.g., itself). Obviously it doesn't make sense to define a sequence converging to in this case; however, what we can define is a sequence converging to some with countable cofinality and such that is constant between and . This will be the first fixed point of a certain (continuous and non-decreasing) function . To find it, apply the same rules (from the base representation of ) as to find the canonical sequence of , except that whenever a sequence converging to is called for (something which cannot exist), replace the in question, in the expression of , by a (where is a variable) and perform a repeated iteration (starting from , say) of the function : this gives a sequence tending to , and the canonical sequence for is , , ... If we let the th element (starting at ) of the fundamental sequence for be denoted as , then we can state this more clearly using recursion. Using this notation, we can see that quite easily. We can define the rest of the sequence using recursion: . (The examples below should make this clearer.)

Here are some examples for the last (and most interesting) case:

  • The canonical sequence for is: , , ... This indeed converges to after which is constant until .
  • The canonical sequence for is: , , This indeed converges to the value of at after which is constant until .
  • The canonical sequence for is: This converges to the value of at .
  • The canonical sequence for is This converges to the value of at .
  • The canonical sequence for is: This converges to the value of at .
  • The canonical sequence for is: This converges to the value of at .
  • The canonical sequence for is: This converges to the value of at .
  • The canonical sequence for is:

Here are some examples of the other cases:

  • The canonical sequence for is: , , , ...
  • The canonical sequence for is: , , , ...
  • The canonical sequence for is: , , , ...
  • The canonical sequence for is: , , ...
  • The canonical sequence for is: , , , ...
  • The canonical sequence for is: , , , ...
  • The canonical sequence for is: , , , ...
  • The canonical sequence for is: , , ... (this is derived from the fundamental sequence for ).
  • The canonical sequence for is: , , ... (this is derived from the fundamental sequence for , which was given above).

Even though the Bachmann–Howard ordinal itself has no canonical notation, it is also useful to define a canonical sequence for it: this is , , ...

A terminating process[edit]

Start with any ordinal less than or equal to the Bachmann–Howard ordinal, and repeat the following process so long as it is not zero:

  • if the ordinal is a successor, subtract one (that is, replace it with its predecessor),
  • if it is a limit, replace it by some element of the canonical sequence defined for it.

Then it is true that this process always terminates (as any decreasing sequence of ordinals is finite); however, like (but even more so than for) the hydra game:

  1. it can take a very long time to terminate,
  2. the proof of termination may be out of reach of certain weak systems of arithmetic.

To give some flavor of what the process feels like, here are some steps of it: starting from (the small Veblen ordinal), we might go down to , from there down to , then then then then then then then and so on. It appears as though the expressions are getting more and more complicated whereas, in fact, the ordinals always decrease.

Concerning the first statement, one could introduce, for any ordinal less or equal to the Bachmann–Howard ordinal , the integer function which counts the number of steps of the process before termination if one always selects the 'th element from the canonical sequence (this function satisfies the identity ). Then can be a very fast growing function: already is essentially , the function is comparable with the Ackermann function