[ UNDER CONSTRUCTION ]
I couldn't find a good explanation of Loader's number anywhere online, so I decided to make one of my own. Specifically, I hope to explain what's going on in loader.c and how it is able to generate such large numbers. I read through and comprehended the full.c provided by Loader, and then I rewrote it in a more readable and understandable format. I was mostly inspired by the incompleteness of LittlePeng9's analysis. For reference, Loader's commented version of his code can be found here. So, without further ado...
Lambda Calculus
Loader's number is defined by diagonalizing over the Calculus of Constructions (often shortened as λPω, CoC, and sometimes C). So, before we can even talk about Loader's number, we need to understand what the Calculus of Constructions is.
The Calculus of Constructions is a typed lambda calculus that happens to be both highly expressive and strongly normalizing. But that doesn't really mean anything to anyone who isn't well versed in type theory. So I'll be explaining each of these terms.
Type theory is an alternative to set theory in serving as the foundation upon which we describe all of mathematics. Set theory goes about this by making the basic unit of everything a collection of objects (order does not matter). These collections are called sets, and they can only contain other sets. So, naturally, we can only begin building our universe in set theory with the empty set. From this empty set, you can build definitions for anything in math, like natural and ordinal numbers (due to von Neumann), booleans, functions, vectors, etc.
Type theory, on the other hand, does this by making the basic unit of everything the term. A term can be a name, a variable, or a bunch of terms joined with symbols. Every term has a type. This is a mostly intuitive concept that is exactly like how it sounds. For example, the type of two is a natural number. This fact is called a typing judgement, and is expressed in symbols as $2 : \mathbb{N}$.
Lambda calculus is another system of mathematical logic that can be used to model computation. Terms are constructed using a combination of three things: variables, function definitions, and function application. However, in lambda calculus, functions have no names. Instead, functions are defined at the same place that they are applied. The usual syntax for a lambda function is $\lambda x.\ M$, where x is the parameter and M is the body. The application of a function $f$ to some input $E$ is notated as $(f\ E)$.
Lambda calculus has a rule for evaluating terms by reducing them down to an irreducible form. This form is called a normal form, and this rule of reduction is called β-reduction. This rule states that $(\lambda x.\ M)\ E$ can be evaluated to $M[x := E]$. That is, the entire term is equivalent to M, but with every instance of x replaced with E.
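A quick illustration (my own example): applying $\lambda x.\ f\ (f\ x)$ to a term $a$ β-reduces as $(\lambda x.\ f\ (f\ x))\ a \;\to_\beta\; f\ (f\ a)$, since every $x$ in the body gets replaced with $a$.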
Typed lambda calculus is very similar. It is a merger between lambda calculus and type theory, where all terms in lambda calculus must have a type. This includes variables, lambda functions, and the variables bound as parameters. When defining a function, you now have to specify the type that the parameter must have. For example, the successor function in typed lambda calculus is written as $\lambda n{:}\mathbb{N}.\ n + 1$.
The Calculus of Constructions
Now you might ask: Why would we want to add types to lambda calculus? The answer to that question is that programming makes extensive use of type theory already, and untyped lambda calculus is vulnerable to infinite loops. As a simple example of an infinite loop, consider the following function:
$\lambda x.\ (x\ x)$,
and watch what happens when we apply it to itself:
$(\lambda x.\ (x\ x))\ (\lambda x.\ (x\ x)) \;\to_\beta\; (\lambda x.\ (x\ x))\ (\lambda x.\ (x\ x)) \;\to_\beta\; \cdots$
It just keeps going, and β-reduction never ceases! There are a few ways to solve this problem, and λPω offers a solution that still lets you express a highly inclusive range of computations. It can't express all computations: since it has no infinite loops, it is not Turing complete.
How λPω does this is by defining a new kind of type called dependent types. Dependent types are basically the types assigned to functions. A dependent type is composed of a variable x, a type A, and a family of types B(x). By a family of types, I mean an association of each term a:A with a type B(a). Symbolically, such a dependent type is represented as $\Pi x{:}A.\ B(x)$. Any function that takes in a parameter of type A and returns a value of type B(x) has this dependent type. E.g., the successor function is typed as $\Pi n{:}\mathbb{N}.\ \mathbb{N}$. Dependent types are often referred to as pi-types and product types. Sometimes, a right arrow is used as a shorthand or convenient notation for a pi-type whose body does not use the bound variable: $A \to B$ means $\Pi x{:}A.\ B$.
Notice how you can't make a function that takes itself as a parameter. To see this, try to find the type of a lambda function that can take itself as a parameter...
You can't! You would just get an infinite regression of nested pi-types.
λPω also defines two new types: $*$ and $\square$. $*$ is the type of all types of terms. That is, if A is a type of terms, then we can make the judgement $A : *$. On the other hand, $\square$ is the type of all types of types. So, if A is a type of terms, B is a type of types, and we assume the judgement $A : B$, then we can then make the judgement $B : \square$.
λPω then proceeds with a series of inference rules. That is, rules for determining what you are allowed to deduce from certain givens. The rules are as follows.
1. $\vdash * : \square$
2. $\dfrac{\Gamma \vdash A : K}{\Gamma,\ x{:}A \vdash x : A}$
3. $\dfrac{\Gamma \vdash A : K \qquad \Gamma \vdash M : B}{\Gamma,\ x{:}A \vdash M : B}$
4. $\dfrac{\Gamma,\ x{:}A \vdash M : B \qquad \Gamma \vdash (\Pi x{:}A.\ B) : K}{\Gamma \vdash (\lambda x{:}A.\ M) : (\Pi x{:}A.\ B)}$
5. $\dfrac{\Gamma \vdash A : K_1 \qquad \Gamma,\ x{:}A \vdash B : K_2}{\Gamma \vdash (\Pi x{:}A.\ B) : K_2}$
6. $\dfrac{\Gamma \vdash M : (\Pi x{:}A.\ B) \qquad \Gamma \vdash N : A}{\Gamma \vdash (M\ N) : B[x := N]}$
So what do all these symbols mean? Above the line is what you are given, and below is what you are allowed to infer or deduce from those givens. $\Gamma$ here is just a list of typing judgements: $x_1{:}A_1,\ x_2{:}A_2,\ \ldots,\ x_n{:}A_n$. This list of judgements is called a context. The turnstile, $\vdash$, just says that, based on the context given, every typing judgement after the turnstile is valid.
In the above, K (and $K_1$, $K_2$) is used as a symbol to mean either $*$ or $\square$.
So, what do the rules mean then? Rule 1 just says that, regardless of anything, $* : \square$ is always a valid judgement. Rule 2 says that it doesn't matter whether $x{:}A$ is already in our context; in order for $x : A$ to be a valid judgement, A needs to satisfy either $A : *$ or $A : \square$. Rule 3 says that if we have proven that something is indeed a valid type, then we can bind a variable to that type in the context. Rules 4 and 5 establish a definition of the pi-type as the type ascribed to lambda functions. Rule 6 establishes how you should type an application of a lambda function.
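As a small worked example of my own, here is how Rules 1, 2 and 4 combine to type the polymorphic identity function; the same judgement will reappear later as the theorem proven by the number 217:
$\vdash * : \square$ (Rule 1)
$A{:}* \vdash A : *$ (Rule 2)
$\vdash (\lambda A{:}*.\ A) : (\Pi A{:}*.\ *)$ (Rule 4; its second premise, $\vdash (\Pi A{:}*.\ *) : \square$, follows from Rules 1, 3 and 5)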
De Bruijn indices
Before we jump into Loader's program, we have one thing left to clear up. Loader uses a specific notational convention for variables called De Bruijn indices. Basically, this system replaces all variable names with indices. The index of a variable represents how many more recently bound variables there are. So, an index of 0 is the inner-most bound variable. As an example, consider the following lambda expression:
Here, we see the lambda and pi expressions being composed of a head and a body. With our new variable notation, we omit the variable name in the head. What is left is just the type of the variable currently being bound. In the body of the inner-most lambda expression, we see the use of two variables, variable 0 and variable 1. Variable 0 is bound to the inner-most lambda, like so:
So variable 0 has a type equal to variable 1 within the head of the inner-most lambda expression. I'll deal with this case later. First, let's look at the other variable in the body. It should be bound to the next lambda or pi expression after the inner-most one.
It wouldn't be the pi expression, because that is under the boxed lambda expression. Variable indexing only goes outward, but to reach the pi, you would have to go outward two and in one. So now we need to deal with the case of a variable appearing in the head of a pi or lambda expression. This case can be thought of like this: 0 refers to the last bound variable, but in the head, the variable is currently being bound but is not yet bound. Hence, you have to keep stepping outwards until you find yourself in the body of an expression instead of the head. So, we have the following variable bindings:
If you're an experienced programmer, then you can also think of variable bindings as a stack. Each time you enter the body of an expression, a variable binding is added to the stack. Index 0 refers to the top of the stack.
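For instance (a standard example of my own, not one of the figures above): the ordinary term $\lambda x.\ \lambda y.\ x\ (y\ x)$ becomes $\lambda.\ \lambda.\ 1\ (0\ 1)$ with de Bruijn indices. Inside the inner body, y is the most recently bound variable and so gets index 0, while x, bound one binder further out, gets index 1.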
Once you understand De Bruijn indices and typed lambda calculus, then we can finally jump into Loader's program and examine it.
Curry-Howard Isomorphism
Maybe? I'm not sure about adding this. This is already getting a bit long.
Loader's Program
In loader.c, Loader defined some helper functions and macros. The first of these helpers implement the binary tree encoding. Earlier, I mentioned that I rewrote loader.c. The subsequent blocks of code will be snippets of my rewrite of that code unless stated otherwise.
Encoding
So, his definitions are (from full.c):
typedef int Tree;
typedef int INT;
typedef int TREE;
typedef int BitStream;
#define DESCEND xx
Tree lastRight, accumulate;
And the helper functions are:
TREE Pair(TREE yy, TREE xx) { return yy - ~yy << xx; }
TREE R(TREE xx) { return xx % 2 ? 0 : 1 + R(xx / 2); }
TREE L(TREE xx) { return xx / 2 >> R(xx); }
At the beginning, Loader defines a few types to make what he's doing make a little more sense. TREE and Tree are used to indicate that the variable is supposed to store a binary tree. BitStream says that the variable is a tape to be read and processed, bit by bit. He also defines DESCEND to be xx, which will only be used once in the Derive function. It is used as a condition to check whether to continue a loop, which "descends" down the bit stream or "moves along" the tape. He then defines lastRight. This is defined for golfing purposes, to save characters. The lastRight variable stores either the return value of the last call to R, or what the return value of R would have been if the last call to L was replaced with R. These functions I will get into soon. The accumulate variable is the return variable that will end up growing large.
The function Pair takes two values, yy and xx, and encodes them together into a single integer. The encoding it uses is $\mathrm{Pair}(yy, xx) = (2 \cdot yy + 1) \cdot 2^{xx}$, which is exactly what yy - ~yy << xx computes. Each pair has a unique representation, and every positive integer represents a pair. The two values you put into the pair could be pairs themselves as well. Hence, what you have is a way to encode nested pairs, i.e. binary trees. The R and L functions return the right and left components of the pair xx, respectively.
The binary trees are going to be used as syntax trees. The syntax trees are represented as follows:
- the sort $*$ is the number 7, and the sort $\square$ is 14;
- the variable with de Bruijn index n is Pair(4 + 2n, 0);
- the pi-type $\Pi A.\ B$ is Pair(0, Pair(A, B));
- the lambda function $\lambda A.\ B$ is Pair(1, Pair(A, B));
- the application $(F\ X)$ is Pair(2, Pair(F, X)).
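To make the encoding concrete, here is a small stand-alone C sketch of my own (a test harness, not part of loader.c or of my rewrite) that round-trips a pair and decodes the constants that will show up later. Loader writes (2*yy + 1) << xx as yy - ~yy << xx purely to save characters.

#include <stdio.h>

/* The same pairing as loader.c, written out readably:
   Pair(y, x) packs two numbers into one as (2y + 1) * 2^x,
   R recovers x (the number of trailing zero bits), and L recovers y. */
typedef unsigned long TREE;

TREE Pair(TREE yy, TREE xx) { return (2 * yy + 1) << xx; }
TREE R(TREE xx)             { return xx % 2 ? 0 : 1 + R(xx / 2); }
TREE L(TREE xx)             { return xx / 2 >> R(xx); }

int main(void) {
    TREE p = Pair(3, 5);                                  /* (2*3 + 1) * 2^5 = 224 */
    printf("Pair(3,5) = %lu, L = %lu, R = %lu\n", p, L(p), R(p));
    printf("7  decodes to (%lu, %lu)\n", L(7),  R(7));    /* (3, 0): the sort *    */
    printf("14 decodes to (%lu, %lu)\n", L(14), R(14));   /* (3, 1): the sort BOX  */
    printf("9  decodes to (%lu, %lu)\n", L(9),  R(9));    /* (4, 0): variable 0    */
    return 0;
}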
Normalization
It is here that Loader begins to define the helper functions used to reduce terms and obtain their normal form. He defines two functions, Subst and Apply, which, as I will explain, are used together to perform recursive β-reduction. First, Loader defines the following macro:
// Increment the index of each variable in xx.
#define Lift(xx) Subst(0, 13, -1, xx)
Now we see a definition of a macro Lift. It uses a function called Subst, which has not been defined yet. What Lift does, as will be seen, is go through xx and increment every variable by 1. It replaces variable 0 with the term 13, which is the same as Pair(6, 0), or variable 1. It then increments every variable index above 0 by 1.
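For example, Lift applied to variable 0, encoded as 9 = Pair(4, 0), immediately hits the "VAR = vv" case of Subst below and returns 13 = Pair(6, 0), i.e. variable 1; applied to variable 1 it returns Pair(8, 0) = 17, i.e. variable 2.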
// Substitute all instances of variable vv with yy inside term.
// Normalizes result if term and yy are normal.
TREE Subst(INT vv, TREE yy, INT context, TREE term) {
    TREE aux = L(term);
    TREE xx = R(term);
So now we define a function called Subst, which searches the syntax tree term for every instance of the variable with index vv, and replaces each instance with a copy of the syntax tree yy. It then goes through every variable whose index is greater than vv, and decreases that index by the size of the context being used. At the same time, Subst also performs β-reduction until the expression reaches its normal form (that is, provided yy and term are already in their normal forms).
    // APPLY: beta reduction
    if (aux == 2)
        return Apply(
            Subst(vv, yy, context, L(xx)),
            Subst(vv, yy, context, R(xx))
        );
Here we see how it searches through the tree. It descends through each of the terms the tree is composed of by accessing them with L and R and recursing through each of these elements. Recall that aux is the left element of the term, and xx is the pair of terms joined by the operation represented by aux. In this case, it checks if aux is 2. That means the term is an application of a lambda function on a term. Here we see it use a function called Apply, which has not been defined yet. It then proceeds to substitute everything inside using the same substitution rules.
    // PI or LAMBDA: de Bruijn indices
    else if (aux < 2)
        return Pair(
            aux,
            Pair(
                Subst(vv, yy, context, L(xx)),
                Subst(vv + 1, Lift(yy), context, R(xx))
            )
        );
Now it checks if aux < 2, i.e. a pi-type or a lambda function. In this case, notice how we descend into the body of the lambda or pi term. When we call Subst, we increment the variable we're searching and replacing! On the other hand, when we descend into the parameter type, we don't increment. The reason why we do this is because of the convention of de Bruijn indices. Consider the example:
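Here is an example term of my own (written with de Bruijn indices) that the walkthrough below can be read against: $\lambda\,{*}.\ \lambda\,(\Pi\,0.\ 1).\ \lambda\,1.\ 0$ — an outermost lambda binding a type, a middle lambda whose parameter type is a pi-type mentioning that type, and an innermost lambda whose parameter type mentions it again.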
Let us apply this term to a type, say, A. How we do that is by replacing every instance of the 0th variable and removing the outermost lambda or pi. We descend into the left part of the middle lambda. Since this is where we are defining and binding the parameter, we have not yet actually bound the 0th variable to the middle lambda. Then we descend into the pi-type. Again, starting into the left, the 0th variable still refers to the parameter of the outermost lambda. We replace that with A, and go back out.
We next go into the right of the pi-type. Now, we have bound the 0th variable to the parameter in the pi-type, and this is why we increment when we descend into the right component. Because now, instead of replacing the 0th variable, we want to replace the 1st variable. So we do that,
and then we go back out, into the body of the middle lambda, and into the parameter type of the inner lambda. The 0th variable is bound to the middle lambda, but not yet bound to the inner lambda. We replace the 1st variable with A, and move out. Now we are done, because we increment again to search for 2nd variables in the body of the inner lambda, but there are none. So now we just drop the outer lambda and voila!
Notice how we also perform a Lift on the tree we are using to substitute the variable. This is because as we keep descending inward into the term, we are burying ourselves deeper below bound variables. Hence, the things those variables in yy are actually supposed to refer to are getting further and further away. We need to account for this by Lifting yy.
    // VAR = vv: substitute yy for vv
    else if (aux == 4 + 2*vv)
        return yy;
This is where we do the actual substitution. We found that aux is $4 + 2 \cdot vv$, which means this encodes the variable of index vv. We return yy in place of this variable to substitute the tree yy for the variable.
    // VAR > vv: shift variables down by the size of the context
    else if (aux > 4 + 2*vv)
        return Pair(aux - 2*context, 0);

    return term;
}
Now we shift all variables with indices above vv down by the size of the context being used in the substitution, which is usually going to be 1. The reason why we do this is that by substituting each instance of variable vv, that variable is no longer bound to anything. We must then shift all of the variables above it down to adjust for this change.
If none of the above conditions are met, then your term is already normalized. Nothing has to be done to it, so we just return it.
// Apply lambda expression yy to the term xx.
// IE: (\A.B xx) => B[0:=xx]
TREE Apply(TREE yy, TREE xx) {
    if (L(yy) == 1)
        return Subst(0, xx, 1, R(R(yy)));
    else
        return Pair(2, Pair(yy, xx));
}
This function, Apply, does what it says on the tin. It applies the lambda function yy to the term xx. Again, this is done by replacing every instance of variable 0 in the body of the lambda function yy with xx. So we make a call to Subst(0, xx, 1, R(R(yy))) to do exactly that. We only do that, however, if yy actually is a lambda function, which we check with L(yy) == 1. Otherwise, we just return the syntax tree representing this application as it is, since it cannot be reduced any further.
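As a quick check (my own example): applying the encoded identity function $\lambda\,{*}.\ 0$, i.e. the tree Pair(1, Pair(7, 9)), to the term 7 (the sort $*$) calls Subst(0, 7, 1, 9); that call immediately hits the "VAR = vv" case and returns 7, so the application β-reduces to $*$ as expected.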
The Derive function
Now we finally arrive at the main function used to compute Loader's number: his Derive or D function. This is no doubt the most interesting function in the program.
// xx is a bit stream that encodes a proof of a theorem,
// and Derive(xx) returns the theorem that xx proves.
// Within xx are proofs of lemmas that are used to prove
// the theorem.
TREE Derive(BitStream xx) {
Essentially, what Derive does is it takes in a stream of bits xx. It scans through each bit, using a certain encoding to read it. xx encodes the proof of a theorem, and Derive traces this proof. Derive is also a recursive function, as you will see later. The reason why it is recursive is because xx contains within itself bit streams of proofs of different theorems. Those theorems inside xx are then used as lemmas to help prove the main theorem. That, then, is why we call Derive recursively: to trace the proof of these lemmas and obtain the lemma, so that Derive may use the lemmas in the proof. The final return value of Derive is the theorem itself that xx proves.
The encoding used for theorems is Pair(term, Pair(type, Pair(leftover bit stream, context))), where the leftover bit stream is whatever part of xx remains after the proof has been read.
TREE lemma = 0, auxTerm, auxType, term = 7, type = 14, context = 0;
We define some variables here. lemma is used to store our helper theorem. auxTerm and auxType are the term and the type in the typing judgement that lemma makes. Likewise, term and type are the typing judgement that has been proven so far. context encodes our typing context. To add a typing judgement $x{:}A$ to a context $\Gamma$, you make the pair Pair(A, $\Gamma$). This binds the 0th variable to the type A, and increments the index of all other variables in the context by 1. So, a context of $x_1{:}B,\ x_0{:}A$ (where $x_0$ is the most recently bound variable) would be represented in the encoding as Pair(A, Pair(B, 0)). When the context is 0, that just means it is empty: there are no typing judgements in the context.
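As a concrete instance (my own): a context holding a single variable of type $*$ is encoded as Pair(7, 0) = 15, and pushing a second variable whose type is variable 0 (the first variable) on top of it gives Pair(9, 15).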
So, the way these variables are initialized tells us that we can just start out with the assumption that $\vdash * : \square$.
    // Read through bit stream until it is told to stop
    for ( ; xx % 2 ; xx /= 2) {
Now we start our loop. Notice how we are dividing by two on each pass. Why are we doing this?
Remember, xx is a bit stream. So you have to imagine that xx is a tape being fed into a machine. Dividing by two just means that the machine is eating the tape and moving it over to read the next bit. That is what xx % 2 does: read the current bit. From now on you'll be seeing a lot of (xx /= 2) % 2. Whenever you see that, it just means that the machine is moving the tape forward and reading the next bit.
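This tape-reading idiom is easy to see in isolation; the following tiny C snippet of my own (not part of loader.c) just prints the bits of a number in the order Derive would consume them:

#include <stdio.h>

int main(void) {
    unsigned long xx = 217;       /* 11011001 in binary; the same number used in the Examples section */
    while (xx) {
        printf("%lu", xx % 2);    /* read the bit currently under the head */
        xx /= 2;                  /* eat that bit and move the tape forward */
    }
    printf("\n");                 /* prints 10011011: the bits of 217, least significant first */
    return 0;
}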
So this loop condition means that, at certain intervals, the machine interprets the next bit to be a flag that tells it whether to stop there or not. There may or may not be stuff on the tape after it stops. If there are, the theorem being proven is likely a lemma in a larger proof.
        // Parse a section of xx to retrieve a lemma or
        // theorem needed for our proof.
        lemma = Derive(xx /= 2);
        // Next term
        auxTerm = L(lemma);
        // Type of auxTerm
        auxType = L(R(lemma));
        // Left-over bit stream
        xx = L(R(R(lemma)));
Now we extract the lemma from the bit stream xx. We then assign auxTerm and auxType accordingly. We then move the tape over to the end of the proof of this lemma, as L(R(R(lemma))) stores the bit stream left over after parsing.
If you don't want to use a lemma on this step of the proof, just leave the next bit 0. Derive will halt immediately and return the axiom $\vdash * : \square$.
        // INFERENCE RULE: Application
        //   Gamma :- X : (PI T.Z), S : T
        //   => Gamma :- (X S) : Z[0:=S]
        // Context must be equal to context given in lemma
        if (context == R(R(R(lemma)))) {
This is where we begin to implement our first inference rule. Basically, the next few bits on the tape will store data about which inference rules to use to move onto the next step in the proof. This inference rule is listed above as Rule 6. We have to make sure that both contexts are equal to each other in order for the precondition of this inference rule to be satisfied. Recall that both $\Gamma$s above the line are the same context.
This is where we may be using our lemma. This is the only inference rule that invokes a separate theorem or lemma.
            // Type must be Pair( 0 , Pair( auxType , - ) )
            // IE: type = PI(auxType, -)
            // NEXTBIT
            if (L(type) == 0 && L(R(type)) == auxType && (xx /= 2) % 2) {
                // TYPE: PI(auxType, tt) => tt[0:=auxTerm]
                // TERM: (term auxTerm)
                type = Subst(0, auxTerm, 1, R(R(type)));
                term = Apply(term, auxTerm);
            }
This is where we actually use the inference rule. In the rules listed above, we let X be term, Y be type, S be auxTerm, T be auxType, Z be a wildcard for any term or type, x be the 0th variable, and K be either $*$ or $\square$. We will use this correspondence for all inference rules.
So, inspecting the code, we see that it verifies that term has type $\Pi\,T.\ Z$, i.e. a pi-type whose parameter type is auxType. If so, then we have satisfied the conditions for using the inference rule.
To actually use the inference rule, we then change the term to the application of X on S, and change type to the body of the original pi-type. We also substitute each instance of variable 0 with auxTerm in type. So what we are left with is $\Gamma \vdash (X\ S) : Z[0 := S]$.
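For a concrete instance (my own example): suppose term X is the identity $\lambda A{:}*.\ A$ with type $\Pi A{:}*.\ *$, and the lemma proves $\Gamma \vdash S : *$ for some type S. Then auxType matches the pi's parameter type, and the rule produces the new judgement $\Gamma \vdash (X\ S) : *$; Apply then normalizes the new term $(X\ S)$ to just S.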
            // If auxType is STAR or BOX and NEXTBIT
            // Then Gamma => Gamma, auxTerm
            //      term => Lift(term)
            //      type => Lift(type)
            if ((xx /= 2) % 2 && (auxType & 2)) {
                context = Pair(auxTerm, context);
                term = Lift(term);
                type = Lift(type);
            }
        }
Now we have the option of adding a variable of type auxTerm to the context (Rule 3). We can do this, as in this code we make sure that auxTerm has been proven to have type $*$ or $\square$. Hence, auxTerm is a valid type for a variable. To tell Derive that we do want to invoke Rule 3 to introduce another variable, we set the next bit to 1. The code then adds auxTerm to the context. Now that another variable is judged in the context, we have to increment every variable in term and type to adjust. So we Lift them.
        // INFERENCE RULE: Lambda introduction (strong!)
        //   Gamma, A :- X : Y
        //   => Gamma :- (\A.X) : (PI A.Y)
        // INFERENCE RULE: Pi formation
        //   Gamma, A :- X : K
        //   => Gamma :- (PI A.X) : K
        // If context is non-empty and NEXTBIT
        if (context && (xx /= 2) % 2) {
Next, we implement Rules 4 and 5. In order to use these inference rules, there must be some type A in the context. Hence, we must make sure that the context is non-empty, and then check the tape to see if we use this step in the proof.
            // PI: Type is STAR or BOX and xx does not specify lambda introduction
            // LAMBDA: Type is not STAR or BOX, or xx specifies lambda introduction
            if ((xx /= 2) % 2 || !(type & 2)) {
                term = Pair(1, Pair(L(context), term));
                type = Pair(0, Pair(L(context), type));
            } else {
                term = Pair(0, Pair(L(context), term));
            }
            // Remove the context item we just used
            context = R(context);
        }
Here, the machine reads the next bit on the tape. If it reads 1, then that means lambda introduction (Rule 4) is definitely the next step in the proof. Otherwise, it might be pi formation (Rule 5). However, in order for pi formation to work, our term X must have type $*$ or $\square$. This can be checked using type & 2. If X is not either of those, then pi formation is not the next step, and the next step must then be lambda introduction.
We do not need to change the type in the case of pi formation, since the type is already $*$ or $\square$.
Before applying the inference rule, our context was $\Gamma,\ x{:}A$. After the rule, our context is just $\Gamma$. We must account for this in our code by removing the item from the context, setting context = R(context). Notice how, even though we are removing something from the context, we are not decrementing the variable indices in our term and type. That's because we are also adding a lambda or pi, which binds that same variable.
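(In the walkthrough of the number 217 in the Examples section, this is exactly the step that takes $A{:}* \vdash A : *$ to $\vdash (\lambda A{:}*.\ A) : (\Pi A{:}*.\ *)$.)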
Here, we see Loader break from the normal rules of λPω. If we look back at Rule 4, the abstraction rule, we see that there should actually be two preconditions:
$\Gamma,\ x{:}A \vdash M : B \qquad \text{and} \qquad \Gamma \vdash (\Pi x{:}A.\ B) : K$.
Where did the second one go? By lifting this restriction, the result is that it is possible to infer a term with a type of, say, $\Pi x{:}A.\ \square$. This shouldn't be possible in λPω, because what would be the sort of that type? It would have the same sort as $\square$, but $\square$ doesn't have a sort.
Loader appears to be aware of this deviation from the normal rules of λPω, as in his full.c, he makes the following comment at this point:
// We allow LAMBDA introduction whenever the context is non-empty.
// This extension is a conservative extension of CoC.
It appears that Loader is making the claim that this extension of λPω conserves the nice properties that we want from it. Let us examine this claim, then.
The omission of the second condition for Rule 4 may actually make the language even more expressive, which could serve the purpose of the program well. The problem with strengthening a language or theory, however, is the risk of making it too strong, i.e. inconsistent.
The extension of λPω formed by Loader's modified rule set is essentially equivalent to (or a subset of) adding another sort $\triangle$, the axiom $\square : \triangle$, and the rules $(*, \triangle)$ and $(\square, \triangle)$. The main thing to worry about is Girard's paradox, but the rule needed to construct a term of an empty type such as $\Pi A{:}*.\ A$ is $(\triangle, \square)$, which is not among the added rules. This extension of λPω forms a weaker subset of the Extended Calculus of Constructions, which has been studied extensively and is known to be consistent. Based on that, I think it's safe to say that Loader's extension is indeed conservative.
        // INFERENCE RULE: Restriction of types to STAR or BOX
        //   Gamma :- X : K
        //   => Gamma, X :- x : X
        // If type is STAR or BOX and NEXTBIT,
        // Then add term to the context
        //      new term is x
        //      type is old term
        if ((xx /= 2) % 2 && type & 2) {
            context = Pair(term, context);
            type = Lift(term);
            term = Pair(4, 0);
        }
    }
This part implements Rule 2. If the next bit on the tape reads 1, then that tells us that the next step is to use this rule of inference. But this rule also requires type to be $*$ or $\square$, so we check for that as well. We add our term X to the context. Our new term is x, which is variable 0. Our new type is our old term. We have to Lift it as well, since we are adding another judgement to our context, which uses up another variable.
    TREE theorem = Pair(term, Pair(type, Pair(xx, context)));
    theory = Pair(theorem, theory);
    return theorem;
}
After going through the loop until the tape signals a halt, we arrive at a final theorem that states that $\Gamma \vdash \text{term} : \text{type}$. Using the encoding above, we define a variable called theorem, within which we store our theorem AND the remaining tape left over after running through the proof. The reason why we put the rest of xx in there is in case this theorem is actually a lemma used in a longer proof. In that case, the xx is still needed by the parent call. Finally, we chuck our theorem into a global variable called theory (accumulate in full.c), and return the theorem we have just proven. theory represents all of the theorems that have so far been proven. It is this theory variable that ends up actually becoming Loader's number.
Examples
Let's try some test runs, then, shall we? Remember that xx represents a tape onto which a proof is encoded. So log(xx) is approximately the length of our proof. Let's try an empty proof. What statement does nothing prove? Well, if we run Derive(0), it never enters the loop. The machine is indicated to halt immediately. "Ok, theorem proven." So let's skip over to the end. Our term is $*$, our type is $\square$, and our context is empty. So, as it turns out, nothing proves $\vdash * : \square$! This is reflective of Rule 1.
Let's try running it on 217, which is 11011001 in binary. We start with a 1, which tells Derive that there's more to prove than just $* : \square$. We jump into the loop and run Derive(1101100b), which simply proves $\vdash * : \square$, as its first bit is 0. Move the tape forward and read. We don't add auxTerm to the context (xx = 110110), so proceed. We have no context, so skip the next conditions. Now we can introduce a variable. Move the tape forward and read (xx = 11011), so we do introduce a variable. Now our statement is $A{:}* \vdash A : *$. Again, move the tape forward and read, testing if we continue (xx = 1101). We do, so we prove our lemma, Derive(110b). The context is different from the lemma's context, so we skip the first block. We do have a context now, so move forward and read (xx = 11). Now to determine if we do lambda introduction or pi formation. We read again (xx = 1), and we do lambda introduction. The rest is all zeroes, so that is where the proof ends.
So, in the end, the number 217 proves $\vdash (\lambda A{:}*.\ A) : (\Pi A{:}*.\ *)$.
Sense of Scale
"But wait!" you interject. "What makes this function grow so quickly?"
At first glance it's hard to see how this function grows quickly. But we can make sense of a lower bound for its growth rate using Church numerals. Church numerals are a way to represent natural numbers in lambda calculus. Using Church encoding, a natural number n is a lambda function that receives a function and a term, and, when applied, evaluates to the n-fold composition of the given function acting on the given term. To get a sense of what this means, here are a few examples:
$0 = \lambda f.\ \lambda x.\ x$
$1 = \lambda f.\ \lambda x.\ f\ x$
$2 = \lambda f.\ \lambda x.\ f\ (f\ x)$
$3 = \lambda f.\ \lambda x.\ f\ (f\ (f\ x))$
Of course, we can't do this with typed lambda calculus. In λPω, the number 1 would look more like this:
$\lambda A{:}*.\ \lambda f{:}(A \to A).\ \lambda x{:}A.\ f\ x$.
Therefore, with this encoding, we can define the natural number type to be
$\mathbb{N} := \Pi A{:}*.\ (A \to A) \to A \to A$.
Now, consider how Church numerals would be represented using the encoding Loader defined. Since they are represented as nested pairs, and pairs are represented using powers of two, a natural number N has a Loader encoding value of roughly $2 \uparrow\uparrow 2N$ — a power tower of 2s whose height grows linearly with N. Likewise, the type $\mathbb{N}$ has a fixed encoding value that is itself a power tower of 2s several levels tall. So if a bit stream xx proves $\vdash N : \mathbb{N}$, then Derive(xx) returns a number of roughly that same tower-sized magnitude.
If we can write a short proof of $\vdash n : \mathbb{N}$ for large n, then Derive should grow fairly quickly. Suppose that we have already proved $\vdash n : \mathbb{N}$. Then how do we go about proving $\vdash n+1 : \mathbb{N}$? We can do this by proving that the successor function succ is typed $\mathbb{N} \to \mathbb{N}$. Then, by Rule 6, $\vdash (\mathrm{succ}\ n) : \mathbb{N}$. And, since $\mathrm{succ}\ n$ is β-equivalent to $n+1$, we can conclude that $\vdash n+1 : \mathbb{N}$.
The takeaway here is not that we can prove $\vdash n+1 : \mathbb{N}$ from $\vdash n : \mathbb{N}$, but rather the method we used to prove it. In particular, we used a function of type $\mathbb{N} \to \mathbb{N}$ to prove our theorem. Replace the successor function with any rapidly growing function, and we start to see just how extremely powerful Derive is.
Every step in a proof takes up 1 or 2 bits in a bit stream. That means that, if S is the total number of steps in the proofs of all lemmas used to prove a theorem, including the theorem itself, then the bit stream representing the proof of said theorem should be at most about $2^{2S}$. So if xx proves $\vdash N : \mathbb{N}$ for even a moderately large N, then Derive(Derive(xx)) can prove theorems that require more than googolplexplexplex steps! In David Moews' commentary on Loader's program, he mentions that it is possible to prove in a reasonable amount of steps. With a power tower level of proof length, we can go much, much further than that.
Wrapping up
So, I made a few functional modifications to the Derive function. In the original function, Derive started reading the bit stream from the second bit. Another change I made was the removal of an element of recursion. The original performed a proof search by not only proof-checking xx, but also all numbers below xx. This massively inflates theory even more, and makes sure that it catches some important theorems. Many long, redundant proofs may result in very short theorems. To counter the possibility that Derive may disappoint by proving such a short theorem, the original had a Derive(xx-1) term inside the body of its loop.
Another notable thing that Loader did differently was that Derive returned the theory itself rather than the theorem on its own. This was likely done to save characters.
Loader employed a clever method to reach great heights in very few characters. Its concept is reminiscent of that of Rayo's number. However, Loader was under the restriction of having to use computable methods. Instead of calculating the "smallest number that cannot be described in fewer than n symbols in first-order set theory," Loader instead created a function to compute the "largest number that can be proven to exist in less than log(n) steps in the Calculus of Constructions."
Appendix
Loader's rule set
Loader's rule set for his extension of λPω is
1. $\vdash * : \square$
2. $\dfrac{\Gamma \vdash A : K}{\Gamma,\ x{:}A \vdash x : A}$
3. $\dfrac{\Gamma \vdash A : K \qquad \Gamma \vdash M : B}{\Gamma,\ x{:}A \vdash M : B}$
4. $\dfrac{\Gamma,\ x{:}A \vdash M : B}{\Gamma \vdash (\lambda x{:}A.\ M) : (\Pi x{:}A.\ B)}$
5. $\dfrac{\Gamma,\ x{:}A \vdash B : K}{\Gamma \vdash (\Pi x{:}A.\ B) : K}$
6. $\dfrac{\Gamma \vdash M : (\Pi x{:}A.\ B) \qquad \Gamma \vdash N : A}{\Gamma \vdash (M\ N) : B[x := N]}$
where $K$ stands for either $*$ or $\square$.
0 : nat
Here is a proof of $\vdash 0 : \mathbb{N}$, using the Church encoding for numbers.
In bit stream format (my slightly tweaked version), the proof would be
.
In brackets are the proofs of lemmas. I use subscripts to denote what the lemmas actually are, using the numbering in the proof above. The decimal value of this bit stream is
.
Construction of fast-growing functions
Alemagno12 is working on ways to construct fast-growing functions in λPω in his blog post here.