H) = T. Let H₁ be any member of K such that HRH₁, where H = l(H'). Then either H = H₁, or HSⁿH₁ for some n > 0: in either case there exists H₁' ∈ K' such that H'R'H₁' and H₁ = l(H₁'). By assumption, either η(A, H₁') = F or η(B, H₁') = T, whence, by the induction hypothesis, either φ(A, H₁) = F or φ(B, H₁) = T. Since H₁ was arbitrary (subject to HRH₁), φ(A ⊃ B, H) = T. Conversely, suppose η(A ⊃ B, H') = F. Then for some H₁' such that H'R'H₁', η(A, H₁') = T and η(B, H₁') = F. By the induction hypothesis, φ(A, l(H₁')) = T and φ(B, l(H₁')) = F; since l(H')Rl(H₁'), φ(A ⊃ B, l(H')) = F, as desired. The case of ¬ is quite similar. Q.E.D.
Notice that the situation contrasts with that in S4, where it is often impossible to replace an arbitrary finite model by an equivalent finite tree model (cf. [2]). The third part of Theorem 1 extends the procedure for finding a tree model equivalent to an arbitrary model to quantificational models. Here we cannot use the same construction as a tree q. model and as a Beth q. model, as will be seen when we define the latter, in preparation for the fourth part of Theorem 1. THEOREM 1 (Third part): Let (G, K, R) be a q.m.s. with domain function ψ(H). (R need not be anti-symmetric.) Let S be any relation (not necessarily irreflexive) such that R = S*. Let φ be a quantificational model on (G, K, R). Let (G', K', R') be defined as in the second part of the theorem, and let Ψ(H') = ψ(l(H')). Let η(Pⁿ, H') = φ(Pⁿ, l(H')) for each predicate letter Pⁿ and each H' ∈ K'. Then η is a quantificational model on the q.m.s. (G', K', R') with domain function Ψ. Further, relative to a given assignment to the free variables of A, η(A, H') = φ(A, l(H')): in particular, η(A, G') = φ(A, G).
The proof is left to the reader. Notice that, since S is not required to be irreflexive, it may in particular be R itself: thus (G', K', R') may be as in the second part of Theorem 1, or may be identical with the Beth model (G', K', R') of the first part. As a quantificational model, however, η will not be a Beth quantificational model, to the definition of which we now turn. Unlike our own models, with their variable domains (a feature we have noted to be essential), the Beth quantificational models are based on a fixed domain D. We define a Beth q.m.s. to be a Beth m.s. (G, K, R),
together with a domain D with at least one element. A Beth q. model η is a binary function η(Pⁿ, H), whose value is T or F when n = 0, and is a subset of Dⁿ for n ≥ 1. We require, in addition to the conditions (b) and (c) above on η, the analogues for n ≥ 1: (bₙ) if HRH', η(Pⁿ, H) ⊆ η(Pⁿ, H'); (cₙ) if H is barred by B ⊆ K, then

⋂{η(Pⁿ, H') : H' ∈ B} ⊆ η(Pⁿ, H).
For an atomic formula Pⁿ(x₁, ..., xₙ), define η(Pⁿ(x₁, ..., xₙ), H) = T, relative to an assignment of a₁, ..., aₙ ∈ D to x₁, ..., xₙ, iff (a₁, ..., aₙ) ∈ η(Pⁿ, H); otherwise, = F. We then define the values for more complex formulae by induction. The inductive clauses for the propositional connectives are as above. Let the formula A(x₁, ..., xₙ, y) contain only the free variables listed. We define η((y)A(x₁, ..., xₙ, y), H) = T, relative to an assignment of aᵢ ∈ D to xᵢ (1 ≤ i ≤ n), iff η(A(x₁, ..., xₙ, y), H) = T relative to any assignment of an element b ∈ D to y and aᵢ to xᵢ; otherwise, = F. Again η((∃y)A(x₁, ..., xₙ, y), H) = T when aᵢ is assigned to xᵢ iff there is a B ⊆ K such that H is barred by B and for any H' ∈ B there is a b ∈ D such that η(A(x₁, ..., xₙ, y), H') = T when aᵢ is assigned to xᵢ and y is assigned b; otherwise, = F. Using the inductive clauses and the conditions on atomic formulae, we can prove the analogues of (b) and (c) for an arbitrary formula A, relative to a fixed assignment to its free variables in a Beth quantificational model η: If η(A, H) = T and HRH', η(A, H') = T. If H is barred by B and η(A, H') = T for any H' ∈ B, then η(A, H) = T. Suppose we are given a quantificational model φ on a m.s. (G, K, R) such that U = ⋃{ψ(H) : H ∈ K}
is countable. We will transform φ into a Beth quantificational model whose domain D is the set N of non-negative integers. Let (G', K', S') be as above, and R' = S'*. Notice that N is a countable union of disjoint countable sets; call these Nᵢ (i = 0, ...). We have a procedure which, for each H' ∈ K', generates certain elements of N at H'; the set of elements generated at H' will be identical with

N₀ ∪ ⋯ ∪ Nₙ
for some n. Further, ifP is any path in K', every pEN will be generated at some H' E P. Further, the procedure will satisfy the condition that if H' R'H", every element generated at H' is also generated at H". An element generated at H', but not at its predecessor (if any exists), is said to be introduced at H'. Further, any natural number n generated at H' is assigned a unique element of t/J(l(H')); this element is called v(n, H'). The v-function will satisfy the condition that if n is generated at H', and H'R'H", then v(n, H') = v(n, H"). We give an inductive definition on the tree (G', K', S') of a procedure with these properties; at any stage, satisfaction of these properties will be taken to be part of the inductive hypothesis. First, consider the origin G' of the tree. We generate exactly the elements of No at G', and we define v(n, G '), for n E No, in such a way that No is mapped onto t/J(G). (This is possible since t/J(G) is at most countable. All arbitrary choices can be made precise, if desired, using well-orderings of the denumerable sets Nand Ll.) Suppose we have defined the set of all integers generated at H' it is, say, m
M = N₀ ∪ ⋯ ∪ Nₘ
and have defined v(n, H') for each n E M. Let H'S'H". Then introduce all elements of N n + l' so that the set of elements generated at H" is M v N m + l' Define v(n, H") for n E M v N m + 1 by v(n, H") = v(n, H') for n E M, and such that v(n, H") maps N n + 1 onto t/J(1(H")). Then the inductive definition is complete. We now define a Beth quantificational model Yf whose domain is N on the Beth m.s. (G', K ' S') as follows: If P is a propositional letter, define Yf(P, H') = ¢(P, l(H')). For an n-adic predicate letter n define Yf(r, H') to be the set of n-tuples (ml' , m n ) of natural numbers such , m; are all generated at H" and that, for every H" E K' such that m l , H'R'H", (v(m l , H"), ... , v(mm H")) E ¢(r, l(H")). THEOREM I: (Fourth part): Yf is a Beth quantificational model on (G', K ', S') whose domain is N. For any H' E K' and formula A(x l , . . . , x n ) , whose free variables are exactly those listed, and natural numbers m l' . . . , m.; which have been generated at H', Yf(A(x1 , ••. , x n) , H') = T when Xl' ••• , X n are assigned m l , . . . , m i; respectively, if and only if ¢(A(x 1 , ... , x n) , l(H')) = T when Xl' ••. , X n are assigned v(m 1 , H'), ... , v(m n ,
H'), respectively. In particular (n = 0), ifA is a closedformula, rJ(A, H') = cf>(A, l(H'». PROOF. We show first that rJ is a Beth quantificational model. Conditions (b) and (b n ) are obvious. Condition (c) is proved as in the first part of the theorem. Condition (en) (n ~ 1) is proved as follows: Suppose H' E K' is barred by B £; K', and suppose {m I' ... , m n) is not in rJ(pn, H'). We show that there is an H" E B such that (m l , ... , m n ) is not in rJ(pn, H"). Since (m I' ... , m n) is not in I/(pn, H'), there is an H~ E K' such that H' R'H~, m I' . . . , m n are all generated at H~, and (v(m I' H~), ... , v(mn> H~» is not in cf>(pn, l(H~)). As in the first part of this theorem, let P be the path P(H~) through H~, with the property that, for H" on the path and H~R'H", l(H~) = l(H"). Then P intersects B in an element H". If H" R'H~, then since clearly (m l , ... , mn) is not in rJ(P", H~), by condition (b"), it is not in rJ(pn, H"). If H~R'H", then since l(H") = l(H~), and v(mi' H~) = v(mi' H"), we have «v(m l , H"), ... , v(m n, H"» tt ¢(pn, l(H"», so that (m l , . . . , mn) ¢ rJ(P", H"), the desired conclusion.
We now prove the assertion in the second sentence of the present Fourth part by induction; the third sentence is a special case. Let A(x l , . . . , nX) be atomic. If n = 0, see the proof of the first part of this theorem. If n > 0, write A(x l , . . • , xn) as P"(x l , . . . , xn) . Suppose m l , . . . , mn are all generated at H' E K '. Let H = l(H'), and a, = v(mi' H'). If c/>(P"(XI' ... , x n) , H) = T, when Xi is assigned a, (l ~ i ~ n), then (a I ' . . . , an) E cf>(P", H). If H' R'H~ (H~ E K'), let H o = l(H~). Then HRH o, hence a I' , an E ",(H o) . Also a, = v(mi' H') = v(mi , H~). This shows that , Inn) E rJ(pn, H'), hence rJ(pn(x l , . . . , x n) , H') = T, relative to the. (m l , assignment of m, to Xi' as desired. On the other hand, if cf>(P"(x I' •.. , x n ) , H) = F relative to this assignment, and hence (ai' ... , an) ¢ cf>(pn, H), we clearly have (m l , . . . , mn ) ¢ rJ(P", H'), again as desired. The inductive clauses for the propositional connectives are as in the first part of this theorem. Suppose the result proved for A(x l , . . . , X n , y). Again let m, be assigned to Xi' let H = l(H'), and let a, = v(mi' H') (i = 1, ... , n). Let cf>«3y)A(xl , ••• , X n , y), H) = T when Xi is assigned a.. Then there is e b e ",(H) such that cf>(A(x I' ... , X n , y), H) = T when in addition y is assigned b. v(p, H') maps the elements generated at H' onto ",(H), so let v(p, H') = b, where p is generated at H'. Then, by inductive hypothesis rJ(A(xl , • . . , X n , y), H') = T when Xi is assigned m, (i = 1, ... , n) and
y is assigned p; hence η((∃y)A(x₁, ..., xₙ, y), H') = T when xᵢ is assigned mᵢ. On the other hand, suppose
η(A(x₁, ..., xₙ, y), H₀') = F when xᵢ is assigned mᵢ and y is assigned p. Hence η((y)A(x₁, ..., xₙ, y), H₀') = F when xᵢ is assigned mᵢ; but since H'R'H₀', η((y)A(x₁, ..., xₙ, y), H') = F relative to this same assignment.
This concludes the proof of the theorem. Q.E.D.
The fourth part of Theorem 1 shows how any quantificational model φ can be transformed into a Beth quantificational model η. Essentially, if we have arrived at a certain position H' ∈ K' and if H = l(H'), the numbers introduced at H' are "identified" with certain elements of ψ(H) by v(n, H'). An example, following the spirit though not the letter of the proof of Theorem 1, fourth part, converts the countermodel of section 1.1, Figure 3, for (x)(P(x) ∨ Q). ⊃ . (x)P(x) ∨ Q into a corresponding Beth quantificational countermodel in the natural numbers. In Figure 3, there are two evidential situations, G and H; ψ(G) = {a}, ψ(H) = {a, b}. As natural numbers are generated, as long as we remain at the evidential situation G, we must "identify" each natural number with a (and therefore give it all properties assigned to a in Figure 3), but if we pass to H, we must "identify" some natural number with b. These considerations lead to the following figure:
Figure 5. (Along the horizontal branch P(0), P(1), P(2), P(3), ... are established in succession; at each stage one may instead pass to a side branch H₁, H₂, H₃, ..., at which Q is established.)
This is exactly Beth's counterrnodel to (x) (P(x) v Q). ~ . (x)P(x) v Q. As long as we remain on the horizontal branch but are uncertain that we will continue thereon, we have not established (x)P(x) v Q; but on the other hand, for each natural number x, either P(x) or Q is eventually established. We have not mechanically applied the proof of Theorem I to obtain this model, but instead have reproduced its spirit; in particular, we have introduced a simplification analogous to that required to obtain Figure 4b from Figure 4a. Notice that Figure 5 can be interpreted in terms of absolutely free choice sequences as follows: Let a be an absolutely free choice sequence on the binary spread. Let P(x) abbreviate a(x) = 0, and let Q be (3x) (a(x) = I). Then, if x ranges over the natural numbers, clearly (a ~ B) (x) (P(x) v Q), but ,(a ~ B) «x)P(x) v Q). And, analogously, as Kreisel
and Dyson (Kreisel [II] and Dyson & Kreisel [9]) have observed, countable Beth quantificational models can always be interpreted thus. So Theorem I gives a new intuitive interpretation of our models, in which all quantifiers range over the natural numbers. Since below we will obtain a completeness theorem for countable quantificational models, and since such models can always be transformed into Beth (q.) models, our completeness results include those of Beth. (Beth required his models to be finitary, but we will show in part II how to obtain finitary Beth models.)
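A minimal illustrative sketch: the countermodel of Figure 3 can be checked mechanically by evaluating the two formulas with the truth clauses of section 1.1. The particular extensions assigned to P and Q below are assumptions consistent with the description above (P holds of a throughout, P fails of b at H, and Q holds only at H).

```python
# Evaluating formulas in the finite quantificational model of Figure 3,
# following the truth clauses of section 1.1 (assumed extensions for P, Q).

WORLDS = ["G", "H"]
R = {("G", "G"), ("G", "H"), ("H", "H")}        # reflexive, G R H
DOMAIN = {"G": {"a"}, "H": {"a", "b"}}          # increasing domains
P_EXT = {"G": {"a"}, "H": {"a"}}                # phi(P, .)
Q_VAL = {"G": False, "H": True}                 # phi(Q, .)

def later(w):
    return [v for v in WORLDS if (w, v) in R]

def forall_P(w):
    # phi((x)P(x), w) = T iff for every accessible w' and every d in its domain,
    # d lies in the extension of P at w'.
    return all(d in P_EXT[v] for v in later(w) for d in DOMAIN[v])

def forall_P_or_Q(w):
    # phi((x)(P(x) v Q), w), by the same clause for the universal quantifier.
    return all(d in P_EXT[v] or Q_VAL[v] for v in later(w) for d in DOMAIN[v])

def implication_holds_at_G():
    # phi(A => B, G) = T iff at every w accessible from G, A is F or B is T.
    return all((not forall_P_or_Q(w)) or (forall_P(w) or Q_VAL[w])
               for w in later("G"))

print(forall_P_or_Q("G"))             # True:  (x)(P(x) v Q) holds at G
print(forall_P("G") or Q_VAL["G"])    # False: (x)P(x) v Q fails at G
print(implication_holds_at_G())       # False: the implication fails at G
```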
1.3. Other interpretations ofthe models Sections 1. I and 1.2 gave interpretations of our models which were intended to accord with the interpretations intuitionists customarily assign to their logical constants. In this section we will give two formal interpretations of the modelling which do not claim any direct intuitionistic content. (Both interpretations are actually direct special cases of the modelling; they simply consider a restricted class of models.) One interpretation is based on provability in formal systems; it was described briefly in [3]. The other is based on Paul Cohen's notion of forcing [5]. The two interpretations are intimately related to each other. This section may be omitted without loss of continuity.
I. Provability interpretation. Let Eo be a formal system, and let E be an arbitrary extension thereof. Let K be the set of all such E, and let ERE' iff E' is an extension of E. We define an atomic formula P to be a closed wff of Eo. (Note that P need not be an atomic formula of Ea.) We can then build non-atomic formulae out of the P's using the connectives A, ::J, " v. If we define l/>(P, E) = T iff P is provable in E and F otherwise, then l/>(P, E} is a model on the m.s. (Eo, K, R). Thus for any complex formula A which is a theorem of the intuitionist propositional calculus, l/>(A, Eo} = T. If Eo is elementary number theory Z, and P is Godel's undecidable formula, then l/>(P v ,P, Eo} = F; for P is not provable in Eo, but it is provable in certain extensions E. The larger problem, whether Heyting's propositional calculus is complete with respect to this particular choice of Eo, remains open. To interpret intuitionistic quantification theory in this manner, we must assume that the system Eo and its extensions have notions of free
variables and of constants, and that Eo contains at least one constant. For any E E K, let I/I(E) be the set of all constants of E. Then if ERE', I/I(E) s I/I(E'). For every n, define an n-adic atomic predicate P" to be a formula of Eo with n free variables, together with a I-I function from the integers I, ... , n to the free variables of P", The variable assigned by this function to m(1 ~ m ~ n) is called the mth free variable of P". Define, for n ~ 1, the set ¢(pn, E) s [I/I(E)]n as follows: An n-tuple (a l ' . . . , an) of constants in I/I(E) is in ¢(P", E) iff the result of the simultaneous substitution of aj(1 ~ i ~ n) for the ith free variable of P" is a theorem of E. Out of the atomic n-adic predicates (which play the role of the n-adic predicate letters above), we can build more complex formulae using the propositional connectives and the quantifiers. ¢(P", E) then becomes an intuitionistic quantificational model. It is clear that in the preceding K can be replaced by any subset K' thereof (e.g., the finitely axiomatizable extensions of Eo). Further, restrictions, such as recursive enumerability, on the notion of formal system, can be removed at will. There is also a more "model-theoretic" variant of the present interpretation of Heyting's predicate calculus, which eliminates the assumption that E must-contain constants. Further, the interpretations can be extended in other directions so as to yield new interpretations of larger parts of intuitionistic mathematics; in particular, we can give an interpretation of FC which leads to a proof that FC is an inessential extension of Heyting's arithmetic"). For more on provability interpretations of intuitionistic and modal logics, cf. [3]. 2. Cohen's notion of "forcing," Let D be an arbitrary countable infinite set. Let 9 = (9 0 , ( 1) be a pair of finite, disjoint subsets of D, and let K be the set of all such pairs. If 9 = (.0/'0' !!J I) and 9' = (9~, 9'1) are in K, theqdefine 9 R9' (or, f!}' is an extension of 9) iff 9"0 s 9~ and 9 1 S 9~. Further, let I/I(g» = 9 0 u:3' 1. Now consider a single monadic predicate letter P. For any g; E K, define ¢(P,9) = 9 0 . Let K' be the set of all 9 E K such that 1/1(.9) is non-empty. Then for any g; E K', (g>, K', R) is a q.m.s., with the associated domain function 1/1. (If we had modified Heyting's predicate calculus so as to admit the empty domain and thus permit I/I(Y') to be empty, the rather artificial use of K' in place 1) Kreisel has independently obtained this result using an elimination of free choice sequences by contextual definition.
of K could be dropped.) Then ¢ is a model on (g'J, K', R), and for any formula A built from P using propositional connectives and quantifiers, the inductive definitions we have given define a truth-value ¢(A, g'J'), for any g'J' E K', relative to a fixed assignment of elements of D to the free variables of A. If this value is T, we say that g'J' forces A relative to the assignment. (Notice that the value of ¢(A, g'J') is clearly independent of the choice of the "designated" element g'J of (g'J, K', R).) If D' is a subset of D, we say that g'J , agrees with D' iff g'J~ s D' and g'J~ s D - D'. We can say that D' forces A (relative to a given assignment to the free variables) iff there is a g'J' E K' which agrees with D' and forces A. Notice that if g'J' and g'J" agree with D', they have a common extension which agrees with D'; thence it easily follows that D' cannot force a statement together with its negation. Call D' generic iff for every A, and fixed assignment to the fret: variables thereof, D' forces either A or ..,A. Cohen proves that generic sets exist: Let {An} be an enumeration ofall the ordered couples Ai =
assertion is readily proved by induction on the complexity of A. Since, classically speaking, a (x) can always be replaced by ,(3xh the restriction that universal quantifiers be absent is not important. The definition we have given differs from Cohen's in inessential respects. (It may be closer to a definition given by Feferman, which we have not seen 1). It is clear that the notion can be extended. For example, we need not deal with a single predicate P(x); we can deal with several such, not all of which need be monadic. The modifications needed for this more general situation should be obvious. Further, we can replace the countable set D by a set of regular cardinality N,,; K will consist of disjoint pairs of sets of cardinality less than ~". Cohen's motivation was radically different from ours, but it is clear that his notion is intimately related to our model theory. The "deeper" reasons for this relation may yet be unknown. It should be noted that Dana Scott had already observed that Cohen's idea was similar to an interpretation conjectured by Kreisel [17]. And indeed, if Kreisel's conjectures prove correct, his interpretation of intuitionism will be closely related to ours.
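A minimal sketch of the combinatorial part of this interpretation, under the conventions just given (the function names and the choice of D are illustrative only): conditions are pairs of disjoint finite subsets of D, ordered by componentwise inclusion, and a condition forces an atomic formula P(n) just in case n already lies in its first component.

```python
# Cohen-style conditions for a single monadic predicate P over D = the
# natural numbers.  A condition is a pair (p0, p1) of disjoint finite
# subsets of D: p0 lists numbers already forced to satisfy P, p1 those
# forced not to.

def is_condition(p):
    p0, p1 = p
    return p0.isdisjoint(p1)

def extends(q, p):
    # q extends p iff p0 <= q0 and p1 <= q1 (componentwise inclusion).
    return p[0] <= q[0] and p[1] <= q[1]

def forces_atomic(p, n):
    # p forces P(n): true at p and, by monotonicity, at every extension of p.
    return n in p[0]

def forces_neg_atomic(p, n):
    # p forces ~P(n): no extension of p can ever force P(n).
    return n in p[1]

def agrees(p, d_prime):
    # p agrees with a subset D' of D iff p0 lies inside D' and p1 inside D - D'.
    return p[0] <= d_prime and p[1].isdisjoint(d_prime)

p = (frozenset({0, 2}), frozenset({1}))
q = (frozenset({0, 2, 5}), frozenset({1, 3}))
print(is_condition(p), extends(q, p))                                    # True True
print(forces_atomic(p, 0), forces_neg_atomic(p, 1), forces_atomic(p, 3)) # True True False
print(agrees(p, {0, 2, 4}))                                              # True
```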
2. Semantic tableaux In this section we develop Beth semantic tableaux for intuitionistic logic. The notion developed here is similar to those of [2], [11], which can be read as background if desired. We deal at each stage of the construction with a system of alternative sets of tableaux; each alternative set is ordered in the form of a tree, and the origin of the tree is called the main tableau of the set. We call the tree ordering relation on an alternative set "S"; the smallest reflexive and transitive relation containing "S" is called "R". We can assume, at a given stage of the construction, that each alternative set is diagrammed on a piece of paper; corresponding to the system of all the alternative sets of the stage, we have a leaflet of which the separate sheets of paper are pages. Given a formula A of Heyting's predicate calculus, to see whether it is valid we attempt to find a countermodel to A. If A has the form Al A •.. Am. :=> • B I V ... B n, then what we need is a model ¢, such that relative to some assignment to the free variables of A, ¢(A j, G) = T and 1) See note at end of paper.
¢(B j ' G) = F, 1 :0; i :0; m, 1 :0; j :0; n. We represent the situation by putting A l , ... , Am on the left, and B l , . . . , B; on the right of the main tableau of a construction. We continue the construction, which gives a systematic attempt to find a tree countermodel to A, by the following rules, which apply to any tableau of any alternative set of the construction:
Nl. If ¬A appears in the left column of a tableau, put A in the right column of that tableau.

Nr. If ¬A appears in the right column of a tableau t, start out a new tableau t₁, with tSt₁, by putting A on the left of t₁.

Al. If A ∧ B appears on the left of a tableau t, put A and B on the left of t.
Ar. If A ∧ B appears in the right column of a tableau t, there are two alternatives; extend the tableau t either by putting A in the right column or by putting B in the right column. If the tableau t is in an ordered set 𝒮, it is clear that at the next stage we have two alternative sets, depending on which extension of the tableau t is adopted. Informally speaking, if the original ordered set is diagrammed structurally on a sheet of paper, we copy over the entire diagram twice, in one case putting in addition A in the right column of the tableau and in the other case putting B; the two new sheets correspond to the two new alternative sets. The formal statement is rather messy: Given a tableau t in an alternative set 𝒮, if t has A ∧ B on the right, we replace 𝒮 by two alternative sets 𝒮₁ and 𝒮₂, where 𝒮₁ = 𝒮 − {t} ∪ {t₁} and 𝒮₂ = 𝒮 − {t} ∪ {t₂}, and t₁ [t₂] is like t except that in addition it contains A [B] on the right. The tree ordering S₁ of the new set 𝒮₁ is precisely the same as S, save that t₁ replaces t throughout; and similarly for the tree ordering S₂ of 𝒮₂. (Formally, S₁ agrees with S on 𝒮 − {t}, and, if t' is the predecessor [a successor] of t, then t'S₁t₁ [t₁S₁t'].) We say 𝒮 splits into 𝒮₁ and 𝒮₂. Similar remarks apply to the rules Vl and Pl below.

Vl. If A ∨ B appears on the left of t, put either A on the left of t or B on the left of t. (As in the case of Ar, this splits the set 𝒮 containing t into two alternative sets.)

Vr. If A ∨ B appears on the right of t, put A and B on the right of t.

Pl. If A ⊃ B appears on the left of t, either put A on the right of t
or put B on the left. (Thus again the set g> containing t is replaced by two alternative sets.) Pro If A ::> B appears on the right of t, start out a new tableau t l, with A on the left of t 1 and B on the right, such that tSt l • For a construction involving quantifiers, we associate, at a given stage of a construction, a set I/1(t) of variables with each tableau t. We start out the definition of I/1(t) by assuming that, at the initial stage of the construction, which starts out with a single tableau to, I/1(to) consists of a single variable x. At later stages I/1(t) is to be enlarged only as required by the rules Ilr and II below and the stipulation that tSt l is to imply that I/1(t) S I/1(t 1 ) . We are now in a position to state the rules for quantifiers as follows: Ill. If (x)A(x) appears on the left of t and y is any variable in I/1(t), put A(y) on the left of t. Ilr. If (x)A(x) appears on the right of t start out a new tableau t 1 with tSt l • If y is the alphabetically earliest variable which has not yet occurred in any tableau of any alternative set at this stage, put y E I/1(t 1) and put A(y) on the right of t': El. If (3x)A(x) appears on the left of a tableau t, and y is the alphabetically earliest variable which has not yet appeared in any tableau of any alternative set at this stage, put y E I/1(t) and put A(y) of the left of t. Et, If (3x)A(x) appears on the right of a tableau t, and y is a variable in I/1(t), put A(y) on the right of t.
In addition to the rules we have stated, the following stipulation holds throughout the construction: if t and t 1 are tableaux of some one alternative set, at any given stage, such that tSt! , and A appears on the left of t, then put A on the left of t 1. Notice that, since the stipulation is to be iterated an arbitrary number of times, it also applies when A is on the left of t and tRt l • The relation tSt l is to hold in a construction only as required by the rules listed above. The rules may be applied in any order, as long as the order stipulated is such that every applicable rule is eventually applied. A tableau t is called closed iff some formula occurs in it on both the left and the right. A set or tree of tableaux is closed iff some tableau in
the set is closed. A system of alternative sets is closed iff every set of the system is closed. A construction started out by putting A on the right of the main tableau of the construction is called the construction for A. We can place the following restrictions on constructions: A rule is not to be applied to a tableau of a closed set; nor is it to be applied if it is "superfluous" (e.g., Al is not to be applied if A and B already appear on the left of the tableau t in question). Let us call an alternative set at any stage of a construction terminal iff it is not replaced at any stage of the construction by another set or pair of sets; thus, in particular, every closed set is terminal. In any construction, let α be some fixed sequence 𝒮₁, 𝒮₂, ... of alternative sets such that 𝒮₁ is a set at the first stage of the construction and 𝒮ᵢ₊₁ is the set, or one of the two sets, which, at the (i + 1)-th stage, replaces 𝒮ᵢ; α terminates at 𝒮ₙ iff 𝒮ₙ is terminal. (If the construction does not terminate there is at least one infinite such sequence α.) Any tableau t in 𝒮₁, or in 𝒮ᵢ₊₁ which is not an immediate descendant of any tableau in 𝒮ᵢ, is called an initial tableau. Let K be the set of all sequences τ of tableaux t₁, t₂, ... such that t₁ is an initial tableau and tᵢ₊₁ is an immediate descendant of tᵢ, and τ terminates at tₙ iff tₙ belongs to a terminal set 𝒮ₘ. Let τ₀ be that member of K whose first term t₁ is in 𝒮₁. Let τρτ', for τ, τ' in K, iff for some 𝒮ᵢ in α there are terms t, t' of τ, τ' in 𝒮ᵢ such that tRt' (R the ancestral of the tree ordering S). Then, intuitively, (τ₀, K, ρ) forms a q.m.s. with domain function ψ(τ), where ψ(τ) is the union of the sets ψ(t) for t a term of τ.
If a quantificational model φ is defined so that, for any sentence letter P, φ(P, τ) = T iff P appears on the left of some t in τ, and, for any predicate letter Pⁿ, φ(Pⁿ, τ) is the set of n-tuples (x₁, ..., xₙ) of variables such that Pⁿ(x₁, ..., xₙ) appears on the left of some t in τ, then, for every formula B, if B appears on the left of some t in τ, φ(B, τ) = T (relative to the assignment of each free variable in B to itself). Further, the dual law that, for every B, if B appears on the right of some t in τ, then φ(B, τ) = F, holds iff α does not terminate in a closed set 𝒮ₙ. Hence, if the construction was a construction for A, this is just the condition under which α provides a countermodel for A.
THEOREM 2: The construction for A is closed if and only if A is valid.
The proof, which follows the lines sketched intuitively above, and in addition shows that the alternative sets of the construction for A exhaust the possibilities of finding a countermodel for it, is omitted because it is a routine variation on the proofs of the corresponding theorems of [2] and [16].¹)

3. Completeness theorem

3.1. Consistency property

THEOREM 3: If A is provable in Heyting's predicate calculus, then A is valid.
This theorem is almost trivial; we need only verify that, in a standard formalization of Heyting's predicate calculus, the axioms are all valid, and the rules preserve validity. Such a verification is left to the reader. It follows that if A is provable, the construction for A is closed.

3.2. Completeness property
We show that every valid formula A is provable by showing that if the construction for A is closed, then A is provable. As in [2] and [16], we do this using a notion of "characteristic formula." As in [2], define the rank of a tableau in a finite tree of tableaux (or, indeed, of a node in any finite tree), as follows: An endpoint of the tree has rank 0. If t is not an endpoint, let t₁, ..., tₙ be its successors; then Rank(t) = Max{Rank(tᵢ)} + 1. It is easy to verify that, for any finite tree of tableaux, a unique rank is defined for each tableau of the tree. 1) Define A to be tree valid iff φ(A, G) = T for every model φ on a tree q.m.s.
(G, K, R). Then what really is readily proved is that the construction is closed iff A is tree valid. But, by section 1.2 above, validity coincides with tree validity. Alternatively, we can argue as follows without use of section 1.2: Clearly validity implies tree validity, and provability implies validity. The completeness result below shows that tree validity implies provability, so the three notions coincide. We could have defined a tableau procedure, based on a relation R, which would have been more appropriate to models than to tree models; a reader familiar with [2] will know how this could be carried out. Notice that, as observed in analogous cases in [2] and [16], the countermodels for non-valid formulae obtained by Theorem 2 from tableaux are always on a countable tree q.m.s. (G, K, R) with a countable set U of individuals involved. This "Löwenheim-Skolem" result will be used in part II to show that the present completeness results include those of Beth [8].
Given any tableau t in a tree of tableaux, define the following sequence {til: to = t, t j + l = the predecessor of t j , if such a predecessor exists, and undefined otherwise. The sequence is clearly finite, and its last term is the origin of the tree We call it the "path from t back to the origin." The terms of the sequence other than t "come before t" on the tree. For any t on a tree, let X(t) be the set of all variables occurring free in t but not in any tableau coming before it. At any stage of a construction, the tableaux of an alternative set form a finite tree. We define the characteristic formula of a tableau t in the set at a given stage by induction on its rank in the set. Given a tableau t, let AI' ... , Am[Bl , . . . , B n] be the formulae occurring on the left [right] of t. Further, let Xl' ... , x q be the elements of X(t). (Possibly q = 0.) If Rank (t) = 0, then the characteristic formula of t is defined as (x.). .. (Xq ) (AI A .. . A m.::::> .B l V .. • B n ) ; or, if there are no formulae on the left [right] of t, as (Xl)' .. (Xq) (B I V ... B n ) [(Xl)' .. (xqHA 1 A Am)]' If Rank (t) > 0, let t l , . . . , t p be the successors of t, and let C l , , Cp be the corresponding characteristic formulae. Then the characteristic formula of t is (Xl)' .. (x q) (AI A •.. Am.::::> .B; V ... B; V C, V ... C p ) ; or, if there are no formulae on the left [right] of t, the characteristic formula is (Xl)" . (Xq ) (B 1 V .. . B; V C l V " .Cp ) [(Xl)" . (Xq ) (A 1 A .. • A m.::::> ,C 1 V ... C p ) ] . The characteristic formula of an alternative set (tree) of tableaux is defined as the characteristic formula of the main tableau of the set. The characteristic formula of the entire system of alternative sets at a given stage of a construction is defined as the conjunction of the characteristic formulae of the alternative sets of the system. In a natural sense, the present notion of characteristic formula is "dual" to that of [2] and [16]. It may facilitate the reader's comprehension of the notion of characteristic formula if he consults the corresponding treatment of characteristic formulae in [2], [16]. LEMMA: If A o is the characteristic formula of the initial stage of a construction, and B o is the characteristic formula of any stage of the construction, then I- B o ::::> A o.
PROOF. It suffices to show that the characteristic formula of any stage of the construction implies the characteristic formula of the preceding stage. But the characteristic formula of the mth stage has in general the form D₁ ∧ ... ∧ Dᵢ ∧ ... ∧ Dₙ, where the Dᵢ (1 ≤ i ≤ n) are the characteristic
formulae of the alternative sets of the stage. The rule which is applied and changes the mth stage into the m + lth affects only one alternative set, say with characteristic formula D i: If the rule is PI, Ar, or VI, it will change this set into two distinct alternative sets, with characteristic formulae D', and Dj; we wish to prove, then, J- D 1 A .. . D', A DjA ... D n . =:> • D 1 A ... D j A ... D w To do this, it suffices to prove D', A Dj. =:> • D i: Similarly, if the rule applied is other than PI, Ar, or VI, then D j is transformed into Dj; to prove that J- D 1 A ... D', A ... Dn' =:> • V 1 A ... D j A ... D n , it suffices to prove J- D', =:> D i: So, when a rule is applied transforming the mth stage of a construction into the m + 1th, we need only consider the characteristic formula of the set to which the rule is actually applied. Suppose, then, a rule (other than PI, or Ar, or VI) transforms a set Y with characteristic formula D j into one with characteristic formula Dj; we wish to prove J- Dj =:> D i: Let t be the tableau to which the rule is actually applied, and let C be its characteristic formula. Further, let C' be the characteristic formula of the tableau t' into which t is transformed by the given rule. (The rules Nr, Pr and IIr leave t unchanged, appending a new tableau t': In this case t' will be identical with t, but the new characteristic formula C' of t will not be identical with the old one C.) Suppose we can show J- C' =:> C. Then if t is the main tableau of the set Y, we have shown J- Dj =:> D j • Otherwise, let t 1 be the predecessor at stage m of t, let t~ be the predecessor at stage m+ Lof r', and let Cl[C~J be the characteristic formula oft 1 [ t a Then C 1 is a universal quantification (u.q.) of a formula of the form X. =:> • Yv C, and C~ is a u.q. of X. =:> • Yv C'. Since J- C' =:> C, clearly J- (X. =:> • Yv C') =:> (X. =:> • Y V C). Applying universal generalization to this last statement, and distributing universal quantifiers across the implication sign, we obtain J- C~ =:> C iIf t 1 is the main tableau of g, then C~ =:> C 1 is D', =:> D ; Otherwise, let t 2[t;J be the predecessor of tl[t~J, and apply the same reasoning as before. Eventually we will obtain D', =:> D j • Thus in the case ofany rule other than PI, VI, or Ar, we need only consider the tableau t to which the rule is actually applied, and prove the formula C' =:> C stated above. Notice that in general C, the characteristic formula of t, is a u.q. of a certain formula B, and C' is a u.q. of a certain formula B'. If we prove J- B' =:> B, then by universal generalization and distribution of the quantifiers across the implication sign, we can obtain C' =:> C.
Bearing these remarks in mind, we break down the proof into the following cases, depending on the rule applied to obtain the (m+1)-th stage from the mth. We can say a case is "justified" if we have shown, for the case, that ⊢ Dᵢ' ⊃ Dᵢ, which usually reduces to ⊢ B' ⊃ B. The reader is advised to consult the similar treatments in [2] and [16]. In considering a rule, we will in general assume that the tableau t to which it is applied contains formulae both on the left and the right, and that its characteristic formula is therefore an implication. The cases where the left or right side is empty will be left to the reader.

Case Nl. The characteristic formula of t is a u.q. of X ∧ ¬A. ⊃ . Y; after A has been put on the right, its characteristic formula becomes a u.q. of X ∧ ¬A. ⊃ . Y ∨ A. The case is justified by ⊢ X ∧ ¬A. ⊃ . Y ∨ A: ⊃ : X ∧ ¬A. ⊃ . Y.

Case Nr. The characteristic formula of t is a u.q. of X. ⊃ . ¬A ∨ Y. When we start out a new tableau t₁ with A on the left, and tSt₁, the characteristic formula of t₁ is ¬A (since X(t₁) is empty because any free variable of A already occurs in t), and that of t becomes a u.q. of X. ⊃ . ¬A ∨ Y ∨ ¬A. The case is justified by ⊢ X. ⊃ . ¬A ∨ Y ∨ ¬A: ⊃ : X. ⊃ . ¬A ∨ Y.

Case Al. Justified by ⊢ X ∧ (A ∧ B) ∧ A ∧ B. ⊃ . Y: ⊃ : X ∧ (A ∧ B). ⊃ . Y.
Case Ar. Let the characteristic formula of t, call it C, be a u.q. of X. ⊃ . Y ∨ (A ∧ B). The rule Ar "splits" t into two alternative tableaux, t' and t'', whose characteristic formulae C' and C'' are u.q.'s of X. ⊃ . Y ∨ (A ∧ B) ∨ A and X. ⊃ . Y ∨ (A ∧ B) ∨ B, respectively. Using ⊢ (X. ⊃ . Y ∨ (A ∧ B) ∨ A) ∧ (X. ⊃ . Y ∨ (A ∧ B) ∨ B): ⊃ : X. ⊃ . Y ∨ (A ∧ B), and generalizing, and distributing quantifiers, we obtain ⊢ C' ∧ C''. ⊃ . C. If t is the main tableau of the set, this is the desired result ⊢ Dᵢ' ∧ Dᵢ''. ⊃ . Dᵢ. Otherwise, let t₁ be the predecessor of t. The characteristic formula C₁ of t₁ is a u.q. of X₁. ⊃ . Y₁ ∨ C; it is transformed by Ar into two alternative characteristic formulae C₁' and C₁'', which are u.q.'s, respectively, of X₁. ⊃ . Y₁ ∨ C' and X₁. ⊃ . Y₁ ∨ C''. Using ⊢ C' ∧ C''. ⊃ . C, we easily obtain ⊢ C₁' ∧ C₁''. ⊃ . C₁. Continuing this process along the path from t back to the origin, in a finite number of steps we obtain ⊢ Dᵢ' ∧ Dᵢ''. ⊃ . Dᵢ.

Case Pl. Like Ar, using ⊢ (X ∧ (A ⊃ B). ⊃ . Y ∨ A) ∧ (X ∧ (A ⊃ B) ∧ B. ⊃ . Y): ⊃ : X ∧ (A ⊃ B). ⊃ . Y.
Case Pr. Let the characteristic formula of t be a u.q. of X. ⊃ . Y ∨ (A ⊃ B). Pr instructs us to start out a tableau t₁, with A on the left and B on the right, whose characteristic formula is thus A ⊃ B (X(t₁) being empty). Then the characteristic formula of t is transformed into a u.q. of X. ⊃ . Y ∨ (A ⊃ B) ∨ (A ⊃ B), and ⊢ X. ⊃ . Y ∨ (A ⊃ B) ∨ (A ⊃ B): ⊃ : X. ⊃ . Y ∨ (A ⊃ B) justifies the case.

Case Vl. Like Ar, using ⊢ (X ∧ (A ∨ B) ∧ A. ⊃ . Y) ∧ (X ∧ (A ∨ B) ∧ B. ⊃ . Y): ⊃ : (X ∧ (A ∨ B). ⊃ . Y).
Case Vr. Justified by ⊢ X. ⊃ . Y ∨ (A ∨ B) ∨ A ∨ B: ⊃ : X. ⊃ . Y ∨ (A ∨ B).

Case El. If t has as characteristic formula C, a u.q. of X ∧ (∃x)A(x). ⊃ . Y, after application of El, t is transformed into t₁, whose characteristic formula C' is a u.q. of X ∧ (∃x)A(x) ∧ A(a). ⊃ . Y. Since a is a new variable not previously introduced, a ∈ X(t₁). Thus, we can take C' to be a u.q. of (a)(X ∧ (∃x)A(x) ∧ A(a). ⊃ . Y). So ⊢ (a)(X ∧ (∃x)A(x) ∧ A(a). ⊃ . Y): ⊃ : X ∧ (∃x)A(x). ⊃ . Y justifies the case.

Case Er. Justified by ⊢ X. ⊃ . Y ∨ (∃x)A(x) ∨ A(a): ⊃ : X. ⊃ . Y ∨ (∃x)A(x).

Case Πl. Justified by ⊢ X ∧ (x)A(x) ∧ A(a). ⊃ . Y: ⊃ : X ∧ (x)A(x). ⊃ . Y.
Case IIr. The characteristic formula of t is a u.q. of X. ::::> • Y v (x)A(x). IIr instructs us to start out a new tableau t\ with tSt\ and with A(a) on the right, where a has not previously been used. Then X(t 1 ) = {a}, since a is the only free variable of t 1 which does not occur in t, Hence the characteristic formula of t 1 is (a)A(a), and the characteristic formula of t is transformed into a u.q. of X.::::>. Yv (x)A(x) v (a)A(a). So I- X. ::::> • Y v (x)A(x) v (a)A(a): ::::> :X. ::::> • Y v (x)A(x) justifies the case.
Finally, we must justify the rule stipulating that if a formula A appears on the left of a tableau t, and tSt₁, we must put A on the left of t₁. This is justified by ⊢ X ∧ A. ⊃ . Y ∨ ((X' ∧ A) ⊃ Y'): ⊃ : X ∧ A. ⊃ . Y ∨ (X' ⊃ Y'). The lemma is proved.

THEOREM 4: If A is valid, then A is provable in Heyting's predicate calculus.

PROOF. We can assume A has no free variables. Since A is valid, the
construction for A is closed. Then there is a stage at which each alternative set is closed; let the characteristic formula of that stage be D 1 /\ • • • D., where the D /s are the characteristic formulae of the alternative sets of the stage. By the lemma, D 1 A ..• D n • =:l.A (since A is the characteristic formula of the initial stage). So it suffices to show D j for each j. The alternative set whose characteristic formula is D j' being closed, contains a closed tableau t. Then t contains a formula B on both sides, so its characteristic formula C is a u.q. of X /\ B. =:l • Y v B. Clearly I- C. If t is the main tableau of the set, this is D i: Otherwise, let t 1 be the predecessor of t. Then the characteristic formula C 1 of t 1 is a u.q. of X'. =:l • Y ' V C. Clearly I- C 1 • Continuing in this manner, we are driven back along the path from t to the origin until we obtain I- D I: Q.E.D. REMARK. The theorem gives a finitary proof that if the construction for A is closed, I- A. We could have proved it alternatively by showing that the tableau procedure is equivalent to a standard Gentzen formulation of Heyting's system. Of course the theorem and proof apply to the propositional calculus, even though the proof was carried out for the predicate calculus.
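A minimal sketch, confined to the propositional case (so that the quantifier prefixes (x₁)...(x_q) are empty) and treating formulae simply as strings, of how ranks and characteristic formulae are computed over a finite tree of tableaux:

```python
# A tableau is a pair (left, right) of lists of formulas (as strings); the
# tree is given by a successor map.  The clauses mirror the definitions above:
# an implication when both sides are non-empty, a disjunction when the left is
# empty, a negated conjunction when the right side and the successors are empty.

def rank(t, succ):
    kids = succ.get(t, [])
    return 0 if not kids else max(rank(s, succ) for s in kids) + 1

def char_formula(t, tableaux, succ):
    left, right = tableaux[t]
    kids = succ.get(t, [])
    disjuncts = list(right) + [char_formula(s, tableaux, succ) for s in kids]
    if not left:
        return "(" + " v ".join(disjuncts) + ")"
    if not disjuncts:
        return "~(" + " & ".join(left) + ")"
    return "(" + " & ".join(left) + " -> " + " v ".join(disjuncts) + ")"

# Example: main tableau t0 with A = (P -> Q) on the right, and one successor t1
# produced by rule Pr, i.e. P on the left and Q on the right of t1.
tableaux = {"t0": ([], ["(P -> Q)"]), "t1": (["P"], ["Q"])}
succ = {"t0": ["t1"]}
print(rank("t0", succ))                    # 1
print(char_formula("t0", tableaux, succ))  # ((P -> Q) v (P -> Q))
```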
References [I] Saul A. Kripke, Semantical Analysis of Modal Logic (abstract). The Journal of Symbolic Logic 24 (1959) 323-324. [2] Saul A. Kripke, Semantical Analysis of Modal Logic I. Normal Modal Propositional Calculi. Zeitschrift fur Mathematische Logik und Grundlagen der Mathemathik 9 (1963) 67-96. [3] Saul A. Kripke, Semantical Considerations on Modal and Intuitionistic Logic. Acta Philosophica Fennica 16 (1963) 83-94. [4] Saul A. Kripke, The Undecidability of Monadic Modal Quantification Theory. Zeitschrift fur Mathematische Logik und Grundlagen der Mathematik 8 (1962) 113-116. [5] Paul J. Cohen, The Independence of the Continuum Hypothesis. Proceedings of the National Academy of Sciences, U.S.A. 50 (1963) 1143-1148. [6] E. W. Beth, Observations on an Independence Proof for Peirce's Law (abstract). The Journal of Symbolic Logic 25 (1960; published, 1962) 389. [7] M. A. E. Dummett and E. J. Lemmon, Modal Logics between S4 and S5. Zeitschrift fur Mathematische Logik und Grundlagen der Mathematik 4 (1958) 250-264. [8] E. W. Beth, Semantic Construction of Intuitionistic Logic. Mededelingen der Koninklijke Nederlandse Akademie van Wetenschappen, Afd, Letterkunde, Nieuwe Reeks, Deel 19, No. 11. [9] V. H. Dyson and G. Kreisel, Analysis of Beth's Semantic Construction of In-
tuitionistic Logic. Technical Report no. 3, Stanford University Applied Mathematics and Statistics Laboratories, Stanford, California. [l0] S. C. Kleene, Introduction to Metamathematics. (Van Nostrand, New York; North-Holland Publishing Co., Amsterdam and P. Noordhoff Ltd., Groningen). 1952. [ll] G. Kreisel, A Remark on Free Choice Sequences and the Topological Completeness Proofs. The Journal of Symbolic Logic 23 (1958) 369-388. [l2] A. Heyting, Intuitionism: An Introduction. (North-Holland Publishing Co., Amsterdam 1956). [l3] S. Kuroda, Intuitionistische Untersuchungen der formalistischen Logik. Nagoya Mathematical Journal 2 (1951) 35--47. Known only from references. [14] A. A. Markov, 0 nepreryvnosti konstruktivnyh funkcij (On the continuity of constructive functions). Uspehi Matern. Nauk 9 (1954) 226-230. Known only from references. [15] G. Kreisel, On Weak Completeness of Intuitionistic Predicate Logic. The Journal of Symbolic Logic 27 (1962) 139-158. [l6] Saul A. Kripke, A Completeness Theorem in Modal Logic. The Journal of Symbolic Logic 24 (1959) 1-14. [l7] G. Kreisel, Set Theoretic Problems suggested by the Notion of Potential Totality in: Infinitistic Methods (Warsaw 1961). Note (added in proof, August 9,1964). We have since seen Feferman's paper, and his version of forcing is indeed virtually identical with ours, although he, of course, does not base it on any model theory for or connection with intuitionistic logic. He credits his version to Dana Scott. Note (added in proof, October 28,1964). In connection with the "Remark" at the end of section 1.1, it should be pointed out that the example in part I of the Remark already refutes Markov's principle. For we observed there that, in FC, (a) (IX I B) , (x) (IX (x) = 0), but also (b) , (IX I B) (3x) (IX x) = 1). By (b), noting that since B is the binary spread, (IX I B) (x) (IX (x) =F 0 :::> IX (x) = I), we have (c) , (IX I B) (3x) (IX (x) =F 0). But (a) and (c) jointly contradict Markov's principle. The example in part 2 of the Remark is of interest in showing that a single counterexample can refute both Markov's principle and Kuroda's conjecture. It should be noted that Markov's principle would imply, for IX on the full binary spread, that , (x) (IX (x) = 0) :::> (3x) (IX (x) = 1). From this it is easy to derive, for a real number a, that a =F 0 implies a # 0 (similarly to part 1 of the Remark). Hence if Brouwer's disproof (using ips depending on the solving of problems) of the latter is accepted, Brouwer has already refuted Markov's principle. I wish to thank M. A. E. Dummett and John Crossley for their help in editing this paper, and in particular, M. A. E. Dummett for an important correction in section 1.2.
SET THEORY AND HIGHER-ORDER LOGIC¹)

RICHARD MONTAGUE

University of California, Los Angeles, Calif., USA
Several mutual applications of set theory and higher-order logic are developed. Second-order logic is used to discover the standard models of Zermelo-Fraenkel set theory, consideration of standard models leads to the introduction of new systems of set theory, one of these systems is applied in finding a definition of truth for higher-order sentences, and finally Zermelo-Fraenkel set theory with individuals is given a philosophical justification as logically true within higher-order logic.

1. Standard models
Let us consider three well-known first-order theories. The first, called Peano's arithmetic, has the non-logical constants 0, S, +, " and the following axioms"): ,0= Sx,
Sx = Sy → x = y,
x+O = x, x+Sy = S(x+y), l) I am indebted to the United States' National Science Foundation, which supported the preparation of most of this paper under grant number NSF GP 1603 (Montague). 2) It is convenient for the purposes of the present paper to regard a first-order theory as determined by a sequence of non-logical constants and a set of axioms. I use the logical constants " A, Y, -->-,"-', A, Y, =, which are the respective symbols of negation, conjunction, disjunction, implication, equivalence, universal quantification, existential quantification, and identity. (I use ",", "A", etc. as names of certain symbols of the object language, and I indicate concatenation by juxtaposition.)
x·0 = 0,
x·Sy = (x·y)+x,
P[0] ∧ Λx[P[x] → P[Sx]] → ΛxP[x].
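Thus, for example, substituting the formula x·S0 = x for P yields the axiom

0·S0 = 0 ∧ Λx[x·S0 = x → Sx·S0 = Sx] → Λx x·S0 = x.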
The last principle is regarded as a schema, called the Induction Schema; we take as axioms all formulas of Peano's arithmetic obtainable without clash of variables by substituting a formula for P in the schema. The second theory, called (at least in an alternative formulation¹)) the theory of real closed fields, has the non-logical constants 0, 1, +, ·, −, ⁻¹, ≤, and the following axioms:

x+(y+z) = (x+y)+z,
x+y = y+x,
x+0 = x,
x+(−x) = 0,
x·(y·z) = (x·y)·z,
x·y = y·x,
x·1 = x,
¬ x = 0 → x·x⁻¹ = 1,
x·(y+z) = (x·y)+(x·z),
¬ 0 = 1,
0⁻¹ = 0,
0 ≤ x ∨ 0 ≤ −x,
0 ≤ x ∧ 0 ≤ −x → x = 0,
0 ≤ x ∧ 0 ≤ y → 0 ≤ x+y ∧ 0 ≤ x·y,
x ≤ y ↔ 0 ≤ y+(−x),

Vx P[x] ∧ VyΛx[P[x] → x ≤ y] → Vy(Λx[P[x] → x ≤ y] ∧ Λz[Λx[P[x] → x ≤ z] → y ≤ z]).
The last principle is called the Continuity Schema and plays a role analogous to that of the Induction Schema; that is, we consider as axioms all formulas of the present theory obtainable (again without clash of vari1) The usual formulation involves fewer primitive symbols, and hence has somewhat more complicated axioms. It is clear that several of our symbols, for instance -1 could be defined in terms of the others.
ables) by substituting a formula for P in the schema. As a final example, consider Zermelo-Fraenkel set theory, whose only non-logical constant is ∈ and whose axioms are the following:

Λu[u ∈ a ↔ u ∈ b] → a = b,
Vu u ∈ a → Vu[u ∈ a ∧ ¬Vv(v ∈ u ∧ v ∈ a)],
VaΛu[u ∈ a ↔ u = x ∨ u = y],
VbΛu[u ∈ b ↔ Vv(u ∈ v ∧ v ∈ a)],
VbΛu[u ∈ b ↔ Λx(x ∈ u → x ∈ a)],
Va[Vu u ∈ a ∧ Λu(u ∈ a → Vv[u ∈ v ∧ v ∈ a])],
ΛxΛyΛzΛaΛbΛcΛqΛr[Λu(u ∈ a ↔ u = x) ∧ Λu(u ∈ b ↔ u = x ∨ u = y) ∧ Λu(u ∈ c ↔ u = x ∨ u = z) ∧ Λu(u ∈ q ↔ u = a ∨ u = b) ∧ Λu(u ∈ r ↔ u = a ∨ u = c) ∧ P[q] ∧ P[r] → y = z] → ΛsVtΛy[y ∈ t ↔ VxVaVbVq(x ∈ s ∧ Λu[u ∈ a ↔ u = x] ∧ Λu[u ∈ b ↔ u = x ∨ u = y] ∧ Λu[u ∈ q ↔ u = a ∨ u = b] ∧ P[q])].
AxAyAz[P[(x, y)]
A
P[(x, z)]
AsVtAy[y e t <--> Vx(x
£ S A
->-
y
= z] ->-
P[(x, y)])].
134
RICHARD MONTAGUE
quantifiers on those variables. 1) The concepts of possible model and model may of course be used also in connection with other theories. For instance, a possible model of the theory of real closed fields will have the form
°
1) For an exact definition of the notion of truth in a possible model, see Tarski and Vaught [12]. 2) For the general notion of isomorphism of structures, see, for instance, Tarski [11].
ventionally identifying 1/0 with 0), and R is the usual relation of magnitude among real numbers. For set theory we first introduce a transfinite hierarchy of types. Intuitively, if a is any ordinal, T(a) is to be the a~ type in a Russellian hierarchy which begins with an empty set of individuals; the recursive definition is the following. T(O) = the empty set A. T(a+ 1) = the set of all subsets of T(a).
If λ is a limit number, T(λ) is the union of all sets T(ξ) for ξ < λ.
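The finite levels of this hierarchy are small enough to compute directly; a minimal sketch in Python (the representation by frozensets is, of course, only illustrative):

```python
# Computing the finite levels T(0), T(1), ... of the cumulative type hierarchy.

from itertools import chain, combinations

def powerset(s):
    items = list(s)
    return {frozenset(c) for c in
            chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))}

def T(n):
    cur = set()                    # T(0) = the empty set
    for _ in range(n):
        cur = powerset(cur)        # T(k+1) = the set of all subsets of T(k)
    return cur

for n in range(5):
    print(n, len(T(n)))            # sizes 0, 1, 2, 4, 16; each level includes the previous one
```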
(Notice that the types are according to this definition cumulative and hence not pairwise disjoint. For instance, the empty set is a member of both T(l) and T(2). This is in no way incompatible with Russell's theory. Though that theory requires distinct symbols for the empty set of type o and the empty set of type 1, nothing prevents the two symbols from designating the same object. The identity between the two symbols is not a sentence of the theory and can be neither asserted nor denied.) By a standard model of Zermelo-Fraenkel set theory is understood a structure isomorphic to
ΛP(P[0] ∧ Λx[P[x] → P[Sx]] → ΛxP[x]).
(Here P now serves as a genuine predicate variable.) Similar the secondorder theory of real numbers and second-order Zermelo-Fraenkel set 1) For several equivalent definitions of the notion of a strongly inaccessible ordinal, see Montague and Vaught [8], which contains investigations of certain properties of the structures (T(rx), E;(T(rx))).
theory are to be exactly like the theory of real closed fields and ZerrneloFraenkel set theory respectively, except that the Continuity Schema and the Replacement Schema are to be replaced by the second-order axioms obtainable from them by prefixing universal quantifiers on P. The possible models of any of these three theories coincide with the possible models of the corresponding first-order theory. (Thus as always the collection of possible models depends only on the sequence of nonlogical constants of the theory in question.) A possible model of a second order theory will be considered a model of that theory if all axioms of the theory are true under the interpretation supplied by the possible model; individual variables are regarded as ranging over the universe of the possible model (that is, the set which is the first constituent of the possible model), and (one-place) predicate variables as ranging over the set of all subsets of that universe. Now it is well known that the models of second-order Peano's arithmetic coincide exactly with those structures which were described earlier as standard models of (first-order) Peano's arithmetic, and it is rather easily shown that the models of the second-order theory of real numbers and those of second-order Zermelo-Fraenkel set theory coincide respectively with the standard models of the theory of real closed fields, and the standard models of (first-order) Zermelo-Fraenkel set theory. These facts suggest a unification of the divergent special notions: whenever we speak of the standard models of a first-order theory T we have in mind a related second-order theory U; the standard models of Tare then identified with the models of U. In any context in which standard models are considered the theory of basic interest seems always to be a second-order theory (or, in some cases which will not arise in this paper, a theory of higher than second order). We may, of course, consider as well various first-order subtheories of the second-order theory.') For some purposes, indeed, it is essential to do so in view of certain properties - compactness and the Lowenheim-Skolern property, for example - which are possessed by all firstorder theories but not by any interesting second-order theory. Suppose that we have selected a certain second-order theory as our 1) One theory is called a sub theory of another if all theorems of the first are theorems of the second. (A theorem of a first- or second-order theory T is a first- or second-order formula of T which is true in every model of T.)
basic object of attention. How are we to select a subtheory that might be called "the corresponding first-order theory" ? Let us consider a procedure applicable to those second-order theories T whose axioms, like those in our examples, all have the form
where Po, ... , P n-l are predicate variables and ¢ is a formula without second-order quantifiers. 1) By a first-order instance within T of the formula displayed above, we understand a first-order formula obtainable (without clash of bound variables) by substituting formulas of T for the predicate variables Po' ... , P n - 1 in ¢. The first-order theory corresponding to T might then be identified with that theory whose non-logical constants are those of T and whose axioms are the first-order instances within T of axioms of T. It is according to this notion that the three second-order theories considered above can be said to have Peano's arithmetic, the theory of real closed fields, and Zermelo-Fraenkel set theory as their corresponding first-order theories. This way of selecting a first-order theory is, however, unnatural: we can easily find two equivalent second-order theories") whose first-order counterparts in the present sense are not equivalent. A much more natural procedure, and the one we shall adopt, is to identify the firstorder counterpart of a second-order theory T with that theory whose nonlogical constants are those of T and whose axioms consist of those firstorder sentences which are theorems of T. In the light of this analysis two of the first-order theories considered above, Peano's arithmetic and Zermelo-Fraenkel set theory, lose interest. Neither is equivalent to what we now regard as the first-order counterpart of second-order Peano's arithmetic or second-order ZermeloFraenkel set theory. The first-order counterparts of these two theories are, to be sure, not equivalent to any recursively axiomatized theories (or even to theories with arithmetical axiom sets, in the sense of Kleene [3]), and we may for some purposes wish to consider recursively axiomatized first-order subtheories. But Peano's arithmetic and Zermelon
1) The first-order axioms of our examples can be regarded as having this form with
= O.
2) Two first- or second-order theories are said to be equivalent if they have the same models.
Fraenkel set theory, though they satisfy this description, seem in no tangible sense pre-eminent among a wide range of other, non-equivalent theories which also qualify. The situation changes when we consider the second-order theory of real numbers. It is a result of Tarski [10] that what we now call the first-order counterpart of this theory is equivalent to the theory of real closed fields. 2. Rank-free set theory There are interesting structures of the form
order to maintain harmony with conventional model theory, which does not countenance any model with an empty set of elements.
3) The fact that these axioms give a theory with the required property is not completely obvious. Some preliminary derivations from the axioms which make the remainder of the proof rather simple may be found in the forthcoming monograph Montague, Scott, and Tarski [7].
(1)  ∀u[u ∈ a ↔ u ∈ b] → a = b,
(2)  ∃b∀x[x ∈ b ↔ ∃y(y ∈ a ∧ ∀z[z ∈ x → z ∈ y])],
(3)  ∃k∀m∀b(∀x[x ∈ m → x ∈ k] ∧ ∀x[x ∈ b ↔ ∃y(y ∈ m ∧ ∀z[z ∈ x → z ∈ y])] → b ∈ k ∨ ∀x[x ∈ a → x ∈ b]),
(4)  ∀P∀a∃b∀x[x ∈ b ↔ P[x] ∧ x ∈ a].
(In the last axiom, the familiar Aussonderungsaxiom, P is of course a predicate variable.) We may call this theory second-order rank-free set theory. (The name comes from the fact that the theory is neutral with respect to the ranks, or types, of its models.) Perhaps more interesting, at least from the viewpoint of philosophy and empirical science, is that kind of set theory which allows for the possible existence of individuals or non-sets (objects which contain no elements but differ from the empty set). We therefore consider also a modified version of the last system, and understand by second-order rank-free set theory with individuals that theory whose non-logical constants are ∈ and Σ (the latter understood as the predicate of being a set) and whose axioms are the following:

∀x[x ∈ a ↔ x ∈ b] ∧ Σa ∧ Σb → a = b,
y ∈ x → Σx,
Σa → ∃b∀x[x ∈ b ↔ ¬Σx ∨ ∃y(y ∈ a ∧ ∀z[z ∈ x → z ∈ y])],
∀a∃k∀m∀b(∀x[x ∈ m → x ∈ k] ∧ ∀x[x ∈ b ↔ ¬Σx ∨ ∃y(y ∈ m ∧ ∀z[z ∈ x → z ∈ y])] → b ∈ k ∨ ∀x[x ∈ a → x ∈ b]),
∀P∀a[Σa → ∃b(Σb ∧ ∀x[x ∈ b ↔ P[x] ∧ x ∈ a])].
(The last axiom is another version of the Aussonderungsaxiom.) A possible model of this theory will have the form
extension (possibly definitional, possibly axiomatic) of Zermelo-Fraenkel set theory with individuals1). Thus in our metatheory we allow for the possible existence of individuals. We do not commit ourselves as to their number; it may indeed be zero. We do not even need to commit ourselves as to whether the individuals form a set, although the natural approach is to assume, as in the next-to-last axiom displayed above, that they do. Now given any set U we can construct a transfinite cumulative Russellian hierarchy based on U. If α is any ordinal, T_U(α) is to be the αth type in this hierarchy; the recursive definition is the following.

T_U(0) = U.
T_U(α+1) = the union of T_U(α) and the set of all subsets of T_U(α).
If λ is a limit number, then T_U(λ) is the union of the sets T_U(ξ) for ξ < λ.
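The finite stages of this hierarchy are easy to compute when U is finite. The following Python sketch is only an illustration of the recursion just stated (the function names `powerset` and `stage` are ours, not Montague's); it builds the first few types over a two-element set of individuals, representing sets by frozensets.

```python
from itertools import chain, combinations

def powerset(s):
    """All subsets of the finite set s, returned as frozensets."""
    items = list(s)
    return {frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))}

def stage(U, alpha):
    """Finite approximation of T_U(alpha):
    T_U(0) = U; T_U(a+1) = the union of T_U(a) with the set of all subsets of T_U(a)."""
    t = set(U)                    # T_U(0) = U
    for _ in range(alpha):
        t = t | powerset(t)       # successor step
    return t

U = {'i1', 'i2'}                  # two individuals
for a in range(3):
    print(a, len(stage(U, a)))    # sizes 2, 6, 66
```

Only the limit clause, which requires a union over all earlier stages, has no finite analogue in such a sketch.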
(Notice that if U is the empty set and α any ordinal, then T_U(α) = T(α).) The possible models which most naturally come to mind in connection with set theory with individuals are structures of the form
ture, a model of second-order rank-free set theory with individuals. With the help of a few auxiliary notions we can state some facts about type structures which do not depend on the number of individuals. Let 𝔄 be a type structure, and let 𝔄 have the form

T_𝔄(0) = In_𝔄. T_𝔄(α+1) = the set of x in A such that, for all y, if y E x, then y is a member of T_𝔄(α). If λ is a limit number, then T_𝔄(λ) is the union of the sets T_𝔄(ξ) for ξ < λ.
It will turn out that A = T_𝔄(α) for some ordinal α; we call the least such ordinal the rank of 𝔄. If B is any set and α any ordinal greater than 0, then there is a type structure 𝔄 with rank α such that In_𝔄 = B. If B is a non-empty set, there is a type structure 𝔄 of rank 0 such that In_𝔄 = B. If 𝔄 and 𝔄' are type structures of the same rank, and In_𝔄 and In_𝔄' have the same cardinality, then 𝔄 is isomorphic to 𝔄'. If 𝔄 and 𝔄' are type structures, and f is a one-to-one correspondence between In_𝔄 and In_𝔄', then there is at most one isomorphism between 𝔄 and 𝔄' which is an extension of f. If 𝔄 is a type structure, 𝔄 = ⟨A, R, B⟩, and T_𝔄(α) is non-empty, then ⟨T_𝔄(α), R', B⟩ is a type structure of rank α, where R' is the restriction of R to T_𝔄(α). If in second-order rank-free set theory and second-order rank-free set theory with individuals we drop the initial quantifier of the Aussonderungsaxiom and treat the result as a schema, we shall obtain two first-order theories, which we may call first-order rank-free set theory and first-order rank-free set theory with individuals respectively. Like Peano's arithmetic, these theories have no theoretical pre-eminence among a number of recursively axiomatized first-order subtheories of the corresponding second-order theories, but they have some practical interest. Various useful systems of set theory, some well-known and others as yet unexploited, can be obtained from these two first-order theories in a uniform way, by the addition of "axioms of infinity", that is, principles imposing conditions on the ranks of models. For example, Zermelo-Fraenkel set theory with individuals may be
obtained by adding to first-order rank-free set theory with individuals the axiom ∀x∃y x ∈ y,
as well as all formulas of this theory obtainable without clash of variables by substituting a formula for the two-place predicate R in the schema

∀x∃y R[x, y] → ∃b[∀x(x ∈ a → x ∈ b) ∧ ∀x(x ∈ b → ∃y[y ∈ b ∧ R[x, y]])]
(which may, to borrow a term from Raymond Smullyan, be called the Principle of Disjunctive Closure). Additional examples are furnished by Zermelo-Fraenkel set theory and the set theory of Morse, formulations of which can be obtained similarly, starting from first-order rank-free set theory. Among the less familiar examples are a theory T1 which, roughly speaking, bears the same relation to the theory of Morse as that theory bears to Zermelo-Fraenkel set theory, a theory T2 which bears the same relation to T1, and so on; the latter theories seem relevant to foundational problems arising in connection with several branches of mathematics, for instance, abstract algebra, model theory, and algebraic topology.1)

3. Higher-order logic
The problem of defining truth (or more precisely, truth in a model) for sentences containing variables of transfinite type seems never to have been completely settled in the literature, but can be rather easily and naturally solved with the aid of considerations in the preceding section. As ingredients of higher-order formulas we assume the following disjoint categories of symbols to be available: (1) the logical constants listed in footnote 2, p. 131, (2) an additional logical constant η, regarded as indicating membership, (3) for each natural number n, a collection (whose cardinality need not be specified here) of n-place predicates, (4) for each natural number n, a collection (again of unspecified cardinality) of n-place operation symbols, and (5) for each ordinal α, a denumerable set of variables of type α.
1) For a discussion of a few of these problems see MacLane [4].
The quantifiers and = may apply to variables of any type; the predicates and operation symbols, on the other hand, may apply only to individual variables (that is, variables of type 0) or, somewhat more generally, to individual terms.
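The following Python sketch is our own bookkeeping illustration of this inventory (the class names and the helper `well_applied` are invented for it, only finite type ordinals are represented, and individual terms built with operation symbols are ignored for brevity): each variable carries the ordinal giving its type, and predicates are checked to apply only to variables of type 0, while quantifiers and = are unrestricted.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Variable:
    name: str
    type_ordinal: int          # the type of the variable; 0 = individual variable

@dataclass(frozen=True)
class Predicate:
    name: str
    places: int                # an n-place predicate

def well_applied(pred, args):
    """Check the restriction stated above: an n-place predicate may be applied
    only to n individual variables (variables of type 0)."""
    return len(args) == pred.places and all(v.type_ordinal == 0 for v in args)

x = Variable('x', 0)           # an individual variable
P = Variable('P', 1)           # a one-place predicate variable (type 1)
R = Predicate('R', 2)          # a two-place predicate

print(well_applied(R, [x, x])) # True
print(well_applied(R, [x, P])) # False: predicates do not apply to higher-type variables
```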
We understand by an individual term an expression t which is a constituent of some finite sequence s such that each constituent of s is either an individual variable or, for some natural number n, the concatenation of an n-place operation symbol with n earlier constituents of s; by a term either an individual term or a variable of type greater than 0; and by a higher-order formula an expression φ which is a constituent of some finite sequence s such that each constituent of s is either (1) t = u or t η u, for some terms u and t, (2) the concatenation of an n-place predicate with n individual terms, for some natural number n, (3) the negation of an earlier constituent of s, (4) the conjunction, disjunction, implication, or equivalence formed from earlier constituents of s, or (5) ∀vφ or ∃vφ, where φ is an earlier constituent of s and v a variable of arbitrary type.

This characterization of higher-order formulas could have been more liberal in two ways. In the first place, the only higher-order variables we have admitted are one-place predicate variables. We could also have included predicate variables of various numbers of places (and of various types). Such an approach is appropriate in connection with a truncated higher-order logic, in which a finite upper bound is placed on the types considered. When no such bound is present, however, everything expressible by use of predicate variables of several places can also be expressed by use of one-place predicate variables; and it seems desirable to avoid the unpleasantly complicated type hierarchy that predicate variables of various numbers of places would introduce. A second possible extension of the present approach would consist in admitting predicates and operation symbols that meaningfully apply not to individual terms but to variables of higher type. Such an approach would require a more general notion of a model than the one usual in the literature, which is also the one defined below. The more general notion of a model would indeed have interest. It would permit a unified treatment of such structures as topological spaces, uniform spaces, and systems of classical particle mechanics of a given number of dimensions,1)
1) The last phrase is used in the sense of McKinsey, Sugar, and Suppes [5].
which cannot well be construed as first-order structures (that is, models in the usual sense). But a discussion of such matters is best deferred to another occasion. On the other hand, it is possible to imagine an approach of a more, rather than less, restrictive character than ours. We could have imposed some condition of stratification on the formulas t = u and t η v, where t, u, v are variables - for instance, that t and u have the same type, and that the type of v be greater by one. It seems preferable not to impose such restrictions, however; they would lead to no simplification in the problem of interpreting higher-order formulas but only to a considerable reduction in power of expression. In the examples considered earlier a model (or a possible model) was construed as a sequence, and position in that sequence determined the intended correspondence between interpretations and symbols to be interpreted, which were themselves given in a sequence. In the general situation it is more convenient to establish this correspondence directly, by means of a function. Thus we now understand by a model an ordered pair
Now let
F(n), (2) for each ordinal α, the variables of φ of type α are regarded as ranging over the set T_A(α), (3) = is interpreted as the identity relation, and (4) η is interpreted as membership.1) Thus the types are regarded as cumulative; a consideration of transfinite types would indicate rather clearly the desirability of this course. Furthermore, the choice between cumulative and non-cumulative types has no effect on the truth of second-order formulas satisfying the usual conditions of stratification. In particular, let φ be a higher-order sentence such that (1) whenever u = v is a subformula of φ, both u and v are individual terms, and (2) whenever u η v is a subformula of φ, u is an individual term and v is a variable of type 1. In addition, assume that
be regarded as true in
preted as the identity relation, and (iv) η is interpreted as the relation R, where 𝔅 has the form
References
[1] Kurt Gödel, The Consistency of the Continuum Hypothesis (Princeton 1940).
[2] John L. Kelley, General Topology (Princeton 1955).
[3] S. C. Kleene, Introduction to Metamathematics (Princeton 1952).
[4] Saunders MacLane, Locally Small Categories and the Foundations of Set Theory. Infinitistic Methods (Warsaw 1961).
[5] J. C. C. McKinsey, A. C. Sugar, and Patrick Suppes, Axiomatic Foundations of Classical Particle Mechanics. Journal of Rational Mechanics and Analysis 2 (1953) 273-289.
[6] Richard Montague, Reductions of Higher-order Logic. Proceedings of the International Symposium on the Theory of Models (Amsterdam, forthcoming).
[7] Richard Montague, D. S. Scott, and Alfred Tarski, An Axiomatic Approach to Set Theory (Amsterdam, forthcoming).
[8] Richard Montague and R. L. Vaught, Natural Models of Set Theories. Fundamenta Mathematicae 47 (1959) 219-242.
[9] Patrick Suppes, Axiomatic Set Theory (Princeton 1960).
[10] Alfred Tarski, A Decision Method for Elementary Algebra and Geometry, 2nd ed. (Berkeley and Los Angeles 1951).
[11] Alfred Tarski, Contributions to the Theory of Models I. Indagationes Mathematicae 16 (1954) 572-581.
[12] Alfred Tarski and R. L. Vaught, Arithmetical Extensions of Relational Systems. Compositio Mathematica 13 (1957) 81-102.
EXISTENCE IN LESNIEWSKI AND IN RUSSELL
A. N. PRIOR
Manchester University, Manchester, UK
Anyone who learns his logic in Manchester, Notre Dame or Chapel Hill, is bound to hear a good deal about Lesniewski's logic, and especially about the discipline that Lesniewski called "ontology"; but anywhere else in the world, even in Warsaw, the student is likely to find Lesniewski's name hardly mentioned. I suspect that one of the reasons for this is that Lesniewski's theories, and again I have in mind especially his ontology, have often been rather puzzlingly presented. It is often said by its advocates that ontology is an answer to an early prayer of Russell's. Principia Mathematica contains a theorem, namely *24.52, which asserts that the universal class is not empty, that is, that there is at least one individual. And this is a theorem which Russell found an embarrassment - in a footnote to his Introduction to Mathematical Philosophy (p. 203) he describes it as "a defect in logical purity". In Lesniewski's ontology this defect, if it is one, doesn't exist - ontology is compatible with an empty universe. What is puzzling is the explanation which is commonly given of this achievement. The lowest-type variables of ontology are described, like Russell's lowest-type variables, as standing for names; but it is said that whereas Russell's variables stand for singular names only, Lesniewski's stand equally for empty names, singular names and plural names. Existence is therefore something that can be significantly predicated with an ontological "name" as subject - "a exists" is a well-formed formula, and is in some cases but not in all cases true, and that it is true in some cases is, although true and statable, not a theorem of the system. Another peculiarity, connected with the preceding ones, is that the ontological symbol "ε", unlike the Russellian symbol "ε",
stands between expressions of the same logical type. "a ε a", for example, is well-formed. What are we to make of all this? I want to suggest that what we are to make of it is that ontology is just a broadly Russellian theory of classes deprived of any variables of Russell's lowest logical type. Ontology's so-called "names", in other words, are not individual names in the Russellian sense, but class names. This immediately explains the first two of the peculiarities I have mentioned. For while it makes nonsense to divide up individual names in this way, class-names are divisible into those which apply to no individuals, those which apply to exactly one, and those which apply to several. It makes sense also to say that some classes "exist", either in the sense of having at least one member or in the sense of having exactly one member, and some classes do "exist" in these senses and some do not. The disappearance of the theorem that there is a non-null class still requires explanation, and so does the type-homogeneity of the arguments of the functor "ε", but we shall consider these points shortly. Before getting on to that, I want to mention one feature of Lesniewski's so-called "names" which exponents of his theories don't generally make much of, but which seems to me to tell quite conclusively in favour of interpreting them as class-names, namely that they can be logically complex. Given any pair of Lesniewskian names we can for instance form their logical product and their logical sum, and we can construct a name which is logically empty, e.g. the compound name "a and not-a". Russellian variables of lowest type, on the other hand, are logically structureless - you can construct other things out of them, together of course with other symbols; for example, you can construct a one-place predicate (e.g. "- shaves Peter") out of a two-place predicate and a name; but there is nothing out of which you can construct Russellian individual names. (Definite descriptions, e.g. "the x such that x shaves Peter" are notoriously not "names" in Russell's view.) The formal development of a Russellian class theory without variables of the lowest type presents, however, some very taxing problems. For as Russell presents this theory, individual variables are not just an optional appendage which can be lopped off without damaging the rest of the system. On the contrary, Russell regards classes as logical constructions out of individuals and functions of individuals. He has so to speak a primary language and a secondary one. In the primary language there
are just individual names, functors forming sentences out of these, functors of higher type operating on the preceding functors, and so on. His class theory is merely a set of convenient alternative locutions by which talk about individuals can sometimes be replaced. The basic sentences of this class language are of the form "x ε α", asserting an individual's membership of a class, and where "α" is, say, the class of things that f, the form "x ε α", "x is an f-er", is simply a re-writing of "x f's", or more accurately of "For some g, such that if anything f's it g's and vice versa, x g's". And when we wish to define complex classes, for example the logical product of two classes, or the null class, we fall back again and again on this basic form - the logical product of the α's and the β's, for example, is the class of x's such that x is an α and x is a β - these x's, these individual names, are just not dispensable. Lesniewski meets this difficulty by introducing an undefined constant expressing a relation between classes - it can be, but it does not need to be, the functor "ε" previously mentioned. This functor, as I have also previously said, has arguments of the same logical type, so that what it expresses is not Russellian class-membership. It expresses rather the inclusion of a unit class in another class. This interpretation of the Lesniewskian "ε" was suggested some time ago by Jerzy Los, and although Lesniewski himself did not like it, no other interpretation of the symbol seems to me intelligible. It is tantamount to reading the form "a ε b" as "The a is a b", or "There is exactly one a and every a is a b". There are of course Russellian forms, though not the form "x ε α", that have this meaning. And the Lesniewskian form "a = b" does not express Russellian class identity - "The a's coincide with the b's" - but means rather "The a is the b", that is, "There is exactly one a and exactly one b and they are the same". But this is not quite Russellian individual identity either - "The a is the a" is false if there are no a's, or if there are several, but the Russellian "x = x" is a law of the system, and is in fact definable as the obvious truism "For any f, if x f's then x f's". So if we define individual existence as Lesniewskian self-identity, it amounts to a class's being a unit one, and is predicable of some classes but not of others, whereas if we define it as Russellian self-identity it is predicable of everything a Russellian name can stand for. Complex classes can also be defined by using the Lesniewskian "ε". Lesniewski's rules of definition are in fact a little complicated, more
complicated than those which are required in Russell's primary language, the language of individual names and functors operating on these; but they constitute an interesting solution of a genuine and interesting problem, and they are much tidier than the rules for translating Russell's secondary language, in which he talks about classes, into his primary one; they do not give rise, for example, to scope ambiguities. And while Lesniewski's procedure requires the introduction of a primitive constant that Russell's doesn't need, since Russell can get all the constants he wants in his primary language out of propositional calculus and quantification theory, this very fact gives Lesniewski greater freedom over what his class theory will or will not contain. For this constant "ε", in terms of which existence and non-existence, for example, are definable, must have special axioms laid down for it, and it is easy enough so to choose these axioms that the formula "For some a, the a exists", or "For some a, the a is the a" is not a theorem. No Lesniewskian denies that it is a truth, but it is not a provable truth in Lesniewski's system. From Russell's system, on the other hand, it is impossible to delete this theorem, for what Russell means by the non-emptiness of the universe is that for some x, x is x, that is, for some x, for every f, if x f's then x f's, and this is provable from propositional calculus and quantification theory alone. Or if we can eliminate it by attaching complicated provisos to the ordinary rules of quantification, the resulting system is very difficult to interpret. It is profitable at this stage to ask ourselves whether the appearance of this theorem in Principia Mathematica really does indicate a "defect in logical purity", and if so what is the source of this infection. Basically it comes from the interpretation of Russell's lowest-type variables as standing for individual names, that is to say symbols whose only contribution to what is actually said by the sentences in which they occur is the identification of the individual objects that the sentences are about. If such a symbol fails to identify any individual object, the sense of the sentence is incomplete and nothing is really said. Using the word "This" as a symbol of this sort, what is said by "This exists" is bound to be either a truth or nothing at all. Hence the assumption that there are complete statements of the form "fx", where "x" is a symbol of this kind, already involves the non-emptiness of the universe. One way of purging logic of this assumption would be to conceive quantification
theory as being concerned simply with the application of quantifiers to functors with their arguments, without regard to what parts of speech these functors and arguments are. The form of quantification theory would in fact be unchanged if we interpreted Russell's lowest type variables as standing for Lesniewskian "names", that is to say class names, and his predicates for functors forming sentences out of these. The form "For some x, for all f, if fx then fx", or as we might now prefer to write it "For some a, for all f, if fa then fa" would still be provable from propositional calculus and quantification theory alone, and indeed it is so provable in ontology, but it now carries no existential implications, since an a that would instantiate it could be an empty class. If, however, we may regard Russell's interpretation of his lowest-type variables as an extra-logical matter, we may equally so regard the undefined constant which is required in ontology. For as we have seen we can define existence in terms of this constant, and we can formulate the proposition that for some a, the a exists, even if we so choose our axioms that we cannot prove it. Such a choice of axioms seems, indeed, strangely arbitrary - the proposition concerned is, after all, formulated purely in terms of the constants and variables of the system, and is acknowledged to be true, so if those constants are regarded as purely logical, why is the truth not so regarded? I cannot really see much sense in this. It may seem from what I have said that ontology, on my interpretation of it, is committed to the existence of classes as nameable entities, though in fact Lesniewski was notoriously nominalistic. But this is a misunderstanding, arising from the use of the perhaps unfortunate term "class-name". What we have to do with here are common nouns, and these are not strictly speaking names of objects at all. When we read the Russellian "x ε α" as "x is a member of the class of α's", for example, "Russell is a member of the class of men", this looks as if we are asserting a relation between a concrete object and an abstract one, but the theory of types itself should warn us that this is not quite right, and we might do better to read the form simply as "x is an α", for example "Russell is a man". Here the form "is a" is not quite a proper verb, that is a functor which makes sentences out of individual names; rather it makes a sentence out of a name and a common noun. And the functors which join the a's and b's in ontology, and the α's and β's in class theory, are not, properly
speaking, predicates; they are functors like "Every - is a -", "The - is a -", "There is no such thing as a -". In fact, these functors which take arguments of Lesniewski's lowest type include ordinary numerical functors, like "There is exactly one -", "There are exactly forty-three -s", and so on. It is no doubt convenient to use forms like "The class of a's is an empty class", "The class of a's is a member of the class of pairs", and so on, and Lesniewski introduces a higher-order "ε" which is so defined that "f ε g" may be read as "The unit class-of-classes f is included in the class-of-classes g". But these are no more than convenient locutions; "The class of a's is an empty class", for example, means no more and no less than "There is no such thing as an a", from which the suggestion of naming an abstract object, the class of a's, has been removed. It is true that Lesniewski quantifies over variables of his lowest type, and indeed over variables of all types, and there is a doctrine current among some American logicians that any variable subject to quantification thereby counts as standing for a name, but this seems to me a quite eccentric criterion of namehood. What ontology in fact does is to combine the maxim that only individuals are real with the view that the only way we can linguistically get at individuals is by speaking of them as what certain common nouns apply to - maybe uniquely; and that their application is unique is of course something that can be said within the system, not by having Russellian individual names in it, but by having as it were an individuating functor, namely the Lesniewskian "ε" or "The - is a -". The phrase "The so-and-so" is not, as it is in Frege, itself an individual name; there are no individual names; but the phrase does occur as part of the larger functor, and so to speak individuates, or purports to individuate, as it makes the full statement. There are many contemporary philosophers, here in Oxford for example, who are not very happy about Russellian individual names, and would rather like to do without them, and ontology seems to me worth offering to these philosophers as a system in which their programme is really carried out. In fact it may be far less important as an answer to one of Russell's prayers than as an answer to one of the prayers of the anti-Russellians. Lesniewski's own system is, indeed, characterised by an extreme extensionalism which is not likely to appeal very much to the philosophers I have in mind, and for that
matter it doesn't appeal to me either; this extensionalism, moreover, is as thoroughly wrought into Lesniewski's methodology - underlying, for example, his rules of definition - as the use of individual names is wrought into Russell's theory of classes. However, I am sure that with a little trouble one can disentangle the more desirable features of ontology from this less desirable one, just as ontology itself disentangles the pure theory of common nouns from its Russellian name-and-predicate basis.
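The Łoś reading of the Lesniewskian "ε" discussed above can be made concrete in a small sketch (ours, not Prior's; the representation of "names" as Python sets of individuals, and the function names, are assumptions of the illustration): "a ε b" becomes "there is exactly one a and every a is a b", existence becomes self-ε, and "a ε a" then holds exactly for the unit classes.

```python
def epsilon(a, b):
    """Los reading of 'a epsilon b': there is exactly one a, and every a is a b."""
    return len(a) == 1 and a <= b          # <= is subset for Python sets

def exists(a):
    """'The a exists', defined as 'the a is an a'."""
    return epsilon(a, a)

def identical(a, b):
    """Lesniewskian 'a = b': exactly one a, exactly one b, and they coincide."""
    return len(a) == 1 and len(b) == 1 and a == b

men = {'russell', 'lesniewski', 'prior'}   # a plural name
russell = {'russell'}                      # a singular name
pegasus = set()                            # an empty name

print(epsilon(russell, men))   # True: "The Russell is a man"
print(exists(pegasus))         # False: empty names name nothing
print(epsilon(men, men))       # False: 'a epsilon a' fails for plural names
```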
FUNCTIONS AND ROGATORS 1)
A. SLOMAN
University of Sussex, Brighton, UK
Section A

1. The concept of a "function", though frequently used by logicians (e.g. in talking about truth-functions or propositional functions), has rarely been discussed systematically since the writings of Frege and Russell. The notion may be approached in several different ways, either syntactically, through the notion of a "function-sign", or semantically, through the notion of what corresponds to such signs. The syntactical approach may either deal with function-signs as "incomplete" or "unsaturated", following Frege, or it may deal with them as complete signs (e.g. signs containing variable-letters "x", "y" etc., or signs prefixed with Church's lambda operator). The latter approach is more common, the former more fundamental. (A similar distinction could be made at the semantic level: see end of par. 11, below.) The semantic approaches may also be subdivided into two sorts, depending on whether they are intensional or extensional. Once again, the latter is more common, the former more fundamental. It would be of considerable interest and importance for the philosophy of logic to analyse these various approaches and their interrelations, especially as most modern text-books are somewhat narrow, favouring one or other approach as the only acceptable one, others being, at best, mentioned with a few disparaging remarks.2)
1) I wish to thank Michael Dummett and members of the Philosophy department at Hull University, for helpful comment and criticism at various stages in the development of this paper.
2) See, for example, P. Suppes, Introduction to Logic (Princeton 1957) 229f. Similar remarks are made by A. Tarski on p. 72 of his Introduction to Logic (New York 1946).
2. In this paper only the semantic conceptions will be discussed. An attempt will be made to explain the difference between the extensional and the intensional approach, with the aim of showing that the latter is not just a confused version of the former, but is something quite different, and, in one sense, prior to the other. This is not a new suggestion. Some of what I have to say has been said before, e.g. by Russell (in Introduction to Mathematical Philosophy, p. 12 ff, p. 183 ff) and F. P. Ramsey (in The Foundations of Mathematics, p. 15). But I am not aware of the existence of any detailed discussion of the distinction or its applications. This first section will be devoted to a brief explanation of the distinction. The next (section B) will compare it with other distinctions likely to be confused with it. In the final section (C) some applications will be mentioned.
3. The following notion of a function-sign will be assumed to be familiar: a function-sign is obtained from a sentence or referring expression (for example), by replacing one or more words or phrases in it by so-called "variable-letters" such as "x" or "y". Examples are: "The mother of x", "The town in which x was born", "y is the father of x". The semantic approach involves regarding a function as something which, in some sense, corresponds to such a function-sign. It is said to take arguments and yield values correlated with the arguments. If a name or sign for an argument is substituted for each variable-letter in a function-sign, then the result is taken to be a name or sign for the value correlated by the corresponding function with that argument or set of arguments. The things which correspond to function-signs and which take arguments and yield values are normally described as "functions", but I shall use two words "function" and "rogator" to mark the difference between the extensional and the intensional concepts. (This is less cumbersome than talking about "extensional functions" and "intensional functions", and avoids confusion which might arise out of the fact that this latter terminology has been used to mark another distinction, to be mentioned below. I retain the word "function" for the extensional concept, since that seems to be its normal use at present, though it could be
argued that the normal meaning is somewhat indefinite in this respect. A mathematician recently said to me that he thought of a function as a sort of machine, which churned out numbers as numbers were fed into it. This could be taken as an intensional explanation.) Functions and rogators, then, are thought of as corresponding to function-signs, and as taking arguments and yielding values. This much they have in common.
4. In order to explain the difference between functions and rogators we need the notion of "extensional equivalence". Two functions, or two rogators, are said to be extensionally equivalent if (a) they each have values for the same arguments, and (b) they correlate the same values with the same arguments. That is, two functions, or rogators, "Fx" and "Gx" are extensionally equivalent if, and only if, (x) (y) [(y
= Fx) ==
(y
=
Gx)].
The difference can now be explained. Extensional equivalence is a necessary and sufficient condition for the identity of functions, but not for identity of rogators. Thus, the functions corresponding to "the mother of x" and "the woman first loved by x" may be extensionally equivalent, in which case they will be one and the same function. But this does not mean that there will be one and the same rogator corresponding to them, even though the two rogators are extensionally equivalent. To say that rogators are intensional entities, then, is simply to say that extensional equivalence does not guarantee identity of rogators: there are no further metaphysical or psychological implications. Of course, this account of the difference between rogators and functions is not a definition of either. We may regard it as a partial definition, or a criterion for adequacy of a definition of the notions. Let us now see if we can give adequate complete definitions. 5. If we are allowed to make use of the notion of a set, then we can define "function" in the familiar way as a set of ordered pairs satisfying the condition that no two pairs in the set have the same second element. Since sets satisfy extensional criteria for identity it follows that this definition of "function" fits the criterion of the previous paragraph. That is, if two functions contain exactly the same ordered pairs, then they are identical, since the sets of ordered pairs are identical. But containing
exactly the same ordered pairs means correlating the same values with the same arguments. So we have found something (the usual thing) that can be called a "function". 6. It is not so easy to give a full definition of "rogator". We want things which take arguments and correlate them with values, thereby generating sets of ordered pairs, but which do not satisfy the extensional criterion for identity. I claim that we are talking about such things whenever we talk about functions as pairing off elements according to a plan1), or mention the way in which a function yields or produces its value from its argument2) or the principle of classification3). For example, it is clear that to the two expressions (1) "The sum of the first x odd numbers" and (2) "The number which is equal to the product of x by itself" there correspond two different methods or principles of calculation, even though when applied to positive integers they always give the same result. So although the two functions (on the domain of positive integers) corresponding to (1) and (2) are identical, there are other things which are not. Hence these other things, namely the rules, methods or principles (etc.) do not satisfy extensional criteria for identity. Let us therefore say that in talking about rogators we are simply talking about these other things, in effect, and that the criteria for identity of rogators are simply the criteria which we normally use for identifying and distinguishing these other things. Then talking about an object as an argument for a rogator which correlates it with a value, is just a neater, and more general, way of talking about the object as something to which a rule or principle may be applied in order to yield a result or outcome of the application. I shall not attempt to give an explicit definition of "rogator" in terms of "rule" or "method" or "principle", etc., since (a) these terms are in some contexts subject to the extensional-intensional ambiguity themselves, (b) it is not clear that their use is sufficiently general and (c) it would be odd to describe them as having arguments and values. The connection between the concept of a "rogator" and these other concepts will simply have to
1) W. V. O. Quine, Mathematical Logic (Cambridge, U.S.A. 1955) 198.
2) A. Church, Introduction to Mathematical Logic (Princeton 1956) 16.
3) F. P. Ramsey, The Foundations of Mathematics (London 1931) 15.
be hinted at and illustrated by the remarks already made, and the examples which will now be discussed.
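As a computational illustration of this point (ours, not Sloman's; the function names and the choice of a finite domain are assumptions of the sketch), the two rules of par. 6 can be written as two distinct pieces of program text, yet the sets of ordered argument-value pairs they generate on an initial segment of the positive integers coincide, so in the extensional sense they determine one and the same function.

```python
def sum_of_first_odd(x):
    """Rule 1: add up the first x odd numbers."""
    return sum(2 * k + 1 for k in range(x))

def product_with_itself(x):
    """Rule 2: multiply x by itself."""
    return x * x

domain = range(1, 101)          # a finite fragment of the positive integers

graph1 = {(x, sum_of_first_odd(x)) for x in domain}
graph2 = {(x, product_with_itself(x)) for x in domain}

print(graph1 == graph2)                          # True: one and the same set of ordered pairs
print(sum_of_first_odd is product_with_itself)   # False: two distinct rules
```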
7. We have admitted that the functions "the woman first loved by x" and "the mother of x" may be identical. But even if they are, it is clear that the principle by which we pick out a woman as someone's mother is quite different from the principle by which we select a woman as the first person loved by that person, even if we end up with the same woman in each case. So here we are applying non-extensional criteria for identity of the principles involved, and these criteria enable us to distinguish two rogators, even if there is only one function. Again, if we consider the expression "the town in which x was born", then we may say that there is a function corresponding to it which correlates (some, but not all) persons with towns. Suppose that Aristotle's first pupil, whoever he was, was born in Athens. Then Athens is the value of the function for that man as argument. But that man might have been born elsewhere, for example if his mother had decided to go on holiday just before his birth. In that case a different town would have been the value for the same man as argument. But a value of what? A different town could not be the value of the same function, for then the set of ordered pairs would be different, and so, since a function just is a set of ordered pairs (or at any rate something satisfying extensional criteria for identity), it would be a different function. Hence, if, as seems quite natural, we wish to say that the same something might have had a different value for the same argument, then, if we are not to contradict ourselves, we must regard the "something" as not satisfying extensional criteria for identity. Clearly, it is the same rogator that is wanted: for one and the same principle or rule might have correlated the same man with a different town if he had been born not in Athens but elsewhere. That is, the rogator corresponding to that rule might have had a different value for the same argument. 8. It is important to note that the remarks made in the previous paragraph could not have been made, and the reader would not have understood them, if they had not employed the concept of a rogator or some other non-extensional concept. We can therefore take the fact that the remarks are intelligible as demonstrating that there are such things as rogators, or at least that the concept of a rogator is a coherent one, and
not unfamiliar. There is a further argument, used by Russell, to show that there must be rogators. The argument is simply that unless there were such things we should not be able to talk about individual functions such as "the square of x" or "the town in which x was born", whose domains are either infinite or unsurveyable on account of being scattered about in space and time. For how can I have this set of ordered pairs in mind rather than that, and how can I know that you and I are talking about the same set in these cases? The function "the square of x" contains infinitely many different pairs of numbers, and the other function includes pairs containing persons and towns that I have never seen or heard of (especially if it applies to persons who have lived in the past, or will live in the future). So in neither case can I say that I have in mind just this function because I have identified all the pairs in it. And I cannot say that I am sure you have the same function in mind on the basis of having checked through the set of pairs which you have in mind. Thus, if there is one function that I have in mind, and if you have the same function in mind, it can only be because we use some principle, or rule, i.e. a rogator, according to which we can tell whether an ordered pair does or does not belong to the function in question. It follows that since we can and do identify and talk about functions with infinite or unsurveyed domains, there are such things as rogators. I am not saying that extensional functions could not exist if there were no rogators, only that individual ones could not be talked about or even thought about without them. (Though as pointed out by Ramsey in The Foundations of Mathematics, pp. 15 and 22, it may be possible to make general assertions about them, not mentioning individual ones, without presupposing the existence of a rogator. Whether there are some functions - or sets - to which no rogators correspond, so that they cannot be talked about or thought about individually, is a question which I shall not discuss. One form of Platonism involves giving an affirmative answer to this question. This sort of view seems to have lain behind the axiom of reducibility, and Ramsey's claim that the axiom was unnecessary.)
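Russell's point can be put computationally (again our illustration, not anything in the text): the function "the square of x" has infinitely many pairs and so cannot be surveyed, but a rule decides of any proposed ordered pair whether it belongs to the function.

```python
def belongs_to_square_function(pair):
    """Decide whether an ordered pair (argument, value) belongs to the function
    'the square of x', without enumerating the infinite set of pairs."""
    x, y = pair
    return y == x * x

print(belongs_to_square_function((12, 144)))      # True
print(belongs_to_square_function((10 ** 50, 3)))  # False, far beyond anything we could list
```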
9. These considerations seem to establish that there are such things as rogators and that they are, in one sense at least (namely, epistemologically), prior to functions. Although this fact was acknowledged by Russell, he did not wish to pay much attention to it, since he was preoccupied
with giving mathematics a logical foundation, and apparently thought this could be done without introducing intensional considerations. (See Introduction to Mathematical Philosophy, p. 187.) This may also explain why Frege apparently was not very interested in an intensional approach to the concept of a function. It should be noted at this stage that, although I have indicated in a rough sort of way what sorts of things rogators are, I have not yet given a definition, for I have not yet stated a set of necessary and sufficient criteria for identity of rogators. Certainly if "Fx" and "Gx" are the same rogator (involve the same rule or principle) then they must correlate the same arguments with the same values, and if a certain argument (e.g. Aristotle's first pupil) would have been correlated with a different value by "Fx" (e.g. "the town in which x was born") if the world had been different, then in the same conditions "Gx" would have had the same value for that argument. So extensional equivalence in all possible states of affairs (i.e. necessary extensional equivalence) is a necessary condition for identity of rogators. But we do not wish to say that it is a sufficient condition, since we wish to say that the two rogators mentioned in par. 6, namely "the sum of the first x odd numbers" and "the number which is equal to the product of x by itself" (defined on the domain of positive integers), are different rogators, despite the fact that they necessarily, i.e. in all possible states of the world, have the same values for the same arguments. It might be thought that the only difference is that they correspond to different signs, that is that the criteria for identity of rogators are partly syntactical. But this is not so, for it is possible that in some strange language the expression "the sum of the first five odd numbers" means what we mean by "the number which is equal to the product of five by itself", in which case the rogator corresponding to their expression "the sum of the first x odd numbers" would be different from the rogator corresponding to ours, since it would correspond to a different principle of calculation, despite the syntactical and extensional equivalence. These remarks should suggest that it is not easy to give necessary and sufficient criteria for identity of rogators, i.e. to explain, in a clear and non-circular manner, how we identify and discriminate rules or principles or methods of calculation. Ultimately, we simply have to make use of something like the notions "same pattern" and "different pattern", i.e. the notions of identity and difference of properties or universals. All explanation of meanings must start with examples, and it
seems clear that we have here something which can be taught by means of examples, but which cannot be described, except in a circular manner. I shall therefore not attempt to formulate sufficient criteria for identity of rogators. 10. Normally, when we wish to talk about a function, we specify the one in question not by enumerating its arguments and values, but by indicating some principle according to which they can be picked out. And this is usually adequately achieved by the use of a function-sign as illustrated in par. 3, above, for if the function-sign is constructed out of parts which have unambiguous meanings, the method of construction, together with those meanings, uniquely determines a principle or rogator. This permits us to talk about the rogator corresponding to such a sign, just as we talk about the function corresponding to it. We could, of course, introduce a notation for talking about functions and rogators by enclosing the function-sign in different sorts of quotation marks or by using prefixes, such as Church's prefix "λx-" for functions, and perhaps "ρx-" for rogators. However, if we talk about "the function 'Fx'" or "the rogator 'Fx'" there should be no ambiguity. (In such locutions the letter "x" is, of course, a sort of bound variable.) To one function there generally correspond many different rogators, since one and the same set of ordered pairs may be picked out in many different ways, i.e. according to many different rules or principles. Since, for reasons mentioned, no complete definition of "rogator" has been given, it may be useful to compare and contrast the function/rogator distinction with several other distinctions with which some may be inclined to confuse it.
Section B

11. Near the end of section 71 of The Logical Syntax of Language Carnap implies that Frege's distinction between a function and its value-range (Wertverlauf) is a distinction between intensional and extensional entities. But this seems to be a misunderstanding, for this distinction of Frege's is a distinction between entities which are "complete" and entities which are "incomplete" or "unsaturated", and, as far as I can see, has nothing to do with different criteria for identity. Frege did not use "function" to mean "set of ordered pairs", since he
defined the notion of a set or class in terms of the notion of a function. Nevertheless, it seems likely that he thought of functions in an extensional way, since he thought of concepts as being functions of a certain sort, and he thought of them as extensional. For he wrote: "coincidence in extension is a necessary and sufficient criterion for the occurrence between concepts of the relation corresponding to identity between objects". (See Translations from the Philosophical Writings of Gottlob Frege, by Geach and Black, p. 80, and also "Class and Concept" by P. T. Geach, in Philosophical Review, October 1955.) Strictly speaking, Frege could not regard the relation of identity as applicable to functions, since, for him, they were "incomplete" or "unsaturated", and this was why he had to introduce value-ranges. (Loc. cit. pp. 26ff.) Frege's distinction between complete and incomplete entities seems to be based, in the first place, on a syntactical distinction between function-signs and argument-signs. (Loc. cit. pp. 12ff., 32, 113ff., 152.) He apparently thought that the analysis of (say) a sentence into argument-sign and function-sign could be parallelled by analysis of what was expressed into function and argument, or, in some cases, concept and object. But the important thing about his functions, or concepts, was not intensionality but incompleteness. So Frege's distinction was not the same as the function/rogator distinction. Indeed, a follower of Frege might argue that just as Frege distinguished between "incomplete" functions and "complete" value-ranges, so ought I to distinguish between "incomplete" and "complete" sorts of rogators. The incomplete ones would correspond to Frege's incomplete function-signs, such as "the mother of ...", whereas the complete ones would correspond to complete signs or names for rogators, such as "the rogator mentioned in the previous sentence". So Frege's distinction cuts across mine. 12. Next it may be thought that the notion of a rogator might be explained in terms of the notion of a function by saying that if R is the rogator corresponding to the function-sign "Fx", and F is the corresponding function, then R is just a function which takes different arguments and values from F, as follows. If any object is taken as an argument of F, then that argument must be picked out or identified in some way, and the method by which it is picked out will fix the sense of the argument-sign which refers to it. Similarly, any sign which picks out the value of
F for a given argument must have a sense, corresponding to the way in which the thing is picked out. The suggestion I am considering is that R is just a function from senses to senses: that is, instead of taking objects as arguments and values, it takes senses of argument-signs and correlates them with senses of signs for the corresponding values of F. So R is supposed to be a set of ordered pairs of senses of signs, or a set of ordered pairs of ways of identifying arguments and values. Now there is no reason at all why we should not talk about such functions from senses to senses, and it seems certain that they would mirror some of the properties of rogators, such as there being many different rogators corresponding to one function. But the argument of par. 8 shows clearly that such "second-level" functions will not do everything that rogators can do. In particular they cannot explain how we are able to think and talk about particular functions whose arguments and values we cannot enumerate. For, if there are too many arguments, then there will automatically be too many senses of possible argument-signs, or ways of identifying objects, since every one of the arguments may be referred to in many different ways. Hence, if F2 is a second-level function from senses to senses, corresponding to the "unsurveyable" function F, then F2 will be even more unsurveyable, and we shall need a rogator in order to talk about it! This argument is most important, for it can be used against any attempt to construe a rogator as a kind of function. For example, Professor Richard Montague, referring to work done by Tarski, suggested to me that we could avoid talking about rogators if we talked instead about functions with an additional argument-place, to be filled by a (sign for a) possible state of the world, which would certainly enable us to deal with the examples of par. 7. But if such functions were really extensional, that is, if they consisted of sets of ordered triples, one member of each triple being a (sign for a) possible state of the world, then, as before, every such function would be far more complicated than the set of ordered pairs corresponding to the actual state of the world. For example, since we cannot enumerate arguments and values for the function "the town in which x was born", we shall find it even more difficult, on account of the greater multiplicity of arguments, to enumerate arguments and values for the function "the town in which x was (or would have been) born in possible world y" (apart from any difficulties in identifying the same particulars in all possible states of the world). Hence, as before, if we are to think or talk
about such a function, we need something non-extensional, such as a principle of correlation, or a rogator, by means of which it can be identified. This sort of argument works equally well against the much cruder suggestion that a rogator is just a time-dependent function. I shall not elaborate on this suggestion, for it should be clear by now that a rogator is not a type of function at all. 13. The terminology of "intensional functions" and "extensional functions" has been used by Russell (Principia Mathematica, 2nd ed., p. 72ff.) and Kneale (The Development of Logic, p. 609), but for them these terms are not so much concerned with the distinction between rogators and functions as with a different distinction, noticed by Frege (See "On Sense and Reference", in Translations). Quine has illuminatingly described the distinction as being between "referentially opaque" and "referentially transparent" contexts. The distinction may be illustrated by the pair of function-signs: (1) "the day on which our chairman first thought about x" and (2) "the day on which our chairman was first seen by x".
These both look as if they correspond to functions in the usual way, but there is a difference: for if a sign which does not refer to anything is substituted for "x" in (2), then the resulting sign does not refer to anything, and if two signs referring to the same argument are substituted in (2), then the two resulting complex expressions cannot refer to different days. On the other hand, there is no person referred to by "Mr. Pickwick", yet if it is substituted in (1) the resulting expression will probably refer to a definite day, and if the two expressions "Bertrand Russell" and "The author of The Principles of Mathematics", which refer to the same person, are substituted in turn for "x" in (1), then it is very likely that the resulting expressions will pick out different days. In short, the value of (2) depends only on what, if anything, is taken as argument, whereas the value of (1) seems to depend on how the argument is identified, that is, on the sense of the sign for the argument. We may say that (1) corresponds to an "oblique" function, (2) to a "direct" function. But it should not be thought that this distinction is the same as the distinction between rogators and functions. For the rogator "the town in which x was born" takes a value only if a sign which refers to something is taken
as argument-sign: it cannot have a value for a non-existent argument. Moreover, its value depends only on which person or animal is the argument, not on how the argument is identified. It is in order to avoid this ambiguity that I refrained from describing rogators as "intensional functions": as already remarked, such a terminology might be confused with Russell's. 14. Finally, it is clear that the distinction between rogator and function is in some ways analogous to Frege's distinction between sense and reference (op. cit.). For the reference of a name (or definite description) is an object, and the sense is that in virtue of which this object is the one referred to by the name or other expression. Similarly, it should by now be clear that the rogator corresponding to a functional expression is that in virtue of which a particular function (set of ordered pairs) is the one corresponding to that expression. But, for Frege's purposes, it is important to distinguish between a complete expression referring to a function, e.g. the expression "the function described in paragraph 7", or "the function (corresponding to) 'the square of x'", and an incomplete function-sign, such as that which is common to "the square of six" and "the square of twenty-two". Strictly speaking, the latter is not a sign at all, but an aspect or pattern or structure common to different signs. We could say that rogators and functions serve respectively as senses and referents for such "incomplete" entities. The previous kind, being "complete" signs, already fall under Frege's discussion of sense and reference. Despite the analogies, to identify rogators with senses of signs involves some linguistic strain, since it is odd to say that a sense can take arguments and have values. (I do not know whether Frege himself made any attempt to extend his sense/reference distinction to what he called function-signs.) This completes the comparison of the rogator/function distinction with other distinctions, and now all that remains to be done is to describe some applications of the distinction.

Section C
15. The first and most obvious application is analogous to Frege's application of the sense/reference distinction to identity statements. For, just as identity statements, such as "The evening star is (identical
with) the morning star" would be either quite trivially true or self-contradictory if referring expressions were directly associated with objects without the mediation of a sense (or method of identification), so also would statements of extensional equivalence between functions, such as "For any argument x, the function 'the mother of x' has the same value as the function 'the first woman loved by x"', reduce either to mere triviality or to self-contradiction if the sign for a function were directly correlated with a set of ordered pairs without the mediation of a rogator (that is, a rule or principle). The significance of statements of identity depends on the fact that it may be a significant (e.g. contingent) question whether two senses pick out the same referent. Similarly, it is because the question whether two rogators pick out the same function, the same set of ordered pairs, may be a significant (e.g. empirical) question, that statements of extensional equivalence have any significance. 16. Secondly, once the distinction has been made, we can see that the notion of a function can be explained or analysed or "reduced" in terms of the notions of a rogator and extensional equivalence. But it is not possible to "reduce" the notion of a rogator to that of a function, or set. A third application may be mentioned briefly here, in connection with this. Since "function" can be defined in terms of "rogator", and since a rogator is something like a rule or principle, which can be identified independently of any enumeration of the objects which it correlates, it follows that there is something wrong with the statement in Principia Mathematica (2nd ed., p. 39) that a function is only well-defined if its values are already well-defined. So there is something wrong with one argument in favour of the vicious circle principle. I shall not enlarge on this, but it seems likely that further investigation might lead to a better understanding of some of the problems connected with the ramified theory of types and the axiom of reducibility. 17. The fourth application which I shall mention is one which seems to me to be particularly interesting and important for the philosophy of logic. If we look back at two of our examples of rogators, namely "the town in which x was born", and "the square of x", which may be
referred to as "Fx" and "Gx" respectively, we notice the following difference (cf. par. 7, above): suppose the value of "Fx" for Aristotle's first pupil as argument to be the town Athens. Then the same rogator might have had a different value for the same argument, since the man in question might have been born in some other town. However, if we take the number six as an argument for the second rogator, we see that its value is thirty six, and could not have been anything else in any circumstances. It looks as if we have a distinction between two sorts of rogators: one sort has a value which depends on how things happen to be in the world, whereas the other fully determines its value independently of contingent facts. As pointed out to me by Mr. Dummett, there is something odd about putting the distinction in this way, since if we take different argument-signs, the rogators in question seem to exchange their positions with regard to the distinction. For example, if we apply "Fx" to the argument identified as "the man whose mother was the first woman in 1930 to give birth in Rome to her only son", then it is clear that the value (if there is one at all) must be Rome. On the other hand if we apply the rogator "Gx" to the argument identified as "the number of hours between lunch and dinner according to the Colloquium time-table", then it seems that although the value is thirty six, it makes good sense to say that it might have been different, if the printers had made a mistake on the time-table, or if the eating arrangements had been different. Moreover, a problem arises if we apply several different rogators, such as "the mother of x", "the father of x", "the wife of x", "the day on which x was born" etc., to the person taken previously as argument for "Fx", namely Aristotle's first pupil. For even if it makes sense to say of each of these in turn that it might have had a different value for the same argument, it certainly does not make sense to say that all of them might simultaneously have had different values for the same argument. For how could one have had a different mother, a different father, a different wife, been born on a different day, and in a different town, etc., and still been the same person? 18. Such difficulties are avoided if we describe the contrast not as one between types of rogators, but as a contrast between cases of application of a rogator to an argument to yield a value. In general, the value of a rogator for a given argument is fully determined by three factors
(a) the rogator itself (i.e. the principle according to which arguments are correlated with values), (b) the method by which the argument is identified (or, in particular, the sense of the expression taken as argument-sign) and (c) contingent facts, or how things happen to be in the world. As the application of "Fx" to Aristotle's first pupil, and the application of "Gx" to the number of hours between lunch and dinner according to the Colloquium time-table show, it is not generally the case that two of the factors suffice to determine a value. On the other hand, some of the other examples show that in some cases the first two factors (a) and (b) do suffice. Thus, how things happen to be in the world cannot affect the outcome of applying the rogator "the square of x" to an argument identified as the number six. 19. We can now give a precise formulation of the distinction referred to two paragraphs ago. It is a distinction between cases where two (or one) of the factors (a), (b) and (c) suffice to determine the value of a rogator for an argument identified in a certain way, and cases where all three factors are required. In particular, when the third factor, how things happen to be in the world, is not relevant, i.e. where (a) and (b) suffice to determine the value, I shall say that the application of the rogator satisfies the NCD-condition (the non-contingent determination condition). In most mathematical contexts the NCD-condition is satisfied, since the standard methods of identifying numbers, or other mathematical objects (e.g. as things which satisfy certain axioms), are such that once they have been used to fix a number they automatically determine all its properties and relations to other numbers, and therefore also help to determine the values of mathematical rogators taking those numbers as arguments. On the other hand, the normal methods of identifying non-mathematical objects, such as persons, places, etc., do not automatically determine their properties and their relations to other objects of the same kind, in general: these depend on contingent facts. Since the NCD-condition is normally satisfied in mathematical contexts, philosophers primarily concerned with the foundations of mathematics have not felt any pressing need to take account of the distinction between cases in which it is satisfied and cases in which it is not. This is connected with the fact that the distinction cannot be made if the concept of a function is used instead of the concept of a rogator. For, since a function is
identified in terms of which objects it correlates with which (i.e. via a set of ordered pairs), it makes no sense to distinguish cases in which a function might have had a different value from cases in which it could not have had a different value (for the same argument). For, since functions are extensional, the value cannot be different unless the function is. 20. This shows that the concept of a rogator or some other non-extensional concept is essential for making the distinction between cases where the NCD-condition is satisfied and cases where it is not. It may be noted that there is something unsatisfactory about describing the distinction in terms of factors which determine the value of a rogator. For it might be said that in all cases the value is in some sense determined by the two factors (a) the rogator and (b) the sense of the argument-sign, the difference between the general case and cases where the NCD-condition is satisfied being that in the latter the value is determined in two different ways. (E.g. if "a" is an expression referring to a person, then the sign "the town in which a was born" usually identifies a town. On the other hand, if "a" is the expression "the man whose mother was the first woman in 1930 to give birth in Rome to her only son", then we have two different ways of referring to the value, the new one being by means of the word "Rome". These two expressions - or their senses - must, independently of contingent facts, pick out the same thing, and this, it might be said, is all that satisfaction of the NCD-condition comes to.) This way of looking at the distinction, though illuminating, makes no difference for our present purposes and will not be discussed any further. It should also be noted that I have not taken account of the fact that in some cases, even where the NCD-condition appears to be satisfied, the three factors (a), (b) and (c) may fail to determine a value at all, on account of the failure of some term to refer, or for some other reason. E.g. Aristotle might not have had one first pupil, if at the start he took his pupils in groups; or his first pupil, if there was one, may not have been born in any town at all. In either case, applying the rogator "the town in which x was born" to Aristotle's first pupil could yield no value. This may be allowed for by inserting the qualification "if it has a value" at various points in the discussion. It has been omitted in the interests of simplicity.
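The contrast between the three factors can be made concrete in a small sketch. The code below (Python, purely illustrative; the toy "world" dictionaries, the person named, and the function names are my own, not the author's) models a rogator as a rule applied in a state of the world, and a function as the bare set of argument/value pairs which that rule happens to determine there.

```python
# A toy contrast between a rogator (a rule applied in a world of contingent
# facts) and a function (a bare set of ordered pairs). All names and data
# here are illustrative, not drawn from the paper.

def birthplace_rogator(person, world):
    # Rule: "the town in which x was born" -- value depends on the world.
    return world["birthplace"][person]

def square_rogator(n, world):
    # Rule: "the square of x" -- the world is irrelevant (NCD-condition holds).
    return n * n

world_actual = {"birthplace": {"Aristotle's first pupil": "Athens"}}
world_other  = {"birthplace": {"Aristotle's first pupil": "Stagira"}}

# The extensional function is just the set of argument/value pairs the
# rogator happens to determine in a given world.
def extension(rogator, arguments, world):
    return frozenset((a, rogator(a, world)) for a in arguments)

print(extension(birthplace_rogator, ["Aristotle's first pupil"], world_actual))
print(extension(birthplace_rogator, ["Aristotle's first pupil"], world_other))
# Same rogator, different extensions: factor (c) matters.
print(extension(square_rogator, [6], world_actual) ==
      extension(square_rogator, [6], world_other))   # True: (c) is irrelevant
```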
21. We have seen how the notion of a rogator, unlike the notion of a function, can be used in a formulation of the distinction between satisfaction of the NCD-condition and non-satisfaction of the condition. This may now be illustrated and applied further. Any two-valued rogator can be used to define a propositional function. If R(x, y, z, ...) is the rogator, whose value for any set of arguments is always one or other of the two objects K and L, whatever they may be, then there corresponds to it a propositional function which is satisfied by the ordered set of objects (a, b, c, ...) if and only if R(x, y, z, ...) has the value K for these objects as arguments (and a similar propositional function may be defined in terms of L). Conversely, it is possible to think of any propositional function as if it were simply the "value-range" (in Frege's sense) of a rogator taking the words "true" and "false", or any other arbitrarily selected pair of objects, as values.¹) The normal methods of replacing non-logical words and phrases in a sentence by variables to yield a sentential matrix can be used to represent such a rogator: e.g. "x is A", "All A's except x and y are B's", "p or q", "p or not-p", etc., can all be thought of as representing what I call propositional rogators. In general, the logical form of a proposition can always be thought of as a rogator, sometimes a rogator whose arguments are of different types, as in "x is A". This shows that the familiar analysis of propositions in terms of functions and arguments can be replaced by an analysis in terms of rogators and arguments. The sense of a sentence expressing a proposition is then partly determined by the rogator corresponding to the logical words and constructions in the sentence. We can conclude that insofar as rogators are prior to functions (i.e. to sets of ordered pairs), the sense of a proposition is prior to the set of its truth-conditions.
1) This seems to be what is important in Frege's decision to regard sentences as names of truth-values. To object that this is an unacceptable use of the word "name" is to miss the important point. The main advantage of this move is that it yields a theory of meanings, propositions and truth which fully accounts for all the properties and relations of these concepts which are of interest to logicians, without depending on discussions of such notions as "thinking", "asserting", "communicating" or the presuppositions and implications of such activities as statement-making. In short, it clearly sorts out confusions between logic and the sociology or psychology of language. In his paper on "Truth" (Proc. Aristotelian Society 1958-59) Michael Dummett attempts to criticise such a Fregean theory, but I think it can be shown that his criticisms fail to take account of its full potentialities. Perhaps Frege was not aware of them either. (It is hoped that this will be developed in another paper.)
(This might be developed to support a claim that there is a sense in which meaning is prior to use.) 22. We have seen that in most non-mathematical contexts the NCD-condition is not satisfied by the application of rogators to arguments, and this applies equally to the propositional rogators corresponding to logical constants or logical forms. For example, the rogator "p or q" may be applied to the two propositions "the moon is shining" and "dawn is breaking", and its value will be (say) the word "true" or the word "false". But there is no way of finding out which it is, even if the time and place of utterance are known, except by empirical investigation of contingent facts, for the value is not fully determined by the rogator and the methods by which the arguments are identified: the NCD-condition is not satisfied. This fact, that in general the third factor (c), mentioned in par. 18, is relevant to the value of a propositional rogator is what justifies correspondence theories of truth. To say that truth is a matter of correspondence with facts, is to mention one instance of the generalisation that the value of a rogator depends on how things happen to be in the world. (This shows that falsity is also a matter of correspondence with the facts.) Similarly, to say that any proposition determines a set of possible states of the world in which it would be true, its "truth-conditions", is to draw attention to one application of the more general fact that if R(x, y, z, ...) is a rogator, (a, b, c, ...) arguments of R, and K a possible value of R, then the rogator, the argument-set and the value K together determine a set of possible states of the world, namely those in which R would take the value K for the arguments in the set (a, b, c, ...). By considering rogators which take more than two values we thus find a natural interpretation for systems of many-valued logic. The fact that the propositional rogator "p or q", and the methods for identification of its arguments, do not in general suffice without the third factor to determine the value of the rogator, is what makes it possible for such logical words and constructions to be used in sentences which express contingent propositions, i.e. say things about the way the world happens to be. So the rules according to which they are used must make allowance for this connection with contingent fact, and this is a point that is missed by those who say that logical constants are governed by purely syntactical rules, that their use can be fully characterised by means of formal systems,
and that logic can be reduced to syntax. Moreover, it can be argued that to speak of "truth", "proposition", "validity" etc., in connection with a formal system which in no way allows for the influence of contingent facts (how things happen to be in the world) on truth-values, is simply to generate confusion, since it obscures the fact that no such formal system could ever do what can be done by real languages, namely enable us to make statements about something non-linguistic.
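A similar toy sketch, again with invented propositions and worlds, makes the point about propositional rogators concrete: the value of "p or q" varies with how the world happens to be, while a rogator constructed as "p or not-p" takes the same value in every world.

```python
# A toy propositional rogator: it maps argument propositions, together with a
# state of the world, to one of two arbitrarily chosen value objects
# ("true"/"false"). The propositions and worlds below are illustrative only.

def holds(prop, world):
    return world[prop]

def or_rogator(p, q, world):
    return "true" if holds(p, world) or holds(q, world) else "false"

def or_not_rogator(p, world):
    # "p or not-p": built from the same material, yet its value is fixed
    # no matter how the world happens to be (NCD-condition satisfied).
    return "true" if holds(p, world) or not holds(p, world) else "false"

worlds = [{"the moon is shining": m, "dawn is breaking": d}
          for m in (True, False) for d in (True, False)]

# "p or q" needs factor (c): its value varies across worlds.
print({or_rogator("the moon is shining", "dawn is breaking", w) for w in worlds})
# "p or not-p" does not: one value in every world.
print({or_not_rogator("the moon is shining", w) for w in worlds})
```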
23. Once we have seen that it is essential to propositional rogators that their applications do not always satisfy the NCD-condition, we are in a position to be struck, in a new way, by the fact that they sometimes do. How can their values sometimes be determined independently of contingent facts even though they are constructed or defined in such a way that contingent facts are to be relevant to their values? Or again, how is it that, starting with rogators whose values normally depend on contingent facts (e.g. "p or q", "not-p") we can construct new ones (e.g. "p or not-p") whose values never depend on contingent facts, whose applications always satisfy the NCD-condition? What I am getting at is that the necessary truth of a proposition can often be construed as illustrating the more general notion of satisfaction of the NCD-condition by the application of a rogator. And if we develop a theory of rogators, which describes and compares the different ways in which values of rogators may come to be determined independently of how things happen to be in the world (e.g. sometimes relations between the ways in which arguments are identified, sometimes relations between the method of identifying an argument and the rule for the rogator, sometimes only the way the rogator is constructed out of others, will be relevant), we may find (as I have found) that it is quite natural to say that there are different sorts of necessary truth, some of which can be described as "logical", some as "analytic", some as "synthetic". (This would provide an interpretation for a system of modal logic with different modal operators of different "strengths".) It is even to be hoped that studying the various ways in which the NCD-condition may come to be satisfied, and noticing their differences, may rid people of the inclination to oversimplify by saying that all necessity is due simply to language, or to conventions, or to syntax. This may be illustrated by the following comparison. A rogator whose
application to an argument identified in a certain way satisfies the NCDcondition is none the less a rogator, and the value which it takes is the very same thing as it may take in other applications not satisfying the NCD-condition. In particular, if it is a propositional rogator, and its application occurs in the construction of a proposition, then the mere fact that the NCD-condition is satisfied, e.g. if the proposition turns out to be one which is logically true, is no more justification for saying that what we have is not a proposition but a convention or rule, or for saying that it is not true in the same sense as other propositions, than there is for saying that the rogator is no longer a rogator, or that it does not have a value in the usual sense. 24. This completes my account of the applications of the notion of a rogator. I hope these rather condensed remarks show that we can look at some old problems in a new and illuminating way if we make the distinction between a function and a rogator.
INFINITELY LONG TERMS OF TRANSFINITE TYPE
W. W. TAIT
Stanford University, Stanford, Calif., USA
1. Functionals of higher type were introduced into proof theory by K. Gödel in [2], where he gives an interpretation of first order number theory in terms of the impredicative primitive recursive (p.r.) functionals of finite type. The aim of this work was to show that for the consistency of number theory, Gentzen's use of induction up to ε₀ (with respect to p.r. properties) can be replaced by a quite different constructive - but like Gentzen's, non-finitist - principle, namely, the assumption of constructive functionals of finite type and of their closure under p.r. operations. However, another view of Gödel's result is possible: Instead of assuming functionals of higher type, we may regard the definitional schemata for the p.r. functionals simply as rules of computation, i.e. for transforming symbols. On this view, Gödel's result may be interpreted as a consistency proof relative to the quantifier-free theory of p.r. functionals (his system T), which in turn must be justified by a proof that all the constant numerical terms of the theory can be transformed by the rules of computation into unique numerals. This latter viewpoint is what I will discuss here. It has become especially interesting in virtue of Spector's [7] extension of Gödel's interpretation to classical analysis by adding the general principle of bar recursion to the schema for primitive recursion. For, while on any reasonable conception of computability at higher types, the computable functionals are constructively closed under the p.r. operations, there is no known constructively valid interpretation of these operations together with bar recursion.¹)
1) There is a constructively valid interpretation for bar recursion of lowest type (i.e. in Spector's notation, where c is a sequence of numbers or functions) using the continuous functionals of finite types. By the methods of this paper, however, we can expect to obtain something more, namely, a least ordinal α such that the functionals of finite type defined by recursion up to α are closed under bar recursion of lowest type.
It appears that the best hope for a constructive justification of bar recursion lies in an analysis of the computations of bar recursive functionals. However, here I will discuss only p.r. functionals, or rather a certain generalization of them. On another occasion I will apply the present ideas to the analysis of functionals involving bar recursion of lowest type. William Howard has recently extended Gödel's interpretation to ramified analysis, as formulated by Schütte [6] and Feferman [1], using p.r. functionals of transfinite type. In view of this, it is fitting that we formulate the results of this paper for the wider context of transfinite types, even though we will not discuss Howard's result here. Actually, Howard uses a more complex concept of transfinite types than is introduced here, but his result can be obtained using the present conception. Just as Lorenzen and Schütte greatly simplified the problem of cut-elimination for formal proofs involving induction principles by effectively representing such proofs by infinite well-founded proof trees,¹) so a similar device will serve us here in analyzing the computations of functionals involving definition by recursion. Consider, for example, the (not necessarily numerical valued) functional φ defined by the primitive recursion:
φ(0, a, b) = a,   φ(n+1, a, b) = b(φ(n, a, b), n).
The p.r. functionals of finite type can be generated from such φ by means of λ-abstraction and explicit definition (where the definiens may contain 0 and the successor operation, of course). Now, write φ₀ = λab. a and φₙ₊₁ = λab. b(φₙ(a, b), n). Then φ is represented by the "infinite term" (φ₀, φ₁, ...) which takes the value φₙ for the argument n. The close connection between this infinitary rule of term formation and the rule of infinite induction (used by Schütte in the construction of proof trees), i.e. A(0), A(1), ... ⊨ (x)A(x), is made clear if we consider the Gödel interpretation of the latter:
(ψ)B(0, φ₀, ψ), (ψ)B(1, φ₁, ψ), ... ⊨ (Eφ)(x)(ψ)B(x, φ(x), ψ).
1) Lorenzen was first to give a constructive formulation of cut-elimination for infinite proofs, using transfinite induction. Schütte showed how to determine exactly in some cases which ordinals are involved, thus restoring the precision of Gentzen's original results.
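A rough finite-type illustration of this recursion, in Python (the function names are mine and nothing here belongs to the paper's formal calculus): the first definition writes the primitive recursion for φ directly, while the second presents the same functional as the family (φ₀, φ₁, ...) indexed by the numerical argument, which is exactly what the infinite term packages up.

```python
# Illustrative sketch only: phi(n, a, b) by primitive recursion, and the same
# functional presented as a family of branches phi_n indexed by n.

def phi(n, a, b):
    return a if n == 0 else b(phi(n - 1, a, b), n - 1)

def phi_n(n):
    # The n-th "branch", a lambda-abstract  lam a b. b(phi_{n-1}(a, b), n-1).
    if n == 0:
        return lambda a, b: a
    return lambda a, b: b(phi_n(n - 1)(a, b), n - 1)

# The "infinite term" is represented here by the map n |-> phi_n.
add = lambda acc, i: acc + 1            # a sample step functional b
print(phi(5, 0, add), phi_n(5)(0, add))  # both print 5
```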
For, using our infinitary rule of term formation, the solution for φ is simply φ = (φ₀, φ₁, ...). In dealing with transfinite types, we make two further uses of Lorenzen's and Schütte's idea. First, transfinite types are themselves to be infinite objects. Namely, if σᵢ and τᵢ are types for all i < α, α ≤ ω, and the σᵢ are all distinct, then the set {(σᵢ, τᵢ)}ᵢ<α is a type. In fact, it is the type of functionals which are defined for objects of any type σᵢ, i < α, and whose value for an object of type σᵢ is of type τᵢ. A particular case is when α = 1, in which case the type is simply denoted by (σ₀, τ₀). Also, we will use another kind of infinite term to piece together functionals of type {(σᵢ, τᵢ)}ᵢ<α from functionals of types (σᵢ, τᵢ), i < α. Namely, if for each i < α, φᵢ is of type (σᵢ, τᵢ), then φ = (φ₀, φ₁, ...) is of type {(σᵢ, τᵢ)}ᵢ<α, and for each argument x of type σᵢ (i < α), φ(x) = φᵢ(x). We are dealing here only with functionals of a single argument. But it is well-known how to reduce functionals of several variables to functionals of one variable (but of higher type). In the next section we will set up a formal calculus of infinitely long terms, or rather, a "semi-formal" calculus in Schütte's sense, which will codify the constructions of functionals which we have been discussing. In section 3, we will prove by induction up to α that every term t can be computed, i.e. can be reduced by certain conversion principles to an irreducible term t′, where the ordinal α is given explicitly in terms of certain bounds on t. From the proof of computability, in fact, it is easy to give a definition of t′ = f(t) where the functional f is defined by predicative transfinite recursion up to α. "Predicative" here means that f is obtained by explicit definition and recursive definition up to α, where the latter is used only to define numerical valued functions. Thus the definition of f does not involve functionals of higher type (unlike the definitions of the p.r. functionals of finite type).
2. We will use the usual notations for countable ordinals and operations on ordinals, but these should be interpreted in terms of a suitable constructive system of ordinal notations. In fact, for the sake of definiteness, we will assume that all discussion of ordinals here refers to the p.r. well-ordering of the natural numbers defined in Schütte [4]. All the ordinal functions which we will use are represented in that ordering by p.r. functions.
The types and their ranks are inductively defined by
Tp 1. 0 is a type with rank R0 = 0.
Tp 2. If σᵢ and τᵢ are types with Rσᵢ, Rτᵢ < β for all i < α ≤ ω, and if σᵢ ≠ σⱼ for i ≠ j (i, j < α), then the set ρ = {(σᵢ, τᵢ), β}ᵢ<α is a type with Rρ = β.
When the rank of {(σᵢ, τᵢ), β}ᵢ<α is not relevant in a given context, we usually abbreviate this type by {(σᵢ, τᵢ)}ᵢ<α, and for α = 1, by (σ₀, τ₀). It would not be satisfactory in Tp 2 to take for ρ simply {(σᵢ, τᵢ)}ᵢ<α and define Rρ to be the supremum of the Rσᵢ and Rτᵢ, since this supremum is not computable from ρ. On the other hand, our present definition has the unpleasant feature that two types may be extensionally the same, and be distinct only in virtue of their ranks. However, we can easily avoid this difficulty by taking
{(πᵢ, ρᵢ), β}ᵢ<α = {(σᵢ, τᵢ), δ}ᵢ<α
to mean that the sets {(πᵢ, ρᵢ)}ᵢ<α and {(σᵢ, τᵢ)}ᵢ<α are the same. In particular, the condition σᵢ ≠ σⱼ in Tp 2 should be interpreted in this way.¹)
The τ-terms (i.e. terms of type τ) and their lengths are inductively defined as follows:
Tm 1. Each variable a^τ of type τ is a τ-term with length |a^τ| = 0.
Tm 2. 0 is a 0-term. |0| = 0.
Tm 3. If s is a 0-term, then so is Ss. |Ss| = |s| + 1.
Tm 4. If ρ = {(σᵢ, τᵢ), β}ᵢ<α and, for each i < α, sᵢ is a τᵢ-term with |sᵢ| < β, then {λa^σᵢ. sᵢ, β}ᵢ<α is a ρ-term, with |{λa^σᵢ. sᵢ, β}ᵢ<α| = β.
1) D. Scott has suggested, quite reasonably, that we call the objects defined by Tp 1 and 2 type notations and distinguish them from the corresponding extensional types. Thus, σᵢ ≠ σⱼ in Tp 2 would mean that the types corresponding to the notations σᵢ and σⱼ are distinct. Of course, in speaking about notations, we would have to restrict ourselves to sets {(σᵢ, τᵢ), β} which can be represented by numbers, e.g. recursively enumerable sets. The present formulation allows that our types may be free choice sequences of a certain spread (see the remarks below about the constructive meaning of infinite terms), but it is too early to see whether this will be of any particular use.
Tm 5. If s₀, s₁, ... are τ-terms, and |sₙ| < β for n ≥ 0, then (sᵢ; β)ᵢ<ω is a {(0, τ), Rτ+1}-term. |(sᵢ; β)| = β.
If r is a p-term, S is a a-term, and (a, r) e p, then (rs) is a r-term, I rs I = max (I r I, I s I) + 1.
Regarding the constructive meaning of this definition, there are two possible views. On the narrower view, infinite types and infinite terms are to be given by effective, and so finite, rules for their construction, and our theorems about terms are ultimately about these rules, and so really only concern finite objects. This is the view expressed by Schütte in connection with his work with infinite proof trees. E.g. see [5], p. 369. Another, more general, viewpoint is that infinite terms are essentially free choice sequences of a certain spread. The details of the construction of the spread are left to the reader; but it will be noted that the possibility of identifying the terms with the choice sequences of a spread depends essentially on the fact that each term has an associated length which exceeds the length of its subterms, and each type {(σᵢ, τᵢ), β} has a rank which exceeds the ranks of the σᵢ and τᵢ. It is easy to see that each term has a unique type. In particular, if r is a ρ-term, s a σ-term and rs is a τ- and a τ′-term, then (σ, τ) ∈ ρ and (σ, τ′) ∈ ρ, and so τ = τ′. r^τ, s^τ and t^τ will denote τ-terms. The rank Rt of a τ-term t is simply Rτ. When there is no need for greater explicitness, we will write {λa^σᵢ. sᵢ} and (sᵢ) for the terms given by Tm 4 and 5. Also, for α = 1, write λa^σ₀. s₀ for {λa^σᵢ. sᵢ}ᵢ<α. 0 is intended to denote 0, and S the successor operation, so that the numerals are 0, 1 = S0, 2 = S1, etc. An occurrence of a variable b in a term is called bound if it is in a context λb. s; otherwise, it is called free. Let t(a^σ) be an arbitrary term, and s a σ-term. Replace each variable which has free occurrences in s, in all of its bound occurrences in t(a^σ), by a distinct new variable, and replace each part {λa^σᵢ. sᵢ, β} and (sᵢ; β) in t(a^σ) by {λa^σᵢ. sᵢ, α} and (sᵢ; α), respectively, where α = |s| + β. Finally, replace each free occurrence of a^σ in the resulting expression by s. The result will be denoted by t(s). We can assume that the change of bound variables in t(a^σ) is done in a unique way, so that t(s) is unique.
LEMMA 1: If t(a^σ) is a τ-term and s a σ-term, then t(s) is a τ-term with |t(s)| ≤ |s| + |t(a)|.
The proof is by straightforward induction on |t(a)|. We will write t₁t₂...tₙ for (...(t₁t₂)...tₙ). Now we can formulate the rules of conversion.
I. {λa^σᵢ. sᵢ(a^σᵢ)} r^σₙ → sₙ(r^σₙ).
II. (sᵢ)n̄ → sₙ.
III. (rᵢ)st → (rᵢt)s.
The relation r =₁ s (r reduces to s) is inductively defined by
1°. If r → s then r =₁ s.
2°. If r =₁ s then t(r) =₁ t(s).
3°. If rᵢ =₁ sᵢ for each i < α, then {λa^σᵢ. rᵢ}ᵢ<α =₁ {λa^σᵢ. sᵢ}ᵢ<α.
4°.
5°.
In 2°, t(r) and t(s) are obtained from a term t(a) in the manner specified above. It is clear that if r^σ =₁ s, then s is a σ-term, and moreover, on the intended interpretation, r^σ and s denote the same functional. For i = I, II or III, if rs → t is an instance of rule i, then rs is said to be i-convertible (i-conv, or simply conv) into t with principal part r. If t contains no conv subterms, then it is called irreducible or is said to be in normal form.
3. We prove in this section that every term can be reduced to a term in normal form. The non-trivial part of the proof consists in showing that we can reduce t to a term without I- or III-conv subterms. Dt ≤ α will mean that Rr < α for every principal part r of a I- or III-conv subterm of t. We can read Dt ≤ α as "the degree of t is ≤ α", providing that we do not assume Dt to be an ordinal which is computable from t.
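As a rough illustration of what the conversion rules compute on the intended interpretation (not of the rewriting calculus itself), the following Python sketch treats numerals as integers, an ω-indexed term (s₀, s₁, ...) as a function of its index (the content of rule II), and a λ-branch as a closure (the content of rule I); all names are illustrative.

```python
# Illustration of the intended reading of rules I and II: numerals are ints,
# an omega-indexed term (s_0, s_1, ...) is a function of the index, and a
# lambda-branch is a closure. This evaluates rather than rewrites terms.

def omega_term(branches):
    # Rule II, read semantically: (s_i) applied to the numeral n yields s_n.
    return lambda n: branches(n)

# The functional phi of section 1, packaged as an omega-indexed term whose
# n-th branch is the closure  lam a. lam b. b(phi_{n-1}(a)(b))(n-1).
def phi(n):
    if n == 0:
        return lambda a: lambda b: a
    return lambda a: lambda b: b(phi(n - 1)(a)(b))(n - 1)

phi_term = omega_term(phi)

step = lambda acc: lambda i: acc + 1   # a sample step functional
print(phi_term(3)(10)(step))           # 13: the recursion unfolds three times
```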
LEMMA 2: If Dt(a^σ), Ds ≤ α and Rs < α, then Dt(s) ≤ α.
Let uv be a I- or III-conv subterm of t(s). We must show that Ru < α. If uv is a subterm of s, this follows from Ds ≤ α. If u = s, it follows from Rs < α. If uv is III-conv and u is of the form sw (so that s is of the form (sᵢ)ᵢ<ω), then Ru < Rs < α. But in every other case, uv is of the form
u′(s)v′(s) where, if uv is i-conv (i = I or III), then u′(a)v′(a) is an i-conv subterm of t(a). But since Dt(a) ≤ α, it follows that Ru = Ru′(a) < α.
LEMMA 3: If Dr, Ds ≤ α, rs is a term, and Rs < α, then there is a t with rs =₁ t, Dt ≤ α and |t| ≤ Max(|rs|, |s| + |r|).
Proof by induction on |r|. If rs is not I- or III-conv, then t = rs suffices, since rs is the only possible conv subterm of rs which is neither a subterm of r nor of s. If r = {λa^σᵢ. rᵢ(a^σᵢ)} and s is of type σₙ, then t = rₙ(s) suffices, by Lemmas 1 and 2. If r = (rᵢ; β)u, then since |rₙ| < |r|, there is a tₙ with rₙs =₁ tₙ, Dtₙ ≤ α and |tₙ| ≤ Max(|rₙs|, |s| + |rₙ|), for n ≥ 0. Hence, t = (tₙ; γ)u suffices, where γ = Max(Max(β, |s|) + 1, |s| + β).
LEMMA 4: If Drs ≤ α and Rr ≥ α, then r is of the form (uᵢ) or else a^τ r₁ ... rₙ (n ≥ 0).
In fact, r is of the form r₀r₁...rₙ, where r₀ is not of the form uv, and Rr₀ ≥ Rr ≥ α. Now in all cases other than r₀ = a^τ and r₀ = (uᵢ) with n = 0, r₀ is the principal part of a I- or III-conv subterm of rs. But this is impossible since Drs ≤ α.
Let χ_α^(0) = 2^α, and for γ ≠ 0, let χ_α^(γ) be the αth simultaneous solution β of χ_β^(γ′) = β, for all γ′ < γ. Then χ_α^(γ) is a normal function of α.
THEOREM 1: If Dt ≤ α + ω^γ, then there is a t′ such that t =₁ t′, Dt′ ≤ α and |t′| ≤ χ_{|t|}^(γ).
r or,
::S;a
The proof is by induction on )I, and within that, by induction on 1 t I. If I t I = 0, then Dt = 0, so that t' = t suffices. Using the fact that the length of a term exceeds the length of its subterms, and that x~Y) is normal in 13, we can set (Ss)' = Ss', paO"'. s;, f3}' = {AaO"'. s;, x~Y)} and (s;; 13)' = = (s~·l' X(Y») Let t • = rs Then there are r' and s' with Dr' , Ds' < , a r - r' p • s=jS', 1 r' I ::s; and I s'l ::s; Xm. If Rr ~ a+w Y, then r is of the form a'r ; ... r« or (uJ, by Lemma 5. Hence, r' is of the form atr~ ... r~ or (u;), and so r's' is not 1- or III-conY. Thus, t' = r's' satisfies the theorem. Assume now that Rr < a+w Y• We must consider three cases: Case 1. y = 0. I.e. Rr s; a+ 1, so that Rs < a. Then by Lemma 4 there is a t' with t=r's' t', Dt' ::s; a and 1 t' I ::s; 21sl+21rl < 2 1t l• Case 2. y = (j + 1. Then since Rr < a+w·· w, we have Rr < a+w· . k for sufficiently large k. Then Dr's' ::s; a + w· . k, and so by k iterated
Xrn
-I
,
183
INFINITELY LONG TERMS OF TRANSFINITE TYPE
applications of the inductive hypothesis for J, there is a t' with r's' Dt' ~ IX and (y)
< XI t
Case 3. 'Y
= lim 'Yn' Then since Rr < n
+ COYk.
IX
t' ,
I'
+ oi', there is a k with
Rr <
IX
Hence, Dr′s′ ≤ α + ω^(γ_k), and so by the inductive hypothesis for γ_k, there is a t′ with r′s′ =₁ t′, Dt′ ≤ α and
|t′| ≤ χ^(γ_k)_{max(χ_{|s|}^(γ), χ_{|r|}^(γ)) + 1} < χ_{|t|}^(γ).
This completes the proof.
THEOREM 2: If Dt ≤ ω^γ, then there is a t′ in normal form with t =₁ t′ and |t′| ≤ χ_{|t|}^(γ).
This follows from Theorem 1 and the following lemma.
LEMMA 5: If Dt = 0, then there is a t′ in normal form with t =₁ t′ and |t′| ≤ |t|.
The proof is by induction on |t|. If |t| = 0, then t is in normal form, so we can take t′ = t. Set (Ss)′ = Ss′, {λa^σᵢ. sᵢ, β}′ = {λa^σᵢ. sᵢ′, β}, and (sᵢ; β)′ = (sᵢ′; β). Let t = t₀t₁...tₙ where n > 0 and t₀ is not of the form uv. Since Dt = 0, it follows from Lemma 4 that t₀ is a variable, or else it is of the form (uᵢ) with n = 1. In the first case, t′ = t₀′t₁′...tₙ′ suffices; in the second case, if t₁′ is a numeral n̄ then t′ = uₙ′, and if t₁′ is not a numeral, then t′ = t₀′t₁′. This completes the proof that every term t can be reduced to a normal form t′. From the proof, it is clear that if we restrict ourselves to terms of length < β and degree < α, then t′ = f(t) can be defined by predicative recursion up to χ_β^(α) · α, since the double induction in the proof of Theorem 1, up to α, and within that, up to |t|, can be transformed into an induction up to χ_β^(α) · α. The essential unicity of the normal form for a given term follows from
THEOREM 3: If r =₁ s, r =₁ t and t is in normal form, then s =₁ t.
For, if s is also in normal form, then t can be obtained from s only by trivial uses of 2°, namely, where in going from u(v) to u(w) (where v =₁ w), v does not actually occur in u(v), so that u(w) differs from u(v) only by some changes of bound variables. (See the definition of the substitution u(v).) Hence, in this sense, t is a mere notational variant of s. In particular, if r contains no free variables and is a 0-term, then all its normal forms must be numerals, and hence, identical. The proof of Theorem 3 proceeds by induction on the "length" of a reduction of r to s, defined in a suitable way. I omit this proof, which is routine, long and unpleasant.
4. The finite types are obtained by restricting Tp 2 to the case α = 1, so that they are built up from 0 by means of the composition (σ, τ). Similarly, the terms of finite type are obtained by restricting Tm 4 to the case α = 1, so that the terms given by this clause are of the form λa^σ. s. The impredicative p.r. functionals of finite type are obtained from the terms of finite type (in our sense) by replacing Tm 5 by the schema for primitive recursion given in § 1. But we saw in § 1 how to replace each
°
.2(1)'2
(with k exponents). In particular, let t be a constant (0, 0)-term, i.e. without free variables. Then for every n, f(tn̄) is a numeral. Then
References
[1] S. Feferman, Systems of Predicative Analysis. To appear.
[2] K. Gödel, Über eine bisher noch nicht benutzte Erweiterung des finiten Standpunktes. Dialectica 12 (1958) 280-287.
[3] G. Kreisel, Inessential Extensions of Heyting's Arithmetic by Means of Functionals of Finite Type. Abstract. JSL 24 (1959) 284.
[4] K. Schütte, Kennzeichnung von Ordnungszahlen durch rekursiv erklärte Funktionen. Math. Ann. 127 (1954) 15-32.
[5] K. Schütte, Beweistheoretische Erfassung der unendlichen Induktion in der Zahlentheorie. Math. Ann. 122 (1951) 369-389.
[6] K. Schütte, Predicative Well-orderings. These Proceedings, p. 280.
[7] C. Spector, Provably Recursive Functionals of Analysis: A Consistency Proof of Analysis by an Extension of Principles Formulated in Current Intuitionistic Mathematics. Recursive Function Theory. Proc. of Symposia in Pure Mathematics, Vol. V, Am. Math. Soc. (1962) 1-27.
Added in proof. The work of Lorenzen to which we refer is "Algebraische und logistische Untersuchungen über freie Verbände", Journal of Symbolic Logic 16 (1951) 81-106. However, an earlier treatment of proof theory by means of a constructive theory of infinite proofs is given in P. Novikov, "On the consistency of certain logical calculus", Matematiceskij sbornik 12, no. 3 (1943) 353-369. In particular, a constructive consistency proof for arithmetic is given.
II. SYMPOSIUM ON RECURSIVE FUNCTIONS
CONSTRUCTIVE ORDER TYPES, I ¹)
JOHN N. CROSSLEY ²)
St. Catherine's College, Oxford, UK
Introduction 1. The theory of constructive order types constitutes a new approach to the problem of providing a constructive analogue of ordinal number theory. Recently, Dekker, Myhillet al. (e.g. [8]) have considered a generalization of the notion of cardinal number which may be regarded as a constructive analogue of Cantor's theory. Ordinal number theory may be approached in two ways. (1) Ordinals may be considered as being generated in a certain way (v. [1] § 3, p. 19; [2] p. 87). (2) Ordinals may be regarded as the equivalence classes of well-ordered sets under (arbitrary) one-one order-preserving maps (isotonisms). Church and Kleene ([3]) considered a constructive analogue of (1). In the present work we embark on a constructive analogue of (2). We define constructive order types as equivalence classes of (linear) orderings under effective one-one order-preserving maps (recursive isotonisms). (We are only concerned with denumerable orderings.) In particular, co-ordinals are the equivalence classes of well-orderings obtained under recursive isotonisms. Since there are only denumerably infinitely many recursive isotonisms, co-ordinals are, in general, proper sub-classes of the corresponding classical ordinals. In establishing our results we use classical set theory together with 1) The author is deeply indebted to Prof. G. Kreisel for his valuable suggestions and comments on the problems discussed here. 2) The work presented here was done whilst the author was a Junior Research Fellow at Merton College, Oxford.
recursive function theory. We define addition, multiplication and exponentiation in a way which agrees (with respect to the orderings) with the classical versions of these functions. Most of our basic results in the additive and multiplicative theory hold not only for co-ordinals but also for a collection of constructive order types we call quords. Quords are the constructive order types of those linear orderings which contain no effective infinite descending chains. As there were close similarities between some aspects of Tarski's Cardinal algebras [16] and Dekker and Myhill's Recursive equivalence types [8], it is not surprising that there should be analogous connections between Tarski's Ordinal algebras [17] and the present work. However, neither in the theory of recursive equivalence types nor in the theory of constructive order types is there a natural analogue of the infinite sum which Tarski introduced (v. [7], p. 197). This may be regarded as the main reason why constructive order types do not naturally form an ordinal algebra. 2. In section I we introduce various kinds of isotonisms, i.e. one-one, order-preserving maps, and show that two recursive linear orderings are recursively isomorphic if and only if they are recursively isotonic. Thus the theory of recursively isomorphic well-orderings is part of the theory of constructive order types. Addition is defined in § II. For this we require the notion of (r.e.) separable relations (cf. [15]) which is studied in § II. 1. The rest of § II is devoted to establishing elementary properties of addition and of an ordering by initial segments (:0:;). In particular, we prove the Separation Lemma (II. 5 . 1) and the Directed Refinement Theorem (II. 5 .2) which are fundamental for much of the later work. The Directed Refinement Theorem also shows that :0:; is a tree ordering, i.e. A :0:; C and B :0:; C imply A :0:; B or B :0:; A, as well as being a quasi-ordering of all constructive order types. We restrict our attention exclusively to linearly ordered sets and their constructive order types from § III. There we consider an analogue of the descending chain condition. We caIl linearly ordered sets which contain no infinite recursive descending chain quasi-well-orderings, and we call their constructive order types quords. For quords we also have the cancellation law A + B = A + C implies B = C. However, quords are not partially well-ordered by :0:;.
Co-ordinals, which we discuss in § IV, are the constructive order types of well-orderings. Although ≤ is a partial well-ordering¹) of the collection of all co-ordinals, it is not a well-ordering and for every (classical) limit number there exist uncountably many co-ordinals which are subclasses of that classical ordinal. It follows that these co-ordinals are incomparable. Because of this [an analogue of] a classical law for addition fails for co-ordinals. However, when we introduce the notion of a principal number for addition then the [analogue of the] law holds for predecessors of any given principal number. For such co-ordinals we also obtain a unique additive decomposition. A negative result is that a co-ordinal A may be such that there is a (classical) ordinal Γ, less than the (classical) ordinal of A, for which there is no co-ordinal C which is both < A and of classical ordinal Γ (example IV.5.1). But we do have the following Representation Theorem (IV.5.4): For every (classical) ordinal Γ, there is a co-ordinal C (of ordinal Γ) which is such that, for every ordinal Δ < Γ, there is a co-ordinal D of ordinal Δ such that D < C. If Γ is infinite there are uncountably many such co-ordinals (corollary IV.5.5). In § V we prove that a collection of quords (a fortiori of co-ordinals) has a least upper bound if, and only if, it has a maximum, i.e. there are no non-trivial least upper bounds. Multiplication is introduced in § VI and some of its basic properties derived. Most of the fundamental classical laws [have analogues which] go through. In order to show that AB = AC implies B = C (if A ≠ 0) we prove that the (classical) isotonism between representatives of B and C can be extended to a recursive isotonism. Analogously to the situation for addition, the law A < B implies AC ≤ BC does not hold in general; but we show that it does hold for predecessors of a given principal number for multiplication. Exponentiation is defined in § VII and the development is very similar to that of § VI. In § VIII we prove that the collections of principal numbers for addition and multiplication of ordinal less than ω^ω and ω^(ω^ω), respectively, lie in a single branch of the tree of co-ordinals. This result is best possible in the sense that there exist incomparable principal numbers for addition and multiplication of ordinal ω^ω and ω^(ω^ω), respectively.
192
JOHN N. CROSSLEY
Because predecessors of a principal number for (e.g.) addition obey all the classical laws for addition, the well-orderings belonging to co-ordinals lying in the branch of the tree just mentioned may be called natural well-orderings (with respect to addition). By increasing the number of functions considered and ensuring that the collections of principal numbers form a nested sequence, we can get characterizations of natural well-orderings up to larger segments of the Cantor second number class in terms of their co-ordinals. 1) Terminology and notation
The development of the theory of constructive order types will be informal but we shall use logical symbolism freely for brevity. We write "&", "v", "I", "---+", "+--).", "3", "V", "E!", "flx" for "and", "or", "not", "implies", "if and only if", "there exists", "for all", "there is a unique", "the least x such that", respectively, and we also use the Anotation (cf. [9], p. 34). We sometimes use dots for bracketing purposes in the usual way. A number means a natural number (0, I, 2, ... ) unless otherwise stated. A set is a collection of numbers and a class is a collection of sets. We denote the set of all natural numbers by.J' and the empty set by 0. We use lower case Greek letters for sets. {x : P(x)} is the set of all elements satisfying the predicateP. &('J.- 13 = {x : x s ('J. & x ¢ f3}. ii = .J' -('J.. e('J. £: 13 means x s ('J. -+ x e 13 and ('J. c: 13 means ('J. £: 13 & ('J. i= 13. <x, y) is the ordered pair of the numbers x, y. ('J. x 13 = {<x, y) : x e ('J. & y e f3}. ('J.2
=
('J.X('J..
A relation is a set of ordered pairs of natural numbers, i.e. a subset of
.J'2. We use upper case bold face letters (A, B,... ) for relations. A relation A is said to be reflexive if
(x, y) e A
-+
<x, x) e A &
The converse of a relation A is {
CONSTRUCTIVE ORDER TYPES, I
193
{x : (3y) «x, y) B A v
: <x, y)
B
A}.
We assume familiarity with classical ordinal number theory (as in e.g.
[1] or [14]). We also assume familiarity with the notions of recursive
and partial recursive functions and recursive and recursively enumerable (r.e.) sets. We sometimes use Turing machine methods for convenience (for details see e.g. [5] or [9]). We make heavy use in the sequel of the facts: (i) If a is a r.e, set and a ~ of, thenf(a) is r.e., (ii) If a is an infinite r.e. set, then a is recursive if, and only if, there is a recursive function which enumerates a in order of magnitude ([13], p. 291). We recall that a set containing no infinite r.e. subset is said to be immune and that there exist immune sets and r.e. non-recursive sets ([6], p. 89, [13], p. 291). We use the well-known (primitive) recursive functions defined by
= !(x+Y) (x+y+ l)+x, j(k(x), lex)) = x j(x, y)
(v. [5], p. 43). j maps .?2 one-one onto f
{j(x, n) : x
and (A ; n) for
B
We write j(a, n) for «}
{<j(x, n),j(y, n) : <x, y)
Unexplained notations may be found in [9], p. 538.
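For reference, the pairing function j and its inverses k, l can be written out directly. The formula is the one displayed above; the search used to invert it below is merely one convenient way of computing k and l, not necessarily the one intended in [5].

```python
# The pairing function j(x, y) = (x + y)(x + y + 1)/2 + x and inverses k, l
# with j(k(z), l(z)) = z. The inverse simply locates the diagonal x + y = s.

def j(x, y):
    return (x + y) * (x + y + 1) // 2 + x

def unpair(z):
    s = 0
    while (s + 1) * (s + 2) // 2 <= z:   # find the diagonal x + y = s
        s += 1
    x = z - s * (s + 1) // 2
    return x, s - x                       # (k(z), l(z))

def k(z): return unpair(z)[0]
def l(z): return unpair(z)[1]

assert all(j(k(z), l(z)) == z for z in range(200))
assert all(unpair(j(x, y)) == (x, y) for x in range(15) for y in range(15))
print(j(3, 4), k(j(3, 4)), l(j(3, 4)))    # 31 3 4
```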
I. Recursive isotonism
1.1. All relations are assumed to be reflexive unless otherwise stated.
A function f from the field of a relation A to the field of a relation B is said to be relation preserving (between A and B) if
⟨x, y⟩ ∈ A ↔ ⟨f(x), f(y)⟩ ∈ B.
In the above definition and in definitions I.1.2 and I.1.5, below, f is to be one-one on the whole of its domain (and not merely on the field of A). This condition ensures that in all three cases f⁻¹ is well-defined on ρf. (Under definition I.1.2 we may have ff⁻¹ ≠ 1.) Clearly isotonism is an equivalence relation. We write RT(A) = {B : B ~ A} and if A = RT(A), then A is said to be a relation type. DEFINITION I.1.2: Suppose A and B are relations. Then a map p(x) is said to be a recursive isotonism from A to B if (i) p is a partial recursive function, (ii) p is one-one, (iii) δp ⊇ C'A and p(C'A) = C'B, (iv) p is relation preserving between A and B. A is recursively isotonic to B if there is a map p which is a recursive isotonism from A to B. We write p : A ≃ B if p is a recursive isotonism from A to B and A ≃ B if there is a recursive isotonism from A to B.
We claim that recursive isotonism is an equivalence relation. The identity map is recursive, hence recursive isotonism is reflexive. If p is a one-one partial recursive function, then p - 1 (defined, of course, only on pp) is also partial recursive (see [11], p. 177). It follows that if p : A ~ B, then p-l : B ~ A. It is clear that recursive isotonism is a transitive relation. We can now introduce our next definition. 1.1.3: If A = {B : B ~ A}, then A is said to be a constructive relation type. We write A = CRT(A). DEFINITION
DEFINITION I.1.4: A function f is said to be a recursive permutation if f is recursive and maps 𝒩 one-one onto itself. DEFINITION I.1.5: A relation A is said to be totally recursively isotonic to a relation B if there is a recursive permutation f which is a recursive isotonism from A to B. We write A ≈ B if A is totally recursively isotonic to B.
Again, totally recursive isotonism, is an equivalence relation; we write TRRT(A) = {B : B ~ A} and if A = TRRT(A) for some relation A, then A is said to be a total recursive relation type.
1.2. From now on we use upper case Roman letters for constructive relation types (C.R.T.s). The collection of all C.R.T.s will be denoted by 𝓡. THEOREM I.2.1: (i) If A is totally recursively isotonic to B, then A is recursively isotonic to B; and if A is recursively isotonic to B, then A is isotonic to B. (ii) There exist relations A, B such that A is isotonic to B but not recursively isotonic to B. (iii) There exist relations C, D such that C is recursively isotonic to D but not totally recursively isotonic to D.
PROOF. (i) Clear from definitions I.1.1, 2 and 5.
(ii) Let α
non-r.e. complement ii. Let
A = {(a, a') : a, a' s (X & a
B = {(b, b') : b, b' s ii & b
~
a'},
~
b'}.
Then A '" B, since A, B both represent well-orderings of type t». But A ~ B implies ii = f«(X) for some partial recursive function! This implies ii is r.e., contradicting the choice of (x. (iii) Let C = {(c, c') : 0 ~ c ~ c'}, and D = {(d, d') : I s d s d'}.
*
Then if f(n) = n+ 1, f: C ~ D. But C D, for if C ~ D by g, then g-l(O) is undefined, which is in contradiction with g being a recursive permutation.
Corollary 1.2.2. (i) RT(A);2 CRT(A);2 TRRT(A),(ii) There is a relation C such that RT(C) ::::> CRT(C) ::::> TRRT(C).
196
JOHN N. CROSSLEY
PROOF. (ii) Let C, D be as above in the proof of theorem 1.2.1, and let E = {<x,y): x::;; y&x,yeO"} where 0" is a non-Leo set. Then D e CRT(C) - TRRT(C) and E s RT(C) - CRT(C). Corollary 1. 2.2 shows that constructive relation types give a finer classification of (denumerable) relations than do (classical) relation types.
1.3. We observe that A ~ B if, and only if, A* ~ B* (and similarly for '" and ~). DEFINITION 1. 3.1: A* is said to be the converse of (the C.R.T.). A if A = CRT(A) and A* = {B : B ~ A*}. THEOREM 1. 3 . 2: (i) A * = {B* : B (ii) {A* : A e.?Jl} = .?Jl, (iii) A** = A.
~
A}
where
A
= CRT(A),
THEOREM 1.3.3: Let A ~ B, (f. = C'A and f3 = C'B. Then (i) (f. is r.e. +-4 f3 is r.e., (ii) (f. is immune +-4 f3 is immune, (iii) There exist relations A', B' such that C'A' is recursive, C'B' is not recursive and A' ~ B'. PROOF. (i), (ii) Left to the reader. (iii) Let f3 be a r.e, non-recursive set enumerated without repetitions by the recursive function ben). Let
B'
=
{
:i
s
j} and A' = {
s
j}.
Then h : A' ~ B'.
1.4. DEFINITION 1.4.1: A relation A is said to be recursive (r.e.) if there is a recursive function f(a, b) (f(a, b, c) such that
(= {x : (3z)j(x, x, z) = O} if A is Le.). DEFINITION 1.4.3: A relation A is said to be recursively isomorphic to a relation B if there is a recursive predicate L(X, y) such that, for some
isotonism, f, between A and B, I(X, y) ~ f(x) = y. In this case we write I : A == B and if there is such an I we write A == B. If A = {B : B == A} then A is said to be a recursive isomorphism type and we write A = RIT(A). Recursive isomorphism is an equivalence relation. Since (i) I : A == A if I(X, y) ~ X = y, (ii) if I : A == B, then 1* : B == A where I*(X, y) ~ I(Y, x), (iii) if I : A == Band K : B == C, then A : A == C where
A(X, y)
~
(3z) (I(X, z) & K(Z, y))
~
(V'z) (I(X, z)
--t
K(Z, y)),
since A is recursive by [9], theorem VI (p. 284). THEOREM 1.4.4: If A, B are recursive relations, then A == B if; and only if, A ~ B. PROOF. Suppose A, B are recursive relations and I : A == B. Then I(x,y) &1 (x, z) --t y = zby definitionI.4.3, and hence that thefunctionf, defined by f(x) = flyl(X, y), is partial recursive. C1early,jis an isotonism, Thus f: A ~ B. Conversely, suppose f : A ~ B. Then by theorem 1.4.2 IX = C'A and f3 = C'B are recursive. If IX or f3 = 0, then the assertion is trivial. Hence we may assume there exist numbers a e IX and b s f3 such that b = j(a). Set
I(X, y)
~ f(a{l...:...
cix)} + xcix)) = (b+ 1) (1...:... cpCy)) + ycp(y)
where cix) = 1 if x belongs to the recursive set y, = 0 otherwise. It is easily verified that I : A == B. This completes the proof. This theorem allows us, when discussing recursive relations, to work with partial recursive functions rather than with predicates. THEOREM 1. 4.5: There exist r.e. relations which are not recursively isotonic to any recursive relation. PROOF. Let
IX
be a r.e. non-recursive set and let
A = {<x,y) : x = y .v. x e IX &y = x+1}. Since IX is (infinite) r.e. there is a recursive one-one function f such that = f(f) (cf. [5], p. 73).
IX
Hence
<x, y) e A
~
(3z) (I x- y 1{ If(z)-x 1+ 1y-(x+ 1) I} = 0),
and it follows that A is a r.e. relation. Suppose p : A ~ B for some recursive relation B = {<x, y) : g(x, y) = O}; then, since C'A = J, p is total and hence recursive. pp = C'B is recursive by theorem 1.4.2. Therefore x e IX +-+ g(p(x), p(x + 1» = 0 which implies ex is recursive, which is a contradiction. We conclude that A is not recursively isotonic to any recursive relation. For a certain class of relations, however, each r.e, relation (of the class) is recursively isotonic to a recursive relation (in that class). We recall that a (reflexive) relation A is said to be a partial ordering if (i) <x, y) e A &
= 0).
Let ex = C'A. If ex is finite there is nothing to prove since then A is finite and hence recursive. Otherwise ex is infinite r.e. and there is a one-one recursive function, g, such that ex = g(J) (cf. [5], p. 73). Since A is linear, (Vx) (Vy) (3z) (f(g(x), g(y), z)
= 0 v f(g(y), g(x), z) = 0).
This is equivalent to (Vx) (Vy) (3z) (j(g(x), g(y), z)· f(g(y), g(x), z) = 0).
(1)
Since A is anti-symmetric, f(g(x), g(y), zo)
= 0 &f(g(y), g(x), Z1) = 0
-+
g(x) = g(y),
and hence x = y and f(g(y) , g(x), zo) = O. Now let B be the relation defined by <x, y) e B
+-+
f(g(x), g(y), liz {[(g(x), g(y), z) . f(g(y), g(x), z) = O})
Then B is recursive, by (1), and clearly g : B
~
A.
=
O.
II. Addition
II.1. If A and B are arbitrary relations, then the ordinal sum of A and B is defined by
A + B = A ∪ B ∪ (C'A × C'B).
If the ordinal sum of two relation types is defined as the relation type of the ordinal sum of arbitrary representatives of the given relation types, then this definition is not, in general, unique. This is because the fields of the representative relations may have non-empty intersection in some cases, depending on the choice of representatives. But if we define the relation type of the sum in terms of representatives which do have disjoint fields, then the definition is unique (cf. [19], pp. 341, 345 *160.48). Two relations are said to be strictly disjoint if their fields are disjoint. We observe that if A, B are reflexive relations, then A ∩ B = ∅ ↔ C'A ∩ C'B = ∅.
~
C'A n C'B = 0.
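As a small illustration (an editorial sketch, not part of the original text), the ordinal sum of two finite relations, each given as a set of ordered pairs, can be computed directly from this definition; the helper names below are hypothetical.

```python
# A minimal sketch (not from the paper) of the ordinal sum of two relations,
# each represented as a finite set of ordered pairs.

def field(rel):
    return {x for pair in rel for x in pair}

def ordinal_sum(a, b):
    # A (+) B = A u B u (C'A x C'B): every element of A's field precedes
    # every element of B's field.
    return a | b | {(x, y) for x in field(a) for y in field(b)}

# two strictly disjoint reflexive linear orderings, of types 2 and 1
A = {(0, 0), (1, 1), (0, 1)}
B = {(5, 5)}
print(sorted(ordinal_sum(A, B)))   # a linear ordering of type 2 + 1 = 3 on {0, 1, 5}
```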
Now, in order to define a constructive version of ordinal sum we require "constructive disjointness", i.e. C'A and C'B must be contained in sets which are "effectively disjoint". If this is not the case, then the following situation arises: let α be a r.e. non-recursive set and let β be a r.e. set containing α. Then there is no (partial) recursive function, defined on J, which agrees with f(x) = x on α and g(x) = x + 1 on β. Hence there can be no (partial) recursive function defined on C'(A ⊹ B) where A = α² and B = β², although A and B are strictly disjoint.
DEFINITION II.1.1: A is r.e. separable from B if there are disjoint r.e. relations A₁, B₁ such that A ⊆ A₁ and B ⊆ B₁. If A is r.e. separable from B we write A )( B.
Note. In general we shall be concerned only with r.e. separability and shall omit the qualification "r.e.".
DEFINITION II.1.2: A is recursively separable from B if there are disjoint recursive relations A₁, B₁ such that A ⊆ A₁ and B ⊆ B₁. If A is recursively separable from B we write A >< B.

THEOREM II.1.3: (i) A )( B → A* )( B*, (ii) A >< B → A* >< B*,
(iii) A >< B → A )( B, (iv) there exist relations A, B such that A )( B but not A >< B.
PROOF. (i) ((ii)) follows from the fact that the converse of a r.e. (recursive) relation is r.e. (recursive). (iii) Every recursive relation is r.e. (iv) We call two sets α, β r.e. (recursively) separable if the relations α², β² are r.e. (recursively) separable. Let 𝒵 be a consistent incomplete formal system containing formal arithmetic and let

T = {⟨x, y⟩ : x is (a Gödel number of) a proof of the sentence (with Gödel number) y},
T₀ = {x : (∃y)(⟨y, x⟩ ∈ T)},
R = {⟨x, y⟩ : x is (a Gödel number of) a proof of the negation of the sentence (with Gödel number) y} and
R₀ = {x : (∃y)(⟨y, x⟩ ∈ R)}.
Then T, R are both (primitive) recursive relations (though not reflexive relations [9], p. 252–5) and T₀ and R₀ are r.e. sets. Let 𝐓₀ = T₀² and 𝐑₀ = R₀². Then 𝐓₀ and 𝐑₀ are r.e. and disjoint (since 𝒵 is consistent), i.e. 𝐓₀ )( 𝐑₀. By [15], theorem 22 (p. 59), T₀ and R₀ are not recursively separable. Hence 𝐓₀ and 𝐑₀ are not recursively separable. Let A = 𝐓₀ and B = 𝐑₀ and (iv) is established.

THEOREM II.1.4: (i) A is r.e. (recursively) separable from B if, and only if, there are r.e. (recursive) sets α₁, β₁ such that C'A ⊆ α₁, C'B ⊆ β₁ and α₁ ∩ β₁ = ∅. (ii) If C'A or C'B is finite, then A >< B.

PROOF. (i) If A is r.e. (recursively) separable from B, then there are r.e. (recursive) relations A₁, B₁ such that A ⊆ A₁, B ⊆ B₁ and A₁ ∩ B₁ = ∅. Let α₁ = C'A₁, β₁ = C'B₁; then α₁, β₁ are r.e. (recursive) by theorem I.4.2. Since A₁, B₁ are reflexive, x ∈ α₁ ↔ ⟨x, x⟩ ∈ A₁, and similarly for β₁ and B₁. Hence A₁ ∩ B₁ = ∅ ↔ α₁ ∩ β₁ = ∅. Conversely, suppose there are r.e. (recursive) sets α₁, β₁ such that C'A ⊆ α₁, C'B ⊆ β₁ and α₁ ∩ β₁ = ∅. Let A₁ = α₁² and B₁ = β₁². Then A₁, B₁ are r.e. (recursive) and reflexive and they clearly r.e. (recursively) separate A and B. (ii) This part of the theorem follows at once from (i)
and the fact that every finite set is recursive and so is the complement of a finite set. The second version of part (i) of this theorem is false if the relations are not assumed to be reflexive. For let T, R be the relations defined in the proof of theorem II.1.3.(iv); and suppose that there exist recursive sets τ, ρ such that C'T ⊆ τ and C'R ⊆ ρ where τ ∩ ρ = ∅. Then the sets τ' = {x : x ∈ C'T and x is (a Gödel number of) a single formula} and ρ' = {x : x ∈ C'R and x is (a Gödel number of) a single formula} are recursively separable, by τ and ρ. But τ' = T₀ and ρ' = R₀, which contradicts [15] theorem 22. The converse assertion, namely, that if there exist disjoint recursive sets containing the fields of A and B, then A and B are recursively separable, still holds, of course.

THEOREM II.1.5: Let α = C'A, β = C'B; then A )( B if, and only if, there is a partial recursive function, p, such that

δp ⊇ α ∪ β, ρp ⊆ {0, 1}    (S)

and x ∈ α ∪ β implies x ∈ α ↔ p(x) = 0 .&. x ∈ β ↔ p(x) = 1.

PROOF. If A )( B, then by the preceding theorem there are r.e. sets α₁ ⊇ α and β₁ ⊇ β such that α₁ ∩ β₁ = ∅. For an arbitrary r.e. set γ let c'_γ(x) be the partial recursive function defined only on γ such that c'_γ(x) = 1 for x ∈ γ. Then x ∈ α₁ implies 1 ∸ c'_{α₁}(x) = 0 and x ∈ β₁ implies c'_{β₁}(x) = 1. Let T_α be a Turing machine which calculates 1 ∸ c'_{α₁} and let T_β be a Turing machine which calculates c'_{β₁}. Further, let T(m, n) = the number (represented) on the tape of the Turing machine T at the m-th step¹) in the calculation for argument n. Now let a new machine T₀ be defined such that T₀(m, n) is as follows: (i) If T_α, T_β have not halted before the m-th step for argument n, then T₀(2m+1, n) = T_α(m+1, n), (ii) if T_α has not halted before the (m+1)-st step and T_β has not halted before the m-th step, then T₀(2m+2, n) = T_β(m+1, n), (iii) if T_α halts at the m-th step and T_β has not halted before the m-th step, then T₀ halts at the (2m+1)-st step,

1) "Step" does not mean here just one operation of the Turing machine, but a whole phase in the calculation. We assume m ≥ 1.
(iv) if T_β halts at the m-th step and T_α has not halted before the (m+1)-st step, then T₀ halts at the (2m+2)-nd step. Let p(x) be the function defined by the machine T₀. Then p is partial recursive and satisfies the conclusion of the theorem since, for an argument in α ∪ β, T_α halts if, and only if, T_β does not. Conversely, let α₁ = {x : p(x) = 0} and β₁ = {x : p(x) = 1}. Then α₁ and β₁ are r.e. and disjoint; the required result follows from the preceding theorem.
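The interleaving of the two machines T_α and T_β is the familiar dovetailing technique. A minimal Python sketch of the same idea follows, with generators standing in for Turing machines; all names are illustrative assumptions, not the paper's.

```python
# Illustrative sketch of the interleaving argument above, with Python generators
# standing in for the two Turing machines.  Names are hypothetical.

def semi(enumerator):
    # semi-decision procedure for an r.e. set given by an enumerating function:
    # yields None while "running", yields True once the argument has appeared.
    def run(x):
        n = 0
        while True:
            if enumerator(n) == x:
                yield True
                return
            yield None
            n += 1
    return run

def separator(alpha1_enum, beta1_enum):
    # the partial recursive p of theorem II.1.5: p(x) = 0 if x lies in alpha1,
    # p(x) = 1 if x lies in beta1; undefined otherwise (here truncated by a bound).
    run_a, run_b = semi(alpha1_enum), semi(beta1_enum)
    def p(x, max_steps=10_000):
        a, b = run_a(x), run_b(x)
        for _ in range(max_steps):          # dovetail one step of each machine
            if next(a) is True:
                return 0
            if next(b) is True:
                return 1
        raise RuntimeError("no answer yet")  # p is only partial recursive
    return p

p = separator(lambda n: 2 * n, lambda n: 2 * n + 1)   # evens vs. odds
assert p(8) == 0 and p(7) == 1
```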
THEOREM II.1.6: If A, B are r.e. (recursive) relations, then

A )( B ↔ A ∩ B = ∅   (A >< B ↔ A ∩ B = ∅).
THEOREM II.1.7: Any two C.R.T.s have recursively separable representatives.
PROOF (v. [8], theorem 9(a)). Let A ∈ A and B ∈ B and let

C = {⟨2x, 2y⟩ : ⟨x, y⟩ ∈ A},   D = {⟨2x+1, 2y+1⟩ : ⟨x, y⟩ ∈ B}.

Then C ≃ A, D ≃ B and C >< D.
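A small sketch of this doubling trick (illustrative only; the helper names are hypothetical):

```python
# Sketch of theorem II.1.7: move one representative onto the even numbers and
# the other onto the odds, so the images are recursively separable.

def double_even(rel):
    return {(2 * x, 2 * y) for (x, y) in rel}

def double_odd(rel):
    return {(2 * x + 1, 2 * y + 1) for (x, y) in rel}

A = {(0, 0), (0, 3), (3, 3)}          # some relation
B = {(0, 0), (0, 1), (1, 1)}          # another, with overlapping field
C, D = double_even(A), double_odd(B)

fields = lambda r: {z for pair in r for z in pair}
assert fields(C) <= {0, 2, 4, 6} and fields(D) <= {1, 3, 5, 7}
# parity is a recursive test, so C and D lie inside the disjoint recursive
# sets of evens and odds, i.e. C and D are recursively separable.
```

The maps x → 2x and x → 2x+1 are recursive isotonisms, which is why C ≃ A and D ≃ B.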
II.2. THEOREM II.2.1: Let A₁ ≃ A₂, B₁ ≃ B₂, A₁ )( B₁ and A₂ )( B₂; then A₁ ⊹ B₁ ≃ A₂ ⊹ B₂.
PROOF. Let αᵢ = C'Aᵢ, βᵢ = C'Bᵢ (i = 1, 2). By hypothesis there exist recursive isotonisms p, q such that p : A₁ ≃ A₂ and q : B₁ ≃ B₂. Further, there are r.e. sets αᵢ', βᵢ' (i = 1, 2) such that αᵢ ⊆ αᵢ', βᵢ ⊆ βᵢ' and αᵢ' ∩ βᵢ' = ∅. Let p₁ be the partial recursive function with domain δp ∩ α₁' which is equal to p on δp ∩ α₁', and let p₂ be the partial recursive function with range ρp₁ ∩ α₂' which is equal to p₁ on δp₁. Let q₁, q₂ be the partial recursive functions whose definitions are obtained by replacing p by q and α by β in the preceding sentence. Then δp₂ ∩ δq₂ = ∅ and ρp₂ ∩ ρq₂ = ∅. Hence r : A₁ ⊹ B₁ ≃ A₂ ⊹ B₂, where r is the partial recursive function which is equal to p₂ on its domain and equal to q₂ on its (disjoint) domain; r is one-one since ρp₂ ∩ ρq₂ = ∅ and p₂, q₂ are one-one. The other requirements are obviously satisfied. By virtue of this theorem we can now define addition of C.R.T.s uniquely as follows:
DEFINITION II.2.2: A + B = CRT(A ⊹ B) where A ∈ A, B ∈ B and A )( B.
Notation. 0 = CRT(∅). We write "A+B" for "A ⊹ B" when A )( B.

THEOREM II.2.3: (i) A + 0 = 0 + A = A, (ii) A + B = 0 ↔ A = 0 = B, (iii) (A+B)* = B* + A*.

PROOF of (ii). Let A ∈ A, B ∈ B where A )( B. Then A + B = 0 implies A ∪ B ∪ (C'A × C'B) = ∅. Hence A = B = 0.
THEOREM II.2.4: + is associative, viz. for all A, B, C ∈ ℛ, A + (B+C) = (A+B) + C.

PROOF. By definition II.2.2 there exist A ∈ A, B ∈ B and C ∈ C such that B )( C and A )( (B+C). Now the latter implies A )( B and A )( C, hence A+B is defined, (A+B) )( C and (A+B)+C is well-defined. We leave the reader to verify that A+(B+C) = (A+B)+C. As in the classical case addition is not commutative in general.

II.3. We can now introduce two relations on the collection ℛ of all C.R.T.s. These relations are reflexive and transitive, i.e. are quasi-orderings. Later (§§ III, IV) we shall show that the former of these two quasi-orderings is anti-symmetric on a sub-collection of ℛ and is a partial well-ordering of C.R.T.s of well-orderings.

DEFINITION II.3.1: A ≤ B if there is a C.R.T. C such that A + C = B. A < B if there is a C.R.T. C ≠ 0 such that A + C = B.

A < B is not, in general, equivalent to A ≤ B & A ≠ B. For let A = {⟨x, y⟩ : y ≤ x}, B = A ↾ (J − {0}), A = CRT(A) and B = CRT(B). Then A, B are both of classical order type ω* and clearly B + 1₁ = A, where 1₁ = CRT({⟨0, 0⟩}). But A ≃ B under the map x → x+1, hence A = B.

DEFINITION II.3.2: A ≤* B if there is a C.R.T. C such that C + A = B.

We shall refer to "≤" as "the ordering by initial segments" and "≤*" as "the ordering by final segments".
THEOREM II.3.3: (i) A ≤ A, (i)* A ≤* A, (ii) 0 ≤ A, (ii)* 0 ≤* A, (iii) A ≤ 0 → A = 0, (iii)* A ≤* 0 → A = 0, (iv) A ≤ B & B ≤ C → A ≤ C, (iv)* A ≤* B & B ≤* C → A ≤* C, (v) B ≤ C → A+B ≤ A+C, (v)* A ≤* B → A+C ≤* B+C.
PROOF. (i)–(iii)* follow from theorem II.2.3; (iv)–(v)* follow from theorem II.2.4.

Corollary II.3.4. ≤ and ≤* are quasi-orderings of ℛ.

THEOREM II.3.5: There exist C.R.T.s of well-orderings, A, B, say, such that A ≤ B but not A ≤* B.
PROOF (as in the classical case). Let A be the natural ordering (by magnitude) on J − {0} and let A = CRT(A). Let 1₁ be as in § II.3; then, setting B = 1₁, clearly A = B + A and B ≤ A. But we cannot have B ≤* A, since A has no last element and C + B has a last element for every (separable) C.
II.4. We introduce some notation.

∑_{i=0}^{0} Aᵢ = A₀,   ∑_{i=0}^{n+1} Aᵢ = (∑_{i=0}^{n} Aᵢ) + A_{n+1};

α.0 = ∅,  α.n = {j(a, m) : m < n & a ∈ α},  α.ω = {j(a, n) : n ∈ J & a ∈ α};

A.0 = ∅,  A.n = {⟨j(a, m), j(a', m')⟩ : m < m' < n & a, a' ∈ C'A .v. m = m' < n & ⟨a, a'⟩ ∈ A},

A.ω = {⟨j(a, m), j(a', m')⟩ : m < m' & a, a' ∈ C'A .v. m = m' & ⟨a, a'⟩ ∈ A};

A.0 = 0,  A.(n+1) = A.n + A,  A.ω = CRT(A.ω) for A ∈ A.
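A concrete, purely illustrative rendering of the A.n construction follows; the Cantor pairing function is used here as a stand-in for the paper's pairing function j, and the helper names are hypothetical.

```python
# Sketch of the A.n construction with a concrete pairing function standing in
# for j (any recursive pairing with inverses k, l would do).

def j(a, m):                     # Cantor pairing (an assumption, not the paper's j)
    return (a + m) * (a + m + 1) // 2 + m

def field(rel):
    return {x for pair in rel for x in pair}

def dot_n(A, n):
    # A.n: n consecutive copies of A, the m-th copy carried on the codes j(a, m);
    # elements of earlier copies precede all elements of later copies.
    C = field(A)
    earlier = {(j(a, m), j(b, m2)) for m in range(n) for m2 in range(m + 1, n)
               for a in C for b in C}
    copies = {(j(a, m), j(b, m)) for m in range(n) for (a, b) in A}
    return earlier | copies

A = {(0, 0), (0, 1), (1, 1)}          # a 2-element linear ordering
assert len(field(dot_n(A, 3))) == 6   # A.3 has a 6-element field (type 2.3)
```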
Part (ii) of the following theorem shows that it is immaterial which element A of A we use to define A. w.
THEOREM II.4.1: (i) A ≃ B → A.n ≃ B.n, (ii) A ≃ B → A.ω ≃ B.ω, (iii) A.n ∈ A.n, (iv) 0.n = 0, (v) A.(m+n) = A.m + A.n, (vi) A.(mn) = (A.m).n, (vii) A.ω = A + A.ω, (viii) (A.n).ω = A.ω, (ix) if n > 0, then A.n = 0 ↔ A.ω = 0 ↔ A = 0, (x) m ≤ n → A.m ≤ A.n, (xi) m ≤ n → ∑_{i=0}^{m} Aᵢ ≤ ∑_{i=0}^{n} Aᵢ.
PROOF. The proofs of the various parts of this theorem are elementary and we only prove parts (ii), (viii) as examples, leaving the other parts to the reader.
(ii) Suppose p : A ≃ B; then q : A.ω ≃ B.ω where q(z) = j(pk(z), l(z)).
(viii) Let A ∈ A; then A.ω belongs to A.ω by part (ii). Let q(x), r(x) be the (primitive) recursive functions such that

x = nq(x) + r(x) and 0 ≤ r(x) < n,

and let p(x) = j(j(k(x), r(l(x))), q(l(x))). Then p is one-one and (primitive) recursive. We assert that p is relation preserving between A.ω and (A.n).ω, for

(A.n).ω = {⟨j(j(a, s), u), j(j(b, t), v)⟩ : (u < v .v. u = v & s < t) & a, b ∈ C'A .v. u = v & s = t & ⟨a, b⟩ ∈ A},    (1)

and

⟨x, y⟩ ∈ A.ω ↔ x = j(a, nq+r) & y = j(a', nq'+r'), where 0 ≤ r, r' < n, and

nq + r < nq' + r' & a, a' ∈ C'A  or  nq + r = nq' + r' & ⟨a, a'⟩ ∈ A.    (2)

Condition (2) is equivalent to:

(q < q' .v. q = q' & r < r') & a, a' ∈ C'A .v. q = q' & r = r' & ⟨a, a'⟩ ∈ A.    (3)
Comparison of (1) and (3) immediately shows that p is relation preserving. This completes the proof.

THEOREM II.4.2: (i) A )( B → A.ω )( B.ω, (ii) (A+B).n + A = A + (B+A).n, (iii) (A+B).(n+1) = A + (B+A).n + B, (iv) (A+B).ω = A + (B+A).ω.

PROOF. (i) Let p be a partial recursive function satisfying the requirements (S) in theorem II.1.5 for A and B. Now x ∈ C'A.ω ↔ k(x) ∈ C'A, and similarly for B.ω. Hence if x ∈ C'A.ω ∪ C'B.ω, then x ∈ C'A.ω ↔ pk(x) = 0 .&. x ∈ C'B.ω ↔ pk(x) = 1. (ii), (iii) Proof by induction on n using the associativity of addition (theorem II.2.4). (iv) Let A ∈ A, B ∈ B, α = C'A and β = C'B where A )( B. It is easily verified that (A+B).ω = (A; 0) + C where

C = ∑_{m<ω} {(B; m) ∪ (A; m+1) ∪ [j(β, m) × j(α, m+1)] ∪ [j(α, m+1) × j(β, m+1)]}.

We construct a recursive isotonism p such that p : C ≃ (B+A).ω. Suppose α₁, β₁ are r.e. sets separating A and B (v. theorem II.1.4.(i)). Let p be the partial recursive function defined only on the r.e. set (α₁.ω ∪ β₁.ω) − j(α₁, 0) by

p(z) = z if z ∈ β₁.ω,
p(z) = j(k(z), l(z) ∸ 1) if z ∈ α₁.ω − j(α₁, 0).

By part (i) p is well-defined. It is then readily verified that p has the required properties.

II.5.
LEMMA II.5.1: (SEPARATION LEMMA.) If A = B + C and A ∈ A, then there are relations B ∈ B and C ∈ C such that B )( C and B + C = A.

PROOF. Suppose A ∈ A and A = B + C. Then there are relations B' ∈ B and C' ∈ C such that B' )( C' and, for some f, f : B' + C' ≃ A. Let B = f(B') and C = f(C'). Clearly, B ∈ B and C ∈ C. By theorem II.1.5, there is a partial recursive function, p, such that δp ⊇ C'B' ∪ C'C', ρp ⊆ {0, 1} and if x ∈ C'B' ∪ C'C' then x ∈ C'B' ↔ p(x) = 0 .&. x ∈ C'C' ↔ p(x) = 1. If y ∈ C'B ∪ C'C then, since f is one-one, there is a unique x ∈ C'B' ∪ C'C' such that y = f(x). Thus pf⁻¹ is partial recursive, δ(pf⁻¹) ⊇ C'B ∪ C'C, ρ(pf⁻¹) ⊆ {0, 1} and if x ∈ C'B ∪ C'C then x ∈ C'B ↔ pf⁻¹(x) = 0 .&. x ∈ C'C ↔ pf⁻¹(x) = 1. Hence by theorem II.1.5, B )( C. Clearly B + C = A, thus the proof is complete.
THEOREM II.5.2: (DIRECTED REFINEMENT THEOREM.) If A + C = B + D then there is an E such that either A = B + E and E + C = D, or A + E = B and C = E + D.

PROOF. If D = 0, let E = C. Otherwise we may assume D ≠ 0. Let A ∈ A and C ∈ C where A )( C. Then by the Separation Lemma (II.5.1) there are relations B ∈ B and D ∈ D such that B )( D and A + C = B + D. Let α = C'A, β = C'B, γ = C'C and δ = C'D. Then α ∪ γ = β ∪ δ.
Case 1. If δ ∩ α ≠ ∅, let η = α − β and E = D ↾ η. By construction, α = β ∪ η and η ⊆ δ. Therefore B )( E and E )( C. Now, B ⊆ A + C and E ⊆ A + C; further, x ∈ β and y ∈ η imply y ∈ δ and ⟨x, y⟩ ∈ β × δ ⊆ A + C. Thus B + E ⊆ (A + C) ↾ α = A. Conversely, if ⟨x, y⟩ ∈ A then either (i) x, y ∈ β, (ii) x, y ∈ η or (iii) x ∈ β & y ∈ η. (We cannot have x ∈ η & y ∈ β since (η × β) ∩ (B + E) = ∅ and η ⊆ δ.) In all these three cases ⟨x, y⟩ ∈ B + E. Hence A ⊆ B + E, and therefore A = B + E. Similarly, E + C = D. Set E = CRT(E) and the theorem follows in this case.
Case 2. If δ ∩ α = ∅, then β ∩ γ ≠ ∅ or B = 0. In the former case the existence of an E such that A + E = B and C = E + D is proved as in case 1 except that all occurrences of "A" are replaced by "B" and all those of "C" by "D" and vice versa (with corresponding changes in the associated Greek letters), and in the latter case put E = 0. This completes the proof.
THEOREM II.5.3: If B.ω = A + C and C ≠ 0, then there exist n, D, E such that

A = B.n + D, D + E = B and E + B.ω = C.

PROOF. We consider only the non-trivial case where A, B, C are all non-zero. Let B ∈ B; then by the Separation Lemma (II.5.1) there exist A ∈ A and C ∈ C such that A )( C and B.ω = A + C. By assumption C ≠ 0 ≠ A, hence there is a c ∈ C'C where c = j(b, n') for some b ∈ C'B and some n' ≥ 0. Therefore η = {l(a) : a ∈ C'A} is a set of natural numbers bounded by n'. Let n be the maximum number in η and let D = A ↾ j(β, n), E = C ↾ j(β, n). Then, as in the proof of the Directed Refinement Theorem, it is easily verified that D )( E and

A = B.n + D, D + E = B and E + B.ω ↾ {x : l(x) > n} = C.    (1)

We observe that B.ω ↾ {x : l(x) > n} ≃ B.ω under the map p : x → j(k(x), l(x) ∸ (n+1)) defined only on {x : l(x) > n}. Taking C.R.T.s of both sides of the equations in (1) completes the proof.
Note. There is no obvious link between theorems II.5.2 and II.5.3 for the following reason. We know that B.n + B.ω = B.ω for all n (by theorem II.4.1.(vii) and induction). Suppose B.ω = A + C; then by theorem II.5.2 it easily follows that for each n, either A ≤ B.n or B.n ≤ A. If A ≤ B.n for some n, then we are through, but A ≥ B.n for all n does not¹) imply A ≥ B.ω, nor is ≤ anti-symmetric on ℛ.
We sum up the properties of ≤ in the following theorem.

THEOREM II.5.4: The relation ≤ is a quasi-ordering on ℛ and satisfies the following tree condition:

A ≤ C and B ≤ C imply A ≤ B or B ≤ A.

PROOF. By definition II.3.1, if A ≤ C and B ≤ C then there exist D, E such that A + D = C and B + E = C. Hence by theorem II.5.2 there is an F such that either

A + F = B and D = F + E    (1)

or

A = B + F and F + D = E.    (2)
1) For it follows from theorem IV.4.5 that if B = 1 and A = V, then A > B.n for all n, but not A ≥ B.ω (= W).
In case (1), A ≤ B and in case (2), B ≤ A. ≤ is a quasi-ordering of ℛ by corollary II.3.4.

III. Quords

III.1. We now commence our study of proper subsets of ℛ.
DEFINITION III.1.1: A C.R.T. A is said to be a constructive order type (C.O.T.) if there is an A ∈ A which is a linear ordering. Since C.R.T.s are subsets of the corresponding (classical) relation types, if A is a constructive order type then every relation A ∈ A is a linear ordering.
DEFINITION III.1.2: A sequence {aᵢ}ᵢ₌₀^∞ in the field of a linear ordering relation A is said to be an infinite recursive descending chain if the function λi·aᵢ is recursive and for all i,
if, and only if, it contains no splinter.¹)

PROOF. Let A be a linear ordering and suppose that {gᵢ}ᵢ₌₀^∞ is an infinite recursive descending chain in A. Define f, a as follows: a = g₀, f(n) = g(1 + μy{g(y) = n})²) (v. [18], p. 33). Then {fⁱ(a)}ᵢ₌₀^∞ is a splinter in A.

1) This use of the word "splinter" is derived from that in [18].
2) By our convention (v. Terminology and notation) we write gᵢ for the value of g at i.
Conversely, suppose that {fⁱ(a)}ᵢ₌₀^∞ is a splinter in the linear ordering A. Let g be the function defined by

g(0) = a, g(n+1) = f(g(n)).

Then g is totally defined and computable, hence recursive, i.e. {gᵢ}ᵢ₌₀^∞ is an infinite recursive descending chain in A.

THEOREM III.1.6: If A is a quasi-well-ordering and f is a one-one partial recursive function such that f : A ≃ A, then f is the identity map on C'A.
PROOF. (This proof is essentially that in [14], p. 264.) If f ≠ 1 on C'A then there is an a ∈ C'A such that f(a) ≠ a. Since A is a linear ordering, either ⟨f(a), a⟩ ∈ A or
DEFINITION III.2.2: A C.O.T. A is said to be a quord if there is an A ∈ A which is a quasi-well-ordering. We write 𝒬 for the collection of all quords. It follows at once from example III.2.1 that there are some quords which are not C.R.T.s of well-orderings, though we shall show later (§ IV) that the cardinal numbers of quords and of C.O.T.s of well-orderings are the same (namely c, theorem IV.3.3). We shall see that quords possess many additive and multiplicative properties analogous to those of classical ordinals.
THEOREM III.2.3: A is a quord if, and only if, every A ∈ A is a quasi-well-ordering.
PROOF. By definition III.2.2 there is a B ∈ A which is a quasi-well-ordering. Let A be any other relation in A; then there is an f such that f : A ≃ B. Suppose that A is not a quasi-well-ordering; then there is an infinite recursive descending chain {aᵢ}ᵢ₌₀^∞ in A. But then {f(aᵢ)}ᵢ₌₀^∞ is an infinite recursive descending chain in B since λi·f(aᵢ) is totally defined. This contradicts our assumption and we conclude that A is a quasi-well-ordering. The converse is trivial.

THEOREM III.2.4: (i) 0 ∈ 𝒬, (ii) A = B + C implies (A ∈ 𝒬 ↔ B, C ∈ 𝒬), (iii) A ∈ 𝒬 ↔ (∃n)(n ≠ 0 & A.n ∈ 𝒬) ↔ (∀n)(A.n ∈ 𝒬) ↔ A.ω ∈ 𝒬.

PROOF. Left to the reader.
Let 𝒜 be a collection of C.R.T.s; then if we define A ≤_𝒜 B to mean (∃C)(C ∈ 𝒜 & A + C = B), then ≤ is absolute for quords in a certain sense, by the following corollary.

Corollary III.2.5. If A, B ∈ 𝒬, then A ≤ B ↔ A ≤_𝒬 B.
PROOF. Immediate from theorem III.2.4.(ii).
THEOREM III.2.6: If A is a quord, then A = B + A + C → C = 0.

PROOF. Suppose A is a quord and A = B + A + C where C ≠ 0. Then there exist quasi-well-orderings A ∈ A, B ∈ B and C ∈ C and a recursive isotonism f such that B + A + C is well-defined and f : B + A + C ≃ A. Since C ≠ 0, C ≠ ∅ and hence there is an element c ∈ C'C and for this element, f(c) ∈ C'A. Now A )( C, hence C'A ∩ C'C = ∅ and c ≠ f(c). But
Corollary III.2.7. If A is a quord, then A + B = A ↔ B = 0.
Corollary III.2.8. If A is a quord, then B < A ↔ B ≤ A & B ≠ A.
Corollary III.2.9. If A or B is a quord, then A ≤ B & B ≤ A → A = B.
PROOF. By hypothesis there exist C, D such that A + C = B and B + D = A. Hence A = (A+C)+D = A+(C+D) and B = B+(D+C), and if A is a quord, then C, D are quords by theorem III.2.4.(ii) and C = D = 0 by corollary III.2.7 and theorem II.2.3; similarly if B is a quord.
Corollary III.2.10. If A or B is a quord, then A ≤ B & B ≤* A → A = B.
PROOF. By hypothesis, there exist C, D such that A + C = B and D + B = A. If A is a quord, then A = D + A + C and by theorem III.2.6, C = 0; hence A = B. Similarly if B is a quord.
Corollary III.2.11. If A is a quord, then A + B = A + C ↔ B = C.
PROOF. Suppose A + B = A + C; then by the Directed Refinement Theorem (II.5.2) there is an E such that either A = A + E and E + B = C, or A + E = A and B = E + C. In either case, by corollary III.2.7, E = 0 and B = C. The converse is trivial.
Corollary III.2.9 establishes that z; is a partial ordering of !2. We shall show later (Theorem IV. 2.6) that :::;; is a partial well-ordering of the collection of all C.R.T.s of well-orderings. :::;; is not a partial wellordering of !2 as is shown by the example below. Example III. 2. 13. Let A be as in example III.2.1 and suppose A = {
n.
Corollary III. 2. 10 may be regarded as a constructive analogue of the following theorem attributed to Lindenbaum (given in [14], p. 248): "If an order type A is an initial segment of an order type B and the order type B is a final segment of the order type A, then A = B." By corollaries III. 2.9 and 10, A = B is equivalent (in !2) both to A ::;; B & B :::;; A and to A :::;; B & B :::;; * A. But it follows from the existence of quords incomparable with respect to :::;; (see below § IV. 4)
that ≤* is not anti-symmetric on 𝒬.¹) For let A, B be two incomparable quords and let C = (A+B).ω and D = B + (A+B).ω. Then clearly C ≤* D and D ≤* C. If C = D, then by theorems II.4.1.(vii) and II.5.4 it easily follows that A and B are comparable, contradicting our assumption.
IV. Co-ordinals IV.l. In this section we establish some properties of co-ordinals which are the C. R. T.s of well-orderings. We regard classical ordinals as relation types (v. § 1.1) of (denumerable) well-orderings. Hence co-ordinals are sub-classes of the corresponding (classical) ordinals. We use upper case Greek letters for classical ordinals (and variables over the denumerable classical ordinals). Notation. 10 = 0, 11 = CRT {
I:, 0
°
0
PROOF. We prove only part (viii), leaving the other parts to the reader. (viii) A + 1ₙ = B + 1ₙ → (A + 1ₙ)* = (B + 1ₙ)* → 1ₙ* + A* = 1ₙ* + B* (by theorem II.2.3) → 1ₙ + A* = 1ₙ + B* (by part (ii)) → A* = B* (by (vi)) → A = B.
DEFINITION IV. I .2: A C. O.T. is said to be finite (or a finite co-ordinal) if it is In for some n. We remark that this definition corresponds to the classical definition of finite sets as sets which are inductive. A search for an analogue of Dekker and Myhill's Isols (v. [8]) proved abortive. 1) This example is based on that in [17] p. 25.
THEOREM IV. 1.3: Any two linear orderings with fields of the same finite cardinal are totally recursively isotonic. PROOF. Since any two finite sets of the same cardinal can be mapped onto each other in a one-one manner by a recursive permutation (any permutation of the natural numbers which interchanges only a finite number of numbers is recursive), and since every finite linearly ordered set is well-ordered and so is its converse, it follows that any two linearly ordered sets of the same finite cardinal are totally recursively isotonic. IV.2. DEFINITION IV. 2.1: If A is the C.R.T. of a well-ordering, then A is said to be a co-ordinal. We let 'fJ denote the collection of all co-ordinals. If E is a classical ordinal such that E :2 A, then E is said to be the (classical) ordinal of A and we write E = 1 A I. Corollary IV.2.2. If A is a co-ordinal then As; then A = I A I.
1
A
I
and if A is finite
THEOREM IV. 2.3: (i) 0 e -e, (ii) A = B+ C implies A e'fJ ...... B, C s 'fJ, (iii) A s 'fJ ...... (3n) (n "# 0 & A.n s 'fJ) +-+ (Vn) (A.n s 'fJ) ...... A.w s 'fJ. Corollary IV.2. 4. If A, Be 'fJ, then A s B ...... A s 'lB. Thus ~ is absolute for co-ordinals as well as for quords (cf. corollary III.2.5).
LEMMA IV.2.5: |A + B| = |A| + |B|.
THEOREM IV. 2.6: ~ is a partial well-ordering of'fJ and satisfies the tree condition (v. theorem II.5.4). PROOF. As we remarked earlier (§ 111.2) every co-ordinal is a quord, hence ~ is a partial order of'fJ by corollary 111.2.9. Further, ~ satisfies the tree condition by theorem 11.5.4 and corollary IV. 2.4. Now suppose {AJ?= 0 is an infinite descending chain of co-ordinals and let E j = I A j I for each i. Then, by lemma IV. 2.5, {E j }?= 0 is an infinite descending chain of (classical) ordinals under the natural ordering of ordinals. This is impossible, hence ~ is a partial well-ordering of 'fJ.
Corollary IV.2. 7. If A is a co-ordinal, B:::; A and C:::; A and B = C.
1B I = I C I, then
PROOF. Since :::; is a tree ordering, either B :::; C or C :::; B. Suppose B < C, then there is an E #= 0 such that B + E = C. Therefore, by lemma IV. 2.5, I B I + I E I = I C I and I E I #= 0; thus I B I < I C I where < denotes the classical ordering of ordinals. Similarly we cannot have I C I < I B I· Corollary IV.2.8. If A is a co-ordinal, then &(A)
= {B : B < A} and &+(A) = {B : B :::; A}
are well-ordered by :::;. IV. 3. DEFINITION IV. 3 . 1: A co-ordinal A is said to be infinite if, for some A e A, C'A is infinite. It follows at once that a co-ordinal is infinite if, and only if, it is not finite. THEOREM IV. 3 .2: A co-ordinal A is infinite ff, and only if, In < A for all n. PROOF. Let A e A and suppose that In < A for all n. Then by the Separation Lemma (II. 5 . I), for each n there exist Bn such that An + Bn = A, where
An = {
:
i :::; j
< n} and A = {<ar, a,j) : T :::; Ll < I A I}.
It follows at once that C'A 2 {a i
: i s $} and that C'A is infinite. Conversely, if A is infinite, then using theorem II. I .4. (ii) one easily shows that In < A for all n. We leave the details to the reader.
Notation. By virtue of theorems IV. 1. I and IV. 3.2 we now write "n" for "In" and "$" for "{In: n is a natural number}" where there is no danger of confusion.
THEOREM IV.3.3: (i)$ c Cfl c fl c!3£, (ii) The cardinalities ofCfl, fl,!3£ are all c (the cardinal of the continuum). PROOF. (i) Every finite linearly ordered set is well-ordered, hence
𝒩 ⊆ 𝒞. There exist infinite well-ordered sets, hence 𝒩 ≠ 𝒞. By example III.2.1 there exist quasi-well-orderings which are not well-orderings but every well-ordering is a quasi-well-ordering, hence 𝒞 ⊂ 𝒬. Finally, let W* be the converse of the natural ordering of the natural numbers and let W* = CRT(W*). Clearly, W* is not a quord. Hence 𝒬 ⊂ ℛ. (ii) The cardinality of ℛ is ≤ 2^ℵ₀ since any C.R.T. is an equivalence class of subsets of J². But 2^ℵ₀ = c; thus in order to prove (ii) it suffices to prove that the cardinality of 𝒞 is ≥ c. Now every equivalence class of well-orderings contains at most ℵ₀ well-orderings since there are only ℵ₀ recursive isotonisms. Further, there are at least c distinct well-orderings of subsets of J. Hence, if x is the number of elements of 𝒞, then ℵ₀·x ≥ c and it follows, using the axiom of choice, that x ≥ c. This completes the proof.
As in the classical case subtraction does not playa major role, but we introduce the notion now for notational convenience. By corollary III. 2. 11, if A = B + C then C is uniquely determined by A and B; it follows by theorem IV. 3 . 3 that the same is true for co-ordinals. Hence the following definition gives a unique value for A - B (which by Theorem IV. 2. 3(ii) is a co-ordinal if B, A are co-ordinals). DEFINITION IV. 3 .4: If A ~ B and (A and) Bare quords, then A - B is the unique C such that A = B + C. THEOREM IV. 3 . 5: If A, B, Care quords, then (i) A-A = 0,
(ii) (A+B)-A = B, (iii) if B :0;: A, then B+(A-B) = A, (iv) if A+B :0;: C, then C-(A+B) = (C-A)-B. PROOF OF (iv). A + B
:0;:
C
-+
(ElD) (C = A + B + D), hence
C-(A+B) Also, C-A
= B+D and
=
D.
therefore
(C-A)-B = D. THEOREM IV. 3 .6: If I A I is a successor number A + m, where A is a limit number, then for each n there is a unique BII which is comparable with A and of classical ordinal A+n; further, B; = A ± I m-n I (where
I m - n I is the modulus of m - n and the as En < A or En ;:::: A).
+ or
I
- sign is taken according
This theorem follows at once from the fact that if A e A then A has a final segment of type m which is finite, and hence, by theorem II. 1 .4. (ii), separable from its complement in A. We leave the details to the reader. It follows from this theorem that every co-ordinal has a unique successor. We shall show later (§ V) that limits of strictly increasing sequences of co-ordinals are never uniquely determined by such sequences without other conditions. IVA. THEOREM IV .4.1: For each limit number A there exist ceo-ordinals
of ordinal A.
PROOF. There are c distinct infinite subsets of .Y. Let each of these be well-ordered with ordinal A (this is possible since A is denumerable). Then these c subsets are spread among, say, x equivalence classes containing at most No members each since there are only No recursive isotonisms. Hence No.X = c and therefore (using the axiom of choice)
x = c.
Corollary IVA. 2. There are c co-ordinals
V~
such that
I V~ I = w.
Notation. W = {⟨x, y⟩ : x, y ∈ J & x ≤ y} and W = CRT(W); V = {⟨x, y⟩ : x, y ∈ ρ & x ≤ y} and V = CRT(V).
DEFINITION IV. 4.3: W is said to be the standard well-ordering of type w, W is the standard w-co-ordinal; V is called the generic counterexample.i) THEOREM IV .4.4: (i) 1 + W = W, (ii) 1 + V"# V. PROOF. (i) It is easily verified that W = I. w, hence by theorem 1I.4.1.(vii) with A = 1, 1+ W = W. (ii) Suppose 1 + V = V. Since p is r.e. non-recursive, p is non-empty, 1) Since most of our counterexamples are based on V.
say ao s p. Therefore there is a recursive isotonism
f such that
»:m
f: {(ao, ao)}+V ~ V. Hence V = {(r(ao),f"(a o
~
n}
and g : W ≃ V where g(n) = fⁿ(a₀). But then g enumerates ρ in order of magnitude, and it follows by Post's lemma ([13], p. 291) that ρ is recursive, contradicting our choice of ρ.
Corollary IV.4.5. V and W are incomparable.
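The step from g to the recursiveness of ρ is Post's lemma: a set enumerated in strictly increasing order by a recursive function is recursive. A minimal illustrative sketch follows (the example set is hypothetical, not the paper's ρ).

```python
# Illustrative sketch (not the paper's construction): if a recursive g lists a
# set rho in strictly increasing order, then rho is recursive.  This is the
# point behind Post's lemma as used above.

def g(n):                      # increasing recursive enumeration of some set rho
    return n * n + 3           # rho = {3, 4, 7, 12, 19, ...}

def in_rho(x):
    # decide membership: enumerate in increasing order until x is reached or passed
    n = 0
    while g(n) < x:
        n += 1
    return g(n) == x

assert in_rho(7) and not in_rho(8)
```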
PROOF. By the theorem V '" W. But V < Wor W < V implies Vor W (respectively) is finite. This corollary shows that there exist incomparable co-ordinals, and hence, that there are incomparable quords. It therefore completes the demonstration (end of § 111.2) that ~ * is not anti-symmetric (even on ..2). THEOREM IV.4.6: There exist co-ordinals A, B, C such that A < B but A+C $ B+C. PROOF. Let A = I, B = V and C = W. Then by theorem IV.3.2, A < B, and by theorem IV.4.4.(i)A+C = C. If A+C ~ B+C, then C ~ B + C and B ~ B + C. Hence by theorem IV. 2. 6, Band Care comparable which contradicts corollary IV.4. 5. We now consider important classes of co-ordinals for which the law (+) A < B - A + C ~ B+ C does hold. In fact, if A, B, C are predecessors of the same principal number for addition then (+) holds. Classically, a principal number for addition, otherwise called a y-number ([I], p. 67) or a prime component ([14], p. 279), may be defined as an ordinal II '" 0 satisfying one of the three (equivalent) conditions (l C)-(3C) below. r
r,
+ L1 = II L1
-+
L1
=0
or L1
< II - r + L1 < II
= II
We consider constructive analogues of these conditions, viz.: B < A - B+A = A B + C = A -+ C = 0 or C B, C < A _ B + C < A
=
(I),
A
(2), (3).
THEOREM IV. 4.7: If A is a co-ordinal #- 0, then (i) (1) +-+ (2), (ii) (1) ~ (3), (2) ~ (3), (iii) (3) -+-+ (1), (3) -+-+ (2). PROOF. (i) Suppose B+C = A and (1) holds. Then if C #- 0, B < A, and by (1), B + A = A. Hence by corollary III . 2 . 11, A = C. Conversely suppose (2) holds and B < A. Then there is a C#-O such that B+ C = A and by (2) we have C = A, hence B + A = A. (ii) Suppose (1) holds and B, C < A. Then B+A = A and C+A = A, hence (B+C)+A = B+(C+A) = B+A = A and since A #- 0, B+C< A. That (2) ~ (3) follows from (i). (iii) It suffices to prove (3) +} (1). Let A = V, then B < A implies B is finite, hence (3) holds for V. Since 1 < V, if (1) held we would have 1 + V = V contradicting theorem IV. 4.4. (ii). This completes the proof.
°
DEFINITION IV. 4.8: A co-ordinal A is said to be a principal number for addition if A =1= and B < A ~ B+A = A. If A = 1, then A is called an improper principal number for addition and if A#-1 then A is called a proper principal number for addition. We write£'( +) for the collection of all principal numbers for addition.
THEOREM IV. 4 .9: Every proper principal number is a co-ordinal whose classical ordinal is a limit number. PROOF. Clearly, no finite co-ordinal is a proper principal number. Suppose A is a proper principal number for addition and I A I = A + m, where A is a limit number and m is finite. Then by theorem IV. 3 .6 there is a co-ordinal u; < A such that I Bo I = A. Hence I B o + A I = A.2+m> I A I and consequently Bo+A #- A. THEOREM IV.4.lO: If P e£'( +), then P.w e£'( +) and P < P.w. PROOF. Suppose P is a principal number for addition and A < P. w; then, by theorem 11.5.3, there is an n and a D such that A = P.n+D, where D < P. Hence A+P.w = A+(P+P.w) [by theorem 11.4.1. (vii)] = (A+P)+P.w = (P.n+D+P)+P.w = P.(n+l)+P.w [sinceP
is a principal number for addition and D < P] = P. co [by (n + I) applications of theorem II.4.I.(vii)]. Hence P.we£(+). Clearly, P < P.w since P "* O. THEOREM IV 04.11: (i) If P is a principal number for addition and A, B, C < P, then A < B--+ A+C s B+C. (ii) Similarly under the hypothesis that A, B, C
s
P.
PROOF. (i) By theorem IV.4.7.(ii), A < B--+ A+C < P&B+C < P. Therefore, by theorem II. 5 . 4, A + C and B + C are comparable. But, classically, ep < 'JI --+ ep + r ~ 'JI+ r, hence by lemma IV. 2.5, A + C s B + C. (ii) follows at once from (i) using P. co instead of P and theorem IVA.IO. Corollary IVA .12. If A, B s P and P is a principal number for addition, then B s A + B. Now we prove that any (non-zero) predecessor of a principal number for addition is uniquely expressible as a finite sum of non-increasing principal numbers for addition.')
THEOREM IV. 4. 13: If 0 < A < P e£ ( +), then there exist principal numbers for addition C l' . . . , Cn such that P > Cn 2: Cn _ 1 •.. 2: C 1 and A = Cn + ... +Cl • Further, if A = Cn + ... +C1 and A = D m + ... +D l are two decompositions such that P 2: Cn 2: Cn - 1 . . . 2: C 1 and P > D m 2: D m _ 1 . . . 2: D l and all the C, and D, are principal numbers for addition, then n = m and for all r S n, Cr = Dr' Conversely, if A is expressible as Cn + ... +C l where C; 2: Cn _ 1 2: ... C l and all the Cr are principal numbers for addition, then there is a principal number, namely,
c..«
2: A.
PROOF by transfinite induction with respect to the partial well-ordering S. We assume 0 < A < P e£( +) and take as induction hypothesis: If 0 < B < A, then B is uniquely expressible as a finite sum of principal numbers < P. If A is a principal number for addition, then there is nothing to prove. 1)
This theorem was conjectured by A. L. Tritter.
Now suppose A is not a principal number, then there exist B, C such that B+ C = A, where C "# 0, A (and hence B "# 0).
(4)
By corollary IV.4.l2, C < A. Let C 1 be the least C satisfying (4) (i.e. under the ordering by initial segments). We now show that C 1 is a principal number for addition. Suppose C 1 = D + E, then by corollary IV.4 .12, E < P and hence by theorem 11.5.4, C 1 and E are comparable. But 1E 1 ~I C 1 I, hence E s C 1 and by the minimality of C b E = C 1 • Thus C 1 is a principal number by theorem lV.4.7.(i). Now let B 1 be the least B such that B+C 1 = A. Then if B 1 = 0 we only have to prove uniqueness, and otherwise by the hypothesis of the induction, B 1 has a (unique) decomposition B = Cn + + ... +C z where P> C, ~ ... ~ C z and all the Cr(r = 2, ... , n) are principal numbers. Hence A = Cn+ ... +C 1 and, since C 1 < P, all the C, (r = 1, ... , n) are comparable. Suppose C z < C 1 , then by the definition of a principal number for addition, Cz + C 1 = C l' hence A = (C n+ ... +C Z)+C 1
= (C n+··· +C 3)+C 1 •
Now C z "# 0 -+ Cn+ ... +C 3 < Cn+ ... +C z = B 1 • But B 1 was chosen as the least B such that B+C 1 = A. We therefore cannot have C z < C 1 and must have Cz ~ C L: Thus A = C, + ... + C 1 is a decomposition of the required type. As regards uniqueness, letA = Cn+ ... + C 1 and A = Dm+ ... +D 1 be two decompositions of A as a sum of non-increasing principal numbers. By theorem 11.5.4, C, and D m are comparable. Suppose Cn > D m , then D m + C n = C, since C n is a principal number for addition. Therefore A = D m + C n + ... + C 1 and by substituting D m + C, for C, m times more we obtain A = Dm.(m+ 1)+A which implies Dm.(m+ 1) < A. Now if i ~ m, then D;+D m = Dm or Dm.2 according as Di < Dm or D i = Dm. Therefore Dm.(m+l)+A = A < A+Dm.m ~ Dm.2m and hence, by corollary JIL2.11, A ~ Dm.m. This contradicts Dm.(m+l) < A and we therefore cannot have D m < Cn" Similarly, Cn -{: D m and we conclude C; = Dm . Now by corollary 111.2.11 it follows that
Repeating this argument the minimum of m and n times and letting s be this minimum, we obtain C, r = Dm _, (r = 0, ... , s) and hence either Ct+ ... +C t = OorDt+ ... +D t = 0 where t = 1m - Ill. By theorem 11.2.3. (i i) it follows that t = 0 and hence that n = m and C, = D, for every r. Conversely, if A = Cn+ ... +C t , then as for Dm above, A+Cn.n :::;; :::;; Cn . 2n and hence by theorems IV.4 .10 and 11.4.1. (vii) it easily follows that A < Cn • w which is a principal number for addition. This completes the proof. -r
This theorem is not an immediate corollary of theorem 2, p. 280 in [14] for the following reasons: (i) it may be the case that I P I is a classical principal number while P is not a principal number, e.g. V, (ii) P may be a principal number but I P I may not be a classical principal number (see § VIII .1) and (iii) comparability conditions have to be established.
IV.5. By theorem IV .4. 1 above there are c co-ordinals corresponding to each limit number (and these co-ordinals are therefore incomparable with each other) but there are some limit number co-ordinals which have no predecessors of some smaller ordinal. More formally: Let A, B, C range over co-ordinals, over classical ordinals, then
e
(3A) (3B) (3 e)
(/
A
& (V C)
I = r & IB I = A & A < B & r < e < A (I C 1"1= e v C 1: B v A 1: C».
This is shown by example IV. 5.1 below. If, however, we restrict ourselves to recursive co-ordinals then this situation does not arise. We hope to present the results for recursive co-ordinals in [4]. Example IV. 5 .J. P is as given in § IV.4. Let T be the well-ordering of type w. 2 defined by <x, y) e T +--> x e p & yep v xc y e p &x:::;; y v x, yep & x :::;; y.
Let T = CRT(T). Suppose T = V + V' where I V I = I V' I = w, then by the Separation Lemma (II. 5 .1) there exist relations U, U' such that U )( U' and T = U + U'. Hence C'U and C'U' are contained in disjoint
CONSTRUCTIVE ORDER TYPES, r.e. sets (x, p. But this implies trary to the choice of p.
(X
223
I
= p & P = P and that p is recursive, con-
e
We observe that if the condition on above is satisfied for some successor number l ' then by theorem IV. 3 . 6 it is satisfied for some limit number 2 • On the other hand we do have c co-ordinals which have predecessors representing all ordinals less than that of the given co-ordinal. This is the content of the representation theorem below. We shall use the following classical theorems in proving the representation theorem.
e
e
THEOREM IV.5.2: ([14], p. 379, theorem 1.) Every denumerable ordinal which is a limit number is the limit ofa strictly increasing sequence, of type ill, of ordinals less than the given number. i
THEOREM IV. 5.3: ([14], p. 264, corol/ary 3.) If A and B are isotonic weI/orderings then there is an isotonism f such that every isotonism between A and B is an extension off. THEOREM IV. 5.4: (REPRESENTATION THEOREM.) Let F, ..1 range over (denumerable) classical ordinals, C, D over co-ordinals, then
('IT) (3C)
(I C 1=
r &('1..1) (..1
->
(E!D)
(I
D
1=
L1 & D < C))).
PROOF BY TRANSFINITE INDUCTION. The assertion is trivial if T = O. We assume the assertion holds for all ordinals less than T, If r = e + 1, then by the hypothesis of the induction there is a co-ordinal T such that
I T I = e & ('1..1) (..1 < e
->
(ElD)(1 D
I = ..1 & D <
T)).
Then by theorem IV.3.6, T < T+ I and D < T+ 1 -> D s T. Let C = T + 1, then by corollary IV. 2.7, it easily follows that C has the required properties. If r is a limit number, then by theorem IV. 5.2, F is the limit of a strictly increasing sequence {4>;} j < w of ordinals. We may assume 4>0 #- O. Put II 0 = 4>0' Il, + 1 = 4>j + 1- 4>j (by [14], p. 275, Il, is well-defined). Then
By the hypothesis of the induction, for each i there is a P, such that
I Pi I = IIi & ('v'A)(A < IIi
(E! D)(I D I = A & D < Pi»'
-+
(5)
Using the axiom of choice, choose a fixed Pi in Pi (such that 0 a CP i for each i 1 Now define
».
C
= {<j(p, m),j(q, n»: p e CPm & q a CP n & m
< n
.v. m = n &
and C = CRT(C). Clearly,
L
IC I = Now suppose A
<
i
r, then for
IIi =
r.
some n,
A <
n
L IIi i; 0
and we may assume that n is minimal. Therefore A
where e and T <
n- 1
= L u.s o i; 0
< II n- From (5) it follows that there is a T such that I T I =
r; Let D =
n - 1
e
n
L Pi + T, then I D I = A and D::;; L0 i; 0 i;
Pi'
Since ::;; is a tree-ordering, in order to complete the proof it suffices to prove that, for all n, n L r;« C. i; 0
Let Pen) = C[{x: lex) ::;; n}, then it is easily verified that
Pen) a
n
L Pi' i; 0
Further, let pen) = C[{x: lex) > n}. Then p(n»( pen) since if x a CP(n) U cp(n) (= CC), then xaCP(n)+-->l(x)
s:
U
n .&. xaCp(n)+-->l(x) > n.
Hence, if p is the partial recursive function sg (l(x)-=- n), then p satisfies 1) We shall use this auxiliary condition in the proof of corollary IV. 5.5.
225
CONSTRUCTIVE ORDER TYPES, I
the conditions in theorem II .1. 5. Hence p(n) + p
i
L= 0 Pi <
C
and the proof is complete. Corollary IV.5 .5. There are ceo-ordinals C A for each ordinal T such that I C A I = rand
('ILl) (Ll < PROOF.
r -+ (E! D)( I D I =
Ll & D < C A ) ) .
~
w
(6)
Case 1. If F is a limit number.
Let VA be a co-ordinal such that I v~ I = W & VA i= W, by corollary IV.4.2 there are c such co-ordinals. Let YA e VA and suppose YA = {
C' = {G(p,
» : P e eP
vm ) , j(q, vn
m
& q s eP n & m <
. v. m = n &
/I
(7)
and C' = CRT(C'). As before, I C'I = rand (6) holds with C' replacing CA' Clearly, C '" C' under the map f: x -+ j(k(x), v/(x» and hence, by theorem IV. 5.3, every isotonism between C and C' is an extension off. Therefore, if C = C', then g : C ~ C' for some partial recursive extension of f. In particular, gj(O, m) = j(O, vm ) for every m and hence the map m -+ Vm is partial recursive. This contradicts our choice of VA' and we conclude C i= C'. Similarly, if C" is obtained from Vn then C i= C" and C' i= C" since the former implies that the map m -+ U m is partial recursive and the latter that the map Vm -+ u.; is partial recursive, where Yn = {
Suppose T = e + n where e is a limit number. Then by case 1, there exist c co-ordinals LA such that I LA I = e and (6) holds with LA re-
placing CA' Let C A = LA follows for this case also.
+ n, then
by theorem IV. 3 . 6 the conclusion
We observe that the limit of a sequence of recursive co-ordinals (coordinals containing a recursive well-ordering) is not uniquely defined either, since by theorem 1. 4. 6 the generic counterexample is a recursive co-ordinal and using this Vas the VA of the corollary proof we obtain a C' "# C. In the case of recursive co-ordinals, however, there are only ~o distinct "limits".') V. Bounds
V.I. Since the ordering by initial segments is a partial order on f2 and on l(/ we can define upper and lower bounds in the usual way. We use the techniques developed in the previous section in order to show that there are no non-trivial upper bounds for collections of quords or co-ordinals. DEFINITION V .1.1: A quord B is said to be a lower (upper) bound for a collection of quords, d, if Qed implies B s Q (B ~ Q).
V. I .2: A quord B is said to be a greatest lower bound (least upper bound) for a collection of quords, d, if B is a lower bound (upper bound) for d and every lower bound (upper bound) for d is ~ B (is ~ B). DEFINITION
By the anti-symmetry of ~, least upper bounds and greatest lower bounds are unique if they exist at all. (The proof of the following lemma is based on the idea in the proof of theorem 4lb in [8].) LEMMA
decessors.
V. I .3: A quord has at most denumerably infinitely many pre-
Let A be a quord and A s A. For fixed A we show that every r.e. set determines at most one predecessor of A and that every predecessor of A determines at least one r.e. set. The lemma follows at once from these results. Let 13 be a r.e. set, then 13 determines at most one predecessor of A as follows: Let B = A [13, then B = CRT(B) ~ A only if there is a r.e. set PROOF.
1) Since there are only ~o recursive co-ordinals.
Y such that B)( A [y and B+A [y = A (using the Separation Lemma II. 5 .1). On the other hand, if B ~ A, then there is a B such that B e Band B is contained in some r.e. set p separating B from A [ (C'A - C'B). This completes the proof of the lemma. LEMMA V.I. 4: A denumerable collection of quords has an upper bound if, and only if, every two members of the collection are comparable. PROOF. Let d = {Ai: i e J'} be a collection of quords. If there is an upper bound, U, for d, then A; ~ U for all i and by theorem II. 5.4, for all i, j, either A; ~ A j or A j ~ Ai' Conversely, suppose A; and A j are comparable for all pairs i, j. We
may assume that there is no maximum A; since the assertion is trivial in that case. We now set Bo = A" B; + 1 = A /(;), where r = Jls{ As =F O}, t(i) = Jls{A s > B i } · Clearly, i < j --+ B, < B j • Hence the C i , defined by Co = B o, C i + 1 B; + 1 -B;, are all non-zero. For each i, let C; be a fixed representative of Ci such that 0 e C'C;. Now let
U
=
{(j(e, m),j(d, n»: e s C'C m & d e C'Cn & m < n . v. m = n & (c, d) e Cm}.
Further, let U = CRT(U), Yn = {x: lex) ~ n}, Urn) = U [Y(n) and urn) = U [~n' We shall prove:
~n
=
{x: lex) > n};
1) urn) =F 0, 2) Urn) )( urn),
3) for all n, Urn) s e; 1) urn) =F 0 since m > n --+ (j(0, m), j(O, m» e urn) by construction and the choice of the C; 2) For each n, x s C'U implies x e C'U(n) ~ sg (l(x) -=-n) = 0 & x e C'u(n) ~ sg (l(x) -=-n) = 1. Hence by theorem II .1.5, 2) holds. 3) U(o) = (Co; 0) ~ Co s Co = B o. Now we assume Urn) e B; and prove Urn + 1) e B n + i -
228
JOHN N. CROSSLEY
By construction, (C n+ 1 ; n+1) <:; Urn»)( Urn)' Hence Urn)+ ( C n+ 1 ; n+1) = T, say, is well-defined and T s Bn+Cn = B n + r- But T = Urn +1) by construction of U and 1'n + r- Hence, for all n, Urn) e Bn" This proves 3)1). Using 1) it follows at once that B; < V for all n and it only remains to prove that V is a quord. Suppose U(Xi' ninr'= 0 is an infinite recursive decending chain in U. Then by the definition of U, {ni: i <; J} has a maximum; let this be n. Then the given chain is also an infinite recursive descending chain in U(n) which is impossible, since U(n) s B; and B; is a quord. Thus the proof is complete.
v.2. An analysis of the proof of corollary IV. 5. 5 shows that there exist c incomparable limits to certain increasing sequences of co-ordinals. We now prove the stronger result that any increasing sequence of coordinals without a maximum has c incomparable co-ordinals whose classical ordinals are all equal to the limit of the classical ordinals of the given sequence. This will be a corollary of the next theorem. THEOREM V. 2. l: A collection of quords has a least upper bound if, and only if, it has a maximum.
The "if" part is trivial. Now suppose that sf is a collection of quords without a maximum, but with a least upper bound L. By lemma V. 1.3, L has at most denumerably infinitely many predecessors, hence sf is at most denumerably infinite. And sf is not finite since sf has no maximum. In order to prove the theorem we construct two incomparable upper bounds V and U' which are, in a certain sense, minimal upper bounds and obtain a contradiction. Since sf is denumerable we construct U and V exactly as in the proof of lemma V. 1.4. Let V be a well-ordering in the generic counterexample, say, V = {
1) We are here using the fact that, for any finite set of one-one partial recursive functions with mutually disjoint domains and ranges, there is a one-one partial recursive function which agrees with each member of the given set on its respective domain.
V.l.4 easily yields that V' is also an upper bound for d (see especially footnote.') p. 224), the details of this verification we leave to the reader. We now prove A < V
-+
(3n) (A < En)'
(I)
Suppose A < V, then there exist relations A, D, such that A)( D, A + D = U and D :1= 0 (by the Separation Lemma 11.5. 1 and corollary 111.2.8). Hence there is a number j(x, m) e CD for some m. Clearly, CA ~ {x : l(x) ~ m}, hence, using the same notation as in the proof of lemma V.l.4,
U = A+D [{x: l(x)
s
m}+um.
Taking C.R.T.s we obtain A s Em and if n = m+ 1, then A < En since c, :1= 0. This proves (1). Similarly one proves (I) with U' replacing V. It follows at once that L ~ V and L ~ U', Thus in order to complete the proof we only need to establish V:I= V But V = U' implies there is a recursive isotonism g such that g : U ~ U'. But U' = l(U) and hence U = g[(U). Hence by theorem III .1. 6, gf = 1 on CU. But this implies the map h : j(O, n) -+ j(O, vn) defined only on numbers of the formj(O, n) has a partial recursive inverse, namely g, which implies that h is partial recursive contrary to our choice of V. Thus the proof is complete. f
•
Corollary V. 2 .2. A collection of co-ordinals has a least upper bound if, and only if, it has a maximum. Corollary V.2. 3. Let d be a collection of co-ordinals with no maximum, but such that all its members are comparable. Further, let lim
A.JII
IA I =
A,
then there exist c incomparable upper bounds V4> such that PROOF.
I V4> I =
A.
Clearly,
IV I=
lim
A.JII
IA I
by the construction of V. We leave the reader to verify that c upper bounds
U~ can be constructed from the c incomparable co-ordinals corollary IV. 4.2 (cf. proof of corollary IV. 5.5).
V~
given by
V.3. Since 0 ~ A for every quord A, every collection of quords has a lower bound. There exist collections of quords with a greatest lower bound but no minimum as the following example shows. Example V. 3 . 1. Let U be a co-ordinal such that I U I = wand there is a U e U such that C'U is immune. Then, clearly U =F W. U - n is well defined for all n and if m =F n, then U- m =F U- n since otherwise U = r+ U for some r. We shall show later (Lemma VIII.l.4) that this last equation implies U =F W. However, U'
PROOF.
Immediate from theorem 11.5.4.
The converse of this theorem is false. For example, let ir be the collection of all co-ordinals of classical ordinal ro, then every finite co-ordinal is a lower bound for ir but there is no greatest lower bound for the ordinal of any greatest lower bound would be w. VI. Multiplication VI. 1. From now on we shall be principally concerned with co-ordinals and unless otherwise stated all C.R.T.s mentioned will be assumed to be co-ordinals. We give a natural definition of multiplication of C.O.T.s in this section and show that most of the [analogues of the] basic classical laws hold for co-ordinals. There is one striking breakdown, namely in the case of the law (1) A < B -+ AC s BC
which we shall show fails for some co-ordinals. If, however, A, B, C are all predecessors of a principal number for multiplication, then (1) does hold.
Notation. A.B = {⟨j(a, b), j(a', b')⟩ : a, a' ∈ C'A & b, b' ∈ C'B & (⟨b, b'⟩ ∈ B & b ≠ b' .v. b = b' & ⟨a, a'⟩ ∈ A)}.
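A purely illustrative sketch of this product on finite relations follows, again with Cantor pairing standing in for the paper's j; all names are hypothetical.

```python
# Sketch of the product A.B just defined ("B copies of A"), with Cantor pairing
# as a stand-in for j.

def j(a, b):
    return (a + b) * (a + b + 1) // 2 + b

def field(rel):
    return {x for pair in rel for x in pair}

def product(A, B):
    # <j(a,b), j(a',b')> is in A.B iff b strictly precedes b' in B, or b = b'
    # and a precedes a' in A.
    CA, CB = field(A), field(B)
    return {(j(a, b), j(a2, b2))
            for a in CA for a2 in CA for b in CB for b2 in CB
            if ((b, b2) in B and b != b2) or (b == b2 and (a, a2) in A)}

two   = {(0, 0), (0, 1), (1, 1)}                                  # type 2
three = {(x, y) for x in range(3) for y in range(3) if x <= y}    # type 3
assert len(field(product(two, three))) == 6   # 2.3 has a 6-element field
```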
THEOREM VI. 1. 1: If A, B are reflexive relations (linear orderings, quasi-well-orderings, well-orderings) then A. B is a reflexive relation (linear ordering, quasi-well-ordering, well-ordering). PROOF. All except the case of quasi-well-orderings follow at once from the classical definition of multiplication of relations (cf. [14J, p.229). Suppose A and Bare quasi-well-orderings and that A. B is not. Then there is an infinite recursive descending chain in A. B. Since every element ofthe field of A. B is of the formj(a, b), this chain must be ofthe form {jean> bnn:,= 0 where an e C' A and b; s COB. Let IX = {an} and f3 = {b n } , then there are four cases to consider: and f3 are both finite, infinite and f3 is finite, (iii) IX is finite and f3 is infinite, (iv) IX and f3 are both infinite. (i) (i i)
IX
IX is
(i) is impossible since then {jean> bn)} would only contain a finite number of elements. (ii) Since f3 is finite, there is at least one number be f3 for which {jean, b): an e IX} is infinite. Let the distinct an in this set be a(n;) (i = 0,1, ... ) where i < j ~ n, < nj • Then {j(a(n), b)}?= ° is an infinite recursive descending chain in A. B since
a(no) ain,
+
= a o,
1) = a(Jls{r < s
--+
a, # as})'
It follows at once that {a(ni)}~= 0 is an infinite recursive descending
chain in A which is a contradiction. (iii) This case is dealt with a manner very similar to (ii). We omit the details.
(iv) Let {b(n)};x;"o be the set of distinct b., where i <} ~ n, < nj' Then every ben) occurs in {b n };:'= 0 at most finitely many times for the following two reasons: 1. If b, = ben) for some fixed i and all} greater than some }o, then there are only finitely many distinct bn> namely, those occurring in b o, ... , b(jo)· 2. If i <} and
This is impossible since B is a quasi-well-ordering. This completes the proof of the theorem. THEOREM VI. 1.2: If Al ~ A2 and B1 ~ B2 , then Ai' B1 ~ A2 · B2 • PROOF. Suppose p : A 1 ~ A2 and q : B1 ~ B2 , then r: Ai' Bi ~ A2 . B2 where rex) = j(pk(x), ql(x». THEOREM VI. 1.3: (A 1.A2).A3
~
A1.(A2.A 3 ) .
PROOF. <x, y) s (Ai' A2 ) . A3 ~ X = j(j(a 1, a 2), a3) & y =
j(j(a~,
a;), a;)
& ai' a; e C' Ai (i = 1, 2, 3)
.&:
x = j(a l,j(a2, a 3» & y = j(a~,j(a;, a;» & a.; e C' Ai (i = 1,2, 3)
a;
.&:
I
(2)
I
(3)
Since j is one-one, conditions (2) and (3) are equivalent and the proof is completed by using the recursive isotonism x
-+
DEFINITION VI. 1. 4: A. B
j(kk(x), j(kl(x), lex))).
=
CRT(A. B) where A s A and Be B.
Theorem VI. 1. 1 guarantees the uniqueness of this definition. (We often write "AB" for "A .B".) Corollary VI.I.5. Multiplication is associative, i.e. (A.B).C A.(B.C).
=
By virtue of this corollary we may omit brackets in a product of several C.O.T.s. THEOREM VI. 1.6: A. B = 0 +-+ A = 0 v B = O. PROOF. j(a, /3)
=0
+-+ a
=0
v f3 =
0.
THEOREM VI. 1. 7: A(B + C) = AB + A C. PROOF. It is sufficient to establish that separability conditions are satisfied, since the proof that the order type is the same on both sides of the equation is proved exactly as in the classical case. Let A e A, B e Band C e C, then x s C'A. B +-+ k(x) s C'A & lex) 8 C'B
and
x e C'A.C
+-+ k(x)
e C'A & lex) e c-c.
Hence by theorem II .1.4.(i) A.B)( A.C
+-+ A =
0
v B)( C.
THEOREM VI.I.8: (i) For all n, A.In = A.n, (ii) A. W
= A.
PROOF. (i) If n = 0, then A.In = 0 = A.n. If n = 1, then A.I = A (by definition § 11.4) and A.I1 = CRT {(j(x, O),j(y, 0) : x, y e C'A & 0
= 0 & (x, y) s A}
where CRT(A) = A. If n > 1, then A.In = A.{Il.n) = (A.I1).n = A.n by the first part of the proof and corollary VI. 1.5. Hence, for all n, A . In = A. n. (ii) A. W = A.{Il'OJ) = (A.I1).OJ = A.OJ.
VI. 2. By analogy to principal numbers for addition, we now introduce principal numbers for multiplication (v. [1], p. 66). DEFINITION VI. 2. 1: A co-ordinal A is said to be a principal number for multiplication if A i= 0, 1 and
°<
B < A
-+
BA
= A.
If A = 2, then A is called an improper principal number for multiplication, and if A i= 2, then A is called a proper principal number for multiplication. We write £(.) for the collection of all principal numbers for multiplication. THEOREM VI. 2.2: Every proper principal number for multiplication is a co-ordinal whose classical ordinal is a limit number. PROOF. Left to the reader (cf. theorem IV.4.9). As in the classical case, B < A -+ BA = A is a stronger condition than BC = A -+ B = A v C = A. But also, the former condition is stronger than B, C < A -+ BC < A for co-ordinals. For the generic counterexample V satisfies this last condition but is not a principal number for multiplication since 2. V i= Vas we shall show later (lemma VIII.2.4); in fact we show that if n.A = A for any n, then A = W. It follows at once that W is a principal number for multiplication. (Alternatively, that W is a principal number for multiplication follows immediately from theorem II.4.I.(viii).) We now establish analogues of classical laws for multiplication (of co-ordinals) and show that these all go through if the co-ordinals concerned are all predecessors of the same principal number for multiplication. THEOREM VI. 2.3: (i) If B i= 0, then A ~ AB, (it) If B > 1, then A < AB whenever A i= 0. PROOF. We prove only (it) leaving (i) to the reader. (it) B > 1 -+ (E!C) (B = I+C & C i= 0). Hence AB = A (I+C) = A+AC where AC i= 0 if A i= O. Thus A < AB. THEOREM VI.2.4: If A i= 0 and A, B, C are co-ordinals, then AB = AC -+ B = C.
CONSTRUCTIVE ORDER TYPES, I
235
PROOF. Let A e A, Be Band C e C and suppose p : AB ~ AC. Then AB ,.., AC and since AB and AC are well-orderings, it follows that p is an extension of the unique minimal isotonism, Pe, between AB and AC (theorem IV. 5.3). Now, classically, :F 0& = eJ -+ T = J. Therefore there is an isotonism qe (not necessarily partial recursive) such that qe : B ,.., C. Now the map Te :j(a, b) -+ j(a, qe(b» defined only on C'AB is an isotonism between AB and AC. Hence by theorem IV. 5 . 3 p is an extension of r.: Since A :F 0, there is an element, say a o, in C'A. Let p' be the map p with domain and range restricted to {j(ao, n) : n e J}, then p' is partial recursive. Further, if p'(j(ao, x) is defined then its value is j(ao, y) for some y. Now let q' be the map x -+ l(p'(j(ao, x»), then clearly q' is partial recursive and q' agrees with qe on C'B (again by theorem IV. 5.3). q' is one-one, since
e
= q'(y)
q'(x)
er
-+ l(p'(j(ao, x») -+ p'(j(ao,
x)
= l(p' (j(a o, y»)
= j(ao, c) & p'(j(ao, y» = j(ao, c)
(since pp' £ {j(ao, n) : n s J} by construction) -+ j(ao, x) = j(ao, y) -+
x
= y.
(since p is one-one)
Thus q' is partial recursive, agrees with qe on C'B and is one-one and order-preserving, i.e. q' : B ~ C, from which the theorem follows. LEMMA
VI. 2.5: If M is a principal number for multiplication, and
~C:F~~fflOC<M-B<M&C<M
Suppose BC < M, then B < M or C = I by theorem VI.2.3. (ii). In the former case BM = M and in the latter trivially, C < M. Now BC < M -+ BCM = M, and therefore, by theorem VI. 2.4, CM = M. Using theorem VI.2.3.(ii) it follows that C < M. Conversely, C < M -+ CM = M and B < M -+ BM = M. Hence (BC)M = B(CM) = BM = M and by theorem VI.2.3.(ii), BC < M. PROOF.
VI. 3
VI.3.1: (i) If A :F 0, then B < C C -+ AB s AC.
THEOREM
(ii) B
s
-+
AB < AC,
236
JOHN N. CROSSLEY
PROOF. (i) B < C -+ (E tD) (B + D = C & D "# 0). By theorem VI. I. 7, AC= A(B+D) = AB+AD. AD"#O by theorem VI. I. 6, hence AB < Ae. (ii) follows at once from (i). THEOREM VI. 3 .2: There exist co-ordinals A, B, C ("# 0) such that A < B but AC $ Be. PROOF. Let A = 1, B = Vand C = W, then AC = Wand BC = Vw. By theorem VI. 2 .3 . (i), V:s; Vw. Hence if W:s; VW, Wand V are comparable by theorem II. 5 .4 which contradicts corollary IV. 4.5. THEOREM VI. 3.3: If there is a principal number for multiplication such that B, C < M (or equivalently BC < M) then A < B -+ AC :s; Be. PROOF. If B or C = 0 there is nothing to prove. Similarly if A = O. Otherwise, by lemma VI. 2.5, A C < M and BC < M. Hence, by theorem II. 5.4, AC and BC are comparable. Now, classically, 4'> < lJI -+ 4'>r ..s lJIr, hence AC :s; BC. THEOREM VI. 3.4: If A, B, C are co-ordinals, then A C < BC
-+
A < B.
PROOF. If C = 0, then the assertion is trivial. If C "# 0, then by theorem VI. 2.3. (i), A :s; A C and B :s; Be. Hence by the transitivity of :s; and theorem II. 5 . 4, A and B are comparable. By the classical theorem 4'>r < lJIr -+ 4'> < lJI, we have I A I < I B I and hence A < B. THEOREM VI. 3.5: There exist co-ordinals A, B, C such thatA C :s; BC but A :$ B. PROOF. (As in the classical case.) Let A
=
2, B
=
1, C
=
THEOREM VI . 3 .6: If B, C are comparable, then AB < A C
W.
-+
B <
e.
PROOF. Immediate from theorem VI. 3 . 1. (i). THEOREM VI. 3.7: If there is a principal number for multiplication, M such that (AB <) AC < M, then AB < AC -+ B < e. PROOF. By lemma VI.2.5, AB < AC < M -+ B < M & C < M. Hence by theorem II. 5.4, Band C are comparable. The theorem now follows at once from theorem VI. 3.6. We leave the question of prime numbers and unique factorization of certain co-ordinals to a later paper [4].
CONSTRUCTIVE ORDER TYPES, I
237
VII. Exponentiation VII.t. In this section we define exponentiation but restrict our attention to co-ordinals, since although the collection of all C.O.T.s is closed under exponentiation, the collection of quords is not. This result is implicit in [12]. Since some properties of exponentiation depend on multiplicative properties (e.g. (ABf = ABC) it is to be expected that we should not be able to prove analogues of all the classical laws concerning monotonicity. However, if we consider predecessors of principal numbers for exponentiation we get results analogous to those in the preceding section for multiplication. Since we have defined C.O.T.s in terms of classes of sets of ordered pairs of natural numbers and since a classical method of defining exponentiation depends on consideration of finite (descending) sequences in a given ordering, we now define a (primitive) recursive function e which assigns a natural number to each finite sequence of elements in a representative of a C.O.T. which is indexed by a sequence in a representative of another C.O.T. DEFINITION
VII. 1. 1: A symbol of the form
b O •.• b n) ( a o '" an
where n > -1 and the a.; b, (i = 0, ... , n) are natural numbers is said to be a bracket symbol. If n = -1, the symbol is simply 0 which we call the empty bracket symbol and denote by O. We use upper case bold face letters (A, B, C, etc.) for bracket symbols. DEFINITION
VII. 1.2: e(O)
if n ?: 0 and A
= 0; =
(b o '" b
then e(A) =
n
ao '"
n
)
an
p{(ai,b i)
i= 0
where Pi denotes the i-th prime (Po
=
2).
+
1
JOHN N. CROSSLEY
238
THEOREM VII. 1.3: e is a one-one primitive recursive function from the set of all bracket symbols into J. Further, pe is recursive. PROOF. Left to the reader.
VII.2. We define exponentiation of C.O.T.s in this sub-section using the function e. DEFINITION VII. 2.1: If A is a linear ordering and a e C'A, then a is said to be the minimum element of A if b s C'A --+ eA. Clearly, if a linear ordering has a minimum element then it is unique. Notation. We write a = min(A) if a is the minimum element in the ordering A. If {bJ7 = 0 is a sequence of elements in C'B such that
then we write
o ::;;
i
< n
--+
B & b, + 1 i= bi'
--+ b2 --+ •.. --+
bn>s B.
DEFINITION VII. 2 .2: If A, B are linear orderings then E (A, B) is the set of all bracket symbols K such that K
=
bo ... bn) ( ao ... an where ("1m) (m < n --+ am e C'A & am i= min(A))
>
and
=
{<e(K), e(K'): K
0 and B i= 0, then AB = 0; otherwise
= (b o .. , bm ) ao ... am
&K,K'eE(A, B) .&:K
& K'
ao
=o.V.
K i= 0 & [em ::;; n & ("Ir) (r ::;; m --+ a, = a; &
v (3r) ("Is) {(s < r
--+ as
=
a~
& «»; b;) e B & br i= b;.v.b r
b:)
= (b?
an
b,
=
b;n
& b, = b~)
= b; &
Since E(A, B) with the ordering induced by definition VII. 2.3 is equivalent to the classical definition of A raised to the power B (cf. [14J, p. 306 et seq.) and A B is a linear ordering of a subset of J, A B is a relation in our sense.
CONSTRUCTIVE ORDER TYPES, I THEOREM VII.2.4: If Ai ~ Bi (i
=
239
1,2), then A~l ~ A~2.
PROOF. The only non-trivial case is where Ai (or equivalently, A 2 ) is non-empty. Suppose Ai #- 0 and p: Ai ~ A 2 and q: B1 ~ B2 • Let r be the map defined only on pe by
reO) = 0, r(n) = e
q(bo) ... q(b m) ) ( p(ao) ... p(a m)
if n 13 pe and n
= e ( bo ... bm ) . ao ... am
r is partial recursive, since pe is recursive, and is one-one and onto since p, q are one-one and onto and e is one-one. The order-preserving property
follows from the classical case. DEFINITIONVII.2.5:AB = CRT(AB) where AsA and BsB'.J.We sometimes write "A exp B" for "A H" and "A exp B" for "A B" . By theorem VII. 2.4, A B is uniquely defined. THEOREM VII.2.6: (i) If A, B are co-ordinals, then A B is a co-ordinal. (ii) If A, Bare e.O.T.s, then A B is a e.O.T. (iii) (Parikh [12]) IfB is a well-ordering and A is a quasi-well-ordering, then A B is a quasi-well-ordering. (iv) (Parikh [12]) There is a quasi-well-ordering A such that T A is not a quasi-well-ordering where T 13 2. (v) If A B is a quord, and A #- 0, 1 and B #- then A and Bare quords.
°
PROOF OF (v). Suppose A 13 A and B I: B but A is not a quasi-wellordering. Then there is an infinite recursive descending chain, {aJ7'= 0, in A. Since B #- there is an element b in C' B and hence
°
is an infinite recursive descending chain in A B which is impossible. Suppose then that B is not a quasi-well-ordering, then there is an infinite recursive descending chain {bJ7'= 0 in B. Now since A #- 0, 1 there is an element a #- min(A) in C' A. Therefore
JOHN N. CROSSLEY
240
fe(b i ) }
t
00
i~O
a
is an infinite recursive descending chain in A B. This too is impossible and (v) is established. VII. 3. THEOREM VII.3.l: A exp (B+C) = AB.A C • PROOF. Let A GA, B GBand C GC where B)( C. Then B+ C is well-defined and by theorem II. 1. 4. (i) there exist r.e. disjoint sets /3, y such that C'B £ /3 and C'C £ y. C'(A exp (B + C)) = {e(K): KG E (A, B+ C)} and
KG E(A, B+ C)
+-+
K =
(eo
er ) & ('Vi) (a, i= min(A))
ao
a;
& <eo -+ e l -+ ... -+ er ) GB+C. This last clause is equivalent to
<eo -+ el
v
-+ .. , -+
<eo -+ el -+
... -+
v (3 s) (s < r & C'(A B and Kl
<e
.
e
<eo -+
r)
B G C G
... -+
e
es )
G
C
& s + 1 -+ .,. -+ r ) G B). AC) = {j(e(K l ) , e(Kz)) : K l GE(A, B) & K z GE(A, C)}
G E(A,
B) +-+
es + 1 Kl = ( as + 1 and
er )
& <es + 1
-+
es + z
•.. •..
-+ ... -+
er )
er) & ('Vi) (ai i= min(A)) ar G
B
K z G E(A, C) +-+
Kz =
eo ... es) & ('Vi) (a l i= min(A)) ( a o ... as
CONSTRUCTIVE ORDER TYPES,
241
I
Recalling that if r = s, then K 1 = 0 and e(K 1 ) = 0, and if s = -1, K 2 = 0 and e(K2 ) = 0, it follows that the map p defined by p(x) = l(x)* k(x) is order-preserving between A B • A C and A exp (B + C). Now let
.
.
15 = }(e(K 1 ) , e(K2 ) ) : K 1 =
t
(b o ... b
m)
ao
&K = 2
am
(co
a~
Cn)
a~
& (Vi) (ai' a; =f. min(A) &
b,
B
f3 & c, B Y)}
and let q be the map p with domain restricted to 15. Then q is partial recursive since 15 is r.e. Further q is one-one. For suppose q(x) = q(y) = z, say. Then z = pj(e(K 1 ) , e(K 2 )) = e(K) for some bracket symbols K 1 , K 2 , K. But K
=
(eo
ao
e
r)
a;
where a, =f. min (A) and e, B f3 U y, and there is precisely one number s such that -1 ~ i ~ s -+ e, B f3 & s < i ~ r -+ e, B y by the definition of 15. Therefore K 1 and K 2 are uniquely determined by K and our assertion is proved. This we have proved that q is a one-one, partial recursive, order-preserving map between A B • A C and A exp (B + C), i.e. is a recursive isotonism. The theorem follows at once from this. Notation. A O = 1; A n+ 1 = An.A. Corollary VII. 3 .2. If A is a co-ordinal, then AI" = An.
PROOF. If A = 0, the assertion is trivial. If A =f. 0, the reader will easily verify that N
242
JOHN N. CROSSLEY
in all other cases for A, B, C = 0 or 1, both sides are 1. We therefore assume A, B, C i= 0, 1. Let A e A, B e B, C e C, then
Cn) eE(A B, C) qn
(ABf = {<e(D), e(E): D = (co qo &E =
(C~
C~') s E( A B, C) &
qo
qn'
(Vr) (qr = e(Qr) & q; = e(Q;) & Q" Q; e E(A, B)
0 . v. D i= 0 & [en s n' & (Vr) (r s n --+ c, = c; & qr = q;))
:&: D
=
v (3r) ("Is) {(s < r
&
--+ Cs
« c., c;) s C & c, i=
= c; & qs = q;)
c; . v .
c, = c; & e AB +-+ (r,
s
t; & ("Is) (s S
v (3u) ("Iv) {(v < u
t, --+
--+
br• = b;s & a., = a;s))
b.;
= b;v &
a.;
= a;v)
& [«bru' b;u> e B & »; i= b;u) v (b ru = b;u & <aru' a;u> s A)]}.
Now
ABC = {<e(D), e(E): D = (j(b o, co) .. , j(b m Cn)) s E(A, BC) ao an & E
= (j(b~: c~) ao
: &: D
j(b~,: C~,)) e E(A, BC) an'
= 0 . v. D i= 0 & [en s n'
(Vr) (r
s
n
--+ j(b"
&
c.) = j(b;, c;) & a, = a;))
v (3r) ("Is) {(s < r --+ j(b., cs) = j(b~, c~) & as = a~) & «j(b" cr),j(b;, c;» s BC &j(b" c.) i= j(b;, c;) . v. j(b" c.) = j(b;, c;) & e
An]}.
CONSTRUCTIVE ORDER TYPES, I
But
(j(b r , c.), j(b;, c;» s BC +-+ (c r , c;) & C & c, #- c; . V.
and
c, = c; & (b" b;) & B
Now let p be the partial recursive function defined only on
{e(X) :X= (idoo
s. =
in) & (Vi) (d s pe)}, i
d;
(where we recall that pe is recursive) by (p(O) = 0 and) Co
.•.
Cn
))
p ( e ( e(Qo) ... e(Qn) = e (j(b oo, co)
aoo
where
j(b omo, cO)j(b 1 •0 , c1 ) a omo
•••
a1 , o , "
j(b 1,m" c1 )
•••
a1,m,
... j(bno, cn) ... j(bnmn, Cn») ... ano . .. a nmn
Using the definitions of (ABf and ABC given above the reader will readily verify that p is order-preserving, one-one and onto, from which it follows that p: (AB)c ~ ABC. Taking C.R.T.s completes the proof. As in the classical case, we do not have in general, ACBc = (ABf.
VII. 4. We now introduce principal numbers for exponentiation and show that predecessors of principal numbers for exponentiation satisfy [the analogues of] the classical laws for exponentiation. DEFINITION VII .4.1: A co-ordinal A > 1, is said to be a principal number for exponentiation if
1~ B < A We write £ nentiation.
--+
BA
=
A.
(exp) for the collection of all principal numbers for expo-
244
JOHN N. CROSSLEY
THEOREM VII. 4.2: All principal numbers for exponentiation are infinite co-ordinals whose classical ordinals are limit numbers. PROOF. Left to the reader (cf. theorem IV. 4.9). The condition in definition VII. 4. 1 is stronger than the condition: 1 ~ B, C --+ Be < A. This will be shown later in a manner analogous to that referred to in § VI. 2 by proving that if 2A = A, then W divides A. THEOREM VII. 4 . 3: W is a principal number for exponentiation. PROOF. It suffices to prove that, if N I; In> then W ~ N W . Let N = {(x, y): 0 ~ x ~ y < n}, then clearly N I; In. If S I; of, then s is expressible in the form
where for all i, 0 defined by
~
a, < n. Let f be the (partial) recursive function
°)
f(s) = e r r - 1 ... ( a r a r - 1 '" a o where columns with bottom entry
°
have been omitted.
E·g·f(n 2 .3+n.0+2) Then, if u, v I; of and u a; and b, may be zero,
(2 0)
=e 3 2 .
= nrar+ ... +a o and v = n'br+ ... +b o' where
and
(1)
(We remark that the fact that a" a; _ l' . . . and b" b, _ l' . . . may be zero does not affect the ordering.) But the ordering ~ given by (1) is precisely the ordering in N W of the bracket symbols
(a,r
0 ) and ao
(r .. , 0 ) b; ... b o
where columns with bottom row zero have been omitted. Clearly, one-one. Hence f: W ~ N W and the theorem is proved.
f
is
245
CONSTRUCTIVE ORDER TYPES, I
Corollary VIl.4.4. 2w = W. THEOREM
VIIA.5: If A > 1, then A B = A C
-+
B =
c.
Let A G A, B G Band C G C, and suppose p: A B ~ A': Then A ..... AC and since A B and AC are well-orderings, it follows that p is an extension of the unique minimal isotonism, Pc, between AB and A C• Now, classically, e > 1 & e r = e.1 -+ r = ..1. Therefore there is an isotonism qe (not necessarily partial recursive) such that qe: B ..... C. Now the map PROOF. I) B
defined only on E( A, B) is an isotonism between A B and Ac. Hence by theorem IV. 5.3, p is an extension of r: Since A > 1, there is a non-minimum element, say a O, in C' A. Let p' be the map p with domain and range restricted to
then p' is partial recursive. Further, if p'
(e (:0))
is defined then its value is e
(~o)
for some y. Now let q' be the map
then clearly q' is partial recursive") and agrees with qe on theorem IV. 5.3). q' is one-one, since
c-s (again by
I) We are here using a similar extension procedure to that used in the proof of theorem VI. 2.4. 2) (x)o = exponent of (po =) 2 in the prime factorization of x.
246
JOHN N. CROSS
(e (:0)) = p (e (~O)) 2
2j ( u, q'(x» + d. 3X 1 . . . . . P:" &
--. p'
=
j
( u' , q'(y»
+ -: 3Y1 •
. ..
.
p~m
for some u, u', d, d', n, m, XI' .•. , X n' YI, ... , Ym where d, d' = 0 or 1. O But by the definition of p', any image of p' is of the form 2 j ( a , b) + I and hence d = d' = 1, n = m = 0, u = u' = a O and p'
and p'
(e (:0)) = 2
(e (~o)) =
j
( aO, q'(x»
2j (a
O
, q' ( Y»
+
1
+ 1.
Therefore
from which it follows, since p' and e are one-one, that X = y. Thus we have shown that q' is a recursive isotonism between Band C. This completes the proof. THEOREM
but A :F B.
VII.4. 6: There exist co-ordinals A, B, C such that AC = BC
PROOF (as in the classical case). Let A = 2, B = 3, C = W. Then by theorem VII.4.3, 2w = s". THEOREM PROOF. A
VII.4. 7: C > 1 & A < B --. C" < CB.
< B -. (ElD) (D :F 0 & A+D
= B). Hence by theorem
VII.3.1, C = C" + D = CA.. CD. Now CD:F 0 since C:F 0, hence (3E) (CD = 1+E). Hence CB = C\1+E) = CA+CA.E by theorem VI. 1.6 and C A ~ CB. But B
CA. = C B
-.
CA. E = 0 -. E = 0 -. CD = 1 -+ D = 0
which is a contradiction. This completes the proof.
CONSTRUCTIVE ORDER TYPES, I
247
VIIA.8: (i) If A, C> 1, then A < A C • (ii) If C > 0, then A s A C • LEMMA
PROOF. (i) Since C > 1, there is a D # 0 such that 1 +D = C. Therefore A C = A 1+ D = A.A D by theorem VII.3.!. Now IADI > 1, ,by classical arguments, hence there is an E # 0 such that AD = 1 + E. Hence A C = A(l+E) = A+AE where AE # 0, i.e. A < A C • (ii) follows at once. THEOREM
VII 04.9: There exist co-ordinals A, B, C such that A < B
but A C $ B C •
Let A = 2, B = V and C = W, then by theorem VIIA.3, Wand by lemma VIIA.8, V < V W = B C • Now if A C ::s; B C , then by theorem 11.5.4 and the transitivity of ::S;, Vand Ware comparable, which contradicts the construction of these co-ordinals. PROOF.
AC
=
Thus we see that the analogue of one of the classical laws for exponentiation breaks down in a very similar way to one of the multiplicative laws (theorem VI.3.2). We have, however, theorem VIlA. 11 which is analogous to theorem VI. 3. 3. VII.4. 10: If E is a principal number for exponentiation, then A, B < E -+ A B < E and conversely if A, B > 1. LEMMA
The assertion is trivial if A, B ::s; 1. Otherwise, if E is a principal number for exponentiation, then A < E -+ A E = E and similarly for B. Hence A IBE) = E. Now B < E and therefore there is a C # 0 such that B+C = E. Therefore E = A IBE) = A(B+C) = AB.A c . But A C > 1, since C # 0; hence A C = 1 +D for some D # O. It follows that E = A B(l+D) = AB+ABD where ABD # 0, i.e. A B < E. Conversely, suppose A, B > 1 and A B < E. Then by lemma VII.4. 8 .(i), A < E. Since E is a principal number for exponentiation, E = A E = (ABl = ABE. By theorem VII 04.5 it follows that BE = E and hence by theorem VI. 2.3. (ii) B < E. PROOF.
THEOREM VII.4. ll : If there is a principal number for exponentiation, E, such that B, C < E (or equivalently B C < E or B, C::s; l) then A < B-+ AC::S;~.
PROOF.
By the transitivity of ::s; and lemma VII. 4. 10, A C < E and
248
JOHN N. CROSSLEY
BC < E. Hence by theorem 11.504, A C and BC are comparable. Now, classically, F < Ll -+ t" ~ Lltl>, hence AC < BC -+ A < B. THEOREM
VII 04.12: If A, B, C are co-ordinals, A C < B C
-+
A < B.
If C = 0 then there is nothing to prove. Otherwise, by lemma VII.4.8, A s A C and B s B C and therefore, by theorem 11.5.4 and the transitivity of ~, A and B are comparable. Hence by the ciassical theorem cpr < tpr -+ cp < tp, we have I A I < I B I and hence A < B. PROOF.
THEOREM
VIlA. 13: There exist co-ordinals A, B, C such that I < A C ~ B C but A $ B.
(as in the classical case). Let A = 3, B = 2 and C = W, then by theorem VII 04.3 (proof), A C = BC = W. PROOF
THEOREM
VII A. 14:
If B, C are
comparable and A > I, then
A B < AC PROOF.
-+
B < C.
By theorem VII 04.7.
THEOREM VIlA. 15: If there is a principal number for exponentiation, E, such that A C < E, then
I < A B < AC
-+
B < C.
PROOF. 1 < A < A implies A, B, C are all ~ 1. By lemma VII 04.10, if A C < E, then A, C < E and BC < E -+ B, C < E. Hence by theorem 11.504 and the transitivity of ~, Band C are comparable. Hence by theorem VII .4.14, B < C. B
C
VIII. Natural well-orderings up to
w(J)w
VIII.t. We showed in § IV that the finite co-ordinals are unique but that for each infinite classical ordinal F there exist c mutually incomparable co-ordinals of classical ordinal F. We now go on to give criteria for collections of co-ordinals which contain precisely one representative for each member of a given collection of classical ordinals. Using these we can give simple criteria for recursive well-orderings to be natural well-orderings, in the sense that if two recursive well-orderings are of the same classical ordinal, then they are recursively isomorphic provided
CONSTRUCTIVE ORDER TYPES, I
249
they are of not too large an ordinal and they are both natural wellorderings. By theorem 1.4.4 it is sufficient to describe co-ordinals which contain such natural well-orderings. In this section and the next we work in a slightly more general context: we do not assume that all our wellorderings are recursive, though it will turn out that they are. In [4] we shall extend our results much further as announced in [21]. DEFINITION VIII. 1. 1: I) If d' is a collection of co-ordinals, then d' is said to be T -unique if
IA I = IB I
A, Bed' &
-+
A = B.
d' is said to be strictly Fvunique if d' is T-unique but not A-unique for any A > T. By theorem IV. 3.6 it follows that d' is strictly T-unique if d' is Tunique but not (T + I)-unique. Corollary VIII, 1.2.
f(? is
strictly co-unique.
PROOF. Immediate from corollaries IV. 2.2 and IV.4. 2. We now give two proofs of the following theorem. The first proof does not use multiplication except in the form A. w. 2) The first three lemmata are common to both proofs. THEOREM VIII. 1.3: The collection £"( +) of all principal numbers for addition is strictly wW-unique. LEMMA VIII, 1 .4:
If A is a quord, then B+A
=
A
+-+
B.w
s
A. 3 )
PRooF.4 ) Suppose B. w ::; A, then there is a co-ordinal C such that B,w+C = A. ThereforeB+A = B+(B.w+C) = (B + B.w)+C = B.w+ C (by theorem II.4.1.(vii» = A. Now suppose B + A = A. If B = 0, then the assertion is trivial. If A = 0, then B = 0, hence we may assume A t= 0 t= B. By hypothesis I)
This definition is adapted from [10],
2) Since we may define W by recursion, thus W = Lr», W"+ I = W" .co, 3) Bu» :S A may also be written (3C) (B, W C = A) which brings out the
+
similarity with theorem VIII. 2.2. 4) This Iheorem can also be proved for co-ordinals using a technique similar to thai in the proof of theorem VIII.2,2.
250
JOHN N. CROSSLEY
there exist quasi-well-orderings A, B and a recursive isotonism f such that f: B+ A ~ A where B)( A. Let (X = CA, proof only.
P = CB.
We introduce the following notation for this 00
Poo
=
Boo
=
(xo
= {x: (Vn)f-" (x)
Ao
=
u
"=0
j" + 1 (P),
A [Poo, s
(X)},
A [(Xo'
We shall prove: 1)
(xo
("\
2)
(xo
u
Poo = 0, Poo = (x,
3) Boo e s.»,
4) x e (xo -+ f(x) = x, 5) x e Poo -+ f(x) # x, 6) Boo)( Ao, 7) Boo+Ao = A.
1) If x e Poo, then x = j"(y) for some n > 0, some yep. Hence f-n(x) is defined and t (X; so x t (xo. 2) Since f maps P u (X onto (x, x s (X implies either ('
or (3n) [F"(x) s P].
I.e. x e (X -+ x s (xo v x e Poo. Conversely, x s (xo -+ X = fO(x) s (X and x s Poo -+ x = j"(y) for some yep, some n > 0, i.e. x e (x. 3) Since B)( A there is a partial recursive function p such that if x e p u (X then xs
(X +-+
p(x) = 0 & x s
p +-+
p(x) = 1
(by theorem II .1. 5). We now use p to calculate a function g such that x s Poo
-+
g(x) = j( r-"(x), n -1)
CONSTRUCTIVE ORDER TYPES, I
251
where n = 11,{r'(x) s P & (\'s) (s < r
Step A. Calculate j-I(X). If a value (say) P(XI)'
-+
j-S(x) e oe}.
XI
is obtained, calculate
Three cases arise: 1. No value is obtained for XI or XI is defined but no value is obtained for p(x I ) ; 2. XI is defined and p(x l ) = 0; 3. XI is defined and p(x I ) = 1. We proceed according to cases. Case 1. g(x) is undefined. Case 2. Repeat step A with
Xl
replacing x,
Case 3. g(x) = j(xl, n) where n is the number of times case 2 has arisen in the computation and X I is the value most recently obtained in performing step A.
g is clearly partial recursive. Suppose g(x) = g(y), then g(x) = j(xl,n) = g(y) for some Xl = j-"-I(X) = j-"-l(y). But is one-one, therefore X = Y and g is one-one. We now show g maps Pw onto p. 00. By the definition ofg, g(pw) 5;; p. 00. If j(x,n)ep.oo then f"+l(x)epw and g(f"+I(X» =j(x,n); hence p.oo 5;; Pw. Next we show that g is order-preserving between Bw and B. ca. It suffices to show that if (xo, Yo) s Bw and Xo = rex) and y = f"(y) where x, yep and 0 < r < m -+ rex) e oe and 0 < S < n -+j'(y) e oe, then I ~ m < nor 1 ~ m = n & (x, y) s B. If m > n, then since j is one-one and order-preserving, (jm - "(x), y) s B+ A. But yep and r - "(x) e oe which contradicts B) (A. Hence In ~ n. If m = n, then (x, y) s B+ A where x, yep. We conclude (x, y) e B. This completes the proof of 3).
r:
4) Since A is a quasi-well-ordering and A o 5;; A, A o is a quasi-wellordering. Now j maps oeo = C' Ao onto oe o since X e oeo -+ j-I(X) e oe o & j(x) e oeo which implies oe o 5;; j(oeo) 5;; oe o' But j is order-preserving, hence by theorem III. 1.6, j = 1 on oe o ' 5) x e Pw -+ x = f"(y) for some n > 0, some yep. Since j is one-one, x = j(x) impliesr"(x) = j-" + lex). Butj-"(x) e p andj"?' + I(X) e oe and p n oe = 0 since B )( A. Therefore j(x) ¥ x. 6) Since j is partial recursive, bj is r.e. If x e Pw, then by 6) j(x) ¥ x.
252
JOHN N. CROSSLEY
If XC Ci o, then by 5) f(x) = x. Hence Cio, {J(O are contained in the disjoint r.e. sets {x: x C fJf&f(x) ¥- x} and {x: x s fJf&f(x) = x}. Hence by theorem 1I.1.4.(i) B(O)( A o . 7) By 6), B(O + Ao is well-defined. By 2), C(B(O + Ao) = Ci. By definition B(O ~ A and A o ~ A. It therefore suffices to prove that {JwXCio ~ A and A ~ Bw + A o . If x e {Jw and y s Cio then(3n) (f-n(x) c {J) but ("In) (f-n(y) c «). Hence <J-n(x), rn(y) c {J x rx ~ B + A, for some n, and since f is orderpreserving, <x, y) e A. If (x, y) e A then either (i) x, y e {Jw or (ii) x e {Jw, y s Cio or (iii) x, y c Cio or (iv) x s Cio, Y e {Jw by 2). Hence in order to complete the proof of 7) we only need to show (iv) is impossible. If (iv) holds, then there is an n such that f-n(x) e Cio and f-n(y) c f3 which is impossible since f is order-preserving and (Ci x {J) n (B + A) = 0. We now complete the proof of the lemma. By 3) Bw e B .w. Let C = CRT(A o), then by 7), B.w+C = A and hence B.w ::;; A. LEMMA VIII. 1.5: A co-ordinal A is a principal number for addition if,
and only
if; B <
PROOF.
A
~
B. co ::;; A.
Immediate from definition IV. 4.8 and lemma VIII. 1. 4.
LEMMA VIII. 1.6: If A cYt'( +), then A = wn < A.
wn for some 11, or
i
for all n,
PROOF. If A = 1, the assertion is trivial. If A > 1, then by lemma VIII. 1.5, 1. t» = W::;; A. If A ¥- W, then W < A. Now suppose W n < A (where n > 0). Since A is a principal number for addition, by lemma VIII .1. 5 W n • w = W n + 1 ::;; A. Hence either A = W n for some n or for all 11, W n < A.
LEMMA VIII. 1.7: If P is a principal number for addition, then P. w is a
principal number for addition and there is no principal number Q such that P < Q < P.w.
The first part is a restatement of theorem IV. 4 .10. Suppose Q e.Yf'(+) and P < Q, then by lemma VIII. 1. 5, P. w ::;; Q; hence PROOF.
Q 1:: P.w.
LEMMA VIII. 1.8: W n is a principal number for addition for every n,
CONSTRUCTIVE ORDER TYPES, I
253
PROOF. If n = 0 or 1, then the assertion is trivial. Suppose n > 0 and W n is a principal number for addition, then by lemma VIIL1. 7, W n + 1 = W n • w is a principal number for addition. Hence the lemma is proved by induction. PROOF OF THEOREM VIn. 1 .3 (FIRST VERSION). By lemmata VIII. 1.6 and VIII .1. 8 a co-ordinal A of classical ordinal < W W is a principal number for addition if, and only if, it is of the form Wn • Hence £( +) is wW-unique. Now let V, V' be two incomparable upper bounds for {W n : n e Y} constructed as in corollary V. 2.3. Then 1V I = I V' I = co", Now A < V --. A < W n < V for some n, and similarly for V'. But A < W n--. A+W n = W n and therefore A+V = V and A+V' = V', i.e. V and V' are principal numbers for addition. Thus £( +) is strictly wW-unique. LEMMA VIII.l.9: (i) W m < W n if m < n, (ii) Ifn ~ 1,1+ W n = W n ,
(iii)
If m <
n, W m+ W n = W n.
PROOF. (i) If m < n, then n = m+(n-m). Hence by theorem VII.3.1, W n = Wm+(n-m) = WmW n- m = W m(I+E) [for some coordinal E] = W m+ WmE. Now I W m I < I W n I, hence W m < W n. (i i) By (i), if n ~ 1 then W:s; W n and hence by lemma VIII. 1 .4, 1+ W n = W n • (iii) W m+ W n = Wm(l + W n - m) = wmW n - m = W n if m < n. DEFINITION VIII .1.10: A co-ordinal C (an ordinal T) is said to be a polynomial in W (polynomial in co) if C (T) can be expressed in the form C = W n .a n+ ... +ao = p(W) (r = w n .a n+ ... +a o = p(w)) where the a, are natural numbers and an ¥= O. The degree of p(8p) is n and the rank of p (rk(p)) is the number of non-zero ai' We observe that I p(W)
1
= p(w).
LEMMA VIII. 1 . 11: If p( W) is a polynomial in W of degree < n, then p(W)+ wn = W n • PROOF by induction on the rank of p. If rk(p)
=
1, then p(W) = Wma m
JOHN N. CROSSLEY
254
for some m
~
0, some am #
p(W)+ W n = W n if op
< n,
o.
Applying lemma VIII .1. 9. (iii) am times,
Now assume the lemma holds for rk(p) = m -1 > o. Then peW) = = i!r{a r # O}. Then rk(q) = rk(p)-l. By am applications of lemma VIII. 1. 9. (iii), peW) + wn = q(W) + W n and by the induction hypothesis, q(W)+ W n = W n. q(W)+ Wm.a m where m
LEMMA VIII. 1. 12: If n > 0, then A < nomial in W of degree < n.
wn
if, and only if, A is a poly-
PROOF. By lemma VIII .1.11, peW) < wn if op < n. Now if A < W n, i A I = p(w) for some polynomial in ca. Hence by corollary IV. 2 .7, A
=
peW).
LEMMA VIII .1.13: WWand W V are principal numbers for addition. PROOF. Since n < V, there is a U such that V = n + U. Then W n+ W V = W n+ W n + U = W n(1+ W u) = WnW U since 1 < U and U W< (using lemma VIII. 1.4). Hence Wn+W V = WnW U = n W + U = W V , and W n < W V for every n. Similarly W n < W W for
w
every n. Now every ordinal < W is represented by a polynomial in wand hence by corollary IV. 2.7 and lemma VIII .1.12 we also have, conversely, A < W V --+ A < W n for some n, and similarly for W w . Therefore if A < WV W
A+ W
V
=
A+(W
n+
W
v)
=
(A+ W
n)+
W
V
=
W
n+
W
V
=
W
V
for large enough n (and similarly for W w). Thus W V and W W are principal numbers for addition. PROOF OF THEOREM VIII .1.3 (SECOND VERSION). By lemmata VIII .1. 6, VIII .1.11 and VIII .1.12, every co-ordinal of the form W n is a principal number for addition and there are no other co-ordinals which are principal numbers and have ordinal < co". Hence £( +) is wW-unique. By lemma VIII. 1.13, W W and W V are principal numbers for addition. But W W = W V --+ W = V by theorem VII.4. 5, which contradicts the definitions of W, V. Hence £( +) is strictly wW-unique. It follows at once from theorem VIII. 1.3 that the collection of predecessors of principal numbers of ordinal < W W contains precisely one
CONSTRUCTIVE ORDER TYPES,
I
255
co-ordinal for each ordinal < W W and is closed under addition by theorem IV.4.11. We close this section with an example of a principal number for addition whose classical ordinal is not a (classical) principal number for addition (v. § IV.4). Example VIII .1. 14. Let p, V be as given in § IV. 4. Let IJ( = C'W v and let U = {(x, y) : x, yea & x S y}. Then IJ( is r.e., clearly, but is
not recursive. For
IJ(
recursive implies
{x: (3y) (y = e(~)) &yelJ(}
= p
is recursive, which contradicts the choice of p. U is ofclassical order type t» and W V and U are strictly disjoint (§ 11.1) but clearly not (even r.e.) separable. Hence W V +- U is well-defined, but does not belong to CR T(W v ) + CR T( U), and is of ordinal W W + co. Let P = CR T
(Wv+-U).
Now if p = {v;}:"= 0 where i < j --+ Vi < vj and Vn = V [ {Vi: i < n}, then Vn s nand C'WVn is recursive. Now CRT(WVn) = W n and by theorem II .1. 6 it follows that W n < P for every n. However, W V {: P since W V + B = P implies that W V + U s P which is a contradiction. By theorem IV. 3.6 we similarly have W V + n {: P for all n. Since A < P --+ I A I < W W + w it follows that A < P --+ A < W n for some n. Therefore A+P = A+(Wn+Q) [for some Q since W n < P] = (A+ Wn)+Q = Wn+Q [by lemma VII1.l.8] = P. Hence Pis a principal number for addition. I P I is not a classical principal number for addition since W W < wW+w but wW+(ww+w) > co", VIII. 2. In this section we prove a multiplicative analogue of theorem VIII.l.3. LEMMA VIII.2.1: LI =f. 0 & T > LIT'
--+
r > I",
PROOF. Immediate from theorem 2, p. 292 in [14]. THEOREM VII1.2.2: If A is a co-ordinal, then BA = A ~ B W divides A, i.e. ~ (3C) (A = BWe). PROOF. B WC=A-+BA=B 1 + WC=B wC=A by theorem IVA.4.(i).
256
JOHN N. CROSSLEY
Conversely, suppose BA = A. We may assume that A > 1, since otherwise there is nothing to prove. By hypothesis there exist well-orderings A, B and a recursive isotonism f such that A e A, B e Band f: A
Let
IX
= C' A,
/3
~
BA.
= C' B. We also write
"a
"I a I" for "I CRT(A [{x: x
8 IX,
f(a) = j(b, at) where b 8
/3 and
fear) = j(b, a, + 1) for some b 8
at 8
IX
/3.
Since f is order-preserving,
la 1 = I B I . I a 1 1+ A for some A < I B
I·
Hence lal~IBI.lall and by lemma VIII.2.1, lal~lall. Similarly, 1a, I ~ I a, + 1 I· It follows that, since A is linear,
n(x) = /lr(xr = x; + 1) [= Pr{(lfY(x) = (If)' + l(X)}] is always defined if x 8 IX. If a 8 IX and n(a) = n, then
I an I = I B I· I an I + A where A < I B I·
But I B
I . I an I z I an I since B "# 0 and therefore A
= 0 and j'(c.) = j(min(B), an)'
257
CONSTRUCTIVE ORDER TYPES, I
We observe that, for any x, if n > n(x) then (lft(x) = (If)n(x)(x).
Let C = A[{x: Lf(x) = x} and let D = O(Bw ) where g is the partial recursive function, defined only on pe, which maps only bracket symbol images ofthe form e (no n l bno bn ,
...
•••
ns) where n i , bn , I> of and no > n l > n2 > ... > ns bn,
~0
onto
n l+ln l nl-I min(B) bn , min(B)
no no-lno-2 ( e bno min(B) min(B)
n, ns-I bn, min(B)
0 ) min(B) .
I.e. g(x) inserts the missing positive integers in the top row of e -I(X) and in the columns where an integer was missing inserts min(B) in the bottom row and takes the image under e of the resulting bracket symbol. It is clear that g is one-one, so D is well-defined. We shall now show that A ~ D. C from which it follows at once that A = B W C where C = CRT(C). Let
.((n(X)-1
hex) = J e kf(lf)"(X) -
...
1 ••.
i
0)
., .
kf(lfi(x) ... kf(x) , (if)
n(x») (x)
.
Clearly, h is partial recursive. Suppose hex) = hey), then kh(x) = kh(y) and 111(x) = Lh(y). Hence (If)"(X)(x)
Now, since e is one-one we have and hence, for 0 :c::; r < n(x),
n(x)
= (Lf)"(Y).
=
(I)
n(y)
kf(lf)r(x) = kf(lf)'(y).
(2)
Putting r = n(x)-I in (2) and using (I) we have f(lf)"(X) - I(X)
But f is one-one, hence (If)n(x) - I(X)
= f(lf)n(x) = (If)"(X) -
I(y).
I(y).
258
JOHN N. CROSSLEY
Now assume where s < n(x) = n(y). Then by (2) with r = s-l
(If)S(x)
(kf) (If)' - I(X)
and using (3)
= (If)S(y)
(3)
= (kf) (If)' -
I(y)
f(lf)S - I(X) = f(lf)' - I(y).
By the one-one property off, (If)s - I(X) = (If)s - I(y)
and by induction it follows that x = y. I.e. h is one-one. It is clear that h maps C' A onto C' D. C and it only remains to prove that h is order-preserving. Suppose a
a,
+-+
f(a j )
+-+a j + 1
or
a, + 1 = a; + 1 &
where b, + Hence
1
b, + 1 < B b; + 1
= kf(aJ
a
+-+
an
an = a~ an
=
& b; < B b~ or
a~ & an -
I
=
..... or +-+
an
=
an
an
a~ & ... &
al
a~ -
=
I
& b; -
a~ &
I
b~ -
I
or
b, < B b',
a~ or
= an &
b; < B b~ or (4)
..... or since f(aJ
= j(a
an = a~ & bn = b'; & ... & b 2 = b; & b l
+ I'
b, + I)'
259
CONSTRUCTIVE ORDER TYPES, I
Now h(a)
hea') +-+ an an
=
or
a~ & (3r) (1 :::;
s < r ...... b.
=
(5) b~ & b,
< B b~).
(4) and (5) are equivalent since, as we observed above, if n > n(a), then an = an(a) and b; = min(B). Finally, h maps C' A onto C' D. C for suppose
x
=j
(
e
II (
bn
•.. •••
0) ) bo
,c e C'D. C,
where c e C'C (and b, e C'B).
=
Let ao
c, a, + 1
= r:
(j(b r, ar»'
Then ao e C' A and if a, e C' A, then a, + 1 e C' A. In particular, an e C' A and an = h - '(x), We have therefore proved h: A ~ D. C and the theorem is established. LEMMA VIII.2.3: A co-ordinal A is a principal number for multiplication if, and only if,
o<
B < A +-+ B W divides A.
PROOF. Immediate from theorem VIII. 2.2 and definition VI. 2.1. 1 We observe that Wwo = W 1 = W, (Wwn)w = W wn. w = W wn+ by theorems VII. 3.3 and VII. 3.1. LEMMA
either A
VIII. 2.4: If A is a principal number for multiplication, then n n n ww for some n, or for all n, Ww divides A and Ww < A.
=
PROOF. By theorem VI. 2.2, if A is a principal number for multiplication then 2 < A. Hence 2A = A and by theorem VIII.2.2, 2w divides A. But by corollary VII .4.4, 2 w = W. Therefore, if I A I = w, A = W. Otherwise Wdivides A and W < A by theorem VI.2.3.(ii). wm Now suppose W < A for m < n (where II > 1). Then, by lemma m 1 VIII.2.3, if A is principal (Wwm)w = Ww + divides A. By theorem n wm 1 wn VI.2.3.(i), W + :::; A and hence, if I A 1= ww , A = W or, for all r, w w W • divides A and W • < A.
Corollary VIII.2 .5. If A, B are principal numbers for multiplication
260
JOHN N. CROSSLEY
and B < A then A = B < A.
r:
LEMMA
VIII. 2 . 6:
PROOF. p'(W) a
wn
for some n, or for all n, B
=
WOP'.a+q(W)
where
q
p(W)+p'(W)
is a polynomial in Wand =
p'(W).
VIII.2.7: If Dp < op', then WP(W). WP'(W)
=
W p'(W).
By theorem VII.3.3, WP(W). WP'(W) = lemma then follows from the previous one. PROOF.
LEMMA
VIII.2.8:
WP(W)+P'(W).
The
If ap < op', then WP(W)+ WP'(W)
PROOF.
divides A and
Ifp( W), p' ( W) are polynomials in Wand ap < ap', then p(W)+p'(W) = p'(W).
#- O. By lemma VIII. 1. 11, LEMMA
wn
=
Wp'(W).
By lemma VIIL2. 7,
WP(W)+WP'(W)
=
WP(W)+WP(W).WP'(W) =
WP(W) {l+W P'(W)}.
By lemma VIIA.8.(ii), W::::;: WP'(W), hence by lemma VIII. 1.4, = WP'(W). Therefore WP(W)+ WP'(W) = WP'(W).
1+ WP'(W)
LEMMA
VIII. 2.9: If a co-ordinal A is of the form A
=
WPI(W).at
+ ... + WPe(W).a e +
q(W)
(6)
where Pt, ... , Pe and q are polynomials in W such that Pt(W) > P2(W) > ... > peCW) at #- 0 and Pt(W) > W, then
A < Wwn and A+ Wwn = wwn if apt < n. Conversely, if A < wwn for some n, then A is expressible in the form (6) where apt < n. PROOF. We prove the two parts simultaneously. Suppose 0 < A < W then I A I has Cantor normal form (cf. e.g. [14], p. 320)
all. at + ... + roT•. a e + q(ro).
where T i > T 2 > ... > T c-
w:
(7)
CONSTRUCTIVE ORDER TYPES,
261 wO' For each i, T, is a polynomial in ro, since otherwise roT; ~ ro which contradicts A < Wwn. Now to every ordinal of the form (7) there corresponds naturally and in a bi-unique way a co-ordinal of the form (6) (i.e. under the mapping p(ro) -+ p(W)). In order to prove the lemma it therefore suffices by virtue of corollary IV. 2 .7 to prove that A + W W" = W w" where n > 0Pl = degree of the polynomial I', (in co), Suppose oq = m - 1, then A+ Wwn = WPl(W).al +
=
WPl(W).al +
I
+ WPe(W).ae+q(W)+ Wwn +q(W)+(Wm+ Wwn)
by lemma VIII. 2 . 8 = WPl(W).al + ...
= Now by
e
i
L ~
1
WPl(W).al + ...
+ WPe(W).a e+ W m+ Wwn + WPe(W).a e+ Wwn =
by lemma VIII. 2 .8 C, say.
a, applications oflemma VIII.2.8 we have C = Wwn.
LEMMA VIII. 2. 10: (i) (3n) (A < WW'') +4 A < Www.
+4
A < WWv,
(ii) (3n) (A < WW)
PROOF. (i) Let V= n+U, then I U 1= co, By lemma VII.4.8, W ~ W U , hence by lemma VIII. 1. 4, 1 + W U = W U • Now Wwn. WWV = W exp (W n+ W v ) = W exp (W n+ W n +u) = W exp (W n • {1 + W u } ) = W exp (W n . W u) = W exp (W n + U ) = WWV. Hence by lemma VIIA.S, Wwn < Wwv. wn Conversely, suppose A < Wwv, then I A 1< ro for some n. But by lemma VIII. 2.9 there is a co-ordinal A' of the form (6) such that I A' I = I A I and A' < Wwn. Hence by corollary IV.2.7, A = A' and A < WW". (ii) follows at once by substituting 'w' for 'V'. (In this case, U = W.) THEOREM VIII. 2. 11; The collection £(.) of all principal numbers for wO' multiplication is strictly ro -unique. PROOF. By lemmata VIII. 2.4 and VIII. 2.9 every principal number for wn wO' multiplication of classical ordinal < ro is of the form W and con-
262
JOHN N. CROSSLEY n
versely, alI the co-ordinals Ww are principal numbers for multiplication. Hence£{.) is co",W-unique. w V Ww and WW are principal numbers for multiplication, since by the w w v V n. n. proof of lemma VIII. 2.10, Ww Ww = Ww and Ww Ww = WW • ww wv wn Further, A < W or W implies A < W for some n; hence, since w n ww all the Ww are principal numbers for multiplication, A. Ww = W v WV and A. Ww = W • wv But WW,W = W implies, by theorem VII.4. 5 (twice), W = V which is a contradiction. Therefore£"(.) is strictly co"'w -unique, THEOREM VIII .2.12: £' (exp) c £' (.) c £' (+). PROOF. By theorem VII.4. 2 every principal number for exponentiation is infinite. Suppose Pe£(exp), then by lemma VII.4.10, A < P-+ AA < P and hence (AAy = P = A P• Hence if A > 1, then by theorem VII. 4.5, AP = P and hence P is a principal number for multiplication. Now suppose P E £(.), then W :==::; P by lemma VII. 2.4, hence by lemma VIII. 1.4, I+P = P. Therefore if 0 < A < P, P = AP = A(1+P) = A+AP = A+P. I.e. Pe£(+). W W E £(.) - £(exp) since for every F < co'" there is a co-ordinal C < W W but to" is not a (classical) principal number for exponentiation. W 2 E £( +) - £(.) by similar argument. Hence alI the inclusions are strict. THEOREM VIII. 2. 12 indicates how we might extend our classes of co-ordinals to get uniqueness up to higher ordinals. We shall present results obtained by this approach in [4] and [21]. Appendix At. In many theorems concerning (classical) ordinals use is made of the theorem
If a well-ordered set ex is similar to a subset of a well-ordered set then ex is similar to an initial segment of p.
p,
The proof of this theorem requires the axiom of choice. Accordingly, it is not surprising that its analogue fails for C.O.T.s and co-ordinals.
CONSTRUCTIVE ORDER TYPES, I
263
In fact, we have made use of this fact in giving counterexamples to analogues of classical laws like A < B --+ AC :s Be. DEFINITION AI.I: A::s 8 if there is a recursive isotonism from A onto a (linearly ordered) sub-relation of B, i.e. if A ~ A' S; 8. Clearly, if Al ~ A 2 , 8 1 ~ 8 2 and Al ::S 8 1 , then A 2 ::S 8 2 , DEFINITION A I .2: A ::S B if there exist A e A and B e B such that A ::S 8. We write A <. B if A ::S B and A ;f. B. THEOREM AI. 3: (i) A ::S A,
(ii) A ::S B & B S C --+ A ::S C, (iii) A < B --+ A -< B, (iv) there exist co-ordinals A, B such that (a) A -< B but A -cI: B, (b) A ::S B & B S A but A ;f. B.
PROOF (i) - (iii) Left to the reader.
(iva) Let A = V B = (ivb) Let A = V, B = contains an infinite Clearly, V [u ~
W, then clearly V ::S Wand V;f. W. W. Now by Post's lemma ([13], p. 291) p = C'V (naturally ordered) recursive proper subset a, W. Hence W -< V.
DEFINITION AI.4: A e.O.T. A is said to be quasi-finite if A and A* are quords. We write :F for the collection of all quasi-finite C.O.T.s. THEOREM AI.5::F is partially ordered by:::;. PROOF. Suppose A ::S Band B::s A where A, B e:F. If A or B = 0 then A = B = O. We may therefore assume A;f. 0 ;f. B. Suppose g : A ~ 8 1 s; Band h: 8 ~ Al s; A, then f: A ~ Al s; A where f = hg, and Ai ;f. 0. Since A, Al are linear orderings, for every x e C'A either <x,f(x» s A or
on:F.
264
JOHN N. CROSSLEY
References [I] H. Bachmann, Transfinite Zahlen (Berlin 1955). [2] P. Bernays and A. A. Fraenkel, Axiomatic Set Theory (Amsterdam 1958). [3] A. Church and S. C. Kleene, Formal Definition in the Theory of Ordinal Numbers. Fund. Math. 28 (1936) 11-21. [4] J. N. Crossley, Constructive Order Types, II. (To appear). [5] M. Davis, Computability and Unsolvability (New York 1958). [6] J. C. E. Dekker, The Constructivity of Maximal Dual Ideal in Certain Boolean Algebras. Pacific J. Math. 3 (1953) 73-101. [7] , An Expository Account of Isols, Summaries of talks (Summer Institute of Symbolic Logic, Cornell 1957) pp. 189-199. and J. Myhill, Recursive Equivalence Types, University of California [8] Publications in Mathematics, n.s, 3, no. 3, 67-214. [9] S. C. Kleene, Introduction to Metamathematics (Amsterdam 1952). [10] G. Kreisel, Non-uniqueness Results for Transfinite Progressions. Bull. Acad. Polon. Sci 8 (1960) 287-290. [II] J. McCarthy, The Inversion of Functions defined by Turing Machines, Automata Studies. Annals of Maths. Studies, no. 34 (Princeton 1956) 177-181. [12] R. J. Parikh, Some Generalizations of the Notion of Well-ordering (Abstract). Notices Amer, Math. Soc. 9 (1962) 412. [13] E. L. Post, Recursively Enumerable Sets of Positive Integers and their Decision Problems. Bull. Amer. Math. Soc. 50 (1944) 284-316. [14] W. Sierpinski, Cardinal and Ordinal Numbers (Warsaw 1958). [15] R. Smullyan, Theory of Formal Systems. Annals of Maths. Studies, no. 47 (Princeton 1961). [16] A. Tarski, Cardinal Algebras (New York 1949). [17] , Ordinal Algebras (Amsterdam 1956). [18] J. S. UIlian, Splinters of Recursive Functions, JSL 25 (1960) 33-38. [19] A. N. Whitehead and B. Russell, Principia Mathematica, Vol. II 2nd ed. (Cambridge 1927). [20] K. Schutte, Predicative Well-orderings, these Proceedings. p. 280. [21] J. N. Crossley and R. J. Parikh, On Isomorphisms of Recursive Well-orderings (Abstract). JSL 28 (1963) 308.
MULTIPLE SUCCESSOR ARITHMETICS R. L. GOODSTEIN Leicester University, UK
I am going to talk about some recent developments in logic-free formalisations of arithmetic. Primitive recursive arithmetic may be formalised as a simple equation calculus, with substitution and uniqueness rules, and primitive recursive and explicit definitions as the only axioms. For instance we may take as the inference rules A=B A=C
A=B j(A) = j(B)
B=C
j(O) U
=
j(x) = g(x) j(A)
= g(A)
g(O)
j(Sx) = H(x,J(x)) g(Sx) = H(x, g(x)) j(x) = g(x)
where A, B are recursive terms. Rule U is in effect a rule asserting the uniqueness of a function defined by the primitive recursion j(Sx)
=
H(x,J(x)).
In U,jmay contain parameters which may also appear in H. These rules may be sharpened in various ways. We may for instance eliminate H from U, replacing U by 4 special cases. We may also replace the infinity of recursive and explicit definitions by a single axiom of recursion and a single axiom of composition formulated with function variables. However it is not my purpose now to go into these developments. In 1959 V. Vuckovic introduced a very interesting generalisation of the
266
R. L. GOODSTEIN
equation calculus, with a multiplicity of successor functions. Thus in place of the numerals 0, SO, SSO, we have elements 0, S to, S zO, ... , SnO, StSzO, StS30, ... , StSZS30, which are formed by prefixing one of St, Sz, ... .S; to any element of the set, starting with zero. We shall assume that the successors Si are commutative, that is
for any i, j. There is another theory in which we dispense with commutativity, but again I shall not be speaking about this. In place of a pair of defining equations for recursive functions we have now a set of equations 1) 2)
F(x, 0)
= a(x)
F(x, SiY) = b;(x, Y, F(x, y))
i = 1,2, ... , n,
where the b, are subject to the restriction
to ensure that the values of F(x, SiSjY), F(x, SjSiY) obtained from equations (2) are the same. The rules of inference are the same as in the calculus with a single successor except that U now contains a line for each successor. Thus U becomes j(O) = g(O) j(SiX) g(SiX) j(x)
= H;(x,j(x))
=
H;(x, g(x))
i = 1,2, ... , n.
= g(x).
Vuckovic introduced n linear functions x a, Y where X
a, 0=
O=:;;i=:;;n-l
X
X (1iSjY = Si + i (x (1; y)
l=:;;j=:;;n
(where i +j is replaced by its excess over n if in fact i+ j exceeds n). The first of these, (1o, is called addition and denoted by +. Thus x+O = x,
x+Siy
= S;(x+y).
267
MUL TIPLE SUCCESSOR ARITHMETICS
All the familiar properties of + may now be proved exactly as in the one successor system. Thus to prove O+x = x, write L(x) = O+x, R(x) = x then L(O)
=
0+0
= 0,
L(Six)
=
SiLx, RO
yielding
=
Lx
= 0,
RStx
=
SiRx,
Rx.
Addition is commutative and it is associative with all linear operations i.e, (x+ y)er i
Z
=
x+ yeri
Z.
Cross multiplication is defined by xxO = 0 x
X
i
SiY = (x x y)erix
=
1,2, ... n,
and is commutative, and distributive over all linear operations. The predecessor functions PiX are PiO = 0,
PjSix
= x, = SiPjX,
j
=
i
j 1= i.
Finally, x...:... y is defined by x...:...O
=
x,
x...:...Siy
=
P;(x...:...y);
cross multiplication is not distributive over difference. To prove the key equation a+(b...:...a) = b+(a...:...b)
Vuckovic was obliged to apply the uniqueness rule to definition by double recursion in the form
Ft», 0) = a(x) F(x, S,Y) = b;(x, y, F(P,x, y)).
The same situation arose thirty years ago in my first account of the equation calculus, where an application of the uniqueness rule to a double recursion was made to prove the same equation a+(b-=-a)
=
b+(a-=-b)
268
R. L. GOODSTEIN
but I was subsequently able to prove this equation without introducing definition by double recursion. Now many of the results and techniques of the single successor system transferred quite readily to the Vuckovic system, but the rather complicated proof of the key equation did not yield to attempts to make it work in the multiple successor system. While working on this problem my student Mr. M. T. Partis noticed a remarkable similarity between Vuckovic's system and a system whose elements are ordered sets of natural numbers. Thus with a Vuckovic number x we associate natural numbers Xl' X2' .•. , x; in the following way:
and if
o=
(0, 0, ... , 0)
then SiX
=
(Xl' X2' .•. , Xi - l' SiXi' Xi + l' . . . ,
x n) ·
It follows that with S2" ,S2
S3" ,S3
----------------k k 2
3
we associate
We call xl> Xl> ... , x, the components of x. Accordingly with a Vuckovic-function F(x) we associate n functions h(xI' X2' .•. , x n) , i = 1, 2, ., ., n, the components of F. What can we say about these functions j, the components of F? It turns out that if F is recursive in Vuckovic's system, i.e. if F is defined from initial functions SiX, Zx = 0, Ix = X by substitution and recursion then likewise the component functionsh(xl' . . x n) are each primitive recursive. What is more surprising is that one can show that any ordered set of primitive recursive functions h(x l . .. x n) , i = 1,2, ... n, are the components of a function F recursive in Vuckovic's system. One does this by listing the generation by iteration and substitution of each f, from initial functions 0, x+l, x 2 , ••• x-'-y, x+y, thus:
MULTIPLE SUCCESSOR ARITHMETICS
269
°cPu
where in each column cPu is an initial function or formed from previous functions in the column by substitution and iteration; it is readily seen that we may suppose only one column changes at a time. One then shows that each step from one row to the next may be imitated in the Vuckovic system (the initial row is itself the components of the element 0 of the Vuckovic-system). The essential construction is that of a Vuckovicfunction whose ith component is respectively the ith component of the ith function of some set of Vuckovic-functions, To this end we define the component-function Ci(x) thus:") so that Then if we have where
j'l= i,
C;(x) = (0, 0, ... , Xi' 0, ... ,0). F;(x)
= ULf~, ... ,f~)
rC;F;(x)
= Ut,
n, ...,f:)
rg;(x) = gl (x) + gz(x) + g3(X)+ ...
+ gk(X)+ ... + gn(x).
Not only is there this (1, 1) correspondence between Vuckovic-functions and sets of primitive recursive functions but we can imitate in the Vuckovic-system any proof carried out in terms of components and we obtain a Vuckovic-proof. This is because a Vuckovic-proof is a series of steps each of which is effectively a Vuckovic-definition, or a substitution or an appeal to uniqueness, and a proof in components uses the same operations. In particular from the proof of x+(y...:...x) = y+(x...:...y) 1) Variables and numerals in the Vuckovic system are here printed in bold type.
270
R. L. GOODSTEIN
in recursive arithmetic it now follows that x+(y-'-x) = y+(X-'-y) is provable in Vuckovic's system without appeal to double recursion. In primitive recursive arithmetic x+(y-'-x) and x..:.(x-'-y) are respectively the greater and smaller of x, y. In the Vuckovic system two elements are not necessarily comparable but Partis has shown that x+(y-'-x), x-'-(x-'-y) are respectively the least upper and greatest lower bounds of x, y so that the Vuckovic system is a lattice in which x + (y -'-x) is the union, and x -'-(x -'-y) the intersection of x, y. I shall conclude by showing a little of the purely arithmetical resources of the system. Let x· Y = (Xl' Yl' , X n ' Yn)' so that X· = 0, X· SiY = Y = x· y+Cix, and let x = (xi', , x~n) so that
°
x O = 1 = (1, 1, ... ,1), xS,y = xY. (Uix) where UiX j i' i.
We define
a ::; b +--> a -'- b
=
0,
so that a ::; b +--> a, ::; hi, for all i. We define further a Ib
+-->
(3c) (c ::; b & b = ac) (3c;) [(c i
::;
+-->
(Vi) (l ::; i ::; n
--+
b;) & (b i = aic;)]).
If f I a and fib, f is called a common factor of a, b. If h I a and hi b and if k I a, k I b ~ k] h then h is the h.c.f. of a, b, it is easily shown that hi is the h.c.f. of a.; hi' If 1 is the h.c.f. of a, b then a, b are said to be relatively prime. It follows that if a, b are relatively prime then all ai' b, are relatively prime and conversely. If ¢(x) is Euler's function which counts the number of numbers (including 1) which are less than and prime to x, then ¢(x) is primitive recursive and so there is a ([> such that
MUL TIPLE SUCCESSOR ARITHMETICS
271
Since ¢(b j ) = 1 +mjb i when a.; bi are relatively prime therefore a
= l+mb
a
= 1 (mod b)
i.e,
when a, b are relatively prime.
References R. L. Goodstein, Recursive Number Theory (North-Holland Publishing Co. Amsterdam 1957). M. T. Partis, Commutative partially ordered recursive arithmetics. Mathematica Scandinavica 13 (1963) 199-216. V. Vuckovic, Partially ordered recursive arithmetics. Mathematica Scandinavica 7 (1959) 306-320.
UNSOLV ABLE PROBLEMS IN THE THEORY OF COMPUTABLE NUMBERS B. R. MAYOR University of Oslo, Blindern, Norway
With each total, general, recursive singulary functionf on the natural numbers (hereafter "recursive function") one can associate the real number '/' = rx = ± a o . ala2a3 . . . that satisfies:
o: :2: 0 if f(O) is even, rx ao = [f(0)/2] ,
s
0 if f(O) is odd,
(Ll) for all i :2: 1, a, = the remainder on dividing f(i) by 10.
A real number rx is said to be computable if there is a recursive function associated with a, Let R denote the class of real numbers, and C the class of computable numbers. 1: If function f: Rk - R and open interval Q c R k are such that f restricted to Q is continuous, monotone in each argument, and can be effectively calculated for any k-tuple offinite decimals in Q, then the value of f is a computable number for every k-tuple ofcomputable numbers in Q as argument. THEOREM
PROOF: Let (Xl' X2' ... , Xk) be any k-tuple of computable numbers in Q. As all finite decimals are computable - since any function whose value is 0 for all but a finite number of arguments is recursive - it suffices to consider the case whenj'(x., Xl> ••• , x k ) = rx is an infinite decimal. Let fl' f2' .. ·,fk be the recursive functions by which Xl' X 2, ... , X k are presented. For any positive integer m, letflm,f2m, ... ,fkm be the recursive functions given by:
fimO) = f;(j)
o
if)::;; m if) > m
THE THEORY OF COMPUTABLE NUMBERS
273
and d lm, d 2m, ... , d km be the finite decimals associated with 11m,12m, ... , Ikm' As Q is open, there is a positive integer I such that m ~ I implies that k Q includes the 2 arguments
Corollary Ia. C is closed under the rational operations, so it is a field. Corollary lb. C is closed under the elementary functions; such functions as x - t exp x, x - t log x, X - t sin x, x - t n", x -+ "x, have computable values for computable arguments. In particular e = exp 1 is computable [10, p. 256]. THEOREM 2: III : R - t R is a continuous function whose sign can be computed effectively at any finite decimal that is not a root, then all simple roots 011 are computable. PROOF. Again one need only consider the case of a simple root is not a finite decimal. As IX is isolated, there is a finite decimal
such that
IX is
IX that
the only root oif in the closed interval
By Weierstrass' theorem tion: g(i)
IX
= 2d o
is presented by the following recursive func-
2d o+ 1
d,
if (j > 0 and i = 0 if (j < 0 and i :F 0 if 1 :::;; i :::;; k
the least z such that fid-s-z : lO-i) and I(d+(z+ 1)' lO- i ) have opposite signs, otherwise.
274
B. H. MAYOH
Corollary 2a. All the roots of a polynomial with computable coefficients are computable [2, 8]. In particular all algebraic numbers are computable [10, p. 254]. Corollary 2b. x -+ sin x satisfies the requirements, so [10, p. 256].
1t
is computable
Corollary 2c. C is closed under the inverse circular and hyperbolic functions. However C can be closed under a functionfwithout f being effective.
THEOREM 3: There is no effective procedure which, given two recursive functions f1 and f2, will stop and present a recursive function f3 such that
PROOF. One can effectively find the following recursive function for any Turing machine M: g 1 (i)
=
°
if i < 2 or M stops and presents 1 within i steps when started on its own Godel number 9 otherwise,
gii) = 2 if i < 2 or M stops and presents
°within
i steps when started
° + If Let be the digit in the first decimal place of presents ° when run on its own Godel number, then on its own Godel number otherwise.
M stops and dis 1 (0). The usual diagonalisation argument shows that one cannot find d effectively. d
'gt'
'g2'.
(1)
Similar proofs show that subtraction, division, multiplication, extraction of roots, exponentiation and the taking of logarithms are also non-effective. Moreover for any computable number a one can show that x -+ x + a, x = x]« when a #- 0, and x -+ x-a are effective if and only if a is a finite decimal. Thus doubling but not trebling is effective. If we had chosen to work with ternary instead of decimal expansions, the opposite would have been true. Such dependence on the number base can occur as conversion from base p to base q is only effective when q divides a power of p (cf. [5] theorems 3 and 5). It is curious that the existence of a procedure for any of the above non-effective operations does not seem to imply the solvability of the Halting problem [10], though each is reducible
to the Halting problem. However the Halting problem is equivalent to the problem of finding the computable number that is the limit of a given recursively convergent recursive sequence of rationals.

There is a profound analogy between the way in which a computable number is associated with each recursive function and that in which a semigroup is presented by each Thue system. Just as finite semigroups can be given by a multiplication table whilst infinite semigroups require a set of defining relations, so finite decimals can be written down directly whilst infinite decimals must be given by a rule. Just as not all semigroups can be presented by Thue systems, so not all real numbers are computable. Most interesting properties of semigroups are "Markov properties" [3, 7] in the following sense:
i) There is a Thue system Γ_1 that presents a semigroup enjoying P.
ii) There is a Thue system that presents an inhibiting semigroup S*, i.e. if S* can be embedded in the finitely presented semigroup S, then S does not enjoy P.
iii) P is preserved under isomorphisms.
The natural analogue of this definition is: A property P of real numbers is said to be pseudo-Markov if
i) There is a recursive function f_1 associated with a number enjoying P.
ii) There is a recursive function f_2 associated with an inhibiting number α*, i.e. if a recursive function g is associated with a number α that differs from α* at only finitely many places, then α does not enjoy P.
It is known that (1) for every Markov property P of semigroups, the problem of determining whether or not a given Thue system presents a semigroup enjoying P is unsolvable [3, 7], and that (2) for each recursively enumerable degree of unsolvability D, there is a class of Thue systems A such that the above problem, restricted to A, has D as its degree of unsolvability [1]. N. Shapiro has proved the analogue of (1) [9, theorem 2.2]; the following theorem is the analogue of (2).

THEOREM 4: For any pseudo-Markov property P of real numbers and any recursively enumerable degree of unsolvability D, there is a class A of recursive functions such that D is the degree of unsolvability of the problem
of determining whether or not the number associated with a given recursive function in A enjoys P.

PROOF. Let S_D be an infinite recursively enumerable set of positive integers whose decision problem has D as its degree of unsolvability. For each positive integer j one can effectively find the recursive function:

g_j(i) = f_1(i)          if i = 0,
g_j(i) = f_2(i) + 10·j   if i ≠ 0 and j is enumerated amongst the first i elements of S_D,
g_j(i) = f_1(i) + 10·j   otherwise.

The associated computable number enjoys P if and only if j ∉ S_D, so {g_1, g_2, ...} will serve for A.

THEOREM 5: If a property P of real numbers is enjoyed by at least one computable number, if any number agreeing with a number enjoying P at all but a finite number of places also enjoys P, and if one can recursively enumerate a set of recursive functions such that a computable number enjoys P if and only if it is associated with at least one recursive function in the set, then P is a pseudo-Markov property and the problem of determining whether or not the number associated with a given recursive function enjoys P is of degree 0'', i.e. has the same degree of unsolvability as the problem of determining whether or not a given partial recursive function is total.
PROOF. Consider the recursive function:

f(i) = 6  if 5 is the remainder of the value of the (i+1)-st listed function for argument i on division by 10,
f(i) = 5  otherwise.

Its associated number does not enjoy P, and no computable number that agrees with it on all but a finite number of places can enjoy P. Thus P is a pseudo-Markov property. For any partial function p, one can effectively find the recursive functions:
g(0) = 1,
g(i+1) = g(i)      if p(g(i) − 1) cannot be computed within i + 1 steps,
g(i+1) = g(i) + 1  otherwise;

f_p(0) = 0,
f_p(i+1) = h(i)(i+1)  if g(i+1) = g(i),
f_p(i+1) = 6          if g(i+1) ≠ g(i) and the remainder on dividing h(i)(i+1) by 10 is 5,
f_p(i+1) = 5          otherwise.
The number associated with f_p enjoys P if and only if p is not total. Moreover for each recursive function r one can effectively find the partial recursive function:

q(i) = undefined  if r and the i-th listed function always agree modulo 10 and they agree exactly for argument 0,
q(i) = 0          otherwise.

q is total if and only if the number associated with r does not enjoy P.
For properties of the form "Being algebraic of degree in S", where S is a recursive set of positive integers, and in particular for "Being algebraic" and "Being rational", this has been proved in another way [9, theorem 11.10]. The theorem also applies to the apparently simpler property "Being expressible as a finite decimal". It is known that (3) for every recursively enumerable degree of unsolvability D there is a class A of Thue systems such that the isomorphism problem restricted to A has D as its degree of unsolvability [1], and that (4) the isomorphism problem for semigroups is unsolvable [4, 6]. The next two theorems give the analogous results in the theory of computable numbers.

THEOREM 6: For every recursively enumerable degree of unsolvability D, there is a class of computable numbers A such that the problem of determining whether or not two recursive functions in A have the same associated number has D as its degree of unsolvability.

PROOF. Let S_D be as in the proof of theorem 4. For any positive integer j one can effectively find the recursive function:
f_j(i) = 0         if i = 0,
f_j(i) = 1 + 10·j  if i ≠ 0 and j is enumerated among the first i elements of S_D,
f_j(i) = 10·j      otherwise.
Let f_0 be any recursive function whose value is always 0. Then {f_0, f_1, f_2, ...} will serve as A, since "j ∉ S_D" reduces to "Do f_0 and f_j have the same associated number?", and one can effectively find the natural number g* such that f_{g*} = g for any recursive function g in A, so "Do the functions g and h in A have the same associated number?" reduces to "(g* ∉ S_D and h* ∉ S_D) or g* = h*".
This proof may not be available for other conventions that associate real numbers with recursive functions; it is then necessary to distinguish between identical recursive functions that are defined differently in order to reduce the decision problems in theorems 4 and 6 to that of S_D.

THEOREM 7: The problem of determining whether or not an arbitrary pair of recursive functions have the same associated numbers has the same degree of unsolvability as the Halting problem, viz. 0'.

PROOF. Let f_0 be as in the last proof. For any Turing machine M, one can effectively find the recursive function
f_M(i) = 0  if M runs for at least i steps when started on blank tape,
f_M(i) = 1  otherwise.
M stops if and only if f_0 and f_M have different associated numbers. Furthermore for any recursive functions f and g one can effectively find a machine M* that compares f(i) with g(i) for i = 0, 1, 2, ... until 'f' and 'g' diverge. M* stops if and only if f and g have different associated numbers. Similarly the following five decision problems are also of this degree: To determine of any pair of recursive functions whether or not the number associated with the first is greater than (not less than, different from, not greater than, less than) the number associated with the second. However one can show - by modifying M* in a suitable fashion - that all six decision problems are solvable when restricted to pairs of functions that have different associated numbers.
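The comparison machine M* is easy to picture. The following Python stand-in (using the same hypothetical digit convention as before) runs two digit functions side by side and halts exactly at the first place where the digit sequences differ; on functions whose digit sequences agree everywhere it runs forever, which is the semi-decision behaviour the proof exploits. The caveat that distinct digit sequences can denote the same real is ignored here, as the paper itself takes up such convention issues below.

```python
from itertools import count

def m_star(f, g):
    """Compare the digit functions f and g; halt at the first place where their
    digit sequences differ (runs forever if they agree everywhere)."""
    for i in count():
        if f(i) % 10 != g(i) % 10:        # digits taken mod 10 (assumed convention)
            return i, f(i) % 10, g(i) % 10

if __name__ == "__main__":
    f = lambda i: 3                       # presents 0.333...
    h = lambda i: 3 if i < 7 else 4       # differs from f at place 7
    print(m_star(f, h))                   # -> (7, 3, 4); m_star(f, f) would not return
```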
N. Shapiro has proved: If a property P of real numbers is enjoyed by only finitely many computable numbers (and by one at least), then it is a pseudo-Markov property and the problem of determining whether or not the number associated with a given recursive function enjoys P has the same degree of unsolvability as the Halting problem [9, theorem 2.16]. In particular this applies for "= a" where a is any computable number. The decision problems for the properties "> a", "≥ a", "≠ a", "≤ a", "< a" are also of this degree. However judicious use of M* enables us to solve these six decision problems when restricted to recursive functions with associated numbers different from a.

If it seems unnatural to associate a real number with every recursive function, as we have done - e.g. one might prefer to replace clause (1) in the definition in the first paragraph by "a_i = f(i)", or to disregard those decimal expansions that end in a string of nines - then one is free to change the definition; our results will still hold if "recursive function" is replaced throughout by "recursive function that is associated with some number".
References
[1] W. W. Boone, Partial Results regarding Word Problems and Recursively Enumerable Degrees of Unsolvability. Bull. Amer. Math. Soc. 68 (1962) 616-623.
[2] A. Grzegorczyk, Computable Functionals. Fund. Math. 44 (1957) 61-71.
[3] A. A. Markov, Impossibility of Algorithms for Recognising Some Properties of Associative Systems (in Russian). Dokl. Akad. Nauk SSSR 77 (1951) 953-956.
[4] A. A. Markov, Impossibility of Certain Algorithms in the Theory of Associative Systems (in Russian). Dokl. Akad. Nauk SSSR 77 (1951) 19-20.
[5] A. Mostowski, On Computable Sequences. Fund. Math. 44 (1957) 37-51.
[6] A. Mostowski, Review of [4]. J. Symb. Logic 16 (1951) 215.
[7] A. Mostowski, Review of [3]. J. Symb. Logic 17 (1952) 151.
[8] H. G. Rice, Recursive Real Numbers. Proc. Amer. Math. Soc. 5 (1954) 784-795.
[9] N. Shapiro, Degrees of Computability. Ph.D. thesis, Princeton University (1955).
[10] A. M. Turing, On Computable Numbers. Proc. London Math. Soc. (2) 42 (1936) 230-265.
PREDICATIVE WELL-ORDERINGS

KURT SCHÜTTE
Kiel University, Germany
A hierarchy of critical numbers of the second number class can be defined in the following way:
(1) The 1-critical numbers are the ε-numbers.
(2) An ordinal α is ν-critical (ν > 1) if f_μ(α) = α for every μ < ν, where f_μ is the ordering function of the μ-critical numbers.
These ordering functions f_μ are normal functions. For μ < ν the set of ν-critical numbers is a proper subset of the set of μ-critical numbers. If α is μ-critical then μ ≤ α. We say that an ordinal κ is strongly critical if it is κ-critical, i.e. if f_κ(0) = κ. It turns out that the smallest strongly critical number κ_0 is a least upper bound for predicative reasoning. We define in section 1 an ordering relation ≺ of equivalence classes of natural numbers representing a sufficiently large segment of the second number class in a constructive way. With respect to this representation of ordinals, we prove in section 3 transfinite induction up to ordinals smaller than κ_0 by using a formal system of ramified type theory which is defined in section 2. In this way well-ordering up to any ordinal α < κ_0 is provable by predicative methods (if the ordinals are defined in a sufficiently constructive way as in section 1). In another paper (Schütte [7]) there is a proof that well-ordering up to ordinals ≥ κ_0 cannot be proved by predicative methods. The same result was found independently by S. Feferman [1]-[3]. Our well-ordering ≺ is based on normal functions according to a construction of Veblen [8]. The well-ordering ≺ of this paper corresponds to a proper segment of a well-ordering in Schütte [5]. It is very closely related to the
well-ordering ≺ of section 11 in Schütte [6]. Both well-orderings represent the same segment of ordinals.

1. A constructive system of ordinals

We use small Latin letters as syntactical variables for natural numbers (including 0) and define a binary relation ≼ on natural numbers. We denote by <, = and ≤ the usual relations on natural numbers. p_0 denotes the prime number 2. p_n for n ≠ 0 denotes the n-th odd prime number.
If a ≠ 0 then a_i denotes the exponent of p_i in the prime factorization of a.

Inductive definition of the relation ≼ (a ⋠ b denotes the negation of a ≼ b): a ≼ b if and only if at least one of the following four conditions is fulfilled:
[1] a = 0 and b = 0.
[2] b ≠ 0 and a ≼ b_i for at least one i.
[3] a ≠ 0, b ≠ 0, and a_i ≼ b_i for all i.
[4] a ≠ 0, b ≠ 0, and there are numbers m ≤ n such that
    a_i = 0 for all i < m (if m ≠ 0),
    a_m ≼ b,
    b ⋠ a_j for all j with m < j < n (if m+1 < n),
    b_n ⋠ a_n,
    a_k ≼ b_n for all k > n.
Obviously, for any natural numbers a, b it is decidable whether a ≼ b or a ⋠ b. It is easy to prove by mathematical induction (according to the natural ordering of natural numbers) that ≼ is a reflexive total ordering relation, i.e.

a ≼ b or b ≼ a (totality),
If a ≼ b and b ≼ c then a ≼ c (transitivity).

We define:
a ≡ b if and only if a ≼ b and b ≼ a,
a ≺ b if and only if b ⋠ a.
a ⊀ b denotes the negation of a ≺ b.

Since ≼ is a reflexive total ordering relation it follows that ≡ is an equivalence relation, and ≺ is an irreflexive total ordering relation with respect to the relation ≡, i.e.

a ≡ a (reflexivity of ≡)
If a ≡ b then b ≡ a (symmetry of ≡)
If a ≡ b and b ≡ c, then a ≡ c (transitivity of ≡)
a ⊀ a (irreflexivity of ≺)
If a ≺ b and b ≺ c then a ≺ c (transitivity of ≺)
a ≺ b or b ≺ a or a ≡ b (trichotomy)
If a ≡ c, b ≡ d, and a ≺ b, then c ≺ d (compatibility of ≡ and ≺)
From the definitions of ≼, ≡ we can derive the following criteria.

Criterion for equivalence. a ≡ b if and only if at least one of the following four conditions is fulfilled:
[E1] a = 0 and b = 0.
[E2] a ≠ 0, b ≠ 0, and a_i ≡ b_i for all i.
[E3] a ≠ 0, b ≠ 0, and there are numbers m < n such that
    a_i = 0 for all i < m (if m ≠ 0),
    a_m ≡ b_n,
    a_j ≺ b_n for all j with m < j < n (if m+1 < n),
    a_k ≺ b_n for all k > n.
[E4] The condition corresponding to [E3] by exchanging a and b.

Criterion for the irreflexive ordering relation. a ≺ b if and only if at least one of the following four conditions is fulfilled:
[O1] a = 0 and b = 1.
[O2] b ≠ 0 and a ≺ b_i for at least one i.
[O3] a ≠ 0, b ≠ 0, and there are numbers m < n such that b_m ≠ 0 and a ≡ b_n.
[O4] a ≠ 0, b ≠ 0, and there is a number n such that
    a_i ≺ b for all i < n (if n ≠ 0),
    a_n ≺ b_n,
    a_k ≼ b_n for all k > n.
These criteria for ≡ and ≺ together are primitive recursive, as is the definition [1]-[4] of ≼.

A natural number a is called a finite or a transfinite ordinal (with respect to the relation ≺) according as a ≺ 3 or 3 ≼ a. It is easy to check: If a is a finite ordinal then 2a is the successor of a. If a is a transfinite ordinal then 2a ≡ a.
3 is the smallest transfinite ordinal (it represents the ordinal ω). To define addition of ordinals we use auxiliary functions α and β which are defined in the following way:

α(a) = a_0 if a = 2^{a_0} · 3^{a_1}, and α(a) = 0 otherwise.
β(a) = a_1 if a = 2^{a_0} · 3^{a_1}, and β(a) = a otherwise.

For addition ⊕ of ordinals we have the definition

a ⊕ b = b                         if a = 0,
a ⊕ b = 2^{α(a) ⊕ b} · 3^{β(a)}   if a ≠ 0.

Since α(a) < a for a ≠ 0, the definition is recursive. Therefore a ⊕ b is computable for any natural numbers a, b. One can prove by mathematical induction:

If a ≡ c and b ≡ d then a ⊕ b ≡ c ⊕ d (compatibility of ⊕ and ≡),
If a ≼ b then a ⊕ c ≼ b ⊕ c (weak monotony on the left),
If b ≺ c then a ⊕ b ≺ a ⊕ c (strict monotony on the right),
a ⊕ 0 ≡ a and 0 ⊕ b ≡ b,
a ⊕ (b ⊕ c) ≡ (a ⊕ b) ⊕ c (associative law).
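Since the recursion bottoms out, it can be run directly. The Python sketch below implements the exponent extraction, α, β and ⊕ exactly as reconstructed above; it is only an illustration of how concretely computable the notation system is, not part of the paper.

```python
# Sketch: the auxiliary functions and the addition ⊕ of the notation system.

def exponent(a, p):
    """Exponent of the prime p in the factorization of a (for a != 0)."""
    e = 0
    while a % p == 0:
        a //= p
        e += 1
    return e

def split(a):
    """Return (a0, a1) if a = 2^a0 * 3^a1 exactly, otherwise None."""
    a0, a1 = exponent(a, 2), exponent(a, 3)
    return (a0, a1) if a == 2 ** a0 * 3 ** a1 else None

def alpha(a):
    s = split(a)
    return s[0] if s else 0

def beta(a):
    s = split(a)
    return s[1] if s else a

def oplus(a, b):
    """a ⊕ b as defined above: b if a = 0, else 2^(alpha(a) ⊕ b) * 3^beta(a)."""
    if a == 0:
        return b
    return 2 ** oplus(alpha(a), b) * 3 ** beta(a)

if __name__ == "__main__":
    print(oplus(0, 7), oplus(3, 3), oplus(2, 3))   # 0 is a left identity; 3 represents ω
```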
The difference δ is defined in the following way:

δ(a, b) = b                if a = 0 or b = 0,
δ(a, b) = δ(α(a), b)       if a ≠ 0, b ≠ 0, and β(a) ≺ β(b),
δ(a, b) = δ(α(a), α(b))    if a ≠ 0, b ≠ 0, and β(a) ≡ β(b),
δ(a, b) = δ(a, α(b))       if a ≠ 0, b ≠ 0, and β(b) ≺ β(a).

By mathematical induction it follows that a ⊕ δ(a, b) ≡ b if a ≼ b.
We say that a function f (on natural numbers) is an ordering function (with respect to ≺) of a set S (of natural numbers) if it satisfies the following three conditions:
(1) f(a) ∈ S for any natural number a,
(2) If b ∈ S then there is a natural number a such that f(a) ≡ b,
(3) If a ≺ b then f(a) ≺ f(b).
Remark. The uniqueness of ordering functions can be proved only by using the well-ordering of ≺. Therefore we cannot speak of the ordering function of a set S before we have a proof of well-ordering for the relation ≺.

A number a is called a main ordinal (with respect to addition) if a ≠ 0 and x ⊕ a ≡ a for all x ≺ a. (Obviously, if a ≡ b, then a is a main ordinal if and only if b is a main ordinal.) Using the properties of ⊕ we find the following criterion for main ordinals:
(1) A number a = 2^{a_0} · 3^{a_1} with a_0 ≠ 0 is a main ordinal if and only if a ≡ a_0 and a_0 is a main ordinal.
(2) Every number a = 3^{a_1} is a main ordinal.
(3) If a ≠ 0 and there is an n > 1 such that a_n ≠ 0, then a is a main ordinal.
We define a function λ in the following recursive way:

λ(a) = λ(a_0)  if a = 2^{a_0} · 3^{a_1} and a ≡ a_0,
λ(a) = β(a)    otherwise.

Then a ≡ 3^{λ(a)} if a is a main ordinal. It follows that 3^a as a function of a is an ordering function of the set of main ordinals. (That means that 3^a represents the ordinal ω^{|a|} if a represents the ordinal |a|.) By the properties of addition and of main ordinals it follows that: For any number a ≠ 0 there are numbers c_1, ..., c_m (m ≥ 1) such that

a ≡ 3^{c_1} ⊕ ... ⊕ 3^{c_m} and c_m ≼ ... ≼ c_1 (Cantor's normal form).
The numbers c_1, ..., c_m are uniquely determined up to equivalence, and they are computable for any given a ≠ 0. Therefore multiplication × of ordinals can be defined recursively in the following way:
(1) a × b = 0 if a = 0 or b = 0.
(2) a × 1 = a (if a ≠ 0).
(3) If c_m ≼ ... ≼ c_1 (m ≥ 1) and c ≠ 0, then (3^{c_1} ⊕ ... ⊕ 3^{c_m}) × 3^{c} ≡ 3^{c_1 ⊕ c}.
(4) If (a ≠ 0 and) d_n ≼ ... ≼ d_1 (n > 1), then a × (3^{d_1} ⊕ ... ⊕ 3^{d_n}) ≡ (a × 3^{d_1}) ⊕ ... ⊕ (a × 3^{d_n}).
(5) If a ≡ c and b ≡ d then a × b ≡ c × d (compatibility of × and ≡).
The following properties of this multiplication are provable by mathematical induction:

a × (b ⊕ c) ≡ (a × b) ⊕ (a × c) (distributive law)
If a ≠ 0 and b ≺ c then a × b ≺ a × c (strict monotony on the right)
If a ≼ b then a × c ≼ b × c (weak monotony on the left)
a × 1 ≡ a and 1 × b ≡ b
(a × b) × c ≡ a × (b × c) (associative law)
Let n be a natural number ≥ 1. We say that a number a is (c_1, ..., c_n)-critical in the following cases:
(1) c_1 = ... = c_n = 0 and a ≠ 0.
(2) There is an m such that 1 ≤ m ≤ n, c_i = 0 for all i = 1, ..., m−1 (if m > 1), c_m ≠ 0, and for all x ≺ c_m
According to this definition we have the following facts:
If a ≡ b then a is (c_1, ..., c_n)-critical if and only if b is (c_1, ..., c_n)-critical.
If n < N and c_{n+1} = ... = c_N = 0 then the (c_1, ..., c_n)-critical numbers are the same as the (c_1, ..., c_N)-critical numbers.
Furthermore, we get the following criterion for critical numbers. A number a is (c_1, ..., c_n)-critical if and only if it satisfies at least one of the following two conditions:
[C1] a ≠ 0 and a ≡ p_0^b · p_1^{c_1} ... p_n^{c_n} where b is one of the numbers 0, a_0, or a.
[C2] a ≠ 0, and there is an i such that a ≡ a_i and a_i is (c_1, ..., c_n)-critical.
=
2a • 3b • 5 of the set of (b, I)-critical numbers,
and Ka
= 2a
•
52 of the set of strongly critical numbers.
The relation -< can be proved by impredicative methods to be a wellordering using a proof similar to that for the related -<-relation in § 12 of Schutte [6].
PREDICATIVE WELL-ORDERINGS
287
2. A formal system of ramified analysis
2.1. Primitive symbols 1. Numerals (symbols representing the natural numbers), 2. free number variables, 3. free predicate variables, 4. bound variables, 5. symbols for computable functions on natural numbers, 6. the relation symbols =, =1=, <, ::;;, =, ¥=, -<, :S, 7. the connectives " A, V, -+, ~, and the universal quantifier A, 8. comma and parentheses.
2.2. Inductive definition of terms 1. Every numeral is a term, 2. Every number variable is a term, 3. If fis a symbolfor an n-ary computable function (n ~ 1) and t l , are terms, then t«; ... , tn) is a term.
••• ,
tn
2.3. Computable functions Besides other symbols for computable functions we use "max" and
"r" as symbols for the binary computable functions which are defined
in the following way: max
(Zl' Z2)
r(zl' Z2)
= {
Z2
ifz 2 . 1f z1
o
if
Zl
= {Zl if Zl Z2
:s Zl -< Z2 -< Z2
:S Zl
2.4. Abbreviations We omit parentheses if they are superfluous. t 2 ) , Ee (tl' t 2 ) we write t 1 + t 2 , t 1 ED t 2 · Instead of + In the same way we write terms in other cases in the usual way. Instead of r(t 1, t 2 ) we write only (t l' t2 ) ·
«:
2.5. Numerical terms A term is called numerical if it does not contain a free number variable. According to the interpretation of function symbols (as symbols for
288
KURT SCHUTTE
computable functions) every numerical term t has a computable value I t I which is a natural number. 2.6. Elementary prime formulas
An elementary prime formula is a formula where t 1 , t 2 are terms and >- is a relation symbol of our formal system. A numerical prime formula is an elementary prime formula which does not contain a free number variable. According to our interpretation of function symbols and relation symbols every numerical prime formula has a computable truth-value "true" or "false". A numerical prime formula (tl "" t 2) has the value "true" if and only if I t 1 I ~ I t 2 I holds. An elementary prime formula t 2 ) is called verifiable if there is a constructive proof of the following property: Whenever ti, ti are the results of substituting arbitrary numerals for the free number variables in t 1 , t 2 the numerical prime formula (ti "" ti) has the value "true".
«. ""
2.7. Nominal forms
A nominal form is a finite sequence containing no symbols other than primitive symbols of our formal system and a nominal symbol (which is not a primitive symbol of our system). If ~ denotes a nominal form and X denotes a primitive symbol or a finite sequence of primitive symbols then ~(X) denotes the results of substituting X for the nominal symbol in ~. 2.8. The supremum
Let a be a free number variable occurring neither in the nominal form s nor in the term s, and let sea) be a term. We say that the relation sup s == s is verifiable if there is a constructive proof of the following properties: (1) The elementary prime formula sea) :S s is verifiable. (2) Whenever s*, s* are the results of substituting arbitrary numerals for the free number variables in s, s, then for every natural number
289
PREDICATIVE WELL-ORDERINGS
n
<:: I s* I there is a computable numeral
holds.
z such that n
<:: I 5* (z) I
We shall use this relation sup 5 == s only in a metamathematical sense. We shall not define it as a formula of our formal system. 2.9. Inductive definition offormulas and predicates
We define only monadic predicates. To any formula and to any predicate we associate a level s which is a term of our formal system. I. Every elementary prime formula is a formula of level O. 2. If p is a free predicate variable and s is a term then pS is a predicate of level s. 3. If q is a predicate of level sand t is a term then q(t) is a formula of level s. 4. If F is a formula of level s then' F is also a formula of level s. 5. If F I , F 2 are formulas of levels Sl' (F I
A
S2
then
F 2 ) , (F I v F 2 ) , (F I -+ F 2 ) and (F I
are formulas of level max
.-.
F2 )
(Sl' S2)'
6. Suppose (l) ~(a) is a formula of level s(a) where the free number variable a does not occur in the nominal forms ~ and s. (2) Either a does not occur in the superscript of a variable in ~, or ~ does not contain a free predicate variable. (3) The relation sup 5 == s is verifiable where s is a term containing no free number variables other than those which are contained in s. (4) x is a bound variable which does not occur in ~. Then (x~(x» is a predicate of level sand Axlll(x) is a formula oflevel s. 7. Suppose (1) ~(pt) is a formula of level s. (2) The free predicate variable p does not occur in the nominal form Ill. (3) The bound variable x does not occur in 'li. Then Axt~(x') is a formula of level s.
290
KURT SCHUTTE
Remark. According to this definition a formula or a predicate can belong to different levels St, Sz but only if the elementary prime formula St == Sz is verifiable. 2.10. Syntactical variables
a, b, c, d for free number variables, s, t, u, v for terms, z for numerals, x, y for bound variables, p for free predicate variables, q for predicates, Latin capitals for formulas, Gothic letters for nominal forms. We shall also use these letters with subscripts. 2.11. Elementary formulas
An elementary formula is a formula which does not contain a free predicate variable or a bound variable. It is composed only of elementary prime formulas and connectives. A numerical formula is an elementary formula which does not contain a free number variable. According to the truth-values of numerical prime formulas and the interpretation of connectives every numerical formula has a computable truth-value "true" or "false". An elementary formula F is called verifiable if there is a constructive proof that every numerical formula which results by substituting arbitrary numerals for the free number variables in F has the value "true". 2.12. Notations Aform of level s is a formula of level s or a predicate of level s. If a form F contains an expression pU or XU with a free predicate variable p or a bound variable x then we call u a level indicator of F. A level indicator is either a term or the result of substituting bound variables for some free number variables in a term. THEOREM 2. I : The level s of a form F contains no free number variables other than those which are contained in level indicators of F.
291
PREDICATlVE WELL-ORDERINGS So
THEOREM 2.2: If a form F of level s contains a form Fo of level So then S s is verifiable. THEOREM 2.3: (term substitution) Suppose
~(a) is a form of level sea) where a is a free number variable which does not occur in the nominal forms ~, s, (2) t is a term. Then ~(t) is aform of a level s such that set) == s is verifiable. (~(t) is a formula if and only if~(a) is aformula.)
(1)
THEOREM 2.4: (predicate substitution). Suppose (1) ~(i) is a form oflevel u~(i) where t is a term and p is a free predicate variable which does not occur in the nominal form ~, (2) q is a predicate of level aq. Then ~(q) is a form of a level u~(q) such that uq
S
t -+ u~(q)
s u~(i)
is verifiable. m(q) is a formula if and only if ~(pt) is a formula.) These theorems 2. 1-2.4 are immediately provable by induction on the formation rules for formulas and predicates. THEOREM 2.5: IfAxt~(xt) is aformula of level s then t
:S s is verifiable.
PROOF. According to our definition, the formula Axt~(xt) has the same level s as the formula ~(pt) containing a predicate pt of level t. Therefore t S s is verifiable according to Theorem 2.2.
2.13. Axioms
(AI) Every formula which is derivable in intuitionistic logic. (A2) Every elementary formula which is verifiable. (A3) The abstraction axioms: where t is a term and
(x~(x»
(t)
(x~(x»
is a predicate.
~ ~(t)
(A4) The axioms for trivial quantification: t
where t is a term and
=0
-+ Axt~(xt)
Ax~(xt) is
a formula.
292
KURT SCHUTTE
(AS) The axioms for predicate quantification:
Axl'll(XI )
A
oq -< t
-+
'll(q)
where q is a predicate of level oq and Axl'll(x') is a formula. 2.14. Inference rules
(BI) Every inference rule of intuitionistic logic. (B2) The inference rule of mathematical induction: FA Ax[x < a
-+
'll(x)]
-+
'll(a)
=>
F -+ 'll(t)
where a is a free number variable which does not occur in the formula F or in the nominal form'll, 'll(a) is a formula, and t is a term. (B3) The inference rule for predicate quantification: F -+ 'll(p(a, I)
=>
F
-+
Axl'll(x')
where p is a free predicate variable which does not occur in the formula F or in the nominal form'll, and a is a free number variable which does not occur in F, 'll or the term t. Remark. A formula Axl'll (x') has the interpretation: "'ll(q) for all predicates q of level -< r: 2 . 15. Definition
A formula of level s is regularly derivable if it is derivable only from formulas of levels s, such that Sj ::5 S is verifiable. 3. Proof of transfinite induction in the formal system of ramified analysis We define the foIlowing formulas: Pr (q) ( <, -progressivity
+-+
der
Ay [Ax(x
<
y -+ q(x» -+ q(y)]
of the predicate q),
l(q, t) ~ [Pr(q) -+ Ay(y
<
t -+ q(y))]
(transfinite -<-induction up to the ordinal t applied to the predicate q), J(s, t) ~ Ax'l(x', t)
PREDICATlVE WELL-ORDERINGS
293
(transfinite -<-induction for predicates of levels -< s up to the ordinal t), The formulas Pr(q) and I(q, t) have the same level as the predicate q. The formula J(s, t) has the level s. We want to derive J(s, t) for ordinals t smaller than the first strongly critical number "0' In the following derivation we use the abbreviations: "log. der." for "derivable in intuitionistic logic", "by prec." for "it follows from the preceding formula", "by n prec." for "it follows from the last n preceding formulas". To prove transfinite induction up to the first a-number in Gentzen's way [4] we use two binary functions y and t/J which are defined in the following way: y(a,O)
=
t/J(a, b)
o { = A(8(b, a»
a, yea, b+ I)
=
yea, b) 6' a,
if a -< b, if b :S a.
The following formulas are verifiable: (1.1) a =1= 0 -. 3~(a):s a (1.2) a
-< y(3~(a),
a)
(1.3) a <; b 6' y(3",(a,bl, 8(b, a» (1.4) c =1= 0 A a <, b Ell 3' -. t/J(a, b)
-< c.
The verifiability of (I. I) and (1.2) is provable by mathematical induction on a. PROOF OF (1.3). Suppose Then a == b Ell 8(b, a).
By (1.2) 8(b, a)
hence a
-<
b:s a.
-< y(3~(d(b,a»,
(Otherwise the statement is trivial.)
b(b, a»
= y(3 "'(a, b), 8(b, a»,
b Ell 'l'(3l/!(a,b), b(b, a».
PROOF OF (1.4). Suppose c =1= 0 and a -< b Ell 3'. If a -< b then t/J(a, b) = 0 -< c. If a == b then t/J(a, b) = A(c>(b, a» = leO) = 0 -< c. If b <, a then b Ell b(b, a) == a -< b Ell 3' implies 8(b, a) -< 3'. By (1.1) it follows that 3~(d(b,a» :S b(b, a) -< 3', hence t/J(a, b) = A(b(b, a» -< c.
294
KURT SCHUTTE
We associate with a predicate q a predicate q(t) ~ Ay[Ax(x
q such that
-< y --. q(x)) -+ Ax(x -< y
EB 31
-+
q(x))].
(According to section 2, q can be defined as a predicate (xlll(x)) of the same level as the predicate q.) Then the following formulas are regularly derivable: (1.5) Ax(x
-< b -+ q(x)) A q(t) --. Ax[x -< b EB
(1.6) I(q, c) --. I(q, 3
C
')'(31, u)
-+
q(x)]
)
(1.7) J(s, A(a)) -+ J(s, a).
The formula (1.5) is derivable by using the inference rule (B2) of mathematical induction with respect to the term u. Derivation of (1.6). c =l= 0 A a Ax(x
a
-<
c
b EB 3 -+ tjJ(a, b)
-< c -+ q(x)) A tjJ(a, b) <
-< b EB ')'(31/1(0, "', b(b, a))
Ax(x
-<
-< c
is the verifiable formula (1.4)
c -+ q(ljJ(a, b))
is log. der.
b -+ q(x)) A q(tjJ(a, b)) -+ [a
c =l= 0 A Ax(x
-< c --. q(x)) A Ax(x -<
c =l= 0
<
c
A
Ax(x
= 0 A Pr(q)
-< b
is the verifiable formula (1.3)
EB
')'(31/1(0, bl,
by (1. 5)
b --. q(x)) --. (a
-<
b EB 3c
--.
by 4 prec.
q(a))
by prec.
c --. q(x)) -+ q(c)
is trivially derivable
--. q(c)
by 2 prec.
Pr(q) -+ Pr(q) Pr(q) A Ax(x
b(b, a)) --. q(a)J
<
I(q, c) -+ I(q, 3
C
c
-+
q(x)) -+ Ax(x
<
c
3
--.
q(x))
is trivially derivable by 2 prec.
)
Derivation of (1.7). is trivially derivable
J(s, A(a)) -+ J(s, A(a) EB 1) J(s, A(a) EB 1) A (b, s)
s =l= 0
-+
(b, s)
-< s
-< s -+ I(p(b, s), lea)
EB 1)
is an axiom (A5) is verifiable
295
PREDICA TIVE WELL-ORDERINGS
I(lb. sl, A(a) EI7 1) I(p(b. sl, 3A(al Ell 1)
~
~
I(lb. sl, 3A(al
by (1.6)
Ell 1)
because a -< 3 A(a l Ell l is verifiable
I(p(b. sl, a)
s =1= 0 A J(s, ,l.(a)) ~ I(p(b. sl, a)
by 5 prec.
s =1= 0 A J(s, ,l.(a)) ~ J(s, a)
by an inference (B3)
s = 0
is an axiom (A4)
J(s, a)
~
by 2 prec.
J(s, -lea)) ~ J(s, a)
LEMMA 1: The formula Jis, 5) is regularly derivable. (The number 5 represents the first a-number with respect to the -<. -ordering.)
PROOF.
Its, 0)
is trivially derivable
Jts, ,l.(a)) ~ Jts, a)
by (1.7)
a =1= 0 a
A
a
-<. 5 ~
-<. 5 ~
-lea)
< a A -lea) -<. 5 is verifiable
Jts, a)
from 3 prec. by mathematical induction by prec.
/(s,5)
For the proof of the next lemma we need binary functions J1 and v which are defined recursively in the following way: J1(a, b)
0
a=O
v(a, b)
0
{ a o =1= 0 ao = 0
J1(a o, b)
v(ao, b)
p(a 1 , b)
Veal> b)
{a 1 S b b < a1
ao
a1
a
b
{a S b b-
0
a
a
b
{a S b n > 2 with an =1= 0 b-
0
a
a
b
a = 2ao • 3a, a
= 2ao • 3a , • 5
a =
2ao • 3a, . 5az
1
-< a z
There is an
296
KURT SCHUTTE
One can prove by induction on a that the following formulas are verifiable:
(2.1) v(a, b)::s b
-<
(2.2) a oF 0 A v(a, b)
b
--+
Ilea, b) < a
(2.3) a = 0 v ..1.(a) < a v a == 2/1(0, b) • 3'(0, b) • 5 LEMMA
2: The formula
PROOF.
We use the following abbreviations:
Ax Ay(y -< t 1 A J(s, x) --+ J(s, 2x • 3Y • 5)) A J(s ffi 1, to) --+ J(s, 2to • 3t 1 • 5) is regularly derivable for arbitrary terms to and t i -
m:(s, t 1 ) ~(s,
~
Ax Ay[y <, t 1
A
J(s, x)
--+
J(s, 2X • 3Y • 5)J
b, t 1) +-+ Ax(x <; b --+ J(s, 2x • 3t l • 5)) def
il(s, a, b, t 1 )
+-+ Ax[x def
< a
--+
(x
-< 2 b • 3t l • 5 --+ J(s, x))J
Derivation of the asserted formula m:(s, td
A
J(S ffi 1, to)
--+
J(s, 2t o • 3t l • 5).
v(a , tl) -: t 1 A a == 2/1(0, tl). 3v(a, ttl. 5 A m:(s, t 1) ( a <, 2b • 3t l • 5 --+ J(s, a)
v(a,
==
t1)
t
( --+ J(s, a)
v(a, t 1 )
1
A
a == 2/1(0, ttl. 3v(a, ttl . 5 A
0
--+
A
il(s, a, b, t 1 )
J(s, a)
A
is derivable by using (2.2) with t 1 instead of b t
1)
-< t 1
-'c(a) < a
=
b,
il(s, a, b, t 1)
A
a
< 2b • 3t 1 • 5
is derivable by the definition of ~ is verifiable by (2. 1)
a == 2/1(0, til. 3v ( a , til. 5 A m:(s, t 1 ) {A a -< 2 b • 3t l • 5 --+ J(s, a)
a
~(s,
A
A
a
-< 2
b
A ~(s,
b, t 1 )
A
il(s, a, b, t 1 ) by 3 prec.
•
t1
3
•
5
--+
J(s, a)
is derivable by using (1.7) is an axiom (A4)
297
PREDICATIVE WELL-ORDERINGS
a = 0 v A(a) < a va == l2I(s,
t I) A
!B(s, b,
t I) A
21'(o,'tl • 3,(o,'tl •
i!(s, a, b,
t I) A
a
5
-< 2b . 3
11
is the verifiable formula (2.3) •
5
J(s, a) by 4 prec.
-+
by mathematical induction
l2I(s, t 1 ) A !B(s, b, t 1 ) l2I(s, t 1 )
-+
-+
J(s, 2b • 3'1. 5)
by prec.
Pr«x J(s, 2"" . 3" . 5»)
by prec.
Prtt» J(s, 2"" . 3" . 5») A J(s EB 1, to)
-+
J(s, 2'0 . 31 1 • 5) because the pre-
dicate (xJ(s, 2"" . 3" . 5» has the level s <, s EB 1
l2I(s,
tl)
A J(s EB 1, to)
-+
J(s, 2'0 . 3'1 . 5)
by 2 prec.
To prove the next lemma we need a binary function 0 which is defined recursively in the following way:
I {36 ( b, P(o» EB O( cx( a), h)
o( a, b) --
if a = 0 or f3(a) if a 4= 0 and b
The following formulas are verifiable: (3.1) a (3.2) a
-< 3
bxO(a,
<
LEMMA
b
3
h)
'" 1 X V -+
8(a. h)
-< 3 xv.
3: The formula
Ay [J(3 U x (y, 3 3 ' ) EB 1, t l ) A
u
J(3 fI; 1 x(h, 3
3
-+
'), tl) -+
J(3 U x (y, 3 3 ' ) , t 2)] J(3
ue
I
x (h, 3
is regularly derivable for arbitrary terms s, 11' 12• u. PROOF. We use the abbreviations
A ~ Ay[J(3 U x (y, 3 3 ' ) EB 1,11)
s(h)';;r 3u fI; 1 x(h, 3 3 ') o(a, h) ';;r O«a, s(b», u).
-+
J(3 U x (y, 3 3 ' ) , t 2 ) ]
3
A
s 4= 0
' ) , t 2)
-<
b,
:S f3(a).
298
KURT SCHUTTE
The formula A depends on 5,1 1,12 , u. The nominal forms sand n depend on 5 and u. The asserted formula of our lemma is A A 5 =!= 0 A J(s(b), 11) -+ J(s(b), ( 2 ) ,
Derivation of this formula: (a, s(b»
-< 3"
s(b) =!= 0
-+
(3.3) s(b) =!= 0 u(a, b)
Ell
1
x (b, 3 3 ' )
s(b) =!= 0
8«a, s(b», u)
3
' ) --+
3" x (u(a, b), 3
A A J(s(b), ( 1) A 3" x (u(a, b), 3 { -+ J(3" x (u(a, b), 3 3 ' ) , ( 2)
(3.4) s(b) =!= 0 A A
(3.5)
5
=!= 0
A
s(b) =!=
A
3
<3
-< s(b) ' ) EEl 1 -< s(b)
by 2 prec. is trivially derivable
x (b, 33 ' ) -+ u(a, b)
=!= 0 --+ o(a, b)
OA 5
-< 3" x u(a,
{ -+ (a, s(b»
<
s(b) =!= 0 A
=!= 0
-< 33
5
(a, s(b»
A
3
{ --+ I(p(a, db), ( ) 2
s =!=
s(b) =!= 0 A A
b) A o(a, b)
3" x (u(a, b), 3
J (3" x (u(a, b), 3
OA
EEl 1 -< s(b) is verifiable
J(s(b), t 1) -+ J(3" x (u(a, b), 3 3 ' ) ,
(a, s(b» <. 3" x o(a, b)
(3.6) s(b) =!=
')
( 2)
by 2 prec.
u(a, b)
(a , s(b»
by 2 prec.
3
3" x (u(a, b), 33 ' ) EEl 1
-+
by (3.2)
is verifiable
-< 3 x (b, 33 ' )
-+ u(a, b)
-< 3 x (b, 33 ' )
-< s(b)
(a, s(b»
-< 3 x (b, 3
-+
OA
A 5
' ) , ( 2) A
3
by (3.3) and (3.5)
<
A
by (3.1)
-< 33 '
is verifiable
')
-< 3" x (u(a, b), 33 ' )
(a, s(b»
by 3 prec.
-< 3" x (u(a, b), 33 ' )
J(3" x (o(a, b), 3 3 ' ) ,
=!= 0
-< 33 ' is verifiable
is an axiom (A5)
( 2) --+
I(p(a'5(b», ( 2 ) by 2 prec.
J(s(b), ( 1) --+ I(p(a, ,(b», (2 )
by (3.4) and (3.6)
s(b)
*0
s(b)
== 0
A
PREDICA TlVE WELL-ORDERINGS A
-+
*0
AS
A
A
A S
*0
A
J(s(b), t l )
J(s(b), t 2 )
-+
by an inference (B3) is an axiom (A4)
J(s(b), ( 2 ) J(s(b),
299
by 2 prec.
J(s(b), t 2 )
II) -+
We define a predicate Qs such that
.
Qs(t) +-+
* 0 then
If S
-+ J(3 3 ("
-<
(t, s)
3
{AY Ax [J(3 3 (" s )
-< 33 '
On the other hand if u
-<
(y, 3 3 ' )
33 ) ,
(y,
sand
3 (" ')\91 X
(y, 33 ' ) , x)
')$ I X
3 3 ("
then u
::5
2:< . 3(" .) . 5)]
')$llil3'
3 3 ("
')lill
== 3 3 ' . X
(u,3 3 ') .
Therefore the formula (4.1)
is verifiable. (Qs is a predicate of level 33 ' if LEMMA
4: The formula S
*0
-+
S
* 0.)
Pr(Qs)
is regularly derivable for any term s. PROOf.
Let t be the term 33 6 ( b ,
(c, 3 3 )
X
(a • • ))
A X(X < a -+ Q.(x» A b < (a, s) A J(3 3 ( b , {-+ J(3 3 ( b , ')$1 X (t, 3 3 ') , 2d • 3(b, s). 5) b
-< (a, s) -+ (r, 3
{ A (b, s)
3 ')
==
1
A 3 3 ( b , ')$1 X
Ax Ay[y -+
X
(t, 3 3 ' ) , d)
is trivially derivable
==
t
$ I
33 (b, . ) X
== b
A X(X -< a -+ Qs(X» A b -< (a, s) {-+ )(3 3 ( a , . ) X (c, 3 3 ' ) , 2d • 3b • 5)
A
s)
J(33<"·
J(3 3 ( a •
-< (a, s) s)
• )
X
X
(c,
A J(3 3 ( a ,
33 ),
2:< .
(c, 33 ' ) EB 1, d)
is verifiable A
X (c,
.)
3Y •
(c, 33 ' )
J(3
3
(a• • )
X (c,
3
3 ),.d)
by 2 prec. 3 3 '),
x)
5)]
-+
J(3 3 ( Q•
• )
X
(c, 33 ), 2d • 3(a. s) • 5)
by Lemma 2
300
KURT SCHUTTE
A X(X -< a -+ Q.(x» /\ J(3 3 ( a . , ) X (c, 33 ' ) Ell 1, d) {-+ J(3 3( a . ,) X (c, 3 3 ' ) , 2d • 3(0,.) . 5) Ay[J(3 3 ( a . /\ S (
-+
X
, )
~ 0/\ J(3
J(3 3 ( a .
')(1)
(y, 33 ' ) Ell 1, d)
3(a,
1
, ) (I)
1 X
-+
J(3 3 ' a .
, )
X
x(c, 33 ' ) , 2d • 3(0,.). 5)
by Lemma 3 1 X
(a.
s
~
0
-< a --+ --+
(y, 33 ' ) , 2d • 3(0,.) . 5)J
(c, 33 ' ) , d)
A X(X <. a --+ Q.(x» /\ s ~ 0/\ J(3 3
Q.(x» /\ s
~
0
--+
(c, 3 3 ') , d) by 2 prec. by prec.
Q.(a)
by prec.
Pr(Q.)
LEMMA
by 2 prec.
5: The formula J(3 3 ' Ell 1, t) /\ t
-< S -+ J(33', 3' . 5)
is regularly derivable for arbitrary terms s, t. PROOF.
s ~ 0
-+
aQ.
J(3 3 ' Ell 1, t) Q.(t)
--+
J(3
3
-< A
<"
33 ' Ell 1/\ Pr(Q.)
-<
aQ. s ) (I)
1X
by (4.1) and Lemma 4
3 3 ' Ell 1/\ P"(Q.) (v, 3
3
'),
-+
Q.(t) is trivially derivable
3("') . 5)
is trivially derivable
-<
s -+ s ~ 0 A (t, s) == t is verifiable (5.1) J(3 3' Ell 1, 1) A t -< S --+ J(3 3' tI! 1 X (v, 33 ' ) , 3' . 5) by 4 prec. t
Let v be the term (a, 3 3 ' ) Ell 1. Then the formula (a, 3 3 ' )
-<
3 3 ' (1) 1
X
(v, 3 3 ' ) is verifiable. Therefore 3
(5.2) J(3 3 ' EIl l x (v, 3 3 ' ) , 3'· 5) -+ l(p(D. 3 ' ), 3" 5) 3 J(3 3' Ell I, t) /\ t -< S -+ l(p(D, 3 \ 3'· 5)
J(3 3 ' Ell I,
1) A t
-<
S --+
J(3 3 ' , 3' . 5)
is derivable by (5.1) and (5.2) by an inference (B3)
We define a monadic function tt in the following recursive way: 1) If b
-<
5 then n:(b)
= O.
PREDICATIVE WELL-ORDERINGS
301
2) If b == 3b • 5 then neb) = b.
-< 3 b • 5 there are
3) If 5 ::5 b 3.1) b = 2
bo
•
b,
3
•
only the following two cases:
Then neb) = max (n(b o), n(b i)
3.2) b = 2bo • 3b , • 5. Then neb) = max (n(b o), bi)' For this function n the following three formulas are verifiable: (6.1) b
-<
3
K
·5
(b) $ 1
(6.2) 5 ::5 b --. 3
K
(b) .
(6.3) b::5 3c • 5 A
5 ::5 b
C =!=
0 --. neb) EEl I ::5 c.
The formulas (6.1) and (6.2) are provable by induction on b. (6.3) is trivial for b -< 5. It follows from (6.2) for 5::5 b. LEMMA
6: The formula
J(s, t)
t ::5 s
A
A
35 ==
S --.
J(s, 3' . 5)
is regularly derivable for arbitrary terms sand t. PROOF. U
~f
max
We use the abbrevations:
«a, s) EEl 1, neb) EEl 2).
v d7r neb) EEl 1 v <, u is verifiable, therefore J(3 3 " EEl 1, v) --. J(3 3 " , 3v
b
-<
•
3' . 5 A t =!= 0 --. v ::5 t
v ::5 t
s t ::5
J(5, t)
A
3
3
"
SA
3 == 5
EEl 1 <
S A V
t =!= 0 A J(s, t) A t ::5 (6.4) { --. J(3 3 " , J" . 5) b-<3
v'5
(a, s) <, 3
S --.
S A
is derivable by Lemma 5
5)
3
3
"
by (6.3)
EEl 1 -< s
::5 t --. J(3 35 ==
S A
3
b
"
<
is verifiable EEl 1, v) is trivially derivable 3' . 5
by 4 prec. by (6.1)
3
"
is verifiable
302
KURT SCHUTTE
(6.5) J(3 3 " , 3"· 5) -+ l(p(a.
s),
t =1= 0 A J(s, t) A t
:5 s A 3 ==
SA
t =1= 0 A J(S, t) A t
:5 s A 3s ==
s
t
=0
J(s, t)
A
s
t
:5 SA 3 == s
((0) = 0, ((n + 1) =
t
t
J(S, 3
-+
•
5
•
5)
-+
s), b) by (6.4) and (6.5)
l(p(a,
by prec. by Lemma 1
s
t
J(S, 3
-+
We define monadic functions ( and
L(a)
b -< 3
J(s, 3t • 5)
-+
by 2 prec.
b)
3~(n)
L(7t(a)) + 1
={ 1
•
if 0
•
L
5)
by 2 prec.
in the following recursive way:
5.
-< a -:
52
otherwise.
Then the following formulas are verifiable: (7.1) Sen) (7.2) a
-< 52
-< 52 -+ a -< S(L(a)).
LEMMA 7: For any natural numbers m :::; n the formula J(s(n), s(m+l)) is regularly derivable. PROOF by induction on m. (1) m = O. Then S(m+ 1) = 5. By Lemma 1 the formula J(s(n), 5) is regularly derivable. (2) 0 < m :::; n. Then sen) is an s-number, i.e. 3~(n) == Sen). Furthermore sCm)
:5 Sen). By the induction assumption
J(S(n), sCm)) is regularly derivable. By Lemma 6 it follows that J(C,(n), sCm + 1)) is regularly derivable.
THEOREM: For any number provable in a predicative way. PROOF. If z
Z
-< 52
transfinite induction up to z is
-< 52 then z -< ((L(Z)) by (7.2).
Therefore it is sufficient to prove: For any natural number n transfinite induction is provable predicatively up to numbers :5 ((n). We prove this statement by induction on n. (1) n
= O. Then ((n) = 0, and the statement is trivial.
PREDICA TlVE WELL-ORDERINGS
303
(2) Suppose the statement is true for n. Then we have a predicative proof that the restriction of our formal system to levels ::S ((n) is a predicative formal system because these levels are well-ordered. According to Lemma 7 we have a proof of J(((n), ((n+ 1» in this formal system. Therefore transfinite induction up to numbers ::S ((n+ 1) is provable predicative1y. References [1] S. Feferman, Constructively provable well-orderings. Notices Amer. Math. Soc. 8 (1961) 495.
[2] S. Feferman, Provable Well-orderings of and Relations between Predicative and Ramified Analysis. Notices Amer. Math. Soc. 9 (1962) 323.
[3] S. Feferman, Systems of Predicative Analysis. Text of an invited address delivered to a meeting of the Association for Symbolic Logic at Berkeley, January 1963.
[4] G. Gentzen, Beweisbarkeit und Unbeweisbarkeit von Anfangsfällen der transfiniten Induktion in der reinen Zahlentheorie. Math. Annalen 119 (1943) 140-161.
[5] K. Schütte, Kennzeichnung von Ordnungszahlen durch rekursiv erklärte Funktionen. Math. Annalen 127 (1954) 15-32.
[6] K. Schütte, Beweistheorie (Berlin-Göttingen-Heidelberg 1960).
[7] K. Schütte, Eine Grenze für die Beweisbarkeit der transfiniten Induktion in der verzweigten Typenlogik. To appear in Archiv f. math. Logik und Grundlagenforschung.
[8] O. Veblen, Continuous Increasing Functions of Finite and Transfinite Ordinals. Transactions Amer. Math. Soc. 9 (1908) 280-292.
REMARKS ON MACHINES, SETS, AND THE DECISION PROBLEM 1)

HAO WANG
Harvard University, Cambridge, Mass., USA
1. Machines and production systems

1.1. The basic distinction between monogenic and polygenic systems corresponds to the contrast of calculations with proofs, functions with relations, and machines with production systems. In calculations, we generally have a fixed procedure such that the answer is completely determined by the question. In looking for a proof of a given statement in a given formal system, we have in general an unbounded number of choices at each stage since, for example, there are infinitely many p's such that p → q together with p would yield q. If there is a fixed number n such that at each node there are only n or fewer choices, then clearly we can get a monogenic system in the search for proofs. A monogenic proof procedure, such as the Herbrand expansion procedure for the predicate calculus, need not give a decision procedure. On the other hand, a monotone system, such that by some criterion the conclusion is always longer or more complex than the premisses, is always decidable when there are finitely many rules only. Thus, given a statement p, the total number of statements which can enter in a proof of p is finite since every rule has a fixed number of premisses. Hence, it is of interest to inquire when a polygenic system is equivalent to a monogenic one, and when either is equivalent to a monotone one.

1.2. A machine which halts on every finite input corresponds to a function from the input to the output. If, on the other hand, we allow,

1) Work for this paper was supported in part by NSF grant GP-228 and in part by Bell Telephone Laboratories, Inc., Murray Hill, New Jersey.
e.g., that the machine can do either of two things at each moment, then for each input we can get many outputs, and we get, in general, a relation Rxy such that y is an output of the machine for the input x. It seems somewhat unnatural to speak of a polygenic machine, but with a Post production system, the distinction between monogenic and polygenic is perfectly natural. In Turing machines, we are usually interested in tapes which are blank for all but a finite number of squares. The consecutive minimum portion containing all marked squares and the square presently under scan could be taken as the string of symbols in a production system. In that case, a machine corresponds to a monogenic production system except for the fact that the former has a scanned square at each moment and has different states.

DEFINITION 1: A labeled rewriting system is a finite set of rules P_i → Q_i such that in each P_i and Q_i exactly one symbol has an arrow above it (the label indicating the square under scan).

THEOREM 1: There is an effective method by which, given any Turing machine, we get a corresponding monogenic labeled rewriting system in which each P_i (also each Q_i) contains exactly two symbols, one of which is labeled.

To prove this, we use a Turing machine formulation such that in each state, a machine prints, shifts, and changes state according to the symbol newly under scan. In other words, if there are m states q_1, ..., q_m, n symbols S_1, ..., S_n, a machine is given by q_a S_i ±1 S_j q_b (a = 1, ..., m; i, j = 1, ..., n), so that if the machine is in state q_a scanning symbol S_i, it shifts right (+1) or left (−1) and then scans the next square, ending up in a state q_b determined by the newly scanned symbol S_j. It is not hard to verify that this formulation is equivalent to the usual one in the sense that they can simulate each other. With this formulation, we can always use an alphabet with (m + 1)n symbols and one state only. Thus, instead of the given state q_a and the symbol S_i, we have the symbol (a, i). This is changed to (0, i). After the shift, the scanned symbol is (0, j) which is now changed into (b, j). In other words, for c = 1, ..., m and d = 1, ..., n, (c, d) is a symbol indicating state c and symbol d, when the square is under scan; a symbol d
in other squares is represented by (0, d). This makes it easy to give a 1-state universal machine and yields a measure of the complexity of Turing machines solely by the size of the alphabet (using always 1 state only). This also gives Theorem 1 immediately, since the rules are simply of the forms (writing ↑ after the labeled symbol)

(a, i)↑ (0, j) → (0, i) (b, j)↑ for right shift,
(0, j) (a, i)↑ → (b, j)↑ (0, i) for left shift.
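A small generator may make the translation concrete. The encoding of the machine table below is our own assumption (a map from (state, scanned symbol) to a shift and a next-state function of the newly scanned symbol); the rule shapes follow the two displayed forms above.

```python
# Sketch: generate the labeled rewriting rules of Theorem 1 from a machine table.
# Our encoding: table[(a, i)] = (shift, next_state), shift in {+1, -1},
# next_state a function of the newly scanned symbol j.

def rewriting_rules(table, symbols):
    """Rules as (lhs, rhs) pairs; a labeled symbol is ('*', state, symbol),
    an unlabeled one is (0, symbol)."""
    rules = []
    for (a, i), (shift, next_state) in table.items():
        for j in symbols:
            b = next_state(j)
            if shift == +1:   # (a,i)* (0,j) -> (0,i) (b,j)*
                rules.append(((('*', a, i), (0, j)), ((0, i), ('*', b, j))))
            else:             # (0,j) (a,i)* -> (b,j)* (0,i)
                rules.append((((0, j), ('*', a, i)), (('*', b, j), (0, i))))
    return rules

if __name__ == "__main__":
    # Toy machine: in state 1 on symbol 1, move right and enter state 2
    # whatever symbol is seen next.
    table = {(1, 1): (+1, lambda j: 2)}
    for lhs, rhs in rewriting_rules(table, symbols=[1, 2]):
        print(lhs, "->", rhs)
```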
1.3. Multiple tapes naturally make it possible to simulate each m x nk one-tape machine by an (m, n, k) (m states, n symbols, k tapes) machine; but the full force is not used in the simulation and it is desirable to find more accurate measures than these. Recently, P. K. Hooper [7] proved: THEOREM 2: There is a (2,3,2) universal Turing machine; there is a (1,2,4) UTM, having afixed loop/or one a/its/our tapes.
In the realm of "real time computation," Michael Rabin has recently proved that there are calculations which can be performed by two tapes but not by one tape. The whole area of efficient calculations (as against theoretical computability) is wide open and promises much interesting work. Although there are various elegant formulations of Turing machines, they are still radically different from existing computers. To approach the latter, we should use fixed word lengths, random access addresses, accumulator, and permit internal modification of the programs. Alternatively, we could, for example, modify computers to allow more flexibility in word lengths. Too much energy has been spent on oversimplified models so that a theory of machines and a theory of computation which have extensive practical applications have not been born yet. 1 .4. There are a number of conceptually neat results on the theoretical side. We mention a few recent ones at random. The most elegant formulation of Turing machines is perhaps the SSmachines of Shepherdson and Sturgis [16]. An SS-machine is a finite sequence of instructions, each of which is of the following two types.
MACHINES, SETS, AND THE DECISION PROBLEM
Po, Pi: print
°
307
(or 1) at the right end of the string S and go to the next instruction.
SD(k): scan and delete the leftmost symbol of S; if it is 0, go to the next instruction, otherwise, go to instruction k; if S is null, halt.
They have proved: 3: Every Turing machine (in particular, a UTM) can be simulated by an SS-machine. THEOREM
It is particularly easy to simulate these machines by Post production
systems (see [22]).
1.5. A combinatorial system in the most general sense would be any finite set of rules, each of which effectively produces a finite set of conclusions from a finite set of premisses. The most intensively studied case is the one in which each rule has a single premiss and a single conclusion. Such a system is called monogenic if the rules are such that for any string at most one rule is applicable. From this broad class of monogenic systems, Post chooses to consider the tag systems. A tag system is determined by a finite set of rules

s_i → E_i, i = 1, ..., p,

such that if the first symbol of a string is s_i, then the first P symbols are removed and the string E_i is appended at the end. Since the system is monogenic, s_i ≠ s_j when i ≠ j. If the alphabet contains σ symbols, then p = σ. Another natural class is, for want of a better name, the lag systems. A lag system is a set of σ^P rules

s_{i_1} ... s_{i_P} → E_j,

such that if the first P symbols of a string are s_{i_1} ... s_{i_P}, the first symbol, viz. s_{i_1}, is deleted and E_j is appended at the end of the string. In either case, E_i may be the null string. If S_i is the length of E_i and S is the maximum among the S_i, then each system has a prefix number P and a suffix number S. In [11] and [12], Minsky has proved the following remarkable result:
THEOREM 4: There is a tag system with prefix number P = 2 and suffix number S = 4, whose halting problem is unsolvable.
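To make the mechanism vivid, here is a minimal Python simulation of a tag system; the code and the particular toy rules below are our own illustration, not Minsky's construction.

```python
# Sketch: simulating a tag system with prefix number P.
# rules maps a leading symbol to the string E_i appended at the end.

def run_tag(rules, P, word, max_steps=50):
    """Apply tag rules until the word is shorter than P or no rule applies."""
    history = [word]
    for _ in range(max_steps):
        if len(word) < P or word[0] not in rules:
            break                      # halting: too short, or no applicable rule
        word = word[P:] + rules[word[0]]
        history.append(word)
    return history

if __name__ == "__main__":
    # A toy 2-tag system on the alphabet {a, b, c}.
    rules = {"a": "bc", "b": "a", "c": "aaa"}
    for w in run_tag(rules, 2, "aaa", max_steps=10):
        print(w)
```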
This is improved slightly in [22] to get the suffix number down to S = 3, and then the result is shown to be best possible because every tag system with P = 1 or P ≥ S is always decidable (i.e., both its halting problem and its derivability problem). More recently, Cocke and Minsky gave an improved proof of Theorem 4, from which the simplification to S = 3 follows directly. In these considerations, attempts to use the SS-machines have not been possible. A similar result for lag systems is proved in [22] by using SS-machines:

THEOREM 5: There is a lag system with P = S = 2, whose halting problem is unsolvable; moreover, when P = 1 or S ≤ 1, every lag system is decidable.

The tag systems are a subset of Post's monogenic normal systems, each of which has rules of the form B_i Q → Q E_i, such that a given string B_i Q becomes Q E_i by the rule. It is quite easy to use SS-machines to get a normal system with P = S = 2 (P the maximum of the lengths of the B_i) whose halting problem is unsolvable (see [22]). A specially interesting subcase of the normal systems is the 1-normal systems in which B_i is always a single symbol. The 1-normal systems include all tag and lag systems with P = 1. It is obvious from [22] that the halting problem for every 1-normal system is decidable. S. Cook and S. Greibach have strengthened the result, with two radically different proofs, to get also:

THEOREM 6: The derivability problem (i.e., whether one string is deducible from another) of every 1-normal system is decidable.
MACHINES, SETS, AND THE DECISION PROBLEM
309
THEOREM 7: If T ranges over nonerasing T.M., W ranges over words in their history, I ranges over (finite) inputs, then the relationP(W, T,I) (i.e., W belongs to the history of T with input J) is recursive; on the other hand, for a fixed initial (finite) input, we can find a T.M. with erasing permitted such that the set of words in its history is not recursive.
2. The decision problem and its reduction problem
2. 1. In this part, we consider recent results on the decision and reduction problems of the (restricted) predicate calculus. Since all mathematical theories can be formulated within the framework of the predicate calculus (quantification theory, elementary logic), Hilbert spoke of the decision problem when he was referring to the problem of finding a general algorithm to decide, for each given formula of the predicate calculus, whether it is satisfiable in some nonempty domain (or, has a model). He called this the main problem of mathematicallogic. It is familiar today that this problem in its general form is unsolvable in a technical sense which is widely accepted as implying unsolvability according to the intuitive meaning. An interesting problem is to investigate the limits of decidable subdomains and the underlying reasons for the phenomenon of undecidability. Recently, the general problem has been reduced to the formally simple case of formulas of the form AxEx'AyMxx'y, where Misquantifier-free and contains neither the equality sign nor function symbols. In fact, one can further restrict the class to those AEA formulas in which all predicates are dyadic, and each dyadic predicate G j occurs only in some of the nine possible forms Gsxx, Gjxx', Gjx'x, Gjx'x', Gjyy, Gjxy, Gjyx, Gjx'y, Gsyx', The following is proved in [9]. THEOREM 8: Any AEA class including all formulas which contain only atomic formulas in three of the four forms (xy, yx, x'y, yx') is undecidable; the class of all AEAformulas of the form WXXA U(xy,x'y) A V(yx,Yx'), that ofthe form U (xy, x'y) A V(xy, yx), that ofthe form Uiyx.yx') A V(xy, yx), are all undecidable, where W, U, Vare truth-functional expressions. Moreover, all these classes are reduction classes.
This completely settles the question of decidable and undecidable prefix subclasses of the predicate calculus. This is true even if we allow
310
HAO WANG
formulas in the extended prenex forms, i.e., formulas which are conjunctions of formulas in the prenex normal form. (Compare [9] and [21]). THEOREM 9: An extended prefix form class is a reduction type (and undecidable) if and only if either the prefix of at least one conjunct contains AEA or AAAE as an (order-preserving but not necessarily consecutive) substring, or there are two conjuncts of which the prefixes contain AAA and AE respectively. Moreover, it is decidable if and only if it contains no axioms of infinity. i.e., formulas which have only infinite models. 2.2. In [21], a simpler alternative proof of Theorem 8 is given which has two additional properties: (a) only a small fixed finite number of dyadic predicates are needed, together with arbitrarily many monadic predicates; (b) finite models are preserved in the reduction procedure so that a formula has a finite model if and only if its corresponding AEA formula has a finite model. DEFINITION 2: Consider classes of formulas of the predicate calculus. For any class X, let N(X), I(X), F(X) be the subclasses ofXwhichcontain all formulas in X which have respectively no model, only infinite models, finite models. If R is a reduction procedure which reduces a given class Y to y* and every subclass Z of Y to Z*, then R is said to be a conservative reduction procedure for Y, if (F(Y»* = F(Y*). The following two theorems are proved in [21]: THEOREM 10: If K is the class of all formulas of the predicate calculus and R is a conservative reduction procedure for K, then no two of the three classes N(K*), I(K*), F(K*) are recursively separable. THEOREM 11: If Z is the class of AEA formulas (or some suitable subclass of this, such as A 1 given below), then no two of the three classes N(Z), I(Z), F(Z) are recursively separable. In another direction, Kahr (see [8]) extends Theorem 8 to the following: THEOREM 12: A reduction class for the predicate calculus is the set A 1 offormulas with prefix AEA such that each formula of the set contains only monadic predicates and a single dyadic predicate.
This proof can be modified as in [21] to get Theorem 11 for A1 and to give a corresponding result for the prefix AAA ∧ AE, and therewith an alternative proof of Suranyi's similar result [18] for the more complex prefix AAA ∧ AAE.

2.3. In studying the AEA case, "dominoes" were first introduced in [20], and they have been found useful for the study. They are also of some independent interest and are reviewed here mainly for the remaining open problems. We assume there are infinitely many square plates (the domino types) of the same size (say, all of unit area) with edges colored, one color on each edge, though different edges may have the same color. The type of a domino is determined by the colors on its edges, and we are not permitted to rotate or reflect any domino. There are infinitely many pieces of every type. The game is simply to take a finite set of types and try to cover the whole first quadrant of the infinite plane with dominoes of these types so that all corners fall on the lattice points and any two adjoining edges have the same color.

DEFINITION 3: A (finite) set of domino types is said to be solvable if and only if there is some way of covering the whole first quadrant by dominoes of these types.

It is natural to use ordinary Cartesian coordinates and identify each unit square with the point at its lower left-hand corner. Then we can speak of the origin (0, 0), the main diagonal x = y, etc. The following general questions on these games have been considered:
DEFINITION 4: The (unrestricted) domino problem. To find an algorithm to decide, for any given (finite) set of domino types, whether it is solvable. The origin- (diagonal-, row-, column-) constrained domino problem. To decide, for any given set P of domino types and a subset Q thereof, whether P has a solution with the origin (the main diagonal, the first row, the first column) occupied by dominoes of types in Q.

THEOREM 13: All the constrained domino problems are unsolvable (see [9] and [21]).
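The game is concrete enough to experiment with. The sketch below is only an illustration (the representation and the fixed grid size are our own devices, not taken from [20] or [21]): it codes a domino type by its four edge colors and searches by backtracking for a tiling of an n-by-n square, optionally requiring the origin to carry a type from a distinguished subset Q, as in the constrained problems just defined. By the usual compactness (König's lemma) argument, a set is solvable in the sense of Definition 3 exactly when every such finite square can be tiled, which is what makes finite searches of this kind relevant.

```python
from typing import NamedTuple

class Tile(NamedTuple):
    # Edge colors of a unit square; rotation and reflection are not allowed.
    top: str
    right: str
    bottom: str
    left: str

def tile_square(types, n, origin_types=None):
    """Try to tile an n-by-n square with the given domino types so that
    adjoining edges carry the same color; optionally constrain the type
    placed at the origin (0, 0).  Returns {(x, y): Tile} or None."""
    grid = {}

    def candidates(x, y):
        allowed = origin_types if (x, y) == (0, 0) and origin_types else types
        for t in allowed:
            if x > 0 and grid[(x - 1, y)].right != t.left:
                continue            # must match the tile to the left
            if y > 0 and grid[(x, y - 1)].top != t.bottom:
                continue            # must match the tile below
            yield t

    def fill(k):                    # fill cells row by row, backtracking
        if k == n * n:
            return True
        x, y = k % n, k // n
        for t in candidates(x, y):
            grid[(x, y)] = t
            if fill(k + 1):
                return True
            del grid[(x, y)]
        return False

    return dict(grid) if fill(0) else None

# Example: a single type with all edges the same color tiles any square.
plain = Tile("r", "r", "r", "r")
assert tile_square([plain], 4) is not None
```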
The unrestricted domino problem remains open. In fact, as discussed in [20], there are two related open questions.1) Problem 1. Is the unrestricted domino problem solvable? Problem 2. Does every solvable domino set have a periodic solution? A positive solution of the second problem would yield also a positive solution of the first problem, but not conversely.

1) Recently (May 1964) Robert Berger has settled both questions in the negative.

The unrestricted domino problem is related to a special subclass of the AEA formulas with dyadic predicates only, viz., those of the form
U(G1xy, ..., GKxy; G1x'y, ..., GKx'y) ∧ V(G1yx, ..., GKyx; G1yx', ..., GKyx'),

or briefly,

(1)  U(xy, x'y) ∧ V(yx, yx'),
where U and V are truth-functional combinations of the components.

THEOREM 14: Given a domino set P, we can find a formula F_P of the form (1) such that P has a solution if and only if F_P has a model; conversely, given a formula F of the form (1), we can find a domino set P_F such that F has a model if and only if P_F has a solution. Hence, the unrestricted domino problem is undecidable if and only if the decision problem of the class of all formulas of the form (1) is unsolvable. (See [21].)
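For one direction of Theorem 14, the essential bookkeeping is that, once Gi(x, y) is read as "the square at (x, y) carries domino type i", the matrix of (1) only has to say which types may sit next to each other. The sketch below (our own naming, not the construction of [21]) computes those two relations; U(xy, x'y) can then assert that exactly one Gi holds at each point and that the pair of types at horizontally adjacent points lies in horiz, while V(yx, yx'), with its arguments swapped, does the same for vertical neighbours and vert.

```python
from collections import namedtuple

# Same representation as in the tiling sketch above: a type is given by its
# four edge colors, and rotation or reflection is not allowed.
Tile = namedtuple("Tile", "top right bottom left")

def compatibility(types):
    """Return (horiz, vert): the pairs of type indices (i, j) such that a
    j-domino may stand immediately to the right of (respectively, above)
    an i-domino.  These are exactly the data a formula of form (1) needs."""
    horiz = {(i, j) for i, s in enumerate(types) for j, t in enumerate(types)
             if s.right == t.left}
    vert = {(i, j) for i, s in enumerate(types) for j, t in enumerate(types)
            if s.top == t.bottom}
    return horiz, vert

# Example: with one self-matching type, every pair is compatible.
plain = Tile("r", "r", "r", "r")
assert compatibility([plain]) == ({(0, 0)}, {(0, 0)})
```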
2.4. Results on the degree of complexity of AEA formulas are announced in the preliminary report [10]. The whole paper has not been completed because of the unwieldy construction of the simulation. An outline, with proofs of the less combinatorial part, is reproduced here. The method of simulating Turing machines by domino sets with diagonal constraints, as developed in [9], can be extended to obtain a simulation of each Turing machine X, with all its numerical inputs, by a single domino set P_X such that, when X is viewed as a function from inputs to outputs, every diagonal-constrained solution of P_X satisfies the condition: if X(n) = 1, then K occurs at the point (α(n), α(n)); and if X(n) = 0, then K does not occur at (α(n), α(n)), where K is a domino type and α is a fixed monotone increasing recursive function. Expressing the solvability condition for P_X by an AEA formula, we can establish the following:
LEMMA 1: For every Turing machine X, there is an AEA formula F_X ≡ (x)(Eu)(y)Jxuy which contains a monadic predicate M, such that every model of F_X in the domain of natural numbers has the property that the model M* of M separates the sets {n : X(n) = 0} and {n : X(n) = 1}, and any such set can be used as M*, with models of other predicates being recursive.
More specifically, identify M*(α(n)) with K*(α(n), α(n)), and, for all k such that α(n) = k for no n, let M*(k) be true. In this way, we shall be able to choose an M* which is recursive in {n : X(n) = 0}. We shall leave the proof of the lemma out and discuss what consequences we can derive from it. In the intended model, all other predicates of F_X are recursive. We use the fact that if F has a model, then Jxx'y, x' being short for x+1, has a model in the domain of natural numbers. It can be shown that the formula has no finite models. A nonstandard model must also contain all the natural numbers. This seems sufficient for showing that any RE (recursively enumerable) predicate A is recursive in every model of M, when {x : X(x) = 0} and {x : X(x) = 1} are suitably chosen (see below). "Recursive in" is defined for natural numbers, but if M* also includes other objects, we seem to require a generalization of the concept, which can be done in the natural manner. In any case, it is true that A is recursive in M*, because A is recursive in the standard part of M* already. Further, the restriction to RE models also requires a definition for M* to be RE; one possibility is that its standard part is RE. It may be pointed out, incidentally, that if we require that F_X have a unique model relative to the domain of natural numbers and the successor function, then all the predicates must have recursive models, by the infinity lemma. Alternatively, we may also wish to relativize the definition of the given RE predicate A. Then we have to define A by a quantificational schema. Hence, we have to begin with all possible models. Another way of proceeding is to confine our attention to models in the domain of natural numbers, since otherwise recursive and RE are not defined. This last alternative seems the most natural way. In other words, we are only concerned with models of F_X in which the domain is the set of natural numbers and the existential quantifier
is replaced by the successor function. This is not regarded as a weakened condition, because otherwise we cannot talk about recursive and RE models. This is indeed the practice followed by earlier authors.

LEMMA 2: If A is RE, then there are disjoint RE sets B, C, which are ≤_T A, i.e., recursive in A, such that if an RE set D separates B and C, i.e., B ⊆ D, C ⊆ D̄, then A ≤_T D, i.e., A is recursive in D; in particular, A ≤_T B, and, hence, A ≡_T B.
This follows from the proof (though not the statement) of Theorem 1 in [17]. Observe that, unlike the case of recursive separability, we cannot infer from the existence of an RE set D separating B and C that there is an RE set E separating C and B, i.e., C ⊆ E and B ⊆ Ē, since D̄ is not RE unless D is recursive. The condition that D is an RE set is essential. Thus, if A is of degree 0', then D must be of degree 0' too. In an unpublished work, Dana Scott shows that there is a degree d <_T 0' such that any two disjoint RE sets are separable by a set (not necessarily RE) of degree ≤_T d.
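The sets B and C of Lemma 2 are easy to experiment with once Kleene's T-predicate is replaced by a finite stand-in. The sketch below is only an illustration of the construction in the proof of Lemma 2 given below, not part of it: toy "machines" are finite tables assigning to some inputs a unique halting step, (x)0 and (x)1 are read off as the exponents of 2 and 3 in x, and the two assertions check, over a finite range, the two facts the proof uses, namely that B and C are disjoint and that x lies in B ∪ C exactly when (x)0 lies in A.

```python
# Kleene's T is replaced by a finite table: STEPS[e] maps an input to the
# unique halting "time" of toy machine e; inputs absent from the table diverge.
STEPS = {
    0: {1: 3, 2: 5, 4: 2},          # machine 0: its domain plays the role of A
    1: {n: 1 for n in range(600)},  # a machine that halts quickly on everything
    2: {},                          # a machine that never halts
}
F = 0                               # the index f with: x in A iff (Ey)T(f, x, y)
BOUND = 50                          # exceeds every halting time in the table

def T(e, x, y):
    """Stand-in for Kleene's T: toy machine e on input x halts at step y."""
    return STEPS.get(e, {}).get(x) == y

def expo(x, p):
    """(x)_i read off as the exponent of the prime p in x."""
    k = 0
    while x % p == 0:
        x //= p
        k += 1
    return k

def in_A(w):
    return any(T(F, w, y) for y in range(BOUND))

def in_B(x):
    return any(T(F, expo(x, 2), y) and
               not any(T(expo(x, 3), x, z) for z in range(y + 1))
               for y in range(BOUND))

def in_C(x):
    return any(T(F, expo(x, 2), y) and
               any(T(expo(x, 3), x, z) for z in range(y + 1))
               for y in range(BOUND))

# The two facts used in the proof: B and C are disjoint, and
# x is in B or in C exactly when (x)_0 is in A.
for x in range(1, 500):
    assert not (in_B(x) and in_C(x))
    assert (in_B(x) or in_C(x)) == in_A(expo(x, 2))
```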
It appears likely that if we do not want the stronger result with the restriction to AEA formulas, we can combine Lemma 2 with familiar considerations to get a weaker form of Theorem 15. Thus, for example, we can write a more complex formula characterizing the machine X in Lemma 1.

PROOF of Lemma 2. By definition of RE, there exists g:

x ∈ A ≡ (Eu)(Ey)[x = U(y) ∧ T(g, u, y)] ≡ (Eu)(Ey)R(x, u, y).
Hence, there is f: x ∈ A ≡ (Ey)T(f, x, y). Let

x ∈ B ≡ (Ey)[T(f, (x)0, y) ∧ ¬(Ez)z≤y T((x)1, x, z)],
x ∈ C ≡ (Ey)[T(f, (x)0, y) ∧ (Ez)z≤y T((x)1, x, z)].
Clearly B and C are disjoint. Since

[(Ey)(Fy ∧ ¬Gy) ∨ (Ey)(Fy ∧ Gy)] ≡ (Ey)Fy,

(x ∈ B ∨ x ∈ C) ≡ (x)0 ∈ A. It is easy to see that B and C are recursive in A. Thus, if (x)0 ∉ A, then x ∉ B, x ∉ C. If (x)0 ∈ A, we can determine the unique y such that T(f, (x)0, y). Hence,

x ∈ B ≡ ¬(Ez)z≤y T((x)1, x, z),
x ∈ C ≡ (Ez)z≤y T((x)1, x, z).
Suppose now B ⊆ D, C ⊆ D̄, and D is RE. Choose e so that D = {x : (Ey)T(e, x, y)}. To determine whether w ∈ A, we ask just whether x = 2^w·3^e belongs to D and, if it does, whether (Ey)[y < μz T(e, x, z) ∧ T(f, w, y)].

Case 1. x ∉ D. We have then ¬(Ey)T(e, x, y). Hence, w ∉ A, because otherwise, if w ∈ A, then x ∈ B. But x ∈ B implies x ∈ D, contrary to hypothesis.

Case 2. x ∈ D. We can then find the unique z such that T((x)1, x, z). Since x ∈ D implies x ∉ C,

w ∈ A ≡ x ∈ B ∪ C ≡ x ∈ B.

But then w ∈ A ≡ (Ey)[y < z ∧ T(f, w, y)], because otherwise, i.e., if z ≤ y, then the second half of the condition for x ∈ B cannot be satisfied. Since B can serve as a D, A ≡_T B.

2.5. Since the class U of AEA formulas with dyadic predicates only is unsolvable and a reduction type, it is of interest to consider what
subclasses are solvable. The following is proved in [5] and a more "geometrical" alternative proof is given in [21]. Consider the four forms xy, yx, x'y, yx'. First take any three of them. From Theorem 8 above we know that any subclass of U which includes all formulas whose atomic formulas are in just these three forms is a reduction class and hence is undecidable. Now take any two of the four forms. Combining them with the other five forms yields a subclass of U. In this way we obtain six subclasses of U which divide into three pairs:
J = {xy, x'y},   J* = {yx, yx'},
L = {xy, yx},    L* = {x'y, yx'},
Q = {xy, yx'},   Q* = {yx, x'y}.
THEOREM 16: With the exception of subsets of Q and those of Q*, a class, determined by the forms of atomic formulas occurring, is decidable if and only if it contains at most two of the four forms xy, yx, x'y, yx'; it contains an axiom of infinity if and only if it contains three forms including either xy and x'y, or yx and yx'.
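Read as an algorithm on the letter forms occurring, Theorem 16 is a finite case check. The function below is our own direct transcription of its two criteria; it leaves the subsets of Q and Q* excepted by the theorem (and taken up in Problem 3 below) undetermined.

```python
FOUR = {"xy", "yx", "x'y", "yx'"}
Q, Q_STAR = {"xy", "yx'"}, {"yx", "x'y"}

def classify(forms):
    """Apply the criteria of Theorem 16 to the set of forms (among the four
    above) in which the dyadic predicates occur; further forms such as xx
    or yy do not enter the criteria.  Returns the pair
    (decidable?, contains an axiom of infinity?),
    or None for the subsets of Q and Q* excepted by the theorem."""
    f = set(forms) & FOUR
    if f <= Q or f <= Q_STAR:
        return None
    decidable = len(f) <= 2
    infinity = len(f) >= 3 and ({"xy", "x'y"} <= f or {"yx", "yx'"} <= f)
    return decidable, infinity

# Per the theorem, the three forms xy, x'y, yx give an undecidable class
# containing an axiom of infinity.
assert classify({"xy", "x'y", "yx"}) == (False, True)
```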
Problem 3. Is the class of AEA formulas with dyadic predicates only which occur only in contexts with xy, yx', xx, xx', x'x, x'x', yy decidable? Is the class decidable when dyadic predicates occur only in the contexts with xy, yx', xx?

Problem 4. Does either case contain an axiom of infinity, i.e., a formula that has only infinite models? If the answer to the latter question is no for either class, then, by familiar arguments, that class is solvable.

2.6. Suranyi has applied his reduction classes with the prefix AAA ∧ AAE to obtain reduction classes with more complex prefixes but fewer predicates. Denton has undertaken to study similar consequences of the AEA reduction:

THEOREM 17: The following classes are reduction classes: (a) Ey1 ... Eyn AxEyAz M (n = 1, 2, ...) with only one predicate, which is dyadic; (b) AxEy(Pxy ∧ (Pxx ≢ Pyy)) ∧ Az1 ... Azn M (and therewith AxEyAz1 ... Azn M) with only the predicate P.
Part (a) follows from Theorem 12 in exactly the same way as Suranyi's Theorem IV follows from his reduction class with prefix AAA ∧ AAE. Part (b) was announced in [3] and afterwards also proved by another argument using only Theorem 8. In [6], Gegalkine claims that the class of formulas AxEyFxy ∧ Az1 ... Azn M, with M containing any number of monadic and dyadic predicates, is decidable for finite satisfiability. Denton shows in [4] that this would contradict Theorem 11 for A1, and singles out the mistake in Gegalkine's paper. Further, he proves, by extending Ackermann's work [1], the following:

THEOREM 18: The class AxEyPxy ∧ Az1 ... Azn M, where M contains only the dyadic P and monadic predicates, is decidable for finite satisfiability.
Problem 5. Is the class AxEy1 ... Eyn Az M, with only a single predicate (dyadic), a reduction class?

Problem 6. Is the class in Theorem 18 (or even without the monadic predicates) a reduction class?
3. Sets

3.1. The basic axioms of the system ZF of set theory are extensionality, infinity (unconditional existence), and four axioms of conditional existence: (a) pairs, (b) sum set, (c) power set, (d) replacement (a schema). It is noted in [23] that these axioms (a)-(d) are equivalent to a single axiom (schema): if A is a one-many correlation, x is a set, and A″t = {u : (Ev)(v ∈ t ∧ Auv)}, then there is a set y, y = ΣA″πx (Σ is sum set, π is power set). Thus, if Auv is v = {u}, then Σx = ΣA″πx. If Auv is u = {v}, then πx = ΣA″πx. If Auv is (Ez)(Ew)(Gzw ∧ u = {z} ∧ v = {w}), then G″x = ΣA″πx. If 0 is (u = v ∧ v ≠ v)″x, and Guv is (u = a ∧ v = 0) ∨ (u = b ∧ v = {0}), then {a, b} = G″ππ0. The part about {a, b} is familiar from the literature. Previously, Bernays ([2], p. 65) had employed a similar schema y = ΣA″x to get (b) and (d). Ono ([13]) had introduced a schema which would yield (a), (b), (c), (d) if we add another axiom: (x)(Ey)(y = {x}). As it turns out, in a less explicit way, the same axiom is also needed for the result stated in [23], although Ono's schema is different.
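Because the instances of the schema above involve only Σ, π, and images, they can be checked mechanically on hereditarily finite sets. The sketch below is an illustration only (the schema itself concerns arbitrary sets, and the finite "universe" supplied to the image operation is our own device); it verifies the three identities Σx = ΣA″πx, πx = ΣA″πx, and {a, b} = G″ππ0 on a small example.

```python
from itertools import combinations

def sumset(x):                        # Σx
    return frozenset(u for v in x for u in v)

def powerset(x):                      # πx
    xs = list(x)
    return frozenset(frozenset(c) for r in range(len(xs) + 1)
                     for c in combinations(xs, r))

def image(A, t, universe):            # A″t = {u : (Ev)(v ∈ t ∧ Auv)}
    return frozenset(u for u in universe for v in t if A(u, v))

O = frozenset()                                         # 0
one, two = frozenset({O}), frozenset({O, frozenset({O})})
x = frozenset({one, two})                               # a small sample set

# If Auv is v = {u}, then ΣA″πx = Σx.  (Candidate u's are drawn from x ∪ Σx.)
A1 = lambda u, v: v == frozenset({u})
assert sumset(image(A1, powerset(x), x | sumset(x))) == sumset(x)

# If Auv is u = {v}, then ΣA″πx = πx.  (Candidate u's are the singletons {v}.)
A2 = lambda u, v: u == frozenset({v})
singletons = frozenset(frozenset({v}) for v in powerset(x))
assert sumset(image(A2, powerset(x), singletons)) == powerset(x)

# If Guv is (u = a ∧ v = 0) ∨ (u = b ∧ v = {0}), then G″ππ0 = {a, b}.
a, b = one, two
G = lambda u, v: (u == a and v == O) or (u == b and v == frozenset({O}))
assert image(G, powerset(powerset(O)), {a, b}) == frozenset({a, b})
```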
3.2. A different procedure is followed in [24] to get a schema which would yield all the axioms, including infinity and extensionality. A partial hull is a transitive set closed with respect to power sets, i.e., PH(x) if (1) Σx ⊆ x, (2) y ∈ x → πy ∈ x. A natural hull is closed also with respect to sum sets, i.e., NH(x) if PH(x) and (3) y ∈ x → Σy ∈ x. The natural hull ηa of a set a (or its partial hull) is the intersection of all natural (partial) hulls x such that (4) a ⊆ x.

THEOREM 19: In ZF, ηa can be shown to exist for each a, and to satisfy the conditions (1)-(4), as well as that of being the minimum; similarly for the partial hull of a with the conditions (1), (2), (4).
Let SE be obtained from the usual axiom of replacement by substituting ηx, or the partial hull of x, for the given set x, viz., the schema: If Huv is many-one, then (Ey)(y = H″ηx) (or the same with the partial hull of x in place of ηx). Let VE be obtained from SE by adding uniqueness, i.e., by substituting (E!y) for (Ey). Both SE and VE can be expressed in the primitive notation of ZF.

THEOREM 20: In the predicate calculus with equality, SE yields all existence axioms of ZF; VE is equivalent to all these axioms plus extensionality.
For some purposes, it is also useful to define other closures; e.g., ca, the transitive closure of a, would be the smallest x such that a ⊆ x and Σx ⊆ x.

3.3. The usual definitions of the class On of the Zermelo-Neumann ordinals do not reveal the intuitive picture of how the ordinals are obtained successively. It is possible to use a "genetic" definition roughly in the tradition of Frege and Dedekind. This approach has the advantage that we can also use different successor functions, e.g., x' = πx rather than x' = x ∪ {x}. Such different successor functions are useful, e.g., in studying the natural models of von Neumann and Bernays. Two definitions of the Zermelo-Neumann ordinals are found to be adequate:
D1. On1(x) when x belongs to every set u such that (1) v' ∈ u, if v ∈ u and v' ∈ x'; and (2) Σw ∈ u, if w ⊆ u and Σw ∈ x'.

D1*. On2(x) when, for every set u, there is w with w ∈ u and w = 0 (that is, 0 ∈ u), provided that x ∈ u, that for all v, v ∈ u if v' ∈ u, and that u ∩ w ≠ 0 whenever Σw ∈ u and Σw ⊆ x.
In fact, a general theorem on these ordinals is proved:

THEOREM 21: If for a predicate On(x) we can prove that, for every F, Fy if (1) On(y), (2) (v)(Fv → Fv'), and (3) (w)(w ⊆ F → F(Σw)), then every x which satisfies On(x) is a genuine ordinal. Hence, if On(x) is a property known to hold for all ordinals, then the definition is adequate.

When specialized to finite ordinals, we get:

DF. Nn1(x) when x belongs to every set u such that (1) 0 ∈ u if 0 ∈ x'; (2) v' ∈ u, if v ∈ u and v' ∈ x'.

DF*. Nn2(x) when for every u, 0 ∈ u if (1) x ∈ u, (2) v ∈ u if v' ∈ u.

These are also adequate definitions. In fact, DF* is one which had previously been studied by W. V. Quine and K. R. Brown. In all these developments, weak axioms are enough, viz., extensionality, Aussonderung, and self-adjunction (x)(Ey)(y = x ∪ {x}). To get also recursive definitions or transfinite recursions, some strengthening of the axioms in the standard manner is necessary.

References

[1] W. Ackermann, Beiträge zum Entscheidungsproblem der mathematischen Logik. Math. Annalen 112 (1936) 418-432.
[2] P. Bernays and A. Fraenkel, Axiomatic Set Theory (Amsterdam 1958).
[3] J. S. Denton, A Reduction Class with a Single Dyadic Predicate. Notices AMS 10 (1963) 124-125.
[4] J. S. Denton, A False Decision Procedure for the Halting Problem. Notices AMS 10 (1963) 125.
[5] B. Dreben, A. S. Kahr, and Hao Wang, Classification of AEA Formulas by Letter Atoms. Bulletin AMS 68 (1962) 528-532.
[6] I. Gegalkine, Problema razresimosti na konecnyh klassah. Ucenye Zapiski Mosk. Gos. Univ. 100 (1946) 155-212.
[7] P. K. Hooper, Some Small Multi-tape Universal Turing Machines. Notices AMS 10 (1963) 584.
[8] A. S. Kahr, Improved Reductions of the Entscheidungsproblem to Subclasses of AEA Formulas. Symposium on the Mathematical Theory of Machines, Brooklyn Polytechnic Institute (April 1962); Proceedings (New York 1963) 57-70.
[9] A. S. Kahr, Edward F. Moore, and Hao Wang, Entscheidungsproblem Reduced to the AEA Case. Proc. Nat. Acad. Sci. U.S.A. 48 (1962) 365-377.
[10] A. S. Kahr and Hao Wang, Degrees of RE Models of AEA Formulas. Notices AMS 10 (1963) 192-193.
[11] M. L. Minsky, Recursive Unsolvability of Post's Problem of Tag. Annals of Mathematics 74 (1961) 437-455.
[12] M. L. Minsky, Universality of (p = 2) Tag Systems. A.I. Memo No. 33 (Cambridge, Mass. 1962).
[13] K. Ono, A Set Theory Founded on Unique Generating Principle. Nagoya Mathematical Journal 12 (1957) 151-159.
[14] E. L. Post, Formal Reductions of the General Combinatorial Decision Problem. American Journal of Mathematics 65 (1943) 197-215.
[15] M. O. Rabin and Hao Wang, Words in the History of a Turing Machine with a Fixed Input. Journal ACM 10 (1963) 526-527.
[16] J. C. Shepherdson and H. E. Sturgis, Computability of Recursive Functions. Journal ACM 10 (1963) 217-255.
[17] J. R. Shoenfield, Degrees of Formal Systems. Journal of Symbolic Logic 23 (1958) 389-392.
[18] J. Suranyi, Reduktionstheorie des Entscheidungsproblems (Budapest 1959).
[19] Hao Wang, A Variant to Turing's Theory of Computing Machines. Journal ACM 4 (1957) 63-92.
[20] Hao Wang, Proving Theorems by Pattern Recognition, II. Bell System Technical Journal 40 (1961) 1-41.
[21] Hao Wang, Dominoes and the AEA Case of the Decision Problem. Symposium on the Mathematical Theory of Machines, Brooklyn Polytechnic Institute (April 1962); Proceedings (New York 1963) 23-55.
[22] Hao Wang, Tag Systems and Lag Systems. Math. Annalen (1963) 65-74.
[23] Hao Wang, A Universal Axiom of Conditional Set Existence. Notices AMS 10 (1963) 588.
[24] Hao Wang, Natural Hulls and Set Existence. Notices AMS 10 (1963) 594.