Provability, Computability and Reflection, Volume 50


Author: Lev D. Beklemishev


CONTRIBUTIONS TO MATHEMATICAL LOGIC
PROCEEDINGS OF THE LOGIC COLLOQUIUM, HANNOVER 1966

Edited by

H. ARNOLD SCHMIDT, K. SCHÜTTE, H.-J. THIELE

1968

NORTH-HOLLAND PUBLISHING COMPANY AMSTERDAM

© North-Holland Publishing Company - Amsterdam - 1968

No part of this book may be reproduced in any form by print, photoprint, microfilm or any other means without written permission from the publisher.

Library of Congress Catalog Card Number: 68-24434

PRINTED IN THE NETHERLANDS

PREFACE

An international logic colloquium was held at Hannover, Germany, in August 1966. Partly due to its favourable location just before the International Congress of Mathematicians at Moscow, international attendance was remarkable. One-hour addresses were delivered by R. O. Gandy, J. Surányi, and A. Tarski, 42 half-hour papers were presented, and there were more than 80 participants, half of them from foreign countries.

The colloquium was sponsored and substantially supported by the Division of Logic, Methodology and Philosophy of Science of the International Union for the History and Philosophy of Science. Additional support was given by the German Federal Government and the Technische Hochschule Hannover. The colloquium was organized by the Deutsche Vereinigung für mathematische Logik und für Grundlagenforschung der exakten Wissenschaften (DVMLG) as a European meeting of the Association for Symbolic Logic in cooperation with the British Logic Colloquium.

Topics dealt with at the colloquium ranged from mathematical logic, recursion theory, and intuitionistic mathematics to philosophy and history of mathematics and foundations and philosophy of physics. The present volume contains only a selection of the papers presented at Hannover, concentrating mainly on the more strictly logical and foundational subjects. Many of the papers published here have been revised or extended since their presentation at the colloquium.

The publication of the colloquium proceedings is overshadowed by the death of H. Arnold Schmidt, who presided at the Hannover Colloquium and also suggested the edition of the present volume. The German logicians mourn for H. Arnold Schmidt, who as founder and long-time president of the DVMLG deserved well of the promotion of mathematical logic in Germany.

May 1968

K. SCHÜTTE

SATURATED INTUITIONISTIC THEORIES *

P. H. G. ACZEL
St. Peter's College, Oxford

Introduction

According to the intuitionistic interpretation of the logical connectives, any verification of the sentence $\alpha \vee \beta$ must involve a verification of either $\alpha$ or of $\beta$. Also any verification of $\bigvee x\,\alpha(x)$ must involve a verification of $\alpha(a)$ for some $a$. This suggests that the collection of first order sentences that are verifiable in any intuitionistic mathematical theory forms what we call below a saturated theory.

The main result of this paper is Theorem 1. The proof of this theorem is almost identical to the Henkin proof of completeness of classical logic as presented for example by Lyndon [4]. A particular case of this theorem is that every intuitionistically consistent set of sentences can be extended to a saturated theory. Using this theorem we show that there is a close connection between saturated theories and the interpretation of intuitionistic logic given in Kripke [3]. In fact the family of all saturated theories partially ordered by inclusion forms what we call a Kripke structure.

In section 4 we give new proofs that a) the pure theory of intuitionistic predicate logic with a non-empty set of individual constants and b) the theory of Heyting Arithmetic are saturated. To do this we introduce a relation $\Gamma \Vdash \alpha$ between a set of sentences $\Gamma$ and a sentence $\alpha$. This relation is very similar to the relation $\Gamma \mid \alpha$ introduced in Kleene [2]. We then give a characterisation of this relation in terms of certain Kripke structures.

In the final section we suggest how the methods of classical model theory may be extended to apply to intuitionistic logic.

Note that we make free use of set-theoretic methods in this paper. We have attempted to give as smooth a generalisation of the semantics of classical logic as possible. Hence, of course, our results have no direct bearing on an intuitionistic semantics of formal intuitionistic logic.

* This work was carried out while the author received a grant from the Science Research Council. Most of the results are contained in a part of the author's thesis submitted for D. Phil.

1. Preliminaries

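We shall use a first order language $L$. This has a countable set Pred of predicate symbols, each one being $n$-ary for some integer $n$. In particular there is a 0-ary predicate symbol '$\Box$' denoting absurdity. The atomic formulae have the form $p(s_1, \ldots, s_n)$, where $p$ is an $n$-ary predicate symbol and each $s_i$ is either one of a countable set of variables, or an individual constant taken from some arbitrary set. The formulae are built up from the atomic formulae in the usual way, using the connectives $\vee$, $\wedge$, $\rightarrow$ and the quantifiers $\bigvee x$ (existential) and $\bigwedge x$ (universal). A sentence is a formula with no free variables.

If $\alpha$ is a sentence, $\mathrm{Ind}(\alpha)$ is the set of individual constants occurring in $\alpha$, and if $\Gamma$ is a set of sentences, $\mathrm{Ind}(\Gamma) = \bigcup \{ \mathrm{Ind}(\alpha) \mid \alpha \in \Gamma \}$. If $A$ is a set, $\mathrm{St}_A$ is the set of sentences $\alpha$ such that $\mathrm{Ind}(\alpha) \subseteq A$.

If $\alpha$ is a sentence we shall write $\vdash \alpha$ ($\vdash_c \alpha$) when $\alpha$ is a theorem of intuitionistic (classical) first order logic. We shall write $\Gamma \vdash \alpha$ ($\Gamma \vdash_c \alpha$) if there are $\beta_1, \ldots, \beta_n \in \Gamma$ such that $\vdash$ ($\vdash_c$) $\beta_1 \rightarrow (\beta_2 \rightarrow \ldots (\beta_n \rightarrow \alpha) \ldots)$. Let

$\mathrm{Cn}(\Gamma) = \{ \alpha \mid \mathrm{Ind}(\alpha) \subseteq \mathrm{Ind}(\Gamma) \text{ and } \Gamma \vdash \alpha \}$ and $\mathrm{Cn}_c(\Gamma) = \{ \alpha \mid \mathrm{Ind}(\alpha) \subseteq \mathrm{Ind}(\Gamma) \text{ and } \Gamma \vdash_c \alpha \}$.

DEFINITIONS
1) $\Gamma$ is a theory (classical theory) if $\mathrm{Cn}(\Gamma) = \Gamma$ ($\mathrm{Cn}_c(\Gamma) = \Gamma$);
2) $\Gamma$ is consistent ($\alpha$-consistent) if $\Box \notin \mathrm{Cn}(\Gamma)$ ($\alpha \notin \mathrm{Cn}(\Gamma)$ and $\mathrm{Ind}(\alpha) \subseteq \mathrm{Ind}(\Gamma)$);
3) $\Gamma$ is complete ($\alpha$-complete) if $\Gamma$ is consistent and $\beta \in \mathrm{Cn}(\Gamma)$ or $\beta \rightarrow \Box \in \mathrm{Cn}(\Gamma)$ for all $\beta$ such that $\mathrm{Ind}(\beta) \subseteq \mathrm{Ind}(\Gamma)$ ($\Gamma$ is $\alpha$-consistent and $\beta \in \mathrm{Cn}(\Gamma)$ or $\beta \rightarrow \alpha \in \mathrm{Cn}(\Gamma)$ for all $\beta$ such that $\mathrm{Ind}(\beta) \subseteq \mathrm{Ind}(\Gamma)$);
4) $\Gamma$ is prime if $\alpha \vee \beta \in \mathrm{Cn}(\Gamma)$ implies $\alpha \in \mathrm{Cn}(\Gamma)$ or $\beta \in \mathrm{Cn}(\Gamma)$;
5) $\Gamma$ is existential if $\bigvee x\,\alpha(x) \in \mathrm{Cn}(\Gamma)$ implies $\alpha(a) \in \mathrm{Cn}(\Gamma)$ for some individual constant $a$;
6) $\Gamma$ is saturated ($\alpha$-saturated) if it is a prime, consistent ($\alpha$-consistent) existential theory.

If $\mathrm{Ind}(\Gamma) \neq \emptyset$ and $\Gamma$ is consistent let $\mathfrak{A}_\Gamma$ be the (classical) relational system $\mathfrak{A}_\Gamma = (\mathrm{Ind}(\Gamma), \{p_\Gamma\}_{p \in \mathrm{Pred}})$ where $p_\Gamma = \{ (a_1 \ldots a_n) \mid p(a_1 \ldots a_n) \in \mathrm{Cn}(\Gamma) \}$ for each $n$-ary predicate symbol $p$.

If $\mathfrak{A}' = (A', \{p'\}_{p \in \mathrm{Pred}})$ is a relational system, let $\mathrm{Val}(\mathfrak{A}')$ be the set of those sentences of $\mathrm{St}_{A'}$ that are valid in $\mathfrak{A}'$. The following lemma will be useful below.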

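LEMMA 1.
1) Every $\alpha$-complete theory is prime.
2) The union of a chain of $\alpha$-consistent theories ($\alpha$-complete theories) is an $\alpha$-consistent theory ($\alpha$-complete theory).
3) $\Gamma$ is a saturated classical theory if and only if $\Gamma = \mathrm{Val}(\mathfrak{A}_\Gamma)$.

Proof. The proofs are all straightforward. We shall just prove 1). Let $\Delta$ be an $\alpha$-complete theory. Let $\beta \vee \gamma \in \Delta$. Then $\beta \vee \gamma \rightarrow \alpha \notin \Delta$ as $\Delta$ is $\alpha$-consistent. Hence $\beta \rightarrow \alpha \notin \Delta$ or $\gamma \rightarrow \alpha \notin \Delta$ as $\vdash (\beta \rightarrow \alpha) \wedge (\gamma \rightarrow \alpha) \rightarrow (\beta \vee \gamma \rightarrow \alpha)$. But $\Delta$ is $\alpha$-complete, so $\beta \in \Delta$ or $\gamma \in \Delta$, i.e. $\Delta$ is prime.

2. Completeness proof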

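The main result of this section is Theorem 1. The proof of this theorem follows closely the Henkin completeness proof of classical logic as given for example in Lyndon [4]. Lemma 2 is a generalisation of Lindenbaum's Lemma.

LEMMA 2. Every $\alpha$-consistent $\Gamma$ can be extended to an $\alpha$-complete theory $\Delta$.

Proof. Let $X$ be the set of $\alpha$-consistent theories $\Delta$ extending $\Gamma$ such that $\mathrm{Ind}(\Delta) = \mathrm{Ind}(\Gamma)$. Then $X$ is non-empty, as $\mathrm{Cn}(\Gamma) \in X$. Also $X$ is closed under unions of chains by 2) Lemma 1. Hence by Zorn's Lemma $X$ has a maximal element, $\Delta$ say. $\Delta$ is an $\alpha$-consistent theory. To show that $\Delta$ is $\alpha$-complete assume $\beta \rightarrow \alpha \notin \Delta$ and $\beta \in \mathrm{St}_{\mathrm{Ind}(\Delta)}$. Then $\Delta' = \mathrm{Cn}(\Delta \cup \{\beta\})$ is clearly $\alpha$-consistent and hence is in $X$. Also $\Delta \subseteq \Delta'$. But $\Delta$ is maximal in $X$. Hence $\Delta = \Delta'$ and so $\beta \in \Delta$. Thus $\Delta$ is $\alpha$-complete.

If $\Gamma$ is a set of sentences, a theory $\Delta$ is a rich extension of $\Gamma$ if $\Gamma \subseteq \Delta$ and $\bigvee x\,\beta(x) \in \Gamma$ implies $\beta(a) \in \Delta$ for some $a$.

LEMMA 3. Every $\alpha$-consistent $\Gamma$ has an $\alpha$-complete rich extension $\Delta$.

Proof. To each $\bigvee x\,\beta(x) \in \mathrm{Cn}(\Gamma)$ associate a new individual constant $a_\beta$. Then $\Delta_0 = \mathrm{Cn}(\Gamma \cup \{ \beta(a_\beta) \mid \bigvee x\,\beta(x) \in \mathrm{Cn}(\Gamma) \})$ is clearly a rich extension of $\Gamma$. We show that $\Delta_0$ is $\alpha$-consistent. Assume $\alpha \in \Delta_0$. Then there are $\bigvee x_1\,\beta_1(x_1), \ldots, \bigvee x_n\,\beta_n(x_n) \in \mathrm{Cn}(\Gamma)$ such that $\Gamma \cup \{\beta_1(a_{\beta_1}), \ldots, \beta_n(a_{\beta_n})\} \vdash \alpha$. Hence $\Gamma \cup \{\beta_1(a_{\beta_1}), \ldots, \beta_{n-1}(a_{\beta_{n-1}})\} \vdash \beta_n(a_{\beta_n}) \rightarrow \alpha$ and so $\Gamma \cup \{\beta_1(a_{\beta_1}), \ldots, \beta_{n-1}(a_{\beta_{n-1}})\} \vdash \bigvee x_n\,\beta_n(x_n) \rightarrow \alpha$, as $a_{\beta_n}$ occurs only in $\beta_n(a_{\beta_n})$. But $\Gamma \vdash \bigvee x_n\,\beta_n(x_n)$. Hence $\Gamma \cup \{\beta_1(a_{\beta_1}), \ldots, \beta_{n-1}(a_{\beta_{n-1}})\} \vdash \alpha$. Repeating the above we show eventually that $\Gamma \vdash \alpha$, which contradicts the initial hypothesis. Hence $\Delta_0$ is $\alpha$-consistent and by Lemma 2 can be extended to an $\alpha$-complete theory which clearly satisfies the lemma.

THEOREM 1. Every $\alpha$-consistent $\Gamma$ can be extended to an $\alpha$-saturated $\Delta$.

Proof. By Lemma 2, $\Gamma$ can be extended to an $\alpha$-complete theory $\Delta_0$. By repeated use of Lemma 3 there is a sequence $\Delta_0 \subseteq \Delta_1 \subseteq \ldots$ of $\alpha$-complete theories such that $\Delta_{n+1}$ is a rich extension of $\Delta_n$ for each $n$. By 2) Lemma 1, $\Delta = \bigcup_{n<\omega} \Delta_n$ is an $\alpha$-complete theory, and hence prime by 1) Lemma 1. $\Delta$ is also existential, for if $\bigvee x\,\beta(x) \in \Delta$ then $\bigvee x\,\beta(x) \in \Delta_n$ for some $n$ and hence $\beta(a) \in \Delta_{n+1} \subseteq \Delta$, as $\Delta_{n+1}$ is a rich extension of $\Delta_n$. Thus $\Delta$ is an $\alpha$-saturated extension of $\Gamma$.

Let $\Gamma \models \alpha$ iff $\Gamma \subseteq \Delta$ implies $\alpha \in \Delta$ for every saturated $\Delta$ such that $\mathrm{Ind}(\alpha) \subseteq \mathrm{Ind}(\Delta)$. We may reformulate Theorem 1 as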

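THEOREM 2 (INTUITIONISTIC ANALOGUE OF THE COMPLETENESS THEOREM FOR CLASSICAL LOGIC). $\Gamma \vdash \alpha$ iff $\Gamma \models \alpha$.

We shall need the following corollary of Theorem 1.

THEOREM 3.
1) Every $\alpha \rightarrow \beta$-consistent $\Gamma$ can be extended to a $\beta$-saturated theory $\Delta$ such that $\alpha \in \Delta$.
2) Every $\bigwedge x\,\alpha(x)$-consistent $\Gamma$ can be extended to an $\alpha(a)$-saturated theory $\Delta$, for some $a$.

Proof. 1) If $\Gamma$ is $\alpha \rightarrow \beta$-consistent then $\Gamma \cup \{\alpha\}$ is $\beta$-consistent and by Theorem 1 has a $\beta$-saturated extension $\Delta$.
2) If $\Gamma$ is $\bigwedge x\,\alpha(x)$-consistent then $\Gamma \cup \{\alpha(a) \rightarrow \alpha(a)\}$ is $\alpha(a)$-consistent for $a \notin \mathrm{Ind}(\Gamma)$. For $\mathrm{Ind}(\alpha(a)) \subseteq \mathrm{Ind}(\Gamma \cup \{\alpha(a) \rightarrow \alpha(a)\})$, and if $\Gamma \cup \{\alpha(a) \rightarrow \alpha(a)\} \vdash \alpha(a)$ then $\Gamma \vdash \alpha(a)$ and so $\Gamma \vdash \bigwedge x\,\alpha(x)$. Hence by Theorem 1, $\Gamma \subseteq \Gamma \cup \{\alpha(a) \rightarrow \alpha(a)\}$ has an $\alpha(a)$-saturated extension.

Let $\mathrm{card}(A)$ be the cardinality of the set $A$. Let $A \subseteq_c B$ if $A \subseteq B$ and $B - A$ and $B$ have the same infinite cardinality. Then $A \subseteq_c B$ implies there is a $C$ such that $A \subseteq_c C \subseteq_c B$. Theorem 4 below is a strengthening of Theorem 1. It is a generalisation of the Löwenheim-Skolem Theorem of classical logic.

THEOREM 4. Let $A$ be infinite and $\mathrm{Ind}(\Gamma) \subseteq_c A$. If $\Gamma$ is $\alpha$-consistent then $\Gamma$ has an $\alpha$-saturated extension $\Delta$ such that $\mathrm{Ind}(\Delta) \subseteq_c A$.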

Proof. By examining the proof of Theorem 1 it is clear that at most $\sum_{n<\omega} \mathrm{card}(\Delta_n)$ new constants are introduced to define $\Delta = \bigcup_{n<\omega} \Delta_n$. Also it is clear that $\mathrm{card}(\mathrm{St}_B) = \max(\omega, \mathrm{card}(B))$ for any set $B$. Hence if $m = \max(\omega, \mathrm{card}(\mathrm{Ind}(\Gamma)))$ then $\mathrm{card}(\Delta_0) = m$ and then we easily see by induction that $\mathrm{card}(\Delta_{n+1}) = m$ for each $n$. Hence at most $m \cdot \omega = m$ new individual constants are needed. But if $\mathrm{Ind}(\Gamma) \subseteq_c A$ then $\mathrm{card}(A) \geq m$. So we may select the new individual constants from a set $C$ such that $\mathrm{Ind}(\Gamma) \subseteq_c C \subseteq_c A$ and $\mathrm{card}(C) = \mathrm{card}(A)$. Then $\mathrm{Ind}(\Delta) \subseteq_c A$.

3. Kripke structures

The following definitions are basically due to Kripke [3] where they are presented in a different form. The only essential difference is that we allow the indexing set $I$ to be a class rather than just a set. This will be convenient below though of no major significance.

DEFINITION. $\mathfrak{K} = (\{\mathfrak{A}_i\}_{i \in I}, \leq)$ is a Kripke structure if $\leq$ is a reflexive transitive relation on the nonempty class $I$ and $\{\mathfrak{A}_i\}_{i \in I}$ is a family of relational systems $\mathfrak{A}_i = (\mathrm{Ind}(i), \{p_i\}_{p \in \mathrm{Pred}})$ where $\mathrm{Ind}(i)$ is the domain of $\mathfrak{A}_i$ and $p_i$ is an $n$-ary relation on $\mathrm{Ind}(i)$ if $p$ is an $n$-ary predicate symbol. This family is to satisfy $i \leq j$ implies $\mathfrak{A}_i$ [...] $i$ and all $a \in \mathrm{Ind}(j)$.

Let $\mathrm{Val}(\mathfrak{K}, i) = \{ \alpha \mid \mathfrak{K} \Vdash_i \alpha \}$. Then we may easily show that $i \leq j$ implies $\mathrm{Val}(\mathfrak{K}, i) \subseteq \mathrm{Val}(\mathfrak{K}, j)$.

LEMMA 4. For every intuitionistic relational system $(\mathfrak{K}, i)$, $\mathrm{Val}(\mathfrak{K}, i)$ is saturated.

Proof. By definition, $\mathrm{Val}(\mathfrak{K}, i)$ is closed under modus ponens. That $\alpha \in \mathrm{Val}(\mathfrak{K}, i)$ for all $\alpha \in \mathrm{St}_{\mathrm{Ind}(i)}$ such that $\vdash \alpha$ is shown by induction on the length of proof of $\alpha$ in the standard way. (See Kripke [3] Theorem 3.) Hence $\mathrm{Val}(\mathfrak{K}, i)$ is a theory. It is consistent, prime and existential by definition. Hence $\mathrm{Val}(\mathfrak{K}, i)$ is saturated.

Let $S$ be the class of consistent $\Gamma$ such that $\mathrm{Ind}(\Gamma) \neq \emptyset$. Then $\mathscr{S} = (\{\mathfrak{A}_\Gamma\}_{\Gamma \in S}, \subseteq)$ is clearly a Kripke structure.

THEOREM 5. $\Delta$ is saturated iff $\mathrm{Val}(\mathscr{S}, \Delta) = \Delta$.

Proof. The implication from right to left follows immediately from the previous lemma. For the converse implication we need to prove that for all saturated $\Delta$, $\mathscr{S} \Vdash_\Delta \alpha$ iff $\alpha \in \Delta$. This may easily be proved by induction on the length of $\alpha$ and the following six properties of a saturated $\Delta$.
1) not $\Box \in \Delta$ (as $\Delta$ is a consistent theory); $p(a_1 \ldots a_n) \in \Delta$ iff $(a_1 \ldots a_n) \in p_\Delta$ (as $\mathrm{Cn}(\Delta) = \Delta$ and by definition of $\mathfrak{A}_\Delta$);
2) $\beta \wedge \gamma \in \Delta$ iff $\beta \in \Delta$ and $\gamma \in \Delta$ (as $\Delta$ is a theory);
3) $\beta \vee \gamma \in \Delta$ iff $\beta \in \Delta$ or $\gamma \in \Delta$ (as $\Delta$ is a prime theory);
4) $\beta \rightarrow \gamma \in \Delta$ iff $\beta \in \Delta'$ implies $\gamma \in \Delta'$ for all saturated $\Delta' \supseteq \Delta$ (by 1) Theorem 3 and that each saturated $\Delta'$ is closed under modus ponens);
5) $\bigvee x\,\alpha(x) \in \Delta$ iff $\alpha(a) \in \Delta$ for some $a$ (as $\Delta$ is an existential theory);
6) $\bigwedge x\,\alpha(x) \in \Delta$ iff $\alpha(a) \in \Delta'$ for all saturated $\Delta' \supseteq \Delta$ and all $a \in \mathrm{Ind}(\Delta')$ (by 2) Theorem 3).

By this theorem, every saturated theory $\Delta$ has the form $\mathrm{Val}(\mathfrak{K}, i)$ for some intuitionistic relational system $(\mathfrak{K}, i)$. By a straightforward application of Theorem 4 it may be shown that $\mathfrak{K}$ may be chosen so that the indexing class $I$ is in fact a set, of the same cardinality as $\Delta$. We shall not prove this here.

As an immediate corollary of Theorem 5 and Theorem 2 we have the following completeness theorem for Kripke's semantics.

THEOREM 6. $\Gamma \vdash \alpha$ iff $\mathfrak{K} \Vdash_i \Gamma$ implies $\mathfrak{K} \Vdash_i \alpha$ for all $(\mathfrak{K}, i)$ such that $\mathrm{Ind}(\alpha) \subseteq \mathrm{Ind}(i)$ (where $\mathfrak{K} \Vdash_i \Gamma$ iff $\mathfrak{K} \Vdash_i \gamma$ for all $\gamma \in \Gamma$).
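The behaviour of Kripke's semantics can be illustrated on a small finite example. The following is a minimal sketch only, restricted to the propositional connectives and assuming the standard forcing clauses of Kripke [3]: an atom is forced at a node if it holds there, $\Box$ is never forced, $\wedge$ and $\vee$ are evaluated at the node itself, and an implication is forced at $i$ if it is respected at every $j \geq i$. The tuple encoding of formulae and the names `forces`, `NODES`, `LEQ` and `VAL` are hypothetical and are not taken from the paper.

```python
# Formulae are nested tuples (an encoding chosen for this sketch only):
#   ("atom", name), ("bot",), ("and", f, g), ("or", f, g), ("imp", f, g)

def forces(nodes, leq, val, i, f):
    """Return True iff node i forces the propositional formula f.

    nodes: list of node names
    leq:   set of pairs (i, j) meaning i <= j (reflexive and transitive)
    val:   dict mapping each node to the set of atoms true at it,
           assumed monotone along leq
    """
    tag = f[0]
    if tag == "atom":
        return f[1] in val[i]
    if tag == "bot":
        return False  # absurdity is never forced
    if tag == "and":
        return forces(nodes, leq, val, i, f[1]) and forces(nodes, leq, val, i, f[2])
    if tag == "or":
        return forces(nodes, leq, val, i, f[1]) or forces(nodes, leq, val, i, f[2])
    if tag == "imp":
        # an implication must be respected at every node above i
        return all(
            (not forces(nodes, leq, val, j, f[1])) or forces(nodes, leq, val, j, f[2])
            for j in nodes
            if (i, j) in leq
        )
    raise ValueError(f"unknown connective: {tag!r}")


# Two-node structure 0 <= 1 in which the atom p becomes true only at node 1.
NODES = [0, 1]
LEQ = {(0, 0), (0, 1), (1, 1)}
VAL = {0: set(), 1: {"p"}}

P = ("atom", "p")
NOT_P = ("imp", P, ("bot",))
LEM = ("or", P, NOT_P)              # p or not-p
NOT_NOT_P = ("imp", NOT_P, ("bot",))
```

In this structure node 0 forces neither `P` nor `NOT_P`, so `LEM` is not forced there, while `NOT_NOT_P` is. The set of formulae forced at node 0 is contained in the set forced at node 1, in line with the monotonicity of Val noted before Lemma 4.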


4. Two examples of saturated theories

In this section we use the results of the previous sections to show that Intuitionistic Predicate logic and Heyting Arithmetic both give rise to saturated theories. These results have been proved by Gödel, Gentzen, Harrop and Kleene, by more constructive methods. For references see Kleene [2]. We shall use a relation $\Gamma \Vdash \alpha$ very similar to the relation $\Gamma \mid \alpha$ of Kleene [2].

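DEFINITION. Assume $\Gamma \in S$.
1) If $\alpha$ is atomic, $\Gamma \Vdash \alpha$ iff $\Gamma \vdash \alpha$;
2) $\Gamma \Vdash \beta \wedge \gamma$ iff $\Gamma \Vdash \beta$ and $\Gamma \Vdash \gamma$;
3) $\Gamma \Vdash \beta \vee \gamma$ iff $\Gamma \Vdash \beta$ or $\Gamma \Vdash \gamma$;
4) $\Gamma \Vdash \beta \rightarrow \gamma$ iff $\Gamma \vdash \beta \rightarrow \gamma$ and ($\Gamma \Vdash \beta$ implies $\Gamma \Vdash \gamma$);
5) $\Gamma \Vdash \bigvee x\,\alpha(x)$ iff $\Gamma \Vdash \alpha(a)$ for some $a$;
6) $\Gamma \Vdash \bigwedge x\,\alpha(x)$ iff $\Gamma \vdash \bigwedge x\,\alpha(x)$ and $\Gamma \Vdash \alpha(a)$ for all $a \in \mathrm{Ind}(\Gamma)$.

If $\Gamma \in S$, let $\mathscr{S}_\Gamma = (\{\mathfrak{A}_\Delta\}_{\Delta \in S_\Gamma}, \subseteq)$, where $S_\Gamma = \{\Gamma\} \cup \{ \Delta \supseteq \Gamma \mid \Delta \text{ is saturated} \}$. Then $\mathscr{S}_\Gamma$ is a Kripke structure and as in Theorem 5 we can show that if $\Delta \supseteq \Gamma$ is saturated then $\mathscr{S}_\Gamma \Vdash_\Delta \alpha$ iff $\alpha \in \Delta$.

THEOREM 7. If $\Gamma \in S$, then $\Gamma \Vdash \alpha$ iff $\mathscr{S}_\Gamma \Vdash_\Gamma \alpha$.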

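Proof. It is sufficient to show that the relation $\mathscr{S}_\Gamma \Vdash_\Gamma \alpha$ has the same properties as the relation $\Gamma \Vdash \alpha$ as given by 1)-6) in the definition above. The only problematic cases are 4) and 6). We deal with 4); 6) may be dealt with similarly. We must show that

$\mathscr{S}_\Gamma \Vdash_\Gamma \beta \rightarrow \gamma$ iff $\Gamma \vdash \beta \rightarrow \gamma$ and ($\mathscr{S}_\Gamma \Vdash_\Gamma \beta$ implies $\mathscr{S}_\Gamma \Vdash_\Gamma \gamma$).

a) Assume $\mathscr{S}_\Gamma \Vdash_\Gamma \beta \rightarrow \gamma$. Then $\mathscr{S}_\Gamma \Vdash_\Gamma \beta$ implies $\mathscr{S}_\Gamma \Vdash_\Gamma \gamma$ by definition. Also $\mathscr{S}_\Gamma \Vdash_\Delta \beta \rightarrow \gamma$ for all saturated $\Delta \supseteq \Gamma$, i.e. $\beta \rightarrow \gamma \in \Delta$ for all saturated $\Delta \supseteq \Gamma$, i.e. $\Gamma \models \beta \rightarrow \gamma$ and, by Theorem 2, $\Gamma \vdash \beta \rightarrow \gamma$.

b) Conversely, assume $\Gamma \vdash \beta \rightarrow \gamma$. Then $\Gamma \models \beta \rightarrow \gamma$, i.e. $\mathscr{S}_\Gamma \Vdash_\Delta \beta \rightarrow \gamma$ for all saturated $\Delta \supseteq \Gamma$. So $\mathscr{S}_\Gamma \Vdash_\Delta \beta$ implies $\mathscr{S}_\Gamma \Vdash_\Delta \gamma$ for all saturated $\Delta \supseteq \Gamma$. Hence if $\mathscr{S}_\Gamma \Vdash_\Gamma \beta$ implies $\mathscr{S}_\Gamma \Vdash_\Gamma \gamma$, then $\mathscr{S}_\Gamma \Vdash_\Delta \beta$ implies $\mathscr{S}_\Gamma \Vdash_\Delta \gamma$ for all $\Delta \supseteq \Gamma$ such that $\Delta \in S_\Gamma$. Hence $\mathscr{S}_\Gamma \Vdash_\Gamma \beta \rightarrow \gamma$.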


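COROLLARY. If $\Gamma \in S$,
1) $\{ \alpha \mid \Gamma \Vdash \alpha \}$ is saturated;
2) $\Gamma \Vdash \alpha$ implies $\Gamma \vdash \alpha$;
3) $\mathrm{Cn}(\Gamma)$ is saturated iff $\Gamma \Vdash \alpha$ for all $\alpha \in \Gamma$.

Proof. 1) $\{ \alpha \mid \Gamma \Vdash \alpha \} = \{ \alpha \mid \mathscr{S}_\Gamma \Vdash_\Gamma \alpha \} = \mathrm{Val}(\mathscr{S}_\Gamma, \Gamma)$ is saturated by Lemma 4.
2) If $\Gamma \Vdash \alpha$ then $\mathscr{S}_\Gamma \Vdash_\Gamma \alpha$, so $\mathscr{S}_\Gamma \Vdash_\Delta \alpha$ for all saturated $\Delta \supseteq \Gamma$, i.e. $\alpha \in \Delta$ for all saturated $\Delta \supseteq \Gamma$, i.e. $\Gamma \models \alpha$. Hence $\Gamma \vdash \alpha$ by Theorem 2.
3) If $\mathrm{Cn}(\Gamma)$ is saturated then we may show by induction on the length of $\alpha$ that $\Gamma \Vdash \alpha$ iff $\alpha \in \mathrm{Cn}(\Gamma)$. Hence $\Gamma \Vdash \alpha$ for all $\alpha \in \Gamma$. Conversely, if $\Gamma \Vdash \alpha$ for all $\alpha \in \Gamma$, then $\Gamma \subseteq \{ \alpha \mid \Gamma \Vdash \alpha \}$. But $\{ \alpha \mid \Gamma \Vdash \alpha \}$ is saturated and so a theory. Hence $\mathrm{Cn}(\Gamma) \subseteq \{ \alpha \mid \Gamma \Vdash \alpha \}$. But by 2) $\{ \alpha \mid \Gamma \Vdash \alpha \} \subseteq \mathrm{Cn}(\Gamma)$. Hence $\mathrm{Cn}(\Gamma) = \{ \alpha \mid \Gamma \Vdash \alpha \}$ is saturated.

THEOREM 8. If $A \neq \emptyset$, $T_A = \{ \alpha \in \mathrm{St}_A \mid\ \vdash \alpha \}$ is saturated.

Proof. If $A \neq \emptyset$ then $T_A \in S$. Also $\alpha \in T_A$ implies $T_A \Vdash \alpha$, as $\{ \alpha \mid T_A \Vdash \alpha \}$ is a theory, and $T_A \Vdash \alpha$ implies $T_A \vdash \alpha$ implies $\alpha \in T_A$, as $T_A$ is a theory. Hence $T_A = \{ \alpha \mid T_A \Vdash \alpha \}$ is saturated.

COROLLARY.
$\vdash \alpha \vee \beta$ implies $\vdash \alpha$ or $\vdash \beta$;
$\vdash \bigvee x\,\alpha(x)$ implies $\vdash \alpha(a)$ for some $a$.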

The following definition is given by Robinson [5] in §2.

DEFINITION. $RH(\gamma)$ iff every occurrence of $\vee$ or $\bigvee x$ in $\gamma$ is in the antecedent $\beta_1$ of an implication $\beta_1 \rightarrow \beta_2$.

LEMMA 5. If $RH(\gamma)$ then $\Gamma \Vdash \gamma$ iff $\gamma \in \mathrm{Cn}(\Gamma)$.

Proof. By induction on the length of $\gamma$. Cases 3) and 5) of the definition of $\Gamma \Vdash \alpha$ will not arise. (See Robinson [5] 2.2.)

By combining Lemma 5 and 3) of the corollary of Theorem 7 we have

THEOREM 9. If $\Gamma \in S$, $\mathrm{Cn}(\Gamma)$ is saturated iff $\Gamma \Vdash \gamma$ for all $\gamma \in \Gamma$ such that not $RH(\gamma)$.
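The condition $RH$ is purely syntactic, so it can be checked by a recursion on the structure of a formula. The following is a minimal sketch, assuming the reading that an occurrence of $\vee$ or $\bigvee x$ is permitted whenever it lies anywhere inside the antecedent of some implication; under that reading only the consequent of an implication needs to be inspected. The tuple encoding of formulae and the function name `is_rh` are hypothetical and do not come from Robinson [5] or from this paper.

```python
# Formulae are nested tuples (an encoding chosen for this sketch only):
#   ("atom", name), ("bot",),
#   ("and", f, g), ("or", f, g), ("imp", f, g),
#   ("forall", var, f), ("exists", var, f)

def is_rh(f):
    """Return True iff every disjunction or existential quantifier in f
    occurs inside the antecedent of some implication."""
    tag = f[0]
    if tag in ("atom", "bot"):
        return True
    if tag in ("or", "exists"):
        # reached without passing through an antecedent: not allowed
        return False
    if tag == "and":
        return is_rh(f[1]) and is_rh(f[2])
    if tag == "forall":
        return is_rh(f[2])
    if tag == "imp":
        # the antecedent f[1] is unrestricted; only the consequent matters
        return is_rh(f[2])
    raise ValueError(f"unknown connective: {tag!r}")


P = ("atom", "p")
Q = ("atom", "q")

EXAMPLES = {
    "p or q": ("or", P, Q),
    "(p or q) -> p": ("imp", ("or", P, Q), P),
    "forall x (p -> exists y q)": ("forall", "x", ("imp", P, ("exists", "y", Q))),
    "p and ((exists x q) -> p)": ("and", P, ("imp", ("exists", "x", Q), P)),
}
```

With this check, Theorem 9 reduces the question of whether $\mathrm{Cn}(\Gamma)$ is saturated to the sentences of $\Gamma$ for which `is_rh` returns `False`.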

Theorem 9 allows us to give a simple proof of the following theorem.

THEOREM 10. If HA is a standard set of axioms for Heyting Arithmetic as presented for example in Kleene [1] then $\mathrm{Cn}(\mathrm{HA})$ is saturated.

(Note: The standard formulations of Heyting Arithmetic use function symbols. Although our results have only been proved for languages without function symbols, they are also true when function symbols do occur. There are some slight technical difficulties. We shall take $\mathrm{Ind}(\mathrm{HA})$ to be the set of non-negative integers. Then we need the result that to each constant term $t$ (i.e. a term built up from the non-negative integers using the function symbols $'$, $+$, $\cdot$) we may associate a non-negative integer $n_t$ such that $\mathrm{HA} \vdash t = n_t$. This ensures that the function symbols induce functions on $\mathrm{Ind}(\mathrm{HA})$.)

Proof. By examining the sentences of HA we see that the only axioms that do not have the property $RH$ are instances of the induction axiom. Hence by Theorem 9 it is sufficient to show that

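$\mathrm{HA} \Vdash \bigwedge x\,(\alpha(x) \rightarrow \alpha(x')) \rightarrow \bigwedge x\,(\alpha(0) \rightarrow \alpha(x))$

for every sentence $\bigwedge x\,\alpha(x)$, i.e. we must show that $\mathrm{HA} \Vdash \bigwedge x\,(\alpha(x) \rightarrow \alpha(x'))$ implies $\mathrm{HA} \Vdash \bigwedge x\,(\alpha(0) \rightarrow \alpha(x))$. So assume $\mathrm{HA} \Vdash \bigwedge x\,(\alpha(x) \rightarrow \alpha(x'))$. Then $\mathrm{HA} \vdash \bigwedge x\,(\alpha(x) \rightarrow \alpha(x'))$ and $\mathrm{HA} \Vdash (\alpha(n) \rightarrow \alpha(n'))$ for every integer $n$. Hence $\mathrm{HA} \vdash \bigwedge x\,(\alpha(0) \rightarrow \alpha(x))$, and if $\mathrm{HA} \Vdash \alpha(0)$ then by repeated use of modus ponens $\mathrm{HA} \Vdash \alpha(n)$ for all $n$. Hence $\mathrm{HA} \Vdash \alpha(0) \rightarrow \alpha(n)$ for all $n$, as $\mathrm{HA} \vdash \alpha(0) \rightarrow \alpha(n)$ for all $n$. Hence $\mathrm{HA} \Vdash \bigwedge x\,(\alpha(0) \rightarrow \alpha(x))$.

COROLLARY.
1) $\mathrm{HA} \vdash \alpha \vee \beta$ implies $\mathrm{HA} \vdash \alpha$ or $\mathrm{HA} \vdash \beta$;
2) $\mathrm{HA} \vdash \bigvee x\,\alpha(x)$ implies $\mathrm{HA} \vdash \alpha(n)$ for some integer $n$.

5. Final remarks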

The Henkin completeness proof of classical logic has a syntactic and a semantic part. The syntactic part consists in extending each consistent classical theory to a saturated classical theory. Theorem 1 is our generalisation of this. The semantic part consists in showing how a relational system $\mathfrak{A}_\Delta$ is associated with each saturated classical theory $\Delta$ so that $\mathfrak{A}_\Delta$ is a model of $\Delta$. Theorem 5 generalises this result.


There is one aspect which does not generalise. While there is a natural one-one correspondence between relational systems and saturated classical theories, there is no such correspondence between intuitionistic relational systems and saturated theories. This raises the problem: which is the more basic notion, that of an intuitionistic relational system or that of a saturated theory? We feel that the saturated theories are more basic for intuitionistic logic. While intuitionistic relational systems do not occur naturally in the literature, we have seen at least two important examples of saturated theories. (Kleene's various realisability interpretations give rise to further examples of saturated theories.)

As we now have a smooth generalisation of the classical completeness theorem it is natural to try to generalise classical model theory to intuitionistic logic. First observe that by using the 1-1 correspondence between relational systems and saturated classical theories all of classical model theory may be carried out in terms of saturated classical theories. In this form classical model theory is the theory of saturated classical theories. The intuitionistic version of classical model theory is then the theory of saturated (intuitionistic) theories.

We illustrate this idea by considering the ultraproduct construction. We first define the ultraproduct of saturated classical theories. Let

$$\prod_{i \in I} A_i / D = \{ a/{\sim} \mid a \in \prod_{i \in I} A_i \}$$

where

$$a/{\sim} = \{ b \in \prod_{i \in I} A_i \mid a \sim b \}$$

and $a \sim b$ iff $\{ i \mid a_i = b_i \} \in D$. If $\varphi$ is a formula containing no individual constants, but at most the free variables $x_0 \ldots x_{n-1}$, let $\varphi[a_0 \ldots a_{n-1}]$ be the sentence obtained from $\varphi$ by substituting the individual constants $a_i$ for $x_i$ in $\varphi$. If $\{\Delta_i\}_{i \in I}$ is a family of saturated classical theories, let

[...]

Then by Łoś's theorem, $\prod_{i \in I} \Delta_i / D$ is also a saturated classical theory. But we may apply the above construction to any family $\{\Delta_i\}_{i \in I}$ of sets of sentences. In particular we may show that if $\{\Delta_i\}_{i \in I}$ is a family of saturated (intuitionistic) theories then so is $\prod_{i \in I} \Delta_i / D$.


References

1. S. C. KLEENE, Introduction to metamathematics (Amsterdam, North-Holland Publ. Co., 1952; fourth printing 1964).
2. S. C. KLEENE, Disjunction and existence under implication in elementary intuitionistic formalisms, J. Symb. Logic 27 (1962) 11-18.
3. S. A. KRIPKE, Semantical analysis of intuitionistic logic I, in: Formal systems and recursive functions, eds. J. N. Crossley and M. A. E. Dummett (Amsterdam, North-Holland Publ. Co., 1965) 92-130.
4. R. C. LYNDON, Notes on logic (Van Nostrand Mathematical Studies, No. 6).
5. T. ROBINSON, Interpretations of Kleene's metamathematical predicate in intuitionistic arithmetic, J. Symb. Logic 30 (1965) 140-154.

DECISION PROBLEMS ABOUT ALGEBRAIC AND LOGICAL SYSTEMS AS A WHOLE AND RECURSIVELY ENUMERABLE DEGREES OF UNSOLVABILITY¹

W. W. BOONE
Institute for Advanced Study and University of Illinois

Arbitrary recursively enumerable degree analogues of the unsolvability results of Markov, Addison, Feeney, Adjan and Rabin for algebraic systems are obtained and related questions discussed. In particular it is shown that for any recursively enumerable degree of unsolvability $D$, there is a recursive class $\mathfrak{C}(D)$ of finite presentations of groups such that the isomorphism problem between members of $\mathfrak{C}(D)$ has degree $D$. The paper On recursively unsolvable problems in topology and their classification, by W. W. Boone, W. Haken and V. Poénaru, in this same volume, is a topological sequel.

1. Introduction

Certain contemporary research has been concerned with the question of specifying examples of algebraic or logical systems of a certain kind, such that the word problem or decision problem is of an arbitrarily preassigned recursively enumerable degree of unsolvability. In this situation, each decision problem constructed is concerned with the equality of an arbitrary pair of words, or the deducibility of an arbitrary well-formed formula, belonging to some one algebraic or logical system. In the present context, it is natural to refer to decision problems of this kind as "local decision problems".

The present paper is about decision problems of arbitrarily preassigned recursively enumerable degree of unsolvability and having to do with algebraic and logical systems as a whole; i.e., decision problems concerned with recognizing that a system of a certain kind enjoys a specified property or with recognizing that two systems of a certain kind are in a specified relation. These latter decision problems we shall call "global decision problems". E.g., does a Thue system of a particular recursive class (depending on the given degree) satisfy the cancellation law? Are two finitely presented groups, each of which belongs to a particular recursive class (depending on the given degree), isomorphic?

As will be seen, results of this kind concerning global decision problems can be obtained in a rather uniform way, and without appeal to the far more difficult arbitrary degree results about local decision problems². In the case of decision problems to determine whether or not an algebraic system of a certain kind enjoys some specified property satisfying the well-known conditions of Markov, our uniform method consists of an easy conversion of the standard unsolvability results of Markov [24, 25]³, Addison [1], Feeney [18], Adjan [2] or Rabin [40] regarding the kind of system being considered (Results 1, 2 and 3 below).

The argument is more subtle in the case of the decision problem to determine whether or not two algebraic systems are isomorphic (Results 4 and 5 below). But here too the argument does not depend on the existence of systems with word problem of preassigned degree. However in the case of groups, certain theorems about free products with amalgamation are required.

We also consider certain meta-decision problems about groups which are formed by compounding the kind of "first order" decision problems about groups we have just been describing (Results 6 and 7 below). It is a possibility that some of these meta-problems could be analyzed by the results and methods of Boone and Rogers [9], but this is a matter that has not yet been investigated⁴. In the proof of Results 6 and 7, but only here, we do use the constructive existence of a finite presentation of a group with word problem of preassigned degree of unsolvability. We are indebted to C. G. Jockusch Jr. for showing the uniform construction aspect of Results 6 and 7; his interesting argument is given in a supplement.

Result 8 below is an unsolvability result regarding the homomorphic images of a fixed finitely presented group (cf. Baumslag et al. [3]).

The reader will see that the recursive constructions of Results 1 through 8 have a certain character of explicitness, e.g., a recursive class of finite presentations of groups is always given by means of some generic presentation depending on a single parameter ranging over some recursive set of words. An alternative, more function-theoretic, argument for Results 1, 2 and 3 is available from the fact that for any recursively enumerable set $S$, there is a 1-1 total recursive function with recursive range, by which $S$ is reducible to the complete set $K$ (see Kleene [23], p. 343), but the explicitness of our constructions for Results 1, 2 and 3 would then be lost.

In dealing with decision problems to determine whether or not a logical system of a certain kind enjoys some specified property, the same kind of trick as used in the algebraic setting works in a wide range of cases. But in the present paper we concentrate completely on the algebraic side, except for the statement of a general but contingent result covering logical systems (Result 9 below). The application of Result 9, and its connection with our algebraic results, is discussed in a final short section.

Based on ideas of Markov [26] and using Results 4 and 7 of this paper, topological results about recursively enumerable degrees are shown in the article in the present volume by Boone, Haken and Poénaru.

Because this paper and the topological sequel cut across three fields (logic, algebra and topology), the reader from time to time may find himself reminded about facts in his specialty which are quite well-known to him. No offense will be taken if he skips. On the other hand, both the topological sequel and the Introduction to Boone [6] contain still more general background material.

¹ Certain of these results were presented to the meeting of the Association for Symbolic Logic at Leeds, August 1962 (abstract, J. Symb. Logic 27 (1962) 275-6); others, in particular Result 4, to the International Congress of Mathematicians, Moscow 1966. This work was supported by U. S. National Science Foundation Grants No. GP-4616 and GP-6132. At the kind invitation of the editor, Kurt Schütte, it is included in these Proceedings because of its close connection with another paper herein. A point requiring clarification in the above-mentioned abstract is explained in footnote 2.

² Regarding V of the abstract in the J. Symb. Logic which is mentioned in footnote 1, we note that this was written before the author was aware that these more difficult results are not needed for the kind of "global" decision problems just described. But as explained below, we do use the existence of a finitely presented group with word problem of given recursively enumerable degree in our treatment of certain related meta-decision problems.

³ Numbers in square brackets refer to the bibliography given at the end of the article On recursively unsolvable problems in topology and their classification, in this same volume, on pp. 72-74.

⁴ In particular we have not attempted to use Boone and Rogers [9] and the results of this paper to locate the various kinds of decision problems here considered in the Kleene-Mostowski Hierarchy (see Kleene [23] or Rogers [42]).

2. Statement of Results If P is an algebraic property of finitely presented groups, i.e., a property preserved under isomorphism between finitely presented groups5, then we shall say that P is a Markovproperty ofgroups if: (1) there exists at least one To require that P be preserved under isomorphism between any two groups is too restrictive in that interesting properties would then be excluded. E.g., while the recursively enumerable degree of the word problem for a finitely presented group is independent of the presentation, this is not, in general, the case for groups given by an infinite but recursive set of generators and defining relations.

16

W. W. BOONE

finitely presented group G P enjoying P ; (2) there exists at least one finitely presented group G,, which cannot be embedded in any finitely presented group enjoying P. The definitions of Markov property of semi-groups and Markov property of cancellation semi-groups are obtained from that just given by everywhere substituting “semi-group” and “cancellation semigroup”, respectively, for “group”. RESULT1. Let P be an arbitrary Markov property of groups and D an arbitrary recursively enumerable degree of unsolvability. Then there exists a recursive class C ( P , D ) offinite presentations of groups, such that the problem to determine of an arbitrary member of C(P,D), say 17, whether or not the group presented by 17 enjoys P is a problem of degree D. Indeed, the situation is as follows, where II, and n- are, respectively, any finite presentations of the groups G, and G , , of the definition of Markov property, and where %RD is any Turing Machine which semicomputes the characteristic function of a recursively enumerable set of natural numbers of degree D: we exhibit a recursive construction such that for any triple (n,, L‘-p7!lJlD) to which this construction is applied, the result is a class E(P,D) as described in the preceding paragraph. It can happen that P is a Markov property of semigroups, while the restriction of P to finitely presented groups is not a Markov property of groups; for example, take P to be the property of being a group, or the property of being embeddable in a group. By trivial arguments along the same lines, one sees that there are no implications among (i) Markov’s Theorem (Markov [24, 251); (ii) the Addison-Feeney Theorem (Addison [1l7 Feeney [18]); (iii) the Adjan-Rabin Theorem (Adjan [2], Rabin [40]). The present Results 1,2,3 are similarly independent of each other.


RESULT 2. The analogue of Result 1 for cancellation semi-groups.

RESULT 3. The analogue of Result 1 for semi-groups.

We shall not here dwell on the question of specific important examples of Markov properties, but refer the reader to Markov [24, 25] or Rabin [40]. In certain special cases the P considered is not Markov but not-P is Markov, e.g., the property of being infinite. For this kind of P, note that the essential content of Results 1, 2 and 3 still holds. As a somewhat more involved example, but one which we shall need later, suppose $D_0$ is a recursively enumerable degree of unsolvability. Then the property of having a word problem of degree $D_0$ is a Markov property of groups, cancellation semi-groups, and semi-groups if^6 $D_0 \neq 0'$; but the property of not having a word problem of degree $D_0$ is a Markov property if $D_0 = 0'$. In any event, we then have from Result 1 that for an arbitrary pair of recursively enumerable degrees of unsolvability, $D_0$, $D_1$, there exists a recursive class $\mathfrak{C}(D_0, D_1)$ of finite presentations of groups, such that the problem to determine of an arbitrary member of $\mathfrak{C}(D_0, D_1)$, say $\Pi$, whether or not the word problem for $\Pi$ is of degree $D_0$ is itself a problem of degree $D_1$.

Rabin [40] showed various important algebraic properties of groups, e.g., being simple, to be not "recursively recognizable" as easy consequences of his main theorem (Theorem 2.1) rather than as special cases; i.e., Rabin circumvented a demonstration that the property being considered, or its negation, is a Markov property of groups. We conjecture that the arguments used by Rabin for these particular properties can be so modified as to furnish analogues of our Result 1 for these particular properties.

Where $\mathfrak{C}$ is a class of finite presentations of groups, the isomorphism problem for $\mathfrak{C}$ is the problem to determine for two arbitrary members of $\mathfrak{C}$, say $\Pi$ and $\Pi'$, whether or not $\Pi$ and $\Pi'$ present the same group; and the triviality problem for $\mathfrak{C}$ is the problem to determine for an arbitrary member of $\mathfrak{C}$, say $\Pi$, whether or not $\Pi$ presents the trivial group, i.e., the group with only one element. And similarly with "group" replaced by "semi-group", or "cancellation semi-group".

RESULT 4. For any recursively enumerable degree of unsolvability D, there exists a recursive class $\mathfrak{C}(D)$ of finite presentations of groups such that the isomorphism problem for $\mathfrak{C}(D)$ has degree D; moreover, the triviality problem for $\mathfrak{C}(D)$ also has degree D.

Indeed, where $\mathfrak{M}_D$ is as described in Result 1, we exhibit a recursive construction, such that for any $\mathfrak{M}_D$ to which this construction is applied, the result is a class $\mathfrak{C}(D)$ as described in the preceding paragraph.
All of the presentations of $\mathfrak{C}(D)$ have the same set of generators, say $g_1, g_2, \ldots, g_J$; and all of the presentations have the same number of defining relations, say I. Result 4 will be used in the topological sequel.

RESULT 5. The analogue of Result 4 for semi-groups.

Since any finite presentation of a group is a Thue system (and a finite presentation of a cancellation semi-group), Result 4 implies Result 5 (as well as the analogue of Result 4 for cancellation semi-groups). Nevertheless, before giving the proof of Result 4 we shall give a direct proof of Result 5. We do this because the argument about Thue systems, free of group-theoretic complications, exposes the motivating ideas of both proofs.

^6 By 0' we mean the highest degree for recursively enumerable sets, that is, the degree to which all other degrees of recursively enumerable sets are Turing reducible.

Adjan and Rabin have shown the meta-word problem unsolvable. See, e.g., Rabin [40], p. 187, Theorem 2.16. Special cases of Results 6 and 7 are results of this type; e.g., that there exists no recursive procedure to determine of an arbitrary class $\mathfrak{C}$ of finite presentations of groups whether or not the isomorphism problem for $\mathfrak{C}$ is solvable. Or, alternatively, whether or not the problem to determine whether or not an arbitrary member of $\mathfrak{C}$ is trivial, abelian, etc. is recursively solvable.

RESULT 6. Let P be an arbitrary Markov property of groups and D, D' an arbitrary pair of recursively enumerable degrees of unsolvability. Then there is a recursive family $\mathfrak{F}(P, D, D')$ of recursive classes of finite presentations of groups such that the problem to determine of an arbitrary class of $\mathfrak{F}(P, D, D')$, say $\mathfrak{C}$, whether or not a certain decision problem depending on $\mathfrak{C}$ has degree D is itself a decision problem having degree D'. The decision problem depending on $\mathfrak{C}$ is the problem to determine of an arbitrary member of $\mathfrak{C}$, say $\Pi$, whether or not the group presented by $\Pi$ enjoys P. In other words, using a notation for decision problems of Boone [6, 7],

$$(?\mathfrak{C},\ \mathfrak{C} \in \mathfrak{F}(P, D, D'))\ \big[\,(?\Pi,\ \Pi \in \mathfrak{C})\,[\Pi \text{ enjoys } P] \text{ has degree } D\,\big]$$

has degree D'.

Indeed, using the notation of the second paragraph of Result 1, we can say we exhibit a recursive construction such that for any quadruple $(\Pi_P, \Pi_{\sim P}, \mathfrak{M}_D, \mathfrak{M}_{D'})$ to which this construction is applied, the result is a family $\mathfrak{F}(P, D, D')$ as described in the preceding paragraph of the present Result 6.

RESULT 7. For any pair of recursively enumerable degrees of unsolvability D and D', there is a recursive family $\mathfrak{F}(D, D')$ of recursive classes of finite presentations of groups such that the problem to determine of an arbitrary class of $\mathfrak{F}(D, D')$, say $\mathfrak{C}$, whether or not the isomorphism problem for $\mathfrak{C}$ has degree D is itself a decision problem having degree D'.

Indeed, we show this result by exhibiting a uniform construction, analogous in the obvious way to the uniform construction of the earlier Results. Result 7 will be used in the topological sequel.

In Baumslag et al. [3] it was shown that for an algebraic property P, of a certain kind, one can exhibit a fixed finite presentation of a group such that to determine whether or not a given subgroup enjoys P is recursively unsolvable. (Here a subgroup is given by a finite set of words by which it is generated.) An analogous result holds for homomorphisms of a fixed finitely presented group: for below we shall show how to reinterpret Rabin [40] in an obvious way so as to obtain the following.

RESULT 8. Let P be an arbitrary Markov property of groups. Then there exists a finite presentation of a group $U_P$ such that the problem to determine of an arbitrary homomorphic image $G_P^h$ of the group $G_P$ presented by $U_P$ whether or not $G_P^h$ enjoys P is recursively unsolvable. (Here a $G_P^h$ is given by a finite set of equations on the generators of $U_P$.)

The present paper leaves a sizable open question: What sort of arbitrary degree (recursively enumerable or otherwise^7) analogues of Result 8 hold? Even the question of what is the most natural way to formulate such possible results is unclear in certain respects, and we do not go into the matter further here^8. Of course the same sort of questions arise with regard to possible arbitrary recursively enumerable degree analogues of certain of the results in Baumslag et al. [3].

We now deal briefly with logical systems (or "canonical systems", in Post's terminology). We take as understood the notion of a logical system given in an effective way by primitive symbols, by a definition of well-formed formula, and by rules of inference, as well as the notion of the decision problem (as to theoremhood) for such a system. Of course Thue systems, etc., discussed above are logical systems, but so also are partial propositional calculi, Turing machines, Post normal systems. The results stated above for algebraic systems are rather definitive as compared to Result 9, which is contingent and so general as to be almost without content. We state Result 9 only to expose the fact that the same kind of trick used for the earlier algebraic results can be used to obtain analogous results for various logical systems used in symbolic logic.
The author became aware of this situation in working with his students Brian Mayoh, W. E. Singletary and Anne Ihrig Yasuhara; and we refer the reader to their papers. E.g., to see how Result 9 is to be used to obtain propositional calculi analogues of our earlier results, the reader is referred to Ihrig [21], or Singletary [46].

^7 To the non-logician we should explain that we need not restrict ourselves to recursively enumerable sets (problems) in considering the Turing reducibility of one set (problem) to another. See, e.g., Rogers [42].

^8 Once the reader has understood our proofs given below, he should be able to state and show an arbitrary recursively enumerable degree analogue of Result 8 in which a recursive class of homomorphic images (depending on the preassigned degree) is involved.


RESULT 9. Let S be a logical system; $\mathfrak{C}$, a recursive class of well-formed formulas of S such that the problem to determine of an arbitrary well-formed formula A of $\mathfrak{C}$ whether or not $\vdash A$ in S has recursively enumerable degree D. Let P be a property of logical systems; $\mathfrak{R}$, a recursive class of logical systems such that for a certain recursive one-one correspondence f between $\mathfrak{C}$ and $\mathfrak{R}$, for each A in $\mathfrak{C}$, $\vdash A$ in S if and only if f(A) enjoys P. Then it follows that the problem to determine of an arbitrary logical system S' of $\mathfrak{R}$ whether or not S' enjoys P has recursively enumerable degree D.

This Result follows directly from the definitions involved. By degree we mean, as elsewhere in the paper, Turing degree; but obvious variations and generalizations of Result 9 will occur to the reader. We have simply stated Result 9 in the form almost always desired for applications. We discuss this matter a little further at the very end of the paper.
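The transfer of degree asserted in Result 9 can be illustrated by a minimal Python sketch. Everything named here is hypothetical and chosen only for illustration: `is_theorem` stands in for an oracle answering "is A a theorem of S?", the strings `"system-n"` stand in for the logical systems of the second class, and `f`, `f_inverse` stand in for the recursive one-one correspondence and its inverse (assumed computable). The sketch only shows that each decision procedure is obtained from the other by composing with the correspondence.

```python
def decide_property_from_theoremhood(is_theorem, f_inverse):
    """Given an oracle for theoremhood on the class of formulas, decide
    'S-prime enjoys P' by pulling S-prime back along the correspondence."""
    return lambda system: is_theorem(f_inverse(system))


def decide_theoremhood_from_property(enjoys_p, f):
    """Conversely, decide theoremhood of A by asking whether f(A) enjoys P."""
    return lambda formula: enjoys_p(f(formula))


# Toy stand-ins (assumptions, not taken from the paper):
# formulas are natural numbers and the 'theorems' are the multiples of 3.
def is_theorem(n):
    return n % 3 == 0


def f(n):
    return "system-" + str(n)


def f_inverse(name):
    return int(name.split("-")[1])


enjoys_p = decide_property_from_theoremhood(is_theorem, f_inverse)
recovered = decide_theoremhood_from_property(enjoys_p, f)
```

In this toy setting `recovered` agrees with `is_theorem` on every input, which is the two-way reducibility that gives both problems the same degree.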

3. Proofs of Results

Proofs of Results 1, 2 and 3. We first show the following lemma whose proof is easily obtained from an argument for the unsolvability of the word problem for groups. In paralleling Adjan [2] and Rabin [40], we shall use this Lemma 1 rather than the existence of a finitely presented group whose word problem, with all the words being taken into account, is of preassigned recursively enumerable degree. This is why the total argument from first principles needed for Result 1 is so comparatively simple.

LEMMA 1. For any recursively enumerable degree of unsolvability D, there exists a finite presentation of a group $\Pi_D$ such that for a certain recursive class of words of $\Pi_D$, say R, containing 1, the empty word, the problem to determine for an arbitrary word w of R whether or not w = 1 in $\Pi_D$ has degree D.

Indeed, we exhibit a recursive construction which, when applied to $\mathfrak{M}_S$, a Turing machine which semicomputes the characteristic function of a recursively enumerable set S of natural numbers of degree D, yields a finite presentation of a group $\Pi_D$ as described in the first paragraph.

We stipulate that the machine $\mathfrak{M}_S$, which has internal configurations $q_1, q_2, \ldots, q_N$ and tape symbols $s_1, s_2, \ldots, s_M$, semicomputes the characteristic function of the set S under the following arrangement^9. Suppose $s_1$ is printed on the tape of $\mathfrak{M}_S$ in each of n + 1 consecutive squares, $\mathfrak{M}_S$ is put into internal configuration $q_1$ with the right-most square on which $s_1$ is printed being scanned, and $\mathfrak{M}_S$ is started running; then $n \in S$ if and only if $\mathfrak{M}_S$ eventually enters $q_N$.

^9 See Boone [6], p. 531, for a detailed account.

Post [38] explicitly constructs a Thue system T'' from $\mathfrak{M}_S$. This system T'' has as symbols $s_0, s_1, s_2, \ldots, s_M, q_1, \ldots, q_N, q_{N+1}, q_{N+2}$ and defining relations that are obtained from the table of operations of $\mathfrak{M}_S$ by simple modifications. In effect, Post [38] shows that

$$n \in S \text{ if and only if } h s_1^{n+1} q_1 h = h q_{N+2} h \text{ in } T''.$$

Let T be the Thue system obtained from T'' by adding the symbol $q_0$ and the defining relation $h q_{N+2} h = q_0$. Then, trivially,

(1) $n \in S$ if and only if $h s_1^{n+1} q_1 h = q_0$ in T.

By Lemmas 6 and 7 of Britton [10]^10, we have that for a certain finite presentation of a group G (see pp. 22-23 of Britton [10]) constructed directly from T

(2) $n \in S$ if and only if $\Gamma(n)\, k\, (\Gamma(n))^{-1} k^{-1} = 1$ in G.

Here $\Gamma(n)$ is our abbreviation for $(h s_1^{n+1} q_1 h)^{-1}\, t\, (h s_1^{n+1} q_1 h)$; the generators of G are the symbols^11 of T, and certain additional ones including k and t. This shows Lemma 1; for the presentation G is the desired $\Pi_D$, and the recursive class of words $\{\Gamma(n)\, k\, (\Gamma(n))^{-1} k^{-1}\}_{n=0}^{\infty}$ together with 1 is the desired R.

^10 These lemmas are stated as Lemma 12.6, on page 272 of Rotman [43].

^11 Writing Post's symbol h as an additional s.

It is now an easy matter to complete the argument for Result 1 by an analysis of Rabin [40]. (Our discussion here can also be followed from review # 1611 of Mathematical Reviews 22 (1961), last paragraph of the first column on p. 281 through the first two paragraphs of the second column.) Consider Rabin's overall plan of argument to show his Theorem 1.1, p. 176. This plan is stated on p. 177, the first seven paragraphs of § 1.3. For our purposes we must be precise as to how certain of his presentations of p. 177 are related to each other. We take it that no two of his $\Pi_0$, $\Pi_1$ and $\Pi_2$ have generators in common; that his $\Pi$ is the finite presentation whose generators are the generators of $\Pi_0$ and of $\Pi_2$, and whose defining relations are the defining relations of $\Pi_0$ and of $\Pi_2$; that his $\Pi(w)$ which presents G(w) is the finite presentation whose generators are the generators of $\Pi_1$ and of $\Pi_w$, and whose defining relations are the defining relations of $\Pi_1$ and $\Pi_w$. Note that these assumptions are in agreement with his assertions that $G_\Pi = G_2 * G_{\Pi_0}$ and $G(w) = G_1 * G_{\Pi_w}$.

In effect Rabin specifies a recursive construction which, when applied to an arbitrary triple $(\Pi_0, \Pi_1, \Pi_2)$, yields the recursive family $\{\Pi(w)\}_w$, w ranging over the words of $\Pi$ and the indexing being recursive. The central technical lemma of his paper is that (*) for any word w of $\Pi$, w = 1 in $\Pi$ if and only if G(w), the group presented by $\Pi(w)$, enjoys P. Now we identify the $\Pi_0$ in Rabin's construction with the $\Pi_D$ of our Lemma 1. By (*) and the relation between $\Pi_0$ ($= \Pi_D$) and $\Pi$ above specified, (**) for any word w of R, w = 1 in $\Pi_D$ if and only if G(w), the group presented by $\Pi(w)$, enjoys P. By Lemma 1 and (**) we have Result 1. The family $\{\Pi(w)\}_w$, w ranging over the words of R, is the desired $\mathfrak{C}(P, D)$ of Result 1.

We leave it to the interested reader to verify that in an analogous fashion the argument for Result 2 can be obtained from that of Addison [1] and Feeney [18] about finitely presented cancellation semi-groups; for Result 3 from that of Markov [24, 25] about Thue systems. In both cases Lemma 1 may be used directly but obviously weaker analogues suffice^12.

^12 Such a weaker analogue about semi-groups is given by the proof of Lemma 1. This is the (†) of the proof of Result 5, below.

Proof of Result 5. As we explained in section 2, we give the proof of Result 5 before showing Result 4. The proofs of these results are independent of each other. For the proof of Result 5 we use the notation and definitions given for Thue systems in Boone [5], § 1 and first paragraph of § 2, pp. 213-214. Following Markov [24, 25] very closely, we define the Thue system $\mathfrak{T}_{G,H}$, for each ordered pair of words G, H (not necessarily distinct) of the arbitrary Thue system $\mathfrak{T}$, as follows. For each system $\mathfrak{T}$ we assume some one recursive canonical ordering of all ordered pairs of words on $\mathfrak{S}$, and let $o_{\mathfrak{T}}(G, H)$ be the number of the pair G, H in this ordering.

$\mathfrak{T}_{G,H}$:
  $\mathfrak{S}_{G,H}$: The symbols of $\mathfrak{S}$; c, d, e;
  $\mathfrak{U}_{G,H}$:
    G,H.1  The rules of $\mathfrak{U}$;
    G,H.2  $cGd \leftrightarrow 1$;
    G,H.3  $acHd \leftrightarrow cHd$, where a is any symbol of $\mathfrak{S}_{G,H}$;
    G,H.4  $e^{o_{\mathfrak{T}}(G,H)} \leftrightarrow 1$.

We now use $\bar{\mathfrak{T}}$ as a variable for Thue systems $\mathfrak{T}$ such that $W \vdash_{\mathfrak{T}} 1$ implies W is 1, or, what amounts to the same thing, such that $\mathfrak{U}$ contains no rule pair of form $A \leftrightarrow 1$, A not 1. We use $\cong$ for "is isomorphic to" and let $(\mathfrak{T})$ be the semi-group presented by $\mathfrak{T}$. Recall that to say a semi-group (or group) is trivial means that it consists of one element.

THEOREM I (Markov). For any Thue system $\mathfrak{T}$ and words G and H on $\mathfrak{S}$, $(\mathfrak{T}_{G,H})$ is trivial if and only if $G \vdash_{\mathfrak{T}} H$.

THEOREM II. For any $\bar{\mathfrak{T}}$ let G, H, G', and H' be any four words on $\mathfrak{S}$ such that G, H is not the same ordered pair as G', H'. Then $(\bar{\mathfrak{T}}_{G,H}) \cong (\bar{\mathfrak{T}}_{G',H'})$ if and only if both $G \vdash_{\bar{\mathfrak{T}}} H$ and $G' \vdash_{\bar{\mathfrak{T}}} H'$.

With Theorems I and II, Result 5 follows easily. For, as a semi-group analogue of Lemma 1, we have the following: (†) For any recursively enumerable degree of unsolvability D, there exists a $\bar{\mathfrak{T}}_D$ such that for a certain recursive class of words on $\mathfrak{S}_D$, say R, and fixed word P belonging to R, the problem to determine for arbitrary word W of R whether or not $W \vdash_{\bar{\mathfrak{T}}_D} P$ has degree D. Corresponding to the second paragraph of Lemma 1, this fact can be stated as a recursive construction. The desired $\bar{\mathfrak{T}}_D$ can be taken to be the T'' of the proof of Lemma 1 where P is $h q_{N+2} h$ and R consists of $h q_{N+2} h$ and $h s_1^{n+1} q_1 h$ for each natural number n.

Using the terminology of (†), we assert that the family $\mathfrak{F} = \{(\bar{\mathfrak{T}}_D)_{G,P}\}_G$, G ranging over the words of R, is the desired $\mathfrak{C}(D)$ of Result 5. Since $P \vdash_{\bar{\mathfrak{T}}_D} P$, by Theorem I, for any G of R, $G \vdash_{\bar{\mathfrak{T}}_D} P$ if and only if $((\bar{\mathfrak{T}}_D)_{G,P}) \cong ((\bar{\mathfrak{T}}_D)_{P,P})$. Thus $(?G,\ G \in R)\ G \vdash_{\bar{\mathfrak{T}}_D} P$ reduces to (indeed 1-1 reduces to) the isomorphism problem for $\mathfrak{F}$. By Theorem II, for any G and G' of R, $((\bar{\mathfrak{T}}_D)_{G,P}) \cong ((\bar{\mathfrak{T}}_D)_{G',P})$ if and only if either both $G \vdash_{\bar{\mathfrak{T}}_D} P$ and $G' \vdash_{\bar{\mathfrak{T}}_D} P$, or G is G'. Thus the isomorphism problem^13 on $\mathfrak{F}$ reduces (indeed, by bounded truth tables) to $(?G,\ G \in R)\ G \vdash_{\bar{\mathfrak{T}}_D} P$.

But it does remain to show Theorems I and II. Theorem I follows easily and directly by the ideas of Markov [24, 25]. Theorem II follows at once from Theorems I and II'.
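The two reductions used in this proof of Result 5 can be made concrete in a short Python sketch. It is only an illustration under stated assumptions: a word G of R is used as the index of the presentation built from the pair G, P; `derives_p` is a hypothetical oracle for "G derives P"; `isomorphic` is a hypothetical oracle for the isomorphism problem on the family; and the finite toy set below replaces the (in general undecidable) real problem.

```python
def word_problem_from_isomorphism(isomorphic, p):
    """Theorem I direction: G derives P exactly when the presentation indexed
    by G is isomorphic to the one indexed by P (one oracle query per input)."""
    return lambda g: isomorphic(g, p)


def isomorphism_from_word_problem(derives_p):
    """Theorem II direction: the presentations indexed by G and G2 are
    isomorphic exactly when G is G2, or both G and G2 derive P.
    At most two oracle queries, combined by a fixed truth table."""
    def isomorphic(g, g2):
        if g == g2:
            return True
        return derives_p(g) and derives_p(g2)
    return isomorphic


# Toy oracle (an assumption for illustration): the words that derive P.
TOY_WORDS = ["P", "a", "b", "c", "d"]


def toy_derives_p(word):
    return word in {"P", "a", "b"}


toy_isomorphic = isomorphism_from_word_problem(toy_derives_p)
toy_recovered = word_problem_from_isomorphism(toy_isomorphic, "P")
```

The round trip `toy_recovered` agrees with `toy_derives_p` on every toy word, mirroring the mutual reducibility that gives the two problems the same degree.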

THEOREM II'. Where G, H is not the same ordered pair as G', H', suppose that neither $G \vdash_{\bar{\mathfrak{T}}} H$ nor $G' \vdash_{\bar{\mathfrak{T}}} H'$. Then not $(\bar{\mathfrak{T}}_{G,H}) \cong (\bar{\mathfrak{T}}_{G',H'})$.

To show Theorem II', various possible arguments suggest themselves, e.g., possible semi-group analogues of group theory results about free products. But since the corresponding part of the proof of Result 4 is quite algebraic anyway, we give a rather primitive direct combinatorial argument for Theorem II'. As we spell out later, Theorem II'' implies Theorem II'. With m and n positive integers, $m \mid n$ means that m divides n.

^13 We have argued explicitly here only about the isomorphism problem requirement of Result 5, but that the other requirements are also met by the construction given should be clear. For 1-1 and truth-table reducibility see, e.g., Post [37]. For (?--) ..., see [6], p. 533.

THEOREM II''. Suppose not $G \vdash_{\bar{\mathfrak{T}}} H$. Then, for any positive integer N, $W^N \vdash_{\bar{\mathfrak{T}}_{G,H}} 1$ and not $W^n \vdash_{\bar{\mathfrak{T}}_{G,H}} 1$ for n = 1, 2, ..., N - 1, imply $N \mid o_{\bar{\mathfrak{T}}}(G, H)$.

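As the $\bar{\mathfrak{T}}$ of the discussion is arbitrary but fixed throughout, we write $\vdash_{G,H}$ for $\vdash_{\bar{\mathfrak{T}}_{G,H}}$. We use $A \vdash'_{G,H} B$ to mean "$A \vdash_{G,H} B$ without use of the rule $cGd \to 1$". Further, for the given G, H we use $\mathfrak{G}$ as variable for c-free, d-free words W of $\mathfrak{S}_{G,H}$ such that $W \vdash_{G,H} G$; and $\mathfrak{H}$ as variable for c-free, d-free words W of $\mathfrak{S}_{G,H}$ such that $W \vdash_{G,H} H$.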

LEMMA 2. Suppose not $G \vdash_{\bar{\mathfrak{T}}} H$. Then $1 \vdash_{G,H} U$ implies $1 \vdash'_{G,H} U$.

The induction argument required is due to Markov. From a given proof of $1 \vdash_{G,H} U$ in $\bar{\mathfrak{T}}_{G,H}$, eliminate the first application of $cGd \to 1$ if such exists.

LEMMA 3. Suppose $1 \vdash'_{G,H} U$ where not $G \vdash_{\bar{\mathfrak{T}}} H$. Then either

(3.1) (3.1i) U is of form $V c \mathfrak{G} d M$ and (3.1ii) U is of form $B d e^u$, u = 0, 1, ..., and (3.1iii) U is not of form $P c \mathfrak{H} d Q$;

or

(3.2) U is of form $e^v$, v = 0, 1, ....

Consider a given proof of $1 \vdash'_{G,H} U$ in $\bar{\mathfrak{T}}_{G,H}$. The first step, i.e., 1, satisfies (3.2). One must show inductively that if a given step satisfies (3.1) or (3.2) the succeeding step satisfies (3.1) or (3.2).

For any word W on $\mathfrak{S}_{G,H}$, let $d_{G,H}[W]$ be the word obtained from W by successively removing the left-most subword of form $c\mathfrak{G}d$ as long as possible.

LEMMA 4. Suppose not $G \vdash_{\bar{\mathfrak{T}}} H$. Then $1 \vdash_{G,H} W^n$ if and only if there is a u, u = 0, 1, ..., such that $d_{G,H}[W]$ is $e^u$ and $e^{un} = 1$ in $F_{G,H}$, the cyclic group on e of order $o_{\bar{\mathfrak{T}}}(G, H)$.

We first note the following: (4α) Suppose not $G \vdash_{\bar{\mathfrak{T}}} H$. Then $1 \vdash'_{G,H} W^n$ implies that $d_{G,H}[W^n]$ is $(d_{G,H}[W])^n$. This is clear by Lemma 3: for by (3.1ii) of that lemma, in $W^n$ a subword $c\mathfrak{G}d$ cannot overlap from one occurrence of W to the next. In view of the rules $\mathfrak{U}_{G,H}.2$ we further note that (4β) for any word U on $\mathfrak{S}_{G,H}$, $U \vdash_{G,H} d_{G,H}[U]$.

Now suppose not $G \vdash_{\bar{\mathfrak{T}}} H$ throughout. Then by Lemma 2, $1 \vdash_{G,H} W^n$ implies $1 \vdash'_{G,H} W^n$ which, in turn, by (4α), (4β), and Lemma 2, implies $1 \vdash'_{G,H} (d_{G,H}[W])^n$. Thus, by the definition of $d_{G,H}$ and Lemma 3 (note (3.1i)), for some u, u = 0, 1, ..., $d_{G,H}[W]$ is $e^u$. But lastly we note that by Lemma 3 (note (3.1iii) and (3.2)) the proof of $1 \vdash'_{G,H} e^{un}$ never applies the rule pair $ecHd \leftrightarrow cHd$ of $\mathfrak{U}_{G,H}.3$. Hence, if from this proof we erase all symbols except e, we obtain a valid proof that $e^{un} = 1$ in $F_{G,H}$. Conversely, if $d_{G,H}[W]$ is $e^u$ and $e^{un} = 1$ in $F_{G,H}$ then, trivially, $1 \vdash_{G,H} (d_{G,H}[W])^n$, so that $1 \vdash_{G,H} W^n$ by (4β). This shows Lemma 4.

We now show Theorem II''. Suppose $W^N \vdash_{G,H} 1$, where N is a positive integer, and not $W^n \vdash_{G,H} 1$ for n = 1, ..., N - 1. By Lemma 4, we have that for a certain u, u = 0, 1, 2, ..., $e^{uN} = 1$ and $e^{un} \neq 1$ in $F_{G,H}$, n = 1, 2, ..., N - 1. As is well-known, this implies $N \mid o_{\bar{\mathfrak{T}}}(G, H)$.

Now Theorem II' follows trivially. For suppose $(\bar{\mathfrak{T}}_{G,H}) \cong (\bar{\mathfrak{T}}_{G',H'})$ and say $o_{\bar{\mathfrak{T}}}(G, H)$ is less than $o_{\bar{\mathfrak{T}}}(G', H')$. Then by the assumed isomorphism and the rule pair $e^{o_{\bar{\mathfrak{T}}}(G',H')} \leftrightarrow 1$ of $\mathfrak{U}_{G',H'}$, for some word W on $\mathfrak{S}_{G,H}$, not $W^n \vdash_{G,H} 1$ for n = 1, 2, ..., $o_{\bar{\mathfrak{T}}}(G', H') - 1$ but $W^{o_{\bar{\mathfrak{T}}}(G',H')} \vdash_{G,H} 1$. Since not $o_{\bar{\mathfrak{T}}}(G', H') \mid o_{\bar{\mathfrak{T}}}(G, H)$ this contradicts Theorem II''. As explained earlier this gives Result 5.

Proof of Result 4. For given recursively enumerable degree of unsolvability D, let $\Pi_D$ be as described in Lemma 1. We assume some one recursive ordering of the words of $\Pi_D$ and let $o_D(w)$ be the number of the word w in this ordering, plus 1.^14 We assume e is not a generator of $\Pi_D$ and define

$$\Pi_2^w = (e :\ e^{o_D(w)} = 1).$$

Further, we define $\Pi^w$ to be the finite presentation whose generators are the generators of $\Pi_D$ and of $\Pi_2^w$, and whose defining relations are the defining relations of $\Pi_D$ and of $\Pi_2^w$. Once more we refer to Rabin's overall plan for his Theorem 1.1, p. 176 of Rabin [40] as stated on p. 177, the first seven paragraphs of § 1.3. We now define $\Pi_w$ to be as specified by Rabin when our $\Pi_D$ is identified with his $\Pi_0$, our $\Pi_2^w$ with his $\Pi_2$, and consequently our $\Pi^w$ with his $\Pi$. We shall eventually show that since $\Pi_D$ is torsion free, i.e., has no elements of finite order (as also shown below), the family $\mathfrak{F} = \{\Pi_w\}_w$, w ranging over the words of the R of Lemma 1, is the desired $\mathfrak{C}(D)$ of Result 4.

THEOREM III (Rabin). For any word w of R, $\Pi_w$ is trivial if and only if w = 1 in $\Pi_D$.

^14 The "plus 1" of this definition is not necessary but simplifies the argument. (Cf. the proof of Result 5.)

For that special case of Rabin's Theorem 1.1 in which the Markov property P is the property of being trivial, his $\Pi_1$ may be taken as the group presentation having no generators and no relations, so that his $\Pi(w)$ is his $\Pi_w$. For each w, our $\Pi_2^w$ cannot be embedded in the trivial group. Thus Theorem III is simply (**) of our proof of Result 1, with P taken to be the property of being trivial. Throughout Rabin's proof of (*) of our proof of Result 1, his w is arbitrary but fixed; but note that nothing in his argument precludes his $\Pi_2$ varying with w as we require. $\cong$ denotes isomorphism (between groups presented).

THEOREM IV. Suppose $\Pi_D$ is torsion free. Let w and w' be any two distinct words of R. Then $\Pi_w \cong \Pi_{w'}$ if and only if both w = 1 and w' = 1 in $\Pi_D$.

From Theorems III and IV, and a verification that $\Pi_D$ as given by Lemma 1 is torsion free, Result 4 is immediate. Certainly 1 = 1 in $\Pi_D$ so that by Theorem III, for any w of R, w = 1 in $\Pi_D$ if and only if $\Pi_w \cong \Pi_1$. Thus $(?w,\ w \in R)\ w = 1$ in $\Pi_D$ reduces to the isomorphism problem on $\mathfrak{F}$. By Theorem IV, for any two words w, w' of R, $\Pi_w \cong \Pi_{w'}$ if and only if either (a) both w = 1 and w' = 1 in $\Pi_D$ or (b) w is w'. Thus the isomorphism problem^15 on $\mathfrak{F}$ reduces (indeed by bounded truth-tables) to $(?w,\ w \in R)\ w = 1$ in $\Pi_D$. Thus to show Result 4 it remains only to show Theorem IV and to verify the hypothesis of that theorem. Theorem IV follows at once from Theorems III and IV'.

THEOREM IV'. Suppose $\Pi_D$ is torsion free. Where w and w' are two distinct words of $\Pi_D$, suppose $w \neq 1$ and $w' \neq 1$ in $\Pi_D$. Then not $\Pi_w \cong \Pi_{w'}$.

As we explain later, Theorem IV'' implies Theorem IV'. With m and n positive integers, $m \mid n$ means m divides n. By the order of the element g of a group we mean the least positive integer m such that $g^m = 1$ in the group. A group G is torsion free except for N, N a positive integer, if (1) at least one element of G has order N; (2) for every positive integer m, if m is the order of some element of G, then $m \mid N$.
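As a small check of the definition just given, the following Python sketch works in the finite cyclic group of order m, written additively as the integers mod m, where the element corresponding to $e^u$ has order m / gcd(u, m). The function names are hypothetical and the example is an illustration only; it is not part of the construction of the paper.

```python
from math import gcd


def element_orders(m):
    """Set of orders of all elements of the cyclic group of order m.
    The element e^u has order m // gcd(u, m); u = 0 gives the identity."""
    return {m // gcd(u, m) for u in range(m)}


def is_torsion_free_except_for(orders, n):
    """Clauses (1) and (2) of the definition, applied to a set of finite
    element orders: some element has order n, and every order divides n."""
    return n in orders and all(n % k == 0 for k in orders)


orders_12 = element_orders(12)

# All N in a small search range for which the cyclic group of order 12
# is torsion free except for N.
witnesses_12 = [n for n in range(1, 50)
                if is_torsion_free_except_for(orders_12, n)]
```

Every order of an element of the cyclic group of order 12 divides 12, which is the well-known fact invoked in the proof of Theorem II''. The list `witnesses_12` contains the single value 12, illustrating that a group cannot be torsion free except for two different integers.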
(Note that torsion free except for 1 is equivalent to torsion free in the usual sense.)

THEOREM IV''. Suppose $\Pi_D$ is torsion free. Where w is a word of $\Pi_D$, suppose $w \neq 1$ in $\Pi_D$. Then $\Pi_w$ is torsion free except for $o_D(w)$.

^15 Same as footnote 13, but with "Result 4" instead of "Result 5".

^16 Or Rotman [43], the "Britton's Lemma" section, pp. 265-271. Lemmas 3 and 4 of Britton [10] are, for our purposes, stated as Lemmas 12.3 and 12.4 respectively in Rotman [43].


The central idea to verify the "torsion free" hypothesis is the following lemma. In its statement and proof we assume known § 2 of Britton [10]^16 and § 2 of Boone [7].

LEMMA 5. Suppose $\mathrm{Cond}_{JLB}(E^*; E; p_\nu, \nu \in V)$. Then E* has an element of finite order N if and only if E has an element of order N.

The "if" part of the lemma is immediate by Lemma 3 of Britton [10]^16, i.e., since E is imbedded in E*. Suppose (†) $W^N = 1$ in E* and $W^n \neq 1$ in E*, n = 1, 2, ..., N - 1. To obtain the "only if" part of the lemma, we show by induction on the number of occurrences of the $p_\nu$, $\nu \in V$, in W that (*) for a certain word U of E, $U^N = 1$ in E and $U^n \neq 1$ in E, n = 1, 2, ..., N - 1. If W is p-free, we may take W itself to be U, by Lemma 3 of Britton [10]^16, i.e., since E is imbedded in E*. Suppose W is not p-free. Clearly, by Lemma 4 of Britton [10], either (a) W is not p-reduced or (b) W is p-reduced but $W^2$ is not p-reduced. If (a), note by Lemma 0 of Boone [7], that $(p[W])^N = 1$ in E* and $(p[W])^n \neq 1$ in E*, n = 1, 2, ..., N - 1, where p[W] contains fewer occurrences of $p_\nu$, $\nu \in V$, than W contains. Suppose (b) and write W as AB where BA is not p-reduced. Since $B(AB)^m B^{-1} = (BA)^m$, m = 0, 1, 2, ..., in the free group, (†) holds with BA taken as new W. Note that BA falls under case (a) and contains just as many occurrences of $p_\nu$, $\nu \in V$, as AB, i.e., the old W. This completes the inductive argument for (*), and hence shows the lemma.

By a Britton Tower we mean a sequence of finite presentations of groups such that for any member $\Pi$ and succeeding member $\Pi'$ of the sequence $\mathrm{Cond}_{JLB}(\Pi'; \Pi; p_\nu, \nu \in V)$. Here the sequence may be finite, or infinite. Lemma 6 follows at once from Lemma 5 by finite induction.
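The free-group identity $B(AB)^m B^{-1} = (BA)^m$ used in case (b) of the proof of Lemma 5 can be checked mechanically on examples. The following Python sketch is only an illustration: words are represented, by assumption, as lists of (generator, exponent) pairs with exponent +1 or -1, and the helper names are hypothetical.

```python
def free_reduce(word):
    """Freely reduce a word by cancelling adjacent inverse pairs.
    A stack suffices, and the reduced form of a word is unique."""
    stack = []
    for gen, exp in word:
        if stack and stack[-1] == (gen, -exp):
            stack.pop()
        else:
            stack.append((gen, exp))
    return stack


def inverse(word):
    """Formal inverse: reverse the word and negate each exponent."""
    return [(gen, -exp) for gen, exp in reversed(word)]


def conjugation_identity_holds(a, b, m):
    """Check B (AB)^m B^{-1} = (BA)^m in the free group for given words."""
    left = free_reduce(b + (a + b) * m + inverse(b))
    right = free_reduce((b + a) * m)
    return left == right


# Sample words on generators x, y (chosen arbitrarily for illustration).
A_WORD = [("x", 1), ("y", -1)]
B_WORD = [("y", 1), ("x", 1), ("x", 1)]
checks = [conjugation_identity_holds(A_WORD, B_WORD, m) for m in range(6)]
```

All entries of `checks` are true. In particular $(AB)^m = 1$ holds exactly when $(BA)^m = 1$, which is why AB and BA have the same order and BA may replace W in the induction.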

LEMMA 6. For any Britton Tower, one member has an element of finite order N if and only if every member has an element of order N. Thus if one member of a Britton Tower is torsion free, or torsion free except for N, so also is every member.

Now, at this point, we must slightly modify our original argument for Lemma 1 so as to make the new $\Pi_D$ torsion free. Assume (1) of the proof of Lemma 1. By Lemma 39 of Boone [6], p. 566, for the Thue system $T_{[q_0]}$ there defined, $n \in S$ if and only if $c h s_1^{n+1} q_1 h d = 1$ in $T_{[q_0]}$. Write $q_{-1}$ for the $q_1$ of $T_{[q_0]}$ to avoid a notational confusion. Then, by Theorem X of Boone [5], p. 250, for the Thue system $T_{[q_0]}^*$ there defined, $n \in S$ if and only if $q_1 c h s_1^{n+1} q_{-1} h d = q_1$ in $T_{[q_0]}^*$. We now identify $T_{[q_0]}^*$ with the T on p. 22 of Britton [10] by taking the present $q_1$ to be his $q_0$, and the remaining symbols of $T_{[q_0]}^*$ to be his $s_1, s_2, \ldots, s_M$. (Cf. (11) of Britton [10], pp. 29, 30.) The crucial point is that for this choice of T, the group presentation G of Britton [10], pp. 22, 23, has N = 0. Of course, our equivalence (2) of the proof of Lemma 1 still holds, although $\Gamma(n)$ now has a slightly different form. Clearly, this G can be taken to be the $\Pi_D$ of Lemma 1. For this G, what with N = 0, Britton's argument for Lemma 6 of Britton [10] admits of a certain modification, viz., changes (a), (b), (c), and (d) of pp. 59, 60 of Boone [7]. In a moment we must refer to certain material on p. 25 of Britton [10] in this modified form, but the revised version is almost explicitly spelled out in the middle of p. 60 of Boone [7].

LEMMA 7. $\Pi_D$, i.e., the presentation G of Britton [10] pp. 22, 23, with N = 0 as just described, is torsion free.

We assert that the sequence

$$F(x, y),\ G_4,\ G_3,\ G_2,\ G_1,\ G$$

is a Britton Tower. Here F(x, y) is the usual presentation of the free group on x, y; and the G's are as described on pp. 22, 23 of Britton [10]. For $\mathrm{Cond}_{JLB}(G_4; F(x, y); s_b, b = 1, 2, \ldots, M)$ is verified by Boone [7], Lemma 18, pp. 72, 73; $\mathrm{Cond}_{JLB}(G_3; G_4; l_i, r_i, i = 1, 2, \ldots, P)$ by Boone [7], Lemma 4, p. 65; $\mathrm{Cond}_{JLB}(G_2; G_3; q_a, a = 0, 1, \ldots, N)$ by Britton [10], p. 25, lines 10-20 modified as just explained; $\mathrm{Cond}_{JLB}(G_1; G_2; t)$ by Britton [10], p. 24, lines 13-15; and $\mathrm{Cond}_{JLB}(G; G_1; k)$ holds as noted in Britton [10], p. 24, lines 13-15, since the identity mapping verifies the isomorphism condition. Thus, since F(x, y) is torsion free, Lemma 7 follows from Lemma 6.

Having thus verified the hypothesis of Theorem IV'' for $\Pi_D$ as specified just a moment ago, we now show the conclusion of this theorem. The argument has much the same flavor as that just completed: for we must check that certain groups of Rabin [40] are torsion free except for certain integers. The main tool is the following lemma.

LEMMA 8. Let P be a group which is a free product with amalgamation. Then (8.1) P has an element of finite order N if and only if some one of the factors of P has an element of order N. Thus (8.2) if each factor of P is either torsion free or torsion free except for N, and if some factor of P is torsion free except for N, then P is torsion free except for N.

For any element of finite order in a free product with amalgamation is the transform (conjugate) of an element belonging to one of the factors. See e.g. Neumann [34], Theorem 5.1, p. 514 for this result.

LEMMA 9.

If the group

H , of Rabin [40], page 178, in his Lemma 2, is

ALGEBRAIC SYSTEMS AS A WHOLE

29

torsion free except for $N$, then so also is the group $H_4$ of Rabin [40], page 182, in his Lemma 7.

We assume the reader has §1.4 of Rabin [40] before him. The Tietze transformation argument of Rabin [40], p. 179, lines 12-20, regarding the suppression of a dependent generator we shall call the trivial modification. Note that (901) the group $F$ of (1.3) on page 178 of Rabin [40] is torsion free. This follows by our Lemma 5. For let $(t)$ be the free group on $t$. Then $F$ has the stable letter $u$ and corresponding basis $(t)$. Moreover, the mapping $t^2 \to t$ generates an isomorphism between the subgroup of $(t)$ generated by $t^2$ and $(t)$ itself. Hence Cond,,,($F$; $(t)$; $u$).

Now since $H_0$ is torsion free except for $N$ by assumption, Rabin’s group $H_1'$ of line 4 on p. 179 is torsion free except for $N$ by our (901) and our Lemma 8.2 since $H_1' = (H_0 * F)_{u(x)=u}$. Rabin’s group $H_1$ defined in his Lemma 2 on p. 178 is isomorphic to $H_1'$ by the trivial modification, and hence is torsion free except for $N$. Rabin’s group $F'$ of (1.4) on p. 179 differs only in notation from $F$. Thus Rabin’s group $H_2'$ of (1.5) on p. 180 is torsion free except for $N$ by our (901) and our Lemma 8.2 since $H_2'$ is a free product of $H_1$ and $F'$ with amalgamation. Rabin’s group $H_2$ defined in his Lemma 3 on p. 179 is isomorphic to $H_2'$ by the trivial modification, and hence is torsion free except for $N$.

At this point we must make an interpolation into Rabin’s construction to establish the torsion freeness except for $N$ of his group $H_3$, defined in his Lemma 4 on p. 180. Recall that his $H_2$ is, in his Lemma 3 on p. 179, defined by

$(x, t, a : r(x),\ u(x)t = t^2 u(x),\ ta = a^2 t)$.

Let the group $\tilde F$, differing only in notation from $F$, be defined by $(u, s : us = s^2 u)$. By our (901), $\tilde F$ is torsion free. What with $u$ having infinite order in $\tilde F$ and $u(x)$ having infinite order in $H_2$, which is torsion free except for $N$, we may form the free product with amalgamation

$K_2' = (H_2 * \tilde F)_{u(x)=u}$.

By our Lemma 8.2, $K_2'$ is torsion free except for $N$. Referring to the presentation of $H_2$ displayed above, let $K_2$ be defined by

$(x, t, a, s : r(x),\ u(x)t = t^2 u(x),\ ta = a^2 t,\ u(x)s = s^2 u(x))$.

Clearly $K_2 \cong K_2'$ by the trivial modification so that $K_2$ is torsion free except for $N$. Let $\hat F$, differing only in notation from $F$, be defined by $(p, b : pb = b^2 p)$.

By our (901), $\hat F$ is torsion free. From considerations exactly like those about forming $K_2'$, we see we may form the free product with amalgamation

By our Lemma 8.2, K, is torsion free except for N . Clearly H3 zK , by the trivial modification so that H3 is torsion free except for N as required. Finally, Rabin’s group F“ is defined on p. 182, line 9, as the free group on c,d - and thus is torsion free. Since H4 = (H3 * F”),=, - as noted by Rabin on p. 182, line 19 - by our Lemma 8.2 we have the torsion freeness except for N of H4, i.e., the desired Lemma 9. By Lemma 8.2, if w # 1 in 17, the group presented by IIW - described just prior to stating Theorem 111 - is torsion free except for oD(w): for 17, is torsion free (Lemma 7), I7p is torsion free except for o,(w), and the group presented by 17” is the free product of those presented by 17, and 17;. Let fiw be the presentation obtained from Ilwby adding the new generator x , , + ~ Again . by Lemma 8.2, since the infinite cyclic group is torsion free, fiwis torsion free except for oD(w). As argued by Rabin near the bottom of p. 183 of Rabin [40], X , + ~ W X ~ , ! ~has W infinite order in the group presented by f i w . Finally, the presentation we have called I7, - described just prior to stating Theorem I11 - is obtained from fib‘ by following directions (a) and (b), top of p. 183 of Rabin [40]. Thus by Lemma 9 we have shown that 17, is torsion free except for o D ( w ) :for fiwand 17, differ only as to notation from Rabin’s H,, and H4 respectively. This shows Theorem IV”. Now Theorem IV’ follows easily. Where w and w‘ are two distinct words of 17,, suppose w # 1 and w’# 1 in 17,. By Theorem IV”, 17, is torsion free except for o,(w) and II,, is torsion free except for o,(w’). But w and w’ are distinct, so o D ( w ) # D D ( w ’ ) . But from the very definition of torsion free except for N , it is not possible that a group be torsion free except for N and except for N‘, with N # N‘. Hence not h’,rn,, as claimed. As explained earlier this gives Result 4. Proofs of Results 6 and 7. 
As remarked in section 1, for these proofs, we do require the (constructive) existence of a finitely presented group with word problem of arbitrarily preassigned recursively enumerable degree. We show only the case where degree $D$ (called $D_0$ below)


Let $\Pi_0$ be an arbitrary finite presentation of a group, $P$ an arbitrary Markov property of groups. By direct application of the main technical result of Rabin [40], we have that, where $w$ is any word of $\Pi_0$, there exists a finite presentation $\Pi_w$ of a group such that

(1)  $w = 1$ in $\Pi_0 \iff \Pi_w \in P$.

Moreover, $\Pi_w$ is recursively computable from $w$ alone for fixed $\Pi_0$ and $P$. While in Rabin [40], $\Pi_0$ has an unsolvable word problem, this in no way affects the correctness of (1). As we already noted in section 1, the property of groups to have a word problem of a particular recursively enumerable degree of unsolvability, say $D_0 \neq 0'$, is a Markov property. Thus as a special case of (1) we have

(2)  $w = 1$ in $\Pi_0 \iff \{(?u_w)[u_w = 1 \text{ in } \Pi_w]\} \in D_0$,

where $u_w$ is any word of $\Pi_w$. But we can now iterate Rabin’s construction, i.e., from (1) we have that for each $\Pi_w$ and any arbitrary Markov property $P$, one can recursively compute a certain finite presentation $\Pi_{w,u_w}$ of a group such that

(3)  $u_w = 1$ in $\Pi_w \iff \Pi_{w,u_w} \in P$.

By (2) and (3) we have

(4)  $w = 1$ in $\Pi_0 \iff \{(?u_w)[\Pi_{w,u_w} \in P]\} \in D_0$.

Hence, by (4), where $D_1$ is any recursively enumerable degree of unsolvability,

(5)  $\{(?w)[w = 1 \text{ in } \Pi_0]\} \in D_1 \iff \{(?w)[\{(?u_w)[\Pi_{w,u_w} \in P]\} \in D_0]\} \in D_1$.

By (5) and Fridman [19], Clapham [15], or Result G, p. 50 of Boone [7], we have the present Result 6.

To show Result 7, take the $\Pi_0$ in (2) torsion free so that the groups $\Pi_w$ of (2) are also.¹⁸ Then by Theorem IV of the proof of Result 4 we have for each $\Pi_w$ of (2) and arbitrary words $u_w$ and $u_w'$ of $\Pi_w$, that we can recursively compute $\Pi_{w,u_w}$ and $\Pi_{w,u_w'}$, finite presentations of groups such that

(6)  $\mathrm{Cond}_1(w, u_w, u_w') \iff \Pi_{w,u_w} \cong \Pi_{w,u_w'}$.

Here $\mathrm{Cond}_1(w, u_w, u_w')$ means that [$u_w = 1$ in $\Pi_w$ and $u_w' = 1$ in $\Pi_w$] or [$u_w$ is $u_w'$]. But, since for each $w$ there is an obvious recursive procedure to determine if $u_w$ is $u_w'$ or not,

(7)  $(?u_w)[u_w = 1 \text{ in } \Pi_w] \equiv_T (?u_w, u_w')\,\mathrm{Cond}_1(w, u_w, u_w')$

for each $w$, where $\equiv_T$ indicates Turing equivalence. Directly from (6) we have for each $w$ that

(8)  $(?u_w, u_w')\,\mathrm{Cond}_1(w, u_w, u_w') \equiv_T (?u_w, u_w')[\Pi_{w,u_w} \cong \Pi_{w,u_w'}]$.

Thus by (7), (8), and the transitivity of $\equiv_T$,

(9)  $(?u_w)[u_w = 1 \text{ in } \Pi_w] \equiv_T (?u_w, u_w')[\Pi_{w,u_w} \cong \Pi_{w,u_w'}]$

for each $w$. By (2) and (9) we have that for each $w$ and any recursively enumerable degree of unsolvability $D_0$,

(10)  $w = 1$ in $\Pi_0 \iff \{(?u_w, u_w')[\Pi_{w,u_w} \cong \Pi_{w,u_w'}]\} \in D_0$.

Hence, for any recursively enumerable degree of unsolvability $D_1$,

(11)  $\{(?w)[w = 1 \text{ in } \Pi_0]\} \in D_1 \iff \{(?w)[\{(?u_w, u_w')[\Pi_{w,u_w} \cong \Pi_{w,u_w'}]\} \in D_0]\} \in D_1$.

¹⁷ The case $D = 0'$ is similar. But a construction uniform in $D$ requires essentially new ideas furnished in Jockusch’s supplement to this paper.

¹⁸ By Result G, p. 50 of [7], and our Lemma 7, (†) there is a uniform construction which for any recursively enumerable degree $D$ produces a torsion free finitely presented group with word problem of degree $D$. In the notation of p. 21, last paragraph, let $\Pi_0$, $\Pi_1$ and $\Pi_2$ all present torsion free groups; and let $\Pi_1$ and $\Pi_2$ have word problems of degrees $D_0 \neq 0'$, and $0'$, respectively. By Lemma 9 and a simplified version of the remarks on p. 30 showing Theorem IV'' from Lemma 9, the groups $\Pi(w)$ described at top p. 22 are torsion free. Take these $\Pi(w)$ as the $\Pi_w$ of (2).

By (f) of footnote 18, we now have Result 7 from (11). Proof of ResuEt 8. In effect, this result was shown by Rabin in Rabin [40] and it is only a matter of looking at his account in a certain way to see that this is so. Near the end of our proof for Result 1 (“For our purposes we must ...”) we discuss our point of view toward Rabin’s no,Ill,IT2, n, n,, and I l ( w ) , which we here again take. Now Rabin’s n, is obtained from his Il by adjoining seven additional generators x , + ~ t, , a, s, b, c and d independently of w, as well as certain defining relations - which depend on w - which we here call S(w).Let Il’ be the presentation obtained from Il by adjoining the generators x,+ t , a, s,


b, c and d, but no new defining relations; nothe presentation whose generators and whose defining relations are those of Illand Il'. Of course n, and D2 depend on the chosen Markov property P. Since IT, for each word w of IZ, can be obtained from the fixed noby adjoining the relations S(w),nocan be taken as the desired ZIP of Result 8.

Result 9. As noted in the statement of Results, this Result requires no proof. Actually, many of the earlier arguments of the paper are, in effect, applications of Result 9. Familiar arguments for unsolvability in symbolic logic can be modified so as to furnish us with the hypothesis of Result 9 in much the same way as we reinterpreted unsolvability results in algebra to obtain Results 1, 2 and 3. For example, consider the arguments for the fact that there exist partial propositional calculi with unsolvable decision problem as to theoremhood (Post and Linial [39], Davis [17], Yntema [54]). These arguments, in effect, stipulate a recursive class of well-formed formulas whose decision problem as to theoremhood is of preassigned recursively enumerable degree. For in such an argument, call a well-formed formula, of the propositional calculus S being constructed, a code formula if the well-formed formula represents a word of the underlying Thue system or Post normal system T. By the results of this paper or Yasuhara [53] we may take it that T has a recursive class R of words, with a word problem of a special kind, of given recursively enumerable degree D. Then for the desired class C to satisfy the hypothesis of Result 9 we simply take those code formulas of S which stand for words of R. The authors mentioned go on to show that, since the theoremhood problem for S is unsolvable, the completeness problem for partial propositional calculi is also. This latter argument in effect furnishes us with the remainder of the hypothesis of Lemma 9, where P is completeness, etc. Thus there exists a recursive class of partial propositional calculi with completeness problem of preassigned recursively enumerable degree. Certain familiar proofs of Church's Theorem - such as given in Davis [17] and Hermes [20] - furnish us, in a similar way, with the C of Result 9 where S is the first order functional calculus.
But no particularly interesting applications of Result 9 relative to this situation are known to the present author.

SUPPLEMENT TO BOONE’S “ALGEBRAIC SYSTEMS”

C. G. JOCKUSCH Jr.
University of Illinois

We assume complete familiarity with the arguments on pp. 31, 32 for Results 6 and 7 of Boone’s paper. These arguments apply only if the given degree $D$ is not $0'$, because one needs to know that it is a Markov property of groups to have word problem of degree $D$. On the other hand, it is very easy to modify the argument for the case $D = 0'$ because the property of having a word problem of degree $\neq 0'$ is also Markov. (Roughly, substitute $\notin$ for $\in$.) However, as C. F. Miller III has pointed out, the existence of these two separate constructions does not guarantee that there is a construction uniform in $D$. In this note we specify a uniform construction for Result 7. The uniform construction for Result 6 is entirely analogous.

Our construction is essentially the union of the two constructions - the one for $D \neq 0'$, the other for $D = 0'$ - previously mentioned. The proof that our construction has the desired properties hinges on determining the output of each of the two separate constructions when applied to a degree $D$ which it was not intended to cover.

We consider first Boone’s argument for Result 7 for the case $D \neq 0'$. However, for the moment we make no assumptions on $D$, $D'$ other than that they are arbitrary recursively enumerable degrees. Henceforth, we write $D$, $D'$ as $D_0$, $D_1$, respectively, as in Boone’s proof of Result 7. If $\Pi$ is any finitely presented group, we write $\deg \Pi$ for the degree of the word problem of $\Pi$. We use $0$ for the degree of all recursive sets and $d \cup d^*$ for the least upper bound of the degrees $d$, $d^*$. Any unexplained notation will be found in Boone’s proof. Referring to the right-hand side of Boone’s (11) we let $d(D_0, D_1)$ be the degree of

(i)  $\{(?w)[\{(?u_w, u_w')[\Pi_{w,u_w} \cong \Pi_{w,u_w'}]\} \in D_0]\}$.

Boone’s proof showed that $d(D_0, D_1) = D_1$, provided $D_0 \neq 0'$. We must now compute $d(0', D_1)$ by examining the details of Boone’s construction. However, we may not use Boone’s (2) because it holds only when the property of having word problem of degree $D_0$ is a Markov property, i.e. when $D_0 \neq 0'$. From our (i) and Boone’s (9) (whose validity depends only on each $\Pi_w$ being torsion free, and not on (2)) we see that $d(D_0, D_1)$ is the degree of

(ii)  $\{(?w)[\deg \Pi_w = D_0]\}$.

Thus we desire to find $\deg \Pi_w$ for each word $w$ of $\Pi_0$. Recall that $\Pi_w$ is obtained by Rabin’s construction from $\Pi_0$, $\Pi_1$, $\Pi_2$, where $\deg \Pi_0 = D_1$, $\deg \Pi_1 = D_0$, $\deg \Pi_2 = 0'$. Furthermore, if $w = 1$, $\Pi_w \cong \Pi_1$, and if $w \neq 1$, $\Pi_2$ can be embedded in $\Pi_w$. It follows that

(iii)  $w = 1 \Rightarrow \deg \Pi_w = \deg \Pi_1 = D_0$;
(iv)  $w \neq 1 \Rightarrow \deg \Pi_w = 0'$.

If we now assume that $D_0 = 0'$ we have $\deg \Pi_w = D_0$ for all $w$ in $\Pi_0$, so it follows immediately from (ii) that $d(0', D_1) = 0$.

We now consider the natural construction intended for the case $D_0 = 0'$. This construction is identical to the previous one except that it starts with groups $\Pi_0^*$, $\Pi_1^*$, $\Pi_2^*$ rather than $\Pi_0$, $\Pi_1$, $\Pi_2$. To exploit the fact that having word problem of degree $\neq 0'$ is a Markov property, we choose $\deg \Pi_0^* = D_1$, $\deg \Pi_1^* = 0$, and $\deg \Pi_2^* = 0'$. Now let $\Pi_w^*$ and $\Pi^*_{w,u_w}$ be constructed from $\Pi_0^*$, $\Pi_1^*$, $\Pi_2^*$ as before, and let $d^*(D_0, D_1)$ be the starred analogue of $d(D_0, D_1)$. We now have that $d^*(D_0, D_1)$ is the degree of

(ii)*  $\{(?w)[\deg \Pi_w^* = D_0]\}$.

We also have

(iii)*  $w = 1 \Rightarrow \deg \Pi_w^* = \deg \Pi_1^* = 0$;
(iv)*  $w \neq 1 \Rightarrow \deg \Pi_w^* = 0'$.

From (iii)* and (iv)* it follows that $w = 1$ in $\Pi_0^*$ iff $\deg \Pi_w^* = 0$. Hence we see from (ii)* that $d^*(0, D_1) = \deg \Pi_0^* = D_1$. Similarly, $d^*(0', D_1) = D_1$. On the other hand, if O
isomorphism problem for has degree $D_0$. (Here %*($D_0$, $D_1$) is the starred analogue of Boone’s %($D_0$, $D_1$), and it is assumed that $\Pi_0$ and $\Pi_0^*$ have no generator in common.) Thus %($D_0$, $D_1$) $\cup$ %*($D_0$, $D_1$), indexed by the words of $\Pi_0$ and $\Pi_0^*$, is the desired family of classes of groups.

ON RECURSIVELY UNSOLVABLE PROBLEMS IN TOPOLOGY AND THEIR CLASSIFICATION¹

W. W. BOONE, W. HAKEN
Institute for Advanced Study and University of Illinois

and

V. POENARU
Institute for Advanced Study, Institut des Hautes Etudes Scientifiques, Harvard University, and Northeastern University

The present paper has been inspired by Markov [26]² on the unsolvability of the homeomorphism problem. We shall show that certain familiar decision problems in topology, in dimension $\geq 4$, are recursively unsolvable, in the strong sense that these problems can be taken to be of any preassigned, recursively enumerable degree³ of unsolvability.

We have tried to make this paper accessible to both logicians and topologists. For this reason, we have recalled many familiar definitions (especially in sections 1.1 and 3.1, which may be skipped by the specialist), and we have carried out some of the proofs in more detail than would be necessary for a reader familiar with the techniques applied.

We wish to thank J. Stallings for valuable discussions. Thanks are also due to C. F. Miller III, and Daniel Richardson for various suggestions and practical help.

0. Introduction; statement of results

The decision problems which we consider pertain to diffeomorphism⁴, homeomorphism, combinatorial equivalence, and homotopy equivalence.

¹ This research was supported by the U. S. National Science Foundation under Grants GP-4616, GP-6132, GP-730, GP-5610, GP-6299 and the U. S. Air Force Office of Scientific Research under Grant AFOSR-359-63.
² Numbers in square brackets refer to the bibliography at the end of this paper.
³ The reader not familiar with these logical concepts should read section 1.1 first.
⁴ The reader not familiar with these topological concepts should read section 3.1 first.

The following is a standard problem in topology: find a combinatorial classification for all compact n-manifolds, i.e., a recursive enumeration of all combinatorial types of (compact, piecewise linear) n-manifolds, without repetition (Papakyriakopoulos [35]). But, as we point out below, to give a combinatorial classification would imply a solution for the decision problem with respect to combinatorial equivalence and vice-versa. (This is proved as Theorem 5, section 1.2, of the present paper.) Thus our results imply that, at least for dimension $\geq 4$, such a combinatorial classification does not exist.

We introduce the following notation: $\approx_1$, $\approx_2$, $\approx_3$, $\approx_4$ shall indicate the relation between two manifolds (with appropriate structures) of being diffeomorphic, homeomorphic, combinatorially equivalent, homotopy equivalent, respectively. $\approx_i$ is called i-equivalence.

For the “i-equivalence problem” (the decision problem with respect to i-equivalence) to make sense it is necessary, of course, to restrict considerations to manifolds which are “finitely presented”. This concept, especially in the form “finite presentation of a differentiable manifold”, requires a detailed discussion which we give in section 3.2 of this paper. There we define what we call a (finite) algebraic atlas presentation which means a certain finite notation (essentially consisting of matrices of rational numbers) that describes a closed topological n-manifold with a triangulation (representing the combinatorial structure) and an atlas (representing the differentiable structure). Then we prove

THEOREM 4. For every closed, differentiable n-manifold $M$ there exists a (finite) algebraic atlas presentation $\mathfrak{M}$ such that the manifold $M(\mathfrak{M})$ presented by $\mathfrak{M}$ is diffeomorphic to $M$.

Remark. This fact is to be contrasted with the existence of groups which are not finitely presentable and real numbers whose decimal expansions are not recursive.

We shall call two presentations $\mathfrak{M}$, $\mathfrak{M}'$ i-equivalent, $\mathfrak{M} \approx_i \mathfrak{M}'$, if they present n-manifolds which are i-equivalent. If $C$ is a class of finite presentations of manifolds, we use the expression

$(?\mathfrak{M}, \mathfrak{M}';\ \mathfrak{M}, \mathfrak{M}' \in C)\,[\mathfrak{M} \approx_i \mathfrak{M}']$

to indicate the i-equivalence problem: “for two arbitrary manifold presentations $\mathfrak{M}, \mathfrak{M}' \in C$ to determine whether or not the manifolds presented are i-equivalent”. We now state our main results; all are uniform constructions:

THEOREM 1. For each dimension $n \geq 4$ and for each recursively enumerable degree³ of unsolvability $D$, there exists a recursive class $C(n, D)$ of finite presentations of n-manifolds, endowed with a differentiable and a compatible combinatorial structure⁴, such that, for each $i = 1, 2, 3, 4$:

(1A) The decision problem

$(?\mathfrak{M}, \mathfrak{M}';\ \mathfrak{M}, \mathfrak{M}' \in C(n, D))\,[\mathfrak{M} \approx_i \mathfrak{M}']$

is of degree $D$.

(1B) $C(n, D)$ contains a certain presentation $\mathfrak{L}$ of a (simply connected) n-manifold such that the decision problem

$(?\mathfrak{M};\ \mathfrak{M} \in C(n, D))\,[\mathfrak{M} \approx_i \mathfrak{L}]$

is of degree $D$.

(1C) The decision problem

$(?\mathfrak{M};\ \mathfrak{M} \in C(n, D))\,[M(\mathfrak{M}) \text{ is simply connected}]$

has degree $D$.

We have a similar result for meta-problems:

THEOREM 2. For each dimension $n \geq 4$, and each ordered pair $(D, D')$ of recursively enumerable degrees of unsolvability, there exists a recursive family $\mathfrak{F}(n, D, D')$ of recursive classes of finite presentations of n-manifolds (endowed with a differentiable and a compatible combinatorial structure) such that, for each $i = 1, 2, 3, 4$: The decision problem to determine for an arbitrary member $C$ of $\mathfrak{F}(n, D, D')$ whether or not

$(?\mathfrak{M}, \mathfrak{M}';\ \mathfrak{M}, \mathfrak{M}' \in C)\,[\mathfrak{M} \approx_i \mathfrak{M}']$

be of degree $D$ is itself of degree $D'$.

Remark. Theorem 2 is concerned with the meta-problem corresponding to (1A). We think it possible to obtain similar results corresponding to (1B) and (1C), but quite lengthy considerations would seem to be required.

We assume as understood the notion of a finite presentation of a group consisting of certain generators and defining relations (see e.g., Boone [5], pp. 213, 214). But for convenience, we shall make certain conventions about such presentations. The presentation $p$ with generators $x_1, \ldots, x_r$ and defining⁵ relations $\alpha_1 = 1, \ldots, \alpha_\rho = 1$ we write as

$p = (\{x_1, \ldots, x_r\}, \{\alpha_1, \ldots, \alpha_\rho\})$

calling the $\alpha_i$ the relators of $p$. We allow the possibility that some $\alpha_i$ are empty. We write $\mathfrak{G}(p)$ for the group presented by $p$. Recall that we can take $\mathfrak{G}(p)$ to be the factor group $\mathfrak{F}(p)/\mathfrak{N}(p)$ where $\mathfrak{F}(p)$ is the free group with generators $x_1, \ldots, x_r$ and $\mathfrak{N}(p)$ is the normal subgroup of $\mathfrak{F}(p)$ generated by the relators $\alpha_1, \ldots, \alpha_\rho$. We assume all generators chosen from one countable alphabet. We use $r(p)$ and $\rho(p)$ for the number of generators and relations⁵ of $p$, respectively.

Two group presentations $p$ and $p'$ are isomorphic if the groups $\mathfrak{G}(p)$ and $\mathfrak{G}(p')$ are isomorphic. Two group presentations $p$ and $p'$ are congruent if they are identical up to notation, i.e., if there exists a 1-1 correspondence between the generators of $p$ and those of $p'$ which transforms the relators of $p$ into those of $p'$.

Following Markov [26] we use $*$ for an empty relator in the following way: If $p = (\{x_1, \ldots, x_r\}, \{\alpha_1, \ldots, \alpha_\rho\})$ then for $t = 0, 1, \ldots$

$p * t = (\{x_1, \ldots, x_r\}, \{\alpha_1, \ldots, \alpha_\rho, *^t\})$

⁵ For convenience, we do not regard the so-called trivial relations $x_i^{-1} x_i = 1$ and $x_i x_i^{-1} = 1$ as necessarily being listed among the $\alpha$'s.

where $*^t$ stands for $*$ repeated $t$ times. By a fully recursive class $Q$ of finite presentations of groups we mean a recursive class such that there exists an algorithm to determine for an arbitrary presentation $p$ whether or not there exists a $p'$ in $Q$ such that $p$ is congruent to $p'$.

THEOREM 3. For each $n \geq 4$ and fully recursive class $Q$ of finite presentations of groups there exists a recursive class $K(n, Q)$ of finite presentations of n-manifolds, endowed with differentiable (and compatible combinatorial) structures, such that the following hold:

(3A) The class $K(n, Q)$ is the range of a certain constructed recursive function $F_n$ from the class of all group presentations

$\bar p = p * (4\rho(p) + 4r(p) + t)$, $p \in Q$, $t = 0, 1, \ldots$

into the class of all finite presentations of n-manifolds. The (perhaps multivalued) function $F_n^{-1}$ is also recursive in the following sense: there is an algorithm to determine for an arbitrary given member $\mathfrak{M} \in K(n, Q)$ all congruence classes of $F_n^{-1}(\mathfrak{M})$ (i.e., to determine a set of group presentations $p_1, \ldots, p_q \in F_n^{-1}(\mathfrak{M})$ such that each $\bar p \in F_n^{-1}(\mathfrak{M})$ is congruent to one of $p_1, \ldots, p_q$).

then O(ii) is isomorphic to the fundamental group (3B) (3C) Let p, ~

' €be0such that 6h)rOh').W e denote by p2 (by p i ) the


second Betti number of F,,(p) (ofF,(p’)). Then:

(3D) For every i= 1, 2, 3 , 4 , F,,(p) and F,(p‘) are i-equivalent if and only are equal. We shall prove that Theorem 3 implies the following two corollaries :

if Q (p)E 6 01‘) and the second Betti numbers BZ,

COROLLARY A. The i-equivalence problem between members of K(n, Q ) is reducible6 to the isomorphism problem between members of Q, and vice-versa. COROLLARY B. I f Q contains the empty presentation po = (@, $3) of the trivial group, and if, for soine integer N , 5p ( p ) + 4 r b )
1. Logic

1.1. General remarks on decision problems

We assume as understood at the intuitive level the notion of an effective process, i.e., a uniform set of directions which the user may apply in a purely mechanical way. For our purposes we take a decision problem $P$ to be a collection of questions in some specified language such that (1) there is an effective process to determine whether or not an arbitrary expression of the language is a question of $P$; (2) the correct answer to each question is either “Yes” or “No”. An effective process $E$ is an effective solution to decision problem $P$ (or, an algorithm for $P$) if the application of $E$ to each question $q$ of $P$ produces the correct answer⁷ to $q$. And if there is such an $E$ we say that $P$ is effectively solvable; otherwise effectively unsolvable.

⁶ Reducibility here can be taken to be many-one reducibility in the sense of Post. See Post [37], Kleene [23], or Rogers [42].

“We say that decision problem $P_1$ is reducible to decision problem $P_2$ if, given access to an oracle which supplies us with the answer to any question of $P_2$, there is a [process] $E$ to answer any question of $P_1$ - $E$ being an effective [process] save in that at certain stages we must consult the oracle to determine what to do next” (Boone [6]). The relation holding between $P_1$ and $P_2$ if $P_1$ is reducible to $P_2$ and vice-versa is an equivalence relation on the totality of decision problems, and the corresponding equivalence classes are called degrees of unsolvability.

A decision problem $P$ is effectively enumerable if there is an effective process which enumerates those questions of $P$ which have the answer “Yes” or one which enumerates those which have the answer “No”. Virtually all decision problems which have arisen in algebra, number theory, and combinatorial topology are easily seen to be effectively enumerable, e.g., $(?\mathfrak{M}, \mathfrak{M}';\ \mathfrak{M}, \mathfrak{M}'$ arbitrary finite presentations of n-manifolds$)\,[\mathfrak{M} \approx_3 \mathfrak{M}']$ is effectively enumerable. (For details see 1.2.) A degree of unsolvability is effectively enumerable if it contains an effectively enumerable decision problem.

In principle now we understand the vagueness of the preceding account to be removed: the notion of effective process is to be replaced by that of a Turing machine (Davis [17]; Hermes [20]; Kleene [23]; Rogers [42]); the notion of effective reducibility by that of Turing reducibility; and the language is to be precisely fixed. But we do not give these technical definitions which make precise the intuitive notions; for their actual form is irrelevant in that our argument can be followed completely just on the basis of the corresponding intuitive notions.
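The oracle-relative notion of reducibility and the notion of effective enumerability described above can be illustrated by a small sketch. It is not part of the original text: the names `Oracle`, `answer_complement`, `answer_pair`, `confirm_yes`, and the toy set of even numbers are assumptions chosen only so that the example runs. Each procedure is effective except at the points where it consults the oracle for $P_2$.

```python
from itertools import count
from typing import Callable, Iterator, Tuple

# Hypothetical type: an oracle answers every question of P2
# with True ("Yes") or False ("No").
Oracle = Callable[[int], bool]


def answer_complement(n: int, oracle_p2: Oracle) -> bool:
    """P1 asks 'is n outside S?'; P2 asks 'is n in S?'. One oracle call suffices."""
    return not oracle_p2(n)


def answer_pair(question: Tuple[int, int], oracle_p2: Oracle) -> bool:
    """P1 asks 'are m and n both in S?'; the oracle is consulted at most twice."""
    m, n = question
    return oracle_p2(m) and oracle_p2(n)


def confirm_yes(n: int, yes_instances: Iterator[int]) -> bool:
    """Effective enumerability: search an enumeration of the 'Yes' questions.

    Returns True if n turns up; does not return if the enumeration is
    infinite and the answer to n is 'No'.
    """
    for q in yes_instances:
        if q == n:
            return True
    return False  # reached only if the enumeration is finite


# Toy stand-in for S (assumption: the even numbers, so the oracle is computable here).
def is_even(n: int) -> bool:
    return n % 2 == 0


def even_numbers() -> Iterator[int]:
    return (2 * k for k in count(0))
```

In this toy case the oracle is itself computable, so nothing unsolvable is involved; the point is only the shape of the procedure, which treats the oracle as a black box.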
Henceforth we do write “recursive” instead of “effective” since the precise notions are intended - but the reader may simply think “effective” instead. A set $S$ is recursive (recursively enumerable) if the decision problem as to membership in $S$ is recursively solvable (recursively enumerable).

Both the theory of recursively enumerable degrees of unsolvability taken alone and the theory of all degrees of unsolvability are now known to be very rich and complicated (Rogers [42]; Sacks [44]). A major break-through in these matters has been the Friedberg-Muchnik Theorem settling the Problem of Post [37]. This theorem asserts that there exist two recursively enumerable decision problems neither of which is reducible to the other. Later, it has even been shown by Sacks, and subsequently Yates, that recursively enumerable degrees are, in the obvious sense, dense. Thus, given the present state of affairs, it is very natural to attempt to pass from the recursive unsolvability of a decision problem of a certain sort, to showing that there is a decision problem of the same kind in each degree of unsolvability - or in each recursively enumerable degree if the decision problem under consideration is recursively enumerable⁸. This is one aspect of the present paper vis-a-vis Markov [26].

⁷ Note that in the definition of effective process no demand is made for being able to “compute an a priori upper bound on the number of steps required to reach the answer”. Indeed, there is serious doubt that this intuitive notion can be made precise in general. The idea is to demand a “secondary computation” that allows one to compute for given question $q$ and process $E$ the number of steps in which $E$ reaches the answer to $q$. But, provided that it is known that $E$ will reach the answer at all, there is a trivial way to compute the “number of steps”, namely, to carry out $E$ until the answer is reached and to count the steps.
⁸ The present paper throws no light on the question of the existence of topological decision problems having a degree of unsolvability which is not recursively enumerable.

1.2. The classification problem in topology

We call the equivalence classes induced on some class $Q$ of manifolds by the relation $\approx_i$ ($i = 1, 2, 3$, or $4$) i-equivalence classes. (A natural special case is to take $Q$ to be all manifolds of a given dimension $n$.) Then following a customary definition (see for instance Papakyriakopoulos [35]), the i-classification problem⁹ for $Q$ is the question as to whether or not there exists an “i-classification of $Q$”, i.e., a sequence $S$ of members of $Q$, such that for each i-equivalence class $K$ there is exactly one $j$ such that the $j$th member of $S$ is a member of $K$. But it would seem here that in this definition the sequence $S$ is tacitly being required to be a recursive enumeration; for without this requirement an affirmative answer to the i-classification problem simply means that there are countably many i-equivalence classes. In turn, the requirement that $S$ be a recursive enumeration leads us to take it that $S$ be an enumeration, not of the manifolds themselves, but of their “presentations” in some suitable sense. Hence we shall so interpret the definition that $S$ is a recursive enumeration of the “presentations” $\mathfrak{M}_1, \mathfrak{M}_2, \ldots$, etc.

⁹ “Problem”, here, does not mean decision problem, but rather a proposition whose truth or falsity is open.

We turn our attention to the combinatorial classification problem ($i = 3$). In this case the “presentations” $\mathfrak{M}_1, \mathfrak{M}_2, \ldots$ should be so explicit that there is a recursive procedure to derive from each given $\mathfrak{M}_j$ an abstract complex⁴ $\Theta_j$ that describes the (isomorphism class of the) combinatorial structure $A_j$ of $M(\mathfrak{M}_j)$. (Otherwise it would be doubtful that the “classification” would make any sense.) Now we have the following:

LEMMA 1 (see Reidemeister [41], §§ 11, 12). Every subdivision of a given finite (abstract) complex $\Theta$ can be obtained from $\Theta$ by a finite sequence of simple operations $z$ and $z^{-1}$ (bipartitions of edges and their simplex stars, and the inverse operations).

COROLLARY. There is a recursive function $X(\Theta, k)$, $\Theta$ ranging over all finite (abstract) complexes, $k = 1, 2, \ldots$, that recursively enumerates for an arbitrary complex $\Theta$ the subdivisions of $\Theta$, i.e., for fixed $\Theta$ the sequence $X(\Theta, 1) = \Theta$, $X(\Theta, 2), \ldots$ is a recursive enumeration of all subdivisions of $\Theta$.

We assume a fixed recursive function $X$ as described in the corollary as given. An immediate consequence of the corollary is Theorem 5.

THEOREM 5. For any recursively enumerable class $C$ of presentations (in the sense just discussed) of compact, combinatorial manifolds⁴, the combinatorial equivalence problem

$(?\mathfrak{M}, \mathfrak{M}';\ \mathfrak{M}, \mathfrak{M}' \in C)\,[\mathfrak{M} \approx_3 \mathfrak{M}']$

is recursively solvable if and only if the combinatorial classification problem for the corresponding class¹⁰ $Q$ of manifolds has an affirmative answer.

COROLLARY. If the Hauptvermutung holds for the class $Q$ of Theorem 5, then the homeomorphy problem

$(?\mathfrak{M}, \mathfrak{M}';\ \mathfrak{M}, \mathfrak{M}' \in C)\,[\mathfrak{M} \approx_2 \mathfrak{M}']$

is recursively solvable if and only if the topological classification problem for $Q$ has an affirmative answer.

¹⁰ This means: $M \in Q$ if and only if $M \approx_3 M(\mathfrak{M})$ for some $\mathfrak{M} \in C$.

The proof of Theorem 5 is obtained by a standard argument of logic:

(I) Suppose $(?\mathfrak{M}, \mathfrak{M}';\ \mathfrak{M}, \mathfrak{M}' \in C)\,[\mathfrak{M} \approx_3 \mathfrak{M}']$ is recursively solvable. Then we construct the classification $\mathfrak{M}_1, \mathfrak{M}_2, \ldots$ of $Q$ as follows: first we recursively enumerate the members of $C$, say $\mathfrak{M}_1', \mathfrak{M}_2', \ldots$. (This is possible since $C$ is recursively enumerable.) Take $\mathfrak{M}_1 = \mathfrak{M}_1'$. By hypothesis, we can recursively determine for each $k = 2, 3, \ldots$ the first member, say $\mathfrak{M}'_{i_k}$, of $\mathfrak{M}'_{i_{k-1}+1}, \mathfrak{M}'_{i_{k-1}+2}, \ldots$ that is not equivalent to any one of $\mathfrak{M}_1, \ldots, \mathfrak{M}_{k-1}$; take $\mathfrak{M}_k = \mathfrak{M}'_{i_k}$.

(II) Conversely, suppose $\mathfrak{M}_1, \mathfrak{M}_2, \ldots$ is a combinatorial classification of $Q$, and that two arbitrary members $\mathfrak{M}, \mathfrak{M}' \in C$ are given. Then to determine whether or not $\mathfrak{M} \approx_3 \mathfrak{M}'$, we have the following algorithm: where $\Theta_0, \Theta_0', \Theta_1, \Theta_2, \ldots$ are abstract complexes corresponding to $\mathfrak{M}, \mathfrak{M}', \mathfrak{M}_1, \mathfrak{M}_2, \ldots$ respectively, recursively enumerate the values of the function $X(\Theta, k)$ for $\Theta = \Theta_0, \Theta_0', \Theta_1, \Theta_2, \ldots$ and $k = 1, 2, \ldots$ in such a way that one generates the infinite 2-dimensional matrix:

$X(\Theta_0, 1)$   $X(\Theta_0', 1)$   $X(\Theta_1, 1)$   $X(\Theta_2, 1)$   ...
$X(\Theta_0, 2)$   $X(\Theta_0', 2)$   $X(\Theta_1, 2)$   $X(\Theta_2, 2)$   ...
......................................

by proceeding along the diagonals of finite length, i.e.,

$X(\Theta_0, 1),\ X(\Theta_0', 1),\ X(\Theta_0, 2),\ X(\Theta_1, 1),\ X(\Theta_0', 2),\ \ldots$

By hypothesis $\Theta_0$ and $\Theta_0'$ are each combinatorially equivalent to precisely one of $\Theta_1, \Theta_2, \ldots$, say to $\Theta_l$ and $\Theta_{l'}$, respectively. Consequently there are integers $k_0, k_0', k, k'$ such that $X(\Theta_0, k_0)$ is isomorphic to $X(\Theta_l, k)$ and $X(\Theta_0', k_0')$ is isomorphic to $X(\Theta_{l'}, k')$. After a finite number of steps⁷ in this enumerating process, our recursive enumeration will list these four matrix elements; thus we determine the integers $l, l', k_0, k_0', k, k'$. Then $\mathfrak{M} \approx_3 \mathfrak{M}'$ if and only if $l = l'$.
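The two halves of the proof of Theorem 5 each rest on a simple enumeration pattern, which the following sketch tries to make concrete. It is an illustration only, not the authors' construction: the function names `diagonal_enumeration` and `classify`, the numbering of columns from 0, and the toy equivalence relation used in the usage note are all assumptions. The first function lists an infinite matrix along its finite diagonals, in the order displayed in part (II); the second keeps, as in part (I), each enumerated item that is not equivalent to any representative already chosen, given a decision procedure for equivalence.

```python
from itertools import count, islice
from typing import Callable, Iterable, Iterator, List, Tuple, TypeVar

T = TypeVar("T")


def diagonal_enumeration(entry: Callable[[int, int], T]) -> Iterator[Tuple[int, int, T]]:
    """List an infinite matrix along its finite diagonals.

    Columns are numbered 0, 1, 2, ... (standing for Theta_0, Theta_0',
    Theta_1, ...) and rows k = 1, 2, ...; entry(col, k) plays the role
    of X(Theta, k). Every matrix element is reached after finitely many steps.
    """
    for d in count(0):                    # d indexes the diagonal
        for col in range(d, -1, -1):      # from (col=d, k=1) to (col=0, k=d+1)
            k = d - col + 1
            yield col, k, entry(col, k)


def classify(presentations: Iterable[T], equivalent: Callable[[T, T], bool]) -> Iterator[T]:
    """Part (I): emit each presentation not equivalent to an earlier representative."""
    representatives: List[T] = []
    for m in presentations:
        if not any(equivalent(m, r) for r in representatives):
            representatives.append(m)
            yield m
```

As a toy usage, `classify(count(0), lambda a, b: a % 3 == b % 3)` yields 0, 1, 2 as representatives of the three residue classes; asking it for a fourth representative would never return, which mirrors the fact that the search in part (I) assumes a further inequivalent member exists.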

1.3. Proof that Tlieorem 3 implies Corollaries A and B In this section we use the notation of Theorem 3. (A.1) Let !IJl,!IJl’ be given members of K(n, Q). We can find by a recursive =F, I? (p’) .since ’ F,- is recursive procedure p,p’ E such that %R= F, (p),% in the sense of Theorem 3. The second Betti numbers of 3n,‘W, p2, p;, are recursively computable (since %It,9X‘ are given with triangulations). If ,!I2 # & then ’93and 5J.R’ are not i-equivalent. Suppose p2 =B;. Then iYJl‘ (for all i = 1, ..., 4) if and only if 8 01)E 6 01’)(isomorphism). This reduces the iequivalence problem for K(n, Q) to the isomorphism problem between members of 0 which in turn is obviously (by the definition of reducible to the isomorphism problem between members of Q. (A.11) Conversely, we begin with given p , p ’ ~ Q .Let

o)

and B = CI * (4P (PI + 4r (4) p’ = p’ * (4p(p’) + 4r(p’)) and tP&*= P ($1 - r (B’) - P (8+ i“ ( B ) . We may assume that t,,,,20. By (3C), if S @ ) S : S ( p ’ )then p2(Fn@’))= f12(Fn(ji*tP,P,)); and thus in view of (3B), (3D), 8 ( p ) z 8 Q p 1 if) and only if ’n

( P * ffi ,p ,)M iFn (P’).

This finishes the proof of A. (B.1) Now, given !IJt€K(n,Q) we determine, as in (A.I), a member p of such that m=F, 01). Next we determine ,iiE Q such that p = fi * t, for some

W. W. BOONE, W. HAKEN and V. POÉNARU

t , 3 4p@) + 4r(p). Clearly (by the definition of 0)such a p can be located. Next we compute fi2(@), fi2(!JX). Suppose that p2(‘5si)=fi2(9X). Then, by (3D), !?JIM i@ if and only if p presents the trivial group. Trivially, if fi2(9X)#

p,(@) then not @ %i!JX.

(B.11) Conversely, suppose we are given ~ E Q Let . t(p)=N-p(p)+r(p).

Then t@)>4p&)+4r(p), hence p*t&)EQ. By (3B), (3C) and (3D), p is trivial if and only if F,(p* t ( p ) ) z i%%.

1.4. Tarski's decision method for the elementary algebra of real numbers

Later, in order to show that our notion of a finite presentation of a manifold satisfies certain basic requirements, we shall require a celebrated result of Tarski [50] which we here briefly explain. Following Tarski, by an elementary expression (in the elementary algebra of real numbers) we mean a meaningful expression built up from the following:

(1) Variables ranging over the real numbers;
(2) Constants denoting the natural numbers;
(3) Symbols expressing addition, subtraction, multiplication, and division of real numbers; and symbols expressing the relations of "greater than" and "equals" between real numbers;
(4) The sentential connectives, "or", "and", "not", and "implies";
(5) The quantifiers, "for all x" and "there exists an x", where x is one of the variables of (1).

It is important to realize that we cannot, in general, speak about sets of real numbers by means of elementary expressions. E.g., while we can talk about specific integers (thus "x is 5 or 7" is elementary) we cannot speak about the set of integers (thus "x is an integer" is not an elementary expression). But all the familiar field and order properties of the real numbers can be given by elementary expressions - as can all the familiar properties of polynomials of degree n, for fixed n. Then Tarski's result is as follows.

LEMMA 2 (Tarski [50] Theorem 37; Cohen [16]). There exists an algorithm to determine whether or not a given elementary expression in the elementary algebra of real numbers be true.
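To make the notion of an elementary expression concrete, here is a small Python sketch for one textbook instance. It is not Tarski's decision method; it only illustrates the kind of reduction the method performs in general. The elementary expression "there exists an x such that x·x + b·x + c equals 0" is equivalent to the quantifier-free expression "b·b is greater than 4·c, or b·b equals 4·c", which can be checked by exact arithmetic. The function name is an assumption made for this illustration.

```python
from fractions import Fraction


def has_real_root(b, c):
    """Decide 'there exists a real x with x*x + b*x + c = 0' for rational b, c.

    Uses the quantifier-free equivalent: b*b > 4*c or b*b = 4*c, written
    with only the relations "greater than" and "equals" of the text.
    """
    lhs = Fraction(b) * Fraction(b)
    rhs = 4 * Fraction(c)
    return lhs > rhs or lhs == rhs


# x*x - 1 = 0 has a real solution, x*x + 1 = 0 has none.
print(has_real_root(0, -1), has_real_root(0, 1))
```

The quantifier has disappeared from the test, and what remains is a finite computation on the coefficients.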

2. Algebra

2.1. Unsolvability results about group theoretic problems

The present paper will rely on the following lemmas which are essentially¹¹ Results 4 and 7 of Boone [8] in the present volume. Definitions of the isomorphism problem and triviality problem are given there.

LEMMA 3. For each recursively enumerable degree of unsolvability $D$, there exists a fully recursive class $Q(D)$ of finite presentations of groups, such that
(3i) The isomorphism problem for $Q(D)$ is of degree $D$;
(3ii) The triviality problem for $Q(D)$ is of degree $D$;
(3iii) $Q(D)$ contains the empty presentation $\mu_0 = (\emptyset, \emptyset)$;
(3iv) For all $\mu, \mu' \in Q(D) - \{\mu_0\}$ we have $r(\mu) = r(\mu')$ and $\rho(\mu) = \rho(\mu')$.

LEMMA 4. For each ordered pair $(D, D')$ of recursively enumerable degrees of unsolvability, there exists a recursive family $F(D, D')$ of fully recursive classes of finite presentations of groups, such that: The decision problem to determine, for an arbitrary member $Q$ of $F(D, D')$, whether or not the isomorphism problem for $Q$ be of degree $D$, is itself of degree $D'$.

2.2. On certain special Tietze transformations

We introduce now the following "elementary operations" on finite presentations of groups (compare Markov [26]):

$\mathrm{Op}_1$: The presentation $\mu = (\{x_1, \ldots, x_r\}, \{\alpha_1, \ldots, \alpha_{i-1}, \alpha_i = \alpha'\alpha'', \alpha_{i+1}, \ldots, \alpha_\rho\})$ (where $\alpha_1, \ldots, \alpha_\rho, \alpha', \alpha''$ are words in the $x_j^{\pm 1}$'s) is replaced by the presentation $\mu_1 = (\{x_1, \ldots, x_r\}, \{\alpha_1, \ldots, \alpha_{i-1}, \alpha' x_j^{\varepsilon} x_j^{-\varepsilon} \alpha'', \alpha_{i+1}, \ldots, \alpha_\rho\})$ where $j \in \{1, \ldots, r\}$, $\varepsilon = \pm 1$.

$\mathrm{Op}_1^{-1}$: The inverse of $\mathrm{Op}_1$ (deleting a syllable $x_j^{\varepsilon} x_j^{-\varepsilon}$ in one relator).

$\mathrm{Op}_2$: The presentation $\mu = (\{x_1, \ldots, x_r\}, \{\alpha_1, \ldots, \alpha_{i-1}, \alpha_i, \alpha_{i+1}, \ldots, \alpha_\rho\})$ is replaced by $\mu_1 = (\{x_1, \ldots, x_r\}, \{\alpha_1, \ldots, \alpha_{i-1}, \alpha'_i, \alpha_{i+1}, \ldots, \alpha_\rho\})$ where the word $\alpha'_i$ is a cyclic permutation of the word $\alpha_i$.

$\mathrm{Op}_3$: The presentation $\mu = (\{x_1, \ldots, x_r\}, \{\alpha_1, \ldots, \alpha_{i-1}, \alpha_i, \alpha_{i+1}, \ldots, \alpha_\rho\})$ is replaced by $\mu_1 = (\{x_1, \ldots, x_r\}, \{\alpha_1, \ldots, \alpha_{i-1}, \alpha_i^{-1}, \alpha_{i+1}, \ldots, \alpha_\rho\})$.

$\mathrm{Op}_4$: The presentation $\mu = (\{x_1, \ldots, x_r\}, \{\alpha_1, \ldots, \alpha_{i-1}, \alpha_i, \alpha_{i+1}, \ldots, \alpha_\rho\})$ is replaced by $\mu_1 = (\{x_1, \ldots, x_r\}, \{\alpha_1, \ldots, \alpha_{i-1}, \alpha_i\alpha_j, \alpha_{i+1}, \ldots, \alpha_\rho\})$ where $j \in \{1, \ldots, \rho\}$, $j \ne i$.

$\mathrm{Op}_5$: The presentation $\mu = (\{x_1, \ldots, x_r\}, \{\alpha_1, \ldots, \alpha_\rho\})$ is replaced by $\mu_1 = (\{x_1, \ldots, x_r, x_{r+1}\}, \{\alpha_1, \ldots, \alpha_\rho, x_{r+1}\alpha\})$ where $x_{r+1}$ is a letter, different from $x_1, \ldots, x_r$, and $\alpha$ is a word in the letters $x_1^{\pm 1}, \ldots, x_r^{\pm 1}$.

¹¹ Lemma 3 with (3iii) deleted is Result 4 of Boone [8]. But certainly, adjoining $\mu_0$ to Boone's class $C(D)$ does not change the degree of either the isomorphism problem or the triviality problem. In his Result 7, all presentations concerned may be taken to have the same set of generators, so that recursive implies fully recursive.

$\mathrm{Op}_5^{-1}$: The inverse of $\mathrm{Op}_5$ (deleting in $\mu$ a generator $x_j$ and a relator $\alpha_i$ that reads $x_j\alpha$, where $\alpha_1, \ldots, \alpha_{i-1}, \alpha, \alpha_{i+1}, \ldots, \alpha_\rho$ do not contain letters $x_j$ or $x_j^{-1}$).
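The elementary operations listed above are easy to mechanize. The following Python sketch is only an illustration, with an encoding chosen for this example and not taken from the paper: a word is a tuple of non-zero integers (`j` for $x_j$, `-j` for $x_j^{-1}$), and a set of relators is a list of such tuples. The helper `undo_op4` shows one way to reverse $\mathrm{Op}_4$ using only $\mathrm{Op}_3$, $\mathrm{Op}_4$ and $\mathrm{Op}_1^{-1}$.

```python
def op1(word, pos, j, eps=1):
    """Op1: insert the syllable x_j^eps x_j^-eps at position pos."""
    return word[:pos] + (eps * j, -eps * j) + word[pos:]


def op1_inv(word, pos):
    """Inverse of Op1: delete a cancelling syllable that starts at pos."""
    if word[pos] != -word[pos + 1]:
        raise ValueError("no cancelling syllable at this position")
    return word[:pos] + word[pos + 2:]


def op2(word, shift):
    """Op2: replace a relator by a cyclic permutation of itself."""
    shift %= max(len(word), 1)
    return word[shift:] + word[:shift]


def op3(word):
    """Op3: replace a relator by its inverse word."""
    return tuple(-letter for letter in reversed(word))


def op4(relators, i, j):
    """Op4: replace relator i by the product relator_i * relator_j."""
    if i == j:
        raise ValueError("Op4 requires j != i")
    out = list(relators)
    out[i] = relators[i] + relators[j]
    return out


def op5(r, relators, alpha):
    """Op5: add the generator x_{r+1} and the relator x_{r+1} * alpha."""
    return r + 1, list(relators) + [(r + 1,) + tuple(alpha)]


def undo_op4(relators, i, j):
    """Reverse Op4 by Op3, Op4, several Op1^-1, and Op3 again."""
    rel = list(relators)
    n = len(rel[j])
    rel[j] = op3(rel[j])                 # Op3 on relator j
    rel = op4(rel, i, j)                 # relator i is now A * B * B^-1
    a = len(rel[i]) - 2 * n              # length of the original relator A
    for t in range(n):                   # cancel B against B^-1, inside out
        rel[i] = op1_inv(rel[i], a + n - 1 - t)
    rel[j] = op3(rel[j])                 # restore relator j
    return rel
```

For example, starting from the relators `[(1, 2), (2, -1, 3)]`, applying `op4(..., 0, 1)` and then `undo_op4(..., 0, 1)` returns the original list.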

Remarks. Clearly, these operations preserve the isomorphism class of $\mu$: $\mathfrak{G}(\mu) \cong \mathfrak{G}(\mu_1)$. Moreover, $\mathrm{Op}_1^{\pm 1}$, $\mathrm{Op}_2$, $\mathrm{Op}_3$, $\mathrm{Op}_4$ preserve also the group presented, $\mathfrak{G}(\mu) = \mathfrak{G}(\mu_1)$, i.e., the alphabet is not changed and hence $\mathfrak{F}(\mu) = \mathfrak{F}(\mu_1)$ and also $\mathfrak{N}(\mu) = \mathfrak{N}(\mu_1)$. The inverse operations of $\mathrm{Op}_2$ and $\mathrm{Op}_3$ are again operations $\mathrm{Op}_2$, $\mathrm{Op}_3$, respectively; the inverse of $\mathrm{Op}_4$ can be composed of one application of $\mathrm{Op}_3$ followed by one application of $\mathrm{Op}_4$, several applications of $\mathrm{Op}_1^{-1}$, and another $\mathrm{Op}_3$. The following lemma generalizes a lemma of Markov [26] and is quite parallel to a well-known result of Tietze:

LEMMA 5. Let $\mu = (\{x_1, \ldots, x_r\}, \{\alpha_1, \ldots, \alpha_\rho\})$ and $\mu' = (\{y_1, \ldots, y_{r'}\}, \{\beta_1, \ldots, \beta_{\rho'}\})$ be two isomorphic group presentations, and let $t, t'$ be any integers such that $\rho + t - r = \rho' + t' - r'$, $t > \rho + r'$, and $t' > \rho' + r$. Then the presentation $\mu * t$ can be transformed by means of a finite sequence of operations $\mathrm{Op}_1^{\pm 1}$, $\mathrm{Op}_2$, $\mathrm{Op}_3$, $\mathrm{Op}_4$, $\mathrm{Op}_5^{\pm 1}$ into the presentation $\mu' * t'$.

Proof. We may assume that $\rho + r' \ge \rho' + r$. Let $\iota: \mathfrak{G}(\mu) \to \mathfrak{G}(\mu')$ be an isomorphism of $\mathfrak{G}(\mu)$ onto $\mathfrak{G}(\mu')$. Our first objective is to transform $\mu * t$ and $\mu' * t'$ into presentations $\mu^\#$ and $\mu'^\#$, respectively, both in the same generators, say $z_1, \ldots, z_{r^\#}$ with $r^\# = r + r'$, where $z_1, \ldots, z_r$ "correspond" to $x_1, \ldots, x_r$ and $z_{r+1}, \ldots, z_{r^\#}$ correspond to $y_1, \ldots, y_{r'}$. Let $\xi_1, \ldots, \xi_{r'}$ be words in the $x_i$'s such that the corresponding elements $\xi_j\mathfrak{N}(\mu) \in \mathfrak{G}(\mu)$ (where $\xi_j\mathfrak{N}(\mu)$ means the coset of $\xi_j$ in $\mathfrak{F}(\mu)$ with respect to $\mathfrak{N}(\mu)$) fulfill the condition

$$\xi_j\mathfrak{N}(\mu) = \iota^{-1}(y_j\mathfrak{N}(\mu')) \qquad (j = 1, \ldots, r');$$

correspondingly, let $\eta_1, \ldots, \eta_r$ be words in the $y_j$'s such¹² that $\eta_i\mathfrak{N}(\mu') = \iota(x_i\mathfrak{N}(\mu))$ $(i = 1, \ldots, r)$. Now we transform $\mu * t$ by $r^\# = r + r'$ operations $\mathrm{Op}_5$ into

$$\mu_1 = (\{x_1, \ldots, x_r, z_1, \ldots, z_{r^\#}\}, \{\alpha_1, \ldots, \alpha_\rho, z_1x_1^{-1}, \ldots, z_rx_r^{-1}, z_{r+1}\xi_1^{-1}, \ldots, z_{r^\#}\xi_{r'}^{-1}, *t\});$$

correspondingly we transform $\mu' * t'$ into

$$\mu'_1 = (\{y_1, \ldots, y_{r'}, z_1, \ldots, z_{r^\#}\}, \{\beta_1, \ldots, \beta_{\rho'}, z_1\eta_1^{-1}, \ldots, z_r\eta_r^{-1}, z_{r+1}y_1^{-1}, \ldots, z_{r^\#}y_{r'}^{-1}, *t'\}).$$

¹² In less precise terms $\xi_j$ is a word in the $x_i^{\pm 1}$'s standing for the same abstract group element as $y_j$, $j = 1, 2, \ldots, r'$; and similarly for $\eta_i$ and $x_i$, $i = 1, 2, \ldots, r$.

Next we replace, step by step, all the $x_i$ in $\alpha_1, \ldots, \alpha_\rho$, and in $\xi_1^{-1}, \ldots, \xi_{r'}^{-1}$ by the $z_i$. To achieve this we first replace the relators $z_ix_i^{-1}$ in $\mu_1$ by $x_iz_i^{-1}$ ($\mathrm{Op}_3$); next if a relator, other than those resulting from the transformations just performed, contains a letter $x_i$ or $x_i^{-1}$ we transform it into a relator of the form $\tau x_i^{-1}$ ($\mathrm{Op}_2$ and, perhaps, $\mathrm{Op}_3$); next (using $\mathrm{Op}_4$) we obtain $\tau x_i^{-1}x_iz_i^{-1}$; and then (by $\mathrm{Op}_1^{-1}$) $\tau z_i^{-1}$. Correspondingly, we replace all the $y_j$ in $\beta_1, \ldots, \beta_{\rho'}$, and in $\eta_1^{-1}, \ldots, \eta_r^{-1}$ by the $z_{r+j}$. Now, by $r$ operations $\mathrm{Op}_5^{-1}$ we remove the generators $x_i$ and the relators $x_iz_i^{-1}$ from $\mu_1$; also we remove the $y_j$'s and the $y_jz_{r+j}^{-1}$'s from $\mu'_1$. This yields the presentations

$$\mu^\# = (\{z_1, \ldots, z_{r^\#}\}, \{\gamma_1, \ldots, \gamma_{\rho^\#}, *t\}),$$

$$\mu'^\# = (\{z_1, \ldots, z_{r^\#}\}, \{\delta_1, \ldots, \delta_{\rho^\#}, *t\}), \qquad \rho^\# = \rho + r',$$

where $\gamma_1, \ldots, \gamma_{\rho^\#}$ are obtained from $\alpha_1, \ldots, \alpha_\rho, z_{r+1}\xi_1^{-1}, \ldots, z_{r^\#}\xi_{r'}^{-1}$, respectively, by replacing $x_i$ by $z_i$ and where $\delta_1, \ldots, \delta_{\rho'+r}$ are obtained from $\beta_1, \ldots, \beta_{\rho'}, z_1\eta_1^{-1}, \ldots, z_r\eta_r^{-1}$, respectively, by replacing $y_j$ by $z_{r+j}$; and where

dpr + . .., dP, are empty words. Next we wish to show that in the free group on the z l , ..., z , , the di's generate the same normal subgroup as13 the 7;s (%(p'#)=%(p#)), i.e., that for i = l , . . . , p # ai hi = TikY:::T.i1

n 1 (#)IYi kgl k=l bi

=

Fikhyiil

in the free group on the z's for some words Tik,i + i k , To prove % ( p i ) = %(,u#) we consider the "obvious isomorphisms" : to x i % ( p , ) , I ; : Q ( p ' ) + @ ( p i ) , mapping y j % ( p ' ) to yj%(,ui), i 2 : @(,u#)+ 6(p1), mapping z k % ( p # ) to zk%(pl), &:@(pi)+ %(p;),mappingz,%(pl#)to zk%(p;),

1':

@ ,p)

Now,

+

6 ( p l ) , mapping xi%(,u)

,-I

I# =i2

I

t1z~;'z2:

(i = I ,

..., r ) ;

( j = 1, ..., r');

( k = 1, ..., r # ) ; ( k = I,..., r # ) .

%(p,)-+@gl'#) is an isomorphism, mapping

since

¹³ Or, to speak in terminology not elsewhere used in this paper, we want to show that $\mu^\#$ and $\mu'^\#$ are isomorphic in the obvious way, i.e., $z_i$ maps into $z_i$.

Thus z # maps z%(p#) to T%@'#) where 5 is any word in the z's, and hence I # maps %(pLI#)%(p,) to S i p ' # ) %(p'#)= %(p'#). That means, since also I # ( % ( p # ) ) = % ( p ' # ) (choosing the identity for T ) , %(p'#) % ( p # ) = % ( p # ) ; hence % ( p ' # ) c % ( p , ) . Thus, in view of the symmetry, % ( p # ) = % ( p ' # ) . Now, as we shall show, we can transform p # into the presentation using operations op:', pt=({zl ,..., z r , ] , { y l ,..., y p , , d,, ..., dps, Op,, Op, and Op,. Correspondingly, we can transform p'# into p,. This will prove Lemma 5. The transformation p # into p g (p'# into p , is completely analogous) can be done as follows: First we remark that the operation

-

Op:({xt, *..,x r } ? (mi, ...) E i - 1 9 xi, x i + , , ...) ~

p } )

xi-,,

X r I y {~l,...,

+({~1,...,

w r C ~ - ' ,E i + l , . . . ,

~ p ] ) ,

where o is an arbitrary word in the xj's and E = 5 1 can be composed by a sequence of operations Op,, Op, and Op,. The inverse operation, a p - ' , can be composed by a sequence of Op;', Op, and Op,. We transform p # by d p into ({Zl

,..., z,,), (Y,,

. ' . 3

Yl,lI-13

w:;y;,'.

Yulltlr

?)pa,

~' 3

*'I);

then by Op, into ({Zl

,-.> G#}, { Y l , .'.>Yal,-l, ~llY:~:~l-tl,Yul,+l,.'., Y P # ' ~llY:::~;ll,*t-l});

and then by d p - ' into ( { ~ 1 , . . * z, r g ) , { ~ 1 , . . . , ~ u l l - 1 9 ~ u I 1 )~ u i l + t , . . . , ~ p r ' , ~ 1 ~ : ~ ~ ' ~ ~ ~ ~ * ~ - ~ } ) .

Then by a similar triplet of operations, we obtain ( { z 1 , . . . >Z r 5 ) ,

{~1,..-r 'ipt'

1

T,,Y",::T;~T I

FIZT-l ~ Y U I 1 ~2

*t-I 3

}).

Continuing in this way we obtain (by a, -2 further triplets - see (#)) ({zlr

..., z r # j , ( ~ 1 ,

n T,,Y",~TG',*'-'I). a1

. . I >

~ p r r

k= 1

From this we have (by (#)) by a sequence of Op:' ({Zl,

...3

Z'J>

{Yl?

...9

Yps7

81,

*'-'I).


In a similar way we obtain further

This completes the proof of Lemma 5.

3. Topology

3.1. Preliminaries on manifolds and handles

In this section we recall the basic topological concepts which we use in this paper. However, we expect the reader to be familiar with the elementary concepts of general topology and of algebraic topology, especially the notions of fundamental group and homology groups, as explained in the usual text books, e.g. Seifert and Threlfall [45], Cairns [14], Spanier [48].

Topological manifolds. A (topological) n-manifold $M$ (with or without boundary) is a connected, separable, metric space each point $p$ of which possesses a closed neighborhood $N(p)$ that is homeomorphic to the compact unit n-ball $x_1^2 + \cdots + x_n^2 \le 1$ in euclidean n-space $E_n$. A point $p \in M$ is called a boundary point of $M$ if it lies in the boundary $\partial N(p)$ of its closed neighborhood $N(p)$; otherwise $p$ is called an interior point of $M$. The set $\partial M$ of all boundary points of $M$ is called the boundary of $M$; $\operatorname{Int} M = M - \partial M$ is called the interior of $M$. If $M$ is compact and $\partial M = \emptyset$ then $M$ is called a closed n-manifold. An n-manifold $D_n$ which is homeomorphic to the compact unit n-ball is called a (compact, topological) n-ball; $\partial D_n$ is called a (topological) (n - 1)-sphere.

Combinatorial manifolds. A topological m-simplex $\sigma_m$ is an equivalence class of homeomorphisms $\varphi: |\sigma_m| \to \delta_m$ of a topological m-ball $|\sigma_m|$, "the point set of $\sigma_m$", onto a rectilinear simplex $\delta_m$ in $E_n$, where two homeomorphisms $\varphi': |\sigma_m| \to \delta'_m$ and $\varphi'': |\sigma_m| \to \delta''_m$ are equivalent if there is a linear map $\chi: \delta'_m \to \delta''_m$, i.e., $\chi$ is a homeomorphism given by linear equations, such that $\varphi'' = \chi \cdot \varphi'$. If $\delta_r$ is a face of $\delta_m$ (we permit $r = m$) then $\varphi \mid \varphi^{-1}(\delta_r)$ represents a face $\sigma_r$ of $\sigma_m$. A simplicial complex $\Delta$ is a set of topological simplices such that (i) if $\sigma \in \Delta$ then all faces of $\sigma$ also belong to $\Delta$, (ii) if $\sigma, \sigma' \in \Delta$, $\sigma \ne \sigma'$, then $|\sigma| \cap |\sigma'|$ is either empty or the point set of a face both of $\sigma$ and of $\sigma'$. The point set union $|\Delta|$ of $\Delta$, with the topology induced by the $\varphi$'s of $\Delta$, is called the polyhedron of $\Delta$.

A simplicial complex $\Delta^*$ is called a semilinear subdivision of $\Delta$ if (a) $|\Delta^*| = |\Delta|$, (b) for each $\sigma^* \in \Delta^*$ there is a $\sigma \in \Delta$, represented by a homeomorphism $\varphi: |\sigma| \to \delta$, such that $|\sigma^*| \subset |\sigma|$ and $\varphi \mid |\sigma^*|$ represents $\sigma^*$.


It is often convenient to consider simplicial complexes $\Delta$ such that the point sets $|\sigma|$ of the simplices $\sigma$ are (rectilinear) simplices in a euclidean space $E_n$ and such that the identity map on $|\sigma|$ represents $\sigma$. In this situation one may identify $\sigma$ and $|\sigma|$ and call $\Delta$ a (rectilinear) simplicial complex in $E_n$. By a triangulation of the topological manifold $M$ we mean a simplicial complex $\Delta$ with $|\Delta| = M$. It is a famous problem in topology whether every n-manifold admits a triangulation; for $n > 4$ this is still an open question.

Two simplicial complexes $\Delta_1, \Delta_2$ are combinatorially equivalent if they possess semilinear subdivisions $\Delta_1^*, \Delta_2^*$ respectively, that are isomorphic (i.e., such that there is a 1-1 correspondence between the simplices of $\Delta_1^*$ and $\Delta_2^*$ that preserves the dimensions and the incidence relations). It is clear that in this situation there exists a so-called semilinear homeomorphism $\lambda: |\Delta_1| \to |\Delta_2|$ that maps the simplices of $\Delta_1^*$ linearly onto simplices of $\Delta_2^*$. A simplicial complex $\Gamma$ is called a combinatorial n-ball if $\Gamma$ is combinatorially equivalent to the triangulation of an n-simplex into all its faces. A combinatorial n-manifold is an n-manifold $M$ together with a triangulation $\Delta$ such that for each vertex (= 0-simplex) $p \in \Delta$ the simplex star $\operatorname{St}(p \mid \Delta)$ (= the set of all simplices of $\Delta$ that are incident with $p$, and their faces) is a combinatorial n-ball. Such a triangulation is called a combinatorial structure on $M$.

A famous conjecture in topology is the so-called Hauptvermutung; it states that two complexes $\Delta_1, \Delta_2$ are combinatorially equivalent if $|\Delta_1|$ and $|\Delta_2|$ are homeomorphic. The converse is trivial. This has been disproved for complexes in general (Milnor [29]), but the restriction to combinatorial n-manifolds is still an open question for $n \ge 4$. It has been proved for $n = 3$ - Moise [30], Bing [4]; recently a proof was obtained for all simply connected (i.e., with trivial fundamental group) combinatorial n-manifolds with $n \ge 5$ (Sullivan [49]).

By a finite (non-oriented) abstract complex $\theta$ we shall understand a finite collection of finite sets; each set consists of letters, called vertices, taken from some suitable alphabet, say $p_1, p_2, \ldots, p_u$; and further $S \in \theta$ and $S' \subset S$ imply that $S' \in \theta$. If $S \in \theta$ contains $n + 1$ vertices, then we call $S$ an (abstract) n-simplex of $\theta$. And a subset of $S$ is called a face of $S$. The relation of isomorphy can be defined for abstract complexes in the same way as for simplicial complexes; it is also meaningful to say that an abstract complex is isomorphic to a simplicial complex. In fact, a finite abstract complex can be regarded as a finite presentation of an isomorphism class of simplicial complexes. We call an abstract complex $\theta^*$ a subdivision of the abstract complex $\theta$ if every vertex of $\theta$ is also a vertex of $\theta^*$ and if there are simplicial complexes $\Delta^*$, isomorphic to $\theta^*$, and $\Delta$, isomorphic to $\theta$, such


that $\Delta^*$ is a semilinear subdivision of $\Delta$. Now we may define the combinatorial equivalence for abstract complexes in the same way as for simplicial complexes.

Two n-manifolds $M, M'$ are called homotopy equivalent if there exist continuous maps $f: M \to M'$ and $g: M' \to M$ such that $g \circ f: M \to M$ is homotopic to the identity map on $M$, and $f \circ g: M' \to M'$ is homotopic to the identity map on $M'$.

Differentiable manifolds. A coordinate system on a (topological) n-manifold $M_n$ (with or without boundary) is a homeomorphism $h: W \to H_n$, where $W$ and $h(W)$ are open sets in $M_n$ and in $H_n$, respectively, and where $H_n$ is an n-dimensional euclidean half-space (i.e., the subset of euclidean n-space $E_n$ for which $x_n \ge 0$). The mapping $h$ associates with each point $p \in W$ the coordinates of $h(p)$ in $H_n$. Two coordinate systems $h_1: W_1 \to H_n$ and $h_2: W_2 \to H_n$ are called $C^r$-related, $r \in \{1, 2, \ldots, \infty, \omega\}$, if the corresponding "coordinate transformation" $h_1 \circ h_2^{-1} \mid h_2(W_1 \cap W_2): h_2(W_1 \cap W_2) \to h_1(W_1 \cap W_2)$ is of differentiability class $C^r$ (here $C^\omega$ is the class of analytic functions), and has non-zero Jacobian determinant in all of $h_2(W_1 \cap W_2)$. A $C^r$-atlas of $M_n$ is a system $h_\delta: W_\delta \to H_n$ of pairwise $C^r$-related coordinate systems on $M_n$ ($\delta$ ranging over an arbitrary index set) such that the $W_\delta$'s cover $M_n$. A $C^r$-structure on $M_n$ is a maximal $C^r$-atlas, i.e., not a proper subsystem of another $C^r$-atlas. A $C^r$-n-manifold is an n-manifold $M_n$ together with a $C^r$-structure, and this is called a differentiable manifold if $r = \infty$.

Let $M_n, M'_n$ be topological manifolds; if $h: W \to H_n$ is a coordinate system on $M_n$ and if $\varphi: M_n \to M'_n$ is a homeomorphism, then we call $h \circ \varphi^{-1}: \varphi(W) \to H_n$ "the coordinate system on $M'_n$ carried over by $\varphi$ from $h$". Furthermore, if $S$ is a $C^r$-structure on $M_n$ then the collection of all coordinate systems on $M'_n$ carried over by $\varphi$ from members of $S$ is a $C^r$-structure on $M'_n$; we shall call this the $C^r$-structure carried over by $\varphi$ (from $S$). Two differentiable manifolds $M, M'$ are called diffeomorphic if there is a "diffeomorphism" $M \to M'$, i.e., a homeomorphism of $M$ onto $M'$ that carries the differentiable structure of $M$ into that of $M'$.

We remark that, if $\partial M_n \ne \emptyset$, each $C^r$-structure on $M_n$ "induces" a $C^r$-structure on the (n - 1)-manifold $\partial M_n$, since for each coordinate system $h: W \to H_n$ we have $h(W \cap \partial M_n) = h(W) \cap E_{n-1}$ where $E_{n-1}$ means the (n - 1)-space $x_n = 0$ bounding $H_n$ in $E_n$. A $C^r$-imbedding of a $C^{r'}$-manifold $M_n$ with $r' \ge r$ in euclidean m-space $E_m$ means a homeomorphism $g: M_n \to E_m$ that is of differentiability class $C^r$ with respect to the $C^{r'}$-structure of $M_n$, i.e., $g \circ h^{-1}: h(W) \to E_m$ is of class $C^r$ and the Jacobian matrix of $g \circ h^{-1}$ is of rank $n$, whenever $h: W \to H_n$ is a coordinate system of the $C^{r'}$-structure of $M_n$.


If $M_n$ is a $C^r$-manifold then a combinatorial structure $\Delta$ on $M_n$ is called compatible with the $C^r$-structure of $M_n$ if each m-simplex $\sigma_m \in \Delta$ is a so-called $C^r$-simplex, i.e., there is a coordinate system $h: W \to H_n$ in the $C^r$-structure of $M_n$ such that $|\sigma_m| \subset W$ and $h(|\sigma_m|)$ is a rectilinear m-simplex in $H_n$. For more details on differential topology see for instance Munkres [32].

Concerning differentiable manifolds we quote the following famous results:
(i) Each $C^1$-structure contains a $C^\infty$-structure and moreover an analytic, $C^\omega$, structure (Whitney [52]).
(ii) Each differentiable manifold admits a (compatible) combinatorial structure (Cairns [11, 13]; Whitehead [51]).
(iii) There exist combinatorial manifolds that do not admit a (compatible) differentiable structure (Kervaire [22]).
(iv) For $n \ge 7$ the n-sphere admits several different differentiable structures (Milnor [27]).

Handles. For the construction of differentiable and combinatorial manifolds with prescribed fundamental groups, we shall use the operation of "handle-adding" to a manifold $M$ with boundary which induces the operation of "Morse-surgery" on the boundary-manifold $\partial M$. For the remainder of this section, all manifolds considered are to be differentiable or combinatorial or both. Let $M_n$ be a compact manifold with $\partial M_n \ne \emptyset$. We consider an (n - 1)-dimensional submanifold $M'_{n-1} \subset \partial M_n$ and $p$ copies of the n-ball $D_n$: $D_n^1, D_n^2, \ldots, D_n^p$. Each of them is regarded as a Cartesian product of the $\lambda$-dimensional ball with the $(n - \lambda)$-dimensional ball. Passing to the boundaries, we have:

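$$\partial D_n^i = (\partial D_\lambda^i \times D_{n-\lambda}^i) \cup (D_\lambda^i \times \partial D_{n-\lambda}^i) = (S_{\lambda-1}^i \times D_{n-\lambda}^i) \cup (D_\lambda^i \times S_{n-\lambda-1}^i) \qquad (i = 1, \ldots, p).$$

Here $S_{\lambda-1}^i$ means the $(\lambda - 1)$-sphere $\partial D_\lambda^i$; actually $\operatorname{Int}(S_{\lambda-1}^i \times D_{n-\lambda}^i) \cap \operatorname{Int}(D_\lambda^i \times S_{n-\lambda-1}^i) = \emptyset$; further

$$\partial(S_{\lambda-1}^i \times D_{n-\lambda}^i) = S_{\lambda-1}^i \times \partial D_{n-\lambda}^i = S_{\lambda-1}^i \times S_{n-\lambda-1}^i$$

and

$$\partial(D_\lambda^i \times S_{n-\lambda-1}^i) = \partial D_\lambda^i \times S_{n-\lambda-1}^i = S_{\lambda-1}^i \times S_{n-\lambda-1}^i$$

are identical. Let us consider $p$ differentiable and/or semilinear¹⁴ homeomorphisms

$$\varphi^i: S_{\lambda-1}^i \times D_{n-\lambda}^i \to \operatorname{Int} M'_{n-1} \qquad (i = 1, \ldots, p)$$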

such that $\operatorname{Image} \varphi^i \cap \operatorname{Image} \varphi^k = \emptyset$ for $i \ne k$. Let us consider the quotient space obtained from $M_n \cup D_n^1 \cup \cdots \cup D_n^p$ if every $x \in S_{\lambda-1}^i \times D_{n-\lambda}^i$ $(i = 1, \ldots, p)$ is identified with $\varphi^i(x) \in M'_{n-1} \subset \partial M_n$. The topological space we obtain in this way is an n-manifold and has a "natural" differentiable and/or combinatorial structure. (See Smale [47]; in this paper we shall need only the cases $\lambda = 1$ or $2$ where $n \ge 5$, and in these cases we shall directly present these structures.) And we denote this differential and/or combinatorial n-manifold, as Smale does, by

$$\chi(M_n, M'_{n-1}; \varphi^1, \ldots, \varphi^p; \lambda).$$

It is called the result of "adding $p$ handles of index $\lambda$ to $M_n$ on $M'_{n-1}$".

We recall the definition of "Morse surgery" (see e.g. Milnor [28]). Let $M'_{n-1}$ be a compact manifold. We consider $p$ copies of the (n - 1)-sphere $S_{n-1}^1, \ldots, S_{n-1}^p$. Each of them is regarded as a union of two Cartesian products

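$$S_{n-1}^i = (S_{\lambda-1}^i \times D_{n-\lambda}^i) \cup (D_\lambda^i \times S_{n-\lambda-1}^i)$$

where

$$\partial(S_{\lambda-1}^i \times D_{n-\lambda}^i) = \partial(D_\lambda^i \times S_{n-\lambda-1}^i).$$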

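Let us also consider $p$ differentiable and/or semilinear¹⁴ homeomorphisms

$$\varphi^i: S_{\lambda-1}^i \times D_{n-\lambda}^i \to \operatorname{Int} M'_{n-1} \qquad (i = 1, \ldots, p)$$

such that $\operatorname{Image} \varphi^i \cap \operatorname{Image} \varphi^k = \emptyset$ for $i \ne k$. Let $C$ be the closure of $M'_{n-1} - \bigcup_{i=1}^{p} \varphi^i(S_{\lambda-1}^i \times D_{n-\lambda}^i)$. We have

$$\partial C = \partial M'_{n-1} \cup \bigcup_{i=1}^{p} \varphi^i(S_{\lambda-1}^i \times S_{n-\lambda-1}^i)$$

where $\partial(S_{\lambda-1}^i \times D_{n-\lambda}^i) = S_{\lambda-1}^i \times S_{n-\lambda-1}^i$. Let us consider the quotient space obtained from

$$C \cup \bigcup_{i=1}^{p} D_\lambda^i \times S_{n-\lambda-1}^i$$

if every $x \in S_{\lambda-1}^i \times S_{n-\lambda-1}^i$ is identified with $\varphi^i(x) \in \partial C$.

¹⁴ Here "semilinear" means compatible with the combinatorial structures $\Delta_n^i$ of $D_n^i$ and $\Delta_n$ of $M_n$, i.e., $\varphi^i$ maps simplices of a certain semilinear subdivision of $\Delta_n^i$ linearly onto simplices of a semilinear subdivision of $\Delta_n$.

This space is in fact a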

manifold which we denote by

‘>

This is called the result of "Morse surgeries of index $\lambda$ applied to $M'_{n-1}$". The definitions of handle adding and Morse surgery have been given independently of each other; however, note that the operation of handle adding to $M_n$ on $M'_{n-1}$ induces Morse surgery to $M'_{n-1}$. Let us identify the spheres $S_{n-1}^i$ of the definition of Morse surgery with $\partial D_n^i$ of the definition of handle adding. Further, let us regard $M'_{n-1}$, $D_\lambda^i$, $S_{\lambda-1}^i$, $S_{n-\lambda-1}^i$ and $\varphi^i$ as identical in both definitions. Then we have

and in the special case that $M'_{n-1} = \partial M_n$ we have $\nu = \partial\chi$, which we shall speak of as "the (n - 1)-manifold obtained from $\partial M_n$ by the Morse surgery induced by the handle adding to $M_n$". We remark that every differentiable and/or combinatorial n-manifold $M_n$ without boundary can be obtained from the n-ball $D_n$ by successive handle-addings of index $\lambda = 1, 2, \ldots, n$ (see Smale [47]), - a list of the corresponding homeomorphisms $\varphi$ (for each $\lambda = 1, \ldots, n$) being then called a handle-presentation of $M_n$.

3.2. Finite presentation of differentiable and combinatorial manifolds. Proof of Theorem 4

Mathematicians have firmly fixed as a working concept the notion of "finite presentation of a group". On the other hand the notion of "finite presentation of a manifold" requires here a considerable discussion. The logician will recognize it as much the same sort of analysis by which one passes from the intuitive notion of effective process to the precise technical notion of recursive process (Church's Thesis). We contend that any definition of a "finite presentation $\mathfrak{M}$ of a differentiable and/or combinatorial n-manifold $M$" should satisfy the following conditions:
(a) $\mathfrak{M}$ is a finite notation, i.e., a finite sequence of symbols in some language;
(b) there is an algorithm to determine whether or not any given finite notation in this language be a finite presentation;
(c) to each finite presentation $\mathfrak{M}$ there is precisely one n-manifold $M(\mathfrak{M})$, presented by $\mathfrak{M}$.

However, a concept of finite presentation that fulfills these three necessary


conditions may still be unsatisfactory. We are thus led to the further demand that
(d) $\mathfrak{M}$ describe $M(\mathfrak{M})$ in a "natural" way.

As to the interpretation of (d) we take the point of view that a finite presentation $\mathfrak{M}$ of a differentiable and combinatorial manifold should have the property that (compare section 1.2) a triangulation $\Delta$ and a $C^\infty$-atlas $\mathfrak{A}$ of $M(\mathfrak{M})$ are described by $\mathfrak{M}$.

Let us remark that the "handle-presentation" (see last paragraph of section 3.1) of a differentiable manifold, which is a very useful tool for many investigations, is not a finite presentation unless the $C^\infty$-homeomorphisms $\varphi$ are described by a finite notation so as to fulfill Condition (a). We remark further that an abstract complex is a very satisfactory finite presentation of an isomorphism class of simplicial complexes. However, for presenting combinatorial manifolds of dimension > 3 we need a more elaborate concept because of condition (b). There is no algorithm known that allows us to decide whether or not a given complex represent a combinatorial n-manifold (whenever n > 3); for, for such an algorithm, we should need a solution of the combinatorial equivalence problem with the (n - 1)-sphere. Note that for dimension > 3 a presentation in terms of "incidence matrices" is to be rejected on the same grounds.

Now we shall define the finite presentation $\mathfrak{M}$ in such a way that it describes
(I) a euclidean q-space $E_q(\mathfrak{M})$;
(II) an n-dimensional, rectilinear, simplicial complex $\Delta(\mathfrak{M})$ in $E_q(\mathfrak{M})$ with rational vertices;
(III) for each simplex star of $\Delta(\mathfrak{M})$ a semilinear homeomorphism into an n-dimensional subspace of $E_q(\mathfrak{M})$ (which makes it evident that $\Delta$ is a combinatorial n-manifold); and
(IV) for each open simplex star of $\Delta(\mathfrak{M})$ a homeomorphism into an n-subspace of $E_q(\mathfrak{M})$ such that these homeomorphisms form a $C^\infty$-atlas $\mathfrak{A}(\mathfrak{M})$ on $|\Delta(\mathfrak{M})|$ and are described by a set of algebraic equations.
The algebraic equations will be derived from a $q \times q$-matrix $L$ and a $1 \times q$-matrix $u$ whose components are polynomials in the coordinates of $E_q(\mathfrak{M})$. Here we use techniques developed by Nash [33]. The homeomorphisms can be interpreted as mapping each point of $|\Delta|$ into the nearest point of an approximating sheet $\mathfrak{B}$ of an algebraic variety and then projecting this point into an n-subspace of $E_q(\mathfrak{M})$. The matrix $u$ approximates the component matrix of the distance vector from any point in a neighborhood of $\mathfrak{B}$ to the nearest point on $\mathfrak{B}$; the matrix $L$ approximates the component matrix of the tensor that projects each vector into the $(q - n)$-dimensional normal plane to $\mathfrak{B}$.

An algebraic atlas presentation $\mathfrak{M}$ of a closed n-manifold with a combinatorial and a compatible differentiable structure means an (ordered) collection

$$\mathfrak{M} = (x_1, \ldots, x_q;\ \mathbf{p}_1, \ldots, \mathbf{p}_s;\ \theta;\ i_1, \ldots, i_s;\ L;\ u;\ \delta, \varepsilon, D)$$

with the following properties:

(I) $x_1, \ldots, x_q$ are letters, called coordinate variables or simply coordinates. We denote by $E_q(\mathfrak{M})$, "the euclidean q-space presented by $\mathfrak{M}$", the euclidean q-space with coordinates $x_1, \ldots, x_q$.

(II) $\mathbf{p}_1, \ldots, \mathbf{p}_s$ are pairwise different $1 \times q$-matrices whose components are rational numbers; $\theta$ is a finite abstract n-dimensional complex with vertices $p_1, \ldots, p_s$. We denote by $p_1, \ldots, p_s$ the points in $E_q(\mathfrak{M})$ with coordinate matrices $\mathbf{p}_1, \ldots, \mathbf{p}_s$, respectively. We require the following further properties of the $p$'s and $\theta$:

(IIa) If $\{p_{j_0}, \ldots, p_{j_m}\} \in \theta$, then $p_{j_0}, \ldots, p_{j_m}$ are in general position, i.e., in $E_q(\mathfrak{M})$ there is a (rectilinear) m-simplex $\sigma_m$ with vertices $p_{j_0}, \ldots, p_{j_m}$.

(IIb) The set of all those simplices which correspond to the members of $\theta$ in the sense of (IIa) is a rectilinear simplicial complex in $E_q(\mathfrak{M})$, "the simplicial complex presented by $\mathfrak{M}$", denoted by $\Delta(\mathfrak{M})$; moreover, the boundary complex of $\Delta(\mathfrak{M})$ is empty.

(III) $i_k$ ($k \in \{1, \ldots, s\}$) is an n-tuple of positive integers $i_{1k} < i_{2k} < \cdots < i_{nk} \le q$ such that the (compact) simplex star $\operatorname{St}(p_k \mid \Delta(\mathfrak{M}))$ of $p_k$ in $\Delta(\mathfrak{M})$ projects 1-1 into the coordinate space $E_n(i_k)$ with coordinates $x_{i_{1k}}, \ldots, x_{i_{nk}}$. I.e., the map $\pi_k: E_q(\mathfrak{M}) \to E_n(i_k)$ that maps a point with coordinates $x_1^*, \ldots, x_q^*$ into the point with coordinates $x_{i_{1k}}^*, \ldots, x_{i_{nk}}^*$ induces a semilinear homeomorphism of $\operatorname{St}(p_k \mid \Delta(\mathfrak{M}))$ into $E_n(i_k)$.

(IV) $L$ is a symmetric $q \times q$-matrix, and $u$ is a $1 \times q$-matrix where the components of these matrices are polynomials in the variables $x_1, \ldots, x_q$ with rational coefficients. $\delta$, $\varepsilon$ and $D$ are positive rational numbers, $\varepsilon < 1/2n$; $D < 1$. Let $B_k$, $k = 1, 2, \ldots, s$, denote the q-ball in $E_q(\mathfrak{M})$ with radius $\delta$ and with center $p_k$. Then we require the following properties:

(IVa) For each point in $\bigcup_{k=1}^{s} B_k$, $L$ possesses $n$ "small" eigenvalues whose absolute values are smaller than $\varepsilon/n$ and $q - n$ "large" eigenvalues whose absolute values differ from 1 by less than $\varepsilon/n$. This is equivalent to the condition that the coefficients of the characteristic polynomial $\alpha(\lambda)$ of $L$ (in the variable $\lambda$, with highest coefficient normed to 1) are sufficiently close to the coefficients of $\lambda^n(\lambda - 1)^{q-n}$.

(IVb) No 1-simplex of $\Delta(\mathfrak{M})$ is larger than $4\delta$.

Let $\beta(\lambda)$ be that factor of $\alpha(\lambda)$ that embraces the $n$ small eigenvalues with highest coefficient 1. By Nash [33] the coefficients of $\beta(\lambda)$ are real, analytic functions of the $x_i$'s, and, by (IVa), all but one in absolute value $< \varepsilon$.
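Because the vertices have rational coordinates, requirement (IIa) can be tested by exact arithmetic, which is part of what makes condition (b) of this section plausible for such presentations. The Python sketch below is an illustration under stated assumptions: points are given as sequences of rationals, general position is tested as affine independence (the difference vectors from one vertex have full rank), and `project` uses 1-based coordinate indices as in (III). The function names are hypothetical, and a non-degenerate projection of every simplex is only a necessary condition for the 1-1 projection demanded in (III).

```python
from fractions import Fraction


def rank(rows):
    """Rank of a rational matrix by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in rows]
    ncols = len(rows[0]) if rows else 0
    rk = 0
    for col in range(ncols):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk


def in_general_position(points):
    """True if the m + 1 given points span a rectilinear m-simplex."""
    base, *rest = points
    diffs = [[Fraction(a) - Fraction(b) for a, b in zip(p, base)] for p in rest]
    return rank(diffs) == len(rest)


def project(point, coords):
    """Keep only the coordinates x_i with i in coords (1-based indices)."""
    return tuple(point[i - 1] for i in coords)
```

For instance, the points (0,0,0), (1,0,0), (0,1,0) are in general position, but their projections onto the coordinates (1, 3) are not, so that coordinate pair could not serve as the n-tuple $i_k$ for a simplex star containing this triangle.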


Further, let $P$ be the $q \times q$-matrix $\beta(L)$, let $\Phi$ be the $1 \times q$-matrix $Pu$, and let $\Phi_1, \ldots, \Phi_q$ be the components of $\Phi$. Finally, let $i'_{1k}, \ldots, i'_{q-n,k}$ be the integers in $\{1, \ldots, q\} - i_k$. Then, for each $k = 1, \ldots, s$, we require the further properties: (IVc) The absolute value of the Jacobian determinant

(IVd) The system of 2q-n equations15

1

Qii,,k =...=

@i,q-nL

=0

x" - x = y1 grad @ i , , k + .--+ q4-, grad C P ~ , ~ - , , ~ , together with the inequality Ix* - XI <@,where X,X* are 1 x q-matrices with components xl, ..., xq,and x : , ..., , : x respectively, defines a 1-1 map, say (Gk)

of Int St (pkld ('Dl)) into En&) if the coordinate matrices of all points of Int St{p,ld('Dl)} are taken as range of x*. In particular if { p , , p , , , ..., p , , ) is an n-simplex of 0 and i f x * = i l p k i + . . . + i n P k , + ( I - i l - . . . - i , ) p k , then the Jacobian determinants of the system (Gk)(in the 2q variables xi,. . ., xq, ~ i , . . . , ~ l ~ - n , 5 i , . . . , i n ) with respect to xi .lr,...,x i . 4 - n L ' ~ 1 , . . . , r q - n , i lin, ,..., and with respect to x1,..., xq,yl ,..., y q - , , respectively, both have absolute values greater than D if Ix*-ppkl and Ix*-xI < i d . Further, the map g k O g f - lg,(Int St {pkld ('Dl)}n Int St { p ~ (m)}) d has non-vanishing Jacobian 8 ('Dl)}. determinant for all I such that ~ ~ St €{pkld Let E, be the n-dimensional subspace of E,(fm) with coordinates xl, ...,x,, and let K k : E , ( i k ) 4 E n be the linear homeomorphism that maps the xitk-axisonto the x,-axis ( I = 1, ..., n). Then the atlas '%(%Jl) presented by m is the collection of maps hk=KkOgk(k= 1, ..., s). The differentiable and combinatorial n-manifold M('Dl) presented by % is the Jpolyhedron l Id (9JI)l together with the coinbinatorial structure d ('Dl) and the differentiable structure represented by the atlas '%(%TI). Now we have the following theorem: gk

THEOREM 4. For every closed, differentiable $n$-manifold $M$ there exists a (finite) algebraic atlas presentation $\mathfrak{M}$ such that the manifold $M(\mathfrak{M})$ presented by $\mathfrak{M}$ is diffeomorphic to $M$. Moreover, the concept of algebraic atlas presentation fulfills the requirements (a), (b) and (c) stated at the beginning of this section, and corresponding to (d), the following: (d') if an algebraic atlas presentation $\mathfrak{M}$ is given, then the $C^{\infty}$-atlas $\mathfrak{A}(\mathfrak{M})$ presented by $\mathfrak{M}$ can be recursively computed. By this we mean

Here "grad" means the coefficient matrix of the gradient vector.

W. W. BOONE, W. HAKEN and V. POÉNARU

(using our above notation) that there is an algorithm to solve the following problem: if a positive rational number $\tau$ together with the components of a rational point $p$ are given, to determine the components of a rational point $h_k^{\tau}(p)$ in $E_n$ such that the distance $|h_k^{\tau}(p) - h_k(p)|$ is smaller than $\tau$.

Proof. The first paragraph of Theorem 4 follows immediately by an analysis of the work of Nash [33]. (We have chosen the notation according to that used in the proof of Theorem 1 in Nash [33].) By a result of Whitney [52], there is a manifold $\mathfrak{B}$, diffeomorphic to our given manifold $M$, and analytically imbedded in a euclidean $q$-space $E_q$ (for $q$ sufficiently large). We take the coordinate variables of $E_q$ to be the letters $x_1, \dots, x_q$ in the presentation $\mathfrak{M}$ to be constructed. By Nash [33] there is a neighborhood $\mathcal{N}$ of $\mathfrak{B}$ in $E_q$ in which each point has a unique nearest point on $\mathfrak{B}$; for given positive rational $\varepsilon$ there are matrices $L$ and $u$ with polynomial components such that $L$ fulfills condition (IVa), provided that $\bigcup_{i=1}^{s} B_i \subset \mathcal{N}$, and such that the equations $\Phi = 0$, where $\Phi = Pu$, $P = \beta(L)$, determine a sheet $\mathfrak{B}' \subset \mathcal{N}$ of an algebraic variety that approximates $\mathfrak{B}$. This means that there is a small, $C^{\infty}$, isotopic deformation of $E_q$ that moves $\mathfrak{B}$ into $\mathfrak{B}'$; from now on, we regard $\mathfrak{B}'$ as endowed with the $C^{\infty}$-structure carried over from $\mathfrak{B}$ by this deformation. It is obvious from the proof of Lemma 2 in Nash [33] that the coefficients of the polynomials in $L$ and $u$ can be chosen to be rational numbers. We take these matrices and the number $\varepsilon$ to be the required $L$, $u$ and $\varepsilon$ in $\mathfrak{M}$.

For each point $p \in \mathcal{N}$, $\Phi$ approximates the (coefficient matrix of the) distance vector from $p$ to the nearest point, say $\bar{p}$, on $\mathfrak{B}$. Therefore, if $Q$ denotes the $(q-n)$-plane normal to $\mathfrak{B}$ at $\bar{p}$, the vector $\operatorname{grad} \Phi_i$ ($i = 1, \dots, q$) lies approximately in the direction of the projection of the $x_i$-axis into $Q$; and $|\operatorname{grad} \Phi_i|$ approximately equals the cosine of the angle between the $x_i$-axis and its projection into $Q$. Therefore, if $i'_1 < \dots < i'_{q-n} \le q$ are positive integers such that $Q$ projects 1-1 into the $(q-n)$-space $E'_{q-n}$ with coordinates $x_{i'_1}, \dots, x_{i'_{q-n}}$, then the vectors $\operatorname{grad} \Phi_{i'_1}, \dots, \operatorname{grad} \Phi_{i'_{q-n}}$ are linearly independent, and so are their projections into $E'_{q-n}$, i.e.,

Now there are positive rational numbers $\delta$ and $D$ such that (i) for each point on $\mathfrak{B}$ its neighborhood of radius $2\delta$ lies in $\mathcal{N}$, and (ii) for each $q$-ball $B$ of radius $\delta$ in $\mathcal{N}$ there is a $(q-n)$-tuple of integers $i'_1, \dots, i'_{q-n}$ such that

. take 6 and D for the required numbers in 53n. for every point ~ E BWe By a result of Cairns [ l l , 131, there is a triangulation dofB(compatib1e with the differentiable structure of g). This triangulation may be chosen so fine that none of its 1-simplices is longer than 46. To each vertex j j k of d we choose a nearby point p k E JV with rational coordinates. We take the coordinate matrices of these points to be the matrices p l , ..., p s in %R. For 0 we take the abstract complex that is isomorphic to d i n such a way that to each simplex of d with vertices, say PI,pZ,..., p",, there corresponds an abstract simplex { p l , p z , ..., p r n } € B and vice versa. The rectilinear simplicia1complex isomorphic to d wi th verticesp,, .. ., p , is then denoted by d(%R);the &neighborhood o f p k in E , is denoted by B,. For i, in we take the set {I,. . ., q } { i l k ,...,ii-,,,) where i;k, ..., i i P n kare the integers i;, ..., ik-,, corresponding to B , in the sense of the preceding paragraph. Now all required elements o f %R are described and the conditions (I), (II), (IVa), (IVb) and (IVc) are fulfilled. I f 6 has been chosen small enough then the simplex star St {pkld will lie so close to a neighborhood of j k in the n-dimensional tangent plane Z to g at j j , that it will project (as Z does) 1-1 into the n-space En(ik).Thus condition (111) is fulfilled. Moreover, ifp is a point on .%?, say with coordinate matrix x, and if Q is the normal plane to B at p , then a @-neighborhood of p in Q will intersect Id (%R)( in precisely one point, say p * , with coordinate matrix x*. Thus the equations (G,) and the inequality ]x*-x[ <*6 in condition (IVd) are fulfilled for precisely one choice of ql, ..., I?,-,,, since the vectors grad @ i , l k , . .., grad @ i , q - m k, taken at x, span the (q - n)-plane Q. We remark that in B , from @ i , l k = ... 
= @ i , 4 - n k = O it follows that @=O, since by (IVc) these q - n equations are independent, and since in @=Pu the matrix P is of rank q-n. That means that the map g,:Int St{p,[d(%R>}+E,,(i,), defined by (G,), is composed of the homeomorphism fild(%R)l+g that maps each point o f ld(%R)l to the nearest point on 9iJ, restricted to Int St {pkld(%R)}, followed by the projection i k into En(ik). As ProvedinNash [33], there is an analytic homeomorphism from a neighborhood Jlr of 9onto a neighborhood Jtr* of B,mapping 9onto B so that each point of 9 has its nearest point on 9 as pre-image (moreover, each This homeomorphism carries point of "4'- is moved in a normal plane of 9). the differentiable structure from 9 to g.Thus B is diffeomorphic to M and

('m)}


analytically imbedded in Eq. Hence, if hs: W-E, is a coordinate system on 98 then, for each k = 1, ...,s, the homeomorphism K k iklf(ht St {pk(d(m)))

X.

is C"-related to Hence, the collection of maps K k O i k ( k = 1, ..., s) is a Cm-atlas % of g.Since f - l carries % into %(%R), %(%R) is a Cm-atlas on IA (9JI)l where A ($ isIcompatible ) with %(%Jl), and M('33) is diffeomorphic to M . Now (if D has been chosen small enough) condition (IVd) is also fulfilled. This finishes the proof of the first paragraph of Theorem 4. It is obvious that the concept of algebraic atlas presentation fulfills the requirements (a) and (c). For verifying (b) and (d') we require Lemma 2, section 1.4. All the conditions (I) to (IV) in our definition may be considered as elementary expressions in the sense of section 1.4: For we consider only systems of equations and inequalities between polynomials with rational coefficients. As we have given them, the polynomials in the matrix @ may have irrational coefficients, obtained by rational operations from the coefficients of P(A). But this can be taken care of by considering the coefficients of /?(A) (except the highest coefficient which is 1) as variables which have to fulfill the equation p ( A ) y(1)=cc(A) where the coefficients of the factor y ( A ) are also considered as variables. In more detail, we write

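$$\beta(\lambda) = \lambda^{n} + \beta_{n-1}\lambda^{n-1} + \dots + \beta_1\lambda + \beta_0,$$

$$\gamma(\lambda) = \lambda^{q-n} + \gamma_{q-n-1}\lambda^{q-n-1} + \dots + \gamma_1\lambda + \gamma_0,$$

$$\alpha(\lambda) = \lambda^{q} + \alpha_{q-1}\lambda^{q-1} + \dots + \alpha_1\lambda + \alpha_0,$$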

where the $\alpha_i$ are rational numbers obtained from $L$, and the $\beta_i$ and $\gamma_k$ are considered as variables. We require that the equation $\beta(\lambda)\gamma(\lambda) = \alpha(\lambda)$ be fulfilled for all values of $\lambda$, i.e., that for all $m = 0, \dots, q-1$ the coefficient of $\lambda^{m}$ in $\beta(\lambda)\gamma(\lambda)$ be equal to $\alpha_m$. This yields $q$ equations with rational coefficients in the $q$ variables $\beta_i$, $\gamma_k$. We know (see the paragraph preceding the statement of condition (IVc)) that these equations have real solutions in which $|\beta_j| < \varepsilon$ (for all $j = 0, \dots, n-1$). Thus we may enlarge the systems of equations in condition (IV) by the $q$ equations described above and the $n$ inequalities $|\beta_j| < \varepsilon$, where we consider the $\beta_i$ and $\gamma_k$ as variables. Thus all polynomial equations to be considered have rational coefficients. (The partial derivatives to be considered can be obtained by rational operations from polynomials.) It is now routine to check that all our conditions are elementary expressions in the sense of section 1.4. Thus Lemma 2 verifies (b). To compute the atlas $\mathfrak{A}(\mathfrak{M})$ with prescribed accuracy we have to solve systems of equations as considered above, with a given tolerance $\tau$. If (the


components of) a rational point $p \in \operatorname{Int} \operatorname{St}\{p_k \mid \Delta(\mathfrak{M})\}$ are given, we know that the image point $h_k(p)$ will lie in a certain neighborhood $N$ of $\kappa_k i_k(p)$ in $E_n$. We may subdivide a neighborhood of $N$ into $n$-dimensional cubes with edges (parallel to the coordinate axes) of lengths $\tau$. Then, by Lemma 2, we can determine the cube containing $h_k(p)$ and we may take its midpoint for an approximate solution $h_k^{\tau}(p)$. This verifies (d') and completes the proof of Theorem 4.
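The coefficient comparison used in the proof above to verify (b) can be illustrated concretely. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the $\alpha_m$ are taken to be fixed rational numbers, and exact rational arithmetic is used. It forms the product of the two monic factors and returns, for each power $\lambda^{m}$ with $m = 0, \dots, q-1$, the difference between the coefficient of the product and $\alpha_m$; the $q$ equations of the proof say that all these residuals vanish.

```python
from fractions import Fraction


def poly_mul(a, b):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def factorization_residuals(beta_low, gamma_low, alpha_low):
    """Residuals of the q coefficient equations beta * gamma = alpha.

    beta_low  : [beta_0, ..., beta_{n-1}]      (leading 1 is implicit)
    gamma_low : [gamma_0, ..., gamma_{q-n-1}]  (leading 1 is implicit)
    alpha_low : [alpha_0, ..., alpha_{q-1}]    (leading 1 is implicit)
    """
    beta = [Fraction(c) for c in beta_low] + [Fraction(1)]
    gamma = [Fraction(c) for c in gamma_low] + [Fraction(1)]
    product = poly_mul(beta, gamma)
    q = len(alpha_low)
    # The product of two monic polynomials is monic of degree q.
    assert len(product) == q + 1 and product[q] == 1
    # One equation per power lambda**m, m = 0, ..., q - 1.
    return [product[m] - Fraction(alpha_low[m]) for m in range(q)]
```

With $n = 1$, $q = 3$, $\beta(\lambda) = \lambda - \tfrac{1}{10}$ and $\gamma(\lambda) = \lambda^2 - 2\lambda + 1$, the product is $\lambda^3 - \tfrac{21}{10}\lambda^2 + \tfrac{6}{5}\lambda - \tfrac{1}{10}$, so the three residuals are zero for that $\alpha$; the number of unknowns, $n + (q - n) = q$, matches the number of equations.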

3.3. Completing the proof of Theorems 1, 2, 3

This section is devoted to the proof of Theorem 3. Theorems 1 and 2 follow at once from Lemmas 3 and 4, and Corollaries A, B of Theorem 3. To complete the proof of Theorem 3 we shall prove the following:

LEMMA 6. Let $n$ be a fixed positive integer, $n \ge 4$. There is a recursive function $F_n$, which associates to every finite group presentation $\mu$ a (finite) algebraic atlas presentation of a closed differentiable $n$-manifold (with a compatible combinatorial structure), $F_n(\mu)$, such that

1°) The multivalued function $F_n^{-1}$ is recursive in that there exists an algorithm to determine for an arbitrary algebraic atlas presentation $\mathfrak{M}$ (a) whether or not $\mathfrak{M}$ belongs to the range of $F_n$ and (b) if the answer to (a) is "yes", to determine all congruence classes in $F_n^{-1}(\mathfrak{M})$.

2°) $\pi_1(F_n(\mu))$ is isomorphic to the group presented by $\mu$.

3°) We have the following equalities for the second Betti numbers:

$$\beta_2(F_n(\mu * 1)) = \beta_2(F_n(\mu)) + 2 \quad \text{if } n = 4,$$

$$\beta_2(F_n(\mu * 1)) = \beta_2(F_n(\mu)) + 1 \quad \text{if } n > 4.$$

4") r f the presentation p' can be derived from the presentation p by an ~ elementary operation Op: ', Op,, Op,, Op,, Op:', then F , , ( ' L ) iFn(p')for i = 1, 2, 3, 4. We first show that from Lemmas 5 and 6 Theorem 3 follows. Let Q and be as defined in Theorem 3. For the function F,, in Theorem 3 we take the function F,, from Lemma 6, with domain restricted to the class Q. Its range is then the class K(n, Q ) of Theorem 3 (which is defined in this way). K(n, Q ) is recursive and (3.A) holds, because of 1". (3.B) follows from 2". To show (3.C) and (3.D) we proceede as follows. Suppose that p , ~ ' E Q are isomorphic. Let ji, F'EQ and t,t' be such that p=fi*t, p' = p' * t' ,

t 3 4p(ji)

+ 4r(ji),

t' 3 4p(F')

+ 4r(P').


Suppose that t a t ’ , which implies tap@)+r@’). Now let t”=t-p@’)+ +r@’)+p(p)-r@). Then t ” > p @ ’ ) + r ( i i ) . By Lemma 5, p = f i * t can be transformed by a finite sequence of operations Opf ..., Op:’ into the presentation f i ’ * t n . By 4” (*>

F, (p) M iFn(p’ * t”).

Hence it follows from 3” that and hence,

(t) But

and hence

t - t“ = p ( p ’ ) - r(p‘) - p ( f i ) = P(P’) - t ’ -

+ r(fi)

W )- P ( P ) + t +

t’ - t” = p ( p ’ ) - r(p’) - p ( p )

This completes the proof of (3.C). If /?2(F,,(p’))=P,(Fn&)), then t “ = t ’ b y Fn ( P )

+ r(p).

(t),hence f i ’ * f ‘ ’ = p ’ so that

iFn ( P ’ )

by (*). This completes the proof of (3.D).

Proof of Lemma 6. First we shall define a special class of algebraic atlas presentations which correspond in a certain way to group presentations; moreover, there is an algorithm to decide, for a given algebraic atlas presentation $\mathfrak{M}$, (i) whether $\mathfrak{M}$ belongs to the special class, and (ii) if it does, to find all group presentations (up to congruence) to which $\mathfrak{M}$ corresponds. These special algebraic atlas presentations will be designed to meet the conditions 2°, 3°, 4° of Lemma 6. Then we shall define the function $F_n(\mu)$ having special presentations as values. Our program is to define an algebraic atlas presentation $\mathfrak{M}$ as "special" if the corresponding $n$-dimensional complex $\Delta(\mathfrak{M})$ is "obviously" (i.e., in a certain algorithmically recognizable way) the boundary of a star-neighborhood of a 2-dimensional complex $\Delta_2$ in $E_{n+1} = E_q(\mathfrak{M})$, where $\Delta_2$ "corresponds" to


a group presentation p (i.e., O(p) is isomorphic to the fundamental group zl(A,) of A,, and JA,J can be decomposed into one point p,, r ( p ) open arcs Y’with all boundary points identical to p o , in 1-1 correspondence with the generators of p , a ndp(p) open disks A j with boundaries in UiLpi Y ‘ u p , in accordance with the relators of p ) . 6.1. DEFINITION. An at most 2-dimensional rectilinear simplicia1 complex A , in E n + l , n>4, with rational vertices is said to correspond to a group ..., M,}) if there exist semi-linear maps presentation p = ( { y l , ..., y,}, ‘ p i : Y’i+]A21 (i= 1, ..., r ) and $ j : A‘j-+lA21 ( j = 1, ..., p) such that: (a) The Y’l’s are oriented arcs (1-balls), and the A‘j’s are disks (2-balls) with oriented boundaries. (b) The restrictions q’lInt Y” and $jlInt A’j ( i = l , . . . , r ; j = l , ...,p) are homeomorphisms with pairwise disjoint images, say Y‘ and Aj. (c) There is a vertex p O e A , such that p,=q~’(dY’~)for all i = l , ..., r, Y’u A j = Id,/, and the closures of Y’ and A j are polyhedra p, u of subcomplexes of A,. (d) In each dA” there is a “base point” p ’ j such that: if eJ is the word y::yji;. . .yz;(gl = l), and if a point p’ runs through dA‘j, starting and finishing at p‘j, in the direction of the given orientation of aA”, then the image point $ j ( $ ) runs through the closed path p o Yk’po YkZp,...po Ykmp,so that it runs through Ykr in the sense of, or in the opposite sense of the orientation of Ykl(as carried over by qk’from Y’k’)according as g,= + 1 or - 1 .

u;= us=

6.2. LEMMA. There is an algorithm to decide for an arbitrary given$^{16}$ complex $\Delta$ (i) whether or not it corresponds to any group presentation in the sense of 6.1, and (ii) if $\Delta$ does correspond to some group presentation, to determine all congruence classes of group presentations to which it corresponds.

This is nearly trivial since one can determine whether $\Delta$ is 2-dimensional, and then examine all sets of subcomplexes of $\Delta$ so as to determine whether they can be regarded as sets $\{p_0, \text{closure of } Y^{i}, \text{closure of } \Delta^{j} \mid i = 1, \dots, r;\ j = 1, \dots, \rho\}$ with the demanded properties. Here we need the fact that the semilinear homeomorphism problem with the arc or the disk has a simple recursive solution. Whenever such a set of subcomplexes is found, the corresponding group presentations $\mu$ can be read from it.
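The last step of 6.2, reading a relator off the boundary of a disk as prescribed by condition (d) of 6.1, can be sketched as follows. This is an illustrative sketch only: the data layout (a list of arc indices with a direction flag), the function names and the generator symbol `y` are assumptions made here, not notation fixed by the text.

```python
def relator_word(boundary_path):
    """Build a relator from the arcs met along the boundary of one disk.

    boundary_path: list of (k, forward) pairs, where k >= 1 is the index of
    the arc traversed and forward is True when the traversal agrees with the
    orientation of that arc (exponent +1) and False otherwise (exponent -1).
    Returns a list of (generator_index, exponent) pairs.
    """
    return [(k, 1 if forward else -1) for k, forward in boundary_path]


def format_word(word, symbol="y"):
    """Render a word such as [(1, 1), (2, -1)] as 'y1 y2^-1'."""
    return " ".join(
        f"{symbol}{k}" if e == 1 else f"{symbol}{k}^-1" for k, e in word
    )
```

A boundary that runs along the first arc, then the second, then back along the first and back along the second yields the commutator word; changing the base point on the boundary only permutes the word cyclically.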

Remarks. (1) If A corresponds to p then A also corresponds to all group presentations p‘ which can be derived from p by operations Op,, Op,, and Here “given” means that the rational coordinates of the vertices are explicitely given, together with a corresponding abstract complex.

l6


by replacing a particular generator y i everywhere in the relators by y [ I . This holds since condition (d) of 6.1 can be fulfilled with respect to p' by changing base points p'j and/or orientations of dA'j's, and/or Y'"s. ( 2 ) If A corresponds to ci then the fundamental group n,(A) of A is isomorphic to 00).This fo0;:ows immediately from the standard procedure for determining presentations of the fundamental group of a given complex; see for instance Seifert and Threlfall [45], 5 46. If p is a group presentation and E n + 1is a euclidean (n+ 1)6.3. LEMMA. space with n 3 4 then there is a rectilinear simplicia1 complex A , in E n t lthat corresponds to p in the sense of 6.1. This is a special case of a general imbedding theorem for simplicial complexes, see for instance Seifert and Threlfall [45], 8 11. We remark that A 2 could even be constructed in E4 (but in general, not in E 3 ) although we shall not use this fact.

6.4. DEFINITIONS. Let A , be an at most 2-dimensional, rectilinear complex in En+,, 7234, with rational vertices; let Pi,Kj,Th (i= 1, ..., u o ; j = 1, ..., u,; k = 1, . .., u2) be the vertices, edges, triangles, respectively, of A , . Then, by the spherical handle neighborhood of A , in with radii po, p l , p, ( p 2 < p 1 < p o ) we mean the union N of (n+ 1)-balls Ph+,, K i , , , T,k+ with the following properties (for all i,.j, k ) : (P) Pb+, is the (n+ ])-ball in En+, with radius y o and center Pi. (K) Ki+ is the (n + 1)-cylinder in En+ Tnt PL+ with radius p1 and axis K j (i.e., the set of all points in E n + ,Int Pk+, whose distance:to Kj is < p l ) ; p1 is to be so small that the Ki+,'s are pairwise disjoint. (T) Ti+ is the (n+ 1)-cylinder in E n + -1I n t ( U z , P:+,u K i + , ) with radius p, and axis T k ;p, is to be so small that the T,k+,'s are pairwise disjoint. Further, by a normed, rectilinear handle neighborhood of A , in En+,we mean a rectilinear (n + 1)-dimensional complex N* with rational vertices that contains sub-complexes

,

,

uz

Uz ,

,

urLl

and with the following properties (for all i, j , k ) : (6*) There are positive rational numbers po, pl, p,, and 6*, 26*

definition) each point of dP,*: naN*, aK;i, naN*, aT,*+klnaN* lies closer than 6" to a point on aPk+ naN, aKi+ naN, aT:+ ndN, respectively. (P*) P:il is a simplex star of P' in E n+ , . (K*) K,*:', is a simplex star neighborhood of UIntlP:!,I)

Ri=Kin(E,+,(T*)

i=l

in E n + , - UIiitlP:jll. i= 1

T',!, is a simplex star neighborhood of

u Ip:!l" uo

En+,

- Int

i=l

u lfcll* UI

j= 1

If 0 is an n-simplex of dP,*:, n a N * and E is the line in E n + ,that is normal to 0 and contains the midpoint of 6,then E contains P i ; similarly, if 0 is on aK;i1 n d N * , then E intersects Rj in a point; also, if 0 is on aTn!l n d N * , then E intersects T kin a point. (0)

6.5. LEMMA.There is an algorithm to decide for an arbitrary given rectilinear n-dimensional complex A , in En+, with rational verticesI6, n 3 4 , (i) whether or not it be the boundary complex of a normed rectilinear handle neighborhood of some (at most 2-dimensional, rectilinear) complex A , in En+,,and (ii) i f the answer to (i) is "yes", to determine all complexes A , in En+,in this relation to A,. Proof. We examine all sets of n-dimensional subcomplexes of A , so as to determine whether they can be regarded as sets: {ap,,?;, naN*, aK,,?jl naiv*,aT;tl naarv*l i = 1 , ..., u,;j

=

I ,..., u,; k = 1, ,.., u E }

(notation as in 6.4). This can be done recursively since the intersection properties of the normal lines through the centers of the n-simplexes of A , can be checked. Whenever such a set of subcomplexes is found, the corresponding 2-dimensional complex A , can be determined. (The vertices of A , are the intersections of the normal lines through the centers of the n-simplexes of the aP,*: n d N * , i= 1, ..., u,; etc.) This completes the argument.

6.6. LEMMA. Let $\Delta_2$ be an at most 2-dimensional, rectilinear complex in $E_{n+1}$ with rational vertices, $n \ge 4$. Then there exists an algebraic atlas presentation $\mathfrak{M}$ of a differentiable $n$-manifold such that$^{17}$ $E_q(\mathfrak{M}) = E_{n+1}$

17 i.e., $E_q(\mathfrak{M})$ has dimension $n + 1$.

and such that

A(m)is the boundary complex of a normed, rectilinear handle neighborhood N* of A , in E n + 1 . Proof. We begin with a spherical handle neighborhood N of A , in E n + 1 . Then, for any sufficiently small positive rational 6*, there is a normed, rectilinear handle neighborhood N* of A , in E n + , that approximates N as described in 6.4. On the other hand there exists a differentiable n-manifold V , differentiably imbedded in En+1, that approximates aN, i.e., such that for some 6’ > O and for each point p of %? the 6#-neighborhood (2 of p in the normal line (to through p ) intersects dN in precisely one point, say q(p), so that the map p - + q ( p ) is a homeomorphism of %? onto dN. This manifold V can be derived from aN by “smoothing out the corners” at which the boundaries of the cylindrical (or spherical) regions dPL+ ndN, aKi+ n dN, dT;+ ndN fit together; see for instance Cairns [12]. By Whitney [52] there exists an analytic n-manifold B in En+ that approximates %?. Now the construction of the algebraic atlas presentation 1)32 follows the proof of the first sentence of Theorem 4 with q = n + 1 where we take the complex dN* to be the complex A (2R). This is permissible, provided that N* has been chosen fine enough and that all approximations have been chosen close enough. This insures that the @neighborhoods of the points of ii? in the normal lines Q to will be close to the normal-intervals Q to %?, and consequently will provide 1-1 correspondences between the points of L% and the points of a N and also the points of aN*. Thus aN* will lie in the neighborhood Jlr* of %? as considered in the proof of Theorem 4. This completes the proof.

,

6.7. Constructing F n ( p ) . Using 6.2, 6.3, 6.5, 6.6, we shall show that there exists a recursive function that associates with each group presentation p an algebraic atlas presentation llJl of a closed differentiable n-manifold and such that: (i) Eq(1)32)=En+,,(ii) A(m)is the boundary complex of a normed, rectilinear handle neighborhood, say N * , of a rectilinear complex, say A , , in and (iii) A , corresponds to p (as defined in 6.1). It is possible, by reworking the arguments of these earlier sections, to construct such a function which is “natural”, i.e., such that it is relatively easy to compute the value of the function for given p. However, the description of such a function would be rather long and tedious. So, as is sufficient for our purposes, we construct an Fn which is less convenient to compute, but very easy to define, as follows: For given n, one can recursively enumerate those algebraic atlas presenta-

UNSOLVABLE PROBLEMS IN TOPOLOGY

69

tions of closed, differentiable n-manifolds having xl, x,, ..., x,+ as coordinate variables (q=n+ 1). Let Q be such a recursive enumeration. We define F,@) to be the atlas presentation Q ( m ) with smallest m such that (i), (ii) and (iii) as stated above are fulfilled. Clearly, for given p , F,@) exists in view of 6.3 and 6.6. And F, is recursive since, by 6.2 and 6.5, we can, step by step, check Q(1), Q(2), ... until the desired Q ( m ) is found7. Of course it still remains to show that this F,, satisfies the conditions 1" through 4" of Lemma 6. 6.8. Proof of 1". The algorithm demanded follows immediately from 6.5 and 6.2. 6.9. Proof of 2". By 6.7, condition (ii), d (F,,(p)) is the boundary complex of a rectilinear handle neighborhood N* of a 2-complex d, corresponding Since . n 3 4 , we have z l ( A ( F n @ ) ) ) ~ (A,), n l as may t o p ; hence n , (A,)? 00-1) be seen from the following argument l8: Every closed curve C on A , can be homotopically deformed within N * into a curve in dN*=A(F,(p)) (a small deformation takes C into a curve in N*, disjoint from A,; then this curve may be moved into dN*), and vice versa. Similarly, if C bounds a singular disk (i.e., a continuous image of a disk), say D, in A , then the deformed curve in dN* bounds a singular disk (obtained from D by deformation) in dN*, and vice versa. From this we may conclude 2". 6.10. LEMMA.The normed, rectilinear handle neighborhood N* of A,, where A , corresponds to p and A(F,,@))=dN* may be obtained from an (n+ 1)-ball by (combinatorially) adding r ( p ) handles18 of degree 1 and then adding p @ ) handles of degree 2. Moreover, the differentiable structure of M(F,(p)) can be extended to a differentiable structure of N*. 
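The definition of $F_n$ in 6.7 is an unbounded search over a recursive enumeration, and its shape can be sketched in a few lines. This is a sketch under stated assumptions, not the construction itself: `enumeration` stands in for the recursive enumeration of atlas presentations and `accepts` for the decidable test of conditions (i), (ii) and (iii); both names, and the toy stand-ins below, are hypothetical.

```python
from itertools import count


def least_index(enumeration, accepts):
    """Return (m, enumeration(m)) for the smallest m >= 1 that is accepted.

    The loop terminates only if some value is accepted; in 6.7 this is
    guaranteed by 6.3 and 6.6, and the test is decidable by 6.2 and 6.5.
    """
    for m in count(1):
        candidate = enumeration(m)
        if accepts(candidate):
            return m, candidate


# Toy stand-ins (hypothetical): enumerate the squares and accept the first
# one exceeding 50, in place of presentations and conditions (i)-(iii).
def toy_enumeration(m):
    return m * m


def toy_accepts(value):
    return value > 50
```

The search is total exactly because existence is known in advance; without 6.3 and 6.6 the same loop would define only a partial recursive function.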
Proof In detail (now we use the notation of 6.1 and 6.4), N * is obtained as follows: We begin with the (n+ 1)-ball P,*,", containing the vertex p o of A , ; then we add handles of degree 1, say Yi+ (i= 1, ..., r ) where Yi+ is the union of all those P,*+l's,and K;+,'S, different from P,*,",, and containing points of Y'. This handle-adding yields a "handlebody" V,+ =P,*," u U;= Y:+1 . Next, we add to V,+, handles of degree 2, say A:+ where A:+ is the union of all those P,*+l's, K;+ l's, and T,*+,'s that are not contained in V,+, but contain points of A j . This shows the first sentence of the lemma. The differentiable structure of the manifold B, as considered in the proof

,

,

18 For simplicity of notation, from now on we do not always distinguish among a complex $\Delta$, its point set $|\Delta|$, and $|\Delta|$ with combinatorial structure $\Delta$. Similarly for $N$, $\partial N$, etc. What is meant will always be clear from the context.


of 6.6, which is diffeomorphic to M(F,(p)) and differentiably imbedded in E n + , ,is represented by an atlas obtained by projecting certain simplex stars on .G? into n-dimensional coordinate planes of Let fl be the (n+ 1)manifold bounded by 35' in E n i l and which consequently approximates N*. The projections of the simplex stars on 35' may be extended to differentiable homeomorphisms of (n +- 1)-balls on &' into coordinate-halfspaces of En+,. This yields an atlas of a neighborhood of 93 in fl.This atlas can be completed to an atlas of flby using the identity maps on (n + 1)-balls in the interior of fl. There is a homeomorphism cp of # onto N* which is the identity outside of a neighborhood of B, such that ql35'carries the differentiable structure of 35' into that of M(F,@)). Consequently, cp carries the differentiable structure of fl into a differentiable structure of N* which extends that of M(F,(p)). This proves 6.10. 6.11. LEMMA.If A(F,(p))=aN* and A(F,(p'))=dN*' where N* and N*' are normed, rectilinear handle neighborhoods of 2-complexes A; and A,, respectively, and if N* and N*' are combinatorially equivalent then M(Fn(P))XiM(Fn(P'))

( i = 1,23334)-

Proof. This is trivial for i = 2 , 3 and 4. For i= 1 (diffeomorphism) it follows immediately from Munkres [31], his 6.5 Theorem, which asserts in particular that combinatorial equivalence implies diffeomorphy if the homology groups H,,, are trivial for m>3. But N* and N*' have trivial H, for r n > 3 since they are neighborhoods of 2-dimensional complexes (and thus homotopy equivalent to 2-dimensional complexes). 6.12. Proof of 3". Let A(F,(p))=aN* as above, then A(F,(,u*l)) is the boundary of a neighborhood, say N*', of a 2-complex, say A;, that is homeomorphic to the union of A, and an open disk Api' with boundaryp, and AP+' n 1A21 =8 (see 6.1). We choose this open disk Ap+' in En+' in such a way that it intersects En+,-IntN* in a disk, say A . Let A , + 1 be a (polyhedral) neighborhood of A in -IntN*. Then N* u A , + , is combinatorially equivalent to N*'. In fact, N* u A , + is obtained from N * by adding the handle A n f l of degree 2 to N*. Thus d ( N * u A , + , ) = ( d N * - I n t ( ~ N * n ~ A , + , ) ) u a ( A , + ,- a N * n & 4 , + , ) . (The complex d ( N * u A , + , ) is obtained by Morse surgery from aN*.) In the above equation aN*ndA,+, is homeomorphic to S 1 x D n - , and Provided that isomorphic semilinear subdivisions of the combinatorial structures are compatible with the differentiable structures.

l9


aA,+,-Int(dN*nc?A,,+,) to D , x Sn-,; moreover, aN*naA,+, lies in an n-ball in aN*. Now p2 (alv* - Int (alv* n aA,+ 1)) = (&(aN*) 1 if n = 4 (&(aN*) if n > 4 .

+

Further, adding aA,+,-aN*naA,+, increases j?, by 1 for both n = 4 and n > 4. With some effort, all this can be verified by the methods for computing homology groups as described in Seifert and Threlfall [45], Q 22. This yields 3”. 6.13. Proof of 4“. Let A(F,(p))=aN* and A(F,@’))=aN*’ where N* and N*’ are handle neighborhoods of A , , corresponding to p, and A ; , corresponding to p’, respectively. In view of 6.1 1 it is sufficient to show that N * and N*’ are combinatorially equivalent. Case 1) p’ is obtained from p by Op, or Op,. Then A , and A ; are combinatorially equivalent (see 6.2, remark (l)), and thus so are their neighborhoods N* and N*’. Case 2) p 1 is obtained from ,u by Op:‘ or Op,, say by replacing a i by a:. Let p” be the group presentation obtained from p by deleting ai.Let A‘; correspond to p” and let N*” be a handle neighborhood of A ; in Then a complex, homeomorphic to A , (to A ; ) is obtained by adding to A’; an open disk A’ (an open disk Ali) that corresponds to ai (to a:). We choose A’ and A” in En+, so that A’n(E,+,-IntN*”) and A’in(E,+l-IntN*”) are disks, say A and A’ with aAnaA’=0. Let A,+1 and A;+, be (polyhedral) neighborhoods of A and A‘, respectively, in -IntN*“. Then N * ” u A,+1 and N*” u A;+ are combinatorially equivalent to N* and N*’, respectively. We remark that adding A,+, or to N*” means adding a handle of degree 2 to N*“ corresponding to aior a:, respectively. Now we prove that N * ” u A,,, and N * ” u A A + ~are combinatorially equivalent to each other. For this we use the method of “sliding handles” (see Smale [47], Potnaru [36]) which can be described in our case as follows: The curves aAi and aA“ are homotopic in A ; ; hence, dA and aA’ are homotopic in N*“, i.e., there exists a singular 2-dimensional annulus (a continuous image of an annulus) in N*” with boundary curves dA and dA’. Since n 2 4 , this singular annulus can be deformed into a non-singular annulus, say J c dN*“ with aJ= aA u aA’. 
Consequently, dA can be “moved over J” into aA’. Hence, there exists a semilinear homeomorphism of N*“ onto itself that is the identity outside of some neighborhood of J , and that maps the neighborhood aA,+l naN*“ of the curve aA (in dN*”) onto the neighborhood aAA+ naN*” of aA’ (in dN*”). This homeomorphism can be extended to a semilinear homeomorphism of N*” u A , + onto N*” u A: +


are combinatorially equivalent. This Hence N*" u A,+1 and N*" u completes Case 2. Case 3) p' is obtained from p by Op,. Again let A(F,&))=aN*, A(F,(p')) =dN*', etc. Then a complex combinatorially equivalent to Ah is obtained from A, by adding an open arc Y"' (corresponding to the new generator) and an open disk A p f l (corresponding to the new relator), where d A p f ' contains Y r + l precisely once. We choose Y'+l and AP+' in so that Y'+'n(E,,+,-IntN*)isanarc,say Y, and ( A P + l u Y r + l ) n ( E , + , - I n t N * ) i s a disk, say A . Let A,+1 be a (polyhedral) neighborhood of A in - Int N*. Then N * U A , + ~is combinatorially equivalent to N*'. But, on the other hand, A,+1 is an (n+ 1)-ball such that dA,+, naN* is an n-ball (viz., a neighborhood of the arc i3A naN* = aA-Int Y ) . Hence N" u A , + 1 is combinatorially equivalent to N * . This settles Case 3. Case 4) p' is obtained from p by Op; I . Interchange p and p' in Case 3 for this case. This completes the proof of 4". This finishes the proof of Lemma 6 - and hence of all previously stated results. 3.4. An open question: A topological analogue of the Markov-AddisonFeeney-Adjan-Rabin Theorem In Boone [8] in this volume, the notion of a Markov property of semigroups or groups is explained. We should like to raise here the question as to whether the work of Markov [24,25], Addison [l], Feeney [18], Adjan [2] and Rabin [40] can be paralleled in topology. What one would have to do is frame a definition of "Markov property of manifolds" in such a way that "most of" the properties of manifolds which are of actual interest to topologists would be Markov under the definition. Then one would have to show that for a given Markov property, one cannot recursively recognize whether or not a given presentation presents a manifold enjoying the given property. We do not here propose a definition of "Markov property of manifolds". 
Indeed, finding a useful definition - a definition which does not, in a trivial way, refer matters back to group theory - seems difficult.

References

1. ADDISON, J., On some points of the theory of recursive functions, Dissertation, University of Wisconsin, 1954.
2. ADIAN, S. I., The algorithmic unsolvability of checking certain properties of groups, Dokl. Akad. Nauk SSSR 103 (1955) 533-535 (in Russian).
3. BAUMSLAG, G., W. W. BOONE and B. H. NEUMANN, Some unsolvable problems about elements and subgroups of groups, Math. Scand. 7 (1959) 191-201.

UNSOLVABLE PROBLEMS I N TOPOLOGY

73

4. BING,R. H., An alternative proof that 3-manifolds can be triangulated, Ann. Math. 69 (1959) 37-65. 5. BOONE, W. W., The word problem, Ann. Math. 70 (1959) 207-265. 6. BOONE,W. W., Word problems and recursively enumerable degrees of unsolvability. A first paper on Thue systems, Ann. Math. 83 (1966) 520-571. 7. BOONE,W. W., Word problems and recursively enumerable degrees of unsolvability. A sequel on finitely presented groups, Ann. Math. 84 (1966) 49-84. 8. BOONE, W. W., Decision problems about algebraic and logical systems as a whole and recursively enumerable degrees of unsolvability, this volume. 9. BOONE, W. W. and H. ROGERS JR., On a problem of J. H. C. Whitehead and a problem of Alonzo Church, Math. Scand. 19 (1966) 185-192. 10. BRITTON,J. L., The word problem, Ann. Math. 77 (1963) 16-32. 11. CAIRNS,S. S., Triangulation of the manifold of class one, Bull. Am. Math. SOC.41 (1935) 549-552. 12. CAIRNS,S. S., The manifold smoothing problem, Bull. Am. Math. SOC.67 (1961) 237-238. 13. CAIRNS, S. S., A simple triangulation method for smooth manifolds, Bull. Am. Math. SOC.67 (1961) 389-390. S. S., Introductory topology (New York, Ronald, 1962). 14. CAIRNS, 15. CLAPHAM, C. R. J., Finitely presented groups with word problem of arbitrary degrees of insolvability, Proc. London Math. SOC.(3) 14 (1964) 633-676. 16. COHEN,P. J., Decision procedures for real and p-adic fields (Mimeographed. Stanford University, Stanford, California, 1967). 17. DAVIS,M., Computability and unsolvability (New York, McGraw-Hill, 1958). W. J., Certain unsolvable problems in the theory of cancellation semi-groups 18. FEENEY, (Catholic University of America Press, 1954). 19. FRIDMAN, A. A., Degrees of unsolvability of the problem of identity in finitely presented groups (in Russian) (Moscow, USSR Academy of Sciences; Central Economics-Mathematics Institute; “Science” Publishing House, 1967). 20. HERMFS,H., Aufzahlbarkeit, Entscheidbarkeit, Berechenbarkeit. 
Einfiihrung in die Theorie der rekursiven Funktionen (Berlin, Heidelberg, New York, Springer-Verlag, 1961; English translation: Springer-Verlag, 1965). 21. IHRIG,A. H., The Post-Linial theorems for arbitrary recursively enumerable degrees of unsolvability, Notre Dame Journal of Formal Logic 6 (1965) 54-72. 22. KERVAIRE, M. A,, A manifold which does not admit any differentiable structure, Commentarii Mathematici Helvetici 34 (1960) 257-270. S. C., Introduction to metamathematics (Amsterdam, North-Holland Publ. 23. KLEENE, Co., 1952; fourth reprint 1964). 24. MARKOV, A. A., Impossibility of algorithms for recognizing some properties of associative systems (in Russian), Dokl. Akad. Nauk SSSR 77 (1951) 953-956. (This paper can be understood completely from a review in J. Symb. Logic 17 (1952) P. 151 by A. Mostowski.) 25. MARKOV, A. A., Theory of algorithms; 444 pages, published for the U. S. National Science Foundation by the Israel Program for Scientific Translation, 1961. Available from the Office of Technical Services, U. S. Department of Commerce. A. A., Insolubility of the problem of homeomorphy, Proc. Intern. Congress 26. MARKOV, of Mathematicians, 1958 (Cambridge University Press) 300-306.

74

w. w. BOONE, w. HAKEN and v.

PO~NARU

27. MILNOR,J., On manifolds homeomorphic to the 7-sphere, Ann. Math. 64 (1956) 399-405. 28. MILNOR, J., A procedure for killing the homotopy groups of differentiable manifolds, Symposia in Pure Mathematics, Am. Math. SOC.,Vol. I11 (1961) 39-55. 29. MILNOR,J., Two complexes which are homeomorphic but combinatorially distinct, Ann. Math. 74 (1961) 575-590. 30. MOISE,E. E., Affine structures in 3-manifolds. V. The triangulation theorem and Hauptvermutung, Ann. Math. 56 (1952) 96-114. 31. MUNKRES, J., Obstructions to the smoothing of piecewise-differentiable homeomorphisms, Ann. Math. 72 (1960) 521-554. J., Elementary differential topology, Ann. Math. Studies No. 54 (Prince32. MUNKRES, ton University Press, 1966). 33. NASH,J., Real algebraic manifolds, Ann. Math. 56 (1952) 405421. 34. NEUMANN, B. H., An essay on free products of groups with amalgamations, Phil. Trans. Roy. SOC.London, Ser. A 246, No. 919 (1954) 503-554. C. D., Some problems on 3-dimensional manifolds, Bull. Am. 35. PAPAKYRIAKOPOULOS, Mat?. SOC.64 (1958) 317-335. 36. POENARU, V., Sur la theorie des immersions, Topology 1 (1966) 81-100. 37. POST,E. L., Recursively enumerable sets of positive integers and their decision problems, Bull. Am. Math. SOC.50 (1944) 284-316. ofaproblemofThue, J. Symb. Logic 11 (1947)l-11. 38. P~~~,E.L.,Recursiveunsolvability 39. POST,E. L. and S. LINIAL,Abstract, Bull. Am. Math. SOC.55 (1949) p. 50. 40. RABIN,M. O., Recursive unsolvability of group theoretic problems, Ann. Math. 67 (1958) 172-194. 41. REIDEMEISTER, K., Topologie der Polyeder und kombinatorische Topologie der Komplexe (Leipzig, Akademischer Verlag, 1953). 42. ROGERS, H., JR.,Theory of recursive functions and effective computability (New York, McGraw-Hill, 1967). 43. ROTMAN, J. J., The theory of groups. An introduction (Boston, Allyn and Bacon, Inc., 1965). 44. SACKS,G. E., Degrees of unsolvability, Ann. Math. Studies No. 55 (Princeton University Press, 1963). H. and W. 
THRELFALL, Lehrbuch der Topologie (Leipzig, Teubner, 1934). 45. SEIFERT, 46. SINGLETARY, W. E., Recursive unsolvability of a complex of problems proposed by Post, J. Faculty of Science, Univ. Tokyo, Sec. I, 14 (1967) 25-58. 47. SMALE,S., Generalized Poincark conjecture in dimensions greater than four, Ann. Math. 74 (1961) 391406. E. H., Algebraic topology (New York, McGraw-Hill, 1966). 48. SPANIER, 49. SULLIVAN, in preparation. 50. TARSKI, A., A decision method for elementary algebra and geometry (Santa Monica, Rand, 1948; Paris, Institut Blaise Pascal, 1967). 51. WHITEHEAD, J. H. C., On Ckomplexes, Ann. Math. 41 (1940) 809-824. 52. WHITNEY, H., Differentiable manifolds, Ann. Math. 37 (1936) 645-680. 53. YASUHARA, A. H., A remark on Post normal systems, J. Assoc. Computing Machinery 14 (1967) 167-171. 54. YNTEMA, M. K., A detailed argument for the Post-Linial theorems, Notre Dame Journal of Formal Logic 5 (1964) 37-50.

CONSTRUCTIVE THERMODYNAMICS

W. K. BURTON
Department of Natural Philosophy, The University, Glasgow

1. The purpose of this note is to discuss the feasibility of formulating a fundamental part of physics in a constructive manner. As a starting point we take the formulation of thermodynamics given by Robin Giles [2]. In this book, Giles effects a complete separation between the physical and the mathematical aspects of the theory, and presents the latter as an informal axiomatic theory measuring up fully to the standards of rigour customary in contemporary mathematics. Its reformulation as a formal theory would present no particular difficulty, but there are reasons for believing it to be worth while to attempt this in a constructive sense, making slight modifications in the original theory if necessary. These reasons stem from the physical aspects of the theory. In addition to the various mechanisms for producing theorems (derived formulae) it is necessary, in a physical theory, to lay down certain rules of interpretation which connect at least some of the formulae with practical actions. In the past this kind of problem has not received much attention, and the further great merit of Giles’s approach is that for the first time questions of this sort are submitted to a precise analysis.

The axioms of the theory contain just four primitive concepts which are called ‘state’, ‘union’ of states, the relation of a state ‘going to’ a state, and the relation of a state being ‘equal’ to a state. Giles’s theory being informal, there will of course be further primitive concepts, for example logical ones, which will have to be taken into account in a complete formalisation. As a matter of fact Giles himself appears not quite to count equality between states as one of his primitive concepts, perhaps feeling that it belongs to a different level from the others.
Denoting states by small Roman letters with or without subscripts, we have primitive formulae of the form $a = b$, $a + b = c$, $a \to b$ (read as ‘state a equals state b’, ‘state a plus (union) state b equals state c’, and ‘state a goes to state b’, respectively). Formulae then result by combining primitive formulae by means of the logical particles. Giles’s idea is now the following: if rules are laid down which attribute unique meanings to the primitive formulae, all the formulae will acquire unique meanings. These rules, which he calls primitive rules of interpretation, permit other derived concepts to be introduced by means of explicit definitions, and these derived concepts are thereby ‘explained’ in terms of the primitive ones. No other concepts besides primitive and derived ones appear. The axioms of the theory, being formulae, also acquire an interpretation, and the question arises as to whether the axioms are true under this interpretation. If they are, then the theorems will also be true, provided the rules of inference lead from true formulae to true formulae.

Giles selects the aspects of experience which are linked to the primitive concepts in the mathematical theory by the primitive rules of interpretation to be as ‘direct’ as possible. An experience is direct to the extent that it can be demonstrated rather than explained in terms of other (more direct) ones. The implied ordering of experience according to directness is admittedly rather crude: it corresponds roughly to an order of concept formation in a child as it matures. On the theoretical level the direct experiences are supposed to correspond in some way with primitive concepts in a theory, and the less direct ones to derived concepts. The theory then, as it were, ‘explains’ the indirect aspects of experience in terms of the direct ones.

2. Before presenting the axioms of Giles’s theory, we wish to summarise Giles’s own discussion of his rules of interpretation. We do this not only to give the theory some intuitive content, but also because we wish to consider later on some modifications in these rules. The main purpose of a physical theory is to make predictions. The basis on which these predictions are made consists of prior knowledge about the ‘system’ which is under investigation.
This knowledge, in its turn, consists of information about what has happened to the system in the past: in other words of how the system has been prepared. Thus we consider that the basis on which predictions are made is the method of preparation (of a system), and it is this which we wish to call the state (of a system). We use capital Roman letters A, B, ..., to denote systems. Then a state a of a system A may be designated by adding a subscript to A: thus $A_1, A_2, \ldots$, are states of the system A. In the mathematical theory, systems are not alluded to at all, the method of preparation being taken as including a specification of how the system is selected or produced. Thus ‘system’ will appear only, if at all, as an arbitrary collection of states.

If we have two systems A and B then we can conceive of them jointly as forming a compound system, denoted by ‘A + B’, consisting of the conceptual union of systems A and B. In this union A and B are both considered as isolated. In fact a system can only be prepared in isolation, for if the method of preparation produced the system together with some ‘environment’, the position of the boundary between system and environment would have to be explained, and then the state would no longer be determined by the method of preparation alone. Accordingly, the term ‘state’ can only refer to conditions in which the system concerned is isolated. It is clear that + is associative and commutative. Given any system A it is possible in principle to construct a finite number of replicas of A. Thus ‘A + A’ has a meaning: it is the union of A with a replica of A. We denote it by ‘2A’. Similarly if m is a positive integer, ‘mA’ denotes the union of m replicas of A.

Just as we can add systems, so we can form in a natural way the union $A_1 + B_1$ of any two states $A_1$ and $B_1$ of systems A and B. We define $A_1 + B_1$ to be the state of the system A + B in which A and B are isolated and in the states $A_1$ and $B_1$ respectively. The addition of states is also associative and commutative, and as in the case of systems we can add replicas of the same state: we denote the union of m replicas of $A_1$ by ‘$mA_1$’. Although $A_1 + B_1$ is always a state of the system A + B, not every state of A + B is of this form; only those in which the parts A and B are isolated. Thus the rule of interpretation for $a + b$ is to be: if a and b are states, then $a + b$ is that state whose method of preparation consists in the simultaneous and independent performance of the methods of preparation corresponding to the states a and b. The operation of addition of states may be regarded as defining a relation $a + b = c$ between three states a, b and c.

We now consider another relation between states connected with the natural evolution of a state with time. If, during some time interval, the state of a system A changes, a natural process is said to have occurred. In general, A will interact with other systems during such a process. Suppose A is part of a larger system I which remains isolated throughout the process. Thus I contains, together with A, every system with which A interacts during the process. Although these systems do not remain isolated during the process, it is possible that, for some of them, the initial and final states may coincide. If so, we say that they are not involved in the process. A system is involved in a process if and only if its initial and final states differ. If there exists a natural process involving only a system A which has initial and final states $A_1$ and $A_2$ respectively, then we write ‘$A_1 \to A_2$’ (read “$A_1$ goes to $A_2$”).

Thus the rule of interpretation for $a \to b$ is to be: $a \to b$ if and only if there is a state k and a time interval $\tau$ such that $a + k$ evolves in isolation in the time $\tau$ into the state $b + k$.

With these explanations we have arrived at rules of interpretation for ‘state’, ‘$+$’ and ‘$\to$’. When are two states to be regarded as equal? Clearly if two states are prepared in the same way they should be regarded as equal. However, even if two states are not equal in this sense, but nevertheless any two experiments applied to these two states yield the same result (or rather the same statistical distribution of results), then these states need not be distinguished. This gives rise to a wider notion of equality, which in fact is the one which Giles uses in his book.
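The rules of interpretation for union and for ‘goes to’ can be illustrated by a small toy model. The following is only a sketch under stated assumptions, not part of Giles’s theory: states are modelled as multisets of hypothetical preparation labels (`a0`, `b0`, `k0`), equality is taken to be equality of multisets, and the table `OBSERVED` of evolutions in isolation is invented for the illustration. The function `search_witness` looks for a state k and a time $\tau$ as required by the rule for $a \to b$.

```python
from collections import Counter

def key(state):
    """Hashable fingerprint of a state (a multiset of preparation labels)."""
    return frozenset(state.items())

def union(a, b):
    """State union a + b: both preparations carried out independently."""
    return a + b

def replicas(m, a):
    """The state m*a: union of m replicas of a (m a positive integer)."""
    result = Counter()
    for _ in range(m):
        result = union(result, a)
    return result

# Hypothetical states; the labels are assumptions made for this sketch.
A = Counter({"a0": 1})
B = Counter({"b0": 1})
K = Counter({"k0": 1})

# Hypothetical record of evolutions in isolation: (start, tau, end).
OBSERVED = {(key(union(A, K)), 10, key(union(B, K)))}

def is_witness(a, b, k, tau, observed=OBSERVED):
    """True if a + k was observed to evolve in time tau into b + k."""
    return (key(union(a, k)), tau, key(union(b, k))) in observed

def search_witness(a, b, catalysts, times, observed=OBSERVED):
    """Return a pair (k, tau) establishing a -> b, or None if none was found.

    None means only that this finite search failed; it does not refute a -> b.
    """
    for k in catalysts:
        for tau in times:
            if is_witness(a, b, k, tau, observed):
                return k, tau
    return None
```

In this sketch commutativity and associativity of union hold automatically, because multiset addition has these properties. A successful search establishes $a \to b$, whereas an unsuccessful one leaves the question open, since other choices of k and $\tau$ remain untried.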

3. It is convenient [2] to characterise thermodynamics by making use of the concept of a primitive observer (for thermodynamics). Such an observer is a being whose direct experience embraces only the physical aspects of experience associated by the primitive rules of interpretation with the primitive concepts ‘state’, $+$, $\to$ and $=$. That is, he is directly aware of states and relations among them of the forms $a = b$, $a + b = c$ and $a \to b$, but of nothing else. Thermodynamics can now be characterised as a physical theory which is meaningful to such an observer, and which could, indeed, have been developed by him. The specification of the concept of primitive observer for a theory amounts to the specification of a range of observational powers sufficient to guarantee that the theory can actually be applied in practice. As we shall see later, meagre though the powers of a primitive observer for thermodynamics may look, they transcend in important respects the powers of human observers.

4. We now present Giles’s axioms for thermodynamics as given in Appendix A of his book [2]. Consider a non-empty set $\mathfrak{S}$ whose elements will be called states. We postulate in $\mathfrak{S}$ an operation $+$ and a relation $\to$ satisfying the following axioms.

AXIOM 1. In $\mathfrak{S}$
(i) if $a, b \in \mathfrak{S}$ then $a + b \in \mathfrak{S}$, $a + b = b + a$, and if $a, b, c \in \mathfrak{S}$ then $a + (b + c) = (a + b) + c$;
(ii) $a \to a$;
(iii) $a \to b \wedge b \to c \Rightarrow a \to c$;
(iv) $a + c \to b + c \Leftrightarrow a \to b$;
where in (ii)-(iv) $a, b, c \in \mathfrak{S}$.

AXIOM 2. If $a, b, c \in \mathfrak{S}$, then $a \to b \wedge a \to c \Rightarrow b \to c \vee c \to b$.

DEFINITION 1. A process is an ordered pair of states $(a, b)$. Denote the set of all processes by $\mathfrak{P}$; denote the elements of $\mathfrak{P}$ by small Greek letters, $\alpha, \beta, \gamma, \ldots$. Define an operation $+$ in $\mathfrak{P}$ by

$(a, b) + (c, d) = (a + c, b + d)$,

a relation $\to$ in $\mathfrak{P}$ by

$(a, b) \to (c, d) \Leftrightarrow a + d \to b + c$,

and a relation $\sim$ in $\mathfrak{P}$ by setting $(a, b) \sim (c, d)$ whenever there is a state x such that $a + d + x = b + c + x$. It is easily shown that $\sim$ is an equivalence relation with respect to which $+$ and $\to$ are compatible. Henceforth equivalent elements in $\mathfrak{P}$ are identified. In particular all processes of the form $(a, a)$ are equal: denote any such process by 0. If $\alpha$ is the process $(a, b)$, denote the process $(b, a)$ by $-\alpha$. Then $0 + \alpha = \alpha$ and $\alpha + (-\alpha) = 0$, and $\mathfrak{P}$ turns out to be an abelian group under $+$ with zero element 0.

DEFINITION 2. $\alpha$ is natural if $\alpha \to 0$, antinatural if $0 \to \alpha$, possible if $\alpha \to 0 \vee 0 \to \alpha$, reversible if $\alpha \to 0 \wedge 0 \to \alpha$. It is irreversible if it is possible but not reversible, and impossible if it is not possible. The set of all natural (antinatural, possible, reversible) processes is denoted by $\mathfrak{P}_N$ ($\mathfrak{P}_A$, $\mathfrak{P}_P$, $\mathfrak{P}_R$). It is easily shown that $\mathfrak{P}_P$ and $\mathfrak{P}_R$ are subgroups of $\mathfrak{P}$.

DEFINITION 3. Given states a and b, if there exists a positive integer n and a state c such that $(na + c, nb) \in \mathfrak{P}_P$ we write $a \subset b$ (read “a is contained in b”).

DEFINITION 4. A state e is an internal state if, given any state x, there exists a positive integer n such that $x \subset ne$.
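The group structure described in Definition 1 can be made concrete in a toy model. The sketch below rests on assumptions that are not in the source: states are multisets of hypothetical labels (`x`, `y`, `z`), and equality of states is equality of multisets, so that a common summand can always be cancelled. Under these assumptions $(a, b) \sim (c, d)$ reduces to $a + d = b + c$, and each equivalence class can be stored as the net change from a to b.

```python
from collections import Counter

def equivalent(a, b, c, d):
    """(a, b) ~ (c, d) in the multiset model, where cancellation holds."""
    return a + d == b + c

def process(a, b):
    """Class of the process (a, b), stored as the nonzero net changes b - a."""
    labels = set(a) | set(b)
    return {x: b[x] - a[x] for x in labels if b[x] != a[x]}

ZERO = {}  # the class of every process (a, a)

def add(alpha, beta):
    """Sum of two process classes, with zero entries dropped."""
    labels = set(alpha) | set(beta)
    total = {x: alpha.get(x, 0) + beta.get(x, 0) for x in labels}
    return {x: n for x, n in total.items() if n != 0}

def neg(alpha):
    """The reverse process -alpha, i.e. the class of (b, a)."""
    return {x: -n for x, n in alpha.items()}

# Hypothetical states used only for illustration.
a = Counter({"x": 1})
b = Counter({"y": 1})
c = Counter({"x": 1, "z": 2})
d = Counter({"y": 1, "z": 2})
alpha = process(a, b)
```

Here `process(a, b)` and `process(c, d)` coincide exactly when `equivalent(a, b, c, d)` holds, which is one way of seeing why identifying equivalent processes yields an abelian group in this model.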

AXIOM 3. There exists an internal state.

AXIOM 4. Given a process $\alpha$, if there exists a state c such that for any positive real number $\varepsilon$ there exist positive integers m, n and states x, y such that $m/n < \varepsilon$, $x \subset mc$, $y \subset mc$ and $(x, y) + n\alpha \to 0$, then $\alpha \to 0$.

Using methods from functional analysis Giles derives from these axioms:

GILES’S MAIN THEOREM. There exists a positive additive function of state S, called the quasi-entropy, and a set of positive additive functions of state, called components of content, such that for any states a and b, $a \to b$ if and only if $S(a) \le S(b)$ and $Q(a) = Q(b)$ for every component of content Q.

It is this theorem which shows, apart from a few minor details, that the physical discipline of pure thermodynamics is accessible to a primitive observer in Giles’s sense. The significance of ‘positive’ in the Main Theorem is this: positive functions are bounded.

DEFINITION 5. Define the norm $\|a\|$ of any state a relative to the internal state e by

The topology defined by the norm turns out to be independent of the choice of e.

DEFINITION 6. A real valued additive function $Q(a)$ defined for every state a is bounded if there exists a constant k such that, for all a, $|Q(a)| \le k\|a\|$.

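The criterion in the Main Theorem can be checked against Axioms 1(ii)-(iv) and 2 in a toy model. The sketch below assumes, purely for illustration, that a state is a pair (S, Q) of non-negative integers standing for a quasi-entropy value and a single component of content, that union adds both entries, and that ‘goes to’ is defined by the criterion of the theorem. The function `check_axioms` and the finite list `STATES` are hypothetical names introduced here.

```python
from itertools import product

def union(a, b):
    """Union of states: entropy and content are both additive."""
    return (a[0] + b[0], a[1] + b[1])

def goes_to(a, b):
    """a -> b iff S(a) <= S(b) and Q(a) == Q(b)."""
    return a[0] <= b[0] and a[1] == b[1]

def check_axioms(states, rel=goes_to):
    """Return the label of the first axiom that fails on `states`, else None."""
    for a in states:
        if not rel(a, a):
            return "1(ii)"
    for a, b, c in product(states, repeat=3):
        if rel(a, b) and rel(b, c) and not rel(a, c):
            return "1(iii)"
        if rel(union(a, c), union(b, c)) != rel(a, b):
            return "1(iv)"
        if rel(a, b) and rel(a, c) and not (rel(b, c) or rel(c, b)):
            return "2"
    return None

# A small finite sample of states; the axioms quantify over all states,
# so this is a sanity check on the sample and not a proof.
STATES = [(s, q) for s in range(4) for q in range(3)]
```

On this sample `check_axioms(STATES)` finds no violation. Replacing the inequality by a strict one makes Axiom 1(ii) fail at once, since no state would then go to itself; this is why the entropy condition must be read as non-strict.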
5. For reasons of space, we are obliged to refer the reader to Giles’s book [2] for an account of how the theorems are derived from the axioms. It will be found that what is done is in accord with the standards of rigour customary among contemporary mathematicians. It will, however, be clear that it is not just Giles’s explicitly stated axioms which are involved in establishing the theorems, but also inferential machinery of a logical and mathematical character not covered by the axioms themselves. That is also the situation in ordinary mathematics. But in a physical theory interpretational questions are involved in addition to purely mathematical ones. Giles deals with these interpretational questions by trying to establish some at least of the axioms as ‘true’ in some specific sense, which may be in need of further clarification, but which in any case necessarily calls into question the justifiability of the inferential machinery used. In other words the inferential machinery itself requires an interpretation.

As his theory stands at present, there are at least two places where difficulties may be anticipated. One of these concerns the application of the axiom of choice in the proof of an extension theorem of the Hahn-Banach type which is used to extend an additive function on a subgroup to the whole group, and the other concerns an application of the law of excluded middle of a kind best illustrated by means of a quotation: “for each value of n (an integer) the process $\alpha_1 = (m_1 x, c) + m_1 n(a, b)$ is either natural or antinatural. Suppose first $\alpha_1$ is natural for an infinite number of values of n, then $a \to b$. If $\alpha_1$ is natural for only a finite number of values of n then it is certainly antinatural for an infinite number of values of n.

Then a similar argument yields $b \to a$”. This is reminiscent of the situation encountered in trying to prove that every rational sequence has a monotone subsequence.

It seems that these difficulties can be satisfactorily resolved by making use of methods pointed out by Paul Lorenzen in his books [4,5] (see also [3]) where he sets up a form of constructive analysis which deviates only slightly from classical analysis as far as applications are concerned, and in which no axioms are assumed. If Lorenzen’s theory were axiomatised, then the axioms would be provable, because rules of interpretation could be introduced linking the primitive concepts with concretely specified actions in such a way that the axioms became true under these rules. Lorenzen does not, of course, express himself in these terms: he starts with inductive definitions which are used for defending prime propositions, out of which compound propositions can be defined by reference to obligations taken on in asserting them; then abstractions are introduced, and so on. As a matter of fact not all the formulae and theorems of Lorenzen actually receive an interpretation: instead the use of classical logic is justified by means of a consistency proof. This means, for example, that the use of the law of excluded middle is accepted as a fiction, but as a fiction which provably does no harm.

If we grant that the primitive observer in thermodynamics has ‘established’ the Giles axioms as true, and that no more than a countable number of states can come into consideration, so that the use of the axiom of choice is avoided, then there would seem to be no difficulty in completing the Giles theory as a constructive theory in Lorenzen’s sense simply by adjoining Lorenzen’s considerations to those of Giles. However, it seems to the present writer that Giles’s primitive observer, although having only meagre-looking powers, has some which transcend those of any human observer.
For example he can tell of any pair of states a and b whether $a \to b$ or not. Accordingly the proposition $a \to b$ becomes truth definite [4]: there is a procedure which, when applied to the proposition, yields one and only one of two truth values, true or false. However, for any human observer all we are entitled to say is that the proposition $a \to b$ is at most proof definite [4], i.e., there exists a procedure which, when applied to another procedure which is applied to this proposition, yields a decision as to whether the second mentioned procedure constitutes a proof of $a \to b$ or not. What is required for establishing the truth of $a \to b$ is the following. We have to find a state k and a time interval $\tau$ such that $a + k$ evolves in time $\tau$ into $b + k$. If we have found a suitable state k and time interval $\tau$ so that this happens, then we have established that $a \to b$; but if we have not, we have not shown that $a \to b$ is false. Perhaps further efforts to find suitable k and $\tau$ will be successful, so that $a \to b$ after all.

We can also describe the situation in terms of systems and apparatus like this: the system A can be in states $A_1$ and $A_2$. Starting with system A in the state $A_1$ we place it in interaction with a piece of equipment K (the apparatus) which is in the initial state $K_1$. This represents the start of an experiment on A. We then wait a time $\tau$, with A + K kept isolated, and at the end of that time we separate A and K so that both A and K, individually, are isolated. If A is now in the state $A_2$ and K is again in the state $K_1$, then we shall have shown by experiment that $A_1 \to A_2$. In any other case all we can say is that we have failed to decide whether $A_1 \to A_2$ or not.

All that we have said so far about the problem of deciding about $a \to b$ presumes that there are no problems connected with the notion of equality of states; but there are difficulties here as well, and these difficulties have repercussions on the rule of interpretation for $a \to b$ too. Expressed more fully, this rule would read: $a \to b$ if there is a state k and a time interval $\tau$ such that the state whose method of preparation is “apply simultaneously and independently the methods of preparation corresponding to a and k and then wait for a time $\tau$” is indistinguishable from the state whose method of preparation is “apply simultaneously and independently the methods of preparation corresponding to b and k” in the sense that any experiment applied to these states will yield the same result (or rather the same statistical distribution of results) in each case. Furthermore a consideration of microscopic systems suggests that, strictly speaking, this can only apply to experiments which are concerned with only one of the parts of the state and which do not look for a correlation between these parts.
Since the methods of preparation are independent there will be no such correlation for $b + k$; but in the state evolved from $a + k$ there will generally be some correlation - for instance a high energy for one part will tend to be accompanied by a low energy for the other. Thus in the above rule of interpretation, the phrase “into the state $b + k$” in the original formulation: ‘$a \to b$ if there is a state k and a time interval $\tau$ such that $a + k$ evolves (in isolation) in the time $\tau$ into the state $b + k$’ should strictly be replaced by “into a state which, ignoring the correlation between the parts, is indistinguishable from $b + k$”. In the model furnished by statistical mechanics it is just this procedure of ignoring correlation which accounts for the irreversible nature of thermodynamic processes.

We see then that we are confronted with the problem of deciding whether or not two states are equal even when we know that they have arisen in quite different ways. If two states were counted equal only when their methods of preparation consisted of identical actions there would be no difficulty, but the foregoing discussion shows that we need a wider notion of equality than this: two states are still to be identified when no subsequent experiments - which may involve deciding on the truth or falsity of propositions of the form $a \to b$, once more - can reveal a difference. Thus some propositions $a = b$ will only be refutation definite, i.e. there will only be available a decidable refutation concept. To put this another way, propositions of the form $a \neq b$, but not $a = b$, will be proof definite. This seems to threaten propositions of the form $a \to b$ with not even being proof definite, and thus in turn to threaten propositions of the form $a = b$ with not even being refutation definite. The same applies to propositions of the form $a = b + c$: if a is defined as $b + c$ in a particular context, all is well, but if it is a question of deciding whether a state $b + c$ is equal or not to an independently specified state, then tests of the form $a \to b$ may again be needed, and this proposition may be neither proof nor refutation definite.

One way out of this dilemma would be to try to make equality truth definite by replacing the rule of interpretation which has been given for ‘state’ by another one. There is also a possibility of retaining equality as refutation definite and trying to make $a \to b$ dialogue definite [4] - and managing with that. We prefer however, at this stage, to examine the first of these possibilities, since an issue is involved which appears elsewhere in the natural sciences, and which seems very difficult to eliminate. This issue concerns the formation of policies for future behaviour. On the basis of past experience we decide to act in a certain way, but as a result of further experiences which have come about as a result of those actions, we may decide that we ‘should’ have performed other actions instead. That is, our present policies may be revised as a result of further experiences.
This train of thought seems to suggest that we are concerned with ‘states of mind’ - a change of state of mind having occurred when a policy is changed. We now reformulate the revised rule of interpretation for ‘state’ as: a state is a state of mind induced by the available knowledge of a method of preparation. In formulating this rule of interpretation we have acknowledged the presence of a subjective element, and one of our tasks will be that of showing how conclusions can be reached which do not depend on the personal peculiarities of those whose states of mind are referred to. Two ‘states’ are now to be regarded as equal just when the associated states of mind are the same.

Let us imagine that two states a and b have arisen in connection with possibly different methods of preparation. Experiments are now carried out on these states and in every case the results cannot be distinguished. Possibly, if further experiments were to be performed, differences would come to light. However, it may be presumed that the experiments will not be continued indefinitely; a point will come at which the experimenter says that all his efforts to distinguish a and b have failed, and that as far as he can see, he has reached a stage at which his further efforts are disclosing no fresh information. His state of mind is that a and b are equal. He is not at all saying that a and b are identical - he may well agree that further experiments may disclose differences which his experiments have not disclosed so far. By saying that a and b are equal he is merely saying that whatever predictions he is prepared to make about a he is prepared to make about b and conversely; if he were a betting man he would be prepared to lay exactly the same odds in each case. In saying that equality is truth definite we mean simply that the experimenter can decide, yes or no, whether his state of mind about a is the same as his state of mind about b, in so far as his predictions about a and b in future experiments are concerned. With this altered interpretation of state, $a = b$ and $a + b = c$ become truth definite, and $a \to b$ proof definite.

6. We return now to Giles’s axioms and ask to what extent they can be justified. Axiom 1 (i) runs: if $a, b \in \mathfrak{S}$ then $a + b \in \mathfrak{S}$, $a + b = b + a$, and if $a, b, c \in \mathfrak{S}$ then $a + (b + c) = (a + b) + c$. Thus the set $\mathfrak{S}$ of states is required in particular to be closed under the operation $+$. In a constructive theory the set $\mathfrak{S}$ will not be ‘given’ in advance, but will have to be constructed from the states which will constitute its elements. We start with various states a, b, c, ... and declare them to be in $\mathfrak{S}$. If a and b are in $\mathfrak{S}$, then the rule of interpretation for $+$ shows that $a + b$ is a state: accordingly it is put into $\mathfrak{S}$ as well.
Thus the set $\mathfrak{S}$ is closed in a sense entirely analogous to the sense in which the set N of numerals is closed in Lorenzen’s arithmetic. In this case we have rules for producing numerals:

$\Rightarrow\ |$
$n \Rightarrow n|$
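The sense in which such a set is ‘closed’ by construction can be sketched in code. The example below is an illustration only: it assumes that numerals are represented as strings of strokes, and that states are multisets of hypothetical labels with union given by multiset addition. Neither set is given in advance; elements exist only once they have been produced by the rules.

```python
from collections import Counter

def numerals(limit):
    """Yield the first `limit` numerals produced by the two rules:
    start with '|', and from any numeral n produce n followed by '|'."""
    n = "|"
    for _ in range(limit):
        yield n
        n = n + "|"

def generate_states(stock, rounds):
    """Put the initial stock into the set, then repeatedly add all unions
    of states already produced, for a fixed number of rounds."""
    def key(state):
        return frozenset(state.items())

    produced = {key(s): s for s in stock}
    for _ in range(rounds):
        current = list(produced.values())
        for a in current:
            for b in current:
                u = a + b
                produced.setdefault(key(u), u)
    return list(produced.values())

# Hypothetical initial stock of two states.
stock = [Counter({"a": 1}), Counter({"b": 1})]
first_round = generate_states(stock, 1)
```

After one round the stock of two states has grown to five: the two initial states together with the three distinct unions that can be formed from them. Further rounds keep producing new states, just as the numeral rules never terminate.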

Each figure, as it is produced, is ‘put’ in N. Just as the numerals |, ||, |||, ... are ‘generated’ by these rules, so are the states generated from an initial stock of states by the operation of taking unions.

The axioms $a + b = b + a$ and $a + (b + c) = (a + b) + c$ are again justified by the rule of interpretation for $+$. For example both $a + b$ and $b + a$ refer to the simultaneous and independent performance of the methods of preparation associated with a and with b. The fact that a comes before b, or b before a, in a linear arrangement of symbols is something which is forced on us only by the type of notation we have selected. Similarly in the case of the associative axiom. Thus we have ‘justified’ Axiom 1 (i).

The justification for Axiom 1 (ii) is found by reference to the rule of interpretation for $\to$. If we take $\tau = 0$ in that rule then for $a \to a$ to be true we have to find a state k such that $a + k$ evolves in zero time into $a + k$. Any state k will do for this purpose.

Axiom 1 (iii), namely $a \to b \wedge b \to c \Rightarrow a \to c$, contains logical particles as well as Giles’s primitive terms. If we interpret these operatively [3] or in terms of dialogues [4], then Axiom 1 (iii) would be justified by giving a winning strategy for it. Whoever asserts this axiom is obliged to assert and defend $a \to c$ if an opponent is prepared to assert $a \to b$ and $b \to c$ and can successfully defend them. Now $a \to b$ and $b \to c$ are proof definite. Introducing temporarily the notation $\xrightarrow{\tau}$ to mean “evolves in isolation in time $\tau$ into”, proofs of these propositions involve establishing, with suitable $k_1$ and $k_2$ and $\tau_1$ and $\tau_2$:

    (a)    $a + k_1 \xrightarrow{\tau_1} b + k_1$

    (b)    $b + k_2 \xrightarrow{\tau_2} c + k_2$

and we can tell whether or not the opponent possesses proofs in this sense. If he does possess such proofs, the proponent now attempts to prove $a \to c$ by using the opponent's (a) to produce the intermediate state b from a and then using the opponent's (b) to produce c from b, so establishing

    (c)    $a + k \xrightarrow{\tau} c + k$

for suitable k and $\tau$. The question now, however, is: what is the state k? Intuitively, the opponent may have employed two quite separate pieces of apparatus $K^{(1)}$ and $K^{(2)}$ with initial and final states $k_1, k_1$ and $k_2, k_2$ respectively. What should the proponent's apparatus be? The most obvious suggestion is $K^{(1)} + K^{(2)}$; calling this K, it is a matter of arranging that the initial and final states of K be some state k. Since the opponent has produced states $k_1$ and $k_2$ for $K^{(1)}$ and $K^{(2)}$ respectively, this suggests that k should be $k_1 + k_2$. The proponent would then try to arrange that

    (a')    $a + k_1 + k_2 \xrightarrow{\tau_1} b + k_1 + k_2$

and

    (b')    $b + k_1 + k_2 \xrightarrow{\tau_2} c + k_1 + k_2$

so that he finally achieves

    (c')    $a + k_1 + k_2 \xrightarrow{\tau_1 + \tau_2} c + k_1 + k_2$,

giving (c) with $k = k_1 + k_2$ and $\tau = \tau_1 + \tau_2$. But in order to do what is required of him, the proponent must arrange that in stage (a') only $K^{(1)}$ interacts with the system under consideration, and in stage (b') only $K^{(2)}$ interacts with the system, because only that has been done by the opponent. Thus in order to succeed it looks as if the proponent must have the ability to maintain an arbitrary state constant - to freeze it, as we shall say. That is, in stage (a') the state $k_2$ has to be frozen while in stage (b') $k_1$ has to be frozen. So it appears that we must introduce a hypothesis - namely that an arbitrary state can be frozen - in order to be able to defend Axiom 1(iii) hypothetically. This state of affairs certainly falls short of what we originally had in mind. It means that there will be an inherent restriction on the domain of applicability of the theory, the precise nature of which requires further elucidation. We are faced with the problem of distinguishing on the practical level whether or not we are confronted with a situation in which Axiom 1(iii) is 'true'. There is, however, a possible way out of this difficulty: to modify the rule of interpretation for $\to$ once more. We define a new relation $\rightsquigarrow$ as follows:

    (D$_1$)    $a \rightsquigarrow b \rightleftharpoons a \to b \;\lor\; \bigvee_{n} \bigvee_{c_1, \ldots, c_n} (a \to c_1 \land c_1 \to c_2 \land \ldots \land c_n \to b)$

where the $c_i$ ($i = 1, \ldots, n$) are intermediate states. Under this definition $a \rightsquigarrow b$ is still proof definite, and the transitivity law

    $a \rightsquigarrow b \land b \rightsquigarrow c \Rightarrow a \rightsquigarrow c$

holds. Axiom 1(ii) still holds with $\to$ replaced by $\rightsquigarrow$, and Axiom 1(i) is not affected by the replacement of $\to$ by $\rightsquigarrow$. Thus with the new rule of interpretation for $\to$ (namely $\to$ means $\rightsquigarrow$, with $\rightsquigarrow$ defined by (D$_1$) in which the '$\to$' on the right is interpreted by the old rule) we have established that $\mathfrak{S}$ is a partially ordered semigroup, i.e. we have proved Axioms 1(i)-1(iii). The remaining clause of Axiom 1 amounts essentially to two axioms, namely

    $a + c \to b + c \Rightarrow a \to b$,
    $a \to b \Rightarrow a + c \to b + c$.
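The effect of definition (D1) can be seen on a finite toy example. The sketch below is an illustration only and rests on assumptions not in the text: the old relation is given as a finite set of ordered pairs of state names, and the helper names `chain_closure` and `is_transitive` are hypothetical. The new relation is obtained by linking the old one through intermediate states, which makes the transitivity law hold without any appeal to freezing.

```python
def chain_closure(direct):
    """Pairs (a, b) such that b is reachable from a by one or more steps of `direct`."""
    closure = set(direct)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in direct:
                # extend an existing chain a ... b by one further direct step b -> d
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def is_transitive(relation):
    """Check the transitivity law on a finite relation given as a set of pairs."""
    return all((a, d) in relation
               for (a, b) in relation
               for (c, d) in relation
               if b == c)


if __name__ == "__main__":
    old = {("a", "b"), ("b", "c")}   # hypothetical instances of the old relation
    new = chain_closure(old)
    print(is_transitive(old), is_transitive(new))
```

Here the pair ("a", "c") is absent from the old relation but present in the new one, with "b" serving as the intermediate state.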


The first of these is established on the grounds that the state c can be considered as part of the state k in the rule of interpretation for $\to$ in either the new or the old sense. But the second axiom gives rise to the same difficulty as was encountered in connection with Axiom 1(iii): it asserts that the state c can be frozen, and there seems to be no obvious way out of the difficulty in this case. This axiom then, as well as Axiom 2, as it turns out, has to be treated differently from the others. So our axioms will be of two kinds: the ones which can be effectively defended in "all" cases, and those which require for their defence an additional hypothetical element which restricts the defensibility to those cases for which some method is available which shows the axioms in question to be 'true' in those cases, even if not in general. We refer to the first group of axioms as prototheoretically defensible and to the second group of axioms as hypothetically defensible. As far as the question of a state's being freezable is concerned we can often, in practice, achieve what is required by the use of suitable apparatus, possibly in a rather far-fetched sense. For example if we had a sample of gas of non-uniform temperature and pressure as an instance of a state k, we could freeze k by 'instantaneously' inserting a large number of insulating partitions which would serve, approximately, to freeze k. If we were concerned with a solid rather than with a gas this would scarcely be feasible. One would have to call on the possibility of dividing up the solid into small pieces each of which is to be insulated from the others. At a later stage when the freezing is to cease the solid would be reassembled. What really matters here is that the freezing can be done 'in principle'. The conviction in each case can only be arrived at by a consideration of the case in question. Usually the difficulty is not too pressing in practice.
The situation for Axiom 2: $a \to b \land a \to c \Rightarrow b \to c \lor c \to b$ is quite different. As the discussion given by Giles in his book shows, this axiom is not true for systems exhibiting hysteresis. So if Axiom 2 is to be justified at all, it must be justified by reference to tests carried out on the systems under consideration themselves. Each instance of Axiom 2 regarded as a proposition is proof definite but not truth definite. After preliminary trials an experimenter may come to the conviction that Axiom 2 is true; but if he is mistaken in this connection, he will find that when he tries to get upper and lower bounds for the entropy, he will always find a finite gap between them, however many experiments and whatever experiments he performs. We might say that when Axiom 2 is 'false' only an upper and lower entropy (cf. outer and inner measures) is available, whose difference measures the degree of hysteresis exhibited in the system. So with Axiom 2 we are confronted not with a proper axiom at all; rather we might say that a system for which Axiom 2 is true is defined to be hysteresis-free, and that the problem reduces to the practical problem of recognising hysteresis-free systems. Axiom 2 then has 'physical content' in a much stronger sense than Axiom 1, and is decisive in settling the detailed structure of the theory.

7. The motivation underlying Giles's Axioms 3 and 4 comes from the desire for boundedness in the entropy and components of content, and thus is mathematical rather than physical. Since we are now trying to replace Giles's theory by a constructive one, these axioms may be expected to appear in a rather different light. In particular, even though the set of states $\mathfrak{S}$ will in general be (potentially) infinite, in any given case $\mathfrak{S}$ will be generated from a finite number of basic states by taking unions. In such a case the state which consists of the union of all the basic states will constitute an internal state in the sense of Definition 4. So Axiom 3 will be justified for that case. However, it is not possible to choose an internal state once and for all in this way, for it may be possible to enlarge $\mathfrak{S}$ by enlarging the set of basic states which generate $\mathfrak{S}$. If this is done our procedure would give rise to a new internal state. The use of this new internal state instead of the old one would, however, do no more than induce scale changes in the entropy and components of content. Thus in a constructive reformulation of the Giles theory we may perhaps regard the justification of Axiom 3 as unproblematical. The justification of Axiom 4 is then achieved by means of a further modification in the rule of interpretation for $\to$. If we define (D$_2$) $a > b \rightleftharpoons$ there is a state c such that for any positive real number $\varepsilon$, there are positive integers m, n and states x, y such that m/n

As an ordinary physicist would no doubt express it, Axiom 2 stands out as 'containing most of the physics'. It is somewhat remarkable that such a rich looking theory as thermodynamics should be supportable on such a meagre looking base. It is seen that while the Giles theory appears to suggest the feasibility of a fully constructive thermodynamics, and perhaps also of other physical theories, further work is required on the interpretational side (which may well lead to modifications in what appear above as axioms) before we shall be in a position to offer such a theory as a fitting companion to, say, Lorenzen's constructive analysis.

Acknowledgements

It is a great pleasure to thank Professor K. Schütte for his kind hospitality when I spent a year (1964-65) in his Department in Kiel, where the above work was started. I thank him also for the numerous illuminating discussions which clarified my own thinking considerably. It is a pleasure to acknowledge in the same sense the benefits accruing from discussions with Professor R. Giles. Finally I thank the Carnegie Trust for the Universities of Scotland and the then Directorate for Scientific and Industrial Research for Travel and Maintenance Grants, and The University of Glasgow for making my study in Germany possible.

References

1. H. B. CURRY, Foundations of mathematical logic (McGraw-Hill, 1964) p. 48.
2. R. GILES, Mathematical foundations of thermodynamics (Pergamon, 1964).
3. P. LORENZEN, Einführung in die operative Logik und Mathematik (Springer, 1955).
4. P. LORENZEN, Metamathematik (B. I. Hochschultaschenbücher, 1962).
5. P. LORENZEN, Differential und Integral (Akademische Verlagsgesellschaft, 1965).

A DEDUCTION THEOREM FOR INFERENTIAL PREDICATE CALCULUS

H. B. CURRY University of Amsterdam

1. Introduction

By inferential predicate calculus I mean the formulation of predicate calculus of first order, not necessarily classical, by inferential rules of the sort which Gentzen proposed [7]. The theory of such rules has been considerably developed in [4] (for references to preceding work see [4] p. 25), and some additions to it were made in [5]. In this paper we shall be concerned with some further developments. I shall use the notation of [4], except that it will not be necessary to use German letters for prosequences¹, and will suppose the reader is familiar with the main features of that work. The principal new development is a theorem which plays a role in the inferential theory analogous to that which the Deduction Theorem does in the ordinary predicate calculus. It shows, in fact, under certain rather broad conditions, that whenever an elementary statement can be deduced from certain elementary premises by the rules of the system and the elimination theorem, then there is a single elementary statement which, in a sense to be explained later, expresses the same deducibility. Lorenzen [13] proved a theorem of this nature which is the prototype of the present one. Several other analogous theorems exist in the literature²; some of these can be obtained as specializations of the present theorem. The theorem will be stated and proved for what may be called the regular case in Section 2. It will then be extended to some modal cases, which are irregular, in Sections 3 and 4. The remaining section of the paper will deal with applications of this theorem. These include some generalizations of the Glivenko theorem such as that proved in [6].

¹ Thus the letters X, Y, Z, U, V, W will stand here for prosequences; A, B, C, M, N, P, Q, T for propositions (constituents of prosequences); F, as in [4], for a fixed proposition.
² E.g. Belnap and Thomason [2]; cf. the discussion of this in Belnap, Leblanc, and Thomason [1]. Orevkov [17] and Jensen [11] contain results which are somewhat related; but these were not available while this was being written.

2. The deduction theorem

Let LX be an L-system ([4] p. 198), possibly containing quantifiers, which satisfies the following conditions:
(i) LX has the property (r6), subject to the avoidance of conflict with characteristic variables³, at least on the left.
(ii) The rules of LA(D) (and, if quantifiers are present, of LA*(D) also) are admissible in LX;⁴
(iii) ET (i.e. the elimination theorem) is valid for LX.
Elementary statements of LX will be denoted by '$\Gamma$' with or without affixes. Then

    $\Gamma_1, \ldots, \Gamma_m \Rightarrow \Gamma_0$    (1)

shall mean that $\Gamma_0$ is derivable from $\Gamma_1, \ldots, \Gamma_m$ by the rules of LX together with ET. Every elementary statement $\Gamma$ will be of the form

    $X \mid_{a}\vdash Y$.    (2)

Supposing that $\Gamma_0$ is given, we associate with any such $\Gamma$ a proposition T determined as follows: Let

    $X \equiv A_1, A_2, \ldots, A_p$;    $Y \equiv B_1, B_2, \ldots, B_q$;
    $N \equiv B_1 \lor B_2 \lor \ldots \lor B_q$;
    $M \equiv A_1 \supset. A_2 \supset. \ldots \supset. A_p \supset N$.

Then T is formed by closing (i.e. universally quantifying) M with respect to all variables in its range which are not in the range of $\Gamma_0$.⁵ Thus X, Y, M, N, a, T are associated in a unique manner with any given $\Gamma$; if $\Gamma$ is $\Gamma_i$ the corresponding notions will be $X_i$, $Y_i$, $M_i$, $N_i$, $a_i$, $T_i$, etc.; in particular $a_0$ is the range of $\Gamma_0$.

³ If the system contains quantification rules which go beyond LA*, we must suppose that the rules either have the same range in all premises and conclusions, or that the range of the conclusion is obtained from that of the premises by cancelling a certain characteristic variable or variables. In that case the principle (r6) is to hold for all parameters which do not contain any characteristic variables.
⁴ If LX is singular we need only the prime statement scheme (p1), the structural rules (*C, *W, *K), and those related to P and $\Pi$; if LX is multiple or mixed we need also the structural rules on the right and also those related to $\Lambda$. Thus, in the singular case we have essentially the system which Ono [16] calls "primitive logic" and names "LO".
⁵ Then T does not contain any characteristic variable.
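The construction of T from an elementary statement can be sketched mechanically. The following is an illustration only, under stated assumptions: propositions are plain strings, Y is non-empty, the ranges are given as lists of variable names, and the function name `associated_T` and its parameters are hypothetical. It is not meant to reproduce the dot conventions of [4] exactly.

```python
def associated_T(X, Y, range_a, range_a0):
    """Build the proposition T associated with the statement 'X |a- Y' relative to range a0.

    X, Y      -- lists of proposition strings (Y assumed non-empty)
    range_a   -- variables in the range of the statement
    range_a0  -- variables in the range of the given statement Gamma_0
    """
    # N is the disjunction of the constituents of Y
    N = " ∨ ".join(Y)
    # M is the iterated implication A1 ⊃. A2 ⊃. ... ⊃. N, built from the right
    M = N
    for A in reversed(X):
        M = f"{A} ⊃. {M}"
    # T closes M over the variables of the range that are not in a0
    bound = sorted(set(range_a) - set(range_a0))
    prefix = "".join(f"(∀{v})" for v in bound)
    return f"{prefix}({M})" if bound else M


if __name__ == "__main__":
    print(associated_T(["A1", "A2"], ["B1", "B2"], ["x", "y"], ["x"]))
```

When X is void the sketch returns N itself, closed over the appropriate variables, which agrees with the special case noted later in which T reduces to the single constituent on the right.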

THEOREM 1. If LX is an L-system satisfying the conditions (i), (ii), (iii), then a necessary and sufficient condition that (1) hold is that

    $T_1, \ldots, T_m \mid_{a_0}\vdash T_0$    (3)

be an elementary theorem of LX.

Before we prove this theorem we need two lemmas. If $\Gamma$ is any elementary statement (2) and U is a prosequence not containing any variables not in $a_0$, let $\Gamma'$ and $\Gamma''$ be defined as follows:

    $\Gamma' \equiv U, X \mid_{b}\vdash Y$,    $\Gamma'' \equiv U \mid_{a_0}\vdash T$,    (4)

where b is the union of a and $a_0$, and T is determined as above. The statements so associated with $\Gamma_i$ will be called $\Gamma'_i$, $\Gamma''_i$ respectively. Then the two lemmas are as follows.

LEMMA 1. $\Gamma' \Rightarrow \Gamma''$.
Proof.
    $U, X \mid_{b}\vdash Y$    ($\Gamma'$).
    $U, X \mid_{b}\vdash N$    (V*).⁶
    $U \mid_{b}\vdash M$    (P*).
    $U \mid_{a_0}\vdash T$    ($\Pi$*).
Here the first step (using V*) is unnecessary if Y is singular; and the last step follows since U contains no variables not in $a_0$.

LEMMA 2. $\Gamma'' \Rightarrow \Gamma'$.
Proof.
    $B_i \mid_{b}\vdash Y$    (p1), ($i = 1, 2, \ldots, q$).
    $N \mid_{b}\vdash Y$    (*V).
    $X, M \mid_{b}\vdash Y$    (*P).
    $X, T \mid_{b}\vdash Y$    (*$\Pi$).
    $U \mid_{b}\vdash T$    ($\Gamma''$).⁷
    $X, U \mid_{b}\vdash Y$    (ET).

⁶ Here, as in certain cases below, there may be several successive applications of the cited rule in combination with structural rules.
⁷ The change of range from $a_0$ to b is valid by [4] Theorem 7B2.


This completes the proof of the lemmas. We now proceed to the proof of the theorem.

Proof of necessity. Here we let U be $T_1, \ldots, T_m$. Then $\Gamma''_i$ is quasi prime (of type p1) for $i = 1, 2, \ldots, m$; hence, by Lemma 2, there is a proof $\Delta'_i$ of $\Gamma'_i$. Further, by Lemma 1, there is a proof $\Delta''_0$ of

    $\Gamma'_0 \Rightarrow \Gamma''_0$.

By hypothesis there is a proof $\Delta$ of (1). By (r6) and *K⁸ there is a proof $\Delta'$ of

    $\Gamma'_1, \Gamma'_2, \ldots, \Gamma'_m \Rightarrow \Gamma'_0$.

For each $i = 1, 2, \ldots, m$, put $\Delta'_i$ over every occurrence of $\Gamma'_i$ as top node in $\Delta'$, and put $\Delta''_0$ under $\Gamma'_0$ in the bottom node. The result will be a proof $\Delta''$ of $\Gamma''_0$, which is (3).

Proof of sufficiency. Here let U be void. Then, since $\Gamma'$ is $\Gamma$ and $\Gamma''$ is $\mid_{a_0}\vdash T$, we have by Lemma 1

    $\Gamma_i \Rightarrow\ \mid_{a_0}\vdash T_i$,    $i = 1, 2, \ldots, m$.    (5)

From (5), (3) and (2) we have, on the hypothesis that $\Gamma_1, \ldots, \Gamma_m$ hold,

    $\mid_{a_0}\vdash T_0$

by ET, and from this by Lemma 2 we have $\Gamma_0$.

REMARK 1. The theorem was deduced directly from Lemmas 1 and 2; hence it will hold for any definitions of T such that these lemmas hold. For example, if we were to define M by

    $M \equiv A_1 \land A_2 \land \ldots \land A_p \;.\supset.\; N$

and T by quantification closure as before, the theorem would hold for the T's so modified; but of course we should need *$\Lambda$* in order to establish Lemmas 1 and 2.

REMARK 2. The main part of the proof follows closely the proof in Lorenzen [13] pp. 32f. Lorenzen's theorem is more special, not only in that he does not consider explicitly the notion of a general L-system, but also in that each $\Gamma_i$ is of the form (2) with X void and Y singular, say $C_i$, and $a_0$ contains all variables which occur in any of the $C_i$. In such a case $T_i$ is simply $C_i$. However, the general theorem can be deduced from Lorenzen's by certain auxiliary arguments, including the lemmas.

⁸ To take care of prime statements in $\Delta$ (cf. Theorem 2). Note that ET satisfies (r6).


3. Extension to modal systems

In [4] Chapter 8, I outlined a method of extending inferential techniques to include the introduction of an operation of necessity. It will be necessary to recapitulate this method here, and to make some revisions. To get the greatest generality we suppose that there are two separate but interconnected systems, LX$_1$ and LX$_2$, each with elementary statements of the form (2). Such a statement $\Gamma$ will be written in the forms

    $X \mid_{a}\vdash_1 Y$,    $X \mid_{a}\vdash_2 Y$    (6)

just when it is derivable, or postulated as derivable, in LX$_1$ or LX$_2$ respectively; when the suffix is omitted the choice between LX$_1$ and LX$_2$ is unspecified, but is supposed to be the same in different occurrences of '$\Vdash$'⁹ in the same context. In [4] the two systems were called the inner and outer systems respectively; I shall retain this terminology here for occasional use. The theorem which will be proved in this section is of some significance when no modal operations whatever are involved, there being simply an outer system which is an extension of the inner system. But we obtain somewhat greater generality by including modality. The new theorem depends on the following assumptions:
(iv) Each of LX$_1$ and LX$_2$ has nonmodal rules, with all premises and conclusions in the same system, associated with it; with respect to these rules each system is an L-system satisfying the conditions (i), (ii), (iii) of Section 2 and all nonmodal rules of LX$_1$ hold in LX$_2$.
(v) The rules for modality are as follows:

where the premise and conclusion of *Y and the conclusion of Y* can belong to either system.
(vii) The elimination theorem holds in the form in which it was stated in [3] and [4].
We note that the only rule which allows us to transfer from one system to the other is Y*, and this allows us to draw a conclusion in LX$_2$ from one in LX$_1$, but not conversely. Thus an elementary statement in LX$_1$ can be derived only from premises in LX$_1$, using rules of LX$_1$; whereas one in LX$_2$ can sometimes be derived from premises in either LX$_1$ or LX$_2$ or both, and using rules of both systems. Moreover, the rule Y* fails to satisfy (r6) on the left, and is the only rule for which this is true. In connection with deducibility relations analogous to (1) we distinguish two kinds of $\Rightarrow$; viz., $\Rightarrow_1$, if the deduction uses only rules (including modal rules and ET) for which the conclusion is in LX$_1$; and $\Rightarrow_2$ if rules of LX$_2$ (and perhaps also rules of LX$_1$) are used. The sign '$\Rightarrow$' without subscript is used where it is not desired to specify the system. When this occurs in the same context with $\Vdash$, they must both refer to the same system. Given an elementary statement $\Gamma_0$, we associate with any statement $\Gamma$ of the form (2) (i.e. either of the forms (6)) a proposition T in the same manner as in Section 2.

⁹ Note that the two vertical bars in '$\Vdash$' can be separated and a range written between them. The two separated parts are still considered an instance of '$\Vdash$'.

THEOREM 2. Let the assumptions (iv)-(vii) hold, and let an elementary statement $\Gamma_0$ be given. Let $\Gamma_1, \ldots, \Gamma_m$ be elementary statements of LX$_1$, and $\Gamma_{m+1}, \ldots, \Gamma_{m+n}$ be elementary statements of LX$_2$. Then a necessary and sufficient condition that

    $\Gamma_1, \ldots, \Gamma_m, \Gamma_{m+1}, \ldots, \Gamma_{m+n} \Rightarrow \Gamma_0$    (7)

is that

    $\Box T_1, \ldots, \Box T_m, T_{m+1}, \ldots, T_{m+n} \mid_{a_0}\vdash T_0$.    (8)

We proceed with the proof of this theorem in a fashion similar to the proof of Theorem 1. We note first that, if $\Gamma_0$ is in LX$_1$, then by the remark made in regard to the modal rules we necessarily have $n = 0$ and $\Rightarrow$ is $\Rightarrow_1$. For the two lemmas we proceed differently according to whether $\Gamma$ is in the inner or the outer system. Let U, V be prosequences neither of which contains variables not in $a_0$. Then for $\Gamma$ in LX$_1$ we define

    $\Gamma' \equiv X, \Box U \mid_{b}\vdash_1 Y$,    $\Gamma'' \equiv \Box U \mid_{a_0}\vdash_1 T$;

whereas for $\Gamma$ in LX$_2$ we define

    $\Gamma' \equiv X, \Box U, V \mid_{b}\vdash_2 Y$,    $\Gamma'' \equiv \Box U, V \mid_{a_0}\vdash_2 T$.

With these definitions Lemmas 1 and 2 follow as in Section 2. If $\Gamma$ is in LX$_1$, the lemmas hold in the inner system; if $\Gamma$ is in LX$_2$ they hold in the system used for $\Gamma'$ and $\Gamma''$, since all inferences made from the hypothesis are valid in LX$_1$. We now proceed to the proof of the theorem.


Proof of necessity. We follow the general plan of Theorem 1; but it is necessary to go more into detail. For this we let $\Gamma_{r+1}, \ldots, \Gamma_s$, where $r = m + n$, be the steps in the regular deduction $\Delta$ ([4] p. 199). We let the index i range over $1, 2, \ldots, m$; the index j over $m + 1, \ldots, m + n$; and the indices k, h over $r + 1, \ldots, s$. We let U be $T_1, \ldots, T_m$; but we determine the V's separately for each $\Gamma$.

For $\Gamma_i$, we can derive $\Gamma''_i$ from the type (p1) statements

    $T_i \mid_{a_0}\vdash_1 T_i$

by *Y. By Lemma 2 we thus have a proof in LX$_1$ of $\Gamma'_i$.

For $\Gamma_j$, we let $V_j$ be $T_j$. Then $\Gamma''_j$ is quasi prime of type (p1), and hence it is derivable in LX$_2$. By Lemma 2, $\Gamma'_j$ is also derivable in LX$_2$.

Now suppose that $\Gamma_0$ is in LX$_1$. Then for every $\Gamma_k$ in $\Delta$ we make $V_k$ void. We can now treat $\Delta$ just as we did in Theorem 1. Indeed, if we replace each $\Gamma_k$ by $\Gamma'_k$ we shall convert $\Delta$ into a tree $\Delta'$ all of whose inferences are valid in LX$_1$. For the new parameters, viz. the constituents of $\Box U$, are all of the form $\Box T$; and for such constituents all rules of LX$_1$, even Y*, satisfy (r6) on the left. If $\Gamma_k$ is an initial statement of $\Delta$, then $\Gamma_k$ either is some $\Gamma_i$ or is prime in LX$_1$; in the former case $\Gamma'_k$ is LX$_1$-derivable by the second preceding paragraph, in the latter case it is by *K. Thus all $\Gamma'_k$ in $\Delta'$ are LX$_1$-derivable; in particular $\Gamma'_0$ (i.e. $\Gamma'_s$) is.

If $\Gamma_k$ is any statement in $\Delta$ which is in LX$_1$, then we make $V_k$ void. Since the subtree consisting of $\Gamma_k$ and all statements above it has the same structure as the $\Delta$ of the preceding paragraph, it follows by that argument that $\Gamma'_k$ is LX$_1$-derivable.

It remains to consider nodes $\Gamma_k$ which are not claimed to be in LX$_1$. If such a node is an initial node, it is either some $\Gamma_j$ or is LX$_2$-prime; otherwise it is obtained from premises $\Gamma_h$ (or $\Gamma_{h_1}$, $\Gamma_{h_2}$, etc.) by some rule R. If $\Gamma_k$ is some $\Gamma_j$, then we set $V_k = V_j$. Then $\Gamma'_k$ is $\Gamma'_j$, which is LX$_2$-derivable. If $\Gamma_k$ is LX$_2$-prime, then we set $V_k$ void. Since $\Gamma'_k$ is obtained from $\Gamma_k$ by *K, $\Gamma'_k$ is LX$_2$-derivable, but in general it is not LX$_1$-derivable. Suppose now that $\Gamma_k$ is obtained from premises $\Gamma_h$ by a rule R for which (r6) holds. Then we set $V_k$ to be the union of all the $V_h$ for the different premises $\Gamma_h$. By *K we can replace the $V_h$ in each $\Gamma'_h$ by $V_k$; from the so altered premises we can derive $\Gamma'_k$ by the same rule R. Then $\Gamma'_k$ will be LX$_1$-valid if all the premises $\Gamma'_h$ are and R is valid in LX$_1$; otherwise it will be LX$_2$-derivable. Finally we consider the case where $\Gamma_k$ is derived by Y*. Then the premise $\Gamma_h$ is in LX$_1$ and has void $V_h$. We make $V_k$ also void. Then by the same rule $\Gamma'_k$ is obtained from $\Gamma'_h$ and is, indeed, LX$_1$-derivable.

Thus all $\Gamma'_k$ in $\Delta$, in particular $\Gamma'_s$, are at least LX$_2$-derivable. By *K we can add to $V_s$ any $T_j$ such that $\Gamma_j$ does not occur in $\Delta$. Then $\Gamma'_0$ is derivable. By Lemma 1 we have $\Gamma''_0$, which is (8).

Proof of sufficiency. Here as in Theorem 1 we can let U and V be void. Then $\Gamma'$ is $\Gamma$, and hence (5) holds for $i = 1, 2, \ldots, m + n$. If $\Gamma_i$ is in LX$_1$ then we have

    $\Gamma_i \Rightarrow_1\ \mid_{a_0}\vdash_1 T_i$,

and hence, by Y*,

    $\Gamma_i \Rightarrow_1\ \mid_{a_0}\vdash \Box T_i$.

If $\Gamma_i$ is in LX$_2$, on the other hand, we have by (5)

    $\Gamma_i \Rightarrow_2\ \mid_{a_0}\vdash_2 T_i$.

From these and (8), by ET, we have

    $\Gamma_1, \ldots, \Gamma_m, \Gamma_{m+1}, \ldots, \Gamma_{m+n} \Rightarrow\ \mid_{a_0}\vdash T_0$

and from this (7) follows by Lemma 2.

4. Representation of the outer system in the inner

We now prove a theorem which is a generalization of [4] Theorem 8B2. This theorem, even more than that of Section 3, is significant even in the case where modal operations are not present. For this purpose we need a new assumption in addition to those of Section 3, viz.
(viii) If (1) is a rule¹⁰ of LX$_2$ which is not a rule of LX$_1$, then there is an elementary statement $\Gamma^+$¹¹ such that $\Gamma^+$ is derivable in LX$_2$ and

    $\Gamma_1, \Gamma_2, \ldots, \Gamma_m, \Gamma^+ \Rightarrow_1 \Gamma_0$.    (9)

As an example of the assumption (viii) consider the rule Nx, viz.

    $X, A \Vdash Y$        $X, \neg A \Vdash Y$
    ---------------------------------------
                  $X \Vdash Y$

If LX$_2$ is LD, this is a valid rule of LX$_2$. Let LX$_1$ be LM; then from the premises of Nx we have in LX$_1$ by *V

    $X, A \lor \neg A \Vdash Y$.

Hence, if $\Gamma^+$ is

    $\Vdash A \lor \neg A$,

we can derive the conclusion of Nx by ET. Thus we have in LM

    $X, A \Vdash Y$;  $X, \neg A \Vdash Y$;  $\Vdash A \lor \neg A \;\Rightarrow\; X \Vdash Y$,

which is (9) for this case. Of course $\Gamma^+$ is derivable in LD.

¹⁰ This includes the case of prime statements; in such a case we have $m = 0$ in (1).
¹¹ We could consider with very little extra trouble the possibility that there is a set of such statements $\Gamma^+$; but I know of no application for the increased generality (cf. Corollary 3.1).

THEOREM 3. Let the assumptions (iv)-(viii) hold and let $\Gamma_0, \Gamma_1, \ldots, \Gamma_m, \Gamma_{m+1}, \ldots, \Gamma_{m+n}$ be as in Theorem 2. Then a necessary and sufficient condition that (7) hold is that there exist a set $\Gamma^+_1, \Gamma^+_2, \ldots, \Gamma^+_p$, each of which is associated with a rule of LX$_2$ as described in (viii), such that

    $\Box T_1, \ldots, \Box T_m, T_{m+1}, \ldots, T_{m+n}, T^+_1, \ldots, T^+_p \mid_{a_0}\vdash_1 T_0$,    (10)

where $T^+_i$ is the T associated with $\Gamma^+_i$.

Proof of necessity. We modify the proof of necessity in Theorem 2 as follows. For any $\Gamma$ of form (2) we determine prosequences U, V which contain no variables except those in $a_0$. Then we define $\Gamma'$ and $\Gamma''$ as in the proof of Theorem 2, except that these are to be regarded as statements of LX$_1$ in all cases. Then Lemmas 1 and 2 hold in the sense of $\Rightarrow_1$.

We now consider the necessity proof proper. We first modify $\Delta$ so as to make it a proof all of whose inferences are made by rules valid in LX$_1$. We do this by introducing the appropriate $\Gamma^+$ as new premise, which will then become an initial node of the modified $\Delta$. For a $\Gamma_k$ which is LX$_2$-prime we do the same, treating such a statement as obtained by a rule of LX$_2$ without premise. Then in the modified $\Delta$ all initial nodes will either be LX$_1$-prime, or some $\Gamma_j$, or some $\Gamma^+$. If $\Gamma_k$ is some $\Gamma^+$ then we let $V_k$ be the corresponding $T^+$. Then $\Gamma''_k$ is LX$_1$-prime (of type p1), and hence is LX$_1$-derivable. By Lemma 2, $\Gamma'_k$ is LX$_1$-derivable¹². We handle all the other cases exactly as in Section 3. It then follows, by deductive induction, that $\Gamma'_s$, and hence $\Gamma'_0$ and $\Gamma''_0$, are LX$_1$-derivable. But $\Gamma''_0$ is precisely (10).

Proof of sufficiency. By (v) it follows that if (10) holds, as indicated, in LX$_1$, then it holds in LX$_2$. By Lemma 2 we have for each $\Gamma^+$

    $\Gamma^+ \Rightarrow_2\ \mid_{a_0}\vdash_2 T^+$

and hence, since $\Gamma^+$ holds in LX$_2$, we have

    $\mid_{a_0}\vdash_2 T^+$.

¹² This is, of course, the same argument that was used for $\Gamma_j$ in Section 3.


From this and ET for LX$_2$ we have (8). The rest follows by Theorem 2.

REMARK 1. An alternative procedure for proving necessity would be to transform the proof of (8) directly into a proof of (10). I have not investigated this; but the present proof seems to give more information in regard to the relationship of (10) to (7).

REMARK 2. The proof of necessity does not use ET for LX$_2$.¹³ For ET enters into the proof only through the proofs of Lemmas 1 and 2; and in the necessity proof here - in contrast to Theorem 2 - these lemmas are used only in LX$_1$. In the sufficiency proof, however, ET for LX$_2$ is essentially involved (and, unless $m = 0$, that for LX$_1$ is also).

COROLLARY 3.1. The assumption (viii) is a consequence of (iv)-(vii).
Proof. Suppose that (1) is a rule of LX$_2$. Let the $T_i$ be defined as in Section 1, and let $\Gamma^+$ be the statement (3). Then $\Gamma^+$ is derivable in LX$_2$ by Theorem 2 (with 0 for m and m for n). On the other hand assume that $\Gamma_1, \ldots, \Gamma_m$ and $\Gamma^+$ hold in LX$_1$. Then by Lemma 1 (with U and V void)

    $\mid_{a_0}\vdash_1 T_i$,    $i = 1, 2, \ldots, m$.

From this and $\Gamma^+$ we deduce $\Gamma_0$ by ET. Thus (9) holds.

REMARK 3. The corollary does not show that (viii) is superfluous. For the proof of Theorem 3 holds regardless of how the $\Gamma^+$ are defined, provided that (viii) is satisfied. In the case of the rule Nx, adduced above as an example, the $\Gamma^+$ defined there is much simpler than that given by the corollary.

COROLLARY 3.2. If (2) is derivable in LX$_2$, then there is a prosequence W, whose constituents are all of the form $T^+$, such that

    $X, W \mid_{a}\vdash_1 Y$.    (11)

Proof. This is the case where $m = n = 0$.

5. Theorems of Glivenko type

The "Glivenko theorem" is the theorem of Glivenko [8] to the effect that if $\vdash A$ holds in HK, then $\vdash \neg\neg A$ holds in HJ. I shall not attempt to report here on the extensive literature concerned with generalizations and analogues of this theorem; but I shall mention certain ones, and add some details which do not appear to be in the literature or about which certain misconceptions appear to exist.

¹³ This generalizes a similar remark made in connection with [4] Theorem 8B2.
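The propositional Glivenko theorem can be illustrated, though of course not proved, by a small semantic check. The sketch below is an assumption-laden illustration: it evaluates formulas in the three-element linearly ordered Heyting algebra with values 0 < 1 < 2, and relies on the standard fact that every assertion of HJ takes the top value in any Heyting algebra. A formula that fails to do so is therefore not an assertion of HJ; taking the top value everywhere is only a necessary condition for being one. All function names are hypothetical.

```python
from itertools import product

TOP = 2                      # top element of the three-element chain 0 < 1 < 2


def imp(a, b):
    """Implication in a linearly ordered Heyting algebra."""
    return TOP if a <= b else b


def neg(a):
    """Negation, defined as implication of the bottom element."""
    return imp(a, 0)


def disj(a, b):
    """Disjunction is the maximum in a chain."""
    return max(a, b)


def valid(formula, n_vars, values):
    """True if `formula` takes the top value under every assignment from `values`."""
    return all(formula(*vs) == TOP for vs in product(values, repeat=n_vars))


def excluded_middle(a):
    return disj(a, neg(a))


def double_negated_excluded_middle(a):
    return neg(neg(excluded_middle(a)))


if __name__ == "__main__":
    print(valid(excluded_middle, 1, (0, TOP)))                    # two-valued check
    print(valid(excluded_middle, 1, (0, 1, TOP)))                 # three-valued check
    print(valid(double_negated_excluded_middle, 1, (0, 1, TOP)))  # three-valued check
```

The middle value 1 is what separates the two formulas: the excluded middle takes the value 1 there, while its double negation takes the top value, in agreement with what the theorem leads one to expect.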


Probably the most famous of the extensions of the Glivenko theorem is that of Gödel [9]. In that paper Gödel showed, as an almost immediate consequence of the Glivenko theorem, that if A is any proposition constructed from the propositional variables by the operations $\land$ and $\neg$ only, then A is an assertion of HK if and only if it is an assertion of HJ. He then extended this result to assertions of the Heyting system of arithmetic [10] for propositions A formed from equations by the operations $\neg$ and $\land$ and universal quantification ($\forall$). Thus any classical assertion A of this sort is also an intuitionistic one; and also, since every proposition of classical arithmetic is equivalent to one so formed, classical arithmetic can be interpreted in intuitionistic arithmetic. It is, however, an error to suppose that classical predicate calculus (HK*) can be interpreted analogously in intuitionistic predicate calculus. This error occurs in [4] Exercise 7C6, p. 349, and I want to correct it here; the error also occurs in Mostowski [15] p. 13. A counterexample is

    $\neg((\forall x)\,\neg\neg A(x) \;.\land.\; \neg(\forall x)\,A(x))$

which is assertible in HK* but not in HJ*. This is also a counterexample for the Glivenko theorem for predicate calculus¹⁴. Gödel's proof makes essential use of the fact that the atomic propositions are equations, and the law of double negation holds for these.

A generalized form of the Glivenko theorem was proved in [6]. This theorem is almost an immediate consequence of Corollary 3.2. Indeed we can generalize it slightly. To show this I shall first prove two additional lemmas.

LEMMA 3. Let LX be an L-system in which the rules of LM and ET hold. Let $\delta$ be an operation such that, for A in LX, $\delta A$ is also in LX, and in LX

    (i) $A \Vdash \delta A$,    (ii) $\neg A \Vdash \delta A$.    (12)

Then, in LX we have

    $X, \delta A \Vdash Y \;\Rightarrow\; X, \neg Y \Vdash Y$.

Proof.
    $X, \delta A \Vdash Y$    by hypothesis.
    $X, A \Vdash Y$    by (12 i), ET.
    $X, A, \neg Y \Vdash F$    by *N.
    $X, \neg Y \Vdash \neg A$    by N*.
    $X, \neg Y \Vdash \delta A$    by (12 ii), ET.
    $X, \neg Y \Vdash Y$    by hypothesis and ET.

¹⁴ That the Glivenko theorem does not hold when quantifiers are involved is well known, and was mentioned in Gödel [9]. Several counterexamples are given in Kleene [12] Theorem 58.

LEMMA 4. Let LX, 6 be as in Lemma 3 except that instead of (12) we have for all A in LX. Then

6A,lA3AIkA

x , 1 YItY-tX,6NltY,

where N is as in Section 1. Proof. If Y is singular, say B, then N is B. The proof is then as follows.

x , 1BIkB XlI-lBIB

by hypothesis. byP*.

6B, i B 2 BIt B X , 6Bll- B

by (14). by ET.

For the multiple case, where Y --= B,, B,, ..., B , ,

N

= B, v B ,

v

... v B, ,

the proof is more complicated, but may nevertheless be accomplished, as follows (where LM means processes valid in LM): X, i Y I/- Y X,i Y ti-N i NII- i Bi X, i Nlt N X , 6NIk N NIt Y X , 6NII- Y

by hypothesis. by V*. by LM. by ET. by singular case. by LM. by ET.

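The hypotheses of Lemmas 3 and 4 are satisfied in LM by

\[ \delta_1 A \equiv A \lor \neg A, \qquad \delta_2 A \equiv \neg A \supset A \,.\supset A, \]

and in LJ by

\[ \delta_3 A \equiv \neg\neg A \supset A. \]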

Thus any such 6A is a sufficient axiom scheme for LD over LM, and for LK over LJ. However, 6,Ak 6 , A is not derivable in LM.

AN INFERENTIAL DEDUCTION THEOREM


THEOREM 4. Let LX$_1$ and LX$_2$ be a pair of L-systems such that the hypotheses of Corollary 3.2 hold. Let LX$_1$ include LM, and let each $\Gamma^{+}$ be of the form

\[ \Vdash \delta A \tag{17} \]

where $\delta$ satisfies the hypotheses of Lemmas 3 and 4. Then a necessary and sufficient condition that (2) hold in LX$_2$ is that

\[ X, \neg Y \Vdash_1 Y. \tag{18} \]

Proof of necessity. In the present situation each $\Gamma^{+}$ is of the form $\delta A$. Hence, if we apply Corollary 3.2, we have (11) in the form

\[ X, \delta A_1, \dots, \delta A_n \Vdash_1 Y. \]

We then have (18) by repeated applications of Lemma 3.

Proof of sufficiency. By Lemma 4 we conclude from (18) that

\[ X, \delta N \Vdash Y \]

holds in LX$_1$ and hence, by (v), in LX$_2$. Since (17) is derivable for all $A$ in LX$_2$, we then have (2) in LX$_2$ by ET.

COROLLARY 4.1. If the $Y$ in (2) is of the form $\neg Z$ or is $F$, then (2) holds in LX$_2$ just when it holds in LX$_1$.

Proof. If (2) holds in LX$_1$ then it holds in LX$_2$ by (v); it therefore suffices to prove the converse. If $Y$ is of the form $\neg Z$, say

\[ Y \equiv \neg C_1, \neg C_2, \dots, \neg C_n, \]

then, from ET and instances of $A, \neg A \Vdash F$, we have

\[ X, C_1, \dots, C_n \Vdash_2 F. \]

This reduces us to the case where $Y$ is $F$. For that case (18) becomes

\[ X, \neg F \Vdash_1 F; \]

and from this, since $\Vdash_1 \neg F$, we have by ET

\[ X \Vdash_1 F. \]

Theorem 4 and Corollary 4.1 give the Glivenko theorem as a special case. For the hypotheses are satisfied when LX$_1$ is LJ and LX$_2$ is LK. If $X$ is void and $Y$ is $B$, the conclusion (18) becomes $\neg B \Vdash B$ in LJ,


and from this we have by *N and N* $\Vdash \neg\neg B$ in LJ.
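The last step can be spelled out as a short derivation. This is only a sketch under stated assumptions: it takes *N to lead from $X \Vdash A$ to $X, \neg A \Vdash F$ and N* to lead from $X, A \Vdash F$ to $X \Vdash \neg A$ (the way these rules are used in the proof of Lemma 3), and it assumes that contraction on the left is available as a structural rule.

```latex
\begin{align*}
\neg B &\Vdash B && \text{(18) with $X$ void and $Y \equiv B$}\\
\neg B,\ \neg B &\Vdash F && \text{by *N}\\
\neg B &\Vdash F && \text{by contraction (assumed structural rule)}\\
&\Vdash \neg\neg B && \text{by N*}
\end{align*}
```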

The following corollary depends on the fact that $F$ is an indeterminate in LM, and consequently all theorems of LM hold in LA if one takes a fixed $C$ and defines $\neg A$ as $A \supset C$. It is stated and proved for the singular case only; it could be stated for the multiple case also, but the extra complexity is hardly interesting. The corollary answers the question of [4] Ex. 5E21, p. 244.

COROLLARY 4.2. If

\[ X \Vdash B \]

holds in LC$_1$, then there is a $C$ such that

\[ X, B \supset C \Vdash B \tag{21} \]

holds in LA$_1$.

Proof. Let LX$_1$ be LA$_1$ and LX$_2$ be LC$_1$. Then the hypotheses of Corollary 3.2 are satisfied with each $\Gamma^{+}$ an instance of Peirce's law. Then we have (11), where each constituent of $W$ is of the form

\[ \delta_k A_i \equiv A_i \supset C_k \,.\supset A_i :\supset A_i \]

for various $A_i$ and various $C_k$. For any fixed $k$ we can define $\neg A$ as $A \supset C_k$ and apply Lemma 3 to replace all constituents $\delta_k A_i$ by $B \supset C_k$. When this is done successively for all $k$ we have

\[ X,\; B \supset C_1,\; B \supset C_2,\; \dots,\; B \supset C_p \Vdash B. \tag{22} \]

Now let

\[ C \equiv C_1 \land C_2 \land \dots \land C_p. \]

Then we have in LA

\[ B \supset C \Vdash B \supset C_k. \]

Thus from (22) we have (21) by ET.

Although the Glivenko theorem does not hold for the predicate calculus, yet Mints and Orevkov [14] showed that it holds subject to an additional restriction, viz., that there be no positive occurrences of universal quantification in the proposition considered. It is easy to obtain an analogous result by the method of [6]; one simply considers three additional cases in the induction, viz., where the last rule is *$\Sigma$, *$\Pi$, or $\Sigma$*; as we shall see presently, an occurrence of $\Pi$* is impossible. The following corollary obtains the result by modifying the proof of Theorems 3 and 4.


COROLLARY 4.3. Let LX$_1$, LX$_2$ be either of the pairs LM*, LD* or LJ*, LK*. Then Theorem 4 holds provided there is no positive occurrence of universal quantification in (2).

Proof. We revert to the proof of Theorem 3. In the present case we have $m = n = 0$, and hence $\Delta$ is a proof of (2) without hypotheses. By ET we can get such a proof within LX$_2$, i.e. without any use of ET for making inferences in the proof. For such a proof the composition (subformula) and separation properties hold. Suppose then, there were an inference by $\Pi$* in $\Delta$. Such an inference would introduce a positive occurrence of universal quantification; by the strong form of the composition theorem,¹⁵ there would be such an occurrence in (2). Since this contradicts the hypothesis, there are no instances of $\Pi$* in $\Delta$.

Now consider the transformation of $\Delta$ to $\Delta'$. The new premises introduced in Section 4 were of the form $\Gamma^{+}$. Each $\Gamma_i^{+}$ is obtained from an $M^{+}$ of the form $\delta A$ by closing with respect to certain variables. The purpose of this closure was to make sure that there be no conflict with a characteristic variable which is eliminated further down in the proof. For the present purpose we introduce $M_i^{+}$, which will thus be of the form $\delta A$, in the place of $\Gamma^{+}$, and we show that we can avoid conflict with characteristic variables by another device.

All initial nodes in $\Delta$ are prime statements of LX$_2$, and in the present case these are also prime statements of LX$_1$. In such a case we take $\Gamma_k'$ to be $\Gamma_k$. This $\Gamma_k'$ is of form (11) with $W$ void. If $\Gamma_k$ is a new initial node of form $M^{+}$, $\Gamma_k'$ will be of form (p1) when $X$ is void and $W$ is $M^{+}$; this is again of form (11).

Suppose now that $\Gamma_k$ is obtained by a rule R. If R does not have a characteristic variable we treat the situation exactly as in Section 4; and we note that if all premises have been replaced by LX$_1$-derivable statements of form (11), then $\Gamma_k'$ will also be such a statement. If R has a characteristic variable, then R must be *$\Sigma$, and there is a unique premise $\Gamma_h$ of the form

[...]

where $c$ does not occur in $X_k$ or $Y_k$. Then $\Gamma_k$ is

[...]

Suppose that $\Gamma_h'$ has been determined to be an LX$_1$-derivable statement of the form

\[ X_k, W_k, C(c) \Vdash_1 Y_k. \]

15 See [4] Exercise 5E19, p. 244.


By Lemmas 3 and 4 we transform this into the statement

[...]

Here $c$ does not occur in $N_k$, since it is not in $Y_k$; we can therefore apply *$\Sigma$ again, which leads to

[...]

This we take to be $\Gamma_k'$, and we note that it is an LX$_1$-derivable statement of form (11).

In this way every $\Gamma_k$ in $\Delta$ is replaced by an LX$_1$-derivable statement $\Gamma_k'$ of form (11). In particular $\Gamma_0$ becomes the statement of form (11) which corresponds to the original (2). From this point on we complete the proof as in Theorem 4.

The theorem of Mints and Orevkov does not give the most general situation where the Glivenko theorem holds. Thus consider the case of

\[ P \Vdash Q, \]

where

\[ P \equiv (\forall x).\, A \lor B(x), \qquad Q \equiv A \lor (\forall x)\, B(x). \]

This is not provable in LJ* ([4] Theorem 7B10), but

\[ P, \neg Q \Vdash Q \]

is provable in LJ*.

Another sort of extension of the Glivenko theorem is that in the Gödel case, viz. where $A$ is an HK-assertion which contains no operations except $\land$ and $\neg$, $A$ is not only HJ-derivable, but even HM-derivable. A proof of this is given in Schmidt [18] Section 131. It can also be proved by the theorems given here. By Gödel's argument we reduce to the case where $A$ is $\neg B$, in which case the Glivenko theorem shows that $\neg B$ is derivable in HJ. Then we can apply Corollary 3.2 to show, as in Corollary 4.2, that if $X$ contains only the indicated operations and

\[ X \Vdash F \tag{23} \]

is LJ-derivable, then there must be a $C$ such that

\[ X, F \supset C \Vdash F \tag{24} \]

is LM-derivable; by considering all possible cases one can show that this is impossible unless (24) holds in LM.


As a final application we consider the following generalization of a theorem due to Belnap and Thomason [2].

THEOREM 5. If (1) holds in LK* and all occurrences of $\supset$, $\neg$, $\forall$ in $\Gamma_1, \dots, \Gamma_n$ are positive, while all occurrences of these same operations in $\Gamma_0$ are negative, then (1) holds in LJ*.¹⁶

Proof. By Theorem 1, if (1) holds in LK*, so also does (3). Let (3) be derived in LK*. If any of the rules P*, N*, or $\Pi$* were used in proving (3), then by the strong composition theorem,¹⁷ there would be a positive occurrence of one of these three operations in (3). But all occurrences of these three operations in (3) are negative. Hence none of the three rules mentioned is used in the proof of (3). Since all the other rules of LK* are also rules of LJ*, it follows that (3), and therefore (1), hold in LJ*.

16 Cf. Lemma 4 in Orevkov [17].
17 See [4] Exercise 5E19, p. 244.

References

1. N. D. BELNAP, LEBLANC and THOMASON, On not strengthening intuitionistic logic, Notre Dame J. Formal Logic 4 (1963) 313-320.
2. N. D. BELNAP and THOMASON, A rule-completeness theorem, Notre Dame J. Formal Logic 4 (1963) 39-43.
3. H. B. CURRY, The elimination theorem when modality is present, J. Symb. Logic 17 (1952) 249-265.
4. H. B. CURRY, Foundations of mathematical logic (New York, McGraw-Hill Book Co., 1963).
5. H. B. CURRY, Remarks on inferential deduction, in: Contributions to logic and methodology in honor of I. M. Bochenski, ed. Anna-Teresa Tymieniecka in collaboration with Charles Parsons (Amsterdam, North-Holland Publ. Co., 1965).
6. H. B. CURRY, The system LD, J. Symb. Logic 17 (1952) 35-42.
7. G. GENTZEN, Untersuchungen über das logische Schliessen, Math. Z. 39 (1934) 176-210, 405-431.
8. M. V. GLIVENKO, Sur quelques points de la logique de M. Brouwer, Acad. Roy. Belg., Bull. Cl. Sci. 15 (1929) 183-188.
9. K. GÖDEL, Zur intuitionistischen Arithmetik und Zahlentheorie, Erg. math. Kolloq. 4 (1933) 34-38.
10. A. HEYTING, Die formalen Regeln der intuitionistischen Logik, S.-B. Preuss. Akad. Wiss., Phys.-Math. Kl. (1930) 42-56.
11. A. JENSEN, Some results concerning a general set-theoretical approach to logic, Math. Scand. 16 (1965) 5-24.
12. S. C. KLEENE, Introduction to metamathematics (Amsterdam, North-Holland Publ. Co., 1952; fourth printing 1964).
13. P. LORENZEN, Metamathematik (Mannheim, Bibliographisches Institut, 1962).


14. G. E. MINTS and E. P. OREVKOV, Obobshchenie teorem V. I. Glivenko i G. Kreisela na odin klass formul ischislenia predikatov (Generalization of theorems of V. I. Glivenko and G. Kreisel to a class of formulas of predicate calculus), Dokl. Akad. Nauk SSSR 152 (1963) 553-554.
15. A. MOSTOWSKI, Thirty years of foundational studies, Acta Philos. Fennica Fasc. 17 (1965) 1-180.
16. K. ONO, On development of formal systems starting from primitive logic, Nagoya Math. J. 28 (1966) 79-83.
17. V. P. OREVKOV, Nekotorye klassy sekventsvi dlya konstruktivnogo ischislenie predikatov (Certain reduction classes and solvable classes of sequents for the constructive predicate calculus), Dokl. Akad. Nauk SSSR 163 (1965) 30-32.
18. H. A. SCHMIDT, Vorlesungen über Aussagenlogik (same as: Mathematische Gesetze der Logik, I) (Berlin-Göttingen-Heidelberg, 1960).

ZUR BERECHENBARKEIT PRIMITIV-REKURSIVER FUNKTIONALE ENDLICHER TYPEN

J. DILLER
Universität München

In 1958 Gödel interpreted elementary number theory in a system of primitive recursive functionals of finite type [2]. The closed ground terms of this system, the part of which used here is given in Section 1, are computable according to work of Kreisel [4], Tait [6] and Howard [3]. Extending an idea of proof due to Schütte [5], Theorem 2.6, this is proved directly in Section 2 by means of bar induction. By using an ordinal function (Section 3) that is related to the Finsler hierarchy ([1], p. 64), this bar induction is replaced in Section 4 by a transfinite induction up to the first $\omega$-critical number. This route therefore does not yield the sharpest result; it seems, however, to be usable elsewhere as well.

1. A formal system C of terms of finite type

Inductive definition of the types:
1. 0 is a type.
2. If $\sigma$ and $\tau$ are types, then $(\sigma)\tau$ is also a type.

Primitive symbols of the system C:
1. Zero 0, the successor symbol $'$ and parentheses (, ).
2. Countably infinitely many variables $u^{\tau}$ of each type $\tau$.
3. Symbols $f, g, \dots$ for functionals of each type.
4. The equality sign =.

Inductive definition of the terms with their types:
1. 0 is a term of type 0.
2. If $t$ is a term of type 0, so is $(t)'$.
3. Each variable $u^{\tau}$ is a term of type $\tau$.
4. If $p$ is a term of type $(\sigma)\tau$ and $q$ a term of type $\sigma$, then $p(q)$ is a term of type $\tau$.
5 (Abstraction). For each term $t[u_1^{\tau_1}, \dots, u_n^{\tau_n}]$ of type 0, in which at most the variables $u_1^{\tau_1}, \dots, u_n^{\tau_n}$ of types $\tau_1, \dots, \tau_n$ occur at the indicated places, there is the functional $f$ to which each equation

\[ f p_1 \dots p_n = t[p_1, \dots, p_n] \tag{A} \]

belongs, in which $p_1, \dots, p_n$ are terms of types $\tau_1, \dots, \tau_n$. The functional $f$ is a term of type $\tau_1 \dots \tau_n 0$.
6 (Recursion). For each term $t[u_1, \dots, u_n]$ as in 5. and each term $s[u, a, u_1, \dots, u_n]$ of type 0, in which at most the variables $u, a, u_1, \dots, u_n$ of types $\tau_1 \dots \tau_n 0,\ 0,\ \tau_1, \dots, \tau_n$ occur at the indicated places, there is the functional $g$ to which each equation

[...]   (R)

belongs, in which $t, p_1, \dots, p_n$ are terms of types $0, \tau_1, \dots, \tau_n$. The functional $g$ is a term of type $0\tau_1 \dots \tau_n 0$.

We omit parentheses in types and terms when this does not lead to ambiguity. Each type is uniquely of the form $\tau_1 \dots \tau_n 0$ with $n \geq 0$, and each term has one of the forms 0, $(t)'$ or $p_0 p_1 \dots p_k$, where $k \geq 0$ and $p_0$ is a variable or a functional.

Finite sequences $t, x, \dots$ of primitive symbols and nominal symbols $*_1, \dots, *_n$ are called nominal forms (Nennformen). $t[p_1, \dots, p_n]$ denotes the string that results from $t$ when the nominal symbols $*_1, \dots, *_n$ are everywhere replaced in $t$ by the terms $p_1, \dots, p_n$.

Formulas are only the equations $s = t$ between terms $s, t$ of type 0. Axioms are all equations (A) and (R). The basic rules of inference are comparativity:

\[ r = s,\ r = t \;\Rightarrow\; s = t \]

and the rule for replacements of type 0:

\[ r = s \;\Rightarrow\; t[r] = t[s]. \]

2. Regular terms

Inductive definition of the numerals: 1. 0 is a numeral. 2. If $z$ is a numeral, so is $z'$.

DEFINITION. The direct subtypes of a type $\tau_1 \dots \tau_n 0$ are the types $\tau_1, \dots, \tau_n$.

DEFINITION of the direct subterms of a term $p$.
1. The term 0 has no direct subterm.


2. A term $(t)'$ has the single direct subterm $t$.
3. If $u$ is a variable, then a term $u p_1 \dots p_k$ has, for $k > 0$, the two direct subterms $u p_1 \dots p_{k-1}$ and $p_k$, and for $k = 0$ no direct subterms.
4. If $f$ is a functional defined by abstraction, to which the equations (A) belong, then a term $f p_1 \dots p_k$ has the direct subterm $t[p_1, \dots, p_k, u_{k+1}, \dots, u_n]$, which is unique up to the naming of the variables $u_{k+1}, \dots, u_n$.
5. If $g$ is a functional defined by recursion, to which the equations (R) belong, then
5.1. a term $g 0 p_1 \dots p_k$ has the single direct subterm $t[p_1, \dots, p_k, u_{k+1}, \dots, u_n]$;
5.2. a term $g z' p_1 \dots p_k$ with a numeral $z$ has the single direct subterm $s[gz, z, p_1, \dots, p_k, u_{k+1}, \dots, u_n]$;
5.3. a term $g t p_1 \dots p_k$, in which $t$ is not a numeral (and possibly not present at all), has infinitely many direct subterms: $t$ and the direct subterm of $g z p_1 \dots p_k$ for each numeral $z$.

DEFINITION. A subterm chain of a term $p$ is a (finite or infinite) sequence $p_0, p_1, \dots$ of terms such that
1. $p_0$ is the term $p$;
2. if $p_i$ has a direct subterm, then $p_{i+1}$ is one of these;
3. if $p_n$ has no direct subterm, then $p_n$ is the last member of the sequence.

DEFINITION. A term $p$ is called regular if every subterm chain of $p$ is finite.

DEFINITION. A ground term $t$ is called computable if $t = z$ is derivable for some numeral $z$.

The proof that every term is regular is analogous to the proof of Theorem 2.6 of [5].

THEOREM 1. If $x[u^{\tau}]$ and $q^{\tau}$ are regular terms, then $x[q^{\tau}]$ is also regular.

We precede the proof of this theorem by an auxiliary lemma.

AUXILIARY LEMMA. Let $x$ be a nominal form $* x_1 \dots x_k$ with $k > 0$ and $q$ a term $u q_1 \dots q_l$ with a variable $u$, or let $x$ not be of the form $* x_1 \dots x_k$ (with $k \geq 0$). Then every direct subterm of $x[q]$ is a term $\tilde{x}[q]$ such that $\tilde{x}[u^{\tau}]$ is a direct subterm of $x[u^{\tau}]$.

Proof. Case 1. Let $x$ not be $g * x_1 \dots x_k$ with a functional $g$ defined by recursion, or let $q$ not be a numeral. Then $x$ alone obviously determines a family of nominal forms $\tilde{x}$ such that $\tilde{x}[q]$ runs through all the direct subterms of $x[q]$.


Case 2. Let $x$ be $g * x_1 \dots x_k$ and let $q$ be a numeral $z$. Then there is a nominal form $\tilde{x}$ such that $\tilde{x}[z]$ is the single direct subterm of $x[z]$ and $\tilde{x}[u^{0}]$ is one of the infinitely many direct subterms of $x[u^{0}]$.

Proof of Theorem 1. We make the induction hypotheses:
(IH1) If $\tilde{x}[u^{\tau}]$ is a direct subterm of $x[u^{\tau}]$, then $\tilde{x}[q^{\tau}]$ is regular.
(IH2) If $\tilde{x}[u^{\sigma}]$ and $\tilde{q}^{\sigma}$ are regular terms whose type $\sigma$ is a direct subtype of $\tau$, then $\tilde{x}[\tilde{q}^{\sigma}]$ is also regular.

Case 1. The hypotheses of the auxiliary lemma are satisfied. Then by the auxiliary lemma and (IH1) all direct subterms of $x[q^{\tau}]$, and therefore $x[q^{\tau}]$ itself, are regular.

Case 2. $x$ is $* x_1 \dots x_k$ and, if $k > 0$, $q$ begins with a functional.
2.1. $k = 0$. Then $x[q]$ is identical with $q$, hence regular.
2.2. $k > 0$ and $q$ begins with a functional. Then by (IH1) the terms $x_k[q]$ and $q x_1[q] \dots x_{k-1}[q]$ are regular, and hence so is the term $q x_1[q] \dots x_{k-1}[q] u_k$ with a variable $u_k$, which has the same subterms. But the type of $u_k$ is a direct subtype of $\tau$, so that by (IH2) $q x_1[q] \dots x_k[q]$, which is $x[q]$, is regular as well.

By bar induction on the length of the subterm chains one first eliminates (IH1), and then (IH2) by induction on the type. This proves Theorem 1.

THEOREM 2. Every term $p$ is regular.

Proof by induction on the definition of $p$.
Case 1. For terms 0 and $u^{\tau}$ and, if $t$ or $t[u_1, \dots, u_n]$ respectively is regular, also for $(t)'$ or the functional $f$ defined by abstraction respectively, the claim is trivial.
Case 2. If the terms $p$ and $q$ are regular, then so are $pu$ and $q$; hence by Theorem 1 $pq$ is regular too.
Case 3. Let $p$ be a functional $g$ defined by recursion, and let the defining terms $t[u_1, \dots, u_n]$ and $s[u, a, u_1, \dots, u_n]$ be regular.
3.1. Then $g0$ and, by Theorem 1, $s[g0, 0, u_1, \dots, u_n]$ are regular as well.
3.2. If $s[gz, z, u_1, \dots, u_n]$ is regular, then so are $gz'$ and, by Theorem 1, $s[gz', z', u_1, \dots, u_n]$.
By induction on $z$ it follows from 3.1 and 3.2 that every direct subterm of $g$, and hence $g$ itself, is regular. This completes the proof of Theorem 2.

THEOREM 3. Every closed term $t$ of type 0 is computable.

Proof. Since every direct subterm of a closed term of type 0 is again a closed term of type 0, the subterm tree of $t$ consists only of closed terms of type 0.


Case 1. $t$ has no direct subterm. Then $t$ is the term 0, hence computable. (For the reflexivity of equality follows with (A) from comparativity.)
Case 2. $t$ has exactly one direct subterm $s$. Then either $t$ is $s'$, and with $s = z$ also $t = z'$ is derivable, or $t = s$ is an axiom, and from $s = z$ follows $t = z$.
Case 3. $t$ has infinitely many direct subterms. Then $t$ is a term $g s p_1 \dots p_n$. From $s = z$ follows $t = g z p_1 \dots p_n$, and $g z p_1 \dots p_n$ is, by an axiom (R), equal to a direct subterm of $t$. If this is equal to a numeral $\bar{z}$, then $t = \bar{z}$ is derivable as well.
Hence $t$ is computable whenever its direct subterms are, and by bar induction it follows with Theorem 2 that $t$ is computable.

The proof of Theorem 3 shows that the definition of the direct subterms yields, for every closed ground term, a computation procedure that proceeds from the outside inward, except that in the case of a term $g t p_1 \dots p_n$ it first evaluates the recursion term $t$. By Theorem 2 this standard procedure terminates after finitely many steps.
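The recursion scheme behind clauses 5.1 and 5.2 of the definition of direct subterms can be sketched in Python. This is only an illustration under stated assumptions: the equations (R) are read off from those two clauses as $g0p_1 \dots p_n = t[p_1, \dots, p_n]$ and $gz'p_1 \dots p_n = s[gz, z, p_1, \dots, p_n]$, numerals are modelled by non-negative integers, and the names `rec`, `add`, `it` and `ack` are hypothetical. Python evaluates arguments eagerly, so the sketch illustrates the equations and not the outside-in standard procedure itself.

```python
def rec(t, s):
    """Return g with g(0, *p) = t(*p) and g(z+1, *p) = s(g_z, z, *p),
    where g_z is the partial application q -> g(z, *q) (assumed reading of (R))."""
    def g(z, *p):
        if z == 0:
            return t(*p)
        g_z = lambda *q: g(z - 1, *q)   # the functional "g z" passed to s
        return s(g_z, z - 1, *p)
    return g

# Addition by recursion on the first argument: add(z+1, p) = add(z, p) + 1.
add = rec(lambda p: p, lambda u, a, p: u(p) + 1)

# Iteration with a parameter v of type (0)0: it(0, v) = v(1), it(n+1, v) = v(it(n, v)).
it = rec(lambda v: v(1), lambda u, a, v: v(u(v)))

# An Ackermann-style function: ack(0, p) = p + 1, ack(z+1, p) = it(p, ack(z, .)).
ack = rec(lambda p: p + 1, lambda u, a, p: it(p, u))

print(add(3, 4), ack(2, 3))  # 7 9
```

The `ack` example uses a parameter of higher type, which is the feature that lets recursion at finite types go beyond ordinary primitive recursion on numbers.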


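3. A hierarchy of arithmetical operations

In order to replace the bar induction used here by a transfinite induction, we consider the three-place ordinal function $\sigma$ defined as follows by nested transfinite recursion.

(1) $\sigma\alpha\beta 0 = \beta'$.
(2.1) $\sigma 0\beta\gamma' = (\sigma 0\beta\gamma)'$.
(2.2) $\sigma\alpha'\beta\gamma' = \sigma\alpha(\sigma\alpha'\beta\gamma)(\sigma\alpha'\beta\gamma)$.
(2.3) $\sigma\lambda\beta\gamma' = \sup_{\alpha<\lambda} \sigma\alpha(\sigma\lambda\beta\gamma)(\sigma\lambda\beta\gamma)$, if $\lambda$ is a limit number.
(3) $\sigma\alpha\beta\lambda = \sup_{\gamma<\lambda} \sigma\alpha\beta\gamma$, if $\lambda$ is a limit number.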
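For finite arguments only clauses (1), (2.1) and (2.2) apply, so the finite part of $\sigma$ can be computed directly. The following is a minimal sketch, assuming ordinals below $\omega$ are represented by Python integers; the name `sigma` is hypothetical, and the limit clauses (2.3) and (3) are deliberately left out.

```python
def sigma(a, b, c):
    """Finite part of the ordinal function: all arguments are natural numbers."""
    if c == 0:
        return b + 1                    # clause (1): successor of b
    if a == 0:
        return sigma(0, b, c - 1) + 1   # clause (2.1)
    x = sigma(a, b, c - 1)
    return sigma(a - 1, x, x)           # clause (2.2)

print(sigma(0, 3, 4))  # 8
print(sigma(1, 0, 3))  # 15
print(sigma(2, 0, 2))  # 223
```

The values grow very quickly in the first argument, so only small inputs are practical with this naive recursion.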

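LEMMA 1. $\sigma$ is normal in the last argument.

Proof. Because of (3), $\sigma\alpha$ is normal in the last argument if $\sigma\alpha\beta\gamma$ [...]

\[ < \sigma\alpha(\sigma\alpha'\beta\gamma)0 < \sigma\alpha(\sigma\alpha'\beta\gamma)(\sigma\alpha'\beta\gamma) = \sigma\alpha'\beta\gamma'. \]

3. If all $\sigma\alpha$ with $\alpha$ smaller than a limit number $\lambda$ are normal in the last argument, then because of (1) and (2.3)

\[ \sigma\lambda\beta\gamma < \sup_{\alpha<\lambda} \sigma\alpha(\sigma\lambda\beta\gamma)0 < \sup_{\alpha<\lambda} \sigma\alpha(\sigma\lambda\beta\gamma)(\sigma\lambda\beta\gamma) = \sigma\lambda\beta\gamma'. \]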

LEMMA 2. a ist monoton im vorletzten Argument. Beweis. 1st p < p, so ist wegen (1) aapO < cap0 und, falls fur ein bestimmtes a und fur alle y, die kleiner sind als eine Limeszahl I , stets aapy Q aapy gilt, so ist fur dieses a wegen (3) auch aapAQaapI. Also ist aa monoton im vorletzten Argument, wenn aus aapy Qaapy stets aapy‘
3. Sei oa monoton im vorletzten Argument fur alle a, die kleiner sind als eine Limeszahl I , und sei aAgy Q oIpy. Dann folgt wegen (2.3) mit Lemma 1 wie eben aIBy‘daIpy‘. LEMMA3. a ist monoton im ersten Argument. Wir beweisen nacheinander durch Induktion nach y : 1. Fur alle p, y ist oOpyQo1py. 2. 1st aapydaa’py fur alle p, y , so ist auch cra‘py
< (aIPy)‘ = ao(0lpy) 0 < O o ( 0 l p y ) (alpy) = 0lPy‘.

Zu 2. 1st oa’/3yQaa”Py, dann ist wegen (2.2) nach Lemma 1 und 2 aa’py’ = a‘%(aa’py) (oa’py) Q om (aa”Py) (ad’py)

< aa’ (0a”py) (0a”Py) = aa”py’ .

Zu 3. 1st zu den Voraussetzungen von 3. fur ein y und alle a
d aa’py’

= rJa(acr‘py) (oa‘py) Q aa (.I&)

(.@y)

d aIpy’ .


Zu 4. 1st zu den Voraussetzungen von 4. fur ein y noch al/3ydaA‘py, so ist wegen (2.2) und (2.3) nach Lemma 1 und 2 o@y’ = sup aa(aApy) (aqly) a < 2.

< sup oa(oiL’py) (on’py) c1 < 1

d oiL(aA’py) (FiL’py) = aiL’Py’ .

Damit ist Lemma 3 durch Induktion nach dem ersten und Nebeninduktion nach dem letzten Argument bewiesen. Wir untersuchen die Funktion a fur einige spezielle Argumente.

LEMMA 4. For finite $\alpha, \beta, \gamma$, $\sigma\alpha\beta\gamma$ is also finite.
The proof is by (complete) induction on $\alpha$.

LEMMA 5. We always have $\sigma 0\beta\gamma = \beta' + \gamma$.
This follows by induction on $\gamma$.

LEMMA 6. If $\beta$ or $\gamma$ is transfinite, then $\sigma 1\beta\gamma = \beta' \cdot 2^{\gamma} \geq \omega$.
For if $\beta < \omega$, then $\sigma 1\beta\omega = \omega$ by Lemmas 1 and 4. The rest follows by induction on $\gamma$ with Lemma 5.

LEMMA 7. We always have $\sigma\alpha'\beta 1 < \sigma\alpha''\beta 1$; hence $\sigma\alpha\beta 1 > \alpha$ for all [...]
< aa(oa’p‘p) (oa’p’p) = oa’p’p’

= oa”P1

.

Der zweite Teil folgt unmittelbar durch lnduktion nach a. DEFINITION. Eine n-stellige Ordinalzahlfunktion cp mit n > 1 nennen wir eine n-stellige arithmetische Operation, wenn sie folgende Eigenschaften hat : 1. cp ist monoton in jedem Argument, wenn die beiden letzten Argumente > I sind; 2. cp ist normal im letzten Argument, wenn das vorletzte Argument > 1 ist; 3. es ist stets cpa, ...L Y ~ . -3~ ~~ , , - ~ . Eine Ordinalzahl 5 heiI3t HauptzahZ von cp, wenn fur ein a < c aus a g a i < c (fur l 2 ist fur jedes a mit cp auch die ( n - 1)-stellige Funktion ‘pa eine arithmetische Operation; man kann dann cp als eine Hierarchie ( n - 1)stelliger arithmetischer Operationen auffassen. Jede arithmetische Operation besitzt Hauptzahlen, und zwar gilt genauer (vgl. [l], Sektionen 13 und 15): LEMMA8 . Sei cp eine n-stellige arithmetische Operation. Die kleinste Hauptzahl von cp, die groger als eine gegebene Zahl t o> 1 ist, ist sup ( k mit k
&+l

=cp<

k...
Das Supremum einer Menge von Hauptzahlen von cp ist


wieder eine Hauptzahl von cp. Ist n > 2 und Z < ct, so ist jede Hauptzahl > 2 von cpci auch Hauptzahl von cpZ.

Nach Lemma 1 bis 3 sind o und oci fur jedes u arithmetische Operationen. Wir wollen deren Hauptzahlen bestimmen. DEFINITION. Sei M eine (beschrankte) Menge von Ordinalzahlen. Die Ordnungsfunktion der allen arithmetischen Operationen act rnit R E M gemeinsamen Hauptzahlen, die groDer sind als eine Zahl /?, sei rnit y"(p) bezeichnet. yM(p) y ist also die y-te Zahl der Klasse Es gilt offenbar

(*I (**I

{{ 15 > 0 A A CIEM A p

< 5 : oap5 = 5 ) .

(8) y' = q("' (y'"' ( p ) y) 0 und nach Lemma 8 q(")( p ) A = sup y"' ( p ) y fur jede Limeszahl A. Y
Aus (**) und Lemma 8 folgt leicht: y("'(/3) y ist normal in y und monoton in ct und p ; q ist also wieder eine arithmetische Operation.

+

LEMMA 9. Es ist stets y(") (oct'py) 0 = oa'fi(y 0). Beweis. Wir setzen tk=oa'p(y+k). Dann ist <,=oct'py und { k + l = aa'p(y+k+ 1)=act& wegen (2.2). Nach Lemma 8 ist also q(")(oct'py)O= sup (,=aa'P(y+o) wegen (3). k<m

LEMMA 10. Es ist stets y(")(P) y=oa'/?(w(l +y)). Beweis durch Induktion nach y. 1. Fur y = O ist Lemma 9 rnit y=O die Behauptung wegen (1). 2. Sei ua'fl(w(l+y))=y("'(,4) y. Nach Lemma 9 folgt rnit (*) ou'p(w(l+ y'))=oa'p(w(l+ y)+w)=q'"'(y'"'(p)y)o=q'"'py'. 3. Sei c~ct'p(w(l+ y))=q'"(P)y fur alle y, die kleiner sind als eine Limeszahl A. Nach Lemma 1 und (**) folgt oa'p(co(1 +A))=sup oct'p(o(l+y)) =y'"'(P) A. Y% Beweis durch Induktion nach y. 1 . Nach Definition ist y ' " ' ( ~ ) O ~ q ( a ' a ~ A ' ( fur ~ ) Oalle &
1.1. Sei p t w . Nach Lemma 7 ist owpl b w , und nach Lemma 4 ist wegen


(2.3) und (I) awP1 =sup aa/3'a',(w. Andererseits ist nach Lemma 4 (und 5)

a<w)

(PN.

a<w

1.2. Sei P 2 w oder I > w . Wegen (2.3) und (1) ist aA/?l=sup aaP'/?'< 

aa"p"p"a

aC.4

aa'"a'p"P')B~"'pw, falls a oder /3 transfinit ist, weil aus fi>w nach Lemma I und aus a 3 w nach Lemma 1 und 7 rmf/7'P'2wfolgt. Nach Lemma 9 ist aa'pw=q(')(/?)O, und es folgt o I p l >sup q ( " ) ( P ) 0. a
2. Sei ~ 1 ~ ( 1 + ~ ) = ~ ~ " ~ < ~ ' Dann ( ~ ) ~ist = 5wegen . (2.3) o@(l+y')= sup cat< = sup aa'51, weil oat< < cra'
a
daher mit (*) a@(l +y')=q("Ia<'} (5) ~ = ~ f a l a < A(P>Y'. ) 3. Sei fur alle y, die kleiner sind als eine Limeszahl A,, oAP(l+ y ) = 11'01 a < A 1 ( p ) y. Wegen (3) und (**) ist dann a I f i ( l + I 1 ) = q ( a x a c(P> n ) 4.

'

DEFINITION der a-kritischen und der stark kritischen Zahlen. Die Ordnungsfunktion der a-kritischen Zahlen sei rnit da)bezeichnet. 1. Die Zahlen 5 rnit w5= heiDen 0-kririsch. 2. Die Zahlen 5 rnit da)5= 5 heiRen a'-kritisch. 3. 1st I eine Limeszahl, so heil3en die fur jedes a ; :& 1st die kleinste Zahl rnit d")P>fl, so ist demnach E$)) y = ~ ( " ) ( P + y ) .

<

S A T z 4. Die Funktion G ist eine dreistellige arithmetische Operation. Die Hauptzahlen der Funktionen aa werden wie folgt von q(') ( 0 ) aufgezahlt: 1. p ( o > y=w'+y. 2. 1st a< w, so is? q'"') ( 0 ) 0 = o und q(") ( 0 ) (1 + y) = d a ) y . 3. 1st a>w, so ist U ] ( ' ) ( O ) y = d a ' ) y . 4. fst ieine Limeszahl, so ist an@(1 + y ) =E$\ y, auJer falls 1=w und p <w ist. In diesem Fall ist o o p l = w und awp(2 + y ) = d w ) y . Beweis von 1. bis 4. durch Induktion nach a (bzw. A). 1) 1. folgt aus Lemma 5. Deshalb ist nach Lemma 4 (und 8) auch ~ ( ' 1 ( 0 ) 0 = w fur alle a < w. Ferner folgt 2. fur a = 0 aus Lemma 6. 2) Fur ein a>O gelte nun y ~ ( " ) ( w )y=d")y fur alle y, wobei E=a' ist, falls a 2 w ist, und LY= Cr' ist, falls a < w ist.


1st y > o Hauptzahl von oa’, so ist y nach Lemma 8 und der Induktionsvoraussetzung eine E-kritische Zahl. Daher ist y > o nach Lemma 10 genau dann Hauptzahl von ad, wenn aus p < y stets aa’py = oa’p (0(1

+ y)) =

?/(a)

~72)

(8) y

= &$)) y = y

folgt. Wegen y 2 dCOy2 y ist hiernach y > o genau dann Hauptzahl von aa’, wenn d ’ ) y = y eine 2-kritische Zahl ist. 3) Sei l eine Limeszahl, und fur alle a
4. Replacement of bar induction by transfinite induction

In order to carry out the proof of Theorem 1 with transfinite induction instead of bar induction, we use the function $\sigma$ considered in Section 3.

DEFINITION of a rank $\alpha p$ for each term $p$.

\[ \alpha p = \sup \{\, \alpha p_1 + 1 \mid p_1 \text{ is a direct subterm of } p \,\}. \]

Since every term is regular, an ordinal is thereby assigned to each term.

DEFINITION of the degree $G\tau$ of a type $\tau$.
1. $G0 = 0$,
2. $G(\sigma)\tau = \max\{G\sigma + 1,\ G\tau\}$.
This definition is equivalent to:

\[ G\tau = \max \{\, G\tau_i + 1 \mid \tau_i \text{ is a direct subtype of } \tau \,\}. \]
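The two descriptions of the degree can be compared on small types. The sketch below assumes a type $\tau_1 \dots \tau_n 0$ is represented by the tuple of its direct subtypes, so the type 0 is the empty tuple; the names `degree`, `arrow` and `O` are hypothetical, and the maximum over the empty set is taken to be 0.

```python
def degree(tau):
    """Degree of a type given as the tuple of its direct subtypes."""
    return max((degree(t) + 1 for t in tau), default=0)

def arrow(sigma, tau):
    """The type (sigma)tau: sigma becomes a new first direct subtype of tau."""
    return (sigma,) + tau

O = ()                                   # the ground type 0
N1 = arrow(O, O)                         # (0)0
N2 = arrow(N1, O)                        # ((0)0)0
print(degree(O), degree(N1), degree(N2))  # 0 1 2
```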

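THEOREM 1'. For terms $q$ of type $\tau$ and $x[u^{\tau}]$ we always have:

\[ \alpha x[q] \leq \sigma(G\tau)(\alpha q)(\alpha x[u^{\tau}]). \]

Proof by induction first on $G\tau$ and, within that, on $\alpha x[u^{\tau}]$.
Case 1. Let $x$ be $* x_1 \dots x_k$ with $k > 0$ and let $q$ be $v q_1 \dots q_l$, or let $x$ not begin with $*$. Then by the auxiliary lemma of Section 2 there is a family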


of nominal forms $\tilde{x}$ such that $\tilde{x}[q]$ runs through all direct subterms of $x[q]$ and all $\tilde{x}[u^{\tau}]$ are direct subterms of $x[u^{\tau}]$. We have $\alpha x[q] = \sup\{\alpha \tilde{x}[q] + 1\}$. Hence the claim holds in this case.
Case 2. Let $x$ be $* x_1 \dots x_k$ and, if $k > 0$, let $q$ begin with a functional.
2.1. If $k = 0$, then $x[q]$ is the term $q$, and $\alpha x[q] < \sigma(G\tau)(\alpha q)0$.
2.2. $k > 0$ and $q$ begins with a functional. Then by the induction hypothesis, since $\alpha u^{\tau} x_1[u^{\tau}] \dots x_{k-1}[u^{\tau}] < \alpha x[u^{\tau}]$:

...x k - 1 [ q ]

217k

= %(4x1

[ql '.' x k -

1 [ql

d o(Gz) (aq) (au'x1 [u'] . . . x h - 1 [u']) d CT(GT) (aq) (ax [u'] - 1). Nach derselben Induktionsvoraussetzung ist Xxk

[q] d a(Gz) (aq) ( E X L [u']) d o(Gz) (aq)(ax [u'] - 1).

(ax [a'] ist Nachfolgerzahl, weil x [u'] mit

21' beginnt.) Nach der Induktionsvoraussetzung fur das 1. Argument ist, weil der Typ zk von xh[u'] direkter Subtyp von z,mithin Gzk
a x [ q ] 6 CT(GZL)(Clfik[q])(MYxl[q]...xk-l

[q]

Urk)'

D a CT monoton in allen drei Argumenten ist, kann man hierin die obigen Abschatzungen einsetzen und erhalt UX

[ q ] d o(Gz - 1) ( ~ ( G T () a q ) (ax [u'] - 1)) (~(GX) (aq) (EX [u'] - 1)) = o(Gz) (aq) (ax [u']).

Hiermit ist Satz 1' bewiesen. SATZ 2'. Fur jeden Term p ist ap

2. 1st a t < o Y und t vom Typ a, so ist auch a(t’)=at+ 1 < o y ( y > O ) . 3. Sei p ein Term vom Typ (7) 2, q vom Typ z, und ap und ccq seien kleiner Dann ist auch apqdo(Gz)(aq)(apu‘) als die y-te Hauptzahl von a(&). kleiner als die y-te Hauptzahl von a(Gz). 4. Istfein durch Abstraktion aus einem Term t mit a t < o Y definiertes Funktional, so ist af=at+ 1 <w”(y>O). 5. Sei g ein durch Rekursion aus Termen t und 5 [ u , a] definiertes Funktional vom Typ 07, ...zno mit Gzl ...z,o=k, und seien at und asru, u] kleiner als die y-te Hauptzahl q von ak. Dann ist agO=at+ 1
References

1. H. BACHMANN, Transfinite Zahlen, Ergeb. Math. (Berlin-Göttingen-Heidelberg, 1955) Heft 1.
2. K. GÖDEL, Über eine bisher noch nicht benützte Erweiterung des finiten Standpunktes, Dialectica 12 (1958) 280-287.
3. W. HOWARD, Letter to K. Schütte; not yet published.
4. G. KREISEL, Inessential extensions of Heyting's arithmetic by means of functionals of finite type, J. Symb. Logic 24 (1959) 284.
5. K. SCHÜTTE, Syntactical and semantical properties of simple type theory, J. Symb. Logic 25 (1960) 305-326.
6. W. W. TAIT, Infinitely long terms of transfinite type, in: Formal systems and recursive functions, eds. J. N. Crossley and M. A. E. Dummett (Amsterdam, North-Holland Publ. Co., 1965) 176-185.

EQUATIONAL MAPS

W. FELSCHER
Universität Freiburg

Let $\mathbf{B}$ and $\mathbf{C}$ be classes of non-empty abstract algebras and let $\phi$ be a map from $\mathbf{B}$ into $\mathbf{C}$ such that, for every $B \in \mathbf{B}$, the sets underlying $B$ and $\phi(B)$ are the same. $\phi$ is called equational if there exist terms, formed with the operations of $\mathbf{B}$, such that, for every $B \in \mathbf{B}$, the operations of $\phi(B)$ are the same as the operations induced in $B$ by these $\mathbf{B}$-terms. If, moreover, $\phi$ is bijective from $\mathbf{B}$ onto $\mathbf{C}$ and the inverse $\phi^{-1}$ is equational from $\mathbf{C}$ into $\mathbf{B}$, then $\phi$ is called an equational equivalence. Equational equivalences were introduced in Malcev [13] under the name of rational equivalences; equational maps occur in Lawvere [11] and Linton [12] in the form of certain functors. Examples of equational maps abound: the map converting every associative ring into a Lie ring is equational; the numerous possibilities to define Boolean algebras with varying kinds of operations give rise to equational equivalences, and some highly non-trivial examples of equational equivalences were exhibited by Csákány [3, 4, 5].

The aim of the present article is the general study of equational maps and the investigation of how certain properties of a class $\mathbf{B}$ are reflected as properties of its image under an equational map. Also, some generalizations to relational systems and applications to syntactical transformations will be considered.

The first three sections of this paper are preparatory. In Section 1 a general review of free algebras is given. All the facts stated are well known, although some of the proofs indicated may be new. Section 2 contains a detailed study of algebraic (or polynomial) operations on an algebra, useful later on in Section 5. The procedure followed here is influenced partly by Schmidt [16]. However, the applications intended make it necessary to delve somewhat deeper into the properties of $n$-ary operations induced by $m$-ary terms where $n < m$. The short Section 3 assembles some simple facts about reducts.
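The Lie-ring example can be made concrete with a small sketch. It assumes the associative ring of $2 \times 2$ integer matrices, stored as 4-tuples in row-major order, and takes the new operation to be the term $xy + (-(yx))$ built from the ring operations; the function names and the sample matrices are hypothetical and serve only as an illustration.

```python
from itertools import product

def add(x, y):
    return tuple(a + b for a, b in zip(x, y))

def neg(x):
    return tuple(-a for a in x)

def mul(x, y):
    # 2x2 matrix product; x = (a, b, c, d) stands for [[a, b], [c, d]]
    a, b, c, d = x
    e, f, g, h = y
    return (a * e + b * g, a * f + b * h, c * e + d * g, c * f + d * h)

def bracket(x, y):
    # Derived operation given by the ring term x*y + (-(y*x))
    return add(mul(x, y), neg(mul(y, x)))

def jacobi(x, y, z):
    # [x,[y,z]] + [y,[z,x]] + [z,[x,y]], which should vanish in a Lie ring
    return add(add(bracket(x, bracket(y, z)), bracket(y, bracket(z, x))),
               bracket(z, bracket(x, y)))

ZERO = (0, 0, 0, 0)
SAMPLES = [(1, 2, 3, 4), (0, 1, 0, 0), (2, 0, -1, 5)]
print(all(jacobi(x, y, z) == ZERO for x, y, z in product(SAMPLES, repeat=3)))  # True
```

The underlying set is unchanged and the new operation is induced by a term in the old operations, which is the defining feature of an equational map.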
The fourth section begins with a precise definition of equational and functorial maps. The latter are those maps from a class $\mathbf{B}$ into a class $\mathbf{C}$ which, in addition to preserving underlying sets, also have the following property: for any two algebras $B$ and $G$ of $\mathbf{B}$, the homomorphisms from $B$ into $G$ are the same as the homomorphisms from $\Phi(B)$ into $\Phi(G)$. Every equational map is functorial. Conversely, if $\mathbf{B}$ contains free algebras with sufficiently many generators, then every functorial map is equational (Theorem 1). In particular, this yields a theorem of Malcev [13], stating that functorial equivalences between quasi-primitive classes are equational.

In Section 5 the construction of equational maps from given sequences of terms is studied. It follows that every equational map $\Phi$ from $\mathbf{B}$ into $\mathbf{C}$ can be extended to an equational map, defined by the same terms as $\Phi$, from $E_s(\mathbf{B})$ into $E_s(\mathbf{C})$, where $E_s(\mathbf{B})$ is the strict equational closure of $\mathbf{B}$ (Theorem 3). If $\mathbf{B}$ is strictly equational (i.e. $\mathbf{B} = E_s(\mathbf{B})$) and $\Phi$ is an equational equivalence from $\mathbf{B}$ onto $\mathbf{C}$, then $\mathbf{C}$ is strictly equational (Corollary 1).

Let $\Phi$ be an equational map from $\mathbf{B}$ into $\mathbf{C}$. It follows from Lemma 9 that a function $g$, called reductive, can be found which transforms every term in operations of $\mathbf{C}$ into a term in operations of $\mathbf{B}$; moreover, $g$ is recursive in the sense that it is a homomorphism between certain algebras of terms. Now let $\Phi$ be an equational equivalence from $\mathbf{B}$ onto $\mathbf{C}$ and let $\mathbf{B}$ be defined by a set $M$ of equations; let the transformation $g$ be determined with respect to $\Phi^{-1}$. Then $\mathbf{C}$ can be defined by the equations which arise from $M$ under $g$, together with another set of equations saying that $g$ transforms the $\mathbf{B}$-terms defining $\Phi$ into terms corresponding to the basic operations of $\mathbf{C}$ (Theorem 4).

Finally, it is shown that every class $\mathbf{B}$ is equationally equivalent to a class $\mathbf{C}$ such that the arities of the basic operations of $\mathbf{C}$ are cardinal numbers and are minimal for this situation (Theorem 5).
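The idea of a reductive function can be sketched for one familiar case. The Python fragment below is a hedged illustration, not the construction of Lemma 9: it assumes that $\mathbf{B}$ consists of Boolean algebras with operations `or`, `and`, `not`, that $\mathbf{C}$ consists of the corresponding Boolean rings with `plus` and `times`, and that $\Phi$ is defined by the standard terms $x + y := (x \wedge \neg y) \vee (\neg x \wedge y)$ and $x \cdot y := x \wedge y$. The names `reduce_term`, `eval_ring`, `eval_boolean` and the tuple encoding of terms are hypothetical.

```python
from itertools import product

def reduce_term(term):
    """Translate a ring term into a Boolean-algebra term, recursively.

    Terms are nested tuples: ("var", name), ("plus", s, t), ("times", s, t).
    The recursion mirrors the fact that the translation is a homomorphism
    between term algebras.
    """
    if term[0] == "var":
        return term
    op, left, right = term
    left, right = reduce_term(left), reduce_term(right)
    if op == "times":
        return ("and", left, right)
    if op == "plus":
        return ("or", ("and", left, ("not", right)), ("and", ("not", left), right))
    raise ValueError(f"unknown ring operation: {op}")

def eval_ring(term, env):
    """Evaluate a ring term in the two-element Boolean ring (arithmetic mod 2)."""
    if term[0] == "var":
        return env[term[1]]
    a, b = eval_ring(term[1], env), eval_ring(term[2], env)
    return (a + b) % 2 if term[0] == "plus" else a * b

def eval_boolean(term, env):
    """Evaluate a Boolean-algebra term in the two-element Boolean algebra."""
    if term[0] == "var":
        return env[term[1]]
    if term[0] == "not":
        return 1 - eval_boolean(term[1], env)
    a, b = eval_boolean(term[1], env), eval_boolean(term[2], env)
    return max(a, b) if term[0] == "or" else min(a, b)

def variables(term):
    """Set of variable names occurring in a term."""
    if term[0] == "var":
        return {term[1]}
    return set().union(*(variables(sub) for sub in term[1:]))

def agree_everywhere(ring_term):
    """Check that a ring term and its translation induce the same operation."""
    names = sorted(variables(ring_term))
    reduced = reduce_term(ring_term)
    return all(
        eval_ring(ring_term, dict(zip(names, values)))
        == eval_boolean(reduced, dict(zip(names, values)))
        for values in product((0, 1), repeat=len(names))
    )
```

The check is carried out only on the two-element algebra; it illustrates the intended behaviour of the translation and does not replace the general argument.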
The sixth section begins with an outline of the relationship between certain infinitary languages and their models, namely relational systems with not necessarily finitary relations. The only non-trivial statement here concerns the existence of general substitutions; the rather technical proof has been placed in an appendix. Imitating the structure of the definitions in Section 4, it is obvious how to introduce definable maps $\Phi$ between classes $\mathbf{B}$ and $\mathbf{C}$ of models; if only open formulas are admitted, open maps may also be considered. A closer inspection of Section 5 then shows that the principal definitions and theorems there can be translated almost verbatim to this new situation. In this connection, it is the author's pleasure to acknowledge his indebtedness to H. J. Hoehnke. In July 1966, the author had occasion to discuss an earlier draft of the first five paragraphs with Hoehnke who, in [8], had studied first-order definable equivalences ('Strukturäquivalenzen'). Hoehnke then observed that the extension theorem (Theorem 3) could be generalized to give an extension theorem for these definable equivalences, and he provided a proof along the lines of his article [8]. It was only then that it became clear to the author that the arrangement of Section 5 was already such that the concepts 'algebra' and 'equation' only had to be replaced by 'model' and 'bi-implication' in order to handle the more general case.

There is, finally, a second part of Section 6 which deals with syntactical transformations. These are nothing but reductive functions belonging to certain equational equivalences, namely equivalences varying the structure of the class of full cylindric set algebras. The result then is that syntactical transformations preserve axiomatizations: if syntactical transformations between languages $PL^1$, $PL^2$ are given, then an axiomatization of either the tautologies or the consequence operator of $PL^1$ is, in a natural way, transformed into a corresponding axiomatization for $PL^2$, and if the first axiomatization is finitary then so is the second.

1. A review of free algebras

In the following, the usual notions of set theory will be employed. If $X$ is a set, then $\mathfrak{P}X$ shall be the set of all subsets of $X$, and if $X$, $Y$ are sets then $X^Y$ shall be the set of all functions from $Y$ into $X$. A function $f$ with domain $Y$ will also be written as a sequence $\langle f(y) \mid y \in Y \rangle$. If $f \in X^Y$ and $Z \subseteq Y$ then $f \restriction Z$ shall be the restriction of $f$ to $Z$, and $f^*(Z)$ shall be the image of $Z$ under $f$. Ordinal numbers will be construed as sets such that every ordinal number $n$ is the set of all ordinal numbers $m$ smaller than $n$. In particular, the empty set $0$ is the smallest ordinal number, and $\omega$ is the smallest infinite ordinal number. Cardinal numbers will be identified with initial numbers, i.e. ordinal numbers that are not equipotent with any smaller ordinal number. If $E$ is a set then $\mathrm{card}(E)$ shall be the cardinal number of $E$. A cardinal number $n$ is regular if it cannot be decomposed into a union of fewer than $n$ sets of cardinality less than $n$.

If $E$ and $n$ are sets, an operation of arity $n$ on $E$ is a function from $E^n$ into $E$. A type $\Delta$ is a sequence $\langle n_i \mid i \in I \rangle$ of sets where $I$ is a non-empty set. A type is said to contain constants if some of the sets $n_i$ are empty. One defines $\mathrm{rank}(\Delta)$ as the smallest infinite cardinal number $m$ such that $\mathrm{card}(n_i) \leq m$ for $i \in I$; one defines $\mathrm{dim}(\Delta)$ as the smallest infinite regular cardinal number $m$ such that $\mathrm{card}(n_i) < m$ for $i \in I$. A type is finitary if every $n_i$ is finite; in that case $\mathrm{dim}(\Delta) = \mathrm{rank}(\Delta) = \omega$. In general, one has $\mathrm{rank}(\Delta)$,

of arity $n_i$ on $E$; the underlying set $E$ is called the carrier of $A$ and is denoted by $u(A)$. An algebra is called finitary if its type is finitary. An algebra is called empty if its carrier is empty; it is called singular if its carrier contains at most one element. A class of algebras is called singular if each of its elements is singular. All classes of algebras considered in the sequel are assumed to be not empty. If finitary algebras only are considered, none of the proofs given in this article will use the axiom of choice (AC).

From now on, in this paragraph algebras of a fixed type $\Delta$ will be considered. The definition of homomorphism, subalgebra, congruence relation, quotient algebra and of a product of algebras is obvious (for a detailed description cf. Slominski [20], Schmidt [16] and, for the finitary case, also Cohn [2]). For algebras $A$, $B$ the set of all homomorphisms from $A$ into $B$ will be written as $\mathrm{Hom}(A, B)$. If $A$ is an algebra and $X \subseteq u(A)$ then the closure $[X]$ (or $[X]^A$ if necessary) of $X$ shall be the intersection of all $u(B)$ such that $X \subseteq u(B)$ for subalgebras $B$ of $A$; the uniquely determined subalgebra $C$ of $A$ such that $u(C) = [X]$ then is called the subalgebra generated by $X$ in $A$. $X$ is called closed in $A$ if $X = [X]$. If $X \subseteq Y \subseteq u(A)$ and $Y$ is closed, then $[X] \subseteq Y$; this trivial but useful observation is called the principle of algebraic induction (Schmidt [16]). As an application, one finds that two homomorphisms from $A$ into an algebra $B$ coincide on $[X]$ provided they coincide on $X$. Further, making use of the regularity of $\mathrm{dim}(\Delta)$, algebraic induction yields for every $a \in [X]$ a subset $X_a \subseteq X$ such that $a \in [X_a]$ and $\mathrm{card}(X_a)$

equations. An equation $(u, v)$ holds in an algebra $A$ if $h(u) = h(v)$ for every $h \in \mathrm{Hom}(P, A)$. If $\mathbf{A}$ is a class of algebras, then $Q(\mathbf{A})$, or $Q(\mathbf{A}, X)$ if necessary, shall be the set of equations holding in every $A \in \mathbf{A}$. $Q(\mathbf{A})$ is a congruence relation on $P$ since it is the intersection of all congruence relations induced by homomorphisms $h \in \mathrm{Hom}(P, A)$ for $A \in \mathbf{A}$. Defining $Q(A) = Q(\{A\})$ for algebras $A$, one obtains that $Q(\mathbf{A})$ is also the intersection of all $Q(A)$ for $A \in \mathbf{A}$. If $(u, v) \in Q(\mathbf{A})$ then, for any $g \in \mathrm{Hom}(P, P)$, also $(g(u), g(v)) \in Q(\mathbf{A})$; for if $A \in \mathbf{A}$ and $h \in \mathrm{Hom}(P, A)$ then also $hg \in \mathrm{Hom}(P, A)$. Let $X$ contain more than one element; a class $\mathbf{A}$ then is singular if and only if $Q(\mathbf{A})$ contains an equation $(x, y)$ such that $x, y \in X$, $x \neq y$; in that case $Q(\mathbf{A}) = u(P) \times u(P)$.

Let $Y$ be a subset of $X$ and let $P_Y$ be the subalgebra generated by $Y$ in $P$. Obviously, $P_Y$ is absolutely freely generated by $Y$. For every class $\mathbf{A}$ of algebras one obtains $Q(\mathbf{A}, Y) = Q(\mathbf{A}, X) \cap (u(P_Y) \times u(P_Y))$. For if $A \in \mathbf{A}$ and $h \in \mathrm{Hom}(P, A)$ then $h \restriction u(P_Y) \in \mathrm{Hom}(P_Y, A)$, which implies $Q(\mathbf{A}, Y) \subseteq Q(\mathbf{A}, X)$. If, on the other hand, $k \in \mathrm{Hom}(P_Y, A)$, $\psi = k \restriction Y$, choose $\varphi \in u(A)^X$ such that $\varphi$ extends $\psi$. If $h \in \mathrm{Hom}(P, A)$ extends $\varphi$, then $h \restriction u(P_Y) = k$; thus $Q(\mathbf{A}, X) \cap (u(P_Y) \times u(P_Y)) \subseteq Q(\mathbf{A}, Y)$.

Let $P$ be absolutely freely generated by $X$. If a class $\mathbf{A}$ contains an algebra $A$, $\mathbf{A}$-freely generated by a set $Z$ equipotent with $X$, then $A$ is isomorphic to $P/Q(\mathbf{A})$. For let $\beta$ be a bijection from $X$ onto $Z$, let $p \in \mathrm{Hom}(P, A)$ be the extension of $\beta$, let $p = h\pi$ be the decomposition into a natural epimorphism $\pi$ from $P$ onto a quotient algebra $P/Q$ and an isomorphism $h \in \mathrm{Hom}(P/Q, A)$. It will be sufficient to show $Q = Q(\mathbf{A})$. Now $A \in \mathbf{A}$ implies $Q(\mathbf{A}) \subseteq Q$. If, on the other hand, $B \in \mathbf{A}$ and $j \in \mathrm{Hom}(P, B)$, define $\chi = j \restriction X$. Since $A$ is $\mathbf{A}$-freely generated by $Z$, $\chi \cdot \beta^{-1}$ extends to $k \in \mathrm{Hom}(A, B)$; as $j$ and $kp$ coincide on $X$, one finds $j = kp = kh\pi$. Thus $Q \subseteq Q(\mathbf{A})$.
Conversely, let $P$ be absolutely freely generated by $X$, let $\mathbf{A}$ be any class of algebras, and let $\pi$ be the natural epimorphism from $P$ onto $P/Q(\mathbf{A})$. Any map $\chi$ from $\pi^*(X)$ into the carrier $u(A)$ of an algebra $A \in \mathbf{A}$ determines a $k \in \mathrm{Hom}(P, A)$ which extends $\chi \cdot (\pi \restriction X)$. Since $A \in \mathbf{A}$, $k$ factors through $\pi$, i.e. $k = h\pi$ where $h \in \mathrm{Hom}(P/Q(\mathbf{A}), A)$. Since $\pi^*(X)$ generates $P/Q(\mathbf{A})$, $h$ is the unique extension of $\chi$ in $\mathrm{Hom}(P/Q(\mathbf{A}), A)$. Thus every algebra in $\mathbf{A}$, isomorphic to $P/Q(\mathbf{A})$, is $\mathbf{A}$-freely generated by a set equipotent with $\pi^*(X)$. Precisely in case $\mathbf{A}$ is not singular, $\pi \restriction X$ is a bijection, i.e. $\pi^*(X)$ is equipotent with $X$.

For every class $\mathbf{A}$ of algebras define $S(\mathbf{A})$, $H(\mathbf{A})$, $I(\mathbf{A})$ as the class of all subalgebras, homomorphic images and isomorphic images of algebras of $\mathbf{A}$ respectively; define $P(\mathbf{A})$ as the class of all algebras which are products of arbitrary families of algebras in $\mathbf{A}$. A class is called primitive if it is closed


with respect to $S$, $H$ and $P$; it is called quasi-primitive if it is closed with respect to $S$, $I$ and $P$. If $\mathbf{A}$ is closed with respect to $S$ and $P$, then the axiom of choice ensures that $I(\mathbf{A})$ is quasi-primitive.

Let $P$ be absolutely freely generated by $X$ and assume that $P$ is not empty. Then, for any class $\mathbf{A}$, the algebra $P/Q(\mathbf{A})$ belongs to $ISPIS(\mathbf{A})$. For let $\mathfrak{Q}$ be the set of all congruence relations on $P$ induced by homomorphisms into algebras of $\mathbf{A}$; let $R$ be the product $\prod \langle P/Q \mid Q \in \mathfrak{Q} \rangle$. Then $R$ belongs to $PIS(\mathbf{A})$ since each of the algebras $P/Q$ belongs to $IS(\mathbf{A})$. Let $\pi_Q$ be the natural epimorphism from $P$ onto $P/Q$; let $p_Q$ be the epimorphic projection from $R$ onto $P/Q$; let $p \in \mathrm{Hom}(P, R)$ be defined by $\pi_Q = p_Q \cdot p$ for $Q \in \mathfrak{Q}$; let $p = h\pi$ be the decomposition into a natural epimorphism from $P$ onto an algebra $P/S$ and a monomorphism into $R$. Since $\pi_Q = p_Q \cdot h \cdot \pi$ for every $Q \in \mathfrak{Q}$, one obtains that $S = \bigcap \langle Q \mid Q \in \mathfrak{Q} \rangle$, i.e. $S = Q(\mathbf{A})$. Thus $P/Q(\mathbf{A})$ is isomorphic to a subalgebra of $R$.

Let $\mathbf{A}$ be a non-singular, quasi-primitive class of algebras. Then $\mathbf{A}$ contains, for any set $X$, an algebra $A$, $\mathbf{A}$-freely generated by $X$. For let $P$ be absolutely freely generated by $X$. If $P$ is not empty, $P/Q(\mathbf{A})$ belongs to $\mathbf{A}$, and a familiar replacement construction produces an algebra $A$, containing $X$, and an isomorphism from $P/Q(\mathbf{A})$ onto $A$ which sends $\pi^*(X)$ onto $X$. If $P$ is empty, also $X$ is empty and the type $\Delta$ does not contain constants. In that case, the empty algebra is a subalgebra of any algebra of $\mathbf{A}$ and may be chosen for $A$. If closedness with respect to $S$ is replaced by closedness with respect to non-empty subalgebras, then the assertion still holds for any non-empty set $X$.

A class $\mathbf{A}$ of algebras is called equational if there exist a set $X$ such that $\mathrm{card}(X) = \mathrm{rank}(\Delta)$, an algebra $P$ absolutely freely generated by $X$, and a set $M$ of $\Delta$-$X$-$P$-equations such that $\mathbf{A}$ consists exactly of those algebras in which the equations of $M$ hold. In that case, $\mathbf{A}$ is called defined by the set $M$.
Obviously, this definition depends only on the cardinality of $X$ but not on the actual set $X$ nor on the way in which the particular algebra $P$ may have been chosen. If $\mathbf{A}$ is equational, then $\mathbf{A}$ is defined by the set $Q(\mathbf{A}, X)$ of $\Delta$-$X$-$P$-equations. If $\mathbf{A}$ is arbitrary, let $E(\mathbf{A})$ be the class of algebras defined by the set $Q(\mathbf{A}, X)$ for some $X$ such that $\mathrm{card}(X) = \mathrm{rank}(\Delta)$. Then $Q(\mathbf{A}, X) = Q(E(\mathbf{A}), X)$, and $E(\mathbf{A})$ is the smallest equational class containing $\mathbf{A}$; $E(\mathbf{A})$ is called the equational closure of $\mathbf{A}$. A class $\mathbf{A}$ is called strictly equational if $\mathbf{A}$ consists exactly of the non-empty algebras in which, for some $X$, $P$ and $M$ as before, the equations of $M$ hold. Correspondingly, for a class $\mathbf{A}$ of non-empty algebras the strict equational closure $E_s(\mathbf{A})$ is defined as the class of all non-empty algebras of $E(\mathbf{A})$.
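For finite algebras the notion of an equation holding in an algebra, and of a class defined by a set $M$ of equations, can be checked mechanically. The Python sketch below is an illustration under assumptions made here: terms are encoded as nested tuples over string generators, a finite non-empty algebra is a pair of a carrier and a dictionary of operations, and the names `evaluate`, `holds`, `defined_by` are hypothetical. It uses the fact that a homomorphism from the absolutely free algebra is determined by its values on the generators, so only assignments of the generators occurring in the equation need to be enumerated.

```python
from itertools import product

def evaluate(term, ops, env):
    """Value of a term under the homomorphism extending the assignment env."""
    if isinstance(term, str):            # a generator x in X
        return env[term]
    name, *args = term
    return ops[name](*(evaluate(a, ops, env) for a in args))

def generators(term):
    """Set of generators occurring in a term."""
    if isinstance(term, str):
        return {term}
    return set().union(*(generators(a) for a in term[1:]))

def holds(equation, algebra):
    """True if h(u) == h(v) for every assignment of the occurring generators."""
    u, v = equation
    carrier, ops = algebra
    names = sorted(generators(u) | generators(v))
    return all(
        evaluate(u, ops, dict(zip(names, values)))
        == evaluate(v, ops, dict(zip(names, values)))
        for values in product(carrier, repeat=len(names))
    )

def defined_by(equations, algebras):
    """Names of those given algebras in which all equations of M hold."""
    return [
        name for name, algebra in algebras.items()
        if all(holds(e, algebra) for e in equations)
    ]

# Two small example algebras with one binary operation "m" (chosen for illustration).
Z3 = (range(3), {"m": lambda a, b: (a + b) % 3})   # addition modulo 3
LEFT = ((0, 1), {"m": lambda a, b: a})             # left-zero semigroup

COMMUTATIVE = (("m", "x", "y"), ("m", "y", "x"))
IDEMPOTENT = (("m", "x", "x"), "x")
```

For instance, `defined_by([COMMUTATIVE], {"Z3": Z3, "LEFT": LEFT})` selects only the first algebra, while the idempotent law selects only the second.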


Let $P$ be absolutely freely generated by a set $Y$ and assume that $P$ is not empty. Any $\Delta$-$Y$-$P$-equation, holding in an algebra $A$, holds a fortiori in any subalgebra of $A$. Any equation, holding in every factor $A_s$ of a product $R = \prod \langle A_s \mid s \in S \rangle$, also holds in $R$ since operations in $R$ are performed coordinatewise. Finally, it is easy to see that any equation holding in an algebra $A$ also holds in every homomorphic image of $A$; if $A$ is finitary, then the axiom of choice is not needed here. Consequently, every equational class is primitive. If $\mathbf{A}$ is arbitrary, then $P/Q(\mathbf{A}, Y)$ belongs to $ISPIS(\mathbf{A})$; hence the equations from $Q(\mathbf{A}, Y)$ hold in $P/Q(\mathbf{A}, Y)$.

LEMMA 1. Let $\mathbf{A}$ be a class of algebras, let $P$ be absolutely freely generated by a set $Y$, let $\pi_Y$ be the natural epimorphism from $P$ onto $P/Q(\mathbf{A}, Y)$. Then $P/Q(\mathbf{A}, Y)$ is $E(\mathbf{A})$-freely generated by $\pi_Y^*(Y)$.

For the proof, it can be assumed that $P$ is not empty. Then $P/Q(\mathbf{A}, Y)$ belongs to $ISPIS(\mathbf{A})$ and, therefore, to $E(\mathbf{A})$. Let $X$ be a set such that $\mathrm{card}(X) = \mathrm{rank}(\Delta)$. Since $Q(\mathbf{A}, X) = Q(E(\mathbf{A}), X)$, the algebra $P/Q(\mathbf{A}, X)$ is $E(\mathbf{A})$-freely generated by $\pi_X^*(X)$. If the lemma is true for a certain set $Y$, it will also be true for any set equipotent with $Y$. It will be shown below that (a) if the lemma is true for a set $Y$ then it is true for every subset $Z$ of $Y$; (b) if the lemma is true for a set $Z$ such that $\mathrm{card}(Z) = \mathrm{rank}(\Delta)$ then it is true for every set $Y$ such that $Z \subseteq Y$. This will finish the proof, for the lemma has been shown for the set $X$; if $Y$ is arbitrary, it will be true for $X \times Y$ by (b) (provided $Y$ is not empty) and, therefore, for $Y$ by (a).

So let $Z$ be a subset of $Y$ and let $P_Z$ be the subalgebra generated by $Z$ in $P$. In order to deal with (a), one has to prove that $P_Z/Q(\mathbf{A}, Z)$ is $E(\mathbf{A})$-freely generated by $\pi_Z^*(Z)$. This will be done if, for any $G \in E(\mathbf{A})$, every $k \in \mathrm{Hom}(P_Z, G)$ factors through $\pi_Z$. If $k$ is given, let $\varphi$ be an arbitrary extension of $k \restriction Z$ in $u(G)^Y$ and let $h \in \mathrm{Hom}(P, G)$ be the extension of $\varphi$.
By assumption, the lemma holds for $Y$; hence $h$ factors through $\pi_Y$. Therefore $h$ identifies the equations of $Q(\mathbf{A}, Y)$; since $Q(\mathbf{A}, Z) = Q(\mathbf{A}, Y) \cap (u(P_Z) \times u(P_Z))$ and $k = h \restriction u(P_Z)$, also $k$ identifies the equations of $Q(\mathbf{A}, Z)$.

In order to prove (b), let $Z$ and $Y$ be given such that $Y \supseteq Z$, $\mathrm{card}(Z) = \mathrm{rank}(\Delta)$; it can be assumed that $\mathrm{card}(Y) > \mathrm{card}(Z)$. It will be sufficient to show that, for any $G \in E(\mathbf{A})$, every $h \in \mathrm{Hom}(P, G)$ factors through $\pi_Y$. Let $(u, v)$ be an equation in $Q(\mathbf{A}, Y)$; there exist subsets $Y_u$, $Y_v$ of $Y$ such that $u \in [Y_u]$, $v \in [Y_v]$, $\mathrm{card}(Y_u)$

$\mathrm{card}(Z) < \mathrm{card}(Y)$, it follows from (AC) that there exists $W' \subseteq Y$ such that $\mathrm{card}(W') = \mathrm{card}(W)$, $W' \cap W = W' \cap Z = 0$. Sending first $W$ into $W'$ and then $W'$ into $Z$, a bijection $\gamma$ of $Y$ onto itself may be chosen which maps $W$ into $Z$. Let $g$ be the automorphism of $P$ which extends $\gamma$. Then $h$ will identify $(u, v)$ if and only if $h \cdot g^{-1}$ identifies $(g(u), g(v))$. But with $(u, v)$ also $(g(u), g(v))$ belongs to $Q(\mathbf{A}, Y)$; since $g^*(W) \subseteq Z$, $g^*([W]) \subseteq u(P_Z)$, the equation $(g(u), g(v))$ even belongs to $Q(\mathbf{A}, Y) \cap (u(P_Z) \times u(P_Z)) = Q(\mathbf{A}, Z)$. By assumption, the lemma holds for $Z$; hence $k = h \cdot g^{-1} \restriction u(P_Z)$ factors through $\pi_Z$. Consequently, $h \cdot g^{-1}$ as well as $k$ identifies $(g(u), g(v))$. If only finitary algebras are considered, the bijection $\gamma$ can be constructed without any use of (AC). For in that case $\mathrm{rank}(\Delta) = \mathrm{dim}(\Delta) = \omega$ shows that the set $W$ can be found such that $\mathrm{card}(W)$
2. Algebraic operations

In this section algebras of a fixed type $\Delta$ will be considered.


Let $E$ and $X$ be sets and denote by $S(E, X)$ the power of $E$ with exponent $E^X$, i.e. the set of all operations of arity $X$ on $E$. If $Y \subseteq X$, let $r_Y^X$ be the injection of $S(E, Y)$ into $S(E, X)$ such that, for any $d \in S(E, Y)$ and any $\varphi \in E^X$, one has $r_Y^X(d)(\varphi) = d(\varphi \restriction Y)$.

Let $A$ be a non-empty algebra and let $X$ be a set. Then $Op(A, X)$ shall be the power of $A$ with exponent $u(A)^X$, i.e. the algebra with carrier $S(u(A), X)$ and operations defined coordinatewise. If $\varphi \in u(A)^X$, let $\pi_\varphi^X$ be the epimorphic projection from $Op(A, X)$ onto $A$ at $\varphi$, i.e. $\pi_\varphi^X(d) = d(\varphi)$ for any $d \in u(Op(A, X))$. If $Y \subseteq X$, then $r_Y^X$ becomes a monomorphism $r_Y^{AX}$ from $Op(A, Y)$ into $Op(A, X)$ which, occasionally, will be written simply as $r_Y^X$. If $x \in X$, define $e_x^{AX}$ in $u(Op(A, X))$ by $e_x^{AX}(\varphi) = \varphi(x)$ for every $\varphi \in u(A)^X$. If $X \neq 0$, let $H(A, X)$ be the subalgebra of $Op(A, X)$ generated by $\{e_x^{AX} \mid x \in X\}$. If $y \in Y$ and $Y \subseteq X$, then $r_Y^X(e_y^{AY}) = e_y^{AX}$. Hence $r_Y^X \restriction u(H(A, Y))$ is a monomorphism from $H(A, Y)$ into $H(A, X)$ which, again, will be written as $r_Y^{AX}$ or $r_Y^X$.

Let $A$ be as before, $X \neq 0$, and let $P$ be absolutely freely generated by $X$. One defines an epimorphism $e_A$ from $P$ onto $H(A, X)$ such that $e_A(x) = e_x^{AX}$ for $x \in X$. Let $\varphi$ be in $u(A)^X$ and let $h_\varphi$ be the extension of $\varphi$ in $\mathrm{Hom}(P, A)$. Then $\pi_\varphi^X \cdot e_A = h_\varphi$, since these homomorphisms coincide on $X$. Therefore $e_A(r)(\varphi) = h_\varphi(r)$ holds for any $r \in u(P)$. The congruence relation on $P$ induced by $e_A$ is the set $Q(A) = Q(\{A\}, X)$ of all $\Delta$-$X$-$P$-equations holding in $A$, because $e_A(u) = e_A(v)$ if and only if $h_\varphi(u) = h_\varphi(v)$ for every $\varphi \in u(A)^X$ (cf. Schmidt [17], Satz 8). Hence there exists an isomorphism $h_A$ from $P/Q(A)$ onto $H(A, X)$ such that $h_A \pi_A = e_A$, where $\pi_A$ is the natural epimorphism from $P$ onto $P/Q(A)$. Assume now that $A$ is not singular. Then $H(A, X)$ is $E(\{A\})$-freely generated by $\{e_x^{AX} \mid x \in X\}$ because $h_A(\pi_A(x)) = e_x^{AX}$.
In particular, $E(\{A\})$ contains each of the algebras $Op(A, Y)$ and $H(A, Y)$ for arbitrary $Y$, since it is closed with respect to products and subalgebras.

Let $A$, $X$ and $P$ be as before. Since the equations of $Q(A)$ hold in $P/Q(A)$, one has $Q(A) = Q(P/Q(A)) = Q(H(A, X))$. Thus there exists an isomorphism $h_H$ from $P/Q(A)$ onto $H(H(A, X), X)$ such that $h_H \pi_A = e_{H(A,X)}$. Defining $h^X = h_H \cdot h_A^{-1}$, one obtains an isomorphism from $H(A, X)$ onto $H(H(A, X), X)$ which, for every $r \in u(P)$, maps $e_A(r)$ onto $e_{H(A,X)}(r)$; in particular, $e_x^{AX}$ is mapped onto $e_x^{H(A,X),X}$. Now let $g$ be in $u(H(A, X))$, let $\eta$ be in $u(H(A, X))^X$, and let $\varphi$ be in $u(A)^X$. Then $\pi_\varphi^X(h^X(g)(\eta)) = g(\pi_\varphi^X \cdot \eta)$ holds. For let $r$ be in $u(P)$ such that $g = e_A(r)$ and, therefore, $h^X(g) = e_{H(A,X)}(r)$. Define $\lambda = \pi_\varphi^X \cdot \eta$, and let $h'_\eta$ be the extension of $\eta$ in $\mathrm{Hom}(P, H(A, X))$. Then $g(\pi_\varphi^X \cdot \eta) = e_A(r)(\lambda) = h_\lambda(r)$ and $\pi_\varphi^X(e_{H(A,X)}(r)(\eta)) = \pi_\varphi^X(h'_\eta(r))$ holds. But $\pi_\varphi^X \cdot h'_\eta$ extends $\lambda$ in $\mathrm{Hom}(P, A)$; thus $h_\lambda = \pi_\varphi^X \cdot h'_\eta$ and $h_\lambda(r) = \pi_\varphi^X(h'_\eta(r))$. The equality


$\pi_\varphi^X(h^X(g)(\eta)) = g(\pi_\varphi^X \cdot \eta)$ may be written as $(h^X(g)(\eta))(\varphi) = g(\langle \eta(x)(\varphi) \mid x \in X \rangle)$. Thus $h^X(g)$ is the power with exponent $X$ of $g$ and $h^X(g)(\eta)$ is the superposition of $g$ and $\eta$.

Let $A$ be a non-empty algebra, $Y \subseteq X$, $X \neq 0$. Let $Op(A, X; Y)$ be the image under $r_Y^X$ of $Op(A, Y)$ in $Op(A, X)$. Then $d \in u(Op(A, X))$ belongs to $u(Op(A, X; Y))$ if and only if, for all $\varphi$, $\chi$ in $u(A)^X$, $\varphi \restriction Y = \chi \restriction Y$ implies $\pi_\varphi^X(d) = \pi_\chi^X(d)$. Assume now $Y \neq 0$ and let $H(A, X; Y)$ be the image under $r_Y^X$ of $H(A, Y)$ in $H(A, X)$. Then $H(A, X; Y)$ is the subalgebra generated by $\{e_y^{AX} \mid y \in Y\}$ in $H(A, X)$.

LEMMA 2. $u(H(A, X; Y)) = u(H(A, X)) \cap u(Op(A, X; Y))$.

This is obvious in case $A$ is singular; if $A$ is not singular, still the set on the left side is contained in the set on the right side. Observe that $H(A, X)$ is $E(\{A\})$-freely generated by $\{e_x^{AX} \mid x \in X\}$ and that $H(A, Y)$ belongs to $E(\{A\})$. Let $\sigma$ be a map from $X$ onto $Y$ such that $\sigma \restriction Y$ is the identity; let $f$ be the epimorphic extension of $\sigma$ from $H(A, X)$ into $H(A, Y)$ such that $f(e_x^{AX}) = e_{\sigma(x)}^{AY}$. For every $\psi$ in $u(A)^Y$ one finds that $\pi_\psi^Y \cdot f = \pi_{\psi \cdot \sigma}^X$ since both these homomorphisms map $e_x^{AX}$ onto $\psi(\sigma(x))$. Let $d$ be in $u(H(A, X))$ and assume that $\varphi \restriction Y = \chi \restriction Y$ implies $\pi_\varphi^X(d) = \pi_\chi^X(d)$ for all $\varphi$, $\chi$ in $u(A)^X$; it will be sufficient to show that $d = r_Y^X(f(d))$. But with $\psi = \varphi \restriction Y$ one finds $\pi_\varphi^X(r_Y^X(f(d))) = \pi_\psi^Y(f(d)) = \pi_{\psi \cdot \sigma}^X(d) = \pi_\varphi^X(d)$ for every $\varphi$ in $u(A)^X$, because $(\psi \cdot \sigma) \restriction Y = ((\varphi \restriction Y) \cdot \sigma) \restriction Y = \varphi \restriction Y$ due to the choice of $\sigma$.

Let $A$ be a non-empty algebra and let $X$ be a non-empty set. A subalgebra $H(A, X; 0)$ of $H(A, X)$ can be defined by $u(H(A, X; 0)) = u(H(A, X)) \cap u(Op(A, X; 0))$. Let $V$, $Y$ be sets such that $V \subseteq Y \subseteq X$, $Y \neq 0$. Then $r_Y^X$ determines an isomorphism from $H(A, Y; V)$ onto $H(A, X; V)$. For it is clear that $H(A, Y; V)$ is mapped into $H(A, X; V)$.
Conversely, Lemma 2 gives $u(H(A, X; V)) \subseteq u(H(A, X; Y))$ because $u(H(A, X; V)) = u(H(A, X)) \cap u(Op(A, X; V)) \subseteq u(H(A, X)) \cap u(Op(A, X; Y)) = u(H(A, X; Y))$. Therefore for any $d \in u(H(A, X; V))$ there exist $d' \in u(H(A, Y))$ and $d'' \in u(Op(A, V))$ such that $d = r_Y^X(d') = r_V^X(d'')$. Then $r_Y^X(d') = r_V^X(d'') = r_Y^X r_V^Y(d'')$ implies $d' = r_V^Y(d'')$ and $d' \in u(H(A, Y; V))$.

Let $A$ be a non-empty algebra, let $X$ be a non-empty set. If $Z$ is not empty, then $r_X^{X \cup Z}$ is an isomorphism from $H(A, X; 0)$ onto $H(A, X \cup Z; 0)$, and $r_Z^{X \cup Z}$ is an isomorphism from $H(A, Z; 0)$ onto $H(A, X \cup Z; 0)$. Therefore the counterimage under $r_0^X$ of $H(A, X; 0)$ in $Op(A, 0)$ does not depend on $X$. Define as $H(A, 0)$ the subalgebra of $Op(A, 0)$ which is this counterimage. Let $\psi$ be the unique element in $u(A)^0$; then $\pi_\psi^0$ is an isomorphism of $Op(A, 0)$ onto $A$ which maps $H(A, 0)$ onto a subalgebra $C(A)$ of $A$. Now $\psi = \varphi \restriction 0$


for any $X$ and any $\varphi \in u(A)^X$; further $\pi_\psi^0 = \pi_\varphi^X \cdot r_0^X$. Therefore $\pi_\varphi^X$ determines an isomorphism, not depending on $\varphi$, of $H(A, X; 0)$ onto $C(A)$. In general, $\pi_\varphi^X$ maps $H(A, X)$ onto the subalgebra of $A$ generated by the range $\varphi^*(X)$ of $\varphi$, because $\pi_\varphi^X$ maps the generating set $\{e_x^{AX} \mid x \in X\}$ of $H(A, X)$ onto $\varphi^*(X)$ (cf. Schmidt [16], Theorem 5). If $B$ is a non-empty subalgebra of $A$, a sequence $\varphi$ can be found in $u(B)^X$; hence $[\varphi^*(X)] \subseteq u(B)$ shows that $\pi_\varphi^X$ maps $H(A, X)$ into $B$. Thus $B$ contains the image of $H(A, X; 0)$ under $\pi_\varphi^X$, i.e. $C(A)$ is contained in every non-empty subalgebra $B$ of $A$ (cf. Schmidt [18], Theorem 4). Consequently, if $C(A)$ is not empty, it is the smallest non-empty subalgebra of $A$. This holds in particular if the type $\Delta$ contains constants; in that case, $C(A)$ is generated in $A$ by the empty set and, therefore, the isomorphic algebra $H(A, 0)$ is generated in $Op(A, 0)$ by the empty set as well.

The elements of $H(A, X)$ are called algebraic (or polynomial) operations of arity $X$ on $A$; in particular, the elements of $H(A, 0)$ are the constant algebraic operations, and the elements of $C(A)$ are called algebraic (or equationally definable) constants of $A$. A first mathematical definition of algebraic operations was given by McKinsey-Tarski [14]; the present treatment is influenced by Schmidt [16]. If $X \neq 0$ or if the type $\Delta$ contains constants, then the algebra $H(A, X)$ is the same as Schmidt's $H^X(A)$; if $\Delta$ does not contain constants, $H^0(A)$ will be empty while $H(A, 0)$ need not be so.

Let $A$ be a non-empty algebra and let $X$ be a non-empty set. If $d \in u(H(A, X))$, define $\mathrm{Supp}(d)$ to be the set of all $Y$ such that $Y \subseteq X$, $d \in u(H(A, X; Y))$. Then $\mathrm{Supp}(d)$ is a filter: if $Y \subseteq Z \subseteq X$, $Y \in \mathrm{Supp}(d)$, then $Z \in \mathrm{Supp}(d)$; if $Y \in \mathrm{Supp}(d)$, $Z \in \mathrm{Supp}(d)$, then $Y \cap Z \in \mathrm{Supp}(d)$. Further, if $X \subseteq Z$ and $d \in u(H(A, X))$, then $\mathrm{Supp}(d)$ is a base for the filter $\mathrm{Supp}(r_X^Z(d))$.
Since any $d \in u(H(A, X))$ belongs already to a subalgebra $H(A, X; Y)$ such that $\mathrm{card}(Y) < \mathrm{dim}(\Delta)$, every filter $\mathrm{Supp}(d)$ has a base of sets of cardinality less than $\mathrm{dim}(\Delta)$. In particular, if the type under consideration is finitary, then $\mathrm{Supp}(d)$ has a base of finite sets and, therefore, contains a smallest set. However, there are examples when $\mathrm{Supp}(d)$ will not contain a smallest set. For let $X$, $Z$ be infinite sets, $Z \subseteq X$, $Z \neq X$, and let $a$, $b$ be different elements of $X$. For every $\varphi \in X^X$, define $d$ by $d(\varphi) = a$ if $\varphi^*(X) \cap Z$ is finite, $d(\varphi) = b$ otherwise. If an algebra $A$ can be found such that $d \in u(H(A, X))$, then $\mathrm{Supp}(d)$ will contain precisely the sets $Y$ such that $X - Y$ is finite. Now define $\Delta = \langle X \rangle$, $A = (X, \langle d \rangle)$; then $Op(A, X)$ is an algebra $(X^{X^X}, d')$. Define $\chi \in (X^{X^X})^X$ by $\chi(x) = e_x^{AX}$ for $x \in X$; $d \in u(H(A, X))$ will be shown if $d = d'(\chi)$. But for every $\varphi \in X^X$ one has $\pi_\varphi^X \cdot \chi = \varphi$, whence $\pi_\varphi^X(d) = d(\varphi) = d(\pi_\varphi^X \cdot \chi) = \pi_\varphi^X(d'(\chi))$ since $\pi_\varphi^X$ is homomorphic.

Let $A$ be a non-empty algebra and let $P$ be absolutely freely generated by


a non-empty set $X$. If $Y \subseteq X$, let $G(Y)$ be the set of all endomorphisms $g$ of $P$ such that $g \restriction Y$ is the identity and $g \restriction X$ maps $X$ into $X$. If $r \in u(P)$, the fact that $Y \in \mathrm{Supp}(e_A(r))$ can be expressed with the help of equations which do not depend on the particular algebra $A$:

LEMMA 3. If $Y \neq 0$, then $Y \in \mathrm{Supp}(e_A(r))$ if and only if, for every $g \in G(Y)$, the equation $(r, g(r))$ holds in $A$. If, moreover, $\mathrm{card}(X) > \mathrm{dim}(\Delta)$ then this equivalence remains true also for $Y = 0$.

For a proof, assume first $Y \in \mathrm{Supp}(e_A(r))$, $g \in G(Y)$. If $\eta = g \restriction X$ and $\varphi \in u(A)^X$ then $\varphi$ and $\varphi \cdot \eta$ coincide on $Y$; hence $h_\varphi(r) = e_A(r)(\varphi) = e_A(r)(\varphi \cdot \eta) = h_{\varphi \cdot \eta}(r) = h_\varphi(g(r))$. Assume now that the given equations hold in $A$. If $Y \neq 0$, let $\eta$ be a map from $X$ onto $Y$ such that $\eta \restriction Y$ is the identity, and let $g \in G(Y)$ be the extension of $\eta$. If $\varphi$, $\psi$ are in $u(A)^X$ and coincide on $Y$, then $\varphi \cdot \eta = \psi \cdot \eta$ and, therefore, $e_A(r)(\varphi) = h_\varphi(r) = h_\varphi(g(r)) = h_{\varphi \cdot \eta}(r) = h_{\psi \cdot \eta}(r) = h_\psi(g(r)) = h_\psi(r) = e_A(r)(\psi)$. In order to cover the case $Y = 0$, assume now that $\mathrm{card}(X) > \mathrm{dim}(\Delta)$. Let $W \subseteq X$ be a set such that $e_A(r) \in u(H(A, X; W))$ and $\mathrm{card}(W)$
LEMMA 4. $\mathrm{Supp}(d) = \bigcap \langle \mathrm{Supp}(p_B^X(d)) \mid B \in \mathbf{B} \rangle$.

For a proof, let $r$ be in $u(P)$ such that $e_A(r) = d$ and, therefore, $e_B(r) = p_B^X(d)$ for $B \in \mathbf{B}$. Since $Q(A) = Q(\mathbf{B})$, it follows from Lemma 3 that both sides of the assertion contain the same non-empty sets. Assume next that $X$ contains at least two elements. Then there exist non-empty subsets $Y$, $Z$ of $X$ such that $Y \cap Z = 0$. If $0$ belongs to every $\mathrm{Supp}(p_B^X(d))$, then so do $Y$, $Z$. Hence $Y$, $Z$ belong to $\mathrm{Supp}(d)$. But since $\mathrm{Supp}(d)$ is a filter, also $0$ belongs to $\mathrm{Supp}(d)$. Conversely, $0 \in \mathrm{Supp}(d)$ implies $0 \in \mathrm{Supp}(p_B^X(d))$ for every $B \in \mathbf{B}$. Assume, finally, that $X$ contains only one element, and let $Z$ be a set such that $X \subseteq Z$, $X \neq Z$. For any $B \in \mathbf{B}$ the homomorphisms $r_X^Z \cdot p_B^X$ and $p_B^Z \cdot r_X^Z$ from $H(A, X)$ into $H(B, Z)$ are equal since they coincide on the generators $e_x^{AX}$, $x \in X$. Since the lemma has been proved for $r_X^Z(d)$, one obtains $\mathrm{Supp}(r_X^Z(d)) = \bigcap \langle \mathrm{Supp}(p_B^Z(r_X^Z(d))) \mid B \in \mathbf{B} \rangle = \bigcap \langle \mathrm{Supp}(r_X^Z(p_B^X(d))) \mid B \in \mathbf{B} \rangle$. Since $\mathrm{Supp}(d)$ is a base for the filter $\mathrm{Supp}(r_X^Z(d))$, one has for every $Y \subseteq X$ that $Y \in \mathrm{Supp}(d)$ if and only if $Y \in \mathrm{Supp}(r_X^Z(d))$; likewise, $Y \in \mathrm{Supp}(p_B^X(d))$ is equivalent to $Y \in \mathrm{Supp}(r_X^Z(p_B^X(d)))$ for any $B \in \mathbf{B}$.

Let $A$ be a non-empty algebra, let $X$, $Y$ be sets, $Y \subseteq X$, $X \neq 0$. Let $P$ be absolutely freely generated by $X$ and let $P_Y$ be the subalgebra of $P$, absolutely freely generated by $Y$. If $Y \neq 0$ then $H(A, Y)$ is isomorphic to $P_Y/Q(A, Y)$. Let $e_A^Y$ be the epimorphism from $P_Y$ onto $H(A, Y)$; then $r_Y^X \cdot e_A^Y = e_A \restriction u(P_Y)$ since both these homomorphisms coincide on $Y$. (It may be remarked that the isomorphism $r_Y^X$ from $H(A, Y)$ onto $H(A, X; Y)$ corresponds to the isomorphism from $P_Y/Q(A, Y)$ onto $(e_A^{-1} e_A)^*(P_Y)/Q(A, X)$, given by one of the so-called theorems of isomorphism.) If $Y = 0$, assume that $P_0$ is not empty, i.e. that the type $\Delta$ contains constants. Then $H(A, 0)$ as well as the isomorphic algebra $H(A, X; 0)$ is generated by $0$; thus $e_A$ maps $P_0$ onto $H(A, X; 0)$.
Combining $e_A \restriction u(P_0)$ with the isomorphism from $H(A, X; 0)$ onto $H(A, 0)$, one obtains an epimorphism $e_A^0$ from $P_0$ onto $H(A, 0)$ such that, again, $r_0^X \cdot e_A^0 = e_A \restriction u(P_0)$. This may be read as saying that, if there are constant names for some elements $d$ of $u(H(A, 0))$ (i.e. elements $r \in u(P_0)$ such that $e_A^0(r) = d$), then there are such names for all elements of $u(H(A, 0))$ (a remark to this effect is also found in Lawvere [11], Chap. II, prop. 5). In any case, if $e_A^Y$ is defined then the equality $r_Y^X \cdot e_A^Y = e_A \restriction u(P_Y)$ implies that, for $r \in u(P)$, one has $Y \in \mathrm{Supp}(e_A(r))$ if and only if there exists $s \in u(P_Y)$ such that $(r, s)$ holds in $A$. Now let $\mathbf{B}$ again be a non-singular class of algebras. With a suitable algebra $A$, e.g. $P/Q(\mathbf{B}, X)$, Lemma 4 then gives


LEMMA 5. Assume that $Y \neq 0$ or $u(P_0) \neq 0$. If $r \in u(P)$ then $Y \in \bigcap \langle \mathrm{Supp}(e_B(r)) \mid B \in \mathbf{B} \rangle$ if and only if there exists $s \in u(P_Y)$ such that $(r, s)$ holds in every $B \in \mathbf{B}$.
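For a finite carrier and a finite set $X$ of arguments, the filter $\mathrm{Supp}(d)$ can be computed directly from the characterization used in Lemma 2: $Y$ belongs to the support of $d$ exactly when assignments agreeing on $Y$ always give the same value. The Python sketch below is an illustration only; it treats $d$ as an arbitrary operation given by a Python function, and the names `assignments`, `in_support` and `support` are hypothetical.

```python
from itertools import combinations, product

def assignments(carrier, X):
    """All maps from the argument set X into the carrier, as dictionaries."""
    for values in product(carrier, repeat=len(X)):
        yield dict(zip(X, values))

def in_support(d, Y, carrier, X):
    """True if d(phi) depends only on the restriction of phi to Y."""
    seen = {}
    for phi in assignments(carrier, X):
        key = tuple(phi[x] for x in X if x in Y)   # the restriction of phi to Y
        value = d(phi)
        if seen.setdefault(key, value) != value:
            return False
    return True

def support(d, carrier, X):
    """The set of all subsets Y of X belonging to Supp(d)."""
    subsets = (
        frozenset(c) for size in range(len(X) + 1) for c in combinations(X, size)
    )
    return {Y for Y in subsets if in_support(d, Y, carrier, X)}
```

In the finitary situation sketched here the support always has a smallest element, in line with the remark that $\mathrm{Supp}(d)$ then has a base of finite sets; the counterexample in the text requires infinite arities and is not reproduced by this code.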


As an immediate consequence, one obtains the following: Assume that Y # 0 or u(P,) # 0; let r be in u(P). If for every BEB an element u,eu(Py) can be found such that ( r , u , ) E Q ( B ) , then a uniform ueu(Py) can be chosen such that ( r , u ) E Q ( B ) for every B E B . Let X and n be sets such that there exists an injection p from n into X . If E i s a set, then p determines a bijectionj, from S(E, n) onto S(E, D*(n)) and an injection r ~ = r & , , . j , from S(E, n) into S ( E , X ) . Now let A = ( n , l i ~ I ) be the type under consideration and let X be such that card(X) 2 rank(d). A A-coordinate system f o r X shall be a pair (K, (piliEZ)) such that K is a class of injections into X , { p i l i E Z } c K and every pi is a (distinguished) injection from ni into X . If the type A is ordinal, it will be sufficient to demand that K is a set, containing an injection p, for every ordinal n such that card@) j n i , r:. Assume now that a fixed A-coordinate system for X is given. Let P = ( u ( P ) , ( f i l i ~ I ) be ) an algebra, absolutely freely generated by X , and let A = (u(A), (fiAli~1))be a non-empty algebra. For every p,, the map r,U(,), restricted to u ( H ( A ,n)), becomes a monomorphism r,” from H ( A , n) into H ( A , X ) ; r t will be written instead of r,$ For every c p e ~ ( A )every ~ , ieIand every AEX”’ one obtains e A ( f , ( l . ) )( 4 0 ) = S , ~ ( q a A ) because fi”(cp.A)= h , ( f i ( l ) ) since h , is homomorphic. In particular e A ( h ( P i ) )(q)=f!(cp.p,) and therefore, p* (n,)E Supp (e, (f i (pi))). Further, one finds eA( f , (Bi)) = rA (A”) since rA (LA)( ~=1(‘;W.Cn,, *ji(AA>) (CP) = (ji(LA)> (40 t P* (nil) = LA((9iB* (ni)).Pi>=fi” (9.Bi). Retaining the last notations, let r be in u ( P ) and let p, be in K such that f i * ( n ) ~ S u p p ( e , ( r ) )i.e. , e,(r)Eu(H(A, X ; p*(n))). The unique element rA,fln in u ( H ( A , n)) such that r,” (r = e, ( r ) shall be called the operation ofarity n on A defined by r. 
If no ambiguities can arise, r A S nwill be written instead of rASfln. In particular, (J;:(j?i))A~”i=A” for every i E 1 . If G is another algebra and geHom(A, G) then ghq=hg,, for any c p ~ u ( A )hence ~ ; ( g . e , ( r ) ) (q)= g ( e , ( r ) ( q ) ) = e , ( r ) (g’cp) for every r E u ( P ) (cf. Schmidt [17], Cor. of Satz 5). If, in addition, p*(n)ESupp(e,(r)) and p*(n)ESupp(e,(r)) then s ( r A , ” ( $ ) ) = r G , ” ( g - $ )for every $EU(A)”. For choose c p ~ u ( A )such ~ (cp)=(r&,,-jn(rA,”))(cp)= that cp rp*(n)=$.p-,’ ; then eA(r)(q)=(r,”(rA”’)) r”,”((cp rP*(n)).Pn)=rA,’($)and, in the same way, eG(r)( g * c p ) = r G 7 n ( g . $ ) . -If an equation ( r , s) holds in a n algebra A then Supp(e,(r))= Supp(e,(s))

EQUATIONAL MAPS


and, for any n such that P*(n)ESupp(eA(r)),also r A ~ " = s A ~If,' . conversely, B * ( n ) ~ S u p p ( e ~ ( r ) ) n S u ~ p ( eand ~ ( sr)A) r n = s A 2 *then , ) = g ( ( s ( Y )(cp t Z ) l Y E y>>= [ g : sl (cp YZ). Thus r ; [ g : q ] belongs to u ( H ( A , X;Z ) ) . Then [g: q l ~ u ( H ( A2)) , follows from Lemma 2 and, in case Z=O, from the definition of H ( A , 0). (For a different proof cf. Schmidt [16], Cor. 3 of Th. 20.) The following remarks shall establish some connections with clones in the sense of Ph. Hall (cf. Cohn [2]);they will not be needed in the later paragraphs. Let E and X be non-empty sets and let X be chosen such that card(X) is an infinite regular cardinal number. For every set Y and any Y E Y define ef' in S(E, Y ) by e,"'(cp)=cp(y) for any ~ E E ' . A subset M of U ( S ( E , Y ) I Y c X and card(Y)

W. FELSCHER

on I,and let A be the algebra ( E , ( L " l i ~ 1 ) )IfLAEM, . sayL"ES(E, n), define x ~ u ( H ( A , n ) ) "by x(k)=e$" for k
f i H ( A 9 n )which ( ~ ) , shows that Miscontainedin U(u(H(A,m))lm< c a r d ( X ) ) .

On the other hand, if m>O then M contains the generators e t m ,k<m, of H ( A , m). Further, for f i " E ( M n S ( E , n)) and q € u ( H ( A ,m))n one has f i H ( A 3 n ) ( q[)f=i A : q ] .Hence if m>O the inclusion u ( H ( A , m))EMfollows by algebraic induction from the fact that M is closed with respect to superposition. Finally, u ( H ( A , 0)) E M follows from u(H(A, 1)) 5 M and the property (iii) of clones. If A is an algebra of type A and X is a set such that dinz(d)max(dim(A'), d i m ( d 2 ) ) ;if both types are ordinal, let fi be a bijection from card(X) onto X . Then u(H(A', X ) ) = u ( H ( A 2 ,X ) ) holds ifand only if A', A2 determine the same non-ordinal or, in case both types are ordinal, the same ordinal E-X-clones (cf. the proof of Lemma 7 below).

3. Reducts

In this section two types $\Delta = (p_j \mid j \in J)$ and $\Delta^1 = (n_i \mid i \in I)$ will be considered, and $\Delta^1$ shall be a reduct of $\Delta$: there shall exist an injection $\tau$ from $I$ into $J$ such that $p_{\tau(i)} = n_i$ for $i \in I$; in most cases $I$ will be a subset of $J$ and $\tau$ will be the natural injection. Let $A = (u(A), (h_j^A \mid j \in J))$ be an algebra of type $\Delta$; the $I$-reduct $A|I$ of $A$ is defined as $(u(A), (f_i^{A|I} \mid i \in I))$ where $f_i^{A|I} = h_{\tau(i)}^A$ for $i \in I$. Hence $A|I$ is obtained from $A$ by forgetting the operations $h_j^A$ such that not $j \in \tau^*(I)$. However, it should be observed that, for $p_j = 0$, although the operation $h_j^A$ may be forgotten, its unique value $h_j^A(0)$ still remains an element of $u(A|I) = u(A)$. If $\mathbf{A}$ is a class of algebras of type $\Delta$, the $I$-reduct $\mathbf{A}|I$ of $\mathbf{A}$ shall be the class of all $I$-reducts of algebras of $\mathbf{A}$. Let $A$ be an algebra of type $\Delta$ and let $X$ be a subset of $u(A)$; then the closure $[X]^{A|I}$ is contained in the closure $[X]^A$. If $(A_s \mid s \in S)$ is a family of algebras of type $\Delta$, then $\prod (A_s|I \mid s \in S)$ is the $I$-reduct of $\prod (A_s \mid s \in S)$. In particular, for any algebra $A$ of type $\Delta$ and for any non-empty set $X$, the algebra $\mathrm{Op}(A|I, X)$ is the $I$-reduct of $\mathrm{Op}(A, X)$ and, therefore, the inclusion $u(H(A|I, X)) \subseteq u(H(A, X))$ holds. If $Y \subseteq X$, $Y \neq 0$, the monomorphism $r^{A|I}_{Y,X}$ from $H(A|I, Y)$ into $H(A|I, X)$ is the restriction of the monomorphism $r^{A}_{Y,X}$ from $H(A, Y)$

into $H(A, X)$; due to the construction of $H(A, 0)$ and $H(A|I, 0)$, this carries over to the case $Y = 0$. Let $X$ be a non-empty set, let $P$ be an algebra of type $\Delta$, absolutely freely generated by $X$, and let $P^1$ be the subalgebra of $P|I$ generated by $X$. Then $P^1$ is an algebra of type $\Delta^1$, absolutely freely generated by $X$. For let $B = (u(B), (f_i^B \mid i \in I))$ be a non-empty algebra of type $\Delta^1$ and let $\varphi$ be in $u(B)^X$. Define an algebra $A = (u(B), (h_j^A \mid j \in J))$ such that $h_j^A = f_i^B$ for $j = \tau(i)$, whereas $h_j^A$ is an arbitrary operation of arity $p_j$ if not $j \in \tau^*(I)$. Then $\varphi$ has an extension $h$ in $\mathrm{Hom}(P, A)$, and $h \restriction u(P^1)$ belongs to $\mathrm{Hom}(P^1, B)$. Now let $A$ be an arbitrary algebra of type $\Delta$. The elements of $\mathrm{Hom}(P, A)$ as well as those of $\mathrm{Hom}(P^1, A|I)$ are in one-to-one correspondence with the elements of $u(A)^X = u(A|I)^X$. Hence one obtains a bijection from $\mathrm{Hom}(P, A)$ onto $\mathrm{Hom}(P^1, A|I)$. If $A$ is not empty, let $e_A$ and $e_{A|I}$ be the epimorphisms from $P$ onto $H(A, X)$ and from $P^1$ onto $H(A|I, X)$. If $\mathbf{A}$ is a class of algebras of type $\Delta$, let $Q(\mathbf{A})$ and $Q(\mathbf{A}|I)$ be the sets of equations determined by $\mathbf{A}$ in $P$ and by $\mathbf{A}|I$ in $P^1$. One obtains

LEMMA 6. (a) If $A$ is not empty then $e_{A|I} = e_A \restriction u(P^1)$. (b) For every class $\mathbf{A}$: $Q(\mathbf{A}|I) = Q(\mathbf{A}) \cap (u(P^1) \times u(P^1))$. (c) For every class $\mathbf{A}$: $E(\mathbf{A})|I \subseteq E(\mathbf{A}|I)$, and if $\mathbf{A}$ consists of non-empty algebras also $\mathrm{Es}(\mathbf{A})|I \subseteq \mathrm{Es}(\mathbf{A}|I)$.

For a proof, let $\varphi$ be in $u(A)^X$ and let $h_\varphi$, $h_\varphi^1$ be the extensions of $\varphi$ in $\mathrm{Hom}(P, A)$ and $\mathrm{Hom}(P^1, A|I)$. Since $h_\varphi^1 = h_\varphi \restriction u(P^1)$, one finds $e_A(r)(\varphi) = h_\varphi(r) = h_\varphi^1(r) = e_{A|I}(r)(\varphi)$ for every $r \in u(P^1)$. This proves (a), and (b) follows since the correspondence between $h_\varphi$ and $h_\varphi^1$ is one-to-one. Finally, (c) is a consequence of (b). From now on, assume that $\mathrm{card}(X) \geq \mathrm{rank}(\Delta)$, and let a fixed $\Delta$-coordinate system $(\mathcal{K}, (\beta_j \mid j \in J))$ for $X$ be given. Let $A$ be a non-empty algebra of type $\Delta$ and define $B = A|I$. Then $\mathrm{Supp}(e_A(s)) = \mathrm{Supp}(e_B(s))$ for every $s \in u(P^1)$. Moreover

LEMMA 7. (a) If $s \in u(P^1)$ and $\beta_n \in \mathcal{K}$, $\beta^*(n) \in \mathrm{Supp}(e_A(s))$, then $s^{A,n} = s^{B,n}$.
(b) If $u(H(A, X)) = u(H(B, X))$ then, for every $n$ such that $\beta_n \in \mathcal{K}$, also $u(H(A, n)) = u(H(B, n))$.

For a proof of (a), observe that $s \in u(P^1)$ gives $r_n^A(s^{A,n}) = e_A(s) = e_B(s) = r_n^B(s^{B,n})$. This implies $s^{A,n} = s^{B,n}$ since $r_n^A$, $r_n^B$ are the restrictions, to $u(H(A, n))$ and $u(H(B, n))$ respectively, of the one injection $r_n^{u(A)} = r_n^{u(B)}$ from $S(u(A), n)$ into $S(u(A), X)$. For a proof of (b), assume $u(H(A, X)) = u(H(B, X))$; since $B = A|I$, one has always $u(H(B, n)) \subseteq u(H(A, n))$. It follows from Lemma 2 and, in case $n = 0$, from the definitions that $u(H(A, X)) = u(H(B, X))$ implies $u(H(A, X; \beta^*(n))) = u(H(B, X; \beta^*(n)))$. Now if $d \in u(H(A, n))$ then


r t ( d ) E u ( H ( A ,X ; p*(n))); hence there exists d ' a ( H ( B , p*(n))) such that r;(d) =rK,,,(d'). But then d"eu(H(B, n)) can be found such that d'=j,(d), whence r,"(dj=r:(d"). Since r,", r," both are restrictions of the injection r;('), one obtains d=d" and d c u ( H ( B , n)). Let P be again the algebra ( u ( P ) , ( h j l j € J ) ) . Then one can prove LEMMA 8. For a non-empty algebra A one has u ( H ( A , X))=u(H(AIZ, X ) ) if and only if, for everyj not in 7*(Z), there exists s j e u ( P ' ) such that the equation (s,, hj(Pj)) holds in A . If u ( H ( A , X))=u(H(AII,X ) ) , then the existence of the elements s j follows from the fact that both maps eA and e A l rare onto u ( H ( A , X ) ) and e A I I = e Aru(P'). Assume now that the sj exist. Since H(AI1, X ) contains the generators e:', X E X , of H ( A , X),it suffices t o show that u(N(Al1, X ) ) is closed with respect to the operations h r of H ( A , X ) . This is clear i f j ~ z * ( Z ) . I f j not in T * ( I ) and y ~u (If(A lZ,X))'j, then A:(?)= [hf: q ] . But since ( s j , h j ( P j ) ) holds in A , and since s j ~ u ( P ' ) .Hence hfeu(H(AIZ,p,)). Since u(H(AIZ, X)) was shown to be closed with respect to superposition, it follows that [hf : q l ~ u ( H ( A I 1X, ) ) . The following remark concerns a different approach to the fact that the algebras H ( A , Z ) are closed with respect to superposition. Let A be an algebra of type A ' , let g be in u ( H ( A , Y ) ) and q € u ( H ( A ,2)j'. Assume now that Lemma 8 is available; define A by J = 1 u { j } wherej not in Z,p i = Y. Extend A to an algebra C of type A by adding the operation g . Let Xcontain Y and 2 ; then r:gEu(H(A, X ) ) and A=CIZ shows that r ; g = e A ( s ) = e , ( s ) for some seu(P'). Since r f g = e c ( h j ( P j ) ) by definition of C, one finds that (s, hj(Pj)) holds in C. Hence u(H(A, X ) ) = u ( H ( C , X)) by Lemma 8 and, if an injection pz is available, also u ( H ( A , Z ) ) = u ( H ( C ,2))by Lemma 7. 
Since [g : r ]= h ~ p ( A ' Z ) ( yone l ) , obtains that [ g : y] belongs to u(H(C, 2))and, therefore, to u ( H ( A , Z)). It remains t o give a direct proof of the second part of Lemma 8, making no use of superpositions. If the sj are given, it will be sufficient to show that for every u e u ( P ) there exists u ' ~ u ( P ' such ) that eA(u>=eA(v').Let M be the set of all elements u ~ u ( Phaving ) this property; since X c M , it suffices to prove that M is closed in P. If r E M P J ,then, by definition of M , a sequence x ~ u ( P ' ) ' j can be found such that e , . y = e A . X . If hr corresponds to j in H ( A , X ) , this shows eA(hj(q))= h r (e,. y) =hr(e, .x)= eA(h,(x)). Therefore, j ez *( Z ) implies h.( ) E U ( P ' ) whence , h j ( y ) € M . If not j e z * ( Z ) , define I e X X xl by I tfl*(yj)=X*p,; t p * ( p j ) and /z t X - P * ( p j ) the identity. Let g and g 1 be the extensions of I in Hom(P, P ) and Hom(P', P'); then g1= g ru(P1)and

$g(h_j(\beta_j)) = h_j(\chi)$. Since with $(s_j, h_j(\beta_j))$ also $(g(s_j), g(h_j(\beta_j)))$ holds in $A$, one obtains $e_A(h_j(\eta)) = e_A(h_j(\chi)) = e_A(g(h_j(\beta_j))) = e_A(g(s_j))$. Then $s_j \in u(P^1)$ implies $g(s_j) = g^1(s_j)$, $g(s_j) \in u(P^1)$, whence $h_j(\eta) \in M$.
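Lemma 8 can be checked by brute force on a small finite algebra. The following Python sketch is an illustration only and not part of the original text. It takes the cyclic group on {0, ..., 4} as an algebra of type (0, 1, 2), forgets one operation, and compares the unary term operations of the algebra with those of the reduct, a finite stand-in for comparing $u(H(A, X))$ with $u(H(A|I, X))$ at arity 1. The names `unary_term_ops`, `zero`, `neg` and `add` are hypothetical helpers introduced for this sketch.

```python
from itertools import product

def unary_term_ops(n, ops):
    """Return the value tables of all unary term operations on {0, ..., n-1}.

    ops is a list of (arity, function) pairs; results are reduced mod n.
    The set is generated from the projection x -> x by closing under
    pointwise application of every operation in ops.
    """
    identity = tuple(range(n))
    found = {identity}
    changed = True
    while changed:
        changed = False
        for arity, f in ops:
            # Iterate over a snapshot so that new tables can be added safely.
            for args in product(list(found), repeat=arity):
                table = tuple(f(*(a[x] for a in args)) % n for x in range(n))
                if table not in found:
                    found.add(table)
                    changed = True
    return found

# Hypothetical operation descriptors: (arity, function).
zero = (0, lambda: 0)
neg = (1, lambda a: -a)
add = (2, lambda a, b: a + b)

n = 5
full = unary_term_ops(n, [zero, neg, add])    # the algebra A, type (0, 1, 2)
reduct = unary_term_ops(n, [zero, add])       # reduct with unary minus forgotten
poor = unary_term_ops(n, [zero, neg])         # reduct with + forgotten instead

negation = tuple((-x) % n for x in range(n))
print(full == reduct)       # True: minus equals the term x + x + x + x
print(negation in reduct)   # True
print(len(full), len(reduct), len(poor))   # 5 5 3
```

In the first reduct the forgotten operation is given by a term of the reduct, and the two sets of term operations coincide. In the second reduct no term defines addition, and the set of term operations shrinks.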
4. Equational and functorial maps

Let $\Delta^1 = (n_i \mid i \in I)$ and $\Delta^2 = (m_k \mid k \in K)$ be two types and assume, for the sake of convenience, that $I \cap K = 0$. Put $J = I \cup K$ and define $\Delta = (p_j \mid j \in J)$ by $p_i = n_i$ if $i \in I$, $p_k = m_k$ if $k \in K$; $\Delta$ is called the mixed type determined by $\Delta^1$, $\Delta^2$. Let $X$ be a set such that $\mathrm{card}(X) = \mathrm{rank}(\Delta)$ and let a fixed $\Delta$-coordinate system $(\mathcal{K}, (\beta_j \mid j \in J))$ for $X$ be given. Let $P = (u(P), (h_j \mid j \in J))$ be an algebra of type $\Delta$, absolutely freely generated by $X$; let $P^1 = (u(P^1), (f_i \mid i \in I))$ and $P^2 = (u(P^2), (g_k \mid k \in K))$ be the algebras of type $\Delta^1$ and $\Delta^2$, absolutely freely generated by $X$, which are subalgebras of the reducts $P|I$ and $P|K$ respectively. These notions will be kept fixed throughout this and the following section. Let $\mathbf{B}$ be a class of algebras of type $\Delta^1$ and let $\mathbf{C}$ be a class of algebras of type $\Delta^2$. A function $\phi$ from $\mathbf{B}$ into $\mathbf{C}$ is called a map if $u(B) = u(\phi(B))$ for every $B \in \mathbf{B}$. A map $\phi$ is called an equivalence if it is a bijection from $\mathbf{B}$ onto $\mathbf{C}$. A map $\phi$ is called functorial if $\mathrm{Hom}(B, D) \subseteq \mathrm{Hom}(\phi(B), \phi(D))$ for all algebras $B$, $D$ in $\mathbf{B}$; an equivalence $\phi$ is called functorial if $\phi$ as well as $\phi^{-1}$ are functorial maps. Thus a map $\phi$ is functorial if it determines a functor from the full category of algebras, determined by $\mathbf{B}$, into the full category of algebras, determined by $\mathbf{C}$, and if this functor commutes with the underlying set functors. If the classes $\mathbf{B}$ and $\mathbf{C}$ are equational, then these functors are precisely the algebraic functors in Lawvere's [11] and Linton's [12] categorical presentation of universal algebra. From now on, let $\mathbf{B}$, $\mathbf{C}$ be classes of non-empty algebras of type $\Delta^1$, $\Delta^2$ respectively. Let $\phi$ be a map from $\mathbf{B}$ into $\mathbf{C}$. If $B \in \mathbf{B}$, $B = (u(B), (f_i^B \mid i \in I))$, $\phi(B) = C = (u(B), (g_k^C \mid k \in K))$, define an algebra $\Psi_1(B)$ of type $\Delta$ by $\Psi_1(B) = A = (u(B), (h_j^A \mid j \in J))$, $h_i^A = f_i^B$ if $i \in I$, $h_k^A = g_k^C$ if $k \in K$. Let $\mathbf{A}$ be the class of all algebras $\Psi_1(B)$ for $B \in \mathbf{B}$; $\mathbf{A}$ is called the class of mixed algebras determined by $\phi$.
Then $\Psi_1$ is an equivalence from $\mathbf{B}$ onto $\mathbf{A}$ and, for any $B \in \mathbf{B}$, $\phi(B) = \Psi_1(B)|K$. If, moreover, $\phi$ is an equivalence from $\mathbf{B}$ onto $\mathbf{C}$, one defines analogously an equivalence $\Psi_2$ from $\mathbf{C}$ onto the same class $\mathbf{A}$ such that $\phi = \Psi_2^{-1} \Psi_1$. If the map $\phi$ is functorial, then $\Psi_1$ is a functorial equivalence. Let $\phi$ be a map from $\mathbf{B}$ into $\mathbf{C}$ and let $\mathbf{A}$ be the class of mixed algebras determined by $\phi$. The map $\phi$ is called equational if there exists a sequence $(s_k \mid k \in K)$ such that $s_k \in u(P^1)$ for $k \in K$ and, for all algebras $A \in \mathbf{A}$ and all

$k \in K$, the equations $(s_k, h_k(\beta_k))$ hold in $A$. In that case, $\phi$ is said to be defined by these equations. An equivalence $\phi$ is called equational if $\phi$ as well as $\phi^{-1}$ are equational maps. Obviously, any map defined on a singular class $\mathbf{B}$ is equational. Every equational map determines a representation of $\mathbf{C}$ in (the category determined by) $\mathbf{B}$ defined by identities in the sense of Cohn [2], IV.4, and all examples discussed in Cohn [2] arise in that way. However, there are occasions when representations defined by identities in Cohn's sense do not arise from equational maps: for instance, the representation of $R$-modules (where $R$ is a commutative ring) in $R$-algebras such that the universal functor assigns to every module its exterior algebra. If $\phi$ is an equational map from $\mathbf{B}$ into $\mathbf{C}$, defined by the equations $\{(s_k, h_k(\beta_k)) \mid k \in K\}$, then $\phi$ is uniquely determined by these equations. For if $B \in \mathbf{B}$ and $C = \phi(B)$, $A = \Psi_1(B)$, then $(s_k, h_k(\beta_k))$ holds in $A$, whence $g_k^C = h_k^A = (h_k(\beta_k))^{A,m_k} = s_k^{A,m_k}$ for every $k \in K$. Since $B = A|I$ and $s_k \in u(P^1)$, Lemma 7 implies $s_k^{A,m_k} = s_k^{B,m_k}$. Thus $g_k^C = s_k^{B,m_k}$ for every $k \in K$. The following remark answers a question posed by H. J. Hoehnke. If $\phi$ is an equational map from a class $\mathrm{Es}(\mathbf{B})$ onto some $\mathbf{C}$, then $\phi$ is uniquely determined by its restriction $\phi \restriction \mathbf{B}$. For let $\bar{\phi}$ be another equational map from $\mathrm{Es}(\mathbf{B})$ into some $\bar{\mathbf{C}}$ such that $\phi \restriction \mathbf{B} = \bar{\phi} \restriction \mathbf{B}$; let $\{(s_k, h_k(\beta_k)) \mid k \in K\}$ and $\{(\bar{s}_k, h_k(\beta_k)) \mid k \in K\}$ be the defining equations of $\phi$ and $\bar{\phi}$ respectively, and let $\Psi_1$, $\bar{\Psi}_1$ be the equational equivalences onto the corresponding classes of mixed algebras. If $B \in \mathbf{B}$ and $A = \Psi_1(B) = \bar{\Psi}_1(B)$ then $(s_k, h_k(\beta_k))$, $(\bar{s}_k, h_k(\beta_k))$ both hold in $A$. Hence $(s_k, \bar{s}_k)$ holds in $A$ and, by Lemma 6, also in $A|I = B$. Since $B \in \mathbf{B}$ was arbitrary, $(s_k, \bar{s}_k)$ belongs to $Q(\mathbf{B})$ for every $k \in K$ and, therefore, holds in every $D \in \mathrm{Es}(\mathbf{B})$. Now if $D \in \mathrm{Es}(\mathbf{B})$ and $G = \Psi_1(D)$ then $(s_k, \bar{s}_k)$ and $(s_k, h_k(\beta_k))$ both hold in $G$, whence $(\bar{s}_k, h_k(\beta_k))$ holds in $G$ for every $k \in K$.
Thus $\bar{\phi}$ can be defined by the same equations as $\phi$ and, therefore, coincides with $\phi$. Every equational map $\phi$ from $\mathbf{B}$ into $\mathbf{C}$ is functorial. For a proof, it will be sufficient to show that the equivalence $\Psi_1$ from $\mathbf{B}$ onto $\mathbf{A}$ is functorial. If $B$, $D$ are in $\mathbf{B}$, then $\mathrm{Hom}(\Psi_1(B), \Psi_1(D)) \subseteq \mathrm{Hom}(B, D)$ is obvious. Define $A = \Psi_1(B)$, $G = \Psi_1(D)$; let $g$ be in $\mathrm{Hom}(B, D)$ and $k \in K$. Since $(s_k, h_k(\beta_k))$ holds both in $A$ and $G$, one has $s_k^{A,m_k} = (h_k(\beta_k))^{A,m_k} = h_k^A$, $s_k^{G,m_k} = h_k^G$. But it was shown in section 2 that, for every $\psi \in u(A)^{m_k}$, one has $g(s_k^{A,m_k}(\psi)) = s_k^{G,m_k}(g \cdot \psi)$. Conversely, there are examples of functorial equivalences $\phi$ such that $\phi^{-1}$ is equational but not $\phi$. Namely, let $\mathbf{G}$ be the class of all groups, written additively and viewed as algebras $C = (u(C), (0, -, +))$ of type $(0, 1, 2)$; let $\mathbf{M}$ be the class of all reducts of groups, obtained by omitting the unary


operation - ; thus the elements of M are certain monoids. Let 4 be the map which assigns to every BEM the uniquely determined group whose reduct is B. Obviously, (b is a functorial equivalence and 4-l is equational. According to rank(d)=w, let X be countable and let P ' = ( u ( P ' ) , (O,+)) and P = P 2 = ( u ( P ) ,( 0 , - ,+ )) be the corresponding algebras, absolutely freely generated by X . Assume now that B is a subclass of M such that 4 1B is equational onto a subclass C of G. This will occur if and only if there exists s ~ u ( P ' )such that (s, -xo) holds in every group CEC or, equivalently, holds in the group F = P / Q ( C ) . Since P*(l)ESuPp(eF(-xo)), this is equivalent to =sF*', i.e. -eEpl =sFjl in H(F, 1). Defining M=$-'(F), Lemma 6 (a) shows that the elements d€u(H(F,X ) ) such that d=e,(s) for s ~ u ( P ' )are precisely the elements of u ( H ( M , X ) ) . Hence the elements of u(H(F, l)), representable in the form sF,' for s ~ u ( P ' ) are , precisely the elements of u ( H ( M , 1)). Therefore, 4 t B is equational if and only if -e;'Eu(H(M, 1)). Since H ( M , 1) is the monoid, generated by e2' in Op(M, l), one obtains that this is the case if and only if there exists n, O
LEMMA 9. Let g be the epimorphism from P onto D which extends the identity on X . Then, for every A E A and every r E u ( P ) , the equation ( r , g (Y)) holds in A . For let A be in A and let M be the set of all r ~ u ( P such ) that eA(r)= e A ( g ( r ) ) ;since X G M is obvious, it remains to show that M is closed in P. If q € M p j , then e , . q = e A . g q by definition of M . If h r is the operation in H ( A , X),corresponding to hj, thisimpliese,(hj(q))=hr(e,.q)=hr(e,.gq)= e A ( h j ( g q ) ) .Therefore it suffices to prove that eA(hj(gq))=eA(g(hj(q))).IfjEZ then h j u ( P ' ) =&=hq, whence hj(gq)=hq(gq) = g (Aj(?)). Assume now j e K .

Since $\delta_{g\eta} \cdot \beta_j = g\eta$ and since $t(\delta_{g\eta})$ is an endomorphism of $P$, one has $t(\delta_{g\eta})(h_j(\beta_j)) = h_j(t(\delta_{g\eta}) \cdot \beta_j) = h_j(\delta_{g\eta} \cdot \beta_j) = h_j(g\eta)$. Since with $(s_j, h_j(\beta_j))$ also $(t(\delta_{g\eta})(s_j), t(\delta_{g\eta})(h_j(\beta_j)))$ holds in $A$, one obtains $e_A(h_j(g\eta)) = e_A(t(\delta_{g\eta})(h_j(\beta_j))) = e_A(t(\delta_{g\eta})(s_j)) = e_A(h_j^D(g\eta)) = e_A(g(h_j(\eta)))$. Let $\phi$ be a functorial map from $\mathbf{B}$ into $\mathbf{C}$ (the case that $\mathbf{B}$ and $\mathbf{C}$ contain the empty algebra, if it exists, is admissible here). If $B$, $D$ are in $\mathbf{B}$ and $B$ is a subalgebra of $D$, then $\phi(B)$ is a subalgebra of $\phi(D)$; if $B$ is a homomorphic (isomorphic) image of $D$, then $\phi(B)$ is a homomorphic (isomorphic) image of $\phi(D)$. Let $(B_s \mid s \in S)$ be a family of algebras in $\mathbf{B}$ and assume that the product $B = \prod (B_s \mid s \in S)$ belongs to $\mathbf{B}$. Then $\phi(B) = \prod (\phi(B_s) \mid s \in S)$ since $u(\phi(B)) = u(B) = \prod (u(B_s) \mid s \in S) = \prod (u(\phi(B_s)) \mid s \in S)$ and every projection from $u(\phi(B))$ onto $u(\phi(B_s))$ is homomorphic. In particular, the functor determined by $\phi$ preserves products and equalizers. Therefore, if $\mathbf{B}$ is closed with respect to products and subalgebras, then the general adjoint functor theorem (cf. e.g. Lawvere [11], Th. 1.4, also Cohn [2], Th. III.4.2 or Felscher [7] 2.2.4) ensures that this functor has an adjoint: there exists a function $\theta$ from $\mathbf{C}$ into $\mathbf{B}$ and, for every $C \in \mathbf{C}$, a homomorphism $f_C$ from $C$ into $\phi(\theta(C))$ with the following universal property: for every $B \in \mathbf{B}$ and every $h \in \mathrm{Hom}(C, \phi(B))$ there exists a unique $g \in \mathrm{Hom}(\theta(C), B)$ such that $h = g f_C$. (For algebraic functors in the sense of Lawvere this is the Theorem in IV, sect. 2 of [11]; a generalization to a wider class of functors is prop. 5 in Linton [12]. For equational maps $\phi$ the existence of $\theta$ follows also from Cor. IV.4.2 in Cohn [2].) Let $\phi$ be a functorial map from $\mathbf{B}$ into $\mathbf{C}$, let $\mathbf{A}$ be the class of mixed algebras determined by $\phi$ and let $\Psi_1$ be the functorial equivalence from $\mathbf{B}$ onto $\mathbf{A}$. If an algebra $B$ is $\mathbf{B}$-freely generated by a set $Y$, then $\Psi_1(B)$ is $\mathbf{A}$-freely generated by $Y$.
For put $A = \Psi_1(B)$ and let $G$ be in $\mathbf{A}$; then $u(G)^Y = u(G|I)^Y$, $G = \Psi_1(G|I)$ by definition of $\mathbf{A}$ and $\mathrm{Hom}(A, G) = \mathrm{Hom}(B, G|I)$ by functoriality. Hence every $\varphi \in u(G)^Y$ has a unique extension in $\mathrm{Hom}(A, G)$. Further, $B = A|I$ implies $[Y]^B \subseteq [Y]^A$; therefore $Y$ generates $A$. Even if $\phi$ is a functorial equivalence, $Y$ will in general not also generate $\phi(B)$, hence $\phi(B)$ need not be $\mathbf{C}$-freely generated by $Y$. However, if $\phi$ is a functorial equivalence from $\mathbf{B}$ onto $\mathbf{C}$ and if $\mathbf{C}$ contains an algebra $C$, $\mathbf{C}$-freely generated by a set $Z$ equipotent with $Y$, then the same reasoning shows that $\Psi_2(C)$ is $\mathbf{A}$-freely generated by $Z$. Thus $\Psi_1(B)$ and $\Psi_2(C)$ are isomorphic and, consequently, so are $\phi(B)$ and $C$. In that case therefore $\phi(B)$ is $\mathbf{C}$-freely generated by $Y$. Let $\phi$ be a functorial map from $\mathbf{B}$ into $\mathbf{C}$ and let $\Psi_1$ be the functorial equivalence from $\mathbf{B}$ onto the class $\mathbf{A}$ of mixed algebras determined by $\phi$.

Let $R$ be an algebra of type $\Delta$, absolutely freely generated by a set $Y$, and let $R^1$ be the subalgebra of $R|I$, generated by $Y$. Let $Q(\mathbf{B})$ and $Q(\mathbf{A})$ be the equations of $R^1$ and $R$, holding in $\mathbf{B}$ and $\mathbf{A}$ respectively.

LEMMA 10. Let $\mathbf{B}$ contain an algebra $B$, $\mathbf{B}$-freely generated by a set $Z$ equipotent with $Y$. Then for any $r \in u(R)$ there exists $s \in u(R^1)$ such that $(r, s) \in Q(\mathbf{A})$.

For a proof, let $\pi_1$ and $\pi$ be the natural epimorphisms from $R^1$ and $R$ onto $R^1/Q(\mathbf{B})$ and $R/Q(\mathbf{A})$ respectively. Since $\mathrm{card}(Y) = \mathrm{card}(Z)$ there exists a bijection $\delta$ from $Y$ onto $Z$ which can be extended to $g \in \mathrm{Hom}(R, \Psi_1(B))$. Since $\Psi_1(B)$ is $\mathbf{A}$-freely generated by $Z$, $g$ factors into $g = h\pi$ where $h$ is an isomorphism from $R/Q(\mathbf{A})$ onto $\Psi_1(B)$. Further, $h$ is also an isomorphism from $(R/Q(\mathbf{A}))|I$ onto $\Psi_1(B)|I = B$. Since $R^1$ is a subalgebra of $R|I$, $g^1 = g \restriction u(R^1)$ is the unique extension of $\delta$ in $\mathrm{Hom}(R^1, B)$ and decomposes into $g^1 = h_1 \pi_1$ where $h_1$ is an isomorphism from $R^1/Q(\mathbf{B})$ onto $B$. Therefore $f = h^{-1} \cdot h_1$ is an isomorphism from $R^1/Q(\mathbf{B})$ onto $(R/Q(\mathbf{A}))|I$. Since $g(s) = g^1(s)$, $h\pi(s) = h_1\pi_1(s)$ for any $s \in u(R^1)$, one also has $f\pi_1(s) = \pi(s)$. Being an isomorphism, $f$ is in particular onto $u((R/Q(\mathbf{A}))|I) = u(R/Q(\mathbf{A}))$. Consequently, for every $r \in u(R)$ there exists $s \in u(R^1)$ such that $f\pi_1(s) = \pi(r)$. Since also $f\pi_1(s) = \pi(s)$, this gives $\pi(r) = \pi(s)$, $(r, s) \in Q(\mathbf{A})$. As an immediate consequence, one obtains

THEOREM 1. Let $\mathbf{B}$ contain an algebra, $\mathbf{B}$-freely generated by a set of cardinality $\mathrm{rank}(\Delta)$. Then every functorial map from $\mathbf{B}$ into $\mathbf{C}$ is equational. If, moreover, $\mathbf{C}$ contains an algebra, $\mathbf{C}$-freely generated by a set of cardinality $\mathrm{rank}(\Delta)$, then every functorial equivalence from $\mathbf{B}$ onto $\mathbf{C}$ is equational.

For equational equivalences between quasi-primitive classes of finitary algebras, this is Theorem 6 of Malcev [13]. It follows from earlier remarks that the assumptions with regard to equivalences from $\mathbf{B}$ onto $\mathbf{C}$ are satisfied if $\mathbf{B}$ contains an algebra, $\mathbf{B}$-freely generated by a set of cardinality $\mathrm{rank}(\Delta)$, and if $\mathbf{C}$ is closed with respect to subalgebras. Let $\phi$ be a map from $\mathbf{B}$ into $\mathbf{C}$ and let $\Psi_1$ be the equivalence from $\mathbf{B}$ onto the class $\mathbf{A}$ of mixed algebras determined by $\phi$. If $\phi$ is equational then Lemma 8 implies $u(H(\phi(B), X)) \subseteq u(H(\Psi_1(B), X)) = u(H(B, X))$ for every $B \in \mathbf{B}$. Let $\phi$ be arbitrary and assume, conversely, that $u(H(\phi(B), X)) \subseteq u(H(B, X))$ for every $B \in \mathbf{B}$. Then also $u(H(\Psi_1(B), X)) = u(H(B, X))$ holds for every $B \in \mathbf{B}$. Namely, let $B$ be in $\mathbf{B}$ and abbreviate $\Psi_1(B) = A$, $\phi(B) = C$; let $k$ be in $K$. Since $u(H(C, X)) \subseteq u(H(B, X))$ and $e_C(h_k(\beta_k)) \in u(H(C, X))$, there exists $s_k \in u(P^1)$ such that $e_B(s_k) = e_C(h_k(\beta_k))$. Since $e_B = e_A \restriction u(P^1)$ and $e_C = e_A \restriction u(P^2)$, this gives $e_A(s_k) = e_A(h_k(\beta_k))$, i.e. $(s_k, h_k(\beta_k))$ holds in $A$.

Now Lemma 8 gives $u(H(A, X)) = u(H(B, X))$. It would be interesting to have criteria which, if $u(H(\phi(B), X)) \subseteq u(H(B, X))$ for every $B \in \mathbf{B}$, ensure that $\phi$ is equational. A sufficient condition is that the class $\mathbf{A}$ contains an algebra $A$, functionally free for $\mathbf{A}$. For then $u(H(A|K, X)) \subseteq u(H(A, X)) = u(H(A|I, X))$ shows that the right equations hold in $A$ and, therefore, in $\mathbf{A}$.
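The statement earlier in this section that every equational map is functorial can be illustrated on a small instance. The Python sketch below is an illustration under stated assumptions, not a construction from the text: the base algebras are the additive monoids on {0, ..., 5} and {0, 1, 2}, the single new operation is defined in both by the same term `s` (five-fold addition of `x`), and the homomorphism `g` is reduction mod 3. The helper names `evaluate`, `cyclic_monoid`, `multiple` and `new_op` are hypothetical.

```python
def evaluate(term, ops, env):
    """Evaluate a term: a variable name (str) or a tuple (op_name, *subterms)."""
    if isinstance(term, str):
        return env[term]
    name, *args = term
    return ops[name](*(evaluate(a, ops, env) for a in args))

def cyclic_monoid(n):
    """Operations of the additive monoid on {0, ..., n-1}, type (0, 2)."""
    return {"0": lambda: 0, "+": lambda a, b: (a + b) % n}

def multiple(k, var):
    """The term var + var + ... + var with k occurrences of var (k >= 1)."""
    term = var
    for _ in range(k - 1):
        term = ("+", term, var)
    return term

# One defining term for the new unary operation; 5x equals -x mod 6 and mod 3.
s = multiple(5, "x")

Z6, Z3 = cyclic_monoid(6), cyclic_monoid(3)

def g(a):
    """A homomorphism between the base algebras: reduction mod 3."""
    return a % 3

def new_op(ops):
    """The unary operation that the term s defines on a base algebra."""
    return lambda a: evaluate(s, ops, {"x": a})

minus6, minus3 = new_op(Z6), new_op(Z3)

# g respects the base operations ...
g_is_hom = (g(Z6["0"]()) == Z3["0"]() and
            all(g(Z6["+"](a, b)) == Z3["+"](g(a), g(b))
                for a in range(6) for b in range(6)))
# ... and therefore also the operation defined by the term s.
g_preserves_new_op = all(g(minus6(a)) == minus3(g(a)) for a in range(6))
print(g_is_hom, g_preserves_new_op)   # True True
```

The second check needs no separate argument: an operation given by a term is built from the base operations, so any map respecting the base operations respects it as well.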

5. Construction of equational maps

The conventions, agreed upon at the beginning of Section 4, shall be kept in effect throughout Section 5. Let $(s_k \mid k \in K)$ be a sequence of elements of $u(P^1)$. A non-empty algebra $B$ of type $\Delta^1$ is called admissible for $(s_k \mid k \in K)$ if $\beta^*(m_k) \in \mathrm{Supp}(e_B(s_k))$ for every $k \in K$; a class $\mathbf{B}$ of algebras is called admissible if each of its elements is admissible. If $B = (u(B), (f_i^B \mid i \in I))$ is admissible for $(s_k \mid k \in K)$, an algebra $\Psi_1(B) = A = (u(B), (h_j^A \mid j \in J))$ of type $\Delta$ may be defined by $h_i^A = f_i^B$ if $i \in I$, $h_k^A = s_k^{B,m_k}$ if $k \in K$. If $\mathbf{B}$ is a class of non-empty algebras of type $\Delta^1$, admissible for $(s_k \mid k \in K)$, then the class $\mathbf{A}$ of all algebras $\Psi_1(B)$ for $B \in \mathbf{B}$ will be called the class constructed from $\mathbf{B}$ and the sequence $(s_k \mid k \in K)$. From now on, let $\mathbf{B}$, $\mathbf{C}$ be classes of non-empty algebras of type $\Delta^1$, $\Delta^2$ respectively.
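The construction of $\Psi_1(B)$ from $B$ and a sequence of terms can be sketched for a finitary example. The Python code below is an assumption-laden illustration: the base algebra is the two-element algebra with operations `and` and `not`, the index set $K$ is taken to be {`or`, `neg`}, and the defining terms are the De Morgan term for `or` and the term `not x` for `neg`. The helpers `evaluate` and `construct_mixed` are hypothetical names, and the support condition of admissibility is not modelled, since every term here uses only finitely many named variables.

```python
from itertools import product

def evaluate(term, ops, env):
    """Evaluate a term: a variable name (str) or a tuple (op_name, *subterms)."""
    if isinstance(term, str):
        return env[term]
    name, *args = term
    return ops[name](*(evaluate(a, ops, env) for a in args))

def construct_mixed(base_ops, defining_terms):
    """Return base_ops extended by one operation per (name, variables, term)
    triple, each term being interpreted in the base algebra."""
    mixed = dict(base_ops)
    for name, variables, term in defining_terms:
        def op(*values, _vars=variables, _term=term):
            return evaluate(_term, base_ops, dict(zip(_vars, values)))
        mixed[name] = op
    return mixed

universe = (0, 1)
# Base algebra B, with index set I = {"and", "not"}.
B = {"and": lambda a, b: a & b, "not": lambda a: 1 - a}
# The sequence of defining terms, indexed by K = {"or", "neg"}.
s = [
    ("or", ("x", "y"), ("not", ("and", ("not", "x"), ("not", "y")))),
    ("neg", ("x",), ("not", "x")),
]

A = construct_mixed(B, s)                     # plays the role of Psi_1(B)
phi_B = {k: A[k] for k in ("or", "neg")}      # the K-reduct, i.e. phi(B)

# The I-operations are taken over unchanged.
I_part_unchanged = all(A[i] is B[i] for i in B)
# Each defining equation holds in the mixed algebra.
equations_hold = all(
    evaluate(term, A, dict(zip(vs, vals))) == A[name](*vals)
    for name, vs, term in s
    for vals in product(universe, repeat=len(vs))
)
or_is_join = all(phi_B["or"](a, b) == (a | b)
                 for a, b in product(universe, repeat=2))
print(I_part_unchanged, equations_hold, or_is_join)   # True True True
```

Taking the $K$-reduct at the end mirrors the relation $\phi(B) = \Psi_1(B)|K$ used in Section 4.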

LEMMA 11. (a) If $B$ is a non-empty algebra of type $\Delta^1$, admissible for $(s_k \mid k \in K)$, then, for every $k \in K$, $(s_k, h_k(\beta_k))$ holds in $\Psi_1(B)$. (b) If $G$ is a non-empty algebra of type $\Delta$ and if, for every $k \in K$, $(s_k, h_k(\beta_k))$ holds in $G$, then $G|I$ is admissible for $(s_k \mid k \in K)$ and $G = \Psi_1(G|I)$.

For let $B$ be given in (a) and define $A = \Psi_1(B)$. Since $A|I = B$, Lemma 7 implies $s_k^{A,m_k} = s_k^{B,m_k}$, whence $s_k^{A,m_k} = h_k^A = (h_k(\beta_k))^{A,m_k}$. Thus $(s_k, h_k(\beta_k))$ holds in $A$. Now let $G$ be given in (b) and define $B = G|I$. Since $e_B(s_k) = e_G(s_k) = e_G(h_k(\beta_k))$, one has $\beta^*(m_k) \in \mathrm{Supp}(e_B(s_k))$; hence $B$ is admissible and $A = \Psi_1(B)$ can be defined. Since the equations $(s_k, h_k(\beta_k))$ hold both in $G$ and $A$, one has $s_k^{G,m_k} = h_k^G$ and $s_k^{A,m_k} = h_k^A$. Since $s_k \in u(P^1)$ and $B = A|I = G|I$, Lemma 7 implies $s_k^{G,m_k} = s_k^{B,m_k} = s_k^{A,m_k}$, whence $h_k^G = h_k^A$ for every $k \in K$. This proves $G = A$. For the following considerations it will be useful to remember that $\mathrm{rank}(\Delta^1) \leq \mathrm{rank}(\Delta)$ and $\mathrm{rank}(\Delta^2) \leq \mathrm{rank}(\Delta)$. Hence, if a class $\mathbf{B}$ is equational, the defining set $M$ of equations has to be taken from an algebra $P^{11}$ of type $\Delta^1$, absolutely freely generated by a set $X_1$ such that $\mathrm{card}(X_1) = \mathrm{rank}(\Delta^1)$. Therefore, $X_1$ may be chosen such that $X_1 \subseteq X$ and the algebra $P^{11}$ may be taken to be the subalgebra of $P^1$, generated by $X_1$. Obviously, $\mathbf{B}$ then is also defined by the set $M$ of $\Delta^1$-$X$-$P^1$-equations. Conversely, if a set $N$ of

$\Delta^1$-$X$-$P^1$-equations is given, the class $\mathbf{B}$ defined by $N$ is primitive and, therefore, definable by a set $M$ of $\Delta^1$-$X_1$-$P^{11}$-equations. However, if $\mathrm{rank}(\Delta^1) < \mathrm{rank}(\Delta)$ then no uniform construction of $M$ from $N$ is available.

THEOREM 2. Let $\mathbf{B}$ be admissible for $(s_k \mid k \in K)$, let $\mathbf{A}$ be the class constructed from $\mathbf{B}$ and $(s_k \mid k \in K)$ and let $\Psi_1$ be the map from $\mathbf{B}$ onto $\mathbf{A}$. Then (a) $\Psi_1$ is an equational equivalence from $\mathbf{B}$ onto $\mathbf{A}$, defined by $\{(s_k, h_k(\beta_k)) \mid k \in K\}$. (b) If $\mathbf{B}$ is strictly equational and defined by a set $M$ of $\Delta^1$-$X$-$P^1$-equations, then $\mathbf{A}$ is strictly equational and defined by $M \cup \{(s_k, h_k(\beta_k)) \mid k \in K\}$. (c) The strict equational closure $\mathrm{Es}(\mathbf{B})$ of $\mathbf{B}$ is admissible for $(s_k \mid k \in K)$, and the strict equational closure $\mathrm{Es}(\mathbf{A})$ is the class constructed from $\mathrm{Es}(\mathbf{B})$ and $(s_k \mid k \in K)$, i.e. $\mathrm{Es}(\mathbf{A}|I) = \mathrm{Es}(\mathbf{A})|I$.

Obviously, (a) is an immediate consequence of Lemma 11. Let $\mathbf{B}$ be given in (b), let $B$ be in $\mathbf{B}$ and define $A = \Psi_1(B)$. Since $e_B = e_{A|I} = e_A \restriction u(P^1)$, the equations from $M$ also hold in $A$. On the other hand, let $G$ be a non-empty algebra of type $\Delta$ in which the equations from $M$ as well as all $(s_k, h_k(\beta_k))$, $k \in K$, hold. Then the equations from $M$ also hold in $G|I$, whence $G|I \in \mathbf{B}$. Consequently, $\Psi_1(G|I) \in \mathbf{A}$, but $\Psi_1(G|I) = G$ by Lemma 11. For a proof of (c), let $F$ be the algebra $P^1/Q(\mathbf{B})$. Since $Q(F) = Q(\mathbf{B}) = Q(\mathrm{Es}(\mathbf{B}))$, Lemma 4 implies $\beta^*(m_k) \in \mathrm{Supp}(e_F(s_k))$ and, therefore, $\beta^*(m_k) \in \mathrm{Supp}(e_D(s_k))$ for every $D \in \mathrm{Es}(\mathbf{B})$ and every $k \in K$. Hence the equations defining $\Psi_1$ also define an equational equivalence $\bar{\Psi}_1$ from $\mathrm{Es}(\mathbf{B})$ onto the class $\mathbf{K}$ of all non-empty algebras of type $\Delta$ defined by $Q(\mathbf{B}) \cup \{(s_k, h_k(\beta_k)) \mid k \in K\}$. Since $\bar{\Psi}_1$ coincides with $\Psi_1$ on $\mathbf{B}$, one has $\mathbf{A} \subseteq \mathbf{K}$ and, therefore, $\mathrm{Es}(\mathbf{A}) \subseteq \mathbf{K}$. It now will be sufficient to show that $Q(\mathbf{A}) = Q(\mathbf{K})$. First, $\mathbf{A} \subseteq \mathbf{K}$ implies $Q(\mathbf{K}) \subseteq Q(\mathbf{A})$. In order to prove the other inclusion, assume $(v, w) \in Q(\mathbf{A})$. Since Lemma 9 can be applied to the equational equivalence $\bar{\Psi}_1$, there exist $v^1$, $w^1$ in $u(P^1)$ such that $(v^1, v) \in Q(\mathbf{K})$, $(w^1, w) \in Q(\mathbf{K})$.
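Here $Q(\mathbf{K}) \subseteq Q(\mathbf{A})$ implies $(v^1, v) \in Q(\mathbf{A})$, $(w^1, w) \in Q(\mathbf{A})$. Since $Q(\mathbf{A})$ is a transitive relation, it follows that $(v^1, w^1) \in Q(\mathbf{A})$. Since $\mathbf{B} = \mathbf{A}|I$, Lemma 6 gives $(v^1, w^1) \in Q(\mathbf{B})$. Hence $Q(\mathbf{B}) \subseteq Q(\mathbf{K})$ implies $(v^1, w^1) \in Q(\mathbf{K})$. Since also $Q(\mathbf{K})$ is transitive, one obtains $(v, w) \in Q(\mathbf{K})$.

THEOREM 3. If $\phi$ is an equational map from $\mathbf{B}$ into $\mathbf{C}$, then $\phi$ can be extended to an equational map $\bar{\phi}$ from $\mathrm{Es}(\mathbf{B})$ into $\mathrm{Es}(\mathbf{C})$, defined by the same equations as $\phi$. If $\phi$ is an equational equivalence from $\mathbf{B}$ onto $\mathbf{C}$, then $\bar{\phi}$ is an equational equivalence from $\mathrm{Es}(\mathbf{B})$ onto $\mathrm{Es}(\mathbf{C})$.

For let $\mathbf{A}$ be the class of mixed algebras determined by $\phi$ and let $\phi$ be defined by $\{(s_k, h_k(\beta_k)) \mid k \in K\}$ where $s_k \in u(P^1)$. Since these equations hold in every $A \in \mathbf{A}$ and since $\mathbf{B} = \mathbf{A}|I$, it follows from Lemma 11 (b) that $\mathbf{B}$ is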

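admissible for $(s_k \mid k \in K)$ and that $\mathbf{A}$ is the class constructed from $\mathbf{B}$ and $(s_k \mid k \in K)$. Moreover, the equational equivalence $\Psi_1$ from $\mathbf{B}$ onto $\mathbf{A}$ determined by $\phi$ is the same as the equivalence determined by $(s_k \mid k \in K)$ according to Theorem 2 (a), since these equivalences are defined by the same equations. Thus $\Psi_1$ can be extended to an equational equivalence $\bar{\Psi}_1$ from $\mathrm{Es}(\mathbf{B})$ onto $\mathrm{Es}(\mathbf{A})$. Since $\mathbf{A}|K \subseteq \mathbf{C}$ implies $\mathrm{Es}(\mathbf{A}|K) \subseteq \mathrm{Es}(\mathbf{C})$ and since $\mathrm{Es}(\mathbf{A})|K \subseteq \mathrm{Es}(\mathbf{A}|K)$ by Lemma 6, $\bar{\phi}$ may be defined by $\bar{\phi}(B) = \bar{\Psi}_1(B)|K$ for $B \in \mathrm{Es}(\mathbf{B})$. If $\phi$ is an equational equivalence from $\mathbf{B}$ onto $\mathbf{C}$ and $\Psi_2$ is the equational equivalence from $\mathbf{C}$ onto $\mathbf{A}$ determined by $\phi^{-1}$, then $\Psi_2$ can be extended to an equational equivalence $\bar{\Psi}_2$ from $\mathrm{Es}(\mathbf{C})$ onto $\mathrm{Es}(\mathbf{A})$, and one obtains $\bar{\phi} = \bar{\Psi}_2^{-1} \bar{\Psi}_1$. An immediate consequence is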

COROLLARY 1. Let $\mathbf{B}$ and $\mathbf{C}$ be equationally equivalent classes of algebras. If $\mathbf{B}$ is strictly equational then so is $\mathbf{C}$.

It follows from Theorem 3 and Theorem 1 that it does not depend on the chosen $\Delta$-coordinate system whether a map $\phi$ from $\mathbf{B}$ into $\mathbf{C}$ is equational or not. For if $\phi$ is equational with respect to a certain $\Delta$-coordinate system, it may be extended to an equational map $\bar{\phi}$ from $\mathrm{Es}(\mathbf{B})$ into $\mathrm{Es}(\mathbf{C})$. Now $\bar{\phi}$ is functorial and $\mathrm{Es}(\mathbf{B})$ contains algebras, $\mathrm{Es}(\mathbf{B})$-freely generated by arbitrary sets. Hence $\bar{\phi}$ is equational also with respect to every other $\Delta$-coordinate system, and so is $\phi$. Further, Theorem 3 may be used in order to obtain defining equations for an equational map in a more economical way:

COROLLARY 2. If $\phi$ is an equational map from $\mathbf{B}$ into $\mathbf{C}$ then $\phi$ can be defined by equations $\{(s_k, h_k(\beta_k)) \mid k \in K\}$ such that, for every $k \in K$, the element $s_k$ belongs to the subalgebra $P_k^1$ of $P^1$, generated by $\beta^*(m_k)$.

By Theorem 3 it will be sufficient to prove this in case $\mathbf{B}$ is strictly equational; further, it can be assumed that $\mathbf{B}$ is not singular. Now let $k$ be in $K$ and define $Y = \beta^*(m_k)$; let $R$ and $R^1$ be the subalgebras of $P$ and $P^1$ respectively, generated by $Y$, whence $R^1 = P_k^1$. Since $P_k^1/Q(\mathbf{B}, Y)$ is $\mathbf{B}$-freely generated by a set equipotent with $Y$, the assumptions of Lemma 10 are satisfied for the equational, and therefore functorial, map $\phi$. Since $h_k(\beta_k) \in u(R)$, an element $s_k \in u(R^1)$ can be found such that $(s_k, h_k(\beta_k))$ belongs to $Q(\mathbf{A}, Y) \subseteq Q(\mathbf{A})$, where $\mathbf{A}$ is the class of mixed algebras determined by $\phi$. Another application of Theorem 3 is

COROLLARY 3. Let $\phi$ be an equational equivalence from $\mathbf{B}$ onto $\mathbf{C}$. Then (a) if $B \in \mathbf{B}$ and $B$ is generated by a non-empty set $Y$ then $\phi(B)$ is generated by $Y$; (b) if $B \in \mathbf{B}$ and $B$ is $\mathbf{B}$-freely generated by a non-empty set $Y$ then

$\phi(B)$ is $\mathbf{C}$-freely generated by $Y$; (c) if $B \in \mathbf{B}$ and $B$ is functionally free for $\mathbf{B}$ then $\phi(B)$ is functionally free for $\mathbf{C}$.

Observe first that an algebra, $\mathbf{B}$-freely generated by $Y$, is also $\mathrm{Es}(\mathbf{B})$-freely generated by $Y$; likewise, an algebra functionally free for $\mathbf{B}$ is also functionally free for $\mathrm{Es}(\mathbf{B})$. Therefore one may assume that $\mathbf{B}$ and $\mathbf{C}$ are strictly equational. Now let $B$ be given in (a) and let $D$ be the subalgebra of $\phi(B)$, generated by $Y$. Since $\mathbf{C} = \mathrm{Es}(\mathbf{C})$ is closed with respect to non-empty subalgebras, $D$ belongs to $\mathbf{C}$. Hence $\phi^{-1}(D)$ is a subalgebra of $B$, containing $Y$, which implies $\phi^{-1}(D) = B$, $D = \phi(B)$. Further, (b) follows from (a) and the fact that $\phi$ is functorial. Finally, let $B$ be given in (c). Then $\mathbf{B}$ consists of the non-empty algebras in HSPIS($\{B\}$). Since $\mathbf{C}$ is primitive, the non-empty algebras in HSPIS($\{\phi(B)\}$) form a subclass of $\mathbf{C}$. Since $\phi$ is also functorial, $\phi$ maps $\mathbf{B}$ onto this subclass of $\mathbf{C}$. On the other hand, $\phi$ maps $\mathbf{B}$ onto $\mathbf{C}$. Hence $\mathbf{C}$ consists of the non-empty algebras in HSPIS($\{\phi(B)\}$), i.e. $\phi(B)$ is functionally free for $\mathbf{C}$. Let $(s_k \mid k \in K)$ be a sequence of elements of $u(P^1)$, let $\mathbf{B}$ be admissible for $(s_k \mid k \in K)$, let $\mathbf{A}$ be the class constructed from $\mathbf{B}$ and $(s_k \mid k \in K)$ and let $\Psi_1$ be the equational equivalence from $\mathbf{B}$ onto $\mathbf{A}$. For $B \in \mathbf{B}$ define $\phi(B) = \Psi_1(B)|K$, and let $\mathbf{C}$ be the class of all algebras $\phi(B)$ for $B \in \mathbf{B}$. The sequence $(s_k \mid k \in K)$ is called complete with respect to $\mathbf{B}$ if $\phi$ is an equational equivalence from $\mathbf{B}$ onto $\mathbf{C}$. By Theorem 3 completeness with respect to $\mathbf{B}$ entails completeness with respect to $\mathrm{Es}(\mathbf{B})$.

LEMMA 12. Let $\mathbf{B}$ contain an algebra $B$, functionally free for $\mathbf{B}$. Then $(s_k \mid k \in K)$ is complete with respect to $\mathbf{B}$ if and only if $u(H(B, X)) = u(H(\phi(B), X))$.

Here it is clear that completeness of $(s_k \mid k \in K)$ implies already $u(H(D, X)) = u(H(\phi(D), X))$ for every $D \in \mathbf{B}$. Assume now that $B \in \mathbf{B}$ is functionally free for $\mathbf{B}$ and that $u(H(B, X)) = u(H(\phi(B), X))$.
Since $\Psi_1$ is equational, also $A = \Psi_1(B)$ is functionally free for $\mathbf{A}$ and $u(H(B, X)) = u(H(A, X))$ holds; hence $u(H(A, X)) = u(H(A|K, X))$. Now Lemma 8 gives the existence of a sequence $(t_i \mid i \in I)$ in $u(P^2)$ such that, for every $i \in I$, the equation $(h_i(\beta_i), t_i)$ holds in $A$ and, therefore, in every $G \in \mathbf{A}$. Hence $\beta^*(n_i) \in \mathrm{Supp}(e_G(t_i))$ for every $G \in \mathbf{A}$ and every $i \in I$; since $e_G(t_i) = e_{G|K}(t_i)$ one obtains that $\mathbf{C}$ is admissible for $(t_i \mid i \in I)$. Let $\Psi_2$ be the equational equivalence from $\mathbf{C}$ onto the class $\mathbf{K}$ constructed from $\mathbf{C}$ and $(t_i \mid i \in I)$. Since the equations $\{(h_i(\beta_i), t_i) \mid i \in I\}$ hold in every $G \in \mathbf{A}$, Lemma 11 (b) gives $G = \Psi_2(G|K)$ for every $G \in \mathbf{A}$, i.e. $\mathbf{A} = \mathbf{K}$. Therefore $\phi = \Psi_2^{-1} \cdot \Psi_1$ is a bijection and, moreover, an equational equivalence.
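The two sides of the criterion in Lemma 12 can be compared directly for small finite algebras. The Python sketch below is a hedged illustration: it computes all binary term operations of an algebra and of the algebra obtained from a sequence of defining terms, as a finite analogue of comparing $u(H(B, X))$ with $u(H(\phi(B), X))$ at arity 2. It does not verify functional freeness or admissibility, and `term_operations` as well as the chosen example algebras are assumptions of this sketch.

```python
from itertools import product

def term_operations(universe, ops, arity):
    """Return the value tables of all term operations of the given arity.

    ops is a list of (arity, function) pairs. The set is generated from the
    projections by closing under pointwise application of the operations.
    """
    points = list(product(universe, repeat=arity))
    found = {tuple(p[i] for p in points) for i in range(arity)}  # projections
    changed = True
    while changed:
        changed = False
        for op_arity, f in ops:
            # Iterate over a snapshot so that new tables can be added safely.
            for args in product(list(found), repeat=op_arity):
                table = tuple(f(*(a[j] for a in args))
                              for j in range(len(points)))
                if table not in found:
                    found.add(table)
                    changed = True
    return found

# Case 1: B has (and, not); the new operations are (or, not), with "or"
# given by the De Morgan term. The original operations can be recovered.
bits = (0, 1)
and_not = [(2, lambda a, b: a & b), (1, lambda a: 1 - a)]
or_not = [(2, lambda a, b: a | b), (1, lambda a: 1 - a)]
bool_B = term_operations(bits, and_not, 2)
bool_phiB = term_operations(bits, or_not, 2)

# Case 2: B is addition mod 3; the single new operation is doubling, given
# by the term x + x. Addition cannot be recovered from doubling.
z3 = (0, 1, 2)
plus = [(2, lambda a, b: (a + b) % 3)]
double = [(1, lambda a: (2 * a) % 3)]
z3_B = term_operations(z3, plus, 2)
z3_phiB = term_operations(z3, double, 2)

print(bool_B == bool_phiB, len(bool_B))          # True 16
print(z3_phiB < z3_B, len(z3_B), len(z3_phiB))   # True 9 4
```

In the first case the two sets coincide, which matches the situation in which terms $t_i$ for the original operations exist. In the second case the inclusion is proper, so no such terms can exist.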


W. FELSCHER

A rather peculiar criterion for completeness is given by

LEMMA 13. Let $\mathbf{B}$ be admissible for $(s_k \mid k \in K)$. Then $(s_k \mid k \in K)$ is complete with respect to $\mathbf{B}$ if there exists a function $(i(k) \mid k \in K)$ from $K$ onto $I$ and if, for every $k \in K$, there exists an automorphism $g_k$ of $P^1$ such that $g_k(f_{i(k)}(\beta_{i(k)})) = s_k$.

Observe first that $g_k \restriction X$ is a bijection of $X$ onto $X$, because $P^1$ is absolutely freely generated by $X$. Hence $(g_k \restriction X)^{-1}$ induces automorphisms $p_k$, $p_k^1$, $p_k^2$ of $P$, $P^1$, $P^2$ respectively such that $p_k^1 = p_k \restriction u(P^1)$, $p_k^2 = p_k \restriction u(P^2)$ and $p_k^1 = g_k^{-1}$. Since $h_k(\beta_k) \in u(P^2)$, it follows that the element $t_{i(k)} = p_k^2(h_k(\beta_k))$ lies in $u(P^2)$. Further $p_k^1(s_k) = f_{i(k)}(\beta_{i(k)})$ holds. Since $(s_k, h_k(\beta_k))$ holds in every $A \in \mathbf{A}$, also $(f_{i(k)}(\beta_{i(k)}), t_{i(k)})$ holds in every $A \in \mathbf{A}$ for every $k \in K$. Now let $(k(i) \mid i \in I)$ be a function from $I$ into $K$ such that $i(k(i)) = i$ for every $i \in I$ (the axiom of choice may have to be used here). Defining $t_i = t_{i(k(i))}$, one obtains a sequence $(t_i \mid i \in I)$ of elements of $u(P^2)$ such that $(f_i(\beta_i), t_i)$ holds in every $A \in \mathbf{A}$ for every $i \in I$. Now the same reasoning as in the proof of Lemma 12 can be applied.

Let $(s_k \mid k \in K)$ be a sequence of elements of $u(P^1)$; let $\mathbf{B}$ be the class of all algebras of type $\Delta^1$ which are admissible for $(s_k \mid k \in K)$; let $\Psi_1$ be the equational equivalence from $\mathbf{B}$ onto the class $\mathbf{A}$ constructed from $\mathbf{B}$ and $(s_k \mid k \in K)$. A function $g$ from $u(P)$ into $u(P^1)$ is called reductive for $(s_k \mid k \in K)$ if, for every $r \in u(P)$, the equation $(r, g(r))$ belongs to $Q(\mathbf{A})$. It follows from Lemma 9 that reductive functions always exist. In case $\mathbf{B}$ is the class of all non-empty algebras of type $\Delta^1$, the algebra $D$, considered in Lemma 9, simply becomes $\Psi_1(P^1)$, and the proof then can be simplified considerably.

THEOREM 4. Let $\varphi$ be an equational equivalence from $\mathbf{B}$ onto $\mathbf{C}$, given by equations $\{(s_k, h_k(\beta_k)) \mid k \in K\}$ for $\varphi$ and $\{(h_i(\beta_i), t_i) \mid i \in I\}$ for $\varphi^{-1}$. Let $g$ be a function from $u(P)$ into $u(P^2)$, reductive for $(t_i \mid i \in I)$. Let $\mathbf{B}$ be strictly equational and defined by a set $M$ of $\Delta^1$-$X$-$P^1$-equations. Let $g^*(M)$ be the set of all $\Delta^2$-$X$-$P^2$-equations $(g(u), g(v))$ for $(u, v) \in M$.
Then $\mathbf{C}$ consists precisely of the non-empty algebras $C$ of type $\Delta^2$ such that (i) $C$ is admissible for $(t_i \mid i \in I)$, (ii) the equations from $g^*(M)$ hold in $C$, (iii) the equations $\{(g(s_k), h_k(\beta_k)) \mid k \in K\}$ hold in $C$.

Since, due to Lemma 3, property (i) can also be expressed with the help of equations, one obtains in this way a set of defining equations for $\mathbf{C}$. For a proof, let $\mathbf{A}$ be the class of mixed algebras determined by $\varphi$. Since $\mathbf{A}$ is also the class constructed from $\mathbf{B}$ and $(s_k \mid k \in K)$, $\mathbf{A}$ is strictly equational and defined by $M \cup \{(s_k, h_k(\beta_k)) \mid k \in K\}$. On the other hand, $\mathbf{A}$ is the class

EQUATIONAL MAPS


constructed from $\mathbf{C}$ and $(t_i \mid i \in I)$, whence $(g(s_k), s_k)$ for $k \in K$ and $(g(u), u)$, $(g(v), v)$ for $(u, v) \in M$ belong to $Q(\mathbf{A})$. By transitivity then the equations in (ii), (iii) belong to $Q(\mathbf{A})$ and, since $\mathbf{C} = \mathbf{A}|K$, to $Q(\mathbf{C})$. Conversely, let $C$ be a non-empty algebra of type $\Delta^2$ with properties (i), (ii), (iii). Since (i) holds, $\Psi_2(C)$ can be defined; since $C = \Psi_2(C)|K$, it will be sufficient to show that $\Psi_2(C) \in \mathbf{A}$. As $g$ is reductive for $(t_i \mid i \in I)$, in $\Psi_2(C)$ the equations $(g(s_k), s_k)$ for $k \in K$ and $(g(u), u)$, $(g(v), v)$ for $(u, v) \in M$ hold. Since the equations holding in $C$ also hold in $\Psi_2(C)$, one obtains that the equations from $M \cup \{(s_k, h_k(\beta_k)) \mid k \in K\}$ hold in $\Psi_2(C)$.

In concluding this paragraph, a theorem will be formulated for which the type $\Delta^1$ shall be given, while the type $\Delta^2$ is to be determined in a particular way:

THEOREM 5. Let $\mathbf{B}$ be a class of non-empty algebras of type $\Delta^1$. Then an ordinal type $\Delta^2$ and a class $\mathbf{C}$ of non-empty algebras of type $\Delta^2$ can be found such that (i) $\mathbf{B}$ and $\mathbf{C}$ are equationally equivalent; (ii) for every $k \in K$: the ordinal number $m_k$ is a cardinal number; (iii) there exists a bijection $(k(i) \mid i \in I)$ of $I$ onto $K$ such that, for every $i \in I$, $m_{k(i)}$ is the smallest cardinal number $w_i$ for which a set $Y_i \subseteq X$ exists such that $\mathrm{card}(Y_i) = w_i$ and, for every $B \in \mathbf{B}$, $Y_i \in \mathrm{Supp}(e_B(f_i(\beta_i)))$. Moreover, if the type $\Delta^1$ is ordinal, then the injections $\beta_k$, $k \in K$, of the $\Delta$-coordinate system may be chosen such that $\beta_{k(i)} = \beta_i \restriction m_{k(i)}$ for every $i \in I$.

For a proof, let $K$ be such that $I \cap K = \emptyset$ and let there exist a bijection $(k(i) \mid i \in I)$ from $I$ onto $K$. Since $\Delta^1$ is given, there exist fixed injections $\beta_i$ from $n_i$ into $X$ where $\mathrm{card}(X) = \mathrm{rank}(\Delta^1)$. Since $\beta_i^*(n_i) \in \mathrm{Supp}(e_B(f_i(\beta_i)))$ for every $B \in \mathbf{B}$, sets $Y_i$ may be chosen such that $Y_i \subseteq \beta_i^*(n_i)$ and $\mathrm{card}(Y_i) = m_{k(i)}$, where $m_{k(i)}$ is determined in (iii). In order to treat the general case, define $\beta_{k(i)}$ as an arbitrary bijection from $m_{k(i)}$ onto $Y_i$ and define $s_{k(i)} = f_i(\beta_i)$.
Then $\mathbf{B}$ becomes admissible for $(s_{k(i)} \mid i \in I)$, and if $\mathbf{A}$ is the class constructed from $\mathbf{B}$ and $(s_{k(i)} \mid i \in I)$ then $(h_i(\beta_i), h_{k(i)}(\beta_{k(i)}))$ holds in every $A \in \mathbf{A}$. Hence $(s_{k(i)} \mid i \in I)$ is complete with respect to $\mathbf{B}$.

Turning to the case that $\Delta^1$ is an ordinal type, let the sets $Y_i$ again be chosen such that $Y_i \subseteq \beta_i^*(n_i)$ and $\mathrm{card}(Y_i) = m_{k(i)}$. Since $m_{k(i)} \le n_i$, one can define $\beta_{k(i)} = \beta_i \restriction m_{k(i)}$. If $m_{k(i)} = 0$ let $g_i$ be the identical automorphism of $P^1$. Assume now that $m_{k(i)} > 0$. Let $\rho_i$ be a bijection from $(\beta_i^{-1})^*(Y_i)$ onto $m_{k(i)}$; since both these sets are contained in $n_i$, $\rho_i$ may be extended to a bijection $\sigma_i$ of $n_i$ onto itself. Then the bijection $\beta_i \sigma_i \beta_i^{-1}$ of $\beta_i^*(n_i)$ onto itself can be extended to a bijection $\gamma_i$ of $X$ onto itself which maps $Y_i$ onto $\beta_{k(i)}^*(m_{k(i)})$. Now let $g_i$ be the automorphism of $P^1$ induced by $\gamma_i$, and define


$s_{k(i)} = g_i(f_i(\beta_i))$. It now suffices to show that $\mathbf{B}$ is admissible for $(s_{k(i)} \mid i \in I)$, since Lemma 13 then will ensure that $(s_{k(i)} \mid i \in I)$ is complete with respect to $\mathbf{B}$. Let $B$ be in $\mathbf{B}$. Since $\beta_{k(i)}^*(m_{k(i)}) \in \mathrm{Supp}(e_B(s_{k(i)}))$ is clear if $m_{k(i)} = 0$ or if $B$ is singular, assume that $m_{k(i)} > 0$ and let $B$ be not singular. Then $H(B, X)$ is $E(\{B\})$-freely generated by $\{e_x^B \mid x \in X\}$; since $H(B, X)$ itself belongs to $E(\{B\})$, $\gamma_i$ determines an automorphism $g_i^B$ of $H(B, X)$ such that $g_i^B(e_x^B) = e_{\gamma_i(x)}^B$ for $x \in X$. Then $e_B \cdot g_i = g_i^B \cdot e_B$ since these homomorphisms coincide on $X$. Now $g_i^B(e_B(f_i(\beta_i))) = e_B(g_i(f_i(\beta_i))) = e_B(s_{k(i)})$; hence $Y_i \in \mathrm{Supp}(e_B(f_i(\beta_i)))$, i.e. $e_B(f_i(\beta_i)) \in u(H(B, X; Y_i))$, implies that $e_B(s_{k(i)})$ belongs to the image of $u(H(B, X; Y_i))$ under $g_i^B$. Since $Y_i \neq \emptyset$, $H(B, X; Y_i)$ is generated by $\{e_x^B \mid x \in Y_i\}$; hence $g_i^B$ maps $H(B, X; Y_i)$ onto $H(B, X; \gamma_i^*(Y_i)) = H(B, X; \beta_{k(i)}^*(m_{k(i)}))$. Thus $e_B(s_{k(i)}) \in u(H(B, X; \beta_{k(i)}^*(m_{k(i)})))$, i.e. $\beta_{k(i)}^*(m_{k(i)}) \in \mathrm{Supp}(e_B(s_{k(i)}))$.


6. Definable maps and syntactical equivalences

A relational type shall be an ordinal type $\Delta = (n_i \mid i \in I)$ such that 0

For every set $E$, let $R(E) = (u(R(E)), (r_l^E \mid l \in L))$ be an algebra of type $\Delta^L$ such that $u(R(E)) = \mathfrak{P}(E^X)$ and the operations are defined as follows: if $l \in L_0 \cup L_1$, $r_l^E$ is the obvious Boolean operation (i.e. complementation, subjunction or intersection); if $l = (2, Y)$, $r_l^E$ is the $Y$-cocylindrification $t_Y$: if $S \subseteq E^X$ and $\varphi \in E^X$, then $\varphi \in t_Y S$ if and only if, for every $\chi \in E^X$, $\varphi \restriction -Y = \chi \restriction -Y$ implies $\chi \in S$. Further, for every $S \subseteq E^X$ one defines $\mathrm{Supp}(S) \subseteq \mathfrak{P}X$ by $Y \in \mathrm{Supp}(S)$ if and only if $t_{-Y} S = S$, i.e. if, for any $\varphi$, $\chi$ in $E^X$, $\varphi \restriction Y = \chi \restriction Y$ implies that $\varphi \in S$ is equivalent to $\chi \in S$. Then $\mathrm{Supp}(S)$ again is a filter, because $t_Y t_Z = t_{Y \cup Z}$ holds for all $Y$ and $Z$.

From now on, a fixed bijection $\beta$ from $\delta$ onto $X$ shall be given. For every $Y \subseteq X$ let $n_Y$ and $\gamma_Y$ be uniquely determined by the properties $n_Y < \delta$, $\gamma_Y$ bijective from $n_Y$ onto $Y$, and $\beta^{-1} \cdot \gamma_Y$ strictly monotonic from $n_Y$ onto $(\beta^{-1})^*(Y)$ - in particular, if $n < \delta$ and $Y = \beta^*(n)$ then $n_Y = n$, $\gamma_Y = \beta \restriction n$. If $E$ is a set, $S \subseteq E^X$, $Y \in \mathrm{Supp}(S)$, define $S^{(Y)}$ in $E^{n_Y}$ by $\psi \in S^{(Y)}$ if and only if there exists $\varphi \in S$ such that $\psi = \varphi \cdot \gamma_Y$. $S^{(Y)}$ is called the relation of arity $n_Y$ determined by $S$. $S$ can be computed from $S^{(Y)}$ since $\varphi \in S$ if and only if $\varphi \cdot \gamma_Y \in S^{(Y)}$.

Let $At_s$ be the set of all $f_i(\beta \restriction n_i)$ in $At$. There exists a bijective correspondence between the class of all models $A$ of type $\Delta$ and the class of all functions $\pi$ from $At_s$ into sets $u(R(E))$ such that $\beta^*(n_i) \in \mathrm{Supp}(\pi(f_i(\beta \restriction n_i)))$. For if $A = (u(A), (f_i^A \mid i \in I))$ is given, define $\pi$ as a function into $u(R(u(A)))$ by $\varphi \in \pi(f_i(\beta \restriction n_i))$ if and only if $\varphi \cdot \beta \restriction n_i \in f_i^A$; if $\pi$ is given, define $A$ by $u(A) = E$ and $f_i^A = \pi(f_i(\beta \restriction n_i))^{(\beta^*(n_i))}$. Now let $A$ be a model and let $\pi_A$ be the corresponding function into $u(R(u(A)))$. Extend $\pi_A$ to a function $\pi_A'$ from $At$ into $u(R(u(A)))$ by setting $\varphi \in \pi_A'(f_i(\lambda))$ if and only if there exists $\chi \in \pi_A(f_i(\beta \restriction n_i))$ such that $\varphi \cdot \lambda = \chi \cdot \beta \restriction n_i$ (i.e. if and only if $\varphi \cdot \lambda \in f_i^A$). Let
$e_A$ be the extension of $\pi_A'$ in $\mathrm{Hom}(P, R(u(A)))$. If $r \in u(P)$, $\varphi \in u(A)^X$, then $\varphi$ is said to satisfy $r$ if $\varphi \in e_A(r)$. $r$ holds in $A$ if $e_A(r) = u(A)^X$; this is abbreviated by $\Vdash_A r$. If $\mathbf{A}$ is a class of models, then $r$ holds in $\mathbf{A}$ - abbreviated by $\Vdash_{\mathbf{A}} r$ - if $r$ holds in every $A \in \mathbf{A}$. If $A$ is a model, then $\Vdash_A (v \leftrightarrow w)$ is equivalent to $e_A(v) = e_A(w)$; for every formula $r$ one has $\Vdash_A (\forall) r \rightarrow r$. Further, algebraic induction shows that $\mathrm{fr}(r) \in \mathrm{Supp}(e_A(r))$ for every $r \in u(P)$; thus the filter $\mathrm{Supp}(e_A(r))$ has a base of sets of cardinality less than $\delta$. In analogy to Lemma 3, the fact that $Y \in \mathrm{Supp}(e_A(r))$ is equivalent to $\Vdash_A r \leftrightarrow \forall_R r$ where $R = \mathrm{fr}(r) - Y$; if $\dim(\Delta) = \delta = \omega$, then $R$ is finite and $Y \in \mathrm{Supp}(e_A(r))$ is equivalent to $\Vdash_A r \leftrightarrow \forall_{\{x\}} r$ for every $x \in R$.

If $\mathbf{A}$ is a class of models, let $Q(\mathbf{A})$ be the set of all sentences $r$ such that $\Vdash_{\mathbf{A}} r$, and let $Qp(\mathbf{A})$ be the set of all open formulas $r$ such that $\Vdash_{\mathbf{A}} r$. Let


$D(\mathbf{A})$ be the class of all models $D$ such that $\Vdash_D r$ for every $r \in Q(\mathbf{A})$, and let $Dp(\mathbf{A})$ be the class of all models $D$ such that $\Vdash_D r$ for every $r \in Qp(\mathbf{A})$. $D(\mathbf{A})$ then is called the definable closure of $\mathbf{A}$, and $Dp(\mathbf{A})$ may be called the open closure of $\mathbf{A}$. A class $\mathbf{A}$ is called definable if $\mathbf{A} = D(\mathbf{A})$; this is the case if and only if there exists a set $M$ of sentences such that $A \in \mathbf{A}$ is equivalent to $\Vdash_A r$ for all $r \in M$. In this case, $\mathbf{A}$ is said to be defined by $M$. The class $\mathbf{A}$ is called open if $\mathbf{A} = Dp(\mathbf{A})$; this is equivalent to the fact that $\mathbf{A}$ can be defined by a set of open formulas.

The following theorem about the existence of substitutions for languages $P$ will be proved in the appendix: For every function $\eta$ from $X$ into $X$ there exists a function $\mathrm{sub}_\eta$ from $u(P)$ into $u(P)$ such that (i) $\mathrm{sub}_\eta f_i(\lambda) = f_i(\eta \cdot \lambda)$ for $f_i(\lambda) \in At$, (ii) for every $r \in u(P)$, every model $A$ and every $\varphi \in u(A)^X$: $\varphi \in e_A(\mathrm{sub}_\eta r)$ if and only if $\varphi \cdot \eta \in e_A(r)$. It is a consequence of (ii) that, for every $\eta$ and every $A$, $\Vdash_A v \leftrightarrow w$ implies $\Vdash_A \mathrm{sub}_\eta v \leftrightarrow \mathrm{sub}_\eta w$. Making use of substitutions, $Y \in \mathrm{Supp}(e_A(r))$ becomes equivalent to $\Vdash_A r \leftrightarrow \mathrm{sub}_{\eta(R)} r$, where $\eta(R)$ maps $R = \mathrm{fr}(r) - Y$ injectively into the set of elements of $X$ which do not occur in $r$, and $\eta(R)$ is the identity outside of $R$. If $\dim(\Delta) = \delta = \omega$, let $(\eta(x) \mid x \in R)$ be a (finite) sequence of functions $\eta(x)$ in $X^X$ such that $\eta(x)$ maps $x$ into an element of $X$, not occurring in $r$, and otherwise is the identity; then $Y \in \mathrm{Supp}(e_A(r))$ is equivalent to $\Vdash_A r \leftrightarrow \mathrm{sub}_{\eta(x)} r$ for all $x \in R$.

Consider now, in analogy to Section 4, relational types $\Delta^1 = (n_i \mid i \in I)$, $\Delta^2 = (m_k \mid k \in K)$ with a mixed type $\Delta = (p_j \mid j \in J)$; let $\delta$ be such that $\dim(\Delta) < \delta$ and let $\beta$ be a bijection from $\delta$ onto a set $X$. Let $P$ be a $\delta$-$\Delta$-language, defined with help of $X$ and the identity $(h_j \mid j \in J)$ of $J$. Write $h_i = f_i$ if $i \in I$ and $h_k = g_k$ if $k \in K$, and let $At^1$, $At^2$ be the sets of all $f_i(\lambda)$ and $g_k(\lambda)$ in $At$ respectively. Let $P^1$, $P^2$ be the subalgebras of $P$, generated by $At^1$ and $At^2$.
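The cocylindrification $t_Y$ and the filter $\mathrm{Supp}(S)$ introduced at the beginning of this section can be checked on a small finite example. The following Python sketch is only an illustration under simplifying assumptions: $X$ and $E$ are finite, an assignment $\varphi \in E^X$ is encoded as a frozenset of (variable, value) pairs, and the helper names `assignments`, `agree_outside`, `cocyl` and `in_support` are hypothetical and do not come from the text.

```python
from itertools import product

def assignments(X, E):
    """All functions from X into E, each encoded as a frozenset of (variable, value) pairs."""
    X = sorted(X)
    return {frozenset(zip(X, vals)) for vals in product(sorted(E), repeat=len(X))}

def agree_outside(phi, chi, Y):
    """True if phi and chi coincide on every variable that is not in Y."""
    chi_map = dict(chi)
    return all(chi_map[x] == v for x, v in phi if x not in Y)

def cocyl(Y, S, X, E):
    """t_Y S: phi belongs to it iff every chi agreeing with phi outside Y lies in S."""
    everything = assignments(X, E)
    return {phi for phi in everything
            if all(chi in S for chi in everything if agree_outside(phi, chi, Y))}

def in_support(Y, S, X, E):
    """Y is in Supp(S) iff t_{-Y} S = S, i.e. S depends only on the coordinates in Y."""
    return cocyl(set(X) - set(Y), S, X, E) == set(S)

# Toy data (assumed for illustration): two variables, two values.
X = {"x", "y"}
E = {0, 1}
S = {phi for phi in assignments(X, E) if dict(phi)["x"] == 1}        # depends on x only
T = {phi for phi in assignments(X, E) if dict(phi)["x"] <= dict(phi)["y"]}
```

In this toy case `in_support({"x"}, S, X, E)` holds while `in_support({"y"}, S, X, E)` does not, and `cocyl({"x"}, cocyl({"y"}, T, X, E), X, E)` coincides with `cocyl({"x", "y"}, T, X, E)`, which is the identity $t_Y t_Z = t_{Y \cup Z}$ used to show that $\mathrm{Supp}(S)$ is a filter.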
Let $\mathbf{B}$ be a class of models of type $\Delta^1$ and let $\mathbf{C}$ be a class of models of type $\Delta^2$. A function $\varphi$ from $\mathbf{B}$ into $\mathbf{C}$ is a map if $u(B) = u(\varphi(B))$ for every $B \in \mathbf{B}$; a map is an equivalence if it is a bijection from $\mathbf{B}$ onto $\mathbf{C}$. If $\varphi$ is a map, one can define the class $\mathbf{A}$ of all models of type $\Delta$ determined by $\varphi$, and $\varphi$ decomposes into an equivalence $\Psi_1$ from $\mathbf{B}$ onto $\mathbf{A}$, followed by the reduction map from $\mathbf{A}$ into $\mathbf{C}$. A map $\varphi$ is called definable if there exists a sequence $(s_k \mid k \in K)$ in $u(P^1)$ such that, for every $k \in K$, $\Vdash_{\mathbf{A}} s_k \leftrightarrow h_k(\beta \restriction m_k)$; in that case, $\varphi$ is said to be defined by these formulas. If, moreover, the $s_k$ can be chosen to be open formulas, then $\varphi$ is called open. An equivalence $\varphi$ is called definable resp. open if both $\varphi$ and $\varphi^{-1}$ are so. Making use of substitutions, one sees easily that a map $\varphi$ is definable already if there exists a sequence $(s_k \mid k \in K)$ in $u(P^1)$ and a sequence $(M_k \mid k \in K)$ of sets $M_k \subseteq X$ such that, for every model $B \in \mathbf{B}$, one has $e_B(s_k)^{(M_k)} = g_k^C$ where $\varphi(B) = C = (u(C), (g_k^C \mid k \in K))$. As in


the case of algebras, a definable (or open) map is uniquely determined by its defining formulas, and a definable map $\varphi$ on a class $D(\mathbf{B})$ (resp. an open map on a class $Dp(\mathbf{B})$) is uniquely determined by its restriction $\varphi \restriction \mathbf{B}$.

Let $\varphi$ be a map from $\mathbf{B}$ into $\mathbf{C}$, definable by $(s_k \mid k \in K)$; a statement analogous to Lemma 9 can be proved. Define a function $g_1$ from $At$ into $u(P^1)$ as follows. If $r \in At^1$, put $g_1(r) = r$; if $r = h_k(\beta \restriction m_k)$, put $g_1(r) = s_k$. If $r = h_k(\lambda)$ and $\lambda \in X^{m_k}$, define $\eta$ in $X^X$ by $\eta \restriction \beta^*(m_k) = \lambda \cdot (\beta \restriction m_k)^{-1}$ and $\eta \restriction (X - \beta^*(m_k))$ the identity; then put $g_1(r) = \mathrm{sub}_\eta s_k$. Let $g$ be the extension of $g_1$ in $\mathrm{Hom}(P, P^1)$. Then $\Vdash_{\mathbf{A}} r \leftrightarrow g(r)$ holds for every $r \in u(P)$. For a proof, it is sufficient to show that, for every $A \in \mathbf{A}$, the homomorphisms $e_A$ and $e_A \cdot g$ from $P$ into $R(u(A))$ coincide on $At$. This is obviously the case for elements $r \in At^1$ or $r = h_k(\beta \restriction m_k)$. But if $r = h_k(\lambda)$ then $r = \mathrm{sub}_\eta(h_k(\beta \restriction m_k))$ with the $\eta$ defined above; hence $\Vdash_A r \leftrightarrow g(r)$ since $\Vdash_A h_k(\beta \restriction m_k) \leftrightarrow s_k$ implies $\Vdash_A \mathrm{sub}_\eta(h_k(\beta \restriction m_k)) \leftrightarrow \mathrm{sub}_\eta s_k$. - It should be noted that $g$ does not necessarily map sentences into sentences again. However, if the formulas $s_k$ are open, then $g$ preserves open formulas.

The definition of models $B$, admissible for a sequence $(s_k \mid k \in K)$, can be taken verbatim from Section 5. Further, Lemma 11 remains in effect together with its proof, if only phrases like "$(s_k, h_k(\beta_k))$ holds in $G$" are replaced by $\Vdash_G s_k \leftrightarrow h_k(\beta \restriction m_k)$. Likewise, Theorem 2 remains true if equational maps, defining equations and equational closures are replaced by definable maps, defining sentences and definable closures; moreover, if the sequence $(s_k \mid k \in K)$ consists of open formulas, then open maps, defining open formulas and open closures can be considered. This translation is obvious for parts (a) and (b) of Theorem 2. As for (c), remember first that together with $r$ also $\mathrm{sub}_{\eta(R)} r$ is open; hence if $\mathbf{B}$ is admissible and the $s_k$ are open, then also $Dp(\mathbf{B})$ is admissible.
Consider now the case of definability and sentences. Let $\Psi_1$ be the equivalence from $\mathbf{B}$ onto $\mathbf{A}$ and let $\overline{\Psi}_1$ be its extension to an equivalence from $D(\mathbf{B})$ onto $\mathbf{K}$, where $\mathbf{K}$ is defined by $Q(\mathbf{B})$ together with the sentences $(\forall)\, s_k \leftrightarrow h_k(\beta \restriction m_k)$. Again, it has to be shown that $Q(\mathbf{A}) \subseteq Q(\mathbf{K})$. Let $g$ be a homomorphism from $P$ into $P^1$ such that $\Vdash_{\mathbf{K}} r \leftrightarrow g(r)$ for every $r \in u(P)$. If $r \in Q(\mathbf{A})$, then $e_A(gr) = e_A(r) = u(A)^X$ for every $A \in \mathbf{A}$. Hence the universal closure $(\forall)\, gr$ belongs to $Q(\mathbf{A})$, and since $g(r) \in u(P^1)$ it belongs already to $Q(\mathbf{B})$. Then $Q(\mathbf{B}) \subseteq Q(\mathbf{K})$ implies $(\forall)\, gr \in Q(\mathbf{K})$. Thus one obtains for every $K \in \mathbf{K}$ that $u(K)^X = e_K((\forall)\, gr) = e_K(gr) = e_K(r)$, i.e. $r \in Q(\mathbf{K})$.

Since Theorem 2 is available now, also Theorem 3 together with Corollary 1 hold in the new situation. Further, Theorem 4 can be taken over without change. Although Lemma 13 and the algebraic machinery in the proof of Theorem 5 break down, Theorem 5 itself remains essentially in effect, and


this even in the strong form that the equivalence from $\mathbf{B}$ onto $\mathbf{C}$ can be chosen as open. However, in order to avoid the case $m_{k(i)} = 0$, it has to be assumed that for every $i \in I$ there exists at least one $B \in \mathbf{B}$ such that not $\emptyset \in \mathrm{Supp}(e_B(f_i(\beta \restriction n_i)))$. Define then the functions $\rho_i$, $\sigma_i$ and $\gamma_i$ as in the case of ordinal types of algebras. For every $i \in I$, define $s_{k(i)} = \mathrm{sub}_{\gamma_i} f_i(\beta \restriction n_i) = f_i(\gamma_i \cdot \beta \restriction n_i) = f_i(\beta \cdot \sigma_i)$. Then $\mathbf{B}$ is admissible for $(s_{k(i)} \mid i \in I)$, since $Y_i \in \mathrm{Supp}(e_B(f_i(\beta \restriction n_i)))$ implies $\gamma_i^*(Y_i) \in \mathrm{Supp}(e_B(\mathrm{sub}_{\gamma_i} f_i(\beta \restriction n_i)))$, $\gamma_i^*(Y_i) = \beta^*(m_{k(i)})$. Since the $s_{k(i)}$ are open, there exists an open map $\varphi$ from $\mathbf{B}$ onto a class $\mathbf{C}$ of models of type $\Delta^2$; let $\mathbf{A}$ be the class of models of type $\Delta$ determined by $\varphi$. Since $\Vdash_{\mathbf{A}} s_{k(i)} \leftrightarrow h_{k(i)}(\beta \restriction m_{k(i)})$ for every $i \in I$, also $\Vdash_{\mathbf{A}} \mathrm{sub}_{\gamma_i^{-1}} s_{k(i)} \leftrightarrow \mathrm{sub}_{\gamma_i^{-1}} h_{k(i)}(\beta \restriction m_{k(i)})$, but $\mathrm{sub}_{\gamma_i^{-1}} s_{k(i)} = f_i(\beta \restriction n_i)$. Define $t_i = \mathrm{sub}_{\gamma_i^{-1}} h_{k(i)}(\beta \restriction m_{k(i)})$ for $i \in I$; then $\mathbf{C}$ becomes admissible for $(t_i \mid i \in I)$. Let $\Psi_2$ be the open equivalence from $\mathbf{C}$ onto the class $\mathbf{K}$ constructed from $\mathbf{C}$ and $(t_i \mid i \in I)$. Since $\Vdash_A f_i(\beta \restriction n_i) \leftrightarrow t_i$ for every $A \in \mathbf{A}$, the analogue of Lemma 11 gives again $A = \Psi_2(A|K)$ for every $A \in \mathbf{A}$, i.e. $\mathbf{A} = \mathbf{K}$. Thus $\varphi = \Psi_2^{-1} \cdot \Psi_1$ is a bijection and, therefore, an open equivalence.

It is clear that definable maps and equivalences may as well be studied for relational systems which, in addition to relations, also carry a sequence of operations. However, aside from notational complications nothing new seems to arise in that case. For let $(\Delta^1_T, \Delta^1)$ be the pair, formed from the algebraic type $\Delta^1_T$ and the relational type $\Delta^1$ of a class $\mathbf{B}$ of such relational systems. A map $\varphi$ from $\mathbf{B}$ into a class $\mathbf{C}$ with types $(\Delta^2_T, \Delta^2)$ will be definable if not only the relations belonging to $\Delta^2$ are definable but also, for every $\Delta^2_T$-operation $p$ of arity $n$, the $(n+1)$-ary relation $p(\beta \restriction n) = \beta(n)$ becomes definable by $\Delta^1$-formulas.
And in general, nothing more about the definability of $\Delta^2_T$-operations can be said. If, however, in a fortunate situation some $\Delta^2_T$-operations are definable already by $\Delta^1_T$-terms, then the relationship between these operations and terms can be described completely with the algebraic techniques of Sections 4 and 5. For let $\mathbf{B}^T$ be the class of algebras, underlying the relational systems in $\mathbf{B}$, and let $\mathbf{C}^T$ be the class of those algebras which are reducts, with respect to the algebraically definable operations, of the algebras underlying the relational systems in $\mathbf{C}$. Then $\varphi$, being a map, establishes an equational map from $\mathbf{B}^T$ into $\mathbf{C}^T$.

The methods developed so far can be used in order to deal with so-called syntactical transformations. Let $\Delta$ be a fixed relational type; let $\Delta^{L1}$, $\Delta^{L2}$ be types of algebras and let $\Delta^{L0}$ be the mixed type determined by $\Delta^{L1}$, $\Delta^{L2}$. Let $\delta$ be an infinite regular cardinal number such that $\dim(\Delta) < \delta$, $\dim(\Delta^{L0}) < \delta$, and let $\beta$ be a bijection from $\delta$ onto a set $X$. Define the set $At$ as before; let $P^{L0}$ be an algebra of type $\Delta^{L0}$, absolutely freely generated


by $At$, and let $P^{L1}$, $P^{L2}$ be the subalgebras generated by $At$ in the reducts of $P^{L0}$ with respect to $\Delta^{L1}$, $\Delta^{L2}$. Assume now that, for every set $E$, the set $\mathfrak{P}(E^X)$ carries a well-defined algebra $R^1(E)$ of type $\Delta^{L1}$. Let $\varphi$ be an equational equivalence from the class of all algebras $R^1(E)$ onto a class of algebras of type $\Delta^{L2}$; put $\varphi(R^1(E)) = R^2(E)$ and $\Psi_1(R^1(E)) = R^0(E)$. Since the models of type $\Delta$ are in a bijective correspondence with certain functions $\pi$ from $At_s$ into the sets $u(R^1(E)) = u(R^2(E)) = u(R^0(E))$, every model $A$ determines homomorphisms $e_A^1$, $e_A^2$, $e_A^0$ from $P^{L1}$, $P^{L2}$, $P^{L0}$ into $R^1(u(A))$, $R^2(u(A))$, $R^0(u(A))$ such that $e_A^0 \restriction u(P^{L1}) = e_A^1$, $e_A^0 \restriction u(P^{L2}) = e_A^2$. Let $g^1$, $g^2$ be two reductive functions determined by $\varphi$, $g^1$ from $P^{L0}$ into $P^{L1}$ and $g^2$ from $P^{L0}$ into $P^{L2}$; $g^1$, $g^2$ then are called syntactical transformations between $P^{L1}$ and $P^{L2}$. If $r \in u(P^{L0})$ then the equations $(r, g^1 r)$, $(r, g^2 r)$ hold in every algebra $R^0(E)$; hence $e_A^0(r) = e_A^1(g^1 r)$, $e_A^0(r) = e_A^2(g^2 r)$ for every $A$, i.e. $\Vdash_A r \leftrightarrow g^1 r$, $\Vdash_A r \leftrightarrow g^2 r$. Consequently for every $r \in u(P^{L1})$ and every $s \in u(P^{L2})$ also $\Vdash_A r \leftrightarrow g^1 g^2 r$ and $\Vdash_A s \leftrightarrow g^2 g^1 s$.

Let $r$ be in $u(P^{L0})$ and $M \subseteq u(P^{L0})$. For every model $A$, define $M \Vdash_A r$ by $\bigcap (e_A^0(m) \mid m \in M) \subseteq e_A^0(r)$. It follows immediately that $M \Vdash_A r$ implies $(g^2)^*(M) \Vdash_A g^2 r$ and $(g^1)^*(M) \Vdash_A g^1 r$. Define the closure operator $Cs$ on $\mathfrak{P}u(P^{L0})$ by $r \in Cs(M)$ if, for every model $A$, $M \Vdash_A r$. Let $Cs^1$ and $Cs^2$ be the closure operators on $\mathfrak{P}u(P^{L1})$ and $\mathfrak{P}u(P^{L2})$, induced by $Cs$. Then, for every $M \subseteq u(P^{L1})$ and every $N \subseteq u(P^{L2})$, $(g^2)^* Cs^1(M) \subseteq Cs^2((g^2)^* M)$, $(g^1)^* Cs^2(N) \subseteq Cs^1((g^1)^* N)$, $Cs^1(M) = Cs^1((g^1 g^2)^* M)$, $Cs^2(N) = Cs^2((g^2 g^1)^* N)$ hold. A certain inconvenience arises now from the fact that $g^2$, say, will not always transform closed sets into closed ones. As a remedy, define for every $s \in u(P^{L2})$ a unary operation $h_s^2$ on $u(P^{L2})$ by $h_s^2(v) = s$ if $v = g^2 g^1 s$, $h_s^2(v) = v$ otherwise.
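The closure operator $Cs$ and the repair operations $h_s^2$ can be illustrated on a toy example. The sketch below is not the construction of the text: it assumes a single finite "model" given directly by a table of extensions, a handful of formula names, and a function `t` that merely stands in for the composite $g^2 g^1$ (all that is used is that `t(s)` has the same extension as `s`). All names are hypothetical.

```python
def consequences(M, formulas, models):
    """Cs(M): every r such that, in each model, the intersection of the
    extensions of the members of M is contained in the extension of r."""
    result = set()
    for r in formulas:
        holds_everywhere = True
        for universe, ext in models:
            common = set(universe)
            for m in M:
                common &= ext[m]
            if not common <= ext[r]:
                holds_everywhere = False
                break
        if holds_everywhere:
            result.add(r)
    return result

def make_h(s, t):
    """The unary operation h_s: it sends t(s) back to s and fixes every other formula."""
    return lambda v: s if v == t(s) else v

# Toy data (assumed): four assignments 0..3 and a table of extensions.
UNIVERSE = {0, 1, 2, 3}
EXT = {"p": {2, 3}, "q": {1, 3}, "p&q": {3}, "q&p": {3}, "top": {0, 1, 2, 3}}
FORMULAS = sorted(EXT)
MODELS = [(UNIVERSE, EXT)]
SWAP = {"p&q": "q&p"}          # stand-in for g2 g1: an equivalent but different formula

def t(s):
    return SWAP.get(s, s)

def closed_under_all_h(N):
    """Check that N is closed under every operation h_s."""
    return all(make_h(s, t)(v) in N for s in FORMULAS for v in N)
```

Here `consequences({"p"}, FORMULAS, MODELS)` is `{"p", "top"}`, and every set of the form `consequences(M, FORMULAS, MODELS)` passes `closed_under_all_h`, because `t(s)` and `s` always have the same extension.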
Obviously, sets closed with respect to $Cs^2$ will be closed with respect to all operations $h_s^2$. If $M \subseteq u(P^{L1})$ and $Cs^1(M) = M$, let $N$ be the set of all values under operations $h_s^2$ of elements in $(g^2)^*(M)$. Then $N$ is closed with respect to $Cs^2$ and, therefore, $N = Cs^2((g^2)^* M)$.

Let $\dim(Cs^1)$ be the smallest infinite regular cardinal number $m$ such that, for every $M$ and $r$, $r \in Cs^1(M)$ implies the existence of a set $M'$ such that $M' \subseteq M$, $\mathrm{card}(M') < m$ and $r \in Cs^1(M')$; $\dim(Cs^1)$ exists since the successor of $\mathrm{card}(u(P^{L1}))$ may be chosen for an $m$. Likewise define $\dim(Cs^2)$. Then $\dim(Cs^1) = \dim(Cs^2)$ holds. For assume $r \in Cs^1(M)$, whence $g^2 r \in Cs^2((g^2)^* M)$. Let $N$ be such that $N \subseteq (g^2)^*(M)$, $\mathrm{card}(N)$

Let $F$ be an algebra of a type $\Delta^F$ such that $u(F) = u(P^{L1})$. $F$ is called an axiomatization of $Cs^1$ at a set $M \subseteq u(P^{L1})$ if $Cs^1(M) = [M]^F$; $F$ is called an axiomatization of $Cs^1$ if $F$ is an axiomatization of $Cs^1$ at every set $M$. In this situation, the values of constant operations of $F$ are called axioms, and the non-constant operations are called rules of derivation. It follows from well known facts about closure operators that there are always axiomatizations $F$ of $Cs^1$ such that $\dim(\Delta^F) = \dim(Cs^1)$. It will be shown now that, given syntactical transformations $g^1$, $g^2$ and an axiomatization $F$ of $Cs^1$ (an axiomatization $F$ of $Cs^1$ at the empty set), there is an explicit method to transform $F$ into an axiomatization $G$ of $Cs^2$ (an axiomatization $G$ of $Cs^2$ at the empty set) such that $\dim(\Delta^F) = \dim(\Delta^G)$. For every $r \in u(P^{L1})$ define a unary operation $h_r^1$ on $u(P^{L1})$ by $h_r^1(u) = r$ if $u = g^1 g^2 r$, $h_r^1(u) = u$ otherwise. If $\lambda \in u(P^{L1})^n$ for some ordinal number $n$, define a unary operation $h_\lambda^1$ on $u(P^{L1})^n$ by $h_\lambda^1(\psi)(m) = h_{\lambda(m)}^1(\psi(m))$ for every $\psi \in u(P^{L1})^n$ and every $m < n$. ... $n_i > 0$, and for every $\lambda \in u(P^{L1})^{n_i}$, $G$ shall have an operation $f_{i,\lambda}^G$ of arity $n_i$, defined by $f_{i,\lambda}^G(\chi) = g^2 f_i^F(h_\lambda^1 \cdot g^1 \cdot \chi)$. (Instead of the unary operations one might have introduced constant operations $s \leftrightarrow g^2 g^1 s$ for every $s \in u(P^{L2})$ and added a binary operation, yielding the modus ponens.)

The following observations will be useful. Assume that $N \subseteq u(P^{L2})$. Then $g^2$ maps $M = [(g^1)^* N]^F$ into $[N]^G$. For if $r = g^1 s$ and $s \in N$, then $g^2 g^1 s \in [N]^G$; further $g^2 a \in [N]^G$ for every value $a$ of a constant operation of $F$. Assume now that $\psi \in M^{n_i}$ and that already $g^2 \cdot \psi \in ([N]^G)^{n_i}$. Then $f_{i,\psi}^G(g^2 \cdot \psi) \in [N]^G$, but $f_{i,\psi}^G(g^2 \cdot \psi) = g^2 f_i^F(h_\psi^1 \cdot g^1 \cdot g^2 \cdot \psi) = g^2 f_i^F(\psi)$. - If, in addition, $[(g^1)^* N]^F = Cs^1((g^1)^* N)$, then $g^1$ maps $[N]^G$ into $M = [(g^1)^* N]^F$. Namely $g^1 g^2 a \in Cs^1(M)$ for every set $M$; further, if $s \in [N]^G$ and $g^1 s \in M$, then also $g^1 g^2 g^1 s \in Cs^1(M)$ and $g^1 h_t^2(s) \in Cs^1(M)$ for every $t \in u(P^{L2})$. Finally, assume that $\chi \in ([N]^G)^{n_i}$ and that already $g^1 \cdot \chi \in M^{n_i}$.
Then $M = Cs^1(M)$ implies $h_\lambda^1 \cdot g^1 \cdot \chi \in M^{n_i}$ for every $\lambda$, and $M = [M]^F$ implies $f_i^F(h_\lambda^1 \cdot g^1 \cdot \chi) \in M$. Hence $M = Cs^1(M)$ implies $g^1 g^2 f_i^F(h_\lambda^1 \cdot g^1 \cdot \chi) \in M$, but $g^1 g^2 f_i^F(h_\lambda^1 \cdot g^1 \cdot \chi) = g^1 f_{i,\lambda}^G(\chi)$.

In order to prove the assertions about $G$, it will be sufficient to show that, if $F$ is an axiomatization of $Cs^1$ at $(g^1)^*(N)$, then $G$ is an axiomatization of $Cs^2$ at $N$. Assume therefore $[(g^1)^* N]^F = Cs^1((g^1)^* N)$. Then $g^1$ maps $Cs^2(N)$ into $Cs^1((g^1)^* N)$ and, by the earlier observation, $g^2$ maps $[(g^1)^* N]^F$ into $[N]^G$. Thus $s \in Cs^2(N)$ implies $g^2 g^1 s \in [N]^G$, whence $s \in [N]^G$ since $G$ contains the


operation $h_s^2$. Therefore $Cs^2(N) \subseteq [N]^G$ holds. On the other hand, by the other observation $g^1$ maps $[N]^G$ into $[(g^1)^* N]^F = Cs^1((g^1)^* N)$, and $g^2$ maps $Cs^1((g^1)^* N)$ into $Cs^2((g^2 g^1)^* N)$. Thus $s \in [N]^G$ implies $g^2 g^1 s \in Cs^2((g^2 g^1)^* N)$. Since $Cs^2(N) = Cs^2((g^2 g^1)^* N)$ and since $Cs^2(N)$ is closed with respect to $h_s^2$, one obtains $s \in Cs^2(N)$, i.e. $[N]^G \subseteq Cs^2(N)$.

Syntactical transformations can be studied, in particular, with respect to the known (and finitary) axiomatizations of first-order logic. It is in this case that axiomatizations of $Cs$ at the empty set become important, because here usually an axiomatization at the empty set is given first, and is only afterwards amended in order to obtain an axiomatization at every set. Also, it should be pointed out that the standard axiomatizations have an additional property in that they are formal. Namely, a rule $f$, yielding say modus ponens, may be conceived as constructed by the following process. Consider an algebra $Q$ of type $\Delta^L$, absolutely freely generated by a set $Y$, and prescribe a sequence $(y_0, y_0 \rightarrow y_1, y_1)$ of elements of $Q$. Define $f$ for $r$, $s$ in $u(P^L)$ as follows: if there exists a homomorphism $h$ from $Q$ into $P^L$ such that $h(y_0) = r$, $h(y_0 \rightarrow y_1) = s$, then $f(r, s) = h(y_1)$; otherwise $f(r, s) = r$. Without pursuing this matter any further, it may be remarked that in the above transformation of $F$ into $G$ every formal rule $f_i^F$ can be transformed into a formal rule $f_i^G$ which comprehends all the rules $f_{i,\lambda}^G$.

Appendix

In this appendix, the existence of general substitutions will be proved. In generalization of the setting in Section 6, languages with terms and models with operations will be admitted. So let $\Delta^T$ be an ordinal type and let $\Delta$ be a relational type. A model $A$ of type $(\Delta^T, \Delta)$ is an ordered pair $(a(A), b(A))$ such that $a(A)$ is an algebra of type $\Delta^T$, $b(A)$ is a relational system of type $\Delta$, and $u(a(A)) = u(b(A))$; this set is written as $u(A)$. Let $\delta$ be an infinite regular cardinal number such that $\dim(\Delta^T) < \delta$ and $\dim(\Delta) < \delta$; let $\beta$ be a bijection from $\delta$ onto a set $X$. Let $T$ be an algebra of type $\Delta^T$, absolutely freely generated by $X$. If $\Delta = (n_i \mid i \in I)$ let $(f_i \mid i \in I)$ be the identity on $I$ and define $At$ as the set of all ordered pairs $(f_i, \lambda) = f_i(\lambda)$ where $\lambda \in u(T)^{n_i}$. Let $P = (u(P), (r_l \mid l \in L))$ be the algebra of type $\Delta^L$, absolutely freely generated by $At$. On $u(P)$ one defines recursively the ordinal-valued function $\deg$ such that $\deg(r) = 0$ if $r \in At$ and $\deg(r_l(p)) = \sup(\deg(p(m)) + 1 \mid m \in q_l)$ if $p \in u(P)^{q_l}$; this can be made precise by introducing suitable operations on the set $\delta$. There will be occasion to give proofs by induction and definitions by recursion on $\deg$; a particular case is the induction on the number of


quantifiers in usual first-order logic. In a similar recursive way, one defines functions $\mathrm{part}$ from $u(T)$ into $\mathfrak{P}u(T)$ and from $u(P)$ into $\mathfrak{P}u(T) \cup \mathfrak{P}u(P)$: for $t \in u(T)$, $\mathrm{part}(t)$ shall be the set of all subterms of $t$, and for $r \in u(P)$, $\mathrm{part}(r)$ shall be the set of all subformulas and subterms of subformulas of $r$. The only important convention here is that, for a formula $\forall_Y r$, one has $\mathrm{part}(\forall_Y r) = Y \cup \mathrm{part}(r)$. Finally, one defines the function $\mathrm{fr}$ from $u(P)$ into $\mathfrak{P}X$, assigning to every $r$ the set of variables free in $r$. It follows from the choice of $\delta$ that, for every formula $r$, $\mathrm{card}(\mathrm{part}(r))$

Y n p a r t ( q ( x ) ) = O , i.e. p a r t ( q ( x ) ) c - Y, whence h;q r f r ( r ) = h ; q r.fr(r) for every x. Since, moreover,fr( r ) E - Y ,one obtains $ rfr( r ) = h; q If.( r ) = h;qY t f r ( r ) = h ; q y f r ( r ) for every $ and every x. Thusevery $ determines a x such that x r - Y = q r - Y and h ; q y r f r ( u ) = $ r f r ( u ) : define x r Y = $ Y. Conversely, every x determines a b,t such that $ Y=h;q 1- Y and h ; q y fr(u)=$ r f r ( u ) : define $ 1 Y=x Y.


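(ii) Let $\eta$ be such that, for some $Y \subseteq X$, $\bar\eta = \eta \restriction Y$ is a bijection on a set $Z \subseteq X$, while $\eta \restriction -Y$ is the identity. Let $r$ be a formula such that $Z \subseteq -\mathrm{part}(r)$. Then $x \in \mathrm{fr}(\mathrm{rep}_\eta r)$ iff (not $x \in Y$ and $x \in \mathrm{fr}(r)$) or ($x \in Z$ and $\bar\eta^{-1}(x) \in \mathrm{fr}(r)$).

Proof. Since $Z \subseteq -\mathrm{part}(r)$ implies $Z \subseteq -\mathrm{part}(w)$ for $w \in \mathrm{part}(r)$, induction in the set of subformulas of $r$ can be applied. The statement is true for $r \in At$ and carries through under the sentential operations. Assume now $r = \forall_W u$. Then $x \in \mathrm{fr}(\mathrm{rep}_\eta r)$ iff not $x \in W$ and $x \in \mathrm{fr}(\mathrm{rep}_{\eta_W} u)$. Here $\eta_W$ determines a bijection from $Y - W$ onto $Z - \eta^*(Y \cap W)$; hence induction gives $x \in \mathrm{fr}(\mathrm{rep}_{\eta_W} u)$ iff (not $x \in Y - W$ and $x \in \mathrm{fr}(u)$) or ($x \in Z - \eta^*(Y \cap W)$ and $\bar\eta^{-1}(x) \in \mathrm{fr}(u)$). Since $W \subseteq \mathrm{part}(r)$ implies $Z \subseteq -W$, one obtains $x \in \mathrm{fr}(\mathrm{rep}_\eta r)$ iff (not $x \in Y$ and not $x \in W$ and $x \in \mathrm{fr}(u)$) or ($x \in Z$ and not $\bar\eta^{-1}(x) \in W$ and $\bar\eta^{-1}(x) \in \mathrm{fr}(u)$) iff (not $x \in Y$ and $x \in \mathrm{fr}(r)$) or ($x \in Z$ and $\bar\eta^{-1}(x) \in \mathrm{fr}(r)$).

(iii) Let $\eta$, $r$ satisfy the assumptions of (ii). Then for every model $A$: $e_A(\forall_Y r) = e_A(\forall_Z \mathrm{rep}_\eta r)$.

Proof. Assume $\forall_W u \in \mathrm{part}(r)$ and $x \in \mathrm{fr}(\forall_W u)$. Since $W \subseteq \mathrm{part}(r)$ implies $Z \cap W = \emptyset$, it follows from not $x \in W$ that not $\eta(x) \in W$. Thus $r$, $\eta$ are compatible. Now $\varphi \in e_A(\forall_Y r)$ iff for every $\chi$, $\chi \restriction -Y = \varphi \restriction -Y$ implies $\chi \in e_A(r)$; likewise $\varphi \in e_A(\forall_Z \mathrm{rep}_\eta r)$ iff for every $\psi$, $\psi \restriction -Z = \varphi \restriction -Z$ implies $\psi \in e_A(\mathrm{rep}_\eta r)$, i.e. $h_\psi \cdot \eta \in e_A(r)$. Then $h_\psi \cdot \eta \restriction -Y = \psi \restriction -Y$ and $Z \subseteq -\mathrm{part}(r)$ shows that $h_\psi \cdot \eta \restriction (\mathrm{fr}(r) - Y) = \varphi \restriction (\mathrm{fr}(r) - Y)$. Thus every $\psi$ determines a $\chi$ such that $\chi \restriction -Y = \varphi \restriction -Y$ and $\chi \restriction \mathrm{fr}(r) = h_\psi \cdot \eta \restriction \mathrm{fr}(r)$: define $\chi \restriction Y = h_\psi \cdot \eta \restriction Y$. Conversely, every $\chi$ determines a $\psi$ such that $\psi \restriction -Z = \varphi \restriction -Z$ and $\chi \restriction \mathrm{fr}(r) = h_\psi \cdot \eta \restriction \mathrm{fr}(r)$: define $\psi \restriction Z = \chi \cdot \bar\eta^{-1} \restriction Z$.

Let $\eta$ be in $u(T)^X$ and let $r$ be a formula. Define $j(r, \eta)$ to be the smallest ordinal number $\alpha < \delta$ such that, for every $\alpha'$, $\alpha \le \alpha' < \delta$, one has (j) not $\beta(\alpha') \in \mathrm{part}(r)$, and (jj) if $x \in \mathrm{fr}(r)$ then not $\beta(\alpha') \in \mathrm{part}(\eta(x))$.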
Since $\delta$ was chosen large enough, $j(r, \eta)$ always exists. If $Y \subseteq X$, $\mathrm{card}(Y) < \delta$, define $\eta_{(r,Y)}$ in $X^X$ such that $\eta_{(r,Y)} \restriction -Y$ is the identity, while $\eta_{(r,Y)}(y) = \beta(j(r, \eta) + \beta^{-1}(y))$ for $y \in Y$. Obviously, $\eta_{(r,Y)} \restriction Y$ is a bijection onto a set $Z \subseteq X$ such that $Z \subseteq -\mathrm{part}(r)$.


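For every $\eta$, define by recursion on $\deg$ a function $\mathrm{tut}_\eta$ from $u(P)$ into $u(P)$: if $r \in At$ then $\mathrm{tut}_\eta r = r$; let $\mathrm{tut}_\eta$ be homomorphic with respect to the sentential operations; define $\mathrm{tut}_\eta(\forall_Y r) = \forall_Z\, \mathrm{tut}_\eta\, \mathrm{rep}_{\eta_{(r,Y)}} r$, where $Z$ is the image of $Y$ under $\eta_{(r,Y)}$. Obviously, $\deg(r) = \deg(\mathrm{tut}_\eta r)$ for every formula $r$.

(iv) For every $\eta$, $r$ and every model $A$: $e_A(\mathrm{tut}_\eta r) = e_A(r)$.

Proof. Apply induction on $\deg$. The statement is trivial for $r \in At$ and carries through under the sentential operations. Consider now $\forall_Y r$, whence $\mathrm{tut}_\eta(\forall_Y r) = \forall_Z\, \mathrm{tut}_\eta\, \mathrm{rep}_{\eta_{(r,Y)}} r$. The choice of $Z$ ensures that (iii) can be applied; hence $e_A(\forall_Y r) = e_A(\forall_Z\, \mathrm{rep}_{\eta_{(r,Y)}} r)$. However $e_A(\mathrm{rep}_{\eta_{(r,Y)}} r) = e_A(\mathrm{tut}_\eta\, \mathrm{rep}_{\eta_{(r,Y)}} r)$ by induction, which implies $e_A(\forall_Y r) = e_A(\forall_Z\, \mathrm{tut}_\eta\, \mathrm{rep}_{\eta_{(r,Y)}} r)$.

(v) For every $\eta$, $r$: $\mathrm{fr}(\mathrm{tut}_\eta r) = \mathrm{fr}(r)$.

Proof. Apply induction on $\deg$. The statement is trivial for $r \in At$ and carries through under the sentential operations. Consider now $\forall_Y r$. Since $Z \subseteq -\mathrm{part}(r)$, one obtains $x \in \mathrm{fr}(\forall_Z\, \mathrm{tut}_\eta\, \mathrm{rep}_{\eta_{(r,Y)}} r)$ iff not $x \in Z$ and $x \in \mathrm{fr}(\mathrm{tut}_\eta\, \mathrm{rep}_{\eta_{(r,Y)}} r)$ iff not $x \in Z$ and $x \in \mathrm{fr}(\mathrm{rep}_{\eta_{(r,Y)}} r)$ (by induction) iff not $x \in Z$ and not $x \in Y$ and $x \in \mathrm{fr}(r)$ (by (ii)) iff not $x \in Y$ and $x \in \mathrm{fr}(r)$ (since $\mathrm{fr}(r) \subseteq -Z$) iff $x \in \mathrm{fr}(\forall_Y r)$.

(vi) For every $\eta$, $r$: $\mathrm{tut}_\eta r$, $\eta$ are compatible.

Proof. Apply induction on $\deg$. The statement is trivial for $r \in At$ and carries through under the sentential operations. Consider now $\forall_Y r$ and $\mathrm{tut}_\eta(\forall_Y r) = \forall_Z\, \mathrm{tut}_\eta\, \mathrm{rep}_{\eta_{(r,Y)}} r$. Assume that $\forall_W v \in \mathrm{part}(\mathrm{tut}_\eta \forall_Y r)$. If $\forall_W v = \mathrm{tut}_\eta \forall_Y r$, then $W = Z$. Since $\mathrm{fr}(\mathrm{tut}_\eta \forall_Y r) = \mathrm{fr}(\forall_Y r)$ by (v), $x \in \mathrm{fr}(\forall_W v)$ implies $x \in \mathrm{fr}(r)$. Now the choice of $j(r, \eta)$ guarantees that $Z \cap \mathrm{part}(\eta(x)) = \emptyset$; hence $W = Z$ implies $W \cap \mathrm{part}(\eta(x)) = \emptyset$. If, on the other hand, $\forall_W v \neq \mathrm{tut}_\eta \forall_Y r$, then $\forall_W v \in \mathrm{part}(\mathrm{tut}_\eta\, \mathrm{rep}_{\eta_{(r,Y)}} r)$. But induction gives that $\mathrm{tut}_\eta\, \mathrm{rep}_{\eta_{(r,Y)}} r$, $\eta$ are compatible. Hence $x \in \mathrm{fr}(\forall_W v)$ implies $W \cap \mathrm{part}(\eta(x)) = \emptyset$.

At this stage, the desired substitution can be introduced.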
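Namely, for every $\eta$, define the function $\mathrm{sub}_\eta$ from $u(P)$ into $u(P)$ by $\mathrm{sub}_\eta r = \mathrm{rep}_\eta r$ if $r$, $\eta$ are compatible, and $\mathrm{sub}_\eta r = \mathrm{rep}_\eta \mathrm{tut}_\eta r$ otherwise. One obtains

(vii) For every $\eta$, $r$, for every model $A$ and every $\varphi \in u(A)^X$: $\varphi \in e_A(\mathrm{sub}_\eta r)$ iff $h_\varphi \cdot \eta \in e_A(r)$.

Proof. If $r$, $\eta$ are compatible, this follows from (i). Otherwise $e_A(\mathrm{sub}_\eta r) = e_A(\mathrm{rep}_\eta \mathrm{tut}_\eta r)$, $e_A(r) = e_A(\mathrm{tut}_\eta r)$ by (iv). But $\varphi \in e_A(\mathrm{rep}_\eta \mathrm{tut}_\eta r)$ iff $h_\varphi \cdot \eta \in e_A(\mathrm{tut}_\eta r)$ by (i) since $\mathrm{tut}_\eta r$ and $\eta$ are compatible by (vi).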
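The two-step definition of $\mathrm{sub}_\eta$ (replace directly when $r$, $\eta$ are compatible, otherwise first rename the bound variables and then replace) can be sketched in Python for the familiar finitary case. This is an illustration under simplifying assumptions and not the construction of the appendix: quantifiers bind a single variable instead of a set $Y$, fresh variables are drawn from a hypothetical supply `v0, v1, ...` rather than from $\beta(j(r, \eta) + \beta^{-1}(y))$, the compatibility test is a simplified variant, and the tuple encoding of terms and formulas is invented for the example.

```python
import itertools

# Terms: a variable is a str; a compound term is ("fn", name, (t1, ..., tn)).
# Formulas: ("atom", name, (t1, ..., tn)), ("not", r), ("and", r, s), ("all", x, r).

def term_vars(t):
    if isinstance(t, str):
        return {t}
    return set().union(*(term_vars(a) for a in t[2]))

def all_vars(r):
    """Every variable occurring in r, free or bound (a finitary stand-in for part(r))."""
    if r[0] == "atom":
        return set().union(*(term_vars(a) for a in r[2]))
    if r[0] == "not":
        return all_vars(r[1])
    if r[0] == "and":
        return all_vars(r[1]) | all_vars(r[2])
    return {r[1]} | all_vars(r[2])

def free_vars(r):
    if r[0] == "atom":
        return all_vars(r)
    if r[0] == "not":
        return free_vars(r[1])
    if r[0] == "and":
        return free_vars(r[1]) | free_vars(r[2])
    return free_vars(r[2]) - {r[1]}

def term_rep(eta, t):
    if isinstance(t, str):
        return eta.get(t, t)
    return ("fn", t[1], tuple(term_rep(eta, a) for a in t[2]))

def rep(eta, r):
    """Naive replacement of free variables; only safe when r and eta are compatible."""
    if r[0] == "atom":
        return ("atom", r[1], tuple(term_rep(eta, a) for a in r[2]))
    if r[0] == "not":
        return ("not", rep(eta, r[1]))
    if r[0] == "and":
        return ("and", rep(eta, r[1]), rep(eta, r[2]))
    inner = {x: t for x, t in eta.items() if x != r[1]}   # bound variable is left alone
    return ("all", r[1], rep(inner, r[2]))

def compatible(eta, r):
    """No quantifier of r binds a variable of eta(x) for an x free under that quantifier."""
    if r[0] == "atom":
        return True
    if r[0] == "not":
        return compatible(eta, r[1])
    if r[0] == "and":
        return compatible(eta, r[1]) and compatible(eta, r[2])
    clash = any(r[1] in term_vars(eta.get(x, x)) for x in free_vars(r))
    inner = {x: t for x, t in eta.items() if x != r[1]}
    return not clash and compatible(inner, r[2])

def tut(eta, r, avoid=None):
    """Rename every bound variable of r to a fresh one, away from r and from all eta(x)."""
    if avoid is None:
        avoid = all_vars(r) | set(eta)
        for t in eta.values():
            avoid |= term_vars(t)
    if r[0] == "atom":
        return r
    if r[0] == "not":
        return ("not", tut(eta, r[1], avoid))
    if r[0] == "and":
        return ("and", tut(eta, r[1], avoid), tut(eta, r[2], avoid))
    z = next(f"v{n}" for n in itertools.count() if f"v{n}" not in avoid)
    avoid.add(z)
    return ("all", z, tut(eta, rep({r[1]: z}, r[2]), avoid))

def sub(eta, r):
    """sub_eta r = rep_eta r if compatible, else rep_eta tut_eta r."""
    return rep(eta, r) if compatible(eta, r) else rep(eta, tut(eta, r))
```

For $r = \forall y\, R(x, y)$ and $\eta(x) = y$ the naive `rep` would produce $\forall y\, R(y, y)$, capturing the substituted variable, whereas `sub` first renames the bound variable and yields $\forall v_0\, R(y, v_0)$, whose only free variable is $y$.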

EQUATIONAL MAPS

References
1. G. BIRKHOFF, On the structure of abstract algebras, Proc. Cambridge Phil. Soc. 31 (1935) 433-454.
2. P. M. COHN, Universal algebra (New York, Harper and Row, 1965).
3. B. CSÁKÁNY, On the equivalence of certain classes of algebraic systems (Russian), Acta Sci. Math. Szeged 23 (1962) 46-57.
4. B. CSÁKÁNY, On primitive classes of algebras which are equivalent to classes of half-modules and modules (Russian), Acta Sci. Math. Szeged 24 (1963) 157-164.
5. B. CSÁKÁNY, On Abelian properties of primitive classes of universal algebras (Russian), Acta Sci. Math. Szeged 25 (1964) 202-208.
6. K. H. DIENER, On induction and recursion in universal algebra, to appear in: Z. math. Logik und Grundlagen d. Math.
7. W. FELSCHER, Adjungierte Funktoren und primitive Klassen, Sitzber. Heidelberg. Akad. Wiss., Math.-Natw. Kl. no. 4 (1965) 1-65.
8. H. J. HOEHNKE, Zur Strukturgleichheit axiomatischer Klassen, Z. math. Logik und Grundlagen d. Math. 12 (1966) 69-83.
9. H. J. HOEHNKE, Über Modellkorrespondenzen, to appear.
10. C. R. KARP, Languages with expressions of infinite length (Amsterdam, North-Holland Publ. Co., 1964).
11. F. W. LAWVERE, Functorial semantics of algebraic theories, Dissertation (Columbia Univ., New York, 1963).
12. F. E. J. LINTON, Some aspects of equational categories, in: Proc. Conf. on Categorical algebra (Berlin-Heidelberg-New York, Springer-Verlag, 1966) pp. 84-94.
13. A. I. MALCEV, Structural characteristics of certain classes of algebras (Russian), Dokl. Akad. Nauk SSSR 120 (1958) 29-32.
14. J. C. C. MCKINSEY and A. TARSKI, The algebra of topology, Ann. of Math. 45 (1944) 141-191.
15. B. H. NEUMANN, Special topics in algebra, Universal algebra, Lecture notes (New York University, 1962).
16. J. SCHMIDT, Algebraic operations and algebraic independence in algebras with infinitary operations, Math. Japonicae 6 (1962) 77-112.
17. J. SCHMIDT, Die Charakteristik einer allgemeinen Algebra I, Archiv d. Math. 13 (1962) 457-470.
18. J. SCHMIDT, Some properties of algebraically independent sets in algebras with infinitary operations, Fundamenta Math. 55 (1964) 123-137.
19. J. SCHMIDT, Die überinvarianten und verwandte Kongruenzrelationen einer allgemeinen Algebra, Math. Ann. 158 (1965) 131-157.
20. J. SŁOMIŃSKI, The theory of abstract algebras with infinitary operations, Rozprawy Matematyczne 18 (Warszawa, P.W.N., 1959).
21. A. TARSKI, A remark on functionally free algebras, Ann. of Math. 47 (1946) 163-165.

SOME FORMS OF MODELS OF PROPOSITIONAL CALCULI

R. HARROP¹
Simon Fraser University, Burnaby 2, B.C., Canada

1. Introduction

In this paper an investigation is made of a form of model of propositional calculus introduced by Smiley in a discussion of the concept of the independence of connectives in such a calculus [4]. It is shown that the definition used by Smiley is at variance with one interpretation of a statement he makes concerning that definition, and that the conditions he imposes in the definition can be weakened and still give the desired results. Some partial (unpublished) work was done in this direction by Gough in conjunction with the present author (see [2], p. 279), based heavily on the concept of weak model. This work has, however, been abandoned, at least for the present. The alternative independent approach developed in this paper works more from an analysis of the derivability concept than from a study of weak models.

An unsolved problem mentioned at the end of the paper is concerned with whether or not it is possible to use finite models of this new type to prove non-derivability results in cases in which strong models cannot be used. A proof that this was impossible would give a traditional type result, but one which may not be entirely without interest. A proof that it was possible would certainly be of interest.

When this paper was presented in Hannover, the author's attention was drawn to [3]. This reference, which is certainly of interest and slightly related, does not seem to bear directly on the main results of the present paper.

2. Models of propositional calculi

We consider the term propositional calculus to be defined as in § 3 of the survey paper [2]. We use the definitions and notation of that paper for several related concepts. Thus the formulae of the calculus $P$ will be defined using propositional variables $a, b, c, a_1, a_2, \ldots$, and connectives $\sigma_1, \ldots, \sigma_s$ ($s \ge 1$), $\sigma_i$ being $t_i$-ary for some $t_i \ge 1$. We use $A, B, C, A_1, A_2, \ldots$, for variables for arbitrary formulae (vafs) and $U, V, X, Y, Z$, possibly with suffixes, for formulae of $P$ which are arbitrary or satisfy certain specified conditions. Greek letters $\alpha, \beta, \gamma, \delta$, and $\varphi, \psi, \omega$ (possibly with suffixes) stand respectively for positive integers and for formulae schemes. The terms instance of a formula scheme, substituted case of a formula (formula scheme) are defined as in [2].

We suppose $P$ has a finite set $\omega_1, \ldots, \omega_t$ ($t \ge 0$) of axiom schemes, each of which is a formula scheme, and a finite set of rules $R_1, \ldots, R_u$, $R_i$ being of the form

$R_i$: $\varphi_{i1} \ldots \varphi_{ik_i} / \psi_i$ (alternatively printed $\varphi_{i1}, \ldots, \varphi_{ik_i} \vdash \psi_i$)

for some formula schemes $\varphi_{i1}, \ldots, \varphi_{ik_i}$ (premises) and $\psi_i$ (conclusion). We distinguish between an application of a rule and a substituted form of it as in [2].

¹ Invited paper read at Kolloquium über Logik und Grundlagen der Mathematik, 8-12 August, 1966, Hannover. Supported in part by National Research Council Grant A.3024.

The definitions of the sets of provable formulae and provable formula schemes are normal. We say that $Y$ is derivable or deducible from $X_1, \ldots, X_p$ ($p \ge 0$), and write that $X_1, \ldots, X_p \vdash Y$ holds in $P$ (is true in $P$, is deducible in $P$), if $Y$ can be 'proved' from $X_1, \ldots, X_p$ and axioms of $P$ using applications of rules of $P$; $\varphi_1, \ldots, \varphi_p \vdash \psi$ means similarly that the formula scheme $\psi$ can be 'proved' from the schemes $\varphi_1, \ldots, \varphi_p$ and substituted forms of the axiom schemes of $P$ using substituted forms of the rules of $P$. We note that methods stated to be available in the proof of $Y$ from $X_1, \ldots, X_p$ (or $\psi$ from $\varphi_1, \ldots, \varphi_p$) are permissive, not mandatory; they need not all occur in any given case under consideration. We recall, see [2] pp. 276, 277, that, in general context, it is stronger to say that $X_1, \ldots, X_p \vdash Y$ holds in $P$ than it is to say that if $X_1 \ldots X_p / Y$ is added as a rule application to $P$ then there is no change in the set of provable formulae of $P$.

If $X_1, \ldots, X_p \vdash Y$ or $\varphi_1, \ldots, \varphi_p \vdash \psi$ holds in $P$ we call the 'corresponding' rules $X_1 \ldots X_p / Y$, $\varphi_1 \ldots \varphi_p / \psi$ derived rules of $P$. The former is really a rule with a single instance, the latter has 'applications'. Both have 'substituted forms'. In view of our definition any rule of $P$, and any substituted form or application of a rule of $P$, is also a derived rule of $P$. We sometimes find it convenient to use the '$\vdash$' notation for rules or derived rules rather than the '/' notation, and thus could say that $P$ has rules $\varphi_{i1}, \ldots, \varphi_{ik_i} \vdash \psi_i$, $1 \le i \le u$. This is stronger than saying just that $P$ has derived rules $\varphi_{i1}, \ldots, \varphi_{ik_i} \vdash \psi_i$, since it says that $R_i$ ($1 \le i \le u$) are rules actually given in the presentation of $P$ in terms of its formulae, axiom schemes and rules.
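The difference between a derivation holding in $P$ and a rule application being addable without changing the provable formulae can be seen on a toy example. The following is only a minimal sketch under simplifying assumptions: formulae are plain strings, the calculus is given by a finite list of rule applications (a real calculus has infinitely many), and the names `derivable_set`, `holds`, `toy_axioms` and `toy_rules` are hypothetical.

```python
def derivable_set(axioms, rule_applications, premises=()):
    """Return every formula obtainable from `premises` and `axioms`
    by repeatedly using the given rule applications (forward chaining)."""
    known = set(axioms) | set(premises)
    changed = True
    while changed:
        changed = False
        for rule_premises, conclusion in rule_applications:
            if conclusion not in known and all(p in known for p in rule_premises):
                known.add(conclusion)
                changed = True
    return known


def holds(axioms, rule_applications, premises, conclusion):
    """True if `premises |- conclusion` holds in the toy calculus."""
    return conclusion in derivable_set(axioms, rule_applications, premises)


# Hypothetical toy calculus with no axioms and no rules.
toy_axioms = set()
toy_rules = []

# Adding the rule application a / b leaves the provable formulae unchanged ...
provable_before = derivable_set(toy_axioms, toy_rules)
provable_after = derivable_set(toy_axioms, toy_rules + [(("a",), "b")])

# ... although the derivation a |- b does not hold in the original calculus.
a_yields_b = holds(toy_axioms, toy_rules, ["a"], "b")
```

Here `provable_before` and `provable_after` are both empty while `a_yields_b` is false, which is the sense in which the derivability statement is the stronger one.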

Suppose we are given a system $M = (E, D, C_1, \ldots, C_s)$ where $E$ is a non-null set of elements (values), $D$ a subset, possibly null, of $E$ (designated values) and, for each $i$, 1

THEOREM 1. (a) An expression $X_1, \ldots, X_p \vdash Y$ can be shown not to hold in $P$ by means of a finite S-model (that is, by producing a model in which it does not hold) if and only if it can be shown not to hold in $P$ by means of a finite strong model (equivalently, by means of a finite weak model). (b) As (a) with the omission of the three occurrences of the word 'finite'.
Proof. (a) See [2], p. 279¹; (b) consideration of the details of the proof of (a).

We note that Theorems 1a and 1b are independent results in the sense that neither is a special case of the other. If we take $p = 0$ in Theorem 1 we get results concerned with the determination of non-provability of formulae by models. We notice in passing (compare [2]) that we can always test of a given finite structure whether or not it is a finite weak (strong, S-) model of $P$.
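The kind of finite test mentioned here can be sketched for a structure $(E, D, C_1, \ldots, C_s)$. The sketch assumes the usual matrix reading, which is not spelled out in the surviving text: an axiom is valid if it takes a designated value under every assignment, and a rule is strongly satisfied if, under every assignment, designated premises force a designated conclusion. The formula encoding (strings for variables, tuples for compound formulae) and all function names are hypothetical.

```python
from itertools import product


def evaluate(formula, tables, assignment):
    """Value of a formula: a variable name, or a tuple (connective, arg1, ...)."""
    if isinstance(formula, str):
        return assignment[formula]
    connective, *args = formula
    values = tuple(evaluate(arg, tables, assignment) for arg in args)
    return tables[connective][values]


def variables(formula):
    """Set of variable names occurring in a formula."""
    if isinstance(formula, str):
        return {formula}
    found = set()
    for arg in formula[1:]:
        found |= variables(arg)
    return found


def assignments(formulas, elements):
    """All assignments of elements to the variables of the given formulas."""
    names = sorted(set().union(*(variables(f) for f in formulas)))
    for values in product(elements, repeat=len(names)):
        yield dict(zip(names, values))


def is_valid(formula, elements, designated, tables):
    """Assumed reading: designated value under every assignment."""
    return all(evaluate(formula, tables, a) in designated
               for a in assignments([formula], elements))


def rule_preserves_designation(premises, conclusion, elements, designated, tables):
    """Assumed reading of strong satisfaction of a rule."""
    return all(
        evaluate(conclusion, tables, a) in designated
        for a in assignments(list(premises) + [conclusion], elements)
        if all(evaluate(p, tables, a) in designated for p in premises)
    )


# Two-valued example: 1 is designated, "imp" is classical implication.
E = (0, 1)
D = {1}
tables = {"imp": {(x, y): 1 if x <= y else 0 for x in E for y in E}}

axiom_ok = is_valid(("imp", "A", ("imp", "B", "A")), E, D, tables)
modus_ponens_ok = rule_preserves_designation(["A", ("imp", "A", "B")], "B", E, D, tables)
```

Since a calculus has only finitely many axiom schemes and rules, running such checks once per scheme and rule (with vafs treated as variables) terminates for any finite structure.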

3. Modified S-model

In [4] Smiley stated that the closure conditions (i)*, (ii)* in the definition of S-model were just the ones required in order for him to use his models $(E, {}^{-}, C_1, \ldots, C_s)$ for the purposes he wished. It would seem however that with $X_1, \ldots, X_p \vdash Y$ defined as in § 2 (and this would seem to be what he wished), the essential thing he requires is that all derivations which hold in $P$ are satisfied in the model. He uses this in the form that a non-satisfied expression of the type $X_1, \ldots, X_p \vdash Y$ can be asserted immediately not to hold in $P$.

We will say that $M = (E, {}^{-}, C_1, \ldots, C_s)$ is a modified Smiley model of $P$ (mS-model) if and only if, with $E$, ${}^{-}$, $C_1, \ldots, C_s$ having the previously described form (set of elements of the model; function from $2^E$ into $2^E$; functions from $E^{t_i}$ into $E$) and with mS-satisfied and mS-valid being defined like S-satisfied and S-valid (mS-satisfied rule and mS-valid axiom), except that, in the case of the definition of mS-satisfied, values are given to variables of $P$ or vafs of $P$ as appropriate rather than just to vafs of $P$,
(i)** the axioms of $P$ are mS-valid in $M$,
(ii)** $\alpha \in \overline{\{\alpha\}}$ for all $\alpha \in E$,
(iii)** if $X_1, \ldots, X_p \vdash Y$ is mS-satisfied in $M$ then so is $X_1, \ldots, X_p, Z_1, \ldots, Z_q \vdash Y$ for any formulae $X_1, \ldots, X_p, Z_1, \ldots, Z_q, Y$ ($p \ge 0$, $q \ge 0$),
(iv)** if $X_1, \ldots, X_p \vdash Y_i$ is mS-satisfied in $M$, $1 \le i \le q$, and $Y_1 \ldots Y_q / Z$ is an application of a rule of $P$ then $X_1, \ldots, X_p \vdash Z$ is mS-satisfied in $M$.

THEOREM 2. If $M$ is an mS-model of $P$ and $Y_1, \ldots, Y_q \vdash Z$ an application of a rule of $P$ (written in derivation form) then $Y_1, \ldots, Y_q \vdash Z$ is mS-satisfied.

¹ 'E' should be replaced by '#' at two places on line 3 of this page.
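To make the role of the closure operation concrete, the following sketch tests mS-satisfaction in a small finite structure. It rests on an assumption, since the definition of S-satisfaction itself is not in the surviving text: $X_1, \ldots, X_p \vdash Y$ is taken to be satisfied when, under every assignment of values to variables, the value of $Y$ lies in the closure of the set of values of $X_1, \ldots, X_p$. The example structure, the closure function and all names are hypothetical.

```python
from itertools import product


def evaluate(formula, tables, assignment):
    """Value of a formula: a variable name, or a tuple (connective, arg1, ...)."""
    if isinstance(formula, str):
        return assignment[formula]
    connective, *args = formula
    return tables[connective][tuple(evaluate(a, tables, assignment) for a in args)]


def variables(formula):
    """Set of variable names occurring in a formula."""
    if isinstance(formula, str):
        return {formula}
    found = set()
    for arg in formula[1:]:
        found |= variables(arg)
    return found


def ms_satisfied(premises, conclusion, elements, closure, tables):
    """Assumed reading: the conclusion's value lies in the closure of the
    set of premise values, for every assignment of values to variables."""
    formulas = list(premises) + [conclusion]
    names = sorted(set().union(*(variables(f) for f in formulas)))
    for values in product(elements, repeat=len(names)):
        assignment = dict(zip(names, values))
        premise_values = frozenset(evaluate(p, tables, assignment) for p in premises)
        if evaluate(conclusion, tables, assignment) not in closure(premise_values):
            return False
    return True


# Hypothetical structure: E = {1, 2}, value 1 behaves like "true",
# and the closure of a set G adds the value 1 to G.
E = (1, 2)
tables = {"imp": {(x, y): 1 if (x == 2 or y == 1) else 2 for x in E for y in E}}


def closure(values):
    return set(values) | {1}


# Condition (ii)**: every element lies in the closure of its own singleton.
condition_ii = all(x in closure(frozenset({x})) for x in E)

identity_ok = ms_satisfied(["a"], "a", E, closure, tables)
detachment_ok = ms_satisfied(["a", ("imp", "a", "b")], "b", E, closure, tables)
unrelated_ok = ms_satisfied(["a"], "b", E, closure, tables)
```

With no premises the test reduces to asking whether the formula only takes values in the closure of the empty set, which is how validity is read off later in the paper.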

Proof. By (ii)** t 1
THEOREM 3. If $X_1, \ldots, X_n \vdash Y$ holds in $P$ and $M$ is an mS-model of $P$, then $X_1, \ldots, X_n \vdash Y$ is mS-satisfied in $M$.
Proof. From the definition of '$\vdash$' we can prove that $X_1, \ldots, X_n \vdash Y$ holds in $P$ if and only if it can be shown to do so by inductive definition using the following clauses:
(a) if $X$ is an axiom of $P$ then $\vdash X$ holds,
(b) $X \vdash X$ holds for any formula $X$,
(c) if $X_1, \ldots, X_p \vdash Y$ holds, so does $X_1, \ldots, X_p, Z_1, \ldots, Z_q \vdash Y$ for any formulae $X_1, \ldots, X_p, Z_1, \ldots, Z_q, Y$, $p \ge 0$, $q \ge 0$,
(d) if $X_1, \ldots, X_p \vdash Y_i$ holds for $1 \le i \le q$ and $Y_1 \ldots Y_q / Z$ is an application of a rule of $P$ then $X_1, \ldots, X_p \vdash Z$ holds.
The required result follows immediately, since clauses (a)-(d) correspond to conditions (i)**-(iv)** of the definition of mS-model.

We note that it is not true that, given that $M$ is an mS-model of $P$, we can deduce in general from the satisfaction of $X_1, \ldots, X_p \vdash Y$ that $X_1, \ldots, X_p \vdash Y$ will hold in $P$. We also note that we have not proved that if $M$ is a structure of the type necessary for an mS-model of $P$ then, from the fact that $X_1, \ldots, X_p \vdash Y$ holds in $M$ for all derivations $X_1, \ldots, X_p \vdash Y$ which hold in $P$, we can deduce that $M$ is an mS-model of $P$.

THEOREM 4. If $M$ is an S-model of $P$ then it is an mS-model of $P$.
Proof. Considering the definitions of mS-model and of S-model we see that the satisfaction of (i)**, (ii)** follows from the satisfaction of (iii)*, (i)* respectively. The satisfaction of (iii)** follows from that of (ii)*, and the satisfaction of (iv)**, which is concerned with applications of rules, is a consequence of the satisfaction of (iv)*, which is concerned with rules, and of the closure conditions (i)*, (ii)*. (If $\alpha \in \overline{\{\beta_1, \ldots, \beta_q\}}$ and $\beta_i \in \overline{\{\gamma_1, \ldots, \gamma_p\}}$, 1
2 3

We note that 3 ~ ( 2 }but 3 $ n ) . Hence, condition (ii)* is not satisfied and $M$ cannot be an S-model of $P$. Consider the definition of mS-model. The satisfaction of (i)**, (ii)** is immediate. Consider (iii)**. The required result is almost immediate provided we can show that we cannot have formulae $X_1, \ldots, X_p, Y$ with $X_1, \ldots, X_p \vdash Y$ satisfied such that, for some substitution of values for variables, $X_i$ takes the value 2, $1 \le i \le p$, and $Y$ the value 3. If this substitution could arise, then, under the corresponding substitution in which values for variables assigned are unchanged except that variables originally taking the value 2 now take the value 1, $X_1, \ldots, X_p, Y$ would take values $1, \ldots, 1, 3$ respectively, and since $3 \notin \overline{\{1\}}$ we would have a contradiction with the assumed satisfaction by $M$ of $X_1, \ldots, X_p \vdash Y$. Thus (iii)** is satisfied. Using the fact that the only applications of rules of $P$ are of the form $X \vdash Y \to X$ and noting that, for any substitution of values for variables, $Y \to X$ either takes the value 1 or the same value as $X$, we can easily show that (iv)** is satisfied and thus complete the proof of the theorem.

There are many types of models of a propositional calculus. One aspect common to those we have discussed, which many would feel is an essential property associated with the use of the word 'model' (compare [4], p. 434), is that it is effectively possible to determine of a finite structure $M$ and a calculus $P$ whether or not $M$ is a model of $P$. We remember that our calculi possess only finitely many connectives, axiom schemes and rules. It is important, therefore, that we show that this finiteness property, which we have already observed holds trivially for S-models, holds also for mS-models.

THEOREM 6. Given a finite structure $M = (E, {}^{-}, C_1, \ldots, C_s)$ and a propositional calculus $P$ (usual notation) we can effectively determine whether or not $M$ is a finite mS-model of $P$.
Proof. Suppose that $E$ consists of the $n$ elements $1, 2, \ldots, n$. It is immediate that we can check effectively for the satisfaction or otherwise of conditions (i)**, (ii)** of the definition of mS-model, noting that in the case of axioms it is sufficient to consider the particular axioms, one for each scheme, obtained from the axiom schemes by replacing vafs by the corresponding variables.

Consider (iii)**. If there is a case of the satisfaction of $X_1, \ldots, X_p \vdash Y$ and the non-satisfaction of $X_1, \ldots, X_p, Z_1, \ldots, Z_q \vdash Y$ by $M$, then there must be some substitution $\mathcal{S}$ of members of $E$ for the variables in $X_1, \ldots, X_p, Z_1, \ldots, Z_q, Y$ such that $X_1, \ldots, X_p, Z_1, \ldots, Z_q, Y$ take respectively the values $\alpha_1, \ldots, \alpha_p, \gamma_1, \ldots, \gamma_q, \beta$ where $\beta \in \overline{\{\alpha_1, \ldots, \alpha_p\}}$ and $\beta \notin \overline{\{\alpha_1, \ldots, \alpha_p, \gamma_1, \ldots, \gamma_q\}}$. Let $X_1^*, \ldots, X_p^*, Y^*, Z_1^*, \ldots, Z_q^*$ be obtained from $X_1, \ldots, X_p, Y, Z_1, \ldots, Z_q$ by replacing in these formulae, for all $i$, $1 \le i$
$\{f_{U_i} \mid 1 \le i \le r\} = \{f_{U_i^*} \mid 1 \le i \le r^*\}$ and if $f_V = f_{V^*}$ then either both $U_1, \ldots, U_r \vdash V$ and $U_1^*, \ldots, U_{r^*}^* \vdash V^*$ are satisfied by $M$ or neither of them is.

Combining the results of the last two paragraphs, we see that we can effectively determine a finite set $G$ of sets of formulae in $T$, say

$X_{11}, \ldots, X_{1p_1}, Y_1$; $\ldots$; $X_{b1}, \ldots, X_{bp_b}, Y_b$;

such that if $X_1, \ldots, X_p, Y$ are all in $T$, the determination of the satisfaction or otherwise of $X_1, \ldots, X_p \vdash Y$ by $M$ is, from the point of view of associated functions, equivalent to the determination of the satisfaction or otherwise by $M$ of some effectively determinable $X_{i1}, \ldots, X_{ip_i} \vdash Y_i$.

¹ In [1] consideration was restricted to calculi with only unary and binary connectives. This affects certain details of the proof but not its essential content. For example, $2h + 1$ must now be replaced by $\max_{1 \le i \le s}(t_i h) + 1$ and this brings consequential changes with it. Note that '$g(n^{(n^2)})$' in line 2, page 6 of [1] should read '$g(n^{(n^n)})$'.
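The associated functions used in this argument can be illustrated as follows. This is a sketch only: it assumes that the function $f_X$ associated with a formula $X$ is the map from assignments to values which $X$ induces over a fixed list of variables, and it represents $f_X$ by its table. The structure, the formula encoding and all names are hypothetical.

```python
from itertools import product


def evaluate(formula, tables, assignment):
    """Value of a formula: a variable name, or a tuple (connective, arg1, ...)."""
    if isinstance(formula, str):
        return assignment[formula]
    connective, *args = formula
    return tables[connective][tuple(evaluate(a, tables, assignment) for a in args)]


def associated_function(formula, variable_names, elements, tables):
    """Table of the function induced by `formula` over the listed variables."""
    return tuple(
        evaluate(formula, tables, dict(zip(variable_names, values)))
        for values in product(elements, repeat=len(variable_names))
    )


# Hypothetical two-element structure with one binary connective.
E = (1, 2)
tables = {"imp": {(x, y): 1 if (x == 2 or y == 1) else 2 for x in E for y in E}}
names = ("a", "b")

formulas = ["a", "b", ("imp", ("imp", "a", "a"), "a"), ("imp", "a", "b")]

# Group formulae by associated function: formulae in the same group are
# interchangeable as far as satisfaction in the structure is concerned.
groups = {}
for formula in formulas:
    key = associated_function(formula, names, E, tables)
    groups.setdefault(key, []).append(formula)

# There are at most |E| ** (|E| ** number_of_variables) distinct functions.
max_functions = len(E) ** (len(E) ** len(names))
```

Because only finitely many functions exist over a finite structure, infinitely many formulae collapse into finitely many cases, which is what keeps the test for (iii)** effective.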

Now, we can determine effectively for each $i$ whether or not $X_{i1}, \ldots, X_{ip_i} \vdash Y_i$ holds in $M$ (we can always test whether or not a given derivation holds in $M$). Hence, all we need to do in order to check whether or not (iii)** is satisfied is, for each i, 1 < i k where $k$ is the cardinality of the set $T$).

Suppose $Y_1, \ldots, Y_q \vdash Z$ is an application of rule $R$: $\varphi_1, \ldots, \varphi_q \vdash \psi$ where, without loss of generality, $R$ can be written so that it involves exactly the vafs $A_1, \ldots, A_m$, some $m$. Substituting members of $F$ for $A_1, \ldots, A_m$ we see we can find effectively all possibilities $(g_1, \ldots, g_q, h)$ for functions $g_1, \ldots, g_q, h \in F$ such that these functions can occur as the functions associated with $\varphi_1, \ldots, \varphi_q, \psi$ respectively in an application of $R$ which involves only formulae in $T$.² Denote the set of ordered $(q+1)$-tuples which we can get in this way by $\mathcal{P}$. Since given $f \in F$ we can effectively determine some $X \in T$ such that $f_X = f$, we can for each element of $\mathcal{P}$ determine formulae $U_1, \ldots, U_m$ such that when $U_i$ is substituted for $A_i$, $1 \le i \le m$, in $R$, the resulting application of $R$ has the given element of $\mathcal{P}$ as its associated $(q+1)$-tuple. (Not every set of formulae associated with $g_1, \ldots, g_q, h$ need correspond to an application of a rule; all we require is that, by working as we have done through substitution for vafs, we can get at least one set of suitable formulae.)

¹ Although apparently (iii)** has to be considered for infinitely many values of $p$, $q$, our proof can be interpreted to show that one effect of the use of associated functions is that bounds get placed on the values for $p$, $q$ which really need to be considered.
² It must be remembered that the definition of the term 'application of a rule' does not require that the premises be provable formulae (see [2], p. 274). In this sense it might perhaps be better to speak of a 'potential application of a rule'. No real gain would occur by this change. If a premise happened to be 'obviously' unprovable, it might seem even more inappropriate to call the 'application' a 'potential application' than simply to call it an 'application' and understand clearly the definition of that term.

The proof that we can test effectively for satisfaction of (iv)** is now easy to complete, for we just extend slightly the method used in the consideration of (iii)**. For each element $(g_1, \ldots, g_q, h)$ of $\mathcal{P}$, we determine effectively for each $r$, $0 \le r \le k$, all sets of $r$ functions $f_1, \ldots, f_r$ for which (using in an obvious sense function notation)¹ $f_1, \ldots, f_r \vdash g_i$ is mS-satisfied for all $i$, $1 \le i \le q$. We can with any such set of functions $f_1, \ldots, f_r, g_1, \ldots, g_q, h$ now determine a set of formulae $X_1, \ldots, X_r, Y_1, \ldots, Y_q, Z$ in $T$ such that $X_1, \ldots, X_r \vdash Y_i$ are satisfied for all $i$, $1 \le i \le q$, and such that $Y_1, \ldots, Y_q \vdash Z$ is an application of $R$. Also, in view of our method of construction, any case of formulae $X_1, \ldots, X_r, Y_1, \ldots, Y_q, Z$ which are all in $T$ and for which $X_1, \ldots, X_r \vdash Y_i$ is mS-satisfied for $1 \le i \le q$ and for which $Y_1, \ldots, Y_q \vdash Z$ is an application of $R$, will, from the point of view of associated functions, be equivalent to one of the sets we have found. If we therefore can show that for all the cases we construct $f_1, \ldots, f_r \vdash h$ is mS-satisfied then (iv)** is satisfied for rule $R$; otherwise it is not satisfied for rule $R$. Hence we can test whether or not (iv)** is satisfied by $M$. In view of our earlier results, this completes the proof of Theorem 6.

THEOREM 7. In order to prove that Smiley's versions of McKinsey's and Padoa's criteria are necessary and sufficient conditions for definitional and functional independence in the sense defined by Smiley in [4], mS-models can be used in place of S-models.
Proof.
Consider in detail the proof referred to on page 435 of [4], using Theorem 3 as required, or alternatively note that mS-model is a concept of model in the Smiley sense which is wider than S-model.

In view of Theorems 4, 5, 6 and 7, the remark made by Smiley ([4], p. 435) that his conditions were enough and no more than enough to ensure that his matrix correlates satisfied desired conditions seems to be in error. This is assuming that the desired conditions were to set up the most general type of model, based essentially on the definition of derivability and with an effective test for being a model available in the case of finite structures, using his definition of the satisfaction of $X_1, \ldots, X_n \vdash Y$, which contrasted sharply with the normal designated set form of satisfaction. There is a slight ambiguity in wording in that the 'desired conditions' could possibly be the necessary and sufficient conditions we referred to in Theorem 7, but even with this interpretation the statement still does not seem easy to verify. In that case, it would be necessary to show that it was definitely impossible to find a more restrictive type of model that would work.

¹ We could if we wished use associated formulae and remember that mS-satisfaction depends only on the functions associated with the formulae.

4. mS-models and the finite model property

Suppose now that $M = (E, {}^{-}, C_1, \ldots, C_s)$ is an S-model of $P$. The proof given at the top of p. 279 of [2] shows that if $G \subseteq E$ ($G$ possibly null), then $(E, \overline{G}, C_1, \ldots, C_s)$ is a strong model of $P$. This proof is used in the reference cited in a form which enables it to be shown that if $X_1, \ldots, X_p \vdash Y$ can be shown to be unprovable in $P$ by means of an S-model then this can also be shown by means of a strong model, and if the former model is finite, the latter may be assumed also to be finite. A corollary of this result is that any non-derivable expression $X_1, \ldots, X_p \vdash Y$ of $P$ can be proved to be non-derivable in $P$ by means of a (possibly infinite) strong model, for it can be deduced from the last three lines of [4] that there is an S-model which is complete with respect to derivations¹. This does not contradict the fact that there are calculi which have no strong model which is complete with respect to derivations. It just means that although there may be no single strong model which is suitable for demonstrating simultaneously the non-derivability of all non-derivable expressions of the form $X_1, \ldots, X_p \vdash Y$, there is, given any particular non-derivable expression of this form, some strong model in which it can be shown to be non-derivable.

Using the fact (compare [2], centre page 278) that any (finite) strong model can easily be transformed into an 'equivalent' (finite) S-model, we can show that finite strong models and finite S-models are equivalent in respect of the set of expressions which can be shown to be non-derivable in $P$ by means of models. Thus, not only are weak, strong and S-models equivalent with respect to the finite model property defined in terms of provability, but they are equivalent also in respect of the finite model property defined in terms of derivability.

¹ Use the Lindenbaum method where $E$ is the set of all formulae of $P$, the result of applying the closure operation to a subset $T$ of $E$ (a set of formulae of $P$) being defined by
$\overline{T} = \{Y \mid X_1, \ldots, X_p \vdash Y \text{ for some } X_1, \ldots, X_p \in T\}$.

We now see how far this generalizes to mS-models.

THEOREM 8. If $M = (E, {}^{-}, C_1, \ldots, C_s)$ is an mS-model of $P$ then $M^* = (E, \overline{\emptyset}, C_1, \ldots, C_s)$ is a weak model of $P$.
Proof. The fact that the axioms of $P$ are all valid in $M^*$ follows at once from condition (i)** of the definition of mS-model. Suppose $Y_1, \ldots, Y_q \vdash Z$ is an application of a rule $R$ written in derivation form and that $Y_1, \ldots, Y_q$ are valid in $M^*$, that is, take only values in $\overline{\emptyset}$. Then by definition of mS-satisfaction $\vdash Y_i$ is mS-satisfied, $1 \le i \le q$. Now, by Theorem 2, $Y_1, \ldots, Y_q \vdash Z$ is also mS-satisfied. Hence, by condition (iv)**, so is $\vdash Z$. Thus $Z$ only takes values in $\overline{\emptyset}$. Hence $R$ is weakly satisfied in $M^*$ and therefore $M^*$ is a weak model of $P$.

THEOREM 9. (a) A formula cannot be shown to be unprovable by means of a finite mS-model unless it can also be shown to be unprovable by means of a finite weak model (or equivalently by means of a finite strong model).¹ (b) With normal terminology (compare [1], [2]) there is a calculus without the finite mS-model property.
Proof. The first part of (a) follows from Theorem 8, since, with the notation of that theorem, the fact that a formula $X$ is not mS-satisfied in $M$ means the same as saying that it is not valid in $M^*$. The remainder of the theorem follows from Theorem 4 and results on pages 279, 280 of [2].

One question which is left unanswered at this stage is whether, for the proving of non-derivability of expressions such as $X_1, \ldots, X_p \vdash Y$, finite mS-models are equivalent to finite S-models. It has already been remarked that the latter are equivalent to finite weak models and to finite strong models. The difficulty is that the result quoted at the beginning of the present section of the paper, namely that if $(E, {}^{-}, C_1, \ldots, C_s)$ is an S-model of $P$ and $G \subseteq E$ then $(E, \overline{G}, C_1, \ldots, C_s)$ is a strong model of $P$, certainly does not trivially carry over to mS-models since, at least on the surface, the proof seems to use closure properties of ${}^{-}$ which are not known to hold in general mS-models. An answer to this question which showed that finite mS-models were equivalent to the others as far as derivability was concerned would be interesting in the sense that it extended the strong, weak and S-model results. A proof of non-equivalence would be even more interesting in that it would break this chain of results.

¹ In view of the remarks concerning the existence of a complete S-model, the relationship between mS-models and S-models, and the relationship between S-models and strong models, this result would have been trivial to prove if the three occurrences of the word 'finite' had been omitted.

References
1. R. HARROP, On the existence of finite models and decision procedures for propositional calculi, Proc. Cambridge Phil. Soc. 54 (1958) 1-13.
2. R. HARROP, Some structure results for propositional calculi, J. Symb. Logic 30 (1965) 271-292.
3. J. ŁOŚ and R. SUSZKO, Remarks on sentential logics, Proc. Koninkl. Ned. Akad. Wetenschap., Series A 61 = Indag. Math. 20 (1958) 177-183.
4. T. SMILEY, The independence of connectives, J. Symb. Logic 27 (1962) 426-436.

LENGTHS OF FORMULAS AND ELIMINATION OF QUANTIFIERS I

L. HODES
Bethesda
and
E. SPECKER
Zurich

Introduction

The problems studied in this paper and its sequels are special cases of the following type of problem: Given a function, how long does a formula representing it have to be? Part I treats the case of the propositional calculus. In order to state some results, the following notation is introduced:

$F_1$ is the set of formulas of the first order propositional calculus with negation and conjunction as the only connectives.

Example: $\neg(x_1 \wedge \neg x_2) \wedge \neg(\neg x_1 \wedge x_2)$.

$F_2$ is the set of formulas of the first order propositional calculus with negation, conjunction and bi-implication as the only connectives.

Example: $(x_1 \leftrightarrow (x_2 \wedge \neg x_3)) \leftrightarrow (\neg x_1 \wedge x_2)$.

$F_3$ is the set of second order formulas of the propositional calculus with negation, conjunction and bi-implication as the only connectives.

Example: $(\forall x_1)((\exists x_2)(x_2 \wedge x_3) \leftrightarrow (\forall x_4)((x_1 \wedge x_2) \leftrightarrow (x_3 \wedge \neg x_4)))$.

Clearly, $F_1 \subset F_2 \subset F_3$. Furthermore, for every formula $\varphi$ of $F_3$ there is a formula $\psi$ of $F_1$ such that $\psi$ is equivalent to $\varphi$. Such a formula $\psi$ usually has to be much longer than the given formula $\varphi$.

In fact, defining the length of a formula as the sum of the number of occurrences of all its variables (so that the formulas given as examples have lengths 4, 5, 6 respectively), we have for $i = 1, 2$

THEOREM (i). For every integer $c$ there exists a formula $\varphi$ of $F_{i+1}$ such that for every formula $\psi$ of $F_i$ equivalent to $\varphi$ the following inequality holds:

length $\psi \ge c \cdot$ length $\varphi$.

The proofs of these theorems are carried out more conveniently in the language of rings than in the language of lattices. We introduce therefore the Boolean sum, product (which is the same as conjunction) and the Boolean constants 0, 1. Negation can then be dispensed with, $1 + x_1$ being $\neg x_1$. Results of this paper and its sequels have been announced in [1], [2]. The first explicit example of a Boolean function which permits only "non-linear" realizations has, as far as we know, been defined by Nečiporuk [3].
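The length measure defined above can be computed directly. The sketch below encodes formulas as nested tuples (an encoding chosen here for illustration, not taken from the paper) and assumes that the variable written immediately after a quantifier is not counted as an occurrence, which is the reading under which the third example has length 6.

```python
def length(formula):
    """Number of variable occurrences in a formula.

    Encoding (hypothetical): a variable is a string; compound formulas are
    ("not", f), ("and", f, g), ("iff", f, g), ("forall", var, f), ("exists", var, f).
    """
    if isinstance(formula, str):
        return 1
    operator = formula[0]
    if operator in ("forall", "exists"):
        # Assumption: the variable named by the quantifier itself is not counted.
        return length(formula[2])
    return sum(length(argument) for argument in formula[1:])


# The three examples from the Introduction.
example_f1 = ("and",
              ("not", ("and", "x1", ("not", "x2"))),
              ("not", ("and", ("not", "x1"), "x2")))

example_f2 = ("iff",
              ("iff", "x1", ("and", "x2", ("not", "x3"))),
              ("and", ("not", "x1"), "x2"))

example_f3 = ("forall", "x1",
              ("iff",
               ("exists", "x2", ("and", "x2", "x3")),
               ("forall", "x4",
                ("iff", ("and", "x1", "x2"), ("and", "x3", ("not", "x4"))))))

lengths = [length(example_f1), length(example_f2), length(example_f3)]
```

The resulting values agree with the lengths 4, 5, 6 stated in the text.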

1. Definition of the notion of "formula": $0, 1, x_0, x_1, \ldots$ are formulas. If $\varphi$, $\psi$ are formulas, so are $\varphi + \psi$ and $\varphi \cdot \psi$. (Parentheses are added according to custom.)

2. Definition of the notion of "p-formula" ("p" for "product"): $0, 1, x_0, x_1, \ldots$ are p-formulas. If $\varphi$ is a p-formula, so are $0 + \varphi$ and $1 + \varphi$. If $\varphi$, $\psi$ are p-formulas, so is $\varphi \cdot \psi$.

Remark. $\varphi + \psi$ is equivalent to

$1 + (1 + \varphi \cdot (1 + \psi)) \cdot (1 + (1 + \varphi) \cdot \psi)$

which is a p-formula if $\varphi$ and $\psi$ are p-formulas. Every formula is therefore equivalent to some p-formula.

3. If $\varphi$ is a formula, $\varphi\binom{x_{i_1} \ldots x_{i_m}}{0 \ldots 0}$ is the formula obtained from $\varphi$ by substituting 0 for the variables $x_{i_1}, \ldots, x_{i_m}$. If no other variables but $x_{i_1}, \ldots, x_{i_m}, x_{j_1}, \ldots, x_{j_n}$ occur in $\varphi$ and if $i_p \ne j_q$ for all $p$, $q$ ($1 \le p \le m$, $1 \le q \le n$) then $\varphi / x_{j_1} \ldots x_{j_n}$ is the formula $\varphi\binom{x_{i_1} \ldots x_{i_m}}{0 \ldots 0}$. If $x$ is the sequence $(x_{j_1}, \ldots, x_{j_n})$ then $\varphi / x$ is $\varphi / x_{j_1} \ldots x_{j_n}$.

Example. If $\varphi$ is the formula $(x_1 + x_2) \cdot (x_2 + x_3)$ and $x$ is the sequence $(x_1, x_3)$ then $\varphi / x$ is $(x_1 + 0) \cdot (0 + x_3)$.

The theorems of the paper are based on the following

MAIN LEMMA. For all integers $m$, $k$ there exists an integer $n_0$ such that for all $n$, $n \ge n_0$, the following holds: If $\varphi$ is a formula in the $n$ variables $x_1, \ldots, x_n$, none of which occurs more than $k$ times in $\varphi$, then there exist $m$ distinct integers $k_1, \ldots, k_m$ (1

$$c_0 + c_1 \prod_{j=1}^{m} (1 + x_{k_j}) + c_2 \sum_{j=1}^{m} x_{k_j}.$$

Moreover, if $\varphi$ is a p-formula then $c_2 = 0$.

Putting $\pi = \prod_{j=1}^{m} (1 + x_{k_j})$, $\sigma = \sum_{j=1}^{m} x_{k_j}$, we have $\pi \cdot \sigma = 0$. The thesis of the lemma therefore states that $\varphi / x$ is equivalent to a binary formula in $\pi$, $\sigma$. The proof of the lemma proceeds by constructing simpler and simpler formulas from the given formula by substituting zeros for some variables. It will be obvious from the construction that the final formula is a p-formula (i.e. $c_0 + c_1 \cdot \pi$) if the given one is.

4. The formula $\varphi$ is an abridged version of the formula $\psi$ iff $\varphi$ is equivalent to $\psi$ and for all $i$, $0 \le i$, the number of occurrences of $x_i$ in $\varphi$ is less than or equal to the number of occurrences in $\psi$. Let $S$ be a set of variables and $\varphi$ a formula containing at least two distinct variables. Then there exist formulas $\varphi_1$, $\varphi_2$ and a Boolean constant $c$ such that the following conditions hold:
(1) $\varphi_1 + \varphi_2$ or $c + \varphi_1 \cdot \varphi_2$ is an abridged version of $\varphi$,
(2) $\varphi_1$ and $\varphi_2$ both contain at least one variable,
(3) the number of distinct variables of $S$ occurring in $\varphi_2$ is at least half the number of distinct variables of $S$ occurring in $\varphi$.
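The definition of an abridged version combines an equivalence test with a comparison of occurrence counts, and both parts are easy to check on small formulas. The sketch below uses the same hypothetical tuple encoding of ring formulas as before (constants 0 and 1, variables as strings, "+" and "*" as binary operators); the helper names are illustrative.

```python
from collections import Counter
from itertools import product


def occurrences(formula):
    """Counter mapping each variable to its number of occurrences."""
    if isinstance(formula, int):
        return Counter()
    if isinstance(formula, str):
        return Counter([formula])
    _, left, right = formula
    return occurrences(left) + occurrences(right)


def evaluate(formula, assignment):
    """Evaluate in the Boolean ring: '+' is addition mod 2, '*' is product."""
    if isinstance(formula, int):
        return formula
    if isinstance(formula, str):
        return assignment[formula]
    operator, left, right = formula
    a, b = evaluate(left, assignment), evaluate(right, assignment)
    return (a + b) % 2 if operator == "+" else a * b


def equivalent(first, second):
    """Truth-table equivalence of two formulas."""
    names = sorted(set(occurrences(first)) | set(occurrences(second)))
    return all(
        evaluate(first, dict(zip(names, values))) == evaluate(second, dict(zip(names, values)))
        for values in product((0, 1), repeat=len(names))
    )


def is_abridged_version(phi, psi):
    """phi is equivalent to psi and no variable occurs more often in phi."""
    phi_counts, psi_counts = occurrences(phi), occurrences(psi)
    return equivalent(phi, psi) and all(
        phi_counts[name] <= psi_counts[name] for name in phi_counts
    )


# x1*(x2 + x3) abridges x1*x2 + x1*x3, but not the other way round.
short_form = ("*", "x1", ("+", "x2", "x3"))
long_form = ("+", ("*", "x1", "x2"), ("*", "x1", "x3"))

forward = is_abridged_version(short_form, long_form)
backward = is_abridged_version(long_form, short_form)
```

The relation is therefore not symmetric: equivalence alone is not enough, since the occurrence counts must not increase.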

Remark. We will refer to a binary operation on $\varphi_1$, $\varphi_2$ by the common symbol $\varphi_1 * \varphi_2$. It is thereby understood that different stars in the same formula do not necessarily refer to the same operation.

5. A sequence $(\psi_1, \ldots, \psi_n)$ of formulas is normal iff (1) each formula $\psi_i$, 1
$i

contains

Remark. Let $(\psi_1, \ldots, \psi_n)$ be normal and $(\chi_1, \ldots, \chi_r)$ be the sequence obtained from $(\psi_1, \ldots, \psi_n)$ by substituting 0 for some variable in all the formulas $\psi_1, \ldots, \psi_n$ and then deleting the formulas containing no variable. Then $(\chi_1, \ldots, \chi_r)$ is normal.

6. A formula $\psi$ is of type $T_q^1$ iff there exist formulas $\psi_1$, $\psi_2$ such that for some operation $*$ the following conditions hold:
(1) $\psi_1 * \psi_2$ is an abridged version of $\psi$,
(2) there exist at least $q$ distinct variables occurring both in $\psi_1$ and in $\psi_2$.

A formula $\psi$ is of the type $T_p^2$ iff there exists a normal sequence $(\psi_1, \ldots, \psi_p)$ of formulas such that for some sequence of operations $*$ the formula

$\psi_1 * (\psi_2 * (\psi_3 * ( \ldots * \psi_p) \ldots ))$

is an abridged version of $\psi$. An example of a formula of the above type is

$1 + \psi_1 \cdot (\psi_2 + \psi_3 \cdot (1 + \psi_4 \cdot \psi_5))$,

the sequence of operations $u * v$ being $1 + u \cdot v$, $u + v$, $u \cdot v$, $1 + u \cdot v$.

LEMMA 1. If 1 is normal; (3) for all i, l
is an abridged version of $\varphi / y$. Let $z$ be the sequence of variables of $y$ not occurring in $\psi_1 \ldots \psi_{i-1}$. Then for some Boolean constants $c_1$, $c_2$ the formula $c_1 + c_2 \cdot \varphi_i / z$ is an abridged version of $\varphi / z$. $\chi_i * \omega_i$ being an abridged version of $\varphi_i$, the formula $\chi_i / z * \omega_i / z$ is an abridged version of $\varphi_i / z$. Therefore, for some operation $*'$, the formula

$\chi_i / z \; *' \; \omega_i / z$

is an abridged version of $\varphi / z$. By assumption, there is no $x$ such that $\varphi / x$ is of the type $T_q^1$; hence, the formulas $\chi_i / z$ and $\omega_i / z$ have less than $q$ distinct variables in common. The number of distinct variables of $\omega_i$ not occurring in $\psi_1 \ldots \psi_{i-1} \chi_i$ is at least $2 \cdot 4^{p-i-1} \cdot q - q$, i.e. at least $4^{p-i-1} \cdot q$.

If $\chi_i$ contains variables occurring in $\psi_1 \ldots \psi_{i-1}$, let $u$ be a sequence containing exactly the following variables: all the variables occurring in $\psi_1 \ldots \psi_{i-1}$; all the variables of $\omega_i$ not occurring in $\chi_i$. Defining $\psi_i$ as $\chi_i / u$, $\varphi_{i+1}$ as $\omega_i / u$, conditions (1)-(5) still hold.

If $\chi_i$ contains no variable of $\psi_1 \ldots \psi_{i-1}$, let $u$ be a sequence containing exactly the following variables: all the variables occurring in $\psi_1 \ldots \psi_{i-1}$; exactly one variable of $\chi_i$; all the variables of $\omega_i$ not occurring in $\psi_1 \ldots \psi_{i-1} \chi_i$. Defining again $\psi_i$ as $\chi_i / u$ and $\varphi_{i+1}$ as $\omega_i / u$, conditions (1)-(5) still hold.

Assume that the sequences $(\varphi_1, \ldots, \varphi_p)$, $(\psi_1, \ldots, \psi_{p-1})$ are defined; $\varphi_p$ contains at least $q$ variables. If $\varphi_p$ contains variables occurring in $\psi_1 \ldots \psi_{p-1}$, let $x$ be a sequence containing the variables of $\psi_1 \ldots \psi_{p-1}$. If $\varphi_p$ contains no variable of $\psi_1 \ldots \psi_{p-1}$, let $x$ be a sequence containing all the variables of $\psi_1 \ldots \psi_{p-1}$ and exactly one variable of $\varphi_p$. Define $\psi_p$ as $\varphi_p / x$ in both cases. Conditions (1)-(5) hold; the formula

$\psi_1 * (\psi_2 * \ldots (\psi_{p-1} * \psi_p) \ldots )$

is an abridged version of $\varphi / x$, i.e. $\varphi / x$ is of type $T_p^2$.

7. LEMMA 2. If the sequence ($1, ..., $,J is normal, if no variable occurs in more than k of the formulas t,h1 ,..., $ n and if n ( k + 1)" then there exist an integer p and distinct integers k,,..., k , such that the following holds: (1) x k l occurs in (2) if (xl,. . ., x,) is the sequence obtained from ($l,. . ., $,,) by substituting


0 for all the variables except xk,, .. ., x k pand then deleting the formulas containing no variables then ( 4 mdq, (b) for all j , 1d jd m, the formula xi contains exactly one variable, say Xi,'

(c) the sequence (il, ..., i,) (defined in (b)) is a sequence without alternations. (The sequence ( i l , ..., i,) is said t o be without alternations iff for all jl,j,,j3the following holds: If 1 djl <j, <j, <m and ij, =ij3, then ij,= i j 2 . The sequence (9, 9, 7, 8, 8, 8, 1) is without alternations, (9, 9, 7, 8, 8, 9, 1) is not.) Proof. Let xk, be the variable occurring in $,. If m= 1 then p = 1 and k , satisfies the thesis. For the inductive step two cases are distinguished. (a) x k , occurs in none of the formulas t,hj, 2<j<(k+l)"-'+l. Putting r = ( k l),1, the sequence (G2,. .., $,.) is normal and each variable occurs in at most k of the formulas t,hj, 2 < j < r , k,, ..., k , being indices according t o the inductive hypothesis, the sequence ( k , , .. ., k p ) satisfies the thesis. (b) Assume that xk, occurs in some formula $ j , 2 < j d ( k l)m-l 1 and let h be the smallest such numberj. Furthermore, let V be the set of variables different from xk, and occurring in $; ...$h - 1 . Substitute 0 for all the variables of Y in the formulas $ h , $h+ ..., $, and delete the formulas containing no more variables. Let the resulting sequence be (ol, ..., or).The length of the sequence ($1, $h, $ h + l , ..., tjn)is at least n - ( k + l ) " - l + l . There are less than ( k + l),-' distinct variables in $, ... $ h - l , each one occurring in at most ( k - 1) of the formulas t,hl, $ h , . . ., $,. Therefore,

+

+

+

+

,,

+ l),-' + 1 - ( k - 1) ( k + l),-', r 2 ( k + 1)m-1 + 1 . r 3 n - (k

i.e.

The sequence (o,, ..., or)is normal and its length is at least ( k + l)m-l.xkl occurs in w , ; if ( k l , ...,k p ) is a sequence of incides according to the induc..., or), the same sequence satisfies the thesis of the tive hypothesis for (o,, lemma for ..., $,). 8. LEMMA 3. If q is a formula of type T;, if no variable occurs more than k times in cp and ifg > ( k then there exist m distinct variables xk,, .. ., Xkn, of cp such that is equivalent to some formula o satisfying the following condition: There exists a sequence (ao,..., o m of ) formulas such that

+


(1) no variable occurs more than ( k - 1) times in 0,; (2) for allj, 1 < j d m , there exist Boolean constants a j , b,, cj such that o j is equivalent to aj ( b j cj.xkj)*wj-l for some operation *; (3) o is the formula w,. Proof. cp being of type zf, there exists a normal sequence ($i,..., $ p ) such that some formula

+ +

*1*(*2*.

..**,I

is an abridged version of cp. Each variable occurs in at most k of the formulas t j i , . . ., $ p . By Lemma 2, there exists a sequence x of variables of cp such that the sequence obtained from the sequence

( X I 7

(*llX,

...

9

x,>

.*.)*,lx>

by deleting the formulas containing no variables has the following properties : (1) k . m < q , (2) for allj, 1 <j < k - m ,the formula xi contains exactly one variable, say xij, (3) the sequence (il, ..., ik.m)is a sequence without alternations. A variable occurs in at most k of the formulas xi, 1 <j
n

rj + 1< i < r j

ofy; (c) for all h, j , 1< h <j < m the variables of y occurring in

]II

rh+l
$,and

n

rj+iSi
$iaredistinct.

We may assume that the variable of y occurring in 1 <j<m. Defining wo as

fl

rj+iSi
t+hi iS xkj,

$r,lY***.*$plY and for all j , 1<j < m , defining the constants aj, bj, c j and the formulas w j such that I + l*...*($rj- ,*mi-1 ) ...)/Y $rj+I*(~rj+


is equivalent to Uj

for some operation


+ ( b j + Cjxkj)*Oj-l

* , the conditions are verified.
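The notion of a sequence "without alternations" used in Lemmas 2 and 3 can be checked mechanically. The following is only an illustrative sketch; the function names are hypothetical and not taken from the paper. It compares a single-pass test with the literal three-index definition, using the two sample sequences quoted in Lemma 2.

```python
from itertools import combinations


def without_alternations(seq):
    """Single pass: a value may not reappear after a different value intervened."""
    seen = set()
    previous = None
    for position, value in enumerate(seq):
        if position > 0 and value != previous and value in seen:
            return False
        seen.add(value)
        previous = value
    return True


def without_alternations_by_definition(seq):
    """Literal definition: for j1 < j2 < j3, seq[j1] == seq[j3] forces seq[j1] == seq[j2]."""
    return all(
        not (seq[a] == seq[c] and seq[a] != seq[b])
        for a, b, c in combinations(range(len(seq)), 3)
    )


good = (9, 9, 7, 8, 8, 8, 1)   # quoted in the text as being without alternations
bad = (9, 9, 7, 8, 8, 9, 1)    # quoted in the text as having an alternation
print(without_alternations(good), without_alternations(bad))
```

Both functions decide the same property; the single-pass version simply exploits the fact that a sequence is without alternations exactly when equal entries form one contiguous block.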

9. Definition of the notion of “basic formula”: A formula cp is basic in <xi,, x i l ,..., x i , ) of type ( a l , ..., a,) iff the following conditions hold: (1) ( x i o ,..., xi,) is a sequence of distinct variables; ( 2 ) ( a 1 , . , . ,x,,) is a sequence of operations, xi, l
if

Ekf1

+(bk+l

is sum;

+Ck+lXk+l)+(Pk

+ck+,x~+l)‘cpkifak+lisproduct

VniSq.

A formula q basic in ( x i o ,...,x i , ) is of the form qn*(qn-

1*...*(’~1*(~0)

>

...

2

where q j contains the variable xi, and no other.

If rp is basic in

(.yi0, ..., s i n )of

type (a,, ..., a,) and 1
xi, 1s

/o .

equivalent to a formula basic in (sin, ..., \, .. ., si,,) of type (a,,. .., \, ..., 2,).

LEMMA 4. If cp is basic in (xi,, ..., xi,) of type ( x , , ..., cln) and n36m, m> 1, then there exist rn distinct numbers k,, ..., k,, among il, ..., i, and Boolean constants do, d l , d, such that the formula cp/xioxkl... xk, is equivalent either to m

do

or to

+ (dl + d 2 x i 0 , ) * j fl (l -k x k , ) = 1 do

+

m dlXio

i. d,

j= 1

xk,.

Proof. We assume ij=j, O<j2. Furthermore, let the sequences of formulas and Boolean constants be as in the definition of basic formula. Let n, be the number of operations sum in a l , ..., am,n2 the number of operations product; n,+n,=n. If n1>2n1, let i l r . . . ,i,, be 2m distinct numbers such that zij, 1 <j < 2m, is sum. The formula cp/xoxi,... x ~ is ~then, ~


equivalent to some formula do

+ dlx, +


2 111 j = 1

ejxij.

There are m distinct integers j and a Boolean constant d2 such that e j = d 2 . Therefore, there exist distinct k , , .. ., k,,, such that (p/xoxk,. ..xk,, is equivalent to do

m

+ d l x O + d2

j= I

xkJ.

Assume therefore nl < 2m; then a, is the operation product for at least 4m distinct indices k . If there exist m distinct numbers k , , ..., k, among 1 , ..., iz such that ckJ=O, 1 <j < m , then the formula cp/xoxk,...x,_ is equivalent to d,+d,x, for some constants do, d,. Assume therefore that there exist 3m distinct indices k such that a, is the operation product and c,= 1. Letting x be the sequence of the corresponding 3m variables, replacing cp/x by cp and changing notation it suffices to prove the following: If n23m, q, is b ,

for all k , O d k < n - I , (Pkfl

is

ak+l

+ coxo;

+(bk+l + Xk+l)'(Pk; (PniS cp,

then there exist distinct integers k , , .. ., k, such that cp/xoxk,. ..xk, is equivalent to some formula m

do

+ (dl + 4 x 0 ) . JIT (1 + Xk,). =l

If there exists an index k such that b, = 0 and m + 1 d k, then the formula cp/xox,...x, is equivalent to a constant. Assume therefore b,#O for all k , m + l < k , substitute 0 for xl, ..., x, and change again notation: n 2 2 m ; cpo is b ,

for all k , O < k d n - l , (Pktl

is u k + l

+ coxo;

+ (l f xk+l)*'Pk;

CpniSCP.

If all the constants a,, I d k < n - I , are 0, then 63 is equivalent to n

an

+ (b, + c , x ~ k>=

and the thesis of the lemma holds.

1

(1 + xk)


Otherwise, let i,, . .., ip-, be the indices i such that ui= 1 and 1 < i
-

i , < i2 c .. < i,

-

,.

Define a sequence ($,, ..., $ p ) as follows

Ij/h+l is $pis

ih >>...).

+,,

For all h , j such that 1
(c = c p ; s = [+(P - 1>1>.

Substituting 0 in $ for all the variables occurring in one of the formulas $ 2 j + l , j = 1, 2, ..., and also for the variables xi, i= 1, ..., i, (occurring in we obtain a formula x2 equivalent to c

+

$2

... $2(1 + bo + coxn)

( t = [+PI)-

Among the variables x , , ..., x,, n > 2m, there exist either m occurring in ..., t j 2 s + l or M occurring in one of the formulas one of the formulas $2, ...,tl/2t. In both cases, there exist distinct numbers kl, ..., k , such that q / x o x , ,.,x k , is equivalent to some formula

+

(l + x k j )

(dl

+ d2xO)'

10. MAINLEMMA.Let F be the primitive recursive function defined as follows F ( m , 0) = m F ( ~k ), = 4 ( k + 1 ) 6 ' * ~ F ( m . k - ' ) F ( F ( m , k - l ) , k - 1 ) . If n 2 F(m, k ) and p is a formula in the variables x,, .. ., x , none of which occurs more than k times in cp, then there exist distinct integers k,, ..., k, ( l < k j < n , 1 < j < i n ) and Boolean constants Cg, c,, c2 such that q / x k , .


is equivalent to co

+ c1 fl (1 + X k j ) + c2 c Xk,. rn

m

j = 1

j= 1

Moreover, if cp is a p-formula then c2 =O. Proof. The proof is by induction on k , the case k=O being trivial. Putting p = ( k + 1)6’k’m, q= F(F(n2, k - l ) , k - 1) and applying Lemma 1, there exists a sequence x of variables xi(1 Q i < n) such that cp/x is either of type 7; or of type 7;. (1) If cp/x is of type T ; there exist formulas i,h2 and an operation * such that rf/l*$2 is an abridged version of q / x and such that there exist at least q distinct variables xil,..., xi, occurring both in $1 and in ij2.The variables xil,..., xi, therefore occur at most ( k - 1 ) times in t j j ( j = l , 2). Substituting 0 for all the other variables in yields a formula $;*$;. Applying the inductive hypothesis to ,;)I there exist F(m, k - 1) variables xjl, ..., xjr (r=F(un, k - 1 ) ) such that, putting y = ( x j , , ..., xj,), the formula $ ; / y is equivalent to some formula co

+ cln + c2n,

where n =

n (xj,+

l), G =

C xi,.

Applying the inductive hypothesis to $Jy, there exist m distinct variables xkl,. .., xk, among the variables xjl, ..., xj, such that, putting z = (xkl, . .., xk,), the formula $Jz is equivalent to some formula cb

+ c;n’ + c ; d ,

where .n’

= j = 1

(1

+ xkj),

G’ =

C

j= 1

xk, .

The formula $l/z is equivalent to CO

+

C1.n’

+ czn’ ,

the formula ($l*$2)/z therefore to some formula d o + dln’ + d , d . The formula ($1*$2)/z is equivalent to cp/z. (2) Assume on the other hand that cp/x is of type ~ fPuttings= . 6F(m, k - 1) and applying Lemma 3, there exist s distinct indices i,, . .., is and formulas wo,..., o,such that, putting y = (xil,..., xis), the formula w, is equivalent to cply (being the same as cp/x/y) and the sequence (coo,. . ., 0,)satisfies the following conditions : (1) no variable occurs more than (k - 1) times in 0,;


(2) for all'j, l<j<s, the formula oj is equivalent to some formula aj

+ ( b j+ cixij).wj-l

or ( b j

+ c j x i j )+

Define a sequence (I),, ..., $J of formulas as follows:

For allj, 1 dj<s, the formula aj

is equivalent to

$j

+ (bj+ cjxij)*$j-l

according to the relation of w j to

..., x i , ) ; furthermore,

or to (bi

+ c j x i j )+

The formula

$,y

$j-l

is basic in (so,x i , ,

xo is equivalent to w,. Putting t=F(m, k - l), we

$s L

O

have s> 6t. Applying Lemma 4, there exist distinct integersj,, ...,.itsuch that, putting z = ( x i , , . .., xjt), the formula $ J z is equivalent to some formula t

do

+ (d1 + dzxo) iH (1 + X j J , = 1

+ d l x , + d, 1 x j i . t

do

i= 1

The formula cp/z is therefore equivalent to some formula do

+ (d1 + d,oo/z)*H(1 + x j z ) , do

+ dloo/z+ d 2 C x j i .

We have t = F(m, k - 1); applying the inductive hypothesis to o o / z , we obtain m distinct integers k,, ..., k, such that, putting u= (xk,,.. ., xkm),the formula o o / u is equivalent to some formula co

+ cln +

C,C,

where

n=

I1(1 + x,,),

CJ =

Cxkj.

Because of n.o=O, the formula cp/z itself is equivalent to some formula

e,

+ e,?i + e 2 0 .

If \(\varphi\) is a p-formula, we may assume \(e_2 = 0\).

11. THEOREM. If \(n \ge 2\cdot F(m, 2c)\) (\(F\) being the function defined in 10) and if \(\varphi\) is a formula in the variables \(x_1, \dots, x_n\) of length less than \(c\cdot n\), then there exist integers \(k_1, \dots, k_m\) (\(1 \le k_1 < \dots < k_m \le n\)) and Boolean constants \(c_0, c_1, c_2\) such that \(\varphi / x_{k_1} \dots x_{k_m}\) is equivalent to

\[ c_0 + c_1 \prod_{j=1}^{m} (1 + x_{k_j}) + c_2 \sum_{j=1}^{m} x_{k_j}. \]

Moreover, if \(\varphi\) is a p-formula then \(c_2 = 0\).

Proof. Let \(n_1\) be the number of variables \(x_i\), \(1 \le i \le n\), occurring more than \(2c\) times in \(\varphi\), and let \(n_2 = n - n_1\). Since the length of \(\varphi\) is less than \(c\cdot n\), we have \(c\cdot n > n_1\cdot 2c\). Therefore, \(n_2 \ge F(m, 2c)\). Let \(x_{j_1}, \dots, x_{j_{n_2}}\) be distinct variables occurring at most \(2c\) times in \(\varphi\). Define \(x = \langle x_{j_1}, \dots, x_{j_{n_2}} \rangle\) and let \(\psi\) be the formula \(\varphi / x\). The thesis of the theorem follows by applying the Main Lemma to the formula \(\psi\).

THEOREM (1). For every integer \(c\) there exists a formula \(\varphi\) such that for every p-formula \(\psi\) equivalent to \(\varphi\) the following inequality holds:

\[ \operatorname{length} \psi \ge c \cdot \operatorname{length} \varphi . \]

Proof. Assume \(n = 2\cdot F(2, 2c)\) and let \(\varphi\) be the formula \(\sum_{i=1}^{n} x_i\). If \(\psi\) is a p-formula in the variables \(x_1, \dots, x_n\) of length less than \(c\cdot n\), there exist integers \(h\), \(i\) and Boolean constants \(d_0\), \(d_1\) such that \(1 \le h < i \le n\) and that \(\psi / x_h x_i\) is equivalent to \(d_0 + d_1 (1 + x_h)(1 + x_i)\). The formula \(\varphi / x_h x_i\) is equivalent to \(x_h + x_i\); the formulas \(\varphi\), \(\psi\) are therefore not equivalent.

THEOREM (2). For every integer \(c\) there exists a formula \(\Phi\) of the second order propositional calculus (based on the connectives conjunction and negation) such that for every formula \(\psi\) (of the first order propositional calculus) equivalent to \(\Phi\) the following inequality holds:

\[ \operatorname{length} \psi \ge c \cdot \operatorname{length} \Phi . \]
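The final step in the proof of Theorem (1) rests on the fact that no choice of the constants \(d_0\), \(d_1\) makes \(d_0 + d_1(1 + x_h)(1 + x_i)\) equivalent to \(x_h + x_i\). As a small sanity check (a sketch only; the helper names are hypothetical), the four choices of constants can be enumerated over the two-element Boolean ring:

```python
from itertools import product


def reduced_p_form(d0, d1, xh, xi):
    """Value of d0 + d1*(1 + xh)*(1 + xi), computed modulo 2."""
    return (d0 + d1 * (1 + xh) * (1 + xi)) % 2


def constants_matching_sum():
    """All (d0, d1) for which the reduced form agrees with xh + xi everywhere."""
    assignments = list(product((0, 1), repeat=2))
    return [
        (d0, d1)
        for d0, d1 in product((0, 1), repeat=2)
        if all(reduced_p_form(d0, d1, xh, xi) == (xh + xi) % 2
               for xh, xi in assignments)
    ]


print(constants_matching_sum())
```

The list is empty: the reduced form takes the same value \(d_0\) at the assignments (0, 1) and (1, 1), whereas \(x_h + x_i\) takes the values 1 and 0 there.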

Proof. Assume n= 2 * F ( 3 ,150. c) and define formulas cpk, 1 < k < n, as follows : cp, is ( x l + .,).(I x1 + u l ) . (1 + w,), q k + l is

+

+ .k + x k + l ' k

+ x k + l w k + . k + l ) ' ( I + uk + x h + l u k + x k + l u h + O k + l ) ' f X k + l w k + xk+luk + w k + l ) .

*(I+ w k

'p, .u, is 18n - 12. There exists The length of the formula cp' defined by a p-formula cp equivalent to q' of a length which is at most 4.(18n - 12). Let Q, be the formula


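The length of \(\Phi\) is less than \(75\cdot n\). If \((c_1, \dots, c_n)\) is a sequence of Boolean constants then \(\Phi\big/{x_1 \cdots x_n \atop c_1 \cdots c_n}\) has the value 1 iff the number of indices \(i\) such that \(1 \le i \le n\) and \(c_i = 1\) is congruent to 0 modulo 3. ("\(u_k\)", "\(v_k\)", "\(w_k\)" say that the number of constants among \(c_1, \dots, c_k\) having the value 1 is respectively congruent to 0, 1, 2 modulo 3.)

Let \(\psi\) be a formula of length less than \(c\cdot(75\cdot n)\). Because of \(n = 2\cdot F(3, 150\cdot c)\), there exist indices \(h\), \(i\), \(j\) and Boolean constants \(d_0\), \(d_1\), \(d_2\) such that \(1 \le h < i < j \le n\) and that \(\psi / x_h x_i x_j\) is equivalent to

\[ d_0 + d_1 (1 + x_h)(1 + x_i)(1 + x_j) + d_2 (x_h + x_i + x_j). \]

If \((c_1, \dots, c_n)\) is the sequence defined by \(c_h = 1\) and \(c_k = 0\) for \(k \ne h\) (\(1 \le k \le n\)), then the value of \(\psi\big/{x_1 \cdots x_n \atop c_1 \cdots c_n}\) is \(d_0 + d_2\). On the other hand, \(\Phi\big/{x_1 \cdots x_n \atop c_1 \cdots c_n}\) is 0. If instead \(c_h = c_i = c_j = 1\) and \(c_k = 0\) for all other \(k\), the value of \(\psi\big/{x_1 \cdots x_n \atop c_1 \cdots c_n}\) is again \(d_0 + d_2\), whereas \(\Phi\big/{x_1 \cdots x_n \atop c_1 \cdots c_n}\) is 1; the formulas \(\Phi\) and \(\psi\) are therefore not equivalent.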
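The concluding step of the proof of Theorem (2) can be replayed on three variables: the reduced form \(d_0 + d_1(1+x_h)(1+x_i)(1+x_j) + d_2(x_h+x_i+x_j)\) never agrees with the predicate "the number of ones is congruent to 0 modulo 3". The sketch below enumerates all eight choices of constants; the function names are hypothetical and serve only this illustration.

```python
from itertools import product


def reduced_form(constants, xs):
    """Value of d0 + d1*(1+xh)(1+xi)(1+xj) + d2*(xh+xi+xj), computed modulo 2."""
    d0, d1, d2 = constants
    xh, xi, xj = xs
    return (d0 + d1 * (1 + xh) * (1 + xi) * (1 + xj) + d2 * (xh + xi + xj)) % 2


def ones_multiple_of_three(xs):
    """1 iff the number of ones among xs is congruent to 0 modulo 3."""
    return 1 if sum(xs) % 3 == 0 else 0


def constants_matching_counter():
    """All (d0, d1, d2) whose reduced form equals the mod-3 predicate everywhere."""
    assignments = list(product((0, 1), repeat=3))
    return [
        constants
        for constants in product((0, 1), repeat=3)
        if all(reduced_form(constants, xs) == ones_multiple_of_three(xs)
               for xs in assignments)
    ]


print(constants_matching_counter())
```

The list is empty for the reason given in the proof: at the assignments (1, 0, 0) and (1, 1, 1) the reduced form takes the same value \(d_0 + d_2\), while the mod-3 predicate is 0 and 1 respectively.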

References
1. L. HODES and E. SPECKER, Elimination of quantifiers and the length of formulae, Notices Am. Math. Soc. 12 (1965) 242.
2. L. HODES and E. SPECKER, Elimination von Quantoren und Länge von Formeln, Abstract, J. Symb. Logic.
3. É. I. NEČIPORUK, A Boolean function, Dokl. Akad. Nauk SSSR 169 (1966) 765-766; Engl. transl.: Soviet Math. Dokl. 7 (1966) 999-1000.

A DECISION PROCEDURE FOR THE WEAK SECOND ORDER THEORY OF LINEAR ORDER

H. LÄUCHLI
Zürich

"Linear order" refers to the general theory of the axioms (a) \(u \le v \wedge v \le w \to u \le w\), (b) \(u \le v \wedge v \le u \to u = v\), (c) \(u \le v \vee v \le u\). "Weak second order" (WS) theory (better: "Monadic weak second order theory") means, roughly speaking, first-order theory extended by the concept "finite set of individuals". Thus, the order type \(\omega\) of the natural numbers can be characterized by a WS-sentence as follows: "There is no greatest natural number and for every natural number, \(n\), there exists a finite set, \(X\), of natural numbers such that \(k \in X\) iff k

\(\aleph_0\)-categorical, finitely axiomatizable. Let \(C'\), \(FA'\) denote the corresponding classes for WS theory. Then \(C \subset (FA \cap M) \subset M\) and \(M = FA' \subset C'\), where \(\subset\) denotes proper inclusion. \(M\) is too small for strong second order theory: \(\omega_1\) for instance, the least non-denumerable ordinal, is not strong second order equivalent to any denumerable order type. Then there are many strong-second-order-discernible kinds of non-denumerable "shufflings" (see section 2) of a given finite set of order types.

1. A decidability criterion

We consider the first-order language, \(L\), of a finite number of predicate letters. The set \(At_k\) of atomic formulae with variables among \(u_1, \dots, u_k\) is finite. Let \(P(X)\) denote the power set of \(X\). The sets \(T_{nk}\), defined by \(T_{0k} = P(At_k)\) and \(T_{n+1,k} = P(T_{n,k+1})\), are hereditarily finite. Let \(A\) be a relational system of the similarity type of \(L\), \(x = \langle x_1, \dots, x_k \rangle\) a sequence of elements \(x_i \in |A|\), \(\Lambda\) the empty set (sequence). We define

\[ t_{0k}(A, x) = \{\phi : \phi \in At_k \text{ and } A \models \phi[x]\}, \]

\[ t_{n+1,k}(A, x) = \{ t_{n,k+1}(A, x * y) : y \in |A| \} \]

(where \(x * y\) denotes adjunction of term \(y\) to sequence \(x\)),

\[ t_n(A) = t_{n0}(A, \Lambda). \]

\(t_{nk}(A, x)\), "the nk-type of \(A\), \(x\)", is an element of \(T_{nk}\). \(t_n(A)\) is called the n-type of \(A\). Let \(\mathfrak{K}\) be a class of relational systems of the similarity type of \(L\). We write \(t_n^{\mathfrak{K}}\) for \(\{t_n(A) : A \in \mathfrak{K}\}\).

CRITERION. The first-order theory of \(\mathfrak{K}\) is decidable if the sets \(t_n^{\mathfrak{K}}\) effectively depend on \(n\).

Since the sets \(t_n^{\mathfrak{K}}\) are hereditarily finite, there is no question about the meaning of "effective dependence". Note that since \(T_n\) (\(= T_{n0}\)) effectively depends on \(n\) and \(t_n^{\mathfrak{K}} \subseteq T_n\),

(*) \(t_n^{\mathfrak{K}}\) effectively depends on \(n\) iff the predicate "\(s \in t_n^{\mathfrak{K}}\)" is decidable.

The criterion is based on results by Fraïssé [4]. We sketch a proof: Let \(t\) be an nk-type, \(\phi\) a formula. We define the predicate \(\mathrm{sat}(t, \phi)\) by: (i) if \(\phi\) is atomic, then \(\mathrm{sat}(t, \phi)\) iff \(\phi \in t\); (ii) if \(\phi\) is \(\phi_1 \wedge \phi_2\) (\(\phi_1 \vee \phi_2\), \(\neg\phi_1\)), then \(\mathrm{sat}(t, \phi)\) iff \(\mathrm{sat}(t, \phi_1)\) and \(\mathrm{sat}(t, \phi_2)\) (\(\mathrm{sat}(t, \phi_1)\) or \(\mathrm{sat}(t, \phi_2)\), not \(\mathrm{sat}(t, \phi_1)\)); (iii) if \(\phi\) is \(\exists u\,\phi_1\) (\(\forall u\,\phi_1\)), then \(\mathrm{sat}(t, \phi)\) iff \(\mathrm{sat}(s, \phi_1)\) for some \(s \in t\) (all \(s \in t\)). The predicate "\(\mathrm{sat}(t, \phi)\)" is decidable, since \(t\) is hereditarily finite.


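Let \(A\) be a relational system, \(x\) an \(|A|\)-sequence of length \(k\), \(\phi\) a prenex formula with free variables among \(u_1, \dots, u_k\) and with a prefix \(Q_1 u_{k+1} Q_2 u_{k+2} \dots Q_n u_{k+n}\) (the \(Q_i\) are quantifiers). A straightforward induction shows that

\[ A \models \phi[x] \quad\text{iff}\quad \mathrm{sat}(t_{nk}(A, x), \phi). \]

In particular, if \(\phi\) is a sentence,

\[ A \models \phi \quad\text{iff}\quad \mathrm{sat}(t_n(A), \phi). \]

If \(\mathfrak{K}\) is a class of relational systems,

(1) \(\mathfrak{K} \models \phi\) iff \(\mathrm{sat}(s, \phi)\) for all \(s \in t_n^{\mathfrak{K}}\).

On the other hand: Given \(s \in T_{nk}\), a formula \(\phi_s\) can be found such that for all \(A\) and \(x\) (of length \(k\)),

\[ A \models \phi_s[x] \quad\text{iff}\quad t_{nk}(A, x) = s. \]

Therefore, if \(s \in T_n\),

(2) \(s \in t_n^{\mathfrak{K}}\) iff not \(\mathfrak{K} \models \neg\phi_s\).

The criterion follows from (1), (2) and (*).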
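For a finite relational system the nk-types and the predicate sat of this section can be computed directly. The following is a minimal sketch under stated assumptions: the encoding of atomic formulae as pairs (predicate name, tuple of variable indices), the tuple-based formula syntax, and the names `ntype` and `sat` are all chosen here for illustration and do not come from the paper. A quantifier always binds the next variable \(u_{k+1}\), as in the prenex prefix above.

```python
from itertools import product


def atoms(signature, k):
    """All atomic formulae over u_1..u_k, encoded as (predicate, index tuple)."""
    return [(name, idx)
            for name, arity in signature.items()
            for idx in product(range(1, k + 1), repeat=arity)]


def ntype(n, domain, relations, signature, x=()):
    """t_{n,k}(A, x) for k = len(x), as a hereditarily finite frozenset."""
    if n == 0:
        return frozenset(
            (name, idx) for name, idx in atoms(signature, len(x))
            if tuple(x[i - 1] for i in idx) in relations[name]
        )
    return frozenset(ntype(n - 1, domain, relations, signature, x + (y,))
                     for y in domain)


def sat(t, phi):
    """The predicate sat(t, phi), following clauses (i)-(iii)."""
    op = phi[0]
    if op == "atom":
        return (phi[1], phi[2]) in t
    if op == "not":
        return not sat(t, phi[1])
    if op == "and":
        return sat(t, phi[1]) and sat(t, phi[2])
    if op == "or":
        return sat(t, phi[1]) or sat(t, phi[2])
    if op == "exists":
        return any(sat(s, phi[1]) for s in t)
    if op == "forall":
        return all(sat(s, phi[1]) for s in t)
    raise ValueError(f"unknown connective: {op}")


def chain(elements):
    """A strict linear order on the given elements, in the listed order."""
    pairs = {(a, b) for i, a in enumerate(elements) for b in elements[i + 1:]}
    return list(elements), {"<": pairs}, {"<": 2}


# "There is a least element": exists u1 forall u2 not(u2 < u1).
LEAST = ("exists", ("forall", ("not", ("atom", "<", (2, 1)))))
# "The order is dense": forall u1 forall u2 exists u3 (not(u1<u2) or (u1<u3 and u3<u2)).
DENSE = ("forall", ("forall", ("exists",
         ("or", ("not", ("atom", "<", (1, 2))),
                ("and", ("atom", "<", (1, 3)), ("atom", "<", (3, 2)))))))

dom, rel, sig = chain([0, 1, 2])
print(sat(ntype(2, dom, rel, sig), LEAST), sat(ntype(3, dom, rel, sig), DENSE))
```

A sentence with \(n\) quantifiers is evaluated on the n-type, so `LEAST` is checked against `ntype(2, ...)` and `DENSE` against `ntype(3, ...)`. Since isomorphic systems have the same n-type, relabelling the three points does not change the result.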

2. The WS theory of linear order The WS theory is viewed as the first-order theory of a special class Gt (the “standard models”) of relational systems. The similarity type is given by a unary predicate E and two binary predicates S, 0. A relational system A = ( / A1, EA, S,, 0,) belongs to Gt iff there is a linearly ordered system ( U , <), Uf A , such that (i) IAl= F ( U ) , the set of all finite subsets of U , (ii) EA(x)holds iff x = A , (iii) xS,y iff x s y ,(iv) x0,y iff, for every U E X - ~there is V E Y - xsuch that u < v . The set U= U IAl will be denoted by d. The singletons in I A Jcan be characterized in terms of s,. Furthermore, U .{ O,{U>, and U E X iff { u } S,X. Thus the first-order theory of A = (IAI, EA, S,, 0,) is of the same strength as the WS theory of (d, <>.The choice of the primitive constants E, S and 0 is motivated by technical reasons. The isomorphism type of A is uniquely determined by the order type of (2, <), and vice versa. A will denote the order type of (d, <), and W S ( a ) , the “WS theory of the order type a”, will denote the set of all formulae 4 such that whenever A E G and ~ K = E , then A 14. Two order types a, p are said to be WS equivalent if WS(a)=WS(/?).The Skolem-Lowenheim


theorem holds for WS theory (even for a stronger theory, see [6]): Every order type is WS equivalent to some countable order type. Let \(R\) be the set of rationals. The shuffling \(\sigma_F\) of a finite set \(F\) of order types is defined by \(\sigma_F = \sum\{\alpha_r : r \in R\}\) (the summands \(\alpha_r\) arranged in the natural order), where \(\alpha_r \in F\) for all \(r\) and \(\{r : \alpha_r = \alpha\}\) is dense in \(R\) for all \(\alpha \in F\). \(\sigma_F\) does not depend on the particular partition of \(R\) into dense subsets. Let \(M\) be the least class of order types which contains the order type 1 and is closed under the operations \(\alpha + \beta\), \(\alpha\cdot\omega\), \(\alpha\cdot\omega^*\), \(\sigma_F\) (\(F\) finite). Let \(\mathfrak{M}\) denote the corresponding subclass of \(\mathfrak{S}\).

THEOREM 1. \(t_n^{\mathfrak{M}}\) effectively depends on \(n\).

THEOREM 2. \(t_n^{\mathfrak{S}} = t_n^{\mathfrak{M}}\).

By the decidability criterion of section 1:

COROLLARY. The WS theory of linear order is decidable.

3. Proof of Theorem 1

We show that to each of the operations a+/?, a v o , a-w*, oF corresponds an effective operation on the n-types. Notation: P ( X ) is the power set of X,F ( X ) the set of all finite subsets of X , X x Y the Cartesian product of X and Y, f o g the composition of the functions f and g ; we write f ” X f o r {f ( x ) : X E X } ,x * y denotes adjunction of term y to the sequence x, ( y ) is the one-element sequence with term y . The elements o f the set At, are the formulae Eui, viSuj, uiOuj, where 1< i , j < k. At, = A . n-types are invariant under isomorphisms. t,(a) will denote the n-type of the systems AEGt with A=a.

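3.1. LEMMA 1. \(t_n(1)\) effectively depends on \(n\).

The proof is immediate since 1 is a finite order type. By way of example: if the underlying ordered set of \(A\) is \(\{a\}\), then

\[ t_{02}(A, \langle \{a\}, \Lambda \rangle) = \{Eu_2,\ u_1Su_1,\ u_2Su_2,\ u_2Su_1,\ u_1Ou_1,\ u_2Ou_2,\ u_2Ou_1\} \]

and

\[ t_2(1) = t_2(A) = \{\{t_{02}(A, \langle \Lambda, \Lambda \rangle),\ t_{02}(A, \langle \Lambda, \{a\} \rangle)\},\ \{t_{02}(A, \langle \{a\}, \Lambda \rangle),\ t_{02}(A, \langle \{a\}, \{a\} \rangle)\}\}. \]

3.2. The binary operation \(+_{nk}\) on \(T_{nk}\) is defined thus:

\[ t +_{0k} s = (t \cap s) \cup \{\text{“}u_i O u_j\text{”} : \text{“}u_j O u_i\text{”} \notin s \text{ and } i, j \le k\}, \]

\[ t +_{n+1,k} s = \{t' +_{n,k+1} s' : t' \in t \text{ and } s' \in s\}. \]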

WEAK SECOND ORDER THEORY

n

The order-sum A+B, where A , BEG;^, is defined in a natural way ( A +B= (dx (0)) u (B x { l})). Let x be a k-sequence of elements of IA +BI, xlA = ( x l l A, ..., xk l A ) the restriction of x to A (if ~ E I A + B I then x l A = {u:

(U,O)EX)).

LEMMA 2. tnk(A+B, x)=t,k(A, xlA)+nktnk(B, X l B ) . Proof. We have i EA(xilA) and EB(xilB>, a) E A + B ~ iff xiSA+BXj iff (XilA) S ~ ( x j l A ) and (XiIB) s B ( x j l B ) , x~OA+BX~ iff either (xilA) O , ( x j [ A ) and OB(xjlB>, Or not ( x j l B ) OB(xilB)* b) The function defined by f ( x ) = ( x l A , x l B ) maps IA+BI one-one onto IAI x PI. The lemma for n=O follows from a), the induction step n, k + l+n+ 1, k from b).
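Lemma 2 can be tested on small finite orders. The sketch below is an illustration under explicit assumptions: the underlying orders are sets of integers with their usual order (so that \(A\) lies below \(B\) and restriction to \(A\) is plain intersection), atomic formulae are encoded as tuples such as `("O", i, j)`, and all function names are hypothetical. It computes types of the standard model over a finite set directly and compares them with the result of the operation \(+_{nk}\).

```python
from itertools import combinations


def finite_subsets(universe):
    """|A|: all (finite) subsets of the underlying set, as frozensets."""
    points = sorted(universe)
    return [frozenset(c) for r in range(len(points) + 1)
            for c in combinations(points, r)]


def o_holds(x, y):
    """x O y: every u in x - y lies below some v in y - x."""
    return all(any(u < v for v in y - x) for u in x - y)


def t0(x):
    """t_{0k}(A, x): the atomic formulae E, S, O true of the k-sequence x."""
    true_atoms = set()
    for i in range(1, len(x) + 1):
        if not x[i - 1]:
            true_atoms.add(("E", i))
        for j in range(1, len(x) + 1):
            if x[i - 1] <= x[j - 1]:          # subset test for S
                true_atoms.add(("S", i, j))
            if o_holds(x[i - 1], x[j - 1]):
                true_atoms.add(("O", i, j))
    return frozenset(true_atoms)


def ntype(n, universe, x=()):
    """t_{nk}(A, x) for the standard model over the given finite universe."""
    if n == 0:
        return t0(x)
    return frozenset(ntype(n - 1, universe, x + (y,))
                     for y in finite_subsets(universe))


def plus(n, k, t, s):
    """The operation t +_{nk} s of 3.2."""
    if n == 0:
        extra = {("O", i, j)
                 for i in range(1, k + 1) for j in range(1, k + 1)
                 if ("O", j, i) not in s}
        return frozenset((t & s) | extra)
    return frozenset(plus(n - 1, k + 1, t1, s1) for t1 in t for s1 in s)


A, B = {0}, {1, 2}   # A below B, so A | B plays the role of the order-sum A + B
for n in range(3):
    print(n, ntype(n, A | B) == plus(n, 0, ntype(n, A), ntype(n, B)))
```

The check covers only the case \(k = 0\) on one small example, so it illustrates the lemma rather than proving it.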

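COROLLARY 2.1. \(t_n(\alpha + \beta) = t_n(\alpha) +_{n0} t_n(\beta)\).

3.3. The functions \(\mathrm{fin}_n^h\), \(\mathrm{fin}_n : T_{n+1} \to T_{n+1}\) are defined by:
\(\mathrm{fin}_n^0(t) = t\),
\(\mathrm{fin}_n^{h+1}(t) = \mathrm{fin}_n^h(t) \cup \{r +_{n1} s : r \in \mathrm{fin}_n^h(t) \text{ and } s \in t\}\),
\(\mathrm{fin}_n(t) = \mathrm{fin}_n^m(t)\), where \(m = \mu h\,(\mathrm{fin}_n^{h+1}(t) = \mathrm{fin}_n^h(t))\).

(\(m\) exists since \(\mathrm{fin}_n^h(t) \subseteq \mathrm{fin}_n^{h+1}(t) \in T_{n+1}\), which is finite.) \(\mathrm{fin}_n^{h+1}(t) = \mathrm{fin}_n^h(t)\) implies \(\mathrm{fin}_n^{h+2}(t) = \mathrm{fin}_n^{h+1}(t)\). Therefore, \(\mathrm{fin}_n(t) = \bigcup\{\mathrm{fin}_n^h(t) : h < \omega\}\). That is,

LEMMA 3. \(\mathrm{fin}_n(t)\) is the set of all finite sums \(s_1 +_{n1} s_2 +_{n1} \dots +_{n1} s_h\) (associated to the left) with \(s_i \in t\).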
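The construction of \(\mathrm{fin}_n\) is a least-fixed-point iteration: keep adjoining sums \(r + s\) with \(s\) from the original set until nothing new appears. The sketch below shows only this iteration pattern. As a stand-in for the operation on types it uses addition modulo 12 on integers, which is an assumption made purely for illustration, and the name `fin_closure` is hypothetical.

```python
def fin_closure(t, op):
    """Iterate fin^{h+1} = fin^h ∪ {op(r, s) : r in fin^h, s in t} to a fixed point.

    Returns the closure together with the least h at which fin^{h+1} = fin^h.
    """
    base = frozenset(t)
    current = base
    steps = 0
    while True:
        bigger = current | {op(r, s) for r in current for s in base}
        if bigger == current:
            return current, steps
        current, steps = bigger, steps + 1


def add_mod_12(r, s):
    """Stand-in for the sum of types; any operation on a finite carrier would do."""
    return (r + s) % 12


closure, m = fin_closure({4, 6}, add_mod_12)
print(sorted(closure), m)
```

In this toy instance the closure consists of all left-associated sums of the elements 4 and 6 modulo 12, in line with the description given in Lemma 3; termination is guaranteed because the carrier is finite, which is the role played by the finiteness of \(T_{n1}\) in the text.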

3.4. Let n be a permutation of (1, 2, ..., k } . Let Subst"(4) denote the formula obtained from 4 under substitution of variables ui+uni, i= 1, 2, ..., k . Let n+denote the extension of 17 to (1, ..., k + l } leaving k + 1 fixed. G: : Tnk+Tnkis defined by G & ( t ) = {$: Subst"(d)€t},

Gf+1 ,k (t , = GEk++I . Let 17x=<x,,,

..., xnk).

LEMMA 4. tnk(A, nX)=G$(t,k(A, x)). The proof is straight forward.

194

H. LAUCHLI

3.5. The functions FA:Tnk-+Tn,k+l and Fn;: FOfk(t) = t

i

U { “ E U k + 1”) U ( “ U i s U j ” :

u {“ui0uj”: i

=

k

+1

=k

or

TZ,kfl+Tnk

+1

Or

“ E u ~ ” Et ,

are defined by:

“EUi”E

t,

and j < k + I} and j < k 11,

+

FO;k(t)=t- {formulae involving u,+,>, 1,k

(1)= (G$+

FflQl,k(t)

2 OFnfk+

= (FnTk+

oCEkf2)“t

Let x be of length k and LEMMA tnk ( A>

x).

5.

F,f,(titk(A,

I

where ll is the transposition k

+1

c*

k

+2.

IAl.

X))=fn,k+l(A,x*A),

and

Fni(tn,k+l(A,X*y))=

In particular, F f l i ( t f l l ( A(,y ) ) ) = t n ( A ) . The proof of the lemma is straight forward.


hypothesis give

c


0

t,l(

'%,) = G

i=h+l

( % ( W f l + 1 ( 4 ) ) ) ~

Therefore, by Lemma 3, tn+l(a.w>= ~ , + l ( t , + l ( ~ > ) .

3.8. altf:, altn:Tn+,x Tnl+T,,+lare defined by alt; (s, t ) = { t ) , alt;+'(s, t ) = altf:(s, t)u ( r +,,s' + , , t : rEaltf:(s, t ) and S ' E S } , alt,(s, t ) = altr(s, t ) , where m = ph(alt;+'(s, t ) = altf:(s, t ) ) . (Note that T,+, = Tn+l,o,while Tfll= T,,,.) LEMMA 8. alt,(s, t ) is theset ofalljinite alternating sums t+,,s1 f n l t + , , . . .

... + n l ~ h + , l l twith h a 0 atzd s i ~ all s i
1( r ) =

altfl(U r , G ( o f l ( f C r N* )

LEMMA 9. I f F is ajinite set of order types then t,(aF)=o,(t:F). Proof analogous to the proof of Lemma 7. 3.10. R!,R, are the following subsets of T,:

fc

R;"

=

{ t f l ( l > >>

= R f : u{s +,ot:

s, t E R ; }

u { ~ , , ( t )t:E R i ) u (an*( t ) : t E Rh,) u {a,(r): r E R f : } , R , = R r , where rn=ph(Rf:+'=Rf:). LEMMA 10. t t M = R,. Proof by Corollary 2.1, Lemmas 7 and 9. Since all operations introduced above are effective, this completes the proof of Theorem 1. 4. Proof of Theorem 2

Essentially, the proof can be found in [ 5 ] . pp. 111, 114, 11.5. We reformulate the necessary lemmas in the present notation and point out if neces-


sary how the proofs have to be modified. The numbering of the lemmas corresponds to the numbering in [ 5 ] . LEMMA 2‘. T, isfinite for each n. Let A , E G t for Z E I Given . a linear order relation A , is defined in a natural way.

< on I, the order-sum

XI

LEMMA 3’. Zft,(A,)=t,(B,), all ~ € 1then , t,(xIA , ) = t , ( x , B,). We prove the following generalization: Let x, y be k-sequences with x i ~ I C A l yi€ICB,l. l, Then (*)

if

tnk(A,,xlA,)

then

= tnk(B,,

ylB,),

=tnk(x

tnk(x

B,?Y ) .

(The lemma is the special case k=O, x = y = A . ) Proof. For n=O, (*) follows from t o k ( C A,, x) = { t O k ( A lxlA,):z~Z}u , {“uiOui)’: there is ~ ~such € that 1 “UjOui”$tOk(A,,,, xlA,,) and “ t @ j ” E t O k x x (A,, xlA,) for all z 2 l o } , which is an easy consequence of the definitions involved and the fact that the sets x i are finite. Induction step. Let t n + l , k ~ x ( A , , x l A , ) = t , + l , k ( B , , ylB,), all ZEZ. Then for all I and for every a , ~ l A , l there is b,EIB,I such that

n

(I)

tn,k+ 1

*

=

tn,k+ 1

(Bi,(ylBi) * b i )

Using the fact that “E” is one of our primitive predicates, an induction shows that in (l), either both or none of a,, b, are empty. Therefore, if a,=alA,, all z, for some a ~ l C A , [ ,and the b,’s are chosen to satisfy (l), then there is aJinite set b such that b,=blB,, all z. Therefore, for every a ~ l x A , I there is b E l x B,I such that for all ZEZ, tn,k+l(A,,(xlA,)*(aIA,))

= tn,k+l

( B ~ 3

(YIBc)*(blB1)).

Hence, by induction hypothesis, for every a ~ l A,I x there is b ~ l B,I x such that t n , k + l ( x A c ) X * a ) = t n , k + l ( C B , , y * b ) . Therefore t n + l , k ( C A t , X ) = t n + l , k X x
Proof by Lemma 3‘.

LEMMA 8‘. If the order type of every bounded segment of a given denumerable ordered set A is good, then the order type of A is good.


LEMMA 9′. Every order type is good.

Up to an obvious change of notation, the proofs are verbally the same as in [5]. (The Skolem-Löwenheim theorem is available in WS theory and, if \(WS(\alpha) = WS(\beta)\) then \(t_n(\alpha) = t_n(\beta)\). The occurrences of \(|A|\) in [5] have to be replaced by d.) Lemma 9′ establishes Theorem 2.

References
1. J. R. BÜCHI, Transfinite automata recursions and weak second order theory of ordinals, in: Logic, Methodology and Philosophy of Science, ed. Y. Bar-Hillel (North-Holland Publ. Co., Amsterdam, 1965) pp. 3-23.
2. J. E. DONER, Decidability of the weak second-order theory of two successors, Am. Math. Soc. Notices, November 1965, 65T-468.
3. A. EHRENFEUCHT, Decidability of the theory of the linear ordering relation, Am. Math. Soc. Notices 6, No. 3 (1959).
4. R. FRAÏSSÉ, Etude de certains opérateurs dans les classes de relations, définis à partir d'isomorphismes restreints, Z. Math. Logik und Grundl. Math. 2 (1956) 59-75.
5. H. LÄUCHLI and J. LEONARD, On the elementary theory of linear order, Fund. Math. 59 (1966) 109-116.
6. A. TARSKI, Some model-theoretical results concerning weak second order logic, Am. Math. Soc. Notices 5 (1959) 550-556.

STRUKTURZAHLEN IN ENDLICHEN RELATIONSSYSTEMEN

W. OBERSCHELP
Hannover

1. Introduction

Over a finite, non-empty set of points, which without loss of generality may be taken to be \(N = \{1, \dots, n\}\), we consider, as in [13], relational systems \(\mathfrak{R} = (N, R_1^{(1)}, \dots, R_{\mu_1}^{(1)}, \dots, R_1^{(m)}, \dots, R_{\mu_m}^{(m)})\). Here the relation \(R_j^{(i)}\) has arity \(i\), while \(j\) serves as a distinguishing index. \(\tau = [\mu_1, \dots, \mu_m]\) is called the type of \(\mathfrak{R}\). \(\mathfrak{R}\) and \(\mathfrak{R}'\) with the same domain \(N\) and the same type \(\tau\) are called isomorphic if there exists a permutation \(\pi \in \mathfrak{S}_n\) which carries each \(R_j^{(i)}\) into the corresponding \(R_j'^{(i)}\). We seek asymptotically evaluable formulas for the number \(S(n, \tau)\) of non-isomorphic relational systems of type \(\tau\) over an n-element domain \(N\). This structure-number problem was already raised by Carnap [4] (p. 124), but was treated there only for the case \(m = 1\). It is solved here for so-called pure relational systems, i.e. for systems with relations of only one arity \(\sigma\). Such a type \(\tau = [0, \dots, 0, \mu_\sigma, 0, \dots, 0]\) is abbreviated as \(\tau = (\sigma, \mu_\sigma)\). Carnap's problem was taken up by Davis [5], resulting in a structure-number formula (not directly evaluable asymptotically) for \(S(n, (\sigma, 1))\). Harary [8] recognized that this result is to be interpreted as a special case of the general enumeration theory of Pólya [11], a theory which has recently been developed further, in particular by De Bruijn [1, 2]. Harary [9] gave without proof the asymptotic relation \(S(n, (2, 1)) \sim 2^{n^2}/n!\). Methods for a more precise estimate can be found in Oberschelp [10] (§ 4). As a result we mention, for example,
n!

1 (2n2 - 2n)

1 (32n4 - 170 3n3 -t + 24n -

+ 288n2 - 149 3n) + 0 199

(2n’ 1) Tn

.


In this paper, using the theory of Pólya mentioned above, asymptotic counting formulas for \(S(n, (\sigma, \mu_\sigma))\) are given for arbitrary \(\sigma\) and \(\mu_\sigma\). The results are interpreted as statements about the mean size of the automorphism group of pure relational systems \(\mathfrak{R}\). For arities \(\sigma > 1\) almost all such relational systems turn out to be rigid, i.e. they possess only the trivial (one-element) automorphism group. Let \(Z(n, \tau)\) be the number of relational systems over \(N\) of type \(\tau\) without identification of isomorphic systems. \(Z(n, \tau)\) is equal to the number of state descriptions of Carnap (e.g. in [4], § 18A). Trivially we have the formula

\[ Z(n, \tau) = 2^{\sum_{\sigma=1}^{m} \mu_\sigma n^\sigma}, \quad\text{if } \tau = [\mu_1, \dots, \mu_m]. \]

For the average number $s(n, \tau)$ of mutually isomorphic relational systems we have $s(n, \tau) = Z(n, \tau)/S(n, \tau)$.

The automorphism group $\mathfrak{G}_{\mathfrak{A}}$ of $\mathfrak{A}$, a subgroup of the symmetric group $\mathfrak{S}_n$ with order $g$, partitions $\mathfrak{S}_n$ into $n!/g$ cosets. All distinct relational systems isomorphic to one another arise from one of them by applying to it one permutation from each coset. Hence the "mean order" $g(n, \tau)$ of $\mathfrak{G}_{\mathfrak{A}}$ satisfies $g(n, \tau) = n!/s(n, \tau)$. Since $1 \le g(n, \tau) \le n!$, we have

$$Z(n, \tau)/n! \le S(n, \tau) \le Z(n, \tau).$$

The rigidity of almost all relational systems in the case $\sigma > 1$, mentioned above, is shown by proving that here, for $n \to \infty$, the asymptotic relation $g(n, \tau) \sim 1$, i.e. $S(n, \tau) \sim Z(n, \tau)/n!$, holds. It should also be mentioned that our considerations can be regarded as a quantitative, finite variant of efforts in modern metamathematics which aim at finding models with a large automorphism group *.
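The two elementary quantities used so far can be tried out numerically. The following is only a minimal sketch, not part of the original paper: the function names `total_systems` and `rigid_estimate` are hypothetical, and the type is assumed to be passed as a list `mu` whose entry `mu[k]` is the number of relations of arity `k + 1`. It evaluates $Z(n, \tau) = 2^{\sum_\sigma \mu_\sigma n^\sigma}$ and the lower bound $Z(n, \tau)/n!$, which is the asymptotic value of $S(n, \tau)$ in the rigid case.

```python
from fractions import Fraction
from math import factorial


def total_systems(n, mu):
    """Z(n, tau): number of relational systems over n points, no isomorphism identification.

    mu[k] is the number of relations of arity k + 1 (assumed input convention).
    """
    exponent = sum(count * n ** arity for arity, count in enumerate(mu, start=1))
    return 2 ** exponent


def rigid_estimate(n, mu):
    """Z(n, tau) / n!, the lower bound for S(n, tau), as an exact fraction."""
    return Fraction(total_systems(n, mu), factorial(n))


if __name__ == "__main__":
    # One binary relation, tau = [0, 1], on n = 2, ..., 5 points.
    for n in range(2, 6):
        print(n, total_systems(n, [0, 1]), float(rigid_estimate(n, [0, 1])))
```

Exact rational arithmetic is used so that the quotient is not rounded for small $n$, where $Z(n, \tau)/n!$ need not be an integer.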

2. Application of Pólya's theorem to the counting problem

A pure relational system $\mathfrak{A}$ of type $\tau = (\sigma, \mu)$ over $N$ can be described completely by a diagram in the form of an incidence matrix with $\mu$ rows and $n^\sigma$ columns whose entries are the numbers 0 and 1. The rows correspond to the $\sigma$-ary relations $R_1, \dots, R_\mu$ of $\mathfrak{A}$, the columns to the $\sigma$-tuples $x_1, \dots, x_{n^\sigma}$ of numbers from $N$ in some fixed lexicographic order. $u_{ik} = 1$ means that the relation $R_i$ holds of the $k$-th $\sigma$-tuple; correspondingly, 0 means that it does not hold. In this representation a pure relational system appears as a sequence of so-called elementary configurations, in this case of columns. By the strength of a column we mean the sum of its components, and by the strength of a relational system the sum of the strengths of all its columns.

[Diagram: incidence matrix with rows $R_1, \dots, R_\mu$ and columns $x_1, \dots, x_{n^\sigma}$.]

* Cf. e.g. [6].

The elementary polynomial is a power series (formally extendable to infinity)

$$E(z) = \sum_{\nu=0}^{\infty} e_\nu z^\nu,$$

whose coefficients $e_\nu$ give the number of ways of producing an elementary configuration (column) of strength $\nu$. Since this can evidently be done in $\binom{\mu}{\nu}$ ways, we have

$$E(z) = \sum_{\nu=0}^{\mu} \binom{\mu}{\nu} z^\nu = (1 + z)^\mu.$$

Since every pure relational system is determined by $n^\sigma$ such columns, one would obtain, disregarding isomorphism identifications, a polynomial

$$A_{n,\tau}(z) = E(z)^{n^\sigma} = (1 + z)^{\mu n^\sigma} = \sum_{\nu=0}^{\mu n^\sigma} a_\nu z^\nu \quad \text{with} \quad a_\nu = \binom{\mu n^\sigma}{\nu},$$

where $a_\nu$ is the number of distinct relational systems of strength $\nu$. Here, however, we are interested in the number of isomorphism classes of strength $\nu$, i.e. in the polynomial

$$B_{n,\tau}(z) = \sum_{\nu=0}^{\mu n^\sigma} S_\nu(n, \tau) z^\nu,$$

where $S_\nu(n, \tau)$ is the number of non-isomorphic relational systems over $N$ of type $\tau$ with strength $\nu$. The asymptotic behaviour of these numbers themselves, however, will not be investigated here *; only the totals

$$S(n, \tau) = B_{n,\tau}(1) = \sum_{\nu=0}^{\mu n^\sigma} S_\nu(n, \tau)$$

are the subject of this investigation.

* Cf. for the case $\tau = (2, 1)$ [10], § 5.

Passing to an isomorphic relational system by means of a permutation $\pi \in \mathfrak{S}_n$ effects an exchange of certain columns of $\mathfrak{A}$, i.e. a permutation $\Pi$ over a set of $n^\sigma$ elements. These permutations $\Pi$ form the so-called $\sigma$-tuple group, denoted by $\mathfrak{S}_n^\sigma$, which like $\mathfrak{S}_n$ is evidently a permutation group of order $n!$. To apply Pólya's theory one has to consider the so-called cycle index of $\mathfrak{S}_n^\sigma$. By the cycle index $Z(\mathfrak{G})$ of a permutation group $\mathfrak{G}$ of order $g$ over $n$ elements one understands a formal polynomial in $n$ variables $f_1, \dots, f_n$:

$$Z(\mathfrak{G}) := \frac{1}{g} \sum_{\pi \in \mathfrak{G}} f_1^{p_1} \cdots f_n^{p_n}.$$

Here $p_i$ is the number of cycles of $\pi$ of length $i$. For the partition of the number $n$ associated with the permutation $\pi$, written $p(\pi) := (p_1, \dots, p_n)$, we thus have $\sum_{i=1}^{n} i p_i = n$. For the sought polynomial $B_{n,\tau}(z)$, Pólya's theory now yields, in the case of pure relational systems,

THEOREM 1: $B_{n,\tau}(z) = Z(\mathfrak{S}_n^\sigma)[f_i / E(z^i)]$.

The so-called Pólya substitution indicated in square brackets is to be understood as follows: for each occurrence of $f_i$ in the cycle index $Z(\mathfrak{S}_n^\sigma)$ of the $\sigma$-tuple group, the polynomial $E(z^i)$ is to be substituted. The right-hand side is then indeed a polynomial in $z$. Since $E(z)$ has just been given explicitly by the trivial elementary combinatorics indicated above, the evaluation of the formula given in Theorem 1 stands or falls with the numerical computation of $Z(\mathfrak{S}_n^\sigma)$. Harary's proof [7] of Theorem 1 opens up Pólya's combinatorics for problem areas such as the present one. For the total number of structures (regardless of strength) we have the

COROLLARY: $S(n, (\sigma, \mu)) = Z(\mathfrak{S}_n^\sigma)[f_i / 2^\mu]$.

For $S(n, (\sigma, \mu)) = B_{n,(\sigma,\mu)}(1)$ and $E(1) = 2^\mu$. The scheme $P(\Pi) := (P_1, \dots, P_{n^\sigma})$ belonging to $\Pi \in \mathfrak{S}_n^\sigma$, i.e. the scheme of the exponents in the cycle index $Z(\mathfrak{S}_n^\sigma)$, arises from $p(\pi) = (p_1, \dots, p_n)$ in an easily surveyable way, which is described precisely in Oberschelp [10] (§ 2).
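For very small $n$ the Corollary can be checked by brute force. The sketch below is an illustration only and is not the method of [10]: instead of using closed expressions for the cycle index, it enumerates all $n!$ permutations, counts the cycles of the induced permutation on $\sigma$-tuples, and substitutes $2^\mu$ for every cycle variable. The function names `tuple_cycle_count` and `structure_number` are hypothetical.

```python
from itertools import permutations, product
from math import factorial


def tuple_cycle_count(perm, sigma):
    """Number of cycles of the permutation induced by perm on all sigma-tuples."""
    n = len(perm)
    seen = set()
    cycles = 0
    for start in product(range(n), repeat=sigma):
        if start in seen:
            continue
        cycles += 1
        t = start
        while t not in seen:          # walk once around the cycle containing start
            seen.add(t)
            t = tuple(perm[x] for x in t)
    return cycles


def structure_number(n, sigma, mu):
    """S(n, (sigma, mu)) via the Corollary: every cycle variable is replaced by 2**mu."""
    total = sum(2 ** (mu * tuple_cycle_count(p, sigma))
                for p in permutations(range(n)))
    return total // factorial(n)      # the sum is divisible by n! (orbit counting)


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, structure_number(n, 2, 1))
```

The enumeration costs on the order of $n! \cdot n^\sigma$ steps, so it is only meant for sanity checks with $n \le 5$ or so; for larger arguments one needs the cycle index in closed form.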

3. Evaluation of the Pólya formula for pure relational systems with $\sigma > 1$

The asymptotic behaviour of $S(n, \tau)$ for $n \to \infty$ will now be investigated. The cases $\sigma = 1$ and $\sigma > 1$ lead to essentially different results. The less trivial case $\sigma > 1$ is treated first. The largest possible order of an automorphism group of $\mathfrak{A}$ is $\Gamma(n, \tau) = n!$, independently of $\tau$. For example, a relational system consisting only of universal relations has the full $\mathfrak{S}_n$ of order $n!$ as its automorphism group. On the other hand, there are always relational systems with the trivial automorphism group (so-called rigid systems), for which the smallest possible order is therefore $\gamma(n, \tau) = 1$. Examples are provided e.g. by relations of the type of an ordering in the sense of $<$ over $N$. For the mean order of the automorphism group one can therefore at first only state $1 \le g(n, \tau) \le n!$.
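$$(n-2)^\sigma + \tfrac{1}{2}\left(n^\sigma - (n-2)^\sigma\right) = \tfrac{1}{2}\left(n^\sigma + (n-2)^\sigma\right) = n^\sigma - \sigma n^{\sigma-1} + \sigma(\sigma-1)\, n^{\sigma-2}\left(1 + (\sigma-2)\, O(1/n)\right).$$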


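Altogether, the contribution of the transpositions is therefore

$$\frac{2^{\mu n^\sigma}}{n!} \cdot \frac{n(n-1)}{2} \cdot \frac{2^{\mu\sigma(\sigma-1)\, n^{\sigma-2}(1 + (\sigma-2)\, O(1/n))}}{2^{\mu\sigma n^{\sigma-1}}}.$$

Here $O(1/n)$ has the form

$$O\!\left(\tfrac{1}{n}\right) = -\frac{2}{3n} + \frac{\sigma-3}{3n^2} - \frac{2(\sigma-3)(\sigma-4)}{15 n^3} + \frac{2^5 (\sigma-3)(\sigma-4)(\sigma-5)}{6!\, n^4} - \cdots$$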

c) Let $\Sigma_{\kappa,n}$ be that part of the sum in the Corollary of Section 2 which belongs to all $\Pi$ induced by permutations $\pi$ with $p_1(\pi) = n - \kappa$ ($0 < \kappa \le n$).

1) The number of $\pi$ with $p(\pi) = (n - \kappa, p_2, \dots, p_n)$ is known to be *

$$\frac{n!}{(n-\kappa)!\; 2^{p_2} p_2! \cdots n^{p_n} p_n!} < \frac{n!}{(n-\kappa)!}.$$

2) The number of partitions of the number $n$ of the form $p = (n - \kappa, p_2, \dots, p_n)$ is less than or equal to $2^\kappa$. For all these partitions must correspond to a partition $p = (0, p_2, \dots, p_\kappa)$ of the number $\kappa$, and of these there are, as is well known, fewer than $2^\kappa$.

3) For the sum of the exponents we have

$$\sum_{i=1}^{n^\sigma} P_i \le P_1 + \tfrac{1}{2}\left(2P_2 + 3P_3 + \cdots + n^\sigma P_{n^\sigma}\right) = P_1 + \tfrac{1}{2}(n^\sigma - P_1) = \tfrac{1}{2}(n^\sigma + P_1).$$

Now $P_1 = (n - \kappa)^\sigma$, since, as in b), the $\sigma$-tuples left unchanged by $\Pi$ are exactly those $(n - \kappa)^\sigma$ tuples which contain only the $n - \kappa$ elements of $N$ not moved by $\pi$. Hence

$$\sum_{i=1}^{n^\sigma} P_i \le \tfrac{1}{2}\left(n^\sigma + (n - \kappa)^\sigma\right).$$

* Cf. [12], p. 67 (1).

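From 1), 2) and 3) it follows that

$$\Sigma_{\kappa,n} < \frac{1}{n!} \cdot 2^\kappa \cdot \frac{n!}{(n-\kappa)!} \cdot 2^{\frac{\mu}{2}\left(n^\sigma + (n-\kappa)^\sigma\right)}.$$

Accordingly,

$$\Sigma_{\kappa,n} < \frac{2^{\mu n^\sigma}}{n!} \cdot \frac{(2n)^\kappa}{2^{\frac{1}{2}\kappa\mu\sigma n^{\sigma-1}(1 + o(1))}},$$

where the $o$-constant still depends, among other things, on $\kappa$. We need, however, an estimate that is uniform in $\kappa$. Written differently, the exponent in the denominator reads

$$\frac{\kappa\mu\sigma n^{\sigma-1}}{2}\left(1 - \frac{\sigma-1}{2!}\cdot\frac{\kappa}{n} + \frac{(\sigma-1)(\sigma-2)}{3!}\cdot\frac{\kappa^2}{n^2} - \cdots\right).$$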

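The bracket, whose value we denote by $K$, has the form

$$K = \frac{n}{\sigma\kappa}\left(1 - \left(1 - \frac{\kappa}{n}\right)^\sigma\right).$$

It will be shown that $K$ is greater than a positive constant independent of $\kappa$ and $n$, namely $1/(2\sigma)$.

α) If $0 < \kappa \le n/\sigma$, then

$$K \ge 1 - \frac{1}{2}\cdot\frac{\sigma-1}{\sigma} + \frac{1}{2\cdot 3}\cdot\frac{(\sigma-1)(\sigma-2)}{\sigma^2} - \frac{1}{2\cdot 3\cdot 4}\cdot\frac{(\sigma-1)(\sigma-2)(\sigma-3)}{\sigma^3} + \cdots > \frac{1}{2\sigma}.$$

β) Let $\kappa > n/\sigma$. First, because $n/\kappa \ge 1$, we have $K \ge (1 - (1 - \kappa/n)^\sigma)/\sigma$. By the formula $(1 - x/y)^y < e^{-x}$ for $y \ge x \ge 0$ we obtain, because $n > \kappa > 0$, $(1 - \kappa/n)^n < e^{-\kappa}$, hence $(1 - \kappa/n)^\sigma = \left((1 - \kappa/n)^n\right)^{\sigma/n} < e^{-\sigma\kappa/n}$. Therefore $K > (1 - e^{-\sigma\kappa/n})/\sigma$. Since $\kappa/n > 1/\sigma$, we have $e^{-\sigma\kappa/n} < e^{-1}$, hence $1 - e^{-\sigma\kappa/n} > 1 - e^{-1}$. We can thus continue the above chain of inequalities: $K > (1 - e^{-1})/\sigma > 1/(2\sigma)$, since $e^{-1} < \tfrac{1}{2}$. This proves $K > 1/(2\sigma)$ in general.

Altogether, the exponent in the denominator of the estimate for $\Sigma_{\kappa,n}$ is greater than or equal to $\tfrac{1}{4}\kappa\mu n^{\sigma-1}$ for all $\kappa$ with $0 < \kappa \le n$.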


The dependence of the estimate on $\kappa$ thus stands "entirely outside". In order to estimate the remaining sum of all $\Sigma_{\kappa,n}$ with $\kappa \ge 3$, we split it (under the assumption that already $n > 4\sigma$) into two parts:

For $n \ge N_0(\sigma, \mu)$ the quotient $2n/2^{\frac{1}{4}\mu n^{\sigma-1}}$ is then, because $\sigma > 1$, smaller than a fixed constant below 1, so that the rear part can be estimated uniformly in $\kappa$ by the geometric series:

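This rear part is therefore of smaller order of magnitude than the term from b). The finitely many terms of the front sum are individually likewise of smaller order of magnitude than the term from b). For, by the initial estimate for $\Sigma_{\kappa,n}$,

$$\Sigma_{\kappa,n} < \frac{2^{\mu n^\sigma}}{n!} \cdot \frac{(2n)^\kappa}{2^{\frac{1}{2}\kappa\mu\sigma n^{\sigma-1}(1 + o(1))}},$$

where the $o$-constant depends on $\kappa$ (and $\sigma$). Since $\kappa \ge 3$, this is of smaller order of magnitude than the term

$$\frac{2^{\mu n^\sigma}}{n!} \cdot \frac{n(n-1)}{2} \cdot \frac{2^{\mu\sigma(\sigma-1)\, n^{\sigma-2}(1 + (\sigma-2)\, O(1/n))}}{2^{\mu\sigma n^{\sigma-1}}}$$

from b). Altogether we therefore have

THEOREM 2.

$$S(n, (\sigma, \mu)) = \frac{2^{\mu n^\sigma}}{n!}\left(1 + \frac{n(n-1)}{2} \cdot \frac{2^{\mu\sigma(\sigma-1)\, n^{\sigma-2}(1 + (\sigma-2)\, O(1/n))}}{2^{\mu\sigma n^{\sigma-1}}}\,(1 + o(1))\right)$$

with

$$O\!\left(\tfrac{1}{n}\right) = -\frac{2}{3n} + \frac{\sigma-3}{3n^2} + (\sigma-4)\, O\!\left(\tfrac{1}{n^3}\right).$$

In the special case $\sigma = 2$ this yields

THEOREM 3.

$$S(n, (2, \mu)) = \frac{2^{\mu n^2}}{n!}\left(1 + \frac{n(n-1)}{2} \cdot \frac{2^{2\mu}}{2^{2\mu n}}\,(1 + o(1))\right).$$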

Numerical tables. $S$ is the exact value $S(n, \tau)$, $\Sigma_{0,n}$ the first and $\Sigma_{0,n} + \Sigma_{2,n}$ the second approximation.

τ = (2, 1)

 n   S(n, τ)            Σ(0,n)           Σ(0,n) + Σ(2,n)   Σ(0,n)/S   (Σ(0,n) + Σ(2,n))/S
 1   2                  2                2                 1          1
 2   10                 8                10                0.8000     1
 3   104                8.5333 x 10^1    1.0133 x 10^2     0.8205     0.9743
 4   3044               2.7307 x 10^3    2.9867 x 10^3     0.8970     0.9812
 5   291968             2.7962 x 10^5    2.9054 x 10^5     0.9577     0.9951
 6   96928992           9.5450 x 10^7    9.6848 x 10^7     0.9847     0.99916
 7   112282908928       1.1170 x 10^11   1.1227 x 10^11    0.9948     0.99991
 8   458297100061728    4.5751 x 10^14   4.5829 x 10^14    0.9983     0.99998
 9   6.6666 x 10^18     6.6630 x 10^18   6.6666 x 10^18    0.99946    1.00000
10   3.4939 x 10^23     3.4933 x 10^23   3.4939 x 10^23    0.99983    1.00000
11   6.6603 x 10^28     6.6600 x 10^28   6.6603 x 10^28    0.99995    1.00000
12   4.6557 x 10^34     4.6557 x 10^34   4.6557 x 10^34    1.00000    1.00000

τ = (2, 2)

 n   S(n, τ)          Σ(0,n)           Σ(0,n) + Σ(2,n)    Σ(0,n)/S   (Σ(0,n) + Σ(2,n))/S
 1   4                4                4                  1          1
 2   136              128              136                0.9412     1
 3   44224            4.3691 x 10^4    4.42027 x 10^4     0.9879     0.9995
 4   179228736        1.7896 x 10^8    1.79219 x 10^8     0.9985     0.9999
 5   9383939974144    9.3825 x 10^12   9.38393 x 10^12    0.9998     1.0000

τ = (2, 3)

 n   S(n, τ)                  Σ(0,n)              Σ(0,n) + Σ(2,n)
 1   8                        8                   8
 2   2080                     2048                2080
 3   22386176                 2.23696 x 10^7      2.238601 x 10^7
 4   11728394650624           1.172812 x 10^13    1.17283924 x 10^13
 5   314824619911446167552    3.1482443 x 10^20   3.1482461984 x 10^20

τ = (3, 1)

 n   S(n, τ)                                 Σ(0,n)                       Σ(0,n) + Σ(2,n)
 1   2                                       2                            2
 2   136                                     128                          136
 3   22377984                                2.23696 x 10^7               2.237781 x 10^7
 4   768614354122719232                      7.686143364 x 10^17          7.6861435358 x 10^17
 5   354460798875983863749270670915141632    3.544607988759775 x 10^35    3.544607988759838626 x 10^35

τ = (3, 2)

 n   S(n, τ)                                   Σ(0,n)
 1   4                                         4
 2   32896                                     32768
 3   3002399885885440                          3.002399751 x 10^15
 4   14178431955039103827204744901417762816    1.4178431955039102644 x 10^37

τ = (4, 1)

 n   S(n, τ)                     Σ(0,n)
 1   2                           2
 2   32896                       32768
 3   402975273205975947935744    4.02975273204876 x 10^23

Of course it is easily possible to sharpen these estimates further, e.g. by also taking $\Sigma_{3,n}$ into account explicitly and letting the remainder estimate begin only at $\kappa = 4$. But even the approximation developed here is already extremely accurate.

Numerical examples. The exact numbers $S(n, \tau)$ were computed with the aid of the cycle indices $Z(\mathfrak{S}_n^\sigma)$ given in Oberschelp ([10], § 3). The relative errors $1 - \Sigma_{0,n}/S$ and $1 - (\Sigma_{0,n} + \Sigma_{2,n})/S$ are very small even for small arguments. In the case $\sigma = 3$, $\mu = 2$, $n = 4$, for example, even the first approximation differs from the true value only in the 17th digit.

The statement which Theorems 2 and 3 make about the mean size of the automorphism group is clear: The trivial estimate 1
4.


Accordingly the asymptotic equality $S(n, (1, \mu)) \sim n^{2^\mu - 1}/(2^\mu - 1)!$ holds. Theorem 4 also results from the theory developed here, since $\mathfrak{S}_n^1 = \mathfrak{S}_n$ and since, as is well known *, $Z(\mathfrak{S}_n)[f_i/t] = t(t+1)\cdots(t+n-1)/n!$. For $t = 2^\mu$ one obtains Carnap's result. For the mean order of the automorphism group we therefore have

$$g(n, \tau) = \frac{n!\, S(n, \tau)}{Z(n, \tau)} \sim \frac{n!\; n^{2^\mu - 1}}{(2^\mu - 1)!\; 2^{\mu n}}.$$

The largest possible order is still $\Gamma(n, \tau) = n!$, and it is attained e.g. for those Carnap systems which consist only of universal predicates. By contrast, for sufficiently large $n$ there are no Carnap systems with the trivial automorphism group. Rather, for the smallest possible group order $\gamma(n, \tau)$ one can give the following asymptotic estimate:

THEOREM 5.

Proof. The automorphism groups of Carnap systems are direct products of symmetric groups operating on the so-called Q-predicates $P_1, \dots, P_{2^\mu}$, which are induced by the predicates $R_1, \dots, R_\mu$. A Q-predicate is defined by a conjunction $P_j := (\neg) R_1 \wedge \dots \wedge (\neg) R_\mu$, where the $2^\mu$ possibilities of placing or not placing the negations indicated in parentheses lead exactly to the $2^\mu$ Q-predicates. Every number from $N$ lies in exactly one Q-predicate. If the Q-predicates have the cardinalities $q_1, \dots, q_{2^\mu}$, then the automorphism group $\mathfrak{G}$ has the order $g = q_1! \cdots q_{2^\mu}!$, and this number is minimal when all $q_j$ are as nearly equal as possible. $g$ is a so-called degree of order in the sense of Carnap ([3], pp. 1-2). If $n = r \cdot 2^\mu + s$ with $0 \le s < 2^\mu$, then for the smallest group order we have the

LEMMA. $\gamma(n, \tau) = (r!)^{2^\mu - s}\,((r+1)!)^s = (r!)^{2^\mu}\,(r+1)^s$.

For the "most even" distribution of the $n$ individuals over the $2^\mu$ basic sets consists in assigning $r$ individuals to each of $2^\mu - s$ basic sets, and $r + 1$ individuals to each of the remaining $s$ basic sets. By Stirling's formula, for fixed $\mu$ and $n \to \infty$ (and hence also $r \to \infty$), the above lemma yields the asserted estimate, since $s$ is bounded: $(1 - s/n)^n$ tends to $e^{-s}$, and the remaining power of $(1 - s/n)$ tends to 1.

* Cf. [12], p. 71 (8).

It follows that in the estimate

$$\gamma(n, \tau) < g(n, \tau) < \Gamma(n, \tau) = n!$$

the number $g(n, \tau)$ always lies strictly between the two bounds. More precisely, the quotients $g/\gamma$ and $\Gamma/g$ tend to infinity, the first, however, considerably more slowly than the second. One easily computes that

$$\frac{g}{\gamma} \sim \alpha_\mu f(n, \mu), \qquad \frac{\Gamma}{g} \sim (2^\mu - 1)!\, \frac{2^{\mu n}}{n^{2^\mu - 1}} =: \beta_\mu F(n, \mu).$$

Taking logarithms gives

$$\log\frac{g}{\gamma} = \log\alpha_\mu + \tfrac{1}{2}(2^\mu - 1)\log n + o(1) = O(\log n),$$

whereas

$$\log\frac{\Gamma}{g} = \log\beta_\mu + \mu(\log 2)\, n - (2^\mu - 1)\log n + o(1) = O(n).$$

12 1.44 x 104 1.32 x 1013 2.41 x 1050 9.25 x 8.71 x (1 ;500) 1.05 x logs5 (1;1000) 1.49 x 1OZz6* (1 ;5)

(i;io) (1;20) (1 ;50) (1 ; 100) (1;200)

22.5 3.90 x 104 4.87 x 1013 1.38 x 7.44 x 10129 9.87 x 10316 1.87 x 1 0 9 8 6 3.76 x 1 0 2 2 6 9

120 3.63 x 106 2.43 x 10’8 3.04 x 1 0 6 4 9.33 x 10157 7.89 x 1.22 x 4.02 x 102567

1.88 2.71 3.70 5.73 8.04 11.3 17.9 25.3

1.78 2.52 3.57 5.64 7.98 11.3 17.8 25.2

95.1 92.8 96.5 98.5 99.3 99.6 99.8 99.9

lo8 1036 10100 10257 10837 101970

6.56 990 3.92 x 109 5.62 x 1038 1.03 x 10103 4.20 X 2.40 x 10840 5.88 x 1O1o73

120 3.63 x 2.43 x 3.04 x 9.33 x 7.89 x 1.22 x 4.02 x

1 4 2.07 x 104 3.54 x 1024 7.91 x 1073 3.35 x 10201 1.52 x 10691 1.58 x 101674

2.90 65.9 1.87 X lo6 5.63 x 1027 1.19 x 1078 5.54 x 10206 5.70 X 6.67 X 101681

120 3.63 x lo6 2.43 x 3.04 x 1 0 6 4 9.33 x 7.89 x 10374 1.22 x 4.02 x 102567

2 144 2.07 X 8.89 x 5.79 x 8.56 x 1.26 x 1.09 x

lo6

l0l8

1064 10157 10374 101134

3.28 6.87 18.9 63.2 1.77 x lo2 4.90 x lo2 1.91 x 103 5.38 x 103

1.89 5.35 15.2 59.9 1.69 x lo2 4.79 x 102 1.89 X lo3 5.35 x 103

2.90 16.5 90.4 1.59 x 103 1.51 x lo4 1.66 x 105 3.76 x 106 4.23 x 107

0.365 4.13 46.8 1.16 x 103 1.31 x 104 1.48 x 105 3.65 x lo6 4.13 x 107

-~-

~

~

1

1 16

4.51 x 1013 1.25 x 1 0 4 9 6.26 x 10147 4.58 x 2.30 x 101382

1 1 1 2.62 x 105 2.04 x 1027 1.57 x 1098 6.46 x 10411 2.10 x 101097

9 -

r ( n , t)

9(",T)

~

__

~-

-

V

Y ~

1.77 10.8 6.55 x 103 3.93 x 1018 8.66 x loS5 5.32 x 10156 3.15 x 10560 2.63 x

120 3.63 x 2.43 x 3.04 x 9.33 x 7.89 x 1.22 x 4.02 x

1.35 3.61 149 5.35 x 10'1 3.15 x 1037 2.03 x loll2 4.86 x 5.66 x

120 3.63 x 106 2 43 x 10'8 3.04 x 1064 9.33 x 7.89 x 1.22 x 4.02 x

106 1Ol8 1064 10374 102567

~

1.77 10.8 4.09 x lo2 8.70 x 104 6.92 x lo6 8.49 x 10s 6.87 x 10l1 1.15 X lOI4

5.92 x 1.07 x 1.94 x 1.87 x 3.39 x 6.14 x 5.92 x 1.07 x

10-4 10-1 10' 104 106 108 10l1

1.35 3.61 1.49 x lo2 2.04 x lo6 1.55 x 10'" 1.30 x 1014 7.52 x 1019 2.70 x 1 0 2 4

4.26 x 1.98 x 9.15 x 1.35 x 6.25 x 2.90 x 4.26 x 1.98 x

10-14

1014

10-9 10-5 lo4 lo8 1013 10l9 1024


Of course $\log(\Gamma/\gamma) = O(n)$ as well. Accordingly $g$ does in fact lie considerably closer to $\gamma$ than $\Gamma$ lies to $\gamma$. This fact can be regarded as a weak substitute for the strong result in the case $\sigma > 1$, according to which $g$ then lies exactly in the vicinity of $\gamma$.

Numerical examples: The quantities discussed above, as well as $\Gamma(n, \tau) = n!$ and the comparison function $\alpha_\mu f(n, \mu)$, have been computed for some numerical values up to $\mu = 5$ and $n = 1000$. The ratio $\alpha_\mu f(n, \mu)/(g/\gamma)$ tends to 1 considerably more slowly here than in the numerical examples for $\sigma > 1$, where values around $n = 10$ were already very accurate.

References

1. N. G. DE BRUIJN, Generalization of Pólya's fundamental theorem in enumerative combinatorial analysis, Indag. Math. 21 (1959) 59-69.
2. N. G. DE BRUIJN, Pólya's theory of counting, in: Applied combinatorial mathematics, ed. Beckenbach (New York-London-Sydney, Wiley, 1964) pp. 144-184.
3. R. CARNAP, The concept of degree of order (1952), mimeographed manuscript.
4. R. CARNAP, Logical foundations of probability (Chicago, Univ. Chicago Press, 1950).
5. R. L. DAVIS, The number of structures of finite relations, Proc. Am. Math. Soc. 4 (1953) 486-495.
6. A. EHRENFEUCHT and A. MOSTOWSKI, Models of axiomatic theories admitting automorphisms, Fund. Math. 43 (1956) 50-68.
7. F. HARARY, The number of linear, directed, rooted and connected graphs, Trans. Am. Math. Soc. 78 (1955) 445-463.
8. F. HARARY, Note on an enumeration theorem of Davis and Slepian, Mich. J. Math. 3 (1955/56) 149-153.
9. F. HARARY, Note on Carnap's relational asymptotic relative frequencies, J. Symb. Logic 23 (1958) 257-260.
10. W. OBERSCHELP, Kombinatorische Anzahlbestimmungen in Relationen, Math. Annalen 174 (1967) 53-78.
11. G. PÓLYA, Kombinatorische Anzahlbestimmungen für Gruppen, Graphen und chemische Verbindungen, Acta Math. 68 (1937) 145-254.
12. J. RIORDAN, An introduction to combinatorial analysis (New York, Wiley, 1958).
13. A. TARSKI, Contributions to the theory of models I, Indag. Math. 16 (1954) 572-581.

A SURVEY OF SOME CONNECTIONS BETWEEN CLASSICAL, INTUITIONISTIC AND MINIMAL LOGIC
D. PRAWITZ and P.-E. MALMNÄS¹
The University of Stockholm

1. Preliminaries

As logical constants we use $\&$, $\vee$, $\supset$, $\forall$, $\exists$ and $\curlywedge$ (for absurdity). In case of modal logic, we add $N$ (for necessity) to this list. The formulas are built in the usual way. The letters $A, B, C, \dots$ are used as syntactical variables ranging over formulas. The letters $P$ and $x$ are used as syntactical variables ranging over the predicate symbols and the individual variables, respectively. The predicate symbols are to be given in some alphabetic order. $A \equiv B$ is an abbreviation for $((A \supset B) \,\&\, (B \supset A))$, and ${\sim}A$ stands as an abbreviation for $(A \supset \curlywedge)$. (That absurdity instead of negation is taken as primitive is only a matter of convenience.) By a part of a formula $A$, we always mean a formula that forms a part of $A$. (The subformulas of $A$ form a somewhat wider class than the parts of $A$, since if, e.g., $\forall x B$ is a subformula of $A$, then so are all formulas obtained from $B$ by substituting for all free occurrences of $x$ a term that is free for $x$ in $B$.)

We assume that the reader is familiar with what it means for a formula $A$ to be derivable from a set $\Gamma$ of formulas within classical (intuitionistic, minimal) first order predicate logic; for brevity we say that $A$ is derivable from $\Gamma$ within C and write $\Gamma \vdash_C A$. In case of intuitionistic logic and minimal logic we use the letters I and M respectively. The logic that arises by adding the modal rules of S4 to any one of these logics we denote by writing S4 as subscript; we thus speak of $C_{S4}$, etc.

¹ The proof of Theorem D is based on a seminar essay by the second author (submitted in partial fulfilment of the requirements for the fil.kand. degree). He has influenced other parts of the essay as well.


2. Interpretability

The results of the next section will be concerned with interpreting logical systems within each other. Such an interpretation of a logical system $S_1$ in a logical system $S_2$ will be given by exhibiting a function or translation $\mathcal{F}$ that maps formulas of $S_1$ into formulas of $S_2$ in such a way that for each formula $A$ of $S_1$

$\vdash_{S_1} A$ if and only if $\vdash_{S_2} \mathcal{F}(A)$. (1)

$S_1$ is then said to be interpretable in $S_2$ by $\mathcal{F}$. Consider also the condition: For each set $\Gamma \cup \{A\}$ of formulas in $S_1$

$\Gamma \vdash_{S_1} A$ if and only if $\mathcal{F}(\Gamma) \vdash_{S_2} \mathcal{F}(A)$, (2)

where $\mathcal{F}(\Gamma)$ is the set that results from replacing each member $B$ of $\Gamma$ by $\mathcal{F}(B)$. If $\mathcal{F}$ satisfies (2), we will say that $S_1$ is also interpretable in $S_2$ by $\mathcal{F}$ with respect to derivability.

The existence of a translation $\mathcal{F}$ that satisfies condition (1) is of course trivial, and the interest of the interpretation will depend on the fact that $\mathcal{F}$ preserves interesting structural similarities. Sometimes $\mathcal{F}$ is defined by schemata, and we then say that $S_1$ is schematically interpretable in $S_2$ by $\mathcal{F}$. This means that there are a number of schemata, one defining the value of $\mathcal{F}$ for atomic formulas, and one for each logical constant $c$ defining $\mathcal{F}$ inductively for formulas that have as principal sign the constant $c$. For instance, there is to be a schema $\Phi(F, G)$, i.e. an expression that is like a formula in $S_2$ except for containing the two letters $F$ and $G$ in the position of atomic formulas, such that for each formula $A$ and $B$ in $S_1$, $\mathcal{F}(A \,\&\, B) = \Phi(\mathcal{F}(A), \mathcal{F}(B))$, where $\Phi(\mathcal{F}(A), \mathcal{F}(B))$ is the result of replacing the letters $F$ and $G$ in $\Phi(F, G)$ by $\mathcal{F}(A)$ and $\mathcal{F}(B)$ respectively. There are however interesting interpretations that cannot be defined schematically in this way. Sometimes, the schema consists simply of schematic letters combined with the logical constant in question, as when

$\mathcal{F}(A \,\&\, B) = \mathcal{F}(A) \,\&\, \mathcal{F}(B)$

for each $A$ and $B$. We will then say that the logical constant in question is literally translated by $\mathcal{F}$.

REMARK. One may ask how schematic interpretations of one logical system in another are related to the notion of interpretability between non-logical theories considered by Tarski, Mostowski and Robinson [11]. Given two theories $T_1$ and $T_2$ containing no common descriptive constant, $T_1$ is there said to be interpretable in $T_2$ if there is a recursive set $\Delta$ of possible definitions in $T_2$ of the descriptive constants in $T_1$, such that

if $\vdash_{T_1} A$, then $\Delta \vdash_{T_2'} A$, (3)

where $T_2'$ is obtained from $T_2$ by adding to the descriptive constants of $T_2$ those of $T_1$.

To consider this question, let us take the logical systems C and I as an example, and let us suppose that they are formulated in different symbolisms. Let I′ be the extension of I obtained by extending only the language through the addition of the symbols of C. We then consider possible schematic definitions in I of the symbols of C. Let e.g. $\circ$ be a binary sentential connective in C. By a possible schematic definition of $\circ$ in I, we then mean a schema

$F \circ G \equiv \Phi(F, G)$,

where $\Phi(F, G)$ is a schema containing only the schematic letters $F$ and $G$ and symbols of I. Following Tarski et al., C may then be said to be interpretable in I if there is a set of possible schematic definitions in I of the symbols of C in such a way that if $\Gamma$ is the set of substitution instances in I′ of these schematic definitions, then for each formula $A$ of C:

$\vdash_C A$ if and only if $\Gamma \vdash_{I'} A$. (4)

Clearly, we want "if and only if" in (4) and not only "only if" as in (3); when dealing with logical systems of first order, we are considering open formulas that have no closed equivalents, and hence, non-provability must also reflect the meaning of the logical constants.

By defining the notions involved more precisely, it can be shown that in all cases considered here, schematic interpretability as required by (1) is equivalent to interpretability as required by (4).² To show this equivalence in the example considered above, one has to show that if $A$ is a formula of I and is derivable from $\Gamma$ (where $\Gamma$ is as above) in I′, then $A$ is provable in I; this fact can be shown by using e.g. the theorems about normal deductions in Prawitz [10].

² A somewhat analogous situation pertains to the two notions of definability considered in Prawitz [10]. It can be seen that the two notions are equivalent. Indeed, suppose that $\gamma$ is weakly definable in S using the transformation $*$. This means that $\vdash_S A$ holds if and only if $\vdash_S A^*$. Hence, for all systems in which $\vdash_S A^* \equiv A^*$, we also have $\vdash_S (A^* \equiv A^*)^*$. The last fact can be written $\vdash_S (A \equiv A^*)^*$, and consequently we also have $\vdash_S A \equiv A^*$, which is what is required for strong definability. It would thus have been sufficient to show that the logical constants in I and M are not strongly definable, which would have slightly simplified the proofs. Cf. footnote 10.

3. Theorems

THEOREM A.³ Let $A^{\sim\sim}$ be the result obtained from $A$ by inserting two negation signs in front of each part of $A$, and let $\Gamma^{\sim\sim}$ be the result obtained from $\Gamma$ by carrying out this transformation on each member of $\Gamma$. Then:

$\Gamma \vdash_C A$ if and only if $\Gamma^{\sim\sim} \vdash_I A^{\sim\sim}$.

Also

$\Gamma \vdash_C A$ if and only if $\Gamma^{\sim\sim} \vdash_M A^{\sim\sim}$.

COROLLARY 1.⁴ Classical logic is interpretable (also interpretable with respect to derivability) in intuitionistic and minimal logic by the translation $*$ defined schematically as follows:
(1) $(Pt_1 t_2 \dots t_n)^* = {\sim\sim}Pt_1 t_2 \dots t_n$.
(2) $(A \vee B)^* = {\sim}({\sim}A^* \,\&\, {\sim}B^*)$.
(3) $(\exists x A)^* = {\sim}\forall x\, {\sim}A^*$.
(4) Translation of absurdity, conjunctions, implications and universal formulas: Literal.
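The four clauses of this schematic translation can be written as a short recursive function. The sketch below is an illustration only: the encoding of formulas as nested tuples, the tag names (`atom`, `bot`, `and`, `or`, `imp`, `all`, `ex`) and the function names `neg` and `star` are assumptions made here and do not come from the paper. Negation is treated as the abbreviation "implies absurdity", as in the Preliminaries.

```python
def neg(a):
    """Negation of a, written out as (a implies absurdity)."""
    return ("imp", a, ("bot",))


def star(f):
    """Schematic translation with clauses (1)-(4) on tuple-encoded formulas."""
    tag = f[0]
    if tag == "atom":                      # (1) atoms receive a double negation
        return neg(neg(f))
    if tag == "or":                        # (2) disjunction becomes not(not A* and not B*)
        return neg(("and", neg(star(f[1])), neg(star(f[2]))))
    if tag == "ex":                        # (3) exists x A becomes not all x not A*
        return neg(("all", f[1], neg(star(f[2]))))
    if tag == "bot":                       # (4) literal cases
        return f
    if tag in ("and", "imp"):
        return (tag, star(f[1]), star(f[2]))
    if tag == "all":
        return ("all", f[1], star(f[2]))
    raise ValueError(f"unknown formula tag: {tag!r}")


if __name__ == "__main__":
    P = ("atom", "P", ("x",))
    excluded_middle = ("or", P, neg(P))
    print(star(excluded_middle))
```

Because every clause only rebuilds the principal connective around the translated immediate parts, the function is a direct instance of a schematic definition in the sense of Section 2.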

COROLLARY 2.⁵ Let $\Gamma$ be the set of the non-logical axioms of a classical theory T. Under the condition that all members of $\Gamma^{\sim\sim}$ (or $\Gamma^*$) are valid in an intuitionistic theory U, it holds: If U is consistent, then so is T.

COROLLARY 3.⁶ If the intuitionistic natural number theory is consistent, then so is the classical one.

³ A theorem of this kind was proved by Kolmogoroff [7], who showed that the fragment of C containing only the logical constants $\supset$ and $\sim$ is interpretable by the ${\sim\sim}$-transformation in the corresponding fragment of M; see also Church [1] p. 208, 38.12.
⁴ Corollary 1 is due to Gentzen [3] and to Bernays (see Gentzen [2] p. 532), seemingly independently of Kolmogoroff; cf. also footnote 6.
⁵ A corollary of this kind was also drawn by Kolmogoroff [7], but $\Gamma$ had then to contain also logical axioms.
⁶ This corollary was not explicitly drawn by Kolmogoroff, but the result was independently found by Gödel [4], using a transformation similar to the $*$ in Cor. 1 except that the transformation of implication was not literal, and that the transformation of atomic formulas was literal.

THEOREM B.⁷ Let $A'$ be the result obtained from $A$ by simultaneously replacing each part $B$ of $A$ by $B \vee \curlywedge$, and let $\Gamma'$ be the result obtained from $\Gamma$ by carrying out this transformation on each member of $\Gamma$. Then:

$\Gamma \vdash_I A$ if and only if $\Gamma' \vdash_M A'$.

COROLLARY 1. Intuitionistic logic is interpretable in minimal logic by the translation $''$ defined schematically as follows:
(1) $(A \supset B)'' = A'' \supset (B'' \vee \curlywedge)$.
(2) $(\forall x A)'' = \forall x (A'' \vee \curlywedge)$.
(3) Literal translation in other cases.

THEOREM C. Minimal logic is interpretable in intuitionistic logic by the translation $''$ where $A''$ is the formula obtained from $A$ by replacing every occurrence of $\curlywedge$ by $P$, where $P$ is the alphabetically first 0-place predicate symbol that does not occur in $A$.

THEOREM D. Let $A^N$ be the result obtained from $A$ by inserting $N$ in front of each part of $A$, and let $\Gamma^N$ be the result obtained from $\Gamma$ by carrying out this transformation on each member of $\Gamma$. Then:

$\Gamma \vdash_I A$ if and only if $\Gamma^N \vdash_{C_{S4}} A^N$.

COROLLARY 1.⁸ Intuitionistic predicate logic is interpretable (also interpretable with respect to derivability) in classical predicate logic extended with S4 by the schematic translation $^\circ$ defined as follows:
(1) $(Pt_1 t_2 \dots t_n)^\circ = NPt_1 t_2 \dots t_n$.
(2) $(A \supset B)^\circ = N(A^\circ \supset B^\circ)$.
(3) $(\forall x A)^\circ = N \forall x A^\circ$.
(4) Translation of absurdity, conjunctions, disjunctions, and existential formulas: Literal.
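The modal translation of Corollary 1 to Theorem D has the same recursive shape as a double-negation translation, with the necessity operator placed in front of atoms, implications and universal formulas. The following sketch is illustrative only; the tuple encoding of formulas, the tag `"N"` for necessity and the function names `box` and `modal` are assumptions made for this example and are not notation from the paper.

```python
def box(a):
    """Prefix the necessity operator N to the formula a."""
    return ("N", a)


def modal(f):
    """Schematic translation into S4-style modal formulas, clauses (1)-(4)."""
    tag = f[0]
    if tag == "atom":                      # (1) atoms are prefixed with N
        return box(f)
    if tag == "imp":                       # (2) N(A' implies B')
        return box(("imp", modal(f[1]), modal(f[2])))
    if tag == "all":                       # (3) N all x A'
        return box(("all", f[1], modal(f[2])))
    if tag == "bot":                       # (4) literal cases
        return f
    if tag in ("and", "or"):
        return (tag, modal(f[1]), modal(f[2]))
    if tag == "ex":
        return ("ex", f[1], modal(f[2]))
    raise ValueError(f"unknown formula tag: {tag!r}")


if __name__ == "__main__":
    P = ("atom", "P", ())
    Q = ("atom", "Q", ())
    print(modal(("imp", P, ("or", P, Q))))
```

Only the placement of `box` distinguishes the clauses; a variant that instead prefixes the immediate parts of disjunctions, implications and existential formulas could be written in the same way.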

COROLLARY 2.⁹ Intuitionistic predicate logic is interpretable in classical predicate logic extended with S4 by the translation $^+$ defined schematically as follows:
(1) $(A \vee B)^+ = NA^+ \vee NB^+$.
(2) $(A \supset B)^+ = NA^+ \supset NB^+$.
(3) $(\exists x A)^+ = \exists x NA^+$.
(4) Translation of atomic formulas, conjunctions, and universal formulas: Literal.

⁷ The search for a translation that interprets I in M was stimulated by a question by docent Jan Berg.
⁸ This result was proved for sentential logic by McKinsey and Tarski [9], using topological methods. A proof (still only for sentential logic) using Gentzen-type technique was recently given by Hacking [6].
⁹ This result was conjectured to hold for sentential logic by Gödel [5] and was proved for sentential logic by McKinsey and Tarski [9].

4. Discussion

A. From an intuitionistic point of view, the difference between classical and intuitionistic logic may be said to be that classical logic fails to make any distinction between the following two states of affairs: (i) $A$ does not imply a contradiction (i.e. the assumption that $A$ implies a contradiction implies a contradiction; cf. the definition of negation) and (ii) $A$ is provable. It is therefore very natural to expect that a classical argument can be understood intuitionistically if the formulas are interpreted throughout in the weak sense (i), i.e. that classical logic can be interpreted in intuitionistic logic by a translation which successively replaces classical formulas by their double negation, as is expressed in Theorem A.

Indeed, one finds, in agreement with our remark above, that classical logic may be obtained from intuitionistic logic by adding a rule that annihilates the intuitionistic distinction between (i) and (ii), i.e. by adding a rule for elimination of double negation allowing the inference of $A$ from ${\sim\sim}A$. This formulation of classical logic, which was first given by Gentzen [3] (see also Prawitz [10]), is very suitable for the present purpose. By observing, first, that the rule for eliminating double negation is intuitionistically valid when the conclusion is a negation, and, second, that every intuitionistic inference rule continues to be intuitionistically valid after the ${\sim\sim}$-transformation, one immediately obtains a proof of Theorem A. Corollary 1 is then obtained by observing that molecular formulas preceded by a double negation can be replaced with intuitionistically equivalent formulas, e.g. ${\sim\sim}(A \vee B)$ is intuitionistically equivalent to ${\sim}({\sim}A \,\&\, {\sim}B)$.

Corollary 1 gives a somewhat different perspective to the relation between classical and intuitionistic logic. Except for predicate symbols, the difference between the two systems now appears to be that intuitionistic logic contains, in addition to the constants of classical logic, a primitive connective $\vee$ and a primitive quantifier $\exists$ (both of which are known not to be definable in terms of the other logical constants, see Prawitz [10] p. 59, Corollary 9).

Instead of Corollary 2 one may perhaps expect a somewhat stronger result. To add a rule for elimination of double negation, which amounts to allowing the inference of $A$ when it is known that $A$ does not imply a contradiction, may appear to be a very safe step in the sense that it does not lead to contradictions. Therefore, one may perhaps be inclined to think that no contradiction can be obtained in an axiom system by classical reasoning if it is not already obtainable by intuitionistic reasoning. This does not hold, however, since $\curlywedge$ is classically but not intuitionistically derivable from e.g. ${\sim}\forall x (Px \vee {\sim}Px)$ (see Kleene [8]). (These considerations show incidentally that $A \vdash_I A^{\sim\sim}$ does not hold in general, although $A \vdash_I {\sim\sim}A$.) But a classical theory is of course consistent if the intuitionistic translations (as defined in Theorem A or Corollary 1) of the axioms are valid in an intuitionistically consistent theory.

B. Note that the translation defined in Corollary 1 of Theorem B is not an interpretation with respect to derivability.

C. The translation defined in Theorem C is not schematically defined¹⁰, and it seems likely that there is no schematic interpretation of minimal logic in intuitionistic logic. Similarly, we conjecture that there is no schematic interpretation of intuitionistic predicate logic in classical predicate logic. (For sentential logic, it is known that there is no such schematic interpretation, but the proof of that fact cannot, it seems, be generalized to predicate logic; see Kleene [8] p. 495.)

D. From a classical point of view, the difference between intuitionistic and classical logic may be said to be that intuitionistic logic fails to make any distinction between asserting only that a formula holds and asserting the provability of the formula; intuitionistic logic can make only the stronger of the two assertions. Hence, one may expect that by adding a modal operator $N$ to classical logic, where $NA$ may be read "it is provable that $A$", and by adding rules that correspond to this meaning of $N$, it will be possible to understand an intuitionistic argument classically, interpreting the formulas overall as asserting provability; i.e. one may expect that intuitionistic logic is interpretable in some form of classical modal logic by a translation which successively transforms the intuitionistic formulas by writing "$N$" in front of them, as expressed in Theorem D.

To prove Corollary 1, we can then observe that for certain molecular formulas, if the atomic parts are preceded by $N$ and if also the whole formula is preceded by $N$, then the formula is equivalent to the formula that results from striking this first occurrence of $N$. Corollary 1 is thus obtained when one tries to transform a formula $A^N$ to an equivalent formula by striking as many occurrences of $N$ as possible except for those in front of the atomic parts, going outwards from the smaller parts to the larger parts.

Corollary 2 is obtained when one tries to transform a formula $A^N$ to an equivalent formula by striking as many occurrences of $N$ as possible except the one in front of the whole formula, going inwards from larger parts to smaller parts. By completing the latter process, one obtains a formula $NB$ such that $\vdash_{C_{S4}} A^N$ if and only if $\vdash_{C_{S4}} NB$. Since $\vdash_{C_{S4}} NB$ if and only if $\vdash_{C_{S4}} B$, one can finally remove also the initial occurrence of $N$. It is to be noted that the interpretation obtained in that way is not an interpretation with respect to derivability. A counterexample is obtained by observing that

$P,\ P \supset Q \vdash_I Q$

but not

$P^+,\ (P \supset Q)^+ \vdash_{C_{S4}} Q^+$,

since not

$P,\ NP \supset NQ \vdash_{C_{S4}} Q$.

¹⁰ The translation defined in Theorem C was wrongly used in Prawitz [10] in a remark stating that $\curlywedge$ was weakly definable in I. Since the $P$ that is to replace $\curlywedge$ according to this translation depends on the context, the translation does not meet the requirements on weak definability. In fact, $\curlywedge$ is not weakly definable in M, cf. footnote 2.

5. Proofs

In the proofs of the theorems above, we will assume acquaintance with Prawitz [10]. In some places, we will make essential use of the fact that the deductions in the systems involved can be written in a certain normal form as described in [10]. For convenience, theorems about normal deductions are provided in [10] for $C'$, a reduced form of $C$ in which $\vee$ and $\exists$ are omitted. In this context, we need similar results for $C$. However, if $'$ stands for a transformation by which disjunctions and existential formulas are replaced by equivalent formulas with $\sim$ and $\&$ or $\sim$ and $\forall$ respectively, it can be seen that a normal deduction in $C'$ (or $C'_{S4}$) of $A'$ from $\Gamma'$ goes over to a normal deduction in $C$ (or $C_{S4}$) of $A$ from $\Gamma$ by an obvious transformation.

PROOF OF THEOREM A. As pointed out in the discussion above, the theorem can be proved by showing that each classical inference rule goes over to an intuitionistically valid inference after the $\sim$-transformation of both premises and conclusion. Instead of considering all inference rules in this way, one may however obtain the theorem directly from the following two lemmata:

LEMMA 1. $\Gamma \vdash_C A$ if and only if $\Gamma^{\sim} \vdash_C A^{\sim}$.

LEMMA 2. (a) $\Gamma^{\sim} \vdash_C A^{\sim}$ if and only if $\Gamma^{\sim} \vdash_I A^{\sim}$. (b) $\Gamma^{\sim} \vdash_C A^{\sim}$ if and only if $\Gamma^{\sim} \vdash_M A^{\sim}$.

The only part of the lemmata that is not trivial is the implication from left to right in Lemma 2. It suffices to prove this implication in Lemma 2 (b), i.e. to show that applications of the $\curlywedge$-rule in deductions of $A^{\sim}$ from $\Gamma^{\sim}$ are superfluous. To this end let $\Pi$ be a normal deduction in $C$ of $A^{\sim}$ from $\Gamma^{\sim}$ with a conclusion $B$ of an application of the $\curlywedge$-rule. It may be assumed that no conclusion of an application of the $\curlywedge$-rule in $\Pi$ has the form of an implication ([10] p. 39, Th. 1). We first assume that $B$ is not minor premiss of an application of the $\vee$E- or $\exists$E-rule. Then $\Pi$ has the form shown to the left below.

[Two deduction figures, not recoverable from the source. Left: the deduction $\Pi$ with the application of the $\curlywedge$-rule that discharges the assumptions $[\sim B]$. Right: the transformed deduction without that application.]

That $\Pi$ has this form follows from the following three facts.

(1) $B$ must be the minor premiss of an application of the $\supset$E-rule because (i) $B$ cannot be a major premiss of an application of an E- or $\curlywedge$-rule, since $\Pi$ is normal, and (ii) $B$, which by assumption does not have the form of a negation, cannot be premiss of an application of an I-rule, since the conclusion of such an application cannot be a subformula of a $\sim$-transformed formula as required by the Subformula Principle ([10] p. 42).

(2) By the same argument (i.e. the Subformula Principle), the major premiss of this application of the $\supset$E-rule must have the form $\sim B$.

(3) This premiss $\sim B$ must be an assumption; it can be a conclusion neither of an I-rule, since $\Pi$ is normal, nor of an E- or $\curlywedge$-rule, since it would then stand below a formula that could not be a subformula of the required kind.

An application of the $\curlywedge$-rule of the kind above is clearly superfluous. It can be removed simply by transforming the deduction as shown to the right above.

The situation is similar when $B$ is minor premiss of an application of the $\vee$E- or $\exists$E-rule. By the same arguments, it is then seen that the segment to which $B$ belongs is minor premiss of an application of the $\supset$E-rule, whose major premiss is an assumption of the form $\sim B$. This case can then easily be reduced to the first case by moving the application of the $\supset$E-rule upwards.

PROOF OF COROLLARY A 1. The corollary is obtained by proving that

$$\vdash A^{\sim} \equiv A^{*}$$

holds for both intuitionistic and minimal logic. This fact is proved by induction over the degree of $A$. The base is trivial. For the induction step, it suffices to show that for both I and M:

(a) $\vdash\ \sim\sim\sim A \equiv\ \sim A$,
(b) $\vdash\ \sim\sim(A^{\sim}\ \&\ B^{\sim}) \equiv A^{\sim}\ \&\ B^{\sim}$,
(c) $\vdash\ \sim\sim(A \vee B) \equiv\ \sim(\sim A\ \&\ \sim B)$,
(d) $\vdash\ \sim\sim(A^{\sim} \supset B^{\sim}) \equiv A^{\sim} \supset B^{\sim}$,
(e) $\vdash\ \sim\sim\forall x A^{\sim} \equiv \forall x A^{\sim}$,
(f) $\vdash\ \sim\sim\exists x A \equiv\ \sim\forall x \sim A$.

To prove (b), it is convenient to show $\vdash\ \sim\sim(A\ \&\ B) \equiv\ \sim\sim A\ \&\ \sim\sim B$ and then apply (a). Similar remarks hold for (d) and (e).
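The two translations compared in this proof can be illustrated with a small sketch. The definitions of $A^{\sim}$ and $A^{*}$ are not reproduced in this section, so the code below assumes the standard Kolmogorov translation (a double negation in front of every subformula) and the standard Gödel-Gentzen translation; the tuple encoding of formulas and the function names `kolmogorov`, `godel_gentzen` and `show` are hypothetical choices made only for this illustration.

```python
# Formulas as nested tuples (an encoding assumed for this sketch only):
#   ("atom", "P"), ("bot",), ("and", A, B), ("or", A, B), ("imp", A, B),
#   ("all", "x", A), ("ex", "x", A)
BOT = ("bot",)


def neg(a):
    """Negation ~A is treated as an abbreviation of A -> falsum."""
    return ("imp", a, BOT)


def kolmogorov(f):
    """Assumed Kolmogorov-style translation: ~~ in front of every subformula."""
    tag = f[0]
    if tag in ("atom", "bot"):
        body = f
    elif tag in ("and", "or", "imp"):
        body = (tag, kolmogorov(f[1]), kolmogorov(f[2]))
    else:  # "all" or "ex"
        body = (tag, f[1], kolmogorov(f[2]))
    return neg(neg(body))


def godel_gentzen(f):
    """Assumed Goedel-Gentzen translation: only atoms, 'or' and 'ex' change."""
    tag = f[0]
    if tag == "bot":
        return f
    if tag == "atom":
        return neg(neg(f))
    if tag in ("and", "imp"):
        return (tag, godel_gentzen(f[1]), godel_gentzen(f[2]))
    if tag == "or":
        return neg(("and", neg(godel_gentzen(f[1])), neg(godel_gentzen(f[2]))))
    if tag == "all":
        return ("all", f[1], godel_gentzen(f[2]))
    # tag == "ex"
    return neg(("all", f[1], neg(godel_gentzen(f[2]))))


def show(f):
    """Render a formula as a string, writing A -> falsum as ~A."""
    tag = f[0]
    if tag == "atom":
        return f[1]
    if tag == "bot":
        return "#"
    if tag == "imp" and f[2] == BOT:
        return "~" + show(f[1])
    if tag in ("and", "or", "imp"):
        op = {"and": "&", "or": "v", "imp": "->"}[tag]
        return "(" + show(f[1]) + " " + op + " " + show(f[2]) + ")"
    quantifier = "forall" if tag == "all" else "exists"
    return quantifier + " " + f[1] + ". " + show(f[2])


P, Q = ("atom", "P"), ("atom", "Q")
print(show(kolmogorov(("or", P, Q))))
print(show(godel_gentzen(("or", P, Q))))
```

Under these assumptions the disjunction case shows the pattern of item (c): the first translation keeps the disjunction under a double negation, the second replaces it by a negated conjunction of negations.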


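PROOF OF COROLLARY A 2. If $\Gamma$ is inconsistent, then $\Gamma \vdash_C \curlywedge$ and hence $\Gamma^{\sim} \vdash_I \curlywedge^{\sim}$ (or $\Gamma^{*} \vdash_I \curlywedge^{*}$). Since $\vdash_I \curlywedge^{\sim} \equiv \curlywedge$, we have that $\Gamma^{\sim}$ (or $\Gamma^{*}$) is inconsistent also by intuitionistic reasoning and that hence $U$ is inconsistent.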

PROOFOF COROLLARY A 3. The axioms for classical and intuitionistic natural number theory are the same. Let r be the set of these axioms. Now, if A is an induction axiom, then so is A*, and if A is some other axiom, then A* = A . Hence, each member of r* is valid in intuitionistic natural number theory, and Corollary 2 then applies. PROOFOF THEOREM B. Clearly, t - , A - A ’ . Hence, if r’t-,A’, then Tt-,A. The converse is easily shown, using the fact that A t,A’. PROOFOF COROLLARY B 1. One may show by induction over the degree of the formulas that I-,A’=(A v A). Now, suppose that t , A . By the theorem, t-,,,A’.Hence, t , A x v A . It follows that either I-,A” or I-,,, A ([lo], p. 55, Corollary 6), but the last alternative k,., A is false. PROOFOF THEOREM C . Given a proof in M of A , we replace each occurrence of A with P. Since no inference rule in M involves A in an essential way, we then obtain a proof of A“ in M and hence also in I. For the converse, we note that since A does not occur in A“, it follows from the separation theorem ([lo], p. 54) that F,A” implies I-,A”. Of course the substitution of

$\curlywedge$ for $P$ in the proof of the translated formula does not change the validity of the proof, and since $P$ does not occur in $A$, this substitution transforms the translated formula back to $A$.

PROOF OF THEOREM D. The theorem follows from the following two lemmata (and the converse of Lemma 2), which may be of some independent interest:

LEMMA 1. $\Gamma \vdash_I A$ if and only if $\Gamma^N \vdash_{IS4} A^N$.

LEMMA 2. If $\Gamma^N \vdash_{CS4} A^N$, then $\Gamma^N \vdash_{IS4} A^N$.

Lemma 1 is easily proved by induction over the length of the deductions. In the direction from right to left, it holds also for classical logic. In the direction from left to right, it is a peculiarity of intuitionistic logic (e.g., $\vdash_C A \vee \sim A$ but not $\vdash_{CS4} N(NA \vee N{\sim}NA)$).

Lemma 2 is the crucial step, and is proved by essential use of the theorem on normal deductions for modal logic. Let $(\Pi, \mathcal{F})$ be a normal deduction with pure parameters in $C_{S4}$ of $A^N$ from $\Gamma^N$. We assume that $(\Pi, \mathcal{F})$ contains some application $\alpha$ of the $\curlywedge$-rule by which an assumption $\sim B$ is discharged, where $B$ is not an implication ([10], p. 39, Th. 1). The assumption $\sim B$ must be major premiss of an application of the $\supset$E-rule. Clearly, it cannot be premiss of an application of the NI-rule, since that would break the restrictions on that rule. Other possibilities are ruled out because they would involve formulas that cannot be subformulas of $N$-transformed formulas, contradicting the Subformula Principle. Hence, $\Pi$ has the form shown to the left below.

[Two deduction figures, not recoverable from the source. Left: the deduction $\Pi$ containing the assumption $\sim B$ and the application $\alpha$. Right: the simplified deduction.]

$\alpha$ is to be chosen so that there is no other application of the $\curlywedge_C$-rule above $\alpha$ that discharges an assumption (applications that do not discharge assumptions are also applications of the $\curlywedge_I$-rule). We will show that $\alpha$ can be removed, and assume for induction that this holds true for every application of this kind having a lower number of formula occurrences above its conclusion.

The assumption B that we are considering is to be chosen so that there is no other assumption of this form discharged by a that (i) stands in C or (ii) stands above the major premiss of an application of the v E- or 3E-rule, the minor premiss of which stands in C, below B.


Main case: There is no assumption in C that is discharged in .XI at the minor premiss of an application of the v E- or 3E-rule. We will show that in this case there is no assumption in C that is discharged in C, which shows that a is superfluous; we can simplify Ll as shown to the right above (and .F accordingly). (Note that applications of the NI-rule in 17, cannot be disturbed by removing Cl,because formulas in C, that stand below C, depend on -B, and can then not satisfy clause 1 in the restrictions on the NI-rule (POI P. 791.) Indeed, assume that there were an assumption C in C discharged in C,. It would then have to be discharged by an application of the 3 I-rule having a conclusion C I D (the A ,-rule is excluded because of the way a was chosen). But this is impossible. The segment c to which C I D belongs cannot be a major premiss of an E-rule, since the deduction is normal. Nor can c be premiss of an application of the NI-rule, because according to the restrictions on this rule ([lo] p. 79), there should then be an essentially modal formula between B and C x D that depends only on assumptions on which C x D depends (note that B and C 2 D is not essentially modal). But every formula between - B and C x D depends on C, which C I D does not depend on. Finally, c can not be premiss of an application of some other I-rule or minor premiss of an application of an E-rule, because that would again involve formulas that can not be subformulas of N-transformed formulas, contradicting the Subformula Principle; note that c cannot be minor premiss of an application of the 3 E-rule, where the major premiss is an assumption discharged by an application of the Ac-rule, since we have assumed that such assumptions do not have the form of implications.


Special case: There is some assumption in C that is discharged in Z,at the minor premiss of an application of the v E- or 3E-rule. In this case, Ll has one of the forms shown below and we will choose an application p of the v E or 3E-rule in C, and move it down to 17,.

We want to transform $\Pi$ in the respective case to

[Deduction figures not recoverable from the source.]

However, this transformation can cause certain disturbances. Among other things, it is necessary (1) that C , v C2 or 3xC does not depend in (n,S ) on some assumption that is discharged by an application of the vE- or 3E-rule at some place in Z5, and (2) that C, or C, or C,X (where a is the proper parameter in question) does not contain some proper parameter of an application of the 3E-rule in Z5. fl will be chosen so that it satisfies (1) and (2). Let p, be an application of the v E - or 3E-rule as provided by the special case we are considering. If p, satisfies (1) and (2), we set /3 = p,. Otherwise, we consider an application p2 of the v E- or 3E-rule that makes (1) or (2) to fail for PI. If p 2 satisfies (1) and (2) we set p = p2 ; otherwise we consider a p3 that makes (1) or (2) to fail for p 2 and so on. By this process, we obtain finally a p, which satisfies (1) and (2), and we then set p=p,. Having chosen p in this way, it can be seen that the deduction obtained


by the transformation described above (where F is to be modified accordingly in the obvious way) is a correct deduction (of AN from I").To realize this, one has to check the following facts : (a) C , v C , or 3xC does not depend on any assumption discharged by an application of the 21- or Ac-rule in C, ; (b) C, or C , or C,X (where a is the proper parameter in question) does not contain the proper parameter of an application of the VI-rule in C, ; (c) no application of the NI-rule in Z12 is disturbed by the transformation. The arguments involved to prove these facts are similar to those in the main case, though we have also to utilize clause 2 in the restriction on the NI-rule ([lo] p. 79) and the (full) lemma on parameters ([lo] p. 29). By the transformation, IX is replaced by some other applications of the A ,-rule, but to all those applications, the induction assumption applies, and they can thus be removed.

PROOF OF COROLLARY D 1. We observe that each part $B$ of $A^{\circ}$ that has the form of a conjunction, disjunction, or existential formula is essentially modal as defined in [10] (p. 77). Hence, if $B$ is a part of $A^{\circ}$, $\vdash_{CS4} B \equiv NB$ ([10] p. 77, Lemma), which gives the corollary. (One may also observe that the proof of Theorem D goes through without change when the formulas are $\circ$-transformed instead of $N$-transformed.)

PROOF OF COROLLARY D 2. One proves easily that $\vdash_{CS4} A^N \equiv NA^{+}$ using

$$\vdash_{CS4} N(NA\ \&\ NB) \equiv N(A\ \&\ B)$$

and

$$\vdash_{CS4} N\forall x NA \equiv N\forall x A$$

and then applies the remark in the discussion.

References

1. CHURCH, Introduction to mathematical logic (Princeton, 1956).
2. GENTZEN, Über das Verhältnis zwischen intuitionistischen und klassischen Arithmetik, Manuscript set in type by Mathematische Annalen but not published (eingegangen am 15.3.1933).
3. GENTZEN, Untersuchungen über das logische Schliessen, Math. Z. 39 (1934) 176-210.
4. GÖDEL, Zur intuitionistischen Arithmetik und Zahlentheorie, Ergeb. math. Kolloquium, Heft 4 (1932-33) 34-38.
5. GÖDEL, Eine Interpretation des intuitionistischen Aussagenkalküls, Ergeb. math. Kolloquium, Heft 4 (1932-33) 39-40.
6. HACKING, What is strict implication? J. Symb. Logic 28 (1963) 51-71.
7. KOLMOGOROFF, O principe tertium non datur (Sur le principe de tertium non datur), Mat. Sb. (Recueil mathématique de la Société Mathématique de Moscou) 32 (1925) 646-667.
8. S. C. KLEENE, Introduction to metamathematics (Amsterdam, North-Holland Publ. Co., 1952).
9. MCKINSEY and A. TARSKI, Some theorems about the sentential calculi of Lewis and Heyting, J. Symb. Logic 13 (1948) 1-15.
10. D. PRAWITZ, Natural deduction, A proof theoretical study (Stockholm, 1965).
11. A. TARSKI, A. MOSTOWSKI and A. ROBINSON, Undecidable theories (Amsterdam, North-Holland Publ. Co., 1953).

ON THE SEMANTICS OF INTUITIONISTIC PROPOSITIONAL LOGIC (ZUR SEMANTIK DER INTUITIONISTISCHEN AUSSAGENLOGIK)

K. SCHÜTTE
Universität München

As basic symbols for forming formulas of intuitionistic propositional logic we use propositional variables, the symbol $\curlywedge$ (for the false proposition) and the connectives $\wedge$, $\vee$ and $\to$. We denote truth values by w (true) and f (false). Following S. A. Kripke*, one has the following notion of model for intuitionistic propositional logic. A model $(M, R, W)$ is given by a nonempty set $M$, a reflexive and transitive binary relation $R$ on $M$, and an assignment $W$ of a truth value $W(v, \alpha)$ to each propositional variable $v$ and each element $\alpha \in M$ with the property

$$W(v, \alpha) = w,\ \alpha R \beta \ \Rightarrow\ W(v, \beta) = w.$$

In such a model one assigns to each formula $F$, for each element $\alpha \in M$, a truth value $W(F, \alpha)$ by the following inductive definition:

1. $W(v, \alpha)$ is given by the model for each propositional variable $v$;
2. $W(\curlywedge, \alpha) = f$;
3. $W(A \wedge B, \alpha) = w$ if and only if $W(A, \alpha) = w$ and $W(B, \alpha) = w$;
4. $W(A \vee B, \alpha) = w$ if and only if $W(A, \alpha) = w$ or $W(B, \alpha) = w$;
5. $W(A \to B, \alpha) = w$ if and only if, for every $\beta \in M$ with $\alpha R \beta$, $W(A, \beta) = f$ or $W(B, \beta) = w$.

A formula $F$ is called valid in the model $(M, R, W)$ if $W(F, \alpha) = w$ for all $\alpha \in M$. A formula is called intuitionistically valid if it is valid in every model $(M, R, W)$. By induction on derivations one proves:

CONSISTENCY THEOREM. Every derivable formula of intuitionistic propositional logic is intuitionistically valid.

* S. A. Kripke: Semantical analysis of intuitionistic logic I, in: Formal systems and recursive functions, eds. J. N. Crossley and M. A. E. Dummett (Amsterdam, North-Holland Publ. Co., 1965) pp. 92-129.
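The inductive definition of $W(F, \alpha)$ above can be sketched as a small evaluator for finite models. This is only an illustration under stated assumptions: the tuple encoding of formulas, the representation of $R$ as a set of pairs, the representation of $W$ on variables as a dictionary `val` from elements to sets of true variables, and the names `holds` and `valid_in_model` are all hypothetical and not taken from the paper.

```python
def holds(f, a, worlds, R, val):
    """Return True when W(f, a) = w in the finite model (worlds, R, val).

    Formulas are nested tuples (an assumed encoding):
    ("var", "p"), ("bot",), ("and", A, B), ("or", A, B), ("imp", A, B).
    """
    tag = f[0]
    if tag == "var":
        return f[1] in val[a]
    if tag == "bot":
        return False
    if tag == "and":
        return holds(f[1], a, worlds, R, val) and holds(f[2], a, worlds, R, val)
    if tag == "or":
        return holds(f[1], a, worlds, R, val) or holds(f[2], a, worlds, R, val)
    if tag == "imp":
        # Clause 5: check every b with a R b.
        return all(
            (not holds(f[1], b, worlds, R, val)) or holds(f[2], b, worlds, R, val)
            for b in worlds
            if (a, b) in R
        )
    raise ValueError("unknown connective: " + str(tag))


def valid_in_model(f, worlds, R, val):
    """A formula is valid in the model when it holds at every element."""
    return all(holds(f, a, worlds, R, val) for a in worlds)


# Two-element model: 0 R 1, R reflexive and transitive, p true only at 1.
worlds = [0, 1]
R = {(0, 0), (0, 1), (1, 1)}
val = {0: set(), 1: {"p"}}

p = ("var", "p")
not_p = ("imp", p, ("bot",))
excluded_middle = ("or", p, not_p)

print(valid_in_model(("imp", p, p), worlds, R, val))
print(valid_in_model(excluded_middle, worlds, R, val))
```

In this model $p \vee (p \to \curlywedge)$ fails at the element 0, because $p$ is false there while $p$ becomes true at the later element 1, so the formula is not intuitionistically valid.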

The aim of this note is a simple proof of the converse:

COMPLETENESS THEOREM. Every intuitionistically valid formula of propositional logic is intuitionistically derivable.

For the proof of this theorem we use the following notation: Small Greek letters denote finite (possibly empty) sets of propositional formulas. $\alpha \to \beta$ denotes the formula

- $A_1 \wedge \dots \wedge A_m \to B_1 \vee \dots \vee B_n$, if $\alpha = \{A_1, \dots, A_m\}$ and $\beta = \{B_1, \dots, B_n\}$,
- $B_1 \vee \dots \vee B_n$, if $\alpha$ is empty and $\beta = \{B_1, \dots, B_n\}$,
- $A_1 \wedge \dots \wedge A_m \to \curlywedge$, if $\alpha = \{A_1, \dots, A_m\}$ and $\beta$ is empty,
- $\curlywedge$, if $\alpha$ and $\beta$ are empty.

Here the order of the formulas in the sets $\alpha$ and $\beta$ is to be immaterial.

In what follows let $F$ be a fixed propositional formula. $T(F)$ is the finite set of all subformulas of $F$. An ordered pair $(\alpha, \beta)$ of subsets $\alpha, \beta$ of the set $T(F)$ is called consistent if the formula $\alpha \to \beta$ is not intuitionistically derivable. Clearly $\alpha \cap \beta$ is then empty. The pair $(\alpha, \beta)$ is called $F$-complete if $\alpha \cup \beta = T(F)$. A pair $(\alpha^*, \beta^*)$ is called an extension of $(\alpha, \beta)$ if $\alpha \subseteq \alpha^*$ and $\beta \subseteq \beta^*$. For every formula $C$ we have:

LEMMA 1. If $(\alpha, \beta)$ is consistent, then $(\alpha \cup \{C\}, \beta)$ or $(\alpha, \beta \cup \{C\})$ is consistent.

Proof. If $(\alpha, \beta \cup \{C\})$ and $(\alpha \cup \{C\}, \beta)$ are inconsistent, then the formulas $\alpha \to \beta \cup \{C\}$ and $C \to (\alpha \to \beta)$ are intuitionistically derivable. Then $\alpha \to \beta$ is also intuitionistically derivable, so $(\alpha, \beta)$ is inconsistent.

From Lemma 1 follows:

LEMMA 2. Every consistent pair $(\alpha, \beta)$ of subsets $\alpha, \beta$ of the set $T(F)$ can be extended to an $F$-complete consistent pair.

A subset $\alpha$ of the set $T(F)$ is called $F$-distinguished if the $F$-complete pair $(\alpha, T(F) - \alpha)$ is consistent. $U(F)$ is the set of all $F$-distinguished subsets of $T(F)$.

LEMMA 3. $U(F)$ is not empty.

Proof. The pair $(\emptyset, \emptyset)$ is consistent, since the formula $\curlywedge$ is not intuitionistically derivable. By Lemma 2 there is an $F$-complete consistent pair $(\alpha, \beta)$. This gives an $\alpha \in U(F)$.

Remark. It may happen that $U(F) = \{\emptyset\}$. For example, $T(\curlywedge) = \{\curlywedge\}$ and $(\emptyset, \{\curlywedge\})$ is the only $\curlywedge$-complete consistent pair, so $U(\curlywedge) = \{\emptyset\}$.

LEMMA 4. A formula $C \in T(F)$ belongs to a set $\alpha \in U(F)$ if and only if the formula $\alpha \to C$ is intuitionistically derivable.

Proof. For each formula $C \in \alpha$, $\alpha \to C$ is trivially intuitionistically derivable. If $C \notin \alpha$, then $C \in T(F) - \alpha$ and, since $(\alpha, T(F) - \alpha)$ is consistent, $(\alpha, \{C\})$ is also consistent, so $\alpha \to C$ is not intuitionistically derivable.

Definition of the distinguished model $(U(F), \subseteq, W)$: For each propositional variable $v$ and each element $\alpha \in U(F)$ let

$$W(v, \alpha) = \begin{cases} w, & \text{if } v \in \alpha, \\ f, & \text{if } v \notin \alpha. \end{cases}$$

This gives a model, since $U(F)$ is not empty, $\subseteq$ is a reflexive and transitive relation on $U(F)$, and by definition

$$W(v, \alpha) = w,\ \alpha \subseteq \beta \ \Rightarrow\ W(v, \beta) = w$$

holds. We shall see that the formula $F$ is invalid in this model if it is not intuitionistically derivable. To this end we prove that the distinguished model $(U(F), \subseteq, W)$ has the following property:

LEMMA 5. For $C \in T(F)$ and $\alpha \in U(F)$, $W(C, \alpha) = w$ if and only if $C \in \alpha$.

Proof by induction on the length of the formula $C$.

1. Let $C$ be a propositional variable. Then the claim holds by the definition of the distinguished model.

2. Let $C$ be the formula $\curlywedge$. Then $(\{C\}, \emptyset)$ is inconsistent, hence $C \notin \alpha$ and by definition $W(C, \alpha) = f$.

3. Let $C$ be a formula $A \wedge B$. Then $W(A \wedge B, \alpha) = w$ $\Leftrightarrow$ $W(A, \alpha) = w$ and $W(B, \alpha) = w$ $\Leftrightarrow$ $A \in \alpha$ and $B \in \alpha$ (by induction hypothesis) $\Leftrightarrow$ $\alpha \to A$ and $\alpha \to B$ intuitionistically derivable (by Lemma 4) $\Leftrightarrow$ $\alpha \to A \wedge B$ intuitionistically derivable $\Leftrightarrow$ $A \wedge B \in \alpha$ (by Lemma 4).

4. Let $C$ be a formula $A \vee B$. Then $W(A \vee B, \alpha) = w$ $\Leftrightarrow$ $W(A, \alpha) = w$ or $W(B, \alpha) = w$ $\Leftrightarrow$ $A \in \alpha$ or $B \in \alpha$ (by induction hypothesis) $\Leftrightarrow$ $\alpha \to A$ or $\alpha \to B$ intuitionistically derivable (by Lemma 4) $\Rightarrow$ $\alpha \to A \vee B$ intuitionistically derivable $\Leftrightarrow$ $A \vee B \in \alpha$ (by Lemma 4).

Conversely: if $\alpha \to A \vee B$ is intuitionistically derivable, then $(\alpha, \{A, B\})$ is inconsistent. Since $\{A, B\} \subseteq T(F)$, it follows that $A \in \alpha$ or $B \in \alpha$. This yields $W(A \vee B, \alpha) = w \Leftrightarrow A \vee B \in \alpha$.

5. Let $C$ be a formula $A \to B$. Then $A \to B \notin \alpha$ $\Leftrightarrow$ $\alpha \to (A \to B)$ not intuitionistically derivable (by Lemma 4) $\Leftrightarrow$ $(\alpha \cup \{A\}, \{B\})$ consistent $\Leftrightarrow$ $A \in \beta$ and $B \notin \beta$ for some $\beta \in U(F)$ with $\alpha \subseteq \beta$ (by Lemma 2) $\Leftrightarrow$ $W(A, \beta) = w$ and $W(B, \beta) = f$ for some $\beta \in U(F)$ with $\alpha \subseteq \beta$ (by induction hypothesis) $\Leftrightarrow$ $W(A \to B, \alpha) = f$.

Proof of the completeness theorem. Let the propositional formula $F$ be not intuitionistically derivable. Then $(\emptyset, \{F\})$ is a consistent pair. By Lemma 2 there is an $F$-complete consistent pair $(\alpha, \beta)$ with $F \in \beta$. For this pair $\alpha \in U(F)$ and $F \notin \alpha$, so by Lemma 5 $W(F, \alpha) = f$. Thus $F$ is not intuitionistically valid.

Remark. This proof also yields a decision procedure for intuitionistic propositional logic.

The completeness theorem can be generalized to arbitrary (also infinite) sets of formulas in the following way. A pair $(\alpha, \beta)$ of sets of formulas $\alpha, \beta$ is called consistent if there are no finite subsets $\alpha_0 \subseteq \alpha$ and $\beta_0 \subseteq \beta$ such that the formula $\alpha_0 \to \beta_0$ is intuitionistically derivable. The pair $(\alpha, \beta)$ is called interpretable if there is a model $(M, R, W)$ and a $\xi \in M$ with

$$W(A, \xi) = w \text{ for all } A \in \alpha, \qquad W(B, \xi) = f \text{ for all } B \in \beta.$$

From the consistency theorem follows: every interpretable pair $(\alpha, \beta)$ is consistent. Conversely, we also have:

GENERALIZED COMPLETENESS THEOREM. Every consistent pair $(\alpha, \beta)$ is interpretable.

This theorem can be proved as follows. Analogously to Lemma 1 one proves for arbitrary sets of formulas $\alpha, \beta$ and for every formula $C$:

LEMMA 1*. If $(\alpha, \beta)$ is consistent, then $(\alpha \cup \{C\}, \beta)$ or $(\alpha, \beta \cup \{C\})$ is consistent.

A pair $(\alpha, \beta)$ is called maximal consistent if it is consistent and $\alpha \cup \beta$ is the set of all formulas. From Lemma 1* follows:

LEMMA 2*. Every consistent pair $(\alpha, \beta)$ can be extended to a maximal consistent pair.

Let $(\alpha_0, \beta_0)$ be a given consistent pair. Let $(\alpha_1, \beta_1)$ be a maximal consistent extension of $(\alpha_0, \beta_0)$. We construct a set $M$ of sets of formulas with the following properties:

1. $\alpha_1 \in M$.
2. For each $\alpha \in M$, $(\alpha, \bar{\alpha})$ is consistent. ($\bar{\alpha}$ is the complement of $\alpha$.)
3. For each $\alpha \in M$ and any two formulas $A, B$ for which the pair $(\alpha \cup \{A\}, \{B\})$ is consistent, there is a $\beta \in M$ with $\alpha \subseteq \beta$, $A \in \beta$ and $B \notin \beta$.

A model $(M, \subseteq, W)$ is then defined so that for each propositional variable $v$ and each $\alpha \in M$, $W(v, \alpha) = w$ if and only if $v \in \alpha$. Analogously to Lemma 5 one proves: for every formula $C$ and every $\alpha \in M$, $W(C, \alpha) = w$ if and only if $C \in \alpha$. It follows that $W(A, \alpha_1) = w$ for all $A \in \alpha_0$ and $W(B, \alpha_1) = f$ for all $B \in \beta_0$. This proves the generalized completeness theorem. A consequence is the

COMPACTNESS THEOREM. If $(\alpha_0, \beta_0)$ is interpretable for all finite subsets $\alpha_0 \subseteq \alpha$ and $\beta_0 \subseteq \beta$, then $(\alpha, \beta)$ is interpretable.

RECURSION THEORY AND THE THEOREM OF RAMSEY IN ONE-PLACE SECOND ORDER SUCCESSOR ARITHMETIC

D. SIEFKES
Mathematisches Institut der Universität Heidelberg

In his paper [1] Büchi gives a decision method for a system of arithmetic which has the successor function as its only operation, but is built up in the strong logical frame of second order one-place predicate calculus. This system is a very suitable tool for examining the behaviour of finite automata (sequential machines, cf. e.g. Rabin and Scott [7]). From its decidability follows the solvability of the automata decision problem for this language (cf. Church [2]). On the other hand, Büchi uses a good part of the theory of automata for his decision procedure. In view of this close connection he calls the system sequential calculus (SC).

Büchi can use the means of the theory of automata, since he sets up semantically both the system SC and the decision procedure. If one wants to have a formal approach, one has to analyze the decision procedure, as Büchi already suggests, to get a complete axiom system for SC. To do so, we eliminate in [6] the theory of automata from the decision procedure and show that a certain kind of formulae works as finite automata within the system. To this end we set up an axiom system for second order one-place predicate calculus, and from these axioms and the Peano axioms for the successor function we build up primitive recursion theory as far as it is expressible in the language of SC. With the help of recursion theory we further derive theorem A of Ramsey [8], which was used by Büchi as a second help from outside and which he proposed to be the most interesting candidate for an axiom schema for SC. A careful examination shows that the remaining steps of the decision procedure are derivable; thus this very simple axiom system is complete.

In fact it is the idea of Church (cf. e.g. [2]) to handle automata problems by recursion; for this purpose he uses open recursive theories (quantifiers are excluded, but the introduction of new predicates by recursion equivalences is allowed). Therefore recursion theory suggests itself as a compensation of automata theory; but it is surprising that the whole theory of primitive recursion does not exceed the power of SC. In the sequel we shall speak simply of "recursion", omitting the word "primitive".

In this paper we shall give the derivation of the theorem of Ramsey within the system SC. Since Ramsey uses metamathematical recursion which is not available in SC, the translation of the original proof into the formal language is one of the central points in establishing the completeness of SC. Thus this topic seems to be worth a separate treatment. In Section 1 we state the language and the axioms of SC; in Section 2 we derive recursion theory and from this in Section 3 the Ramsey theorem. A presentation of our version of the decision procedure and of the connections between SC and the theory of automata will be given in full detail in [6]. I wish to thank Prof. G. H. Müller for his kind criticism and helpful remarks in writing down the paper.

1. The system SC

Since we want to derive the theorem of Ramsey within the system SC, we describe the language and the logical frame before giving the non-logical axioms.

Object language: As individual variables we use small Latin letters: a, b, c, ... as free and t, x, y, z as bound variables. Analogously, A, B, C, ... are free and P, Q, R, S bound one-place predicate variables. The quantifiers $\forall$, $\exists$ serve for both types of variables. Further we use sentential connectives, T and F for the truth values "true" and "false", and brackets and dots for bracketing of formulae (dots extend over brackets). As the only non-logical signs we have the individual constant 0 to denote the zero element and the one-place function symbol ' for the successor function.

Metalanguage: We denote formulae by German capitals and natural numbers (for indices etc.) by small German letters. For the rest we use the signs of the object language in the metalanguage.

Logical axioms: We freely use rules of propositional calculus without mention. The other axioms and rules we state in pairs (for individual and predicate variables respectively). It is to be understood that one has to avoid collision of variables.

1) Substitution rule:

(a term).

(8 formula with one marked free individual variable).


2) Changing of bound variables:

( Q quantifier). 3) Axioms for quantifiers:

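(AQI1) $(\forall x)\,\mathfrak{A}(x) \to \mathfrak{A}(a)$
(AQI2) $\mathfrak{A}(a) \to (\exists x)\,\mathfrak{A}(x)$
(AQP1) $(\forall P)\,\mathfrak{A}(P) \to \mathfrak{A}(A)$
(AQP2) $\mathfrak{A}(A) \to (\exists P)\,\mathfrak{A}(P)$

4) Rules for quantifiers:

(RQI1) $\dfrac{\mathfrak{B} \to \mathfrak{A}(a)}{\mathfrak{B} \to (\forall x)\,\mathfrak{A}(x)}$
(RQI2) $\dfrac{\mathfrak{A}(a) \to \mathfrak{B}}{(\exists x)\,\mathfrak{A}(x) \to \mathfrak{B}}$
(RQP1) $\dfrac{\mathfrak{B} \to \mathfrak{A}(A)}{\mathfrak{B} \to (\forall P)\,\mathfrak{A}(P)}$
(RQP2) $\dfrac{\mathfrak{A}(A) \to \mathfrak{B}}{(\exists P)\,\mathfrak{A}(P) \to \mathfrak{B}}$

($a$ not in $\mathfrak{B}$), ($A$ not in $\mathfrak{B}$).

We call this logical frame P'K(2): second order predicate calculus with one-place predicate variables. Evidently the axiom system is not independent. It is well known that within this frame equality is definable by

$$a = b \leftrightarrow_{df} (\forall P)[P(a) \to P(b)].$$

Further a special form of the replacement theorem is derivable, which we call the principle of extensionality:

(EXT) $(\forall x)[A(x) \leftrightarrow B(x)] \ .\!\to\!.\ \mathfrak{A}(A) \to \mathfrak{A}(B).$

The derivation is by induction over the length of the formula $\mathfrak{A}$ and does not use (SP). By the same method we get the generalization

$$(\forall x)[A(x) \leftrightarrow \mathfrak{B}(x)] \ .\!\to\!.\ \mathfrak{A}(A) \to \mathfrak{A}(\mathfrak{B}).$$

With the help of this formula we show that the substitution rule (SP) is equivalent to the principle of comprehension:

(COMP) $(\exists P)(\forall x)[P(x) \leftrightarrow \mathfrak{A}(x)]$ ($P$ not in $\mathfrak{A}$).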

This shows that P'K(2) may be considered as a fragment of set theory (cf. Robinson [9], Hasenjaeger [3], McNaughton [5]). In most derivations we will use (COMP) instead of (SP), and it is in fact this highly impredicative comprehension rule which, together with the induction axiom, gives SC its strength.

Non-logical axioms: The three Peano axioms for successor are sufficient. We need no schema in view of (SP):

(A1) $a' = b' \to a = b$,
(A2) $a' \neq 0$,
(I) $A(0) \wedge (\forall t)[A(t) \to A(t')] \to (\forall t)\,A(t)$.

The system built up in P'K(2) by these non-logical axioms is called sequential calculus SC. First of all we get from (I) by (SP) the induction schema

(IS) $\mathfrak{A}(0) \wedge (\forall t)[\mathfrak{A}(t) \to \mathfrak{A}(t')] \to (\forall t)\,\mathfrak{A}(t)$.

Further it is known (cf. Hilbert and Bernays [4], p. 490-491) that order is definable by

$$a < b \leftrightarrow_{df} (\exists P)[P(a) \wedge (\forall t)[P(t') \to P(t)] \wedge \neg P(b)].$$

We use the following abbreviations (cf. Buchi [l]): 1)

2) 3) 4) 5)

( I t ) : % ( t ) ++df(3t)r U < t % ( t ) Hdf(3X) ( V t ) Tx < t + %(t>l , (3P)"'U(P)++df(3P) r ( 3 v ~ ( t A) ~ P ) . I

rx

-+

1) and 2) are the familiar restrictions of quantifiers; sometimes we also use $(\exists t)_a\,\mathfrak{A}(t)$ and $(\forall t)_a\,\mathfrak{A}(t)$, if there is only a lower bound. 3) is to be read as "there are infinitely many t", 4) as "for ultimately all t", 5) as "there is an infinite P". Note that $\neg(\exists^{\omega} t)\,\mathfrak{A}(t) \leftrightarrow (\forall^{\omega} t)\,\neg\mathfrak{A}(t)$ is derivable.

For recursion theory we need still another abbreviation: Let $\mathfrak{A}(a, E(t); t < a)$ mean that in the formula $\mathfrak{A}$ the predicate variable $E$ occurs only with bound arguments restricted by $a$. For such a formula $\mathfrak{A}(a, E)$ we have a restricted form of (EXT):

(REXT) $(\forall x)_0^{a}[A(x) \leftrightarrow B(x)] \ .\!\to\!.\ \mathfrak{A}(a, A) \leftrightarrow \mathfrak{A}(a, B).$

2. Recursion theory

In his paper [2] Church considers "wider restricted recursive arithmetic", a number-theoretic system similar to SC, which has no quantifiers but allows the introduction of predicates by a certain recursion rule. A simple instance of this recursion schema would be

$$A(0) \leftrightarrow \mathfrak{A}[B_1(0), \dots, B_n(0)], \qquad A(a') \leftrightarrow \mathfrak{B}[B_1(a), \dots, B_n(a), A(a)].$$
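A simple instance of such a schema can be sketched in code: a predicate is computed position by position from given input predicates, each value depending on the inputs and on the previously computed value. The function `define_by_recursion`, its parameters, and the example predicates are hypothetical names chosen for this illustration; the sketch only computes a finite initial part of the defined predicate and makes no claim about derivability in SC.

```python
def define_by_recursion(base, step, inputs, length):
    """Compute A(0), ..., A(length - 1) for the schema

        A(0)  <->  base(B_1(0), ..., B_n(0))
        A(a') <->  step(B_1(a), ..., B_n(a), A(a))

    where `inputs` is the list of predicates B_1, ..., B_n given as
    Python functions from natural numbers to booleans.
    """
    values = [base(*(B(0) for B in inputs))]
    for a in range(length - 1):
        values.append(step(*(B(a) for B in inputs), values[a]))
    return values


def multiple_of_three(t):
    """Example input predicate B_1 (an arbitrary choice)."""
    return t % 3 == 0


# A(0) <-> B_1(0);  A(a') <-> (B_1(a) xor A(a)).
initial_part = define_by_recursion(
    base=lambda b: b,
    step=lambda b, previous: b != previous,
    inputs=[multiple_of_three],
    length=5,
)
print(initial_part)
```

The step function plays the role of a transition table reading one input letter and the current state, which is the sense in which such recursively defined predicates behave like finite automata.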

In fact Church considers multiple recursion with any (fixed) number (not only 1) as step distance, but we want to generalize it slightly in another direction : For the whole section let B(a7E ( t ) ;?
(R)

defines a predicate in SC, that means that the existence of such a predicate is derivable. Clearly this cannot be done by (COMP) alone, since E is contained in 23, too. So we have to do a little more.* Note that by our definition of restriction 2330, E ( t ) ; t < O ) is equivalent to a formula 0. not containing E at all. Thus we have by (COMP) ( 3 P ) (V‘t) rP(t)-0.1 and therefore ( 3 P ) rP(O)Ct0.1, which ensures us the existence of a predicate starting as we want it. (Of course we could have stated the whole problem with an extra initial condition.) First of all we derive the uniqueness of recursive definition:

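LEMMA 1. $(\forall z)_0^b\,[E(z) \leftrightarrow \mathfrak{B}(z, E)] \wedge (\forall z)_0^b\,[D(z) \leftrightarrow \mathfrak{B}(z, D)] \rightarrow (\forall z)_0^b\,[E(z) \leftrightarrow D(z)]$.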

Proof. By induction over b: The beginning b = 0 is trivial. From the premise for b' follows the premise for b and thus by induction hypothesis the conclusion for b: $(\forall z)_0^b\,[E(z) \leftrightarrow D(z)]$. From this we get by (REXT) $\mathfrak{B}(b, E) \leftrightarrow \mathfrak{B}(b, D)$ and therefore $E(b) \leftrightarrow D(b)$, which gives the conclusion for b'. Next we derive the existence of any initial part of a recursively defined predicate:

LEMMA 2. $(\exists S)(\forall z)_0^{b'}\,[S(z) \leftrightarrow \mathfrak{B}(z, S)]$.

Proof. Induction over b: The beginning is trivial (see above). For the

* Of course, the proof of this “recursion theorem” follows the known pattern of the proof of the recursion theorem e.g. in set theory.


D. SIEFKES

and therefore

In other words:

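$\mathfrak{B}^*(b, B) \wedge (\forall z)_0^{b'}\,[D(z) \leftrightarrow B(z)] \wedge [D(b') \leftrightarrow \mathfrak{B}(b', B)] \rightarrow \mathfrak{B}^*(b, D) \wedge [D(b') \leftrightarrow \mathfrak{B}(b', D)]$.

By quantification we get

$\mathfrak{B}^*(b, B) \wedge (\exists S)\,\big[(\forall z)_0^{b'}\,[S(z) \leftrightarrow B(z)] \wedge [S(b') \leftrightarrow \mathfrak{B}(b', B)]\big] \rightarrow (\exists S)\,\mathfrak{B}^*(b', S)$.

From this follows by (1)

$\mathfrak{B}^*(b, B) \rightarrow (\exists S)\,\mathfrak{B}^*(b', S)$,

which leads to the wanted conclusion

$(\exists S)\,\mathfrak{B}^*(b, S) \rightarrow (\exists S)\,\mathfrak{B}^*(b', S)$.

Now we are able to prove our wanted theorem: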

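RECURSION THEOREM. For any formula $\mathfrak{B}(a, E(t); t < a)$ the following is derivable in SC:

$(\exists S)(\forall z)\,[S(z) \leftrightarrow \mathfrak{B}(z, S)]$.

Proof. Let $\mathfrak{B}^*$ be defined as in the proof of Lemma 2. By (COMP) we have

$(\exists S)(\forall z)\,\big[S(z) \leftrightarrow (\exists R)\,[\mathfrak{B}^*(z, R) \wedge R(z)]\big]$.

Let us abbreviate this formula by

(1) $(\exists S)\,\mathfrak{D}(S)$.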


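It is easy to see that with the help of Lemma 1 we have in particular

$\mathfrak{B}^*(b, C) \wedge \mathfrak{D}(D) \ .\rightarrow.\ (\forall t)_0^b\,[C(t) \leftrightarrow D(t)]$,

$\mathfrak{B}^*(b, C) \wedge \mathfrak{D}(D) \ .\rightarrow.\ C(b) \leftrightarrow D(b)$.

By (REXT) we have

$(\forall t)_0^b\,[C(t) \leftrightarrow D(t)] \ .\rightarrow.\ \mathfrak{B}(b, C) \leftrightarrow \mathfrak{B}(b, D)$

and by definition of $\mathfrak{B}^*$

$\mathfrak{B}^*(b, C) \ .\rightarrow.\ C(b) \leftrightarrow \mathfrak{B}(b, C)$.

Together we get

$\mathfrak{B}^*(b, C) \wedge \mathfrak{D}(D) \ .\rightarrow.\ D(b) \leftrightarrow \mathfrak{B}(b, D)$.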

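Quantification gives us step by step

$(\exists R)\,\mathfrak{B}^*(b, R) \wedge \mathfrak{D}(D) \ .\rightarrow.\ D(b) \leftrightarrow \mathfrak{B}(b, D)$,

$(\forall z)(\exists R)\,\mathfrak{B}^*(z, R) \wedge \mathfrak{D}(D) \ .\rightarrow.\ (\forall z)\,[D(z) \leftrightarrow \mathfrak{B}(z, D)]$,

$(\forall z)(\exists R)\,\mathfrak{B}^*(z, R) \wedge (\exists S)\,\mathfrak{D}(S) \ .\rightarrow.\ (\exists S)(\forall z)\,[S(z) \leftrightarrow \mathfrak{B}(z, S)]$.

With the help of Lemma 2 and (1) we have our theorem.

Three remarks to conclude:

1) It is clear that the recursion theorem holds for ordinary recursion, say

(1) $E(0) \leftrightarrow T$, $\quad E(a') \leftrightarrow \mathfrak{B}(E(a), A_1(a), \dots, A_n(a))$.

For (1) can be transformed into the form (R), e.g.

$E(a) \leftrightarrow a = 0 \vee (\exists z)_0^a\,[z' = a \wedge \mathfrak{B}(E(z), A_1(z), \dots, A_n(z))]$.

If we have F instead of T in the first equivalence, we drop the disjunct a = 0.

2) The recursion theorem holds as well for simultaneous recursion of two or more predicates

$E_i(a) \leftrightarrow \mathfrak{B}_i(a, E_1(t), \dots, E_n(t); t < a)$, $\quad i = 1, \dots, n$.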

One may prove this fact either by the completeness of SC or by generalizing the proof of the recursion theorem.

3) It follows very easily from the decision procedure of Büchi that every predicate definable in SC is ultimately periodic. Phase and period of the defined predicate can even be computed effectively from the defining formula. Every ultimately periodic predicate is explicitly definable in SC. We conclude


that every predicate definable in SC is explicitly definable in SC. This seems to make our recursion theorem redundant: If every definition of the form

(R) $E(a) \leftrightarrow \mathfrak{B}(a, E(t); t < a)$

can be converted into an explicit one

(2) $E(a) \leftrightarrow \mathfrak{A}(a)$ (E not in $\mathfrak{A}$),

then the existence of a predicate defined by (R) seems to be derivable in SC with the help of (COMP) alone. But this is not true, since we need the completeness of our axiom system to derive in SC that the predicates defined by (R) and (2) respectively are the same, i.e. to derive
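The way a definition of the form (R) determines a predicate value by value can be illustrated computationally. The following is only a sketch and not part of the formal development: the step formula is modelled by a hypothetical Python function `step(a, earlier)` that may inspect nothing but the values E(t) with t < a, and `define_by_recursion` (also a made-up name) builds an initial part of E in the spirit of Lemma 2. That the result depends on `step` alone mirrors the uniqueness statement of Lemma 1.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in for a step formula B(a, E(t); t < a):
# it receives a and the earlier values E(0), ..., E(a-1) only.
Step = Callable[[int, Sequence[bool]], bool]


def define_by_recursion(step: Step, bound: int) -> List[bool]:
    """Return [E(0), ..., E(bound-1)] with E(a) <-> step(a, E restricted to t < a)."""
    values: List[bool] = []
    for a in range(bound):
        # Only the already computed initial part is visible to the step.
        values.append(step(a, tuple(values)))
    return values


def ordinary_recursion_step(a: int, earlier: Sequence[bool]) -> bool:
    """An ordinary recursion E(0) <-> T, E(a') <-> not E(a), written in the form (R)."""
    # E(a) <-> a = 0 or (exists z < a)[z' = a and not E(z)]
    return a == 0 or any(z + 1 == a and not earlier[z] for z in range(a))


if __name__ == "__main__":
    print(define_by_recursion(ordinary_recursion_step, 6))
```

The example step is an illustrative choice; any function that reads only `earlier` fits the same pattern, which is the computational counterpart of the restriction t < a.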

3. The theorem of Ramsey

For any natural number n > 0 and any set M let $\pi_n(M)$ be the set of all n-element subsets of M. Then Theorem A of Ramsey [8] reads as follows:

THEOREM 1 (Ramsey). Let N be a set, let n, m be natural numbers, let $L_1, \dots, L_m$ be a partition of $\pi_n(N)$. Then there exists a number j, $1 \le j \le m$, and an infinite subset M of N such that $\pi_n(M) \subseteq L_j$.

To prove this theorem in SC means first that we choose as N the set $\mathbb{N}$ of natural numbers. For the beginning we will further restrict ourselves to the case n = 2 and m = 2 (we will treat the general case at the end of the section). Since we have a well-ordered domain, we may speak of sequences instead of subsets and of "ascending" pairs (a, b) (i.e. a

$a_j \in \{c_0^i, c_1^i, \dots\}$ for $i < j$, we have $(a_i, a_j) \in L_1$ for every pair $i < j$. Thus the set $\{a_0, a_1, \dots\}$ fulfills our theorem in this case.

As for the other case let us assume that the above construction breaks down at step l. Thus for each j we have $(c_j^l, c_i^l) \in L_1$ for at most finitely many i, therefore $(c_j^l, c_i^l) \in L_2$ for ultimately all i. Let us denote $c_0^l$ by $b_0$ and the infinitely many $c_i^l$ with $(b_0, c_i^l) \in L_2$ by $d_0^0 < d_1^0 < \dots$. Now we may continue the above selecting process for $L_2$ at infinity, because for every j the sets $\{b_j, d_0^j, d_1^j, \dots\}$ are contained in $\{c_0^l, c_1^l, \dots\}$ and therefore $(d_0^j, d_i^j) \in L_2$ for ultimately all i. Thus in this case, too, we get the wanted conclusion using the set $\{b_0, b_1, \dots\}$.

To formulate Theorem 1 in the language of SC we have to express sets by predicates. Since SC has only one-place predicate variables, we cannot write down the theorem for arbitrary, but only for definable partition sets. So let $\mathfrak{F}(a, b)$ be an SC-formula which may contain other individual and predicate variables. Then we may translate the Ramsey theorem into the language of SC by the following formula (it is just this formula, for special formulae $\mathfrak{F}(a, b)$, that Büchi suggested as axiom schema for SC):

THEOREM 2. $(\exists Q)^\omega.\ (\forall y)(\forall x)_0^y\,[Q(x) \wedge Q(y) \rightarrow \mathfrak{F}(x, y)] \ \vee\ (\forall y)(\forall x)_0^y\,[Q(x) \wedge Q(y) \rightarrow \neg\mathfrak{F}(x, y)]$.
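The selecting process of the informal argument has a simple finite analogue, sketched below. It is not the construction formalized in this section: on a finite range one cannot ask whether a set is infinite, so the sketch keeps the larger of the two classes at each step instead, and it chooses the final colour by counting. The names `homogeneous_subset` and `colour` are hypothetical; `colour(x, y)` returning `True` plays the role of the ascending pair (x, y) lying in the first partition set.

```python
from typing import Callable, List, Tuple

# colour(x, y) is only ever called with x < y.
Colour = Callable[[int, int], bool]


def homogeneous_subset(n: int, colour: Colour) -> Tuple[List[int], bool]:
    """Return an ascending list whose pairs all have the returned colour."""
    candidates = list(range(n))
    picked: List[Tuple[int, bool]] = []
    while candidates:
        a, rest = candidates[0], candidates[1:]
        first_class = [x for x in rest if colour(a, x)]
        second_class = [x for x in rest if not colour(a, x)]
        # Keep the larger class, so every later element relates to a uniformly.
        if len(first_class) >= len(second_class):
            picked.append((a, True))
            candidates = first_class
        else:
            picked.append((a, False))
            candidates = second_class
    in_first = [a for a, label in picked if label]
    in_second = [a for a, label in picked if not label]
    if len(in_first) >= len(in_second):
        return in_first, True
    return in_second, False


if __name__ == "__main__":
    subset, label = homogeneous_subset(64, lambda x, y: (x + y) % 3 == 0)
    print(subset, label)
```

Each selected element is related in one fixed way to everything selected after it, which is the same idea as passing from $C_{i-1}$ to $C_i$; the halving only guarantees a homogeneous set of logarithmic size, in contrast to the infinite set of the theorem.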

The first step in formalizing the proof given above consists in replacing sets by predicates: We want to define recursively a predicate E determined just by the sequence $a_0, a_1, \dots$. Thus $E(a_i)$ holds if and only if there exists an infinite predicate $C_i$ with

$C_{i-1}(a_i) \wedge (\forall x)\,[C_i(x) \rightarrow a_i < x \wedge \mathfrak{F}(a_i, x) \wedge C_{i-1}(x)]$.

We have seen in Section 2 that we are able to introduce in SC a predicate by recursion over the arguments, but it is impossible to introduce a sequence of predicates by recursion over the indices. Thus we avoid the explicit construction of the sequence $C_0, C_1, \dots$; we use only the existence of such a predicate $C_j$ for every j, $0 \le j \le i$. Let us abbreviate the condition

$A(b) \rightarrow a < b \wedge \mathfrak{F}(a, b)$

by $\mathfrak{G}^+(A, a, b)$ and the formula

$(\exists^\omega t)\,B(t) \wedge B(a) \wedge (\forall x)\,\big[[A(x) \rightarrow B(x)] \wedge \mathfrak{G}^+(B, b, x)\big]$

by $\mathfrak{D}^+(A, B, b, a)$. If we replace in these formulae A and B by $C_i$ and $C_{i-1}$ respectively, then we have the conditions upon $C_i$ and $C_{i-1}$ of the last paragraph. Thus we may express the above considerations by the following formula:

(1) $E(a) \leftrightarrow (\exists Q)^\omega\,\big[(\forall x)\,\mathfrak{G}^+(Q, a, x) \wedge (\forall y)_0^a\,[E(y) \rightarrow (\exists P)\,\mathfrak{D}^+(Q, P, y, a)]\big]$.

For the following proofs we abbreviate this by

$E(a) \leftrightarrow (\exists Q)\,\mathfrak{B}(E, Q, a)$.

As for the second case we introduce analogously a predicate H determined by the sequence $b_0, b_1, \dots$. Again we avoid the predicates $D_i$ which are determined by the sets $\{d_0^i, d_1^i, \dots\}$. Let $\mathfrak{G}^-$ and $\mathfrak{D}^-$ be the same formulae as $\mathfrak{G}^+$ and $\mathfrak{D}^+$ with $\neg\mathfrak{F}$ instead of $\mathfrak{F}$. Then we formalize the second case by

(2) $H(a) \leftrightarrow (\forall t)_a\,\neg E(t) \wedge (\exists Q)^\omega\,\big[(\forall x)\,\mathfrak{G}^-(Q, a, x) \wedge (\forall y)_0^a\,[E(y) \rightarrow (\exists P)\,\mathfrak{D}^+(Q, P, y, a)] \wedge (\forall y)_0^a\,[H(y) \rightarrow (\exists P)\,\mathfrak{D}^-(Q, P, y, a)]\big]$.

As abbreviations for the right side of the equivalence we use

$(\forall t)_a\,\neg E(t) \wedge (\exists Q)\,\mathfrak{C}(E, H, Q, a)$

or even shorter $\mathfrak{C}(E, H, a)$. Since the recursions (1) and (2) are of type (R) in Section 2, we have at once the existence of E and H:

LEMMA 1.

Further it is very easy to see that the recursions (1) and (2) are good for our purpose: if E or H is infinite, then we get an infinite Ramsey set as wanted in Theorem 2. We show this in the following Lemmata 3 and 4. The whole difficulty is reduced to showing: if E is finite, then H must be infinite; this will be done in the main Lemma 5. For the convenience of the reader we now give a short informal version of the proof of this lemma; later on one then has to check only the formalization step by step. So let E be finite and let a be the smallest number such that $(\forall t)_a\,\neg E(t)$. We want to show: for every number b there exists a number c, b

we have to pay for avoiding the predicates $C_i$ and $D_i$. Let $a_l$ be the predecessor of a (i.e. $a_l$ is the greatest number with $E(a_l)$). We choose a predicate $C_l$ as on p. 245 with minimal first element $b_0$, and show $H(b_0)$. To this end we prove: 1) ultimately all elements of $C_l$ may serve as elements for $D_0$ (p. 249, 2. case, (a)), and 2) H(e) does not hold for any number e smaller than $b_0$ (p. 249, 2. case, (b)). - The induction step is similar: If we have $H(b_i)$, we use $b_i$ and a minimal $D_i$ instead of $a_l$ and $C_l$ to show that $H(b_{i+1})$ holds for the first element $b_{i+1}$ of $D_i$. - It is possible to introduce the predicate H by a recursion which corresponds exactly to these three cases and thus follows the lines of the informal proof more closely. But then both the recursion and the proof are even more complicated.

Now we come to the formal proofs. First of all we have to show that E and H do what we want them to do:

LEMMA 2. Ascending pairs of elements of E satisfy $\mathfrak{F}$:

$(\forall z)\,[E(z) \leftrightarrow (\exists Q)\,\mathfrak{B}(E, Q, z)] \wedge E(a) \wedge E(b) \wedge b < a \rightarrow \mathfrak{F}(b, a)$.

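Proof. From (1) follows by the premises

$(\exists P)^\omega\,\big[P(a) \wedge (\forall x)\,[P(x) \rightarrow b < x \wedge \mathfrak{F}(b, x)]\big]$

and therefore $\mathfrak{F}(b, a)$.

From Lemma 2 follows directly that the existence of an infinite predicate E ensures the existence of an infinite Ramsey set for $\mathfrak{F}$:

LEMMA 3. $(\exists S)^\omega (\forall z)\,[S(z) \leftrightarrow (\exists Q)\,\mathfrak{B}(S, Q, z)] \rightarrow (\exists Q)^\omega (\forall y)(\forall x)_0^y\,[Q(x) \wedge Q(y) \rightarrow \mathfrak{F}(x, y)]$.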

The same remarks hold for H, thus we have

LEMMA 4. $(\exists R)^\omega (\forall z)\,[R(z) \leftrightarrow \mathfrak{C}(E, R, z)] \rightarrow (\exists Q)^\omega (\forall y)(\forall x)_0^y\,[Q(x) \wedge Q(y) \rightarrow \neg\mathfrak{F}(x, y)]$.

Lemmata 1, 3, 4 together with the following main lemma give the assertion of Theorem 2.

LEMMA 5. If E is finite, then H must be infinite:

$(\forall z)\,\big[[E(z) \leftrightarrow (\exists Q)\,\mathfrak{B}(E, Q, z)] \wedge [H(z) \leftrightarrow \mathfrak{C}(E, H, z)]\big] \wedge (\forall^\omega t)\,\neg E(t) \rightarrow (\exists^\omega t)\,H(t)$.


Proof. To make for better reading we will not follow the rules of our formalism as closely as we did in Section 2; but there will be no difficulty in translating the following considerations into a strictly formal proof. So let E and H satisfy the premises of Lemma 5, and let the number a be fixed for the whole proof such that $(\forall t)_a\,\neg E(t) \wedge (\forall t)_0^a (\exists x)_t\,E(x)$ (thus a = 0, if $(\forall t)\,\neg E(t)$, and a = c', if $E(c) \wedge (\forall t)_{c'}\,\neg E(t)$). We assert $(\forall y)(\exists x)_y\,H(x)$, and to prove this we show by induction over b: $(\exists x)_b\,H(x)$.

1) Induction beginning: b = 0.

1. case: a = 0. Thus we have $(\forall t)\,\neg E(t)$, especially $\neg E(0)$, therefore by (1)

$(\forall Q).\ (\forall x)\,\mathfrak{G}^+(Q, 0, x) \rightarrow (\forall^\omega t)\,\neg Q(t)$.

This implies

$(\forall Q).\ (\forall x)\,[Q(x) \leftrightarrow x = 0 \vee \neg\mathfrak{F}(0, x)] \rightarrow (\forall^\omega t)\,Q(t)$,

from which follows

$(\forall Q).\ (\forall x)\,[Q(x) \leftrightarrow 0 < x \wedge \neg\mathfrak{F}(0, x)] \rightarrow (\forall^\omega t)\,Q(t)$.

By (COMP) holds

$(\exists Q).\ (\forall x)\,[Q(x) \leftrightarrow 0 < x \wedge \neg\mathfrak{F}(0, x)]$,

this gives together

$(\exists Q).\ (\forall x)\,[Q(x) \leftrightarrow 0 < x \wedge \neg\mathfrak{F}(0, x)] \wedge (\exists^\omega t)\,Q(t)$

and therefore $(\exists Q)^\omega (\forall x)\,\mathfrak{G}^-(Q, 0, x)$. According to (2), this is equivalent to H(0).

2. case: 0
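E(c) implies $(\exists Q)\,\mathfrak{B}(E, Q, c)$, which implies by (1)

$(\exists x)(\exists Q).\ \mathfrak{B}(E, Q, c) \wedge Q(x)$.

From this follows by the minimum principle

$(\exists x){:}\ (\exists Q)\,[\mathfrak{B}(E, Q, c) \wedge Q(x)] \wedge (\forall y).\ (\exists Q)\,[\mathfrak{B}(E, Q, c) \wedge Q(y)] \rightarrow x \le y$.

Let d be this least element:

$(\exists Q)\,[\mathfrak{B}(E, Q, c) \wedge Q(d)] \wedge (\forall y).\ (\exists P)\,[\mathfrak{B}(E, P, c) \wedge P(y)] \rightarrow d \le y$.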


Further let B be a predicate satisfying $(\exists Q)$:

$\mathfrak{B}(E, B, c) \wedge B(d) \wedge (\forall y).\ (\exists P)\,[\mathfrak{B}(E, P, c) \wedge P(y)] \rightarrow d \le y$.

From (1) and the definition of d and B follows

(3) $c < d \wedge \neg E(d) \wedge (\forall y)_0^d\,\neg B(y)$.

At last we write down the formula $\mathfrak{B}(E, B, c)$ without abbreviation:

(4) $(\exists^\omega t)\,B(t) \wedge (\forall x)\,[B(x) \rightarrow c < x \wedge \mathfrak{F}(c, x)] \wedge (\forall y)_0^c\,\big[E(y) \rightarrow (\exists P)^\omega\{P(c) \wedge (\forall x)\,[[B(x) \rightarrow P(x)] \wedge [P(x) \rightarrow y < x \wedge \mathfrak{F}(y, x)]]\}\big]$.

We assert that H(d) follows from (1)-(4).

Proof. We define a predicate C (its existence follows by (COMP)) as

$(\forall t)\,[C(t) \leftrightarrow d < t \wedge B(t) \wedge \neg\mathfrak{F}(d, t)]$.

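a) Assume $(\exists^\omega t).\ B(t) \wedge \neg C(t)$. Then $(\exists^\omega t).\ B(t) \wedge \mathfrak{F}(d, t)$. Thus, if we define $\bar{C}$ by

$(\forall t)\,[\bar{C}(t) \leftrightarrow d < t \wedge B(t) \wedge \mathfrak{F}(d, t)]$,

we get $(\exists^\omega t)\,\bar{C}(t)$. We want to show

(5) $(\forall y)_0^d\,[E(y) \rightarrow (\exists P)\,\mathfrak{D}^+(\bar{C}, P, y, d)]$.

So let e be a number with $0 \le e < c \wedge E(e)$; then there exists by (4) a predicate D with

(6) $(\exists^\omega t)\,D(t) \wedge D(c) \wedge (\forall x)\,\big[[B(x) \rightarrow D(x)] \wedge [D(x) \rightarrow e < x \wedge \mathfrak{F}(e, x)]\big]$.

By (6) follows D(d) from B(d) and $(\forall x)\,[\bar{C}(x) \rightarrow D(x)]$ from $(\forall x)\,[\bar{C}(x) \rightarrow B(x)]$. Therefore we may replace in (6) B by $\bar{C}$ and c by d and get $(\exists P)\,\mathfrak{D}^+(\bar{C}, P, e, d)$. Thus we have shown

$(\forall y)_0^c\,[E(y) \rightarrow (\exists P)\,\mathfrak{D}^+(\bar{C}, P, y, d)]$.

Since further $\mathfrak{D}^+(\bar{C}, B, c, d)$ and $(\forall t)_{c'}\,\neg E(t)$, we have derived (5). By definition of $\bar{C}$ follows $\mathfrak{B}(E, \bar{C}, d)$ and thus E(d), in contradiction to (3).

b) Therefore we have $(\forall^\omega t)\,[B(t) \rightarrow C(t)]$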


and thus by (4) and by definition of C

(7) $(\exists^\omega t)\,C(t) \wedge (\forall x)\,\mathfrak{G}^-(C, d, x)$.

As in a) one shows

(8) $(\forall y)_0^d\,[E(y) \rightarrow (\exists P)\,\mathfrak{D}^+(C, P, y, d)]$.

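Assume now that there is a number e with $e < d \wedge H(e)$. Then c < e by (2), and from this and E(c) and from the defining formula for H(e) follows

(9) $(\exists Q)^\omega\,[Q(e) \wedge (\forall x)\,\mathfrak{G}^+(Q, c, x)] \ \wedge$

$\quad (\forall y)_0^c\,\big[E(y) \rightarrow (\exists R)^\omega\{R(e) \wedge (\forall x)\,[R(x) \rightarrow y < x \wedge \mathfrak{F}(y, x)]\}\big]$.

We define $\tilde{B}$ by

$(\forall t)\,[\tilde{B}(t) \leftrightarrow B(t) \vee t = e]$

and get from (4) and the first line of (9)

$(\exists^\omega t)\,\tilde{B}(t) \wedge (\forall x)\,\mathfrak{G}^+(\tilde{B}, c, x)$

and from (4) and the second line of (9)

$(\forall y)_0^c\,\big[E(y) \rightarrow (\exists P, R)^\omega\{P(c) \wedge R(e) \wedge (\forall x)\,[[B(x) \rightarrow P(x)] \wedge \mathfrak{G}^+(P, y, x) \wedge \mathfrak{G}^+(R, y, x)]\}\big]$.

If we put this together, we get with the help of (COMP)

$(\exists^\omega t)\,\tilde{B}(t) \wedge (\forall x)\,\mathfrak{G}^+(\tilde{B}, c, x) \wedge (\forall y)_0^c\,[E(y) \rightarrow (\exists P)\,\mathfrak{D}^+(\tilde{B}, P, y, c)]$.

But this yields

$\mathfrak{B}(E, \tilde{B}, c) \wedge \tilde{B}(e) \wedge e < d$,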
which is a contradiction to the choice of d. We conclude $(\forall y)_0^d\,\neg H(y)$, which gives together with (7) and (8) $\mathfrak{C}(E, H, C, d)$ and thus H(d).

To indicate the formalization of this step consider the formula

$(\forall z)\,\big[[E(z) \leftrightarrow (\exists Q)\,\mathfrak{B}(E, Q, z)] \wedge [H(z) \leftrightarrow \mathfrak{C}(E, H, z)] \wedge [C(z) \leftrightarrow d < z \wedge B(z) \wedge \neg\mathfrak{F}(d, z)]\big] \wedge E(c) \wedge (\forall t)_{c'}\,\neg E(t) \wedge \mathfrak{B}(E, B, c) \wedge B(d) \wedge (\forall y)\,\big[(\exists P)\,[\mathfrak{B}(E, P, c) \wedge P(y)] \rightarrow d \le y\big] \rightarrow \mathfrak{C}(E, H, C, d)$.

This is in essence what we have proved in this paragraph. By existential quantification over C, B and d we get proved premises and the wanted conclusion $(\exists x)\,H(x)$.


2) Induction step: Let $(\exists x)_b\,H(x)$ be proved, show $(\exists x)_{b'}\,H(x)$. If from the induction hypothesis follows H(c) for c > b, we are ready. So we may assume H(b) and therefore $(\exists Q)\,\mathfrak{C}(E, H, Q, b)$. Choose as on p. 249 B as a predicate satisfying $(\exists Q)$ and d as an element of B in an optimal way, i.e.

Especially we have $(\forall y)_0^d\,\neg B(y) \wedge b < d$.

As in 1) we want to show H(d). Essentially the proof is the same: Let C be defined as there. The assumption $(\exists^\omega t).\ B(t) \wedge \neg C(t)$ yields E(d) as there, and is therefore contradictory. Thus one gets

$(\exists^\omega t)\,C(t) \wedge (\forall x)\,\mathfrak{G}^-(C, d, x) \wedge (\forall y)_0^d\,[E(y) \rightarrow (\exists P)\,\mathfrak{D}^+(C, P, y, d)]$.

we may change this into

Assume that there is a number e with b<e

Define $\tilde{B}$ by $(\forall t)\,[\tilde{B}(t) \leftrightarrow B(t) \vee t = e]$. Then we have by definition of $\tilde{B}$ and by the first line of (10)

$(\exists^\omega t)\,\tilde{B}(t) \wedge (\forall x)\,\mathfrak{G}^-(\tilde{B}, b, x)$,


and therefore the contradiction

$\mathfrak{C}(E, H, \tilde{B}, b) \wedge \tilde{B}(e) \wedge e < d$.

Thus we have shown $(\forall y)_{b'}^{d}\,\neg H(y)$, from which follows $(\forall y)_0^d\,[H(y) \rightarrow (\exists P)\,\mathfrak{D}^-(C, P, y, d)]$ and thus $\mathfrak{C}(E, H, C, d)$, which gives H(d). This completes the proof of Lemma 5 and therefore of Theorem 2.

Now we extend Theorem 2 to the case of arbitrarily many partition sets. At the same time we drop the condition of Theorem 1 that the sets of the covering are disjoint. It is easily seen that both versions of Theorem 1 are of equal strength, and the new one is much easier to write down formally.

Proof. By induction over m we get easily the proof of Theorem 3 from Theorem 2: The assertion is trivial for m = 1; for m = 2 it follows from Theorem 2, applied to $\mathfrak{F}_1$, since the premises of Theorem 3 give $\neg\mathfrak{F}_1(a, b) \rightarrow \mathfrak{F}_2(a, b)$. So let $m \ge 3$: Define formulae $\mathfrak{K}_j$ as $\mathfrak{F}_j$ for $j = 1, \dots, m-2$ and $\mathfrak{K}_{m-1}$ as $\mathfrak{F}_{m-1} \vee \mathfrak{F}_m$. By induction hypothesis there is an index j, $1 \le j \le m-1$, with

$(\exists Q)^\omega (\forall y)(\forall x)_0^y\,[Q(x) \wedge Q(y) \rightarrow \mathfrak{K}_j(x, y)]$.

If $j < m-1$, we are ready. So let $j = m-1$, let D be a predicate with

$(\exists^\omega t)\,D(t) \wedge (\forall y)(\forall x)_0^y\,[D(x) \wedge D(y) \rightarrow \mathfrak{F}_{m-1}(x, y) \vee \mathfrak{F}_m(x, y)]$.


Similarly restrict the recursion for H and the proofs to the predicate D. Then Lemmata 1-5 hold as before and one gets as Theorem 2

$(\exists Q)^\omega.\ (\forall y)(\forall x)_0^y\,[Q(x) \wedge Q(y) \rightarrow D(x) \wedge D(y) \wedge \mathfrak{F}_{m-1}(x, y)] \ \vee\ (\forall y)(\forall x)_0^y\,[Q(x) \wedge Q(y) \rightarrow D(x) \wedge D(y) \wedge \neg\mathfrak{F}_{m-1}(x, y)]$.

we get the desired formula

At last we extend Theorem 2 to the case of arbitrary n-tuples.

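THEOREM 4.

$(\forall x_1, \dots, x_n)\,\Big[\bigwedge_{j=1}^{n-1} x_j < x_{j+1} \rightarrow \bigvee_{i=1}^{m} \mathfrak{F}_i(x_1, \dots, x_n)\Big] \rightarrow (\exists Q)^\omega.\ \bigvee_{i=1}^{m} (\forall x_1, \dots, x_n)\,\Big[\bigwedge_{j=1}^{n-1} x_j < x_{j+1} \wedge \bigwedge_{k=1}^{n} Q(x_k) \rightarrow \mathfrak{F}_i(x_1, \dots, x_n)\Big]$.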

We may get a proof of this theorem by trivial changes at some points in the proof of Theorem 3. Or we use the completeness of SC announced in the introduction (shown with the help of Theorem 3 only) and get Theorem 4 by Theorem 1. We conclude with a consequence of the decidability of SC: If we drop the quantifier $(\exists Q)$ in Theorem 3 and replace the variable Q by A, we get a formula $\mathfrak{A}(A)$ which is satisfiable if and only if Theorem 3 is true. Now it follows very easily from a result of Büchi [1] that, if a formula $\mathfrak{A}(A)$ is satisfiable at all, then it is satisfiable by an ultimately periodic predicate. This shows that the Ramsey set M in Theorem 1 can always be chosen as ultimately periodic if we start with sets $L_i$ definable in SC.
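To make the notion of an ultimately periodic predicate concrete, here is a small sketch of one possible finite representation. The class name `UltimatelyPeriodic` and its fields are hypothetical and do not come from the paper: the predicate is given by its values on an initial segment (the phase) followed by a non-empty block that repeats forever (the period). The readings "there are infinitely many t" and "for ultimately all t" then reduce to inspecting the repeating block.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class UltimatelyPeriodic:
    """A one-place predicate on the natural numbers (illustrative representation)."""

    prefix: Tuple[bool, ...]  # values on 0 .. len(prefix)-1 (the phase)
    cycle: Tuple[bool, ...]   # repeated forever afterwards (the period)

    def __post_init__(self) -> None:
        if not self.cycle:
            raise ValueError("the repeating block must be non-empty")

    def __call__(self, t: int) -> bool:
        if t < len(self.prefix):
            return self.prefix[t]
        return self.cycle[(t - len(self.prefix)) % len(self.cycle)]

    def is_infinite(self) -> bool:
        # "there are infinitely many t with P(t)": some entry of the block is true
        return any(self.cycle)

    def holds_ultimately(self) -> bool:
        # "for ultimately all t, P(t)": every entry of the block is true
        return all(self.cycle)


if __name__ == "__main__":
    p = UltimatelyPeriodic(prefix=(True, False, False), cycle=(False, True))
    print([p(t) for t in range(7)], p.is_infinite(), p.holds_ultimately())
```

With such a representation, an infinite Ramsey set as in the last sentence above would be described by finitely many bits, although computing those bits from a defining formula is the task of the decision procedure and is not attempted here.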


References

1. J. R. BÜCHI, On a decision method in restricted second order arithmetic, in: Logic, Methodology and Philosophy of Science, Proc. of the 1960 Intern. Congr. (Stanford, 1962) 1-11.
2. A. CHURCH, Logic, arithmetic and automata, Proc. Intern. Congr. Math. 1962, 23-35.
3. G. HASENJAEGER, Über ω-Unvollständigkeit in der Peano-Arithmetik, J. Symb. Logic 17 (1952) 81-97.
4. D. HILBERT and P. BERNAYS, Grundlagen der Mathematik II (Berlin, 1939).
5. R. MCNAUGHTON, Some formal relative consistency proofs, J. Symb. Logic 18 (1953) 136-144.
6. G. H. MÜLLER and D. SIEFKES, Decidability and completeness in restricted second order arithmetic, to appear.
7. M. O. RABIN and D. SCOTT, Finite automata and their decision problems, IBM J. Res. Develop. 3 (1959) 114-125.
8. F. P. RAMSEY, On a problem of formal logic, Proc. London Math. Soc. 30 (1929-30) 264-286.
9. R. M. ROBINSON, Restricted set-theoretical definitions in arithmetic, Proc. Am. Math. Soc. 9 (1958) 238-242.

REFLECTION PRINCIPLES OF SUBSYSTEMS OF ANALYSIS *

Dedicated to Professor S. Iyanaga for his 60th birthday

G. TAKEUTI
Institute for Advanced Study, Princeton, New Jersey

and

M. YASUGI
University of Bristol

Let $\mathfrak{S}$ be one of the subsystems of analysis, SINN and the system with extended inductive definition (called SEID in this article) (cf. [7]). SINN is second order Peano arithmetic with full induction and the $\Pi^1_1$ comprehension axioms. SEID is an extension of SINN, which is obtained by adding to SINN some inductive definitions. Let $\mathfrak{D}$ be the system of ordinal diagrams which is used in proving the consistency of $\mathfrak{S}$ and let $\mathfrak{P}$ be first order Peano arithmetic with second order parameters. Then we can prove the following reflection principles.**

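THEOREM 1.

$\mathrm{Ind}_1(\mathfrak{D}),\ \mathrm{Prov}_{\mathfrak{S}}(\ulcorner \forall x \exists y R(\alpha, x, y) \urcorner) \rightarrow \forall x \exists y R(\alpha, x, y)$

is $\mathfrak{P}$-provable, where $R(\alpha, x, y)$ is elementary in $\alpha$, i.e., all quantifiers in $R(\alpha, x, y)$ are numerical and bounded, and $\mathrm{Ind}_1(\mathfrak{D})$ is the schema which allows transfinite induction along $\mathfrak{D}$ with respect to $\Sigma^0_1$ formulas (without second order parameters).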

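THEOREM 2.

$\mathrm{Ind}_2(\mathfrak{D}),\ \mathrm{Prov}_{\mathfrak{S}}(\ulcorner A(\alpha) \urcorner) \rightarrow A(\alpha)$

is $\mathfrak{P}$-provable, where $A(\alpha)$ is an arithmetical formula with a parameter $\alpha$ and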

* Part of this work was supported by NSF GP-4616.

** The results here also apply to the extended system SINN′ of SINN defined in footnote 2 of [7].


$\mathrm{Ind}_2(\mathfrak{D})$ is the schema which allows transfinite induction along $\mathfrak{D}$ with respect to the formulas of $\mathfrak{P}$. By modifying the proofs of Theorems 1 and 2 we can prove the uniform reflection principles (cf. Introduction of [3]), that is, the following two theorems are $\mathfrak{P}$-provable.

THEOREM 1'.

$\mathrm{Ind}_1(\mathfrak{D}) \rightarrow \forall m\,\big(\mathrm{Prov}_{\mathfrak{S}}(\ulcorner \forall x \exists y R(x, y, \alpha, n(m)) \urcorner) \vdash \forall x \exists y R(x, y, \alpha, m)\big)$,

where n(m) denotes the "m-th numeral" and $R(a, b, \alpha, c)$ is elementary in $\alpha$.

THEOREM 2'.

$\mathrm{Ind}_2(\mathfrak{D}) \rightarrow \forall m\,\big(\mathrm{Prov}_{\mathfrak{S}}(\ulcorner A(\alpha, n(m)) \urcorner) \vdash A(\alpha, m)\big)$,

where $A(\alpha, m)$ is arithmetical in $\alpha$. We can also prove another form of the uniform reflection principle.

THEOREM 3.

$\mathrm{Ind}'(\mathfrak{D}) \rightarrow \forall m\,\big(\mathrm{Prov}_{\mathfrak{S}}(\ulcorner \forall \phi A(\phi, n(m)) \urcorner) \vdash \forall \phi A(\phi, m)\big)$

is $\mathfrak{S}$-provable, where $A(\alpha, a)$ is arithmetical in $\alpha$ and Ind' is applied to $\Sigma^0_1$ formulas with a second order parameter.

For the meanings and consequences of the reflection principles, the reader should refer to [3], in which a list of references concerning those problems is also found. We are concerned with special cases only. Throughout this article, acquaintance with [7] is presupposed.

Both authors started their study of logic in Professor S. Iyanaga's seminar. We should like to take this opportunity to express our thanks to him.

Chapter I

In this chapter we shall prove Theorem 1 (which has been stated in the introduction) and its corollary.

1. Definition of the systems and elementary predicates. Let $\mathfrak{S}$ be one of the systems SINN, $\mathfrak{S}_2$, $\mathfrak{S}_1$, SJNN and the system with extended inductive definition (denoted SEID) and let $\mathfrak{D}$ be the system of ordinal diagrams that is used in proving the consistency of $\mathfrak{S}$ ($\mathfrak{D}$ is denoted by S in [7]). Those systems are defined in [1] and are also to be found in 1 of Chapter 2, 7 of Chapter 3, at the beginning of Section 2 in Chapter 3, at the beginning of Chapter 3, and in 1 of Chapter 4, respectively. Although $\mathfrak{S}_1$ and $\mathfrak{S}_2$ are not to be considered after 2 of Chapter I in this article, we have introduced


them in order to prove Proposition 1 for SJNN. For the elementary notions and the notations, refer to Chapter 1 of [7].

1.1. We shall restrict the non-logical constants of $\mathfrak{S}$ to the following. Individual constants: 0, 1. Function constants: +, ·. Predicate constants: =, <.

1.2. A formula of $\mathfrak{S}$, $R(b_1, \dots, b_m, \beta_1, \dots, \beta_l)$, whose only free variables are $b_1, \dots, b_m, \beta_1, \dots, \beta_l$ (including the cases m = 0 and/or l = 0) and which has no quantifiers on f-variables is called elementary in $\beta_1, \dots, \beta_l$ if all quantifiers appearing in R are bounded.

1.3. The beginning sequences of the system $\mathfrak{S}$ are all those of the forms $D \rightarrow D$ and $s = t, A(s) \rightarrow A(t)$, where D and A(a) are arbitrary formulas, and mathematical beginning sequences. We may restrict the mathematical beginning sequences to the well known quantifier free axioms concerning the constants given in 1.1.

1.4. All other definitions of $\mathfrak{S}$ in [7] are effective here. We shall use the logical symbols $\vee$, $\vdash$ and $\exists$, as well as $\neg$, $\wedge$ and $\forall$, although they are not formally defined in $\mathfrak{S}$.

Remark. The class of the predicates which are elementary in some free f-variables (cf. 1.2) is smaller than the class of the predicates which are primitive recursive in some free f-variables. This does not weaken our result, however, since the classes of the predicates of the form $\forall x R(\alpha, x, a)$ and $\exists x R(\alpha, x, a)$ with R elementary in $\alpha$ respectively cover the predicates $\Pi^0_1$ in $\alpha$ and those $\Sigma^0_1$ in $\alpha$ (cf. Theorem 1).

1.5. A cut is called essential if its cut formula is not of the form s = t or s < t.

2. PROPOSITION 1. Let $R(a, \beta_1, \dots, \beta_m)$ be a formula of $\mathfrak{S}$ which is elementary in $\beta_1, \dots, \beta_m$ and assume that $\rightarrow \exists x R(x, \beta_1, \dots, \beta_m)$ is $\mathfrak{S}$-provable. Then there exists a proof-figure of $\mathfrak{S}$ to the above sequence which does not contain any essential cut or any induction. Moreover, this can be proved with the system of o.d.'s $\mathfrak{D}$, i.e., we can prove the above statement by transfinite induction on the o.d.'s of $\mathfrak{D}$ which are assigned to the proof-figures.

The treatment of SJNN is slightly different from the other cases.

Proof. For simplicity, we shall prove the proposition only for the case m = 1 and denote the formula $R(a, \alpha)$. Let us first consider certain conditions on the sequences of $\mathfrak{S}$. Let S be a sequence $A_1, \dots, A_j \rightarrow A_{j+1}, \dots, A_n$ of $\mathfrak{S}$. S is said to have the property P if it satisfies the following.

P.1. S has no free t-variable.


P.2. Each formula which is in the left side of S, i.e. one of $A_1, \dots, A_j$, is elementary in $\alpha$. (This implies that none of $A_1, \dots, A_j$ contains unbounded quantifiers.)

P.3. Each formula which is in the right side of S, i.e., one of $A_{j+1}, \dots, A_n$, is either elementary in $\alpha$ or $\exists x R'(x, \alpha)$, $R'(0, \alpha)$ being elementary in $\alpha$.

LEMMA. If a sequence S of $\mathfrak{S}$ which has the property P is $\mathfrak{S}$-provable, then it is provable without essential cut or induction, except for the case that $\mathfrak{S}$ is SJNN.

Obviously the proposition is a trivial corollary of this lemma, except when $\mathfrak{S}$ is SJNN. The case where $\mathfrak{S}$ is SJNN shall be treated separately.

Proof of Lemma. Suppose a proof-figure P to S is given. The proof is carried out in several steps following the consistency proofs of $\mathfrak{S}$ (cf. [7]). We shall see that, at each reduction step, the end sequence of the resulting proof-figure still satisfies the property P.

2.1. $\mathfrak{S}$ is SINN.

1) 2 through 8.2 in Chapter 2 of [7] are effective here.

2) We add the following inference schema "bounded-quantification" (abbreviated to bq) to our system.

where Vy‘y

3) Reduction in case P contains an explicit logical inference or an induction in the end piece. We shall treat several cases according to the bottommost such inference. If the last such inference is a logical inference, then the reductions are done as those in 6.1 of Chapter 3, [7], except in the case of a $\forall$ right on a t-variable. If it is an induction, then the reduction is done as in 8.3 of Chapter 2, [7]. For the case where the last inference which satisfies the above condition is a $\forall$ right on a t-variable, let P have the following form.

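$$\begin{array}{c} \vdots \\ \dfrac{\Gamma \rightarrow \Delta,\ b < t \vdash R_0(b, \alpha)}{\Gamma \rightarrow \Delta,\ \forall y < t\, R_0(y, \alpha)} \\ \vdots \\ \Gamma' \rightarrow \Delta_1,\ \forall y < s\, R_0'(y, \alpha),\ \Delta_2 \end{array}$$

where $b < t \vdash R_0(b, \alpha)$ is elementary in $\alpha$, t is closed and $R_0'(x, \alpha)$ and s are obtained from $R_0(x, \alpha)$ and t respectively by term replacements. Assume t = n for some numeral n. If n = 0, then P is reduced to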

..

r'4 k < n k R b ( k , a), A,,

Vy

< s R ' ( y , a), A , ,

where ( k )means the substitution of k for b in the indicated part of the prooffigure, kPn-1

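$$\begin{array}{c} \dfrac{\Gamma' \rightarrow \Delta',\ (0 < n \vdash R_0'(0, \alpha)) \wedge \dots \wedge ((n-1) < n \vdash R_0'(n-1, \alpha))}{\Gamma' \rightarrow \Delta',\ \forall y < n\, R_0'(y, \alpha)}\ \mathrm{bq} \\ \vdots \\ \Gamma' \rightarrow \Delta_1,\ \forall y < s\, R_0'(y, \alpha),\ \Delta_2 \end{array}$$

where $\Delta'$ is $\Delta_1,\ \forall y < s\, R_0'(y, \alpha),\ \Delta_2$.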


4) Reduction in case there is no explicit logical inference or induction in the end piece. We can follow the proofs in 8.4 through 10 of Chapter 2, [7]. We shall only mention the cases where some extra considerations are needed.

4.1) The end piece of P contains a beginning sequence of the form $D \rightarrow D$. Let P be of the form

D-D

..

r-; A , B r, n

.. B, n-;A ~ b,, A , A, A,,

.. 3+@.

b, A ,

up to term-replacements, the reduction in 8.5.1 in If 6 is identical with Chapter 2, [7], works. If not, then b is of the form

and fi is of the form Vy < tS‘(y), where s= n and t = n for some numeral n, so=O, ..., ~ , , - ~ = =1n, -and S’(0) is either S(0) itself or obtained from S(0) by some term replacements, since there is no explicit logical inference or induction in the end piece. Hence P is reduced to the following proof-figure. I

4.2) Elimination of weakenings in the end piece of P (cf. 8.6 of Chapter 2, [7]). If the last inference of the proof-figure Q is a bq, say

$$\dfrac{\Gamma \rightarrow \Delta,\ (0 < k \vdash S(0)) \wedge \dots \wedge (k-1 < k \vdash S(k-1))}{\Gamma \rightarrow \Delta,\ \forall y < k\, S(y)}$$

then the proof goes as follows.


..

If Q: is T * A A * , then Q* is Q:. If Q: is

..

then

r*
is

Q:

r*+ A * , V y < k S ( y )

5) In the following we assume that the end piece does not contain any logical inference, induction, beginning sequence other than mathematical beginning sequences or weakening, while it may contain some bq inferences. We may also assume that the proof-figure is different from its end piece, for if the entire proof-figure is the end piece, then the end sequence is provable from the mathematical beginning sequences by bq, exchanges, contractions and non-essential cuts, and bq can be eliminated without the use of an essential cut or induction (cf. the remark in 2).

5.1) The existence of an essential cut. Since the principal formula of a bq is always explicit, the proof in 9 in Chapter 9, [7], goes through.

5.2) Essential reduction. Note that no principal formula of a bq is implicit. The rest of the proof goes like 10 in Chapter 2, [9]. Thus we have completed the proof in case $\mathfrak{S}$ is SINN.

2.2. $\mathfrak{S}$ is $\mathfrak{S}_1$. We can prove this case by following the proof of Theorem 2 in Chapter 3 of [7].

2.3. $\mathfrak{S}$ is $\mathfrak{S}_2$. In this case the proposition is proved along the lines of the proof of Theorem 3 in Chapter 3 of [7].

2.4. $\mathfrak{S}$ is SEID. In this case the lemma is proved following the proof of the case where $\mathfrak{S}$ is SINN (cf. 2.1) and the consistency proof of SEID (cf. Chapter 4 of [7]). (We should remark that in this case we might need a finite number of primitive recursive functions (or predicates) in addition to the constants of the system and also a finite number of mathematical beginning sequences besides those in the system, so that $I(a)$ and $a <^* b$ can be defined and exactly one from each pair: $\{I(n) \rightarrow\ ;\ \rightarrow I(n)\}$ for any numeral n and $\{n <^* m \rightarrow\ ;\ \rightarrow n <^* m\}$ for any pair n and m of numerals, is provable in $\mathfrak{S}$ without using any essential cut or induction. Indeed these modifications of the system do not cause any problem later in the proof of our theorem.) Inductive definitions can be eliminated as in the consistency proof of SEID, since all $A_i$ are implicit (cf. 9 in Chapter 4 of [7]).

2.5. $\mathfrak{S}$ is SJNN. Here we prove the theorem in the original form (i.e. not in the extended form).


1) Set $e(a)$: $\forall \phi\,(\phi[0] \wedge \forall y\,(\phi[y] \vdash \phi[y+1]) \vdash \phi[a])$ and $A$: $\forall x\, e(x)$. $\rightarrow \exists x R(x, \alpha)$ is $\mathfrak{S}$-provable if and only if $A \rightarrow \exists x R(x, \alpha)$ is $\mathfrak{S}_1$-provable (cf. 10.1 of Section 3 in Chapter 3, [7]). Hence $A \rightarrow \exists x R(x, \alpha)$ is $\mathfrak{S}_1$-provable from the assumption.

2) From 1), $\Gamma_0, A^e \rightarrow (\exists x R(x, \alpha))^e$ is $\mathfrak{S}_1$-provable, where $\Gamma_0$ is the set of formulas $e(0)$, $\forall x\,(e(x) \vdash e(x'))$ and $\exists x\, e(x)$, and $A^e$ is the restriction of A by e (cf. 10.1 of Section 3 in Chapter 3, [7]).

3) All formulas of $\Gamma_0$ and $A^e$ are $\mathfrak{S}_1$-provable (cf. 10.1 of Section 3 in Chapter 3, [7]).

Remark. It is easy to show that Proposition 1 can be extended to the following. Let $R(a, \beta_1, \dots, \beta_m, \gamma_1, \dots, \gamma_l)$ be a formula of $\mathfrak{S}$ which is elementary in $\beta_1, \dots, \beta_m, \gamma_1, \dots, \gamma_l$ and assume that $\rightarrow \forall \phi_1 \dots \forall \phi_m \exists x R(x, \phi_1, \dots, \phi_m, \gamma_1, \dots, \gamma_l)$ is $\mathfrak{S}$-provable. Then there exists a proof-figure of $\mathfrak{S}$ to the above sequence which does not contain any essential cut or induction. For our purpose, however, the present form of the proposition is sufficient.

In the rest of this chapter we shall consider only SJNN, SINN and SEID as $\mathfrak{S}$. The rest of Chapter I is devoted to proving the reflection principles of those systems.

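3. Let $\mathfrak{P}$ be the system of Peano arithmetic with free f-variables as parameters. It is well known that all notions in $\mathfrak{S}$ (cf. [7], particularly Chapter 1) can be arithmetized in $\mathfrak{P}$ using Gödel numbering. We shall denote the Gödel number of an object A of $\mathfrak{S}$ by $\ulcorner A \urcorner$ and express the notions "P is a proof-figure of $\mathfrak{S}$", "P is a proof-figure of $\mathfrak{S}$ to a sequence S" and "a sequence S is $\mathfrak{S}$-provable" by $\mathrm{Pf}_{\mathfrak{S}}(\ulcorner P \urcorner)$, $\mathrm{Prov}_{\mathfrak{S}}(\ulcorner P \urcorner, \ulcorner S \urcorner)$ and $\mathrm{Prov}_{\mathfrak{S}}(\ulcorner S \urcorner)$ respectively. $\mathrm{Pf}^*_{\mathfrak{S}}(\ulcorner P \urcorner)$, $\mathrm{Prov}^*_{\mathfrak{S}}(\ulcorner P \urcorner, \ulcorner S \urcorner)$ and $\mathrm{Prov}^*_{\mathfrak{S}}(\ulcorner S \urcorner)$ denote the notions corresponding to $\mathrm{Pf}_{\mathfrak{S}}(\ulcorner P \urcorner)$, $\mathrm{Prov}_{\mathfrak{S}}(\ulcorner P \urcorner, \ulcorner S \urcorner)$ and $\mathrm{Prov}_{\mathfrak{S}}(\ulcorner S \urcorner)$ respectively with the additional condition that "P has no essential cut or induction". If S is of the form $\rightarrow A$, then we may use the simplified notations $\mathrm{Prov}_{\mathfrak{S}}(\ulcorner A \urcorner)$ and $\mathrm{Prov}^*_{\mathfrak{S}}(\ulcorner A \urcorner)$ etc. $\mathrm{Ind}_1(\mathfrak{D})$ is the schema which allows transfinite induction along the ordinals less than $|\mathfrak{D}|$ with respect to the $\Sigma^0_1$ formulas (without second order parameters), where $|\mathfrak{D}|$ denotes the order type of $\mathfrak{D}$. We should remark that the system of mathematical beginning sequences is finitely axiomatizable.

LEMMA 1.

$\mathrm{Ind}_1(\mathfrak{D}),\ \mathrm{Prov}_{\mathfrak{S}}(\ulcorner \exists x R(x, \alpha) \urcorner) \rightarrow \mathrm{Prov}^*_{\mathfrak{S}}(\ulcorner \exists x R(x, \alpha) \urcorner)$

is $\mathfrak{P}$-provable, where R is elementary in $\alpha$.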

263

SUBSYSTEMS OF ANALYSIS

Proof. This is proved by formalizing the proof of Proposition 1. We shall give only the outline of the proof that $\mathrm{Ind}_1(\mathfrak{D})$ is adequate. First let us introduce several notations. Assume that $p$ denotes the Gödel number of a proof-figure $P$ of $\mathfrak{S}$. $\mathrm{ends}(p)$ is the Gödel number of the end sequence of $P$. $Q(p)$ is true if and only if the end sequence of $P$ has the property P (cf. the proof of Theorem 1). $C(p)$ is true if and only if $P$ is a proof-figure which has no essential cut or induction. $\tilde{o}(p)$ is defined by $\tilde{o}(p) = o(p) \# p$, where $o(p)$ is the o.d. of $P$. Note that $\tilde{o}(p)$ is an o.d. of $\mathfrak{D}$ and all those predicates and functions are primitive recursive.

Now from the proof of Proposition 1, we can define a primitive recursive function $r$ as follows. Let $p$ be the Gödel number of a proof-figure $P$. If $C(p) \vee \neg Q(p)$, then define $r(p) = p$. If $\neg C(p) \wedge Q(p)$, then $r(p)$ is defined to be the Gödel number of the resulting proof-figure of the reduction of $P$. $r$ is primitive recursive and satisfies the following.

1) $\tilde{o}(r(p)) < \tilde{o}(p)$ if $\neg C(p) \wedge Q(p)$.

2) $\tilde{o}(r(p)) = \tilde{o}(p)$ if $C(p)$.

Define $r(a, b)$ by $r(0, p) = p$; $r(n', p) = r(r(n, p))$. $r(a, b)$ is primitive recursive. Finally define $p <\!\cdot\, q \leftrightarrow \tilde{o}(p)$ [...] primitive recursive and the order type of $<\!\cdot$ is that of $\mathfrak{D}$. Then the induction applies on the ordering $<\!\cdot$ and the induction formula is $Q(p) \supset \exists n\, C(r(n, p))$, or equivalently, $\exists n (Q(p) \supset C(r(n, p)))$.

Remark. In fact, we can prove Lemma 1 in a generalized form:

$$\mathrm{Ind}_1(\mathfrak{D}),\ \mathrm{Prov}_{\mathfrak{S}}(\ulcorner \forall\varphi \exists x R(\varphi, x, \alpha) \urcorner) \to \mathrm{Prov}^*_{\mathfrak{S}}(\ulcorner \forall\varphi \exists x R(\varphi, x, \alpha) \urcorner).$$

This is proved from the present form of Lemma 1 and the fact that

$$\mathrm{Prov}_{\mathfrak{S}}(\ulcorner \forall\varphi \exists x R(\varphi, x, \alpha) \urcorner) \leftrightarrow \mathrm{Prov}_{\mathfrak{S}}(\ulcorner \exists x R(\beta, x, \alpha) \urcorner)$$

and

$$\mathrm{Prov}^*_{\mathfrak{S}}(\ulcorner \forall\varphi \exists x R(\varphi, x, \alpha) \urcorner) \leftrightarrow \mathrm{Prov}^*_{\mathfrak{S}}(\ulcorner \exists x R(\beta, x, \alpha) \urcorner)$$

are $\mathfrak{P}$-provable.

4. A formula of $\mathfrak{S}$ is said to have the property Q, if it has no quantifiers on f-variables (and no $A_j$'s in case $\mathfrak{S}$ is SEID) and no free t-variables, but possibly has some free f-variables.

G. TAKEUTI and M. YASUGI

DEFINITION. Let $A$ be a formula of $\mathfrak{S}$ having the property Q. We define A-subformulas as follows: $A$ is an A-subformula; if $B \wedge C$ is an A-subformula, so are $B$ and $C$; if $\neg C$ is an A-subformula, then so is $C$; if $\forall x C(x)$ is an A-subformula, then $C(n)$ is an A-subformula for every numeral $n$.

Notice that every A-subformula has the same property Q and contains only free f-variables in $A$. Let $P_A(a)$ denote the following statement: $a$ is (the Gödel number of) a sequence whose formulas are only A-subformulas. We can give the truth definition $T_A$ for A-subformulas, and consequently for the sequences which satisfy $P_A$. It is well known that $T_A$ is defined by an arithmetical formula with second order parameters (cf. e.g. [3]).

LEMMA 2. Let $A$ be a formula having the property Q. The following are $\mathfrak{P}$-provable:

1.0. $T_A(\ulcorner \forall x B(x) \urcorner) \leftrightarrow \forall b\, T_A(\ulcorner B(n(b)) \urcorner)$, where $\forall x B(x)$ is an arbitrary A-subformula, and $n(b)$ denotes the $b$-th numeral.

1.1. $T_A(\ulcorner B \vee C \urcorner) \leftrightarrow T_A(\ulcorner B \urcorner) \vee T_A(\ulcorner C \urcorner)$, where $B$ and $C$ are any A-subformulas.

1.2. $T_A(\ulcorner \neg B \urcorner) \leftrightarrow \neg T_A(\ulcorner B \urcorner)$, where $B$ is an arbitrary A-subformula.

2. $T_A(\ulcorner B(n(b_1), \ldots, n(b_k)) \urcorner) \leftrightarrow B(b_1, \ldots, b_k)$, where $B(0, \ldots, 0)$ is an arbitrary A-subformula.

3. $P_A(a),\ \mathrm{Prov}^*_{\mathfrak{S}}(a) \to T_A(a)$.

Proof of 3. For simplicity, we shall denote $T_A$ by $T$. Assume $P_A(a)$ and $\mathrm{Prov}^*_{\mathfrak{S}}(a)$. Let $P$ be a proof-figure such that $\mathrm{Prov}^*_{\mathfrak{S}}(\ulcorner P \urcorner, a)$. Let $\Gamma(b_1, \ldots, b_k) \to \Delta(b_1, \ldots, b_k)$, where all the free t-variables contained in it are fully indicated, be an arbitrary sequence contained in $P$. We can prove by induction on the number of inferences in the proof-figure to $\Gamma(b_1, \ldots, b_k) \to \Delta(b_1, \ldots, b_k)$ that $\Gamma(n(c_1), \ldots, n(c_k)) \to \Delta(n(c_1), \ldots, n(c_k))$ is provable in $\mathfrak{S}$, where $c_1, \ldots, c_k$ range over all the $k$-tuples of natural numbers; and moreover, by arithmetizing the above proof and using the properties 1 and 2, we can prove

$$T(\ulcorner \Gamma(n(c_1), \ldots, n(c_k)) \to \Delta(n(c_1), \ldots, n(c_k)) \urcorner)$$

in $\mathfrak{P}$. This proves 3.

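5. THEOREM 1 (REFLECTION PRINCIPLE). Let $R(a, b, \alpha)$ be a formula of $\mathfrak{S}$ which is elementary in $\alpha$ and contains no free t-variable other than $a$ and $b$. Then

$$\mathrm{Ind}_1(\mathfrak{D}),\ \mathrm{Prov}_{\mathfrak{S}}(\ulcorner \forall x \exists y R(x, y, \alpha) \urcorner) \to \forall x \exists y R(x, y, \alpha)$$

is $\mathfrak{P}$-provable.

Proof. Let $\forall x \exists y R(x, y, \alpha)$ be the $A$ in 4 and define $T_A$ as in 4.

(1) $\mathrm{Prov}_{\mathfrak{S}}(\ulcorner \forall x \exists y R(x, y, \alpha) \urcorner) \to \forall a\, \mathrm{Prov}_{\mathfrak{S}}(\ulcorner \exists y R(n(a), y, \alpha) \urcorner)$ is provable in $\mathfrak{P}$.

(2) $\mathrm{Ind}_1(\mathfrak{D}),\ \mathrm{Prov}_{\mathfrak{S}}(\ulcorner \exists y R(n(a), y, \alpha) \urcorner) \to \mathrm{Prov}^*_{\mathfrak{S}}(\ulcorner \exists y R(n(a), y, \alpha) \urcorner)$ is $\mathfrak{P}$-provable by Lemma 1 for any free t-variable $a$.

(3) $\mathrm{Prov}^*_{\mathfrak{S}}(\ulcorner \exists y R(n(a), y, \alpha) \urcorner) \to T(\ulcorner \exists y R(n(a), y, \alpha) \urcorner)$ is $\mathfrak{P}$-provable by 3 of Lemma 2, since $P_A(\ulcorner \exists y R(n(a), y, \alpha) \urcorner)$ is $\mathfrak{P}$-provable.

(4) $\forall a\, T(\ulcorner \exists y R(n(a), y, \alpha) \urcorner) \to T(\ulcorner \forall x \exists y R(x, y, \alpha) \urcorner)$ is $\mathfrak{P}$-provable by 1.0 of Lemma 2.

(5) $T(\ulcorner \forall x \exists y R(x, y, \alpha) \urcorner) \to \forall x \exists y R(x, y, \alpha)$ by 2 of Lemma 2.

From (1)-(5) we have that

$$\mathrm{Ind}_1(\mathfrak{D}),\ \mathrm{Prov}_{\mathfrak{S}}(\ulcorner \forall x \exists y R(x, y, \alpha) \urcorner) \to \forall x \exists y R(x, y, \alpha)$$

is $\mathfrak{P}$-provable.

Remark. As we remarked in the introduction, we can prove the uniform reflection principle by modifying the proof of Theorem 1.

THEOREM 1′. $\mathrm{Ind}_1(\mathfrak{D}) \to \forall m\, (\mathrm{Prov}_{\mathfrak{S}}(\ulcorner \forall x \exists y R(x, y, \alpha, n(m)) \urcorner) \supset \forall x \exists y R(x, y, \alpha, m))$.

LEMMA 1′. $\mathrm{Ind}_1(\mathfrak{D}),\ \mathrm{Prov}_{\mathfrak{S}}(\ulcorner \exists x R'(x, \alpha, n(m)) \urcorner) \to \mathrm{Prov}^*_{\mathfrak{S}}(\ulcorner \exists x R'(x, \alpha, n(m)) \urcorner)$.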

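This lemma is provable from Lemma 1 by replacing $\ulcorner \exists x R(x, \alpha) \urcorner$ by $\ulcorner \exists x R'(x, \alpha, n(m)) \urcorner$. Then by taking $\forall z \forall x \exists y R'(x, y, \alpha, z)$ as $A$ in 4, (1)-(5) in the proof of Theorem 1 are provable for $\ulcorner \forall x \exists y R'(x, y, \alpha, n(m)) \urcorner$ instead of $\ulcorner \forall x \exists y R(x, y, \alpha) \urcorner$.

6. Let $\mathfrak{P}'$ be the system obtained from $\mathfrak{P}$ by adding second order pure logic, where mathematical induction is restricted to apply only to the formulas of $\mathfrak{P}$. Then from Theorem 1 we have the following

COROLLARY. Let $R(\beta, a, b, \alpha)$ be a formula of $\mathfrak{S}$ which is elementary in $\beta$ and $\alpha$. Then

$$\mathrm{Ind}_1(\mathfrak{D}),\ \mathrm{Prov}_{\mathfrak{S}}(\ulcorner \forall\varphi \forall x \exists y R(\varphi, x, y, \alpha) \urcorner) \to \forall\varphi \forall x \exists y R(\varphi, x, y, \alpha)$$

is $\mathfrak{P}'$-provable.

Chapter II

7. In this chapter we shall consider the systems SINN and SEID and call each of them $\mathfrak{S}$. Let $\mathfrak{D}$ denote the system of o.d.'s which is used in proving the consistency of $\mathfrak{S}$ (cf. 1 of this article). $\mathfrak{P}$ is the system which has been defined in 3. Let $B(\alpha)$ be an arbitrary formula of $\mathfrak{P}$ of the form

$$(*) \qquad \exists x_1 \forall y_1 \ldots \exists x_n \forall y_n B_0(\alpha, x_1, y_1, \ldots, x_n, y_n),$$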

where $B_0(\alpha, a_1, b_1, \ldots, a_n, b_n)$ is a quantifier free formula whose only free variables are $\alpha, a_1, b_1, \ldots, a_n, b_n$. $B(\alpha)$-subformulas are defined as in 4 of Chapter I.

LEMMA 3. Given $B(\alpha)$ which satisfies (*), we can define the truth definition $T_{B(\alpha)}$ for $B(\alpha)$-subformulas in $\mathfrak{P}$ with a $\Sigma^0_{2n}$ formula having the second order parameter $\alpha$. It is obvious that $T_{B(\alpha)}$ can be extended to the sequences whose formulas are $B(\alpha)$-subformulas.

8. Let $B(\alpha)$ be a formula satisfying (*). We define the condition $S_{B(\alpha)}$ as follows. Let $\ulcorner S \urcorner$ denote the Gödel number of a sequence $S$. By " " we mean that the quoted sentence is actually an arithmetized formula.

$S_1(B(\alpha); \ulcorner S \urcorner)$: "Each formula of S is a $B(\alpha)$-subformula".

$S_2(\ulcorner S \urcorner)$: "Each formula in the left side of S is quantifier-free".

$S_3(B(\alpha); \ulcorner S \urcorner)$: $\neg T_{B(\alpha)}(\ulcorner S \urcorner)$.

$S_{B(\alpha)}(\ulcorner S \urcorner)$: $S_1(B(\alpha); \ulcorner S \urcorner) \wedge S_2(\ulcorner S \urcorner) \wedge S_3(B(\alpha); \ulcorner S \urcorner)$.

From now on, throughout this chapter, $B(\alpha)$ shall be arbitrary but fixed so that it satisfies (*). For simplicity we shall abbreviate $T_{B(\alpha)}$ and $S_{B(\alpha)}$ to $T$ and $S$ respectively.

PROPOSITION 2.

$$\mathrm{Ind}_2(\mathfrak{D}),\ \mathrm{Prov}_{\mathfrak{S}}(p, \ulcorner B(\alpha) \urcorner),\ \neg T(\ulcorner B(\alpha) \urcorner) \to \exists q <\!\cdot\, p\, (\mathrm{Pf}^*(q) \wedge S(\mathrm{ends}(q)))$$

is $\mathfrak{P}$-provable, where $<\!\cdot$ is the well ordering of $\mathfrak{D}$ defined in 3 of Chapter I and $\mathrm{Ind}_2(\mathfrak{D})$ is the schema which allows transfinite induction along the order $<\!\cdot$ with respect to $\Sigma^0_{2n+1}$ formulas. The proposition is a trivial consequence of

LEMMA 4. ... is $\mathfrak{P}$-provable.

8.1. The lemma is proved by applying $\mathrm{Ind}_2(\mathfrak{D})$ to the following formula: ... Since $T$ and $S$ are in $\Sigma^0_{2n}$ and $\Pi^0_{2n}$ respectively, the induction formula is in $\Sigma^0_{2n+1}$ with the parameter $\alpha$. It is now obvious that in order to prove (1) it suffices to show ... ; if $\mathrm{Pf}^*(p)$, then we may take $p$ itself as $q$ in (1).

8.2. Assume $S(\mathrm{ends}(p)) \wedge \mathrm{Pf}_{\mathfrak{S}}(p) \wedge \neg \mathrm{Pf}^*_{\mathfrak{S}}(p)$ and find a $q$ which satisfies (2). This is done like the consistency proofs in [7], although, strictly speaking, the whole argument is developed in the arithmetized language. Let $P$ be the proof-figure with the Gödel number $p$.

1) Preparations for reduction, which are seen in 2 through 8.2 except 7 in Chapter 2 of [7], are applicable.

2) If there is an explicit logical inference or an induction in the end piece of $P$, then the proof is carried out according to the bottommost such inference.

2.1) The last such inference is an $\exists$ right on a t-variable. Let $P$ be of the form ...

Notice that $t_i$ is a closed term and primitive recursive. Therefore $t_i$ can be computed and is equal to a numeral $m_i$. $P$ is reduced to the following.

2.3) All other cases of logical inferences are proved easily. In virtue of $S_1(\mathrm{ends}(p))$ and $S_2(\mathrm{ends}(p))$, there is no $\forall$ left on a t-variable and no $\exists$ left on a t-variable.

2.4) The last inference which satisfies the condition is an induction. This case is proved as in [7].

3) Now we may assume that there is no explicit logical inference or induction in the end piece of $P$. Hereafter we can follow exactly the consistency proofs of [7]. Thus we have proved Lemma 4.

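9. PROPOSITION 3. $\mathrm{Ind}_2(\mathfrak{D}),\ \mathrm{Prov}_{\mathfrak{S}}(\ulcorner B(\alpha) \urcorner),\ \neg T(\ulcorner B(\alpha) \urcorner) \to$ is $\mathfrak{P}$-provable, where $B(\alpha)$ satisfies (*).

Proof. From the definition of $T$, $\mathrm{Prov}^*_{\mathfrak{S}}(q) \to T(\mathrm{ends}(q))$, which contradicts $S_3(\mathrm{ends}(q))$. Thus the proposition follows from Proposition 2.

10. THEOREM 2.

$$\mathrm{Ind}_2(\mathfrak{D}),\ \mathrm{Prov}_{\mathfrak{S}}(\ulcorner A(\alpha) \urcorner) \to A(\alpha)$$

is $\mathfrak{P}$-provable for an arbitrary arithmetical sentence $A(\alpha)$ with a second order parameter $\alpha$, where $\mathrm{Ind}_2(\mathfrak{D})$ applies to the formulas of $\mathfrak{P}$, that is, to the formulas arithmetical in some second order parameters.

Proof. It is well known that

(1) $A(\alpha) \leftrightarrow B(\alpha)$

is $\mathfrak{P}$-provable for some $B(\alpha)$ which satisfies (*).

(2) $\mathrm{Ind}_2(\mathfrak{D}),\ \mathrm{Prov}_{\mathfrak{S}}(\ulcorner B(\alpha) \urcorner) \to T_{B(\alpha)}(\ulcorner B(\alpha) \urcorner)$

and

(3) $T_{B(\alpha)}(\ulcorner B(\alpha) \urcorner) \to B(\alpha)$

are $\mathfrak{P}$-provable from Proposition 3 and Lemma 3 respectively. It is also known that

(4) $\mathrm{Prov}_{\mathfrak{S}}(\ulcorner A(\alpha) \urcorner) \leftrightarrow \mathrm{Prov}_{\mathfrak{S}}(\ulcorner B(\alpha) \urcorner)$

is $\mathfrak{P}$-provable. (1)-(4) yield the theorem.

Remark. Here again we can prove the uniform reflection principle.

THEOREM 2′. $\mathrm{Ind}_2(\mathfrak{D}) \to \forall m\, (\mathrm{Prov}_{\mathfrak{S}}(\ulcorner A(\alpha, n(m)) \urcorner) \supset A(\alpha, m))$, where $\mathrm{Ind}_2(\mathfrak{D})$ applies to the formulas of $\mathfrak{P}$.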

This is proved with modifications similar to those carried out in the proof of Theorem 1′. Namely, first apply Lemma 4 to $\ulcorner B(\alpha, n(m)) \urcorner$ in place of $\ulcorner B(\alpha) \urcorner$. Then take $\forall z B(\alpha, z)$ as $B(\alpha)$ and define the truth definition for $B(\alpha)$. The rest of the proof of Theorem 2 goes through after this alteration.

11. Let $\mathfrak{P}'$ be the system defined in 6. Then from Theorem 2 we have the following

COROLLARY. Let $A(\alpha)$ be an arithmetical sentence with a second order parameter $\alpha$. Then

$$\mathrm{Ind}_2(\mathfrak{D}),\ \mathrm{Prov}_{\mathfrak{S}}(\ulcorner \forall\varphi A(\varphi) \urcorner) \to \forall\varphi A(\varphi)$$

is $\mathfrak{P}'$-provable, where $\mathrm{Ind}_2(\mathfrak{D})$ applies to the formulas of $\mathfrak{P}$.

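12. Now we are going to present another formulation of the reflection principle for the formulas $\forall\varphi A(\varphi)$, where $A(\alpha)$ is arithmetical in $\alpha$. We shall state it in the form of the uniform reflection principle.

THEOREM 3. Let $A(\alpha, a)$ be arithmetical in $\alpha$ and let $\alpha$ and $a$ be the only free variables of $A$. Then

(1) $\mathrm{Ind}'(\mathfrak{D}),\ \mathrm{Prov}_{\mathfrak{S}}(\ulcorner \forall\varphi A(\varphi, n(a)) \urcorner) \to \forall\varphi A(\varphi, a)$

is $\mathfrak{S}$-provable, where $\mathrm{Ind}'(\mathfrak{D})$ applies to $\Sigma^0_3$ formulas with a second order parameter.

Proof. 12.1. First there exists a quantifier free formula $R(\alpha, b, c, a)$ for which

(2) $\forall\varphi A(\varphi, a) \leftrightarrow \forall\varphi \exists x \forall y R(\varphi, x, y, a)$

is $\mathfrak{S}$-provable. (2) implies that

(3) $\mathrm{Prov}_{\mathfrak{S}}(\ulcorner \forall\varphi A(\varphi, n(a)) \urcorner) \leftrightarrow \mathrm{Prov}_{\mathfrak{S}}(\ulcorner \forall\varphi \exists x \forall y R(\varphi, x, y, n(a)) \urcorner)$

is $\mathfrak{S}$-provable. (2) and (3) guarantee that, in order to prove (1), we only have to prove

(4) $\mathrm{Ind}'(\mathfrak{D}),\ \mathrm{Prov}_{\mathfrak{S}}(\ulcorner \exists x \forall y R(\alpha, x, y, n(a)) \urcorner) \to \exists x \forall y R(\alpha, x, y, a)$

in $\mathfrak{S}$.

12.2. (4) follows from

(5) $\mathrm{Ind}'(\mathfrak{D}),\ \mathrm{Prov}_{\mathfrak{S}}(\ulcorner \exists x \forall y R(\alpha, x, y, n(a)) \urcorner),\ \neg T(\ulcorner \exists x \forall y R(\alpha, x, y, n(a)) \urcorner) \to\ ,$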

which is proved like Proposition 3. Notice that $T$ is the truth definition for $\exists x \forall y R(\alpha, x, y, a)$, so that we may assume it is a $\Sigma^0_2$ formula with the parameter $\alpha$, which implies that $\mathrm{Ind}'(\mathfrak{D})$ applies to $\Sigma^0_3$ formulas with the parameter $\alpha$.

13. Remark. Kreisel has suggested* the following argument to show: We cannot prove

(1) $\mathrm{Prov}_{\mathfrak{S}}(\ulcorner \forall\varphi \exists x \forall y R(\varphi, x, y) \urcorner) \to \forall\varphi \exists x \forall y R(\varphi, x, y)$

by adding to $\mathfrak{S}$ all true $\Sigma^1_1$ sentences as axioms. (The idea of the proof is that we cannot obtain a new provable well-ordering by this addition, by a result of [2]; on the other hand, (1) provides a new provable well-ordering whose order type is the supremum of all the provable well-orderings in $\mathfrak{S}$.) This implies that we cannot prove

$$\mathrm{Prov}_{\mathfrak{S}}(\ulcorner \forall\varphi A(\varphi) \urcorner) \to \forall\varphi A(\varphi),$$

where $A(\alpha)$ is arithmetical in $\alpha$, by adding to $\mathfrak{S}$ the schema $\mathrm{Ind}(\mathfrak{D})$ which applies to arithmetical formulas without second order parameters.

* Several remarks were communicated privately to the first author. We are grateful to Professor G. Kreisel for his valuable comments.

14. Appendix

Here we wish to state an application of the proof of the theorem in Chapter 5 of [7] (also cf. [6]), although this has no direct bearing on the reflection principles:

THEOREM. Let $\mathfrak{S}$ be SINN or SEID (which may contain recursive functions as functional constants) and $\mathfrak{D}$ the system of ordinal diagrams used to prove the consistency of $\mathfrak{S}$; and let $Q(a)$ be a recursive predicate (containing no logical symbols) and $a <\!\cdot\, b$ a recursive linear ordering of the set $Q$ ($= \{a \mid Q(a)\}$). Then, if $<\!\cdot$ is a provable recursive well-ordering in $\mathfrak{S}$, there exists a $\Pi^0_1$ order-preserving one-one map $\sigma$ from $Q$ into $\mathfrak{D}$.

Proof. We can arithmetize notions concerning TJ-proof-figures with respect to $\mathfrak{S}$ as usual: Let $P(a, b)$ express that "a is the Gödel number of a TJ-proof-figure whose end-number is b". By the value of a TJ-proof-figure $P$ we understand $o(P) \# 0^{(a)}$, where $o(P)$ denotes the value of $P$ in the sense of [6] (i.e. the o.d. assigned to $P$), $a$ is the Gödel number of $P$ and $0^{(a)}$ is defined by $0^{(0)} = 0$ and $0^{(i+1)} = 0^{(i)} \# 0$. (By this modification of the value, no two TJ-proof-figures have the same value unless they are identical.) Let $a \prec b$ be the arithmetization of the well-ordering of $\mathfrak{D}$ and $o(a)$ the function expressing the Gödel number of the value of a TJ-proof-figure with the Gödel number $a$. These notions concerning arithmetization can be chosen to be primitive recursive. Let $\sigma(a)$ be defined by

$$b = \sigma(a) \;\Leftrightarrow_{\mathrm{dfn}}\; P(b, a) \wedge \forall x\, (P(x, a) \supset o(b) \prec o(x) \vee o(x) = o(b)).$$

If $b = \sigma(s)$, then $b$ is (the Gödel number of) a critical proof-figure (for, otherwise, $b$ would be reduced to a TJ-proof-figure with the same end-number $s$). $\sigma$ is order-preserving: i.e. if $s <\!\cdot\, t$, then $o(\sigma(s))$ [...]
15. Additional remark (added on Feb. 22, 1967)

We say "transfinite induction on $<\!\cdot$" is introduced by a rule, when we admit a formula $\forall x A(x)$ to be provable, if it is provable under the hypothesis

$$\forall x\, (\forall y\, (y <\!\cdot\, x \supset A(y)) \supset A(x)).$$

This rule is called the transfinite induction rule on $A(x)$. It should be remarked that $\Gamma \to \Delta, \forall x A(x)$ is not considered to be provable even if it is provable under the hypothesis

$$\forall x\, (\forall y\, (y <\!\cdot\, x \supset A(y)) \supset A(x)).$$

Therefore the rule is weaker than the ordinary induction schema. We can apply the rule repeatedly. For example, we can prove $\forall x B(x)$ by using

$$\forall x\, (\forall y\, (y <\!\cdot\, x \supset B(y)) \supset B(x))$$

and $\forall x A(x)$ after we prove $\forall x A(x)$.

In this paper transfinite induction along the ordering $<\!\cdot$ of $\mathfrak{D}$ is introduced as a schema. However, in the proofs of Theorems 1, 2, 1′, 2′ and 3, the schema can be replaced by the rule described above. More precisely, we apply the rule only once in each case. For example, in Theorem 1, we can replace $\mathrm{Ind}_1(\mathfrak{D})$ by the transfinite induction rule on a $\Sigma^0_1$ formula (without second order parameter). Kreisel pointed out that this result is best possible, i.e. we cannot replace the $\Sigma^0_1$ transfinite induction rule by the $\Pi^0_1$ transfinite induction rule here. Suppose that the $\Pi^0_1$ transfinite induction rule is sufficient to prove the reflection principle in Theorem 1. Then, since $\forall x A(x)$ is $\Pi^0_1$ if $A(x)$ is $\Pi^0_1$, the reflection principle in Theorem 1 can be proved in $\mathfrak{P}$ with true $\Pi^0_1$ sentences. We can show that the reflection principle in Theorem 1 cannot be proved by adding even to $\mathfrak{S}$ any true $\Pi^0_1$ sentences. For the reflection principle in Theorem 1 provides an enumeration of the provably recursive functions of $\mathfrak{S}$, and addition of true $\Pi^0_1$ sentences does not increase the class of provably recursive functions (cf. Section 5 of Kreisel, J. Symb. Logic 23 (1958)). He also pointed out that $\forall\psi \forall x \exists y R(\psi, x, y)$ is equivalent to $\forall x \exists y R_1(x, y)$ for some primitive recursive $R_1$, using his theorem in Note III of British Journal for Philosophy of Science 4 (1953). Therefore the presence of $\alpha$ in Theorem 1 is not essential.

References

1. A. KINO, On ordinal diagrams, J. Math. Soc. Japan 13 (1961) 346-356.
2. G. KREISEL, Status of the first ε-number in first order arithmetic, J. Symb. Logic 25 (1960) 390.
3. G. KREISEL and A. LEVY, Reflection principles and their use for establishing the complexity of axiomatic systems, to appear.
4. G. TAKEUTI, Ordinal diagrams, J. Math. Soc. Japan 9 (1957) 386-394.
5. G. TAKEUTI, Ordinal diagrams II, J. Math. Soc. Japan 12 (1960) 385-391.
6. G. TAKEUTI, A remark on Gentzen's paper "Beweisbarkeit und Unbeweisbarkeit von Anfangsfällen der transfiniten Induktion in der reinen Zahlentheorie", I, II, Proc. Japan Acad. 39 (1963) 263-269.
7. G. TAKEUTI, Consistency proofs of subsystems of classical analysis, Ann. of Math. 86 (1967) 299-348.

EQUATIONAL LOGIC AND EQUATIONAL THEORIES OF ALGEBRAS

A. TARSKI* University of California, Berkeley, USA

* This is an outline of an address given by the author to the Colloquium on Logic and Foundations of Mathematics, Hannover 1966. The paper was prepared for publication when the author was working on a research project in the foundations of mathematics sponsored by the National Science Foundation, Grant Number GP-6232X. The proofs of new results announced in this paper will be published elsewhere.

In this paper we shall introduce various metalogical notions referring to equational theories of algebras and, more generally, to equational logic. The purpose of the paper is to give a short survey of the results obtained and open problems concerning these notions. Some of the results announced in the paper appear in print for the first time.

1. Algebras and their equational theories

By an algebra we understand here a system $\mathfrak{A} = \langle A, O_1, \ldots, O_\nu \rangle$ formed by a non-empty set $A$ and a finite sequence of finitary operations $O_1, \ldots, O_\nu$ from and to elements of $A$; $A$ is the universe (or the carrier) and $O_1, \ldots, O_\nu$ are the fundamental operations of the algebra $\mathfrak{A}$. The sequence of ranks of $O_1, \ldots, O_\nu$ is called the (similarity) type of $\mathfrak{A}$; two algebras of the same type are said to be similar. In particular, algebras $\langle A, O \rangle$ of type $\langle 2 \rangle$ are frequently referred to as groupoids. A fundamental operation of rank 0 is usually identified with the unique element constituting the range of this operation; such an element is called a distinguished element of the algebra. All classes of algebras discussed here are assumed to consist of similar algebras.

Many basic properties of algebras are expressed by means of equations, i.e., formulas of the type $\sigma = \tau$ where $\sigma$ and $\tau$ are terms in the first-order theory of the algebra discussed. Thus, $\sigma$ and $\tau$ are formed from variables ranging over elements of the universe and from symbols denoting fundamental operations of the algebra. To avoid the use of parentheses we always put an operation symbol in front of the variables to which it refers. A symbol denoting a distinguished element, i.e., an operation of rank 0, is often called a constant or an individual constant.

The part of predicate logic in which equations are the only admitted formulas can be construed as a separate formal system and called equational logic. In this logic equations $\sigma = \tau$ with variables $x_1, \ldots, x_n$ are treated as if they were sentences

$$\forall x_1 \ldots x_n (\sigma = \tau)$$

where $\forall$ is the universal quantifier. We assume it will be understood what is respectively meant by the statements that an equation $\sigma = \tau$ is a consequence of, or is derivable from, a set $\Sigma$ of equations. The first of these two notions is defined semantically in terms of models or satisfaction; the second is defined proof-theoretically in terms of operations (rules) of direct inference. Three such operations are used in equational logic: the operation of including the tautology $x = x$ in every set of derivable equations, the operation of substitution, and that of replacing equals by equals. The notions of consequence and derivability are equivalent by virtue of the completeness theorem for equational logic, first proved by Birkhoff [1].

Two systems of equational logic differ only in operation symbols; each system is determined by a finite sequence (without repeating terms) $a = \langle a_1, \ldots, a_\nu \rangle$ of all operation symbols occurring in it. To discuss algebras $\langle A, O_1, \ldots, O_\nu \rangle$ of a given similarity type we choose the sequence $a$ in such a way that, for each $\iota = 1, \ldots, \nu$, the symbol $a_\iota$ has the same rank as the corresponding operation $O_\iota$. By the equational theory of an algebra $\mathfrak{A}$, or a class K of algebras, we understand the set of all equations, in the system of logic determined by $a$, which are identically satisfied in the algebra $\mathfrak{A}$, or in all algebras of K. We denote this theory by $\Theta_a\mathfrak{A}$ or $\Theta_a$K; usually we need not specify the sequence of operation symbols, and we can denote the equational theory simply by $\Theta\mathfrak{A}$, or $\Theta$K, without causing any misunderstandings. It is known that for every class K of algebras there is an algebra $\mathfrak{A}$ such that $\Theta\mathfrak{A} = \Theta$K. K is said to be an equational class of algebras or a variety if it consists of all models of some set $\Sigma$ of equations (or, equivalently, of the set $\Theta$K). A well-known theorem of Birkhoff [1] provides a set of conditions, of a purely algebraic nature, which is necessary and sufficient for a class K to be equational.
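The semantic notion of an equation being identically satisfied in a finite algebra can be checked mechanically. The following is a minimal sketch, assuming single-character operation symbols and variables; the helper names `term`, `evaluate` and `holds` and the sample groupoid are illustrative choices, not part of the paper. It parses terms written in the parenthesis-free prefix notation described above and tests an equation under every assignment of elements to its variables.

```python
from itertools import product


def parse(tokens, ranks):
    """Parse one prefix term; return (tree, unread tokens)."""
    head, rest = tokens[0], tokens[1:]
    if head in ranks:  # operation symbol: read as many arguments as its rank
        args = []
        for _ in range(ranks[head]):
            arg, rest = parse(rest, ranks)
            args.append(arg)
        return (head, tuple(args)), rest
    return head, rest  # anything else is a variable


def term(text, ranks):
    """Parse a whole string as a single term."""
    tree, rest = parse(list(text), ranks)
    if rest:
        raise ValueError("trailing symbols after a complete term")
    return tree


def variables(tree):
    """Collect the variables occurring in a parsed term."""
    if isinstance(tree, str):
        return {tree}
    found = set()
    for arg in tree[1]:
        found |= variables(arg)
    return found


def evaluate(tree, ops, env):
    """Value of a term under an assignment env of elements to variables."""
    if isinstance(tree, str):
        return env[tree]
    symbol, args = tree
    return ops[symbol](*(evaluate(arg, ops, env) for arg in args))


def holds(lhs, rhs, universe, ops, ranks):
    """True if lhs = rhs is identically satisfied in the given finite algebra."""
    left, right = term(lhs, ranks), term(rhs, ranks)
    names = sorted(variables(left) | variables(right))
    for values in product(universe, repeat=len(names)):
        env = dict(zip(names, values))
        if evaluate(left, ops, env) != evaluate(right, ops, env):
            return False
    return True


# Illustrative groupoid <{0, 1}, O> with Oxy = min(x, y).
RANKS = {"O": 2}
OPS = {"O": min}
print(holds("Oxy", "Oyx", (0, 1), OPS, RANKS))      # commutative law
print(holds("OOxyz", "OxOyz", (0, 1), OPS, RANKS))  # associative law
print(holds("Oxy", "x", (0, 1), OPS, RANKS))        # a law that fails here
```

The check is exhaustive over assignments, so it only decides membership of a single equation in the equational theory of one finite algebra; it says nothing about derivability from a base.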
Two theories $\Theta$ and $\bar{\Theta}$ with two different sequences $a$ and $\bar{a}$ may of course have the same class of models; the equations of one theory can then be obtained from those of the other simply by exchanging the corresponding operation symbols $a_\iota$ and $\bar{a}_\iota$. We refer to two such theories as isomorphic.

A set $\Theta$ of equations is called an (equational) theory if $\Theta = \Theta\mathfrak{A}$ for some algebra $\mathfrak{A}$ or, equivalently, if $\Theta$ consists of all equations derivable from a set $\Sigma$ of equations. Such a set $\Sigma$ is called a base for $\Theta$; $\Theta$ is said to be generated by $\Sigma$ and is denoted by $\Theta[\Sigma]$. By a base (or equational base) for an algebra $\mathfrak{A}$, or a class K, we mean any set $\Sigma$ which is a base for $\Theta\mathfrak{A}$, or $\Theta$K, respectively.

Let $\Theta$ and $\Theta'$ be two equational theories. If $\Theta$ is included in $\Theta'$ (and hence, in particular, all the operation symbols of $\Theta$ are among the operation symbols of $\Theta'$), we say that $\Theta$ is a subtheory of $\Theta'$ and that $\Theta'$ is an extension of $\Theta$. When speaking of subtheories of a given theory we always mean here subtheories with the same sequence of operation symbols; on the other hand, we shall sometimes consider extensions of a theory with additional operation symbols.

Given a theory $\Theta$, an operation symbol $O$ of rank, say, $\nu$, not occurring in $\Theta$, a sequence of $\nu$ distinct variables $x_1, \ldots, x_\nu$, and a term $\tau$ of $\Theta$, assume that the following condition is satisfied: either no variable different from $x_1, \ldots, x_\nu$ occurs in $\tau$ or, if such a variable $y$ does occur, and $\tau'$ is any term obtained from $\tau$ by replacing some occurrences of $y$ by occurrences of another variable, say, $z$, then the equation $\tau = \tau'$ belongs to $\Theta$. Under this assumption, the equation

$$O x_1 \ldots x_\nu = \tau \qquad (\text{or } O = \tau \text{ in case } \nu = 0)$$

is called a possible definition of $O$ in $\Theta$. Let $\Theta'$ be an extension of $\Theta$ and let $O_1, \ldots, O_n$ be all the distinct operation symbols occurring in $\Theta'$ but not in $\Theta$. We call $\Theta'$ a definitional extension of $\Theta$ if there is a set $\Sigma$ of possible definitions of the symbols $O_1, \ldots, O_n$ in $\Theta$, one definition for each symbol, such that $\Theta'$ is generated by $\Theta \cup \Sigma$ (the union of $\Theta$ and $\Sigma$). We say that two theories $\Theta$ and $\Theta'$ are directly definitionally equivalent if they have a common definitional extension, and that they are definitionally equivalent if any two theories $\bar{\Theta}$ and $\bar{\Theta}'$ which are respectively isomorphic with $\Theta$ and $\Theta'$ and have no common operation symbols are directly definitionally equivalent. If K and K′ are two equational classes of algebras, and their theories $\Theta$K and $\Theta$K′ are definitionally equivalent, then K and K′ are also called definitionally equivalent.

2. Finitely based theories

A theory $\Theta$ is said to be finitely based if it has some finite base; similarly for an algebra $\mathfrak{A}$ or a class K. It is known that every base $\Sigma$ of a finitely based theory $\Theta$ has a finite subset which is also a base for $\Theta$.

For various classes K of algebras, and in particular of finite algebras (i.e., algebras with finite universes), the problem of determining which algebras in K are finitely based was frequently raised and studied. Obviously, every one-element algebra is finitely based and has the single equation $x = y$ as a base. Lyndon [2] proved that every two-element algebra is finitely based. In Lyndon [3] he exhibited a seven-element groupoid which is not finitely based. More recently Murskii [4] presented a three-element groupoid with the same property. Oates and Powell [5] proved that all finite groups are finitely based. Perkins [6] obtained an analogous result for all three-element semigroups (groupoids which satisfy the associative law), but he also constructed a six-element semigroup which is not finitely based; for semigroups with four and five elements the problem is open.

An outstanding open problem is whether every infinite group is finitely based. The problem has been solved affirmatively (by B. H. Neumann, Lyndon, and others) for various special classes of groups, such as Abelian, metabelian, and nilpotent groups; an account of the results obtained so far can be found in Neumann [7], page 22. Perkins [6] has shown that every commutative semigroup is finitely based and has extended this result to some classes of non-commutative semigroups.
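Questions about very small algebras, such as the two-element algebras mentioned above, can be explored by exhaustive enumeration. The sketch below is an illustration under our own conventions (the universe is taken to be {0, 1} and the function names are hypothetical): it lists all binary operations on a two-element set and groups the resulting groupoids, first up to isomorphism and then up to isomorphism or dual isomorphism, where the dual of a groupoid is obtained by interchanging the two arguments of its operation.

```python
from itertools import product

ELEMS = (0, 1)
PAIRS = tuple(product(ELEMS, repeat=2))  # (0,0), (0,1), (1,0), (1,1)


def all_groupoids():
    """Yield every binary operation on {0, 1} as a table {(x, y): value}."""
    for values in product(ELEMS, repeat=len(PAIRS)):
        yield dict(zip(PAIRS, values))


def swap(table):
    """Image of the table under the only non-identity bijection 0 <-> 1."""
    return {(1 - x, 1 - y): 1 - v for (x, y), v in table.items()}


def dual(table):
    """Dual groupoid: the same operation with its arguments interchanged."""
    return {(y, x): v for (x, y), v in table.items()}


def key(table):
    """Hashable, comparable form of a table."""
    return tuple(table[p] for p in PAIRS)


def classes(identify_duals):
    """Canonical representatives of the equivalence classes of groupoids."""
    representatives = set()
    for table in all_groupoids():
        variants = [table, swap(table)]
        if identify_duals:
            variants += [dual(table), dual(swap(table))]
        representatives.add(min(key(v) for v in variants))
    return representatives


print(len(list(all_groupoids())))            # all operation tables
print(len(classes(identify_duals=False)))    # up to isomorphism
print(len(classes(identify_duals=True)))     # up to isomorphism or dual isomorphism
```

Choosing the least table in each orbit as representative works because swapping the elements and passing to the dual commute, so the four variants listed form the full orbit of a table.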

3. One-based theories

If an equational theory $\Theta$ is finitely based, we may wish to determine the least number of equations a base of $\Theta$ can contain, and in particular to answer the question whether $\Theta$ has a base consisting of just one equation. If the answer is affirmative, $\Theta$ is said to be one-based; similarly for algebras and classes of algebras.

While every one-element algebra is one-based, simple examples of algebras which are not one-based can be found among two-element groupoids. All the two-element groupoids can be divided into seven classes such that any two groupoids in the same class are isomorphic or dual-isomorphic. D. H. Potts has shown that the algebras in two of these classes are not one-based; as representatives of these classes we can take any two algebras $\langle A, O \rangle$ and $\langle A, Q \rangle$ where $A$ consists of two distinct elements, $a$ and $b$, and where $Oaa = a = Qab$ while $Oxy = b$ and $Qxy = b$ otherwise. Concerning the algebra $\langle A, O \rangle$ see Potts' note [8]. The algebras in the remaining five classes are known to be one-based.

The problem of determining which classes of algebras are one-based has been studied exhaustively for groups and rings. Groups are usually treated as algebras $\mathfrak{G} = \langle G, C, I \rangle$ of type $\langle 2, 1 \rangle$, with the binary operation $C$, group composition or multiplication, and the unary operation $I$, formation of inverses. They can also be treated in many other ways, e.g., as groupoids $\langle G, D \rangle$ where $D$ is the left-hand division (with $Dxy$ standing for $x \cdot y^{-1}$), or as algebras $\langle G, C, D \rangle$ of type $\langle 2, 2 \rangle$, etc. In all these cases the unit element $E$ can be included in the definition of a group; groups then become algebras $\langle G, C, I, E \rangle$ of type $\langle 2, 1, 0 \rangle$, or $\langle G, D, E \rangle$ of type $\langle 2, 0 \rangle$, etc. The class of all groups under each of these treatments is equational, and any two classes corresponding to two different treatments are definitionally equivalent. (Groups can also be treated as groupoids $\langle G, C \rangle$ with the group composition $C$ as the only fundamental operation. The class of groups so treated is not equational and is not definitionally equivalent with any of the classes previously mentioned.)

It was shown in Tarski [9] that the class of Abelian groups treated as groupoids $\langle G, D \rangle$ is one-based. Higman and Neumann [10] obtained a much stronger result by showing that every finitely based class of groups $\langle G, D \rangle$ is one-based. The author has established a still more general theorem, which implies the result of Higman and Neumann as a particular case and permits extending it to several other treatments of groups:

THEOREM 1. Let $\Theta$ be a finitely based equational theory with a binary operation symbol $D$ and at most one additional operation symbol $O$ of an arbitrary rank, say, $\nu$. Assume that $\Theta$ contains the equation

(i) $DDDvvDxyDDuux = y$

and also, in case $O$ actually occurs in $\Theta$, the equation

(ii) $O(Duu)^\nu = Duu$ (or $O = Duu$ if $\nu = 0$).

Under these assumptions $\Theta$ is one-based.

In (ii) we use $(Duu)^\nu$ as an abbreviation for the expression obtained by repeating $Duu$ $\nu$ times. A closely related result is

THEOREM 2. Theorem 1 remains valid if equation (i) is replaced by two equations: $DyDyx = x$ and $Dxx = Dyy$.

From Theorem 1 we conclude directly that every finitely based class of groups treated as algebras $\langle G, D \rangle$, $\langle G, D, E \rangle$, $\langle G, D, I \rangle$, or $\langle G, C, D \rangle$ is one-based; by means of a simple additional argument we extend this result to groups $\langle G, C, I \rangle$. It seems, therefore, interesting that the result cannot be extended to groups treated as algebras $\langle G, C, I, E \rangle$, $\langle G, C, D, E \rangle$, or $\langle G, D, I, E \rangle$: it was shown by Thomas Green and the author that under none of these treatments is a class of groups one-based unless it consists exclusively of one-element groups. Each such class, however, if it is finitely based, has a base consisting of two equations.

The notion of a one-based equational class of algebras, i.e., a class consisting of all models of a single equation, is a model-theoretical notion; we may be interested in providing a simple and purely mathematical characterization for this notion, just as it has been done for other related notions (such as equational class, finitely based equational class, elementary class, universal class, etc.). The problem is still open, but the results obtained for groups throw some interesting light on it. To fix the ideas, let K be the class of all groups $\langle G, C, I, E \rangle$ and K* the class of all groups $\langle G, D, E \rangle$. K and K* are definitionally equivalent; as a consequence, we can establish a natural one-one correspondence between the members of these two classes which preserves all the properties of groups and classes of groups expressed in such terms as subalgebra, isomorphism, homomorphism, direct product, ultraproduct, etc. Nevertheless, as we have seen, K* is one-based while K is not. The conclusion is that a mathematical characterization of one-based equational classes cannot be given exclusively in terms of subalgebra, isomorphism, etc. (although these notions proved to be adequate for all analogous purposes in the past).

The results for groups have been extended to rings.
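The hypotheses of Theorems 1 and 2 can be tested on concrete finite groups. The sketch below is only an illustration: the choice of the cyclic group of order 6 and the symmetric group on three letters, and the helper names, are our own assumptions. It reads $Dxy$ as $x \cdot y^{-1}$, evaluates prefix terms over the single symbol `D`, and checks equation (i) of Theorem 1 and the two equations of Theorem 2 under every assignment.

```python
from itertools import permutations, product


def evaluate(tokens, D, env):
    """Evaluate a prefix term over the binary symbol 'D'; return (value, unread tokens)."""
    head, rest = tokens[0], tokens[1:]
    if head == "D":
        left, rest = evaluate(rest, D, env)
        right, rest = evaluate(rest, D, env)
        return D(left, right), rest
    return env[head], rest  # a variable


def holds(equation, universe, D):
    """True if the equation 'lhs = rhs' holds under every assignment."""
    lhs, rhs = (side.strip() for side in equation.split("="))
    names = sorted(set(lhs + rhs) - {"D"})
    for values in product(universe, repeat=len(names)):
        env = dict(zip(names, values))
        if evaluate(lhs, D, env)[0] != evaluate(rhs, D, env)[0]:
            return False
    return True


# Cyclic group of order 6, written additively: Dxy is x - y modulo 6.
Z6 = range(6)


def d_z6(x, y):
    return (x - y) % 6


# Symmetric group on three letters: permutations as tuples, Dxy is x composed with y^{-1}.
S3 = list(permutations(range(3)))


def compose(p, q):
    return tuple(p[q[i]] for i in range(3))


def inverse(p):
    result = [0, 0, 0]
    for i, image in enumerate(p):
        result[image] = i
    return tuple(result)


def d_s3(p, q):
    return compose(p, inverse(q))


EQUATION_I = "DDDvvDxyDDuux = y"
for name, universe, D in (("Z6", Z6, d_z6), ("S3", S3, d_s3)):
    print(name, holds(EQUATION_I, universe, D),
          holds("DyDyx = x", universe, D), holds("Dxx = Dyy", universe, D))
```

In this reading equation (i) holds in both groups, whereas $DyDyx = x$ amounts to $y x y^{-1} = x$ and therefore fails in the non-commutative example; this is consistent with Theorem 2 being stated as a separate, related result rather than a reformulation of Theorem 1.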
Rings can be treated, e.g., as algebras ( R , C, I,M ) of type (2, 1, 2), with ring addition C, the formation of negatives (additive inverses) I,and ring multiplication M , or as algebras ( R , D,M ) of type (2,2) with ring subtraction D and ring multiplication M. Similarly, rings with unit 1 can be treated as algebras (R,C, I, M , 1) of type ( 2 , 1,2,0), or as algebras (R, D,M , 1) of type ( 2 , 2, 0). A direct consequence of Theorem 1 (or 2) is that every finitely based class of rings ( R , D,M ) is one-based. On the other hand, the author has shown that the class K’ of all rings (R,C, I, M ) is not one-based; the same applies to all finitely based equational subclasses of K containing a ring, with more than one element, in which the product of any two elements is zero. However, each such subclass of K has a base with two equations. Green has proved that all other equational subclasses of K are one-based. The solution of analogous problems for rings with unit is based upon the following general theorem:

THEOREM 3. Let $\Theta$ be a finitely based equational theory with two binary operation symbols, $D$ and $M$, an individual constant, $1$, and arbitrarily many other operation symbols of arbitrary ranks. Assume that $\Theta$ contains equation (i) of Theorem 1, or both equations of Theorem 2, and moreover any three of the following four equations:

$MxDuu = Duu$, $MDuux = Duu$, $Mx1 = x$, $M1x = x$.

Under these assumptions $\Theta$ is one-based.

This theorem was established by the author with the help of Theorems 1 and 2. A somewhat weaker, though essentially the same, result was obtained by McKenzie using a different method and was announced in an abstract of Grätzer and McKenzie [11]. It is easily seen that every equational theory which has a one-based definitional extension is itself one-based. This simple observation leads to a useful improvement of Theorem 3:

THEOREM 4. If an equational theory $\Theta$ satisfies all the assumptions of Theorem 3, then every equational theory $\Theta'$ which is definitionally equivalent with $\Theta$ is one-based.

As a direct consequence of this theorem, every finitely based class of rings $(R, D, M, 1)$ or $(R, C, I, M, 1)$ is one-based, and the same applies to classes of rings with unit under any other definitionally equivalent treatment. In particular, Boolean algebras can be treated as Boolean rings with unit, i.e., as rings $(B, D, M, 1)$ satisfying identically the equation $Mxx = x$. Several other treatments of Boolean algebras are known which are definitionally equivalent with the one just described. Thus, for instance, Boolean algebras can be treated as algebras $(B, J, M, K)$ of type $(2, 2, 1)$, with two binary operations, the formation of joins $J$ (Boolean addition) and the formation of meets $M$ (Boolean multiplication, coinciding with ring multiplication under the first treatment), and one unary operation, the formation of complements $K$; they can also be treated as algebras $(B, J, K)$ of type $(2, 1)$ (where $J$ and $K$ are the same operations as before) or, finally, as groupoids $(B, E)$, with the operation $E$ of exclusion (Sheffer’s stroke operation). Theorem 4 implies that the class of Boolean algebras under any one of these treatments is one-based.
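
As a quick sanity check on the four equations named in Theorem 3, one can verify them by brute force in small rings with unit. The following is a minimal Python sketch, not part of the original paper: it reads the residue ring $\mathbb{Z}/n\mathbb{Z}$ as an algebra $(R, D, M, 1)$ with $D$ as subtraction and $M$ as multiplication, and the function names `satisfies_theorem3_equations` and `is_boolean` are illustrative choices. It tests only these four equations (and $Mxx = x$), not the hypotheses taken over from Theorems 1 and 2.

```python
from itertools import product


def satisfies_theorem3_equations(n):
    """Check MxDuu = Duu, MDuux = Duu, Mx1 = x, M1x = x in the ring Z/nZ."""
    D = lambda x, y: (x - y) % n   # ring subtraction
    M = lambda x, y: (x * y) % n   # ring multiplication
    one = 1 % n                    # unit element (equals 0 when n == 1)
    for x, u in product(range(n), repeat=2):
        zero = D(u, u)             # the term Duu always denotes the zero element
        if M(x, zero) != zero:     # MxDuu = Duu
            return False
        if M(zero, x) != zero:     # MDuux = Duu
            return False
        if M(x, one) != x:         # Mx1 = x
            return False
        if M(one, x) != x:         # M1x = x
            return False
    return True


def is_boolean(n):
    """True if Z/nZ also satisfies Mxx = x, i.e. every element is idempotent."""
    return all((x * x) % n == x for x in range(n))
```

For instance, `satisfies_theorem3_equations(n)` holds for every modulus $n \ge 1$, while `is_boolean(n)` holds for $n = 2$ but fails for $n = 3$; the two-element case is the Boolean ring with unit mentioned above.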

It is well known that the equational theory of any single Boolean algebra $\mathfrak{B}$, say, $\mathfrak{B} = (B, J, M, K)$, with more than one element coincides with the theory of the class of all Boolean algebras; hence, $\mathfrak{B}$ is one-based. In case the Boolean algebra $\mathfrak{B}$ has just two elements, it is a primal algebra in the following sense: if $\mathfrak{B}'$ is any algebra obtained from $\mathfrak{B}$ by adjoining new fundamental operations (from and to elements of $B$), then $\Theta\mathfrak{B}'$ is a definitional extension of $\Theta\mathfrak{B}$. Clearly, every primal algebra is finite. An easy consequence of Theorem 4 is that every primal algebra is one-based. This last result was obtained independently by Grätzer (even before Theorem 4 was known) and was stated in Grätzer and McKenzie [11].

4. Independent bases

A set $\Sigma$ of equations is called independent (or irredundant) if no equation in this set is derivable from the remaining equations. Given a theory $\Theta$ we shall denote by $\nabla\Theta$ the set of all cardinals $\nu$ such that $\Theta$ has an independent base $\Sigma$ with cardinality $\nu$; similarly we define $\nabla\mathfrak{A}$, $\nabla K$. For any given $\Theta$ we may be interested in determining the set $\nabla\Theta$. The discussion of this problem has been simplified by the following result of the author:

THEOREM 5. Let $\Theta$ be an equational theory. If $\kappa$, $\lambda$, and $\mu$ are three finite cardinals such that $\kappa < \lambda < \mu$, and $\kappa, \mu \in \nabla\Theta$, then $\lambda \in \nabla\Theta$.

In a more general form this result applies, not specifically to equational logic, but to a wide class of formal systems with finitary operations of direct inference. The latter, treated as operations from formulas to formulas or to sets of formulas, may be of different ranks; e.g., substitution is of rank 1, while the replacement of equals by equals and modus ponens are of rank 2. The result holds for all formal systems in which the ranks of all operations of direct inference do not exceed 2. Actually, the result can be given a still more abstract form, and in this form it can be applied, outside of metamathematics, in the general theory of algebras (and of arbitrary relational structures). Let, for instance, $\mathfrak{A}$ be any algebra, even with infinitely many fundamental operations, in which, however, no fundamental operation is of rank greater than 2. Let $\kappa$, $\lambda$, and $\mu$ be again three finite cardinals such that $\kappa$

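THEOREM 6. Let $\Theta$ be an equational theory. If $\Theta$ consists exclusively of tautologies $\tau = \tau$, then

(i) $\nabla\Theta = \{0\}$.

If $\Theta$ is finitely based, but does not consist exclusively of tautologies, then $\nabla\Theta$ is either a bounded or an unbounded interval of non-zero finite cardinals; i.e., either

(ii) $\nabla\Theta = [\kappa, \mu] = \{\lambda : \kappa \le \lambda \le \mu\}$ for some $\kappa$, $\mu$ with $0 < \kappa \le \mu < \omega$, or

(iii) $\nabla\Theta = [\kappa, \omega) = \{\lambda : \kappa \le \lambda < \omega\}$ for some $\kappa$ with $0 < \kappa < \omega$.

If, finally, $\Theta$ is not finitely based, then either

(iv) $\nabla\Theta = \{\omega\}$, or else

(v) $\nabla\Theta = \emptyset$ (i.e., $\nabla\Theta$ is empty).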
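
Independence of a concrete finite set of equations is usually established semantically: an equation is not derivable from the others if some algebra satisfies the others but not that equation. The Python sketch below is an illustration under assumptions not taken from the paper, namely a single binary operation on the two-element set $\{0, 1\}$ and the familiar commutative and associative laws as sample equations; all function names are hypothetical. Finding a countermodel proves independence, whereas failing to find one among two-element groupoids proves nothing.

```python
from itertools import product

ELEMS = (0, 1)


def all_groupoids():
    """Yield every binary operation on {0, 1} as a dict (x, y) -> value."""
    pairs = list(product(ELEMS, repeat=2))
    for values in product(ELEMS, repeat=len(pairs)):
        yield dict(zip(pairs, values))


def commutative(op):
    return all(op[x, y] == op[y, x] for x, y in product(ELEMS, repeat=2))


def associative(op):
    return all(op[op[x, y], z] == op[x, op[y, z]]
               for x, y, z in product(ELEMS, repeat=3))


def countermodel(holds, fails):
    """Return a groupoid satisfying every law in `holds` but not `fails`, or None."""
    for op in all_groupoids():
        if all(law(op) for law in holds) and not fails(op):
            return op
    return None


def is_independent(laws):
    """Sufficient test only: each law fails in some two-element model of the others."""
    return all(
        countermodel([other for other in laws if other is not law], law) is not None
        for law in laws
    )
```

Here `is_independent([commutative, associative])` succeeds: the Sheffer stroke on $\{0, 1\}$ is commutative but not associative, and the left projection $x \cdot y = x$ is associative but not commutative.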

An essential supplement to Theorem 6 is

THEOREM 7. For each of the formulas (i)-(v) of Theorem 6 there are equational theories $\Theta$ satisfying this formula; formula (ii) can be satisfied for any given cardinals $\kappa$, $\mu$ ($0 < \kappa \le \mu < \omega$), and formula (iii) for any given cardinal $\kappa$ ($0 < \kappa < \omega$).
$\gamma_\nu$: $O^{\nu+1} y x_1 x_2 \ldots x_\nu y = O^{\nu+1} y x_2 \ldots x_\nu x_1 y$,

$\delta_\nu$: $O^{\nu} y x_1 x_2 \ldots x_\nu = O^{\nu+1} y x_1 x_2 \ldots x_\nu Oyy$,

for v = 1,2, 3, ... . The set V 0 has been determined for various theories 0 of the form 0 = 0 [C] where C is a set consisting of some of the equations E ~ - E ~ y,, , and 6,. If, e.g., C = { E ~ } ,then V 0 = [ 1 , 0). If C is one of the sets { E ' , E ~ or } { E ~ E, ~ or } { E ~ E, ~ E, ~ } ,then V 0 = [ 2 , w). (It was pointed out by Potts [8] that 1 $ V 0 and 2 ~ V in0 case ~ = O [ { EE,,, ,E ~ } ] . )If C= { E ~ } then , V 0 = { 1 , 2 } = [ 1 , 2 ] . If Z = { y , : O < v < o } , then V O = { o } ; if C = { y , : O < v < o > u {6,:O
THEOREM 8. Let $\Theta$ be an equational theory with arbitrary operation symbols. Let $x$ be a variable, and $\tau$ be a term in $\Theta$ with at least two occurrences of $x$; assume that the equation $\tau = x$ belongs to $\Theta$. If $0 < \kappa < \lambda < \omega$ and $\kappa \in \nabla\Theta$, then $\lambda \in \nabla\Theta$.

Some improvements of this result are also known. By comparing Theorem 8 with various results stated in section 3, we arrive at the following conclusion: if $\Theta$ is a finitely based theory of a class of groups or rings (under any treatment of these notions discussed in section 3), then either $\nabla\Theta = [1, \omega)$ or $\nabla\Theta = [2, \omega)$, depending on whether or not $\Theta$ is one-based. This solves a problem raised by Higman and Neumann [10] for theories of groups $(G, D)$.

Let $\Theta$ be a finitely based theory of a class of lattices, treated as algebras $(L, J, M)$ of type $(2, 2)$ with the operations of forming joins and meets. It has been announced by Padmanabhan [12] that $2 \in \nabla\Theta$. Hence, by Theorem 8, $\nabla\Theta \supseteq [2, \omega)$. McKenzie has recently proved that $1 \notin \nabla\Theta$ if and only if there is a lattice which is not a model of $\Theta$ and there is a lattice with more than one element which is a model of $\Theta$.

The following question arises in a natural way: Let $\Gamma$ be a finite independent set of $\kappa$ equations belonging to a theory $\Theta$ but not forming a base for $\Theta$; let $\kappa < \lambda < \omega$ and $\lambda \in \nabla\Theta$. Can we always extend $\Gamma$ to an independent set $\Delta$ of $\lambda$ equations which is a base for $\Theta$? In general the answer to this question is negative. For instance, let $\Theta$ be a finitely based theory of a class of lattices; let $\Gamma$ be an independent base for the theory of all lattices, but not for $\Theta$, and let $\kappa$ be the cardinality of $\Gamma$. As was mentioned above, we have $\lambda \in \nabla\Theta$ for every $\lambda$ such that $\kappa <$

theory of all distributive lattices. If we ask the same question taking for $\Theta$ any finitely based theory of a class of groups and for $\Gamma$ an independent base for the theory of all groups, the answer is still unknown. However, the answer proves to be affirmative for theories $\Theta$ of various special classes of groups, e.g., the class of Abelian groups or the class of all groups in which the order of every element divides a given positive integer $\nu$; $\Gamma$ is assumed here to consist of just one equation.

Turning to theories $\Theta$ without finite bases, we recall that, by Theorems 6 and 7, they can be divided into two non-empty classes characterized respectively by the formulas $\nabla\Theta = \{\omega\}$ and $\nabla\Theta = \emptyset$. We do not know any natural and mathematically interesting examples of algebras without independent bases. In particular, we do not know whether there are groups $\mathfrak{G}$ with $\nabla\mathfrak{G} = \emptyset$; nor do we know whether there are groups $\mathfrak{G}$ with $\nabla\mathfrak{G} = \{\omega\}$. If $\nabla\Theta = \{\omega\}$, then, as is easily seen, $\Theta$ has $2^{\omega}$ subtheories (while $\nabla\Theta = \emptyset$ seems to imply only that $\Theta$ has infinitely many subtheories). More generally, if $\nabla\Theta = \{\omega\}$ and $\Theta'$ is a finitely based subtheory of $\Theta$, then there are $2^{\omega}$ equational theories which include $\Theta'$ and are included in $\Theta$. Hence, in particular, if there is a group $\mathfrak{G}$ with $\nabla\mathfrak{G} = \{\omega\}$, then there are $2^{\omega}$ varieties of groups.

5. Consistent and complete theories; the lattice of equational theories

An equational theory $\Theta$ is called consistent if it does not contain all equations (with given operation symbols) or, equivalently, if it does not contain the equation $x = y$; $\Theta$ is called complete if there is no consistent theory which properly includes $\Theta$. An algebra $\mathfrak{A}$, or a class $K$, is (equationally) complete if and only if $\Theta\mathfrak{A}$, or $\Theta K$, is complete.
The well-known theorem of Lindenbaum (originally established for the sentential calculus), stating that for every consistent theory $\Theta$ there is a consistent and complete theory which includes $\Theta$, extends to equational theories; its proof depends merely on the facts that all the operations of direct inference are finitary and that inconsistent theories are finitely based. In contrast to the predicate logic, a sharper form of this theorem, to the effect that every theory is the intersection of all complete theories which include it, does not apply to equational theories. Actually many incomplete theories are known which can be extended to just one consistent and complete theory.

For some important classes of algebras all equationally complete members of these classes have been fully determined. Thus, Kalicki and Scott [13] have shown that the only semigroups $(A, C)$ which are complete are first the "trivial" semigroups satisfying one of the three equations $Cxy = Cuv$, $Cxy = x$, $Cuv = v$, then the semilattices, and finally those Abelian groups in which all non-unit elements are of a given prime order. By a result in Tarski [14], a ring $\mathfrak{R}$, with or without unit, is complete if and only if it is commutative, and there is a prime number $n$ such that (i) every non-zero element of the ring is of order $n$ in the additive group of $\mathfrak{R}$, and (ii) either the product of any two elements is zero or else every element equals its $n$-th power. The variety of complete equational theories appears to be rather sparse. Nevertheless Kalicki [15] was able to show that there are $2^{\omega}$ complete theories with, say, one binary operation symbol.

The set $T$ of all equational theories, with a fixed sequence $\sigma$ of operation symbols, is lattice-ordered by the inclusion relation. If we define the meet of two theories $\Theta, \Theta' \in T$ as the intersection $\Theta \cap \Theta'$, and the join as the theory generated by the union $\Theta \cup \Theta'$, then $T$ together with these two operations forms a lattice $\mathfrak{L}_\sigma$. The meet and join of an arbitrary subset $S$ of $T$ are defined analogously. The lattice is complete; the inconsistent theory, i.e., the set of all equations, is the unit element, and the theory consisting of all tautologies is the zero element of the lattice. The complete and consistent theories are the maximal non-unit elements of the lattice; thus, every non-unit element is included in a maximal non-unit element. On the contrary, it is easily seen that the lattice has no minimal non-zero elements (disregarding the trivial case when all operation symbols are of rank 0). The finitely based theories coincide with the compact elements of the lattice, i.e., the elements $x$ characterized by the following condition: if $x$ is included in the join of any subset $S$ of $T$, then it is included in a join of finitely many elements of $S$. The unit element is compact, and every element is the join of all compact elements included in it.
It would be interesting to provide a full intrinsic characterization of the lattice $\mathfrak{L}_\sigma$ and all its isomorphic images, using exclusively lattice-theoretical terms. The characterization may, of course, depend essentially on the ranks of operation symbols occurring in the theories. In fact, McKenzie has recently shown that two lattices $\mathfrak{L}_\sigma$ and $\mathfrak{L}_{\sigma'}$ are isomorphic just if, for every $\rho < \omega$, the sequences $\sigma$ and $\sigma'$ contain the same number of operation symbols of rank $\rho$. The problem of characterizing intrinsically some special sublattices of the lattice may also deserve attention. In particular we have here in mind sublattices formed by all extensions of some important equational theories such as the theory of groups, rings, or lattices. Instead of the lattice of equational theories we can, of course, discuss the dual-isomorphic lattice formed by all equational classes of algebras of a given type.

6. Decision problems

Various notions, problems, and results discussed so far in this paper suggest in a natural way corresponding decision problems. These are problems of the type: is a given set of equations, or of finite sets of equations, or of finite algebras, recursive? The meaning of the term “recursive” in these contexts is clear; finite algebras can be regarded as algebras whose universe consists of finitely many integers.

The conceptually simplest decision problems concern the decidability of individual theories. A theory is called decidable simply if it is recursive. Most familiar equational theories prove to be decidable. The theory of any finite algebra is of course decidable. Also, as in the predicate logic, every complete theory with a finite (or, more generally, a recursive) base is decidable. The proof uses again the fact that inconsistent theories are finitely based. There are finitely based undecidable theories. Probably the first example was given in Tarski [16]. It is the theory of relation algebras, which may be treated as algebras of type $(2, 1, 2, 1)$, $(2, 1, 2)$ or even $(2, 2)$. Perkins [6] gives an example of a theory of groupoids with the same properties.

Closely related to decidability problems are the famous word problems. Given a theory $\Theta$, we enrich its language with a finite set $Q$ of individual constants. In the enriched language we construct a finite set $\Sigma$ of equations containing no variables and we denote by $[\Theta, Q, \Sigma]$ the theory generated by $\Theta \cup \Sigma$. The word problem for $\Theta$ is the problem of determining whether in each such theory $[\Theta, Q, \Sigma]$ the set of all equations containing no variables is recursive. If we restrict ourselves here to those theories $[\Theta, Q, \Sigma]$ in which $Q$ is arbitrary but $\Sigma$ is empty, then the resulting problem is simply the decidability problem for $\Theta$. It may be noticed that, in formulating the word problem, we may take for $\Theta$, not necessarily an equational theory, but e.g. any first-order theory, and that the problem may then be formulated more simply in terms of so-called conditional equations (universal Horn sentences) of the original theory $\Theta$, without considering any extensions $[\Theta, Q, \Sigma]$. The far-reaching results obtained during the last years in the study of word problems are well known.

We turn to decision problems of a different type concerning finite sets $\Sigma$ of equations. Let $S_\nu$, $\nu = 1, \ldots, 6$, be the set of all such sets $\Sigma$ satisfying respectively one of the following conditions: ($c_1$) $\Sigma$ is a base for a given finite algebra $\mathfrak{A}_0$; ($c_2$) there is a finite algebra $\mathfrak{A}$ for which $\Sigma$ is a base; ($c_3$) $\kappa_0 \in \nabla\Theta[\Sigma]$ for a given positive integer $\kappa_0$; ($c_4$) $\Theta[\Sigma]$ is consistent; ($c_5$) $\Theta[\Sigma]$ is complete; ($c_6$) $\Theta[\Sigma]$ is decidable. The decision problems for $S_2$, $S_4$, $S_5$, $S_6$ were discussed and solved negatively by Perkins [6]. The negative solution for $S_4$ implies automatically the negative solution for $S_1$, with a one-element algebra taken for $\mathfrak{A}_0$. It seems likely that a negative solution of this problem for some other finite algebras, and in particular for any Boolean algebra, can be obtained by appropriately modifying the proof of an analogous result for sentential calculus announced by Linial and Post [17]; a detailed proof of the latter result is given by Yntema [18]. (There is much affinity between the formalisms of sentential calculus and equational logic; as a consequence, various metalogical results established for sentential calculus can frequently be carried over to equational logic with appropriate changes in formulations and proofs.) The decision problem for $S_3$, with any particular integer, e.g., 1, taken for $\kappa_0$, remains open.

We can replace everywhere in ($c_1$)-($c_6$) the sets $\Sigma$ by the singletons $\{\varepsilon\}$ where $\varepsilon$ ranges over arbitrary equations, and denote by $S'_\nu$ the resulting sets of equations $\varepsilon$. The decision problems for $S'_1$-$S'_6$ seem to be open. So are the decision problems for the sets $K_\nu$, $\nu = 1, 2, 3$, of all finite algebras $\mathfrak{A}$ satisfying one of the following conditions: ($e_1$) $\mathfrak{A}$ is finitely based; ($e_2$) $\kappa_0 \in \nabla\mathfrak{A}$ for a given cardinal $\kappa_0$ ($0 < \kappa_0 < \omega$); ($e_3$) $\mathfrak{A}$ is equationally complete. Two of the problems for the sets $K_\nu$ were raised by Perkins [6]. Many other related decision problems for finite sets of equations and finite algebras are known.
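
The remark that the theory of any finite algebra is decidable can be made concrete: to test whether an equation belongs to the theory, it suffices to evaluate both sides under every assignment of elements to its variables. The following Python sketch does this for equations written in the parenthesis-free notation used above (for example $Mxx = x$). It assumes single-character symbols; the dictionary format for the operations and the names `parse`, `variables`, `evaluate`, `holds`, and `BOOL` are hypothetical conveniences, not notation from the paper.

```python
from itertools import product


def parse(text, ops):
    """Parse one prefix-notation term from the front of `text`.

    Returns (tree, rest) with tree = (symbol, [subtrees]).  Symbols that are
    not keys of `ops` are treated as variables.
    """
    head, rest = text[0], text[1:]
    args = []
    if head in ops:
        arity = ops[head][0]
        for _ in range(arity):
            arg, rest = parse(rest, ops)
            args.append(arg)
    return (head, args), rest


def variables(tree, ops):
    """Collect the variable symbols occurring in a parsed term."""
    head, args = tree
    if head not in ops:
        return {head}
    found = set()
    for arg in args:
        found |= variables(arg, ops)
    return found


def evaluate(tree, ops, env):
    """Evaluate a parsed term under the assignment `env` of values to variables."""
    head, args = tree
    if head not in ops:
        return env[head]
    return ops[head][1](*(evaluate(arg, ops, env) for arg in args))


def holds(equation, universe, ops):
    """Decide whether `equation` is identically satisfied in the finite algebra."""
    left_text, right_text = equation.split("=")
    left, rest_left = parse(left_text.strip(), ops)
    right, rest_right = parse(right_text.strip(), ops)
    if rest_left or rest_right:
        raise ValueError("malformed term")
    names = sorted(variables(left, ops) | variables(right, ops))
    for values in product(universe, repeat=len(names)):
        env = dict(zip(names, values))
        if evaluate(left, ops, env) != evaluate(right, ops, env):
            return False
    return True


# The two-element Boolean algebra (B, J, M, K): join, meet, complement.
BOOL = {
    "J": (2, lambda x, y: x | y),
    "M": (2, lambda x, y: x & y),
    "K": (1, lambda x: 1 - x),
}
```

With the two-element Boolean algebra encoded as `BOOL`, the call `holds("KJxy = MKxKy", (0, 1), BOOL)` checks one of De Morgan's laws by inspecting all four assignments. The procedure is exponential in the number of variables, which is harmless for decidability but matters in practice.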

References

1. G. BIRKHOFF, Proc. Cambridge Phil. Soc. 31 (1935) 433-454.
2. R.C. LYNDON, Trans. Am. Math. Soc. 71 (1951) 457-465.
3. R.C. LYNDON, Proc. Am. Math. Soc. 5 (1954) 8-9.
4. V.L. MURSKII, Soviet Math. 6 (1965) 1020-1024.
5. S. OATES and M. POWELL, J. Algebra 1 (1964) 11-39.
6. P. PERKINS, Decision problems for equational theories of semigroups and general algebras, doctoral dissertation, University of California, Berkeley (1966).
7. H. NEUMANN, Varieties of groups (New York, 1967).
8. D.H. POTTS, Can. Math. Bull. 8 (1965) p. 519.
9. A. TARSKI, Fund. Math. 30 (1938) 253-256.
10. G. HIGMAN and B.H. NEUMANN, Publ. Math. 2 (1951-52) 215-221.
11. G. GRÄTZER and R. MCKENZIE, Am. Math. Soc. Notices 14 (1967) p. 697 [abstract].
12. R. PADMANABHAN, Am. Math. Soc. Notices 14 (1967) p. 697 [abstract].
13. J. KALICKI and D. SCOTT, Indag. Math. 17 (1955) 650-659.
14. A. TARSKI, Indag. Math. 18 (1956) 39-46.
15. J. KALICKI, Indag. Math. 17 (1955) 660-662.
16. A. TARSKI, J. Symbolic Logic 18 (1953) 188-189 [abstracts].
17. S. LINIAL and E.L. POST, Bull. Am. Math. Soc. 55 (1949) p. 50 [abstract].
18. M.K. YNTEMA, Notre Dame J. Formal Logic 5 (1964) 37-50.

THE USE OF “BROUWER’S PRINCIPLE” IN INTUITIONISTIC TOPOLOGY

A. S. TROELSTRA
Mathematical Institute, University of Amsterdam

The purpose of this paper is to demonstrate the strong consequences of the intuitionistic continuity postulate (called “Brouwer’s principle” by Kleene and Vesley in their monograph [2]) in an important domain of mathematics. The first section lists notations and conventions, the second one treats technical matters with the purpose of simplifying the applications, and in the third section a number of theorems are proved in order to show the consequences of Brouwer’s principle and the way it can be applied. Some of the theorems mentioned were already proved in [4]; their proofs are not repeated here; for Theorem 2, for which a proof was given in [4], a new (more “topological”) version of the proof is presented. The other theorems of Section 3 were already proved in another context in [5]; in this paper the presentation of the proofs is simplified. Our proofs are not carried out in any definite formal system, but the possibility of easy formalization was kept in mind.

1. Notations and conventions

Instead of the intuitionistic notion “species” we speak of sets. For typically intuitionistic notions not defined in this paper see [1]. For the various topological notions used in this paper we can take those classical definitions which start from “open set” as basic notion. Full details are given in [5]. Now we list a few specific conventions and notations.

a) $i, j, k, l, m, n, s, t$ are used for natural numbers; $N$ denotes the set of natural numbers (not including zero).
b) $\alpha, \beta, \gamma$ always denote denumerably infinite sequences (functions defined on $N$).
c) $\delta, \varepsilon$ stand for positive real numbers.
d) $\varphi, \psi, \chi, f$ are used for functions in general.
e) $\sigma, \tau$ are always reserved for finite sequences.
f) $*$ denotes concatenation of finite sequences.
g) Sequences $\langle p_1, p_2, p_3, \ldots\rangle$ are also written as $\langle p_n\rangle_{n=1}^{\infty} = \langle p_n\rangle_n$. $\alpha = \langle\alpha(1), \alpha(2), \ldots\rangle = \langle\alpha(n)\rangle_n$. $\bar\alpha(n) = \langle\alpha(1), \ldots, \alpha(n)\rangle$. If $\alpha$ is a convergent sequence, e.g. a sequence of points in a metric space, we write $\lim_{n\to\infty}\alpha(n) = \lim\alpha$.
h) $A, B, C, R, T$ denote predicates or relations.
i) $\Theta$ is always used for a spread law, $\vartheta$ for a complementary law ([1], 3.1.2); $S = \langle\Theta, \vartheta\rangle$ denotes a spread. $\Theta_0$ is the set of all finite sequences of natural numbers. $D(\Theta) = \langle\Theta, \vartheta\rangle$ with $\vartheta$ the identity. $\vartheta\alpha = \langle\vartheta\bar\alpha(n)\rangle_n$.
j) Logical symbols: $\vee$, $\&$, $\neg$, $\to$, $\forall$, $\exists$, $\exists!$ (there exists exactly one).
k) Set theoretical symbols as usual: $\cap$, $\cup$, $\in$, $\subset$ etc.; $^c$ denotes complementation, i.e. for a subset $V$ of a topological space $\Delta$: $V^c = \Delta - V = \{p : p \in \Delta\ \&\ p \notin V\}$.
l) $\Gamma$ always denotes a complete, separable metric space, $\Gamma'$ a complete metric space, $\Gamma''$ a separable metric space, $\Gamma'''$ a metric space. $U(\varepsilon, p) = \{q : \rho(p, q) < \varepsilon\}$ in spaces $\Gamma'''$ with metric $\rho$. Likewise with $\Gamma_1$ instead of $\Gamma$.
m) $\mu x\,A(x)$ denotes the smallest $x$ such that $A(x)$.

2. A few technicalities

In this paragraph we always suppose $\alpha, \beta, \gamma \in D(\Theta_0)$, if not stated otherwise; $A$ is a predicate containing number- and function variables. We start with a summing-up of the various forms of the intuitionistic continuity postulate which will be used. Kreisel introduced ([3], p. 23, D9) a rather weak form of the continuity postulate, which is very manageable ([2], 7.7, p. 79):

(I) $\forall\alpha\,\exists n\,A(\alpha, n) \to \forall\alpha\,\exists m\,\exists n\,\forall\beta\,(\bar\alpha(m) = \bar\beta(m) \to A(\beta, n))$.

Without essentially strengthening (I) we may suppose $m \ge n$, i.e.

(IA) $\forall\alpha\,\exists n\,A(\alpha, n) \to \forall\alpha\,\exists m\,\exists n\,\forall\beta\,((\bar\alpha(m) = \bar\beta(m) \to A(\beta, n))\ \&\ m \ge n)$.
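
Principle (I) says that the number $n$ assigned to $\alpha$ is already determined by a finite initial segment $\bar\alpha(m)$. For a functional given by a program this can be observed directly by recording how far into $\alpha$ the computation looks. The Python sketch below is only an illustration of that reading, with $A(\alpha, n)$ taken to be $F(\alpha) = n$ for a sample functional `F`; the class `Recorded` and the helper names are assumptions introduced here, and sequences are modelled as functions on the positive integers.

```python
class Recorded:
    """Wrap a sequence alpha: N -> N and remember the largest index consulted."""

    def __init__(self, alpha):
        self.alpha = alpha
        self.modulus = 0

    def __call__(self, i):
        self.modulus = max(self.modulus, i)
        return self.alpha(i)


def F(alpha):
    """A sample functional: look up alpha at the position named by alpha(1)."""
    return alpha(alpha(1))


def value_and_modulus(functional, alpha):
    """Return (n, m): the value of the functional and the segment length it used."""
    recorded = Recorded(alpha)
    n = functional(recorded)
    return n, recorded.modulus


def agree_then_patch(alpha, m, patch):
    """A sequence beta with beta(i) = alpha(i) for i <= m and patch(i) beyond."""
    return lambda i: alpha(i) if i <= m else patch(i)
```

Replacing the recorded $m$ by $\max(m, n)$ gives a segment length with $m \ge n$, as in (IA), since agreement on a longer initial segment implies agreement on a shorter one.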

Kleene and Vesley introduce in their monograph ([2], p. 73) a much stronger form, which they call Brouwer’s principle (for numbers):

(II) $\forall\alpha\,\exists n\,A(\alpha, n) \to \exists\psi\,\forall\alpha\,\{\exists! m\,\psi(\bar\alpha(m)) > 1\ \&\ \forall m\,(\psi(\bar\alpha(m)) > 1 \to A(\alpha, \psi(\bar\alpha(m)) - 1))\}$.

$\psi$ is a function defined on finite sequences of natural numbers. (Therefore the requirement that $A$ does not contain $\psi$ free is superfluous here, since $A$ contains variables for functions from $N$ into $N$ only.) A weak consequence of (II) is the enumeration principle:

(III) $\forall\alpha\,\exists n\,A(\alpha, n) \to \exists\gamma\,(\forall\alpha\,\exists n\,A(\alpha, \gamma(n))\ \&\ \forall n\,\exists\alpha\,A(\alpha, \gamma(n)))$.

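The strongest form of an intuitionistic continuity postulate we shall use is Brouwer’s principle for functions ([2], p. 73):

(IV) $\forall\alpha\,\exists\beta\,A(\alpha, \beta) \to \exists\psi\,\forall\alpha\,(\forall m\,\exists! n\,\psi(m, \bar\alpha(n)) > 1\ \&\ \forall\beta\,[\forall m\,\exists n\,\psi(m, \bar\alpha(n)) = \beta(m) + 1 \to A(\alpha, \beta)])$.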

Without essentially strengthening (IV) we can impose the following condition on the $\psi$ in (IV):

(*) $\forall m\,\forall n\,\forall k\,(\psi(m, \bar\alpha(n)) > 1\ \&\ \psi(m+1, \bar\alpha(k)) > 1 \to n < k)$.

The form of (IV) with (*) included is denoted by (IVA). In (IV) and (IVA) a mapping $\psi$ is defined for pairs $\langle m, \sigma\rangle$, $m \in N$, $\sigma \in \Theta_0$. The principles (I)-(IVA) remain valid if the range of the variables $\alpha$, $\beta$ is restricted to $D(\Theta)$. These restricted forms we denote by (I*)-(IVA*). Further we need the axiom of choice in the following form ($\alpha$ not necessarily in $D(\Theta_0)$):

(V) $\forall n\,\exists x\,B(n, x) \to \exists\alpha\,\forall n\,B(n, \alpha(n))$.

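DEFINITION 1. A set $X$ is said to be represented by $\langle\Theta, \vartheta, \sim\rangle$ if

a) $\langle\Theta, \vartheta\rangle$ is a spread, $S$,
b) $S^*$, the set of equivalence classes of $S$ with respect to $\sim$, can be mapped bi-uniquely onto $X$.

DEFINITION 2. We define a mapping (the standard mapping) $\chi$ from $D(\Theta_0)$ onto $D(\Theta)$, as follows:

$\bar\alpha(n) \in \Theta \to \overline{\chi\alpha}(n) = \bar\alpha(n)$,

$\bar\alpha(n) \notin \Theta \to \overline{\chi\alpha}(n) = \overline{\chi\alpha}(n-1) * \langle\mu m\,(\overline{\chi\alpha}(n-1) * \langle m\rangle \in \Theta)\rangle$.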
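
Definition 2 can be read as an algorithm: follow $\alpha$ as long as its initial segments stay inside $\Theta$, and from the first exit onwards extend by the least admissible number. A minimal Python sketch of this reading follows. It assumes that the spread law is given as a decidable predicate `in_spread` on tuples, that every admitted tuple has an admitted one-step extension (so the search for the least $m$ terminates), and that natural numbers start at 1 as in the conventions above; the function name `standard_image` is hypothetical.

```python
def standard_image(alpha, in_spread, n):
    """Return the first n values of chi(alpha) as a tuple.

    alpha     : function from positive integers to positive integers
    in_spread : predicate deciding whether a tuple belongs to the spread law
    """
    image = ()
    following = True                     # True while alpha-bar(k) is still admitted
    for k in range(1, n + 1):
        candidate = image + (alpha(k),)
        if following and in_spread(candidate):
            image = candidate            # alpha-bar(k) is admitted: copy alpha
        else:
            following = False            # alpha has left the spread for good
            m = 1
            while not in_spread(image + (m,)):
                m += 1                   # least m that keeps the segment admitted
            image = image + (m,)
    return image


def in_binary_fan(segment):
    """Sample spread law: all finite sequences with values in {1, 2}."""
    return all(value in (1, 2) for value in segment)
```

For the spread of all sequences with values in $\{1, 2\}$, the sequence $1, 2, 3, 4, 5, \ldots$ is mapped to $1, 2, 1, 1, 1, \ldots$, while sequences already in the spread are left unchanged.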

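DEFINITION 3. A spread $\langle\Theta, \vartheta\rangle$ is called homogeneous, if

$\forall\alpha\,\forall\beta\,\forall m\,(\vartheta\bar\alpha(m) = \vartheta\bar\beta(m) \to \exists\gamma\,(\bar\alpha(m) = \bar\gamma(m)\ \&\ \vartheta\beta = \vartheta\gamma))$.

THEOREM 1. If $\langle\Theta, \vartheta\rangle$ is homogeneous, then

$\forall\alpha \in D(\Theta)\,\exists n\,A(\vartheta\alpha, n) \to \forall\alpha \in D(\Theta)\,\exists m\,\exists n\,\forall\beta \in D(\Theta)\,(\vartheta\bar\alpha(m) = \vartheta\bar\beta(m) \to A(\vartheta\beta, n))$.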

Proof straightforward.

Standard construction. $\langle x_n\rangle_n$ is a given sequence. Let $X$ be the set of finite sequences of elements of $\langle x_n\rangle_n$, and let $R$ be a relation defined for pairs $\langle\sigma, x\rangle$, $\sigma \in X$, $x \in \langle x_n\rangle_n$. A sequence $\sigma = \langle y_1, \ldots, y_n\rangle$ is called admissible if

$R(\emptyset, y_1)\ \&\ R(\langle y_1\rangle, y_2)\ \&\ \ldots\ \&\ R(\langle y_1, \ldots, y_{n-1}\rangle, y_n)$.

A sequence $\alpha$ is called admissible if $\bar\alpha(n)$ is admissible for every $n$. If $\{x : R(\sigma, x)\}$ is enumerable for every $\sigma$, then a spread $\langle\Theta_0, \vartheta\rangle$ can be constructed which contains all admissible sequences and which is homogeneous. This is done as follows: Let $\langle\sigma_n\rangle_n$ be an enumeration of $X$, $\sigma_1 = \emptyset$. We suppose $\{x : R(\sigma_n, x)\}$ to be enumerated as $\langle x(n, m)\rangle_m$. If to every sequence $\langle m_1, \ldots, m_t\rangle$ of length $t$ a number $n$ has been determined such that

$\vartheta\langle m_1, \ldots, m_t\rangle = \sigma_n$,

then we put

$\vartheta\langle m_1, \ldots, m_t, m\rangle = \sigma_n * \langle x(n, m)\rangle$

for every $m$.

3. Applications

In intuitionistic mathematics all applications of Brouwer’s principle were hitherto in fact applications of the fan theorem, which is a combination of the bar theorem and Brouwer’s principle for a special case, the case of finitary spreads ([2], p. 59). The following examples show that the general form of Brouwer’s principle has strong consequences too.

THEOREM 2. $I \subset N$, $W_i$ subsets of a space $\Gamma$.

$\bigcup\{W_i : i \in I\} \supset \Gamma \to \bigcup\{\text{Interior } W_i : i \in I\} \supset \Gamma$

([4], Theorem 6).

Proof. In this as well as in the other applications treated, the essential idea of the proof is always the construction of a suitable spread to which Brouwer’s principle can be applied.

$\Gamma$ is separable, therefore we may suppose $\langle p_n\rangle_n$ to be a sequence dense in the space. $\rho$ is a metric for $\Gamma$. We put $U_{i,j} = U(i^{-1}, p_j)$ and we define a relation $T$ on ordered pairs by

$T(U_{k,l}, U_{i,j}) \leftrightarrow k^{-1} > i^{-1} + \rho(p_j, p_l)$.

We remark that $T(U_{k,l}, U_{i,j})$ implies $U_{i,j} \subset U_{k,l}$. The $U_{i,j}$ constitute a basis for $\Gamma$. $T$ is an enumerable relation, in other words the pairs $\langle U_{k,l}, U_{i,j}\rangle$ for which $T$ holds can be effectively enumerated. For let $\varphi(n, m, k)$ be a rational-valued function such that

$\forall n\,\forall m\,\forall k\,(|\rho(p_n, p_m) - \varphi(n, m, k)| < 2^{-k})$,

then

$T(U_{k,l}, U_{i,j}) \leftrightarrow \exists t\,(k^{-1} > i^{-1} + \varphi(j, l, t) + 2^{-t})$,

and to every pair $\langle k, l\rangle$ a pair $\langle i, j\rangle$ can be found such that $T(U_{k,l}, U_{i,j})$. Therefore, if we put for $\sigma = \langle U_{k_1,l_1}, \ldots, U_{k_t,l_t}\rangle$

$R(\sigma, U_{i,j}) \equiv T(U_{k_t,l_t}, U_{i,j})$,

we can carry out the standard construction of a spread $S = \langle\Theta_0, \vartheta\rangle$ such that

$\gamma \in S \leftrightarrow \forall n\,T(\gamma(n), \gamma(n+1))$.

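In the sequel of this proof we suppose $\alpha, \beta \in S$. If diameter $\alpha(n) \le 2n^{-1}$, then diameter $\alpha(n+1) \le 2(n+1)^{-1}$. Hence $\bigcap_n \alpha(n)$ contains exactly one point of $\Gamma$ for every $\alpha \in S$. If we define $\sim$ by

$\alpha \sim \beta \leftrightarrow \bigcap_n \alpha(n) = \bigcap_n \beta(n)$,

then $\langle\Theta_0, \vartheta, \sim\rangle$ represents $\Gamma$, hence

The following property holds for $S$, as is readily verified:

(2) $\forall\alpha\,\forall p \in \Gamma\,\forall n\,(p \in \alpha(n) \to \exists\beta\,(\bar\alpha(n) = \bar\beta(n)\ \&\ \bigcap_n \beta(n) = \{p\}))$.

We introduce a predicate $A$ by

(3)

Then

(4) $\forall\alpha\,\exists n\,A(\alpha, n)$.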

Applying Theorem 1, we obtain

(5) $\forall\alpha\,\exists m\,\exists n\,\forall\beta\,(\bar\alpha(m) = \bar\beta(m) \to A(\beta, n))$.

Take a $p \in \Gamma$, construct an $\alpha$ according to (1), $n$, $m$ according to (5) and we obtain

(6) $\forall\beta\,(\bar\alpha(m) = \bar\beta(m) \to A(\beta, n))$.

We apply (2) with respect to $\alpha$, $m$:

(7) $\forall q\,(q \in \alpha(m) \to \exists\beta\,(\bar\beta(m) = \bar\alpha(m)\ \&\ \bigcap_n \beta(n) = \{q\}))$.

If $q \in \alpha(m)$, and $\beta$ is chosen according to (7), then (by (6)) $A(\beta, n)$, hence $\bigcap_n \beta(n) \subset W_n$, so $q \in W_n$. This holds for every $q \in \alpha(m)$, therefore $\alpha(m) \subset W_n$, hence $p \in \alpha(m) \subset \text{Interior } W_n$.

THEOREM 3. If $f$ is a mapping from a space $\Gamma$ into a space $\Gamma''$, then $f$ is continuous.

Proof ([5], 3.2.23). Let $\langle p_i\rangle_i$ be a sequence dense in $\Gamma''$. We put $W_{n,i} = \{q : q \in \Gamma\ \&\ \rho(fq, p_i) < 2^{-n}\}$. $\langle W_{n,i}\rangle_i$ covers $\Gamma$ for every $n$, therefore $\langle\text{Interior } W_{n,i}\rangle_i$ is also a covering for every $n$. Take a $q \in \Gamma$, $n \in N$. Then $q \in \text{Interior } W_{n,m}$ for a certain $m$, so $\forall r\,(r \in \text{Interior } W_{n,m} \to \rho(fr, p_m) < 2^{-n})$, hence $\forall r\,(r \in \text{Interior } W_{n,m} \to \rho(fr, fq) < 2^{-n+1})$. This proves continuity.

Example. Let $\Delta$ be a topological space. We consider the properties (A) and (B):

(A) For every $\Gamma''$ and every mapping $f$ from $\Delta$ into $\Gamma''$, $f$ is continuous.
(B) = the conclusion of Theorem 2.

(A) and (B) are not of the same strength; $[0,1] \cup [1,2]$ (with the topology induced by the real line) satisfies (A) without satisfying (B). For Interior $[0,1] = [0,1)$, Interior $[1,2] = (1,2]$; $\{[0,1], [1,2]\}$ is a covering, but $\{[0,1), (1,2]\}$ is not.

DEFINITION 4. $V, W \subset \Gamma'''$.

$V \Subset W \leftrightarrow \forall p \in \Gamma'''\,\exists\varepsilon\,(U(\varepsilon, p) \cap V = \emptyset \vee U(\varepsilon, p) \subset W)$.

THEOREM 4. $V, W \subset \Gamma$. $V \Subset W \leftrightarrow V^c \cup W = \Gamma$.

Proof ([5], 3.2.24). The implication from the left to the right is trivial; in the other direction it is proved thus. $\{V^c, W\}$ covers $\Gamma$, hence by Theorem 2 $\{\text{Interior } V^c, \text{Interior } W\}$ covers $\Gamma$. Therefore $\forall p \in \Gamma\,(p \in \text{Interior } V^c \vee p \in \text{Interior } W)$, so for every $p$ there exists an $\varepsilon$ such that $U(\varepsilon, p) \subset V^c \vee U(\varepsilon, p) \subset W$.

DEFINITION 5. A mapping $f$ from a space $\Gamma_1'''$ into a space $\Gamma'''$ (with metric $\rho$) is called sequentially continuous if for every converging sequence $\langle x_n\rangle_n$ with limit $x$ ($x \in \Gamma_1'''$, $\langle x_n\rangle_n \subset \Gamma_1'''$) we have

$\forall k\,\exists m\,\forall n\,(n \ge m \to \rho(fx_n, fx) < 2^{-k})$.

THEOREM 5. Iff is a mapping from r' into r",thenfis sequentially continuous. Proof. See [4],Theorem 1. THEOREM 6. Iff is a sequentially continuous mapping from I"' into r'", then f is continuous. Proof. See [4], Theorem 2. Remark 1 . By combination of Theorems 5 and 6 we obtain another proof of Theorem 3. DEFINITION 6. A pointset V c r is called located, if V p E T V s ( 3 q d ' ( q d ( e , p ) n V ) v 3 6 ( U ( 6 , p ) n V = 0)).

Classically, every pointset is located. THEOREM 7 . Let V , W c r, V located. Then

GW. I.'-cInteriorW-V (Compare [ 5 ] , 3.3.7.) Proof. Only the implication from the left to the right needs proof. If V is located, then V - is located too; for if q E U ( E p, ) n V , then also q~ U(e, p ) n V - , and if U(6, p ) n V=O, then U(2-'6, p ) n V - =O. V G W t , V - G W , for if U ( E , p ) n V=Ov U ( s , p ) c W, then U ( 2 - ' e , p ) n V - = O v v U(2-'e, p ) c W. Hence we may suppose V to be closed in our proof without losing generality. be a sequence Suppose V c Interior W, V located. Let p E r , and let (4.). of points, and (k(n)), a sequence of natural numbers such that V n (q,E

I/ n ~ ( 2 - ~ ( "p' ), v

U(2-k'"', p ) n v = 0).

We may suppose Vn(k(n)>n). If U ( 2 - k ( 1 )p,) n V = @ ,there are no difficulties. Suppose therefore q1E U ( 2 - k ( 1 )p, ) n V . We define a spread S, = (@,$) as follows <0), l+t,,=t,-, + 1)&q,+lEU(2-k(m+1),P))). $(tl,

..., t,>

=

( q c , , ..., 4 r J .

If aeSp, then V n ( a ( n ) ~ V )and , since V is closed, limaE V. We have VqE V 3 l ~ ( U ( 2 -q~), c W ) ,and we define for a e D ( O ) :

A ( a , k ) ~ U ( 2 - lim9a) ~, c W.

We see that

Applying (IA*) we obtain

(I)

V ~ E D ( O3kA(cc, ) k).

vaED(0)3n3mVBED(O)((01(m)=B(m)--tA(P,II))&nZ>12).

We define a special a * e D ( O ) :

a*(n)=a*(n+ I ) t t ~ ( 2 - ~ ( " + ' ) , p ) n ~ = P ) . We apply (1) and find n, m for a* such that

(2)

V ~ E D ( O ) ( C I * ( ~ ) = B ( ~ ) - ,nA) )(& Bm , >n.

We remark that (3)

Vt3P E D (0)( t

> m & qt E u (2-k(t),p ) n v

--t

01" ( m ) = = p(m)&Iim$P = q t ) .

+

If a* (m)= a* (m- l), then U ( 2 - k ( m )p, ) n V= 0 ;if a* (m)= a* (m- 1) 1, then q , ~ U ( 2 - ~ ( ~ ) , pV.) nIn the second case we can take a P E D ( O ) such that E*(m)=P(m),and lim9p=ym (an application of (3)); then A ( p , n) holds i.e. U(2-", lim9p) = U(2-", )4, c W , p ( p , 4,) < 2-k(m)+2-k(rn) - p ( p , 4,) = 6 > 0. We remark that k ( m )2 m>n, hence 2-" 3 2-k(m).Therefore U ( 2 - k ( m )9,") , c W, hence U ( 6 ,p ) c W. So we have proved for an arbitrary p : ~E(U(p E ), n V = 0 v U ( E ,p ) c W )

and this proves $V \Subset W$.

THEOREM 8. Every open covering of a space $\Gamma$ contains an enumerable sub-covering. (Compare [5], 2.2.6.)

Proof. Let $\langle \Theta_0, \vartheta, \simeq \rangle$ be the representation of $\Gamma$ as described in the proof of Theorem 2, and let $\{W_x : x \in X\}$ be an open covering of $\Gamma$; $\langle p_n \rangle_n$ is a sequence, dense in $\Gamma$. We introduce a predicate:

$$A(\alpha, k, m) \leftrightarrow \exists p \in \Gamma\, \exists x \in X\, \Bigl(\bigcap_n \alpha(n) = \{p\} \;\&\; p \in U(2^{-k}, p_m) \subset W_x\Bigr).$$

We see that

$$\forall \alpha\, \exists k\, \exists m\, A(\alpha, k, m).$$


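From (III) we can derive, using the well-known pairing functions for natural numbers ($\alpha, \beta, \gamma \in D(\Theta_0)$):

$$\exists \beta\, \exists \gamma\, \bigl(\forall \alpha\, \exists n\, A(\alpha, \beta(n), \gamma(n)) \;\&\; \forall n\, \exists \alpha\, A(\alpha, \beta(n), \gamma(n))\bigr).$$

We remark that $\langle U(2^{-\beta(n)}, p_{\gamma(n)}) \rangle_n$ covers $\Gamma$; further we have $\forall n\, \exists x\, (U(2^{-\beta(n)}, p_{\gamma(n)}) \subset W_x)$.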
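The text does not say which pairing functions are meant. As an illustration only, the following Python sketch uses the Cantor pairing, which is one common choice and an assumption here. The names `pair` and `unpair` are hypothetical. The point is that two existential number quantifiers ($\exists k\, \exists m$) can be coded by a single one ($\exists n$), with $k$ and $m$ recovered from $n$.

```python
from math import isqrt
from typing import Tuple


def pair(k: int, m: int) -> int:
    """Cantor pairing (assumed choice): encode (k, m) as one natural number."""
    if k < 0 or m < 0:
        raise ValueError("pair is defined on natural numbers only")
    s = k + m  # index of the anti-diagonal containing (k, m)
    return s * (s + 1) // 2 + m


def unpair(n: int) -> Tuple[int, int]:
    """Inverse of pair: recover (k, m) from the code n."""
    if n < 0:
        raise ValueError("unpair is defined on natural numbers only")
    # s is the largest natural number with s * (s + 1) // 2 <= n.
    s = (isqrt(8 * n + 1) - 1) // 2
    m = n - s * (s + 1) // 2
    k = s - m
    return k, m
```

Under this coding, a statement of the form $\exists k\, \exists m\, A(\alpha, k, m)$ can be read as $\exists n\, A(\alpha, k_n, m_n)$ with `(k_n, m_n) = unpair(n)`, which is the shape needed to extract the two sequences $\beta$ and $\gamma$.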

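Applying (V) we obtain therefore a sequence $\langle W_{x(n)} \rangle_n$ which covers $\Gamma$.

Remark 2. More generally, this theorem can be proved for topological spaces which can be represented by a spread.

THEOREM 9. Let $V$ be a set of sequences of elements of $\langle x_n \rangle_n$, and let $\simeq$ be an equivalence relation defined for pairs of elements of $V$; the corresponding set of equivalence classes is denoted by $V^*$. We suppose $V^*$ to be represented by $\langle \Theta, \vartheta, \simeq \rangle$; $\langle \Theta, \vartheta \rangle = S \subset V$. Let $T$ be a predicate such that

$$\forall \alpha \in V\, \exists \beta \in V\, (\alpha \simeq \beta \;\&\; T(\beta)).$$

Then there exists a representation $\langle \Theta_0, \vartheta', \simeq \rangle$, $\langle \Theta_0, \vartheta' \rangle = S_0$, such that $\forall \alpha \in S_0\, (T(\alpha))$.

Proof. Let $\chi$ be the standard mapping from $D(\Theta_0)$ onto $D(\Theta)$ according to Definition 2. For $\alpha, \beta \in D(\Theta_0)$ we introduce a predicate $A$:

$$A(\alpha, \beta) \leftrightarrow \vartheta\chi\alpha \simeq \langle x_{\beta(n)} \rangle_n \;\&\; T(\langle x_{\beta(n)} \rangle_n).$$

We see that

$$\forall \alpha\, \exists \beta\, A(\alpha, \beta).$$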

Hence we can apply (IVA”), and obtain a function $\psi$ as required there. We define, for every $\langle m_1, \ldots, m_n \rangle \in \Theta_0$, a finite sequence of natural numbers indexed by $m_1, m_2, \ldots, m_n$, such that


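Let $\gamma \in V$. Then $\exists \alpha \in D(\Theta_0)\, (\vartheta\chi\alpha \simeq \gamma)$. We can find a $\beta$ such that $A(\alpha, \beta)$, i.e.

$$\gamma \simeq \vartheta\chi\alpha \simeq \langle x_{\beta(n)} \rangle_n \;\&\; T(\langle x_{\beta(n)} \rangle_n).$$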

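This proves our theorem.

A useful topological application of this theorem is the following theorem (cf. [5], 3.2.20).

THEOREM 10. Let $\langle W_n \rangle_n$ be a sequence of closed pointsets of $\Gamma$, and let $\Gamma$ be represented by $\langle \Theta, \vartheta, \simeq \rangle$, $\langle \Theta, \vartheta \rangle = S$, such that

$$\forall \alpha \in S\, \forall n\, \Bigl(\alpha(n) \in \langle W_n \rangle_n \;\&\; \exists p \in \Gamma\, \bigl(\{p\} = \bigcap_n \alpha(n)\bigr)\Bigr).$$

If to every sequence $\langle W_{n(i)} \rangle_i$, $\bigcap_i W_{n(i)} = \{p\}$, a sequence $\langle W_{m(i)} \rangle_i$ can be found such that $\bigcap_i W_{m(i)} = \{p\}$, $\forall i\, (W_{m(i+1)} \subset W_{m(i)})$, then there exists a representation $\langle \Theta_0, \vartheta', \simeq \rangle$, $\langle \Theta_0, \vartheta' \rangle = S'$, such that $\forall \alpha \in S'\, \forall m\, (\alpha(m+1) \subset \alpha(m))$.

Proof. Immediate by an application of Theorem 9.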

References

1. A. HEYTING, Intuitionism, an introduction (second revised edition, Amsterdam, North-Holland Publ. Co., 1966).
2. S. C. KLEENE and R. VESLEY, The foundations of intuitionistic mathematics (Amsterdam, North-Holland Publ. Co., 1965).
3. G. KREISEL, Reports of the seminar on Foundations of Analysis, Stanford University, Summer 1963 (mimeographed), Section IV: Theory of free choice sequences of natural numbers.
4. A. S. TROELSTRA, Intuitionistic continuity, Nieuw Archief voor Wiskunde (3), 15 (1967) 2-6.
5. A. S. TROELSTRA, Intuitionistic general topology, Thesis, Amsterdam 1966.
