Cx)E.LS(A) and QI= r(
<
t
PROOF: At first we show that it is sufficient to consider formulas 1jr(xO" •• ,xn_ 1) with parameters in (rcx=x) cg))n only. Assume that all parameters of 1/f(xO' •• ' '~-1) are contained in some model M of T. If P is any type over M realized in r(x=x)(Q), then by a result of Lascar and Poizat [8, Cor. 7.3.J p is the only extension of p~r(x=x)(M) to M. Let
Lascar Rank in Non-Multidimensional w - Stable Theories
i = 0, ..., n-1. Let h be the mapping that carries p ∈ S_n(M) into p ↾ r(x=x)(M). From the considerations above it follows that h is a bijection between the compact Hausdorff spaces
<
r<x=x»M and
>
the h-image of (-!J'(xo ' ••. '~-1 »M in
€S
"Geometrical" Stability Theory
§2. WHAT IS A GEOMETRICAL ANALYSIS OF A THEORY?
To date there has only been one kind of analysis which has yielded any positive results, and that is to show that all models of a particular theory are "geometrically simple". Such an analysis usually proceeds in two rather distinct steps:
1. Local Analysis: Show that all minimal sets in the theory are geometrically simple.
2. Global Analysis: Show that local simplicity implies global simplicity.
Before I go any further I must define what I mean by "geometrically simple". At this point I am going to restrict our attention to theories of finite rank (i.e., R(x=x) < ω). Exactly what the proper notions are for theories of infinite rank is only now becoming clear.

Local simplicity. Let D be a minimal set. Recall that for any set A, acl(A) denotes ∪{Y : Y is finite and definable with parameters from A}. Let cl(-) denote acl(-) ∩ D. The set D with the closure operator cl(-) forms a matroid or pregeometry (see the Appendix). Completely characterizing the pregeometries of the form (D, cl(-)), for some D, seems hopelessly difficult. But in deciding whether a structure is simple it seems to come down to looking for one property, namely, local modularity (see Definition A.2 in the Appendix). So, formally, we say a minimal set is simple if it is locally modular. A theory is locally simple if every minimal set is locally modular. That local modularity is the crucial property to look for is largely due to its effect on the way that sets of, e.g., rank 1 can be "pieced together" to form sets of rank 2. This is a global property which I now describe explicitly.

Global simplicity. The general question I am addressing here is: How do the sets of rank n combine to form a set of rank n+1? As in the local case there seems to be one governing property to look for. Informally, a theory T is called 1-based if we do not have the following picture:

[Figure: a set P containing the overlapping subsets A_0, A_1, A_2, ..., A_n, ...]
S. Buechler
The picture is formalized by saying: there is a definable set P of ∞-rank n+1 and, for i < ω, a definable A_i ⊂ P such that R(A_i) = n; for all i, j there is an automorphism of ℭ taking A_i to A_j; R((A_i \ A_j) ∪ (A_j \ A_i)) < n; and there is a c ∈ ∩_{i<ω} A_i such that R(c/∅) = n+1. A more usable definition of 1-based does not require the theory to have finite ∞-rank: a stable theory T is 1-based if for all saturated models M of T and complete types p over M, there are a in M and A ⊂ M such that a realizes p↾A, p does not fork over A and p does not fork over {a}. (Pillay calls such theories weakly normal in [17].) While the geometrical content in this definition is not obvious, it is a more usable formulation. I must caution you, though, that when defining global simplicity for theories of infinite ∞-rank a broader condition is more appropriate.

Theories which are not simple. To see that we actually have the right notion of simplicity we have to show that theories which are not simple are geometrically complicated in some sense. Here Lachlan's concept of a pseudoplane arises [15]. Let P and Q be sets and I ⊂ P × Q a binary relation. (P,Q,I) is called a pseudoplane if for all p ∈ P there are infinitely many ℓ ∈ Q such that pIℓ; for all ℓ ∈ Q there are infinitely many p ∈ P such that pIℓ; for all p' ≠ p ∈ P there are only finitely many ℓ ∈ Q such that pIℓ and p'Iℓ; for all ℓ ≠ ℓ' ∈ Q there are only finitely many p ∈ P such that pIℓ and pIℓ'. The elements of P are thought of as the "points", Q the "lines" and I as the incidence relation. The most obvious example of a pseudoplane is an infinite projective plane. Pseudoplanes enter this discussion via

THEOREM 1 (Zil'ber [28]). If D is strongly minimal and not locally modular, then Th(D)^eq contains a definable pseudoplane.

(See [4] for an alternate proof.)

THEOREM 2 (Pillay [17]). A stable theory T is not 1-based iff T^eq contains a type-definable pseudoplane.
(A pseudoplane (P,Q,I) is type-definable if P and Q are the sets of realizations of types and I = I_0 ∩ (P × Q) for some definable relation I_0.) I should remark at this point that the condition "T^eq does not contain a definable pseudoplane" is a perfectly good formulation of global simplicity in the context of ℵ₁-categorical theories. This is the condition which Zil'ber seems to work with. I have defined the property 1-based largely to give an easily applied condition in the broader context of all stable theories. So, we see that the theories which are not simple are exactly those which contain a type-definable pseudoplane. At first glance this looks quite promising. We can at least partially study complicated theories just by studying pseudoplanes. However, very little is known about pseudoplanes, or even projective planes. Here are a couple of conjectures which illustrate what we cannot prove. The first is essentially Conjecture C in [29].

CONJECTURE 1. Every ℵ₁-categorical projective plane is Desarguesian, hence a projective plane over an algebraically closed field.

An easier problem is

CONJECTURE 2. In an ℵ₁-categorical projective plane the set of points must have Morley rank 2.

I have been able to prove this conjecture only under the assumption that all points realize the same type over the empty set (unpublished). Theorem 1 is not very satisfying as a description of a "locally complicated" theory since it does not speak explicitly of the pregeometry (D, cl). But to date there are no significant theorems about non-locally modular strongly minimal sets. There is only the following conjecture due to Zil'ber.
CONJECTURE 3. For every non-locally modular strongly minimal set D there is an algebraically closed field F such that D is definable in F^eq and F is definable in D^eq.

See [29] for a complete discussion of this conjecture and its implications. (Actually, if D is definable in F^eq then it is definable in F by [19].)
§3. THEOREMS INVOLVING A GEOMETRICAL ANALYSIS

A. ω-Categorical, ω-Stable Theories
The first theorems of a geometrical nature came about in this context. Indeed, this is where many of these ideas were formulated. A theory is called totally categorical if it has exactly one model, up to isomorphism, in every infinite power. The conjecture motivating the following results was that a totally categorical theory is not finitely axiomatizable. Zil'ber saw as early as 1977 that if a totally categorical theory does not contain a definable pseudoplane, then it is not finitely axiomatizable [23]. Then, later, he showed that the strongly minimal sets in a totally categorical theory are locally modular iff the theory does not contain a definable pseudoplane [24]. The real breakthrough came in 1980 with

THEOREM 3 (Cherlin, Mills, Zil'ber). If D is strongly minimal and ℵ₀-categorical, then D is locally modular.

Cherlin and Mills independently showed that the theorem follows from the classification of finite simple groups [11]. Zil'ber's proof was to show directly that there is no totally categorical pseudoplane [26],[27]. The global picture for ω-categorical, ω-stable theories was thoroughly analyzed in [11]. They developed sophisticated machinery for "dissecting" such theories. The concept of 1-based was extracted from their proof of the finiteness of the fundamental order [11, 6.3]. A complete discussion of these results can be found in the introduction to [11] and in [29].

B. Superstable, Non-ω-stable Theories
The main theorem at the local level is

THEOREM 4 (Buechler [2]). If D is an s.w.m. set which is not locally modular, then it is strongly minimal.

The essential part of the proof is to show that a type-definable pseudoplane with points of ∞-rank 2 must be ω-stable. There is a very similar theorem for minimal sets but it is more difficult to state. For the remainder of the subsection assume that T is a superstable theory of finite ∞-rank.
The next theorem says that local simplicity implies global simplicity.

THEOREM 5 (Buechler [3]). If in T^eq every minimal set is locally modular, then T is 1-based.
So, we can intuitively say that a superstable theory can be divided into a simple part and an ω-stable part. There are theorems which formalize this intuition but they are quite difficult to state (see [5],[6]). Perhaps the best way to combine Theorems 4 and 5 is with

COROLLARY 6. If T is unidimensional and not ℵ₁-categorical, then T is 1-based.

(The unidimensional theories are the properly superstable analogues of the ℵ₁-categorical theories. See [3] for a proof.)

C. Groups
That these geometrical concepts do have a significant effect on the overall structure of the models is best illustrated by

THEOREM 7 (Hrushovski, Pillay [14]). If G is a group and Th(G) is stable and 1-based, then G has a definable abelian subgroup of finite index.
I observed that by generalizing Zil'ber's proof of the ℵ₁-categoricity of simple algebraic groups [30], and combining Corollary 6 and Theorem 7 we get

COROLLARY 8. If G is a simple group and R(G) < ∞, then Th(G) is ω-stable.
D. Vaught's Conjecture for Weakly Minimal Theories
It is rather disturbing that the problem in the title is still open. All we have is a collection of s.w.m. sets and an algebraic closure operator. But to date only partial results have been obtained (see [20], [7], [8] and [9]). I consider this a global geometrical problem in that we need to know how the 2^ℵ₀ many s.w.m. sets in the universe interact. A type of ∞-rank 1 is said to have finite multiplicity if its set of realizations is the union of finitely many s.w.m. sets. Let (S) denote the condition:

For all finite A and complete types p over A of ∞-rank 1, if p is non-isolated, then p has finite multiplicity.

THEOREM 9 (Buechler [7]). Suppose T has ∞-rank 1, fewer than 2^ℵ₀ many countable models, and satisfies (S). Then T has countably many countable models.

The following is essentially due to Saffe and appears to be the key to Vaught's conjecture for weakly minimal theories.

CONJECTURE 4. If T is superstable and (S) does not hold, then T has 2^ℵ₀ many countable models.
The next results may be helpful in proving Saffe's conjecture as well as other theorems.

THEOREM 10 (Buechler [8]). Suppose that T is unidimensional, has countably many types over ∅, and (D, cl) is a weakly minimal pregeometry. Then for all finite A ⊂ D, cl(A) intersects only finitely many s.w.m. sets.
A surprising consequence of this theorem is that, unless RM(D) = 1 in the geometry 𝔇, 𝒟 contains many finite sets. (Here 𝔇 = (D, 𝒟, ∈) is the incidence geometry as defined in the appendix.) This is used to prove

THEOREM 11 (Buechler [8]). Let T be unidimensional and have countably many types over ∅. Suppose that the universe D is weakly minimal and (D, cl) is locally modular. Then every element of 𝒟 is finite. Furthermore, Saffe's conjecture is true for T.

THEOREM 12 (Buechler [9]). Suppose that M is a module over the ring R and Th(M) is weakly minimal with countably many types over ∅. Let I ⊂ R be the ideal consisting of those r ∈ R such that {a ∈ M : ra = 0} is infinite. Then R/I is a finite field.

It follows easily that 𝔇, the geometry associated to M, is just projective geometry over R/I. Thus, we may apply Theorem 11 to prove Saffe's conjecture, hence Vaught's conjecture, for weakly minimal modules.
E. Geometry in Theories of Infinite Rank
Here I want to indicate what progress has been made in the study of geometrical properties of arbitrary superstable theories. Time constraints prohibit me from defining many of the terms I will use. Pillay [18] defines a type p ∈ S(A) to be good if whenever I ⊂ p(ℭ) is an infinite set of pairwise A-independent indiscernibles, I is independent. It appears that the proper notion of a geometrically simple superstable theory is one in which every type is good. Any 1-based theory has this property, and conversely if the theory has finite rank. There are, however, superstable theories with only good types which are not 1-based. In analogy to Theorem 5 Pillay proves

THEOREM 13 (Pillay [18]). Let T be superstable. Then every type in T^eq is good iff every stationary p in T^eq with U(p) = ω^α, for some α, is good and locally modular.
(In the finite rank context the minimal types are those with U(p) = ω^α, for some α.) Pillay and I independently proved a coordinatization theorem for such theories which generalizes the one found in [3]. It is generally accepted that the following is the major open problem in the area.

CONJECTURE 5. Suppose that T is superstable and U(p) = ω^α for some α. If p is not locally modular, then p has global finite multiplicity.

If the reader is unfamiliar with global finite multiplicity [6], he may read "strongly regular" with little loss in content. I have proved this conjecture under the additional assumption that p is good. As this discussion indicates, attention should focus on those types of U-rank ω^α which are not good. It is known that there are locally modular types of U-rank ω which are not good.

F. Obtaining Definable Groups
The most standard example of a modular non-trivial minimal type is a vector space over a division ring. Hrushovski has shown that, essentially, every such type arises in this manner.
THEOREM 14 [13]. Let T be stable, p a non-trivial, locally modular, stationary regular type. Then there exists a connected abelian group A, ∧-definable over some set B, whose generic type is regular and domination equivalent to p. Forking on A is given as follows: Let A_0 = {a ∈ A : stp(a/B) ⊥ p}. If a_1, ..., a_n, b ∈ A\A_0, then b forks with a_1, ..., a_n iff there exists a definable subgroup S ⊂ A_0, an element c ∈ A_0, and definable homomorphisms α_1, ..., α_n : A → A/S such that b/S = Σ α_i a_i + c/S.

(A set is ∧-definable if it is the set of realizations of some possibly incomplete type.)
Hrushovski shows that the set of definable homomorphisms arising as above forms a division ring D. When p is minimal, A_0 = acl(B) ∩ A and the only structure on A/A_0 is as a D-vector space.

APPENDIX: An Outline of the Relevant Geometry
There are basically two kinds of "geometries" which arise in this area: incidence geometries and combinatorial geometries arising from closure operators. An incidence geometry is a triple (P,Q,I) where P is the set of "points", Q is the set of "blocks", and I ⊂ P × Q is the incidence relation. (See, for example, the discussion of pseudoplanes in §2 above.)

Let S be a set and cl a unary operator on the set of subsets of S. cl is a closure operator if for all X, Y ⊂ S: X ⊂ cl(X); X ⊂ Y ⇒ cl(X) ⊂ cl(Y); and cl²(X) = cl(X). (See [10, p.18].) cl is called algebraic if cl(X) = ∪{cl(Y) : Y ⊂ X and Y is finite}. X ⊂ S is called closed if cl(X) = X.

Definition A.1. If cl is an algebraic closure operator on S, then (S, cl) is called a pregeometry if it satisfies the exchange property: for all a, b ∈ S and A ⊂ S, if a ∈ cl(A ∪ {b}) and a ∉ cl(A) then b ∈ cl(A ∪ {a}). A pregeometry (S, cl) is called a geometry if for all singletons a ∈ S, cl({a}) = {a}. For A ⊂ S let A# = {cl({a}) : a ∈ A is a singleton}. We can associate to any pregeometry (S, cl) = G a geometry (S#, cl#) = G# by letting cl#(A) = cl(∪A)# for A ⊂ S#. If H has ∞-rank 1, then (H, acl(-) ∩ H) is a pregeometry. The notion of independence used in studying strongly minimal sets generalizes easily to any pregeometry, and using the exchange principle we get a well defined dimension for any subset of S. A line is a closed set of dimension 2.

Definition A.2.
Let G = (S, cl) be a pregeometry. G is modular if (*) holds for all closed X, Y ⊂ S:

(*) dim(X ∪ Y) + dim(X ∩ Y) = dim(X) + dim(Y).

We say G is locally modular if (*) holds for all closed X, Y ⊂ S with X ∩ Y ≠ cl(∅). G is called trivial if for all X, Y ⊂ S, cl(X ∪ Y) = cl(X) ∪ cl(Y). A trivial geometry is called disintegrated.

If A ⊂ S we define the localization of (S, cl) at A as follows. Let S' = S\A and let cl'(B) = cl(B ∪ A)\A. Then (S', cl') is the localization of G at A and is denoted G/A. Notice that if G is a pregeometry, then so is G/A. It is easy to prove

THEOREM A.3. If G = (S, cl) is locally modular, then for all e ∈ S\cl(∅), G/{e} is modular.

Examples.
(1) Let V be an infinite dimensional vector space over a field F. V is strongly minimal so (V, acl) is a pregeometry. The usual dimension rule on subspaces says that this is modular.
(2)
Let Q be the rationals and consider the abelian group (Q,+). Define the ternary operation f by: f(x,y,z) = x + y - z. Let D = (Q, f). Then D is non-modular. For any a ∈ Q we can recover + from f(x,y,a), with a as the new zero. The other operators, f(x,y,b), for b ∈ Q, are definable from f(x,y,a), so (Q, acl)/{a} is modular. Thus D is locally modular.

We consider projective geometry over a division ring to be formulated as follows. Let V be a module of dimension ≥ 2 over a division ring K. For A ⊂ V let cl(A) denote the submodule generated by A; let A# = {cl(v) : v ∈ A and v ≠ 0}. Let 𝒟 = {cl(A)# : A ⊂ V, 2 ≤ dim(A) < ℵ₀}. Then (V#, 𝒟, ∈) is a projective geometry over K. The dimension of the space is infinite if dim(V) = ∞; otherwise it is dim(V) - 1 (see, e.g. [1]).
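Returning to Example (2), the two algebraic claims made there can be checked mechanically on sample rationals. The following is only a sketch: the helper names `plus_with_zero`, `minus_with_zero` and `f_from` are hypothetical, and the inverse for the shifted addition is written out directly as 2a - x, although inside the group it is definable as the unique n with x ⊕ n = a.

```python
from fractions import Fraction


def f(x, y, z):
    """The ternary operation of Example (2)."""
    return x + y - z


def plus_with_zero(a):
    """Addition recovered from f(x, y, a); a acts as the new zero."""
    return lambda x, y: f(x, y, a)


def minus_with_zero(a):
    """Inverse for plus_with_zero(a), written out directly as 2a - x."""
    return lambda x: a + a - x


def f_from(a, b):
    """The operator f(x, y, b), rebuilt from the a-shifted group operations."""
    add, neg = plus_with_zero(a), minus_with_zero(a)
    return lambda x, y: add(add(x, y), neg(b))


samples = [Fraction(n, d) for n in range(-3, 4) for d in (1, 2, 5)]
a, b = Fraction(2, 3), Fraction(-7, 4)
add, neg = plus_with_zero(a), minus_with_zero(a)
```

With these definitions, `add(a, x)` returns `x` and `add(x, neg(x))` returns `a` for every sample `x`, and `f_from(a, b)(x, y)` agrees with `f(x, y, b)`, which is the sense in which the other operators come from f(x,y,a).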
We call a geometry projective if for all a, b ∈ S and Z ⊂ S, a ∈ cl(Z ∪ {b}) implies that there is a c ∈ cl(Z) such that a ∈ cl(b,c). (It is clear from the remarks on dependence relations in [21, VII] that this is equivalent to the definition given in [21, V] for a projective space.)

We associate with a geometry G an incidence geometry 𝔇 = (S, 𝒟, ∈) by letting 𝒟 = {cl(A) : A ⊂ S, 2 ≤ dim(A) < ℵ₀}. (For G a geometry 𝔇 always denotes this incidence geometry.) If G is a pregeometry, then 𝔇 denotes the incidence geometry of G#.

If G is a projective geometry of dimension ≥ 3, then it "is" projective geometry over a division ring in the sense that 𝔇 is isomorphic to such an incidence geometry (see, e.g. [21, V, §2]). If, in addition, lines are finite in 𝔇, then it is projective geometry over a finite field.
We call a pregeometry G projective if G# is projective. We call a geometry (or pregeometry) G locally projective if for any a ∈ S\cl(∅), G/a is a projective pregeometry. A pregeometry G is locally finite if for every finite A ⊂ S#, cl#(A) is finite. The relevance of projectivity lies in

THEOREM A.4. If a pregeometry G is modular and non-trivial, then it is projective.

Proof: Easy.
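For a finite illustration of Theorem A.4, one can take the three-dimensional vector space over the two-element field with linear span as the closure operator and verify exchange, the modular law (*) and the projective condition by brute force. This is a sketch under stated assumptions: the choice of F_2^3 and all function names are mine, exchange is tested only for closed A (which suffices because cl(A ∪ {b}) = cl(cl(A) ∪ {b})), and the projective condition is tested on the pregeometry itself, where cl(∅) = {0}, rather than on the associated geometry G#.

```python
from itertools import combinations, product

V = list(product((0, 1), repeat=3))  # the 8 vectors of F_2^3
ZERO = (0, 0, 0)


def add(u, v):
    """Vector addition over F_2."""
    return tuple((a + b) % 2 for a, b in zip(u, v))


def cl(A):
    """Linear span of A; over F_2 adjoining v adds exactly the translates s + v."""
    span = {ZERO}
    for v in A:
        span |= {add(s, v) for s in span}
    return frozenset(span)


def dim(A):
    """Dimension of the span of A, using |span| = 2 ** dim."""
    return len(cl(A)).bit_length() - 1


# Every subspace of F_2^3 is spanned by at most 3 vectors.
CLOSED = {cl(A) for r in range(4) for A in combinations(V, r)}


def has_exchange():
    return all(b in cl(A | {a})
               for A in CLOSED for a in V for b in V
               if a in cl(A | {b}) and a not in A)


def is_modular():
    return all(dim(X | Y) + dim(X & Y) == dim(X) + dim(Y)
               for X in CLOSED for Y in CLOSED)


def is_projective():
    return all(any(a in cl({b, c}) for c in Z)
               for Z in CLOSED for a in V for b in V
               if a in cl(Z | {b}))


LINES = [X for X in CLOSED if dim(X) == 2]
```

The lines of the resulting geometry are the two-dimensional subspaces; each contains three nonzero vectors, so the associated incidence geometry is the seven-point projective plane over F_2.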
In general, we cannot recover the structure on a geometry G simply from the fact that G/{a} is projective. However, by [12],

THEOREM A.5. Suppose the geometry G is locally projective, locally finite, and regular (i.e., all lines contain the same number of points). Then 𝔇 is affine or projective geometry over a finite field.

REFERENCES
[1] Biggs and White, Permutation Groups and Combinatorial Structures, London Mathematical Society Lecture Notes No. 33 (Cambridge University Press, Cambridge, 1979).
[2] Buechler, S., "The geometry of weakly minimal types," Journal of Symbolic Logic, vol. 50, no. 4 (December 1985), 1044-1053.
[3] ---, "Locally modular theories of finite rank," Annals of Pure and Applied Logic (to appear).
[4] ---, "One theorem of Zil'ber's on strongly minimal sets," Journal of Symbolic Logic, vol. 50, no. 4 (December 1985), 1054-1061.
[5] ---, "Coordinatization in superstable theories, I: Stationary types," Transactions of the American Mathematical Society, vol. 288, no. 1 (March 1985), 101-114.
[6] ---, "Coordinatization in superstable theories, II," preprint (1985).
[7] ---, "The classification of small weakly minimal sets, I," in the Proceedings of the U.S.-Israel Joint Meeting in Model Theory, Chicago, December 1985 (Springer-Verlag, Heidelberg, to appear).
[8] ---, "The classification of small weakly minimal sets, II," preprint (1986).
[9] ---, "The classification of small weakly minimal sets, III," in preparation.
[10] Burris, S. and Sankappanavar, H.P., A Course in Universal Algebra, Graduate Texts in Mathematics #78 (Springer-Verlag, Heidelberg, 1981).
[11] Cherlin, G., Harrington, L., and Lachlan, A.H., "ℵ₀-categorical, ℵ₀-stable structures," Annals of Pure and Applied Logic, vol. 28, no. 2 (March 1985), 103-136.
[12] Doyen, J. and Hubaut, X., "Finite regular locally projective spaces," Mathematische Zeitschrift, vol. 119, no. 1 (1971), 83-88.
[13] Hrushovski, E., "Locally modular regular types," in the Proceedings of the U.S.-Israel Joint Meeting in Model Theory, Chicago, December 1985 (Springer-Verlag, Heidelberg, to appear).
[14] Hrushovski, E. and Pillay, A., "Weakly normal groups," this volume, 233-244.
[15] Lachlan, A.H., "Two conjectures on the stability of ω-categorical theories," Fundamenta Mathematicae, vol. 81 (1974), 133-145.
[16] Makkai, M., "A survey of basic stability theory, with particular emphasis on orthogonality and regular types," Israel Journal of Mathematics, vol. 49, nos. 1-3 (March 1985).
[17] Pillay, A., "Stable theories, pseudoplanes, and the number of countable models," preprint (1984).
[18] Pillay, A., "Simple superstable theories," in the Proceedings of the U.S.-Israel Joint Meeting in Model Theory, Chicago, December 1985 (Springer-Verlag, Heidelberg, to appear).
[19] Poizat, B., "Une théorie de Galois imaginaire," Journal of Symbolic Logic, vol. 48, no. 4 (1983), 1151-1170.
[20] Saffe, J., "On Vaught's conjecture for superstable theories," preprint (1982).
[21] Seidenberg, A., Lectures in Projective Geometry (Van Nostrand, 1962).
[22] Shelah, S., Classification Theory and the Number of Non-Isomorphic Models (North-Holland, Amsterdam, 1978).
[23] Zil'ber, B.I., "The structure of models of categorical theories and the finite-axiomatizability problem," preprint, mimeographed by VINITI, Dep. N 2800-77 (Kemerovo, USSR, 1977).
[24] ---, "Strongly minimal, countably categorical theories," Siberian Mathematics Journal, vol. 21 (1980), 219-230.
[25] ---, "Totally categorical theories: structural properties and the non-finite axiomatizability," in Model Theory of Algebra and Arithmetic, Lecture Notes in Mathematics 834 (Springer-Verlag, Heidelberg, 1980).
[26] ---, "Strongly minimal countably categorical theories, II," Siberian Mathematics Journal, vol. 25, no. 3 (May-June 1984), 396-412.
[27] ---, "Strongly minimal countably categorical theories, III," Siberian Mathematics Journal, vol. 25, no. 4 (July-Aug. 1984), 559-571.
[28] ---, "Structural properties of models of ℵ₁-categorical theories," in the Proceedings of the International Congress on Logic, Methodology and Philosophy of Science, Salzburg, 1983 (to appear).
[29] ---, "The structure of models of uncountably categorical theories," in the Proceedings of the International Congress of Mathematicians, Warsaw, 1982 (to appear).
[30] ---, "Groups and rings whose theories are categorical" (in Russian), Fundamenta Mathematicae, vol. 95, no. 3 (1977), 173-188.
Logic Colloquium '85 Edited by The Paris Logic Group © Elsevier Science Publishers B.V. (North-Holland), 1987
HOMOGENEOUS DIRECTED GRAPHS. THE IMPRIMITIVE CASE
Gregory L. Cherlin
Rutgers University, Mathematics Department, New Brunswick, New Jersey 08903, U.S.A.

INTRODUCTION
A relational system H is said to be homogeneous if any isomorphism α: A → B between two of its finite substructures is induced by an automorphism of H. Assuming the language is finite, such structures are ℵ₀-categorical, and Lachlan has a very general theorem concerning the classification of the stable ones [4,6,9] which is a refinement (for this special case) of the results of [3]. Roughly speaking the stable homogeneous structures for a fixed finite relational language fall into finitely many families, with the isomorphism type of the structures within a family determined by rather trivial numerical invariants. In particular there are only countably many countable stable homogeneous structures for a given finite relational language.

In certain cases all the homogeneous structures have been classified, though not as a result of any general theory. The homogeneous symmetric graphs or tournaments (directed graphs with any two vertices joined by an edge) were classified in [10] and [8] respectively. The methods of the second paper seem particularly interesting, as the nimbus of a general method seems dimly perceptible. I have shown recently that the same method can be used to classify the homogeneous directed graphs omitting the edgeless graph I_∞ on infinitely many vertices: the tournaments are of course those which omit I_2.

(* Research supported by NSF Grants DMS 83-01806 and INT-8313363.)
What of the homogeneous directed graphs in general? There are 2^ℵ₀ known types which are freely generated by tournaments in the following sense. In the partial order of isomorphism types of finite tournaments ordered by embeddability, fix an infinite antichain 𝓘 (one is exhibited in [5], which I follow here). For 𝒳 an arbitrary subset of 𝓘, form the closure A(𝒳) of 𝒳 with respect to free amalgamation, isomorphism, and substructure, where the free amalgamation of two directed graphs which agree on their common vertices is simply their union, pointwise and edgewise. Following Fraïssé, we associate to A(𝒳) the A(𝒳)-generic homogeneous directed graph, from which 𝒳 is easily recovered. In this way we find 2^ℵ₀ countable homogeneous directed graphs. (In the future structures are assumed countable without further mention.)

And so it seems that Lachlan's theory cannot be extended to the unstable case; but actually this does not follow at all - not from these cardinality considerations.
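The free amalgamation used above is simple enough to write down directly. The sketch below assumes a directed graph is represented as a pair (set of vertices, set of ordered pairs); the function name `free_amalgam` and this representation are illustrative choices, not notation from the paper.

```python
def free_amalgam(g1, g2):
    """Union of two digraphs that agree on their common vertices.

    Each graph is a pair (vertices, arrows), with arrows a set of
    ordered pairs. No arrow is added between a vertex that occurs only
    in g1 and a vertex that occurs only in g2.
    """
    (v1, e1), (v2, e2) = g1, g2
    common = v1 & v2

    def on_common(arrows):
        return {(x, y) for (x, y) in arrows if x in common and y in common}

    if on_common(e1) != on_common(e2):
        raise ValueError("graphs disagree on their common vertices")
    return v1 | v2, e1 | e2


vertices, arrows = free_amalgam(({0, 1}, {(0, 1)}), ({1, 2}, {(1, 2)}))
```

Here the two one-arrow tournaments share the vertex 1; in the amalgam the vertices 0 and 2 stay unrelated, so no tournament appears that was not already inside one of the two factors.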
If one is to draw this sort of conclusion from such evidence then one must in particular regard the homogeneous directed graphs as intrinsically unclassifiable, while the opposite possibility - that they are all already known - is perfectly consistent with the evidence. I propose accordingly to work in this direction - an explicit classification of the homogeneous directed graphs - partly in order to lay to rest these cardinality considerations, which have lately reared their heads in more algebraic contexts as well [1,11,12]. This is not to say that one actually expects a smooth general theory of homogeneous structures for finite relational languages, only that sensible criteria for classifiability are wanted; and indeed a very sensible criterion has already been suggested by Lachlan. He proposes a Gentzen-style entailment relation 𝒜 ⊢ ℬ for finite sets 𝒜, ℬ of finite structures for a given language L: 𝒜 ⊢ ℬ means that any homogeneous L-structure embedding all the structures in 𝒜 must also embed some structure in ℬ. Using Fraïssé's theory relating homogeneous structures and amalgamation classes, one sees that this relation is r.e., and that the problem of classifiability is expressed quite well by:

(*) Given L, is ⊢ recursive?
This seems by far the most interesting problem in the area, and we know essentially nothing about it.

The goal of the present paper is quite modest. I will describe the known homogeneous directed graphs in some detail, checking homogeneity when it seems appropriate. They fall naturally into three families: deficient (omitting some 2-type), imprimitive (carrying a nontrivial 0-definable equivalence relation), and freely generated (in the sense described above, or in a dual sense), and there are in addition two more examples known which may be characterized by the 3-types they realize. The deficient examples were classified in the papers [10,8] referred to earlier. The imprimitive ones will be classified here.
There is one other topic which should be dealt with, at least in part, before attacking the primitive case directly. In [10] Lachlan classifies the homogeneous 2-tournaments (these are tournaments partitioned into two distinguished subsets). In dealing with directed graphs it may be convenient to deal with 3-tournaments, allowing in addition three 2-types to be realized between distinct components (as opposed to two realized in a given component). I have worked out the classification of the n-tournaments with an arbitrary number of cross types between components, for all n. This seems to be a natural problem to consider prior to tackling the homogeneous directed graphs, and the analysis suggests profitable lines for the latter problem, but I no longer expect the result to be directly applicable (that is, it may be usable, but it seems that there are better approaches). All of this will be explored in detail elsewhere.

§1. The known homogeneous directed graphs.
Our description of the known homogeneous directed graphs will be keyed to the following catalog:

I. DEFICIENT.
  1. I_n
  2. C_3, Q, Q*, T^∞
II. IMPRIMITIVE.
  3. Wreathed
  4. T̂
  5. n * I_∞
  6. semigeneric
III. EXCEPTIONAL.
  7. S(3)
  8. P
IV. FREE.
  9. Generic omitting I_{n+1}
  10. Generic omitting 𝒯

Proofs of homogeneity will be given in §2. In the following discussion H is some countable, homogeneous, directed graph.

I. Deficient cases.
There are three nontrivial 2-types, which will be denoted in two ways as convenience dictates:
  x → y or y ∈ x′
  x ← y or y ∈ ′x
  x ⊥ y or y ∈ x^⊥.
If H omits one of these 2-types then it is said to be deficient and is then either edgeless (Case 1: I_n, n ≤ ∞) or a tournament. The homogeneous tournaments as classified by Lachlan [8] are I_1 (included in Case 1), the oriented triangle C_3, the rational order Q, the circular order Q* described below, and the generic tournament T^∞. To form Q* we can either partition Q into two dense subsets and reverse the arrows between elements in distinct subsets, or alternatively, place astronomers at all points lying at rational angles on a circle of large radius, equip them with telescopes enabling them to see halfway around in either direction, and draw arrows to the right as far as the eye can see; then each astronomer believes he lives on the rational line.
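The astronomer picture can be tried out on a finite sample of the circle. The sketch below makes several assumptions of its own: positions are measured as fractions of a full turn, the sample is the seven points k/7 (an odd denominator, so that no two sampled astronomers are exactly antipodal), and the `reach` parameter is a hypothetical generalization of the telescope range, with 1/2 corresponding to the construction just described.

```python
from fractions import Fraction


def arrows(points, reach):
    """Arrows x -> y whenever y lies less than `reach` of a turn to the right of x."""
    return {(x, y) for x in points for y in points
            if x != y and 0 < (y - x) % 1 < reach}


def is_tournament(points, edges):
    """Exactly one arrow between any two distinct points."""
    return all(((x, y) in edges) != ((y, x) in edges)
               for x in points for y in points if x != y)


def out_view_is_linear(x, points, edges):
    """The astronomers that x sees to the right form a transitive tournament."""
    seen = [y for y in points if (x, y) in edges]
    return is_tournament(seen, edges) and all(
        (u, w) in edges
        for u in seen for v in seen for w in seen
        if (u, v) in edges and (v, w) in edges)


points = [Fraction(k, 7) for k in range(7)]
half = arrows(points, Fraction(1, 2))
third = arrows(points, Fraction(1, 3))
```

On this sample `half` is a tournament in which every astronomer's view to the right is linearly ordered, which is the finite shadow of "believes he lives on the rational line". With a reach of 1/3 some pairs are joined by no arrow at all, so `third` is no longer a tournament.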
This structure is mentioned in §6 of [2], and is connected with §4 of [4].

II. Imprimitive cases.
If H is imprimitive then the nontrivial equivalence relation is the union of equality with either ⊥ or its complement. Wreath products H_1[H_2] are formed by taking H_1, H_2 with no 2-types in common, and replacing the points of H_1 by copies of H_2. In other words, if T is one of the four nondegenerate homogeneous tournaments from Case 2, then
we form T[I ], or n
called
n
>
I [T] for n
1 < n <
~
; the latter is more commonly
1.
In all nonwreathed cases the equivalence relation will correspond to !.
For $T$ a tournament, $\hat T$ is constructed as follows. Let $T^+ = T \cup \{a\}$, where $a \to T$. Then $\hat T$ is the union of two copies $T_1^+$, $T_2^+$ of $T^+$. For $x_1 \in T_1^+$, $y_2 \in T_2^+$ corresponding to $x, y \in T^+$:

$x_1 \perp x_2$; $x_1 \to y_2$ iff $y \to x$.

Observe that $\perp$ has equivalence classes of size 2, any two of which form a 4-cycle $C_4$; $\hat I_1 \cong C_4$. One may check also that $\hat C_3$ is isomorphic with a graph on the nonzero points of the plane $V$ over the Galois field $F_3$ with edges defined by: $x \to y$ iff $x \wedge y$ is equal to a fixed element of $\Lambda^2 V$. (The exterior product is just the determinant of the matrix with columns $x$, $y$, once bases are chosen; there is a similar structure on the nonzero points of the plane over $F_q$, homogeneous for a binary language with $2(q-1)$ 2-types.)
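The structure on the nonzero points of the plane over $F_3$ is small enough to inspect by machine. The following sketch assumes the standard basis and takes the fixed element of the exterior square to be the one with determinant 1; these choices, and all names in the code, are illustrative assumptions. It checks only the features stated in the text (classes of size 2 for the unlinked relation, 4-cycles on pairs of classes), not the isomorphism itself.

```python
from itertools import product

FIELD_ORDER = 3  # the Galois field F_3

# The eight nonzero points of the plane over F_3.
points = [p for p in product(range(FIELD_ORDER), repeat=2) if p != (0, 0)]


def det(x, y):
    """Determinant of the 2x2 matrix with columns x, y, reduced mod 3."""
    return (x[0] * y[1] - x[1] * y[0]) % FIELD_ORDER


def arrow(x, y):
    """x -> y iff the exterior product equals the fixed element (here: 1)."""
    return det(x, y) == 1


def unlinked(x, y):
    """Distinct points joined by no arrow in either direction."""
    return x != y and not arrow(x, y) and not arrow(y, x)


def neg(x):
    """The point -x."""
    return ((-x[0]) % FIELD_ORDER, (-x[1]) % FIELD_ORDER)
```

Under these assumptions each point x is unlinked exactly with -x, and whenever x -> y the four points x, y, -x, -y form the oriented 4-cycle x -> y -> -x -> -y -> x.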
The graph $\hat Q$ is a variant of $Q^*$ in which each astronomer has an antipodal twin which he cannot see. $\hat T^\infty$ is generic subject to the constraints:

1. $\perp$ gives rise to an equivalence relation with classes of size 2.
2. The union of two $\perp$-classes is a copy of $C_4$.
The graph $n * I_\infty$ is defined as the generic graph on which $\perp$ is an equivalence relation with $n$ classes. For $n = \infty$ there is a variant which for lack of a more suggestive term we call semigeneric. The graph $\infty * I_\infty$ is generic for the constraint:

1. $\perp$ gives rise to an equivalence relation;

To get the semigeneric variant we impose the further constraint:

2. For any pairs $A_1$, $A_2$ taken from distinct $\perp$-classes, the number of edges from $A_1$ to $A_2$ is even.
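The parity constraint is easy to test on a finite directed graph. The sketch below is a hypothetical helper, not taken from the paper: it assumes the graph is given as a vertex list and a set of ordered pairs, and that "equal or unlinked" is already an equivalence relation on it.

```python
from itertools import combinations


def perp_classes(vertices, arrows):
    """Classes of the relation "equal or unlinked".

    Assumes this relation is an equivalence, so comparing with one
    representative of each class is enough.
    """
    classes = []
    for v in vertices:
        for cls in classes:
            w = cls[0]
            if (v, w) not in arrows and (w, v) not in arrows:
                cls.append(v)
                break
        else:
            classes.append([v])
    return classes


def satisfies_parity(vertices, arrows):
    """Check that for all pairs A1, A2 taken from distinct classes,
    the number of edges from A1 to A2 is even."""
    classes = perp_classes(vertices, arrows)
    for c1, c2 in combinations(classes, 2):
        for a1 in combinations(c1, 2):
            for a2 in combinations(c2, 2):
                edges = sum((u, v) in arrows for u in a1 for v in a2)
                if edges % 2:
                    return False
    return True
```

Since any two points from distinct classes are linked, the number of edges from $A_1$ to $A_2$ and the number from $A_2$ to $A_1$ add up to 4, so the test does not depend on the order in which the two pairs are taken.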
III. Exceptional homogeneous directed graphs.

We can define the myopic circular order $S(3)$ most simply in terms of astronomers whose telescopes enable them to see 1/3 of their circular universe in each direction, leaving a third invisible. Alternatively, partition $Q$ into three dense sets $Q_i$ indexed by $i \in \mathbb{Z}/3\mathbb{Z}$, identify the types $\perp$, $\leftarrow$, $\to$ with 0, 1, 2 respectively, and for $x \in Q_i$, $y \in Q_j$ distinct, assign to $xy$ the type $i - j + \mathrm{tp}_Q(xy)$.
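The type assignment for $S(3)$ can be sanity-checked mechanically. The sketch below assumes the identification of the types $\perp$, $\leftarrow$, $\to$ with 0, 1, 2 (so that reversing a pair negates its type modulo 3) and reads the formula as arithmetic modulo 3; the names are hypothetical. It verifies that the type assigned to $yx$ is always the reverse of the type assigned to $xy$, which is what is needed for the assignment to define a directed graph.

```python
PERP, BACK, FWD = 0, 1, 2   # assumed coding of the three 2-types


def reverse(tp):
    """Type of the pair yx, given the type of xy: 1 and 2 swap, 0 is fixed."""
    return (-tp) % 3


def s3_type(i, j, tp_q):
    """Type assigned to xy for x in Q_i, y in Q_j, where tp_q is the type
    of xy in the rational order (BACK or FWD)."""
    return (i - j + tp_q) % 3
```

Within a single dense set (i = j) the assigned type is just the type in Q, so each $Q_i$ keeps its linear order.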
The generic partially ordered set $P$ needs no commentary.

IV. Freely generated homogeneous directed graphs.

These are the graphs which are generic subject to a constraint of the form:

$H$ embeds no $X$ from $\mathcal{X}$;

here $\mathcal{X}$ is a class of deficient graphs of a given type. In (9) $\mathcal{X}$ is $(I_{n+1})$ and in (10) $\mathcal{X} = \mathcal{T}$ is a class of tournaments.

These are all the homogeneous graphs known to me, and I conjecture that in fact: only countably many are missing. (Just as in the imprimitive case the semigeneric graph appears unexpectedly, others could easily turn up.)
§2. Proofs of homogeneity.

For the homogeneity of $Q^*$ see [2] or [8]. $Q^*$ and $S(3)$ can be analyzed along similar lines: the astronomical description shows that the automorphism group is transitive, so we need only check that the expansion of the structure by a single parameter $x$ is homogeneous, and up to a permutation of 2-types (and the removal of the element $x$) this expansion is just $Q$ partitioned into 2 or 3 dense subsets, respectively. In the case of $S(3)$, identifying $\perp$, $\leftarrow$, $\to$ with 0, 1, 2 respectively, and letting $Q_i = \{y : \mathrm{tp}(xy) = i\}$, we assign to $y \in Q_i$, $z \in Q_j$ the type $(i - j) + \mathrm{tp}(yz)$.
The homogeneity of wreath products of homogeneous structures in disjoint languages has been noted previously by Lachlan, if not earlier, and the existence of amalgamation classes corresponding to examples 8-10 is
both straightforward and well known.
It remains to discuss examples 4-6.
Recall as a matter of notation that $\hat T = T_1^+ \cup T_2^+$ with $T_i^+ = \{a_i\} \cup T_i$ and $T_i \cong T$. It is quite easy to see that the structure imposed on $T_1 \cup T_2$ by $\{a_1, a_2\}$ is homogeneous if (and only if) $T$ is, as it consists of two copies of $T$ with a definable isomorphism. As $\{a_2\} = a_1^{\perp}$, it suffices to see that $\hat T$ is transitive when $T \neq Q^*$ is homogeneous.
The following condition is sufficient for this. though not necessary: (*)
For x
€
Y. z
X:
+
T there is an isomorphism a: y
az iff ay
+
,
,
X+ X
such that for
z.
+
This condition evidently holds for $I_1$, $C_3$, and $Q$; for $T = T^\infty$ and $x \in T$ the desired $\alpha$ comes from a back-and-forth construction. To check the transitivity of $\hat T$ observe first that there is a canonical involution $i \in \mathrm{Aut}\,\hat T$ defined by $x \perp i(x)$, so it suffices to find maps $\varphi \in \mathrm{Aut}\,\hat T$ which take $a_1$ to any $x_1 \in T_1$. If $x_1$ corresponds to $x \in T$ then let $\alpha$ be as in (*) and define

$\varphi(a_i) = x_i$, $\varphi(x_i) = a_{3-i}$, while for $y \to x \to z$: $\varphi(y_i) = \alpha(y)_i$, $\varphi(z_i) = \alpha(z)_{3-i}$;

(*) expresses the condition that this is an automorphism of $\hat T$.
To see that (*) is not a necessary condition for transitivity, notice that if $\mathbb{Z}/2n\mathbb{Z}$ is made into a directed graph by taking $x \to y$ to mean "$(y - x) \in \{1, \ldots, n-1\} \pmod{2n}$", then $\mathbb{Z}/2n\mathbb{Z} = \widehat{L(n-1)}$, where $L(n-1)$ is the transitive tournament of order $n - 1$.

It will be useful later to know that $\widehat{Q^*}$ is not homogeneous, and for this we check the failure of transitivity directly.
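The description of $\mathbb{Z}/2n\mathbb{Z}$ as a directed graph can be checked by machine for small $n$. The sketch below is illustrative only: the function names are hypothetical, and the test of the doubling rule assumes the convention that an element of the first copy points to an element of the second copy iff the reverse arrow holds between the corresponding elements of the tournament.

```python
def cyclic_digraph(n):
    """Arrows x -> y on Z/2nZ, where x -> y means (y - x) in {1, ..., n-1} mod 2n."""
    m = 2 * n
    return {(x, y) for x in range(m) for y in range(m)
            if 1 <= (y - x) % m <= n - 1}


def unlinked_partners(n, arrows, x):
    """Points other than x joined to x by no arrow in either direction."""
    return [y for y in range(2 * n)
            if y != x and (x, y) not in arrows and (y, x) not in arrows]


def is_transitive_tournament(vertices, arrows):
    """A tournament is transitive iff it contains no oriented 3-cycle."""
    vs = list(vertices)
    total = all(((a, b) in arrows) != ((b, a) in arrows)
                for i, a in enumerate(vs) for b in vs[i + 1:])
    acyclic = all(not ((a, b) in arrows and (b, c) in arrows and (c, a) in arrows)
                  for a in vs for b in vs for c in vs)
    return total and acyclic
```

With $a = 0$ playing the role of the added point, the out-neighbours $1, \ldots, n-1$ of $0$ form the transitive tournament, and $x + n$ is the unlinked twin of $x$.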
On the one hand
Q* by the construction. while on the other hand. for
is linearly ordered, by inspection.
X' -
1
#5. $n * I_\infty$.

We must check that the class of finite directed graphs satisfying:

(1) the union of $=$ and $\perp$ is an equivalence relation;
(2) this relation has at most $n$ classes;

is an amalgamation class. It suffices to describe how to complete an amalgamation of $H \cup \{a_1\}$ with $H \cup \{a_2\}$ over $H$, by specifying the type of $a_1 a_2$ suitably. We can take $a_1 \to a_2$ unless there is an obstruction of one of the following forms:

(1.1) $a_1 \perp b \perp a_2$ for some $b \in H$;
(2.1) $H$ has $n - 1$ $\perp$-classes, and there is no $b \in H$ with $b \perp a_1$ or $b \perp a_2$.

We can take $a_1 \perp a_2$ unless there is an obstruction of the form:

(1.2) $a_i \perp b \not\perp a_j$ for some $b \in H$, $\{i,j\} = \{1,2\}$.

There cannot be both sorts of obstruction, so the amalgamation succeeds.

#6. The semigeneric $\perp$-imprimitive case.

We claim that the constraint (1) above can be combined with the constraint:

(3) $|(A_1 \times A_2) \cap E|$ is even for $A_1$, $A_2$ two $\perp$-equivalent pairs (where $E$ is the set of edges)

to give an amalgamation class of finite directed graphs. With the notation of the previous example, we must again specify the type of $a_1 a_2$. We take $a_1 \to a_2$ unless there is either an obstruction of the form (1.1), or this choice yields:
(3.1) $a_i \perp b_i \in H$ and $|(\{a_1, b_1\} \times \{a_2, b_2\}) \cap E|$ is odd.

If (1.1) occurs then we take $a_1 \perp a_2$ and we have to check that $|(\{a_1, a_2\} \times A) \cap E|$ is even for any $\perp$-equivalent pair $A$ in $H$; this follows since $|(\{a_i, b\} \times A) \cap E|$ is even for $i = 1, 2$.
If case (1.1) does not apply but (3.1) does, then we take a
2
+
constraint (1) is still satisfied, and moreover (3.1) is now false. must still be checked is that for
I {a
1
, c } x {a , c }
1
{a .b }
2
x
2
n EI
a. 1 c 1
i
E
a
1
and
What
H, that always
is even; for this it suffices to consider
{a ,b }, {a ,b }
x
{a .c },
11221122
This completes the description of the currently known examples. The next order of business is to show that the list of imprimitive types is complete.

§3. Imprimitive homogeneous graphs with finite classes.

Throughout the remainder of this article, $H$ denotes an imprimitive homogeneous directed graph. As the nontrivial equivalence relation on $H$ is the union of equality with either $\perp$ or its complement, and in the latter case $H$ is necessarily a wreath product, we may assume the equivalence relation is "$= \cup \perp$"; by a slight abuse of notation we will denote the equivalence relation also by $\perp$. The theorem we aim at is of course:

Theorem. If $H$ is an imprimitive homogeneous directed graph then $H$ is one of the following:

1. a wreath product $T[I_n]$ or $I_n[T]$;
2. $\hat T$ for some homogeneous tournament $T \neq Q^*$;
3. $n * I_\infty$;
4. semigeneric for $\perp$ an equivalence relation.
As noted, we may take the equivalence relation on $H$ to be (essentially) $\perp$. We consider first the case in which this relation has finite classes. We can dispose of the case in which $H$ is finite by reference to the list in [6] of all finite examples. So we may assume that $H$ is infinite, and not a wreath product.

Fix a $\perp$-class $C$, and find $x, y \in H - C$ with:

$x \to y$, $x' \cap C = y' \cap C$.

If $x' \cap C = \emptyset$ or $C$ then it follows easily that $H$ is wreathed. Fix $a \in x' \cap C$. If $|x' \cap C| = k$ with $1 < k < n$, where $n = |C|$, then we can find $A \subseteq C$ with $a \in A$, $A \neq x' \cap C$, and $z \in H - C$ with $x \to z$ or $z \to x$, so that $z' \cap C = A$. Then $axy$ and $axz$ or $azx$ have the same type, a contradiction. We conclude that $k = 1$, and similarly that $n - k = 1$, $n = 2$. It then follows rapidly that $H = \hat T$ for some homogeneous $T$, and we checked in the previous section that this forces $T \neq Q^*$.

§4. $n * I_\infty$.
We have assumed that $\perp$ defines an equivalence relation on $H$, and we will assume throughout that $H$ is not a wreath product. We now impose the condition that all $\perp$-classes are infinite. We first take up the case in which $H/\perp$ is finite, but we begin with two general lemmas.

Lemma 1. For distinct $\perp$-classes $C_1$, $C_2$ and $I \subseteq C_1$ finite, the set $I' \cap C_2$ is infinite.
Proof: Fix $n$. We will show that for some $I \subseteq C_1$ of order $n$, $I' \cap C_2$ is infinite. Suppose the contrary, and choose $I_0 \subseteq C_1$ maximal so that $I_0' \cap C_2$ is infinite. Let $J = I_0' \cap C_2$. For $I \subseteq C_1 - I_0$ finite, ${}'I \cap J$ is cofinite in $J$, hence infinite. Hence for $J_0 \subseteq J$ of order $n$, $J_0' \cap C_1$ is infinite. As $H$ is not a wreath product, there is an automorphism of $H$ which switches $C_1$ and $C_2$. The claim follows.
Corollary. With the same notation, if $I, J \subseteq C_1$ are finite and disjoint, then $I' \cap {}'J \cap C_2$ is infinite.

Proof: Fix $n$ arbitrary, $K \subseteq C_2$ of order $n$, and $I_1 \subseteq {}'K \cap C_1$, $J_1 \subseteq K' \cap C_1$ where $|I_1| = |I|$, $|J_1| = |J|$. Apply homogeneity.

Lemma 2. Let $|H/\perp| = n$ with $3 \le n \le \infty$. Then $x'$ is not a wreath product for $x \in H$.
Proof: We have supposed that
H is not a wreath product. If the lemma fails,
fix
C, C C distinct I-classes with x € C. The tournaments y'/l 1' 2 are canonically isomorphic for y € C, and hence no automorphism a of H carries
(C,C,e)
1 2
to
(e,e,e).
2 1
(If
a
carries
x e e
to
y e e,
(xy)' (\ (C u C).) Therefore 'x is also a 1 2 wrea th product. Let A = x' (\ e , B = x' 1\ e , a e A , b € B • If 1 1 1 2 1 1 1 1 the edges a b and a b have the same orientation then the map 1 2 2 1 x,a,b + x,a ,b is induced by an automorphism taking (e,e,C) to 1 2 2 1 1 2 look at its effect on
(C,C
a contradiction. Thus the edges from Al to B have con2,C1), 2 stant orientation. It follows that any two points of A realize the 1
same type over C • From this, homogeneity readily yields that H is a 2
wreath product, a contradiction.

Proposition. Assume $n = |H/\perp|$ is finite. Then $H \cong n * I_\infty$.
Proof: We proceed by induction on $n$, starting at $n = 1$. For the inductive step we show first:

(1) For $x \in H$, $x' \cong (n-1) * I_\infty$.

For $n = 2$ this is contained in Lemma 1. For $n > 2$, we deduce from Lemmas 1 and 2 that $x'$ has infinite $\perp$-classes and is not a wreath product, so induction applies.

Now we will prove that all finite directed graphs of the form $T \cup I$, with $T$ a tournament of order $n - 1$ and $I$ a disjoint $\perp$-class, embed in $H$. Our claim will then follow, as any subgraph of $n * I_\infty$ can be built up from such graphs by amalgamations with unique solutions. For the same reason we may take $I$ to be indiscernible over $T$. For $n = 2$ (1) already suffices. For $n > 2$ and $T \cup I$ as described, fix $a, b \in T$ with $a \to b$ and form a directed graph $K$ on a set:
T U T U {a ,a ,b ,b } U 1 U 1 12121212 (leaving the orientation of T
1
U {a ,b } U 1
1
1
1
(a
1,b1
T U 1 with
)
unspecified, however) so that: + b •
I'
(a ,a} 1
1
1
has order
K/l b
and {b ,b}
2
are
2
l-classes;
n;
K - (a 11), b + K - (a / 1) - {a } . 1 2 2 2
+
We need to see that K K - {a} and K = K - {b are embedded 2 1 1 1} in H, so that an amalgam of Hand H will contain a copy of T U 1. 1 2 That K < H follows from (1) with x = a , and that K < H follows by 2
1
taking
K
2
1
over their common part, applying (I) with x
§5.
2
to be the (unique) amalgam of K - {a ,b}
= a2 ,b 2
1
and K - {a ,b } 2
1
respectively.
The semigeneric case.

From now on we assume that $H/\perp$ is infinite. We will refer to the extra constraint placed on the semigeneric graph as the parity constraint. We wish to show that if $H$ satisfies the parity constraint, then it is semigeneric. We will prove the following two claims by induction:

(1.n) If $K = T \cup I$ is a finite directed graph, with $T$ a tournament of order $n$ and $I$ a disjoint $\perp$-class, then $K$ embeds in $H$.
(2.n) If $K$ embeds in the semigeneric graph and $|K/\perp| = n$ then $K$ embeds in $H$.

Observe that (1.n) implies (2.n+1) by a straightforward amalgamation, invoking the parity constraint. We prove (1.n) inductively. For $n = 1$ we use the corollary to Lemma 1. For the inductive step we form a directed graph $K_1$ on the set $I_1 \cup I_2 \cup T_1 \cup T_2 \cup \{x, x_0, y\}$, with the orientation of $(x,y)$ unspecified,
so that: I
1
UI
I xyT 1
2
1
I xyT 2 2
is a single I-class;
=K
with x,y
++
a,b, I
=K
with x,y
++
b,a, I
1
2
if
++
++
I
x
if y
+ y;
+
x;
Corresponding elements c e T c e T are unlinked; 1, 2 2 1
K
1
embeds in the semigeneric graph.
More precisely, $K_1$ is an amalgamation problem, whose solution must contain a copy of $K$. The point $x_0$ forces an edge to link $x$ and $y$. It remains to be seen that the factors embed in $H$. The factor $K_1 - \{y\}$ embeds in $H$ by (2.n), which we have as a consequence of (1.n-1). In the factor $K_1 - \{x\}$ the point $x_0$ dominates the others. Taking any $x_0 \in H$, we can apply (2.n) to $x_0'$ in the manner of the previous proof, so this factor also embeds in $H$. This completes the argument.
§6. $\infty * I_\infty$.

We treat the last case in a similar but more elaborate fashion. We assume that $H/\perp$ is infinite and that $H$ does not satisfy the parity constraint.

If $\mathcal{A}$ is an amalgamation class of finite directed graphs on which $\perp$ is an equivalence relation, then let $\mathcal{A}^*$ be the set of all directed graphs $K \in \mathcal{A}$ such that an arbitrary extension $K \cup I$ of $K$ by a new $\perp$-class $I$ will belong to $\mathcal{A}$. Let $Q$ be the simplest directed graph violating the parity constraint:

[Figure: the graph $Q$.]

Then the inductive argument corresponding to the ones we have given above is expressed as follows. Call a class of finite directed graphs robust if it contains all $I_n$ for $n < \infty$, arbitrarily large tournaments, and $Q$.

Lemma. If $\mathcal{A}$ is a robust amalgamation class of finite directed graphs on which $\perp$ is an equivalence relation, then $\mathcal{A}^*$ is also robust.

Corollary. With $\mathcal{A}$ as above, any finite directed graph $K$ on which $\perp$ is an equivalence relation belongs to $\mathcal{A}$.

We run through the proof of the corollary first: proceeding by induction on $n = |K/\perp|$ from $n = 0$, assume $n > 0$ and let $K = K_1 \cup I$ with $|K_1/\perp| = n - 1$ and $I$ a $\perp$-class. By induction $K_1 \in \mathcal{A}^*$ and hence $K \in \mathcal{A}$.
Proof of the Lemma: Let $J = K \cup I$ be the graph we wish to show is in $\mathcal{A}$, where $K$ is either $I_n$ or $L(n)$ for some $n$, or $Q$. Making use of straightforward amalgamations with unique solutions, we find that we need only consider the following two cases:

(1) $I$ is indiscernible over $K$;
(2) $|I| = 2$.
Let H be the (3) For x
E
~-generi c
H. x' embeds
homogeneous di rected graph.
First form the amalgamation diagram:
/.~
<.«> a. (
*x
The factors embed in x'. since by Lemmas 1 and 2. x' generic.
We show first:
g.
We give a direct construction.
y*
is at worst semi-
If the edge goes from x to y we are done. and otherwise
amalgamate the result with:
/.~
y.~.~.b
taking $a$, $b$ as the new points. (If this last factor is omitted then we get $Q$ embedded in $x'$ more directly.)

Next we prove:

(4) If $K$ is a finite directed graph on which $\perp$ is an equivalence relation with two classes, then $K$ embeds in $H$.
Let K/I = {I,J}.
Extending J
if necessary, assume the elements of
III = 2,
realize distinct types over J; we can then reduce to the case and then further to the cases: (4.1)
II I
IJ I
(4.2)
II I
2, J indiscernible over I.
In case
2;
(4.1)
there are four cases:
one is
~,
one is covered by
Lemma I, and the other two can be forced by amalgamating the first two, with three points in one I-class and two in the other. In case (4.2) Lemma 1 applies unless fix
a, b
€
H with a
b
1
1
a, b
1
so a' (\ 'b
*b
+
J
+
b, and let C.s H be a second r-ctass.
1
claim is that a'A 'b{)C If
rapid contradiction.
{a.b} with a
is infinite.
b.
So
Our
If this set is empty one gets a
[a ' 1\ 'b /] CI = k with 0 < k < 00, then choose
with b e: (a n 'b)'.
Then a'(\ 'bn C = a' (\ 'b f\C
1
1
is a-definable, a contradiction.
'
Thus (4.2) is treated.
Next we claim: (5)
Suppose that every finite directed graph on which alence relation with n classes embeds in every tournament T of order
n lies in
Indeed consider K = T U I with
x'
is an equiv-
1
for
x e: H.
Then
.A*.
an additional
I-class.
We form
a directed graph (or amalgamation diagram with unspecified edge (x,y)) K
1
on the set I U {x,x ,y} U T U T 0
IxyT
1
or
IxyT '" H if 2
1
(x,y)
2
so that: is suitably oriented;
corresponding elements of T T are 1, 2 X 1 x, X
o
0
+
I-equivalent;
K - (X/I). 1
As usual it SUffices to check that the factors
K - {x} 1
and K - {y} 1
embed in H.
In fact both K 1
embed in x'. by o
hypothesis. (6) A
[.
--t • ~ .]
a
belongs to
.94*.
c
b
This is a fairly lengthy argument. We consider K = A U I with an additional
l-class. which we must embed in H.
I
We may suppose that
is either indiscernible over A, or of order 2. If
is indiscernible over A then take three
H. and fix a, c
in "a
I
+
x
+
E
l-classes C. C C 1, 2 C. Let p be the type over a,c defined by:
c", in other words the type of b. and let q be the type of lover {a,c}.
the elements of
Let B be the set of all realizations
of
p in C and let J be the set of all realizations of q in C 2. 1• Both Band J are infinite. We claim that each element of B is linked to J
by infinitely many edges with either orientation, so that
K embeds in H in this case. ferred orientation.
If this claim fails, then there is a pre-
Now if p = q. then no automorphism of H carries
to (C,C and then as in the proof of Lemma 2. H is a 1,C2) 2.C1). wreath product. Now suppose that p * q. If c + I + a. then no auto(C,C
morphism carries a,c,C.C to c,a.C ,C • which gives a contradiction, 1 2 1 2 for example by taking x, E a'f'I c'rI C and mapping ac\x to 1 2 i cax x. (Here we apply Lemma 1.). 1 2 If type r
I is indiscernible over A and a and c realize the same over
I. then we consider also
our claim is that
tp(b/I).
If
tp(blI) = r. then
I'. 'I are not wreath products; the proof of Lemma 2
is readily adapted to this purpose.
So suppose that tp(b/I) = s
* r.
Then we perform the following amalgamation (with a unique solution): a
*
\ s · x-.::...._; ·
(*)
b
c * with factors: a.
r
x.
~.2-
.>. (As
5
s
-:
b .-
and
c.
r,s are asymmetric, the labelled edges should be read from left to
right.)
If abx or xbc ;s isomorphic to A, the corresponding factor
embeds in H by the case treated at the outset, and otherwise it suffices to examine When
'b or
b'.
is of order 2 and is not indiscernible over A, it suffices
to show that for any three
l-classes C,C
1,C2
in H and any i somorphi c
copy abc of A with a,c e: C and b e C , the two types realized by 1
are realized in C • Let q be the type rea 1i zed by one element of 2
over a,c.
If q is either
"a
+
x
+
c" or
"c
v
x
v
e", then this
was done in the course of the argument above, and in the remaining cases it suffices either to look at x'
or
'x
for
x
e:
C , or else to amal2
gamate in the manner of (*). using factors whose elements lie ;n the appropriate classes.
This completes the proof of (6).
After these preparations we can turn directly to the proof that is robust.
By (4)
I c ~* n
for all finite
n.
Next we claim:
**
Arbitrarily large tournaments T are in ~*.
(7)
Equivalently, if H* is the homogeneous directed graph associated with .54*, then we claim that H*/1
is infinite.
On the basis of (4,6)
we know that H* is not a wreath product, and that its 1-classes are infinite.
By our work so far, if
for some
IH*/11
is finite then H* ~ n * I~
n.
But this means that every finite directed graph K on which equivalence relation with n+1
classes embeds in H.
so as to minimize the value of
n here.
robust.
A'
Let
choice of
n, the ~I*-generic directed graph is again
n+1
robust
x e: H, x' is
be the amalgamation class associated with x' .
every finite directed graph K on which with
A
Choose
Observe that for
1 is an
n * I"".
By our Thus
1 is an equivalence relation
classes embeds in x', and hence by (5), every tournament of lies in ~ *, a contradiction.
order n+1
It remains only to prove: (*)
Q e:~*. We consider K = I UQ with
cernible over
Q, or of order
sults so far either embeds Thus if
K embeds in
omits Q, then it embeds in H*, and hence
H, as claimed.
This argument applies in particular if
has order two, and both
I U {b,d} are isomorphic with
amalgamate __0 U {ik,j}
for
k = 1,2
This can be set up so that neither is isomorphic with
Q.
I
Q.
In the only case remaining, and (similarly)
directed graph H*, which by our re-
Q (which is the present claim), or is semi-
I U {a,c}
is indiscernible over
Label 0 a,b,c,d as before.
Jf *-generic
We again consider the
generic.
2.
an 1-class which is either indis-
U {a,c}
Q. In this case just
so as to force
0 U {i ,i }
=
{i , j } U {a,c} nor 1 This completes the argument.
~
K.
1 2 {i , j } U {b,d}
2
REFERENCES

1. C. Berline and G. Cherlin, "QE rings of prime characteristic," Bull. Soc. Math. Belg. B 33 (1981), 3-17.
2. P. Cameron, "Orbits of permutation groups on unordered sets II," J. London Math. Soc. 23 (1981), 249-264.
3. G. Cherlin, A. Harrington, and A. Lachlan, "$\aleph_0$-categorical, $\aleph_0$-stable structures," APAL (1985), 103-135.
4. G. Cherlin and A. Lachlan, "Stable finitely homogeneous structures," TAMS, to appear 1986.
5. C.W. Henson, "Countable homogeneous relational systems and categorical theories," JSL 37 (1972), 494-500.
6. A. Lachlan, "Finite homogeneous simple digraphs," in Logic Colloquium 1981, J. Stern ed., North-Holland, NY (1982), 189-208.
7. A. Lachlan, "On countable stable structures which are homogeneous for a finite relational language," Israel J. Math. 49 (1984), 69-153.
8. A. Lachlan, "Countable homogeneous tournaments," TAMS 284 (1984), 431-461.
9. A. Lachlan and S. Shelah, "Stable structures homogeneous for a binary language," Israel J. Math. 49 (1984), 155-180.
10. A. Lachlan and R. Woodrow, "Countable ultrahomogeneous graphs," TAMS 262 (1980), 51-94.
11. D. Saracino and C. Wood, "QE commutative rings," J. Symb. Logic 49 (1984), 644-651.
12. D. Saracino and C. Wood, "QE nil-2 groups of exponent 4," J. Alg. 76 (1982), 337-382.
13. J. Schmerl, "Countable homogeneous partially ordered sets," Alg. Univ. 9 (1979), 317-321.
14. T. Skolem, "Logisch-kombinatorische Untersuchungen über die Erfüllbarkeit und Beweisbarkeit mathematischen Sätze nebst einem Theorem über dichte Mengen," Skrifter Vitenskapsakad. Kristiania 4 (1920), 1-36, §4.
Logic Colloquium '85 Edited by The Paris Logic Group © Elsevier Science Publishers B.V. (North-Holland), 1987
PROOFS OF PARTIAL CORRECTNESS FOR ITERATIVE AND RECURSIVE COMPUTATIONS

Bruno COURCELLE
Université de Bordeaux I, Département d'Informatique+
351, Cours de la Libération, 33405 TALENCE, France

Abstract: This paper provides some general definitions concerning the validity of programs, and an abstract presentation of the inductive assertion method for iterative and recursive programs. The case of recursive programs is handled by means of call-trees.
INTRODUCTION
The importance of appropriate methods for establishing the validity of programs has been recognized for a long time. Proof methods of various types have been proposed. Even if they are not applicable to long commercial programs because of the length of the resulting proofs, they are useful for the publication of algorithms (algorithms should always be proved, as are theorems in mathematical papers). They are also important at a theoretical level to guide the design of programming languages and the establishment of relevant programming methodologies.

This paper presents:
- some general definitions concerning the validity of programs and algorithms,
- an abstract presentation of iterative computations and of a proof method for establishing their partial correctness: the inductive assertion method,
- an abstract presentation of recursive computations based on the concept of the call-tree of a recursive procedure, which yields a proof method for their partial correctness (this presentation is based on recent results by Courcelle and Deransart [3]).

+ Formation associée au CNRS. This work has been supported by the ATTRISEM project of the GRECO de PROGRAMMATION.

1 - Specifications and correctness proofs
We shall not distinguish between a program written in some precise programming language and an algorithm which can be written in a more flexible (but, hopefully, precise and unambiguous) way. Both of them will be called programs.

With every program $P$ is associated a set of data $A$ that we can consider to be (in bijection with) a recursive subset of $X^+$ (the set of nonempty finite words over the finite alphabet $X$). The set of programs is also a recursive subset of $X^+$. With a program $P$ and a datum $d$ is associated a computation sequence $\sigma = (s_0, s_1, s_2, \ldots, s_n, \ldots)$, which is a sequence of states. A state $s_i$ encodes everything that is necessary for the continuation of the computation: next instruction number, pushdown list of return addresses, intermediate values computed and stored for further treatment, etc. We only say that it can be represented by a finite word over $X$.

In the case of a deterministic program the state $s_{i+1}$ is determined in a unique way from $s_i$ and there is a unique computation sequence $\sigma$ associated with $P$ and $d$. It is denoted by $\sigma_{P,d}$. Otherwise a set of computation sequences is associated with $P$ and $d$. Unless otherwise indicated we shall deal with deterministic programs in this paper.

The computation sequence $\sigma_{P,d}$ can be of three types:

(1) $\sigma_{P,d}$ is infinite,
(2) $\sigma_{P,d}$ is finite and its last state is an error state, from which no result can be extracted,
(3) $\sigma_{P,d}$ is finite, its last state is a success state from which the result $d'$ of the computation (a word over $X$) can be extracted.

In case (3), we say that $\sigma_{P,d}$ is successful, that $d$ is its input and $d'$ its output.
The function computed by $P$ is the partial function $|P| : A \to X^+$ such that

$|P|(d) = d'$ if $\sigma_{P,d}$ is successful and its output is $d'$,
$|P|(d)$ is undefined in cases (1) and (2).

We shall use the abbreviation $P(d)\downarrow$ in the former case and $P(d)\uparrow$ in the latter. Let now $\varphi(x,y)$ be a formula in some logical calculus (that we shall leave as a parameter), with free variables $x$ and $y$ ranging over $X^+$. This formula will be used as a specification of what $P$ must compute. We shall call it a specification for $P$. We say that $P$ is correct with respect to $\varphi$ if

(1) $\forall d \in A\ \exists d' \in X^+\ [\,|P|(d) = d'$ and $\varphi(d,d')\,]$.

It is frequently convenient to establish this property by means of two separate proofs, a proof that $P$ terminates (on $A$), namely that

(2) $\forall d \in A\ [P(d)\downarrow]$,

and a proof that $P$ is partially correct with respect to $\varphi$, namely that:

(3) $\forall d \in A\ [P(d)\downarrow\ \Rightarrow\ \varphi(d, |P|(d))]$.
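The distinction between conditions (2) and (3) can be illustrated on a toy deterministic program. The sketch below is an assumption-laden illustration, not a proof method: the program (division by repeated subtraction on natural numbers), the step bound `fuel`, and all names are hypothetical, and testing on a finite sample can of course only refute, never establish, (2) or (3). A run that exhausts the step bound is reported as "unknown", since an infinite computation cannot be detected by running it.

```python
def run(init, step, d, fuel=10_000):
    """Run a deterministic program from input d for at most `fuel` steps.

    Returns ("success", d') in case (3), ("error", None) in case (2),
    and ("unknown", None) when the step bound is exhausted.
    """
    state = init(d)
    for _ in range(fuel):
        kind, payload = step(state)
        if kind == "success":
            return ("success", payload)
        if kind == "error":
            return ("error", None)
        state = payload
    return ("unknown", None)


def div_init(d):
    a, b = d
    return (0, a, b)              # quotient so far, remainder so far, divisor


def div_step(state):
    q, r, b = state
    if b == 0:
        return ("error", None)    # error state: no result can be extracted
    if r < b:
        return ("success", (q, r))
    return ("continue", (q + 1, r - b, b))


def spec(d, out):
    """The specification phi(d, d'): d' is the quotient and remainder of d."""
    a, b = d
    q, r = out
    return a == q * b + r and 0 <= r < b


def partially_correct(sample):
    """Condition (3) on a sample: every successful run satisfies the spec."""
    for d in sample:
        kind, out = run(div_init, div_step, d)
        if kind == "success" and not spec(d, out):
            return False
    return True


def always_succeeds(sample):
    """Condition (2) on a sample: every run reaches a success state."""
    return all(run(div_init, div_step, d)[0] == "success" for d in sample)


SAMPLE = [(7, 2), (9, 3), (0, 4), (5, 0)]
```

On this sample the program is partially correct but does not terminate successfully everywhere, because the input (5, 0) leads to an error state; restricting $A$ to pairs with a nonzero divisor would restore (2).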
A proof of (2) includes a proof that the length of the computation sequence of $P$ for input $d$ is finite for every $d$ in $A$. This length depends in an essential way on $d$, and the standard way to do the proof is to use an induction on the size of $d$. The size is an appropriately chosen mapping $A \to W$ where $W$ is some well-founded set (typically $\mathbb{N}$). Nothing more can be said at this level of generality. Finding the appropriate size function may be very difficult even for simple programs. See Dershowitz [5] for a survey of existing techniques concerning term rewriting systems, where a similar problem occurs.

In this paper we shall only consider proof techniques for partial correctness and we shall show the close relation between the structure of the proof and the structure of the program.
We shall denote by $\hat\sigma_{P,d}$ the syntactical part of $\sigma_{P,d}$, i.e. the sequence of instructions of $P$ that are executed in $\sigma_{P,d}$. Hence $\hat\sigma_{P,d}$ no longer encodes the values taken by the variables.

More generally, we can define a syntactical computation sequence of $P$ as a sequence $\hat\sigma$ of instructions of $P$, starting with the initial instruction and such that the sequencing of instructions is compatible with the definition of $P$. Such a sequence is complete if it is finite and if its last instruction is a terminal one (corresponding to a success state). (Actually some of these syntactical computation sequences may correspond to no actual computation.) Let $\Gamma$ be the set of complete syntactical computation sequences. Then (3) can be rewritten as follows:

(4) $\forall d \in A\ [\forall \hat\sigma \in \Gamma\ [\hat\sigma_{P,d} = \hat\sigma\ \Rightarrow\ \varphi(d, |P|(d))]]$.

As for (2), an induction on the size of $d$ can be used and we shall not discuss this possibility any further. An alternative possibility arises from the rewriting of (4) into

(5) $\forall \hat\sigma \in \Gamma\ [\forall d \in A\ [\hat\sigma_{P,d} = \hat\sigma\ \Rightarrow\ \varphi(d, |P|(d))]]$
which suggests using an induction on the length or, better, as we shall see, on the structure of $\hat\sigma$. Using an induction on $d$ to prove (4) tends to reprove termination, or rather to prove (1) directly in a single proof. For the proof of (5) the length of $\hat\sigma$ is frequently not the right object to do induction on.

Example. Consider a program of the general form

while p do A; while q do B end; C end

Its syntactical computation sequences are of the general form

$A B^{n_1} C\, A B^{n_2} C \cdots A B^{n_k} C$
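To see how many complete syntactical computation sequences share a given length, one can simply enumerate them. The sketch below is a hypothetical helper that writes each instruction as a single letter and generates all words of the form displayed above with a prescribed total length.

```python
def complete_sequences(length):
    """All words (A B^{n_1} C)(A B^{n_2} C)...(A B^{n_k} C) of the given length.

    Each block A B^{m} C has length m + 2 >= 2; the empty word (k = 0)
    is the only sequence of length 0.
    """
    if length == 0:
        return [""]
    words = []
    for block in range(2, length + 1):
        head = "A" + "B" * (block - 2) + "C"
        words += [head + tail for tail in complete_sequences(length - block)]
    return words
```

For length 6 this yields five words, among them ACACAC and ABBBBC, which have nothing in common structurally beyond their length; this is one way to see why the length is a poor object for induction.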
A given length $n$ can be achieved in many different ways for various choices of $k, n_1, \ldots, n_k$. Two extreme cases are $(AC)^k$ and $AB^{2k-2}C$, for which the validity of $\varphi$ will be established in very different ways.

In the next sections we shall formulate precisely what we call the structure of a computation sequence and derive proof techniques for partial correctness expressed as in (5). We shall make a sharp distinction between iterative computations, the structure of which is defined by words, and recursive computations, for which trees are appropriate.

Let us finally mention that the extension to non-deterministic programs of the concept of partial correctness is easy: it suffices to take "$\hat\sigma$ is a possible computation of $P$ with input $d$" instead of "$\hat\sigma_{P,d} = \hat\sigma$" in (4) and (5). The proof techniques we shall consider apply equally well to non-deterministic programs. But for non-deterministic programs several concepts of termination can be defined:

(6) $\forall d \in A$ [there exists a successful computation sequence of $P$ with input $d$]

and

(7) $\forall d \in A$ [there is no computation sequence with input $d$ that is infinite or that terminates in an error state].

Hence several concepts of validity can be defined. See Gallier [8] for more details.
2 - Iterative computations

We purposely avoid restricting ourselves to any specific class of iterative programs or even of iterative program schemes. We rather present a general notion of iterative computation (actually encompassing all effective computations) and we shall base upon it an abstract presentation of the inductive assertion method.

(2.1) Iterative computation sequences

An iterative computation sequence is a sequence of states $\sigma = (s_0, s_1, s_2, \ldots)$ as in 1 such that $s_0$ is defined from the input $d$, and for all $i$, $s_{i+1}$ is defined from $s_i$ in a certain fixed way. In other words, $\sigma$ is defined by iterating as many times as necessary an elementary operation. Such an operation is considered as elementary with respect to the construction of $\sigma$, but it can be defined itself in terms of more elementary operations.

We define an iterative program $P$ as a 5-tuple consisting of the following objects:

- a set $A$ of data (the input set),
- a set $S$ of states,
- a set $B$ of data (the output set); we assume that $S \cap B = \emptyset$, and the notation $S+B$ standing for $S \cup B$ recalls this assumption,
- a relation $\alpha \subseteq A \times (S+B)$,
- a relation $\beta \subseteq S \times (S+B)$.

We assume that $\alpha$ and $\beta$ are multivalued computable total functions. More precisely, we assume that an algorithm produces for every $d$ in $A$ an element $s'$ of $S+B$ such that $(d,s') \in \alpha$, not necessarily in a unique way but always within a finite time. A similar assumption holds for $\beta$. Hence we are describing nondeterministic computations. If $\alpha$ and $\beta$ are single valued then $P$ is deterministic.

Given $P$ and $d \in A$, a computation sequence for $(P,d)$ is a finite or infinite sequence $\sigma = (s_0, s_1, s_2, \ldots)$ such that $(d, s_0) \in \alpha$ and $(s_i, s_{i+1}) \in \beta$ for all $i \ge 0$.

Hence if $s_k \in B$, $s_k$ is the last element of $\sigma$ and we define it as the output of the computation. If no $s_k$ belongs to $B$ then the computation is infinite. We do not distinguish here error-states from success-states as we did in section 1. Actually an error-message can be considered as the output of a computation just as well as the desired result in the case where the computation reaches a success-state. In other words we assume that $B$ contains the error-states as well as the good results. Clearly the specifications will distinguish between error-states and success-states.

The input-output relation computed by $P$ is then the set

$|P| = \{(d,d') \mid d \in A,\ d' \in B$, there is a computation sequence for $(P,d)$ with output $d'\}$.

It is clear that

(8) $|P| = \alpha\beta^* \cap (A \times B)$
95
Proofs for Partial Correctness
where $\rho^*$ denotes the reflexive and transitive closure of a binary relation $\rho$ on any set $E$. It can also be expressed by the formula:

  $\rho^* = \delta \cup \rho \cup \rho^2 \cup \cdots \cup \rho^n \cup \cdots$   ($\delta$ is the identity relation $\{(x,x) \mid x \in E\}$).
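For finite relations, formula (8) can be evaluated directly. The following Python sketch is only an illustration: the toy program (counting an input down to zero) and the names `compose` and `input_output_relation` are assumptions made for this example, not part of the text.

```python
def compose(r, s):
    """Relational composition: (x, z) whenever (x, y) in r and (y, z) in s."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}


def input_output_relation(alpha, beta, outputs):
    """Compute alpha beta* and keep the pairs whose second component is an output."""
    reached = set(alpha)      # alpha beta^0
    frontier = set(alpha)
    while frontier:
        # pairs reachable with one more beta-step that were not seen before
        step = compose(frontier, beta) - reached
        reached |= step
        frontier = step
    return {(d, s) for (d, s) in reached if s in outputs}


# Hypothetical toy program: count an input in {0, 1, 2} down to zero, then stop.
A = {0, 1, 2}
S = {("state", n) for n in range(3)}
B = {"done"}                              # S and B are disjoint, as required
alpha = {(d, ("state", d)) for d in A}
beta = ({(("state", n), ("state", n - 1)) for n in (1, 2)}
        | {(("state", 0), "done")})

io = input_output_relation(alpha, beta, B)
print(sorted(io))
```

The loop stops because the relations are finite; for infinite state sets the same formula still defines $|P|$, but it can no longer be computed by exhaustive closure.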
Since states can encode stacks of values and of return addresses, this notion of an iterative program can represent an iterative implementation of a recursive program. More generally every effective computation (in particular every Turing-machine computation) can be represented as an iterative computation with an appropriate set of states $S$ (see Harel [10] for a discussion of this fact and some of its applications to the translatability of flowcharts into while-programs).
(2.2) - Establishing the partial correctness of an iterative program.
A specification for an iterative program $P$ (defined as in (2.1)) can be defined as a relation $\Psi \subseteq A \times B$, defined by a logical formula $\Psi(x,y)$ from some logical calculus that need not be made precise here. The partial correctness of $P$ with respect to $\Psi$ can be expressed by

  $|P| \subseteq \Psi$

or equivalently by

  (9)  $\alpha\beta^* \cap (A \times B) \subseteq \Psi$

or else by

  (10)  $\forall i \geq 0\ [\,\alpha\beta^i \cap (A \times B) \subseteq \Psi\,]$.
In some cases (in fact when $\beta$ is not too complex), it is possible to express the relation $\alpha\beta^i \cap (A \times B)$ by a logical formula $Q(i,x,y)$ from a many-sorted calculus, where $i$ denotes a nonnegative integer, $x$ an element of $A$ and $y$ an element of $B$, and then to prove the implication

  $\forall i \in \mathbb{N},\ \forall x \in A,\ \forall y \in B\ [\,Q(i,x,y) \Rightarrow \Psi(x,y)\,]$

in the appropriate theory. This possibility has been considered in the case of flowchart programs by Andreka et al. [1] but it yields very
long and complicated formulas which do not reflect the structure of the program.
One might try to prove (10), namely $\forall i \in \mathbb{N},\ P_i$ (where $P_i$ denotes the property $\alpha\beta^i \cap (A \times B) \subseteq \Psi$), by induction on $i$. But $P_i$ represents the partial correctness of the terminating computation sequences of length $i$. Hence $P_i$ has no reason to imply $P_{i+1}$, since the inputs of the computation sequences represented by $P_i$ and by $P_{i+1}$ are not the same. To overcome this difficulty, it suffices to define an assertion $Q_i$ relative to the first $i$ steps of any computation and such that $Q_i$ implies $P_i$ for all $i \geq 0$. The above difficulty disappears and it is sensible to require that $Q_i$ implies $Q_{i+1}$.
This proof method is an abstract formulation of the inductive assertion method. It consists precisely in defining a relation $\theta \subseteq A \times (S+B)$ such that

  (11)  $\theta \cap (A \times B) \subseteq \Psi$

and

  (12)  $\alpha\beta^i \subseteq \theta$ for all $i \geq 0$.

Condition (12) is achieved by requiring that

  (13)  $\alpha \subseteq \theta$

and

  (14)  $\theta\beta \subseteq \theta$.

And (12) now follows from (13) and (14) by a simple induction on $i$ (i.e. we take for $Q_i$ the condition $\alpha\beta^i \subseteq \theta$). If (13) and (14) hold, then $\theta$ is called an inductive assertion. It defines a relation between the input and the last state of any computation sequence. In an abstract sense this method can be considered as complete, i.e. as able to establish any true partial correctness formula of the form (9), since it suffices to choose for $\theta$ the relation $\alpha\beta^*$. This result appears in De Bakker and Meertens [4]. But it is not complete if one requires that $\theta$ be expressible in first-order logic. A counter-example can be built from Wand [13]. On the other hand $\alpha\beta^*$ is expressible
in the infinitary language $L_{\omega_1\omega}$, as noticed by Enjalbert [6,7].
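On finite relations, conditions (11), (13) and (14) are plain set inclusions and can be checked mechanically. The sketch below is a hedged illustration only: the toy program (repeatedly subtracting 2 and outputting the remainder), its specification, and the function names are all assumptions introduced for this example.

```python
def compose(r, s):
    """Relational composition: (x, z) whenever (x, y) in r and (y, z) in s."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}


def is_inductive_assertion(theta, alpha, beta):
    """Conditions (13) and (14): alpha is included in theta, theta is closed under beta."""
    return alpha <= theta and compose(theta, beta) <= theta


def proves_partial_correctness(theta, alpha, beta, outputs, spec):
    """Conditions (11), (13) and (14) together."""
    theta_on_outputs = {(d, s) for (d, s) in theta if s in outputs}
    return is_inductive_assertion(theta, alpha, beta) and theta_on_outputs <= spec


# Hypothetical toy program: subtract 2 while possible, then output what is left.
A = set(range(5))
B = {("out", 0), ("out", 1)}
alpha = {(d, ("s", d)) for d in A}
beta = ({(("s", n), ("s", n - 2)) for n in range(2, 5)}
        | {(("s", n), ("out", n)) for n in (0, 1)})

# Specification: the output is the parity of the input.
spec = {(d, ("out", d % 2)) for d in A}

# Candidate assertion: the current value never exceeds the input and keeps its parity.
theta = {(d, ("s", n)) for d in A for n in range(d + 1) if n % 2 == d % 2} | spec

print(proves_partial_correctness(theta, alpha, beta, B, spec))
print(is_inductive_assertion(alpha | spec, alpha, beta))
```

The second candidate, `alpha | spec`, satisfies (13) but not (14): it says nothing about intermediate states, so it is not closed under one more step. This is the reason an assertion about the first $i$ steps is needed rather than the specification alone.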
Since the set of true partial correctness formulas is not recursively enumerable in general (due to the incompleteness of arithmetic), there is no hope of finding any concrete completeness result (i.e. relative to a recursively enumerable set of proofs).

(2.3) - Application to flowcharts

A flowchart can be considered as an iterative program whose set of states is of the form $S = L \times D$ where $L$ is a finite set (the set of instruction labels) and $D$ is the set of all assignments of values from some domain to the variables of the program. (By variables one does not understand only individual variables ranging over $\mathbb{N}$, $\mathbb{Z}$ or $\mathbb{R}$ but also more complex data structures like arrays, lists, trees, etc.)

An Example
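Rather than a fully formal treatment we consider an example from which the general case can be easily derived. Let $P = \langle A, S, B, \alpha, \beta \rangle$ be an iterative program such that

  $S = \{0,1,2\} \times D$,

  $\alpha = \{(d,(0,d')) \mid (d,d') \in \alpha_0\}$ for some $\alpha_0 \subseteq A \times D$ (this means that 0 labels the initial instruction),

  $\beta = \{((i,d),(j,d')) \mid i,j \in \{0,1,2\},\ (d,d') \in \beta_{i,j}\} \cup \{((i,d),b) \mid i \in \{0,1,2\},\ (d,b) \in \beta_{i,exit}\}$

for some $\beta_{i,j} \subseteq D \times D$ and some $\beta_{i,exit} \subseteq D \times B$.

If we furthermore assume that $\beta_{0,0} = \beta_{1,1} = \beta_{1,2} = \beta_{2,0} = \beta_{2,exit} = \emptyset$ then $P$ can be represented by the following diagram:

  [Figure: flowchart diagram of $P$.]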
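Let now $\theta$ be an inductive assertion, i.e. a relation satisfying (11), (13) and (14). Since $\theta \subseteq A \times ((\{0,1,2\} \times D) + B)$ it can be written

  $\theta = \{(a,(i,d')) \mid (a,d') \in \theta_i,\ i \in \{0,1,2\}\} \cup \{(a,b) \mid (a,b) \in \theta_{exit}\}$

for some relations $\theta_0, \theta_1, \theta_2 \subseteq A \times D$ and some relation $\theta_{exit} \subseteq A \times B$.

Hence condition (13) reduces to

  (15)  $\alpha_0 \subseteq \theta_0$,

condition (14) reduces to the following conditions

  (16)  $\theta_i \beta_{i,j} \subseteq \theta_j$,  $i,j \in \{0,1,2\}$
  (17)  $\theta_i \beta_{i,exit} \subseteq \theta_{exit}$,  $i \in \{0,1,2\}$

and condition (11) reduces to:

  (18)  $\theta_{exit} \subseteq \Psi$.

Actually (17) and (18) can be replaced by

  (19)  $\theta_i \beta_{i,exit} \subseteq \Psi$,  $i \in \{0,1,2\}$

which eliminates the introduction of $\theta_{exit}$.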
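Conditions (15), (16) and (19) can be checked label by label when the relations are finite. The following Python sketch assumes a hypothetical two-label flowchart that doubles its input by repeated addition; the data domain, the relations and the name `check_invariants` are illustrative assumptions, not taken from the example above.

```python
def compose(r, s):
    """Relational composition: (x, z) whenever (x, y) in r and (y, z) in s."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}


def check_invariants(alpha0, beta, beta_exit, theta, spec):
    """Check (15), (16) and (19) for label-indexed relations given as dicts."""
    if not alpha0 <= theta[0]:                          # (15)
        return False
    for (i, j), rel in beta.items():                    # (16)
        if not compose(theta[i], rel) <= theta[j]:
            return False
    for i, rel in beta_exit.items():                    # (19)
        if not compose(theta[i], rel) <= spec:
            return False
    return True


# Hypothetical flowchart: data are pairs (n, acc); label 0 tests n, label 1 updates.
A = range(4)
D = {(n, acc) for n in range(4) for acc in range(7)}
alpha0 = {(a, (a, 0)) for a in A}
beta = {
    (0, 1): {(d, d) for d in D if d[0] > 0},
    (1, 0): {((n, acc), (n - 1, acc + 2)) for (n, acc) in D if n > 0 and acc + 2 <= 6},
}
beta_exit = {0: {((0, acc), acc) for acc in range(7)}}

spec = {(a, 2 * a) for a in A}                          # the output is twice the input

# Invariant at both labels: acc + 2n equals twice the input.
theta = {
    0: {(a, (n, acc)) for a in A for (n, acc) in D if acc + 2 * n == 2 * a},
    1: {(a, (n, acc)) for a in A for (n, acc) in D if acc + 2 * n == 2 * a and n > 0},
}
theta_bad = {0: set(alpha0), 1: set()}                  # too weak to be inductive

print(check_invariants(alpha0, beta, beta_exit, theta, spec))
print(check_invariants(alpha0, beta, beta_exit, theta_bad, spec))
```

Only the nonempty relations $\beta_{i,j}$ need to be listed, since an empty relation satisfies (16) and (19) trivially.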
The set of conditions (15), (16) and (17) expresses that $\theta$ forms a system of invariant assertions in the terminology of Katz and Manna [11]. (This paper presents several techniques for constructing invariant assertions.) For applying this method to a non-recursive program one has to find a set of cut-points, i.e. a set of instruction labels such that every loop in the graph of the program goes through at least one cut-point. This allows one to divide the program into blocks (like the relations $\alpha_0$ and $\beta_{i,j}$ above) such that each block contains finitely many execution paths. This finiteness condition allows one to express the relations $\beta_{i,j}$ by first-order quantifier-free logical formulas. See [11] for more details.
(2.4) - Application to parameterless recursive procedures

The iterative program implementing a recursive parameterless procedure (or rather a set of mutually recursive such procedures; by parameterless we mean that they modify a fixed finite set of global variables) uses a set $S$ of the form $R \times D$ where $R$ is an infinite set of words over a finite alphabet. Each element of $R$ is a stack of return addresses. Working out the manipulations we did in the example of (2.3) yields an infinite family of inductive assertions $(\theta_r)_{r \in R}$. Provided one can handle the infinite set of conditions similar to (16) and (19), one can establish the partial correctness of recursive parameterless procedures.

The formal treatment has been done in De Bakker and Meertens [4]. But it is practically very difficult to use for concrete proofs. We shall provide a much more convenient method in the next section that is also applicable to recursive procedures with parameters.

3 - Recursive computations.

In real programming languages like ALGOL or PASCAL recursive procedures are very difficult to investigate formally.
For this reason two types of recursive procedures have been extracted from these languages: the applicative ones (which form the core of pure LISP), investigated in depth in many works (see Guessarian [9]; this book contains many other references), and the imperative ones, which have been much less investigated (see Gallier [8] and, in a more abstract setting, De Bakker and Meertens [4]). These two classes are investigated with different techniques and neither of them encompasses all ALGOL recursive procedures. We shall present a new formalism which encompasses both types and can (probably) encompass all ALGOL procedures which do not take procedures as parameters. It is based on the idea of defining the structure of a recursive computation by means of a tree which represents the relations between the different recursive calls. Partial correctness formulas can then be proved by induction on these trees, i.e. in some sense by induction on the structure of computations. (The notion of a call-tree is borrowed from Courcelle and Deransart [3]; the notion of a clausal scheme is original.)

(3.1) - Clausal schemes.

Definition. Let $\mathcal{A} = \{A_1, \ldots, A_N\}$ be a set of unary relational symbols; let $\mathcal{B} = \{B_1, \ldots, B_M\}$ be a set of relational symbols with positive arity ($\rho(B)$ denotes the arity of $B$ in $\mathcal{B}$). Let $x_0, x_1, \ldots$ be variables. We define an $(\mathcal{A},\mathcal{B})$-clause (also called simply a clause in the context of a fixed pair $(\mathcal{A},\mathcal{B})$) as a sequence of the form

  $(A_{i_0}, A_{i_1}, \ldots, A_{i_n}, B_j)$

where $n \geq 0$, $i_0, \ldots, i_n \in [N]$, $j \in [M]$, $\rho(B_j) = n+1$. Anticipating on the semantics we shall denote it as a logical formula

  $A_{i_0}(x_0) \Leftarrow A_{i_1}(x_1)$ and $\ldots$ and $A_{i_n}(x_n)$ and $B_j(x_0, \ldots, x_n)$.

A clausal scheme on $(\mathcal{A},\mathcal{B})$ is a set $S$ of $(\mathcal{A},\mathcal{B})$-clauses.
An interpretation for $\mathcal{B}$ is a relational structure $I = (D_I, (B_I)_{B \in \mathcal{B}})$ consisting of a nonempty set $D_I$ and a $\rho(B)$-ary relation $B_I$ on $D_I$ associated with every $B$ in $\mathcal{B}$.

A pair $(S,I)$ where $S$ and $I$ are as above is called a clausal program. The value of $S$ in $I$ is an $N$-tuple of unary relations $A_{1I}, \ldots, A_{NI}$ on $D_I$ that we shall define in two equivalent ways, first by taking a least fixed point and secondly by means of call-trees (that are close to computation sequences).
(3.2) - Least fixed point semantics

We say that an $N$-tuple $a_1, \ldots, a_N$ of subsets of $D_I$ is a solution of $S$ in $I$ if the relational structure $(I, a_1, \ldots, a_N) = (D_I, (B_I)_{B \in \mathcal{B}}, (a_i)_{A_i \in \mathcal{A}})$ is a model of $S$ considered as a set of formulas of the form

  (20)  $A_{i_0}(x_0) \Leftarrow A_{i_1}(x_1)$ and $\ldots$ and $A_{i_n}(x_n)$ and $B_j(x_0, \ldots, x_n)$.

The set $\mathcal{P}(D_I)^N$ is ordered by $(a_1, \ldots, a_N) \subseteq (a'_1, \ldots, a'_N)$ iff $a_i \subseteq a'_i$ for all $i \in [N]$.

LEMMA. A clausal scheme $S$ has a least solution in every interpretation $I$.

We denote this least solution by $(A_{1I}, \ldots, A_{NI})$ and consider it as the value of $S$ in $I$. This lemma is easy to establish (its proof is basically the one of the main theorem of [12]).

Let us only recall that $(A_{1I}, \ldots, A_{NI})$ can be concretely defined as

  $\bigcup_{i \geq 0} K_S^i(\emptyset, \ldots, \emptyset)$

where $K_S$ maps $\mathcal{P}(D_I)^N$ into itself as follows: for $a_1, \ldots, a_N \subseteq D_I$, $K_S(a_1, \ldots, a_N)$ denotes the $N$-tuple $(a'_1, \ldots, a'_N)$ such that

  $a'_i = \bigcup_{C \in S_i} K_C(a_1, \ldots, a_N)$

where $S_i$ is the set of clauses $C$ in $S$ of the form (20) with $i_0 = i$, and where for such a clause $C$

  $K_C(a_1, \ldots, a_N) = \{d_0 \in D_I \mid B_{jI}(d_0, d_1, \ldots, d_n) \text{ is true for some } d_1 \text{ in } a_{i_1}, \ldots, d_n \text{ in } a_{i_n}\}$.
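When $D_I$ is finite, the union of the iterates of $K_S$ is reached after finitely many steps, so the least solution can be computed by simple iteration. The Python sketch below illustrates this under stated assumptions: clauses are encoded as triples `(head, body, b)` with 0-based indices, the relation $B_{jI}$ is given as a Python predicate, and the even/odd scheme is a hypothetical example chosen for this illustration.

```python
from itertools import product


def k_s(clauses, domain, a):
    """One application of K_S to the tuple a = (a_1, ..., a_N) of subsets of the domain."""
    new = [set() for _ in a]
    for head, body, b in clauses:
        # choose d_1 in a_{i_1}, ..., d_n in a_{i_n}, then keep every d_0 with B(d_0, ..., d_n)
        for ds in product(*(a[i] for i in body)):
            for d0 in domain:
                if b(d0, *ds):
                    new[head].add(d0)
    return tuple(frozenset(s) for s in new)


def least_solution(clauses, domain, n_symbols):
    """Iterate K_S from the empty tuple until nothing changes (finite domains only)."""
    a = tuple(frozenset() for _ in range(n_symbols))
    while True:
        nxt = k_s(clauses, domain, a)
        if nxt == a:
            return a
        a = nxt


# Hypothetical scheme: Even(x0) <= Zero(x0); Even(x0) <= Odd(x1) and Succ(x0, x1);
# Odd(x0) <= Even(x1) and Succ(x0, x1), interpreted over {0, ..., 9}.
EVEN, ODD = 0, 1
zero = lambda d0: d0 == 0
succ = lambda d0, d1: d0 == d1 + 1
clauses = [(EVEN, [], zero), (EVEN, [ODD], succ), (ODD, [EVEN], succ)]

even, odd = least_solution(clauses, range(10), 2)
print(sorted(even), sorted(odd))
```

Because $K_S$ is monotone, the iterates form an increasing chain starting from the empty tuple, so the first repeated value is the least solution.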
(3.3) - Operational semantics

Our second characterization uses the concept of a call-tree, defined as follows. Let $A_i \in \mathcal{A}$. A call-tree $t$ of $A_i$ is a finite tree such that for some clause $C$ in $S_i$ of the form (20)

  i) either $n = 0$ and $t$ is reduced to a node labeled by $C$,
  ii) or $n \geq 1$ and $t$ consists of a root labeled by $C$ with immediate subtrees $t_1, \ldots, t_n$, where $t_1, \ldots, t_n$ are call-trees of $A_{i_1}, \ldots, A_{i_n}$ respectively.

By using the notation $C$ in case (i) and the notation $C(t_1, \ldots, t_n)$ in case (ii) one gets a linear notation for call-trees. A call-tree $t$ can be considered as defining a certain subset $t_I$ of $D_I$. Using the recursive definition of call-trees we can define $t_I$ as follows:

  $t_I = B_{jI}$ in case (i). This definition is meaningful since in this case $C$ is reduced to $A_i(x_0) \Leftarrow B_j(x_0)$ and $\rho(B_j) = 1$.

  $t_I = \{d_0 \in D_I \mid B_{jI}(d_0, d_1, \ldots, d_n) \text{ is true for some } d_1 \text{ in } t_{1I}, \ldots, d_n \text{ in } t_{nI}\}$ in case (ii).

We now define

  $A_{iI} = \bigcup \{t_I \mid t \text{ is a call-tree of } A_i\}$.

A routine proof can establish that this definition coincides with the first one.

Remarks

(1) Call-trees correspond to syntactical computation sequences defined for iterative programs. The set $t_I$ is the
set of all tuples $(\mathrm{input}_1, \ldots, \mathrm{input}_k, \mathrm{output}_1, \ldots, \mathrm{output}_l)$ which are computed by the tree of calls specified by $t$ (recall that the variables $x_0, \ldots, x_n$ appearing in the definition of clauses as in (20) represent tuples of values in the appropriate domain; see example 1 below).

(2) In the case of iterative programs we have only used an operational semantics (based on the concept of a computation sequence). It is not difficult to convert equation (9) into a least fixed-point characterization of $|P|$ by using the fact that the transitive closure of a relation is itself a least fixed point.
(3.4) - Applicative and imperative recursive programs

There are two main types of recursive programs, the applicative (pure-LISP-like) ones and the imperative (ALGOL-like) ones. The former are written with functions, predicates and if-then-else, and define new functions by means of function application and recursive calls. The latter use the sequencing of instructions as the basic control structure, and the effect of a program is a modification of the values of the variables. Both types of recursive programs can be considered as interpreted clausal schemes. Rather than formal constructions, which can be found in Courcelle and Deransart [3], we give representative examples.

Example 1 : Ackermann's function

The well-known Ackermann's function $f : \mathbb{N} \times \mathbb{N} \to \mathbb{N}$ can be defined by the following recursive applicative program

  (21)  f(x,y) = if x = 0 then y + 1 else if y = 0 then f(x-1,1) else f(x-1, f(x,y-1))

The computation of f(1,2) proceeds as follows

  f(1,2) = f(0,f(1,1))
         = f(1,1) + 1
         = f(0,f(1,0)) + 1
         = f(1,0) + 2
         = f(0,1) + 2
         = 2 + 2 = 4
We have underlined the occurrence of f which is replaced at the next step. Other computations (yielding the same result) are possible:

  f(1,2) = f(0,f(1,1))
         = f(0,f(0,f(1,0)))
         = f(0,f(0,f(0,1)))
         = f(0,f(0,2))
         = f(0,3)
         = 4

Anticipating on the sequel we can represent the organization of recursive calls by the following tree, which is the same in the two cases:

  [Figure: tree of recursive calls, rooted at f(1,2).]

We represent (21) by the clausal scheme

  $A(u_0) \Leftarrow B_1(u_0)$
  $A(u_0) \Leftarrow A(u_1)$ and $B_2(u_0,u_1)$
  $A(u_0) \Leftarrow A(u_1)$ and $A(u_2)$ and $B_3(u_0,u_1,u_2)$

The corresponding interpretation is $I$ with domain $D_I = \mathbb{N} \times \mathbb{N} \times \mathbb{N}$. We shall define $B_{1I}, B_{2I}, B_{3I}$ such that $A_I = \{(x,y,z) \in \mathbb{N}^3 \mid z = f(x,y)\}$.

Each variable $u_0, u_1, u_2$ stands for a triple of variables ranging over $\mathbb{N}$; we let $u_i$ represent $(x_i, y_i, z_i)$. We now define $B_{1I}, B_{2I}, B_{3I}$ by formulas with free variables $x_0, y_0, z_0, x_1, \ldots$ etc.

  $B_1(x_0,y_0,z_0) \Leftrightarrow x_0 = 0$ and $z_0 = y_0 + 1$
  $B_2(x_0,y_0,z_0,x_1,y_1,z_1) \Leftrightarrow x_0 > 0$ and $y_0 = 0$ and $x_1 = x_0 - 1$ and $y_1 = 1$ and $z_0 = z_1$
  $B_3(x_0,y_0,z_0,x_1,y_1,z_1,x_2,y_2,z_2) \Leftrightarrow x_0 > 0$ and $y_0 > 0$ and $x_1 = x_0$ and $y_1 = y_0 - 1$ and $x_2 = x_0 - 1$ and $y_2 = z_1$ and $z_0 = z_2$.

The two computations of f(1,2) both correspond to the following call-tree $t$:

  [Figure: the call-tree $t$; in linear notation, $t = B_3(B_3(B_2(B_1), B_1), B_1)$.]
It is easy to verify that (1,2,4) belongs to $t_I$. Note that $t$ does not represent a unique computation sequence (in the sense of section 1) but rather a set of equivalent computation sequences (the equivalence of computation sequences has been formally defined in Berry and Levy [2]).

Example 2

Consider the following sorting algorithm which modifies a sequence u (say of integers) so as to sort it (say by increasing order):

  sort(u)
  begin
    if length(u) > 1 then
      new variable v, w of type sequence of integers
      split(u,v,w)
      sort(v)
      sort(w)
      u ← merge(v,w)
  end

This program uses an auxiliary procedure split(u,v,w) which divides u into two parts (as equal as possible), assigns to v the first part and to w the second part without modifying u
(so that u = v.w after execution). The base function merge forms a unique sorted sequence by interleaving the two sorted sequences it takes as arguments. We can translate this program into the clausal program

  $A(u,u') \Leftarrow B_1(u,u')$
  $A(u,u') \Leftarrow A(v,v')$ and $A(w,w')$ and $B_2(u,u',v,v',w,w')$

where

  $B_1(u,u') \Leftrightarrow$ length(u) = 1 and u' = u
  $B_2(u,u',v,v',w,w') \Leftrightarrow$ length(u) > 1 and u = v.w and length(v) $\geq$ length(w) $\geq$ length(v) - 1 and u' = merge(v',w')

and u, u', v, v', w, w' range over nonempty sequences of integers.

In this program, $A(u,u')$ should be understood as: u' is the result of the sorting of u.

The call-tree associated with the sorting of any sequence u of length 5 is $B_2(B_2(B_2(B_1,B_1),B_1),B_2(B_1,B_1))$.
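The call-trees of both examples can be generated mechanically by following the clauses. The Python sketch below is an illustration under assumptions: the function names are hypothetical, each tree is written in linear notation with the relation symbol standing for its clause, and for the sorting example the first part produced by split is assumed to be the longer one when the length is odd.

```python
def ackermann_call_tree(x, y):
    """Return (f(x, y), call-tree in linear notation) for Example 1."""
    if x == 0:                                   # clause with B1: z0 = y0 + 1
        return y + 1, "B1"
    if y == 0:                                   # clause with B2: one call, f(x-1, 1)
        z1, t1 = ackermann_call_tree(x - 1, 1)
        return z1, f"B2({t1})"
    z1, t1 = ackermann_call_tree(x, y - 1)       # u1 = (x0, y0 - 1, z1)
    z2, t2 = ackermann_call_tree(x - 1, z1)      # u2 = (x0 - 1, z1, z2)
    return z2, f"B3({t1},{t2})"


def sort_call_tree(length):
    """Call-tree of sort on any sequence of the given (positive) length, Example 2."""
    if length == 1:
        return "B1"
    first = (length + 1) // 2                    # assumed: length(v) >= length(w)
    return f"B2({sort_call_tree(first)},{sort_call_tree(length - first)})"


value, tree = ackermann_call_tree(1, 2)
print(value, tree)
print(sort_call_tree(5))
```

The tree for sorting depends only on the length of the sequence, which is why the text can speak of the call-tree of any sequence of length 5.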
(3.5) - Partial correctness of clausal programs

Let us recall that the partial correctness of a program w.r.t. $\Psi$ can be formulated as follows:

  For every terminating computation with input $x$ and output $y$, property $\Psi(x,y)$ holds.

In the case of clausal schemes, terminating computations correspond to call-trees. For a clausal scheme $S$ over $(\mathcal{A},\mathcal{B})$, where $\mathcal{A} = \{A_1, \ldots, A_N\}$ and $I$ is an interpretation, we define a specification $\Psi$ as an $N$-tuple $(\Psi_1, \ldots, \Psi_N)$ of subsets of $D_I$. One says that $(S,I)$ is partially correct w.r.t. $\Psi$ if for all $i$ in $[N]$, for every call-tree $t$ of $A_i$, $t_I \subseteq \Psi_i$.

This is equivalent to requiring that $A_{iI} \subseteq \Psi_i$ for all $i$ in $[N]$.

In concrete cases, the subsets $\Psi_1, \ldots, \Psi_N$ will be defined by logical formulas, as are the $B_I$'s for $B \in \mathcal{B}$ (see examples 1 and 2).
How to establish the partial correctness of a clausal program

We now propose a method for establishing the partial correctness of a clausal program $(S,I)$ with respect to some specification $\Psi$.

Definition: A specification $\Theta = (\Theta_1, \ldots, \Theta_N)$ is inductive for $(S,I)$ if $K_S(\Theta_1, \ldots, \Theta_N) \subseteq (\Theta_1, \ldots, \Theta_N)$, or equivalently if $(\Theta_1, \ldots, \Theta_N)$ is a solution of $S$ in $I$.

Fact 1: If $\Theta$ is inductive with respect to $(S,I)$ then $(S,I)$ is partially correct with respect to $\Theta$.

Proof: Since $(A_{1I}, \ldots, A_{NI})$ is the least solution of $S$ in $I$, $A_{iI} \subseteq \Theta_i$ if $\Theta$ is inductive. This means that $(S,I)$ is partially correct w.r.t. $\Theta$. □

Fact 2: $(A_{1I}, \ldots, A_{NI})$ is an inductive specification for $(S,I)$.
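Over a finite domain the definition of an inductive specification is a single inclusion test, $K_S(\Theta) \subseteq \Theta$. The Python sketch below illustrates the definition and Fact 1 under assumptions: the clause encoding `(head, body, b)` with 0-based indices, the even/odd scheme, and the specifications `theta` and `psi` are all hypothetical choices made for this example.

```python
from itertools import product


def k_s(clauses, domain, a):
    """One application of K_S to the tuple a = (a_1, ..., a_N) of subsets of the domain."""
    new = [set() for _ in a]
    for head, body, b in clauses:
        for ds in product(*(a[i] for i in body)):
            for d0 in domain:
                if b(d0, *ds):
                    new[head].add(d0)
    return tuple(frozenset(s) for s in new)


def is_inductive(clauses, domain, theta):
    """Theta is inductive for (S, I) when K_S(Theta) is included in Theta componentwise."""
    image = k_s(clauses, domain, theta)
    return all(img <= th for img, th in zip(image, theta))


# Hypothetical scheme over {0, ..., 9}: symbol 0 is "Even", symbol 1 is "Odd".
domain = range(10)
zero = lambda d0: d0 == 0
succ = lambda d0, d1: d0 == d1 + 1
clauses = [(0, [], zero), (0, [1], succ), (1, [0], succ)]

# Specification to establish: 3 is not in Even (Odd is unconstrained).
psi = (frozenset(domain) - {3}, frozenset(domain))

# Stronger candidate: Even contains only even numbers, Odd only odd numbers.
theta = (frozenset(d for d in domain if d % 2 == 0),
         frozenset(d for d in domain if d % 2 == 1))

print(is_inductive(clauses, domain, theta))                 # theta is inductive
print(all(t <= p for t, p in zip(theta, psi)))              # theta is stronger than psi
print(is_inductive(clauses, domain, psi))                   # psi itself is not inductive
```

Here `psi` holds of the least solution although it is not inductive; it is established indirectly, through the stronger inductive specification `theta` and Fact 1.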
These two facts yield the following proposition:

(3.6) Proposition. A clausal program $(S,I)$ is partially correct w.r.t. some specification $\Psi$ iff there exists a specification $\Theta$ such that

  (1) $\Theta$ is inductive w.r.t. $(S,I)$,
  (2) $\Theta$ is stronger than $\Psi$, i.e. $\Theta \subseteq \Psi$ ($\Theta \Rightarrow \Psi$ if one thinks of $\Theta$ and $\Psi$ as formulas).

Proof. The "if" part follows from Fact 1. The "only if" part follows from Fact 2, which shows that $(A_{1I}, \ldots, A_{NI})$ can be taken as the requested specification $\Theta$. □

Remarks

(1) A clausal program $(S,I)$ may be partially correct w.r.t. some specification $\Psi$ without $\Psi$ being inductive.
(2) In concrete cases the following holds:

  ($\alpha$) One works in a model $M = \langle D_M, (f_M)_{f \in F} \rangle$ of some set $\mathfrak{A}$ of axioms (a typical example is $\mathbb{N}$ with Peano's axioms).

  ($\beta$) The domain $D_I$ is $M^k$ for some finite $k$.

  ($\gamma$) The relations $B_I$ are defined by quantifier-free logical formulas in the language of $M$; say $\varphi_B$ defines $B_I$.

  ($\delta$) So are the components of the specification $\Psi$, with first-order logical formulas in the language of $M$ (we denote them by $\psi_1, \ldots, \psi_N$).

In order to establish that $(S,I)$ is partially correct with respect to $\Psi$ it suffices to find $\Theta$, i.e. an $N$-tuple of logical formulas $\theta_1, \ldots, \theta_N$, such that

  ($\varepsilon$)  $\mathfrak{A} \vdash \theta_i \Rightarrow \psi_i$  for all $i = 1, \ldots, N$

(i.e. such that one can prove in $\mathfrak{A}$ that $\Theta$ is stronger than $\Psi$) and such that:

  ($\eta$)  $\mathfrak{A} \vdash \forall x_0, \ldots, x_n\ [\,\varphi_B(x_0, \ldots, x_n)$ and $\theta_{i_1}(x_1)$ and $\ldots$ and $\theta_{i_n}(x_n) \Rightarrow \theta_{i_0}(x_0)\,]$

for all clauses $C$ of the form (20) (i.e. one can prove in $\mathfrak{A}$ that $\Theta$ is inductive).

The proof method consisting in defining $\mathfrak{A}$, $k$, $\varphi_B$, $\psi_i$, $\theta_i$ as in ($\alpha$) - ($\delta$) and satisfying ($\varepsilon$) and ($\eta$) is sound. This is a consequence of the "if" part of Proposition (3.6).

(3) The "only if" part of Proposition (3.6) looks like a completeness result. It is one at the set-theoretical level, where one can deal with arbitrary subsets of some given set. It is not one if the subsets must be defined in first-order logic, since the relations $A_{iI}$ are not usually expressible in first-order logic (but they are in second-order logic or in $L_{\omega_1\omega}$). A precise counter-example has been given by Wand [13] and adapted to the present situation by Courcelle and Deransart [3].

To summarize, this proof method is applicable to the following classes of programs:
  imperative recursive programs (deterministic or not),
  applicative recursive programs with call-by-value computations (deterministic or not),
  deterministic applicative recursive programs with call-by-name computations,
  attribute grammars (as shown in [3]).

Applications to PROLOG programs can be expected, since the notion of a program defined as a set of clauses and the relational style of clausal programs are borrowed from PROLOG. But they are not very deep, since in PROLOG the major problem is with termination and not with partial correctness.

Acknowledgements. I thank Z. Manna and the referee for many helpful comments.
---0---
REFERENCES

[1] H. ANDREKA, I. NEMETI, I. SAIN, A complete logic for reasoning about programs via nonstandard model theory, Theor. Comput. Sci. 17 (1982) pp. 193-212 and pp. 259-278.

[2] G. BERRY, J.J. LEVY, Minimal and optimal computations of recursive programs, J. Assoc. Comp. Mach. 26 (1979) pp. 148-175.

[3] B. COURCELLE, P. DERANSART, Proofs of partial correctness for attribute grammars and recursive procedures, INRIA research report 322, July 1984.

[4] J. De BAKKER, L. MEERTENS, On the completeness of the inductive assertion method, Journ. Comput. Syst. Sci. 11 (1975) pp. 323-357.

[5] N. DERSHOWITZ, Termination, Proc. of the colloquium on rewriting techniques and applications, Dijon 1985, L.N.C.S. vol. 202, Springer-Verlag.

[6] P. ENJALBERT, Algebraic semantics and program logics: algorithmic logic for program trees. In Logics of Programs and their applications, Salwicki ed., vol. 148, Springer-Verlag, 1983.

[7] P. ENJALBERT, ω-rule and continuity, Bialowieza Conference on logic of program, October 1981.

[8] J. GALLIER, Non-deterministic flow-chart programs with recursive procedures: Semantics and correctness, Theor. Comput. Sci. 13 (1981) pp. 193-229 and 239-270.

[9] I. GUESSARIAN, Algebraic Semantics, L.N.C.S. vol. 99, 1981.

[10] D. HAREL, On folk theorems, Communications of ACM 23 (1980) pp. 379-389.

[11] S. KATZ, Z. MANNA, Logical analysis of programs, C.ACM 19 (1976) pp. 188-206.

[12] M. VAN EMDEN, R. KOWALSKI, The semantics of predicate logic as a programming language, J. Ass. Comp. Mach. 23 (1976) pp. 733-742.

[13] M. WAND, A new incompleteness result for Hoare's logic, J. Ass. Comp. Mach. 25 (1978) pp. 168-175.
Logic Colloquium '85 Edited by The Paris Logic Group
111
© Elsevier Science Publishers B.V. (North-Holland), 1987
SYSTEM AND METASYSTEM IN RUSSELL

Jean van Heijenoort
Brandeis University

Principia mathematica was originally conceived, in December 1902, as the second volume of Russell 1903, which bears the indication 'Volume I'; it soon became an independent project. The two works, to which Russell 1919 must be added, are nevertheless the products (with Whitehead's help as far as PM is concerned) of a sustained effort on Russell's part. What was he trying to do? Russell declares, in the opening sentence of the preface to PM, that the subject of the work is 'the mathematical treatment of the principles of mathematics'; and the two titles, The principles of mathematics and Principia mathematica, seem to confirm that such is indeed Russell's undertaking.
We begin, however, to have certain doubts when we see that most of the examples he invokes are non-mathematical. One might perhaps consider that these examples belong to explanations that remain outside the system and therefore have no theoretical importance. But in PM, *1, among the 'Primitive ideas', Russell includes 'elementary propositions', which he characterizes as follows: 'By an "elementary" proposition we mean one which does not involve any variables or, in other words, one which does not contain such words as "all", "some", "the", or equivalents of such words. A proposition such as "this is red", where "this" is something given in sensation, will be elementary'. Mathematicians hardly seem to concern themselves with a proposition such as 'this is red'. They start from an arbitrary domain (as in group theory, for example), with operations or relations defined on that domain, but certainly not with non-mathematical properties such as 'red'; or else they start from a universe of sets, built up from the empty set by set theory.

In Russell there is a continuity between the mathematical and the non-mathematical. His main argument against Hilbert is that founding arithmetic axiomatically 'has the disadvantage of failing to explain how numbers apply when one counts' (1959, page 110). Why? Because the collection of the twelve apostles, for example, is completely detached from the mathematical entities introduced axiomatically. If any objects whatever can be enumerated, every collection of those objects must be regarded as figuring among the sets that serve to define the cardinal numbers. One does not have a mathematical theory that is subsequently applied to the world of the senses; one has a tight interweaving of the mathematical and the non-mathematical, which makes for a universal system.

Propositions have in Russell, as is well known, an ambiguous status; they hover between sentences (that is, linguistic objects) and facts (states of affairs).
Opting for sentences, we can perhaps express Russell's conception as follows: the universe consists of individuals (designated by demonstratives: this or that), to which predicates (of one or more arguments) are attached, and we have an initial stock of true sentences, namely all those asserting that such and such a predicate applies, or does not apply, to such and such individual(s). These true sentences characterize the universe; they embrace everything that can be said in human knowledge and form the ground on which logic stands, and mathematics too, since, according to Russell, mathematics is nothing but logic. 'The universe consists of objects having various qualities and standing in various relations' (PM₁, page 45; PM₂, page 43). And: 'Our system begins with "atomic propositions". We accept these as something given, because the problems which arise concerning them belong to the philosophical part of logic and are not amenable (at any rate at present) to mathematical treatment' (PM₂, page xv). Logic turns out to be an abstract structure erected on the sentences that characterize the universe.

Before Russell, Frege had already held that logic rests on the atomic sentences that are true in a fixed and all-embracing universe, the universe of objects and functions.
In the preface to 1879 he envisages an extension of his system to the whole of mathematics, to geometry, to mechanics, to physics. Frege and Russell no doubt differ in their views of the universe. For Frege, it is a rationally reconstructed cosmos in which properties are 'objective' (on this point see van Heijenoort 1985, pages 91-92); for Russell (at certain moments at least), it is a world peopled with many a 'this' and 'that', each of them given by sensation. But both agree in thinking that logic rests on a single universe and should not stoop to considering, one after another, so-called universes of discourse, desiccated universes that can be changed at will.
This conception, which lies in the tradition of the medieval logica magna, is not expressly adopted and defended; but, tacitly, it forms the ground on which their work rests.

A first consequence of such a conception is that quantifiers binding individual variables range over all objects, that is, over all the objects in the universe. Thus Frege writes (1879, §11, or van Heijenoort 1967, page 24): 'All further conditions to be imposed on what may be put in place of a Gothic letter [that is, a universally bound variable] must be incorporated into the judgment'. To take a simple example, the commutative law of addition for natural numbers is, on this view, formulated thus:

  $\forall x\, \forall y\, ((Nx \mathbin{\&} Ny) \supset x + y = y + x)$,

and not thus:

  $\forall x\, \forall y\, (x + y = y + x)$.

What is here 'incorporated into the judgment' is the antecedent '(Nx & Ny)'. On this point Russell holds exactly the same position as Frege: 'We must, therefore, allow our x, wherever the truth of our formal implication is thereby unimpaired, to take all values without exception; and where any restriction on variability is required, the implication is not to be regarded as formal until the said restriction has been removed by being turned into an initial hypothesis' (1903, page 38). If we recall that for Russell a formal implication is the universal closure of a conditional, the restriction turned by Russell into an initial hypothesis is exactly what for Frege is the condition incorporated into the judgment.
In PM (at the beginning of Chapter I) Russell distinguishes between restricted and unrestricted variables; a variable is restricted 'when its values are confined to some only of those of which it is capable'; otherwise, the variable is unrestricted. He then adds: 'For the purposes of logic, the unrestricted variable is more convenient than the restricted variable, and we shall always employ it' (PM₁, page 4; PM₂, page 4). There are, in Frege, a certain number of degrees of stratification and, in Russell, an infinite ladder of types; it is understood that a variable is unrestricted at a certain level, not across different levels. As Russell writes, 'the limitations to which the unrestricted variable is subject [through stratification] do not need to be explicitly indicated, since they are the limits of significance of the statement in which the variable occurs, and are therefore intrinsically determined by this statement' (PM₁, page 4; PM₂, page 4).
Since the system is supposed to be all-embracing, what is true is what is asserted in the system, either as an axiom or as a consequence of a certain number of axioms, and a notion of truth that would be maintained outside the system would be illusory. Russell is thus naturally led to refrain from giving his primitive connectives a definition based on truth tables; he regards them as 'indefinable' (1903, page 8). He writes: 'The logical constants themselves are to be defined only by enumeration, for they are so fundamental that all the properties by which the class of them might be defined presuppose some terms of the class' (1903, pages 8-9). And also (page 4): 'a definition of implication is quite impossible'. His argument is as follows: 'If p implies q, then if p is true q is true, that is, the truth of p implies the truth of q; and also, if q is false p is false, that is, the falsehood of q implies the falsehood of p'. And he concludes: 'Thus truth and falsehood give us merely new implications, not a definition of implication'. We have here before our eyes a man advancing across a sticky floor, unable to lift one foot without getting stuck again.

Rules of inference lead Russell into a similar predicament. Since, he thinks, nothing can be said outside the system, rules of inference take on an ambiguous status, and the very notion of a rule is doubtful. In 1919 he lists, as being 'the formal principles of deduction', five axioms for the propositional fragment of his system (pages 149-150). And he adds: 'A formal principle of deduction has a double use [...]. It has a use as the premiss of an inference, and a use as establishing the fact that the premiss implies the conclusion'.
The rule of detachment is put by Russell on the same footing as the axioms. In PM it appears (for quantifier-free formulas) as *1.1 and is stated thus: 'Anything implied by a true elementary proposition is true'. And at the end of this sentence Russell adds 'Pp', an abbreviation borrowed from Peano and meaning 'Primitive proposition'. The same 'Pp' is found at the end of *1.2-6, which are the five axioms for the propositional fragment of PM. The situation into which Russell has sunk is described well enough by Russell himself: 'The process of the inference [that is, the use of the rule of detachment] cannot be reduced to symbols. Its sole record is the occurrence of "⊢ q"' (PM₁, page 9; PM₂, page 9). This is quite true: the rule of detachment 'cannot be reduced to symbols', that is, expressed in the system, and 'its sole record' in the system is the occurrence of '⊢ q'.
Dans Ie systeme nous voyons que la regle a ete appliquee, mais nous ne pouvons pas dire qu'elle l'a ete. regrettable;
C'est la, pour Russell, une situation
la regle de detachement 'echappe a un enonce forme1 et
indique un certain defaut du formalisme en general' (1903, page 34).
At this point Russell invokes, quite rightly, Lewis Carroll's well-known article (1895), which shows how one cannot extricate oneself from the system unless a clearly distinct metasystem has been set up.

It has already been noted (for example, by Gödel (1944, page 126)) that, in PM, Russell specifies his syntax with less precision than Frege. Although Frege, as we shall see below, maintains more strictly than Russell the main elements of what we may here call the universalist conception of logic, this does not prevent him from giving an exact account of the status of the rules of inference; they are, as he writes, rules 'for the use of our signs' and they 'cannot be expressed in the ideography, because they form its basis' (Frege 1879, § 13, or van Heijenoort 1967, page 28). With these few words the sticky floor is washed clean. Whereas Russell regards the conditional as 'indefinable', Frege gives a semantic (metasystematic) definition of it in terms of truth values, a definition that allows him to justify the rule of detachment. He knows how to cut his losses.
J. van Heijenoort
The fact that Frege and Russell regard their systems as all-embracing prevents them from undertaking any metasystematic investigation. Thus Frege abruptly dismisses the problem of consistency: 'For, since an axiom must necessarily be true, it is impossible for axioms to contradict one another. So not a single word should be wasted on the matter' (1969, page 267). On another metasystematic problem, that of the mutual independence of the axioms, Russell is somewhat more talkative, but he too refuses to step outside the system: 'and it should be observed that the method of supposing an axiom false and deducing the consequences of this assumption, a method that has proved admirable in such cases as the axiom of parallels, is here not universally applicable. For all the axioms are principles of deduction; and, if they are true, the consequences that appear to follow from the use of an opposite principle would not really follow, so that arguments based on the assumption that an axiom is false are here subject to special fallacies. Thus the number of indemonstrable propositions may be capable of further reduction and, as regards some of them, I know of no grounds for regarding them as indemonstrable except that they have remained undemonstrated until now' (1903, pages 15-16). And in PM we have a note (1910, page 95; 1925, page 91) saying: 'The generally accepted methods of proving independence cannot be applied without reserve to fundamentals'. Russell's inability to look at his system from the outside is striking here. Let us add that Frege, in his polemic with Hilbert on the foundations of geometry, maintained that the independence of the axioms of Euclidean geometry could not be proved.

We see Russell falling back on experience in the question of the independence of the axioms ('we can only say that certain propositions have remained undemonstrated until now'); he falls back on experience again when the question arises whether the system is adequate. In PM we read: 'the chief reason in favour of any theory on the principles of mathematics must always be inductive, that is, it must lie in the fact that the theory in question enables us to deduce ordinary mathematics' (1910, page v; 1925, page v).
A proof of the completeness of quantification theory would have required considering a set-theoretic notion of truth outside the system, whereas sets, if one is willing to speak of them at all, must be introduced at a certain stage in the development of the system. The very notion of completeness has no meaning, and we see that the system is adequate by deducing in it as many theorems of logic and mathematics as we can. The only completeness to which we can aspire is, to use an expression of Herbrand's, an 'experimental completeness'.

It must be added, on this question of completeness, that Frege and Russell do not regard first-order logic as worthy of independent study. Their formulas are closed, in an absolute sense. Thus they do not consider the formula
but the formula
For such formulas there is only one interpretation, and the fundamental notion is not validity but truth. For Frege and Russell one can hardly even speak of interpretation. That notion implies that one comes to attach the sense to the sign, perhaps in different ways. For them the sense always sticks to the sign.
The universalist conception of logic is associated in Frege with an absolutist view of mathematical truth. The axioms in the various branches of mathematics are true in the sense that their negations, being false, cannot be considered as premises of conditionals. Frege constantly returns to this point. 'Just like the theorems, the axioms are truths; but they are truths that cannot be proved in our system and that, moreover, stand in no need of proof. This implies that there are no false axioms and that we also cannot recognize as axioms thoughts that are doubtful to us' (1969, page 221). And again, 'an axiom that is not true is a contradiction' (page 263). Moreover, all the notions occurring in an axiom must have been previously defined: 'In the expression of an axiom there can be nothing unknown' (page 263). This stubborn dogma finds an immediate application in the special case of geometry: 'No one can serve two masters. One cannot serve truth and serve error. If Euclidean geometry is true, then non-Euclidean geometry is false; and if non-Euclidean geometry is true, then Euclidean geometry is false' (page 183).

Russell's attitude in these matters is quite different from Frege's. As early as 1901 he wrote (1951, page 75): 'Pure mathematics consists entirely of assertions to the effect that, if such and such a proposition is true of any entity whatever, then such and such another proposition is true of that entity. It is essential not to discuss whether the first proposition is really true, and not to mention what the entity is of which it is supposed to be true'. And he concluded with a sentence that has become an often-quoted aphorism: 'Thus mathematics may be defined as the subject in which we never know what we are talking about, nor whether what we are saying is true'.

It has been said of Russell that he was a philosopher without a philosophy.
He often changed his ideas, and his universalist conception of logic is less coherent than Frege's. One could also say that he is less dogmatic than Frege and lacks his peremptory tone. In PM (1910, page 169; 1925, page 161) he declares: 'in practice only the relative types of variables matter' (emphasis in the original). This 'in practice' means that logical technique can, up to a point, be dissociated from ontological presuppositions, and thus, on the logical plane, a certain relativism is introduced. There is in PM one passage, namely paragraph *9, that is squarely metasystematic: if it is by replacing, in certain schemata, the letters by quantifier-free formulas that we obtain the axioms of the propositional fragment, then the formulas obtained by replacing these same letters by arbitrary formulas are provable. On the other hand, we see him refusing to take up so innocent an undertaking as the problem of the independence of the propositional axioms.

With the modern revival of logic and its first successes, researchers saw a wide field of activity opening before them; new ideas were springing up in different directions. That is why one should not look for too much coherence in Russell, or even in Frege. But behind a certain profusion there is a deep tendency, which these two authors follow without questioning it, so obvious does it seem, which therefore remains tacit, but which shows itself on the surface at various points. I have tried to indicate a number of these points. This deep tendency is the fear of circularity.
Système et métasystème chez Russell
Logic is the first science, for it comes before all the others, in particular before the mathematical sciences, since it claims to give (at least potentially) a form to their language and to their arguments. Logic can therefore presuppose nothing. One must start from a clean slate. One sets off and the interlocutor must follow, in both senses of the word, that is, be carried along and understand. As Frege says, this interlocutor must not begrudge him a measure of good will. He must see without being told. Frege uses the German word 'Wink', which means 'sign', 'hint', 'wink'. The metalanguage as a wink!

Pushed to the extreme, this conception is untenable, and one finds in Frege and Russell introductions and preliminary explanations which, as they say, do not officially count. But, in a softened form, this fear of circularity is found again among a number of researchers.

Although Skolem's writings contain no philosophical considerations of a scope comparable to what one finds in Frege or Russell, one can discern between the lines of his technical work a well-defined conception of logic and its foundations, a conception that he formulated explicitly in 1955: 'It seems to me that the foundations of mathematics ought to be laid on a clean slate, that is, without presupposing notions or theorems borrowed from classical mathematics, in particular without presupposing Cantor's transfinite set theory' (1955, page 103; 1970, page 584).
These are lines that Frege or Russell could have signed. Skolem then sketches three ways of proceeding. One can develop mathematics in first-order logic, 'conceived in the syntactic manner' (hence without a completeness proof), but nonstandard models are then unavoidable. One can develop primitive recursive arithmetic, with free variables exclusively. Finally, one can try to extract from each mathematical statement its constructive content. These solutions are far removed from those of Frege or Russell. The fact remains that the starting point is the same: the clean slate, the fear of circularity, the prohibition against invoking set-theoretic considerations.

This same prohibition is found in Herbrand.
Like Skolem, he does not enter into philosophical considerations comparable in extent to those of Frege or Russell, but he has a firm view of the means to be used in logical investigations. He was no doubt partly inspired by Hilbert, who, in connection with consistency problems, had indicated limits not to be overstepped in the means of proof. But Herbrand went well beyond these restrictions. He extended Hilbert's finitism to all logical investigations, refusing, for example, he too, to consider a set-theoretic completeness proof. Since he accepted classical set theory in mathematics, one must, here again, see in his attitude in metamathematics a fear of circularity.

In 1929 Gödel gave a proof of the semantic completeness of first-order logic.
On that occasion, in the original text of his dissertation, he made some comments on the means employed in the proof, and we see him struggling against the old reproaches of circularity. Concerning these means, he declared, 'no restriction of any kind whatsoever has been made'. And he noted that 'the principle of the excluded middle for infinite collections was used in an essential way' (given the undecidability of first-order logic). He rejected the objection that 'this would invalidate the whole completeness proof'. In conclusion he wrote: 'Finally, we must also not forget that the problem treated here did not at all arise out of the controversy over foundations (whereas that was no doubt the case for the problem of the consistency of mathematics), but that, even if it had never been doubted that "naive" mathematics is valid as to its content [inhaltlich], this problem could be posed in a perfectly meaningful way within that mathematics (unlike, for example, the problem of consistency), for a restriction on the means of proof does not seem any more called for here than for any other mathematical problem' (1986, page 64).
That puts an end to the fears of circularity. The quotation from Gödel describes rather well the situation that took shape after the ruin of Frege's and Russell's enterprises. One makes do with the means at hand, taking care to make them explicit. The reef on which the universalist conception of logic was wrecked was, in the end, its sterility. For what could one do? Deduce theorems one after another.

For a long time there were in logic two currents that advanced without mingling their waters: that of Frege-Russell (or logicism), whose fundamental features I have tried to sketch, and that of Peirce-Schröder-Löwenheim (or algebra of logic), whose proponents, unencumbered by grandiose ontological preoccupations, developed a more technical conception of logic. Their practice was close to that of mathematicians. They freely considered, one after another, different universes and soon accumulated important and sometimes unexpected results, such as Löwenheim's theorem and solutions to various decision or reduction problems (problems ignored by the logicists). In the 1920s the two currents merged. Frege and Russell had contributed the notion of formal system, which the algebraists of logic had ignored. But many of the problems examined were those that the latter had pursued. Decision problems attracted attention (Schönfinkel, Bernays, Ackermann). The notion of interpretation (hence of validity) was made precise, and the outcome was the completeness theorem for first-order logic (Gödel). One could debate the true philosophical significance of this theorem, but what cannot be denied is that it had important applications that opened new paths. Thus to the sterility of the universalist conception of logic one can oppose the fruitfulness of a current foreign to that conception.

It may seem ridiculous to speak of sterility in connection with Frege, who gave us the logical rules we still use today, and with Russell, whose Principia influenced a whole generation. But in the absence of metasystematic investigations the force of these discoveries was soon exhausted. One found oneself at an impasse, and to get out of it one had to reconnect with another tradition, that of the algebra of logic.
References
Carroll, Lewis
1895  What the tortoise said to Achilles, Mind, new series, 4, 278-280.

Frege, Gottlob
1879  Begriffsschrift, eine der arithmetischen nachgebildete Formelsprache des reinen Denkens; Halle; English translation by Stefan Bauer-Mengelberg in van Heijenoort 1967, 1-82.
1969  Nachgelassene Schriften; Hamburg.

Gödel, Kurt
1944  Russell's mathematical logic, in Schilpp 1944, 123-153.
1986  Collected works, vol. 1; Oxford and New York.

Russell, Bertrand
1903  The principles of mathematics; London.
1919  Introduction to mathematical philosophy; New York.
1951  Mysticism and logic, and other essays; 10th printing; London.
1959  My philosophical development; London.
See Whitehead, Alfred North, and Bertrand Russell.

Schilpp, Paul Arthur (ed.)
1944  The philosophy of Bertrand Russell; New York.

Skolem, Thoralf
1955  A critical remark on foundational research, Det Kongelige Norske Videnskabers Selskabs Forhandlinger 18, no. 20, 100-105; reprinted in Skolem 1970, 581-586.
1970  Selected works in logic; Oslo, Bergen, Tromsø.

van Heijenoort, Jean
1967  (ed.) From Frege to Gödel, a source book in mathematical logic, 1879-1931; Cambridge, Massachusetts, and London.
1985  Selected essays; Naples.

Whitehead, Alfred North, and Bertrand Russell
1910  Principia mathematica, vol. 1; Cambridge, England (also cited as PM1).
1925  2nd edition of 1910 (also cited as PM2).
Logic Colloquium '85 Edited by The Paris Logic Group © Elsevier Science Publishers B.V. (North-Holland), 1987
Mathematical and Computational Concepts Formalized in the Calculus of Constructions
Thierry Coquand
Gérard Huet
INRIA Rocquencourt, France

We present an experiment in mechanizing mathematical and computer-science concepts in the Calculus of Constructions. All the examples presented have been machine-checked.
1 Introduction

The Calculus of Constructions is a logical language whose type calculus implements natural deduction in higher-order predicate calculus. This theory builds on the work of De Bruijn [8,10,11], Girard [25,26] and Martin-Löf [45,49]. The language was presented and motivated in Coquand-Huet [17], and its consistency was proved in Coquand's thesis [16]. A simplified version, equipped with a detailed semantics, is presented in Coquand-Huet [19]. A prototype implementation has been developed at INRIA in order to experiment with the expressive power of the system. Many examples have been mechanically verified on this implementation [18,52].

We recall the rules of the calculus in a first section. We explain in detail the notational conventions allowed by the system. The rest of the paper is an annotated session of a number of characteristic examples. This session presents all the axioms, definitions and theorems needed to understand the notions introduced, in the tradition of the Principia [76].
2 The Calculus of Constructions

2.1 Constructions: contexts, propositions and proofs

Constructions are well-typed expressions of a typed lambda-calculus whose types are expressions of the same nature. The base language relies on Nederpelt's formalism Λ [53,54,21]. We have four formation rules:

    *           Universe
    [x : M]N    Abstraction
    (M N)       Application
    x           Variable
In the formation rule for abstraction, we prefer the Automath notation [x : M]N to the more traditional notation λx_M · N, for two reasons. First, the type M associated with the bound variable x may be very complicated, and the subscripted notation would become too cluttered, with subscripts of arbitrary depth. Second, this abstraction operation serves to represent products ∀x ∈ M · N as well as functions λx ∈ M · N.

The name x is of course completely arbitrary, and is used only at the level of the input/output interface. In the abstract syntax, the abstraction operator is binary, with two components M and N. The occurrences of the variable x appearing in the concrete syntax of the term N are replaced by de Bruijn indices, which reflect the depth of the variable in the nesting of abstractions [9]. Thus the formula

    [x : A]([y : B](x y) x)

is a concrete representation of the abstract structure

    [A]([B](2 1) 1).

As is customary in combinatory logic, we write (M N) for the term M applied to the term N. We also occasionally use the representation (N)M, in the Automath style.

Our term algebra is completed by a constant *, which plays the role of the sort of all types (without itself being a type). One can also see * as the sort of all propositions, following the Curry-Howard correspondence between propositions and types. In the Automath languages, * is written τ.

We shall call a construction any well-formed term, relative to a type-checking algorithm that we are going to present. This checking restricts the legal terms according to three criteria. First, terms must be legal with respect to the scope of variables. Second, an application (M N) is legal only if the term M has a functional type consistent with the type of the term N. Third, we restrict our calculus to three layers of terms: contexts, propositions and proofs. Contexts are simply the terms built by a sequence of abstractions starting from *:

    [x1 : M1][x2 : M2] ... [xn : Mn]*
Intuitively, contexts are declarations: one introduces the variables x1, ..., xn with their types. Propositions are the constructions whose type is a context. Intuitively, propositions are logical formulas possibly containing free variables. For example, a proposition whose type is the context above may contain free occurrences of the variables x1, ..., xn. A proposition variable of this type can therefore be regarded as denoting an n-ary proposition. A closed proposition, that is, one of type *, is called a statement. Finally, proofs are the constructions whose type is a proposition. Intuitively, such a construction is a proof of the proposition that is its type. A functional proof is a proof depending on hypotheses, viewed as its parameters. The corresponding proposition will have this context of hypotheses as its type. This functional view of proofs is in keeping with natural-deduction inference systems [61].

The computational interpretation of constructions according to their three levels is to regard contexts as declarations, propositions as specifications, and proofs as algorithms realizing the specification that is their type. One can moreover see this interpretation as defining a semantics for the logical part of our formalism, a proposition being interpreted by the set of its justifications, that is, the algorithms of that type. This constructive view of semantics is in keeping with intuitionistic logic [39]. However, identifying a proof with a λ-expression makes it possible to avoid codings by arithmetization.
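
The passage from concrete syntax to de Bruijn indices described in this section can be illustrated by a minimal Python sketch. The tuple encoding of terms and the function names `to_de_bruijn` and `show` are assumptions made for this illustration only; they are not taken from the prototype, which the paper says is written in ML.

```python
# Named (concrete) terms, encoded as nested tuples (an assumed encoding):
#   ("star",)                  the universe *
#   ("var", name)              a variable
#   ("abs", name, typ, body)   the abstraction [name : typ]body
#   ("app", fun, arg)          the application (fun arg)
# Nameless (abstract) terms replace bound names by 1-based indices:
#   ("idx", n) for a bound variable, ("free", name) for an undeclared one.

def to_de_bruijn(term, bound=()):
    """Translate a named term; `bound` lists enclosing binders, innermost first."""
    tag = term[0]
    if tag == "star":
        return ("star",)
    if tag == "var":
        name = term[1]
        if name in bound:
            # index() finds the innermost binder of that name, which gets index 1
            return ("idx", bound.index(name) + 1)
        return ("free", name)
    if tag == "abs":
        _, name, typ, body = term
        # x is in scope in the body N of [x : M]N, but not in its type M
        return ("abs", to_de_bruijn(typ, bound),
                to_de_bruijn(body, (name,) + bound))
    if tag == "app":
        _, fun, arg = term
        return ("app", to_de_bruijn(fun, bound), to_de_bruijn(arg, bound))
    raise ValueError(f"unknown term: {term!r}")

def show(term):
    """Print a nameless term in the bracket notation of the text."""
    tag = term[0]
    if tag == "star":
        return "*"
    if tag == "idx":
        return str(term[1])
    if tag == "free":
        return term[1]
    if tag == "abs":
        return f"[{show(term[1])}]{show(term[2])}"
    if tag == "app":
        return f"({show(term[1])} {show(term[2])})"
    raise ValueError(f"unknown term: {term!r}")

x, y, A, B = (("var", n) for n in "xyAB")
# The example of the text: [x : A]([y : B](x y) x)
example = ("abs", "x", A, ("app", ("abs", "y", B, ("app", x, y)), x))
print(show(to_de_bruijn(example)))  # [A]([B](2 1) 1)
```

Here A and B are left as free names, as in the text's example, where they are not bound by any enclosing abstraction.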
2.2 Typing
We do not have room here to give the complete syntactic theory of our calculus; we refer the interested reader to Coquand's thesis [16]. We assume known the substitution operation M[x/N], which replaces the free occurrences of the variable x in the term M by the term N. We have a single computation rule, corresponding to the notion of β-reduction in the λ-calculus. This rule replaces a subterm of the form ([x : A]M N) by the term M[x/N]. One can show that this rule defines a confluent and noetherian computation relation on typed terms. Every well-formed term M therefore has a unique irreducible form, reachable by an arbitrary sequence of computations, which is called the canonical form of M and is denoted 𝒩(M).

We shall describe the typing algorithm for constructions by an inference system whose rules manipulate sequents Γ ⊢ E, where Γ is a context declaring the free variables occurring in the expression E. It is convenient to define a few notations for manipulating contexts as sequences. If Γ is a context and M an arbitrary term, the concatenation Γ; M of Γ and M is defined recursively: if Γ = *, then Γ; M = M; if Γ = [x : A]Δ, then Γ; M = [x : A](Δ; M). When Γ and Δ are contexts, we write Γ ≤ Δ if and only if there exists a context Θ such that Δ = Γ; Θ. We then say that Γ is a prefix of Δ. Finally, if Γ is a context and M an arbitrary term, we write Γ[x : M] for Γ; [x : M]*. This notation lets us write every nonempty context as a sequence [x1 : M1] ... [xn : Mn] and reserve * for the empty context.

We now describe the typing algorithm precisely. The sequents are of three forms:

    Γ ⊢ *          Context
indicates that Γ is a well-formed context.

    Γ ⊢ P : Δ      Proposition
indicates that P is a well-formed proposition in the context Γ, of type Δ. Such a construction entails that Γ; Δ is a well-formed context.

    Γ ⊢ M : P      Proof
indicates that the term M is a proof of the proposition P in the context Γ. Such a construction entails that P is well formed in the well-formed context Γ.

In what follows, the metavariables Γ, Δ and Θ denote contexts, P and Q propositions, M and N arbitrary non-context terms, T and U arbitrary contexts or propositions, X and Y arbitrary terms. Let us first give the context formation rules:

Context1
    ⊢ *

Context2
    Γ; Δ ⊢ *
    ----------------
    Γ[x : Δ] ⊢ *

Context3
    Γ; Δ ⊢ P : *
    ----------------------
    Γ[x : Δ; 𝒩(P)] ⊢ *

A context can therefore declare only predicates and hypotheses in canonical form. For non-context terms, we give typing rules corresponding to the term formation rules:

Variable
    Γ ⊢ x : T_x

Abstraction
    Γ[x : X] ⊢ M : Y
    ------------------------------
    Γ ⊢ [x : X]M : [x : X]Y

Application1
    Γ ⊢ M : [x : P]X      Γ ⊢ N : P
    ------------------------------------
    Γ ⊢ (M N) : 𝒩(X[x/N])

Application2
    Γ ⊢ M : [x : Δ]X      Γ ⊢ P : Θ      (Δ ≤ Θ)
    --------------------------------------------------
    Γ ⊢ (M P) : 𝒩(X[x/P])

The system presented above systematically puts type expressions into canonical form. This is not strictly necessary, but it simplifies the presentation. A few explanations are needed. First, the rule Application2 allows a coercion between the context Θ and its prefix Δ. This type-inclusion rule makes it possible to decrease at will the functionality of a proposition by universally quantifying over the superfluous hypotheses. Let us give an example right away. We work in the empty context. The polymorphic identity algorithm is built by the proof

    Id = [A : *][x : A]x,

whose type is the proposition

    Un = [A : *][x : A]A.

The proposition Un can be read as the implication schema A → A, where A is an arbitrary proposition. But it can also be read as the statement ∀A · A → A describing the cardinal 1. It is therefore legal to build the application (Id Un), which is the identity algorithm specialized to the structure 1.

One may also note that types are always in a form in which their functionality is explicit. In the jargon of the λ-calculus, one speaks of η-saturated form. For example, the context [x : [A : *]][y : x] is not well formed, because the functionality of y is not apparent. By contrast, [x : [A : *]][y : [A : *](x A)] is well formed.

Note that the computation rule preserves the type of constructions. More precisely, if Γ ⊢ M : T and M reduces to N, then Γ ⊢ N : T', with T = T' when M and N are proofs, and T ≤ T' when M and N are propositions. If M is a well-formed term in the context Γ, we have Γ ⊢ M : T with T unique and in canonical form. We denote T by τ_Γ(M), or τ(M) when the context Γ is clear.

The Calculus of Constructions presented here is more restricted than the one originally proposed in Coquand-Huet [17]. The restriction to three levels agrees with the theory presented in Coquand's thesis [16] and in Coquand-Huet [18]. In particular, every well-formed term is strongly normalizable (all computation sequences starting from the term terminate), which justifies the use of canonical forms for type expressions. As a logical system the calculus is consistent, in the sense that the absurd proposition ∇ = [A : *]A has no proof.
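
The computation rule and the context notations of this section can be made concrete with a small Python sketch over de Bruijn terms. It is only an illustration under stated assumptions: the tuple encoding and the names `shift`, `subst`, `normalize`, `concat` and `is_prefix` are hypothetical, no type checking is performed, and `normalize` is only guaranteed to terminate on well-formed terms, as the strong normalization result quoted above suggests.

```python
# Nameless terms (assumed encoding, 1-based de Bruijn indices):
#   ("star",)  |  ("idx", n)  |  ("abs", typ, body)  |  ("app", fun, arg)
STAR = ("star",)

def shift(t, d, cutoff=0):
    """Add d to every index of t that points above `cutoff` binders."""
    tag = t[0]
    if tag == "star":
        return t
    if tag == "idx":
        return ("idx", t[1] + d) if t[1] > cutoff else t
    if tag == "abs":
        return ("abs", shift(t[1], d, cutoff), shift(t[2], d, cutoff + 1))
    return ("app", shift(t[1], d, cutoff), shift(t[2], d, cutoff))

def subst(t, s, k=1):
    """M[x/N] on nameless terms: replace index k of t by s, lower the indices above k."""
    tag = t[0]
    if tag == "star":
        return t
    if tag == "idx":
        n = t[1]
        if n == k:
            return shift(s, k - 1)   # s has crossed k - 1 binders
        return ("idx", n - 1) if n > k else t
    if tag == "abs":
        return ("abs", subst(t[1], s, k), subst(t[2], s, k + 1))
    return ("app", subst(t[1], s, k), subst(t[2], s, k))

def normalize(t):
    """Canonical form: contract every redex ([x : A]M N) into M[x/N]."""
    tag = t[0]
    if tag in ("star", "idx"):
        return t
    if tag == "abs":
        return ("abs", normalize(t[1]), normalize(t[2]))
    fun, arg = normalize(t[1]), normalize(t[2])
    if fun[0] == "abs":
        return normalize(subst(fun[2], arg))
    return ("app", fun, arg)

def concat(gamma, m):
    """Concatenation Γ; M, assuming gamma is a context (abstractions ending in *)."""
    return m if gamma == STAR else ("abs", gamma[1], concat(gamma[2], m))

def is_prefix(gamma, delta):
    """Γ ≤ Δ for two contexts: Δ = Γ; Θ for some context Θ."""
    if gamma == STAR:
        return True
    return (delta[0] == "abs" and gamma[1] == delta[1]
            and is_prefix(gamma[2], delta[2]))

ID = ("abs", STAR, ("abs", ("idx", 1), ("idx", 1)))   # Id = [A : *][x : A]x
UN = ("abs", STAR, ("abs", ("idx", 1), ("idx", 2)))   # Un = [A : *][x : A]A
id_at_un = normalize(("app", ID, UN))                  # (Id Un) reduces to [x : Un]x
```

With this encoding, `id_at_un` is the abstraction whose type component is `UN` and whose body is the index 1, that is, the identity specialized to Un.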
2.3 Abbreviations

We use several abbreviations in our concrete syntax. First, [A : *]X may be abbreviated as ∀A·X. For example, ∇ is written ∀A·A. This abbreviation iterates, as in ∀A,B,C·X. The symbol "∀" can be read as quantifying over all propositions (and not only over all statements, because of the type-inclusion rule explained above: a variable of type * may be bound to a proposition whose type is an arbitrary context, regarded as a quantification prefix).

The second abbreviation allows the expression A → B in place of [x : A]B when x does not occur free in B. The term A → B, seen as a type, is the type of functions with domain A and codomain B. A dependent type [x : A](P x) is the type of generalized functional objects associating with each value X of their domain A a value of type (P X). Such type constructors exist, for example, in Martin-Löf's intuitionistic theory of types [49]. Analogous type constructors already exist in ordinary programming languages: think of an Algol procedure taking an integer parameter n and an array parameter of dimension n. If one thinks of A → B as a proposition rather than as a type, the arrow → can be interpreted as intuitionistic implication. Our two abbreviations can be seen as specializing, at the level of propositions, the two type constructions of Girard's second-order calculus [25,26].

A few other abbreviations are allowed. For example, the let syntax of ISWIM [42] and ML [28] is allowed, in the form [x = X]Y. This makes it possible to simplify complex constructions with multiple occurrences of a subexpression X by writing [x = X]Y_x instead of the expanded form Y_X. This has two advantages over writing a redex ([x : A]Y_x X): first, the expressions are more readable; second, there is no need to specify the type A, which is implicitly replaced by the type of X. The usual conventions of combinatory logic are accepted, and one may write (A B C) instead of ((A B) C). Likewise, → associates to the right. Thus A → B → C is an abbreviation for [u : A][v : B]C.

We decided to implement the prototype version of the calculus of constructions in the language ML [28]. ML is also used as the metalanguage of the system, which allows the user to macro-generate complicated parametric constructions.

In the implementation, the concrete syntax of constructions is defined by a Yacc grammar, whose semantic actions produce ML values in the form of abstract syntax trees. Expressions between quotation marks "..." are thus parsed by Yacc. A printing program gives back to the user a concrete form of the constructions that have been built. The user can manipulate a construction under development by navigating the ML interpreter inside a context of constructions. The next subsection presents the commands available.
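
As an illustration of the first two abbreviations, here is a minimal Python sketch that expands ∀ and → into plain abstractions. The helper names `forall`, `arrow` and `show`, the tuple encoding, and the hidden variable names `_h1`, `_h2`, ... are all assumptions of this sketch; the actual system expands abbreviations in its Yacc grammar, whose details are not given here.

```python
# Named terms (assumed encoding):
#   ("star",) | ("var", name) | ("abs", name, typ, body) | ("app", fun, arg)
STAR = ("star",)

def var(name):
    return ("var", name)

def forall(names, body):
    """Expand the iterated abbreviation  ∀A,B,... · body  into  [A : *][B : *]...body."""
    for name in reversed(names):
        body = ("abs", name, STAR, body)
    return body

def arrow(*types):
    """Expand  A → B → ... → C  (right associative) into  [_h1 : A][_h2 : B]...C.

    The bound names _h1, _h2, ... are hypothetical: they are assumed never to
    be used by the user, so they cannot occur free in the codomain.
    """
    *domains, result = types
    for i in range(len(domains), 0, -1):
        result = ("abs", f"_h{i}", domains[i - 1], result)
    return result

def show(t):
    """Print a named term in the bracket notation of the text."""
    tag = t[0]
    if tag == "star":
        return "*"
    if tag == "var":
        return t[1]
    if tag == "abs":
        return f"[{t[1]} : {show(t[2])}]{show(t[3])}"
    return f"({show(t[1])} {show(t[2])})"

A, B, C = var("A"), var("B"), var("C")
un = forall(["A"], arrow(A, A))      # ∀A · A → A
chain = arrow(A, B, C)               # A → B → C
print(show(un))                      # [A : *][_h1 : A]A
print(show(chain))                   # [_h1 : A][_h2 : B]C
```

Up to the name of the hidden variable, the expansion of ∀A · A → A is the proposition Un = [A : *][x : A]A of the previous subsection.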
2.4 The theory system

We have enriched the base language of constructions by allowing constants, designated by identifiers. Any well-formed term can be named, and later invoked by that name. This is consistent with the notation [x = X], which can now be seen as the internal form of a constant declaration. The current context can therefore be viewed as a sequence of hypothesis declarations together with a sequence of declarations of constants corresponding to constructions already checked. Hypotheses and constants are added by navigating toward the inside of the construction under development, and discharged by navigating toward the outside. The commands of this rudimentary theory system fall into two categories: elementary commands build a construction step by step, and high-level commands use the parser to compile sequences of elementary commands. Elementary commands progressively build a current term. They are understood through their effect on a small machine with a register C (the current construction) and a stack holding the environment E, which consists of the current context of hypotheses and constants. Arguments already built are also pushed onto E while they wait to be applied. The generalized environment E is therefore made of hypothesis declarations [x : M], constant declarations [x = X], and pending arguments (M).

• Elementary commands

Raz:          C ← *
Ref name:     C ← name
Var name:     E ← E [name : C];  C ← *
Nomme name:   E ← E [name = C];  C ← *
Empile:       E ← E (C);  C ← *
Decharge:     if E = E' [x : M] then E ← E';  C ← [x : M] C
Oublie:       if E = E' [x = M] then E ← E';  C ← [x = M] C
Appl:         if E = E' (M) then E ← E';  C ← (C M)   [if (C M) is well typed]
Efface:       if E = E' E1 then E ← E'

• High-level commands

Abs name:                   sequence of Decharge and Oublie up to and including name
Soit formula:               C ← formula   [if well formed]
Decl name formula:          Soit formula; Var name
Const name formula:         Soit formula; Nomme name
Prouve formula:             checks that τ(C) = 𝒩(formula)
Lemme name formula:         Prouve formula; Nomme name
Preuve formula1 formula2:   Soit formula2; Prouve formula1.
The description of the commands above is deliberately simplified. For example, we do not detail the type checking performed when the command Appl is executed. Indeed, explaining typing through the inference rules is no longer sufficient in the presence of constants, since one does not want to keep all types in canonical form, which would be very costly. Nor do we discuss here the memory allocation problems for the zones E and C. A separate paper will present in detail the implementation of the virtual machine for constructions.
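The effect of the elementary commands on the register C and the stack E can be mimicked by a small program. What follows is only a minimal sketch in Python, not the ML implementation described in the text: terms are plain strings, no type checking is performed (so Appl never rejects anything), and the names `Machine`, `abs_upto` and `define_and` are hypothetical, chosen for this illustration.

```python
STAR = "*"


class Machine:
    """Toy model of the construction machine: register C plus stack E."""

    def __init__(self):
        self.C = STAR  # current construction
        self.E = []    # entries ("hyp", x, M), ("const", x, M) or ("arg", None, M)

    def raz(self):
        self.C = STAR

    def ref(self, name):
        self.C = name

    def var(self, name):
        self.E.append(("hyp", name, self.C))
        self.C = STAR

    def nomme(self, name):
        self.E.append(("const", name, self.C))
        self.C = STAR

    def empile(self):
        self.E.append(("arg", None, self.C))
        self.C = STAR

    def _pop(self, kind):
        # Each discharging command requires a specific kind of entry on top of E.
        if not self.E or self.E[-1][0] != kind:
            raise ValueError("top of E is not of kind " + kind)
        return self.E.pop()

    def decharge(self):
        _, x, m = self._pop("hyp")
        self.C = "[%s:%s]%s" % (x, m, self.C)

    def oublie(self):
        _, x, m = self._pop("const")
        self.C = "[%s=%s]%s" % (x, m, self.C)

    def appl(self):
        # Assumption: the well-typedness test on (C M) is skipped in this sketch.
        _, _, m = self._pop("arg")
        self.C = "(%s %s)" % (self.C, m)

    def abs_upto(self, name):
        # Abs: Decharge / Oublie repeatedly, up to and including `name`.
        while True:
            if not self.E:
                raise ValueError("name not found in E")
            kind, x, _ = self.E[-1]
            if kind == "hyp":
                self.decharge()
            elif kind == "const":
                self.oublie()
            else:
                raise ValueError("a pending argument is in the way")
            if x == name:
                return


def define_and():
    """Replay, command by command, a definition of conjunction (named 'and' here)."""
    m = Machine()
    m.raz()
    for v in "ABC":
        m.var(v)
    m.ref("A"); m.var("x")
    m.ref("B"); m.var("y")
    m.ref("C"); m.decharge(); m.decharge()
    m.var("h")
    m.ref("C"); m.abs_upto("A")
    m.nomme("and")
    return m
```

After `define_and()`, the environment holds a single constant whose body is the string form of ∀A,B,C [h : [x : A][y : B] C] C, and the register is back to `*`.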
2.5 A small example

We give here the fully detailed example of a proof carried out step by step with the elementary commands. It consists of the definition of the logical connective ∧, and of the proof (in the form of the first-projection algorithm) of the proposition corresponding to the usual inference rule "∧-elim-left", that is:

∀A, B · A ∧ B → A.

We give this example in the following form. The first column is the instruction given by the user. The second column shows the state of the machine after the command has been executed, in the form E ⊢ C.
Raz;          ⊢ *
Var A;        ∀A ⊢ *
Var B;        ∀A,B ⊢ *
Var C;        ∀A,B,C ⊢ *
Ref A;        ∀A,B,C ⊢ A
Var x;        ∀A,B,C [x : A] ⊢ *
Ref B;        ∀A,B,C [x : A] ⊢ B
Var y;        ∀A,B,C [x : A][y : B] ⊢ *
Ref C;        ∀A,B,C [x : A][y : B] ⊢ C
Decharge;     ∀A,B,C [x : A] ⊢ [y : B] C
Decharge;     ∀A,B,C ⊢ [x : A][y : B] C
Var h;        ∀A,B,C [h : [x : A][y : B] C] ⊢ *
Ref C;        ∀A,B,C [h : [x : A][y : B] C] ⊢ C
Abs A;        ⊢ ∀A,B,C [h : [x : A][y : B] C] C
Nomme ∧;      [∧ = ∀A,B,C [h : [x : A][y : B] C] C] ⊢ *
The definition is to be understood by giving the connective ∧ an operational meaning. The idea is the following. For all propositions A and B, A ∧ B is a method for proving any proposition C, provided one has a method h which, from a proof x of A and a proof y of B, yields a proof of C. In other words, we have defined the connective ∧ by its semantics (in the intuitionistic sense). Let us now give the proof of ∀A, B · A ∧ B → A, in an environment containing the definition of ∧.
Var A;        … [∧ = …] … ∀A ⊢ *
Var B;        ∀A,B ⊢ *
Ref B;        ∀A,B ⊢ B
Empile;       ∀A,B (B) ⊢ *
Ref A;        ∀A,B (B) ⊢ A
Empile;       ∀A,B (B, A) ⊢ *
Ref ∧;        ∀A,B (B, A) ⊢ ∧
Appl;         ∀A,B (B) ⊢ (A)∧
Appl;         ∀A,B ⊢ (B, A)∧
Var h;        ∀A,B [h : A ∧ B] ⊢ *
Ref A;        ∀A,B [h : A ∧ B] ⊢ A
Var x;        ∀A,B [h : A ∧ B][x : A] ⊢ *
Ref B;        ∀A,B [h : A ∧ B][x : A] ⊢ B
Var y;        ∀A,B [h : A ∧ B][x : A][y : B] ⊢ *
Ref x;        ∀A,B [h : A ∧ B][x : A][y : B] ⊢ x
Decharge;     ∀A,B [h : A ∧ B][x : A] ⊢ [y : B] x
Decharge;     ∀A,B [h : A ∧ B] ⊢ [x : A][y : B] x
Empile;       ∀A,B [h : A ∧ B] ([x : A][y : B] x) ⊢ *
Ref A;        ∀A,B [h : A ∧ B] ([x : A][y : B] x) ⊢ A
Empile;       ∀A,B [h : A ∧ B] ([x : A][y : B] x, A) ⊢ *
Ref h;        ∀A,B [h : A ∧ B] ([x : A][y : B] x, A) ⊢ h
Appl;         ∀A,B [h : A ∧ B] ([x : A][y : B] x) ⊢ (A)h
Appl;         ∀A,B [h : A ∧ B] ⊢ ([x : A][y : B] x, A)h
Prouve "A";   … ⊢
Abs A;        … [∧ = …] … ⊢ ∀A,B [h : A ∧ B] (h A [x : A][y : B] x)
The session above can now be concluded by recording the proved lemma, by means of the command:

Lemme π1 "∀A, B · A ∧ B → A".

The corresponding high-level proof can be summarized as the construction of the first-projection algorithm:

Const π1 "[A : *][B : *][h : A ∧ B] (h A [x : A][y : B] x)";

and the verification:

Prouve "∀A, B · A ∧ B → A"
Remarks. The command Prouve effectively checks the validity of a proposition, in the intuitionistic sense of possessing a proof method, as an algorithm computing on justifications. We have reduced the verification mechanism to a simple computation of types in the language Λ, that is, to plain natural deduction. We do not need to postulate logical connectives, nor axioms or inference rules governing their use. Our system is therefore fundamentally different from those that postulate particular rules at the outset, such as the PPλ of LCF [29] or Martin-Löf's intuitionistic type theory [49]. The Calculus of Constructions is more directly related to the language Automath [8] and to Martin-Löf's 1971 type theory [45]. The price to pay for this simplicity is that a substantial body of preliminary material must be developed before enough logic has been built to carry out high-level proofs. The first examples we present concern these logical preliminaries. They have an intrinsic complexity, due to their fundamental nature. The situation is analogous to understanding a programming language from its implementation in machine language. It is preferable first to admit the existence of a microcode and to understand the higher-level concepts in terms of this microcode, leaving the implementation of the microcode in machine language to the specialist. The examples below are presented in a symbolism that is a compromise between high-level and low-level commands: it allows complex formulas, but accounts explicitly for the structure of the environment through indentation. The command Decl name formula is written name : formula, with a new indentation level. Similarly, Const name formula is written name := formula.
The abstraction commands Decharge and Abs are written as a return to the previous indentation level. The command Prouve formula is written ⊢ formula, and the command Nomme name is written =: name. The command Soit formula is written simply as formula.
2.6 Implicit polymorphism and type synthesis

Consider the constant π1 above. A typical construction using π1, in the context of a pair [p : nat ∧ nat], is (π1 nat nat p). This notation is clumsy because it is redundant: binding p to h should automatically bind the polymorphism argument A to nat (and likewise B to nat). One would therefore like to write simply π1(p), as in ordinary notation. Clearly the arguments A and B can easily be synthesized as components of the type of p, which the system must compute anyway during type checking. This notion of implicit polymorphism already exists in the language ML [28]. The general setting in which this synthesis is possible is the following. Let X be an arbitrary term in canonical form. It can be written in the form

[u1 : T1] ... [un : Tn] (x X1 ... Xp)

where x is a variable, and the Ti and Xj are of the same form. Let V be a set of variables. The rigid occurrences of X relative to V are the following positions in X. First, the rigid occurrences in Ti relative to V ∪ {u1, ..., u(i-1)}, for i = 1, ..., n. Next, if p = 0, the occurrence of x; otherwise, when x belongs to W = V ∪ {u1, ..., un}, the rigid occurrences in Xj relative to W, for j = 1, ..., p. Let z be any variable. We say that X determines z iff z appears in X at a rigid occurrence relative to ∅. The notion of rigid occurrence comes from algorithms for solving equations in λ-calculus [32,35]. Intuitively, rigid occurrences relative to V are invariant under substitution for variables not in V. We now describe a generalization of the command Const that allows writing schema := formula, where schema is a sequence of character strings and declarations [x : X]. This makes it possible to introduce a constant together with its concrete representation in "mixfix" form. The variables of the schema, which must all be distinct, are automatically declared before formula is built (from left to right), then discharged. Thus conjunction can be declared as an infix operator by:

[A : *] ∧ [B : *] := ∀C · (A → B → C) → C.
The generalized Const command makes it possible, in a sense, to declare first-order operators, equipped with an arity, as constructions. We take the opportunity to give the operator a concrete syntax, but this notational convenience is only a superficial aspect of the concept. What matters more is that the command specifies the explicit arguments associated with the constant, which correspond to the variables of the schema. For example, in the definition of ∧ above, the variables A and B are explicit, and conjunction is therefore a binary operator with infix syntax. The additional arguments are optional. Similarly, with the construction:

A : *
    B : *
        π1([h : A ∧ B]) := (h A [x : A][y : B] x)

Here the constant π1 has one explicit argument h. We now want to abstract π1 over A and B, so that π1 can be used as a polymorphic constant in the empty context. In this case it can be done without explicitly adding polymorphism arguments to π1, because π1 determines A and B through its explicit argument h. More generally, a variable u can be abstracted as an implicit argument of a constant C([x1 : T1], ..., [xn : Tn]) := X iff u is determined by some Ti. Let us now explain the general situation. Suppose that the constant C above has just been declared, in a context ending with the declaration of the variable u. The current environment is therefore of the form

... [u : U] [C = [x1 : T1] ... [xn : Tn] X] ⊢ *

and we now want to discharge u without forgetting C, using the command Gen, which we now describe. First, the environment becomes

... [C = [u : U][x1 : T1] ... [xn : Tn] X] ⊢ *

and there are now two cases for invoking C. If u is determined by one of the Ti, then C is invoked without modification: the instantiation of u is supplied automatically by the type checker, which checks that U agrees with the type found at the corresponding rigid occurrence in the type of argument i. Otherwise, u becomes a new argument that must be applied explicitly to C.
In the examples that follow, variables are always discharged by the command Gen, which is simply written as a return to the previous indentation level, with an indication in brackets (u) of the additional variable introduced in the second case above. Note that if C is declared with two explicit variables x and y, for example C([x : T], [y : U]) := ..., the construction of C applied to a single argument X can be obtained as [y : U] C(X, y), which shows that our notational convenience does not restrict the generality of the formalism.

Remark. More generally, one could allow mixed schemas combining keywords, declarations [ui : Ti] of explicit arguments, and declarations {ui : Ti} of implicit arguments. Such schemas are legal iff for every implicit i there exists an argument j > i such that Tj determines ui. It is not necessary for j to be an explicit argument, since the synthesis of implicit arguments can be iterated (from right to left). It is of course essential for the implementation of the synthesis algorithm that all arguments declared explicit in the schema be present at every call of the combinator so defined. One can of course also imagine synthesizing proofs of constructions by systematic search over the possible combinations of given combinators. Such tactics are easily programmed in the meta-language of constructions, and will be a powerful aid to the mathematician, who will only have to care about the general strategy of the proof, that is, the ordering of the lemmas, without worrying about the combinatorics of the proof of each lemma.
3 Logical constructions

We now build up the main logical concepts step by step, first intuitionistic, then classical. Our presentation follows [61,16].

3.1 Positive logic
Recall first that the language implicitly possesses the concepts of (intuitionistic) implication and universal quantification. In particular, A → B is merely an abbreviation for [h : A]B. The proposition ∀A · A → A, that is, the reflexivity of →, is proved simply by the polymorphic identity algorithm:

A : *
    Id := [x : A] x
    ⊢ A → A
(A).

Similarly, the transitivity of → is proved by the composition algorithm:

A : *
    B : *
        C : *
            [f : A → B] ; [g : B → C] := [x : A] (g (f x))
            ⊢ (A → B) → (B → C) → (A → C).
Our implicit polymorphism facility permits exactly the notations of category theory: the identity Id_A for the object A is obtained as (Id A) (which can also be written (A)Id), while the composition of the arrows f : A → B and g : B → C is written simply f;g, its domain A and codomain C being implicit from those of f and g. Note that propositions can be interpreted as inference rules, the deduction relation ⊢ having the properties of intuitionistic implication. For example, the definition of composition can be read as a cut rule:

A → B    B → C
--------------
    A → C

As an example, here are the usual combinators [20]:

A : *
    B : *
        K := [x : A][y : B] x
        ⊢ A → B → A
(A, B)

A : *
    B : *
        C : *
            S := [f : A → B → C][g : A → B][x : A] (f x (g x))
            ⊢ (A → B → C) → (A → B) → (A → C)
(A, B, C).
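The identity, composition, K and S algorithms above can be tried out as ordinary functions. The following is a small untyped sketch in Python, so the polymorphism arguments A, B, C simply disappear; the names `compose` and `skk` are hypothetical and introduced only for this illustration.

```python
# Untyped counterparts of the combinators; each takes its arguments one at a time.
Id = lambda x: x
compose = lambda f: lambda g: lambda x: g(f(x))    # f ; g  (first f, then g)
K = lambda x: lambda y: x
S = lambda f: lambda g: lambda x: f(x)(g(x))

# A classical observation: S K K behaves like the identity.
skk = S(K)(K)
```

For instance, `compose(lambda n: n + 1)(lambda n: n * 2)(3)` applies the successor first and the doubling second.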
These combinators, combined by application, form the proofs of the minimal propositional calculus, or positive implicational logic. Indeed, the types of K and S correspond to Hilbert's axioms. We define the product, that is, conjunction, as detailed above:

[A : *] ∧ [B : *] := ∀C · (A → B → C) → C.

A conjunction is introduced by the pairing algorithm:

A : *
    B : *
        < [x : A], [y : B] > := ∀C · [h : A → B → C] (h x y)
        ⊢ A → B → A ∧ B.

Note that the proof was carried out in the context [A : *][B : *]. The pairing algorithm <, > was made polymorphic by discharging A and B. The proofs of left and right elimination of ∧ correspond respectively to the algorithms π1 and π2 of first and second projection. As an example, here is the Currying algorithm:

A : *
    B : *
        C : *
            Curry [h : A ∧ B → C] := [x : A][y : B] (h <x, y>).

Note the synthesis of the polymorphism arguments in the application <x, y> above. We leave the proof of the converse implication to the reader. The isomorphism so defined proves the deduction theorem. Intuitionistic equivalence is therefore written:

[A : *] ↔ [B : *] := (A → B) ∧ (B → A).
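The pairing, projection and Currying algorithms can be checked on concrete values. The sketch below is an untyped Python rendering in which the type arguments are dropped; `uncurry` is one possible form of the converse implication left to the reader, given here as an assumption rather than as the authors' construction.

```python
# A pair is a function waiting for a consumer h of its two components.
pair = lambda x: lambda y: lambda h: h(x)(y)

# Projections feed the pair a consumer that keeps one component.
pi1 = lambda p: p(lambda x: lambda y: x)
pi2 = lambda p: p(lambda x: lambda y: y)

# Currying: turn a function on pairs into a function of two arguments.
curry = lambda h: lambda x: lambda y: h(pair(x)(y))

# One possible converse (assumption): apply k to both projections.
uncurry = lambda k: lambda p: k(pi1(p))(pi2(p))
```

For example, `curry(pi1)` behaves like the combinator K, since it builds the pair and immediately projects it.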
The sum, or intuitionistic disjunction, is built in the same way:

[A : *] ∪ [B : *] := ∀C · (A → C) → (B → C) → C.

Here again the definition follows the operational semantics: a proof of A ∪ B makes it possible to prove any proposition C, given methods h1 and h2 for proving C from A and from B respectively. The elimination of a sum is the algorithm of definition by cases:

A : *
    B : *
        C : *
            Si [c : A ∪ B] alors [x : A → C] sinon [y : B → C] := (c C x y).

The injection algorithms prove the left and right introduction of the sum. We leave their construction to the reader.
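As an illustration of the sum, here is an untyped Python sketch of the case-analysis algorithm together with one standard choice of injections. The injections `inl` and `inr` are not given in the text; they are an assumption made for this example, as is the name `si`.

```python
# A value of the sum is a function that picks one of two handlers.
inl = lambda a: lambda x: lambda y: x(a)   # left injection (assumed form)
inr = lambda b: lambda x: lambda y: y(b)   # right injection (assumed form)

# Case analysis: hand both branches to the sum value.
si = lambda c: lambda x: lambda y: c(x)(y)

left_branch = lambda a: a + 1
right_branch = lambda b: b * 10
```

With these definitions, `si(inl(3))(left_branch)(right_branch)` runs the left branch and `si(inr(3))(left_branch)(right_branch)` runs the right one.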
3.2 Quantifiers

Universal quantification, or generalized product, is implicit in the language, since one can define:

A : *
    Π([P : A → *]) := [x : A] (P x).

The elimination of Π, that is, instantiation, is here simply application. Its introduction is simply abstraction. Existential quantification, or generalized sum, is built as follows. In the context [A : *][P : A → *], the proposition ∃x · (P x) makes it possible to prove any statement B, from a proof that B follows from (P x) for arbitrary x : A:

A : *
    Σ([P : A → *]) := ∀B · ([x : A] (P x) → B) → B.

An existential quantification is introduced by the construction:

A : *
    ∃[P : A → *] := [x : A][h : (P x)] ∀B · [p : [x : A] (P x) → B] (p x h)
    ⊢ [P : A → *][x : A] (P x) → Σ(P).

Conversely, the sum can be projected onto a "witness" that satisfies the existentially quantified predicate:

A : *
    t [P : A → *] := [p : Σ(P)] (p A [x : A] (P x) → x)
    ⊢ [P : A → *] Σ(P) → A.

In practice, we shall allow ourselves to Skolemize the proof p into a constant C used as an abbreviation for (t P p), which brings the notation closer to mathematical practice. Note that it is not possible here to project p onto the proof of (P C) that it encapsulates. Our sum therefore differs from the one axiomatized by Martin-Löf [49]. It does, however, agree with the interpretation of an existential quantifier as an abstract type [44].
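The introduction of an existential and the projection onto its witness can be imitated in an untyped setting. In the Python sketch below, `pack` and `witness` are hypothetical names for the two constructions, and the "proof" component is just an arbitrary value.

```python
# Introduction: hide a witness x and a proof h behind a consumer p.
pack = lambda x: lambda h: lambda p: p(x)(h)

# Projection onto the witness: the consumer keeps x and drops the proof.
witness = lambda s: s(lambda x: lambda h: x)

packed = pack(7)("evidence that 7 has the property")
```

In the typed calculus only the witness can be extracted this way; the untyped sketch cannot express that restriction.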
3.3 Classical logic

Contradiction, or the absurd proposition, makes it possible to prove any proposition A by simple application:

∇ := ∀A · A.

∇ has no proof, and therefore plays the logical role of the truth value false. Negating a proposition amounts to stating that it entails ∇, whence the concept of negation:

¬[A : *] := A → ∇.

The Sheffer connective A | B (read "A contradicts B") is defined by:

[A : *] | [B : *] := A → B → ∇.

It is easy to show ∀A, B · (A | B) ↔ ¬(A ∧ B). The other classical connectives are expressed simply in terms of |:

[A : *] ⇒ [B : *] := A | ¬B
[A : *] ∨ [B : *] := (¬A) | (¬B)
[A : *] ⇔ [B : *] := (A ⇒ B) ∧ (B ⇒ A).

Let us call the classical closure of the proposition A its double negation:

[[A : *]] := ¬(¬A).

Every proposition negates its negation:

A : *
    p : A
        q : ¬A
            (q p)
(A, p, q)
⊢ ∀A · A → [A]
=: Neg_Neg.

The converse implication holds only for classical propositions:

Classique [A : *] := [A] → A.

One can show that ∇, ¬, | produce classical propositions, and hence so do ∨ and ⇒. Finally, ∧ preserves the property of being classical, and ⇔ therefore also produces classical propositions. In fact, a classical argument generally consists in showing that a set of propositions {A1, ..., An} is contradictory. The connectives ∇, ¬, | express this notion for n = 0, 1, 2 respectively. Note that the principle of the excluded middle is very simple to prove:

[A : *] (Id [A])
⊢ ∀A · (¬A) ∨ A.

4 An intuitionistic set theory
In this section, we show how to axiomatize the usual concepts of set theory in the Constructions. We work in a global context in which [U : *] has been declared. U can be thought of as the universe of discourse. Sets are identified with predicates on U, that is, with propositions of type U → *, which we abbreviate as Ens. A family is a set of sets, of type Ens → *, abbreviated Fam. Similarly, a relation is a set of Curried pairs, of type U → U → *, abbreviated Rel. We therefore work in the context:

U : *
    Ens := U → *
    Fam := Ens → *
    Rel := U → U → *
4.1 Sets, families, relations

Since sets are identified with their characteristic predicate, membership is defined simply as application:

[x : U] ∈ [P : Ens] := (P x).

Note that it is essential to distinguish the sign "∈" from the sign ":" of the typing relation; the notion of set appears as a notion derived from (but distinct from) that of type. This fundamental distinction goes back to the Principia [76]. In this setting we define the usual set-theoretic notions:

[P : Ens] ⊆ [Q : Ens] := [x : U] x ∈ P → x ∈ Q
[P : Ens] = [Q : Ens] := P ⊆ Q ∧ Q ⊆ P
∅ := [x : U] ∇
[P : Ens] ∩ [Q : Ens] := [x : U] x ∈ P ∧ x ∈ Q
[P : Ens] ⊔ [Q : Ens] := [x : U] x ∈ P ∪ x ∈ Q
[P : Ens] ∪ [Q : Ens] := [x : U] x ∈ P ∨ x ∈ Q
∼ [P : Ens] := [x : U] ¬ x ∈ P.

Note that the quantifiers can be interpreted as operations on sets: Π(P) says that P is universal, and Σ(P) says that P is not empty. The type of sets is intrinsically richer than the type of the universe. This form of Russell's paradox (or of Cantor's theorem) can be expressed by showing that there is no injection of type Ens → U. The proof is built by contradiction:

F : Ens → U
    G : U → Ens
        H : [P : Ens] (G (F P)) = P
            epimenide := [u : U] ¬ u ∈ (G u)
            menteur := (F epimenide)
            paradoxe := menteur ∈ (G menteur)
            (π1 (H epimenide) menteur)
            ⊢ paradoxe → ¬paradoxe
            =: negatif
                hyp : paradoxe
                    (negatif hyp hyp)
            ⊢ ¬paradoxe
            =: contradiction
            (π2 (H epimenide) menteur)
            ⊢ (¬paradoxe) → paradoxe
            =: positif
            (contradiction (positif contradiction))
            ⊢ ∇.

The intersection of a family of sets is defined by:

∩[F : Fam] := [x : U][P : Ens] (F P) → x ∈ P.

One can then check that this set is indeed the largest set included in all the sets of the family.
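The idea that a set is nothing but its characteristic predicate, and membership nothing but application, can be illustrated on a finite universe. The Python sketch below replaces propositions by truth values and takes a small `UNIVERSE` so that inclusion becomes decidable; both simplifications are assumptions of the example, as are all the function names.

```python
UNIVERSE = range(10)  # a small finite universe, assumed for illustration


def member(x, P):
    """Membership is simply application of the predicate."""
    return P(x)


def subset(P, Q):
    return all(Q(x) for x in UNIVERSE if P(x))


def equal(P, Q):
    """Extensional equality: mutual inclusion."""
    return subset(P, Q) and subset(Q, P)


def inter(P, Q):
    return lambda x: P(x) and Q(x)


def union(P, Q):
    return lambda x: P(x) or Q(x)


def compl(P):
    return lambda x: not P(x)


def big_inter(family):
    """Intersection of a family, given here as a finite list of predicates."""
    return lambda x: all(P(x) for P in family)


empty = lambda x: False
even = lambda x: x % 2 == 0
small = lambda x: x < 4
```

The last property stated in the text can be observed on the example: `big_inter([even, small])` is included in both `even` and `small`.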
On relations one can define the notions analogous to ⊆, =, ∅, ∩, ∪, ∼. By convention, we index these notions with 2, as for example ∩2. Similarly, one can consider families of relations (of type Fam2 = Rel → *), and define the intersection (∩2) of such a family. Finally, one can introduce the usual notions associated with relations:

Reflexive [R : Rel] := [x : U] (R x x)
Symetrique [R : Rel] := [x : U][y : U] (R x y) → (R y x)
Transitive [R : Rel] := [x : U][y : U][z : U] (R x y) → (R y z) → (R x z).

4.2 Intensional and extensional equality
The equality defined above is the extensional equality traditionally considered in set theory: two sets are equal iff they have the same elements. It is also possible to define an intensional equality, for any type. Following Leibniz, we say that x and y are identical iff they have the same properties (every set containing x contains y):

[x : U] ≡ [y : U] := [P : Ens] (x ∈ P) → (y ∈ P).

The substitutivity property is implicit in the definition of ≡. Let us show that ≡ is indeed an equivalence. The reflexivity of ≡ is shown by the identity algorithm:

Refl≡ := [x : U][P : Ens] (Id (x ∈ P))
⊢ Reflexive ≡.

Similarly, the transitivity of ≡ is shown by composition:

Trans≡ := [x : U][y : U][z : U][p : x ≡ y][q : y ≡ z][P : Ens] (p P);(q P)
⊢ Transitive ≡.

Symmetry, on the other hand, is less trivial, given that our implication is intuitionistic:

x : U
    y : U
        p : x ≡ y
            P : Ens
                Q := [z : U] (z ∈ P) → (x ∈ P)
                (p Q (Id (x ∈ P)))
                ⊢ (y ∈ P) → (x ∈ P)
⊢ Symetrique ≡.
4.3 Closure operations

We formalize the notion of closure of a relation R with respect to a property as the intersection of the family of relations that possess the property and contain R. As an example, we define the transitive closure:

[R : Rel]+ := [u : U][v : U][S : Rel] (Transitive S) → (R ⊆2 S) → (S u v).

More generally, let us define:

Fermeture [P : Fam2][R : Rel] := [u : U][v : U][S : Rel] (P S) → (R ⊆2 S) → (S u v).

When P is a conjunction, the definition is Curried, as for example for the reflexive and transitive closure:

[R : Rel]* := [u : U][v : U][S : Rel] (Transitive S) → (Reflexive S) → (R ⊆2 S) → (S u v).
We say that a property P is stable iff it is stable under intersection:

Stable [P : Fam2] := [Q : Fam2] ([R : Rel] (Q R) → (P R)) → (P (∩2 Q))

and we show: (Stable P) → (P (Fermeture P R)). As a special case, one can show that the property Transitive is stable, and deduce that R+ is transitive. Similarly, one shows by stability that R+ contains R. The next section illustrates the preceding notions on an elementary example in the theory of relations.
4.4 Newman's lemma

This lemma, which states that the confluence of a relation can be reduced to a local check if the relation is Noetherian, is fundamental for the study of simplification systems [55,40,36]. It illustrates well the use of higher-order logic. Let us first define the notions needed for the statement. We work in the context of a fixed relation R:

R : Rel.

Two elements are said to be coherent if they admit a common upper bound:

Coherence [x : U][y : U] := ∀A · ([z : U] (R+ x z) → (R+ y z) → A) → A.

Note that this definition is just the sequentialized form of the equivalent definition Σ((R+ x) ∩ (R+ y)). The notion of confluence expresses a form of determinism of R:

Confluence := [x : U][u : U][v : U] (R+ x u) → (R+ x v) → (Coherence u v).

Local confluence restricts this property to the immediate successors of x:

Confluence locale := [x : U][u : U][v : U] (R x u) → (R x v) → (Coherence u v).

A relation is Noetherian iff it admits no infinite chain. The formalization of this concept as a construction goes through its statement in the form of the principle of Noetherian induction [13]:

Noetherien := [P : Ens] ([u : U] ([v : U] (R u v) → v ∈ P) → u ∈ P) → [u : U] u ∈ P.

Newman's lemma is then stated:

Newman := Noetherien → Confluence locale → Confluence.

It is proved by Noetherian induction on the property (Confluence x). The proof is detailed in [18].
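The content of the lemma can be observed on small finite relations. The Python sketch below is an illustration only, not the proof by Noetherian induction: it assumes a finite universe, represents R as a set of pairs, and takes coherence with respect to the reflexive-transitive closure (an assumption of this example, chosen so that normal forms are coherent with themselves, whereas the text writes R+). All function names are hypothetical.

```python
def reach_sets(universe, R):
    """Reflexive-transitive closure of R, as a map x -> set of reachable elements."""
    reach = {x: {x} for x in universe}
    changed = True
    while changed:
        changed = False
        for a, b in R:
            for x in universe:
                if a in reach[x] and b not in reach[x]:
                    reach[x].add(b)
                    changed = True
    return reach


def coherent(reach, u, v):
    """u and v are coherent when some element is reachable from both."""
    return not reach[u].isdisjoint(reach[v])


def locally_confluent(R, reach):
    return all(coherent(reach, u, v) for x, u in R for y, v in R if x == y)


def confluent(universe, reach):
    return all(coherent(reach, u, v)
               for x in universe for u in reach[x] for v in reach[x])


U = "abcd"
# A terminating (Noetherian) relation: a -> b, a -> c, b -> d, c -> d.
diamond = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")}
# A relation with the cycle b -> c -> b, hence not Noetherian.
loop = {("b", "a"), ("b", "c"), ("c", "b"), ("c", "d")}
```

On `diamond`, local confluence and confluence both hold. On `loop`, local confluence holds but confluence fails, since a and d are both reachable from b and have no common successor; this shows why the Noetherian hypothesis cannot be dropped.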
5 Constructions for computer science

In this chapter we show that the Calculus of Constructions is well suited to formalizing certain concepts of computer science.

5.1 Universal algebra and data types
Let us begin by showing how to formalize the elementary notions of algebra, and in particular the notion of free algebra over a signature. We first consider the homogeneous case, that is, the context:

A : *

For every n ≥ 0, we define the A-cardinal n̄ associated with n by recursion: 0̄ = A, and the A-cardinal associated with n+1 is A → n̄.

We now define the functionality Σ, represented in the form …

In the last clause, one could of course have written n̄ → …

… arbitrary Σ-algebra, that is, in the context Γ : …

If M : I(Σ) is an arbitrary construction of an element of the initial Σ-algebra, we call image of M in the Σ-algebra Γ the term M̄ obtained by applying M to A and to the operators of Γ in order, M̄ = (M A F1 ...). Note that this term is well formed, and of type A. This notion of image corresponds, in the classical way, to taking the image of M under the unique Σ-morphism I(Σ) → Γ. For example, given M1 : I(Σ), M2 : I(Σ), ..., Mnk : I(Σ), we have (Fk M̄1 ... M̄nk) : A. We thus define an operator in I(Σ), called the Fk-constructor, which is obtained by abstracting over Γ and over a list of nk variables of type I(Σ). Let us now give a few examples. If Σ = ∅, we obtain I(Σ) = ∇, the empty algebra. If Σ = [i : 0], we obtain I(Σ) = Un := ∀A · A → A, and the i-constructor is Id = [A : *][i : A] i. With Σ = [v : 0][f : 0], we obtain I(Σ) = Bool := ∀A · A → A → A, and the two constructors are the Church Booleans [12]:

Vrai := [A : *][v : A][f : A] v
Faux := [A : *][v : A][f : A] f.

When Σ = [s : 1][z : 0], we obtain I(Σ) = Nat, the Church integers:

Nat := ∀A · (A → A) → A → A
S ([n : Nat]) := [A : *][s : A → A][z : A] (s (n A s z))
0 := [A : *][s : A → A][z : A] z.

When Σ = [c : 2][n : 0], we obtain I(Σ) = Bin, the binary trees:

Bin := ∀A · (A → A → A) → A → A
Cons [a1 : Bin][a2 : Bin] := [A : *][c : A → A → A][n : A] (c (a1 A c n) (a2 A c n))
Nul := [A : *][c : A → A → A][n : A] n.

These notions are easily generalized to the non-homogeneous case, by introducing as many sorts as necessary. For example, the list structure is axiomatized over two sorts A and B:

Liste := ∀A, B · (A → B → B) → B → B.

Here the operation of adding an element to a list is doubly polymorphic, and we can easily synthesize this polymorphism:

A : *
    B : *
        Ajout [x : A][L : (Liste A B)] := [f : A → B → B][v : B] (f x (L f v))
⊢ ∀A, B · A → (Liste A B) → (Liste A B).

One could also define a more general unary operator Ajouter that adds to polymorphic lists:

A : *
    Ajouter [x : A] := [L : (Liste A)] ∀B · [f : A → B → B][v : B] (f x (L f v))
⊢ ∀A · A → (Liste A) → (Liste A).

Note that the empty list is simply of type Liste:

Vide := [A : *][B : *][f : A → B → B][v : B] v
⊢ Liste.
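The constructors of these free algebras can be run directly once the type arguments are erased. The following Python sketch is untyped, so `[A : *]` abstractions are dropped; the helpers `to_int`, `size` and `to_list` are hypothetical names, added only to observe the encoded values.

```python
# Church Booleans: select one of two arguments.
vrai = lambda v: lambda f: v
faux = lambda v: lambda f: f

# Church integers: iterate s on z.
zero = lambda s: lambda z: z
succ = lambda n: lambda s: lambda z: s(n(s)(z))
to_int = lambda n: n(lambda k: k + 1)(0)

# Binary trees: fold with a binary operation c and a leaf value n.
nul = lambda c: lambda n: n
cons = lambda a1: lambda a2: lambda c: lambda n: c(a1(c)(n))(a2(c)(n))
size = lambda t: t(lambda l: lambda r: l + r + 1)(0)   # number of Cons nodes

# Lists: fold with f on elements and v for the empty list.
vide = lambda f: lambda v: v
ajout = lambda x: lambda L: lambda f: lambda v: f(x)(L(f)(v))
to_list = lambda L: L(lambda x: lambda acc: [x] + acc)([])
```

Each observer is just the image of the encoded value in a particular algebra: `to_int` interprets s as "add one" and z as 0, and `to_list` interprets f as "prepend".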
In general, one can easily define all the data structures corresponding to free algebras. Note that the corresponding propositions are exactly Girard's types restricted to degree 2. Our treatment is similar to that of Böhm and Berarducci [7]. Let us give a few examples of programs on integers. Addition is obtained by iterating successor:

Succ := [n : Nat] S(n)
[m : Nat] + [n : Nat] := (n Nat Succ m).

Other definitions are possible. Multiplication is obtained by iterating addition:

Add := [m : Nat][n : Nat] m + n
[m : Nat] * [n : Nat] := (n Nat (Add m) 0).

Integers can also be seen as polymorphic iterators. The multiplication of m and n could thus have been defined as the composition m;n. Exponentiation is obtained by iterating multiplication:

Mult := [m : Nat][n : Nat] (m * n)
[m : Nat] ↑ [n : Nat] := (n Nat (Mult m) S(0)).

Iterating an integer over a functional type can produce functions that are not primitive recursive, such as the Ackermann function:

Ack [n : Nat] := (n (Nat → Nat) ([f : Nat → Nat][m : Nat] (m Nat f m)) Succ).
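These programs can be executed on Church integers with the type arguments erased. The Python sketch below follows the definitions of addition, multiplication, exponentiation and Ack literally; `from_int` and `to_int` are hypothetical conversion helpers added for observation, and `power` is a hypothetical name for the ↑ operator.

```python
zero = lambda s: lambda z: z
succ = lambda n: lambda s: lambda z: s(n(s)(z))

add = lambda m: lambda n: n(succ)(m)               # iterate successor n times from m
mult = lambda m: lambda n: n(add(m))(zero)         # iterate (Add m) n times from 0
power = lambda m: lambda n: n(mult(m))(succ(zero)) # iterate (Mult m) n times from 1

# Ack n: iterate the functional f |-> (m |-> f^m(m)) n times, starting from Succ.
ack = lambda n: n(lambda f: lambda m: m(f)(m))(succ)


def from_int(k):
    """Build the Church integer for a non-negative Python int."""
    n = zero
    for _ in range(k):
        n = succ(n)
    return n


def to_int(n):
    """Read back a Church integer by iterating +1 on 0."""
    return n(lambda k: k + 1)(0)
```

With this definition, `ack(from_int(1))` doubles its argument and `ack(from_int(2))` maps m to m·2^m, so the growth already escapes any fixed number of nested iterations as the first argument increases.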
5.2
Ordinaux
Tous les types d'algebres ci-dessus sont d'une forme tres simple, l'emboitement des fleches etant limite. Tres exactement, ils sont de degre 2, avec Ie degre S defini par:
• SeA) = 0
A variable
• S([F: T] X) = max{l + S(T),S(X)}. Pour des types plus cornpliques, on obtient des structures de donnees plus riches. Par exemple, les ordinaux peuvent etre construits comme extension des entiers, en ajoutant une operation limite, qui associe une ordinal a toute suite d'ordinaux, representee par une fonction de domaine Nat. On definit done :
Ord := VA· «Nat -+ A) -+ A) -+ (A -+ A) -+ A -+ A Lim [0' : Nat -+ Ord] .- [A: *][li : (Nat -+ A) N at](u n A li s z))
-+
A][s
A
-+
A][z
A] (Ii [n
Calcul des Constructions
141
Suc [α: Ord] := [A: *][li: (Nat → A) → A][s: A → A][z: A] (s (α A li s z))
Z := [A: *][li: (Nat → A) → A][s: A → A][z: A] z.
It is easy to translate an integer into the corresponding ordinal, which defines the set of finite ordinals:
Finis := [n: Nat] (n Ord ([α: Ord] Suc(α)) Z).
Note the instantiation of the polymorphism argument of the integer n to the type Ord. Indeed, the meaning of * is to denote an arbitrary type (a proposition), and not merely a type belonging to a totality circumscribed by the construction containing this occurrence of *. In other words, we make essential use of the impredicativity of the calculus.
The first infinite ordinal, ω, is obtained as the limit of the finite ordinals: ω := (Lim Finis).
One programs on the ordinals in much the same way as on the integers:
[α: Ord] ⊕ [β: Ord] := (β Ord Lim ([γ: Ord] Suc(γ)) α)
[α: Ord] ⊗ [β: Ord] := (β Ord Lim ([γ: Ord] α ⊕ γ) Z)
[α: Ord] ↑ [β: Ord] := (β Ord Lim ([γ: Ord] α ⊗ γ) Suc(Z)).
Note that our ordinals are in fact ordinal notations, that is, ordinals presented with fundamental sequences. In particular, 1 ⊕ ω and ω are distinct constructions. One obtains ε0 by iterating ω ↑ ω ↑ ...:
ε0 := (ω Ord Lim ([γ: Ord] ω ↑ γ) Z).
We can now use the ordinals to describe functional hierarchies. Let us give a few preliminary definitions on integer functions:
Incr := [f: Nat → Nat][n: Nat] (S (f n))
Iter := [f: Nat → Nat][n: Nat] (n Nat f n)
Diag := [σ: Nat → Nat → Nat][n: Nat] (σ n n).
We now define Schwichtenberg's fast hierarchy by:
Rapide [α: Ord] := (α (Nat → Nat) Diag Iter Succ)
and the slow hierarchy differs from it only in the successor functional:
Lente [α: Ord] := (α (Nat → Nat) Diag Incr Succ).
Note that (Rapide ε0) is a total recursive function, but one whose termination is independent of Peano arithmetic [22,41].
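The ordinal encoding and the two hierarchies can be explored with a small Python sketch. It rests on several assumptions that are not in the text: Nat is modelled by native Python integers rather than by iterators, the type argument A is dropped, and the names `finite`, `fast` and `slow` are illustrative stand-ins for Finis, Rapide and Lente.

```python
# An ordinal notation is modelled as a function of three arguments (li, s, z):
# li handles limits, s handles successors, z is the value at zero.

def Z(li, s, z):
    return z

def Suc(a):
    return lambda li, s, z: s(a(li, s, z))

def Lim(sigma):
    # sigma maps a Python int n to the n-th ordinal of a fundamental sequence
    return lambda li, s, z: li(lambda n: sigma(n)(li, s, z))

def finite(n):
    """Finis: the ordinal notation of the integer n (n is a native int here)."""
    a = Z
    for _ in range(n):
        a = Suc(a)
    return a

omega = Lim(finite)

# Preliminary definitions on integer functions
def succ(n):
    return n + 1

def incr(f):
    # Incr f n = S (f n)
    return lambda n: f(n) + 1

def iterate(f):
    # Iter f n = f applied n times to n
    def g(n):
        x = n
        for _ in range(n):
            x = f(x)
        return x
    return g

def diag(sigma):
    # Diag sigma n = sigma n n
    return lambda n: sigma(n)(n)

def fast(a):
    # Rapide a := (a (Nat -> Nat) Diag Iter Succ)
    return a(diag, iterate, succ)

def slow(a):
    # Lente a := (a (Nat -> Nat) Diag Incr Succ)
    return a(diag, incr, succ)
```

For instance, `slow(omega)` maps n to 2n + 1, while `fast(omega)(2)` already equals 8 and `fast(omega)(3)` is too large to be worth evaluating, so only very small arguments are practical for the fast hierarchy.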
6 Conclusion
The Calculus of Constructions is a formalization of higher-order natural deduction, well suited to machine processing. It synthesizes Martin-Löf's type theory of 1971 and Girard's system F, within a natural generalization of the language Λ of Automath. We have tried to show by examples that this calculus is well suited to expressing mathematical and computational notions concisely. The theory was developed with a view to its application to the design of programming environments founded on logical principles, so that programs can be developed in a manner compatible with their specification. The Calculus of Constructions makes it possible to represent data types and complex logical propositions in a uniform manner. The consistency of programs with their specifications can be verified mechanically. We propose a software development methodology in which the programmer constructs algorithms with the interactive assistance of the type-checking system. In some cases it will even be possible to synthesize the program partially. Many problems are raised by this approach. For example, the formalism is powerful enough to be considered a very high-level programming language, but the absence of general recursion forces one to rethink the activity of programming. New methodologies, a new "discipline" of programming, remain to be defined. Moreover, the machine language suggested by this approach is the pure λ-calculus. Despite the simplicity and uniformity of this computation mechanism, no machine architecture is yet known that executes it efficiently. It nevertheless remains conceivable to extend the formalism by adding constants implemented directly by the operations of a traditional machine. For example, one could add primitive integers and a recursion operator. The Calculus of Constructions then becomes the description of an extremely general type checker for the programming languages to come [42].
References
[1] P. B. Andrews, Dale A. Miller, Eve Longini Cohen, Frank Pfenning "Automating higher-order logic." Dept of Math, University Carnegie-Mellon, (Jan. 1983).
[2] P. B. Andrews "Resolution in Type Theory." Journal of Symbolic Logic 36,3 (1971), 414-432.
[3] R. Backhouse "Algorithm development in Martin-Löf's type theory." University of Essex (July 1984).
[4] H. Barendregt "The Lambda-Calculus: Its Syntax and Semantics." North-Holland (1980).
[5] E. Bishop "Foundations of Constructive Analysis." McGraw-Hill, New York (1967).
[6] E. Bishop "Mathematics as a numerical language." Intuitionism and Proof Theory, Eds. J. Myhill, A. Kino and R. E. Vesley, North-Holland, Amsterdam, (1970) 53-71.
[7] C. Böhm, A. Berarducci "Automatic Synthesis of Typed Lambda-Programs on Term Algebras." Unpublished manuscript, (June 1984).
[8] N. G. de Bruijn "The mathematical language AUTOMATH, its usage and some of its extensions." Symposium on Automatic Demonstration, IRIA, Versailles, 1968. Printed as Springer-Verlag Lecture Notes in Mathematics 125, (1970) 29-61.
[9] N. G. de Bruijn "Lambda-Calculus Notation with Nameless Dummies, a Tool for Automatic Formula Manipulation, with Application to the Church-Rosser Theorem." Indag. Math. 34,5 (1972), 381-392.
[10] N. G. de Bruijn "Automath a language for mathematics." Les Presses de l'Université de Montréal, (1973).
[11] N. G. de Bruijn "A survey of the project Automath." In To H. B. Curry: Essays on Combinatory Logic, Lambda Calculus and Formalism, Eds. Seldin J. P. and Hindley J. R., Academic Press (1980).
[12] A. Church "The Calculi of Lambda-Conversion." Princeton U. Press, Princeton N.J. (1941).
[13] P. M. Cohn "Universal Algebra." Reidel, 1965.
[14] R. L. Constable, J. L. Bates "Proofs as Programs." Dept. of Computer Science, Cornell University. (Feb. 1983).
[15] R. L. Constable, J. L. Bates "The Nearly Ultimate Pearl." Dept. of Computer Science, Cornell University. (Dec. 1983).
[16] Th. Coquand "Une théorie des constructions." Thèse de troisième cycle, Université Paris VII (Jan. 85).
[17] Th. Coquand, G. Huet "A Theory of Constructions." Preliminary version, presented at the International Symposium on Semantics of Data Types, Sophia-Antipolis (June 84).
[18] Th. Coquand, G. Huet "Constructions: A Higher Order Proof System for Mechanizing Mathematics." EUROCAL85, Linz, Austria (April 85).
[19] Th. Coquand, G. Huet "A Calculus of Constructions." To appear in Information and Control (1986).
[20] H. B. Curry, R. Feys "Combinatory Logic Vol. I." North-Holland, Amsterdam (1958).
[21] D. Van Daalen "The language theory of Automath." Ph. D. Dissertation, Technological Univ. Eindhoven (1980).
[22] S. Fortune, D. Leivant, M. O'Donnell "The Expressiveness of Simple and Second-Order Type Structures." Journal of the Assoc. for Comp. Mach., 30,1, (Jan. 1983) 151-185.
[23] G. Frege "Begriffsschrift, a formula language, modeled upon that of arithmetic, for pure thought." (1879). Reprinted in From Frege to Gödel, J. van Heijenoort, Harvard University Press, 1967.
[24] G. Gentzen "The Collected Papers of Gerhard Gentzen." Ed. E. Szabo, North-Holland, Amsterdam (1969).
[25] J. Y. Girard "Une extension de l'interprétation de Gödel à l'analyse, et son application à l'élimination des coupures dans l'analyse et la théorie des types." Proceedings of the Second Scandinavian Logic Symposium, Ed. J. E. Fenstad, North Holland (1970) 63-92.
[26] J. Y. Girard "Interprétation fonctionnelle et élimination des coupures dans l'arithmétique d'ordre supérieure." Thèse d'Etat, Université Paris VII (1972).
[27] K. Gödel "Über eine bisher noch nicht benützte Erweiterung des finiten Standpunktes." Dialectica, 12 (1958).
[28] M. Gordon, R. Milner, C. Wadsworth "A Metalanguage for Interactive Proof in LCF." Internal Report CSR-16-77, Department of Computer Science, University of Edinburgh (Sept. 1977).
[29] M. J. Gordon, A. J. Milner, C. P. Wadsworth "Edinburgh LCF." Springer-Verlag LNCS 18 (1979).
[30] J. Herbrand "Recherches sur la théorie de la démonstration." Thèse, U. de Paris (1930). Ecrits logiques de Jacques Herbrand, PUF Paris (1968).
[31] W. A. Howard "The formulae-as-types notion of construction." Unpublished manuscript (1969). Reprinted in To H. B. Curry: Essays on Combinatory Logic, Lambda Calculus and Formalism, Eds. Seldin J. P. and Hindley J. R., Academic Press (1980).
[32] G. Huet "A Unification Algorithm for Typed Lambda Calculus." Theoretical Computer Science, 1,1 (1975) 27-57.
[33] G. Huet "Constrained Resolution: a Complete Method for Type Theory." Ph.D. Thesis, Case Western Reserve University (1972).
[34] G. Huet "A Mechanization of Type Theory." Proceedings, 3rd IJCAI, Stanford (Aug. 1973).
[35] G. Huet "Résolution d'équations dans des langages d'ordre 1,2, ... ω." Thèse d'Etat, Université Paris VII (1976).
[36] G. Huet "Confluent Reductions: Abstract Properties and Applications to Term Rewriting Systems." J. Assoc. Comp. Mach. 27,4 (1980) 797-821.
[37] L. S. Jutting "The language theory of Λ, a λ-calculus where terms are types." Unpublished manuscript (1984).
[38] L. S. Jutting "A translation of Landau's "Grundlagen" in AUTOMATH." Eindhoven University of Technology, Dept of Mathematics (Oct. 1976).
[39] S. C. Kleene "Introduction to Meta-mathematics." North Holland (1952).
[40] D. Knuth, P. Bendix "Simple word problems in universal algebras", in: Computational Problems in Abstract Algebra, J. Leech Ed., Pergamon (1970) 263-297.
[41] G. Kreisel "On the interpretation of nonfinitist proofs, Part I, II." JSL 16 (1952, 1953).
[42] P. J. Landin "The next 700 programming languages." Comm. ACM 9 (1966) 157-166.
[43] D. Leivant "Polymorphic type inference." 10th ACM Conference on Principles of Programming Languages (1983).
[44] D. Macqueen, G. Plotkin, R. Sethi "An ideal model for recursive polymorphic types." Proceedings, Principles of Programming Languages Symposium, Jan. 1984, 165-174.
[45] P. Martin-Löf "A theory of types." Unpublished manuscript (Oct. 1971).
[46] P. Martin-Löf "An intuitionistic Theory of Types: predicative part." Logic Colloquium 73, Eds. H. Rose and J. Shepherdson, North-Holland, (1974) 73-118.
[47] P. Martin-Löf "About models for intuitionistic type theories and the notion of definitional equality." Paper read at the Orléans Logic Conference (1972).
[48] P. Martin-Löf "Constructive Mathematics and Computer Programming." Logic, Methodology and Philosophy of Science VI, 153-175, (1980) North-Holland.
[49] P. Martin-Löf "Intuitionistic Type Theory." Studies in Proof Theory, Bibliopolis (1984).
[50] D. A. Miller "Proofs in Higher-order Logic." Ph. D. Dissertation, Carnegie-Mellon University (Aug. 1983).
[51] R. Milner "A Theory of Type Polymorphism in Programming." Journal of Computer and System Sciences 17 (1978) 348-375.
[52] C. Mohring "Algorithm Development in the Calculus of Constructions." IEEE Symposium on Logic in Computer Science, Cambridge, Mass. (June 1986).
[53] R. P. Nederpelt "Strong normalization in a typed λ calculus with λ structured types." Ph. D. Thesis, Eindhoven University of Technology (1973).
[54] R. P. Nederpelt "An approach to theorem proving on the basis of a typed λ-calculus." 5th Conference on Automated Deduction, Les Arcs, France. Springer-Verlag LNCS 87 (1980).
[55] M. H. A. Newman "On Theories with a Combinatorial Definition of Equivalence." Annals of Math. 43,2 (1942) 223-243.
[56] B. Nordström "Programming in Constructive Set Theory: Some Examples." Proceedings of the ACM Conference on Functional Programming Languages and Computer Architecture, Portsmouth, New Hampshire (Oct. 1981) 141-154.
[57] B. Nordström, K. Petersson "Types and Specifications." Information Processing 83, Ed. R. Mason, North-Holland, (1983) 915-920.
[58] B. Nordström, J. Smith "Propositions and Specifications of Programs in Martin-Löf's Type Theory." BIT 24, (1984) 288-301.
[59] T. Pietrzykowski, D. C. Jensen "A complete mechanization of ω-order type theory." Proceedings of ACM Annual Conference (1972).
[60] T. Pietrzykowski "A Complete Mechanization of Second-Order Type Theory." JACM 20 (1973) 333-364.
[61] D. Prawitz "Natural Deduction." Almqvist and Wiksell, Stockholm (1965).
[62] D. Prawitz "Ideas and results in proof theory." Proceedings of the Second Scandinavian Logic Symposium (1971).
[63] J. C. Reynolds "Definitional Interpreters for Higher Order Programming Languages." Proc. ACM National Conference, Boston, (Aug. 72) 717-740.
[64] J. C. Reynolds "Towards a Theory of Type Structure." Programming Symposium, Paris. Springer-Verlag LNCS 19 (1974) 408-425.
[65] J. C. Reynolds "Polymorphism is not set-theoretic." International Symposium on Semantics of Data Types, Sophia-Antipolis (June 1984).
[66] J. C. Reynolds "Three approaches to type structure." TAPSOFT Advanced Seminar on the Role of Semantics in Software Development, Berlin (March 1985).
[67] J. C. Reynolds "Types, abstraction, and parametric polymorphism." IFIP Congress'83, Paris (Sept. 1983).
[68] D. Scott "Constructive validity." Symposium on Automatic Demonstration, Springer-Verlag Lecture Notes in Mathematics, 125 (1970).
[69] J. R. Shoenfield "Mathematical Logic." Addison-Wesley (1967).
[70] J. Smith "Course-of-values recursion on lists in intuitionistic type theory." Unpublished notes, Göteborg University (Sept. 1981).
[71] J. Smith "The identification of propositions and types in Martin-Löf's type theory: a programming example." International Conference on Foundations of Computation Theory, Borgholm, Sweden, (Aug. 1983) Springer-Verlag LNCS 158.
[72] S. Stenlund "Combinators, λ-terms, and proof theory." Reidel (1972).
[73] W. Tait "A non constructive proof of Gentzen's Hauptsatz for second order predicate logic." Bull. Amer. Math. Soc. 12 (1966).
[74] M. Takahashi "A proof of cut-elimination theorem in simple type theory." J. Math. Soc. Japan 19 (1967).
[75] G. Takeuti "On a generalized logic calculus." Japan J. Math. 23 (1953).
[76] A. N. Whitehead, B. Russell "Principia Mathematica." Cambridge University Press (1903).
Logic Colloquium '85 Edited by The Paris Logic Group © Elsevier Science Publishers B.V. (North-Holland), 1987
ARITHMETICAL TRUTH AND HIDDEN HIGHER-ORDER CONCEPTS
Daniel Isaacson
Oxford University*

I dedicate this paper to the memory of my father, who died on August 30th, 1985.

§0. Introduction
The incompleteness of formal systems for arithmetic has been a recognized fact of mathematical life for over half a century, and much has been said about it.
Even so, I want still to raise some issues in
this area, to advocate certain ways of looking at the phenomenon.
In
particular, I want to focus attention on the status of Peano Arithmetic, and on the nature of those true statements in the language of arithmetic which are unprovable in it. Gödel's construction of an undecidable sentence, given a formal system strong enough for numeral-wise expressibility of rather weak arithmetic, shows that each such formal system for arithmetical truth must admit proper extensions.
This situation suggests that the choice of any
particular formal system must be provisional, subject to an eventual mathematical need to go beyond it.
At the same time, Peano Arithmetic
seems a natural and intrinsically important axiomatization.
Does this
impression arise through the historical accident of what people happened to arrive at first, or does it reflect rather some underlying conceptual fact? This paper explores a viewpoint on which Peano Arithmetic indeed occupies an intrinsic, conceptually well-defined region of arithmetical truth.
The idea is that it consists of those truths which can be
perceived as such directly from the purely arithmetical content of a categorical conceptual analysis of the notion of natural number.
The
truths expressible in the (first-order) language of arithmetic which lie beyond that region are such that there is no way by which their truth can be perceived in purely arithmetical terms. Via the phenomenon of coding, they contain essentially hidden higher-order, or infinitary concepts. On this perspective, Peano Arithmetic may be seen as complete for finite mathematics.

*Address for correspondence: Sub-Faculty of Philosophy, 10 Merton Street, Oxford OX1 4JJ, England.

§1. The starting point: properties which characterize the concept of natural number.

The nineteenth century saw tremendous development in the conceptual basis of mathematical knowledge.
This was most notable, and pressing,
in the case of real analysis and the understanding of the continuum. The development of correct definitions of limit, continuity, the derivative, the distinction between convergence and uniform convergence, isolation of the least upper-bound and cut property of the reals allowed better understanding of results already obtained.
It was also now
possible to obtain new results not accessible earlier on the old conceptual basis (such as, most famously, the existence of a continuous but nowhere differentiable real-valued function of a real variable). In the later part of the nineteenth century this enterprise of rigourization came also to be applied to that most basic part of mathematics, the theory of whole numbers. The fact that arithmetic was dealt with in this way only after the calculus had been reflects the differing nature of the demand for conceptual understanding of the reals and of the natural numbers.
The
drive to understand more deeply the basic, characteristic properties of the reals arose directly out of demands of mathematical practice.
In
the case of the natural numbers there was no doubting the truth of the results being obtained. The essential contrast is between finite and infinite. Each natural number is a finite entity, while a single real
number is itself an infinite structure (whose infinity can be analyzed in various ways, for example by an infinitely proceeding process of increasingly sharp approximation).
Success in analyzing the mathematical
structure of that infinity showed how the real numbers, and functions on them, could be thought of as built up by set-theoretic operations on the natural numbers.
Hence the momentum of the foundational enterprise
combined readily with philosophical curiosity to impel the attempt to understand the basic nature of the natural numbers. In some ways nothing could be more transparent. The natural numbers are what arise in the process of counting. We mark the beginning of that process by counting 1, and the further natural numbers arise by iteration
of the process of counting a next one (with possibly a slight increase of sophistication, we may think of 0 as counting the empty collection, and in this way as constituting the smallest natural number).
Yet
however much this simple account might seem plausibly to articulate our actual understanding, it shows a basic circularity, namely that the notion of successive iterations is tantamount to the notion of natural number, which must mean that it cannot constitute the conceptual basis of the analysis of our grasp of this notion.
Dedekind* addressed himself
specifically to this circularity (as he recounts in his letter to Keferstein [3]) and arrived at the following non-circular characterization of the structure of the sequence of natural numbers: it consists of the smallest collection which is closed under an operation which takes distinct elements to distinct elements and which contains an element which is distinguished as not being the successor of any element. This notion of smallest such collection is expressed by the condition that the natural numbers consist of the intersection of all such collections (p. 101 of [3], Chapter VI of [2]).
Thus we arrive at the following explicit definition
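of the property of being a natural number:

$\mathbb{N}(x) =_{\mathrm{df}} \forall X\,(0 \in X \;\&\; \forall y\,(y \in X \to s(y) \in X) \to x \in X)$

where (1) $\forall x\,(s(x) \neq 0)$, and (2) $\forall x\,\forall y\,(s(x) = s(y) \to x = y)$.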
The correctness
of this definition to characterize the intended structure is dependent on the existence of an inductive set, i.e. one containing 0 and closed with respect to the injective map s with 0 not in its range.
Dedekind was well
aware of this point, and in his Theorem 66 attempted a proof, based on the notion 'possible object of thought'.
How convincing such an existence
proof might be is disputable, and one may prefer rather to take the required conclusion as given by an axiom.

*The analyses of natural number by both Frege [5] and Dedekind [2] are, while independently arrived at, identical as to key ideas. My discussion here is in terms of Dedekind's analysis, both because the motivating line of development followed by Dedekind is the one I wish also to follow (as opposed to Frege's motivation in terms of ontology), and because Dedekind explicitly addresses himself to the issue of categoricity of the analysis.

From this definition the principle of induction for natural numbers follows:

(3) $\forall X\,(0 \in X \;\&\; \forall y\,(y \in X \to s(y) \in X) \to \forall x\,(x \in X))$

where the elements of the domain satisfy the condition $\mathbb{N}(x)$ given above. This analysis/definition of the natural numbers suggests and validates the axiomatization of arithmetic by the three axioms (1), (2), (3).
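The step from the definition to the induction principle can be spelled out. The following is a sketch only, written with quantifiers ranging over an ambient domain and with the restriction to natural numbers made explicit; the auxiliary set $Y$ is introduced here for illustration and is not named in the text, and its existence assumes comprehension for the second-order variables.

```latex
% Sketch: deriving induction, relativized to \mathbb{N}, from the definition of \mathbb{N}(x).
% Y is an auxiliary set introduced for illustration; forming it assumes comprehension.
\begin{align*}
&\text{Assume } 0 \in X \text{ and } \forall y\,(\mathbb{N}(y) \,\&\, y \in X \to s(y) \in X). \\
&\text{Put } Y = \{\, y : \mathbb{N}(y) \,\&\, y \in X \,\}. \\
&\mathbb{N}(0) \text{ and } \forall y\,(\mathbb{N}(y) \to \mathbb{N}(s(y))),
  \text{ since } 0 \text{ lies in every inductive set and each such set is closed under } s. \\
&\text{Hence } 0 \in Y \text{ and } \forall y\,(y \in Y \to s(y) \in Y). \\
&\text{If } \mathbb{N}(x), \text{ instantiating the definition of } \mathbb{N}(x) \text{ at } Y
  \text{ gives } x \in Y, \text{ so } x \in X. \\
&\text{Therefore } \forall x\,(\mathbb{N}(x) \to x \in X).
\end{align*}
```

The only non-trivial ingredient is the instantiation of the universally quantified set variable at $Y$, which is where the second-order character of the definition is used.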
Having characterized the structure of the succession of natural numbers, Dedekind then dealt with their arithmetic by establishing (in section 126) the unique existence of functions defined by (primitive) recursion, which he applied to obtain the standard operations of addition, multiplication, and exponentiation.
He also used this result to
demonstrate the categoricity of this characterization of the natural numbers (sections 132-4) (were there two structures satisfying these axioms, the stipulation of an isomorphism between them is a primitive recursion in the domain of the map).
From categoricity it follows that
these axioms have as logical consequences exactly the truths of arithmetic.

§2. Transition to a first-order axiomatic system: Peano Arithmetic.

The situation so far described sounds a thorough success. However, two features of it are problematic.
The first is that this categorical
axiomatization cannot be fully formalized.
It relies upon the logic
for quantifiers ranging over all subsets of the domain of individuals, and the very success of the axiomatization, its completeness as a consequence of categoricity, means that in light of the Gödel incompleteness theorem there can be no consistent formal system which recursively enumerates the logically valid formulas for these second-order quantifiers. The other difficulty has to do with impredicativity.
We noted earlier
that the characterization could only succeed in case there exists a set containing 0 and closed under succession.
But in fact there must be a
set consisting precisely of the natural numbers.
If the smallest
inductive set contained extraneous elements appropriately structured, then the axioms would still be true, but the structure in which they were true would not be isomorphic to the natural numbers.
The given characteriza-
tion of the natural numbers can only be known to succeed if the structure we are attempting to characterize is recognized to be an element in the second-order domain. possible.
A variety of responses to these difficulties are
I want here to follow one of these, namely the transition to
the corresponding first-order axiomatic system, Peano Arithmetic. To some extent a philosophical presumption has developed in favour of first-order axiomatizations (see for example W.V. Quine [12], Chapter 5).
However, the expressive weakness of first-order languages strikes me as decisive against the possibility of working exclusively within first-order logic.*
Nonetheless there is something of intrinsic importance in
a first-order language, since it is one in which, for arithmetic, the objects we talk about by it are the natural numbers, and that is just what we do in doing arithmetic.
It is natural then to look for a first-order formal system which expresses as much as possible of the insight into the structure of the natural numbers and their arithmetic given by Dedekind's analysis. The first two axioms of Dedekind's analysis are, as they stand, first-order.
What about the axiom of induction, with its quantification over
sets of natural numbers?
The way we make use of it in our study of the
natural numbers is in establishing by mathematical induction that some particular numerical property applies to all the natural numbers.
The
purely arithmetical content of the induction principle (as distinct from its full conceptual content) is thus given by a scheme which tells us that induction holds for any particular numerical property.
Numerical
properties are expressed by free-variable formulas in a language for arithmetic.
So we come to the question, what language should that be?
The formulation of Dedekind's analysis was in terms of successor and the distinguished element 1 (for Dedekind, 0 for us).
As Dedekind's
proof of the recursion theorem shows, all the usual arithmetic of the natural numbers can then be established on this basis.
By contrast, if
we restrict ourselves to those properties of natural numbers expressible only by quantification over the natural numbers themselves, then a system with only 0 and s as primitive is extremely weak, in particular, not strong enough to prove the existence of arithmetical functions defined by primitive recursion.
However, if we take as primitive the two most
familiar and basic arithmetical functions, namely addition and multiplication, as given by their recursion equations, we then do arrive at a situation where all the further primitive recursive (and indeed general recursive) functions are expressible. This fact, which is a kind of first-order correlate to Dedekind's recursion theorem, was established by Gödel [6], Theorem VII, via arithmetical coding of finite sequences. In this way we arrive at a prima facie justification of interest in the formal system of Peano Arithmetic, for which the first-order language has as its primitives 0, s, +, · and has as axioms the first two Dedekind axioms, plus the recursion equations for + and ·, i.e.

$x + 0 = x$, $x + s(y) = s(x + y)$
$x \cdot 0 = 0$, $x \cdot s(y) = x \cdot y + x$

and the scheme of induction with respect to properties of the natural numbers expressible in this language. In this way Peano Arithmetic may be seen as an intrinsic system, arising in a natural way when investigating the general nature of natural numbers. Is there a sense in which it is canonical as a first-order system for arithmetic?

*This expressive weakness follows as a consequence of what is often cited as one of the main reasons why we should work within first-order logic, namely the completeness of formal systems of first-order logic with respect to logical validity. The familiar point is that compactness follows from completeness, and from compactness we have it that the first-order truths of arithmetic will hold in structures not isomorphic to the natural numbers, showing thereby that we cannot express fully in a first-order language the understanding we have of the structure of the natural numbers.

Other second-order analyses could have been used.
For example, the
natural numbers can be characterized by a second-order formulation of the least number principle, for which the corresponding first-order scheme is equivalent to the first-order scheme of induction.
Could
Peano Arithmetic be canonical in the sense that it is equivalent to any first-order system which arises by forming a scheme of first-order substitution instances from a $\Pi^1_1$-sentence which serves as part of a categorical characterization of the natural numbers? It seems that this condition is too strong.
Consider for example a
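second-order sentence expressing transfinite induction with respect to an arithmetically expressed primitive recursive well-ordering $<$ on the natural numbers of order-type $\alpha \geq \varepsilon_0$:

(*) $\forall X\,(\forall x\,(\forall y\,(y < x \to y \in X) \to x \in X) \to \forall x\,(x \in X))$
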
Insofar as we are able to perceive < as defining a well-ordering of that order-type, the second order principle expresses the condition that we have exactly the natural numbers ordered in that way, and so a categorical characterization of lN, albeit now with a highly complex structure on it.
Evidently the corresponding system generated by turning
(*) into a first-order scheme must be stronger than PA, in light of the
unprovability in PA of transfinite induction of order-type $\varepsilon_0$. In the present context, this case is not worrying, since principle (*) is stronger than such apprehension as we have just of the natural numbers themselves, requiring as it does an irreducible notion of well-ordering. I understand that Alex Wilkie has obtained a result which can be interpreted as showing essentially that PA is the weakest first-order system arising from any categorical $\Pi^1_1$-characterization of the natural numbers.
The question of the intrinsicness of Peano Arithmetic might
then be explored by assessing the conceptual content of the various categorical $\Pi^1_1$-characterizations of the natural numbers, looking to see
whether those which yield Peano Arithmetic are recognizable as conceptually equivalent to Dedekind's analysis of the notion of natural number, and whether those which yield stronger first-order systems require some conceptual element which goes beyond our grasp of the natural numbers.
Such a project lies beyond the scope of this paper.
Alex Wilkie's extremely interesting paper "On discretely ordered rings in which every definable ideal is principal" [15], in which he shows that a resulting first-order system is (essentially) equivalent to Peano Arithmetic, offers a sharply developed particular case apt for such consideration. The attempt to recognize a conceptually significant boundary between first- and higher-order notions for arithmetic may be called into question by the predicativist viewpoint (I am grateful to Solomon Feferman for drawing my attention to this possibility, in discussion after my talk at CSLI). "By predicativity is meant that part of (abstract) mathematical thought which is implicit in our conception of the natural numbers," ([4], p. 68). That conception is the same as Dedekind's starting point: "Our conception of the natural number sequence N is as generated from an initial number 0 by repeated application of a successor operation $a \mapsto a'$," ([4], p. 70). The project
disallows use of any higher-order operations not themselves justified by that initial arithmetical conception, by which restriction the inherent impredicativity of Dedekind's analysis is to be avoided.
However,
higher-order operations are not excluded by it as such, and in particular it is argued that functionals determined by primitive recursion on numerical arguments can be predicatively justified.
Nothing I have said
counts against the possibility or the intrinsic interest of carrying through such an analysis.
The question would be rather, does that
analysis count against the stability of arriving at a first-order axiomatic system? In considering that question, I am not claiming that PA could itself
constitute an adequate conceptual basis for our understanding of the concept of natural number.
Far from it, I consider that we can only
arrive at such a system on the basis of some higher-order understanding. The system PA arises as constituting the purely arithmetical content of our full understanding of the concept of natural number, where that understanding is implicitly and inherently higher order. If in pursuing the predicativist programme one were motivated by a firm conviction that impredicativity is illegitimate, then the
Dedekind analysis would not be allowed as an acceptable conceptual basis for the account of arithmetical truths I am attempting to develop.
But
I believe that one can still arrive at PA as the first-order system given directly from the analysis of the concept of natural number, even where that analysis is carried through predicatively. So the question must be not:
is PA conceptually strong enough to
analyze the concept of natural number? to which the answer must always be no, but rather:
can we motivate focusing attention on an axiomatic
system for arithmetic which is first-order?
It seems to me that there is
an intrinsic point of interest in working within a domain in which our only objects of quantification are the natural numbers themselves. In the following section I turn to considerations which offer a characterization of the domain of mathematics captured in that first-order axiomatization.

§3. Peano Arithmetic as the mathematics of finite structures.

Peano Arithmetic is equivalent with a natural theory of purely finite structures, namely $\mathrm{ZF}^-$, the axiom system of Zermelo-Fraenkel set theory without the axiom of infinity and with the negation of the axiom of infinity. As is well known, the standard model for $\mathrm{ZF}^-$ is HF, the
collection of hereditarily finite (pure) sets.
The significance of this
equivalence for the present discussion is two-fold.
First, it is natural
to consider arithmetic as essentially the mathematics of that paradigm finite process, counting finite collections.
To see that Peano Arithmetic is equivalent to a natural axiomatization of finite structures encourages the belief that a right understanding of this domain has been reached.
The second point concerns the account to be given of the
incompleteness phenomenon in relation to the intrinsic position of PA. The suggestion is that those mathematical truths expressible in the language of arithmetic but not provable in PA contain 'hidden higher-order concepts', where what is hidden is revealed by the recognition of the phenomenon of coding.
What I mean here by higher-order includes the
standard usage for quantification over sets of individuals in distinction to first-order quantification over the domain of those individuals.
But
I also mean to include in this phrase something of the notion infinitary, in the sense of presupposing an infinite totality, and the applicability of such a notion in this context is suggested by the results being looked at in this section. The possibility of formulating Peano Arithmetic in a theory of the finite (for which I shall sometimes use the adjective 'finitary', hoping it will be clear and acceptable that I am thus using the word with a different sense from Hilbert), begins in a reworking of the Dedekind analysis of natural number, looked at in §2.
This approach goes back to
Zermelo, in his 1909 paper "Sur les ensembles finis et le principe de l'induction complète" [16]. The basic idea of Dedekind's analysis, as we noted, is to stipulate that an object of our theory is a natural number just in case it belongs to every inductive set. Such sets are necessarily infinite.
However, we can also correctly stipulate that $x$ is a natural number by the condition that $x$ belongs to the smallest set which contains 0 and is closed under successor except for that element, i.e. a set of the form $\{0, s(0), \ldots, s^{n}(0)\}$. Thus:
$$\forall X\,\bigl(0 \in X \;\&\; \forall y\,(y \in X \;\&\; y \neq x \rightarrow s(y) \in X) \rightarrow x \in X\bigr).$$
This formulation is due to Michael Dummett, as reported by Hao Wang in his paper "Eighty years of foundational studies" [14].
The existential condition required in order for this definition to define anything, corresponding to the existence requirement of an inductive set in the case of the Dedekind analysis, is evidently:
$$\forall x\,\exists X\,\bigl(0 \in X \;\&\; \forall y\,(y \in X \;\&\; y \neq x \rightarrow s(y) \in X)\bigr).$$
It is also clear that on this existential basis the definition works when the second-order quantifiers range only over all finite subsets of the domain (weak second-order logic). I want to digress briefly to consider this definition in relation to the charge of impredicativity.
That this definition picks out all the
natural numbers, given the preceding existential condition, depends on a feature of the range of the second-order quantifier comparable to the impredicative requirement in the full second-order analysis that the
second-order domain contain a set consisting exactly of 0 and the elements obtained from it by finite iteration of the successor function. This definition will pick out just the natural numbers in case for each natural number n the second-order domain contains a set consisting precisely of 0 and what is obtained from it by up to n-fold iteration of the successor function, and the domain of second-order quantification contains only finite sets.
In the strictest sense this condition falls
short of impredicativity, in that the very set being defined is not required to lie within the range of the quantifiers of the definition. However, as we see, an exact representation of the natural number sequence must occur as elements of the domain, and I am inclined, therefore, to consider that the weak second-order definition does not fare significantly better on the score of avoiding impredicativity than the one based on full second-order logic. This analysis provides a categorical characterization of the natural numbers, on the basis of which the full second-order principle of induction is derivable.
As before, this success shows that the system
obtained cannot be given by a recursive set of axioms.
Peano Arithmetic
as an axiom system is obtained from this analysis along lines similar to those described in the previous section. Having arrived at PA by this route, there is then a very natural construction within ZF-, interpreting 0 as the empty set and the successor function in the von Neumann way as $a \mapsto a \cup \{a\}$, by which the axioms of PA are all provable. The construction is essentially similar to the usual interpretation of number theory in ZF, but renders explicit the finiteness of the set theory required to yield Peano Arithmetic. Quite strikingly, there is a converse to this result, that is to say, not only is PA interpretable in ZF-, but ZF- is fully interpretable in PA.
The interpretability of ZF- in PA can be established on the basis
of coding each hereditarily finite set by a natural number, as first shown by Wilhelm Ackermann [1], by the following definition of an $\in$-relation among the natural numbers: $n \in m \equiv_{\mathrm{df}}$ for some finite $a \subset \mathbb{N}$, $m = \sum_{k \in a} 2^{k}$ and $n \in a$. For example $\{0, 4, 7\}$ is coded by $2^{7} + 2^{4} + 2^{0} = 145$. A delightful feature of this coding is that if the number is written in binary notation, the sequence of 0's and 1's is then simply the characteristic function for the finite set coded by that number, beginning at the right with 0.
Thus 145 = 10010001.
0 must code the empty set, so $1 = 2^{0}$, which codes $\{0\}$, can be thought of as coding $\{\emptyset\}$.
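As an illustrative aside, the Ackermann coding just described can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the function names (`members`, `decode`, `encode`, `show`, `von_neumann_succ`) are hypothetical and chosen here for illustration, and hereditarily finite sets are represented as nested `frozenset` values. The last function reflects the observation that, since no set is a member of itself, the von Neumann successor $a \cup \{a\}$ of the set coded by n is coded by n with its n-th binary digit switched on.

```python
def members(n):
    """Codes of the elements of the set coded by n: positions of the 1-bits of n."""
    return [k for k in range(n.bit_length()) if (n >> k) & 1]


def decode(n):
    """The hereditarily finite set coded by n, as nested frozensets."""
    return frozenset(decode(k) for k in members(n))


def encode(s):
    """Inverse of decode: the sum of 2**code(x) over the elements x of s."""
    return sum(2 ** encode(x) for x in s)


def show(n):
    """Brace notation for the set coded by n, writing {} for the empty set."""
    return "{" + ", ".join(show(k) for k in members(n)) + "}"


def von_neumann_succ(n):
    """Code of a U {a}, where n codes a (illustrative helper, not from the text)."""
    return n | (1 << n)


print(members(145), bin(145))  # [0, 4, 7] 0b10010001
print(show(145))               # the pure set coded by 145
print(encode(decode(145)))     # 145, so coding and decoding round-trip
```

Iterating `von_neumann_succ` from 0 gives the codes 0, 1, 3, 11 of the first four von Neumann numerals, which may help in seeing how the interpretation of PA in finite set theory and the coding of finite sets in PA fit together.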
And so we have a bi-unique coding of hereditarily finite (pure) sets by natural numbers. For example, 145 codes $\{\emptyset, \{\{\{\emptyset\}\}\}, \{\emptyset, \{\emptyset\}, \{\{\emptyset\}\}\}\}$. On this
coding each of the axioms of ZF- is translated into a true statement of arithmetic provable in PA. Thus Peano Arithmetic can be established on the basis of finite set theory, and is itself, in a very natural way, a theory of the hereditarily finite sets.
§4. Incompleteness: the first example, via diagonalization, and the phenomenon of coding.
The incompleteness of all formal axiomatizations for arithmetical truth seems to have come as a surprise to mathematicians generally, including those mathematicians and logicians concerned particularly with the relationship between mathematics and formal systems, most notably David Hilbert.
Gödel presented his proof of incompleteness specifically
for the system of Principia Mathematica, and he notes its applicability to Zermelo-Fraenkel set theory.
But it was also made clear in Gödel's
account that his construction was applicable to any system strong enough to deal with the basic form of recursion. It is in a way slightly surprising, after the fact, that this result took Hilbert and his school so much by surprise.
It was Hilbert, after
all, who made the basic move required for it of realizing that the manipulation of symbols of a formal system should be seen as being of the same character as the computational manipulations of arithmetic. From this conceptual basis Hilbert gave an ingenious mathematical formulation to a programme for establishing that the full, infinitary range of mathematics was consonant with that part of mathematics which offers certain and absolute constraint, namely finitary computations. This came down in effect to establishing $\Pi^0_1$-reflection for formal systems of arithmetic, also equivalent to the deductive consistency of the formal system.
The programme of establishing these results was envisaged as
proceeding within informal, intuitive, finitary mathematics. Gödel saw that a further step was possible.
If the syntactic
manipulations of a formal system could be viewed as part of finitary mathematics, as akin to the elementary calculations of arithmetic, then one might be able actually to map the investigation of these syntactic manipulations into the arithmetic itself, and so establish it within formal systems of arithmetic.
Gödel's primitive recursive arithmetization of syntax showed that indeed this could be done.
On this basis one
could then, as Gödel did, diagonalize on the provability predicate, proceeding analogously to the liar paradox by way of the heuristic formulation "This sentence is unprovable", to obtain a true arithmetical statement unprovable in the given formal system. Let us consider this result as applied to the formal system PA. Indisputably, it shows that PA is incomplete for mathematical truth expressible in the language of arithmetic.
Does this situation count
decisively against any claim for the intrinsic character of PA?
One may
try to say that it does not by observing that the Gödel sentence is from the point of view of usual mathematics rather peculiar. It arises not by working from arithmetical properties of the natural numbers, but by reflecting about an axiomatic system in which those properties are formalized. In a certain way, even, it might be said not to be arithmetical. It is not saying something about the natural numbers, rather it is 'about' the statement itself.
In that way the Gödel
incompleteness phenomenon is not an arithmetical incompleteness.
That
is my viewpoint in this paper. The contrasting viewpoint would hold that expressibility in the language of arithmetic renders the statement genuinely arithmetical. That the Gödel sentence says of itself 'This very sentence is underivable in PA' is, on this view, merely heuristic, and it is highly misleading to put it in such terms, as is sometimes done when the result is being expounded.
It 'says' nothing about itself.
What it asserts is that a
certain universal relation holds of all natural numbers (given that the Gödel sentence is of $\Pi^0_1$-form).
Now it is crucially true, and obvious,
that the Gödel sentence for PA in the formal language of arithmetic and the English sentence 'This very sentence is underivable in PA' are of fundamentally different character (in particular, of course, the question of whether or not that sentence in English is derivable in PA does not arise).
Nonetheless, it seems to me a significant fact, which cannot be
brushed aside as merely heuristic, that the Gödel sentence does say something like what we also express with this sentence of English. The only way to see the arithmetical truth of the Gödel sentence is in terms
of its connection with the situation we describe by that English sentence. And the situation described by that sentence goes essentially beyond arithmetic itself. The key technique of Gödel's proof is the use of coding, the coding of syntactic relations and properties by properties and relations of natural
numbers.
At least in the case of Gödel sentences (I will consider some
other undecidable sentences in the next section), the understanding of these sentences rests crucially on understanding this coding and our grasp of the situation being coded.
The phenomenon of coding reveals
fixed links between two situations or facts, one in the structure of arithmetic, the other in the realm of syntax of a formal system.
These
facts, and the link between them, are revealed by the description of the coding, but their existence is not dependent on being described. We might consider whether, in view of its truth and independence from PA, we should adopt the Gödel sentence for PA, call it G, as a new axiom of arithmetic.
Such a move would be unnatural.
An axiom in this context
should be an evident truth, in the terms in which it is expressed.
But
the truth of this statement, as a statement of arithmetic, is not directly perceivable.
PA+G would not constitute, in this way, a purely
arithmetical extension of PA.
The Gödel sentence thus offers an instance
of the general thesis of this paper that any axiomatic extension of Peano Arithmetic must be motivated by considerations for establishing its truth which rely essentially on non-arithmetical notions. Hilbert noted and made essential use, both technical and philosophical, of the similarity between formal manipulation of symbols and elementary computation on the natural numbers.
He considered that this similarity
offered a uniform account of the nature of these formal manipulations. Gödel's discovery of the phenomenon of coding shows that the account of formal syntax in these terms cannot be uniform, and that some truths of arithmetic must be seen in terms of their link to syntactic properties, rather than the other way round, as Hilbert had envisaged would always be possible.
In these terms, it may be said that Gödel's discovery of
incompleteness for arithmetical formal systems reveals not so much their deductive weakness, as rather the structural expressiveness of arithmetic. The arithmetic of the natural numbers can mimic quite other situations. If the truths in the language of arithmetic which express these mimic-situations are to be seen as true, that will depend not on the principles which generate our understanding of the natural numbers, but on those which apply to the situation which is mimicked, and which reveal the coded connection between them.
§5. Consideration of some further examples.
Attention has focused in recent years on some extremely interesting examples of arithmetical truths unprovable in Peano Arithmetic which are thought of as much more genuinely and purely mathematical than the Gödel sentences. They include the study by Kirby and Paris [10] of Goodstein's theorem [8], the Paris-Harrington variant of the finite Ramsey theorem [11], and Friedman's finite version of Kruskal's theorem [13].
The
general thesis of this paper is tested and supported by these particular cases.
(a) Goodstein's theorem offers at first sight a particularly compelling example of genuinely arithmetical incompleteness in PA.
It is extremely
easy to grasp, in purely arithmetical terms, what the theorem says, what assertion about the natural numbers is being made.
One can understand
the situation it describes with just the sophistication which has become standard at the level of elementary school with the 'new maths' fascination with change of base for integer notation.
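To make the change-of-base idea concrete, here is a minimal Python sketch of Goodstein sequences. The text does not spell out the construction, so the sketch assumes the standard definition: write the current term in hereditary base b notation (exponents are themselves written in base b, and so on down), replace every occurrence of b by b + 1, subtract 1, and repeat with the next base, starting from base 2. The names `bump` and `goodstein` and the cut-off parameter `max_steps` are hypothetical, introduced only for this illustration.

```python
def bump(n, b):
    """Write n in hereditary base b, then replace every b by b + 1."""
    result, exponent = 0, 0
    while n > 0:
        n, digit = divmod(n, b)
        if digit:
            # Digits (all below b) are unchanged; exponents are rewritten recursively.
            result += digit * (b + 1) ** bump(exponent, b)
        exponent += 1
    return result


def goodstein(m, max_steps):
    """Initial part of the Goodstein sequence of m: at most max_steps steps."""
    seq, base = [m], 2
    while m > 0 and len(seq) <= max_steps:
        m = bump(m, base) - 1
        base += 1
        seq.append(m)
    return seq


print(goodstein(3, 10))  # [3, 3, 3, 2, 1, 0]
print(goodstein(4, 4))   # [4, 26, 41, 60, 83]
```

The sequence starting from 3 reaches 0 after five steps, whereas the one starting from 4 grows for a very long time before it eventually descends. Any such finite computation only checks particular instances; the theorem itself asserts termination for every starting value.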
It is further
possible to give a purely mathematical demonstration of its truth which is utterly simple and perspicuous, and for which the required mathematics is appropriate to a middle-level undergraduate university course, namely one on set theory which develops the notion of ordinal as far as the Cantor normal form theorem.
It is in these terms evident that the result
follows from ordinal induction of order-type $\varepsilon_0$. The converse also holds.
Indeed, Goodstein's interest in studying
these sequences of natural numbers was as a way of giving arithmetical expressions to ordinal inductions of order types less than $\varepsilon_0$.
Hence by
the adequacy of $\varepsilon_0$-ordinal induction for proving the consistency of PA, combined with Gödel's second incompleteness theorem, Goodstein's theorem must be unprovable in PA.
Its unprovability can also be analyzed more
directly, as was done by Kirby and Paris using the model-theoretic technique of indicators, and results of Ketonen and Solovay [9].
These
two ways of seeing the independence of Goodstein's theorem reveal different features of it relevant to the general thesis of this paper. The fact that it codes $\varepsilon_0$-induction tells us that there is no way to perceive the truth of Goodstein's theorem which does not also establish the correctness of $\varepsilon_0$-induction.
The question raised by Goodstein's
theorem in this context comes then to the following:
is it possible that
we should manage to establish this result using only purely arithmetical notions, that is without the use of any 'higher-order' notions, such as
'arbitrary subset', or 'well-ordering', or 'sound axiomatization of arithmetical truth'? Could we have some basis just within our understanding of arithmetic on the natural numbers for taking Goodstein's theorem as an axiom of true arithmetic?
What is at least clear is that
the way in which we do know that Goodstein's theorem is true is not such a basis.
Thus Goodstein's theorem as it stands does not refute the thesis of this paper. I do not see how to establish the stronger claim, that there is no such way.
But still, I draw support for the viewpoint I am
urging in this paper because at first Goodstein's theorem seems to be a particularly likely counterexample to it, which then turns out, on closer consideration of what we do know about it, not to be one. The second heuristic point for the general thesis under consideration which emerges from this example relates to the theme developed in §3, on Peano Arithmetic as the mathematics of finite structures. Goodstein's theorem is of the form $\forall x\,\exists y\,\varphi(x,y)$. The Kirby-Paris method of demonstrating its independence by the use of non-standard models shows the existence of a model of Peano Arithmetic in which $\forall x\,\exists y\,\varphi(x,y)$ is true, an initial segment of which is also a model of PA, and in which there are elements $a$ such that any witness to the true existential $\exists y\,\varphi(a,y)$ lies beyond the initial segment (the element $a$ must, of course, be non-standard, since for each $n$, $\mathrm{PA} \vdash \exists y\,\varphi(n,y)$).
The elements of an end
extension of a given model of PA are infinite with respect to the initial segment model, in the sense that for that model they lie beyond all the natural numbers.
One can then think of this demonstration of independence
as modeling the fact that the process of generating Goodstein sequences goes essentially beyond finite arithmetic. Any particular calculation of a Goodstein sequence is finite. But the process as a whole is infinitary.
In the initial segment which is a model of PA, everything is true that is given by the theory of purely finite sets.
Goodstein's theorem can be
seen to fail in that situation, but then to hold in an infinitary extension.
(b) The Paris-Harrington sentence.
The finite version of Ramsey's
theorem is provable in Peano Arithmetic, suitably coded so as to be expressed in the language of arithmetic (or it can be expressed more directly and proved in ZF-).
Paris and Harrington found that a (seemingly) slight
variation in the theorem made it, while still true, unprovable in PA (the variation consisting in the requirement that the homogeneous set given by the original Ramsey theorem should satisfy the further condition that it be "relatively large", in the technically stipulated sense that the cardinality of the set should be at least as great as its least element).
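The two conditions on the set in the Paris-Harrington statement can be illustrated by a small Python sketch. It assumes the standard Paris-Harrington convention that a finite non-empty set H of natural numbers is relatively large when card(H) is at least min(H); the function names and the example colouring by parity are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def is_relatively_large(h):
    """Standard convention (assumed here): card(H) >= min(H), for non-empty H."""
    return len(h) > 0 and len(h) >= min(h)


def is_homogeneous(h, colouring, n=2):
    """True if every n-element subset of h receives the same colour."""
    colours = {colouring(t) for t in combinations(sorted(h), n)}
    return len(colours) <= 1


def parity_of_sum(pair):
    """Example colouring of pairs by the parity of their sum (illustrative only)."""
    return sum(pair) % 2


for h in ({2, 4, 6}, {4, 6, 8}, {1, 2, 3}):
    print(sorted(h), is_homogeneous(h, parity_of_sum), is_relatively_large(h))
```

Under this colouring {2, 4, 6} is homogeneous and relatively large, {4, 6, 8} is homogeneous but not relatively large (three elements, least element 4), and {1, 2, 3} is relatively large but not homogeneous. The extra requirement ties the size of the set to its own elements, which is what makes the variation only seemingly slight.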
This result was hailed as what mathematicians had been looking
for since the Gödel incompleteness theorems, namely "a strictly mathematical example of an incompleteness in first-order Peano Arithmetic, one which is mathematically simple and interesting and does not require the numerical coding of notions from logic" (Jon Barwise, Editor's Note to Paris and Harrington [11], p. 1133). I do not dissent from the positive enthusiasm of this assessment.
A
question which it suggests for the present perspective is whether the adjective "mathematical" could be replaced by "arithmetical"?
Is this a
strictly arithmetical example of an incompleteness in first-order Peano Arithmetic?
I am using the word arithmetical here as meaning both
mathematical (in the sense that Barwise uses the term) and being about the arithmetic of the natural numbers.
To mean by arithmetical "expressible in the language of arithmetic", as one might, would beg, or really obliterate the question which I am trying to articulate here. Perhaps the issue of this paper could be seen as the question whether there is such a sense to 'arithmetical' different from expressibility in the language of arithmetic. The criterion I have in mind is that a truth expressed in the language of arithmetic is arithmetical just in case its truth is directly perceivable so expressed, or on the basis of other truths in the language of arithmetic which are themselves arithmetical.
The analysis of the
number concept as discussed in §§ 2, 3, 4, seems to me to render the axioms of Peano Arithmetic arithmetical, in the sense that their truth is directly perceivable so expressed, and on this basis the second clause renders the theorems of PA arithmetical (but not quite unproblematically; I shall say something in §6 about the possible non-arithmetical nature of some theorems of PA).
It seems to me reasonably evident that the
examples we are considering are not arithmetical in the first sense.
The
difficulty comes in having grounds for considering that the second condition does not apply, of being confident that no as yet unknown proof could be given in terms of recognizably arithmetical truths. What can we say specifically about the Paris-Harrington statement? Its proof is an easy consequence of the Infinite Ramsey Theorem:
if the
Paris-Harrington statement is false, the set of counterexamples to it can be given the structure of a finitely branching infinite tree, and the infinite branch which is shown to exist by the argument for the König Infinity Lemma produces a counterexample to the Infinite Ramsey Theorem. This argument, as it stands, quite evidently goes beyond finite arithmetic.
What about the possibility that some other proof might
exist which constitutes a purely arithmetical basis for perceiving this truth?
The following theorem (at 3.1 in [11]) renders this prospect unlikely: the Paris-Harrington statement is provably equivalent in PA to $\Sigma^0_1$-reflection for Peano Arithmetic.
I construe this result as
revealing something of the implicit (hidden) higher-order content of the arithmetically expressed Paris-Harrington statement.
$\Sigma^0_1$-reflection is an
expression of the soundness of PA as an axiomatization of arithmetical truth, and in these terms looks beyond the natural numbers themselves to our capacity to consider the correctness of our analysis of their fundamental properties.
It is in fact a strong enough form of reflection
not only to tell us that we have arrived at consistent notions, but also to express that in fact these are the intended notions, via the 1-consistency of PA (a special case of Gödel's notion of $\omega$-consistency).* In this way the Paris-Harrington statement expresses a very strong reflective property about the whole formalization of arithmetic, and seems to me thereby to be revealed as implicitly higher-order, and so in the way I am trying to make clear, non-arithmetical.
(c) Friedman's finitization of Kruskal's Theorem.
This fascinating
result constitutes a much more extreme case of the situation represented by the previous two examples.
In this way it illustrates particularly
compellingly that a truth expressible in the language of arithmetic may be such that it is not a truth of arithmetic.
It provides support for
the thesis of this paper by showing, in a way that seems conclusive, that there are such cases where undecidability of a statement by PA does not constitute incompleteness of PA for arithmetical truth. (By its very strength in this way, it does not test so well the idea that PA is complete for arithmetical truth, which is done more by arithmetically expressible truths not so far over the border from what is provable by PA, for which it might seem feasible that an arithmetical extension could reach that point.)
*One might ask, how can 1-consistency guarantee that the theory is of the standard model? After all, when formalized and expressed mathematically, the applicability of the Gödel phenomenon means that 1-consistency itself is subject to non-standard interpretation. The point is that it has this expressive power on our understanding of it as intuitive, non-formal mathematics.
A natural proof of Friedman's finitization of Kruskal's Theorem is by the original proof of Kruskal's Theorem plus appeal to the compactness theorem to show that if there were a counterexample to the finite version, then the full theorem also would not hold.
As it stands, Kruskal's proof
of his theorem goes entirely beyond arithmetic.
That it does so
essentially, and how far beyond it goes, is made apparent in the analysis of its associated ordinal strength. Smorynski, in his exposition [13] of Friedman's work here, sets out how Kruskal's theorem readily yields provable well-ordering of type $\Gamma_0$, and remarks that Friedman has succeeded in showing that his finitized version does also. Hence, finitized Kruskal's Theorem is sufficient to establish the consistency of predicative analysis, thereby indicating its exceedingly powerful hidden higher-order content.
§6. Higher-order concepts within Peano Arithmetic.
I have been attempting in this paper to assess the non-arithmetical
character of statements in the language of arithmetic which are true but unprovable in PA in terms of their coding of mathematical situations whose description requires use of higher-order concepts.
A serious
challenge to the interest, or indeed cogency of this viewpoint arises from the observation that many statements in the language of arithmetic which are provable in PA code assertions of the same character as those I
have been terming higher-order.
Examples include transfinite induction
of order-type $\alpha$ for $\alpha < \varepsilon_0$, consistency of sub-systems of PA, such as PR (primitive recursive arithmetic) or $\mathrm{PA}_n$ (where the scheme of induction is restricted to formulas of logical complexity bounded by $n$), and so on. How can it be that the coded presence of such notions renders truths expressed in the first-order language of arithmetic non-arithmetical when they are unprovable in PA, but does not have this effect when the statement in question is provable in PA?
But if it did have that effect, then
the idea of provability in Peano Arithmetic marking a natural boundary to arithmetical truth would be called into question. The situation seems to me not so drastic as this.
I am concerned with
the way in which arithmetical truths can be established.
The point about
the examples of truths unprovable in PA considered in this paper is that they are, in each case, shown to be true by an argument in terms of truths concerning some higher-order notion, and in each case also a converse
holds, so that the only way in which the arithmetical statement can be established is by an argument which establishes the higher-order truth. The relationship of coding constitutes a rigid link between the arithmetical and the higher-order truths, which pulls the ostensibly arithmetical truth up into the higher-order.
In cases such as e.g. $\mathrm{TI}(\omega^{\omega})$, an arithmetical sentence coding transfinite induction of order-type $\omega^{\omega}$, there is similarly a rigid linkage between two kinds of truths.
But in these cases, the linkage pulls the ostensibly higher-order truth down into the arithmetical.
The statement in the language of arithmetic
has a derivation in PA, and following through that derivation gives the basis for perception of that truth as true in arithmetic purely on the basis of directly perceivable arithmetical truths. This answer may seem unrealistic, on the grounds that there can be cases where the higher-order perspective is essential for actual conviction as to truth of the arithmetically expressed sentence.
One may
know that a derivation in PA must exist, but if generated would be so long as to be unsurveyable.
This might be true, for example, in the case
of Con(PR), the arithmetical statement coding the consistency of primitive recursive arithmetic, or $\mathrm{TI}(\alpha)$ for $\alpha < \varepsilon_0$ but large in comparison with $\omega$.
There is a theorem by Gödel [7], on the lengths of proofs,
which points to this sort of situation as a systematic phenomenon:
for
each computable function $f$, there correspond infinitely many formulas $\varphi$ in the language of arithmetic such that each $\varphi$ is provable both in PA and $\mathrm{PA}^{2}$ (second-order arithmetic) but where if $j$ is the length (measured in number of formulas of which it consists) of the shortest proof of $\varphi$ in PA and $k$ is the length of the shortest proof of $\varphi$ in $\mathrm{PA}^{2}$, then $j > f(k)$. The higher-order perspective can be essential, then, for shortening an otherwise unsurveyable proof. This point seems to me a serious one, and in some ways I am inclined to accept rather than to resist the force of it.
It comes down to the
issue of the extent to which it seems relevant and legitimate to appeal to the notion of an operation being performable 'in principle', a notoriously difficult matter, the answer to which depends critically on the context and purpose of the thing to be done.
If one is prepared to
countenance a notion of being 'in principle' derivable in PA, then the present problem disappears. One might consider that this move is legitimate, as enabling one to define precisely a theoretical boundary, to which mathematical practice approximates.
However, I have in this discussion been considering provability in terms of providing a basis for perceiving the truth of a given statement.
In these terms, a proof
in PA of a given proposition being infeasibly long has to be taken seriously.
If one does so, then within the arithmetically expressible
truths of mathematics, we must think of the boundary between those which are purely arithmetical and those which are essentially higher-order as running somewhat inside the collection of those for which derivations in PA exist.
But in thus narrowing the boundary of the arithmetically
provable, to a proper subset of those sentences for which there exist derivations in PA, the considerations which favour doing so also dictate a correspondingly narrower domain of the arithmetically true.
With these
two notions shrinking thus together, the general thesis, that Peano Arithmetic is complete with respect to purely arithmetical truth, stands.
§7. Conclusion.
The point of this paper is essentially conceptual.
It is concerned
with what our attitude should be toward the phenomenon of incompleteness of formal systems for arithmetical truth.
The term 'incompleteness'
suggests that the formal system in question fails to offer a deduction which it ought to.
The contrasting attitude, which I have been
attempting to explore in this paper, is to see these arithmetically expressible independent truths as exemplifying the expressive richness of finite structures, perceivable through the phenomenon of coding. Coding is not something which we do, but rather recognize as existing, thereby discovering rigid links between truths of ostensibly different character.
These links belong to the mathematics of the structures
being studied. The formal system on which I have focused attention is Peano Arithmetic.
I have been attempting to explore its conceptual stability
in light of the apparently destabilizing effect, through extensibility, of the phenomenon of incompleteness.
I have offered heuristic and
conceptual support for the viewpoint that PA is the strongest natural first-order system for arithmetic, and that there is a sense in which it is complete with respect to purely arithmetical truth.
That sense can
be expressed by the claim that any true extension of it must be based on considerations in terms of higher-order concepts, and so in that sense not purely arithmetical.
Known cases of true sentences in the
language of arithmetic unprovable by PA are such that we could not take
them as axioms in an extension of PA, given only their arithmetical formulation.
Rather, we must look to some more comprehensive mathematical
theory, and to the link between the arithmetical statement and a corresponding statement in the more comprehensive theory, for our mathematical confidence in its truth.
It is in this sense that truths
in the language of arithmetic which lie beyond what is provable in Peano Arithmetic must be perceived in terms of hidden higher-order concepts.
ACKNOWLEDGEMENTS
I am extremely grateful for opportunities I have had to present material which is now developed in this paper, and to members of my audiences on these occasions for helpful responses. I was stimulated to pursue this theme by a lecture in Oxford by Laurie Kirby in January 1982, and first expounded some of these ideas, to an Oxford philosophy discussion group, the following month. Subsequent presentations have been to my philosophy of mathematics seminar (May 1982, and also June 1985), to the philosophy research seminar at St. Andrews (March 1983), the Somerville Philosophical Society (June 1983), the European summer meeting of the Association for Symbolic Logic, at Orsay in Paris (July 1985), and the Center for the Study of Language and Information at Stanford (September 1985). I am especially grateful for the invitation to speak at the A.S.L. summer meeting, as occasioning writing up the ideas here for these Proceedings. I cannot name all those whose questions and comments and encouragement have benefited me, but I want especially to mention Jon Barwise, Oswaldo Chateaubriand, Burton Dreben, Solomon Feferman, Robin Gandy, Alexander George, Deirdre Haskell, Jocelyn Hawkins, Ruth Isaacson, Angus Macintyre, Dag Prawitz and Philip Scowcroft. I owe an especially great debt in this paper to Alex Wilkie. His lectures in Oxford on models of arithmetic made work in this area accessible, and many of the thoughts I present here have developed from remarks of his (in particular, the emphasis on the link between Peano Arithmetic and finite mathematics). His written comments on a draft of this paper have been illuminating and generous.
D. Isaacson
Logic Colloquium '85 Edited by The Paris Logic Group © Elsevier Science Publishers B. V. (North-Holland), 1987
SOME PROOF-THEORETIC CONTRIBUTIONS TO THEORIES OF SETS
Gerhard Jaeger
ETH-Zürich, Mathematik
INTRODUCTION.
Questions about sets and set existence constitute an important part of the research in mathematical logic and the foundations of mathematics. This has been the case since Cantor's discovery of the theory of infinite sets and the subsequent crisis of the paradoxes. Today Zermelo-Fraenkel set theory with or without the axiom of choice is the generally accepted formalism and can be considered as the 'official' framework for doing mathematics. Besides ZF, however, there exists a broad variety of theories which all attempt to formalize set-theoretic universes. Some of these are motivated by philosophical considerations and are based on constructive/intuitionistic principles; other groups are guided by the fact that ZF heavily violates the principle of parsimony and study weak non-constructive subsystems of ZF and their relevance for mathematics.
Proof theory is concerned with the investigation of the proof possibilities of mathematical systems. One aspect of this research is the ordinal analysis of formal theories. The proof-theoretic ordinal $|Th|$ of a theory Th is often defined as the least ordinal $\alpha$ such that the consistency of Th can be proved in primitive recursive arithmetic PRA plus the scheme of transfinite induction along a standard primitive recursive well-ordering of order-type $\alpha$. Experience has shown that the ordinal $|Th|$ is a good measure for the proof-theoretic strength of Th, so that we call $|Th|$ the proof-theoretic strength of Th.
Historically, logicians first carried through the proof-theoretic analysis of subsystems of second order arithmetic $A_2$. This makes sense since an old observation, probably established by Hilbert, Bernays and Weyl, says that most of ordinary mathematics can be carried through in subsystems of $A_2$. This, however, can be accomplished only by using a heavy coding machinery.† Later the interests shifted to theories for iterated inductive definitions and their relations to subsystems of $A_2$. Feferman's reductive proof theory is concerned with the reduction of classical nonconstructive theories to systems which can be justified constructively. In Friedman-Simpson's reverse mathematics one seeks natural set existence axioms which are equivalent (over a very weak ground theory) to theorems of ordinary mathematics. Girard's work on $\Pi^1_2$-logic provides a functorial approach to proof theory based on the notion of dilator. We do not quote special results here but refer to the textbooks of Schütte [34] and Takeuti [39], the recent Lecture Notes of Buchholz-Feferman-Pohlers-Sieg [3] and to Feferman [6,7], Simpson [38], Girard [14,15].
The first significant proof-theoretic results obtained for set theory are to be found in Feferman [5], where he introduces predicative subsystems of ZF. Friedman [13] is concerned with intuitionistic set theories and their importance for constructive mathematics. In both cases the proof-theoretic analysis is carried out by reducing the set theories to subsystems of $A_2$. With admissible proof theory we follow the opposite track: Suitable subsystems of set theory become the focus of interest and are treated proof-theoretically in a very direct and uniform way. In this paper we are going to discuss several aspects of admissible proof theory; the full program will be presented in Jaeger-Pohlers [28].
ADMISSIBLE PROOF THEORY. The central notions of admissible proof theory are the notions of admissible set and its formal counterpart, the Kripke-Platek set theory over the natural numbers as urelements: KPu. Admissible sets have been studied in mathematical logic during the last 20 years; especially significant are the contributions in set theory, model theory and generalized recursion theory (cf. e.g. Barwise [1]). The use of admissible sets in proof theory is more recent. Admissible proof theory originates from Pohlers [30] and Jaeger [20] where the theory $ID_1$ of one inductive definition and the Kripke-Platek set theory KPu were analyzed proof-theoretically. At this early stage one could already exploit the interplay between definability theory and proof theory in order to gain some deeper insight.
†) The development of mathematics within ZF also relies on some coding arguments, but they are used there to a much smaller degree.
Later the essential ideas were isolated, extended and applied to stronger theories (cf. e.g. Jaeger [18,21], Jaeger-Pohlers [27], Pohlers [30,31]); finally the program of admissible proof theory was developed. Today this approach can be used to obtain a uniform proof-theoretic treatment of all (natural) known theories in strength between PA and $(\Delta^1_2\text{-CA})+(BI)$. A lot of work done on admissibles in definability theory is relevant for proof theory as well, either directly or as a source for proof-theoretic analogues. Examples of this will be given later. There is the question of the proper formal framework for a uniform proof-theoretic approach:
1. As formal language we propose the language of set theory, especially for two reasons: By working in set theory we can follow the usual mathematical practice as much as possible; proofs in subsystems of set theory are generally more perspicuous than proofs in subsystems of analysis since coding can be avoided. Moreover, already in fairly weak set theories it is possible to treat problems which in $A_2$ cannot even be formulated. Typical examples are propositions of the form
$\forall x(x \text{ uncountable group} \to A(x))$
where sets of higher cardinalities are involved. It is interesting to determine the proof-theoretic strength which is required for the proof of A(x), provided that x is an uncountable group, even if the existence of such a group is not provable in the theory. Example: KPu+(V=L) solves the Whitehead problem.
2. Once the decision for a set-theoretic language is made, one has to think about the appropriate axioms. The main reason for introducing theories for admissible sets is the observation that they present themselves as a very uniform and powerful proof-theoretic tool: Although systems for admissible sets are relatively weak subsystems of ZF, they are strong enough to develop a fair amount of definability theory. Together with the expressive power of the set-theoretic language, these results then make it easy to embed other systems. On the other hand, theories for admissible sets are very close to initial segments of the constructible hierarchy. The relevant closure properties of these initial segments can be simulated in a system RS of ramified set theory which has been completely analyzed in Jaeger [19] and Jaeger-Pohlers [27].
The proof-theoretic analysis of an arbitrary system Th using theories for admissible sets can be sketched as follows:
(i) Find the least $\alpha$ such that $L_\alpha$ induces a model of Th.
(ii) Find a theory for admissible sets $Th_0$ which formalizes $L_\alpha$.
(iii) By restricting the induction principles in $Th_0$, find the weakest subtheory $Th_1$ of $Th_0$ such that $Th \vdash A \Rightarrow Th_1 \vdash A$ for all sentences A of the theory Th.
(iv) Embed $Th_1$ into ramified set theory; calculate $|Th_1|$.
(v) In all interesting cases we have $|Th| = |Th_1|$.
Taking this procedure for granted, please observe that the strength of a theory depends on the variation of two parameters:
1. the number of admissibles;
2. the amount of induction
needed in order to reduce a given theory Th to a theory for admissible sets.
1. Basic notions.
A bit of notation is unavoidable here so we take this opportunity to introduce the main definitions for the whole paper. Let $L_1$ be the usual first order language of Peano arithmetic PA with constants for all primitive recursive functions and relations. We wish to formalize set-theoretic universes which have the natural numbers as urelements and are admissible or limits of admissibles. Most of the theories we are going to discuss are formulated in the language $L_{Ad} = L_1(\in,S,Ad,N)$ in which $L_1$ is augmented by: a membership relation symbol $\in$; two unary relation symbols S and Ad in order to express, respectively, that an object is a set (in contrast to an urelement) or an admissible set; and a set constant N for the set of natural numbers. We define $\Delta_0$, $\Sigma_n$, $\Pi_n$, $\Sigma$ and $\Pi$ formulas as usual and write $\underline{a}$ for a finite string $a_1,\dots,a_n$ of terms. The notation $A[\underline{a}]$ is used to indicate that all free variables of A come from the list $\underline{a}$; $A(\underline{a})$ may contain other free variables besides $\underline{a}$. The formula $A^a$ is the result of replacing in A each unrestricted quantifier $\forall x(\dots)$ or $\exists x(\dots)$ by $(\forall x \in a)(\dots)$ or $(\exists x \in a)(\dots)$. Equality a=b is introduced as an abbreviation for the formula
$(a \in N \,\&\, b \in N \,\&\, a =_N b) \vee (S(a) \,\&\, S(b) \,\&\, (\forall x \in a)(x \in b) \,\&\, (\forall x \in b)(x \in a))$
where $=_N$ is the primitive recursive equality relation on the natural numbers.
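The defined equality can be illustrated on a toy universe. The following is only a sketch under assumptions that are not in the text: urelements are modelled as non-negative Python integers, sets as `frozenset` values, and the names `is_urelement`, `is_set` and `equal` are hypothetical.

```python
def is_urelement(a):
    # Assumption: natural numbers (elements of N) are modelled as non-negative ints.
    return isinstance(a, int) and not isinstance(a, bool) and a >= 0

def is_set(a):
    # Assumption: objects satisfying S are modelled as frozensets.
    return isinstance(a, frozenset)

def equal(a, b):
    """a = b as an abbreviation: number equality on N, mutual inclusion on sets."""
    if is_urelement(a) and is_urelement(b):
        return a == b  # stands in for the primitive recursive relation =_N
    if is_set(a) and is_set(b):
        return all(x in b for x in a) and all(x in a for x in b)
    return False  # an urelement is never equal to a set
```

In this model the number 0 and the empty set are different objects, which mirrors the role of the predicate S in the abbreviation.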
Kripke-Platek set theory KPu is formulated in $L_{Ad}$; its axioms fall into the following five groups.
I. Ontological axioms. These axioms can be considered as implicit definitions of the set constant N and the predicates S and Ad.
(O1) $S(a) \leftrightarrow a \notin N$;
(O2) $a \in b \to S(b)$;
(O3) $J(\underline{a}) \to \underline{a} \in N$ for every relation symbol J of $L_1$;
(O4) $Ad(d) \to N \in d \,\&\, d \text{ transitive}$;
(O5) $Ad(d) \to (Pair)^d \,\&\, (Transitive\ Hull)^d \,\&\, (\Delta_0\text{-Sep})^d \,\&\, (\Delta_0\text{-Col})^d$
where $(Pair)^d$, $(Transitive\ Hull)^d$, $(\Delta_0\text{-Sep})^d$ and $(\Delta_0\text{-Col})^d$ are the relativized versions of (Pair), (Transitive Hull), ($\Delta_0$-Sep) and ($\Delta_0$-Col).
II. Number-theoretic axioms. For every axiom $A[\underline{a}]$ of PA: $\underline{a} \in N \to A^N[\underline{a}]$.
III. Equality axioms.
(E1) $a = a$;
(E2) $a = b \to (A(a) \to A(b))$ for every atomic formula A of $L_{Ad}$.
IV. Set existence axioms.
(Pair) $\exists z(a \in z \,\&\, b \in z)$;
(Transitive Hull) $\exists z(a \subset z \,\&\, z \text{ transitive})$;
($\Delta_0$-Sep) $\exists z(z = \{x \in a : A(x)\})$;
($\Delta_0$-Col) $(\forall x \in a)(\exists y)A(x,y) \to \exists z(\forall x \in a)(\exists y \in z)A(x,y)$
where A is $\Delta_0$ in both schemes.
V. Induction axioms. These axioms provide complete induction on the natural numbers $(IND_N)$ and the usual $\in$-induction $(IND_\in)$; both for arbitrary $L_{Ad}$ formulas:
$(IND_N)$ $A(0) \,\&\, (\forall x \in N)(A(x) \to A(x+1)) \to (\forall x \in N)A(x)$;
$(IND_\in)$ $\forall x((\forall y \in x)A(y) \to A(x)) \to \forall x A(x)$.
KPu corresponds to Barwise's theory $KPU^+$ as defined in [1] with PA as theory for the urelements. The transitive standard models of KPu are called admissible sets (above N). An ordinal $\alpha$ is said to be admissible (above N), if $L_\alpha$ is an admissible set above N. Here $(L_\alpha : \alpha \in ON)$ denotes the constructible hierarchy over the natural numbers as urelements.
In order to speak about universes which are limits of admissible sets, we replace in KPu the scheme of $\Delta_0$ collection by the limit axiom
(Lim) $\forall x \exists y(x \in y \,\&\, Ad(y))$
and call this theory KPl. KPi finally is KPu+(Lim) and formalizes recursively inaccessible universes, i.e. universes which are admissible limits of admissibles. Suppose that Th is one of the theories KPu, KPl or KPi. Then $Th^r$ is taken to be the theory Th with $(IND_N)$ and $(IND_\in)$ replaced by the axioms $(I_N)$ and $(I_\in)$:
$(I_N)$ $0 \in a \,\&\, (\forall x \in a)(x+1 \in a) \to (\forall x \in N)(x \in a)$;
$(I_\in)$ $a \neq \emptyset \to (\exists x \in a)(\forall y \in a)(y \notin x)$.
$Th^0$ is taken to be Th with $(IND_N)$ restricted to $(I_N)$ and $(IND_\in)$ omitted completely.
For ordinal notations we refer to the literature (e.g. [4,33,3]). Ordinal functions $\varphi_\alpha$, for $\alpha$ countable, can be introduced by the following transfinite recursion: $\varphi_0\beta$ is $\omega^\beta$; for $\alpha > 0$, $\varphi_\alpha\beta$ is the $\beta$th simultaneous fixed point of all functions $\varphi_\xi$ with $\xi < \alpha$. The least $\alpha$ such that $\varphi_\alpha 0 = \alpha$ is normally denoted by $\Gamma_0$.
Progressiveness, transfinite induction and well-foundedness are defined for all a, r and arbitrary formulas A as follows:
$Prog(a,r,A) :\iff r \subset a \times a \,\&\, \forall x[\forall y(\langle y,x\rangle \in r \to A(y)) \to A(x)]$;
$TI(a,r,A) :\iff Prog(a,r,A) \to (\forall x \in a)A(x)$;
$Wf(a,r) :\iff \forall x\, TI(a,r,x)$.
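The three definitions can be checked mechanically on finite examples. The sketch below is not part of the formal development and rests on several assumptions: a is a finite Python set, r is a set of pairs (y, x) meaning that y lies below x, the variable x in Prog is restricted to a, and the set quantifier in Wf is replaced by a loop over all subsets of a. The function names are hypothetical.

```python
from itertools import chain, combinations

def prog(a, r, A):
    """Prog(a, r, A), with the variable x restricted to a (an assumption)."""
    # r must be a relation on a; a pair (y, x) means y is below x.
    if not all(y in a and x in a for (y, x) in r):
        return False
    # Progressiveness: if every r-predecessor of x satisfies A, so does x.
    return all(A(x) for x in a if all(A(y) for (y, z) in r if z == x))

def ti(a, r, A):
    """TI(a, r, A): progressiveness of A implies that A holds on all of a."""
    return (not prog(a, r, A)) or all(A(x) for x in a)

def wf(a, r):
    """Wf(a, r): TI(a, r, X) for every subset X of a (finite stand-in for all sets)."""
    items = list(a)
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return all(ti(a, r, lambda y, X=frozenset(X): y in X) for X in subsets)
```

For instance, `wf({0, 1, 2}, {(0, 1), (1, 2)})` holds, while a relation containing a cycle such as `{(0, 1), (1, 0)}` fails the test because the empty set is then progressive.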
An ordinal $\alpha$ is said to be provable in the theory Th if there exists a primitive recursive well-ordering Q of order-type $\alpha$ such that $Th \vdash Wf(N,Q)$. The proof-theoretic ordinal of Th, denoted by $|Th|$, is the least ordinal $\alpha$ that is not provable in Th. By $Th_0 \equiv Th_1$ we mean that the theories $Th_0$ and $Th_1$ prove the same arithmetic formulas, possibly with parameters (and modulo $\neg\neg$ translation if one of the theories uses intuitionistic logic).
Remark. In practice we have:
1. $Th_0 \equiv Th_1$ iff $|Th_0| = |Th_1|$.
2. If $\alpha = |Th|$, then $\alpha$ is the least ordinal such that PRA+QF-TI($\alpha$) (primitive recursive arithmetic with quantifier-free transfinite induction along a standard primitive recursive well-ordering of order-type $\alpha$) proves the consistency of Th.
2. Impredicative theories. A theory Th is called predicatively reducible if its proof-theoretic ordinal is less than or equal to $\Gamma_0$; otherwise Th is referred to as impredicative. This terminology goes back to the work of Feferman [4] and Schütte [33] in the middle of the sixties, where the philosophical concept of predicativity à la Poincaré was brought into precise mathematical shape.
Remark. We use the phrase 'predicatively reducible' instead of the more common 'predicative' in order to emphasize the following point: The least standard model of a predicative theory Th is a substructure of $L_{\Gamma_0}$ or $L_{\Gamma_0}$ itself, and therefore each set existence axiom of Th is predicatively justified. This is the case for example for Feferman's theory IR or the theory AUT(I1~) of Feferman-Jaeger [10]. In the next section we will see that there are predicatively reducible theories which are not predicative.
KPi is the strongest theory which has been treated in admissible proof theory so far. It is closely related to second order arithmetic with $\Delta^1_2$ comprehension and bar induction and to Feferman's theory $T_0$ for explicit mathematics. The main result states that
$KPi \equiv (\Delta^1_2\text{-CA})+(BI) \equiv T_0$ (1)
The reduction of $T_0$ to $(\Delta^1_2\text{-CA})+(BI)$ is due to Feferman [7]; the embedding of $(\Delta^1_2\text{-CA})+(BI)$ into KPi is given in Jaeger [22]. The ordinal analysis of KPi is carried through in Jaeger-Pohlers [27]; it is
$|KPi| \le$ 8°(81£10+10)0.
Jaeger [21] shows that every ordinal $\alpha <$ 8°(81£10+10)0 is provable in $T_0$.
The equivalence (1) is also interesting for constructive mathematics. Feferman's $T_0$ is based on intuitionistic logic and is a suitable framework for Bishop style constructive mathematics (Feferman [6,7,9]). Hence the theory KPi, which was developed without any reference to constructivity, is a conservative extension of constructive mathematics with respect to arithmetic (even Ill) statements.
Admissible proof theory can be extended to stronger theories. A cut elimination argument for the formalization of a recursively Mahlo universe is given in Jaeger [26]; the exact ordinal assignment, however, is still missing.
The impredicative subsystems of KPi are closely related to subsystems of analysis, subsystems of $T_0$ and theories for iterated inductive definitions. For simplicity we confine ourselves to stating some special cases only. For further equivalences, unexplained notations and the proofs of (2) - (6) we refer to Buchholz-Feferman-Pohlers-Sieg [3] and Jaeger [21,24]. $KPi^r+(IND_N)$ and $KPi^r$ correspond in strength to the theories $(\Delta^1_2\text{-CA})$ and $(\Delta^1_2\text{-CA})_0$. Here the subscript 0 is used to indicate that the scheme of complete induction is replaced by the axiom.
(2)
(3)
KPl is the framework to treat $\Pi^1_1$ comprehension on the natural numbers. We have
$KPl \equiv (\Pi^1_1\text{-CA})+(BI)$ (4)
(5)
(6)
KPu is an impredicative theory which proves the same arithmetic sentences as $(\Delta^1_1\text{-CA})+(BI)$ and the theory $ID_1$ of one inductive definition. The proof-theoretic ordinal of KPu, $(\Delta^1_1\text{-CA})+(BI)$ and $ID_1$ is the so called Howard ordinal. The equivalence (7) below follows from Howard [17] and Jaeger [20].
$KPu \equiv (\Delta^1_1\text{-CA})+(BI) \equiv ID_1$ (7)
3. Predicatively reducible theories. The previous section supports the thesis that theories for admissible sets provide a uniform framework for impredicative formalized mathematics. Now we turn to predicatively reducible systems. They can be approached by theories for admissible sets without $\in$-induction. In this survey, however, we essentially concentrate on $KPi^0$ and the new concept of the admissible extension $Th^c$ of a theory Th.
The weakest theories for admissible sets are the theories $KPu^0$ and $KPu^r$. These systems are conservative extensions of Peano arithmetic PA in the sense that
$KPu^r \vdash A^N \iff KPu^0 \vdash A^N \iff PA \vdash A$
for every sentence A of $L_1$. If we add the scheme of complete induction, we reach the strength of $(\Delta^1_1\text{-CA})$. So we have:
$KPu^r \equiv KPu^0 \equiv (\Delta^1_1\text{-CA})_0 \equiv PA$ (8)
(9)
(8) and (9) are proved in Jaeger [7,22] by a combination of proof-theoretic and model-theoretic arguments.† Both results also follow from more general considerations concerning the admissible extension of a theory, a concept that we explain now.
Definition. Let Th be a theory which is formulated in the language $L_{Ad}$ or an extension $L_{Ad}(\underline{c})$ of $L_{Ad}$ by finitely many set constants $\underline{c} = c_1,\dots,c_n$. First we extend this language by a new set constant M to the language $L_{Ad}(\underline{c},M)$. $Th^c$ then is the theory that consists of the following axioms:
1. Ontological and equality axioms. As in KPu.
2. M-axioms. M is transitive; $b \in M$ for all constants b of Th.
3. Th-axioms. For every axiom $A[\underline{a}]$ of Th: $(\forall \underline{a} \in M)A^M[\underline{a}]$.
4. Kripke-Platek axioms. The set existence axioms of KPu.
†) The third equivalence in (8) was originally proved by Barwise-Schlipf [2] (model-theoretically) and Feferman-Sieg [3] (proof-theoretically).
$Th^c$ can be considered as the proof-theoretic analogue of the 'next admissible set' construction in recursion theory (cf. Barwise [1]): If $\langle a,\dots\rangle$ is a standard model of Th, then $\langle a^+,\dots\rangle$ is the least standard model of $Th^c$ that contains a as element. In this case, M is interpreted as a and $a^+$ is the least admissible set with element a. $Th^c$ is very weak with respect to induction. Only that amount of induction is available in $Th^c$ which can be lifted from Th. The existence of infinite descending sequences of sets outside M is consistent with $Th^c$. The admissible extension of a theory Th is characterized by the following two theorems.
Theorem 1. Suppose that Th is a theory in $L_{Ad}(\underline{c})$. Then we have for every sentence A of $L_{Ad}(\underline{c})$:
(a) $Th^c \vdash A^M \iff Th \vdash A$;
The implications from right to left are obvious; the converse directions are proved in Jaeger [25]. In order to establish conservation results for $Th^c+(IND_N)$ we turn to infinitary systems. By $Th_\infty$ we denote the system which results from Th if we replace the scheme $(IND_N)$ by the $\omega$-rule
$\dfrac{A \to B(n) \ \text{ for every natural number } n}{A \to (\forall x \in N)B(x)}$
We write Th_l-
~
< u,
Lemma 1. Let Th be a theory in $L_{Ad}(\underline{c})$. Then we have for any $L_{Ad}(\underline{c})$ sentence A and $\varepsilon$-number $\alpha$:
This lemma is proved in Jaeger [25]. By standard techniques it implies the following theorem.
Theorem 2. If Th is a theory in $L_{Ad}(\underline{c})$ and A an $L_{Ad}(\underline{c})$ sentence, then
The notion of admissible extension is very general and interesting in its own right. We will apply it to the proof-theoretic analysis of $KPi^0$. As preparation we introduce theories for finitely many admissible universes. Extend $L_{Ad}$ to the language $L_{Ad}(\underline{d})$ by adding n new set constants $\underline{d} = d_0,\dots,d_{n-1}$, $n < \omega$. $(U_n)$ is the following axiom:
Together with the ontological axioms, $(U_n)$ expresses that $d_0,\dots,d_{n-1}$ is an increasing sequence of admissibles. Specializing Theorem 1 to the present situation yields the following conservation result.
Lemma 2. We have for every sentence A of $L_{Ad}(d_0,\dots,d_{n-1})$:
$KPu^0+(U_{n+1}) \vdash A^{d_n} \iff KPu^0+(IND_N)+(U_n) \vdash A$.
By some trivial ordinal manipulations we deduce from Lemma 1 and Lemma 2 that
$|\bigcup_{n<\omega} KPu^0+(U_n)| \le \Gamma_0$.
It is not difficult to see that this bound is sharp.
Lemma 3. $|\bigcup_{n<\omega} KPu^0+(U_n)| = \Gamma_0$.
Remark. The theories $KPu^0+(U_n)$ correspond in proof-theoretic strength to Martin-Löf's constructive theories with finitely many universes and to Feferman's subtheories To.. of $T_0$ (cf. Feferman [8] and Martin-Löf [29]).
The key to the proof-theoretic analysis of $KPi^0$ is the $\Pi_2$ reduction of $KPi^0$ to the theories $KPu^0+(U_n)$.
Lemma 4. Suppose that the $\Pi_2$ formula $\forall x \exists y A(x,y)$ is provable in $KPi^0$. Then there exists an $n < \omega$ such that
$KPu^0+(U_{k+1}) \vdash (\forall x \in d_m)(\exists y \in d_k)A(x,y)$
for all k and m such that k = m+n.
A detailed proof of Lemma 4 is given in Jaeger [25]. Now it is obvious that $KPi^0$ is predicatively reducible:
Theorem 3. $|KPi^0| = |KPl^0| = \Gamma_0$.
In the remainder of this section we will list several results which reveal the connections between $KPi^0$, $KPl^0$ and some well-known set-theoretic principles. Lemma 5 and Lemma 6 below indicate that $KPl^0$ and $KPi^0$ are strong enough to develop a reasonable part of the theory of hyperarithmetic sets.
The axiom $\beta$ asserts that every well-founded relation r on a set a has a collapsing function; i.e. it is the universal closure of the following formula:
$Wf(a,r) \to \exists f[Fun(f) \,\&\, dom(f) = a \,\&\, (\forall x \in a)(f(x) = \{f(y) : \langle y,x\rangle \in r\})]$
Lemma 5. $KPl^0$ proves the axiom $\beta$. Actually, if r is a well-founded relation on a and belongs to the admissible set d, then the collapsing function f for $\langle a,r\rangle$ belongs to d as well.
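For a finite well-founded relation the collapsing function can be computed directly by recursion along r. The following sketch assumes that a is a finite Python set, that r is a set of pairs (y, x) meaning that y lies below x, and that hereditarily finite sets are modelled as `frozenset` values; the name `collapse` is hypothetical.

```python
def collapse(a, r):
    """Return a dict f with f[x] = {f[y] : (y, x) in r} for a finite relation r on a.

    Raises ValueError if r has a cycle, i.e. is not well-founded.
    """
    preds = {x: [y for (y, z) in r if z == x] for x in a}
    f = {}

    def value(x, active=()):
        if x in f:
            return f[x]
        if x in active:
            # x was reached again while its own value is still being computed
            raise ValueError("r is not well-founded")
        f[x] = frozenset(value(y, active + (x,)) for y in preds[x])
        return f[x]

    for x in a:
        value(x)
    return f
```

On the three-element linear ordering 0 < 1 < 2 the function returns the first three von Neumann ordinals. If r is not extensional, distinct elements may receive the same value.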
Comparability of well-orderings is the statement that, given two well-orderings
Lemma 6. $KPl^0$ proves comparability of well-orderings.
Friedman's theory $ATR_0$ is a subsystem of analysis with the scheme of complete induction weakened to the axiom. Set existence is provided in $ATR_0$ by the principle of arithmetic transfinite recursion: Any countable well-ordering has a jump-hierarchy starting at any set. $ATR_0$ was introduced in Friedman [12]; his work was later expanded in Friedman-McAloon-Simpson [14] and Simpson [35,36]. Today $ATR_0$ is one of the five fundamental systems in the Friedman-Simpson program of reverse mathematics. The proof-theoretic ordinal of $ATR_0$ and the relationship between $ATR_0$ and Feferman's theory IR of predicative mathematics were first determined in [14] by very complicated model-theoretic ad hoc methods. A perspicuous and conceptually clear proof-theoretic analysis of $ATR_0$ is now available via the (analysis of the) theories $KPl^0$ and $KPi^0$. All one needs is the following observation (cf. Jaeger [24]):
Lemma 7. $ATR_0$ is contained in $KPl^0$.
Each ordinal less than $\Gamma_0$ is provable in $ATR_0$; this can be proved either directly or by embedding Feferman's theory IR into $ATR_0$. Hence we conclude:
Theorem 4. The theories $KPi^0$, $KPl^0$ and $ATR_0$ prove the same arithmetic (even $\Pi^1_1$) statements as Feferman's theory IR of predicative mathematics.
4. Set existence versus induction. Admissible proof theory tries to extract as much information as possible from the standard structures of a given theory. However, standard models are not sensitive to whether we treat a theory with full or restricted induction. So we have for example
$L_\alpha \models KPu^0 \iff L_\alpha \models KPu$
for all ordinals $\alpha$. On the other hand, the constructible hierarchy is important for the proof-theoretic analysis of theories for iterated admissible sets, namely in the sense that suitable initial segments are simulated by ramified set theory. To retain the naturalness of standard structures and to get more information out of the sets $L_\alpha$, we introduce the notion of $\Pi_2$ model.
Definition. Let Th be a theory in $L_{Ad}$. $L_\alpha$ is a $\Pi_2$ model of Th if $L_\alpha \models A$ for all $\Pi_2$ consequences A of Th. The least $\alpha$ with this property is called the $\Pi_2$ ordinal of Th.
Remarks.
1. The minimal $\Pi_2$ model of a theory and its $\Pi_2$ ordinal reflect its proof-theoretic strength. Natural theories with different proof-theoretic ordinals have different $\Pi_2$ ordinals.
2. If $\alpha$ is the $\Pi_2$ ordinal of Th, then the Skolem functions for provable $\Pi_2$ sentences can be easily constructed from $\alpha$.
The $\Pi_2$ ordinals of all interesting theories are known. In general they are easier to calculate than the proof-theoretic ordinals. With some experience it is not hard to compute the proof-theoretic ordinal of a theory Th from the proof that a particular $\alpha$ is its $\Pi_2$ ordinal. What is still missing is an exact characterization of the proof-theoretic ordinal in terms of the $\Pi_2$ ordinal and vice versa. Without going into details we will quote some results concerning $\Pi_2$ ordinals:
A. $\omega$ is the $\Pi_2$ ordinal of $KPu^0$ and $KPu^r$. As a consequence we have that the existence of $\omega$ is not provable in $KPu^r$. The reason is that the natural ordering of N cannot be lifted to the ordinals without using $\Sigma_1$ induction along N. The proof-theoretic ordinal of $KPu^r$ and $KPu^0$ is $\varepsilon_0$.
B. $\varepsilon_0$ is the $\Pi_2$ ordinal of $KPu^r+(IND_N)$ and $KPu^0+(IND_N)$. This result should be compared with an old result of Friedman [11] that $(\Sigma^1_1\text{-AC})$ can be reduced to ramified analysis in all levels $< \varepsilon_0$. The proof-theoretic ordinal of $KPu^r+(IND_N)$ and $KPu^0+(IND_N)$ is $\varphi_{\varepsilon_0}0$.
C. $\varphi_{\varepsilon_0}0$ is the $\Pi_2$ ordinal of $KPu^0+(IND_N)+(\Sigma_1\text{-}IND_\in)$. Here $(\Sigma_1\text{-}IND_\in)$ is the scheme of $\in$-induction restricted to $\Sigma_1$ formulas. The proof-theoretic ordinal $\alpha$ of this theory is not determined yet; conjecture: $\alpha$ = ct>eoO.
D. The Howard ordinal $\theta\varepsilon_{\Omega+1}0$ is the $\Pi_2$ ordinal and the proof-theoretic ordinal of KPu.
E. The $\Pi_2$ ordinal of KPi is obtained by collapsing the ordinal $\varepsilon_{I_0+1}$ below $I_0$; for the proof-theoretic ordinal of KPi we have to collapse $\varepsilon_{I_0+1}$ below $\omega_1^{CK}$.
In KPu we have full induction. $KPi^0$ is a theory which is strong with respect to set existence axioms at the price of being weak with respect to induction principles. Systems of this flavour have become interesting recently since they are very flexible as far as ordinary mathematics is concerned (cf. Simpson [38]). Sets are not necessarily well-founded in $KPi^0$. Therefore we introduce in $KPi^0$ the subuniverse Hwf of all hereditarily well-founded sets and study its closure properties.
Definition. A set a is called hereditarily well-founded iff there exists a transitive set $b \supset a$ which is well-founded with respect to $\in$;
$Hwf(a) :\iff \exists z(z \text{ transitive} \,\&\, a \subset z \,\&\, Wf(z, \in|z \times z))$.
If TC(a) is the transitive closure of a, then a is hereditarily well-founded if TC(a) is well-founded. The existence of the transitive closure of a set a is not always clear in theories with restricted induction.
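For hereditarily finite sets the transitive closure is easy to compute, which makes the definition concrete. The sketch below is an illustration only, under the assumption that sets are modelled as Python `frozenset` values and urelements as integers; the function names are hypothetical. In this model every set is automatically well-founded, so the delicate point about restricted induction does not arise here.

```python
def transitive_closure(a):
    """TC(a): the least transitive set that includes a (urelements are ints)."""
    tc, stack = set(), list(a)
    while stack:
        x = stack.pop()
        if x not in tc:
            tc.add(x)
            if isinstance(x, frozenset):
                stack.extend(x)  # elements of elements also belong to TC(a)
    return frozenset(tc)

def is_transitive(z):
    """z is transitive: every set that is an element of z is a subset of z."""
    return all(x <= z for x in z if isinstance(x, frozenset))

def hwf(a):
    """Hwf(a) in this toy model: TC(a) is a transitive set that includes a."""
    z = transitive_closure(a)
    # Python frozensets cannot contain membership cycles, so Wf holds trivially.
    return is_transitive(z) and a <= z
```

For example, the set {{∅}} is not transitive, but its transitive closure {{∅}, ∅} is.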
Hwf is a proper class in Kl'i", Let EST be KPuT minus !io collection. (ES is an abbreviation for elementary set theory.) Theorem 5. (a) We have for every sentence A of LAd:
EST+(Axiom
P) 1-A
(b) There exists an instance of (!io-Col) such that
for its universal closure A.
=> KPi· 1- AHwf
ProofTheoretic Contributions to Theories a/Sets
187
The proof of (a) is more or less clear. Now assume that (b) is wrong. Then Hwf reflects the theory KPu'+(Axiom KPu'+(Axiom~)
~).
provable in Kl'i". This is a contradiction since the proof-theoretic ordinal of
is greater than
r o (cf. Jaeger [22]).
Remark. In Simpson [36] a set-theoretic version $\mathrm{ATR}_0^s$ of $\mathrm{ATR}_0$ is studied. $\mathrm{ATR}_0^s$ is a subsystem of ZF which is conservative over $\mathrm{ATR}_0$ for sentences of second order arithmetic. $\mathrm{ES}^r$+(Axiom $\beta$) is $\mathrm{ATR}_0^s$ formulated in a language with the natural numbers as urelements.
Now we turn to a familiar induction principle, the so-called bar rule (BR). It is the rule of inference
$$\frac{\mathrm{WF}(N,Q)}{\mathrm{TI}(N,Q,A)}$$
for every primitive recursive relation Q on N and every $L_{Ad}$ formula A. Like $\mathrm{KPi}^0$, $\mathrm{KPu}^0$+(BR) is a theory of strength $\Gamma_0$, but the two theories formalize completely different universes. Compared with the least standard model of $\mathrm{KPi}^0$, the least standard model of $\mathrm{KPu}^0$+(BR) is very small. However, in $\mathrm{KPu}^0$+(BR) we have very strong induction on N. The reduction of $\mathrm{KPu}^0$+(BR) brings us to the very interesting class of formulas which are, in $\mathrm{KPi}^0$, uniformly provable in all admissibles.
Theorem 6. If A is a formula of $L_{Ad}$, then
$$\mathrm{KPu}^0 + (\mathrm{BR}) \vdash A \;\Rightarrow\; \mathrm{KPi}^0 \vdash \forall x\,(\mathrm{Ad}(x) \to A^x).$$
The proof is by induction on the length of the derivations in $\mathrm{KPu}^0$+(BR). It is not clear whether the converse of Theorem 6 is true in full generality as well. Uniform provability in all admissibles has to do with predicativity. The following result makes a case for $\mathrm{KPi}^0$ having a predicative justification with respect to uniformly provable $\Pi_2$ formulas.
Theorem 7. We have for every $\Pi_2$ sentence A of $L_{Ad}$:
$$\mathrm{KPi}^0 \vdash \forall x\,(\mathrm{Ad}(x) \to A^x) \;\Rightarrow\; L_{\Gamma_0} \models A.$$
Remark. We have compared $\mathrm{KPi}^0$ with all theories of ordinals $\le \Gamma_0$ that we know. In each case it was fairly easy to prove an embedding theorem into $\mathrm{KPi}^0$. Hence it seems justified to call $\mathrm{KPi}^0$ the strongest predicatively reducible theory. In this context it is interesting to consider the following parallel:
$\mathrm{KPi}^0$ - predicativity,
KPi - constructivity.
Further research. In the previous considerations we have concentrated on set existence provided by the iteration of admissibility and on various restrictions of induction principles. By doing this, a uniform proof-theoretic approach to theories in strength between PA and ($\Delta^1_2$-CA)+(BI) has been obtained. We end this paper by mentioning an alternative form of set existence. It is possible to shift from pure set existence axioms to another class of axioms which one can denote as set structuring axioms. The general form of these axioms is
$$\exists x\,A(x) \to \exists y\,(y \text{ nice} \;\&\; A(y))$$
where A and nice have to be specified. Obviously axioms of this form are related to the so-called basis theorems in recursion theory (cf. Shoenfield [35]). Not much is known about set structuring axioms in proof theory, but they seem to be very interesting.
REFERENCES.
[1] Barwise, J. Admissible Sets and Structures. Springer-Verlag, Berlin, Heidelberg, New York (1975).
[2] Barwise, J. and Schlipf, J. On recursively saturated models of arithmetic. Model Theory and Algebra. Lecture Notes in Mathematics 498. Springer-Verlag, Berlin, Heidelberg, New York (1975).
[3] Buchholz, W., Feferman, S., Pohlers, W. and Sieg, W. Iterated Inductive Definitions and Subsystems of Analysis: Recent Proof-Theoretical Studies. Lecture Notes in Mathematics 897. Springer-Verlag, Berlin, Heidelberg, New York (1981).
[4] Feferman, S. Systems of predicative analysis. Journal of Symbolic Logic 29 (1964).
[5] Feferman, S. Predicatively reducible systems of set theory. Set Theory II. American Mathematical Society, Providence, R.I. (1974).
[6] Feferman, S. A language and axioms for explicit mathematics. Algebra and Logic. Lecture Notes in Mathematics 450. Springer-Verlag, Berlin, Heidelberg, New York (1975).
[7] Feferman, S. Constructive theories of functions and classes. Logic Colloquium '78. North-Holland, Amsterdam (1979).
[8] Feferman, S. Iterated inductive fixed-point theories: Application to Hancock's conjecture. Patras Logic Symposion 1980. North-Holland, Amsterdam (1982).
[9] Feferman, S. Between constructive and classical mathematics. Computation and Proof Theory. Proceedings, Logic Colloquium Aachen 1983, Part II. Lecture Notes in Mathematics 1104. Springer-Verlag, Berlin, Heidelberg, New York, Tokyo (1984).
[10] Feferman, S. and Jaeger, G. Choice principles, the bar rule and autonomously iterated comprehension schemes in analysis. Journal of Symbolic Logic 48 (1983).
[11] Friedman, H. Iterated inductive definitions and ($\Sigma^1_2$-AC). Intuitionism and Proof Theory. North-Holland, Amsterdam (1970).
[12] Friedman, H. Systems of second order arithmetic with restricted induction I, II (abstracts). Journal of Symbolic Logic 41 (1976).
[13] Friedman, H. Set theoretic foundations of constructive analysis. Annals of Mathematics 105 (1977).
[14] Friedman, H., McAloon, K. and Simpson, S.G. A finite combinatorial principle which is equivalent to the 1-consistency of predicative analysis. Patras Logic Symposion 1980. North-Holland, Amsterdam (1982).
[15] Girard, J.-Y. $\Pi^1_2$-logic, Part I: Dilators. Annals of Mathematical Logic 21 (1981).
[16] Girard, J.-Y. A survey of $\Pi^1_2$-logic. Logic, Methodology and Philosophy of Science VI. North-Holland, Amsterdam (1982).
[17] Howard, W.A. Ordinal analysis of bar recursion of type 0. Compositio Mathematica 42 (1981).
[18] Jaeger, G. Beweistheorie von KPN. Archiv für Mathematische Logik und Grundlagenforschung 20 (1980).
[19] Jaeger, G. Die konstruktible Hierarchie als Hilfsmittel zur beweistheoretischen Untersuchung von Teilsystemen der Mengenlehre und Analysis. Dissertation, München (1979).
[20] Jaeger, G. Zur Beweistheorie der Kripke-Platek-Mengenlehre über den natürlichen Zahlen. Archiv für Mathematische Logik und Grundlagenforschung 22 (1982).
[21] Jaeger, G. A well-ordering proof for Feferman's theory $T_0$. Archiv für Mathematische Logik und Grundlagenforschung 23 (1983).
[22] Jaeger, G. Iterating admissibility in proof theory. Proceedings of the Herbrand Symposium. Logic Colloquium '81. North-Holland, Amsterdam (1982).
[23] Jaeger, G. A version of Kripke-Platek set theory which is conservative over Peano arithmetic. Zeitschrift für mathematische Logik und Grundlagen der Mathematik 30 (1984).
[24] Jaeger, G. The strength of admissibility without foundation. Journal of Symbolic Logic 49 (1984).
[25] Jaeger, G. Theories for admissible sets - a unifying approach to proof theory. Habilitationsschrift, München (1984).
[26] Jaeger, G. A cut elimination argument for a Mahlo universe. Handwritten notes (1981).
[27] Jaeger, G. and Pohlers, W. Eine beweistheoretische Untersuchung von ($\Delta^1_2$-CA)+(BI) und verwandter Systeme. Bayerische Akademie der Wissenschaften, Mathematisch-Naturwissenschaftliche Klasse: Sitzungsberichte (1982).
[28] Jaeger, G. and Pohlers, W. Admissible Proof Theory. In preparation.
[29] Martin-Löf, P. An intuitionistic theory of types: Predicative part. Logic Colloquium '73. North-Holland, Amsterdam (1975).
[30] Pohlers, W. Cut-elimination for impredicative infinitary systems. Part I: Ordinal analysis for $\mathrm{ID}_1$. Archiv für Mathematische Logik und Grundlagenforschung 21 (1981).
[31] Pohlers, W. Cut-elimination for impredicative infinitary systems. Part II: Ordinal analysis for iterated inductive definitions. Archiv für Mathematische Logik und Grundlagenforschung 22 (1982).
[32] Pohlers, W. Admissibility in proof theory, a survey. Logic, Methodology and Philosophy of Science VI. North-Holland, Amsterdam (1982).
[33] Schütte, K. Eine Grenze der Beweisbarkeit der transfiniten Induktion in der verzweigten Typenlogik. Archiv für Mathematische Logik und Grundlagenforschung 7 (1964).
[34] Schütte, K. Proof Theory. Springer-Verlag, Berlin, Heidelberg, New York (1977).
[35] Shoenfield, J. Mathematical Logic. Addison-Wesley, Reading, Mass. (1967).
[36] Simpson, S.G. Set theoretic aspects of $\mathrm{ATR}_0$. Logic Colloquium '80. North-Holland, Amsterdam (1982).
[37] Simpson, S.G. $\Sigma^1_1$ and $\Pi^1_1$ transfinite induction. Logic Colloquium '80. North-Holland, Amsterdam (1982).
[38] Simpson, S.G. Reverse mathematics. Proceedings of the AMS Summer Institute in Recursion Theory. To appear.
[39] Takeuti, G. Proof Theory. North-Holland, Amsterdam (1975).
G. Jaeger
Mathematik, ETH-Zentrum
CH-8092 Zürich (Suisse)
Logic Colloquium '85 Edited by The Paris Logic Group © Elsevier Science Publishers B.V. (North-Holland), 1987
$\Pi_2$-MODELS OF EXTENSIONS OF KRIPKE-PLATEK SET THEORY
Peter Päppinghaus
Institut für Mathematik, Universität Hannover, Welfengarten 1, D-3000 Hannover 1

§ 1. INTRODUCTION
In the development of proof theory a shift has taken place from subsystems of analysis (second order arithmetic) to theories of inductive definitions, and from these to extensions and subsystems of Kripke-Platek set theory. The latter change has been advanced most forcefully by Jäger, and for background and recent results we refer to his contribution in this volume [Ja 2]. Following this trend we present a new method of analyzing extensions of Kripke-Platek set theory in the spirit of Girard's $\Pi^1_2$-logic. Our method is remarkably simpler than previous work. Partly this is due to the fact that our aims are less ambitious. In contrast to the tradition we are not concerned with determining so-called proof-theoretic ordinals, but rather with finding $\Pi_2$-models and the $\Pi_2$-ordinals of the theories we study. These notions are explained below and go back to Jäger (see [Ja 2])†.
We assume familiarity with Kripke-Platek set theory as developed in [Bw, Ch. I]. We consider only sets without urelements, and so the formal system KP is formulated in a one-sorted language with the two-place relation symbols $=$ and $\in$. As axioms of KP we have the axioms of equality, extensionality, pairing and union, and the schemes of $\Delta_0$-separation, $\Delta_0$-collection and foundation. We extend KP to the theory KP(G) by adjoining a new two-place predicate constant G and the following axioms expressing that G is the graph of an ordinal function.
(G 1) $\forall x,y,z\,(G(x,y) \wedge G(x,z) \to y = z)$
(G 2) $\forall x\,(\mathrm{Ord}(x) \to \exists y\,(\mathrm{Ord}(y) \wedge G(x,y)))$
Ord(c) is the usual $\Delta_0$-formula expressing that c is an ordinal.
† The referee has pointed out to me that the notion of $\Pi_2$-model implicitly goes back to Tait. However, I have not been able to locate a reference.
KP(G) is a uniform way of giving extensions of KP of different strength according to how G is interpreted. For example, if G is to be interpreted as the graph of the constant function with value $\omega$, we can replace the relation constant G by a $\Sigma_1$-formula defining this graph, and we obtain a theory equivalent to Jäger's KPu (see [Ja 1]). And if G is to be interpreted as the graph of the function associating with any ordinal $\alpha$ the next admissible ordinal $\alpha^+$, we can similarly obtain a theory equivalent to Jäger's KPi (see [JP]) as well as to Pearce's ~ (see [Pc]).
We are interested only in standard models for the language of KP(G), which are given by a pair (M,g), where M is a non-empty, transitive set, and $g: \mathrm{On} \to \mathrm{On}$ is a total ordinal function. M is called admissible if $(M,\in)$ is a model of KP, and M is called g-admissible if $(M,\in,G_g)$ is a model of KP(G). $G_g$ is defined by
$$G_g(a,b) \;:\Leftrightarrow\; (\mathrm{Ord}(a) \wedge b = g(a)) \vee (\neg\mathrm{Ord}(a) \wedge b = 0).$$
(M,g) is called a $\Pi_2$-model of KP(G) if $(M,\in,G_g)$ is a model of every $\Pi_2$-sentence which is provable in KP(G).
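The definition of $G_g$ can be tried out on hereditarily finite sets. The following is only a minimal sketch under simplifying assumptions that are not in the text: sets are Python frozensets, ordinals are finite von Neumann ordinals, and g is given as a function on natural numbers. The names `ordinal`, `is_ordinal` and `make_G` are hypothetical helpers chosen for illustration.

```python
def ordinal(n):
    """Finite von Neumann ordinal n = {0, 1, ..., n-1} as a frozenset."""
    s = frozenset()
    for _ in range(n):
        s = s | {s}          # successor step: s + 1 = s united with {s}
    return s


def is_transitive(s):
    """A set is transitive if each of its elements is also a subset of it."""
    return all(x <= s for x in s)


def is_ordinal(s):
    """Ord(s) for well-founded sets: s and all its elements are transitive."""
    return is_transitive(s) and all(is_transitive(x) for x in s)


def make_G(g):
    """Build the relation G_g from g (assumed: a function on naturals).

    G_g(a, b) holds iff (Ord(a) and b = g(a)) or (not Ord(a) and b = 0).
    """
    def G(a, b):
        if is_ordinal(a):
            # a finite ordinal n has exactly n elements, so len(a) recovers n
            return b == ordinal(g(len(a)))
        return b == ordinal(0)
    return G


# Example: g is the successor function on finite ordinals.
G = make_G(lambda n: n + 1)
not_an_ordinal = frozenset({ordinal(1)})   # the set {{0}} is not transitive
```

With this choice of g, `G(ordinal(2), ordinal(3))` holds, and every non-ordinal such as `not_an_ordinal` is related to 0 only, so (G 1) and (G 2) are satisfied on the sets the sketch can represent.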
One way to phrase our aims is to say that we want to determine ordinals $\kappa$ s.t. $(V_\kappa,g)$ is a $\Pi_2$-model of KP(G). ($V_\kappa$ denotes the $\kappa$-th level of the cumulative hierarchy.) Phrased in a recursion-theoretic spirit, we want to determine bounds for set functions and ordinal functions which are $\Sigma_1$-definable provably in KP(G).
Let V denote the universe of sets and On the class of all ordinals. We use a, b, c, ... as free variables of the formal language and also (somewhat ambiguously) informally to denote sets. The notation $F[\vec a]$ is used to indicate that a formula F contains at most $\vec a = a_1,\dots,a_n$ as its free variables. $\vdash$ denotes derivability, and $\alpha, \beta, \gamma, \dots$ are used for ordinals.
$f: V \to V$ is called $\Sigma_1$-definable provably in KP(G) iff there is a $\Sigma_1$-formula F[a,b] of KP(G) s.t. $\mathrm{KP}(G) \vdash \forall x\,\exists! y\,F[x,y]$ and for every g-admissible set M and $a \in M$: $f(a) \in M$ and $(M,\in,G_g) \models F[a,f(a)]$. Analogously one defines that $f: \mathrm{On} \to \mathrm{On}$ is $\Sigma_1$-definable provably in KP(G). In particular an ordinal $\alpha$ is called $\Sigma_1$-definable provably in KP(G) iff there is a $\Sigma_1$-formula A[a] of KP(G) s.t. $\mathrm{KP}(G) \vdash \exists! y\,(\mathrm{Ord}(y) \wedge A[y])$ and for every g-admissible set M: $\alpha \in M$ and $(M,\in,G_g) \models A[\alpha]$.
We will give bounds for provably $\Sigma_1$-definable functions in terms of a hierarchy of ordinal functions $J^g_D: \mathrm{On} \to \mathrm{On}$. This hierarchy is defined and investigated at length in [Pa, Kap. II]. The indices D of the hierarchy denote dilators. The following dilators, in particular, play a role in our
To understand this definition and notation (and even more so later parts of this paper) the reader has to be familiar with the fundamentals of the theory of dilators as exposed in [Gi 2, pp. 89-139].
The principal boundedness result is the following. We assume that $g: \mathrm{On} \to \mathrm{On}$ is non-decreasing and $g(0) > 0$.
Theorem 1.1: Let $C[\vec a, b]$ be a $\Delta_0$-formula of KP(G) s.t. $\mathrm{KP}(G) \vdash \forall \vec x\,\exists y\,C[\vec x, y]$. Then for some finite n and every ordinal $\alpha$:
$$\forall \vec a \in V_\alpha\ \exists b \in V_{J^g_{(1+\mathrm{Id})_{(n)}}(\alpha)}\ C[\vec a, b]$$
is true with G interpreted by $G_g$.
This immediately entails closure properties in the cumulative hierarchy.
Corollary 1.2: Let $\kappa$ be an ordinal closed under all $J^g_{(1+\mathrm{Id})_{(n)}}$, $n < \omega$. Then $(V_\kappa,g)$ is a $\Pi_2$-model of KP(G). In other words: $V_\kappa$ is closed under all functions which are $\Sigma_1$-definable provably in KP(G), and these have a $V_\kappa$-absolute $\Sigma_1$-definition.
$\kappa(g) := J^g_{(1+\mathrm{Id})_{(\omega)}}(0)$ is the least ordinal satisfying the hypothesis of the corollary. Moreover it follows from the results of [Pa] that the set of ordinals $\Sigma_1$-definable provably in KP(G) is a cofinal subset of $\kappa(g)$. Consequently $\kappa(g)$ is the least ordinal $\kappa$ s.t. $(V_\kappa,g)$ is a $\Pi_2$-model of KP(G). An application of the same methods to a version of KPu yields that $\kappa(\lambda\alpha.\omega)$ (which, incidentally, by [Pa] equals the so-called Bachmann-Howard ordinal) is the least ordinal $\kappa$ s.t. $V_\kappa$ is a $\Pi_2$-model of KPu. By a slight modification of our techniques we can prove similarly that $\kappa(\lambda\alpha.\omega)$ is the least ordinal $\kappa$ s.t. $L_\kappa$ is a $\Pi_2$-model of KPu, i.e. in the terminology of Jäger (see [Ja 2]): $\kappa(\lambda\alpha.\omega)$ is the $\Pi_2$-ordinal of KPu. ($L_\kappa$ denotes the $\kappa$-th level of the constructible hierarchy.) Finally we can also analyze a version of KPi and obtain that $\kappa(\lambda\alpha.\alpha^+)$ is the $\Pi_2$-ordinal of KPi. These ordinals have been determined by Jäger and Pohlers in [Ja 1] and [JP] to equal $\theta\varepsilon_{\Omega+1}0$ and $\theta\varepsilon_{I+1}0$ respectively.
The principal boundedness result (Theorem 1.1) is proved by combining a syntactic analysis of KP(G) along traditional lines with a semantic analysis in the form of an asymmetric interpretation. To analyze KP(G) proof-theoretically we introduce an infinitary sequent calculus KPV(G). It has for every ordinal $\alpha$ an $\alpha$-branching rule expressing the definition of $V_\alpha$, and further an On-branching rule expressing that every set is an element of some $V_\alpha$. KP(G) can be embedded in KPV(G). By virtue of the infinitary rules the scheme of foundation can be derived in KPV(G). The remaining axioms of KP(G), which are of bounded logical complexity, are taken care of by particular rules, called axiom rules. For KPV(G) a weak cut-elimination theorem is proved. (In the disguise of the axiom rules there remain specific cuts.)
As usual we have to control the lengths of the derivations obtained in the process of embedding and cut-elimination. For this purpose we use dilators rather than ordinals. More precisely we use a relation of majorization between KPV(G)-derivations and dilators. This is the only trace of functoriality in our treatment. We have stayed closer to traditional proof theory than the Girard school (see e.g. [Fe]) in that we have not incorporated any functoriality conditions in the notion of derivation. However, all infinitary derivations occurring here could be shown to be homogeneous trees in the sense of [Jer], even primitive recursive ones, and are thus perfectly good constructive objects. While homogeneity of derivation trees would be needed for a unique assignment of dilators as lengths, we get along with a majorization, which can be obtained more cheaply. Moreover we pay no attention to the metamathematical methods employed in our work, and so there is no technical need to bother about homogeneity.
Dilators as length bounds are of great technical advantage. We could have replaced the On-branching rule of KPV(G) by an analogous $\Omega$-branching rule. In such a setting we could have used ordinals below $\varepsilon_{\Omega+1}$ as length bounds as in [How]. But as Howard's work shows, this leads to great complications due to the fact that the fundamental sequences are not well-behaved w.r.t. the algebraic operations. The use of dilators is a much more elegant solution. To obtain length bounds for derivations, one only has to construct natural transformations between dilators without ever paying attention to fundamental sequences. And for coping with the On-branching rule we have the fundamental sequences $(\{D\}\zeta)_{\zeta \in \mathrm{On}}$ at our disposal, which are inherently given by virtue of Girard's functor of separation of variables: $\{D\}\zeta = \mathrm{SEP}(D)(\cdot,\zeta)$. A certain price has to be paid, since the functor SEP is not exactly the inverse of diagonalization. As a consequence the fundamental sequences given by SEP are still not sufficiently well-behaved w.r.t. the algebraic operations. For example $(1+\mathrm{Id})\{D\}\zeta$ can be naturally transformed into $\{(1+\mathrm{Id})D\}\zeta$, but is in general not equal to it. Also we have to modify the usual cut-elimination technique slightly, since there is no commutative 'natural sum' or 'natural product' for dilators. The final result of our proof-theoretic work is that every KP(G)-derivable sequent has a (weakly) cut-free KPV(G)-derivation, which is majorized by $(1+\mathrm{Id})_{(n)}$ for some finite n.
This syntactic analysis is supplemented by a semantic analysis, namely an asymmetric interpretation of cut-free KPV(G)-derivations. Such a method was first applied by Girard in [Gi 1] and later by Masseron, Van de Wiele and Vauzeilles in the context of theories of inductive definitions in order to prove boundedness results for ordinal recursion and set recursion. In contrast to these authors we are here concerned with the provably ordinal recursive and set recursive functions of certain extensions of KP, and arrive at a proof-theoretic refinement of Van de Wiele's results, in particular. In a separate paper we will combine our asymmetric interpretation with completeness theorems to recapture many of the boundedness results due to Girard, Masseron, Normann, Ressayre, Van de Wiele and Vauzeilles. Leaving a detailed comparison to that paper, let us say here only that in our variation it is the unbounded quantifiers that are interpreted asymmetrically rather than an inductively defined predicate.
Our Asymmetric Interpretation Theorem says the following. If d is a cut-free KPV(G)-derivation of the sequent $\Gamma \vdash \Delta$ and D a dilator majorizing d, then this end sequent is valid under the following interpretation: free variables and (essentially) universally quantified variables are interpreted in $V_\alpha$, whereas (essentially) existentially quantified variables are interpreted in $V_{J^g_D(\alpha)}$. The hierarchy $J^g_D$ is needed in order to cope with the asymmetric interpretations of the hidden cut-formulae in the axiom rules of KPV(G), which are of bounded logical complexity and concretely known. Essential properties of the hierarchy are monotonicity, the law
$$J^g_{D+E} = J^g_E \circ J^g_D,$$
mainly for the $\Delta_0$-Collection rule, and the base function g for the axioms on the relation constant G. In place of $J_D$ we could have used a variant of Girard's functor $\Lambda$ (see [Gi 2, § 5.4, pp. 159-168]). But we have preferred to demonstrate the possibility of an alternative to $\Lambda$.
This paper is not self-contained. Beyond background on Kripke-Platek set theory and the theory of dilators as specified earlier, we presuppose the results on computing with dilators expounded in [Pa, § 7] and some familiarity with Gentzen's sequent calculus (see e.g. [Ta, Ch. I] and [Sch, pp. 871-883]).

ACKNOWLEDGEMENTS
The research described in this paper was partly carried out at the Université Paris VII. This stay was financially supported by the DFG (Deutsche Forschungsgemeinschaft). Special thanks go to Jean-Yves Girard for his cooperation and stimulating discussions during my visit in Paris and on several occasions in Oberwolfach.

§ 2. INFINITARY SEQUENT CALCULUS KPV(G)
KPV(G) is given by the following language, initial sequents, and rules.
Language: We use a, b, c, ... to denote free variables and x, y, z, ... to denote bound variables. As non-logical symbols we have the two-place relation symbols $=$, $\in$, G of KP(G), and in addition for every ordinal $\alpha$ a one-place relation symbol $V_\alpha$. The variables are supposed to range over some universe of sets (without urelements), and $V_\alpha$ is supposed to denote the $\alpha$-th level of the cumulative hierarchy. We restrict our logical language to $\bot$, $\to$, $\forall$ as basic symbols and assume all other connectives to be defined in terms of these in the usual way. In particular "$(\forall x \in a)B$" is to be taken as an abbreviation for "$\forall x\,(x \in a \to B)$". By $\Gamma, \Delta, \dots$ we denote finite sets of formulas of this language (hence so-called structural rules are superfluous).
Initial sequents:
$A,\Gamma \vdash \Delta,A$ for every atomic formula A of the language of KP(G).
$\bot,\Gamma \vdash \Delta$
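The remark that structural rules become superfluous when $\Gamma$ and $\Delta$ are finite sets can be illustrated with a small data-structure sketch. This is only an illustration under stated assumptions, not part of the calculus itself: the class names `Atom`, `Bot`, `Imp`, `All` and the helpers `bounded_all`, `sequent`, `is_initial` are hypothetical, and relation symbols are encoded as the strings "=", "in" and "G".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    name: str            # "=", "in", "G", or a level predicate such as "V_3"
    args: tuple = ()


@dataclass(frozen=True)
class Bot:               # the constant for falsity
    pass


@dataclass(frozen=True)
class Imp:
    left: object
    right: object


@dataclass(frozen=True)
class All:
    var: str
    body: object


def bounded_all(x, a, body):
    """(forall x in a) B, read as the abbreviation forall x (x in a -> B)."""
    return All(x, Imp(Atom("in", (x, a)), body))


def sequent(gamma, delta):
    """A sequent is a pair of finite sets, so order and repetition vanish."""
    return (frozenset(gamma), frozenset(delta))


KP_G_RELATIONS = {"=", "in", "G"}   # atomic formulas of the language of KP(G)


def is_initial(seq):
    """Initial sequents: A, Gamma |- Delta, A (A atomic in KP(G)) or Bot, Gamma |- Delta."""
    gamma, delta = seq
    if Bot() in gamma:
        return True
    return any(isinstance(f, Atom) and f.name in KP_G_RELATIONS
               for f in gamma & delta)
```

Because both sides are frozensets, `sequent([A, A, B], [B])` and `sequent([B, A], [B])` are the same object up to equality, which is exactly why contraction and exchange need no rules of their own in this representation.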
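Logical rules:
$(r\!\to)$ $\dfrac{A,\Gamma \vdash \Delta,B}{\Gamma \vdash \Delta,A \to B}$
$(l\!\to)$ $\dfrac{A \to B,\Gamma \vdash \Delta,A \qquad B,A \to B,\Gamma \vdash \Delta}{A \to B,\Gamma \vdash \Delta}$
$(r\forall)$ $\dfrac{\Gamma \vdash \Delta,A(a)}{\Gamma \vdash \Delta,\forall x A(x)}$ if a does not occur in the lower sequent.
$(l\forall)$ $\dfrac{A(a),\forall x A(x),\Gamma \vdash \Delta}{\forall x A(x),\Gamma \vdash \Delta}$
Non-logical rules:
$(V_\alpha)$ $\dfrac{(\forall y \in a)V_\beta(y),\Gamma \vdash \Delta \quad (\text{all } \beta < \alpha)}{V_\alpha(a),\Gamma \vdash \Delta}$
$(V)$ $\dfrac{V_\alpha(a),\Gamma \vdash \Delta \quad (\text{all } \alpha \in \mathrm{On})}{\Gamma \vdash \Delta}$
$(\Delta_0\text{-Coll})$ $\dfrac{\Gamma \vdash \Delta,(\forall y \in a)\exists z C \qquad \exists x(\forall y \in a)(\exists z \in x)C,\Gamma \vdash \Delta}{\Gamma \vdash \Delta}$ if C is a $\Delta_0$-formula of the language of KP(G).
$(\mathrm{Ax})$ $\dfrac{A,\Gamma \vdash \Delta}{\Gamma \vdash \Delta}$ if A is an instance of one of the following axiom schemes.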
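Axioms:
(Eq) $a = b \to (C(a) \to C(b))$ for every atomic formula C of KP(G).
(Ext) $(\forall z \in a)\,z \in b \wedge (\forall z \in b)\,z \in a \to a = b$
(Pair) $\exists z\,(a \in z \wedge b \in z)$
(Union) $\exists u\,(\forall y \in a)(\forall z \in y)\,z \in u$
($\Delta_0$-Sep) $\exists z\,((\forall y \in z)(y \in a \wedge C(y)) \wedge (\forall y \in a)(C(y) \to y \in z))$ for every $\Delta_0$-formula C of KP(G).
(G 1) $G(a,b) \wedge G(a,c) \to b = c$
(G 2) $\mathrm{Ord}(a) \to \exists y\,(\mathrm{Ord}(y) \wedge G(a,y))$, where Ord(a) is the usual $\Delta_0$-formula of KP expressing that a is an ordinal.
Cut rule:
$\dfrac{\Gamma \vdash \Delta,A \qquad A,\Gamma \vdash \Delta}{\Gamma \vdash \Delta}$ if A is a formula of KP(G).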
A KPV(G)-derivation is a well-founded tree whose nodes are labelled by sequents $\Gamma \vdash \Delta$ containing the constants $V_\alpha$ only negatively, and by names of rules or axioms in a locally correct way. By virtue of the rule (V), KPV(G)-derivations are in general proper classes. A KPV(G)-derivation is termed cut-free if the cut rule does not occur in it. It may, however, contain applications of the rules (Ax) and ($\Delta_0$-Coll), which may be viewed as a means of allowing cuts on axioms.
We introduce two quasiorderings on dilators and a fundamental-sequence-like notation for the projections of the functor SEP.
Definition 2.1: Let D, E be dilators.
1. $D \le E$ :$\Leftrightarrow$ there is a natural transformation $T: D \to E$.
2. $D \sqsubseteq E$ :$\Leftrightarrow$ there is a dilator F s.t. $D + F = E$.
3. If D is of kind $\Omega$, and $\zeta$ an ordinal, then we let $\{D\}\zeta := \mathrm{SEP}(D)(\cdot,\zeta)$.
In order to measure the size of KPV(G)-derivations we define a formal relation of majorization between derivations d in KPV(G) and dilators of the form $D = 1 + E$. The clauses of the definition of "d maj D" are motivated technically to make the asymmetric interpretation and the cut-elimination theorem work.
Definition 2.2: We define by induction on the well-founded KPV(G)-derivation trees.
1. If d is an initial sequent, we let d maj D :$\Leftrightarrow$ $1 \sqsubseteq D$.
2. If d is obtained from $d_0$ by an application of $(r\!\to)$, $(r\forall)$ or a $\Delta_0$-instance of (Ax), we let d maj D :$\Leftrightarrow$ $d_0$ maj D.
3. If d is obtained from $(d_\beta)_{\beta<\alpha}$ by an application of $(V_\alpha)$, we let d maj D :$\Leftrightarrow$ $1 \sqsubseteq D$ $\wedge$ $\forall \beta < \alpha$ ...
4. If d is obtained from $d_0$ by ... an instance of (Pair), (Union) or ($\Delta_0$-Sep), we let d maj D :$\Leftrightarrow$ $1 \sqsubseteq D$ $\wedge$ $\exists D_0$: $d_0$ maj $D_0$ $\wedge$ $1 + D_0 \le D$.
5. If d is obtained from $d_0$, $d_1$ by an application of $(l\!\to)$ or of ($\Delta_0$-Coll), we let d maj D :$\Leftrightarrow$ $1 \sqsubseteq D$ $\wedge$ $\exists D_0, D_1$: $d_0$ maj $D_0$ $\wedge$ $d_1$ maj $D_1$ $\wedge$ $D_0 + 1 + D_1 \le D$.
6. If d is obtained from $d_0$ by an application of (Ax) for (G 2), we let d maj D :$\Leftrightarrow$ $1 \sqsubseteq D$ $\wedge$ $\exists D_0$: $d_0$ maj $D_0$ $\wedge$ $1 + D_0 \le D$.
7. If d is obtained from $(d_\alpha)_{\alpha \in \mathrm{On}}$ by an application of (V), we let d maj D :$\Leftrightarrow$ $1 \sqsubseteq D$ $\wedge$ $\exists D'$ of kind $\Omega$: $\forall \alpha$: $d_\alpha$ maj $\{D'\}(\alpha+1)$ $\wedge$ $D' \le D$.
8. If d is obtained from $d_0$, $d_1$ by an application of the cut rule, we let d maj D :$\Leftrightarrow$ $1 \sqsubseteq D$ $\wedge$ $\exists D_0, D_1$: $d_0$ maj $D_0$ $\wedge$ $d_1$ maj $D_1$ $\wedge$ $D_0 + D_1 \le D$.
Lemma 2.3: d maj D $\wedge$ $D \le E$ $\wedge$ $1 \sqsubseteq E$ $\Rightarrow$ d maj E.
Proof: By induction on d.

§ 3. ASYMMETRIC INTERPRETATION OF KPV(G)
Before exhausting the reader's patience by tedious cut-elimination techniques, we go ahead with a more interesting part of our work: the asymmetric interpretation. In this section we assume to be given a total, non-decreasing function $g: \mathrm{On} \to \mathrm{On}$ satisfying $g(0) > 0$.
An asymmetric interpretation of the sequents of KPV(G) is defined relative to g as follows.
Definition 3.1: Let $\alpha, \beta \in \mathrm{On}$ with $\alpha \le \beta$, and $\vec a = a_1,\dots,a_n \in V_\alpha$.
$g \models^\beta_\alpha (\Gamma \vdash \Delta)[\vec a]$ :$\Leftrightarrow$ the sequent $\Gamma \vdash \Delta$ is satisfied under the following interpretation.
1. The free variables of $\Gamma \vdash \Delta$ are interpreted by the sets $\vec a \in V_\alpha$.
2. G is interpreted by $G_g$ (the graph of g).
3. The standard interpretation is used for $=$, $\in$, $\bot$, $\to$, and the bounded quantifier $(\forall x \in \cdot)$.
4. For any ordinal $\zeta$, $V_\zeta$ is interpreted as $V_\zeta$.
5. Unbounded universal quantifiers are interpreted to range over $V_\alpha$ where they occur positively, and to range over $V_\beta$ where they occur negatively.
$g \models^\beta_\alpha (\Gamma \vdash \Delta)$ denotes the corresponding notion of validity for all interpretations of the free variables in $V_\alpha$.
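The polarity bookkeeping in this definition can be made concrete on finite levels of the cumulative hierarchy. The following is a rough sketch under assumptions that go beyond the text: only the relation $\in$ is modelled (no G, no level predicates), $\alpha$ and $\beta$ are small natural numbers, and bounded quantifiers are simply written as unbounded ones, which gives the same truth value here because the levels are transitive. All names (`In`, `Bot`, `Imp`, `All`, `V`, `holds`, `valid`) are hypothetical.

```python
from dataclasses import dataclass
from itertools import chain, combinations, product


@dataclass(frozen=True)
class In:                # atomic formula: left in right (variable names)
    left: str
    right: str


@dataclass(frozen=True)
class Bot:
    pass


@dataclass(frozen=True)
class Imp:
    left: object
    right: object


@dataclass(frozen=True)
class All:
    var: str
    body: object


def V(n):
    """Level n of the cumulative hierarchy: V(0) is empty, V(n+1) = powerset of V(n)."""
    level = frozenset()
    for _ in range(n):
        elems = list(level)
        level = frozenset(
            frozenset(c)
            for c in chain.from_iterable(
                combinations(elems, r) for r in range(len(elems) + 1)))
    return level


def holds(f, env, alpha, beta, positive):
    """Truth of f; unbounded 'All' ranges over V(alpha) positively, V(beta) negatively."""
    if isinstance(f, In):
        return env[f.left] in env[f.right]
    if isinstance(f, Bot):
        return False
    if isinstance(f, Imp):
        # the antecedent of an implication has the opposite polarity
        return (not holds(f.left, env, alpha, beta, not positive)
                or holds(f.right, env, alpha, beta, positive))
    if isinstance(f, All):
        domain = V(alpha) if positive else V(beta)
        return all(holds(f.body, {**env, f.var: v}, alpha, beta, positive)
                   for v in domain)
    raise TypeError(f)


def valid(gamma, delta, free_vars, alpha, beta):
    """Validity of gamma |- delta for all values of the free variables in V(alpha)."""
    for values in product(V(alpha), repeat=len(free_vars)):
        env = dict(zip(free_vars, values))
        left_true = all(holds(f, env, alpha, beta, False) for f in gamma)
        right_true = any(holds(f, env, alpha, beta, True) for f in delta)
        if left_true and not right_true:
            return False
    return True


# "exists y (a in y)", written with the basic symbols as  not forall y not (a in y)
some_set_contains_a = Imp(All("y", Imp(In("a", "y"), Bot())), Bot())
```

For the sequent with empty antecedent and succedent `some_set_contains_a`, the universal quantifier sits in negative position, so the witness is sought in `V(beta)`: `valid([], [some_set_contains_a], ["a"], 2, 2)` fails, while `valid([], [some_set_contains_a], ["a"], 2, 3)` succeeds. This is the asymmetry the definition is designed to capture: parameters come from the lower level, witnesses may need a higher one.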
The principal tool in the asymmetric interpretation theorem is a hierarchy of ordinal functions $J^g_D$. This hierarchy has been defined and investigated in [Pa, § 5]. The following results are quoted from there.
Lemma 3.3:
1. For every dilator D, $J^g_D: \mathrm{On} \to \mathrm{On}$ is a total and non-decreasing function.
2. $D \le E \;\Rightarrow\; J^g_D(\alpha) \le J^g_E(\alpha)$.
3. $J^g_{D+E}(\alpha) = J^g_E(J^g_D(\alpha))$.
4. $1 \le D \;\Rightarrow\; J^g_D(\alpha) > \alpha \;\wedge\; J^g_D(\alpha) \ge g(\alpha)$.
5. $1 \le D$ and D of kind $\Omega$ $\;\Rightarrow\; J^g_D(\alpha) \ge J^g_{\{D\}(\alpha+1)}(\alpha)$.
Theorem 3.4: Asymmetric Interpretation Theorem
If d is a cut-free KPV(G)-derivation of the sequent $\Gamma \vdash \Delta$, and D is a dilator s.t. d maj D, then for every ordinal $\alpha$:
$$g \models^{J^g_D(\alpha)}_{\alpha} (\Gamma \vdash \Delta).$$
Proof: By induction on d. We distinguish cases according to which rule has been used last in d. We confine ourselves to a few crucial cases and leave the rest to the reader.
Case (V): Let d be obtained from $(d_\zeta)_{\zeta \in \mathrm{On}}$ by an application of rule (V), where $d_\zeta$ derives $V_\zeta(a),\Gamma \vdash \Delta$. For some dilator D' of kind $\Omega$ we have that $D' \le D$ and for every ordinal $\zeta$: $d_\zeta$ maj $\{D'\}(\zeta+1)$. We apply the induction hypothesis to $d_\alpha$ and obtain $g \models^{J^g_{\{D'\}(\alpha+1)}(\alpha)}_{\alpha} (V_\alpha(a),\Gamma \vdash \Delta)$. By Lemma 3.3 we have $J^g_{\{D'\}(\alpha+1)}(\alpha) \le J^g_{D'}(\alpha) \le J^g_D(\alpha)$, and so $g \models^{J^g_D(\alpha)}_{\alpha} (V_\alpha(a),\Gamma \vdash \Delta)$. The free variable a ranges over $V_\alpha$ anyway, so it follows that $g \models^{J^g_D(\alpha)}_{\alpha} (\Gamma \vdash \Delta)$.
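Case ($\Delta_0$-Coll): Let d be obtained from $d_0$ with end sequent $\Gamma \vdash \Delta,(\forall y \in a)\exists z C$, and from $d_1$ with end sequent $\exists x(\forall y \in a)(\exists z \in x)C,\Gamma \vdash \Delta$. There are dilators $D_0$ and $D_1$ s.t. $d_0$ maj $D_0$, $d_1$ maj $D_1$, and $D_0 + 1 + D_1 \le D$. We let $\beta := J^g_{D_0}(\alpha)$ and $\gamma := J^g_{D_1}(\beta+1)$, and for contradiction we assume to be given $\vec a \in V_\alpha$ s.t. $g \models^\gamma_\alpha (\Gamma \vdash \Delta)[\vec a]$ is not satisfied. By induction hypothesis for $d_0$ we see that $(\forall y \in a)(\exists z \in V_\beta)C[\vec a]$ is true (recall that C is a $\Delta_0$-formula of KP(G)). By virtue of $V_\beta \in V_{\beta+1}$ it follows that $(\exists x \in V_{\beta+1})(\forall y \in a)(\exists z \in x)C[\vec a]$ is true. By induction hypothesis for $d_1$ we obtain validity of $g \models^\gamma_{\beta+1} (\exists x(\forall y \in a)(\exists z \in x)C,\Gamma \vdash \Delta)$, and so $g \models^\gamma_\alpha (\Gamma \vdash \Delta)[\vec a]$ is satisfied. Since $\alpha \le \beta+1 \le \gamma$, we have met a contradiction. By Lemma 3.3 we obtain
$$\gamma = J^g_{D_1}(\beta+1) \le J^g_{D_1}(J^g_1(\beta)) = J^g_{D_1}(J^g_1(J^g_{D_0}(\alpha))) = J^g_{D_0+1+D_1}(\alpha) \le J^g_D(\alpha),$$
and hence the claim follows.
Case (Ax): We have to distinguish subcases according to which instance of the axiom rule we meet. The instances of (Eq), (Ext) and (G 1) are valid $\Delta_0$-formulas of KP(G), so the claim follows immediately from the induction hypothesis. For the instances of (Pair), (Union) and ($\Delta_0$-Sep), d is obtained from $d_0$ with end sequent $\exists z C(z),\Gamma \vdash \Delta$, where C(b) is a $\Delta_0$-formula of KP(G). For some dilator $D_0$ with $d_0$ maj $D_0$ and $1 + D_0 \le D$ we obtain by induction hypothesis that $g \models^\beta_{\alpha+1} (\exists z C(z),\Gamma \vdash \Delta)[\vec a]$ is satisfied for arbitrary $\vec a \in V_\alpha$ and $\beta := J^g_{D_0}(\alpha+1)$. By virtue of the content of the axioms, $(\exists z \in V_{\alpha+1})C(z)[\vec a]$ is true, and so $g \models^\beta_\alpha (\Gamma \vdash \Delta)[\vec a]$ is satisfied. Now the proof is completed by observing that
$$\beta = J^g_{D_0}(\alpha+1) \le J^g_{D_0}(J^g_1(\alpha)) = J^g_{1+D_0}(\alpha) \le J^g_D(\alpha).$$
It remains to look at instances of (G 2): d is obtained from $d_0$ with end sequent $(\mathrm{Ord}(a) \to \exists y(\mathrm{Ord}(y) \wedge G(a,y))),\Gamma \vdash \Delta$, and there is a dilator $D_0$ s.t. $d_0$ maj $D_0$ and $1 + D_0 \le D$. We have that $\beta := J^g_2(\alpha) = J^g_1(J^g_1(\alpha)) \ge J^g_1(g(\alpha)) \ge g(\alpha)+1$, and by induction hypothesis $g \models^\gamma_\beta ((\mathrm{Ord}(a) \to \exists y(\mathrm{Ord}(y) \wedge G(a,y))),\Gamma \vdash \Delta)[\vec a]$ is satisfied for arbitrary $\vec a \in V_\alpha$ and $\gamma := J^g_{D_0}(\beta)$. If a is interpreted by the ordinal $\delta \in V_\alpha$, then $g(\delta) \le g(\alpha)$, so $g(\delta) \in V_{g(\alpha)+1} \subseteq V_\beta$, and hence $(\exists y \in V_\beta)(\mathrm{Ord}(y) \wedge G(a,y))[\vec a]$ is satisfied. It follows that $g \models^\gamma_\alpha (\Gamma \vdash \Delta)$ is valid. The proof is completed by noting that
$$\gamma = J^g_{D_0}(\beta) = J^g_{D_0}(J^g_2(\alpha)) = J^g_{2+D_0}(\alpha) \le J^g_D(\alpha).$$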

§ 4. EMBEDDING OF KP(G) IN KPV(G)
Lemma 4.1:
1. For every formula A of KP(G) there is a cut-free KPV(G)-derivation d of $A,\Gamma \vdash \Delta,A$ s.t. for some finite k: d maj k.
2. If A is the universal closure of an instance of one of the axiom schemes (Eq), (Ext), (Pair), (Union), ($\Delta_0$-Sep), ($\Delta_0$-Coll), (G 1) or (G 2), then there is a cut-free KPV(G)-derivation d of $\Gamma \vdash \Delta,A$ s.t. for some finite k: d maj k.
Proof: 1. is proved as usual by induction on A, and 2. follows from 1. using the rules (Ax) and ($\Delta_0$-Coll). In either case finite derivations (without applications of $(V_\alpha)$ or (V)) are obtained. It follows easily from Definition 2.2 that such derivations can be majorized by a constant dilator k with $1 \le k < \omega$.
Lemma 4.2: Let A(a) be a formula of KP(G), and let $\mathrm{Prog}(A) := \forall x\,((\forall y \in x)A(y) \to A(x))$.
1. There is a finite k s.t. for every ordinal $\alpha$ there is a cut-free KPV(G)-derivation $d_\alpha$ of $V_\alpha(a),\mathrm{Prog}(A),\Gamma \vdash \Delta,A(a)$ satisfying $d_\alpha$ maj $(k+5)\cdot 2^\alpha$.
2. There is a cut-free KPV(G)-derivation d of $\Gamma \vdash \Delta,\forall y\,(\mathrm{Prog}(A) \to \forall x A(x))$ s.t. for some finite k: d maj $k \cdot 2^{\mathrm{Id}}$.
Proof: 1. We choose k s.t. k majorizes a KPV(G)-derivation of $A(a) \vdash A(a)$, and for it we prove the claim by induction on $\alpha$. In a fragmentary writing the induction step looks as follows, read from top to bottom.
$d_\beta$: $V_\beta(b),\mathrm{Prog}(A) \vdash A(b)$
$b \in a \to V_\beta(b),\mathrm{Prog}(A) \vdash b \in a \to A(b)$
$(\forall y \in a)V_\beta(y),\mathrm{Prog}(A) \vdash (\forall y \in a)A(y)$ and $A(a) \vdash A(a)$
$(\forall y \in a)V_\beta(y),\mathrm{Prog}(A),(\forall y \in a)A(y) \to A(a) \vdash A(a)$
$(\forall y \in a)V_\beta(y),\mathrm{Prog}(A) \vdash A(a)$ (all $\beta < \alpha$)
$V_\alpha(a),\mathrm{Prog}(A) \vdash A(a)$
Now one has to apply the induction hypothesis to $d_\beta$ and to do some calculations according to Definition 2.2. For the $\beta$-th premiss of the end sequent one obtains $4+(k+5)\cdot 2^\beta+(k+1)$ as a majorizing dilator, and so the claim follows by
$$4+(k+5)\cdot 2^\beta+k+1 \le (k+5)\cdot(2^\beta+1) \le (k+5)\cdot 2^\alpha.$$
2. Using the above derivations $d_\alpha$ we obtain the following, again read from top to bottom.
$d_\alpha$: $V_\alpha(a),\mathrm{Prog}(A) \vdash A(a)$ (all $\alpha \in \mathrm{On}$)
$\mathrm{Prog}(A) \vdash A(a)$
$\vdash \forall y\,(\mathrm{Prog}(A) \to \forall x A(x))$
With the help of [Gi 2, § 3.6, pp. 136-139] one can directly compute $\mathrm{SEP}(k \cdot 2^{\mathrm{Id}})$ and obtains $\{k \cdot 2^{\mathrm{Id}}\}(\alpha+1) = k \cdot 2^{\alpha+1} \ge k \cdot 2^\alpha$. Hence by 1. and Definition 2.2 the claim follows.
Theorem 4.3: For every sequent $\Gamma \vdash \Delta$ derivable in KP(G) there is a KPV(G)-derivation d s.t. d contains only finitely many applications of the cut rule and d maj $k \cdot 2^{\mathrm{Id}} \cdot r$ for some finite k, r.
Proof: By virtue of cut-elimination for first order logic there is a purely logical, cut-free derivation $d_0$ of $A_1,\dots,A_n,\Gamma \vdash \Delta$, where $A_1,\dots,A_n$ are universal closures of instances of the axiom schemes of KP(G). $d_0$ is a finite KPV(G)-derivation with $d_0$ maj $k_0$ for some finite $k_0$. By the previous lemmas there are cut-free KPV(G)-derivations $d_i$ of $\Gamma \vdash \Delta,A_i$ s.t. $d_i$ maj $k_i$ or $d_i$ maj $k_i \cdot 2^{\mathrm{Id}}$ for suitable finite $k_i$. n applications of the cut rule yield the desired derivation d, and d maj $k \cdot 2^{\mathrm{Id}} \cdot r$ for $k := \max(k_0,\dots,k_n)$.

§ 5. CUT-ELIMINATION FOR KPV(G)
In order to prove the cut-elimination theorem for KPV(G) we follow the technique of [Sch, pp. 871-883]. Usually one appeals to symmetry utilizing the commutativity of the 'natural sum' or 'natural product' of ordinals. Since we work with dilators, this is not possible here. We modify the usual technique by observing that it suffices to 'move upwards' the cut formula only in the right upper sequent of the cut rule. This works by virtue of an inversion lemma for both (r→) and (l→), and due to the asymmetry implicit in the absence of the existential quantifier as a primitive symbol. We apply the usual notions of logical complexity of a formula and of cut-rank of a derivation (see e.g. [Sch, p. 873]). We omit proofs, which are just routine adaptations of usual techniques.
Weakening Lemma: If $d$ is a KPV(G)-derivation of $\Gamma \vdash \Delta$ and $D$ a dilator majorizing $d$, then there is a KPV(G)-derivation $d'$ of $\Gamma,\Gamma' \vdash \Delta,\Delta'$ with the same cut-rank and majorizing dilator $D$.
Inversion Lemma: Let $d$ be a KPV(G)-derivation with cut-rank $\le n$, $D$ a dilator majorizing $d$, and $b$ an arbitrary free variable. Then there are KPV(G)-derivations $d_0$, $d_1$, and $d_b$ with cut-rank $\le n$ and majorizing dilator $D$ satisfying the following.
1. $\Gamma \vdash^{d} \Delta, A\to B \;\Longrightarrow\; A,\Gamma \vdash^{d_0} \Delta, B$
2. $A\to B,\Gamma \vdash^{d} \Delta \;\Longrightarrow\; \Gamma \vdash^{d_0} \Delta, A$ and $B,\Gamma \vdash^{d_1} \Delta$
3. $\Gamma \vdash^{d} \Delta, \bot \;\Longrightarrow\; \Gamma \vdash^{d_0} \Delta$
225
Kripke-Platek Set Theory
4. $\Gamma \vdash^{d} \Delta, \forall x A(x) \;\Longrightarrow\; \Gamma \vdash^{d_b} \Delta, A(b)$
Reduction Lemma: Let $d$ and $d'$ be KPV(G)-derivations of $\Gamma \vdash \Delta, C$ and $C,\Gamma \vdash \Delta$ respectively with cut-rank $\le n$, $C$ a formula of KP(G) of complexity $n$, $D$ and $D'$ dilators majorizing $d$ and $d'$ respectively. Then there is a KPV(G)-derivation $\rho(d,d')$ of $\Gamma \vdash \Delta$ with cut-rank $\le n$ and $\rho(d,d')$ maj $D\cdot D'$.
Proof: By induction on $d'$.
Case 1: $C$ is the principal formula of the last rule of $d'$.
Case 1.1: $C$ atomic: We let $\rho(d,d') := d$. (In case $C = \bot$, apply 3. of the Inversion Lemma.) $1 \le D'$ entails $D \le D\cdot D'$.
Case 1.2: $C = \forall x A(x)$: Let $d'_0$ be the subderivation of $d'$ deriving the upper sequent $A(b), \forall x A(x), \Gamma \vdash \Delta$ and $D'_0$ a dilator s.t. $d'_0$ maj $D'_0$ and $1+D'_0 \le D'$. By induction hypothesis $\rho(d,d'_0)$ derives $A(b),\Gamma \vdash \Delta$ and $\rho(d,d'_0)$ maj $D\cdot D'_0$. Let $d_b$ be obtained from $d$ by 4. of the Inversion Lemma. An application of the cut rule to $d_b$ and $\rho(d,d'_0)$ with cut formula $A(b)$ yields the desired $\rho(d,d')$. We have $\rho(d,d')$ maj $D + D\cdot D'_0$ and $D + D\cdot D'_0 = D\cdot(1+D'_0) \le D\cdot D'$, and hence q.e.d.
Case 1.3: $C = A\to B$: Let $d'_0$ and $d'_1$ be the subderivations of $d'$ deriving the upper sequents $A\to B,\Gamma \vdash \Delta, A$ and $B, A\to B, \Gamma \vdash \Delta$ respectively, and $D'_0$ and $D'_1$ be dilators s.t. $d'_0$ maj $D'_0$, $d'_1$ maj $D'_1$, and $D'_0+1+D'_1 \le D'$. By applying the Inversion Lemma to $d$, $d'_0$, and $d'_1$ we obtain derivations $d^*$, $d''_0$, and $d''_1$ of $A,\Gamma \vdash \Delta, B$, $\Gamma \vdash \Delta, A$, and $B,\Gamma \vdash \Delta$ respectively. Two applications of the cut rule with cut formulas $A$ and $B$ yield the desired $\rho(d,d')$. We have $\rho(d,d')$ maj $D'_0+D+D'_1$ and $D'_0+D+D'_1 \le D\cdot D'_0 + D + D\cdot D'_1 = D\cdot(D'_0+1+D'_1) \le D\cdot D'$, and hence q.e.d.
Case 2: $C$ is a side formula of the last rule of $d'$. Let $(d'_\beta)_{\beta<\alpha}$ be the immediate subderivations of $d'$. The induction hypothesis yields derivations $\rho(d,d'_\beta)$, from which the desired $\rho(d,d')$ is defined by applying the same last rule as in $d'$. In order to verify the claim about majorization, we must distinguish various subcases. We confine ourselves to the case that rule (V) is applied last in $d'$ (whence $\alpha = \mathrm{On}$). Then there is a dilator $D''$ of kind $\Omega$ s.t. $D'' \le D'$ and for all ordinals $\beta$: $d'_\beta$ maj $\{D''\}(\beta+1)$. Induction hypothesis yields $\rho(d,d'_\beta)$ maj $D\cdot\{D''\}(\beta+1)$. By Lemma 5.1 below, $D\cdot D''$ is of kind $\Omega$, and $D\cdot\{D''\}(\beta+1) \le \{D\cdot D''\}(\beta+1)$. $D\cdot D'' \le D\cdot D'$ finally gives that $\rho(d,d')$ maj $D\cdot D'$.
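The majorization estimate of Case 1.2 can be unfolded step by step. The following is only a sketch in LaTeX notation. It assumes that sums and products of dilators are computed pointwise from ordinal arithmetic (so that multiplication distributes over addition from the left and $D\cdot 1 = D$), that multiplication from the left preserves the existence of natural transformations, and it writes $\le$ for "there is a natural transformation" and $\rho(d,d')$ for the derivation constructed in the Reduction Lemma.

```latex
\begin{align*}
\rho(d,d') \ \text{maj}\ & D + D\cdot D'_0
    && \text{cut rule applied to } d_b \text{ and } \rho(d,d'_0) \\
  ={} & D\cdot 1 + D\cdot D'_0
    && D\cdot 1 = D \\
  ={} & D\cdot(1 + D'_0)
    && \text{left distributivity} \\
  \le{} & D\cdot D'
    && 1 + D'_0 \le D' \text{ (assumed monotonicity in the right factor)}
\end{align*}
```

Case 1.3 follows the same pattern with $D'_0 + 1 + D'_1$ in place of $1 + D'_0$, after first enlarging $D'_0$ and $D'_1$ to $D\cdot D'_0$ and $D\cdot D'_1$.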
In the proof of the Reduction Lemma just given, we had to know how the fundamental sequences for dilators of kind $\Omega$ are compatible with multiplication. Simple laws like $\{F\cdot G\}\zeta = F\cdot\{G\}\zeta$ would be desirable, but unfortunately in general these do not hold (see [Gi 2, Ex. 3.3.9, p. 125f.]). We must content ourselves with the inequality $F\cdot\{G\}\zeta \le \{F\cdot G\}\zeta$. We generalize this by replacing multiplication by an arbitrary bilator $\otimes$, which we
assume to be fixed in the following. We use the terminology and notations of [Gi 2]. If $F$ and $G$ are dilators, the dilator $F\otimes G$ is defined by $(F\otimes G)(\alpha) := \otimes(F(\alpha),G(\alpha))$ and $(F\otimes G)(f) := \otimes(F(f),G(f))$. If $B$ is a bilator, the two-place dilator $F\otimes B$ is defined by $(F\otimes B)(\alpha,\beta) := \otimes(F(\alpha),B(\alpha,\beta))$ and $(F\otimes B)(f,g) := \otimes(F(f),B(f,g))$. An important invariant associated with $\otimes$ is $n(\otimes)$, which we define to be the least natural number $n$ s.t. the flower $\otimes(n,-)$ is non-constant. (Such an integer exists for every bilator $\otimes$.) Now we are ready for a lemma.
Lemma 5.1: Let $F$ be a dilator with $F \ge n(\otimes)$, $G$ a dilator of kind $\Omega$, and $B$ a bilator.
1. $F\otimes G$ is a dilator of kind $\Omega$, and $F\otimes B$ is a bilator.
2. $\mathrm{UN}(F\otimes B) \le F\otimes \mathrm{UN}(B)$.
3. For every ordinal $\zeta$: $F\otimes\{G\}\zeta \le \{F\otimes G\}\zeta$.
Proof: 1. We leave it to the reader to verify that $F\otimes B$ is a bilator. For the other claim it is useful to observe that the criterion for connectedness in [Gi 2, Prop. 3.3.1, p. 123] entails the following criterion for dilators of kind $\Omega$.
(*) $G$ is of kind $\Omega$ iff $G(\omega)$ is a limit ordinal and for some $\gamma < G(\omega)$ the set $\{G(f)(\gamma) \mid f \in I(\omega,\omega)\}$ is cofinal in $G(\omega)$.
$G$ is of kind $\Omega$, so we assume to be given an ordinal $\gamma$ according to (*). For $\delta := F(\omega)\otimes\gamma$ we have that $(F\otimes G)(f)(\delta) = (F(f)\otimes G(f))(F(\omega)\otimes\gamma) \ge F(\omega)\otimes G(f)(\gamma)$. (The inequality follows from properties of bilators analogous to [Gi 2, Remark 2.4.10, p. 116].) Hence $\{(F\otimes G)(f)(\delta) \mid f \in I(\omega,\omega)\}$ is cofinal in $(F\otimes G)(\omega)$, and by (*) $F\otimes G$ is of kind $\Omega$.
For an arbitrary ordinal a we define a map Ta:UN(F®B)(a)
as follows. Let (F ® B)(a,a)
0' < UN(F ® B)(a)
+
(F®UN(B»(a)
be given. 0' comes from a point
0 <
whose coefficients in the denotation w.r.t. F ® B are obtai-
ned as follows. We look at the denotation of 0 w.r.t. ® . 0= (0 0 ; 1 ; 0 " " ,1;n-l ;F(a) ;11 0 " " ,11 m_ 1 )® < F(a) ® B(a,a)
and for
i
r.t. B:
Si
_
i
i)
(si;so.···.Sn._l;a F < F(a) .
.1
(n.;sJ •••• ,sJ J
0
r j-
The coefficients of 6 w.r.t.
.
• and
1
~.
J
w.
.
l;a;~J •••. ,~J
qF l)B
0
< B(a.a)
F ® B now consist of all the
i
sk
and
sj k
for the left argument place. and all the 0~ for the right argument place. The fact that 6 generates a point in UN(F ® B)(a) means in terms of these coefficients that any of the From this it follows that UN(B)(a) . Now we define
st
and
00 ••.• '~m-l
s~ is greater than any of the generate points
0~""'~~_1
in
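$T_\alpha(\delta') := (\delta_0;\xi_0,\dots,\xi_{n-1};F(\alpha);\eta'_0,\dots,\eta'_{m-1})_\otimes < F(\alpha)\otimes \mathrm{UN}(B)(\alpha)$.
One of the seemingly endless verifications of this subject yields that the family $(T_\alpha)_{\alpha\in\mathrm{On}}$ is a natural transformation as required.
3. For $B := \mathrm{SEP}(G)$ we obtain by 2. a natural transformation $T: \mathrm{UN}(F\otimes B) \to F\otimes \mathrm{UN}(B) = F\otimes G$. An application of the functor SEP gives us a natural transformation $\mathrm{SEP}(T): F\otimes \mathrm{SEP}(G) = F\otimes B = \mathrm{SEP}(\mathrm{UN}(F\otimes B)) \to \mathrm{SEP}(F\otimes G)$. And for any fixed ordinal $\zeta$ we finally obtain a natural transformation
$\mathrm{SEP}(T)(\cdot,\zeta): F\otimes\{G\}\zeta = F\otimes(\mathrm{SEP}(G)(\cdot,\zeta)) = (F\otimes \mathrm{SEP}(G))(\cdot,\zeta) \to \mathrm{SEP}(F\otimes G)(\cdot,\zeta) = \{F\otimes G\}\zeta$.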
Moreover we need the following technical lemma.
Lemma 5.2: For dilators $F$, $G$ the following hold.
1. $(2+F) + (2+G) \le (2+F)\cdot(2+G)$
2. $(2+\mathrm{Id})^{1+F} + (2+\mathrm{Id})^{1+G} \le (2+\mathrm{Id})^{1+F+1+G}$
3. $F \le (2+\mathrm{Id})^{F}$
4. $(2+\mathrm{Id})_{(n)} + (2+\mathrm{Id})_{(n)} \le (2+\mathrm{Id})_{(n+1)}$
5. $(2+\mathrm{Id})_{(n)} \cdot (2+\mathrm{Id})_{(n)} \le (2+\mathrm{Id})_{(n+1)}$
Proof: 1. We define $T_{\alpha,\beta}: 2+\alpha+2+\beta \to (2+\alpha)\cdot(2+\beta)$ as follows.
$T_{\alpha,\beta}(\gamma) := \gamma$, if $\gamma < 2+\alpha+2$
$T_{\alpha,\beta}(2+\alpha+2+\delta) := (2+\alpha)\cdot(2+\delta)$, if $\delta < \beta$
It is easy to verify that the family $(T_{\alpha,\beta})_{(\alpha,\beta)\in\mathrm{On}\times\mathrm{On}}$ is a natural transformation, and hence the family $(T_{F(\alpha),G(\alpha)})_{\alpha\in\mathrm{On}}$ is a natural transformation from $(2+F) + (2+G)$ to $(2+F)\cdot(2+G)$ as required.
2. In order to apply 1. we must show that $2 \le (2+\mathrm{Id})^{1+F}$. By [Pa, Lemma ~2.2] there is a dilator $F'$ s.t. $(2+\mathrm{Id})^{F} = 1+F'$. It follows that $(2+\mathrm{Id})^{1+F} = (2+\mathrm{Id})^{1}\cdot(2+\mathrm{Id})^{F} = (2+\mathrm{Id})\cdot(1+F') = 2+\mathrm{Id}+(2+\mathrm{Id})\cdot F'$, hence q.e.d.
3. $T_\alpha(\gamma) := (2+\alpha)^{\gamma}$ yields a natural transformation as required.
4. By induction on $n$. In the induction step apply 2.
5. Trivial for $n=0$, and for $n>0$ apply 4.
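To see the map from part 1 of the proof at work, here is a small finite instance. It is worked out under the reading of the definition given there ($T_{\alpha,\beta}(\gamma) = \gamma$ for $\gamma < 2+\alpha+2$, and $T_{\alpha,\beta}(2+\alpha+2+\delta) = (2+\alpha)\cdot(2+\delta)$ for $\delta<\beta$); the choice $\alpha = 1$, $\beta = 2$ is purely illustrative.

```latex
% alpha = 1, beta = 2:  domain 2+1+2+2 = 7,  codomain (2+1)(2+2) = 12
\[
\begin{array}{c|ccccccc}
\gamma          & 0 & 1 & 2 & 3 & 4 & 5 & 6 \\ \hline
T_{1,2}(\gamma) & 0 & 1 & 2 & 3 & 4 & 6 & 9
\end{array}
\]
% gamma < 2+alpha+2 = 5 is left fixed;
% T(5) = T(2+1+2+0) = 3 \cdot 2 = 6,   T(6) = T(2+1+2+1) = 3 \cdot 3 = 9.
```

The values are strictly increasing and stay below $12$, as an embedding of $2+\alpha+2+\beta$ into $(2+\alpha)\cdot(2+\beta)$ must.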
Cut-elimination Lemma: Let $d$ be a KPV(G)-derivation with finite cut-rank $n+1$ and majorizing dilator $D$. Then there is a KPV(G)-derivation $\varepsilon(d)$ of the same end sequent with cut-rank $\le n$ and $\varepsilon(d)$ maj $(2+\mathrm{Id})^{D}$.
Proof: By induction on $d$.
Case 1: $d$ is obtained from $d_0$ and $d_1$ by an application of the cut rule. There are dilators $D_0$ and $D_1$ s.t. $d_0$ maj $D_0$, $d_1$ maj $D_1$, and $D_0+D_1 \le D$. By induction hypothesis we obtain $\varepsilon(d_0)$ and $\varepsilon(d_1)$ with cut-rank $\le n$ and for $i=0,1$: $\varepsilon(d_i)$ maj $(2+\mathrm{Id})^{D_i}$. By an application of the Reduction Lemma we get $\varepsilon(d) := \rho(\varepsilon(d_0),\varepsilon(d_1))$ with cut-rank $\le n$ and $\varepsilon(d)$ maj $(2+\mathrm{Id})^{D_0}\cdot(2+\mathrm{Id})^{D_1}$. Now the claim follows since $(2+\mathrm{Id})^{D_0}\cdot(2+\mathrm{Id})^{D_1} = (2+\mathrm{Id})^{D_0+D_1} \le (2+\mathrm{Id})^{D}$.
Case 2: $d$ is obtained from $(d_\alpha)_{\alpha\in\mathrm{On}}$ by an application of rule (V). There is a dilator $D'$ of kind $\Omega$ s.t. $D' \le D$ and for every ordinal $\alpha$: $d_\alpha$ maj $\{D'\}(\alpha+1)$. By induction hypothesis we obtain $\varepsilon(d_\alpha)$ with cut-rank $\le n$ and $\varepsilon(d_\alpha)$ maj $(2+\mathrm{Id})^{\{D'\}(\alpha+1)}$. We let $\varepsilon(d)$ be obtained by an application of rule (V) to $(\varepsilon(d_\alpha))_{\alpha\in\mathrm{On}}$. From Lemma 5.1 it follows that $(2+\mathrm{Id})^{\{D'\}(\alpha+1)} \le \{(2+\mathrm{Id})^{D'}\}(\alpha+1)$, and together with $(2+\mathrm{Id})^{D'} \le (2+\mathrm{Id})^{D}$ we see that $\varepsilon(d)$ maj $(2+\mathrm{Id})^{D}$.
Using Lemma 5.2 the remaining cases follow straightforwardly from induction hypothesis.
Finite iteration of this lemma leads to the following normal form theorem.
Cut-elimination Theorem: Let $d$ be a KPV(G)-derivation with finite cut-rank $n$ and majorizing dilator $D$. Then there is a cut-free KPV(G)-derivation $\nu(d)$ of the same end sequent with $\nu(d)$ maj $(2+\mathrm{Id})^{\cdot^{\cdot^{\cdot^{(2+\mathrm{Id})^{D}}}}}$, where the base $(2+\mathrm{Id})$ occurs $n$ times.
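The phrase "finite iteration" can be made explicit. The sketch below assumes the reading of the Cut-elimination Lemma in which one application of the operation on derivations (written $\varepsilon$ here) replaces a majorizing dilator $D$ by $(2+\mathrm{Id})^{D}$; the name $E_k(D)$ for the $k$-fold exponential tower is introduced only for this illustration and does not occur in the source.

```latex
\begin{align*}
E_0(D) &:= D, & E_{k+1}(D) &:= (2+\mathrm{Id})^{E_k(D)}, \\
\nu(d) &:= \underbrace{\varepsilon(\cdots\varepsilon}_{n\ \text{times}}(d)\cdots),
  & \nu(d)\ &\text{maj}\ E_n(D).
\end{align*}
% Each application of \varepsilon lowers the cut-rank by one
% (from k+1 to at most k) and adds one exponential to the bound,
% so n applications take a derivation of cut-rank n to cut-rank 0.
```

Under the usual convention that cut-rank 0 means that no cut occurs, $\nu(d)$ is cut-free.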
Theorem 5.3: For every sequent derivable in KP(G) there is a cut-free KPV(G)-derivation $d$ s.t. for some finite $n$: $d$ maj $(2+\mathrm{Id})_{(n)}$.
Proof: We combine Theorem 4.3 with the Cut-elimination Theorem. Moreover using Lemma 5.2 it is not very hard to prove that for finite $k$, $r$ there is a finite $n$ s.t. $k\cdot 2^{\mathrm{Id}\cdot r} \le (2+\mathrm{Id})_{(n)}$.
§ 6. $\Pi_2$-MODELS AND $\Pi_2$-ORDINALS
We are now in the position to verify the claims made in § 1. First we observe that our principal boundedness result (Theorem 1.1) follows from Theorem 5.3 together with the Asymmetric Interpretation Theorem (Theorem 3.4). In Corollary 1.2 we have interpreted this result as a closure property of certain levels $V_\kappa$ of the cumulative hierarchy. Any $\kappa$ closed under all $\beth^g_{(2+\mathrm{Id})_{(n)}}$ ($n<\omega$) must be a limit ordinal by virtue of Lemma 3.3.4. With this observation, Corollary 1.2 follows immediately from Theorem 1.1. Another way to interpret Theorem 1.1 is to draw from it bounds for the functions, which are $\Sigma_1^g$-definable provably in KP(G).
Corollary 6.1:
1. If $f: V \to V$ is $\Sigma_1^g$-definable provably in KP(G), then for some finite $n$ and every set $a$: $\mathrm{rank}(f(a)) < \beth^g_{(2+\mathrm{Id})_{(n)}}(\mathrm{rank}(a))$.
2. If $f: \mathrm{On} \to \mathrm{On}$ is $\Sigma_1^g$-definable provably in KP(G), then for some finite $n$ and every ordinal $\alpha$: $f(\alpha) < \beth^g_{(2+\mathrm{Id})_{(n)}}(\alpha)$.
3. If the ordinal $\alpha$ is $\Sigma_1^g$-definable provably in KP(G), then for some finite $n$: $\alpha < \beth^g_{(2+\mathrm{Id})_{(n)}}(0)$.
Proof: 1. Let $F[a,b]$ be a $\Sigma_1$-definition of $f$ with KP(G) $\vdash \forall x \exists! y F[x,y]$. Choose $n$ according to Theorem 1.1, and for an arbitrary set $a$ let $\alpha := \mathrm{rank}(a)$. Then $a \in V_{\alpha+1}$, and so by Theorem 1.1: $f(a) \in V_{\beth^g_{(2+\mathrm{Id})_{(n)}}(\alpha+1)}$. It follows that $\mathrm{rank}(f(a)) < \beth^g_{(2+\mathrm{Id})_{(n)}}(\alpha+1) \le \beth^g_{(2+\mathrm{Id})_{(n+1)}}(\alpha)$.
2. Analogous to 1.
3. By applying 2. to the constant function with value $\alpha$.
As explained in § 1, we have by our results determined $\kappa(g) := \beth^g_{(2+\mathrm{Id})_{(\omega)}}(0)$ to be the least ordinal s.t. $(V_{\kappa(g)},g)$ is a $\Pi_2$-model of KP(G).
A little care is needed, if we want to apply our results to specific ordinal functions $g: \mathrm{On} \to \mathrm{On}$ according to our remarks in § 1. The idea is to replace the relation constant G by a $\Sigma_1$-formula of KP defining the graph of $g$. The axiom (G 1) thereby becomes a $\Pi_1$-formula, which can be replaced by an open $\Delta_0$-formula, and so there is no problem for the asymmetric interpretation. The asymmetric interpretation of the axiom rule for (G 2), however, has to be checked for each individual case. In practice it is sometimes more convenient to replace (G 1) and (G 2) by some other equivalent axiom, whose asymmetric interpretation is easier to verify.
In order to obtain a theory equivalent to Jäger's KPu of [Ja 1], for example, we can replace G by a formula defining the graph of $g = \lambda\alpha.\omega$. In this case we are lucky, since we can choose a $\Delta_0$-formula for this purpose. Alternatively we can define KPu to be KP extended by the axiom of infinity: $\exists x\,\mathrm{Lim}(x)$, where Lim(c) is the usual $\Delta_0$-formula of KP expressing that c is a limit ordinal. $g = \lambda\alpha.\omega$ and $g = \lambda\alpha.(1+\alpha)\cdot\omega$ are provably $\Sigma_1$-definable in this version of KPu. In either approach the asymmetric interpretation works for both choices of $g$ by virtue of $\omega \in V_{\omega+1}$. So all our results hold for KPu, in particular $\kappa(\lambda\alpha.\omega) = \kappa(\lambda\alpha.(1+\alpha)\cdot\omega)$ = the Bachmann-Howard ordinal (see [Pa]) is the least ordinal $\kappa$ s.t. $V_\kappa$ is a $\Pi_2$-model of KPu.
In order to find the $\Pi_2$-ordinal in the sense of [Ja 2], we have to modify the asymmetric interpretation to refer to the levels of the constructible hierarchy. A careful checking of the proof of the Asymmetric Interpretation Theorem reveals that the cumulative hierarchy could be replaced by any hierarchy $H: \mathrm{On} \to V$ satisfying the following conditions.
(a) All $H_\alpha$ are transitive, and $\alpha < \beta \Longrightarrow H_\alpha \subseteq H_\beta$
(b) $a \in H_\beta \Longrightarrow \exists \alpha < \beta: a \subseteq H_\alpha$
(c) $a,b \in H_\alpha \Longrightarrow \{a,b\} \in H_{\alpha+1}$ and $\bigcup a \in H_{\alpha+1}$
(d) $a,\vec{b} \in H_\alpha$, $D[y,\vec{b}]$ $\Delta_0$-formula of KP(G) $\Longrightarrow \{y \in a \mid D[y,\vec{b}]\} \in H_{\alpha+1}$
(e) $H_\alpha \cap \mathrm{On} = \alpha$
For the general case of KP(G) we have no asymmetric interpretation into the levels of the constructible hierarchy L, since condition (d) fails for this language. KPu, however, is formulated in the language of KP, and so the Asymmetric Interpretation Theorem works, if the predicate constants $V^\alpha$ are interpreted by $L_\alpha$, the corresponding level of the constructible hierarchy. We arrive at a variant of Theorem 1.1 and its corollaries for the theory KPu. In particular it follows that $\kappa(\lambda\alpha.\omega)$ is the least ordinal $\kappa$ s.t. $L_\kappa$ is a $\Pi_2$-model of KPu, i.e.: the Bachmann-Howard ordinal is the $\Pi_2$-ordinal of KPu.
In order to obtain a theory equivalent to the theory KPi of Jäger and Pohlers in [JP], we proceed as follows. We observe that there is a $\Delta_0$-formula Ad(c) of KP expressing that c is an admissible set. By Richter and Aczel [RA, Thm. 2.4, p. 315f.] there is a $\Pi_3$-sentence $\mathrm{Ax}_{KP}$ of KP s.t. for every non-empty, transitive set M: M is admissible iff $(M,\in) \models \mathrm{Ax}_{KP}$. We choose a $\Delta_0$-formula Tran(c) expressing that c is a non-empty, transitive set, and let $(\mathrm{Ax}_{KP})^{(c)}$ be obtained from $\mathrm{Ax}_{KP}$ by bounding to c all unbounded quantifiers in $\mathrm{Ax}_{KP}$ (see [Bw, p. 15]). Then $\mathrm{Ad}(c) := \mathrm{Tran}(c) \wedge (\mathrm{Ax}_{KP})^{(c)}$ serves our needs. By a lengthy, but easy verification one can see that for every sentence B, which is an instance of an axiom of KP, $\mathrm{Ad}(c) \to B^{(c)}$ is derivable using only the axioms of equality and extensionality and the scheme of foundation.
Now we define KPi to be KP extended by the following axiom of inaccessibility: $\exists y(\mathrm{Ad}(y) \wedge a \in y)$. $g = \lambda\alpha.\alpha^{+}$ is provably $\Sigma_1$-definable in KPi. The asymmetric interpretation presents no difficulty, if we choose to interpret into the constructible hierarchy. Namely for $a \in L_\alpha$ we have that $a \in L_{\alpha^+}$, $L_{\alpha^+}$ is admissible and $L_{\alpha^+} \in L_{\alpha^+ +1}$, and so $(\exists y \in L_{\alpha^+ +1})(\mathrm{Ad}(y) \wedge a \in y)$ is true. This is essentially what is needed to verify the asymmetric interpretation for the axiom rule in the case of the axiom of inaccessibility. As result we obtain a variant of Theorem 1.1 and its corollaries for the theory KPi. In particular it follows that $\kappa(\lambda\alpha.\alpha^{+})$ is the $\Pi_2$-ordinal of KPi.
REFERENCES
[Bw] Barwise, J.: Admissible Sets and Structures. An Approach to Definability Theory, Berlin-Heidelberg-New York 1975
[Fe] Ferbus, M.-C.: Functorial bounds for cut elimination in $L_{\beta\omega}$ I, in: Arch. math. Logik 24 (1984), 141-158
[Gi 1] Girard, J.-Y.: A survey of $\Pi^1_2$-logic, in: Cohen, L.J., Łoś, J., Pfeiffer, H., Podewski, K.-P. (eds.): Logic, Methodology and Philosophy of Science VI. Proceedings of the Sixth International Congress at Hannover 1979, Amsterdam-New York-Oxford-Warszawa 1982, 89-107
[Gi 2] Girard, J.-Y.: $\Pi^1_2$-logic, Part 1: Dilators, in: Annals of Mathematical Logic 21 (1981), 75-219
[How] Howard, W.A.: A system of abstract constructive ordinals, in: JSL 37 (1972), 355-374
[Ja 1] Jäger, G.: Zur Beweistheorie der Kripke-Platek-Mengenlehre über den natürlichen Zahlen, in: Arch. math. Logik 22 (1982), 121-139
[Ja 2] Jäger, G.: Some Proof-Theoretic Contributions to Theories of Sets, this volume
[JP] Jäger, G., Pohlers, W.: Eine beweistheoretische Untersuchung von ($\Delta^1_2$-CA)+(BI) und verwandter Systeme, in: Sitzungsberichte der Bayerischen Akademie der Wissenschaften (1982), 1-28
[Jer] Jervell, H.R.: Introducing homogeneous trees, in: Stern, J. (ed.): Proceedings of the Herbrand Symposium. Logic Colloquium '81, Amsterdam-New York-Oxford 1982, 147-158
[Pa] Päppinghaus, P.: Ptykes in Gödels T und Hierarchien von Ordinalzahlfunktionen, to appear in the Proceedings of the $\Pi^1_2$-workshop in Oslo - August 1984
[Pc] Pearce, J.: A constructive consistency proof of a fragment of set theory, in: Annals of Pure and Applied Logic 27 (1984), 25-62
[RA] Richter, W., Aczel, P.: Inductive definitions and reflecting properties of admissible ordinals, in: Fenstad, J.E., Hinman, P.G. (eds.): Generalized Recursion Theory. Proceedings of the 1972 Oslo Symposium, Amsterdam-London-New York 1974, 301-381
[Sch] Schwichtenberg, H.: Proof theory: Some applications of cut-elimination, in: Barwise, J. (ed.): Handbook of Mathematical Logic, Amsterdam-New York-Oxford 1977, 867-912
[Ta] Takeuti, G.: Proof theory, Amsterdam 1975
Logic Colloquium '85
Edited by The Paris Logic Group
© Elsevier Science Publishers B.V. (North-Holland), 1987

$\Pi_2$-MODELS OF EXTENSIONS OF KRIPKE-PLATEK SET THEORY
Peter Päppinghaus
Institut für Mathematik, Universität Hannover
Welfengarten 1, D-3000 Hannover 1

§ 1. INTRODUCTION
In the development of proof theory a shift has taken place from subsystems of analysis (second order arithmetic) to theories of inductive definitions, and from these to extensions and subsystems of Kripke-Platek set theory. The latter change is advanced most forcefully by Jäger, and for background and recent results we refer to his contribution in this volume [Ja 2]. Following this trend we present a new method of analyzing extensions of Kripke-Platek set theory in the spirit of Girard's $\Pi^1_2$-logic. Our method is remarkably simpler than previous work. Partly this is due to the fact that our aims are less ambitious. In contrast to the tradition we are not concerned with determining so-called proof-theoretic ordinals, but rather with finding $\Pi_2$-models and the $\Pi_2$-ordinals of the theories we study. These notions are explained below and go back to Jäger (see [Ja 2])†.
We assume familiarity with Kripke-Platek set theory as developed in [Bw, Ch. I]. We consider only sets without urelements, and so the formal system KP is formulated in a one-sorted language with the two-place relation symbols = and ∈. As axioms of KP we have the axioms of equality, extensionality, pairing and union, and the schemes of $\Delta_0$-separation, $\Delta_0$-collection and foundation. We extend KP to the theory KP(G) by adjoining a new two-place predicate constant G and the following axioms expressing that G is the graph of an ordinal function.
(G 1) $\forall x,y,z\,(G(x,y) \wedge G(x,z) \to y = z)$
(G 2) $\forall x\,(\mathrm{Ord}(x) \to \exists y\,(\mathrm{Ord}(y) \wedge G(x,y)))$
Ord(c) is the usual $\Delta_0$-formula expressing that c is an ordinal.
KP(G) is a uniform way of giving extensions of KP of different strength according to how G is interpreted. For example, if G is to be interpreted as the graph of the constant function with value $\omega$, we can replace the relation constant G by a $\Sigma_1$-formula defining this graph, and we obtain a theory equivalent to Jäger's KPu (see [Ja 1]). And if G is to be interpreted as the graph of the function associating with any ordinal $\alpha$ the next admissible ordinal $\alpha^{+}$, we can similarly obtain a theory equivalent to Jäger's KPi (see [JP]) as well as to Pearce's system (see [Pc]).
† The referee has pointed out to me that the notion of $\Pi_2$-model implicitly goes back to Tait. However I have not been able to locate a reference.
We are interested only in standard models for the language of KP(G), which are given by a pair (M,g), where M is a non-empty, transitive set, and $g: \mathrm{On} \to \mathrm{On}$ is a total ordinal function. M is called admissible, if $(M,\in)$ is a model of KP, and M is called g-admissible, if $(M,\in,G_g)$ is a model of KP(G). $G_g$ is defined by $G_g(a,b) :\Longleftrightarrow (\mathrm{Ord}(a) \wedge b=g(a)) \vee (\neg\mathrm{Ord}(a) \wedge b=0)$. (M,g) is called a $\Pi_2$-model of KP(G), if $(M,\in,G_g)$ is a model of every $\Pi_2$-sentence, which is provable in KP(G).
One way to phrase our aims is to say that we want to determine ordinals $\kappa$ s.t. $(V_\kappa,g)$ is a $\Pi_2$-model of KP(G). ($V_\kappa$ denotes the $\kappa$-th level of the cumulative hierarchy.) Phrased in a recursion-theoretic spirit, we want to determine bounds for set functions and ordinal functions, which are $\Sigma_1^g$-definable provably in KP(G).
Let V denote the universe of sets and On the class of all ordinals. We use a,b,c,... as free variables of the formal language and also (somewhat ambiguously) informally to denote sets. The notation $F[\vec{a}]$ is used to indicate that a formula F contains at most $\vec{a} = a_1,\dots,a_n$ as its free variables. $\vdash$ denotes derivability, and $\alpha,\beta,\gamma,\dots$ are used for ordinals.
$f: V \to V$ is called $\Sigma_1^g$-definable provably in KP(G) iff there is a $\Sigma_1$-formula $F[a,b]$ of KP(G) s.t. KP(G) $\vdash \forall x \exists! y F[x,y]$ and for every g-admissible set M and $a \in M$: $f(a) \in M$ and $(M,\in,G_g) \models F[a,f(a)]$. Analogously one defines that $f: \mathrm{On} \to \mathrm{On}$ is $\Sigma_1^g$-definable provably in KP(G). In particular an ordinal $\alpha$ is called $\Sigma_1^g$-definable provably in KP(G) iff there is a $\Sigma_1$-formula $A[a]$ of KP(G) s.t. KP(G) $\vdash \exists! y(\mathrm{Ord}(y) \wedge A[y])$ and for every g-admissible set M: $\alpha \in M$ and $(M,\in,G_g) \models A[\alpha]$.
We will give bounds for provably $\Sigma_1^g$-definable functions in terms of a hierarchy of ordinal functions $\beth^g_D: \mathrm{On} \to \mathrm{On}$. This hierarchy is defined and investigated at length in [Pa, Kap. II]. The indices D of the hierarchy denote dilators. The following dilators, in particular, play a role in our
To understand this definition and notation (and even more so later parts of this paper) the reader has to be familiar with the fundamentals of the theory of dilators as exposed in [Gi 2, pp. 89-139J.
The principal boundedness result is the following. We assume that $g: \mathrm{On} \to \mathrm{On}$ is non-decreasing and $g(0) > 0$.
Theorem 1.1: Let $C[\vec{a},\vec{b}]$ be a $\Delta_0$-formula of KP(G) s.t. KP(G) $\vdash \forall\vec{x}\,\exists\vec{y}\,C[\vec{x},\vec{y}]$. Then for some finite $n$ and every ordinal $\alpha$: $\forall\vec{a} \in V_\alpha\ \exists\vec{b} \in V_{\beth^g_{(2+\mathrm{Id})_{(n)}}(\alpha)}\ C[\vec{a},\vec{b}]$ is true with G interpreted by $G_g$.
This immediately entails closure properties in the cumulative hierarchy.
Corollary 1.2: Let $\kappa$ be an ordinal closed under all $\beth^g_{(2+\mathrm{Id})_{(n)}}$, $n < \omega$. Then $(V_\kappa,g)$ is a $\Pi_2$-model of KP(G). In other words: $V_\kappa$ is closed under all functions, which are $\Sigma_1^g$-definable provably in KP(G), and these have a $V_\kappa$-absolute $\Sigma_1^g$-definition.
$\kappa(g) := \beth^g_{(2+\mathrm{Id})_{(\omega)}}(0)$ is the least ordinal satisfying the hypothesis of the corollary. Moreover it follows from the results of [Pa] that the set of ordinals $\Sigma_1^g$-definable provably in KP(G) is a cofinal subset of $\kappa(g)$. Consequently $\kappa(g)$ is the least ordinal $\kappa$ s.t. $(V_\kappa,g)$ is a $\Pi_2$-model of KP(G).
An application of the same methods to a version of KPu yields that $\kappa(\lambda\alpha.\omega)$ (which, incidentally, by [Pa] equals the so-called Bachmann-Howard ordinal) is the least ordinal $\kappa$ s.t. $V_\kappa$ is a $\Pi_2$-model of KPu. By a slight modification of our techniques we can prove similarly that $\kappa(\lambda\alpha.\omega)$ is the least ordinal $\kappa$ s.t. $L_\kappa$ is a $\Pi_2$-model of KPu, i.e. in the terminology of Jäger (see [Ja 2]): $\kappa(\lambda\alpha.\omega)$ is the $\Pi_2$-ordinal of KPu. ($L_\kappa$ denotes the $\kappa$-th level of the constructible hierarchy.) Finally we can also analyze a version of KPi and obtain that $\kappa(\lambda\alpha.\alpha^{+})$ is the $\Pi_2$-ordinal of KPi. These ordinals have been determined by Jäger and Pohlers in [Ja 1] and [JP] to equal $\Theta\varepsilon_{\Omega+1}0$ and $\Theta\varepsilon_{I+1}0$ respectively.
The principal boundedness result (Theorem 1.1) is proved by combining a syntactic analysis of KP(G) in traditional lines with a semantic analysis in the form of an asymmetric interpretation. To analyze KP(G) proof-theoretically we introduce an infinitary sequent calculus KPV(G). It has for every ordinal a an a-branching rule expressing the definition of Va' and further an On-branching rule expressing that every set is an element of some Va' KP(G) can be embedded in KPV(G). By virtue of the infinitary rules the scheme of foundation can be derived in KPV(G). The remaining axioms of KP(G) - which are of bounded logical complexity - are taken care of by particular rules, called axiom rules. For KPV(G) a weak cut-elimination theorem is proved. (In the disguise of the
axiom rules there remain specific cuts.) As usual we have to control the lengths of the derivations obtained in the process of embedding and cut-elimination. For this purpose we use dilators rather than ordinals. More precisely we use a relation of majorization between KPV(G)-derivations and dilators. This is the only trace of functoriality in our treatment. We have stayed closer to traditional proof theory than the Girard school (see e.g. [Fe]) in that we have not incorporated any functoriality conditions in the notion of derivation. However all infinitary derivations occurring here could be shown to be homogeneous trees in the sense of [Jer], even primitive recursive ones, and are thus perfectly good constructive objects. While homogeneity of derivation trees would be needed for a unique assignment of dilators as lengths, we get along with a majorization, which can be obtained more cheaply. Moreover we pay no attention to the metamathematical methods employed in our work, and so there is no technical need to bother about homogeneity.
Dilators as length bounds are of great technical advantage. We could have replaced the On-branching rule of KPV(G) by an analogous $\Omega$-branching rule. In such a setting we could have used ordinals below $\varepsilon_{\Omega+1}$ as length bounds like in [How]. But as Howard's work shows, this leads to great complications due to the fact that the fundamental sequences are not well-behaved w.r.t. the algebraic operations. The use of dilators is a much more elegant solution. To obtain length bounds for derivations, one only has to construct natural transformations between dilators without ever paying attention to fundamental sequences. And for coping with the On-branching rule we have the fundamental sequences $(\{D\}\zeta)_{\zeta\in\mathrm{On}}$ at our disposal, which are inherently given by virtue of Girard's functor of separation of variables: $\{D\}\zeta = \mathrm{SEP}(D)(\cdot,\zeta)$. A certain price is to pay, since the functor SEP is not exactly the inverse of diagonalization. As a consequence the fundamental sequences given by SEP are still not sufficiently well-behaved w.r.t. the algebraic operations. For example $(2+\mathrm{Id})^{\{D\}\zeta}$ can be naturally transformed into $\{(2+\mathrm{Id})^{D}\}\zeta$, but is in general not equal to it. Also we have to modify the usual cut-elimination technique slightly, since there are no commutative 'natural sum' or 'natural product' for dilators. The final result of our proof-theoretic work is that every KP(G)-derivable sequent has a (weakly) cut-free KPV(G)-derivation, which is majorized by $(2+\mathrm{Id})_{(n)}$ for some finite $n$.
This syntactic analysis is supplemented by a semantic analysis, namely an asymmetric interpretation of cut-free KPV(G)-derivations. Such a method has first been applied by Girard in [Gi IJ and later by Masseron, Van de Wiele and Vauzeilles in the context of theories of inductive definitions in order to prove boundedness results for ordinal recursion and set recursion. In contrast to these authors we are here concerned with the provably ordinal recursive and set recursive functions of certain extensions of KP, and arrive at a proof-theoretic refinement of Van de Wiele's results, in particular. In a separate paper we will combine our asymmetric interpretation with completeness theorems to recapture many of the bounded ness results due to Girard, Masseron, Normann, Ressayre, Van de Wiele and Vauzeilles. Leaving a detailed comparison to that paper, let us say here only that in our variation it is the unbounded quantifiers that are interpreted asymmetrically rather than an inductively defined predicate. Our Asymmetric Interpretation Theorem says the following. If d is a cutfree KPV(G)-derivation of the sequent
$\Gamma \vdash \Delta$ and $D$ a dilator majorizing $d$, then this end sequent is valid under the following interpretation: free variables and (essentially) universally quantified variables are interpreted in $V_\alpha$, whereas (essentially) existentially quantified variables are interpreted in $V_{\beth^g_D(\alpha)}$. The hierarchy $\beth^g_D$ is needed in order to cope with the asymmetric interpretations of the hidden cut-formulae in the axiom rules of KPV(G), which are of bounded logical complexity and concretely known. Essential properties of the hierarchy are monotonicity, the law $\beth^g_{D+E} = \beth^g_E \circ \beth^g_D$ mainly for the $\Delta_0$-Collection rule, and the base function $g$ for the axioms on the relation constant G. In place of $\beth$ we could have used a variant of Girard's functor $\Lambda$ (see [Gi 2, § 5.4, pp. 159-168]). But we have preferred to demonstrate the possibility of an alternative to $\Lambda$.
This paper is not self-contained. Beyond background on Kripke-Platek set theory and the theory of dilators as specified earlier, we presuppose the results on computing with dilators expounded in [Pa, § 7] and some familiarity with Gentzen's sequent calculus (see e.g. [Ta, Ch. I] and [Sch, pp. 871-883]).
ACKNOWLEDGEMENTS
The research described in this paper was partly carried out at the Université Paris VII. This stay was financially supported by the DFG (Deutsche Forschungsgemeinschaft). Special thanks go to Jean-Yves Girard for his cooperation and stimulating discussions during my visit in Paris and at several occasions in Oberwolfach.
§ 2. INFINITARY SEQUENT CALCULUS KPV(G)
KPV(G) is given by the following language, initial sequents, and rules.
Language: We use a,b,c,... to denote free variables and x,y,z,... to denote bound variables. As non-logical symbols we have the two-place relation symbols =, ∈, G of KP(G), and in addition for every ordinal $\alpha$ a one-place relation symbol $V^\alpha$. The variables are supposed to range over some universe of sets (without urelements), and $V^\alpha$ is supposed to denote the $\alpha$-th level of the cumulative hierarchy. We restrict our logical language to $\bot$, $\to$, $\forall$ as basic symbols and assume all other connectives to be defined in terms of these in the usual way. In particular "$(\forall x \in a)B$" is to be taken as an abbreviation for "$\forall x(x \in a \to B)$". By $\Gamma, \Delta, \dots$ we denote finite sets of formulas of this language (hence so-called structural rules are superfluous).
Initial sequents:
$A,\Gamma \vdash \Delta,A$ for every atomic formula A of the language of KP(G).
$\bot,\Gamma \vdash \Delta$
Logical rules:
(r→) from $A,\Gamma \vdash \Delta,B$ infer $\Gamma \vdash \Delta,A\to B$
(l→) from $\Gamma \vdash \Delta,A$ and $B,\Gamma \vdash \Delta$ infer $A\to B,\Gamma \vdash \Delta$
(l∀) from $A(a),\Gamma \vdash \Delta$ infer $\forall x A(x),\Gamma \vdash \Delta$
(r∀) from $\Gamma \vdash \Delta,A(a)$ infer $\Gamma \vdash \Delta,\forall x A(x)$, if a does not occur in the lower sequent.
Non-logical rules:
($V^\alpha$) from $(\forall y \in a)V^\beta(y),\Gamma \vdash \Delta$ (all $\beta < \alpha$) infer $V^\alpha(a),\Gamma \vdash \Delta$
(V) from $V^\alpha(a),\Gamma \vdash \Delta$ (all $\alpha \in \mathrm{On}$) infer $\Gamma \vdash \Delta$
($\Delta_0$-Coll) from $\Gamma \vdash \Delta,(\forall y \in a)\exists z\,C$ and $\exists x(\forall y \in a)(\exists z \in x)C,\Gamma \vdash \Delta$ infer $\Gamma \vdash \Delta$, if C is a $\Delta_0$-formula of the language of KP(G).
(Ax) from $A,\Gamma \vdash \Delta$ infer $\Gamma \vdash \Delta$, if A is an instance of one of the following axiom schemes.
Axioms:
(Eq) $a=b \to (C(a) \to C(b))$ for every atomic formula C of KP(G).
(Ext) $(\forall z \in a)\,z \in b \wedge (\forall z \in b)\,z \in a \to a=b$
(Pair) $\exists z(a \in z \wedge b \in z)$
(Union) $\exists u(\forall y \in a)(\forall z \in y)\,z \in u$
($\Delta_0$-Sep) $\exists z((\forall y \in z)(y \in a \wedge C(y)) \wedge (\forall y \in a)(C(y) \to y \in z))$ for every $\Delta_0$-formula C of KP(G).
(G 1) $G(a,b) \wedge G(a,c) \to b=c$
(G 2) $\mathrm{Ord}(a) \to \exists y(\mathrm{Ord}(y) \wedge G(a,y))$, where Ord(a) is the usual $\Delta_0$-formula of KP expressing that a is an ordinal.
Cut rule: from $\Gamma \vdash \Delta,A$ and $A,\Gamma \vdash \Delta$ infer $\Gamma \vdash \Delta$, if A is a formula of KP(G).
A KPV(G)-derivation is a well-founded tree whose nodes are labelled by sequents $\Gamma \vdash \Delta$ containing the constants $V^\alpha$ only negatively, and by names of rules or axioms in a locally correct way. By virtue of the rule (V), KPV(G)-derivations are in general proper classes. A KPV(G)-derivation is termed cut-free, if the cut rule does not occur in it. It may, however, contain applications of the rules (Ax) and ($\Delta_0$-Coll), which may be viewed as a means of allowing cuts on axioms.
We introduce two quasiorderings on dilators and a fundamental-sequence-like notation for the projections of the functor SEP.
Definition 2.1:
Let D,E be dilators.
1. D
~
E
:~
there is a natural transformation
2. D
<:::
E
:~
there is a dilator F s , t
3. If D is of kind
~
,
D+ F
T:D
=E
and i;; an ordinal, then we let {D}r;
+
E
.-
SEP(D)(',i;;)
In order to measure the size of KPV(G)-derivations we define a formal relation of majorization between derivations $d$ in KPV(G) and dilators of the form $D = 1+E$. The clauses of the definition of "$d$ maj $D$" are motivated technically to make the asymmetric interpretation and the cut-elimination theorem work.

Definition 2.2: We define "$d$ maj $D$" by induction on the well-founded KPV(G)-derivation trees.
1. If $d$ is an initial sequent, we let
   $d$ maj $D$ :⟺ $1 \sqsubseteq D$.
2. If $d$ is obtained from $d_0$ by an application of $(r{\to})$, $(r\forall)$ or a $\Delta_0$-instance of (Ax), we let
   $d$ maj $D$ :⟺ $d_0$ maj $D$.
3. If $d$ is obtained from $(d_\beta)_{\beta<\alpha}$ by an application of $(V^{\alpha})$, we let
   $d$ maj $D$ :⟺ $1 \sqsubseteq D \wedge \forall\beta<\alpha$: $d_\beta$ maj $D$.
4. If $d$ is obtained from $d_0$ by an application of $(l\forall)$ or of (Ax) for an instance of (Pair), (Union) or ($\Delta_0$-Sep), we let
   $d$ maj $D$ :⟺ $1 \sqsubseteq D \wedge \exists D_0$: $d_0$ maj $D_0 \wedge 1+D_0 \le D$.
5. If $d$ is obtained from $d_0, d_1$ by an application of $(l{\to})$ or of ($\Delta_0$-Coll), we let
   $d$ maj $D$ :⟺ $1 \sqsubseteq D \wedge \exists D_0,D_1$: $d_0$ maj $D_0 \wedge d_1$ maj $D_1 \wedge D_0+1+D_1 \le D$.
6. If $d$ is obtained from $d_0$ by an application of (Ax) for (G 2), we let
   $d$ maj $D$ :⟺ $1 \sqsubseteq D \wedge \exists D_0$: $d_0$ maj $D_0 \wedge 2+D_0 \le D$.
7. If $d$ is obtained from $(d_\alpha)_{\alpha \in On}$ by an application of (V), we let
   $d$ maj $D$ :⟺ $1 \sqsubseteq D \wedge \exists D'$ of kind $\Omega$: $\forall\alpha$: $d_\alpha$ maj $\{D'\}(\alpha+1) \wedge D' \le D$.
8. If $d$ is obtained from $d_0, d_1$ by an application of the cut rule, we let
   $d$ maj $D$ :⟺ $1 \sqsubseteq D \wedge \exists D_0,D_1$: $d_0$ maj $D_0 \wedge d_1$ maj $D_1 \wedge D_0+D_1 \le D$.

Lemma 2.3: $d$ maj $D \wedge D \le E \wedge 1 \sqsubseteq E \Longrightarrow d$ maj $E$.

Proof: By induction on $d$.

§ 3. ASYMMETRIC INTERPRETATION OF KPV(G)
Before exhausting the reader's patience by tedious cut-elimination techniques, we go ahead with a more interesting part of our work: the asymmetric interpretation. In this section we assume to be given a total, non-decreasing function $g: On \to On$ satisfying $g(0) > 0$. An asymmetric interpretation of the sequents of KPV(G) is defined relative to $g$ as follows.

Definition 3.1: For $\alpha,\beta \in On$ with $\alpha \le \beta$ and $\vec{a} = a_1,\ldots,a_n \in V_\alpha$ we let
\[
g \models^{\beta}_{\alpha} (\Gamma \vdash \Delta)[\vec{a}]
\]
hold iff the sequent $\Gamma \vdash \Delta$ is satisfied under the following interpretation.
1. The free variables of $\Gamma \vdash \Delta$ are interpreted by the sets $\vec{a} \in V_\alpha$.
2. $G$ is interpreted by $G_g$ (the graph of $g$).
3. The standard interpretation is used for $=$, $\bot$, $\to$, $\in$, and the bounded quantifier $(\forall x \in \cdot)$.
4. For any ordinal $\xi$, $V^{\xi}$ is interpreted as $V_\xi$.
5. Unbounded universal quantifiers are interpreted to range over $V_\alpha$, where they occur positively, and to range over $V_\beta$, where they occur negatively.

$g \models^{\beta}_{\alpha} (\Gamma \vdash \Delta)$ denotes the corresponding notion of validity for all interpretations of the free variables in $V_\alpha$.
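The principal tool in the asymmetric interpretation theorem is a hierarchy of ordinal functions $\mathcal{J}^g_D$. This hierarchy has been defined and investigated in [Pa, § 5]. The following results are quoted from there.

Lemma 3.3:
1. For every dilator $D$, $\mathcal{J}^g_D: On \to On$ is a total and non-decreasing function.
2. $D \le E \Longrightarrow \mathcal{J}^g_D(\alpha) \le \mathcal{J}^g_E(\alpha)$.
3. $\mathcal{J}^g_{D+E}(\alpha) = \mathcal{J}^g_E(\mathcal{J}^g_D(\alpha))$.
4. $1 \le D \Longrightarrow \mathcal{J}^g_D(\alpha) > \alpha \wedge \mathcal{J}^g_D(\alpha) \ge g(\alpha)$.
5. $1 \le D$ and $D$ of kind $\Omega \Longrightarrow \mathcal{J}^g_D(\alpha) \ge \mathcal{J}^g_{\{D\}(\alpha+1)}(\alpha)$.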
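The proofs in this section repeatedly use one small consequence of Lemma 3.3. The following derivation is a sketch under the assumption that the lemma is read as stating that each $\mathcal{J}^g_D$ is non-decreasing, that $\mathcal{J}^g_{D+E}(\alpha) = \mathcal{J}^g_E(\mathcal{J}^g_D(\alpha))$, and that $\mathcal{J}^g_1(\alpha) > \alpha$.

```latex
% Sketch: passing from the argument \alpha+1 to the dilator 1+D.
% Step 1: \mathcal{J}^g_1(\alpha) > \alpha gives \alpha+1 \le \mathcal{J}^g_1(\alpha).
% Step 2: monotonicity of \mathcal{J}^g_D.
% Step 3: the composition law for sums of dilators.
\[
  \mathcal{J}^g_D(\alpha+1)
  \;\le\; \mathcal{J}^g_D\bigl(\mathcal{J}^g_1(\alpha)\bigr)
  \;=\; \mathcal{J}^g_{1+D}(\alpha).
\]
```

Together with monotonicity in the dilator, this is the pattern by which a bound of the form $1+D_0 \le D$ in the majorization relation is turned into an ordinal bound $\mathcal{J}^g_{D_0}(\alpha+1) \le \mathcal{J}^g_D(\alpha)$.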
Theorem 3.4 (Asymmetric Interpretation Theorem): If $d$ is a cut-free KPV(G)-derivation of the sequent $\Gamma \vdash \Delta$, and $D$ is a dilator s.t. $d$ maj $D$, then for every ordinal $\alpha$:
\[
g \models^{\mathcal{J}^g_D(\alpha)}_{\alpha} (\Gamma \vdash \Delta).
\]
Proof: By induction on $d$. We distinguish cases according to which rule has been used last in $d$. We confine ourselves to a few crucial cases and leave the rest to the reader.

Case (V): Let $d$ be obtained from $(d_\xi)_{\xi \in On}$ by an application of rule (V), where $d_\xi$ derives $V^{\xi}(a),\Gamma \vdash \Delta$. For some dilator $D'$ of kind $\Omega$ we have that $D' \le D$ and for every ordinal $\xi$: $d_\xi$ maj $\{D'\}(\xi+1)$. We apply the induction hypothesis to $d_\alpha$ and obtain
\[
g \models^{\mathcal{J}^g_{\{D'\}(\alpha+1)}(\alpha)}_{\alpha} (V^{\alpha}(a),\Gamma \vdash \Delta).
\]
By Lemma 3.3 we have $\mathcal{J}^g_{\{D'\}(\alpha+1)}(\alpha) \le \mathcal{J}^g_{D'}(\alpha) \le \mathcal{J}^g_D(\alpha)$, and so $g \models^{\mathcal{J}^g_D(\alpha)}_{\alpha} (V^{\alpha}(a),\Gamma \vdash \Delta)$. The free variable $a$ ranges over $V_\alpha$ anyway, so it follows that $g \models^{\mathcal{J}^g_D(\alpha)}_{\alpha} (\Gamma \vdash \Delta)$.
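Case ($\Delta_0$-Coll): Let $d$ be obtained from $d_0$ with end sequent $\Gamma \vdash \Delta,(\forall y \in a)\exists zC$, and from $d_1$ with end sequent $\exists x(\forall y \in a)(\exists z \in x)C,\Gamma \vdash \Delta$. There are dilators $D_0$ and $D_1$ s.t. $d_0$ maj $D_0$, $d_1$ maj $D_1$, and $D_0+1+D_1 \le D$. We let $\beta := \mathcal{J}^g_{D_0}(\alpha)$ and $\gamma := \mathcal{J}^g_{D_1}(\beta+1)$, and for contradiction we assume to be given $\vec{a} \in V_\alpha$ s.t. $g \models^{\gamma}_{\alpha} (\Gamma \vdash \Delta)[\vec{a}]$ is not satisfied. By induction hypothesis for $d_0$ we see that $(\forall y \in a)(\exists z \in V_\beta)C[\vec{a}]$ is true (recall that $C$ is a $\Delta_0$-formula of KP(G)). By virtue of $V_\beta \in V_{\beta+1}$ it follows that $(\exists x \in V_{\beta+1})(\forall y \in a)(\exists z \in x)C[\vec{a}]$ is true. By induction hypothesis for $d_1$ we obtain validity of $g \models^{\gamma}_{\beta+1} (\exists x(\forall y \in a)(\exists z \in x)C,\Gamma \vdash \Delta)$, and so $g \models^{\gamma}_{\beta+1} (\Gamma \vdash \Delta)[\vec{a}]$ is satisfied. Since $\alpha \le \beta+1 \le \gamma$, we have met a contradiction. By Lemma 3.3 we obtain
\[
\gamma = \mathcal{J}^g_{D_1}(\beta+1) \le \mathcal{J}^g_{D_1}(\mathcal{J}^g_1(\beta)) = \mathcal{J}^g_{D_1}(\mathcal{J}^g_1(\mathcal{J}^g_{D_0}(\alpha))) = \mathcal{J}^g_{D_0+1+D_1}(\alpha) \le \mathcal{J}^g_D(\alpha),
\]
and hence the claim follows.

Case (Ax): We have to distinguish subcases according to which instance of the axiom rule we meet. The instances of (Eq), (Ext) and (G 1) are valid $\Delta_0$-formulas of KP(G), so the claim follows immediately from induction hypothesis.

For the instances of (Pair), (Union) and ($\Delta_0$-Sep), $d$ is obtained from $d_0$ with end sequent $\exists zC(z),\Gamma \vdash \Delta$, where $C(b)$ is a $\Delta_0$-formula of KP(G). For some dilator $D_0$ with $d_0$ maj $D_0$ and $1+D_0 \le D$ we obtain by induction hypothesis that $g \models^{\beta}_{\alpha+1} (\exists zC(z),\Gamma \vdash \Delta)[\vec{a}]$ is satisfied for arbitrary $\vec{a} \in V_\alpha$ and $\beta := \mathcal{J}^g_{D_0}(\alpha+1)$. By virtue of the content of the axioms, $(\exists z \in V_{\alpha+1})C(z)[\vec{a}]$ is true, and so $g \models^{\beta}_{\alpha+1} (\Gamma \vdash \Delta)[\vec{a}]$ is satisfied. Now the proof is completed by observing that
\[
\beta = \mathcal{J}^g_{D_0}(\alpha+1) \le \mathcal{J}^g_{D_0}(\mathcal{J}^g_1(\alpha)) = \mathcal{J}^g_{1+D_0}(\alpha) \le \mathcal{J}^g_D(\alpha).
\]

It remains to look at instances of (G 2): $d$ is obtained from $d_0$ with end sequent $(\mathrm{Ord}(a) \to \exists y(\mathrm{Ord}(y) \wedge G(a,y))),\Gamma \vdash \Delta$, and there is a dilator $D_0$ s.t. $d_0$ maj $D_0$ and $2+D_0 \le D$. We have that $\beta := \mathcal{J}^g_2(\alpha) = \mathcal{J}^g_1(\mathcal{J}^g_1(\alpha)) \ge \mathcal{J}^g_1(g(\alpha)) \ge g(\alpha)+1$, and by induction hypothesis $g \models^{\gamma}_{\beta} ((\mathrm{Ord}(a) \to \exists y(\mathrm{Ord}(y) \wedge G(a,y))),\Gamma \vdash \Delta)[\vec{a}]$ is satisfied for arbitrary $\vec{a} \in V_\alpha$ and $\gamma := \mathcal{J}^g_{D_0}(\beta)$. If $a$ is interpreted by the ordinal $\delta \in V_\alpha$, then $g(\delta) \le g(\alpha)$, so $g(\delta) \in V_{g(\alpha)+1} \subseteq V_\beta$, and hence $(\exists y \in V_\beta)(\mathrm{Ord}(y) \wedge G(a,y))[\vec{a}]$ is satisfied. It follows that $g \models^{\gamma}_{\beta} (\Gamma \vdash \Delta)$ is valid. The proof is completed by noting that
\[
\gamma = \mathcal{J}^g_{D_0}(\beta) = \mathcal{J}^g_{D_0}(\mathcal{J}^g_2(\alpha)) = \mathcal{J}^g_{2+D_0}(\alpha) \le \mathcal{J}^g_D(\alpha).
\]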
§ 4. EMBEDDING OF KP(G) IN KPV(G)

Lemma 4.1:
1. For every formula $A$ of KP(G) there is a cut-free KPV(G)-derivation $d$ of $A,\Gamma \vdash \Delta,A$ s.t. for some finite $k$: $d$ maj $k$.
2. If $A$ is the universal closure of an instance of one of the axiom schemes (Eq), (Ext), (Pair), (Union), ($\Delta_0$-Sep), ($\Delta_0$-Coll), (G 1) or (G 2), then there is a cut-free KPV(G)-derivation $d$ of $\Gamma \vdash \Delta,A$ s.t. for some finite $k$: $d$ maj $k$.

Proof: 1. is proved as usual by induction on $A$, and 2. follows from 1. using the rules (Ax) and ($\Delta_0$-Coll). In either case finite derivations (without applications of $(V^{\alpha})$ or (V)) are obtained. It follows easily from Definition 2.2 that such derivations can be majorized by a constant dilator $k$ with $1 \le k < \omega$.
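Lemma 4.2: Let $A(a)$ be a formula of KP(G), and let $\mathrm{Prog}(A) := \forall x((\forall y \in x)A(y) \to A(x))$.
1. There is a finite $k$ s.t. for every ordinal $\alpha$ there is a cut-free KPV(G)-derivation $d_\alpha$ of $V^{\alpha}(a),\mathrm{Prog}(A),\Gamma \vdash \Delta,A(a)$ satisfying $d_\alpha$ maj $(k+5)\cdot 2^{\alpha}$.
2. There is a cut-free KPV(G)-derivation $d$ of $\Gamma \vdash \Delta,\forall\vec{y}(\mathrm{Prog}(A) \to \forall xA(x))$ s.t. for some finite $k$: $d$ maj $k\cdot 2^{Id}$.

Proof: 1. We choose $k$ s.t. $k$ majorizes a KPV(G)-derivation of $A(a) \vdash A(a)$, and for it we prove the claim by induction on $\alpha$. In a fragmentary writing the induction step looks as follows.
\[
\begin{array}{c}
d_\beta:\ V^{\beta}(b),\mathrm{Prog}(A) \vdash A(b) \\ \hline
b \in a \to V^{\beta}(b),\mathrm{Prog}(A) \vdash b \in a \to A(b) \\ \hline
(\forall y \in a)V^{\beta}(y),\mathrm{Prog}(A) \vdash b \in a \to A(b) \\ \hline
(\forall y \in a)V^{\beta}(y),\mathrm{Prog}(A) \vdash (\forall y \in a)A(y) \qquad A(a) \vdash A(a) \\ \hline
(\forall y \in a)V^{\beta}(y),\mathrm{Prog}(A),(\forall y \in a)A(y) \to A(a) \vdash A(a) \\ \hline
(\forall y \in a)V^{\beta}(y),\mathrm{Prog}(A) \vdash A(a) \quad (\text{all } \beta < \alpha) \\ \hline
V^{\alpha}(a),\mathrm{Prog}(A) \vdash A(a)
\end{array}
\]
Now one has to apply induction hypothesis to $d_\beta$ and to do some calculations according to Definition 2.2. For the $\beta$-th premiss of the end sequent one obtains $4+(k+5)\cdot 2^{\beta}+(k+1)$ as a majorizing dilator, and so the claim follows by $4+(k+5)\cdot 2^{\beta}+k+1 \le (k+5)\cdot(2^{\beta}+1) \le (k+5)\cdot 2^{\alpha}$.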
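The closing estimate can be sanity-checked on finite values. The following is only a numerical sketch with illustrative choices ($k = 0$, $\beta = 1$, $\alpha = 2$, none of which come from the text); it treats the constants as natural numbers and says nothing about the corresponding natural transformations between dilators.

```latex
% Sketch: the chain  4+(k+5)2^beta+k+1 <= (k+5)(2^beta+1) <= (k+5)2^alpha
% evaluated at the illustrative values k = 0, beta = 1, alpha = 2.
\begin{align*}
  4 + (k+5)\cdot 2^{\beta} + k + 1 &= 4 + 5\cdot 2 + 0 + 1 = 15, \\
  (k+5)\cdot(2^{\beta}+1)          &= 5\cdot 3 = 15, \\
  (k+5)\cdot 2^{\alpha}            &= 5\cdot 4 = 20,
\end{align*}
% so 15 <= 15 <= 20.  The second step uses 2^beta + 1 <= 2^(beta+1) <= 2^alpha,
% which holds whenever beta < alpha.
```

The constant $k+5$ is thus exactly what is needed to absorb the additive overhead $4$ and $k+1$ contributed by the finitely many inference steps in the induction step.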
2. Using the above derivations $d_\alpha$ we obtain the following.
\[
\begin{array}{c}
d_\alpha:\ V^{\alpha}(a),\mathrm{Prog}(A) \vdash A(a) \quad (\text{all } \alpha \in On) \\ \hline
\mathrm{Prog}(A) \vdash A(a) \\ \hline
\vdash \forall\vec{y}(\mathrm{Prog}(A) \to \forall xA(x))
\end{array}
\]
With the help of [Gi 2, § 3.6, pp. 136-139] one can directly compute $\mathrm{SEP}(k\cdot 2^{Id})$ and obtains $\{k\cdot 2^{Id}\}(\alpha+1) = k\cdot 2^{\alpha+1} \ge k\cdot 2^{\alpha}$. Hence by 1. and Definition 2.2 the claim follows.

Theorem 4.3: For every sequent $\Gamma \vdash \Delta$ derivable in KP(G) there is a KPV(G)-derivation $d$ s.t. $d$ contains only finitely many applications of the cut rule and $d$ maj $k\cdot 2^{Id\cdot r}$ for some finite $k$, $r$.

Proof: By virtue of cut-elimination for first order logic there is a purely logical, cut-free derivation $d_0$ of $A_1,\ldots,A_n,\Gamma \vdash \Delta$, where $A_1,\ldots,A_n$ are universal closures of instances of the axiom schemes of KP(G). $d_0$ is a finite KPV(G)-derivation with $d_0$ maj $k_0$ for some finite $k_0$. By the previous lemmas there are cut-free KPV(G)-derivations $d_i$ of $\Gamma \vdash \Delta,A_i$ s.t. $d_i$ maj $k_i$ or $d_i$ maj $k_i\cdot 2^{Id}$ for suitable finite $k_i$. $n$ applications of the cut rule yield the desired derivation $d$, and $d$ maj $k\cdot 2^{Id\cdot r}$ for $k := \max(k_0,\ldots,k_n)$ and a suitable finite $r$.

§ 5. CUT-ELIMINATION FOR KPV(G)
In order to prove the cut-elimination theorem for KPV(G) we follow the technique of [Sch, pp. 871-883]. Usually one appeals to symmetry utilizing the commutativity of the 'natural sum' or 'natural product' of ordinals. Since we work with dilators, this is not possible here. We modify the usual technique by observing that it suffices to 'move upwards' the cut formula only in the right upper sequent of the cut rule. This works by virtue of an inversion lemma for both $(r{\to})$ and $(l{\to})$, and due to the asymmetry implicit in the absence of the existential quantifier as a primitive symbol. We apply the usual notions of logical complexity of a formula and of cut-rank of a derivation (see e.g. [Sch, p. 873]). We omit proofs, which are just routine adaptations of usual techniques.

Weakening Lemma: If $d$ is a KPV(G)-derivation of $\Gamma \vdash \Delta$ and $D$ a dilator majorizing $d$, then there is a KPV(G)-derivation $d'$ of $\Gamma,\Gamma' \vdash \Delta,\Delta'$ with the same cut-rank and majorizing dilator $D$.

Inversion Lemma: Let $d$ be a KPV(G)-derivation with cut-rank $\le n$, $D$ a dilator majorizing $d$, and $b$ an arbitrary free variable. Then there are KPV(G)-derivations $d_0$, $d_1$, and $d_b$ with cut-rank $\le n$ and majorizing dilator $D$ satisfying the following.
1. If $d$ derives $\Gamma \vdash \Delta,A \to B$, then $d_0$ derives $A,\Gamma \vdash \Delta,B$.
2. If $d$ derives $A \to B,\Gamma \vdash \Delta$, then $d_0$ derives $\Gamma \vdash \Delta,A$ and $d_1$ derives $B,\Gamma \vdash \Delta$.
3. If $d$ derives $\Gamma \vdash \Delta,\bot$, then $d_0$ derives $\Gamma \vdash \Delta$.
4. If $d$ derives $\Gamma \vdash \Delta,\forall xA(x)$, then $d_b$ derives $\Gamma \vdash \Delta,A(b)$.
Reduction Lemma: Let $d$ and $d'$ be KPV(G)-derivations of $\Gamma \vdash \Delta,C$ and $C,\Gamma \vdash \Delta$ respectively with cut-rank $\le n$, $C$ a formula of KP(G) of complexity $n$, $D$ and $D'$ dilators majorizing $d$ and $d'$ respectively. Then there is a KPV(G)-derivation $\rho(d,d')$ of $\Gamma \vdash \Delta$ with cut-rank $\le n$ and $\rho(d,d')$ maj $D\cdot D'$.

Proof: By induction on $d'$.

Case 1: $C$ is the principal formula of the last rule of $d'$.

Case 1.1: $C$ atomic: We let $\rho(d,d') := d$. (In case $C = \bot$, apply 3. of the Inversion Lemma.) $1 \sqsubseteq D'$ entails $D \le D\cdot D'$.

Case 1.2: $C = \forall xA(x)$: Let $d'_0$ be the subderivation of $d'$ deriving the upper sequent $A(b),\forall xA(x),\Gamma \vdash \Delta$ and $D'_0$ a dilator s.t. $d'_0$ maj $D'_0$ and $1+D'_0 \le D'$. By induction hypothesis $\rho(d,d'_0)$ derives $A(b),\Gamma \vdash \Delta$ and $\rho(d,d'_0)$ maj $D\cdot D'_0$. Let $d_b$ be obtained from $d$ by 4. of the Inversion Lemma. An application of the cut rule to $d_b$ and $\rho(d,d'_0)$ with cut formula $A(b)$ yields the desired $\rho(d,d')$. We have $\rho(d,d')$ maj $D+D\cdot D'_0$ and $D+D\cdot D'_0 = D\cdot(1+D'_0) \le D\cdot D'$, and hence q.e.d.

Case 1.3: $C = A \to B$: Let $d'_0$ and $d'_1$ be the subderivations of $d'$ deriving the upper sequents $A \to B,\Gamma \vdash \Delta,A$ and $B,A \to B,\Gamma \vdash \Delta$ respectively, and $D'_0$ and $D'_1$ be dilators s.t. $d'_0$ maj $D'_0$, $d'_1$ maj $D'_1$, and $D'_0+1+D'_1 \le D'$. By applying the Inversion Lemma to $d$, $d'_0$, and $d'_1$ we obtain derivations $d^{*}$, $d''_0$, and $d''_1$ of $A,\Gamma \vdash \Delta,B$, $\Gamma \vdash \Delta,A$, and $B,\Gamma \vdash \Delta$ respectively. Two applications of the cut rule with cut formulas $A$ and $B$ yield the desired $\rho(d,d')$. We have $\rho(d,d')$ maj $D'_0+D+D'_1$ and $D'_0+D+D'_1 \le D\cdot D'_0+D+D\cdot D'_1 = D\cdot(D'_0+1+D'_1) \le D\cdot D'$, and hence q.e.d.

Case 2: $C$ is a side formula of the last rule of $d'$. Let $(d'_\beta)_{\beta<\alpha}$ be the immediate subderivations of $d'$. By induction hypothesis we obtain derivations $\rho(d,d'_\beta)$, from which the desired $\rho(d,d')$ is defined by applying the same last rule as in $d'$. In order to verify the claim about majorization, we must distinguish various subcases. We confine ourselves to the case that rule (V) is applied last in $d'$ (whence $\alpha = On$). Then there is a dilator $D''$ of kind $\Omega$ s.t. $D'' \le D'$ and for all ordinals $\beta$: $d'_\beta$ maj $\{D''\}(\beta+1)$. Induction hypothesis yields $\rho(d,d'_\beta)$ maj $D\cdot\{D''\}(\beta+1)$. By Lemma 5.1 below, $D\cdot D''$ is of kind $\Omega$, and $D\cdot\{D''\}(\beta+1) \le \{D\cdot D''\}(\beta+1)$. $D\cdot D'' \le D\cdot D'$ finally gives that $\rho(d,d')$ maj $D\cdot D'$.
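The majorization estimates in Cases 1.2 and 1.3 rest on distributivity of the product over sums in its right factor. As a hedged illustration, the following sketch checks the corresponding identity on ordinal values and contrasts it with the failure of distributivity in the left factor; the concrete ordinals are chosen here for illustration only.

```latex
% Sketch: ordinal multiplication distributes over addition in the right factor,
%   x \cdot (y + z) = x \cdot y + x \cdot z ,
% which is the value-level counterpart of  D + D \cdot D'_0 = D \cdot (1 + D'_0).
\[
  \omega\cdot(1+2) \;=\; \omega\cdot 3 \;=\; \omega + \omega\cdot 2 .
\]
% Distributivity in the left factor fails:
\[
  (1+2)\cdot\omega \;=\; \omega \;\neq\; \omega\cdot 2 \;=\; 1\cdot\omega + 2\cdot\omega .
\]
```

This one-sidedness fits the remark at the start of the section that the cut formula is moved upwards only in the right upper sequent: the dilator $D$ of the left premise is kept fixed as the left factor throughout the induction on $d'$.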
In the proof of the Reduction Lemma just given, we had to know how the fundamental sequences for dilators of kind $\Omega$ are compatible with multiplication. Simple laws like $\{F\cdot G\}\beta = F\cdot\{G\}\beta$ would be desirable, but unfortunately in general these do not hold (see [Gi 2, Ex. 3.3.9, p. 125f.]). We must content ourselves with the inequality $F\cdot\{G\}\beta \le \{F\cdot G\}\beta$. We generalize this by replacing multiplication by an arbitrary bilator $\otimes$, which we assume to be fixed in the following. We use the terminology and notations of [Gi 2]. If $F$ and $G$ are dilators, the dilator $F \otimes G$ is defined by $(F \otimes G)(\alpha) := \otimes(F(\alpha),G(\alpha))$ and $(F \otimes G)(f) := \otimes(F(f),G(f))$. If $B$ is a bilator, the two-place dilator $F \otimes B$ is defined by $(F \otimes B)(\alpha,\beta) := \otimes(F(\alpha),B(\alpha,\beta))$ and $(F \otimes B)(f,g) := \otimes(F(f),B(f,g))$. An important invariant associated with $\otimes$ is $n(\otimes)$, which we define to be the least natural number $n$ s.t. the flower $\otimes(n,\cdot)$ is non-constant. (Such an integer exists for every bilator $\otimes$.) Now we are ready for a lemma.

Lemma 5.1: Let $F$ be a dilator with $n(\otimes) \le F$, $G$ a dilator of kind $\Omega$, and $B$ a bilator.
1. $F \otimes G$ is a dilator of kind $\Omega$, and $F \otimes B$ is a bilator.
2. $\mathrm{UN}(F \otimes B) \le F \otimes \mathrm{UN}(B)$.
3. For every ordinal $\beta$: $F \otimes \{G\}\beta \le \{F \otimes G\}\beta$.
Proof: 1. We leave it to the reader to verify that $F \otimes B$ is a bilator. For the other claim it is useful to observe that the criterion for connectedness in [Gi 2, Prop. 3.3.1, p. 123] entails the following criterion for dilators of kind $\Omega$.

(*) $G$ is of kind $\Omega$ iff $G(\omega)$ is a limit ordinal and for some $\gamma < G(\omega)$ the set $\{G(f)(\gamma) \mid f \in I(\omega,\omega)\}$ is cofinal in $G(\omega)$.

$G$ is of kind $\Omega$, so we assume to be given an ordinal $\gamma$ according to (*). For $\delta := F(\omega) \otimes \gamma$ we have that
\[
(F \otimes G)(f)(\delta) = (F(f) \otimes G(f))(F(\omega) \otimes \gamma) \ge F(\omega) \otimes G(f)(\gamma).
\]
(The inequality follows from properties of bilators analogous to [Gi 2, Remark 2.4.10, p. 116].) Hence $\{(F \otimes G)(f)(\delta) \mid f \in I(\omega,\omega)\}$ is cofinal in $(F \otimes G)(\omega)$, and by (*) $F \otimes G$ is of kind $\Omega$.

2. For an arbitrary ordinal $\alpha$ we define a map $T_\alpha: \mathrm{UN}(F \otimes B)(\alpha) \to (F \otimes \mathrm{UN}(B))(\alpha)$ as follows. Let $\delta' < \mathrm{UN}(F \otimes B)(\alpha)$ be given. $\delta'$ comes from a point $\delta < (F \otimes B)(\alpha,\alpha)$ whose coefficients in the denotation w.r.t. $F \otimes B$ are obtained as follows. We look at the denotation of $\delta$ w.r.t. $\otimes$:
\[
\delta = (\delta_0;\xi_0,\ldots,\xi_{n-1};F(\alpha);\eta_0,\ldots,\eta_{m-1})_{\otimes} < F(\alpha) \otimes B(\alpha,\alpha),
\]
and for $i < n$, $j < m$ at the denotations of $\xi_i$ w.r.t. $F$ and of $\eta_j$ w.r.t. $B$:
\[
\xi_i = (s_i;s^i_0,\ldots,s^i_{n_i-1};\alpha)_F < F(\alpha), \qquad
\eta_j = (n_j;t^j_0,\ldots,t^j_{r_j-1};\alpha;\delta^j_0,\ldots,\delta^j_{q_j-1})_B < B(\alpha,\alpha).
\]
The coefficients of $\delta$ w.r.t. $F \otimes B$ now consist of all the $s^i_k$ and $t^j_k$ for the left argument place, and all the $\delta^j_l$ for the right argument place. The fact that $\delta$ generates a point in $\mathrm{UN}(F \otimes B)(\alpha)$ means in terms of these coefficients that any of the $s^i_k$ and $t^j_k$ is greater than any of the $\delta^j_l$. From this it follows that $\eta_0,\ldots,\eta_{m-1}$ generate points $\eta'_0,\ldots,\eta'_{m-1}$ in $\mathrm{UN}(B)(\alpha)$. Now we define
\[
T_\alpha(\delta') := (\delta_0;\xi_0,\ldots,\xi_{n-1};F(\alpha);\eta'_0,\ldots,\eta'_{m-1})_{\otimes} < F(\alpha) \otimes \mathrm{UN}(B)(\alpha).
\]
One of the seemingly endless verifications of this subject yields that the family $(T_\alpha)_{\alpha \in On}$ is a natural transformation as required.

3. For $B := \mathrm{SEP}(G)$ we obtain by 2. a natural transformation $T: \mathrm{UN}(F \otimes B) \to F \otimes \mathrm{UN}(B) = F \otimes G$. An application of the functor SEP gives us a natural transformation
\[
\mathrm{SEP}(T): F \otimes \mathrm{SEP}(G) = F \otimes B = \mathrm{SEP}(\mathrm{UN}(F \otimes B)) \to \mathrm{SEP}(F \otimes G).
\]
And for any fixed ordinal $\beta$ we finally obtain a natural transformation
\[
\mathrm{SEP}(T)(\cdot,\beta): F \otimes \{G\}\beta = F \otimes (\mathrm{SEP}(G)(\cdot,\beta)) = (F \otimes \mathrm{SEP}(G))(\cdot,\beta) \to \mathrm{SEP}(F \otimes G)(\cdot,\beta) = \{F \otimes G\}\beta.
\]
Moreover we need the following technical lemma.

Lemma 5.2: For dilators $F$, $G$ the following hold.
1. $(2+F)+(2+G) \le (2+F)\cdot(2+G)$
2. $(2+Id)^{1+F}+(2+Id)^{1+G} \le (2+Id)^{1+F+1+G}$
3. $F \le (2+Id)^{F}$
4. $(2+Id)^{(n)}+(2+Id)^{(n)} \le (2+Id)^{(n+1)}$
5. $(2+Id)^{(n)}\cdot(2+Id)^{(n)} \le (2+Id)^{(n+1)}$

Proof: 1. We define $T_{\alpha,\beta}: 2+\alpha+2+\beta \to (2+\alpha)\cdot(2+\beta)$ as follows.
\[
T_{\alpha,\beta}(\gamma) := \gamma, \ \text{if } \gamma < 2+\alpha+2; \qquad
T_{\alpha,\beta}(2+\alpha+2+\delta) := (2+\alpha)\cdot(2+\delta), \ \text{if } \delta < \beta.
\]
It is easy to verify that the family $(T_{\alpha,\beta})_{(\alpha,\beta) \in On \times On}$ is a natural transformation, and hence the family $(T_{F(\alpha),G(\alpha)})_{\alpha \in On}$ is a natural transformation from $(2+F)+(2+G)$ to $(2+F)\cdot(2+G)$ as required.

2. In order to apply 1. we must show that $2 \sqsubseteq (2+Id)^{1+F}$. By [Pa, Lemma ~2.2] there is a dilator $F'$ s.t. $(2+Id)^{F} = 1+F'$. It follows that $(2+Id)^{1+F} = (2+Id)^{1}\cdot(2+Id)^{F} = (2+Id)\cdot(1+F') = 2+Id+(2+Id)\cdot F'$, hence q.e.d.

3. $T_\alpha(\gamma) := (2+\alpha)^{\gamma}$ yields a natural transformation as required.

4. By induction on $n$. In the induction step apply 2.

5. Trivial for $n = 0$, and for $n > 0$ apply 4.
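To make the map in part 1 of the proof concrete, here is a worked instance on finite ordinals. It is a sketch that assumes the map is read as $T_{\alpha,\beta}(\gamma) = \gamma$ for $\gamma < 2+\alpha+2$ and $T_{\alpha,\beta}(2+\alpha+2+\delta) = (2+\alpha)\cdot(2+\delta)$ for $\delta < \beta$; the values $\alpha = 1$, $\beta = 2$ are illustrative only.

```latex
% Sketch: T_{1,2} maps 2+1+2+2 = 7 = {0,...,6} into (2+1)(2+2) = 12.
\[
\begin{array}{c|ccccccc}
  \gamma            & 0 & 1 & 2 & 3 & 4 & 5 & 6 \\ \hline
  T_{1,2}(\gamma)   & 0 & 1 & 2 & 3 & 4 & 6 & 9
\end{array}
\]
% Entries 0..4 are fixed because 2+alpha+2 = 5; the last two entries are
% 3*(2+0) = 6 and 3*(2+1) = 9.  The values are strictly increasing and all
% below 12, as an order embedding into (2+alpha)(2+beta) requires.
```

The summand $2$ in front of each dilator is what leaves enough room: the identity part ends below $(2+\alpha)\cdot 2$, which is the first value taken on the second block.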
Cut-elimination Lemma: Let $d$ be a KPV(G)-derivation with finite cut-rank $n+1$ and majorizing dilator $D$. Then there is a KPV(G)-derivation $\varepsilon(d)$ of the same end sequent with cut-rank $\le n$ and $\varepsilon(d)$ maj $(2+Id)^{D}$.

Proof: By induction on $d$.

Case 1: $d$ is obtained from $d_0$ and $d_1$ by an application of the cut rule. There are dilators $D_0$ and $D_1$ s.t. $d_0$ maj $D_0$, $d_1$ maj $D_1$, and $D_0+D_1 \le D$. By induction hypothesis we obtain $\varepsilon(d_0)$ and $\varepsilon(d_1)$ with cut-rank $\le n$ and for $i = 0,1$: $\varepsilon(d_i)$ maj $(2+Id)^{D_i}$. By an application of the Reduction Lemma we get $\varepsilon(d) := \rho(\varepsilon(d_0),\varepsilon(d_1))$ with cut-rank $\le n$ and $\varepsilon(d)$ maj $(2+Id)^{D_0}\cdot(2+Id)^{D_1}$. Now the claim follows since $(2+Id)^{D_0}\cdot(2+Id)^{D_1} = (2+Id)^{D_0+D_1} \le (2+Id)^{D}$.

Case 2: $d$ is obtained from $(d_\alpha)_{\alpha \in On}$ by an application of rule (V). There is a dilator $D'$ of kind $\Omega$ s.t. $D' \le D$ and for every ordinal $\alpha$: $d_\alpha$ maj $\{D'\}(\alpha+1)$. By induction hypothesis we obtain $\varepsilon(d_\alpha)$ with cut-rank $\le n$ and $\varepsilon(d_\alpha)$ maj $(2+Id)^{\{D'\}(\alpha+1)}$. We let $\varepsilon(d)$ be obtained by an application of rule (V) to $(\varepsilon(d_\alpha))_{\alpha \in On}$. From Lemma 5.1 it follows that $(2+Id)^{\{D'\}(\alpha+1)} \le \{(2+Id)^{D'}\}(\alpha+1)$, and together with $(2+Id)^{D'} \le (2+Id)^{D}$ we see that $\varepsilon(d)$ maj $(2+Id)^{D}$.

Using Lemma 5.2 the remaining cases follow straightforwardly from induction hypothesis.

Finite iteration of this lemma leads to the following normal form theorem.

Cut-elimination Theorem: Let $d$ be a KPV(G)-derivation with finite cut-rank $n$ and majorizing dilator $D$. Then there is a cut-free KPV(G)-derivation $\nu(d)$ of the same end sequent with
\[
\nu(d) \ \text{maj}\ (2+Id)^{(2+Id)^{\cdot^{\cdot^{\cdot^{(2+Id)^{D}}}}}} \quad (n \text{ times}).
\]
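For orientation, the tower in the Cut-elimination Theorem can be unwound for small cut-ranks. The sketch below assumes that $\nu$ is obtained by applying the operation $\varepsilon$ of the Cut-elimination Lemma once per unit of cut-rank, each application adding one exponential with base $2+Id$.

```latex
% Sketch: majorizing dilators after eliminating cuts of rank n = 0, 1, 2.
\begin{align*}
  n = 0:&\quad \nu(d) = d \ \text{maj}\ D, \\
  n = 1:&\quad \nu(d) = \varepsilon(d) \ \text{maj}\ (2+Id)^{D}, \\
  n = 2:&\quad \nu(d) = \varepsilon(\varepsilon(d)) \ \text{maj}\ (2+Id)^{(2+Id)^{D}}.
\end{align*}
```

Each level of cut-rank thus costs one exponential, in analogy with the familiar growth of ordinal bounds under cut-elimination.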
Theorem 5.3: For every sequent derivable in KP(G) there is a cut-free KPV(G)-derivation $d$ s.t. for some finite $n$: $d$ maj $(2+Id)^{(n)}$.

Proof: We combine Theorem 4.3 with the Cut-elimination Theorem. Moreover using Lemma 5.2 it is not very hard to prove that for finite $k$, $r$ there is a finite $n$ s.t. $k\cdot 2^{Id\cdot r} \le (2+Id)^{(n)}$.
§ 6. $\Pi_2$-MODELS AND $\Pi_2$-ORDINALS

We are now in the position to verify the claims made in § 1. First we observe that our principal boundedness result (Theorem 1.1) follows from Theorem 5.3 together with the Asymmetric Interpretation Theorem (Theorem 3.4). In Corollary 1.2 we have interpreted this result as a closure property of certain levels $V_\kappa$ of the cumulative hierarchy. Any $\kappa$ closed under all $\mathcal{J}^g_{(2+Id)^{(n)}}$ ($n < \omega$) must be a limit ordinal by virtue of Lemma 3.3.4. With this observation, Corollary 1.2 follows immediately from Theorem 1.1. Another way to interpret Theorem 1.1 is to draw from it bounds for the functions which are $\Sigma_1$-definable provably in KP(G).

Corollary 6.1:
1. If $f: V \to V$ is $\Sigma_1$-definable provably in KP(G), then for some finite $n$ and every set $a$: $\mathrm{rank}(f(a)) < \mathcal{J}^g_{(2+Id)^{(n)}}(\mathrm{rank}(a))$.
2. If $f: On \to On$ is $\Sigma_1$-definable provably in KP(G), then for some finite $n$ and every ordinal $\alpha$: $f(\alpha) < \mathcal{J}^g_{(2+Id)^{(n)}}(\alpha)$.
3. If the ordinal $\alpha$ is $\Sigma_1$-definable provably in KP(G), then for some finite $n$: $\alpha < \mathcal{J}^g_{(2+Id)^{(n)}}(0)$.

Proof: 1. Let $F[a,b]$ be a $\Sigma_1$-definition of $f$ with KP(G) $\vdash \forall x\exists!yF[x,y]$. Choose $n$ according to Theorem 1.1, and for an arbitrary set $a$ let $\alpha := \mathrm{rank}(a)$. Then $a \in V_{\alpha+1}$, and so by Theorem 1.1: $f(a) \in V_{\mathcal{J}^g_{(2+Id)^{(n)}}(\alpha+1)}$. It follows that
\[
\mathrm{rank}(f(a)) < \mathcal{J}^g_{(2+Id)^{(n)}}(\alpha+1) \le \mathcal{J}^g_{(2+Id)^{(n)}}(\mathcal{J}^g_1(\alpha)) = \mathcal{J}^g_{1+(2+Id)^{(n)}}(\alpha) \le \mathcal{J}^g_{(2+Id)^{(n+1)}}(\alpha).
\]
2. Analogous to 1.
3. By applying 2. to the constant function with value $\alpha$.
As explained in § 1, we have by our results determined $\kappa(g) := \mathcal{J}^g_{(2+Id)^{(\omega)}}(0)$ to be the least ordinal s.t. $(V_{\kappa(g)},g)$ is a $\Pi_2$-model of KP(G).

A little care is needed, if we want to apply our results to specific ordinal functions $g: On \to On$ according to our remarks in § 1. The idea is to replace the relation constant $G$ by a $\Sigma_1$-formula of KP defining the graph of $g$. The axiom (G 1) thereby becomes a $\Pi_1$-formula, which can be replaced by an open $\Delta_0$-formula, and so there is no problem for the asymmetric interpretation. The asymmetric interpretation of the axiom rule for (G 2), however, has to be checked for each individual case. In practice it is sometimes more convenient to replace (G 1) and (G 2) by some other equivalent axiom, whose asymmetric interpretation is easier to verify.

In order to obtain a theory equivalent to Jäger's KPu of [Jä 1], for example, we can replace $G$ by a formula defining the graph of $g = \lambda\alpha.\omega$. In this case we are lucky, since we can choose a $\Delta_0$-formula for this purpose. Alternatively we can define KPu to be KP extended by the axiom of infinity: $\exists x\,\mathrm{Lim}(x)$, where $\mathrm{Lim}(c)$ is the usual $\Delta_0$-formula of KP expressing that $c$ is a limit ordinal. $g = \lambda\alpha.\omega$ and $g = \lambda\alpha.(1+\alpha)\cdot\omega$ are provably $\Sigma_1$-definable in this version of KPu. In either approach the asymmetric interpretation works for both choices of $g$ by virtue of $\omega \in V_{\omega+1}$. So all our results hold for KPu, in particular $\kappa(\lambda\alpha.\omega) = \kappa(\lambda\alpha.(1+\alpha)\cdot\omega) =$ the Bachmann-Howard ordinal (see [Pa]) is the least ordinal $\kappa$ s.t. $V_\kappa$ is a $\Pi_2$-model of KPu.

In order to find the $\Pi_2$-ordinal in the sense of [Jä 2], we have to modify the asymmetric interpretation to refer to the levels of the constructible hierarchy. A careful checking of the proof of the Asymmetric Interpretation Theorem reveals that the cumulative hierarchy could be replaced by any hierarchy $H: On \to V$ satisfying the following conditions.
(a) All $H_\alpha$ are transitive, and $\alpha < \beta \Longrightarrow H_\alpha \in H_\beta$.
(b) $a \in H_\beta \Longrightarrow \exists\alpha < \beta$: $a \subseteq H_\alpha$.
(c) $a,b \in H_\alpha \Longrightarrow \{a,b\} \in H_{\alpha+1}$ and $\bigcup a \in H_{\alpha+1}$.
(d) $a,\vec{b} \in H_\alpha$, $D[y,\vec{b}]$ $\Delta_0$-formula of KP(G) $\Longrightarrow \{y \in a \mid D[y,\vec{b}]\} \in H_{\alpha+1}$.
(e) $H_\alpha \cap On = \alpha$.
For the general case of KP(G) we have no asymmetric interpretation into the levels of the constructible hierarchy $L$, since condition (d) fails for this language. KPu, however, is formulated in the language of KP, and so the Asymmetric Interpretation Theorem works, if the predicate constants $V^{\alpha}$ are interpreted by $L_\alpha$, the corresponding level of the constructible hierarchy. We arrive at a variant of Theorem 1.1 and its corollaries for the theory KPu. In particular it follows that $\kappa(\lambda\alpha.\omega)$ is the least ordinal $\kappa$ s.t. $L_\kappa$ is a $\Pi_2$-model of KPu, i.e.: the Bachmann-Howard ordinal is the $\Pi_2$-ordinal of KPu.

In order to obtain a theory equivalent to the theory KPi of Jäger and Pohlers in [JP], we proceed as follows. We observe that there is a $\Delta_0$-formula $\mathrm{Ad}(c)$ of KP expressing that $c$ is an admissible set. By Richter and Aczel [RA, Thm. 2.4, p. 315f.] there is a $\Pi_3$-sentence $Ax_{KP}$ of KP s.t. for every non-empty, transitive set $M$: $M$ is admissible iff $(M,\in) \models Ax_{KP}$. We choose a $\Delta_0$-formula $\mathrm{Tran}(c)$ expressing that $c$ is a non-empty, transitive set, and let $(Ax_{KP})^{(c)}$ be obtained from $Ax_{KP}$ by bounding to $c$ all unbounded quantifiers in $Ax_{KP}$ (see [Bw, p. 15]). Then $\mathrm{Ad}(c) := \mathrm{Tran}(c) \wedge (Ax_{KP})^{(c)}$ serves our needs. By a lengthy, but easy verification one can see that for every sentence $B$, which is an instance of an axiom of KP, $\mathrm{Ad}(c) \to B^{(c)}$ is derivable using only the axioms of equality and extensionality and the scheme of foundation.

Now we define KPi to be KP extended by the following axiom of inaccessibility: $\exists y(\mathrm{Ad}(y) \wedge a \in y)$. $g = \lambda\alpha.\alpha^{+}$ is provably $\Sigma_1$-definable in KPi. The asymmetric interpretation presents no difficulty, if we choose to interpret into the constructible hierarchy. Namely for $a \in L_\alpha$ we have that $a \in L_{\alpha^{+}}$, $L_{\alpha^{+}}$ is admissible and $L_{\alpha^{+}} \in L_{\alpha^{+}+1}$, and so $(\exists y \in L_{\alpha^{+}+1})(\mathrm{Ad}(y) \wedge a \in y)$ is true. This is essentially what is needed to verify the asymmetric interpretation for the axiom rule in the case of the axiom of inaccessibility. As a result we obtain a variant of Theorem 1.1 and its corollaries for the theory KPi. In particular it follows that $\kappa(\lambda\alpha.\alpha^{+})$ is the $\Pi_2$-ordinal of KPi.
REFERENCES

[Bw] Barwise, J.: Admissible Sets and Structures. An Approach to Definability Theory, Berlin-Heidelberg-New York 1975

[Fe] Ferbus, M.-C.: Functorial bounds for cut elimination in $L_{\beta\omega}$ I, in: Arch. math. Logik 24 (1984), 141-158

[Gi 1] Girard, J.-Y.: A survey of $\Pi^1_2$-logic, in: Cohen, L.J., Los, J., Pfeiffer, H., Podewski, K.-P. (eds.): Logic, Methodology and Philosophy of Science VI. Proceedings of the Sixth International Congress at Hannover 1979, Amsterdam-New York-Oxford-Warszawa 1982, 89-107

[Gi 2] --: $\Pi^1_2$-logic, Part 1: Dilators, in: Annals of Mathematical Logic 21 (1981), 75-219

[How] Howard, W.A.: A system of abstract constructive ordinals, in: JSL 37 (1972), 355-374

[Jä 1] Jäger, G.: Zur Beweistheorie der Kripke-Platek-Mengenlehre über den natürlichen Zahlen, in: Arch. math. Logik 22 (1982), 121-139

[Jä 2] --: Some Proof-Theoretic Contributions to Theories of Sets, this volume

[JP] Jäger, G., Pohlers, W.: Eine beweistheoretische Untersuchung von ($\Delta^1_2$-CA)+(BI) und verwandter Systeme, in: Sitzungsberichte der Bayerischen Akademie der Wissenschaften (1982), 1-28

[Jer] Jervell, H.R.: Introducing homogeneous trees, in: Stern, J. (ed.): Proceedings of the Herbrand Symposium. Logic Colloquium '81, Amsterdam-New York-Oxford 1982, 147-158

[Pa] Päppinghaus, P.: Ptykes in Gödel's T und Hierarchien von Ordinalzahlfunktionen, to appear in the Proceedings of the $\Pi^1_2$-workshop in Oslo - August 1984

[Pe] Pearce, J.: A constructive consistency proof of a fragment of set theory, in: Annals of Pure and Applied Logic 27 (1984), 25-62

[RA] Richter, W., Aczel, P.: Inductive definitions and reflecting properties of admissible ordinals, in: Fenstad, J.E., Hinman, P.G. (eds.): Generalized Recursion Theory. Proceedings of the 1972 Oslo Symposium, Amsterdam-London-New York 1974, 301-381

[Sch] Schwichtenberg, H.: Proof theory: Some applications of cut-elimination, in: Barwise, J. (ed.): Handbook of Mathematical Logic, Amsterdam-New York-Oxford 1977, 867-912

[Ta] Takeuti, G.: Proof theory, Amsterdam 1975
Logic Colloquium '85
Edited by The Paris Logic Group
© Elsevier Science Publishers B.V. (North-Holland), 1987
WEAKLY NORMAL GROUPS

E. Hrushovski and A. Pillay*

In this paper we show (Theorem 4.1) that a group $G$ is weakly normal (or 1-based) if and only if every definable $X \subseteq G^n$ is a Boolean combination of cosets of acl($\emptyset$)-definable subgroups (of $G^n$) $\forall n$ if and only if every definable $X \subseteq G^n$ is a Boolean combination of cosets of definable subgroups (of $G^n$) $\forall n$. (By a group we mean here a structure on which there is a $\emptyset$-definable group law). Moreover (Theorem 3.2) such a group is abelian-by-finite. This improves results and answers questions from [Pi3] (which in turn generalizes results of Zilber [Z]). The results of the present paper were actually proved independently by the two authors around the same time.

In section 1 we recall some equivalent conditions to that of weak normality, and we take the opportunity to give proofs of assertions in earlier papers (e.g. [Pi3]). In section 2 we recall some basic results on stable groups (concerning generic types, stabilizers, etc.). In section 3 we show that a "locally connected" definable subgroup of a weakly normal group is acl($\emptyset$)-definable. (This suffices to prove Theorem 3.2). In section 4 we show that if $G$ is a weakly normal group, then any $p \in S(G)$ is determined by the definable cosets in $p$ (this, together with results of section 3, is enough to prove equivalence in Theorem 4.1).

For background on stability theory, see [Pi1] and [Sh], and for stable groups, see [B-L] and [Pl]. The second author wishes to thank Daniel Lascar for some very helpful remarks.

1.
$T$ will here be a complete theory, $\mathfrak{C}$ is, as usual, a very saturated model of $T$ and $\mathfrak{C}^{eq}$ is as in [Sh]. Every element of $\mathfrak{C}^{eq}$ is a member of a certain sort $S$. Two important sorts are $S_{=}$, the sort of elements of $\mathfrak{C}$, and $S_n$, the sort of $n$-tuples from $\mathfrak{C}$.

*Supported by NSF grant DMS 8401713

A definable set $X$ (of elements of a given sort) is said to be weakly normal if every infinite set of pairwise distinct conjugates of $X$ (with respect to automorphisms of $\mathfrak{C}^{eq}$) has empty intersection. $T$ is said to be weakly normal for sort $S$ if every definable set of elements of sort $S$ is a Boolean combination of weakly normal definable sets. $T$ is weakly normal if $T$ is weakly normal for all sorts $S$.

It is easy to see (as in [Pi-S]) that if $T$ is weakly normal for sort $S_{=}$, then $T$ is stable. On the other hand, suppose $\mathfrak{C}$ has a $\emptyset$-definable group structure, and $X$ is a coset of an acl($\emptyset$)-definable subgroup $H$. Then easily $X$ is weakly normal. We will in fact show that for weakly normal groups, this is the typical situation.

Recall from [Sh] that if $p$ is a strong type over some set, then Cb($p$) (canonical base of $p$) is the set of elements of $\mathfrak{C}^{eq}$ which are fixed by every automorphism of $\mathfrak{C}^{eq}$ which fixes the unique nonforking extension of $p$ to $\mathfrak{C}^{eq}$. The stable theory $T$ is said to be 1-based for sort $S$ if for any $a$ of sort $S$ and $B$, Cb(stp($a/B$)) $\subseteq$ acl($a$). (Note that Cb(stp($a/B$)) $\subseteq$ acl($a$) means that the global nonforking extension of stp($a/B$) does not fork over $\{a\}$). $T$ is said to be 1-based if $T$ is 1-based for all sorts $S$.
We say that $T$ type-interprets a pseudoplane if there are complete types $p(x)$, $q(y)$, $I(x,y)$ of $T^{eq}$ with $p(x) \cup q(y) \subseteq I(x,y)$ such that $(p^{\mathfrak{C}^{eq}}, q^{\mathfrak{C}^{eq}}, I^{\mathfrak{C}^{eq}})$ is a pseudoplane (see [Lc]).

Proposition 1.1. The following are equivalent:
(a) $T$ is weakly normal.
(b) $T$ is stable and 1-based.
(c) $T$ is stable and does not type-interpret a pseudoplane.
Proof. Following a request of the referee we first show that the equivalence between a) and b) goes sort by sort.

a) → b): Assume $T$ is weakly normal for sort $S$. It is enough to show that for a saturated model $M$ and $a$ of sort $S$, Cb(tp($a/M$)) $\subseteq$ acl($a$). Let $p$ = tp($a/M$). For each weakly normal definable set $X \in p$, let $X'$ = the intersection of all conjugates of $X$ under automorphisms of $M$ which are in $p$. Note that $X'$ is definable. So, $X' \in p$ and $X'$ is weakly normal. It is then clear that Cb($p$) = dcl$\{X' : X$ weakly normal, $X \in p\}$ where we view each such $X'$ as an element of $M^{eq}$. But each such $X'$ is clearly acl($a$)-definable, as only finitely many conjugates of $X'$ are true of $a$. Thus Cb($p$) $\subseteq$ acl($a$).

b) → a): Assume $T$ is 1-based for sort $S$. We use the notation of [Pi2]. Let $\varphi(x,y)$ be a formula with $x$ of sort $S$. $\Delta_\varphi$ is the Boolean closure of $\{\varphi(x,y), x = z\}$. Let $X$ be $\Delta_\varphi$-definable. We show by induction on $(R_\varphi(X), \mathrm{Mlt}_\varphi(X))$ that $X$ is a Boolean combination of weakly normal definable sets. Let $R_\varphi(X) = k$. We may assume that $\mathrm{Mlt}_\varphi(X) = 1$. Let $M$ be a model such that $X$ is over $M$ and extend $X$ to some $p \in S_{\Delta_\varphi}(M)$ with $R_\varphi(p) = k$. Let $c \in M^{eq}$ be the canonical base of $p$, i.e., $c$ is the $\Delta_\varphi$-definition of $p$. Let $p_1(x) = p(x)|c$. Write $p_1(x)$ as $p_1(x,c)$. Let $r(y)$ = tp($c$).

Claim. $\{p_1(x,y_i),\ r(y_i),\ y_i \neq y_j : i \neq j < \omega\}$ is inconsistent.

If not we can easily find $a$, $c_i$, $i < \omega$, realizing the above set of formulas such that moreover tp($ac_i$) = tp($ac_j$) for all $i,j < \omega$. Clearly $\mathrm{tp}_{\Delta_\varphi}(a/c_0) = p_1(x,c_0)$ and so $c_0 \in$ Cb(stp($a/c_0$)). By hypothesis $c_0 \in$ acl($a$). But this contradicts the fact that for $i \neq j$, $c_i \neq c_j$. Thus the claim is proved.

Now by the claim and compactness we find a definable set $Y \in p_1(x,c)$ which is weakly normal. So $Y$ is $\Delta_\varphi$-definable and we may assume that $R_\varphi(Y) = k$, $\mathrm{Mlt}_\varphi(Y) = 1$. Thus the symmetric difference of $X$ and $Y$ is also $\Delta_\varphi$-definable and has $R_\varphi$ rank less than $k$. So using induction, $X$ is a Boolean combination of weakly normal sets.

b) → c):
We assume $T$ stable, 1-based, and type-interprets a pseudoplane, and obtain a contradiction. Suppose the type $I(x,y)$ defines the pseudoplane. Let $a,b$ be such that $\models I(a,b)$. Note that $b \notin \mathrm{acl}(a)$. Let $b'$ be such that $\mathrm{stp}(b'/b \cup a)$ is the nonforking extension of $\mathrm{stp}(b/a)$. So $b \neq b'$. Thus

(*) $I(x,b) \wedge I(x,b')$ is algebraic.

Let $q(x)$ be the global nonforking extension of $\mathrm{stp}(a/b)$. By 1-basedness, $q$ is definable over $\mathrm{acl}(a)$. Thus $I(x,b) \cup I(x,b') \subset q(x)$. But $q(x)$ is non-algebraic. This contradicts (*).

c) → a) is proved in [Pi2]. This completes the proof of Proposition 1.1.

It is easy to check that if $T$ is weakly normal and $T'$ is interpreted in $T^{eq}$, then also $T'$ is weakly normal. On the other hand, to check weak normality of a particular theory one would appear to have to analyse definable sets of imaginaries. The next result shows that it is enough to have weak normality of the sorts $S^n$, $n \geq 1$ (noting the sort-by-sort proof of a) ↔ b) in Proposition 1.1).
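Proposition 1.2. Let $T$ be stable and 1-based for the sorts $S^n$ ($n \geq 1$). Then $T$ is 1-based.

Proof. Let $C$ be an equivalence class of the equivalence relation defined by the $L$-formula $E(x,y)$. Let us denote by $\ulcorner C \urcorner$ the class $C$ viewed as an element of $\mathfrak{C}^{eq}$. We must show that $\mathrm{Cb}(\mathrm{stp}(\ulcorner C \urcorner/A)) \subseteq \mathrm{acl}(\ulcorner C \urcorner)$; or equivalently that $\mathrm{Cb}(\mathrm{stp}(\ulcorner C \urcorner/A)) \subseteq M$ for any model $M$ containing $\ulcorner C \urcorner$. Note that $\ulcorner C \urcorner \in M$ means $C \cap M \neq \emptyset$ (i.e. $C$ is $M$-definable). So pick $a$ in $M$ such that $a \in C$. By assumption

(*) $\mathrm{Cb}(\mathrm{stp}(a/A)) \subseteq \mathrm{acl}(a) \subset M$.

But as $C$ is $a$-definable, we have $\mathrm{Cb}(\mathrm{stp}(\ulcorner C \urcorner/A)) \subseteq \mathrm{Cb}(\mathrm{stp}(a/A))$. So (*) is enough.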
It follows (by pp-elimination of quantifiers) that any module (i.e. its theory) is weakly normal. For the same reason, an abelian group $G$, equipped with predicates for some subgroups of $G^n$, $n \geq 1$, is weakly normal. Let us remark that in the applications of weak normality we use directly the property of being 1-based.

2. By a stable group $G$ we mean a structure $G$ on which there is defined (without parameters) a (distinguished) group law, such that $T = \mathrm{Th}(G)$ is stable (e.g. an algebraically closed field). From now on, $G$ is a stable group and we consider $G$ as an elementary substructure of a large saturated model of $\mathrm{Th}(G)$. It is also convenient to assume $G$ to be $|T|^+$-saturated. We will be talking about definable subsets of $G$ and 1-types over $G$, although everything we say will, of course, work for definable subsets of the group $G^n$ and 1-types over $G^n$, where $G^n$ is equipped with the structure induced from $G$. (So $n$-types over $G$ correspond to 1-types over $G^n$.)
We recall some notions appearing in [B-L] (which have their origin also in works of Cherlin, Poizat, Zilber). By a $\bigwedge$-definable (or infinitely definable) subgroup of $G$ we mean the conjunction $\bigwedge\Phi$ of a collection $\Phi(x)$ of at most $|T|$ formulas over $G$, each defining a subgroup of $G$. We also assume $\Phi$ closed under finite conjunctions. (So $\bigwedge\Phi$ also defines a subgroup of $G$, which we often identify with $\bigwedge\Phi$.) Similarly we can define a $\bigwedge$-definable left or right coset in $G$, which will clearly be a left or right coset of a unique $\bigwedge$-definable subgroup of $G$.

$p(x) \in S_1(G)$ is said to be a left generic of the $\bigwedge$-definable subgroup $\bigwedge\Phi$ of $G$ if $\Phi \subset p$ and for all $a \in (\bigwedge\Phi)^G$, $ap$ does not fork over $A$ whenever $A \subseteq G$ is a set of parameters over which each $\varphi \in \Phi$ is defined (equivalently $ap$ does not fork over $\bar\Phi$, where $\bar\Phi$ is a set of canonical parameters for the $\varphi \in \Phi$). Similarly we can define right generics of $\bigwedge\Phi$.
(It is a fact that left generics coincide with right generics, and are called just generics.) If $\Phi$ is defined over $A$ and $p \in S_1(A)$ then $p$ is called a generic of $\Phi$ if some nonforking extension of $p$ to a $|T|^+$-saturated model is a generic of $\Phi$. (This is consistent with the case when $A = G$.) Also a nonforking extension of a generic of $\Phi$ is a generic of $\Phi$.

Let $\bigwedge\Phi$ be a $\bigwedge$-definable subgroup of $G$ and let $\bigwedge\Psi$ be (define) a right coset of $\bigwedge\Phi$ in $G$. Then $p \in S_1(G)$ is said to be a (left) generic of $\bigwedge\Psi$ if for some $a \in (\bigwedge\Psi)^G$ and generic $q \in S_1(G)$ of $\bigwedge\Phi$, $p = qa$.
The subgroup $\bigwedge\Phi$ of $G$ is said to be connected if whenever $\varphi(x)$ defines a subgroup of $G$ and $\varphi \wedge \varphi'$ has finite index in $\varphi'$ for some $\varphi' \in \Phi$, then $\bigwedge\Phi \to \varphi$.

Lemma 2.1. Let $\bigwedge\Phi$ be an infinitely definable subgroup of $G$.
i) Then there is some $p \in S_1(G)$ which is a generic of $\bigwedge\Phi$. If the definable subgroup $\varphi(x)$ is in $p$, then $\varphi \wedge \varphi'$ has finite index in $\varphi'$ for some $\varphi' \in \Phi$. Also $\bigwedge\Phi$ is connected if and only if $\bigwedge\Phi$ has a unique generic.
ii) Let $\bigwedge\Psi$ be a right coset (in $G$) of the connected subgroup $\bigwedge\Phi$. Then $\bigwedge\Psi$ has a unique left generic $p \in S_1(G)$ and moreover $\bigwedge\Psi \leftrightarrow \bigwedge\Psi'$, where $\Psi'$ is the collection of all right cosets in $p$.

Proof. i) follows from 3.13 of [B-L].
ii) Let $q_1, q_2$ be (left) generics of $\bigwedge\Psi$ and $p$ the unique generic of $\bigwedge\Phi$. So $q_1 = pa$, $q_2 = pb$ for some $a, b \in (\bigwedge\Psi)^G$. But $p = pba^{-1}$, as $p$ is the unique (right) generic of $\bigwedge\Phi$. So $q_1 = q_2$. The rest of ii) follows from i).
Recall that for $p \in S_1(G)$, $\mathrm{Stab}\, p = \{a \in G : ap = p\}$. $\mathrm{Stab}\, p$ "is" a $\bigwedge$-definable subgroup of $G$; more precisely, $\mathrm{Stab}\, p = \bigwedge\{\mathrm{Stab}_\varphi(p) : \varphi(x,y) \in L\}$, where $\mathrm{Stab}_\varphi(p) = \{a \in G : \varphi(bx,c) \in p \leftrightarrow \varphi(bx,c) \in ap, \text{ for all } b, c \text{ in } G\}$, and the latter is, by stability, a definable subgroup of $G$.

Lemma 2.2. Suppose that $p \in S_1(G)$ and for some right coset $\bigwedge\Psi$ of $\mathrm{Stab}\, p$, $\Psi \subset p$. Then $\mathrm{Stab}\, p$ is connected, and $p$ is the unique left generic of $\bigwedge\Psi$.

Proof. Let $a \in (\bigwedge\Psi)^G$. Then clearly $\mathrm{Stab}\, p = \mathrm{Stab}(pa^{-1})$ and moreover $\mathrm{Stab}(pa^{-1}) \subset pa^{-1}$. By 6 (p. 344) of [P1], $\mathrm{Stab}\, p$ is connected and $q = pa^{-1}$ is its unique generic. So clearly, by Lemma 2.1 ii), $p = qa$ is the unique generic of $\bigwedge\Psi$.

Finally we say that the definable subgroup $H$ of $G$ is locally connected if for any conjugate $H'$ of $H$ (under automorphism), $H \cap H' = H$ or $H \cap H'$ has infinite index in $H$.
3. As before, a group $G$ denotes a structure with a distinguished $\emptyset$-definable group law. We say $G$ is weakly normal if $\mathrm{Th}(G)$ is. So a weakly normal group is stable. The following improves 3.1 of [Pi3].

Lemma 3.1. Let $G$ be a weakly normal group. Then any locally connected definable subgroup of $G$ is $\mathrm{acl}(\emptyset)$-definable.

Proof. Let $H$ be a definable locally connected subgroup of $G$, and let $u \in G^{eq}$ be its canonical parameter.

Claim. There is $G' > G$ and $g'$ such that the definable coset $Hg'$ is $G'$-definable, $\mathrm{tp}(g'/G')$ is a (left) generic of $Hg'$ and $\mathrm{tp}(g'/G)$ does not fork over $\emptyset$ (in particular $g'$ and $u$ are independent over $\emptyset$).

Proof of Claim. First let $a$ be such that $\mathrm{tp}(a/G)$ is a generic of $H$. Let $G'' > G$, $G'' \supseteq G \cup a$, and let $g$ be such that $\mathrm{tp}(g/G'')$ is a generic of $G''$ (i.e. of '$x = x$'). So note that $a$ and $g$ are independent over $G$. So let $G' \supseteq G \cup g$ be a model such that $\mathrm{tp}(a/G')$ does not fork over $G$. So $\mathrm{tp}(a/G')$ is a generic of $H$ (or rather of $\varphi$, where $\varphi(x)$ defines $H$). Thus $\mathrm{tp}(ag/G')$ is a generic of the coset $Hg$ (which is $G'$-definable). We put $g' = ag$. So $Hg = Hg'$. Also, as $\mathrm{tp}(g/G'')$ is a generic of '$x = x$', $\mathrm{tp}(ag/G'')$ does not fork over $\emptyset$ and so $\mathrm{tp}(g'/G)$ does not fork over $\emptyset$. This proves the claim.

We may assume that $G'$ is very saturated. Let $v$ be the canonical parameter of $Hg'$. By local connectedness of $H$, the $Hg'$-genericity of $\mathrm{tp}(g'/G')$ and Lemma 2.1, it follows that $v \in \mathrm{Cb}(\mathrm{tp}(g'/G'))$. By weak normality of $G$, $v \in \mathrm{acl}(g')$. Clearly $u \in \mathrm{dcl}(v)$ ($u$ is the canonical parameter of $H$). So $u \in \mathrm{acl}(g')$. But by the claim $u$ and $g'$ are independent over $\emptyset$, so $u \in \mathrm{acl}(\emptyset)$, i.e., $H$ is $\mathrm{acl}(\emptyset)$-definable.

Theorem 3.2. Let $G$ be a weakly normal group. Then $G$ is abelian-by-finite. Moreover, if the language of $G$ is just the language of groups, the converse also holds.

Proof. The fact that $G$ is abelian-by-finite follows from Lemma 3.1 as in [Pi3] (based on an idea of Zilber [Z]). We will give a slightly streamlined argument, pointed out by Poizat. We may assume, by stability, that $G$ has no proper centralizer of finite index. For each $g \in G$, let $H_g = \{(h, g^{-1}hg) : h \in G\}$. So each $H_g$ is a locally connected definable subgroup of $G^2$, and so by Lemma 3.1, $\mathrm{acl}(\emptyset)$-definable. So there are only finitely many distinct $H_g$, $g \in G$. But the $H_g$'s index the cosets of $Z(G)$ in $G$. So $Z(G)$ has finite index in $G$. So $Z(G) = G$.
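The counting step of this argument (the subgroups $H_g$ of $G^2$ index the cosets of $Z(G)$) can be checked by brute force on small finite groups. The following Python sketch is only an illustration under assumptions that are not in the paper: the groups $S_3$ and $D_4$, the encoding of elements as permutation tuples, and all function names are hypothetical choices, and finite groups say nothing about the model-theoretic hypotheses. It merely confirms that $H_g = H_{g'}$ exactly when $g$ and $g'$ lie in the same coset of $Z(G)$.

```python
from itertools import permutations


def compose(p, q):
    """Product p*q of two permutations given as tuples (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    """Inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def generate(generators):
    """Subgroup generated by a list of permutations (finite closure)."""
    identity = tuple(range(len(generators[0])))
    elements = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = compose(x, g)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return sorted(elements)


def twisted_diagonal(group, g):
    """H_g = {(h, g^{-1} h g) : h in G}, a subset of G x G."""
    g_inv = inverse(g)
    return frozenset((h, compose(g_inv, compose(h, g))) for h in group)


def centre(group):
    """Elements commuting with every element of the group."""
    return [z for z in group if all(compose(z, x) == compose(x, z) for x in group)]


def count_distinct_H(group):
    """Number of distinct subgroups H_g as g ranges over the group."""
    return len({twisted_diagonal(group, g) for g in group})


S3 = list(permutations(range(3)))
# Dihedral group of the square: rotation r and a reflection s (assumed encoding).
D4 = generate([(1, 2, 3, 0), (0, 3, 2, 1)])

for name, G in (("S3", S3), ("D4", D4)):
    index_of_centre = len(G) // len(centre(G))
    print(name, "distinct H_g:", count_distinct_H(G), "index of Z(G):", index_of_centre)
```

In both cases the two printed numbers agree, which is the finite shadow of the step "the $H_g$'s index the cosets of $Z(G)$"; the finiteness of the family of $H_g$ in the proof itself comes from Lemma 3.1, not from any such computation.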
If $G$ is abelian-by-finite, and has no structure other than the group structure, then $G$ is known to be interpretable (with parameters) in an $R$-module for some ring $R$ (see [BCM] for example). But we have already pointed out that any module is weakly normal. So therefore is $G$. This completes the proof of Theorem 3.2.

Remark 3.3. S. Buechler has pointed out that the proof of Lemma 3.1 works with "$G$ 1-based" replaced by the weaker condition "for any $p = \mathrm{stp}(a/A)$, $a$ dominates $\mathrm{Cb}(p)$ over $\emptyset$." In effect we obtain that $g'$ dominates $v$ over $\emptyset$. So $g'$ dominates $u$ over $\emptyset$, and as $g'$ and $u$ are independent over $\emptyset$, $u$ is algebraic over $\emptyset$. Thus this weaker condition implies abelian-by-finiteness.
4. Here we prove our main theorem.

Theorem 4.1. (a) Let $G$ be a weakly normal group. Then every definable subset of $G^n$ is a Boolean combination of cosets of $\mathrm{acl}(\emptyset)$-definable subgroups of $G^n$.
(b) Conversely, if $G$ is a group in which every definable subset of $G^n$ is a Boolean combination of definable subgroups, then $G$ is weakly normal.

The theorem may be restated as follows. An Abelian structure is an Abelian group $A$ together with distinguished subgroups of $A^n$ for the various $n$'s. Any Abelian structure is interpretable in a module.

Restatement. Let $G$ be a weakly normal group. Then $G$ is parametrically bi-interpretable with an Abelian structure. More precisely, $G$ has a $\emptyset$-definable subgroup $A$ of finite index. Let $M_0$ be any elementary submodel of $G$, and add a constant for every element of $M_0$ so that $M_0 = \mathrm{dcl}(\emptyset)$. Let $A_0$ denote the model with universe $A$ and with the induced Abelian structure alone, and let $\bar{A}$ denote $A$ with the full induced structure. Then:
(1) There exists an $M_0$-definable bijection of $G$ with $A \times n$ (for some $n$).
(2) The structures $\bar{A}$ and $(A_0, a)_{a \in A \cap M_0}$ have exactly the same definable relations.

To prove the restatement, use Theorem 3.2 to find $A$, and add constants for $M_0$. (1) is obvious as $M_0$ has a set of representatives for the cosets of $A$ in $G$. For (2), let $\varphi(x)$ define a subset of $A^n$ for some $n$. By 4.1(a), $\varphi(x)$ is equivalent to a Boolean combination of cosets of $\emptyset$-definable subgroups. It may be written as a disjunction of expressions of the form $C - \bigcup_i C_i$, where $C$ and the $C_i$'s are cosets of $\emptyset$-definable subgroups, each $C_i \subset C$, and no $C_i$ has finite index in $C$; and moreover it can be arranged that if $C$ and $D$ are two cosets that appear positively, $C \cap D$ has infinite index in both $C$ and $D$. Under these conditions, it is easy to see that the set of cosets appearing positively is uniquely determined. (Remember that $\varphi$ is $\emptyset$-definable.) Thus each $C$ is algebraic. Since $M_0$ is a model, $C \cap M_0 \neq \emptyset$. If $c \in C \cap M_0$, then $C = (C - c) + c$ is $\emptyset$-definable in $(A_0, a)_{a \in A \cap M_0}$. The cosets occurring negatively can now be dealt with in a similar manner. Thus $\varphi$ is $\emptyset$-definable in $(A_0, a)_{a \in A \cap M_0}$. □

The proof of 4.1(a) hinges on the following lemma.
Lemma 4.2. Let $G$ be weakly normal, $|T|^+$-saturated, and $p \in S_n(G)$. Then for some (right) coset $\bigwedge\Psi$ of $\mathrm{Stab}\, p$, $\Psi \subset p$.

Let us first see how Theorem 4.1(a) follows from this Lemma. Given $p \in S_n(G)$, let $\bigwedge\Psi$ be as in Lemma 4.2. It follows from Lemma 4.2 and Lemma 2.2 that $\mathrm{Stab}\, p$ is connected. Thus $\mathrm{Stab}(p) \leftrightarrow \bigwedge\Phi$, where $\Phi = \{\varphi(x) \text{ over } G : \varphi \text{ defines a locally connected subgroup and } \models \mathrm{Stab}\, p \to \varphi\}$. By Lemma 3.1, each $\varphi \in \Phi$ is $\mathrm{acl}(\emptyset)$-definable. By Lemma 2.1 ii), $p$ is the unique generic type of $\bigwedge\{\psi(x) \in p : \psi \text{ is a right coset of an } \mathrm{acl}(\emptyset)\text{-definable subgroup}\}$. So it follows that if $p, q \in S_n(G)$, then $p = q$ if and only if whenever $\psi$ defines a right coset of an $\mathrm{acl}(\emptyset)$-definable subgroup of $G$, then $\psi \in p$ iff $\psi \in q$. It follows easily that every definable $X \subset G^n$ is a Boolean combination of cosets of $\mathrm{acl}(\emptyset)$-definable subgroups.

Proof of Lemma 4.2. Without loss of generality $n = 1$. Let $\mathrm{tp}(g/G)$ be a generic of $G$. Let the model $G' \supseteq G \cup g$ be $|T|^+$-saturated. Let $p' = \mathrm{tp}(a/G')$ be the nonforking extension of $p$. By genericity of $\mathrm{tp}(g/G)$, we have

(I) $a$ and $ga$ are independent over $G$.

Let $q = \mathrm{tp}(ga/G')$ ($= gp'$). Let $S = \mathrm{Stab}\, p$. So $S$ is a $\bigwedge$-definable subgroup of $G$, and moreover $S = \mathrm{Stab}\, p'$ too. Let $gS$ denote the ($\bigwedge$-definable) left coset of $S$ by $g$.

Claim (II). Let $f$ be a $G$-automorphism of $G'$. Then $f(q) = q$ iff $f(gS) = gS$.

Proof. Note that any $G$-automorphism $f$ of $G'$ fixes $p'$. Thus $q = f(q)$ iff $gp' = f(gp')$ iff $gp' = f(g)p'$ iff $f(g)^{-1}g\,p' = p'$ iff $f(g)^{-1}g \in S$ iff $gS = f(g)S$ iff $gS = f(gS)$ (as $S$ is also $G$-definable).

Let now $\ulcorner gS \urcorner$ denote the set of canonical parameters for the definable cosets in $gS$. So from Claim (II) we have

(III) $q$ does not fork over $G \cup \ulcorner gS \urcorner$,

and moreover, as $G$ is 1-based, we have

(IV) $\ulcorner gS \urcorner \subseteq \mathrm{acl}(G \cup \{ga\})$.

Claim (V). $\mathrm{tp}(ga/G \cup \{a\} \cup \ulcorner gS \urcorner)$ does not fork over $G \cup \ulcorner gS \urcorner$.

Proof. By (IV) and (I), $\mathrm{tp}(a/G \cup \{ga\} \cup \ulcorner gS \urcorner)$ does not fork over $G$ and so does not fork over $G \cup \ulcorner gS \urcorner$. Now apply forking symmetry.

Now by (III) and Claim (V) we have:

(VI) $\mathrm{tp}(ga/G')$ and $\mathrm{tp}(ga/G \cup \{a\} \cup \ulcorner gS \urcorner)$ are parallel.

Now $ga \in (gS)a$, the latter being an infinitely $\{a\} \cup \ulcorner gS \urcorner$-definable set. By (VI) and saturation of $G'$ there is $a' \in G'$ such that $ga \in g(Sa')$ (note that $g \in G'$ and $S$ is $G$-definable). So $a \in Sa'$. As $p' = \mathrm{tp}(a/G')$ is the nonforking extension of $p$, there is, by saturation of $G$, some $a'' \in G$ such that $a \in Sa''$, i.e. '$x \in Sa''$' $\in p$. So we have shown that some right coset of $S = \mathrm{Stab}\, p$ is in $p$, and this concludes the proof of Lemma 4.2, and thus of Theorem 4.1(a).

Proof of Theorem 4.1(b). Let $G$ be a group, and assume that every $\emptyset$-definable subset of $G^n$ is a Boolean combination of definable cosets. We show that $G$ is weakly normal. Without loss of generality, $G$ is $|T|^+$-saturated.

Note first that $G$ is stable: call a formula $\varphi(x,y)$ stable if there is no sequence $a_i$, indexed by an infinite linearly ordered set $I$, such that $i < j \Rightarrow \varphi(a_i,a_j) \wedge \neg\varphi(a_j,a_i)$. By the usual Ramsey argument, the class of stable formulas in variables $x,y$ is closed under negation and under disjunctions, hence under Boolean combinations. Since every definable subset of $G^n$ is a Boolean combination of cosets, it suffices to prove that cosets are stable. Suppose $\varphi(x,y)$ = "$(x,y) \in dH$" is unstable, where $H$ is a definable subgroup of $G^n$ and $d \in G^n$. By making a change of variables, we may assume $d = 1$. So there exists a sequence $a_i$ ($i \in \mathbb{Q}$) such that $i < j \Rightarrow (a_i,a_j) \in dH \wedge (a_j,a_i) \notin dH$. In particular, $(a_0,a_3) \in dH$, $(a_0,a_1) \in dH$, so $(1, a_3^{-1}a_1) \in H$. Since $(a_2,a_3) \in dH$, we get $(a_2,a_1) \in dH$, a contradiction.
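The algebraic heart of the stability argument is a "rectangle" property of cosets: if $(a_0,a_1)$, $(a_0,a_3)$ and $(a_2,a_3)$ lie in $dH$, then so does $(a_2,a_1)$, because $(a_2,a_3)(a_0,a_3)^{-1}(a_0,a_1) = (a_2,a_1)$. The Python sketch below checks this exhaustively in one small case and contrasts it with an order relation, which fails the property. Everything specific here is an assumption made for illustration and is not taken from the paper: the group $S_3$, the particular subgroup $H$ of $S_3 \times S_3$, and the function names.

```python
from itertools import permutations, product


def compose(p, q):
    """Product p*q of two permutations given as tuples (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))


def pair_mul(u, v):
    """Componentwise product in G x G."""
    return (compose(u[0], v[0]), compose(u[1], v[1]))


def generate(generators, mul, identity):
    """Finite subgroup generated by the given elements under mul."""
    elements = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = mul(x, g)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return elements


def find_bad_rectangle(relation, elements):
    """Return (a0, a1, a2, a3) with (a0,a1), (a0,a3), (a2,a3) in the relation
    but (a2,a1) outside it, or None if no such quadruple exists."""
    for a0, a1, a2, a3 in product(elements, repeat=4):
        if ((a0, a1) in relation and (a0, a3) in relation
                and (a2, a3) in relation and (a2, a1) not in relation):
            return (a0, a1, a2, a3)
    return None


G = list(permutations(range(3)))
e = tuple(range(3))
# An arbitrary (hypothetical) subgroup H of G x G, given by two generators.
H = generate([((1, 0, 2), (1, 0, 2)), ((1, 2, 0), e)], pair_mul, (e, e))

# Every left coset dH of H satisfies the rectangle property.
cosets_ok = all(
    find_bad_rectangle({pair_mul(d, h) for h in H}, G) is None
    for d in product(G, repeat=2)
)

# A linear order, by contrast, admits a bad rectangle.
order = {(x, y) for x in G for y in G if x < y}
witness = find_bad_rectangle(order, G)

print("all cosets pass:", cosets_ok)
print("order relation has a bad rectangle:", witness is not None)
```

The contrast is the point of the design: an unstable formula orders an infinite sequence, and an ordered sequence always produces a bad rectangle, whereas a coset never does.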
We need to show that every definable subgroup is a finite union of cosets of subgroups definable almost over $\emptyset$. Let $S_a$ be a subgroup definable from the parameters $a$; for convenience, and without real loss of generality, assume $a \in G$. Let $S = \{(x,y) : S_x \text{ is a subgroup and } y \in S_x\}$. By assumption, $S$ is a Boolean combination of definable cosets; they are all definable over some set, say $X$. Since $S$ is $\emptyset$-definable, it must also be equal to a Boolean combination of $X'$-definable cosets, whenever $X'$ is a conjugate of $X$; so $X$ may be chosen independent from $a$. Using disjunctive normal form, we may write $S$ as a finite union of sets of the form $C - (E_1 \cup \dots \cup E_n)$, where $C$ is a coset of some subgroup $H$, and each $E_i$ is a coset of some subgroup of $H$.

Let $b$ realize the generic type of the connected component of $S_a$ over $X \cup \{a\}$. So the pair $(a,b)$ is in one of the disjuncts $C - (E_1 \cup \dots \cup E_n)$. Let $H$ denote the subgroup of which $C$ is a coset, and let $H_1 = \{y : (1,y) \in H\}$.

Claim. $S_a \cap H_1$ has finite index in both $S_a$ and $H_1$. In other words $S_a^0 = H_1^0$, where $(\ )^0$ denotes the connected component of $(\ )$.

Let $p$ be the generic type of $S_a^0$ (over $X \cup \{a,b\}$), and let $c \models p$. Then $(a,c) \in C$, so $(1, bc^{-1}) \in H$. Since $bc^{-1}$ itself realizes $p$, every generic element of $S_a^0$ is in $H_1$, so $S_a^0 \subset H_1$. We need to show conversely that $H_1^0 \subset S_a$.

Let $F_i$ be the subgroup belonging to $E_i$, and let $F_{i,1} = \{y : (1,y) \in F_i\}$. Let $Q_1 = \{i : (a,1) \in E_i\}$, $Q_2 = \{i : F_{i,1} \text{ has finite index in } H_1\}$. If $i \in Q_2$, then $b \in S_a^0 \subset H_1^0 \subset F_{i,1}$, so $(1,b) \in F_i$; since $(a,b) \notin E_i$, $(1,b) \in F_i$ and $(a,1) \in E_i$ are not simultaneously possible. Thus $Q_1$ and $Q_2$ are disjoint.

Now let $q$ be the generic type of $H_1^0$ over $X \cup \{a,b\}$. There are two possibilities:
(1) If $c \models q$, then $(a,c)$ is not in any $E_i$.
(2) For some $i$, if $c \models q$, then $(a,c) \in E_i$.

If the first possibility holds, then $(a,c) \in C$ whenever $c \models q$, so $H_1^0 \subset S_a$. So suppose (2) holds. By comparing two generic realizations of $q$, it follows that $(1,c) \in F_i$ for $c \models q$. Thus $H_1^0 \subset F_{i,1}$. Again by (2), $(a,1) \in E_i$. Thus $i \in Q_1 \cap Q_2$, a contradiction. This proves the claim.
Now $H_1$ is $X$-definable, so $H_1 \cap S_a$ is definable almost over $X$. (Every conjugate of $H_1 \cap S_a$ is a subgroup of $H_1$ containing $H_1^0$; the number of possibilities is bounded.) For the same reason, $H_1 \cap S_a$ is definable almost over $\{a\}$. Since $a$ and $X$ are independent, $H_1 \cap S_a$ is definable almost over $\emptyset$. Thus $S_a$ has a subgroup of finite index definable almost over $\emptyset$.

We have shown that every definable group is a finite union of cosets of $\mathrm{acl}(\emptyset)$-definable subgroups. Therefore every formula is a Boolean combination of such cosets. It was remarked in section 1 that any coset of an $\mathrm{acl}(\emptyset)$-definable subgroup of $G^n$ is weakly normal. By Propositions 1.1 and 1.2, $G$ is weakly normal.

REMARKS

4.3 By Theorem 4.1, the structure of an infinite dimensional vector space over a field, with a predicate picking out a basis, is not 1-based (although it is $\omega$-stable).

4.4 It follows easily from Theorem 4.1 that a weakly normal group is non-multidimensional (in fact any type over a $|T|^+$-saturated model is a translate of a type based on $\mathrm{acl}(\emptyset)$), and has the following strong version of NOTOP: if $M_1, M_2$ are models independent over $M_0 \subset M_1 \cap M_2$, then $\mathrm{acl}(M_1 \cup M_2)$ is a model. To see this, note that the property is preserved by addition of constants and by interpretation, and use the restated version of the theorem and the corresponding fact for modules from [Pi-Pr].

4.5 The first author has recently shown that if $T$ is weakly normal and $q$ is a non-trivial regular type, then $T$ interprets a group $G$, and $q$ is domination-equivalent to the generic type of a subgroup of $G$. Thus the characterization of weakly normal groups in this paper is a powerful fact about general 1-based theories, at least in the superstable context.

4.6 A superstable theory is simple if for all algebraically closed sets $A, B, C$ with $A \subset B \cap C$, $B$ and $C$ are independent over $A$ if and only if $\mathrm{Aut}(\mathfrak{C}/B) \cup \mathrm{Aut}(\mathfrak{C}/C)$ generates $\mathrm{Aut}(\mathfrak{C}/A)$. This is considerably weaker than weak normality; it can be thought of as saying that weak normality holds if one adjoins imaginaries of high complexity. The second author has shown that simple superstable groups are nilpotent-by-finite.

REFERENCES

[BCM] W. Baur, G. Cherlin and A. Macintyre, Totally categorical groups and rings, Journal of Algebra 57 (1979), 407-440.
[B-L] Ch. Berline and D. Lascar, Superstable groups, Annals of Pure and Applied Logic 30 (1986), 1-43.
[Lc] A.H. Lachlan, Two conjectures on the stability of $\aleph_0$-categorical theories, Fund. Math. 81 (1974), 133-145.
[Pi1] A. Pillay, An introduction to stability theory, Oxford University Press, 1983.
[Pi2] A. Pillay, Stable theories, pseudoplanes and the number of countable models, submitted to Annals of Pure and Applied Logic.
[Pi3] A. Pillay, Superstable groups of finite rank without pseudoplanes, to appear in Annals of Pure and Applied Logic (Proceedings of Trento meeting).
[Pi-Pr] A. Pillay and M. Prest, Modules and stability theory, submitted to Transactions of the A.M.S.
[Pi-S] A. Pillay and G. Srour, Closed sets and chain conditions in stable theories, Journal of Symbolic Logic 49 (1984), 1350-1362.
[P1] B. Poizat, Groupes stables avec types génériques réguliers, Journal of Symbolic Logic 48 (1983), 339-355.
[P2] B. Poizat, A propos de groupes stables, preprint, 1985.
[Sh] S. Shelah, Classification Theory, North-Holland, 1978.
[Z] B. Zilber, Structural properties of models of $\aleph_1$-categorical theories, preprint, 1983.
U. Hrushovski Department of Mathematics University of California at Berkeley Berkeley, California 94720
A. Pillay Department of Mathematics University of Notre Dame Notre Dame, Indiana 46556
Logic Colloquium '85 Edited by The Paris Logic Group
© Elsevier Science Publishers B.V. (North-Holland), 1987
A PROPOS DE GROUPES STABLES
Bruno Poizat
Université Pierre et Marie Curie

Et qui sait si les fleurs nouvelles que je rêve
Trouveront dans ce sol lavé comme une grève
Le mystique aliment qui ferait leur vigueur
C.B.

(And who knows whether the new flowers I dream of / will find in this soil, washed like a shore, / the mystic nourishment that would give them their vigour.)

The great fashion of the late 1960s, for logicians of a somewhat algebraic turn of mind, was the safari for groups, rings, fields, or the most bizarre structures, that eliminated quantifiers or had a decidable theory. When the fundamental classifications of contemporary Model Theory appeared (categoricity, stability, superstability, etc.), it became necessary also to hunt down those that fell within this framework. We have seen impressive processions of results parade by, which clearly display the enthusiasm of the lovers of the subject; unfortunately, for many of them, the recipe lacked sophistication: take a frozen dish out of an algebra book, coat it in logical sauce, and put it in the microwave oven. Our stomachs quickly tire of such unspiced dishes, and our minds are tormented by this insidious question: why do this, why is it so necessary to set off in search of stable groups? Is it an artificial bringing together of two notions from foreign horizons, the group and stability, with no other interest than to make thesis advisors short of topics happy?

Well no, for one can claim that these stable groups will intervene in (almost?) every context where Logic has a mathematical, and not merely metamathematical, interest; these two things, the group and stability, have the same meaning: structure. Groups are at once the most typical, the most mysterious and the most fascinating objects of our mathematics, and there is no need to argue at length to convince those who are not already convinced that a group brings a mathematically significant structure. As for stability, it is what allows one to tame this structure: too much structure is no longer structure, it is chaos. I add that the meaning of stability, for its part, no longer holds any mystery, and is easily accessible now that a good textbook on the subject is available.

In a word, every time a structure remains under our control, it will be stable, and, if it is not trivial, one will find a group in it; hence the importance of stable groups. To bolster this passably terrorist argument, I refer to the construction by Boris Zil'ber of a "binding group" in an aleph-one-categorical structure that is not strongly minimal [ZIL'BER 1980], to the "Galois group" associated by Ellis Kolchin with certain differential equations [KOLCHIN 1973], [POIZAT 1985, ch. 18], and to the group associated by Ehud Hrushovski with a non-trivial regular type [HRUSHOVSKI 1986].
A - STRUCTURES WITHOUT STRUCTURE

The structure most devoid of it (of structure) is surely that of an infinite set $A$ in the language of equality alone. So if my introduction is not pure delirium, no infinite group can be defined in it. You are quite convinced of this, but how does one prove it?

Let us make a first attempt. If you have some knowledge of geometry, you can argue as follows: let $G$ be such a group; it is omega-categorical, hence of finite exponent $n$; put on $A$ a structure of algebraically closed field $K$, of characteristic zero; as we shall see, $G$ then becomes an algebraic group; let $G_a$ be its largest connected affine subgroup; $G/G_a$ is an abelian variety, which is finite since an infinite abelian variety contains elements of every finite order prime to the characteristic. Choose a linear representation of $G_a$, which becomes a group of matrices; the eigenvalues of the "generic point" of $G_a$ (which is the type defined by the prime ideal associated with this variety) are to be sought among the $n$-th roots of unity; hence the generic remains generic over each of its eigenvalues; now the matrix equation $\det(X - \lambda I) = 0$ defines a Zariski closed subset of $G_a$: if it is satisfied generically, it is satisfied everywhere, and in particular by the identity; the only possible eigenvalue for the generic is therefore 1, and as 1 is a simple root of the polynomial $X^n - 1$, the generic, and hence every element of $G_a$, satisfies $X = I$; $G_a = I$, and $G$ is finite!

While we are at it, one observes that in characteristic $p$ a connected algebraic group of finite exponent is of exponent $p^m$; it consists of unipotent matrices, hence it is nilpotent.

If you lack knowledge of geometry, you look for a more reasonable proof: this time put on $A$ a structure of chain; chains have a property singled out and named by James Schmerl [SCHMERL 1977], which I would prefer to call local, namely the existence of an integer $k$ (for a chain, $k = 2$) such that for every element $a$ and every finite tuple $b$ the type of $a$ over $b$ is determined by its restriction to fewer than $k$ elements of $b$.

A local structure does not allow one to interpret an infinite group $G$: let $G$ be a group, defined on a definable part of $A^m/E$, where $E$ is a definable equivalence relation; if $G$ were infinite, one could find a sequence of $m$-tuples $a_1, \dots, a_n, \dots$, indiscernible (in the order), giving distinct elements $b_1, \dots, b_n, \dots$ in $G$ (or more precisely in an elementary extension of $G$); there then exists an integer $n$ such that the type of the product, in the sense of the group law, $b_1 \cdots b_n$ over $\{a_1, \dots, a_n\}$ is determined by its restriction to $\{a_1, \dots, a_{i-1}, a_{i+1}, \dots, a_n\}$; one sees that $b_1 \cdots b_n$ is rational over this set, which conflicts with the indiscernibility of the sequence.

One may be surprised to find that it is easier to enrich the structure first, and to prove a result stronger than the one posed by the original problem; the point is that equality, like every infinite stable structure, is not local; indeed, in a local structure one cannot find an infinite totally indiscernible sequence, as the reader will easily verify; the reader will verify with equal ease that a local structure cannot have the independence property (see [POIZAT 1985, ch. 12]). For a direct argument, one must use the stable analogue of locality, that is, the existence of an integer $k$ such that for all $a$ and $b$, there exists a part $c$ of $b$ of at most $k$ elements such that $\mathrm{tp}(a/b)$ is the unique nonforking extension of $\mathrm{tp}(a/c)$ (for forking, see [POIZAT 1985, ch. 15]).
B - THE CANONICAL EXAMPLES

So as not to lose any further a reader who is not already a specialist in the subject, I now describe the main families of groups known to be stable.

First of all I specify what I mean by "stable group": a group $G$ defined in a stable structure; this amounts to saying, thanks to the Parameter Separation Theorem [POIZAT 1985, 12.31], a group $G$ equipped with additional structure, the whole being stable. This convention is not the product of a gratuitous love of generality; first, Model Theory is powerless to distinguish, in a general setting, what comes from the group law alone from what requires a richer language, and I know of no stability theorem specifying the restriction to the language of groups alone; next, even when we study a group $G$ reduced to its group law alone, we see subgroups $H$ or quotient groups $G/H$ definable in $G$ appear, and those must be considered with all the structure that comes from $G$; finally, a group appearing in a mathematical context is very likely to carry additional structure. For example, the natural language for studying algebraic groups is that of geometry, the one in which one considers everything that is definable by means of the base field (which is algebraically closed); $G$ is then an aleph-one-categorical structure; it is nevertheless true that, in many cases, one can reconstruct all the geometry from the group law alone.

Another illustration: by a theorem of Angus Macintyre, improved by Gregory Cherlin, one knows that a superstable field is algebraically closed; as algebraically closed fields are the very paradigm of an omega-stable, and even strongly minimal, structure, one has the impression that the problem is settled. This is a mistake, for the fields one sees appearing in the context of stable groups may be equipped with a richer structure; it would be particularly important to know whether a field of finite Morley rank is necessarily of rank one; and even what a field of rank one is, nobody knows: it is possible that it is nothing other than an (algebraically closed) field in which a few constants have been singled out, but nobody has succeeded in showing it.
trer. 1 - Les groupes faiblement normaux TOU4
leo
g~oupeo
abet1enb
~ont ~tableo, quand on est dans Ie seul
langage des groupes, naturellement; plus generalement, il en est ainsi des modules, et des groupes abeliens par fini, qui s'interpretent dans un module.
Leurs parties definissables ant meme une structure beaucoup
plus trivialeque ce qu'impose la stabilite:
d'apres Wanda Szmieliev
[SZMIELIEV 1955] et Walter Baur [BAUR 1976], ce sont seulement les combinaisons booleennes (finies~) de classes modulo des sous-groupes definis-
Groupes Stables sables sans parametres.
Ces groupes sont des structures dimensionelles,
chaque type etant D-equivalent sur
0.
249
a
(et en fait:
translate de) un type base
On observa, en lisant [POIZAT 1985, ch. 6J, que l'analyse modele-
theorique des modules se ramene au seul resultat d'algebre suivant, du
a
[NEUMANN 1952J: LEMME DE B. NEUMANN: S'<' G eAt un gJeoupe qu.<. eAt Jeec.ouveJt..t paJt un nomlYte n.£n.<. de c£MlleA modu1.oceJt.:taA..nll de. lleA llOUll-gJeOUpU, :a eAt Jee.e-ouveJt.t paJt
e-illeA d' e.YittLe. illeA doiii [e. gJt.Oupe. eAt d' -<-niUe-e. 6-<-n-<- daM Pour ceux qui connaissent un peu de stabilite:
G.
une theorie Test
dite 6a.<.b£.emenJ.: nuJema£.e (au "one-based") si, pour chaque uple 11 et chaque modele M de T, l'ensemble canonique de definition de tp(~/M) est algebrique sur a; au encore, si A designe l'intersection de M et de la cloture algebrique imaginaire de ~, tp(~/M) ne devie pas sur A; dans une telle theorie, Ie type d'une suite infinie indiscernable est determine par la connaissance de ses deux premiers elements:
Ie premier base Ie type moyen
de la suite (d'oll Ie terme "one-based"), Ie second fixe son type fort. L'interet de cette notion, c'est que toutes les theories omega-categoriques superstables sont faiblement normales, comme Ie montre l'analyse de [CHERLIN, HARRINGTON, LACHLAN 1985J. Anand Pillay a montre qu'un groupe G etait faiblement normal si et seulement si, pour chaque entier n, chaque partie definissable de en eta it combinaison booleenne de classes modulo des sous-groupes de finis sables presque sans parametres (i.e., avec des parametres dans la cloture algebrique imaginaire de 0); plus recemment, Ehud Hrushovski a montre que si n toute partie definissable de chaque G etait combinaison booleenne de classes modulo des sous-groupes definissables de en, avec parametres quelconques, alors C etait stable et faiblement normal; tout cela se trouve dans ce volume, [HRUSHOVSKI, PILLAY 1987J.
These groups are abelian-by-finite, as the following result of pure group theory shows:

THEOREM: The set $D = \{(x, x^{-1}) : x \in G\}$ is a Boolean combination of finitely many cosets of subgroups of $G^2$ if and only if G is abelian-by-finite.

PROOF: Suppose that G has an abelian subgroup H whose cosets are $a_1H, \dots, a_nH$; set $K_i = \{(x, a_i x^{-1} a_i^{-1}) : x \in H\}$. These are subgroups of $G^2$ since H is abelian, and D is the union of the $(a_i, a_i^{-1})K_i$.
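The first half of the proof can be checked by brute force on a small example. The sketch below is an illustration only: it takes G = S_3 and H = A_3 (a choice made here, not in the text), builds the sets K_i, and verifies that they are subgroups of G^2 and that D is the union of the cosets (a_i, a_i^{-1})K_i. All helper names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Group product p*q on permutations stored as tuples (apply q, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def mul2(u, v):
    """Componentwise product in G x G."""
    return (compose(u[0], v[0]), compose(u[1], v[1]))

# Assumption for this example: G = S_3, H = A_3 (abelian, index 2).
G = list(permutations(range(3)))
identity = (0, 1, 2)
rotation = (1, 2, 0)
H = [identity, rotation, compose(rotation, rotation)]
reps = [identity, (1, 0, 2)]      # coset representatives a_1, a_2

# D = {(x, x^{-1}) : x in G}
D = {(x, inverse(x)) for x in G}

def K(a):
    """K_a = {(x, a x^{-1} a^{-1}) : x in H}."""
    a_inv = inverse(a)
    return {(x, compose(compose(a, inverse(x)), a_inv)) for x in H}

def is_subgroup(S):
    """A finite non-empty subset closed under the product is a subgroup."""
    return all(mul2(u, v) in S for u in S for v in S)

# Translate each K_a by (a, a^{-1}) and take the union of the cosets.
cosets = [{mul2((a, inverse(a)), k) for k in K(a)} for a in reps]
union = set().union(*cosets)

print(all(is_subgroup(K(a)) for a in reps), union == D)
```

The check relies only on the identity (a, a^{-1})(x, a x^{-1} a^{-1}) = (ax, (ax)^{-1}), so any finite group with an abelian subgroup of finite index could be substituted for S_3.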
Suppose now that D has the indicated form; it can be written as a finite union of sets of the form $aK \cap \neg a_1K_1 \cap \dots \cap \neg a_nK_n$, where $K_1, \dots, K_n$ have infinite index in K. This means that the group K is contained, up to a small set (a finite union of cosets of subgroups of infinite index), in a translate of D; K is almost contained in $(a, b^{-1})D = \{(ax, b^{-1}x^{-1}) : x \in G\}$. Take $u = (ax, b^{-1}x^{-1})$ in $K \cap (a, b^{-1})D$; for almost every v in K, the elements v, uv and vu lie in $(a, b^{-1})D$. Indeed, these conditions exclude only finitely many cosets of subgroups of infinite index in K, so by Neumann's Lemma such v do exist. Hence $v = (ay, b^{-1}y^{-1})$ and $uv = (axay, b^{-1}x^{-1}b^{-1}y^{-1}) = (at, b^{-1}t^{-1})$, that is, $t = xay = ybx$; by symmetry, $xby = yax$.
With u and v thus fixed, and again by Neumann's Lemma, one can find w in K such that w, uw, vw and uvw all lie in $(a, b^{-1})D$. Writing $w = (az, b^{-1}z^{-1})$, we get on the one hand $xayaz = taz = zbt = zbxay = zbybx$, and on the other hand $xayaz = xazby = zbxby$; altogether $xby = ybx$. Since $ybx = xay$, this gives $a = b$, hence $xay = yax$, and u and v commute.

So if u in K is chosen outside a few forbidden cosets, the group K is covered by the centralizer of u together with finitely many cosets of infinite index; by Neumann's Lemma, u is central.
K is therefore covered by its center together with a small set and, again for the same reason, it is commutative. D is therefore covered by finitely many cosets of abelian subgroups of $G^2$; its first projection, which is all of G, is covered by finitely many cosets of abelian groups; by Neumann's Lemma (once again!), G is abelian-by-finite. END

All this leads us to pose, without much conviction, the following problem:

PROBLEM 1: If every definable subset of G is a Boolean combination of cosets of definable subgroups, is G weakly normal? Is it even abelian-by-finite?

2 - Algebraic groups

An algebraic group, over an algebraically closed field K, is a variety G with a group law that is a morphism from $G^2$ to G. Varieties are definable sets, and morphisms between varieties are rather special definable maps; but it is worth knowing right away that every group definable in K is definably isomorphic to an algebraic group. In any case these groups are omega-stable, of finite Morley rank, since K has these properties.

An algebraic group G has a smallest definable subgroup of finite index, denoted $G^0$; it is the connected component of the identity for the Zariski topology. It also has a smallest normal definable subgroup $G^a$ such that the quotient group $G/G^a$ is a complete variety; $G^0/G^a$ is what is called an "abelian variety"; its group law is commutative. By contrast, $G^a$ is an affine group, which is (geometrically) isomorphic to a linear group, that is, a group of matrices. Finally, $G^a$ has minimal subgroups $G^b$ such that the right (or left) quotient $G^a/G^b$ is a complete variety; these $G^b$ are called the "Borel subgroups" of $G^a$; they are all conjugate in $G^a$; the structure of $G^a$ is strongly determined (at least in the so-called reductive case, where there are no unipotent normal subgroups) by its decomposition, called the Bruhat decomposition, into double cosets modulo a Borel subgroup.

If you do not know what a variety is, remember that a complete variety is "small" with respect to images (the image of a complete variety under a morphism is complete) but not with respect to restrictions (a subvariety of a complete variety is not necessarily complete, since every variety is obtained by gluing affine open sets; however, a closed subset of a complete variety is complete).

The example of algebraic groups is important, because most of our know-how in the subject consists in extending to stable groups constructions well known to geometers.

3 - Mekler groups

In [MEKLER 1981] Alan Mekler introduces a systematic way of disguising a structure as a group.
This shows that a stable group can interpret any stable structure whatsoever: it is wise not to lose sight of this fact before posing overly strong conjectures, as the author of these lines has often done. Mekler considers a graph $\Gamma$ defined on a set $\{\dots, a_i, \dots\}$, and the quotient $G(\Gamma)$ of the free nilpotent group of class 2 and exponent $p \neq 2$ generated by the $a_i$, divided by the relations "$a_i$ and $a_j$ commute" whenever $a_i$ and $a_j$ are joined in the graph.
He shows that if $\Gamma$ satisfies a few simple conditions (he then says that $\Gamma$ is a "nice graph"), it is interpretable in $G(\Gamma)$; every structure in a finite language is equivalent, up to interpretation, to a "nice graph"; $G(\Gamma)$ is not quite interpretable in $\Gamma$ (we shall see that there are deep reasons preventing this), but it has almost the same model-theoretic properties as $\Gamma$, and in any case the same stability spectrum. However, neither categoricity nor even dimensionality is preserved: in fact, Mekler's construction increases the depth by one.

In order to obtain effortlessly a construction with the same ideological import as Mekler's, I shall introduce in section D of this article another way, this one absolutely trivial, of disguising a structure as a group, this time preserving the depth. The language will no longer be reduced to the group law, as in Mekler's case, which is of no importance whatsoever in the context in which we have placed ourselves.

Closely related to Mekler's groups are the free n-nilpotent groups of exponent $p^m$, whose total transcendence was established by Andreas Baudisch in [BAUDISCH 1982].

C - GENERICS

What delighted the pioneers of algebraic stability were the chain conditions on definable subgroups: in an omega-stable group, there is no infinite decreasing sequence of such subgroups (because at each step the Morley rank or the Morley degree decreases); in the superstable case, there is no such sequence if each is of infinite index in the preceding one (decrease of the Shelah rank); and there is no infinite uniformly definable sequence in the stable case (order property).
It took some time to discover the less trivial fact that, in a stable group, the intersection of a family of uniformly definable subgroups is the intersection of finitely many of them [BALDWIN, SAXL 1976]; for more details, I refer to the introduction of [POIZAT 1983].

So as not to give the (false) impression of a once brilliant mind now marked by the weight of years, I refrain from repeating once more the details of the properties of generic types and formulas, which are the second ingredient in the study of stable groups: I refer to that introduction as well. I only recall that a generic set is large: by definition, a definable subset A of G is called generic (on the left) if finitely many of its left translates cover G, $G = a_1A \cup \dots \cup a_nA$.
In a stable group, it turns out that a set is generic on the left if and only if it is generic on the right, and that if $A \cup B$ is generic, then A is generic or B is generic. There are therefore types that satisfy only generic formulas: they are called "generic". Let G act by left translation on its types: if p is the type of x over G, ap is the type of ax over G. Since in a stable theory every type is definable, one sees easily that the stabilizer G(p) of p for this action is an intersection of definable groups; p is generic if and only if its stabilizer is $G^0$, the intersection of all the definable subgroups of finite index of G. In fact, if it is sufficiently saturated, G acts transitively on its generic types, and there is exactly one generic type in each coset of $G^0$; we call principal generic the one that lies in $G^0$. G is said to be connected if it has only one generic, that is, if $G = G^0$; in the case of an algebraic group, this is indeed equivalent to connectedness for the Zariski topology.

In the superstable case, it turns out that the generics are the types of maximal Lascar U-rank (I mean that in a superstable group there are types of maximal RU, which is not at all true in an arbitrary superstable structure, and that these are the generics); they are also the types of maximal Shelah rank RC, as well as the types of maximal rank for any notion of rank invariant under translation. In a superstable theory, the generics can thus be defined without mentioning the group law; by contrast, in the merely stable case, generics need not be preserved by definable bijections, so that one can obtain on the same set two group laws $G_1$ and $G_2$ that do not have the same generics.

This greater robustness of generics in the superstable case explains why so many theorems proved for superstable groups remain conjectural in the merely stable case.
A typical example is the theorem of [POIZAT 1983] asserting that the free product of two groups G and H can be superstable only if $|G| = 1$, or $|H| = 1$, or $|G| = |H| = 2$.

In the omega-stable case, the generics are also the types of maximal Morley rank (and also those of maximal Cantor rank, although this rank is not one in the Lascarian sense of the word); $G^0$ is then a definable group (chain condition), and the number of generics is the index of $G^0$ in G; this index is therefore the Morley degree of G, and not merely a lower bound for this degree, as one might have expected.

We shall see that it is mainly in the finite rank case that the structure of the group concentrates around its generics.

D - WHAT WE KNOW HOW TO DO

1 - Even before logicians poked their noses into the subject, everything was basically already there in the article of André Weil [WEIL 1955], which this great mind had the weakness to write in English, where he shows that to determine an algebraic group it suffices to give a "group chunk", that is, a variety with a law that is defined, associative and cancellative only generically. Hrushovski gave an abstract version of this theorem: starting from a complete type p, in a stable theory, with a function f such that, if a, b, c are three independent realizations of p,
(i) a and f(a,b) on the one hand, b and f(a,b) on the other, are two independent realizations of p,
(ii) f(a,f(b,c)) = f(f(a,b),c),
he succeeds in making p the generic of a "type-definable" group (i.e., the group law is definable; here, at least generically, it is f; but the underlying set of this law is infinitely definable). He then shows the remarkable result that, in a stable theory, every type-definable group is contained in a definable group, that is, one can find a definable set containing this group on which its law f still defines a group.

Very recently, the same Hrushovski showed how, in the particular case of an algebraically closed field of characteristic zero, one could put a variety structure on his group making it an algebraic group; this is in fact a particularly clean (for a logician!), and purely model-theoretic, proof of Weil's Theorem.

Thus a group definable in the theory of an algebraically closed field of characteristic zero is definably isomorphic to an algebraic group; it was Laurentius Van den Dries who first remarked that this result could be obtained from Weil's Theorem, with a dash of elimination of imaginaries; he has always maintained, with great constancy, that to get the same thing in characteristic p one must add a theorem of Serre to the cocktail.
2 - The first conscious appearance of generics is in [CHERLIN, SHELAH 1980], where it is shown that an infinite stable field is connected both additively and multiplicatively; from this one easily deduces, with a little Galois theory, that a superstable field is algebraically closed; in the omega-stable case, this had been proved by [MACINTYRE 1971]. Separably closed fields are stable, but it is not known whether they are the only examples of stable fields; and when the language is enlarged, painful questions remain to be settled, even in the omega-stable case: once you have read point 5 below, you will understand how interesting it would be to know whether a field K, with a non-trivial infinite multiplicative subgroup M, can have finite Morley rank.

3 - According to Cherlin, an infinite superstable group contains an infinite definable abelian subgroup. To see this, start from a minimal infinite centralizer H; if H is not commutative, every non-central element has a finite centralizer in it; for rank reasons, each non-central conjugacy class is generic; there are only finitely many of them, and H has a (finitely) definable connected component $H^0$; $H^0$ has only one non-central conjugacy class; but in such a group, as Joachim Reineke observed in [REINEKE 1975], there are infinite chains of centralizers, which contradicts stability.

4 - Nevertheless, generics are virtually present in Zil'ber, [ZIL'BER 1977], who had the happy idea of letting the group act on its types in order to generalize to groups of finite Morley rank a result that geometers usually state for definable subsets, connected in the sense of the Zariski topology, of an algebraic group. He says that a definable subset A of G is indecomposable (on the left) if for every subgroup H of G, either A is contained in a single left coset of H, or A is spread over infinitely many cosets of H; I do not know whether the notions of left and right indecomposability are equivalent. Thanks to König's Lemma, one easily sees that every definable set partitions in a unique way into finitely many maximal indecomposable subsets.
Here is the statement of Zil'ber's Theorem: If G has finite RM, and if $\dots, A_i, \dots$ is a family of subsets of G, all indecomposable and each containing the identity element of G, then the group H generated by the $A_i$ is definable and connected; it can even be written in the form $H = A_{i_1} \cdots A_{i_n}$ (AB denotes the set of products ab, with a in A and b in B), so that finitely many $A_i$ suffice to generate H.

The proof consists in bringing out the generic p of H, then defining H as the stabilizer of p.
As a first consequence of this theorem, one sees that certain groups are definable (and connected!) that ought not to be; for example, if A is an arbitrary subset of G, even non-definable (G being of finite RM), and if H is a connected definable subgroup of G, then the group [A,H] generated by the commutators of A and H is definable and connected, and only finitely many elements of A occur in its definition; $G' = [G,G]$ is also definable, because $[G,G^0]$ has finite index in it.

5 - This theorem was used by Zil'ber to make fields emerge in certain contexts; for example: let M and A be two infinite abelian groups, M acting on A as a group of automorphisms; suppose that M acts properly and irreducibly on A, and that this structure has finite Morley rank; then one can define in it a product on A which, together with its addition, makes it a field K, in such a way that the action of M appears as that of a multiplicative subgroup of K. Going a little further, Zil'ber defines a field, algebraically closed of course, in any connected solvable non-nilpotent group of finite Morley rank; it is not known whether this is possible in the nilpotent non-abelian case.

This finally leads to the theorem of Ali Nesin [NESIN 198?], which asserts that a connected solvable group of finite Morley rank has a nilpotent derived subgroup; this result is an abstract version of the Lie-Kolchin Theorem, which provides every connected solvable linear group with a triangularizing basis; Nesin argues by contradiction, doing geometry in a counterexample of minimal rank.

6 - This Theorem of Zil'ber brings out correlations between algebraic properties and purely model-theoretic properties; for example, Zil'ber showed that a simple group of finite Morley rank is aleph-one-categorical, and even algebraic, after introducing a few parameters, over each of its infinite definable sets; indeed, from such a set A one extracts an infinite indecomposable subset, which is translated to obtain a set B containing the identity element; there then exist conjugates $B_1, \dots, B_n$ of B such that $G = B_1 \cdots B_n$. One sees just as easily that a field of finite Morley rank is aleph-one-categorical.

This leads to the analysis carried out by Daniel Lascar, in [LASCAR 1985], of a group G of finite Morley rank; he decomposes G into a tower $\{e\} = H_0 \subset H_1 \subset \dots \subset H_n = G$ such that each quotient $H_{i+1}/H_i$ is unidimensional (if you do not know what this means, see [POIZAT 1985, ch. 20]); G is therefore finite-dimensional, and in particular eliminates infinitary quantifiers; this dimensionality property is even very strong, since the dimensions are isolated from one another by formulas: it implies the equality of Morley rank and Lascar rank (repeat the proof of [POIZAT 1978]), which will reassure sensitive souls who have read Cherlin's early works.

Lascar also shows that if two unidimensional subgroups H and K of G carry orthogonal dimensions, they centralize each other; but in general it is not possible to associate with each dimension of G one of its normal subgroups: if, for example, G has two dimensions, one carried by H, the other by G/H, one cannot always lift the second one into G; Lascar shows that what obstructs this are the normal abelian subgroups, and that it can always be done if G is semi-simple, that is, without infinite normal abelian subgroup, and in this case he decomposes G, up to a finite kernel, into a product of unidimensional, i.e., aleph-one-categorical, groups.
This is an analogue of the decomposition of a connected semi-simple algebraic group into a product of simple groups.

He also shows that every group G of finite RM has a "large" abelian subgroup A, carrying all of its dimensions: one cannot enlarge G without enlarging A.

What makes the beauty of these theorems is that they have no algebraic meaning, since in the case of geometry everything is aleph-one-categorical! One may wonder whether Model Theory makes it possible to prove results of a truly different nature from those of Lascar.

7 - The study of superstable groups emerged from its cottage-industry phase thanks to the monumental work of Chantal Berline and Daniel Lascar [BERLINE, LASCAR 1986].
As I said, a superstable group has a U-rank; similarly, there are types of maximal RU in the coset space modulo a definable subgroup H, and Lascar's inequality ([POIZAT 1985, ch. 19]) becomes in this context: $RU(H) + RU(G/H) \le RU(G) \le RU(H) \oplus RU(G/H)$. One sees that, given the Cantor expansion of RU(G), this leaves only finitely many possibilities for the leading coefficient of the expansion of RU(G/H).

There is something else: suppose that the expansion of RU(G) is $\omega^{\alpha_1} \cdot n_1 + \dots + \omega^{\alpha_k} \cdot n_k$, and let p be a type of RU $\omega^{\alpha} \cdot n$; thanks to a skillful manipulation of canonical parameters, Berline and Lascar show that the RU of the stabilizer of p is equal to that of p, and not strictly smaller. This allows them to decompose G into a tower of normal subgroups $\{e\} = G_0 \subset G_1 \subset \dots \subset G_k = G$, the rank of $G_i/G_{i-1}$ being the monomial $\omega^{\alpha_i} \cdot n_i$.
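To make the two sums in Lascar's inequality concrete, here is a minimal sketch of ordinal arithmetic on Cantor expansions. It is only an illustration under stated assumptions: ordinals are restricted to those below $\omega^\omega$ and are stored as lists of (exponent, coefficient) pairs with strictly decreasing natural-number exponents; the function names are hypothetical.

```python
def ordinal_sum(a, b):
    """Ordinary (non-commutative) ordinal sum a + b of two Cantor expansions."""
    if not b:
        return list(a)
    lead_exp, lead_coeff = b[0]
    # Terms of a with exponent below the leading exponent of b are absorbed.
    kept = [(e, c) for (e, c) in a if e > lead_exp]
    same = sum(c for (e, c) in a if e == lead_exp)
    return kept + [(lead_exp, same + lead_coeff)] + list(b[1:])

def natural_sum(a, b):
    """Natural (commutative) sum: add coefficients exponent by exponent."""
    coeffs = {}
    for e, c in list(a) + list(b):
        coeffs[e] = coeffs.get(e, 0) + c
    return sorted(coeffs.items(), reverse=True)

def lascar_bounds(ru_h, ru_quotient):
    """Lower and upper bounds for RU(G) given RU(H) and RU(G/H)."""
    return ordinal_sum(ru_h, ru_quotient), natural_sum(ru_h, ru_quotient)

omega = [(1, 1)]     # the ordinal omega
three = [(0, 3)]     # the integer 3

# RU(H) = 3 and RU(G/H) = omega: the two bounds differ.
lower, upper = lascar_bounds(three, omega)
print(lower, upper)

# RU(H) = omega and RU(G/H) = 3: the two bounds coincide.
print(lascar_bounds(omega, three))
```

With this representation, Python's built-in list comparison agrees with the ordering of ordinals, so `lower <= upper` can be tested directly.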
In order to find an adequate generalization of Zil'ber's Theorem, they introduce the notion of $\alpha$-indecomposable set, which is the following: A is $\alpha$-indecomposable if, for every definable subgroup H, $RU(A/H) < \omega^{\alpha}$ implies $|A/H| = 1$. In the group G, they also find a large abelian subgroup, of rank at least $\omega^{\alpha}$; they easily deduce that a superstable skew field is commutative (and algebraically closed), which gives a conceptual proof of this result previously proved by Cherlin. They also show that the rank of a superstable field is a monomial $\omega^{\alpha} \cdot n$ (they conjecture that $n = 1$), and that it is $\alpha$-indecomposable, both additively and multiplicatively.
This strong connectedness property makes it possible to show that the structure formed by a field K together with a non-identity automorphism s fixing infinitely many points of K cannot be superstable: in characteristic 0, a difference field cannot be superstable, whereas there is a fine example of omega-stable differential fields, the differentially closed ones.

8 - Generics had only a foetal existence when Cherlin undertook to classify the groups of small Morley rank [CHERLIN 1979]; his method rests on the observation that a number less than 3 is at most 2, that a number less than 2 is at most 1, that the only number smaller than 1 is 0, and that after that there is nothing left.
He shows, as we saw in 3 above, that a connected group of rank 1 is commutative; that a connected group of rank 2 is solvable, and that if its center is trivial it is the semi-direct product of the additive group of an algebraically closed field K by its multiplicative group; and finally that a connected group of rank 3, non-solvable, and having a definable subgroup of rank 2 (its "Borel"), is $SL_2(K)$ or $PSL_2(K)$, for an algebraically closed K: he manages to reconstruct the Bruhat decomposition of the group by an ad hoc method.

This gave him the idea that all groups of finite Morley rank should be obtained by combining the known examples in one way or another, and in particular led him to conjecture that a simple group of finite Morley rank is an algebraic group.

This conjecture was proved by Simon Thomas [THOMAS 198?] for such a group that is locally finite: it is essentially a matter of the combinatorics of finite groups.

Moreover, an analysis parallel to Cherlin's was carried out by Chantal Berline [BERLINE 1986] for groups of Lascar rank $\omega^{\alpha}$, $\omega^{\alpha} \cdot 2$ and $\omega^{\alpha} \cdot 3$.
E - THE ZARISKI TOPOLOGY

No geometer would speak of generics without introducing the Zariski topology. If K is an algebraically closed field, the closed sets of this topology are the sets defined by positive formulas, which are of the form $P_1(x) = 0 \wedge \dots \wedge P_n(x) = 0$, where $P_1, \dots, P_n$ are polynomials with coefficients in K; a definable set (geometers say "constructible") is a Boolean combination of Zariski closed sets; there is likewise a Zariski topology on each variety, obtained by gluing affine varieties.

These closed sets, like every definable set, can be regarded not only as sets of points of $K^n$, but also as sets of types in n variables over K, or again as sets of prime ideals of $K[\bar x]$; the generic point of an irreducible variety is the one that belongs to no proper closed subset.
Some remarkable properties of these closed sets:
(i) if $f(\bar x, \bar a)$ defines a closed set, there is a formula $g(\bar x, \bar a)$ equivalent to $f(\bar x, \bar a)$ such that $g(\bar x, \bar b)$ is closed for every $\bar b$;
(ii) the Zariski topology is noetherian: there is no infinite decreasing sequence of closed sets;
(iii) every definable map is generically a morphism (at least in characteristic 0); its restriction to a suitable non-empty open set is continuous;
(iv) every constructible subgroup of an algebraic group must be closed.

The Zariski topology is not preserved by constructible bijections (it is so only generically!), but it is preserved by constructible isomorphisms of algebraic groups, which are geometric morphisms, up to extraction of p-th roots; there is therefore a well-determined Zariski topology attached to a constructible group, which can be endowed with an algebraic group structure in only one way, and, in fact, Hrushovski's proof of Weil's Theorem consists in showing that if there is a Zariski topology T on a stable group G, there is one and only one, T', for which multiplication is a continuous function from $G^2$ to G, and which agrees generically with T.

It is remarkable that under the sole hypothesis of stability of the group we have managed to identify the generics, that is, the Zariski-dense definable sets; can one define the Zariski-closed sets?

Let us first observe that we do have a Zariski topology in the weakly normal case, whose closed sets are the disjunctions of cosets of definable subgroups; in the omega-stable case, we do have the descending chain condition on these closed sets, by an easy application of König's Lemma (thanks, Anand), since we have it for the groups; in the stable case, we of course obtain only local conditions.
One checks without difficulty the continuity of multiplication, thanks to the abelian-by-finite character of the group.

But it is not possible to define such a topology in the general case, without introducing restrictions, as illustrated by the trivial example that I promised in section A, and which it is time to unveil. I consider an elementary abelian 2-group G, that is, a vector space over the field with two elements, and I add to the language an infinite subset A of G consisting of independent elements; let $A_0 = \{0\}$, $A_1 = A$, ..., $A_n$ the set of sums of n distinct elements taken from A, ... The theory of G eliminates quantifiers in the language comprising addition and the $A_n$.
The types over G in one variable x are described as follows: on the one hand there is the generic type, for which $a + x$ is in no $A_n$, whatever a in G; for the other types, there exist a minimal n and a unique a in G such that $a + x$ is in $A_n$.

G is a structure without mystery, bidimensional, of Morley rank omega; one observes that G has no proper infinite definable subgroup H (if one wants to eliminate finite groups, one can take a Q-vector space, that is, a divisible torsion-free group): indeed, if x' and x'' are two independent realizations of the generic p of H, x' + x'' must have the same type, which leaves only two possibilities: p is the generic of G, or else p = 0.

G does have a Zariski topology: its closed sets are the positive Boolean combinations of translates of the $A_n$ (check the descending chain condition as an exercise).
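The sets $A_n$ are easy to compute in a finite toy version of this example. The sketch below is an illustration under assumptions that are not in the text: the infinite set A is replaced by five independent vectors of $F_2^5$, encoded as bitmasks, with addition over $F_2$ given by XOR. It checks that the layers $A_n$ are pairwise disjoint, so that each element of the span of A lies in exactly one $A_n$.

```python
from itertools import combinations
from functools import reduce

def sums_of_distinct(A, n):
    """A_n: all sums over F_2 (XOR of bitmasks) of n distinct elements of A."""
    return {reduce(lambda x, y: x ^ y, combo, 0) for combo in combinations(A, n)}

# Finite stand-in for the infinite independent set A: basis vectors e_0..e_4.
A = [1 << i for i in range(5)]

# layers[n] plays the role of A_n; layers[0] = {0} and layers[1] = A.
layers = {n: sums_of_distinct(A, n) for n in range(len(A) + 1)}

# The layers partition the span of A: sizes add up to the size of the union.
span = set().union(*layers.values())
total = sum(len(layer) for layer in layers.values())
print(len(span), total)
```

In this encoding the index n of the layer containing a vector is simply its number of non-zero coordinates, which is why the layers cannot overlap when the elements of A are independent.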
But if we now add to A an arbitrary structure, in a language L, this time we eliminate quantifiers by means of the $A_n^f$, the set of $a_1 + \dots + a_n$ such that $(a_1, \dots, a_n)$ satisfies f, where f is a formula of the language L. A is trivially interpretable in G; G is not interpretable in A, but all the types of G, with the exception of the generic, correspond to translates of types of n-tuples of A, so that G and A have the same stability spectrum.
This construction is very similar to, and just as trivial as, Shelah's construction $T^{eq}$: this time, instead of adding a "type at infinity" [POIZAT 1985, ch. 16] of U-rank 1, we add a generic type orthogonal to all the others (i.e., we add a new dimension to A; this orthogonality does not prevent every element of the group from being the sum of two generics!), which will be of maximal RU if A is superstable; in this superstable case, RU(G) is easy to determine: it is the supremum of the RU of the n-types of A, which is necessarily of the form $\omega^{\alpha}$, by Lascar's inequality.

One sees that the generic is little concerned with the true structure of (G,A), contrary to what happens in the finite rank case, and that nothing can be done with it: G is an elementary abelian 2-group and, for the same reason as before, has no definable subgroups; one really does not see what more can be said about it!

As for finding a Zariski topology for it, that is hopeless: one would have to single out closed sets among the $A_n^f$, for fixed n, that is, to define canonically a Zariski topology on an arbitrary stable structure!

One observes that a superstable structure can be disguised in the same way as a field of rank $\omega^{\alpha}$, using a transcendence basis instead of a vector space basis.

Such a uniform construction can only increase the rank, if only because, according to Lascar, a structure of finite Morley rank with the finite cover property cannot be interpreted in a group of finite Morley rank.
We then pose, again without much conviction, the following problem:

PROBLEM 2: Is every omega- or omega-one-categorical structure interpretable in a group of the same kind?

There remains the faint hope of finding the Zariski topology for a group of finite rank, or even of finding something suited to the general case. If it ever exists, it would be devilishly useful. Here is a case where one can do without it:
THEOREM: If G is a stable group whose principal generic is of order 2, then G has a definable subgroup of finite index which is of exponent 2.

PROOF: Let a and b be generic and independent over G, satisfying the formulas of $G^0$; we know that ab is also generic over G; so a, b, ab are of order 2: a and b commute. The intersection H of G with the centralizer of a is a definable subgroup of G (by definition of stability); it is a generic subset of it, hence a subgroup of finite index, containing $G^0$; since every element of $G^0$ is a product of two generics, $G^0$ is contained in the center of H, and in fact in the group K formed by the elements of order 2 of this center. END
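The only algebra used at the start of this proof is that two elements a and b commute as soon as a, b and ab are all of order 2: from abab = 1 one gets $ab = b^{-1}a^{-1} = ba$. The sketch below checks this by brute force in the symmetric group $S_4$, which is an arbitrary test group chosen here for illustration and plays no role in the text; the helper names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Group product p*q on permutations stored as tuples (apply q, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

S4 = list(permutations(range(4)))
e = tuple(range(4))

def squares_to_identity(g):
    """True when g*g is the identity, i.e. g has order 1 or 2."""
    return compose(g, g) == e

# Pairs (a, b) with a, b, ab all squaring to the identity but ab != ba.
counterexamples = [
    (a, b)
    for a in S4
    for b in S4
    if squares_to_identity(a)
    and squares_to_identity(b)
    and squares_to_identity(compose(a, b))
    and compose(a, b) != compose(b, a)
]

print(len(counterexamples))
```

The search space has only 576 pairs, so the exhaustive check is immediate; the argument itself of course holds in any group.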
I leave the following exercise to the reader: if the principal generic satisfies $x^2 = a$, with a in G, then a = 1. And I pose the problem:

PROBLEM 3: If G is stable and connected, and if its generic satisfies $x^n = 1$, is G of exponent n?

Note how trivial this problem is for an algebraic group or for an abelian group, since in that case the equation $x^n = 1$ defines a Zariski closed set. One can naturally pose the same problem for any system of equations in m variables that is generically satisfied in $G^m$: is it satisfied everywhere?

This is the only problem of the original version of this article that withstood the devastating action of Hrushovski who, after thinking about it for some time without finding the solution, declared that it was interesting; the following one is intimately related to it:

PROBLEM 4: Can an infinite stable group have only finitely many conjugacy classes?

Indeed, in the finite rank case, this implies that the centralizer of the generic is finite, hence that this generic is of finite order, and one concludes very quickly that this is not possible for an algebraic group (see the beginning of section A).
L'argument de Reineke, qui elimine
la possibilite d'une seule classe (sans compter l'element neutre, naturellement:), et qui est Ie point de depart de l'analyse de Cherlin, est tres different de ce qu'un geometre avancerait pour montrer qu'un groupe connexe de rang un est comrnutatif. Je sais montrer qu'un groupe de rang de Morley fini ne peut avoir seulement deux classes de conjugaison non centrales; la demonstration utilise des choses insensees (il faut savoir qu'un groupe d'exposant 4 ou 6 est localement fini), et je ne la reproduis pas ici. Peut-etre faut-il essayer de repondre d'aborder la conjecture de Cherlin? ambitieuse,
a la
a ces
questions simples avant
C'est une conjecture extremement
mesure de l'ambition secrete de tout logicien, qui est
de montrer un theoreme d'un interet proprement mathematique, comrne celuici qui donnerait une caracterisation si directe
des groupes algebriques.
II n'est pas raisonable de poursuivre l'analyse de Cherlin au-dela du rang 3 (mais la Fortune ne sourit-elle qu'aux gens raisonables?) sans forger auparavant des outils plus precis que l'enclume et Ie marteau par lui utilises. Est-il possible de fabriquer ces outils? elle si motivee?
Et cette conjecture est-
Dans un premier temps, devant l'abondance de resultats
apparemrnent si divers, nous avons cru que ces groupes ressemblaient tellement aux groupes algebriques qu'ils devaient etre des groupes algebriques; mais avec un peu de recul, nous constatons que tout cela est centre sur Ie meme theme, l'existence des generiques, que 9a a Ie gout du Canada Dry, que 9a a la couleur du Canada Dry, mais que ce n'est peut-etre pas du Canada Dry.
264
B. Poizat
In conclusion, I find it hard to see how one can attack this conjecture, whether with a view to a direct proof or with a view to a proof by inspection (one compares the two lists!), before saying, in one way or another, what a complete variety is.

REFERENCES
[1] BAUDISCH, 1982, Decidability and stability of free nilpotent Lie algebras and free nilpotent p-groups of finite exponent, Ann. Math. Logic, 23, 1-25.
[2] BALDWIN, SAXL, 1976, Logical stability in group theory, J. of Australian Math. Soc., 21, 267-276.
[3] BAUR, 1976, Elimination of quantifiers for modules, Israel J. of Math., 25, 64-70.
[4] BERLINE, 1986, Superstable groups; a partial answer to conjectures of Cherlin and Zil'ber, Annals of Pure and Applied Logic, 30, 45-61.
[5] BERLINE, LASCAR, 1986, Superstable groups, Annals of Pure and Applied Logic, 30, 1-43.
[6] CHERLIN, 1979, Groups of small Morley rank, Annals of Math. Logic, 17, 1-28.
[7] CHERLIN, HARRINGTON, LACHLAN, 1985, Aleph-zero-categorical aleph-zero-stable structures, Annals of Pure and Applied Logic, 28, 103-135.
[8] CHERLIN, SHELAH, 1980, Superstable fields and groups, Annals Math. Logic, 18, 227-270.
[9] HRUSHOVSKI, 1986, Doctoral dissertation, Berkeley.
[10] HRUSHOVSKI, PILLAY, 1987, Weakly normal groups, this volume.
[11] KOLCHIN, 1973, Differential algebra and algebraic groups, Academic Press, New York.
[12] LASCAR, 1985, Les groupes omega-stables de rang fini, Trans. Amer. Math. Soc., 292, no. 2, 451-462.
[13] MACINTYRE, 1971, On omega-one-categorical theories of fields, Fund. Math., 71, 1-25.
[14] MEKLER, 1981, Stability of nilpotent groups of class 2 and prime exponent, Journ. Symb. Logic, 46, 781-788.
[15] NESIN, 198?, Doctoral dissertation, Yale University.
[16] NEUMANN, 1952, A note on algebraically closed groups, Journ. London Math. Soc., 27, 247-249.
[17] PILLAY, 1984, Review of several papers on stable groups, Journ. of Symb. Logic, 49, 317-321.
[18] PILLAY, 1986, Superstable groups of finite rank without pseudoplanes, Ann. Pure and Applied Logic, 30, 95-101.
[19] PILLAY, SROUR, 1984, Closed sets and chain conditions in stable theories, Journ. of Symb. Logic, 49, 1350-1362.
[20] POIZAT, 1978, Une preuve par la théorie de la déviation d'un théorème de J. Baldwin, C.R. Acad. Sc. de Paris, 287, 589-591.
[21] POIZAT, 1981, Sous-groupes définissables d'un groupe stable, Journ. of Symb. Logic, 46, 137-146.
[22] POIZAT, 1983, Groupes stables, avec types génériques réguliers, Journ. of Symb. Logic, 48, 339-355.
[23] POIZAT, 1983a, Une théorie de Galois imaginaire, Journ. of Symb. Logic, 48, 1151-1170.
[24] POIZAT, 1984, La structure géométrique des groupes stables, Humboldt Universität, Seminarberichte Nr. 60, 205-217.
[25] POIZAT, 1985, Cours de Théorie des Modèles, Nur al-Mantiq wal-Ma'rifah, Villeurbanne.
[26] REINEKE, 1975, Minimale Gruppen, Z. für Math. Logik, 21, 357-359.
[27] SCHMERL, 1977, On aleph-zero-categoricity and the theory of trees, Fund. Math., 94, 121-128.
[28] SZMIELEW, 1955, Elementary properties of abelian groups, Fund. Math., 41, 203-271.
[29] THOMAS, 1983, The classification of the simple periodic linear groups, Arch. Math. (Basel), 41, 103-116.
[30] WEIL, 1955, On algebraic groups of transformations, Amer. Journ. of Math., 77, 355-391.
[31] ZIL'BER, 1977, Groupes et anneaux dont la théorie est catégorique (in Russian), Fund. Math., 95, 173-188.
[32] ZIL'BER, 1980, Totally categorical theories: structural properties and the non-finite axiomatizability, Lecture Notes in Math. (Springer), 834, 381-410.
ACKNOWLEDGEMENTS
Many thanks to Jeannine Swanson for her beautiful typing.

Mathématiques, U.E.R. 47, Univ. Pierre et Marie Curie, 4, place Jussieu, 75230 Paris CEDEX 05, FRANCE
Logic Colloquium '85 Edited by The Paris Logic Group © Elsevier Science Publishers B.V. (North-Holland), 1987
LOGIC AND REAL ALGEBRAIC GEOMETRY
M.-F. ROY
U.E.R. de Mathématiques, Université de Rennes I, Campus de Beaulieu, 35042 Rennes CEDEX

1. INTRODUCTION
1.1. Real algebraic geometry, that is, the study of the algebraic subsets of $\mathbb{R}^n$ defined by polynomial equalities, has been developing for some years as a discipline in its own right [5]. The sole ambition of this article is to present those aspects of this development that are related to mathematical logic, without technical details or proofs. In the bibliography we have not tried to go back systematically to the original sources, but rather to provide accessible references.
1.2. Let us note briefly that real algebraic sets have geometric peculiarities quite different from those of complex algebraic sets [35]: for example, they may have several connected components for the Euclidean topology while being algebraically irreducible (the cubic with equation $x + y^2 - x^3 = 0$), their local dimension is not constant (the Cartan umbrella, with equation $z(x^2 + y^2) - x^3 = 0$, has a handle of dimension 1), etc.
[Figure: the Cartan umbrella]
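The two examples of 1.2 can be checked by hand. The computation below is a sketch that reads the two equations as $y^2 = x^3 - x$ and $z(x^2+y^2) = x^3$ (the "= 0" being implicit in the text).

```latex
% The cubic: sign analysis of the right-hand side.
\[
  y^2 = x^3 - x = x(x-1)(x+1) \ge 0
  \iff x \in [-1,0] \cup [1,+\infty),
\]
% so the real points form a compact oval over [-1,0] and an unbounded
% branch over [1,+infinity), with no real point over 0 < x < 1.

% The Cartan umbrella: split according to whether (x,y) = (0,0).
\[
  z\,(x^2+y^2) = x^3 :\qquad
  \begin{cases}
    x = y = 0 & \Rightarrow\ z \text{ arbitrary (the handle, of dimension 1)},\\[2pt]
    (x,y) \neq (0,0) & \Rightarrow\ z = \dfrac{x^3}{x^2+y^2}
      \text{ (the cloth, of dimension 2)}.
  \end{cases}
\]
```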
2. THE USE OF LOGIC
Unusually for mathematics, real algebraic geometry is a field in which specialists need to know a little logic, at least the meaning and use of a first-order formula.

2.1. Model theory
2.1.1. A classical result of model theory
The Tarski-Seidenberg principle is indeed one of the founding results of real algebraic geometry.

Theorem 2.1.1.1 (Tarski-Seidenberg principle) ([38], [37], [5]): The theory of real closed fields admits elimination of quantifiers.

The original aim of this theorem was to give a decision procedure for problems of geometry. We use this theorem constantly in the form of two important consequences, in algebra and in geometry.

2.1.2. Consequences in geometry (over a fixed real closed field R)

Theorem 2.1.2.1 (projection theorem) [5]: The projection onto $R^n$ of a semi-algebraic subset of $R^{n+1}$ is semi-algebraic (a semi-algebraic set is a subset of $R^k$ defined by a boolean combination of polynomial inequalities, in other words the extension of a quantifier-free formula of the language of ordered fields with parameters in R).

Since it is not easy to describe objects by means of projections (try to write with projections the closure, for the Euclidean topology, of a semi-algebraic set) and it is easier to handle formulas, one commonly uses:

Theorem 2.1.2.2 [5]: Every subset of $R^n$ defined by a first-order formula of the language of ordered fields with parameters in R is semi-algebraic.
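A standard textbook instance of quantifier elimination is that the formula "there exists x with x^2 + bx + c = 0" is equivalent, over a real closed field, to the quantifier-free condition b^2 - 4c >= 0. The sketch below checks this equivalence over the rationals on a small grid, deciding the quantified side by counting real roots with Sturm's theorem. Sturm sequences are a classical root-counting tool swapped in here for illustration; this is not claimed to be the elimination procedure of the references, and all function names are hypothetical.

```python
from fractions import Fraction


def trim(p):
    """Drop leading zeros; coefficients are listed from highest degree down."""
    i = 0
    while i < len(p) - 1 and p[i] == 0:
        i += 1
    return p[i:]


def poly_rem(a, b):
    """Remainder of a divided by b (b must have a nonzero leading coefficient)."""
    a = [Fraction(c) for c in a]
    while len(a) >= len(b):
        factor = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= factor * b[i]
        a.pop(0)  # the leading coefficient has just been cancelled
    return trim(a) if a else [Fraction(0)]


def derivative(p):
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])] or [Fraction(0)]


def sturm_sequence(p):
    """p, p', then negated remainders, stopping before the zero polynomial."""
    p = trim([Fraction(c) for c in p])
    seq = [p, trim(derivative(p))]
    while any(seq[-1]):
        seq.append([-c for c in poly_rem(seq[-2], seq[-1])])
    return seq[:-1]


def sign_variations(signs):
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def count_real_roots(p):
    """Number of distinct real roots of a nonzero polynomial p (Sturm's theorem)."""
    seq = sturm_sequence(p)
    # Sign at +infinity is the sign of the leading coefficient;
    # at -infinity it is flipped when the degree is odd.
    at_plus = [1 if q[0] > 0 else -1 for q in seq]
    at_minus = [s if (len(q) - 1) % 2 == 0 else -s for s, q in zip(at_plus, seq)]
    return sign_variations(at_minus) - sign_variations(at_plus)


def exists_root(b, c):
    """Quantified side: does x^2 + b x + c have a real root?"""
    return count_real_roots([1, b, c]) > 0


def discriminant_condition(b, c):
    """Quantifier-free side: b^2 - 4c >= 0."""
    return b * b - 4 * c >= 0
```

The design choice is to use exact rational arithmetic (`Fraction`) so that sign decisions are never affected by rounding, which is what makes the comparison with the quantifier-free condition meaningful.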
A (non-logician) mathematician is hardly accustomed to analysing the syntactic nature of the objects he manipulates, and handling the concept of a first-order formula, an exercise that the real algebraic geometer is led to do frequently, turns out to be rather delicate.

The geometric importance of the stability of semi-algebraic sets under projection is considerable. Semi-algebraic sets and functions indeed enjoy remarkable finiteness properties: a semi-algebraic set is, for example, a finite union of semi-algebraic sets homeomorphic to open boxes, a semi-algebraic function has growth bounded by a polynomial, etc. [10], [5]. To assert that an object is semi-algebraic thus amounts to guaranteeing it a host of good properties.

2.1.3. Consequences in algebra (over a varying real closed field)

Theorem 2.1.3.1 (Artin-Lang homomorphism theorem) ([1], [22], [23]): Let R and K be two real closed fields with $R \subset K$. If a set of polynomial equations in n variables with coefficients in R has a solution in $K^n$, it has a solution in $R^n$.

The Artin-Lang homomorphism theorem is equivalent to

Theorem 2.1.3.2 [5]: If a semi-algebraic subset of $R^n$ is empty, its extension to $K^n$ is empty.

The theory of real closed fields, or more precisely a first version of the homomorphism theorem 2.1.3.1, enabled E. Artin to solve Hilbert's 17th problem.

Theorem 2.1.3.3 (solution of Hilbert's 17th problem) ([1], [5]): The polynomials in n variables that are nonnegative on $\mathbb{R}^n$ are sums of squares of rational functions in n variables.

The history of Hilbert's 17th problem does not stop with Artin ([5]). In particular Pfister obtained, thanks to the theory of multiplicative quadratic forms, a bound on the number of squares needed [28].

The Artin-Lang homomorphism theorem is also used in the proof of the real Nullstellensatz.

Theorem 2.1.3.4 (real Nullstellensatz) ([21], [30], [16], [5]): All polynomials occurring in the statement are polynomials in n variables with coefficients in R. If a polynomial P vanishes on the common zeros in $R^n$ of the polynomials $P_1, \ldots, P_k$, one can find an integer m and a sum of squares of polynomials S such that $P^{2m} + S$ lies in the ideal generated by $P_1, \ldots, P_k$.

Let us stress a remarkable aspect of the situation: to solve questions concerning the real numbers, no method is known that avoids using the theory of real closed fields.

Finally, let us note that Theorem 2.1.3.1 is equivalent to the model-completeness of real closed fields, and that Theorems 2.1.2.1 and 2.1.3.1 can be proved by direct geometric or algebraic methods, without using Theorem 2.1.1.1.
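As a small illustration of Theorem 2.1.3.4 (an example chosen here, not taken from the text), take n = 2, a single polynomial $P_1 = x^2 + y^2$, and $P = x$.

```latex
% Common zeros of P_1 over a real closed field R, and P vanishing on them:
\[
  Z(P_1) = \{(x,y) \in R^2 : x^2 + y^2 = 0\} = \{(0,0)\},
  \qquad P = x \ \text{vanishes on } Z(P_1).
\]
% A certificate of the form P^{2m} + S with m = 1 and S = y^2:
\[
  P^{2} + S \;=\; x^2 + y^2 \;=\; P_1 \;\in\; (P_1).
\]
```

The sum of squares S is genuinely needed in this example: $x^2 + y^2$ is irreducible over R and does not divide any power of x, so no power of P alone lies in the ideal $(P_1)$.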
2.1.4. On the other hand, real algebraic geometers do not like to use transfer in the form:

Theorem 2.1.4.1 (transfer theorem) [5]: A theorem that is true in $\mathbb{R}^n$ and expressible by a first-order formula is true in every real closed field.

Indeed, the truth of the statement over $\mathbb{R}$ may be established by transcendental methods (series, integration of vector fields), and in general the situation is not considered satisfactory until a proof by elementary methods is known. Experience shows that, contrary to what one might fear, this search for purity of methods goes hand in hand with great simplicity and elegance of the proofs, and brings out ideas and techniques that are useful even over the reals ([15], [5]). It remains that Theorem 2.1.4.1 is useful for supplying statements that are destined to be proved in an elementary way.

2.1.5. The Tarski-Seidenberg principle thus remains topical. An elegant, very elementary and clearly algorithmic proof has recently become available ([18], [5]). Interest in the various existing quantifier-elimination algorithms, in their complexity and in their effective implementation, is growing ([4], [8], [17], [27]). Semi-algebraic geometry has, it is predicted, a bright future in robotics [34].

2.2. Categorical logic
Categorical logic has made it possible to introduce into real algebraic geometry an abstract tool that has since proved its worth, the real spectrum. Let us say very quickly that category theory is concerned with objects enjoying universal properties, and that one aspect of categorical logic consists in describing conditions of a logical nature for solutions of universal problems to exist. The solution of certain universal problems does not exist in sets, but requires the construction of more general categories, for example topoi. Thus in his thesis M. Coste [9] gave a general framework for the theory of spectra which encompasses various already known examples (the Zariski spectrum and the étale topos, equipped with their structure sheaves) and opens the possibility of new applications. We were thus led to introduce the real étale topos [12], which is for the theory of henselian local rings with real closed residue field the analogue of the Zariski topos for the theory of local rings and of Grothendieck's étale topos for the theory of henselian local rings with separably closed residue field [2]. An essential difference with the case of the étale topology is that here one does not obtain a Grothendieck topology (that is, a generalized topological space) but an honest topological space, the real spectrum [14], equipped with a structure sheaf whose stalks are henselian local rings with real closed residue field. Topos theory thus plays only a transitory role in this story. (The situation is, moreover, the same in the case of the p-adic spectrum [31].)

3. THE REAL SPECTRUM AND SOME APPLICATIONS
3.1. Definition
3.1.1. The case of a field K (of characteristic zero) sheds light on the difference between the classical case and the real case. In the étale case the universal construction of the "spectrum" gives the algebraic closure $\bar{K}$ of the field, equipped with the action of the Galois group of $\bar{K}$ over K. In the real étale case it gives the collection of all real closures of K, a collection indexed by the total orderings of K, which in this case constitute the real spectrum. The structure sheaf has as stalks the various real closures of K.

3.1.2. For an arbitrary ring A, the elements of the real spectrum $\mathrm{Spec}_r A$ are the prime cones of A, that is, the subsets $\alpha$ of A such that
$$\alpha + \alpha \subset \alpha, \quad \alpha \cdot \alpha \subset \alpha, \quad A^2 \subset \alpha, \quad -1 \notin \alpha, \quad \forall x\, \forall y\ (xy \in \alpha,\ x \notin \alpha \implies -y \in \alpha).$$
(To understand the axioms, think of the elements of $\alpha$ as elements that are positive or zero.) The support $\alpha \cap -\alpha$ of the prime cone $\alpha$ is a prime ideal. The topology of $\mathrm{Spec}_r A$ is defined by the basic open sets
$$U(a_1, \ldots, a_n) = \{\alpha \in \mathrm{Spec}_r A \mid -a_1 \notin \alpha, \ldots, -a_n \notin \alpha\}$$
(think of it as the part of $\mathrm{Spec}_r A$ where the elements $a_1, \ldots, a_n$ of A are strictly positive).
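To see the axioms at work, here is a sketch of their verification for the simplest kind of prime cone, the one attached to a point. The choice of A as the ring of polynomial functions on $R^m$ and the notation $\alpha_x$ are assumptions made for this illustration.

```latex
% Prime cone of a point x in R^m: the functions that are >= 0 at x.
Let $x \in R^m$ and $\alpha_x = \{ f \in A : f(x) \ge 0 \}$. Then
\begin{align*}
  f(x) \ge 0,\ g(x) \ge 0 &\implies (f+g)(x) \ge 0 \ \text{and}\ (fg)(x) \ge 0, \\
  (f^2)(x) = f(x)^2 &\ge 0, \qquad -1 \notin \alpha_x, \\
  (fg)(x) \ge 0,\ f(x) < 0 &\implies g(x) \le 0 \implies -g \in \alpha_x .
\end{align*}
% Support and basic open sets for this cone:
The support is $\alpha_x \cap -\alpha_x = \{ f \in A : f(x) = 0 \}$, the maximal
ideal of the point $x$, and $\alpha_x \in U(a_1,\ldots,a_n)$ exactly when
$a_1(x) > 0, \ldots, a_n(x) > 0$.
```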
A constructible subset of $\mathrm{Spec}_r A$ is an element of the Boolean algebra of subsets of $\mathrm{Spec}_r A$ generated by the basic open sets. As for all topological spaces arising from a universal construction of "spectrum" type, one has the following result.

Theorem 3.1.2.1 ([13], [5]): The real spectrum is a quasi-compact space, in general not Hausdorff, whose points can be identified with the ultrafilters on the constructible subsets.
3.2. The case of the ring of polynomial functions
A geometrically important case is that of the ring A of polynomial functions on an algebraic set V of dimension d in $R^n$.

3.2.1. Relations between the real spectrum and the points of V.
Thanks to the Artin-Lang homomorphism theorem (or rather thanks to Theorem 2.1.3.2) one can prove

Theorem 3.2.1.1 ([13], [5]): The semi-algebraic sets contained in V are in bijection with the constructible subsets of $\mathrm{Spec}_r A$.

We denote by $\tilde{S}$ the constructible set associated with the semi-algebraic set S. The points of the real spectrum of A can therefore be identified, by 3.2.1.1 and 3.1.2.1, with the ultrafilters of semi-algebraic sets contained in V. A well-informed logician will not fail to notice that in the case where V is affine space of dimension n, a point of the real spectrum is nothing other than an n-type [36].

The set V can be identified with a subset of $\mathrm{Spec}_r A$, since a point of V defines the ultrafilter formed by the semi-algebraic sets containing it. The prime cone $\alpha_x$ associated with x consists of the polynomial functions that are positive or zero at x. The topology induced by $\mathrm{Spec}_r A$ on V is the Euclidean topology, and thanks to

Theorem 3.2.1.2 (finiteness theorem) [5]: Every open semi-algebraic set is a finite union of domains of positivity (that is, of sets of the form $\{x \in R^n \mid P_1(x) > 0, \ldots, P_k(x) > 0\}$, where $P_1, \ldots, P_k$ are polynomials)

one can prove

Theorem 3.2.1.3 ([13], [5]): The map $S \mapsto \tilde{S}$, restricted to the open semi-algebraic subsets of V, is a bijection onto the quasi-compact open subsets of $\mathrm{Spec}_r A$.

3.2.2. Dimension theory.
The dimension of a semi-algebraic set S is defined as the supremum of the d such that there exists an injective continuous semi-algebraic map from $]0,1[^d$ into S, and the dimension of a prime cone $\alpha$ as the Krull dimension of the ring $A/(\alpha \cap -\alpha)$. The local dimension of S at x is the minimal dimension of the semi-algebraic neighbourhoods of x in S. A chain of prime cones of length n is the data of n+1 prime cones $\alpha_0, \ldots, \alpha_n$ with $\alpha_i \subsetneq \alpha_{i+1}$. One proves

Theorem 3.2.2.1 ([13], [5]):
a) The dimension of S is equal to the maximal length of chains of prime cones in $\tilde{S}$, and to the supremum of the dimensions of the prime cones belonging to $\tilde{S}$.
b) The local dimension of S at x is equal to the maximal length of chains of prime cones of $\tilde{S}$ ending at x.
If V is irreducible, the total orderings of the field of fractions K of A form a subset of $\mathrm{Spec}_r A$: one identifies a total ordering of K with the set of elements of A that are positive or zero for this ordering, which form a prime cone. The support of an ordering is the ideal (0), since a polynomial that is zero for a total ordering of K is the zero polynomial. In other words, an ordering of K is an element of maximal dimension of $\mathrm{Spec}_r A$. In terms of ultrafilters of semi-algebraic sets, an ordering of K is identified with an ultrafilter of semi-algebraic sets all of dimension d, by Theorem 3.2.2.1. The real spectrum of A thus has as distinguished subsets the original algebraic set V and the set of total orderings of K. If we denote by Cent(V) the set of central points of V, that is, the points of V where the local dimension of V is maximal, we have the theorem

Theorem 3.2.2.2 ([5]): The closure in $\mathrm{Spec}_r A$ of the set of total orderings of K is $\widetilde{\mathrm{Cent}(V)}$.

3.2.3. Nash functions and the structure sheaf
Nash functions on an open semi-algebraic subset U of $\mathbb{R}^n$ are the analytic functions that are algebraic over the polynomials. Rings of Nash functions have the good algebraic properties of polynomial rings (noetherianity) and are excellent rings [25]. From the geometric point of view, Nash functions are much better adapted than polynomial functions to the study of the peculiarities of real algebraic sets. For example one has the

Theorem 3.2.3.1 (Mostowski's separation theorem) ([26], [5]): Let $F_1$ and $F_2$ be two closed semi-algebraic sets in U; there exists a Nash function strictly positive on $F_1$ and strictly negative on $F_2$.

Nash functions therefore separate, in particular, the connected components of real algebraic sets, which was not the case for polynomial functions.
A characterization due to Artin-Mazur in purely algebraic terms ([3], [5]) makes it possible to give a definition of Nash functions that is elementary in nature, which therefore extends to the case of an arbitrary real closed field, and to show that the sheaf of Nash functions is the structure sheaf on $\mathrm{Spec}_r A$ [32]. If we now consider the real spectrum of the ring of Nash functions on U and its structure sheaf, one can prove the following result

Theorem 3.2.3.2 (idempotence of the real spectrum) [32]: The real spectrum of the ring $\mathcal{N}(U)$ of Nash functions on U is homeomorphic to $\tilde{U}$, and the structure sheaf on $\mathrm{Spec}_r \mathcal{N}(U)$ is the restriction to $\tilde{U}$ of the structure sheaf on $\mathrm{Spec}_r A$.

This result can be generalized to the case of an arbitrary ring [32] and is equivalent to

Theorem 3.2.3.4 (substitution lemma) ([6], [5]): Let f be a Nash function on U, $\varphi$ a homomorphism from $\mathcal{N}(U)$ to the real closed field L, and $U_L$ and $f_L$ the extensions of U and f to L. Then
$$(\varphi(X_1), \ldots, \varphi(X_n)) \in U_L, \qquad \varphi(f) = f_L(\varphi(X_1), \ldots, \varphi(X_n)).$$

The very statement of the substitution lemma uses the Tarski-Seidenberg principle (in order to define $U_L$ and $f_L$). The substitution lemma is the basis of the following theorems

Theorem 3.2.3.5 (Hilbert's 17th problem for Nash functions) ([6], [5]): Every Nash function that is nonnegative on a connected open semi-algebraic set U is a sum of squares of elements of the field of fractions of $\mathcal{N}(U)$.

Theorem 3.2.3.6 (real Nullstellensatz for Nash functions) ([6], [5]): If a Nash function f vanishes on the common zeros of Nash functions $f_1, \ldots, f_p$, there exist an integer m and a sum of squares of Nash functions s such that $f^{2m} + s$ belongs to the ideal generated by $(f_1, \ldots, f_p)$.

3.3. Some applications of the real spectrum
Rather than draw up a catalogue of results concerning the real spectrum, we have preferred to give a few significant applications, in which the real spectrum does not appear in the statement but is an essential ingredient of the proof.
3.3.1. Minimum number of inequalities needed to describe a domain of positivity in $R^d$.
Using the properties of the real spectrum, in particular dimension theory, L. Bröcker was able to exploit, for the study of the minimum number of inequalities needed to define a domain of positivity in $R^d$, the algebraic properties of the field K of rational functions, well known thanks to Pfister's theory of multiplicative quadratic forms ([28], [5]), to prove the following remarkable result

Theorem 3.3.1.1 ([7], [5]): Every domain of positivity of V can be defined using at most $d \cdot (d-2) \cdot (d-4) \cdots$ inequalities.

In the plane, for example, every domain of positivity, defined by a number of strict inequalities as large as one likes, can be described by only two inequalities.

3.3.2. Topology of real algebraic sets
The combinatorial topological properties of real algebraic sets have been the subject of many works (see [20] or [5]). Let V be a compact real algebraic set of dimension d and $\Phi : C \to V$ a semi-algebraic triangulation (C is a finite simplicial complex and $\Phi$ a semi-algebraic homeomorphism from C onto V). If $\sigma$ is a simplex of dimension d-1 of C, we denote by $g(\sigma)$ the number of simplices of dimension d of which $\sigma$ is a face. Then

Theorem 3.3.2.1 ([39], [11], [5]): If $\sigma$ is a simplex of dimension d-1, the number $g(\sigma)$ is even.
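The parity condition of Theorem 3.3.2.1 is easy to test on small complexes. The sketch below computes g for every face of codimension 1 of a complex given by its top-dimensional simplices. Two toy inputs are assumed here for illustration: the boundary of a tetrahedron, taken as a triangulation of the sphere $x^2+y^2+z^2=1$ (d = 2, via radial projection), and the boundary of a square, taken as a triangulation of the circle (d = 1). The function name `face_counts` is hypothetical.

```python
from collections import Counter
from itertools import combinations


def face_counts(top_simplices):
    """Map each codimension-1 face to g, the number of top simplices containing it.

    Each simplex is given as a tuple of vertex labels.
    """
    counts = Counter()
    for simplex in top_simplices:
        vertices = sorted(simplex)
        for face in combinations(vertices, len(vertices) - 1):
            counts[face] += 1
    return counts


# Boundary of a tetrahedron on vertices 0..3: the four triangles (d = 2).
sphere = list(combinations(range(4), 3))

# Boundary of a square: four edges (d = 1).
circle = [(0, 1), (1, 2), (2, 3), (0, 3)]

# A single filled triangle, included as a non-example.
disc = [(0, 1, 2)]

for name, complex_ in [("sphere", sphere), ("circle", circle), ("disc", disc)]:
    g = face_counts(complex_)
    print(name, "all g even:", all(v % 2 == 0 for v in g.values()))
```

In the first two cases every g equals 2. The single filled triangle has edges with g = 1, so by the theorem it cannot occur as a semi-algebraic triangulation of a compact real algebraic set of dimension 2.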
Theorem 3.3.2.2 ([11], [5]): If $\sigma$ and $\sigma'$ are two simplices of dimension d-1 and if $\Phi(\sigma)$ and $\Phi(\sigma')$ are contained in the same irreducible algebraic subset of dimension d-1 of V, then $g(\sigma)$ is congruent to $g(\sigma')$ modulo 4.

Thanks to the dictionary provided by the real spectrum between algebra and geometry, these theorems are proved in [11] from algebraic results on real valuation rings.

4. FUTURE PROSPECTS
4.1. Excellent rings
The study of the real spectrum of excellent rings seems very promising. Excellent rings [25] enjoy excellent properties (hence their name) from the point of view of commutative algebra while encompassing the particular cases of rings of polynomial functions, formal power series, algebraic formal power series, analytic functions and Nash functions. It seems likely that the real spectra of excellent rings have all the good properties of the real spectra of rings of polynomial functions. Starting from a real dimension theorem, J. Ruiz has already been able to prove

Theorem 4.1.1 [33]: If M is a compact analytic manifold, the subsets of M defined by a boolean combination of inequalities on analytic functions on M are in bijection with the constructible subsets of the real spectrum of the ring of analytic functions on M.

This gives the solution of Hilbert's 17th problem for real analytic functions on a compact real analytic manifold M.

Theorem 4.1.2 [33]: Every strictly positive analytic function on M is a sum of squares of meromorphic functions.

The non-compact case remains open. It is interesting to point out that in the analytic case one has Theorem 4.1.1, which replaces the Artin-Lang homomorphism theorem, but no stability under projection: the projection of a semi-analytic set is not in general semi-analytic [24], a phenomenon which is at the origin of the study of subanalytic sets.

4.2. Semi-Pfaffian sets
The study of the geometry of sets defined by a boolean combination of inequalities arises for classes of functions other than polynomials or analytic functions, for example for Pfaffian functions [19]. Pfaffian functions are defined by induction: polynomials are Pfaffian functions, and an analytic function whose derivatives are polynomial expressions in itself and in previously defined Pfaffian functions is Pfaffian (a typical example of a Pfaffian function is the exponential, which satisfies the differential equation $y' = y$).
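The inductive definition can be illustrated in one variable as follows. This is a sketch; the names $f_1, \ldots, f_4$ are chosen here for the illustration and do not come from the text.

```latex
% Each derivative is a polynomial in x, in the function itself,
% and in the functions already defined.
\begin{align*}
  f_1(x) &= e^{x},              & f_1' &= f_1, \\
  f_2(x) &= e^{e^{x}},          & f_2' &= f_1 f_2, \\
  f_3(x) &= 1/x \quad (x > 0),  & f_3' &= -f_3^{\,2}, \\
  f_4(x) &= \log x \quad (x > 0), & f_4' &= f_3 .
\end{align*}
```

Under the definition above, each of these functions is therefore Pfaffian on its domain.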
Semi-Pfaffian sets are the sets defined by boolean combinations of inequalities on Pfaffian functions. On the basis of partial results it is thought that semi-Pfaffian sets share the good finiteness properties of semi-algebraic sets [40]. Pfaffian functions are used by Khovanskii in the study of "fewnomials" (polynomials with few monomials). He proves a real Bézout theorem which bounds the number of nondegenerate solutions of a system of polynomial equations in n unknowns by a function of the number of monomials of the system [19]. This is a first-order statement for which no elementary proof is known.

5. BY WAY OF CONCLUSION
One is thus led to observe the profound influence that logic has exerted on real algebraic geometry. The converse is also true: real closed fields, introduced by Artin and Schreier for the study of Hilbert's 17th problem, have played an important role in the development of model theory. The recent work of Pillay-Steinhorn and van den Dries on O-minimal structures [29] shows that the history of these reciprocal exchanges and influences is not over.

One also observes that, in the development of the theory, a direct geometric or algebraic approach replaces the original logical or categorical considerations, a sign that abstract objects end up letting themselves be tamed to the point of becoming objects that it is obviously natural to study.

BIBLIOGRAPHY
[1] Artin E., Über die Zerlegung definiter Funktionen in Quadrate, Abh. Math. Sem. Univ. Hamburg 5 (1927) 85-99.
[2] Artin M., Grothendieck A., Verdier J.-L., Théorie des topos et cohomologie étale des schémas (SGA4) volume II, Springer Lect. Not. Math. 270 (1972).
[3] Artin M., Mazur B., On periodic points, Ann. of Math. 81 (1965) 82-99.
[4] Ben-Or M., Dexter K., Reif J., The complexity of elementary algebra and geometry, Preprint MIT (1983).
[5] Bochnak J., Coste M., Roy M.-F., Géométrie algébrique réelle, Ergebnisse der Mathematik, Springer-Verlag (to appear).
[6] Bochnak J., Efroymson J., Real algebraic geometry and the 17th Hilbert problem, Math. Ann. 251 (1980) 213-241.
[7] Bröcker L., Minimale Erzeugung von Positivbereichen, Geometriae Dedicata 16 (1984) 335-350.
[8] Collins G. E., Quantifier elimination for real closed fields: a guide to the literature, in Computer algebra, symbolic and algebraic calculation, Springer-Verlag (1982) 79-81.
[9] Coste M., Localizations, spectra and sheaf representation, in Applications of sheaves, Springer Lect. Not. Math. 753 (1979) 212-238.
[10] Coste M., Ensembles semi-algébriques, in Géométrie algébrique réelle et formes quadratiques, Springer Lect. Not. Math. 959 (1982) 109-138.
[11] Coste M., Sous-ensembles algébriques réels de codimension 1, C.R. Acad. Sc. Paris 300 (1985) 661-664.
[12] Coste M., Roy M.-F., Topologies for real algebraic geometry, in Topos theoretic methods in geometry, A. Kock ed., Århus Universitet (1979) 29-100.
[13] Coste M., Roy M.-F., La topologie du spectre réel, Contemp. Math. 8 (1982) 27-59.
[14] Coste M., Roy M.-F., Le spectre étale réel d'un anneau est spatial, C.R. Acad. Sc. Paris 290 (1980) 91-94.
[15] Delfs H., Knebusch M., Semi-algebraic geometry over a real closed field II: Basic properties of semialgebraic spaces, Math. Z. 178 (1981) 175-213.
[16] Dubois D., A Nullstellensatz for ordered fields, Ark. Math. 8 (1969) 111-114.
[17] Grigor'ev D., Vorobjov N., Solving systems of polynomial inequalities in subexponential time, Preprint, University of Leningrad.
[18] Hörmander L., The analysis of partial differential operators II, Springer Verlag (1983).
[19] Khovanskii A. G., Fewnomials and Pfaff manifolds, Proc. Int. Congress of Mathematics Warsaw (198?).
[20] King H., Survey on the topology of real algebraic sets, Rocky Mountain Journal of Math. 14(4) (1984) 821-830.
[21] Krivine J.-L., Anneaux préordonnés, J. Analyse Math. 21 (1964) 307-326.
[22] Lang S., The theory of real places, Ann. of Math. 57 (1953) 378-391.
[23] Lang S., Algebra, Addison-Wesley (1971).
[24] Łojasiewicz S., Ensembles semi-analytiques, I.H.E.S. Bures sur Yvette (1964).
[25] Matsumura H., Commutative algebra, Benjamin (1970).
[26] Mostowski T., Some properties of the ring of Nash functions, Ann. Scuola Norm. Sup. Pisa III-2 (1976) 243-266.
[27] Paugam A., Algorithmes d'élimination des quantificateurs, Colloque d'algèbre de Rennes (1985).
[28] Pfister A., Zur Darstellung definiter Funktionen als Summe von Quadraten, Invent. Math. 4 (1967) 229-237.
[29] Pillay A., Steinhorn C., On Dedekind complete O-minimal structures, Preprint (1985).
[30] Risler J.-J., Une caractérisation des idéaux des variétés algébriques réelles, C. R. Acad. Sci. Paris 271 (1970) 113-127.
[31] Robinson E., Affine schemes and p-adic geometry, Ph. D. University of Cambridge (1983).
[32] Roy M.-F., Faisceau structural sur le spectre réel et fonctions de Nash, in Géométrie algébrique réelle et formes quadratiques, Springer Lect. Not. Math. 959 (1982) 406-432.
[33] Ruiz J., Cônes locaux et complétions, C. R. Acad. Sci. Paris (to appear).
[34] Schwartz J., Shamir M., Mathematical problems and training in robotics, Notices of the A.M.S. (August 1983).
[35] Shafarevich I.R., Basic algebraic geometry, Springer Verlag (1974).
[36] Shoenfield J., Mathematical logic, Addison Wesley (1967).
[37] Seidenberg A., A new decision method for elementary algebra, Ann. of Math. (1954) 365-374.
[38] Tarski A., A decision method for elementary algebra and geometry, prepared for publication by J.C.C. McKinsey, Berkeley (1951).
[39] Thom R., Un lemme sur les applications différentiables, Bol. Soc. Mat. Mexicana, 1, (1956) 59-71.
[40] van den Dries L., Tarski's problem and Pfaffian functions, Preprint, Stanford University.
Logic Colloquium '85 Edited by The Paris Logic Group © Elsevier Science Publishers B.V. (North-Holland), 1987
SOME ASPECTS OF CATEGORICAL SEMANTICS: SHEAVES AND GLUEING

Andre Scedrov*
Department of Mathematics
University of Pennsylvania
Philadelphia, PA 19104, U.S.A.

* Partially supported by N.S.F. and by Centre Interuniversitaire en Études Catégoriques, Montréal.

Abstract. Two uses of sheaf models in functional analysis and in computable analysis are given. A new proof of the numerical instantiation property of intuitionistic ZF set theory is presented.

Introduction

The ongoing revival of interest in theories based on intuitionistic logic has come from two independent sources. On the one hand, BISHOP [1967] motivated several efforts toward a more precise formulation of its underlying logical principles: cf. FRIEDMAN [1977], MYHILL [1975], and FEFERMAN [1978]. Theirs was, in a sense, a continuation of KLEENE's and KREISEL's work on intuitionistic formal systems of arithmetic and analysis (cf. TROELSTRA [1973]). On the other hand, intuitionistic first order or higher order logic or arithmetic turned out to be synonymous with various free categories isolated in the wake of ARTIN-GROTHENDIECK-VERDIER [1972]: LAWVERE [1972], FREYD [1972, 1986], BOILEAU-JOYAL [1981], MAKKAI-REYES [1977], LAMBEK-SCOTT [1983]. Thus, while one motivation came from constructivism, the other arose from a category-theoretic formulation of algebraic geometry and algebraic topology in terms of sheaf categories, i.e., Grothendieck topoi (JOHNSTONE [1977], BARR-WELLS [1985]).

One of the popular misconceptions about theories with intuitionistic logic is that, due to the lack of Excluded Middle, they must be weaker than their counterparts with traditional (boolean) logic. In fact, it was suggested already in GÖDEL [1932] that boolean theories may be considered as equiconsistent subtheories of their intuitionistic analogues. Gödel's negative interpretation, which established this result for Peano arithmetic with respect to intuitionistic arithmetic, was anticipated to a certain extent in KOLMOGOROV [1925] and it has since been extended to type theory in MYHILL [1974] and to set theory in FRIEDMAN [1973]. This has, of course, involved the formulation of the corresponding intuitionistic theories. (Cf. section 3 for the precise description of intuitionistic ZF.) We wish to emphasize that the negative interpretation is but a special case of the relationship that naturally occurs in many areas of mathematics, e.g. regular open sets vs. open sets, algebraic closures of prime fields vs. fields, divisible torsion groups with infinitely many elements of every finite order vs. abelian groups, commutative von Neumann algebras vs. commutative C*-algebras, weak forcing vs. strong forcing (i.e., Kripke semantics), boolean algebras vs. Heyting algebras, boolean-valued models and their symmetric extensions vs. Grothendieck topoi. (The reader may consult BLASS-SCEDROV [1983], SCEDROV [1984a], and FOURMAN [1980] for details.)

Another point worth emphasizing is the ubiquity of categories that naturally induce interpretations of intuitionistic theories, e.g. pretopoi and topoi (MAKKAI-REYES [1977], BOILEAU-JOYAL [1981]). Even logically strong intuitionistic theories may be interpreted in Grothendieck topoi, e.g. the Fourman interpretation of intuitionistic ZF (cf. FOURMAN [1980] and also HAYASHI [1981]), based on a reconstruction of the von Neumann cumulative hierarchy within any Grothendieck topos, allowing any object as the object of atoms.

These category-theoretic interpretations provide a framework for the following approach. Involved mathematical objects (often stated in terms of equivariance or continuity in additional parameters) may be considered as representations of simpler mathematical objects given internally "in the right category", i.e., under a suitably chosen interpretation. This approach plays a significant part in obtaining several simplified arguments or improved results in various areas of mathematics: homotopy theory (JOYAL [1985], TIERNEY [1985]), differential geometry (MOERDIJK-REYES [1984, 1985]), commutative ring theory (MULVEY [1974], BORCEUX-VAN DEN BOSSCHE [1985]), analysis (ROUSSEAU [1979, 1985]), and functional analysis (SCEDROV [1986]). Boolean-valued analysis (TAKEUTI [1978], OZAWA [1983], JECH [1985]) is based on the same idea in a more restricted setting.

Three further illustrations of this approach are discussed in the present paper.
In section 1 we study linear algebra in categories of sheaves on compact Hausdorff spaces to obtain new facts concerning the law of inertia and the eigenvalue problem $Ax = \lambda Bx$ for symmetric matrices over rings of continuous functions. The rank computation algorithm for such continuous matrices is discussed. In section 2 we consider the topos of recursive sheaves (MULRY [1982]) as a setting for computable analysis. This enables us to give a simple proof of the effective Weierstrass Approximation Theorem for computable sequences of real functions (first obtained by the methods of recursive analysis in POUR-EL - CALDWELL [1975]). As in the first example, the results are obtained by interpreting some basic constructive mathematics (BISHOP [1967], BRIDGES [1979]) in a suitable category.

This technique of simplifying the issues by "moving into a better category" was first applied to intuitionistic logic itself in FREYD [1978]
in a new, elegant proof of the closure of various intuitionistic theories under instantiation rules "From $\exists x A$ infer $\exists! x (C \wedge A)$", with various further restrictions on $A$, on $C$, and on the range of $x$. 1) In category-theoretic terms this rule is stated as the projectivity of a terminator in various free categories, i.e., every epimorphism $A \to 1$ splits. Freyd exhibited these categories as retracts of categories in which 1 was obviously projective, thereby showing the projectivity of 1 in the given category. It was shown in SCEDROV-SCOTT [1982] that the interpretations of first order and higher order logic and arithmetic implicit in the categories obtained by Freyd's construction are the same as the slash interpretations given syntactically by Kleene and by Friedman. In section 3 we discuss the numerical instantiation: If $\exists x \in \mathbb{N}.\, A(x)$ is provable, then $A(n)$ is provable for some numeral $n$. We show that Freyd's method may be extended to intuitionistic ZF with Collection, thus giving an alternative proof of the numerical instantiation (first obtained in BEESON [1979] by the recursive realizability-and-truth interpretation). At the referee's request, we make the category-theoretic setting explicit.

We would like to thank Professor Feferman for inviting us to speak at the Logic Colloquium '85. We are grateful to Professor Lascar and other members of the Paris logic group for providing a pleasant atmosphere during the meeting.

1. Continuous matrices

KADISON [1984] asked for the characterization of compact Hausdorff spaces $X$ for which one has diagonalization of normal matrices over $C(X)$. The natural setting determined independently by GROVE-PEDERSEN [1984a,b] (using geometric and operator algebra methods) and SCEDROV [1986] (using sheaf models) is that of the sub-Stonean spaces, defined by the condition that any two disjoint open $F_\sigma$ subsets have disjoint closures. (Furthermore, it is helpful to assume that $X$ is totally disconnected.) Sub-Stonean spaces have apparently been rediscovered several times (e.g. in CHOQUET [1970] before this occasion). GILLMAN-JERISON [1960] referred to them as F-spaces.

A particular kind of sub-Stonean spaces are the Rickart spaces,
characterized by the requirement that open $F_\sigma$ subsets have open closures. This condition is clearly equivalent to the condition (studied in BERBERIAN [1972]) that the lattice of open-and-closed sets is countably complete. Rickart spaces were called basically disconnected in GILLMAN-JERISON [1960]. Every Rickart space is a totally disconnected sub-Stonean space, but not vice versa. All of these spaces, including the proper subclass of extremally disconnected spaces considered in boolean-valued analysis, occur naturally in functional analysis as the spaces of maximal ideals in various abelian operator algebras. These spaces have simple characterizations in terms of the topological sheaf interpretation of the properties of natural ordering of intuitionistic reals (cf. below). We use this fact to prove the law of inertia and give the solution to the eigenvalue problem $Ax = \lambda Bx$ for symmetric matrices over $C(X)$ for any totally disconnected sub-Stonean space $X$, and to show that the ordinary rank computation algorithm applies to matrices over $C(X)$ iff $X$ is a Rickart space.

The topological interpretation of constructive analysis goes back to SCOTT [1968] (in the Baire space case, which has recently been investigated in SCOWCROFT [1984] from a model-theoretic point of view). A more general setting is given in FOURMAN-SCOTT [1979]. For our purposes it suffices to interpret (Dedekind) reals as real-valued continuous functions on $X$, rationals as rational constants, and let

$[\![a = b]\!] = \mathrm{int}\,\{x \in X : a(x) = b(x)\}$,
$[\![a < b]\!] = \{x \in X : a(x) < b(x)\}$,
$[\![a \le b]\!] = \mathrm{int}\,\{x \in X : a(x) \le b(x)\}$.

Formulae involving propositional connectives and quantifiers over the reals are interpreted as interiors of subsets obtained by the corresponding boolean algebra operations on subsets of $X$. The standard Soundness Lemma states that $[\![A]\!] = X$ for any constructively provable assertion $A$. (Constructive provability may refer to, say, provability in intuitionistic ZF, when this lemma is a special case of the soundness of the Fourman interpretation mentioned in the introduction. In practice, however, it suffices to refer to much weaker theories, such as intuitionistic second order arithmetic.)

The reader will have noticed that the interpretation given above is formulated for Dedekind cuts. They are intuitionistically
equivalent to Cauchy sequences of rationals in the presence of the Axiom of Countable Choice, but not otherwise (FOURMAN-HYLAND [1979]).

Proposition 1.1. Let $X$ be a totally disconnected, compact Hausdorff space. Then:
a) $X$ is sub-Stonean iff $[\![\forall r \in R.\,(r \ge 0 \vee r \le 0)]\!] = X$ iff $[\![\forall r, s \in R.\,\exists p \in R.\,(r = ps \vee s = pr)]\!] = X$.
b) $X$ is a Rickart space iff $[\![\forall r \in R.\,(r \ge 0 \vee \neg(r \ge 0))]\!] = X$ iff $[\![\forall r \in R.\,(r = 0 \vee \neg(r = 0))]\!] = X$.
c) $X$ is extremally disconnected iff $[\![\forall U \in P(\mathbb{N}).\,\forall n \in \mathbb{N}.\,(n \in U \vee \neg(n \in U))]\!] = X$.

Proof. Regarding a) and b), notice that $[\![a \ge 0]\!] = \mathrm{int} \bigcap_n \{x \in X : a(x) > -2^{-n}\}$ is the interior of a closed $G_\delta$ set. The second condition in a) reflects a characterization in GILLMAN-JERISON [1960] requiring that every finitely generated ideal in $C(X)$ is principal. The second condition in b) is treated similarly. Regarding c), interpret sets of natural numbers as sequences of open sets, and let $[\![n \in U]\!] = U_n$.

Proposition 1.2. The following are intuitionistic consequences of $\forall r, s \in R\ \exists p \in R.\,(r = ps \vee s = pr)$:
a) All eigenvalues of an invertible symmetric real matrix are invertible reals.
b) Any positive (semi)definite matrix $P$ may be written as $P = W^2$, with $W$ positive (semi)definite.
c) Any invertible real matrix $S$ may be factored as $S = UT$, with $U$ unitary, and $T$ upper triangular with positive diagonal.

Proof. The eigenvalues of a symmetric matrix may be obtained anyway (ROUSSEAU [1985]). If, in addition, the condition of the proposition holds, then (SCEDROV [1986]) every symmetric matrix is of the form $QDQ^T$, with $Q$ unitary and $D$ diagonal. $(QDQ^T)^{-1} = QD^{-1}Q^T$, so part a) follows. Regarding b), let $W = QD^{1/2}Q^T$. Regarding c), recall that the Gram-Schmidt algorithm is allowed (SCEDROV [1986]).

Lemma 1.3. Assume the condition of Proposition 1.2. Then, intuitionistically, any invertible congruence transform $S^T A S$ of an invertible symmetric matrix $A$ has the same number of positive (negative) eigenvalues as $A$.

Proof. Proposition 1.2 a), c) allows us to follow the standard proof. Indeed, write $S = UT$ by Proposition 1.2c, and let $S(t) = tU + (1-t)UT$, with $0 \le t \le 1$. Each $S(t)$ is invertible because $U$ is unitary, and the other factor $tI + (1-t)T$ is triangular with positive diagonal. Indeed, let $d > 0$ be a diagonal entry of $T$. Both $t, (1-t)d \ge 0$. If $t < 2/3$, then $t + (1-t)d \ge d/3 > 0$. If $t > 1/3$, then $t + (1-t)d \ge t > 1/3 > 0$. Because each $S(t)^T A S(t)$ is invertible, Proposition 1.2a implies that the eigenvalues of each $S(t)^T A S(t)$ are apart from 0. As in the standard argument, the number of positive (negative) eigenvalues therefore does not change along the path from $S(0)$ to $S(1)$. Now

$S(0)^T A S(0) = S^T A S$,
$S(1)^T A S(1) = U^T A U = U^{-1} A U$.

The latter matrix clearly has the same eigenvalues as $A$.

Theorem 1.4. Let $X$ be a compact, totally disconnected sub-Stonean space, and let $A$ be an invertible symmetric matrix over $C(X)$. Then any invertible congruence transform $S^T A S$ has the same number of positive (negative) eigenvalues as $A$.

Proof. Lemma 1.3 in the topological sheaf interpretation over $X$. Because $X$ is totally disconnected, it has a basis of open-and-closed sets, hence a supremum involved in the interpretation of an existential assertion may be replaced by a single instance.

The eigenvalue problem $Ax = \lambda Bx$ is our next topic.
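
The path argument in the proof of Lemma 1.3 can be checked by hand in a small classical case. The following Python sketch is only an illustration over the rationals, not part of the intuitionistic argument: the matrices `A`, `U`, `T` and all helper names are assumptions chosen for the example, and the inertia of an invertible symmetric 2×2 matrix is read off from its determinant and trace.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def inertia_2x2(A):
    """(#positive, #negative) eigenvalues of an invertible symmetric 2x2 matrix."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if det == 0:
        raise ValueError("matrix is not invertible")
    if det < 0:
        return (1, 1)                      # eigenvalues of opposite sign
    return (2, 0) if A[0][0] + A[1][1] > 0 else (0, 2)

# Illustrative data (assumptions): A symmetric and invertible,
# U a rational rotation (U^T U = I), T upper triangular with positive diagonal.
A = [[F(2), F(1)], [F(1), F(3)]]
U = [[F(3, 5), F(-4, 5)], [F(4, 5), F(3, 5)]]
T = [[F(2), F(7)], [F(0), F(5)]]

def S_of(t):
    """The path S(t) = t*U + (1 - t)*U*T from the proof of Lemma 1.3."""
    UT = matmul(U, T)
    return [[t * U[i][j] + (1 - t) * UT[i][j] for j in range(2)] for i in range(2)]

def congruence(S, M):
    """The congruence transform S^T M S."""
    return matmul(matmul(transpose(S), M), S)

# Inertia of S(t)^T A S(t) sampled along the path t = 0, 1/10, ..., 1.
path_inertias = [inertia_2x2(congruence(S_of(F(k, 10)), A)) for k in range(11)]
```

Every sampled matrix along the path has the same inertia as `A`, which is the classical content of the law of inertia for this example.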

Lemma 1.5. Assume the condition of Proposition 1.2. Then, intuitionistically, any symmetric $n \times n$ real matrix and any positive definite $n \times n$ real matrix may be simultaneously diagonalized by a congruence transformation.

Proof. Again, the assumption allows us to follow the standard argument. By Proposition 1.2b we may write $B = W^2$, with $W = W^T$ positive definite. Thus $Ax = \lambda Bx$ becomes $AW^{-1}y = \lambda Wy$, with $y = Wx$. Letting $C = W^{-1}$, the eigenvalue problem under consideration is reduced to the standard eigenvalue problem $My = \lambda y$ with $M$ symmetric (here $M = C^T A C$). Let $U$ be a unitary matrix for which $U^T M U$ is diagonal (SCEDROV [1986]) and let $S = CU$. Then $U = WS$, so $S^T B S = S^T W^2 S = S^T W^T W S = U^T U = I$ and $S^T A S = U^T C^T A C U = U^T M U$ is diagonal.

Theorem 1.6. Let $X$ be a compact, totally disconnected sub-Stonean space. Then, in the ring of $n \times n$ matrices over $C(X)$, any symmetric matrix and any positive definite matrix may be simultaneously diagonalized by a congruence transformation.

Proof. Lemma 1.5 in the topological sheaf interpretation over $X$.

We close this section with a remark on the rank computation algorithm for matrices over $C(X)$.

Proposition 1.7. Each of the following assertions intuitionistically implies the next:
i) $\forall r \in R.\,(r = 0 \vee \neg(r = 0))$ and $\forall r, s \in R.\,\exists p \in R.\,(r = ps \vee s = pr)$.
ii) For any $m \times n$ real matrix $A$ there exist a real permutation matrix $P$, a lower triangular real matrix $L$ with a unit diagonal, and a real $m \times n$ echelon matrix $R$ such that $PA = LR$.
iii) First half of i).

Proof. Row reduction may be used already under the second half of i) (SCEDROV [1986]). Here, however, one must also be able to recognize non-zero pivots.

Theorem 1.8. Let $X$ be a compact Hausdorff space. Then the following are equivalent:
i) $X$ is a Rickart space,
ii) For any $m \times n$ matrix $A$ over $C(X)$ there exist a permutation matrix $P$ over $C(X)$, a lower triangular matrix $L$ over $C(X)$ with the constant function 1 on the diagonal, and an $m \times n$ echelon matrix $R$ over $C(X)$ such that $PA = LR$.

Proof. Proposition 1.1b and the topological sheaf interpretation of Proposition 1.7.
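
The factorization $PA = LR$ of Proposition 1.7 ii) can be sketched classically over the rationals, where the test "is this entry zero?" is decidable. The Python code below is a minimal illustration under that assumption; the function names `echelon_factor` and `matmul` are hypothetical and not taken from the sources cited above. The comparison `!= 0` in the pivot search is exactly the step that, for the internal reals, requires $r = 0 \vee \neg(r = 0)$.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def echelon_factor(A):
    """Return (P, L, R) with P*A == L*R for a matrix A of Fractions.

    P is a permutation matrix, L is lower triangular with unit diagonal,
    and R is in row echelon form.
    """
    m, n = len(A), len(A[0])
    R = [row[:] for row in A]
    L = [[Fraction(0)] * m for _ in range(m)]
    perm = list(range(m))
    row = 0
    for col in range(n):
        if row == m:
            break
        # Recognizing a non-zero pivot: decidable for rationals.
        pivot = next((i for i in range(row, m) if R[i][col] != 0), None)
        if pivot is None:
            continue                      # no pivot in this column
        # Swap rows in R, in the permutation, and in the multipliers found so far.
        R[row], R[pivot] = R[pivot], R[row]
        L[row], L[pivot] = L[pivot], L[row]
        perm[row], perm[pivot] = perm[pivot], perm[row]
        for i in range(row + 1, m):
            factor = R[i][col] / R[row][col]
            L[i][row] = factor
            R[i] = [R[i][j] - factor * R[row][j] for j in range(n)]
        row += 1
    for i in range(m):
        L[i][i] = Fraction(1)
    P = [[Fraction(int(perm[i] == j)) for j in range(m)] for i in range(m)]
    return P, L, R

# Two small illustrative inputs (assumptions), one of them rank-deficient.
A1 = [[Fraction(v) for v in r] for r in [[0, 2, 1], [1, 1, 0], [2, 2, 0]]]
A2 = [[Fraction(v) for v in r] for r in [[1, 2], [2, 4], [3, 7]]]
P1, L1, R1 = echelon_factor(A1)
P2, L2, R2 = echelon_factor(A2)
```

The number of non-zero rows of `R` is the rank, so the whole computation rests on being able to decide whether a candidate pivot vanishes.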

2. Computability in analysis and recursive sheaves

In this section we discuss a sheaf interpretation which yields the positive results obtained in the study of computability in classical analysis (POUR-EL - CALDWELL [1975], POUR-EL - RICHARDS [1983a,b]) as simple consequences of constructive theorems (BISHOP [1967], BRIDGES [1979]). Since "computable analysis" is an overused term, let us begin by making some distinctions. The study of computability in classical analysis that we are concerned with here is mostly the study of the effective approximability as an additional property in the context of the classical continuum, and the relations of this property to the main notions of classical analysis (cf. Definitions 2.3 and 2.7 below). This is in contrast with constructive computable analysis (KUSHNER [1984]) in which only the computable reals and the computable functions are allowed. This contrast is well illustrated by the fact that there is a counterexample to the effectivization of the existence theorem for the solution to the wave equation under the first approach (cf. e.g. POUR-EL - RICHARDS [1983b]), while the theorem holds under the other approach (SCEDROV [1984b]). The counterexample arises when the initial function is computable and $C^1$, and we insist on a computable solution. On the other hand, a computable solution may be found (by the Kirchhoff formula) if both the initial function and its derivative are computable.

Bishop-style constructive setting is linked with constructive computable analysis and algebra by the recursive realizability semantics (most recently considered in HYLAND [1982] and McCARTY [1984]). The sheaf interpretation considered here is different and more suited to the Pour-El - Richards approach. The fact that there is a common core of positive results in either semantics certainly suggests the possibility of general transfer principles. These principles and the related category-theoretic issues are being investigated by Rosolini in his dissertation under D. Scott (ROSOLINI [1986]). Martin Hyland also tells us that he had been aware of the uses of the sheaf model described here.

Although the semantics we will deal with in this section is a special case of a sheaf interpretation of a higher order language in a Grothendieck topos (cf.
e.g. Chapter 0 in SCEDROV [1984a]), indeed, a special case of the Fourman interpretation with (non-standard) natural numbers as atoms, it seems worthwhile to state this specific case explicitly. The topos in question was introduced in MULRY [1982, 1985].

Let $N$ be the monoid of recursive functions $f: \mathbb{N} \to \mathbb{N}$, with the monoid multiplication defined as composition. We consider a subcategory of the category whose objects are $N$-sets (i.e., sets with an action of the monoid $N$), and whose morphisms are equivariant maps (i.e., maps that preserve the action). Note that $N$ is itself an $N$-set with the action defined by composition. The subcategory Rec of recursive sheaves has the same maps, but its objects are the $N$-sets $X$ with the property that for any finite, jointly surjective collection of recursive functions $f_1, f_2, \ldots, f_k$, and any equivariant map $F: S \to X$ from the sub-$N$-set $S \subseteq N$ generated by these functions, there is a unique $x \in X$ such that $F(f_i) = x f_i$ for all $1 \le i \le k$. Observe that $N$ itself is a recursive sheaf.

Let us consider a multisorted first order language with equality,
whose sorts are recursive sheaves. For each formula $A(x_1, \ldots, x_n)$ of this language (with free variables among those displayed) and a choice of elements $a_1, \ldots, a_n$ of the corresponding sorts we define the forcing relation $\Vdash A(x)[a]$ (to be read as "$A(x_1, \ldots, x_n)$ is forced for $a_1, \ldots, a_n$"):

$\Vdash \top$.
$\Vdash (x_1 = x_2)[a_1, a_2]$ iff $a_1 = a_2$.
$\Vdash (A \wedge B)(x)[a]$ iff $\Vdash A(x)[a]$ and $\Vdash B(x)[a]$.
$\Vdash (A \vee B)(x)[a]$ iff there exist finitely many jointly surjective $f_1, \ldots, f_k \in N$ such that for each $i = 1, \ldots, k$, $\Vdash A(x)[a_1 f_i, \ldots, a_n f_i]$ or $\Vdash B(x)[a_1 f_i, \ldots, a_n f_i]$.
$\Vdash (A \to B)(x)[a]$ iff for each $f \in N$, if $\Vdash A(x)[a_1 f, \ldots, a_n f]$ then $\Vdash B(x)[a_1 f, \ldots, a_n f]$.
$\Vdash \exists y.A(y, x)[a]$ iff there exist finitely many jointly surjective $f_1, \ldots, f_k \in N$ such that for each $i = 1, \ldots, k$ there exists an element $b$ of the sort of $y$ such that $\Vdash A(y, x)[b, a_1 f_i, \ldots, a_n f_i]$.
$\Vdash \forall y.A(y, x)[a]$ iff for each $f \in N$ and each element $b$ of the sort of $y$, $\Vdash A(y, x)[b, a_1 f, \ldots, a_n f]$.

We say that a sentence of this language holds in Rec iff it is forced.

An interpretation of intuitionistic higher order arithmetic is obtained if the appropriate sheaves are specified as sorts. The natural number sort is given by the sub-$N$-set $\mathbb{N} \subseteq N$ consisting of constant functions (under the static action). The truth-value object $\Omega$ (the internal power set of a singleton) is the set of families of r.e. sets closed under inclusion and finite union, where the action is by left translation: $Sf$ consists of all $g \in N$ such that $fg$ enumerates an r.e. set in $S$. The exponential $Y^X$ of $N$-sets $X$, $Y$ is the $N$-set whose elements are the equivariant maps $F: N \times X \to Y$, the action being $(Ff)(g, x) = F(gf, x)$. The power-object of $X$ is given by the exponential $\Omega^X$, so the meaning of $\Vdash (x \in y)[a, p]$ with $a \in X$, $p \in \Omega^X$, is obvious.

This interpretation of intuitionistic higher order arithmetic is of no interest: internal Dedekind reals are just external reals. A contribution of MULRY [1982] is, firstly, to expand the natural number sort to $N$ and, secondly, to consider only the r.e. truth-values, i.e., to replace $\Omega$ by its sub-$N$-set $\Omega_{r.e.}$ consisting of the principal ideals.

A formula is called arithmetic if the only quantifiers it contains are quantifiers over $N$. The following two easy facts are recent folklore:

Proposition 2.1. Induction on $N$ and Comprehension w.r.t. $\Omega_{r.e.}^N$ hold in Rec for positive existential arithmetic formulae with parameters over $N$ and $\Omega_{r.e.}^N$. This includes all $\Sigma^0_1$ formulae with parameters over $N$ only.

Proposition 2.2. $\forall x, y \in N\,(x = y \vee \neg(x = y))$ holds in Rec.

Starting from the sort $N$, the standard construction of rationals yields (in this semantics) the $N$-set $Q$ of rational-valued recursive functions $f: \mathbb{N} \to \mathbb{Q}$, where the $N$-action is by composition. To describe Dedekind cuts in $Q$ with r.e. truth-values, we recall a few basic definitions in computable analysis (POUR-EL - RICHARDS [1983b]).

Definition 2.3. A sequence of reals $\{x_k\}$ is called computable if there is a recursive double sequence of rationals $\{r_{kn}\}$ such that $|x_k - r_{kn}| < 2^{-n}$ for all $k$, $n$.
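
As an illustration of Definition 2.3, the sequence $x_k = \sqrt{k}$ is computable in this sense. The Python sketch below gives one possible recursive double sequence of rationals; the function name `r` mirrors the notation $r_{kn}$ above, and the choice of example and of the scaling factor are assumptions made for this sketch.

```python
from fractions import Fraction
from math import isqrt

def r(k, n):
    """Rational r_kn with 0 <= sqrt(k) - r_kn < 2**-(n+1) < 2**-n."""
    scale = 2 ** (n + 1)
    # isqrt(k * scale**2) is the integer part of scale * sqrt(k).
    return Fraction(isqrt(k * scale * scale), scale)
```

The error bound can be verified in exact arithmetic without ever computing $\sqrt{k}$: it suffices that $r_{kn}^2 \le k < (r_{kn} + 2^{-n})^2$.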

Proposition 2.4. (MULRY [1982]) The object $R \subseteq (\Omega_{r.e.}^Q)^2$ of Dedekind cuts in $Q$ is given by the $N$-set of computable sequences of reals, where the $N$-action is by composition.

We shall refer to $R$ as the object of (internal) reals. Let us first observe that:

Proposition 2.5.
a) $\Vdash$ Every real is computable.
b) $\Vdash$ Every $f \in N^N$ is recursive.

Proof. Given $x \in R$ let $r \in Q^N$ be a recursive double sequence obtained by Proposition 2.4. Clearly $\Vdash \forall n \in N\,(|x - r_n| < 2^{-n})$. An internal recursive index $e \in N$ of $r$ is given by the recursive function $e(n) = $ index of $\lambda m.\,r(n, m)$.

Proposition 2.6. $\Vdash \forall x \in R.\,\exists n \in N.\; n > x$.

Proof. Given a computable sequence $x$ of reals, we need a recursive function $f$ such that $f(n) > x_n$ for all $n$. Let $f(n) = r_{n0} + 2$.

We recall that:

Definition 2.7. A sequence $\{F_k\}$ of real functions on $[0,1]$ is called computable if:
a) For every computable sequence $\{x_n\}$ of points in $[0,1]$, $\{F_k(x_n)\}$ is a computable double sequence of reals, and
b) There exists a recursive function $d(n, k)$ such that for all $x, y \in [0,1]$, $|x - y| < 2^{-d(n,k)}$ implies $|F_k(x) - F_k(y)| < 2^{-n}$.

Our central observation is:

Proposition 2.8. The internal $C([0,1])$ in Rec is given by the $N$-set of computable sequences of real-valued functions on $[0,1]$, where the $N$-action is given by composition.

Proof. It is readily checked that the internal $[0,1]$ is given by the $N$-set of computable sequences of points from $[0,1]$ in the real world. Furthermore, describing any $N$-set $X$ means specifying the morphisms $N \to X$. Thus, let $F: N \times [0,1] \to R$ be a morphism in Rec. It is clear from Proposition 2.4 and from the equivariance of $F$ that the condition a) of Definition 2.7 holds for $F$. Regarding b), note that:

$\Vdash \forall k, n \in N.\,\exists m \in N.\,\forall x, y \in [0,1].\,(|x - y| < 2^{-m} \to |F_k(x) - F_k(y)| < 2^{-n})$,

so there are recursive, jointly surjective $f_1, \ldots, f_p: \mathbb{N} \to \mathbb{N}$ such that for each $i = 1, \ldots, p$ there is $g_i \in N$ such that for all computable sequences $x$, $y$ of points in $[0,1]$ and all $j$, if:

$|x_{f_i(j)} - y_{f_i(j)}| < 2^{-g_i(j)}$,

then:

$|F_{p_1 f_i(j)}(x_{f_i(j)}) - F_{p_1 f_i(j)}(y_{f_i(j)})| < 2^{-p_2 f_i(j)}$,

where $p_1, p_2: \mathbb{N} \times \mathbb{N} \to \mathbb{N}$ are (recursive) projections. Letting $g = \min\{g_i : i = 1, \ldots, p\}$, and considering only constant $x$, $y$, we obtain that:

$|x - y| < 2^{-g(n,k)}$ implies $|F_k(x) - F_k(y)| < 2^{-n}$

for all $n, k \in \mathbb{N}$ and all computable points in $[0,1]$. Then the condition b) of Definition 2.7 holds for the unique extensions of the $F_k$'s to $[0,1]$. The other direction is verified similarly.

We note that every computable real function (say, on $[0,1]$) is completely described by a (recursive index of a) triple sequence of rationals specifying the values of the function along a recursive enumeration of the rationals in $[0,1]$. Similarly, every computable sequence of real functions on $[0,1]$ is completely described by a recursive index (of a quadruple sequence of rationals). Furthermore, these remarks may be internalized in Rec in light of Proposition 2.5.

One more definition is needed before stating our main result.

Definition 2.9. A double sequence $F_{nk}(x)$ of real functions is said to effectively uniformly converge to a sequence $F_n(x)$ as $k \to \infty$ if there is a recursive function $e(n, m)$ such that $k \ge e(n, m)$ implies $|F_{nk}(x) - F_n(x)| < 2^{-m}$ for all $n$, $m$, $x$.

Notice that the convergence described in this definition is effective in both $k$, $n$, but uniform only in $x$.

Theorem 2.10. For any computable sequence $\{F_n\}$ of real functions on $[0,1]$, there exists a computable double sequence $\{P_{nk}\}$ of polynomials such that, as $k \to \infty$, $P_{nk}$ effectively uniformly converges to $F_n$.

Proof. By Propositions 2.1, 2.2, 2.5, 2.6, and the remarks following Proposition 2.8, one may follow the proof of the Weierstrass Approximation Theorem in BRIDGES [1979] (IV:3.4) internally in Rec. One shows analogously to Proposition 2.8 that the convergence described in Definition 2.9 is the internal uniform convergence of a sequence in the internal $C([0,1])$. This, together with Propositions 2.4, 2.8 yields the theorem.
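
To make the notion of an effective modulus concrete, here is a classical sketch in Python using Bernstein polynomials. This is an assumption made for illustration only: the proof cited above may well use a different family of approximating polynomials, and the test function `f` and the helper names are hypothetical. For a function $f$ with $|f(x) - f(y)| \le |x - y|$, the degree-$k$ Bernstein polynomial satisfies $|B_k f(x) - f(x)| \le 1/(2\sqrt{k})$ on $[0,1]$, so $e(m) = 4^m$ serves as a recursive modulus in the sense of Definition 2.9.

```python
from fractions import Fraction
from math import comb

def bernstein(f, k):
    """Return the degree-k Bernstein polynomial of f as a function on [0, 1]."""
    values = [f(Fraction(i, k)) for i in range(k + 1)]
    def p(x):
        return sum(values[i] * comb(k, i) * x ** i * (1 - x) ** (k - i)
                   for i in range(k + 1))
    return p

def f(x):
    """A 1-Lipschitz test function with rational values at rational points."""
    return abs(x - Fraction(1, 2))

def max_error(k, grid=50):
    """Largest error |B_k f - f| over the grid points j/grid, in exact arithmetic."""
    p = bernstein(f, k)
    points = [Fraction(j, grid) for j in range(grid + 1)]
    return max(abs(p(x) - f(x)) for x in points)

def modulus(m):
    """Degree sufficient for error below 2**-m for 1-Lipschitz functions."""
    return 4 ** m
```

For instance, `max_error(modulus(2))` stays below $2^{-2}$ on the sample grid, in line with the bound above.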

Remark. HYLAND [1982] studies the "Effective Topos" based on recursive realizability. The covariant representable functor given by the natural number object in the Effective Topos in fact has values in Rec, not just in Sets. Propositions 2.4 and 2.8 given above claim that this functor preserves the metric spaces of reals and of uniformly continuous real functions on $[0,1]$. In fact, one may show in a similar manner that all "computable Banach spaces" of POUR-EL - RICHARDS [1983a,b] arise this way.

3. The numerical instantiation in intuitionistic ZF with Collection

We formulate intuitionistic ZF set theory, IZF, in the first order language with equality, whose only non-logical symbol is a binary relation symbol $\in$, and whose logical symbols are $\wedge$, $\vee$, $\to$, $\neg$, $\exists$, $\forall$. The underlying logic is intuitionistic predicate calculus with equality, and the non-logical axioms are Extensionality, Pairing, Union, Infinity, Power Set, Full Comprehension, and the axioms:

Foundation. $\forall x\,(\forall y \in x.\,A(y) \to A(x)) \to \forall x\,A(x)$,
Collection. $\forall x \in u.\,\exists y\,A(x, y) \to \exists z.\,\forall x \in u.\,\exists y \in z.\,A(x, y)$,

with the usual restrictions on free occurrences of variables. 2)

We recall that any (intuitionistic or boolean) first order theory canonically determines (and is in turn determined by) a logos, i.e., a regular category in which the subobjects of each object form a distributive lattice under inclusion, and the pullback map along each morphism is a lattice homomorphism with a right adjoint. 3) This logos may be somewhat loosely described as the category of formulae and definable functional relations, equality of morphisms being the provable equality. 4) Let $\mathbf{L}$ be the logos given by the first order theory IZF. Note that $\mathbf{L}$ has a natural number object (n.n.o.) (JOHNSTONE [1977]) given by a formula of IZF that defines the set of natural numbers. The terminator in $\mathbf{L}$ is not projective, i.e., IZF does not allow unrestricted instantiation (FRIEDMAN-SCEDROV [1985]). We shall prove, on the other hand, that every epic in $\mathbf{L}$ from a subobject of a n.n.o. to 1 does split, i.e., IZF allows the numerical instantiation. We use a modification of Freyd's category-theoretic glueing argument. In syntactic terms, our argument is an extension of the slash interpretation. A potential relevance of this interpretation to the problem at hand was suggested by H. Friedman.

In Freyd's original glueing construction (cf.
e.g. SCEDROV-SCOTT [1982]), a category $\mathbf{A}$ with a terminator 1 is associated with another category $\hat{\mathbf{A}}$ (in which 1 is projective) and a functor $\hat{\mathbf{A}} \to \mathbf{A}$ that preserves terminators. Indeed, let $\Gamma: \mathbf{A} \to$ Sets be the global sections functor, which sends an object $A$ to the collection $\Gamma(A)$ of morphisms $1 \to A$, and a morphism $x: A \to B$ to the map $\Gamma(x): \Gamma(A) \to \Gamma(B)$ defined by composition. Let $\hat{\mathbf{A}}$ be the category whose objects are triples $\langle S, f, A \rangle$, where $S$ is a set, $A$ an object of $\mathbf{A}$, and $f$ a map from $S$ to $\Gamma(A)$. The morphisms are pairs of maps and morphisms of $\mathbf{A}$ satisfying the obvious commutativity condition. $\langle$singleton, identity, 1$\rangle$ is a terminator in $\hat{\mathbf{A}}$, and it is obviously projective. Erasing all of the Sets-structure defines a functor $(\ )^-: \hat{\mathbf{A}} \to \mathbf{A}$. If $\mathbf{A}$ is a logos (resp. topos) with a n.n.o., so is $\hat{\mathbf{A}}$, and the erasing functor is a representation of logoi (resp. topoi). If $\mathbf{A}$ is the free logos (resp. free topos) with a n.n.o. 5), the unique free representation $\mathbf{A} \to \hat{\mathbf{A}}$ followed by erasing must be the identity, hence $\mathbf{A}$ is a retract of $\hat{\mathbf{A}}$, and thus it has a projective terminator.
The whole problem in employing Freyd's glueing construction 1n showing the projectivity of terminators of other logoi, particular in
L,
representation of
in
consists in show1ng the existence of a logos L
1n
t
(where Sets is now considered as the
logos of possibl Y 1 ar-se subset s of the universe of small set s i , The existence of such a representation follows from MYHILL [1973)
if
IZF is weaKened by stating: Replacement
¥xeu.3!y A(X,y)
instead of Collection.
Indeed,
~
3z.¥xeu.3yez.A(x,y)
MYHILL [1973) shows that t n i s frag-
ment of IZF does allow unrestricted instantiation. hand,
the representation in
t.
In the case at
does not exist because
projective. We split all L-epics to
by embedding
1S not L
into
X. but in a manner that does not introduce any new L to be represented in (a sublogos of) t. In category-theoretic terms, our construction of X is similar to a limit-slice construction in FREYD [1972J (cf. also BARR-WELLS [1985)). To an audience of logicians, X may be described as the logos of formulae and provable functional relations of a certain
another logos
numerals and allows
theory
T.
Theory
T
is best described as the union of a transfinite
sequence of theories The language of
Ta+1
Ta
through all ord1nals
is the language of
Ta
a
6)
TO = IZF.
extended by new
A. Scedrov
294 c on s t ant s in
To'
cA'
where the sentence
but the constant for
axioms and rules of ';Ix (xecA
C>
A(x».
To
y
3Y';lx(xey
A(x»
c>
is provable
was not available.
Extend the
to this new language and add a new aX10m
For a limit ordinal λ, the language of T_λ is the union of previously introduced languages together with new constants b_{w,U,V,A} specified below. Extend the axioms and rules, and add new axioms as follows. Let A(x,y) be a formula with exactly x, y free, w a term, and U, V small sets of terms of the previous languages such that:

i) ∀x∈w.∃y A(x,y) is provable in some T_α, α < λ;
ii) for each u∈U, u∈w is provable in some T_α, α < λ;
iii) for each u∈U, there is v∈V such that A(u,v) is provable in some T_α, α < λ.

Then let b_{w,U,V,A} be a new constant of T_λ (if it has not been introduced yet), and add the following axioms:

(1) v ∈ b_{w,U,V,A}, for each v∈V;
(2) ∀x∈w ∃y∈b_{w,U,V,A} A(x,y).

Lemma 3.1. L is embeddable in X.

Proof. Viewed category-theoretically, this is immediate because X is a directed union of slices w.r.t. well-supported objects. A logician may use transfinite induction on α to show that each T_α is conservative over IZF. This is clear for α+1. At a limit ordinal λ one needs to consider only finitely many instances of (1) and (2) (by compactness), and replace them by i) - iii) and finite union.

In order to define a representation of L in t, we specify a full subcategory of t.
We refer to the global sections of t as witnesses. For each ordinal α, let M_α be the set of all witnesses whose Sets-part is a singleton on a set of witnesses in some M_γ, γ < α. Let M = ∪_α M_α. We may, of course, view the category of possibly large subsets of M as a full subcategory of Sets. Let L* be the full subcategory of t whose objects are those objects <S,f,A> for which the elements of S are (n-tuples of) witnesses in M, and f is given by the erasing functor ( )⁻. It may be readily seen that L* is a sublogos. (This is analogous to the situation described in SCEDROV-SCOTT [1982] and in LAMBEK-SCOTT [1983].) We define a functor F: L → L* by letting F(A) = <S,f,A> with S = ( […]
Lemma 3.2. F: L → L* is a representation of logoi.

Proof. We must verify that Collection holds in L*. The rest is as in MYHILL [1973] and in particular in SCEDROV-SCOTT [1982] in terms of slash. As in SCEDROV-SCOTT [1982], the witnesses may be written as pairs <S,v>, where S is a small set of witnesses and v is a global section of […], i.e., a constant of T. Say ∀x∈<S,w>.∃y A(x,y) holds in L* (we suppress additional parameters of A). Then T ⊢ ∀x∈w.∃y A(x,y). Then for every <P,u> in S there is a witness <Q,v> in M such that A(<P,u>,<Q,v>) holds in L* and T ⊢ A(u⁻,v⁻). By applying Collection twice in the metatheory, let a (small) X ⊂ M and a limit ordinal λ be such that the conditions i) - iii) in the definition of T_λ are satisfied for U, V as obtained from S, X by erasing. Then the corresponding instances of (1) and (2) are among the axioms of T, and ∀x∈<S,w>.∃y∈<X,b_{w,U,V,A}>.A(x,y) holds in L*.
The set of natural numbers ω and each of the numerals n ∈ ω are definable in IZF as usual in ZF. If Y is definable in IZF by a formula C(y), we write A(Y) for ∃!y (C(y) ∧ A(y)).

Theorem 3.3. Suppose that a sentence ∃x∈ω.A(x) is provable in IZF. Then there exists a numeral n such that A(n) is provable in IZF.

Proof. The numerals of X are standard (cf. MYHILL [1973],
SCEDROV-SCOTT [1982]). Apply Lemmata 3.2 and 3.1.

Footnotes

1) Constructivist motivations and some proof-theoretic aspects of these rules are discussed e.g. in KREISEL [1970, 1972].

2) As mentioned in the introduction, equiconsistency of IZF with the traditional ZF and the Fourman interpretation of IZF in any Grothendieck topos attest to the fact that IZF is a proof-theoretically strong theory with a semantics of independent mathematical interest. At present, however, the type-theoretic fragment (obtained by deleting Collection and restricting Comprehension to formulae with bounded quantifiers) is much better understood both topos-theoretically (e.g. BOILEAU-JOYAL [1981]) and proof-theoretically (e.g. GIRARD [1972]). On the other hand, we do not see how to eliminate Collection from some of the arguments in FEFERMAN [1969], [1978], JOHNSTONE-PARÉ [1978] and BÉNABOU [1985]. Because these arguments most often concern categories related to intuitionistic logic, it would be desirable that the arguments themselves are intuitionistic, to allow iteration and internalization.

3) A closely related notion of pretopos is considered e.g. in BARR-WELLS [1985] and in MAKKAI-REYES [1977].

4) This construction is described in detail in chapter 8 of MAKKAI-REYES [1977].

5) They may be represented as the categories of formulae and provable functional relations of intuitionistic first order (resp. higher order) arithmetic.

6) Our metatheory is boolean, but cf. FRIEDMAN-SCEDROV [1983] for a similar construction expressed intuitionistically.

References

ARTIN, M., GROTHENDIECK, A. and VERDIER, J.L.
[1972] "Théorie des Topos et Cohomologie Étale des Schémas, Séminaire de Géométrie Algébrique du Bois Marie 1963/64", Springer LNM 269 and 270.
BARR, M. and WELLS, C.
[1985] "Toposes, Triples, and Theories", Grundlehr. Math. Wiss. 278, Springer-Verlag, Berlin.
BEESON, M.J.
[1979] Continuity in Intuitionistic Set Theories, in: "Logic Colloquium 78" (M. Boffa, D. van Dalen, and K. McAloon, eds.), North-Holland, Amsterdam, pp. 1-52.
BÉNABOU, J.
[1985] Fibered Categories and the Foundations of Naive Category Theory, J. Symbolic Logic 50, 10-37.
BERBERIAN, S.K.
[1972] "Baer *-Rings", Grundlehr. Math. Wiss., Springer-Verlag, Berlin.
BISHOP, E.
[1967] "Foundations of Constructive Analysis", McGraw-Hill, New York.
BLASS, A. and SCEDROV, A.
[1983] Classifying Topoi and Finite Forcing, J. Pure Appl. Algebra 28, 111-140.
BOILEAU, A. and JOYAL, A.
[1981] La Logique des Topos, J. Symbolic Logic 46, 6-16.
BORCEUX, F. and VAN DEN BOSSCHE, G.
[1985] "Algebra in a Localic Topos, with Applications to Ring Theory", LNM 1038, Springer-Verlag, Berlin.
BRIDGES, D.S.
[1979] "Constructive Functional Analysis", Pitman, London.
CHOQUET, G.
[1970] Formes Linéaires Positives sur les Espaces de Fonctions. Espaces Sous-Stoniens et Pseudo-Mesures, Compt. Rend. Acad. Sci. Paris, Sér. A, 270, 164-166.
FEFERMAN, S.
[1969] Set-theoretical Foundations of Category Theory, in: "Reports of the Midwest Category Seminar III" (S. MacLane, ed.), Springer LNM 106, 201-247.
[1978] Constructive Theories of Functions and Classes, in: "Logic Colloquium 78" (M. Boffa, D. van Dalen, K. McAloon, eds.), North-Holland, Amsterdam, pp. 159-224.
FOURMAN, M.P.
[1980] Sheaf Models for Set Theory, J. Pure Appl. Algebra 19, 91-101.
FOURMAN, M.P. and HYLAND, J.M.E.
[1979] Sheaf Models for Analysis, in: "Applications of Sheaves" (M.P. Fourman, C.J. Mulvey, and D.S. Scott, eds.), Springer LNM 753, 280-301.
FOURMAN, M.P. and SCOTT, D.S.
[1979] Sheaves and Logic, in: "Applications of Sheaves" (M.P. Fourman, C.J. Mulvey, D.S. Scott, eds.), Springer LNM 753, 302-401.
FREYD, P.
[1972] Aspects of Topoi, Bull. Austral. Math. Soc. 7, 1-76, 467-480.
[1978] On proving that 1 is an indecomposable projective in various free categories, unpublished note.
[1986] Choice and Well-Ordering, Ann. Pure Appl. Logic, to appear.
FRIEDMAN, H.
[1973] The Consistency of Classical Set Theory Relative to a Set Theory with Intuitionistic Logic, J. Symbolic Logic 38, 315-319.
[1977] Set-theoretic Foundations for Constructive Analysis, Annals of Math. 105, 1-28.
FRIEDMAN, H. and SCEDROV, A.
[1983] Set Existence Property for Intuitionistic Theories with Dependent Choice, Ann. Pure Appl. Logic 25, 129-140; Corrigendum, ibid., 26, 101.
[1985] The Lack of Definable Witnesses and Provably Recursive Functions in Intuitionistic Set Theories, Advances in Math. 57, 1-13.
GILLMAN, L. and JERISON, M.
[1960] "Rings of Continuous Functions", Van Nostrand, Princeton, N.J.
GIRARD, J.Y.
[1972] Interprétation Fonctionnelle et Élimination des Coupures dans l'Arithmétique d'Ordre Supérieur, Thèse d'État, Université Paris VII.
GÖDEL, K.
[1932] Zur Intuitionistischen Arithmetik und Zahlentheorie, Ergebnisse eines math. Kolloq., Heft 4, 34-38.
GROVE, K. and PEDERSEN, G.K.
[1984a] Sub-Stonean Spaces and Corona Sets, J. Functional Analysis 56, 124-143.
[1984b] Diagonalizing Matrices over C(X), J. Functional Analysis 59, 65-89.
HAYASHI, S.
[1981] On Set Theories in Toposes, Springer LNM 891, 23-29.
HYLAND, J.M.E.
[1982] The Effective Topos, in: "The L.E.J. Brouwer Centenary Symposium" (A.S. Troelstra and D. van Dalen, eds.), North-Holland, Amsterdam, pp. 165-215.
JECH, T.J.
[1985] Abstract Theory of Abelian Operator Algebras: An Application of Forcing, Trans. A.M.S. 289, 133-162.
JOHNSTONE, P.T. and PARÉ, R. (eds.)
[1978] "Indexed Categories and Their Applications", Springer LNM 661.
JOYAL, A.
[1985] Closed Model Structures on Toposes. Preliminary Report. Abstracts A.M.S., October 1985, p. 335.
KADISON, R.V.
[1984] Diagonalizing Matrices, Amer. J. Math. 106, 1451-1468.
KOLMOGOROV, A.
[1925] Sur le Principe "Tertium Non Datur", Recueil Math. de la Soc. Math. de Moscou 32, 647-667. English translation: On the Principle of Excluded Middle, in: "From Frege to Gödel" (J. van Heijenoort, ed.), Harvard Univ. Press, Cambridge, Mass., 1967, 414-437.
KREISEL, G.
[1970] Church's Thesis: A Kind of Reducibility Axiom for Constructive Mathematics, in: "Intuitionism and Proof Theory, Proceedings Buffalo 1968" (A. Kino, J. Myhill, R.E. Vesley, eds.), North-Holland, Amsterdam, pp. 121-150.
[1972] Which Number-Theoretic Problems Can Be Solved in Recursive Progressions on Π¹₁-Paths through O?, J. Symbolic Logic 37, 311-324.
KUSHNER, B.A.
[1984] Lectures on Constructive Mathematical Analysis, Translations of Mathematical Monographs, vol. 60, A.M.S., Providence, R.I.
LAMBEK, J. and SCOTT, P.J.
[1983] New Proofs of Some Intuitionistic Principles, Zeitschr. math. Logik Grundl. Math. 29, 493-504.
LAWVERE, F.W.
[1972] Introduction to "Toposes, Algebraic Geometry and Logic", Springer LNM 274.
MAKKAI, M. and REYES, G.
[1977] "First Order Categorical Logic", Springer LNM 611.
McCARTY, D.C.
[1984] Realizability and Recursive Mathematics, Dissertation, Oxford University.
MOERDIJK, I. and REYES, G.
[1984] De Rham's Theorem in a Smooth Topos, Math. Proc. Cambridge Philos. Soc. 96, 61-71.
[1985] Connections on Microlinear Spaces, Preprint, Univ. de Montréal.
MULRY, P.S.
[1982] Generalized Banach-Mazur Functionals in the Topos of Recursive Sets, J. Pure Appl. Algebra 26, 71-83.
[1985] Adjointness in Recursion, Preprint.
MULVEY, C.J.
[1974] Intuitionistic Algebra and Representations of Rings, in: A.M.S. Memoirs 148, 3-57.
MYHILL, J.
[1973] Some Properties of Intuitionistic Zermelo-Fraenkel Set Theory, in: "Cambridge Summer School in Mathematical Logic, Proceedings 1971" (A.R.D. Mathias and H. Rogers, Jr., eds.), Springer LNM 337, 206-231.
[1974] Embedding Classical Type Theory in Intuitionistic Type Theory, Proc. Symp. Pure Math. 13, part I, 267-270; Corrigendum, ibid., part II, 185-188.
[1975] Constructive Set Theory, J. Symbolic Logic 40, 347-382.
OZAWA, M.
[1983] Boolean-valued Analysis and Type I AW*-algebras, Proc. Japan Acad. Ser. A, 59, 368-371.
POUR-EL, M.B. and CALDWELL, J.
[1975] On a Simple Definition of Computable Function of a Real Variable - with Applications to Functions of a Complex Variable, Zeitschr. math. Logik Grundlagen Math. 21, 1-19.
POUR-EL, M.B. and RICHARDS, I.
[1983a] Computability and Noncomputability in Classical Analysis, Trans. A.M.S. 275, 539-560.
[1983b] Noncomputability in Analysis and Physics: A Complete Determination of the Class of Noncomputable Linear Operators, Advances in Math. 48, 44-74.
ROSOLINI, G.
[1986] Continuity and Effectiveness in Topoi, in preparation.
ROUSSEAU, C.
[1979] Topos Theory and Complex Analysis, in: "Applications of Sheaves" (M.P. Fourman, C.J. Mulvey, D.S. Scott, eds.), Springer LNM 753, 623-659.
[1985] Spectral Decomposition Theorem for Real Symmetric Matrices in Topoi and Applications, J. Pure Appl. Algebra 38 (1985) 91-102.
SCEDROV, A.
[1984a] "Forcing and Classifying Topoi", Memoirs of the Amer. Math. Soc., vol. 295, Providence, R.I.
[1984b] Differential Equations in Constructive Analysis and in the Recursive Realizability Topos, J. Pure Appl. Algebra 33, 69-80.
[1986] Diagonalization of Continuous Matrices as a Representation of Intuitionistic Reals, Ann. Pure Appl. Logic, to appear.
SCEDROV, A. and SCOTT, P.J.
[1982] A Note on the Friedman Slash and Freyd Covers, in: "The L.E.J. Brouwer Centenary Symposium" (A.S. Troelstra and D. van Dalen, eds.), North-Holland, Amsterdam, pp. 443-452.
SCOTT, D.S.
[1968] Extending the Topological Interpretation to Intuitionistic Analysis, Compositio Math. 20, 194-210.
SCOWCROFT, P.
[1984] The Real-Algebraic Structure of Scott's Model of Intuitionistic Analysis, Ann. Pure Appl. Logic 27, 275-308.
TAKEUTI, G.
[1978] "Two Applications of Logic to Mathematics", Princeton Univ. Press, Princeton, N.J.
TIERNEY, M.
[1985] Categorical Models for Homotopy n-types. Preliminary Report. Abstracts A.M.S., October 1985, p. 335.
TROELSTRA, A.S. (ed.)
[1973] Metamathematical Investigation of Intuitionistic Arithmetic and Analysis, Springer LNM 344.
Logic Colloquium '85 Edited by The Paris Logic Group © Elsevier Science Publishers B.V. (North-Holland), 1987
CRITERIA FOR THE INDEPENDENCE OF DIOPHANTINE EQUATIONS FROM FRAGMENTS OF ARITHMETIC

Ulf R. Schmerl, München

Several authors have studied the independence of Diophantine equations from fragments of arithmetic: cf. Shoenfield [7], Shepherdson [4], [5], [6], Wilkie [8] and van den Dries [1]. In these analyses, the only methods applied were those of model theory. In what follows, some results obtained by elementary proof-theoretic methods are presented; they are given without proofs, and a complete version of our study will be published elsewhere.

The main idea of our approach is to translate the derivability relation for the systems considered into an arithmetical relation between polynomials. This is particularly easy to illustrate for the following fragment of arithmetic Z_0. Z_0 is formulated in the language with 0, S, +, · and comprises the usual defining axioms for these symbols together with the following additional axioms:

  s+(t+u) = (s+t)+u,   s+t = t+s,   s+u = t+u → s = t,
  s·(t·u) = (s·t)·u,   s·t = t·s,   s·(t+u) = (s·t)+(s·u),
  d·s = d·t → s = t   (d = 2, 3, ...),

where s, t, u stand for arbitrary terms of the language. For this system we have the following property: a formula of the form

  r_0=s_0 ∧ ... ∧ r_m=s_m → u_0=v_0 ∨ ... ∨ u_n=v_n,

where r_i, s_i, u_j, v_j are arbitrary terms, is derivable in Z_0 if and only if one of the polynomials u_0−v_0, ..., u_n−v_n ∈ Z[x_1,...,x_n], or else a polynomial p ∈ Z[x_1,...,x_n] having only positive coefficients (and with, in particular, p(0,...,0) > 0), lies in the ideal generated by the polynomials r_0−s_0, ..., r_m−s_m in Q[x_1,...,x_n], i.e. if and only if

  u_0−v_0 ∈ (r_0−s_0,...,r_m−s_m)_Q ∨ ... ∨ u_n−v_n ∈ (r_0−s_0,...,r_m−s_m)_Q ∨ ∃p ∈ 1+N[x_1,...,x_n]: p ∈ (r_0−s_0,...,r_m−s_m)_Q.

Since every quantifier-free formula is, by propositional logic, equivalent to a conjunction of formulas of this particular form, this property yields a characterization of arbitrary open formulas derivable in Z_0. In particular, for Diophantine equations r=s it follows that r=s is refutable in Z_0 if and only if r−s is a divisor of a polynomial with positive coefficients:

  Z_0 ⊢ r≠s   iff   ∃p ∈ 1+N[x_1,...,x_n]: (r−s) | p.

Trivially, a polynomial that is a divisor of a polynomial with positive coefficients cannot have real zeros (a_1,...,a_n) ∈ R^n with a_1 ≥ 0, ..., a_n ≥ 0. This yields the following simple criterion for the independence of equations from Z_0: if r=s has no solution in N, but has a real solution (a_1,...,a_n) ∈ R^n with a_1 ≥ 0, ..., a_n ≥ 0, then this equation is neither provable nor refutable in Z_0.

The authors cited above have mainly studied fragments of arithmetic with open induction. For these systems too it is possible to give similar characterizations. We first consider the system Z_1 with the symbols 0, S, +, · and with open induction (the predecessor and sign functions may also be admitted, but the extensions so obtained are conservative with respect to the provability of open formulas). The characterization of the open formulas provable in Z_1 is entirely analogous to that for Z_0; for Diophantine equations we have: r=s is refutable in Z_1 if and only if there exists c ∈ N such that, for all substitutions of x_1 ∈ {0, 1, ..., c−1, x_1+c} for x_1, ..., x_n ∈ {0, 1, ..., c−1, x_n+c} for x_n, there exists a polynomial q ∈ Z[x_1,...,x_n] with […]
As far as the decidability of equations is concerned, Z_1 is thus equivalent to the system Z_0 + ∃x[t=0 ∨ t=Sx] (for all terms t). Concerning the independence of equations, one obtains the following criterion for Z_1: if r=s has no solution in N, but has arbitrarily large real solutions, i.e. solutions (a_1,...,a_n) ∈ R^n with a_1 ≥ c, ..., a_n ≥ c for every c ∈ N, then r=s is independent of Z_1. It follows that even equations as simple as nx+m=ny (0<m […] has only positive coefficients. A theorem of Pólya [3] on positive definite forms allows one to deduce decidability at least for equations r=s for which r−s belongs to one of the following classes of polynomials:

- polynomials whose partial degree with respect to each variable x_i is equal to the total degree of the polynomial;
- quadratic polynomials.

But we conjecture that this question is decidable for arbitrary equations.
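The refutability criterion for Z_0 asks whether r−s divides some polynomial in 1+N[x_1,...,x_n]. As a minimal sketch of how such a witness might be searched for in one variable, the Python below multiplies f = r−s by successive powers of (1+x) and stops as soon as the product has constant term at least 1 and no negative coefficient. The multiplier (1+x)^N is suggested by Pólya's theorem; the function names poly_mul and polya_multiplier, the coefficient-list encoding and the search bound max_power are assumptions made for illustration and do not come from the paper. A failed search proves nothing.

```python
from typing import List, Optional


def poly_mul(a: List[int], b: List[int]) -> List[int]:
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def polya_multiplier(f: List[int], max_power: int = 50) -> Optional[int]:
    """Return the least N <= max_power such that (1 + x)^N * f lies in 1 + N[x],
    i.e. has constant term >= 1 and no negative coefficient.
    Return None if no such N is found within the (hypothetical) bound."""
    g = list(f)
    for n in range(max_power + 1):
        if g[0] >= 1 and all(c >= 0 for c in g):
            return n
        g = poly_mul(g, [1, 1])  # multiply by (1 + x) once more
    return None


# r = x*x + 1, s = x gives f = 1 - x + x^2, and (1 + x) * f = 1 + x^3.
print(polya_multiplier([1, -1, 1]))
# r = x*x, s = 2 gives f = -2 + x^2; the constant term stays -2, so no N works.
print(polya_multiplier([-2, 0, 1]))
```

For r = x·x+1 and s = x the product (1+x)·(x²−x+1) = x³+1 lies in 1+N[x], so by the criterion the equation is refutable in Z_0. For x·x = 2 the constant term of every product is −2, in line with the positive real zero √2 of x²−2.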
A system much stronger than Z_1 is obtained by adding to Z_1 the function ∸; we denote this system by Z_2. In Z_2, all equations whose impossibility can be shown by elementary congruence considerations are refutable; thus, for example, most of the equations treated in chapter 2 of Mordell's book [2]. By a result of Wilkie, Z_2 is equivalent, as far as the formal decidability of Diophantine equations is concerned, to the extension of Z_0 by the axioms

  ∃x[t=0 ∨ t=Sx]
  ∃x[t=kx ∨ t=kx+1 ∨ ... ∨ t=kx+k−1]   (k = 2, 3, ...)
  ∃x[s=t+x ∨ t=s+x],

where s and t are arbitrary terms; we denote this extension by Z_3. Using the enlarged language with 0, S, P, +, ·, ∸, [·/k], R_k, the existential axioms of Z_3 can be replaced, preserving equivalence, by the open axioms

  t=0 ∨ t=SPt
  t = k·[t/k] + R_k t   (k = 2, 3, ...)
  s=t+(s∸t) ∨ t=s+(t∸s).

The equations refutable in Z_2 and Z_3 can be characterized as follows: Z_3 ⊢ r(x) ≠ s(x)
iff there exist finite sets

  {t^1_i(x, y_1) | i […]}, ..., {t^n_i(x, y_1..y_n) | i […]}

such that […] (x, y_1..y_n) ∈ (r(x) − s(x), t^1_{i_1}(x, y_1), ..., t^n_{i_n}(x, y_1..y_n)), where, for every j, 1 ≤ j ≤ n, […] or […] or […] (k ≥ 2),
where s, s_1, s_2 are terms. Although this characterization looks very technical, it gives a quite simple criterion for the independence of equations: if for an equation r(x_1,...,x_n) = s(x_1,...,x_n) there exist real polynomials f_1,...,f_n ∈ R[y] such that

(i) r(f_1(y),...,f_n(y)) = s(f_1(y),...,f_n(y)),
(ii) f_1,...,f_n have positive leading coefficients,
(iii) ∃a ∈ R: f_1(a),...,f_n(a) ∈ Z,

then r(x) ≠ s(x) is not derivable in Z_2 or Z_3. One sees at once that the equations (x+1)² = 2·(y+1)² and (x+1)³+(y+1)³ = (z+1)³ cannot be refuted in this system, which Shepherdson has already proved by model-theoretic means. The irrationality of √2 can easily be proved in a system Z_4 comprising an induction rule of the type A(kx+i) (for all i, 1 ≤ i […] A(x) → A(kx) A(Sx), so that it is also possible to show that the Fermat equations (x+1)^n+(y+1)^n = (z+1)^n, n > 2, are not refutable there.

References

[1] L. van den Dries, Some model theory and number theory for models of weak systems of arithmetic, Proc. Model Theory of Algebra and Arithmetic, Karpacz 1979, SLNM 834
[2] L.J. Mordell, Diophantine equations, London/New York 1969
[3] G. Pólya, Über positive Darstellung von Polynomen, Vierteljahresschrift der Naturw. Ges. zu Zürich, LXXIII (1928) 141-145
[4] J.C. Shepherdson, The rule of induction in the free variable arithmetic based on + and ·, Proc. Symp. at Clermont-Ferrand, 1961
[5] J.C. Shepherdson, Non-standard models for fragments of number theory, Proc. Int. Symp. on Model Theory, Berkeley 1963, 342-358
[6] J.C. Shepherdson, A non-standard model for a free variable fragment of number theory, Bull. Acad. Polon. Sc. XII (1964) 79-86
[7] J.R. Shoenfield, Open sentences and the induction axiom, Journ. of Symb. Logic 23 (1958) 7-12
[8] A.J. Wilkie, Some results and problems on weak systems of arithmetic, Proc. Logic Coll. '77, Amsterdam 1978, 285-296

U. Schmerl
Math. Institut der Universität
Theresienstraße 39
8000 MÜNCHEN 2 (W-Germany)