0, and t.herefore since jll( /..[1) + /ll( 1'1 ) JlI( t.~r) we have 1'1(Lo), III( L 1) < 1'1( L~.I). Consequently [(L:d*J [(LI)*] [(1. 1 )*] = [OJ,l and (2 2) is still Yalid. 0;
0;
:By (20), (21) and (22)
f L "'J
0;
fYJ,l + [0 ,']'
This completes the proof that the theorem for connected linear graphs is true when $\alpha_1(L) = n$ if it is true for $\alpha_1(L) < n$. We have proved it for $\alpha_1(L) = 0$ and therefore it is true in general. If $L$ is not connected we can obtain $P[L^*]$ satisfying the theorem by multiplying together the polynomials of its components.

COROLLARY. Any element $[X]$ of $R$ can be expressed as a polynomial in the $[y_i]$ with rational integer coefficients and no constant term.

For $X$ is a finite linear form in the $L_i^*$ with integer coefficients.
A ring in graph theory

3. SUBGRAPHS
Let $S$ denote any subgraph of a linear graph $L$. Let the number of components $T$ of $S$ such that $p_1(T) = r$ be $i_r(S)$. We define a function $Z(L)$ of $L$ by

$$Z(L) = \sum_{S} \prod_{r} z_r^{\,i_r(S)}, \tag{23}$$

where the $z_r$ are independent indeterminates over the ring $I$ of rational integers. Although (23) involves a formal infinite product, yet for a given $S$ only a finite number of the $i_r(S)$ can be non-zero and so, for each $L$, $Z(L)$ is a polynomial in the $z_i$.

THEOREM III. $Z(L)$ is a V-function.
For first it is obvious that $Z(L)$ satisfies (4). Secondly, if $A$ is any link of $L$, then the subgraphs of $L$ which do not contain $A$ are simply the subgraphs of $L'_A$, and the subgraphs $S$ of $L$ which do contain $A$ are in 1-1 correspondence with the subgraphs of $L''_A$. For, for such an $S$, $S''_A$ is a subgraph of $L''_A$; and if $S_1$ is any subgraph of $L''_A$ there is one and only one subgraph $S$ of $L$ having the same 1-cells as $S_1$ with the addition of $A$ and therefore satisfying $S''_A = S_1$. Further $S''_A$ differs from $S$ only in that a component $T$ of $S$ is replaced by $T''_A$; and, by (9) and (10), $T''_A$ is connected and $p_1(T''_A) = p_1(T)$. Hence $i_r(S''_A) = i_r(S)$ for all $r$. Hence by (23)

$$Z(L) = \sum_{S(L'_A)} \prod_{r} z_r^{\,i_r(S)} + \sum_{S(L''_A)} \prod_{r} z_r^{\,i_r(S)},$$

where $S(L'_A)$ for example denotes a subgraph $S$ of $L'_A$. Therefore

$$Z(L) = Z(L'_A) + Z(L''_A), \tag{24}$$

so that $Z(L)$ satisfies (5). Thirdly, for any product $L_1 L_2$ the subgraphs of $L_1 L_2$ are simply the products of the subgraphs $S_1$ of $L_1$ with the subgraphs $S_2$ of $L_2$. It is evident that

$$i_r(S_1 S_2) = i_r(S_1) + i_r(S_2),$$

and therefore

$$Z(L_1 L_2) = \sum_{S_1, S_2} \prod_{r} z_r^{\,i_r(S_1) + i_r(S_2)} = \Bigl(\sum_{S_1} \prod_{r} z_r^{\,i_r(S_1)}\Bigr) \Bigl(\sum_{S_2} \prod_{r} z_r^{\,i_r(S_2)}\Bigr) = Z(L_1)\, Z(L_2). \tag{25}$$

Thus $Z(L)$ satisfies (4), (5) and (6). That is, it is a V-function.

THEOREM IV.

$$Z(y_r) = \sum_{k=0}^{r} \binom{r}{k} z_k. \tag{26}$$

For each subgraph of $y_r$ has just one 0-cell (§2), and therefore just one component. Hence $Z(y_r)$ is a linear form in the $z_k$. The number of subgraphs $S$ such that $p_1(S) = k$ is the number with $\alpha_1(S) = k$, by (19), and this is the number of ways of choosing $k$ 1-cells out of $r$.

4. STRUCTURE OF THE RING R

LEMMA.

$$\sum_{i=j}^{r} (-1)^{i+j} \binom{r}{i} \binom{i}{j} = \delta_{rj}. \tag{27}$$

This equality can be obtained by expanding $x^r = ((x - 1) + 1)^r$ in powers of $(x - 1)$, expanding each of the terms in the resulting series in powers of $x$, and then equating coefficients.
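The definition (23) and the identities (24) and (26) can be checked by brute force on small graphs. The following is only a sketch under stated assumptions: a linear graph is encoded as a vertex count `n` and a list of edges `(u, v)` (loops and repeated edges allowed), and a monomial $\prod_r z_r^{i_r(S)}$ is encoded as the sorted tuple of the values $p_1(T)$ over the components $T$ of $S$. The function names are illustrative and do not come from the paper.

```python
from collections import Counter
from itertools import combinations


def component_ranks(n, edges):
    """Sorted tuple of p_1(T) = (1-cells) - (0-cells) + 1 over the components T."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    verts, links = Counter(), Counter()
    for v in range(n):
        verts[find(v)] += 1
    for u, _ in edges:
        links[find(u)] += 1
    return tuple(sorted(links[c] - verts[c] + 1 for c in verts))


def Z(n, edges):
    """Z(L) of (23) as a Counter: monomial (tuple of ranks) -> coefficient."""
    poly = Counter()
    for size in range(len(edges) + 1):
        # combinations works by position, so parallel edges stay distinct
        for subset in combinations(edges, size):
            poly[component_ranks(n, subset)] += 1
    return poly


def delete(n, edges, i):
    """L'_A: suppress the 1-cell with index i."""
    return n, edges[:i] + edges[i + 1:]


def contract(n, edges, i):
    """L''_A: suppress the link with index i and identify its two end-points."""
    u, v = edges[i]
    assert u != v, "only a link (non-loop) may be contracted"

    def relabel(w):
        w = u if w == v else w
        return w - 1 if w > v else w

    rest = edges[:i] + edges[i + 1:]
    return n - 1, [(relabel(a), relabel(b)) for a, b in rest]


triangle = (3, [(0, 1), (1, 2), (0, 2)])
# Identity (24) for the link with index 0 of a triangle.
assert Z(*triangle) == Z(*delete(*triangle, 0)) + Z(*contract(*triangle, 0))
# Theorem IV for y_3 (one 0-cell, three loops): coefficients 1, 3, 3, 1.
assert Z(1, [(0, 0)] * 3) == Counter({(0,): 1, (1,): 3, (2,): 3, (3,): 1})
```

The tuple `(0, 0, 1)`, for instance, stands for the monomial $z_0^2 z_1$; the encoding is a convenience of this sketch only.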
W. T. TUTTE
THEOREM V. $R$ is isomorphic with the ring $R_0$ of all polynomials in the $z_i$ with integer coefficients and no constant term.

For by Theorem III $Z(L)$ is a V-function with values in $R_0$. Hence by Theorem I

$$Z(L) = h[L^*], \tag{28}$$

where $h$ is a homomorphism of $R$ into $R_0$. Let $[t_i]$ be the element of $R$ defined by

$$[t_i] = \sum_{j=0}^{i} (-1)^{i+j} \binom{i}{j} [y_j]. \tag{29}$$

Then, by Theorem IV and the lemma,

$$h[t_i] = \sum_{j=0}^{i} \sum_{s=0}^{j} (-1)^{i+j} \binom{i}{j} \binom{j}{s} z_s = z_i. \tag{30}$$

If we multiply (29) by $\binom{r}{i}$, sum from $i = 0$ to $i = r$, and use the lemma we find

$$[y_r] = \sum_{i=0}^{r} \binom{r}{i} [t_i]. \tag{31}$$

Hence by Theorem II, Corollary, any element $[X]$ of $R$ can be expressed as a polynomial in the $[t_i]$ with integer coefficients and no constant term. Moreover this expression is unique: otherwise there would be a polynomial relationship between the $[t_i]$, and therefore by (30) between the $z_i$, with integer coefficients, and this would contradict the definition of the $z_i$. It follows that $h$ is an isomorphism of $R$ on to $R_0$ (for every integer polynomial in the $[t_i]$ is in $R$).
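The passage from (29) to (31) is a binomial inversion, and it rests entirely on the lemma. As a small numerical sanity check (a sketch only; the helper names are illustrative), one can represent each $[t_i]$ by its coefficient vector with respect to $[y_0], [y_1], \ldots$ and confirm that $\sum_i \binom{r}{i}[t_i]$ collapses to $[y_r]$.

```python
from math import comb


def lemma_sum(r, j):
    """Left-hand side of the lemma: sum over i of (-1)^(i+j) C(r,i) C(i,j)."""
    return sum((-1) ** (i + j) * comb(r, i) * comb(i, j) for i in range(j, r + 1))


def t_coefficients(i, size):
    """Coefficients of [t_i] with respect to [y_0], ..., [y_{size-1}], as in (29)."""
    return [(-1) ** (i + j) * comb(i, j) if j <= i else 0 for j in range(size)]


def y_from_t(r, size):
    """Coefficient vector of sum_i C(r,i) [t_i]; by (31) it should be that of [y_r]."""
    total = [0] * size
    for i in range(r + 1):
        for j, c in enumerate(t_coefficients(i, size)):
            total[j] += comb(r, i) * c
    return total


# The lemma for small r and j.
assert all(lemma_sum(r, j) == (1 if r == j else 0)
           for r in range(9) for j in range(r + 1))
# Equation (31): the sum is the unit vector selecting [y_r].
assert y_from_t(4, 6) == [0, 0, 0, 0, 1, 0]
```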
THEOREM VI. Let $x_0, x_1, x_2, \ldots$ be an infinite sequence of connected linear graphs, and $X_0, X_1, X_2, \ldots$ the corresponding isomorphism classes, such that (i) $x_0 \cong y_0$, (ii) $p_1(x_r) = r$, and (iii) $x_r$ contains no isthmus $A$ such that for some component $L_0$ of $(x_r)'_A$, $p_1(L_0) = 0$. Then any element $[X]$ of $R$ has a unique expression as a polynomial in the $[x_i]$ with integer coefficients and no constant term.

By Theorem II (v) and equation (31) we have, for $r > 0$,

$$[x_r] = [t_r] + [S_r], \tag{32}$$

where $[S_r]$ is a polynomial in those $[t_i]$ for which $i < r$. Hence

$$[t_r] = [x_r] + [V_r], \tag{33}$$

where $[V_r]$ is a polynomial in those $[x_i]$ for which $i < r$. (If we assume this for $r < n$ it follows for $r = n$ by substitution in (32). Since $[x_0] = [y_0] = [t_0]$ it is true for $r = 0$, and therefore it is true in general.) Clearly $[S_r]$ and $[V_r]$ have no constant terms. By Theorem II, Corollary, and equations (31) and (33), $[X]$ can be expressed as a polynomial without a constant term in the $[x_i]$. Suppose this expression not unique. Then there will be a polynomial relationship

$$P([x_i]) = 0 \tag{34}$$

between the $[x_i]$. Of the terms of non-zero coefficient in $P([x_i])$ pick out the subset $M_1$ of those which involve the greatest suffix occurring in them raised to the highest power to which it occurs. Of this subset $M_1$ pick out the subset $M_2$ of terms involving the second greatest suffix appearing in $M_1$ raised to the highest power to which it occurs in $M_1$, and so on. This process must terminate in a subset $M_k$ consisting of a single term $A\,[x_i]^{a(i)} [x_j]^{a(j)} \cdots$. It is evident that if we substitute from (32) in (34), we shall obtain a polynomial relationship

$$Q([t_i]) = 0$$

between the $[t_i]$, in which the coefficient of $[t_i]^{a(i)} [t_j]^{a(j)} \cdots$ is $A \neq 0$. But it was shown in the proof of Theorem V that there is no polynomial relationship between the $[t_i]$. This contradiction proves uniqueness and so completes the proof of the theorem.

5. TOPOLOGICALLY INVARIANT W-FUNCTIONS
Let $A$ be a 1-cell of a linear graph $L$ on a complex $K$. Let $p$ be any point of $A$. We can obtain a new linear graph $M$ on $K$ from $L$ by replacing $A$ by the point $p$, taken as a 0-cell of $M$, and the two components of $A - p$ taken as 1-cells of $M$. We call this operation a subdivision of $A$ by $p$. Given any two linear graphs $L_1$, $L_2$ on the same $K$ we can find a linear graph $L_3$ which can be obtained from either by suitable subdivisions. Such a linear graph is evidently obtained by taking as the set $V$ of 0-cells the set of all points of $K$ which are 0-cells either of $L_1$ or of $L_2$, and by taking as 1-cells the components of $K - V$.

We seek the condition that a W-function $W(L)$ shall be topologically invariant, i.e. depend only on $K$. By the above considerations a necessary and sufficient condition for this is that $W(L)$ shall be invariant under subdivision operations. (For then $W(L_1) = W(L_3) = W(L_2)$.) Suppose therefore that $A$ is any 1-cell of $L$, possibly a loop, and let $M$ be obtained from $L$ by subdividing $A$ by a point $p$. Let us denote the new 1-cells by $B$ and $C$. Then by (5) for any W-function $W(L)$

$$\begin{aligned} W(M) &= W(M'_B) + W(M''_B) \\ &= W((M'_B)'_C) + W((M'_B)''_C) + W(M''_B) \\ &= W(p \cdot (M'_B)''_C) + W((M'_B)''_C) + W(M''_B). \end{aligned}$$

Here $p$ is used to denote the linear graph which consists solely of the 0-cell $p$. It is isomorphic to $y_0$. By making use of the obvious isomorphisms $M''_B \cong L$ and $(M'_B)''_C \cong L_0$, where $L_0$ is the linear graph derived from $L$ by suppressing $A$, we obtain

$$W(M) - W(L) = W(y_0 \cdot L_0) + W(L_0).$$

Therefore

$$W(M) - W(L) = h([y_0][L_0^*] + [L_0^*]), \tag{35}$$

where $h$ is a homomorphism of $R$, regarded as an additive group, into an additive Abelian group $G$ (Theorem I). Let $N$ denote the set of all elements of $R$ which are of the form $[y_0][X] + [X]$. Clearly $N$ is an ideal of $R$. Let $\{X\}$ denote that element of the difference ring $R - N$ which contains $[X]$.

THEOREM VII. A function $W(L)$ on the set of all linear graphs $L$ to the additive Abelian group $G$ (commutative ring $H$) is a topologically invariant W-function (V-function) if and only if it is of the form $k\{L^*\}$, where $k$ is a homomorphism of the additive group $R - N$ (ring $R - N$) into $G$ ($H$).

For in (35), by a proper choice of $L$, we can have any linear graph we please as $L_0$. It follows that the necessary and sufficient condition for the W-function $W(L)$ to be topologically invariant is that $h$ shall map all elements of $R$ of the form $[y_0][L^*] + [L^*]$ and therefore all elements of $N$ on to the zero of $G$. This proves the theorem for W-functions. The same argument applies to V-functions, except that $h$ in (35) is then a homomorphism of $R$ (as a ring) into the ring $H$.
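The subdivision identity $W(M) - W(L) = W(y_0 \cdot L_0) + W(L_0)$ can be observed numerically on a concrete V-function. The sketch below assumes, purely for illustration, the subgraph sum $\sum_S t^{p_0(S)} z^{p_1(S)}$ (the function that Section 6 calls $Q(L; t, z)$ in (37)), for which $W(y_0 \cdot L_0) = t\,W(L_0)$. The graph encoding (vertex count plus edge list) and all function names are assumptions of the sketch.

```python
from itertools import combinations


def count_components(n, edges):
    """Number of components p_0 of the graph on vertices 0..n-1."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in range(n)})


def subgraph_sum(n, edges, t, z):
    """Sum of t**p0(S) * z**p1(S) over all subsets S of 1-cells (all 0-cells kept)."""
    total = 0
    for size in range(len(edges) + 1):
        for subset in combinations(edges, size):
            p0 = count_components(n, subset)
            p1 = len(subset) - n + p0
            total += t ** p0 * z ** p1
    return total


def subdivide(n, edges, i):
    """Replace edge i = (u, v) by (u, p) and (p, v), where p = n is a new 0-cell."""
    u, v = edges[i]
    return n + 1, edges[:i] + [(u, n), (n, v)] + edges[i + 1:]


def suppress(n, edges, i):
    """L_0: the graph L with the 1-cell of index i removed."""
    return n, edges[:i] + edges[i + 1:]


L = (3, [(0, 1), (1, 2), (0, 2)])      # a triangle
M = subdivide(*L, 0)                   # a 4-circuit
L0 = suppress(*L, 0)
t, z = 2, 3
# W(M) - W(L) = W(y0 . L0) + W(L0) = (t + 1) * W(L0)
assert subgraph_sum(*M, t, z) - subgraph_sum(*L, t, z) == (t + 1) * subgraph_sum(*L0, t, z)
# At t = -1 the right-hand side vanishes, so the value is unchanged by subdivision.
assert subgraph_sum(*M, -1, z) == subgraph_sum(*L, -1, z)
```

The second assertion illustrates the condition of Theorem VII: a V-function is topologically invariant exactly when elements of the form $[y_0][X] + [X]$ are mapped to zero, which here happens at $t = -1$.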
THEOREM VIII. Let $x_0, x_1, x_2, \ldots$ be as in the enunciation of Theorem VI. Then any element $\{X\}$ of $R - N$ has a unique expression as a polynomial in the $\{x_i\}$ ($i > 0$) with integer coefficients.

For we can obtain such an expression for $\{X\}$ by replacing each $[x_i]$ by the corresponding $\{x_i\}$ in the expression for $[X]$ in terms of the $[x_i]$ whose existence is asserted in Theorem VI. Now for all $\{X\}$, $\{X\} + \{y_0\}\{X\} = \{0\}$, and so $R - N$ has a unity element $-\{y_0\} = -\{x_0\}$ which we may denote by 1. Hence $\{x_0\}$ is not an indeterminate over $I$, and we can regard our polynomial for $\{X\}$ as a polynomial in those $\{x_i\}$ for which $i > 0$ (with perhaps a constant term). If this expression for $\{X\}$ is not unique then there will be a polynomial $\{P\}$ in the $\{x_i\}$ ($i > 0$) without a constant term such that

$$A\{x_0\} + \{P\} = \{0\},$$

where $A$ is some integer. Hence if $[P]$ is the polynomial of the same form in the $[x_i]$ we must have

$$A[x_0] + [P] + [X_0] + [x_0][X_0] = [0] \tag{36}$$

for some $[X_0]$. Equating coefficients of like powers of $[x_0]$, as is permissible by Theorem VI, we see that $[X_0]$ cannot involve $[x_0]$, and hence that $A = -[X_0] = [P]$. Consequently $\{P\}$ is a constant and therefore, by its definition, the zero polynomial in the $\{x_i\}$. The theorem follows.

6. SOME COLOURING PROBLEMS
The homomorphism of the ring $R_0$ (see Theorem V) into the ring of polynomials in two independent indeterminates $t$ and $z$ by the correspondence $z_i \to t z^i$ transforms $Z(L)$ into

$$Q(L; t, z) = \sum_{S} t^{p_0(S)} z^{p_1(S)} \tag{37}$$

by (23). Since $Z(L)$ is of the form $h[L^*]$ where $h$ is a homomorphism of $R$ into $R_0$ (Theorems I and III), $Q(L; t, z)$ can be defined by a homomorphism of $R$ into the ring of polynomials in $t$ and $z$ and is therefore a V-function (Theorem I). The coefficient of $t^a z^b$, for fixed $a$, $b$, therefore satisfies (4) and (5) and so is a W-function. Writing $a = 1$, $b = 0$ we obtain the function of Example I of the Introduction. This function satisfies $W(L_1 L_2) = 0$ (by (37) since $p_0(L_i)$ is always positive) and so it can be regarded as a V-function with values in the ring constructed from the additive group of the rational integers by defining the 'product' of any two elements as 0. $Q(L; t, z)$ has an interesting property which we call

THEOREM IX. If $L_1$ and $L_2$ are connected dual linear graphs on the sphere then

$$\frac{1}{t}\, Q(L_1; t, z) = \frac{1}{z}\, Q(L_2; z, t). \tag{38}$$

This follows from (37) as a consequence of the fact that there is a 1-1 correspondence $S \to S'$ between the subgraphs $S$ of $L_1$ and the subgraphs $S'$ of $L_2$ such that

$$p_0(S) = p_1(S') + 1 \quad \text{and} \quad p_1(S) = p_0(S') - 1.$$

($S'$ is that subgraph of $L_2$ whose 1-cells are precisely those not dual to 1-cells of $S$.) For a proof of this proposition reference may be made to the paper 'Non-separable and planar graphs' by Hassler Whitney (Trans. American Math. Soc. 34 (1932), 339-62).
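Formula (37) and the duality (38) are easy to test on the smallest non-trivial dual pair: a triangle and the graph with two 0-cells joined by three 1-cells. The sketch below stores $Q$ as a dictionary from exponent pairs $(a, b)$ to the coefficient of $t^a z^b$; in that encoding (38) says that the coefficient of $t^a z^b$ in $Q(L_1)$ equals the coefficient of $t^{b+1} z^{a-1}$ in $Q(L_2)$. The encoding and function names are assumptions of the sketch, not notation from the paper.

```python
from collections import Counter
from itertools import combinations


def count_components(n, edges):
    """Number of components p_0 of the graph on vertices 0..n-1."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in range(n)})


def Q(n, edges):
    """Q(L; t, z) of (37) as a Counter: (p0, p1) -> number of subgraphs S."""
    poly = Counter()
    for size in range(len(edges) + 1):
        for subset in combinations(edges, size):
            p0 = count_components(n, subset)
            poly[(p0, len(subset) - n + p0)] += 1
    return poly


def dual_side(poly):
    """Apply (a, b) -> (b + 1, a - 1), the exponent change implied by (38)."""
    return Counter({(b + 1, a - 1): c for (a, b), c in poly.items()})


triangle = (3, [(0, 1), (1, 2), (0, 2)])
theta = (2, [(0, 1), (0, 1), (0, 1)])   # planar dual of the triangle
assert dual_side(Q(*triangle)) == Q(*theta)
```

Here $Q(\text{triangle}) = t^3 + 3t^2 + 3t + tz$ and $Q(\text{theta}) = t^2 + 3t + 3tz + tz^2$, so both sides of (38) equal $t^2 + 3t + 3 + z$.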
We go on to consider two kinds of colourings of a linear graph, which we distinguish as α-colourings and β-colourings. An α-colouring of $L$ of degree $\lambda$ is a single-valued function on the set of 0-cells of $L$ to a fixed set $H$ the number of whose elements is $\lambda$. If $f$ is an α-colouring let $\phi(f)$ denote the number of 1-cells $A$ of $L$ such that $f$ associates all the end-points of $A$ with the same element of $H$ (e.g. every loop has this property). We say that any subgraph of $L$ all of whose 1-cells have this property for $f$ is associated with $f$. We use the symbol $S(f)$ to denote a subgraph associated with a given $f$, and $f(S)$ to denote any α-colouring with which a given $S$ is associated.

THEOREM X. Let $J(L; \lambda, \phi)$ be the number of α-colourings $f$ of $L$ of degree $\lambda$ for which $\phi(f)$ has the value $\phi$. Then the following identity is true:

$$\sum_{\phi} J(L; \lambda, \phi)\, x^{\phi} = (x - 1)^{\alpha_0(L)}\, Q\Bigl(L; \frac{\lambda}{x - 1}, x - 1\Bigr), \tag{39}$$

where $x$ is an indeterminate over $I$.

For, by (37) and (1), the right-hand side is

$$\sum_{S} (x - 1)^{\alpha_1(S)} \lambda^{p_0(S)} = \sum_{S} (x - 1)^{\alpha_1(S)} \sum_{f(S)} 1;$$

for the α-colourings associated with $S$ are precisely those which map all the 0-cells in the same component of $S$ on to the same element of $H$. This last expression equals

$$\sum_{f} \sum_{S(f)} (x - 1)^{\alpha_1(S)} = \sum_{f} x^{\phi(f)},$$

since the number of subgraphs associated with $f$ and having just $\alpha_1(S)$ 1-cells is the number of ways of choosing $\alpha_1(S)$ 1-cells out of $\phi(f)$. This completes the proof of the theorem.

If we write $x = 0$ in (39) we find that $(-1)^{\alpha_0(L)} J(L; \lambda, 0)$, which is Example II of the Introduction, is the V-function $Q(L; -\lambda, -1)$. We thus obtain the well-known result*

$$J(L; \lambda, 0) = \sum_{S} (-1)^{\alpha_1(S)} \lambda^{p_0(S)}.$$

* Hassler Whitney, 'A logical expansion in mathematics', Bull. American Math. Soc. 38 (1932), 572-9.
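Theorem X can be verified by direct enumeration for small graphs. Rather than manipulating the rational argument $\lambda/(x-1)$, the sketch below compares the two integer-valued forms that appear in the proof, $\sum_f x^{\phi(f)}$ and $\sum_S (x-1)^{\alpha_1(S)} \lambda^{p_0(S)}$, at several integer values of $x$. The graph encoding and the names `colouring_side` and `subgraph_side` are assumptions of the sketch.

```python
from itertools import combinations, product


def count_components(n, edges):
    """Number of components p_0 of the graph on vertices 0..n-1."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in range(n)})


def colouring_side(n, edges, lam, x):
    """Sum of x**phi(f) over all alpha-colourings f of degree lam."""
    total = 0
    for f in product(range(lam), repeat=n):
        phi = sum(1 for u, v in edges if f[u] == f[v])
        total += x ** phi
    return total


def subgraph_side(n, edges, lam, x):
    """Sum over subgraphs S of (x - 1)**alpha_1(S) * lam**p_0(S)."""
    total = 0
    for size in range(len(edges) + 1):
        for subset in combinations(edges, size):
            total += (x - 1) ** size * lam ** count_components(n, subset)
    return total


triangle = (3, [(0, 1), (1, 2), (0, 2)])
for x in range(-2, 4):
    assert colouring_side(*triangle, 3, x) == subgraph_side(*triangle, 3, x)
# x = 0 counts the colourings with phi(f) = 0: 3 * 2 * 1 for a triangle.
assert colouring_side(*triangle, 3, 0) == 6
```

Setting `x = 0` relies on Python's convention `0 ** 0 == 1`, so only colourings with $\phi(f) = 0$ contribute, which is the specialisation giving Whitney's expansion.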
If we orient the 1-cells of $L$ and adopt the convention that the boundary of an oriented loop vanishes, we can define 1-cycles on $L$ with coefficients in some fixed additive Abelian group $G$ of finite order $\lambda$. The number* of such 1-cycles on $L$ will be $\lambda^{p_1(L)}$. We call them β-colourings of $L$ with respect to $G$. Let $E(L; G, \psi)$ be the number of such 1-cycles for which just $\psi$ of the 1-cells have coefficient zero. Let $g_G$ be any β-colouring with respect to $G$ of $L$ and let $\psi(g_G)$ be the number of its zero coefficients†. We say that a subgraph $S$ of $L$ is associated with $g_G$ if every 1-cell of $L$ not in $S$ is assigned the zero element of $G$ as its coefficient in $g_G$. We use the symbol $S(g_G)$ to denote a subgraph of $L$ associated with $g_G$ and $g_G(S)$ to denote a β-colouring with which a given subgraph is associated. Clearly the number of β-colourings associated with a given $S$ is the number of β-colourings of $S$, which is $\lambda^{p_1(S)}$.

THEOREM XI. If $x$ is an indeterminate over $I$ then

$$\sum_{\psi} E(L; G, \psi)\, x^{\psi} = (x - 1)^{\alpha_1(L) - \alpha_0(L)}\, Q\Bigl(L; x - 1, \frac{\lambda}{x - 1}\Bigr). \tag{40}$$

For, by (37) and (1), the right-hand side is

$$(x - 1)^{\alpha_1(L) - \alpha_0(L)} \sum_{S} (x - 1)^{p_0(S) - p_1(S)} \lambda^{p_1(S)} = \sum_{S} (x - 1)^{\alpha_1(L) - \alpha_1(S)} \sum_{g_G(S)} 1 = \sum_{g_G} \sum_{S(g_G)} (x - 1)^{\alpha_1(L) - \alpha_1(S)} = \sum_{g_G} x^{\psi(g_G)};$$

for the number of subgraphs of $L$ associated with $g_G$ and having just $\alpha_1(L) - \psi(g_G) + r$ 1-cells is the number of ways of choosing $r$ 1-cells out of the $\psi(g_G)$ which have zero coefficient in $g_G$.

COROLLARY. $E(L; G, \psi)$ is the same for all additive Abelian groups $G$ of the same order $\lambda$.

If we write $x = 0$ in (40) we find that $(-1)^{\alpha_1(L) - \alpha_0(L)} E(L; G, 0)$, which is Example III of the Introduction, is the V-function $Q(L; -1, -\lambda)$. It takes the value $-1$ when $L$ is $y_0$ and therefore corresponds to a homomorphism of $R$ into the ring of rational integers which maps $N$ into 0. It is therefore, by the preceding section, topologically invariant.

If $L_1$ and $L_2$ are dual linear graphs on the sphere, the β-colourings of $L_1$ are closely connected with the α-colourings of $L_2$. In fact a 1-cycle $g_G$ bounds on the sphere and any 2-chain which it bounds on the map defined by $L_1$ has a dual 0-chain which is an α-colouring $f_G$ of $L_2$ such that $\phi(f_G) = \psi(g_G)$. There is also a relationship between the α-colourings and the β-colourings of the same linear graph $L$ expressed by the following identity in $x$:

$$(x - 1)^{\alpha_1(L)} \sum_{\psi} E(L; G, \psi) \Bigl(\frac{\lambda}{x - 1} + 1\Bigr)^{\psi} = \lambda^{\alpha_1(L) - \alpha_0(L)} \sum_{\phi} J(L; \lambda, \phi)\, x^{\phi}. \tag{41}$$

This is obtained by writing $\lambda/(x - 1)$ for $(x - 1)$ in (40) and then eliminating the function $Q$ by means of (39).

* See Lefschetz, Algebraic Topology (Amer. Math. Soc. Colloquium Publications, vol. 27), p. 106.

† It may be mentioned that for graphs on the sphere a β-colouring is essentially equivalent to a colouring of the regions of the map defined by a graph in $\lambda$ colours. The colours can be represented by elements of $G$ and so the colouring can be represented by a 2-chain on the map with coefficients in $G$. A β-colouring is simply the boundary of such a 2-chain. The number of 1-cells incident with two regions of the same colour (or incident with only one region) in a given colouring is given by the number $\psi(g_G)$ where $g_G$ is the corresponding β-colouring.
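Theorem XI and its corollary can likewise be checked by enumerating 1-cycles directly. In the sketch below the group $G$ is assumed to be a direct product of cyclic groups, given by a tuple of moduli (so `(4,)` is the cyclic group of order 4 and `(2, 2)` the four-group); each edge `(u, v)` is oriented from `u` to `v`. The comparison uses the integer form $\sum_S (x-1)^{\alpha_1(L)-\alpha_1(S)} \lambda^{p_1(S)}$ from the proof. All names are illustrative.

```python
from collections import Counter
from itertools import combinations, product


def count_components(n, edges):
    """Number of components p_0 of the graph on vertices 0..n-1."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in range(n)})


def zero_count_distribution(n, edges, moduli):
    """Counter psi -> number of 1-cycles over G with exactly psi zero coefficients."""
    elements = list(product(*(range(m) for m in moduli)))
    zero = tuple(0 for _ in moduli)
    dist = Counter()
    for g in product(elements, repeat=len(edges)):
        boundary = [[0] * len(moduli) for _ in range(n)]
        for (u, v), coeff in zip(edges, g):
            for k, c in enumerate(coeff):
                boundary[v][k] += c   # head of the oriented 1-cell
                boundary[u][k] -= c   # tail; for a loop the two terms cancel
        if all(b[k] % moduli[k] == 0 for b in boundary for k in range(len(moduli))):
            dist[sum(1 for coeff in g if coeff == zero)] += 1
    return dist


def subgraph_side(n, edges, lam, x):
    """Sum over S of (x - 1)**(alpha_1(L) - alpha_1(S)) * lam**p_1(S)."""
    total = 0
    for size in range(len(edges) + 1):
        for subset in combinations(edges, size):
            p1 = size - n + count_components(n, subset)
            total += (x - 1) ** (len(edges) - size) * lam ** p1
    return total


theta = (2, [(0, 1), (0, 1), (0, 1)])
dist = zero_count_distribution(*theta, (3,))
for x in range(-2, 4):
    assert sum(c * x ** psi for psi, c in dist.items()) == subgraph_side(*theta, 3, x)

# Corollary: only the order of G matters, not its structure.
k4 = (4, [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)])
assert zero_count_distribution(*k4, (4,)) == zero_count_distribution(*k4, (2, 2))
```

For the three-edge graph `theta` with $\lambda = 3$ there are $3^{p_1} = 9$ cycles in all, of which two have no zero coefficient.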
7. CUBICAL NETWORKS
We define a cubical network as a 1-complex for which there exists a finite simplicial dissection in which each 0-simplex is incident with not less than two, and not more than three 1-simplexes. Clearly any other simplicial dissection of such a complex will have the same property. The 0-simplexes which are each incident with three 1-simplexes we call nodes. The set of nodes is evidently independent of the particular simplicial dissection taken. A component of a cubical network which does not contain a node is evidently a simple closed curve, and if a component does contain nodes then the remainder of it must consist of a number of non-intersecting open arcs whose end-points are nodes of the component. We call these open arcs the arcs of the cubical network. The number of nodes in a cubical network $N$ is clearly two-thirds of the number of arcs of $N$. It is therefore even.

[Fig. 1]

Let $X$ be an arc having distinct end-points $P$ and $Q$ in a cubical network $N$. In a simplicial dissection of $N$ let $A_1$, $A_2$ be those 1-simplexes incident with $P$, and $B_1$, $B_2$ those 1-simplexes incident with $Q$, which are not in $X$. Let $a_1$, $a_2$, $b_1$, $b_2$ be the other end-points of $A_1$, $A_2$, $B_1$, $B_2$ respectively. By suitable subdivisions of a given simplicial dissection we can always arrange that $a_1$, $b_1$, $a_2$, $b_2$ are distinct points and not nodes of $N$. Other cubical networks can be obtained from $N$ by replacing $X$, $A_1$, $A_2$, $B_1$, $B_2$, $P$ and $Q$ by other systems of simplexes (see Fig. 1). If for example we suppress $A_1$ and $B_1$, introduce a new arc $Y$ joining $a_1$ to $b_1$ and then introduce an arc $Z$ joining a point in $Y$
to a point in $X$, we obtain $N_1$. We call this process a A-operation on $N$. If $N_1$ can be obtained from $N$ by a finite sequence of A-operations we say that $N$ and $N_1$ are A-equivalent. In such a case it is clear that $N_1$ has the same number of nodes as $N$ and that if $N$ is connected, so is $N_1$. By suppressing $X$ in $N$ we obtain $N'_X$, and by suppressing $Z$ in $N_1$ we obtain $N'_Z$. We define an F-function as a single-valued topologically invariant function on the set of all cubical networks to an additive Abelian group $G$ or commutative ring $H$ which satisfies the general law

$$F(N) - F(N'_X) = F(N_1) - F(N'_Z). \tag{42}$$

THEOREM XII. If $W(L)$ is a topologically invariant W-function, and $F(N)$ is the value of $W(L)$ for any linear graph on the cubical network $N$, then $F(N)$ is an F-function.

For let $N_0$ be the 1-complex obtained from $N$ by identifying all the points of the closure of $X$, and let $L_0$ be any linear graph on $N_0$ (clearly such exist). $N_0$ is evidently homoeomorphic to the 1-complex obtained from $N_1$ by identifying all the points of the closure of $Z$. Since $W(L)$ is topologically invariant it follows from (5) that

$$F(N) - F(N'_X) = W(L_0) = F(N_1) - F(N'_Z),$$

which proves the theorem.

A trivial example of an F-function is $F(N) = x^{n(N)}$ where $x$ is an arbitrary real or complex number and $n(N)$ is one-half of the number of nodes of $N$. This function also satisfies

$$F(N_1 \cup N_2) = F(N_1)\, F(N_2), \tag{43}$$

where $N_1$ and $N_2$ are any two disjoint cubical networks and $N_1 \cup N_2$ is their union.

Other F-functions may be obtained as follows. We define a subnetwork of $N$ as a 1-complex which is the union of all the nodes of $N$ and some subset of the arcs and nodeless components of $N$, such that each node of $N$ is an end-point of at least one arc of the subset. If the number of arcs of a subnetwork $T$ which have a given node $v$ of $N$ as an end-point (arcs which are loops being counted twice) is odd, we say that $v$ is an odd node of $T$. The number of odd nodes of $T$ is even, for it is congruent mod 2 to the number of end-points of arcs of $T$ (a loop being regarded as having two end-points, though they happen to coincide). Let $k(T)$ be one-half the number of odd nodes of $T$. Let $\pi_k(N)$ be the number of subnetworks of $N$ for which $k(T) = k$. As an example a cubical network $J$ which consists of a single simple closed curve has just two subnetworks, $J$ itself and the null complex, and so $\pi_0(J) = 2$ and $\pi_i(J) = 0$ ($i > 0$).

Let $M$ be the 1-complex obtained from the cubical network $N$ of Fig. 1 by suppressing $X$, $A_1$, $A_2$, $B_1$ and $B_2$. Let $T$ be any subnetwork of $N$, $N'_X$, $N_1$ or $N'_Z$, and let $T_0$ be its intersection with $M$ (which is contained in each of these four complexes). If we are told which of $a_1$, $a_2$, $b_1$, $b_2$ are contained in $T_0$ it is easy to determine for each of the four cubical networks how many subnetworks there are which agree with $T_0$ in $M$, and how many of these have 0 (or 1, or 2) odd nodes outside $T_0$. A consideration of the possible cases will show

$$\pi_k(N) + \pi_k(N'_X) = \pi_k(N_1) + \pi_k(N'_Z), \tag{44}$$

whence $(-1)^{n(N)} \pi_k(N)$ satisfies (42) and is thus an F-function. If therefore we define a polynomial $D(N; x)$ by

$$D(N; x) = \sum_{k} \pi_k(N)\, x^k$$
then $(-1)^{n(N)} D(N; x)$ will be an F-function. Further, by an argument analogous to the proof of (25) this F-function satisfies (43). If $N$ has no nodeless component, $\pi_0(N) = D(N; 0)$ is by its definition the number of solutions of Petersen's problem* for $N$.

We define a Hamiltonian circuit of $N$ as a subnetwork of $N$ which is connected and has no odd nodes. It is easily verified that the residue mod 2 of the number of Hamiltonian circuits of $N$ satisfies (42), so this also is an F-function.

Let $\Gamma_{i+1}$ ($i \geq 1$) be a cubical network with just $2i$ nodes $a_1, a_2, a_3, \ldots, a_{2i}$, having just one arc linking each pair of nodes $a_r$, $a_{r+1}$ for which $r$ is odd, having just two arcs linking each pair of nodes $a_r$, $a_{r+1}$ for which $r$ is even, and having two arcs which are loops, the end-points of one coinciding in $a_1$ and those of the other in $a_{2i}$. The nodes and arcs define a linear graph which we also denote by $\Gamma_{i+1}$.

THEOREM XIII. Any connected cubical network $N$ of $2n$ nodes ($n > 0$) is A-equivalent to a homoeomorph of $\Gamma_{n+1}$.

For first, if $N$, not being homoeomorphic to $\Gamma_{n+1}$, contains a simple closed curve $K$ of $k > 1$ arcs, then $N$ is A-equivalent to a cubical network $N_1$ containing a simple closed curve of $k - 1$ arcs. For we can suppose that $K$ contains the arc $X$ (Fig. 1) and also $a_1$ and $b_1$. Then $N_1$ clearly has the property desired. It follows that by a sequence of A-operations we can convert $N$ into a cubical network having a loop. Let $\Theta_r$ be the 1-complex derived from $\Gamma_{r+1}$ ($r > 0$) by suppressing the loop on $a_{2r}$. If part of a cubical network $M$ meeting the rest of $M$ only in a single node is homoeomorphic with $\Theta_r$, we call it a frond of $M$ of degree $r$, and say that the node corresponding to $a_{2r}$ is the base of the frond. The above argument showed that $N$ is A-equivalent to a cubical network $N_2$ having a frond $f$ (of degree $r$ say).

Secondly either $N_2$ contains a simple closed curve passing through the base of $f$, or it is A-equivalent to a cubical network having a frond of degree at least $r$ with a simple closed curve through its base. For if the base $c_0$ of $f$ is not on such a curve there will be a sequence $c_0, c_1, c_2, c_3, \ldots, c_s$ of minimum length such that consecutive nodes $c_i$, $c_{i+1}$ are linked by an arc $C_i$, and such that $c_s$ is on a simple closed curve $K_1$ in $N_2$. Otherwise we could extend the sequence $c_0, c_1, c_2, \ldots$ indefinitely in such a way that $c_i$ differed from $c_{i+1}$ for each $i$ without repetitions, which is absurd since $N_2$ has only a finite number of nodes. By A-operations on $C_0, C_1, \ldots$ in turn it is possible to transfer the frond to a base on a simple closed curve without altering its degree.

Now at this stage the simple closed curve through the base of the frond may be a loop, in which case $N$ has been transformed into a $\Gamma_i$-homoeomorph, and $i = n + 1$ since connexion and number of nodes are invariant under A-operations; or it may contain just two arcs, in which case $N_2$ has been transformed into a cubical network having a frond of degree exceeding $r$; or it can be reduced to a curve of just two arcs by a sequence of A-operations on those of its arcs not meeting the base of the frond. Hence if $N_2$ is not homoeomorphic with $\Gamma_{n+1}$ it can be transformed into a cubical network with a frond of degree greater than $r$. A finite number of such transformations will therefore change it into a homoeomorph of $\Gamma_{n+1}$.
* Dénes König, Theorie der endlichen und unendlichen Graphen (Leipzig, 1936), p. 186.
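The numbers $\pi_k(N)$ and the polynomial $D(N; x)$ defined earlier in this section can be computed by brute force for small cubical networks. The sketch below assumes a network without nodeless components, encoded as a node count and a list of arcs `(u, v)` (a loop is `(u, u)` and contributes two to the degree of its node); the function names are illustrative.

```python
from collections import Counter
from itertools import combinations


def pi_counts(num_nodes, arcs):
    """Counter k -> pi_k(N): subnetworks T with exactly 2k odd nodes."""
    counts = Counter()
    for size in range(len(arcs) + 1):
        for subset in combinations(arcs, size):
            degree = [0] * num_nodes
            for u, v in subset:
                degree[u] += 1
                degree[v] += 1   # a loop (u == v) is counted twice
            # every node must be an end-point of at least one chosen arc
            if all(d >= 1 for d in degree):
                odd_nodes = sum(d % 2 for d in degree)
                counts[odd_nodes // 2] += 1
    return counts


def D(num_nodes, arcs, x):
    """Value of D(N; x) = sum_k pi_k(N) x**k at the number x."""
    return sum(c * x ** k for k, c in pi_counts(num_nodes, arcs).items())


theta = (2, [(0, 1), (0, 1), (0, 1)])        # two nodes joined by three arcs
gamma_2 = (2, [(0, 0), (0, 1), (1, 1)])      # one linking arc and a loop at each node
assert pi_counts(*theta) == Counter({0: 3, 1: 4})
assert pi_counts(*gamma_2) == Counter({0: 1, 1: 4})
assert D(*theta, 0) == 3
```

In both examples $\pi_0$ counts the subnetworks in which every node has exactly two arc-ends, and the arcs left out then form a set meeting each node once, which is how $\pi_0$ connects with Petersen's problem.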
THEOREM XIV. Let $F(N)$ be any F-function. Then there is a unique topologically invariant W-function $W(L)$ such that $W(L) = F(N)$ whenever $L$ is a linear graph on $N$.

For the linear graphs $\Gamma_{i+1}$ may be taken as the linear graphs $x_{i+1}$ of Theorem VI. If we make the definitions $\Gamma_0 = y_0$ and $\Gamma_1 = y_1$ then the $\Gamma_i$ clearly satisfy the conditions of Theorem VI, and so by Theorem VIII $\{L^*\}$ has a unique expression as a polynomial in the $\{\Gamma_i\}$. Hence there is a unique topologically invariant W-function $W(L)$ which is equal to $F(N)$ whenever $N$ is a product of $\Gamma_i$ and $L$ is on $N$. By Theorem XII there is a unique F-function $F_1(N)$ such that $W(L) = F_1(N)$ whenever $L$ is on $N$. But if the value of an F-function is given for every product of $\Gamma_i$, then it is determined for all $N$. For by (42) if it is known for all $N$ such that $n(N) = p$ and for one cubical network $M$ such that $n(M) = p + 1$, then it is determined for any cubical network $M_1$ A-equivalent to a homoeomorph of $M$. By applying Theorem XIII to each component having a node we see that every cubical network is A-equivalent to a homoeomorph of a product of $\Gamma_i$ and so the required result follows by induction. Since $F(N) = F_1(N)$ whenever $N$ is a product of $\Gamma_i$ it follows that $F(N) = F_1(N)$ for every cubical network $N$. This proves the theorem.

COROLLARY. For an F-function satisfying (43) 'W-function' can be replaced by 'V-function' in the above argument.

As an example we mention an application of the above theory to the problem of functions obeying the law

$$f(N) = f(N'_X) + f(M_0) \tag{45}$$

(see Fig. 1). By eliminating $f(M_0)$ from two equations of the form (45) it is easy to show that $f(N)$ is an F-function multiplied by $(-1)^{n(N)}$. Hence it is fixed when its values for the products of the $\Gamma_i$ are given. But by applying (45) to these products we can show that for them $f(N) = 2^{n(N)} A$ where $A$ is a constant. Since $2^{n(N)} A$ is obviously a solution of (45) it follows that it is the general solution.
TRINITY COLLEGE
CAMBRIDGE

Reprinted from Proc. Cambridge Phil. Soc. 43 (1947), 26-40
ANNALS OF MATHEMATICS
Vol. 51, No. 1, January, 1950

A DECOMPOSITION THEOREM FOR PARTIALLY ORDERED SETS

By R. P. DILWORTH

(Received August 23, 1948)
1. Introduction

Let $P$ be a partially ordered set. Two elements $a$ and $b$ of $P$ are comparable if either $a \geq b$ or $b \geq a$. Otherwise $a$ and $b$ are non-comparable. A subset $S$ of $P$ is independent if every two distinct elements of $S$ are non-comparable. $S$ is dependent if it contains two distinct elements which are comparable. A subset $C$ of $P$ is a chain if every two of its elements are comparable.

This paper will be devoted to the proof of the following theorem and some of its applications.

THEOREM 1.1. Let every set of $k + 1$ elements of a partially ordered set $P$ be dependent while at least one set of $k$ elements is independent. Then $P$ is a set sum of $k$ disjoint chains.¹

It should be noted that the first part of the hypothesis of the theorem is also necessary. For if $P$ is a set sum of $k$ chains and $S$ is any subset containing $k + 1$ elements, then at least one pair must belong to the same chain and hence be comparable.

Theorem 1.1 contains as a very special case the Radó-Hall theorem on representatives of sets (Hall [1]). Indeed, we shall derive from Theorem 1.1 a general theorem on representatives of subsets which contains the Kreweras (Kreweras [2]) generalization of the Radó-Hall theorem.

As a further application, Theorem 1.1 is used to prove the following imbedding theorem for distributive lattices.

THEOREM 1.2. Let $D$ be a finite distributive lattice. Let $k(a)$ be the number of distinct elements in $D$ which cover $a$ and let $k$ be the largest of the numbers $k(a)$. Then $D$ is a sublattice of a direct union of $k$ chains and $k$ is the smallest number for which such an imbedding holds.

¹ This theorem has a certain formal resemblance to a theorem of Menger on graphs (D. König, Theorie der endlichen und unendlichen Graphen, Leipzig, (1936)). Menger's theorem, however, is concerned with the characterization of the maximal number of disjoint, complete chains. Another type of representation of partially ordered sets in terms of chains has been considered by Dushnik and Miller [3] (see also Komm [4]). It can be shown that if $n$ is the maximal number of non-comparable elements, then the dimension of $P$ in the sense of Dushnik and Miller is at most $n$. Except for this fact, there seems to be little connection between the two representations.

2. Proof of Theorem 1.1.

We shall prove the theorem first for the case where $P$ is finite. The theorem in the general case will then follow by a transfinite argument. Hence let $P$ be a finite partially ordered set and let $k$ be the maximal number of independent elements. If $k = 1$, then every two elements of $P$ are comparable and $P$ is thus a chain. Hence the theorem is trivial in this case and we may make an argument by induction. Let us assume, then, that the theorem holds for all finite partially ordered sets for which the maximal number of independent elements is less than $k$. Now it will be sufficient to show that if $C_1, \ldots, C_k$ are $k$ disjoint chains of $P$ and if $a$ is an element belonging to none of the $C_i$, then $C_1 + \cdots + C_k + a$ is a set sum of $k$ disjoint chains. For beginning with a set $a_1, \ldots, a_k$ of independent elements (which exist by hypothesis) we may add one new element at a time and be sure that at each stage we have a set sum of $k$ disjoint chains. Since $P$ is finite, we finally have $P$ itself represented as a set sum of $k$ chains.

Let, then, $C_1, \ldots, C_k$ be $k$ disjoint chains and let $a$ be an element not belonging to $C_1 + \cdots + C_k$. Let $U_i$ be the set of all elements of $C_i$ which contain $a$, let $L_i$ be the set of all elements of $C_i$ which are contained in $a$, and let $N_i$ be the set of all elements of $C_i$ which are non-comparable with $a$. Finally let

$$U = U_1 + \cdots + U_k, \quad L = L_1 + \cdots + L_k, \quad N = N_1 + \cdots + N_k, \quad C = C_1 + \cdots + C_k.$$
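To make the sets $U_i$, $L_i$, $N_i$ concrete, the following sketch works through one small instance. Everything specific in it is an assumption chosen for illustration: the order is divisibility on positive integers, the chains are $\{1,2,4,8\}$, $\{3,6,24\}$, $\{5,10\}$, the new element is $a = 12$, and the width is found by exhaustive search rather than by any method from the paper. It then looks for the indices $m$ and $l$ whose existence the argument that follows establishes.

```python
from itertools import combinations


def leq(x, y):
    """Illustrative partial order: x <= y when x divides y."""
    return y % x == 0


def comparable(x, y):
    return leq(x, y) or leq(y, x)


def width(elements):
    """Largest number of pairwise non-comparable elements (brute force)."""
    elements = list(elements)
    for size in range(len(elements), 0, -1):
        for subset in combinations(elements, size):
            if all(not comparable(x, y) for x, y in combinations(subset, 2)):
                return size
    return 0


chains = [{1, 2, 4, 8}, {3, 6, 24}, {5, 10}]   # C_1, C_2, C_3, so k = 3
a = 12
k = len(chains)

U = [{c for c in chain if leq(a, c)} for chain in chains]   # elements containing a
L = [{c for c in chain if leq(c, a)} for chain in chains]   # elements contained in a
N = [chain - u - l for chain, u, l in zip(chains, U, L)]    # non-comparable with a

C_all, U_all, L_all, N_all = (set().union(*parts) for parts in (chains, U, L, N))

# Indices m with width(N + U - U_m) < k, and indices l with width(N + L - L_l) < k.
valid_m = [m for m in range(k) if width(N_all | (U_all - U[m])) < k]
valid_l = [l for l in range(k) if width(N_all | (L_all - L[l])) < k]

m, l = valid_m[0], valid_l[0]
new_chain = U[m] | {a} | L[l]
rest = C_all - U[m] - L[l]
assert all(comparable(x, y) for x, y in combinations(new_chain, 2))
assert width(rest) <= k - 1
```

In this instance `valid_l` contains only the index of the chain $\{3, 6, 24\}$, so the element 12 is absorbed together with $\{3, 6\}$ into a new chain and the remaining elements have at most two independent elements.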
Clearly U; + Ni + Li = C; and U + N + L = C. We show now that for some m the maximal number of independent elements in N + U - U m is less than le. For suppose that for each j there exists a set Sj consisting of le independent elements of N + U - U j • Since there are le elements in S j and they belong to C = C1 + ... + Ck , there is exactly one element of S j in each of the chains C; . Since S j contains no elements of U j it follows that S j contains exactly one element of N j . Thus S = S1 + ... + Sk contains at least one element of N i for each i. Now let 8i be the minimal element of S which belongs to C i . 8i exists since the intersection of Sand C; is a finite chain which we have proved to be non-empty. Furthermore, 8i E Ni since there is at least one element of Ni which belongs to S and all of the elements of U i properly contain all of the elements of N i . Hence 81, . . . ,8k EN. Now if 8; ;;;; 8j for i ~ j, let 8j E Sr. Since Sr contains an element ti belonging to Ci , we have from the definition of 8i that ti ;;;; 8i ;;;; 8j and ti ~ 8j since t; E C; and 8j E Cj • But this contradicts our assumption that the elements of Sr are independent. Hence we must have 8j ~ 8j for i ~ j and 81, •.• , 8k form an independent set. But since 8i belongs to N, 8i is non-comparable with a and hence a, 81, ..• , 8k is an independent set containing le + 1 elements. But this contradicts the hypothesis of the theorem and hence we conclude that for some m, the maximal number of independent elements in N + U - U m is less than le. In an exactly dual manner it follows that for some l, the maximal number of independent elements in N + L - Ll is less than le. Now let T be an independent subset of C - U m - Ll . If T contains an element x belonging to U - U m and an element y belonging to L - L l , then x ;;;; a ;;;; y contrary to the independence of T. Since
$$(N + U - U_m) + (N + L - L_l) = C - U_m - L_l$$
DECOMPOSITION THEOREM FOR PARTIALLY ORDERED SETS
it follows that $T$ is either a subset of $N + U - U_m$ or of $N + L - L_l$. Hence the number of elements in $T$ is less than $k$ and thus the maximal number of independent elements in $C - U_m - L_l$ is less than $k$. Since $U_m + L_l$ is a chain there is at least one independent set of $k - 1$ elements in $C - U_m - L_l$. Hence by the induction hypothesis $C - U_m - L_l = C'_1 + \dots + C'_{k-1}$ where $C'_1, \dots, C'_{k-1}$ are disjoint chains. Let $C'_k$ be the chain $U_m + a + L_l$. Then
$$C + a = C'_1 + \dots + C'_k$$
and our assertion is proved.
We turn now to the proof of the general case. Again when $k = 1$ the theorem is trivial and we may proceed by induction. Hence let the theorem hold for all partially ordered sets having at most $k - 1$ independent elements and let $P$ satisfy the hypotheses of the theorem. A subset $C$ of $P$ is said to be strongly dependent if for every finite subset $S$ of $P$, there is a representation of $S$ as a set sum of $k$ disjoint chains such that all of the elements of $C$ which belong to $S$ are members of the same chain. Clearly any strongly dependent subset is a chain. Also from the theorem in the finite case it follows that a set consisting of a single element is always strongly dependent. Since strong dependence is a finiteness property it follows from the Maximal Principle that $P$ contains a maximal strongly dependent subset $C_1$.
Suppose that $P - C_1$ contains $k$ independent elements $a_1, \dots, a_k$. Then from the maximal property of $C_1$ we conclude that $C_1 + a_i$ is not strongly dependent for each $i$. Hence there exists a finite subset $S_i$ such that in any representation as a set sum of $k$ chains there are at least two chains which contain elements of $C_1 + a_i$. $S_i$ must clearly contain $a_i$ since $C_1$ is strongly dependent. Let $S = S_1 + \dots + S_k$. By the strong dependence of $C_1$, $S = K_1 + \dots + K_k$ where $K_1, \dots, K_k$ are disjoint chains such that for some $n \leq k$ we have $S \cdot C_1 \subset K_n$. Since $S$ contains $a_1, \dots, a_k$ which are independent, for some $m \leq k$ we have $a_m \in K_n$. Let $K'_i$ be the chain $S_m \cdot K_i$. Then $S_m = K'_1 + \dots + K'_k$ and $S_m \cdot C_1 \subset S_m \cdot S \cdot C_1 \subset S_m \cdot K_n = K'_n$. But by definition $a_m \in S_m$ and $a_m \in K_n$. Hence $S_m \cdot (C_1 + a_m) \subset K'_n$ which contradicts the definition of $S_m$. We conclude that $P - C_1$ contains at most $k - 1$ independent elements. But since $C_1$ is a chain and $P$ contains a set of $k$ independent elements, it follows that $P - C_1$ contains a set of $k - 1$ independent elements. Thus by the induction hypothesis we have $P - C_1 = C_2 + \dots + C_k$. Hence
$$P = C_1 + \dots + C_k$$
and the proof of the theorem is complete.
3. Application to representatives of sets.
G. Kreweras has proved the following extension of the Radó-Hall theorem on representatives of sets: Let $\mathfrak{A}$ and $\mathfrak{B}$ be two partitions of a set into $n$ parts and let $h$ be the smallest number such that for any $r$, $r$ parts of $\mathfrak{A}$ contain at most $r + h$ parts of $\mathfrak{B}$. Let $k$ be the smallest number such that $n + k$ elements serve to represent both partitions. Then $h = k$.
R. P. DILWORTH
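The decomposition theorem proved above can be checked numerically on a small example. The sketch below is an illustration only, not the inductive construction of the proof: it finds the maximal number of independent elements by brute force and the least number of covering chains by the standard bipartite-matching reduction, which is a different technique from the one in the text. The function names and the divisibility example are assumptions made for this illustration.

```python
from itertools import combinations

def width(elements, leq):
    """Maximal number of independent (pairwise non-comparable) elements."""
    for size in range(len(elements), 0, -1):
        for subset in combinations(elements, size):
            if all(not leq(x, y) and not leq(y, x)
                   for x, y in combinations(subset, 2)):
                return size
    return 0

def min_chain_cover(elements, leq):
    """Least number of disjoint chains whose set sum is the whole poset.

    Standard reduction: n minus a maximum matching in the bipartite
    graph joining x to y whenever x lies strictly below y.
    """
    above = {x: [y for y in elements if x != y and leq(x, y)] for x in elements}
    match = {}  # y -> x, meaning y follows x in some chain

    def augment(x, seen):
        for y in above[x]:
            if y in seen:
                continue
            seen.add(y)
            if y not in match or augment(match[y], seen):
                match[y] = x
                return True
        return False

    matched = sum(augment(x, set()) for x in elements)
    return len(elements) - matched

# Hypothetical example: the integers 1..10 ordered by divisibility.
numbers = list(range(1, 11))
divides = lambda x, y: y % x == 0

print(width(numbers, divides), min_chain_cover(numbers, divides))
```

For this example both quantities agree, as the theorem requires: the set 6, 7, 8, 9, 10 is independent, and the chains 1, 2, 4, 8 and 3, 6 and 5, 10 and 7 and 9 cover everything.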
To show the power of Theorem 1.1 we shall prove an even more general theorem in which the partition requirement is dropped. Now if $\mathfrak{A}$ is any finite collection of subsets of a set $S$ we shall say that a set of $n$ elements (repetitions being counted) represents $\mathfrak{A}$ if there exists a one-to-one correspondence of the sets of $\mathfrak{A}$ onto a subset of the $n$ elements such that each set contains its corresponding element. For example, the set $\{1, 1, 1\}$ represents the three sets $\{1, 2\}$, $\{1, 3\}$, and $\{1, 4\}$. The theorem can then be stated as follows:
THEOREM 3.1. Let $\mathfrak{A}$ and $\mathfrak{B}$ be two finite collections of subsets of some set. Let $\mathfrak{A}$ and $\mathfrak{B}$ contain $m$ and $n$ sets respectively. Let $h$ be the smallest number such that for every $r$, the union of any $r + h$ sets of $\mathfrak{A}$ intersects at least $r$ sets of $\mathfrak{B}$. Let $k$ be the smallest number such that $n + k$ elements serve to represent both collections $\mathfrak{A}$ and $\mathfrak{B}$. Then $h = k$.
It can be easily verified that if $\mathfrak{A}$ and $\mathfrak{B}$ are partitions of a set, then $h$ as defined in Theorem 3.1 is equivalent to the definition given in the theorem of Kreweras.
For the proof let $\mathfrak{A}$ consist of sets $A_1, \dots, A_m$ and $\mathfrak{B}$ consist of sets $B_1, \dots, B_n$. We make the sets $A_1, \dots, A_m, B_1, \dots, B_n$ into a partially ordered set $P$ as follows:
$$A_i \supseteq A_i, \quad i = 1, \dots, m,$$
$$B_j \supseteq B_j, \quad j = 1, \dots, n,$$
$$A_i \supseteq B_j \text{ if and only if } A_i \text{ and } B_j \text{ intersect.}$$
It is obvious that $P$ is a partially ordered set under this ordering. Now let $w$ be the maximal number of independent elements of $P$. Since the union of any $r + h$ sets of $\mathfrak{A}$ intersects at least $r$ sets of $\mathfrak{B}$, it follows that any independent subset of $P$ can have at most $r + h + (n - r) = n + h$ elements. Hence $w \leq n + h$. On the other hand for some $r$ there are $r + h$ sets of $\mathfrak{A}$ whose union intersects precisely $r$ sets of $\mathfrak{B}$. Hence these $r + h$ sets of $\mathfrak{A}$ and the remaining $n - r$ sets of $\mathfrak{B}$ form an independent subset of $P$ containing $n + h$ elements. Thus $w = n + h$. By Theorem 1.1, $P$ is the set sum of $w$ chains $C_1, \dots, C_w$. Now if a chain $C_i$ contains two sets they have a non-null intersection by definition. Hence for each $C_i$ there is an element $a_i$ common to the sets of $C_i$. But since $A_1, \dots, A_m$ are independent in $P$ it follows that they belong to different chains and hence the $w$ elements $a_1, \dots, a_w$ represent $\mathfrak{A}$. Similarly, $a_1, \dots, a_w$ represent $\mathfrak{B}$ and thus $n + k \leq w$. But since $P$ cannot be represented as a set sum of less than $w$ chains, it follows that $n + k = w = n + h$. Hence $h = k$ and the theorem is proved.
4. Proof of Theorem 1.2.
Let us recall that an element $q$ of a finite distributive lattice $D$ is (union) irreducible if $q = x \cup y$ implies $q = x$ or $q = y$. It can be easily verified that if $q$ is irreducible, then $q \subseteq x \cup y$ implies $q \subseteq x$ or $q \subseteq y$. From the finiteness$^2$ of $D$ it follows that every element of $D$ can be expressed as a union of irreducible elements. From this fact we conclude that if $x > y$, there exists at least one irreducible $q$ such that $x \supseteq q$ and $y \nsupseteq q$.
$^2$ The lattice is assumed to be finite for sake of simplicity. The theorem holds without this restriction. In the proof, "elements covered by a" must be replaced by "maximal ideals in a" and "irreducible elements" must be replaced by "prime ideals."
Now let $P$ be the partially ordered set of union irreducible elements of $D$. Let $a$ be such that $k = k(a)$. Then there are $k$ elements $a_1, \dots, a_k$ which cover $a$. Let $q_i$ be an irreducible such that $a_i \supseteq q_i$ and $a \nsupseteq q_i$. Then if $q_i \supseteq q_j$ where $i \neq j$ we have $a = a_i \cap a_j \supseteq q_i \cap q_j \supseteq q_j$ which contradicts $a \nsupseteq q_j$. Hence $q_1, \dots, q_k$ are an independent set of elements of $P$.
Next let $q'_1, \dots, q'_l$ be an arbitrary independent subset of $P$. Let $a' = q'_1 \cup \dots \cup q'_l$ and for each $i$ let $p'_i = q'_1 \cup \dots \cup q'_{i-1} \cup q'_{i+1} \cup \dots \cup q'_l$. Now if $p'_i = a'$ for some $i$, then
$$q'_i = q'_i \cap a' = q'_i \cap p'_i = (q'_i \cap q'_1) \cup \dots \cup (q'_i \cap q'_{i-1}) \cup (q'_i \cap q'_{i+1}) \cup \dots \cup (q'_i \cap q'_l)$$
and hence $q'_i = q'_i \cap q'_j$ for some $j \neq i$. But then $q'_j \supseteq q'_i$ contrary to independence. Thus $a' > p'_i$ for each $i$ and $p'_i \cup p'_j = a'$ for $i \neq j$. Let $a = p'_1 \cap \dots \cap p'_l$ and for each $i$ let $p_i = p'_1 \cap \dots \cap p'_{i-1} \cap p'_{i+1} \cap \dots \cap p'_l$. If $p_i = a$, then $p'_i = p'_i \cup a = p'_i \cup p_i = (p'_i \cup p'_1) \cap \dots \cap (p'_i \cup p'_{i-1}) \cap (p'_i \cup p'_{i+1}) \cap \dots \cap (p'_i \cup p'_l) = a'$ which contradicts $p'_i < a'$. Hence $p_i > a$ and $p_i \cap p_j = a$ for $i \neq j$. Let $p_i \supseteq a_i$ where $a_i$ covers $a$. Then $a \subseteq a_i \cap a_j \subseteq p_i \cap p_j = a$ for $i \neq j$ and hence $a_i \cap a_j = a$, $i \neq j$. Thus $a_1, \dots, a_l$ are distinct elements of $D$ covering $a$. It follows that $l \leq k$ and hence $k$ is the maximal number of independent elements of $P$.
Now by Theorem 1.1 $P$ is the set sum of $k$ disjoint chains $C_1, \dots, C_k$. We adjoin the null element $z$ of $D$ to each of the chains $C_i$. Then for each $x \in D$, there is a unique maximal element $x_i$ in $C_i$ which is contained in $x$. Now suppose $x > x_1 \cup \dots \cup x_k$ in $D$. Then there exists an irreducible $q$ such that $x \supseteq q$ and $x_1 \cup \dots \cup x_k \nsupseteq q$. But $q \in C_i$ for some $i$ and hence $x_1 \cup \dots \cup x_k \supseteq x_i \supseteq q$ contrary to the definition of $q$. Hence $x = x_1 \cup \dots \cup x_k$. Consider the mapping of $D$ into the direct union of $C_1, \dots, C_k$ given by
$$x \to \{x_1, \dots, x_k\}.$$
Now if $x_i = y_i$ for $i = 1, \dots, k$, then $x = x_1 \cup \dots \cup x_k = y_1 \cup \dots \cup y_k = y$ and the mapping is thus one-to-one. Since $x \cup y \supseteq x_i \cup y_i$ we have $(x \cup y)_i \supseteq x_i \cup y_i$. But since $(x \cup y)_i$ is union irreducible we get $x \cup y \supseteq (x \cup y)_i \to x \supseteq (x \cup y)_i$ or $y \supseteq (x \cup y)_i \to x_i \supseteq (x \cup y)_i$ or $y_i \supseteq (x \cup y)_i \to x_i \cup y_i \supseteq (x \cup y)_i$. Thus $(x \cup y)_i = x_i \cup y_i$ and we have
$$x \cup y \to \{x_1 \cup y_1, \dots, x_k \cup y_k\}.$$
Similarly $x \cap y \supseteq x_i \cap y_i \to (x \cap y)_i \supseteq x_i \cap y_i$. But $x \supseteq x \cap y \to x_i \supseteq (x \cap y)_i$ and $y \supseteq x \cap y \to y_i \supseteq (x \cap y)_i$. Hence $x_i \cap y_i \supseteq (x \cap y)_i$. Thus $(x \cap y)_i = x_i \cap y_i$ and we have
$$x \cap y \to \{x_1 \cap y_1, \dots, x_k \cap y_k\}.$$
This completes the proof that $D$ is isomorphic to a sublattice of a direct union of $k$ chains.
Now suppose that $D$ is a sublattice of the direct union of $l$ chains $C'_1, \dots, C'_l$ where $l < k$. Again let $a$ be such that $k(a) = k$ and let $a_1, \dots, a_k$ be the $k$ distinct elements covering $a$. Define $a' = a_1 \cup \dots \cup a_k$ and let $a'_i = a_1 \cup \dots \cup a_{i-1} \cup a_{i+1} \cup \dots \cup a_k$ for each $i$. Now $a' = q'_1 \cup \dots \cup q'_l$ where $q'_i \in C'_i$. And if $q'_i = x' \cup y'$, then $q'_i = x'_i \cup y'_i$ where $x'_i, y'_i \in C'_i$. But then either $q'_i = x'_i \cup y'_i = x'_i$ or $q'_i = x'_i \cup y'_i = y'_i$ and hence either $q'_i = x'$ or $q'_i = y'$. Thus each $q'_i$ is union irreducible. But $a_1 \cup \dots \cup a_k = a' \supseteq q'_i$ for $i = 1, \dots, l$. Thus for each $i \leq l$ there is a $j$ such that $a_j \supseteq q'_i$. Since $l < k$ there is some $r$ such that $a'_r \supseteq q'_1 \cup \dots \cup q'_l = a' \supseteq a_r$. But then $a_r = a'_r \cap a_r = a$ which contradicts the fact that $a_r$ covers $a$. Hence $l \geq k$ and we conclude that $k$ is the least number of chains whose direct union contains $D$ as a sublattice. This completes the proof of Theorem 1.2.
YALE UNIVERSITY
CALIFORNIA INSTITUTE OF TECHNOLOGY
REFERENCES
1. P. HALL. On representatives of subsets. J. London Math. Soc. 10 (1935), 26-30.
2. G. KREWERAS. Extension d'un théorème sur les répartitions en classes. C. R. Acad. Sci. Paris 222 (1946), 431-432.
3. B. DUSHNIK AND E. W. MILLER. Partially ordered sets. Amer. J. of Math. vol. 63 (1941), 600-610.
4. H. KOMM. On the dimension of partially ordered sets. Amer. J. of Math. vol. 70 (1948), 507-520.
THE MARRIAGE PROBLEM.*
By PAUL R. HALMOS and HERBERT E. VAUGHAN.
In a recent issue of this journal Weyl$^1$ proved a combinatorial lemma which was apparently considered first by P. Hall.$^2$ Subsequently Everett and Whaples$^3$ published another proof and a generalization of the same lemma. Their proof of the generalization appears to duplicate the usual proof of Tychonoff's theorem.$^4$ The purpose of this note is to simplify the presentation by employing the statement rather than the proof of that result. At the same time we present a somewhat simpler proof of the original Hall lemma.
Suppose that each of a (possibly infinite) set of boys is acquainted with a finite set of girls. Under what conditions is it possible for each boy to marry one of his acquaintances? It is clearly necessary that every finite set of $k$ boys be, collectively, acquainted with at least $k$ girls; the Everett-Whaples result is that this condition is also sufficient.
We treat first the case (considered by Hall) in which the number of boys is finite, say $n$, and proceed by induction. For $n = 1$ the result is trivial. If $n > 1$ and if it happens that every set of $k$ boys, $1 \leq k < n$, has at least $k + 1$ acquaintances, then an arbitrary one of the boys may marry any one of his acquaintances and refer the others to the induction hypothesis. If, on the other hand, some group of $k$ boys, $1 \leq k < n$, has exactly $k$ acquaintances, then this set of $k$ may be married off by induction and, we assert, the remaining $n - k$ boys satisfy the necessary condition with respect to the as yet unmarried girls. Indeed if $1 \leq h \leq n - k$ and if some set of $h$ bachelors were to know fewer than $h$ spinsters, then this set of $h$ bachelors together with the $k$ married men would have known fewer than $k + h$ girls. An application of the induction hypothesis to the $n - k$ bachelors concludes the proof in the finite case.
If the set $B$ of boys is infinite, consider for each $b$ in $B$ the set $G(b)$ of his acquaintances, topologized by the discrete topology, so that $G(b)$ is a compact Hausdorff space. Write $G$ for the topological Cartesian product of all $G(b)$; by Tychonoff's theorem $G$ is compact. If $\{b_1, \dots, b_n\}$ is any finite set of boys, consider the set $H$ of all those elements $g = g(b)$ of $G$ for which $g(b_i) \neq g(b_j)$ whenever $b_i \neq b_j$, $i, j = 1, \dots, n$. The set $H$ is a closed subset of $G$ and, by the result for the finite case, $H$ is not empty. Since a finite union of finite sets is finite, it follows that the class of all sets such as $H$ has the finite intersection property and, consequently, has a non empty intersection. Since an element $g = g(b)$ in this intersection is such that $g(b') \neq g(b'')$ whenever $b' \neq b''$, the proof is complete.
It is perhaps worth remarking that this theorem furnishes the solution of the celebrated problem of the monks.$^5$ Without entering into the history of this well-known problem, we state it and its solution in the language of the preceding discussion. A necessary and sufficient condition that each boy $b$ may establish a harem consisting of $n(b)$ of his acquaintances, $n(b) = 1, 2, 3, \dots$, is that, for every finite subset $B_0$ of $B$, the total number of acquaintances of the members of $B_0$ be at least equal to $\sum n(b)$, where the summation runs over every $b$ in $B_0$. The proof of this seemingly more general assertion may be based on the device of replacing each $b$ in $B$ by $n(b)$ replicas seeking conventional marriages, with the understanding that each replica of $b$ is acquainted with exactly the same girls as $b$. Since the stated restriction on the function $n$ implies that the replicas satisfy the Hall condition, an application of the Everett-Whaples theorem yields the desired result.
UNIVERSITY OF CHICAGO AND UNIVERSITY OF ILLINOIS.
* Received June 6, 1949.
$^1$ H. Weyl, "Almost periodic invariant vector sets in a metric vector space," American Journal of Mathematics, vol. 71 (1949), pp. 178-205.
$^2$ P. Hall, "On representation of subsets," Journal of the London Mathematical Society, vol. 10 (1935), pp. 26-30.
$^3$ C. J. Everett and G. Whaples, "Representations of sequences of sets," American Journal of Mathematics, vol. 71 (1949), pp. 287-293. Cf. also M. Hall, "Distinct representatives of subsets," Bulletin of the American Mathematical Society, vol. 54 (1948), pp. 922-926.
$^4$ C. Chevalley and O. Frink, Jr., "Bicompactness of Cartesian products," Bulletin of the American Mathematical Society, vol. 47 (1941), pp. 612-614.
$^5$ H. Balzac, Les Cent Contes Drôlatiques, IV, 9: Des moines et novices, Paris (1849).
Reprinted from Amer. J. Math. 72 (1950), 214-215
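For the finite case, the necessary and sufficient condition can be tried out on small examples. The following is a minimal sketch under stated assumptions: the condition is tested by brute force over all sets of boys, and a marriage is found by augmenting paths, which is a different method from the induction used in the note. The function names and the sample acquaintance tables are hypothetical.

```python
from itertools import combinations

def hall_condition(acquaintances):
    """True if every set of k boys collectively knows at least k girls."""
    boys = list(acquaintances)
    for k in range(1, len(boys) + 1):
        for group in combinations(boys, k):
            known = set().union(*(acquaintances[b] for b in group))
            if len(known) < k:
                return False
    return True

def marry(acquaintances):
    """Return a dict boy -> girl with distinct girls, or None if impossible."""
    husband = {}  # girl -> boy currently matched to her

    def place(boy, seen):
        for girl in acquaintances[boy]:
            if girl in seen:
                continue
            seen.add(girl)
            # Take a free girl, or try to re-place her current partner.
            if girl not in husband or place(husband[girl], seen):
                husband[girl] = boy
                return True
        return False

    for boy in acquaintances:
        if not place(boy, set()):
            return None
    return {boy: girl for girl, boy in husband.items()}

# Hypothetical acquaintance tables.
possible = {"a": {1, 2}, "b": {1}, "c": {2, 3}}
impossible = {"a": {1}, "b": {1}}

print(hall_condition(possible), marry(possible))
print(hall_condition(impossible), marry(impossible))
```

In both examples the condition holds exactly when a marriage exists, which is the content of the finite case.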
CIRCUITS AND TREES IN ORIENTED LINEAR GRAPHS by T. van Aardenne-Ehrenfest (Dordrecht) and N. G. de Bruijn (Delft)
§ 1. $P_n(a)$-cycles.
In this § we state the problem which gave rise to our investigations about graphs. The further contents of the paper are independent of this § 1. Consider a set of $a$ figures $1, 2, \dots, a$, and let $n$ be a natural number. A sequence of $n$ figures will be called an $n$-tuple. Clearly, there are $a^n$ different $n$-tuples. An oriented circular array, consisting of $a^n$ figures, will be called a $P_n(a)$-cycle, whenever it has the property that each $n$-tuple occurs exactly once as a set of $n$ consecutive figures of the cycle. An example, with $a = 3$, $n = 2$ is the cycle 1 1 2 2 3 3 1 3 2. 1)
The existence of $P_n(a)$-cycles, for arbitrary values of $a$ and $n$, was proved by M. H. MARTIN [3], I. J. GOOD [2] and D. REES [4]. One of us showed ([1]) that, for $a = 2$, the number of different $P_n(2)$-cycles equals $2^{f(n)}$, $f(n) = 2^{n-1} - n$. This result was derived as follows. The number of $P_n(2)$-cycles can be interpreted as the number of circuits in a certain graph $N_{n+1}$ (compare also [2]). The graph $N_{n+1}$ can be obtained by a certain operation from $N_n$, and by a general theorem on circuits in oriented graphs the number of circuits of $N_{n+1}$ could be expressed in the number of circuits of $N_n$. This theorem on graphs was proved in [1] only for the case that at any vertex 2 edges point outward and 2 inward. In the present paper we shall deal, among other things, with the general case (theorem 4). This result immediately enables us to determine the number of $P_n(a)$-cycles for arbitrary $a$. Referring to [1] for details, we only state the result: The number of different $P_n(a)$-cycles is $a^{-n}(a!)^q$, where $q = a^{n-1}$. For example, there are 24 $P_2(3)$-cycles. Six of them are
1 1 2 3 3 2 2 1 3    1 1 2 3 2 2 1 3 3    1 1 2 2 3 3 2 1 3
1 1 2 2 3 2 1 3 3    1 1 2 2 3 3 1 3 2    1 1 2 2 3 1 3 3 2
1) It has to be understood that these figures have to be placed around an oriented circle. Therefore, 21 is one of the 2 - tuples occurring in the cycle. Naturally, 112233132 and 331321122 are considered as one and the same cycle, but 112313322, which has the reversed order, is a different one.
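The stated count $a^{-n}(a!)^q$, $q = a^{n-1}$, can be confirmed by exhaustive search for very small $a$ and $n$. The sketch below is illustrative only; the function names are assumptions, and the search is feasible only for tiny cases because it tries all $a^{a^n}$ arrays. Rotations of a cycle are identified, while a cycle and its reversal are counted separately, as in the footnote.

```python
from itertools import product
from math import factorial

def count_cycles(a, n):
    """Count P_n(a)-cycles by brute force, identifying rotations."""
    length = a ** n
    found = set()
    for seq in product(range(1, a + 1), repeat=length):
        windows = {tuple(seq[(i + j) % length] for j in range(n))
                   for i in range(length)}
        if len(windows) == length:  # every n-tuple occurs exactly once
            rotations = [seq[i:] + seq[:i] for i in range(length)]
            found.add(min(rotations))  # canonical rotation
    return len(found)

def formula(a, n):
    """The count a^(-n) * (a!)^(a^(n-1)) stated in the text."""
    return factorial(a) ** (a ** (n - 1)) // a ** n

print(count_cycles(3, 2), formula(3, 2))
print(count_cycles(2, 3), formula(2, 3))
```

For $a = 3$, $n = 2$ the search reproduces the 24 cycles mentioned above, and for $a = 2$, $n = 3$ it gives $2^{f(3)} = 2$.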
Another six are obtained from these by interchanging the figures 2 and 3 everywhere. By reversing the orientation, 12 new cycles arise. § 2. Preliminaries about permutation groups.
Let $\mathfrak{S}_m$ be the symmetric group of degree $m$, that is, the group of all $m!$ permutations of a set $E_m$ of $m$ objects. If $\mathfrak{A}$ is a subset of $\mathfrak{S}_m$ then the number of cyclic permutations in $\mathfrak{A}$ will be denoted by $|\mathfrak{A}|$, and the total number of elements in $\mathfrak{A}$ by $n(\mathfrak{A})$. A subset $\mathfrak{D}$ of $\mathfrak{S}_m$ will be called a D-set (in $\mathfrak{S}_m$), whenever it has the property that $|S\mathfrak{D}|$ has the same value for all $S \in \mathfrak{S}_m$. It is easily seen that, in that case, we have $|S\mathfrak{D}| = m^{-1} \cdot n(\mathfrak{D})$. For, if $C$ is any cyclic permutation, then there are exactly $n(\mathfrak{D})$ possibilities for $S$ such that $S\mathfrak{D}$ contains $C$, and it follows that $m! \, |S\mathfrak{D}| = (m-1)! \, n(\mathfrak{D})$. Furthermore, it may be remarked that $|S\mathfrak{D}| = |\mathfrak{D}S|$, since $SBS^{-1}$ is cyclic whenever $B$ is cyclic. Therefore, if $\mathfrak{D}$ is a D-set and if $P$ is an arbitrary element of $\mathfrak{S}_m$, then $\mathfrak{D}P$ is also a D-set. $\mathfrak{S}_m$ itself clearly is a D-set in $\mathfrak{S}_m$, but theorem 1 will show that non-trivial D-sets exist.
Let $E_l$ be a sub-set of the set of objects $E_m$, containing $l$ objects. Consider the sub-group $\mathfrak{G} \subset \mathfrak{S}_m$ of all permutations which only permute the elements of $E_l$, leaving the remaining elements of $E_m$ invariant. If $G$ is any permutation of $\mathfrak{G}$, then $\overline{G}$ denotes the corresponding permutation of the objects of $E_l$, that is to say, we disregard the objects belonging to $E_m - E_l$ (which are invariant under $G$). $\overline{G}$ is defined uniquely by $G$, and vice versa. The same notation will be used for sets: if $\mathfrak{A} \subset \mathfrak{G}$, then $\overline{\mathfrak{A}}$ denotes the set of all $\overline{G}$, where $G \in \mathfrak{A}$.
Lemma 1. Let $\mathfrak{B}$ be a sub-set of $\mathfrak{G}$ such that $\overline{\mathfrak{B}}$ is a D-set in $\overline{\mathfrak{G}}$, and let $C \in \mathfrak{S}_m$ be a cyclic permutation. Then we have
$$|\mathfrak{B}C| = l^{-1} \cdot n(\mathfrak{B}).$$
Proof. We shall deal with the cyclic representations of the permutations involved. Let $G$ be the element of $\mathfrak{G}$ for which the cyclic representation of $\overline{G}$ is obtained by cancelling the objects of $E_m - E_l$ from the cyclic representation of $C$. Further, let $G_1$ be an arbitrary permutation of $\mathfrak{G}$. Then it is easily verified that $\overline{G}_1\overline{G}$ (of degree $l$) shows the same number of cycles as $G_1C$ (of degree $m$). Hence $G_1C$ is cyclic whenever $\overline{G}_1\overline{G}$ is cyclic. Therefore
$$|\mathfrak{B}C| = |\overline{\mathfrak{B}}\,\overline{G}| = l^{-1} \cdot n(\overline{\mathfrak{B}}) = l^{-1} \cdot n(\mathfrak{B}).$$
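Lemma 2. Let $\mathfrak{B}$ be a subset of $\mathfrak{G}$ such that $\overline{\mathfrak{B}}$ is a D-set in $\overline{\mathfrak{G}}$. Let $Q$ be any arbitrary permutation of $\mathfrak{S}_m$. Then we have
$$\frac{|\mathfrak{B}Q|}{n(\mathfrak{B})} = \frac{|\mathfrak{G}Q|}{n(\mathfrak{G})}.$$
Proof. If there is no $G \in \mathfrak{G}$ such that $GQ$ is cyclic, then both sides are equal to zero. Now assume that $G \in \mathfrak{G}$ is such that $GQ$ is cyclic; put $GQ = C$. We have $\mathfrak{G}Q = \mathfrak{G}C$, and $\mathfrak{B}Q = (\mathfrak{B}G^{-1})C$. The set $\overline{\mathfrak{B}}_1 = \overline{\mathfrak{B}}\,\overline{G}^{-1}$ is a D-set in $\overline{\mathfrak{G}}$, since $\overline{\mathfrak{B}}$ was a D-set in $\overline{\mathfrak{G}}$. Now, by lemma 1,
$$|\mathfrak{B}Q| = |\mathfrak{B}_1C| = l^{-1} \cdot n(\mathfrak{B}_1) = l^{-1} \cdot n(\mathfrak{B}).$$
Analogously
$$|\mathfrak{G}Q| = |\mathfrak{G}C| = l^{-1} \cdot n(\mathfrak{G}),$$
and lemma 2 has been proved.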
Let $k$ and $n$ be natural numbers, and take $m = kn$. We consider a set $E_m$ of $m$ objects, divided into $k$ systems, each of them containing $n$ objects. We shall again denote by $\mathfrak{S}_m$ the group of all permutations of $E_m$. $\mathfrak{H}$ denotes the group consisting of all $k!\,(n!)^k$ permutations $H$ with the property that $Ha$ and $Hb$ belong to the same system whenever $a$ and $b$ belong to the same system. Or, shortly, $H$ transforms systems into systems.
Theorem 1. $\mathfrak{H}$ is a D-set in $\mathfrak{S}_m$.
Proof. If either $k = 1$ or $n = 1$, then we have $\mathfrak{H} = \mathfrak{S}_m$, and the theorem is trivial. Next we shall deal with the case $k = 2$, $m = 2n$. It has to be shown that $|S\mathfrak{H}|$ does not depend on $S$ ($S \in \mathfrak{S}_{2n}$). Let $\mathfrak{H}_1$ be the set of all permutations mapping the first system onto itself, and let $\mathfrak{H}_2$ be the set of those mapping the first system onto the second. Thus $\mathfrak{H} = \mathfrak{H}_1 + \mathfrak{H}_2$. Let $p$ be the number of objects of the first system mapped into the first system by $S$, then there are $q = n - p$ objects of the first system which are mapped into the second system. Then we have
$$|S\mathfrak{H}_1| = q\{(n-1)!\}^2.$$
This can be seen, for instance, by interpreting $|S\mathfrak{H}_1|$ as the number of circuits (see § 3) in the following graph. Take two vertices, $A$ and $B$, and $2n$ oriented edges: $p$ of them from $A$ to $A$, $p$ from $B$ to $B$, $q$ from $A$ to $B$ and $q$ from $B$ to $A$. The number of circuits can be shown to be $q\{(n-1)!\}^2$. It can be very rapidly determined by theorem 6, for there are exactly $q$ trees with root $A$. $\mathfrak{H}_2$ can be written as $S_0\mathfrak{H}_1$, where $S_0$ is an arbitrary element of $\mathfrak{H}_2$. Now $|S\mathfrak{H}_2| = |S_1\mathfrak{H}_1|$, where $S_1 = SS_0$. $S_1$ has the same nature as $S$, apart from the fact that $p$ and $q$ changed their roles. Hence $|S\mathfrak{H}_2| = p\{(n-1)!\}^2$, and so
$$|S\mathfrak{H}| = |S\mathfrak{H}_1| + |S\mathfrak{H}_2| = (p + q)\{(n-1)!\}^2 = (n!)^2/n.$$
This does not depend on $S$, and so our theorem has been proved in the case $k = 2$.
Next we consider the general case $k > 2$. We have to show that $|S_1\mathfrak{H}| = |S_2\mathfrak{H}|$ for any pair $S_1$, $S_2$ ($S_1 \in \mathfrak{S}_m$, $S_2 \in \mathfrak{S}_m$). Since any $S \in \mathfrak{S}_m$ can be written as a product of transpositions, it is sufficient to prove that $|S\mathfrak{H}| = |ST\mathfrak{H}|$ for all $S$ and for any transposition $T$. Or, what is the same thing, that
$$|\mathfrak{H}Q| = |T\mathfrak{H}Q| \qquad (2.1)$$
for any $Q \in \mathfrak{S}_m$ and any transposition $T \in \mathfrak{S}_m$. We may assume that $T$ interchanges two symbols belonging to the first and to the second system, respectively (if $T$ interchanges two symbols of the same system, then we have $\mathfrak{H} = T\mathfrak{H}$, and (2.1) is trivial). Let $\mathfrak{H}^*$ be the sub-group of $\mathfrak{H}$ consisting of all permutations of $\mathfrak{H}$ which leave all individual elements of the 3rd, 4th, ..., $k$-th system invariant, and let $\mathfrak{G}$ be the group arising from $\mathfrak{S}_m$ in the same manner. We now apply lemma 2, with $l = 2n$ and $\mathfrak{B} = \mathfrak{H}^*$. Since the theorem has been proved for the case $k = 2$, we know that $\overline{\mathfrak{H}^*}$ is a D-set in $\overline{\mathfrak{G}}$. Therefore
$$\frac{|\mathfrak{H}^*Q|}{n(\mathfrak{H}^*)} = \frac{|\mathfrak{G}Q|}{n(\mathfrak{G})}, \qquad \frac{|T\mathfrak{H}^*Q|}{n(\mathfrak{H}^*)} = \frac{|T\mathfrak{G}Q|}{n(\mathfrak{G})}.$$
Evidently $T\mathfrak{G} = \mathfrak{G}$, and so $|\mathfrak{H}^*Q| = |T\mathfrak{H}^*Q|$ for all $Q \in \mathfrak{S}_m$. Since $\mathfrak{H}^*$ is a sub-group of $\mathfrak{H}$, we can split $\mathfrak{H}$ into classes, $\mathfrak{H} = \sum \mathfrak{H}^*Q_i$, and now (2.1) follows immediately.
The order of the group $\mathfrak{H}$ is $k!\,(n!)^k$, and therefore
$$|\mathfrak{H}| = m^{-1}\,n(\mathfrak{H}) = m^{-1}\,k!\,(n!)^k = n^{-1}\,(k-1)!\,(n!)^k. \qquad (2.2)$$
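Theorem 1 lends itself to a direct check for small $k$ and $n$. The sketch below is an illustration under stated assumptions: permutations are stored as tuples acting on the objects $0, \dots, kn-1$, the systems are taken to be consecutive blocks of $n$ objects, and the names `is_cyclic`, `system_preserving` and `cyclic_counts` are hypothetical. It enumerates the group of system-preserving permutations and counts the cyclic permutations among the products $SH$ for every $S$.

```python
from itertools import permutations
from math import factorial

def is_cyclic(perm):
    """True if perm (a tuple mapping i -> perm[i]) is a single m-cycle."""
    x, steps = perm[0], 1
    while x != 0:
        x, steps = perm[x], steps + 1
    return steps == len(perm)

def system_preserving(k, n):
    """All permutations of k*n objects that transform systems into systems."""
    system = lambda x: x // n  # objects 0..n-1 form system 0, and so on
    result = []
    for h in permutations(range(k * n)):
        # The objects of one system must all land in one system.
        if all(len({system(h[x]) for x in range(s * n, (s + 1) * n)}) == 1
               for s in range(k)):
            result.append(h)
    return result

def cyclic_counts(k, n):
    """The set of values taken by the number of cyclic permutations in S*H."""
    group_h = system_preserving(k, n)
    assert len(group_h) == factorial(k) * factorial(n) ** k
    counts = set()
    for s in permutations(range(k * n)):
        # (S H)(x) = S(H(x)); by the remark in the text the other
        # order of multiplication gives the same count.
        counts.add(sum(is_cyclic(tuple(s[h[x]] for x in range(k * n)))
                       for h in group_h))
    return counts

print(cyclic_counts(2, 2))
print(cyclic_counts(3, 2))
```

A single value in each printed set means the count does not depend on $S$, and that value should agree with $n^{-1}(k-1)!\,(n!)^k$ from (2.2).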
Let $\mathfrak{K}$ be the set of all permutations $K$ with the property that the $n$ objects of each system are transformed into objects of $n$ different systems. In other words, $K$ is such that, if $a$ and $b$ belong to the same system, then $Ka$ and $Kb$ belong to different systems. Clearly $\mathfrak{K}$ is empty if $k < n$. It is not difficult to show that $\mathfrak{K}$ is a D-set. For, if $H$ is an arbitrary permutation of $\mathfrak{H}$, we have $\mathfrak{K} = \mathfrak{K}H$. It follows that $\mathfrak{K}$ is the sum of a number of left-classes mod $\mathfrak{H}$: $\mathfrak{K} = \sum K_i\mathfrak{H}$. Each component $K_i\mathfrak{H}$ is a D-set, by theorem 1. Hence $\mathfrak{K}$ is a D-set. It is easily seen that in the special case $k = n$ the number of elements in $\mathfrak{K}$ is $(n!)^{2n}$, and so we have
$$|\mathfrak{K}| = (n!)^{2n}\,n^{-2} \quad (k = n). \qquad (2.3)$$
§ 3. T-Graphs.
In §§ 3, 4, 5, 6 we shall be mainly concerned with a special type of finite oriented linear graphs, called T-graphs 1). These have the property that, at each vertex $P_i$, the number $\sigma_i$ of oriented edges pointing to $P_i$ equals the number of edges pointing away from $P_i$. For simplicity we assume $\sigma_i > 0$ for all $i$. If this number happens to be the same for all vertices ($\sigma_i = \sigma$ for all $i$) then we shall call the graph a $T^{(\sigma)}$ 2). We do not exclude the possibility that, in a T-graph, several different edges point from $P_i$ to $P_j$, and we neither exclude edges pointing from $P_i$ to $P_i$ itself (closed loops).
Therefore, a T-graph can be interpreted as a pair of mappings of a finite set of edges $\{e_1, \dots, e_m\}$ onto a finite set of vertices $\{P_1, \dots, P_N\}$ such that each vertex is the image of the same number of edges in both mappings. The first mapping maps every edge onto the point where it starts from, and the second one onto the point where it terminates. We shall call these vertices the tail and the head of the edge, respectively. If the head of $e_i$ coincides with the tail of $e_j$, then $e_i$ and $e_j$ will be called consecutive (which does not imply that $e_j$ and $e_i$ are consecutive).
By a complete circuit (a circuit for short) is meant any cyclic arrangement of the set of edges in such a manner that the head of each edge coincides with the tail of the next one in the circuit. Or, in other words, such that consecutive edges in the circuit are consecutive in the graph. Naturally, two circuits are considered as identical whenever the first one is a cyclic permutation of the second. It has to be understood that the order of the edges counts, and not only the order of the heads. So, for instance, if $m = 3$, $N = 1$, then $P_1$ is the head as well as the tail of all edges. There are two different circuits, viz. $(e_1, e_2, e_3)$ and $(e_1, e_3, e_2)$.
1) Tutte [5] calls them simple oriented networks.
2) In the paper [1] the name "T-net" denoted the same thing as $T^{(2)}$ does in our present notation.
The number of circuits of a graph $T$ will be denoted by $|T|$ 1). A permutation $P$ of the set of edges $e_1, \dots, e_m$ will be called conservative (with respect to $T$), whenever $Pe_i = e_j$ always implies that the head of $e_i$ coincides with the tail of $e_j$. We choose one special conservative permutation $A_0$, arbitrary, but fixed in the sequel. The set of all conservative permutations of $T$ can be represented as $\mathfrak{G}A_0$, where $\mathfrak{G}$ is the group of all permutations which leave the tails of all edges invariant. Evidently, any circuit determines a cyclic conservative permutation, and vice versa. Therefore,
$$|T| = |\mathfrak{G}A_0|.$$
This simple relation between the number of circuits in a graph and the number of cyclic permutations in a set explains why we choose the same notation $|\ |$ for both. Consider a vertex $P_i$ where $\sigma_i$ edges start and $\sigma_i$ edges terminate. By the local symmetric group $\mathfrak{G}_i$ we shall denote the group of all $\sigma_i!$ permutations which permute the $\sigma_i$ edges whose tail is $P_i$, but which leave invariant all edges whose tail is not $P_i$. Clearly $\mathfrak{G}$ is the direct product of $\mathfrak{G}_1, \dots, \mathfrak{G}_N$.
§ 4. Traffic regulations.
We shall also consider circuits described under certain restrictive conditions, called traffic regulations. Let $\mathfrak{B}_1, \dots, \mathfrak{B}_N$ be sub-sets of $\mathfrak{G}_1, \dots, \mathfrak{G}_N$, respectively, and construct the set
$$\mathfrak{B} = \mathfrak{B}_1 \times \mathfrak{B}_2 \times \dots \times \mathfrak{B}_N, \qquad (4.1)$$
defined in the same way as the direct product
$$\mathfrak{G} = \mathfrak{G}_1 \times \mathfrak{G}_2 \times \dots \times \mathfrak{G}_N. \qquad (4.2)$$
Now a circuit described under the traffic regulation $\mathfrak{B}$ is defined as a circuit corresponding to a permutation $BA_0$, where $B \in \mathfrak{B}$, and $A_0$ is the fixed permutation chosen in § 3. Denoting the number of circuits described under the traffic regulation $\mathfrak{B}$ by $|T|_{\mathfrak{B}}$, we have
$$|T| = |T|_{\mathfrak{G}}, \qquad |T|_{\mathfrak{B}} = |\mathfrak{B}A_0|. \qquad (4.3)$$
1) We have $|T| > 0$ if and only if $T$ is connected (see [2]). For non-connected graphs our theorems are trivial. Nevertheless, all our proofs are valid for that case also.
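The traffic regulation (4.1) will be called regular if, for each $i$, $\overline{\mathfrak{B}}_i$ is a D-set in $\overline{\mathfrak{G}}_i$ 1).
Theorem 2. 2) If $\mathfrak{B}$ is regular, then we have
$$\frac{1}{n(\mathfrak{B})}\,|T|_{\mathfrak{B}} = \frac{1}{n(\mathfrak{G})}\,|T|_{\mathfrak{G}}, \qquad (4.4)$$
where $n(\mathfrak{B})$ and $n(\mathfrak{G})$ denote the number of elements of $\mathfrak{B}$ and $\mathfrak{G}$, respectively.
Proof. Since $\mathfrak{G}_i$ itself satisfies the condition imposed on $\mathfrak{B}_i$, it is sufficient to show that the value of the left-hand-side of (4.4) does not change if some $\mathfrak{B}_i$ is replaced by the corresponding $\mathfrak{G}_i$. If this has been proved, we can replace all $\mathfrak{B}_i$'s by $\mathfrak{G}_i$'s one after the other, and (4.4) follows. To this end we consider $\mathfrak{B}$, defined by (4.1) and $\mathfrak{B}^*$, defined by
$$\mathfrak{B}^* = \mathfrak{G}_1 \times \mathfrak{B}_2 \times \mathfrak{B}_3 \times \dots \times \mathfrak{B}_N, \qquad (4.5)$$
and we have to show that
$$|T|_{\mathfrak{B}} : |T|_{\mathfrak{B}^*} = n(\mathfrak{B}_1) : n(\mathfrak{G}_1), \qquad (4.6)$$
for the latter ratio equals $n(\mathfrak{B}) : n(\mathfrak{B}^*)$. Referring to (4.3) we write
$$|T|_{\mathfrak{B}} = \sum |\mathfrak{B}_1 B_2 \cdots B_N A_0|, \qquad |T|_{\mathfrak{B}^*} = \sum |\mathfrak{G}_1 B_2 \cdots B_N A_0|, \qquad (4.7)$$
where, in both sums, $B_2, \dots, B_N$ run independently through the elements of $\mathfrak{B}_2, \dots, \mathfrak{B}_N$, respectively. If we put $B_2 \cdots B_N A_0 = Q$, then we have, by lemma 2,
$$|\mathfrak{B}_1 Q| : |\mathfrak{G}_1 Q| = n(\mathfrak{B}_1) : n(\mathfrak{G}_1).$$
Applying this to each pair of corresponding terms of the sums in (4.7), we obtain (4.6).
§ 5. Special traffic regulations.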
Let $T$ be a T-graph with $N$ vertices and $m$ edges. Again, the numbers of edges pointing towards $P_1, \dots, P_N$ are denoted by $\sigma_1, \dots, \sigma_N$, respectively, and so $m = \sigma_1 + \dots + \sigma_N$. Let $\lambda$ be a positive integer. Then by $T^\lambda$ 3) we denote the graph which arises from $T$ if we replace any edge $P_iP_j$ of $T$ by $\lambda$ edges $P_iP_j$, with the same orientation. Hence $T^\lambda$ has $N$ vertices and $\lambda m$ edges. The edges of $T^\lambda$ arising from one and the same edge of $T$ are said to form a bundle.
1) As in § 2, the bar indicates that the permutations are considered as permutations of the $\sigma_i$ edges whose tail is $P_i$, whereas the other edges are disregarded.
2) This theorem was used implicitly in [1].
3) The notations $T^\sigma$ and $T^{(\sigma)}$ (for the latter see § 3) must not be confused.
In $T^\lambda$ we shall consider several possible traffic regulations. We first choose a fixed conservative permutation $A_0$ which transforms bundles into bundles. A traffic regulation will be obtained by choosing, at each vertex $P_i$, a set $\mathfrak{B}_i$ of permutations of the $\lambda\sigma_i$ edges starting from that vertex. We shall consider three possibilities, all regular in the sense of § 4.
1°. $\mathfrak{B}_i = \mathfrak{G}_i$, where $\mathfrak{G}_i$ is the local symmetric group (of order $(\lambda\sigma_i)!$).
2°. $\mathfrak{B}_i = \mathfrak{H}_i$. Here $\mathfrak{H}_i$ is the sub-set of $\mathfrak{G}_i$ which transforms bundles into bundles. In other words, as to the edges whose tail is $P_i$ it acts like the group $\mathfrak{H}$ of theorem 1, where the systems are given by the bundles. Thus $n = \lambda$, $k = \sigma_i$.
3°. $\mathfrak{B}_i = \mathfrak{K}_i$. Here $\mathfrak{K}_i$ is the sub-set of $\mathfrak{G}_i$ consisting of the permutations which transform the edges of each outgoing bundle at $P_i$ into sets of edges belonging to $\lambda$ different bundles (see the end of § 2).
We have, by theorem 2,
$$\frac{|T^\lambda|_{\mathfrak{G}}}{\prod_{i=1}^{N} (\lambda\sigma_i)!} = \frac{|T^\lambda|_{\mathfrak{H}}}{\prod_{i=1}^{N} \sigma_i!\,(\lambda!)^{\sigma_i}} = \frac{|T^\lambda|_{\mathfrak{K}}}{\prod_{i=1}^{N} \varphi(\lambda, \sigma_i)}, \qquad (5.1)$$
where $\varphi(\lambda, \sigma_i)$ is the number of elements of $\mathfrak{K}_i$. As stated in § 4, we have $|T^\lambda|_{\mathfrak{G}} = |T^\lambda|$.
The number $|T^\lambda|_{\mathfrak{H}}$ can be connected with the number of circuits in $T$ itself. To this end we consider a circuit of $T^\lambda$ described according to the traffic regulation $\mathfrak{H}$. At any stage, the bundle to which an edge belongs only depends on the bundle containing the preceding edge. Therefore, the sequence of bundles described by the circuit is periodic mod $m$, and any bundle is used exactly $\lambda$ times. It follows that each circuit under consideration defines a circuit of $T$. Conversely, it is easily seen that each circuit of $T$ arises from $\lambda^{-1}(\lambda!)^m$ different circuits of $T^\lambda$ in this manner. Hence
$$|T^\lambda|_{\mathfrak{H}} = \lambda^{-1}(\lambda!)^m \cdot |T|, \qquad (5.2)$$
and so we obtain from (5.1)
Theorem 3.
$$|T^\lambda| = \lambda^{-1} \prod_{i=1}^{N} \frac{(\lambda\sigma_i)!}{\sigma_i!} \cdot |T|.$$
We shall now make the restriction that $T$ is a $T^{(\sigma)}$, and that $\lambda = \sigma$, that is to say
$$\sigma_1 = \dots = \sigma_N = \sigma = \lambda, \qquad m = N\sigma.$$
Then we have (see (2.3))
$$\varphi(\lambda, \sigma_i) = (\sigma!)^{2\sigma}$$
and now (5.1) and (5.2) lead to
$$|T^\sigma|_{\mathfrak{K}} = |T| \cdot \sigma^{-1}(\sigma!)^m \cdot \frac{(\sigma!)^{2\sigma N}}{\{\sigma!\,(\sigma!)^\sigma\}^N} = |T| \cdot \sigma^{-1}(\sigma!)^{N(2\sigma-1)}. \qquad (5.3)$$
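Theorem 3 can be tested on a small T-graph by counting circuits directly. The sketch below is a brute-force illustration, not the permutation-group argument of the text; the function names and the three-edge example graph are assumptions. Edges are stored as (tail, head) pairs, and a circuit is counted once by pinning the first edge of the list as its starting edge.

```python
from math import factorial, prod

def count_circuits(edges):
    """Count complete circuits of an oriented graph given as (tail, head) pairs.

    A circuit is a cyclic arrangement of all edges, so edge 0 is pinned
    as the starting edge to count each circuit exactly once.
    """
    m = len(edges)
    used = [False] * m
    used[0] = True

    def extend(last, placed):
        if placed == m:
            # The arrangement closes if the last head is the tail of edge 0.
            return 1 if edges[last][1] == edges[0][0] else 0
        total = 0
        for e in range(m):
            if not used[e] and edges[e][0] == edges[last][1]:
                used[e] = True
                total += extend(e, placed + 1)
                used[e] = False
        return total

    return extend(0, 1)

def multiply_edges(edges, lam):
    """The graph T^lambda: every edge replaced by a bundle of lam copies."""
    return [edge for edge in edges for _ in range(lam)]

def theorem3(edges, lam):
    """Right-hand side: lambda^-1 * prod (lambda*sigma_i)! / sigma_i! * |T|."""
    vertices = {v for edge in edges for v in edge}
    sigma = {v: sum(1 for tail, _ in edges if tail == v) for v in vertices}
    ratio = prod(factorial(lam * s) // factorial(s) for s in sigma.values())
    return ratio * count_circuits(edges) // lam

# Hypothetical T-graph: two vertices, sigma_0 = 2 and sigma_1 = 1.
graph = [(0, 1), (1, 0), (0, 0)]

for lam in (2, 3):
    print(lam, count_circuits(multiply_edges(graph, lam)), theorem3(graph, lam))
```

The two printed counts should coincide for each $\lambda$; the search grows quickly with the number of edges, so only very small graphs are practical.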
If T is a T(a), with N vertices and m = O'N edges, then by T* we denote the graph defined as follows. T* has m vertices Ev ... , Em. Two vertices E i , E; are connected in T* by an edge from Ei to E j if and only if ei , e; are consecutive in T. This process was considered in [1] for the case 0' = 2 only.
Theorem 4.

$$|T^*| = \sigma^{-1} (\sigma!)^{N(\sigma-1)} \cdot |T|.$$
Proof. By "$\sigma$-cycle in $T$" is meant a circular array containing each edge of $T$ exactly $\sigma$ times, such that two edges are consecutive in the array if and only if they are consecutive in $T$. A $\sigma$-cycle will be called restricted if it is such that any pair of consecutive edges of $T$ occurs just once as a pair of consecutive elements in the array. It will be clear that any restricted $\sigma$-cycle in $T$ defines uniquely a circuit in $T^*$, and vice versa.

The restricted $\sigma$-cycles in $T$ are closely related to the circuits in $T^\sigma$ described under the traffic regulation $\mathfrak{K}$. Actually, if we identify the edges of each bundle in $T^\sigma$, a $\mathfrak{K}$-circuit in $T^\sigma$ becomes a restricted $\sigma$-cycle, owing to the definition of $\mathfrak{K}$. Conversely, any restricted $\sigma$-cycle gives rise to a large number of $\mathfrak{K}$-circuits. Any bundle occurs $\sigma$ times in the cycle, and each time an arbitrary edge of the bundle can be chosen. So we see that $(\sigma!)^m$ different $\mathfrak{K}$-circuits arise from one restricted $\sigma$-cycle. Now the theorem follows from (5.3), since $m = N\sigma$.

In a T-graph which is not necessarily a $T(\sigma)$ we can still consider (unrestricted) $\sigma$-cycles 1). The number of different $\sigma$-cycles can be determined from theorem 3. A difficulty lies in the fact that a $\sigma$-cycle may be periodical with a period $md$, where $d$ is a proper divisor of $\sigma$, which could not happen with a restricted $\sigma$-cycle. If $c(\rho)$ denotes the number of those $\rho$-cycles in $T$ whose period is exactly $m\rho$, then we have obviously

$$|T^\rho| = \sum_{d \mid \rho} \frac{d}{\rho}\, c(d)\, (\rho!)^m.$$
1) And, if $T$ is a $T(\sigma)$, we can consider (unrestricted) $\rho$-cycles, for arbitrary values of $\rho$.
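Hence we obtain from Möbius' inversion formula,

$$c(\rho) = \sum_{d \mid \rho} \frac{d}{\rho}\, \mu\!\left(\frac{\rho}{d}\right) (d!)^{-m}\, |T^d|,$$

and so the number of unrestricted $\rho$-cycles equals

$$\sum_{d \mid \rho} c(d) = \sum_{d \mid \rho} \frac{d}{\rho}\, \varphi\!\left(\frac{\rho}{d}\right) (d!)^{-m}\, |T^d|,$$

where $\varphi$ is Euler's indicator. $|T^d|$ can be evaluated by theorem 3. Especially, if $T$ is a $T(\sigma)$, then the number of unrestricted $\rho$-cycles is

$$\frac{1}{\rho} \sum_{d \mid \rho} \varphi\!\left(\frac{\rho}{d}\right) \left( \frac{(\sigma d)!}{(d!)^{\sigma}\, \sigma!} \right)^{N} \cdot |T|.$$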
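The closing formula can be tested in the simplest case $N = 1$, where $T$ is a single vertex carrying $\sigma$ loops. Under that assumption every pair of edges is consecutive, $|T| = (\sigma - 1)!$, and a $\rho$-cycle is just a circular arrangement of $\sigma$ symbols, each used $\rho$ times. The sketch below compares the formula with a direct enumeration; the function names are hypothetical and the enumeration is only feasible for tiny $\sigma$ and $\rho$.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial, gcd

def euler_phi(n):
    """Euler's indicator: how many k in 1..n are coprime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def rho_cycles_formula(sigma, n_vertices, rho, circuits):
    """Number of unrestricted rho-cycles in a T(sigma) with |T| = circuits."""
    total = Fraction(0)
    for d in range(1, rho + 1):
        if rho % d == 0:
            block = Fraction(factorial(sigma * d),
                             factorial(d) ** sigma * factorial(sigma))
            total += euler_phi(rho // d) * block ** n_vertices
    result = total * circuits / rho
    assert result.denominator == 1  # the count must be a whole number
    return int(result)

def rho_cycles_bouquet(sigma, rho):
    """Brute force for one vertex with `sigma` loops (assumed test case).

    Each circular array is represented by its smallest rotation.
    """
    word = [loop for loop in range(sigma) for _ in range(rho)]
    representatives = set()
    for arrangement in set(permutations(word)):
        n = len(arrangement)
        representatives.add(
            min(arrangement[i:] + arrangement[:i] for i in range(n)))
    return len(representatives)

sigma, rho = 3, 2
circuits = factorial(sigma - 1)  # circuits of the bouquet of sigma loops
print(rho_cycles_formula(sigma, 1, rho, circuits), rho_cycles_bouquet(sigma, rho))
```

Exact rational arithmetic is used so that the final division by $\rho$ is checked rather than silently rounded.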
§ 6. Trees in T-graphs.

Let $T$ be a T-graph with $N$ vertices and $m$ edges. The number of edges whose tail is $P_i$ is again denoted by $\sigma_i$. Choose an arbitrary vertex; for convenience of notation we take it to be $P_1$. We shall define the notion: (oriented) tree with root $P_1$. A tree with root $P_1$ is a sub-set $A$ of the set of edges of $T$, with the following properties.

a. Any vertex $\neq P_1$ is the tail of just one element of $A$.
b. No element of $A$ has its tail in $P_1$.
c. Any vertex can be connected with $P_1$ by a set of consecutive edges, all belonging to $A$.

It is easily seen that c can be replaced by

c*. $A$ contains no closed oriented cycles.

There is a striking relation between trees and circuits. Choose a fixed edge $e_1$ whose tail is $P_1$, and consider an arbitrary circuit of $T$. We traverse it, starting with $e_1$. Running through the circuit, each vertex $P_i$ will be visited $\sigma_i$ times. The edge by which we leave $P_i$ after having visited it for the $\sigma_i$-th time will be called the last exit of $P_i$.

Theorem 5a. The set $A$ consisting of the last exits at $P_2, \dots, P_N$ is a tree with root $P_1$.
Proof. The properties a and b are trivial. We shall verify c*. We can number the edges of $T$ according to the order in the circuit, with indices $1, \dots, m$; $e_1$ gets the index 1. If $e_i$ and $e_j$ both belong to $A$, and if $e_i$ and $e_j$ are consecutive in $T$, then we have $i < j$. For, $e_{i+1}$ has the same tail as $e_j$, and $j$ is the maximal value of the indices of all the edges with this tail. Consequently, $A$ does not contain any closed cycle; the indices in such a cycle would increase indefinitely.

Theorem 5b. If a tree $A$ with root $P_1$ is given, and if $e_1$ is given, then there are exactly

$$\prod_{i=1}^{N} (\sigma_i - 1)! \tag{6.1}$$
circuits of $T$ whose set of last exits coincides with $A$.

Proof. At any vertex we number the outgoing edges 1), with the following restrictions: at $P_1$ the edge $e_1$ gets the number 1; at $P_i$ ($i > 1$) the edge belonging to $A$ gets the highest possible number, that is $\sigma_i$. The number of ways in which this can be arranged is expressed by (6.1). It remains to be shown that, for each numbering of this type, there exists a circuit (and not more than one circuit) corresponding with this numbering.

First, it will be clear that we have no choice at all if we try to traverse a circuit according to this numbering. Starting with $e_1$, we arrive at a vertex $P_2$, say. It is prescribed which outgoing edge we have to take first, etc. If we meet a vertex for a second time, we are forced to leave it by the edge bearing the number 2, and so on. The process has to stop somewhere, the graph being finite. The only reason why it should stop is that we arrive at a vertex where all outgoing edges have already been taken before. This must be $P_1$, for all other vertices have been entered at least as often as they have been left.

We can show that at this moment all edges of $T$ have been used, each exactly once of course, which means that a circuit has been described. Assume that a certain edge is vacant, that means that it has not yet been used. Considering its head, there is a vacant entry and hence there is a vacant exit. Especially, the exit belonging to $A$ has to be vacant, since it has the highest number. This vacant edge of $A$ leads into another vacant edge of $A$, and so on. By c, we eventually arrive at $P_1$, and we find that there is a vacant outgoing edge. This contradicts the fact that the process stopped.

From theorems 5a and 5b we immediately obtain

Theorem 6. The number of trees in $T$ with a given root is
$$|T| \cdot \left\{ \prod_{i=1}^{N} (\sigma_i - 1)! \right\}^{-1},$$

which does not depend on the vertex chosen as the root. As before, $|T|$ denotes the number of circuits of $T$.

1) This way of numbering is different from the one considered in the proof of theorem 5a.
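Theorem 6 lends itself to a direct numerical check on a small T-graph. The sketch below counts circuits and rooted trees by exhaustive search and compares them through the product in (6.1). It assumes parallel edges are distinct and circuits are counted up to cyclic rotation; `count_circuits` and `count_trees` are hypothetical helper names, not notation from the paper.

```python
from itertools import product
from math import factorial

def count_circuits(edges):
    """Count circuits using every edge once, up to cyclic rotation."""
    m = len(edges)
    out = {}
    for idx, (tail, _) in enumerate(edges):
        out.setdefault(tail, []).append(idx)
    used = [False] * m
    start = edges[0][0]

    def extend(vertex, count):
        if count == m:
            return 1 if vertex == start else 0
        total = 0
        for idx in out.get(vertex, []):
            if not used[idx]:
                used[idx] = True
                total += extend(edges[idx][1], count + 1)
                used[idx] = False
        return total

    used[0] = True  # fixing the first edge removes cyclic rotations
    return extend(edges[0][1], 1)

def count_trees(edges, root):
    """Count edge sets satisfying properties a, b and c for the given root."""
    vertices = sorted({v for edge in edges for v in edge})
    out = {v: [] for v in vertices}
    for tail, head in edges:
        out[tail].append(head)
    others = [v for v in vertices if v != root]
    total = 0
    # Property a: pick exactly one outgoing edge at every non-root vertex.
    for heads in product(*(out[v] for v in others)):
        succ = dict(zip(others, heads))
        ok = True
        for v in others:
            seen = set()
            while v != root and v not in seen:  # property c: follow to root
                seen.add(v)
                v = succ[v]
            if v != root:
                ok = False
                break
        total += ok
    return total

# Bundles of two parallel edges 0->1 and 1->0, plus two loops at vertex 0.
G = [(0, 1), (0, 1), (1, 0), (1, 0), (0, 0), (0, 0)]
sigma = {0: 4, 1: 2}
weight = 1
for s in sigma.values():
    weight *= factorial(s - 1)
print(count_circuits(G), [count_trees(G, r) * weight for r in (0, 1)])
```

Both roots give the same product, in line with the remark that the number of trees does not depend on the root.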
Theorem 6 furnishes a new proof of theorem 3. For, there is a simple relation between the number of trees in $T$ and in $T^\lambda$. Any tree in $T^\lambda$ gives rise to a tree in $T$, by the mapping $T^\lambda \to T$ which maps entire bundles of $T^\lambda$ into the corresponding edges of $T$. Conversely, in any bundle of $T^\lambda$ an edge can be chosen in $\lambda$ ways. Any tree in $T$ contains $N-1$ edges, and so we have

$$t(T^\lambda) = t(T) \cdot \lambda^{N-1}, \tag{6.2}$$

where $t(T)$ and $t(T^\lambda)$ denote the number of trees with a given root, in $T$ and $T^\lambda$, respectively. By theorem 6 we have

$$|T^\lambda| = t(T^\lambda) \cdot \prod_{i=1}^{N} (\lambda\sigma_i - 1)!, \tag{6.3}$$

$$|T| = t(T) \cdot \prod_{i=1}^{N} (\sigma_i - 1)!. \tag{6.4}$$
Theorem 3 follows from (6.2), (6.3) and (6.4).

§ 7. Trees in arbitrary oriented graphs.
We consider an oriented graph $G$, with $N$ vertices $P_1, \dots, P_N$. We no longer require that it is a T-graph, that is to say, the number of edges starting from $P_i$ need not be the same as the number of edges pointing towards $P_i$. Again, we can consider (oriented) trees, with a given root. Tutte [5] showed that the number of trees in $G$ with a given root can be interpreted as the value of a certain determinant. Since his result is in several ways connected with the results of the present paper, we give a full account of his theorem, with a new proof.

Let $(a_{ij})$ ($i, j = 1, \dots, N$) be the following matrix. If $i \neq j$, then $a_{ij} = -b_{ij}$, where $b_{ij}$ denotes the number of oriented edges from $P_i$ to $P_j$ ($P_i$ is the tail and $P_j$ is the head of these edges). Further $a_{ii}$ is such that

$$\sum_{j=1}^{N} a_{ij} = 0.$$

Theorem 7 (Tutte). The number of trees with the given root $P_i$ equals the minor of $a_{ii}$ in the matrix $(a_{ij})$.

Proof. For simplicity of notation we take $i = 1$. We first consider a special graph, where each vertex $\neq P_1$ is the tail of just one edge, and where no edge leaves $P_1$. This graph is either a tree or it is not; the possibility of constructing more than one tree in this graph does not exist. We shall show that the minor of $a_{11}$ is 1 or 0 according to whether the graph is or is not a tree.

First assume that the graph is a tree. We shall apply induction with respect to $N$; for $N = 2$ the result is trivial. Take $N > 2$. There is at least one vertex which is not the head of an edge. This is the case with $P_2$, say. Then the second column of the matrix reads $0, 1, 0, \dots, 0$. Hence the value of the minor of $a_{11}$ is not altered if the second row and the second column are both cancelled. The new matrix corresponds to the graph which results by cancelling $P_2$ and the edge starting from $P_2$. This new graph is still a tree, and the induction is completed.

Next assume that the graph is not a tree. Then it shows somewhere a cycle of edges not containing $P_1$. For example, let the cycle consist of the edges $P_2P_3$, $P_3P_4$, $P_4P_2$. Then, in the matrix, the 2nd, 3rd, and 4th row are linearly dependent, for their sum vanishes. It follows that the minor of $a_{11}$ equals zero. This completes the proof of the theorem for our special graph.

The general case is easily reduced to this one by repeated application of the following operation. Divide the set of edges starting from a certain vertex, $P_2$ say, into two groups. Now construct two graphs; the first one arises from the original graph by cancelling the edges of the first group, the second one by cancelling the edges of the second group. The matrices of the graphs are such that the second row of the original matrix equals the sum of the corresponding rows in the new matrices; all other rows are identical in the three matrices. Therefore, the minor of $a_{11}$ in the original matrix is the sum of the minors of $a_{11}$ in the new matrices. On the other hand, the number of trees in the original graph is the sum of the numbers of trees in both graphs, since every tree contains just one edge starting from $P_2$. This proves the theorem.
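Theorem 7 can be illustrated numerically. The sketch below builds the matrix $(a_{ij})$ from an edge list, evaluates the minor of $a_{ii}$ with exact rational elimination, and compares it with an exhaustive count of trees. Vertices are assumed to be numbered $0, \dots, N-1$, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return 0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap changes the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return int(result)

def tree_minor(n, edges, root):
    """Minor of a_ii (i = root) in the matrix of Theorem 7."""
    b = [[0] * n for _ in range(n)]
    for tail, head in edges:
        b[tail][head] += 1
    # Off-diagonal entries are -b_ij; the diagonal makes each row sum zero.
    a = [[sum(b[i]) - b[i][i] if i == j else -b[i][j] for j in range(n)]
         for i in range(n)]
    minor = [[a[i][j] for j in range(n) if j != root]
             for i in range(n) if i != root]
    return determinant(minor)

def reaches_root(v, succ, root):
    seen = set()
    while v != root and v not in seen:
        seen.add(v)
        v = succ[v]
    return v == root

def count_trees(n, edges, root):
    """Exhaustive count: one outgoing edge per non-root vertex, no cycles."""
    out = [[] for _ in range(n)]
    for tail, head in edges:
        out[tail].append(head)
    others = [v for v in range(n) if v != root]
    total = 0
    for heads in product(*(out[v] for v in others)):
        succ = dict(zip(others, heads))
        if all(reaches_root(v, succ, root) for v in others):
            total += 1
    return total

# An oriented graph that is not a T-graph (vertex 1 has in-degree 3).
G = [(0, 1), (0, 1), (1, 2), (2, 0), (2, 1), (1, 0)]
print([tree_minor(3, G, r) for r in range(3)])
print([count_trees(3, G, r) for r in range(3)])
```

For this graph both lines print `[3, 4, 2]`, so the number of trees here does depend on the root, as expected outside the T-graph case.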
Theorem 6 shows that in a T-graph the number of trees does not depend on the choice of the root. Tutte deduced the same fact from theorem 7. We repeat his argument. Assume that the graph considered in theorem 7 is a T-graph. Then we have $a_{ii} = \sigma_i - \varrho_i$, where $\varrho_i$ is the number of edges from $P_i$ to $P_i$. Therefore, we also find that the sum of the elements in each column of the matrix is equal to zero. It is a well-known fact that if in a square matrix the sum of the elements in each row and in each column vanishes, then the cofactors of all elements have the same value. Especially, the minor of $a_{ii}$ does not depend on $i$.
We again consider an arbitrary graph $G$, which need not be a T-graph. Let $P_1, \dots, P_N$ be its vertices, and let $\sigma_i$ be the number of edges starting from $P_i$, and $\tau_i$ the number of edges pointing towards $P_i$. Furthermore, $b_{ij}$ denotes the number of oriented edges from $P_i$ to $P_j$. Hence

$$\sigma_i = \sum_j b_{ij}, \qquad \tau_j = \sum_i b_{ij}.$$

Next we consider a permutation $S$ of the $N$ objects $1, 2, \dots, N$. Let $G_S$ be the graph arising from $G$ in the following manner: replace each edge $P_iP_j$ of $G$ by an edge $P_iP_{Sj}$, where $Sj$ is the result of $S$ applied to the object $j$. Therefore, if the analogues of $\sigma_i$, $\tau_i$, $b_{ij}$ for the graph $G_S$ are denoted by $\sigma_i^{(S)}$, $\tau_i^{(S)}$, $b_{ij}^{(S)}$, respectively, then we have

$$\sigma_i^{(S)} = \sigma_i, \qquad \tau_{Sj}^{(S)} = \tau_j, \qquad b_{i,Sj}^{(S)} = b_{ij}. \tag{7.1}$$
Let $t_i(G_S)$ denote the number of oriented trees in $G_S$ whose root is $P_i$, and let $\mathfrak{S}_N$ denote the group of all $N!$ permutations of the objects $1, \dots, N$. Then we have

Theorem 8.

$$\sum_{S \in \mathfrak{S}_N} t_i(G_S) = (N-1)! \prod_{k \neq i} \sigma_k.$$
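Proof. We may and do assume $i = 1$. We shall apply theorem 7. To this end, we consider the matrix

$$M_S = \left( \lambda_i \delta_{ij} - b_{ij}^{(S)} \right) \qquad (i, j = 1, \dots, N),$$

where $\delta_{ij}$ is Kronecker's symbol. Its determinant $\det M_S$ is a multilinear polynomial in the variables $\lambda_1, \dots, \lambda_N$: $\det M_S = f_S(\lambda_1, \dots, \lambda_N)$, and it will be clear from theorem 7 that

$$t_1(G_S) = \frac{\partial}{\partial \lambda_1} f_S(\lambda_1, \sigma_2, \sigma_3, \dots, \sigma_N). \tag{7.2}$$

We put

$$P(\lambda_1, \dots, \lambda_N) = \sum_{S \in \mathfrak{S}_N} f_S(\lambda_1, \dots, \lambda_N). \tag{7.3}$$

In the first place we can show that $P(\lambda_1, \dots, \lambda_N)$ does not contain terms of degree $< N - 1$. For instance, consider the term with $\lambda_3 \lambda_4 \cdots \lambda_N$, which does not contain either $\lambda_1$ or $\lambda_2$. Let $T$ be the transposition of the objects 1 and 2. Then the coefficient of $\lambda_3 \cdots \lambda_N$ in $\det M_{TS}$ is easily seen to be the opposite of the coefficient of $\lambda_3 \cdots \lambda_N$ in $\det M_S$. If $S$ runs through $\mathfrak{S}_N$, then $TS$ does the same, and so the coefficient of $\lambda_3 \cdots \lambda_N$ in $P(\lambda_1, \dots, \lambda_N)$ turns out to be zero.

We next deal with the terms of degree $N - 1$, and therefore we consider $\lambda_1 \lambda_2 \cdots \lambda_{i-1} \lambda_{i+1} \cdots \lambda_N$. Its coefficient in $f_S$ equals $-b_{ii}^{(S)}$. Consequently, its coefficient in $P(\lambda_1, \dots, \lambda_N)$ is

$$-\sum_{S \in \mathfrak{S}_N} b_{ii}^{(S)} = -\sum_{S \in \mathfrak{S}_N} b_{i, S^{-1}i} = -(N-1)! \sum_j b_{ij} = -(N-1)!\, \sigma_i.$$

Finally, the coefficient of $\lambda_1 \cdots \lambda_N$ in $f_S$ equals 1, and in $P(\lambda_1, \dots, \lambda_N)$ it is $N!$. Thus we have proved that

$$P(\lambda_1, \dots, \lambda_N) = \lambda_1 \cdots \lambda_N \cdot (N-1)! \left\{ N - \sum_i \sigma_i / \lambda_i \right\}.$$

From (7.2) and (7.3) we now deduce

$$\sum_{S \in \mathfrak{S}_N} t_1(G_S) = \frac{\partial}{\partial \lambda_1} P(\lambda_1, \sigma_2, \dots, \sigma_N) = \sigma_2 \cdots \sigma_N \cdot (N-1)!.$$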
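A small exhaustive computation makes Theorem 8 concrete. The sketch below forms every $G_S$ by redirecting the head of each edge through $S$, counts the trees with a fixed root in each, and compares the total with $(N-1)! \prod_{k \neq i} \sigma_k$. Vertices are assumed to be numbered from 0, and all function names are hypothetical.

```python
from itertools import permutations, product
from math import factorial, prod

def count_trees(n, edges, root):
    """Exhaustive count of oriented trees with the given root."""
    out = [[] for _ in range(n)]
    for tail, head in edges:
        out[tail].append(head)
    others = [v for v in range(n) if v != root]
    total = 0
    for heads in product(*(out[v] for v in others)):
        succ = dict(zip(others, heads))
        ok = True
        for v in others:
            seen = set()
            while v != root and v not in seen:
                seen.add(v)
                v = succ[v]
            if v != root:
                ok = False
                break
        total += ok
    return total

def permuted_graph(edges, s):
    """G_S: every edge P_i P_j is replaced by P_i P_{Sj}."""
    return [(tail, s[head]) for tail, head in edges]

def tree_sum(n, edges, root):
    """Left-hand side of Theorem 8."""
    return sum(count_trees(n, permuted_graph(edges, s), root)
               for s in permutations(range(n)))

def theorem8_prediction(n, edges, root):
    """Right-hand side of Theorem 8."""
    sigma = [0] * n
    for tail, _ in edges:
        sigma[tail] += 1
    return factorial(n - 1) * prod(sigma[k] for k in range(n) if k != root)

G = [(0, 1), (1, 2), (1, 0), (1, 1), (2, 0)]  # out-degrees 1, 3, 1
print(tree_sum(3, G, 0), theorem8_prediction(3, G, 0))
```

Only the out-degrees enter the prediction, which is why the sum is insensitive to where the edges of $G$ originally pointed.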
Theorem 8 is, in some sense, a generalization of theorem 1. For, if we apply theorem 8 to a graph which is a $T(\sigma)$, then we have, by theorem 6,

$$\sum_{S \in \mathfrak{S}_N} |G_S| = (N-1)!\, \sigma^{N-1} \{(\sigma-1)!\}^N = (N-1)! \cdot \frac{1}{\sigma} \cdot (\sigma!)^N. \tag{7.4}$$
It is not difficult to see that (7.4) is equivalent with theorem 1 (take $n = \sigma$, $k = N$).

Note added in proof. By theorems 6 and 7 the number of circuits in a T-graph can be expressed as a determinant. For the special case that $T$ is a $T(2)$, this result was announced by W. T. TUTTE and C. A. B. SMITH (On unicursal paths in a network of degree 4, Amer. Math. Monthly 48, 233-237 (1941)).
REFERENCES

1. N. G. DE BRUIJN. A combinatorial problem. Nederl. Akad. Wetensch., Proc. 49, 758-764 (1946) = Indagationes Math. 8, 461-467 (1946).
2. I. J. GOOD. Normal recurring decimals. J. London Math. Soc. 21, 167-169 (1947).
3. M. H. MARTIN. A problem in arrangements. Bull. Amer. Math. Soc. 40, 859-864 (1934).
4. D. REES. Note on a paper by I. J. GOOD. J. London Math. Soc. 21, 169-172 (1947).
5. W. T. TUTTE. The dissection of equilateral triangles into equilateral triangles. Proc. Cambridge Phil. Soc. 44, 463-482 (1948).
Reprinted from Simon Stevin 28 (1951), 203-217
THE FACTORS OF GRAPHS

W. T. TUTTE
1. Introduction. A graph $G$ consists of a non-null set $V$ of objects called vertices together with a set $E$ of objects called edges, the two sets having no common element. With each edge there are associated just two vertices, called its ends. Two or more edges may have the same pair of ends.

$G$ is finite if both $V$ and $E$ are finite, and infinite otherwise. The degree $d_G(a)$ of a vertex $a$ of $G$ is the number of edges of $G$ which have $a$ as an end. $G$ is locally finite if the degree of each vertex of $G$ is finite. Thus the locally finite graphs include the finite graphs as special cases.

A subgraph $H$ of $G$ is a graph contained in $G$. That is, the vertices and edges of $H$ are vertices and edges of $G$, and an edge of $H$ has the same ends in $H$ as in $G$. A restriction of $G$ is a subgraph of $G$ which includes all the vertices of $G$.

A graph is said to be regular of order $n$ if the degree of each of its vertices is $n$. An $n$-factor of a graph $G$ is a restriction of $G$ which is regular of order $n$.

The problem of finding conditions for the existence of an $n$-factor of a given graph has been studied by various authors [3; 4; 5]. It has been solved, in part, by Petersen for the case in which the given graph is regular. The author has given a necessary and sufficient condition that a given locally finite graph shall have a 1-factor [6; 7]. In this paper we establish a necessary and sufficient condition that a given locally finite graph shall have an $n$-factor, where $n$ is any positive integer. Actually we obtain a more general result. We suppose given a function $f$ which associates with each vertex $a$ of a given locally finite graph $G$ a positive integer $f(a)$, and obtain a necessary and sufficient condition that $G$ shall have a restriction $H$ such that $d_H(a) = f(a)$ for each vertex $a$ of $G$. The discussion is based on the method of alternating paths introduced by Petersen [4].
We also consider the problem of associating a non-negative integer with each edge of $G$ so that for each vertex $c$ of $G$ the numbers assigned to the edges having $c$ as an end sum to $f(c)$. We obtain a necessary and sufficient condition for the solubility of this problem.

My attention has been drawn to two other papers in which similar theories of factorization have been put forward. In one of these papers, Gallai [1] gives a valuable unified theory of factors and gives some new results on the factorization of regular graphs. He also claims to have obtained a necessary and sufficient condition for the existence of a 2-factor in a general locally finite graph, but leaves the discussion of this for another occasion. In the other paper Belck [1] establishes a necessary and sufficient condition for the existence of an $n$-factor in a general finite graph, where $n$ is any positive integer. Prominent in his theory is the hyper-$n$-prime graph, a generalization of the hyperprime graph introduced in [6].

Received February 20, 1951.

2. Recalcitrance. A path in a graph $G$ is a finite sequence

$$P = (a_1, A_1, a_2, A_2, \dots, A_{r-1}, a_r) \tag{1}$$

satisfying the following conditions:
(i) The members of $P$ are alternately vertices and edges of $G$, the terms $a_1, a_2, \dots, a_r$ being vertices.
(ii) If $1 \le i < r$, then $a_i$ and $a_{i+1}$ are the two ends of $A_i$.
We say that $P$ is a path from $a_1$ to $a_r$, and that its length is $r - 1$. We note that the terms of $P$ need not be all distinct. We admit the case in which $P$ has length 0. Then $P$ has just one term, a vertex of $G$.

The vertices $x$ and $y$ of $G$ are connected in $G$ if a path from $x$ to $y$ in $G$ exists. If this is so for each pair $\{x, y\}$ of vertices of $G$, then $G$ is connected. The relation of being connected in $G$ is evidently an equivalence relation. It therefore partitions $G$ into a set $\{G_\alpha\}$ of connected graphs such that each edge or vertex of $G$ belongs to some $G_\alpha$ and no two of the $G_\alpha$ have any edge or vertex in common. We call the graphs $G_\alpha$ the components of $G$.

If $S$ is any proper subset of the set of vertices of a given graph $G$, we denote by $G(S)$ the subgraph of $G$ obtained by suppressing the members of $S$ and all edges of $G$ having one or both ends in $S$.

Suppose now that $G$ is locally finite and that $S$ is a finite set of vertices of $G$. If $S$ does not include all the vertices of $G$ the graph $G(S)$ is defined. Then if $H$ is any finite component of $G(S)$ we denote the number of edges which have one end in $S$ and the other a vertex of $H$ by $v(H)$. We have
$$v(H) + \sum_{c \in H} d_G(c) \equiv 0 \pmod 2, \tag{2}$$
for the expression on the left is equal to twice the number of edges of $G$ having an end which is a vertex of $H$. (We have used the symbol $c \in H$ to denote that $c$ is a vertex of $H$.) We denote by $K(G, S)$ the set of all finite components $H$ of $G(S)$ which satisfy
$$v(H) + \sum_{c \in H} f(c) \equiv 1 \pmod 2. \tag{3}$$
If $K(G, S)$ is finite we denote the number of its elements by $k(G, S)$. If $S$ includes all the vertices of $G$ we write $k(G, S) = 0$. In either case we write
$$r(G, S) = k(G, S) + \sum_{c \in S} (f(c) - d_G(c)). \tag{4}$$
We call $r(G, S)$ the recalcitrance of $G$ with respect to $S$. If $K(G, S)$ is infinite we say that $r(G, S)$ is infinite.

THEOREM I. If $G$ is finite, $r(G, S)$ is even or odd according as $\sum_{c \in G} f(c)$ is even or odd.
Proof. By (2), (3), and (4),

$$r(G, S) \equiv \sum_{c \in G} d_G(c) + \sum_{c \in G} f(c) \pmod 2.$$
But the sum of the degrees of the vertices of $G$ is even, since it is twice the number of edges of $G$. The theorem follows.

The locally finite graph $G$ is constricted with respect to $f$ if there exist disjoint finite sets $S$ and $T$ of vertices of $G$ such that

$$\sum_{c \in T} f(c) < r(G(T), S). \tag{5}$$
As an example, $G$ is constricted if it has a vertex $a$ such that $d_G(a) < f(a)$. In this case (5) is satisfied if $T$ is null and $S$ has the single element $a$. Again, $G$ is constricted if $r(G, S) > 0$ for any set $S$ of vertices of $G$, for then (5) is satisfied with $T$ null. So by Theorem I a finite graph $G$ is constricted if the sum of the numbers $f(c)$, for all the vertices $c$ of $G$, is odd. In this case (5) is satisfied if $S$ and $T$ are both null.

We define an $f$-factor of the given locally finite graph $G$ as a restriction $F$ of $G$ such that $d_F(c) = f(c)$ for each vertex $c$ of $G$. Similarly, a restriction $F$ of a subgraph $X$ of $G$ is an $f$-factor of $X$ if $d_F(c) = f(c)$ for each vertex $c$ of $X$. A restriction $F$ of a subgraph $X$ of $G$ is an incomplete $f$-factor of $X$ if $d_F(c) \le f(c)$ for each vertex $c$ of $X$, and $d_F(c) = f(c)$ for all but a finite number of the vertices of $X$. The deficiency of such an incomplete $f$-factor is the sum

$$\sum (f(c) - d_F(c))$$

taken over all vertices $c$ of $X$ for which $d_F(c) < f(c)$.

Our object in this paper is to show that $G$ has no $f$-factor if and only if $G$ is constricted with respect to $f$.

THEOREM II. Let $F$ be an incomplete $f$-factor of $G$, and let $S$ be any finite set of vertices of $G$. Then the deficiency of $F$ is not less than $r(G, S)$.
Proof. If $H$ is any member of $K(G, S)$, let $w(H)$ be the number of edges of $F$ which have one end in $S$ and the other a vertex of $H$. Analogously with (2) we have

$$\sum_{c \in H} d_F(c) \equiv w(H) \pmod 2. \tag{6}$$
Let $P$ be the set of all elements $H$ of $K(G, S)$ such that $d_F(c) = f(c)$ for each vertex $c$ of $H$. Let $Q$ be the set of all other members of $K(G, S)$. Let the numbers of members of $P$ and $Q$ be $p$ and $q$ respectively; $q$ must be finite. The sum of the numbers $f(c) - d_F(c)$ taken over all vertices of $G$ not in $S$ which satisfy $d_F(c) < f(c)$ is at least $q$. If $H \in P$, then by (3) and (6), $v(H) \neq w(H)$. Hence at least $p$ of the edges of $G$ having just one end in $S$ are not edges of $F$. It follows that
$$\sum_{c \in S} (f(c) - d_F(c)) \ge p + \sum_{c \in S} (f(c) - d_G(c)).$$

Hence if $D$ is the deficiency of $F$ we have

$$D \ge p + q + \sum_{c \in S} (f(c) - d_G(c)) = r(G, S).$$
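For a finite graph the quantities of this section can be computed by exhaustive search, which gives a way to experiment with the definitions. The sketch below is an illustration only: it evaluates the recalcitrance $r(G, S)$ from (3) and (4), tests condition (5) over all disjoint pairs $S$, $T$ (with $T$ a proper subset of the vertices so that $G(T)$ is defined), and finds the least deficiency over all restrictions. Edges are assumed to be pairs of distinct vertices, and every function name is hypothetical.

```python
from itertools import combinations, product

def degree(v, edges):
    return sum(1 for e in edges if v in e)

def remove(vertices, edges, removed):
    """G(S): suppress the vertices in `removed` and every edge touching them."""
    keep = [v for v in vertices if v not in removed]
    return keep, [e for e in edges
                  if e[0] not in removed and e[1] not in removed]

def components(vertices, edges):
    adj = {v: set() for v in vertices}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, comps = set(), []
    for v in vertices:
        if v in seen:
            continue
        stack, comp = [v], set()
        while stack:
            x = stack.pop()
            if x not in comp:
                comp.add(x)
                stack.extend(adj[x] - comp)
        seen |= comp
        comps.append(comp)
    return comps

def recalcitrance(vertices, edges, f, S):
    """r(G, S) = k(G, S) + sum over c in S of (f(c) - d_G(c))."""
    S = set(S)
    rest_v, rest_e = remove(vertices, edges, S)
    k = 0
    for H in components(rest_v, rest_e):
        v_H = sum(1 for a, b in edges
                  if (a in S and b in H) or (b in S and a in H))
        if (v_H + sum(f[c] for c in H)) % 2 == 1:  # condition (3)
            k += 1
    return k + sum(f[c] - degree(c, edges) for c in S)

def subsets(items):
    items = list(items)
    for size in range(len(items) + 1):
        yield from combinations(items, size)

def is_constricted(vertices, edges, f):
    """Search for disjoint S, T satisfying (5)."""
    for T in subsets(vertices):
        if len(T) == len(vertices):
            continue  # G(T) must keep at least one vertex
        g_v, g_e = remove(vertices, edges, set(T))
        for S in subsets(g_v):
            if sum(f[c] for c in T) < recalcitrance(g_v, g_e, f, S):
                return True
    return False

def min_deficiency(vertices, edges, f):
    """Least deficiency over all restrictions F with d_F(c) <= f(c)."""
    best = None
    for mask in product([0, 1], repeat=len(edges)):
        chosen = [e for e, bit in zip(edges, mask) if bit]
        deg = {v: degree(v, chosen) for v in vertices}
        if all(deg[v] <= f[v] for v in vertices):
            d = sum(f[v] - deg[v] for v in vertices)
            best = d if best is None else min(best, d)
    return best

def has_f_factor(vertices, edges, f):
    return min_deficiency(vertices, edges, f) == 0

star_v, star_e = [0, 1, 2, 3], [(0, 1), (0, 2), (0, 3)]
ones = {v: 1 for v in star_v}
print(is_constricted(star_v, star_e, ones), has_f_factor(star_v, star_e, ones))
```

For the star with $f \equiv 1$, taking $T$ to be the centre and $S$ null gives $r(G(T), S) = 3 > 1$, so the graph is constricted and has no 1-factor. The running time is exponential, so this is a tool for checking small examples rather than an algorithm.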
THEOREM III. If $G$ is constricted with respect to $f$, it has no $f$-factor.
Proof. Suppose $G$ is constricted. Then there are disjoint finite subsets $S$ and $T$ of the set of vertices of $G$ such that (5) is satisfied. Assume $G$ has an $f$-factor $F$. Then $F(T)$ is an incomplete $f$-factor of $G(T)$. Its deficiency $D$ is equal to the number $n$ of edges of $F$ having one end in $T$ and the other not in $T$. Hence, by Theorem II,

$$\sum_{c \in T} f(c) \ge n = D \ge r(G(T), S).$$

This contradicts the definition of $S$ and $T$.

3. Alternating paths. An $f$-subgraph of $G$ is a restriction $J$ of $G$ having the following properties:
(i) The number of edges of $J$ is finite.
(ii) $d_J(c) \le f(c)$ for each vertex $c$ of $G$.
A vertex $c$ of $G$ is deficient in $J$ if $d_J(c) < f(c)$.

Let us suppose that we are given an $f$-subgraph $J$ of $G$ and that $a$ is a vertex of $G$ which is deficient in $J$. Following a long-established tradition we refer to an edge of $G$ as blue or red according as it is or is not an edge of $J$.

An alternating path based on $a$ is a path $P$ in $G$ which satisfies the following conditions:
(i) The first term of $P$ is $a$.
(ii) No edge of $G$ occurs twice as a term of $P$.
(iii) If $P$ has more than one term the edges of $G$ which occur in $P$ are alternately red and blue, the first one being red.

If $P$ includes the subsequence $(c, C, d)$ where $C$ is an edge of $G$, we say that $P$ passes through $c$ and then $C$, or $P$ passes through $C$ and then $d$. Let $\Pi(a)$ be the set of alternating paths based on $a$; $\Pi(a)$ is not null since it has one member whose only term is $a$.

Let $C$ be an edge of $G$, with ends $c$ and $d$. If no member of $\Pi(a)$ has $C$ as a term, $C$ is acursal. If some member of $\Pi(a)$ passes through $C$ and then $d$, $C$ is describable to $d$ or from $c$. If $C$ is describable to $d$ but not to $c$, $C$ is unicursal to $d$ or from $c$. If $C$ is describable both to $c$ and to $d$, $C$ is bicursal.

A vertex of $G$ is accessible from $a$ if it is a term of some member of $\Pi(a)$. The vertex $a$ is singular if no deficient vertex of $G$, other than $a$ itself, is accessible from $a$.

THEOREM IV. Only a finite number of vertices of $G$ are accessible from $a$.
If $b$ is a vertex of $G$ accessible from $a$, then either $b = a$, or $b$ is an end of a blue edge, or $b$ is an end of an edge $B$ whose other end is either $a$ or an end of a blue edge. Since the number of blue edges and the degree of each vertex of $G$ are finite, the theorem follows.

THEOREM V. Let $A$ and $B$ be edges of $G$ which are of different colours and have a common end $x$. Suppose $A$ is unicursal to $x$. Then $B$ is describable from $x$.
There is a member $P$ of $\Pi(a)$ which passes through $A$ and then $x$. If $B$ is not a term of $P$ preceding $A$ there is evidently a member of $\Pi(a)$ which agrees with $P$ as far as $A$ and continues $(x, B, \dots)$. Then $B$ is describable from $x$. If $B$ precedes $A$ in $P$, either the theorem is satisfied or $P$ passes through $B$ and then $x$. In the latter case there is a member of $\Pi(a)$ which agrees with $P$ as far as $B$ and continues $(x, A, \dots)$. Then $A$ is not unicursal to $x$, contrary to hypothesis.

4. Bicursal components. Let us suppose that $G$ has at least one bicursal edge. The bicursal edges of $G$, with their ends, define a subgraph of $G$. We refer to the components of this subgraph as the bicursal components.

THEOREM VI. The bicursal components are finite graphs.
This follows from Theorem IV, since the vertices of a bicursal component are all accessible from $a$ and $G$ is locally finite.

Let $L$ denote any bicursal component. An entrant of $L$ is any member of $\Pi(a)$ which has a vertex of $L$ as a term. If $P$ is an entrant of $L$ we denote by $e(P)$ the vertex of $L$ which occurs first as a term of $P$. We then say that $P$ enters $L$ at $e(P)$. A vertex of $L$ at which some entrant of $L$ enters $L$ is an entrance of $L$.

Let $P$ be an entrant of $L$. Let $A$ be the first edge of $G$ in $P$ after the first occurrence of $e(P)$ which is not in $L$, if such an edge exists. The section of $P$ by $L$ is defined as follows. If the edge $A$ exists, the section is the part of $P$ extending from the first occurrence of $e(P)$ to the term immediately preceding $A$. Otherwise, the section is the part of $P$ extending from the first occurrence of $e(P)$ to the last term of $P$. In either case the section is an alternating path based on $e(P)$ and having only edges and vertices of $L$ as terms (except that its first edge may be blue). If $e$ is any entrance of $L$ we denote by $\Delta(e)$ the set of sections by $L$ of those members of $\Pi(a)$ which enter $L$ at $e$.

Since the edges of $L$ are not acursal, $L$ has at least one entrance. If $a$ is a vertex of $L$ then $a$ is an entrance of $L$. In the following series of theorems (VII-XI) we suppose that some entrance $e$ of $L$ is specified, with the proviso that $e$ is $a$ if $a$ is a vertex of $L$.

THEOREM VII. There exists an edge of $L$ which is a term of some member of $\Delta(e)$.
Proof. Suppose first that $e$ is $a$. Any red edge of $L$ having $a$ as an end is clearly a term of a member of $\Delta(a)$. Suppose therefore that the edges of $L$ having $a$ as an end are all blue. Each of these is describable from $a$, and no one is the first edge of a member of $\Pi(a)$. Hence some red edge $C$ having $a$ as an end is describable to $a$. But all red edges having $a$ as an end are describable from $a$. Hence $C$ is bicursal and therefore an edge of $L$, contrary to supposition.

Now consider the case in which $a$ is not a vertex of $L$. Let $P$ be an entrant of $L$ such that $e(P) = e$. Let $C$ be the edge of $G$ which immediately precedes the first occurrence of $e$ in $P$. Then $C$ is unicursal to $e$. Any edge of $L$ having $e$ as an end and differing in colour from $C$ is clearly a term of a member of $\Delta(e)$. Suppose therefore that the edges of $L$ having $e$ as an end all have the same colour as $C$. Since they are all describable from $e$, some edge $E$ of $G$ differing in colour from $C$ is describable to $e$. But $E$ is describable from $e$, by Theorem V. Hence $E$ is bicursal and therefore an edge of $L$, contrary to supposition.

THEOREM VIII. If $A$ is an edge of $L$ with ends $x$ and $y$, and if some member $P'$ of $\Delta(e)$ passes through $x$ and then $A$, then some other member of $\Delta(e)$ passes through $y$ and then $A$.
Proof. Since $A$ is bicursal there exists a member $Q$ of $\Pi(a)$ which passes through $y$ and then $A$. It may happen that every term of $Q$ which precedes $A$ is an edge or vertex of $L$. Then $a$ is a vertex of $L$ and therefore $e = a$ by the definition of $e$. Hence the section of $Q$ by $L$ is a member of $\Delta(e)$ which passes through $y$ and then $A$.

In the remaining case, let $B$ be the last term of $Q$ preceding $A$ which is an edge of $G$ but not an edge of $L$. Let $b$ be the immediately succeeding term of $Q$. Then $b$ is a vertex of $L$. Let $C$ be the first edge of $G$ in $P'$ which succeeds $B$ in $Q$ but does not succeed $A$ in $Q$. Such an edge exists since $A$ is an edge both of $P'$ and of $Q$. Let the ends of $C$ be $r$ and $s$. We may suppose that $P'$ passes through $r$ and then $C$.

Suppose $Q$ passes through $r$ and then $C$. Then there is a member of $\Delta(e)$ which agrees with $P'$ as far as $C$ and then continues with the terms of $Q$ from $C$ to $A$. This member of $\Delta(e)$ passes through $y$ and then $A$.

Alternatively, suppose $Q$ passes through $s$ and then $C$. There is a member $Q_1$ of $\Pi(a)$ which enters $L$ at $e$, then agrees with $P'$ as far as $C$, and continues with the terms of $Q$ in reverse order from $C$ to $b$. Let $D$ be the edge of $Q_1$ immediately preceding the first occurrence of $e$. If $B \neq D$ it follows that $B$ is describable from $b$. But $Q$ passes through $B$ and then $b$. So $B$ is bicursal and therefore an edge of $L$, contrary to its definition. We conclude that $B = D$ and therefore $b = e$. Hence there is a member of $\Pi(a)$ which agrees with $Q_1$ as far as $B$ and agrees with $Q$ from $B$ to $A$. The section of this path by $L$ is a member of $\Delta(e)$ which passes through $y$ and then $A$.

THEOREM IX. Let $A$ be an edge of $L$ which is a term of some member of $\Delta(e)$. Let $x$ be an end of $A$ distinct from $e$. Then there is an edge $B$ of $L$ which differs in colour from $A$, which has $x$ as an end, and which is a term of some member of $\Delta(e)$.
Proof. By Theorem VIII there is a member of $\Delta(e)$ which passes through $x$ and then $A$. The last edge preceding $A$ in this member of $\Delta(e)$ has the required properties.

THEOREM X. If $A$ is any edge of $L$ and $x$ is any end of $A$, then there is a member of $\Delta(e)$ which passes through $x$ and then $A$.
Proof. Let $U$ be the set of all edges of $L$ occurring as terms in the members of $\Delta(e)$; $U$ is non-null, by Theorem VII. Let $V$ be the set of all other edges of $L$.

Assume that $V$ is non-null. Since $L$ is connected there is a vertex $z$ of $L$ which is an end of a member $B$ of $U$ and a member $C$ of $V$. If $z$ is not $e$ we may suppose that $B$ and $C$ differ in colour, by Theorem IX. By Theorem VIII there is a member of $\Delta(e)$ which passes through $B$ and then $z$. $C$ is not a term of this member of $\Delta(e)$. Hence there is a member of $\Delta(e)$ which agrees with this one as far as $B$ and then continues with $z$ and $C$. This contradicts the definition of $C$.

Suppose now that $z$ is $e$. If $B$ and $C$ differ in colour we obtain a contradiction as before. We deduce that all the edges of $L$ having $e$ as an end have the same colour. If $e = a$ it follows from Theorem VII that $e$ is an end of some red edge of $L$. Then $C$ is red. Hence there is a member of $\Delta(e)$ which has $C$ as its first edge, contrary to assumption. If $e$ is not $a$ it follows from Theorem VII that there is a member $P$ of $\Pi(a)$ entering $L$ at $e$ in which the first occurrence of $e$ is immediately succeeded by an edge of $L$. We may take this edge to be $B$. Since $B$ and $C$ have the same colour there is a member of $\Pi(a)$ which agrees with $P$ as far as the first occurrence of $e$ and then continues with $C$. Hence $C$ is a member of $U$, contrary to assumption.

We conclude that $V$ is null. The theorem now follows from Theorem VIII.

Let $G_1$ denote any subgraph of $G$. An edge $A$ of $G$ is said to touch $G_1$ if $A$ is not an edge of $G_1$ and just one end, say $x$, of $A$ is a vertex of $G_1$. Such an edge $A$ is unicursal to or from $G_1$ if it is unicursal to or from $x$ respectively.

THEOREM XI. If $a$ is a vertex of $L$ then all edges of $G$ which touch $L$ are unicursal from $L$. If $a$ is not a vertex of $L$ then there exists just one edge of $G$ which touches $L$ and is unicursal to $L$, and all other edges of $G$ which touch $L$ are unicursal from $L$.
Proof. Let A be an edge of G which touches L. Let x be the end of A which is a vertex of L. Assume that A is not unicursal from x. "'e recall that a = c ii a is a vertex of L. If x is not e there is an edge C of L differing in colour from A and havin~ .\" a5 an end, by Theorem IX. This is true also if x = e = a. For then A is blue ;;inre it is not unicursal from a and not bicursaI. and Theorem VII shows that "'1111,' red edge of L has a as an end. In either of these cases it follows from TheoreIll X that there is a member of II(a) which enters L at e, whose section by L p.l~:'e5 through C and then x, and which continues from C with the terms x and A. But A is not bicursal since it is not an edge of L. Hence A is unicursal from x. contrary to assumption.
Now suppose that x = e and e is not a. By Theorem VII, there is a member P of Π(a) which enters L at e and in which the first occurrence of e is immediately succeeded by an edge C of L. Let the edge of G which immediately precedes the first occurrence of e in P be B. Clearly B touches L and is unicursal to L. Suppose that A and B are distinct. If A differs in colour from B it is describable from x = e, by Theorem V. If A and B have the same colour this differs from that of C. By Theorem X there is a member Q of Π(a) which enters L at e, and whose section by L passes through C and then e. It is clear that A and B cannot both precede C in Q. Hence there is a member Q' of Π(a) which agrees with Q as far as C and then continues with e and one of the edges A and B. Actually, it continues with e and A since B is unicursal to e. Hence if A and B are distinct, A is unicursal from x. This completes the proof of the theorem.
5. Bicursal units. Let T be the set of all vertices of G which are ends of bicursal edges. Let T' be the set of all edges of G having both ends in T. Then T and T' define a subgraph G' of G. We refer to the components of G' as bicursal units. Evidently a bicursal component having a given vertex b is a subgraph of the bicursal unit having the vertex b. By Theorem IV the bicursal units are finite graphs.
THEOREM XII. Let M be any bicursal unit. If a is a vertex of M then all edges of G which touch M are unicursal from M. If a is not a vertex of M then there exists just one edge of G which touches M and is unicursal to M, and all other edges of G which touch M are unicursal from M.
Proof. Since some edges of M are bicursal there exists a member P of Π(a) having a vertex of M as a term. Let e be the first vertex of M to occur in P. If a is not a vertex of M there is an edge E of G which immediately precedes the first occurrence of e in P. Then E touches M and is unicursal to e and M. We denote the bicursal component of which e is a vertex by L. If instead a is a vertex of M we denote the bicursal component of which a is a vertex by L. A subgraph L' of M which is a bicursal component distinct from L is supplied from L if there exists a sequence $(L_1, L_2, \ldots, L_t)$ of bicursal components and a sequence $(A_1, A_2, \ldots, A_{t-1})$ of edges of M such that (i) $L_1 = L$ and $L_t = L'$, (ii) the $L_i$ are subgraphs of M, (iii) for each integer i in the range $1 \le i < t$, $A_i$ is unicursal from $L_i$ and to $L_{i+1}$. We can show that any subgraph of M which is a bicursal component distinct from L is supplied from L. For suppose it is not. Then since M is connected there is an edge B of M with ends b and c belonging to bicursal components L' and L'', where L' is L or is supplied from L, and L'' is not L and is not supplied from L. Now B is not bicursal by the definition of a bicursal component, and is not acursal, by Theorem XI. It is not unicursal to L'', since L'' is not supplied from L. Hence B is unicursal to L'. But this is contrary to Theorem XI since L' is either L or is supplied from L. The theorem now follows by the application of Theorem XI to each of the bicursal components which are subgraphs of M.
If a is not a vertex of the bicursal unit M, we call the edge of G which touches M and is unicursal to M the entrance-edge of M. We classify such bicursal units as red-entrant and blue-entrant according as their entrance-edges are red or blue. A bicursal unit having a as a vertex is a-entrant.
6. Singular vertices. In this section we suppose that a is a singular vertex. We denote the numbers of red-entrant and blue-entrant bicursal units by $k_r$ and $k_b$ respectively. These numbers are finite, by Theorem IV. Let C denote the set of all vertices of G which are not vertices of G'. Thus no bicursal edge has an end in C. Let V be the set of all members of C to which some red edge is unicursal. Let W be the set of all members of C from which some red edge is unicursal or to which some blue edge is unicursal. Clearly,
$$a \notin V. \tag{9}$$
Suppose c ∈ V. Any blue edge of G having c as an end is unicursal from c, by Theorem V. Hence, by (9), no red edge of G can be unicursal from c. There are just f(c) blue edges of G which have c as an end and are therefore unicursal from c since c is accessible from, but distinct from, the singular vertex a. Now suppose i ∈ W. If some red edge is unicursal from i then either a = i or there is a blue edge unicursal to i. If a = i or there is a blue edge unicursal to i, then each red edge having i as an end is unicursal from i, by Theorem V and the definition of Π(a). Hence any red edge having i as an end is unicursal from i. Consequently no blue edge of G can be unicursal from i. It is clear from these results that V and W are disjoint sets. By Theorem IV they are finite sets. If i ∈ W, let y(i) be the number of red edges of G unicursal from i which are entrance-edges of red-entrant bicursal units. Let z(i) be the number of blue edges of G which are unicursal to i from members of V. Let H denote the graph G(V). If i ∈ W, any red edge unicursal from i is unicursal to a vertex p distinct from a. For no red edge is unicursal to a. So by Theorem V, p is either a vertex of G' or a member of V. Hence in the graph H, the number of edges having i as an end is $y(i) + (d_J(i) - z(i))$. Thus we have
$$z(i) = y(i) + (d_J(i) - d_H(i)). \tag{10}$$
By the definition of a bicursal unit the entrance-edge of any bicursal unit which is not a-entrant is unicursal either from a member of V or from a member of W. Let A be the number of blue edges of G unicursal from a member of V to a member of W. It is equal to the total number of blue edges unicursal from members of V less the number of the entrance-edges of the blue-entrant bicursal units. The latter number is $k_b$, by Theorem XII. But A is also equal to the sum of the numbers z(i) taken over all i ∈ W. The corresponding sum of the y(i) is $k_r$, by Theorem XII. Hence, by (10), we have
$$\sum_{c \in V} f(c) = k_b + k_r + \sum_{i \in W} (d_J(i) - d_H(i)). \tag{11}$$
The bicursal units, if any, are connected finite graphs. By Theorem XII, they are components of (G(V))(W) = H(W). Let M be any bicursal unit. Write q(M) = 0 or 1 according as M is or is not a-entrant. Let u(M) be the number of blue edges of G which touch M and let v(M) be the number of edges of G which touch M and have an end in W. Using Theorem XII we readily obtain the following results: if M is blue-entrant q(M) = 1 and u(M) = v(M) + 1, if M is red-entrant q(M) = 1 and u(M) = v(M) − 1, and if M is a-entrant q(M) = 0 and u(M) = v(M). In each case we have
$$u(M) \equiv v(M) + q(M) \pmod 2. \tag{12}$$
The sum of u(M) and the degrees in J of the vertices of M is even, since it is twice the number of blue edges of G having vertices of M as ends. Moreover, if c is a vertex of M, we have $d_J(c) = f(c)$ unless c = a; and a is a vertex of M if and only if q(M) = 0. It follows from (12) that
$$v(M) + \sum_{c \in M} f(c) + (q(M) + 1)(d_J(a) - f(a)) + q(M) \equiv 0 \pmod 2. \tag{13}$$
Referring to the definitions of §2 we see that M is a member of K(G(V), W) if and only if $(q(M) + 1)(d_J(a) - f(a)) + q(M) \equiv 1 \pmod 2$. Hence M is not a member of K(G(V), W) if and only if M is a-entrant (q(M) = 0) and the deficiency $f(a) - d_J(a)$ of a in J is even.
THEOREM XIII. If G is not constricted there exists an a-entrant bicursal unit, and the deficiency of a in J is even.
Proof. Suppose, first, that a is a member of C but not of W. Then no red edge of G has a as an end. Hence $d_G(a) = d_J(a)$
$$r(G(V), W) = k(G(V), W) + \sum_{i \in W} (f(i) - d_H(i)).$$
If a ∈ W then $k(G(V), W) \ge k_b + k_r$ and $f(a) > d_J(a)$. If there exists an a-entrant bicursal unit and the deficiency of a in J is odd we have
$$k(G(V), W) \ge k_b + k_r + 1.$$
Here we have used the results proved above concerning the membership of bicursal units in K(G(V), W). In each of these cases it follows that the expression on the right of (11) is less than r(G(V), W). Then G is constricted, contrary to hypothesis. The theorem follows.
7. Augmentation. In this section we no longer assume that the deficient vertex a is singular. Suppose P is a member of Π(a) which has more than one term, and whose last term is a vertex i of G deficient in J. To transform J by P is to replace J by a restriction K of G, defined as follows. The edges of K consist of the blue edges of G which are not terms of P, together with the red edges of G which are terms of P.
We say the f-subgraph J is augmentable at the deficient vertex a if there is an f-subgraph K of G satisfying the following conditions: (i) $d_K(a) > d_J(a)$. (ii) If $d_J(c) = f(c)$, then $d_K(c) = f(c)$.
Suppose a is not singular. Then there is a member P of Π(a) whose last term is a deficient vertex i of G distinct from a. Let K be the restriction of G obtained by transforming J by P. By the definition of Π(a) we have
$$d_K(a) = d_J(a) + 1, \qquad d_K(i) = d_J(i) + 1,$$
and $d_K(c) = d_J(c)$ if c is not a or i. Hence K is an f-subgraph of G, and J is augmentable at a.
Suppose next that a is singular and that G is not constricted. The deficiency of a in J is at least 2, by Theorem XIII. Also by Theorem XIII, a is the entrance of a bicursal unit $M_0$. By Theorem VII there is a red edge A of $M_0$ having a as an end. Since A is bicursal there is a member P of Π(a) including at least two edges, whose last term is a and whose last edge is A. Let K be the restriction of G obtained by transforming J by P. By the definition of Π(a) we have
$$d_K(a) = d_J(a) + 2$$
and $d_K(c) = d_J(c)$ if c is not a. Hence K is an f-subgraph of G, and J is augmentable at a. Thus we have the following
THEOREM XIV. Let J be any f-subgraph of G, and let a be any vertex of G which is deficient in J. Then either G is constricted with respect to f or J is augmentable at a.
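The operation of transforming J by P can be illustrated on a small finite example. The sketch below is only an illustration under stated assumptions: it takes the blue edges to be the edges of J and the red edges to be the remaining edges of G, it assumes a simple graph in which no edge occurs twice on the path, and the names `transform`, `degrees` and `norm` are hypothetical helpers, not notation from the paper. Under these assumptions the transformation is the symmetric difference of J with the edge set of P.

```python
from collections import Counter


def norm(u, v):
    """Store an undirected edge as a sorted pair."""
    return (u, v) if u <= v else (v, u)


def transform(j_edges, path_vertices):
    """Transform J by the path P (assumed to use each edge at most once).

    Blue edges (edges of J) on P are dropped and red edges (edges not
    in J) on P are added, i.e. the result is J symmetric-difference P.
    """
    p_edges = {norm(u, v) for u, v in zip(path_vertices, path_vertices[1:])}
    return set(j_edges) ^ p_edges


def degrees(edges):
    """Degree of every vertex in the restriction with the given edges."""
    d = Counter()
    for u, v in edges:
        d[u] += 1
        d[v] += 1
    return d


# G is the path a - b - c - i; J consists of the single (blue) edge bc.
J = {norm("b", "c")}
P = ["a", "b", "c", "i"]      # red, blue, red: an alternating path
K = transform(J, P)           # K = {ab, ci}
```

In this example the degrees of a and i each rise by one while those of b and c are unchanged, which matches the degree relations stated for the non-singular case.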
8. Condition for an f-factor.
THEOREM XV. G has no f-factor if and only if it is constricted with respect to f.
Proof. Suppose first that the locally finite graph G is constricted with respect to f. Then G has no f-factor, by Theorem III. Suppose next that G is not constricted with respect to f. Let $J_0$ be the restriction of G which has no edges. Then $J_0$ is an f-subgraph of G. If a vertex a of G is deficient in a given f-subgraph J of G, we can, by Theorem XIV, replace J by an f-subgraph K in which the degree of a is increased and no vertex of G which is not deficient in J is deficient in K. By repeating this process sufficiently often we can obtain an f-subgraph K' of G in which a and those vertices of G not deficient in J are not deficient.
It follows that if S is any finite set of vertices of G, we can, by the above process, build from $J_0$ an f-subgraph J of G in which no member of S is deficient. The theorem follows at once in the case in which G is finite. Then we can take S to be the set of all vertices of G, and the corresponding f-subgraph J must be an f-factor of G.
If G is infinite and connected we use the following non-constructive argument. (I have replaced my original proof by a shorter one for which I am indebted to the referee.) Let x be any vertex of G. The number of paths in G whose first term is x and which have just 2n + 1 terms, where n is any given non-negative integer, is finite since G is locally finite. Hence the set of paths in G having x as first term is denumerable. Since G is connected it follows that the set of vertices of G is denumerable, say $\{a_1, a_2, \ldots\}$. By the foregoing argument, to every positive integer n there is an f-subgraph $J_n$ such that
$$d_{J_n}(a_r) = f(a_r), \qquad r \le n.$$
The set of edges of G is at most denumerable, say equal to $\{A_1, A_2, \ldots\}$. Put $F_n(s) = 1$ if $A_s$ is an edge of $J_n$ and $F_n(s) = 0$ otherwise. Then by the diagonal process, there is an increasing sequence $n_1, n_2, \ldots$ such that
$$\lim_{k \to \infty} F_{n_k}(s) = F(s)$$
exists for all s. Let J be the restriction of G whose edges are those $A_s$ for which F(s) = 1. Then $d_J(a_r) = f(a_r)$ for all r, and J is an f-factor of G.
Lastly, we must consider the case in which G is infinite and not connected. We can show that no component of G is constricted with respect to f. For if this is not so, there is a component $G_\alpha$ of G such that for some disjoint finite subsets S and T of the set of vertices of G,
$$\sum_{c \in T} f(c) < k(G_\alpha(T), S) + \sum_{c \in S} (f(c) - d_{G_\alpha(T)}(c)). \tag{15}$$
Clearly each component of $G_\alpha(T)$ is a component of G(T). Hence (15) holds with $G_\alpha(T)$ replaced by G(T), so that
$$\sum_{c \in T} f(c) < r(G(T), S).$$
Thus G is constricted with respect to f, contrary to hypothesis. Since the theorem has been proved for connected graphs it follows that each component of G has an f-factor. Hence (assuming the multiplicative axiom) there is a set Z of f-factors of components of G which contains just one f-factor of each component of G. The restriction of G whose edges are the edges of the members of Z is an f-factor of G.
9. n-factors. A necessary and sufficient condition for the existence of an n-factor of G, where n is a given positive integer, can be obtained by applying Theorem XV to the special case in which the value of f(c) is n for each vertex c of G.
It is convenient to denote the number of elements of a finite set U by α(U). We then obtain the following
THEOREM XVI. G has no n-factor if and only if there exist disjoint finite sets S and T of vertices of G such that
$$n\,\alpha(T) < k(G(T), S) - \sum_{c \in S} (d_{G(T)}(c) - n). \tag{16}$$
Here k(G(T), S) is the number of finite components H of (G(T))(S) = G(S ∪ T) for which n times the number of vertices differs in parity from the number of edges of G which have one end in S and the other end a vertex of H. A necessary and sufficient condition for the existence of a 1-factor of a given locally finite graph G has been given in previous papers [6; 7]. It is simpler in form than the expression obtained by writing n = 1 in (16). In the next section, this simpler formula is deduced from Theorem XV. The argument suggests no analogous simplification in the case n > 1.
10. An allied problem. Suppose that we are given a locally finite graph G, and a function f which associates with each vertex c of G a positive integer f(c). We consider the problem of associating with each edge A of G a non-negative integer h(A) so that for each vertex c of G the sum of the numbers h(A), taken over all edges A of G having c as an end, is f(c). If such a set of non-negative integers h(A) exists we say that G is f-soluble. We note that if f(c) = 1 for each vertex of G, then G is f-soluble if and only if it has a 1-factor.
Let T be any finite set of vertices of G. We denote by S(T) the set of all vertices c of G having the following properties: (i) c is not an element of T. (ii) Each edge of G having c as an end has its other end in T.
If T does not include every vertex of G we denote by k(T) the number of finite components H of G(T) having the following properties: (i) H has more than one vertex. (ii) The sum of the numbers f(a), taken over all vertices of H, is odd. If T is the set of all vertices of G we write k(T) = 0.
THEOREM XVII. G is not f-soluble if and only if there exists a finite set T of vertices of G such that
$$\sum_{c \in T} f(c) < k(T) + \sum_{c \in S(T)} f(c). \tag{17}$$
Proof. By adjoining new edges to G we can obtain a graph G' having the following properties: (i) The vertices of G' are the vertices of G. (ii) Two vertices are joined by an edge in G' if and only if they are joined by an edge in G. (iii) If two vertices a and b are joined by an edge in G', the number of distinct edges of G' which join them is finite and not less than $d_G(a) + f(a)$.
Clearly G' is locally finite. If S and T are disjoint finite sets of vertices of G such that S is contained in S(T) it follows from the definition of G' that
$$k(G'(T), S) = k(G(T), S). \tag{18}$$
It is clear that G is f-soluble if and only if G' has an f-factor. Hence, by Theorem XV, G is not f-soluble if and only if there exist disjoint finite sets S and T of vertices of G such that
$$\sum_{c \in T} f(c) < k(G'(T), S) - \sum_{c \in S} (d_{G'(T)}(c) - f(c)). \tag{19}$$
Suppose first that (17) is satisfied for some finite T. If S(T) is not finite then all but a finite number of its elements have degree 0 in G, since G is locally finite. Hence G is not f-soluble since it has a vertex of degree 0. If S(T) is finite it follows from (17) and (18) that (19) will be satisfied if we put S = S(T). Hence G is not f-soluble.
Conversely, suppose that G is not f-soluble. Then (19) is satisfied for some disjoint finite sets S and T. If possible let a be any member of S not in S(T). Consider the effect of replacing S by S' = S − {a}. Clearly the replacement diminishes
$$\sum_{c \in S} (d_{G'(T)}(c) - f(c))$$
by $d_{G'(T)}(a) - f(a)$, that is, by at least $d_G(a)$, from (iii). The replacement diminishes k(G'(T), S) by not more than $d_{G(T)}(a)$, the maximum number of finite components of G'(S ∪ T) joined to a by an edge of G'. But $d_{G(T)}(a) \le d_G(a)$. Hence, if a is not an element of S(T), formula (19) remains valid when S is replaced by S'. If S' has an element not in S(T) we repeat the argument with S' replacing S, and so on. Since S is finite we find, eventually,
$$\sum_{c \in T} f(c) < k(G'(T), U) + \sum_{c \in U} f(c), \tag{20}$$
where U is the intersection of S and S(T). But, by (18), k(G'(T), U) is equal to k(T) plus the number of components of G(T ∪ U) which consist of a single vertex, the value of f for this vertex being odd. Hence
$$k(G'(T), U) \le k(T) + \sum_{c \in S(T) - U} f(c). \tag{21}$$
Now (20) and (21) imply (17). This completes the proof of the theorem.
If f(c) = 1 for each vertex c of G it is clear that G is f-soluble if and only if it has a 1-factor. Applying Theorem XVII to this case we find that G has no 1-factor if and only if there exists a finite set T of vertices of G such that
$$\alpha(T) < h_u(T),$$
where $h_u(T)$ is the number of finite components of G(T) having an odd number of vertices. This is the simple criterion for the existence of a 1-factor mentioned in §9.
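For a small finite graph the 1-factor criterion just stated can be checked by exhaustive search. The following sketch is a brute-force illustration only, not a method from the paper: it assumes a finite simple graph given as a vertex list and a list of edge pairs, reads G(T) as the graph obtained from G by deleting the vertices of T, and uses the hypothetical names `components`, `violating_set` and `has_one_factor`.

```python
from itertools import combinations


def components(vertices, edges):
    """Connected components of the graph (vertices, edges), as sets."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for start in vertices:
        if start in seen:
            continue
        comp, stack = {start}, [start]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y not in comp:
                    comp.add(y)
                    stack.append(y)
        seen |= comp
        comps.append(comp)
    return comps


def violating_set(vertices, edges):
    """Return a set T with |T| < (number of odd components of G - T), or None."""
    vertices = list(vertices)
    for size in range(len(vertices) + 1):
        for T in combinations(vertices, size):
            rest = [v for v in vertices if v not in T]
            kept = [(u, v) for u, v in edges if u not in T and v not in T]
            odd = sum(1 for c in components(rest, kept) if len(c) % 2 == 1)
            if size < odd:
                return set(T)
    return None


def has_one_factor(vertices, edges):
    """Exhaustive search for a 1-factor (exponential; small graphs only)."""
    vertices = list(vertices)
    if not vertices:
        return True
    v = vertices[0]
    for a, b in edges:
        if v in (a, b):
            w = b if a == v else a
            rest = [x for x in vertices if x not in (v, w)]
            kept = [(p, q) for p, q in edges
                    if p not in (v, w) and q not in (v, w)]
            if has_one_factor(rest, kept):
                return True
    return False


star = [(0, 1), (0, 2), (0, 3)]             # centre 0 with three leaves
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]    # the 4-cycle
```

For the star, `violating_set(range(4), star)` returns `{0}`: deleting the centre leaves three odd components, and there is no 1-factor. For the 4-cycle no violating set exists and a 1-factor is found.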
REFERENCES
1. H. B. Belck, Regulare Faktoren von Graphen, J. Reine Angew. Math., vol. 188 (1950), 228-252.
2. T. Gallai, On factorization of graphs, Acta Mathematica Academiae Scientarum Hungaricae, vol. 1 (1950), 133-153.
3. P. Hall, On representation of subsets, J. London Math. Soc., vol. 10 (1934), 26-30.
4. J. Petersen, Die Theorie der regulären Graphs, Acta Math., vol. 15 (1891), 193-220.
5. R. Rado, Factorization of even graphs, Quarterly J. Math., vol. 20 (1949), 95-104.
6. W. T. Tutte, The factorization of linear graphs, J. London Math. Soc., vol. 22 (1947), 107-111.
7. ---, The factorization of locally finite graphs, Can. J. Math., vol. 1 (1950), 44-49.
The University of Toronto
Reprinted from Canad. J. Math. 4 (1952), 314-328
A PARTITION CALCULUS IN SET THEORY P. ERDOS AND R. RADO
1. Introduction. Dedekind's pigeon-hole principle, also known as the box argument or the chest of drawers argument (Schubfachprinzip) can be described, rather vaguely, as follows. If sufficiently many objects are distributed over not too many classes, then at least one class contains many of these objects. In 1930 F. P. Ramsey [12] discovered a remarkable extension of this principle which, in its simplest form, can be stated as follows. Let S be the set of all positive integers and suppose that all unordered pairs of distinct elements of S are distributed over two classes. Then there exists an infinite subset A of S such that all pairs of elements of A belong to the same class. As is well known, Dedekind's principle is the central step in many investigations. Similarly, Ramsey's theorem has proved itself a useful and versatile tool in mathematical arguments of most diverse character. The object of the present paper is to investigate a number of analogues and extensions of Ramsey's theorem. We shall replace the sets S and A by sets of a more general kind and the unordered pairs, as is the case already in the theorem proved by Ramsey, by systems of any fixed number r of elements of S. Instead of an unordered set S we consider an ordered set of a given order type, and we stipulate that the set A is to be of a prescribed order type. Instead of two classes we admit any finite or infinite number of classes. Further extension will be explained in §§2, 8 and 9. The investigation centres round what we call partition relations connecting given cardinal numbers or order types and in each given case the problem arises of deciding whether a particular partition relation is true or false. It appears that a large number of seemingly unrelated arguments in set theory are, in fact, concerned with just such a problem.
It might therefore be of interest to study such relations for their own sake and to build up a partition calculus which might serve as a new and unifying principle in set theory. In some cases we have been able to find best possible partition relations, in one sense or another. In other cases the methods available to the authors do not seem to lead anywhere near the ultimate truth. The actual description of results must be deferred until the notation and terminology have been given in detail. The most concrete results are perhaps those given in Theorems 25, 31, 39 and 43. Of the unsolved problems in this field we only mention the following question. Is the relation $\lambda \to (\omega_0 2, \omega_0 2)^2$ true or false? Here, λ denotes the order type of the linear continuum. The classical, Cantorian, set theory will be employed throughout. In some arguments it will be advantageous to assume the continuum hypothesis $2^{\aleph_0} = \aleph_1$ or to make some even more general assumption. In every such case these assumptions will be stated explicitly. The authors wish to thank the referee for many valuable suggestions and for having pointed out some inaccuracies.
Part of this paper was material from an address delivered by P. Erdos under the title Combinatorial problems in set theory before the New York meeting of the Society on October 24, 1953, by invitation of the Committee to Select Hour Speakers for Eastern Sectional Meetings; received by the editors May 17, 1955.
2. Notation and definitions. Capital letters, except Δ, denote sets, small Greek letters, except possibly π, order types, briefly: types, and k, l, m, n, κ, λ, μ, ν denote ordinal numbers (ordinals). The letters
r, s denote non-negative integers, and a, b, d cardinal numbers (cardinals). No distinction will be made between finite ordinals and the corresponding finite cardinals. Union and intersection of A and B are A + B and AB respectively, and A ⊂ B denotes inclusion, in the wide sense. For any A and B, A − B is the set of all x ∈ A such that x ∉ B. No confusion will arise from our using 0 to denote both zero and the empty set. If p(x) is a proposition involving the general element x of a set A then {x : p(x)} is the set of all x ∈ A such that p(x) is true. η and λ are the types, under order by magnitude, of the set of all rational and of all real numbers respectively. λ will also be used freely as a variable ordinal in places where no confusion can arise. The relation α ≤ β means that every set, ordered according to β, contains a subset of type α, and α ≰ β is the negation of α ≤ β. To every type α there belongs the converse type α* obtained from α by replacing every order relation x
< n}
for m ~ n.
The symbol
$$\{x_0, x_1, \ldots\}_<$$
denotes the set $\{x_0, x_1, \ldots\}$ and, at the same time, expresses the fact that $x_0 < x_1 < \cdots$. Brackets { } are only used in order to define sets by means of a list of their elements. For typographical convenience we write
$$\Sigma [x \in A] f(x)$$
instead of $\sum_{x \in A} f(x)$, and we proceed similarly in the case of products etc. or when the condition x ∈ A is replaced by some other type of condition. The cardinal of S is |S| and the cardinal of α is |α|. For every cardinal a, the symbol $a^+$ denotes the next larger cardinal. If $a = b^+$ for some b, then we put $a^- = b$, and if a is not of the form $b^+$, i.e. if a is zero or a limit cardinal, then we put $a^- = a$. Similarly, we put $k^- = l$, if k = l + 1, and $k^- = k$, if k = 0 or if k is a limit ordinal. If S is ordered by means of the order relation x
lx:xcs; Ixl =a}. In particular, [s]a=o if Isl
(f.£
AI'A. = 0
< v < k).
Fundamental throughout this paper is the partition relation
$$a \to (b, d)^2$$
introduced in [6]. More generally, for any a, $b_\nu$, k, r the relation
$$a \to (b_0, b_1, \ldots)^r_k \tag{1}$$
is said to hold if, and only if, the following statement is true. The cardinals $b_\nu$ are defined for ν < k. Whenever
$$|S| = a; \qquad [S]^r = \Sigma [\nu < k] K_\nu,$$
then there are B ⊂ S; ν < k such that
$$|B| = b_\nu; \qquad [B]^r \subset K_\nu.$$
For $k < \omega_0$ we also write (1) in the form
$$a \to (b_0, b_1, \ldots, b_{k-1})^r,$$
and if k is arbitrary, and $b_\nu = b$ for all ν < k, in the form $a \to (b)^r_k$.
We also introduce partition relations between types. By definition, the relation
$$\alpha \to (\beta_0, \beta_1, \ldots)^r_k \tag{2}$$
holds if, and only if, $\beta_\nu$ is defined for ν < k and, whenever
$$\bar{S} = \alpha; \qquad [S]^r = \Sigma [\nu < k] K_\nu,$$
there are B ⊂ S; ν < k such that
$$\bar{B} = \beta_\nu; \qquad [B]^r \subset K_\nu.$$
If $k < \omega_0$, or if all $\beta_\nu$ are equal to each other, we use an alternative notation for (2) analogous to that relating to (1). The negation of (1), and similarly in the case of (2), is denoted by
$$a \nrightarrow (b_0, b_1, \ldots)^r_k.$$
We mention in passing that the gulf between (1) and (2) can be bridged by the introduction of more general partition relations referring to partial orders. These will, however, not be considered here. If a ~No then, clearly, a' is the least cardinal N" such that a"*(a)~... Also, Nm is regular if, and only if, Nm~(Nm)~.. for all n < m. Finally, the relation a~(bri, bi, ... )! is equivalent to ':E [1/
{y: {y, x}< C S};
R(x) = {y: {x, y}< C S}.
If, in addition, [S]r= ':E[v
In the special case r=2, we put, for xES; v
=
Inx E A ]W(x).
Also, W(O) = S. If n <wo, then we write W(xo, ... , X,,-I) instead of
W( {Xo, ... , xn-d). It will always be clear from the context to which ordered set S and to which partition of [S]r these functions refer. We shall occasionally make use of the notation and the calculus of partitions (distributions) summarized in [5, p. 419]. The meaning of canonical partition relations
* ({3)r
a _
and that of polarized partition relations ao at-l
-
boo
b~
. • .•... bt- 1 •o
r··"-'
••• JA,
will be given in §§8 and 9 respectively. The relation defined in §4.
a-Om will
be
3. Previous results.
A].
THEOREM
1. If k <We then No-(No)~ [12, Theorem
THEOREM
2. If k, n <wo, then, for some f=f(k, n, r)
f - (n)~ [12, Theorem B]. THEOREM
(ii)
3. (i) If a ;?;N o, then a-(No, a)2. NW o)2.
NWo~(Nl'
(i) is proved in [2, 5.22]. This formula will be restated and proved as Theorem 44. (ii) is in [3, p. 366] and will follow from Theorem 36 (iv).
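The finite relation of Theorem 2 can be verified by exhaustive search in its smallest non-trivial case, r = 2 and k = 2 with n = 3. The sketch below is an illustration only; the function name `arrows` is hypothetical and the search is exponential, so it is usable only for very small arguments. It tests whether every distribution of the pairs of an n-element set over k classes admits an m-element subset all of whose pairs lie in one class.

```python
from itertools import combinations, product


def arrows(n, m, k=2):
    """Decide by brute force whether n -> (m)^2_k holds.

    Every distribution of the unordered pairs of {0, ..., n-1} over k
    classes is examined; the relation holds if each distribution has an
    m-element subset whose pairs all fall in the same class.
    """
    pairs = list(combinations(range(n), 2))
    subsets = list(combinations(range(n), m))
    for colouring in product(range(k), repeat=len(pairs)):
        colour = dict(zip(pairs, colouring))
        homogeneous = any(
            len({colour[p] for p in combinations(A, 2)}) == 1
            for A in subsets
        )
        if not homogeneous:
            return False      # this distribution is a counterexample
    return True


print(arrows(6, 3))   # True:  6 -> (3)^2_2
print(arrows(5, 3))   # False: 5 does not arrow (3)^2_2
```

So in this case the least admissible value of f(2, 3, 2) is 6.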
THEOREM 4. (i) If $a \ge \aleph_0$, then $(a^a)^+ \to (a^+)^2_a$. (ii) If $a \ge \aleph_0$, then $a^a \nrightarrow (3)^2_a$. (iii) If $2^{\aleph_n} = \aleph_{n+1}$, then $\aleph_{n+2} \to (\aleph_{n+1}, \aleph_{n+2})^2$.
(i) is given in [3] and will be deduced as a corollary of Theorem 39. (ii) is in [3, p. 364], and (iii) is [3, Theorem II] and follows from Theorem 7(i).¹
¹ The partition relations occurring in (i) and (ii) are to be interpreted in the obvious way. Their formal definition is given in §4.
THEOREM 5. If φ ≤ λ; $|\varphi| > \aleph_0$, then, for $\alpha < \omega_0 2$; $\beta < \omega_0^2$; $\gamma < \omega_1$, (i) $\varphi \to (\omega_0, \gamma)^2$. (ii) $\varphi \to (\alpha, \beta)^2$.
(i) is [5, Theorem 5], and (ii) is [5, Theorem 7]. Both results will follow from Theorem 31.
THEOREM 6. $\eta \to (\aleph_0, \eta)^2$.
This relation, a cross between (1) and (2), has, by definition, the following meaning. If $\bar{S} = \eta$; $[S]^2 = K_0 + K_1$, then there is A ⊂ S such that either
$$|A| = \aleph_0; \qquad [A]^2 \subset K_0$$
or
$$\bar{A} = \eta; \qquad [A]^2 \subset K_1.$$
This result is [5, Theorem 4].
7. If a~No, and if b is minimal such that ab>a, then a+)2 ab~(b+, a+)2.
THEOREM
(i) (ii)
a+~(b,
These results are contained in [6, p. 437]. (i) will follow from Theorem 34. 2 THEOREM 8. If 2tt • =N.+dor all v, and if a is a regular limit number then, for every b
This result is [6, Lemma 3], and will follow from Theorem 34. THEOREM
9. If q,~x;
14>1 = lxi, then X~(4), 4»1.
This result is due to Sierpinski who kindly communicated it to one of us. It will follow from Theorem 29. Our proof of Theorem 29 uses some of Sierpinski's ideas. THEOREM
10. For any a, a~(No, No)tt..
This is in [5, p. 434]. The last result justifies our restriction to the case of finite "exponents" r.
4. Simple properties of partition relations. THEOREM
11. The two relations
.. ) a (11
(i) a ~ (/30, /31, ••• );
.
~
( . • .• )~" /30,./31,
are equivalent. S By methods similar to those used in [17] one can show that (i) b:;ia' for all a>l, (ii) b=a' for those a>l for which tltto exist or not. Cf. [13, p. 224].
PROOF. Let (i) hold; 3'<=a*; [S]r=: E[II
)~;
a ~ a(1); k i6; k(1);
< kIll), ~ II < k).
(JI
(k(l) Then (1)
(1)
r
(1)
a - (flo ,fll , •.• )k(l). A n analogous result holds when the types a, fl. are replaced by cardinals.
PROOF. It suffices to consider the case of types. Let 3'(1)
= a (1) ;
Then there is SCS(1) such that
[s]r
3'
=a. Then
=E
[II
< k]K"
where K, = K~l) [S]r for II < k(t), and K, = 0 otherwise. By hypothesis, there are ACS; JI
A
=
"fl.;
[A]r C K •.
Ihi6;k(1),then IAI = Ifl. I i6;r;O¢[A]rCK.which is a contradiction. Hence JI
)~
then
I a 1- (I flo I, I fl11, ... );. lsi = lal; [S]r= E[II
PROOF. Let order S so that 3'=a. Then there are ACS; JI
I I I I'
THEOREM 14. If fl. is an initial ordinal, for all JI
(4)
(flo, fll' •.. );, I m1- (I flo I, Ifld, ... ); m-
are equivalent.
PROOF. By Theorem 13, (3) implies (4). Now suppose that (4)
lSI Iml,
holds. Let S=mj [S]r= ~[v
IAI
THEOREM 15. If 1 +a~(1 +iJo, 1 +iJlt
...
)i+ r , then
a ~ (flo, fll' ... )~.
In this proposition, 1+a and l+iJ. may be replaced by a+1 and iJ,+l respectively. Also, the types a, iJ, may be replaced by cardinals.
PROOF. Let S=a. Let Xo be an object which is not an element of S, and put So = S + {Xo }. The order of S is extended to an order of So by stipulating that xoEL(S). Then So = 1 +a. Now let [S]r = ~[v
A = fl.;
This proves the first assertion. Next, if a+1~(iJo+1, ... )~+\ then, by Theorem 11 and the result just obtained, we conclude that
* 1 + a * ~ (1 + flo,* 1 + fll, * •.. a * ~ (flo,
r h;
a
•.•
hl+r,
~ (flo, . . . )~.
Finally, let 1 +a~(1 +b o, 1+bl, ... )i+ r • Let a and iJ, be the initial ordinals belonging to a and b, respectively. Then, by Theorems 14 and 13, 1 +a~(l +iJo, ... )i+ r , a ~ (b o, ... )~.
In this proposition the types a,
iJ", 'Y, may be replaced by cardinals.
In formulating the last theorem we use an obvious extension of the symbol (2). PROOF. We consider the case of types. Let S=a,
[s]r
Put Ko =
= ~[X
L [}..
OA •
< l]Ko). + ~[O < v < 1 + k]K,. Then, by hypothesis, there are A CS; v < 1 +k
such that A ={3.; [A ]rCK•. 1f '11>0, then this is the desired conclusion. If '11=0, then A ={3o; [A ]rc:2: [X
ex -4 ('Yo, 'YI, ••• )~.
In particular, the condition on the mapping X-4P>. IS satisfied whenever this mapping is on [0, k]. The types a, {3. may be replaced by cardinals. PROOF. Let N= {p,,:X
[s]r =
:2:[" E N]K". + :2:[" E
[0, k] - N]O. By hypothesis, there are A CS; v
"EN;
[A]r C K".
or (ii)
"EE N;
In case (ii), di A = {3. which contradicts the hypothesis. Hence (i) holds, if ='Y,,; [A]rCK", where X=O".
If $\alpha \to (\beta)^r_k$; |k| = |l|, then $\alpha \to (\beta)^r_l$.
This shows that, as far as k is concerned, the truth of the relation $\alpha \to (\beta)^r_k$ depends only on |k|. We are therefore able to introduce the relation
$$\alpha \to (\beta)^r_d$$
which, by definition, holds if, and only if,
$$\alpha \to (\beta)^r_k$$
for some, and hence for all, k such that |k| = d. A similar remark applies to the relation $a \to (b)^r_k$.
THEOREM 18. Let k<wo; a-+({jo, f31, ...
)~;
+ ... + K"_l. Then there are sets M, NC [0, k] such that IMI + INI >k, S
[s]r
= a;
= Ko
,3,.E[K.]
for",EM;IIEN.
In the special case k = 2 we have either
(i)
,30
E
[Ko] [Kd
(ii) fh
or
E
[Ko][Kd
or (iii)
or
PROOF. Let
P. = {",:",
< k;,3,. E
[K.]},
Q. = [0, k] - p.
(II
I
< k).
NI
We have to find a set NC [0, k] such that ll[IIEN]P.1 >k-I or, what is equivalent, 1 E[IIEN]Q.I < I NI. If no such N exists, i.e. if [IIEN]Q.I E;; NI for all NC [0, k], then, by a theorem of P. Hall [8], it is possible to choose numbers p.EQ. such that p,.¢p, (",
IE
1
THEOREM 19. Let $\alpha \to (\beta, \gamma)^2$, and suppose that $m$ is the initial ordinal belonging to $|\alpha|$. Then at least one of the following four statements holds.$^4$
(i) $\beta < \omega_0$,
(ii) $\gamma < \omega_0$,
(iii) $\beta, \gamma \le \alpha, m$,
(iv) $\beta, \gamma \le \alpha, m^*$.
PROOF. Let S be a set ordered by means of the relation x
$^4$ (iii) means that $\beta \le \alpha$; $\beta \le m$; $\gamma \le \alpha$; $\gamma \le m$.
an ordinal. If ~~wo, then the contradiction wo*~~*=B«~S«=m follows. Hence ~ <woo Case 2. 'YE [Ko]dK1k Then, by symmetry, 'Y<wo. Case 3.~, 'YE [Kok Then, for some sets A, RCS, A<=A«=~i B< = B« ='Y, and ~, 'Y ~a, m. Case 4.~, 'YE [K1k Then, similarly, A<=A»=~i B<=B»='Yi ~, 'Y ~a, m*. This proves the theorem. COROLLARY. For every a, (5)
$$(r - 2) + \alpha \nrightarrow (\omega_0,\ (r - 2) + \omega_0^*)^r \quad (r \ge 2).$$
For none of the relations (i)-(iv) of Theorem 19 holds if $\beta = \omega_0$; $\gamma = \omega_0^*$. Hence $\alpha \nrightarrow (\omega_0, \omega_0^*)^2$, and Theorem 15 yields (5). The method employed in the proof of Theorem 19, i.e. the definition of a partition of $[S]^2$ from two given orders of $S$, seems to have been first used by Sierpiński [15]. In that note Sierpiński proves $\aleph_1 \nrightarrow (\aleph_1, \aleph_1)^2$. Cf. Theorem 30.
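The idea of deriving a partition of $[S]^2$ from two orders of $S$ can be tried out on a finite set. The following Python sketch is an illustration only: the names `two_order_partition` and `is_homogeneous` are hypothetical, the two orders are given as rank dictionaries, and the convention that $K_0$ collects the pairs on which the orders agree is an assumption made for the example.

```python
from itertools import combinations


def two_order_partition(rank_a, rank_b):
    """Split all pairs of the common ground set into two classes.

    K0 holds the pairs on which the two orders agree,
    K1 holds the pairs on which they disagree.
    (Assumed convention, chosen for this illustration.)
    """
    k0, k1 = set(), set()
    for x, y in combinations(sorted(rank_a), 2):
        agree = (rank_a[x] < rank_a[y]) == (rank_b[x] < rank_b[y])
        (k0 if agree else k1).add(frozenset((x, y)))
    return k0, k1


def is_homogeneous(subset, cls):
    """True if every pair of `subset` lies in the class `cls`."""
    return all(frozenset(p) in cls for p in combinations(subset, 2))


# First order: the natural order on {0, 1, 2, 3}.
rank_a = {0: 0, 1: 1, 2: 2, 3: 3}
# Second order: ranks 2, 0, 3, 1 for the elements 0, 1, 2, 3.
rank_b = {0: 2, 1: 0, 2: 3, 3: 1}
k0, k1 = two_order_partition(rank_a, rank_b)
```

A subset all of whose pairs lie in `k0` is ordered the same way by both orders, and one all of whose pairs lie in `k1` is ordered in opposite ways, which is the property exploited in the four cases of the proof.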
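THEOREM 20. (i) If $\beta_0 \le \alpha$; $|\beta_0| < r$, then $\alpha \to (\beta_0, \beta_1, \ldots)^r_k$ holds for any $k, \beta_1, \beta_2, \ldots$. (ii) If $\beta_\nu = r$ for $\nu < k$, then the two relations
(6) $\alpha \to (\beta_0, \beta_1, \ldots, \gamma_0, \gamma_1, \ldots)^r_{k+l}$,
(7) $\alpha \to (\gamma_0, \gamma_1, \ldots)^r_l$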
are equivalent. PROOF OF (i). If S=a; [S]'= :E[II
a --" (flo, fl1' •.. )~.
Then either (i) there is IIo
studied in the.case in which /3, ~a for all p
IBI
THEOREM 22. The following two tables give information about a number of cases in which the truth or otherwise of any of the relations
(9) $\alpha \to (\beta_0, \beta_1, \ldots)^r_k$,
(10) $a \to (b_0, b_1, \ldots)^r_k$
can be decided trivially.
k
~
0:
r ~ Ia l ,.
~
+
a
r> I a l r> a
k>O:
- --,-0
13. ;:;'cx
fJ. - ct
b. ~ a
b.""G
f3.~a;f3o:$a ~, ;j;. fJo;ta b. ~ a; bo>a b.>a ho
-
O
,-1 · 1>0
- ---
,>1· 1
,Bo ~ a:
(Jol r
bo';;:;a
bo>a
60<'
+
0<,<1·1 r=a>O
{JO~ Oi
+
±
- -- - -- +
- - - - - -- -
+
-
---
-
,>a
- -- ----- +
The proofs may be omitted. When a row or column is headed by two lines of conditions the first line refers to (9) and the second line to (10). Every condition involving the suffix $\nu$ is meant to hold for every $\nu$
5. Denumerable order types.
THEOREM 23. If $n < \omega_0$; $\alpha < \omega_0 2$, then
(11) $\omega_0 n \to (n, \alpha)^2$,
(12) $\omega_0 n \nrightarrow (n + 1, \omega_0 + 1)^2$.
We may assume n>O. (a) In order to prove (12), consider the set S= {(JI, }.):JI
PAI'(Bo, Bl, ... , B,,_l) = (Co. Cl , ... , C,,-l), where CA=BB.,.; CI'=BI'-B, and C.=B. for JI¢}., J.I.. Then C'="'o; CALI (x)
I
I
(;)
operators PAp., corresponding to all choices of A, IJ, to the system (Eo, ... , E n - l ), applying each one of the operators, from the second onwards, to the system obtained by the preceding operator, and obtain, as end product, the system (Do, ... , D .._ l ). Then D.CA.; 15.=wo(v
< n).
Then, putting D = {x.:v
THEOREM 24. If $\alpha < \omega_0 4$, then
(13) $\alpha \nrightarrow (3, \omega_0 2)^2$,
(14) $\omega_0 4 \to (3, \omega_0 2)^2$.
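This theorem is a special case of the following theorem.
THEOREM 25. Let $2 \le m, n < \omega_0$, and denote by $l_0 = l_0(m, n)$ the least finite number $l$ possessing the following property.$^5$
Property $P_{mn}$. Whenever $p(\lambda, \mu) < 2$ for $\{\lambda, \mu\}_{\ne} \subset [0, l]$, then there is either $\{\lambda_0, \ldots, \lambda_{m-1}\}_{\ne} \subset [0, l]$ such that
$$p(\lambda_\alpha, \lambda_\beta) = 0 \quad \text{for } \alpha < \beta < m,$$
or there is $\{\lambda_0, \ldots, \lambda_{n-1}\}_{\ne} \subset [0, l]$ such that
$$p(\lambda_\alpha, \lambda_\beta) = 1 \quad \text{for } \{\alpha, \beta\}_{\ne} \subset [0, n].$$
Then
(15) $\omega_0 l_0 \to (m, \omega_0 n)^2$,
(16) $\gamma \nrightarrow (m, \omega_0 n)^2$ for $\gamma < \omega_0 l_0$.
Moreover, if $l_1 \to (m, m, n)^2$, then $l_0 \le l_1$.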
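Property $P_{mn}$ is a finite condition, so for small $m$, $n$ the number $l_0(m, n)$ can be found by exhaustive search. The Python sketch below does this by brute force. It reads $[0, l]$ as the set of numbers below $l$, which is an assumption about the notation, and the names `has_property` and `l0` as well as the search bound `limit` are hypothetical.

```python
from itertools import combinations, permutations, product


def has_property(l, m, n):
    """Check Property P_mn for the number l by trying every function p.

    p assigns 0 or 1 to each ordered pair of distinct numbers below l.
    The property holds if every such p admits either
      * distinct lam_0..lam_{m-1} with p(lam_a, lam_b) = 0 for a < b, or
      * distinct lam_0..lam_{n-1} with p(lam_a, lam_b) = 1 for all a != b.
    """
    pairs = list(permutations(range(l), 2))
    for values in product((0, 1), repeat=len(pairs)):
        p = dict(zip(pairs, values))
        zero_chain = any(
            all(p[seq[a], seq[b]] == 0
                for a in range(m) for b in range(a + 1, m))
            for seq in permutations(range(l), m)
        )
        one_clique = any(
            all(p[x, y] == 1 and p[y, x] == 1
                for x, y in combinations(sub, 2))
            for sub in combinations(range(l), n)
        )
        if not (zero_chain or one_clique):
            return False  # this p is a counterexample for l
    return True


def l0(m, n, limit=5):
    """Least l <= limit with Property P_mn, or None if none is found."""
    for l in range(limit + 1):
        if has_property(l, m, n):
            return l
    return None
```

The search is exponential in $l(l-1)$ and is meant only for very small cases such as `l0(3, 2)`.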
Deduction of Theorem 24 from Theorem 25. We have to prove that $l_0(3, 2) = 4$. (i) By considering the function $p$ defined by
$$p(0, 1) = p(1, 2) = p(2, 0) = 0; \quad p(2, 1) = p(1, 0) = p(0, 2) = 1,$$
we deduce that 3 does not possess the property $P_{32}$. (ii) Let us assume that 4 does not possess $P_{32}$. Then there is $p(\lambda, \mu)$ such that the condition stipulated for $P_{32}$ does not hold, with $l = 4$. If
$$\{\alpha, \beta, \gamma\}_{\ne} \subset [0, 4]; \quad p(\alpha, \beta) = p(\alpha, \gamma) = 0,$$
then the assumption $p(\beta, \gamma) = 0$ would lead to $p(\alpha, \beta) = p(\beta, \gamma) = p(\alpha, \gamma) = 0$, i.e. to a contradiction. Hence $p(\beta, \gamma) = 1$ and, by symmetry, $p(\gamma, \beta) = 1$. This, again, is a contradiction. This argument proves that
(17) if $p(\alpha, \beta) = 0$, then $p(\alpha, \gamma) = 1$.
Since at least one of the numbers $p(0, 1)$, $p(1, 0)$ is zero, there is a permutation $\alpha, \beta, \gamma, \delta$ of $0, 1, 2, 3$ such that $p(\alpha, \beta) = 0$. Then repeated application of (17) yields $p(\alpha, \gamma) = p(\alpha, \delta) = 1$; $p(\gamma, \alpha) = 0$; $p(\gamma, \delta) = 1$; $p(\delta, \alpha) = p(\delta, \gamma) = 0$, which contradicts (17). This proves $l_0(3, 2) \le 4$ and, in conjunction with (i), $l_0(3, 2) = 4$.
$^5$ The existence of such a number $l$ follows from Theorem 2. It will follow from Theorem 39 that we may take $l = (1 + 3^{2m+n-5})/2$.
PROOF OF THEOREM 25. 1. We begin by proving the last clause. Let
(18) $l_1 \to (m, m, n)^2$.
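Suppose that $p(\lambda, \mu) < 2$ for $\{\lambda, \mu\}_{\ne} \subset [0, l_1]$. Then
$$[S]^2 = K_0 + K_1 + K_2,$$
where $S = [0, l_1]$, and $K_\nu$ is the set of all $\{\lambda, \mu\}_< \subset [0, l_1]$ such that
$$p(\lambda, \mu) = 0 \quad (\nu = 0),$$
$$p(\lambda, \mu) > p(\mu, \lambda) \quad (\nu = 1),$$
$$p(\lambda, \mu) = p(\mu, \lambda) = 1 \quad (\nu = 2).$$
By (18), there is $S_1 = \{\lambda_0, \ldots, \lambda_{k-1}\}_< \subset S$ such that either
(19) $k = m$; $[S_1]^2 \subset K_0$, or
(20) $k = m$; $[S_1]^2 \subset K_1$, or
(21) $k = n$; $[S_1]^2 \subset K_2$.
(19) implies that $p(\lambda_\alpha, \lambda_\beta) = 0$ for $\alpha < \beta < m$; (20) implies that $p(\lambda_{m-1-\alpha}, \lambda_{m-1-\beta}) = 0$ for $\alpha < \beta < m$; (21) implies that $p(\lambda_\alpha, \lambda_\beta) = 1$ for $\{\alpha, \beta\}_{\ne} \subset [0, n]$. This shows that $l_0(m, n) \le l_1$.
2. We now prove (15). Let $l = l_0(m, n)$; $A = [0, \omega_0 l]$; $N = [0, \omega_0]$;
$$[A]^2 = K_0 +' K_1 \quad (\text{partition } \Delta).$$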
We use the notation of the partition calculus given in detail in [4, p. 419] which can be summarized as follows. If $\Delta$ is an equivalence relation on a set $M$ or a partition of $M$ into disjoint classes then $|\Delta|$ denotes the cardinal of the set of nonempty classes, and the relation
$$x \equiv y\ (\cdot\,\Delta)$$
expresses the fact that $x$ and $y$ belong to $M$ and lie in the same class of $\Delta$. If, for $\rho \in R$, $\Delta_\rho$ is a partition of $M$, and if $t \to f_\rho(t)$ is a mapping of a set $T$ into $M$, then the formula
$$\Delta'(t) = \prod[\rho \in R]\, \Delta_\rho(f_\rho(t)) \quad (t \in T)$$
defines that partition $\Delta'$ of $T$ for which
$$s \equiv t\ (\cdot\,\Delta')$$
if, and only if,
$$f_\rho(s) \equiv f_\rho(t)\ (\cdot\,\Delta_\rho) \quad \text{for } \rho \in R.$$
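We continue the proof of (15) by putting
$$\Delta'(\{\sigma, \tau\}) = \prod[\lambda, \mu < l]\, \Delta(\{\omega_0 \lambda + \sigma,\ \omega_0 \mu + \tau\}) \quad (\sigma < \tau < \omega_0).$$
By Theorem 1 there is $N' \in [N]^{\aleph_0}$ such that $|\Delta'| = 1$ in $[N']^2$. Then, by definition of $\Delta'$, there is $p(\lambda, \mu) < 2$ such that
$$\{\omega_0 \lambda + \sigma,\ \omega_0 \mu + \tau\} \in K_{p(\lambda, \mu)} \quad \text{for } \lambda, \mu < l;\ \{\sigma, \tau\}_< \subset N'.$$
By definition of $l$ this implies that there is a set $\{\lambda_0, \ldots, \lambda_{k-1}\}_{\ne} \subset [0, l]$ such that either
(22) $k = m$; $p(\lambda_\alpha, \lambda_\beta) = 0$ for $\alpha < \beta < m$
or
(23) $k = n$; $p(\lambda_\alpha, \lambda_\beta) = 1$ for $\{\alpha, \beta\}_{\ne} \subset [0, n]$.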
If (22) holds, then we put
A'
=
{woX«
+ O'«:a < m},
where (1'a is chosen such that {(1'O, ( 1 ' l J " ' , (1'm-d
p(X«, X~)
¢
°
xm-d ..
for some {a, then
13}
(25) for some {a, 13 }.. C [0, n]. Then, if A = [0, 'Y], we have [A ]2=Ko+' K I , where Ko is the set of all {woX+q, wOJL+r} such that {X, JL }.. C [0, l]; q
{Xo, ... , xm-d .. c [0, l]; p(X", X~) =
+ O",,:a < m};
0"0
°
< ... < O"m-l < wo; for a
< fJ < m,
which contradicts (24). If, on the other hand, A" C A;
A" = won;
[A"]2 C K I ,
then there is {Xo, .. " x,,-d
x-
(ao, aI, . . . )~
and their negatives. It turns out$^6$ that every positive relation we were able to prove holds not only for the particular type $\lambda$ of the set of all real numbers but for every type $\varphi$ such that
(26) $|\varphi| > \aleph_0$; $\omega_1, \omega_1^* \not\le \varphi$.
This fact seems to suggest that, given any type $\varphi$ satisfying (26), there always exists $\lambda_1$ such that $|\lambda_1| > \aleph_0$; $\lambda_1 \le \lambda, \varphi$, i.e., that every nondenumerable type which does not "contain" $\omega_1$ or $\omega_1^*$ contains a nondenumerable type which is embeddable in the real continuum. This conjecture has, as far as the authors are aware, neither been proved nor disproved.$^7$ Throughout this section $S$ denotes the set of all real numbers $x$ such that $0 < x$
$^6$ Cf. Theorems 31, 32.
$^7$ Since this paper was submitted E. Specker has disproved this conjecture.
THEOREM 26.
(i) $\lambda \nrightarrow (\omega_1)^r_k$ for $r \ge 0$; $k > 0$.
(ii) $\lambda \nrightarrow (r + 1)^r_{\aleph_0}$ for $r \ge 2$.
PROOF. (i) is trivial, in view of $\omega_1 \not\le \lambda$. In order to prove (ii) it suffices, by Theorem 15, to consider the case $r = 2$. Let $\{x_\nu : \nu < \omega_0\}$ be the set of all rational numbers in $S$, and denote, for $n < \omega_0$, by $K_n$ the set of all $\{x, y\}_<$ such that the least $\nu$ satisfying $x < x_\nu$
r~3.
PROOF. By Theorem 15, we need only consider the case r =3. We have [s]a=Ko+'Kl, where Ko= {{x, y, z}<:y-x
Xm - Xo
< Xm+l
- xm •
m~ 00 , then the contradiction u - Xo ~ u - u follows. ASSUMPTION 2. Let A CS; if =wo+2; [A ]aCKl • Then A =B+{y, z}<; B={xo, Xl.'" }
If
THEOREM 28. $\lambda \nrightarrow (r + 1, \omega_0 + 2)^r$ for $r \ge 4$.
PROOF. It suffices to consider the case $r = 4$. We have $[S]^4 = K_0 +' K_1$, where $K_0 = \{\{x_0, x_1, x_2, x_3\}_< : x_2 - x_1 < x_3 - x_2,\ x_1 - x_0\}$.
ASSUMPTION 1. Let $[\{x_0, x_1, x_2, x_3, x_4\}_<]^4 \subset K_0$. Then $\{x_0, x_1, x_2, x_3\} \in K_0$, and hence $x_2 - x_1 < x_3 - x_2$. Also, $\{x_1, x_2, x_3, x_4\} \in K_0$, and hence $x_3 - x_2 < x_2 - x_1$. This is a contradiction.
ASSUMPTION 2. Let $A \subset S$; $\bar{A} = \omega_0 + 2$; $[A]^4 \subset K_1$. We define $B$, $y$, $z$, $x_\nu$, $u$ as in the proof of Theorem 27. Then there is $m_0 < \omega_0$ such that, for $m_0 \le m < \omega_0$, $u - x_m < x_m - x_0$. Then, for $m_0 \le m < \omega_0$, $\{x_0, x_m, x_{m+1}, z\} \in K_1$; $x_{m+1} - x_m$
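Assumption 1 in the proof of Theorem 28 is a purely finite statement: no five numbers can have all of their four-element subsets in $K_0$. As a sanity check, the following Python sketch tests the class $K_0$ (middle gap smaller than both outer gaps) on small sets of integers. The function names `in_k0` and `all_quads_in_k0` are hypothetical, and integers stand in for real numbers purely for convenience.

```python
from itertools import combinations


def in_k0(quad):
    """True if the 4-set lies in K0: with x0 < x1 < x2 < x3,
    the middle gap x2 - x1 is smaller than both outer gaps."""
    x0, x1, x2, x3 = sorted(quad)
    middle = x2 - x1
    return middle < x3 - x2 and middle < x1 - x0


def all_quads_in_k0(five):
    """True if every 4-element subset of the 5-set lies in K0."""
    return all(in_k0(q) for q in combinations(five, 4))


# K0 is not empty: the gaps of (0, 3, 4, 7) are 3, 1, 3.
example_in_k0 = in_k0((0, 3, 4, 7))

# Search all 5-subsets of {0, ..., 11} for a set contradicting Assumption 1.
violations = [f for f in combinations(range(12), 5) if all_quads_in_k0(f)]
```

The list `violations` comes out empty, as the two inequalities $x_2 - x_1 < x_3 - x_2$ and $x_3 - x_2 < x_2 - x_1$ in the proof require.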
Ikl
~
Ixl;
X-t-t
la.1
~
(a(), alJ •.•
Sierpinski proved that X-t-t(a,
a)!
Ixi
(v
then
1
h.
if a~X;
lal = Ixi
(Theorem 9).
PROOF.
Case 1. There is /-I < k such that a" $ X. We consider the partition S=:E' [v
I :E {A} I = I :E {fA} I ~ I X13~o = I XI ~ I :E {A} I. I :E{A} 1= Ixi =~n' say. Now we can write :E{A} =
{Aop: p<w n }. By symmetry, we have, for every v
and
Xvp E A. p - {X"a: (tL, 0-)
<
(II, p) }.
I {(/-I, 0'):(/-1, 0') «v, p)}1
THEOREM 30. $|\lambda| \nrightarrow (\aleph_1, \aleph_1)^r$ for $r \ge 2$.
PROOF. The substance of this theorem is due to Sierpiński [15]. By Theorem 15 we need only consider the case $r = 2$. Let x
Ixi
<Wt;
This proves Theorem 30. We note that this theorem is, in fact, an easy corollary of [5, Example 4A].
7. The general case. We shall consider relations involving certain types of cardinal $\aleph_1$ as well as relations between types of any cardinal. We begin by proving a lemma. We establish this lemma in a form which is more general than will later be required, but in this form it seems to possess some interest of its own. We recall that $a'$ denotes the cofinality cardinal belonging to $a$ which was defined in §2.
LEMMA 1. Let $S$ be an ordered set, and $|S|' = \aleph_n$; $\omega_n, \omega_n^* \not\le \bar{S}$. Then, corresponding to every rational number $t$, there is $S_t \subset S$ such that $|S_t| = |S|$; $S_t \subset L(S_u)$ for $t < u$.
Sierpiński, in a letter to one of us, had already noted the weaker result that, if $|S| = \aleph_1$; $\omega_1, \omega_1^* \not\le \bar{S}$, then $\eta \le \bar{S}$.
PROOF.
Case 1. There is A CS such that
I AL(x) I < I A I = I sI
(x
E A).
Then we define x. for I' <w" inductively as follows. Let Po <w .. ; x.EA(p<po). Then, by definition of n, and hence
I L:[I' < I'o](AL(x.) + {x.D I < I A I, there is x.oEA - E [I' <po](L(xo) + {x.}).
(p.
Then XjI<x,
Case 2. There is A CS such that
I AR(x) I < I A I = I s I
(x E A).
Then, by symmetry, the contradiction w..* ~ S follows. Case 3. There is A CS such that
I
min ( AL(x)
I. I AR(x) I) < I A I = I s I
(x
E
A).
Then we put Ao
=
Al =
{x:x
E A;
{x:xEA;
I AL(x) I < I A I}, I AR(x) I < IAI}·
Then A =Ao+AI. Case 3.1. Then AoL(x) ~IAL(x)1 (xEAo), and hence, by Case 1, we find a contradiction. Case 3.2. Aol Then Ad = and, by symmetry, a contradiction follows. We have so far proved that, if A CS; A = there is zEA such that AL(z) = AR(z) = Then A =A' +A", where A' =AL(z);
IAol =Isi. I F-I si.
I
I I
I
I
I
Isl I I Isl,
I Isi.
IA'I I I lsi;
= A 1/ = A' CL(A"). By applying this result to A' we find a partition A =A(0)+A(I)+A(2) such that IA(II)
I= Is 1(11<3);
A(II) C L(A(II
+ 1))
(II < 2).
Repeated application leads to sets A (Ao, AI, . . . , Ak-1)
(k
< wo; A. < 3)
such that
I A (Ao,
. . . , Ak-1)
I = I s I;
A (Ao, ... , Ak-1) =
L
[II
< 3]A (Ao,
... , Ak-1, II);
C L(A(Ao, ... , Ak-1, II
A(Ao, ... , Ak-1' II)
+ 1»
(II <2).
Let N be the set of all systems (X o, •.. , Xk ) such that k <wo;
X. E {0, 2} (v < k) ; Xk = 1, ordered alphabetically. More accura tel y, if
P = (Ao, ... , XI:) and q = (/l0, ... , Ill) are elements of N, then we put
p
< (SlO,
••. , J-ll-lt 0, 2, 2, ... , 2, 1)
< (Slo,
••• , iJ.1),
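provided only that the inner bracket contains a sufficiently large number of two's. Lemma 1 is proved.
THEOREM 31. Suppose that $\varphi$ is a type such that $|\varphi| > \aleph_0$; $\omega_1, \omega_1^* \not\le \varphi$. Let $\alpha < \omega_0 2$; $\beta < \omega_0^2$; $\gamma < \omega_1$. Then
(27) $\varphi \to (\alpha, \alpha, \alpha)^2$,
(28) $\varphi \to (\alpha, \beta)^2$,
(29) $\varphi \to (\omega_0, \gamma)^2$,
(30) $\varphi \to (4, \alpha)^3$.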
THEOREM 32. Let $\varphi$, $\alpha$, $\gamma$ be as in Theorem 31. Let $S$ be an ordered set, $\bar{S} = \varphi$, and $[S]^2 = K_0 + K_1$. Then (a) there is $V \subset S$ such that either
(i) $\bar{V} = \alpha$; $[V]^2 \subset K_0$,
or
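or (iii) $\bar{V} = \omega_0 \gamma^*$; $[V]^2 \subset K_1$,
and (b) there is $W \subset S$ such that either
(i) $\bar{W} = \omega_0 + \omega_0^*$; $[W]^2 \subset K_0$,
or
(ii) $\bar{W} = \gamma$; $[W]^2 \subset K_1$,
or
(iii) $\bar{W} = \gamma^*$; $[W]^2 \subset K_1$.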
In proving Theorems 31 and 32 we may assume that $|\varphi| = \aleph_1$. There is $m$ such that
$$4 \le m < \omega_0; \quad \alpha \le \omega_0 + m; \quad \beta \le \omega_0 m.$$
Let $\bar{S} = \varphi$. The letters $A$, $B$, $P$, $Q$ denote subsets of $S$, and we shall always suppose, in the proofs of the last two theorems, that $\bar{P} = \bar{Q} = \omega_0$.
PROOF OF THEOREM 31, (29). Let [S]2=Ko+Kl, and (31)
Wo
EE [Ko].
'Y
E [Kd.
We want to deduce that (32) There is B such that
I BRo(x) I ~ No(x E B).
(33)
For otherwise. there would be elements x. such that Xo Xl
E S; E Ro(xo);
I Ro(xo) I = I Ro(xo, Xl) I =
NI, NI,
generally, x.ERo(xo, ... , X._l),
I Ro(xo, ... , x.) I =
NI
(,.. <
"'0).
Then [{xo, Xl, • • • }
L: [t rational]B(t) C B for some sets B(t) such that B(t)CL(B(u» (t
IL
[v
< vo]BRo(x.) I ~
~o
< I BU••) I,
and therefore we can choose x •• EB(t•• )-L:[v
If A CS, then there is xoEA such that
I ALo(xo) I > ~o. Then there are x.,
A.(v~m)
such that
Xo E Ao = S; and so on, up to
Am = Am_ILo(xm_l) = AoLo(Xo,
Xl, • • . ,
Xm-l).
Then, by (29), Am~(wo, 'Y)2; 'YEEFI(A m), and hence woEFo(Am). There is PCAm such that [P]2CKo. Then [p+ {xo, ... , X m_d]2 CKo which contradicts (34). Hence our assumption is false, and there is A such that (35)
I ALo(x) I ~
~o
(X E A).
By Lemma 1, there isB(t)CA, for rational t, such that B(t) CL(B(u» (t <,u). There are rational numbers t. (v <'Y) such thattl' >1. (,u
< Vo ]P.).
Then, by (29), B'~(a, wo)2; aEEFo(B'); woEFI(B'), and there is P'oCB' such that [P •.1 2CK 1• This defines p. for v <'Y. PutL: [v <'Y ]P. =X. Then X=wo'Y*; [X]2CKI' But this contradicts (34), and so (a) is proved. PROOF OF THEOREM 32 (b). Let the hypotheses be satisfied but (b) be false. Then
(36)
'Y
Choose any A.
I
* EE [Kd·
I
ARo(x) ~No (xEA). Then, by Lemma 1, there are sets B(t)CA, for rational t, such that B(t) CL(B(u)) (t
:E [v < Vo ]Ro(x.).
X'o E B(t. o) -
Then the set X={x.:v<'Y1 satisfies X='Y; [X]2CKI which is a contradiction against (36). Hence our assumption is false, i.e., given any A, there is xEA such that ARo(x) =N I . By symmetry, it follows that there also is yEA such that ALo(y) =N I • By alternate applications of these two results we obtain elements x., y. and sets A., B.(v <wo) such that the following conditions are satisfied.
I
xoES;
yoERo(xo) = Bo;
I
I
I
xIEBoLo(yo) =AI; YIEAIRo(XI)
= BI;
generally, for v <wo,
.:E
y.l
Then the set [v <wo]{x., =D satisfies D=wo+w~; [D]2CKo. This contradiction against (36) completes the proof of Theorem 32. PROOF OF THEOREM 31, (27). Let [S]2=Ko+K I+K2 , (37)
a
EE
[K.]
(v
< 3).
Our aim is to deduce a contradiction. We shall reduce the general case to more and more special cases. For the sake of convenience of notation we shall use the same notation for the sets in question at each stage. We put Kl2 =KI +K2. The functions F l2 , L n , RI2 refer to KI2 in the same way as the functions F., L., R. refer to K •. Let ACS. By Lemma 1, there are sets A o, AICA such that AoCL(AI)' Let xoEA I. Then AL(xo) =NI' and there is vo<3 such that AL.o(xo) =N I . By repeating this argument we find numbers Vp <3 and elements Xp (p <wo) such that
I
I SL.o(xo)
I
I
I
Xp E SL'o(xo)L.1(XI) ... L'p-l(Xp-I),
I
... L.p(xp) = NI
(p
< wo).
There are Po
Then there is PCA o such that [P]ICKo. Then a~~; [C]2CKo, where C=P+{xp .:II<m}, which contradicts (37). Hence the Assumption 1 is false, and we have woEEFo(A o). We may assume that
Wo EE [K o].
(38)
For a later application we remark that in what follows we may replace S by any nondenumerable subset of S without any of the conclusions becoming invalid. Now let A CS. Then, by (29), A--+(wo, a)2. Also, wo--+(wo, wo)2. Therefore, by Theorem 16, A--+(wo, Wo, a)2. Hence at least one of the following three relations holds. (i) wo
E Fo(A),
(iii) a
E F 2(A).
Since (i) and (iii) are false, it follows that (A C S).
(39) By symmetry,
(A
(40)
C
S).
ASSUMPTION 2. There are x., A. (II <wo) such that xoEAo; AoRo(xo) =AI; xIEA I; AIRo(XI) =A 2 ; x.EA., etc. Then [{xo, X},··· }<1 2 CK o which contradicts (38). Hence the Assumption 2 is false, and there are 110 <wo; x,ES (II <110) such that we may put A =Ro(xo, ... , X'o-I) and we then have IARo(x) I ~No (xEA). We may assume that (41)
I Ro(x) I ~ No
(x E S).
By Lemma 1, there are sets A, B such that A CL(B). By (39), there is PCA such that [P]2CKI. For a later application we remark that at this stage we might have applied (40) in place of (39) and in this way could have interchanged the roles of KI and K •. By (41), I 2: [xEP]Ro(x) I ~No, and hence IBR12 (P) I =N1• Therefore we may assume PC L 12(S - P).
(42)
ASSUMPTION 3. If QCP; A CS, then there is xEA such that
I QL (x) I = 1
No.
Now we argue as follows. By Lemma 1, there are sets A.CS-P such that A"CL(A.) (IL
x. EA.;
I p. -
PI'I
< vo), < v < vo). (v
p. C P
< ~o
(p.
Then we can write [0, vo] = !px:)..<wo}. We can choose Yx such that
Yx E PPOPPI ... P px - /yo, ... , YA-d
(A
< wo).
By (41) and Assumption 3, there is x.oEA. o- L [v
I
I p' - LI(xl'r) I ~ I pI - Pl'r I + I Pl'r - LI(xl') I < ~o
+ o.
By summing over r we obtain I PI-LI(D) I <~o. Hence we may put P'LI(D) = Q, and we then have Q+D ~O'; [Q+D ]2CKI which contradicts (37). Case 2. There is DCX such that 15=0'; [D]2CK2. This, again, contradicts (37). Hence the Assumption 3 is false, i.e., there are P'CP; A'CS such that (x
E A').
Then there is A"CA' such that the set PILI(x) is constant for xEA". Then there is pI! such that P IL 2(x) =pl! (xEA"). We have therefore proved that there is pI!, A" such that (43) The whole argument from (38) onwards remains valid if S is replaced by any set A. Hence it follows from (43) that if A CS, then there are P, A'CA such that (44)
By Lemma 1, there are A o, Bo such that AoCL(Bo). By repeated application of (44) we obtain sets P., A: (p<wo) such that
+ At C Ao; PI + At CAt;
Po
[p O]2 C K I; [pd 2 C K I;
C L2(Arf), PI C L2(A{) ,
Po
generally, P.+A:CA._ I ; [P.]2CKI; P.CL 2 (A:) (O
(p.
< v < wo).
We put Bl=BoRuCPo+P1 + ... ). Then we have the result that there are sets P., Bl (v<wo) such that
{
(45)
[P.]2 C K I ; PI'
< wo), < v < wo). (v
C L 2(P.)
(p.
Now let Vo <wo; B2CB 1 ; P'CP. o' ASSUMPTION 4. IP'L2(x) I <~o (xEB2). Then there is BaCB2 such that the set D =P'L2(x) is constant for xEBa. By (39), there is QCBa such that [Q]2CK1• Then [(P' -D) Q]2CK1 ; wo2E [Kd which contradicts (37). Hence the Assumption 4 is false, i.e.
+
if Vo
(46)
< Wo;
P' C P. o' then
I {x:x E B I P'L (x) I < ~o} I ~ ~o. I;
2
To Bl the same argument applies as to S, from (38) onwards. The only change we make is that, after (41), we apply (40) instead of (39), so that now the roles of Kl and K2 are interchanged. We find sets Q., B2CBl such that, in analogy to (45), (46), the following statements are true.
Q. C LI2 (B 2)
(47)
< wo), < v < wo). (v
Q" C LI(Q.)
(p.
If Vo <wo; Q' CQ.o' then
(48)
I {x:x E B I Q'LI(X) I < ~o} I ~ ~o. 2;
By Lemma 1, there is B: CB2 (v <wo) such that B: CL(B:) (f.L
are at most ~o elements xEB2 such that at least one of the relations
I Q: LI(X) I < ~o holds. By using this result repeatedly we find elements XA (A <wo) such that, for all v <wo,
Xo E Brf; Xl
E
B{;
I P.L2(xo) I = I Q.LI(xo) I = ~o, I P.L2(xo, Xl) I = I Q.LI(xo, Xl) I = ~o,
generally, xAEB{;
I P.L2(xo,
... , XA)
I = I Q.LI(xo, ... , XA) I =
~o
(v, X < wo).
Since wo-t(wo)~, there is a number v <3 and a sequence >'0 <>'1 <
... ;
Xp<Wo, such that [{x>.e' X>.p··· 1<]2cK•. By (38), v;;060. We can choose y,., z,. such that, for p. <wo, y,. E P,.L2(xo,
Xl, • • • ,
z,. E Q,.L1(xo, ... , XA"'_l)'
X>''''_l);
Put X= {x>.p:p<m}; Y= {y,.:p.<wol; Z= {Z,.:p.<wo}. Case 1. v = 1. Then [Z +X]2CKI; aE [Kd. Case 2. v=2. Then [Y+X]2CK2; aE [K2]' In either case, a contradiction against (37) follows. This proves (27). PROOF OF THEOREM 31, (28). If [S]2=Ko+Kl and if we put 'Y=wom then we have, by Theorem 32 (a), either (i) aE[Ko] or (ii) /J;;i'YE[Kd or (iii) /J;;iwom~wo'Y*EE[Kd. This proves (28). PROOF OF THEOREM 31, (30). Let [S]3=K o+'K1, (49)
4
EE [Ko];
Ci
EE [Kd·
We shall deduce a contradiction. By Theorem 2, there is n <wo such that n~(m, m)S, and p such that (50)
(n -
1)(1
+ m + m(m -
1)/2)
< p < woo
By Lemma 1, there is zoES such that
I L(zo) I, I R(zo) I > ~o
and then there is CCR(zo) such that C = l7. The following diagram shows the relative position in S and the inclusion relations between the various sets to be considered in the argument that follows. It might be of help to the reader. S
{zd
J
M
ASSUMPTION. If DE [C]p, then' II [Xl, x,ED]{xo:xo
(51)
if DE [C]p,
then
{Zl, Xl, X2} E Ko
Xl,
for some Xl, X, ED.
Then [C]2=Kl +K{, where
K: = {{Xl, x,} : Xl, X2 E C;
{Zl'
Xl, X2} E K.}
(JI
< 2).
By (11), C; = 17 ~ wop-t (wo + m, p) 2. Hence there are two cases. Case 1. There is ECC such that E=wo+m; [E]2CKl. Then, since, by (49), E=wo+mEE [Kd, there are xl, x{, x, EE such that {xl, x{, x, } EKo. Then [{ZI, xl, x{, xl },,]3CKo which contradicts (49). Case 2. There is GE [C]p such that [G]2CK{ . Then {Zl' XI, x,} EEKo for all Xl, x2EG, which is a contradiction against (51). It follows that our assumption is false, and that there are HE [C]p and A CL(zo) such that
{xo, Xli X2}
EE Ko
for Xo E A ; Xl, X2 E H.
Put
V(XO,Xl) = {X2:X2EH; {XO,XI,X2} EKd
for xo, Xl E A.
Then [A]2 =Kl' +'K{' , where Kl' is the set of all {xo, Xl}
, V(Xo, Xl)' ~ n
for {xo, xd< C P,
[p]2 = L[W C H]K~, where KW = { {xo, Xl} <: Xo, Xl EP; V(xo, Xl) = W}. The number k of sets W is finite, and wo-t(wo)~, by Theorem 1. Hence there are P'CP; JCH such that [P']2CK)3l, for {xu, xd< C P'. Then for {xo, xd
Since [p,]aCKo+K I and, by Theorem 1, wo-t(wo, WO)3, there are QCP'; v<2 such that [Q]3CK•. By (49), woEE[Ko]. Hence 1'=1; [Q]'CK I • Furthermore, [J]3CKo+Kl; I=n-t(m, m)3. Hence there are
ME [J]"'; p<2 such that [M]8CKp. Since m~4EE[Ko], we have p=1. Then, in view of QCP'CPCA; MCJCH,
[Q
+ M]8 C K
"'0
l ;
+m =
Q + M E [Kd
which contradicts (49). Case 2. There is NCA such that N=wo+m; [N]ICK{'. Then 1
V(Xo,
Xl) 1
~ n- 1
for Xu,
Xl
E N.
Then
Ql C L(T);
TI
=
K!2)
¢
1
m.
We have [Qd l = L' [K
Q2 =
L' [K < k2]K!2),
where
0
and two elements Xoo and XOI of Qa belong to the same K~a) if, and only if, for every xIET; xaEH, the two sets {xoo, Xl, Xa} and {XOlt Xl, Xa} belong to the same class K,. Then k2 <wo and, by wo-+(wo)~, there are Q3CQa; K3
for Xo E Qs;
Xl
E T; X2 E H.
Put U=Qa+T, and choose {xo", xl' }
E V(xo", xl')
+ L[x E
T)V(xo", x)
+ LUx, y}< C T]V(x, y)
and therefore, in view of the definition of Xa and the relations 1 and (50), 1
L[xo,
~
Xl
E U]{X2:X2 E H; {xo,
(n - 1)
+ (n -
1) ( : )
Xl,
X2}< E
+ (n -
Kd
1) ( ; )
TI =m
I
< p = 1 HI.
We deduce the existence of xl' EH such that
{XO' Xl, xl'
I EE Kl
for all Xo,
Xl
E U.
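Since $\bar{U} = \omega_0 + m \notin [K_1]$, there are $y_0, y_1, y_2 \in U$ such that $\{y_0, y_1, y_2\} \in K_0$. But then $[\{y_0, y_1, y_2, x_2'\}_{\ne}]^3 \subset K_0$ which contradicts (49). This proves (30) and thus completes the proof of Theorems 31 and 32.
THEOREM 33. Let $\alpha < \omega_0 2$. Then $\omega_1 \to (\alpha, \alpha)^2$.
PROOF. Let $\bar{S} = \omega_1$; $[S]^2 = K_0 +' K_1$; $2 \le m < \omega_0$; $\alpha \le \omega_0 + m$,
(52) $\alpha \notin [K_\nu]$ $(\nu < 2)$.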
We have to deduce a contradiction. Let the conventions concerning the use of the letters A, B, P, Q be the same as in the proofs of Theorems 31 and 32. Choose any P. ASSUMPTION. Let [P]2CKo. Suppose that, if P'CP, then there is A such that
I P'Lo(x) I = ~o
(X E A).
Then we define x., p. (v <WI) as follows. There is Xo such that IPLo(xo) I =~o. Put Po =PLo(xo). Now let 0 <1'0 <WI, and suppose that x.ES; P.CPLo(x.) (v <1'0);
I p. -
PI' I < ~o
(Il- < v < 1'0).
Then we can write [0,1'0]= {Jl>.:X<wo}. We can choose elements Yx(X <wo) such that YxEPp.OPp.l ... Pp.x - {Yp:p <X} (X <wo). Put P' = {yx:X <wo}. Then, by our assumption, there is A such that P'Lo(x) =~o (xEA). We can choose
I
I
X'O
E A - L[v
We put P'o=P'Lo(x. o)' If, now, =Jl>.. Then
< vo]({x.} VI
+ L(x.».
<1'0, then there is X<wo such that
VI
I P. o -
POl I ;;;i
I {Yo, Yt.
... } -
p">.1
< ~o.
Also, P'oCPLo(x. o)' This completes the inductive definition of such that
x., p.(V <WI)
PI' C P Lo(Xp.);
I p. -
PI' I < ~o
Put X= {x.:v<wd. Then, by Theorem 23, X=wI>wOm-7(m, Wo +m)2. Since, by (52), wo+mEEF1(X), we have mEFo(X), and there is DE [X]m such that [D]2CKo. Let xp=max [xED]x. Then, for any x.ED, IPp-Lo(x.) I ;;;i Ipp-p.1 + Ip.-Lo(x.) I <~o. Hence we may put Q=PpLo(D), and then we have Q+D=wo+m~a; [Q+DJ!
CKo. This is a contradiction against (52). Therefore our assumption is false. Now let A CS. Then, by Theorem 3, IA I =N1-+(No, N1)! and hence, by Theorem 14, A =Cd1-+(Cdo, Cd1)'. Since Cd1EEF1(A), we conclude that CdoEFo(A), so that there is PCA such that [P]2CKo. As the assumption made above is false, there is P' CP such that there are at most No elements x such that IP'Lo(x) I =No. Then there is A'CA such that IP'Lo(x) I
(x
E A").
Since IEI
x E Alii CA"i
y
EE E
= P'Lo(x);
y
EE Lo(x).
Also,
x E Alii C R(P") C R(y); x> y. Hence
So far we have proved that, given any A, there are sets P"', A"CA such that P" CL 1(A"'); [P"]2CKo, and, moreover, there are at most No elements x such that IP"Lo(x) I =No. By applying the last result repeatedly, starting with A =S, we obtain sets P .(v
[P,.]2 C Ko;
PI' C L1(P.)
(p
< II < wo).
There is Q. such that
I P.Lo(x) I < No
(II
< Wo; xES - Q.).
We can choose BCS- E[II
Wo
+ m EE Fo(B);
and there is DE [B]m such that [D ]2CK1. Then, for every v
and complicated nature, is of interest in that it implies Theorem 7 (i) and Theorem 8. It may well be capable of further worthwhile applications. THEOREM 34. Let a, (3, -y be ordinals, and a-++({3, -yp. Then there are ordinals a). (>. <(3-) such that, if
(I'
then
< tr),
We begin by dedueing (i) of Theorem 7 or, rather, a slightly stronger proposition, from Theorem 34. COROLLARY
"',,+I-("'m
1. Let m and n be such that N:~N" (d
+1, ",,,+1)2.
This implies, a fortiori, "',,+I-("'m, ", ..+1)2 which, in its turn, by Theorem 14, is equivalent to Theorem 7 (i). Deduction of Corollary 1 from Theorem 34. Let us suppose that "'n+l-++("'m+1, "'''+1)2. Then, by Theorem 34, there are ordinals aA, kA such that I k,,1 = II [>. <,,1 IaAI (" <"'m); 1
(53)
",,,+1 -++ (ao + 1, al + 1, ... ).... ,
(54) Then, for>. <"'m, (55) For, let" <"'m, and suppose that (55) holds for>. <". Then, using 1,,1
2. Let N': =N .. ; 2N.
"'.. _ (P, ",..)2 By Theorem 14, this proposition implies Theorem 8. Deduction of Corollary 2 from Theorem 34. Let {3 <"'n. Suppose that ", .. -++({3, ",..)2. Then, by Theorem 34, there are ordinals aA, kA such that Ik,,1 =II[>.<"llaAI (Jt<{3-);
P. Wn -t7
(ao
1 + 1, ... )~-;
a,. -t7
1
(Wn)lI,.
(p.
< ~-).
Let us assume that, for some p. <{3-, we have a~ <W n (X
I I
LEMMA 2. Let T be a well ordered set, and [T]2=Ko+KI • Then there isS a set B=B(T)CT which has the following properties. We have [B]2CK I • If xET-B, then there is yEB such that {y, x}<EKo. PROOF. We may assume T~O. Choose l such that Ill> I TI. We define, inductively, yx (X
B = {y).:X
< m};
{Yx, y,.}< E KI (X < p. < m).
For, m is the least p. such that O
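For a finite well ordered set $T$ a set $B(T)$ with the two properties stated in Lemma 2 can be obtained by a single pass through $T$ in increasing order. The Python sketch below shows one natural reading of such an inductive construction. It is an illustration under stated assumptions, not a transcription of the proof: the name `greedy_k1_set` and the predicate `in_k1` (which decides whether a pair belongs to $K_1$) are hypothetical.

```python
def greedy_k1_set(elements, in_k1):
    """Build B greedily from a finite well ordered set.

    elements: the members of T listed in increasing order.
    in_k1(y, x): True if the pair {y, x} with y before x lies in K1;
                 otherwise the pair is taken to lie in K0.
    An element is admitted exactly when it forms K1-pairs with
    everything admitted before it.
    """
    b = []
    for x in elements:
        if all(in_k1(y, x) for y in b):
            b.append(x)
    return b


# Toy partition of the pairs of {0, ..., 9}:
# a pair is in K0 when the two numbers differ by a multiple of 3.
T = list(range(10))
pair_in_k1 = lambda y, x: (x - y) % 3 != 0
B = greedy_k1_set(T, pair_in_k1)
```

Every pair inside `B` lies in $K_1$ by construction, and each rejected `x` was rejected because some earlier `y` in `B` forms a $K_0$-pair with it, which are the two properties of $B(T)$.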
EE [Ko); 1pi > 1al. Let
S = a;
~
We choose an ordinal p such that xES. We define f,.(x) (p.
< v), if p.
< v;
f,.(x) ~ x.
Then we define f.(x) by the following rule. If f,,(x) =x for some p.
{jl'(x),i.(x)}< E Ko
(J.I
i.(x) ~ x
< v < p;il'(x) ¢ x); (v < p; xES).
If, for some x,i.(x) <x (v
I pi
I li.(x): v < p} I ~ I S I = I a I
=
follows. Hence, given xES, there is u(x)
i.(x)
<
x
(v
< u(x));
i.(x)(x) = x
(x E S).
Then, for fixed x, [li.(x):II~u(x)} ]2CKO'
u(x)
+ 1 < /3;
u(x)
< fr.
Put M.= {j.(x):U(X)~II} (lI
a
-H
_
(Mo
1
+ 1, Ml + 1, ... )p-.
Let 0
M. =
L
[YI' E MI' for p.
< v] li.(x) :u(x)
~ lI;ix(x) = yx for X < v}.
Now, for every choice of YI'EMI' (P.
x E B(T);
i.(x)
= x E B(T)
or
x
EE
i.(x) = z E B(T).
B(T);
In either case, f.(x)EB(T). In fact, the set T does not depend on x since T is the set of all yES such that
iYll' y}< E
Ko
(p.
< v).
All this proves that, given YI'EMI' (p.
xES;
u(x)
~
v;
fix) = YI'
(p.
< v),
then
i.(x) E B(T). By definition of B(T), we have [B(T) ]2CKl and therefore B(T) <'Y.
II
1
Hence M. is a sum of Ut
II
1
k.
k.1
THEOREM 35. Suppose that $\beta \ge r \ge 3$; $\beta, \beta^* \not\le \alpha$; $s > (r - 1)^2$. Then, for any type $\varphi$ such that $|\varphi| = |\alpha|$,
(56) $\varphi \nrightarrow (s, \beta)^r$.
COROLLARY. If $r \ge 3$; $s > (r - 1)^2$, then
(57) $\eta \nrightarrow (s, \omega_0 + 1)^r$,
(58) $\varphi \nrightarrow (s, \omega_1)^r$,
where $\varphi$ is any type such that $|\varphi| = |\lambda|$. The negative results (57) and (58) are not too far from the ultimate truth as is seen by comparing them with the following positive results. By Theorem 1,
(59) $\omega_0 \to (\omega_0, \omega_0, \ldots, \omega_0)^r_k$ $(k < \omega_0)$.
By Theorem 31, (60)
where $\varphi$ is any type such that $|\varphi| > \aleph_0$; $\omega_1, \omega_1^* \not\le \varphi$.
PROOF OF THEOREM 35. The corollary follows by applying the theorem to the following two cases. (i) $\beta = \omega_0 + 1$; $\alpha = \omega_0$; $\varphi = \eta$, (ii) $\beta = \omega_1$; $\alpha = \lambda$. The proof of the theorem depends on the following lemma due to Erdős and Szekeres [7]. Throughout, we put
$$s = (r - 1)^2 + 1.$$
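The lemma of Erdős and Szekeres referred to here guarantees a monotone subsequence of length $r$ in any sequence of $s = (r - 1)^2 + 1$ elements. The Python sketch below checks this exhaustively for very small $r$ and exhibits a sequence of length $(r - 1)^2$ for which it fails. The sketch assumes distinct elements, so the question of strict versus weak inequalities does not arise, and all function names are hypothetical.

```python
from itertools import permutations


def longest_monotone(seq):
    """Return (longest increasing, longest decreasing) subsequence lengths
    of a sequence of distinct values, by a simple quadratic recurrence."""
    n = len(seq)
    inc = [1] * n
    dec = [1] * n
    for j in range(n):
        for i in range(j):
            if seq[i] < seq[j]:
                inc[j] = max(inc[j], inc[i] + 1)
            elif seq[i] > seq[j]:
                dec[j] = max(dec[j], dec[i] + 1)
    return max(inc, default=0), max(dec, default=0)


def lemma_holds(r):
    """Check every arrangement of s = (r-1)^2 + 1 distinct values."""
    s = (r - 1) ** 2 + 1
    return all(max(longest_monotone(p)) >= r for p in permutations(range(s)))


def extremal_example(r):
    """A sequence of length (r-1)^2 with no monotone subsequence of
    length r: r-1 blocks, decreasing inside, increasing across blocks."""
    return [b * (r - 1) + (r - 2 - i) for b in range(r - 1) for i in range(r - 1)]
```

The exhaustive check grows factorially and is practical only for $r \le 3$; the extremal example shows that the value $s = (r - 1)^2 + 1$ cannot be lowered.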
LEMMA 3. If S is an ordered set, r>O, and if z(u)ES (u<s), then there is {uo, UI, • • • , ur-d < C [0, s] such that either
(p
+ 1 < r)
(p
+ 1 < r).
or We now prove the theorem. Let S<=cp; S«=a. Then
where
= {{XO' ... ,x,-d<: {xo, ... ,x,-d« c s}, Ku = {{xo, ... ,xr-d<: {xo, ... ,x,-d» C s}.
K IO
Case 1. There is A E [S]· such that [A ]'CKo. Then, if A = {s(O"): 0" <s}, an application of Lemma 3 shows the existence of BE [A]' such that B EKlo which is a contradiction. Case 2. There is A CS such that A<=~; [A ]'CKI. We shall prove that one of the two relations
[A]' C KIO, [A]' C Ku
(61) (62)
holds. If both (61) and (62) are false, then there are sets X, YCA such that
= {xo, ... ,x,-d<
(63)
X
(64)
Y =
{xo, ... ,x,-d«, {Yo, ... , Yr-d< = {Yo, ... , y,-d»· =
Then there is O"
THEOREM
E[JI
Nn-++(I nl +,
IA.I =a.(JI
(i). Let
I ;
proves (i).
PROOF OF
KI
(ii). By definition of a', the hypothesis of (i) holds for
some a" n, with In[ =a'; d=a= La •. Hence (ii) follows from (i). PROOF OF (iii). Let =a; a,=b(v
Inl
ab =
r: a, .... (a+,
b+)·.
PROOF OF (iv). Since ~.= r:[v
Let a
(i) Let a~~ol and let b be minimal such that ab>a.
;;;a~t~ab.
Then
(65)
A possible value for Nk is a+, (ii) ~""'+l-++(Nm+l. ~"''' +1)2 for all m. (iii) If No
If s<:;2; {30. {3~ ;j;ao; al -+-+ (j3o. (31. S
laol = la,l. then
+ 1, s + 1, ... , s + i)',!.
PROOF. Let S< = al; S «= ao. Then to every set XE [5J' there belongs a permutation 7I" ( X) :X---+q(X) defined by
x Let
71">.
=
{xo.
Xl • • . • , X'_l} < =
{x.. (O).
X.. (l) • . • . , X.. (, _ l ) } «
.
(X<s!) be all permutations of [0. s] and, in particular, 11"{\:~
---+ X;
1I"1:X ---+ s - 1 - >..
(X
< s).
Then $[S]^s = \sum [\nu < s!] K_\nu$, where $K_\nu = \{X : X \in [S]^s;\ \pi(X) = \pi_\nu\}$. Now suppose that $A \subset S$; $\nu < s!$; $[A]^s \subset K_\nu$. We shall deduce a contradiction in each of the three cases that follow and so establish the lemma.
Case 1. $\nu = 0$; $\bar{A}_< = \beta_0$. Then the contradiction $\beta_0 = \bar{A}_{\ll} \le \bar{S}_{\ll} = \alpha_0$ follows.
Case 2. $\nu = 1$; $\bar{A}_< = \beta_1$. Then the contradiction $\beta_1^* = \bar{A}_{\ll} \le \bar{S}_{\ll} = \alpha_0$ follows.
Case 3. $2 \le \nu < s!$; $\bar{A}_< = s + 1$. Let $\pi_\nu : \lambda \to \pi(\lambda)$. Then $A = \{x_0, x_1, \dots, x_s\}_<$ and therefore, putting $y_\lambda = x_{1+\lambda}$ ($\lambda < s$), we have
$$\{x_0, x_1, \dots, x_{s-1}\}_< = \{x_{\pi(0)}, x_{\pi(1)}, \dots, x_{\pi(s-1)}\}_{\ll}, \tag{66}$$
$$\{x_1, x_2, \dots, x_s\}_< = \{y_0, y_1, \dots, y_{s-1}\}_< = \{y_{\pi(0)}, y_{\pi(1)}, \dots, y_{\pi(s-1)}\}_{\ll} = \{x_{1+\pi(0)}, x_{1+\pi(1)}, \dots, x_{1+\pi(s-1)}\}_{\ll}. \tag{67}$$
If xo«xt, then alternate applications of (66) and (67) lead to XO«Xl<<X2<< ... «x. and so to the contradiction 11". =11"0, while, similarly, the assumption XO»Xl leads to the contradiction 11". =11"1. This proves the lemma. PROOF OF THEOREM 37, (i). Let a=N m ; b=NI, and let Fbe the set of all mappings X---+h(X) of [0, wzl into [0, wm ]. We order F by putting, for ho, hlEF, ho«hl if, and only if, there is Xo<w, such that Then I FI =a', and we have, by' Lemma (68)
*
-F«
Wm+l, W'+1 ~
-
= F,
2 of [6],
if a =Nm; b =N"
say.
We can choose a set XE [F]"k' Let x---+f(x) be a one-one mapping of X on [0, Wk], and S= {(x, v):xEX; v <WI("'}' WeorderSalphabetically, by means of a relation u
(69)
1. Let Sl CS, and suppose that Sl is an ordinal. Put Xl = L [v <Wk] Jx:(x, v)Esd. Then Xl is an ordinal, and 'Jtl~F. Hence, by (68), Xl<Wm+l; Ixd ~Nm=a
I < N.;
111
I sll = ~[x E xd I
(70)
= ~[x
{II: (x,
E Xdf(x) < WI;;
II) E sd I ;;;i! ~[x E XdN/(,,)
~ N.II XII ;;;i! N.INm < N.. k ; W"'. ~ cp.
2. Let S2CS, and suppose that (S2)* is an ordinal. Put X 2 = L[V<Wk]{X:(X, v)ESz}. Then (Xz)* is an ordinal, and (X!)* ;;;i!F. Hence, by (68), (X2) * <W'+l; IX2 1;;;i!N,. Put, forxEX 2, N(x) ={v:(x, V)ES2}' Then N(x) is an ordinal. On the other hand, I The authors are indebted to G. Kurepa for pointing out that the result of this lemma had already been obtained by F. Hausdorff, [9, Satz 14].
(N{x»* is an ordinal, since (N{x»*~ (S2)*' Hence N{x) <wo;
(71) 3. We now apply Lemma 4 to the case s = 2;
aD
= t/J;
Its hypotheses are satisfied, by (70), (71), (69). We obtain w... -++{w",., WI+l)2. This implies (65), by Theorem 14, and cOIl)pletes the proof of Theorem 37. REMARK. If a~No and N/c=a+, then (i) of Theorem 37 yields a stronger result then (ii) of Theorem 36. For, first of all, we note that the hypothesis of (i) of Theorem 37 holds, since a
N",. -++ (b+, N...)2.
(72)
On the other hand, (ii) of Theorem 36 gives (73) It is known that, for any m,
,
,
N.... = N....
(74)
Hence N~! >N~. =Nt =a+' =a+~b+, and (72) is stronger than (73). Since we were not able to find a reference for (74) we give, for the sake of completeness, a proof now. Case 1. Let N~=N ..
which is the desired contradiction. Case 2. Let N~.. >N,,=N:". Then m>O, and Nm = :E [v <w,,]NA • for some >'.<m. Then :E[v<w .. ]N"'}.,=N , for some l<wm; NA IWA.I ~ Ill; Nm = :E[V<W,,]NA,~ IIIN,,; m~n; N", .. = L:[~<wm]NI'; N~.. ~ Iwml ~N" which, again, is a contradiction. This proves (74).
.=
THEOREM
(75)
38. If Nm -++( I{30 I, "'m+l -++
(flo
I(:lll, ...
)~, then
+ 1, fll + 1, ... hr+l.
We give some applications of this theorem. (i) If liSl = =Nm+h then
I'YI
(76)
"'mH ~
(fJ
+ 1, 'Y + 1)1.
For, let a=Nm' and let b be minimal such that ab>a. Then, by Theorem 7, ab~(a+, b+)2 and therefore Nm+l~(liSl, )2. Now (76) follows from Theorem 3S. (ii) If N': =N",; liSl =Nm+1 ; =N..ft , then
I'YI
I'YI
"'''ft+1 ~ (fJ
(77)
+ 1, 'Y + 1)3.
In order to prove (77), we apply Theorem 36 (ii) to a=N.. ft . We note that, by (74), a' =N': =Nm; a'+=Nm+1' Hence, by Theorem 36, N"'ft~(liSl, )1, and (77) follows from Theorem 3S. (iii) If is! = Nr.+l; = N"'r'+1' then
!
I'YI
I'Y I
(7S)
"''''Hl+1 ~
(fJ
+ 1, 'Y + 1)8.
This follows immediately from Theorem 37 (ii) and an application of Theorem 3S. We note that on putting n=k+1 in (ii) above one obtains a result which is weaker than (7S). For, (ii) becomes: if N k +1 =N m ; iSl =N k +2 ; =N"'iW then (7S) holds. The proof of Theorem 3S depends on a lemma.
!
I'YI
LEMMA 5. Let a be an ordinal. Suppose that is. (v
a
~
(fJo
+ 1, fJl + 1, .. ')1:r+1.
PROOF. Let S =a; xES. Then L(x)
S'
C S;
[S']r+1
C K.;
+
S'
=
fJ.
+ 1,
then S' = SIt {x'}; SIt CL(x'); SIt =is.; [S"]rCK.(x') which contradicts the definition of K.(x'). Hence (SO) is impossible, and (79) follows. PROOF OF THEOREM 3S. If is<'''m+h then liSl ~Nm' liSl ~(liSol, liSll, ... )~. By Theorem 13, this implies iS~(iSo, iSl' ... )~. Now (75) follows from Lemma 5. THEOREM 39. (i) If (S1)
Iml >
(82)
L:[X
kI IA1 ",
then (83)
m -+ (ao
(ii) If r>O; w..-+(ao,
+ 1, a1 + 1, ... h0'+1.
alt ••. )~,
and
2M• ~ N..
(84)
then Wn+1 -+ (ao
for
+ 1, a1 + 1, .. ')10
"",1
II
< n,
•
Ikl
"",1
some." < k. Then, by Theorem 17, we may apply a suitable permutation to the system ao, aI, . . . so that for the new system, again denoted by ao, aI, ... , a., ... ('II
a. = r (II
< ko);
(k o ~ II
< k).
Here ko is some ordinal, O
Wn+l -+
(alo o
+ 1, alo o+1 + 1, ... )101""'1•
This shows that we may assume, without loss of generality, that Ia.1 >r for."
< w.. ]1 kllAt =
a.
IW"+11
It suffices to show that >a. If n=O, then a
for some
VA
< w.. ]21101IA( ~
<no Hence, by (84),
L: [X
< w.. ]2M'A,
a~ }:[}.<w.. ]N.. =N...
Deduction of (iii) from (ii). By definition of N:", "''''-('''m)~. Hence, by r applications of (ii), (85) follows. Now (86) follows from Theorem 15. PROOF OF (i). Let S=m; [S]O+I= E' [II
{xo, . . . , x,} == {Yo, ... , y,} expresses, by definition, the fact that, for some II
{xo, ... , x,}, {yo, ... , y,} E K•. We define f.(x) ES as follows. Let x be fixed, and suppose that, for some fixed ~, the elements f. = f.(x) have already been defined for all K<~. Then we putf~(x) =x, if f.=x for some K<~. If, on the other hand, f.~x for K<~, then we define f~ to be the first element y of S- {j.:K<~} such that (87)
{j.o,···, f'r-I' y} == {j.o, ... , f'r-l' x} for KO < ... < K,_1 < X.
This defines f. for all
K.
We now prove that
fA
(88)
if
x < "';
(89)
f>.
~
x.
First of all, (87) holds for y =x. Hence, by (89) and the definition of (88) in the case whenf,,=x. Now suppose that f,,~x. Then (87) holds for y =f" and, again, (88) follows. By (88) and 1 > 1 there is p(x) such that
fA, we havef>.<x. This proves
nl
ml,
f.(x) Let,forKo<'"
< f>.(x)
= x,
if K < p(x)
~
X.
{j.o(x), ... ,f'r-I(X), x} E Kg('o"""r-I,,,,j = K'(KO, ... , x). We now show that if x and z are such that
p(x) = pes),
(90)
(91)
K'(KO, •.• I Kr-1, x)
= K'(KO, ... , /C,_1, s) for KO < ...
then x=z. Let ~~p(x), and suppose that (92)
f.(x): = f.(s)
for
K
< X.
Then h.(x) is the first element y of S - {j.(x) : K <X} such that (87) holds, i.e.
{j.o(x) , ... ,j'r-I(X), y}EK'(KQ, " ' , Kr-l, x)
for KO< ...
Now, h.(z) is defined by the same property, with z in place of x, and (90) and (91) show thatjA(x) =jA(Z). We have thus proved, by induction, thatj.(x) =j.(z) for all K~p(X). In particular, by (90),
x = jp(:t:)(x) = jP(')(z) = z. We next prove that p(xo)?;,l for at least one Xo. Let us suppose, on the contrary, that p(x)
a(O') ~
~
I kl 'v{;
I ml = lsi L[O' < l] I k I'v(,
=
I L[u < l]{x:p(x)
= IT}
I
which contradicts (82). This proves that p(xo)?;,l for some suitable Xo. Put So= {j.(XO):K
a ~ ({30, (31, . • • )~.
I
\.8.\ ,
For, if r~ 1, then any a can be taken such that a\ > L[v
.8.,
.8•.
k\
k\
the least number $n$ such that $n \to (a_0, \dots, a_{k-1})^r$.
Without loss of generality, we restrict ourselves to the case $k \ge 2$. In [5, Theorem 1], an explicit upper estimate was given for the number $p_k(r; a, a, \dots, a)$, which, in that paper, was denoted by $R(k, r, a)$. Clearly, $p_k(1; a_0, \dots, a_{k-1}) = 1 + a_0 + \dots + a_{k-1} - k$. By Theorem 39,
$$p_k(r + 1; a_0 + 1, a_1 + 1, \dots, a_{k-1} + 1) \le 1 + \sum [\lambda < p_k(r; a_0, \dots, a_{k-1})]\, k^{\lambda^r}. \tag{93}$$
It is easily proved that, for $l < \omega_0$,
$$1 + \sum [\lambda < l]\, k^{\lambda^r} \le k^{l^r}. \tag{94}$$
For, (94) holds for $l = 0$, and if $0 < m < \omega_0$ and (94) holds for $l = m - 1$, then
$$1 + \sum [\lambda < m]\, k^{\lambda^r} \le k^{(m-1)^r} + k^{(m-1)^r} \le k^{1+(m-1)^r} \le k^{m^r},$$
so that (94) holds for $l = m$. We have thus proved the following recurrence relation.
THEOREM 40. If $2 \le k < \omega_0$; $0 < r \le a_0, \dots, a_{k-1} < \omega_0$, then
$$p_k(r + 1; a_0 + 1, \dots, a_{k-1} + 1) \le k^{p_k(r;\, a_0, \dots, a_{k-1})^r}.$$
In particular, we have, using the notation of [5],
$$R(k, r + 1, a + 1) \le k^{R(k, r, a)^r} \qquad (k \ge 2;\ 0 < r \le a).$$
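The recurrence $R(k, r+1, a+1) \le k^{R(k, r, a)^r}$ can be iterated numerically, starting from the value $p_k(1; a, \dots, a) = 1 + ka - k$. The Python sketch below is illustrative only; the function names are assumptions of this example, and it computes the upper bound given by the recurrence, not the true value of $R(k, r, a)$.

```python
def pigeonhole_number(sizes):
    """p_k(1; a_0, ..., a_{k-1}) = 1 + a_0 + ... + a_{k-1} - k."""
    return 1 + sum(sizes) - len(sizes)

def ramsey_upper_bound(k, r, a):
    """Upper bound for R(k, r, a) obtained by iterating
    R(k, r + 1, a + 1) <= k ** (R(k, r, a) ** r) down to r = 1."""
    if k < 2 or r < 1 or a < r:
        raise ValueError("expected k >= 2 and 1 <= r <= a")
    if r == 1:
        return pigeonhole_number([a] * k)
    return k ** (ramsey_upper_bound(k, r - 1, a - 1) ** (r - 1))

def inequality_94_holds(k, r, l):
    """Check 1 + sum over lam < l of k**(lam**r) <= k**(l**r) for finite k, r, l."""
    return 1 + sum(k ** (lam ** r) for lam in range(l)) <= k ** (l ** r)

print(ramsey_upper_bound(2, 2, 3))   # prints 8
```

For $k = 2$, $r = 2$, $a = 3$ the bound is 8, whereas 6 is already known to suffice, so the estimate is far from sharp. The value grows as a tower of exponentials in $r$.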
This is precisely the recurrence relation established in [5], from which the explicit estimate is deduced at once. This is no coincidence, as the method of proof of the present Theorem 39 is related to that used for proving Theorem 1 of [5].
Theorem 39 implies Theorem 4 (i), i.e.
$$(2^{\aleph_n})^+ \to (\aleph_{n+1})^2_{\aleph_n}. \tag{95}$$
For, clearly, $\aleph_{n+1} \to (\aleph_{n+1})^1_{\aleph_n}$, and therefore $\omega_{n+1} \to (\omega_{n+1})^1_{\aleph_n}$. Also,
$$\sum [\lambda < \omega_{n+1}]\, |\omega_n|^{|\lambda|} \le \aleph_n^{\aleph_n} \aleph_{n+1} = 2^{\aleph_n} = \aleph_{m_0},$$
say. Hence, by Theorem 39 (i), $\omega_{m_0+1} \to (\omega_{n+1} + 1)^2_{\aleph_n}$, and (95) follows.
THEOREM 41. If $r \ge 3$, then, for all $n$,
$$\omega_{n+1} \not\to (\omega_n + 2, \omega_0 + 1, r + 1, r + 1, \dots, r + 1)^r_{(r-1)!}. \tag{96}$$
As an application, consider the case $r = 3$; $n = 0$:
$$\omega_1 \not\to (\omega_0 + 2, \omega_0 + 1)^3. \tag{97}$$
This should be compared with
$$\omega_1 \to (\omega_0 + 1)^r_k \qquad (k < \omega_0;\ r \ge 0),$$
which follows from Theorem 39 (ii) and Theorem 1.
PROOF OF THEOREM 41. Let $\omega_n \le \beta < \omega_{n+1}$. We apply Lemma 4 to
$$s = r - 1; \quad \alpha_0 = \omega_n; \quad \alpha_1 = \beta; \quad \beta_0 = \omega_n + 1; \quad \beta_1 = \omega_0,$$
and obtain
$$\beta \not\to (\omega_n + 1, \omega_0, r, \dots, r)^{r-1}_{(r-1)!}.$$
This holds, a fortiori, if $\beta < \omega_n$. Now Lemma 5 proves (96).
A type $\beta$ is called indecomposable if the equation $\beta = \gamma + \delta$ implies that either $\gamma \ge \beta$ or $\delta \ge \beta$. It is known$^{10}$ that the indecomposable ordinals are those of the form $\omega^{\rho}$. The types $\eta$ and $\lambda$ are indecomposable. The next theorem asserts that in Lemma 4 the $s! - 2$ classes corresponding to the entries $s + 1$ in the partition relation may be suppressed in the special case when both $\beta_0$ and $\beta_1$ are indecomposable, at the cost, however, of raising the remaining entries slightly.
THEOREM 42. Let $s \ge 3$; $|\alpha_0| = |\alpha_1|$; $\beta_0, \beta_1^* \not\le \alpha_0$, and suppose that $\beta_0$ and $\beta_1$ are indecomposable. Then (98)
PROOF. Case 1. s=3. Consider a set S with two orders such that S<=al; S«=ao. Then [s]a=K o+'Kl, where Ko is the set of all sets {xo, Xl, X2}<= {Yo, Yl, Y2}«CS for which XA~YA is an even permutation of [0,3], i.e. one of the permutations 012, 120, 201. Now let us assume that (99) It suffices to deduce a contradiction in each of the two cases that follow. Case 1.1. There is A CS such that A< =(30; [A JaCKo. Let X, y, 13 denote elements of A. Then {x, y, z}<= {Xl, Yl, implies that Xl, Yl, 131 is a cyclic permutation of x, y, z. Put B = {x:y«x, whenever
zd«
10
[13. U7S-78].
y<x}; C=A -B. We shall prove three propositions about the two orders of A showing their effects on the partition A =B+C. 1. Let x
which is a contradiction. Case 1.2. There is ACS such that A<={31; [A]aCKI. Then {x, y, z} < = {Xl, Y1, ZI} «CA implies that Xl, yt, Z1 is an odd permutation of x, y, z. This is equivalent to saying that {x, y, z} < = {X2' Y2, Zll»CA implies that X2, Y2, Z2 is an even permutation of x, y, z. Hence the result of Case 1.1 holds if {3o is replaced by {3t, and "«" by "»". We note that.8~ is indecomposable. Hence, in place of (100) we have
fh ~ A» ~ :5»
= ao•
which is a contradiction. This shows that the assumption (99) was false, i.e. that (98) holds.
Case 2. $s > 3$. Then, by the result of Case 1, we have $\alpha_1 \not\to (\beta_0, \beta_1)^3$. By Theorem 15, this implies (98). This completes the proof of Theorem 42.
REMARK. If, in particular, $\beta_0$ and $\beta_1$ are ordinals, not zero, then $(s - 3) + \beta_\nu = \beta_\nu$, so that (98) can be replaced by
$$(s - 3) + \alpha_1 \not\to (\beta_0, \beta_1)^s.$$
We may also mention here the following corollary of two of our lemmas, in which $\lambda$ is the type of the continuum.
(101) If $2^{\aleph_0} = \aleph_n$, then $\omega_{n+1} \not\to (\omega_1 + 1, \omega_1 + 1)^3$.
PROOF. Let $\omega_n \le \alpha_1 < \omega_{n+1}$. Then, by Lemma 4, with $\alpha_0 = \lambda$, we have $\alpha_1 \not\to (\omega_1, \omega_1)^2$. By Lemma 5 this leads to (101).
THEOREM 43. If $r < s \le \beta_0$; $\alpha \not\to (\beta_0)^r_k$; $\beta_1 \to (s)^r_k$, then
$$\alpha \not\to (\beta_0, \beta_1)^s.$$
This proposition remains valid if the types $\alpha$, $\beta_0$, $\beta_1$ are replaced by cardinals.
PROOF. Let r<s:ifjo; a-+({jo, fjl)'; fjl-+(S);. We have to deduce that (102) Let 3'=a; [S]r= K/ =
E'
[JI
1:[11 < k]{A:A E
[sh
[A]r
C K,}.
Then there are BCS; X<2 such that [B]'CK1; B =fj.,... If X= 1, then B-+(s);, and therefore there are AE[Bl'; JI
x,. =
{X,., .•• , x-+r-d
(p :i m).
IBI I
Now let I' <m. Then, since = fjol ~s> r, there is Y,.E [B]' such that X,,+X"+IC Y,.. But Y,.EK/, so that X,., X,.+IE [y,.]rCK.,., for some JI,.
Ixi
1I
(103)
6-+ (3h.
Hence, by Theorem 43, N ..-++(Nlo 6)', and therefore W..-++(Wlo 6)3. Now, by Theorem 15, W ..-++(W1o r+3)r (r~3) follows and therefore, finally, (r ~ 3).
(104)
(b) By (97),
$$\omega_1 \not\to (\omega_0 + 2)^3_2. \tag{105}$$
By (103) and Theorem 39, we have
$$m \to (4)^3_2, \tag{106}$$
where $m = 1 + \sum [\lambda < 6]\, 2^{\lambda^2} < 2^{26}$. It now follows from (105) and (106), by Theorem 43, that $\omega_1 \not\to (\omega_0 + 2, 2^{26})^4$ and therefore, by Theorem 15, that
$$\omega_1 \not\to (\omega_0 + 2, 2^{26} + r - 4)^r \qquad (r \ge 4). \tag{107}$$
We now give a new proof of the theorem of Dushnik and Miller [2].$^{11}$ Our proof bears some resemblance to the original proof but can, we think, be followed more easily.
THEOREM 44. If $a \ge \aleph_0$, then $a \to (\aleph_0, a)^2$.
PROOF. We use induction with respect to $a$. By Theorem 1, the assertion is true for $a = \aleph_0$. We assume that $n > 0$ and that the assertion is true for $a < \aleph_n$, and we let
lsi
= b =~" > ~oj
We suppose that if
X E [s]~o,
then
[X]2
and we want to find YE [S]b such that [Y]2CKl' There is a maximal set A = {xv:v
x. E Uo(xo, ... , X,_l),
(108)
I Uo(xo, ... , x,) I =
b
(v
< l).
For, the relations (108) imply that [A ]2CKo. Put B = Uo(A). Then
I BUo(x) I < I B I =
(109)
(x E B).
b
Case 1. b' = b. Then we define Xv (v <wn ) as follows. Let v <W,,' and suppose that x"EB (Jl
I L: [IL < v]({x,,} + BUo(x,,» I < I BI,
and therefore there is x.EB- L:~
I
I
B(T) = {x:x E Bj p(x) = We define, by induction, V<W m , 11 12
T",
X"
r}
(T
< w",).
CJl <wm ) as follows. Let, for some
Theorem 3, (i). The symbol Uo was defined in 12.
(I-'
< v).
Then, by definition of m,
I 2:
~
< v; x E
+ BUo(x» I ~ 2: ~ < v; x E 2:[1-' < v](1 + b•.)b,. < b.
XI']({x} =
IDI
X,.](1
+b
T ,.)
Hence =b, where D=B- L~
I
b
=
I
I D I = 2: ~ < w I DB (I-') I ~ M~m < b. m]
Now we can choose X.E[DB(r.)]b •. Then X.CUI(X,.) (I-'
=
II X [v < k ]jS•.
[10]. THEOREM 45. If k>O, then llx [v
This multiplication has been considered by Hausdorff
PROOF. In spite of its somewhat complicated appearance the proof is, in fact, very simple, as can be seen by following it in the case k=2 or k=3. Let P= L[v
Xl,' •• ,
Xk}:X. E B. for v < k}.
Case 1. There is v < k such that the following condition is satisfied. There is a system of elements x,.EB,. (,."
function f,(xo,
Xlo ••• ,
i.) EB. such that, for any choice of x). EBA
(lI<X
(Xo, Xl, ••• , i" f,(xo, ••• , i,), X0+1, ••• , if;)
EE K •.
In particular, the function fo(io) is constant. Then we define, inductively, elements Y.(lI
E K,
which contradicts the definition of $f_\nu$. The theorem is proved.
8. Canonical partition relations. Let $S$ be an ordered set, and consider a partition
$$[S]^r = \sum{}' [\nu < k] K_\nu. \tag{110}$$
To every such disjoint partition there belongs an equivalence relation $\Delta$ on $[S]^r$ defined by the rule that elements $X$, $Y$ of $[S]^r$ are equivalent for $\Delta$, in symbols:
$$X \equiv Y (\cdot \Delta)$$
if, and only if, there is $\nu$
IX -
•
(/W
has, by definition, the following meaning. Whenever $\bar{S} = \alpha$, and (110) is any disjoint partition, with any arbitrary $k$, then there is $B \subset S$ such that $\bar{B} = \beta$, and such that the equivalence relation $\Delta$ belonging to (110), if restricted to $[B]^r$, coincides with some canonical equivalence relation $\Delta^{(r)}_{\epsilon_0 \dots \epsilon_{r-1}}$. The main result of [4] is expressible in the form $\omega_0 \to^* (\omega_0)^r$. The problem arises of finding canonical partition relations between types other than $\omega_0$. The main difference between canonical and noncanonical relations derives from the fact that if the canonical relation (111) holds then a certain choice of a subset of $S$ can be made irrespective of the number $|k|$ of classes of (110). The relation $\omega_0 \to^* (\omega_0)^1$ is equivalent to the statement that, if a denumerable set $S$ is arbitrarily split into nonoverlapping subsets $S_\nu$, then there are either infinitely many nonempty subsets $S_\nu$, or else at least one of the subsets $S_\nu$ is infinite. The following theorem establishes a connection between canonical and noncanonical partition relations.
11 [4; 5]. The notation used in the present note differs slightly from that used in the earlier papers.
I
THEOREM 46. (i) Let q. denote the number of distinct equi'IJalence relations which can be defined on the set [0, s]. Let (112)
s
=
(~);
I pI> 2r;
2.
€X
-+ (It),•.
Then a-+. (~)'. If I~I >4; a-+(~):oa, then a-+* (~)2. (ii) If m, r~O, and 2"'~N" for m~n<m+2r+l; v
$q_0 = 1$; $q_1 = 1$; $q_2 = 2$; $q_3 = 5$; $q_4 = 15$; $q_5 = 52$; $q_6 = 203$. A rough estimate for all $s$ is
q. ~ 2(;)
I
obtained by observing that an equivalence relation is fixed if for p.
s>O,
and hence q. ~s!. Deduction of (ii) from (i). By Theorem 39 (iii), we have """+2r+1 -+ (Cdm
+ 2r + lh2.+2
(k
< Cdo),
and the conclusion follows from Theorem 46 (i). PROOF OF (i). Suppose that (112) holds. Let S=a, and consider any disjoint partition (110). Let .6 be the equivalence relation on [S]r which belongs to (110). Our first aim is to define a certain equivalence relation .6 * on [S]2r. Let [[0, 2r llr= {Po, Ph ... , Then
p.-d ...
Let X= {xo, ... ,X2r-.}
if, and only if, {x;>.:XEP,.}
== pC· a(X»
== {x;>.:XEP.}C·a). Put, for X == y(·a*)
X, YE[S]2r,
if, and only if, ..1.(X) =..1.( Y). Now, by definition of q.. A* has at most q. nonempty classes. By (112), there is BCS such that B=p, and any two elements of [B]2r are equivalent for ..1.*. This means that, in the terminology of [4], A is invariant in [B]r (d. [4, p. 253 J). Choose any A CB such that 2,. < IA I
P = {Y2;>.: X < r} ;
Q = {Y2).+1-<).: X < r}.
By definition of A.~ .. '''-1' we have P =QC ·A.~o ......-1). Hence P
== Q( . ..1.);
Similarly, by considering the sets P and Q' = {Y2)'+I--<).: X< r}, we find that P =Q'( ·A.:o.. .•r-'); P =Q'(-A.); Ep ;;;; Kp(p
< r).
Hence Ep = Kp for all p. For reasons of symmetry, 'YIP = Kp, and so, finally Ep ="1P (p
positions of the "complete" even graph of cardinal-pair $a_0$, $a_1$, i.e. the graph obtained by joining every "point" of a set of cardinal $a_0$ to every point of a disjoint set of cardinal $a_1$. More generally, we introduce the notation
for the cartesian product of the t sets [S"I r", i.e. we put [So, .. "
st-d ro ... •• r.-
1
=
{(Xo, •. " X,_l):X"
E lS).J'l. for X < t}.
We shall always have 0
ao
boo
bOl b11
(113)
has, by definition, the following meaning. Whenever
}..
[So, ••• , sl_d ro ..... rl-l =
Is,,1 =a).
for
1: [p < k]K.,
I
then there are sets B).CS).. and an ordinal JI
47. If a' =b', then
(114)
In particular, (114) holds if 1
holds if, and only if, either b=O or b'>No. In particular,
(N
(116)
0) --+ {No
2ND
\No
No 2No
)1.1.
PROOF OF THEOREM 47. If 1
Ko = {(K, }.):K
< a;). < b; K +}. even}.
Then, for any K
a=
L
[II < ",.. ]a,;
b=
L
[v < ",.. ]b.,
where a.
IA.I
IB,I
Ko= {(x,y):xEA"j
yEB,;
= {(x, y):x E
y E B,;
Kl
A,,;
L'
[v <w .. ]B,;
< v < "'.. }, II;:;:; p. < ",.. }.
p.
IA"xl Ixl.
Let XE[A]a; yoEB. Then yoEB. for some v <w... We have ~ A"I
I
]1.1ct
L
L
I
Hence, a fortiori, (115) is false. It remains to prove (115) under the assumption b'>No. Let =No; =b; [A, B]I,I=Ko+KI. We may suppose that
IAI
IBI
(X, Y) E [A, B]Mo'& implies
(117)
[X, Y]I.IQ:K1•
Let (X, Y)E [A, B]Mo,b. 1. Put Yo= E[xEX]{y:(x, Y)EK o}. Then [X, Y= YO ]l,lCKl , and hence, by (117), I Y - Yol
E[x E
x]1
{y:y E Y; (x, y) E Ko}
I ~ I YYol
= h.
Since b'>No= lxi, this implies the existence of xoEX such that
I {y:y E
Y; (xo, y) E Ko}
I=
h.
Put
1/t(X, Y) = {y:y E Y; (xo, y) E Ko}.
t/J(X, Y) = Xo; Then
t/J(X, Y) EX; 1/t(X, Y) E [Y]b, [{ t/J(X, Y)}, 1/t(X, Y) ]1,1 C Ko.
2. Putf(y) = {x:xEX; (x, Y)EKo} (yE Y). If
yE Y
(118)
If(y) I < No,
implies
yl
thenb=1 =E[pcx; Ipi
[1/tI(X, Y), {t/Jl(X, Y)} ]1,1 C Ko. 3. We define sequences x., y .. X., Y. (II<Wo) as follows.
xo=t/J(A, B); Yo=1/t(A, B); yo=t/Jl(A- {xu}, Yo); X o=1/tl(A- {xo}, Yo). For O
x. = t/J(Xp-l, Y p-l - {yl'-d);
Y. = 1/t(Xl'-l, Y 1'-1
y. = t/Jl(Xl'-l - {x.}, Y.);
X, = 1/tl(Xl'-l -
-
{x.},
Then
x. E X,-l C X._ 2
-
{x__d C X.-3 - {XI'-2' x.-d C ...
C X 0 - {Xl, . . . , x..-d C A -
IXo,
. . . , x.-d ;
{Yl'-d); Y.).
y. E Y. C Y __1
-
{y--d C ... C Yo - {YO, .•• , y.-d
C B - {YO, • • • , y.-1} ; [{X.}, Y.]l.l C Ko; [{X.}. {Y., Y'-H, ... } ]1.1 C Ko; [X., {y.} ]1.1 C Ko; [{XO+l' X.+2, ... }. {y.} ]1.1 C Ko; (X", y.) E Ko (j.I, II
< "'0).
This proves (115). Finally, as is well known [13, p. 135], $(2^{\aleph_0})' > \aleph_0$, so that (116) is a special case of (115). This proves Theorem 48.
We introduce the notation
$$\binom{a}{b} = |[A]^b|,$$
where $A$ is a set such that $|A| = a$. If $a, b < \aleph_0$, then $\binom{a}{b}$ is the ordinary binomial coefficient.
The following lemma is probably well known.
LEMMA 6. If $a \ge \aleph_0$, then $\binom{a}{b} = a^b$ for $b \le a$ and $\binom{a}{b} = 0$ for $b > a$.
PROOF. The result is obvious for $b = 0$ and for $b > a$. Now let 0
On the other hand, if y.EA. for JI
and the lemma follows.
THEOREM 49. Suppose that O<s
III
{a1} I k Itao} '0 '1
=
ao (119)
a
'.
(120)
al-l
a~
.
(121)
~ b: 1'0'"
I
Then
•••
I
k
b
'" ... ,Ft-l
b'-l
I
[b~
.
~
b'_l
al-l
'._1 '
','.-1
b.-l
.
~
{a,_l}
'0, ... ,'1-1
k
PROOF. We use the notation of the partition calculus explained in the proof of Theorem 25. In addition, if $\Delta$ is a partition of $M$, and $M' \subset M$, then the relation
$$|\Delta| \le a \text{ in } M'$$
expresses the fact that the number of classes of $\Delta$ containing at least one element of $M'$ is at most $a$. (119) and (120) imply that
$$b_\lambda \le a_\lambda \quad \text{for } \lambda < t. \tag{122}$$
Let IAAI =aA for)'
IAI I
(123)
IA I ~ 1
in
[Bo,··· B,_d ro , .. ·,rl-l. t
Put, for XAE [AA]"- (s~).
=
II A(Xo, ••• , X ,- l),
where the last product is extended over all systems (X o, ••• , X ,_l ) E [Ao, ... , A._d ro , ... ".-1. By (122), this product has at least one ~ Hence, by (120), there is BAE [AA]~ factor. It follows that (s~).
lAd Ill.
I All
~
1 in
[B.,···, B,-d'., .. ·"1-1.
By (122), we can choose (s ;;; ).
a2(XO,
••• ,
< t).
XI-I) = a(Xo, ... , X,-I, Y .. ... , Y,_I).
;;;Ikl,
Then 1.121 and therefore, by (119), thereisB'AE[A'A]b'A (X<s) such that 1.121 ;;; 1 in [Bo, ... , B._d ro •...• r'-I. By (122), we can choose Y'AE [B'A]"A (X <s). Then, for any X'AE [B'A]"A (X
= (Xo, ... , X.-l , Y" ... , Y 1-1)
=(Yo, ... , Y.-
1,
Y" ... , Y 1-1)( ·a).
This proves (123) and so establishes Theorem 49. We note the following special case of Theorem 49. COROLLARY.
If
then
( ao) -+ al
(bO)I.I. bl
,.
We give some applications of this last result. (a) If O
This is best possible in the sense that, if 2d -1 is replaced by 2d - 2, the last relation becomes false. We even have, as is easily seen,
(b) If al >
Ikl >0; at > Ikl ao , then
In particular, if we assume that 2Mo=N" then
More generally, if 2Mn=~"+lr then
(
~n
)
\NnH
(~n
)1.1
\NnH I .
-+
This is best possible in the following strong sense. (c) If a'~ kl ; b>O, then
I
To prove (c), choose n~k such that Inl =a'. Then a= L:[v
I
THEOREM 50. If
21'0 =~1,
then
PROOF. Let A = [0, wo]; B = [0, wd. According to SierphlskP4 the assumption 21'0 = ~l implies, and is, in fact, equivalent to, the existence of a sequence of functions fA(y) EB (XEA), defined for yEB, such that, given any YE [B]Ml, there is XoEA such that{fA(y) :yE Y} =B (Xo~X<wo). Then [A, B]l.l=Ko+'Klr where Ko={ex, y):XEA; yEB;fA(Y) =O}. If, now, (X, Y) E [A, B]Mo.MI, then, by the property ofthe functions fA, there is XEX; Yo, YlE Y such thatfA(Y.) =v (v <2). Then (X, y.)E[X, Y]l.lK. (v<2). This proves the assertion. THEOREM 51. If a, b> 1; ( : ) -+ ( :
Then ( ::) -+ (
X·
1 •
::X'l . I
PROOF. Let 1 and m be such that III =a'; ml =b'. Then a= L:[X]aA; b= L:&t<m]b,., where aA
IAAI
14
[14], French translation in [16]. See also [1].
I
A = E' [X
=
< I; A"K ¢ o};
{~:~
B"
= {I':I' < m; B"Y ¢
IAAI
OJ.
Then a= Ixi = E[XEA"]IA"XI; IA"xl ~
I
I
COROLLARY,
For, if
If
III
I
a> 1, then
(:') -H (:/X,l . ( :') _ (:/X'1 ,
then, by Theorem 51 and the known equation a" =a' , we conclude that
which contradicts Theorem 47. We may mention that there is an obvious extension of Theorem 51 to relations for any $t$. In conclusion, we collect some polarised partition relations involving the first three infinite cardinals. They follow from Theorems 47-50. We put $\aleph_0 = a$; $\aleph_1 = b$; $\aleph_2 = d$.
(:) -HG :Y'\ (:) -HG ~Y'\ (~) -H (~ :y'l
(:)_(: :Y'\ (:)_(: :y'l
(Theorem 47); (Theorem 48).
If 2" =b, then
(:)-+(: :y.l
and
(:)~(:
:y.l
(Theorem 49)
(Theorem 50).
It seems curious that the continuum hypothesis should enable us both to strengthen
to
and to show that
cannot be strengthened to
REFERENCES
1. F. Bagemihl and H. D. Sprinkle, On a proposition of Sierpiński, Proc. Amer. Math. Soc. vol. 5 (1954) pp. 726-728.
2. B. Dushnik and E. W. Miller, Partially ordered sets, Amer. J. Math. vol. 63 (1941) p. 605.
3. P. Erdős, Some set-theoretical properties of graphs, Revista Universidad Nacional de Tucumán, Serie A vol. 3 (1942) pp. 363-367.
4. P. Erdős and R. Rado, A combinatorial theorem, J. London Math. Soc. vol. 25 (1950) pp. 249-255.
5. ---, Combinatorial theorems on classifications of subsets of a given set, Proc. London Math. Soc. (3) vol. 2 (1952) pp. 417-439.
6. ---, A problem on ordered sets, J. London Math. Soc. vol. 28 (1953) pp. 426-438.
7. P. Erdős and G. Szekeres, A combinatorial problem in geometry, Compositio Math. vol. 2 (1935) pp. 463-470.
8. P. Hall, On representations of sub-sets, J. London Math. Soc. vol. 10 (1934) pp. 26-30.
9. F. Hausdorff, Grundzüge einer Theorie der geordneten Mengen, Math. Ann. vol. 65 (1908) pp. 435-506.
10. ---, Mengenlehre, 3d ed., 1944, §16.
11. R. Rado, Direct decompositions of partitions, J. London Math. Soc. vol. 29 (1954) pp. 71-83.
12. F. P. Ramsey, On a problem of formal logic, Proc. London Math. Soc. (2) vol. 30 (1930) pp. 264-286.
13. W. Sierpiński, Leçons sur les nombres transfinis, Paris, 1928.
14. ---, O jednom problemu G. Rusjevića koji se odnosi na hipotesu kontinuuma, Glas Srpske Kraljevske Akademije vol. 152 (1932) pp. 163-169.
15. ---, Sur un problème de la théorie des relations, Annali R. Scuola Normale Superiore de Pisa Ser. 2 vol. 2 (1933) pp. 285-287.
16. ---, Concernant l'hypothèse du continu, Académie Royale Serbe. Bulletin de l'Académie des Sciences Mathématiques et Naturelles. A. Sciences Mathématiques et Physiques vol. 1 (1933) pp. 67-73.
17. A. Tarski, Quelques théorèmes sur les alephs, Fund. Math. vol. 7 (1925) p. 2.
HEBREW UNIVERSITY OF JERUSALEM AND UNIVERSITY OF READING
Reprinted from Bull. Amer. Math. Soc. 62 (1956), 427-489
MAXIMAL FLOW THROUGH A NETWORK
L. R. FORD, JR. AND D. R. FULKERSON
Introduction. The problem discussed in this paper was formulated by T. Harris as follows:
"Consider a rail network connecting two cities by way of a number of intermediate cities, where each link of the network has a number assigned to it representing its capacity. Assuming a steady state condition, find a maximal flow from one given city to the other."
While this can be set up as a linear programming problem with as many equations as there are cities in the network, and hence can be solved by the simplex method (1), it turns out that in the cases of most practical interest, where the network is planar in a certain restricted sense, a much simpler and more efficient hand computing procedure can be described.
In §1 we prove the minimal cut theorem, which establishes that an obvious upper bound for flows over an arbitrary network can always be achieved. The proof is non-constructive. However, by specializing the network (§2), we obtain as a consequence of the minimal cut theorem an effective computational scheme. Finally, we observe in §3 the duality between the capacity problem and that of finding the shortest path, via a network, between two given points.
1. The minimal cut theorem. A graph G is a finite, 1-dimensional complex, composed of vertices a, b, c, ..., e, and arcs $\alpha(ab)$, $\beta(ac)$, ..., $\delta(ce)$. An arc $\alpha(ab)$ joins its end vertices a, b; it passes through no other vertices of G and intersects other arcs only in vertices. A chain is a set of distinct arcs of G which can be arranged as $\alpha(ab)$, $\beta(bc)$, $\gamma(cd)$, ..., $\delta(gh)$, where the vertices a, b, c, ..., h are distinct, i.e., a chain does not intersect itself; a chain joins its end vertices a and h.
We distinguish two vertices of G: a, the source, and b, the sink.$^1$ A chain flow from a to b is a couple (C; k) composed of a chain C joining a and b, and a non-negative number k representing the flow along C from source to sink.
Each arc in G has associated with it a positive number called its capacity. We call the graph G, together with the capacities of its individual arcs, a network. A flow in a network is a collection of chain flows which has the property that the sum of the numbers of all chain flows that contain any arc is no greater than the capacity of that arc. If equality holds, we say the arc is saturated by the flow. A chain is saturated with respect to a flow if it contains a saturated arc. The value of a flow is the sum of the numbers of all the chain flows which compose it.
Received September 20, 1955.
1 The case in which there are many sources and sinks with shipment permitted from any source to any sink is obviously reducible to this.
It is clear that the above definition of flow is not broad enough to include everything that one intuitively wishes to think of as a flow, for example, sending trains out a dead end and back or around a circuit, but as far as effective transportation is concerned, the definition given suffices.
A disconnecting set is a collection of arcs which has the property that every chain joining a and b meets the collection. A disconnecting set, no proper subset of which is disconnecting, is a cut. The value of a disconnecting set D (written v(D)) is the sum of the capacities of its individual members. Thus a disconnecting set of minimal value is automatically a cut.
THEOREM 1. (Minimal cut theorem). The maximal flow value obtainable in a network N is the minimum of v(D) taken over all disconnecting sets D.
Proof. There are only finitely many chains joining a and b, say n of them. If we associate with each one a coordinate in n-space, then a flow can be represented by a point whose jth coordinate is the number attached to the chain flow along the jth chain. With this representation, the class of all flows is a closed, convex polytope in n-space, and the value of a flow is a linear functional on this polytope. Hence, there is a maximal flow, and the set of all maximal flows is convex. Now let S be the class of all arcs which are saturated in every maximal flow.
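The definitions of chain flow, flow, saturation and value can be made concrete on a small example. The Python sketch below is only an illustration under stated assumptions: the vertex names `x` and `y`, the capacities, and all function names are hypothetical and do not come from the paper. A chain is written as the tuple of its vertices from the source a to the sink b.

```python
def arc(u, v):
    """An arc is identified with the unordered pair of its end vertices."""
    return frozenset((u, v))

def arc_loads(chain_flows):
    """Sum, for each arc, the numbers of all chain flows that contain it."""
    loads = {}
    for chain, k in chain_flows:
        for u, v in zip(chain, chain[1:]):
            loads[arc(u, v)] = loads.get(arc(u, v), 0) + k
    return loads

def is_flow(capacity, chain_flows):
    """True if every chain has distinct vertices, every number is non-negative,
    and no arc carries more than its capacity."""
    for chain, k in chain_flows:
        if k < 0 or len(set(chain)) != len(chain):
            return False
    loads = arc_loads(chain_flows)
    return all(a in capacity and load <= capacity[a] for a, load in loads.items())

def flow_value(chain_flows):
    """The value of a flow is the sum of the numbers of its chain flows."""
    return sum(k for _, k in chain_flows)

def saturated_arcs(capacity, chain_flows):
    """Arcs whose load equals their capacity."""
    loads = arc_loads(chain_flows)
    return {a for a, c in capacity.items() if loads.get(a, 0) == c}

# Hypothetical network with source "a", sink "b" and two intermediate vertices.
capacity = {arc("a", "x"): 3, arc("a", "y"): 2, arc("x", "y"): 1,
            arc("x", "b"): 2, arc("y", "b"): 3}
flow = [(("a", "x", "b"), 2), (("a", "x", "y", "b"), 1), (("a", "y", "b"), 2)]
print(is_flow(capacity, flow), flow_value(flow))   # prints: True 5
```

In this example every arc is saturated by the given flow, so every chain joining a and b is saturated as well.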
LEMMA 1. S is a disconnecting set.
Suppose not. Then there exists a chain $\alpha_1, \alpha_2, \dots, \alpha_m$ joining a and b with $\alpha_i \notin S$ for each i. Hence, corresponding to each $\alpha_i$, there is a maximal flow $f_i$ in which $\alpha_i$ is unsaturated. But the average of these flows,
$$f = \frac{1}{m} \sum f_i,$$
is maximal and $\alpha_i$ is unsaturated by f for each i. Thus the value of f may be increased by imposing a larger chain flow on $\alpha_1, \alpha_2, \dots, \alpha_m$, contradicting maximality.
Notice that the orientation assigned to an arc of S by a positive chain flow of a maximal flow is the same for all such chain flows. For suppose first that $(C_1; k_1)$, $(C_2; k_2)$ are two chain flows occurring in a maximal flow f, $k_1 \ge k_2 > 0$, where
$$C_1 = \alpha_1(a a_1), \alpha_2(a_1 a_2), \dots, \alpha_j(a_{j-1} a_j), \dots, \alpha_r(a_{r-1} b),$$
$$C_2 = \beta_1(a b_1), \beta_2(b_1 b_2), \dots, \beta_k(b_{k-1} b_k), \dots, \beta_s(b_{s-1} b),$$
and $\alpha_j(a_{j-1} a_j) = \beta_k(b_{k-1} b_k) \in S$, $a_{j-1} = b_k$, $a_j = b_{k-1}$. Then
$$C_1' = \alpha_1, \dots, \alpha_{j-1}, \beta_{k+1}, \dots, \beta_s,$$
$$C_2' = \beta_1, \dots, \beta_{k-1}, \alpha_{j+1}, \dots, \alpha_r$$
contain chains $C_1''$, $C_2''$ joining a and b, and another maximal flow can be obtained from f as follows. Reduce the $C_1$ and $C_2$ components of f each by $k_2$, and increase each of the $C_1''$ and $C_2''$ components by $k_2$. This unsaturates the arc $\alpha_j$, contradicting its definition as an element of S. On the other hand, if $(C_1; k_1)$, $(C_2; k_2)$ were members of distinct maximal flows $f_1$, $f_2$, consideration of $f = \frac{1}{2}(f_1 + f_2)$ brings us back to the former case.
Hence, the arcs of S have a definite orientation assigned to them by maximal flows. We refer to that vertex of an arc $\alpha \in S$ which occurs first in a positive chain flow of a maximal flow as the left vertex of $\alpha$. Now define a left arc of S as follows: an arc $\alpha$ of S is a left arc if and only if there is a maximal flow f and a chain $\alpha_1, \alpha_2, \dots, \alpha_k$ (possibly null) joining a and the left vertex of $\alpha$ with no $\alpha_i$ saturated by f. Let L be the set of left arcs of S.
LEMMA 2. L is a disconnecting set.
Given an arbitrary chain $\alpha_1(a\,a_1), \alpha_2(a_1 a_2), \ldots, \alpha_m(a_{m-1} b)$ joining a and b, it must intersect S by Lemma 1. Let $\alpha_t(a_{t-1}, a_t)$ be the first $\alpha_j \in S$. Then for each $\alpha_i$, $i < t$, there is a maximal flow $f_i$ in which $\alpha_i$ is unsaturated. The average of these flows provides a maximal flow f in which $\alpha_1, \alpha_2, \ldots, \alpha_{t-1}$ are unsaturated. It remains to show that this chain joins a to the left vertex of $\alpha_t$, i.e., $a_{t-1}$ is the left vertex of $\alpha_t$. Suppose not. Then the maximal flow f contains a chain flow
$$[\beta_1(a\,b_1),\ \beta_2(b_1, b_2),\ \ldots,\ \beta_r(b_{r-1}, b);\ k], \quad k > 0, \quad \beta_s = \alpha_t, \quad b_{s-1} = a_t, \quad b_s = a_{t-1}.$$
Let the amount of unsaturation in f of $\alpha_i$ ($i = 1, \ldots, t-1$) be $k_i > 0$. Now alter f as follows: decrease the flow along the chain $\beta_1, \beta_2, \ldots, \beta_r$ by $\min[k, k_i] > 0$ and increase the flow along the chain contained in $\alpha_1, \ldots, \alpha_{t-1}, \beta_{s+1}, \ldots, \beta_r$ by this amount. The result is a maximal flow in which $\alpha_t$ is unsaturated, a contradiction. Hence $\alpha_t \in L$.
LEMMA 3. No positive chain flow of a maximal flow can contain more than one arc of L.
Assume the contrary, that is, there is a maximal flow $f_1$ containing a chain flow $[\beta_1(a\,b_1), \beta_2(b_1 b_2), \ldots, \beta_r(b_{r-1}, b);\ k]$, $k > 0$, with arcs $\beta_i, \beta_j \in L$, $\beta_i$ occurring before $\beta_j$, say, in the chain. Let $f_2$ be that maximal flow for which there is an unsaturated chain $\alpha_1(a\,a_1), \alpha_2(a_1, a_2), \ldots, \alpha_s(a_{s-1}, b_{j-1})$ from a to the left vertex of $\beta_j$. Consider $f = \tfrac{1}{2}(f_1 + f_2)$. This maximal flow contains the chain flow $[\beta_1, \beta_2, \ldots, \beta_r;\ k']$ with $k' \ge \tfrac{1}{2}k$, and each $\alpha_i$ ($i = 1, \ldots, s$) is unsaturated by $k_i > 0$ in f. Again alter f: decrease the flow along $\beta_1, \beta_2, \ldots, \beta_r$ by $\min[k', k_i] > 0$ and increase the flow along the chain contained in $\alpha_1, \alpha_2, \ldots, \alpha_s, \beta_j, \ldots, \beta_r$ by the same amount, obtaining a maximal flow in which $\beta_i$ is unsaturated, a contradiction.
Now to prove the theorem it suffices only to remark that the value of every flow is no greater than v(D) where D is any disconnecting set; and on the other hand we see from Lemma 3 and the definition of S that in adding the capacities of arcs of L we have counted each chain flow of a maximal flow just once. Since by Lemma 2 L is a disconnecting set, we have the reverse inequality. Thus L is a minimal cut and the value of a maximal flow is v(L).
We shall refer to the value of a maximal flow through a network N as the capacity of N (cap (N)). Then note the following corollary of the minimal cut theorem.
COROLLARY. Let A be a collection of arcs of a network N which meets each cut of N in just one arc. If N' is a network obtained from N by adding k to the capacity of each arc of A, then cap (N') = cap (N) + k.
It is worth pointing out that the minimal cut theorem is not true for networks with several sources and corresponding sinks, where shipment is restricted to be from a source to its sink. For example, in the network (Fig. 1) with shipment from $a_i$ to $b_i$ and capacities as indicated, the value of a minimal disconnecting set (i.e., a set of arcs meeting all chains joining sources and corresponding sinks) is 4, but the value of a maximal flow is 3.
Fig. 1
2. A computing procedure for source-sink planar networks.² We say that a network N is planar with respect to its source and sink, or briefly, N is ab-planar, provided the graph G of N, together with arc ab, is a planar graph (2; 3). (For convenience, we suppose there is no arc in G joining a and b.) The importance of ab-planar networks lies in the following theorem.
THEOREM 2. If N is ab-planar, there exists a chain joining a and b which meets each cut of N precisely once.
² It was conjectured by G. Dantzig, before a proof of the minimal cut theorem was obtained, that the computing procedure described in this section would lead to a maximal flow for planar networks.
Proof. We may assume, without loss of generality, that the arc ab is part of the boundary of the outside region, and that G lies in a vertical strip with a located on the left bounding line of the strip, b on the right. Let T be the chain joining a and b which is top-most in N. T has the desired property, as we now show. Suppose not. Then there is a cut D, at least two arcs of which are in T. Let these be $\alpha_1$ and $\alpha_2$, with $\alpha_1$ occurring before $\alpha_2$ in following T from a to b. Since D is a cut, there is a chain $C_1$ joining a and b which meets D in $\alpha_1$ only. Similarly there is a chain $C_2$ meeting D in $\alpha_2$ only. Let $C_2'$ be that part of $C_2$ joining a to an end point of $\alpha_2$. It follows from the definition of T that $C_1$ and $C_2'$ must intersect. But now, starting at a, follow $C_2'$ to its last intersection with $C_1$, then $C_1$ to b. We thus have a chain from a to b not meeting D, contradicting the fact that D is a cut. Symmetrically, of course, the bottom-most chain of N has the same property. Notice that this theorem is not valid for networks which are not ab-planar. A simple example showing this is provided by the "gas, water, electricity" graph (Fig. 2), in which every chain joining a and b meets some cut in three arcs.
Fig. 2
Theorem 2 and the corollary to Theorem 1 provide an easy computational procedure for determining a maximal flow in a network of the kind here considered. Simply locate a chain having the property of Theorem 2; this can be done at a glance by finding the two regions separated by arc ab, and taking the rest of the boundary of either region (throwing out portions of the boundary where it has looped back and intersected itself, so as to get a chain). Impose as large a chain flow (T; k) as possible on this chain, thereby saturating one or more of its arcs. By the corollary, subtracting k from each capacity in T reduces the capacity of N by k. Delete the saturated arcs, and proceed as before. Eventually, the graph disconnects, and a maximal flow has been constructed.
3. A minimal path problem. For source-sink planar networks, there is an interesting duality between the problem of finding a chain of minimal capacity-sum joining source and sink and the network capacity problem, which lies in the fact that chains of N joining source and sink correspond to cuts (relative to two particular vertices) of the dual³ of N and vice versa. More precisely, suppose one has a network N, planar relative to two vertices a and b, and wishes to find a chain joining a and b such that the sum of the numbers assigned to the arcs of the chain is minimal. An easy way to solve this problem is as follows. Add the arc ab, and construct the dual of the resulting graph G. Let a' and b' be the vertices of the dual which lie in the regions of G separated by ab. Assign each number of the original network to the corresponding arc in the dual. Then solve the capacity problem relative to a' and b' for the dual network by the procedure of §2. A minimal cut thus constructed corresponds to a minimal chain in the original network.
³ The dual of a planar graph G is formed by taking a vertex inside each region of G and connecting vertices which lie in adjacent regions by arcs. See (1; 3).
REFERENCES
1. G. B. Dantzig, Maximization of a linear function of variables subject to linear inequalities: Activity analysis of production and allocation (Cowles Commission, 1951).
2. H. Whitney, Non-separable and planar graphs, Trans. Amer. Math. Soc., 34 (1932), 339-362.
3. ——, Planar graphs, Fundamenta Mathematicae, 21 (1933), 73-84.
Rand Corporation, Santa Monica, California
Reprinted from Canad. J. Math. 8 (1956). 399-404
ON PICTURE-WRITING* G. POLYA, Stanford University
To write "sun", "moon" and "tree" in picture-writing, one draws simply a circle, a crescent and some simplified, conventionalized picture of a tree, respectively. Picture-writing was used by some tribes of red Indians and it may well be that more advanced systems of writing evolved everywhere from this primitive system. And so picture-writing may be the ultimate source of the Greek, Latin and Gothic alphabets, the letters of which we currently use as mathematical symbols. I wish to observe that also the primitive picture-writing may be of some use in mathematics. In what follows, I wish to show how the method of generating functions, important in Combinatory Analysis, can be quite intuitively evolved from "figurate series" the terms of which are pictures (or, more precisely, variables represented by pictures). Picture-writing is easy to use on paper or blackboard, but it is clumsy and expensive to print. Although I have presented several times the contents of the following pages orally, I hesitated to print it.† I am indebted to the editor of the MONTHLY who encouraged me to publish this article. I shall try to explain the general idea by discussing three particular examples the first of which, although the easiest, will be very broadly treated.
1.1. In how many ways can you change one dollar? Let us generalize the proposed question. Let $P_n$ denote the number of ways of paying the amount of n cents with five kinds of coins: cents, nickels, dimes, quarters and half-dollars. The "way of paying" is determined if, and only if, it is known how many coins of each kind are used. Thus, $P_4 = 1$, $P_5 = 2$, $P_{10} = 4$. It is appropriate to set $P_0 = 1$. The problem stated at the outset requires us to compute $P_{100}$. More generally, we wish to understand the nature of $P_n$ and eventually devise a procedure for computing $P_n$. It may help to visualize the various possibilities. We may use no cent, or just 1 cent, or 2 cents, or 3 cents, or ....
These alternatives are schematically pictured in the first line of Figure 1;** "no cent" is represented by a square which may remind us of an empty desk. The second line pictures the alternatives: using no nickel, 1 nickel, 2 nickels, .... The following three lines represent in the same way the possibilities regarding dimes, quarters and half-dollars. We have to choose one picture from the first line, then one picture from the second line, and so on, choosing just one picture from each line; combining (juxtaposing) the five pictures so selected, we obtain a manner of paying. Thus, Figure 1 exhibits directly the alternatives regarding each kind of coin and, indirectly, all manners of paying we are concerned with.
* Address presented at the meeting of the Association in Athens, Ga., March 16, 1956.
† I used it, however, in research. See 2, especially p. 156, where the "figurate series" are introduced in a closely related, but somewhat different, form. (Numerals in boldface indicate the references at the end of the paper.)
** A photo of actual coins would be more effective here but too clumsy in the following figures.
FIG. 1. A complete survey of alternatives.
FIG. 2. Genesis of the figurate series.
The main discovery consists in observing that, in fact, we combine the pictures in Figure 1 according to certain rules of algebra: if we conceive each line of Figure 1 as the sum of the pictures contained in it and we consider the product of these five (infinite) sums, in short, if we pass from Figure 1 to Figure 2, and we develop the product, the terms of this development will represent the various manners of paying we are concerned with. The one term of the product exhibited in the last line of Figure 2 as an example represents one manner of paying one dollar (putting down no cents, three nickels, one dime, one quarter and one half-dollar). The sum of all such terms is an infinite series of pictures; each picture exhibits one manner of paying, different terms represent different manners of paying, and the whole series of pictures, appropriately called the figurate series, displays all manners of paying that we have to consider when we wish to compute the numbers $P_n$.
1.2. Yet this way of conceiving Figure 2 raises various difficulties. First, there is a theoretical difficulty: in which sense can we add and multiply pictures? Then, there is a practical difficulty: how can we pick out conveniently from the whole figurate series the terms counted by $P_n$, that is, those cases in which the sum paid amounts to just n cents? We avoid the theoretical difficulty if we employ the pictures, these symbols of a primitive writing, as we are used to employing the letters of more civilized alphabets: we regard each picture as the symbol for a variable or indeterminate.† To master the other difficulty, we need one more essential idea: we substitute for each "pictorial" variable (that is, variable represented by a picture) a power of a new variable x, the exponent of which is the joint value of the coins represented by the picture, as it is shown in detail by Figure 3. The third line of Figure 3 shows a lucky coincidence: we have conceived the three juxtaposed nickels as one picture, as the symbol of one variable (corresponding to the use of precisely three nickels). For this variable we have to substitute $x^{15}$ according to our general rule; yet even if we substitute for each of the juxtaposed coins the correct power of x and consider the product of these juxtaposed powers, we arrive at the same final result $x^{15}$.
FIG. 3. Powers of one variable substituted for variables represented by pictures.
The last line of Figure 3 is very important. It shows by an example (see the last line of Fig. 2) how the described substitution affects the general term of the figurate series. Such a term is the product of 5 pictures (pictorial variables). For each factor a power of x is substituted whose exponent is the value in cents of that factor; the exponent of the product, obtained as a sum of 5 exponents, will be the joint value of the factors. And so the substitution indicated by Figure 3 changes each term of the figurate series into a power $x^n$. As the figurate series represents each manner of paying just once, the exponent n arises precisely $P_n$ times so that (after suitable rearrangement of the terms) the whole figurate series goes over into
$$P_0 + P_1 x + P_2 x^2 + \cdots + P_n x^n + \cdots. \tag{1}$$
In this series the coefficient of $x^n$ enumerates the different manners of paying the amount of n cents, and so (1) is suitably called the enumerating series. The substitution indicated by Figure 3 changes the first line of Figure 2 into a geometric series:
$$1 + x + x^2 + x^3 + \cdots = (1 - x)^{-1}. \tag{2}$$
In fact, this substitution changes each of the first five lines of Figure 2 into some geometric series and the equation indicated by Figure 2 goes over into
$$(1 - x)^{-1}(1 - x^{5})^{-1}(1 - x^{10})^{-1}(1 - x^{25})^{-1}(1 - x^{50})^{-1} = P_0 + P_1 x + P_2 x^2 + \cdots + P_n x^n + \cdots. \tag{3}$$
† In a formal presentation it may be advisable to restrict the term "picture" to denote a (visible, written or printed) symbol that stands for an indeterminate; in the present introductory, rather informal, address the word is now and then more loosely used. Let us pass over two somewhat touchy points: the infinity of variables and the convergence of the series in which they arise. Both are considered in certain advanced theories and both are momentary. They will be eliminated by the next step.
We have succeeded in expressing the sum of the enumerating series. This sum is usually termed the generating function; in fact, this function, expanded in powers of x, generates the numbers $P_0, P_1, \ldots, P_n, \ldots$, the combinatorial meaning of which was our starting point.
1.3. We have reduced a combinatorial problem to a problem of a different kind: expanding a given function of x in powers of x. In particular, we have reduced our initial problem about changing a dollar to the problem of computing the coefficient of $x^{100}$ in the expansion of the left hand side of (3). Our main goal was to show how picture-writing can be used for this reduction. Yet let us add a brief indication about the numerical computation. The left hand side of (3) is a product of five factors. The well known expansion of the first factor is shown by (2). We proceed by adjoining successive factors, one at a time. Assume, for example, that we have already obtained the expansion of the product of the first two factors:
$$(1 - x)^{-1}(1 - x^{5})^{-1} = a_0 + a_1 x + a_2 x^2 + \cdots,$$
and we wish to go on hence to three factors:
$$(1 - x)^{-1}(1 - x^{5})^{-1}(1 - x^{10})^{-1} = b_0 + b_1 x + b_2 x^2 + \cdots.$$
It follows that
$$(b_0 + b_1 x + b_2 x^2 + \cdots)(1 - x^{10}) = a_0 + a_1 x + a_2 x^2 + \cdots.$$
Comparing the coefficient of $x^n$ on both sides, we find that
$$b_n = b_{n-10} + a_n \tag{4}$$
(set $b_m = 0$ if $m < 0$).
each column shows the value of n, the beginning of each row the last factor taken into account; the bottom row would show $P_n$ for n = 0, 5, 10, ..., 50 if we had computed it. Yet the table registers only the steps needed for computing the answer to our initial question and yields $P_{50} = 50$; that is, one can pay 50 cents in exactly 50 different ways. We leave it to the reader to continue the computation and verify that $P_{100} = 292$; he can also try to justify the procedure of computation directly without resorting to the enumerating series.*
Table to compute $P_{50}$
n =               0   5  10  15  20  25  30  35  40  45  50
(1-x)^{-1}        1   1   1   1   1   1   1   1   1   1   1
(1-x^5)^{-1}      1   2   3   4   5   6   7   8   9  10  11
(1-x^10)^{-1}     1   2   4   6   9  12  16      25      36
(1-x^25)^{-1}     1                  13                  49
(1-x^50)^{-1}     1                                      50
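The factor-by-factor computation behind the table can be sketched in Python. This is an illustrative sketch and not part of the original address; the function name `ways_to_pay` and its parameters are assumptions made for the example. Adjoining a factor $(1 - x^c)^{-1}$ uses the same comparison of coefficients as recursion (4), with 10 replaced by the coin value c.

```python
def ways_to_pay(amount, coins=(1, 5, 10, 25, 50)):
    """Return [P_0, ..., P_amount], adjoining one factor (1 - x^c)^(-1) at a time."""
    # Expansion of the empty product: the constant series 1.
    coeffs = [1] + [0] * amount
    for c in coins:
        # Multiplying by (1 - x^c)^(-1) gives new[n] = new[n - c] + old[n].
        # Updating in place for increasing n means coeffs[n - c] already
        # holds the new value when coeffs[n] is updated.
        for n in range(c, amount + 1):
            coeffs[n] += coeffs[n - c]
    return coeffs


P = ways_to_pay(100)
print(P[50], P[100])  # 50 292
```

The two printed values agree with $P_{50} = 50$ from the table and with the value $P_{100} = 292$ left to the reader.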
2.1. Dissect a convex polygon with n sides into n-2 triangles by n-3 diagonals and compute $D_n$, the number of different dissections of this kind. Examining first the simplest particular cases helps to understand the problem. We easily see that $D_4 = 2$, $D_5 = 5$; of course $D_3 = 1$. The solution is indicated by the parts (I), (II), and (III) of Figure 4. After the broad discussion of the foregoing solution it should not be difficult to understand the indications of Figure 4. Part (I) of Figure 4 hints the key idea: we build up the dissections of any polygon that is not a triangle from the dissections of other polygons which have fewer sides. For this purpose, we emphasize one of the sides of the polygon, place it horizontally at the bottom and call it the base. One of the triangles into which the polygon is dissected has the base as side; we call this triangle $\Delta$. In the given polygon there are two smaller polygons, one to the left, the other to the right, of $\Delta$. For example, the top line of Figure 4 (I) shows an octagon in which there is a quadrilateral to the left, and a pentagon to the right, of $\Delta$, both suitably dissected. As the figure suggests, we can generate this dissection of the octagon by starting from $\Delta$ and placing on it, from both sides, the two other appropriately pre-dissected polygons. We may hope that building up the dissections in this manner will be useful. In exploring the prospects of this idea, we may run into an objection: there are cases, such as the one displayed in the second line of Figure 4 (I), in which the partial polygon on a certain side of $\Delta$ does not exist. Yet we can parry this objection: yes, the partial polygon on that side of $\Delta$ (the left side in the case of the figure) does exist, but it is degenerate; it is reduced to a mere segment.
* For the usual method of deriving the generating function, cf. 1, Vol. 1, p. 1, Problem 1.
FIG. 4. Key idea, figurate series, transition.
Part (II) of Figure 4 shows the genesis of the figurate series. This series, which occupies the first line, is the sum of all possible dissections of polygons with 3, 4, 5, ... sides. According to Part (I) (as the next line reminds us) each term of the figurate series can be generated by placing two pre-dissected polygons on a triangle $\Delta$, one from the left and one from the right (one or the other of which, or possibly both, may be degenerate). Therefore, as the next line (the last of Figure 4 (II)) indicates, the terms of the figurate series are in one-one correspondence with the expansion of a product of three factors: the middle factor is just a triangle, the other two factors are equal to the figurate series augmented by the segment.
2.2. Part (III) of Figure 4 hints the transition from the figurate series to the enumerating series. Following the pattern set by Figure 3 and Section 1.2, we substitute for each dissection (more precisely, for the variable represented by that dissection) a power of x the exponent of which is the number of triangles in that dissection. This substitution, indicated by Figure 4 (III), changes the figurate series into
$$E(x) = D_3 x + D_4 x^2 + D_5 x^3 + \cdots + D_n x^{n-2} + \cdots, \tag{5}$$
where E(x) stands for enumerating series. The relation displayed by Figure 4 (II) goes over into
$$E(x) = x[1 + E(x)]^2. \tag{6}$$
This is a quadratic equation for E(x) the solution of which is
$$E(x) = \frac{1 - 2x - [1 - 4x]^{1/2}}{2x} = x + 2x^2 + \cdots. \tag{7}$$
In fact, to arrive at (7), we have to discard the other solution of the quadratic equation (6) which becomes $\infty$ for x = 0.
2.3. We have reduced our original problem which was to compute $D_n$ to a problem of a different kind: to find the coefficient of $x^{n-2}$ in the expansion of the function (7) in powers of x.* This latter is a routine problem which we need not discuss broadly. We obtain from (7), using the binomial formula and straightforward transformations, that for $n \ge 3$
$$D_n = -\frac{1}{2}\binom{1/2}{n-1}(-4)^{n-1} = \frac{2}{2}\cdot\frac{6}{3}\cdot\frac{10}{4}\cdots\frac{4n-10}{n-1}.$$
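As a check on the product formula, the coefficients of E(x) can also be read off directly from the quadratic relation (6). The following Python sketch does both and compares them; the function names are assumptions made for this illustration, and the coefficient-by-coefficient recursion is one possible way of using (6), not a method given in the text.

```python
from fractions import Fraction


def dissections_by_recursion(n_max):
    """Return {n: D_n} for 3 <= n <= n_max, read off from E(x) = x * (1 + E(x))**2."""
    size = n_max - 1          # coefficients of x^0 .. x^(n_max - 2)
    e = [0] * size            # e[k] = coefficient of x^k in E(x), i.e. D_(k+2)
    for k in range(1, size):
        g = [1] + e[1:]       # coefficients of 1 + E(x) as known so far
        # Coefficient of x^k in x * (1 + E)^2 is the coefficient of x^(k-1)
        # in (1 + E)^2, which involves only coefficients already computed.
        e[k] = sum(g[i] * g[k - 1 - i] for i in range(k))
    return {k + 2: e[k] for k in range(1, size)}


def dissections_by_product(n):
    """D_n = (2/2)(6/3)(10/4)...((4n - 10)/(n - 1)) for n >= 3."""
    value = Fraction(1)
    for m in range(3, n + 1):
        value *= Fraction(4 * m - 10, m - 1)
    return int(value)


print(dissections_by_recursion(6))  # {3: 1, 4: 2, 5: 5, 6: 14}
```

The first three values reproduce $D_3 = 1$, $D_4 = 2$, $D_5 = 5$ found by inspection.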
3.1. A (topological) tree is a connected system of two kinds of objects, lines and points, that contains no closed path. A certain point of the tree in which just one line ends is called the root of the tree, the line starting from the root the trunk, any point different from the root a knot. In Figure 5 the root is indicated by an arrow, and each knot by a small circle. Our problem is: compute $T_n$, the number of different trees with n knots. It makes no difference whether the lines are long or short, straight or curved, drawn on the paper to the left or to the right: only the difference in (topological) connection is relevant. Examining the simplest cases may help the reader to understand the intended meaning of the problem; it is easily seen that $T_1 = 1$, $T_2 = 1$, $T_3 = 2$, $T_4 = 4$, $T_5 = 9$.†
* For a more usual method cf. 3, Vol. 1, p. 102, Problems 7, 8, and 9.
† The trees here considered should be called more specifically root-trees; see 4, Vol. 11, p. 365. Their definition which is merely hinted here is elaborated in 2, pp. 181-191; see also the passages there quoted of 5. It may be, however, sufficient and in some respects even advantageous if, at a first reading, the reader takes the definition "intuitively" and supplements it by examples. Observe that in Cayley's first paper on the subject, 4, Vol. 3, pp. 242-246, the definition of a tree is not even attempted. Chemistry is one of the sources of the notion "tree": if the points stand for atoms and the connecting lines for valencies, the tree represents a chemical compound.
~-:}Ib. -1.(I!l.!yy) !+!+Ly+LY+r:r+ +If / . . -- r·( r (II)
D+~
+
(D
~f
+
(0
+
(0
l
+
-I-
II
+ ••• )
t ! + It I +
L1LIj !
+Y + yy yyy +
••• )
+ ••• ) + ••• J
~ . . ~';.( m.o.! .yy. o-D"~ (]I )
+ •••
~ I~ x', !~x" Lx" y~ x', Lx.Y~ x~ ...
o x',
FIG. 5. Key idea, figurate series, transition.
The solution is indicated by the three parts of Figure 5 the general arrangement of which is closely similar to that of Figure 4. The reader should try to understand the solution by merely looking at Figure 5 and observing relevant analogies with all the foregoing figures. He may, however, fall back upon the following brief comments. The simplest tree consists of root, trunk and just one knot. The key idea is to build up any tree different from the simplest tree from other trees which have fewer knots. For this purpose we conceive, as Figure 5 (I) shows, the "main branches" of any tree as trees (with fewer knots) inserted into the upper endpoint (the only knot) of the trunk. Therefore, as Figure 5 (I) further shows, we can conceive of any tree as the juxtaposition of the simplest tree and of several pictures, each of which consists of one, or two, or more identical trees; observe the analogy with the last line of Figure 2. Part (II) of Figure 5 displays the figurate series: the infinite sum of all different trees. Its genesis is similar to, but more complex than, that of the figurate series of Figure 2. In Figure 2 we see a product of five "virtually geometric" series; in Figure 5 we see a product of an infinity of "virtually geometric" series, multiplied by an initial one term factor (the simplest tree, the common trunk of all trees).
3.2. Part (III) of Figure 5 displays the substitution that changes the figurate series into the enumerating series. By this substitution, each "virtually geometric" series arising in Figure 5 (II) goes over into a proper geometric series the sum of which is known, and the whole relation displayed by Figure 5 (II) goes over into the remarkable relation due to Cayley*
$$T_1 x + T_2 x^2 + T_3 x^3 + \cdots + T_n x^n + \cdots = x(1 - x)^{-T_1}(1 - x^2)^{-T_2}(1 - x^3)^{-T_3} \cdots (1 - x^n)^{-T_n} \cdots. \tag{8}$$
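Relation (8) determines the numbers $T_n$ one after another, because the coefficient of $x^n$ on the right depends only on $T_1, \ldots, T_{n-1}$. A minimal Python sketch of this computation follows; the function name `rooted_trees` and the way the factors are adjoined (by repeated multiplication by $(1 - x^k)^{-1}$, as in Section 1.3) are assumptions made for the illustration.

```python
def rooted_trees(n_max):
    """Return [T_1, ..., T_n_max] computed from relation (8)."""
    # prod holds the expansion of the factors adjoined so far, up to x^(n_max - 1).
    prod = [1] + [0] * (n_max - 1)
    T = []
    for k in range(1, n_max + 1):
        # The coefficient of x^k on the right of (8) is the coefficient of
        # x^(k-1) in the product; factors (1 - x^j)^(-T_j) with j >= k
        # cannot change it, so it is already final.
        t_k = prod[k - 1]
        T.append(t_k)
        # Adjoin (1 - x^k)^(-T_k): t_k successive multiplications by (1 - x^k)^(-1).
        for _ in range(t_k):
            for n in range(k, n_max):
                prod[n] += prod[n - k]
    return T


print(rooted_trees(5))  # [1, 1, 2, 4, 9]
```

The output agrees with the values $T_1, \ldots, T_5$ stated in Section 3.1.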
3.3. By expanding the right hand side of Equation (8) in powers of x and comparing the coefficient of $x^n$ on both sides, we obtain a recursion formula, that is, an expression for $T_n$ in terms of $T_1, T_2, \ldots, T_{n-1}$ for $n \ge 2$. The reader should work out the first cases and verify by analytical computation the values $T_n$ for $n \le 5$ which he found before by geometrical experimentation.
* This form is slightly different from that given in 4, Vol. 3, pp. 242-246. For other forms see 2, p. 149.
References
1. G. Pólya and G. Szegő, Aufgaben und Lehrsätze aus der Analysis, 2 volumes, Berlin, 1925.
2. G. Pólya, Acta Mathematica, vol. 68 (1937), pp. 145-254.
3. G. Pólya, Mathematics and Plausible Reasoning, 2 volumes, Princeton, 1954.
4. A. Cayley, Collected Mathematical Papers, 13 volumes, Cambridge, 1889-1898.
5. D. König, Theorie der endlichen und unendlichen Graphen, Leipzig, 1936.
Reprinted from Amer. Math. Monthly 63 (1956), 689-697
A THEOREM ON FLOWS IN NETWORKS DAVID GALE
1. Introduction. The theorem to be proved in this note is a generalization of a well-known combinatorial theorem of P. Hall, [4].
HALL'S THEOREM. Let $S_1, S_2, \ldots, S_n$ be subsets of a set X. Then a necessary and sufficient condition that there exist distinct elements $x_1, \ldots, x_n$, such that $x_i \in S_i$ is that the union of every k sets from among the $S_i$ contain at least k elements.
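For small families of sets both sides of Hall's theorem can be checked by exhaustive search. The Python sketch below is only an illustration under that assumption (its running time grows exponentially with n); the function names `hall_condition` and `distinct_representatives` are hypothetical and do not come from the paper.

```python
from itertools import combinations


def hall_condition(sets):
    """True if the union of every k of the sets contains at least k elements."""
    for k in range(1, len(sets) + 1):
        for chosen in combinations(sets, k):
            if len(set().union(*chosen)) < k:
                return False
    return True


def distinct_representatives(sets, used=frozenset()):
    """Return distinct x_1, ..., x_n with x_i in sets[i], or None if none exist."""
    if not sets:
        return []
    # Try each unused element of the first set, then recurse on the rest.
    for x in sorted(sets[0]):
        if x not in used:
            rest = distinct_representatives(sets[1:], used | {x})
            if rest is not None:
                return [x] + rest
    return None


good = [{1, 2}, {2, 3}, {1, 3}]
bad = [{1}, {1}, {1, 2}]          # the first two sets together contain one element
print(hall_condition(good), distinct_representatives(good))  # True [1, 2, 3]
print(hall_condition(bad), distinct_representatives(bad))    # False None
```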
The result has a simple interpretation in terms of transportation networks. A certain article is produced at a set X of origins, and is demanded at n destinations $y_1, \ldots, y_n$. Certain of the origins x are "connected" to certain of the destinations y making it possible to ship one article from x to y.
PROBLEM. Under what conditions is it possible to ship articles to all the destinations y?
An obvious reinterpretation of Hall's theorem shows that this is possible if and only if every k of the destinations are connected to at least k origins. We shall now give a verbal statement of the generalization to be proved. A more formal statement will be given in the next section. Let N be an arbitrary network or graph. To each node x of N corresponds a real number d(x), where |d(x)| is to be thought of as the demand for or the supply of some good at x according as d(x) is positive or negative. To each edge (x, y) corresponds a nonnegative real number c(x, y), the capacity of this edge, which assigns an upper bound to the possible flow from x to y. The demands d(x) are called feasible if there exists a flow in the network such that the flow along each edge is no greater than its capacity, and the net flow into (out of) each node is at least (at most) equal to the demand (supply) at that node. An obviously necessary condition for the demands d(x) to be feasible is the following. For every collection S of nodes the sum of the demands at the nodes of S must not exceed the sum of the capacities of the edges leading into S. If this condition were not satisfied it would clearly be impossible to satisfy the aggregate demand of the subset S. The principal theorem of this paper shows that conversely, if the above condition is satisfied, then the demands d(x) are feasible. Hall's theorem drops out as a special case of this result if one applies it to the particular network described in the paragraph above and makes use of the known fact (see [1]) that transportation problems of this type with integral constraints have integral solutions. However, the simple inductive argument which works in [4] does not seem to generalize to yield a proof of our theorem. Our approach is in fact quite different and is based on the "minimum cut" theorem of Ford and Fulkerson, [2], [1]. In the next section we give a formal statement of the problem and prove the principal theorem. The final section is devoted to the treatment of a special case for which the "feasibility criterion" yields a very simple method for computing solutions.
Received September 24, 1956. The results of this paper were discovered while the author was working as a consultant for the RAND Corporation. A later revision was partially supported by an O. N. R. contract.
2. The principal theorem. We proceed to define in a more formal manner the objects to be discussed.
DEFINITIONS. A network [N, c] consists of a finite set of nodes N and a capacity function c on $N \times N$ where c(x, y) is a nonnegative real number or plus infinity.
A flow f on [N, c] is a function f on $N \times N$ such that
$$f(x, y) + f(y, x) = 0, \tag{1}$$
$$f(x, y) \le c(x, y) \tag{2}$$
for all $x, y \in N$.
A demand d on [N, c] is simply a real valued function on N. Note that we do not require the function c to be symmetric, thus the maximum allowable flow from x to y need not be the same as that from y to x. Condition (1) above corresponds to the usual convention that the net flow from x to y is the negative of the net flow from y to x. We shall save writing many summation symbols in what follows by adopting the following convenient notation.
NOTATION. If S is a subset of N and d a function on N, we write
$$d(S) = \sum_{x \in S} d(x).$$
If S and T are subsets of N and f a function on $N \times N$ we write
$$f(S, T) = \sum_{x \in S,\ y \in T} f(x, y).$$
From these definitions it follows at once that if U and V are disjoint subsets of N then
$$d(U \cup V) = d(U) + d(V), \qquad f(S, U \cup V) = f(S, U) + f(S, V). \tag{3}$$
In particular, denoting the complement of S by S' we have
$$f(N, T) = f(S, T) + f(S', T) \quad \text{for all } S \subset N.$$
In this notation (1) and (2) are clearly equivalent to
$$f(A, A) = 0; \tag{1'}$$
and
$$f(A, B) \le c(A, B) \tag{2'}$$
for all $A, B \subset N$.
The above notation is natural to our problem, for if d is a demand function then d(S) is simply the aggregate demand of the set S, and if f is a flow then f(S, T) represents the net flow from S into T.
DEFINITION. A demand d is called feasible if there exists a flow f such that
$$f(N, x) \ge d(x) \quad \text{for } x \in N. \tag{4}$$
This condition states that the flow into each node must be at least equal to the demand at that node. However (1) and (4) together imply
$$f(x, N) \le -d(x)$$
so that we are also requiring the flow out of each node to be at most equal to the supply at that node (recalling that a negative demand represents a supply). Finally we note that from (3) it follows that (4) is equivalent to
$$f(N, S) \ge d(S) \quad \text{for all } S \subset N. \tag{4'}$$
We can now give a simple statement of our main result.
FEASIBILITY THEOREM. The demand d is feasible if and only if for every subset $S \subset N$
$$d(S') \le c(S, S'). \tag{5}$$
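For a small network, condition (5) can be tested directly by running through all subsets. The following Python sketch does this; it is an illustration only (the number of subsets grows exponentially with the number of nodes), and the names `cut_capacity` and `violated_subset`, the dictionary representation of c and d, and the three-node example are assumptions made here, not taken from the paper.

```python
from itertools import combinations


def cut_capacity(c, S, T):
    """c(S, T): total capacity of the edges leading from S into T."""
    return sum(c.get((x, y), 0) for x in S for y in T)


def violated_subset(nodes, c, d):
    """Return a set S' with d(S') > c(S, S'), or None if condition (5) holds."""
    nodes = list(nodes)
    for k in range(len(nodes) + 1):
        for chosen in combinations(nodes, k):
            S_prime = set(chosen)
            S = set(nodes) - S_prime
            if sum(d[x] for x in S_prime) > cut_capacity(c, S, S_prime):
                return S_prime
    return None


# One origin p with supply 3 (negative demand) and two destinations q, r.
nodes = ["p", "q", "r"]
d = {"p": -3, "q": 2, "r": 1}
wide = {("p", "q"): 2, ("p", "r"): 1}
narrow = {("p", "q"): 1, ("p", "r"): 1}
print(violated_subset(nodes, wide, d))    # None
print(violated_subset(nodes, narrow, d))  # {'q'}
```

In the second network the demand 2 at q exceeds the total capacity 1 of the edges leading into q, so by the theorem the demand is not feasible.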
Proof. The necessity of (5) is obvious, for if d is feasible then there is a flow f such that
$$d(S') \le f(N, S') = f(S, S') + f(S', S') = f(S, S') \le c(S, S').$$
The proof of sufficiency depends on the "minimum cut theorem" of Ford and Fulkerson, which we shall now state and prove in our own formulation. While our proof is little more than a translation of the above authors' second proof [3] into our notation, we record it here, nevertheless, both for the sake of completeness and because it is substantially shorter than any proof published heretofore.
DEFINITION. Let [N, c] be a network and let s and s' be two distinguished nodes (s = source, s' = sink). A flow from s to s' is a flow such that
$$f(N, x) = 0 \quad \text{for } x \ne s,\ x \ne s'. \tag{6}$$
Let $F$ denote the set of all flows from $s$ to $s'$. A cut $(S, S')$ of $N$ with respect to $s$ and $s'$ is a partition of $N$ into sets $S$ and $S'$ such that $s \in S$, $s' \in S'$. Let $Q$ denote the set of all such cuts.
MINIMUM CUT THEOREM. For any network $[N, c]$
$$\max_{F} f(s, N) = \min_{Q} c(S, S').$$
Proof. First note that for any flow $f \in F$ and cut $(S, S') \in Q$ we have
$$f(s, N) = f(s, N) + \sum_{x \in S - s} f(x, N) = f(s, N) + f(S - s, N) = f(S, N) = f(S, S) + f(S, S') = f(S, S') \le c(S, S'). \tag{7}$$
Hence, it remains only to show that equality is attained in (7) for some flow and cut. Let $\bar f \in F$ be a flow such that $\bar f(s, N)$ is a maximum. Let $S$ consist of $s$ and all nodes $x$ such that there exists a chain $\sigma = (x_0, x_1, \ldots, x_n)$ of distinct nodes with $x_0 = s$, $x_n = x$ and $c(x_{i-1}, x_i) - \bar f(x_{i-1}, x_i) > 0$, $i = 1, \ldots, n$. Now $s'$ is not in $S$, for, if it were, there would be a chain $\sigma$ as above with $x = s'$. But then letting
$$\rho = \min_i\, [c(x_{i-1}, x_i) - \bar f(x_{i-1}, x_i)],$$
one could superimpose a flow of $\rho$ along the chain $\sigma$ on top of the flow $\bar f$, contradicting the maximality of $\bar f$.
A THEOREM ON FLOWS IN NETWORKS
The above argument shows that $(S, S')$ is a cut, and we conclude the proof by observing that $\bar f(s, N) = c(S, S')$, for if not, then from (7), $\bar f(S, S') < c(S, S')$, hence for some $x \in S$ and $y \in S'$ we would have $c(x, y) - \bar f(x, y) > 0$, but since $x \in S$ there is a chain $\sigma = (s, x_1, \ldots, x)$ which could be extended to a chain $\sigma' = (s, x_1, \ldots, x, y)$, contrary to the fact that $y \in S'$. This completes the proof.
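The chain argument in the proof suggests a procedure: keep superimposing flow along chains with positive slack until the sink can no longer be reached, then read off $S$ as the set of reachable nodes. The sketch below is one way to do this, assuming integer capacities and a breadth-first search for the chain; both choices, the function name `max_flow_min_cut` and the four-node data are assumptions of this example, since the proof itself only uses the existence of a maximal flow.

```python
from collections import deque

def max_flow_min_cut(nodes, c, s, t):
    """Augment along chains with c(x, y) - f(x, y) > 0 until none reaches t.

    Returns (value of the flow out of s, set S of nodes reachable from s).
    """
    f = {}  # skew-symmetric flow: f[(x, y)] == -f[(y, x)]

    def slack(x, y):
        return c.get((x, y), 0) - f.get((x, y), 0)

    while True:
        parent = {s: None}            # breadth-first search from the source
        queue = deque([s])
        while queue:
            x = queue.popleft()
            for y in nodes:
                if y not in parent and slack(x, y) > 0:
                    parent[y] = x
                    queue.append(y)
        if t not in parent:           # no chain reaches the sink: S is a cut
            value = sum(f.get((s, y), 0) for y in nodes)
            return value, set(parent)
        chain, y = [], t              # recover the chain from s to t
        while parent[y] is not None:
            chain.append((parent[y], y))
            y = parent[y]
        rho = min(slack(x, y) for x, y in chain)
        for x, y in chain:            # superimpose a flow of rho along the chain
            f[(x, y)] = f.get((x, y), 0) + rho
            f[(y, x)] = f.get((y, x), 0) - rho

# Hypothetical network with source "s" and sink "t".
nodes = ["s", "a", "b", "t"]
c = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}
value, S = max_flow_min_cut(nodes, c, "s", "t")
cut_capacity = sum(c.get((x, y), 0) for x in S for y in nodes if y not in S)
```

On termination the returned set plays the role of $S$ in the proof, and `cut_capacity` is the corresponding $c(S, S')$.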
Proof of feasibility theorem. Consider a new network $[\bar N, \bar c]$ where $\bar N$ consists of $N$ plus two additional nodes $s$ and $s'$. Let $U \subset N$ be all nodes $x$ such that $d(x) \le 0$. Then $\bar c$ is defined by the rules
$\bar c(x, y) = c(x, y)$ for $x, y \in N$,
$\bar c(s, x) = -d(x)$ for $x \in U$,
$\bar c(x, s') = d(x)$ for $x \in U'$,
$\bar c(x, y) = 0$ otherwise.
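The four rules translate directly into a small construction. This is a minimal sketch, assuming capacities are stored as a dictionary on ordered pairs with missing pairs meaning capacity 0; the function name `extend_network`, the labels `"s"` and `"s'"` for the added nodes, and the sample data are hypothetical.

```python
def extend_network(nodes, d, c, s="s", s_prime="s'"):
    """Build the capacity function of the network obtained by adding s and s'."""
    U = {x for x in nodes if d[x] <= 0}    # nodes with nonpositive demand
    c_bar = dict(c)                        # c_bar(x, y) = c(x, y) on N x N
    for x in nodes:
        if x in U:
            c_bar[(s, x)] = -d[x]          # the source feeds each node of U
        else:
            c_bar[(x, s_prime)] = d[x]     # each node of U' drains into the sink
    return c_bar, U                        # every other pair has capacity 0

# Hypothetical data: "a" supplies 2 units, "b" demands 2 units.
nodes = ["a", "b"]
d = {"a": -2, "b": 2}
c_bar, U = extend_network(nodes, d, {("a", "b"): 2})
# Capacity of the cut separating s' from everything else; it equals d(U').
sink_cut = sum(cap for (x, y), cap in c_bar.items() if y == "s'")
```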
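We now assert that the cut $(\bar N - s', s')$ is a minimal cut of $[\bar N, \bar c]$, for let $\bar S$ and $\bar S'$ be any cut of $[\bar N, \bar c]$ and let $S = \bar S - s$, $S' = \bar S' - s'$. From the definition above we have
$$\bar c(\bar S, \bar S') = c(S, S') + \bar c(s, S') + \bar c(S, s') = c(S, S') - d(S' \cap U) + d(S \cap U'),$$
$$\bar c(\bar N - s', s') = d(U') = d(S' \cap U') + d(S \cap U');$$
and subtracting we get
$$\bar c(\bar N - s', s') - \bar c(\bar S, \bar S') = d(S' \cap U') + d(S' \cap U) - c(S, S') = d(S') - c(S, S') \le 0,$$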
the last inequality being the hypothesis (5), and the assertion is proved. Now, from the Minimum Cut Theorem, there is a flow $\bar f$ from $s$ to $s'$ on $[\bar N, \bar c]$ such that
$$\bar f(\bar N - s', s') = \bar c(\bar N - s', s') = d(U'),$$
hence
$$\bar f(x, s') = d(x) \quad \text{for all } x \in U'. \tag{8}$$
Let $f$ be $\bar f$ restricted to $N \times N$. Then $f$ is clearly a flow and it remains to show that $f$ satisfies (4). If $x \in U'$ then
$$0 = \bar f(x, \bar N) = f(x, N) + \bar f(x, s') = f(x, N) + d(x),$$
hence
$$f(N, x) = d(x). \tag{9}$$
If $x \in U$ then
$$0 = \bar f(\bar N, x) = f(N, x) + \bar f(s, x) \le f(N, x) + \bar c(s, x) = f(N, x) - d(x),$$
so
$$f(N, x) \ge d(x), \tag{10}$$
and (9) and (10) together show that $f$ satisfies (4), completing the proof.
REMARK. We wish to call attention to the following important fact. We have at no point in what has been said thus far made use of the assumption that the functions $d$, $c$ and $f$ were real valued. In fact, all definitions and proofs go through verbatim if the real numbers are replaced by any ordered Abelian group, in particular, the group of integers. One useful consequence of this remark is the fact that if a network with integer valued demand and capacity functions admits a feasible flow then this flow may also be chosen to be integer valued. We shall make use of this fact in the next section.
There is a second formulation of the Feasibility Theorem which is sometimes convenient. In the network $[N, c]$ let $U$ be as above the set of nodes $x$ such that $d(x) \le 0$.
THEOREM. The demand $d$ is feasible if and only if for every set $Y \subset U'$ there exists a flow $f_Y$ such that
$$f_Y(N, x) \ge d(x) \quad \text{for } x \in U, \tag{11}$$
$$f_Y(N, Y) \ge d(Y). \tag{12}$$
Proof. The necessity is obvious. To prove sufficiency we show that (11) and (12) imply (5).
Let $(S, S')$ be a partition of $N$ and let $X = U \cap S$, $X' = U \cap S'$, $Y = U' \cap S$, $Y' = U' \cap S'$. Then from (11) there exists $f_{Y'}$ such that
$$d(X') \le f_{Y'}(N, X') = f_{Y'}(X \cup Y, X') + f_{Y'}(Y', X'),$$
and from (12),
$$d(Y') \le f_{Y'}(N, Y') = f_{Y'}(X \cup Y, Y') + f_{Y'}(X', Y').$$
Adding these inequalities we get
$$d(S') = d(X') + d(Y') \le f_{Y'}(X \cup Y, X') + f_{Y'}(X \cup Y, Y') = f_{Y'}(X \cup Y, X' \cup Y') = f_{Y'}(S, S') \le c(S, S'),$$
which is exactly (5).
3. An example. As an illustration of the feasibility theorem, consider the following problem.
(I). Let $a_1, \ldots, a_m$ and $b_1, \ldots, b_n$ be two sets of positive integers. Under what conditions can one find integers $a_{ij} = 0$ or 1, such that
and
for all $i$ and $j$? As a concrete illustration, suppose $n$ families are going on a picnic in $m$ busses, where the $j$th family has $b_j$ members and the $i$th bus has $a_i$ seats. When is it possible to seat all passengers in such a way that no two members of the same family are in the same bus? In the case $\sum a_i = \sum b_j$ the problem becomes that of filling an $m \times n$ matrix $M$ with zeros and ones so that the rows and columns shall have prescribed sums. The feasibility theorem gives a simple necessary and sufficient condition for the problem to have a solution. In order to state it we need the following.
DEFINITION.
Let $\{a_i\}$ be a nonincreasing sequence of nonnegative integers $a_1, a_2, \ldots$, such that all but a finite number of the $a_i$ are zero. Let
$$S_j = \{\, i \mid a_i \ge j \,\},$$
where $j$ is a positive integer and let $s_j$ be the number of elements in $S_j$. The sequence of numbers $\{s_j\}$ clearly satisfies the same conditions as the sequence $\{a_i\}$; it is called the dual sequence of the sequence $\{a_i\}$ and is denoted by $\{a_i\}^*$. It is clear that $\{a_i\}^*$ determines $\{a_i\}$ since the integer $a_i$ occurs exactly $s_{a_i} - s_{a_i + 1}$ times in $\{a_i\}$. Actually the correspondence between $\{a_i\}$ and $\{a_i\}^*$ is completely dual in the following sense.
THEOREM. $\{a_i\}^{**} = \{a_i\}$.
This result will not be needed in the sequel and its proof is left as an exercise. However, its validity can be made quite obvious by means of a simple pictorial representation. Let each number $a_i$ be represented by a row of dots, and write these rows in a vertical array so that $a_{i+1}$ lies under $a_i$, thus:
[Diagram: left-aligned rows of dots, one row for each of $a_1, a_2, a_3, \ldots$]
It is then clear that the dual number $s_j$ is simply the number of dots in the $j$th column of the array. We can now give the criterion for the feasibility of Problem I. Henceforth for convenience we shall assume the numbers $a_i$ and $b_j$ are indexed in decreasing order, and shall define $a_i = 0$ for $i > m$, $b_j = 0$ for $j > n$.
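The dual sequence is easy to compute by counting, column by column, how many rows of the dot array are long enough. The sketch below is illustrative only; the function name `dual_sequence` and the sample sequence are assumptions, and trailing zeros of the sequences are simply dropped.

```python
def dual_sequence(a):
    """Return s_1, s_2, ... where s_j is the number of terms a_i with a_i >= j."""
    top = max(a, default=0)
    return [sum(1 for a_i in a if a_i >= j) for j in range(1, top + 1)]

a = [5, 3, 1]                      # hypothetical rows of 5, 3 and 1 dots
s = dual_sequence(a)               # column heights of the dot array
double_dual = dual_sequence(s)     # dualizing twice returns the original rows
```

Reading the same array by columns instead of rows is all that the dual does, which is why applying it twice restores the sequence.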
THEOREM. Let $\{s_j\} = \{a_i\}^*$. Then Problem I is feasible if and only if
$$\sum_{j=1}^{k} b_j \le \sum_{j=1}^{k} s_j$$
for all integers $k$.
Proof. We may interpret (I) as a flow problem. Let $N$ be a network consisting of $m + n$ nodes $x_1, \ldots, x_m$ and $y_1, \ldots, y_n$, and let $c(x_i, y_j) = 1$ for all $i$ and $j$, $c = 0$ otherwise. Let $d(x_i) = -a_i$ and $d(y_j) = b_j$. One easily verifies that the feasibility of (I) is equivalent to the feasibility of the demand $d$. We shall show that $d$ is feasible by applying the second theorem of the previous section. Let $Y$ be a subset of $k$ nodes $y_j$, say $Y = \{y_{j_1}, \ldots, y_{j_k}\}$. We now compute the maximum possible flow into $Y$. Because all capacities are unity this maximal flow $f_Y$ is achieved by shipping as much as possible from each node $x_i$ into the set $Y$. Thus, the flow from $x_i$ to $Y$ is $\min[a_i, k]$ and the total flow into $Y$ is
$$\sum_{i=1}^{m} \min[a_i, k].$$
We now assert
$$\sum_{i=1}^{m} \min[a_i, k] = \sum_{j=1}^{k} s_j, \tag{13}$$
which is proved by induction on $k$. It is clear from the definition that
$$\sum_{i=1}^{m} \min[a_i, 1] = m = s_1.$$
Now
$$\min[a_i, k+1] = \begin{cases} \min[a_i, k] & \text{for } a_i \le k, \\ \min[a_i, k] + 1 & \text{for } a_i > k, \end{cases}$$
hence,
$$\sum_{i=1}^{m} \min[a_i, k+1] = \sum_{i=1}^{m} \min[a_i, k] + s_{k+1},$$
and (13) follows from the induction hypothesis. The second feasibility theorem now states that the problem is feasible if and only if
$$\sum_{i=1}^{k} b_{j_i} \le \sum_{j=1}^{k} s_j,$$
and since the $b_j$ are indexed in decreasing order, the conclusion of the theorem follows.
It is interesting that for this particular problem there is a simple "$n$-step" method for actually filling out the matrix of $a_{ij}$'s. Such procedures are sufficiently rare in programming theory so that it seems worth while to present it here. The procedure is the following: If the problem is feasible then $b_1 \le s_1$ and hence $a_1, \ldots, a_{b_1} \ge 1$ (recall that the $a_i$'s are indexed in descending order). Let $a_{i1} = 1$ for $i \le b_1$, $a_{i1} = 0$ for $i > b_1$. Now consider the new problem, (I)', with the matrix $M'$ having $m$ rows and $n - 1$ columns, $j = 2, \ldots, n$, with $a_i' = a_i - a_{i1}$ and $b_j' = b_j$. We assert that (I)' is again feasible so that by repeating the process we will eventually fill out the whole matrix. To show that (I)' is feasible we must prove, for any $k$,
$$\sum_{j=2}^{k+1} b_j \le \sum_{j=1}^{k} s_j' = \sum_{i=1}^{m} \min[a_i', k],$$
where $\{s_j'\}$ is the dual sequence to $\{a_i'\}$. The expression on the right can be rewritten
$$\sum_{i=1}^{m} \min[a_i', k] = \sum_{i=1}^{b_1} \min[a_i - 1, k] + \sum_{i=b_1+1}^{m} \min[a_i, k].$$
We must now consider two cases.
Case 1.
$s_{k+1} \ge b_1$. Then $a_i - 1 \ge k$ for $i \le b_1$ and hence $\min[a_i - 1, k] = k = \min[a_i, k]$, so that we get
$$\sum_{i=1}^{m} \min[a_i', k] = \sum_{i=1}^{m} \min[a_i, k] = \sum_{j=1}^{k} s_j \ge \sum_{j=1}^{k} b_j \ge \sum_{j=2}^{k+1} b_j.$$
Case 2. $s_{k+1} < b_1$. Then for $i \le s_{k+1}$, $a_i \ge k + 1$ so $a_i - 1 \ge k$ and $\min[a_i - 1, k] = k = \min[a_i, k]$. For $s_{k+1} < i \le b_1$, $a_i \le k$, so $\min[a_i - 1, k] = \min[a_i, k] - 1$, hence,
$$\sum_{i=1}^{m} \min[a_i', k] = \sum_{i=1}^{m} \min[a_i, k] - (b_1 - s_{k+1}) = \sum_{j=1}^{k+1} s_j - b_1 \ge \sum_{j=2}^{k+1} b_j,$$
since
$$\sum_{j=1}^{k+1} s_j \ge \sum_{j=1}^{k+1} b_j$$
by the feasibility condition. The proof is now complete.
In terms of the picnic problem, the $n$ families should be seated in $n$ stages according to the following simple rule: at each stage distribute the largest unseated family among those busses having the greatest number of vacant seats.
REFERENCES
1. G. B. Dantzig and D. R. Fulkerson, On the max-flow min-cut theorem of networks, Ann. of Math. Study No. 38, Contributions to linear inequalities and related topics, edited by H. W. Kuhn and A. W. Tucker, 215-221.
2. L. R. Ford, Jr., and D. R. Fulkerson, Maximal flow through a network, Canad. J. Math. 8 (1956), 399-404.
3. ----, A simple algorithm for finding maximal network flows and an application to the Hitchcock problem, Canad. J. Math. 9 (1957), 210-218.
4. P. Hall, On representatives of subsets, J. London Math. Soc., 10 (1935), 26-30.
THE RAND CORPORATION AND BROWN UNIVERSITY
Reprinted from Pacific J. Math. 7 (1957), 1073-1082
COMBINATORIAL PROPERTIES OF MATRICES OF ZEROS AND ONES
H. J. RYSER
1. Introduction. This paper is concerned with a matrix $A$ of $m$ rows and $n$ columns, all of whose entries are 0's and 1's. Let the sum of row $i$ of $A$ be denoted by $r_i$ $(i = 1, \ldots, m)$ and let the sum of column $j$ of $A$ be denoted by $s_j$ $(j = 1, \ldots, n)$. It is clear that if $T$ denotes the total number of 1's in $A$
$$T = \sum_{i=1}^{m} r_i = \sum_{j=1}^{n} s_j.$$
With the matrix $A$ we associate the row sum vector
$$R = (r_1, \ldots, r_m),$$
where the $i$th component gives the sum of row $i$ of $A$. Similarly, the column sum vector $S$ is denoted by
$$S = (s_1, \ldots, s_n).$$
We begin by determining simple arithmetic conditions for the construction of a (0, 1)-matrix $A$ having a given row sum vector $R$ and a given column sum vector $S$. This requires the concept of majorization, introduced by Muirhead. Then we apply to the elements of $A$ an elementary operation called an interchange, which preserves the row sum vector $R$ and column sum vector $S$, and prove that any two (0, 1)-matrices with the same $R$ and $S$ are transformable into each other by a finite sequence of such interchanges. The results may be rephrased in the terminology of finite graphs or in the purely combinatorial terms of set and element. Applications to Latin rectangles and to systems of distinct representatives are studied.
2. Maximal matrices and majorization. Let
$$\delta_i = (1, \ldots, 1, 0, \ldots, 0)$$
be a vector of $n$ components with 1's in the first $r_i$ positions, and 0's elsewhere. A matrix of the form
$$\bar A = \begin{bmatrix} \delta_1 \\ \vdots \\ \delta_m \end{bmatrix}$$
is called maximal, and we refer to $\bar A$ as the maximal form of $A$. The maximal $\bar A$ may be obtained from $A$ by a rearrangement of the 1's in the rows of $A$. Also by inverse row rearrangements one may construct the given $A$ from $\bar A$.
Received July 1, 1956. This work was sponsored in part by the Office of Ordnance Research.
Let $\bar R = (\bar r_1, \ldots, \bar r_m)$ and $\bar S = (\bar s_1, \ldots, \bar s_n)$ be the row sum and column sum vectors of $\bar A$. Evidently
$$\bar R = R.$$
Moreover, it is clear that the row sum vector $R$ uniquely determines $\bar A$, and hence $\bar S$. Indeed, $T = \sum r_i = \sum \bar s_j$ constitute conjugate partitions of $T$. Consider two vectors $S = (s_1, \ldots, s_n)$ and $S^* = (s_1^*, \ldots, s_n^*)$, where the $s_j$ and $s_j^*$ are nonnegative integers. The vector $S$ is majorized by $S^*$,
$$S \prec S^*,$$
provided that with the subscripts renumbered (5; 3):
(1) $s_1 \ge \cdots \ge s_n$, $s_1^* \ge \cdots \ge s_n^*$;
(2) $s_1 + \cdots + s_i \le s_1^* + \cdots + s_i^*$, $i = 1, \ldots, n - 1$;
(3) $s_1 + \cdots + s_n = s_1^* + \cdots + s_n^*$.
For the vectors $S$ and $\bar S$ associated with the matrices $A$ and $\bar A$, respectively, we prove that
$$S \prec \bar S.$$
We renumber the subscripts of the $s_j$ of $A$ so that
$$s_1 \ge s_2 \ge \cdots \ge s_n.$$
For $\bar A$, we already have
$$\bar s_1 \ge \bar s_2 \ge \cdots \ge \bar s_n.$$
Now $A$ must be formed from $\bar A$ by a shifting of 1's in the rows of $\bar A$. But for each $i = 1, \ldots, n - 1$, the total number of 1's in the first $i$ columns of $\bar A$ cannot be increased by a shifting of 1's in the rows of $\bar A$. Hence
$$s_1 + \cdots + s_i \le \bar s_1 + \cdots + \bar s_i, \quad i = 1, \ldots, n - 1.$$
Moreover,
$$s_1 + \cdots + s_n = \bar s_1 + \cdots + \bar s_n,$$
whence we conclude that
$$S \prec \bar S.$$
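The relation just proved can be checked numerically. The following is a minimal sketch, assuming matrices are stored as lists of rows; the function names `maximal_form`, `column_sums` and `is_majorized` and the sample matrix are hypothetical.

```python
def maximal_form(A):
    """Shift the 1's of every row to the left: row i becomes (1, ..., 1, 0, ..., 0)."""
    n = len(A[0])
    return [[1] * sum(row) + [0] * (n - sum(row)) for row in A]

def column_sums(A):
    """Column sum vector of a matrix given as a list of rows."""
    return [sum(col) for col in zip(*A)]

def is_majorized(S, S_star):
    """Test S < S* in the majorization order: conditions (1)-(3) after sorting."""
    x, y = sorted(S, reverse=True), sorted(S_star, reverse=True)
    partial_x = partial_y = 0
    for u, v in zip(x, y):
        partial_x += u
        partial_y += v
        if partial_x > partial_y:      # a partial sum of S exceeds that of S*
            return False
    return partial_x == partial_y      # the totals must agree

A = [[0, 1, 1],
     [1, 0, 0],
     [0, 1, 0]]                        # hypothetical (0, 1)-matrix
A_bar = maximal_form(A)
S, S_bar = column_sums(A), column_sums(A_bar)
```

Here `is_majorized(S, S_bar)` is expected to hold for every (0, 1)-matrix `A`, which is the content of the argument above.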
THEOREM 2.1.$^1$ Let the matrix $\bar A$ be maximal and have column sum vector $\bar S$. Let $S$ be majorized by $\bar S$. Then by rearranging 1's in the rows of $\bar A$, one may construct a matrix $A$ having column sum vector $S$.
Without loss of generality, we may assume that the column sums of $A$ satisfy $s_1 \ge s_2 \ge \cdots \ge s_n$. We construct the desired $A$ inductively by columns by a rearrangement of the 1's in the rows of $\bar A$.
$^1$Added in proof. The author has been informed recently that Theorem 2.1 was obtained independently by Professor David Gale. His investigations concerning this theorem and certain generalizations are to appear in the Pacific Journal of Mathematics.
By hypothesis, $S \prec \bar S$, whence $s_1 \le \bar s_1$. If $s_1 = \bar s_1$, we leave the first column of $\bar A$ unchanged. Suppose that $s_1 < \bar s_1$. We may rearrange 1's in the rows of $\bar A$ to obtain $s_1$ 1's in the first column, unless
$$\bar s_1 \ge \bar s_2 \ge \cdots \ge \bar s_n > s_1.$$
But if these inequalities hold, then
$$\bar s_1 + \cdots + \bar s_n > n s_1 \ge s_1 + \cdots + s_n = \bar s_1 + \cdots + \bar s_n,$$
which is a contradiction. Let us suppose then that the first $t$ columns of $A$ have been constructed, and let us proceed to the construction of column $t + 1$. We have then given an $m$ by $n$ matrix where the number of 1's in column $\eta_i$ is $s_i$ $(i = 1, \ldots, t)$. Let the number of 1's in column $\eta_j$ be $s_j'$ $(j = t + 1, \ldots, n)$. We may suppose that $s'_{t+1} \ge s'_{t+2} \ge \cdots \ge s'_n$. Two cases arise.
Case 1. $s_{t+1} < s'_{t+1}$.
In this case, remove 1's from column $\eta_{t+1}$ by row rearrangements, and place the 1's in columns $\eta_{t+2}, \ldots, \eta_n$. If sufficiently many 1's may be removed from $\eta_{t+1}$ in this manner, then we are finished. Suppose then that there remain $e$ 1's in column $t + 1$, with
$$s_{t+1} < e \le s'_{t+1},$$
and that no further 1's may be removed by this procedure. Then there must exist an integer $w \ge 0$ such that
$$s_{t+1} + \cdots + s_n = (n - t)e + w.$$
But
$$s_{t+1} < e, \quad s_{t+2} \le s_{t+1} < e, \quad \ldots, \quad s_n < e.$$
Therefore
$$(n - t)e + w = s_{t+1} + \cdots + s_n < (n - t)e,$$
which is a contradiction.
Case 2. $s_{t+1} > s'_{t+1}$.
By row rearrangements, insert 1's into column $\eta_{t+1}$ from columns $\eta_{t+2}, \ldots, \eta_n$. If sufficiently many 1's may be inserted in this manner, then we are finished. Suppose then that there remain $e$ 1's in column $t + 1$ with
$$s'_{t+1} \le e < s_{t+1},$$
and that no further 1's may be inserted by our procedure. Let the matrix at this stage of the construction process be denoted by $[e_{rs}]$. If now $e_{v,t+1} = 0$, then
$$e_{vj} = 0 \quad (j = t + 1, \ldots, n).$$
Suppose that some
$$e_{vj} = 1, \quad j \ge t + 2.$$
Then either
$$e_{vk} = 1 \quad (k = 1, \ldots, t + 1),$$
or else for some $k$, $1 \le k \le t$,
$$e_{vk} = 0.$$
Consider the case in which $e_{vk} = 0$. Since $s_k \ge s_{t+1} > e$, there must exist
$$e_{pk} = 1, \quad e_{p,t+1} = 0.$$
Interchanging $e_{vj} = 1$ and $e_{vk} = 0$, and interchanging $e_{pk} = 1$ and $e_{p,t+1} = 0$, we see that $s_1, \ldots, s_t$ are left unaltered, and that $e$ is increased by 1. Continue to increase $e$ by transformations of this variety. Suppose that all such transformations have been applied and that $e$ still satisfies $s'_{t+1} \le e < s_{t+1}$. But now it is no longer possible to move a 1 from columns $t + 2, \ldots, n$ into columns $1, 2, \ldots, t + 1$. This means that
$$s_1 + \cdots + s_t + e = \bar s_1 + \cdots + \bar s_t + \bar s_{t+1}.$$
But then
$$s_1 + \cdots + s_{t+1} \le \bar s_1 + \cdots + \bar s_{t+1} = s_1 + \cdots + s_t + e,$$
whence $s_{t+1} \le e$, which is a contradiction. This completes the proof.
The preceding theorem has a variety of applications. For example, let the (0, 1)-matrix $A$ of $m$ rows and $n$ columns contain exactly $T = km$ 1's, where $k$ is a positive integer. Let the column sum vector of $A$ be $S = (s_1, \ldots, s_n)$. Then there exists an $m$ by $n$ matrix $A^*$ composed of 0's and 1's with exactly $k$ 1's in each row, and column sum vector $S$. For let $\bar A$ be $m$ by $n$, with all 1's in the first $k$ columns and 0's elsewhere. If $\bar S$ denotes the column sum vector of $\bar A$, then $S \prec \bar S$, and the desired $A^*$ may be constructed from $\bar A$.
In this connection we mention the following result arising in the study of the completion of Latin rectangles (1; 7). Let $A$ be a (0, 1)-matrix of $r$ rows and $n$ columns, $1 \le r < n$. Let there be $k$ 1's in each row of $A$, and let the column sums of $A$ satisfy $k - (n - r) \le s_j \le k$. Then $n - r$ rows of 0's and 1's may be adjoined to $A$ to obtain a square matrix with exactly $k$ 1's in each row and column (7). To prove this it suffices to construct an $n - r$ by $n$ matrix $A^*$ of 0's and 1's with exactly $k$ 1's in each row, and column sum vector $(k - s_1, \ldots, k - s_n)$. By the remarks of the preceding paragraph, such a construction is always possible.
3. Interchanges. We return now to the $m$ by $n$ matrix $A$ composed of 0's and 1's, with row sum vector $R$ and column sum vector $S$. We are concerned with the 2 by 2 submatrices of $A$ of the types
$$A_1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad \text{and} \quad A_2 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}.$$
An interchange is a transformation of the elements of $A$ that changes a specified minor of type $A_1$ into type $A_2$, or else a minor of type $A_2$ into type $A_1$, and leaves all other elements of $A$ unaltered. Suppose that we apply to $A$ a finite number of interchanges. Then by the nature of the interchange operation, the resulting matrix $A^*$ has row sum vector $R$ and column sum vector $S$.
THEOREM 3.1. Let $A$ and $A^*$ be two $m$ by $n$ matrices composed of 0's and 1's, possessing equal row sum vectors and equal column sum vectors. Then $A$ is transformable into $A^*$ by a finite number of interchanges.
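A single interchange is simple to express in code, and doing so makes the invariance of the row and column sums easy to verify. This is a minimal sketch, assuming 0-based row and column indices and a matrix stored as a list of rows; the function name `interchange` and the sample matrix are hypothetical.

```python
def interchange(A, i1, i2, j1, j2):
    """Flip the 2 x 2 submatrix on rows i1, i2 and columns j1, j2.

    The submatrix must be [[1, 0], [0, 1]] or [[0, 1], [1, 0]]; a new matrix
    is returned and all other entries are left unaltered.
    """
    minor = (A[i1][j1], A[i1][j2], A[i2][j1], A[i2][j2])
    if minor not in {(1, 0, 0, 1), (0, 1, 1, 0)}:
        raise ValueError("submatrix is not of interchange type")
    B = [row[:] for row in A]
    for i in (i1, i2):
        for j in (j1, j2):
            B[i][j] = 1 - B[i][j]       # exchange the two types
    return B

A = [[1, 0, 1],
     [0, 1, 0]]                          # hypothetical (0, 1)-matrix
B = interchange(A, 0, 1, 0, 1)
row_sums_equal = [sum(r) for r in A] == [sum(r) for r in B]
col_sums_equal = [sum(c) for c in zip(*A)] == [sum(c) for c in zip(*B)]
```

Applying the same interchange to `B` returns `A`, so the operation is its own inverse.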
The proof is by induction on $m$. For $m = 1$ and 2, the theorem is trivial. The induction hypothesis asserts the validity of the theorem for two (0, 1)-matrices of size $m - 1$ by $n$. We attempt to transform the first row of $A$ into the first row of $A^*$ by interchanges. If we are successful, the theorem follows at once from the induction hypothesis. Suppose that we are not successful and that we denote the transformed matrix by $A'$. For notational convenience, we simultaneously permute the columns of $A'$ and $A^*$ and designate the first row of $A'$ by
$$(\delta_r, \eta_s, \delta_t, \eta_t)$$
and the first row of $A^*$ by
$$(\delta_r, \eta_s, \eta_t, \delta_t).$$
Here $\delta_r$ and $\delta_t$ are vectors of all 1's with $r$ and $t$ components, respectively, and $\eta_s$ and $\eta_t$ are 0 vectors with $s$ and $t$ components, respectively. Thus we have been successful in obtaining agreement between the two rows in the positions labelled $\delta_r$ and $\eta_s$, but have been unable to obtain agreement in the positions labelled $\delta_t$ and $\eta_t$. We may suppose, moreover, that these $2t$ positions of disagreement are the minimal number of disagreements obtainable among all attempts to transform the first row of $A$ into the first row of $A^*$ by interchanges.
Let $A'_{m-1}$ and $A^*_{m-1}$ denote the matrices composed of the last $m - 1$ rows of $A'$ and $A^*$, respectively. The row sum vectors of $A'_{m-1}$ and $A^*_{m-1}$ are equal. Also corresponding columns of $A'_{m-1}$ and $A^*_{m-1}$ below the positions labelled $\delta_r$ and $\eta_s$ have equal sums. Let $\alpha_i$ denote the $(r + s + i)$th column of $A'_{m-1}$, and let $\beta_i$ denote the $(r + s + t + i)$th column of $A'_{m-1}$, where $i = 1, \ldots, t$. Let $\alpha_1^*, \ldots, \alpha_t^*$ and $\beta_1^*, \ldots, \beta_t^*$ denote the corresponding columns of $A^*_{m-1}$. Let $a_i$, $b_i$, $a_i^*$, $b_i^*$ denote the column sums of $\alpha_i$, $\beta_i$, $\alpha_i^*$, $\beta_i^*$, respectively. Now in $A'_{m-1}$ we cannot have simultaneously a 0 in the position determined by row $j$ and column $\alpha_i$ and a 1 in the position determined by row $j$ and column $\beta_i$. For if this were the case, we could perform an interchange and reduce the $2t$ disagreements in the first row of $A'$. Hence $a_i \ge b_i$. Moreover, $a_i^* = a_i + 1$ and $b_i^* = b_i - 1$, whence $a_i^* - b_i^* = a_i - b_i + 2 \ge 2$. In $A^*_{m-1}$, consider columns $\alpha_i^*$ and $\beta_i^*$. There exists a row of $A^*_{m-1}$ that has a 1 in column $\alpha_i^*$ and a 0 in column $\beta_i^*$. Replace the 1 by 0 and the 0 by 1, and let such a replacement be made for each $i = 1, \ldots, t$. We obtain in this way a new matrix $A''_{m-1}$ whose row and column sum vectors are equal to those of $A'_{m-1}$. By the induction hypothesis, we may transform $A'_{m-1}$ into $A''_{m-1}$ by interchanges. However, these interchanges applied to $A'$ will allow us to perform further interchanges and make the first rows of the transformed $A'$ and $A^*$ coincide. Hence the theorem follows.
Let $\mathfrak{A}$ denote the class of all (0, 1)-matrices of $m$ rows and $n$ columns, with row sum vector $R$ and column sum vector $S$. The term rank $\rho$ of $A$ in $\mathfrak{A}$ is the order of the greatest minor of $A$ with a nonzero term in its determinant expansion (6). This integer is also equal to the minimal number of rows and columns that contain collectively all of the nonzero elements of $A$ (4).
A (0, 1)-matrix $A = [a_{ij}]$ may be considered an incidence matrix distributing $n$ elements $x_1, \ldots, x_n$ into $m$ sets $S_1, \ldots, S_m$. Here $a_{ij} = 1$ or 0 according as $x_j$ is or is not in $S_i$. From this point of view the term rank of a matrix is a generalization of the concept of a system of distinct representatives for subsets $S_1, \ldots, S_m$ of a finite set $N$ (2). Indeed, the subsets $S_1, \ldots, S_m$ possess a system of distinct representatives if and only if $\rho = m$.
THEOREM 3.2. Let $\tilde\rho$ be the minimal and $\bar\rho$ the maximal term rank for the matrices in $\mathfrak{A}$. Then there exists a matrix in $\mathfrak{A}$ possessing term rank $\rho$, where $\rho$ is an arbitrary integer on the range $\tilde\rho \le \rho \le \bar\rho$.
For an interchange applied to a matrix in $\mathfrak{A}$ either changes the term rank by 1 or else leaves it unaltered. But by Theorem 3.1, we may transform the matrix of term rank $\tilde\rho$ into the matrix of term rank $\bar\rho$. This implies that there exists a matrix in $\mathfrak{A}$ of term rank $\rho$.
REFERENCES
1. Marshall Hall, An existence theorem for Latin squares, Bull. Amer. Math. Soc., 51 (1945), 387-388.
2. P. Hall, On representatives of subsets, J. Lond. Math. Soc., 10 (1935), 26-30.
3. G. H. Hardy, J. E. Littlewood, and G. Pólya, Inequalities (Cambridge, 1952).
4. Dénes König, Theorie der endlichen und unendlichen Graphen (New York, 1950).
5. R. F. Muirhead, Some methods applicable to identities and inequalities of symmetric algebraic functions of n letters, Proc. Edinburgh Math. Soc., 21 (1903), 144-157.
6. Oystein Ore, Graphs and matching theorems, Duke Math. J., 22 (1955), 625-639.
7. H. J. Ryser, A combinatorial theorem with an application to Latin rectangles, Proc. Amer. Math. Soc., 2 (1951), 550-552.
Ohio State University
Reprinted from Canad. J. Math. 9 (1957), 371-377
GRAPH THEORY AND PROBABILITY
P. ERDŐS
A well-known theorem of Ramsey (8; 9) states that to every $n$ there exists a smallest integer $g(n)$ so that every graph of $g(n)$ vertices contains either a set of $n$ independent points or a complete graph of order $n$, but there exists a graph of $g(n) - 1$ vertices which does not contain a complete subgraph of $n$ vertices and also does not contain a set of $n$ independent points. (A graph is called complete if every two of its vertices are connected by an edge; a set of points is called independent if no two of its points are connected by an edge.) The determination of $g(n)$ seems a very difficult problem; the best inequalities for $g(n)$ are (3)
$$2^{n/2} < g(n) \le \binom{2n-2}{n-1}. \tag{1}$$
It is not even known that $g(n)^{1/n}$ tends to a limit. The lower bound in (1) has been obtained by combinatorial and probabilistic arguments without an explicit construction. In our paper (5) with Szekeres $f(k, l)$ is defined as the least integer so that every graph having $f(k, l)$ vertices contains either a complete graph of order $k$ or a set of $l$ independent points ($f(k, k) = g(k)$). Szekeres proved
$$f(k, l) \le \binom{k+l-2}{k-1}. \tag{2}$$
= 3,j(3, I)
< (I ~
1) .
I recently proved by an explicit construction that 1(3, I) probabilistic arguments I can prove that for k > 3 (3)
l(k, I)
>
1 (k
>
11+e,
(4). By
t ~ ~ Y', 2
which shows that (2) is not very far from being best possible. Define now $h(k, l)$ as the least integer so that every graph of $h(k, l)$ vertices contains either a closed circuit of $k$ or fewer lines, or that the graph contains a set of $l$ independent points. Clearly $h(3, l) = f(3, l)$. By probabilistic arguments we are going to prove that for fixed $k$ and sufficiently large $l$
$$h(k, l) > l^{1 + 1/2k}. \tag{4}$$
Further we shall prove that
$$h(2k + 1, l) < c_3\, l^{1 + 1/k}, \qquad h(2k + 2, l) < c_3\, l^{1 + 1/k}. \tag{5}$$
Received December 13, 1957.
A graph is called $r$ chromatic if its vertices can be coloured by $r$ colours so that no two vertices of the same colour are connected; also its vertices cannot be coloured in this way by $r - 1$ colours. Tutte (1, 2) first showed that for every $r$ there exists an $r$ chromatic graph which contains no triangle and Kelly (6) showed that for every $r$ there exists an $r$ chromatic graph which contains no $k$-gon for $k \le 5$. (Tutte's result was rediscovered several times, for instance, by Mycielski (7). It was asked if such graphs exist for every $k$.) Now (4) clearly shows that this holds for every $k$ and in fact that there exists a graph of $n$ vertices of chromatic number $> n^{\varepsilon}$ which contains no closed circuit of fewer than $k$ edges.
Now we prove (4). Let $n$ be a large number, $0 < \varepsilon < 1/k$
is arbitrary. Put $m = [n^{1+\varepsilon}]$ ($[x]$ denotes the integral part of $x$, that is, the greatest integer not exceeding $x$), $p = [n^{1-\eta}]$ where $0 < \eta < \varepsilon/2$ is arbitrary. Let $G^{(n)}$ be the complete graph of $n$ vertices $x_1, x_2, \ldots, x_n$ and $G^{(p)}$ any of its complete subgraphs having $p$ vertices. Clearly we can choose $G^{(p)}$ in $\binom{n}{p}$ ways. Let
be an arbitrary subgraph of ®(n) having m edges (the number of possible choices of a is clearly as indicated). First of all we show that for almost all a ®a(n) has the property that it has more than n common edges with every ®(P). Almost all here means: for all a's except for
Let the vertices of ~)
to ((;)) ((~=~~)) + ((!)) ((;):(~))
<
1)
<
<
(~)p.exp( -'i).
36
P. ERDOS
Now the number of possible choices for
®(P)
is
(;) < nP < pn. Thus the number of a's for which there exists a has not more than n'edges is less than (7] < E/2)
®(P)
so that
®(P)
f\
®,,(n)
as stated. Unfortunately almost all of these graphs ®.. (n) contain closed circuits of length not exceeding k (in fact almost all of them contain triangles). But we shall now prove that almost all ®,,(n) contain fewer than n/k closed circuits of length not exceeding k. The number of graphs ® ..(n) which contain a given closed circuit (Xl, X2), (X2, X3), ... , (x It Xl) clearly equals
The circuit is determined by its vertices and their order-thus there are n(n - 1) ... (n - l + 1) such circuits. Therefore the expected number of closed circuits of length not exceeding k equals
(~r t.1! (;) ((;l j < (1
<
(1
+ .(1))
+ 0(1)) n" (2~t = n
t, .{(;)),
u(n)
since E < l/k. Therefore, by a simple and well-known argument, the number of the a's for which ®,,(n) contains n/k or more closed paths of length not exceeding k is
{~), as stated. Thus we see that for almost all a ®.. (n) has the following properties: in every ®(P) it has more than n edges and the number of its closed circuits having k or fewer edges is less than n/k. Omit from ®,,(n) all the edges contained in a closed circuit of k or fewer edges. By what has just been said we omit fewer than n edges. Thus we obtain a new graph ®..'(n) which by construction does not contain a closed circuit of k or fewer edges. Also clearly ®,,'(") f\ ®(P)
is not empty for every $G^{(p)}$. Thus the maximum number of independent points in $G_\alpha'^{(n)}$ is less than $p = [n^{1-\eta}]$, or
$$h(k, [n^{1-\eta}]) > n,$$
which proves (4).
By more complicated arguments one can improve (4) considerably; thus for $k = 3$ I can show that for every $\varepsilon > 0$ and sufficiently large $l$
$$f(3, l) = h(3, l) > l^{2-\varepsilon},$$
which by (2) is very close to the right order of magnitude. At the moment I am unable to replace the above "existence proof" by a direct construction. By using a little more care I can prove by the above method the following result: there exists a (sufficiently small) constant $c_4$ so that for every $k$ and $l$
$$h(k, l) > c_4\, l^{1 + 1/2k}. \tag{6}$$
(If $k > c \log l$ (6) is trivial since $h(k, l) \ge l$.) From (6) it is easy to deduce that to every $r$ there exists a $c_5$ so that for $n > n_0(r, c_5)$ there exists an $r$ chromatic graph of $n$ vertices which does not contain a closed circuit of fewer than $[c_5 \log n]$ edges. I am not sure if this result is best possible.
We do not give the details of the proof of (3) since it is simpler than that of (4). For $k = 3$ (3) follows from (4). If $k > 3$, put
$$m = c_6 [n^{2 - 2/(k-1)}]$$
and denote by $G_\alpha^{(n)}$ the "random" graph of $m$ edges. By a simple computation it follows that for sufficiently small $c_6$, $G_\alpha^{(n)}$ does not contain a complete graph of order $k$ for more than
values of $\alpha$, and that for more than this number of values of $\alpha$ $G_\alpha^{(n)}$ does not contain a set of $c_7 n^{2/(k-1)} \log n$ independent points ($c_7 = c_7(c_6)$ is sufficiently large). Thus
$$f(k, c_7 n^{2/(k-1)} \log n) > n,$$
which implies (3) by a simple computation.
Now we prove (5). It will clearly suffice to prove the first inequality of (5). We use induction on $l$. Let there be given a graph $G$ having $h(2k + 1, l) - 1$ vertices which does not contain a closed circuit of $2k + 1$ or fewer edges and for which the maximum number of independent points is less than $l$. If every point of $G$ has order at least $[l^{1/k}] + 2$ (the order of a vertex is the number of edges emanating from it) then, starting from an arbitrary point, we reach in $k$ steps at least $l$ points, which must be all distinct since otherwise $G$ would have to contain a closed circuit of at most $2k$ edges. The endpoints thus obtained must be independent, for if two were connected by an edge $G$ would contain a closed circuit of $2k + 1$ edges. Thus $G$ would have a set of at least $l$ independent points, which is false. Thus $G$ must have a vertex $x_1$ of order at most $[l^{1/k}] + 1$. Omit the vertex $x_1$ and all the vertices connected with it. Thus we obtain the graph $G'$ and $x_1$ is not connected with any point of $G'$, thus the maximum number of independent points of $G'$ is less than $l - 1$, or $G'$ has at most $h(2k + 1, l - 1) - 1$ vertices, hence
$$h(2k + 1, l) \le h(2k + 1, l - 1) + [l^{1/k}] + 2,$$
which proves (5).
REFERENCES
1. Blanche Descartes, A three colour problem, Eureka (April, 1947). (Solution March, 1948.)
2. ----, Solution to Advanced Problem no. 4526, Amer. Math. Monthly, 61 (1954), 352.
3. P. Erdős, Some remarks on the theory of graphs, B.A.M.S. 53 (1947), 292-4.
4. ----, Remarks on a theorem of Ramsey, Bull. Research Council of Israel, Section F, 7 (1957).
5. P. Erdős and G. Szekeres, A combinatorial problem in geometry, Compositio Math. 2 (1935), 463-70.
6. J. B. Kelly and L. M. Kelly, Paths and circuits in critical graphs, Amer. J. Math., 76 (1954), 786-92.
7. J. Mycielski, Sur le coloriage des graphs, Colloquium Math. 3 (1955), 161-2.
8. F. P. Ramsey, Collected papers, 82-111.
9. T. Skolem, Ein kombinatorischer Satz mit Anwendung auf ein logisches Entscheidungsproblem, Fund. Math. 20 (1933), 254-61.
University of Toronto and
Technion, Haifa
Reprinted from Canad. J. Math. 11 (1959), 34-38
Kasteleyn, P. W.
Physica 27
1961
1209-1225
THE STATISTICS OF DIMERS ON A LATTICE I. THE NUMBER OF DIMER ARRANGEMENTS ON A QUADRATIC LATTICE
by P. W. KASTELEYN Koninklijke/Shell-Laboratorium, Amsterdam, Nederland (Shell Internationale Research Maatschappij N.V.)
Synopsis The number of ways in which a finite quadratic lattice (with edges or with periodic boundary conditions) can be fully covered with given numbers of "horizontal" and "vertical" dimers is rigorously calculated by a combinatorial method involving Pfaffians. For lattices infinite in one or two dimensions asymptotic expressions for this number of dimer configurations are derived, and as an application the entropy of a mixture of dimers of two different lengths on an infinite rectangular lattice is calculated. The relation of this combinatorial problem to the Ising problem is briefly discussed.
§ 1. Introduction. Combinatorial problems relating to a regular space lattice arise in the theory of various physical phenomena. One of these problems is the "arrangement problem", which plays a role in the explanation of the non-ideal thermodynamic behaviour of liquids consisting of molecules of different size with zero energy of mixing (athermal mixtures). In the investigations devoted to this problem (most of which have been discussed critically by Guggenheim 1)) much attention has been paid to the so-called quasi-crystalline model. One considers a regular lattice consisting of points (sites, vertices) connected by bonds. This lattice is fully covered with monomers (molecules occupying one site) and rigid or flexible polymers (molecules occupying several sites connected by bonds); the latter may be dimers, trimers etc., but also "high polymers". If the energy of mixing is zero the thermodynamic properties of this system can be calculated from the combinatorial factor, i.e. the number of ways of arranging given numbers of monomers and polymers on the lattice. The same combinatorial problem arises in the cell-cluster theory of the liquid state 2). There one divides the volume of a liquid into a set of cells of which the centres form a regular lattice, and one considers situations in which, by the removal of certain interfaces between cells, a number of double cells, triple cells etc. have been formed. For the calculation of the free energy of the liquid one then has to determine the number of ways in which
a given volume can be divided into given numbers of single, double, triple cells etc.; the equivalence of this combinatorial factor with the former one is obvious. A two-dimensional form of the arrangement problem is encountered in the theory of adsorption of diatomic, triatomic etc. molecules on a regular surface. The empty sites of the surface then play the role of "monomers". As in many problems of this sort, it is easy to find the most general solution for a one-dimensional lattice 2), it appears very difficult to find a more or less general solution for two-dimensional lattices, whereas for three dimensions any exact solution seems extremely remote. Therefore one generally uses approximation methods; we refer to the work of Fowler and Rushbrooke 3) (who also made some rigorous calculations on two- and three-dimensional infinite strips of finite width), Chang, Flory, Huggins, Miller, Guggenheim (for detailed references see ref. 1)), Orr 4), Rushbrooke, Scoins and Wakefield 5), and Cohen, De Boer and Salsburg 2). Recently, Green and Leipnik 6) claimed to have found a rigorous solution for the case of monomer-dimer mixtures on a two-dimensional lattice, but Fisher and Temperley 7), and Katsura and Inawashiro 7) proved that their results were not correct. In this paper we present a rigorous solution to the above-mentioned combinatorial problem for a very special case, viz. that of a two-dimensional quadratic lattice, completely covered with dimers (in terms of graph theory we ask for the number of "perfect matchings" of the lattice 8)). Both the absence of monomers and the dimension of the lattice form serious restrictions, but it is hoped that the present investigation may be useful as a first step. The situation has some resemblance to that of the combinatorial problem connected with the Ising model of cooperative phenomena 9), for which an exact solution has been given for an equally special case 10-13).
It will be shown that the two problems are to a certain extent analogous. In § 2 and § 3 we shall develop the method of solution for the case of a finite lattice imbedded in a plane (i.e. a rectangle with edges), and in § 4 for that of a lattice imbedded in a torus (i.e. with periodic boundary conditions). In § 5 an alternative method will be sketched. As an application, the entropy of a certain mixture of dimers is calculated in § 6. It is intended to treat in a subsequent paper the statistics of dimers on other two-dimensional lattices, to discuss boundary effects and to make some remarks on three-dimensional lattices.

§ 2. The planar quadratic lattice. Consider a planar quadratic $m \times n$ lattice $Q_{mn}$ to which one can attach dimers (figures consisting of two linked vertices) in such a way that every dimer occupies two lattice points connected by a bond. We indicate the lattice points by $(i, j)$ or $p$ ($i = 1, \ldots, m$; $j = 1, \ldots, n$; $p = 1, \ldots, mn$), the number of "horizontal" dimers (occupying
two points $(i, j)$ and $(i + 1, j)$) by $N_2$ and the number of "vertical" dimers (occupying two points $(i, j)$ and $(i, j + 1)$) by $N_2'$. If $g(N_2, N_2')$ is the combinatorial factor, i.e. the number of ways of covering the lattice with dimers so that every site is covered by one and only one dimer vertex, we ask for the configuration generating function
$$Z_{mn}(z, z') = \sum_{N_2, N_2'} g(N_2, N_2')\, z^{N_2} z'^{N_2'}, \tag{1}$$
where the sum runs over all combinations $N_2$, $N_2'$ satisfying $2(N_2 + N_2') = mn$; if desired, the counting variables $z$ and $z'$ may be viewed as activities and $Z_{mn}$ as the configurational partition function. At least one of the two numbers $m$ and $n$ has to be even; let $m$ be even. We shall refer to the arrangement of dimers occupying the pairs of sites $p_1$ and $p_2$, $p_3$ and $p_4$, $p_5$ and $p_6$, etc. as the configuration $C = |p_1; p_2|p_3; p_4|p_5; p_6| \cdots |p_{mn-1}; p_{mn}|$. A simple but important configuration is $C_0 = |1,1; 2,1|3,1; 4,1| \cdots |m-1,1; m,1|1,2; 2,2| \cdots |m-1,n; m,n|$, which we shall call the standard configuration (fig. 1a). We could, however, represent this arrangement of dimers equally well by $|2,1; 1,1|3,1; 4,1| \cdots$ or by $|4,1; 3,1|1,1; 2,1| \cdots$ etc. To make the representation unique we order the points of the lattice row after row by choosing the $p$-numbering as follows:
$$(i, j) \leftrightarrow p = (j - 1)m + i, \tag{2}$$
and we introduce the convention that the points of a configuration shall be indicated in the following ("canonical") order:
$$p_1 < p_2;\quad p_3 < p_4;\quad \ldots;\quad p_{mn-1} < p_{mn}; \tag{3a}$$
$$p_1 < p_3 < \cdots < p_{mn-1}. \tag{3b}$$
By analogy to the determinantal approach to the Ising problem developed by Kac and Ward 11) we shall try to construct a mathematical form consisting of a series of terms each of which corresponds uniquely to one configuration and has the "weight" $z^{N_2} z'^{N_2'}$ of this configuration. The conditions (3) strongly suggest that this form should be a Pfaffian rather than a determinant. A Pfaffian is a number attributed to a triangular array of coefficients $a(k; k')$ ($k = 1, \ldots, N$; $k' = 1, \ldots, N$; $k < k'$; $N$ even) in the following way 14):
$$\mathrm{Pf}\{a(k; k')\} = {\sum_P}' \delta_P\, a(k_1; k_2)\, a(k_3; k_4) \cdots a(k_{N-1}; k_N), \tag{4}$$
where the sum runs over those permutations $k_1, k_2, \ldots, k_N$ of the numbers $1, 2, \ldots, N$ which obey
$$k_1 < k_2;\quad k_3 < k_4;\quad \ldots;\quad k_{N-1} < k_N;\quad k_1 < k_3 < \cdots < k_{N-1}, \tag{3'}$$
and where $\delta_P$ is the parity of the permutation $P$, i.e. $-1$ or $+1$ according as $P$ is an odd or an even permutation. Pfaffians have been introduced into physics by Caianiello and Fubini 15) and in lattice-combinatorial problems by Hurst and Green 13). We shall now show that it is indeed possible to define a triangular array of elements $D(p; p')$ so that
$$Z_{mn}(z, z') = \mathrm{Pf}\{D(p; p')\}. \tag{5}$$
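The definition (4) with the ordering conditions (3') can be made concrete with a short sketch. The Python below is only an illustration under stated assumptions: the function names `parity`, `canonical_orderings` and `pfaffian` are hypothetical and not part of the paper, indices run from 0 instead of 1, and only the upper triangle of the array is read.

```python
def parity(order):
    """Return +1 for an even permutation of 0..N-1 and -1 for an odd one."""
    sign = 1
    seen = [False] * len(order)
    for start in range(len(order)):
        length = 0
        j = start
        while not seen[j]:
            seen[j] = True
            j = order[j]
            length += 1
        # a cycle of even length is an odd permutation
        if length and length % 2 == 0:
            sign = -sign
    return sign


def canonical_orderings(points):
    """Yield the orderings k1, k2, ..., kN of `points` that satisfy (3')."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for tail in canonical_orderings(remaining):
            yield [first, partner] + tail


def pfaffian(a):
    """Evaluate (4) term by term for an array with an even number of rows."""
    n = len(a)
    total = 0
    for order in canonical_orderings(list(range(n))):
        term = parity(order)
        for t in range(0, n, 2):
            term *= a[order[t]][order[t + 1]]
        total += term
    return total


# N = 4: Pf = a12*a34 - a13*a24 + a14*a23
example = [[0, 1, 2, 3],
           [0, 0, 4, 5],
           [0, 0, 0, 6],
           [0, 0, 0, 0]]
print(pfaffian(example))
```

For $N = 4$ the three admissible orderings reproduce the familiar expansion $a(1;2)a(3;4) - a(1;3)a(2;4) + a(1;4)a(2;3)$. The enumeration grows as $(N-1)!!$, so the sketch is meant for very small arrays only.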
We begin by noting that if we define $D(p; p') = 0$ for all pairs of sites $(p; p')$ that are not connected by a bond, all terms in the Pfaffian that would not correspond to a dimer configuration will vanish. Next we put the coefficients $D(p; p')$ corresponding to pairs of sites that are connected by a horizontal or a vertical bond equal, in absolute magnitude, to $z$ and $z'$, respectively. In this way we get all configurations represented by a term of the proper weight; the conditions (3) and (3') ensure that the correspondence is one-to-one. Finally, in order that all configurations are counted positively we have to choose the signs of the non-zero elements such that the product $D(p_1; p_2) D(p_3; p_4) \cdots$ has the same sign as the parity $\delta_P$. It is evident that the product corresponding to the standard configuration $C_0$ has to be positive.

Fig. 1. (a) The standard configuration $C_0$ of the planar lattice $Q_{64}$. (b) The construction of a new configuration $C$ (full lines) from $C_0$ (dotted lines).

Now, from $C_0$ one can obtain any arbitrary configuration $C$ in the following way. We draw a picture of the lattice in which every dimer of the configuration $C_0$ is represented by a dotted line and every dimer of the configuration $C$ by a full line (fig. 1b). Since any lattice point is the endpoint of just one line of each type the resulting figure consists of: a) pairs of sites connected both by a dotted line and by a full line; b) closed polygons consisting of alternating dotted and full lines (to be called $C_0$-bonds and non-$C_0$-bonds). If we then take the configuration $C_0$ and we shift in each of the "alternating" polygons all dimers clockwise or counter-clockwise by one step, $C_0$ goes over into $C$. By analogy to the Ising problem 9) one might expect that each of the polygons (representing a cyclic permutation of an even number of lattice sites) would contribute a factor $-1$ to the parity $\delta_P$ of the permutation $P$ corresponding to $C$. In fact, this is true, but owing to the restrictions on
the permutations occurring in a Pfaffian, the proof is not so simple as in the Ising problem. Consider e.g. the small square in fig. 1b. According to (3') its vertices occur in the term representing $C_0$ as $\ldots D(p_1; p_2) \ldots D(p_3; p_4) \ldots$ and in the term representing $C$ as $\ldots D(p_1; p_3) D(p_2; p_4) \ldots$. Obviously the change from $p_1 p_2 p_3 p_4$ to $p_1 p_3 p_2 p_4$ is not a cyclic permutation of the four points along the square. It can, however, be considered as the product of the following permutations: $p_1 p_2 p_3 p_4 \to p_1 p_2 p_4 p_3$ (putting the points into a cyclic order corresponding to the square) $\to p_2 p_4 p_3 p_1$ (permuting the four points cyclically) $\to p_2 p_4 p_1 p_3$ (ordering the points $p_1$ and $p_3$ according to (3a)) $\to p_1 p_3 p_2 p_4$ (ordering the pairs $(p_1; p_3)$ and $(p_2; p_4)$ according to (3b)). The resulting permutation is odd; apparently the parities of the re-ordering permutations just compensate each other. To show that this is true in general we first take a configuration which differs from $C_0$ only in the position of the dimers on one polygon. Consider a column of "$C_0$-bonds" which is crossed by this polygon. If we describe a closed path along the polygon, we cross this column as many times in the "forward direction" (i.e., in the direction of increasing $p$) as in the "backward direction". This is true for any column of $C_0$-bonds, and therefore, if the path contains $r$ forward steps along $C_0$-bonds it must also contain $r$ backward steps along $C_0$-bonds. By a similar argument combined with the alternation of the two types of bonds, we can show that it also contains $r$ forward and $r$ backward steps along non-$C_0$-bonds. In the terms of the Pfaffian which correspond to $C_0$ and $C$, on the other hand, all points of the polygon occur in the order of increasing $p$.
Consequently the permutation which changes the "$C_0$-term" into the "$C$-term" can be considered as the product of: 1) the reversal of the $r$ pairs of sites ($C_0$-bonds) for which the canonical order is opposite to the required cyclic order; 2) the rearrangement of the $2r$ $C_0$-bonds which is needed to get all polygon vertices in the cyclic order; 3) the cyclic permutation of these $4r$ vertices; 4) the reversal of the $r$ pairs of sites which now violate (3a); 5) the rearrangement of the $2r$ pairs which is needed to satisfy (3b). Each reversal within a pair contributes a factor $-1$ to the parity of the resulting permutation, a reshuffling of the pairs a factor $+1$, and the cyclic permutation a factor $(-1)^{4r-1}$. We thus find that the total parity of the permutation is $(-1)^r (+1) (-1)^{4r-1} (+1) (-1)^r = -1$. If there is more than one polygon, we can perform the corresponding permutations consecutively; each polygon then contributes a factor $-1$. It shall now be indicated how these factors can be compensated for. First we remark that since no two $C_0$-bonds have a point in common, an alternating polygon can neither intersect itself nor cross or touch other polygons. Therefore we shall not encounter such difficulties as arose in the corresponding step of Kac and Ward's method 11) 12). Any alternating polygon can be considered as built up from horizontal strips of connected unit squares. From the requirement that during the
cyclic shift the opposite sides of a connected figure are shifted in opposite directions, combined with the alternation of $C_0$-bonds and non-$C_0$-bonds we conclude that both the numbers of unit squares in a strip and the number of strips are odd (cf. fig. 1b). Consequently, each alternating polygon encircles an odd number of unit squares, and it will be sufficient to choose the signs of the $D(p; p')$ such that among the four bonds bounding a unit square there is an odd number having a negative $D(p; p')$; we have further seen that the standard configuration has to appear with a positive sign. This can be realized e.g. by attributing minus signs to the coefficients of the vertical bonds between lattice sites of odd $i$. We thus get the following set of coefficients
$$\begin{aligned} D(i, j; i + 1, j) &= z && \text{for } 1 \le i \le m - 1,\ 1 \le j \le n, \\ D(i, j; i, j + 1) &= (-1)^i z' && \text{for } 1 \le i \le m,\ 1 \le j \le n - 1, \\ D(i, j; i', j') &= 0 && \text{otherwise.} \end{aligned} \tag{6}$$
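The sign rule (6) can be checked on small lattices by comparing the Pfaffian with a direct enumeration of dimer coverings. The sketch below is an illustration only: the helper names `kasteleyn_array`, `pfaffian` and `count_coverings` are hypothetical, sites are numbered as in (2) but starting from 0, and the Pfaffian is expanded along the first remaining row (an equivalent recursive form of (4)), which is practical only for small $mn$.

```python
def kasteleyn_array(m, n, z=1, zp=1):
    """Upper-triangular array D(p; p') of eq. (6); m must be even."""
    size = m * n
    d = [[0] * size for _ in range(size)]

    def site(i, j):
        # eq. (2), shifted to 0-based indices
        return (j - 1) * m + i - 1

    for j in range(1, n + 1):
        for i in range(1, m + 1):
            if i < m:
                d[site(i, j)][site(i + 1, j)] = z
            if j < n:
                d[site(i, j)][site(i, j + 1)] = (-1) ** i * zp
    return d


def pfaffian(a, idx=None):
    """Pfaffian by expansion along the first remaining row (upper triangle only)."""
    if idx is None:
        idx = tuple(range(len(a)))
    if not idx:
        return 1
    first, rest = idx[0], idx[1:]
    total = 0
    for pos, partner in enumerate(rest):
        coeff = a[first][partner]
        if coeff == 0:
            continue
        sign = -1 if pos % 2 else 1
        total += sign * coeff * pfaffian(a, rest[:pos] + rest[pos + 1:])
    return total


def count_coverings(m, n):
    """Count dimer coverings of the m x n lattice by exhaustive search."""
    covered = [False] * (m * n)

    def fill(pos):
        while pos < m * n and covered[pos]:
            pos += 1
        if pos == m * n:
            return 1
        total = 0
        covered[pos] = True
        right, below = pos + 1, pos + m
        if pos % m + 1 < m and not covered[right]:
            covered[right] = True
            total += fill(pos + 1)
            covered[right] = False
        if below < m * n and not covered[below]:
            covered[below] = True
            total += fill(pos + 1)
            covered[below] = False
        covered[pos] = False
        return total

    return fill(0)


for m, n in [(2, 2), (2, 3), (4, 3), (4, 4)]:
    print(m, n, pfaffian(kasteleyn_array(m, n)), count_coverings(m, n))
```

With $z = z' = 1$ every term of the Pfaffian should equal $+1$, so the two columns printed for each lattice are expected to agree.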
The equations (5) and (6) are sufficient to derive the configuration generating function $Z_{mn}(z, z')$. It should be remarked that throughout this paper we assume that the dimers are symmetric. If they were asymmetric all elements $D(p; p')$ would have to be multiplied by 2, since any pair of sites might then be occupied in two distinguishable ways.

§ 3. The evaluation of the Pfaffian. For the evaluation of $\mathrm{Pf}\{D(p; p')\}$ we make use of the property of a Pfaffian that its square is equal to the determinant of the skew-symmetric matrix to which the given triangular array of coefficients can be extended 14). That is, in our case,
$$Z_{mn}^2(z, z') = [\mathrm{Pf}\, D]^2 = \det D, \tag{7}$$
where $D$ is the matrix given by (6) together with the requirement of skew symmetry:
$$D(i, j; i', j') = -D(i', j'; i, j), \tag{8}$$
and $\mathrm{Pf}\, D$ stands for $\mathrm{Pf}\{D(p; p')\}$. If $D$ were a completely periodic matrix, it could easily be brought into a diagonal form, viz. by a Fourier-type similarity transformation 9), and the calculation of the determinant would be straightforward. However, the truncated edges of the lattice $Q_{mn}$ disturb the periodicity of the matrix. Fortunately, it is still possible to bring it into a "nearly diagonal" form. We write $D$ as the sum of two direct products of an $m \times m$ matrix and an $n \times n$ matrix:
$$D = z(Q_m \times E_n) + z'(F_m \times Q_n), \tag{9}$$
where $E$ is the unit matrix,
$$Q = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 & 0 \\ -1 & 0 & 1 & \cdots & 0 & 0 \\ 0 & -1 & 0 & \cdots & 0 & 0 \\ \vdots & & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & 0 & \cdots & -1 & 0 \end{pmatrix}, \qquad F = \begin{pmatrix} -1 & 0 & 0 & \cdots & 0 & 0 \\ 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & -1 & \cdots & 0 & 0 \\ \vdots & & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & -1 & 0 \\ 0 & 0 & 0 & \cdots & 0 & 1 \end{pmatrix}, \tag{10}$$
and the indices indicate the order of the matrices. It can be verified that $Q_n$ can be diagonalized by a similarity transformation $\tilde{Q}_n = U_n^{-1} Q_n U_n$ with the matrix $U_n$ given by
$$U_n(l; l') = \{2/(n + 1)\}^{1/2}\, i^{l} \sin\{l l' \pi/(n + 1)\}, \qquad U_n^{-1}(l; l') = \{2/(n + 1)\}^{1/2}\, (-i)^{l'} \sin\{l l' \pi/(n + 1)\}; \tag{11}$$
the diagonal elements of $\tilde{Q}_n$ are the eigenvalues $2i \cos\{l\pi/(n + 1)\}$ of $Q_n$ ($l = 1, \ldots, n$). On the other hand, this transformation obviously leaves $E_n$ invariant. $Q_m$ can be diagonalized analogously by a transformation with $U_m$, but this transformation disturbs the diagonal form of $F_m$, although not seriously. Transforming $D$ with the direct product $U = U_m \times U_n$ we find that $\tilde{D} = U^{-1} D U$ has the following elements:
$$\tilde{D}(k, l; k', l') = 2iz\, \delta_{k,k'}\, \delta_{l,l'} \cos\{k\pi/(m + 1)\} - 2iz'\, \delta_{k+k',m+1}\, \delta_{l,l'} \cos\{l\pi/(n + 1)\};$$
i.e. the only non-zero elements are grouped in $2 \times 2$ blocks along the diagonal. Thus the determinant is readily found:
$$\det D = \det \tilde{D} = \prod_{k=1}^{\frac{1}{2}m} \prod_{l=1}^{n} \begin{vmatrix} 2iz \cos\dfrac{k\pi}{m+1} & -2iz' \cos\dfrac{l\pi}{n+1} \\[2mm] -2iz' \cos\dfrac{l\pi}{n+1} & -2iz \cos\dfrac{k\pi}{m+1} \end{vmatrix} \tag{12}$$
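and, from (7) and (12), we find the following expression for the configuration generating function of the lattice $Q_{mn}$:
$$\begin{aligned} Z_{mn}(z, z') &= \prod_{k=1}^{\frac{1}{2}m} \prod_{l=1}^{n} 2\left[z^2 \cos^2\frac{k\pi}{m+1} + z'^2 \cos^2\frac{l\pi}{n+1}\right]^{1/2} \\ &= 2^{\frac{1}{2}mn} \prod_{k=1}^{\frac{1}{2}m} \prod_{l=1}^{\frac{1}{2}n} \left[z^2 \cos^2\frac{k\pi}{m+1} + z'^2 \cos^2\frac{l\pi}{n+1}\right] && (n \text{ even}) \\ &= 2^{\frac{1}{2}m(n-1)} z^{\frac{1}{2}m} \prod_{k=1}^{\frac{1}{2}m} \prod_{l=1}^{\frac{1}{2}(n-1)} \left[z^2 \cos^2\frac{k\pi}{m+1} + z'^2 \cos^2\frac{l\pi}{n+1}\right] && (n \text{ odd}). \end{aligned} \tag{13}$$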
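As a numerical illustration of (13), the first form of the double product can be evaluated directly in floating point. The sketch below assumes only the formula as written above; the function name `dimer_generating_function` is hypothetical.

```python
import math


def dimer_generating_function(m, n, z=1.0, zp=1.0):
    """Evaluate the first line of eq. (13) for even m."""
    if m % 2:
        raise ValueError("m must be even")
    product = 1.0
    for k in range(1, m // 2 + 1):
        ck = math.cos(k * math.pi / (m + 1))
        for l in range(1, n + 1):
            cl = math.cos(l * math.pi / (n + 1))
            product *= 2.0 * math.sqrt(z * z * ck * ck + zp * zp * cl * cl)
    return product


# number of dimer arrangements g(mn/2) for a few small lattices (z = z' = 1)
for m, n in [(2, 2), (2, 3), (4, 4), (8, 8)]:
    print(m, n, round(dimer_generating_function(m, n)))
```

For the $8 \times 8$ lattice the product rounds to 12 988 816, the familiar number of domino coverings of a chessboard; with $z = 2$, $z' = 3$ on the $2 \times 2$ lattice it gives $z^2 + z'^2 = 13$.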
In writing down the expression valid for odd $n$ use has been made of the relation
$$\prod_{k=1}^{\frac{1}{2}m} 2 \cos\{k\pi/(m + 1)\} = 1,$$
which is a particular case of the identity
$$\prod_{k=1}^{\frac{1}{2}m} 4\left[u^2 + \cos^2\frac{k\pi}{m+1}\right] \equiv \frac{[u + (1 + u^2)^{1/2}]^{m+1} - [u - (1 + u^2)^{1/2}]^{m+1}}{2(1 + u^2)^{1/2}}; \tag{14}$$
this identity holds (for even $m$) because the two members represent two polynomials in $u$ of the same degree $m$, with the same zeros, and with equal coefficients of the leading term. For numerical calculations it is sometimes useful to perform, with the aid of (14), the product over $k$ in (13). In this way we find
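$$Z_{mn}(z, z') = z^{\frac{1}{2}mn} \prod_{l=1}^{[\frac{1}{2}n]} \frac{\left[\zeta \cos\dfrac{l\pi}{n+1} + \left(1 + \zeta^2 \cos^2\dfrac{l\pi}{n+1}\right)^{1/2}\right]^{m+1} - \left[\zeta \cos\dfrac{l\pi}{n+1} - \left(1 + \zeta^2 \cos^2\dfrac{l\pi}{n+1}\right)^{1/2}\right]^{m+1}}{2\left(1 + \zeta^2 \cos^2\dfrac{l\pi}{n+1}\right)^{1/2}}, \tag{15}$$
where $\zeta = z'/z$ and $[\frac{1}{2}n] = \frac{1}{2}n$ or $\frac{1}{2}(n - 1)$ according as $n$ is even or odd. In the limit $m \to \infty$, i.e. for infinitely long strips of finite width $n$, we get
$$Z_n(z, z') = \lim_{m \to \infty} \{Z_{mn}(z, z')\}^{1/m} = z^{\frac{1}{2}n} \prod_{l=1}^{[\frac{1}{2}n]} \left[\zeta \cos\frac{l\pi}{n+1} + \left(1 + \zeta^2 \cos^2\frac{l\pi}{n+1}\right)^{1/2}\right]. \tag{16}$$
Finally we have, in the limit of an infinitely large lattice:
$$\begin{aligned} Z(z, z') = \lim_{m,n \to \infty} \{Z_{mn}(z, z')\}^{1/mn} &= \exp\left\{\pi^{-2} \int_0^{\pi/2} d\omega \int_0^{\pi/2} d\omega' \ln 4[z^2 \cos^2\omega + z'^2 \cos^2\omega']\right\} \\ &= z^{1/2} \exp\left\{\pi^{-1} \int_0^{\pi/2} d\omega \ln[\zeta \cos\omega + (1 + \zeta^2 \cos^2\omega)^{1/2}]\right\}. \end{aligned} \tag{17}$$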
For $|z'| \le |z|$ we may expand the latter integrand in terms of $\zeta$, integrate term by term, and sum the resulting series, which gives:
$$\begin{aligned} \ln Z(z, z') &= \tfrac{1}{2} \ln z + \pi^{-1} \sum_{j=0}^{\infty} (-1)^j (2j + 1)^{-2} \zeta^{2j+1} \\ &= \tfrac{1}{2} \ln z + \pi^{-1} \int_0^{\zeta} dx\, x^{-1} \arctan x \\ &= \tfrac{1}{2} \ln z' + \pi^{-1} \int_0^{1/\zeta} dx\, x^{-1} \arctan x. \end{aligned} \tag{18}$$
From the equivalence of the last two expressions (which is easily proved) and the analogous derivation in the case $|z'| \ge |z|$ it follows that either of them may be used for all values of $z$ and $z'$. Using the relation $\arctan x = (2i)^{-1}[\ln(1 + ix) - \ln(1 - ix)]$ and introducing the function $A_2(x) = (2i)^{-1}[L_2(ix) - L_2(-ix)]$, where $L_2(u) = -\int_0^u dx\, x^{-1} \ln(1 - x)$ is Euler's dilogarithm 16), we finally arrive at the following expressions for the limit of the configurational partition function per site:
$$\ln Z(z, z') = \tfrac{1}{2} \ln z + \pi^{-1} A_2(z'/z) = \tfrac{1}{2} \ln z' + \pi^{-1} A_2(z/z'). \tag{19}$$
By substituting $z = z' = 1$ in equations (13) or (15), (16) and (19) we immediately find the total number of dimer arrangements, $g(\frac{1}{2}mn)$, and its asymptotic behaviour. One is sometimes interested in the "molecular freedom" $\varphi_2$ of the dimers defined 3) as the number of arrangements per dimer:
$$\varphi_2 = \{g(\tfrac{1}{2}mn)\}^{2/mn} = \Big\{\sum_{N_2, N_2'} g(N_2, N_2')\Big\}^{2/mn} = \{Z_{mn}(1, 1)\}^{2/mn}. \tag{20}$$
In particular, for the infinite lattice we find
$$\varphi_2^{(\infty)} = Z^2(1, 1) = \exp\{2\pi^{-1} A_2(1)\} = \exp\{2G/\pi\} = 1.791\,622\,812\ldots \tag{21}$$
where $G = 1^{-2} - 3^{-2} + 5^{-2} - 7^{-2} + \cdots = 0.915\,965\,594\ldots$ (Catalan's constant).
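The numerical value in (21) can be reproduced from the series for $G$. The following sketch is an illustration only; the function names `catalan_constant` and `molecular_freedom_infinite` are hypothetical, and the series is simply truncated, which for an alternating series leaves an error below the first omitted term.

```python
import math


def catalan_constant(terms=200000):
    """Partial sum of G = 1 - 1/3^2 + 1/5^2 - 1/7^2 + ..."""
    total = 0.0
    for j in range(terms):
        total += (-1) ** j / (2 * j + 1) ** 2
    return total


def molecular_freedom_infinite():
    """phi_2 for the infinite lattice, eq. (21): exp(2G/pi)."""
    return math.exp(2.0 * catalan_constant() / math.pi)


print(catalan_constant())
print(molecular_freedom_infinite())
```

With 200 000 terms the truncation error of the series is of order $10^{-11}$, which is sufficient to recover the digits quoted in (21).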
Several approximate values for $\varphi_2^{(\infty)}$ have, in more or less explicit form, been given in the literature. From Flory's theory of polymers 17) one can derive a value which corresponds to a "Bragg-Williams" or "random mixing" approximation, Chang 18) and Cohen et al. 2) used a "1st Bethe-Kikuchi" or "quasi-chemical" approximation, Orr 4) worked out a "2nd Bethe approximation", Miller 19) calculated a lower bound, and Fowler and Rushbrooke 3) obtained a very close estimate by extrapolating their exact results for infinite quadratic strips of widths up to 8 *) (which are, of course, included as special cases in our expression (16)). The various results are summarized in table II (p. 1220).

*) Miller's criticism 19) of the calculations of Fowler and Rushbrooke rests on a wrong interpretation of the method and is therefore not valid.

§ 4. The toroidal quadratic lattice. In this section we shall investigate the changes brought about by introducing periodic boundary conditions, i.e. by winding the lattice on a torus. We shall call the toroidal $m \times n$ lattice $Q^{(t)}_{mn}$, and the corresponding generating functions $Z^{(t)}_{mn}(z, z')$, $Z^{(t)}_{n}(z, z')$ and $Z^{(t)}(z, z')$. One difference with the case of a planar lattice is that $D(m, j; 1, j)$ and $D(i, n; i, 1)$ should no longer be taken to be zero but equal, in absolute
magnitude, to $z$ and $z'$, respectively; we can still choose the signs of these elements. Further we have now to distinguish four classes of configurations. The first class comprises those configurations that can be derived from the standard configuration (which we take identical to that of § 2) by cyclic shifts along polygons not looping the torus either in horizontal or in vertical direction, or, more generally, looping the torus an even number of times in both directions; let us call them (e, e) configurations. In an analogous way we define (o, e), (e, o) and (o, o) configurations (cf. fig. 2; e = even, o = odd, first symbol refers to horizontal loops).

Fig. 2. Configurations from the four configuration classes of the toroidal lattice: an (e, e), an (o, e), an (e, o) and an (o, o) configuration.

If we define, for $1 \le i \le m$, $1 \le j \le n$, four matrices $D_1$, $D_2$, $D_3$, $D_4$ by (6), (8) and the boundary elements
$$D_1:\quad D(m, j; 1, j) = z; \qquad D(i, n; i, 1) = (-1)^{i} z', \tag{22.1}$$
$$D_2:\quad D(m, j; 1, j) = z; \qquad D(i, n; i, 1) = (-1)^{i+1} z', \tag{22.2}$$
$$D_3:\quad D(m, j; 1, j) = -z; \qquad D(i, n; i, 1) = (-1)^{i} z', \tag{22.3}$$
$$D_4:\quad D(m, j; 1, j) = -z; \qquad D(i, n; i, 1) = (-1)^{i+1} z', \tag{22.4}$$
then $\mathrm{Pf}\, D_1$ counts the (e, e) configurations correctly and all others with the wrong sign, and we remark that $\mathrm{Pf}\, D_2$ counts all configurations correctly except those from the class (o, e), $\mathrm{Pf}\, D_3$ all configurations except the class (e, o) and $\mathrm{Pf}\, D_4$ all configurations except the class (o, o). These counting rules are summarized in table I:

TABLE I
Counting of configurations on a toroidal lattice

  Class of          Sign of corresponding terms in
  configurations    Pf D_1    Pf D_2    Pf D_3    Pf D_4
  (e, e)              +         +         +         +
  (o, e)              -         -         +         +
  (e, o)              -         +         -         +
  (o, o)              -         +         +         -

It is evident from this table that if we put
$$Z^{(t)}_{mn}(z, z') = \tfrac{1}{2}(-\mathrm{Pf}\, D_1 + \mathrm{Pf}\, D_2 + \mathrm{Pf}\, D_3 + \mathrm{Pf}\, D_4), \tag{23}$$
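all configurations are counted with the right sign so that we have obtained the analogue of eq. (5) for a toroidal quadratic lattice. The evaluation of eq. (23) runs parallel to that of eq. (5), the only difference being the occurrence of the matrices
$$\begin{pmatrix} 0 & 1 & 0 & \cdots & 0 & -1 \\ -1 & 0 & 1 & \cdots & 0 & 0 \\ 0 & -1 & 0 & \cdots & 0 & 0 \\ \vdots & & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & 0 & 1 \\ 1 & 0 & 0 & \cdots & -1 & 0 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 & 1 \\ -1 & 0 & 1 & \cdots & 0 & 0 \\ 0 & -1 & 0 & \cdots & 0 & 0 \\ \vdots & & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & 0 & 1 \\ -1 & 0 & 0 & \cdots & -1 & 0 \end{pmatrix}$$
instead of $Q$ (cf. eqs. (9) and (10)). They can be diagonalized successively by a transformation with the matrices
$$V(l, l') = (1/n)^{1/2} \exp\{2 l l' \pi i/n\}, \qquad V'(l, l') = (1/n)^{1/2} \exp\{l(2l' - 1)\pi i/n\}. \tag{24}$$
Proceeding as in § 3 we find
$$\begin{aligned} Z^{(t)}_{mn}(z, z') = &-\tfrac{1}{2} \prod_{k=1}^{\frac{1}{2}m} \prod_{l=1}^{n} 2[z^2 \sin^2\{2k\pi/m\} + z'^2 \sin^2\{2l\pi/n\}]^{1/2} \\ &+ \tfrac{1}{2} \prod_{k=1}^{\frac{1}{2}m} \prod_{l=1}^{n} 2[z^2 \sin^2\{2k\pi/m\} + z'^2 \sin^2\{(2l - 1)\pi/n\}]^{1/2} \\ &+ \tfrac{1}{2} \prod_{k=1}^{\frac{1}{2}m} \prod_{l=1}^{n} 2[z^2 \sin^2\{(2k - 1)\pi/m\} + z'^2 \sin^2\{2l\pi/n\}]^{1/2} \\ &+ \tfrac{1}{2} \prod_{k=1}^{\frac{1}{2}m} \prod_{l=1}^{n} 2[z^2 \sin^2\{(2k - 1)\pi/m\} + z'^2 \sin^2\{(2l - 1)\pi/n\}]^{1/2}. \end{aligned} \tag{25}$$
The first term of the right-hand member is easily seen to be equal to zero.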
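Equation (25) can be evaluated numerically in the same way as (13). The sketch below is an illustration under the assumption that (25) has the four-term form written above; the function name `torus_generating_function` and the inner helper `term` are hypothetical. In floating point the first term is not exactly zero but of the order of the rounding error.

```python
import math


def torus_generating_function(m, n, z=1.0, zp=1.0):
    """Evaluate the four products of eq. (25) for even m."""
    if m % 2:
        raise ValueError("m must be even")

    def term(shift_k, shift_l):
        # shift 0 gives sin(2k*pi/m), shift 1 gives sin((2k-1)*pi/m)
        product = 1.0
        for k in range(1, m // 2 + 1):
            sk = math.sin((2 * k - shift_k) * math.pi / m)
            for l in range(1, n + 1):
                sl = math.sin((2 * l - shift_l) * math.pi / n)
                product *= 2.0 * math.sqrt(z * z * sk * sk + zp * zp * sl * sl)
        return product

    return 0.5 * (-term(0, 0) + term(0, 1) + term(1, 0) + term(1, 1))


print(round(torus_generating_function(4, 4)))
```

For the $4 \times 4$ torus the three non-vanishing terms are 72, 72 and 128, giving 272 dimer arrangements.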
If desired, this equation can be put into a form analogous to (15) with the aid of the following identities, valid for even $m$ and non-negative values of $u$:
$$\begin{aligned} \prod_{k=1}^{\frac{1}{2}m} 2[u^2 + \sin^2\{2k\pi/m\}]^{1/2} &\equiv [u + (1 + u^2)^{1/2}]^{\frac{1}{2}m} - [-u + (1 + u^2)^{1/2}]^{\frac{1}{2}m}, \\ \prod_{k=1}^{\frac{1}{2}m} 2[u^2 + \sin^2\{(2k - 1)\pi/m\}]^{1/2} &\equiv [u + (1 + u^2)^{1/2}]^{\frac{1}{2}m} + [-u + (1 + u^2)^{1/2}]^{\frac{1}{2}m}. \end{aligned} \tag{26}$$
For $m \to \infty$, i.e. for infinite cylindrical strips, the second and fourth term of eq. (25) can be shown to be dominant and equal, and we find
$$Z^{(t)}_{n}(z, z') = z^{\frac{1}{2}n} \prod_{l=1}^{[\frac{1}{2}n]} \left[\zeta \sin\frac{(2l - 1)\pi}{n} + \left(1 + \zeta^2 \sin^2\frac{(2l - 1)\pi}{n}\right)^{1/2}\right], \tag{27}$$
which is to be compared with eq. (16), valid for planar strips. In the limit $n \to \infty$ we finally obtain eq. (17) again. The values for the molecular freedom $\varphi_2$ in cylindrical strips of widths up to 8 calculated by Fowler and Rushbrooke 3) can be found as special cases from eq. (27). In table II we list the various values of $\varphi_2$ (exact and approximate) calculated for strips and for the infinite lattice.

TABLE II
The molecular freedom $\varphi_2$ for planar and toroidal quadratic $\infty \times n$ lattices

  n     planar lattice   toroidal lattice   method
  1     1.000                               Fowler and Rushbrooke 3) and this paper
  2     1.618            2.414              (in those cases where ref. 3 gave no or
  3     1.551            1.686              less accurate results, those of the
  4     1.685            1.932              present method have been recorded)
  5     1.658            1.754
  6     1.716            1.849
  7     1.701            1.772
  8     1.732            1.823
  ∞     1.471                               Flory 17)                   (approximate)
  ∞     1.687                               Chang 18)                   (approximate)
  ∞     1.736                               Orr 4)                      (approximate)
  ∞     1.63                                Miller 19)                  (approximate)
  ∞     1.8                                 Fowler and Rushbrooke 3)
  ∞     1.791 623 ...                       this paper
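The strip entries of table II follow from (16) and (27) with $z = z' = 1$ and $\varphi_2 = Z_n^{2/n}$. A minimal sketch, assuming those two formulas as written above (the function name `strip_molecular_freedom` and its arguments are hypothetical):

```python
import math


def strip_molecular_freedom(n, toroidal=False, zeta=1.0):
    """phi_2 = Z_n(1, zeta)**(2/n) from eq. (16) (planar) or eq. (27) (toroidal)."""
    product = 1.0
    for l in range(1, n // 2 + 1):
        if toroidal:
            t = zeta * math.sin((2 * l - 1) * math.pi / n)
        else:
            t = zeta * math.cos(l * math.pi / (n + 1))
        product *= t + math.sqrt(1.0 + t * t)
    return product ** (2.0 / n)


for n in range(1, 9):
    planar = strip_molecular_freedom(n)
    row = f"{n}  {planar:.3f}"
    if n > 1:
        row += f"  {strip_molecular_freedom(n, toroidal=True):.3f}"
    print(row)
```

For $n = 2$ the planar value is the golden ratio $(1 + \sqrt{5})/2$ and the toroidal value is $1 + \sqrt{2}$, in agreement with the entries 1.618 and 2.414.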
§ 5. Alternative approaches. We saw in § 2 that the number of dimer configurations on a lattice is equal to the number of alternating polygons (defined with respect to a standard configuration) on that lattice. The strong analogy with the Ising problem suggests that a solution of the present problem is possible which follows more closely the method of Kac and Ward referred to above. This can indeed be developed, and again one obtains the configuration generating function Zmn as the square root of a determinant.
We shall not go into details but instead mention another closely related method for calculating $Z_{mn}$. It follows from the considerations of § 2 that there is a one-to-one correspondence between alternating polygons on $Q_{mn}$ and closed paths on a corresponding oriented lattice $Q^{\rightarrow}_{mn}$, sketched in fig. 3. In this lattice any bond may be traversed in only one direction: the $C_0$-bonds in the direction of increasing (decreasing) $i$ for odd (even) values of $j$, the non-$C_0$-bonds in the direction from the "head" of a $C_0$-bond to the "tail" of another $C_0$-bond.

Fig. 3. The oriented lattice.

We now form an $mn \times mn$ matrix $d$ whose rows and columns correspond to the sites of $Q^{\rightarrow}_{mn}$. We define, for all combinations of indices which represent lattice points,
$$\begin{aligned} d(i, j; i, j) &= 1, \\ d(i, j; i + 1, j) &= y && \text{for } j \text{ odd}, \\ d(i, j; i - 1, j) &= y && \text{for } j \text{ even}, \\ d(i, j; i, j + 1) = d(i, j; i, j - 1) &= +y' && \text{for } i \text{ even},\ j \text{ odd}, \\ d(i, j; i, j + 1) = d(i, j; i, j - 1) &= -y' && \text{for } i \text{ odd},\ j \text{ even}, \\ d(i, j; i', j') &= 0 && \text{otherwise}, \end{aligned} \tag{28}$$
i.e. we attach weight factors $y$ and $\pm y'$ to horizontal and vertical oriented bonds, respectively, and a factor 1 to each lattice site on its own. Thus in $\det d$ each term will correspond to a configuration of closed paths on $Q^{\rightarrow}_{mn}$ (cf. ref. 9). A term representing a permutation consisting of $\nu$ cycles (each permuting an even number of points) occurs with a factor $(-1)^{\nu}$; since in a determinant, as contrasted with a Pfaffian, there is no restriction on the order of the indices of the elements, the difficulty mentioned in § 2 does not arise here. The argument of § 2 shows again that the difference in sign between permutations consisting of odd and even numbers of cycles is compensated for by the negative signs attributed to the vertical bonds between sites with odd values of $i$. So $\det d$ is just the path generating
function $H_{mn}(y, y')$ for the lattice $Q^{\rightarrow}_{mn}$:
$$\det d = \sum_{M} \sum_{M'} h(M, M')\, y^{M} y'^{M'} = H_{mn}(y, y'),$$
$h(M, M')$ being the number of ways of combining $M$ horizontal and $M'$ vertical steps to closed paths. According to § 2 such a combination may be considered as representing a configuration of $M'$ vertical dimers, and hence $\frac{1}{2}mn - M'$ horizontal dimers. It follows that
$$Z_{mn}(z, z') = \sum_{M'} g(\tfrac{1}{2}mn - M', M')\, z^{\frac{1}{2}mn - M'} z'^{M'} = z^{\frac{1}{2}mn} \sum_{M'} \sum_{M} h(M, M')\, \zeta^{M'} = z^{\frac{1}{2}mn} \det(d)\big|_{y = 1,\ y' = \zeta}. \tag{29}$$
This is confirmed by an evaluation of $\det d$. For an infinite lattice e.g. one finds
$$H(y, y') = \lim_{m,n \to \infty} \{H_{mn}(y, y')\}^{1/mn} = \exp\left\{\pi^{-2} \int_0^{\pi/2} d\omega \int_0^{\pi/2} d\omega' \ln[(1 - y^2)^2 + 4y^2 \cos^2\omega + 4y'^2 \cos^2\omega']\right\}; \tag{30}$$
multiplication by $z^{1/2}$ and substitution of $y = 1$, $y' = \zeta$ immediately lead to (17). This result is noteworthy in that $Z_{mn}$ itself is expressed as a determinant rather than its square. The origin of this possibility lies in the fact that the quadratic lattice can, in the well-known way, be divided into two sublattices, that of the "odd" and that of the "even" sites (or, in terms of graph theory, that it is "dichromatic" 8)); this ensures the possibility of working with the oriented lattice $Q^{\rightarrow}_{mn}$. The algebraic root lies in the possibility of writing certain Pfaffians as a determinant (cf. Muir 14), vol. IV p. 263). For the triangular lattice, on the other hand, the method of this section cannot be used, whereas that of § 2 still works, as we hope to show in the envisaged sequel to this paper.
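The determinant method of this section can be tried on small lattices. The sketch below is an illustration only and rests on an assumption: it takes the elements of $d$ to be those of (28) in the reading "horizontal weight $y$ along increasing $i$ for odd $j$ and decreasing $i$ for even $j$; vertical weight $+y'$ from sites with $i$ even, $j$ odd and $-y'$ from sites with $i$ odd, $j$ even". The names `oriented_matrix` and `determinant` are hypothetical; the determinant is computed exactly with rational arithmetic.

```python
from fractions import Fraction


def oriented_matrix(m, n, y=1, yp=1):
    """Matrix d of eq. (28) on the m x n lattice (m even), 0-based site index."""
    size = m * n
    d = [[0] * size for _ in range(size)]

    def site(i, j):
        return (j - 1) * m + i - 1

    for j in range(1, n + 1):
        for i in range(1, m + 1):
            p = site(i, j)
            d[p][p] = 1
            if j % 2 == 1 and i < m:
                d[p][site(i + 1, j)] = y
            if j % 2 == 0 and i > 1:
                d[p][site(i - 1, j)] = y
            if i % 2 == 0 and j % 2 == 1:
                w = yp
            elif i % 2 == 1 and j % 2 == 0:
                w = -yp
            else:
                w = 0
            if w:
                if j < n:
                    d[p][site(i, j + 1)] = w
                if j > 1:
                    d[p][site(i, j - 1)] = w
    return d


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    size = len(a)
    det = Fraction(1)
    for c in range(size):
        pivot = next((r for r in range(c, size) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, size):
            factor = a[r][c] / a[c][c]
            if factor:
                for k in range(c, size):
                    a[r][k] -= factor * a[c][k]
    return det


# with y = y' = 1 (and z = 1) eq. (29) gives the number of dimer arrangements
for m, n in [(2, 2), (2, 3), (4, 2), (4, 4)]:
    print(m, n, determinant(oriented_matrix(m, n)))
```

For the $2 \times 2$ lattice the matrix has a single closed path and $\det d = 1 + y^2 y'^2$, so that (29) gives $Z_{22} = z^2 + z'^2$, in agreement with (13).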
§ 6. The entropy of a system of dimers on a rectangular lattice. Consider a planar rectangular $m \times n$ lattice, i.e. a lattice whose horizontal and vertical bonds differ in length. Let the lattice be covered entirely by two sorts of dimers: $N_2$ dimers which fit only into horizontal positions (i.e. which can occupy two sites $(i, j)$ and $(i + 1, j)$), and $N_2'$ dimers fitting only into vertical positions. If the energy of mixing of these dimers is zero, the configurational free energy of the mixture is completely determined by the entropy of mixing, i.e. by the combinatorial factor $g(N_2, N_2')$. This quantity can be calculated from the configuration generating function with the aid of Cauchy's formula:
$$g(N_2, N_2') = \frac{1}{2\pi i} \oint d\zeta\, \zeta^{-N_2' - 1} Z_{mn}(1, \zeta), \tag{31}$$
where the path of integration encircles the origin but excludes the singularities of $Z_{mn}(1, \zeta)$. We shall introduce $x = N_2/\frac{1}{2}mn$ and $x' = N_2'/\frac{1}{2}mn = 1 - x$. For large $m$ and $n$ we can evaluate (31) by the saddle-point method. We find
$$\lim_{m,n \to \infty} \{g(\tfrac{1}{2}mnx, \tfrac{1}{2}mnx')\}^{1/mn} = y(x), \tag{32}$$
where $y(x)$ is given by the following two equivalent expressions:
$$\ln y(x) = \pi^{-1} A_2(\tan \tfrac{1}{2}\pi x) - \tfrac{1}{2}x \ln(\tan \tfrac{1}{2}\pi x) = \pi^{-1} A_2(\tan \tfrac{1}{2}\pi x') - \tfrac{1}{2}x' \ln(\tan \tfrac{1}{2}\pi x'). \tag{33}$$
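Expression (33) is easy to evaluate numerically once $A_2$ is written, as in (18) and (19), as $A_2(t) = \int_0^t dx\, x^{-1} \arctan x$. The sketch below is an illustration only: the function names `a2`, `ln_y`, `reduced_entropy` and `random_mixing_entropy` are hypothetical, and the integral is approximated with a composite Simpson rule.

```python
import math


def a2(t, steps=2000):
    """A_2(t) = integral of arctan(s)/s from 0 to t (composite Simpson, even steps)."""
    if t == 0.0:
        return 0.0

    def integrand(s):
        # the integrand tends to 1 as s tends to 0
        return 1.0 if s == 0.0 else math.atan(s) / s

    h = t / steps
    total = integrand(0.0) + integrand(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


def ln_y(x):
    """First form of eq. (33), for 0 < x < 1."""
    t = math.tan(0.5 * math.pi * x)
    return a2(t) / math.pi - 0.5 * x * math.log(t)


def reduced_entropy(x):
    """Reduced entropy per dimer, 2 ln y(x), of the interlocking mixture."""
    return 2.0 * ln_y(x)


def random_mixing_entropy(x):
    """Entropy of mixing of the random mixture: -x ln x - x' ln x'."""
    return -x * math.log(x) - (1.0 - x) * math.log(1.0 - x)


for x in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(x, round(reduced_entropy(x), 4), round(random_mixing_entropy(x), 4))
```

At $x = \frac{1}{2}$ the first function reduces to $2G/\pi \approx 0.583$, compared with $\ln 2 \approx 0.693$ for the random mixture, and the symmetry $x \leftrightarrow x'$ of (33) shows up as equal values at $x$ and $1 - x$.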
In fig. 4 the reduced entropy per dimer of this "interlocking mixture", $\sigma = S/\frac{1}{2}mnk = 2 \ln y(x)$, which corresponds to Flory's "entropy of disorientation" 17), is plotted against $x$. For comparison we have also plotted the entropy of mixing for an "ideal" or "random mixture", i.e. of a system where to each single lattice site a horizontal dimer (available fraction: $x$) or a vertical dimer (fraction: $x' = 1 - x$) is attached in a random manner, without paying attention to possible hindrances. This quantity is equal to $-x \ln x - x' \ln x'$. The difference between the two entropies is a measure of what might be called the order of interlocking.

Fig. 4. The reduced entropy per dimer of a mixture of horizontal and vertical dimers as a function of $x$ (fraction of horizontal dimers). Dashed curve: random mixture; full curve: interlocking mixture.

§ 7. Concluding remarks. We have seen that the generating function $Z_{mn}(z, z')$ for dimer configurations on a planar quadratic lattice can be written in the form of a Pfaffian. The corresponding skew-symmetric matrix could by a similarity transformation be brought into a nearly diagonal form, and its determinant, which is the square of the Pfaffian, evaluated. The asymptotic behaviour of $Z_{mn}(z, z')$ for large lattices was found to be described by equation (17), which can also be written as
$$Z(z, z') = \exp\left\{(2\pi)^{-2} \int_0^{\pi} d\omega \int_0^{\pi} d\omega' \ln 2[z^2 + z'^2 + z^2 \cos\omega + z'^2 \cos\omega']\right\}. \tag{34}$$
The same result was found when periodic boundary conditions were introduced. The effect of boundary conditions is, however, not entirely trivial and will be discussed in more detail in a subsequent paper. The right-hand member of (34) has a remarkable resemblance to Onsager's expression for the partition function per spin of a rectangular Ising system 9)10) *). A more detailed examination reveals that Z(z, z') as a function of C = z'/z has no singular points on the real positive axis; it corresponds to Onsager's partition function at the critical point (or critical line, if the strengths of the horizontal and vertical interactions vary with respect to each other). This fact might tempt one to conjecture that the more general problem of monomer-dimer mixtures would be the analogue of the Ising problem at arbitrary temperatures, and hence rigorously solvable. However, this is not the case. It is easy to see that the true analogue of the partition function of an Ising system is the generating function Hmn(y, y') = = detd for closed paths on the lattice Q~n; for an infinite lattice, the function H(y, y') given by (30) has a singularity at y = 1, which is just the value of interest for the dimer problem.
Fig. 5. The construction of a monomer-dimer configuration from the standard configuration by (a) the omission of bonds and (b) the shift of dimers along chains of bonds.
A monomer-dimer mixture, on the other hand, has more resemblance to an Ising ferromagnet in an external field 9): the various configurations can be derived from the standard configuration C_0 by the omission of a number of bonds (fig. 5a) followed by the shift of dimers along certain chains of bonds, which need no longer be closed (fig. 5b). The contribution to the combinatorial factor from open chains increases with the ratio of the activity of two monomers to that of a dimer, whereas in the Ising case it increases with the ratio of the activity of + spins to that of - spins, i.e. with the magnetic field. Since this general Ising problem has as yet resisted all attempts at a rigorous solution, one suspects that the monomer-dimer problem will also be very hard to solve. On the other hand, a better insight into the latter problem might throw some new light on the former.

*) In this connection it may be remarked that the algebraic introduction which Hurst and Green 13) need for re-deriving Onsager's results by the Pfaffian method can be avoided by deriving them along the lines of the present paper.
THE STATISTICS OF DIMERS ON A LATTICE I
Note added in proof. After the submission of this paper the author received preprints of a short communication by H. N. V. Temperley and M. E. Fisher (to be published in Phil. Mag.) and of an article by M. E. Fisher (to be published in Phys. Rev.), both on the statistics of dimers on a quadratic lattice. Following the lines used by Hurst and Green 13) in the discussion of the Ising problem, the authors obtain results identical to those of the present paper. They discuss in more detail the asymptotic behaviour of these results for large lattices, and in addition make some remarks on monomer-dimer mixtures. On the other hand, they restrict themselves to planar quadratic lattices, and their method, although formally equivalent to that developed above, seems less suited to generalization to other two-dimensional lattices. We hope to comment upon these papers in more detail later on.

Received 28-6-61

REFERENCES
1) Guggenheim, E. A., Mixtures, Clarendon Press, Oxford (1952) Chapter X.
2) Cohen, E. G. D., De Boer, J. and Salsburg, Z. W., Physica 21 (1955) 137.
3) Fowler, R. H. and Rushbrooke, G. S., Trans. Faraday Soc. 33 (1937) 1272.
4) Orr, W. J. C., Trans. Faraday Soc. 40 (1944) 306.
5) Rushbrooke, G. S., Scoins, H. I. and Wakefield, A. J., Discussions Faraday Soc. 15 (1953) 57.
6) Green, H. S. and Leipnik, R., Rev. mod. Phys. 32 (1960) 129.
7) Fisher, M. E. and Temperley, H. N. V., Rev. mod. Phys. 32 (1960) 1029. Katsura, S. and Inawashiro, S., Rev. mod. Phys. 32 (1960) 1031.
8) Berge, C., Théorie des graphes et ses applications, Dunod, Paris (1958) 175, 30.
9) Newell, G. F. and Montroll, E. W., Rev. mod. Phys. 25 (1953) 352. Domb, C., Adv. in Phys. 9 (1960) 149, in particular § 3.
10) Onsager, L., Phys. Rev. 65 (1944) 117.
11) Kac, M. and Ward, J. C., Phys. Rev. 88 (1952) 1332.
12) Sherman, S., J. math. Phys. 1 (1960) 202.
13) Hurst, C. A. and Green, H. S., J. chem. Phys. 33 (1960) 1059.
14) Muir, T., Contributions to the History of Determinants, London (1930). Scott, R. F. and Mathews, G. B., Theory of Determinants, Cambridge University Press, New York (1904) 93.
15) Caianiello, E. R. and Fubini, S., Nuovo Cimento 9 (1952) 1218.
16) Gröbner, W. and Hofreiter, N., Integraltafel II, Springer Verlag, Wien & Innsbruck (1950) 72.
17) Flory, P. J., J. chem. Phys. 10 (1942) 51.
18) Chang, T. S., Proc. roy. Soc., London, A 169 (1939) 512.
19) Miller, A. R., Proc. Camb. phil. Soc. 38 (1942) 109.
20) Potts, R. B. and Ward, J. C., Progr. theor. Phys. 13 (1955) 38.
Errata to "The statistics of dimers on a lattice. I. The number of dimer arrangements on a quadratic lattice" by P. W. Kasteleyn, Physica 27 (1961) 1209-1225.

P. 1220, line 6: after "dominant and equal" a footnote sign should be added, referring to the following footnote, to be placed at the bottom of the page: provided n is even. For odd n the second term vanishes while the third and fourth terms are equal, and a factor 2 should be inserted into the right-hand side of eq. (27).

P. 1222, eq. (30): the last term in the right-hand side should read 4y^2 y'^2 cos^2 ω' and not 4y'^2 cos^2 ω'.
LONGEST INCREASING AND DECREASING SUBSEQUENCES
C. SCHENSTED

This paper deals with finite sequences of integers. Typical of the problems we shall treat is the determination of the number of sequences of length n, consisting of the integers 1, 2, ..., m, which have a longest increasing subsequence of length α. Throughout the first part of the paper we will deal only with sequences in which no numbers are repeated. In the second part we will extend the results to include the possibility of repetition. Our results will be stated in terms of standard Young tableaux.

PART I

Definition. A standard Young tableau of order n is an arrangement of n distinct natural numbers in rows and columns so that the numbers in each row and in each column form increasing sequences, and so that there is an element of each row (column) in the first column (row) and there are no gaps between numbers.

Example.

    2 4 7
    3 8        (order = 7)
    5 9

Definition. The shape of a standard tableau is an arrangement of squares with one square replacing each number in the standard tableau.

Example. The shape of

    2 4 7
    3 8
    5 9

is as shown in Figure 1.

FIG. 1.
Received June 23, 1959; in revised form August 29, 1960. This work was conducted by Project MICHIGAN under Department of the Army Contract (DA-36-069-SC-78801), administered by the U.S. Army Signal Corps. The author would like to thank W. Richardson, G. Rabson, T. Curtz, I. Schensted, R. Thrall, and J. Riordan for illuminating discussions concerning this problem, and E. Graves for calculations which contributed to the solution. The problem originated as one aspect of a paper on sorting theory by R. Baer and P. Brock, Natural sorting, The University of Michigan, Willow Run Laboratories, Project MICHIGAN Report 2144-278-T, submitted for publication in Soc. Ind. App. Math.
One reason that standard tableaux are so useful to us is that it is easy to compute the number of standard tableaux of a given shape either by means of a simple recurrence relation, or by means of the following elegant result of Frame, Robinson, and Thrall (1).

THEOREM. The number of standard tableaux of a given shape containing the integers 1, 2, ..., n is

$$\frac{n!}{\prod_{j=1}^{n} h_j}. \tag{1}$$

Here the h_j are the hook lengths, that is, the number of elements counting from the bottom of a column to a given element and then to the right end of the row.
Example. To compute the number of standard tableaux of the shape shown in Figure 2(a), we first find the hook lengths, which are shown in Figure 2(b).

FIG. 2(a). FIG. 2(b).

Then we find that the number of standard tableaux of this shape is

$$\frac{9!}{6\cdot 5\cdot 3\cdot 1\cdot 4\cdot 3\cdot 1\cdot 2\cdot 1} = 168.$$
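The hook-length computation can be sketched in a few lines of Python. The representation of a shape as a tuple of row lengths, and the function names, are assumptions made for this illustration; the shape (4, 3, 2) is the one whose hook lengths 6, 5, 3, 1, 4, 3, 1, 2, 1 appear in the example.

```python
from math import factorial

def hook_lengths(shape):
    """Hook length of every square of a shape given as weakly decreasing row lengths."""
    if not shape:
        return []
    # cols[j] is the number of squares in column j
    cols = [sum(1 for r in shape if r > j) for j in range(shape[0])]
    return [[(shape[i] - j - 1) + (cols[j] - i - 1) + 1 for j in range(shape[i])]
            for i in range(len(shape))]

def count_standard_tableaux(shape):
    """Expression (1): n! divided by the product of the hook lengths."""
    product = 1
    for row in hook_lengths(shape):
        for h in row:
            product *= h
    return factorial(sum(shape)) // product

print(hook_lengths((4, 3, 2)))
print(count_standard_tableaux((4, 3, 2)))
```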
Definition. S ← x is defined as the array obtained from the standard tableau, S, by means of the following steps: (i) Insert x in the first row of S either by displacing the smallest number which is larger than x, or if no number is larger than x, by adding x at the end of the first row. (ii) If x displaced a number from the first row, then insert this number in the second row either by displacing the smallest number which is larger than it or by adding it at the end of the second row. (iii) Repeat this process row by row until some number is added at the end of a row. In the above steps "adding at the end of the row" is interpreted as putting in the first column in the given row if the row does not yet have any entries in it. We define x → S similarly except that we replace the word "row" by the word "column" throughout.
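Steps (i) to (iii) can be sketched as follows. The list-of-rows representation and the names `row_insert`, `column_insert` and `transpose` are assumptions of this illustration; column insertion is obtained here by transposing, inserting by rows, and transposing back, which is just the definition with "row" replaced by "column".

```python
from bisect import bisect_right

def row_insert(tableau, x):
    """Return S <- x for a tableau S given as a list of rows of distinct numbers."""
    rows = [list(r) for r in tableau]
    for row in rows:
        k = bisect_right(row, x)      # index of the smallest entry larger than x
        if k == len(row):             # nothing larger: add x at the end of the row
            row.append(x)
            return rows
        row[k], x = x, row[k]         # displace that entry and carry it down
    rows.append([x])                  # start a new row in the first column
    return rows

def transpose(tableau):
    """Interchange rows and columns."""
    if not tableau:
        return []
    return [[row[j] for row in tableau if len(row) > j]
            for j in range(len(tableau[0]))]

def column_insert(x, tableau):
    """Return x -> S, the same procedure carried out on columns."""
    return transpose(row_insert(transpose(tableau), x))

S = [[2, 4, 7], [3, 8], [5, 9]]
print(row_insert(S, 6))
print(column_insert(6, S))
```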
Example. If

    S = 2 4 7
        3 8
        5 9

then

    S ← 6 = 2 4 6
            3 7
            5 8
            9

and

    6 → S = 2 4 7
            3 8
            5 9
            6

LEMMA 1. S ← x and x → S are standard tableaux.
Proof. Since the proofs for S ← x and x → S are similar we consider only S ← x. First we note that if two consecutive rows of S have the same length, and if a number is displaced from the first of these two rows, then it will either displace the number which was standing under it or else some number to its left, and thus will not be added at the end of the row. Thus a row cannot be made longer than the row above it and S ← x cannot fail to be a standard tableau on account of its shape. Thus we have only to prove that the numbers in each row and column still form increasing sequences. A number is inserted into a row in such a place that the number to its left (if any) is smaller, and the number to its right (if any) is larger. Thus the numbers in each ...
Definition. The P-symbol corresponding to a sequence of distinct integers x_1 x_2 ... x_n is the standard tableau (... ((x_1 ← x_2) ← x_3) ... ← x_n). The Q-symbol corresponding to the same sequence is the array which is obtained by putting k in the square which is added to the shape of the P-symbol when x_k is inserted in the P-symbol.
Examples. (The rows of each tableau are separated by slashes.)

    Sequence         P-symbol             Q-symbol
    3                3                    1
    3 5              3 5                  1 2
    3 5 4            3 4 / 5              1 2 / 3
    3 5 4 9          3 4 9 / 5            1 2 4 / 3
    3 5 4 9 8        3 4 8 / 5 9          1 2 4 / 3 5
    3 5 4 9 8 2      2 4 8 / 3 9 / 5      1 2 4 / 3 5 / 6
    3 5 4 9 8 2 7    2 4 7 / 3 8 / 5 9    1 2 4 / 3 5 / 6 7

LEMMA 2. The Q-symbol corresponding to an arbitrary sequence is a standard tableau.

Proof. Since the Q-symbol has the same shape as the P-symbol, and since the P-symbol is a standard tableau, the shape of the Q-symbol is legitimate. Each digit added to the Q-symbol is larger than all of the previous digits, and in particular is larger than the digits above it and to its left. Hence the numbers in each row and column form increasing sequences, and the lemma is established.

LEMMA 3. There is a one-to-one correspondence between sequences made with the n distinct integers x_1, x_2, ..., x_n and ordered pairs of standard tableaux of the same shape, the first containing x_1, x_2, ..., x_n and the second containing 1, 2, ..., n.
Proof. Given a sequence, the P-symbol and Q-symbol are uniquely determined standard tableaux of the type mentioned in the lemma. Given a pair of standard tableaux of the appropriate types we can find the unique sequence which could have them for a P-symbol and Q-symbol as follows: The position of the largest number in the second tells us which number was added on to a row of the first without displacing another number when the last digit was inserted. This must have been displaced from the previous row by the largest number which is smaller than it (there always will be at least one number smaller than it in the preceding row since the one directly above it is smaller). This in turn must have been displaced from the next row up. Finally we get to the first row and discover what number was inserted into it. This is the last digit of the sequence. We now also know what the P-symbol and Q-symbol were before the last digit was inserted. Thus we can repeat the procedure to find the next to the last digit of the sequence. This proves the lemma.

Note. Since there are n! possible sequences of x_1, x_2, ..., x_n, Lemma 3 shows that there are n! ordered pairs of standard tableaux of order n such that the shapes of tableaux in each pair are the same, but the shapes of tableaux in different pairs are not necessarily the same. This fact is already known (2). Of course, the number of ordered pairs of standard tableaux of a given shape is equal to the square of the number of standard tableaux of that shape, which is given in turn by Expression (1).
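The correspondence of Lemma 3 can be sketched in both directions. The function names `pq_symbols` and `sequence_from` and the list-of-rows representation are assumptions of this illustration; the inverse follows the procedure described in the proof, undoing one insertion at a time starting from the square that holds the largest number of the Q-symbol.

```python
from bisect import bisect_left, bisect_right

def pq_symbols(seq):
    """Build the P-symbol and Q-symbol of a sequence of distinct integers."""
    P, Q = [], []
    for k, x in enumerate(seq, start=1):
        r = 0
        while True:
            if r == len(P):               # start a new row
                P.append([x])
                Q.append([k])
                break
            row = P[r]
            i = bisect_right(row, x)      # smallest entry larger than x
            if i == len(row):             # add at the end of this row
                row.append(x)
                Q[r].append(k)
                break
            row[i], x = x, row[i]         # displace and continue one row down
            r += 1
    return P, Q

def sequence_from(P, Q):
    """Recover the sequence from a pair of standard tableaux of the same shape."""
    P = [list(r) for r in P]
    Q = [list(r) for r in Q]
    out = []
    for k in range(sum(len(r) for r in Q), 0, -1):
        r = next(i for i, row in enumerate(Q) if row and row[-1] == k)
        Q[r].pop()
        x = P[r].pop()                    # number added without displacing another
        if not P[r]:                      # an emptied row is necessarily the last row
            P.pop()
            Q.pop()
        for up in range(r - 1, -1, -1):   # undo the displacements row by row
            row = P[up]
            i = bisect_left(row, x) - 1   # largest entry smaller than x
            row[i], x = x, row[i]
        out.append(x)
    return out[::-1]

P, Q = pq_symbols([3, 5, 4, 9, 8, 2, 7])
print(P, Q)
print(sequence_from(P, Q))
```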
Definition. The jth basic subsequence of a given sequence consists of the digits which are inserted into the jth place in the first row of the P-symbol.

LEMMA 4. Each basic subsequence is a decreasing subsequence.

Proof. Each number in the jth basic subsequence, on insertion in the first row, displaces the previous member of the jth basic subsequence, which must therefore be larger than the present member.

LEMMA 5. Given any member of the jth basic subsequence, we can find a member of the (j - 1)st basic subsequence which is smaller and which occurs further to the left in the given sequence.

Proof. The number in the (j - 1)st place in the first row, when the given member of the jth basic subsequence is inserted, is such a member of the (j - 1)st basic subsequence.

THEOREM 1. The number of columns in the P-symbol (or the Q-symbol) is equal to the length of the longest increasing subsequence of the corresponding sequence.

Proof. The number of columns is the same as the number of basic subsequences. By Lemma 4 there can be at most one member of each basic subsequence in any increasing subsequence. By Lemma 5 we can construct an increasing subsequence with one element from each basic subsequence, Q.E.D.

Note. The proof shows us how to actually obtain an increasing subsequence of maximal length.
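The construction in the proof of Theorem 1 can be sketched by tracking only the first row of the P-symbol. The names `longest_increasing` and `lis_length_dp` are assumptions of this illustration. For each inserted number the sketch records the position of the entry then standing in the place to its left (the member of the previous basic subsequence supplied by Lemma 5), and afterwards walks back from the last place of the first row.

```python
from bisect import bisect_right

def longest_increasing(seq):
    """Return one longest increasing subsequence of a sequence of distinct integers."""
    first_row = []   # current first row of the P-symbol
    first_pos = []   # position in seq of each entry of the first row
    pred = {}        # pred[pos]: position of the left neighbour at insertion time
    for pos, x in enumerate(seq):
        j = bisect_right(first_row, x)           # x joins the (j + 1)st basic subsequence
        pred[pos] = first_pos[j - 1] if j > 0 else None
        if j == len(first_row):
            first_row.append(x)
            first_pos.append(pos)
        else:
            first_row[j] = x
            first_pos[j] = pos
    out = []
    p = first_pos[-1] if first_pos else None
    while p is not None:                         # one element from each basic subsequence
        out.append(seq[p])
        p = pred[p]
    return out[::-1]

def lis_length_dp(seq):
    """Quadratic-time reference computation of the longest increasing length."""
    best = []
    for k, x in enumerate(seq):
        best.append(1 + max((best[i] for i in range(k) if seq[i] < x), default=0))
    return max(best, default=0)

print(longest_increasing([3, 5, 4, 9, 8, 2, 7]))
```

For the example sequence 3 5 4 9 8 2 7 the result has length three, the number of columns of its P-symbol.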
LEMMA 6. (x → S) ← y = x → (S ← y).

Proof. Suppose first that, of all the digits in x, y, and S, the largest is y. We represent S schematically by Figure 3. There are two cases of interest.

FIG. 3.
The square added to the shape of S in x → S is in the first row, or it is not. We represent x → S schematically in these two cases by Figure 4(a) and 4(b) respectively, where x' is the number added to the end of some column without displacing another number when we form x → S. It is easily verified that in the first case the final result is as shown in Figure 5(a) and in the second case the result is that of Figure 5(b).

FIG. 4(a). FIG. 4(b).

FIG. 5(a). (x → S) ← y = x → (S ← y) in the first case.

FIG. 5(b). (x → S) ← y = x → (S ← y) in the second case.
This proves the lemma if y is the largest number involved, and the proof is similar if x is the largest number involved. Suppose now that, of all the digits in x, y, and S, the largest is N, and that N is in S. In this case we use induction. The lemma can be easily verified by direct calculation if S is of order 0, 1, or 2. We assume the lemma true for S of order n, and prove that it is then true for S of order n + 1. Let us suppose, then, that S is of order n + 1. Now, since N is the largest number in S, we see that N is at the end of whatever row it is in, and also at the end of its column. Thus, if we remove N from S we will obtain a new standard tableau, S', of order n. Now since N is larger than any of the other numbers, it can never displace any of them, and hence the presence or absence of N cannot have any influence on the position of the other numbers. Thus (x → S) ← y will be the same as (x → S') ← y except that N is added somewhere, and x → (S ← y) will be the same as x → (S' ← y) except for the addition of N. However, since S' is of order n, we have by assumption

    (x → S') ← y = x → (S' ← y).
Thus we have only to prove that N occupies the same position in (x → S) ← y and x → (S ← y) to prove the lemma. The truth of this can be easily verified for each of the possible cases which can arise as to the relative locations of N, x', and y'. Here x' (y') is the number which is added to some column (row) without displacing another number when we form x → S' (S' ← y). In making these verifications it is necessary to keep the following facts in mind. If x' and y' do not fall into the same square, then we represent S', x → S', and S' ← y schematically by Figure 6(a), 6(b), and 6(c) respectively. The shape of (x → S') ← y must have a square added to the shape of x → S', and the shape of x → (S' ← y) must have a square added to the shape of S' ← y. By assumption (x → S') ← y = x → (S' ← y) so that the shape of (x → S') ← y and x → (S' ← y) must be Figure 7.

FIG. 6(a). FIG. 6(b). FIG. 6(c).

FIG. 7.

If x' (in x → S') and y' (in S' ← y) occupy the same position then we schematically represent S', x → S', and S' ← y by Figure 8(a), 8(b), and 8(c) respectively. Here the shaded parts of x → S' and S' ← y are the regions where numbers could have been displaced.

FIG. 8(a). FIG. 8(b). FIG. 8(c).

Now let us suppose that y' > x'. Then when we insert y into x → S' the same numbers will be displaced in each row as were displaced when we inserted y into S', until we displace y'.
In S' ← y we would have put y' where x' is, but y' > x', thus y' will be added at the end of the row containing x', and the shape of (x → S') ← y (and hence of x → (S' ← y)) will be Figure 9. If we had had x' > y', then the shape of (x → S') ← y and x → (S' ← y) would have been Figure 10. Thus, if we know the shapes of x → S' and S' ← y, and if we know whether x' > y' or x' < y', then we know the shape of (x → S') ← y and x → (S' ← y).

FIG. 9.

FIG. 10.

Now we can return to the problem of showing that N has the same position in (x → S) ← y and x → (S ← y). As we mentioned there are several special cases. We will consider only three of these as the others go in the same way. First suppose that the position of N in S does not coincide with either the position of x' in x → S' or the position of y' in S' ← y. Then N will never be displaced and it will have the same position in (x → S) ← y and x → (S ← y) as it does in S. Next suppose that the position of N in S coincides with the position of x' in x → S', and that the position of y' in S' ← y lies to the left of this. Then we have schematically Figure 11. Finally suppose that the position of N in S coincides with the position of x' in x → S', and that the position of y' in S' ← y lies one column to the right of this. Then schematically we have Figure 12. Proceeding similarly we can verify all of the other special cases, and hence the validity of Lemma 6.

LEMMA 7. If one sequence is a second sequence written backwards, then the P-symbol of the first is obtained from the P-symbol of the second by interchanging rows and columns.
Proof. First we note that x → y = x ← y, since if x < y they are both the row x y, and if x > y they are both the column with y above x. Now we define

$$P(x_1, x_2, \ldots, x_n) \equiv (\cdots((x_1 \leftarrow x_2) \leftarrow x_3) \cdots \leftarrow x_n)$$

and

$$\bar{P}(x_1, x_2, \ldots, x_n) \equiv (x_1 \rightarrow \cdots (x_{n-2} \rightarrow (x_{n-1} \rightarrow x_n)) \cdots).$$

Next we assume that $P(x_1, \ldots, x_{n-1}) = \bar{P}(x_1, \ldots, x_{n-1})$ and that $P(x_1, \ldots, x_n) = \bar{P}(x_1, \ldots, x_n)$, and prove that $P(x_1, \ldots, x_n, x_{n+1}) = \bar{P}(x_1, \ldots, x_n, x_{n+1})$. (We have just shown that $P(x_1, x_2) = x_1 \leftarrow x_2 = x_1 \rightarrow x_2 = \bar{P}(x_1, x_2)$; furthermore $P(x_1) = x_1 = \bar{P}(x_1)$.) We have

$$\begin{aligned}
P(x_1, x_2, \ldots, x_n, x_{n+1}) &= P(x_1, x_2, \ldots, x_n) \leftarrow x_{n+1} \\
&= \bar{P}(x_1, x_2, \ldots, x_n) \leftarrow x_{n+1} \\
&= [x_1 \rightarrow \bar{P}(x_2, \ldots, x_n)] \leftarrow x_{n+1} \\
&= x_1 \rightarrow [\bar{P}(x_2, \ldots, x_n) \leftarrow x_{n+1}] \\
&= x_1 \rightarrow [P(x_2, \ldots, x_n) \leftarrow x_{n+1}] \\
&= x_1 \rightarrow P(x_2, \ldots, x_n, x_{n+1}) \\
&= x_1 \rightarrow \bar{P}(x_2, \ldots, x_n, x_{n+1}) \\
&= \bar{P}(x_1, x_2, \ldots, x_n, x_{n+1}).
\end{aligned}$$

Of these lines, the second, fifth, and seventh follow by assumption, and the fourth from Lemma 6. Now $P(x_1, \ldots, x_n)$ is the P-symbol for the sequence $x_1, x_2, \ldots, x_n$, while $\bar{P}(x_1, x_2, \ldots, x_n)$ is the P-symbol for the sequence $x_n, \ldots, x_2, x_1$ with rows and columns interchanged. Hence the lemma follows.

FIG. 11. Schematic forms of S, x → S', S' ← y, x → S, S ← y, and (x → S) ← y = x → (S ← y) in the second special case of Lemma 6.

FIG. 12. Schematic forms of x → S', S' ← y, x → S, S ← y, and (x → S) ← y in the third special case of Lemma 6.
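Lemmas 6 and 7 lend themselves to an exhaustive check on small cases. The sketch below is such a check and not a proof; the helper names are assumptions of this illustration. Every standard tableau S on a set of distinct numbers arises as the P-symbol of some sequence, so running over all permutations and splitting off the last two entries as x and y covers every triple (S, x, y) on the chosen numbers.

```python
from bisect import bisect_right
from itertools import permutations

def row_insert(tableau, x):
    """Return S <- x."""
    rows = [list(r) for r in tableau]
    for row in rows:
        k = bisect_right(row, x)
        if k == len(row):
            row.append(x)
            return rows
        row[k], x = x, row[k]
    rows.append([x])
    return rows

def transpose(tableau):
    """Interchange rows and columns."""
    if not tableau:
        return []
    return [[row[j] for row in tableau if len(row) > j]
            for j in range(len(tableau[0]))]

def column_insert(x, tableau):
    """Return x -> S."""
    return transpose(row_insert(transpose(tableau), x))

def p_symbol(seq):
    """P-symbol built by successive row insertions."""
    P = []
    for x in seq:
        P = row_insert(P, x)
    return P

def lemma6_holds(n):
    """Check (x -> S) <- y == x -> (S <- y) for all tableaux and entries from 1..n."""
    for perm in permutations(range(1, n + 1)):
        S, x, y = p_symbol(perm[:-2]), perm[-2], perm[-1]
        if row_insert(column_insert(x, S), y) != column_insert(x, row_insert(S, y)):
            return False
    return True

def lemma7_holds(n):
    """Check that reversing a sequence transposes its P-symbol."""
    return all(p_symbol(perm[::-1]) == transpose(p_symbol(perm))
               for perm in permutations(range(1, n + 1)))

print(lemma6_holds(6), lemma7_holds(6))
```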
Note. It must not be assumed that Lemma 7 holds for Q-symbols.

THEOREM 2. The number of rows in the P-symbol (or the Q-symbol) is equal to the length of the longest decreasing subsequence of the corresponding sequence.

Proof. This follows immediately from Theorem 1 and Lemma 7, since writing a sequence backwards changes increasing subsequences into decreasing subsequences.

THEOREM 3. The number of sequences consisting of the distinct numbers x_1, x_2, ..., x_n, and having a longest increasing subsequence of length α and a longest decreasing subsequence of length β, is the sum of the squares of the numbers of standard tableaux with shapes having α columns and β rows.
Proof. Follows immediately from Lemma 3 and Theorems 1 and 2 (see also the note to Lemma 3).

Example. To find the number of permutations of 1, 2, 3, ..., 25 having a longest decreasing subsequence of length three and a longest increasing subsequence of length 21 we note that the only allowed shapes with 25 squares, 21 columns, and 3 rows are those of Figure 13.

FIG. 13.

By the Frame-Robinson-Thrall theorem, the corresponding numbers of standard tableaux are 21,000 and 31,350 respectively. Thus the desired number of permutations is

    21,000^2 + 31,350^2 = 1,423,822,500.
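Theorem 3 and the figures in the example can be checked with a short sketch. All function names are assumptions of this illustration. The shapes with 25 squares, 21 columns and 3 rows have row lengths (21, 2, 2) and (21, 3, 1); the tableau-based count is also compared with a brute-force count over all permutations for small n.

```python
from itertools import permutations
from math import factorial

def partitions(n, largest=None):
    """Yield every shape with n squares as a tuple of weakly decreasing row lengths."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def count_standard_tableaux(shape):
    """Expression (1): n! divided by the product of the hook lengths."""
    cols = [sum(1 for r in shape if r > j) for j in range(shape[0])]
    product = 1
    for i, r in enumerate(shape):
        for j in range(r):
            product *= (r - j - 1) + (cols[j] - i - 1) + 1
    return factorial(sum(shape)) // product

def longest_increasing_length(seq):
    best = []
    for k, x in enumerate(seq):
        best.append(1 + max((best[i] for i in range(k) if seq[i] < x), default=0))
    return max(best, default=0)

def count_by_tableaux(n, alpha, beta):
    """Sum of squares over shapes with alpha columns and beta rows (Theorem 3)."""
    return sum(count_standard_tableaux(s) ** 2
               for s in partitions(n) if s[0] == alpha and len(s) == beta)

def count_by_brute_force(n, alpha, beta):
    """Direct count of permutations of 1..n; only feasible for small n."""
    total = 0
    for perm in permutations(range(1, n + 1)):
        if (longest_increasing_length(perm) == alpha
                and longest_increasing_length([-v for v in perm]) == beta):
            total += 1
    return total

print(count_standard_tableaux((21, 2, 2)), count_standard_tableaux((21, 3, 1)))
print(count_by_tableaux(25, 21, 3))
```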
PART II
We now want to consider sequences in which some of the numbers are repeated. We can obtain the properties of such sequences in terms of sequences without repetitions by a simple artifice. Suppose the smallest number appears p times in the sequence, the next smallest q times, etc. We replace the p occurrences of the smallest number by the numbers 1, 2, ..., p (in this order), the q occurrences of the next number by p + 1, p + 2, ..., p + q, etc. Then the decreasing subsequences of the two sequences will be in one-to-one correspondence, while the increasing subsequences of the new sequence will be in one-to-one correspondence with the non-decreasing subsequences of the original sequence.
Example. Given the sequence 3 3 2 3 4 1, we replace 1 by 1, 2 by 2, the three 3's by 4, 5, 6, and 4 by 7. The result is 4 5 2 6 7 1. The latter sequence has a decreasing subsequence 5 2 1 which corresponds to a decreasing subsequence 3 2 1 in the original and an increasing subsequence 4 5 6 7 which corresponds to a non-decreasing subsequence 3 3 3 4 in the original.

If we construct the P-symbol for the derived sequence, and map the numbers in it back to the numbers in the original sequence, then we get a modified standard tableau in which repeated numbers are allowed, the numbers in each column form an increasing sequence, and the numbers in each row form a non-decreasing sequence. Since the numbers in the Q-symbol refer to the order of addition of spaces to the P-symbol, the Q-symbols of the two sequences will be identical. We can get modified forms of each of the results in Part I. The main result, Theorem 3, now takes the form:

THEOREM 4. The number of sequences of x_1, x_2, ..., x_n having a longest non-decreasing sequence of length α and a longest decreasing sequence of length β is the sum of the products of the number of modified standard tableaux of a given shape with the number of standard tableaux of the same shape, the shapes each having α columns and β rows.
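The relabelling artifice and the resulting modified tableau can be sketched as follows; the names `standardize` and `modified_p_symbol` are assumptions of this illustration. Applied literally, the rule stated at the start of Part II sends 3 3 2 3 4 1 to 3 4 2 5 6 1. The labels 4 5 2 6 7 1 printed in the example differ from this only by an order-preserving renaming, so the correspondence between subsequences is the same either way.

```python
from bisect import bisect_right

def standardize(seq):
    """Replace repeated numbers by consecutive distinct labels, left to right."""
    order = sorted(range(len(seq)), key=lambda i: (seq[i], i))
    labels = [0] * len(seq)
    for rank, i in enumerate(order, start=1):
        labels[i] = rank
    return labels

def p_symbol(seq):
    """P-symbol of a sequence of distinct integers, built by row insertion."""
    P = []
    for x in seq:
        for row in P:
            k = bisect_right(row, x)
            if k == len(row):
                row.append(x)
                break
            row[k], x = x, row[k]
        else:                       # every row displaced something: start a new row
            P.append([x])
    return P

def modified_p_symbol(seq):
    """P-symbol of the derived sequence with its labels mapped back to seq's numbers."""
    labels = standardize(seq)
    back = dict(zip(labels, seq))
    return [[back[v] for v in row] for row in p_symbol(labels)]

original = [3, 3, 2, 3, 4, 1]
print(standardize(original))
print(modified_p_symbol(original))
```

The resulting tableau has four columns and three rows, matching the non-decreasing subsequence 3 3 3 4 and the decreasing subsequence 3 2 1.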
Example. To find the number of sequences of seven numbers consisting entirely of 1's, 2's, and 3's having a longest non-decreasing sequence of length four and a longest decreasing sequence of length three, we proceed as follows. The possible tableaux must have the shape of Figure 14.

FIG. 14.

The possible modified standard tableaux are

    1 1 1 1    1 1 1 2    1 1 1 3    1 1 2 2    1 1 2 3
    2 2        2 2        2 2        2 2        2 2
    3          3          3          3          3

    1 1 3 3    1 1 1 1    1 1 1 2    1 1 1 3    1 1 2 2
    2 2        2 3        2 3        2 3        2 3
    3          3          3          3          3

    1 1 2 3    1 1 3 3    1 2 2 2    1 2 2 3    1 2 3 3
    2 3        2 3        2 3        2 3        2 3
    3          3          3          3          3

They are 15 in number.
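The list of modified standard tableaux can be regenerated by brute force, and Theorem 4 can be cross-checked by counting the sequences directly. This is a sketch with assumed function names; the shape with seven squares, four columns and three rows has row lengths (4, 2, 1).

```python
from itertools import product

def modified_tableaux(shape, m):
    """All fillings of shape with 1..m: rows non-decreasing, columns increasing."""
    cells = [(i, j) for i, r in enumerate(shape) for j in range(r)]
    found = []
    for values in product(range(1, m + 1), repeat=len(cells)):
        t = dict(zip(cells, values))
        if all((j == 0 or t[i, j - 1] <= t[i, j]) and
               (i == 0 or t[i - 1, j] < t[i, j]) for (i, j) in cells):
            found.append([[t[i, j] for j in range(r)] for i, r in enumerate(shape)])
    return found

def longest(seq, may_follow):
    """Length of the longest subsequence whose neighbours satisfy may_follow(prev, cur)."""
    best = []
    for k, x in enumerate(seq):
        best.append(1 + max((best[i] for i in range(k) if may_follow(seq[i], x)),
                            default=0))
    return max(best, default=0)

def count_sequences(length, m, alpha, beta):
    """Brute-force count of sequences over 1..m with the prescribed longest lengths."""
    return sum(1 for s in product(range(1, m + 1), repeat=length)
               if longest(s, lambda a, b: a <= b) == alpha
               and longest(s, lambda a, b: a > b) == beta)

tableaux = modified_tableaux((4, 2, 1), 3)
print(len(tableaux))
print(count_sequences(7, 3, 4, 3))
```

The first number is the count of modified standard tableaux; the second is the direct count of sequences, which Theorem 4 expresses as that count multiplied by the number of standard tableaux of the same shape.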
By the Frame-Robinson-Thrall theorem the number of standard tableaux of this shape is 35. Hence the number of sequences of the desired type is 15 × 35 = 525.

As a further example we will work out explicit formulae for binary sequences (sequences consisting of 0's and 1's). In this case the modified standard tableaux have the general form of Figure 15, where the bracketed region can have any division of 0's and 1's (the 0's preceding the 1's, of course).

FIG. 15.
Let n be the number of digits in the sequence. Let m be the length of the longest non-decreasing subsequence. Then there are no sequences for which m < n/2. If m = n the longest decreasing subsequence is of length 1. If n/2 ≤ m < n, the longest decreasing subsequence is of length 2. The number of possible modified tableaux is 2m - n + 1. The number of standard tableaux is

$$\frac{n!\,(2m - n + 1)}{(m + 1)!\,(n - m)!}.$$

Thus the number of binary sequences of n digits with a longest non-decreasing subsequence of length m is

$$\frac{n!\,(2m - n + 1)^2}{(m + 1)!\,(n - m)!}.$$

Note. Since the total number of binary sequences is 2^n we have

$$2^n = \sum_{m \ge n/2} \frac{n!\,(2m - n + 1)^2}{(m + 1)!\,(n - m)!}.$$
In the above derivation we allowed all possible binary sequences. Theorem 4 also readily solves the problem if the number of 0's and 1's in the sequence is fixed. In this case there is at most one modified tableau and thus the number of sequences of n digits with a longest non-decreasing subsequence of length m is

$$\frac{n!\,(2m - n + 1)}{(m + 1)!\,(n - m)!}$$

with the additional restriction that the number, p, of 0's must satisfy n - m ≤ p ≤ m.

Note. This shows that

$$\binom{n}{p} = \sum_{m = \max(p,\, n-p)}^{n} \frac{n!\,(2m - n + 1)}{(m + 1)!\,(n - m)!}.$$
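Both binary-sequence formulae, and the two identities in the notes, can be compared with direct enumeration for small n. The function names below are assumptions of this illustration. For a 0/1 sequence, a non-decreasing subsequence consists of some 0's followed by some 1's, so its maximal length is found by trying every split point.

```python
from itertools import product
from math import comb, factorial

def longest_nondecreasing(bits):
    """Longest non-decreasing subsequence of a 0/1 tuple: 0's before a split, 1's after."""
    return max(bits[:k].count(0) + bits[k:].count(1) for k in range(len(bits) + 1))

def standard_tableaux_term(n, m):
    """n!(2m - n + 1) / ((m + 1)!(n - m)!), valid for n/2 <= m <= n."""
    return factorial(n) * (2 * m - n + 1) // (factorial(m + 1) * factorial(n - m))

def brute_force_counts(n):
    """Counts of binary sequences by m, and by (m, number of 0's)."""
    by_m, by_m_and_p = {}, {}
    for bits in product((0, 1), repeat=n):
        m, p = longest_nondecreasing(bits), bits.count(0)
        by_m[m] = by_m.get(m, 0) + 1
        by_m_and_p[m, p] = by_m_and_p.get((m, p), 0) + 1
    return by_m, by_m_and_p

n = 6
by_m, by_m_and_p = brute_force_counts(n)
for m in sorted(by_m):
    print(m, by_m[m], (2 * m - n + 1) * standard_tableaux_term(n, m))
```

Each printed row shows m, the enumerated count, and the value of the closed formula.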
Throughout Part II we could have dealt equally well with increasing and non-increasing subsequences rather than decreasing and non-decreasing subsequences.
REFERENCES

1. J. S. Frame, G. de B. Robinson, and R. M. Thrall, The hook graphs of the symmetric group, Can. J. Math., 6 (1954), 316.
2. D. E. Rutherford, Substitutional analysis (Edinburgh University Press, 1948), p. 26.

Institute for Defence Analysis
Princeton

Reprinted from Canad. J. Math. 13 (1961), 179-191
ON A THEOREM OF R. JUNGEN M. P. SCHUTZENBERGER
Let us recall the following elementary result in the theory of analytic functions in one variable. THEOREM
(R. JUNGEN [7]). If a is rational and b algebraic their
Hadamard product c is algebraic;l)", further, b is rational, c also is rational.
For several variaLles, J ungen 's proof shows that the theorem is still true for the Bochncr-:\Iartin [2] Hadamard product. It does not hold for the Cameron-:\lartin [3] and for the Haslam-Jones [6] Hadamard products. In this note we give a version of Jungen's theorem which is valid for a restricted interpretation of the notions involved when a and b are formal power series in a finite number of noncommuting variables. 1. Notations. Let R be a fixed not necessarily commutatiye ring with unit 1. For any finile set Z, F(Z) is the free monoid generated by Z and Rpul(Z) is the free module on F(Z) over R. An element a of Rpo\(Z) will usually be written in the form a = L (a, f) -J: fE F(Z) where the coefficients (a, j) are in R; Rpo1(Z) is graded in the usual mantler and 7r"a= LI(a, f)-J:fEF(Z), degf~nl. We identify R with 7r oR pol (Z). Rpol(Z) is also a ring with prouuct aa' = (a, 1') (a', 1") -J: f, 1', f" E F(Z) , f = 1'1"} . It is well known (d., e.g., [4; 3]) that these notions extend to the ling R(Z) of the fermal power series (with coefficients in R) in the noncommuting variables zEZ; R(Z) is topologized ill the same manner as a ring of commutative formal power-series and au' =lilll n •n ,_",(7r"a)(7r n ,a'). Any bER*(Z)=\aER(Z):7rua=ol has a quasI-ill verse (-b)*=limn_'" Ln'
Received by the editors December 6, 1961.
of the R-module $R(X \cup Y)$ (resp. $R_{pol}(X \cup Y)$). For each $q = (q_1, \ldots, q_M) \in R^M(X \cup Y)$, $\pi_n q = (\pi_n q_1, \ldots, \pi_n q_M)$. If $q \in R^{*M}(X \cup Y)$ (i.e., if $\pi_0 q = 0$) let $\lambda_q$ be the homomorphism of the monoid $F(X \cup Y)$ into the multiplicative monoid structure of $R(X \cup Y)$ that is induced by $\lambda_q x = x$ if $x \in X$ and $\lambda_q y_j = q_j$ if $y_j \in Y$. Since $\pi_0 q = 0$, $\lambda_q$ can be extended to an endomorphism of the R-module $R(X \cup Y)$ by ...
Hence, $p(\infty) = \lim_{m \to \infty} p(m)$ exists and it satisfies $p(\infty) \in R^{*M}(X)$, $\pi_0 p(\infty) = 0$, $p(\infty) = \lambda_{p(\infty)} p$. In fact, $p(\infty)$ is the only element to satisfy these equations because if $\pi_0 p' = 0$ and $p' = \lambda_{p'} p$, any relation $\pi_m p(\infty) = \pi_m p'$ implies $\pi_{m+1} p' = \pi_{m+1} \lambda_{\pi_m p'}\, p = \pi_{m+1} \lambda_{\pi_m p(\infty)}\, p = \pi_{m+1} p(\infty)$. For this reason we call $p(\infty)$ the solution of p.

DEFINITION 2. $R^*_{alg}(X)$ is the least subset (of $R^*(X)$) that contains every coordinate of the solution of any proper system having its coordinates in $R_{pol}(X \cup Y)$. (REMARK. It can easily be shown that $R^*_{alg}(X)$ is rationally closed and that it contains every coordinate of the solution of any proper system having its coordinates in $R^*_{alg}(X \cup Y)$.)

DEFINITION 3. For any $a, b \in R(X)$,

$$a \odot b = \sum \{(a, f)(b, f)\cdot f : f \in F(X)\}.$$
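Definition 3 can be illustrated on polynomials, that is, on series with finitely many non-zero coefficients. In this sketch, which is an assumption-laden illustration and not the paper's formalism, a series is a Python dictionary from words (strings over the alphabet X) to integer coefficients, and the coefficient ring is taken to be the integers. The ordinary product is included only for contrast.

```python
def hadamard(a, b):
    """Coefficientwise product: (a ⊙ b, f) = (a, f)(b, f) for every word f."""
    return {f: a[f] * b[f] for f in a if f in b and a[f] * b[f] != 0}

def cauchy(a, b):
    """Ordinary product of noncommutative polynomials, shown for contrast."""
    out = {}
    for f1, c1 in a.items():
        for f2, c2 in b.items():
            out[f1 + f2] = out.get(f1 + f2, 0) + c1 * c2
    return {f: c for f, c in out.items() if c != 0}

# Words over X = {x, y}; the empty string is the unit of the free monoid F(X).
a = {"": 1, "x": 2, "xy": 3, "yx": 5}
b = {"x": 7, "xy": -1, "yy": 4}

print(hadamard(a, b))
print(cauchy({"x": 1, "y": 1}, {"x": 1, "y": 1}))
```

The words xy and yx are distinct keys, which is what makes the variables noncommuting in this representation.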
2. Main result. Property 2.1. The element a of $R^*(X)$ belongs to $R^*_{rat}(X)$ if and only if there exists a finite integer $N \ge 2$ and a homomorphism $\mu$ of F(X) into the multiplicative monoid of $R^{N \times N}$ (the ring of the $N \times N$ matrices with entries in R) such that $a = \sum \{(\mu f)_{1,N}\cdot f : f \in F(X)\}$ (abbreviated as $\sum \mu f_{1,N}\cdot f$).

PROOF. (1) The condition is necessary. This is trivial if $a = \pi_1 a$. Hence it suffices to show that for any $r, r' \in R$, $a = \sum \mu f_{1,N}\cdot f$ and $a' = \sum \mu' f_{1,N'}\cdot f$ one can construct suitable homomorphisms giving $ra + a'r'$, $aa'$ and $a^*$. This is done below, defining the homomorphisms by their restriction to X.
Addition. Let N"=N+N'+2 and p."xERh"XN" defined for each xEX by J.L"X;.1 = J.L"X.V".i = 0 and
for 1
~
i
~
N" j
J.L " X;+I.N" = J.LXi.N
for 1
~
i
~
for 1
J.L"Xi,i' = the direct sum of J.LX and J.L'X
i\' j ~
i
~
.\"
j
for 2 ~ i, i' ~ X" - 1 j
The verification is trivial. Product. Let N"=N+N' and define IIfERN"X.V" ioreachfEF(X) by ~1;,;' = p.f; ..v if fr! 1, 1 ~i ~ N, i' = N + 1; IIf;.;· = 0, otherwise. Then, if p."x=ilx+llx where ilX is the direct sum of J.LX and J.!'x, one has for eachf = X(l l X(2) . • . x(n),J.!"f = iii Lliii'lIxtiiilf":j'xlj'f" = fl. Since IIfx(j)=iiiIlX(i) and (~1"'iii"h,s"=0 when f"=1, one has J.L"fl.,V" = L\(uf{,N)(p.'f{:N·):j'f"=fl· Hence, LJ.!"fl,y .. -j=aa'. QlIasi-im'erse. Let 1V" = N and define IIfERsXS for each IE F(X) by IIfi,i'=J.!fi ..v if fr!1, 1~i~N, i'=1; IIfi,;'=O, otherwise. Then JJ."x = fJ.X + IIX and since JJ.fllx = IIfx identically one has p."f = Lllf(1)1IJ<2) ... IIPk'J-tJ
We now consider two subrings R' and R" of R that commute element-wise. Property 2.2. If a = LJ.L'!t,o'l·jER;:t(X) where J.L' is a homomorphism into R'NXX and if b = p( 00 hER~l:(X) where the proper system P has its coordinates in R~j(XU Y), then a 0 bER:1g(X). If, further, bER:~~(X) then a 0 bERiat(X). PROOF. \Ve verify first the case of bER;~~()(), i.e., of b = L,fJ."Il.-v" oj for some N" and J.L". Then a 0 b = 2:(J.L' ®J.L")!J,-','.\"' -J where the kroneckerian product J.L'@J.L" is a homomorphism of F(X) into RNN"XNN" because R' and R" commute and the result is proved. For the general case we denote by K(Z) for any set Z the ring of the NXN matrices with entries in R(Z). We shall have to consider several homomorphisms of moduleer: R_11(Z')-4KM(Z") where Z' and ZIt are two finite sets. In each case er is defined by a mapping Z'-4K(Z") which is extended in a natural fashion to a homomorphism of the monoid F(Z') into the multiplicative structure of K(Z"). Then for each a
= (aI, .. "
a.lI)
E R-'>f(Z'),
uai
=
2:! (aj,g)·ug: g E F(Z')}
and era = (eral, ... ,eraJI). l\Iore specifically, J.L: R_If(X)-4K-If(X) is induced by a mapping J.L: X-4K(X) such that the entries of each J.LX belong to R'*(X). For each qER"HI(X), }.I'q: R(XU Y)-4KM(X) is induced by }.,qf = J.Lj if jE F(X) and }.l'qYi = Mi if yiE Y. I-Icnce. since R' and R" commute element-wise, J.L}.qg = }.~qg for each gE F(XU (with Aq as previously defined). Consequently, J.L}.qp = ApqP for any
n
PER"_'1(XU Y). Let no\v Z~ -I-··l(l<·
•
n
ther, PER~ot·,,(XU Y) all the entries appearing in lIP belong to R~ol (XUZ) and then finally (J.lp( 00 Lk,· 0:R:1g (X). This completes the proof because
L (b, /)P.'/I.S}: / E F(X) I = L I (b, /)41.\': / E F(X) I = p.bl,Y
a0 b=
where for each $x \in X$, $\mu$ is defined by $\mu x_{i,i'} = \mu' x_{i,i'} \cdot x$.

REMARK 1. Definitions 1, 2, and 3 and the computations of this section used only the structure of monoid of the additive groups considered. Hence, the results are still valid when an arbitrary semiring S is taken in place of R. For S consisting of two Boolean elements, Jungen's theorem and its special case for b rational have been obtained in a different form by Y. Bar-Hillel, M. Perles and E. Shamir [1] (also by S. Ginsburg and G. F. Rose [5]) and by S. Kleene [8] respectively as by-products of more sophisticated theories.

REMARK 2. Let R = C, the field of complex numbers, and p a proper system of dimension M. Introducing 4M new symbols $z_j$ and replacing each $y_j$ by $z_{4j} + i z_{4j+1} - z_{4j+2} - i z_{4j+3}$ in the $p_j$'s we can deduce from p a new system of dimension 4M in which all the coefficients are non-negative real numbers and whose solution is simply related to $p(\infty)$.
Assume now that $p \in C^{M}_{\mathrm{pol}}(X \cup Y)$ has only real non-negative coefficients and denote by $\alpha$ a homomorphism of $C_{\mathrm{pol}}(X \cup Y)$ into $C$. Because of the assumption that $(p_j, y_{j'}) = (p_j, 1) = 0$, identically, we can find an $\epsilon > 0$ such that $|\alpha p_j| < \epsilon$ for all $j$ when $|\alpha x| \le \epsilon$ and $|\alpha y| \le 2\epsilon$ for all $x \in X$ and $y \in Y$. Since the sequence $\alpha p(0), \alpha p(1), \ldots, \alpha p(n), \ldots$ is monotonically increasing it converges to a finite solution (cf., e.g., [10]). Hence, the canonical epimorphism of $C_{\mathrm{pol}}(X \cup Y)$ onto the ring of the ordinary (commutative) polynomials can be extended to an epimorphism of $C_{\mathrm{alg}}(X)$ onto the ring of the Taylor series of the algebraic functions.
Acknowledgment. Acknowledgment is made to the Commonwealth Fund for the grant in support of the visiting professorship of biomathematics in the Department of Preventive Medicine at Harvard Medical School.

REFERENCES
1. Y. Bar-Hillel, M. Perles and E. Shamir, On formal properties of simple phrase structure grammars, Technical Report No. 4, Information System Branch, Office of Naval Research, 1960.
M. P. SCHUTZENBERGER
2. S. Bochner and W. T. Martin, Singularities of composite functions in several variables, Ann. of Math. 38 (1938), 293-302.
3. R. H. Cameron and W. T. Martin, Analytic continuation of diagonals, Trans. Amer. Math. Soc. 44 (1938), 1-7.
4. K. T. Chen, R. H. Fox and R. C. Lyndon, Free differential calculus. IV, Ann. of Math. (2) 68 (1958), 81-95.
5. S. Ginsburg and G. F. Rose, Operations which preserve definability, System Development Corporation, Santa Monica, Calif., SP-511, October, 1961.
6. U. S. Haslam-Jones, An extension of Hadamard multiplication theorem, Proc. London Math. Soc. II. Ser. 27 (1928), 223-232.
7. R. Jungen, Sur les séries de Taylor n'ayant que des singularités algébrico-logarithmiques sur leur cercle de convergence, Comment. Math. Helv. 3 (1931), 226-306.
8. S. Kleene, Representation of events in nerve nets and finite automata, Automata Studies, Princeton Univ. Press, Princeton, N. J., 1956.
9. M. Lazard, Lois de groupes et analyseurs, Ann. Sci. Ecole Norm. Sup. (4) 72 (1955), 299-400.
10. A. M. Ostrowski, Solutions of equations and systems of equations, Academic Press, New York, 1960.

HARVARD MEDICAL SCHOOL
Reprinted from Proc. Amer. Math. Soc. 13 (1962), 885-890
REGULARITY AND POSITIONAL GAMES
BY
A. W. HALES AND R. I. JEWETT
Reprinted from the TRANSACTIONS OF THE AMERICAN MATHEMATICAL SOCIETY
Vol. 106, No. 2, pp. 222-229, February, 1963
1. Introduction. Suppose $X$ is a set, $\mathcal{S}$ a collection of sets (usually subsets of $X$), and $N$ is a cardinal number. Following the terminology of Rado [1], we say $\mathcal{S}$ is $N$-regular in $X$ if, for any partition of $X$ into $N$ parts, some part has as a subset a member of $\mathcal{S}$. If $\mathcal{S}$ is $n$-regular in $X$ for each integer $n$, we say $\mathcal{S}$ is regular in $X$. For example, let $X = \{1, 2, \ldots, mn - n + 1\}$ and $\mathcal{S}$ be all $m$-element subsets of $X$ (hereafter designated $X^{(m)}$). Then $\mathcal{S}$ is $n$-regular in $X$, but not $(n+1)$-regular. Another example is the famous theorem of Ramsey which states that given integers $k, m, n$, there exists an integer $p$ such that, if $A = \{1, 2, \ldots, p\}$, then $\{B^{(k)} : B \in A^{(m)}\}$ is $n$-regular in $A^{(k)}$. The concept of regularity is useful in analyzing certain types of games, as we shall see in §3. In §2, we shall give some general results and discuss related problems.

2. Regularity. One of the first problems in this area was proposed at Göttingen in 1927. The problem was as follows: If the positive integers are split into two parts, does one part contain arithmetic progressions of arbitrary length? B. L. van der Waerden solved this and a more general problem. He proved that, given integers $m$ and $n$, there exists an integer $p$ such that the set of all arithmetic progressions of length $m$ is $n$-regular in $\{1, 2, \ldots, p\}$ [2]. This will be a consequence of Theorem 1. First we shall give some preliminaries.

DEFINITION. If $\mathcal{S}$ and $\mathcal{T}$ are collections of sets, let $\mathcal{S} \otimes \mathcal{T}$ be the collection of all sets $A \times B$, where $A$ is in $\mathcal{S}$ and $B$ is in $\mathcal{T}$.

LEMMA 1. Let $M$ and $N$ be cardinal numbers. Let $\mathcal{S}$ be $N$-regular in $X$, a set of cardinality $M$, and let $\mathcal{T}$ be $N^M$-regular in $Y$. Then $\mathcal{S} \otimes \mathcal{T}$ is $N$-regular in $X \times Y$.
Proof. Let $P$ be a set of cardinality $N$. Then a partition of $X \times Y$ into $N$ parts can be represented by a function $f$ from $X \times Y$ into $P$. For each $y \in Y$, $f$ defines a function $f_y$ from $X$ into $P$ given by

$$f_y(x) = f(x, y).$$

Since there are $N^M$ such functions the mapping $y \to f_y$ induces a partition of $Y$ into $N^M$ parts. One of these parts contains as a subset a member $T$ of $\mathcal{T}$. That is, for all $y, y' \in T$

$$f_y = f_{y'}, \qquad f(x, y) = f(x, y') \quad (x \in X).$$

Choose $y_0 \in T$. Then $f_{y_0}$ partitions $X$ into $N$ parts and hence there exist $S \in \mathcal{S}$, $p \in P$ such that

$$f_{y_0}(x) = p \quad (x \in S).$$

But then

$$f_y(x) = p \quad (x \in S,\ y \in T).$$

That is,

$$f(x, y) = p \quad (x \in S,\ y \in T)$$

which was to be shown.

Received by the editors July 5, 1961 and, in revised form, December 26, 1961.

DEFINITION. Let $X$ and $Y$ be sets, $\mathcal{S}$ and $\mathcal{T}$ collections of sets. Then a mapping $f: X \to Y$ is called provincial with respect to $\mathcal{S}$ and $\mathcal{T}$ in case when $A \in \mathcal{S}$, $A \subseteq X$ there exists a set $B \in \mathcal{T}$, $B \subseteq Y$ such that $B \subseteq f(A)$.

LEMMA 2. Let $f: X \to Y$ be provincial with respect to $\mathcal{S}$ and $\mathcal{T}$. Then if $\mathcal{S}$ is $N$-regular in $X$, $\mathcal{T}$ is $N$-regular in $Y$.
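Proof. Let $P$ be a set of cardinality $N$ and $g: Y \to P$. Then $g(f): X \to P$ and there exist $p \in P$ and $A \in \mathcal{S}$ such that $A \subseteq X$ and $g(f(A)) = \{p\}$. But there is a $B \in \mathcal{T}$, $B \subseteq Y$ such that $B \subseteq f(A)$. So $g(B) = \{p\}$.

LEMMA 3. Let $X$ be a semigroup, $\mathcal{S} \subseteq 2^X$. Suppose for each positive integer $k$, $\mathcal{S}$ is $k$-regular in a finite subset of $X$. Then for each $n$,

$$\mathcal{S}_n = \{A_1 A_2 \cdots A_n : A_i \in \mathcal{S}\}$$

is regular in $X$.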
Proof. We induct on $n$. Suppose $\mathcal{S}_{n-1}$ is regular in $X$ and $k \ge 1$. Then there is an integer $m$ and $B \in X^{(m)}$ such that $\mathcal{S}$ is $k$-regular in $B$. Since $\mathcal{S}_{n-1}$ is $k^m$-regular in $X$, $\mathcal{S} \otimes \mathcal{S}_{n-1}$ is $k$-regular in $B \times X \subseteq X \times X$. But the mapping $(x, y) \to xy$ of $X \times X$ into $X$ is clearly provincial with respect to $\mathcal{S} \otimes \mathcal{S}_{n-1}$ and $\mathcal{S}_n$, and thus $\mathcal{S}_n$ is $k$-regular in $X$.

Let $W$ be a fixed set and $t \notin W$. Let $X$ be the free semigroup on the set $W$. A functional $f$ is a mapping of $W$ into $X$ which can be described as follows. For some positive integer $n$ there is an $n$-tuple $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_n)$ of elements of $W \cup \{t\}$ in which $t$ appears at least once, such that, for $w \in W$, $f(w)$ is the result of replacing the $t$ by a $w$ and multiplying (in $X$) the $n$ components of the new $n$-tuple. For example, if $W = \{1, 2, 3, 4\}$ and $\alpha = (1, t, 3, t, t, 2, 1, t)$ then the corresponding functional $f$ would satisfy

$$f(4) = 14344214.$$
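The definition of a functional can be sketched in a few lines of Python. This is only an illustration: the names `make_functional` and `image` are hypothetical, and words of the free semigroup are represented here as tuples of letters, which is an assumption of the sketch rather than anything fixed by the paper.

```python
def make_functional(alpha, t="t"):
    """Return the functional f described by the tuple alpha.

    alpha is a tuple over W together with the marker t, in which t occurs
    at least once.  f(w) replaces every t by w; the product in the free
    semigroup is represented by the resulting tuple of letters.
    """
    if t not in alpha:
        raise ValueError("the marker t must appear at least once")

    def f(w):
        return tuple(w if entry == t else entry for entry in alpha)

    return f


def image(f, A):
    """The set f(A) = {f(w) : w in A}."""
    return {f(w) for w in A}


# The example from the text: W = {1, 2, 3, 4}, alpha = (1, t, 3, t, t, 2, 1, t).
f = make_functional((1, "t", 3, "t", "t", 2, 1, "t"))
print("".join(map(str, f(4))))  # 14344214
```

Because $t$ occurs in $\alpha$, distinct letters give distinct words, so `image(f, A)` always has as many elements as `A`.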
Suppose that $f_1, f_2, \ldots, f_n$ are functionals. Let $\phi: W^n \to f_1(W) f_2(W) \cdots f_n(W)$ be defined by $\phi(w_1 w_2 \cdots w_n) = f_1(w_1) f_2(w_2) \cdots f_n(w_n)$. We see that if $g$ is a functional of "length" $n$ then there is a functional $h$ such that $\phi(g(w)) = h(w)$ $(w \in W)$. Loosely, we have $\phi(g(t)) = h(t)$. If $A \subseteq W$, $R \subseteq W^n$ and $R$ is a member of $\{f(A) : f \text{ is a functional}\}$ then $\phi(R)$ is also a member of this collection. Thus, relative to this collection, $\phi$ is a provincial map of $A^n$ onto $f_1(A) f_2(A) \cdots f_n(A)$.

THEOREM 1. If $A$ is a finite subset of $W$ the collection $\{f(A) : f \text{ is a functional}\}$ is regular in $X$.
Proof. Let $(I_{m,j})$ be the statement: If $B \in W^{(m)}$ there exists an integer $p$ such that $\{f(B) : f \text{ is a functional}\}$ is $j$-regular in $B^p$. We will prove these statements by induction and thus prove the theorem. It is clear that $(I_{m,1})$ and $(I_{1,j})$ are true. Assume further that for $n > 1$ and $k \ge 1$, $(I_{n,k})$ and $(I_{n-1,j})$ are true for all $j$. Let $A \in W^{(n)}$. Pick $a \in A$ and let $B = A - \{a\}$. By assumption, there is an integer $r$ such that $\{f(A) : f \text{ is a functional}\}$ is $k$-regular in $A^r$. By $(I_{n-1,j})$, the fact that $B^s$ is finite for each $s$, and Lemma 3, we see that $\{f_0(B) f_1(B) \cdots f_r(B) : f_i \text{ is a functional}\}$ is $(k+1)$-regular in the subsemigroup of $X$ generated by $B$, namely, $B^1 \cup B^2 \cup \cdots \cup B^s \cup \cdots$. The $B^s$ are disjoint, and if $f_0, f_1, \ldots, f_r$ are functionals, then the set $f_0(B) f_1(B) \cdots f_r(B)$, all of whose elements have the same "length," meets at most one of the $B^s$. In such a situation, $(k+1)$-regularity in the union implies $(k+1)$-regularity in one of the parts. Thus there is an integer $q$ such that

$$\{f_0(B) f_1(B) \cdots f_r(B)\}$$

is $(k+1)$-regular in $B^q$. We will use the integer $q$ to verify $(I_{n,k+1})$. Let $A^q = P_0 \cup P_1 \cup \cdots \cup P_k$. This defines a partition of $B^q$ and so there are functionals $g_0, g_1, \ldots, g_r$ such that $g_0(B) g_1(B) \cdots g_r(B)$ is contained in one of the parts, say $P_0$, and also in $B^q$. Thus each "entry" in a $g_i$ is an element of $B \cup \{t\}$. Since $B \subseteq A$, we can conclude that $g_0(A) g_1(A) \cdots g_r(A) \subseteq A^q$. Now the mapping $\phi: A^r \to g_0(a) g_1(A) \cdots g_r(A)$ defined by $\phi(w_1 w_2 \cdots w_r) = g_0(a) g_1(w_1) \cdots g_r(w_r)$ is provincial with respect to $\{f(A) : f \text{ is a functional}\}$. Thus, by $(I_{n,k})$, $\{f(A)\}$ is $k$-regular in $g_0(a) g_1(A) \cdots g_r(A)$. If $g_0(a) g_1(A) \cdots g_r(A)$ is disjoint from $P_0$ we are done. If not, there are elements $a_1, a_2, \ldots, a_r$ of $A$ such that

$$x = g_0(a) g_1(a_1) \cdots g_r(a_r) \in P_0.$$

Suppose $x = v_1 v_2 \cdots v_q \in A^q$. Define $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_q)$ by

$$\alpha_i = \begin{cases} v_i & \text{if } v_i \in B, \\ t & \text{if } v_i = a. \end{cases}$$

Note that $t$ appears in $g_0$, so $a$ appears in $g_0(a)$, and hence $t$ appears in $\alpha$. Then $\alpha$ represents a functional $g$ for which

$$g(a) = x$$

and $g(B) \subseteq g_0(B) g_1(B) \cdots g_r(B) \subseteq P_0$. Since $g(A) = g(B) \cup \{g(a)\}$, we have $g(A) \subseteq P_0$, and the theorem is proved.
COROLLARY. Let $S$ be a finite subset of a commutative semigroup $H$. Then the collection of all sets

$$\{a + nx : x \in S\},$$

where $a \in H$ and $n$ is a positive integer, is regular in $H$.

Proof. Let $X$ be the free semigroup on $S$. Then the mapping

$$x_1 x_2 \cdots x_n \to x_1 + x_2 + \cdots + x_n$$

of $X$ into $H$ is provincial.

COROLLARY (VAN DER WAERDEN). For any partition of the positive integers into a finite number of parts, one of the parts contains arithmetic progressions of arbitrary length.
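The smallest nontrivial instance of this corollary can be checked by brute force. The sketch below is an illustration only: the function names are hypothetical, and it relies on the well-known value 9 for the least $p$ such that three-term progressions are 2-regular in $\{1, \ldots, p\}$, a value the paper itself does not state.

```python
from itertools import product


def has_mono_ap(coloring, length):
    """True if some arithmetic progression of the given length lies in one part.

    coloring[i] is the part containing the integer i + 1.
    """
    p = len(coloring)
    for start in range(1, p + 1):
        for step in range(1, p):
            last = start + (length - 1) * step
            if last > p:
                break  # larger steps only overshoot further
            terms = range(start, last + 1, step)
            if len({coloring[x - 1] for x in terms}) == 1:
                return True
    return False


def progressions_are_regular(p, length, parts):
    """Check that progressions of this length are `parts`-regular in {1, ..., p}."""
    return all(has_mono_ap(c, length) for c in product(range(parts), repeat=p))


print(progressions_are_regular(9, 3, 2))  # True
print(progressions_are_regular(8, 3, 2))  # False
```

The second line prints `False` because the partition of $\{1, \ldots, 8\}$ into $\{1, 2, 5, 6\}$ and $\{3, 4, 7, 8\}$ leaves no three-term progression inside either part.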
The stronger statement proved by van der Waerden and mentioned above is clear from the proof of Theorem 1. This suggests a general result which we have given as a corollary to Theorem 2. Theorem 2 is proved in Rado [3]. The proof is essentially an application of Tychonoff's Theorem, as shown by Gottschalk [4].

THEOREM 2. Let $X$ and $\Gamma$ be sets, and for every finite subset $A$ of $X$ let $f_A$ be a function from $A$ into $\Gamma$. Suppose that for each $x \in X$, $\Gamma_x = \{f_A(x) : A \text{ is a finite subset of } X \text{ containing } x\}$ is finite. Then there exists a function $F$ from $X$ into $\Gamma$ with the property that, given any finite subset $A$ of $X$, there exists a finite subset $B$ of $X$ such that $F$ and $f_B$ agree on $A$.

COROLLARY. Let $X$ be a set and $\mathcal{S}$ a collection of finite sets. Then if $\mathcal{S}$ is $n$-regular in $X$ for some positive integer $n$, $\mathcal{S}$ is $n$-regular in a finite subset of $X$.
Proof. Let $P$ be a set with $n$ elements. Suppose $\mathcal{S}$ is not $n$-regular in any finite subset of $X$. Then for each finite subset $A$ of $X$, there is a function $f_A$ from $A$ into $P$ that is not constant on any member of $\mathcal{S}$. By Theorem 2 there is a function $F$ from $X$ into $P$ that is not constant on any member of $\mathcal{S}$, a contradiction.

The above corollary suggests a general problem. Let $M$ and $N$ be cardinals. Does there exist a cardinal $P$ having the following property? If $X$ is a set and $\mathcal{S}$ is a collection of sets each of cardinality less than $M$, and $\mathcal{S}$ is $N$-regular in $X$, then $\mathcal{S}$ is $N$-regular in a subset of $X$ of cardinality less than $P$. A simple example shows that if $M > 1$ and $N > 1$ are integers, $P = \aleph_0$ is best possible. The corollary says that if $M = \aleph_0$, and $N > 1$ is finite, then $P = \aleph_0$ is sufficient. No further results in this area are known.

Another problem is the "rectangle" problem. Let $M$, $N$, and $P$ be cardinals. For what pairs $(R, S)$ of cardinals (if any) is the following true? If $X$ has cardinality $R$ and $Y$ has cardinality $S$, then $X^{(M)} \otimes Y^{(N)}$ is $P$-regular in $X \times Y$. That is, if an $R \times S$ rectangle is partitioned into $P$ parts, one part contains an $M \times N$ rectangle. From Lemma 1, such pairs always exist. For example, if $M = 2$, $N = 5$, $P = 2$, then $R = 3$ and $S = 40$ is sufficient. The "minimal" pairs $(R, S)$ for given $M$, $N$, and $P$ are not known in general. A particularly interesting case occurs when $M = N = \aleph_0$ and $P = 2$. It is easily seen that $(\aleph_0, \aleph_0)$ does not work and $(\aleph_0, 2^{2^{\aleph_0}})$ does. The sufficiency of $(\aleph_0, 2^{\aleph_0})$ or, for that matter, $(2^{\aleph_0}, 2^{\aleph_0})$ is an open question.
3. Positional games. By a positional game we shall mean a game played by $n$ players on a "board" (finite set) $X$ with which is associated a collection $\mathcal{S}$ of subsets of $X$. The rules are that each player, in turn, claims as his own a previously unclaimed "square" (element) of $X$. The game proceeds either until one player has claimed every element of some $S \in \mathcal{S}$, in which case he wins, or until every element has been claimed, but no one has yet won, in which case the game is a tie. The most familiar example of such a game is "Tick-Tack-Toe." Another is the Oriental game "Go Moku." It is known from game theory that, in a finite two-player perfect information game, either one player has a forced win or each player can force a tie [5].

LEMMA 4. In a positional game involving 2 players, where $\mathcal{S}$ is 2-regular in $X$, the first player has a forced win.
Proof. Since no tie can occur, one player has a forced win. Assume the second player has a forced win. But then the first player can force a win by (1) making his first move at random, and (2) thereafter following the optimum strategy for the second player, ignoring the last random move, and playing again at random if this is impossible. Since having made an extra move cannot possibly hurt, this will give the first player a win, a contradiction. Therefore, the first player has a forced win.

The following result in combinatorial analysis is due to Philip Hall [6].

LEMMA 5. Let $S_1, S_2, \ldots, S_n$ be an indexed collection of finite sets. Then (A) and (B) are equivalent.
(A) There exist $s_1, s_2, \ldots, s_n$ such that each $s_i \in S_i$ and $s_i \ne s_j$ if $i \ne j$.
(B) For each $F \subseteq \{1, \ldots, n\}$, the set $\bigcup_{i \in F} S_i$ has at least as many elements as $F$.
If condition (A) is satisfied, we say $S_1, \ldots, S_n$ have distinct representatives. We will use this lemma to exhibit a tying strategy for the second player in certain positional games.

LEMMA 6. Let $X$ be the board of a 2-player positional game with winning sets $\mathcal{S} = \{S_1, \ldots, S_n\}$. For $k = 1, 2, \ldots, n$ let $T_{2k-1} = T_{2k} = S_k$. Then if $T_1, \ldots, T_{2n}$ have distinct representatives, the second player can force a tie.
Proof. Let the representatives be $t_1, t_2, \ldots, t_{2n}$. Consider the sets $\{t_1, t_2\}, \{t_3, t_4\}, \ldots, \{t_{2n-1}, t_{2n}\}$. Observe that in order to win the first player must have both elements of at least one of these sets. Since the second player can easily prevent this, he can force a tie.

LEMMA 7. Let $\mathcal{S} \subseteq 2^X$, where $X$ is finite. Let $n$ be the size of the smallest member of $\mathcal{S}$. Let $m$ be the size of the largest set of the form $\{S \in \mathcal{S} : x \in S\}$ where $x \in X$. If $n \ge 2m$, then in the corresponding 2-player positional game, the second player can force a tie.
Proof. By a simple counting argument, Lemmas 5 and 6 can be applied to obtain the desired result.

The rest of this paper will be concerned with a particular class of 2-player positional games, namely generalizations of Tick-Tack-Toe. The traditional Tick-Tack-Toe game is played on a $3 \times 3$ array of points in the plane. For positive integers $k$ and $n$, the "$k^n$-game" is played on a $k \times k \times \cdots \times k$ ($n$ times) array of points in $n$-space. If we choose as a board the set

$$X = \{(a_1, a_2, \ldots, a_n) : 1 \le a_i \le k \text{ for all } i\},$$

then $S$ is in $\mathcal{S}$, the collection of winning sets (paths), in case $S$ consists of $k$ points in a straight line. An equivalent characterization of $S \in \mathcal{S}$ would be that the elements of $S$, in some order, are $\alpha_1, \alpha_2, \ldots, \alpha_k$ where $\alpha_i = (a_{i1}, \ldots, a_{in})$ and, for each $j$, the sequence $(a_{1j}, a_{2j}, \ldots, a_{kj})$ is one of the following:

(1, 1, ..., 1)
(2, 2, ..., 2)
...
(k, k, ..., k)
(1, 2, ..., k)
(k, k-1, ..., 1).

In this case we say $\alpha_1, \alpha_2, \ldots, \alpha_k$ are in a natural order (there are two such orders). In traditional Tick-Tack-Toe, the second player can achieve a tie. In the $3^3$-game, however, the first player has a forced win (in fact, no tie position exists).
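The coordinate-sequence characterization of paths translates directly into an enumeration, and for the traditional $3^2$-game the tie can be confirmed by exhaustive search. The sketch below is an illustration under stated assumptions: `paths` and `first_player_value` are hypothetical names, and the search is plain memoized minimax, which is not a method taken from the paper and is feasible only for very small boards.

```python
from functools import lru_cache
from itertools import product


def paths(k, n):
    """All winning sets of the k^n game, each as a frozenset of k points."""
    up = tuple(range(1, k + 1))
    down = up[::-1]
    constant = [(c,) * k for c in up]
    found = set()
    # Choose one admissible coordinate sequence for each coordinate j.
    for seqs in product(constant + [up, down], repeat=n):
        points = frozenset(zip(*seqs))
        if len(points) == k:  # an all-constant choice gives a single point
            found.add(points)  # the two natural orders give the same set
    return found


def first_player_value(k, n):
    """+1 if the first player can force a win, 0 for a tie, -1 for a loss."""
    cells = list(product(range(1, k + 1), repeat=n))
    index = {c: i for i, c in enumerate(cells)}
    lines = [tuple(index[c] for c in path) for path in paths(k, n)]

    def wins(owned):
        return any(all(i in owned for i in line) for line in lines)

    @lru_cache(maxsize=None)
    def value(mine, theirs):
        # Value for the player about to move; `theirs` has just moved.
        if wins(theirs):
            return -1
        free = [i for i in range(len(cells)) if i not in mine and i not in theirs]
        if not free:
            return 0
        return max(-value(theirs, mine | {i}) for i in free)

    return value(frozenset(), frozenset())


print(len(paths(3, 2)))          # 8
print(first_player_value(3, 2))  # 0
```

Since each path arises from exactly two of the $(k+2)^n - k^n$ admissible choices of coordinate sequences, the enumeration yields $((k+2)^n - k^n)/2$ paths, for instance 49 in the $3^3$-game.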
Thus, in the 3-dimensional games sold on the market, $k$ is usually 4. Our previous results enable us to draw some conclusions about the existence of winning and tying strategies in the general case.

THEOREM 4. (a) If $k \ge 3^n - 1$ ($k$ odd) or if $k \ge 2^{n+1} - 2$ ($k$ even), then the second player can force a tie in the $k^n$-game. (b) For each $k$, there exists $n_k$ such that the first player can force a win in the $k^n$-game if $n \ge n_k$.
Proof. (a) If $k$ is odd, there are at most $(3^n - 1)/2$ paths through any point and this bound is achieved only at the center point. If $k$ is even, the bound is $2^n - 1$. The result follows readily from Lemma 7. This suggests that the center point is the optimum move for the first player if $k$ is odd. (b) In Theorem 1, let $W = \{1, 2, \ldots, k\}$. Note that if $f$ is a functional then $f(W)$ is a path, but the converse is not true. Now $(I_{k,2})$ and Lemma 4 yield the result.

We conjecture that the bounds in Theorem 4(a) can be improved by a direct application of Lemmas 5 and 6. It seems possible that $k \ge 2(2^{1/n} - 1)^{-1}$, i.e., that the total number of points be greater than the total number of paths, can be shown to be sufficient in this way. Even though, in some $k^n$-games, the second player cannot force a tie, a tie position may still exist, i.e., $\mathcal{S}$ (the collection of paths) may not be 2-regular in $X$ (the board). The bounds of Theorem 4(a) apply, but much more can be said.

THEOREM 5. If $k \ge n + 1$, then in the $k^n$-game the collection of paths is not 2-regular in the board.
Let $k$ be fixed. For each $n$ let the $k^n$-game board $X_n$ be the set of $n$-tuples on $\{1, 2, \ldots, k\}$. Designate the elements of GF(2) by $\{0, 1\}$. Any partition of $X_n$ into two parts can be represented (in two ways) as a function from $X_n$ into GF(2). Let $f: X_m \to GF(2)$ and $g: X_n \to GF(2)$ represent partitions. Then we define $f \oplus g: X_{m+n} \to GF(2)$ by

$$(f \oplus g)(a_1, \ldots, a_{m+n}) = f(a_1, \ldots, a_m) + g(a_{m+1}, \ldots, a_{m+n})$$

where addition on the right takes place in GF(2). Thus $f \oplus g$ represents a partition of $X_{m+n}$ into two parts. Note that "$\oplus$" is an associative operation on functions from the $X_i$ into GF(2).

Proof of Theorem 5. Let $v_1, v_2, \ldots, v_m$ be $k$-dimensional vectors over GF(2), that is, functions from $X_1$ into GF(2). Define
Suppose that for each choice of (i = 1, ..., k)
the vector is neither all zeros nor all ones. Then from the above discussion it can be seen that $v_1 \oplus v_2 \oplus \cdots \oplus v_m$ represents a partition of $X_m$ no part of which contains a path. The theorem will be proved if for each $k$, $k - 1$ such vectors can be found. The desired constructions are obvious extensions of the following two examples for odd and even $k$.

For k = 5:
(1, 0, 0, 0, 1)
(0, 1, 0, 1, 0)
(1, 0, 0, 0, 0)
(0, 1, 0, 0, 0)

For k = 6:
(1, 0, 0, 0, 0, 1)
(0, 1, 0, 0, 1, 0)
(1, 0, 0, 0, 0, 0)
(0, 1, 0, 0, 0, 0)
(0, 0, 1, 0, 0, 0).
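The two examples can be tested directly against the statement of Theorem 5. The sketch below assumes that the partition in question is the one represented by $v_1 \oplus \cdots \oplus v_m$ as defined earlier, with the vectors listed in the order given; it then checks by exhaustive enumeration that no path of the $5^4$-game or of the $6^5$-game lies inside one part. The names `paths`, `xor_partition` and `has_monochromatic_path` are hypothetical.

```python
from itertools import product


def paths(k, n):
    """All winning sets of the k^n game, each as a frozenset of k points."""
    up = tuple(range(1, k + 1))
    down = up[::-1]
    constant = [(c,) * k for c in up]
    found = set()
    for seqs in product(constant + [up, down], repeat=n):
        points = frozenset(zip(*seqs))
        if len(points) == k:  # skip the all-constant choices
            found.add(points)
    return found


def xor_partition(vectors):
    """The function v_1 (+) ... (+) v_m on m-tuples over {1, ..., k}."""
    def f(point):
        # v[a - 1] is the value of v at a; the sum is taken in GF(2).
        return sum(v[a - 1] for v, a in zip(vectors, point)) % 2
    return f


def has_monochromatic_path(k, vectors):
    """True if some path of the k^m game lies entirely in one part."""
    f = xor_partition(vectors)
    return any(len({f(p) for p in path}) == 1
               for path in paths(k, len(vectors)))


V5 = [(1, 0, 0, 0, 1), (0, 1, 0, 1, 0), (1, 0, 0, 0, 0), (0, 1, 0, 0, 0)]
V6 = [(1, 0, 0, 0, 0, 1), (0, 1, 0, 0, 1, 0), (1, 0, 0, 0, 0, 0),
      (0, 1, 0, 0, 0, 0), (0, 0, 1, 0, 0, 0)]

print(has_monochromatic_path(5, V5))  # False
print(has_monochromatic_path(6, V6))  # False
```

Both calls return `False`, so each example yields a tie position, in agreement with the theorem for $k = 5$, $n = 4$ and for $k = 6$, $n = 5$.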
REFERENCES

1. R. Rado, Note on combinatorial analysis, Proc. London Math. Soc. (2) 48 (1943-45), 122-160.
2. A. Y. Khinchin, Three pearls of number theory, Graylock Press, Rochester, 1952, pp. 11-12.
3. R. Rado, Axiomatic treatment of rank in infinite sets, Canad. J. Math. 1 (1949), 338.
4. W. H. Gottschalk, Choice functions and Tychonoff's Theorem, Proc. Amer. Math. Soc. 2 (1951), 172.
5. D. Blackwell and M. A. Girshick, Theory of games and statistical decisions, Wiley, New York, 1954, p. 21.
6. P. Hall, On representatives of subsets, J. London Math. Soc. 10 (1935), 26-30.

CALIFORNIA INSTITUTE OF TECHNOLOGY, PASADENA, CALIFORNIA
UNIVERSITY OF OREGON, EUGENE, OREGON
Research Notes
On well-quasi-ordering finite trees
By C. ST. J. A. NASH-WILLIAMS
King's College, Aberdeen
(Received 9 March 1963)

Abstract. A new and simple proof is given of the known theorem that, if $T_1, T_2, \ldots$ is an infinite sequence of finite trees, then there exist $i$ and $j$ such that $i < j$ and $T_i$ is homeomorphic to a subtree of $T_j$.
A quasi-ordered set is a set $Q$ on which a reflexive and transitive relation $\le$ is defined. $Q$ and $Q'$ will denote quasi-ordered sets. An infinite sequence $q_1, q_2, \ldots$ of elements of $Q$ will be called good if there exist positive integers $i, j$ such that $i < j$ and $q_i \le q_j$; if not, the sequence will be called bad. A quasi-ordered set $Q$ is well-quasi-ordered (wqo) if every infinite sequence of elements of $Q$ is good.

A graph $G$ consists (for our purposes) of a finite set $V(G)$ of elements called vertices of $G$ and a subset $E(G)$ of the Cartesian product $V(G) \times V(G)$. The elements of $E(G)$ are called edges of $G$. If $(\xi, \eta) \in E(G)$, we call $\eta$ a successor of $\xi$. If $\xi, \eta \in V(G)$, a $\xi\eta$-path is a sequence $\xi_0, \ldots, \xi_n$ of vertices of $G$ such that $\xi_0 = \xi$, $\xi_n = \eta$ and $(\xi_{i-1}, \xi_i) \in E(G)$ for $i = 1, \ldots, n$. The sequence with sole term $\xi$ is accepted as a $\xi\xi$-path. If there exists a $\xi\eta$-path, we say that $\eta$ follows $\xi$.

For the purposes of this paper, a tree is a graph $T$ possessing a vertex $\rho(T)$ (called its root) such that, for every $\xi \in V(T)$, there exists a unique $\rho(T)\xi$-path in $T$. The letter $T$ (with or without dashes or suffixes) will always denote a tree. For the purposes of this paper, a homeomorphism of $T$ into $T'$ is a function $\theta: V(T) \to V(T')$ such that, for every $\xi \in V(T)$, the images under $\theta$ of the successors of $\xi$ follow distinct successors of $\theta(\xi)$. The set of all trees will be quasi-ordered by the rule that $T \le T'$ if and only if there exists a homeomorphism of $T$ into $T'$. This paper presents a new and shorter proof of the following theorem of Kruskal (2).

THEOREM 1. The set of all trees is wqo.
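The quasi-order on trees can be made concrete with a short recursive test. The sketch below is an illustration only and makes several assumptions: a tree is encoded as the tuple of the branches at the successors of its root (so a single vertex is `()`), the names `embeds` and `root_to_root` are hypothetical, and the injective assignment of branches is found with a standard augmenting-path bipartite matching, which is a choice of this sketch and not part of the paper.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def embeds(t, u):
    """True if there is a homeomorphism of the tree t into the tree u."""
    # Either the root of t is sent to the root of u, or every vertex of t
    # is sent into a single branch of u.
    return root_to_root(t, u) or any(embeds(t, b) for b in u)


def root_to_root(t, u):
    """Send root to root: match the branches of t to distinct branches of u."""
    match = {}  # index of a branch of u -> index of the branch of t using it

    def augment(i, seen):
        for j, b in enumerate(u):
            if j not in seen and embeds(t[i], b):
                seen.add(j)
                if j not in match or augment(match[j], seen):
                    match[j] = i
                    return True
        return False

    return all(augment(i, set()) for i in range(len(t)))


leaf = ()
path3 = (((),),)   # root, one successor, one further successor
star2 = ((), ())   # root with two successors

print(embeds(star2, (star2,)))  # True: the root may go to a non-root vertex
print(embeds(star2, path3))     # False: no vertex of path3 has two successors
```

The recursion mirrors the definition: the images of the successors of the root must follow distinct successors of the image of the root, which means each branch of `t` is sent into its own branch of `u`.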
If $A, B$ are subsets of $Q$, a mapping $f: A \to B$ is non-descending if $a \le f(a)$ for every $a \in A$. The class of finite subsets of $Q$ will be denoted by $SQ$, and will be quasi-ordered by the rule that $A \le B$ if and only if there exists a one-to-one non-descending mapping of $A$ into $B$, where $A, B$ denote members of $SQ$. The Cartesian product $Q \times Q'$ will be quasi-ordered by the rule that $(q_1, q_1') \le (q_2, q_2')$ if and only if $q_1 \le q_2$ and $q_1' \le q_2'$. The cardinal number of a set $A$ will be denoted by $|A|$. The following two lemmas are well known (see (1)), but for the reader's convenience their proofs are given here.

LEMMA 1. If $Q, Q'$ are wqo, then $Q \times Q'$ is wqo.
Proof. We must prove an arbitrary infinite sequence $(q_1, q_1'), (q_2, q_2'), \ldots$ of elements of $Q \times Q'$ to be good. Call $q_m$ terminal if there is no $n > m$ such that $q_m \le q_n$. The number of $q_m$ which are terminal must be finite, since otherwise they would form a bad subsequence of $q_1, q_2, \ldots$. Therefore there is an $N$ such that $q_r$ is not terminal if $r > N$. We can therefore select a positive integer $f(1) > N$, then an $f(2) > f(1)$ such that $q_{f(1)} \le q_{f(2)}$, then an $f(3) > f(2)$ such that $q_{f(2)} \le q_{f(3)}$ and so on. Since $Q'$ is wqo, there exist $i, j$ such that $i < j$ and $q'_{f(i)} \le q'_{f(j)}$, whence $(q_{f(i)}, q'_{f(i)}) \le (q_{f(j)}, q'_{f(j)})$ and therefore our original sequence is good.

LEMMA 2. If $Q$ is wqo, then $SQ$ is wqo.
Proof. Assume that the lemma is false. Select an $A_1 \in SQ$ such that $A_1$ is the first term of a bad sequence of members of $SQ$ and $|A_1|$ is as small as possible. Then select an $A_2$ such that $A_1, A_2$ (in that order) are the first two terms of a bad sequence of members of $SQ$ and $|A_2|$ is as small as possible. Then select an $A_3$ such that $A_1, A_2, A_3$ (in that order) are the first three terms of a bad sequence of members of $SQ$ and $|A_3|$ is as small as possible. Assuming the Axiom of Choice, this process yields a bad sequence $A_1, A_2, A_3, \ldots$. Since this sequence is bad, no $A_i$ is empty: therefore we can select an element $a_i$ from each $A_i$. Let $B_i = A_i - \{a_i\}$. If there existed a bad sequence $B_{f(1)}, B_{f(2)}, \ldots$ such that $f(1) \le f(i)$ for all $i$, the sequence

$$A_1, A_2, \ldots, A_{f(1)-1}, B_{f(1)}, B_{f(2)}, \ldots$$

would be bad (since $A_i \le B_j$ entails $A_i \le A_j$ and is therefore impossible if $i < j$). Since this would contradict the definition of $A_{f(1)}$, there can be no bad sequence $B_{f(1)}, B_{f(2)}, \ldots$ such that $f(1) \le f(i)$ for all $i$. It follows that the class ($\mathfrak{B}$, say) of sets $B_i$ is wqo, since any bad sequence of sets $B_i$ would have a (bad) infinite subsequence in which no suffix was less than the first. Therefore, by Lemma 1, $Q \times \mathfrak{B}$ is wqo. Therefore there exist $i, j$ such that $i < j$ and $(a_i, B_i) \le (a_j, B_j)$, which implies that $A_i \le A_j$ and thus contradicts the badness of $A_1, A_2, \ldots$. This contradiction proves the lemma.

The branch of $T$ at a vertex $\xi$ is the tree $R$ such that $V(R)$ is the set of those vertices of $T$ which follow $\xi$ and $E(R) = E(T) \cap (V(R) \times V(R))$.
Proof of Theorem 1. Assume that the theorem is false. Select a tree $T_1$ such that $T_1$ is the first term of a bad sequence of trees and $|V(T_1)|$ is as small as possible. Then ...

... would be bad (since $T_i \le R \in B_j$ entails $T_i \le T_j$ and is therefore impossible if $i < j$). Since this would contradict the definition of $T_{f(1)}$, there can be no bad sequence $R_1, R_2, \ldots$ such that $R_i \in B_{f(i)}$ and $f(1) \le f(i)$ for every $i$. Since any bad sequence of elements of $B$ would have a bad subsequence of this form, it follows that no sequence of elements of $B$ is bad. Therefore $B$ is wqo and hence, by Lemma 2, $SB$ is wqo. Therefore
$B_i \le B_j$ for some pair $i, j$ such that $i < j$. Therefore there is a one-to-one non-descending mapping $\theta: B_i \to B_j$. For each $R \in B_i$, $R \le \theta(R)$ and so there exists a homeomorphism $h_R$ of $R$ into $\theta(R)$. A homeomorphism $h$ of $T_i$ into $T_j$ may thus be defined by writing $h(\rho(T_i)) = \rho(T_j)$ and making $h$ coincide with $h_R$ on the vertices of each $R \in B_i$. Therefore $T_i \le T_j$, which contradicts the badness of $T_1, T_2, \ldots$ and thus proves Theorem 1.

The Tree Theorem of (2) is stronger than Theorem 1 of the present paper, but the above proof of Theorem 1 can easily be adapted to prove the Tree Theorem by considering $X \times F(B)$ in place of $SB$ (where $X$, $F$ have the meanings stated in (2)). Because the necessary changes are easy to make, I have sacrificed this much generality in the interests of readability.
Note (added 10 August 1963). It has been brought to the author's notice that Kruskal's proof of the Tree Theorem (2) anticipated a somewhat similar proof obtained independently by S. Tarkowski (Bull. Acad. Polon. Sci. Sér. Sci. Math. Astr. Phys. 8 (1960), 39-41).
(1) HI,;)!.,,,. (;.
Ord(,l'in~ b~' dj\'i,iiJili!~'
EX,' ES
in abs!ra('! algebras. 1'l"Ic. £'Jllr/"" .1/rII/•. S.O('. (:\).2
(I 9.;:!), :l~lj-331i. (2) KI:'·SKAI.,.f. B. \\·pll-
Reprinted from Proc. Cambridge Phil. Soc. 59 (1963),833-835
331
T(lljI8.
Z. Wahrsch .. inlichkcitstheorie 2, :J40
368 (191i4)
On the Foundations of Combinatorial Theory
I. Theory of Möbius Functions
By GIAN-CARLO ROTA
Contents
1. Introduction
2. Preliminaries
3. The incidence algebra
4. Main results
5. Applications
6. The Euler characteristic
7. Geometric lattices
8. Representations
9. Application: the coloring of graphs
10. Application: flows in networks
1. Introduction

One of the most useful principles of enumeration in discrete probability and combinatorial theory is the celebrated principle of inclusion-exclusion (cf. FELLER*, FRÉCHET, RIORDAN, RYSER). When skillfully applied, this principle has yielded the solution to many a combinatorial problem. Its mathematical foundations were thoroughly investigated not long ago in a monograph by FRÉCHET, and it might at first appear that, after such exhaustive work, little else could be said on the subject.

One frequently notices, however, a wide gap between the bare statement of the principle and the skill required in recognizing that it applies to a particular combinatorial problem. It has often taken the combined efforts of many a combinatorial analyst over long periods to recognize an inclusion-exclusion pattern. For example, for the ménage problem it took fifty-five years, since CAYLEY's attempts, before JACQUES TOUCHARD in 1934 could recognize a pattern, and thence readily obtain the solution as an explicit binomial formula. The situation becomes bewildering in problems requiring an enumeration of any of the numerous collections of combinatorial objects which are nowadays coming to the fore. The counting of trees, graphs, partially ordered sets, complexes, finite sets on which groups act, not to mention more difficult problems relating to permutations with restricted position, such as Latin squares and the coloring of maps, seem to lie beyond present-day methods of enumeration. The lack of a systematic
theory is hardly matched by the consummate skill of a few individuals with a natural gift for enumeration.

This work begins the study of a very general principle of enumeration, of which the inclusion-exclusion principle is the simplest, but also the typical case. It often happens that a set of objects to be counted possesses a natural ordering, in general only a partial order. It may be unnatural to fit the enumeration of such a set into a linear order such as the integers: instead, it turns out in a great many cases that a more effective technique is to work with the natural order of the set. One is led in this way to set up a "difference calculus" relative to an arbitrary partially ordered set.

Looked at in this way, a surprising variety of problems of enumeration reveal themselves to be instances of the general problem of inverting an "indefinite sum" ranging over a partially ordered set. The inversion can be carried out by defining an analog of the "difference operator" relative to a partial ordering. Such an operator is the Möbius function, and the analog of the "fundamental theorem of the calculus" thus obtained is the Möbius inversion formula on a partially ordered set. This formula is here expressed in a language close to that of number theory, where it appears as the well-known inverse relation between the Riemann zeta function and the Dirichlet generating function of the classical Möbius function. In fact, the algebra of formal Dirichlet series turns out to be the simplest non-trivial instance of such a "difference calculus", relative to the order relation of divisibility.

Once the importance of the Möbius function in enumeration problems is realized, interest will naturally center upon relating the properties of this function to the structure of the ordering. This is the subject of the first paper of this series; we hope to have at least begun the systematic study of the remarkable properties of this most natural invariant of an order relation.

We begin in Section 3 with a brief study of the incidence algebra of a locally finite partially ordered set and of the invariants associated with it: the zeta function, Möbius function, incidence function, and Euler characteristic. The language of number theory is kept, rather than that of the calculus of finite differences, and the results here are quite simple.

The next section contains the main theorems: Theorem 1 relates the Möbius functions of two sets related by a Galois connection. By suitably varying one of the sets while keeping the other fixed one can derive much information. Theorem 2 of this section is suggested by a technique that apparently goes back to RAMANUJAN. These two basic results are applied in the next section to a variety of special cases; although a number of applications and special cases have been left out, we hope thereby to have given an idea of the techniques involved.

The results of Section 6 stem from an "Ideenkreis" that can be traced back to Whitney's early work on linear graphs. Theorem 3 relates the Möbius function to certain very simple invariants of "cross-cuts" of a finite lattice, and the analogy with the Euler characteristic of combinatorial topology is inevitable. Pursuing this analogy, we were led to set up a series of homology theories, whose Euler characteristic does indeed coincide with the Euler characteristic which we had introduced by purely combinatorial devices.
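The inversion of an "indefinite sum" over a partially ordered set can be sketched for the order relation of divisibility mentioned above. The sketch uses the standard recursive definition $\mu(x, x) = 1$ and $\mu(x, y) = -\sum_{x \le z < y} \mu(x, z)$ for $x < y$, which the introduction itself does not state; the function name `mobius_function` and the sample values of `f` are assumptions made purely for illustration.

```python
from functools import lru_cache


def mobius_function(elements, leq):
    """Return mu(x, y) for the finite partially ordered set (elements, leq)."""
    elements = tuple(elements)

    @lru_cache(maxsize=None)
    def mu(x, y):
        if x == y:
            return 1
        if not leq(x, y):
            return 0
        # Sum over the half-open interval x <= z < y.
        return -sum(mu(x, z) for z in elements
                    if leq(x, z) and leq(z, y) and z != y)

    return mu


def divides(a, b):
    return b % a == 0


divisors = [d for d in range(1, 61) if 60 % d == 0]
mu = mobius_function(divisors, divides)

# An arbitrary function f, its "indefinite sum" g(n) = sum of f(d) over d | n,
# and the recovery of f from g by Moebius inversion.
f = {d: d * d + 1 for d in divisors}
g = {n: sum(f[d] for d in divisors if divides(d, n)) for n in divisors}
recovered = {n: sum(g[d] * mu(d, n) for d in divisors if divides(d, n))
             for n in divisors}

print(recovered == f)             # True
print(mu(1, 6), mu(1, 30), mu(1, 4))  # 1 -1 0
```

For this ordering $\mu(1, n)$ agrees with the classical Möbius function of $n$, which is the number-theoretic special case the text refers to.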
GIAN-CARLO ROTA:
Some of the work in lattice theory that was carried out in the thirties is useful in this investigation; it turns out, however, that modular lattices are not combinatorially as interesting as a type of structure first studied by WHITNEY, which we have called geometric lattices following BIRKHOFF and the French school. The remarkable property of such lattices is that their Möbius function alternates in sign (Section 7).
To prevent the length of this paper from growing beyond bounds, we have omitted applications of the theory. Some elementary but typical applications will be found in the author's expository paper in the American Mathematical Monthly. Towards the end, however, the temptation to give some typical examples became irresistible, and Sections 9 and 10 were added. These by no means exhaust the range of applications; it is our conviction that the Möbius inversion formula on a partially ordered set is a fundamental principle of enumeration, and we hope to implement this conviction in the successive papers of this series. One of them will deal with structures in which the Möbius function is multiplicative (that is, has the analog of the number-theoretic property $\mu(mn) = \mu(m)\mu(n)$ if m and n are coprime), and another will give a systematic development of the Ideenkreis centering around PÓLYA's Hauptsatz, which can be significantly extended by a suitable Möbius inversion.
A few words about the history of the subject. The statement of the Möbius inversion formula does not appear here for the first time: the first coherent version, with some redundant assumptions, is due to WEISNER, and was independently rediscovered shortly afterwards by PHILIP HALL. WARD gave the statement in full generality. Strangely enough, however, these authors did not pursue the combinatorial implications of their work; nor was an attempt made to systematically investigate the properties of Möbius functions. Aside from HALL's applications to p-groups, and from some applications to statistical mechanics by M. S. GREEN and NETTLETON, little has been done; we give a hopefully complete bibliography at the end.
It is a pleasure to acknowledge the encouragement of G. BIRKHOFF and A. GLEASON, who spotted an error in the definition of a cross-cut, as well as of SEYMOUR SHERMAN and KAI-LAI CHUNG. My colleagues D. KAN, G. WHITEHEAD, and especially F. PETERSON gave me essential help in setting up the homological interpretation of the cross-cut theorem.
2. Preliminaries
Little knowledge is required to read this work. The two notions we shall not define are those of a partially ordered set (whose order relation is denoted by $\le$) and a lattice, which is a partially ordered set where max and min of two elements (we call them join and meet, as usual, and write them $\vee$ and $\wedge$) are defined. We shall use instead the symbols $\cup$ and $\cap$ to denote union and intersection of sets only. A segment $[x, y]$, for x and y in a partially ordered set P, is the set of all elements z between x and y, that is, such that $x \le z \le y$. We shall occasionally use open or half-open segments such as $[x, y)$, where one of the endpoints is to be omitted. A segment is endowed with the induced order structure; thus, a segment of a lattice is again a lattice. A partially ordered set is locally finite if every segment is finite. We shall only deal with locally finite partially ordered sets.
On the Foundations of Combinatorial Theory. I
The product $P \times Q$ of partially ordered sets P and Q is the set of all ordered pairs $(p, q)$, where $p \in P$ and $q \in Q$, endowed with the order $(p, q) \ge (r, s)$ whenever $p \ge r$ and $q \ge s$. The product of any number of partially ordered sets is defined similarly. The cardinal power $\mathrm{Hom}(P, Q)$ is the set of all monotonic functions from P to Q, endowed with the partial order structure $f \ge g$ whenever $f(p) \ge g(p)$ for every p in P. In a partially ordered set, an element p covers an element q when the segment $[q, p]$ contains two elements. An atom in P is an element that covers a minimal element, and a dual atom is an element that is covered by a maximal element. If P is a partially ordered set, we shall denote by $P^*$ the partially ordered set obtained from P by inverting the order relation.
A closure relation in a partially ordered set P is a function $p \to \bar p$ of P into itself with the properties (1) $\bar p \ge p$; (2) $\bar{\bar p} = \bar p$; (3) $p \ge q$ implies $\bar p \ge \bar q$. An element is closed if $p = \bar p$. If P is a finite Boolean algebra of sets, then a closure relation on P defines a lattice structure on the closed elements by the rules $p \wedge q = p \cap q$ and $p \vee q = \overline{p \cup q}$, and it is easy to see that every finite lattice is isomorphic to one that is obtained in this way. A Galois connection (cf. ORE, p. 182 ff.) between two partially ordered sets P and Q is a pair of functions $\zeta : P \to Q$ and $\pi : Q \to P$ with the properties: (1) both $\zeta$ and $\pi$ are order-inverting; (2) for p in P, $\pi(\zeta(p)) \ge p$, and for q in Q, $\zeta(\pi(q)) \ge q$. Under these circumstances the mappings $p \to \pi(\zeta(p))$ and $q \to \zeta(\pi(q))$ are closure relations, and the two partially ordered sets formed by the closed sets are isomorphic.
In Section 7, the notion of a closure relation with the MacLane-Steinitz exchange property will be used. Such a closure relation is defined on the Boolean algebra P of subsets of a finite set E and satisfies the following property: if p and q are points of E, and S a subset of E, and if $p \notin \bar S$ but $p \in \overline{S \cup q}$, then $q \in \overline{S \cup p}$. Such a closure relation can be made the basis of WHITNEY's theory of independence, as well as of the theory of geometric lattices. The closed sets of a closure relation satisfying the MACLANE-STEINITZ exchange property where every point is a closed set form a geometric (= matroid) lattice in the sense of BIRKHOFF (Lattice Theory, Chapter IX).
A partially ordered set P is said to have a 0 or a 1 if it has a unique minimal or maximal element. We shall always assume $0 \ne 1$. A partially ordered set P having a 0 and a 1 satisfies the chain condition (also called the JORDAN-DEDEKIND chain condition) when all totally ordered subsets of P having a maximal number of elements have the same number of elements. Under these circumstances one introduces the rank $r(p)$ of an element p of P as the length of a maximal chain in the segment $[0, p]$, minus one. The rank of 0 is 0, and the rank of an atom is 1. The height of P is the rank of any maximal element, plus one.
Let P be a finite partially ordered set satisfying the chain condition and of height $n + 1$. The characteristic polynomial of P is the polynomial $\sum_{x \in P} \mu(0, x)\,\lambda^{n - r(x)}$, where r is the rank function (see the def. of $\mu$ below).
If A is a finite set, we shall write n(A) for the number of elements of A.
3. The incidence algebra
Let P be a locally finite partially ordered set. The incidence algebra of P is defined as follows. Consider the set of all real-valued functions of two variables $f(x, y)$, defined for x and y ranging over P, and with the property that $f(x, y) = 0$ if $x \not\le y$. The sum of two such functions f and g, as well as multiplication by scalars, are defined as usual. The product $h = fg$ is defined as follows:
$$h(x, y) = \sum_{x \le z \le y} f(x, z)\, g(z, y).$$
In view of the assumption that P is locally finite, the sum on the right is well-defined. It is immediately verified that this is an associative algebra over the real field (any other associative ring could do). The incidence algebra has an identity element which we write $\delta(x, y)$, the Kronecker delta. The zeta function $\zeta(x, y)$ of the partially ordered set P is the element of the incidence algebra of P such that $\zeta(x, y) = 1$ if $x \le y$ and $\zeta(x, y) = 0$ otherwise. The function $n(x, y) = \zeta(x, y) - \delta(x, y)$ is called the incidence function.
The idea of the incidence algebra is not new. The incidence algebra is a special case of a semigroup algebra relative to a semigroup which is easily associated with the partially ordered set. The idea of taking "interval functions" goes back to DEDEKIND and E. T. BELL; see also WARD.
Proposition 1. The zeta function of a locally finite partially ordered set is invertible in the incidence algebra.
Proof. We define the inverse $\mu(x, y)$ of the zeta function by induction over the number of elements in the segment $[x, y]$. First, set $\mu(x, x) = 1$ for all x in P. Suppose now that $\mu(x, z)$ has been defined for all z in the half-open segment $[x, y)$. Then set
$$\mu(x, y) = -\sum_{x \le z < y} \mu(x, z).$$
Clearly $\mu$ is an inverse of $\zeta$.
The function $\mu$, inverse to $\zeta$, is called the Möbius function of the partially ordered set P. The following result, simple though it is, is fundamental:
Proposition 2 (Möbius inversion formula). Let $f(x)$ be a real-valued function, defined for x ranging in a locally finite partially ordered set P. Let an element p exist with the property that $f(x) = 0$ unless $x \ge p$. Suppose that
$$(*) \qquad g(x) = \sum_{y \le x} f(y).$$
Then
$$(**) \qquad f(x) = \sum_{y \le x} g(y)\, \mu(y, x).$$
Proof. The function g is well-defined. Indeed, the sum on the right can be written as $\sum_{p \le y \le x} f(y)$, which is finite for a locally finite ordered set. Substituting the right side of (*) into the right side of (**) and simplifying, we get
$$\sum_{y \le x} g(y)\,\mu(y, x) = \sum_{y \le x} \sum_{z \le y} f(z)\,\mu(y, x) = \sum_{y \le x} \sum_{z} f(z)\,\zeta(z, y)\,\mu(y, x).$$
Interchanging the order of summation, this becomes
$$\sum_{z} f(z) \sum_{y \le x} \zeta(z, y)\,\mu(y, x) = \sum_{z} f(z)\,\delta(z, x) = f(x), \quad \text{q.e.d.}$$
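The recursion of Proposition 1 and the inversion formula of Proposition 2 can be tried out on a small example. The following is a minimal sketch, not part of the original text: the helper name `mobius_function`, the choice of the divisors of 12 as test poset, and the values of `f` are all assumptions made for illustration.

```python
from functools import lru_cache


def mobius_function(elements, leq):
    """Build mu(x, y) for a finite poset from the recursion in Proposition 1."""
    elements = tuple(elements)

    @lru_cache(maxsize=None)
    def mu(x, y):
        if x == y:
            return 1
        if not leq(x, y):
            return 0
        # mu(x, y) = -(sum of mu(x, z) over all z with x <= z < y)
        return -sum(mu(x, z) for z in elements
                    if leq(x, z) and leq(z, y) and z != y)

    return mu


def divides(a, b):
    return b % a == 0


# Illustrative poset (an assumption): the divisors of 12 under divisibility.
divisors = [d for d in range(1, 13) if 12 % d == 0]
mu = mobius_function(divisors, divides)

# Arbitrary illustrative values of f; g is the "indefinite sum" (*).
f = {1: 5, 2: -1, 3: 4, 4: 0, 6: 7, 12: 2}
g = {x: sum(f[y] for y in divisors if divides(y, x)) for x in divisors}

# Formula (**): f(x) = sum over y <= x of g(y) * mu(y, x).
recovered = {x: sum(g[y] * mu(y, x) for y in divisors if divides(y, x))
             for x in divisors}
print(recovered == f)
```

Memoizing `mu` keeps the recursion cheap on small posets; for large posets one would more likely invert the zeta matrix directly.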
Corollary 1. Let $r(x)$ be a function defined for x in P. Suppose there is an element q such that $r(x)$ vanishes unless $x \le q$. Suppose that
$$s(x) = \sum_{y \ge x} r(y).$$
Then
$$r(x) = \sum_{y \ge x} \mu(x, y)\, s(y).$$
The proof is analogous to the above and is omitted.
Proposition 3 (Duality). Let $P^*$ be the partially ordered set obtained by inverting the order of a locally finite partially ordered set P, and let $\mu^*$ and $\mu$ be the Möbius functions of $P^*$ and P. Then $\mu^*(x, y) = \mu(y, x)$.
Proof. We have, in virtue of Proposition 2 and Corollary 1,
$$\sum_{x \le^* y \le^* z} \mu^*(x, y) = \delta(x, z),$$
where $\le^*$ denotes the order of $P^*$. Letting $q(x, y) = \mu^*(y, x)$, it follows that q is an inverse of $\zeta$ in the incidence algebra of P. Since the inverse is unique, $q = \mu$, q.e.d.
Proposition 4. The Möbius function of any segment $[x, y]$ of P equals the restriction to $[x, y]$ of the Möbius function of P.
The proof is omitted.
Proposition 5. Let $P \times Q$ be the direct product of locally finite partially ordered sets P and Q. The Möbius function of $P \times Q$ is given by
$$\mu((x, y), (u, v)) = \mu(x, u)\,\mu(y, v), \qquad x, u \in P;\ y, v \in Q.$$
The proof is immediate and is omitted. The same letter $\mu$ has been used for the Möbius functions of three partially ordered sets, and we shall take this liberty whenever it will not cause confusion.
Corollary (Principle of Inclusion-Exclusion). Let P be the Boolean algebra of all subsets of a finite set of n elements. Then, for x and y in P,
$$\mu(x, y) = (-1)^{n(y) - n(x)}, \qquad y \ge x,$$
where $n(x)$ denotes the number of elements of the set x.
Indeed, a Boolean algebra is isomorphic to the product of n chains of two elements, and every segment $[x, y]$ in a Boolean algebra is isomorphic to a Boolean algebra.
Aside from the simple result of Proposition 5, little can be said in general about how the Möbius function varies by taking subsets and homomorphic images of a partially ordered set. We shall see that more sophisticated notions will be required to relate the Möbius functions of two partially ordered sets.
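As an illustration of the Corollary on inclusion-exclusion, the sketch below checks $\mu(x, y) = (-1)^{n(y) - n(x)}$ on the Boolean algebra of subsets of a three-element set, and then uses it (through Corollary 1) to count the integers from 1 to 30 divisible by none of 2, 3, 5. The helper `mobius_function`, the ground set, and the counting problem are assumptions chosen only for the example.

```python
from functools import lru_cache
from itertools import combinations


def mobius_function(elements, leq):
    """Build mu(x, y) for a finite poset from the recursion in Proposition 1."""
    elements = tuple(elements)

    @lru_cache(maxsize=None)
    def mu(x, y):
        if x == y:
            return 1
        if not leq(x, y):
            return 0
        return -sum(mu(x, z) for z in elements
                    if leq(x, z) and leq(z, y) and z != y)

    return mu


primes = (2, 3, 5)  # illustrative ground set
subsets = [frozenset(c) for k in range(len(primes) + 1)
           for c in combinations(primes, k)]
mu = mobius_function(subsets, lambda a, b: a <= b)  # <= is set inclusion

# The closed formula of the Corollary, checked on every comparable pair.
formula_holds = all(mu(x, y) == (-1) ** (len(y) - len(x))
                    for x in subsets for y in subsets if x <= y)


def at_least(s):
    """Number of n in 1..30 divisible by every prime in s (and maybe more)."""
    return sum(1 for n in range(1, 31) if all(n % p == 0 for p in s))


# Corollary 1 at the empty set: the count of n divisible by *exactly* no prime.
by_inversion = sum(mu(frozenset(), s) * at_least(s) for s in subsets)
by_direct_count = sum(1 for n in range(1, 31) if all(n % p for p in primes))
print(formula_holds, by_inversion, by_direct_count)
```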
Let P be a finite partially ordered set with 0 and 1. The Euler characteristic E of P is defined as
$$E = 1 + \mu(0, 1).$$
The simplest result relating to the computation of the Euler characteristic was proved by PHILIP HALL by combinatorial methods. We reprove it below with a very simple proof which shows one of the uses of the incidence algebra:
Proposition 6. Let P be a finite partially ordered set with 0 and 1. For every k, let $C_k$ be the number of chains with k elements stretched between 0 and 1. Then
$$E = 1 - C_2 + C_3 - C_4 + \cdots.$$
Proof. $\mu = \zeta^{-1} = (\delta + n)^{-1} = \delta - n + n^2 - \cdots$. It is easily verified that $n^{k-1}(x, y)$ equals the number of chains of k elements stretched between x and y. Letting $x = 0$ and $y = 1$, the result follows at once.
It will be seen in Section 6 that the Euler characteristic of a partially ordered set can be related to the classical Euler characteristic in suitable homology theories built on the partially ordered set.
Proposition 6 is a typical application of the incidence algebra. Several other results relating the number of chains and subsets with specified properties can often be expressed in terms of identities for functions in the incidence algebra. In this way, one obtains generalizations to an arbitrary partially ordered set of some classical identities for binomial coefficients. We shall not pursue this line here further, since it lies off the track of the present work.
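Proposition 6 lends itself to a direct numerical check. The sketch below is an illustration under stated assumptions (divisor lattices as test posets, hypothetical helper names): it counts the chains with k elements stretched between 0 and 1 and forms the alternating sum $-C_2 + C_3 - C_4 + \cdots$, which by the Proposition should equal $\mu(0, 1) = E - 1$.

```python
def count_chains(elements, less, x, y, k):
    """Number of chains x = z_1 < z_2 < ... < z_k = y with exactly k elements."""
    if k == 1:
        return 1 if x == y else 0
    # Choose the second element z of the chain, then count chains from z to y.
    return sum(count_chains(elements, less, z, y, k - 1)
               for z in elements if less(x, z) and (less(z, y) or z == y))


def alternating_chain_sum(elements, less, bottom, top):
    """-C_2 + C_3 - C_4 + ..., i.e. E - 1 in the notation of Proposition 6."""
    return sum((-1) ** (k - 1) * count_chains(elements, less, bottom, top, k)
               for k in range(2, len(elements) + 1))


def strictly_divides(a, b):
    return a != b and b % a == 0


def mu_by_chains(n):
    """mu(0, 1) for the lattice of divisors of n >= 2 (0 is 1, 1 is n)."""
    divisors = [d for d in range(1, n + 1) if n % d == 0]
    return alternating_chain_sum(divisors, strictly_divides, 1, n)


for n in (6, 8, 12, 30):
    print(n, mu_by_chains(n))
```

For n = 12 the chain counts are $C_2 = 1$, $C_3 = 4$, $C_4 = 3$, so the sum is $-1 + 4 - 3 = 0$, in agreement with the classical value $\mu(12) = 0$.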
Example 1. The classical Möbius function $\mu(n)$ is defined as $(-1)^k$ if n is the product of k distinct primes, and 0 otherwise. The classical inversion formula first derived by MÖBIUS in 1832 is:
$$g(m) = \sum_{n \mid m} f(n); \qquad f(m) = \sum_{n \mid m} g(n)\, \mu\!\left(\frac{m}{n}\right).$$
It is easy to see (and will follow trivially from later results) that $\mu(m/n)$ is the Möbius function of the set of positive integers, with divisibility as the partial order. In this case the incidence algebra has a distinguished subalgebra, formed by all functions $f(n, m)$ of the form $f(n, m) = G(m/n)$. The product $H = FG$ of two functions in this subalgebra can be written in the simpler form
$$(*) \qquad H(m) = \sum_{kn = m} F(k)\, G(n).$$
If we associate with the element F of this subalgebra the formal Dirichlet series $\hat F(s) = \sum_{n=1}^{\infty} F(n)/n^s$, then the product (*) corresponds to the product of two formal Dirichlet series considered as functions of s, $\hat H(s) = \hat F(s)\, \hat G(s)$. Under this representation, the zeta function of the partially ordered set is the classical Riemann zeta function $\zeta(s) = \sum_{n=1}^{\infty} 1/n^s$, and the statement that the Möbius function is the inverse of the zeta function reduces to the classical identity
$$\frac{1}{\zeta(s)} = \sum_{n=1}^{\infty} \frac{\mu(n)}{n^s}.$$
It is hoped this example justifies much of the terminology introduced above.
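The number-theoretic statements of Example 1 can be checked mechanically. The following sketch is an illustration only; the function names are hypothetical, and the use of Euler's totient as a test case is an assumption not taken from the text. It computes the classical Möbius function, the product (*), and verifies that $\mu$ times the constant function 1 is the identity of the subalgebra.

```python
def classical_mobius(n):
    """(-1)^k if n is a product of k distinct primes, 0 otherwise."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divided the original n
                return 0
            result = -result
        p += 1
    if n > 1:                   # one remaining prime factor
        result = -result
    return result


def divisors(m):
    return [d for d in range(1, m + 1) if m % d == 0]


def dirichlet_product(F, G, m):
    """H(m) = sum over kn = m of F(k) G(n), as in formula (*)."""
    return sum(F(k) * G(m // k) for k in divisors(m))


def one(n):
    """The zeta function of the divisibility order, seen in the subalgebra."""
    return 1


# mu * zeta should be the Kronecker delta: 1 at m = 1 and 0 elsewhere.
delta_values = [dirichlet_product(classical_mobius, one, m)
                for m in range(1, 31)]


def inverted(g, m):
    """f(m) = sum over n | m of g(n) mu(m / n), the classical inversion."""
    return sum(g(n) * classical_mobius(m // n) for n in divisors(m))


# With g(m) = m the inverted function is Euler's totient (illustration only).
print(delta_values[:6], inverted(lambda n: n, 12))
```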
Example 2. If P is the set of ordinary integers, then $\mu(m, n) = -1$ if $m = n - 1$, $\mu(m, m) = 1$, and $\mu(m, n) = 0$ otherwise. The Möbius inversion formula reduces to a well-known formula of the calculus of finite differences, which is the discrete analog of the fundamental theorem of calculus. The Möbius function of a partially ordered set can be viewed as the analog of the classical difference operator $\Delta f(n) = f(n + 1) - f(n)$, and the incidence algebra serves as a calculus of finite differences on an arbitrary partially ordered set.
4. Main results
It turns out that the Möbius functions of two partially ordered sets can be compared, when the sets are related by a Galois connection. By keeping one of the sets fixed, and varying the other from among sets with a simpler structure, such as Boolean algebras, subspaces of a finite vector space, partitions, etc., one can derive much information about a Möbius function. This is the program we shall develop. The basic result is the following:
Theorem 1. Let P and Q be finite partially ordered sets, where P has a 0 and Q has a 0 and a 1. Let $\mu_P$ and $\mu$ be their Möbius functions. Let
$$\pi : Q \to P; \qquad \rho : P \to Q$$
be a Galois connection such that
(1) $\pi(x) = 0$ if and only if $x = 1$.
(2) $\rho(0) = 1$.
Then
$$\mu(0, 1) = \sum_{a > 0} \mu_P(0, a)\, \zeta(\rho(a), 0) = \sum_{[a :\, \rho(a) = 0]} \mu_P(0, a).$$
One gets a significant summand on the right for every $a > 0$ in P which is mapped into 0 by $\rho$. One therefore expects the right side to contain "few" terms. In general, $\mu_P$ is a known function and $\mu$ is the function to be determined.
Proof. We shall first establish the identity
$$(*) \qquad \sum_{a \ge b} \delta(\pi(x), a) = \zeta(x, \rho(b))$$
for every b in P. Here $\zeta$ on the right stands for the zeta function of Q. Equation (*) is equivalent to the following statement: $\pi(x) \ge b$ if and only if $x \le \rho(b)$. But this latter statement is immediate from the properties of a Galois connection. Indeed, if $\pi(x) \ge b$, then $\rho(\pi(x)) \le \rho(b)$, but $x \le \rho(\pi(x))$, hence $x \le \rho(b)$, and similarly for the converse implication.
To identity (*) we apply the Möbius inversion formula relative to P, thereby obtaining the identity
$$(**) \qquad \delta(\pi(x), 0) = \sum_{a \ge 0} \mu_P(0, a)\, \zeta(x, \rho(a)).$$
Now, $\delta(\pi(x), 0)$ takes the value 1 if and only if $\pi(x) = 0$, that is, in view of assumption (1), if and only if $x = 1$. For all other values of x, we have $\delta(\pi(x), 0) = 0$. Therefore,
$$\delta(\pi(x), 0) = 1 - n(x, 1).$$
We can now rewrite equation (**) in the form
$$1 - n(x, 1) = \zeta(x, \rho(0)) + \sum_{a > 0} \mu_P(0, a)\, \zeta(x, \rho(a)).$$
However, in view of assumption (2), $\zeta(x, \rho(0)) = \zeta(x, 1)$, and this is identically one for all x in Q. Therefore, simplifying,
$$-n(x, 1) = \sum_{a > 0} \mu_P(0, a)\, \zeta(x, \rho(a)).$$
Now, since $\zeta = \delta + n$, we have $\mu = \delta - \mu n$, hence, recalling that $0 \ne 1$,
$$\mu(0, 1) = -\sum_{0 \le x \le 1} \mu(0, x)\, n(x, 1) = \sum_{0 \le x \le 1} \sum_{a > 0} \mu_P(0, a)\, \mu(0, x)\, \zeta(x, \rho(a)).$$
Interchanging the order of summation, we get
$$\mu(0, 1) = \sum_{a > 0} \mu_P(0, a) \sum_{0 \le x \le 1} \mu(0, x)\, \zeta(x, \rho(a)).$$
The last sum on the right equals $\delta(0, \rho(a))$, and this equals $\zeta(\rho(a), 0)$. The proof is therefore complete.
For simplicity of application, we restate Theorem 1 inverting the order of P.
Corollary. Let $p : Q \to P$; $q : P \to Q$ be order-preserving functions between P and Q such that
(1) If $p(x) = 1$ then $x = 1$, and conversely.
(2) $q(1) = 1$.
(3) $p(q(x)) \le x$ and $q(p(x)) \ge x$.
Then
$$\mu(0, 1) = \sum_{a < 1} \mu_P(a, 1)\, \zeta(q(a), 0) = \sum_{[a :\, q(a) = 0]} \mu_P(a, 1),$$
where $\mu$ is the Möbius function of Q.
The second result is suggested by a technique which apparently goes back to RAMANUJAN (cf. HARDY, RAMANUJAN, page 139).
Theorem 2. Let Q be a finite partially ordered set with 0, and let P be a partially ordered set with 0. Let $p : Q \to P$ be a monotonic function of Q onto P. Assume that the inverse image of every interval $[0, a]$ in P is an interval $[0, x]$ in Q, and that the inverse image of 0 contains at least two points. Then
$$\sum_{[x :\, p(x) = a]} \mu(0, x) = 0$$
for every a in P.
The proof is by induction over the set P. Since $[0, 0]$ is an interval and its inverse image is an interval $[0, q]$ with $q > 0$, we have
$$\sum_{[x :\, p(x) = 0]} \mu(0, x) = \sum_{0 \le x \le q} \mu(0, x) = 0.$$
Suppose now the statement is true for all b such that $b < a$ in P. Then
$$\sum_{[x :\, p(x) = b]} \mu(0, x) = 0, \qquad b < a.$$
It follows that
$$\sum_{[x :\, p(x) = a]} \mu(0, x) = \sum_{b \le a}\ \sum_{[x :\, p(x) = b]} \mu(0, x).$$
The last sum equals the sum over some interval $[0, r]$ which is the inverse image of the segment $[0, a]$, that is
$$\sum_{b \le a}\ \sum_{[x :\, p(x) = b]} \mu(0, x) = \sum_{0 \le x \le r} \mu(0, x) = \delta(0, r).$$
But $r > 0$ because a is strictly greater than 0. Hence $\delta(r, 0) = 0$, and this concludes the proof.
5. Applications
The simplest (and typical) application of Theorem 1 is the following:
Proposition 1. Let R be a subset of a finite lattice L with the following properties: $1 \notin R$, and for every x of L, except $x = 1$, there is an element y of R such that $y \ge x$. For $k \ge 2$, let $q_k$ be the number of subsets of R containing k elements whose meet is 0. Then $\mu(0, 1) = q_2 - q_3 + q_4 - \cdots$.
Proof. Let $B(R)$ be the Boolean algebra of subsets of R. We take $P = B(R)$ and $Q = L$ in Theorem 1, and establish a Galois connection as follows. For x in L, let $\pi(x)$ be the set of elements of R which dominate x. In particular, $\pi(1)$ is the empty set. For A in $B(R)$, set $\rho(A) = \bigwedge A$, namely, the meet of all elements of A, an empty meet giving as usual the element 1. This is evidently a Galois connection. Conditions (1) and (2) of the Theorem are obviously satisfied. The function $\mu_P$ is given by the Corollary of Proposition 5 of Section 3, and hence the conclusion is immediate.
Two noteworthy special cases are obtained by taking R to be the set of dual atoms of Q, or the set of all elements $< 1$ (cf. also WEISNER).
Closure relations. A useful application of Theorem 1 is the following:
Proposition 2. Let $x \to \bar x$ be a closure relation on a partially ordered set Q having 1, with the property that $\bar x = 1$ only if $x = 1$. Let P be the partially ordered subset of all closed elements of Q. Then: (a) If $\bar x > x$, then $\mu(x, 1) = 0$; (b) If $\bar x = x$, then $\mu(x, 1) = \mu_P(x, 1)$, where $\mu_P$ is the Möbius function of P.
Proof. Considering $[x, 1]$, it may be assumed that P has a 0 and $x = 0$. We apply the Corollary of Theorem 1, setting $p(x) = \bar x$ and letting q be the injection map of P into Q. It is then clear that the assumptions of the Corollary are satisfied, and the set of all a in P such that $q(a) = 0$ is either the empty set or the single element 0, q.e.d.
Corollary (Ph. Hall). If 0 is not the meet of dual atoms of a finite lattice L, or if 1 is not the join of atoms, then $\mu(0, 1) = 0$.
Proof. Set $\bar x = \bigwedge A(x)$, where $A(x)$ is the set of dual atoms of Q dominating x, and apply the preceding result. The second assertion is obtained by inverting the order.
Example 1. Distributive lattices. Let L be a locally finite distributive lattice. Using Proposition 2, we can easily compute its Möbius function. Taking an interval $[x, y]$ and applying Proposition 4 of Section 3, we can assume that L is finite. For $a \in L$, define $\bar a$ to be the join of all atoms which a dominates. Then $a \to \bar a$ is a closure relation in the inverted lattice $L^*$. Furthermore, the subset of closed elements is easily seen to be isomorphic to a finite Boolean algebra (cf. BIRKHOFF, Lattice Theory, Ch. IX). Applying Proposition 5 of Section 3, we find: $\mu(x, y) = 0$ if y is not the join of elements covering x, and $\mu(x, y) = (-1)^n$ if y is the join of n distinct elements covering x.
In the special case of the integers ordered by divisibility, we find the formula for the classical Möbius function (cf. Example 1 of Section 3).
The Möbius function of cardinal products. Let P and Q be finite partially ordered sets. We shall determine the Möbius function of the partially ordered set $\mathrm{Hom}(P, Q)$ of monotonic functions from P to Q, in terms of the Möbius function of Q. It turns out that very little information is needed about P.
A few preliminaries are required for the statement. Let R be a subset of a partially ordered set Q with 0, and let $\bar R$ be the ideal generated by R, that is, the set of all elements x in Q which are below ($<$) some element of R. We denote by $Q/R$ the partially ordered set obtained by removing all the elements of $\bar R$, and leaving the rest of the order relation unchanged. There is a natural order-preserving transformation of Q onto $Q/R$ which is one-to-one for elements of Q not in $\bar R$. We shall call $Q/R$ the quotient of Q by the ideal generated by R.
Lemma. Let $f : P \to Q$ be monotonic with range $R \subset Q$. Then the segment $[f, 1]$ in $\mathrm{Hom}(P, Q)$ is isomorphic with $\mathrm{Hom}(P, Q/R)$.
Proof. For g in $[f, 1]$, set $g'(x) = g(x)$ to obtain a mapping $g \to g'$ of $[f, 1]$ to $\mathrm{Hom}(P, Q/R)$. Since $g \ge f$, the range of g lies above R, so the map is an isomorphism.
Proposition 3. The Möbius function $\mu$ of the cardinal product $\mathrm{Hom}(P, Q)$ of the finite partially ordered set P with the partially ordered set Q with 0 and 1 is determined as follows:
(a) If $f(p) \ne 0$ for some element p of P which is not maximal, then $\mu(0, f) = 0$.
(b) In all other cases,
$$\mu(0, f) = \prod_{m} \mu(0, f(m)), \qquad m \in P,$$
where the product ranges over all maximal elements of P, and where $\mu$ on the right stands for the Möbius function of Q.
(c) For $f \le g$, $\mu(f, g) = \mu(0, g')$, where $g'$ is the image of g under the canonical map of $[f, 1]$ onto $\mathrm{Hom}(P, Q/R)$, provided $Q/R$ has a 0.
Proof. Define a closure relation in $[0, f]^*$, namely the segment $[0, f]$ with the inverted order relation, as follows. Set $\bar g(m) = g(m)$ if m is a maximal element of P, and $\bar g(a) = 0$ if a is not a maximal element of P. If $\bar g = 0$, then $g(m) = 0$ for all maximal elements m, hence $g(a) = 0$ for all $a <$ some maximal element, since g is monotonic. Hence $g = 0$, and the assumption of Proposition 2 is satisfied. The set of closed elements is isomorphic to $\mathrm{Hom}(M, P)$, where M is a set of as many elements as there are maximal elements in P. Conclusion (a) now follows from Proposition 2, and conclusion (b) from Proposition 5 of Section 3. Conclusion (c) follows at once from the Lemma.
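Proposition 1 of this section, with R taken to be the set of dual atoms, can be tried on divisor lattices, where the meet is the greatest common divisor and 0 is the integer 1. The sketch below is an illustration under assumptions: the function name is hypothetical, and n is assumed composite so that the dual atoms are different from 0.

```python
from functools import reduce
from itertools import combinations
from math import gcd


def mu_from_dual_atoms(n):
    """q_2 - q_3 + q_4 - ... for the divisor lattice of a composite n.

    R is the set of dual atoms n / p, p a prime divisor of n, and q_k counts
    the k-element subsets of R whose meet (gcd) is the bottom element 1.
    """
    primes = [p for p in range(2, n + 1)
              if n % p == 0 and all(p % d for d in range(2, p))]
    dual_atoms = [n // p for p in primes]
    total = 0
    for k in range(2, len(dual_atoms) + 1):
        q_k = sum(1 for subset in combinations(dual_atoms, k)
                  if reduce(gcd, subset) == 1)
        total += (-1) ** k * q_k
    return total


for n in (6, 12, 30, 60, 210):
    print(n, mu_from_dual_atoms(n))
```

For n = 30 the dual atoms are 15, 10 and 6; no pair has meet 1 but the triple does, so $q_2 = 0$, $q_3 = 1$ and the sum is $-1$, the classical value of $\mu(30)$.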
We pass now to some applications of Theorem 2.
Proposition 4. Let $a \to \bar a$ be a closure relation on a finite lattice Q, with the property that $\overline{a \vee b} = \bar a \vee \bar b$ and $\bar 0 > 0$. Then for all $a \in Q$,
$$\sum_{[x :\, \bar x = \bar a]} \mu(0, x) = 0.$$
Proof. Let P be a partially ordered set isomorphic to the set of closed elements of Q. We define $p(x)$, for x in Q, to be the element of P corresponding to the closed element $\bar x$. Since $\bar 0 > 0$, any x between 0 and $\bar 0$ is mapped into 0. Hence the inverse image of 0 in P under the homomorphism p is the nontrivial interval $[0, \bar 0]$. Now consider an interval $[0, a]$ in P. Then $p^{-1}([0, a]) = [0, \bar x]$, where $\bar x$ is the closed element of Q corresponding to a. Indeed, if $0 \le y \le \bar x$ then $\bar y \le \bar{\bar x} = \bar x$, hence $p(y) \le a$. Conversely, if $p(y) \le a$, then $\bar y \le \bar x$ but $y \le \bar y$, hence $y \le \bar x$. Therefore the condition of Theorem 2 is satisfied, and the conclusion follows at once.
Corollary (Weisner). (a) Let $a > 0$ in a finite lattice L. Then, for any b in L,
$$\sum_{x \vee a = b} \mu(0, x) = 0.$$
(b) Let $a < 1$ in L. Then, for any b in L,
$$\sum_{x \wedge a = b} \mu(x, 1) = 0.$$
Proof. Take $\bar x = x \vee a$. Part (b) is obtained by inverting the order.
Example 2. Let V be a finite-dimensional vector space of dimension n over a finite field with q elements. We denote by $L(V)$ the lattice of subspaces of V. We shall use Proposition 4 to compute the Möbius function of $L(V)$.
In the lattice $L(V)$, every segment $[x, y]$, for $x \le y$, is isomorphic to the lattice $L(W)$, where W is the quotient space of the subspace y by the subspace x. If we denote by $\mu_n = \mu_n(q)$ the value of $\mu(0, 1)$ for $L(V)$, it follows that $\mu(x, y) = \mu_j$, where j is the dimension of the quotient space W. Therefore once $\mu_n$ is known for every n, the entire Möbius function is known.
To determine $\mu_n$, consider a subspace a of dimension $n - 1$. In view of the preceding Corollary, we have for all $a < 1$ (where 1 stands for the entire space V):
$$\sum_{x \wedge a = 0} \mu(x, 1) = 0,$$
where 0 stands of course for the 0-subspace. Let a be a dual atom of $L(V)$, that is, a subspace of dimension $n - 1$. Which subspaces $x \ne 0$ have the property that $x \wedge a = 0$? Such an x must be a line in V, and such a line must be disjoint except for 0 from a. A subspace of dimension $n - 1$ contains $q^{n-1}$ distinct points, so there will be $q^n - q^{n-1}$ points outside of a. However, every line contains exactly $q - 1$ points other than 0. Therefore, for each subspace a of dimension $n - 1$ there are
$$\frac{q^n - q^{n-1}}{q - 1} = q^{n-1}$$
distinct lines x such that $x \wedge a = 0$. Since each interval $[x, 1]$ is isomorphic to a space of dimension $n - 1$, we obtain
$$\mu_n = -\sum_{x \wedge a = 0,\ x \ne 0} \mu(x, 1) = -q^{n-1} \mu_{n-1}.$$
This is a difference equation for $\mu_n$ which is easily solved by iteration. We obtain the result first established by PHILIP HALL (see also WEISNER and S. DELSARTE):
$$\mu_n(q) = (-1)^n q^{n(n-1)/2} = (-1)^n q^{\binom{n}{2}}.$$
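The closed formula of Example 2 and Weisner's identity can be checked by brute force for q = 2 and small n. The sketch below is a minimal illustration under assumptions: vectors over the two-element field are encoded as bitmask integers, subspaces as frozensets of such integers, and all helper names are hypothetical.

```python
from functools import lru_cache
from itertools import combinations


def mobius_function(elements, leq):
    """Build mu(x, y) for a finite poset from the recursion in Proposition 1."""
    elements = tuple(elements)

    @lru_cache(maxsize=None)
    def mu(x, y):
        if x == y:
            return 1
        if not leq(x, y):
            return 0
        return -sum(mu(x, z) for z in elements
                    if leq(x, z) and leq(z, y) and z != y)

    return mu


def span(vectors):
    """Linear span over the field with 2 elements; addition is bitwise XOR."""
    space = {0}
    for v in vectors:
        space |= {u ^ v for u in space}
    return frozenset(space)


def subspace_lattice(n):
    """All subspaces of the n-dimensional space over F_2, smallest first."""
    nonzero = range(1, 2 ** n)
    spaces = {span(c) for k in range(n + 1) for c in combinations(nonzero, k)}
    return sorted(spaces, key=len)


def mu_top(n):
    """mu(0, 1) of L(V) for dim V = n and q = 2, by direct recursion."""
    lattice = subspace_lattice(n)
    mu = mobius_function(lattice, lambda a, b: a <= b)
    return mu(lattice[0], lattice[-1])


# Weisner's identity (b) with b = 0 and a a plane in the 3-dimensional space.
lattice = subspace_lattice(3)
mu = mobius_function(lattice, lambda a, b: a <= b)
zero, top = lattice[0], lattice[-1]
plane = next(s for s in lattice if len(s) == 4)
weisner_sum = sum(mu(x, top) for x in lattice if x & plane == zero)

print([mu_top(n) for n in (1, 2, 3)], weisner_sum)
```

The meet of two subspaces is their intersection, which is why `x & plane == zero` selects exactly the subspaces with $x \wedge a = 0$.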
6. The Euler characteristic
Sharper results relating $\mu(0, 1)$ to combinatorial invariants of a finite lattice can be obtained by application of Theorem 1, when the "comparison set" P remains a Boolean algebra. A cross-cut C of a finite lattice L is a subset of L with the following properties:
(a) C does not contain 0 or 1.
(b) No two elements of C are comparable (that is, if x and y belong to C, then neither $x < y$ nor $x > y$ holds).
(c) Any maximal chain stretched between 0 and 1 meets the set C.
A spanning subset S of L is a subset such that $\bigvee S = 1$ and $\bigwedge S = 0$. The main result is the following Cross-cut Theorem:
Theorem 3. Let $\mu$ be the Möbius function and E the Euler characteristic of a nontrivial finite lattice L, and let C be a cross-cut of L. For every integer $k \ge 2$, let $q_k$ denote the number of spanning subsets of C containing k distinct elements. Then
$$E - 1 = \mu(0, 1) = q_2 - q_3 + q_4 - q_5 + \cdots.$$
The proof is by induction over the distance of a cross-cut C from the element 1. Define the distance $d(x)$ of an element x from the element 1 as the maximum length of a chain stretched between x and 1. For example, the distance of a dual atom is two. If C is a cross-cut of L, define the distance $d(C)$ as $\max d(x)$ as x ranges over C. Thus, the distance of the cross-cut consisting of all dual atoms is two, and conversely, this is the only cross-cut having distance two. It follows from Proposition 1 of Section 5 that the result holds when $d(C) = 2$ (take $R = C$ in the assertion of the Proposition). Thus, we shall assume the truth of the statement for all cross-cuts whose distance is less than n, and prove it for a cross-cut C with $d(C) = n$.
We write $x > C$ or $x \le C$ to mean that there is an element y of C such that $x > y$, or that there is an element y of C such that $x \le y$. For a general C, these possibilities may not be mutually exclusive; they are mutually exclusive when C is a cross-cut. We shall repeatedly make use of this remark below.
Define a modified lattice $L'$ as follows. Let $L'$ contain all the elements x such that $x \le C$ in the same order. On top of C, add an element 1 covering all the elements of C, but no others; this defines $L'$.
In $L'$, consider the cross-cut C and apply Proposition 1 of Section 5 again. If $\mu'$ is the Möbius function of $L'$, then
$$\mu'(0, 1) = p_2 - p_3 + p_4 - \cdots,$$
where $p_k$ is the number of all subsets $A \subset C \subset L'$ of k elements, such that $\bigwedge A = 0$.
Comparing the lattices L and $L'$, we have
$$0 = \sum_{x \le C} \mu(0, x) + \sum_{x > C} \mu(0, x) = \sum_{x \le C} \mu'(0, x) + \mu'(0, 1).$$
However, for $x \le C$, we have $\mu'(0, x) = \mu(0, x)$ by construction of $L'$. Hence
$$\sum_{x \le C} \mu(0, x) = -p_2 + p_3 - p_4 + \cdots.$$
Since the sets $(x \mid x \le C)$ and $(x \mid x > C)$ are disjoint, we can write
$$\mu(0, 1) = -\sum_{x < 1} \mu(0, x) = -\Big[\sum_{x \le C} \mu(0, x) + \sum_{1 > x > C} \mu(0, x)\Big].$$
We now simplify the first summation on the right:
$$(*) \qquad \mu(0, 1) = p_2 - p_3 + p_4 - \cdots - \sum_{1 > x > C} \mu(0, x).$$
Now let qk(X) be the number of subsets of C having k elements, whose meet is and whose join is x. In particular, qk(l) = qk. Then clearly· Pk = Iqt(x),
k;;;; 2,
z>O
t.he summation in (*) can be simplified to (**)
p(O, 1)
= (q2 -
qa
+ q4 -
+ ... + P (0, x)] .
... ) -
L [- q2(X) + qa(x) -
1>z>0
q4(X)
+
For $x$ above $C$ and unequal to 1, consider the segment $[0, x]$. We prove that $C(x) = C \cap [0, x]$ is a cross-cut of the lattice $[0, x]$ such that $d(C(x)) < d(C)$. Once this is done, it follows by the induction hypothesis that every term in brackets on the right of (**) vanishes, and the proof will be complete. Conditions (a) and (b) in the definition of a cross-cut are trivially satisfied by $C(x)$, and condition (c) is verified as follows. Suppose $Q$ is a maximal chain in $[0, x]$ which does not meet $C(x)$. Choose a maximal chain $R$ in the segment $[x, 1]$; then the chain $Q \cup R$ is maximal in $L$, and does not intersect $C$, which is impossible since $C$ is a cross-cut of $L$. It remains to verify that $d(C(x)) < d(C)$, and this is quite simple. There is a chain $Q$ stretched between $C$ and $x$ whose length is $d(C(x))$. Then $d(C)$ exceeds the length of the chain $Q \cup R$, and since $x < 1$, $R$ has length at least 2, hence the length of $Q \cup R$ exceeds that of $Q$ by at least one. The proof is therefore complete.

Theorem 3 gives a relation between the value $\mu(0, 1)$ and the width of narrow cross-cuts or bottlenecks of a lattice. The proof of the following statement is immediate.

Corollary 1. (a) If $L$ has a cross-cut with one element, then $\mu(0, 1) = 0$. (b) If $L$ has a cross-cut with two elements, then the only two possible values of $\mu(0, 1)$ are 0 and 1. (c) If $L$ has a cross-cut having three elements, then the only possible values of $\mu(0, 1)$ are 2, 1, 0 and -1.

In this connection, an interesting combinatorial problem is to determine all possible values of $\mu(0, 1)$, given that $L$ has a cross-cut with $n$ elements.
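The cross-cut theorem lends itself to a quick numerical check. The following Python sketch is an illustration only and not part of the original argument. It assumes a particular test lattice, the divisors of an integer n ordered by divisibility (meet = gcd, join = lcm), computes $\mu(0,1)$ from the defining recursion, and compares it with $q_2 - q_3 + q_4 - \cdots$ for two cross-cuts: the atoms (prime divisors) and the dual atoms. All function names are hypothetical.

```python
from functools import lru_cache, reduce
from itertools import combinations
from math import gcd


def divisors(n):
    """All positive divisors of n: the elements of the assumed test lattice."""
    return [d for d in range(1, n + 1) if n % d == 0]


def lcm(a, b):
    return a * b // gcd(a, b)


@lru_cache(maxsize=None)
def mobius(n):
    """mu(0, 1) of the divisor lattice of n, from sum_{d | n} mu(0, d) = 0 for n > 1."""
    if n == 1:
        return 1
    return -sum(mobius(d) for d in divisors(n) if d < n)


def atoms(n):
    """Prime divisors of n: the elements covering the bottom element 1."""
    return [d for d in divisors(n) if d > 1 and all(d % e for e in range(2, d))]


def coatoms(n):
    """Divisors n/p for p prime: the elements covered by the top element n."""
    return [n // p for p in atoms(n)]


def crosscut_sum(n, cut):
    """q_2 - q_3 + q_4 - ..., where q_k counts k-subsets of `cut`
    whose meet is the bottom (gcd 1) and whose join is the top (lcm n)."""
    total = 0
    for k in range(2, len(cut) + 1):
        q_k = sum(1 for s in combinations(cut, k)
                  if reduce(gcd, s) == 1 and reduce(lcm, s) == n)
        total += (-1) ** k * q_k
    return total


if __name__ == "__main__":
    # n must not be prime, so that atoms and dual atoms avoid 0 and 1.
    for n in (4, 6, 12, 30, 36, 60, 210):
        print(n, mobius(n), crosscut_sum(n, atoms(n)), crosscut_sum(n, coatoms(n)))
```

In this test lattice the three printed values agree for each n, which is what the theorem predicts for any choice of cross-cut.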
Z. Wahrscheinlichkeitstheorie, Bd. 2
GIAN·CARLO ROTA:
Reduction of the main formula. In several applications of the cross-cut theorem, the computation of the number $q_k$ of spanning sets may be long, and systematic procedures have to be devised. One such procedure is the following:

Proposition 1. Let $C$ be a cross-cut of a finite lattice $L$. For every integer $k \ge 0$, and for every subset $A \subset C$, let $q(A)$ be the number of spanning sets containing $A$, and let $S_k = \sum_A q(A)$, where $A$ ranges over all subsets of $C$ having $k$ elements. Set $S_0$ to be the number of elements of $C$. Then

$$\mu(0, 1) = S_0 - 2S_1 + 2^2 S_2 - 2^3 S_3 + \cdots.$$
Proof. For every subset $B \subset C$, set $p(B) = 1$ if $B$ is a spanning set, and $p(B) = 0$ otherwise. Then

$$q(A) = \sum_{C \supseteq B \supseteq A} p(B).$$

Applying the Möbius inversion formula on the Boolean algebra of subsets of $C$, we get

$$p(A) = \sum_{B \supseteq A} q(B)\,\mu(A, B),$$

where $\mu$ is the Möbius function of the Boolean algebra. Summing over all subsets $A \subset C$ having exactly $k$ elements,

$$q_k = \sum_{n(A)=k} p(A) = \sum_{n(A)=k}\;\sum_{B \supseteq A} q(B)\,\mu(A, B).$$
Interchanging the order of summation on the right, recalling Proposition 5 of Section 3 and the fact that a set of $k + l$ elements possesses $\binom{k+l}{k}$ subsets of $k$ elements, we obtain

(*)
$$q_k = S_k - \binom{k+1}{k} S_{k+1} + \binom{k+2}{k} S_{k+2} - \cdots.$$

A convenient way of recasting this expression in a form suitable for computation is the following. Let $V$ be the vector space of all polynomials in the variable $x$, over the real field. The polynomials $1, x, x^2, \ldots$ are linearly independent in $V$. Hence there exists a linear functional $L$ on $V$ such that

$$L(x^k) = S_k, \qquad k = 0, 1, 2, \ldots.$$

Formula (*) can now be rewritten in the concise form

$$q_k = L\!\left(\frac{x^k}{(1+x)^{k+1}}\right).$$

Upon applying the cross-cut theorem, we find the expression (where $q_0$ and $q_1$ are also given by (*), but turn out to be 0)

$$\mu(0,1) = L\!\left(\frac{1}{1+x} - \frac{x}{(1+x)^2} + \frac{x^2}{(1+x)^3} - \cdots\right) = L\!\left(\frac{1}{1+2x}\right) = L(1 - 2x + 4x^2 - 8x^3 + \cdots) = S_0 - 2S_1 + 4S_2 - \cdots,$$

q.e.d.
On the Foundations of Combinatorial Theory. I
The cross-cut theorem can be applied to study which alterations of the order relation of a lattice preserve the Euler characteristic. Every alteration which preserves meets and joins of the spanning subsets of some cross-cut will preserve the Euler characteristic. There is a great variety of such changes, and we shall not develop a systematic theory here. The following is a simple case. Following BIRKHOFF and JÓNSSON and TARSKI we define the ordinal sum of lattices as follows. Given a lattice $L$ and a function assigning to every element $x$ of $L$ a lattice $L(x)$ (all the $L(x)$ are distinct), the ordinal sum $P = \sum_L L(x)$ of the lattices $L(x)$ over the lattice $L$ is the partially ordered set $P$ consisting of the set $\bigcup_{x \in L} L(x)$, where $u \le v$ if $u \in L(x)$ and $v \in L(x)$ and $u \le v$ in $L(x)$, or if $u \in L(x)$ and $v \in L(y)$ and $x < y$. It is clear that $P$ is a lattice if all the $L(x)$ are finite lattices.

Proposition 2. If the finite lattice $P$ is the ordinal sum of the lattices $L(x)$ over the non-trivial lattice $L$, and $\mu_P$, $\mu_x$ and $\mu_L$ are the corresponding Möbius functions, then: If $L(0)$ is the one element lattice, then $\mu_P(0, 1) = \mu_L(0, 1)$.

Proof. The atoms of $P$ are in one-to-one correspondence with the atoms of $L$ and the spanning subsets are the same. Hence the result follows by applying the cross-cut theorem to the atoms.

In virtue of a theorem of JÓNSSON and TARSKI, every lattice $P$ has a unique maximal decomposition into an ordinal sum over a "skeleton" $L$. This can be used in connection with the preceding Corollary to further simplify the computation of $\mu(0, n)$ as $n$ ranges through $P$.

Homological interpretation. The alternating sums in the Cross-Cut Theorem suggest that the Euler characteristic of a lattice be interpreted as the Euler characteristic in a suitable homology theory. This is indeed the case. We now define* a homology theory $H(C)$ relative to an arbitrary cross-cut $C$ of a finite lattice $L$. For the homological notions, we refer to Eilenberg-Steenrod. Order the elements of $C$, say $a_1, a_2, \ldots, a_n$. For $k \ge 0$, let a $k$-simplex $\sigma$ be any subset of $C$ of $k + 1$ elements which does not span. Let $C_k$ be the free abelian group generated by the $k$-simplices. We let $C_{-1} = 0$; for a given simplex $\sigma$, let $\sigma_i$ be the set obtained by omitting the $(i + 1)$-st element of $\sigma$, when the elements of $\sigma$ are ordered according to the given ordering of $C$. The boundary of a $k$-simplex is defined as usual as

$$\partial_k \sigma = \sum_{i=0}^{k} (-1)^i \sigma_i,$$

and is extended by linearity to all of $C_k$, giving a linear mapping of $C_k$ into $C_{k-1}$. The $k$-th homology group $H_k$ is defined as the abelian group obtained by taking the quotient of the kernel of $\partial_k$ by the image of $\partial_{k+1}$. The rank $b_k$ of the abelian group $H_k$, that is, the number of independent generators of infinite cyclic subgroups of $H_k$, is the $k$-th Betti number. Let $\alpha_k$ be the rank of $C_k$, that is, the number of $k$-simplices. The Euler characteristic of the homology $H(C)$ is defined in homology theory as

$$E(C) = \sum_{k=0}^{\infty} (-1)^k \alpha_k.$$
* This definition was obtained jointly with D. KAN, F. PETERSON and G. WHITEHEAD, whom I now wish to thank.
It follows from well-known results in homology theory that

$$E(C) = \sum_{k=0}^{\infty} (-1)^k b_k.$$

Let $q_k$ be the number of spanning subsets with $k$ elements as in Theorem 3. Then $q_{k+1} + \alpha_k$ is the total number of subsets of $C$ having $k + 1$ elements; if $C$ has $N$ elements, then $\alpha_k = \binom{N}{k+1} - q_{k+1}$. It follows from the Cross-Cut Theorem that

$$E(C) = \sum_{k=0}^{\infty} (-1)^k \binom{N}{k+1} + \mu(0, 1).$$

We have however

$$\sum_{k=0}^{\infty} (-1)^k \binom{N}{k+1} = -\sum_{i=1}^{N} (-1)^i \binom{N}{i} = 1 - \sum_{i=0}^{N} (-1)^i \binom{N}{i} = 1 - (1 - 1)^N = 1,$$

and hence $E(C) = 1 + \mu(0, 1) = E$; in other words:

Proposition 3. In a finite lattice, the Euler characteristic of the homology of any cross-cut $C$ equals the Euler characteristic of the lattice.
This result can sometimes be used to compute the Möbius functions of "large" lattices. In general, the numbers $q_k$ are rather redundant, since any spanning subset of $k$ elements gives rise to several spanning subsets with more than $k$ elements. A method for eliminating redundant spanning sets is then called for. One such method consists precisely in the determination of the Betti numbers $b_k$. We conjecture that the Betti numbers of $H(C)$ are themselves independent of the cross-cut $C$, and are also "invariants" of the lattice $L$, like the Euler characteristic $E(C)$. In the special case of lattices of height 4 satisfying the chain condition, this conjecture has been proved (in a different language) by DOWKER.

Example 1. The Betti numbers of a Boolean algebra. We take the cross-cut $C$ of all atoms. If the height of the Boolean algebra is $n + 1$, then every $k$-cycle, for $k < n - 2$, bounds, so that $b_0 = 1$ and $b_k = 0$ for $0 < k < n - 2$. On the other hand, there is only one cycle in dimension $n - 2$. Hence $b_{n-2} = 1$ and we find $E = 1 + (-1)^{n-2}$, which agrees with Proposition 5 of Section 3.

A notion of Euler characteristic for distributive lattices has been recently introduced by HADWIGER and KLEE. For finite distributive lattices, KLEE's Euler characteristic is related to the one introduced in this work. We refer to KLEE's paper for details.
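The Betti numbers of Example 1 can be computed mechanically for small cases. The sketch below is illustrative and rests on stated assumptions: the Boolean algebra is modelled as the subsets of {0, ..., n-1}, so that n is the number of atoms (taken here to match the n of the example), a set of atoms fails to span exactly when it is a proper non-empty subset, and ranks are taken over the rationals. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def rank(rows):
    """Rank of a rational matrix (list of rows) by Gaussian elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        piv = next((i for i in range(rk, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[rk], m[piv] = m[piv], m[rk]
        for i in range(rk + 1, len(m)):
            f = m[i][c] / m[rk][c]
            if f:
                m[i] = [a - f * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk


def betti_numbers_boolean(n):
    """Betti numbers b_0, ..., b_{n-2} of H(C), C = the n atoms (n >= 2)."""
    # k-simplices: the (k+1)-subsets of atoms that do not span,
    # i.e. all proper non-empty subsets of the atom set.
    simplices = [list(combinations(range(n), k + 1)) for k in range(n - 1)]
    ranks = [0]  # the boundary map on C_0 is zero because C_{-1} = 0
    for k in range(1, n - 1):
        index = {s: i for i, s in enumerate(simplices[k - 1])}
        rows = []
        for s in simplices[k]:
            row = [0] * len(simplices[k - 1])
            for i in range(k + 1):
                face = s[:i] + s[i + 1:]       # omit the (i+1)-st element
                row[index[face]] = (-1) ** i   # sign from the boundary formula
            rows.append(row)
        ranks.append(rank(rows))
    ranks.append(0)  # there are no (n-1)-simplices: the full atom set spans
    return [len(simplices[k]) - ranks[k] - ranks[k + 1] for k in range(n - 1)]


if __name__ == "__main__":
    for n in (3, 4, 5):
        b = betti_numbers_boolean(n)
        euler = sum((-1) ** k * bk for k, bk in enumerate(b))
        print(n, b, euler)
```

For n = 2 the dimensions 0 and n - 2 coincide, so the single Betti number collects both contributions; the example is most readable for n at least 3.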
7. Geometric lattices

An ordered structure of very frequent occurrence in combinatorial theory is the one that has been variously called matroid (WHITNEY), matroid lattice (BIRKHOFF), closure relation with the exchange property (MACLANE), geometric lattice (BIRKHOFF), abstract linear dependence relation (BLEICHER and PRESTON). Roughly speaking, these structures arise in the study of combinatorial objects that are obtained by piecing together smaller objects with a particularly simple structure. The typical such case is a linear graph, which is obtained by piecing together edges. Several counting problems associated with such structures can often be attacked by Möbius inversion, and one finds that the Möbius functions involved have particularly simple properties. We briefly summarize the needed facts out of the theory of such structures, referring to any of the works of the above authors for the proofs.

A finite lattice $L$ is a geometric lattice when every element of $L$ is the join of atoms, and whenever $a$ and $b$ in $L$ cover $a \wedge b$, then $a \vee b$ covers both $a$ and $b$. Equivalently, a geometric lattice is characterized by the existence of a rank function satisfying $r(a \wedge b) + r(a \vee b) \le r(a) + r(b)$. Notice that this implies the chain condition. In particular if $a$ is an atom, then $r(a \vee c) = r(c)$ or $r(c) + 1$. If $M$ is a semimodular lattice, then the partially ordered subset of all elements which are joins of atoms is a geometric sublattice. Geometric lattices are most often obtained from a closure relation on a finite set which satisfies the MACLANE-STEINITZ exchange property. The lattice $L$ of closed sets in such a closure relation is a geometric lattice whenever every one-element set is closed. Conversely, every geometric lattice can be obtained in this way by defining one such closure relation on the set of its atoms. The fundamental property of the Möbius function of geometric lattices is the following:

Theorem 4. Let $\mu$ be the Möbius function of a finite geometric lattice $L$. Then: (a) $\mu(x, y) \ne 0$ for any pair $x, y$ in $L$, provided $x \le y$. (b) If $y$ covers $z$, then $\mu(x, y)$ and $\mu(x, z)$ have opposite signs.

Proof. Any segment $[x, y]$ of a geometric lattice is also a geometric lattice. It will therefore suffice to assume that $x = 0$, $y = 1$ and that $z$ is a dual atom of $L$. We proceed by induction on the height $n$ of $L$. For an atom $a$ of $L$ we have

$$\mu(0, 1) = -\sum_{x \vee a = 1,\; x \ne 1} \mu(0, x).$$

Now from the subadditive inequality

$$r(x \wedge a) + r(x \vee a) \le r(x) + r(a)$$

we infer that if $x \vee a = 1$, then $n \le \dim x + \dim a$, hence $\dim x \ge n - 1$. The element $x$ must therefore be a dual atom. It follows from the induction assumption and from the fact that $L$ satisfies the chain condition, that all the $\mu(0, x)$ in the sum on the right have the same sign, and none of them is zero. Therefore, $\mu(0, 1)$ is not zero, and its sign is the opposite of that of $\mu(0, x)$ for any dual atom $x$. This concludes the proof.
Corollary. The coefficients of the characteristic polynomial of a geometric lattice alternate in sign.

We next derive a combinatorial interpretation of the Euler characteristic of a geometric lattice, which generalizes a technique first used by WHITNEY in the study of linear graphs. A subset $\{a, b, \ldots, c\}$ of a geometric lattice $L$ is independent when

$$r(a \vee b \vee \cdots \vee c) = r(a) + r(b) + \cdots + r(c).$$

Let $C_k$ be the cross-cut of $L$ of all elements of rank $k > 0$. A maximal independent subset $\{a, b, \ldots, c\} \subset C_k$ is a basis of $C_k$. All bases of $C_k$ have the same number of elements, namely, $n - k$ if the lattice has height $n$. A subset $A \subset C_k$ is a circuit (WHITNEY) when it is not independent but every proper subset is independent. A set is independent if and only if it contains no circuits. Order the elements of $L$ of rank $k$ in a linear order, say $a_1, a_2, \ldots, a_l$. This ordering induces a lexicographic ordering of the circuits of $C_k$. If the subset $\{a_{i_1}, a_{i_2}, \ldots, a_{i_j}\}$ ($i_1 < i_2 < \cdots < i_j$) is a circuit, the subset $a_{i_1}, a_{i_2}, \ldots, a_{i_{j-1}}$ will be called a broken circuit.

Proposition 1. Let $L$ be a geometric lattice of height $n + 1$, and let $C_k$ be the cross-cut of all elements of rank $k$. Then $\mu(0, 1) = (-1)^n m_k$, where $m_k$ is the number of subsets of $C_k$ whose meet is 0, containing $n - k + 1$ elements each, and not containing all the arcs of any broken circuit. Again, the assertion implies that $m_1 = m_2 = m_3 = \cdots$.

Proof. Let the lexicographically ordered broken circuits be $P_1, P_2, \ldots, P_\sigma$, and let $S_i$ be the family of all spanning subsets of $C_k$ containing $P_i$ but not $P_1, P_2, \ldots,$ or $P_{i-1}$. In particular, $S_{\sigma+1}$ is the family of all those spanning subsets not containing all the arcs of any broken circuit. Let $q_j^i$ be the number of spanning subsets of $j$ elements and not belonging to $S_i$. We shall prove that for each $i \ge 1$

(*)
$$\mu(0, 1) = q_2^i - q_3^i + q_4^i - \cdots.$$

First, set $i = 1$. The set $S_1$ contains all spanning subsets containing the broken circuit $P_1$. Let $\bar{P}_1$ be the circuit obtained by completing the broken circuit $P_1$. A spanning set contained in $S_1$ contains either $\bar{P}_1$ or else $P_1$ but not $\bar{P}_1$; call these two families of spanning subsets $A$ and $B$, and let $q_j^A$ and $q_j^B$ be defined accordingly. Then $q_j = q_j^1 + q_j^A + q_j^B$, and

$$\mu(0, 1) = q_2 - q_3 + q_4 - \cdots = q_2^1 - q_3^1 + q_4^1 - \cdots + q_2^A + (q_2^B - q_3^A) - (q_3^B - q_4^A) + \cdots.$$

Now, $q_2^A = 0$, because no circuit can contain two elements; there is a one-to-one correspondence between the elements of $A$ and those of $B$, obtained by completing the broken circuit $P_1$. Thus, all terms in parentheses cancel and the identity (*) holds for $i = 1$. To prove (*) for $i > 1$, remark that the element $c_i$ of $C_k$, which is dropped from a circuit to obtain the broken circuit $P_i$, does not occur in any of the previous circuits, because of the lexicographic ordering of the circuits. Hence the induction can be continued up to $i = \sigma + 1$.
Any set belonging to $S_{\sigma+1}$ does not contain any circuit. Hence, it is an independent set. Since it is a spanning set, it must contain $n - k + 1$ elements. Thus, all the integers $q_j^{\sigma+1}$ vanish except $q_{n-k+1}^{\sigma+1}$ and the statement follows from (*), q.e.d.

Corollary 1. Let $q(\lambda) = \lambda^n + m_1\lambda^{n-1} + m_2\lambda^{n-2} + \cdots + m_n$ be the characteristic polynomial of a geometric lattice of height $n + 1$. Then $(-1)^k m_k$ is a positive integer for $1 \le k \le n$, equal to the number of independent subsets of $k$ atoms not containing any broken circuit.

The proof is immediate: take $k = 1$ in the preceding Proposition. The homology of a geometric lattice is simpler than that of a general lattice:

Proposition 2. In the homology relative to the cross-cut $C_k$ of all elements of rank $k \ge 1$, the Betti numbers $b_1, b_2, \ldots, b_{k-2}$ vanish.
The proof is not difficult.

Example 1. Partitions of a set. Let $S$ be a finite set of $n$ elements. A partition $\pi$ of $S$ is a family of disjoint subsets $B_1, B_2, \ldots, B_k$, called blocks, whose union is $S$. There is a (well-known) natural ordering of partitions, which is defined as follows: $\pi \le \sigma$ whenever every block of $\pi$ is contained in a block of partition $\sigma$. In particular, 0 is the partition having $n$ blocks, and 1 is the partition having one block. In this ordering, the partially ordered set of partitions is a geometric lattice (cf. BIRKHOFF). The Möbius function for the lattice of partitions was first determined by SCHÜTZENBERGER and independently by ROBERTO FRUCHT and the author. We give a new proof which uses a recursion. If $\pi$ is a partition, the class of $\pi$ is the (finite) sequence $(k_1, k_2, \ldots)$, where $k_i$ is the number of blocks with $i$ elements.

Lemma. Let $L_n$ be the lattice of partitions of a set with $n$ elements. If $\pi \in L_n$ is of rank $k$, then the segment $[\pi, 1]$ is isomorphic to $L_{n-k}$. If $\pi$ is of class $(k_1, k_2, \ldots)$, then the segment $[0, \pi]$ is isomorphic to the direct product of $k_1$ lattices isomorphic to $L_1$, $k_2$ lattices isomorphic to $L_2$, etc.

The proof is immediate. It follows from the Lemma that if $[x, y]$ is a segment of $L_n$, then it is isomorphic to a product of $k_i$ lattices isomorphic to $L_i$, $i = 1, 2, \ldots$. We call the sequence $(k_1, k_2, \ldots)$ the class of the segment $[x, y]$.
Proposition 3. Let $\mu_n = \mu(0, 1)$ for the lattice of partitions of a set with $n$ elements. Then

$$\mu_n = (-1)^{n-1}(n - 1)!.$$

Proof. By the Corollary to Proposition 4 of Section 5,

$$\sum_{x \wedge a = 0} \mu(x, 1) = 0.$$

Let $a$ be the dual atom consisting of a block $C_1$ containing $n - 1$ points, and a second block $C_2$ containing one point. Which non-zero partitions $x$ have the property that $x \wedge a = 0$? Let the blocks of such a partition $x$ be $B_1, \ldots, B_k$. None of the blocks $B_i$ can contain two distinct points of the block $C_1$, otherwise the two points would still belong to the same block in the intersection. Furthermore, only one of the $B_i$ can contain the block $C_2$. Hence, all the $B_i$ contain one point, except one, which contains $C_2$ and an extra point. We conclude that $x$ must be an atom, and there are $n - 1$ such atoms. Hence,

$$\mu_n = \mu(0, 1) = -\sum_x \mu(x, 1),$$

where $x$ ranges over a set of $n - 1$ atoms. By the Lemma, the segment $[x, 1]$ is isomorphic to the lattice of partitions of a set with $n - 1$ elements, hence $\mu_n = -(n - 1)\mu_{n-1}$. Since $\mu_2 = -1$, the conclusion follows.

Corollary. If the segment $[x, y]$ is of class $(k_1, k_2, \ldots, k_n)$, then

$$\mu(x, y) = \mu_1^{k_1}\mu_2^{k_2}\cdots\mu_n^{k_n} = (-1)^{k_1 + k_2 + \cdots + k_n - n}\,(2!)^{k_3}(3!)^{k_4}\cdots((n - 1)!)^{k_n}.$$
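Proposition 3 is easy to confirm by brute force for small n. The following Python sketch is an illustration, not the author's method: it enumerates all partitions of {0, ..., n-1}, orders them by refinement, and evaluates $\mu(0,1)$ directly from the defining recursion of the Möbius function. The function names and the canonical tuple encoding of partitions are assumptions made for this example.

```python
from functools import lru_cache
from math import factorial


def set_partitions(elements):
    """Yield every partition of a sorted list as a sorted tuple of sorted-tuple blocks."""
    if not elements:
        yield ()
        return
    first, rest = elements[0], elements[1:]
    for p in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(p)):
            yield tuple(sorted(p[:i] + ((first,) + p[i],) + p[i + 1:]))
        # ... or give it a block of its own.
        yield tuple(sorted(((first,),) + p))


def refines(p, q):
    """True when every block of p is contained in a block of q, i.e. p <= q."""
    return all(any(set(b) <= set(c) for c in q) for b in p)


def mobius_partition_lattice(n):
    """mu(0, 1) in the lattice of partitions of an n-element set (n >= 1)."""
    parts = list(set_partitions(list(range(n))))
    bottom = tuple((i,) for i in range(n))   # n singleton blocks
    top = (tuple(range(n)),)                 # one block

    @lru_cache(maxsize=None)
    def mu(p):
        # mu(0, 0) = 1 and, for p > 0, the sum of mu(0, x) over 0 <= x <= p is 0.
        if p == bottom:
            return 1
        return -sum(mu(x) for x in parts if x != p and refines(x, p))

    return mu(top)


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, mobius_partition_lattice(n), (-1) ** (n - 1) * factorial(n - 1))
```

The enumeration grows with the Bell numbers, so the check is practical only for small n; it is meant to make the recursion concrete rather than to replace the proof.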
The Möbius inversion formula on the partitions of a set has several combinatorial applications; see the author's expository paper on the subject.
8. Representations

There is, as is well known, a close analogy between combinatorial results relating to Boolean algebras and those relating to the lattice of subspaces of a vector space. This analogy is displayed for example in the theory of q-difference equations developed by F. H. JACKSON, and can be noticed in many number-theoretic investigations. In view of it, we are led to surmise that a result analogous to Proposition 1 of Section 5 exists, in which the Boolean algebra of subsets of R is replaced by a lattice of subspaces of a vector space over a finite field. Such a result does indeed exist; in order to establish it a preliminary definition is needed.

Let $L$ be a finite lattice, and let $V$ be a finite-dimensional vector space over a finite field with $q$ elements. A representation of $L$ over $V$ is a monotonic map $p$ of $L$ into the lattice $M$ of subspaces of $V$, having the following properties:

(1) $p(0) = 0$.
(2) $p(a \vee b) = p(a) \vee p(b)$.
(3) Each atom of $L$ is mapped to a line of the vector space $V$, and the set of lines thus obtained spans the entire space $V$.

A representation is faithful when the mapping $p$ is one-to-one. We shall see in Section 9 that a great many ordered structures arising in combinatorial problems admit faithful representations.

Given a representation $p: L \to M$, one defines the conjugate map $q: M \to L$ as follows. Let $K$ be the set of atoms of $M$ (namely, lines of $V$), and let $A$ be the image under $p$ of the set of atoms of $L$. For $s \in M$, let $K(s)$ be the set of atoms of $M$ dominated by $s$, and let $B(s)$ be a minimal subset of $A$ which spans (in the vector space sense) every element of $K(s)$. Let $A(s)$ be the subset of $A$ which is spanned by $B(s)$. A simple vector-space argument, which is here omitted, shows that the set $A(s)$ is well defined, that is, that it does not depend upon the choice of $B(s)$, but only upon the choice of $s$. Let $C(s)$ be the set of atoms of $L$ which are mapped by $p$ onto $A(s)$. Set $q(s) = \bigvee C(s)$ in the lattice $L$; this defines the map $q$. It is obviously a monotonic function.

Lemma. Let $p: L \to M$ be a faithful representation and let $q: M \to L$ be the conjugate map. Assume that every element of $L$ is a join of atoms. Then $p(q(s)) \ge s$ and $q(p(x)) \le x$.

Proof. By definition, $q(s) = \bigvee C(s)$, where $C(s)$ is the inverse image of $A(s)$ under $p$. By property (2) of a representation, $p(q(s)) = p(\bigvee C(s)) = \bigvee p(C(s)) = \bigvee A(s)$.
But this join of the set of lines $A(s)$ in the lattice $M$ is the same as their span in the vector space $V$. Hence $\bigvee A(s) \ge s$, and we conclude that $p(q(s)) \ge s$. To prove that $q(p(x)) \le x$, it suffices to show that $A(p(x)) = B$, where $B$ is the set of atoms in $A$ dominated by $p(x)$. Clearly $B \subset A(p(x))$, and it will suffice to establish the converse implication. By (2), and by the fact that $x$ is a join of atoms, we have $p(x) = \bigvee B$. Therefore every line $l$ dominated by $p(x)$ is spanned by a subset of $B$. If in addition $l \in A$, then $l \le \bigvee C$ for some subset $C \subset B$, hence $l \in B$. This shows $B \supset A(p(x))$, q.e.d.

Theorem 5. Let $L$ be a finite lattice, where every element is a join of atoms, let $p: L \to M$ be a faithful representation of $L$ into the lattice $M$ of subspaces of a vector space $V$ over a finite field with $q$ elements, and let $q: M \to L$ be the conjugate map. For every $k \ge 2$, let $m_k$ be the number of $k$-dimensional subspaces $s$ of $V$ such that $q(s) = 1$. Then

(*)

where $\mu$ is the Möbius function of $L$.

Proof. Let $Q = L^*$, let $c: L \to Q$ and $c^*: Q \to L$ be the canonical isomorphisms between $L$ and $Q$. Define $\pi: Q \to M$ as $\pi = pc^*$, and $\rho: M \to Q$ as $\rho = cq$. We verify that $\pi$ and $\rho$ give a Galois connection between $Q$ and $M$ satisfying the hypothesis of Theorem 1. If $\pi(x) = 0$, then there is a $y \in L$ such that $y = c^*(x)$ and $p(y) = 0$. It follows from the definition of a representation that $y = 0$. Hence $x = c(y) = 1$. Furthermore, $\rho(0) = c(q(0)) = 1$. It follows from the preceding Lemma that $\pi$ and $\rho$ are a Galois connection. Applying Theorem 1 and the result of Example 2 of Section 5, formula (*) follows at once.

Remark. It is easy to see that every lattice having a faithful representation is a geometric lattice. The converse is however not true, as an example of T. LAZARSON shows. A reduction similar to that of Proposition 1 of Section 7 can be carried out with Theorem 5 and representations, and another combinatorial property of the Euler characteristic is obtained.
9. The coloring of graphs

By way of illustration of the preceding theory, we give some applications to the classic problem of coloring of graphs, and to the problem of constructing flows in networks with specified properties. Our results extend previous work of G. D. BIRKHOFF, D. C. LEWIS, W. T. TUTTE and H. WHITNEY.

A linear graph $G = (V, E)$ is a structure consisting of a finite set $V$, whose elements are called vertices, together with a family $E$ of two-element subsets of $V$, called edges. Two vertices $a$ and $b$ are adjacent when the set $(a, b)$ is an edge; the vertices $a$ and $b$ are called the endpoints of $(a, b)$. Alternately, one calls the vertices regions and calls the graph a map, and we use the two terms interchangeably, considering them as two words for the same object. If $S$ is a set of edges, the vertex set $V(S)$ consists of all vertices which are incident to some edge in $S$. A set of edges $S$ is connected when in any partition $S = A \cup B$ into disjoint non-empty sets $A$ and $B$, the vertex sets $V(A)$ and $V(B)$ are not disjoint. Every set of edges is the union of disjoint connected blocks.
The bond closure on a graph $G = (V, E)$ is a closure relation defined on the set $E$ of edges as follows. If $S \subset E$, let $\bar{S}$ be the set of all edges both of whose endpoints belong to one and the same block of $S$. Every set consisting of a single edge is closed, and these are the only minimal non-empty closed sets.

Lemma 1. The bond closure $S \to \bar{S}$ has the exchange property.

Proof. Suppose $e$ and $f$ are edges, $S \subset E$, and $e \in \overline{S \cup f}$ but $e \notin \bar{S}$. Then every endpoint of $e$ which is not in $V(S)$ is an endpoint of $f$; on the other hand, $S$ and $f$ have at least one point in common, otherwise $e \in \bar{S}$. Thus both $e$ and $f$ either connect the same two blocks of $S$, or else they have one endpoint in $S$ and one common endpoint; hence $f \in \overline{S \cup e}$, q.e.d.

The lattice $L = L(G)$ of bond-closed subsets of $E$ is called the bond lattice of the graph $G$. Suppose that $E$ has $n$ blocks and $p(\lambda)$ is the characteristic polynomial of $L$, then the polynomial $\lambda^n p(\lambda)$ is the chromatic polynomial of the graph $G$, first studied by G. D. BIRKHOFF. From Theorem 4 we infer at once the theorem of WHITNEY that the coefficients of the chromatic polynomial alternate in sign.

The chromatic polynomial has the following combinatorial interpretation. Let $C$ be a set of $\lambda$ elements, called colors. A function $f: V \to C$ is a proper coloring of the graph, when no two adjacent vertices are assigned the same color. To every coloring $f$ (not necessarily proper) there corresponds a subset of $E$, the bond of $f$, defined as the set of all edges whose endpoints are assigned the same color by $f$. The bond of $f$ is a closed set of edges. For every closed set $S$, let $p(\lambda, S)$ be the number of colorings whose bond is $S$. Then we shall prove that $p(\lambda, S) = \lambda^n q(\lambda, S)$, where $q(\lambda, S)$ is the characteristic polynomial of the segment $[S, 1]$ in the lattice $L$. Since every coloring has a bond, $\sum_{T \ge S} p(\lambda, T)$ equals the total number of colorings having some bond $T \ge S$. But this number is evidently $\lambda^{k - r(S)}$, where $k$ is the number of vertices of the graph and $r(S)$ is the rank of $S$ in $L$. Applying the Möbius inversion formula, we obtain

(*)
$$p(\lambda) = p(\lambda, 0) = \sum_{T \in L} \lambda^{k - r(T)}\,\mu(0, T).$$

But the number of colorings whose bond is the null set 0 is exactly the number of proper colorings. WHITNEY's evaluation (cf. A logical expansion in Mathematics) of the chromatic polynomials of a graph in terms of the number of subgraphs of $s$ edges and $p$ connected components is an immediate consequence of the cross-cut theorem applied to the atoms of the bond-lattice of $G$. This result of WHITNEY's can now be sharpened in two directions: first, a cross-cut other than that of the atoms can be taken; secondly, the computation of the coefficients of the chromatic polynomial can be simplified by Proposition 1 of Section 8. The cross-cut of all elements of rank 2 is particularly suited for computation, and can be programmed. The interested reader may wish to explicitly translate the cross-cut theorem and the results of Section 8 into the geometric language of graphs.

Example 1. For a complete graph on $n$ vertices, where every two-element subset is an edge, the bond-lattice is isomorphic to the lattice of partitions of a set with $n$ elements. The chromatic polynomial is evidently $(\lambda)_n = \lambda(\lambda - 1)\cdots(\lambda - n + 1)$, and the coefficients $s(n, k)$ are the Stirling numbers of the first kind.
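Formula (*) can be tested directly on small graphs. The sketch below is an illustration under stated assumptions: vertices are numbered 0, ..., k-1, edges are given as pairs, the bond lattice is built by closing every subset of edges (feasible only for a handful of edges), and the exponent $k - r(T)$ is computed as the number of connected components of the graph $(V, T)$, isolated vertices included. The function names are hypothetical; the result is compared with a brute-force count of proper colorings.

```python
from functools import lru_cache
from itertools import combinations, product


def components(n_vertices, edges):
    """Label each vertex by a representative of its component in (V, edges)."""
    parent = list(range(n_vertices))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in edges:
        parent[find(a)] = find(b)
    return [find(v) for v in range(n_vertices)]


def bond_closure(n_vertices, all_edges, subset):
    """All edges of the graph whose two endpoints lie in one block of `subset`."""
    comp = components(n_vertices, subset)
    return frozenset(e for e in all_edges if comp[e[0]] == comp[e[1]])


def chromatic_via_mobius(n_vertices, all_edges, lam):
    """Evaluate sum over closed T of lam^(k - r(T)) * mu(0, T)."""
    closed = {bond_closure(n_vertices, all_edges, s)
              for r in range(len(all_edges) + 1)
              for s in combinations(all_edges, r)}

    @lru_cache(maxsize=None)
    def mu(t):
        # Moebius function of the bond lattice, from its defining recursion.
        if not t:
            return 1
        return -sum(mu(s) for s in closed if s < t)

    total = 0
    for t in closed:
        blocks = len(set(components(n_vertices, t)))  # equals k - r(T)
        total += lam ** blocks * mu(t)
    return total


def chromatic_brute_force(n_vertices, all_edges, lam):
    """Count the proper colorings with lam colors by exhaustive enumeration."""
    return sum(all(c[a] != c[b] for a, b in all_edges)
               for c in product(range(lam), repeat=n_vertices))


if __name__ == "__main__":
    square = [(0, 1), (1, 2), (2, 3), (0, 3)]
    for lam in range(1, 5):
        print(lam, chromatic_via_mobius(4, square, lam),
              chromatic_brute_force(4, square, lam))
```

Enumerating all edge subsets is the simplest way to obtain the closed sets but is exponential in the number of edges; the cross-cut reductions discussed in the text are aimed at exactly this cost.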
Thus,

$$\sum_{r(\pi) = k} \mu(0, \pi) = s(n, k).$$

This gives a combinatorial interpretation to the Stirling numbers of the first kind.

For a map $m$ embedded in the plane, where regions and boundaries have their natural meaning and no region bounds with itself, one obtains an interesting geometric result by applying the cross-cut theorem to the dual atoms of the bond lattice $L(m)$. Let $m$ be a connected map in the plane; without loss of generality we can assume: (a) that all the regions of $m$, except one which is unbounded, lie inside a convex polygon, the outer boundary of $m$; (b) that all boundaries are segments of straight lines. The dual graph of $m$ is the linear graph made up of the boundaries of $m$. A circuit in a linear graph is defined as a simple closed curve contained in the graph. We give an expression of the polynomial $P(\lambda, m)$ in terms of the circuits of the dual graph. The outer boundary is always a circuit. A set of circuits of a map $m$ in the plane spans, when their union (in the set-theoretic sense) is the entire boundary of $m$.

Proposition 1. For every integer $k \ge 1$, let $C_k$ be the number of spanning sets of $k$ distinct circuits of a map $m$ in the plane. Then
Proof. If the map has two regions, then $C_1 = 1$ and all other $C_i = 0$, so the result is trivial. Assume now that $m$ has at least 3 regions. Then $C_1 = 0$. All we have to prove is that the integers $C_k$ are the integers $q_k$ of Theorem 3, relative to the cross-cut of $L(m)$ consisting of all the dual atoms. By the Jordan curve theorem, every circuit divides the plane into two regions: this gives a one-to-one correspondence of the circuits with the dual atoms of $L(m)$. Conversely, because we can assume that the map is of the special type described above, every dual atom in $L(m)$ is a map with two connected regions, and so must have as a boundary a simple closed curve, q.e.d.

It has been shown by RICHARD RADO (p. 312) that the bond-lattice $L(G)$ of any linear graph $G$ has a faithful representation. Accordingly, Theorem 5 can also be applied to obtain expressions for $\mu(0, 1)$. These expressions usually give sharper bounds than similar expressions based upon the cross-cut of atoms.

Farther-reaching techniques for the computation of the Möbius function of $L(G)$ are obtained by applying Theorem 1 to situations where $P$ and $Q$ are both bond-lattices of graphs. This we shall now do.

A monomorphism of a graph $G$ into a graph $H$ is a one-to-one function $f$ of the vertices of $G$ onto the vertices of $H$, which induces a map $\bar{f}$ of the edges of $G$ into the edges of $H$. Every monomorphism $f: G \to H$ induces a monotonic map $p: L(G) \to L(H)$, where $p(S)$ is defined as the closure of the image $\bar{f}(S)$ in $H$. It also induces a monotonic map $q: L(H) \to L(G)$, where $q(T)$ is defined as the set of edges of $G$ whose image is in $T$.

Lemma 2. $q(p(S)) = S$ for $S$ in $L(G)$ and $p(q(T)) \le T$ for $T$ in $L(H)$.
Proof. Intuitively, $p(S)$ is obtained by "adding edges" to $S$, and $q(p(S))$ simply removes the added edges. Thus, the first statement is graphically clear. The second one can be seen as follows. $q(T)$ is obtained from $T$ by removing a number of edges. Taking $p(q(T))$, some of the edges may be replaced, but in general not all. Thus, $p(q(T)) \le T$.

Taking $M = L(H)^*$ and $c: L(H) \to M$ to be the canonical order-inverting map, we see that $\pi = cp$ and $\rho = qc$ give a Galois connection between $L(G)$ and $M$. Now, $\pi(x) = 0$ is equivalent to $p(x) = 1$ for $x \in L(G)$. This can happen only if $x$ has only one component, that is, since $x$ is closed, only if $x = 1$ in $L(G)$. Thus $\pi(x) = 0$ if and only if $x = 1$. Secondly, $\rho(0) = q(1) = 1$, evidently. We have verified all the hypotheses of Theorem 1, and we therefore obtain:
Proposition 2. Let $f: G \to H$ be a monomorphism of a linear graph $G$ into a linear graph $H$, and let $\mu_G$ and $\mu_H$ be the Möbius functions of the bond-lattices. Then

$$\mu_G(0, 1) = \sum_{a \in L(H);\; q(a) = 0} \mu_H(a, 1),$$

where $q$ is the map of $L(H)$ into $L(G)$ naturally associated with $f$, as above.
Proposition 1 can be used to derive a great many of the reductions of G. D. BIRKHOFF and D. C. LEWIS, and provides a systematic way of investigating the changes of Möbius functions, and hence of the chromatic polynomial, when edges of a graph are removed. It has a simple geometric interpretation.

An interesting application is obtained by taking $H$ to be the complete graph on $n$ elements. We then obtain a formula for $\mu$ which completes the statements of Theorems 3 and 5. Let $G$ be a linear graph on $n$ vertices. Let $C$ be the family of two-element subsets of $G$ which are not edges of $G$. Let $F$ be the family of all subsets of $C$ which are closed sets in the bond-lattice of the complete graph on $n$ vertices built on the vertices of $G$. Then,

Corollary.

$$\mu_G(0, 1) = \sum_{a \in F} \mu(a, 1),$$

where $\mu$ is the Möbius function of the lattice of partitions (cf. Example 1) of a set of $n$ elements.

Stronger results can be obtained by considering "epimorphisms" rather than "monomorphisms" of graphs, relating $\mu_G$ to the Möbius function obtained from $G$ by "coalescing" points. In this way, one makes contact with G. A. DIRAC's theory of critical graphs. We leave the development of this topic to a later work.
10. Flows in networks

A network $N = (V, E)$ is a finite set $V$ of vertices, together with a set of ordered pairs of vertices, called edges. We shall adopt for networks the same language as for linear graphs. A circuit is a sequence of edges $S$ such that every vertex in $V(S)$ belongs to exactly two edges of $S$. Every edge has a positive and a negative endpoint. Given a function $\Phi$ from $E$ to the integers from 0 to $\lambda - 1$, let for each vertex $v$, $\bar{\Phi}(v)$ be defined as

$$\bar{\Phi}(v) = \sum_e \eta(e, v)\,\Phi(e),$$

where the sum ranges over all edges incident to $v$, and the function $\eta(e, v)$ takes the value $+1$ or $-1$ according as the positive or negative end of the edge $e$ abuts at the vertex $v$, and the value zero otherwise. The function $\Phi$ is a flow (mod. $\lambda$) when $\bar{\Phi}(v) \equiv 0$ (mod. $\lambda$) for every vertex $v$. The value $\Phi(e)$ for an edge $e$ is called the capacity of the flow through $e$. The mod. $\lambda$ fl...
Proposition 1. The number of proper flows (mod $\lambda$) on a network $N$ with $v$ vertices, $e$ edges and $p$ connected components is a polynomial $P(\lambda)$ of degree $e - v + p$. This polynomial is the characteristic polynomial of the circuit lattice of $N$. The coefficients alternate in sign.

Proof. The last statement is an immediate consequence of Theorem 4 of Section 8. The total number of flows on $N$ (not necessarily proper) is determined as follows. Assume for simplicity that $N$ is connected. Remove a set $D$ of $v - 1$ edges from $N$, one adjacent to each but one of the vertices. Every flow on $N$ can be obtained by first assigning to each of the edges not in $D$ an arbitrary capacity, between $0$ and $\lambda - 1$, and then filling in capacities for the edges in $D$ to match the requirement of zero capacity through each vertex. There are $\lambda^{e-v+1}$ ways of doing this, and this is therefore the total number of flows mod $\lambda$. If the network is in $p$ connected components, the same argument gives $\lambda^{e-v+p}$. Now, every flow on $G$ is a proper flow on a unique closed subset $S$, obtained by removing all edges having capacity zero. Hence

$$\lambda^{e-v+p} = \sum_{S \in L(G)} p(S, \lambda),$$

where $p(S, \lambda)$ is the characteristic polynomial of the closed subgraph $S$. Setting $n(S) = e(S) - v(S) + p(S)$, where $e(S)$, $v(S)$ and $p(S)$ are the numbers of edges, vertices and components of $S$, and applying the inversion formula, we get

$$p(G, \lambda) = \sum_{S \in L(G)} \lambda^{n(S)}\, \mu(S, G).$$

q.e.d.
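The counting argument in the proof can be checked on small networks by exhaustive enumeration. The following is only a sketch under stated assumptions: the function name `count_flows` and the representation of a network as a list of ordered pairs (positive end, negative end) are illustrative choices, not notation from the text, and a proper flow is taken to be one in which no edge has capacity zero, as the proof suggests.

```python
from itertools import product

def count_flows(num_vertices, edges, lam):
    """Count (all flows, proper flows) mod lam by brute force.

    edges is a list of ordered pairs (positive end, negative end);
    vertices are assumed to be numbered 0 .. num_vertices - 1.
    """
    total = proper = 0
    for phi in product(range(lam), repeat=len(edges)):
        net = [0] * num_vertices
        for (pos, neg), cap in zip(edges, phi):
            net[pos] += cap   # eta(e, v) = +1 at the positive end
            net[neg] -= cap   # eta(e, v) = -1 at the negative end
        if all(x % lam == 0 for x in net):
            total += 1
            # Assumed meaning of "proper": no edge carries capacity zero.
            if all(cap != 0 for cap in phi):
                proper += 1
    return total, proper

# A directed triangle: e = 3, v = 3, p = 1, so lam ** (e - v + p) = lam flows.
triangle = [(0, 1), (1, 2), (2, 0)]
for lam in (2, 3, 5):
    print(lam, count_flows(3, triangle, lam))
```

For the triangle the total agrees with $\lambda^{e-v+p} = \lambda$, and the proper flows number $\lambda - 1$, a polynomial of degree $e - v + p = 1$ whose coefficients alternate in sign.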
In the course of the proof we have also shown that $n(S)$ is the rank of $S$ in the circuit lattice of $G$. The rank of the null subgraph is one. The four-color problem is equivalent to the statement that every planar network without an isthmus has a proper flow mod 5. (An isthmus is an edge that disconnects a component of the network when removed.) Most of the results of the preceding section extend to circuit lattices of a network, and give techniques for computation of the flow polynomials of networks. We shall not write down their translation into the geometric language of networks.

References

AUSLANDER, L., and H. M. TRENT: Incidence matrices and linear graphs. J. Math. Mech. 8, 827-835 (1959).
BELL, E. T.: Algebraic Arithmetic. New York: Amer. Math. Soc. (1927).
-- Exponential polynomials. Ann. of Math., II. Ser. 35, 258-277 (1934).
BERGE, C.: Théorie des graphes et ses applications. Paris: Dunod 1958.
BIRKHOFF, GARRETT: Lattice Theory, third preliminary edition. Harvard University, 1963.
-- Lattice Theory, revised edition. American Mathematical Society, 1948.
BIRKHOFF, G. D.: A determinant formula for the number of ways of coloring a map. Ann. of Math., II. Ser. 14, 42-46 (1913).
--, and D. C. LEWIS: Chromatic polynomials. Trans. Amer. math. Soc. 60, 355-451 (1946).
BLEICHER, M. N., and G. B. PRESTON: Abstract linear dependence relations. Publ. Math., Debrecen 8, 55-63 (1961).
BOUGAYEV, N. V.: Theory of numerical derivatives. Moscow, 1870-1873, pp. 1-222.
BRUIJN, N. G. DE: Generalization of Polya's fundamental theorem in enumerative combinatorial analysis. Indagationes math. 21, 59-69 (1959).
CHUNG, K.-L., and L. T. C. HSU: A combinatorial formula with its application to the theory of probability of arbitrary events. Ann. math. Statistics 16, 91-95 (1945).
DEDEKIND, R.: Gesammelte Mathematische Werke, Vols. I-II-III. Hamburg: Deutsche Math. Verein. (1930).
DELSARTE, S.: Fonctions de Möbius sur les groupes abéliens finis. Ann. of Math., II. Ser. 49, 600-609 (1948).
DILWORTH, R. P.: Proof of a conjecture on finite modular lattices. Ann. of Math., II. Ser. 60, 359-364 (1954).
DIRAC, G. A.: On the four-color conjecture. Proc. London math. Society, III. Ser. 13, 193-218 (1963).
DOWKER, C. H.: Homology groups of relations. Ann. of Math., II. Ser. 56, 84-95 (1952).
DUBREIL-JACOTIN, M.-L., L. LESIEUR et R. CROISOT: Leçons sur la théorie des treillis des structures algébriques ordonnées et des treillis géométriques. Paris: Gauthier-Villars 1953.
EILENBERG, S., and N. STEENROD: Foundations of algebraic topology. Princeton: University Press 1952.
FARY, I.: On straight-line representation of planar graphs. Acta Sci. math. Szeged 11, 229-233 (1948).
FELLER, W.: An introduction to probability theory and its applications, second edition. New York: Wiley 1960.
FRANKLIN, P.: The four-color problem. Amer. J. Math. 44, 225-236 (1922).
FRÉCHET, M.: Les probabilités associées à un système d'événements compatibles et dépendants. Actualités scientifiques et industrielles, nos. 859 et 942. Paris: Hermann 1940 et 1943.
FRONTERA MARQUÉS, B.: Una función numérica en los retículos finitos que se anula para los retículos reducibles. Actas de la 2a. Reunión de matemáticos españoles. Zaragoza 103-111 (1962).
FRUCHT, R., and G.-C. ROTA: La función de Möbius para el retículo de particiones de un conjunto finito. To appear in Scientia (Chile).
GOLDBERG, K., M. S. GREEN and R. E. NETTLETON: Dense subgraphs and connectivity. Canadian J. Math. 11 (1959).
GOLOMB, S. W.: A mathematical theory of discrete classification. Fourth Symposium in Information Theory, London, 1961.
GREEN, M. S., and R. E. NETTLETON: Möbius function on the lattice of dense subgraphs. J. Res. nat. Bur. Standards 64B, 41-47 (1962).
-- -- Expression in terms of modular distribution functions for the entropy density in an infinite system. J. Chemical Physics 29, 1365-1370 (1958).
HADWIGER, H.: Eulers Charakteristik und kombinatorische Geometrie. J. reine angew. Math. 194, 101-110 (1955).
HALL, PHILIP: A contribution to the theory of groups of prime power order. Proc. London math. Soc., II. Ser. 36, 39-95 (1932).
-- The Eulerian functions of a group. Quart. J. Math. Oxford Ser. 134-151, 1936.
HARARY, F.: Unsolved problems in the enumeration of graphs. Publ. math. Inst. Hungar. Acad. Sci. 5, 63-95 (1960).
HARDY, G. H.: Ramanujan. Cambridge: University Press 1940.
--, and E. M. WRIGHT: An introduction to the theory of numbers. Oxford: University Press 1954.
HARTMANIS, J.: Lattice theory of generalized partitions. Canadian J. Math. 11, 97-106 (1959).
HILLE, E.: The inversion problems of Möbius. Duke math. J. 3, 549-568 (1937).
HSU, L. T. C.: Abstract theory of inversion of iterated summation. Duke math. J. 14, 465-473 (1947).
-- On Romanov's device of orthogonalization. Sci. Rep. Nat. Tsing Hua Univ. 5, 1-12 (1948).
-- Note on an abstract inversion principle. Proc. Edinburgh math. Soc. (2) 9, 71-73 (1954).
JACKSON, F. H.: Series connected with the enumeration of partitions. Proc. London math. Soc., II. Ser. 1, 63-88 (1904).
-- The q-form of Taylor's theorem. Messenger of Mathematics 38, 57-61 (1909).
JÓNSSON, B.: Lattice-theoretic approach to projective and affine geometry. Symposium on the Axiomatic Method. Amsterdam, North-Holland Publishing Company, 1959, 188-205.
--, and A. TARSKI: Direct decomposition of finite algebraic systems. Notre Dame Mathematical Lectures, no. 5. Indiana: Notre Dame 1947.
KAC, M., and J. C. WARD: A combinatorial solution of the two-dimensional Ising model. Phys. Review 88, 1332-1337 (1952).
KAPLANSKY, I., and J. RIORDAN: The problème des ménages. Scripta math. 12, 113-124 (1946).
KLEE, V.: The Euler characteristic in combinatorial geometry. Amer. math. Monthly 70, 119-127 (1963).
GIAN-CARLO ROTA: On the Foundations of Combinatorial Theory. I
LAZARSON, T.: The representation problem for independence functions. J. London math. Soc. 33, 21-25 (1958).
MACLANE, S.: A lattice formulation of transcendence degrees and p-bases. Duke math. J. 4, 455-468 (1938).
MACMILLAN, B.: Absolutely monotone functions. Ann. of Math., II. Ser. 60, 467-501 (1954).
MÖBIUS, A. F.: Über eine besondere Art von Umkehrung der Reihen. J. reine angew. Math. 9, 105-123 (1832).
ORE, O.: Theory of graphs. Providence: American Mathematical Society 1962.
PÓLYA, G.: Kombinatorische Anzahlbestimmungen für Gruppen, Graphen und chemische Verbindungen. Acta math. 68, 145-253 (1937).
RADO, R.: Note on independence functions. Proc. London math. Soc., III. Ser. 7, 300-320 (1957).
READ, R. C.: The enumeration of locally restricted graphs, I. J. London math. Soc. 34, 417-436 (1959).
REDFIELD, J. H.: The theory of group-reduced distributions. Amer. J. Math. 49, 433-455 (1927).
REVUZ, ANDRÉ: Fonctions croissantes et mesures sur les espaces topologiques ordonnés. Ann. Inst. Fourier 6, 187-268 (1955).
RIORDAN, J.: An introduction to combinatorial analysis. New York: Wiley 1958.
ROMANOV, N. P.: On a special orthonormal system and its connection with the theory of primes. Math. Sbornik, N.S. 16, 353-364 (1945).
ROTA, G.-C.: Combinatorial theory and Möbius functions. To appear in Amer. math. Monthly.
-- The number of partitions of a set. To appear in Amer. math. Monthly.
RYSER, H. J.: Combinatorial Mathematics. Buffalo: Mathematical Association of America 1963.
SCHÜTZENBERGER, M. P.: Contribution aux applications statistiques de la théorie de l'information. Publ. Inst. Stat. Univ. Paris, 3, 5-117 (1954).
TARSKI, A.: Ordinal algebras. Amsterdam: North-Holland Publishing Company 1956.
TOUCHARD, J.: Sur un problème de permutations. C. r. Acad. Sci., Paris, 198, 631-633 (1934).
TUTTE, W. T.: A contribution to the theory of chromatic polynomials. Canadian J. Math. 6, 80-91 (1953).
-- A class of Abelian groups. Canadian J. Math. 8, 13-28 (1956).
-- A homotopy theorem for matroi…
… Characteristic functions and the algebra of logic. Ann. of Math., II. Ser. 34, 405-414 (1933).
-- The abstract properties of linear dependence. Amer. J. Math. 57, 507-533 (1935).
WIELANDT, H.: Beziehungen zwischen den Fixpunktzahlen von Automorphismengruppen einer endlichen Gruppe. Math. Z. 73, 146-158 (1960).
WINTNER, A.: Eratosthenian Averages. Baltimore (privately printed) 1943.

Department of Mathematics
Massachusetts Institute of Technology
Cambridge 39, Massachusetts

(Received September 2, 1963)
Reprinted by TRUEXpress Oxford England
PATHS, TREES, AND FLOWERS JACK EDMONDS
1. Introduction. A graph G for purposes here is a finite set of elements called vertices and a finite set of elements called edges such that each edge meets exactly two vertices, called the end-points of the edge. An edge is said to join its end-points. A matching in G is a subset of its edges such that no two meet the same vertex. We describe an efficient algorithm for finding in a given graph a matching of maximum cardinality. This problem was posed and partly solved by C. Berge; see Sections 3.7 and 3.8.

Maximum matching is an aspect of a topic, treated in books on graph theory, which has developed during the last 75 years through the work of about a dozen authors. In particular, W. T. Tutte (8) characterized graphs which do not contain a perfect matching, or 1-factor as he calls it, that is, a set of edges with exactly one member meeting each vertex. His theorem prompted attempts at finding an efficient construction for perfect matchings.

This and our two subsequent papers will be closely related to other work on the topic. Most of the known theorems follow nicely from our treatment, though for the most part they are not treated explicitly. Our treatment is independent and so no background reading is necessary.

Section 2 is a philosophical digression on the meaning of "efficient algorithm." Section 3 discusses ideas of Berge, Norman, and Rabin with a new proof of Berge's theorem. Section 4 presents the bulk of the matching algorithm. Section 7 discusses some refinements of it.

There is an extensive combinatorial-linear theory related on the one hand to matchings in bipartite graphs and on the other hand to linear programming. It is surveyed, from different viewpoints, by Ford and Fulkerson in (5) and by A. J. Hoffman in (6). They mention the problem of extending this relationship to non-bipartite graphs. Section 5 does this, or at least begins to do it. There, the König theorem is generalized to a matching-duality theorem for arbitrary graphs. This theorem immediately suggests a polyhedron which in a subsequent paper (4) is shown to be the convex hull of the vectors associated with the matchings in a graph.

Maximum matching in non-bipartite graphs is at present unusual among combinatorial extremum problems in that it is very tractable and yet not of the "unimodular" type described in (5 and 6).

Received November 22, 1963. Supported by the O.N.R. Logistics Project at Princeton University and the A.R.O.D. Combinatorial Mathematics Project at N.B.S.
Section 6 presents a certain invariance property of the dual to maximum matching.

In paper (4), the algorithm is extended from maximizing the cardinality of a matching to maximizing for matchings the sum of weights attached to the edges. At another time, the algorithm will be extended from a capacity of one edge at each vertex to a capacity of d_i edges at vertex v_i.

This paper is based on investigations begun with G. B. Dantzig while at the RAND Combinatorial Symposium during the summer of 1961. I am indebted to many people, at the Symposium and at the National Bureau of Standards, who have taken an interest in the matching problem. There has been much animated discussion on possible versions of an algorithm.
2. Digression. An explanation is due on the use of the words "efficient algorithm." First, what I present is a conceptual description of an algorithm and not a particular formalized algorithm or "code."

For practical purposes computational details are vital. However, my purpose is only to show as attractively as I can that there is an efficient algorithm. According to the dictionary, "efficient" means "adequate in operation or performance." This is roughly the meaning I want, in the sense that it is conceivable for maximum matching to have no efficient algorithm. Perhaps a better word is "good."

I am claiming, as a mathematical result, the existence of a good algorithm for finding a maximum cardinality matching in a graph.

There is an obvious finite algorithm, but that algorithm increases in difficulty exponentially with the size of the graph. It is by no means obvious whether or not there exists an algorithm whose difficulty increases only algebraically with the size of the graph.

The mathematical significance of this paper rests largely on the assumption that the two preceding sentences have mathematical meaning. I am not prepared to set up the machinery necessary to give them formal meaning, nor is the present context appropriate for doing this, but I should like to explain the idea a little further informally. It may be that since one is customarily concerned with existence, convergence, finiteness, and so forth, one is not inclined to take seriously the question of the existence of a better-than-finite algorithm.

The relative cost, in time or whatever, of the various applications of a particular algorithm is a fairly clear notion, at least as a natural phenomenon. Presumably, the notion can be formalized. Here "algorithm" is used in the strict sense to mean the idealization of some physical machinery which gives a definite output, consisting of cost plus the desired result, for each member of a specified domain of inputs, the individual problems.
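To make the contrast concrete, one reading of the "obvious finite algorithm" is exhaustive search over subsets of edges. The sketch below assumes that reading; the names `is_matching` and `naive_maximum_matching` are illustrative and do not come from the paper. With m edges it may examine on the order of 2^m subsets, which is the exponential growth referred to above.

```python
from itertools import combinations

def is_matching(edges):
    """True if no two of the given edges meet the same vertex."""
    seen = set()
    for u, v in edges:
        if u in seen or v in seen:
            return False
        seen.update((u, v))
    return True

def naive_maximum_matching(edges):
    """Try edge subsets from largest to smallest; exponential in len(edges)."""
    for k in range(len(edges), 0, -1):
        for subset in combinations(edges, k):
            if is_matching(subset):
                return list(subset)
    return []

# Path with vertices 0, 1, 2, 3: the two end edges form a maximum matching.
print(naive_maximum_matching([(0, 1), (1, 2), (2, 3)]))
```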
The problem-domain of applicability for an algorithm often suggests for itself possible measures of size for the individual problems; for maximum matching, for example, the number of edges or the number of vertices in the
graph. Once a measure of problem-size is chosen, we can define F_A(N) to be the least upper bound on the cost of applying algorithm A to problems of size N.

When the measure of problem-size is reasonable and when the sizes assume values arbitrarily large, an asymptotic estimate of F_A(N) (let us call it the order of difficulty of algorithm A) is theoretically important. It cannot be rigged by making the algorithm artificially difficult for smaller sizes. It is one criterion showing how good the algorithm is, not merely in comparison with other given algorithms for the same class of problems, but also on the whole how good in comparison with itself. There are, of course, other equally valuable criteria. And in practice this one is rough, one reason being that the size of a problem which would ever be considered is bounded.

It is plausible to assume that any algorithm is equivalent, both in the problems to which it applies and in the costs of its applications, to a "normal algorithm" which decomposes into elemental steps of certain prescribed types, so that the costs of the steps of all normal algorithms are comparable. That is, we may use something like Church's thesis in logic. Then, it is possible to ask: Does there or does there not exist an algorithm of given order of difficulty for a given class of problems?

One can find many classes of problems, besides maximum matching and its generalizations, which have algorithms of exponential order but seemingly none better. An example known to organic chemists is that of deciding whether two given graphs are isomorphic.

For practical purposes the difference between algebraic and exponential order is often more crucial than the difference between finite and non-finite.

It would be unfortunate for any rigid criterion to inhibit the practical development of algorithms which are either not known or known not to conform nicely to the criterion. Many of the best algorithmic ideas known today would suffer by such theoretical pedantry. In fact, an outstanding open question is, essentially: "how good" is a particular algorithm for linear programming, the simplex method? And, on the other hand, many important algorithmic ideas in electrical switching theory are obviously not "good" in our sense.

However, if only to motivate the search for good, practical algorithms, it is important to realize that it is mathematically sensible even to question their existence. For one thing the task can then be described in terms of concrete conjectures.

Fortunately, in the case of maximum matching the results are positive. But possibly this favourable position is very seldom the case. Perhaps the twoness of edges makes the algebraic order for matching rather special in comparison with the order of difficulty for more general combinatorial extremum problems (cf. 3).

An upper bound on the order of difficulty of the matching algorithm is n^4, where n is the number of vertices in the graph. The algorithm consists of "growing" a number of trees in the graph, at most n, until they augment or
become Hungarian. A tree is grown by branching from a vertex in the tree to an edge-vertex pair not yet in the tree, at most n times. Such a branching may give rise to a back-tracing through at most n edge-vertex pairs in the tree in order to relabel some of them as forming a blossom or an augmenting path. At each of these three levels there may be other labelling work involved, but it is majorized by the work already cited. The work of identifying and labelling the vertex at the other end of some edge to a given vertex need not increase more than linearly with n.

An upper bound on the order of magnitude of memory needed for the algorithm is n^2, the same order of magnitude of memory used to store the graph itself.
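One way to read the n^4 bound is as a product of the nested levels of work just listed; the grouping below is an interpretation of the paragraph above, not a derivation given in the text:

$$\underbrace{n}_{\text{trees grown}} \cdot \underbrace{n}_{\text{branchings per tree}} \cdot \underbrace{n}_{\text{back-tracing per branching}} \cdot \underbrace{n}_{\text{labelling per edge end}} = n^4 .$$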
3. Alternating paths.

3.0. A subgraph of graph G is a graph consisting of a subset of vertices in G and a subset of edges in G under the same incidences which hold for them in G. A non-empty graph G is called connected if there is no pair of non-empty subgraphs of G such that each vertex of G and each edge of G is contained in exactly one of the subgraphs. The vertices and edges of any graph partition uniquely into zero or more connected subgraphs, called its components. Maximum, minimum, and odd will refer to cardinality unless otherwise stated.

3.1. The graph E, formed from a set E of edges in G, is the subgraph of G consisting of edges E and their end-points. Any graph H, unless it has a single-vertex component, is formed by its edges. Thus in some contexts it causes no confusion to make no explicit distinction between a graph and its edge-set. In particular, a matching in G may be thought of as a subgraph of G whose components are distinct edges. The sum of two sets D and E is commonly defined as D + E = (D - E) ∪ (E - D). The sum D + E of two graphs D and E, formed by edge-sets D and E, is defined to be the graph formed by the edge-set D + E.
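The sum D + E of 3.1 is the symmetric difference of the two edge-sets. As a small sketch, assuming a graph without parallel edges so that an edge can be identified with the unordered pair of its end-points (the name `edge_sum` is illustrative):

```python
def edge_sum(d, e):
    """Return D + E = (D - E) | (E - D) for two collections of edges.

    Each edge is given as a pair of end-points; it is stored as a frozenset
    so that (u, v) and (v, u) denote the same edge. This assumes no parallel
    edges.
    """
    d = {frozenset(edge) for edge in d}
    e = {frozenset(edge) for edge in e}
    return (d - e) | (e - d)

# The common edge {2, 3} cancels; the other two edges remain.
print(edge_sum([(1, 2), (2, 3)], [(3, 2), (4, 5)]))
```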
3.2. There are two other kinds of subtraction for graphs besides the set-theoretic difference used above. With these we must distinguish between a subgraph and the edges which form it. Where G is a graph and E is a set of edges, G - E is the subgraph of G consisting of all the vertices of G and the edges of G not in E. For two graphs G and H, G - H is the subgraph of G consisting of the vertices of G not in H and the edges of G not meeting vertices of H. Graph G ∪ H (graph G ∩ H) consists of the union (intersection) of the vertex-sets and the edge-sets of graphs G and H, with incidences in G ∪ H (graph G ∩ H) the same as in G and H. We may also take the intersection or union of a graph with a set of edges to get, respectively, a set of edges or a graph. In the latter case the end-points of the edges being adjoined to the
graph must be specified. We shall have occasion to give the same edge different end-points in different graphs.

3.3. A circuit B in graph G is a connected subgraph in which each vertex of B meets exactly two edges of B. A (simple) path P in G is either a single vertex (joining itself to itself) or else a connected subgraph whose two end-points each meet one edge, an end-edge, of P and whose other vertices each meet two edges of P. A path is said to join its end-points.

3.4. For the pair (G, M), where M is a matching in G, a vertex is called exposed if it meets no edge of M. Let M̄ denote the edges of G not in M. Define an alternating path or alternating circuit, P, in (G, M) to be such that one edge in M ∩ P and one edge in M̄ ∩ P meets each vertex of P, except the end-points in the case of a path. Several authors, beginning with J. Petersen in 1891, have used alternating paths to prove the existence of "factors" in certain kinds of graphs.

3.5. For any two matchings M_1 and M_2 in G, the components of the subgraph formed by M_1 + M_2 are paths and circuits which are alternating for (G, M_1) and for (G, M_2). Each path end-point is exposed for either M_1 or M_2.

A vertex of G meets no more than one edge, each, of M_1 and M_2, and thus no more than two edges of M_1 + M_2, one in M_1 ∩ M̄_2 and one in M_2 ∩ M̄_1. An end-point v of a path in graph M_1 + M_2, meeting an end-edge in M_1 ∩ M̄_2, say, meets no other edge of M_1. Hence, if an edge of M_2 meets v, it does not belong to M_1 and so it does belong to M_1 + M_2. But then v is not an end-point. Therefore v is exposed for M_2. This completes the proof.
3.6. An alternating path A in (G, M) joining two exposed vertices contains one more edge of M̄ than of M. M + A is a matching of G larger than M by one. Such a path is called augmenting. Thus matching M is not maximum if (G, M) contains an augmenting path. The converse also holds:

3.7 (Berge, 1). A matching M in G is not of maximum cardinality if and only if (G, M) contains an alternating path joining two exposed vertices of M.

If M_2 is a larger matching than M, some component of graph M + M_2 must contain more M_2-edges than M-edges. By 3.5, such a component is an augmenting path for (G, M).
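The step from M to M + A in 3.6 can be sketched as follows. This is an illustration under assumptions: a path is given as its list of vertices in order, edges are unordered pairs in a graph without parallel edges, and the names `is_augmenting_path` and `augment` are hypothetical. The check does not verify that the path is simple or that its edges belong to G.

```python
def is_augmenting_path(matching, path):
    """Check that `path` alternates for `matching` and joins two exposed vertices."""
    m = {frozenset(e) for e in matching}
    covered = {v for e in m for v in e}
    if len(path) < 2 or path[0] in covered or path[-1] in covered:
        return False
    steps = [frozenset(p) for p in zip(path, path[1:])]
    # Edges must alternate: non-matching, matching, ..., non-matching.
    return all((s in m) == (i % 2 == 1) for i, s in enumerate(steps))

def augment(matching, path):
    """Return M + A, the symmetric difference of M with the edges of the path."""
    m = {frozenset(e) for e in matching}
    path_edges = {frozenset(p) for p in zip(path, path[1:])}
    return m ^ path_edges

# Path 0-1-2-3 with M = {12}: the whole path is augmenting.
m = [(1, 2)]
a = [0, 1, 2, 3]
if is_augmenting_path(m, a):
    print(augment(m, a))  # two edges, {0, 1} and {2, 3}
```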
3.8. Berge proposed searching for augmenting paths as an algorithm for maximum matching. In fact, he proposed to trace out an alternating path from an exposed vertex until it must stop and, then, if it is not augmenting, to back up a little and try again, thereby exhausting possibilities. His idea is an important improvement over the completely naive algorithm. However, depending on what further directions are given, the task can still be one of exponential order, requiring an equally large memory to know when it is done.
Norman and Rabin (7) present a similar method for finding in G a minimum cover-by-edges, C, a minimum cardinality set of edges in G which meets every vertex in G. The Berge-Norman-Rabin theorem (2) is generalized in (3), but a corresponding generalization of the algorithm presented here in Section 4 is unknown. 3.9. Norman and Rabin also show that the maximum matching problem and the minimum cover-by-edges problem are equivalent. Assuming every vertex meets an edge, the minimum cardinality of a cover of the vertices in G by a set of edges equals the minimum cardinality of a cover of the vertices in G by a set of edges and vertices, where a vertex is regarded as covering itself. By replacing edges by vertices or vice versa, one can go back or forth between a minimum cover by a set of edges and a minimum cover by a set of edges and vertices, where the latter set consists of a maximum matching together with its exposed vertices.
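The passage in 3.9 from a maximum matching to a minimum cover-by-edges can be sketched as below: each exposed vertex of the matching is replaced by any edge meeting it. The function name `minimum_edge_cover` and the input format are assumptions for illustration; the sketch relies on the given matching being maximum and on every vertex meeting an edge, and it does not verify either condition.

```python
def minimum_edge_cover(vertices, edges, max_matching):
    """Build a cover-by-edges from a maximum matching.

    Assumes `max_matching` really is maximum and every vertex meets an edge.
    The result then has len(vertices) - len(max_matching) edges.
    """
    cover = [tuple(e) for e in max_matching]
    covered = {v for e in cover for v in e}
    for v in vertices:
        if v not in covered:
            # Replace the exposed vertex v by any edge which meets it.
            cover.append(next(tuple(e) for e in edges if v in e))
            covered.add(v)
    return cover

# Star with centre 0: the maximum matching has one edge, the cover three.
print(minimum_edge_cover([0, 1, 2, 3], [(0, 1), (0, 2), (0, 3)], [(0, 1)]))
```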
4. Trees and flowers.

4.0. A tree may be defined as (1) a graph T every pair of whose vertices is joined by exactly one path in T; (2) inductively, as either a single vertex or else the union of two disjoint trees together with an edge which has one end-point in each; (3) as a connected graph with one more vertex than edges; and so on.

4.1. An alternating tree J is a tree each of whose edges joins an inner vertex to an outer vertex so that each inner vertex of J meets exactly two edges of J.

An alternating tree contains one more outer vertex than inner vertices. This follows from the third definition of tree by regarding each inner vertex with its two edges as a single edge joining two outer vertices.

4.2. For each outer vertex v of an alternating tree J there is a unique maximum matching of J which leaves v exposed, with v the only exposed vertex in J. Every maximum matching of J is one of these.

Definition (2) of tree can be strengthened to the statement that a tree minus any one of its edges is two trees. Thus J minus any one of its inner vertices, say u, is two alternating trees. One of these, J_1, contains v as an outer vertex. Assume inductively that J_1 can be matched uniquely so only v is exposed and that J_2, the other subtree, can be matched uniquely so only the vertex v_2, joined in J to u by edge e_2, is exposed. Then the union of e_2 and these two matchings is a matching of J which leaves only v exposed. Since every edge of J has one inner and one outer end-point, every maximum matching leaves only an outer vertex exposed.

4.3. A planted tree, J = J(M), of G for matching M is an alternating tree in G such that M ∩ J is a maximum matching of J and such that the vertex r
in J which is exposed for M ∩ J is also exposed for M. That is, all matching edges which meet J are in J. Vertex r is called the root of J(M).

In planted tree J(M) every alternating path P(M), which has outer vertex v and the matching edge to v at one of its ends, is a subpath of the alternating path P_v(M) in J(M) which joins v to the root r.

For k ≥ 1, assume that P_{2k-1} is the unique path P(M) which contains 2k - 1 edges and assume that at its non-v end it has an inner vertex u_k and a matching edge. Then P_{2k}, consisting of P_{2k-1} together with the unique non-matching edge in J which meets u_k, is the unique path P(M) with 2k edges. It has outer vertices v_k and v at its two ends. If v_k ≠ r, then P_{2k+1}, consisting of P_{2k} together with the unique matching edge which meets v_k, is the unique path P(M) with 2k + 1 edges. It has an inner vertex u_{k+1} and a matching edge at its non-v end. Since our assumption is true for k = 1 and since k cannot become infinite, the theorem follows by induction.

We define a stem in (G, M) as either an exposed vertex or an alternating path with an exposed vertex at one end and a matching edge at the other end. The exposed vertex and the vertex at the other end are, respectively, the root and the tip of the stem.

The preceding theorem tells us that (1) no trial-and-error search is required to find the path in J from any of its vertices back to the root and (2) the path P_v in J joining any outer vertex v to the root of J is a stem.

4.4. An augmenting tree, J_A = J_A(M), in (G, M) is a planted tree J(M) plus an edge e of G such that one end-point of e is an outer vertex v_1 of J and the other end-point v_2 is exposed and not in J.

The path in J_A which joins v_2 to the root of J is an augmenting path. This follows immediately from (4.3).

4.5. For each vertex b of an odd circuit B there is a unique maximum matching of B which leaves b exposed. A blossom, B = B(M), in (G, M) is an odd circuit in G for which M ∩ B is a maximum matching in B with say vertex b exposed for M ∩ B. A flower, F = F(M), consists of a blossom and a stem which intersect only at the tip of the stem (the vertex b). A flowered tree, J_F, in (G, M) is a planted tree J plus an edge e of G which joins a pair of outer vertices of J.
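Remark (1) of 4.3 and the augmenting path of 4.4 can be illustrated with one possible representation of a planted tree: a dictionary `parent` mapping each vertex of J to the next vertex on its path toward the root, with the root mapped to `None`. This representation and the function names are assumptions made for the sketch, not something the paper prescribes.

```python
def path_to_root(parent, v):
    """Follow parent pointers from v to the root; no search is needed."""
    path = [v]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path

def augmenting_path_in_tree(parent, v1, v2):
    """Path of an augmenting tree: exposed v2 (not in J), edge e = (v2, v1),
    then the stem from outer vertex v1 back to the root."""
    return [v2] + path_to_root(parent, v1)

# Planted tree: root 0 (outer), inner vertex 1, outer vertex 2; edge 12 is matched.
parent = {0: None, 1: 0, 2: 1}
print(path_to_root(parent, 2))                # stem from 2 back to the root
print(augmenting_path_in_tree(parent, 2, 3))  # vertex 3 is exposed and outside J
```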
The union of e and the two paths which join its outer-vertex end-points to the root of J is a flower, F.

Let v_1 and v_2 be these outer vertices, and P_1 and P_2 be the paths in J joining them to r. We have seen that P_1 and P_2 are stems (which are easily recovered from J). Since they intersect in at least r and since the path in J joining r to any other vertex is unique, P_b = P_1 ∩ P_2 is an alternating path with an end at r. If its other end-point, say b, were inner, it would be distinct from r, v_1, and v_2. Thus r would be distinct from v_1 and v_2, and b would meet three different edges of J, one in P_b, one in P_1 not in P_b, and one in P_2 not in P_b. But an inner vertex meets only two edges in the tree. Therefore b is outer and P_b is a stem. Thus P_1' = P_1 - (P_b - b) and P_2' = P_2 - (P_b - b), unless one is a
vertex v_1 = b or v_2 = b, have non-matching edges at their b-ends and matching edges at their outer ends. It follows in any case that B = P_1' ∪ P_2' ∪ e is a circuit with only b exposed for M ∩ B, and thus B is a blossom with P_b as its stem.

4.6. A Hungarian tree H in a graph G is an alternating tree whose outer vertices are joined by edges of G only to its inner vertices.

4.7. For a matching M in a graph G, an exposed vertex is a planted tree. Any planted tree J(M) in G can be extended either to an augmenting tree, or to a flowered tree, or to a Hungarian tree (merely by looking at most once at each of the edges in G which join vertices of the final tree).
An exposed vertex satisfies the definition of planted tree. Suppose we are given a planted tree J and a set D (perhaps empty) of edges in G which are not in J but which join outer to inner vertices of J. (1) If no outer vertex of J meets an edge not in D ∪ J, then J is Hungarian. Suppose outer vertex v_1 meets an edge e not in D ∪ J, whose other end-point is, say, v_2. (2) If v_2 is an inner vertex of J, we can enlarge D by adjoining e. (3) If v_2 is an outer vertex of J, then e ∪ J is a flowered tree. (4) If v_2 is exposed and not in J, then e ∪ J is an augmenting tree. (5) Finally, if v_2 is not exposed and not in J, then the M-edge e_2 which meets v_2 is not in J, and thus v_3, the other end-point of e_2, is not in J by the definition of planted tree. Therefore, in this case we can extend J to a larger planted tree with new inner vertex v_2 and new outer vertex v_3 by adjoining edges e and e_2. For any J and D, one of the five cases holds. Therefore by looking at any edge in G at most once, we can reach one of the three cases described in the theorem, because the other two cases, (2) and (5), consume edges and G is finite.

4.8. The algorithm which is being constructed is efficient because it does not require tracing many various combinations of the same edges in order to find an augmenting path or to determine that there are none. In fact we accomplish one or the other without ever looking again at the edges encountered in process (4.7), except to pick out from the tree the blossom or the augmenting path when case (3) or (4) occurs. We see from (4.3) and (4.5) how easy it is to retrieve the blossom or the path.

When flowers arise we "shrink" the blossoms, and so if an augmenting path arises later, it will be in a "reduced" graph. However, only one other very simple kind of task translates the augmentation to (G, M) itself. That task is to expand a shrunken blossom to an odd circuit and find the maximum matching of the odd circuit which leaves a certain vertex exposed.
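The five cases of process (4.7) can be sketched as a single loop. This is a minimal illustration under assumptions, not the formalized algorithm: the graph is an adjacency dictionary `adj`, the matching is a dictionary `mate` listing the partner of each non-exposed vertex, the tree is kept as parent pointers, and all names are hypothetical. Edges in the set D of case (2) are simply skipped rather than stored.

```python
def path_to_root(parent, v):
    path = [v]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path

def grow_planted_tree(adj, mate, root):
    """Extend the planted tree consisting of the exposed vertex `root`.

    Returns ("augmenting", path), ("flowered", (v1, v2)) or
    ("hungarian", parent), following cases (1)-(5) of 4.7.
    """
    parent = {root: None}
    outer, inner = {root}, set()
    pending = [root]                  # outer vertices not yet scanned
    while pending:
        v1 = pending.pop()
        for v2 in adj[v1]:
            if v2 in inner:
                continue              # tree edge or case (2): outer-to-inner edge
            if v2 in outer:
                return "flowered", (v1, v2)                            # case (3)
            if v2 not in mate:
                return "augmenting", [v2] + path_to_root(parent, v1)   # case (4)
            v3 = mate[v2]             # case (5): adjoin edges e and e_2
            parent[v2], parent[v3] = v1, v2
            inner.add(v2)
            outer.add(v3)
            pending.append(v3)
    return "hungarian", parent        # case (1): nothing left to examine

# Path 0-1-2-3 with edge 12 matched: growing from 0 finds an augmenting path.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(grow_planted_tree(adj, {1: 2, 2: 1}, 0))
```

In the flowered case the blossom and its stem would be read off from the two paths back to the root, as in 4.5; that step is left out here to keep the sketch short.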
Actually, we shall find in (7.3) that it is desirable to leave odd circuits shrunk while looking in the reduced graph for as many successive augmentations as possible since they are all reflected in augmentations of (G, M).

4.9. For H, a subgraph of G, G is the disjoint union (G - H) ∪ δH ∪ H+, where δH is the set of the edges with one end-point in H and one end-point in
G - H, and where H+ = G - (G - H) is the subgraph consisting of H and all edges of G with both end-points in H. When H is connected, shrinking H means constructing the new graph G/H = (G - H) ∪ δH ∪ h by regarding H+ as a single new vertex, h = H/H, which meets the edges δH = δh. The end-points in G - H of the edges δH do not change.

4.10. If B is an odd circuit in G, then b' = B/B is called a pseudovertex of G/B. To expand b' means to recover G from G/B. The algorithm, after it expands a pseudovertex b', will make use of the circuit B. In general, finding a "Hamiltonian" circuit in a graph B+ is difficult. Therefore, when the algorithm shrinks B to form G/B, it should remember circuit B as having effected the shrinking. Thus we call circuit B in G (rather than B+) the expansion of b'. In formal calculation shrinking B in G is an easy operation. Essentially, just assign all the vertices and edges of B a label, b', and then, until b' is expanded by erasing these labels, ignore any distinction between vertices labelled b' and ignore edges joining them to each other. Where M is a matching set of edges in G, M/B is defined as M ∩ (G/B). Clearly, if B is a blossom for (G, M), then M/B is a matching of G/B.

4.11. Let G_0 = G, G_i = G_{i-1}/B_i, and b_i = B_i/B_i for i = 1, ..., n, where B_i is an odd circuit in graph G_{i-1}. We inductively define the pseudovertices (with respect to G) of G_k (k = 1, ..., n) to be b_k together with the pseudovertices in G_{k-1} - B_k = G_k - b_k. Of course not every b_i, i < k, will be a pseudovertex of G_k because some will have been absorbed into others. The order in which the pseudovertices of a G_k arise is immaterial. That is, the order in which the odd circuits B_i are shrunk is immaterial except in so far as one shrunken B_i is a vertex in another B_i. Thus we can expand any pseudovertex b_j of G_k to obtain a graph G_kj for which G_kj/B_j = G_k.
The pseudovertices of G_kj (with respect to G) are the pseudovertices in B_j together with the pseudovertices in G_k - b_j; that is, graph G_kj can be obtained from G by shrinking in a proper order the odd circuits which were absorbed into these pseudovertices. On the other hand, we do not expand a vertex b_h in B_k until vertex b_k is expanded. There is a partial order on the b_i's defined by the transitive completion of the relation b_h < b_k where b_h is a vertex of B_k. (It is a special kind of partial ordering because each b_h is a vertex of at most one B_k.) There is a partial order on the sets, S_α, S_β, ..., of mutually incomparable b_i's, where S_α ≤ S_β when every member of S_α is less than or equal to some member of S_β. Evidently there is a unique family of graphs, G_α, G_β, ..., which include the G_i's and G. They correspond 1-1 to the sets, S_α, S_β, ..., so that the pseudovertices of G_α are S_α, etc. We have S_α ≤ S_β if and only if G_β can be obtained from G_α by shrinking certain B_i, those for which b_i is less than or equal to some member of S_β and not less than or equal to any member of S_α. Graph G corresponds to the empty set and G_n corresponds to the set of b_i's which are maximal with respect to their partial order.
The complete expansion of a pseudovertex b_i is the subgraph U+ = G - (G - U) ⊂ G, where U consists of all vertices of G absorbed into b_i by shrinking.
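The labelling description of shrinking in (4.10) can be illustrated as follows. This is a minimal sketch under stated assumptions: the helper name `shrink` and the adjacency-set representation are hypothetical, and because neighbours are stored in sets, several edges of δB running from one outside vertex into B collapse into a single adjacency, which is a simplification of G/B as defined in (4.9).

```python
def shrink(adj, circuit, label):
    """Return adjacency sets of G/B, where every vertex of `circuit` is relabelled `label`."""
    inside = set(circuit)

    def image(v):
        return label if v in inside else v

    shrunk = {}
    for v, nbrs in adj.items():
        # Relabel the neighbours and ignore edges joining two vertices labelled `label`.
        targets = {image(w) for w in nbrs} - {image(v)}
        shrunk.setdefault(image(v), set()).update(targets)
    return shrunk

# Triangle a-b-c with pendant vertices d (at c) and e (at a).
G = {
    "a": {"b", "c", "e"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
    "e": {"a"},
}
print(shrink(G, ["a", "b", "c"], "b'"))
```

Keeping the original `adj` and the list `circuit` alongside the result is one way to "remember circuit B as having effected the shrinking", so that the pseudovertex can later be expanded.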
4.12. Where B is the blossom of a flower F for (G, M), M is a maximum matching of G if and only if M/B is a maximum matching of G/B.

4.13. Where blossom B is in J_F, a planted flowered tree for (G, M), J_F/B is a planted tree for (G/B, M/B). It contains B/B as an outer vertex. Its other outer and inner vertices are respectively those of J_F which are not in B.

Theorem (4.13) follows easily from (4.5). We separate the two converse statements of Theorem (4.12) into slightly stronger statements, (4.14) and (4.15).
4.14. Where B is any odd circuit in G, for every matching M_1 of G/B there exists a maximum matching M_B of B such that M = M_1 ∪ M_B is a matching for G.

Since any matching M_1 of G/B contains at most one edge meeting B/B, the edges M_1 in G meet at most one vertex, say b_1, of B. Therefore the desired M_B is the maximum matching of B which leaves b_1 exposed. Since the cardinality |M_B| of M_B is constant, any augmentation of M_1 yields a corresponding augmentation of M. Therefore, the "only if" part of (4.12) is proved. Applying the above matching operation to successive expansions of pseudovertices into odd circuits we have:
Where P is the complete expansion of a pseudovertex p in G_2, where G_1 is the graph obtained from G_2 by completely expanding p, and where M_2 is any matching of G_2, there exists a matching M_P of P leaving exactly one exposed vertex in P such that M_P ∪ M_2 is a matching of G_1. Thus, since |M_P| is constant, any augmentation in G_2 yields a corresponding augmentation in G_1.

4.15. For (G, M), let P be a subgraph such that (1) M ∩ P leaves exactly one exposed vertex in P, (2) M/P is a maximum matching of G/P, and (3) p = P/P is the tip of a stem S_p for (G/P, M/P). Then M is a maximum matching of G.

The edges of S_p form in G a stem, S, for (G, M). (In case S_p has no edges, take S to be the vertex in P exposed for M.) Compare M' = M + S and M'/P with M and M/P. The definition of stem implies that M' is a matching of G with |M'| = |M| and that the exposure of the root of S is changed to the exposure of the tip of S. Similarly M'/P = M/P + S_p is a matching of G/P with |M'/P| = |M/P| and with vertex p exposed. Because the cardinalities do not change, it is sufficient to show that M' is maximum in G if M'/P is maximum in G/P. Using (3.7), if M' is not maximum, G contains an augmenting path A = A(M'). If A contains no vertices of P, then it is also an augmenting
path for M'/P in G/P. Otherwise, because P contains only one exposed vertex for M', at least one of the ends of A is at an exposed vertex u_1 not in P. There is a unique subpath A_1 of A with one end-point at u_1 and containing only one vertex p_1 of P, at its other end. The only difference between A_1 and A_1/P = (A_1 ∪ P)/P is that p_1 is replaced by p, which is exposed for M'/P. Thus A_1/P is an augmenting path for M'/P and so M'/P is not maximum. The theorem is proved.

The theorem extends as follows: For (G, M), let P_1, ..., P_n be a family of disjoint subgraphs in G such that (1) M ∩ P_i leaves exactly one exposed vertex in P_i, (2) M_n = M ∩ G_n is a maximum matching of G_n = G/P_1/.../P_n, and (3) vertices P_i/P_i of G_n are outer vertices in a planted tree J_n for (G_n, M_n). Then M is a maximum matching of G.

We may assume that the indices order the P_i/P_i's so that (for k = 1, ..., n - 1) those from 1 through k are contained in a planted subtree J_k of J_n not containing those from k + 1 through n. Hence the theorem follows by induction after proving that M_{n-1} = M ∩ G_{n-1} is a maximum matching of G_{n-1} = G/P_1/.../P_{n-1}. Since every outer vertex of J_n is the tip of a stem in G_n, this follows from the last theorem.
4.16. Theorems (4.7) and (4.13) show how by branching a planted tree out from an exposed vertex of (G, M) and shrinking blossoms B_i when they are encountered, we eventually obtain in a graph G_k = G/B_1/.../B_k either a tree with an augmenting path or a Hungarian tree. An augmenting path admits an augmentation of matching M_k = M ∩ G_k according to (3.7), and (4.14) shows how this induces an augmentation of matching M_{k-1} = M ∩ G_{k-1} and so on back through M. On the other hand, when a Hungarian tree J is obtained, submatching (J ∪ B_k ∪ ... ∪ B_1) ∩ M of (G, M) cannot be improved and so this part of G is freed from further consideration. This follows immediately from (4.15) and the next theorem, (4.17), where G_k is denoted simply as G.
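The expansion step of (4.14) needs only the maximum matching of an odd circuit B that leaves a prescribed vertex b_1 exposed. A minimal sketch follows, assuming the circuit is given as a list of its vertices in cyclic order; the function name `circuit_matching_exposing` is hypothetical and not taken from the paper.

```python
def circuit_matching_exposing(circuit, b1):
    """Maximum matching of an odd circuit (vertices in cyclic order) that leaves b1 exposed."""
    n = len(circuit)
    if n % 2 == 0:
        raise ValueError("the circuit must have an odd number of vertices")
    start = circuit.index(b1)
    # Walk once around the circuit, starting just after b1, and pair up consecutive vertices.
    rest = [circuit[(start + k) % n] for k in range(1, n)]
    return [(rest[i], rest[i + 1]) for i in range(0, n - 1, 2)]

# A pentagon 0-1-2-3-4-0 in which vertex 2 must stay exposed.
print(circuit_matching_exposing([0, 1, 2, 3, 4], 2))
```

Whatever vertex is chosen, the result has (n - 1)/2 edges, which is the sense in which |M_B| is constant in (4.14).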
4.17. Let J be a Hungarian tree in a graph G. A matching M_1 of G - J is maximum in G - J if and only if M_1 together with any maximum matching M_J of J is a maximum matching of G.

Since J and G - J are disjoint, if there exists a matching M_1' of G - J which is larger than M_1, then M_1' ∪ M_J is a larger matching of G than M_1 ∪ M_J. Conversely, suppose M_1 is maximum for G - J. Let M' = M_1' ∪ M_I ∪ M_J' be an arbitrary matching of G where M_1' ⊂ G - J, where M_J' ⊂ J, and where M_I ∩ ((G - J) ∪ J) is empty. Then |M_1'| ≤ |M_1|. Every edge in M_I meets at least one inner vertex of J; that is, where I' ⊂ I(J) is the set of the inner vertices met by M_I, |M_I| ≤ |I'|. The graph J - I' consists of |I'| + 1 disjoint alternating trees whose inner vertices together are I(J) - I'. Therefore, since the maximum matching cardinality of an alternating tree equals the number of its inner vertices, |M_J'| ≤ |I(J) - I'|. Adding the three inequalities gives |M'| ≤ |M_1| + |I(J)| = |M_1 ∪ M_J|. So the theorem is proved.
4.18. The matching M of G = G^0, to begin with, may be empty. If it leaves any exposed vertices, then the process (4.16) operates with respect to one of them. Either it produces an augmentation of M by one edge, thus disposing of two exposed vertices, or it reduces the possible domain for augmenting M to a subgraph G^1 = G_k - J of G, containing one less exposed vertex and containing only edges and vertices not previously considered. Successive application of (4.16) may reduce the consideration of M to a subgraph G^i of G and reveal there an augmentation of M. After augmenting in G^i, obtaining a larger M for G with two less exposed vertices in G^i, (4.16) operates again in G^i, never returning to the matching in the rest of G.

4.19. Repeated application of (4.18) reduces the domain in question to a G^n containing no exposed vertices. Then we know that we have a maximum matching; let us still call it M, with n exposed vertices in G. Thus the construction of an algorithm for finding a maximum cardinality matching in a graph is complete. Often the last application of (4.18) is unnecessary. For verifying maximality, the algorithm may as well stop when it reduces the domain to a G^{n-1} containing one exposed vertex, since two exposed vertices are necessary in order to augment. However, for theoretical purposes it is convenient to have the algorithm grow a tree from each exposed vertex of the final, maximum matching.

4.20. We may define an alternating forest to be a family of disjoint alternating trees and a planted forest in (G, M) to be a family of disjoint planted trees in (G, M). A dense planted forest is one which contains all the exposed vertices of (G, M). The family of exposed vertices, itself, is a dense planted forest. The algorithm works as well by growing a dense planted forest all at once, rather than one tree at a time. It is appropriate then to define augmenting forest (flowered forest) to be a planted forest plus an edge e of G whose end-points are outer vertices of different trees (of the same tree) of the planted forest. A Hungarian forest in G is defined similarly to Hungarian tree, replacing the word "tree" by "forest." Notice that the trees of a Hungarian forest are not necessarily Hungarian trees: an outer vertex of one tree may be joined by an edge of G to an inner vertex of another tree in the forest. The theorems on trees presented in this section are essentially the same for forests.
5. The dual to matching.

5.0. A bipartite graph K is one in which every circuit contains an even number of edges. This condition, that K contains no odd circuits, is equivalent to being able to partition the vertices of K into two parts so that each edge of K meets exactly one vertex in each part. The well-known König theorem states: For a bipartite graph K, the maximum cardinality of a matching in K equals the minimum number of vertices which together meet all the edges of K.

5.1. The linear programming duality theorem states: If (1) x ≥ 0, Ax ≤ c and (2) y ≥ 0, A^T y ≥ b, for given real vectors b and c and real matrix A, then for real vectors x and y, max_x (b, x) = min_y (c, y) when such extrema exist. The problems of finding a maximizing vector x and a minimizing vector y are called linear programmes, dual to each other.

5.2. The König theorem is now widely recognized as the instance of (5.1) where b and c consist of all ones and A = A_K is the zero-one incidence matrix of edges (columns) versus vertices (rows) in a bipartite graph K. In view of Theorem (5.1) the König theorem is equivalent to the remarkable fact that, with b, c, and A as just described, the two linear programmes of (5.1) have solutions x and y whose components are zeros and ones whether or not this condition is imposed. An elegant theory centres on this phenomenon. Graph-theoretic algorithms are well known for so-called assignment, transportation, and network flow problems (5). These are linear programmes which have constraint matrices A that are essentially A_K.

5.3. For a linear programme with an arbitrary matrix A of integers, or even of zeros and ones, we cannot say that the extreme values will be assumed, as when A = A_K, by vectors with integer components. Therefore, in general when we impose the condition of integrality on x, the equality of the two extrema no longer holds. In particular, when the maximum matching problem is extended from bipartite to general graphs G, a genuine integrality difficulty is introduced. Our matching algorithm met it by the device of shrinking blossoms.
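The contrast drawn in (5.0) and (5.3) can be checked by exhaustive search on tiny graphs. The sketch below is illustrative only: the function names and the two example graphs are assumptions, and the brute-force search is practical only for a handful of edges. For the bipartite example the two numbers of the König theorem agree; for a triangle, the smallest odd circuit, they do not.

```python
from itertools import combinations

def is_matching(edge_subset):
    """True if no vertex is met by two of the given edges."""
    ends = [v for e in edge_subset for v in e]
    return len(ends) == len(set(ends))

def max_matching_size(edges):
    for k in range(len(edges), 0, -1):
        if any(is_matching(sub) for sub in combinations(edges, k)):
            return k
    return 0

def min_vertex_cover_size(vertices, edges):
    """Least number of vertices which together meet all the edges."""
    for k in range(len(vertices) + 1):
        for cover in combinations(vertices, k):
            chosen = set(cover)
            if all(u in chosen or v in chosen for u, v in edges):
                return k

# A bipartite graph with parts {u1, u2, u3} and {w1, w2}.
bip_vertices = ["u1", "u2", "u3", "w1", "w2"]
bip_edges = [("u1", "w1"), ("u1", "w2"), ("u2", "w1"), ("u3", "w1")]
print(max_matching_size(bip_edges), min_vertex_cover_size(bip_vertices, bip_edges))

# A triangle: one matching edge at most, but two vertices are needed to meet all edges.
tri_vertices = ["a", "b", "c"]
tri_edges = [("a", "b"), ("b", "c"), ("c", "a")]
print(max_matching_size(tri_edges), min_vertex_cover_size(tri_vertices, tri_edges))
```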
5.4. The matching algorithm yields a generalization of the König theorem to maximum matchings in G. The new matching duality theorem, in the form "maximum cardinality of a matching in G equals minimum of something else," is also an instance of linear programming duality. It is reasonable to hope for a theorem of this kind because any problem which involves maximizing a linear form by one of a discrete set of non-negative vectors has associated with it a dual problem in the following sense. The discrete
set of vectors has a convex hull which is the intersection of a discrete set of half-spaces. The value of the linear form is as large for some vector of the discrete set as it is for any other vector in the convex hull. Therefore, the discrete problem is equivalent to an ordinary linear programme whose constraints, together with non-negativity, are given by the half-spaces. The dual (more precisely, a dual) of the discrete problem is the dual of this ordinary linear programme. For a class of discrete problems, formulated in a natural way, one may hope then that equivalent linear constraints are pleasant even though they are not explicit in the discrete formulation.

5.5. Arising from the definition of a matching (no more than one matching edge to each vertex) are the obvious linear constraints that for each vertex v ∈ G the sum of the x's corresponding to edges which meet v is no greater than one. To obtain a maximum cardinality matching, we want to maximize the sum of all the x's, corresponding to edges of G, subject to the additional condition that each x is zero or one. It turns out that maximum matching can be turned into linear programming by substituting for the zero-one condition the additional constraints that the x's are non-negative and that for any set R of 2k + 1 vertices in G (k = 1, 2, ...) the sum of the x's which correspond to edges with both end-points in R is no greater than k. The former condition on the x's obviously implies the latter since for no matching in G do more than k matching edges have both ends in R. The converse, that subject only to the linear constraints Σ x_i can be maximized by zeros and ones, is not so obvious, but in view of (5.1) it follows from (5.6), the generalized König theorem. Actually the stronger converse holds: subject only to these same linear constraints, Σ c_i x_i, for any real numbers c_i, can be maximized by zeros and ones.
In other words, the polyhedron described by the constraints is, indeed, the convex hull of the zero-one vectors which correspond to matchings in G. We shall not prove this until we take up maximum weight-sum matching in paper (4). Although the convex-hull notion suggested trying to generalize the König theorem, and although the generalization found does suggest the true convex hull, the success of the first suggestion does not necessarily validate the second.

5.6. A set consisting of one vertex in G is said to cover an edge e in G if e meets the vertex. The capacity of this set is one. A set consisting of 2k + 1 vertices in G (k = 1, 2, ...) is said to cover an edge e in G if both end-points of e are in the set. The capacity of this set is k. An odd-set cover of a graph G is a family of odd sets of vertices such that each edge in G is covered by a member of the family.

MATCHING-DUALITY THEOREM. The maximum cardinality of a matching in G equals the minimum capacity-sum of an odd-set cover in G.
It is obvious that the capacity-sum of any odd-set cover in G is at least as large as the cardinality of any matching in G, so we have only to prove the existence in G of an odd-set cover and a matching for which the numbers are equal.
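The statement of the matching-duality theorem can be checked numerically on very small graphs. The following is a brute-force sketch, not the algorithmic proof given in (5.8): the function names are hypothetical, every family of odd sets is enumerated, and the search is therefore only feasible for about five vertices. It relies on the fact that every odd set has capacity at least one, so a family with r members has capacity-sum at least r.

```python
from itertools import combinations

def max_matching_size(edges):
    def is_matching(sub):
        ends = [v for e in sub for v in e]
        return len(ends) == len(set(ends))
    for k in range(len(edges), 0, -1):
        if any(is_matching(sub) for sub in combinations(edges, k)):
            return k
    return 0

def min_odd_set_cover(vertices, edges):
    """Minimum capacity-sum of an odd-set cover in the sense of (5.6), by exhaustive search."""
    odd_sets = [frozenset(s)
                for size in range(1, len(vertices) + 1, 2)
                for s in combinations(vertices, size)]

    def capacity(s):
        return 1 if len(s) == 1 else (len(s) - 1) // 2

    def covers(s, edge):
        u, v = edge
        return (u in s or v in s) if len(s) == 1 else (u in s and v in s)

    best = None
    for r in range(len(odd_sets) + 1):
        if best is not None and r >= best:
            break  # any family with r members has capacity-sum >= r >= best
        for family in combinations(odd_sets, r):
            if all(any(covers(s, e) for s in family) for e in edges):
                total = sum(capacity(s) for s in family)
                if best is None or total < best:
                    best = total
    return best

pentagon = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
k4 = [(a, b) for a, b in combinations("abcd", 2)]
print(max_matching_size(pentagon), min_odd_set_cover(range(5), pentagon))
print(max_matching_size(k4), min_odd_set_cover("abcd", k4))
```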
5.7. The theorem holds for a graph which has a perfect matching M (that is, with no exposed vertices) since the odd-set cover consisting of two sets, one set containing one of the vertices and the other set containing all the other vertices, has capacity-sum equal to |M|. It also holds for a graph which has a matching with one exposed vertex. Here the odd-set cover may be taken as consisting of one member, the set containing all vertices of the graph. For the case of one exposed vertex, an odd-set cover may also be constructed as in (5.8) by applying the algorithm to construct a Hungarian tree even though it obviously will not result in augmentation.

5.8. Applying the algorithm to (G, M), where |M| is maximum, using some exposed vertex as root, we obtain a graph G' containing a maximally matched Hungarian tree J, a number of whose outer vertices are pseudo. Let S_J consist of all odd sets of the following two types: sets each consisting of one inner vertex in J, and sets each consisting of the vertices in the complete expansion of one pseudovertex of J. The number of edges of M which a member of S_J covers is equal to the capacity of the member. Every edge of M not in G' - J is covered by exactly one member of S_J. An edge of G is covered by a member of S_J if and only if it is not in G' - J. Matching M ∩ (G' - J) is a maximum matching of G' - J with one less exposed vertex than (G, M). Assuming that |M ∩ (G' - J)| equals the capacity of an odd-set cover, say S'_J, of G' - J, we have that |M| equals the capacity of S_J ∪ S'_J, an odd-set cover of G. Theorem (5.6) follows by induction on the number of exposed vertices.

5.9. It is evident from the proof that we may require the minimum odd-set cover to have certain other structure; in particular, that each member with more than one vertex contain the vertices of at least one odd circuit in G. With the latter restriction the theorem becomes a strict generalization of the König theorem.
6. Invariance of the dual.

6.0. For any particular application of the algorithm (4) to G, yielding, say, the maximum matching M, we may skip the augmentation steps in (4.16) by regarding the augmented matching as being the one already at hand. This gives a particular application of (4) to G starting with maximum matching M. In the application of the algorithm to (G, M), we can regard all the branchings and blossom shrinkings as taking place without subtracting the trees J_i as they arise. Thus we obtain from (G, M) a graph G* with a number of pseudovertices which are outer vertices in a sequence {J_i} (i = 1, ..., n) of disjoint planted trees in G*, one corresponding to each exposed vertex of (G, M). By expanding all the pseudovertices of G* completely, we recover the graph G.

6.1. The tree J_i is Hungarian in G* - J_1 - ... - J_{i-1}, but usually not Hungarian in G* because an outer vertex of J_i might be joined to an inner vertex of any other tree with a lower index. Hence the partition of the outer and inner vertices into trees J_i depends on the order of their construction. Also non-matching edges which can occur in each tree are not unique. In general, joining outer to inner vertices of a J_i are many other non-matching edges which would do as well. The particular blossoms which led to the pseudovertices are also fairly arbitrary. And, finally, the maximum matching is far from unique. However, (6.2) will show that the graph G* is uniquely determined by G alone.

6.2. For a (G, M) where M is any maximum matching, let G* and {J_i} be obtained from (G, M) by (6.0).
(a) The non-pseudo outer vertices of the J_i's and the vertices of the pseudovertex complete expansions, all called the outer vertices O(G) of G, are precisely the vertices of G which are left exposed by some maximum matching of G.
(b) The inner vertices of the J_i's, called the inner vertices I(G) of G, are precisely those vertices of G not in O(G) but joined to vertices in O(G).
(c) G* is obtained from G by shrinking the connected components of O(G)+, the subgraph of G consisting of vertices O(G) and all edges of G joining them.

6.3. We have defined vertex families O(G) and I(G) in terms of particular J_i. The theorem yields definitions dependent only on G itself. Clearly O(G*) and I(G*), defined in terms of the J_i in G*, are respectively the outer and the inner vertices of the J_i. Notice that the early definitions of inner and outer, for vertices in an alternating tree, are consistent with the definitions for a general graph.

6.4. Proof of (6.2), (b) and (c).
Let the vertex v* of G* be joined in G* to some outer vertex u* of J_i. Then v* is a vertex in some J_h (h ≤ i), since J_i is Hungarian in G* - J_1 - ... - J_{i-1}. But v* cannot be an outer vertex of J_h since u* is not inner and since J_h is Hungarian in G* - J_1 - ... - J_{h-1}. Therefore v* is inner. It follows that each outer vertex u of G is joined only to inner vertices and to other vertices in the complete expansion of its image u*. By construction, each inner vertex is joined to an outer vertex of G. Hence, (b) is true. Since by construction the complete expansion of each outer vertex of G* is connected, it also follows that the connected components of O(G)+ correspond precisely to outer vertices of G*. Hence, (c) is true.
6.5. An outer vertex u of G, by definition, either is identical with or is contained in the complete expansion of some outer vertex u* of, say, J_i. For any maximum matching M_i of alternating tree J_i, M_i ∪ [M ∩ (G* - J_i)] is a maximum matching of G*, which by (4.14) induces a maximum matching M' of G. Let M_i be the one which leaves u* exposed. If u* is pseudo, then by (4.14) M' can be chosen so that u is exposed in the expansion. This proves half of (6.2), (a).

6.6. C. Witzgall suggested the following simplified proof of the converse, viz. that only the outer vertices are ever exposed for a maximum matching. A non-outer vertex v meets an edge e of M. Deleting v and its adjoining edges, ∪ J_i - v is a Hungarian forest in G* - v. If v is inner, then the forest is dense in G* - v. Otherwise it is dense in G* - v except for one exposed vertex, the other end of e. In either case it follows that M - e is a maximum matching of G - v. Assume that M' is a maximum matching of G which leaves v exposed. Then M' is also a matching of G - v. Since M' is larger than M - e, we have a contradiction. This completes the proof of (6.2).

6.7. The definition of odd-set cover may be expanded (more than necessary for Theorem (5.6)) to include the possibility of members which are even sets of vertices in G. A set of 2k vertices has capacity k and covers the edges which have both end-points in the set. Then, clearly, Theorem (5.6) still holds for this kind of cover. With this definition of cover, it follows from the uniqueness of G* that there is a unique preferred minimum cover, S*, for any graph G. The one-vertex members of S* are the inner vertices of G*, the other odd members of S* correspond to the pseudovertices of G*, and the one even member of S* consists of the non-inner, non-outer vertices of G*.
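Since (6.2) characterizes O(G) and I(G) in terms of G alone, the two families can be computed for small graphs directly from the characterization, without running the algorithm. The sketch below does this by enumerating all maximum matchings; the function names are hypothetical, the search is exponential, and it is meant only to illustrate parts (a) and (b) of (6.2).

```python
from itertools import combinations

def maximum_matchings(edges):
    """All matchings of maximum cardinality, found by exhaustive search."""
    def is_matching(sub):
        ends = [v for e in sub for v in e]
        return len(ends) == len(set(ends))
    for k in range(len(edges), -1, -1):
        found = [sub for sub in combinations(edges, k) if is_matching(sub)]
        if found:
            return found

def outer_and_inner(vertices, edges):
    """O(G): vertices exposed by some maximum matching; I(G): other vertices joined to O(G)."""
    outer = set()
    for matching in maximum_matchings(edges):
        covered = {v for e in matching for v in e}
        outer |= set(vertices) - covered
    inner = {v for v in vertices
             if v not in outer
             and any(u in outer for e in edges if v in e for u in e)}
    return outer, inner

# Path a-b-c: either end can be left exposed, and b is joined to both ends.
print(outer_and_inner("abc", [("a", "b"), ("b", "c")]))

# Triangle: every vertex is exposed by some maximum matching, so G* is one pseudovertex.
print(outer_and_inner("abc", [("a", "b"), ("b", "c"), ("c", "a")]))
```

For a graph with a perfect matching, such as a triangle with one pendant edge, both sets come out empty, and all vertices belong to the one even member of the preferred cover S* described in (6.7).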
7. Refinement of the algorithm.

7.0. Several possibilities for refining the algorithm suggest themselves. We could remember an old tree, uprooted by an augmentation, so that when a new rooted tree takes on a vertex in it, we can immediately adjoin a piece of it to the new tree. This appears not worth doing. A tree is easy to grow, easier than selecting from an old tree the piece which may be grafted.

7.1. A quite useful refinement is to leave the pseudovertices of the old tree shrunk until their expansion is necessary. We see from (4.14) that any further augmentation of a matching M' in a graph G' with pseudovertices yields a further augmentation in G just as easily as the first. On the other hand, a maximum matching in G', reached after one or more augmentations, does not necessarily yield a maximum matching of G. The sufficiency part of (4.12) depends on the blossom being part of a flower, whereas the first augmentation in G' uproots the stem.

7.2. However, we may easily observe the circumstance arising in the application of the algorithm to (G', M') where the shrinkage might hide a
possible augmentation in G. It is where a pseudovertex, say b', becomes an inner vertex of the planted tree, say J' = J'(M'). In this case, we obtain a graph G'' from G' by expanding b' to an odd circuit B. The edges of J' form in G'' a subgraph which we still call J'. The set M' is also a matching in G''. One edge of J' ∩ M' has an end-point, say b_1, in B. One edge of J' not in M' has an end-point, say b_2, in B. The maximum matching M_B of B which is compatible with M' in G'' leaves b_1 exposed. The vertices b_1 and b_2 partition B into two paths, P_2 even and P_1 odd, which join b_1 and b_2. The graph J'' = P_2 ∪ J' is a planted tree in G'' for the matching M_B ∪ M'. Unless b_1 and b_2 coincide, P_2 will contain outer vertices of J''. These may be joined to vertices not in J'' which admit an extension of J'', not possible for J''/B = J' ⊂ G', to a planted tree with an augmenting path.
7.3. If J' ⊂ G' can be extended in G' to a tree with an augmenting path, it does not matter that some of the inner vertices are pseudo because a further augmentation for G is thus determined. If J' with pseudo inner vertex b' can be extended in (G', M') to a flowered tree whose blossom B' contains b', then b' loses its distinction as an inner vertex. It might as well stay shrunk and be absorbed into the new pseudovertex B'/B' of G'/B'. In fact, Theorems (4.15) and (4.17), together, tell us that any pseudo outer vertex might as well be left pseudo during the algorithm. Therefore a pseudo inner vertex should be retained until a planted Hungarian tree J_H is obtained. If no inner vertices of J_H are pseudo, then (4.17) is applicable. Otherwise, at this point, a pseudo inner vertex should be expanded according to (7.2).

7.4. One of the main operations of the algorithm is described in (4.3). That is back-tracing along paths in a tree already constructed, either to obtain an augmentation as in (4.4) or to delineate a new blossom as in (4.5). The back-tracing takes place in an alternating tree only because blossoms have been shrunk to pseudovertices. A pseudovertex may be compounded from many earlier blossom shrinkings and may thus encompass a complicated subgraph of G. After shrinking, back-tracing entirely bypasses the internal structure of a pseudovertex. A possible alternative to actually shrinking is some method for tracing through the internal structure of a pseudovertex. Witzgall and Zahn (9) have designed a variation of the algorithm which does that. Their result is attractive and deceptively non-trivial.

REFERENCES
1. C. Berge, Two theorems in graph theory, Proc. Natl. Acad. Sci. U.S., 43 (1957), 842-4.
2. --- The theory of graphs and its applications (London, 1962).
3. J. Edmonds, Covers and packings in a family of sets, Bull. Amer. Math. Soc., 68 (1962), 494-9.
4. --- Maximum matching and a polyhedron with (0, 1) vertices, appearing in J. Res. Natl. Bureau Standards 69B (1965).
5. L. R. Ford, Jr. and D. R. Fulkerson, Flows in networks (Princeton, 1962).
6. A. J. Hoffman, Some recent applications of the theory of linear inequalities to extremal combinatorial analysis, Proc. Symp. on Appl. Math., 10 (1960), 113-27.
7. R. Z. Norman and M. O. Rabin, An algorithm for a minimum cover of a graph, Proc. Amer. Math. Soc., 10 (1959), 315-19.
8. W. T. Tutte, The factorization of linear graphs, J. London Math. Soc., 22 (1947), 107-11.
9. C. Witzgall and C. T. Zahn, Jr., Modification of Edmonds' algorithm for maximum matching of graphs, appearing in J. Res. Natl. Bureau Standards 69B (1965).
National Bureau of Standards and Princeton University
Reprinted from Canad. J. Math. 17 (1965), 449-467
A THEOREM OF FINITE SETS
by G. KATONA
Mathematical Institute of the Hungarian Academy of Sciences, Budapest, Hungary
§ 1. Introduction

Let $A_1, \ldots, A_n$ be a system of different subsets of a finite set $H$, where $|H| = h$ and $|A_i| = l$ ($1 \le i \le n$) ($|A|$ denotes the number of elements of $A$). We ask for a system $A_1, \ldots, A_n$ (for given $h$, $l$, $n$) for which the number of sets $B$ satisfying $|B| = l - 1$ and $B \subset A_i$ for some $i$ is minimum. The first lower estimation for this minimum is given by SPERNER ([1], Hilfssatz). His estimation is

\[ \frac{n \cdot l}{h - l + 1}. \]

This depends on $h$. However, if $n = \binom{N}{l}$, it is expected that the minimizing system is the system of all $l$-tuples chosen from a subset of $N$ elements of $H$. In this case the number of $B$'s is $\binom{N}{l-1}$, which does not depend on $h$. A. HAJNAL proved this statement in the case of $l = 3$ (unpublished). In this paper I prove for all cases that this is, indeed, the minimum, and find the (more complicated) minimum also for arbitrary $n$. The theorem is probably useful in proofs by induction over the maximal number of elements of the subsets in a system, as was SPERNER's lemma in his paper [1].

KLEITMAN told me in Tihany (Hungary) that he thought I could solve the following problem of ERDŐS by the aid of the above theorem and the "marriage problem": Let $A_1, \ldots, A_n$ be subsets of $H$, where $|H| = 2h$ and $|A_i| = h$. For what $n$'s is it always possible to construct a system $B_1, \ldots, B_n$ with the properties $B_i \subset A_i$, $|B_i| = h - 1$ ($1 \le i \le n$)? § 3 contains the solution of this problem in a more general form.
§ 2. The main result

Before the exact formulation of the theorem we need the following simple but interesting

LEMMA 1. If $n$ and $l$ are natural numbers, we can write the number $n$ uniquely in the form

\[ n = \binom{a_l(n,l)}{l} + \binom{a_{l-1}(n,l)}{l-1} + \cdots + \binom{a_{t(n,l)}(n,l)}{t(n,l)}, \tag{1} \]

where $t(n,l) \ge 1$, $a_l > a_{l-1} > \cdots > a_{t(n,l)}$ are natural numbers and $a_i(n,l) \ge i$ ($i = t(n,l), t(n,l)+1, \ldots, l$).
PROOF. The existence of form (1) is proved by induction over $l$. For $l = 1$ the statement is trivial. Assume that for $l = k - 1$ it is true also and prove for $l = k$. Let $a_k$ be the maximal integer satisfying the inequality

\[ \binom{a_k}{k} \le n. \]

If here equality holds, we are ready. If it does not, using the induction hypothesis we have for the number $n - \binom{a_k}{k}$ the following expression:

\[ n - \binom{a_k}{k} = \binom{a_{k-1}}{k-1} + \cdots + \binom{a_t}{t}, \tag{2} \]

where $t \ge 1$, $a_{k-1} > \cdots > a_t$, $a_i \ge i$ ($i = t, t+1, \ldots, k-1$). (2) gives an expression for $n$; we have to verify only $a_k > a_{k-1}$ and $a_k \ge k$. If $a_k \le a_{k-1}$ held, then

\[ n \ge \binom{a_k}{k} + \binom{a_{k-1}}{k-1} \ge \binom{a_k}{k} + \binom{a_k}{k-1} = \binom{a_k + 1}{k} \]

would hold also, which contradicts the choice of $a_k$. On the other hand, $a_k \ge k$ follows from $a_k > a_{k-1}$ and $a_{k-1} \ge k - 1$.

The unicity of form (1) is proved also by induction over $l$. For $l = 1$ the statement is trivial. Assume that for $l = k - 1$ it is also true and prove for $l = k$. If, on the contrary, there exist two forms:

\[ n = \binom{a_k}{k} + \cdots + \binom{a_t}{t} = \binom{a'_k}{k} + \cdots + \binom{a'_{t'}}{t'}, \]

we may separate two different cases. If $a_k = a'_k$, we can obtain two different forms of $n - \binom{a_k}{k}$, which contradict our induction hypothesis. If $a_k < a'_k$, the contradiction follows from

\[ n \le \binom{a_k}{k} + \binom{a_k - 1}{k-1} + \cdots + \binom{a_k - k + 1}{1} = \binom{a_k + 1}{k} - 1 < \binom{a_k + 1}{k} \le \binom{a'_k}{k} \le \binom{a'_k}{k} + \cdots + \binom{a'_{t'}}{t'} = n. \]
Thus we proved the lemma. In the future we will use the following two notations:

$E_l(n) = \binom{a_l(n,l)-1}{l-1} + \binom{a_{l-1}(n,l)-1}{l-2} + \dots + \binom{a_{t(n,l)}(n,l)-1}{t(n,l)-1}$

and

$F_l(n) = \binom{a_l(n,l)}{l-1} + \binom{a_{l-1}(n,l)}{l-2} + \dots + \binom{a_{t(n,l)}(n,l)}{t(n,l)-1}$.

These numbers are uniquely determined by Lemma 1.
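The representation of Lemma 1 can be computed greedily, exactly as in the existence part of the proof. The following is a minimal sketch, not part of the paper; the function names `cascade`, `E` and `F` are chosen here for illustration only.

```python
from math import comb

def cascade(n, l):
    """Greedy representation of Lemma 1.

    Returns [(a_l, l), (a_{l-1}, l-1), ..., (a_t, t)] with
    n == sum(comb(a, i)) and a_l > a_{l-1} > ... > a_t >= t >= 1.
    """
    if n < 1 or l < 1:
        raise ValueError("n and l must be natural numbers")
    terms = []
    i = l
    while n > 0:
        a = i                        # comb(i, i) == 1 <= n
        while comb(a + 1, i) <= n:   # take the largest a with comb(a, i) <= n
            a += 1
        terms.append((a, i))
        n -= comb(a, i)
        i -= 1                       # for i == 1 the remainder is always used up
    return terms

def E(n, l):
    """E_l(n): sum of comb(a_i - 1, i - 1) over the representation."""
    return sum(comb(a - 1, i - 1) for a, i in cascade(n, l))

def F(n, l):
    """F_l(n): sum of comb(a_i, i - 1) over the representation."""
    return sum(comb(a, i - 1) for a, i in cascade(n, l))

print(cascade(15, 3), E(15, 3), F(15, 3))
```

For n = 15 and l = 3 the representation is 15 = C(5,3) + C(3,2) + C(2,1), so t(15, 3) = 1.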
Let us now consider the problem. Let H be a finite set with h elements, and $\mathcal{A} = \{A_1, \dots, A_n\}$ a system of different subsets of H, where the number of elements of $A_i$ is

$|A_i| = l$ $(i = 1, \dots, n)$.

Obviously, l is a fixed integer between 1 and h. Let $c(\mathcal{A})$ denote the following system:

$c(\mathcal{A}) = \{B : |B| = l - 1 \text{ and } B \subset A_i \text{ for at least one } i\}$.

The problem is to determine the minimum of $|c(\mathcal{A})|$, if h, n and l are given. Theorem 1 gives the exact solution of this problem.

THEOREM 1. Let h, n and l be given integers with the properties

$h \ge 1$, $1 \le l \le h$ and $1 \le n \le \binom{h}{l}$.

If H is a set of h elements, and

$\mathcal{A} = \{A_1, \dots, A_n\}$, $|A_i| = l$ $(i = 1, \dots, n)$

a system of different subsets of H, then

$\min |c(\mathcal{A})| = F_l(n)$,

where the minimum runs over all such systems $\mathcal{A}$.
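For very small h, Theorem 1 can be checked by exhaustive search. The sketch below is only an illustration under that assumption: it compares the brute-force minimum of $|c(\mathcal{A})|$ with $F_l(n)$, and also builds a system by the recursive construction used in Point 1 of the proof that follows (all l-subsets of an $a_l$-element set, plus a recursively built system of (l-1)-sets, each extended by one new element). The names `shadow`, `min_shadow` and `minimizing_system` are hypothetical helpers, and sets are represented as sorted tuples.

```python
from itertools import combinations
from math import comb

def F(n, l):
    """F_l(n), computed from the greedy representation of Lemma 1."""
    total, i = 0, l
    while n > 0:
        a = i
        while comb(a + 1, i) <= n:
            a += 1
        total += comb(a, i - 1)
        n -= comb(a, i)
        i -= 1
    return total

def shadow(family):
    """c(A): every (l-1)-subset of some member of the family."""
    return {B for A in family for B in combinations(A, len(A) - 1)}

def min_shadow(h, n, l):
    """Brute-force minimum of |c(A)| over all systems of n distinct l-subsets."""
    all_sets = list(combinations(range(h), l))
    return min(len(shadow(fam)) for fam in combinations(all_sets, n))

def minimizing_system(n, l):
    """A system of n distinct l-sets built by the recursive construction."""
    if l == 1:
        return [(x,) for x in range(n)]
    a = l
    while comb(a + 1, l) <= n:       # a = a_l(n, l)
        a += 1
    system = list(combinations(range(a), l))
    rest = n - comb(a, l)
    if rest > 0:
        # the new element is called `a`; it is larger than every old element
        system += [N + (a,) for N in minimizing_system(rest, l - 1)]
    return system

for n in range(1, 11):
    print(n, min_shadow(5, n, 3), F(n, 3), len(shadow(minimizing_system(n, 3))))
```

The exhaustive search is only feasible for tiny parameters (here at most 10 candidate sets), which is why the loop stays at h = 5.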
REMARK. It is interesting that $\min |c(\mathcal{A})|$ does not depend on h. For example, SPERNER'S estimation [1]:

$|c(\mathcal{A})| \ge \frac{n \cdot l}{h - l + 1}$

depends on h.

Before the proof we shall give another theorem. We will prove them together.

THEOREM 2. Let h, n and l be given integers with the properties

$h \ge 1$, $1 \le l \le h$ and $\binom{h}{l} \le n \le 2\binom{h}{l}$.

Further G and H are disjoint sets of h elements. If

$\mathcal{A} = \{A_1, \dots, A_n\}$

is a system of $A_i$'s, where

$A_i \subset G$ or $A_i \subset H$ $(1 \le i \le n)$

and

$|A_i| = l$ $(1 \le i \le n)$,

then

$\min |c(\mathcal{A})| = \binom{h}{l-1} + F_l\left(n - \binom{h}{l}\right)$.
PROOF. 1. First we construct the minimizing system of Theorem 1. Denote this system by $\mathcal{M}(h, n, l)$. Obviously, it is sufficient to construct the system $\mathcal{M}(a_l^*(n), n, l)$, where $a_l^*(n)$ is the least integer satisfying

$\binom{a_l^*(n)}{l} \ge n$.

The construction will be carried out by induction over l. If l = 1, $a_1^*(n) = n$ and $\mathcal{M}(a_1^*(n), n, 1)$ consists of all the sets of one element. Assume we constructed already the system $\mathcal{M}(a_{l-1}^*(n), n, l-1)$ for all n. Construct now $\mathcal{M}(a_l^*(n), n, l)$. If $n = \binom{a_l(n,l)}{l}$, then the minimizing system consists of all the subsets having l elements. If $n > \binom{a_l(n,l)}{l}$, let H be a set of $a_l^*(n) = a_l(n,l) + 1$ elements, and e an element of H. Since $a_l > a_{l-1}$, we can construct the system $\mathcal{M}\left(a_l(n,l),\ n - \binom{a_l(n,l)}{l},\ l-1\right)$ on $H - \{e\}$ by the induction hypothesis. Define the system $\mathcal{N}$ in the following manner:

$\mathcal{N} = \left\{N \cup \{e\} : N \in \mathcal{M}\left(a_l(n,l),\ n - \binom{a_l(n,l)}{l},\ l-1\right)\right\}$.

If $\mathcal{B}$ denotes the system of all subsets of $H - \{e\}$ having l elements, then $\mathcal{B}$ and $\mathcal{N}$ form together the system $\mathcal{M}(a_l^*(n), n, l)$. Indeed, the number of sets is

$\binom{a_l(n,l)}{l} + n - \binom{a_l(n,l)}{l} = n$,

and we have only to verify

(4) $|c(\mathcal{M}(a_l^*(n), n, l))| = \binom{a_l(n,l)}{l-1} + \dots + \binom{a_{t(n,l)}(n,l)}{t(n,l)-1} = F_l(n)$.

However, it is easy to see that

$|c(\mathcal{M}(a_l^*(n), n, l))| = \binom{a_l(n,l)}{l-1} + \left|c\left(\mathcal{M}\left(a_l(n,l),\ n - \binom{a_l(n,l)}{l},\ l-1\right)\right)\right|$

and by the induction hypothesis

$\left|c\left(\mathcal{M}\left(a_l(n,l),\ n - \binom{a_l(n,l)}{l},\ l-1\right)\right)\right| = \binom{a_{l-1}(n,l)}{l-2} + \dots + \binom{a_{t(n,l)}(n,l)}{t(n,l)-1}$,

which proves (4).

2. The minimizing system of Theorem 2 consists of a complete system in G, and $\mathcal{M}\left(h,\ n - \binom{h}{l},\ l\right)$ in H.

3. In the previous two points we showed that in the case of Theorem 1

$\min |c(\mathcal{A})| \le F_l(n)$,

and in the case of Theorem 2

$\min |c(\mathcal{A})| \le \binom{h}{l-1} + F_l\left(n - \binom{h}{l}\right)$.

Thus, it is sufficient to verify

(5) $|c(\mathcal{A})| \ge F_l(n)$

and

(6) $|c(\mathcal{A})| \ge \binom{h}{l-1} + F_l\left(n - \binom{h}{l}\right)$,
respectively. These statements will be proved by induction over l. If 1= 1, both statements are trivial. Assume we have proved for all numbers < I and prove for l. 4. First we prove the inequality (7)
if (8)
are integers, and (9)
The statement will be proved for fixed l and for every n, $n_1$, $n_2$ using the induction hypothesis for l - 1. For the sake of simplicity we use the following notations:

$t = t(n, l)$, $a_i = a_i(n, l)$ $(t \le i \le l)$, $a_l^* = a_l^*(n)$,
$r = t(n_1, l)$, $b_i = a_i(n_1, l)$ $(r \le i \le l)$, $b_l^* = a_l^*(n_1)$,
$s = t(n_2, l-1)$, $c_i = a_i(n_2, l-1)$ $(s \le i \le l-1)$, $c_{l-1}^* = a_{l-1}^*(n_2)$.

It follows from (8) and (9) that

(10) $n_1 \ge n - E_l(n) = \binom{a_l - 1}{l} + \dots + \binom{a_t - 1}{t}$.

Because of (10)

(11) $b_l \ge a_l - 1$

must hold, since in the contrary case it would be

$n_1 < \binom{a_l - 1}{l}$,

which contradicts (10). On the other hand

(12) $b_l \le a_l$
because of (8). Applying (11) and (12) we can distinguish two different cases: (a) bl = al and (b) bl = al - 1. (a) In this case (7) has the form ( al
ll-1
)
+ ( ai-I) + ... + ( l-2
at
t-l
):s::.
Decreasing both sides by ( al ) we have l-1
(13)
Let H i andH2 be disjoint sets. Construct the system..£ (b
1) on HI and the system..£(et_I' n
1-
2,
1) on IJ
1-
2•
l-I+ 1, ni _
(~l) ,
In this manner we
obtain a system Jon HI U H 2 • Applying the induction hypothesis (Point 3. (5)) for J and 1 - 1 we have
F1_I(n (14)
= Ie (..£ (b l-
I
(~l)) ~le(J)I=
+ 1, n
i -
(an, l- 1)) I + le(..£(e~_I' n
2,
1- 1»1·
However, we know (Point 1. (4)) that (15)
Ie
(Jf (bl- + I
l,nl -
(a;), 1- 1)) 1= Fl- (ni _I~l)) I
and (16)
Finally, (13) follows from (14), (15), and (16). (b) bl =
al-
l. We separate this case into two subcases: (ba)
n L (a1-11) , 2
(bb)
l -
n < (aI - I1) . 2
l -
(ba) In this case (7) has the form
(1 all) + (t-12)+ ... + (t at 1) ~ (~l-=: )+(lb~12)+'" since
el- 1= al -
+ (r b, 1) +
+ (~l-=- 21) + (/1-23)+ ... + (8 ~ 1) , 1, because of (9) and the supposition (ba). Decreasing
l- 1) + (ai-I) al ) (al-1 1-2
both sides by ( = 1-1
we have
We can prove (17) by using of the induction hypothesis if (18) holds. However (9) gives
- 1) + (el-2 ) + ... + (e ~ (a l- 1) + tal1-1 1-2 1-1 + (a l-1- 1) + ... + (at - 1) . 1-2 t-l s)
(19)
8
Decreasing both sides by
(a1-11) we l -
obtain
V1-22) + ... + (C;) ~ (al~~~I) + .. , + (:t-=-ll)
(20)
and (20) is equivalent to (18). (bb) In this case (7) has the form
l- 1) + al ) + (1-2 al-1) + ... + (t at- l ) ~ (a1-1 (1-1 Decreasing both sides by
13
1-11) we have
(a
l -
Let G and H be two disjoint sets of al
...4' (a l - 1, nl ::<:::::
(a l -
(a l ~ 1J ' I -
-
1 elements. Construct the system
1)- We can it construct if n 1 _
(a l ~ 1 ) ::<:::::
1) . But this follows from al - l = bl > bl-v since nl - (a ~ 1) = (/1-11 )+ ... + (;).
I-I
l
Construct further the system ...4'(CT-l' n2 , 1- 1) on H. The possibility of this construction follows from the assumption (bb). In this manner we obtain a system f on G U H. Applying the induction hypothesis (Point 3. (6)) for fand 1- 1 we have
(22)
+ Ic(...4'(cT_l> n
2,
1- 1)
I.
However, we know (Point I. (4)) that (23)
and (24) further, (2l) follows from (22), (23) and (24). Thus we proved the inequality for I. 5. However, we need (7) under the condition
n·l
(25)
n2::<:::::-
aT
instead of (9). Thus we are going now to prove the inequality
n·l
(26)
-::<:::::EI(n).
aT
We prove (26) by induction over l, but we should like to mention that the proof of (26) is independent from the whole proof of the theorems. For 1 = l the statement is trivial. Assume we proved it for the integers < I,
aT = al and EI(n) = (~I-=-II) , thus holds with equality. We may assume aT = al + l. Obviously
and prove for I. If n=
(a;)
then
(27)
388
(26)
195
A TJlEonEM OP lILNITE SBTS
and by the induction hypothesis (28)
I- I
al _ 1 + 1
a'_1 )+ ... + (a')] :5: (a 1 r H
[( I - 1
-
2
I)
+ ... + (a, -
I) .
r- 1
I 1- I . . , summarlzmg (27) and (28) we obtain (26). In the If - --,:5: a/+ 1 al_1 + 1 contrary case I I-I
-- > - -
(29)
a/+ 1
holds because of al
~ al_ l
a/
+ 1. Let us set out from the identity
a, ) (I 1- I) (a,- I) I (a,) (1-1 a/+l ---;;;- = 1-1 - a/+l 1 .
The expression in the bracket is positive because of (29), thus we can write
H )+ + (a,)] (_I _£=..!.) < (a, - I) __ a'-1 )+ (a1-2 I [a,) [( 1-1 .. . r a/+l a/ 1-1 a/+l l '
. Since
al
I - I instead of -I -- I , an d reord er the ine> a/- I. W·r~te --,--.,.
quality
a'_l
+1
_I [(a,)+ ( a, +1
I
aH )
I- I
+ a,~~: I
al
+ ... + (a,)] < r
(a, - I) + I-I
[It-ill + .. . + (:]].
Finally, from the above inequality (26) follows by (28).

6. Now let us prove statement (5) for l by induction over h. If h = l, it is trivial. Assume we have proved (5) for all sets |H| < h, and prove for h. There exists an element e of H, contained by at most $\frac{n \cdot l}{h}$ sets $A_i$. We define the following systems:

$\mathcal{B} = \{A : A \in \mathcal{A},\ e \notin A\}$

and

$\mathcal{C} = \{A - \{e\} : A \in \mathcal{A},\ e \in A\}$,

where

(30) $n_2 = |\mathcal{C}| \le \frac{n \cdot l}{h} \le \frac{n \cdot l}{a_l^*(n)}$.

Naturally,

$c(\mathcal{B}) \subset c(\mathcal{A})$ and $c(\mathcal{C})(\cup)e \subset c(\mathcal{A})$,

where $\mathcal{D}(\cup)a$ denotes in general the system $\{D \cup \{a\} : D \in \mathcal{D}\}$. Thus the inequality

(31) $|c(\mathcal{A})| \ge |c(\mathcal{B})| + |c(\mathcal{C})|$

holds. However, $\mathcal{B}$ is a system in $H - \{e\}$, so we may apply the induction hypothesis for h - 1:

(32) $|c(\mathcal{B})| \ge F_l(n - n_2)$.

Further, applying the induction hypothesis for l - 1 we obtain

(33) $|c(\mathcal{C})| \ge F_{l-1}(n_2)$.

It follows from (31), (32) and (33) that

(34) $F_l(n - n_2) + F_{l-1}(n_2) \le |c(\mathcal{A})|$.
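Using the result of Point 5, inequality (5) follows from (34) and (7) by (30), since (7) is proved already for l.

7. Now prove statement (6) for l by induction over h. If h = l, it is trivial. Assume we have proved (6) for all sets |G| = |H| < h, and prove for h. The proof will be similar to the proof of the previous point. Let $\mathcal{A}_1$ and $\mathcal{A}_2$ be given by

$\mathcal{A}_1 = \{A : A \in \mathcal{A},\ A \subset G\}$

and

$\mathcal{A}_2 = \{A : A \in \mathcal{A},\ A \subset H\}$.

If $|\mathcal{A}_1| = r$ and $|\mathcal{A}_2| = s$, there are two elements $e \in G$ and $f \in H$, such that e is contained by at most $\frac{r \cdot l}{h}$, and f is contained by at most $\frac{s \cdot l}{h}$ sets $A_i$. Define the following systems:

$\mathcal{B} = \{A : A \in \mathcal{A},\ e \notin A,\ f \notin A\}$,

$\mathcal{C}_1 = \{A - \{e\} : A \in \mathcal{A}_1,\ e \in A\}$

and

$\mathcal{C}_2 = \{A - \{f\} : A \in \mathcal{A}_2,\ f \in A\}$,

where

(35) $r_2 = |\mathcal{C}_1| \le \frac{r \cdot l}{h}$

and

(36) $s_2 = |\mathcal{C}_2| \le \frac{s \cdot l}{h}$.

Naturally,

$c(\mathcal{B}) \subset c(\mathcal{A})$, $c(\mathcal{C}_1)(\cup)e \subset c(\mathcal{A})$ and $c(\mathcal{C}_2)(\cup)f \subset c(\mathcal{A})$.

Thus the inequality

(37) $|c(\mathcal{A})| \ge |c(\mathcal{B})| + |c(\mathcal{C}_1)| + |c(\mathcal{C}_2)| = |c(\mathcal{B})| + |c(\mathcal{C}_1 \cup \mathcal{C}_2)|$

holds. However, $\mathcal{B}$ is a system in $G \cup H - \{e\} - \{f\}$, so we may apply our induction hypothesis for h - 1:

(38) $|c(\mathcal{B})| \ge \binom{h-1}{l-1} + F_l\left(n - r_2 - s_2 - \binom{h-1}{l}\right)$.

Further, applying the induction hypothesis for l - 1 we obtain

(39) $|c(\mathcal{C}_1 \cup \mathcal{C}_2)| \ge \binom{h-1}{l-2} + F_{l-1}\left(r_2 + s_2 - \binom{h-1}{l-1}\right)$.

It follows from (37), (38) and (39) that

(40) $\binom{h-1}{l-1} + F_l\left(n - r_2 - s_2 - \binom{h-1}{l}\right) + \binom{h-1}{l-2} + F_{l-1}\left(r_2 + s_2 - \binom{h-1}{l-1}\right) \le |c(\mathcal{A})|$.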
Now we should like to use inequality (7) which is valid under condition (25) (Point 5). For this reason we have to verify only
r2
+
82 -
1)
h( I- 1
(41)
[n - r2
-_ 8 2 _
~
(h - 1) + r + 2
1
• (
l-
h )
1)] 1
1
at n - ( )
In- (':}]-l aT (n -
However
1
(~J)
[n - (~)ll
(42)
h
is an immediate consequence of (35) and (36). Since n < 2 of Theorem 2,
(h -
82 _
(~) is a condition
aT (n - (~)) ~ h holds and (42) results (41). Thus we can use
(7) for this case:
Finally, (40) and (43) give the desired inequality, and the whole proof is finished.

Now we consider a natural generalization of the problem of Theorem 1. The problem is to determine the minimum of $|c^k(\mathcal{A})|$, where $1 \le k \le l$, $c^k(\mathcal{A}) = c(c^{k-1}(\mathcal{A}))$ and $c^1(\mathcal{A}) = c(\mathcal{A})$. It is not difficult to conjecture what the result is. For the theorem we need the following notation:

$F_l^k(n) = \binom{a_l(n,l)}{l-k} + \binom{a_{l-1}(n,l)}{l-1-k} + \dots + \binom{a_{t(n,l)}(n,l)}{t(n,l)-k}$ $(1 \le k \le l)$,

where $\binom{a}{b} = 0$ if b < 0.

THEOREM 3. Let h, n, l and k be given integers with the properties

$h \ge 1$, $1 \le k \le l \le h$ and $1 \le n \le \binom{h}{l}$.

If H is a set of h elements and

$\mathcal{A} = \{A_1, \dots, A_n\}$, $|A_i| = l$ $(i = 1, \dots, n)$

a system of different subsets of H, then

$\min |c^k(\mathcal{A})| = F_l^k(n)$,

where the minimum runs over all such systems $\mathcal{A}$.

PROOF. It is easy to see by induction over l that

$|c^k(\mathcal{M}(h, n, l))| = F_l^k(n)$.

Thus, we have to prove only

(44) $|c^k(\mathcal{A})| \ge F_l^k(n)$.
This will be proved by induction over k. For k = 1 Theorem 3 gives Theorem 1. Assume now (44) is true for values smaller than k, and prove for k. Obviously, holds and using the induction hypothesis and Theorem 1 we obtain (45)
(a) If t(n, 1) - (k - 1)
>
0, then
F~-l(n) = ( a/(n, 1)
1- (k - 1)
)
+ ... + (
at(n,l) (n, 1) ) t(n,l) - (k -1)
is an expression oftype (1). That is (46)
+ 1) = t(n, 1) - Ie + 1 al+k-l(n, 1) (t(n,l) - Ie + 1::;;: i::;;: 1- Ie + 1)
t(F~-l(n), 1- Ie
at (F¥-l(n) , 1- Ie + 1) =
and
'~+I
F ,_ k+1(Ff-1(n»=
(al(Ff-l~n),1-le+l1 =
l=t(n,I)-k+1
(47)
']+1
=
(al+~_l(n,I»)= ~
I=t(n,I)-k+ 1
1
-
~
i
1
-
I
(a~(n,l»)=Ff(n), 1 - Ie
j=t(n,l)
which proves (44) and (45). (b) If t(n, 1) - Ie
+l
::;;: 0, then (46) does not hold. However in this case
F~-l(n) _ 1 = ( a/(n, 1) ) 1-le+l
and
t(F~'-l(n)
+ ... + (ak(n,l») 1
- 1, 1- Ie + 1» = 1
a/(Ff-1 (n) - 1,1- Ie + 1) = al+k-l(n, 1) hold. Further, the equation
':i
F Z_ k+1(Ff- 1 (n) _ 1) = (48)
l
(a l (Ff- 1(n) .-1, 1- Ie + 1») = ~
1=1
=
']"1 (al+~_l(n,l») = 1=1 t - 1
i
j=k
(1::;;: i::;;: 1- Ie + 1)
(a!(n,l») 1 - Ie
=
i
1
-
(~j(n»)
j=t(n,l)
1 - Ie
= Ff(n)
is true in this case instead of (47). If we prove
(49) $F_{l-k+1}(F_l^{k-1}(n)) = F_{l-k+1}(F_l^{k-1}(n) - 1)$,

then (44) follows from (45), (49) and (48). (49) will be proved by the following simple lemma.

LEMMA 2. If t(m, r) = 1, then

$F_r(m + 1) = F_r(m)$.

PROOF.
Let s $(2 \le s \le r)$ be the least index such that $a_s(m, r) > a_{s-1}(m, r) + 1$. If there is no such s, let s be equal to r + 1. Thus, we can write

$m = \binom{a_r(m,r)}{r} + \dots + \binom{a_s(m,r)}{s} + \binom{a_{s-1}(m,r)}{s-1} + \binom{a_{s-1}(m,r) - 1}{s-2} + \dots + \binom{a_{s-1}(m,r) - (s-2)}{1}$

and

$m + 1 = \binom{a_r(m,r)}{r} + \dots + \binom{a_s(m,r)}{s} + \binom{a_{s-1}(m,r) + 1}{s-1}$.

Now it is not difficult to see that

$F_r(m) = \binom{a_r(m,r)}{r-1} + \dots + \binom{a_s(m,r)}{s-1} + \binom{a_{s-1}(m,r)}{s-2} + \binom{a_{s-1}(m,r) - 1}{s-3} + \dots + \binom{a_{s-1}(m,r) - (s-2)}{0}$

$= \binom{a_r(m,r)}{r-1} + \dots + \binom{a_s(m,r)}{s-1} + \binom{a_{s-1}(m,r) + 1}{s-2} = F_r(m + 1)$,

which proves the lemma and Theorem 3.
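Both Lemma 2 and Theorem 3 lend themselves to a numerical spot check. The sketch below is an illustration only: `cascade`, `F_k`, `iterated_shadow` and `min_iterated_shadow` are names introduced here, sets are sorted tuples, and the exhaustive search is restricted to a ground set of five elements.

```python
from itertools import combinations
from math import comb

def cascade(n, l):
    """Representation of Lemma 1 as [(a_l, l), ..., (a_t, t)]."""
    terms, i = [], l
    while n > 0:
        a = i
        while comb(a + 1, i) <= n:
            a += 1
        terms.append((a, i))
        n -= comb(a, i)
        i -= 1
    return terms

def F_k(n, l, k=1):
    """F_l^k(n); terms with a negative lower index count as 0."""
    return sum(comb(a, i - k) for a, i in cascade(n, l) if i >= k)

def iterated_shadow(family, k):
    """c^k(A): apply the shadow operation k times."""
    family = set(family)
    for _ in range(k):
        family = {B for A in family for B in combinations(A, len(A) - 1)}
    return family

def min_iterated_shadow(h, n, l, k):
    """Brute-force minimum of |c^k(A)| over all systems of n distinct l-sets."""
    all_sets = list(combinations(range(h), l))
    return min(len(iterated_shadow(fam, k)) for fam in combinations(all_sets, n))

# Lemma 2: if the representation of m reaches index 1, F_r(m + 1) = F_r(m).
for m in range(1, 12):
    if cascade(m, 3)[-1][1] == 1:
        print(m, F_k(m, 3), F_k(m + 1, 3))
```

With k = 1 the function `F_k` reduces to $F_l(n)$, so the same helper covers Theorem 1 as the special case.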
§ 3. Solution of an Erdős problem

Let H be a finite set of h elements, and $\mathcal{A} = \{A_1, \dots, A_n\}$ a system of subsets of H:

$|A_i| = l$ $(1 \le i \le n)$.

ERDŐS proposed the following problem. For which numbers n can we construct a system $\mathcal{B} = \{B_1, \dots, B_n\}$ of different sets with the properties

$B_i \subset A_i$, $|B_i| = l - k$ $(1 \le i \le n)$?

In the solution we use the well-known marriage problem. It is clear in this connection that it is a very important question in which cases $F_l^k(n) < n$, $F_l^k(n) = n$ or $F_l^k(n) > n$ holds. The following sequence of lemmas deals with this problem.

LEMMA 3. If $1 \le k \le l$ and x are positive integers, then

$f(x) = \binom{x}{l-k} - \binom{x}{l}$

is a monotone increasing function between l and 2l - k - 2 but it is a monotone decreasing function from 2l - k - 1. The values f(2l - k - 2) and f(2l - k - 1) are equal.

PROOF. Let $0 \le a < b \le x$ be integers. It is easy to see that

$\binom{x}{a} - \binom{x}{b} < 0$, $\binom{x}{a} - \binom{x}{b} = 0$ and $\binom{x}{a} - \binom{x}{b} > 0$,

respectively, if a + b < x, a + b = x and a + b > x, respectively. Consider the difference

$f(x + 1) - f(x) = \binom{x}{l-k-1} - \binom{x}{l-1}$.

Using the above remark we obtain that

$f(x + 1) - f(x) < 0$ if $2l - k - 2 < x$,

$f(x + 1) - f(x) = 0$ if $2l - k - 2 = x$,

and finally,

$f(x + 1) - f(x) > 0$ if $2l - k - 2 > x$.

This completes the proof. The following two lemmas are immediate consequences of Lemma 3.
3a. III :s:: k :s:: 1 and x are positive intf!gers, then
LEMMA
3b. II 1
< k < 1 and x > 21 -
+ 1 are positive integers,
k
1< k :s:: m, then "i [(2i --: k - 1) _ (2i - ~ - I)]:s:: (2m -
LEMMA 4.
II
~ -
i=k
PROOF.
then
k
k-
m - k
~
Let a and b be positive integers, where
1) _ (2m !!.- < b < a 2
km
I}. I
- 1. Then
and similarly
2)
2)
2) [1- a ~~] = b+2
(a + _(a + = (a + b+I b+2 b+l. Further (51)
(52)
2)
(a + [2b-a+ 1]. b+1 b+2
(a+2)[2b-a+I]=(a)[2b-a+I].[ (a+2)(a+l) ], b+ I b+ 2 b b+ I (b + 2) (a - b + I)
where and
a+2 a+2 a-b+l-a 1
----->--=2.
-+ 2
That is (53)
a
follows from (50), (51) and (52). Applying (53) for = 2i = obtain
or
k- 1, and b= i-I,
(l:S;:
k:s;: i),
we
~ 1) -k-l) _(2(i +.1) -k-l) ~2[(2i~k -1) _(2i-~ -1)]. (2(i '+I-k z-k , ~+1
(54) Prove now the lemma by induction over m. H m = k, the statement is trivial. Let the lemma be true for m and prove it for m + 1.
i[(2i~k-l) i=k
, -
k
_(2i-.k-l)]= ~1[(2i~k-l)_(2i-~-I)]+ ~
i=k
~
-
k
and by induction hypothesis and (54)
i[(2i~k-l) i=k
" -
k
_ (2i-~-I)]~2[(2m-k-l) _(2m-k-l)]~ mm [2(m + 1) - k- 1) _ (2(m + 1) - k-I} l m+l-k m+l "
~
k
holds, which proves Lemma 4. LEMMA
(55) PROOF.
"
< a/(n, I) then FNn) < n.
5. If l :s;: k :s;: I and 21 - k We may use Lemma 3b:
On the other hand, by Lemma 3a
holds and summarizing it we obtain
:i l-(a./(n, I)) _ (a/(~, 1))] ~ :i [(2i ~ k- 1) _(2i - ~ - 1)]. k
" -
/=k
"
" -
/=k
Applying now Lemma 4 and (54):
1)) _ (a/(~, 1))] ~ (21- k-
~[(a!(n,
<
(2(1
Obviously.
(~7)
!i l·(a!(n, 1
/=1
" -
1- k
".
k
" -
i=k
+ 1) -
k
"
1) _(21- k- I)' < 1
1) _ (2(1 + 1) - k - 1) .
kl+l-k
1+1
+ 1) - k - 1) _ (2(1 + 1) - k - 1"1 +1- k 1+ 1
1)) _ (ai(~' 1))] < (2(1 k" 1
I
also holds, since we added a nonpositive number to the left side. If we sum (56) and (57) the obtained inequality Ff(n) _ n
<
(21- k+ 1-k
1) _ (21- k + 1) + (2(1 + 1) - k - 1\_ 1
_ (2(1
6. If l
~
> a/(n, Ff(n) > n.
k ~ 1 and 21 - k
(58) PROOF.
k -1)
=
0
_1+ 1
results (55). LEMMA
+ 1) -
l+l-k
1) then
We know that
(59) If 1 > i ~ a/(n, 1) - (1- k), then a/(n,l) - (1- i) a/(n,l)
~
~
2i - k and by (59)
2i - k
holds. In this case, obviously
(:/(n, 2) - (a/(:, 1)) ~ 0
(60)
follows. If k
~
i
<
a/(n, 1) -
(1 - k), then by (59) and Lemma 3
holds, but it is trivially true for i
<
k, too.
Sum (60) and (61)
:ir(~;(n'l») _ (a;(~'l»)]:za,(n'I~+k-I[(al(n'~)-(l-i»)_ ;=1
l ~-k
~
_ (al(n, 1) - (1- i»)]
=
i
k
+ k) _ (2a/(n, 1) -
(2a/(n, 1) - 21 a/(n,l) - 1 - 1
That is (62)
~-
;=1
F~(n) _ n:z (2a/(n, 1) - 21 +
a/(n, 1) -1
k) _ ( 2a/(n,1) -
a/(n, 1) -1
a/(n,l) - 1 - 1
21
21 + k ) + 1. +k - 1
+ k) + 1 +
+k-
1
s true. Here (63)
(
+ k) _ ( 2a/(n, 1) -
2a/(n, 1) - 21 a/(n,l) - 1 - 1
a/(n,l) - 1 (
21
+ k) :z (
+k -
a/(n,l) - 1 a/(n,l) - 1
+k -
a/(n,l) - 1 )_ \ a/(n, 1) - 1 - 1
1 ) 1
because of Lemma 3. However we can write the right hand side of (63) in the form (64)
Here
and since 21 - k - 1 (65)
(a/(n,
>
a/(n, 1) -
1 by supposition of the lemma, thus
i-I) _(a/(;,~;- 1) :z (a/(~,
Finally, (65), (64) (63) and (62) give which proves our lemma.
F~(n) - n
398
>1
1») -
(a/n,
2) .
205
A THEOREM OF FINITE SETS
7.
LEMMA
If
(66)
n> (21~k)
+ (2(1~~)I-k) + ... + (~),
then
F~(n)
On the other hand, if (67)
n
s;:
< n.
(21 ~ k) + (2(1 ~~\- k) + ... + (~),
then with equality only if
(68) for 80me
n 8
(k
=
(21 ~ k) + (2(1 ~~)1- k) + ... + (28 ~ k)
s;: 8 s;: I).
PROOF. Consider first the case of (66). If ai(n, 1) = then t(n, I) < k and n=
2i -
k (k
s;: i s;: I),
(21 - k) + ... + (k) + (k - 1) + ... + (t(n,I)J' .
Obviously,
I
Tc - 1
k
k Fdn) =
thus FNn) < n holds. In the contrary case
(21-I k) + ... + (k)k ' >
a,(n, I)
ai(n, I) =
hold for some r (k
s;: r <
t(n, I)
2r - k
2i -
k
I). Since
the statement follows by Lemma 5:
The case (67) may occur in two different ways. 1. If (68) holds, then obviously FNn) = n For some r (k < r s;: I), a,(n, I) < 2r - k,
2.
399
(r
< is;: I)
206
G. KATONA
and (r
Since
< i:::;;' 1) .
the statement follows by Lemma 6:
THEOREM 4. Let $1 \le k \le l \le h$ be positive integers, H a set of h elements and

$\mathcal{A} = \{A_1, \dots, A_n\}$, $|A_i| = l$

a system of subsets of H. If

(69) $n \le \binom{2l-k}{l} + \binom{2(l-1)-k}{l-1} + \dots + \binom{k}{k}$,

there exists a system

(70) $\mathcal{B} = \{B_1, \dots, B_n\}$ of different sets with $B_i \subset A_i$, $|B_i| = l - k$ $(1 \le i \le n)$,

but in the case of

(71) $n > \binom{2l-k}{l} + \binom{2(l-1)-k}{l-1} + \dots + \binom{k}{k}$

not necessarily.

PROOF. First we prove the latter case. If (71) holds then by Lemma 7 $F_l^k(n) < n$. We know (Theorem 1) that there exists a system $\mathcal{A}$ such that $|c^k(\mathcal{A})| = F_l^k(n)$. Thus, a system $\mathcal{B}$ satisfying (70) does not exist.

In the proof of the existence of $\mathcal{B}$ in the case of (69) we use the well-known marriage problem [2]:

THEOREM OF ORE. Let E and F be disjoint sets and G a graph on $E \cup F$. Assume G has the property that for arbitrary $D \subset E$ there is a set $H \subset F$ such that every element of H is connected with at least one element of D and $|H| \ge |D|$. Then there exists a one-to-one mapping between E and a subset K of F, such that the associated vertices are connected in G.

In our case $E = \mathcal{A}$, $F = c^k(\mathcal{A})$, and $A \in \mathcal{A}$, $B \in c^k(\mathcal{A})$ are connected if and only if $A \supset B$. Thus, it is sufficient to verify that for every subsystem

$\mathcal{C} = \{A_{i_1}, \dots, A_{i_m}\} \subset \mathcal{A}$

there are at least m sets in $c^k(\mathcal{A})$ which are contained in one of the $A_{i_j}$ $(1 \le j \le m)$. However, $m \le n$, thus by (69)

$m \le \binom{2l-k}{l} + \binom{2(l-1)-k}{l-1} + \dots + \binom{k}{k}$

and Lemma 7 gives

(72) $F_l^k(m) \ge m$.

Use now Theorem 1:

$|c^k(\mathcal{C})| \ge F_l^k(m)$.

This and (72) result in $|c^k(\mathcal{C})| \ge m$, which means that our graph has the property prescribed in the theorem used. Applying the theorem, the one-to-one mapping obtained gives just the desired system $\mathcal{B}$.

COROLLARY. If $2l - k \ge h$, then (69) always holds and a system $\mathcal{B}$ satisfying (70) always exists. This is an immediate consequence of the inequality

$\binom{h}{l} \le \binom{2l-k}{l}$

and the fact that $\mathcal{A}$ has at most $\binom{h}{l}$ elements.
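The threshold in Lemma 7 and Theorem 4 is easy to tabulate. The following sketch is an illustration, not part of the paper: it computes the bound of (69) and compares $F_l^k(n)$ with n on both sides of it. The names `F_k`, `threshold` and `equality_points` are introduced here.

```python
from math import comb

def F_k(n, l, k):
    """F_l^k(n) from the greedy representation of Lemma 1."""
    total, i = 0, l
    while n > 0:
        a = i
        while comb(a + 1, i) <= n:
            a += 1
        if i >= k:                    # a negative lower index contributes 0
            total += comb(a, i - k)
        n -= comb(a, i)
        i -= 1
    return total

def threshold(l, k):
    """Right-hand side of (66), (67), (69): sum of C(2i - k, i), i = k..l."""
    return sum(comb(2 * i - k, i) for i in range(k, l + 1))

def equality_points(l, k):
    """The values of n listed in (68): partial sums taken from the top."""
    return {sum(comb(2 * i - k, i) for i in range(s, l + 1))
            for s in range(k, l + 1)}

l, k = 3, 1
T = threshold(l, k)
print("threshold:", T, "equality at:", sorted(equality_points(l, k)))
for n in range(1, T + 4):
    print(n, F_k(n, l, k))
```

For l = 3, k = 1 the threshold is C(5,3) + C(3,2) + C(1,1) = 14, and the first n with $F_3(n) < n$ is n = 15.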
REFERENCES

[1] SPERNER, E.: Ein Satz über Untermengen einer endlichen Menge, Math. Z. 27 (1928) 544-548.
[2] ORE, O.: Graphs and matching theorems, Duke Math. J. 22 (1955) 625-639.

Reprinted from Proceedings of the Colloquium held at Tihany, Hungary, Sept. 1966, Academic Press and Akadémiai Kiadó, Budapest, 1968, pp. 187-207
Reprinted from JOURNAL OF COMBINATORIAL THEORY, Vol. 1, No. 2, September 1966. All Rights Reserved by Academic Press, New York and London. Printed in Italy.

A Short Proof of Sperner's Lemma

Let S denote a set of N objects. By a Sperner collection on S we mean a collection of subsets of S such that no one contains another. In [1], Sperner showed that no such collection could have more than $\binom{N}{[N/2]}$ members. This follows immediately from the somewhat stronger

THEOREM. Let $\Gamma$ be a Sperner collection on S. Then

$\sum_{A \in \Gamma} \binom{N}{|A|}^{-1} \le 1$,

where |A| denotes the cardinality of A.

PROOF. For each $A \subset S$, exactly $|A|!\,(N - |A|)!$ maximal chains of S (as a lattice under set inclusion) contain A. Since none of the N! maximal chains of S meet $\Gamma$ more than once, we have

$\sum_{A \in \Gamma} |A|!\,(N - |A|)! \le N!$,

proving the theorem.
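For a small ground set the inequality of the theorem can be confirmed by enumerating every Sperner collection. The sketch below does this for N = 3 with exact rational arithmetic; the helper names are chosen here for illustration and the enumeration is deliberately naive.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def is_sperner(family):
    """True if no member of the family properly contains another."""
    return not any(A < B or B < A for A, B in combinations(family, 2))

def lym_sum(family, N):
    """Sum over A in the family of 1 / C(N, |A|), as an exact fraction."""
    return sum(Fraction(1, comb(N, len(A))) for A in family)

def all_sperner_collections(N):
    """Every Sperner collection on {0, ..., N-1} (feasible only for tiny N)."""
    subsets = [frozenset(c) for r in range(N + 1)
               for c in combinations(range(N), r)]
    for size in range(len(subsets) + 1):
        for family in combinations(subsets, size):
            if is_sperner(family):
                yield family

N = 3
largest = max(all_sperner_collections(N), key=len)
print(len(largest), lym_sum(largest, N))
```

A full layer of k-element subsets gives equality in the theorem, which is how the bound $\binom{N}{[N/2]}$ on the size of a Sperner collection is attained.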
REFERENCE

1. E. SPERNER, Ein Satz über Untermengen einer endlichen Menge, Math. Z. 27 (1928), 544-548.

D. LUBELL
Systems Research Group, Inc.
1501 Franklin Avenue
Mineola, New York 11501
Sonderabdruck, Fasc. 6
BIRKHÄUSER VERLAG, BASEL UND STUTTGART

Möbius Inversion in Lattices

By HENRY H. CRAPO 1)
1. Introduction. In the development of computational techniques for combinatorial theory, attention has lately centered on ROTA'S theory of Möbius inversion [6]. The main theorem of ROTA'S paper, concerning the computation of the Möbius invariant across a Galois connection, is a prerequisite to the use of lattice-theoretic methods in combinatorics. By suitably combining ROTA'S main theorem with a discrete analogue of integration-by-parts, we here obtain a perfectly general formulation of Möbius inversion across a Galois connection (theorem 3, below). As immediate applications of this theory, we obtain a number of interesting computational results concerning finite lattices (sections 3, 4) and combinatorial geometries (section 5).

2. Möbius Inversion across a Galois Connection. We begin with a restatement and a simplified proof of ROTA'S main theorem. The proof turns on the essential fact that for any (locally finite) ordered set Q with least element 0, the recursion

$\sum_{y \in Q} a(y)\, \zeta(y, z) = 0$ for $z \ne 0$

has the unique solution a(y) = 0 with initial condition a(0) = 0, and has the unique solution $a(y) = \mu_Q(0, y)$ with initial condition a(0) = 1. Recall that the zeta function $\zeta(y, z)$ has value 1 if $y \le z$, and has value 0 otherwise.

Theorem 1. If J is a closure operator on a finite lattice P, and Q = P/J is the quotient lattice, consisting of the J-closed elements of P, then for all elements $x \in P$, and elements y closed in P, $x \le y$, the sum

$\sum_{t;\ x \le t \le J(t) = y} \mu(x, t)$

has value $\mu_Q(x, y)$ if x is closed, and has value 0 otherwise.
1) We wish to express our gratitude to the National Research Council, Canada, for their support of this research (grant A-2994), to K. JACOBS, for his organization of the extraordinary conference "Kombinatorik" at Oberwolfach, and to D. KLEITMAN and J. GOLDMAN, for their organization of the combinatorics seminar at M.I.T., for which this material was prepared.
H. H. CRAPO, ARCH. MATH.
Proof. Note that the theorem may be rewritten in the form

(1) $\delta(x, J(x))\, \mu_Q(J(x), y) = \sum_{t \in P} \mu_P(x, t)\, \delta(J(t), y)$.

Without loss of generality, we assume x = 0 in P. For each element $y \in Q$, let

$a(y) = \sum_{t \in P} \mu_P(0, t)\, \delta(J(t), y)$.

Then

$\sum_{y \in Q} a(y)\, \zeta(y, z) = \sum_{t \in P,\ y \in Q} \mu_P(0, t)\, \delta(J(t), y)\, \zeta(y, z) = \sum_{t \in P} \mu_P(0, t)\, \zeta(t, z) = \delta_P(0, z)$.

If 0 < J(0), $\delta_P(0, z) = 0$ for all $z \in Q$, and a(y) = 0 for all $y \in Q$. If 0 = J(0), $\delta_P(0, z) = 1$ for z = 0, and $a(y) = \mu_Q(0, y)$. ∎

Given a function f from a finite lattice P into a ring with unit, associate the difference operators D, E:

lower difference $Df(x) = \sum_{y;\ y \le x} f(y)\, \mu(y, x)$,

upper difference $Ef(x) = \sum_{y;\ x \le y} \mu(x, y)\, f(y)$.

Theorem 2 (Analogue of integration by parts). If f, g are functions from a finite lattice P into a ring, then

$\sum_{x \in P} Df(x)\, g(x) = \sum_{x \in P} f(x)\, Eg(x)$.

Proof. Both are equal to $\sum_{x, y} f(x)\, \mu(x, y)\, g(y)$. ∎
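Theorems 1 and 2 can be tried out on a small example. The sketch below is an illustration under stated assumptions: P is taken to be the Boolean lattice of subsets of {0, 1, 2}, the closure operator J fixes sets with at most one element and sends every larger set to the whole ground set, and the test functions f and g are arbitrary integer-valued choices. The names `mobius`, `D`, `E`, `J` and `closure_sum` are introduced here.

```python
from functools import lru_cache
from itertools import combinations

def mobius(elements, leq):
    """Moebius function of a finite ordered set, from the recursion
    mu(x, x) = 1 and mu(x, y) = -sum of mu(x, z) over x <= z < y."""
    elements = tuple(elements)

    @lru_cache(maxsize=None)
    def mu(x, y):
        if x == y:
            return 1
        if not leq(x, y):
            return 0
        return -sum(mu(x, z) for z in elements
                    if leq(x, z) and leq(z, y) and z != y)
    return mu

GROUND = frozenset(range(3))
P = [frozenset(c) for r in range(4) for c in combinations(range(3), r)]
leq = lambda a, b: a <= b            # set inclusion
mu_P = mobius(P, leq)

# Theorem 2 with two arbitrary integer-valued functions on P.
f = {x: 3 * sum(x) + len(x) ** 2 + 1 for x in P}
g = {x: 2 ** len(x) - min(x, default=5) for x in P}

def D(func, x):                      # lower difference
    return sum(func[y] * mu_P(y, x) for y in P if y <= x)

def E(func, x):                      # upper difference
    return sum(mu_P(x, y) * func[y] for y in P if x <= y)

lhs = sum(D(f, x) * g[x] for x in P)
rhs = sum(f[x] * E(g, x) for x in P)

# Theorem 1 with the closure operator J described above.
def J(t):
    return t if len(t) <= 1 else GROUND

Q = [x for x in P if J(x) == x]      # the closed elements
mu_Q = mobius(Q, leq)

def closure_sum(x, y):
    """Sum of mu_P(x, t) over all t >= x whose closure is y."""
    return sum(mu_P(x, t) for t in P if x <= t and J(t) == y)

print(lhs, rhs, closure_sum(frozenset(), GROUND), mu_Q(frozenset(), GROUND))
```

In this example the quotient Q has a least element, three atoms and a greatest element, so $\mu_Q(0, 1) = 2$, which agrees with the sum 3 - 1 taken over the four subsets whose closure is the whole ground set.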
It is interesting to compare the proof of theorem 2 with the argument that cycles and coboundaries in a graph are orthogonal to one another. For each vertex p and edge x, let

$\varepsilon(p, x) = +1$ if p is the head of x, $-1$ if p is the tail of x, and 0 otherwise.

Boundary and coboundary operators are defined by

$\partial f(p) = \sum_x \varepsilon(p, x)\, f(x)$ for any 1-chain f,

$\delta g(x) = \sum_p g(p)\, \varepsilon(p, x)$ for any 0-chain g.

If f is a 1-cycle ($\partial f = 0$) and h is a 1-coboundary ($h = \delta g$), then

$\sum_x f(x)\, h(x) = \sum_{x, p} f(x)\, \varepsilon(p, x)\, g(p) = \sum_p \partial f(p)\, g(p) = \sum_p 0 \cdot g(p) = 0$.

If $\sigma: P \to L$ is a supremum-homomorphism from a complete lattice P into a complete lattice L, then $\sigma^{\Delta}: L \to P$ is an infimum-homomorphism, defined by

$\sigma^{\Delta}(y) = \sup\{x;\ \sigma(x) \le y\}$.

The pair $\sigma$, $\sigma^{\Delta}$ is a Galois connection, in the sense that $P/\sigma^{\Delta}(\sigma)$ is isomorphic to $L/\sigma(\sigma^{\Delta})$, where $\sigma^{\Delta}(\sigma)$ is a closure operator on P and $\sigma(\sigma^{\Delta})$ is a coclosure operator on L. All Galois connections between complete lattices arise in this fashion. In the special case where $\sigma$ is onto L, $P/\sigma^{\Delta}(\sigma) \cong L$.

Vol. XIX, 1968
We now combine theorems 1 and 2 to establish a general theorem on Möbius inversion across Galois connections. This theorem is the discrete analogue of the change-of-variables formula for integration, and of Stokes' Theorem, $\int_{\partial S} \omega = \int_S d\omega$.

Theorem 3. If $\sigma: P \to L$ is a sup-homomorphism from a finite lattice P into a finite lattice L, if f is a function from P and g is a function from L taking values in a ring, then

$\sum_{x \in P} Df(x)\, g(\sigma(x)) = \sum_{y \in L} f(\sigma^{\Delta}(y))\, Eg(y)$.

Proof. Let $Q = P/\sigma^{\Delta}\sigma \cong L/\sigma\sigma^{\Delta}$ be the common quotient lattice, and regard both f and g, restricted to closed elements in P and L, as functions on Q. Then

$\sum_{x \in P} Df(x)\, g(\sigma(x)) = \sum_{t, x \in P} f(t)\, \mu(t, x)\, g(\sigma(x)) = \sum_{t, x \in P,\ s \in Q} f(t)\, \mu_P(t, x)\, \delta(\sigma(x), s)\, g(s)$

(2) $= \sum_{r, s \in Q} f(r)\, \mu_Q(r, s)\, g(s) = \sum_{r \in Q;\ y, z \in L} f(r)\, \delta(r, \sigma^{\Delta}(y))\, \mu_L(y, z)\, g(z)$

$= \sum_{y, z \in L} f(\sigma^{\Delta}(y))\, \mu_L(y, z)\, g(z) = \sum_{y \in L} f(\sigma^{\Delta}(y))\, Eg(y)$. ∎
A number of related forms of Theorem 3 may be more convenient in applications. For instance, using the fact that the difference operators D and E are inverses to the summation operators S and T,

$Sf(x) = \sum_{y;\ y \le x} f(y)$; $Tf(x) = \sum_{y;\ x \le y} f(y)$,

we obtain corollary 1 by substituting Sf for f, Tg for g.

Corollary 1. If $\sigma: P \to L$ is a sup-homomorphism from a finite lattice P into a finite lattice L, if f is a function on P and g is a function on L, then

$\sum_{x \in P} f(x)\, g(\sigma(x)) = \sum_{y \in L} Sf(\sigma^{\Delta}(y))\, Eg(y)$,

$\sum_{x \in P} Df(x)\, Tg(\sigma(x)) = \sum_{y \in L} f(\sigma^{\Delta}(y))\, g(y)$,

$\sum_{x \in P} f(x)\, Tg(\sigma(x)) = \sum_{y \in L} Sf(\sigma^{\Delta}(y))\, g(y)$. ∎

The symmetric intermediate form (2) appearing in the proof of Theorem 3 deserves special note:

Corollary 2. If $\sigma: P \to L$ is a sup-homomorphism from a finite lattice P into a finite lattice L, if Q is the quotient lattice $P/\sigma^{\Delta}(\sigma) \cong L/\sigma(\sigma^{\Delta})$, and if functions f on P and g on L are defined on Q by restriction to closed elements of P, coclosed elements of L, respectively, then

$\sum_{x \in P} Df(x)\, g(\sigma(x)) = \sum_{r, s \in Q} f(r)\, \mu_Q(r, s)\, g(s) = \sum_{y \in L} f(\sigma^{\Delta}(y))\, Eg(y)$. ∎

For Galois connections given directly as a pair of order-inverting maps $\sigma$, $\tau$ whose composites $\sigma\tau$ and $\tau\sigma$ are increasing, it is more convenient to have Theorem 3 in the form obtained by inverting the lattice L, as follows.

Corollary 3. If $\sigma: P \to L$, $\tau: L \to P$ is a Galois connection between finite lattices P and L, if f is a function on P, and g is a function on L, then

$\sum_{x \in P} Df(x)\, g(\sigma(x)) = \sum_{y \in L} f(\tau(y))\, Dg(y)$. ∎
Theorem 1, above, lacks the full symmetry of Galois connections because it operates between a lattice P and its quotient Q, rather than between two lattices P, L, with a common quotient Q. The symmetric form for Theorem 1, and thus for ROTA'S main theorems, is recoverable from Theorem 3 as follows.

Corollary 4. If $\sigma$ is a sup-homomorphism from a finite lattice P into a finite lattice L, then for any elements $t \in P$, $z \in L$

$\sum_{x \in P} \mu(t, x)\, \delta(\sigma(x), z) = \sum_{y \in L} \delta(t, \sigma^{\Delta}(y))\, \mu(y, z)$.

This common value is clearly equal to 0 unless t is closed in P (i.e. $t < x \Rightarrow \sigma(t) < \sigma(x)$) and z is coclosed in L (i.e. $\exists x \in P;\ \sigma(x) = z$). If t is closed in P and z is coclosed in L, both t and z correspond to elements of the common quotient lattice Q, and the common value of the summations is equal to $\mu_Q(t, z)$.

Proof. Set $f(x) = \delta(t, x)$, $g(y) = \delta(y, z)$. Then $Df(x) = \mu_P(t, x)$ and $Eg(y) = \mu_L(y, z)$. The intermediate symmetric form (2) arising in the proof of Theorem 3 is in this case equal to $\sum_{r, s \in Q} \delta(t, r)\, \mu_Q(r, s)\, \delta(s, z)$. ∎

The theory of Möbius inversion across a composite of sup-homomorphisms develops directly from Theorem 3.

Corollary 5. If $\sigma_i: P_{i-1} \to P_i$, i = 1, ..., k, is a sequence of sup-homomorphisms between finite lattices, if f is a function on $P_0$ and g is a function on $P_k$, then the sums

(3) $\sum_{x, y \in P_i} f(\sigma_1^{\Delta}(\dots \sigma_i^{\Delta}(x) \dots))\, \mu(x, y)\, g(\sigma_k(\dots \sigma_{i+1}(y) \dots))$

are equal, for i = 0, ..., k. (Note that for i = k the evaluation of g is at y.) ∎

For computations involving composites such as $\sigma^{\Delta}(\tau^{\Delta})$ it should be borne in mind that $\Delta$ is a contravariant functor, i.e. $\sigma^{\Delta}(\tau^{\Delta}) = (\tau(\sigma))^{\Delta}$. The applicability of Corollary 5 is appreciably extended by the observation that composites of sup-homomorphisms give rise to commutative diagrams involving the intermediate quotients. For two sup-homomorphisms, diagram 1 applies.

[Diagram 1]
For any sup-homomorphism $\rho$, the symbol $\rho_1$ indicates the map of each element to its closure, regarded as an element of the quotient lattice, while the symbol $\rho_2$ indicates the map of each element of the quotient, regarded as a closed element of the domain, to its image under $\rho$. Note that

$(\sigma\tau)_1 = \sigma_1(\sigma_2\tau_1)_1$ and $(\sigma\tau)_2 = (\sigma_2\tau_1)_2\, \tau_2$.

A result not obvious from previous forms of Theorem 3 derives from such consideration of quotients. Note that the lattices L and Q in the following corollary are not related by a Galois connection.

Corollary 6. If $\sigma: P \to L$ and $\tau: L \to M$ are sup-homomorphisms between finite lattices, if f is a function on P and g is a function on M, and if Q is the common quotient lattice of P and M relative to the composite $\tau(\sigma)$, then

$\sum_{x, y \in L} f(\sigma^{\Delta}(x))\, \mu_L(x, y)\, g(\tau(y)) = \sum_{r, s \in Q} f(r)\, \mu_Q(r, s)\, g(s)$. ∎
The expressions f(r), g(s) in Corollary 6 refer as usual to $f((\tau(\sigma))_1^{\Delta}(r))$ and $g((\tau(\sigma))_2(s))$, the values of f and g at elements closed in L, coclosed in M, relative to $\tau(\sigma)$.

3. Enumerative Lattice Theory. To each binary relation $\varrho: X \to Y$ between finite sets X and Y there corresponds a Galois connection $(\sigma, \tau)$ between the Boolean algebras B(X), B(Y). For all $A \subseteq X$, $B \subseteq Y$,

$\sigma(A) = \{y \in Y;\ x \in A \Rightarrow x \varrho y\}$, $\tau(B) = \{x \in X;\ y \in B \Rightarrow x \varrho y\}$.

(These definitions are simply $\sigma(A) = \bigcap A$, $\tau(B) = \bigcap B$, if elements of X are viewed as subsets of Y, and elements of Y are viewed as subsets of X.)

Theorem 4. If $\varrho: X \to Y$ is a relation between finite sets, if f is a function defined on subsets of X, and if g is a function defined on subsets of Y, then

$\sum_{A \subseteq X} Df(A)\, g(\bigcap A) = \sum_{B \subseteq Y} f(\bigcap B)\, Dg(B)$.

Proof. Apply Corollary 3 of Theorem 3 to the Galois connection between B(X) and B(Y) defined by $\sigma(A) = \bigcap A \subseteq Y$ for $A \subseteq X$, $\tau(B) = \bigcap B \subseteq X$ for $B \subseteq Y$. ∎

The elements of the common quotient lattice Q are precisely those pairs (A, B), $A \subseteq X$, $B \subseteq Y$, which are

1) totally related: $x \in A$, $y \in B \Rightarrow x \varrho y$,

and maximal, in the sense that

2) $x \notin A \Rightarrow \exists y \in B$, $x \not\varrho\, y$,
3) $y \notin B \Rightarrow \exists x \in A$, $x \not\varrho\, y$,

where $\not\varrho$ denotes negation of the relation $\varrho$. Each element $s \in Q$ thus has a cardinality $|s|_1$ as a subset of X and a cardinality $|s|_2$ as a subset of Y.
Corollary 1. If $(\sigma, \tau)$ is the Galois connection defined by a finite relation $\rho: X \to Y$, then
$$\sum_{A\subseteq X} (\varphi-1)^{|A|}\,\nu^{|\sigma(A)|} = \sum_{B\subseteq Y} \varphi^{|\tau(B)|}\,(\nu-1)^{|B|}.$$
This sum may in turn be calculated on the common quotient lattice $Q$, and is equal to
$$\sum_{r,s\in Q} \varphi^{|r|_1}\,\mu_Q(r,s)\,\nu^{|s|_2}.$$
Proof. Let $f(A) = \varphi^{|A|}$ for all $A \subseteq X$, and $g(B) = \nu^{|B|}$ for all $B \subseteq Y$. Noting that
$$Df(A) = \sum_{C\subseteq A} \varphi^{|C|}(-1)^{|A-C|} = (\varphi-1)^{|A|},$$
and similarly for $Dg$, the result follows directly from Theorem 4. ∎
Modulo a few redundancies, Corollary 1 to Theorem 4 is also the fundamental enumerative structure theorem for finite lattices. The redundancies arise, causing nonisomorphic relations to have isomorphic lattices, when $\sigma(x) = \sigma(A)$ for some subset $A \subseteq X$ and some element $x \notin A$ (also when this situation occurs for some element and subset of $Y$). When such redundancies do not occur in the relation $\rho$, the supremum-irreducible elements of the lattice $Q$ are precisely the pairs $(x, \sigma(x))$ for $x \in X$, the infimum-irreducible elements of $Q$ are precisely the pairs $(\tau(y), y)$ for $y \in Y$, and the relation $\rho$ may be recovered from the lattice $Q$ by
$$x\,\rho\,y \iff x \le y \text{ in } Q. \tag{4}$$
Corollary 2. Let $Q$ be a finite lattice, with set $X$ of supremum-irreducible elements and set $Y$ of infimum-irreducible elements. For each element $z \in Q$, let $\alpha(z)$, $\beta(z)$ be the numbers of sup-irreducibles beneath $z$ and inf-irreducibles above $z$, respectively. Then
$$\sum_{A\subseteq X} (\varphi-1)^{|A|}\,\nu^{\beta(\sup A)} = \sum_{B\subseteq Y} \varphi^{\alpha(\inf B)}\,(\nu-1)^{|B|},$$
and both sums are equal to
$$\sum_{r,s\in Q} \varphi^{\alpha(r)}\,\mu(r,s)\,\nu^{\beta(s)}.$$
∎
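As a numerical sanity check of Corollary 2, the following Python sketch evaluates the three sums on one small lattice. The choice of lattice (the divisors of 12 ordered by divisibility, with lcm as supremum and gcd as infimum), the sample values of $\varphi$ and $\nu$, and all function names are assumptions made for illustration; none of them comes from the paper.

```python
from functools import reduce
from itertools import combinations
from math import gcd

N = 12                                            # assumed example: divisor lattice of 12
Q = [d for d in range(1, N + 1) if N % d == 0]    # lattice elements, bottom 1, top N

def leq(a, b):
    """a <= b in the divisor lattice means a divides b."""
    return b % a == 0

def sup(elems):
    """Join = lcm; the empty join is the bottom element 1."""
    return reduce(lambda a, b: a * b // gcd(a, b), elems, 1)

def inf(elems):
    """Meet = gcd; the empty meet is the top element N."""
    return reduce(gcd, elems, N)

# z is sup-irreducible when the join of everything strictly below z is not z,
# and dually for inf-irreducible elements.
X = [z for z in Q if sup([y for y in Q if leq(y, z) and y != z]) != z]
Y = [z for z in Q if inf([y for y in Q if leq(z, y) and y != z]) != z]

def alpha(z):
    return sum(1 for x in X if leq(x, z))         # sup-irreducibles beneath z

def beta(z):
    return sum(1 for y in Y if leq(z, y))         # inf-irreducibles above z

_mu = {}
def mobius(r, s):
    """Moebius function of Q from mu(r, s) = -sum of mu(r, t) over r <= t < s."""
    if not leq(r, s):
        return 0
    if r == s:
        return 1
    if (r, s) not in _mu:
        _mu[(r, s)] = -sum(mobius(r, t) for t in Q
                           if leq(r, t) and leq(t, s) and t != s)
    return _mu[(r, s)]

def subsets(S):
    for k in range(len(S) + 1):
        yield from combinations(S, k)

def three_sums(phi, nu):
    left = sum((phi - 1) ** len(A) * nu ** beta(sup(A)) for A in subsets(X))
    right = sum(phi ** alpha(inf(B)) * (nu - 1) ** len(B) for B in subsets(Y))
    on_q = sum(phi ** alpha(r) * mobius(r, s) * nu ** beta(s) for r in Q for s in Q)
    return left, right, on_q
```

Calling `three_sums(phi, nu)` for a few integer pairs should return three equal numbers; since each sum is a polynomial in $\varphi$ and $\nu$, agreement on a grid of integer points is consistent with the polynomial identity, although it is of course not a proof.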
Redundancies can be reintroduced on the other side of the relation-lattice correspondence, with interesting results. Given a finite lattice $L$, and functions $j$ and $k$ from finite sets $X$ and $Y$, respectively, into $L$, a binary relation $\rho$ is defined by
$$x\,\rho\,y \iff j(x) \le k(y). \tag{5}$$
Let $Q$ be the quotient lattice of the Galois connection determined by the relation $\rho$. Then $Q$ has two order-embeddings in $L$, neither of which is associated with a closure operator on $L$. Corollary 6 to Theorem 3 applies.

Theorem 5. Let $j$ and $k$ be functions from finite sets $X$ and $Y$ into a finite lattice $L$, and let $\rho$ and $Q$ be the relation and quotient lattice described above. Each element $z \in L$ has cardinalities $|z|_1$ and $|z|_2$ given by
$$|z|_1 = |\{e \in X;\ j(e) \le z\}|, \qquad |z|_2 = |\{e \in Y;\ z \le k(e)\}|.$$
Each element $s \in Q$ is realized as a pair $(A, B)$ of subsets of $X$, $Y$, and thus has cardinalities $|s|_1 = |A|$, $|s|_2 = |B|$. Then
$$\sum_{x,y\in L} \varphi^{|x|_1}\,\mu_L(x,y)\,\nu^{|y|_2} = \sum_{r,s\in Q} \varphi^{|r|_1}\,\mu_Q(r,s)\,\nu^{|s|_2}.$$
In particular,
$$\mu_Q(0,1) = \sum_{x,y\in L} \delta(0,|x|_1)\,\mu(x,y)\,\delta(|y|_2,0).$$
Proof. The function $j$ extends to a sup-homomorphism $\sigma$ from the Boolean algebra $B(X)$ into $L$ by $\sigma(A) = \sup_L\{j(e);\ e \in A\}$. Similarly, $k$ extends to an inf-homomorphism from the inverted Boolean algebra $\tilde B(Y)$ into $L$ (with opposite $\tau: L \to \tilde B(Y)$), defined by $\tau^{\Delta}(B) = \inf_L\{k(e);\ e \in B\}$. Then
$$\tau(\sigma(A)) = \{b \in Y;\ a \in A \Rightarrow j(a) \le k(b)\},$$
and the quotient lattice with respect to the composite $\tau(\sigma)$ is equal to $Q$. The formula follows from Corollary 6 to Theorem 3, and the special case results from setting $\varphi = \nu = 0$, realizing that $|r|_1 = 0 \Rightarrow r = 0 \in Q$ and $|s|_2 = 0 \Rightarrow s = 1 \in Q$. ∎

If, in the situation described above, $X = Y = L$, and if $L$ is assumed to be an ordered set, not necessarily a lattice, then the resulting lattice $Q$ is the MACNEILLE completion of the ordered set $L$.

4. Cross-cuts and Complementation. ROTA'S cross-cut theorem [6] and this author's complementation theorem [1] have in common a double application of Möbius inversion. Interesting sidelights on these theorems are obtainable by consideration of a lattice of the intervals of a finite lattice.

Theorem 6. Given a finite lattice $L$, let $I(L)$ be the set consisting of the empty interval $\emptyset$, together with all intervals $[x, y]$, for $x \le y$ in $L$, ordered by containment. Then
$$\mu_{I(L)}([x,y],[w,z]) = \mu_L(w,x)\,\mu_L(y,z) \quad\text{if } w \le x \le y \le z,$$
and
$$\mu_{I(L)}(\emptyset,[x,y]) = -\mu_L(x,y).$$
Proof. If $w \le x \le y \le z$, the interval from $[x, y]$ to $[w, z]$ is isomorphic to the cartesian product of the inverted interval $[w, x]$ in $L$ with the interval $[y, z]$ in $L$. But $\mu_L(w, x) = \mu_{\tilde L}(x, w)$, and the Möbius invariant is multiplicative on cartesian products. This establishes the product formula. Then
$$\mu_{I(L)}(\emptyset,[w,z]) = -\sum_{x,y;\ w\le x\le y\le z} \mu_{I(L)}([x,y],[w,z]) = -\sum_{x,y;\ w\le x\le y\le z} \mu_L(w,x)\,\mu_L(y,z) = -\sum_{x;\ w\le x\le z} \mu_L(w,x)\,\delta(x,z) = -\mu_L(w,z).$$
∎
Theorem 7. Given a finite lattice $L$, an arbitrary subset $X \subseteq L$, a function $f$ defined on subsets of $X$, and a function $g$ defined on intervals of $L$, then
$$\sum_{A\subseteq X} Df(A)\,g([\inf A, \sup A]) = f(\emptyset)\,g(\emptyset) - f(\emptyset)\sum_{w,z\in L} \mu(w,z)\,g([w,z]) + \sum_{w,x,y,z\in L} f(X\cap[x,y])\,\mu(w,x)\,\zeta(x,y)\,\mu(y,z)\,g([w,z]).$$
Proof. The map $\sigma$ defined by $\sigma(A) = [\inf A, \sup A]$ is a sup-homomorphism from the Boolean algebra $B(X)$ into the interval lattice $I(L)$, because
$$\sigma(A \cup B) = [\inf(A\cup B), \sup(A\cup B)] = [\inf A, \sup A] \vee [\inf B, \sup B].$$
Note that $\sigma^{\Delta}([x, y]) = X \cap [x, y]$. By Theorem 3,
$$\sum_{A\subseteq X} Df(A)\,g([\inf A, \sup A]) = \sum_{\emptyset \le [x,y] \le [w,z] \le [0,1]} f(X\cap[x,y])\,\mu_{I(L)}([x,y],[w,z])\,g([w,z]),$$
which reduces to the required form, by Theorem 6, once the summation is separated into three parts: $\emptyset = [x,y] = [w,z]$, $\emptyset = [x,y] < [w,z]$, and $\emptyset < [x,y] \le [w,z]$. ∎
Corollary 1. If $X$ and $Y$ are arbitrary subsets of a finite lattice $L$, let $q_k$ be the number of $k$-element subsets $A$ of $X$ disjoint from $Y$ and spanning $L$ (ie: $\inf A = 0$, $\sup A = 1$). Then
$$q_0 - q_1 + q_2 - \dots = \delta_L(0,1) - \mu_L(0,1) + \sum_{x,y\in L} \zeta(X\cap[x,y], Y)\,\mu(0,x)\,\zeta(x,y)\,\mu(y,1).$$
Proof. Set $f(A) = \zeta(A, Y)$, so that
$$Df(A) = \sum_{B\subseteq A} (-1)^{|A|-|B|}\,\zeta(B,Y) = (-1)^{|A|}\,\delta(\emptyset, A\cap Y).$$
Set $g([w,z]) = \delta(0,w)\,\delta(z,1)$. The sinister of the equation in Theorem 7 becomes
$$\sum_{A\subseteq X} (-1)^{|A|}\,\delta(\emptyset, A\cap Y)\,\delta(0,\inf A)\,\delta(\sup A, 1) = \sum_{k=0}^{\infty} (-1)^k q_k,$$
and the simplification of the dexter is obvious. ∎

The cross-cut theorem, the complementation theorem, and, one may conjecture, other interesting facts about Möbius invariants of lattices are evaluations of Corollary 1 at particular sets $X$, $Y$.

Corollary 2 (The Cross-cut Theorem). If $X$ is a cross-cut of a finite lattice $L$, and if $q_k$ is the number of $k$-element subsets of $X$ which span $L$, then
$$q_0 - q_1 + q_2 - \dots = \mu_L(0,1).$$
Proof. In Corollary 1 to Theorem 7, let $X$ be the cross-cut, and let $Y = \emptyset$. The condition $X \cap [x, y] = \emptyset$ is satisfied if and only if $x \le y < z$ for some $z \in X$, or $z < x \le y$ for some $z \in X$. These possibilities are mutually exclusive, and are indicated $y < X$ and $X < x$, respectively. Thus
$$\sum_{x,y\in L;\ X\cap[x,y]=\emptyset} \mu(0,x)\,\zeta(x,y)\,\mu(y,1) = \sum_{x,y;\ y<X} \mu(0,x)\,\zeta(x,y)\,\mu(y,1) + \sum_{x,y;\ X<x} \mu(0,x)\,\zeta(x,y)\,\mu(y,1)$$
$$= \sum_{y;\ y<X} \delta(0,y)\,\mu(y,1) + \sum_{x;\ X<x} \mu(0,x)\,\delta(x,1) = 2\mu(0,1).$$
Substitution of this formula into that of Corollary 1 completes the proof. ∎
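The cross-cut theorem can be checked numerically on small examples. The sketch below is only an illustration under stated assumptions: it uses divisor lattices (divisibility order, gcd as infimum, lcm as supremum), takes the atoms and the coatoms as two sample cross-cuts, and uses function names that are hypothetical and not taken from the paper.

```python
from functools import reduce
from itertools import combinations
from math import gcd

def divisor_lattice(n):
    """Elements of the assumed example lattice: divisors of n, bottom 1, top n."""
    return [d for d in range(1, n + 1) if n % d == 0]

def mobius_bottom_top(n):
    """mu(0, 1) of the divisor lattice of n, from the defining recursion."""
    L = divisor_lattice(n)
    mu = {}
    for s in L:                      # L is increasing, so proper divisors come first
        mu[s] = 1 if s == 1 else -sum(mu[t] for t in L if s % t == 0 and t != s)
    return mu[n]

def crosscut_sum(n, X):
    """q_0 - q_1 + q_2 - ... where q_k counts the k-subsets of X spanning the lattice."""
    total = 0
    for k in range(len(X) + 1):
        for A in combinations(X, k):
            meet = reduce(gcd, A, n)                               # inf A (empty meet = top)
            join = reduce(lambda a, b: a * b // gcd(a, b), A, 1)   # sup A (empty join = bottom)
            if meet == 1 and join == n:
                total += (-1) ** k
    return total

def atoms(n):
    """Atoms of the divisor lattice: the primes dividing n."""
    L = divisor_lattice(n)
    return [p for p in L if p > 1 and all(p % d for d in L if 1 < d < p)]

def coatoms(n):
    """Coatoms of the divisor lattice: n divided by each prime factor."""
    return sorted(n // p for p in atoms(n))
```

For a composite `n`, both `crosscut_sum(n, atoms(n))` and `crosscut_sum(n, coatoms(n))` should agree with `mobius_bottom_top(n)`. For `n = 30` the only spanning subset of the atoms is `{2, 3, 5}`, which contributes $(-1)^3$, in line with $\mu(0,1) = -1$ for a Boolean algebra on three atoms.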
Corollary 3 (The Complementation Theorem). If $s$ is any fixed element in a finite lattice $L$, then
$$\mu(0,1) = \sum_{x,y\in s^{\perp}} \mu(0,x)\,\zeta(x,y)\,\mu(y,1),$$
where $s^{\perp}$ is the set of complements of $s$ in $L$.

Proof. In Corollary 1 to Theorem 7, let $X = L$ and $Y = s^{\perp}$. Note that
$$\zeta(X \cap [x,y], Y) = \zeta([x,y], s^{\perp}) = 1$$
if and only if both $x$ and $y$ are complements of $s$. If $0 = 1$, $q_0 - q_1 + \dots = q_0 = 1$. If $0 \ne 1$, then at most one of 0, 1 are in $s^{\perp}$. Assume w.l.o.g. that $0 \notin s^{\perp}$. Then a subset $A$ disjoint from $s^{\perp} \cup \{0\}$ spans if and only if $A \cup \{0\}$ spans. $A$ and $A \cup \{0\}$ have cardinalities of opposite parity, so $q_0 - q_1 + \dots = 0$. Thus $q_0 - q_1 + \dots = \delta(0, 1)$, and the corollary follows. ∎

5. Combinatorial Geometry. A combinatorial geometry (or simply, a geometry) (e.g.: [4]) is most easily defined in terms of the lattice structure of its flats (closed subgeometries). Such lattices, which are called geometric lattices, have the distinguishing characteristic that, for all $x, y \in L$,
$$y \text{ covers } x \iff \exists \text{ atom } p \text{ complementary to } x \text{ in } [0, y].$$
In this definition, "$y$ covers $x$" means $x < y$ and $x < t \le y \Rightarrow t = y$. The complementarity condition requires $x \wedge p = 0$, $x \vee p = y$. We shall consider only finite geometric lattices here, so this single property will suffice for a definition. Geometric lattices are consequently relatively-complemented semimodular lattices, generated by atoms, generated by coatoms, and possessed of a well-defined rank $\lambda(x)$ = length of all maximal chains from 0 to $x$. (Note $\lambda(0) = 0$.) The points of the associated geometry are the atoms of the geometric lattice. The lines, planes, ..., of the geometry are the sets of points beneath elements of rank 2, 3, ..., respectively, in the lattice.

Linear graphs give rise to geometries. If $G$ is a linear graph with edge set $X$ and vertex set $H$, the equivalence relation of path-connection along edges in a subset $A \subseteq X$ yields a partition $\pi A$ of the vertex set $H$ into $A$-path connected components. The map $\sigma: A \to \pi A$ is a sup-homomorphism from the Boolean algebra $B(X)$ into the partition lattice $P(H)$. The Galois-closed edge sets, ie: the maximal sets $A$ of each rank $\lambda(\sigma(A))$, form a geometric lattice $L(G)$. The coboundary operator, defined parenthetically in section 2, maps $\nu^q$ 0-chains $f: H \to \{0, 1, \dots, \nu - 1\}$ to each coboundary $\delta f$, where $q$ is the number of connected components of $G$. Colorings, those 0-chains which have unequal values on the ends of any edge, correspond to coboundaries which take non-zero values on each edge. The kernel $\ker g$ of a coboundary is the set $g^{-1}(0)$, which is necessarily closed. We wish to calculate $p(x; \nu)$, the number of coboundaries with kernel $x$ and values in the ring $\{0, 1, \dots, \nu - 1\}$. The number of $\nu$-colorings of the graph is $\nu^q p(0; \nu)$. A coboundary is freely-determined by its values on any basis (spanning tree) for the graph, so there are $\nu^{\lambda(1)-\lambda(x)}$ $\nu$-coboundaries with kernel $\ge x$, for any $x \in L(G)$.
By Mobius inversion on L, there are p(x; v) = L ,u(x,Y)V.!(l)-.!(y) = Ep( ; v)(x) u;z~"
$\nu$-coboundaries with kernel $x$. The polynomial $p(0; \nu)$ is clearly well-defined for any finite Dedekind lattice $L$, ie: a lattice satisfying the chain condition, and thus having a rank function $\lambda$. $p(0; \nu)$ has been called the Poincaré polynomial of the lattice $L$. Recent unpublished work tends to establish a relation between Poincaré polynomials and general "coloring problems" on such lattices. Poincaré polynomials may be considered as polynomial-valued elements of the incidence algebra of the lattice $L$. Let
$$p(x, z; \nu) = \sum_{y\in L} \mu(x,y)\,\zeta(y,z)\,\nu^{\lambda(z)-\lambda(y)},$$
so that $p(x; \nu) = p(x, 1; \nu)$. Such polynomial-valued matrices have easily-calculable inverses in the incidence algebra.

Theorem 8. Let functions $a$, $b$, $c$ on a finite lattice $L$ take values which are invertible elements of a ring. Then
$$q(x,z) = \sum_{y\in L} a(x)\,\mu(x,y)\,b(y)\,\zeta(y,z)\,c(z)$$
has an inverse in the incidence algebra of $L$. In particular, the Poincaré polynomial $p(x, z; \nu)$ has inverse
$$p^{-1}(x,z;\nu) = \sum_{y\in L} \nu^{\lambda(y)-\lambda(x)}\,\mu(x,y)\,\zeta(y,z).$$
∎
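The statement that the number of $\nu$-colorings of a graph is $\nu^q p(0;\nu)$, with $p(0;\nu) = \sum_y \mu(0,y)\,\nu^{\lambda(1)-\lambda(y)}$ summed over the closed edge sets $y$ of $L(G)$, can be illustrated by brute force. The following Python sketch is a minimal check under stated assumptions: the sample graph (a 4-cycle with one chord plus an isolated vertex, so that $q = 2$) and every function name are hypothetical choices made here, not taken from the paper.

```python
from itertools import combinations, product

def components(vertices, edges):
    """Map each vertex to a representative of its component (simple union-find)."""
    parent = {v: v for v in vertices}
    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v
    for u, w in edges:
        parent[find(u)] = find(w)
    return {v: find(v) for v in vertices}

def closure(vertices, all_edges, A):
    """Closed edge set generated by A: every edge joining two A-connected vertices."""
    comp = components(vertices, A)
    return frozenset(e for e in all_edges if comp[e[0]] == comp[e[1]])

def rank(vertices, A):
    """lambda(A) = number of vertices minus number of A-connected components."""
    comp = components(vertices, A)
    return len(vertices) - len(set(comp.values()))

def poincare_at(vertices, edges, nu):
    """p(0; nu) = sum over flats y of mu(0, y) * nu^(lambda(1) - lambda(y))."""
    flats = {closure(vertices, edges, A)
             for k in range(len(edges) + 1) for A in combinations(edges, k)}
    flats = sorted(flats, key=len)           # proper subsets always come earlier
    bottom = flats[0]
    mu = {}
    for y in flats:                          # mu(0, y) by the defining recursion
        mu[y] = 1 if y == bottom else -sum(mu[t] for t in flats if t < y)
    top_rank = rank(vertices, edges)
    return sum(mu[y] * nu ** (top_rank - rank(vertices, y)) for y in flats)

def count_colorings(vertices, edges, nu):
    """Brute-force count of 0-chains with unequal values on the ends of every edge."""
    count = 0
    for values in product(range(nu), repeat=len(vertices)):
        f = dict(zip(vertices, values))
        count += all(f[u] != f[w] for u, w in edges)
    return count

# Assumed example: 4-cycle 0-1-2-3 with chord (0, 2), plus isolated vertex 4.
VERTICES = [0, 1, 2, 3, 4]
EDGES = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
Q_COMPONENTS = len(VERTICES) - rank(VERTICES, EDGES)   # q = 2 for this graph
```

For this graph the flats give $p(0;\nu) = \nu^3 - 5\nu^2 + 8\nu - 4 = (\nu-1)(\nu-2)^2$, so `nu ** Q_COMPONENTS * poincare_at(VERTICES, EDGES, nu)` should match `count_colorings(VERTICES, EDGES, nu)` for each small `nu`.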
Every geometric lattice $L$ may be realized in a number of ways as a quotient of other geometric lattices with respect to sup-homomorphisms which also preserve the relation covers-or-equals. Such maps are called strong maps, and map atoms either to atoms or to 0. There is a notion of orthogonality [3] with respect to any such realization $\sigma: M \to L$, giving rise to a strong map $\sigma^*: M \to L^*$, whenever the domain lattice $M$ is also modular. The relation $\sigma^{**} = \sigma$ holds. If the Poincaré polynomial is modified so as to have two numerical variables, it becomes possible to obtain from the polynomial for a Boolean representation, by simple substitution of variables, the corresponding polynomial for the orthogonal geometry. For graphs, this process converts coboundary enumeration to cycle enumeration 2). The appropriate two-variable polynomial is the coboundary polynomial for any strong map $\sigma: P \to L$, defined by
$$\tau(\sigma; \varphi, \nu) = \sum_{x\in L} \varphi^{\lambda(\sigma^{\Delta}(x))}\,p(x;\nu). \tag{6}$$
2) Cf. [2], [7], [8]. "A Ring in Graph Theory" is an important work, in which TUTTE calculates what has come to be known as the Grothendieck group, for a category of graphs.
Theorem 9. If $\sigma: P \to L$ is a strong map between geometric lattices $P$, $L$, then
$$\tau(\sigma;\varphi,\nu) = \sum_{x,y\in P} \varphi^{\lambda(x)}\,\mu(x,y)\,\nu^{\lambda(1)-\lambda(\sigma(y))}. \tag{7}$$
Proof. Directly from Theorem 3. ∎
The application of Theorem 9 to Boolean representations of a geometry, such as the map from subsets $A$ of the set of atoms to $\sup A$ in $L$, is particularly useful.

Corollary 1. If $B = B(X)$ is a finite Boolean algebra and $\sigma: B \to L$ is a strong map into a geometric lattice $L$, then
$$\tau(\sigma;\varphi,\nu) = \sum_{x\in L} \varphi^{|x|}\,p(x;\nu) = \sum_{A\subseteq X} (\varphi-1)^{|A|}\,\nu^{\lambda(1)-\lambda(\sigma(A))}.$$
∎
The definition of orthogonality relative to a strong map $\sigma: M \to L$ of a modular geometry onto a geometry $L$ is as follows. The strong map $\sigma: M \to L$ determines a closure $J = \sigma^{\Delta}(\sigma)$ on $M$. There is a unique coclosure $J^*$ on $M$ satisfying, for all $x, y$ in $M$ such that $y$ covers $x$,
$$y \le J(x) \iff J^*(y) \nleq x. \tag{8}$$
The coclosure $J^*$ determines a quotient lattice $P$ (with order induced by that on $M$), and a map $\tau: M \to P$ which is an inf-homomorphism preserving the relation covers-or-equals. The inverted lattice $\tilde P$ is a geometric lattice, so we set $L^* = \tilde P$. Let $\sigma^*$ be the associated strong map from $\tilde M$ onto $L^*$. The rank generating function $\rho$ of a strong map $\sigma: P \to L$ defined by
$$\rho(\sigma;\xi,\eta) = \sum_{x\in P} \xi^{\lambda_L(1)-\lambda_L(\sigma(x))}\,\eta^{\lambda_P(x)-\lambda_L(\sigma(x))} \tag{9}$$
has symmetry [2] relative to orthogonality, whenever $\sigma$ maps a modular geometry onto $L$.

Theorem 10. $\rho(\sigma^*;\xi,\eta) = \rho(\sigma;\eta,\xi)$ for any pair $\sigma: M \to L$, $\sigma^*: M \to L^*$ of orthogonal maps of a modular geometry.

Proof. The measurement $\lambda_L(1) - \lambda_L(\sigma(x))$, which provides the exponent of $\xi$ in $\rho(\sigma)$, enumerates the number of intervals $[y, z]$ of length 1 ("steps") in any maximal chain ("path") from $x$ to 1 in $M$, for which $z \nleq J(y)$, ie: $J^*(z) \le y$. But this is precisely the measurement $\lambda_M(x) - \lambda_{L^*}(\sigma^*(x))$, which provides the exponent of $\eta$ in $\rho(\sigma^*)$. Similarly, $\lambda_M(x) - \lambda_L(\sigma(x))$ is the number of steps $[y, z]$ in any path from 0 to $x$ in $M$ for which $z \le J(y)$, ie: $J^*(z) \nleq y$. ∎

Corollary 2. If $B = B(X)$ is a finite Boolean algebra and $\sigma: B \to L$ is a strong map into a geometric lattice $L$, then
$$\tau(\sigma;\varphi,\nu) = (\varphi-1)^{\lambda_L(1)}\,\rho\!\left(\sigma;\ \frac{\nu}{\varphi-1},\ \varphi-1\right).$$
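Proof. By Corollary 1,
$$\tau(\sigma;\varphi,\nu) = \sum_{A\subseteq X} (\varphi-1)^{|A|}\,\nu^{\lambda_L(1)-\lambda(\sigma(A))} = (\varphi-1)^{\lambda_L(1)} \sum_{A\subseteq X} \left(\frac{\nu}{\varphi-1}\right)^{\lambda_L(1)-\lambda(\sigma(A))} (\varphi-1)^{|A|-\lambda(\sigma(A))}.$$
∎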
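We may now complete the calculation of the cycle polynomial $\tau(\sigma^*)$ from the coboundary polynomial $\tau(\sigma)$.

Corollary 3. If $B = B(X)$ is a finite Boolean algebra and $\sigma: B \to L$ is a strong map onto a geometric lattice $L$, with orthogonal $\sigma^*: B \to L^*$, then
$$\tau(\sigma^*;\varphi,\nu) = (\varphi-1)^{|1|}\,\nu^{-\lambda(1)}\,\tau\!\left(\sigma;\ \frac{\nu+\varphi-1}{\varphi-1},\ \nu\right).$$
Proof. From Corollary 2 we have $\rho(\sigma;\xi,\eta) = \eta^{-\lambda(1)}\,\tau(\sigma;\ \eta+1,\ \xi\eta)$. Also $\lambda^*(1) = |1| - \lambda(1)$. Thus
$$\tau(\sigma^*;\varphi,\nu) = (\varphi-1)^{|1|-\lambda(1)}\,\rho\!\left(\sigma^*;\ \frac{\nu}{\varphi-1},\ \varphi-1\right) = (\varphi-1)^{|1|-\lambda(1)}\,\rho\!\left(\sigma;\ \varphi-1,\ \frac{\nu}{\varphi-1}\right)$$
$$= (\varphi-1)^{|1|-\lambda(1)}\,\nu^{-\lambda(1)}\,(\varphi-1)^{\lambda(1)}\,\tau\!\left(\sigma;\ \frac{\nu+\varphi-1}{\varphi-1},\ \nu\right).$$
∎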
So far we have dealt with representations of geometries as quotients of simpler geometries of higher rank. A few parallel results are available for embeddings of a given geometry as a subgeometry of various larger geometries, usually of equal rank.

Corollary 4. If $\sigma: P \to L$ is a strong map between geometric lattices and if $\iota: L \to N$ is a 1-1 strong map from $L$ into a geometric lattice $N$, then
$$\tau(\iota(\sigma);\varphi,\nu) = \nu^{\lambda_N(1)-\lambda_N(\iota(1))}\,\tau(\sigma;\varphi,\nu).$$
Proof.
$$\tau(\iota(\sigma);\varphi,\nu) = \sum_{x,y\in L} \varphi^{\lambda_P(\sigma^{\Delta}(x))}\,\mu_L(x,y)\,\nu^{\lambda_N(1)-\lambda_N(\iota(y))} = \nu^{\lambda_N(1)-\lambda_N(\iota(1))}\,\tau(\sigma;\varphi,\nu)$$
because $\lambda_N(\iota(y)) = \lambda_L(y)$ for all $y \in L$. ∎
Corollary 5. If $\iota: L \to N$ is a 1-1 strong map from a geometric lattice $L$ into a geometric lattice $N$, then
$$\nu^{\lambda_N(1)-\lambda_L(1)}\,p_L(0;\nu) = \sum_{x\in N;\ \iota^{\Delta}(x)=0} p_N(x;\nu).$$
In particular, if $N$ is the lattice of all partitions of the set $H$ of vertices of a graph $G$ and $L$ is the lattice $L(G)$ of closed subsets of the edge set $X$ of $G$, then
$$\nu^{|H|-1-\lambda(\iota(X))}\,p_L(0;\nu) = \sum_{k=1}^{\infty} n_k\,(\nu-1)(\nu-2)\cdots(\nu-k+1),$$
where $n_k$ is the number of $k$-part color-partitions of the vertex set $H$ of $G$.

Proof. Evaluate $\tau(\iota; \varphi, \nu)$, simplify by using Corollary 4, and set $\varphi = 0$. ∎
Bibliography

[1] H. H. CRAPO, The Möbius Function of a Lattice. J. Combinatorial Theory 1, 126-131 (1966).
[2] H. H. CRAPO, The Tutte Polynomial. Aequationes Math. (to appear).
[3] H. H. CRAPO, Geometric Duality. Rend. Sem. Mat. Univ. Padova 38, 23-26 (1967).
[4] D. A. HIGGS, Strong Maps of Geometries. J. Combinatorial Theory 6 (1968) (to appear).
[5] O. ORE, Galois Connexions. Trans. Amer. Math. Soc. 55, 493-513 (1944).
[6] G.-C. ROTA, On the Foundations of Combinatorial Theory I. Z. Wahrscheinlichkeitstheorie und verw. Gebiete 2, 340-368 (1964).
[7] W. T. TUTTE, A Ring in Graph Theory. Proc. Cambridge Philos. Soc. 43, 26-40 (1947).
[8] W. T. TUTTE, A Contribution to the Theory of Chromatic Polynomials. Canad. J. Math. 6, 80-91 (1954).

Received 9 November 1967
Author's address: Henry H. Crapo, Department of Mathematics, University of Waterloo, Waterloo, Ontario, Canada
Reprinted from JOURNAL OF COMBINATORIAL THEORY, Vol. 7, No. 3, November 1969. All Rights Reserved by Academic Press, New York and London. Printed in Belgium.

A Generalization of a Combinatorial Theorem of Macaulay

G. F. CLEMENTS AND B. LINDSTRÖM

University of Colorado, Boulder, Colorado 80302, and University of Stockholm, Sweden

Communicated by D. H. Younger

Received May 26, 1968

ABSTRACT
Let $E$ denote the set of all vectors of dimension $n$ ($n \ge 2$) with non-negative integral components. $E$ is ordered in the lexicographic order. Let $E_\nu$ denote the subset of all vectors in $E$ with component sum $\nu$. If $H_\nu$ denotes any subset of $E_\nu$ let $LH_\nu$ denote the set of the $|H_\nu|$ last elements in $E_\nu$, where $|H_\nu|$ is the number of elements of $H_\nu$. Let $PH_\nu$ denote the set of all vectors of $E_{\nu+1}$ which are obtained by the addition of 1 to a component of a vector in $H_\nu$. In [3] Macaulay proved the inclusion $P(LH_\nu) \subset L(PH_\nu)$. Sperner gave a shorter proof in [4]. Let $k_1 \le k_2 \le \dots \le k_n$ be given positive integers and let $F$ denote the set of all vectors $(a_1, \dots, a_n)$ with integer components and $0 \le a_i \le k_i$, $i = 1, \dots, n$. We shall prove Macaulay's inclusion for subsets $H_\nu$ of $F_\nu$ even if the operators $P$ and $L$ are restricted to operate in $F$. This will follow from our theorem. As another application we prove a generalization of the main result in [2]. By a different method Katona proved the theorem when $k_1 = k_2 = \dots = k_n = 1$ (see [1, Theorem 1]).
1. INTRODUCTION AND STATEMENT OF THE RESULTS

Let the integers $k_1, k_2, \dots, k_n$ be given such that $1 \le k_1 \le k_2 \le \dots \le k_n$. The set of all vectors $(a_1, \dots, a_n)$ of dimension $n$ with integer components $a_i$ for which $0 \le a_i \le k_i$, $i = 1, \dots, n$, will be denoted by $F$. Write $(a_1, \dots, a_n) < (b_1, \dots, b_n)$ if $a_1 < b_1$ or if $a_1 = b_1, \dots, a_{i-1} = b_{i-1}$, $a_i < b_i$ for some $i$, $2 \le i \le n$ (lexicographic order). Put $k_1 + \dots + k_n = k$. Define
$$F_\nu = \{(a_1,\dots,a_n) \in F;\ a_1 + \dots + a_n = \nu\}, \qquad \nu = 0, 1, \dots, k.$$
If $H$ is a subset of $F$, put $H_\nu = H \cap F_\nu$ and let $|H_\nu|$ denote the number of elements of $H_\nu$. The set of the $|H_\nu|$ first elements of $F_\nu$ in the lexicographic order will be denoted by $CH_\nu$ and is called the compression of $H_\nu$. The set of the last $|H_\nu|$ elements of $F_\nu$ is denoted by $LH_\nu$. If $H$ is any subset of $F$ let
$$CH = \bigcup_{\nu=0}^{k} CH_\nu. \tag{1.1}$$
Let $\Gamma$ be the multivalued function from $F$ into $F$ which associates with $(a_1, \dots, a_n)$ the set
$$\{(a_1 - 1, a_2, \dots, a_n),\ (a_1, a_2 - 1, a_3, \dots, a_n),\ \dots,\ (a_1, \dots, a_{n-1}, a_n - 1)\} \cap F.$$
If $\Gamma(H) \subset H$ we say that $H$ is closed. Let $P$ be the multivalued function from $F$ to $F$ which associates with $(a_1, \dots, a_n)$ the set
$$\{(a_1 + 1, a_2, \dots, a_n),\ (a_1, a_2 + 1, a_3, \dots, a_n),\ \dots,\ (a_1, \dots, a_{n-1}, a_n + 1)\} \cap F.$$
Let $S_m$ denote the set of the $m$ first vectors of $F$. Put
$$\alpha(a_1, \dots, a_n) = \sum_{i=1}^{n} a_i \quad\text{and}\quad \alpha(H) = \sum_{a\in H} \alpha(a).$$
We can now state our results.

THEOREM. If $H_\nu \subset F_\nu$ then $\Gamma(CH_\nu) \subset C(\Gamma H_\nu)$, $\nu = 0, 1, \dots, k$.

COROLLARY 1. If $H_\nu \subset F_\nu$ then $P(LH_\nu) \subset L(PH_\nu)$, $\nu = 0, 1, \dots, k$.

COROLLARY 2. If $H \subset F$ and $H$ is closed then $CH$ is closed.

COROLLARY 3. If $H \subset F$, $|H| = m$ and $H$ is closed then $\alpha(H) \le \alpha(S_m)$.
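For small parameters the Theorem can be tested exhaustively. The Python sketch below is an illustration only: the function names `shadow`, `compress` and `theorem_holds` are hypothetical, and the sample bounds are assumptions chosen to keep the search small. It implements $\Gamma$ (lower one coordinate by 1, staying inside $F$) and the compression $C$ (the first $|H_\nu|$ vectors of the level in lexicographic order), then checks $\Gamma(CH_\nu) \subset C(\Gamma H_\nu)$ for every subset $H_\nu$ of every level.

```python
from itertools import combinations, product

def shadow(H):
    """Gamma(H): all vectors obtained by subtracting 1 from one positive coordinate."""
    out = set()
    for a in H:
        for i, ai in enumerate(a):
            if ai > 0:                      # stay inside F (coordinates >= 0)
                out.add(a[:i] + (ai - 1,) + a[i + 1:])
    return out

def compress(H, level):
    """C(H): the first |H| elements of the (lexicographically sorted) level."""
    return set(level[:len(H)])

def theorem_holds(k):
    """Check Gamma(C H) <= C(Gamma H) for every subset H of every level F_v."""
    F = list(product(*(range(ki + 1) for ki in k)))
    total = sum(k)
    # Python compares tuples lexicographically, matching the order used in the text.
    levels = {v: sorted(a for a in F if sum(a) == v) for v in range(total + 1)}
    for v in range(1, total + 1):
        Fv = levels[v]
        for m in range(len(Fv) + 1):
            for H in combinations(Fv, m):
                lhs = shadow(compress(H, Fv))
                rhs = compress(shadow(H), levels[v - 1])
                if not lhs <= rhs:
                    return False
    return True
```

With nondecreasing bounds such as `(2, 2)` or `(1, 2, 2)` the check should succeed. With `k = (2, 1)`, which violates $k_1 \le k_2$, it fails: for $H_2 = \{(2,0)\}$ one gets $\Gamma(CH_2) = \{(0,1),(1,0)\}$ but $C(\Gamma H_2) = \{(0,1)\}$, which suggests the ordering hypothesis is genuinely needed.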
The notions in Corollary 2 were introduced by Lindstrom and Zetterstrom in solving a problem of k-adic integers [2]. They proved Corollary 2 in the special case n = 2 and all k i equal (see [2, Lemma 1, p. 167]), but convinced themselves by an incorrect example that the corresponding result for n = 3 was wrong (they overlooked 102 in their example on page 169). Theorem 1 in [2] is a special case of Corollary 3 when all k; are equal. The theorem of Macaulay follows from Corollary 1 for fixed v when kl ;;:::, vn. The reader will perhaps find the following picture helpful to grasp the theorem. Let n, k; for i = 1, ... , n and v be fixed positive integers satisfying 1~ v ~k
$= \sum_{i=1}^{n} k_i$.

Let the points $(a_1, \dots, a_n)$ with integral coordinates $a_i$ such that $0 \le a_i \le k_i$ be locations of buttons which turn on lights at those positions which have non-negative coordinates. Suppose one is required to press $m$ of the buttons in the hyperplane $a_1 + a_2 + \dots + a_n = \nu$. Which ones should be pressed so as to minimize the number of lights that go on? The minimum will be realized by using the first $m$ buttons in the lexicographic order which lie in the hyperplane. For instance, if $n = 3$, $k_1 = k_2 = k_3 = 4$, and $\nu = 7$, the relation between the buttons, indicated by small solid circles, and lights, indicated by large open circles, are shown in Figure 1. A button turns on the lights connected to it by a straight line.
FIGURE 1. $n = 3$, $k_1 = k_2 = k_3 = 4$, $\nu = 7$.

2. PROOF OF THE COROLLARIES
We shall first prove the corollaries with the aid of the theorem.

PROOF OF COROLLARY 1: Apply the mapping of $F$ onto $F$ which maps $(a_1, \dots, a_n)$ on $(k_1 - a_1, \dots, k_n - a_n)$. $F_\nu$ is then mapped on $F_{k-\nu}$. If $H_\nu$ is mapped on $H'_{k-\nu}$, then $CH_\nu$ is mapped on $LH'_{k-\nu}$ and $\Gamma H_\nu$ is mapped on $PH'_{k-\nu}$. The rest of the proof is now obvious.

PROOF OF COROLLARY 2: If $\Gamma(H) \subset H$ then $\Gamma(H_\nu) \subset H_{\nu-1}$ for $\nu = 1, \dots, k$. From the theorem it follows $\Gamma(CH_\nu) \subset C(H_{\nu-1})$ and then
$$\Gamma(CH) = \Gamma\Big(\bigcup_{\nu=0}^{k} CH_\nu\Big) = \bigcup_{\nu=1}^{k} \Gamma(CH_\nu) \subset \bigcup_{\nu=1}^{k} C(H_{\nu-1}) \subset CH.$$

PROOF OF COROLLARY 3: Assume $CH \ne S_m$ and let $a = (a_1, \dots, a_n)$ be the first element of $F$ which is not element in $CH$. Let $b = (b_1, \dots, b_n)$ be the last element of $CH$. It follows $a < b$, for $CH \ne S_m$. We shall next prove $\alpha(a) > \alpha(b)$. If $\alpha(a) = \alpha(b)$, $a \in CH$ follows from the definition (1.1) since $a < b$, and $a \notin CH$ is contradicted. We now prove that $\alpha(a) < \alpha(b)$ implies the same contradiction. Since $a < b$ let $a_1 = b_1, \dots, a_{i-1} = b_{i-1}$, $a_i < b_i$. If $a_{i+1} = \dots = a_n = 0$ we obtain $a$ from $b$ by the subtraction of $\alpha(b) - \alpha(a)$ ones from $b_i, b_{i+1}, \dots, b_n$. It then follows $a \in CH$, for $CH$ is closed by Corollary 2. If $a_j > 0$ for some $j > i$ put $d = \alpha(b) - \alpha(a)$ and $d' = d - \min\{d,\ b_i - a_i - 1\}$. It follows $d' \le b_{i+1} + \dots + b_n$. We can now subtract $d'$ ones from the integers $b_{i+1}, \dots, b_n$ so that we obtain non-negative integers $c_{i+1}, \dots, c_n$. Put $c_i = b_i - d + d'$ and $c_j = a_j$ for $j = 1, 2, \dots, i-1$. We now have $c = (c_1, \dots, c_n) \in CH$, for $b \in CH$ and $CH$ is closed by Corollary 2. From $a_i < c_i$ it follows $a < c$. Then we get $a \in CH$, for $c \in CH$ and $\alpha(c) = \alpha(a)$. Thus $\alpha(a) > \alpha(b)$ follows. If we delete $b$ from $CH$ and adjoin $a$ to $CH$, we obtain a set $H'$ such that $\Gamma H' \subset H'$, $CH' = H'$ and $\alpha(H') > \alpha(CH)$. If $H' \ne S_m$ we can repeat the operation $'$ and after $m$ steps, at most, we have $H'^{\cdots\prime} = S_m$ and $\alpha(H'^{\cdots\prime}) > \alpha(CH) = \alpha(H)$, which is the result.

3. PROOF OF THE THEOREM
We shall first define some auxiliary notions. If $H \subset F$, put
$$H_{i:d} = \{(a_1, \dots, a_n) \in H;\ a_i = d\}, \qquad i = 1, 2, \dots, n;\ d = 0, 1, \dots, k_i.$$
For subsets $H_\nu$ of $F_\nu$ let $(CH_\nu)_{i:d}$ denote the set of the $|(H_\nu)_{i:d}|$ first elements of $(F_\nu)_{i:d}$. We shall say that $H_\nu$ is $i$-compressed if $(CH_\nu)_{i:d} = (H_\nu)_{i:d}$, $d = 0, 1, \dots, k_i$. If $CH_\nu = H_\nu$ we say that $H_\nu$ is compressed.
For any subset $H_\nu$ of $F_\nu$ we can define the sequence of sets $H_\nu^1, H_\nu^2, \dots, H_\nu^j, \dots$ by putting $H_\nu = H_\nu^1$ and
$$H_\nu^{j+1} = \bigcup_{d=0}^{k_i} (CH_\nu^j)_{i:d}, \quad\text{where } i \equiv j \pmod n,\ 1 \le i \le n. \tag{3.1}$$
We shall prove five lemmas.

LEMMA 1. One can find $p$ such that $H_\nu^p$ is $i$-compressed for $i = 1, 2, \dots, n$.

PROOF: Enumerate the elements of $F_\nu$ in the lexicographic order. If $a \in F_\nu$ let $n(a)$ be $a$'s number. For any subset $H_\nu$ of $F_\nu$ define $n(H_\nu)$ as the sum of numbers of its elements. It is evident that
$$n(H_\nu^1) \ge n(H_\nu^2) \ge \dots \ge n(H_\nu^j) \ge \dots$$
and
Since the sequence cannot decrease indefinitely, there must exist a $p$ such that $H_\nu^p = H_\nu^{p+1} = \dots$, i.e., $H_\nu^p$ is $i$-compressed for $i = 1, 2, \dots, n$.

LEMMA 2. If the theorem is true in $n - 1$ dimensions and if $\Gamma H_\nu \subset H_{\nu-1}$ ($n$ dimensions), it follows $\Gamma H_\nu^j \subset H_{\nu-1}^j$ for $j = 2, 3, \dots$.

PROOF:
The proof is by induction from j to j r(H'/)i:a
+ 1. From
n (F"-I);:
it follows (in n - 1 dimensions)
From rH,,; C H~-1 , we obtain
d
~
1,
and then
d
~
1,
(3.3)
for the left side is the first I(CH'/)i:a I elements of (F,,-I)i:
If we take the union for $d = 0, 1, \dots, k_i$, we obtain by (3.1) $\Gamma H_\nu^{j+1} \subset H_{\nu-1}^{j+1}$, and the lemma follows by induction since $\Gamma H_\nu^1 \subset H_{\nu-1}^1$.
LEMMA 3. If $H_\nu$ is compressed, then $\Gamma H_\nu$ is compressed.

PROOF: Assume $a = (a_1, \dots, a_n)$, $b = (b_1, \dots, b_n)$ are elements in $F_{\nu-1}$ and $a < b$. Then, if $(b_1, \dots, b_j + 1, \dots, b_n) \in H_\nu$, $1 \le j \le n$, we shall prove $a \in \Gamma H_\nu$. Let $a_1 = b_1, \dots, a_{i-1} = b_{i-1}$, $a_i < b_i$. Then if $j \le i$, we have
$$(a_1, \dots, a_j + 1, \dots, a_n) < (b_1, \dots, b_j + 1, \dots, b_n) \in H_\nu.$$
It follows $(a_1, \dots, a_j + 1, \dots, a_n) \in H_\nu$ since $H_\nu$ is compressed. Hence $a \in \Gamma H_\nu$. If $j > i$, we disregard the first $i - 1$ components and assume $a_1 < b_1$. If $a_v < k_v$ for some $v > 1$, or if $v = 1$ and $a_v < b_v - 1$, we have $(a_1, \dots, a_v + 1, \dots, a_n) < (b_1, \dots, b_j + 1, \dots, b_n) \in H_\nu$ and then $a \in \Gamma H_\nu$. If $a = (b_1 - 1, k_2, \dots, k_n)$, we find that $b = (b_1, k_2, \dots, k_v - 1, \dots, k_n)$ for some $v > 1$ and then $j = 1$ or $v$. Hence $(b_1 + 1, k_2, \dots, k_v - 1, \dots, k_n) \in H_\nu$ or $(b_1, k_2, \dots, k_n) \in H_\nu$. It follows $(b_1, k_2, \dots, k_n) \in H_\nu$ and $a \in \Gamma H_\nu$.
LEMMA 4. Let $n \ge 3$. Assume that $g = (g_1, \dots, g_n)$ and $h = (h_1, \dots, h_n)$ are elements in $F_\nu$, $g < h$ and $h_n = 0$ or $g_n = k_n$. Then if $h \in S$, and $S$ is $i$-compressed for $i = 1, 2, n$, it follows $g \in S$.

PROOF: The conclusion follows if we can find an increasing sequence of vectors in $F_\nu$ beginning in $g$ and ending in $h$ such that any two consecutive vectors have an $i$-th component equal ($i$ = 1, 2 or $n$). First assume $g_n = k_n$. Consider three cases:
(lo) gl
=
(20 ) If gl
hi is trivial.
<
hi and gi
> 0 for
some i, 2 ~ i ~ n - I, we have
where and is as small as possible in the lexicographic order. We find then
(gl
+ 1, g~ ,... , g~-l , gn)
~ (hi,
h2 ,... , hn ),
if gl
The sequence follows from (3.4) and (3.5) if gl
421
+1=
+1=
hI .
hi.
(3.5)
If $g_1 + 1 < h_1$ and $g_i' > 0$ for some $i \ge 2$ proceed as in (2°). If $g_1 + 1 < h_1$ and $g_2' = \dots = g_{n-1}' = 0$ proceed to (3°).

(3°) If $g_1 < h_1$ and $g_2 = g_3 = \dots = g_{n-1} = 0$, we obtain the inequalities
$$(g_1, g_2, \dots, g_n) < (h_1, g_2, \dots, g_{n-1},\ g_n - h_1 + g_1) \le (h_1, \dots, h_n). \tag{3.6}$$
The second vector belongs to $F_\nu$ since $h_1 - g_1 \le k_1 \le k_n = g_n$. We have proved that if $g_n = k_n$ one can find the desired sequence of vectors. Then assume $h_n = 0$. Apply the preceding result to the vectors $(k_1 - h_1, \dots, k_n - h_n) < (k_1 - g_1, \dots, k_n - g_n)$.

LEMMA 5.
The theorem is true when $n = 2$.

PROOF: A subset $B = \{(a_1, a_2), (a_1 + 1, a_2 - 1), \dots, (a_1 + c, a_2 - c)\}$ of $F_\nu$ is called a block. A block which contains the first element of $F_\nu$ is called an initial block. If $B$ is any block and $B_0$ is an initial block, we easily obtain in all cases, since $k_1 \le k_2$,
$$|\Gamma B| - |B| \ge |\Gamma B_0| - |B_0|.$$
(This is not true when $k_1 \ge \nu > k_2$.) If $B_1, \dots, B_r$ are all the maximal blocks which are subsets of $H_\nu$, it follows $\Gamma B_i \cap \Gamma B_j = \emptyset$ if $i \ne j$ and then
$$|\Gamma H_\nu| - |H_\nu| = \sum_{i=1}^{r} \big(|\Gamma B_i| - |B_i|\big) \ge |\Gamma B_0| - |B_0|$$
for any initial block $B_0$. In particular when $B_0 = CH_\nu$, we have $|\Gamma H_\nu| \ge |\Gamma(CH_\nu)|$ and $\Gamma(CH_\nu) \subset C(\Gamma H_\nu)$, since $\Gamma(CH_\nu)$ is compressed.

PROOF OF THE THEOREM: The theorem will be proved by induction from $n - 1$ to $n$. It is true for $n = 2$ by Lemma 5. Assume that the theorem is true in $n - 1$ dimensions. Let $H_\nu$ be any subset of $F_\nu$ and consider $H_\nu^j$ and $(\Gamma H_\nu)^j$ for $j = 2, 3, \dots$. By Lemma 1 we can determine $p$ such that $S = H_\nu^p$ is $i$-compressed for $i = 1, \dots, n$. Put $(\Gamma H_\nu)^p = T$ for abbreviation. From Lemma 2 it follows
$$\Gamma S \subset T. \tag{3.7}$$
To complete the proof we show how to alter $S$ to $CH_\nu$ and $T$ to a subset of $C(\Gamma H_\nu)$ in such a way that $\Gamma(CH_\nu) \subset C(\Gamma H_\nu)$ is obtained. First, if $S = F_\nu$ then $|S| = |CH_\nu|$ shows that $CH_\nu = F_\nu$. Also $\Gamma(S) = \Gamma(F_\nu) = F_{\nu-1} \subset T$ implies $T = F_{\nu-1}$. Then since $|T| = |C(\Gamma H_\nu)|$, it follows $C(\Gamma H_\nu) = F_{\nu-1}$, and hence $\Gamma(CH_\nu) = \Gamma(F_\nu) = F_{\nu-1} = C(\Gamma H_\nu)$, and we are done. If $S \ne F_\nu$ there is a first vector $g = (g_1, \dots, g_n)$ of $F_\nu$ which is not in $S$.
Let $h = (h_1, \dots, h_n)$ denote the last vector of $S$. If $h < g$, then $S = CH_\nu$, and so it is no loss of generality to assume $h > g$. It follows from Lemma 4 that $h_n > 0$, since $S$ is $i$-compressed for $i = 1, 2, \dots, n$. Define
$$h^* = (h_1, \dots, h_{n-1}, h_n - 1) \quad\text{and}\quad g^* = (g_1, \dots, g_{n-1}, g_n - 1) \text{ if } g_n > 0.$$
Note that $h^*, g^* \in F_{\nu-1}$. From (3.7) it follows $h^* \in T$. If $x \in S - \{h\}$, then $\Gamma(x) < x < h$ (where $\Gamma(x)$ denotes any image of $x$). This shows that $\Gamma(x) = h^*$ for no $x \in S - \{h\}$. Now let
$$S' = (S - \{h\}) \cup \{g\}, \qquad T' = \begin{cases} (T - \{h^*\}) \cup \{g^*\} & \text{if } g_n > 0, \\ T & \text{if } g_n = 0. \end{cases}$$
We now show that $\Gamma S' \subset T'$. Since $\Gamma(x) = h^*$ for no $x \in S - \{h\}$, it suffices to show that $\Gamma(g) \subset T'$. If $g_n > 0$ then $g^*$ is an image of $g$ under $\Gamma$ which is in $T'$ by construction. Observe that $g_n < k_n$. This follows from Lemma 4 since $S$ is $i$-compressed for $i = 1, 2, \dots, n$. If $g_i > 0$ for some $i$, $1 \le i \le n - 1$, then $(g_1, \dots, g_i - 1, \dots, g_n)$ is an image of $g$ under $\Gamma$. But it is also an image of $(g_1, \dots, g_i - 1, \dots, g_n + 1)$, which is in $S$ because it precedes $g$ and $g$ was the smallest element of $F_\nu$ not in $S$. Since $\Gamma(S) \subset T$, it follows $(g_1, \dots, g_i - 1, \dots, g_n) \in T$. Also, because $h_1 > g_1$ ($S$ is 1-compressed), we find $h^* \ne (g_1, \dots, g_i - 1, \dots, g_n)$ so $(g_1, \dots, g_i - 1, \dots, g_n)$ is (still) in $T'$. Thus all images of $g$ are in $T'$, so $\Gamma(S') \subset T'$. Obviously, $S'$ is $i$-compressed for $i = 1, 2, \dots, n$. After a finite number of applications of $'$, we have $S'^{\cdots\prime} = CH_\nu$ and $\Gamma(CH_\nu) \subset T'^{\cdots\prime} = U$. Now $C(\Gamma H_\nu)$ is the first $|\Gamma H_\nu|$ elements of $F_{\nu-1}$ while $\Gamma(CH_\nu)$ is the first $|\Gamma(CH_\nu)|$ elements of $F_{\nu-1}$ by Lemma 3. But $|\Gamma(CH_\nu)| \le |U| = |T| = |\Gamma H_\nu|$. It follows $\Gamma(CH_\nu) \subset C(\Gamma H_\nu)$, and the theorem is proved.
4. CONCLUDING REMARKS

We can show that $p = 4$ suffices in Lemma 1 when $n = 3$. The proof is rather long since the number of cases which one must consider is large. We recently noticed that Katona [1] has proved our theorem when $k_1 = k_2 = \dots = k_n = 1$. Katona puts his result in the language of set theory. He observes that "the theorem is probably useful in proofs by induction over the maximal number of elements of the subsets in a system, as was Sperner's lemma in his paper" [5].
REFERENCES

1. G. KATONA, A Theorem of Finite Sets, Theory of Graphs (Proceedings of the colloquium held at Tihany, Hungary, September 1966), ed. by P. Erdős and G. Katona, Academic Press, New York and London, 1968.
2. B. LINDSTRÖM AND H.-O. ZETTERSTRÖM, A Combinatorial Problem in the k-adic Number System, Proc. Amer. Math. Soc. 18 (1967), 166-170.
3. F. S. MACAULAY, Some Properties of Enumeration in the Theory of Modular Systems, Proc. London Math. Soc. 26 (1927), 531-555.
4. E. SPERNER, Über einen kombinatorischen Satz von Macaulay und seine Anwendung auf die Theorie der Polynomideale, Abh. Math. Sem. Univ. Hamburg 7 (1930), 149-163.
5. E. SPERNER, Ein Satz über Untermengen einer endlichen Menge, Math. Z. 27 (1928), 544-548.
PRINTED IN BRUGES, BELGIUM, BY THE ST. CATHERINE PRESS, LTD.
Reprinted from: JOURNAL OF MATHEMATICAL PHYSICS
VOLUME 11. NUMBER 6
JUNE 1970
Short Proof of a Conjecture by Dyson
I. J. GOOD
Department of Statistics, Virginia Polytechnic Institute, Blacksburg, Virginia
(Received 26 December 1969)
Dyson made a mathematical conjecture in his work on the distribution of energy levels in complex systems. A proof is given which is much shorter than two that have been published before.
Let $G(a)$ denote the constant term in the expansion of

$$F(x; a) = \prod_{i \ne j} \left(1 - \frac{x_j}{x_i}\right)^{a_j}, \qquad i, j = 1, 2, \ldots, n,$$

where $a_1, a_2, \ldots, a_n$ are nonnegative integers and where $F(x; a)$ is expanded in positive and negative powers of $x_1, x_2, \ldots, x_n$. Dyson¹ conjectured that $G(a) = M(a)$, where $M(a)$ is the multinomial coefficient $(a_1 + \cdots + a_n)!/(a_1! \cdots a_n!)$. This was proved by Gunson² and by Wilson³. A much shorter proof is given here.

By applying Lagrange's interpolation formula (see, for example, Kopal⁴) to the function of $x$ that is identically equal to 1 and then putting $x = 0$, we see that

$$\sum_{j} \prod_{i \ne j} \left(1 - \frac{x_j}{x_i}\right)^{-1} = 1.$$

By multiplying $F(x; a)$ by this function we see that, if $a_j \ne 0$, $j = 1, \ldots, n$, then

$$F(x; a) = \sum_{j} F(x; a_1, \ldots, a_{j-1}, a_j - 1, a_{j+1}, \ldots, a_n),$$

so that

$$G(a) = \sum_{j} G(a_1, \ldots, a_{j-1}, a_j - 1, a_{j+1}, \ldots, a_n). \tag{1}$$

If $a_j = 0$, then $x_j$ occurs only to negative powers in $F(x; a)$, so that $G(a)$ is then equal to the constant term in

$$F(x_1, \ldots, x_{j-1}, x_{j+1}, \ldots, x_n;\ a_1, \ldots, a_{j-1}, a_{j+1}, \ldots, a_n);$$

that is,

$$G(a) = G(a_1, \ldots, a_{j-1}, a_{j+1}, \ldots, a_n), \quad \text{if } a_j = 0. \tag{2}$$

Also, of course,

$$G(0) = 1. \tag{3}$$

Equations (1)-(3) clearly uniquely define $G(a)$ recursively. Moreover, they are satisfied by putting $G(a) = M(a)$. Therefore $G(a) = M(a)$, as conjectured by Dyson.

¹ F. J. Dyson, J. Math. Phys. 3, 140, 157, 166 (1962).
² J. Gunson, J. Math. Phys. 3, 752 (1962).
³ K. G. Wilson, J. Math. Phys. 3, 1040 (1962).
⁴ Z. Kopal, Numerical Analysis (Chapman and Hall, London, 1955), p. 21.
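The statement $G(a) = M(a)$ can be checked numerically for small exponents. The following is only an illustrative sketch and not part of the note: it expands $\prod_{i \ne j} (1 - x_j/x_i)^{a_j}$ as a Laurent polynomial by brute force and reads off the constant term. The names `dyson_constant_term` and `multinomial` are hypothetical helpers chosen for this example.

```python
from itertools import permutations
from math import factorial


def dyson_constant_term(a):
    """Constant term of prod_{i != j} (1 - x_j/x_i)^{a_j}, by direct expansion."""
    n = len(a)
    # A Laurent polynomial is stored as {exponent tuple: integer coefficient}.
    poly = {(0,) * n: 1}
    for i, j in permutations(range(n), 2):
        for _ in range(a[j]):
            new = {}
            for exps, coeff in poly.items():
                # Contribution of the term 1.
                new[exps] = new.get(exps, 0) + coeff
                # Contribution of the term -x_j / x_i.
                shifted = list(exps)
                shifted[j] += 1
                shifted[i] -= 1
                key = tuple(shifted)
                new[key] = new.get(key, 0) - coeff
            poly = {e: c for e, c in new.items() if c != 0}
    return poly.get((0,) * n, 0)


def multinomial(a):
    """Multinomial coefficient (a_1 + ... + a_n)! / (a_1! ... a_n!)."""
    result = factorial(sum(a))
    for ai in a:
        result //= factorial(ai)
    return result
```

Calling both functions on the same small tuple, for example `(2, 1, 1)`, compares the brute-force constant term with the multinomial coefficient. The expansion grows quickly, so this is only practical for small $n$ and small $a_j$.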
Reprinted from ADVANCES IN MATHEMATICS, Vol. 5, No. 1, August 1970
All Rights Reserved by Academic Press, New York and London. Printed in Belgium.

On a Lemma of Littlewood and Offord on the Distributions of Linear Combinations of Vectors*
DANIEL J. KLEITMAN
Department of Mathematics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
In this paper we prove the following result:

THEOREM I. Let $a_1, \ldots, a_N$ be vectors in a Hilbert space $S$, each with length at least unity. The number of their linear combinations with coefficients 0 or 1 that can lie in the union of any $k$ regions $R_1, \ldots, R_k$ in $S$, each of diameter less than ($<$) unity, is not more than the sum of the $k$ largest binomial coefficients on $N$.

This result is related to a problem of Littlewood and Offord [1] on the distribution of roots of algebraic equations. Various special cases have been obtained by several authors [2-4]. Theorem I settles a long-standing conjecture of P. Erdős [2]. The method of proof can be extended straightforwardly to prove the following generalization.

THEOREM II. Let $a_1, \ldots, a_N$ be vectors in a Hilbert space $S$, each of length at least unity, and let $m_1, M_1, \ldots, m_N, M_N$ be integers. Then the number of linear combinations $\sum_{i=1}^{N} c_i a_i$ with coefficients $c_i$ integral and in $[m_i, M_i]$ that lie in the union of any $k$ regions in $S$, each of diameter less than ($<$) one, is no more than the number of such linear combinations whose "weights" ($\sum c_i$) are among the $k$ "most populous" weights. This is the number of linear combinations satisfying
* This work was supported in part by NSF GP-13778.
As this bound can be achieved by choosing all a's identical, the result is best possible.
We present the proof in detail for Theorem I only. Parallel steps for Theorem II are easily obtained. We can, without loss of generality, assume that our regions $R_j$ are mutually disjoint. We do so below.

We first note a well-known property of binomial coefficients. The sum of the $k$ largest binomial coefficients on $N$ is equal to the sum of the $k + 1$ and the $k - 1$ largest binomial coefficients on $N - 1$. To prove this we note that the $k$ largest binomial coefficients on $N$ are $\binom{N}{r}, \binom{N}{r+1}, \ldots, \binom{N}{s}$, where $r = [(N - k + 1)/2]$ and $s = r + k - 1 = [(N + k - 1)/2]$. Applying the recursion $\binom{N}{j} = \binom{N-1}{j} + \binom{N-1}{j-1}$ to each of these coefficients yields the desired relation, viz.,

$$\sum_{j=r}^{s} \binom{N}{j} = \sum_{j=r-1}^{s} \binom{N-1}{j} + \sum_{j=r}^{s-1} \binom{N-1}{j}.$$
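As a quick sanity check of this binomial property, the following sketch (an illustration added here, with hypothetical helper names) compares both sides for small $N$ and $k$, and also confirms that the window $r, \ldots, s$ described above does pick out the $k$ largest coefficients.

```python
from math import comb


def sum_k_largest(N, k):
    """Sum of the k largest binomial coefficients C(N, j), 0 <= j <= N."""
    coeffs = sorted((comb(N, j) for j in range(N + 1)), reverse=True)
    return sum(coeffs[:max(k, 0)])


def window_sum(N, k):
    """Sum of C(N, j) for r <= j <= s with r = floor((N - k + 1) / 2), s = r + k - 1."""
    r = (N - k + 1) // 2
    s = r + k - 1
    return sum(comb(N, j) for j in range(r, s + 1))


def identity_holds(N, k):
    """Check: k largest on N equals (k + 1 largest) + (k - 1 largest) on N - 1."""
    return sum_k_largest(N, k) == (
        sum_k_largest(N - 1, k + 1) + sum_k_largest(N - 1, k - 1)
    )
```

The window formula is only meaningful for $1 \le k \le N + 1$, which is the range one would check.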
We prove our theorem by induction on $N$. In light of the property just described we need only show that $(0, 1)$-linear combinations of $(a_1, \ldots, a_N)$ lying in $k$ disjoint regions of diameter $< 1$ can be put in 1:1 correspondence with $(0, 1)$-linear combinations of $(a_1, \ldots, a_{N-1})$ lying either in $k + 1$ or $k - 1$ such regions.

Now $(0, 1)$-linear combinations of $(a_1, \ldots, a_N)$ have coefficient of $a_N$ either zero or one. In the former case they may be considered as $(0, 1)$-linear combinations of $(a_1, \ldots, a_{N-1})$ as they stand. They must lie in our $k$ disjoint regions as sums of $(a_1, \ldots, a_{N-1})$ if they are to do so as sums of $(a_1, \ldots, a_N)$. In the latter case, involving linear combinations explicitly containing $a_N$, the condition that they lie in our $k$ disjoint regions is that their $(a_1, \ldots, a_{N-1})$ parts lie in the translation of these regions by $-a_N$.

If we show that the translation by $-a_N$ of at least one of our regions is disjoint from each of our $k$ original regions we are done, since sums lying in our original $k$ regions with $a_N$ correspond to sums without $a_N$ lying in these regions plus our translated region ($k + 1$ disjoint regions) or lying in the $k - 1$ other translated regions, which are themselves mutually disjoint and of diameter $< 1$.

If we consider a hyperplane normal to $a_N$ placed so that all regions $R_1, \ldots, R_k$ lie on one side of it (the side on which $x \cdot a_N \ge 0$) and so that it just touches the closure of some $R_j$, the translation of $R_j$ by $-a_N$ must lie on the other side of the hyperplane and hence must be disjoint from the regions $R_1, \ldots, R_k$. This remark completes the proof.
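The one-dimensional special case of Theorem I can be explored by brute force. The sketch below is an illustration under stated assumptions, not the author's method: the vectors are real numbers of absolute value at least 1, each region is taken to be a set of reals of diameter less than 1 (so it covers a contiguous run of the sorted subset sums), and a small dynamic program finds the largest number of subset sums that $k$ such regions can cover. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def subset_sums(a):
    """Sorted list of all 2^N sums of sub-collections of a (with multiplicity)."""
    sums = []
    for size in range(len(a) + 1):
        for chosen in combinations(a, size):
            sums.append(sum(chosen, Fraction(0)))
    return sorted(sums)


def max_covered(sums, k):
    """Largest number of entries of sorted `sums` covered by k sets of diameter < 1."""
    m = len(sums)
    # end[i] is one past the last index j with sums[j] - sums[i] < 1.
    end = []
    j = 0
    for i in range(m):
        j = max(j, i)
        while j < m and sums[j] - sums[i] < 1:
            j += 1
        end.append(j)
    best = [0] * (m + 1)  # best[i]: optimum on sums[i:] with the windows used so far
    for _ in range(k):
        nxt = [0] * (m + 1)
        for i in range(m - 1, -1, -1):
            # Either skip sums[i], or start a window at i and continue after it.
            nxt[i] = max(nxt[i + 1], (end[i] - i) + best[end[i]])
        best = nxt
    return best[0]


def bound(N, k):
    """Sum of the k largest binomial coefficients on N."""
    coeffs = sorted((comb(N, j) for j in range(N + 1)), reverse=True)
    return sum(coeffs[:k])
```

With all $a_i$ equal to 1 the subset sums are the integers $0, \ldots, N$ with binomial multiplicities, so `max_covered` should meet `bound` exactly, which matches the remark that the bound is attained when all the $a$'s are identical. Exact rational arithmetic (`Fraction`) is used to avoid floating-point ties at distance exactly 1.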
If we were concerned with linear combinations as in Theorem II we could proceed in the same manner. If the coefficient of $a_N$ can take on $M_N - m_N$ different values we obtain $(M_N - m_N)k$ regions in terms of $(a_1, \ldots, a_{N-1})$ to correspond to the $k$ regions $R_1, \ldots, R_k$. The desired recursion relation is obtained if these can be divided into disjoint families of regions of sizes $k + (M_N - m_N)$, $k + (M_N - m_N) - 2$, $k + (M_N - m_N) - 4, \ldots$. The hyperplane construction above permits us to find such families. That is, if $m_N = 0$, the original $k$ regions along with the translations of $R_j$ defined by $-a_N, -2a_N, \ldots, -M_N a_N$ form $k + M_N$ disjoint regions. Another disjoint family can be obtained by throwing away these and repeating the procedure just described. Iteration of this procedure produces disjoint families of regions which by induction yield the recursion satisfied by the $k$ most populous weights.

ACKNOWLEDGMENT
The author thanks R. Graham who suggested the geometric hyperplane interpretation of the argument described above.
REFERENCES
1. J. E. LITTLEWOOD AND C. OFFORD, On the number of real roots of a random algebraic equation (III), Mat. USSR Sb. 12 (1943), 277-285.
2. P. ERDŐS, On a lemma of Littlewood and Offord, Bull. Amer. Math. Soc. 51 (1945), 898-902.
3. D. KLEITMAN, On a lemma of Littlewood and Offord on the distribution of certain sums, Math. Z. 90 (1965), 251-259.
4. G. KATONA, On a conjecture of Erdős and a stronger form of Sperner's theorem, Studia Sci. Math. Hungar. 1 (1966), 59-63.
PRINTED IN BELGIUM BY THE ST. CATHERINE PRESS, TEMPELHOF 37, BRUGES, LTD.
Ramsey's Theorem for a Class of Categories
R. L. GRAHAM
Bell Telephone Laboratories, Incorporated, Murray Hill, New Jersey
K. LEEB
Universität Erlangen, Erlangen, Germany
AND B. L. ROTHSCHILD
University of California, Los Angeles, California 90024
DEDICATED TO RICHARD RADO ON THE OCCASION OF HIS 65TH BIRTHDAY

1. INTRODUCTION AND BASIC TERMINOLOGY
In this paper we present a Ramsey theorem for certain categories which is sufficiently general to include as special cases the finite vector space analog to Ramsey's theorem (conjectured by Gian-Carlo Rota), the Ramsey theorem for n-parameter sets [2], as well as Ramsey's theorem itself [4, 6]. The Ramsey theorem for finite affine spaces is obtained here simultaneously with that for vector spaces. That these two are equivalent was already known [5, 1], and the arguments previously used to show that the affine theorem implies the projective theorem are also special cases of the results of this paper.

The argument used here to establish the main result is essentially the same as that used for n-parameter sets [2]. What we do here is to abstract the properties of n-parameter sets which suffice to allow the induction argument. In particular, the properties described for n-parameter sets in Remarks 1-3 of [2] are essential.

In order to state the Ramsey property for a category $C$ we must have a notion of rank with which to index the objects and subobjects of the category. To this end, it is convenient to consider henceforth only categories $C$ with the following property:

(a) The objects of $C$ are the nonnegative integers $0, 1, 2, \ldots$, and if $l > k$, $C(l, k) = \emptyset$, where $C(l, k)$ is the set of all morphisms from $l$ to $k$ in $C$.

Using this property, we define a rank on subobjects of an object $l$ in $C$. Namely, if $k \xrightarrow{f} l$ and $k' \xrightarrow{f'} l$ are representatives of the same subobject of $l$, then there must be isomorphisms $k \to k'$ and $k' \to k$. But by (a), this means that $k = k'$. We define the rank of this subobject to be $k$, and we refer to it as a $k$-subobject of $l$. We denote by $C{l \brack k}$ the class of subobjects of $l$ in $C$ of rank $k$. We make the convention that for $k < 0$, or $l < 0$, $C{l \brack k} = \emptyset$.

In order to make our induction argument work, we need a finiteness condition. We assume in addition to (a) that all categories considered here satisfy:

(b) For each pair of integers $k, l$ there is an integer $y_{k,l}$ such that $C{l \brack k}$ is a finite set with $y_{k,l}$ elements. In particular, $y_{0,0} = 1$.

For convenience, all categories we consider are assumed to satisfy

(c) All morphisms of $C$ are monomorphisms.
If $k \xrightarrow{f} l$ is a morphism of $C$, we let $\bar f$ denote the induced mapping from subobjects of $k$ to subobjects of $l$. That is, if $s \xrightarrow{g} k$ represents a subobject of $k$, then $\bar f$ takes this subobject into the subobject of $l$ represented by the composition $fg$. This is clearly well defined, and $\bar f : C{k \brack s} \to C{l \brack s}$.

An $r$-coloring of $C{l \brack k}$ is a function $c : C{l \brack k} \to \{1, \ldots, r\}$. We say that a subobject has color $i$ if its image under $c$ is $i$. An $r$-coloring $c$ of $C{l \brack s}$ induces an $r$-coloring on $C{k \brack s}$ by the composition $c\bar f$, where $k \xrightarrow{f} l$ is in $C$. If the image of $c\bar f$ is only a single element, we say that $c$ has a monochromatic $k$-subobject, namely, the $k$-subobject represented by $k \xrightarrow{f} l$.

We can now state the Ramsey property for a category $C$ satisfying (a)-(c):
Given integers $k, l, r$, there exists a number $n$, depending only on $k, l, r$, so that for all $m \ge n$, every $r$-coloring of $C{m \brack k}$ has a monochromatic $l$-subobject.
When $C$ has morphisms $k \xrightarrow{f} l$ which are all the monomorphic functions from $\{1, \ldots, k\}$ to $\{1, \ldots, l\}$, then this is just the statement of Ramsey's Theorem. If $C$ has morphisms $k \xrightarrow{f} l$ which are the linear monomorphisms from $V_k$ ...

There is a number $N = N_C(k; r; l_1, \ldots, l_r)$ depending only on $k, r, l_1, \ldots, l_r$, such that for any $m \ge N$ and any $r$-coloring $c$ of $C{m \brack k}$, there is an $i$, $1 \le i \le r$, and a morphism $l_i \xrightarrow{f} m$ such that the diagram

$$C{l_i \brack k} \xrightarrow{\ \bar f\ } C{m \brack k} \xrightarrow{\ c\ } \{1, \ldots, r\}, \qquad C{l_i \brack k} \longrightarrow \{i\} \xrightarrow{\ \mathrm{incl}\ } \{1, \ldots, r\}$$

commutes, where $\mathrm{incl}(i) = i$. This statement always holds for $k < 0$, since $C{m \brack k} = \emptyset$, by convention.

If all the $l_i$ are equal, this becomes the Ramsey property stated above. Theorem 1 below provides the induction step in establishing $C(k; l_1, \ldots, l_r)$ for certain categories. It establishes $B(k + 1; l_1, \ldots, l_r)$ if we know $A(k; l_1, \ldots, l_r)$ for all $l_1, \ldots, l_r$, provided the categories $A$ and $B$ are related in a special way. This relation is given by the conditions below.

For a functor $M$ from $A$ to $B$ with $M(x) = y$ for integers $x$ and $y$, we denote by $\bar M$ the induced function from subobjects of $x$ to subobjects of $y$. This is given by letting $\bar M$ take the subobject represented by $s \xrightarrow{f} x$ in $A$ into the subobject represented by $M(s) \xrightarrow{M(f)} y$ in $B$.
Conditions on Categories A and B

There is a functor $M$ from $A$ to $B$ with $M(l) = l + 1$, $l = 0, 1, \ldots$, a functor $P$ from $B$ to $A$ with $P(l) = l$, $l = 0, 1, \ldots$, an integer $t \ge 0$, and for each $l = 0, 1, \ldots$, $t$ morphisms $l \xrightarrow{\phi_{l,j}} l + 1$, $1 \le j \le t$, satisfying the following:

I. For each $k + 1 = 0, 1, 2, \ldots$, the diagonal $d$ in the following diagram is epic, where $\coprod$ (together with the indicated injections) is coproduct, and $d$ is the unique map determined by the coproduct to make the diagram commute:

$$A{l \brack k} \xrightarrow{\ \bar M\ } B{l+1 \brack k+1} \xleftarrow{\ \bar\phi_{l,j}\ } B{l \brack k+1}, \qquad d : A{l \brack k} \amalg \coprod_{j=1}^{t} B{l \brack k+1} \longrightarrow B{l+1 \brack k+1}.$$

II. For each $s \xrightarrow{g} l$ in $B$ and each $j = 1, \ldots, t$ the following diagram commutes:

$$M(P(g))\,\phi_{s,j} = \phi_{l,j}\, g \qquad (s \to s + 1 \to l + 1 \ \text{and}\ s \to l \to l + 1).$$

III. For some $l \xrightarrow{e} l + 1$ in $A$, the following diagram commutes for all $j = 1, \ldots, t$:

$$\phi_{l+1,j}\,\phi_{l,j} = M(e)\,\phi_{l,j} \qquad (l \to l + 1 \to l + 2).$$
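Remark. Let $s + 1 \xrightarrow{h} l$ in $B$. Then by III there is some $s \xrightarrow{e} s + 1$ in $A$ such that

$$\phi_{s+1,j}\,\phi_{s,j} = M(e)\,\phi_{s,j}$$

commutes in $B$ for each $j$. By II, the diagram

$$M(P(h))\,\phi_{s+1,j} = \phi_{l,j}\, h$$

commutes for each $j$. Thus

$$\phi_{l,j}\, h\, \phi_{s,j} = M(P(h)\,e)\,\phi_{s,j}$$

commutes for each $j$.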
THEOREM 1. Let $A$ and $B$ be two categories satisfying the conditions above. Assume $A(k; l_1, \ldots, l_r)$ holds for all $l_1, \ldots, l_r$ and $r > 0$. Then $B(k + 1; l_1, \ldots, l_r)$ holds for all $l_1, \ldots, l_r$, and $r > 0$.
3. PROOF OF MAIN RESULT

We will eventually need a lemma about n-dimensional arrays of points. We state it now without proof. Proofs can be found in [3] and [2]. (It is a special case of Corollary 4 below, in fact.) We denote by $A^n$ the set of $n$-tuples $(x_1, \ldots, x_n)$ of elements $x_i$ of a set $A$.

LEMMA 1. Given integers $r > 0$, $t \ge 0$, there exists an integer $N = N(r, t)$, depending only on $r$ and $t$, such that if $n \ge N$, $A$ is a set of $t$ elements, and $A^n$ is $r$-colored in any way, then there exists a set of $t$ $n$-tuples $(x_1(j), \ldots, x_n(j))$, $1 \le j \le t$, all the same color, with the property that for each $i$, $1 \le i \le n$, either $x_i(j) = j$ for all $j$, or $x_i(j) = a_i$ for all $j$ and some $a_i \in A$.
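Lemma 1 can be tested exhaustively for a two-element set $A$. The sketch below is an illustration only: it takes $A = \{1, \ldots, t\}$, enumerates every set of $t$ $n$-tuples of the kind described in the lemma (each coordinate either fixed or equal to $j$, with at least one coordinate of the second kind so that the tuples are distinct), and searches all $r$-colorings of $A^n$. The function names are hypothetical, and the search is feasible only for very small $r$, $t$, $n$.

```python
from itertools import product


def combinatorial_lines(t, n):
    """All sets of t n-tuples over {1..t} with each coordinate fixed or equal to j."""
    lines = []
    for template in product(list(range(1, t + 1)) + ["*"], repeat=n):
        if "*" not in template:
            continue  # need at least one moving coordinate
        line = [
            tuple(j if c == "*" else c for c in template)
            for j in range(1, t + 1)
        ]
        lines.append(line)
    return lines


def every_coloring_has_mono_line(r, t, n):
    """True if every r-coloring of {1..t}^n contains a monochromatic line."""
    points = list(product(range(1, t + 1), repeat=n))
    index = {p: i for i, p in enumerate(points)}
    lines = [[index[p] for p in line] for line in combinatorial_lines(t, n)]
    for coloring in product(range(r), repeat=len(points)):
        if not any(len({coloring[i] for i in line}) == 1 for line in lines):
            return False
    return True


def least_n(r, t, limit):
    """Smallest n <= limit for which the lemma's conclusion holds, else None."""
    for n in range(1, limit + 1):
        if every_coloring_has_mono_line(r, t, n):
            return n
    return None
```

For $t = 2$ the tuples correspond to subsets of $\{1, \ldots, n\}$ and a line is a pair of properly nested subsets, so a coloring avoids monochromatic lines exactly when each color class is an antichain; this gives the least admissible $n$ equal to $r$, which the search reproduces for $r = 2$ and $r = 3$.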
Proof of Theorem 1. We use induction on $L = l_1 + \cdots + l_r$. $B(k + 1; l_1, \ldots, l_r)$ holds vacuously if $l_i < k + 1$ for any $i$ or if $k + 1 < 0$, and trivially if $t = 0$. So we assume $l_i \ge k + 1 \ge 0$ and $t > 0$. If any $l_i = 0$, then $k + 1 = 0$, and $B(k + 1; l_1, \ldots, l_r)$ holds trivially, since $y_{0,0} = 1$. So we may assume all $l_i > 0$, and, in particular, that $L > 0$. Assume, then, that $B(k + 1; l_1, \ldots, l_r)$ holds for $L - 1$, and let $l_1 + \cdots + l_r = L$, $l_i > 0$.
DEFINITION. For $1 \le h \le m$, suppose $k + 1 \xrightarrow{f} l + h$ is in $B$, and $f = M(f')$ for some $k \xrightarrow{f'} l + h - 1$ in $A$. For any fixed choice of $j_h, j_{h+1}, \ldots, j_{m-1}$, $1 \le j_i \le t$, let $\phi_i = \phi_{l+i,\, j_i}$. Then the $(k + 1)$-subobject of $l + m$ represented by the composition

$$k + 1 \xrightarrow{\ f\ } l + h \xrightarrow{\ \phi_h\ } l + h + 1 \longrightarrow \cdots \longrightarrow l + m - 1 \xrightarrow{\ \phi_{m-1}\ } l + m$$

is said to have signature $(h; j_{m-1}, \ldots, j_h)$ with respect to $l$ and $m$. (The signature need not be unique for a given subobject, nor must every subobject have a signature.) An $r$-coloring of $B{l+m \brack k+1}$ such that all $(k + 1)$-subobjects with the same signature have the same color is called an $(l, m)$-coloring.
For integers $l$ and $m$ we define recursively some numbers needed to prove Lemma 2 below.
VI - NA (k;"m- I ; /, ... , J)
Vm -
N A (k; ,,0;
Vm -
1 + 1 , ... ,
Vm -
1+
I) .
The existence of these numbers is guaranteed by the hypothesis of Theorem 1.
LEMMA 2. With the same assumptions as in Theorem 1, let $l \ge 0$, $m \ge 1$ be integers; let $x \ge v_m + 1$; and let $B{x \brack k+1} \xrightarrow{c} \{1, \ldots, r\}$ be an $r$-coloring. Then there exists $l + m \xrightarrow{g} x$ in $B$ such that $c\bar g$ is an $(l, m)$-coloring of $B{l+m \brack k+1}$.
Proof We use induction on m. For m Assume for some m ~ 2 that it holds for m -
the choice of the (VI
v;, there is some
VI
+
1
I.
the lemma is trivially true. Then by induction, and by
m.....' x in B such that B [vk!~]
is
+ I. m - I)-colored bycg. We now color B [v~:/] as follows:
.....r'
k + 1 - ' VI + 1
and
each choice of compositions
jm - 1 •...• j I, 1 ..
k + 1
f
k + 1 ~VI +1
VI
Two subobjects, represented by
have the same color if and only if for h .. t, the subobjects represented by the
+ 1
~I ~VI +2~
... _ _
~m-I
VI
+m-I _ _ vl +m
and ~I
/
k+l~vl +1~vl +I~
have the same color, where
... _ _ VI
I]
~m-I +m-l~vl
I .. i .. m - I.
+m
This is an
"m_l_
· 0 f B [VIk+1 + ; ca II'It e ,. coIonng
Next, we color subobject in B
[v~:/].
A
A
[vJ 1 by the coloring induced by
M.
That is, a
[vJ] is assigned the same color as its image under M in
In other words,
there is some i. 1 .. i diagram commutes:
..
e' M is the coloring we use.
"m_ I ,
and some I
.....W
VI
By the choice of VI>
in A such that the following
h ..... "m-I}
Thus all the subobjects in MU bye' M(w). k
Suppose
k
.....r' I + h -
1
[ij) have the same color in B [i~\j
colored
+ 1 - ' 1+ h is in B, I .. h .. m, with /- M(j') for some in A. Consider the following diagram:
M(W~ •
k+1
-
vl+h+1
VI +h
q,'+h.}h
I+h
I
vl+m-1
l+h+1
q,.1 +m-I.}m-I
..
Um-I q,l+m-I.}m-1
I+m-I
I
VI+m Um
...
I+m
where
ul-M(w), u,-M(P(U'_I», i-2,3, ... ,m, 2,3, ... ,m. By condition II this commutes of ih, ih+l , ... , im - 1. Consider any subobject of I + m (!;im - 1, ... ,h) with respect to I and m. Let it be
and WI-W, for each choice with signature represented by k + 1 .....' I + m, where e is the bottom row of the diagram above with h - 1. Then ume represents a subobject of VI + m. By the definition of c' and the choice of w, all such subobjects with the same signature (!;im -I, ... ,h)
w, - P (U;_I), i -
have the same color in B [v~!;nl' since the diagram above commutes. On the
other hand, consider a subobject of I + m with signature 1, ... , ih), h ~ 2, and let it be represented by k + 1 .....' I + m, where e is the bottom row of the diagram. By the commutativity of the diagram, Um e - bM (whf), where VI + h .....b VI + m is the top row of the diagram. This means that Um e has signature (h - I;im -I, ... ,ih) with respect to (h; im -
VI
+ 1 and m -
1.
Since cg was a
(VI
+ I, m - O-coloring of B [v~!;nl' the
color of this subobject is determined only by the i,. Thus the color of any subobject with signature (h;im-I, ... , ih) with respect to I and m, h ~ I, has its color under the coloring c gUm determined only by the i" So c gUm is an (I, m)-coloring, and the lemma is proved. We may now proceed with the proof of Theorem I. Let 1- max
1<1<,
NB(k
+ 1;,;1 1 , ... ,1;-10 I, - I, 1;+1,"" I,),
a number which must exist by the induction hypothesis. Let y _ ,YI. HI, where Y'.HI is the number given by property (b). Let m - N{y, 0, where N {y ,0 is the number given by Lemma 1. Let Vm be the number used in the hypothesis of Lemma 2 (depending on I and m), and let x ~ Vm + 1. Finally, let B [k~,l""'c Ii, ... ,,) be an r-coloring. By Lemma 2 there is some
B
1+ m .....6 x in such that cg is an (I,m)-coloring of B[~~,;,l. We now color the m-tuples (h, ...• im), 1 :EO; i, ,.. I, by letting (h • ... ,im) and (k I •...• k m) have the same color if and only if for each k + 1 .....h I in B the subobjects represented by the compositions
_
_'"
II
.I,il
.I+m-l,im --'.l>- 1
h
4>1.k
4>l+m-I./cm
k + 1 --'.l>- 1 --'.l>- 1 + 1 --'.l>- ... --'.l>- 1 + m - 1
+m
and k + 1 --'.l>- 1 --'.l>- 1 + 1 --'.l>- ... --'.l>- 1 + m - 1
--'.l>-
1+ m
both have the same color in B [~~,;,]. This is a y-coloring of the m-tuples. By Lemma
and the choice of m, we can find t m-tuples z .s; t, all having the same color such that for each i either 1, (z) - z for all z or 11 (z) -it for all z and some fixed it. Let iI, ... , id be the i for which it (z) - z (there must be at least one of these since there are t m-tuples here). For 0 .s; a .s; d, let h. denote the composition (h
(z), ...
,1m (z», 1 .s;
1 + i.
-----?> 1 + i. +
... --'.l>- 1 + i. + I - 2
where we let
io - 0
and id+l
- m
• '+ia+1-2,jia+l-l
>
I
+ i.+ 1 - 1,
+ 1. Consider the following diagram:
-! ho
1 --'.l>- ...
/+;1- 1
..
4>1+11-I.j
..
hI l+il
4>1+i1-I,j
4>1+12-l.j
...
M(el)
--
-
l+i l
4>1+id-2- I ,j
1 + id-2-1
!
l+i2- 1
!
..
1+id-2
hd-2
1 +;2
{
4>1+ld-2-I,j M(ed-2)
l+id-2
4>1+ld-I-I.j
where the 1 + id - S - 1 _'d- s 1 + id-s+1 - 1 in A are those guaranteed by the Remark (following Condition III) to make this diagram commute for each 1 - 1,2, ... , t.
......
......
RAMSEYS THEOREM FOR A CLASS OF CATEGORIES By the choice of the represented by A
h.
we have for any
AO
.1+1t-1,}
k+I~/~/+il-1
.. .
M(Cd-Icd-2 ... c2cI) ., 1 + id
k
+ 1 _A 1 that the t subobjects
.,
I+il~
...
Ad
~ 1 + min B [~~ij,
I .. j .. t,
all have the same color, By Condition II, the following diagram commutes for all j: ""+II-I,}
I+il-I
r
I+i l
~
1
ho
..
"",j
1+1
Then letting a - hd M(ed_1 ... el P (ho» we see that for subobjects represented by the t compositions k+
.',j
A a I~ I~ I+I~
all have the same color, Thus
ega"',,}
I+m,
M(P(h o»
k
+ 1 _A 1 in
B
the
I " j " t,
are equal for all
j - 1,2, ...
,t, on
B[k~lj· Now consider any subobject of AI' (A [ij> in B [i~\j.
Let it be
represented by k + 1 - ' 1 + 1 in B, where f- M{f'>, k -to 1 in A, Then the subobject represented by af has signature (id;jm' ... ,itd+l> with respect to 1 and m, since af is just hd M (ed-I ... e I P (h o>1>, Since 1 + m is (f, m>-colored by c g, all subobjects of 1 + m with this signature have the same color. Thus ega gives the same color to any subobject of AI' (A [ij>, since the signature was independent of the choice of f. That is, I ..
q" r.
Consider the coloring ega"'" is some Ip
....IP 1 in B
on
B [k ~ Ij.
[ij-
{q}
for some
q,
By the choice of I, either there
such that
I]
B [k~1
or there is some
I
egaM (A
c'~'.lfp ----~.,
Iq - 1 ....Iq 1 in B
\P),
p '" q,
such that
I] ----~., Cf~'.lfq {q}.
I B [ 1~1
In the former case, we have the desired monochromatic subobject, and the theorem is proved. Hence we may assume that
B
We recall that ega4>I, I
-
[Il:1I]
ega4>I,j
ega4>I,jJq
By Condition II, 4>I,jJq -
<6""',llq
----~> {q}.
on B [k~l] for all j. In particular,
Hi::]]-
Hi::]] -
Now consider any subobject in if (A
the coloring
B,
where
J- M(f'),
M(P{Jq»J - M(P{Jq)j')
ega.
forallj.
M(P{Jq»4>lq-I,j' j - 1, ... ,I.
egaM (p{Jq» 4>1q-I,j
k + I - ' Iq in represented by
{q}
[Iq;
{q}.
Thus
j - I ..... 1 .
I]), and let it be represented by
k -J' Iq - I in A. The subobject is in if (A [L]), and thus has color q by
So cgaM (P{Jq» colors all subobjects in
We also saw above
thatcgaM(P{Jq»
I]) color q,
- [I-I] l+1 ) B[kl+ I]'
colors all subobjects
color q. But by Condition I, this accounts for all of Iq -4aM (P(fq)) x
M (A [Iq;
in4>lq _l.j(B
and hence
is the desired morphism, and the theorem is proved.

4. CONSEQUENCES

PROPOSITION 1. Let $\mathcal{C}$ be a class of categories such that for each category $B$ in $\mathcal{C}$ there is a category $A$ in $\mathcal{C}$ such that $A$ and $B$ satisfy the conditions of Theorem 1. Then $B(k; l_1, \ldots, l_r)$ holds for all $k, l_1, \ldots, l_r$ and all $B$ in $\mathcal{C}$.

Proof. $B(-1; l_1, \ldots, l_r)$ holds vacuously for all $l_1, \ldots, l_r$, as observed at the beginning of the proof of Theorem 1. This holds for all $B$ in $\mathcal{C}$. Thus for each $B$ we can find a suitable $A$ and apply Theorem 1 to obtain $B(0; l_1, \ldots, l_r)$ for all $l_1, \ldots, l_r$. Proceeding in this fashion from 0 to 1 to 2, etc., we obtain $B(k; l_1, \ldots, l_r)$ for all $k, l_1, \ldots, l_r$ and $B$ in $\mathcal{C}$.

COROLLARY 1 (Ramsey). Let $C$ be the category with objects the nonnegative integers and morphisms $k \xrightarrow{f} l$ all the monomorphic functions from $\{1, \ldots, k\}$ into $\{1, \ldots, l\}$, where composition is just composition of functions. Then $C(k; l_1, \ldots, l_r)$ holds in general.

Proof. We must find a class $\mathcal{C}$ containing $C$ which satisfies the conditions of Proposition 1. For $\mathcal{C}$ choose the single category $C$ itself. This clearly satisfies (a)-(c). So for $A$ and $B$ both equal to $C$, we must show that they satisfy the conditions of Theorem 1. Let $P$ be the identity functor on $C$. For any $k \xrightarrow{f} l$ in $C$, let $M(f)$ be the function $k + 1 \xrightarrow{f'} l + 1$ in $C$ given by letting $f'(x) = f(x)$, $x \le k$, and $f'(k + 1) = l + 1$. Let $\phi_l$ be the function from $\{1, \ldots, l\}$ to $\{1, \ldots, l + 1\}$ which acts identically on $\{1, \ldots, l\}$. That is, $\phi_l(x) = x$ for $x \le l$. Then we claim these choices, together with choosing $t = 1$, satisfy I-III.
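The choices just made for the Ramsey category can be checked mechanically on small objects. The sketch below is an illustration with hypothetical names: a morphism $k \to l$ is stored as the tuple of images of $1, \ldots, k$, a subobject is identified with the image set (isomorphisms are permutations of the domain), $P$ is the identity, and $t = 1$. The three functions test Conditions I, II and III in the forms "every $(k+1)$-subobject of $l + 1$ comes from $\bar M$ or from $\bar\phi_l$", "$M(g)\phi_s = \phi_l g$" and "$\phi_{l+1}\phi_l = M(\phi_l)\phi_l$".

```python
from itertools import permutations


def morphisms(k, l):
    """All injective maps {1..k} -> {1..l}, each as a tuple of images."""
    return list(permutations(range(1, l + 1), k))


def compose(g, f):
    """The map 'g after f' on tuples of images."""
    return tuple(g[x - 1] for x in f)


def M(f, l):
    """Extend f: k -> l to k+1 -> l+1 by sending k+1 to l+1."""
    return f + (l + 1,)


def phi(l):
    """The inclusion {1..l} -> {1..l+1}."""
    return tuple(range(1, l + 1))


def subobject(f):
    """Subobject represented by f, identified with the image of f."""
    return frozenset(f)


def condition_I(k, l):
    target = {subobject(f) for f in morphisms(k + 1, l + 1)}
    from_M = {subobject(M(f, l)) for f in morphisms(k, l)}
    from_phi = {subobject(compose(phi(l), f)) for f in morphisms(k + 1, l)}
    return from_M | from_phi == target


def condition_II(s, l):
    # P is the identity functor here, so M(P(g)) = M(g).
    return all(
        compose(M(g, l), phi(s)) == compose(phi(l), g) for g in morphisms(s, l)
    )


def condition_III(l):
    e = phi(l)  # the choice of e made in the text
    return compose(phi(l + 1), phi(l)) == compose(M(e, l + 1), phi(l))
```

Running the three checks over a small range of $k$, $s$ and $l$ gives a finite confirmation only; the general argument is the one given in the text.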
Consider a subobject in $C{l+1 \brack k+1}$ represented by some $k + 1 \xrightarrow{f} l + 1$. First suppose $f(s) = l + 1$ for some $s$. Then $f$ represents the same subobject as $f\pi_{s,k+1}$, where $\pi_{s,k+1}$ is the permutation of $\{1, \ldots, k + 1\}$ fixing everything except $s$ and $k + 1$, which it interchanges. $\pi_{s,k+1}$ is an isomorphism and is its own inverse. Let $k \xrightarrow{f'} l$ be defined by letting $f'(x) = f\pi_{s,k+1}(x)$, $1 \le x \le k$. Then clearly $M(f') = f\pi_{s,k+1}$. Thus the subobject we chose is in $\bar M(C{l \brack k})$. The only other subobjects are represented by some $k + 1 \xrightarrow{f} l + 1$ where $f(\{1, \ldots, k + 1\}) \subset \{1, \ldots, l\}$. Then letting $k + 1 \xrightarrow{f'} l$ be defined by $f'(x) = f(x)$, $1 \le x \le k + 1$, we have $f = \phi_l f'$, and the subobject is in $\bar\phi_l(C{l \brack k+1})$. This establishes I. II is clear from the definitions. III follows by taking $e$ to be $\phi_l$, since $M(\phi_l)(x) = x$ for $1 \le x \le l$. This establishes Corollary 1. We note that if one examines the argument used in the proof of Theorem 1 for this special case, the usual proof of Ramsey's Theorem emerges.
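The smallest nontrivial instance of Corollary 1 is the classical fact that every 2-coloring of the edges of the complete graph on 6 points contains a monochromatic triangle, while 5 points do not suffice. In the notation above this is the case $k = 2$, $r = 2$, $l_1 = l_2 = 3$, with subobjects identified with subsets. The following exhaustive check is an illustrative sketch with hypothetical function names.

```python
from itertools import combinations, product


def has_mono_triangle(n, color):
    """True if some 3-subset of range(n) has all three 2-subsets the same color."""
    return any(
        color[(a, b)] == color[(a, c)] == color[(b, c)]
        for a, b, c in combinations(range(n), 3)
    )


def every_coloring_has_mono_triangle(n):
    """Check all 2-colorings of the 2-subsets of an n-element set."""
    edges = list(combinations(range(n), 2))
    for bits in product((0, 1), repeat=len(edges)):
        color = dict(zip(edges, bits))
        if not has_mono_triangle(n, color):
            return False
    return True
```

The search visits $2^{15}$ colorings for $n = 6$, which is small enough to run directly; larger parameters grow far too quickly for this approach.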
Let v be an infinite-dimensional vector space over GF(q) with basis For each k - O. I, .. '. let vk -
VI. V2 , ....
COROLLARY 2 (Vector Space Analog). described above, C(k;/ 1 , • .• ,I,) holds in general.
For the category C
Proof We apply Proposition 1 to a class containing C. Let A be an infinite-dimensional vector space over GF(q) with basis al. a2 •...• and let Am -
be morphisms in cm. Then their composition is defined to be k -+(Y.H) n, where y - x + ~~_I al ® I/;(wi)' Thus we can think of these morphisms as certain special affine transformations from Am ® vk into Am ® VI' (a)-(c) are satisfied for the cm. We choose for our class q; all the cm. When m - 0, we get the category C of Corollary 2. n
For each m, let B - cm and A - cm + l • We show that these satisfy Theorem 1. To define M, consider a morphism k -+(0' •• ) I in cm+!' Then W E A m+ 1 ® VI can be written uniquely as w - W' + am+1 ® Wm+h where w' E Am ® VI' Let q,': VUI -+ VI + I be determined by letting q,' (VUI) - VI+I + Wm+h and q,' - q, on Vk' Then define M«w. q,» - (w'. q,'), where k + I -+(w' ••,) I + I is in cm. One can verify by a direct check that M preserves composition. We next define P. Let k -+(w ••) I be in cm. Then P «w. q,» - (w". q,"), where w" - w + am+1 ® 0, and q," - q,. Clearly P preserves composition. Also, since the identity morphism for k in cm is (0. Ik), where Ik is the identity transformation on Vb and similarly for cm +h we see that M(J) - I + I and p(J) - I for each I. Finally, let t - IAml - qm, and for each element a E Am and each I let q,lo - (a ® VI+I • ell in cm, where el is the map from VI to VI + I acting identically on VI' Then these choices are sufficient to satisfy I-III. in 1/;:
To check I, let k + I -+(0" •• ') I + I represent a (k + I)-subobject of I + I First suppose q,' (VUI ) S/; VI' Then we can choose some isomorphism VUI -+ VUI such that q,' I/;(Vk ) c VI and q,' I/;(VUI) - VI+I + v' for some cm.
Furthermore, for a suitable choice of v E Am ® Vk+1 we have with w'EAm®V/. Of course (w',cj,') and (w'.cj,'''') represent the same subobject since (v. "') is an isomorphism. Now let k -+(w.~) I be in cm +1> where q, - q,,,,' on Vb and w - w' + am+1 ® v. Then we have M «w, q,)) - (w'. q,' "'). Thus all subobjects represented by a (w'. q,') with q,'(Vk+I) '1. V/ are in M(Cm + 1 On the other hand, ifq,'(vk+l) C VI> then (w' ,q,') - (w" + a ® V/+I> q,') for some a E Am and some w" E Am ® VI' But
V
E VI'
(w',cj,') (v.",)-(w'.cj,''''),
[Ljl.
(w"
+a
® v/+I>q,') - (a ® vl+l>e/) (w" .q,") - q,/,a (w", q,").
where q," - q,' on vk+1> This establishes I.
q,": Vk+1 -
VI'
Thus the subobject is in
q,/a (Cm
[k~l]>.
To check II, let s _(w,~) I in Cm. Then M(p«w.q,m - (w'.q,'), _(w'.l) I + I, where w' - wand q,' is the mapping determined by letting q,' - q, on V" and q,' (Vk+I) - VI+I' Clearly s
+1
This establishes II. Finally, for III, consider in cm +1 the morphism (am+1 9vJ+l.er)
- - - - - l » /+1.
M«am+1 ® v/+I.e/» ",' (V/+I) - V/+2
- (0. ",'),
where
",'
acts
+ V/+I' Now we have for each
identically
on
VI>
and
a E Am,
(a ® V/+2.el+l) (a ® VI+I.e/) - (0. ",') (a ® VI+I.e/).
This esta bJishes III. Thus Cm (k; II •...• I,) holds in general for all m by Proposition I. In particular, as noted above, if m - 0, this establishes Corollary 2. We note also that for m - 1 the subobjects of an object I can be considered to be affine subspaces of VI' Thus we have also proved the affine version of Ramsey's Theorem, which we state below. COROLLARY 3 (Affine Analog). For is true in general.
C -
c i as described above,
C (k; II •...• I,)
The application of Theorem 1 to the case A - CI> B - Co is just the statement that the affine analog for k and all I I • . . . • I, implies the vector space analog for k + 1 and all II , ...• I,. This result was already known [I, 5], and the previous proof is the same as the proof of Theorem I specialized to this case. There was another way given in (5) to show that Corollary 3 implies Corollary 2. Namely, it shows that c i (k;/I •...• 1,) implies Co (k;/I •...• 1,). This argument is also a special case of Theorem I, and we can describe it here. Actually, we replace Co with the equivalent C~, defined by letting k - ' I in c~ if and only if k - 1 _I 1- 1 is in Co. We also must adjoin an identity 10 to c~. If k -("'.~) I is in CI> then M«w.q,)) - (o.q,) in C~, where we recall that k + 1 _(o,~) 1+1 in c~. We let t - 0, thus making the choices of P and q,/j unnecessary. Clearly C~ [L~ld M(e l and I is satisfied. II is
-
442
[Lj),
RAMSEYS THEOREM FOR A CLASS OF CATEGORIES
429
vacuously true as is III, since t - o. Hence by Theorem 1, if C, (k; I J ••••• I,) holds for all II.' ..• I" then Co (k + I; II •...• I,) holds and this is just Co (k;II •... • 1,), as desired. Finally we obtain the Ramsey theorem for n-parameter sets. We refer the reader to [2] to see that the definitions used there are essentially the same as those we will use here. In particular, the categories corresponding to the notions in [2] are the quotient categories described in the last paragraph in this paper. That is, the partially ordered sets of subobjects are isomorphic. Let
be a finite group, and let A - (a I • . . . • a,o) be a finite set. Let be the category with objects o. I. 2. . . .• and morphisms described as follows: G
C (A • G)
For each k and I, the morphisms k s
G __
II ..... I)
__
v. s ) I are diagrams
f
U A ---;..
II ..... k)
U A.
where f is any epimorphic function which acts identically on A, and s is any function such that s (a) - lEG for a E A. Composition of the morphisms v.s) I an d I --(g.t) m IS• gIVen • b v•. s.·,) m, were h . d' k -y k -fg IS or mary composition of functions, and sg . t is defined by s (g(x)) . t (x) - (sg . t) (x) in G for x ElI, ...• m) U A. We note several things about this choice for C(A. G). First there is no mention of the relationship of G to A. G need not be a permutation group on A, nor even act on it at all. This was a necessary assumption for part of the proof in [2]. Second, we allow IA 1 < 2 here, where in [2], IA 1 ;;. 2 was required. Actually, in the situation in [2] where the n-parameter sets under consideration had constant set B C A, we did not need IBI ;;. 2. But this took a separate argument. What we have there is the general result for nparameter sets for arbitrary sets of constants B. COROLLARY 4 (n-Parameter holds in general.
Sets).
If
C-C(A.G).
then
C (k ; II •...• I,)
Proof Again, we consider a class ((I containing C(A. G) for which Proposition 1 holds. There is more than one possibility here. We will give the proof in detail for one class ((I. Then we will describe another class but omit the detailed verification of I-III. It is this second class ((I which provides a more direct translation of the proof in [21. The first ((I we describe now is somewhat different.
Let $\{a_1, a_2, a_3, \ldots\}$ be an infinite set. For each $t = 1, 2, 3, \ldots$, let $A_t = \{a_1, \ldots, a_t\}$, and let $C_t = C(A_t, G)$. Thus $C(A, G)$ above is $C_{t_0}$ here. We claim that $A = C_{m+1}$ and $B = C_m$ satisfy Theorem 1, for all $m \geq 1$.
To see this we first define M. Let k __V.s) I be in C.,+l' Then where k - I __V· ... ) I + I in COl is defined as follows. For xEA.,U{I ..... /), /(x)-f(x) if f(x)EA.,U{I ..... k}, /(x)-k+1 if f (x) - am +I> and r (1+1) - k + 1. For x E A., U II ..... n, $ (x) - s (x), and $ (I + 1) - 1. One can check that M does preserve composition. For the identity map (e/. I), I in Cm +I> where e/ acts identically on I and I (x) - lEG, x E A m +1 U II ..... n, we see that M «e/. 1)) - (e/+I • 1) in COl' so M (I) - I + 1. M«J.s)) - (/.$),
Next we define P. Let k __ !It.,) I be in Col' Then P«h.,» - (h".,"), where k __(h" .... ) I in C.,+I is defined by letting h" (x) - h (x) and u" (x) - u (x)
for x E Am U (I •...• II, and h" (am+l) - am+h preserves composition, and p (J) - I for alii.
r" (am+l) - lEG.
P
clearly
Finally, for each I and any g E G and any j, I';; j .;; m, let or just (j.g)1 for short, where djl(x)-x for x E {I •...• II U Am, djl (I + I) - ai' and 1,1 (x) - lEG for x E {I •...• II U Am, Igi (/ + J) - g. These 4>'s are indexed by the pairs (j. g). We let 1 - IAml IGI - m IGI, and for the choices above we verify I-III. 4>1.(j.g)-(djl.l gl ),
Let
k
+I
I+I
_(f,,)
represent a subobject in Cm
[7::].
Suppose first
that f (/ + il ¢ Am. Let,.. be a permutation on {I •...• k + 11 U Am fixing all a E Am and such that ,..1 takes I + I onto k + 1. Let u - <s (/ + 1))-1, and let «(. s') - ([. s) (.... I. k ... ), where as above, I.k maps {I •...• kl U Am onto lEG, and k + I onto u. Then ( -,..1 and s' - I.kl· s. In particular, since (r. I.k ...) is an isomorphism in cm {its inverse is (,..-I.I.-I k », we see that ([.s) and s') represent the same subobject of 1+1. Now let k _V" ,,") I be defined in c m+! as follows. For x E Am U {J •...• II, we let (' (x) - ( x ) if ( x ) "'" k + I, and (' (x) - am+1 if ( x ) - k + 1. We let (' (am+l) - am+I' For x E Am U (I •...• II, we let s" (x) - s' (x), and s" (am+l) - 1. Then M«([".s"»-«.s'). SO the subobject represented by ([.s) is in M(Cm+1 This is the case, then, for any ([,s) with 1(/ + I) ~ Am. On
«.
[L]>.
the other hand, suppose I (I + J) - aj E Am. Let k + I _if .,') I in c m be defined by (X) - I (x) and s' (x) - s (x) for x E {I •...• II U Am. Then ([. s) - (j. s (/ + 1)) I s') and ([. s) represents a su bobject in (j. s (/ + 1))1 (cm [k~l]>. This establishes I.
«.
For II, we note that for k _V,,) I in cm' M(P«([,s))) is the morphism I _V'.,') I + I in cm, where ( x ) - I (x) and s' (x) - s (x) for x E {I •...• II U Am, and ((/ + I) - k + I, s' (/ + I) - 1. Then for each j and g we see that (j.g)I([.s) - «.s') (j.g)t, establishing II. k
+
To verify III, we consider (m + I. ill in cm+!' Then M«m + I, \)1) is the morphism I + I -(',]) I + 2 in cm where I (x) - I for all x in {I •... , I + 21 U Am and e' (x) - x for x E {I, ... ,II U Am, and e' (/ + il - I + I, e' (I + 2) - I + 1. Then clearly (j,g)1+1 (j,g)l- (e', \) (j,g)l- M«m + I. \)1) (j,g)I' This establishes III and completes the proof of Corollary 4. The alternate choice for the class qj to prove Corollary 4 is as follows. For each m - 0, 1.2... .• let A;" - A U «(I, ... , ml x G), and let c;,. - C(A;". G). Then Co - c. Let ~ be the class of all c;". For each m. C;"+I and c;,. satisfy Theorem 1. x
For E A;" U
k _ V ,') I {I •...• I}
( (x) -
for
in C;"+h we let
we
let
M«([.s))-«.'),
where
for
f (x) and s' (x) - s (x) if I (x) E A;" U (I •...• k);
we let ( x ) - k + I, s' (x) - g's(x); and ( I + \) - k + I, I in c;,. we define P«([.s))-«.') in C;"+I by letting ( x ) - I (x), s' (x) - s (x) if x E A;" U {I •...• II, and ((m + I. g)) (m+l.g), s'«m+l.g»-1. For aEA;" and gEG, as before, we let 4>I,(a.g) - (dal.l gl ). Then I, II and III can be verified, with t - IA;"IIGI. I(x) - (m + I.g),
,(/+\)-1.
For
k_(f,,)
Now we still do not have an exact translation of the proof in [2]. In particular, we have taken no account of any action of G on A. To handle
this we consider a set A and a group G acting on A, a - a' E A for g E G. We consider the category CU, G) and obtain from it the category CU, G) by identifying any two morphisms k _(f•• ) I and k _(g. 0) I for which j(x)-g(;c) and s(x)-u(;c) if j(x) E It, ... ,k}, and j(;c)'u)_g(;c)o(x) otherwise. By considering G to act on (11, ... , m) x G) by (i, g)h - (i, gh) for all h E G, we obtain the categories c;,. - C(A;", G). The categories c;"+l and c;" satisfy Theorem I, where we take for M and p the functors determined by the M and p for c;"+l and c;,. above by their action on classes of identified morphisms. For the q,'s we use classes of identified q,1. fA.,) from above. There are IA;" I of these, represented by the q,1. fA. I). Thus we let 1- IA;" I here. Letting ~ be the class consisting of all c;,., we can apply Proposition 1. This is the exact translation of the proof in [2]. REFERENCES I.
R. L. GRAHAM AND B. L. ROTHSCHILD, Rota's geometric analogue to Ramsey's theorem, Proc. AMS Symp. in Pure Mathematics XIX Combinatorics AMS Providence (1971), 101-104. 2. R. L. GRAHAM AND B. L. ROTHSCHILD, Ramsey's Theorem for n-parameter Sets, Trans. Amer. Math. Soc. 159 (1971), 257-292. 3. A. HALES AND R. I. JEWETT, Regularity and Positional games, Trans. Amer. Math. Soc. 106 (1963), 222-229. 4. F. P. RAMSEY, On a problem of formal logic, Proc. London Math. Soc. 2nd Ser. 30 (1930), 264-286. 5. B. L. ROTHSCHILD, A generalization of Ramsey's theorem and a conjecture of' Rota, doctoral dissertation, Yale University, New Haven, CT, 1967. 6. H. J. RYSER, "Combinatorial Mathematics," Wiley, New York, 1963.
Reprinted from Advances in Math. 8 (1972), 417-433
Reprinted from JOURNAL OF COMBINATORIAL THEORY All Rights Reserved by Academic Press, New York and London
Vol. 13, No. 2, October 1972 Printed in Belgium

A Characterization of Perfect Graphs

L. LOVÁSZ

Eötvös L. University, Budapest, VIII. Múzeum krt. 6-8, Hungary

Communicated by W. T. Tutte

Received December 3, 1971
It is shown that a graph is perfect iff the inequality (maximum clique size) · (stability number) ≥ (number of vertices) holds for each induced subgraph. It follows immediately that the complement of a perfect graph is perfect, a fact conjectured by Berge and proved by the author.

Throughout this note, graph means finite, undirected graph without loops and multiple edges. $\bar G$ and $|G|$ denote the complement and the number of vertices of $G$, respectively. Let $\mu(G)$ denote the maximum cardinality of a clique in the graph $G$, and let $\chi(G)$ be the chromatic number of $G$. Obviously $\chi(G) \geq \mu(G)$.

A graph $G$ is called perfect if $\chi(G') = \mu(G')$ for every induced subgraph $G'$ of $G$. Berge [1] formulated two conjectures in connection with this notion:

(A) A graph is perfect iff neither it nor its complement contains an odd circuit without diagonals.

(B) The complement of a perfect graph is perfect.

Obviously, (A) is stronger than (B). In [3] (B) was proved. This result also follows from the theory of anti-blocking polyhedra, developed by Fulkerson [2]. In the present paper a theorem stronger than (B) but weaker than (A) is proved. This possibility of sharpening (B) was raised by A. Hajnal.

Copyright © 1972 by Academic Press, Inc. All rights of reproduction in any form reserved.
THEOREM. A graph $G$ is perfect if and only if

$$\mu(G')\,\mu(\bar G') \geq |G'|$$

for every induced subgraph $G'$ of $G$.

Proof. Part "only if" is trivial. To prove part "if" we use induction on $|G|$. Thus we may assume that any proper induced subgraph of $G$, as well as its complement, is perfect. Let multiplication of a vertex $x$ by $h$ ($h \geq 0$) mean substituting for it $h$ independent vertices, joined to the same set of vertices as $x$. This notion is closely related to the notion of pluperfection, introduced by D. R. Fulkerson.
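The criterion in the theorem can be tried out by brute force on small graphs. The following is only an illustrative sketch, not part of the original note: the helper names (`clique_number`, `stability_number`, `chromatic_number`, `is_perfect`, `satisfies_lovasz`, `cycle`) are hypothetical, and the exhaustive search is practical only for a handful of vertices. It compares the definition of perfection, $\chi(G') = \mu(G')$ for every induced subgraph, with the inequality $\mu(G')\,\mu(\bar G') \geq |G'|$.

```python
from itertools import combinations, product


def is_clique(vertices, edges):
    # Every pair of the chosen vertices must be joined.
    return all(frozenset(pair) in edges for pair in combinations(vertices, 2))


def is_stable(vertices, edges):
    # No pair of the chosen vertices may be joined.
    return all(frozenset(pair) not in edges for pair in combinations(vertices, 2))


def clique_number(vertices, edges):
    """mu(G'): size of a largest clique among the given vertices."""
    return max(k for k in range(len(vertices) + 1)
               if any(is_clique(c, edges) for c in combinations(vertices, k)))


def stability_number(vertices, edges):
    """mu of the complement: size of a largest stable set."""
    return max(k for k in range(len(vertices) + 1)
               if any(is_stable(c, edges) for c in combinations(vertices, k)))


def chromatic_number(vertices, edges):
    """Smallest k admitting a proper k-coloring of the induced subgraph."""
    vertices = list(vertices)
    inside = [tuple(e) for e in edges if e <= set(vertices)]
    for k in range(len(vertices) + 1):
        for colors in product(range(k), repeat=len(vertices)):
            color = dict(zip(vertices, colors))
            if all(color[u] != color[v] for u, v in inside):
                return k


def induced_vertex_sets(vertices):
    for k in range(1, len(vertices) + 1):
        yield from combinations(vertices, k)


def is_perfect(vertices, edges):
    return all(chromatic_number(s, edges) == clique_number(s, edges)
               for s in induced_vertex_sets(vertices))


def satisfies_lovasz(vertices, edges):
    return all(clique_number(s, edges) * stability_number(s, edges) >= len(s)
               for s in induced_vertex_sets(vertices))


def cycle(n):
    """Vertex list and edge set of the circuit on n vertices."""
    return list(range(n)), {frozenset((i, (i + 1) % n)) for i in range(n)}


for n in range(3, 7):
    v, e = cycle(n)
    print(n, is_perfect(v, e), satisfies_lovasz(v, e))
```

For the circuit on five vertices both tests fail: the clique number and the stability number are both 2, and $2 \cdot 2 < 5$.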
(I) As a first step of the proof we show that if $G_0$ arises from $G$ by multiplication of its vertices then $G_0$ satisfies

$$\mu(G_0)\,\mu(\bar G_0) \geq |G_0|.$$

Assume this is not the case and consider a $G_0$ failing to have this property and with minimum number of vertices. Obviously, there is a vertex $y$ of $G$ which is multiplied by $h \geq 2$; let $y_1, \ldots, y_h$ be the corresponding vertices of $G_0$. Then

$$|G_0| - 1 = |G_0 - y_1| \leq \mu(G_0 - y_1)\,\mu(\overline{G_0 - y_1}) \leq \mu(G_0)\,\mu(\bar G_0)$$

by the minimality of $G_0$; hence, writing $p = \mu(G_0)$ and $r = \mu(\bar G_0)$,

$$\mu(G_0 - y_1) = p, \qquad \mu(\overline{G_0 - y_1}) = r$$

and

$$|G_0| = pr + 1.$$

Put $G_1 = G_0 - \{y_1, \ldots, y_h\}$. Then $G_1$ arises from $G - y$ by multiplication of its vertices, hence by [1, Theorem 1], $\bar G_1$ is perfect. Thus, $G_1$ can be covered by $\mu(\bar G_1) \leq \mu(\bar G_0) = r$ disjoint cliques of $G_1$; let $C_1, \ldots, C_r$ be these cliques, $|C_1| \geq |C_2| \geq \cdots \geq |C_r|$. Obviously, $h \leq r$. Since $|G_1| = |G_0| - h = pr + 1 - h$,

$$|C_1| = \cdots = |C_{r-h+1}| = p.$$

Let $G_2$ be the subgraph of $G_0$ induced by $C_1 \cup \cdots \cup C_{r-h+1} \cup \{y_1\}$; then

$$|G_2| = (r - h + 1)p + 1 < |G_0|;$$

thus, by the minimality of $G_0$,

$$\mu(G_2)\,\mu(\bar G_2) \geq |G_2|.$$

Since $\mu(G_2) \leq \mu(G_0) = p$, this implies

$$\mu(\bar G_2) \geq r - h + 2.$$

Let $F$ be a stable set of $r - h + 2$ vertices of $G_2$; then $|F \cap C_i| \leq 1$ ($1 \leq i \leq r - h + 1$), hence $y_1 \in F$. This implies that $F \cup \{y_2, \ldots, y_h\}$ is stable in $G_0$. On the other hand

$$|F \cup \{y_2, \ldots, y_h\}| = r + 1 > \mu(\bar G_0),$$

a contradiction.
(II) We show that $\chi(G) = \mu(G)$. It is enough to find a stable set $F$ such that $\mu(G - F) < \mu(G)$ since then, by the induction hypothesis, $G - F$ can be colored by $\mu(G) - 1$ colors and, adding $F$ as a further one, we obtain a $\mu(G)$-coloring of $G$. Assume indirectly that $G - F$ contains a $\mu(G)$-clique $C_F$ for any stable set $F$ in $G$. Let, for $x \in G$, $h(x)$ denote the number of $C_F$'s containing $x$. Let $G_0$ arise from $G$ by multiplying each $x$ by $h(x)$. Then, by Part I above,

$$\mu(G_0)\,\mu(\bar G_0) \geq |G_0|.$$

On the other hand, obviously

$$|G_0| = \sum_x h(x) = \sum_F |C_F| = pf,$$

where $f$ denotes the number of all stable sets in $G$, and

$$\mu(G_0) \leq \mu(G) = p,$$

$$\mu(\bar G_0) = \max_F \sum_{x \in F} h(x) = \max_F \sum_{F'} |F \cap C_{F'}| \leq \max_F \sum_{F' \neq F} 1 = f - 1,$$

a contradiction.

REMARK. The condition given in the theorem is strictly related to the max-max inequality given by Fulkerson [2]. Multiplication of a vertex is the same as what he calls pluperfection.
REFERENCES

1. C. BERGE, Färbung von Graphen, deren sämtliche bzw. deren ungerade Kreise starr sind, Wiss. Z. Martin-Luther-Univ. Halle-Wittenberg Math.-Natur. Reihe (1961), 114.
2. D. R. FULKERSON, Blocking and anti-blocking pairs of polyhedra, 7th International Programming Symposium, The Hague, 1970.
3. L. LOVÁSZ, Normal hypergraphs and the perfect graph conjecture, Discrete Math., in press.
Printed by the St Catherine Press Ltd., Tempelhof 37, Bruges, Belgium.
Reprinted from JOURNAL OF COMBINATORIAL THEORY All Rights Reserved by Academic Press, New York and London
Vol. 13, No.3, December 1972 Printed in Belgium
Note A Note on the Line Reconstruction Problem L.
LOVASZ
Eotvos L. University, Budapest, Hungary Communicated by W. T. Tutte
Received May 29, 1972
It is shown that if a graph has more lines than its complement does, then it can be reconstructed from its line-deleted subgraphs.
As in Harary's book [4], graph means finite, undirected graph without loops or multiple lines. $V(G)$ and $E(G)$ denote the sets of points and lines of $G$, respectively. Ulam [6] conjectured that, if two graphs $G_1$ and $G_2$ are such that $V(G_1) = \{v_1, \ldots, v_n\}$, $V(G_2) = \{w_1, \ldots, w_n\}$, $n \geq 3$, and $G_1 - v_i \cong G_2 - w_i$ for each $i$, then $G_1 \cong G_2$. In other words, every graph with at least three points can be uniquely reconstructed from its maximal induced subgraphs. It seems that this conjecture is particularly difficult, and it is solved for special cases only; see, e.g., [5]. An analogous conjecture, formulated by Harary [3], replaces "maximal induced subgraphs" by "maximal subgraphs". This conjecture is actually weaker than Ulam's conjecture (see [1]). In this note we prove it for graphs with "many" lines.

THEOREM. Let $G_1$, $G_2$ be two graphs, $E(G_1) = \{e_1, \ldots, e_m\}$, $E(G_2) = \{f_1, \ldots, f_m\}$, and $|V(G_1)| = |V(G_2)| = n$. Assume that $G_1 - e_i \cong G_2 - f_i$ for each $1 \leq i \leq m$, and $m > \tfrac{1}{2}\binom{n}{2}$. Then $G_1 \cong G_2$.
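The proof of this theorem rests on a sieve (inclusion-exclusion) count of monomorphisms: the number of one-to-one maps sending every line of $G$ to a line of $H$ equals the alternating sum, over all sets $X$ of lines of $G$, of the number of one-to-one maps sending every line of $X$ to a line of the complement of $H$. The following is a minimal numerical check, assuming both graphs live on the points $0, \ldots, n-1$; the function names are hypothetical and the enumeration is feasible only for very small $n$.

```python
from itertools import combinations, permutations


def count_monomorphisms(n, g_edges, h_edges):
    """Number of one-to-one maps of {0..n-1} sending each line of G to a line of H."""
    h = {frozenset(e) for e in h_edges}
    return sum(all(frozenset((p[u], p[v])) in h for u, v in g_edges)
               for p in permutations(range(n)))


def complement(n, edges):
    """Lines of the complementary graph on the points 0..n-1."""
    present = {frozenset(e) for e in edges}
    return [pair for pair in combinations(range(n), 2)
            if frozenset(pair) not in present]


def sieve_count(n, g_edges, h_edges):
    """Alternating sum over line subsets X of G of the maps X -> complement of H."""
    h_bar = complement(n, h_edges)
    total = 0
    for k in range(len(g_edges) + 1):
        for x_edges in combinations(g_edges, k):
            total += (-1) ** k * count_monomorphisms(n, x_edges, h_bar)
    return total


path = [(0, 1), (1, 2), (2, 3)]
circuit = path + [(0, 3)]
print(count_monomorphisms(4, path, circuit), sieve_count(4, path, circuit))
```

The same functions illustrate the last step of the argument: a graph with more than half of all possible lines has no monomorphism into the complement of a graph with equally many lines, because the complement has too few lines to receive them.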
Proof. Let $G \to H$ denote the set of all monomorphisms of $G$ into $H$. Then, by the sieve formula,

$$|G \to H| = \sum_{X \subseteq G} (-1)^{|E(X)|}\, |X \to \bar H|, \tag{1}$$

where $\bar H$ is the complement of $H$ and $X$ runs over all graphs with $V(X) = V(G)$, $E(X) \subseteq E(G)$. In effect, the right-hand side of (1) just counts all maps from the points of $G$ to the points of $H$, then takes away those maps sending (at least) one line of $G$ to a line of $\bar H$, then adds those sending (at least) two lines to lines of $\bar H$, etc. Thus it counts exactly those maps which send no lines of $G$ to lines of $\bar H$, so every line of $G$ goes to a line of $H$. Applying (1) to $G_1$ and $G_2$ we have

$$|G_1 \to G_2| = \sum_{X \subseteq G_1} (-1)^{|E(X)|}\, |X \to \bar G_2|, \tag{2}$$

and for $G_2$ and $G_2$ we have

$$|G_2 \to G_2| = \sum_{X \subseteq G_2} (-1)^{|E(X)|}\, |X \to \bar G_2|. \tag{3}$$

Since the hypothesis on maximal subgraphs assures that $G_1$ and $G_2$ have the same proper subgraphs (see [2, p. 92]), the terms in (2) and (3), with $X \neq G_1$ and $X \neq G_2$, are equal. Also, since $m > \tfrac{1}{2}\binom{n}{2}$, $|G_1 \to \bar G_2| = |G_2 \to \bar G_2| = 0$. Hence $|G_1 \to G_2| = |G_2 \to G_2| > 0$, which proves the theorem.

REFERENCES
1. D. L. GREENWELL, Reconstructing graphs, Proc. Amer. Math. Soc. 30 (1971), 431-433.
2. D. L. GREENWELL AND R. L. HEMMINGER, Reconstructing graphs, "The Many Facets of Graph Theory" (G. T. Chartrand and S. F. Kapoor, eds.), Springer-Verlag, New York, 1969.
3. F. HARARY, On the reconstruction of a graph from a collection of subgraphs, "Theory of Graphs and Its Applications" (M. Fiedler, ed.), Czechoslovak Academy of Sciences, Prague/Academic Press, New York, 1965, pp. 47-52.
4. F. HARARY, "Graph Theory," Addison-Wesley, Reading, Mass., 1969.
5. F. HARARY AND B. MANVEL, The reconstruction conjecture for labeled graphs, "Combinatorial Structures and Their Applications" (R. K. Guy, ed.), Gordon & Breach, New York, 1969.
6. S. M. ULAM, "A Collection of Mathematical Problems," Wiley (Interscience), New York, 1960, p. 29.
Printed by the St Catherine Press Ltd., Tempelhof 37. Bruges. Belgium.
© DISCRETE MATHEMATICS 5 (1973) 171-178. North-Holland Publishing Company
ACYCLIC ORIENTATIONS OF GRAPHS* Richard P. STANLEY Department of Mathematics, University of California, Berkeley, Calif 94720, USA Received 1 June 1972
Abstract. Let $G$ be a finite graph with $p$ vertices and $\chi$ its chromatic polynomial. A combinatorial interpretation is given to the positive integer $(-1)^p \chi(-\lambda)$, where $\lambda$ is a positive integer, in terms of acyclic orientations of $G$. In particular, $(-1)^p \chi(-1)$ is the number of acyclic orientations of $G$. An application is given to the enumeration of labeled acyclic digraphs. An algebra of full binomial type, in the sense of Doubilet-Rota-Stanley, is constructed which yields the generating functions which occur in the above context.
1. The chromatic polynomial with negative arguments
Let $G$ be a finite graph, which we assume to be without loops or multiple edges. Let $V = V(G)$ denote the set of vertices of $G$ and $X = X(G)$ the set of edges. An edge $e \in X$ is thought of as an unordered pair $\{u, v\}$ of two distinct vertices. The integers $p$ and $q$ denote the cardinalities of $V$ and $X$, respectively. An orientation of $G$ is an assignment of a direction to each edge $\{u, v\}$, denoted by $u \to v$ or $v \to u$, as the case may be. An orientation of $G$ is said to be acyclic if it has no directed cycles.

Let $\chi(\lambda) = \chi(G, \lambda)$ denote the chromatic polynomial of $G$ evaluated at $\lambda \in \mathbb{C}$. If $\lambda$ is a non-negative integer, then $\chi(\lambda)$ has the following rather unorthodox interpretation.

Proposition 1.1. $\chi(\lambda)$ is equal to the number of pairs $(\sigma, \mathcal{O})$, where $\sigma$ is any map $\sigma: V \to \{1, 2, \ldots, \lambda\}$ and $\mathcal{O}$ is an orientation of $G$, subject to the two conditions:
(a) The orientation $\mathcal{O}$ is acyclic.
(b) If $u \to v$ in the orientation $\mathcal{O}$, then $\sigma(u) > \sigma(v)$.

* The research was supported by a Miller Research Fellowship.

Proof. Condition (b) forces the map $\sigma$ to be a proper coloring (i.e., if $\{u, v\} \in X$, then $\sigma(u) \neq \sigma(v)$). From (b), condition (a) follows automatically. Conversely, if $\sigma$ is proper, then (b) defines a unique acyclic orientation of $G$. Hence, the number of allowed $\sigma$ is just the number of proper colorings of $G$ with the colors $1, 2, \ldots, \lambda$, which by definition is $\chi(\lambda)$.

Proposition 1.1 suggests the following modification of $\chi(\lambda)$. If $\lambda$ is a non-negative integer, define $\bar\chi(\lambda)$ to be the number of pairs $(\sigma, \mathcal{O})$, where $\sigma$ is any map $\sigma: V \to \{1, 2, \ldots, \lambda\}$ and $\mathcal{O}$ is an orientation of $G$, subject to the two conditions:
(a') The orientation $\mathcal{O}$ is acyclic.
(b') If $u \to v$ in the orientation $\mathcal{O}$, then $\sigma(u) \geq \sigma(v)$.
We then say that $\sigma$ is compatible with $\mathcal{O}$. The relationship between $\chi$ and $\bar\chi$ is somewhat analogous to the relationship between combinations of $n$ things taken $k$ at a time without repetition, enumerated by $\binom{n}{k}$, and with repetition, enumerated by $\binom{n+k-1}{k} = (-1)^k \binom{-n}{k}$.

Theorem 1.2. For all non-negative integers $\lambda$, $\bar\chi(\lambda) = (-1)^p \chi(-\lambda)$.
Proof. Recall the well-known fact that the chromatic polynomial $\chi(G, \lambda)$ is uniquely determined by the three conditions:
(i) $\chi(G_0, \lambda) = \lambda$, where $G_0$ is the one-vertex graph,
(ii) $\chi(G + H, \lambda) = \chi(G, \lambda)\,\chi(H, \lambda)$, where $G + H$ is the disjoint union of $G$ and $H$,
(iii) for all $e \in X$, $\chi(G, \lambda) = \chi(G \setminus e, \lambda) - \chi(G/e, \lambda)$, where $G \setminus e$ denotes $G$ with the edge $e$ deleted and $G/e$ denotes $G$ with the edge $e$ contracted to a point.
Hence, it suffices to prove the following three properties of $\bar\chi$:
(i') $\bar\chi(G_0, \lambda) = \lambda$, where $G_0$ is the one-vertex graph,
(ii') $\bar\chi(G + H, \lambda) = \bar\chi(G, \lambda)\,\bar\chi(H, \lambda)$,
(iii') $\bar\chi(G, \lambda) = \bar\chi(G \setminus e, \lambda) + \bar\chi(G/e, \lambda)$.

Properties (i') and (ii') are obvious, so we need only prove (iii'). Let $\sigma: V(G \setminus e) \to \{1, 2, \ldots, \lambda\}$ and let $\mathcal{O}$ be an acyclic orientation of $G \setminus e$ compatible with $\sigma$, where $e = \{u, v\} \in X$. Let $\mathcal{O}_1$ be the orientation of $G$ obtained by adjoining $u \to v$ to $\mathcal{O}$, and $\mathcal{O}_2$ that obtained by adjoining $v \to u$. Observe that $\sigma$ is defined on $V(G)$ since $V(G) = V(G \setminus e)$. We will show that for each pair $(\sigma, \mathcal{O})$, exactly one of $\mathcal{O}_1$ and $\mathcal{O}_2$ is an acyclic orientation compatible with $\sigma$, except for $\bar\chi(G/e, \lambda)$ of these pairs, in which case both $\mathcal{O}_1$ and $\mathcal{O}_2$ are acyclic orientations compatible with $\sigma$. It then follows that $\bar\chi(G, \lambda) = \bar\chi(G \setminus e, \lambda) + \bar\chi(G/e, \lambda)$, so proving the theorem.

For each pair $(\sigma, \mathcal{O})$, where $\sigma: G \setminus e \to \{1, 2, \ldots, \lambda\}$ and $\mathcal{O}$ is an acyclic orientation of $G \setminus e$ compatible with $\sigma$, one of the following three possibilities must hold.

Case 1: $\sigma(u) > \sigma(v)$. Clearly $\mathcal{O}_2$ is not compatible with $\sigma$ while $\mathcal{O}_1$ is compatible. Moreover, $\mathcal{O}_1$ is acyclic, since if $u \to v \to w_1 \to w_2 \to \cdots \to u$ were a directed cycle in $\mathcal{O}_1$, we would have $\sigma(u) > \sigma(v) \geq \sigma(w_1) \geq \sigma(w_2) \geq \cdots \geq \sigma(u)$, which is impossible.

Case 2: $\sigma(u) < \sigma(v)$. Then symmetrically to Case 1, $\mathcal{O}_2$ is acyclic and compatible with $\sigma$, while $\mathcal{O}_1$ is not compatible.

Case 3: $\sigma(u) = \sigma(v)$. Both $\mathcal{O}_1$ and $\mathcal{O}_2$ are compatible with $\sigma$. We claim that at least one of them is acyclic. Suppose not. Then $\mathcal{O}_1$ contains a directed cycle $u \to v \to w_1 \to w_2 \to \cdots \to u$ while $\mathcal{O}_2$ contains a directed cycle $v \to u \to w_1' \to w_2' \to \cdots \to v$. Hence, $\mathcal{O}$ contains the directed cycle

$$u \to w_1' \to w_2' \to \cdots \to v \to w_1 \to w_2 \to \cdots \to u,$$

contradicting the assumption that $\mathcal{O}$ is acyclic.

It remains to prove that both $\mathcal{O}_1$ and $\mathcal{O}_2$ are acyclic for exactly $\bar\chi(G/e, \lambda)$ pairs $(\sigma, \mathcal{O})$, with $\sigma(u) = \sigma(v)$. To do this we define a bijection $\Phi(\sigma, \mathcal{O}) = (\sigma', \mathcal{O}')$ between those pairs $(\sigma, \mathcal{O})$ such that both $\mathcal{O}_1$ and $\mathcal{O}_2$ are acyclic (with $\sigma(u) = \sigma(v)$) and those pairs $(\sigma', \mathcal{O}')$ such that $\sigma': G/e \to \{1, 2, \ldots, \lambda\}$ and $\mathcal{O}'$ is an acyclic orientation of $G/e$ compatible with $\sigma'$. Let $z$ be the vertex of $G/e$ obtained by identifying $u$ and $v$, so $V(G/e) = V(G \setminus e) - \{u, v\} \cup \{z\}$ and $X(G/e) = X(G \setminus e)$. Given $(\sigma, \mathcal{O})$, define $\sigma'$ by $\sigma'(w) = \sigma(w)$ for all $w \in V(G/e) - \{z\}$ and $\sigma'(z) = \sigma(u) = \sigma(v)$. Define $\mathcal{O}'$ by $w_1 \to w_2$ in $\mathcal{O}'$ if and only if $w_1 \to w_2$ in $\mathcal{O}$. It is easily seen that the map $\Phi(\sigma, \mathcal{O}) = (\sigma', \mathcal{O}')$ establishes the desired bijection, and we are through.
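Theorem 1.2 lends itself to a direct numerical check on small graphs. The sketch below is an illustration only and makes its own choices: the chromatic polynomial is evaluated at an integer by the deletion-contraction rule (iii), and the pairs consisting of an acyclic orientation and a compatible map are counted by exhaustive enumeration. All function names are hypothetical.

```python
from itertools import product


def chromatic_value(vertices, edges, lam):
    """Chromatic polynomial at the integer lam, by deletion-contraction."""
    edges = {frozenset(e) for e in edges}
    if not edges:
        return lam ** len(vertices)
    e = next(iter(edges))
    u, v = tuple(e)
    deleted = edges - {e}
    # Contract e by merging v into u; parallel edges collapse in the set.
    contracted = {frozenset(u if w == v else w for w in f) for f in deleted}
    rest = [w for w in vertices if w != v]
    return (chromatic_value(vertices, deleted, lam)
            - chromatic_value(rest, contracted, lam))


def is_acyclic(vertices, arcs):
    """Repeatedly strip vertices with no incoming arc; fail if none exists."""
    remaining, arcs = set(vertices), set(arcs)
    while remaining:
        sources = {w for w in remaining if all(b != w for _, b in arcs)}
        if not sources:
            return False
        remaining -= sources
        arcs = {(a, b) for a, b in arcs if a not in sources}
    return True


def count_compatible_pairs(vertices, edges, lam):
    """Pairs (sigma, O): O acyclic, and sigma(u) >= sigma(v) whenever u -> v."""
    vertices, edges = list(vertices), [tuple(e) for e in edges]
    count = 0
    for flips in product((False, True), repeat=len(edges)):
        arcs = [(v, u) if flip else (u, v) for (u, v), flip in zip(edges, flips)]
        if not is_acyclic(vertices, arcs):
            continue
        for values in product(range(1, lam + 1), repeat=len(vertices)):
            sigma = dict(zip(vertices, values))
            if all(sigma[a] >= sigma[b] for a, b in arcs):
                count += 1
    return count


triangle = ([0, 1, 2], [(0, 1), (1, 2), (0, 2)])
for lam in (1, 2, 3):
    p = len(triangle[0])
    print(lam, count_compatible_pairs(*triangle, lam),
          (-1) ** p * chromatic_value(*triangle, -lam))
```

With $\lambda = 1$ every map is compatible with every orientation, so the count reduces to the number of acyclic orientations; for the triangle this is 6.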
Theorem 1.2 provides a combinatorial interpretation of the positive integer $(-1)^p \chi(G, -\lambda)$, where $\lambda$ is a positive integer. In particular, when $\lambda = 1$ every orientation of $G$ is automatically compatible with every map $\sigma: G \to \{1\}$. We thus obtain the following corollary.

Corollary 1.3. If $G$ is a graph with $p$ vertices, then $(-1)^p \chi(G, -1)$ is equal to the number of acyclic orientations of $G$.

In [5], the following question was raised (for a special class of graphs). Let $G$ be a $p$-vertex graph and let $\omega$ be a labeling of $G$, i.e., a bijection $\omega: V(G) \to \{1, 2, \ldots, p\}$. Define an equivalence relation $\sim$ on the set of all $p!$ labelings $\omega$ of $G$ by the condition that $\omega \sim \omega'$ if whenever $\{u, v\} \in X(G)$, then $\omega(u) < \omega(v) \Leftrightarrow \omega'(u) < \omega'(v)$. How many equivalence classes of labelings of $G$ are there? Clearly two labelings $\omega$ and $\omega'$ are equivalent if and only if the unique orientations $\mathcal{O}$ and $\mathcal{O}'$ compatible with $\omega$ and $\omega'$, respectively, are equal. Moreover, the orientations $\mathcal{O}$ which arise in this way are precisely the acyclic ones. Hence, by Corollary 1.3, the number of equivalence classes is $(-1)^p \chi(G, -1)$.

We conclude this section by discussing the relationship between the chromatic polynomial of a graph and the order polynomial [4;5;6] of a partially ordered set. If $P$ is a $p$-element partially ordered set, define the order polynomial $\Omega(P, \lambda)$ (evaluated at the non-negative integer $\lambda$) to be the number of order-preserving maps $\sigma: P \to \{1, 2, \ldots, \lambda\}$. Define the strict order polynomial $\bar\Omega(P, \lambda)$ to be the number of strict order-preserving maps $\sigma: P \to \{1, 2, \ldots, \lambda\}$, i.e., if $x < y$ in $P$, then $\sigma(x) < \sigma(y)$. In [5], it was shown that $\Omega$ and $\bar\Omega$ are polynomials in $\lambda$ related by $\bar\Omega(P, \lambda) = (-1)^p \Omega(P, -\lambda)$. This is the precise analogue of Theorem 1.2. We shall now clarify this analogy.

If $\mathcal{O}$ is an orientation of a graph $G$, regard $\mathcal{O}$ as a binary relation $\geq$ on $V(G)$ defined by $u \geq v$ if $u \to v$. If $\mathcal{O}$ is acyclic, then the transitive and reflexive closure $\bar{\mathcal{O}}$ of $\mathcal{O}$ is a partial ordering of $V(G)$. Moreover, a map $\sigma: V(G) \to \{1, 2, \ldots, \lambda\}$ is compatible with $\mathcal{O}$ if and only if $\sigma$ is order-preserving when considered as a map from $\bar{\mathcal{O}}$. Hence the number of $\sigma$ compatible with $\mathcal{O}$ is just $\Omega(\bar{\mathcal{O}}, \lambda)$ and we conclude that

$$\bar\chi(G, \lambda) = \sum_{\mathcal{O}} \Omega(\bar{\mathcal{O}}, \lambda),$$

where the sum is over all acyclic orientations $\mathcal{O}$ of $G$. In the same way, using Proposition 1.1, we deduce

$$\chi(G, \lambda) = \sum_{\mathcal{O}} \bar\Omega(\bar{\mathcal{O}}, \lambda). \tag{1}$$

Hence, Theorem 1.2 follows from the known result $\bar\Omega(P, \lambda) = (-1)^p \Omega(P, -\lambda)$, but we thought a direct proof to be more illuminating. Equation (1) strengthens the claim made in [4] that the strict order polynomial $\bar\Omega$ is a partially-ordered set analogue of the chromatic polynomial $\chi$.
2. Enumeration of labeled acyclic digraphs

Corollary 1.3, when combined with a result of Read (also obtained by Bender and Goldman), yields an immediate solution to the problem of enumerating labeled acyclic digraphs with $n$ vertices. The same result was obtained by R.W. Robinson (to be published), who applies it to the unlabeled case.

Proposition 2.1. Let $f(n)$ be the number of labeled acyclic digraphs with $n$ vertices. Then

$$\sum_{n=0}^{\infty} f(n)\, \frac{x^n}{n!\, 2^{\binom{n}{2}}} = \left( \sum_{n=0}^{\infty} (-1)^n \frac{x^n}{n!\, 2^{\binom{n}{2}}} \right)^{-1}.$$

Proof. By Corollary 1.3,

$$f(n) = (-1)^n \sum_{G} \chi(G, -1), \tag{2}$$

where the sum is over all labeled graphs $G$ with $n$ vertices. Now, Read [3] (see also [1]) has shown that if

$$M_n(k) = \sum_{G} \chi(G, k)$$

(where the sum has the same range as in (2)), then

$$\sum_{n=0}^{\infty} M_n(k)\, \frac{x^n}{n!\, 2^{\binom{n}{2}}} = \left( \sum_{n=0}^{\infty} \frac{x^n}{n!\, 2^{\binom{n}{2}}} \right)^{k}. \tag{3}$$

Actually, the above papers have $2^{n^2/2}$ where we have $2^{\binom{n}{2}}$; this amounts to the transformation $x' = 2^{1/2} x$. One advantage of our 'normalization' is that the numbers $n!\, 2^{\binom{n}{2}}$ are integers; a second is that the function

$$F(x) = \sum_{n=0}^{\infty} \frac{x^n}{n!\, 2^{\binom{n}{2}}}$$

satisfies the functional relation $F'(x) = F(\tfrac{1}{2}x)$. A third advantage is mentioned in the next section. Thus setting $k = -1$ and changing $x$ to $-x$ in (3) yields the desired result.
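The generating-function identity of Proposition 2.1 says that $\sum_n f(n)\,x^n/B(n)$ is the reciprocal of $\sum_n (-1)^n x^n/B(n)$, where $B(n) = n!\,2^{\binom{n}{2}}$. Comparing coefficients of $x^n$ in the product of the two series, and using $B(n)/(B(k)B(n-k)) = \binom{n}{k} 2^{k(n-k)}$, gives a recurrence for $f(n)$. The sketch below rests on that derivation (an assumption of this example, not a formula stated in the text) and compares it with a brute-force count; the function names are hypothetical.

```python
from itertools import product
from math import comb


def acyclic_digraph_counts(n_max):
    """f(0..n_max) from the recurrence obtained by comparing coefficients."""
    f = [1]
    for n in range(1, n_max + 1):
        f.append(sum((-1) ** (n - k + 1) * comb(n, k) * 2 ** (k * (n - k)) * f[k]
                     for k in range(n)))
    return f


def is_acyclic(vertices, arcs):
    """Repeatedly strip vertices with no incoming arc; fail if none exists."""
    remaining, arcs = set(vertices), set(arcs)
    while remaining:
        sources = {w for w in remaining if all(b != w for _, b in arcs)}
        if not sources:
            return False
        remaining -= sources
        arcs = {(a, b) for a, b in arcs if a not in sources}
    return True


def brute_force_count(n):
    """Count arc sets on n labeled vertices that contain no directed cycle."""
    pairs = [(u, v) for u in range(n) for v in range(n) if u != v]
    return sum(is_acyclic(range(n), [p for p, bit in zip(pairs, mask) if bit])
               for mask in product((0, 1), repeat=len(pairs)))


print(acyclic_digraph_counts(5))
print([brute_force_count(n) for n in range(5)])
```

The brute-force count examines all $2^{n(n-1)}$ arc sets, so it is usable only up to about $n = 4$; the recurrence has no such limit.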
By analyzing the behavior of the function $F(x) = \sum_{n=0}^{\infty} x^n / (n!\, 2^{\binom{n}{2}})$, we obtain estimates for $f(n)$. For instance, Rouché's theorem can be used to show that $F(x)$ has a unique zero $\alpha \approx -1.488$ satisfying $|\alpha| \leq 2$. Standard techniques yield the asymptotic formula

$$f(n) \sim C\, 2^{\binom{n}{2}}\, n!\, (-\alpha)^{-n},$$

where $\alpha$ is as above and $1.741 \approx C = -1/(\alpha F(\tfrac{1}{2}\alpha))$. A more careful analysis of $F(x)$ will yield more precise estimates for $f(n)$.
3. An algebra of binomial type

The existence of a combinatorial interpretation of the coefficients $M_n(k)$ in the expansion

$$\sum_{n=0}^{\infty} M_n(k)\, \frac{x^n}{n!\, 2^{\binom{n}{2}}} = \left( \sum_{n=0}^{\infty} \frac{x^n}{n!\, 2^{\binom{n}{2}}} \right)^{k}$$

suggests the existence of an algebra of full binomial type with structure constants $B(n) = 2^{\binom{n}{2}}\, n!$ in the sense of [2]. This is equivalent to finding a locally finite partially ordered set $P$ (said to be of full binomial type), satisfying the following conditions:
(a) In any segment $[x, y] = \{z \mid x \leq z \leq y\}$ of $P$ (where $x \leq y$ in $P$), every maximal chain has the same length $n$. We call $[x, y]$ an $n$-segment.
(b) There exists an $n$-segment for every integer $n \geq 0$ and the number of maximal chains in any $n$-segment is $B(n) = 2^{\binom{n}{2}}\, n!$. (In particular, $B(1)$ must equal 1, further explaining the normalization $x' = 2^{1/2} x$ of Section 2.)

If such a partially ordered set $P$ exists, then by [2] the value of $\zeta^k(x, y)$, where $\zeta$ is the zeta function of $P$, $k$ is any integer and $[x, y]$ is any $n$-segment, depends only on $k$ and $n$. We write $\zeta^k(x, y) = \zeta^k(n)$. Then again from [2],

$$\sum_{n=0}^{\infty} \zeta^k(n)\, \frac{x^n}{B(n)} = \left( \sum_{n=0}^{\infty} \frac{x^n}{B(n)} \right)^{k}.$$

Hence $\zeta^k(n) = M_n(k)$. In particular, the cardinality of any $n$-segment $[x, y]$ is $M_n(2)$, the number of labeled two-colored graphs with $n$ vertices; while $\mu(x, y) = (-1)^n f(n)$, where $\mu$ is the Möbius function of $P$ and $f(n)$ is the number of labeled acyclic digraphs with $n$ vertices. The general theory developed in [2] provides a combinatorial interpretation of the coefficients of various other generating functions, such as $\left(\sum_{n=1}^{\infty} x^n/B(n)\right)^k$ and $\left(2 - \sum_{n=0}^{\infty} x^n/B(n)\right)^{-1}$.

Since $M_n(2)$ is the cardinality of an $n$-segment, this suggests taking elements of $P$ to be properly two-colored graphs. We consider a somewhat more general situation.

Proposition 3.1. Let $V$ be an infinite vertex set, let $q$ be a positive integer and let $P_q$ be the set of all pairs $(G, \sigma)$, where $G$ is a function from all 2-sets $\{u, v\} \subseteq V$ ($u \neq v$) into $\{0, 1, \ldots, q - 1\}$ such that all but finitely many values of $G$ are 0, and where $\sigma: V \to \{0, 1\}$ is a map satisfying the condition that if $G(\{u, v\}) > 0$ then $\sigma(u) \neq \sigma(v)$ and that $\sum_{u \in V} \sigma(u) < \infty$. If $(G, \sigma)$ and $(H, \tau)$ are in $P_q$, define $(G, \sigma) \leq (H, \tau)$ if:
(a) $\sigma(u) \leq \tau(u)$ for all $u \in V$, and
(b) if $\sigma(u) = \tau(u)$ and $\sigma(v) = \tau(v)$, then $G(\{u, v\}) = H(\{u, v\})$.
Then $P_q$ is a partially ordered set of full binomial type with structure constants $B(n) = n!\, q^{\binom{n}{2}}$.
Proof. If $(H, \tau)$ covers $(G, \sigma)$ in $P_q$ (i.e., if $(H, \tau) > (G, \sigma)$ and no $(G', \sigma')$ satisfies $(H, \tau) > (G', \sigma') > (G, \sigma)$), then

$$\sum_{u \in V} \tau(u) = 1 + \sum_{u \in V} \sigma(u).$$

From this it follows that in every segment of $P_q$, all maximal chains have the same length.

In order to prove that an $n$-segment $S = [(G, \sigma), (H, \tau)]$ has $n!\, q^{\binom{n}{2}}$ maximal chains, it suffices to prove that $(H, \tau)$ covers exactly $n q^{n-1}$ elements of $S$, for then the number of maximal chains in $S$ will be $(n q^{n-1})((n-1) q^{n-2}) \cdots (2 q^{1}) \cdot 1 = n!\, q^{\binom{n}{2}}$. Since $S$ is an $n$-segment, there are precisely $n$ vertices $v_1, v_2, \ldots, v_n \in V$ such that $\sigma(v_i) = 0 < 1 = \tau(v_i)$. Suppose $(H, \tau)$ covers $(H', \tau') \in S$. Then $\tau'$ and $\tau$ agree on every $v \in V$ except for one $v_i$, say $v_1$, so $\tau'(v_1) = 0$, $\tau(v_1) = 1$. Suppose now $H'(\{u, v\}) > 0$, where we can assume $\tau'(u) = 0$, $\tau'(v) = 1$. If $v$ is not some $v_i$, then $\sigma(u) = 0$, $\sigma(v) = 1$, so $H'(\{u, v\}) = G(\{u, v\})$. If $v = v_i$ ($2 \leq i \leq n$) and $u$ is not $v_1$, then $\tau(u) = 0$, $\tau(v) = 1$, so $H'(\{u, v\}) = H(\{u, v\})$. Hence $H'(\{u, v\})$ is completely determined unless $u = v_1$ and $v = v_i$, $2 \leq i \leq n$. In this case, each $H'(\{v_1, v_i\})$ can have any one of $q$ values. Thus, there are $n$ choices of $v_1$ and $q$ choices for each $H'(\{v_1, v_i\})$, $2 \leq i \leq n$, giving a total of $n q^{n-1}$ elements $(H', \tau') \in S$ covered by $(H, \tau)$.

Observe that when $q = 1$, condition (b) is vacuous, so $P_1$ is isomorphic to the lattice of finite subsets of $V$. When $q = 2$, we may think of $G(\{u, v\}) = 0$ or $1$ depending on whether $\{u, v\}$ is not or is an edge of a graph on the vertex set $V$. Then $\sigma$ is just a proper two-coloring of $V$ with the colors 0 and 1, and the elements of $P_2$ consist of all properly two-colored graphs with vertex set $V$, finitely many edges and finitely many vertices colored 1. We remark that $P_q$ is not a lattice unless $q = 1$.
References

[1] E.A. Bender and J. Goldman, Enumerative uses of generating functions, Indiana Univ. Math. J. 20 (1971) 753-765.
[2] P. Doubilet, G.-C. Rota and R. Stanley, On the foundations of combinatorial theory: The idea of generating function, in: Sixth Berkeley symposium on mathematical statistics and probability (1972) 267-318.
[3] R. Read, The number of k-colored graphs on labelled nodes, Canad. J. Math. 12 (1960) 410-414.
[4] R. Stanley, A chromatic-like polynomial for ordered sets, in: Proc. second Chapel Hill conference on combinatorial mathematics and its applications (1970) 421-427.
[5] R. Stanley, Ordered structures and partitions, Mem. Am. Math. Soc. 119 (1972).
[6] R. Stanley, A Brylawski decomposition for finite ordered sets, Discrete Math. 4 (1973) 77-82.
Sonderabdruck aus
ARCHIV DER MATHEMATIK
Vol. XXIV, 1973
BIRKHÄUSER VERLAG, BASEL UND STUTTGART
Valuations on Distributive Lattices By LADNOR GEISSINGER
Fase.3
230
ARCH. MATH.
Valuations on Distributive Lattices I By LADNOR GEISSINGER
Introduction. We continue the study, begun by G.-C. Rota, of the valuation ring of a distributive lattice. This ring is the representing object for all valuations on the lattice. In the locally finite case Rota established a connection with the incidence algebra of the set of join-irreducible elements, from which he derived interesting results about the Euler characteristic and Möbius function associated with some geometric objects. In this paper we give new proofs of some of his results, and extend others. In part I we discuss general properties of the valuation module and ring of a lattice, and determine their structure for a finite geometric lattice. We then describe the duality between maps of finite distributive lattices and of finite posets. This makes it easy to characterize finite projective distributive lattices, construct the free distributive lattice on a finite poset, and determine what properties of lattice homomorphisms correspond to strict and residuated maps of posets. We also use the valuation ring to give a construction for the coproduct of distributive lattices. In part II we will determine the structure and mapping properties of valuation rings and Möbius algebras. We use these to prove some theorems of Rota on Möbius functions, an identity due to Klee, and theorems on extending finitely additive measures.
1. The Valuation Module of a Lattice. A function $f$ from a lattice $L$ into an abelian group is modular if $f(a \vee b) + f(a \wedge b) = f(a) + f(b)$ for all $a, b \in L$. Following Rota [19], we call any such modular function a valuation. (Birkhoff [2] reserves the term valuation for real-valued modular functions.) In the free abelian group $Z(L)$ on $L$, let $M(L)$ be the subgroup generated by all elements of the form $a \vee b + a \wedge b - a - b$ with $a, b \in L$. Then $V(L) = Z(L)/M(L)$ is the valuation module of $L$ introduced by Rota [19]. Let $i: L \to V(L)$ be the natural induced map. The following characteristic property of $(V(L), i)$ is an immediate consequence of its construction.
Proposition 1. The function i: L -+ V(L) is the universal valuation on L, that is, i is a valuation and every valuation on L into an abelian group A factors uniquely as i followed by a group homomorphism from V(L) into A. Thus the additive group of valuations on L into A can be identified with Hom(V(L), A). The functorial properties of V(L) also follow easily from the construction. Proposition 2. A lattice homomorphism rp: Ll -+ L2 induces a unique group homomorphism rp': V(L 1 ) -+ V(L 2) such that rp' i 1 = i2 rp.
Vol. XXIV, 1973
Valuations on Distributive Lattices I
The existence of simple types of valuations implies certain structural properties of $V(L)$ and $i(L)$. For example, since every constant function from $L$ into any abelian group is a valuation, by Proposition 1 the elements of $i(L)$ must be non-zero and of infinite order. More useful information about the linear independence of subsets of $i(L)$ can be derived from consideration of 2-valued valuations, or equivalently, of prime ideals of $L$ and their complements, prime filters.
Proposition 3. For any prime ideal or prime filter $F$ of a lattice $L$, its characteristic function $c_F: L \to Z$, which is 1 on $F$ and 0 on $L \setminus F$, is a valuation. Each element of $i(F)$ is linearly independent of the elements of $i(L \setminus F)$ and vice versa, though neither of these sets is necessarily independent.
Proof. It is easy to verify the first statement directly. The second statement follows from the first by Proposition 1.
Proposition 4. The map $i$ is an injection iff $L$ is distributive. When $L$ is distributive, if $\{a_1, \dots, a_r, b\} \subset L$ and if $b$ is not in the interval $[\bigwedge a_i, \bigvee a_i]$, then $b$ is linearly independent of $\{a_1, \dots, a_r\}$ in $V(L)$.
Proof. A well-known theorem of Stone states that any two elements of a distributive lattice can be separated by a prime filter [2, 17], hence they are independent in $V(L)$ by Proposition 3. If $L$ is not distributive it contains distinct elements $c, x, y$ with $c \wedge x = c \wedge y$ and $c \vee x = c \vee y$, from which $i(x) = i(y)$. The condition on $b$ holds iff there is a prime filter separating $b$ from $\{a_1, \dots, a_r\}$.
Later we will give an elementary proof of this proposition which does not depend on Stone's theorem. Whenever we deal with distributive lattices we identify $L$ and $i(L)$. Now $L$ with either of the operations $\vee$ or $\wedge$ is a semigroup, so $Z(L)$ may be considered a semigroup algebra using either $\vee$ or $\wedge$ as multiplication.
Proposition 5. If $L$ is a distributive lattice, $M(L)$ is an ideal of the semigroup algebra $Z(L)$ for both $\vee$ and $\wedge$ and so $V(L)$ is a commutative ring with either (the induced) $\vee$ or $\wedge$ as product. Moreover, for any homomorphism $\varphi: L_1 \to L_2$ of distributive lattices, the extended map $\varphi: V(L_1) \to V(L_2)$ is a homomorphism for both $\vee$ and $\wedge$.
Proof. $(x \vee y + x \wedge y - x - y) \wedge t = (x \wedge t) \vee (y \wedge t) + (x \wedge t) \wedge (y \wedge t) - x \wedge t - y \wedge t$, and similarly for $\vee$.
Corollary. The ring $(V(L), \wedge)$ and the map $i$ are characterized by the following universal property. For any commutative ring $A$ and any map $\beta: L \to A$ for which $\beta(x \wedge y) = \beta(x) \cdot \beta(y)$ and $\beta(x \vee y) = \beta(x) + \beta(y) - \beta(x \wedge y)$, there is a unique ring homomorphism $\alpha: (V(L), \wedge) \to (A, \cdot)$ such that $\alpha \cdot i = \beta$.
Suppose $L$ is a distributive lattice. If $L$ does not have a unit (maximal element) $u$ or a zero (minimal element) $z$ we can adjoin such elements to $L$ and the enlarged lattice is still distributive. Since this merely adds onto $V(L)$ one or two copies of $Z$ as direct summands, whenever it is convenient we assume $u, z \in L$. Then $u$ and $z$ are the identities for $\wedge$ and $\vee$ in $V(L)$. The usual augmentation map $\varepsilon: Z(L) \to Z$ given by $\varepsilon(\sum c_i x_i) = \sum c_i$ is a homomorphism for both $\wedge$ and $\vee$ and its kernel $I$,
L. GEISSINGER
ARCH. MATH.
which is generated by all $x - y$ for all $x, y \in L$, contains $M(L)$. Thus $V(L)$ with the induced homomorphism $\varepsilon: V(L) \to Z$ is an augmented algebra relative to both multiplications $\vee$ and $\wedge$. Moreover, in Proposition 5 the ring homomorphism $\varphi$ commutes with the augmentation homomorphisms $\varepsilon$ and carries $I_1$ into $I_2$. In a sense the augmentation ideal $I$ is the most important part of $V(L)$. Namely, for any $x \in L$, $V(L) = I \oplus Zx$ so that a homomorphism $f: I \to A$ extends uniquely to a valuation on $L$ when an arbitrary value $f(x)$ in $A$ is assigned to $x$. It is often convenient to take $x = z$, the zero of $L$, so that $I \cong V(L)/Zz = \tilde V(L)$ naturally represents valuations normalized to take value 0 on $z$ and at the same time $\tilde V(L)$ is again a ring for the $\wedge$-multiplication.
In the theory of Boolean algebras there is a duality principle which comes from complementation. For a general distributive lattice $L$ with $u$ and $z$ there is no complementation process in $L$; however, there is an endomorphism of $V(L)$ which can be used in much the same way. Namely, the map $\tau(x) = u + z - x$ for all $x \in L$, when extended to an endomorphism of $Z(L)$, carries $M(L)$ into itself and so induces a homomorphism $\tau: V(L) \to V(L)$. $\tau$ could also be described by saying that $\tau(c) = -c$ for all $c$ in the augmentation ideal and that $\tau(z) = u$ or $\tau(u) = z$. The following proposition is the substitute for De Morgan's laws and justifies our subsequent practice of usually ignoring $\vee$ and considering $V(L)$ only with multiplication $\wedge$.
Proposition 6. $\tau: (V(L), \wedge) \to (V(L), \vee)$ is an isomorphism of augmented algebras, and $\tau^2 = \mathrm{id}$. Hence $\tau: (V(L), \vee) \to (V(L), \wedge)$ is also an isomorphism.
Proof. $z + u - x \wedge y = z + u + x \vee y - x - y = (z + u - x) \vee (z + u - y)$ so that $\tau(x \wedge y) = \tau(x) \vee \tau(y)$. Clearly $\tau^2 = \mathrm{id}$ and $\varepsilon\tau = \varepsilon$.
As a further check that $\tau$ is the correct algebraic analogue of complementation, note that if $x \in L$ has a complement $x'$ then $x \vee x' + x \wedge x' - x - x' = 0 = u + z - x - x'$ so that $\tau(x) = x'$.
In any case, in $V(L)$ we always have $x \vee \tau(x) = u$ and $x \wedge \tau(x) = z$ for every $x \in L$. More generally, for any element $x$ in an interval $[v, w]$ of $L$, the element $v + w - x$ in $V(L)$ acts as the relative complement of $x$. A more useful and more familiar form of Proposition 6, again for distributive lattices, is the following.
Proposition 7. For all $x_1, \dots, x_n$ in $L$, $u - \bigvee x_i = \bigwedge (u - x_i)$ in $V(L)$, that is,
$\bigvee x_i = \sum x_i - \sum (x_i \wedge x_j) + \sum (x_i \wedge x_j \wedge x_k) - \cdots \quad (i < j < k < \cdots)$.
Proof. $\tau(\bigvee x_i) = z + u - \bigvee x_i = \bigwedge \tau(x_i) = \bigwedge (z + u - x_i) = z + \bigwedge (u - x_i)$ since $z \wedge (u - x_i) = 0$. For a direct proof, use induction beginning with either $u - x \vee y = u + x \wedge y - x - y = (u - x) \wedge (u - y)$ or $x \vee y = x + y - x \wedge y$.
Note that in the direct proof $u$ can be replaced by any $v \geq \bigvee x_i$.
Corollary. For any valuation $f$ on $L$,
$f(\bigvee x_i) = \sum f(x_i) - \sum f(x_i \wedge x_j) + \sum f(x_i \wedge x_j \wedge x_k) - \cdots \quad (i < j < k < \cdots)$.
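The corollary can be checked numerically in the lattice of subsets of a finite set, where join is union and meet is intersection. The following is a minimal sketch under that assumption; the sets `xs` and the weight table are illustrative data, not taken from the paper.

```python
from itertools import combinations

def inclusion_exclusion(f, sets):
    """Right-hand side of the corollary in a lattice of sets
    (join = union, meet = intersection)."""
    total = 0
    for k in range(1, len(sets) + 1):
        sign = 1 if k % 2 == 1 else -1
        for combo in combinations(sets, k):
            total += sign * f(frozenset.intersection(*combo))
    return total

# Illustrative data: three subsets of {0, ..., 9} and an arbitrary weight.
xs = [frozenset({0, 1, 2, 3}), frozenset({2, 3, 4, 5}), frozenset({3, 5, 6})]
weights = {n: n + 1 for n in range(10)}

def card(s):
    """Cardinality, the simplest additive set function."""
    return len(s)

def weight(s):
    """A weighted additive set function (also a valuation)."""
    return sum(weights[n] for n in s)

join = frozenset.union(*xs)
print(card(join), inclusion_exclusion(card, xs))
print(weight(join), inclusion_exclusion(weight, xs))
```

Both printed pairs should agree, since any additive set function is a valuation on the lattice of subsets.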
This is well known; in particular, when $f$ is an additive set function on $L = 2^X$ this is the classical inclusion-exclusion formula.
It is somewhat unusual to have to consider two natural ring structures on the same abelian group $V(L)$. The relation between them, derived from $x \vee y = x + y - x \wedge y$ for all $x, y \in L$, is given by $a \vee b = \varepsilon(a) b + \varepsilon(b) a - a \wedge b$ for all $a, b \in V(L)$. But now it is easily checked that if $\varepsilon: A \to k$ is an augmentation of any $k$-algebra $(A, \cdot)$ there is another multiplication on $A$ given by $a * b = \varepsilon(b) a + \varepsilon(a) b - a \cdot b$ for which $(A, *, \varepsilon)$ is an augmented algebra. Moreover, it is clear that the endomorphism $\tau(a) = -a$ of the augmentation ideal carries $a \cdot b$ into $a * b = \tau(a) * \tau(b)$. To see if $\tau$ extends to all of $A$, note the following. If $(A, \cdot)$ has a unit $u$, then $b * u = u * b = \varepsilon(b) u$, and conversely if $z$ is a unit for $(A, *)$ and if $\varepsilon(z) = 1$ then $a \cdot z = z \cdot a = \varepsilon(a) z$. An element $z$ with this property in an augmented algebra $(A, \cdot, \varepsilon)$ has been called an (invariant) integral [13]. If $(A, \cdot, \varepsilon)$ is an augmented algebra with unit $u$ then $\tau$ can be extended to an isomorphism $(A, \cdot) \to (A, *)$ iff there is such an integral in $A$, by letting $\tau(u) = z$ and $\tau(z) = u$.
For any commutative ring $A$ with unit, the set $E$ of all idempotents forms a Boolean algebra with $a \wedge b = a \cdot b$ and $a \vee b = a + b - a \cdot b$ for all $a, b \in E$. Thus Proposition 7 holds for idempotents in $A$. What we have just shown is essentially that when $A$ has an augmentation $\varepsilon$, the set $\{a \in E \mid \varepsilon(a) = 1\}$ (which is always a sublattice of $E$) is a Boolean algebra iff $A$ has an integral $z$. Finally note that if $K$ is any multiplicatively closed subset of $E$, then the sublattice generated by $K$ consists of all elements of the form $\sum a_i - \sum (a_i \wedge a_j) + \sum (a_i \wedge a_j \wedge a_k) - \cdots$ for all finite indexed families $(a_i)$ of elements of $K$.
A generalized Boolean algebra, that is, a distributive lattice $L$ which contains $z$ and is relatively complemented, along with the usual symmetric difference operation $\Delta$ is a group. Moreover, $M(L)$ is then an ideal in the group algebra $(Z(L), \Delta)$ so $V(L)$ with the induced operation $\Delta$ is again a ring. In $V(L)$, $x \,\Delta\, y = x \vee y - x \wedge y + z$ for all $x, y \in L$, and more generally for all $b, c \in V(L)$, $b \,\Delta\, c = b \vee c - b \wedge c + \varepsilon(b)\varepsilon(c) z$. But it is easily checked that this formula for the operation $\Delta$, which makes sense for any distributive lattice $L$, always yields a third ring structure on $V(L)$, even when $\Delta$ is not defined on $L$ or $Z(L)$. Of course if $u \in L$ then $V(L)$ is also a ring with product the operation complementary to $\Delta$, that is, $x \,\Delta'\, y = x \wedge y - x \vee y + u$. We shall not pursue these other operations further; instead we turn to nondistributive lattices.
For a nondistributive lattice $L$, $M(L)$ need not be an ideal in $(Z(L), \wedge)$. The condition for $M(L)$ to be a $\wedge$-ideal is that in $V(L)$ for all $t, x, y \in L$,
$0 = t \wedge (x \vee y) + t \wedge (x \wedge y) - t \wedge x - t \wedge y = t \wedge (x \vee y) - (t \wedge x) \vee (t \wedge y)$.
But in any case,
$t \wedge (x \vee y) - (t \wedge x) \vee (t \wedge y)$
$= t + x \vee y - t \vee x \vee y + t \wedge x \wedge y - t \wedge x - t \wedge y$
$= t + x + y - x \wedge y - t \wedge x - t \wedge y - t \vee x \vee y + t \wedge x \wedge y$
$= -t - x - y + x \vee y + t \vee x + t \vee y - t \vee x \vee y + t \wedge x \wedge y$
$= (t \vee x) \wedge (t \vee y) - t \vee (x \wedge y)$.
(We have suppressed the $i$'s in $i(t)$, etc.) Thus $M(L)$ is a $\wedge$-ideal iff it is a $\vee$-ideal. Birkhoff [2] calls a valuation $f$ on any lattice distributive if
$f(t \vee x \vee y) - f(t \wedge x \wedge y) = f(x \vee y) + f(t \vee x) + f(t \vee y) - f(t) - f(x) - f(y) = f(x) + f(y) + f(t) - f(t \wedge x) - f(t \wedge y) - f(x \wedge y)$.
Thus $M(L)$ is a $\wedge$-ideal iff $i$ is a distributive valuation. By Proposition 1, $i$ is distributive iff every valuation on $L$ is distributive. For any lattice $L$, let $\bar M(L)$ be the subgroup of $Z(L)$ generated by $M(L)$ and all elements of the form $t \wedge [x \vee y + x \wedge y - x - y]$ for all $t, x, y \in L$. Then $\bar M(L)$ is the $\wedge$-ideal generated by $M(L)$ and from our computation above it follows that $\bar M(L)$ is also the $\vee$-ideal generated by $M(L)$. Thus $\bar V(L) = Z(L)/\bar M(L)$ is a ring for each of the products $\wedge$ and $\vee$ and since $\bar M(L)$ is contained in the augmentation ideal, $\bar V(L)$ is an augmented algebra. The induced map $\bar\imath: L \to \bar V(L)$ is a homomorphism for both $\wedge$ and $\vee$; thus $\bar\imath(L)$ is closed under both these operations. Hence $\bar\imath(L)$ is a lattice and $\bar\imath$ is a lattice homomorphism. Moreover, by construction $\bar\imath(t \wedge (x \vee y)) = \bar\imath((t \wedge x) \vee (t \wedge y))$ so $\bar\imath(L)$ is a distributive lattice. Also $\bar\imath$ as a function into $\bar V(L)$ is a distributive valuation, and $\bar V(L)$ is the valuation ring of $\bar\imath(L)$.
Proposition 8. The function $\bar\imath: L \to \bar V(L)$ is the universal distributive valuation on $L$ and $\bar\imath: L \to \bar\imath(L)$ is the universal lattice homomorphism from $L$ into distributive lattices.
Proof. The first part is a consequence of the previously mentioned properties of $\bar M(L)$. For the second, suppose $\varphi$ is a lattice homomorphism of $L$ into a distributive lattice $L'$, and $i': L' \to V(L')$ is the natural injection. Then $i' \circ \varphi$ is a distributive valuation and so factors uniquely as $\alpha \circ \bar\imath$ where $\alpha: \bar V(L) \to V(L')$ is a group homomorphism. But since $i'$, $\varphi$ and $\bar\imath$ preserve $\wedge$ and $\vee$ and $i' \circ \varphi = \alpha \circ \bar\imath$, then $\alpha$ preserves $\wedge$ and $\vee$ on $\bar\imath(L)$, and hence on all of $\bar V(L)$. Thus $\alpha$ is both a lattice and ring homomorphism.
The result stated in the Corollary to Proposition 7 is now seen to hold more generally for distributive valuations on any lattice.
Examples. For the modular 5-element lattice $M_5$ it is easily checked that $V(M_5)$ is free of rank 2 while $\bar V(M_5)$ is free of rank 1. For the nonmodular 5-element lattice $N_5$, $V(N_5) = \bar V(N_5)$ and the rank is 3.
If $L_1, L_2$ are lattices with minimal elements $z_1, z_2$ respectively, a valuation $f$ on $L_1 \times L_2$ is determined by $f(z_1, z_2)$ and the normalized valuations $f_1(x) = f(x, z_2) - f(z_1, z_2)$ on $L_1$ and $f_2(y) = f(z_1, y) - f(z_1, z_2)$ on $L_2$. Conversely, given normalized valuations $f_i$ on $L_i$ into an abelian group $A$ and an element $c$ in $A$, the function $f(x, y) = c + f_1(x) + f_2(y)$ is a valuation on $L_1 \times L_2$ into $A$. Thus the augmentation subgroup $I(L_1 \times L_2)$ is the direct sum $I(L_1) \oplus I(L_2)$ and
$V(L_1 \times L_2) \cong I(L_1) \oplus I(L_2) \oplus Z(z_1, z_2) \cong [V(L_1) \oplus V(L_2)]/Z(z_2 - z_1)$.
Similarly,
$\bar V(L_1 \times L_2) \cong [\bar V(L_1) \oplus \bar V(L_2)]/Z(z_2 - z_1)$.
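The ranks quoted in the Examples can be reproduced by linear algebra. The sketch below is an illustration, not the paper's method: it encodes the two 5-element lattices by their order relations, builds the generators of $M(L)$ and the additional generators $t \wedge [x \vee y + x \wedge y - x - y]$, and computes ranks over the rationals. It therefore checks only the ranks, not freeness. All function and variable names are chosen here for illustration.

```python
from fractions import Fraction

def lattice_ops(elems, leq):
    """Build join and meet tables of a finite lattice from its order relation."""
    join, meet = {}, {}
    for a in elems:
        for b in elems:
            ubs = [c for c in elems if leq(a, c) and leq(b, c)]
            lbs = [c for c in elems if leq(c, a) and leq(c, b)]
            join[a, b] = next(c for c in ubs if all(leq(c, d) for d in ubs))
            meet[a, b] = next(c for c in lbs if all(leq(d, c) for d in lbs))
    return join, meet

def rank(rows):
    """Rank of an integer matrix over the rationals (Gaussian elimination)."""
    rows = [[Fraction(v) for v in r] for r in rows]
    rk = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                factor = rows[i][col] / rows[rk][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk

def valuation_ranks(elems, leq):
    """Return (rank of Z(L)/M(L), rank of Z(L) modulo the ideal generated by M(L))."""
    join, meet = lattice_ops(elems, leq)
    idx = {e: k for k, e in enumerate(elems)}

    def vec(terms):
        v = [0] * len(elems)
        for coeff, e in terms:
            v[idx[e]] += coeff
        return v

    m_rows, ideal_rows = [], []
    for x in elems:
        for y in elems:
            j, m = join[x, y], meet[x, y]
            m_rows.append(vec([(1, j), (1, m), (-1, x), (-1, y)]))
            for t in elems:
                ideal_rows.append(vec([(1, meet[t, j]), (1, meet[t, m]),
                                       (-1, meet[t, x]), (-1, meet[t, y])]))
    n = len(elems)
    return n - rank(m_rows), n - rank(m_rows + ideal_rows)

ELEMS = ["z", "a", "b", "c", "u"]

def leq_m5(x, y):
    """M5: zero z, unit u, and three pairwise incomparable atoms a, b, c."""
    return x == y or x == "z" or y == "u"

def leq_n5(x, y):
    """N5: as above, but with a < c."""
    return leq_m5(x, y) or (x, y) == ("a", "c")

m5_ranks = valuation_ranks(ELEMS, leq_m5)
n5_ranks = valuation_ranks(ELEMS, leq_n5)
print(m5_ranks, n5_ranks)
```

The first component of each pair corresponds to the valuation module and the second to its distributive quotient.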
Application. Let $L$ be a finite geometric lattice which is connected [3], that is, which is not isomorphic to a direct product of two geometric lattices. Then for any two copoints (= coatoms) $x, y$ there is by the path theorem [3] a connected path $x = x_0, x_1, \dots, x_r = y$ of copoints from $x$ to $y$, where connected path means that, for each $i$, $x_{i-1} \wedge x_i$ is a coline above which is at least one copoint $t_i$ different from $x_{i-1}$ and $x_i$. Thus $x_{i-1}, t_i, x_i$ generate a copy of $M_5$ and so any valuation $f$ on $L$ must take the same value on all $x_i$ and $t_i$, hence on all copoints. Now for any element $y$ of rank $k$ there are elements $x, t$ with $x$ of rank $k + 1$ and $t$ a copoint such that $y = x \wedge t$ and $x \vee t = u$. Thus if all flats of rank $k + 1$ have the same $f$ value, then the same is true of all flats of rank $k$ and so by induction downward $f$ is constant on the flats of any given rank, in particular, on the points. Since a flat $x$ of rank $k$ is the join of $k$ independent points it is easy to see that $f(x) - f(z) = k(f(p) - f(z))$ for any point $p$. That is, the valuation $g(x) = f(x) - f(z)$ is given by $g(x) = [\mathrm{rk}(x)]\, g(p)$. The lattice $L$ is modular iff the rank function is a valuation. Hence if $L$ is modular geometric then $V(L)$ is free abelian on two generators $p, z$ and the universal valuation $i: L \to V(L)$ is given by $i(x) = [\mathrm{rk}(x)](p - z) + z$. When $L$ is nonmodular there are flats $x, y, t$ such that $x \vee t = y \vee t$, $x \wedge t = y \wedge t$, $x < y$ and $\mathrm{rk}(y) = \mathrm{rk}(x) + 1$. Hence for the valuation $g$ above $[\mathrm{rk}(x)]\, g(p) = g(x) = g(y) = [\mathrm{rk}(x) + 1]\, g(p)$ and so $g(p) = 0$. Thus for nonmodular $L$ all valuations are constant so that $V(L)$ is free abelian on one generator $z$ and $i(x) = z$ for all $x$ in $L$. Clearly then $V(L) = \bar V(L)$ in the nonmodular case. For modular $L$, since $L$ is connected and atomic it cannot be distributive unless it contains just one point, in which case $L$ is the two element lattice and $V(L) = \bar V(L)$. So when $L$ is modular and not distributive there are at least two points (atoms) $\{p, q\}$ and since $i(p) = i(q)$ then $\bar\imath(p) = \bar\imath(q) = \bar\imath(p \wedge q) = \bar\imath(p \vee q)$. That is, in the group homomorphism from $V(L)$ to $\bar V(L)$ the point $p$ must be identified with $z$ so that $\bar V(L)$ is free on the single generator $z$. Finally, for any finite geometric lattice $L$, if $L$ is expressed as the product of connected geometric lattices, then $V(L)$ and $\bar V(L)$ are free abelian groups and rank $V(L) = 1 + (\#$ of modular connected components$)$ while rank $\bar V(L) = 1 + (\#$ of 1-point components [isthmuses]$)$.
2. Finite Posets and Distributive Lattices. We collect together here some facts we shall need about maps of finite posets and distributive lattices. With only slight modification most of the results hold also for infinite posets and distributive lattices which have a zero and are locally finite, that is, in which every interval is a finite set.
A subset $J$ (possibly empty) of an ordered set (poset) $(P, \leq)$ is called an order ideal (order filter) if $y \in J$ and $x \leq y$ ($y \leq x$) imply $x \in J$. The set $J(P)$ of all order ideals of $P$ is a sublattice of $2^P$ with unit $u = P$ and zero $z = \emptyset$. An element $x$ of a join-semilattice $L$ is join-irreducible if $x = r \vee s$ implies $x = r$ or $x = s$. The set $P(L)$ of all (including $z$) join-irreducible elements of $L$ is a poset with the induced order. If a poset $P$ has a zero $z$, the poset $P \setminus z$ will be denoted by $\bar P$.
For any finite distributive lattice $L$, the map $x \to a(x) = \{p \in \bar P(L) \mid p \leq x\}$ is a lattice isomorphism of $L$ onto $J(\bar P(L))$ [2, 7]. Dually, for any finite poset $M$, the principal order ideals $(m] = \{k \in M \mid k \leq m\}$ are precisely the nonzero elements of $P(J(M))$ so that $m \to (m]$ is an order isomorphism of $M$ onto $\bar P(J(M))$. In a distributive lattice $L$ which is finite or just satisfies DCC, a filter (order filter closed under $\wedge$) is prime (its complement is closed under $\vee$, i.e. is an ideal) iff it is principal and its minimal element is join-irreducible. Hence $a(x)$ may be identified with the set of all prime filters containing $x$ or all prime ideals not containing $x$, and $a$ is then the Stone representation of the distributive lattice $L$ by a lattice of sets [2, 7].
By a $(u, z)$-homomorphism between lattices we mean a map which preserves $(u, z, \vee, \wedge)$. The following equivalences of pairs of categories will be much used.
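As a small illustration of the representation of a finite distributive lattice by order ideals of its nonzero join-irreducibles, the following sketch uses the divisors of 12 ordered by divisibility (join = lcm, meet = gcd, zero = 1). The choice of lattice and all names are assumptions made for the example.

```python
from math import gcd
from itertools import combinations

N = 12
L = [d for d in range(1, N + 1) if N % d == 0]   # divisors of 12

def leq(a, b):
    """a <= b in the divisibility order."""
    return b % a == 0

def join(a, b):
    return a * b // gcd(a, b)   # least common multiple

def meet(a, b):
    return gcd(a, b)

ZERO = 1

def is_join_irreducible(x):
    """x = r v s implies x = r or x = s."""
    return all(x in (r, s) for r in L for s in L if join(r, s) == x)

# Nonzero join-irreducible elements.
P_bar = [x for x in L if x != ZERO and is_join_irreducible(x)]

def a(x):
    """The set of nonzero join-irreducibles below x."""
    return frozenset(p for p in P_bar if leq(p, x))

def order_ideals(P):
    """All down-closed subsets of the poset P (order inherited from L)."""
    ideals = set()
    for k in range(len(P) + 1):
        for subset in combinations(P, k):
            s = frozenset(subset)
            if all(q in s for p in s for q in P if leq(q, p)):
                ideals.add(s)
    return ideals

image = {a(x) for x in L}
print(P_bar)
print(sorted(sorted(s) for s in image))
```

The image of `a` should coincide with the set of order ideals, with joins mapped to unions and meets to intersections.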
Proposition 9. The category of finite distributive lattices and $(u, z)$-homomorphisms ($u$-homomorphisms) is equivalent to the dual of the category of finite posets (with $z$) and order homomorphisms (preserving $z$).
Proof. We prove both equivalences simultaneously. If $\beta: M_1 \to M_2$ is an order homomorphism of finite posets then taking inverse images we get a map $\beta': J(M_2) \to J(M_1)$ which is a $(u, z)$-homomorphism of finite distributive lattices. Moreover, if the $M_j$ both have zeros then the $J(M_j) \setminus \{\emptyset\}$ are still distributive lattices and $\beta'$ maps $J(M_2) \setminus \{\emptyset\}$ into $J(M_1) \setminus \{\emptyset\}$ iff $\beta(z_1) = z_2$. In this case $\beta': J(M_2) \setminus \{\emptyset\} \to J(M_1) \setminus \{\emptyset\}$ is a $u$-homomorphism. The association $M \to J(M)$ (or $M \to J(M) \setminus \{\emptyset\}$) and $\beta \to \beta'$ is a contravariant functor from the poset category to the distributive lattice category. Note that for $A \in J(M_2)$, $\beta'(A) = \bigvee \{(t] \mid (\beta(t)] \leq A\}$. In the opposite direction we have the contravariant functor $P$ which associates to each $u$-homomorphism $\lambda: L_1 \to L_2$ of finite distributive lattices the order homomorphism $\lambda^*: P(L_2) \to P(L_1)$ given by $\lambda^*(p_2) = \bigwedge \{x \in L_1 \mid \lambda(x) \geq p_2\}$. Clearly $\lambda(z_1) = z_2$ iff $\lambda^*$ maps $\bar P(L_2)$ into $\bar P(L_1)$. The conclusion now follows from easy computations involving composites and the isomorphisms $L \cong J(\bar P(L))$ and $M \cong \bar P(J(M))$ mentioned above.
For generalizations and related results see [6].
Using the concrete correspondences above it is easy to investigate the properties of special $(u, z)$-homomorphisms of finite distributive lattices. For example, $\lambda$ is a monomorphism iff $\lambda^*$ is an epimorphism and it is easy to check that an epi in the category of finite posets is just a surjective order homomorphism. But it is also obvious that $\lambda$ is a monomorphism iff it is injective. For each $q \in \bar P(L)$ we let $q^0$ denote the unique maximal element in $L$ which lies below $q$. Then for $\lambda: L_1 \to L_2$ as above, we have $\lambda^*(p) = q$ iff $p \leq \lambda(q)$ and $p \not\leq \lambda(q^0)$. That is, $q \in \lambda^*(\bar P(L_2))$ iff $\lambda(q) > \lambda(q^0)$, and $\lambda^*$ is surjective iff $\lambda(q) > \lambda(q^0)$ for all $q \in \bar P(L_1)$.
Dually, $\lambda$ is an epimorphism iff $\lambda^*$ is a monomorphism and clearly a monomorphism in the category of posets is just an injective order homomorphism. Among injective order homomorphisms $\beta: M_1 \to M_2$ are the strict maps, those for which $p \leq q$ iff $\beta(p) \leq \beta(q)$, i.e. $M_1$ is isomorphic to $\beta(M_1)$ with the order inherited from $M_2$. For any map $\beta$, if $\beta(p) \leq \beta(q)$ then for any $A \in J(M_2)$ such that $q \in \beta'(A)$ we have also $p \in \beta'(A)$. Consequently, if $p \not\leq q$ then $(q]$ is not in the image of $\beta'$. Thus $\beta'$ is a surjective epimorphism iff $\beta$ is a strict map. We will later derive two other conditions which are equivalent to $\lambda$ or $\beta'$ being an epimorphism.
It is now easy to prove a theorem of Balbes which characterizes "projective" distributive lattices [1, 7]. In the category of finite distributive lattices and $(u, z)$-homomorphisms $L$ is weakly projective if for any $\alpha: L_1 \to L_2$ which is surjective and any $\beta: L \to L_2$, there is a $\gamma: L \to L_1$ such that $\alpha\gamma = \beta$. In the category of finite posets and order homomorphisms $M$ is weakly injective if for any $\alpha: M_1 \to M_2$ which is strict and any $\beta: M_1 \to M$, there is a $\gamma: M_2 \to M$ such that $\gamma\alpha = \beta$.
Proposition 10 (Balbes). For a finite distributive lattice $L$ the following are equivalent:
(i) $L$ is weakly projective,
(ii) $\bar P(L)$ is weakly injective,
(iii) $\bar P(L)$ is a lattice.
Proof. From our remarks above (i) and (ii) are equivalent. If $M$ is weakly injective and if in the defining property above we take $M_1 = M$, $M_2$ a lattice, $\alpha$ a strict embedding, and $\beta$ the identity, then a retraction $\gamma$ of $M_2$ onto $M$ must exist. It follows that $M$ must be a lattice, provided that there is such a map $\alpha$. But the natural map $a \to (a]$ is a strict embedding of $M$ into the lattice $J(M)$. Conversely, if $M$ is a lattice, $\beta: M_1 \to M$ any order homomorphism, and $\alpha: M_1 \to M_2$ a strict map, define $\gamma$ by $\gamma(m_2) = \bigvee \{\beta(m_1) \mid \alpha(m_1) \leq m_2\}$. It is easy to see that $\gamma$ preserves order and, because $\alpha$ is strict, also $\gamma\alpha = \beta$.
The referee has pointed out that the equivalence of (ii) and (iii) for an arbitrary poset in place of $\bar P(L)$ (and with "complete" inserted in (iii)) is due to Banaschewski and Bruns: Arch. Math. 18, 369-377 (1967).
Another useful result comes from the observation that for an order homomorphism $\beta: M_1 \to M_2$ the induced map $\beta': J(M_2) \to J(M_1)$ takes $\bar P(J(M_2))$ into $\bar P(J(M_1))$ iff for each $m_2 \in M_2$ there is an $\alpha(m_2) \in M_1$ such that $\beta'((m_2]) = (\alpha(m_2)]$. In this case $\alpha, \beta$ constitute a Galois connection [18] between $M_1$ and $M_2$. (Each of $\alpha$ and $\beta$ is said to be residuated.) Namely, $\alpha(m_2) = \sup\{m_1 \mid \beta(m_1) \leq m_2\}$ and $\beta(m_1) = \inf\{m_2 \mid \alpha(m_2) \geq m_1\}$, and $\alpha\beta$ is a closure operator on $M_1$ and $\beta\alpha$ is an interior operator on $M_2$.
In terms of distributive lattices $L_1$ and $L_2$, the result above states that a map $\lambda: \bar P(L_2) \to \bar P(L_1)$ can be extended to a lattice $(u, z)$-homomorphism of $L_2$ into $L_1$ iff $\lambda$ has the same properties as $\alpha$ above, that is, the $\lambda$-preimage of every principal order filter in $\bar P(L_1)$ is a principal order filter in $\bar P(L_2)$.
It is clear that the product $L_1 \times L_2$ of finite distributive lattices is the categorical product in both of the lattice categories under consideration and $\bar P(L_1 \times L_2)$ is isomorphic to the disjoint union $\bar P(L_1) \cup \bar P(L_2)$ and $P(L_1 \times L_2)$ to the one point join $P(L_1) \cup P(L_2)/\{z_1, z_2\}$, which are the coproducts of the $\bar P(L_j)$ and $P(L_j)$ in the poset categories. Dually the categorical product is the Cartesian product in both poset categories so the coproduct (free distributive product) of $L_1$ and $L_2$ must be isomorphic to $J(\bar P(L_1) \times \bar P(L_2))$ and $J(P(L_1) \times P(L_2)) \setminus \{\emptyset\}$ respectively in the lattice categories. We will see shortly that $V(L_j)$ is free with rank $|P(L_j)|$ and so $V(J(P(L_1) \times P(L_2)) \setminus \{\emptyset\})$ has rank $|P(L_1)| \cdot |P(L_2)|$. This suggests that $V(L_1) \otimes V(L_2)$ might be used as a model for $V(J(P(L_1) \times P(L_2)) \setminus \{\emptyset\})$.
Suppose $L_1, L_2$ are any distributive lattices with units (not necessarily finite); then considering the $V(L_j)$ as augmented algebras with $\wedge$-multiplication, their coproduct is $V(L_1) \otimes V(L_2)$ with the natural embeddings $\alpha_i$ given by $\alpha_1(y_1) = y_1 \otimes u_2$ and $\alpha_2(y_2) = u_1 \otimes y_2$. The multiplicative semigroup generated by $\alpha_1(L_1) \cup \alpha_2(L_2)$ is $\{y_1 \otimes y_2 \mid y_i \in L_i\}$ and these are idempotents in $\varepsilon^{-1}(1)$. The distributive lattice $L$ generated by this semigroup consists, as we saw before, of all $\sum (y_i \otimes t_i) - \sum (y_i \wedge y_j \otimes t_i \wedge t_j) + \sum (y_i \wedge y_j \wedge y_k \otimes t_i \wedge t_j \wedge t_k) - \cdots$ with $i < j < k < \cdots$ for all finite indexed families $(y_i)$ in $L_1$ and $(t_i)$ in $L_2$. Fortunately, we never need to use this expression for elements of $L$.
Proposition 11. The coproduct of $L_1$ and $L_2$ in the category of $u$-homomorphisms of distributive lattices with unit is $(L, \alpha_1, \alpha_2)$, and $V(L) \cong V(L_1) \otimes V(L_2)$.
Proof. For any $L_3$, if $\beta_i: L_i \to L_3$ are $u$-homomorphisms, then the induced $\beta_i: V(L_i) \to V(L_3)$ are algebra homomorphisms, so there is a unique algebra map $\gamma: V(L_1) \otimes V(L_2) \to V(L_3)$ given by $\gamma(y \otimes t) = \gamma(y \otimes u_2)\, \gamma(u_1 \otimes t) = \beta_1(y) \wedge \beta_2(t)$
such that $\gamma\alpha_j = \beta_j$. For any $s, t \in L$ we know $\varepsilon(s) = \varepsilon(t) = 1 = \varepsilon(\gamma(s)) = \varepsilon(\gamma(t))$ so that $s \vee t = s + t - s \wedge t$ and $\gamma(s \wedge t) = \gamma(s \cdot t) = \gamma(s)\gamma(t) = \gamma(s) \wedge \gamma(t)$ and $\gamma(s \vee t) = \gamma(s) + \gamma(t) - \gamma(s \wedge t) = \gamma(s) \vee \gamma(t)$. Thus $\gamma$ is a $u$-homomorphism of $L$ and $\gamma(L) \subseteq L_3$ because $L_3$ contains the image $\gamma(\alpha_j(L_j)) = \beta_j(L_j)$ of the generators $\alpha_j(L_j)$ of $L$. By the Corollary to Proposition 5, the inclusion of $L$ into $V(L_1) \otimes V(L_2)$ extends uniquely to a ring homomorphism of $V(L)$ onto $V(L_1) \otimes V(L_2)$. On the other hand, the ring homomorphisms $\alpha_j: V(L_j) \to V(L)$ yield a ring homomorphism of $V(L_1) \otimes V(L_2)$ onto $V(L)$ which is the identity on $L$. Hence $V(L) \cong V(L_1) \otimes V(L_2)$.
Since $J$ takes the category of finite posets into a subcategory of itself we may apply $J$ twice to get a covariant functor which associates to each order homomorphism $\beta: M_1 \to M_2$ the $(u, z)$-homomorphism $\beta'': J^2(M_1) \to J^2(M_2)$ given, for principal ideals $(A]$, by $\beta''((A]) = \{B \in J(M_2) \mid \beta'(B) \subseteq A\}$ for every $A \in J(M_1)$. For each finite poset $M$ let $\gamma: M \to J^2(M)$ be given by $\gamma(m) = \{A \in J(M) \mid m \notin A\}$. Then $\gamma$ provides a natural transformation from the identity to the functor $J^2$ since for $\beta: M_1 \to M_2$ we have
$\beta''\gamma(m) = \beta''(\{A \in J(M_1) \mid m \notin A\}) = \bigcup \{\beta''((A]) \mid m \notin A\} = \{B \in J(M_2) \mid m \notin \beta'(B)\} = \{B \in J(M_2) \mid \beta(m) \notin B\} = \gamma\beta(m)$.
Lemma. The sublattice $F(M)$ generated by $\gamma(M)$ contains all elements of $J^2(M)$ except for the unit and zero. If $L$ is a distributive lattice, there is a unique lattice homomorphism $\pi: J^2(L) \to L$ such that $\pi\gamma = \mathrm{id}$.
Proof. If $C \in J(M)$ and $C \neq M$, then
$\bigcap \{\gamma(m) \mid m \notin C\} = \{A \in J(M) \mid m \notin A \text{ for all } m \notin C\} = \{A \in J(M) \mid A \subseteq C\} = (C]$.
Thus $F(M)$ contains all principal ideals in $J^2(M)$ except $(M]$, and doesn't contain $\emptyset$ or $(M]$ since $\emptyset \in \gamma(m)$ and $M \notin \gamma(m)$ for all $m \in M$. If $L$ is a distributive lattice, the map $\lambda: \bar P(L) \to J(L)$ given by $\lambda(p) = \{x \in L \mid p \not\leq x\}$ is order preserving. The induced lattice homomorphism $\lambda': J^2(L) \to J(\bar P(L))$ composed with the isomorphism $J(\bar P(L)) \cong L$ yields a lattice homomorphism $\pi: J^2(L) \to L$ given by $\pi((A]) = \bigvee \{p \in \bar P(L) \mid \lambda(p) \subseteq A\}$ for all $A \in J(L)$. Thus
$\pi\gamma(y) = \bigvee \{p \in \bar P(L) \mid \lambda(p) \subseteq \lambda(y)\} = y$ for all $y \in L$.
Proposition 12. For any finite poset $M$, $(F(M), \gamma)$ is the free distributive lattice on $M$. That is, every order homomorphism $\beta$ of $M$ into a distributive lattice factors uniquely through $F(M)$ as $\gamma$ followed by a lattice homomorphism, namely the homomorphism $\pi\beta''$ in case the lattice is finite.
Proof. For finite $L$, $\pi\beta''\gamma = \pi\gamma\beta = \beta$ by the lemma and the fact that $\gamma$ is a natural transformation. $\pi\beta''$ is unique since $\gamma(M)$ generates $F(M)$. If $L$ is infinite just replace $L$ by a finite sublattice of $L$ containing $\beta(M)$.
The preceding is a natural generalization of the construction due to Skolem of the free distributive lattice on a finite number of generators [2].
References
[1] R. BALBES, Projective and injective distributive lattices. Pacific J. Math. 21, 405-420 (1967).
[2] G. BIRKHOFF, Lattice Theory. Third ed., Providence 1967.
[3] H. CRAPO and G.-C. ROTA, Combinatorial Geometries. Cambridge (Mass.) 1970.
[4] R. L. DAVIS, Order Algebras. Bull. Amer. Math. Soc. 76, 83-87 (1970).
[5] J. FOLKMAN, The homology groups of a lattice. J. Math. Mech. 16, 631-636 (1966).
[6] L. GEISSINGER and W. GRAVES, The category of complete algebraic lattices. J. Combinatorial Theory (A) 13, 332-338 (1972).
[7] G. GRÄTZER, Lattice Theory. San Francisco 1971.
[8] C. GREENE, On the Möbius Algebra of a Partially Ordered Set. Proc. Conf. on Möbius Algebras, University of Waterloo 1971.
[9] P. R. HALMOS, Measure Theory. Princeton 1950.
[10] P. HILTON and S. WYLIE, Homology Theory. Cambridge 1960.
[11] V. KLEE, The Euler characteristic in combinatorial geometry. Amer. Math. Monthly 70, 119-127 (1963).
[12] A. HORN and A. TARSKI, Measures in Boolean algebras. Trans. Amer. Math. Soc. 64, 467-497 (1948).
[13] R. LARSON and M. SWEEDLER, An associative orthogonal bilinear form for Hopf algebras. Amer. J. Math. 91, 75-94 (1969).
[14] H. M. MACNEILLE, Partially ordered sets. Trans. Amer. Math. Soc. 42, 416-460 (1937).
[15] B. PETTIS, Remarks on the extension of lattice functionals. Bull. Amer. Math. Soc. 64, 471 (1948).
[16] B. PETTIS, On the extension of measures. Ann. of Math. 64, 186-197 (1951).
[17] H. RASIOWA and R. SIKORSKI, The Mathematics of Metamathematics. Warsaw 1963.
[18] G.-C. ROTA, On the foundations of combinatorial theory. I. Z. Wahrscheinlichkeitstheorie Verw. Gebiete 2, 340-368 (1964).
[19] G.-C. ROTA, On the combinatorics of the Euler characteristic. In: Studies in Pure Mathematics, pp. 221-233. London 1971.
[20] L. SOLOMON, The Burnside algebra of a finite group. J. Combinatorial Theory 2, 603-615 (1967).
[21] E. SPANIER, Algebraic Topology. New York 1966.
Received 24. 8. 1970 *)
Author's address: Ladnor Geissinger, Mathematics Department, University of North Carolina, Chapel Hill, North Carolina 27514, USA
*) A revised version was received on 1. 9. 1972.
Offprint from ARCHIV DER MATHEMATIK, Vol. XXIV, 1973, Fasc. 4
BIRKHÄUSER VERLAG, BASEL AND STUTTGART
Valuations on Distributive Lattices II
By LADNOR GEISSINGER
Introduction. We continue the study begun in part I [Arch. Math. 24, 230-239 (1973)] of the valuation ring of a finite distributive lattice. We show that it is the Möbius algebra of the set of join-irreducible elements and we derive Solomon's formula for idempotents. We use the duality between posets and distributive lattices given in part I to derive mapping properties of Möbius algebras. From this we get theorems on extending finitely additive measures, theorems of Rota concerning Möbius functions, an identity due to Klee, a factorization theorem of Stanley and Greene, and results on the characteristic valuation of a distributive lattice.
3. Finite Valuation Rings and Möbius Algebras. If $L$ is a distributive lattice with DCC (minimum condition) and $P(L)$ is the set of all (including the zero $z$) join-irreducible elements of $L$, then every element of $L$ can be uniquely expressed as a finite irredundant join of elements of $P(L)$. For each $x \in L$ the set $a(x) = \{p \in P(L) \mid p \leq x\}$ is a finitely generated order ideal in $P(L)$ and $x = \bigvee a(x)$. Also the prime dual ideals (filters) in $L$ are precisely the principal filters generated by elements of $P(L)$.
Theorem 1. For every distributive lattice $L$ with DCC, the valuation ring $V(L)$ is a free abelian group with $P(L)$ as basis. That is, every valuation on $L$ is determined by its values on $P(L)$ and these values can be assigned arbitrarily.
Proof. For every $x \in L$ there are $\{p_1, \dots, p_r\} \subset P(L)$ such that $x = \bigvee p_i$. Then by Prop. 7, in $V(L)$ we have $x = \bigvee p_i = \sum p_i - \sum p_i \wedge p_j + \sum p_i \wedge p_j \wedge p_k - \cdots$. If $x \notin P(L)$ then each of the summands $p_i \wedge \cdots \wedge p_k$ is strictly below $x$ in $L$. Hence by induction upward on $L$ (DCC) we conclude that every $x$ in $L$ is a linear combination in $V(L)$ of a finite number of elements in $a(x)$. Thus $P(L)$ generates $V(L)$. For any $\{p_1, \dots, p_k\} \subset P(L)$, if $p_r$ is maximal among them then the remaining $p_i$ are in the complement of the prime filter generated by $p_r$ and so $p_r$ is independent of the rest by Prop. 3. Thus $P(L)$ is an independent set in $V(L)$.
Since this theorem is the principal result upon which the remainder of our discussion is based, we sketch alternative proofs. Perhaps the most natural procedure is to attempt to prove the statement about valuations directly without using the valuation ring. The chief difficulty is to show that any function $v$ from $P(L)$ into
an abelian group A can be extended to a valuation on L. One can easily define by induction on L an extension of the function v to L by using the unique representation of an element x as an irredundant join VPt of elements of P(L) and then setting v(x) = V(Pi) V(pj II Pi) .... However, to show that this extension is a valuation requires a rather delicate induction argument. For another proof, when L is locally finite, we can use instead the valuations vp of Proposition 3 for each PEP (L), which take the value 1 for all x ~ p and otherwise the value O. Every function f: P (L) -'>- A then determines a valuation Vf on L by vf(x) = 2:f(p) vp(x) = 2:f(p). By Mobius
2:
2:
p
p";",
inversion over P(L) we can choose f so that vf and any given v agree on P(L). One can also easily show that the Vp are independent. Hence if L is finite we get yet another proof by observing that V(L) is generated by P(L) and its dual has IP(L)I independent functionals and so again V(L) must be free on P(L). See also the proof by Greene [8]. i8
Corollary. If $P(L)$ is a $\wedge$-semilattice, then it is a $\wedge$-subsemilattice of $L$ and $(V(L), \wedge)$ is the semigroup algebra of $(P(L), \wedge)$ over $\mathbb{Z}$.
For a distributive lattice $L$ with DCC, the elements of $L$ correspond to the finitely generated order ideals in $P(L)$ and these are closed under finite intersection. $L$ is then locally finite iff $P(L)$ is locally finite, and iff for each $p$ in $P(L)$ there is a unique maximal element $p^0$ in $L$ lying below $p$. When $L$ is locally finite, for each $p$ in $P(L)$ let $e_p = p - p^0$ (in $V(L)$) and for the zero $z$ let $e_z = z$. Parts of the following theorem appear in a paper by Davis [4] in which he showed that the valuation ring $V(L)$ is isomorphic to the Möbius algebra of $P(L)$ as defined by Solomon [20].
Theorem 2. Let $L$ be a locally finite distributive lattice with zero, and let $\mu$ be the Möbius function of $P(L)$. Then $\{e_p \mid p \in P(L)\}$ is a basis of $V(L)$ consisting of orthogonal idempotents, $x = \sum e_p$ ($p \leq x$) for each $x$ in $L$, $e_p = \sum \mu(r, p)\, r$ ($r \leq p$ and $r$ in $P(L)$) for each $p$ in $P(L)$, and $x = \sum \mu(r, p)\, r$ ($p$, $r$ in $P(L)$ and $p \leq x$) for each $x$ in $L$.
Proof. For $p \neq q$ in $P(L)$ it is easy to check that $e_p$ and $e_q$ are idempotent and $e_p \wedge e_q = 0$. For $p$ in $P(L)$, $p = e_p + p^0$, and for any $x$ in $L$ which is not in $P(L)$, $x = b \vee c$ where $b$, $c$ are less than $x$ and in $L$. Also if $b = \sum e_q$ ($q \leq b$) and $c = \sum e_q$ ($q \leq c$) then $b \wedge c = \sum e_q$ ($q \leq b \wedge c$) since the $e_q$ are orthogonal idempotents. Hence $b \vee c = b + c - b \wedge c = \sum e_q$ ($q \leq b \vee c$). Thus by induction upward, $x = \sum e_q$ ($q \leq x$) for all $x$ in $L$. Hence the $\{e_q\}$ form a basis of $V(L)$. The embedding $i: L \to V(L)$ restricted to $P(L)$ and the map $e$ from $P(L)$ into $V(L)$ thus satisfy $i(p) = \sum e_q$ ($q \leq p$) and so by Möbius inversion $e_p = \sum \mu(r, p)\, i(r) = \sum \mu(r, p)\, r$ ($r \leq p$). It follows that for every $x$ in $L$, $x = \sum e_p = \sum \mu(r, p)\, r$ ($r \leq p \leq x$ and $r$, $p$ in $P(L)$).

The expression for any $x$ in terms of $P(L)$ was discovered before the other formulas above. A direct proof of this follows. By Theorem 1, for any $x$ in $L$, $x = \sum d_p\, p$ where $d_p$ is integral and $d_p = 0$ if $p \notin a(x)$. For any $q$ in $a(x)$,

$q = x \wedge q = \Bigl(\sum_{p \geq q} d_p\Bigr) q + \sum_{p \not\geq q} d_p\,(p \wedge q)$

and since $q$ is independent of the $p \wedge q < q$ in the second summand, $\sum_{p \geq q} d_p = 1$. This is true for each $q$ in $a(x)$, so by Möbius inversion over the finite subset $a(x)$ of $P(L)$ we get $d_r = \sum_{p \leq x} \mu(r, p)$.
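As an illustration of Theorem 2, here is a small worked case. The three-element poset below is a hypothetical example chosen for brevity and is not taken from the text; it only instantiates the formulas stated in the theorem.

```latex
% Hypothetical example: P(L) = \{z, a, b\} with z < a, z < b and a, b incomparable,
% so that L = \{z,\ a,\ b,\ a \vee b\} and a \wedge b = z.
\begin{align*}
\mu(z,z) &= \mu(a,a) = \mu(b,b) = 1, \qquad \mu(z,a) = \mu(z,b) = -1,\\
e_z &= z, \qquad e_a = a - z, \qquad e_b = b - z,\\
e_a \wedge e_a &= a - z - z + z = e_a, \qquad
e_a \wedge e_b = a \wedge b - z - z + z = 0,\\
a \vee b &= a + b - a \wedge b = a + b - z = e_z + e_a + e_b .
\end{align*}
```

The last line agrees with $x = \sum \mu(r,p)\, r$ over $r \leq p \leq x$: the terms for $p = z, a, b$ are $z$, $a - z$ and $b - z$.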
Corollary. For every $q$ in $P(L)$, $q^0 = -\sum \mu(r, q)\, r$ ($r < q$).
Solomon [20] defined the Möbius algebra $M(Q)$, for any poset $Q$ in which every principal ideal $(q]$ is finite, as the free abelian group on $Q$ with multiplication given by $q \wedge r = \sum \mu(s, t)\, s$ ($s \leq t$, $t \leq q$ and $t \leq r$). If $Q$ has a zero $z$, then $Q$ is isomorphic to the poset of join-irreducible elements of the lattice $J(Q)$ of nonempty order ideals of $Q$, and by Theorem 2 then $M(Q) \cong V(J(Q))$. If $Q$ does not have a zero, letting $z$ denote the zero (= empty set) of $J(Q)$, it follows that $z$ generates a 1-dimensional ideal in $V(J(Q))$ and $M(Q) \cong V(J(Q))/\mathbb{Z}z$. In both cases, the spanning orthogonal idempotents in $M(Q)$ are given as above by $e_p = \sum \mu(r, p)\, r$ ($r \leq p$) and $p = \sum e_q$ ($q \leq p$) for each $p$ in $Q$.
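The multiplication rule above can be tried out on a small poset. The following is a minimal sketch, assuming a hypothetical five-element poset that is not a lattice (`z` below `a`, `b`, and both `a` and `b` below each of `c` and `d`); the names `DOWN`, `mu`, `mul_basis`, `mul` and `idempotent` are illustrative only. Elements of the algebra are stored as dictionaries from poset elements to integer coefficients.

```python
from functools import lru_cache
from itertools import product

# Hypothetical poset (an assumption for illustration): z < a, b and a, b < c, d.
DOWN = {
    "z": {"z"},
    "a": {"z", "a"},
    "b": {"z", "b"},
    "c": {"z", "a", "b", "c"},
    "d": {"z", "a", "b", "d"},
}
ELEMS = tuple(DOWN)


def leq(s, t):
    """Order relation: s <= t iff s lies in the principal ideal of t."""
    return s in DOWN[t]


@lru_cache(maxsize=None)
def mu(s, t):
    """Moebius function via mu(t,t) = 1 and sum over s <= r <= t of mu(r,t) = 0."""
    if s == t:
        return 1
    if not leq(s, t):
        return 0
    return -sum(mu(r, t) for r in ELEMS if r != s and leq(s, r) and leq(r, t))


def mul_basis(q, r):
    """Product of two basis elements: sum of mu(s,t) s over s <= t, t <= q, t <= r."""
    out = {}
    for t in ELEMS:
        if leq(t, q) and leq(t, r):
            for s in ELEMS:
                if leq(s, t):
                    out[s] = out.get(s, 0) + mu(s, t)
    return {k: v for k, v in out.items() if v != 0}


def mul(x, y):
    """Bilinear extension of mul_basis to integer combinations."""
    out = {}
    for (q, cq), (r, cr) in product(x.items(), y.items()):
        for s, c in mul_basis(q, r).items():
            out[s] = out.get(s, 0) + cq * cr * c
    return {k: v for k, v in out.items() if v != 0}


def idempotent(p):
    """e_p = sum of mu(r,p) r over r <= p."""
    return {r: mu(r, p) for r in ELEMS if leq(r, p) and mu(r, p) != 0}
```

In this poset `c` and `d` have no meet, and `mul_basis("c", "d")` comes out as `a + b - z`, which is the join of `a` and `b` in the lattice of order ideals.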
Proposition 13. An order morphism $\beta: P \to Q$ of finite posets induces ring homomorphisms $\beta': V(J(Q)) \to V(J(P))$ and $\beta': M(Q) \to M(P)$ given by $\beta'(e_q) = \sum e_p$ ($\beta(p) = q$), where the sum is taken to be 0 if the index set is empty.

Proof. We are identifying $Q$, $P$ with the nonzero join-irreducible elements of $J(Q)$, $J(P)$. Define a map $\lambda: V(J(Q)) \to V(J(P))$ by $\lambda(e_q) = \sum e_p$ ($\beta(p) = q$). Then for each $A \in J(Q)$, $A = \sum e_q$ ($q \in A$) and so $\lambda(A) = \sum e_p$ ($\beta(p) \in A$) $= \beta'(A)$. Thus $\lambda$ is the map induced on $V(J(Q))$ by the $(u, z)$-lattice homomorphism $\beta': J(Q) \to J(P)$. Factoring out $z = \emptyset$ merely deletes $e_z = z$ from the expressions for $e_p$, $e_q$ and $\beta'$, so that in the quotient $\beta': M(Q) \to M(P)$ is given by the same formula.

The conclusion of Prop. 13 still holds if we suppose $\beta$ is only defined on an order ideal in $P$, or equivalently if $\beta$ maps $P$ into $Q$ with a unit $u'$ adjoined. If moreover $P$ and $Q$ have zeros and if $\beta$ preserves zero, then $\beta': J(Q) \to J(P)$ is a lattice homomorphism and by Prop. 9 every lattice homomorphism $J(Q) \to J(P)$ comes from exactly one such $\beta: P \to Q \cup \{u'\}$.

From Prop. 13 it follows that $\beta': V(J(Q)) \to V(J(P))$ is surjective iff $\beta$ is injective and $\beta(P) \subseteq Q$ (nothing maps onto $u'$), while $\beta'$ is injective iff $\beta$ maps onto $Q$ (some elements could still go onto $u'$). Finally, if $\beta$ maps onto $Q$, let $\alpha: Q \to P$ be any function such that $\beta\alpha = \mathrm{id}$ (i.e. $\alpha(q) \in \{p \mid \beta(p) = q\}$), and let $T$ be the subgroup of $V(J(P))$ spanned by $\{e_p \mid p \notin \alpha(Q)\}$. The subring $\beta'(V(J(Q)))$ is spanned by $\{\beta'(e_q)\}$ and clearly $V(J(P)) = \beta'(V(J(Q))) \oplus T$. Furthermore, it is easy to see that when any $y$ in $V(J(P))$ is expressed as a linear combination of the $\{\beta'(e_q)\}$ and the basis described for $T$, then every coefficient is 0 or $\pm 1$. Hence if $y \notin \beta'(V(J(Q)))$, we may replace some $e_p$ in the basis for $T$ by $y$ to get a basis for another direct complement of $\beta'(V(J(Q)))$. These results combined with the duality in Prop. 9 and our subsequent discussion of mapping properties yield the following.
Proposition 14. Let $\lambda: K \to L$ be a homomorphism of finite distributive lattices and $\lambda_e: V(K) \to V(L)$ the induced ring homomorphism. Then $\lambda$ is an epimorphism iff $\lambda_e$ is surjective, and $\lambda$ is an injection iff $\lambda_e$ is an injection. In any case, $\lambda_e(V(K))$ is a direct summand of $V(L)$. Moreover, for any $y \in L$ with $y \notin \lambda_e(V(K))$ there is a direct complement of $\lambda_e(V(K))$ with a basis which contains $y$.
Corollary. Let $K$ be a sublattice of a finite distributive lattice $L$. Then every valuation $v$ of $K$ into an abelian group $A$ can be extended to a valuation of $L$ into $A$. Moreover, for any element $y \in L$ such that $y \notin V(K)$, there is an extension of $v$ to $L$ for which $v(y)$ is any prescribed value in $A$.

We will see later precisely what the condition that $y \in L$ and $y \notin V(K)$ means in terms of the lattices $K$ and $L$. The following interesting result will be used to extend part of Proposition 14 and its Corollary to infinite distributive lattices.

Proposition 15. Let $B_1, \dots, B_r$ be elements of a distributive lattice $L$. Every linear relation among the $B_k$ which holds in $V(L)$ also holds in $V(M)$ where $M$ is the finite sublattice generated by the $B_k$.

Proof. Suppose that in $V(L)$, $\sum a_k B_k = 0$ where the $a_k$ are integers. From our construction of $V(L)$ this means that $\sum a_k B_k$, when considered back in $\mathbb{Z}(L)$, is a linear combination of a finite number of elements of the form

$C_j \vee D_j + C_j \wedge D_j - C_j - D_j.$

Let $N$ be the finite sublattice generated by all the $B_k$, $C_j$, $D_j$. Then in $V(N)$ the relation $\sum a_k B_k = 0$ holds. But $N$ is finite and $M \subseteq N$, so by Proposition 14 the induced map $j: V(M) \to V(N)$ is an injection. Thus any relation among elements of $M$ which holds in $V(N)$ also holds in $V(M)$.

Remark. This immediately yields a proof of Proposition 4 which does not depend on the existence of enough prime ideals, that is, does not make use of Zorn's Lemma.

Proposition 16. For every distributive lattice $L$ and sublattice $M$, the natural map $j: V(M) \to V(L)$ induced by inclusion is an injection.

Proof. If $B_1, \dots, B_r$ are elements of $M$, any linear relation among them which holds in $V(L)$ also holds in the valuation ring of the sublattice of $L$ generated by the $B_k$, by Proposition 15. But since this sublattice is contained in $M$ the relation also holds in $V(M)$.

Corollary. Any valuation of $M$ into any rational vector space $A$ (or divisible abelian group) can be extended to a valuation of $L$ into $A$.

Stronger versions of this were proved by Horn and Tarski [6] and Pettis [9] for bounded real-valued modular functions.
4. Combinatorial Applications. The Characteristic Valuation. For a finite distributive lattice $L$ it is easily shown that the rank function $r$ is given by $r(y) = |\{p \in P(L) \mid z < p \leq y\}|$. Using the representation $y = \sum e_p$ ($p \leq y$) in $V(L)$ we see that $r$ is the unique valuation on $L$ for which $r(p) - r(p^0) = r(e_p) = 1$ for all $p \in P(L)$ and $r(z) = 0$. The valuation $\chi$ on $L$ for which $\chi(p) = 1$ for all $p \in P(L)$ and $\chi(z) = 0$ is called by Rota the characteristic valuation of $L$.
Proposition 17. For every $y \in L$, $\chi(y) = -\sum \mu(z, q)$ ($q \in P(L)$ and $z < q \leq y$). For the element $p^0$ covered (in $L$) by some $p \in P(L)$, $\chi(p^0) = 1 + \mu(z, p)$.

Proof. Just apply the homomorphism $\chi$ on $V(L)$ to the expression $y = \sum \mu(p, q)\, p$ ($q \leq y$) given in Theorem 2 to get $\chi(y) = \sum \mu(p, q)$ ($z < p \leq q \leq y$). Then $\chi(y) = -\sum \mu(z, q)$ ($z < q \leq y$) and when $y = p^0$,

$\chi(p^0) = -\sum \mu(z, q)\ (z < q < p) = \mu(z, z) + \mu(z, p).$
Now suppose $P(L)$ is a lattice and $y$ is the join in $L$ of $p_1, \dots, p_r \in P(L)$, then

$(\chi - 1)(y) = (\chi - 1)\bigl(\bigvee p_i\bigr) = \sum (\chi - 1)(p_i) - \sum (\chi - 1)(p_i \wedge p_j) + \cdots.$

Since $\chi - 1$ is the valuation which takes the value 0 on $P(L)$ and $-1$ on $z$, we get $\chi(y) - 1 = c_2 - c_3 + c_4 - c_5 + \cdots$ where $c_k$ is the number of $k$-element subsets of $p_1, \dots, p_r$ whose meet is $z$. When $y = q^0$ for some $q \in P(L)$ this yields a result due to Rota [18].
Corollary. Suppose $P$ is a finite lattice and $\{q, p_1, \dots, p_r\} \subseteq P$ is such that all $p_i < q$ and every element covered by $q$ is among the $p_i$. Then $\mu(z, q) = c_2 - c_3 + \cdots$ where $c_k$ is the number of $k$-subsets of $\{p_i\}$ whose meet is $z$.

Proof. In $J(P)$, $q^0$ is the join of the $p_i$, hence by Proposition 17, $\chi(q^0) - 1 = \mu(z, q)$.

Following Rota, we indicate how the characteristic valuation is related to the Euler characteristic of combinatorial topology [19]. For any finite totally unordered set $S$, the nonzero elements of $J(S) = 2^S$ are called simplexes and the nonzero elements of $J^2(S)$ are the finite simplicial complexes with vertices in $S$. In this setting we usually treat $2^S$ simply as a poset and ignore its lattice structure, but joins and intersections of subcomplexes are important. For any finite distributive lattice $L = J(P)$ we would like to consider the elements of $P$ as simplexes and the elements of $J(P)$ as simplicial complexes. We can approximate to this by constructing an analogue of the barycentric subdivision operator.

The (first) barycentric subdivision of a simplicial complex $K \in J(2^S)$ may be described as the simplicial complex $B(K)$ whose $k$-simplexes are all chains $A_0 \subset A_1 \subset \cdots \subset A_k$ of $k + 1$ (non-empty) simplexes $A_i$ of $K$. By analogy, for any finite poset $M$ let $B(M)$ be the set of all (non-empty) finite chains of elements of $M$ ordered by inclusion. $B(M)$ is an ordered simplicial complex [10] and $J(B(M))$ is the lattice of all finite subcomplexes of $B(M)$. To each $A \in J(M)$ we associate the subcomplex $B(A)$ whose simplexes are those chains in $B(M)$ all of whose elements are in $A$, that is, the full subcomplex with vertex set $A$. Then $B$ is a lattice monomorphism of $J(M)$ into $J(B(M))$. Note that for $m \in M$, $B((m])$ is usually not a simplex or even a subdivided simplex, but it is topologically as simple since it is the cone with vertex $m$ and base $B(\{m' \in M \mid m' < m\})$ and is thus contractible.
When $M$ consists of all elements of a finite lattice $L$ except for the unit and zero, then the homology of the complex $B(M)$ is called the order homology of $L$ [5, 18, 19]. Now if $\lambda: L_1 \to L_2$ is a $(u, z)$-homomorphism of finite distributive lattices, $\lambda^*$ carries each chain in $P(L_2)$ into a possibly smaller chain in $P(L_1)$ so that $\lambda^*$ may also be considered as an order homomorphism of $BP(L_2)$ into $BP(L_1)$. Taking preimages we get a $(u, z)$-homomorphism $B(\lambda)$ of $JBP(L_1)$ into $JBP(L_2)$.
The association of $JBP(L)$ to $L$ and of $B(\lambda)$ to $\lambda$ is a functor from the lattice category into itself, and the lattice monomorphism $B: L \to JBP(L)$ given by

$B(y) = B(\{p \in P(L) \mid p \leq y\})$

provides a natural transformation from the identity to this functor. It is well-known [10, 21] that the Euler characteristic $E$ in combinatorial topology is a modular function from the lattice of subcomplexes of a simplicial complex into the integers, which is 0 on the empty subcomplex and takes the value 1 on any contractible subcomplex. Thus for any finite distributive lattice $L$, the composite

$EB: L \to JBP(L) \to \mathbb{Z}$

is a valuation on $L$ which takes the value 0 on $z$ and the value 1 on $P(L)$, that is, $EB = \chi$, the characteristic valuation on $L$. From the classical formula for the Euler characteristic, for $y \in L$, $\chi(y) = EB(y) = a_0 - a_1 + a_2 - \cdots$ where $a_k$ is the number of chains of $k + 1$ elements in $P(L)$ which are less than or equal to $y$.
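The two descriptions of the characteristic valuation, the Möbius sum of Proposition 17 and the alternating chain count above, can be compared numerically. The sketch below assumes a hypothetical example: the divisors of 12 ordered by divisibility, with zero `Z = 1`, and an element of the lattice represented as an order ideal containing `Z`. The function names are illustrative and not from the text.

```python
from functools import lru_cache

# Hypothetical poset (an assumption for illustration): divisors of 12, zero Z = 1.
ELEMS = (1, 2, 3, 4, 6, 12)
Z = 1


def leq(a, b):
    """a <= b iff a divides b."""
    return b % a == 0


@lru_cache(maxsize=None)
def mu(s, t):
    """Moebius function of the poset, computed recursively."""
    if s == t:
        return 1
    if not leq(s, t):
        return 0
    return -sum(mu(r, t) for r in ELEMS if r != s and leq(s, r) and leq(r, t))


def chi_mobius(y):
    """chi(y) as minus the sum of mu(Z, q) over nonzero q in the order ideal y."""
    return -sum(mu(Z, q) for q in y if q != Z)


def chi_chains(y):
    """chi(y) as a_0 - a_1 + a_2 - ..., with a_k counting chains of k+1 nonzero elements of y."""
    nonzero = [x for x in y if x != Z]
    # layer[x] = number of chains of the current length whose top element is x
    layer = {x: 1 for x in nonzero}
    total, sign = 0, 1
    while any(layer.values()):
        total += sign * sum(layer.values())
        layer = {
            x: sum(layer[w] for w in nonzero if w != x and leq(w, x))
            for x in nonzero
        }
        sign = -sign
    return total
```

For the order ideal `{1, 2, 3}` (the element covered by 6) both functions give 2, matching $1 + \mu(z, 6)$ from Proposition 17.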
Klee's Identity and Extensions of Valuations. In a paper on the Euler characteristic [11], Klee proves the following identity.

Proposition 18. Suppose $S$ is a $\wedge$-semilattice, $(a_i)$ and $(b_j)$ are finite indexed families of elements in $S$, and $u \in S$ is such that $u \geq a_i, b_j$ for all $i$, $j$. Then in the semigroup ring $\mathbb{Z}[S, \wedge]$,

$\prod (u - a_i) + \prod (u - b_j) - \prod (u - a_i \wedge b_j) = \prod (u - a_i) \prod (u - b_j).$

Proof. We may assume $S$ is finite. From the Corollary to Theorem 1 we see that the identification of an element of $S$ with the ideal it generates yields an isomorphism of $\mathbb{Z}[S, \wedge]$ with $V(J(S))$. But in $V(J(S))$, the identity above reduces by Proposition 7 to the simple statement

$\bigl(u - \bigvee a_i\bigr) + \bigl(u - \bigvee b_j\bigr) - \bigl(u - \bigvee \{a_i \wedge b_j\}\bigr) = u - \bigl(\bigvee a_i\bigr) \vee \bigl(\bigvee b_j\bigr)$

where $\vee$ means join in $J(S)$. And this holds because $\bigvee \{a_i \wedge b_j\} = (\bigvee a_i) \wedge (\bigvee b_j)$ in $J(S)$.
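Klee's identity can be checked by direct computation in a small semigroup ring. The following sketch assumes, purely for illustration, that $S$ is the family of subsets of a three-element set under intersection with $u$ the full set; ring elements are dictionaries from subsets to integer coefficients, and all function names are hypothetical.

```python
from itertools import combinations, product

# Hypothetical semilattice (an assumption): subsets of {0, 1, 2} under intersection.
S = [frozenset(c) for k in range(4) for c in combinations(range(3), k)]
U = frozenset(range(3))


def add(x, y, sign=1):
    """Return x + sign * y in the semigroup ring, dropping zero coefficients."""
    out = dict(x)
    for k, v in y.items():
        out[k] = out.get(k, 0) + sign * v
    return {k: v for k, v in out.items() if v != 0}


def mul(x, y):
    """Product in Z[S, meet]: basis elements multiply by intersection."""
    out = {}
    for (s, cs), (t, ct) in product(x.items(), y.items()):
        m = s & t
        out[m] = out.get(m, 0) + cs * ct
    return {k: v for k, v in out.items() if v != 0}


def prod_u_minus(u, family):
    """Product of (u - a) over all a in the family."""
    result = {u: 1}
    for a in family:
        result = mul(result, add({u: 1}, {a: 1}, sign=-1))
    return result


def klee_holds(u, a_family, b_family):
    """Compare both sides of the identity of Proposition 18."""
    meets = [a & b for a in a_family for b in b_family]
    pa, pb = prod_u_minus(u, a_family), prod_u_minus(u, b_family)
    lhs = add(add(pa, pb), prod_u_minus(u, meets), sign=-1)
    rhs = mul(pa, pb)
    return lhs == rhs
```

A call such as `klee_holds(U, [frozenset({0}), frozenset({1})], [frozenset({1, 2})])` compares the two sides for one choice of families.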
Klee derived the relation above using the counting result below. For positive integers $c$, $m$, $n$ let $P_c(m, n)$ denote the number of relations of cardinality $c$ in $\{1, 2, \dots, m\} \times \{1, 2, \dots, n\}$ with domain $\{1, 2, \dots, m\}$ and range $\{1, 2, \dots, n\}$ (subsets of size $c$ projecting onto all of $\{1, \dots, m\}$ and $\{1, \dots, n\}$ respectively).

Corollary. $-\sum_c (-1)^c P_c(m, n) = (-1)^{m+n}$.
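The counting identity in the Corollary is easy to confirm by brute force for small $m$ and $n$. This is a minimal sketch that simply enumerates all subsets of the grid; the function names are illustrative, and indices run from 0 rather than 1.

```python
from itertools import combinations, product


def count_relations(m, n):
    """Return {c: P_c(m, n)}: the number of c-element subsets of the m-by-n grid
    whose projections cover every row index and every column index."""
    cells = list(product(range(m), range(n)))
    rows, cols = set(range(m)), set(range(n))
    counts = {}
    for c in range(1, len(cells) + 1):
        counts[c] = sum(
            1
            for rel in combinations(cells, c)
            if {i for i, _ in rel} == rows and {j for _, j in rel} == cols
        )
    return counts


def klee_sum(m, n):
    """Left-hand side of the Corollary: minus the alternating sum of the P_c(m, n)."""
    return -sum((-1) ** c * p for c, p in count_relations(m, n).items())
```

For $m = n = 2$ the counts are $P_2 = 2$, $P_3 = 4$, $P_4 = 1$, so the alternating sum is $-(2 - 4 + 1) = 1 = (-1)^4$.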
Proof. Choose a semilattice $S$ and elements $a_i$, $b_j$ such that all of the meets $a_{i_1} \wedge \cdots \wedge a_{i_s} \wedge b_{j_1} \wedge \cdots \wedge b_{j_t}$ are distinct. Then in the identity in Proposition 18, the expressions above are the coefficients on each side of the identity for any term which is a product of $m$ of the $a_i$ and $n$ of the $b_j$.

The following is a generalization of a theorem by Klee in the same paper [11]. Let $L$ be a lattice and $K$ a $\wedge$-subsemilattice such that every element of $L$ is a finite join of elements of $K$. For any function $f$ from $K$ into an abelian group, and for any finite family $(a_1, \dots, a_n)$ of elements of $K$, let

$f(a_1, \dots, a_n) = \sum f(a_i) - \sum f(a_i \wedge a_j) + \sum f(a_i \wedge a_j \wedge a_k) - \cdots \quad (i < j < k \cdots).$
Proposition 19. The function $f$ on $K$ can be extended to a distributive valuation on $L$ iff for all families $(a_i)$, $(b_j)$ in $K$, if $\bigvee a_i = \bigvee b_j$ then $f(a_1, \dots, a_n) = f(b_1, \dots, b_r)$.

Proof. By our earlier remarks this condition holds if $f$ extends to a distributive valuation on $L$, since then $f(\bigvee a_i) = f(a_1, \dots, a_n)$. So assume $f$ satisfies the condition. Then for any $a \in L$, define $f(a)$ to be $f(a_1, \dots, a_n)$ where the $a_i \in K$ are chosen so that $\bigvee a_i = a$. Extend this function on $L$ linearly to $\mathbb{Z}[L, \wedge]$, then

$f(a) = f(a_1, \dots, a_n) = f\bigl(\sum a_i - \sum a_i \wedge a_j + \cdots\bigr).$

For $(a_i)$ and $(b_j)$ in $K$, by Klee's identity,

$f\bigl(\bigvee a_i\bigr) + f\bigl(\bigvee b_j\bigr) - f\bigl(\bigvee \{a_i \wedge b_j\}\bigr) = f\bigl((\bigvee a_i) \vee (\bigvee b_j)\bigr).$

Finally, given $a$, $b$ in $L$ we can find $(a_i)$, $(b_j)$ in $K$ such that $\bigvee a_i = a$, $\bigvee b_j = b$ and $a \wedge b = \bigvee \{a_i \wedge b_j\}$, hence $f$ is a valuation on $L$ and it is obviously distributive.
Corollary. If $L$ is distributive, $f$ can be extended to a valuation on $L$ iff for all $(a_i)$ in $K$, if $\bigvee a_i = a$ is again in $K$ then $f(a) = f(a_1, \dots, a_n)$.

Proof. One can easily show for any $(a_i)$ in $K$, if $a_{n+1} \leq \bigvee a_i$ then

$f(a_1, \dots, a_n) = f(a_1, \dots, a_{n+1})$

and from this that if $\bigvee a_i = \bigvee b_j$ then $f(a_1, \dots, a_n) = f(b_1, \dots, b_r)$.
See Klee [11] for this and further results.

Factorization in Möbius Algebras. Suppose $P$ is a finite poset with zero and unit and $K$ is the set of all elements covered by $u$. Let $t \in P$, $t \neq u$ and $C = \{k \in K : k \geq t\}$, and identify as usual $P$ with the join-irreducible elements of $J(P) = L$. In $V(L)$ or $M(P)$, $e_u = u - u^0 = u - \bigvee K = \prod_K (u - k)$ and $e_u = \sum \mu(r, u)\, r$. For all $c$ in $C$, $u - c$ is idempotent and $(u - c) \wedge r = 0$ for all $r \leq c$. Thus

$e_u = (u - c) \wedge e_u = \sum \mu(r, u)\,(u - c) \wedge r$

and continuing for all $c$ in $C$,

$e_u = \prod_C (u - c) \wedge e_u = \prod_C (u - c) \wedge \bigl(\sum \mu(r, u)\, r\bigr)$

where the sum is over only those $r \in P$ for which $r \not\leq \bigvee C$. But $r \not\leq \bigvee C$ in $J(P)$ iff in $P$, $\sup\{r, t\} = u$. Now in the Möbius algebra of the interval $[t, u]$ as poset, we have $\prod_C (u - c) = \sum_{r \geq t} \mu(r, u)\, r$, but this may not hold in $M(P)$. In $M(P)$ we have

$\prod_C (u - c) = u - \bigvee C = \sum e_p \quad (p \not\leq \bigvee C)$

and

$\sum_{r \geq t} \mu(r, u)\, r = \sum_{r \geq t} \mu(r, u) \sum_{p \leq r} e_p.$

The coefficient of $e_p$ in this expression is 1 when $p \not\leq \bigvee C$, 0 when $t$ and $p$ are comparable and $p \neq u$, and $\sum \mu(r, u)$ ($t \leq r$ and $p \leq r$) when $p$ and $t$ are incomparable and $p \leq \bigvee C$. It follows that, if for every $p \in P$, $\sup\{p, t\}$ exists in $P$, then in $M(P)$ we have $\sum_{r \geq t} \mu(r, u)\, r = \prod_C (u - c)$. This yields the factorization theorem due to Greene [8].
Proposition 20. Suppose $P$ is a finite poset with unit and zero and $t$ in $P$ has the property that for every $p$ in $P$, the join $p \vee t$ exists in $P$. Then in the Möbius algebra of $P$,

$\sum_{r \in P} \mu(r, u)\, r = \Bigl(\sum_{r \geq t} \mu(r, u)\, r\Bigr) \wedge \Bigl(\sum_{r \vee t = u} \mu(r, u)\, r\Bigr).$
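A small worked case may help fix the notation of Proposition 20. The four-element Boolean lattice used here is a hypothetical example, not one from the text; since it is a lattice, multiplication in its Möbius algebra is simply the meet.

```latex
% Hypothetical example: P = \{z, a, b, u\} with z < a, b < u, and t = a.
\begin{align*}
\mu(u,u) &= 1, \qquad \mu(a,u) = \mu(b,u) = -1, \qquad \mu(z,u) = 1,\\
\sum_{r \in P} \mu(r,u)\, r &= u - a - b + z,\\
\sum_{r \geq t} \mu(r,u)\, r &= u - a, \qquad
\sum_{r \vee t = u} \mu(r,u)\, r = u - b,\\
(u - a) \wedge (u - b) &= u - b - a + a \wedge b = u - a - b + z .
\end{align*}
```

Here the elements $r$ with $r \vee t = u$ are $b$ and $u$, and the product of the two factors reproduces the full sum, as the proposition asserts.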
Note that the condition on $t$ is equivalent to requiring the injection $[t, u] \to P$ to be residuated, that is, the preimage of every principal filter in $P$ is a principal filter in $[t, u]$. When $P$ is a lattice, if $Q$ is the $\wedge$-subsemilattice generated by $C$ and $u$, then the inclusion $Q \to P$ induces a homomorphism $M(Q) \to M(P)$. Thus $\prod_C (u - c)$ can be computed in $M(P)$ as $\sum \mu_P(r, u)\, r$ ($r \in P$, $r \geq t$) and in the sub-algebra $M(Q)$ as $\sum \mu_Q(r, u)\, r$ ($r \in Q$). Greene and Stanley [8] use the latter expression in applications of the factorization theorem to geometric lattices.

Residuated Maps. For a finite poset $P$, even if $P$ has a zero, adjoin a new element $z$ to get a poset $\bar{P} = P \cup \{z\}$ with $z$ as zero. Enlarge the Möbius algebra $M(P)$ to $M(P) \oplus \mathbb{Z}z$ and extend multiplication by defining $(z)^2 = z$ and $M(P) \wedge z = 0$. In this algebra it is easily checked that the elements $\{p + z \mid p \in P\} \cup \{z\}$ multiply precisely as do the elements of $\bar{P}$ in $M(\bar{P})$, so that we may identify $M(\bar{P})$ and $M(P) \oplus \mathbb{Z}z$. Now if $\varphi: P \to Q$ is an order morphism of finite posets and each of $P$ and $Q$ is enlarged by adding a zero $z$, then obviously $\varphi: M(P) \to M(Q)$ is a $\wedge$-homomorphism iff $\bar{\varphi}: M(\bar{P}) \to M(\bar{Q})$ is a $\wedge$-homomorphism. If $\bar{\varphi}$ is a $\wedge$-homomorphism, since $x \vee y = x + y - x \wedge y$ for $x$, $y$ in $J(P)$ or in $J(Q)$, then $\bar{\varphi}$ is also a lattice homomorphism of $J(P)$ into $J(Q)$ taking only $z$ in $\bar{P}$ onto $z$ in $\bar{Q}$. But following Prop. 10 we saw that $\varphi: P \to Q$ extends to a lattice $z$-homomorphism $\bar{\varphi}: J(P) \to J(Q)$ iff $\varphi$ has the property that the preimage of each principal filter in $Q$ is a principal filter in $P$ or is empty. This completes the proof of the following proposition.

Proposition 21. A map $\varphi: P \to Q$ of finite posets extends to a homomorphism $\varphi: M(P) \to M(Q)$ of the Möbius algebras iff the preimage of every principal filter in $Q$ is a principal filter in $P$ or is empty.

Note that this condition is satisfied if $\varphi$ is inclusion and $P$ is either an ideal in $Q$ or $P$ is a $\wedge$-subsemilattice if $Q$ is a $\wedge$-semilattice. If $\varphi$ satisfies the condition in Prop. 21, $\varphi$ is half of a Galois connection and the other part is the map $\beta: Q \to P \cup \{u\}$ given by $\beta(q) = \min\{p : \varphi(p) \geq q\}$. From Prop. 13 it follows that the map $\varphi = \beta': M(P \cup \{u\}) \to M(Q)$ is given by $\varphi(e_p) = \sum e_q$ ($\beta(q) = p$). But $\varphi(e_p) = \sum_r \mu_P(r, p)\, \varphi(r)$ and $\sum_q e_q$ ($\beta(q) = p$) $= \sum_{t, q} \mu_Q(t, q)\, t$ ($\beta(q) = p$). Comparing coefficients of any $t$ in $Q$ yields Rota's principal theorem on Galois connections [8, 18].
Corollary. Suppose $\varphi: P \to Q$ and $\beta: Q \to P$ are order morphisms of finite posets such that $\beta(q) = \min\{p : \varphi(p) \geq q\}$ for each $q$ in $Q$. Then for each $t$ in $Q$ and $p$ in $P$,

$\sum_{\beta(q) = p} \mu_Q(t, q) = \sum_{\varphi(r) = t} \mu_P(r, p)$

where a sum is taken to be 0 if the index set is empty.
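The identity in this Corollary can be tested on a concrete Galois connection. The sketch below assumes a hypothetical pair of posets: $P$ is the lattice of subsets of $\{0, 1, 2\}$, $Q$ the lattice of subsets of $\{0, 1\}$, and $\varphi(S) = S \cap \{0, 1\}$; the map $\beta$ is computed from its defining minimum rather than written down by hand. All names are illustrative.

```python
from functools import lru_cache
from itertools import combinations


def subsets(ground):
    """All subsets of the ground set, as frozensets."""
    return [frozenset(c) for k in range(len(ground) + 1) for c in combinations(ground, k)]


def leq(a, b):
    """Inclusion order on frozensets."""
    return a <= b


def make_mobius(elems):
    """Return the Moebius function of the poset (elems, leq)."""
    @lru_cache(maxsize=None)
    def mu(s, t):
        if s == t:
            return 1
        if not leq(s, t):
            return 0
        return -sum(mu(r, t) for r in elems if r != s and leq(s, r) and leq(r, t))
    return mu


# Hypothetical posets and map (assumptions for illustration only).
P = subsets([0, 1, 2])
Q = subsets([0, 1])
mu_P, mu_Q = make_mobius(tuple(P)), make_mobius(tuple(Q))


def phi(p):
    return p & frozenset([0, 1])


def beta(q):
    """beta(q) = least p in P with phi(p) >= q; fails loudly if no least element exists."""
    candidates = [p for p in P if leq(q, phi(p))]
    least = [p for p in candidates if all(leq(p, c) for c in candidates)]
    assert len(least) == 1, "phi is not residuated at this q"
    return least[0]


def lhs(t, p):
    """Sum of mu_Q(t, q) over q with beta(q) = p."""
    return sum(mu_Q(t, q) for q in Q if beta(q) == p)


def rhs(t, p):
    """Sum of mu_P(r, p) over r with phi(r) = t."""
    return sum(mu_P(r, p) for r in P if phi(r) == t)
```

In this example $\beta$ is the inclusion of $Q$ into $P$, so the left side vanishes whenever $p$ contains the element 2, and the right side vanishes there by cancellation between $r = t$ and $r = t \cup \{2\}$.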
References
[1] R. BALBES, Projective and injective distributive lattices. Pacific J. Math. 21, 405-420 (1967).
[2] G. BIRKHOFF, Lattice Theory. Third ed., Providence 1967.
[3] H. CRAPO and G.-C. ROTA, Combinatorial Geometries. M.I.T. Press 1970.
[4] R. L. DAVIS, Order Algebras. Bull. Amer. Math. Soc. 76, 83-87 (1970).
[5] J. FOLKMAN, The homology groups of a lattice. J. Math. and Mech. 15, 631-636 (1966).
[6] L. GEISSINGER and W. GRAVES, The category of complete algebraic lattices. J. Combinatorial Theory (A) 13, 332-338 (1972).
[7] G. GRÄTZER, Lattice Theory. San Francisco 1971.
[8] C. GREENE, On the Möbius Algebra of a Partially Ordered Set. Proc. Conf. on Möbius Algebras, University of Waterloo 1971.
[9] P. R. HALMOS, Measure Theory. Princeton 1950.
[10] P. HILTON and S. WYLIE, Homology Theory. Cambridge 1960.
[11] V. KLEE, The Euler characteristic in combinatorial geometry. Amer. Math. Monthly 70, 119-127 (1963).
[12] A. HORN and A. TARSKI, Measures in Boolean algebras. Trans. Amer. Math. Soc. 64, 467-497 (1948).
[13] R. LARSON and M. SWEEDLER, An associative orthogonal bilinear form for Hopf algebras. Amer. J. Math. 91, 75-94 (1969).
[14] H. M. MACNEILLE, Partially ordered sets. Trans. Amer. Math. Soc. 42, 416-460 (1937).
[15] B. PETTIS, Remarks on the extension of lattice functionals. Bull. Amer. Math. Soc. 64, 471 (1948).
[16] B. PETTIS, On the extension of measures. Ann. of Math. 64, 186-197 (1951).
[17] H. RASIOWA and R. SIKORSKI, The Mathematics of Metamathematics. Warsaw 1963.
[18] G.-C. ROTA, On the foundations of combinatorial theory. I. Z. Wahrscheinlichkeitstheorie Verw. Gebiete 2, 340-368 (1964).
[19] G.-C. ROTA, On the combinatorics of the Euler characteristic. In: Studies in Pure Mathematics, pp. 221-223. London 1971.
[20] L. SOLOMON, The Burnside algebra of a finite group. J. Combinatorial Theory 2, 603-615 (1967).
[21] E. SPANIER, Algebraic Topology. New York 1966.

Received 2 October 1972
Author's address: Ladnor Geissinger, Mathematics Department, University of North Carolina, Chapel Hill, North Carolina 27514, USA
Offprint from ARCHIV DER MATHEMATIK, Vol. XXIV, 1973, Fasc. 5. BIRKHÄUSER VERLAG, BASEL UND STUTTGART

Valuations on Distributive Lattices III
By LADNOR GEISSINGER
Introduction. In Parts I and II [Arch. Math. 24, 230-239, 337-345 (1973)] we were principally interested in combinatorial applications of the valuation ring of a distributive lattice. We now show how this ring provides a natural setting for some elementary results in measure theory as well as some classical results on representations of distributive lattices. Specifically, in the valuation ring $V(L)$ of a distributive lattice $L$ it is easy to identify various extensions of $L$ as well as prime ideals of $L$ and so arrive at some theorems of Pettis, Birkhoff, and Stone. For any faithful representation of $L$ as a lattice of sets the extension of $V(L)$ by real (or complex) scalars is naturally isomorphic to the algebra of simple functions, and the sup norm on the functions comes from an intrinsic norm on $V_R(L)$. The Stone space of $L$ corresponds to the spectrum of $V_R(L)$ with the Zariski topology.

5. Extensions of a Distributive Lattice. It is well-known [9] that the ring (family closed under union and difference) generated by a lattice $L$ of sets consists of all the finite disjoint unions of differences $E - F$ of elements of $L$. The existence and categorical properties of this and other minimal extensions of a distributive lattice $L$ can be easily deduced using the valuation ring $V(L)$.

We noted before that if $w \leq x \leq y$ in $L$ then the element $w + y - x$ in $V(L)$ acts like the relative complement of $x$ in the interval $[w, y]$ even if such a relative complement does not exist in $L$. If $w$, $x_i$, $y_i$ are elements of $L$ and $w \leq x_i \leq y_i$ then $(y_1 - x_1) \wedge (y_2 - x_2) = y_1 \wedge y_2 - (x_1 \wedge y_2) \vee (y_1 \wedge x_2)$ and $w \wedge (y_i - x_i) = 0$ in $V(L)$, so that both $y_i - x_i$ and $w + y_i - x_i$ are idempotent elements, $y_1 - x_1$ is orthogonal to $y_2 - x_2$ (the analogue of disjoint sets) iff $y_1 \wedge y_2 \leq x_1 \vee x_2$, and

$(w + y_1 - x_1) \wedge (w + y_2 - x_2) = w + y_1 \wedge y_2 - (x_1 \wedge y_2) \vee (y_1 \wedge x_2).$

Let $R(L)$ be the set of all elements of $V(L)$ of the form $s = w + \sum (y_i - x_i)$ where $w \leq x_i \leq y_i$ in $L$ for $1 \leq i \leq n$ and the $y_i - x_i$ are mutually orthogonal. If $v \leq w$ then also $s = v + (w - v) + \sum (y_i - x_i)$ and $w - v$ is orthogonal to all the $y_i - x_i$. For another element $r = w + \sum (q_j - p_j)$ in $R(L)$ expressed as above, $r \wedge s = w + \sum (y_i - x_i) \wedge (q_j - p_j)$ is again in $R(L)$ since the $(y_i - x_i) \wedge (q_j - p_j) = y_i \wedge q_j - (y_i \wedge p_j) \vee (x_i \wedge q_j)$ are orthogonal and all elements are above $w$ in $L$. Thus $R(L)$ is closed under the idempotent operation $\wedge$. To show that $R(L)$ is closed under $\vee$, first note that since the $y_i - x_i$ are orthogonal, $w + \sum (y_i - x_i) = \bigvee (w + y_i - x_i)$
and dually if $t \geq y_i \geq x_i$ for all $i$ then $t - \sum (y_i - x_i) = \bigwedge (t - y_i + x_i)$ and this element is also in $R(L)$. Now if $q \geq p \geq w$, and $s$ is as above,

$(w + q - p) \vee \bigl(w + \sum (y_i - x_i)\bigr) = w + (q - p) + \sum (y_i - x_i) - (q - p) \wedge \sum (y_i - x_i)$
$\qquad = \sum (y_i - x_i) + w + (q - p) \wedge \bigl(t - \sum (y_i - x_i)\bigr)$

for any $t$ in $L$ such that $t \geq q$ and $t \geq \bigvee y_i$, and since

$w + (q - p) \wedge \bigl(t - \sum (y_i - x_i)\bigr) = (w + q - p) \wedge \bigl(t - \sum (y_i - x_i)\bigr)$

is in $R(L)$ and $(q - p) \wedge (t - \sum (y_i - x_i))$ is orthogonal to all the $y_i - x_i$, then $(w + q - p) \vee s$ is in $R(L)$. It then follows that for any $r$, $s$ in $R(L)$ above $w$ in $L$, $r \vee s$ and $r \vee s - s + w = v$ are both again in $R(L)$ and above $w$. Thus $R(L)$ is a distributive lattice. In fact, $R(L)$ is "relatively complemented" because if $t \leq s \leq r$ in $R(L)$ there is a $w$ in $L$ such that $w \leq t$ and $v = r - s + w$ is in $R(L)$, so $t \vee v = r - s + t$ is in $R(L)$ and is the complement of $s$ in $[t, r]$.
Proposition 22. For every distributive lattice $L$, $R(L)$ is the unique minimal relatively complemented distributive extension of $L$. Every lattice homomorphism of $L$ into a relatively complemented distributive lattice $L'$ extends uniquely to a lattice homomorphism of $R(L)$ into $L'$.

Proof. If $\varphi: L \to L'$ is a lattice homomorphism it extends uniquely to a ring homomorphism $\varphi: V(L) \to V(L')$. So $\varphi$ extends to a lattice homomorphism of $R(L)$ into $R(L') = L'$ and the extension is unique since $R(L)$ is generated by relative complements $w + y - x$ with $w \leq x \leq y$ in $L$ and $\varphi(x)$ has a unique relative complement in the interval $[\varphi(w), \varphi(y)]$ in $L'$.

Note that for the augmentation homomorphism $\varepsilon: V(L) \to \mathbb{Z}$, $\varepsilon(R(L)) = 1$. Also, if $L$ has a zero and unit (or just a zero) then $R(L)$ is a Boolean (generalized Boolean) algebra with the same zero and unit. If $L$ does not have a zero or unit and we adjoin them to $L$ to get $L'$ then $R(L')$ will be the minimal (generalized) Boolean algebra generated by $L$, and $V(L')$ will have rank one or two more than $V(L)$. By our previous discussion of the universal properties of the map $L \to V(L)$ or by Propositions 16 and 22 it follows that the embedding $L \to R(L)$ induces an isomorphism $V(L) \cong V(R(L))$. This gives part of a result due to Pettis [16] (for real valuations see also Smiley, Trans. Amer. Math. Soc. 48 (1944)).
Corollary. If $\lambda$ is a valuation on $L$ into an abelian group $A$ it has a unique extension to a valuation on $R(L)$ into $A$. If $A$ is a partially ordered abelian group and $\lambda$ is monotone on $L$ then its extension to $R(L)$ is also monotone.

Proof. $\lambda: L \to A$ extends uniquely to a group homomorphism $\lambda: V(L) \to A$ which restricts to a valuation on $R(L)$. For $s \leq t$ in $R(L)$ there is a $w$ in $L$ such that $w \leq s$ and so $r = t - s + w = w + \sum (y_i - x_i)$ with the $y_i - x_i$ orthogonal and $y_i \geq x_i \geq w$ in $L$. Thus $\lambda(t) - \lambda(s) = \sum (\lambda(y_i) - \lambda(x_i))$, so if $\lambda$ is monotone on $L$ then it is also monotone on $R(L)$.
6. Representations and Prime Ideals. A form of the following statement seems to be a folk theorem of measure theory.

Theorem 3. If $L$ is a lattice of nonempty subsets of a set $X$ then $V(L)$ is naturally isomorphic to the ring of simple functions $S(L)$ generated over $\mathbb{Z}$ by the characteristic functions of the sets in $L$.

Proof. For each $A \in L$ let $c_A: X \to \mathbb{Z}$ be the characteristic function of the subset $A$ of $X$. Then $A \to c_A$ is a modular function from the lattice $L$ into the ring $S(L)$ which is multiplicative and hence extends uniquely to a ring homomorphism of $V(L)$ onto $S(L)$. To show it is a monomorphism it is necessary to show that if a relation $\sum d_i c_{A_i} = 0$ holds in $S(L)$ then also $\sum d_i A_i = 0$ holds in $V(L)$. If $M$ is a finite sublattice of $L$ containing $A_1, \dots, A_n$, it will be enough by Proposition 15 to show that $V(M)$ is isomorphic to $S(M)$. If $B$ is a join-irreducible element of $M$ and $B^0$ is the maximal element of $M$ properly contained in $B$, then an element $x \in B \setminus B^0$ is not in any $A \in M$ for which $A \not\supseteq B$. So $c_B$ is independent of $c_A$ for all $A \in M$, $A \not\supseteq B$. It follows as in the proof of Theorem 1 that the $c_B$ for all $B \in P(M)$ are independent and so $V(M)$ and $S(M)$ have the same rank. Since $V(M)$ maps onto $S(M)$ it follows that the map must be an isomorphism.

Another version of this result states that a valuation on such a lattice of sets $L$ extends uniquely to a group homomorphism on $S(L)$. The algebra of simple functions $S_R(L)$ generated over the real numbers $R$ by $S(L)$ is a subalgebra of the Banach algebra $B(X)$ of all bounded real-valued functions on $X$ with the sup-norm. The norm on $S_R(L)$ then yields a norm on $V_R(L) = V(L) \otimes R$ which we shall show is intrinsic, that is, can be defined using only the embedding of $L$ into $V_R(L)$. We defer the definition until after a discussion of prime ideals.
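The two properties of $A \to c_A$ used in the proof of Theorem 3, modularity and multiplicativity, can be seen concretely on a small lattice of sets. The sketch below assumes a hypothetical five-element lattice of nonempty subsets of $\{0, 1, 2, 3\}$; characteristic functions are stored as tuples of 0s and 1s, and the names are illustrative.

```python
# Hypothetical lattice of nonempty sets (an assumption), closed under union and intersection.
X = range(4)
L = [frozenset(s) for s in ({0}, {0, 1}, {0, 2}, {0, 1, 2}, {0, 1, 2, 3})]


def char(a):
    """Characteristic function of the subset a, as a tuple indexed by the points of X."""
    return tuple(1 if x in a else 0 for x in X)


def is_lattice_of_sets(lattice):
    """Check closure under pairwise union and intersection."""
    members = set(lattice)
    return all((a | b) in members and (a & b) in members for a in lattice for b in lattice)


def modular_and_multiplicative(lattice):
    """Check c_{A u B} + c_{A n B} = c_A + c_B and c_{A n B} = c_A * c_B pointwise."""
    for a in lattice:
        for b in lattice:
            ca, cb = char(a), char(b)
            join, meet = char(a | b), char(a & b)
            if any(j + m != p + q for j, m, p, q in zip(join, meet, ca, cb)):
                return False
            if any(m != p * q for m, p, q in zip(meet, ca, cb)):
                return False
    return True
```

In this example the only element that is not join-irreducible is $\{0, 1, 2\}$, and its characteristic function is $c_{\{0,1\}} + c_{\{0,2\}} - c_{\{0\}}$, mirroring the relation $x \vee y = x + y - x \wedge y$ in $V(L)$.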
If $T$ is a proper prime ideal in the ring $V_R(L)$ then $T \not\supseteq L$ since $L$ generates $V_R(L)$ as an $R$-algebra, and if $T$ and $L$ were disjoint then for all $A \leq B$ in $L$, $A \wedge (B - A) = 0$ so $B - A$ is in $T$, which means that $T = I$, the augmentation ideal. Thus if $T$ is not the augmentation ideal then $T \cap L$ is a proper (i.e. not $L$ and nonempty) prime ideal of $L$. Also for all $A \leq B$ in $L$, if $A$ is not in $T$ then $B - A$ is in $T$, so that in $V_R(L)/T$ all elements of the prime filter $L \setminus (T \cap L)$ are identified to the unit of the quotient algebra, which is then isomorphic to $R$. Thus all prime ideals of $V_R(L)$ are maximal and are the kernels of multiplicative linear functionals $V_R(L) \to R$. If $P$ is a prime ideal of $L$, the valuation $v_P$ which takes the value 0 on $P$ and 1 on $L \setminus P$ extends to a multiplicative linear functional on $V_R(L)$. Thus the prime ideals of $V_R(L)$ other than $I$ correspond bijectively to the proper prime ideals (or filters) of $L$, and hence also of $R(L)$. The next proposition implies the existence of enough prime ideals to separate elements of $L$ or $R(L)$ and even stronger separation properties.

Proposition 23. If $M$ is a sublattice of $L$ then every prime ideal of $V_R(M)$ (or $M$) lifts to a prime ideal of $V_R(L)$ (or $L$).

Proof. Let $T$ be a prime ideal of $V_R(M)$ not the augmentation ideal, $A \in M \setminus (T \cap M)$, and $U$ the ideal in $V_R(L)$ generated by $T \cap M$. If $A$ were in $U$ then $A$ would be a linear combination of elements $B_i \wedge E_i$, $i = 1, \dots, n$, where $B_i \in T \cap M$ and $E_i \in L$. But $\bigvee (B_i \wedge E_i) \leq \bigvee B_i$ and $\bigvee B_i \not\geq A$, so from our earlier results about independence, $A$ must be independent of the $B_i \wedge E_i$. Thus no element of $M \setminus (T \cap M)$ is contained in the ideal $U$. Also $M \setminus (T \cap M)$ is closed under $\wedge$ since it is a filter in $M$. It is well-known, and easily proven, that an ideal in a commutative ring which is disjoint from a multiplicatively closed system is contained in a prime ideal with the same property. Thus there is a prime ideal $W$ of $V_R(L)$ which contains $U$ and is disjoint from $M \setminus (T \cap M)$. Since $W \cap V_R(M)$ is a prime ideal in $V_R(M)$ which contains $T \cap M$ and is disjoint from $M \setminus (T \cap M)$, it is clear that $W \cap V_R(M) = T$. Of course if $T$ is the augmentation ideal of $V_R(M)$ it is contained in the augmentation ideal of $V_R(L)$.

Corollary (Birkhoff-Stone [2, 22]). If $A$ is an ideal and $B$ a filter in a distributive lattice $L$, and if $A$ and $B$ are disjoint, there is a prime ideal $T$ such that $T \supseteq A$ and $(L \setminus T) \supseteq B$.

Proof. Apply the theorem with $M = A \cup B$ and $T$ the prime ideal of $V_R(M)$ corresponding to the prime ideal $A$ in the lattice $M$, that is, $T$ is generated by $A$ and all $b' - b$ with $b, b' \in B$.

From this corollary with $A$, $B$ single elements, the non-topological part of Stone's representation theorem [17, 22] follows immediately. That is, if we associate with each element $b$ of $L$ the set of proper prime ideals of $L$ not containing $b$ then this is a faithful representation of $L$ as a lattice of subsets of the set of all prime ideals. We shall now compare the topology introduced by Stone [22, 17] on the set of prime ideals of $L$ with the usual Zariski topology on the prime spectrum of the ring $V_R(L)$.

For any subset $A \subseteq V_R(L)$ let $\mathcal{O}(A)$ be the set of all prime ideals of $V_R(L)$ which do not contain $A$. Then $\mathcal{O}(A) = \mathcal{O}(B)$ if either $B$ is the ideal in $V_R(L)$ generated by $A$, or, in case $A \subseteq L$, $B$ is the ideal in $L$ generated by $A$. The $\mathcal{O}(A)$ are precisely the open sets in the Zariski topology on the set $X$ of prime ideals of $V_R(L)$. A base for this topology consists of the sets $\mathcal{O}(\alpha)$ for all $\alpha \in V_R(L)$.
Suppose M is a finite sublattice of L and let ep for all p E P(M) denote the orthogonal idempotents ZM and p - po introduced earlier. It follows from Theorem 2 and Prop. 22 that R(M) consists of all elements ez :Lep(pEA) for all subsets A ~ P(M). For 1X=:Ldpep in VR(M) let eO(=ez+:Lep(p*z, dp*O), then eO(ER(M)~R(L). Moreover, if IX rf= I then IX and eO( generate the same ideal in VR(M) and hence also in VR(L), whereas if IX E I the same is true of IX and eO( - ez . Thus for IX E VR(M), if IX rf= I then @(IX) = @(eO(), and if IX E I then @(IX) = {T EX: ZM E T and eO( rf= T}. In thc latter case if L has a zero we may assume ZM is the zero of L, then
+
@(IX) = @(eO()\{I}.
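The representation of a lattice by sets of prime ideals described above can be checked by brute force on a small example. The following is only an illustrative sketch, not part of the original argument. It assumes a toy lattice (the divisors of 12, with meet = gcd and join = lcm), and it assumes the convention that a prime ideal is a possibly empty ideal whose complement is a nonempty filter. The names `ELEMENTS`, `is_prime_ideal`, `PRIME_IDEALS`, and `represent` are hypothetical.

```python
from itertools import combinations
from math import gcd

# Assumed toy example: the divisors of 12 form a distributive lattice
# under divisibility, with meet = gcd and join = lcm.
ELEMENTS = (1, 2, 3, 4, 6, 12)


def meet(a, b):
    return gcd(a, b)


def join(a, b):
    return a * b // gcd(a, b)


def is_prime_ideal(p):
    """Assumed convention: p is a (possibly empty) ideal whose complement
    is a nonempty filter, so the whole lattice is excluded."""
    complement = set(ELEMENTS) - p
    if not complement:
        return False
    for a in p:
        # Downward closed: every x <= a equals meet(a, x) for some x.
        if any(meet(a, x) not in p for x in ELEMENTS):
            return False
        # Closed under join.
        if any(join(a, b) not in p for b in p):
            return False
    # Prime: the complement is closed under meet.
    return all(meet(a, b) in complement for a in complement for b in complement)


# Enumerate all subsets and keep the prime ideals.
PRIME_IDEALS = [
    frozenset(s)
    for r in range(len(ELEMENTS) + 1)
    for s in combinations(ELEMENTS, r)
    if is_prime_ideal(set(s))
]


def represent(b):
    """Set of prime ideals that do not contain b."""
    return frozenset(p for p in PRIME_IDEALS if b not in p)


# The map b -> represent(b) should be injective and turn meet and join
# into intersection and union.
injective = len({represent(b) for b in ELEMENTS}) == len(ELEMENTS)
homomorphism = all(
    represent(meet(a, b)) == represent(a) & represent(b)
    and represent(join(a, b)) == represent(a) | represent(b)
    for a in ELEMENTS
    for b in ELEMENTS
)
```

Under these assumptions both `injective` and `homomorphism` should come out true, which is the finite shadow of the faithful representation obtained from the corollary.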
Theorem 4. Let $L$ be a distributive lattice with zero and let $Y$ be the set of all prime ideals of $V_R(L)$ except for the augmentation ideal, that is, $Y$ is the prime spectrum of the ring $V_R(L)/(z)$. For each nonempty ideal $A$ of the lattice $R(L)$ let $U(A) = \{T \in Y : T \nsupseteq A\}$. Then the map $A \mapsto U(A)$ is an isomorphism of the lattice of ideals of $R(L)$ onto the lattice of all open sets in the Zariski topology on $Y$. The sets $U(a)$ for all $a \in R(L)$ are precisely the open compact subsets of $Y$, and the topology of $Y$ is Hausdorff, locally compact, and totally disconnected. $Y$ is compact iff $L$ has a unit.
Proof. Our computation above shows that for any $\alpha \in V_R(L)$ there is an $e_\alpha \in R(L)$ such that $U(\alpha) = U(e_\alpha)$. Now for any ideal $A$ in $R(L)$, $A$ is the union of principal ideals generated by the elements $b \in A$ and $U(A) = \bigcup U(b)$. It is easy to see that every open set is of the form $U(A)$ for some ideal $A$ in $R(L)$, since for a finite collection $b_1, \ldots, b_n$ of elements of $R(L)$, $U(\bigvee b_j) = \bigcup U(b_j)$. The corollary to Proposition 23 implies that the correspondence is one-to-one. The fact that finitely generated ideals are principal translates into the statement that the only open compact sets are the $U(b)$ for $b$ in $R(L)$. The topology is Hausdorff because $R(L)$ is a generalized Boolean algebra, and the remaining assertions are easily checked.

This representation of $L$ or $R(L)$ by the subsets $U(a)$ of $Y$ differs slightly from the situation described in Theorem 3 since $U(z)$ is the empty set. Here $V_R(L)/(z)$ is isomorphic to the ring $S_R(L)$ of simple functions on $Y$ generated over $R$ by the characteristic functions of the $U(a)$ for all $a \in L$. For $\alpha \in V_R(L)$ or $V_R(L)/(z)$ let $c_\alpha$ denote the corresponding function in $S_R(L)$. Then for any prime ideal $y \in Y$, $c_\alpha(y) = \lambda_y(\alpha)$ where $\lambda_y : V_R(L)/(z) \to R$ is the algebra homomorphism with kernel $y$. Thus the sup-norm on $S_R(L)$ yields a norm on $V_R(L)/(z)$ given by

$$\|\alpha\| = \max\{|\lambda_y(\alpha)| : y \in Y\}.$$
Clearly the functions in $S_R(L)$ are continuous on $Y$, separate points, have bounded support, and for every point of $Y$ there is a function which does not vanish there. Thus by the Stone-Weierstrass theorem, the completion of $S_R(L)$ is the space $C_0(Y)$ of all continuous functions which vanish at $\infty$. Note that if the characteristic function of a set $A \subseteq Y$ is in $C_0(Y)$ then $A$ must be open and compact and so $A = U(a)$ for some $a \in R(L)$, and the function is already in $S_R(L)$. This shows that $R(L)$, but usually not $L$, can be recovered from $V_R(L)$, namely as those elements $\alpha \in V_R(L)$ for which $\lambda_y(\alpha) = 0$ or $1$ for all $y \in Y$ and $\varepsilon(\alpha) = 1$.

Now it is well known that the continuous linear functionals on $C_0(Y)$ correspond to bounded regular Borel measures on $Y$, but we can describe them more simply as follows. Any element $\alpha \in V_R(L)$ can be expressed as

$$\alpha = \sum_{i=1}^{n} d_i b_i + d z$$

where $b_i \in R(L)$, $b_i \neq z$, and $b_i \wedge b_k = z$ for $i \neq k$. Then for any prime ideal $y \in Y$, at most one of the $b_i$ is not in $y$, in which case $\lambda_y(\alpha) = d_i$, and for each $b_i$ there is a prime ideal which does not contain it. Thus $\|\alpha\| = \max\{|d_i|\}$ and so the unit ball in $V_R(L)/(z)$ consists of elements $\alpha = \sum_{i=1}^{n} d_i b_i$ with the $b_i$ as above and $|d_i| \le 1$.

Proposition 24. The linear extension of a bounded $R$-valued modular function $v$ on $R(L)$ with $v(z) = 0$ is continuous on $V_R(L)/(z)$, hence extends uniquely to a continuous linear functional on the completion of $V_R(L)/(z)$.

Proof. Now $v$ is the difference of two bounded nonnegative finitely additive measures, thus we may assume $v$ is nonnegative, finitely additive, and $v(z) = 0$.
If $\|\alpha\| \le 1$ with $\alpha = \sum d_i b_i$ as above, then

$$|v(\alpha)| = \Bigl|\sum d_i\, v(b_i)\Bigr| \le \sum |d_i|\, v(b_i) \le \sum v(b_i) = v\Bigl(\bigvee b_i\Bigr),$$

where the last equality holds because the $b_i$ are pairwise disjoint and $v(z) = 0$. So a bound for $v$ on $R(L)$ is also a bound for the linear extension of $v$ on $V_R(L)/(z)$.

Theorem 5 (Tarski-Pettis [12,15]). Let $M$ be a relatively complemented sublattice of a distributive lattice $L$ and suppose both contain $z$. Then any bounded finitely additive function $v : M \to R$ with $v(z) = 0$ can be extended to a function on $L$ with the same properties. Moreover, for any $b \in L \setminus M$, the value $v(b)$ may be prescribed arbitrarily.

Proof. The injection $V(M) \to V(L)$ induces an isometry $V_R(M)/(z) \to V_R(L)/(z)$ by Prop. 23. Apply the Hahn-Banach theorem to extend the bounded (by Prop. 24) functional $v$ on $V_R(M)/(z)$ to a bounded functional on the completion of $V_R(L)/(z)$. Finally, since $M$ is a generalized Boolean algebra, $M = R(M)$ and, as we saw above, no other element of $R(L)$ except those already in $R(M)$ can be in the completion (closure) of $V_R(M)/(z)$.

Corollary. Even if $M$ is not relatively complemented the conclusion holds provided $v$ is nonnegative, monotone, and bounded.

Proof. From the Corollary to Prop. 22 the unique extension of $v$ to $R(L)$ is nonnegative, and it is easily seen that it is bounded.

Finally, if we return to the situation in Theorem 3 where $L$ is (or is represented by) a lattice of nonempty subsets of a set, then $V_R(L) \cong S_R(L)$. For any elements $A_1, \ldots, A_n$ of $L$, any finite sublattice $M$ of $L$ containing all $A_i$, and any $x \in B \setminus B^0$ where $B \in P(M)$ and $B^0$ is maximal in $M$ less than $B$ (or any $x \in B$ if $B = z_M$),

$$\sum d_i\, c_{A_i}(x) = \sum d_i\, (x \in A_i) = \sum d_i\, (B \subseteq A_i) = v_B\Bigl(\sum d_i A_i\Bigr),$$

where $c_{A_i}$ is the characteristic function of $A_i$, a bracketed condition has the value 1 if it holds and 0 otherwise, and, as before, $v_B(A)$ is 1 if $A \supseteq B$ and 0 otherwise; that is, the $v_B$ are precisely all multiplicative linear functionals on $V_R(M)$. Furthermore, for any $x$ which is in some $A_i$, the minimal element of $M$ which contains $x$ is such a join-irreducible element $B$. Thus the values of the function $\sum d_i\, c_{A_i}$ are the numbers $v_B(\sum d_i A_i)$ for all $B \in P(M)$. Since by Prop. 23 each of these extends to all of $V_R(L)$, the sup-norm of $S_R(L)$ can be defined intrinsically in $V_R(L)$ by

$$\Bigl\|\sum d_i A_i\Bigr\| = \max\Bigl\{\Bigl|v_B\Bigl(\sum d_i A_i\Bigr)\Bigr| : B \in P(M)\Bigr\}$$

for any finite sublattice $M$ of $L$ containing the $A_i$. If $\varphi : L \to L'$ is any lattice homomorphism of distributive lattices then it is easy to see that the induced map $\varphi : V_R(L) \to V_R(L')$ is norm-decreasing, and by Prop. 23 it is an isometry iff $\varphi$ is an injection. Completing $V_R(L)$ for each $L$ yields a functor from the category of homomorphisms of distributive lattices to the category of (norm-decreasing) homomorphisms of commutative Banach algebras.
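The intrinsic description of the sup-norm can be tested numerically on a small family of sets. The sketch below is an illustration under stated assumptions, not the author's construction. It takes $M$ to be the sublattice generated by the given sets under union and intersection, assumes the sets have a common point so that every member of $M$ is nonempty, and takes the supremum of the simple function over the points of the union only. All function names and the sample data `EXAMPLE_SETS` are hypothetical.

```python
from itertools import combinations


def generated_sublattice(sets):
    """Close a family of finite sets under pairwise union and intersection."""
    family = {frozenset(s) for s in sets}
    changed = True
    while changed:
        changed = False
        for a, b in combinations(list(family), 2):
            for c in (a | b, a & b):
                if c not in family:
                    family.add(c)
                    changed = True
    return family


def join_irreducibles(family):
    """Members that are not the union of the members strictly below them.
    The bottom element is included, matching the convention B = z_M."""
    result = []
    for b in family:
        below = [a for a in family if a < b]
        if frozenset().union(*below) != b:
            result.append(b)
    return result


def sup_norm_pointwise(coeffs, sets):
    """Sup of |sum d_i * c_{A_i}(x)| over the points x of the union."""
    points = set().union(*sets)
    return max(
        abs(sum(d for d, a in zip(coeffs, sets) if x in a)) for x in points
    )


def sup_norm_intrinsic(coeffs, sets):
    """Max of |v_B(sum d_i A_i)| over join-irreducible B, where
    v_B(A) = 1 if B is contained in A and 0 otherwise."""
    lattice = generated_sublattice(sets)
    return max(
        abs(sum(d for d, a in zip(coeffs, sets) if b <= frozenset(a)))
        for b in join_irreducibles(lattice)
    )


# Hypothetical sample data: three sets sharing the point 2.
EXAMPLE_SETS = [{1, 2}, {2, 3}, {2, 3, 4}]
EXAMPLE_COEFFS = [2, -3, 1]

pointwise = sup_norm_pointwise(EXAMPLE_COEFFS, EXAMPLE_SETS)
intrinsic = sup_norm_intrinsic(EXAMPLE_COEFFS, EXAMPLE_SETS)
```

For this sample the two computations should agree, as the formula predicts. Integer coefficients are used so that the comparison is exact.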
References

[1] R. BALBES, Projective and injective distributive lattices. Pacific J. Math. 21, 405-420 (1967).
[2] G. BIRKHOFF, Lattice Theory. Providence 1967.
[3] H. CRAPO and G.-C. ROTA, Combinatorial Geometries. Cambridge 1970.
[4] R. L. DAVIS, Order Algebras. Bull. Amer. Math. Soc. 76, 83-87 (1970).
[5] J. FOLKMAN, The homology groups of a lattice. J. Math. and Mech. 15, 631-636 (1966).
[6] L. GEISSINGER and W. GRAVES, The category of complete algebraic lattices. J. Combinatorial Th. (A) 13, 332-338 (1972).
[7] G. GRÄTZER, Lattice Theory. San Francisco 1971.
[8] C. GREENE, On the Möbius Algebra of a Partially Ordered Set. Proc. Conf. on Möbius Algebras, University of Waterloo 1971.
[9] P. R. HALMOS, Measure Theory. Princeton 1950.
[10] P. HILTON and S. WYLIE, Homology Theory. Cambridge 1960.
[11] V. KLEE, The Euler characteristic in combinatorial geometry. Amer. Math. Monthly 70, 119-127 (1963).
[12] A. HORN and A. TARSKI, Measures in Boolean algebras. Trans. Amer. Math. Soc. 64, 467-497 (1948).
[13] R. LARSON and M. SWEEDLER, An associative orthogonal bilinear form for Hopf algebras. Amer. J. Math. 91, 75-94 (1969).
[14] H. M. MACNEILLE, Partially ordered sets. Trans. Amer. Math. Soc. 42, 416-460 (1937).
[15] B. PETTIS, Remarks on the extension of lattice functionals. Bull. Amer. Math. Soc. 54, 471 (1948).
[16] B. PETTIS, On the extension of measures. Ann. of Math. 54, 186-197 (1951).
[17] H. RASIOWA and R. SIKORSKI, The Mathematics of Metamathematics. Warsaw 1963.
[18] G.-C. ROTA, On the foundations of combinatorial theory. I. Z. Wahrscheinlichkeitstheorie Verw. Gebiete 2, 340-368 (1964).
[19] G.-C. ROTA, On the combinatorics of the Euler characteristic. In: Studies in Pure Math., pp. 221-233. London 1971.
[20] L. SOLOMON, The Burnside algebra of a finite group. J. Combinatorial Th. 2, 603-615 (1967).
[21] E. SPANIER, Algebraic Topology. New York 1966.
[22] M. H. STONE, Topological representations of distributive lattices and Brouwerian logics. Časopis Math. Fys. 67, 1-25 (1937).

Received 29 January 1973

Author's address: Ladnor Geissinger, Mathematics Department, University of North Carolina, Chapel Hill, North Carolina 27514, USA