Classic Papers in Combinatorics (Modern Birkhäuser Classics)

Author: Ira Gessel | Gian-Carlo Rota

Permissions

Birkhäuser Boston thanks the original publishers and authors of the following papers for granting permission to reprint specific papers in this collection.

[1930] Reprinted from Proc. London Math. Soc. (2) 30, © 1930 by the London Mathematical Society.
[1932] Reprinted from Trans. Amer. Math. Soc. 34, © 1932 by American Mathematical Society.
[1935a] Reprinted from Compositio Math. 2, © 1935 by Martinus Nijhoff Publishers BV.
[1935b] Reprinted from J. London Math. Soc. 10, © 1935 by the London Mathematical Society.
[1935c] Reprinted from Amer. J. Math. 57, © 1935 by the Johns Hopkins University Press.
[1940] Reprinted from Duke Math. J. 7, © 1940 Duke University Press.
[1941] Reprinted from Proc. Cambridge Phil. Soc. 37, © 1941 by Cambridge University Press.
[1943] Reprinted from Bull. Amer. Math. Soc. 49, © 1943 by American Mathematical Society.
[1947] Reprinted from Proc. Cambridge Phil. Soc. 43, © 1947 by Cambridge University Press.
[1950a] Reprinted from Ann. of Math. (2) 51, © 1950 by Princeton University Press.
[1950b] Reprinted from Amer. J. Math. 72, © 1950 by the Johns Hopkins University Press.
[1951] Reprinted from Simon Stevin 28, © 1951 by Simon Stevin.
[1952] Reprinted from Canad. J. Math. 4, © 1952 by Canadian Journal of Mathematics.
[1956a] Reprinted from Bull. Amer. Math. Soc. 62, © 1956 by the American Mathematical Society.
[1956b] Reprinted from Canad. J. Math. 8, © 1956 by Canadian Journal of Mathematics.
[1956c] Reprinted from Amer. Math. Monthly 63, © 1956 by The Mathematical Association of America.
[1957a] Reprinted from Pacific J. Math. 7, © 1957 by Pacific Journal of Mathematics.
[1957b] Reprinted from Canad. J. Math. 9, © 1957 by Canadian Journal of Mathematics.
[1959] Reprinted from Canad. J. Math. 11, © 1959 by Canadian Journal of Mathematics.
[1961a] Reprinted from Physica 27, © 1961 by North-Holland Physics Publishing.
[1961b] Reprinted from Canad. J. Math. 13, © 1961 by Canadian Journal of Mathematics.
[1962] Reprinted from Proc. Amer. Math. Soc. 13, © 1962 by American Mathematical Society.
[1963a] Reprinted from Trans. Amer. Math. Soc. 106, © 1963 by American Mathematical Society.
[1963b] Reprinted from Proc. Cambridge Phil. Soc. 59, © 1963 by Cambridge University Press.
[1964] Reprinted from Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 2, © 1964 by Springer-Verlag Heidelberg.
[1965] Reprinted from Canad. J. Math. 17, © 1965 by Canadian Journal of Mathematics.
[1966a] Reprinted from Proc. of the Colloquium held at Tihany, Hungary, Sept. 1966, © 1968 by Akadémiai Kiadó, Publishing House of the Hungarian Academy of Sciences.
[1966b] Reprinted from J. Combinatorial Theory 1, © 1966 by Academic Press.
[1968] Reprinted from Arch. Math. 19, © 1968 by Birkhäuser Verlag.
[1969] Reprinted from J. Combinatorial Theory 7, © 1969 by Academic Press.
[1970a] Reprinted from J. Math. Phys. 11, © 1970 by American Institute of Physics.
[1970b] Reprinted from Advances in Math. 5, © 1970 by Academic Press.
[1972a] Reprinted from Advances in Math. 8, © 1972 by Academic Press.
[1972b] Reprinted from J. Combinatorial Theory Ser. B 13, © 1972 by Academic Press.
[1972c] Reprinted from J. Combinatorial Theory Ser. B 13, © 1972 by Academic Press.
[1973a] Reprinted from Discrete Math. 5, © 1973 by North-Holland Elsevier Science Publishers B.V.
[1973b] Reprinted from Arch. Math. 24, © 1973 by Birkhäuser Verlag.
[1973c] Reprinted from Arch. Math. 24, © 1973 by Birkhäuser Verlag.
[1973d] Reprinted from Arch. Math. 24, © 1973 by Birkhäuser Verlag.

Modern Birkhäuser Classics

Many of the original research and survey monographs in pure and applied mathematics published by Birkhäuser in recent decades have been groundbreaking and have come to be regarded as foundational to the subject. Through the MBC Series, a select number of these modern classics, entirely uncorrected, are being re-released in paperback (and as eBooks) to ensure that these treasures remain accessible to new generations of students, scholars, and researchers.

Classic Papers in Combinatorics

Ira Gessel Gian-Carlo Rota

Reprint of the 1987 Edition

Birkhäuser Boston • Basel • Berlin

Editors Ira Gessel Brandeis University Department of Mathematics Waltham, MA 02454 USA

Gian-Carlo Rota (Deceased) Massachusetts Institute of Technology (MIT) Department of Mathematics Cambridge, MA 02139 USA

ISBN: 978-0-8176-4841-1
e-ISBN: 978-0-8176-4842-8
DOI: 10.1007/978-0-8176-4842-8
Library of Congress Control Number: 2008939075
Mathematics Subject Classification (2000): 68Rxx, 68R05, 05-xx

© 2009 Birkhäuser Boston, a part of Springer Science+Business Media, LLC

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Birkhäuser Boston, c/o Springer Science+Business Media, LLC, 233 Spring Street, New York, NY, 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

Printed on acid-free paper. 9 8 7 6 5 4 3 2 1 www.birkhauser.com

Classic Papers in Combinatorics Edited by Ira Gessel Gian-Carlo Rota

Birkhäuser Boston · Basel · Stuttgart

Ira Gessel
Department of Mathematics
Brandeis University
Waltham, MA 02454
U.S.A.

Gian-Carlo Rota
Department of Mathematics
Massachusetts Institute of Technology
Cambridge, MA 02139
U.S.A.

Library of Congress Cataloging in Publication Data
Classic papers in combinatorics. I. Combin

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without prior permission of the copyright owner.

© Birkhäuser Boston, 1987
ISBN 0-8176-3364-2
ISBN 3-7643-3364-2
Printed and bound by Quinn-Woodbine Inc., Woodbine, New Jersey.
Printed in the U.S.A.
9 8 7 6 5 4 3 2 1

We wish to thank the following persons who have made valuable suggestions of papers to be included in this volume: George Andrews, Richard Brualdi, Andrew Gleason, Phil Hanlon, David Jackson, Jeff Kahn, Adalbert Kerber, Joseph Kung, and Richard Stanley.

Contents

[1930] F. P. Ramsey, On a problem of formal logic [Proc. London Math. Soc. (2) 30, 264-286]
[1932] H. Whitney, Non-separable and planar graphs [Trans. Amer. Math. Soc. 34, 339-362]
[1935a] P. Erdős and G. Szekeres, A combinatorial problem in geometry [Compositio Math. 2, 463-470]
[1935b] P. Hall, On representatives of subsets [J. London Math. Soc. 10, 26-30]
[1935c] H. Whitney, On the abstract properties of linear dependence [Amer. J. Math. 57, 509-533]
[1940] R. L. Brooks, C. A. B. Smith, A. H. Stone, and W. T. Tutte, The dissection of rectangles into squares [Duke Math. J. 7, 312-340]
[1941] R. L. Brooks, On colouring the nodes of a network [Proc. Cambridge Phil. Soc. 37, 194-197]
[1943] I. Kaplansky, Solution of the "problème des ménages" [Bull. Amer. Math. Soc. 49, 784-785]
[1947] W. T. Tutte, A ring in graph theory [Proc. Cambridge Phil. Soc. 43, 26-40]
[1950a] R. P. Dilworth, A decomposition theorem for partially ordered sets [Ann. of Math. 51, 161-166]
[1950b] P. R. Halmos and H. E. Vaughan, The marriage problem [Amer. J. Math. 72, 214-215]
[1951] T. van Aardenne-Ehrenfest and N. G. de Bruijn, Circuits and trees in oriented linear graphs [Simon Stevin 28, 203-217]
[1952] W. T. Tutte, The factors of graphs [Canad. J. Math. 4, 314-328]
[1956a] P. Erdős and R. Rado, A partition calculus in set theory [Bull. Amer. Math. Soc. 62, 427-489]
[1956b] L. R. Ford, Jr., and D. R. Fulkerson, Maximal flow through a network [Canad. J. Math. 8, 399-404]
[1956c] G. Pólya, On picture-writing [Amer. Math. Monthly 63, 689-697]
[1957a] D. Gale, A theorem on flows in networks [Pacific J. Math. 7, 1073-1082]
[1957b] H. J. Ryser, Combinatorial properties of matrices of zeros and ones [Canad. J. Math. 9, 371-377]
[1959] P. Erdős, Graph theory and probability [Canad. J. Math. 11, 34-38]
[1961a] P. W. Kasteleyn, The statistics of dimers on a lattice: I. The number of dimer arrangements on a quadratic lattice [Physica 27, 1209-1225]
[1961b] C. Schensted, Longest increasing and decreasing subsequences [Canad. J. Math. 13, 179-191]
[1962] M. P. Schützenberger, On a theorem of R. Jungen [Proc. Amer. Math. Soc. 13, 885-890]
[1963a] A. W. Hales and R. I. Jewett, Regularity and positional games [Trans. Amer. Math. Soc. 106, 222-229]
[1963b] C. St. J. A. Nash-Williams, On well-quasi-ordering finite trees [Proc. Cambridge Phil. Soc. 59, 833-835]
[1964] G.-C. Rota, On the foundations of combinatorial theory I: Theory of Möbius functions [Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 2, 340-368]
[1965] J. Edmonds, Paths, trees, and flowers [Canad. J. Math. 17, 449-467]
[1966a] G. Katona, A theorem of finite sets [in Theory of Graphs: Proceedings of the Colloquium held at Tihany, Hungary, Sept. 1966, Academic Press and Akadémiai Kiadó, Budapest, 1968, pp. 187-207]
[1966b] D. Lubell, A short proof of Sperner's lemma [J. Combinatorial Theory 1, 299]
[1968] H. H. Crapo, Möbius inversion in lattices [Arch. Math. 19, 595-607]
[1969] G. F. Clements and B. Lindström, A generalization of a combinatorial theorem of Macaulay [J. Combinatorial Theory 7, 230-238]
[1970a] I. J. Good, Short proof of a conjecture by Dyson [J. Math. Phys. 11, 1884]
[1970b] D. J. Kleitman, On a lemma of Littlewood and Offord on the distributions of linear combinations of vectors [Advances in Math. 5, 155-157]
[1972a] R. L. Graham, K. Leeb, and B. L. Rothschild, Ramsey's theorem for a class of categories [Advances in Math. 8, 417-431]
[1972b] L. Lovász, A characterization of perfect graphs [J. Combinatorial Theory Ser. B 13, 95-98]
[1972c] L. Lovász, A note on the line reconstruction problem [J. Combinatorial Theory Ser. B 13, 309-310]
[1973a] R. P. Stanley, Acyclic orientations of graphs [Discrete Math. 5, 171-178]
[1973b] L. Geissinger, Valuations on distributive lattices I [Arch. Math. 24, 230-239]
[1973c] L. Geissinger, Valuations on distributive lattices II [Arch. Math. 24, 337-345]
[1973d] L. Geissinger, Valuations on distributive lattices III [Arch. Math. 24, 475-481]

Introduction

This volume surveys the development of combinatorics since 1930 by presenting in chronological order the fundamental results of the subject proved in the original papers.

We begin with the celebrated theorem of Ramsey [1930], originally developed to settle a special case of the decision problem for the predicate calculus with equality. It remains to this day the fundamental generalization of the classical pigeonhole principle. The paper by Erdős and Szekeres [1935a] was one of the first applications of Ramsey's theorem, and it is still one of the most elegant. Through the partition calculus of Erdős and Rado [1956a], Ramsey's theorem made inroads into set theory, where nowadays it holds the limelight. The next major advance along the lines initiated by Ramsey came with the work of Hales and Jewett [1963a], a result which has served as a foundation for much further work in the area. The categorical underpinning of Ramsey theory was worked out by Graham, Leeb, and Rothschild [1972a]. Here the original ideas of Ramsey are cleverly blended with the contribution of Hales and Jewett.

Whitney's paper [1932] marks the beginning of what is now the theory of matroids. Three years later the theory makes its appearance fully clad in another paper of Whitney [1935c] which remains the basic reference on the subject. The theory of matroids was also in the background of Tutte's paper [1947]. Tutte's paper is couched in the language of graphs and was later generalized to arbitrary matroids by Brylawski. The motivation behind much of the work of the two outstanding graph theorists of the day, Whitney and Tutte, was the coloring problem for graphs. Two short and elegant results on coloring problems are Brooks's theorem [1941] relating the chromatic number of a graph to its maximal degree and Lovász's theorem [1972b] characterizing perfect graphs.

Philip Hall's paper [1935b] was the first in what is now called matching theory. A very short proof of Hall's marriage theorem was given by Halmos and Vaughan [1950b]. In the same year Dilworth [1950a] proved his famous decomposition theorem for partially ordered sets, which generalizes Hall's theorem. Several other minimax combinatorial theorems can be viewed as variants or generalizations of the marriage theorem. Such are Tutte's definitive work on factors in graphs [1952], Ford and Fulkerson's theory of flows in networks [1956b], and Edmonds's [1965] efficient algorithm for matching in graphs. Gale [1957a] used network flow theory to prove a result on matrices of 0's and 1's with given row and column sums also proved directly by Ryser [1957b].

De Bruijn and van Aardenne-Ehrenfest [1951], taking their lead from the early work of Kirchhoff, obtained a definitive result, now called the BEST theorem (de Bruijn-Ehrenfest-Smith-Tutte) concerning the enumeration of spanning trees and Eulerian circuits of a graph by determinants. Several years later, Kasteleyn [1961a] succeeded in solving a packing problem for dimers on a lattice by reducing the problem to the evaluation of Pfaffians.

Pólya's paper on picture-writing [1956c] foreshadows the notion of the incidence algebra, a term introduced years later by Rota [1964] in his theory of Möbius functions. Rota's work was substantially extended by Crapo [1968]. The mystery of the characteristic polynomial, defined in terms of the Möbius function of a partially ordered set, motivates Stanley's beautiful result [1973a] on acyclic orientations of graphs. Geissinger's three papers [1973b-d] are the definitive presentation of the theory of Möbius functions.

The subject that is now called extremal set theory is represented by Katona's paper [1966a]. The main result, independently proved by Kruskal, harks back to a theorem of Macaulay, and was generalized by Clements and Lindström [1969]. Lubell's blitz proof of Sperner's theorem [1966b] has been extensively generalized and applied to many problems. Kleitman's solution [1970b] of a long-standing problem of Erdős related to the Littlewood-Offord problem shows the power of a simple, but far from obvious, induction argument.

Brooks, Smith, Stone, and Tutte's paper [1940] on the decomposition of rectangles into squares was the first to use Kirchhoff's laws for the solution of a problem in combinatorics, a technique that has since become standard.

Kaplansky's solution [1943] of the problème des ménages using the inclusion-exclusion principle has developed into what is now the theory of permutations with restricted position. Lovász's contribution [1972c] to the Ulam reconstruction problem is another ingenious use of the inclusion-exclusion principle.

Erdős's paper on graph theory and probability [1959] is the first paper to show how probabilistic methods can lead to combinatorial existence theorems. A substantial number of theorems in combinatorics, for which no explicit construction is known, can be given existence proofs by this method.

Schensted's bijection [1961b] between permutations and pairs of standard Young diagrams has proved central in the seemingly unrelated topics of plane partitions and representations of the symmetric group.

Schützenberger's paper [1962] lays the foundation of what is now the theory of rational and algebraic power series in noncommutative variables.

Nash-Williams's striking proof [1963b] that finite trees form a well-quasi-ordered set has blossomed both in logic and in graph theory.

I. J. Good's short proof [1970a] of a conjecture of Dyson has since been widely generalized but his approach is still the good one.

Ira M. Gessel
Gian-Carlo Rota

ON A PROBLEM OF FORMAL LOGIC

By F. P. RAMSEY.

[Received 28 November, 1928. Read 13 December, 1928.]

This paper is primarily concerned with a special case of one of the leading problems of mathematical logic, the problem of finding a regular procedure to determine the truth or falsity of any given logical formula*. But in the course of this investigation it is necessary to use certain theorems on combinations which have an independent interest and are most conveniently set out by themselves beforehand.

I.

The theorems which we actually require concern finite classes only, but we shall begin with a similar theorem about infinite classes which is easier to prove and gives a simple example of the method of argument.

THEOREM A. Let $\Gamma$ be an infinite class, and $\mu$ and $r$ positive integers; and let all those sub-classes of $\Gamma$ which have exactly $r$ members, or, as we may say, let all $r$-combinations of the members of $\Gamma$ be divided in any manner into $\mu$ mutually exclusive classes $C_i$ ($i = 1, 2, \ldots, \mu$), so that every $r$-combination is a member of one and only one $C_i$; then, assuming the axiom of selections, $\Gamma$ must contain an infinite sub-class $\Delta$ such that all the $r$-combinations of the members of $\Delta$ belong to the same $C_i$.

Consider first the case $\mu = 2$. (If $\mu = 1$ there is nothing to prove.) The theorem is trivial when $r$ is 1, and we prove it for all values of $r$ by induction. Let us assume it, therefore, when $r = \rho-1$ and deduce it for $r = \rho$, there being, since $\mu = 2$, only two classes $C_i$, namely $C_1$ and $C_2$.

* Called in German the Entscheidungsproblem; see Hilbert und Ackermann, Grundzüge der Theoretischen Logik, 72-81.


It may happen that $\Gamma$ contains a member $x_1$ and an infinite sub-class $\Gamma_1$, not including $x_1$, such that the $\rho$-combinations consisting of $x_1$ together with any $\rho-1$ members of $\Gamma_1$, all belong to $C_1$. If so, $\Gamma_1$ may similarly contain a member $x_2$ and an infinite sub-class $\Gamma_2$, not including $x_2$, such that all the $\rho$-combinations consisting of $x_2$ together with $\rho-1$ members of $\Gamma_2$, belong to $C_1$. And, again, $\Gamma_2$ may contain an $x_3$ and a $\Gamma_3$ with similar properties, and so on indefinitely. We thus have two possibilities: either we can select in this way two infinite sequences of members of $\Gamma$ $(x_1, x_2, \ldots, x_n, \ldots)$, and of infinite sub-classes of $\Gamma$ $(\Gamma_1, \Gamma_2, \ldots, \Gamma_n, \ldots)$, in which $x_n$ is always a member of $\Gamma_{n-1}$, and $\Gamma_n$ a sub-class of $\Gamma_{n-1}$ not including $x_n$, such that all the $\rho$-combinations consisting of $x_n$ together with $\rho-1$ members of $\Gamma_n$, belong to $C_1$; or else the process of selection will fail at a certain stage, say the $n$-th, because $\Gamma_{n-1}$ (or if $n = 1$, $\Gamma$ itself) will contain no member $x_n$ and infinite sub-class $\Gamma_n$ not including $x_n$ such that all the $\rho$-combinations consisting of $x_n$ together with $\rho-1$ members of $\Gamma_n$ belong to $C_1$. Let us take these possibilities in turn.

If the process goes on for ever let $\Delta$ be the class $(x_1, x_2, \ldots, x_n, \ldots)$. Then all these $x$'s are distinct, since if $r > s$, $x_r$ is a member of $\Gamma_{r-1}$ and so of $\Gamma_{r-2}$, $\Gamma_{r-3}$, ..., and ultimately of $\Gamma_s$ which does not contain $x_s$. Hence $\Delta$ is infinite. Also all $\rho$-combinations of members of $\Delta$ belong to $C_1$; for if $x_s$ is the term of such a combination with least suffix $s$, the other $\rho-1$ terms of the combination belong to $\Gamma_s$, and so form with $x_s$ a $\rho$-combination belonging to $C_1$. $\Gamma$ therefore contains an infinite sub-class $\Delta$ of the required kind.

Suppose, on the other hand, that the process of selecting the $x$'s and $\Gamma$'s fails at the $n$-th stage, and let $y_1$ be any member of $\Gamma_{n-1}$. Then the $(\rho-1)$-combinations of members of $\Gamma_{n-1} - (y_1)$ can be divided into two mutually exclusive classes $C_1'$ and $C_2'$ according as the $\rho$-combinations formed by adding to them $y_1$ belong to $C_1$ or $C_2$, and by our theorem (A), which we are assuming true when $r = \rho-1$ (and $\mu = 2$), $\Gamma_{n-1} - (y_1)$ must contain an infinite sub-class $\Delta_1$ such that all $(\rho-1)$-combinations of the members of $\Delta_1$ belong to the same $C_i'$; i.e. such that the $\rho$-combinations formed by joining $y_1$ to $\rho-1$ members of $\Delta_1$ all belong to the same $C_i$. Moreover, this $C_i$ cannot be $C_1$, or $y_1$ and $\Delta_1$ could be taken to be $x_n$ and $\Gamma_n$ and our previous process of selection would not have failed at the $n$-th stage. Consequently the $\rho$-combinations formed by joining $y_1$ to $\rho-1$ members of $\Delta_1$ all belong to $C_2$. Consider now $\Delta_1$ and let $y_2$ be any of its members. By repeating the preceding argument $\Delta_1 - (y_2)$ must contain an infinite sub-class $\Delta_2$ such that all the $\rho$-combinations got by joining $y_2$ to $\rho-1$ members of $\Delta_2$ belong to the same $C_i$. And, again, this $C_i$ cannot be $C_1$, or, since $y_2$ is a member and $\Delta_2$ a sub-class of $\Delta_1$ and so of $\Gamma_{n-1}$ which includes $\Delta_1$, $y_2$ and $\Delta_2$ could have been chosen as $x_n$ and $\Gamma_n$ and the process of selecting these would not have failed at the $n$-th stage. Now let $y_3$ be any member of $\Delta_2$; then $\Delta_2 - (y_3)$ must contain an infinite sub-class $\Delta_3$ such that all $\rho$-combinations consisting of $y_3$ together with $\rho-1$ members of $\Delta_3$, belong to the same $C_i$, which, as before, cannot be $C_1$ and must be $C_2$. And by continuing in this way we shall evidently find two infinite sequences $y_1, y_2, \ldots, y_n, \ldots$ and $\Delta_1, \Delta_2, \ldots, \Delta_n, \ldots$ consisting respectively of members and sub-classes of $\Gamma$, and such that $y_n$ is always a member of $\Delta_{n-1}$, $\Delta_n$ a sub-class of $\Delta_{n-1}$ not including $y_n$, and all the $\rho$-combinations formed by joining $y_n$ to $\rho-1$ members of $\Delta_n$ belong to $C_2$; and if we denote by $\Delta$ the class $(y_1, y_2, \ldots, y_n, \ldots)$ we have, by a previous argument, that all $\rho$-combinations of members of $\Delta$ belong to $C_2$.

Hence, in either case, $\Gamma$ contains an infinite sub-class $\Delta$ of the required kind, and Theorem A is proved for all values of $r$, provided that $\mu = 2$. For higher values of $\mu$ we prove it by induction; supposing it already established for $\mu = 2$ and $\mu = \nu-1$, we deduce it for $\mu = \nu$. The $r$-combinations of members of $\Gamma$ are then divided into $\nu$ classes $C_i$ ($i = 1, 2, \ldots, \nu$). We define new classes $C_i'$ for $i = 1, 2, \ldots, \nu-1$ by

\[ C_i' = C_i \quad (i = 1, 2, \ldots, \nu-2), \]
\[ C_{\nu-1}' = C_{\nu-1} + C_\nu. \]

Then by the theorem for $\mu = \nu-1$, $\Gamma$ must contain an infinite sub-class $\Delta$ such that all $r$-combinations of the members of $\Delta$ belong to the same $C_i'$. If, in this $C_i'$, $i \le \nu-2$, they all belong to the same $C_i$, which is the result to be proved; otherwise they all belong to $C_{\nu-1}'$, i.e. either to $C_{\nu-1}$ or to $C_\nu$. In this case, by the theorem for $\mu = 2$, $\Delta$ must contain an infinite sub-class $\Delta'$ such that the $r$-combinations of members of $\Delta'$ either all belong to $C_{\nu-1}$ or all belong to $C_\nu$; and our theorem is thus established.

Coming now to finite classes it will save trouble to make some conventions as to notation. Small letters other than $x$ and $y$, whether Italic or Greek (e.g. $n$, $r$, $\mu$, $m$) will always denote finite cardinals, positive unless otherwise stated. Large Greek letters (e.g. $\Gamma$, $\Delta$) will denote classes, and their suffixes will indicate the number of their members (e.g. $\Gamma_m$ is a class with $m$ members). The letters $x$ and $y$ will represent members of the classes $\Gamma$, $\Delta$, etc., and their suffixes will be used merely to distinguish them. Lastly, the letter $C$ will stand, as before, for classes of combinations, and its suffixes will not refer to the number of members, but serve merely to distinguish the different classes of combinations considered. Corresponding to Theorem A we then have

THEOREM B. Given any $r$, $n$, and $\mu$ we can find an $m_0$ such that, if $m \ge m_0$ and the $r$-combinations of any $\Gamma_m$ are divided in any manner into $\mu$ mutually exclusive classes $C_i$ ($i = 1, 2, \ldots, \mu$), then $\Gamma_m$ must contain a sub-class $\Delta_n$ such that all the $r$-combinations of members of $\Delta_n$ belong to the same $C_i$.

This is the theorem which we require in our logical investigations, and we should at the same time like to have information as to how large $m_0$ must be taken for any given $r$, $n$, and $\mu$. This problem I do not know how to solve, and I have little doubt that the values for $m_0$ obtained below are far larger than is necessary.

To prove the theorem we begin, as in Theorem A, by supposing that $\mu = 2$. We then take, not Theorem B itself, but the equivalent

THEOREM C. Given any $r$, $n$, and $k$ such that $n+k \ge r$, there is an $m_0$ such that, if $m \ge m_0$ and the $r$-combinations of any $\Gamma_m$ are divided into two mutually exclusive classes $C_1$ and $C_2$, then $\Gamma_m$ must contain two mutually exclusive sub-classes $\Delta_n$ and $\Lambda_k$ such that all the combinations formed by $r$ members of $\Delta_n + \Lambda_k$ which include at least one member from $\Delta_n$ belong to the same $C_i$.

That this is equivalent to Theorem B with $\mu = 2$ is evident from the fact that, for any given $r$, Theorem C, for $n$ and $k$, asserts more than Theorem B for $n$, but less than Theorem B for $n+k$.

The proof of Theorem C must be performed by mathematical induction, and can conveniently be set out as a demonstration that it is possible to define by recursion a function $f(r, n, k)$ which will serve as $m_0$ in the theorem.

If $r = 1$, the theorem is evidently true with $m_0$ equal to the greater of $2n-1$ and $n+k$, so that we may define

\[ f(1, n, k) = \max(2n-1,\ n+k) \quad (n \ge 1,\ k \ge 0). \]

For other values of $r$ we define $f(r, n, k)$ by recursion formulae involving an auxiliary function $g(r, n, k)$. Suppose that $f(r-1, n, k)$ has been defined for a certain $r-1$, and all $n$, $k$ such that $n+k \ge r-1$, then we define it for $r$ by putting

\[ f(r, 1, k) = f(r-1,\ k-r+2,\ r-2) + 1, \]
\[ g(r, 0, k) = \max(r-1,\ k), \]
\[ g(r, n, k) = f\{r,\ 1,\ g(r, n-1, k)\} \quad (n \ge 1), \]
\[ f(r, n, k) = f\{r,\ n-1,\ g(r, n, k)\} \quad (n > 1). \]


These formulae can be easily seen to define $f(r, n, k)$ for all positive values of $r$, $n$ and $k$ satisfying $n+k \ge r$, and $g(r, n, k)$ for all values of $r$ greater than 1, and all positive values of $n$ and $k$; and we shall prove that Theorem C is true when we take $m_0$ to be this $f(r, n, k)$. We know that this is so when $r = 1$, and we shall therefore assume it for all values up to $r-1$ and deduce it for $r$.

When $n = 1$, and $m \ge m_0 = f(r-1, k-r+2, r-2)+1$, we may take any member $x$ of $\Gamma_m$ to be sole member of $\Delta_1$ and there remain at least $f(r-1, k-r+2, r-2)$ members of $\Gamma_m - (x)$; the $(r-1)$-combinations of these members of $\Gamma_m - (x)$ can be divided into classes $C_1'$ and $C_2'$ according as they belong to $C_1$ or $C_2$ when $x$ is added to them, and, by our theorem for $r-1$, $\Gamma_m - (x)$ must contain two mutually exclusive classes $\Delta_{k-r+2}$, $\Lambda_{r-2}$ such that every combination of $r-1$ terms from $\Delta_{k-r+2} + \Lambda_{r-2}$ (since one of its terms must come from $\Delta_{k-r+2}$, $\Lambda_{r-2}$ having only $r-2$ members) belongs to the same $C_i'$. Taking $\Lambda_k$ to be this $\Delta_{k-r+2} + \Lambda_{r-2}$ all combinations consisting of $x$, together with $r-1$ members of $\Lambda_k$, belong to the same $C_i$. The theorem is therefore true for $r$ when $n = 1$.

For other values of $n$ we prove it by induction, assuming it for $n-1$ and deducing it for $n$. Taking

\[ m \ge m_0 = f(r, n, k) = f\{r,\ n-1,\ g(r, n, k)\}, \]

$\Gamma_m$ must, by the theorem for $n-1$, contain a $\Delta_{n-1}$ and a $\Lambda_{g(r,n,k)}$ such that every combination of $r$ members of $\Delta_{n-1} + \Lambda_{g(r,n,k)}$, at least one term of which comes from $\Delta_{n-1}$, belongs to the same $C_i$, say to $C_1$. If, now, $\Lambda_{g(r,n,k)}$ contains a member $x$ and a sub-class $\Lambda_k$ not including $x$, such that every combination of $x$ and $r-1$ members of $\Lambda_k$ belongs to $C_1$, then, taking $\Delta_n$ to be $\Delta_{n-1} + (x)$ and $\Lambda_k$ to be this $\Lambda_k$, our theorem is true. If not, there can be no member of $\Lambda_{g(r,n,k)}$ which has a sub-class of $k$ members of $\Lambda_{g(r,n,k)}$ connected with it in this way. But since

\[ g(r, n, k) = f\{r,\ 1,\ g(r, n-1, k)\}, \]

$\Lambda_{g(r,n,k)}$ must contain a member $x_1$ and a sub-class $\Lambda_{g(r,n-1,k)}$, not including $x_1$, such that $x_1$ combined with any $r-1$ members of $\Lambda_{g(r,n-1,k)}$ gives a combination belonging to the same $C_i$, which cannot be $C_1$, or $x_1$ and any $k$ members of $\Lambda_{g(r,n-1,k)}$ could have been taken as the $x$ and $\Lambda_k$ above. Hence the combinations formed by $x_1$ together with any $r-1$ members of $\Lambda_{g(r,n-1,k)}$ all belong to $C_2$. But now

\[ g(r, n-1, k) = f\{r,\ 1,\ g(r, n-2, k)\}, \]

and $\Lambda_{g(r,n-1,k)}$ must contain an $x_2$ and a $\Lambda_{g(r,n-2,k)}$, not including $x_2$, such that the combinations formed by $x_2$ and $r-1$ members of $\Lambda_{g(r,n-2,k)}$ all belong to the same $C_i$, which must, as before, be $C_2$, since $x_2$ and $\Lambda_{g(r,n-2,k)}$ are both contained in $\Lambda_{g(r,n,k)}$ and $g(r, n-2, k) \ge k$. Continuing in this way we can find $n$ distinct terms $x_1, x_2, \ldots, x_n$ and a $\Lambda_{g(r,0,k)}$ such that every combination of $r$ terms from $(x_1, x_2, \ldots, x_n) + \Lambda_{g(r,0,k)}$ belongs to $C_2$, provided that at least one term of the combination comes from $(x_1, x_2, \ldots, x_n)$. Since $g(r, 0, k) \ge k$ this proves our theorem, taking $\Delta_n$ to be $(x_1, x_2, \ldots, x_n)$ and $\Lambda_k$ to be any $k$ terms of $\Lambda_{g(r,0,k)}$.

Theorem C is therefore established for all values of $r$, $n$, and $k$, with $m_0$ equal to $f(r, n, k)$. It follows that, if $\mu = 2$, Theorem B is true for all values of $r$ and $n$ with $m_0$ equal to $f(r, n-r+1, r-1)$, which we shall also call $h(r, n, 2)$. For other values of $\mu$ we prove Theorem B by induction, taking $m_0$ to be $h(r, n, \mu)$, where

\[ h(r, n, 2) = f(r,\ n-r+1,\ r-1), \]
\[ h(r, n, \mu) = h\{r,\ h(r, n, \mu-1),\ 2\} \quad (\mu > 2). \]

For, assuming the theorem for $\mu-1$, we prove it for $\mu$ by defining new classes of combinations

\[ C_1' = C_1, \qquad C_2' = \sum_{i=2}^{\mu} C_i. \]

If then $m \ge h(r, n, \mu) = h\{r, h(r, n, \mu-1), 2\}$, by the theorem for $\mu = 2$, $\Gamma_m$ must contain a $\Gamma_{h(r,n,\mu-1)}$ the $r$-combinations of whose members belong either all to $C_1'$ or all to $C_2'$. In the first case there is no more to prove; in the second we have only to apply the theorem for $\mu-1$ to $\Gamma_{h(r,n,\mu-1)}$.
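The recursion formulae for $f$, $g$ and $h$ can be evaluated mechanically. The following is a minimal Python sketch, not part of Ramsey's paper: the function names simply mirror his symbols, the memoisation is an implementation convenience, and the argument ranges stated in the docstrings are assumed rather than checked.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def f(r, n, k):
    """Bound m_0 of Theorem C; assumes n >= 1, k >= 0 and n + k >= r."""
    if r == 1:
        return max(2 * n - 1, n + k)
    if n == 1:
        return f(r - 1, k - r + 2, r - 2) + 1
    return f(r, n - 1, g(r, n, k))


@lru_cache(maxsize=None)
def g(r, n, k):
    """Auxiliary function of the recursion; assumes r > 1 and n >= 0."""
    if n == 0:
        return max(r - 1, k)
    return f(r, 1, g(r, n - 1, k))


def h(r, n, mu):
    """Bound m_0 of Theorem B; assumes n >= r and mu >= 2."""
    if mu == 2:
        return f(r, n - r + 1, r - 1)
    return h(r, h(r, n, mu - 1), 2)


if __name__ == "__main__":
    # Values of h(2, n, 2) for a few small n.
    for n in range(2, 7):
        print(n, h(2, n, 2))
```

Only small arguments are practical: for $r = 2$ the values already grow like a power of 2 with quadratic exponent, and for $r = 3$ they become astronomically large almost at once.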

= =

In the simplest case in which r p. 2 the above reasoning gives equal to h(2, n, 2), which is easily shown to be 2*-1)/2. But for this, case there is a simple argument which gives the much lower value m{) = n!, and shows that our value h(r, n, p.) is altogether excesFive. For, taking Theorem C first, we can prove by induction with regard to n that, for r 2, we may take 1no to be k. (n+1)!. (k is here supposed greater than or equal to 1.) For this is true when n 1, since, if' 11£ ~ 2k, of the m-1 pairs obtained by combining any given member of r m with the others, at least k must belong to the same Ci. Assuming it, then, for n-l, let us prove it for n. If m ~ k, (n+l)! = k(n+l). n!, rm must, by the theorem for n-l, contain two mutually exclusive sub-classes An_I and Ak(n+!) such that all pairs from A,,-l+Ak(,Hl). at least one term of which comes from An_I, belong to the same Ci , say Cl . Now consider the members of Ak(ll+l); in;, 1110

the first place, there may be one of these, $x$ say, which is such that there are $k$ other members of $\Delta_{k(n+1)}$ which combined with $x$ give pairs belonging to $C_1$. If so, the theorem is true, taking $\Delta_n$ to be $\Delta_{n-1} + (x)$; if not, let $x_1$ be any member of $\Delta_{k(n+1)}$. Then there are at most $k-1$ other members of $\Delta_{k(n+1)}$ which combined with $x_1$ give pairs belonging to $C_1$, and $\Delta_{k(n+1)} - (x_1)$ must contain a $\Delta_{kn}$ any member of which gives when combined with $x_1$ a pair belonging to $C_2$. Let $x_2$ be any member of $\Delta_{kn}$; then, since $x_2$ and $\Delta_{kn}$ are both contained in $\Delta_{k(n+1)}$, there are at most $k-1$ other members of $\Delta_{kn}$ which when combined with $x_2$ give pairs belonging to $C_1$. Hence $\Delta_{kn} - (x_2)$ contains a $\Delta_{k(n-1)}$ any member of which combined with $x_2$ gives a pair belonging to $C_2$. Continuing in this way we obtain $x_1, x_2, \ldots, x_n$ and $\Delta_k$, such that every pair $x_i, x_j$ and every pair consisting of an $x_i$ and a member of $\Delta_k$ belongs to $C_2$. Theorem C is therefore proved.

Theorem B for $n$ then follows, with the $m_0$ of Theorem C for $n-1$ and 1, i.e. with $m_0$ equal to $n!$*; and it is an easy extension to show that, if in Theorem B $r = 2$ but $\mu \neq 2$, we can take $m_0$ to be $n!!!\ldots$, where the process of taking the factorial is performed $\mu-1$ times.
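The value $m_0 = n!$ for $r = \mu = 2$ can be checked by exhaustive search in the smallest non-trivial case, $n = 3$, where $n! = 6$. The following is only an illustrative sketch and not part of the argument: the function names are hypothetical, and the members of $\Gamma_m$ are represented, as an assumption, by the integers `0..m-1`.

```python
from itertools import combinations, product

def has_uniform_subclass(colour, m, n):
    """True if some n-member sub-class of range(m) has all its pairs in one class."""
    for subset in combinations(range(m), n):
        classes = {colour[pair] for pair in combinations(subset, 2)}
        if len(classes) == 1:
            return True
    return False

def every_division_works(m, n):
    """Try every division of the pairs of an m-member class into C_1 and C_2."""
    pairs = list(combinations(range(m), 2))
    for assignment in product((1, 2), repeat=len(pairs)):
        colour = dict(zip(pairs, assignment))
        if not has_uniform_subclass(colour, m, n):
            return False
    return True

# Six members (3! = 6) always suffice for n = 3; five members do not.
six_suffice = every_division_works(6, 3)
five_suffice = every_division_works(5, 3)
```

The search over five members finds a division with no uniform sub-class of three, so for $n = 3$ the value $n!$ cannot be lowered; the text makes no such claim for larger $n$.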


II. We shall be concerned with logical formulae containing variable propositional functions, i.e. predicates or relations, which we shall denote by Greek letters $\phi$, $\chi$, $\psi$, etc. These functions have as arguments individuals denoted by $x$, $y$, $z$, etc., and we shall deal with functions with any finite number of arguments, i.e. of any of the forms

$\phi(x)$, $\chi(x, y)$, $\psi(x, y, z)$, ....

In addition to these variable functions we shall have the one constant function of identity $x = y$ or $=(x, y)$.

By operating on the values of $\phi$, $\chi$, $\psi$, ..., and $=$ with the logical operations

$\sim$ meaning not,
$\lor$ meaning or,
$.$ meaning and,
$(x)$ meaning for all x,
$(Ex)$ meaning there is an x for which,

* But this value is, I think, still much too high. It can easily be lowered slightly even when following the line of argument above, by using the fact that if $k$ is even it is impossible for every member of an odd class to have exactly $k-1$ others with which it forms a pair of $C_1$, for then twice the number of these pairs would be odd; we can thus start when $k$ is even with a $\Delta_{k(n+1)-1}$ instead of a $\Delta_{k(n+1)}$.


we can construct expressions such as

$[(x, y)\{\phi(x, y) \lor x = y\}] \lor \{(Ez)\,\chi(z)\}$

in which all the individual variables are made "apparent" by prefixes $(x)$ or $(Ex)$, and the only real variables left are the functions $\phi$, $\chi$, .... Such an expression we shall call a first order formula. If such a formula is true for all interpretations* of the functional variables $\phi$, $\chi$, $\psi$, etc., we shall call it valid, and if it is true for no interpretations of these variables we shall call it inconsistent. If it is true for some interpretations (whether or not for all) we shall call it consistent†.

The Entscheidungsproblem is to find a procedure for determining whether any given formula is valid, or, alternatively, whether any given formula is consistent; for these two problems are equivalent, since the necessary and sufficient condition for a formula to be consistent is that its contradictory should not be valid. We shall find it more convenient to take the problem in this second form as an investigation of consistency. The consistency of a formula may, of course, depend on the number of individuals in the universe considered, and we shall have to distinguish between formulae which are consistent in every universe and those which are only consistent in universes with some particular numbers of members. Whenever the universe is infinite we shall have to assume the axiom of selections.

The problem has been solved by Behmann‡ for formulae involving only functions of one variable, and by Bernays and Schönfinkel§ for formulae involving only two individual apparent variables. It is solved below for the further case in which, when the formula is written in "normal form", there are any number of prefixes of generality $(x)$ but none of existence $(Ex)$‖. By "normal form"¶ is here meant that all the prefixes stand at the beginning, with no negatives between or in front of them, and have scopes extending to the end of the formula.
* To avoid confusion we call a constant function substituted for a variable $\phi$, not a value but an interpretation of $\phi$; the values of $\phi(x, y, z)$ are got by substituting constant individuals for $x$, $y$, and $z$.
† German erfüllbar.
‡ H. Behmann, "Beiträge zur Algebra der Logik und zum Entscheidungsproblem", Math. Annalen, 86 (1922), 163-229.
§ P. Bernays und M. Schönfinkel, "Zum Entscheidungsproblem der mathematischen Logik", Math. Annalen, 99 (1928), 342-372. These authors do not, however, include identity in the formulae they consider.
‖ Later we extend our solution to the case in which there are also prefixes of existence provided that these all precede all the prefixes of generality.
¶ Hilbert und Ackermann, op. cit., 63-4.


The formulae to be considered are thus of the form

$(x_1, x_2, \ldots, x_n)\, F(\phi, \chi, \psi, \ldots, =, x_1, x_2, \ldots, x_n)$,

where the matrix $F$ is a truth-function of values of the functions $\phi$, $\chi$, $\psi$, etc., and $=$ for arguments drawn from $x_1, x_2, \ldots, x_n$. This type of formula is interesting as being the general type of an axiom system consisting entirely of "general laws"*. The axioms for order, betweenness, and cyclic order are all of this nature, and we are thus attempting a general theory of the consistency of axiom systems of a common, if very simple, type.

If identity does not occur in $F$ the problem is trivial, since in this case whether the formula is consistent or not can be shown to be independent of the number of individuals in the universe, and we have only the easy task of testing it for a universe with one member only†. But when we introduce identity the question becomes much more difficult, for although it is still obvious that if the formula is consistent in a universe $U$ it must be consistent in any universe with fewer members than $U$, yet it may easily be consistent in the smaller universe but not in the larger. For instance,

$(x_1, x_2)\; x_1 = x_2$

is consistent in a universe with only one member but not in any other.

We begin our investigation by expressing $F$ in a special form. $F$ is a truth-function of the values of $\phi$, $\chi$, $\psi$, ..., and $=$ for arguments drawn from $x_1, x_2, \ldots, x_n$. If $\phi$ is a function of $r$ variables there will be $n^r$‡ values of $\phi$ which can occur in $F$, and $F$ will be a truth-function of $\Sigma n^r$ values of $\phi$, $\chi$, $\psi$, ..., and $=$, which we shall call atomic propositions. With regard to these $\Sigma n^r$ atomic propositions there are $2^{\Sigma n^r}$ possibilities of truth and falsity which we shall call alternatives, each alternative being a conjunction of $\Sigma n^r$ propositions which are either atomic propositions or their contradictions. In constructing the alternatives all the $\Sigma n^r$ atomic propositions are to be used whether or not they occur in $F$. $F$ can then be expressed as a disjunction of some of these alternatives, namely those with which it is compatible. It is well known that such an

* C. H. Langford, "Analytic completeness of postulate sets", Proc. London Math. Soc. (2), 25 (1926), 115-6.
† Bernays und Schönfinkel, op. cit., 359. We disregard altogether universes with no members.
‡ Here and elsewhere numbers are given not because they are relevant to the argument, but to enable the reader to check that he has in mind the same class of entities as the author.


expression is possible; indeed, it is the dual of what Hilbert and Ackermann call the "ausgezeichnete konjunktive Normalform"*, and is fundamental also in Wittgenstein's logic. The only exception is when $F$ is a self-contradictory truth-function, in which case our formula is certainly not consistent.

$F$ having been thus expressed as a disjunction of alternatives (in our special sense of the word), our next task is to show that some of these alternatives may be able to be removed without affecting the consistency or inconsistency of the formula. If all the alternatives can be removed in this way the formula will be inconsistent; otherwise we shall have still to consider the alternatives that remain.

In the first place an alternative may violate the laws of identity by containing parts of any of the following forms:

$x_i \neq x_i$†,
$x_i = x_j \,.\, x_j \neq x_i$ $(i \neq j)$,
$x_i = x_j \,.\, x_j = x_k \,.\, x_i \neq x_k$ $(i \neq j,\ j \neq k,\ k \neq i)$,

or by containing $x_i = x_j$ $(i \neq j)$ and values of a function $\phi$ and its contradictory $\sim\phi$ for sets of arguments which become the same when $x_i$ is substituted for $x_j$ [e.g. $x_1 = x_2 \,.\, \phi(x_1, x_2, x_3) \,.\, \sim\phi(x_2, x_1, x_3)$].

Any alternative which violates these laws must always be false and can evidently be discarded without affecting the consistency of the formula. The remaining alternatives can then be classified according to the number of x's they make to be different, which may be anything from 1 up to $n$. Suppose that for a given alternative this number is $\nu$, then we can derive from it what we will call the corresponding y alternative by the following process:

For $x_1$, wherever it occurs in the given alternative, write $y_1$; next, if in the alternative $x_2 = x_1$, for $x_2$ write $y_1$ again, if not for $x_2$ write $y_2$. In general, if $x_i$ is in the given alternative identical with any $x_j$ with $j$ less than $i$, write for $x_i$ the $y$ previously written for $x_j$; otherwise write for $x_i$, $y_{k+1}$, where $k$ is the number of y's already introduced. The expression which results contains $\nu$ y's all different instead of $n$ x's, some of which are identical, and we shall call it the y alternative corresponding to the given x alternative.
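The renaming process just described can be sketched in a few lines. This is an illustration only: the names `y_suffixes` and `y_alternative`, the predicate `same(i, j)` (true when the alternative contains $x_i = x_j$), and the encoding of an alternative as a list of `(function, arguments, truth value)` triples are all assumptions made for the example, not notation from the paper.

```python
def y_suffixes(n, same):
    """Return a dict giving, for each i in 1..n, the suffix of the y written for x_i."""
    y = {}
    k = 0  # number of y's already introduced
    for i in range(1, n + 1):
        earlier = [j for j in range(1, i) if same(i, j)]
        if earlier:
            y[i] = y[earlier[0]]   # reuse the y previously written for x_j
        else:
            k += 1
            y[i] = k               # introduce the new variable y_{k+1}
    return y

def y_alternative(literals, y):
    """Rewrite the x's as y's; literals that coincide after renaming are merged."""
    rewritten = {(name, tuple(y[i] for i in args), value)
                 for name, args, value in literals}
    return sorted(rewritten)

# Example with four x's in which x_1 = x_3 and x_2 = x_4 but x_1 != x_2.
same = lambda i, j: (i - j) % 2 == 0
x_alternative = [("phi", (1,), True), ("phi", (2,), False),
                 ("phi", (3,), True), ("phi", (4,), False)]
suffixes = y_suffixes(4, same)
result = y_alternative(x_alternative, suffixes)
```

Here `suffixes` assigns $y_1, y_2, y_1, y_2$ to $x_1, \ldots, x_4$, and `result` holds the two literals $\phi(y_1)$ and $\sim\phi(y_2)$; the difference $y_1 \neq y_2$ is left implicit in this encoding.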

* Op. cit., 16.
† We write $x \neq y$ for $\sim(x = y)$.


Thus to the alternative

$\phi(x_1) \,.\, \sim\phi(x_2) \,.\, \phi(x_3) \,.\, \sim\phi(x_4) \,.\, x_1 = x_3 \,.\, x_2 = x_4 \,.\, x_1 \neq x_2$*

corresponds the y alternative

$\phi(y_1) \,.\, \sim\phi(y_2) \,.\, y_1 \neq y_2$.

We call two y alternatives similar if they contain the same number of y's and can be derived from one another by permuting those y's, and we call two x alternatives equivalent if they correspond to similar (or identical) y alternatives. Thus

is equivalent to

since they correspond to the similar y alternatives

$\phi(y_1) \,.\, \sim\phi(y_2) \,.\, y_1 \neq y_2$,
$\sim\phi(y_1) \,.\, \phi(y_2) \,.\, y_1 \neq y_2$,

derivable from one another by interchanging $y_1$ and $y_2$, although ($\alpha$) and ($\beta$) are not so derivable by permuting the x's.

We now see that we can discard any alternative contained in $F$ unless $F$ also contains all the alternatives equivalent to it; e.g. if $F$ contains ($\alpha$) but not ($\beta$), ($\alpha$) may be discarded from it. For omitting alternatives clearly cannot make the formula consistent if it was not so before; and we can easily prove that, if it was consistent before, omitting these alternatives cannot make it inconsistent. For suppose that the formula is consistent, i.e. that for some particular interpretation of $\phi$, $\chi$, $\psi$, ..., $F$ is true for every set of x's, and let $p$ be an alternative contained in $F$, $q$ an alternative equivalent to $p$ but not contained in $F$. Then for every set of x's one and only one alternative in $F$ will (on this interpretation of $\phi$, $\chi$, $\psi$, ...) be the true one, and this alternative can never be $p$. For if it were $p$, the corresponding y alternative would be true for some set of y's, and the similar y alternative corresponding to $q$ would be true for a set of y's got by permuting this last set. Giving the x's suitable values in terms of the y's, $q$ would then

* We take one function of one variable only for simplicity; also to save space we omit expressions which may be taken for granted, such as $x_1 = x_1$, $x_1 \neq x_4$.


be true for a certain set of x's and $F$ would be false for these x's contrary to hypothesis. Hence $p$ is never the true alternative and may be omitted without affecting the consistency of the formula.

When we have discarded all these alternatives from $F$, the remainder will fall into sets each of which is the complete set of all alternatives equivalent to a given alternative. To such a set of x alternatives will correspond a complete set of similar y alternatives, and the disjunction of such a complete set of similar y alternatives (i.e. of all permutations of a given y alternative) we shall call a form*. A form containing $\nu$ y's we shall denote by an Italic capital with suffix $\nu$, e.g. $A_\nu$, $B_\nu$. The force of our formula can now be represented by the following conjunction, which we shall call $P$.

For every $y_1$, $A_1$ or $B_1$ or ...
For every distinct $y_1, y_2$, $A_2$ or $B_2$ or ...
For every distinct $y_1, y_2, \ldots, y_\nu$, $A_\nu$ or $B_\nu$ or ...
For every distinct $y_1, y_2, \ldots, y_n$, $A_n$ or $B_n$ or ...   (P)

For eyel·y distinct YI' Y2' ... , y", A" or Bn or .. . where Avo B. +, etc., are the forms corresponding to the :;r alternatives still remaining in F. If for any v there are no such forms, i.e. if no alternatives with v different :;r's remain in F, our formula implies that there are no snch things as v distinct individuals, and so cannot be consistent in a ,,"orid of v or more members. \Ve have now to define what is meant by saying that one form is involved in another. Consider a form A. and take one of the y alternatives contained in it. This Y alternative is a conjunction of the values of 1>, x, "', ... , and their negatives for arguments drawn from Yl, Y2, ... , Y•. (",",Ve may leave out the nines of identity and difference, since it is taken for granted that y's are always different.) If p. < l' we can select Jl of these y's in any way and leave out from the alternative all the terms in it which contain allY of the v-p. y's not selected. \Ve have left an alternative in p. y's ,vhich we can renumber Ylo Y2, ... , y", and the form E" to which this new alternative belongs we shall describe as being involved in the A" with which we started. Starting with one particular y alternative in A" we shall get a large number of different E,,'s hy It Cf. Langford, op. cit., 116-120. t The notation is partialJy misleading, since A, has no clos\)r relation to

A~

than to

T2 13

B~.


choosing differently the $\mu$ y's which we select to preserve; and from whichever y alternative in $A_\nu$ we start, the $E_\mu$'s which we find to be involved in $A_\nu$ will be the same. For example,

$\{\phi(y_1, y_1) \,.\, \phi(y_1, y_2) \,.\, \phi(y_2, y_1) \,.\, \sim\phi(y_2, y_2)\} \lor \{\sim\phi(y_1, y_1) \,.\, \phi(y_1, y_2) \,.\, \phi(y_2, y_1) \,.\, \phi(y_2, y_2)\}$

is a form $A_2$ which involves the two $E_1$'s

$\phi(y_1, y_1)$, $\sim\phi(y_1, y_1)$.
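The notion of one form being involved in another can be made concrete with a small sketch. Everything here is an assumption made for illustration: a y alternative for a single two-place function is encoded as a dict from argument suffixes to truth values, a form is named by the least of its permuted alternatives, and the helper names are hypothetical.

```python
from itertools import combinations, permutations

def relabel(alt, mapping):
    """Rename suffixes; drop every value that mentions a suffix not selected."""
    return {tuple(mapping[i] for i in args): value
            for args, value in alt.items()
            if all(i in mapping for i in args)}

def form_of(alt, size):
    """Name the form containing `alt` by the least of its permuted alternatives."""
    variants = []
    for perm in permutations(range(1, size + 1)):
        mapping = dict(zip(range(1, size + 1), perm))
        variants.append(tuple(sorted(relabel(alt, mapping).items())))
    return min(variants)

def involved_forms(alt, nu, mu):
    """All forms in mu y's obtained by preserving mu of the nu y's of `alt`."""
    forms = set()
    for chosen in combinations(range(1, nu + 1), mu):
        mapping = {old: new for new, old in enumerate(chosen, start=1)}
        forms.add(form_of(relabel(alt, mapping), mu))
    return forms

# The two alternatives of the form A_2 given in the text.
first = {(1, 1): True, (1, 2): True, (2, 1): True, (2, 2): False}
second = {(1, 1): False, (1, 2): True, (2, 1): True, (2, 2): True}
from_first = involved_forms(first, 2, 1)
from_second = involved_forms(second, 2, 1)
```

Both starting alternatives yield the same two one-variable forms, $\phi(y_1, y_1)$ and $\sim\phi(y_1, y_1)$, in line with the remark that the result does not depend on which alternative of $A_\nu$ one starts from.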

It is clear that if for some distinct set of $\nu$ y's a form $A_\nu$ is true, then every form $E_\mu$ involved in $A_\nu$ will be true for some distinct set of $\mu$ y's contained in the $\nu$.

We are now in a position to settle the consistency or inconsistency of our formula when $N$, the number of individuals in the universe, is less than or equal to $n$, the number of x's in our formula. In fact, if $N \leq n$, it is necessary and sufficient for the consistency of the formula that $P$ should contain a form $A_N$ together with all the forms $E_\mu$ involved in it for every $\mu$ less than $N$. This condition is evidently necessary, since the $N$ individuals in the universe must, taken as $y_1, y_2, \ldots, y_N$, have some form $A_N$ in regard to any $\phi$, $\chi$, $\psi$, ...; and all forms involved in this $A_N$ must be true for different selections of y's, and so contained in $P$ if $P$ is to be true for this $\phi$, $\chi$, $\psi$, .... Conversely, suppose that $P$ contains a form $A_N$ together with all forms involved in $A_N$; then, calling the $N$ individuals in the universe $y_1, y_2, \ldots, y_N$, we can define functions $\phi$, $\chi$, $\psi$, ... to make any assigned y alternative in $A_N$ true; for any permutation of these $N$ y's another alternative in $A_N$ will be the true one, and for any subset of y's some y alternative in a form involved in $A_N$. Since all these y alternatives are by hypothesis contained in $P$, $P$ will be true for these $\phi$, $\chi$, $\psi$, ..., and our formula consistent.

When, however, $N > n$ the problem is not so simple, although it clearly depends on the $A_n$'s in $P$ such that all forms involved in them are also contained in $P$. These $A_n$'s we may call completely contained in $P$, and if there are no such $A_n$'s a similar argument to that used when $N \leq n$ will show that the formula is inconsistent. But the converse argument, that if there is an $A_n$ completely contained in $P$ the formula must be consistent, no longer holds good; and to proceed further we have to introduce a new conception, the conception of a form being serial.


But before proceeding to explain this idea it is best to simplify matters by the introduction of new functions. Let $\phi$ be one of the variable functions in our formula, with, say, $r$ arguments. Then, if $r \leq n$, $\phi$ will occur in $P$ with all its arguments different [e.g. $\phi(y_1, y_2, \ldots, y_r)$] and also with some of them the same [e.g. $\phi(y_1, y_2, \ldots, y_{r-1}, y_1)$]; but we can conveniently eliminate values of the second kind by introducing new functions of fewer arguments than $r$, which, when all their arguments are different, take values equivalent to those of $\phi$ with some of its arguments identical. E.g. we may put

$\phi_1(y_1, y_2, \ldots, y_{r-1}) = \phi(y_1, y_2, \ldots, y_{r-1}, y_1)$.

In this way $\phi$ gives rise to a large number of functions with fewer arguments; each of these functions we define only for the case in which all its arguments are different, as is secured by these arguments being y's with different suffixes. If $r > n$, there is no difference except that $\phi$ can never occur with all its arguments different, and so is entirely replaced by the new functions. If we do this for all the functions $\phi$, $\chi$, $\psi$, ..., and replace them by new functions wherever they occur in $P$ with some of their arguments the same, $P$ will contain a new set of variable functions (including all the old ones which have no more than $n$ arguments), and these will never occur in $P$ with the same argument repeated.

It is easy to see that this transformation does not affect the consistency of the formula, for, if it were consistent before, it must be consistent afterwards, since the new functions have simply to be replaced by their definitions. And if it is consistent afterwards it must have been so before, since any function of the old set has only to be given for any set of arguments the value of the appropriate function of the new set*.

* For instance, if $\phi(y_1, y_2, y_3)$ is a function of the old set, we have five new functions

$\phi_0(y_1, y_2, y_3) = \phi(y_1, y_2, y_3)$,
$\chi_0(y_1, y_2) = \phi(y_2, y_1, y_1)$,
$\psi_0(y_1, y_2) = \phi(y_1, y_2, y_1)$,
$\omega_0(y_1, y_2) = \phi(y_1, y_1, y_2)$,
$\rho_0(y_1) = \phi(y_1, y_1, y_1)$,

and any value of $\phi$ is equivalent to a value of one and only one of the new functions. It must be remembered that the new functions are used only with all their arguments different; for otherwise they would not be independent, since we should have, for instance, $\chi_0(y_1, y_1)$ equivalent to $\rho_0(y_1)$. But $\chi_0(y_1, y_1)$ never occurs, and $\phi(y_1, y_1, y_1)$ is equivalent not to any value of $\chi_0$ but only to $\rho_0(y_1)$.
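The count of five new functions for a function of three arguments is the number of ways of identifying some of its three argument places with one another. A minimal sketch follows, assuming each way is encoded as a tuple of block labels (so that `(1, 2, 1)` stands for $\phi(y_1, y_2, y_1)$); the function name is hypothetical.

```python
def identification_patterns(r):
    """Yield every way of making some of r argument places identical.

    Each pattern is a tuple of labels: equal labels mark identical arguments,
    and a new label is always the next unused integer.
    """
    def extend(prefix, blocks):
        if len(prefix) == r:
            yield tuple(prefix)
            return
        for label in range(1, blocks + 2):
            yield from extend(prefix + [label], max(blocks, label))
    yield from extend([], 0)

patterns = list(identification_patterns(3))
```

For `r = 3` this gives `(1, 1, 1)`, `(1, 1, 2)`, `(1, 2, 1)`, `(1, 2, 2)` and `(1, 2, 3)`, one pattern for each of the five new functions. When $r > n$, patterns using more than $n$ distinct labels would have to be dropped, since they cannot occur in $P$.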


In view of this fact we shall find it more convenient to take $P$ in its new form, and denote the new set of functions by $\phi_0$, $\chi_0$, $\psi_0$, .... Suppose, then, that $\phi_0$ is a function of $r$ variables; there are

$n(n-1)\ldots(n-r+1)$

values of $\phi_0$ with $r$ different arguments drawn from $y_1, y_2, \ldots, y_n$ and every y alternative must contain each of these values or its contradictory. $r!$ of these values will have as arguments permutations of $y_1, y_2, \ldots, y_r$. Any other set of $r$ y's can be arranged in the order of their suffixes as $y_{s_1}, y_{s_2}, \ldots, y_{s_r}$, $s_1 < s_2 < s_3 \ldots < s_r$, and it may happen that a given alternative contains the values of $\phi_0$ for those and only those permutations of $y_{s_1}, y_{s_2}, \ldots, y_{s_r}$ which correspond (in the obvious way) to the permutations of $y_1, y_2, \ldots, y_r$ for which it (the alternative) contains the values of $\phi_0$; e.g. if the alternative contains $\phi_0(y_1, y_2, \ldots, y_r)$ and $\phi_0(y_r, y_{r-1}, \ldots, y_1)$, but for every other permutation of $y_1, y_2, \ldots, y_r$ contains the corresponding value of $\sim\phi_0$, then it may happen that the alternative contains $\phi_0(y_{s_1}, y_{s_2}, \ldots, y_{s_r})$ and $\phi_0(y_{s_r}, y_{s_{r-1}}, \ldots, y_{s_1})$, but for every other permutation of $y_{s_1}, y_{s_2}, \ldots, y_{s_r}$ contains the corresponding value of $\sim\phi_0$. If this happens, no matter how the set of $r$ y's, $y_{s_1}, y_{s_2}, \ldots, y_{s_r}$, is chosen from $y_1, y_2, \ldots, y_n$, then we say that the alternative is serial in $\phi_0$*, and if an alternative is serial in every function of the new set we shall call it serial simply. Consider, for example, the following alternative, in which we may imagine $\phi_0$ and $\psi_0$ to be derived from one "old" function $\phi$ by the definitions

This is serial in $\phi_0$, since we always have $\phi_0(y_{s_1}, y_{s_2}) \,.\, \sim\phi_0(y_{s_2}, y_{s_1})$; but not in $\psi_0$, since we sometimes have $\psi_0(y_{s_1})$ but sometimes $\sim\psi_0(y_{s_1})$. Hence it is not a serial alternative. We call a form serial when it contains at least one serial alternative, and can now state our chief result as follows.

* Thus, if $\phi_0$ is a function of $n$ variables, all alternatives are serial in $\phi_0$.

THEOREM. There is a finite number $m$, depending on $n$, the number of functions $\phi$, $\chi$, $\psi$, ..., and the numbers of their arguments, such that the necessary and sufficient condition for our formula to be consistent in a universe with $m$ or more members is that there should be a serial form $A_n$ completely contained in $P$. For consistency in a universe of fewer than $m$ members this condition is sufficient but not necessary.
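The definition of an alternative being serial in a function, on which the theorem rests, can be tested mechanically. The sketch below is illustrative only: it assumes an alternative is encoded, for one function of $r$ variables, as a dict `truth` from each $r$-tuple of distinct suffixes drawn from $1, \ldots, n$ to `True` or `False`, and the name `is_serial_in` is hypothetical.

```python
from itertools import combinations, permutations

def is_serial_in(truth, n, r):
    """Check that every choice s_1 < ... < s_r copies the pattern of y_1, ..., y_r."""
    base = tuple(range(1, r + 1))
    for chosen in combinations(range(1, n + 1), r):   # suffixes in increasing order
        for perm in permutations(range(r)):
            first = tuple(base[p] for p in perm)      # permutation of y_1, ..., y_r
            other = tuple(chosen[p] for p in perm)    # corresponding permutation
            if truth[first] != truth[other]:
                return False
    return True

n = 3
# A two-place function true exactly when the first suffix is the smaller.
phi0 = {(i, j): i < j for i, j in permutations(range(1, n + 1), 2)}
# A one-place function true of y_1 and y_3 but false of y_2.
psi0 = {(i,): i % 2 == 1 for i in range(1, n + 1)}

serial_in_phi0 = is_serial_in(phi0, n, 2)
serial_in_psi0 = is_serial_in(psi0, n, 1)
```

The alternative is serial in `phi0` but not in `psi0`, so it is not a serial alternative.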

We shall first prove that, whatever be the number $N$ of individuals in the universe, the condition is sufficient for the consistency of the formula. If $N \leq n$, this is a consequence of a previous result, since, if $A_n$ is completely contained in $P$, so is any $A_N$ involved in $A_n$. If $N > n$, we suppose the universe ordered in a series by a relation $R$. (If $N$ is infinite this requires the Axiom of Selections.) Let $q$ be any serial alternative contained in $A_n$. If $\phi_0$ is a function of $r$ arguments, $q$ will contain the values of either [...] $z_2, \ldots, z_r$ as they are ordered by $R$.

Let us suppose now that $y_1, y_2, \ldots, y_n$ are numbered in the order in which they occur in $R$, i.e. that in the $R$ series $y_1$ is the first of them, $y_2$ the second, and so on. Then we shall see that, if $\phi_0$ is given the constant interpretation defined above, all the values of $\phi_0$ and $\sim\phi_0$ in $q$ will be true. Indeed, for values whose arguments are obtained by permuting $y_1, y_2, \ldots, y_r$ this follows at once from the way in which $\phi_0$ has been defined. For $\phi_0(y_{\sigma_1}, y_{\sigma_2}, \ldots, y_{\sigma_r})$ is true if and only if the order of $y_{\sigma_1}, y_{\sigma_2}, \ldots, y_{\sigma_r}$ in the $R$ series is given by a permutation $(p_1, p_2, \ldots, p_r)$ contained in $\Sigma$. But the order in the series of $y_{\sigma_1}, y_{\sigma_2}, \ldots, y_{\sigma_r}$ is in fact given (on our present hypothesis that the order of the y's is $y_1, y_2, \ldots, y_n$) by $(\sigma_1, \sigma_2, \ldots, \sigma_r)$, which is contained in $\Sigma$ if and only if $\phi_0(y_{\sigma_1}, y_{\sigma_2}, \ldots, y_{\sigma_r})$ is contained in $q$. Hence values of $\phi_0$ for arguments consisting of the first $r$ y's are true when they are contained in $q$ and false otherwise, i.e. when the corresponding values of $\sim\phi_0$ are contained in $q$. For sets of arguments not confined to the first $r$ y's our result follows from the fact that $q$ is serial, i.e. that if $s_1 < s_2 < \ldots < s_r$, so that $y_{s_1}, y_{s_2}, \ldots, y_{s_r}$ are in the order given by the $R$ series, $q$ contains the


values of $\phi_0$ for just those permutations of $y_{s_1}, y_{s_2}, \ldots, y_{s_r}$ which correspond to the permutations of $y_1, y_2, \ldots, y_r$ for which it contains the values of $\phi_0$, i.e. by the definition of $\phi_0$ and the preceding argument, for just those permutations of $y_{s_1}, y_{s_2}, \ldots, y_{s_r}$ which make $\phi_0$ true. Hence all the values of $\phi_0$ and $\sim\phi_0$ in $q$ are true when $y_1, y_2, \ldots, y_n$ are in the order given by the $R$ series. If, then, we define analogous constant interpretations for $\chi_0$, $\psi_0$, etc., and combine these with our interpretation of $\phi_0$, the whole of $q$ will be true provided that $y_1, y_2, \ldots, y_n$ are in the order given by the $R$ series, and if $y_1, y_2, \ldots, y_n$ are in any other order the true alternative will be obtained from $q$ by suitably permuting the y's, i.e. will be an alternative similar to $q$ and contained in the same form $A_n$. Hence $A_n$ is true for any set of distinct $y_1, y_2, \ldots, y_n$. Moreover, for any set of distinct $y_1, y_2, \ldots, y_\nu$ $(\nu < n)$ the true form will be one involved in $A_n$, and since $A_n$ and all forms involved in it are contained in $P$, $P$ will be true for these interpretations of $\phi_0$, $\chi_0$, $\psi_0$, ..., and our formula must be consistent.

Having thus proved our condition for consistency sufficient in any universe, we have now to prove it necessary in any infinite or sufficiently large finite universe, and for this we have to use the Theorem B proved in the first part of the paper. Our line of argument is as follows: we have to show that, whatever $\phi_0$, $\chi_0$, $\psi_0$, ... we take, $P$ will be false unless it completely contains a serial $A_n$. For this it is enough to show that, given any $\phi_0$, $\chi_0$, $\psi_0$, ..., there must be a set of $n$ y's for which the true form is serial*, or, since a serial form is one which contains a serial alternative, that there must be a set of values of $y_1, y_2, \ldots, y_n$ for which the true alternative is serial.

Let us suppose that among our functions $\phi_0$, $\chi_0$, $\psi_0$, ... there are $a_1$ functions of one variable, $a_2$ of two variables, ..., and $a_n$ of $n$ variables, and let us order the universe by a serial relation $R$. The $N$ individuals in the universe are divided by the $a_1$ functions of one variable into $2^{a_1}$ classes according to which of these functions they make true or false, and if $N \geq 2^{a_1} k_1$ we can find $k_1$ individuals which all belong to the same class, i.e. agree as to which of the $a_1$ functions they make true and which false, where $k_1$ is a positive integer to be assigned later. Let us call this set of $k_1$ individuals $\Gamma_{k_1}$. Now consider any two distinct members of $\Gamma_{k_1}$, $z_1$ and $z_2$ say, and let $z_1$ precede $z_2$ in the $R$ series. Then in regard to any of the $a_2$ functions of two variables, $\phi_0$ say, there are four possibilities. We may either

" For then P can only be true for

have

(1) $\phi_0(z_1, z_2) \,.\, \phi_0(z_2, z_1)$,
or (2) $\phi_0(z_1, z_2) \,.\, \sim\phi_0(z_2, z_1)$,
or (3) $\sim\phi_0(z_1, z_2) \,.\, \phi_0(z_2, z_1)$,
or (4) $\sim\phi_0(z_1, z_2) \,.\, \sim\phi_0(z_2, z_1)$.

$\phi_0$ thus divides the combinations two at a time of the members of $\Gamma_{k_1}$ into four distinct classes according to which of these four possibilities is realised when the combination is taken as $z_1$, $z_2$ in the order in which its terms occur in the $R$ series; and the whole set of $a_2$ functions of two variables divide the combinations two at a time of the members of $\Gamma_{k_1}$ into $4^{a_2}$ classes, the combinations in each class agreeing in the possibility they realise with respect to each of the $a_2$ functions. Hence, by Theorem B, if $k_1 = h(2, k_2, 4^{a_2})$, $\Gamma_{k_1}$ must contain a sub-class $\Gamma_{k_2}$ of $k_2$ members such that all the pairs out of $\Gamma_{k_2}$ agree in the possibilities they realise with respect to each of the $a_2$ functions of two variables.

We continue to reason in the same way according to the following general form:

Consider any $r$ distinct members of $\Gamma_{k_{r-1}}$; suppose that in the $R$ series they have the order $z_1, z_2, \ldots, z_r$. Then with respect to any function of $r$ variables there are $2^{r!}$ possibilities in regard to $z_1, z_2, \ldots, z_r$, and the $a_r$ functions of $r$ variables divide the combinations $r$ at a time of the members of $\Gamma_{k_{r-1}}$ into $2^{r!\,a_r}$ classes. By Theorem B, if $k_{r-1} = h(r, k_r, 2^{r!\,a_r})$*, $\Gamma_{k_{r-1}}$ must contain a sub-class $\Gamma_{k_r}$ of $k_r$ members such that all the combinations $r$ at a time of the members of $\Gamma_{k_r}$ agree in the possibilities they realise with respect to each of the $a_r$ functions of $r$ variables.

We proceed in this way until we reach $\Gamma_{k_{n-1}}$, all combinations $n-1$ at a time of whose members agree in the possibilities they realise with respect to each of the $a_{n-1}$ functions of $n-1$ variables. We then determine that $k_{n-1}$ shall equal $n$, which fixes $k_{n-2}$ as $h(n-1, n, 2^{(n-1)!\,a_{n-1}})$ and so on back to $k_1$, every $k_{r-1}$ being determined from $k_r$.

If, then, $N \geq 2^{a_1} k_1$, the universe must contain a class $\Gamma_{k_{n-1}}$ or $\Gamma_n$ (since $k_{n-1} = n$) of $n$ members which is contained in $\Gamma_{k_r}$ for every $r$, $r = 1, 2, \ldots, n-1$. Let its $n$ members be, in the order given them by $R$, $y_1, y_2, \ldots, y_n$. Then for every $r$ less than $n$, $y_1, y_2, \ldots, y_n$ are contained in $\Gamma_{k_r}$ and all $r$ combinations of them agree in the possibilities they realise

* If $a_r = 0$ we interpret $h(r, k_r, 1)$ as $k_r$ and identify $\Gamma_{k_{r-1}}$ and $\Gamma_{k_r}$.

with respect to each function of $r$ variables. Let $y_{s_1}, y_{s_2}, \ldots, y_{s_r}$ $(s_1 < s_2 < \ldots < s_r)$ be such a combination, and $\chi_0$ a function of $r$ variables. Then $y_{s_1}, y_{s_2}, \ldots, y_{s_r}$ are in the order given them by $R$, and so are $y_1, y_2, \ldots, y_r$; consequently the fact that these two combinations agree in the possibilities which they realise with respect to $\chi_0$ means that $\chi_0$ is true for the same permutations of $y_{s_1}, y_{s_2}, \ldots, y_{s_r}$ as it is of $y_1, y_2, \ldots, y_r$. The true alternative for $y_1, y_2, \ldots, y_n$ is therefore serial in $\chi_0$, and similarly it is serial in every other function of any number $r$ of variables*; it is therefore a serial alternative.

Our condition is, therefore, shown to be necessary in any universe of at least $2^{a_1} k_1$ members where $k_1$ is given by

$k_{r-1} = h(r, k_r, 2^{r!\,a_r})$ if $a_r \neq 0$,
$k_{r-1} = k_r$ if $a_r = 0$,
$(r = n-1, n-2, \ldots, 2)$.

For universes lying between $n$ and $2^{a_1} k_1$ we have not found a necessary and sufficient condition for the consistency of the formula, but it is evidently possible to determine by trial whether any given formula is consistent in any such universe.
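The first step of the necessity argument is a plain counting step: $a_1$ functions of one variable divide the universe into at most $2^{a_1}$ classes, so a universe of $2^{a_1} k_1$ members has $k_1$ members that agree on every one of them. A small sketch of that step follows; the two predicates, the value of $k_1$ and the function name are arbitrary choices made for illustration, not taken from the paper.

```python
from collections import defaultdict

def largest_agreeing_class(universe, functions):
    """Group individuals by which one-variable functions they make true."""
    classes = defaultdict(list)
    for individual in universe:
        signature = tuple(bool(f(individual)) for f in functions)
        classes[signature].append(individual)
    return max(classes.values(), key=len)

# Two illustrative one-variable functions, so a_1 = 2.
functions = [lambda z: z % 2 == 0, lambda z: z % 3 == 0]
k1 = 3
universe = range(2 ** len(functions) * k1)   # N = 2^{a_1} k_1 = 12 individuals
gamma_k1 = largest_agreeing_class(universe, functions)
```

Whatever functions are supplied, `gamma_k1` has at least `k1` members, all agreeing as to which functions they make true. The later steps, which call on Theorem B for functions of two or more variables, are not reproduced here.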

III. We will now consider what our result becomes when our formula contains in addition to identity only one function [...] $= \phi(y_i, y_i)$, so that $a_1 = 1$, $a_2 = 1$, $a_r = 0$ when $r > 2$. Consequently

$k_2 = k_3 = \ldots = k_{n-1} = n$ and $k_1 = h(2, n, 4)$;

but the argument at the end of I shows that we may take instead $k_1 = n!!!$, and our necessary and sufficient condition for consistency applies to any universe with at least $2 \cdot n!!!$ individuals. In this simple case we can present our condition in a more striking form as follows.

* We have shown this when $r < n$; we may also have $r = n$, but then there is nothing to prove since in a function of $n$ variables every alternative is serial.


It is necessary and sufficient for the consistency of the formula that it should be true when $\phi$ is replaced by at least one of the following types of function:

(1) The universal function $x = x \,.\, y = y$.

(2) The null function $x \neq x \,.\, y \neq y$.

(3) Identity $x = y$.

(4) Difference $x \neq y$.

(5) A serial function ordering the whole universe in a series, i.e. satisfying

(a) $(x) \sim\phi(x, x)$,
(b) $(x, y)[x = y \lor \{\phi(x, y) \,.\, \sim\phi(y, x)\} \lor \{\phi(y, x) \,.\, \sim\phi(x, y)\}]$,
(c) $(x, y, z)\{\sim\phi(x, y) \lor \sim\phi(y, z) \lor \phi(x, z)\}$.

(6) A function ordering the whole universe in a series, but also holding between every term and itself, i.e. satisfying

(a') $(x)\, \phi(x, x)$

and (b) and (c) as in (5).

Types (1)-(4) include only one function each; in regard to types (5) and (6) it is immaterial what function of the type we take, since if one satisfies the formula so, we shall see, do all the others*.

We have to prove this new form of our condition by showing that $P$ will completely contain a serial $A_n$ if and only if it is satisfied by functions of at least one of our six types. Now an alternative in $n$ y's is serial in $\chi_0$ if it contains either

(i) $\chi_0(y_1) \,.\, \chi_0(y_2) \ldots \chi_0(y_n)$, or, for short, $\prod_r \chi_0(y_r)$,
or (ii) $\prod_r \sim\chi_0(y_r)$,

but not otherwise, and it will be serial in $\phi_0$ if it contains either

(a) $\prod_{r<s} \phi_0(y_r, y_s) \,.\, \phi_0(y_s, y_r)$,
or (b) $\prod_{r<s} \phi_0(y_r, y_s) \,.\, \sim\phi_0(y_s, y_r)$,
or (c) $\prod_{r<s} \sim\phi_0(y_r, y_s) \,.\, \phi_0(y_s, y_r)$,
or (d) $\prod_{r<s} \sim\phi_0(y_r, y_s) \,.\, \sim\phi_0(y_s, y_r)$.

* A result previously obtained for type (5) by Langford, op. cit.


rrhere are tllllR altogether eight alternatives serial in uoth "'0 and XO got by combining either of (i), (ii) with any of (a), (b), (c), (d); but these eight serial alternatives only give rise to six serial forms, since the alternatives (i) (b) and (i) (c) can be obtained from one another by reversing the order of the y's and so belong to the same form, and so do the alternatives (ii) (b) and (ii) (c). It is also easy to see that any formula completely containing one of these six serial forms will be satisfied by all functions of one of the six types according to the scheme

Form              Type of function
(i) (a)           1
(i) (b and c)     6
(i) (d)           3
(ii) (a)          4
(ii) (b and c)    5
(ii) (d)          2

and that conversely a formula satisfied by a function of one of the six types must completely contain the corresponding form. For instance, a function of type 6 will satisfy the alternative (i) (b) when $y_1, y_2, \ldots, y_n$ are in their order in the series determined by the function, and when $y_1, y_2, \ldots, y_n$ are in any other order the function will satisfy an alternative of the same form. In the language of the theory of postulate systems we can interpret our universe as a class $K$, and conclude that a postulate system on a base $(K, R)$ consisting only of general laws involving at most $n$ elements will be compatible with $K$ having as many as 2 . n!!! members if and only if it can be satisfied by an $R$ of one of our six types.
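The six types of function listed in this section can be illustrated on a small finite universe. The following is only a sketch under stated assumptions: the universe is a finite Python sequence (the text itself concerns arbitrarily large universes), and the name `ramsey_types` is a hypothetical helper, not anything from the paper. It tests a two-place function against types (1)-(4) directly and against conditions (a), (a'), (b), (c) for types (5) and (6).

```python
from itertools import product


def ramsey_types(universe, phi):
    """Return the numbers (1-6) of the types that phi satisfies on a finite universe.

    `universe` is assumed to be a re-iterable finite sequence; `phi(x, y)`
    is assumed to return a truth value.
    """
    def holds(x, y):
        return bool(phi(x, y))

    pairs = list(product(universe, repeat=2))
    distinct = [(x, y) for x, y in pairs if x != y]

    reflexive = all(holds(x, x) for x in universe)         # condition (a')
    irreflexive = all(not holds(x, x) for x in universe)   # condition (a)
    # condition (b): for distinct x, y exactly one of phi(x, y), phi(y, x) holds
    connected = all(holds(x, y) != holds(y, x) for x, y in distinct)
    # condition (c): not phi(x, y), or not phi(y, z), or phi(x, z)
    transitive = all(
        not holds(x, y) or not holds(y, z) or holds(x, z)
        for x, y, z in product(universe, repeat=3)
    )

    tests = {
        1: all(holds(x, y) for x, y in pairs),               # universal
        2: all(not holds(x, y) for x, y in pairs),           # null
        3: all(holds(x, y) == (x == y) for x, y in pairs),   # identity
        4: all(holds(x, y) == (x != y) for x, y in pairs),   # difference
        5: irreflexive and connected and transitive,         # serial
        6: reflexive and connected and transitive,           # serial, holding on the diagonal
    }
    return [k for k, ok in tests.items() if ok]
```

On a universe of four integers, for instance, the relation `x < y` should be classified as type 5 and `x <= y` as type 6.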

IV. Let us, in conclusion, briefly indicate how to extend our method in order to determine the consistency or inconsistency of formulae of the more general type

which have in normal form both kinds of prefix, but satisfy the condition that all the prefixes of existence precede all those of generality. As before, we can suppose $F$ represented as a disjunction of alternatives and discard those which violate the laws of identity. Those left we can group according to the values of identity and difference for arguments drawn entirely from the $z$'s. Such a set of values of identity and difference we can denote by $H_i(=, z_1, z_2, \ldots, z_m)$, and $F$ can be put in the form

$$\bigvee_i H_i(=, z_1, z_2, \ldots, z_m) \,.\, F_i(\phi, \chi, \ldots, =, z_1, \ldots, z_m, x_1, \ldots, x_n),$$

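and the whole formula is equivalent to a disjunction of formulae

$$(Ez_1, z_2, \ldots, z_m)(x_1, x_2, \ldots, x_n)\{H_1(=, z_1, \ldots, z_m) \,.\, F_1(\phi, \chi, \ldots, =, z_1, \ldots, z_m, x_1, \ldots, x_n)\}$$
$$\vee\ (Ez_1, z_2, \ldots, z_m)(x_1, x_2, \ldots, x_n)\{H_2(=, z_1, \ldots, z_m) \,.\, F_2(\phi, \chi, \ldots, =, z_1, \ldots, z_m, x_1, \ldots, x_n)\}\ \vee\ \text{etc.}$$

Since if any one of these formulae is consistent so is their disjunction, and if their disjunction is consistent one at least of its terms must be consistent, it is enough for us to show how to determine the consistency of any one of them, say the first. In this $H_1(=, z_1, z_2, \ldots, z_m)$ is a consistent set of values of identity and difference for every pair of $z$'s. We renumber the $z$'s $z_1, z_2, \ldots, z_\mu$, using the same suffix for every set of $z$'s that are identical in $H_1$, and our formula becomes

$$(Ez_1, z_2, \ldots, z_\mu)(x_1, x_2, \ldots, x_n)\, F_1(\phi, \chi, \ldots, =, z_1, \ldots, z_\mu, x_1, \ldots, x_n),$$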

in which it is understood that two $z$'s with different suffixes are always different. Now supposing the universe to have at least $\mu + n$ members, we consider the different possibilities in regard to the $x$'s being identical with the $z$'s, and rewrite our formula

$$(Ez_1, z_2, \ldots, z_\mu)(x_1, x_2, \ldots, x_n)\Big[\prod_{i = 1, \ldots, n;\ j = 1, \ldots, \mu} x_i \neq z_j \ \supset\ G(\phi, \chi, \ldots, =, z_1, \ldots, z_\mu, x_1, \ldots, x_n)\Big],$$

in which $\supset$ means "if, then" and

$$G(\phi, \ldots, x_n) = \prod F_1(\phi, \chi, \ldots, =, z_1, \ldots, z_\mu, \xi_1, \xi_2, \ldots, \xi_n),$$

the product being taken for $\xi_i = x_i, z_1, z_2, \ldots, z_\mu$ ($i = 1, \ldots, n$),

and in $G$ any term $x_i = z_j$ is replaced by a falsehood (e.g. $x_i \neq x_i$) not involving any $z$. Next we modify $G$ by introducing new functions. In $G$ occur values of, e.g., $\phi$, with arguments some of which are $z$'s and some $x$'s; from these we define functions of the $x$'s only by simply regarding the $z$'s as constants, and call these new functions $\phi_0, \chi_0, \ldots$. Values of ...

$$(x_1, x_2, \ldots, x_n)\, L(\phi, \chi, \psi, \ldots, \phi_0, \chi_0, \ldots, p, q, \ldots, =, x_1, x_2, \ldots, x_n).$$

But this is a formula of the type previously dealt with, except for the variable propositions $p, q, \ldots$, which are easily eliminated by considering the different cases of their truth and falsity, the formula being consistent if it is consistent in one such case.

Reprinted from Proc. London Math. Soc. 30 (1930),264-286


NON-SEPARABLE AND PLANAR GRAPHS* BY

HASSLER WHITNEY

Introduction. In this paper the structure of graphs is studied by purely combinatorial methods. The concepts of rank and nullity are fundamental. The first part is devoted to a general study of non-separable graphs. Conditions that a graph be non-separable are given; the decomposition of a separable graph into its non-separable parts is studied; by means of theorems on circuits of graphs, a method for the construction of non-separable graphs is found, which is useful in proving theorems on such graphs by mathematical induction. In the second part, a dual of a graph is defined by combinatorial means, and the paper ends with the theorem that a necessary and sufficient condition that a graph be planar is that it have a dual. The results of this paper are fundamental in papers by the author on Congruent graphs and the connectivity of graphs† and on The coloring of graphs.‡

I. NON-SEPARABLE GRAPHS

1. Definitions.§ A graph G consists of two sets of symbols, finite in number: vertices, a, b, c, ..., f, and arcs, $\alpha(ab)$, $\beta(ac)$, ..., $\delta(cf)$. If an arc $\alpha(ab)$ is present in a graph, its end vertices a, b are also present. We may write an arc $\alpha(ab)$ or $\alpha(ba)$ at will; we may write it also ab or ba if no confusion arises, if there is but a single arc joining a and b in G. We say the vertices a and b are on the arc $\alpha(ab)$, and the arc $\alpha(ab)$ is on the vertices a and b. The null graph is the graph containing no arcs or vertices.

The obvious geometrical interpretation of such a graph, or abstract graph, is a topological graph, let us say. Corresponding to each vertex of the abstract graph, we select a point in 3-space, a vertex of the topological graph. Corresponding to each arc $\alpha(ab)$ of the abstract graph, we select an arc joining the corresponding vertices of the topological graph. An arc is here a set of points in (1, 1) correspondence with the unit interval, its end vertices corresponding with the ends of the interval. Moreover, we let no arc pass through other vertices or intersect other arcs. We shall consider topological graphs no further till we come to the section on planar graphs.

* Presented to the Society, October 25, 1930; received by the editors February 2, 1931. An outline of this paper will be found in the Proceedings of the National Academy of Sciences, vol. 17 (1931), pp. 125-127.
† American Journal of Mathematics, vol. 54 (1932), pp. 150-168.
‡ An outline will be found in the Proceedings of the National Academy of Sciences, vol. 17 (1931), pp. 122-125.
§ Compare Ste. Lague, Les Réseaux, Mémorial des Sciences Mathématiques, fascicule 18, Paris, 1926.

An isolated vertex is a vertex which is not on any arc. A chain is a set of one or more distinct arcs which can be ordered thus: ab, bc, cd, ..., ef, where vertices in different positions are distinct, i.e. the chain may not intersect itself. A suspended chain is a chain containing two or more arcs such that no vertex of the chain other than the first and last is on other arcs, and these two vertices are each on at least two other arcs. A circuit is a set of one or more distinct arcs which can be put in cyclic order, ab, bc, ..., ef, fa, vertices being distinct as in the case of the chain. A k-circuit is a circuit containing k arcs. Thus, the arc $\alpha(aa)$ is a 1-circuit; the two arcs $\alpha(ab)$, $\beta(ab)$ form a 2-circuit.

A graph is connected if any two of its vertices are joined by a chain. Obviously, if a and b are joined by a chain, and b and c are joined by a chain, then a and c are joined by a chain. Any graph consists of a certain number of connected pieces (one, if the graph is connected). In particular, an isolated vertex is one of the connected pieces of a graph. A graph is called cyclicly connected if any two of its vertices are contained in a circuit.

If $G_1, G_2, \ldots, G_m$ are a set of graphs, no two of which have a common vertex (or arc, therefore), we say the graph G, formed of the arcs and vertices of all these graphs, is the sum of these graphs. Thus, a graph is the sum of its connected pieces. A forest is a graph containing no circuit. A tree is a connected forest. A subgraph H of G is a graph containing a subset (in particular, all or none) of the arcs of G, and those vertices of G which are on these arcs.

2. Rank and nullity.* Given a graph G which contains V vertices, E arcs, and P connected pieces, we define its rank R, and its nullity (or cyclomatic number or first Betti number) N, by the equations

$$R = V - P, \qquad N = E - R = E - V + P.$$

If G contains the single arc ab, it is of rank 1, nullity 0, while if it contains the single arc aa, it is of rank 0, nullity 1. The first two theorems follow immediately from the definitions of rank and nullity:

THEOREM 1. If isolated vertices be added to or subtracted from a graph, the rank and nullity remain unchanged.

* These are just the rank and nullity of the matrix $H_1$ of Poincaré. See Veblen's Colloquium Lectures, Analysis Situs.
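The definitions R = V - P and N = E - R can be checked on small examples. The sketch below is illustrative only: it assumes a graph is given as a collection of vertex names plus a list of arcs written as pairs, and the function name `rank_and_nullity` is hypothetical. Connected pieces are counted with a simple union-find, which is a choice made here rather than a method taken from the text.

```python
def rank_and_nullity(vertices, arcs):
    """Return (R, N) with R = V - P and N = E - R for a finite graph.

    `vertices` may list isolated vertices; `arcs` is a list of pairs (a, b),
    where parallel arcs and 1-circuits (a, a) are allowed.
    """
    parent = {v: v for v in vertices}
    for a, b in arcs:
        # end vertices of an arc are always present in the graph
        parent.setdefault(a, a)
        parent.setdefault(b, b)

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    pieces = len(parent)  # start with every vertex as its own piece
    for a, b in arcs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            pieces -= 1

    rank = len(parent) - pieces
    return rank, len(arcs) - rank
```

With this sketch, the single arc ab gives rank 1 and nullity 0, the single arc aa gives rank 0 and nullity 1, and adding an isolated vertex changes neither value, in agreement with Theorem 1.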


THEOREM 2. Let the graph G' be formed from the graph G by adding the arc ab. Then

(1) if a and b are in the same connected piece in G, then

$$R' = R, \qquad N' = N + 1;$$

(2) if a and b are in different connected pieces in G, then

$$R' = R + 1, \qquad N' = N.$$

THEOREM 3. In any graph G, $R \geq 0$, $N \geq 0$.

For let $G_1$ be the graph containing the vertices of G but no arcs. Then if $R_1$ and $N_1$ are its rank and nullity,

$$R_1 = N_1 = 0.$$

We build up G from $G_1$ by adding the arcs one at a time. The theorem now follows from Theorem 2.

THEOREM 4. A forest G is a graph of nullity 0, and conversely.

Suppose first G contained a circuit P. We shall show that the nullity of G is > 0. We build up G arc by arc, adding first the arcs of the circuit P. In adding the last arc of the circuit, the nullity is increased by 1, as this arc joins two vertices already connected. (This argument holds even if the circuit is a 1-circuit.) But in adding the rest of the arcs, the nullity is never decreased, by Theorem 2. Thus the nullity of G is > 0.

Now suppose G is a forest, and therefore contains no circuit. Build up G arc by arc. Each arc we add joins two vertices formerly not connected. For otherwise, this arc, together with the arcs of a chain connecting the two vertices, would form a circuit. Therefore, by Theorem 2, the nullity remains always the same, and is thus 0.

3. Theorems on non-separable graphs. We introduce the following

Definitions. Let $H_1$, which contains the vertex $a_1$, and $H_2$, which contains the vertex $a_2$, be two graphs without common vertices. Let us rename $a_1$ a, and rename the arcs of $H_1$ on $a_1$ accordingly; that is, if $a_1 b$ is an arc on $a_1$, we rename it ab. Rename also $a_2$ a, and rename the arcs of $H_2$ accordingly. $H_1$ and $H_2$ have now the vertex a in common; they form the graph G, say. We say G is formed by letting the vertex $a_1$ of $H_1$ coalesce with the vertex $a_2$ of $H_2$, or, by joining $H_1$ and $H_2$ at a vertex. Geometrically, we pull the vertices $a_1$ and $a_2$ together to form the single vertex a.

Let G be a connected graph such that there exist no two graphs $H_1$ and $H_2$, each containing at least one arc, which form G if they are joined at a vertex. Then G is called non-separable. Geometrically, a connected graph is non-separable if we cannot break it at a single vertex into two graphs, each containing an arc. For example, the graph consisting of the two arcs ab, bc is separable, as is the graph consisting of the two arcs $\alpha(aa)$, $\beta(aa)$. A graph containing but a single arc is non-separable, as is the graph containing only the arcs $\alpha(ab)$, $\beta(ab)$.

If G is not non-separable, we say G is separable. Thus, a graph that is not connected is separable. Suppose some connected piece $G_1$ of G is separable. If $H_1$ and $H_2$ joined at the vertex a form $G_1$, we say a is a cut vertex of G. We have consequently

THEOREM 5. A necessary and sufficient condition that a connected graph be non-separable is that it have no cut vertex.

THEOREM 6. Let G be a connected graph containing no 1-circuit. A necessary and sufficient condition that the vertex a be a cut vertex of G is that there exist two vertices b, c in G, each distinct from a, such that every chain from b to c passes through a.

First suppose a is a cut vertex of G. Then, by definition, $H_1$ and $H_2$, each containing at least one arc which is not a 1-circuit, form G if they are joined at a. Let b be a vertex of $H_1$ and c a vertex of $H_2$, each distinct from a. As a is the only vertex in both $H_1$ and $H_2$, every chain from b to c in G passes through a.

Suppose now every chain from b to c in G passes through a. Remove the vertex a and all the arcs on a. The resulting graph G' is not connected, b and c being in different connected pieces. Let $H_1'$ be that connected piece of G' containing b, and let $H_2'$ be the rest of G'. Replace a by the two vertices $a_1$ and $a_2$. Now put back the arcs we removed, letting them touch $a_1$ if their other end vertices are in $H_1'$, and letting them touch $a_2$ otherwise. Let $H_1$ and $H_2$ be the resulting graphs. Then $H_1$ and $H_2$ each contain at least one arc, and they form G if the two vertices $a_1$, $a_2$ are made to coalesce. Hence, by definition, a is a cut vertex of G.

THEOREM 7. Let G be a graph containing no 1-circuit and containing at least two arcs. A necessary and sufficient condition that G be non-separable is that it be cyclicly connected.*

If G is not connected, the theorem is obvious. Assume therefore G is connected.

* A similar theorem has been proved for more general continuous curves by G. T. Whyburn, Bulletin of the American Mathematical Society, vol. 37 (1931), pp. 429-433.

Suppose first G is separable. Then, by Theorem 5, G has a cut vertex a, and by Theorem 6, there are two vertices b, c in G such that every chain from b to c passes through a. Hence there is no circuit in G containing b and c.

Suppose now there exist two vertices b, c in G which are contained in no circuit. Let bd, de, ..., gc be some chain from b to c.

Case 1. There exists a circuit containing b and d. In this case, let a be the last vertex of the chain which is contained in a circuit passing also through b. Let f be the next vertex of the chain. Then every chain from f to b passes through a. For suppose the contrary. Let C be a chain from f to b not passing through a. Let P be a circuit containing b and a. Follow C from f till we first reach a vertex of P. Follow the circuit P now as far as b if b was not the vertex we reached, and continue along P till we reach a. Passing from a to f along the arc af completes a circuit containing both b and f, contrary to hypothesis. Hence, by Theorem 6, a is a cut vertex of G, and therefore G is separable.

Case 2. There exists no circuit containing b and d. Then there is but a single arc joining b and d, and they are joined by no other chain. As G is connected and contains at least two arcs, there is either another arc on b or another arc on d, say the first. The other case is exactly similar. If we add a vertex b' and replace the arc bd by the arc b'd, b and d are no longer joined by a chain, and hence the resulting graph G' is not connected. Let $H_1$ be that part of G' containing the arc b'd, and let $H_2$ be the rest of G'. As there is still an arc on b, $H_2$ contains at least one arc. Letting the vertices b and b' coalesce forms G, and hence G is separable. The proof is now complete.

THEOREM 8. A non-separable graph G containing at least two arcs contains no 1-circuit and is of nullity > 0. Each vertex is on at least two arcs.

Suppose G contained a 1-circuit. Call it $H_1$. Let $H_2$ be the rest of the graph. Then $H_1$ and $H_2$ have but a single vertex in common, and thus G is separable. Next, by Theorem 7, G is cyclicly connected. As G contains no 1-circuit, G contains at least two vertices. Containing these there is a circuit. Therefore, by Theorem 4, the nullity of G is > 0. Finally, if there were a vertex on no arcs, G would not be connected. If there were a vertex a on the single arc ab, b would be a cut vertex of G.

THEOREM 9. Let G be a graph of nullity 1 containing no isolated vertices, such that the removal of any arc reduces the nullity to 0. Then G is a circuit.

By Theorem 4, G contains a circuit. Suppose G contained other arcs besides. Removing one of these, the nullity remains 1, as the circuit is still present, contrary to hypothesis. There are no other vertices in G, as G contains no isolated vertices. Hence G is just this circuit.
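The criterion of Theorem 6 lends itself to a direct check on small graphs: in a connected graph without 1-circuits, a is a cut vertex exactly when removing a and the arcs on it leaves two remaining vertices that are no longer joined by a chain. The sketch below assumes that setting (connected, no 1-circuits); the names `connected_pieces` and `cut_vertices` are hypothetical, and the brute-force removal of each vertex in turn is chosen for clarity rather than efficiency.

```python
def connected_pieces(vertices, arcs):
    """Return the connected pieces of a graph as a list of vertex sets."""
    adjacent = {v: set() for v in vertices}
    for a, b in arcs:
        adjacent[a].add(b)
        adjacent[b].add(a)

    seen, pieces = set(), []
    for start in vertices:
        if start in seen:
            continue
        seen.add(start)
        piece, stack = {start}, [start]
        while stack:
            v = stack.pop()
            for w in adjacent[v]:
                if w not in seen:
                    seen.add(w)
                    piece.add(w)
                    stack.append(w)
        pieces.append(piece)
    return pieces


def cut_vertices(vertices, arcs):
    """List the cut vertices of a connected graph containing no 1-circuit.

    Following Theorem 6: a is a cut vertex when some b, c distinct from a
    are separated once a and all arcs on a are removed.
    """
    cuts = []
    for a in vertices:
        rest = [v for v in vertices if v != a]
        kept = [(u, v) for u, v in arcs if a not in (u, v)]
        if len(connected_pieces(rest, kept)) > 1:
            cuts.append(a)
    return cuts
```

By Theorem 5, a connected graph of this kind is non-separable exactly when the returned list is empty; the two arcs ab, bc, for example, give the single cut vertex b.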

THEOREM 10. A non-separable graph G of nullity 1 is a circuit.

If G contains but a single arc, it is a 1-circuit, being of nullity 1. Suppose G contains at least two arcs. By Theorem 8, it contains no 1-circuit. By Theorem 7, it is cyclicly connected. Remove any arc ab from G; a and b are still connected, and therefore, by Theorem 2, the nullity of G is reduced to 0. Hence, by Theorem 9, G is a circuit.

The converses of the last two theorems are obviously true.

4. Decomposition of separable graphs. If the graph G contains a connected piece which is separable, we may separate that piece into two graphs, these graphs having formerly but a single vertex in common. We may continue in this manner until every resulting piece of G is non-separable. We say G is separated into its components.

LEMMA. Let the connected separable graph G be decomposed into the two pieces $H_1$ and $H_2$ which had only the vertex a in common in G. Then every non-separable subgraph of G is contained wholly in either $H_1$ or $H_2$.

Suppose the contrary. Then some non-separable subgraph I of G is not contained wholly in either $H_1$ or $H_2$. Let $I_1$ be that part of I in $H_1$, and $I_2$ that part in $H_2$; $I_1$ and $I_2$ have at most the vertex a in common. $I_1$ and $I_2$ each contain at least one arc. For otherwise, if $I_1$, say, contained no arc, as it contains a vertex distinct from a, it would not be connected. Thus I is separable into the pieces $I_1$ and $I_2$, a contradiction again.

THEOREM 11. Every non-separable subgraph of G is contained wholly in one of the components of G.

This follows upon repeated application of the above lemma.

THEOREM 12. A graph G may be decomposed into its components in a unique manner.

Suppose we could decompose G into the components $H_1, H_2, \ldots, H_m$, and also into the components $H_1', H_2', \ldots, H_n'$. We shall show that these sets are identical. Take any $H_i$. It is a non-separable subgraph of G, and thus is contained in some component $H_j'$, by Theorem 11. Similarly, $H_j'$ is contained in some component $H_k$. Thus $H_i$ is contained in $H_k$, and they are therefore identical. Hence $H_i$ and $H_j'$ are identical. In this manner we show that each $H_k$ is identical with some $H_l'$, and each $H_l'$ is identical with some $H_k$, proving the theorem.

THEOREM 13. Let $H_1, H_2, \ldots, H_m$ be the components of G. Let $R_1, R_2, \ldots, R_m$, and $N_1, N_2, \ldots, N_m$ be their ranks and nullities. Then

$$R = R_1 + R_2 + \cdots + R_m, \qquad N = N_1 + N_2 + \cdots + N_m.$$

Let G' be G separated into its components, and let R' be the rank of G'. G is formed from G' by letting vertices of different components coalesce. Each time we join two pieces, the number of vertices and the number of connected pieces are each reduced by 1, so that the rank remains the same. Thus

$$R = R'.$$

Now

$$V' = V_1 + V_2 + \cdots + V_m, \qquad P' = P_1 + P_2 + \cdots + P_m$$

(where each $P_i = 1$). Subtracting,

$$R = R' = R_1 + R_2 + \cdots + R_m.$$

As also

$$E = E_1 + E_2 + \cdots + E_m,$$

it follows that

$$N = N_1 + N_2 + \cdots + N_m.$$
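Theorem 13 can be checked numerically once a graph has been separated into its components. The sketch below is one possible way to do this and is not the procedure of the paper: it uses a depth-first search with a stack of arcs (a standard technique for finding non-separable pieces, swapped in here for the repeated separation at cut vertices described above). The names `components` and `rank_nullity` are hypothetical, the graph is assumed to have no isolated vertices, and the recursion is only suitable for small examples.

```python
def rank_nullity(arcs):
    """Rank and nullity of the graph formed by the given arcs (no isolated vertices)."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            v = parent[v]
        return v

    rank = 0
    for a, b in arcs:
        ra, rb = find(a), find(b)
        if ra != rb:          # the arc joins two different pieces: rank + 1
            parent[ra] = rb
            rank += 1
    return rank, len(arcs) - rank


def components(arcs):
    """Split the arcs of a graph into its non-separable components."""
    adjacent = {}
    for i, (a, b) in enumerate(arcs):
        adjacent.setdefault(a, []).append((b, i))
        adjacent.setdefault(b, []).append((a, i))

    depth, low, stack, used, blocks = {}, {}, [], set(), []

    def visit(v, d):
        depth[v] = low[v] = d
        for w, i in adjacent[v]:
            if i in used:
                continue
            used.add(i)
            if w == v:                      # a 1-circuit is a component by itself
                blocks.append([i])
                continue
            stack.append(i)
            if w not in depth:
                visit(w, d + 1)
                low[v] = min(low[v], low[w])
                if low[w] >= depth[v]:      # v separates the part reached through w
                    block = []
                    while True:
                        j = stack.pop()
                        block.append(j)
                        if j == i:
                            break
                    blocks.append(block)
            else:
                low[v] = min(low[v], depth[w])

    for v in adjacent:
        if v not in depth:
            visit(v, 0)
    return [[arcs[i] for i in sorted(block)] for block in blocks]
```

For two triangles abc and cde sharing the vertex c, with a further arc ef, the components are the two triangles and the single arc; their ranks 2, 2, 1 and nullities 1, 1, 0 add up to the rank 5 and nullity 2 of the whole graph.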

For a converse of this theorem, see Theorem 17.

THEOREM 14. Divide the arcs of the non-separable graph G into two groups, each containing at least one arc, forming the subgraphs $H_1$ and $H_2$, of ranks $R_1$ and $R_2$. Then

$$R_1 + R_2 > R.$$

Let the connected pieces of $H_1$ be $H_{11}, \ldots, H_{1m}$ (there may be but one piece, $H_{11}$), and let those of $H_2$ be $H_{21}, \ldots, H_{2n}$. Then obviously

$$R_1 = R_{11} + \cdots + R_{1m}, \qquad R_2 = R_{21} + \cdots + R_{2n},$$

whence

$$R_1 + R_2 = R_{11} + \cdots + R_{1m} + R_{21} + \cdots + R_{2n}.$$

Let G' be the sum of the graphs $H_{11}, \ldots, H_{2n}$. Then G' is of rank $R_{11} + \cdots + R_{2n}$. We form G from G' by letting vertices of the graphs $H_{11}, \ldots, H_{2n}$ coalesce. Each time we let vertices of different connected pieces coalesce, the rank is unaltered. Each time we let vertices in the same connected piece coalesce, the rank is reduced by 1. This latter operation happens at least once. For otherwise, let $a_1$ and $a_2$ be the last two vertices we let coalesce. Then $a_1$ and $a_2$ were formerly in two different pieces, $I_1$ and $I_2$. Thus $I_1$ and $I_2$ joined at a vertex form G, and G is separable, contrary to hypothesis. Thus the rank of G is less than the rank of G', that is,

$$R < R_{11} + \cdots + R_{2n}.$$

Hence $R < R_1 + R_2$.

Theorems 13 and 14 give

THEOREM 15. A necessary and sufficient condition that a graph be non-separable is that there exist no division of its arcs into two groups $H_1$ and $H_2$, each containing at least one arc, so that

$$R = R_1 + R_2.$$
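Theorem 15 gives a test for non-separability that uses ranks only. The following sketch applies it by brute force, trying every division of the arcs into two non-empty groups; this is exponential in the number of arcs and is meant purely as an illustration on small graphs. The function names are hypothetical, and the graph is assumed to be given by its arcs alone, with no isolated vertices.

```python
from itertools import combinations


def rank(arcs):
    """Rank R = V - P of the graph formed by the given arcs."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            v = parent[v]
        return v

    merges = 0
    for a, b in arcs:
        ra, rb = find(a), find(b)
        if ra != rb:          # each arc joining two pieces raises the rank by 1
            parent[ra] = rb
            merges += 1
    return merges


def is_non_separable(arcs):
    """Theorem 15: non-separable iff no division of the arcs has R = R1 + R2."""
    total = rank(arcs)
    indices = range(len(arcs))
    # groups of size up to half the arcs cover every division once
    for size in range(1, len(arcs) // 2 + 1):
        for group in combinations(indices, size):
            chosen = set(group)
            h1 = [arcs[i] for i in indices if i in chosen]
            h2 = [arcs[i] for i in indices if i not in chosen]
            if rank(h1) + rank(h2) == total:
                return False
    return True
```

The examples of section 3 come out as expected under this test: the arcs ab, bc and the two 1-circuits on a are separable, while a single arc and the pair of arcs $\alpha(ab)$, $\beta(ab)$ are non-separable.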

5. Circuits of graphs. We shall say two non-separable graphs, each containing at least one arc, form a circuit of graphs, if they have at least two common vertices. (They may also have common arcs.) Thus the two graphs $G_1$: $\alpha(ab)$ and $G_2$: $\alpha(ab)$ (which are the same graph) form a circuit of graphs. However, the two graphs $G_1$: $\alpha(aa)$ and $G_2$: $\beta(aa)$, having but one common vertex, do not form a circuit of graphs. We shall say three or more non-separable graphs form a circuit of graphs if we can name them $G_1, G_2, \ldots, G_m$ in such a way that $G_1$ and $G_2$ have just the vertex $a_1$ in common, $G_2$ and $G_3$ have just the vertex $a_2$ in common, ..., $G_m$ and $G_1$ have just the vertex $a_m$ in common, these vertices are all distinct, and no other two of these graphs have a common vertex. Thus the three graphs $G_1$: ab, $G_2$: bc, $G_3$: ca form a circuit of graphs. We note that there can be no 1-circuit in a circuit of graphs; also, no subset of the graphs in a circuit of graphs form a circuit of graphs. We may think of a circuit of graphs as forming a single graph.

THEOREM 16. A circuit of graphs G is a non-separable graph.

First suppose there are but two graphs, $G_1$ and $G_2$, present. Suppose G were separable. Then it is separable into at least two components $H_1, H_2, \ldots, H_k$. By Theorem 11, $G_1$ and $G_2$ are each contained wholly in one of these components. As $G_1$ and $G_2$ together form G, there are just two components, and they are $G_1$ and $G_2$. These, when joined at a vertex, form G. But this is contrary to the hypothesis that $G_1$ and $G_2$ have at least two vertices in common.

Next suppose there are more than two graphs present. Let $C_1$ be a chain in $G_1$ joining $a_m$ and $a_1$, let $C_2$ be a chain in $G_2$ joining $a_1$ and $a_2$, ..., let $C_m$ be a chain in $G_m$ joining $a_{m-1}$ and $a_m$. These chains taken together form a circuit P passing through all the graphs. Now separate G into its components. By Theorem 11 (see the converse of Theorem 10), P is contained in one of these components. The same is true of each of the graphs $G_1, G_2, \ldots, G_m$, and hence these graphs are all contained in the same component. Thus G is itself this component, that is, G is non-separable.

* This theorem may also be proved easily from Theorem 17.

THEOREM 17. Let $G_1, \ldots, G_m$ be a set of non-separable graphs, each containing at least one arc, and let G be formed by letting vertices and arcs of different graphs coalesce. Then the following four statements are all equivalent:

(1) $G_1, \ldots, G_m$ are the components of G.
(2) No two of the graphs $G_1, \ldots, G_m$ have an arc in common, and there is no circuit in G containing arcs of more than one of these graphs.
(3) No subset of these graphs form a circuit of graphs.
(4) If $R, R_1, \ldots, R_m$ are the ranks of $G, G_1, \ldots, G_m$ respectively, then

$$R = R_1 + \cdots + R_m.$$

We note that we cannot replace the word rank by the word nullity in (4). For let G be the graph containing the arcs $\alpha(ab)$, $\beta(ab)$, $\gamma(ab)$. Let $G_1$ contain $\alpha$ and $\beta$, and $G_2$, $\beta$ and $\gamma$. Then the nullity of G is the sum of the nullities of $G_1$ and $G_2$, but $G_1$ and $G_2$ are not the components of G.

We shall prove (a) if (1) holds, (2) holds, (b) if (2) holds, (3) holds, (c) if (3) holds, (1) holds, establishing the equivalence of (1), (2) and (3); (d) if (1) holds, (4) holds, and finally (e) if (4) holds, (3) holds, establishing the equivalence of (4) and the other statements.

(a) If (1) holds, (2) holds. For first, in forming G from its components $G_1, \ldots, G_m$, we let vertices alone coalesce, and thus no two of the graphs have an arc in common. Also, there is no circuit in G containing arcs of more than one of the graphs; for each circuit, being a non-separable graph, is contained entirely in one of the components of G, by Theorem 11.

(b) If (2) holds, (3) holds. For suppose the contrary. If, first, some two graphs, say $G_1$ and $G_2$, form a circuit of graphs, they have at least two vertices in common, say a and b. Join a and b by a chain C in $G_1$ and by a chain D in $G_2$. By hypothesis, $G_1$ and $G_2$ have no arcs in common, and thus the arcs of C and D are distinct. From a follow along C till we first reach a vertex d of D. From d follow along D till we get back to a. We have formed thus a circuit containing arcs of both $G_1$ and $G_2$, contrary to hypothesis. Now suppose the graphs $G_1, \ldots, G_k$, k > 2, formed a circuit of graphs. In the proof of Theorem 16 we found a circuit passing through all the graphs of such a circuit of graphs, again contrary to hypothesis.

(c) If (3) holds, (1) holds. Assuming that no subset of the graphs $G_1, \ldots, G_m$ forms a circuit of graphs, we will show first that some one of these graphs has at most a single vertex in common with other of the graphs. For suppose each graph had at least two vertices in common with other graphs. Then $G_1$ has a vertex $a_1$ in common with some graph, say $G_2$. As $G_2$ has at least two vertices in common with other graphs, it has a vertex $a_2$, distinct from $a_1$, in common with another graph, say $G_3$. If we continue in this manner, we must at some point get back to a graph we have already considered.

Now starting with $G_1$, consider the graphs in order, and let $G_i$ be the first one which has a vertex in common with one of the preceding graphs other than the vertex $a_{i-1}$, which we know already it has in common with $G_{i-1}$. Now of the graphs $G_{i-1}, G_{i-2}, \ldots, G_1$, let $G_j$ be the first with which $G_i$ has a common vertex, other than the vertex $a_{i-1}$. First suppose $G_j$ is $G_{i-1}$. Then $G_i$ and $G_{i-1}$ have at least two vertices in common, and they form therefore a circuit of graphs, contrary to hypothesis. Next suppose $G_j$ is not $G_{i-1}$. Then on account of the choice of $G_i$ and $G_j$, $G_j$ and $G_{j+1}$ have just one common vertex $a_j$, $G_{j+1}$ and $G_{j+2}$ have just one common vertex $a_{j+1}$, ..., $G_i$ and $G_j$ have just one common vertex $a_i$ (for otherwise $G_i$ and $G_j$ would form a circuit of graphs), and no other two of these graphs have a vertex in common. These vertices $a_j, a_{j+1}, \ldots, a_i$ are all distinct. For, on account of the construction of the chain of graphs, two succeeding vertices $a_k$ and $a_{k+1}$ are distinct. $a_i$ and $a_j$ are distinct, for otherwise $G_i$ and $G_{j+1}$ would have a common vertex, etc. These graphs $G_j, G_{j+1}, \ldots, G_i$ form therefore a circuit of graphs, contrary to hypothesis.

Some graph therefore, say $G_1$, has at most a single vertex in common with the other graphs. Thus either it is separated from them, or we can separate it at a single vertex.
Now among the graphs $G_2, \ldots, G_m$, there is also no circuit of graphs, so again we can separate one of them, say $G_2$. Continuing, we have finally separated G into its components $G_1, G_2, \ldots, G_m$.

(d) If (1) holds, (4) holds. This is just Theorem 13.

(e) If (4) holds, (3) holds. Let G' be the sum of the graphs $G_1, \ldots, G_m$. We form G from G' by letting vertices and arcs of different graphs coalesce. Each time we let two vertices coalesce, either ($\alpha$) the two vertices were formerly in different connected pieces, in which case the rank is unchanged, or ($\beta$) the two vertices were in the same connected piece, in which case the rank is reduced by 1. Letting arcs alone coalesce (their end vertices having already coalesced) does not alter the rank. Thus in any case, the rank is never increased. To begin with, the rank of G' is $R_1 + \cdots + R_m$, and by hypothesis, the rank of G is $R_1 + \cdots + R_m$. Thus the rank is never altered, and ($\beta$) never occurs. Hence, obviously, no circuit of graphs is formed in forming G from G'. This completes the proof of the theorem.

6. Construction of non-separable graphs. We prove the following theorem:

THEOREM 18. If G is a non-separable graph of nullity N > 1, we can remove an arc or suspended chain from G, leaving a non-separable graph G' of nullity N - 1.

Assume the theorem is true for all graphs of nullity 2, 3, ..., N - 1. We shall prove it for any graph of nullity N (including the case where N = 2). This will establish the theorem in general.

Take any non-separable graph G of nullity N > 1. It contains at least two arcs, and therefore, by Theorem 8, it contains no 1-circuit. Remove from G any arc ab, forming the graph $G_1$. If $G_1$ is non-separable, we are through. Suppose therefore $G_1$ is separable, and let its components be $H_1, H_2, \ldots, H_{m-1}$. $G_1$ is connected, for between any two vertices c, d there exists a circuit in G by Theorem 7, and therefore there is a chain joining them in $G_1$. Let $H_m$ consist of the arc ab. By Theorem 17, no subset of the graphs $H_1, \ldots, H_{m-1}$ form a circuit of graphs, while some subset of the graphs $H_1, \ldots, H_m$ form a circuit of graphs.

We shall show that the whole set of graphs $H_1, \ldots, H_m$ form a circuit of graphs. Otherwise, some proper subset, which includes $H_m$, form a circuit of graphs. Let H be the graph formed from this circuit of graphs by dropping out $H_m$. By Theorem 16, the circuit of graphs is a non-separable graph; hence H is connected. All the arcs in $G_1$ not in the circuit of graphs, form a graph I. Let $I_1$ be a connected piece of I. Then $I_1$ has at most a single vertex in common with the rest of G. For suppose $I_1$ had the two vertices c and d in common with H. From c follow along some chain towards d in H till we first reach a vertex e in $I_1$. From e follow back along some chain in $I_1$ to c. We have formed thus a circuit containing arcs of both H and $I_1$. But as H consists of a certain subset of the components of $G_1$, this circuit contains arcs of at least two components of $G_1$, contrary to Theorem 17. Thus $I_1$ has at most a single vertex in common with the rest of G, and hence G is separable, contrary to hypothesis.

Thus $H_1, \ldots, H_m$ form a circuit of graphs, that is, G is formed of a circuit of graphs. As we assumed $G_1$ was separable, $m \geq 3$. Therefore we can order the graphs so that $H_1$ and $H_2$ have just the vertex $a_1$ in common, ..., $H_{m-1}$ and $H_m$ have just the vertex $a_{m-1} = b$ in common, and $H_m$ and $H_1$ have just the vertex $a_m = a$ in common. Moreover, these vertices are all distinct, and no other two of the graphs $H_1, \ldots, H_m$ have a common vertex.

As the nullity of G was > 1, the nullity of $G_1$ is > 0. By Theorem 13, this is the sum of the nullities of $H_1, \ldots, H_{m-1}$. Therefore the nullity of some one of these graphs, say $H_i$, is > 0.

Suppose first the nullity of $H_i$ is 1. Then, by Theorem 10, $H_i$ is a circuit, consisting of two chains joining $a_{i-1}$ and $a_i$. Remove one of these chains from G. This leaves a graph G', which again is a circuit of graphs. For the graph $H_i$ we replace by an ordered set of non-separable graphs, each consisting of one of the arcs of the chain we have left in $H_i$.

Suppose next the nullity of $H_i$ is > 1. It is less than N, as $H_i$ is contained in $G_1$, whose nullity is N - 1. Therefore, by induction, we can remove an arc or a suspended chain, leaving a non-separable graph $H_i'$ of nullity one less. If neither $a_{i-1}$ nor $a_i$ has thus been removed, we again have a circuit of graphs. Suppose $a_i$ but not $a_{i-1}$ was removed. Replace that part of the chain we removed joining $a_i$ and a vertex of $H_i$ distinct from $a_{i-1}$. Here again we have a circuit of graphs, $H_i$ being replaced by $H_i'$ and a set of arcs. The case is the same if $a_{i-1}$ but not $a_i$ was removed. If finally, both $a_i$ and $a_{i-1}$ were in the chain we removed, we put back all of the chain but that part between these two vertices. Here again, the resulting graph G' is a circuit of graphs.

Thus in all cases we can drop out from G an arc or suspended chain, leaving a circuit of graphs. By Theorem 16, the resulting graph G' is non-separable. As also the nullity of G' is one less than the nullity of G, the theorem is now proved.

As a consequence of this theorem, Theorem 8, and Theorem 10, we have

THEOREM 19. We can build up any non-separable graph containing at least two arcs by taking first a circuit, then adding successively arcs or suspended chains, so that at any stage of the construction we have a non-separable graph.

It is easily seen that, conversely, any graph built up in this manner is non-separable. For each time we add an arc or suspended chain, these arcs, each considered as a graph, together with the non-separable graph already present, form a circuit of graphs.
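The construction of Theorem 19 and its converse can be tried out on a small case. The following is a minimal sketch in Python and is not part of the original paper: the helper names (`count_pieces`, `is_nonseparable`, `nullity`, `add_chain`) are hypothetical, and non-separability is tested by brute force as "connected, and still connected after deleting any one vertex", which is assumed here to agree with the paper's notion for loop-free graphs with at least two arcs.

```python
def count_pieces(vertices, arcs):
    """Number of connected pieces P, computed with a small union-find."""
    parent = {v: v for v in vertices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    pieces = len(parent)
    for u, v in arcs:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            pieces -= 1
    return pieces


def is_nonseparable(arcs):
    """Assumed test: connected, and connected after deleting any one vertex."""
    vertices = {v for arc in arcs for v in arc}
    if count_pieces(vertices, arcs) != 1:
        return False
    for x in vertices:
        rest = vertices - {x}
        rest_arcs = [(u, v) for u, v in arcs if x not in (u, v)]
        if rest and count_pieces(rest, rest_arcs) != 1:
            return False
    return True


def nullity(arcs):
    """N = E - V + P for the graph spanned by the given arcs."""
    vertices = {v for arc in arcs for v in arc}
    return len(arcs) - len(vertices) + count_pieces(vertices, arcs)


def add_chain(arcs, chain):
    """Add an arc or suspended chain whose two ends are already present."""
    present = {v for arc in arcs for v in arc}
    assert chain[0] in present and chain[-1] in present
    assert chain[0] != chain[-1]
    assert all(v not in present for v in chain[1:-1])
    return arcs + list(zip(chain, chain[1:]))


# Start from a circuit, then add a suspended chain a-d-c and an arc b-d.
stages = [[("a", "b"), ("b", "c"), ("c", "a")]]
for chain in (["a", "d", "c"], ["b", "d"]):
    stages.append(add_chain(stages[-1], chain))

for stage in stages:
    print(len(stage), nullity(stage), is_nonseparable(stage))
```

In this sketch every stage passes the non-separability test and the nullity goes up by one at each step, which is the behaviour the theorem describes. A chain a-b-c, or two triangles sharing one vertex, fails the test.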

II. DUALS, PLANAR GRAPHS

7. Congruent graphs. We introduce the following

Definitions. Given two graphs G and G', if we can rename the vertices and arcs of one, giving distinct vertices and distinct arcs different names, so that it becomes identical with the other, we say the two graphs are congruent.* (We used formerly the word "homeomorphic.")

* See the author's American Journal paper, cited in the introduction.


The geometrical interpretation is that we can bring the two graphs into complete coincidence by a (1, 1) continuous transformation.

Two graphs are called equivalent if, upon being decomposed into their components, they become congruent, except possibly for isolated vertices.

8. Duals. Given a graph G, if $H_1$ is a subgraph of G, and $H_2$ is that subgraph of G containing those arcs not in $H_1$, we say $H_2$ is the complement of $H_1$ in G.

Throughout this section, R, R', r, r', etc., will stand for the ranks of G, G', H, H', etc., respectively, with similar definitions for V, E, P, N.

Definition. Suppose there is a (1, 1) correspondence between the arcs of the graphs G and G', such that if H is any subgraph of G and H' is the complement of the corresponding subgraph of G', then

$$r' = R' - n.$$

We say then that G' is a dual of G.* Thus, if the nullity of H is n, then H' (including all the vertices of G') is in n more connected pieces than G'.

* While this definition agrees with the ordinary one for graphs lying on a plane or sphere, a graph on a surface of higher connectivity, such as the torus, has in general no dual. (See Theorems 29 and 30.)

THEOREM 20. Let G' be a dual of G. Then

$$R' = N, \qquad N' = R.$$

For let H be that subgraph of G consisting of G itself. Then

$$n = N.$$

If H' is the complement of the corresponding subgraph of G', H' contains no arcs, and is the null graph. Thus

$$r' = 0.$$

But as G' is a dual of G,

$$r' = R' - n.$$

These equations give

$$R' = N.$$

The other equation follows when we note that E' = E.

THEOREM 21. If G' is a dual of G, then G is a dual of G'.

Let H' be any subgraph of G', and let H be the complement of the corresponding subgraph of G. Then, as G' is a dual of G,

$$r' = R' - n.$$

By Theorem 20,

$$R' = N.$$

We note also,

$$e + e' = E.$$

These equations give

$$r = e - n = e - (R' - r') = e - N + (e' - n') = E - N - n' = R - n'.$$

Thus G is a dual of G'.

Whenever we have shown that one graph is a dual of another graph, we may now call the graphs "dual graphs."

LEMMA. If a graph G is decomposed into its components, the rank and nullity of any subgraph H is left unchanged.

For each time we separate G at a vertex, H is either unchanged or is separated at a vertex. Hence neither its rank nor its nullity is altered. (See the proof of Theorem 13.)

THEOREM 22. If G' and G'' are equivalent and G' is a dual of G, then G'' is a dual of G.

Let H be any subgraph of G, and let H' be the complement of the corresponding subgraph of G'. Let $G_1'$ and $G_1''$ be G' and G'' decomposed into their components. Then $G_1'$ and $G_1''$ are congruent. H' turns into a subgraph $H_1'$ of $G_1'$. Let $H_1''$ be the corresponding subgraph of $G_1''$, and H'' the same subgraph in G''. Then $r_1' = r_1''$. But by the above lemma,

$$r' = r_1', \qquad r'' = r_1''.$$

Hence

$$r' = r''.$$

As a special case of this equation, letting H' be the whole of G', we have

$$R' = R''.$$

As G' is a dual of G,

$$r' = R' - n.$$

Therefore

$$r'' = R'' - n,$$

and G'' is a dual of G.

The converse of this theorem is not true. For define the three graphs

G: α(ab), β(ab), γ(ac), δ(cb), ε(ad), ζ(db);
G': α'(a'b'), β'(c'd'), γ'(a'd'), δ'(a'd'), ε'(b'c'), ζ'(b'c');
G'': α''(a''b''), β''(b''c''), γ''(a''d''), δ''(a''d''), ε''(c''d''), ζ''(c''d'').

G' and G'' are both duals of G, but they are not congruent.*

* See the author's American Journal paper, however.

THEOREM 23. Let $G_1, \dots, G_m$ and $G_1', \dots, G_m'$ be the components of G and G' respectively, and let $G_i'$ be a dual of $G_i$, i = 1, ..., m. Then G' is a dual of G.

Let H be any subgraph of G, and let the parts of H in $G_1, \dots, G_m$ be $H_1, \dots, H_m$. Let $H_i'$ be the complement of the subgraph corresponding to $H_i$ in $G_i'$, i = 1, ..., m, and let H' be the union of $H_1', \dots, H_m'$ in G'. Then H' is the complement of the subgraph in G' corresponding to H in G. Using the proof of Theorem 13, we find that

$$r' = r_1' + \dots + r_m', \qquad R' = R_1' + \dots + R_m',$$

and

$$n = n_1 + \dots + n_m.$$

As also

$$r_i' = R_i' - n_i \qquad (i = 1, \dots, m),$$

adding these last equations gives

$$r' = R' - n,$$

and hence G' is a dual of G.

THEOREM 24. Let $G_1, \dots, G_m$ and $G_1', \dots, G_m'$ be the components of the dual graphs G and G', and let the correspondence between these two graphs be such that arcs in $G_i$ correspond to arcs in $G_i'$, i = 1, ..., m. Then $G_i$ and $G_i'$ are duals, i = 1, ..., m.

Let $H_1$ be any subgraph of $G_1$, let H' be the complement of the corresponding subgraph in G', and let $H_1'$ be the complement in $G_1'$. Then $H_1', G_2', \dots, G_m'$ form H'. By Theorem 13, we find

$$R' = R_1' + R_2' + \dots + R_m'$$

and

$$r' = r_1' + R_2' + \dots + R_m'.$$

Now

$$r' = R' - n_1,$$

hence

$$r_1' = R_1' - n_1,$$

and $G_1'$ is a dual of $G_1$. Similarly for $G_2', \dots, G_m'$.


THEOREM 25. Let G and G' be dual graphs, and let $H_1, \dots, H_m$ be the components of G. Let $H_1', \dots, H_m'$ be the corresponding subgraphs of G'. Then $H_1', \dots, H_m'$ are the components of G', and $H_i'$ is a dual of $H_i$, i = 1, ..., m.

$H_1$ is the subgraph of G corresponding to $H_1'$ in G'. Its complement is $I_1$, the graph formed of the arcs of $H_2, \dots, H_m$. Obviously $H_2, \dots, H_m$ are the components of $I_1$. Hence, by Theorem 13, the nullity of $I_1$ is $n_2 + n_3 + \dots + n_m$. Thus, as G' is a dual of G,

$$r_1' = R' - (n_2 + n_3 + \dots + n_m).^*$$

* Which equals $n_1$.

Similarly,

$$r_2' = R' - (n_1 + n_3 + \dots + n_m),$$

and so on for $r_3', \dots, r_m'$. Adding these equations gives

$$r_1' + r_2' + \dots + r_m' = mR' - (m - 1)(n_1 + n_2 + \dots + n_m).$$

As $H_1, H_2, \dots, H_m$ are the components of G,

$$N = n_1 + n_2 + \dots + n_m.$$

Also, as G and G' are duals, by Theorem 20,

$$R' = N.$$

Hence

$$r_1' + r_2' + \dots + r_m' = mR' - (m - 1)R' = R'.$$

Let now $H_{11}', \dots, H_{1k_1}'$ be the components of $H_1'$ (there may be but one) and similarly for $H_2', \dots, H_m'$. Then, by Theorem 13,

$$r_i' = r_{i1}' + \dots + r_{ik_i}' \qquad (i = 1, \dots, m).$$

Adding these equations gives

$$\sum_{i,j} r_{ij}' = r_1' + \dots + r_m' = R'.$$

As the graphs $H_{11}', \dots, H_{mk_m}'$ are non-separable, Theorem 17 tells us that they are the components of G'. Hence G' has at least as many components as G. Similarly, G has at least as many components as G'. They have therefore the same number, m, of components. There are therefore m graphs in the set $H_{11}', \dots, H_{mk_m}'$. But there is at least one such graph in each graph $H_1', \dots, H_m'$, and there is therefore exactly one in each. Hence each graph $H_{i1}'$ fills out the graph $H_i'$, and the two sets of graphs $H_{11}', \dots, H_{mk_m}'$ and $H_1', \dots, H_m'$ are identical, that is, $H_1', \dots, H_m'$ are the components of G'. The rest of the theorem follows from Theorem 24.

As a special case of this theorem, we have

THEOREM 26. A dual of a non-separable graph is non-separable.
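The definition of a dual in this section is purely combinatorial, so on small graphs it can be checked by exhausting all subgraphs. The sketch below is an illustration only and not part of the original paper: the function names (`rank`, `is_dual`, `degree_sequence`) and the convention that corresponding arcs share a dictionary key are assumptions made for the example. It applies the test r' = R' - n to the three graphs G, G', G'' given after Theorem 22 (primes on vertex names are dropped, since each graph has its own vertex set).

```python
from itertools import combinations


def rank(vertices, arcs):
    """Rank R = V - P: vertices minus connected pieces (union-find)."""
    parent = {v: v for v in vertices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    pieces = len(parent)
    for u, v in arcs:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            pieces -= 1
    return len(parent) - pieces


def is_dual(g, g_prime):
    """Check r' = R' - n for every subgraph H of g.

    Graphs are dicts {arc name: (end, end)}; arcs with equal names
    correspond (a convention assumed for this sketch).
    """
    names = sorted(g)
    if sorted(g_prime) != names:
        return False
    verts = {v for ends in g.values() for v in ends}
    verts_p = {v for ends in g_prime.values() for v in ends}
    big_r_prime = rank(verts_p, g_prime.values())
    for size in range(len(names) + 1):
        for chosen in combinations(names, size):
            h = [g[a] for a in chosen]
            n = len(h) - rank(verts, h)  # nullity of H
            h_prime = [g_prime[a] for a in names if a not in chosen]
            if rank(verts_p, h_prime) != big_r_prime - n:
                return False
    return True


def degree_sequence(g):
    """Sorted list of the number of arc ends at each vertex."""
    deg = {}
    for u, v in g.values():
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return sorted(deg.values())


G = {"alpha": ("a", "b"), "beta": ("a", "b"), "gamma": ("a", "c"),
     "delta": ("c", "b"), "epsilon": ("a", "d"), "zeta": ("d", "b")}
G1 = {"alpha": ("a", "b"), "beta": ("c", "d"), "gamma": ("a", "d"),
      "delta": ("a", "d"), "epsilon": ("b", "c"), "zeta": ("b", "c")}
G2 = {"alpha": ("a", "b"), "beta": ("b", "c"), "gamma": ("a", "d"),
      "delta": ("a", "d"), "epsilon": ("c", "d"), "zeta": ("c", "d")}

print(is_dual(G, G1), is_dual(G, G2))
print(degree_sequence(G1), degree_sequence(G2))
```

The exhaustive loop visits all 2^E subgraphs, so it is only practical for a handful of arcs. Comparing degree sequences is one simple way to see that two graphs cannot be congruent: congruent graphs must have the same sequence.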

9. Planar graphs. Up till now, we have been considering abstract graphs alone. However, the definition of a planar graph is topological in character. This section may be considered as an application of the theory of abstract graphs to the theory of topological graphs.

Definitions. A topological graph is called planar if it can be mapped in a (1, 1) continuous manner on a sphere (or a plane). For the present, we shall say that an abstract graph is planar if the corresponding topological graph is planar. Having proved Theorem 29, we shall be justified in using the following purely combinatorial definition: A graph is planar if it has a dual.

We shall henceforth talk about "graphs" simply, the terms applying equally well to either abstract or topological graphs.

LEMMA. If a graph can be mapped on a sphere, it can be mapped on a plane, and conversely.

Suppose we have a graph mapped on a sphere. We let the sphere lie on the plane, and rotate it so that the new north pole is not a point of the graph. By stereographic projection from this pole, the graph is mapped on the plane. The inverse of this projection maps any graph on the plane onto the sphere.

By the regions of a graph lying on a sphere or in a plane is meant the regions into which the sphere or plane is thereby divided. A given region of the graph is characterized by those arcs of the graph which form its boundary. If the graph is in a plane, the outside region is the unbounded region.

LEMMA. A planar graph may be mapped on a plane so that any desired region is the outside region.

We map the graph on a sphere, and rotate it so that the north pole lies inside the given region. By stereographic projection, the graph is mapped onto the plane so that the given region is the outside region.

We return now to the work in hand.

THEOREM 27. If the components of a graph G are planar, G is planar.

Suppose the graphs $G_1$ and $G_2$ are planar, and G' is formed by letting the vertices $a_1$ and $a_2$ of $G_1$ and $G_2$ coalesce. We shall show that G' is planar. Map $G_1$ on a sphere, and map $G_2$ on a plane so that one of the regions adjacent to the vertex $a_2$ is the outside region. Shrink the portion of the plane containing $G_2$ so it will fit into one of the regions of $G_1$ adjacent to $a_1$. Drawing $a_1$ and $a_2$ together, we have mapped G' on the sphere.* The theorem follows as a repeated application of this process.

* Here and in a few other places we are using point-set theorems which, however, are geometrically evident.

THEOREM 28. Let G and G' be dual graphs, and let α(ab), α'(a'b') be two corresponding arcs. Form $G_1$ from G by dropping out the arc α(ab), and form $G_1'$ from G' by dropping out the arc α'(a'b'), and letting the vertices a' and b' coalesce if they are not already the same vertex. Then $G_1$ and $G_1'$ are duals, preserving the correspondence between their arcs.

Let $H_1$ be any subgraph of $G_1$ and let $H_1'$ be the complement of the corresponding subgraph of $G_1'$.

Case 1. Suppose the vertices a' and b' were distinct in G'. Let H be the subgraph of G identical with $H_1$. Then

$$n_1 = n.$$

Let H' be the complement in G' of the subgraph corresponding to H. Then

$$r' = R' - n.$$

Now H' is the subgraph in G' corresponding to $H_1'$ in $G_1'$, except that H' contains the arc α'(a'b'), which is not in $H_1'$. Thus if we drop out α'(a'b') from H' and let a' and b' coalesce, we form $H_1'$. In this operation, the number of connected pieces is unchanged, while the number of vertices is decreased by 1. Hence

$$r_1' = r' - 1.$$

As a special case of this equation, if H' contains all the arcs of G', we find

$$R_1' = R' - 1.$$

These equations give

$$r_1' = R_1' - n_1.$$

Thus $G_1'$ is a dual of $G_1$.

Case 2. Suppose a' and b' are the same vertex in G'. In this case, defining H and H' as before, we form $H_1'$ from H' by dropping out the arc α'(a'a'). This leaves the number of vertices and the number of connected pieces unchanged. Thus two of the equations in Case 1 are replaced by the equations

$$r_1' = r', \qquad R_1' = R'.$$

The other equations are as before, so we find again that $G_1'$ is a dual of $G_1$. The theorem is now proved.

THEOREM 29. A necessary and sufficient condition that a graph be planar is that it have a dual.

We shall prove first the necessity of the condition. Given any planar graph G, we map it onto the surface of a sphere. If the nullity of G is N, it divides the sphere into N + 1 regions. For let us construct G arc by arc. Each time we add an arc joining two separate pieces, the nullity and the number of regions remain the same. Each time we add an arc joining two vertices in the same connected piece, the nullity and the number of regions are each increased by 1. To begin with, the nullity was 0 and the number of regions was 1. Therefore, at the end, the number of regions is N + 1.

We construct G' as follows: In each region of the graph G we place a point, a vertex of G'. Therefore G' contains V' = N + 1 vertices. Crossing each arc of G we place an arc, joining the vertices of G' lying in the two regions the arc of G separates (which may in particular be the same region, in which case this arc of G' is a 1-circuit). The arcs of G and G' are now in (1, 1) correspondence.

G' is the dual of G in the ordinary sense of the word. We must show it is the dual as we have defined the term. Let us build up G arc by arc, removing the corresponding arc of G' each time we add an arc to G. To begin with, G contains no arcs and G' contains all its arcs, and at the end of the process, G contains all its arcs and G' contains no arcs. We shall show

(1) each time the nullity of G is increased by 1 upon adding an arc, the number of connected pieces in G' is increased by 1 in removing the corresponding arc, and
(2) each time the nullity of G remains the same, the number of connected pieces in G' remains the same.

To prove (1) we note that the nullity of G is increased by 1 only when the arc we add joins two vertices in the same connected piece. Let ab be such an arc. As a and b were already connected by a chain, this chain together with ab forms a circuit P. Let a'b' be the arc of G' corresponding to ab. Before we removed it, a' and b' were connected. Removing it, however, disconnects them. For suppose there were still a chain C' joining them. As a' and b' are on opposite sides of the circuit P, C' must cross P, by the Jordan Theorem, that is, an arc of C' must cross an arc of P. But we removed this arc of C' when we put in the arc of P it crosses. (1) is now proved.

The total increase in the nullity of G during the process is of course just N. Therefore the increase in the number of connected pieces in G' must be at least N. But G' was originally in at least one connected piece, and is at the end of the process in V' = N + 1 connected pieces. Thus the increase in the number of connected pieces in G' is just N (hence, in particular, G' itself is connected) and therefore this number increases only when the nullity of G increases, which proves (2).

Let now H be any subgraph of G, let H' be the complement of the corresponding subgraph of G', and let H' include all the vertices of G'. We build up H arc by arc, at the same time removing the corresponding arcs of G'. Thus when H is formed, H' also is formed. By (1) and (2), the increase in the number of connected pieces in forming H' from G' equals the nullity of H, that is,

$$p' - P' = n.$$

But

$$r' = V' - p', \qquad R' = V' - P',$$

as G' and H' contain the same vertices. Therefore

$$r' = R' - n,$$

that is, G' is a dual of G.

To prove the sufficiency of the condition, we must show that if a graph has a dual, it is planar. It is enough to show this for non-separable graphs. For if the separable graph G has a dual, its components have duals, by Theorem 25, hence its components are planar, and hence G is planar, by Theorem 27. This part of the theorem is therefore a consequence of the following theorem:

THEOREM 30. Let the non-separable graph G have a dual G'. Then we can map G and G' together on the surface of a sphere so that (1) corresponding arcs in G and G' cross each other, and no other pair of arcs cross each other, and (2) inside each region of one graph there is just one vertex of the other graph.

The theorem is obviously true if G contains a single arc. (The dual of an arc ab is an arc a'a', and the dual of an arc aa is an arc a'b'.) We shall assume it to be true if G contains fewer than E arcs, and shall prove it for any graph G containing E arcs. By Theorem 8, each vertex of G is on at least two arcs.

Case 1. G contains a vertex b on but two arcs, ab and bc. As G is non-separable, there is a circuit containing these arcs. Thus dropping out one of them will not alter the rank, while dropping out both reduces the rank by 1. As G' is a dual of G, the arcs corresponding to these two arcs are each of nullity 0, while the two arcs taken together are of nullity 1. They are thus of the form α'(a'b'), β'(a'b'), the first corresponding to ab, and the second, to bc. Form $G_1$ from G by dropping out the arc bc and letting the vertices b and c coalesce, and form $G_1'$ from G' by dropping out the arc β'(a'b'). By Theorem 28, $G_1$ and $G_1'$ are duals, preserving the correspondence between the arcs. As these graphs contain fewer than E arcs,* we can, by hypothesis, map them together on a sphere so that (1) and (2) hold; in particular, α'(a'b') crosses ac. Mark a point on the arc ac of $G_1$ lying between the vertex c and the point where the arc α'(a'b') of G' crosses it. Let this be the vertex b, dividing the arc ac into the two arcs ab and bc. Draw the arc β'(a'b') crossing the arc bc. We have now reconstructed G and G', and they are mapped on a sphere so that (1) and (2) hold.

* Obviously $G_1$ is non-separable.

Case 2. Each vertex of G is on at least three arcs. As then G contains no suspended chain, and G is not a circuit and therefore is of nullity N > 1, we can, by Theorem 18, drop out an arc ab so that the resulting graph $G_1$ is non-separable. G' is non-separable, by Theorem 26, and hence the arc a'b' corresponding to ab in G is not a 1-circuit. Drop it out and let the vertices a', b' coalesce into the vertex $a_1'$, forming the graph $G_1'$. By Theorem 28, $G_1$ and $G_1'$ are duals, and thus $G_1'$ also is non-separable.

Consider the arcs of G' on a'. If we drop them out, the resulting graph G'' has a rank one less than that of G'. For if its rank were still less, G'' would be in at least three connected pieces, one of them being the vertex a'. Let c and d be vertices in two other connected pieces of G''. They are joined by no chain in G'', and hence every chain joining them in G' must pass through a', which contradicts Theorem 6. If we put back any arc, the rank is brought back to its original value, as a' is then joined to the rest of the graph.

Hence, G' being a dual of G, the arcs of G corresponding to these arcs are together of nullity 1, while dropping out one of them reduces the nullity to 0. Therefore, by Theorem 9, these arcs form a circuit P. One of these arcs is the arc ab. The remaining arcs form a chain C. Similarly, the arcs of G corresponding to the arcs of G' on b' form a circuit Q, and this circuit minus the arc ab forms a chain D. C and D have the vertices a and b as end vertices. Also, the arcs of $G_1$ corresponding to the arcs of $G_1'$ on $a_1'$ form a circuit R. These arcs of $G_1'$ are the arcs of G' on either a' or b', except for the arc a'b' we dropped out. Thus the arcs of $G_1$ forming the circuit R are the arcs of the chains C and D.

As $G_1$ and $G_1'$ contain fewer than E arcs, we can map them together on a sphere so that properties (1) and (2) hold. $a_1'$ lies on one side of the circuit R, which we call the inside. Each arc of R is crossed by an arc on $a_1'$, and thus there are no other arcs of $G_1'$ crossing R. There is no part of $G_1'$ lying inside R other than $a_1'$, for it could have only this vertex in common with the rest of $G_1'$, and $G_1'$ would be separable. Also, there is no part of $G_1$ lying inside R, for any arc would have to be crossed by an arc of $G_1'$, and any vertex would have to be joined to the rest of $G_1$ by an arc, as $G_1$ is non-separable.

Let us now replace $a_1'$ by the two vertices a' and b', and let those arcs abutting on $a_1'$ that were formerly on a' be now on a', and those formerly on b', now on b'. As the first set of arcs all cross the chain C, and the second set all cross the chain D, we can do this in such a way that no two of the arcs cross each other. We may now join a and b by the arc ab, crossing none of these arcs. This divides the inside of R into two parts, in one of which a' lies, and in the other of which b' lies. We may therefore join a' and b' by the arc a'b', crossing the arc ab. G and G' are now reconstructed, and are mapped on the sphere as required. This completes the proof of the theorem, and therefore of Theorem 29.

THEOREM 31. A necessary and sufficient condition that a graph be planar is that it contain neither of the two following graphs as subgraphs:

$G_1$. This graph is formed by taking five vertices a, b, c, d, e, and joining each pair by an arc or suspended chain.

$G_2$. This graph is formed by taking two sets of three vertices, a, b, c, and d, e, f, and joining each vertex in one set to each vertex in the other set by an arc or suspended chain.

This theorem has been proved by Kuratowski.* It would be of interest to show the equivalence of the conditions of the theorem and Theorem 29 directly, by combinatorial methods. We shall do part of this here, in the following theorem:†

THEOREM 32. Neither of the graphs $G_1$ and $G_2$ has a dual.

Suppose the graph $G_1$ had a dual. By Theorem 28, if $G_1$ contains a suspended chain, we can drop out one of its arcs and let the two end vertices coalesce, and the resulting graph will have a dual. Continuing, we see that the graph $G_3$, in which each pair of vertices of the set a, b, c, d, e are joined by an arc, must have a dual. Similarly, if $G_2$ has a dual, then the graph $G_4$, in which each vertex of the set a, b, c is joined to each vertex of the set d, e, f by an arc, must have a dual. Both of these are impossible.

* Fundamenta Mathematicae, vol. 15 (1930), pp. 271-283.

† The other half has recently been proved by the author. See Bulletin of the American Mathematical Society, abstract (38-1-39). (Note added in proof.)


(a) The graph $G_3$. To avoid subscripts, let us call it G. Suppose it had a dual, G'. Then

$$R = N' = 4, \qquad N = R' = 6, \qquad E = E' = 10.$$

If G' has isolated vertices, we drop them out, which does not alter its relation to G.

(1) There are no 1-circuits, 2-circuits or triangles in G'. For if there were, dropping out the corresponding arcs of G would have to reduce the rank of G. But we cannot reduce its rank without dropping out at least four arcs.

(2) G' contains at least five quadrilaterals. For if we drop out the four arcs on any vertex of G, the rank is reduced by 1, and if we put back any of these arcs, the rank is brought back to its original value; Theorem 9 now applies.

(3) At least two of these quadrilaterals have an arc in common, as there are but ten arcs in G'. There are just two ways of forming two quadrilaterals out of fewer than eight arcs without forming any 2-circuits or triangles. One of these graphs, $I_1'$, contains the arcs a'b', b'e', a'c', c'e', a'd', d'e'. The other, $I_2'$, contains the arcs a'e', e'f', f'b', b'a', e'c', c'd', d'f'. But there is no subgraph of the type $I_1'$ in G', for this subgraph is of rank 4 and nullity 2, and there would have to be a subgraph of G of rank 2 and nullity 2, and such a graph contains a 1- or a 2-circuit, of which there are none in G. Hence G' contains a subgraph $I_2'$.

(4) Each vertex of G' is on at least three arcs, as there are no 1- or 2-circuits in G. Each of the vertices a', b', c', d' of $I_2'$ is on but two arcs. Hence there must be another arc on each of these vertices. As $I_2'$ contains seven arcs, and G' contains but ten, one of the three arcs left must join two of these vertices. But if we add an arc a'b' or c'd', we would form a 2-circuit; if we add an arc a'c' or b'd', we would form a triangle; if we add an arc a'd' or b'c', we would form a graph of the type $I_1'$. As G' contains none of these graphs, we have a contradiction.

(b) The graph $G_4$. Let us call it G. If it has a dual G', then

$$R = N' = 5, \qquad N = R' = 4, \qquad E = E' = 9.$$

We proceed exactly as for the graph $G_3$. In outline:

(1) G' contains no 1- or 2-circuits.


(2) There is no subgraph of G' containing four vertices, each pair being joined by an arc. For this graph is of rank 3 and nullity 3, and G would have to contain a subgraph of rank 2 and nullity 1, that is, a 2-circuit.

(3) There are at least nine subgraphs of G' of rank 3 and nullity 2, and hence of the form a'b', a'c', b'c', b'd', c'd', as there are nine quadrilaterals in G.

(4) As G' contains but nine arcs, two of these subgraphs have an arc in common. There is therefore a subgraph of one of the forms $I_1'$: a'e', a'b', b'e', a'c', c'e', a'd', d'e', or $I_2'$: a'c', a'b', b'c', b'e', c'e', c'd', d'e'.

(5) Each vertex of G' is on at least four arcs. Now each of the graphs $I_1'$, $I_2'$ contains seven arcs. We have but two arcs left which we must place so that each vertex of $I_1'$ or $I_2'$ is on at least four arcs. This cannot be done. The theorem is now proved.

Theorem 31 together with this theorem gives an alternative proof of the second part of Theorem 29. For suppose a graph G had a dual. Then it contains neither the graph $G_1$ nor $G_2$. For if it did, dropping out all the arcs of G but those forming one of these graphs, Theorem 28 tells us that this graph has a dual. But we have just seen that this is not so. Hence, by Theorem 31, G is planar.

Euler's formula. Map any connected planar graph G on a sphere, and construct its connected dual G' as described in the proof of Theorem 29. Then in each region of G there is a vertex of G'. Let F be the number of regions (or faces) in G. Then

$$R' = N, \qquad R = V - 1, \qquad R' = V' - 1, \qquad V' = F,$$

and hence

$$V - E + F = R + 1 - E + N + 1 = 2,$$

which is Euler's formula.

HARVARD UNIVERSITY, CAMBRIDGE, MASS.
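As a numerical illustration of the derivation of Euler's formula, the quantities R, N and F = N + 1 can be computed from a list of arcs alone. This is a minimal sketch, not part of the original paper: the function names `count_pieces` and `euler_terms` are hypothetical, and the two sample graphs (the arcs of a cube and of a tetrahedron) are assumed planar, which the code does not verify.

```python
def count_pieces(vertices, arcs):
    """Number of connected pieces P, computed with a small union-find."""
    parent = {v: v for v in vertices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    pieces = len(parent)
    for u, v in arcs:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            pieces -= 1
    return pieces


def euler_terms(vertices, arcs):
    """Return (V, E, F), taking F = N + 1 as in the proof of Theorem 29."""
    v, e = len(vertices), len(arcs)
    p = count_pieces(vertices, arcs)
    big_r = v - p        # rank
    big_n = e - big_r    # nullity
    return v, e, big_n + 1


# Cube: vertices are 3-bit numbers, arcs join numbers differing in one bit.
cube_vertices = range(8)
cube_arcs = [(u, u ^ (1 << b)) for u in range(8) for b in range(3)
             if u < (u ^ (1 << b))]

# Tetrahedron: every pair of four vertices is joined.
tetra_vertices = range(4)
tetra_arcs = [(u, w) for u in range(4) for w in range(u + 1, 4)]

for name, vs, arcs in (("cube", cube_vertices, cube_arcs),
                       ("tetrahedron", tetra_vertices, tetra_arcs)):
    v, e, f = euler_terms(vs, arcs)
    print(name, v, e, f, v - e + f)
```

For a connected graph the sum V - E + F computed this way is 2 by construction, which is exactly the content of the derivation. The point of the example is that the resulting F agrees with the familiar face counts of the two solids.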

Reprinted from Trans. Amer. Math. Soc. 34 (1932), 339-362

A Combinatorial Problem in Geometry

P. Erdős and G. Szekeres

Manchester

INTRODUCTION.

Our present problem has been suggested by Miss Esther Klein in connection with the following proposition.

From 5 points of the plane of which no three lie on the same straight line it is always possible to select 4 points determining a convex quadrilateral.

We present E. Klein's proof here because later on we are going to make use of it. If the least convex polygon which encloses the points is a quadrilateral or a pentagon the theorem is trivial. Let therefore the enclosing polygon be a triangle ABC. Then the two remaining points D and E are inside ABC. Two of the given points (say A and C) must lie on the same side of the connecting straight line DE. Then it is clear that AEDC is a convex quadrilateral.

Miss Klein suggested the following more general problem. Can we find for a given n a number N(n) such that from any set containing at least N points it is possible to select n points forming a convex polygon?

There are two particular questions: (1) does the number N corresponding to n exist? (2) If so, how is the least N(n) determined as a function of n? (We denote the least N by $N_0(n)$.)

We give two proofs that the first question is to be answered in the affirmative. Both of them will give definite values for N(n) and the first one can be generalised to any number of dimensions. Thus we obtain a certain preliminary answer to the second question. But the answer is not final for we generally get in this way a number N which is too large. Mr. E. Makai proved that $N_0(5) = 9$, and from our second demonstration, we obtain N(5) = 21 (from the first a number of the order $2^{10000}$). Thus it is to be seen, that our estimate lies pretty far from the true limit $N_0(n)$.

It is notable that $N(3) = 3 = 2 + 1$, $N_0(4) = 5 = 2^2 + 1$, $N_0(5) = 9 = 2^3 + 1$. We might conjecture therefore that $N_0(n) = 2^{n-2} + 1$, but the limits given by our proofs are much larger.

It is desirable to extend the usual definition of convex polygon to include the cases where three or more consecutive points lie on a straight line.

FIRST PROOF.

The basis of the first proof is a combinatorial theorem of Ramsey 1). In the introduction it was proved that from 5 points it is always possible to select 4 forming a convex quadrangle. Now it can be easily proved by induction that n points determine a convex polygon if and only if any 4 points of them form a convex quadrilateral. Denote the given points by the numbers 1, 2, 3, ..., N, then any k-gon of the set of points is represented by a set of k of these numbers, or as we shall say, by a k-combination. Let us now suppose each n-gon to be concave, then from what we observed above we can divide the 4-combinations into two classes (i.e. into "convex" and "concave" quadrilaterals) such that every 5-combination shall contain at least one "convex" combination and each n-combination at least one concave one. (We regard one combination as contained in another, if each element of the first is also an element of the second.) From Ramsey's theorem, it follows that this is impossible for a sufficiently large N.

Ramsey's theorem can be stated as follows:

Let k, l, i be given positive integers, k ≥ i; l ≥ i. Suppose that there exist two classes, α and β, of i-combinations of m elements such that each k-combination shall contain at least one combination from class α and each l-combination shall contain at least one combination from class β. Then for sufficiently great $m \ge m_i(k, l)$ this is not possible.

Ramsey enunciated his theorem in a slightly different form. In other words: if the members of α had been determined as above at our discretion and $m \ge m_i(k, l)$, then there must be at least one l-combination with every combination of order i belonging to class α.

1) F. P. RAMSEY, Collected papers. On a problem of formal logic, 82-111. Recently SKOLEM also proved Ramsey's theorem [Fundamenta Math. 20 (1933), 254-261].

50

,\ Comhinatorial Prohkm ill Geometry,

-J.G.'i

\Vc give here a new proof of Hums<'y's theof('Ill, which diff£'rs entirely from the pre\'ious OIWS and gins for l1I i (l.·, I) slightly smallc'r limits. a) If i - I , the theorem holds for ('\'('ry k and I. For if we sel('ct out of 1/1 some ddermim·d elements (eomhinatiolls of order 1) as the e1ass 70, so that c\'ery !.--gon (this shol'tc-r denomination will he gi\'('n to the eomhinatioll of onlc'r /.') must contain at least one of the z dements, there arc' at most (/.'-1) elements whic·h do not hc'long to thc' e1l1SS z. Then tlwre must he at least (lII-k+1) dements of z. If (lII-k~1) ~ I. then there must he an I-gem of the z delllellts and thus m~k-T-I-~

which is c\'idel1tly false for suffieiently grcat m. Suppose then that i > 1. h) Thc thcorelll is tri\'ial, if k or I equals i. If, for ('xample, I.- = i, th£,11 it is sufficient to ehoos(' 111 = I. }t'or I.- = 1 llIeans that all i-gcms are z comhinations and thus in \'i)'tue of III C-~ 1 the)'e is one polygon (i. c'. the I-gcm form<,d of all t h(' denwnts), whose i-gons ar£' all z-('omiJinatiolls. The argum(,nt for 1 = i runs similarly. c) Suppose finally that k > i; and suppose that the theorem holds for (i -1) and ('n'ry I.' and I, further for i. k, 1 --, 1 and i, k ,- 1, I. \Ye shall pro\"(' that it will hold for i. I." 1 also and in \'irtul' of (a) and (h) we Illay say that the theol'elll is pl'o\'('d for all i, k. I. Suppose then that \\'C' are allle to ('arr~' out the di"isioll of the i-pol~'gons mentioned aho\'(·. Furth('r let /,-' he so great that if in ('\'(TY I-gon of 1.-' dements tht'rc' is at least 0111' rJ cOlllhination, then there is onc' (I.. -1 )-gcl/l ;111 of whose i-gons are rJ eomhinations. This ehoiec of 1.-' is always possihle in \'iltue or til(' ilH\udiollhypothesis, wc' ha\'(' only to ehoose 1.-' ,~~ lIIi(l.. --I, I). Similarly we choose /' so great that if ('Heh k-gcm of /' elements contains at least one z ('omhinatiol1. then tht·'·(' is olle (/-1)gon all of whose i-gellls aI'(' z c'olllhinatiolls. ,,'C' thl'1I take 11/ larger than k' and 1'; and let (a1,llz, ... ,ak ,) "..-1

he an arhitrary k' -gcm of' the first (11-1) elC'lllents. By h~'poth('sis ca(~h I-gem ('ontains at least one II comhination. 11<'II('e owing to the choiC'(' of' !.-', A ('ontains 011(' (1.--1 )-gclll (a"", (/"" ... , (1.m o_,) whose i-gons all helong to thl' class (I. Sin('e ill (a"", ... , (1./1/_-,,1/) ao


P. Erdős and G. Szekeres.

there is at least one $\alpha$ combination, it is clear that this must be one of the $i$-gons

$$B = (a_{m_{r_1}}, a_{m_{r_2}}, \ldots, a_{m_{r_{i-1}}}, n).$$

In just the same way we may prove, by interchanging the roles of $k$, $k'$ with those of $l$, $l'$ and of $\alpha$ with $\beta$, that if

$$A' = (b_1, b_2, \ldots, b_{l'})$$

is an arbitrary $l'$-gon of the first $(n-1)$ elements, then among the $i$-gons

$$B' = (b_{r_1}, b_{r_2}, \ldots, b_{r_{i-1}}, n)$$

there must be a $\beta$ combination. Thus we can divide the $(i-1)$-gons of the first $(n-1)$ elements into classes $\alpha'$ and $\beta'$ so that each $k'$-gon $A$ shall contain at least one $\alpha'$ combination $B$ and each $l'$-gon $A'$ at least one $\beta'$ combination $B'$. But, by the induction hypothesis, this is impossible for $m \geq m_{i-1}(k', l') + 1$.

By following the induction, it is easy to obtain for $m_i(k, l)$ the following functional equation:

$$m_i(k, l) = m_{i-1}\bigl[m_i(k-1, l),\; m_i(k, l-1)\bigr] + 1. \tag{1}$$

By this recurrence formula and the initial values

$$m_1(k, l) = k + l - 1, \qquad m_i(i, l) = l, \qquad m_i(k, i) = k \tag{2}$$

obtained from (a) and (b) we may easily calculate every $m_i(k, l)$. We obtain e.g. easily

$$m_2(k+1, l+1) = \binom{k+l}{k}.$$
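The recurrence (1) with the initial values (2) is easy to evaluate mechanically. The following is a minimal sketch in Python, added for illustration and not part of the original paper; the function name `m` and the argument order `(i, k, l)` are assumptions made here. It requires $k \geq i$ and $l \geq i$, and for $i = 2$ it reproduces the binomial values $m_2(k+1, l+1) = \binom{k+l}{k}$.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def m(i: int, k: int, l: int) -> int:
    """Limit m_i(k, l) from recurrence (1) with the initial values (2)."""
    if k < i or l < i:
        raise ValueError("the recurrence assumes k >= i and l >= i")
    if i == 1:
        return k + l - 1  # case (a): m_1(k, l) = k + l - 1
    if k == i:
        return l          # case (b): m_i(i, l) = l
    if l == i:
        return k          # case (b): m_i(k, i) = k
    # case (c): m_i(k, l) = m_{i-1}[m_i(k-1, l), m_i(k, l-1)] + 1
    return m(i - 1, m(i, k - 1, l), m(i, k, l - 1)) + 1


# Compare the i = 2 values with the binomial coefficient for small k, l.
for k in range(1, 5):
    for l in range(1, 5):
        assert m(2, k + 1, l + 1) == comb(k + l, k)
```

For $i \geq 3$ the values grow very quickly, so this naive recursion is only practical for small arguments.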

The function mentioned in the introduction has the form $m_4(5, n)$. Finally, for the special case $i = 2$, we give a graphotheoretic formulation of Ramsey's theorem and present a very simple proof of it.

THEOREM: In an arbitrary graph let the maximum number of independent points²) be $k$; if the number of points is $N \geq m(k, l)$ then there exists in our graph a complete graph³) of order $l$.

²) Two points are said to be independent if they are not connected; $k$ points are independent if every pair is independent.

³) A complete graph is one in which every pair of points is connected.


A Combinatorial Problem in Geometry.

PROOF. For $l = 1$, the theorem is trivial for any $k$, since the maximum number of independent points is $k$ and if the number of points is $(k+1)$, there must be an edge (complete graph of order 1). Now suppose the theorem proved for $(l-1)$ with any $k$. Then at least $\frac{N-k}{k}$ edges start from one of the independent points, since each of the remaining $N-k$ points is connected with at least one of the $k$ independent points. Hence if

$$\frac{N-k}{k} \geq m(k, l-1), \quad \text{i.e.} \quad N \geq k \cdot m(k, l-1) + k,$$

then, out of the end points of these edges we may select, in virtue of our induction hypothesis, a complete graph whose order is at least $(l-1)$. As the points of this graph are connected with the same point, they form together a complete graph of order $l$.

SECOND PROOF.

The foundation of the second proof of our main theorem is formed partly by geometrical and partly by combinatorial considerations. We start from some similar problems and we shall see that the numerical limits are more accurate than in the previous proof; they are in some respects exact. Let us consider the first quarter of the plane, whose points are determined by coordinates $(x, y)$. We choose $n$ points with monotonously increasing abscissae⁴).

THEOREM: It is always possible to choose at least $\sqrt{n}$ points with increasing abscissae and either monotonously increasing or monotonously decreasing ordinates. If two ordinates are equal, the case may equally be regarded as increasing or decreasing.

⁴) The same problem was considered independently by Richard Rado.

Let us denote by $f(n, n)$ the minimum number of the points out of which we can select $n$ monotonously increasing or decreasing ordinates. We assert that

$$f(n+1, n+1) = f(n, n) + 2n - 1. \tag{6}$$

Let us select $n$ monotonously increasing or decreasing points out of the $f(n, n)$. Let us replace the last point by one of the $(2n-1)$ new points. Then we shall have once more $f(n, n)$ points, out


of which we can select as before $n$ monotonous points. Now we replace the last point by one of the new ones and so on. Thus we obtain $2n$ points, each an end-point of a monotonous set. Suppose that among them $(n+1)$ are end-points of monotonously increasing sets. Then if $y_l \geq y_k$ for $l > k$ we add $P_l$ to the monotonously increasing set of $P_k$ and thus, with it, we shall have an increasing set of $(n+1)$ points. If $y_k \geq y_l$ for every $k < l$ then the $(n+1)$ decreasing end-points themselves give the monotonous set of $(n+1)$ members. If between the $2n$ points there are at least $(n+1)$ end-points of monotonous decreasing sets, the proof will run in just the same way. But it may happen that, out of the $2n$ points, just $n$ are the end-points of increasing sets, and $n$ the end-points of decreasing sets. Then by the same reasoning, the end-points of the decreasing sets necessarily increase. But after the last end-point $P$ there is no point, for its ordinate would be greater or smaller than that of $P$. If it is greater, then together with the $n$ end-points it forms a monotonously increasing $(n+1)$ set and if it is smaller, with the $n$ points belonging to $P$, it forms a decreasing set of $(n+1)$ members. But by the same reasoning the last of the $n$ increasing end-points $Q$ ought to be also an extreme one and that is evidently impossible. Thus we may deduce by induction

$$f(n+1, n+1) = n^2 + 1. \tag{7}$$

Similarly let $f(i, k)$ denote the minimum number of points out of which it is always possible to select either $i$ monotonously increasing or $k$ monotonously decreasing points. We have then

$$f(i, k) = (i-1)(k-1) + 1. \tag{8}$$
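The limits $f(n+1, n+1) = n^2 + 1$ and $f(i, k) = (i-1)(k-1) + 1$ can be checked on small cases. The sketch below is an illustration added here, not the authors' argument, and its function names are hypothetical. `longest_monotone` finds, by a simple quadratic scan over ordinates listed by increasing abscissa, the longest increasing and the longest decreasing selection. `extremal_points` builds $(i-1)(k-1)$ ordinates, arranged as $i-1$ decreasing blocks of $k-1$ values each, which contain neither $i$ increasing nor $k$ decreasing points.

```python
def longest_monotone(ys):
    """Lengths of the longest increasing and longest decreasing selections.

    ys lists the ordinates in order of increasing abscissa. Equal ordinates
    count both as increasing and as decreasing, as in the text.
    """
    up = [1] * len(ys)    # up[j]: longest increasing selection ending at j
    down = [1] * len(ys)  # down[j]: longest decreasing selection ending at j
    for j in range(len(ys)):
        for a in range(j):
            if ys[a] <= ys[j]:
                up[j] = max(up[j], up[a] + 1)
            if ys[a] >= ys[j]:
                down[j] = max(down[j], down[a] + 1)
    return max(up, default=0), max(down, default=0)


def extremal_points(i, k):
    """(i-1)(k-1) ordinates: i-1 blocks, each a decreasing run of k-1 values.

    Every value of a later block exceeds every value of an earlier block.
    """
    return [b * (k - 1) + (k - 2 - j) for b in range(i - 1) for j in range(k - 1)]


# Six points for i = 4, k = 3: at most 3 increasing and at most 2 decreasing.
assert longest_monotone(extremal_points(4, 3)) == (3, 2)
```

An increasing selection can take at most one point from each block and a decreasing selection must stay inside one block, which is why the construction just misses both lengths.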

The proof is similar to the previous one. It is not difficult to see that this limit is exact, i.e. we can give $(i-1)(k-1)$ points such that it is impossible to select out of them the desired number of monotonously increasing or decreasing ordinates.

We solve now a similar problem: $P_1, P_2, \ldots$ are given points on a straight line. Let $f_1(i, k)$ denote the minimum number of points such that proceeding from left to right we shall be able to select either $i$ points so that the distances of two neighbouring points monotonously increase or $k$ points so that the same distances monotonously decrease. We assert that

$$f_1(i, k) = f_1(i-1, k) + f_1(i, k-1) - 1. \tag{9}$$


Let the point $C$ bisect the distance $AB$ ($A$ and $B$ being the first and the last points). If the total number of points is $f_1(i-1, k) + f_1(i, k-1) - 1$, then either the number of points in the first half is at least $f_1(i-1, k)$, or else there are in the second half at least $f_1(i, k-1)$ points. If in the first half there are $f_1(i-1, k)$ points then either there are among them $k$ points whose distances, from left to right, monotonously decrease and then the equation for $f_1(i, k)$ is fulfilled, or there must be $(i-1)$ points with increasing distances. By adding the point $B$, we have $i$ points with monotonously increasing distances. If in the second interval there are $f_1(i, k-1)$ points, the proof runs in the same way. (The case in which two distances are the same may be classed into either the increasing or the decreasing sets.)

It is possible to prove that this limit is exact. If the limits $f_1(i-1, k)$ and $f_1(i, k-1)$ are exact (i.e. if it is possible to give $[f_1(i-1, k) - 1]$ points so that there are no $(i-1)$ increasing nor $k$ decreasing distances) then the limit $f_1(i, k)$ is exact too. For if we choose e.g. $[f_1(i-1, k) - 1]$ points in the $0 \ldots 1$ interval, and $[f_1(i, k-1) - 1]$ points in the $2 \ldots 3$ interval, then we have $[f_1(i, k) - 1]$ points out of which it is equally impossible to select $i$ points with monotonously increasing and $k$ points with decreasing distances.

We now tackle the problem of the convex $n$-gon. If there are $n$ given points, there is always a straight line which is neither parallel nor perpendicular to any join of two points. Let this straight line be $e$. Now we regard the configuration $A_1 A_2 A_3 A_4 \ldots$ as convex, if the gradients of the lines $A_1 A_2, A_2 A_3, \ldots$ decrease monotonously, and as concave if they increase monotonously.

Let $f_2(i, k)$ denote the minimum number of the points such that from them we may pick out either $i$-sided convex or $k$-sided concave configurations. We assert that

$$f_2(i, k) = f_2(i-1, k) + f_2(i, k-1) - 1. \tag{10}$$

We consider the first $f_2(i-1, k)$ points. If out of them there can be taken a concave configuration of $k$ points then the equation for $f_2(i, k)$ is fulfilled. If not, then there is a convex configuration of $(i-1)$ points. The last point of this convex configuration we replace by another point. Then we have once more either $k$ concave points and then the assertion holds, or $(i-1)$ convex ones. We go on replacing the last point, until we have made use of all points. Thus we obtain $f_2(i, k-1)$ points, each of which is an end-point of a convex configuration of $(i-1)$


elements. Among them, there are either $i$ convex points and then our assertion is proved, or $(k-1)$ concave ones. Let the first of them be $A_1$, the second $A_2$. $A_1$ is the end-point of a convex configuration of $(i-1)$ points. Let the neighbour of $A_1$ in this configuration be $B$. If the gradient of $BA_1$ is greater than that of $A_1 A_2$, then $A_2$ together with the $(i-1)$ points form a convex configuration; if the gradient is smaller, then $B$ together with $A_1 A_2 \ldots$ form a concave $k$-configuration. This proves our assertion.

The deduction of the recurrence formula may start from the statement: $f_2(3, n) = f_2(n, 3) = n$ (by definition). Thus we easily obtain

$$f_2(i, k) = \binom{i+k-4}{i-2} + 1. \tag{11}$$

As before we may easily prove that the limit given by (11) is exact, i.e. it is possible to give $\binom{2k-4}{k-2}$ points such that they contain neither convex nor concave $k$ points.

Since by connection of the first and last points every set of $k$ convex or concave points determines a convex $k$-gon, it is evident that $\left[\binom{2k-4}{k-2} + 1\right]$ points always contain a convex $k$-gon.

And as in every convex $(2k-1)$ polygon there is always either a convex or a concave configuration of $k$ points, it is evident that it is possible to give $\binom{2k-4}{k-2}$ points, so that out of them no convex $(2k-1)$ polygon can be selected. Thus the limit is also estimated from below.

Professor D. König's lemma⁵) of infinity also gives a proof of the theorem that if $k$ is a definite number and $n$ sufficiently great, the $n$ points always contain a convex $k$-gon. But we thus obtain a pure existence-proof, which allows no estimation of the number $n$. The proof depends on the statement that if $M$ is an infinite set of points we may select out of it another convex infinite set of points.
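The passage from the recurrence $f_2(i, k) = f_2(i-1, k) + f_2(i, k-1) - 1$ with starting values $f_2(3, n) = f_2(n, 3) = n$ to the closed form $\binom{i+k-4}{i-2} + 1$ can be checked numerically. The following sketch is an added illustration; the names `f2` and `convex_polygon_bound` are assumptions made here, not notation of the paper.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def f2(i: int, k: int) -> int:
    """f_2(i, k) from the recurrence, starting from f_2(3, n) = f_2(n, 3) = n."""
    if i < 3 or k < 3:
        raise ValueError("the starting values are given for i, k >= 3")
    if i == 3:
        return k
    if k == 3:
        return i
    return f2(i - 1, k) + f2(i, k - 1) - 1


def convex_polygon_bound(k: int) -> int:
    """Number of points that always contain a convex k-gon, per the text."""
    return comb(2 * k - 4, k - 2) + 1


# The recurrence and the closed form agree for small arguments.
for i in range(3, 8):
    for k in range(3, 8):
        assert f2(i, k) == comb(i + k - 4, i - 2) + 1
```

Subtracting 1 from $f_2$ turns the recurrence into Pascal's rule, which is why a binomial coefficient appears.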

⁵) D. König, Über eine Schlussweise aus dem Endlichen ins Unendliche [Acta Szeged 3 (1927), 121-130].

Reprinted from Compositio Math. 2 (1935), 463-470


ON REPRESENTATIVES OF SUBSETS

P. HALL†.

1. Let a set $S$ of $mn$ things be divided into $m$ classes of $n$ things each in two distinct ways, (a) and (b); so that there are $m$ (a)-classes and $m$ (b)-classes. Then it is always possible to find a set $R$ of $m$ things of $S$ which is at one and the same time a C.S.R. (= complete system of representatives) for the (a)-classes, and also a C.S.R. for the (b)-classes.

This remarkable result was originally obtained (in the form of a theorem about graphs) by D. König‡.

In the present note we are concerned with a slightly different problem, viz. with the problem of the existence of a C.D.R. (= complete system of distinct representatives) for a finite collection of (arbitrarily overlapping) subsets of any given set of things. The solution, Theorem 1, is very simple. From it may be deduced a general criterion, viz. Theorem 3, for the existence of a common C.S.R. for two distinct classifications of a given set; where it is not assumed, as in König's theorem, that all the classes have the same number of terms. König's theorem follows as an immediate corollary.

† Received 23 April, 1934; read 26 April, 1934.

‡ D. König, "Über Graphen und ihre Anwendungen", Math. Annalen, 77 (1916), 453. For the theorem in the form stated above, cf. B. L. van der Waerden, "Ein Satz über Klasseneinteilungen von endlichen Mengen", Abhandlungen Hamburg, 5 (1927), 185; also E. Sperner, ibid., 232, for an extremely elegant proof.

2. Given any set $S$ and any finite system of subsets of $S$:

$$T_1, T_2, \ldots, T_m, \tag{1}$$

we are concerned with the question of the existence of a complete set of distinct representatives for the system (1); for short, a C.D.R. of (1). By this we mean a set of $m$ distinct elements of $S$:

$$a_1, a_2, \ldots, a_m, \tag{2}$$

such that $a_i$ belongs to $T_i$ for each $i = 1, 2, \ldots, m$. We may say, $a_i$ represents $T_i$. It is not necessary that the sets $T_i$ shall be finite, nor that they should be distinct from one another. Accordingly, when we speak of a system of


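$k$ of the sets (1), it is understood that $k$ formally distinct sets are meant, not necessarily $k$ actually distinct sets.

It is obvious that, if a C.D.R. of (1) does exist, then any $k$ of the sets (1) must contain between them at least $k$ elements of $S$. For otherwise it would be impossible to find distinct representatives for those $k$ sets. Our main result is to show that this obviously necessary condition is also sufficient. That is,

THEOREM 1. In order that a C.D.R. of (1) shall exist, it is sufficient that, for each $k = 1, 2, \ldots, m$, any selection of $k$ of the sets (1) shall contain between them at least $k$ elements of $S$.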
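The condition that any $k$ of the sets contain between them at least $k$ elements can be tested directly on small finite examples and compared with an actual search for distinct representatives. The sketch below is an added illustration with hypothetical function names. `hall_condition` tries every selection of sets by brute force. `find_cdr` uses the augmenting-path method familiar from bipartite matching, which is a different technique from the inductive proof given in this note. It assumes finite sets of hashable elements.

```python
from itertools import combinations


def hall_condition(sets):
    """True if every k of the sets contain between them at least k elements."""
    for k in range(1, len(sets) + 1):
        for chosen in combinations(sets, k):
            if len(set().union(*chosen)) < k:
                return False
    return True


def find_cdr(sets):
    """Return distinct representatives [a_1, ..., a_m] with a_i in T_i, or None."""
    owner = {}  # element -> index of the set it currently represents

    def assign(i, seen):
        # Try to give set i a representative, moving earlier choices if needed.
        for a in sets[i]:
            if a in seen:
                continue
            seen.add(a)
            if a not in owner or assign(owner[a], seen):
                owner[a] = i
                return True
        return False

    for i in range(len(sets)):
        if not assign(i, set()):
            return None
    reps = [None] * len(sets)
    for a, i in owner.items():
        reps[i] = a
    return reps


# Two copies of the same one-element set cannot both be represented.
assert not hall_condition([{1}, {1}]) and find_cdr([{1}, {1}]) is None
```

Because the selections are made by position, sets that are formally distinct but actually equal are handled as the text requires.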
If $A, B, \ldots$ are any subsets of $S$, their meet (the set of all elements common to $A, B, \ldots$) will be written

$$A \wedge B \wedge \ldots.$$

Their join (the set of all elements which lie in at least one of $A, B, \ldots$) will be written

$$A \vee B \vee \ldots.$$

To prove Theorem 1, we need the following

LEMMA. If (2) is any C.D.R. of (1), and if the meet of all the C.D.R. of (1) is the set $R = a_1, a_2, \ldots, a_\rho$ ($\rho$ can be 0, i.e. $R$ the null set), then the $\rho$ sets

$$T_1, T_2, \ldots, T_\rho$$

contain between them exactly $\rho$ elements, viz. the elements of $R$.

$R$ is, by definition, the set of all elements of $S$ which occur as representatives of some $T_i$ in every C.D.R. of (1). To prove the lemma, let $R'$ be the set of all elements $a$ of $S$ with the following property: there exists a sequence of suffixes $i, j, k, \ldots, l', l$ such that

$$a \in T_i, \quad a_i \in T_j, \quad a_j \in T_k, \quad \ldots, \quad a_{l'} \in T_l,$$

and, further, $l \leq \rho$.


First, we shall show that every element $a$ of $R'$ belongs to (2). For, if not, replace, in (2), $a_i, a_j, \ldots, a_l$ by $a, a_i, \ldots, a_{l'}$ respectively; we obtain a new C.D.R. of (1) which does not contain $a_l$. Hence $a_l$ does not belong to $R$, which contradicts $l \leq \rho$. There will be no loss of generality in assuming that

$$R' = a_1, a_2, \ldots, a_\omega \quad (\omega \geq \rho).$$

For it is clear that $R'$ contains $R$.

Next, it is clear that if $a$ is any element of $T_i$, where $i \leq \omega$, then $a \in R'$. For then $a_i \in R'$, and hence $j, k, \ldots, l$ can be found with $l \leq \rho$ and such that

$$a_i \in T_j, \quad a_j \in T_k, \quad \ldots, \quad a_{l'} \in T_l.$$

And $a \in T_i$ then shows that $a \in R'$ also. Hence every element of $T_i$ $(i \leq \omega)$ belongs to $R'$. In other words, the $\omega$ sets

$$T_1, T_2, \ldots, T_\omega$$

contain between them exactly $\omega$ elements, viz. the elements of $R'$. In every C.D.R. of (1), therefore, these $\omega$ $T_i$'s are necessarily represented by these same $\omega$ elements. This shows that $R'$ is contained in $R$. Hence $R' = R$ and $\omega = \rho$, and

$$R = T_1 \vee T_2 \vee \ldots \vee T_\rho.$$

This is the assertion of the lemma.

The proof of Theorem 1 now follows by induction over $m$. The case $m = 1$ is trivial. We assume then that any $k$ of the sets (1) contain between them at least $k$ elements of $S$, and also that the theorem is true for $m - 1$ sets. We may therefore apply the theorem to the $m - 1$ sets

$$T_1, T_2, \ldots, T_{m-1}. \tag{4}$$

These have, accordingly, at least one C.D.R. Hence (1) will also have at least one C.D.R., provided only that $T_m$ is not contained in all the C.D.R. of (4).


But if (without loss of generality)

$$R^* = a_1, a_2, \ldots, a_\rho \quad (\rho \geq 0)$$

is the meet of all the C.D.R. of (4), and if $T_m$ is contained in $R^*$, then, by the lemma, the $\rho + 1$ sets

$$T_1, T_2, \ldots, T_\rho, T_m$$

contain between them only $\rho$ elements, viz. those of $R^*$. This being contrary to hypothesis, $T_m$ is not contained in $R^*$: and so, if $a_m$ is any element of $T_m$ not in $R^*$, there exists a C.D.R. of (4) in which $a_m$ does not occur. This C.D.R. of (4) together with $a_m$ constitutes the desired C.D.R. of (1).

An elementary transformation of Theorem 1 gives

THEOREM 2. If $S$ is divided into any number of classes (e.g. by means of an equivalence relation),

$$S = S_1 \vee S_2 \vee S_3 \vee \ldots,$$

then there always exists a set of $m$ elements $a_1, a_2, \ldots, a_m$, no two of which belong to the same class, such that

$$a_i \in T_i \quad (i = 1, 2, \ldots, m),$$

provided only that, for each $k = 1, 2, \ldots, m$, any $k$ of the sets $T_i$ contain between them elements from at least $k$ classes.

Denote by $t_i$ the set of all classes $S_j$ for which the meet $S_j \wedge T_i$ is not null. The condition to be satisfied by the sets $T_i$ may now be expressed thus: any $k$ of the $t$'s contain between them at least $k$ members. Hence, by Theorem 1, the $t_i$ have a C.D.R.; let it be, for simplicity, $S_1, S_2, \ldots, S_m$, such that, for $i = 1, \ldots, m$, the set

$$S_i \wedge T_i = M_i$$

is not null. Choosing for $a_i$ an arbitrary element from $M_i$, the result follows.

A particular case of some interest is

THEOREM 3. If the set $S$ is divided into $m$ classes in two distinct ways,

$$S = S_1 \vee S_2 \vee \ldots \vee S_m = S_1' \vee S_2' \vee \ldots \vee S_m',$$


$$S_i \wedge S_j = S_i' \wedge S_j' = \text{null set}, \quad \text{for } i \neq j,$$

then, provided that, for each $k = 1, 2, \ldots, m$, any $k$ of the classes $S_i'$ always contain between them elements from at least $k$ of the classes $S_i$, it will always be possible to find $m$ elements of $S$ which are at one and the same time a C.S.R. for the classes $S_i$ and a C.S.R. for the classes $S_i'$.

The case in which all the classes have the same (finite) number of elements clearly fulfils the proviso. Theorem 3 then becomes the well-known theorem of König, referred to above. The generalization of König's theorem
Reprinted from J. London Math. Soc. 10 (1935),26-30


ON THE ABSTRACT PROPERTIES OF LINEAR DEPENDENCE.¹

By HASSLER WHITNEY.

1. Introduction. Let $C_1, C_2, \ldots, C_n$ be the columns of a matrix $M$. Any subset of these columns is either linearly independent or linearly dependent; the subsets thus fall into two classes. These classes are not arbitrary; for instance, the two following theorems must hold:

(a) Any subset of an independent set is independent.

(b) If $N_p$ and $N_{p+1}$ are independent sets of $p$ and $p+1$ columns respectively, then $N_p$ together with some column of $N_{p+1}$ forms an independent set of $p+1$ columns.

There are other theorems not deducible from these; for in § 16 we give an example of a system satisfying these two theorems but not representing any matrix. Further theorems seem, however, to be quite difficult to find. Let us call a system obeying (a) and (b) a "matroid." The present paper is devoted to a study of the elementary properties of matroids. The fundamental question of completely characterizing systems which represent matrices is left unsolved. In place of the columns of a matrix we may equally well consider points or vectors in a Euclidean space, or polynomials, etc.

This paper has a close connection with a paper by the author on linear graphs;² we say a subgraph of a graph is independent if it contains no circuit. Although graphs are, abstractly, a very small subclass of the class of matroids (see the appendix), many of the simpler theorems on graphs, especially on non-separable and dual graphs, apply also to matroids. For this reason, we carry over various terms in the theory of graphs to the present theory. Remarkably enough, for matroids representing matrices, dual matroids have a simple geometrical interpretation quite different from that in the case of graphs (see § 13).

¹ Presented to the American Mathematical Society, September, 1934.

² "Non-separable and planar graphs," Transactions of the American Mathematical Society, vol. 34 (1932), pp. 339-362. We refer to this paper as G.

The contents of the paper are as follows: In Part I, definitions of matroids in terms of the concepts rank, independence, bases, and circuits are considered, and their equivalence shown. Some common theorems are deduced (for instance Theorem 8). Non-separable and dual matroids are studied in


Part II; this section might replace much of the author's paper G. The subject of Part III is the relation between matroids and matrices. In the appendix, we completely solve the problem of characterizing matrices of integers modulo 2, of interest in topology.

I. MATROIDS.

2. Definitions in terms of rank. Let a set $M$ of elements $e_1, e_2, \ldots, e_n$ be given. Corresponding to each subset $N$ of these elements let there be a number $r(N)$, the rank of $N$. If the three following postulates are satisfied, we shall call this system a matroid.

$(R_1)$ The rank of the null subset is zero.

$(R_2)$ For any subset $N$ and any element $e$ not in $N$,

$$r(N + e) = r(N) + k, \qquad (k = 0 \text{ or } 1).$$

$(R_3)$ For any subset $N$ and elements $e_1$, $e_2$ not in $N$, if $r(N + e_1) = r(N + e_2) = r(N)$, then $r(N + e_1 + e_2) = r(N)$.

Evidently any subset of a matroid is a matroid. In what follows, $M$ is a fixed matroid. We make the following definitions:

$$\rho(N) = \text{number of elements in } N.$$

$$n(N) = \rho(N) - r(N) = \text{nullity of } N.$$
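For the motivating case of the columns of a matrix, the three rank postulates can be verified by exhaustive search on a small example. The sketch below is an added illustration, not part of the paper. It takes the entries to be integers modulo 2 and encodes each column as a Python integer whose bit $j$ is the entry in row $j$; both the encoding and the function names are assumptions made here.

```python
from itertools import combinations


def gf2_rank(columns):
    """Rank modulo 2 of columns given as integers (bit j = entry in row j)."""
    basis = []  # independent columns found so far, each reduced on insertion
    for v in columns:
        for b in basis:
            if v ^ b < v:  # true exactly when v contains the leading bit of b
                v ^= b
        if v:
            basis.append(v)
    return len(basis)


def satisfies_rank_postulates(columns):
    """Check (R1), (R2), (R3) for r(N) = rank of the columns indexed by N."""
    indices = range(len(columns))

    def r(subset):
        return gf2_rank([columns[i] for i in subset])

    if r(()) != 0:  # (R1): the null subset has rank zero
        return False
    for size in range(len(columns) + 1):
        for n in combinations(indices, size):
            outside = [e for e in indices if e not in n]
            for e in outside:  # (R2): one more element adds 0 or 1
                if r(n + (e,)) - r(n) not in (0, 1):
                    return False
            for e1, e2 in combinations(outside, 2):  # (R3)
                if r(n + (e1,)) == r(n + (e2,)) == r(n) != r(n + (e1, e2)):
                    return False
    return True


# Three columns with c3 = c1 + c2 (mod 2): rank 2, hence nullity 3 - 2 = 1.
assert gf2_rank([0b001, 0b010, 0b011]) == 2
```

The search visits every subset, so it is meant only for a handful of columns.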

$N$ is independent, or, the elements of $N$ are independent, if $n(N) = 0$; otherwise, $N$, and its set of elements, are dependent.

LEMMA 1. For any $N$, $r(N) \geq 0$ and $n(N) \geq 0$. If $N \subset M$, then $r(N) \leq r(M)$, $n(N) \leq n(M)$.

LEMMA 2. Any subset of an independent set is independent.

$e$ is dependent on $N$ if $r(N + e) = r(N)$; otherwise $e$ is independent of $N$.

A base is a maximal independent submatroid of $M$, i.e. a matroid $B$ in $M$ such that $n(B) = 0$, while $B \subset N$, $B \neq N$ implies $n(N) > 0$. See also Theorem 7. A base complement $A = M - B$ is the complement in $M$ of a base $B$.

A circuit is a minimal dependent matroid, i.e. a matroid $P$ such that $n(P) > 0$, while $N \subset P$, $N \neq P$ implies $n(N) = 0$.³

THEOREM 1. $N$ is independent if and only if it is contained in a base, or, if and only if it contains no circuit.

³ Compare G, Theorem 9.


THEOREM 2. A circuit is a minimal submatroid contained in no base, i.e. containing at least one element from each base complement. A base is a maximal submatroid containing no circuit. A base complement is a minimal submatroid containing at least one element from each circuit.

The above facts follow at once from the definitions. Note the reciprocal relationship between circuits and base complements. Note also that the definitions of independence and of being a circuit depend only on the given subset, while the property of being a base depends on the relationship of the subset to $M$.

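3. Properties of rank. Our object here is to prove Theorem 3. The following definition will be useful:

$$\Delta(M, N) = r(M + N) - r(M). \tag{3.1}$$

LEMMA 3. $\Delta(M + e_2, e_1) \leq \Delta(M, e_1)$, that is, $r(M + e_1 + e_2) - r(M + e_2) \leq r(M + e_1) - r(M)$.

Suppose first $r(M + e_1) = r(M) + 1$; then $r(M + e_1 + e_2) = r(M) + k$, $k = 1$ or 2. If $k = 2$, then $r(M + e_2) = r(M) + 1$, on account of $(R_2)$, and the inequality holds; if $k = 1$, $r(M + e_2) = r(M) + l$, $l = 0$ or 1, and it holds again. If $r(M + e_2) = r(M) + 1$, the same reasoning applies. If finally $r(M + e_1) = r(M + e_2) = r(M)$, the inequality follows from $(R_3)$.

LEMMA 4. $\Delta(M + N, e) \leq \Delta(M, e)$.

If $N = e_1 + \cdots + e_p$, the last lemma gives

$$\Delta(M + N, e) \leq \Delta(M + e_1 + \cdots + e_{p-1}, e) \leq \cdots \leq \Delta(M, e).$$

THEOREM 3. $\Delta(M + N_2, N_1) \leq \Delta(M, N_1)$, or,

$$r(M + N_1 + N_2) \leq r(M + N_1) + r(M + N_2) - r(M). \tag{3.2}$$

This is true if $N_1$ contains but a single element. For the general case, we apply the last lemma and induction, setting $N_1 = N' + e$:

$$\Delta(M + N_2, N_1) = \Delta(M + N_2 + e, N') + \Delta(M + N_2, e) \leq \Delta(M + e, N') + \Delta(M, e) = \Delta(M, N_1).$$

(3.2) is evidently equivalent to:

$$r(M_1 + M_2) \leq r(M_1) + r(M_2) - r(M_1 \cdot M_2), \tag{3.3}$$

where $M_1 \cdot M_2$ is the common part of $M_1$ and $M_2$.

4. Deduction of $(I_1)$, $(I_2)$ from $(R_1)$, $(R_2)$, $(R_3)$. The first postulate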


on independent sets below obviously holds if $(R_1)$ and $(R_2)$ hold. To prove $(I_2)$, take $N$, $N'$ as given there; then

$$r(N) = p, \qquad r(N') = p + 1.$$

We must show that for some $i$, $\Delta(N, e'_i) = 1$. (Then $e'_i$ does not lie in $N$.) If this is not so, then on using Lemma 4 we find

$$1 = r(N') - r(N) \leq \Delta(N, N') = \Delta(N, e'_1) + \Delta(N + e'_1, e'_2) + \cdots + \Delta(N + e'_1 + \cdots + e'_p, e'_{p+1}) \leq \Delta(N, e'_1) + \Delta(N, e'_2) + \cdots + \Delta(N, e'_{p+1}) = 0,$$

a contradiction.

5. Deduction of $(C_1)$, $(C_2)$ from $(R_1)$, $(R_2)$, $(R_3)$. We shall need here a theorem showing how the nullity (or rank) of a matroid may be determined when we know what circuits it contains.

LEMMA 5. Each element of a circuit is dependent on the rest of the circuit.

If $e$ is an element of the circuit $P$, then $n(P) = 1$, $n(P - e) = 0$; hence

$$r(P) = \rho(P) - 1 = \rho(P - e) = r(P - e).$$

LEMMA 6. If $e$ is dependent on $P_1$ but on no proper subset of $P_1$, then $P = P_1 + e$ is a circuit.

As $\Delta(P_1, e) = 0$,

$$r(P) = r(P_1) \leq \rho(P_1) < \rho(P), \qquad n(P) > 0,$$

and $P$ contains a circuit $P'$. If $P'$ does not contain $e$, take $e'$ in $P'$; then

$$\Delta(P_1 - e', e') \leq \Delta(P' - e', e') = 0,$$

hence $r(P_1 - e') = r(P_1)$, and

$$\Delta(P_1 - e', e) = r(P_1 - e' + e) - r(P_1 - e') \leq r(P_1 + e) - r(P_1) = \Delta(P_1, e) = 0,$$

and $e$ is dependent on the proper subset $P_1 - e'$ of $P_1$, a contradiction. Therefore $P'$ contains $e$. As $P'$ is a circuit, $e$ is dependent on the rest of $P'$; hence $P' = P$.

THEOREM 4. If $e$ is not in $N$, there is a circuit in $N + e$ which contains $e$ if and only if $e$ is dependent on $N$.


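Suppose $P_1 + e = P$ is a circuit, $P_1 \subset N$. Then

$$\Delta(N, e) \leq \Delta(P_1, e) = 0,$$

and $e$ is dependent on $N$. Suppose, conversely, $\Delta(N, e) = 0$. Let $P_1$ be a smallest subset of $N$ on which $e$ is dependent; then by the last lemma, $P = P_1 + e$ is a circuit. (It may be that $P = e$.)

THEOREM 5. If $N$ is formed element by element, then $n(N)$ is just the number of times that adding an element increases the number of circuits present.

Say $N = e_1 + \cdots + e_p$. Then if $O$ is the null set,

$$r(N) = \Delta(O, e_1) + \Delta(e_1, e_2) + \cdots + \Delta(e_1 + \cdots + e_{p-1}, e_p).$$

Each $\Delta(e_1 + \cdots + e_{i-1}, e_i) = 0$ or 1, and $= 0$ if and only if $e_i$ is dependent on $e_1 + \cdots + e_{i-1}$, i.e. if and only if there is a circuit in $e_1 + \cdots + e_i$ containing $e_i$. The number of terms is $p = \rho(N)$, and the theorem follows.

We turn now to the proof of $(C_1)$ and $(C_2)$. The first is obvious. To prove the second, take $P_1$, $P_2$, $e_1$, $e_2$ as given. As

$$P_2 - e_1 \subset P_1 + P_2 - e_1 - e_2, \qquad P_1 - e_2 \subset P_1 + P_2 - e_2,$$

we have

$$\Delta(P_1 + P_2 - e_1 - e_2, e_1) \leq \Delta(P_2 - e_1, e_1) = 0, \qquad \Delta(P_1 + P_2 - e_2, e_2) \leq \Delta(P_1 - e_2, e_2) = 0.$$

These equations give

$$r(P_1 + P_2 - e_1 - e_2) = r(P_1 + P_2 - e_2) = r(P_1 + P_2).$$

Using $(R_2)$ gives

$$r(P_1 + P_2 - e_1) = r(P_1 + P_2 - e_1 - e_2), \qquad \text{i.e.} \qquad \Delta(P_1 + P_2 - e_1 - e_2, e_2) = 0;$$

hence the required circuit $P_3$ exists, by Theorem 4.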

6. Postulates for independent sets. Let $M$ be a set of elements. Let any subset $N$ of $M$ be either "independent" or "dependent." Let the two following postulates be satisfied:

$(I_1)$ Any subset of an independent set is independent.

$(I_2)$ If $N = e_1 + \cdots + e_p$ and $N' = e'_1 + \cdots + e'_{p+1}$ are independent, then for some $i$ such that $e'_i$ is not in $N$, $N + e'_i$ is independent.
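The two postulates for independent sets can be tested by exhaustive search on a small ground set. The sketch below is an added illustration with hypothetical function names. It uses the example mentioned in the introduction, in which a set of edges of a graph is independent if it contains no circuit, and it detects circuits with a simple union-find, which is a choice made here and not a method of the paper.

```python
from itertools import combinations


def is_forest(edges):
    """True if the given edges (pairs of vertices) contain no circuit."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:  # u and v are already joined: this edge closes a circuit
            return False
        parent[ru] = rv
    return True


def satisfies_i1_i2(ground, independent):
    """Check the two postulates for independent sets over all subsets."""
    subsets = [frozenset(c) for r in range(len(ground) + 1)
               for c in combinations(ground, r)]
    indep = {s for s in subsets if independent(s)}
    # First postulate: removing one element at a time reaches every subset.
    for s in indep:
        if any(s - {e} not in indep for e in s):
            return False
    # Second postulate: a set with one more element can always extend N.
    for n in indep:
        for n2 in indep:
            if len(n2) == len(n) + 1 and not any(n | {e} in indep for e in n2 - n):
                return False
    return True


# The six edges of the complete graph on four vertices.
K4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
assert satisfies_i1_i2(K4, is_forest)
```

A family can satisfy the first postulate and still fail the second, which is why both are checked.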


The resulting system is equivalent to a matroid, as we now show. Given any subset $N$ of $M$, we let $r(N)$ be the number of elements in a largest independent subset of $N$. Obviously Postulates $(R_1)$ and $(R_2)$ are satisfied; we must prove $(R_3)$. Say

$$r(N + e_1) = r(N + e_2) = r(N) = r.$$

Then $r(N + e_1 + e_2) = r$ or $r + 1$. If it equals $r + 1$, there is an independent set $N' = e'_1 + \cdots + e'_{r+1}$ in $N + e_1 + e_2$. Let $N'' = e''_1 + \cdots + e''_r$ be an independent set in $N$. By $(I_2)$ there is an $i$ such that $N'' + e'_i$ is an independent set of $r + 1$ elements. But $N'' + e'_i$ lies in $N + e_1$ or in $N + e_2$, and hence $r(N + e_1)$ or $r(N + e_2) \geq r + 1$, a contradiction. Therefore $r(N + e_1 + e_2) = r$, as required.

We have shown how to deduce either set of postulates (R) or (I) from the other. Moreover the definitions of the rank and the independence or dependence of any subset of $M$ agree under the two systems, and hence they are equivalent.

7. Postulates for bases. Let $M$ be a set of elements, and let each subset either be or not be a "base." We assume

$(B_1)$ No proper subset of a base is a base.

$(B_2)$ If $B$ and $B'$ are bases and $e$ is an element of $B$, then for some element $e'$ in $B'$, $B - e + e'$ is a base.

We shall prove the equivalence of this system with the preceding one. We write here $e_1 e_2 \cdots$ instead of $e_1 + e_2 + \cdots$ for short.

THEOREM 6. All bases contain the same number of elements.

For suppose

$$B = e_1 \cdots e_p\, e_{p+1} \cdots e_q\, e_{q+1} \cdots e_r,$$

$$B' = e_1 \cdots e_p\, e'_{p+1} \cdots e'_q$$

are bases, with exactly $e_1, \ldots, e_p$ in common, and $r > q$. We might have $p = 0$. $q > p$, on account of $(B_1)$. By $(B_2)$, we can replace $e_{p+1}$ in $B$ by an element $e'$ of $B'$, giving a base $B_1$. $e' = e'_{i_1}$ is one of the elements $e'_{p+1}, \ldots, e'_q$, for otherwise $B_1$ would be a proper subset of $B$. Hence

$$B_1 = e_1 \cdots e_p\, e'_{i_1}\, e_{p+2} \cdots e_r.$$

If $q > p + 1$, we replace $e_{p+2}$ in $B_1$ by an element $e'_{i_2}$ of $B'$, giving a base $B_2$. Continuing in this manner, we obtain finally the base

$$B_{q-p} = e_1 \cdots e_p\, e'_{p+1} \cdots e'_q\, e_{q+1} \cdots e_r.$$


But this contains $B'$ as a proper subset, contradicting $(B_1)$.

We shall say a subset of $M$ is independent if it is contained in a base. $(I_1)$ obviously holds; we shall prove $(I_2)$. Let $N$, $N'$ be independent sets in the bases $B$, $B'$. Say

$$N = e_1 \cdots e_p\, e_{p+1} \cdots e_q, \qquad N' = e_1 \cdots e_p\, e'_{p+1} \cdots e'_{q+1},$$

$$B = e_1 \cdots e_p\, e_{p+1} \cdots e_q\, e_{q+1} \cdots e_r\, e_{r+1} \cdots e_s,$$

$$B' = e_1 \cdots e_p\, e'_{p+1} \cdots e'_q\, e'_{q+1} \cdots e'_r\, e_{r+1} \cdots e_s.$$

Then $N$ and $N'$ have just $e_1, \ldots, e_p$ in common, and $B$ and $B'$ have just these elements and $e_{r+1}, \ldots, e_s$ in common. By $(B_2)$, there is an element $e'_{i_1}$ of $B'$ such that

$$B_1 = B - e_{q+1} + e'_{i_1}$$

is a base. (This element cannot be any of $e_1, \ldots, e_p, e_{r+1}, \ldots, e_s$, by $(B_1)$.) If $i_1$ is one of the numbers $p+1, p+2, \ldots, q+1$, then $N + e'_{i_1}$ is in a base $B_1$, as required. Suppose not; then there is a base

$$B_2 = B_1 - e_{q+2} + e'_{i_2}$$

with $i_2 \neq i_1$. Continuing in this manner, we find at some point a base containing $e_1, \ldots, e_q, e'_j$ with $p + 1 \leq j \leq q + 1$; for only the $r - q - 1$ elements $e'_{q+2}, \ldots, e'_r$ of $B'$ lie outside both $B$ and $N'$, while the $r - q$ elements $e_{q+1}, \ldots, e_r$ are to be replaced. Then $e'_j$ is in $N'$, and $N + e'_j$ is in a base and is thus independent, as required.

The definitions of base and independent sets in the two systems (I) and (B) are easily seen to agree. Suppose $(I_1)$ and $(I_2)$ hold. $(B_1)$ obviously holds; using $(I_2)$, we prove that all bases contain the same number of elements; $(B_2)$ now follows at once from $(I_2)$. Hence the two systems are equivalent.

THEOREM 7. $B$ is a base in $M$ if and only if

$$r(B) = r(M), \qquad n(B) = 0.$$

Evidently $B$ is a base under the given conditions. To prove the converse, we note first that there exists a base with $r(M)$ elements, as $r(M)$ is the maximum number of independent elements in $M$ (see § 6). By Theorem 6, all bases have this many elements, and the equations follow.

THEOREM 8. If $B$ is a base and $N$ is independent, then for some $N'$ in $B$, $N + N'$ is a base.


This follows from repeated application of Postulate $(I_2)$ and the last theorem.

8. Postulates for circuits. Let $M$ be a set of elements, and let each subset either be or not be a "circuit." We assume:

$(C_1)$ No proper subset of a circuit is a circuit.

$(C_2)$ If $P_1$ and $P_2$ are circuits, $e_1$ is in both $P_1$ and $P_2$, and $e_2$ is in $P_1$ but not in $P_2$, then there is a circuit $P_3$ in $P_1 + P_2$ containing $e_2$ but not $e_1$.

$(C_2)$ may be phrased as follows: If the circuits $P_1$ and $P_2$ have the common element $e$, then $P_1 + P_2 - e$ is the union of a set of circuits.

We shall define the rank of any subset of $M$, and shall then show that the postulates for rank are satisfied. Let $e_1, \ldots, e_p$ be any ordered set of elements of $M$. Set $r_i = 0$ if there is a circuit in $e_1 + \cdots + e_i$ containing $e_i$, and set $r_i = 1$ otherwise (compare Theorem 5). Let the "rank" of $(e_1, \ldots, e_p)$ be

$$r(e_1, \ldots, e_p) = \sum_{i=1}^{p} r_i.$$
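The rank of an ordered set, as just defined from the circuits, is straightforward to compute. The sketch below is an added illustration with a hypothetical function name. The sample family of circuits is an assumption chosen here: it comes from a small graph in which the edges a, b, c form a triangle, e is parallel to a, and d hangs off the triangle.

```python
from itertools import permutations


def circuit_rank(ordered, circuits):
    """Sum of r_i, where r_i = 0 if a circuit inside e_1..e_i contains e_i."""
    seen = set()
    rank = 0
    for e in ordered:
        seen.add(e)
        closes_circuit = any(e in c and c <= seen for c in circuits)
        rank += 0 if closes_circuit else 1
    return rank


# Circuits of the sample graph: the triangle, the parallel pair, and e-b-c.
circuits = [frozenset("abc"), frozenset("ae"), frozenset("ebc")]

# Every ordering of the five elements gives the same value.
assert {circuit_rank(order, circuits) for order in permutations("abcde")} == {3}
```

That the value does not depend on the chosen ordering is what has to be proved in general for circuits obeying the two postulates; the check above only confirms it for this one example.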

To prove this, let N be the ordered set e,,' . ., eq- 2 , and set r(N)

=

r,

1. There is no circuit in N containing eq• Then

OASE

N

+e

q

If there is a circuit in N

+ eq-1

containing eq - 1, and none

+ eq- + eq containing eq1

1

III

and eq, then

otherwise, 2. There is a circuit P in N + e containing e and a circuit + e + e containing e and e Then, by (0 there is a circuit + eq containing eq. Hence

OASE

P 1 in N P a in N

q -1

2

q_ 1

q

q- 1

q•

70

q -1,

2 ),

THE ABSTRACT PROPERTIES OF LINEAR DEPENDENCE.

517

CASE 3. There is a circuit P 2 as above, but no circuit PI as above. If there is a circuit P a as above, the last set of equations hold. Otherwise,

+

CASE 4. There is a circuit in N eq containing eq • This case overlaps the two preceding ones; the proof above applies here also. LEMMA 8.

The rank of any subset N is independent of the ordering of the elements of N.

We saw above that interchanging the last two elements of any subset does not alter the rank; hence, evidently, interchanging any two adjacent elements leaves the rank unchanged. Any ordering of $M$ may be obtained from any other by a number of interchanges of adjacent elements; the rank remains unchanged at each step, proving the lemma.

Postulates $(R_1)$ and $(R_2)$ are obviously satisfied. To prove $(R_3)$, suppose $r(N + e_1) = r(N + e_2) = r(N)$. Then there is a circuit in $N + e_1$ containing $e_1$ and one in $N + e_2$ containing $e_2$; hence $r(N + e_1 + e_2) = r(N)$. The definitions of rank and of circuits under the two systems $(R)$, $(C)$ agree, and hence the systems are equivalent.
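The definition of rank by ordered sums, and the claim of Lemma 8 that the ordering does not matter, can be tried out on a small case. The following is a minimal sketch, not part of the original paper: the circuit family (three circuits on the elements 1 to 6) is an assumed example, and the function names are hypothetical.

```python
from itertools import permutations

def rank_ordered(order, circuits):
    """Sum of the r_i of Section 8: r_i = 0 if some circuit lying inside
    e_1 + ... + e_i contains e_i, and r_i = 1 otherwise."""
    seen = set()
    total = 0
    for e in order:
        seen.add(e)
        in_circuit = any(e in c and c <= seen for c in circuits)
        total += 0 if in_circuit else 1
    return total

def rank_is_order_independent(elements, circuits):
    """Brute-force check of Lemma 8 for one subset: every ordering gives the same rank."""
    ranks = {rank_ordered(p, circuits) for p in permutations(elements)}
    return len(ranks) == 1

# Assumed example: a family of circuits satisfying (C_1) and (C_2).
CIRCUITS = [frozenset(s) for s in ({1, 2, 3, 4}, {1, 2, 5, 6}, {3, 4, 5, 6})]
```

Calling `rank_is_order_independent(range(1, 7), CIRCUITS)` runs through all 720 orderings; for a family that violates $(C_2)$ the check can fail, which is a quick way to see why the postulate is needed.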

9. Fundamental sets of circuits. The circuits $P_1, \ldots, P_q$ of a matroid $M$ form a fundamental set of circuits if $q = n(M)$ and the elements $e_1, \ldots, e_n$ of $M$ can be ordered so that $P_i$ contains $e_{n-q+i}$ but no $e_{n-q+j}$ $(j > i)$. The set is strict if $P_i$ contains $e_{n-q+i}$ but no $e_{n-q+j}$ $(0 < j \neq i)$. These sets may be called sets with respect to $e_{n-q+1}, \ldots, e_n$.

THEOREM 9. If $B = e_1 + \cdots + e_{n-q}$ is a base in $M = e_1 + \cdots + e_n$, then there is a strict fundamental set of circuits with respect to $e_{n-q+1}, \ldots, e_n$; these circuits are uniquely determined.

As $r(B) = r(M)$, $\Delta(B, e_i) = 0$ $(i = n-q+1, \ldots, n)$. Hence, by Theorem 4, there is a circuit $P_i$ containing $e_i$ and elements (possibly) of $B$. $P_{n-q+1}, \ldots, P_n$ is the required set. Suppose, for a given $i$, there were also a circuit $P'_i \neq P_i$. Then Postulate $(C_2)$ applied to $P_i$ and $P'_i$ would give us a circuit $P$ in $B$, which is impossible.

This theorem corresponds to the theorem that if a square submatrix $N$ of a matrix $M$ is non-singular, then $N$ can be turned into the unit matrix by a linear transformation on the rows of $M$.

THEOREM 10. If $P_1, \ldots, P_q$ form a fundamental set of circuits with respect to $e_{n-q+1}, \ldots, e_n$, then there is a unique strict set $P'_1, \ldots, P'_q$ with respect to $e_{n-q+1}, \ldots, e_n$.

Set $B = M - (e_{n-q+1} + \cdots + e_n)$. The existence of $P_1, \ldots, P_q$ shows that $r(M) = r(M - e_n) = \cdots = r(B)$. Hence $\rho(B) = n - q = r(M) = r(B)$, and $B$ is a base, by Theorem 7. Theorem 9 now applies.

Note that a matroid is not uniquely determined by a fundamental set of circuits (but see the appendix). This is shown by the following two matroids, in each of which the first two circuits form a strict fundamental set: $M$, with circuits 1234, 1256, 3456; $M'$, with circuits 1234, 1256, 13456, 23456.
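That both circuit lists above really satisfy Postulates $(C_1)$ and $(C_2)$ can be confirmed by exhaustive checking. The sketch below is an illustration only; the helper names are hypothetical, and the digit-string encoding of circuits simply mirrors the notation 1234, 1256, and so on.

```python
def as_sets(words):
    """Turn digit strings such as "1234" into frozensets of element labels."""
    return [frozenset(int(ch) for ch in w) for w in words]

def satisfies_c1(circuits):
    """(C_1): no circuit is a proper subset of another circuit."""
    return not any(a < b for a in circuits for b in circuits)

def satisfies_c2(circuits):
    """(C_2): for circuits P1, P2 with e1 in both and e2 in P1 only, some
    circuit inside P1 + P2 contains e2 but not e1."""
    for p1 in circuits:
        for p2 in circuits:
            if p1 == p2:
                continue
            union = p1 | p2
            for e1 in p1 & p2:
                for e2 in p1 - p2:
                    if not any(e2 in p3 and e1 not in p3 and p3 <= union
                               for p3 in circuits):
                        return False
    return True

M = as_sets(["1234", "1256", "3456"])
M_PRIME = as_sets(["1234", "1256", "13456", "23456"])
```

Both families pass, although they share the strict fundamental set 1234, 1256; a family such as 123, 124 fails `satisfies_c2`, since no circuit in 1234 contains 3 but not 1.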

II. SEPARABILITY, DUAL MATROIDS.

10. Separable matroids. If $M = M_1 + M_2$, then $r(M) \le r(M_1) + r(M_2)$, on account of (3.3). If it is possible to divide the elements of $M$ into two groups, $M_1$ and $M_2$, each containing at least one element, such that

$$r(M) = r(M_1) + r(M_2), \tag{10.1}$$

or, which is equivalent (as $M_1$ and $M_2$ have no common elements),

$$n(M) = n(M_1) + n(M_2), \tag{10.2}$$

we shall say $M$ is separable; otherwise, $M$ is non-separable.$^{4}$ Any single element forms a non-separable matroid. Any maximal non-separable part of $M$ is a component of $M$.$^{5}$

THEOREM 11. If $M = M_1 + M_2$, $M' = M'_1 + M'_2$, $M'_1 \subset M_1$, $M'_2 \subset M_2$, and $r(M) = r(M_1) + r(M_2)$, then $r(M') = r(M'_1) + r(M'_2)$.

Set $M''_1 = M_1 - M'_1$, $M''_2 = M_2 - M'_2$. The relations (see Theorem 3)

$$\begin{aligned} r(M) &= r(M') + \Delta(M', M''_1) + \Delta(M' + M''_1, M''_2) \\ &\le r(M') + \Delta(M'_1, M''_1) + \Delta(M'_2, M''_2) \\ &= r(M') + r(M_1) - r(M'_1) + r(M_2) - r(M'_2), \end{aligned}$$

together with the fact that $r(M) = r(M_1) + r(M_2)$, show that $r(M') \ge r(M'_1) + r(M'_2)$, and hence $r(M') = r(M'_1) + r(M'_2)$.

THEOREM 12.$^{6}$ If $M = M_1 + M_2$, $r(M) = r(M_1) + r(M_2)$, $M'$ is non-separable, and $M' \subset M$, then either $M' \subset M_1$ or $M' \subset M_2$.

For suppose $M' = M'_1 + M'_2$, $M'_1 \subset M_1$, $M'_2 \subset M_2$, and $M'_1$ and $M'_2$ each contain an element. By the last theorem, $r(M') = r(M'_1) + r(M'_2)$, which cannot be.

THEOREM 13. If $M_1$ and $M_2$ are non-separable matroids with a common element $e$, then $M = M_1 + M_2$ is non-separable.

For suppose $M = M'_1 + M'_2$, $r(M) = r(M'_1) + r(M'_2)$. By the last theorem, $M_1 \subset M'_1$ or $M_1 \subset M'_2$, and $M_2 \subset M'_1$ or $M_2 \subset M'_2$; this shows that either $M'_1$ or $M'_2$ is void.

THEOREM 14. No two distinct components of $M$ have common elements.

This is a consequence of the last theorem. From this follows:

THEOREM 15.$^{7}$ Any matroid may be expressed as a sum of components in a unique manner.

THEOREM 16.$^{8}$ A non-separable matroid $M$ of nullity 1 is a circuit, and conversely.

If $M_1$ is a proper non-null subset of the non-separable matroid $M$, and $M_2 = M - M_1$, then $r(M) < r(M_1) + r(M_2)$. Hence

$$n(M_1) + n(M_2) < n(M) = 1,$$

and $n(M_1) = 0$, proving that $M$ is a circuit.

Conversely, if $M = M_1 + M_2$ is a circuit, and $M_1$ and $M_2$ each contain elements, then

$$r(M_1) + r(M_2) = \rho(M_1) + \rho(M_2) - n(M_1) - n(M_2) = \rho(M) > r(M),$$

showing that $M$ is non-separable.

$^{4}$ Compare G, Theorem 15.
$^{5}$ See G, § 4.
$^{6}$ Compare G, Lemma, p. 344.
$^{7}$ Compare G, Theorem 12.
$^{8}$ Compare G, Theorem 10.

LEMMA 9. Let $M = M_1 + M_2$ be non-separable, and let $M_1$ and $M_2$ each contain elements but have no common elements. Then there is a circuit $P$ in $M$ containing elements of both $M_1$ and $M_2$.

Suppose there were no such circuit. Say $M_2 = e_1 + \cdots + e_s$. Using Theorem 4, we see that

$$\Delta(M_1 + e_1 + \cdots + e_{i-1}, e_i) = \Delta(e_1 + \cdots + e_{i-1}, e_i) \qquad (i = 1, \ldots, s),$$

and hence $r(M) = r(M_1) + r(M_2)$, a contradiction.

THEOREM 17.$^{9}$ Any non-separable matroid $M$ of nullity $n > 0$ can be built up in the following manner: Take a circuit $M_1$; add a set of elements which forms a circuit with one or more elements of $M_1$, forming a non-separable matroid $M_2$ of nullity 2 (if $n(M) > 1$); repeat this process till we have $M_n = M$.

As $n > 0$, $M$ contains a circuit $M_1$. If $n > 1$, we use the preceding lemma $n - 1$ times. The matroid at each step is non-separable, by Theorems 16 and 13.

THEOREM 18.$^{10}$ Let $M = M_1 + \cdots + M_p$, and let $M_1, \ldots, M_p$ be non-separable. Then the following statements are equivalent:

(1) $M_1, \ldots, M_p$ are the components of $M$.

(2) No two of the matroids $M_1, \ldots, M_p$ have common elements, and there is no circuit in $M$ containing elements of more than one of them.

(3) $r(M) = r(M_1) + \cdots + r(M_p)$.

We cannot replace rank by nullity in (3); see G, p. 347.

(2) follows from (1) on application of Theorems 13 and 16. To prove (1) from (2), take any $M_i$. If it is not a component of $M$, there is a larger non-separable submatroid $M'_i$ of $M$ containing it. By Lemma 9, there is a circuit $P$ in $M'_i$ containing elements of $M_i$ and elements not in $M_i$; $P$ must contain elements of some other $M_j$, a contradiction.

Next we prove (3) from (1). If $p > 1$, $M$ is separable; say $M = M'_1 + M'_2$, $r(M) = r(M'_1) + r(M'_2)$. By Theorem 12, each $M_i$ is in either $M'_1$ or $M'_2$; hence $M'_1$ and $M'_2$ are each a sum of components of $M$. If one of these contains more than one component, we separate it similarly, etc. (3) now follows easily.

Finally we prove (1) from (3). Let $M'$ be a component of $M$, and suppose it has an element in $M_i$. As

$$r(M) = r(M_i) + \sum_{j \neq i} r(M_j),$$

$M'$ is contained in $M_i$, by Theorem 12; as $M_i$ is non-separable, $M' = M_i$.

$^{9}$ See G, Theorem 19; also Whitney, "2-isomorphic graphs," American Journal of Mathematics, vol. 55 (1933), p. 247, footnote.
$^{10}$ Compare G, Theorem 17.
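Condition (2) of Theorem 18 suggests a way to compute components from a list of circuits: merge any two elements that lie in a common circuit. The sketch below assumes that approach (the resulting blocks have no circuit crossing them, and Theorems 12 and 16 show each block is non-separable); the example circuits and all names are illustrative assumptions. It then checks the rank equation (3).

```python
def rank_from_circuits(subset, circuits):
    """Rank as in Section 8: take the elements in some order and count an
    element only if no circuit inside the elements taken so far contains it."""
    seen, total = set(), 0
    for e in sorted(subset):
        seen.add(e)
        if not any(e in c and c <= seen for c in circuits):
            total += 1
    return total

def components(elements, circuits):
    """Blocks obtained by merging elements that share a circuit (union-find)."""
    parent = {e: e for e in elements}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for c in circuits:
        first, *rest = sorted(c)
        for e in rest:
            parent[find(e)] = find(first)
    groups = {}
    for e in elements:
        groups.setdefault(find(e), set()).add(e)
    return sorted((frozenset(g) for g in groups.values()), key=min)

# Assumed example: two disjoint circuits and one element lying in no circuit.
ELEMENTS = range(1, 7)
CIRCUITS = [frozenset({1, 2, 3}), frozenset({4, 5})]
```

Here `components(ELEMENTS, CIRCUITS)` gives the blocks 123, 45 and 6, and the ranks 2, 1, 1 of the blocks add up to the rank of the whole set, as statement (3) requires.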

THEOREM 19.$^{11}$ The elements $e_1$ and $e_2$ are in the same component of $M$ if and only if they are contained in a circuit $P$.

If $e_1$ and $e_2$ are both in $P$, they are part of a non-separable matroid, which lies in a single component of $M$. Suppose now $e_1$ and $e_2$ are in the same component $M_0$ of $M$, and suppose there is no circuit containing them both. Let $M_1$ be $e_1$ plus all elements which are contained in a circuit containing $e_1$. By Lemma 9, there is a subset $M^*$ of $M_0 - M_1$ which forms with part of $M_1$ a circuit $P_3$. $P_3$ does not contain $e_1$. If $e'_4$ is an element of $P_3$ in $M_1$, there is a circuit $P_1$ in $M_1$ containing $e_1$ and $e'_4$. Let $e_3$ be an element of $M^*$. Then in $M_1 + M^*$ there are circuits $P_1$ and $P_3$ which contain $e_1$ and $e_3$ respectively, and have a common element.

Let $M'$ be a smallest subset of $M_0$ which contains circuits $P'_1$ and $P'_3$ such that one contains $e_1$, the other contains $e_3$, and they have common elements. Then $P'_1$ and $P'_3$ are distinct, and $M' = P'_1 + P'_3$. Let $e_4$ be a common element. By Postulate $(C_2)$, there is a circuit $P_1$ in $M' - e_4$ containing $e_1$, and a circuit $P_3$ in $M' - e_4$ containing $e_3$. By the definition of $M'$, $P_1$ and $P_3$ have no common elements. By Postulate $(C_1)$, $P_1$ is not contained in $P'_1$; hence it contains an element $e_5$ of $M' - P'_1$. $P_3$ does not contain $e_5$. As $P_3$ is not contained in $P'_3$, it contains an element $e_6$ of $P'_1$. But now $P'_1$ contains $e_1$, $P_3$ contains $e_3$, $P'_1$ and $P_3$ have a common element $e_6$, and $P'_1 + P_3$ does not contain $e_5$ and is thus a proper subset of $M'$, a contradiction. This proves the theorem.

$^{11}$ Compare D. König, Acta Litterarum ac Scientiarum Szeged, vol. 6, pp. 155-179, 4. (p. 159). The present theorem shows that a "glied" is the same as a component.

11. Dual matroids. Suppose there is a 1-1 correspondence between the elements of the matroids $M$ and $M'$, such that if $N$ is any submatroid of $M$ and $N'$ is the complement of the corresponding matroid of $M'$, then

$$r(N') = r(M') - n(N). \tag{11.1}$$

We say then that $M'$ is a dual of $M$.$^{12}$

THEOREM 20. If $M'$ is a dual of $M$, then

$$r(M') = n(M), \qquad n(M') = r(M).$$

Set $N = M$; then $n(N) = n(M)$. In this case $N'$ is the null matroid, and $r(N') = 0$. (11.1) now gives $r(M') = n(M)$. Also

$$n(M') = \rho(M') - r(M') = \rho(M) - n(M) = r(M).$$

THEOREM 21. If $M'$ is a dual of $M$, then $M$ is a dual of $M'$.

Take any $N$ and corresponding $N'$ as before. The equations

$$r(N') = r(M') - n(N), \qquad r(M') = n(M), \qquad \rho(N) + \rho(N') = \rho(M)$$

give

$$\begin{aligned} r(N) &= \rho(N) - n(N) = \rho(N) - [r(M') - r(N')] \\ &= \rho(N) - n(M) + [\rho(N') - n(N')] \\ &= \rho(M) - n(M) - n(N') = r(M) - n(N'), \end{aligned}$$

as required.

THEOREM 22. Every matroid has a dual.

This is in marked contrast to the case of graphs, for only a planar graph has a dual graph (see G, Theorem 29).

Let $M'$ be a set of elements in 1-1 correspondence with elements of $M$. If $N'$ is any subset of $M'$, let $N$ be the complement of the corresponding subset of $M$, and set $r(N') = n(M) - n(N)$. $(R_1)$, $(R_2)$, $(R_3)$ are easily seen to hold in $M'$, as they hold in $M$; hence $M'$ is a matroid. Obviously $r(M') = n(M)$, and $M'$ is a dual of $M$.

THEOREM 23. $M$ and $M'$ are duals if and only if there is a 1-1 correspondence between their elements such that bases in one correspond to base complements in the other.

Suppose first $M$ and $M'$ are duals. Let $B$ be a base in either matroid, say in $M$, and let $B'$ be the complement of the corresponding submatroid of the other matroid, $M'$. Then

$$r(B') = r(M') - n(B) = r(M'), \qquad n(B') = r(M) - r(B) = 0,$$

and $B'$ is a base in $M'$, by Theorem 7.

Suppose, conversely, that bases in one correspond to base complements in the other. Let $N$ be a submatroid of $M$ and let $N'$ be the complement of the corresponding submatroid of $M'$. There is a base $B'$ in $M'$ with $r(N')$ elements in $N'$, by Theorem 8. The complement in $M$ of the submatroid corresponding to $B'$ in $M'$ is a base $B$ in $M$ with $\rho(N') - r(N') = n(N')$ elements in $M - N$, and hence with $r(M) - n(N')$ elements in $N$. This shows that

$$r(N) = r(M) - n(N') + k, \qquad k \ge 0.$$

In a similar fashion we see that

$$r(N') = r(M') - n(N) + k', \qquad k' \ge 0.$$

As $B$ contains $r(M)$ elements and $B'$ contains $r(M')$ elements, $r(M) + r(M') = \rho(M)$. Hence, adding the above equations,

$$\begin{aligned} k + k' &= r(N) + r(N') + n(N) + n(N') - r(M) - r(M') \\ &= \rho(N) + \rho(N') - \rho(M) = 0. \end{aligned}$$

Hence $k = 0$, and the first equation above shows that $M$ and $M'$ are duals.

There are various other ways of stating conditions on certain submatroids of $M$ and $M'$ which will ensure these matroids being duals.$^{13}$

$^{12}$ Compare G, § 8. Theorems 20, 21, 24, 25 correspond to Theorems 20, 21, 23, 25 in G. Note that two duals of the same matroid are isomorphic, that is, there is a 1-1 correspondence between their elements such that corresponding subsets have the same rank. Such a statement cannot be made about graphs. Compare H. Whitney, "2-isomorphic graphs," American Journal of Mathematics, vol. 55 (1933), pp. 245-254.

THEOREM 24. Let $M_1, \ldots, M_p$ and $M'_1, \ldots, M'_p$ be the components of $M$ and $M'$ respectively, and let $M'_i$ be a dual of $M_i$ $(i = 1, \ldots, p)$. Then $M'$ is a dual of $M$.

Let $N$ be any submatroid of $M$, and let the parts of $N$ in $M_1, \ldots, M_p$ be $N_1, \ldots, N_p$. Let $N'_i$ be the complement in $M'_i$ of the submatroid corresponding to $N_i$; then $N' = N'_1 + \cdots + N'_p$ is the complement in $M'$ of the submatroid corresponding to $N$. By Theorems 18 and 11 we have

$$r(N') = r(N'_1) + \cdots + r(N'_p), \qquad n(N) = n(N_1) + \cdots + n(N_p).$$

Also

$$r(M') = r(M'_1) + \cdots + r(M'_p), \qquad r(N'_i) = r(M'_i) - n(N_i);$$

adding the last set of equations gives $r(N') = r(M') - n(N)$, as required.

$^{13}$ See for instance a paper by the author, "Planar graphs," Fundamenta Mathematicae, vol. 21 (1933), pp. 73-84, Theorem 2. Cut sets may of course be defined in terms of rank.

THEOREM 25. Let $M$ and $M'$ be duals, and let $M_1, \ldots, M_p$ be the components of $M$. Let $M'_1, \ldots, M'_p$ be the corresponding submatroids of $M'$. Then $M'_1, \ldots, M'_p$ are the components of $M'$, and $M'_i$ is a dual of $M_i$ $(i = 1, \ldots, p)$.

The complement in $M$ of the submatroid corresponding to $M'_i$ in $M'$ is $\sum_{j \neq i} M_j$. Hence, as $M$ and $M'$ are duals and the $M_j$ $(j \neq i)$ are the components of $\sum_{j \neq i} M_j$ (see Theorem 18),

$$r(M'_i) = r(M') - n\Big(\sum_{j \neq i} M_j\Big) = r(M') - \sum_{j \neq i} n(M_j).$$

Adding gives

$$\begin{aligned} \sum_i r(M'_i) &= p\,r(M') - (p-1) \sum_j n(M_j) = p\,r(M') - (p-1)\,n(M) \\ &= p\,r(M') - (p-1)\,r(M') = r(M'). \end{aligned}$$

Therefore, by Theorem 12, each component of $M'$ is contained in some $M'_i$. In the same way we see that each component of $M$ is contained in a matroid corresponding to a component of $M'$; hence the components of one matroid correspond exactly to the components of the other.

Let $N_i$ be any submatroid of $M_i$, and let $N'$ and $N'_i$ be the complements in $M'$ and $M'_i$ of the submatroid corresponding to $N_i$. The equations

$$r(M') = \sum_j r(M'_j), \qquad r(N') = r(N'_i) + \sum_{j \neq i} r(M'_j), \qquad r(N') = r(M') - n(N_i),$$

give

$$r(N'_i) = r(M'_i) - n(N_i),$$

which shows that $M'_i$ is a dual of $M_i$.

THEOREM 26. A dual of a non-separable matroid is non-separable.

This is a consequence of the last theorem.
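The dual rank of (11.1) can be written without reference to a second set of elements: combining (11.1) with Theorem 20 gives $r'(X) = \rho(X) + r(M - X) - r(M)$ when $M'$ is taken on the same elements as $M$. The sketch below uses that identity; it is an illustration only, with hypothetical function names, and the example matroid (bases are the four-element subsets of 1..6 other than 1234, 1256, 3456) is an assumed test case.

```python
from itertools import combinations

def rank_from_bases(bases):
    """Rank function of a matroid given by its bases: largest overlap with a base."""
    return lambda X: max(len(frozenset(X) & b) for b in bases)

def dual_rank(rank, ground):
    """Rank in the dual: r'(X) = |X| + r(ground - X) - r(ground)."""
    ground = frozenset(ground)

    def r_dual(X):
        X = frozenset(X)
        return len(X) + rank(ground - X) - rank(ground)

    return r_dual

def bases_of(rank, ground):
    """All subsets whose size and rank both equal the rank of the ground set."""
    r = rank(ground)
    return {frozenset(c) for c in combinations(sorted(ground), r) if rank(c) == r}

GROUND = frozenset(range(1, 7))
NON_BASES = {frozenset({1, 2, 3, 4}), frozenset({1, 2, 5, 6}), frozenset({3, 4, 5, 6})}
BASES = {frozenset(c) for c in combinations(sorted(GROUND), 4)} - NON_BASES

rank = rank_from_bases(BASES)
r_dual = dual_rank(rank, GROUND)
```

With these definitions the bases computed from `r_dual` are exactly the complements of the members of `BASES`, as Theorem 23 states, and applying `dual_rank` twice returns the original rank function, in line with Theorem 21.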

III. MATRICES AND MATROIDS.

12. Matrices, matroids, and hyperplanes. Consider the matrix

$$\mathbf{M} = \begin{Vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{Vmatrix};$$

let its columns be $C_1, \ldots, C_n$. Any subset $N$ of these columns forms a matrix, and this matrix has a rank, $r(N)$. If we consider the columns as abstract elements, we have a matroid $M$. The proof of this is simple if we consider the rank of a matrix as the number of linearly independent columns in it. $(R_1)$ and $(R_2)$ are then obvious. To prove $(R_3)$, suppose $r(N + C_1) = r(N + C_2) = r(N)$; then $C_1$ and $C_2$ can each be expressed as a linear combination of the columns of $N$, and hence $r(N + C_1 + C_2) = r(N)$. The terms independent and base carry over to matrices and agree with the ordinary definitions; a base in $\mathbf{M}$ is a minimal set of columns in terms of which all remaining columns of $\mathbf{M}$ may be expressed.

We may interpret $\mathbf{M}$ geometrically in two different ways; the second is the more interesting for our purposes:

(a) Let $E_m$ be Euclidean space of $m$ dimensions. Corresponding to each column $C_i$ of $\mathbf{M}$ there is a point $X_i$ in $E_m$ with coordinates $a_{1i}, \ldots, a_{mi}$. The subset $C_{i_1}, \ldots, C_{i_p}$ of $\mathbf{M}$ is linearly independent if and only if the points $O = (0, \ldots, 0)$, $X_{i_1}, \ldots, X_{i_p}$ are linearly independent in $E_m$, i.e. if and only if these $p + 1$ points determine a hyperplane in $E_m$ of dimension $p$. A base in $\mathbf{M}$ corresponds to a minimal set of points $X_{i_1}, \ldots, X_{i_p}$ in $E_m$ such that each $X_j$ of $\mathbf{M}$ lies in the hyperplane determined by $O, X_{i_1}, \ldots, X_{i_p}$. Then $p$ is the rank of $\mathbf{M}$.

(b) Let $E_n$ be Euclidean space of $n$ dimensions. Let $R_1, \ldots, R_m$ be the rows of $\mathbf{M}$. If $Y_1, \ldots, Y_m$ are the corresponding points of $E_n$: $Y_i = (a_{i1}, \ldots, a_{in})$, then the points $O, Y_1, \ldots, Y_m$ determine a hyperplane $H = H(\mathbf{M})$, which we shall call the hyperplane associated with $\mathbf{M}$. The dimension $d(H)$ of $H$ is $r(\mathbf{M})$. Let $N = C_{i_1} + \cdots + C_{i_p}$ be a subset of $\mathbf{M}$, and let $E'$ be the $p$-dimensional coordinate subspace of $E_n$ containing the $x_{i_1}$ and ... and the $x_{i_p}$ axes. The $j$-th row of $N$ corresponds to the point $Y'_j$ in $E'$ with coordinates $(a_{j i_1}, \ldots, a_{j i_p})$; this is just the projection of $Y_j$ onto $E'$. If $H'$ is the hyperplane in $E'$ determined by the points $O, Y'_1, \ldots, Y'_m$, then $H'$ is exactly the projection of $H$ onto $E'$, and

$$d(H') = r(N). \tag{12.1}$$

Let $N = (C_{i_1}, \ldots, C_{i_p})$ be any subset of $\mathbf{M}$, and let $E'$, $H'$ correspond to $N$. Then $N$ is independent if and only if

$$d(H') = p,$$

and is a base if and only if

$$d(H') = d(H) = p.$$

THEOREM 27. There is a unique matroid $M$ associated with any hyperplane $H$ through the origin in $E_n$.

Let $M$ contain the elements $e_1, \ldots, e_n$, one corresponding to each coordinate of $E_n$. Given any subset $e_{i_1}, \ldots, e_{i_p}$, we let its rank be the dimension of the projection of $H$ onto the corresponding coordinate hyperplane $E'$ of $E_n$. It was seen above that if $\mathbf{M}$ is any matrix determining $H$, then $M$ is the matroid associated with $\mathbf{M}$.
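The passage from a matrix to its matroid in § 12 amounts to computing the rank of each set of columns. A minimal sketch follows, using exact rational arithmetic so that no rounding enters; the small matrix, the 0-based column indices, and the function names are assumptions made for the illustration. The last function checks Postulate $(R_3)$, in the form quoted in § 12, over all column subsets.

```python
from fractions import Fraction
from itertools import combinations

def matrix_rank(rows):
    """Rank by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    ncols = len(m[0]) if m else 0
    rank = 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for i in range(rank + 1, len(m)):
            factor = m[i][col] / m[rank][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank

def column_rank(matrix, cols):
    """r(N) for the subset N of columns with the given (0-based) indices."""
    cols = sorted(cols)
    return matrix_rank([[row[j] for j in cols] for row in matrix])

def satisfies_r3(matrix):
    """(R_3): r(N + C1) = r(N + C2) = r(N) implies r(N + C1 + C2) = r(N)."""
    n = len(matrix[0])
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            base = column_rank(matrix, subset)
            rest = [j for j in range(n) if j not in subset]
            for j1, j2 in combinations(rest, 2):
                if (column_rank(matrix, subset + (j1,)) == base
                        and column_rank(matrix, subset + (j2,)) == base
                        and column_rank(matrix, subset + (j1, j2)) != base):
                    return False
    return True

# Assumed example: column 3 is twice column 0.
A = [[1, 0, 1, 2],
     [0, 1, 1, 0]]
```

By (12.1) the same number is the dimension of the projection of the row space onto the chosen coordinates, so `column_rank` serves for both geometric readings (a) and (b).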

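13. Orthogonal hyperplanes and dual matroids. We prove the following theorem:

THEOREM 28. Let $H$ be a hyperplane through the origin in $E_n$, of dimension $r$, and let $H'$ be the orthogonal hyperplane through the origin, of dimension $n - r$. Let $M$ and $M'$ be the associated matroids. Then $M$ and $M'$ are duals.

We shall show that bases in one matroid correspond to base complements in the other; Theorem 23 then applies. Let

$$\mathbf{M} = \begin{Vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{r1} & \cdots & a_{rn} \end{Vmatrix}, \qquad \mathbf{M}' = \begin{Vmatrix} b_{11} & \cdots & b_{1n} \\ \vdots & & \vdots \\ b_{n-r,1} & \cdots & b_{n-r,n} \end{Vmatrix}$$

be matrices determining $H$ and $H'$ respectively. Say the first $r$ columns of $\mathbf{M}$ form a base in $M$, i.e. the corresponding determinant $A$ is $\neq 0$. As $H$ and $H'$ are orthogonal, we have for each $i$ and $j$

$$\sum_{k=1}^{n} a_{ik} b_{jk} = 0.$$

Keeping $j$ fixed, we have a set of $r$ linear equations in the $b_{jk}$. Transpose the last $n - r$ terms in each equation to the other side, and solve for $b_{jk}$. We find

$$b_{jk} = -\frac{1}{A} \sum_{l=r+1}^{n} \begin{vmatrix} a_{11} & \cdots & a_{1l} & \cdots & a_{1r} \\ \vdots & & \vdots & & \vdots \\ a_{r1} & \cdots & a_{rl} & \cdots & a_{rr} \end{vmatrix} b_{jl} = \sum_{l=r+1}^{n} c_{kl}\, b_{jl} \qquad (k = 1, \ldots, r),$$

where in each determinant the $k$-th column of $A$ is replaced by the $l$-th column of $\mathbf{M}$. This is true for each $j = 1, \ldots, n - r$, and the $c_{kl}$ are independent of $j$. Thus the $k$-th column of $\mathbf{M}'$ is expressed in terms of the last $n - r$ columns. As this is true for $k = 1, \ldots, r$, the last $n - r$ columns form a base in $M'$, as required.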
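Theorem 28 can be checked numerically on a small pair of orthogonal row spaces. In the sketch below both matrices are assumed examples written out by hand (the rows of `M_ORTH` were chosen to span the orthogonal complement of the rows of `M`); the code verifies the orthogonality and then compares the column bases of one with the base complements of the other, which is the criterion of Theorem 23. Column indices are 0-based and all names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def matrix_rank(rows):
    """Rank by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    ncols = len(m[0]) if m else 0
    rank = 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for i in range(rank + 1, len(m)):
            factor = m[i][col] / m[rank][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank

def column_bases(matrix, size):
    """Sets of `size` columns that are linearly independent."""
    n = len(matrix[0])
    return {frozenset(c) for c in combinations(range(n), size)
            if matrix_rank([[row[j] for j in c] for row in matrix]) == size}

def rows_orthogonal(a, b):
    """True if every row of a is orthogonal to every row of b."""
    return all(sum(x * y for x, y in zip(r, s)) == 0 for r in a for s in b)

M = [[1, 0, 1, 2],
     [0, 1, 1, 0]]          # determines H, of dimension r = 2 in E_4
M_ORTH = [[-1, -1, 1, 0],
          [-2, 0, 0, 1]]    # determines H', of dimension n - r = 2
```

For this pair the five column bases of `M` are 01, 02, 12, 13, 23, and the five column bases of `M_ORTH` are their complements 23, 13, 03, 02, 01.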

14. The circuit matrix of a given matrix. Consider the matrix $\mathbf{M}$ of § 12. Suppose the columns $C_{i_1}, \ldots, C_{i_p}$ form a circuit, i.e. the corresponding elements of the corresponding matroid form a circuit. Then these columns are linearly dependent, and there are numbers $b_1, \ldots, b_n$ such that

$$\sum_{j=1}^{n} b_j a_{ij} = 0 \quad (i = 1, \ldots, m), \qquad b_j = 0 \quad (j \neq i_1, \ldots, i_p). \tag{14.1}$$

The $b_j$ are all $\neq 0$ $(j = i_1, \ldots, i_p)$, for otherwise a proper subset of the columns would be dependent, contrary to the definition of a circuit. (They are uniquely determined except for a constant factor; see Lemma 11.) Suppose the circuits of $\mathbf{M}$ are $P_1, \ldots, P_s$. Then there are corresponding sets of numbers $b_{i1}, \ldots, b_{in}$ $(i = 1, \ldots, s)$, forming a matrix

$$\mathbf{M}' = \begin{Vmatrix} b_{11} & \cdots & b_{1n} \\ \vdots & & \vdots \\ b_{s1} & \cdots & b_{sn} \end{Vmatrix},$$

the circuit matrix of the matrix $\mathbf{M}$.

THEOREM 29. Let $P_1, \ldots, P_q$ be a fundamental set of circuits in $\mathbf{M}$ (see § 9). Then the corresponding rows of the circuit matrix $\mathbf{M}'$ form a base for the rows of $\mathbf{M}'$. Hence $r(\mathbf{M}') = q = n(\mathbf{M})$.

Suppose the columns of $\mathbf{M}$ are ordered so that $P_i$ contains $C_{n-q+i}$ but no column $C_{n-q+j}$ $(j > i)$. Then if the corresponding row of $\mathbf{M}'$ is $R'_i = (b_{i1}, \ldots, b_{in})$, we have $b_{i,n-q+i} \neq 0$ and $b_{i,n-q+j} = 0$ $(j > i)$. Hence the rows $R'_1, \ldots, R'_q$ of $\mathbf{M}'$ are linearly independent, and $r(\mathbf{M}') \ge q$. Hence $r(\mathbf{M}') = n(\mathbf{M}) = q$, and each row of $\mathbf{M}'$ may be expressed in terms of $R'_1, \ldots, R'_q$.

THEOREM 30. If $\mathbf{M}'$ is the circuit matrix of $\mathbf{M}$ and $H'$, $H$ are the corresponding hyperplanes, then $H'$ is the hyperplane of maximum dimension orthogonal to $H$.

This is a consequence of (14.1) and the last theorem.

THEOREM 31. The matroids corresponding to a matrix and its circuit matrix are duals.

This follows from the last theorem and Theorem 28.

15. On the structure of a circuit matrix. Let $M$ be any matroid, and $M'$, its dual. If there exists a matrix $\mathbf{M}$ corresponding to $M$, it is perhaps most easily constructed by considering it as the circuit matrix of a matrix $\mathbf{M}'$ corresponding to $M'$. Let $H$ and $H'$ be the hyperplanes corresponding to $\mathbf{M}$ and $\mathbf{M}'$. We shall say the set of numbers $(a_1, \ldots, a_n)$ is in $Z_{i_1 \cdots i_p}$ if

$$a_j \neq 0 \quad (j = i_1, \ldots, i_p), \qquad a_j = 0 \quad (\text{otherwise}).$$

If $(a_1, \ldots, a_n)$ is in $H$ and in $Z_{i_1 \cdots i_p}$, then the columns $C_{i_1}, \ldots, C_{i_p}$ of $\mathbf{M}'$ are dependent, evidently.

LEMMA 10. Let $(b_1, \ldots, b_n)$ be a point of $H$. If it is in $Z_{i_1 \cdots i_p}$, then the matroid $N' = e_{i_1} + \cdots + e_{i_p}$ is the union of a set of circuits in $M'$.

Here $e_i$ in $M'$ corresponds to $C_i$ in $\mathbf{M}'$. We need merely show that for each $i_s$ there is a circuit $P$ in $N'$ containing $e_{i_s}$. Let $k_1 = i_s, k_2, \ldots, k_q$ be a minimal set of numbers from $(i_1, \ldots, i_p)$ containing $i_s$ such that there is a point $(c_1, \ldots, c_n)$ of $H$ in $Z_{k_1 \cdots k_q}$; then $e_{k_1} + \cdots + e_{k_q}$ is the required circuit. For if it were not a circuit, there would be a proper subset $(l_1, \ldots, l_r)$ of $(k_1, \ldots, k_q)$ and a point $(d_1, \ldots, d_n)$ of $H$ in $Z_{l_1 \cdots l_r}$. No $l_i = k_1$, on account of the minimal property of $(k_1, \ldots, k_q)$. Say $l_1 = k_t$, and set

$$a_i = d_{k_t} c_i - c_{k_t} d_i \qquad (i = 1, \ldots, n).$$

Then $(a_1, \ldots, a_n)$ is in $H$ and in $Z_{m_1 \cdots m_u}$ with $(m_1, \ldots, m_u)$ a proper subset of $(k_1, \ldots, k_q)$ containing $k_1$, again a contradiction.

LEMMA 11. If $P = e_{i_1} + \cdots + e_{i_p}$ is a circuit of $M'$ and $(b_1, \ldots, b_n)$ and $(b'_1, \ldots, b'_n)$ are in $H$ and in $Z_{i_1 \cdots i_p}$, then these two sets are proportional.

For otherwise, $(c_1, \ldots, c_n)$ with $c_i = b'_{i_1} b_i - b_{i_1} b'_i$ would be a point of $H$ in some $Z_{k_1 \cdots k_q}$ with $(k_1, \ldots, k_q)$ a proper subset of $(i_1, \ldots, i_p)$, and $P$ would not be a circuit.

It is instructive to show directly that Postulate $(C_2)$ holds for matrices: $P_1$ and $P_2$ are represented by rows $(b_1, \ldots, b_n)$ and $(b'_1, \ldots, b'_n)$ of $\mathbf{M}$, lying in $Z_{1 2 i_1 \cdots i_p}$ and $Z_{1 k_1 \cdots k_q}$ respectively, where $k_1, \ldots, k_q \neq 2$. Set $c_i = b'_1 b_i - b_1 b'_i$; then $(c_1, \ldots, c_n)$ is in $H$ and in $Z_{2 l_1 \cdots l_r}$, with $(l_1, \ldots, l_r)$ a subset of $(i_1, \ldots, i_p, k_1, \ldots, k_q)$; the existence of $P_3$ now follows from Lemma 10.

THEOREM 32. Let $\mathbf{M}$ be the circuit matrix of $\mathbf{M}'$. Let $P_1, \ldots, P_q$ form a strict fundamental set of circuits in $M'$ with respect to $e_{n-q+1}, \ldots, e_n$, and let the first $q$ rows in $\mathbf{M}$ correspond to $P_1, \ldots, P_q$. Let $(i_1, \ldots, i_s)$ be any set of numbers from $(1, \ldots, q)$, let $(j_1, \ldots, j_s)$ be any set from $(1, \ldots, n - q)$, and let $(i'_1, \ldots, i'_{q-s})$ be the set complementary to $(i_1, \ldots, i_s)$ in $(1, \ldots, q)$. Then the determinant $D$ in $\mathbf{M}$ with rows $i_1, \ldots, i_s$ and columns $j_1, \ldots, j_s$ equals zero if and only if the determinant $D'$ with rows $1, \ldots, q$ and columns $j_1, \ldots, j_s, n - q + i'_1, \ldots, n - q + i'_{q-s}$ equals zero, or, if and only if there exists a circuit $P$ in $M'$ containing none of the columns $e_{j_1}, \ldots, e_{j_s}, e_{n-q+i'_1}, \ldots, e_{n-q+i'_{q-s}}$.

In the matrix of the last $q = r(\mathbf{M})$ columns of $\mathbf{M}$, the terms along the main diagonal and only those are $\neq 0$. If we expand $D'$ by Laplace's expansion in terms of the columns $n - q + i'_1, \ldots, n - q + i'_{q-s}$, we see at once that $D' = 0$ if and only if $D = 0$.

Suppose $D = 0$. Then there is a set of numbers $(a_1, \ldots, a_q)$, not all zero, with $a_i = 0$ $(i \neq i_1, \ldots, i_s)$, such that

$$b_k = \sum_{i=1}^{q} a_i b_{ik} = 0 \qquad (k = j_1, \ldots, j_s),$$

$(b_{i1}, \ldots, b_{in})$ being the $i$-th row of $\mathbf{M}$. $b_k = 0$ also for $k = n - q + i'_1, \ldots, n - q + i'_{q-s}$, as each term is zero for such $k$. The point $(b_1, \ldots, b_n)$ is in $H$. Any circuit given by Lemma 10 is the required circuit $P$.

Suppose the circuit $P$ exists. Then it is represented by a row $(b_1, \ldots, b_n)$ in $\mathbf{M}$. As the first $q$ rows of $\mathbf{M}$ are of rank $q = r(\mathbf{M})$, $(b_1, \ldots, b_n)$ can be expressed in terms of them; say $b_k = \sum a_i b_{ik}$. As $b_k = 0$ $(k = n - q + i'_1, \ldots, n - q + i'_{q-s})$, certainly $a_k = 0$ $(k = i'_1, \ldots, i'_{q-s})$. $D = 0$ now follows from the fact that $b_k = 0$ $(k = j_1, \ldots, j_s)$.
16.

A matroid with no corresponding matrix. 14 The matroid M' has

seven elements, which we name 1,' . " 7. three elements except (16. 1)

124,

135,

167,

286,

The bases consist of all sets of 257,

347,

456.

Defining rank in terms of bases, we have: Each set of le elements is of rank le if le < 2 and of rank 3 if le > 4; a set of three elements is of rank 2 if the set is in (16. 1) and is of rank 3 otherwise. It is easy to see that the postulates for rank are satisfied. (Ra) in the case that N contains two elements is el ) ~ r(N e2) = r(N) = 2. Then satisfied vacuously. For suppose r(N N el and N e2 are both in (16. 1); but any two of these sets have but a single element in common.

+

+

+

+

U After the author had noted that M' satisfies (C*) and corresponds to no linear graph, and had discovered a matroid with nine elements corresponding to no matrix, Saunders MacLane found that M' corresponds to no matrix, and is a well known example of a finite projective geometry (see O. Veblen and ,J. W. Young, Projective Geometry, pp. 3-5).

83

530

HASSLER WHITNEY.

If there exists a matrix M', corresponding to M', then let M be its circuit matrix. 123 is a base in M', and hence (16.2)

124,

135,

236,

1237

form a fundamental set of circuits in M'. Let R 1, R 2 , Rs, R4 be the corresponding rows of M. By multiplying in succession row 1, column 2, rows 2, 3, 4, and columns 4, 5, 6, 7 by suitable constants =F 0, we bring Minto the following form: 110 100 0 lOa o 1 0 0 (16.3) M= 0 1 b o 0 1 0 1 c d 000 1

a, b, c and dare =F o. We now apply Theorem 32 with (i1 , ·

.

·,is ; j1,· . . ,j.)

=

(1,4; 1,2),

(2,4; 1,3),

(3,4; 2,3),

i. e. using the circuits 347, 257, 167. This gives

and hence c = 1, a = d = b. Using the circuit 456, with sets (1,2,3; 1,2,3) . gives 2a = 0, a = 0, a contradiction. In regard to this example, see the end of the paper.
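The two computational claims of § 16 can be checked mechanically: that the rank function defined from (16.1) satisfies the rank postulates, and that the determinant from rows 1, 2, 3 and columns 1, 2, 3 of (16.3) is $-(a + b)$, which becomes $-2a$ once $a = b$. The sketch below is illustrative only. It assumes the usual reading of the postulates (rank of the null set is 0; adjoining one element raises the rank by 0 or 1; and $(R_3)$ as quoted in the text), and the function names are hypothetical.

```python
from itertools import combinations

LINES = [frozenset(s) for s in
         ({1, 2, 4}, {1, 3, 5}, {1, 6, 7}, {2, 3, 6}, {2, 5, 7}, {3, 4, 7}, {4, 5, 6})]
GROUND = frozenset(range(1, 8))

def rank(subset):
    """Rank as described in Section 16, with (16.1) as the triples of rank 2."""
    s = frozenset(subset)
    if len(s) <= 2:
        return len(s)
    if len(s) == 3:
        return 2 if s in LINES else 3
    return 3

def rank_postulates_hold():
    """Brute-force check of the three rank postulates over all subsets."""
    if rank(()) != 0:
        return False
    items = sorted(GROUND)
    for k in range(len(items) + 1):
        for c in combinations(items, k):
            n = frozenset(c)
            outside = sorted(GROUND - n)
            if any(rank(n | {e}) - rank(n) not in (0, 1) for e in outside):
                return False
            for e1, e2 in combinations(outside, 2):
                if (rank(n | {e1}) == rank(n) == rank(n | {e2})
                        and rank(n | {e1, e2}) != rank(n)):
                    return False
    return True

def det_rows123_cols123(a, b):
    """Determinant of [[1, 1, 0], [1, 0, a], [0, 1, b]], expanded along row 1."""
    return 1 * (0 * b - a * 1) - 1 * (1 * b - a * 0) + 0
```

Since `det_rows123_cols123(a, a)` equals `-2 * a`, the condition that it vanish forces $a = 0$ over the rationals, while it holds automatically when arithmetic is taken mod 2.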

APPENDIX. MATRICES OF INTEGERS MOD 2.

We wish to characterize those matroids $M$ corresponding to matrices $\mathbf{M}$ of integers mod 2,$^{15}$ i.e. matrices whose elements are all 0 or 1, where rank etc. is defined mod 2. We shall consider linear combinations, chains:

$$\alpha_1 e_1 + \cdots + \alpha_n e_n \qquad (\alpha\text{'s integers mod 2}) \tag{A.1}$$

in the elements of $M$. The $\alpha$'s may be taken as 0 or 1; (A.1) may then be interpreted as the submatroid $N$ whose elements have the coefficient 1. Conversely, any $N \subset M$ may be written as a chain. Submatroids are added (mod 2) by adding the corresponding chains (mod 2). For instance, $(e_1 + e_2) + (e_2 + e_3) \equiv e_1 + e_3$ (mod 2). Any sum (mod 2) of circuits in $M$ we shall call a cycle in $M$. $N$ is the true sum of $N_1, \ldots, N_s$ if these latter have no common elements and $N = N_1 + \cdots + N_s$. We consider matroids which satisfy the following postulate:

$(C^*)$ Each cycle is a true sum of circuits.

Postulate $(C_2)$ is a consequence of $(C^*)$. For the cycle $P_1 + P_2$ is a submatroid containing $e_2$ but not $e_1$; the existence of $P_3$ now follows from $(C^*)$. A simple example of a matroid not satisfying $(C^*)$ is given by the matroid $M'$ at the end of § 9.

THEOREM 33. A circuit is a minimal non-null cycle, and conversely.

This is proved with the aid of Postulates $(C_1)$ and $(C^*)$.

THEOREM 34. Let $P_1, \ldots, P_q$ be a strict fundamental set of circuits in $M$ with respect to $e_{n-q+1}, \ldots, e_n$. Then there are exactly $2^q$ cycles in $M$, formed by taking all sums (mod 2) of $P_1, \ldots, P_q$.

First, each sum $P_{i_1} + \cdots + P_{i_s}$ (mod 2) is a cycle, containing $e_{n-q+i_1}, \ldots, e_{n-q+i_s}$ and elements (perhaps) from $B = e_1 + \cdots + e_{n-q}$; obviously distinct sums give distinct cycles. Now let $Q$ be any cycle in $M$; say $Q$ contains $e_{n-q+k_1}, \ldots, e_{n-q+k_r}$ and elements (perhaps) from $B$. Set $Q' = P_{k_1} + \cdots + P_{k_r}$; then $Q + Q'$ is a cycle containing elements from $B$ alone. But $B$ is a base (see the proof of Theorem 10), and hence contains no circuits. Consequently $Q + Q'$ is the null cycle, and $Q = Q'$.

THEOREM 35. As soon as the circuits of a strict fundamental set are known, all the circuits may be determined.

This is a consequence of the last two theorems. It is to be contrasted with the final remark of § 9.

$^{15}$ See O. Veblen, "Analysis situs," 2nd ed., American Mathematical Society Colloquium Publications, Ch. I and Appendix 2.

Remark. The word "strict" may be omitted in the last two theorems.
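Theorems 33 to 35 describe a direct computation: form all sums (mod 2) of a strict fundamental set, then keep the minimal non-null ones. A minimal sketch follows, with sets standing for chains and symmetric difference for addition mod 2. As an assumed example it uses the four sets 124, 135, 236, 1237 of (16.2), which form a strict set with respect to 4, 5, 6, 7; the function names are hypothetical.

```python
def cycles_from_fundamental_set(fundamental):
    """All 2^q sums (mod 2) of the given circuits, as frozensets."""
    cycles = {frozenset()}
    for p in fundamental:
        # Adding p mod 2 is the symmetric difference of element sets.
        cycles |= {c ^ p for c in cycles}
    return cycles

def circuits_from_cycles(cycles):
    """Minimal non-null cycles (Theorem 33)."""
    nonnull = [c for c in cycles if c]
    return {c for c in nonnull if not any(d < c for d in nonnull)}

FUNDAMENTAL = [frozenset(s) for s in
               ({1, 2, 4}, {1, 3, 5}, {2, 3, 6}, {1, 2, 3, 7})]

CYCLES = cycles_from_fundamental_set(FUNDAMENTAL)
CIRCUITS = circuits_from_cycles(CYCLES)
```

For this input there are 16 cycles and 14 circuits: the seven triples of (16.1) and their seven four-element complements. The only non-null cycle that is not a circuit is the full set 1234567.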

36. Let e,,· .. , en be a set of elements, and let p " . .. , P q be any subsets such that Pi contains Cn-q+i and possibly elements from el,· . ., en _q alone. Then there is a unique matroid M satisfying (C*), with p " . .. , P q as a strict fundamental set of circuits. THEOREM

85

532

HASSLER \VlIITNEY.

We form the 2q cycles of Theorem 34. Those cycles which contain no other non-null cycle as a proper subset we call circuits; in particular, P 1 , · • • , P q are circuits. To prove (C*), let Q be a non-null cycle. If it is not a circuit, it contains a circuit P as a proper subset. Q and Pare sums (mod 2) from P l , · • . , P q, hence the same is true of Q - P, and Q - P is one of the 2q cycles. If it is not a circuit, we again extract a circuit, etc. This theorem furnishes a simple method of constructing all matroids satisfying (0*). We turn now to the study of matrices of integers (mod 2)

all·

. a1n

am1 .

. a mn

M =

(each aij = 0 or 1).

Any linear combination (mod 2) of the columns (a's integers mod 2)

(A.2)

is a set of numbers (~aia1i'· .. , ~aiami), which we call a chain (mod 2) in M. As before, we may take each coefficient as 0 or 1, and we may consider any chain merely as a submatrix of M. The chain is a cycle if each of the corresponding numbers is E : 0 (mod 2). The columns Gil>· .. , Gi p are independent (mod 2) if there exists no set of integers a1,· .. , an not all 0 (mod 2), p), with ai = 0 (i ¥= i 1 , · . . , i such that ~aiG i is a cycle, i. e. if no non-null subset of Gi,,· . ., Gip is a cycle. Using this definition, the terms base, circuit, rank, nullity etc. (mod 2) can be defined as in Part I. Let M be a set of elements e1,· .. , en corresponding to G1,· .. , Gn in M, ei p be a circuit in M if and only if Gi,,· .. , Gip is a and let ei, circuit in M. We shall show that M is a matroid satisfying (C*) and the definitions of cycle in M and M agree. We show first that each circuit is a cycle in M. If Gi ,,· .. , Gi p is a circuit, then these columns are dependent; hence ~aiGi is a cycle, with ai = 0 (i ¥= i 1 , · • • , ip). Moreover a. = 1 (i = i 1, · . • , ip), for otherwise a Gip proper subset of Gi,,· .. , Gip would be dependent. Hence Gil is a cycle. Next, any sum (mod 2) of circuits is a cycle, evidently. Next we prove (0*). Suppose Q = Gil Gip is a cycle. Let (k 1,· .. , kq) be a minimal subset of (i1 , · . • , ip ) such that P = Gk, Gkq is a cycle; then P is a circuit. Q - P is a cycle; from it we extract a circuit, just as above, etc. It follows from (0*) that the definitions of cycle in M and M agree. Theorems 33, 34 and 35 now apply to M also. Weare now ready to prove the final theorem:

==

+ ... +

+ ... +

+ ... +

86

+ ... +

THE ABSTRACT PROPERTIES THEOREM 37. p(M) = n, and e1

OF 1.1); EAH

lJEPENmJNCE.

533

be any matroid satisfying (C*). Suppose is a base. Then 11 lU 1 is any matrix oj integers ( mod 2) with n - q columns which are independent ( mod 2), columns C n - Q+1 , ' • "0,,, can be adjoined in a unique manner to lU 1 , forming a matrix lU of which the corresponding matroid is M. Let M

+ ... + e'1l_q

Let $P_1, \ldots, P_q$ be a strict fundamental set of circuits in $M$ with respect to $e_{n-q+1}, \ldots, e_n$ (Theorem 9). Say $P_1 = e_{i_1} + \cdots + e_{i_p} + e_{n-q+1}$. Set $C_{n-q+1} \equiv C_{i_1} + \cdots + C_{i_p}$ (mod 2); this determines $C_{n-q+1}$ as a column of 0's and 1's so that $P'_1 = C_{i_1} + \cdots + C_{i_p} + C_{n-q+1}$ is a circuit. ($P'_1$ is a cycle; (C*) shows that it is a single circuit, as $C_1 + \cdots + C_{n-q}$ contains no circuit.) $C_{n-q+1}$ evidently must be chosen in this manner. We choose the remaining columns of $\mathbf{M}$ similarly. Let $M'$ be the matroid corresponding to $\mathbf{M}$. Then $P'_1, \ldots, P'_q$ is a strict set of circuits in $M'$. These same sets form a strict set in $M$; hence, by Theorem 35, the circuits in $M'$ correspond to those in $M$. Consequently $M' = M$, completing the proof.

We end by noting that the matroid $M'$ of §16 satisfies Postulate (C*) but corresponds to no linear graph. For letting 123 be a base and (16.2) a fundamental set of circuits and determining the matroid as in Theorem 36, we come out with exactly $M'$. A corresponding matrix of integers mod 2 is constructed from (16.3) with a = b = c = d = 1; we interchange rows and columns in the left-hand portion, leave out the last row and column of the right-hand portion, and interchange these two parts. (The relation 2a = 0 is of course true mod 2.) On the other hand, it is easily seen that if the element 7 is left out, there is a corresponding graph, which must be of the following sort: It has four vertices a, b, c, d, and the arcs corresponding to the elements 1, ..., 6 are

ab, ac, ad, bc, bd, cd.

There is no way of adding the required seventh arc. The problem of characterizing linear graphs from this point of view is the same as that of characterizing matroids which correspond to matrices (mod 2) with exactly two ones in each column.

HARVARD UNIVERSITY.
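As an illustrative aside (not part of the original text), the mod 2 notions used above can be checked mechanically on the graph just described. The sketch below assumes the six arcs ab, ac, ad, bc, bd, cd on the vertices a, b, c, d, encodes each incidence column as a bitmask (an encoding chosen here purely for convenience), and lists the circuits as the minimal sets of columns whose sum is zero mod 2. The names `column`, `is_cycle` and `circuits` are hypothetical helpers.

```python
from itertools import combinations

VERTICES = "abcd"
ARCS = ["ab", "ac", "ad", "bc", "bd", "cd"]

def column(arc):
    """Incidence column (mod 2) of an arc, stored as a bitmask with one bit per vertex."""
    return sum(1 << VERTICES.index(v) for v in arc)

def is_cycle(arcs):
    """A set of columns is a cycle when its sum (mod 2) is the zero column."""
    total = 0
    for arc in arcs:
        total ^= column(arc)  # XOR is addition mod 2, applied to all rows at once
    return total == 0

def circuits(arcs):
    """Minimal non-null cycles, found by increasing size."""
    found = []
    for size in range(1, len(arcs) + 1):
        for subset in combinations(arcs, size):
            contains_smaller = any(set(c) <= set(subset) for c in found)
            if is_cycle(subset) and not contains_smaller:
                found.append(subset)
    return found

if __name__ == "__main__":
    for circuit in circuits(ARCS):
        print(" + ".join(circuit))
```

With ab, ac, ad taken as a base, the column of bc is forced to be the sum mod 2 of the columns of ab and ac, which is the adjoining step of the theorem in miniature.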

Reprinted from Amer. J. Math. 57 (1935), 509-533


Reprinted from DUKE MATHEMATICAL JOURNAL, Vol. 7, December, 1940

THE DISSECTION OF RECTANGLES INTO SQUARES

By R. L. BROOKS, C. A. B. SMITH, A. H. STONE AND W. T. TUTTE

Introduction. We consider the problem of dividing a rectangle into a finite number of non-overlapping squares, no two of which are equal. A dissection of a rectangle R into a finite number n of non-overlapping squares is called a squaring of R of order n; and the n squares are the elements of the dissection. The term "elements" is also used for the lengths of the sides of the elements. If there is more than one element and the elements are all unequal, the squaring is called perfect, and R is a perfect rectangle. (We use R to denote both a rectangle and a particular squaring of it.) Examples of perfect rectangles have been published in the literature.^1

Our main results are:

Every squared rectangle has commensurable sides and elements.^2 (This is (2.14) below.)

Conversely, every rectangle with commensurable sides is perfectible in an infinity of essentially different ways. (This is (9.45) below.) (Added in proof. Another proof of this theorem has since been published by R. Sprague: Journal für Mathematik, vol. 182(1940), pp. 60-64; Mathematische Zeitschrift, vol. 46(1940), pp. 460-471.)

In particular, we give in §8.3 a perfect dissection of a square into 26 elements.^3

There are no perfect rectangles of order less than 9, and exactly two of order 9.^4 (This is (5.23) below.)

The first theorem mentioned is due to Dehn, who remarked^5 that the difficulty of the problem is the semi-topological one of characterizing how the elements fit together. This is overcome here in §1 by associating a certain linear graph (the "normal polar net") with each "oriented" squared rectangle. The metrical properties of the squared rectangle are found to be determined by a certain flow of electric current through this network. Accordingly, in §2 we collect the relevant results from the theory of electrical networks. In particular, the elements of the squared rectangle can be calculated from determinants formed from the incidence matrix of the network.

In §3, the elements are expressed in a different way, in terms of the subtrees of the network. This leads to some relations between determinants and the subtrees of a network, and to some duality theorems. In §4, these duality theorems are applied to prove the converse of §1: that to any "polar net" corresponds a squared rectangle; and moreover, it is shown that (roughly speaking) the networks which correspond to the same squared rectangle in its two orientations are dual. In §5, the polar net is used to determine all the squared rectangles of a given order; in particular, the "simple" perfect rectangles of orders < 12 are tabulated. §6 contains some theorems on the factorization properties of the elements of a squared rectangle, as determined in §2; as corollaries, we have some sufficient conditions for a squared rectangle to be perfect ((6.20), (6.21)). In §7, we give "non-uniqueness" constructions: in §7.1, of rectangles which can be dissected into the same elements in essentially different ways, and, in §7.2, of pairs of squared rectangles having the same shape but different elements. These constructions depend mostly on considerations of symmetry or duality in the corresponding networks. In §8, the results of §7.2 are used to give "perfect" squares; and in §9, a whole family of "totally different" perfect squares is worked out, and this leads to the result that every rectangle whose sides are commensurable is perfectible. We conclude (§10) by outlining some generalizations, notably "rectangled rectangles", squared cylinders and tori, "triangulated" equilateral triangles, and "cubed cubes". We prove in particular that no "perfect" dissection of a rectangular parallelopiped into cubes is possible.^6

Received May 7, 1940. We are indebted to Dr. B. McMillan, of Princeton University, for help with the diagrams.
^1 A bibliography is given at the end of this paper. Numbers in square brackets refer to this bibliography.
^2 Cf. [6], p. 319.
^3 This disproves a conjecture of Lusin; cf. [10], p. 272. For an independent example of a perfect square (published while this paper was in preparation) see [13].
^4 Partly confirming and partly disproving a conjecture of Toepken (see [18]).
^5 [12], p. 402.

1. The net associated with a squared rectangle

1.1. In any squaring of a rectangle R,^7 the sides of all the elements and of R will clearly be parallel to two perpendicular lines. We orient R by choosing one of these lines to be "horizontal" (i.e., parallel to the x-axis).
The distinction between this configuration, and its reflections in the coordinate axes, is unimportant; but it is convenient to distinguish it from R in the other orientation (obtained by rotating R through an angle of $\frac{1}{2}\pi$), called the conjugate of R.

Consider the point-set formed by the horizontal sides of the elements of R. Its connected components will be horizontal line-segments (each consisting of a set of horizontal sides of elements of R); enumerate them as $p_1, \ldots, p_N$, say, where $p_1$, $p_N$ are the upper and lower edges of R. Take N points $P_1, \ldots, P_N$ in the plane. Let E be an element of R; its upper edge will lie in some one of $p_1, \ldots, p_N$, say $p_i$; similarly, its lower edge will lie in $p_j$ $(i \ne j)$. Join the points $P_i$, $P_j$ by a line (simple arc) e. By taking all elements E of R, we get a network (linear graph) on $P_1, \ldots, P_N$ as vertices and the e's as 1-cells. Figure 1 provides an example. The points $P_1$, $P_N$ are the poles of the network. We can arrange the joins e in such a way that

(1.11) the network is realizable in a plane with no two 1-cells intersecting (except at a vertex).

(1.12) No circuit encloses a pole.

For we can realize the network as follows. Take $P_i$ to be the mid-point of $p_i$; and take $\epsilon > 0$ sufficiently small. For each element E, take the vertical segment which bisects E, and cut off a length $\epsilon$ from each end, leaving a segment AB, say. Join the upper end of AB, A, to the $P_i$ corresponding to the upper boundary of E, by a straight line-segment, and similarly join B to the $P_j$ corresponding to the lower edge of E. The path $P_iABP_j$ is defined to be e. It is now easily verified that (1.11) and (1.12) hold.

^6 Answering a question raised by Chowla in [5].
^7 Throughout, all squares are supposed to have positive sides; thus zero elements are excluded.

[FIG. 1]

Also we have clearly

(1.13) The network is connected.

Remark. In general there may be several 1-cells joining two vertices, though not if the squaring is perfect.

(1.14) DEFINITIONS. A network with more than one vertex, satisfying (1.11) and (1.13), is called a net. If two of the vertices of a net are assigned as "poles", and (1.12) is satisfied, the net is a polar net (p-net). The network constructed above is the normal polar net of the squared rectangle.

1.2. Kirchhoff's laws. With each 1-cell $e = P_iP_j$ of our normal p-net, associate the length of the side of the corresponding element E, directed from the "upper" point ($P_i$) to the "lower" point ($P_j$); call this the current in e. Then

(1.21) Except at the poles, the total current flowing into $P_i$ is zero. (For current flowing in = length of $p_i$ = current flowing out.)

(1.22) The algebraic sum of the currents round any circuit is zero. (For the current in a "wire" $e = P_iP_j$ is the vertical height of $p_i$ above $p_j$.)

(1.23) The sum of the currents flowing into $P_1$ = length of horizontal side of R = sum of the currents flowing out of $P_N$.

(1.21) and (1.22) are the usual Kirchhoff laws for a flow of electric current in the net from $P_1$ to $P_N$, it being assumed that each 1-cell is a wire of unit conductance. ["Rectangulations" of rectangles can be dealt with similarly; the conductance of e will then be the ratio of the sides of E.]

Equations (1.21) and (1.22) can be interpreted differently. Consider the cellular 2-complex formed by embedding our p-net in a 2-sphere. We have on it a Kirchhoff chain (K-chain), viz., the 1-chain $\sum$ (current in e)·e. Then

(1.24) The K-chain is a cycle modulo its poles. (This $\Leftrightarrow$ (1.21).)

(1.25) The K-chain is an absolute cocycle. (This $\Leftrightarrow$ (1.22).)
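The construction of §1.1 can be sketched in code. The example below is only an illustration under stated assumptions: it takes a squared rectangle as a list of `(x, y, side)` triples with `(x, y)` the upper left corner and `y` increasing downward, uses a layout assumed here for the 33 by 32 rectangle of order 9, merges the horizontal sides into connected segments, and turns each square into a wire from its upper segment to its lower one. The function names are hypothetical.

```python
from collections import defaultdict

def normal_polar_net(squares):
    """Return (segments, wires) for squares given as (x, y, side), y measured downward."""
    # Collect every horizontal side, grouped by its height.
    sides = defaultdict(list)
    for x, y, s in squares:
        sides[y].append((x, x + s))      # upper edge of the square
        sides[y + s].append((x, x + s))  # lower edge of the square
    # Connected components: merge touching or overlapping intervals at each height.
    segments = []
    for y in sorted(sides):
        merged = []
        for lo, hi in sorted(sides[y]):
            if merged and lo <= merged[-1][1]:
                merged[-1][1] = max(merged[-1][1], hi)
            else:
                merged.append([lo, hi])
        segments.extend((y, lo, hi) for lo, hi in merged)

    def segment_index(y, lo, hi):
        for idx, (sy, slo, shi) in enumerate(segments):
            if sy == y and slo <= lo and hi <= shi:
                return idx
        raise ValueError("edge not found in any segment")

    # One wire per element: (upper vertex, lower vertex, current = side length).
    wires = [(segment_index(y, x, x + s), segment_index(y + s, x, x + s), s)
             for x, y, s in squares]
    return segments, wires

def net_inflow(wires, n):
    """Net current flowing into each vertex; zero away from the poles by (1.21)."""
    inflow = [0] * n
    for upper, lower, current in wires:
        inflow[upper] -= current
        inflow[lower] += current
    return inflow

# Assumed layout of the 33 x 32 rectangle (elements 18, 15, 14, 10, 9, 8, 7, 4, 1).
SQUARES = [(0, 0, 18), (18, 0, 15), (18, 15, 7), (25, 15, 8), (0, 18, 14),
           (14, 18, 4), (14, 22, 10), (24, 22, 1), (24, 23, 9)]

if __name__ == "__main__":
    segments, wires = normal_polar_net(SQUARES)
    print(segments)
    print(wires)
    print(net_inflow(wires, len(segments)))
```

Segments are numbered from top to bottom, so the first and last vertices are the poles, and the net inflow at every other vertex comes out as zero.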

2. Some results from the electrical theory of networks

2.1. In the previous section, we reduced the study of squared rectangles to the study of certain flows of electricity in networks. Here we collect the results on electrical networks in general which will be useful later. Let [...]
Thus $c_{rs} = c_{sr}$, and

(2.12) $\sum_r c_{rs} = 0$.

We make the convention that if [...] 0. (An independent proof is given below; see (3.14).) The second cofactor obtained by taking the cofactor of the component $c_{su}$ in the cofactor of $c_{rt}$ $(r \ne s, t \ne u)$ is denoted by $[rs, tu]$. (If N = 2, $[12, 12] = 1 = -[21, 12]$.) We put $[rr, tu] = 0 = [rs, tt]$. The $[rs, tu]$'s are called the transpedances (generalized transfer impedances) of

Consider a flow of current from $P_x$ to $P_y$ (the poles). The currents in the wires satisfy (1.21); the potential differences (P.D.'s) satisfy the analogue of (1.22); and the total current I is given by (1.23). It is known^8 that these conditions (with Ohm's Law) determine the flow uniquely when I is given, and that

(2.13) P.D. from $P_r$ to $P_s$ when current I enters at $P_x$ and leaves at $P_y$ is $[xy, rs] \cdot I / C$.

It is convenient to take I = C, thus fixing the values of the currents and P.D.'s of the network. The flow with I = C is called the full flow; and we speak of the "full currents", etc. Applying this to the normal p-net of a squared rectangle, where all conductances are 1, so that all the transpedances are integers, we see from (1.21)-(1.23) and (2.13) that (2.14)

Every squared rectangle has commensurable sides and elements.
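A small numerical sketch may make the full flow concrete. It is an illustration only: instead of the second-cofactor definition of the transpedances, it solves the Kirchhoff equations by exact rational elimination, and it assumes the usual conductance (Laplacian) matrix, whose first cofactors all equal the complexity C. The network is the first order-9 entry of the table in §5.3 with vertices a, ..., f numbered 0, ..., 5; all function names are hypothetical.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def laplacian(n, wires):
    """Conductance matrix: diagonal entries are the total conductance at each vertex."""
    L = [[Fraction(0)] * n for _ in range(n)]
    for r, s, c in wires:
        L[r][r] += c
        L[s][s] += c
        L[r][s] -= c
        L[s][r] -= c
    return L

def solve_and_det(A, b):
    """Gauss-Jordan elimination over the rationals; returns (x, det A) with A x = b."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    det = Fraction(1)
    for col in range(n):
        piv = next(i for i in range(col, n) if M[i][col] != 0)
        if piv != col:
            M[col], M[piv] = M[piv], M[col]
            det = -det
        pivot = M[col][col]
        det *= pivot
        M[col] = [v / pivot for v in M[col]]
        for i in range(n):
            if i != col and M[i][col] != 0:
                f = M[i][col]
                M[i] = [vi - f * vc for vi, vc in zip(M[i], M[col])]
    return [row[n] for row in M], det

def full_flow(n, wires, source, sink):
    """Complexity C and vertex potentials (sink at 0) when current I = C enters at source."""
    L = laplacian(n, wires)
    keep = [v for v in range(n) if v != sink]
    A = [[L[i][j] for j in keep] for i in keep]  # delete the row and column of the sink
    _, C = solve_and_det(A, [Fraction(0)] * len(keep))
    phi, _ = solve_and_det(A, [C if v == source else Fraction(0) for v in keep])
    pot = dict(zip(keep, phi))
    pot[sink] = Fraction(0)
    return C, pot

# Wires ab, ac, bd, cd, be, de, ef, df, cf, each of unit conductance.
NET = [(0, 1, 1), (0, 2, 1), (1, 3, 1), (2, 3, 1), (1, 4, 1),
       (3, 4, 1), (4, 5, 1), (3, 5, 1), (2, 5, 1)]
C, pot = full_flow(6, NET, 0, 5)
currents = [c * (pot[r] - pot[s]) for r, s, c in NET]
reduction = reduce(gcd, (int(x) for x in currents))

if __name__ == "__main__":
    print("C =", C, " V =", pot[0])
    print("full currents:", [int(x) for x in currents])
    print("reduction:", reduction)
```

Every quantity produced is an integer, as (2.14) requires, and dividing the full currents by their H.C.F. gives the reduced elements.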

The H.C.F. of the full currents of a p-net is the reduction $\rho$ of the p-net. Notice that $\rho$ is also the H.C.F. of all the full P.D.'s of the p-net. The flow with $I = C/\rho$ is the reduced flow.

2.2. Properties of the transpedances.

We have

(2.21) $[rs, tu] = [tu, rs] = -[sr, tu]$,

(2.22) $\sum_x c_{tx} \cdot [rs, tx] = C \cdot (\delta_{ts} - \delta_{tr})$,

(2.23) $[rs, tu] + [rs, uv] = [rs, tv]$.

(2.22) and (2.23) verify that (2.13) does in fact provide a solution of the Kirchhoff equations, and that the current at each pole is C. We call $[rs, rs]$ the impedance of r, s, and write it $V(rs)$. Then

(2.24) $V(rr) = 0$, $V(rs) = V(sr)$,

(2.25) $2 \cdot [rs, tu] = V(ru) + V(st) - V(su) - V(rt)$ (from (2.23)),

(2.26) $[rs, tu] + [tr, su] + [st, ru] = 0$.

2.3. Alterations to the network. For later use, we need to know the effect on the transpedances of making certain alterations to the network $\mathfrak{N}$.

I. Introduce a new wire joining a vertex $P_m$ of $\mathfrak{N}$ to a new vertex $P_0$. Let the new wire have conductance c; then, in the new network $\mathfrak{N}_1$,

(2.31) $[ab, xy]_1 = c \cdot [ab, xy]$ if $0 \ne a, b, x, y$; $[ab, m0]_1 = 0$; $V_1(x0) = V_1(xm) + V_1(m0) = c \cdot V(xm) + C$.

^8 [8], pp. 324-331.

These results are immediate from the definitions.

II. Identify two points $P_x$, $P_y$ and ignore any wire that may have joined them. In the new network $\mathfrak{N}_2$,

(2.32) $C_2 = [xy, xy] = V(xy)$ (from the definitions),

(2.33) $[rs, tu]_2 = \dfrac{[rs, tu] \cdot V(xy) - [rs, xy] \cdot [tu, xy]}{C}$

(for these expressions satisfy Kirchhoff's laws for $\mathfrak{N}_2$, and agree with (2.32)). In particular,

(2.34) $V_2(rs) = \dfrac{V(rs) \cdot V(xy) - [rs, xy]^2}{C}$.

((2.33) may be generalized as follows: $C^n$ divides the (n + 1)-th order determinants formed as minors of the matrix of transpedances. This is an extension of the Cauchy-Sylvester identity.^9)

III. Introduce a new wire of conductance c in $\mathfrak{N}$, joining $P_x$ and $P_y$. In the new network $\mathfrak{N}_3$ we have, from their definitions as determinants,

(2.35) $C_3 = C + c \cdot V(xy) = C + c \cdot C_2$ (from (2.32));

(2.36) $[rs, xy]_3 = [rs, xy]$; in particular, $V_3(xy) = V(xy)$.

Also

(2.37) $[rs, tu]_3 = [rs, tu] + c \cdot [rs, tu]_2$;

for III is a combination of I and II. We introduce a new vertex $P_0$, join it to $P_x$ by a wire of conductance c, and identify $P_y$ and $P_0$. This enables us to verify (2.37).

3. Subtrees of a network: duality

We shall now characterize the complexity (and hence the transpedances) of a network more topologically, in terms of the "subtrees" of the network. This enables us to prove some duality theorems which will be useful later (§4) and are of interest in themselves.

3.1. As in the previous section, let $\mathfrak{N}$ be a connected network with conductances. By a subnetwork
^9 [19], p. 87.

When a new wire of conductance c is inserted joining $P_x$, $P_y$, let "H" for the new network be $H_3$; and when $P_x$, $P_y$ are identified (as in §2.3, II), let "H" become $H_2$. Clearly,

(3.12) $H_3 = H + c \cdot H_2$.

But this is the relation which holds between the complexities of these networks (2.35). Also, for a connected network with only two vertices, C = sum of conductances of the wires joining $P_1$ to $P_2$ = H. Hence, by induction on the numbers of vertices and wires in $\mathfrak{N}$, we have:^10

(3.13) THEOREM. For any connected network with more than one vertex, having conductances assigned to the 1-cells, C = H.

If the conductances are all positive, we clearly have H > 0. This proves

(3.14) C > 0.

This interpretation of complexity in terms of trees enables us, if (2.32) is used, to express $V(xy)$ in terms of the trees of networks formed from $\mathfrak{N}$ by identifying certain pairs of its vertices, and hence in terms of the "tree-pairs" of $\mathfrak{N}$ (formed by omitting one wire from a subtree). Hence, using (2.25), we can get similar interpretations for all the transpedances. In the case of a net, all conductances are 1, so H = number of subtrees of $\mathfrak{N}$; thus (3.13) gives an explicit formula for the number of subtrees of any connected network, in terms of the incidences of the network.

3.2. Duality relations. Now suppose that $\mathfrak{N}$ can be imbedded in a 2-sphere, and let $\mathfrak{N}^*$ be its dual on the sphere. The conductivity of a wire of $\mathfrak{N}^*$ is defined to be the reciprocal of that of the dual wire of $\mathfrak{N}$. Thus $\mathfrak{N}^{**} = \mathfrak{N}$, and the dual of a net is a net. The codual of a subnetwork $\mathfrak{M}$ of $\mathfrak{N}$ is the subnetwork $\mathfrak{M}^c$ of $\mathfrak{N}^*$ whose 1-cells are those not dual to any wire of $\mathfrak{M}$. Clearly $\mathfrak{M}^{cc} = \mathfrak{M}$. It can be shown that

(3.21) A subnetwork $\mathfrak{M}$ of $\mathfrak{N}$ is a tree if and only if both $\mathfrak{M}$ and $\mathfrak{M}^c$ are connected.

Hence

(3.22) If $\mathfrak{M}$ is a subtree of $\mathfrak{N}$, then $\mathfrak{M}^c$ is a subtree of $\mathfrak{N}^*$; and conversely.
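The tree interpretation of §3.1 lends itself to a brute-force check. The sketch below is illustrative only and assumes that H denotes the sum, over all subtrees (spanning trees), of the product of the conductances of their wires, which agrees with the remark that H is the number of subtrees when every conductance is 1. It enumerates the subtrees of the first order-9 net of the table in §5.3 (vertices a, ..., f numbered 0, ..., 5) and checks relation (3.12) when the poles are joined or identified. All names are hypothetical.

```python
from itertools import combinations

def subtrees(n, wires):
    """Yield each spanning tree as a tuple of indices into wires [(r, s, conductance), ...]."""
    for subset in combinations(range(len(wires)), n - 1):
        parent = list(range(n))

        def find(v):
            while parent[v] != v:
                v = parent[v]
            return v

        acyclic = True
        for i in subset:
            a, b = find(wires[i][0]), find(wires[i][1])
            if a == b:            # this wire would close a circuit
                acyclic = False
                break
            parent[a] = b
        if acyclic:               # n - 1 wires without a circuit reach all n vertices
            yield subset

def tree_sum(n, wires):
    """H: sum over subtrees of the product of the conductances of their wires."""
    total = 0
    for tree in subtrees(n, wires):
        product = 1
        for i in tree:
            product *= wires[i][2]
        total += product
    return total

def identify(n, wires, x, y):
    """Merge vertex y into x, dropping any wire that joined them; vertices are renumbered."""
    relabel = {}
    for v in range(n):
        if v != y:
            relabel[v] = len(relabel)
    relabel[y] = relabel[x]
    merged = [(relabel[r], relabel[s], c) for r, s, c in wires
              if relabel[r] != relabel[s]]
    return n - 1, merged

# Wires ab, ac, bd, cd, be, de, ef, df, cf, each of unit conductance.
NET = [(0, 1, 1), (0, 2, 1), (1, 3, 1), (2, 3, 1), (1, 4, 1),
       (3, 4, 1), (4, 5, 1), (3, 5, 1), (2, 5, 1)]
H = tree_sum(6, NET)                     # the net itself
H2 = tree_sum(*identify(6, NET, 0, 5))   # poles a and f identified
H3 = tree_sum(6, NET + [(0, 5, 1)])      # extra unit wire joining the poles

if __name__ == "__main__":
    print(H, H2, H3)
```

The three numbers satisfy H3 = H + H2 with c = 1, and they coincide with the two full sides and the semiperimeter listed for this rectangle in §5.3.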

Let $M_r^*$ equal the product of conductances of wires in the subtree (of $\mathfrak{N}^*$) which is codual to the r-th subtree of $\mathfrak{N}$. Let $\pi$ equal the product of conductances of all wires of $\mathfrak{N}$. Then, clearly,

(3.23) [...]

^10 This result is due in principle to Kirchhoff ([9], p. 497). Cf. also [3].

Hence, using (3.22), (3.11), and (3.13), we have

(3.24) If $C^*$ is the complexity of the dual of $\mathfrak{N}$, $\pi \cdot C^* = C$.

In particular, we have proved

(3.25) THEOREM. Dual nets have equal complexities.

3.3. Polar duality. Let $\mathfrak{P}$ be a p-net. By (1.12), we can join the poles of $\mathfrak{P}$ by an extra wire $e_0$, without violating (1.11). The resulting net is called the completed net (c-net) $\mathfrak{C}$ of $\mathfrak{P}$. Let $\mathfrak{C}$ be imbedded in a 2-sphere, and let $\mathfrak{C}^*$ be the dual of $\mathfrak{C}$. From $\mathfrak{C}^*$ omit $e_0^*$, the dual of $e_0$, and take the ends of $e_0^*$ as poles. We get a p-net $\mathfrak{P}'$, the polar dual of $\mathfrak{P}$.^11 Clearly $\mathfrak{P}'' = \mathfrak{P}$. (The importance of polar duality arises from the fact that, as we shall show in §4.3, polar dual p-nets correspond to the same squared rectangle in its two "orientations" (§1.1).) The p-dual (polar dual) of any 1-chain on $\mathfrak{P}$ is defined in the obvious way (as having the same multiplicity on $e_i^*$ as the given chain has on $e_i$).

(3.31) THEOREM. The p-dual of the full Kirchhoff chain on a p-net $\mathfrak{P}$ is the full Kirchhoff chain on the p-dual p-net $\mathfrak{P}'$.

Proof. We use $S\mathfrak{N}$ to denote the cellular 2-complex formed by a network $\mathfrak{N}$ imbedded in a 2-sphere. $F$, $\delta$ are (as usual) boundary and coboundary operators, and * denotes duality with respect to the 2-sphere.

By (1.24), (1.25), the full K-chain $\mathfrak{K}$ on $\mathfrak{P}$ is a cycle relative to $P_1$, $P_N$ (the poles of $\mathfrak{P}$), and an absolute cocycle on $S\mathfrak{P}$. Hence, in $S\mathfrak{C}$ (where $\mathfrak{C}$ is the completed net of $\mathfrak{P}$) $\mathfrak{K}$ is
(i) a relative cycle mod $P_1$, $P_N$, and
(ii) a relative cocycle mod the two 2-cells, say $\pi_1$, $\pi_2$, which have incidence with $e_0$, the "extra" join.
Dualizing, in $S\mathfrak{C}^*$, we see that $\mathfrak{K}^*$ is
(i) a relative cocycle mod the 2-cells $P_1^*$, $P_N^*$, and
(ii) a relative cycle mod $\pi_1^*$ and $\pi_2^*$, the poles of $\mathfrak{P}'$.
But $\mathfrak{K}^*$ has zero multiplicity on $e_0^*$, for $\mathfrak{K}$ has zero multiplicity on $e_0$. Hence $\mathfrak{K}^*$ is (from (i)) a cycle on $\mathfrak{P}'$ mod its poles, and (from (ii)) a cocycle on $S\mathfrak{P}'$ mod the 2-cell consisting of $P_1^*$ and $P_N^*$ together. But a single 2-cell cannot be a coboundary; for, dualizing, this would require a single vertex to be a boundary. Hence $\mathfrak{K}^*$ is an absolute cocycle on $S\mathfrak{P}'$, besides being a cycle mod its poles. So $\mathfrak{K}^*$ is a K-chain on $\mathfrak{P}'$. Let $\mathfrak{K}'$ be the full K-chain on $\mathfrak{P}'$; thus $\mathfrak{K}^* = k \cdot \mathfrak{K}'$, for some k.

^11 There may be several ways of placing $e_0$ on the sphere, and consequently several polar duals of $\mathfrak{P}$ (differing, however, only trivially). We suppose that one of these is chosen arbitrarily. In the open plane, a convention will be introduced to make $\mathfrak{P}'$ unique; cf. §§4.2, 4.3.

Let $\mathfrak{P}$ have complexity C, and $V(1N) = V$. Let the corresponding numbers for $\mathfrak{P}'$ be $C'$, $V'$. Using (2.22) (with $c_{jk} = 1$), we have, in $\mathfrak{P}$,

$F(\mathfrak{K}) = C \cdot (P_1 - P_N)$.

Therefore, in $S\mathfrak{P}'$, $\delta(\mathfrak{K}^*) = C \cdot (P_1^* - P_N^*)$ (these cells being oriented suitably). So

C = sum of currents around $F(P_1^*)$ in $\mathfrak{K}^*$
  = sum of currents along a path joining the end-points of $e_0^*$
  = total P.D. between the poles of $\mathfrak{P}'$, in the flow $\mathfrak{K}^*$.

Thus

(3.32) $C = k \cdot V'$.

Similarly,

(3.33) $C' = (1/k) \cdot V$.

Now, by (2.35), the complexity of $\mathfrak{C}$ is $C + V$. Similarly, the complexity of $\mathfrak{C}^*$ is $C' + V' = (1/k) \cdot (C + V)$, by (3.32), (3.33). But by (3.25) these complexities are equal. Hence k = 1, and $\mathfrak{K}^*$ is the full K-chain on $\mathfrak{P}'$.

4. The correspondence between p-nets and squared rectangles

4.1. We now sketch a proof showing that to each p-net corresponds a squared rectangle. This correspondence is many-one and is clarified by introducing the "normal form" of a p-net (§4.2). We can then set up a 1-1 correspondence between classes of p-nets (having the same normal form) and "oriented" squared rectangles, and can prove that p-dual p-nets correspond to "conjugate" squared rectangles. (Cf. §1.1.)

(4.10) LEMMA. For a K-chain in a p-net $\mathfrak{P}$, whose poles are $P_1$, $P_N$ (suitably numbered),

(4.11) the potential of each vertex lies between the potentials of the poles;

(4.12) no currents go into $P_1$, or out of $P_N$;

(4.13) at a vertex $P_i$, there is an angle (in the plane) containing all ingoing currents, whose reflex contains all outgoing currents;

(4.14) on the boundary of a 2-cell of $S\mathfrak{P}$, there are two vertices $P_i$, $P_j$ such that no current round this boundary goes from $P_j$ towards $P_i$.

(We make the convention that zero currents do not go in or out.)

Proof. Let $P_i$ be any vertex, and suppose a current goes into $P_i$. Then a current goes out of $P_i$ along at least one wire, ending at $P_j$, say; and so on, until we reach a pole $P_N$ (say). All this time the potential has been falling, so $P_N$ is eventually reached; and the potential of $P_i$ is thus not less than that of $P_N$. If all the currents at $P_i$ are zero, we can connect $P_i$ to a vertex $P_k$ at which not all currents are zero, by a path of zero currents; and $P_i$, $P_k$ have the same potential. Thus in all cases the potential of $P_i$ is not less than that of $P_N$; and similarly it is not greater than that of $P_1$. This proves (4.11).

(4.12) follows at once from (4.11).

(4.13) has been proved for the poles; so let $i \ne 1, N$, and suppose that two outgoing currents at $P_i$ separate (in the plane) two ingoing ones. As in the proof of (4.11), we can continue each of the first two wires into a path down to $P_N$, along which the current falls; and similarly we can extend the other two wires into paths of rising potential up to $P_1$. Hence one of the two former paths must intersect one of the latter again, say in $P_j$ $(i \ne j)$. The potential of $P_j$ is both less than and greater than the potential of $P_i$. This is a contradiction, and so (4.13) is proved.

(4.14) follows from (4.13) and (4.12) by dualizing, if we use (3.31).

4.2. Normal form of a p-net. Let $\mathfrak{P}$ be a p-net imbedded in the open plane in such a way that its poles, $P_1$, $P_N$, can be joined in the "outside region" of $S\mathfrak{P}$. (That is, $\mathfrak{P}$ is first imbedded in the closed 2-sphere, an extra join $e_0$ of the poles is inserted, and the "point at infinity" is then taken to be in the 2-cell of $S\mathfrak{P}$ which contains $e_0$.) We define the normal form of $\mathfrak{P}$, as so placed in the plane, as follows:

Consider any (not identically zero) K-chain $\mathfrak{K}$ on $\mathfrak{P}$. Some currents may be zero; delete the corresponding wires, and delete all vertices at which all currents are zero. Since C > 0, we are left with a p-net still, having $P_1$, $P_N$ as poles. Using (2.31), (2.37), (2.36) (with c = 1), we see that $\mathfrak{K}$ is a K-chain for the new p-net
the p-dual net $\mathfrak{P}'$. (By (3.31).) Let $\mathfrak{P}$ have complexity C, and let the P.D. between its poles be V (= $V(xy)$). Thus ((3.32), (3.33)) the analogous numbers for $\mathfrak{P}'$ are V and C respectively. We can take the lowest potentials in $\mathfrak{P}$ and $\mathfrak{P}'$ to be zero. Suppose a wire e in $\mathfrak{P}$ has its end-points at potentials $V_1$, $V_2$, and its dual e* has its end-points at potentials $V_1'$, $V_2'$. If $\mu$ is a number such that $V_1 < \mu < V_2$, we say that e comprises ( , $\mu$); and if $\lambda$ is such that $V_1' < \lambda < V_2'$, then e comprises ($\lambda$, ). If both relations are true, we say that e comprises ($\lambda$, $\mu$).

Now, observing that $V_2 - V_1$ = current in e = current in e* = $V_2' - V_1'$, we construct a squared rectangle R as follows: In a rectangle of height V and base C, we take, for each wire e of $\mathfrak{P}$, the (closed) square E whose horizontal sides are at a height $V_1$, $V_2$ above the base (x-axis) and whose vertical sides are at a distance $V_1'$, $V_2'$ to the right of the left-hand vertical side (y-axis). If the current in e is zero, this square reduces to a single point, and is omitted.

Let $\lambda \ne$ any potential of a vertex of $\mathfrak{P}'$, and $\mu \ne$ any potential of a vertex of $\mathfrak{P}$. Then, if $0 < \lambda < C$, and $0 < \mu < V$, we have the following:

The wires (of $\mathfrak{P}$) comprising ($\lambda$, ) form a single path from pole to pole, along which the direction of the current is constant.

For, by (4.12) and duality, there is just one such wire terminating at each pole; and from (4.14), if one such wire carries current to a vertex, then just one such wire carries current from that vertex, and no more such wires terminate at that vertex.

Along this path, the potential increases steadily from pole to pole; also, by choice of $\lambda$, the currents along the path are non-zero. Hence just one wire in it comprises ( , $\mu$). So just one wire of $\mathfrak{P}$ comprises ($\lambda$, $\mu$). Thus the point of coordinates ($\lambda$, $\mu$) belongs to just one of the squares E. It follows that the whole rectangle is filled completely and without overlap (except of boundaries of squares).

It is easy to see that the normal p-net of the squared rectangle so constructed is, to within reflection in the axes (which we always disregard), the normal form of $\mathfrak{P}$. Also, it is clear from the construction that the squared rectangle assigned to $\mathfrak{P}$ differs from that assigned to $\mathfrak{P}'$ only by interchange of horizontal and vertical; i.e., the two squared rectangles are conjugate. In this way, we have a 1-1 correspondence between classes of p-nets in the plane having the same normal form, and "oriented" squared rectangles.

DEFINITIONS. As suggested by (4.31), the complexity of a p-net is called its (full) horizontal side (often written H instead of C); and the full P.D. between its poles is its vertical side (V). The "full elements" and "full sides" of a squared rectangle refer to those of its normal p-net. The "reduced elements" will be the same for all corresponding p-nets.

4.4. Defining a cross as a point of a squared rectangle which is common to four elements, and an "uncrossed" squared rectangle as one which has no crosses, we have:

The normal p-nets of uncrossed conjugate squared rectangles are p-duals.

For let $\mathfrak{P}$ be the normal p-net of the squared rectangle R; and let $\mathfrak{P}'$ be the p-dual of $\mathfrak{P}$. Let $\mathfrak{Q}$ be the normal p-net of the conjugate R' of R; thus, from §4.3, $\mathfrak{Q}$ is the normal form of $\mathfrak{P}'$. Now, in deriving the normal form of $\mathfrak{P}'$ (as in §4.2) there are no zero currents to suppress; and there are no identifications of vertices possible, as otherwise R', and hence R, would have a cross. So $\mathfrak{P}' = \mathfrak{Q}$. That is, $\mathfrak{P}$ and $\mathfrak{Q}$ are p-duals. (This result could be extended to crossed squared rectangles by making a suitable convention modifying the normal p-net when crosses are present; e.g., by regarding a cross as an "element of side zero".)

5. Enumeration of squared rectangles

5.1. Computation. To find all the squared rectangles of a given order n, we have only to make a list of all p-nets having n wires. There is no difficulty in this, if n is not too large. We can save some labor by noting that p-dual nets give essentially the same rectangles; also we can assume that no part of a net, not containing a pole, is joined to the rest only at one vertex. (For the currents in this part would all be zero, whereas we can restrict ourselves to "normal forms".)

A convenient way of carrying out the calculations is to consider the c-nets. From each net of n + 1 wires, we remove one wire and take its end-points as poles in the remaining net (if it is a net; i.e., is connected). Dual c-nets give rise to pairs of polar dual p-nets; so we need consider only half the c-nets. The working can be simplified by a proper use of §2.2. In practice, the Kirchhoff equations are best solved directly (without using determinants); a single determinant then gives the full elements for all the p-nets derived from one c-net.

It follows from §2.3 that all p-nets derived as above from the same c-net will have the same (full) semiperimeter, viz., the horizontal side of the c-net; and that two p-nets which differ only in the choice of poles, and their (non-polar) duals, all have the same (full) horizontal sides, viz., the complexity of the nets. (By (3.25).) Thus a number which appears in the (n + 1)-th order as a side appears (several times) in the n-th order as a semiperimeter. These facts are illustrated in the table below (§5.3).

5.2. The perfect rectangles of least order.

(5.21) "Simple" perfect rectangles. A squared rectangle which contains a smaller squared rectangle (and any p-net corresponding to it) is called compound; all other squared rectangles and p-nets are simple. A p-net $\mathfrak{P}$, without zero currents, which has a part $\mathfrak{Q}$ such that $\mathfrak{Q}$ contains more than one wire, $\mathfrak{Q} \ne \mathfrak{P}$, is joined to the rest at only two vertices $Q_1$, $Q_2$, and contains no pole (except perhaps for $Q_1$ or $Q_2$) is compound. For $\mathfrak{Q}$ must be connected; and the squared rectangle corresponding to $\mathfrak{P}$ will contain the smaller squared rectangle which corresponds to $\mathfrak{Q}$ (with $Q_1$, $Q_2$ as poles).

(5.22) "Trivial" imperfection. If a p-net has two equal non-zero currents, it is imperfect, and these currents constitute an "imperfection". (This is equivalent to saying that the corresponding squared rectangle is not perfect.) If a p-net has a part, not containing a pole, joined to the rest by only two wires, or if it has a pair of vertices joined by two (or more) wires, these two wires will clearly have equal currents. If these currents are non-zero, the resulting imperfection is said to be trivial. A p-net which has a non-trivial imperfection is called non-trivially imperfect. A non-trivially imperfect p-net may or may not have a trivial imperfection. We now have the theorem:

(5.23) The c-net derived from a simple perfect rectangle has no part (consisting of more than one wire and of less than all but one wire) joined to the rest at less than three vertices; and the same is true of its dual.

For the normal p-net of the simple perfect squared rectangle (or of the conjugate squared rectangle) will otherwise have a zero current, or a trivial imperfection, or be compound.

A perfect rectangle of the smallest possible order must evidently be simple. Applying (5.23) to the method of §5.1, we readily find that

There are no perfect rectangles of order less than 9, and exactly two perfect rectangles of order 9.

Of the latter, one is well known;^12 the other is, we believe, new and has been drawn in Figure 1. Below, we give a list of the simple perfect rectangles of orders 9-11. The compound perfect rectangles of these orders follow trivially.

5.3. Table of simple perfect rectangles.

Order  Full sides  Semiperimeter  Reduction  Description of polar net (current from P_a to P_b = ab)
9      66, 64      130            2          ab = 30, ac = 36, bd = 14, cd = 8, be = 16, de = 2, ef = 18, df = 20, cf = 28.
9      69, 61      130            1          ac = 25, ab = 16, ae = 28, bc = 9, bd = 7, dc = 2, de = 5, cf = 36, ef = 33.
10     114, 110    224            2          ab = 60, ac = 54, cb = 6, ce = 22, cd = 26, be = 16, ed = 4, bf = 50, ef = 34, df = 30.
10     130, 94     224            2          ab = 44, ac = 38, ae = 48, cb = 6, ce = 10, cd = 22, ed = 12, bf = 50, df = 34, ef = 46.
10     104, 105    209            1          ab = 60, ac = 44, cb = 16, cd = 28, bd = 12, be = 19, de = 7, bf = 45, ef = 26, df = 33.
10     111, 98     209            1          ab = 44, ad = 26, ae = 41, dc = 11, de = 15, ce = 4, cb = 7, eb = 3, bf = 54, ef = 57.
10     115, 94     209            1          ab = 34, ac = 19, ad = 23, ae = 39, cb = 15, cd = 4, de = 16, db = 11, bf = 60, ef = 55.
10     130, 79     209            1          ab = 34, ac = 23, ad = 35, ae = 38, cb = 11, cd = 12, de = 3, bf = 45, df = 44, ef = 41.

^12 First found, apparently, by Moroń [11]. See also [10], p. 272; [2], p. 93; [14], p. 8; and [4].
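The entries of the table can be checked against Kirchhoff's laws (1.21)-(1.23). The sketch below is an illustration, not part of the original paper: it transcribes the eight descriptions, reading the second order-9 entry with dc = 2 (an assumption made here, since the scanned text prints "de" twice and only this reading balances the currents at c), and tests both the current law at each vertex and the consistency of the potential drops. The names `TABLE`, `parse` and `satisfies_kirchhoff` are hypothetical.

```python
TABLE = [
    # (horizontal side, vertical side, description of the polar net with poles a and f)
    (66, 64, "ab=30 ac=36 bd=14 cd=8 be=16 de=2 ef=18 df=20 cf=28"),
    (69, 61, "ac=25 ab=16 ae=28 bc=9 bd=7 dc=2 de=5 cf=36 ef=33"),
    (114, 110, "ab=60 ac=54 cb=6 ce=22 cd=26 be=16 ed=4 bf=50 ef=34 df=30"),
    (130, 94, "ab=44 ac=38 ae=48 cb=6 ce=10 cd=22 ed=12 bf=50 df=34 ef=46"),
    (104, 105, "ab=60 ac=44 cb=16 cd=28 bd=12 be=19 de=7 bf=45 ef=26 df=33"),
    (111, 98, "ab=44 ad=26 ae=41 dc=11 de=15 ce=4 cb=7 eb=3 bf=54 ef=57"),
    (115, 94, "ab=34 ac=19 ad=23 ae=39 cb=15 cd=4 de=16 db=11 bf=60 ef=55"),
    (130, 79, "ab=34 ac=23 ad=35 ae=38 cb=11 cd=12 de=3 bf=45 df=44 ef=41"),
]

def parse(description):
    """Turn 'ab=30 ...' into wires (tail, head, current)."""
    return [(w[0], w[1], int(w[3:])) for w in description.split()]

def satisfies_kirchhoff(h, v, description):
    wires = parse(description)
    # First law (1.21), (1.23): net current vanishes except at the poles a and f.
    net = {}
    for tail, head, cur in wires:
        net[tail] = net.get(tail, 0) - cur
        net[head] = net.get(head, 0) + cur
    if net.pop("a") != -h or net.pop("f") != h or any(net.values()):
        return False
    # Second law (1.22): with unit conductances each current is a potential drop.
    depth = {"a": 0}
    pending = wires
    while pending:
        rest = []
        for tail, head, cur in pending:
            if tail in depth:
                if depth.setdefault(head, depth[tail] + cur) != depth[tail] + cur:
                    return False
            elif head in depth:
                depth[tail] = depth[head] - cur
            else:
                rest.append((tail, head, cur))
        if len(rest) == len(pending):
            return False  # some wire is not connected to the pole a
        pending = rest
    return depth["f"] == v

if __name__ == "__main__":
    for h, v, description in TABLE:
        print(h, v, h + v, satisfies_kirchhoff(h, v, description))
```

Replacing dc = 2 by de = 2 in the second entry makes the check fail at vertex c, which is the reason for the reading adopted above.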

The full sides and semiperimeters of the simple perfect rectangles of the 11-th order are:

Order  Semiperimeter  Sides
11     336            127, 209; 151, 185
11     353            144, 209; 159, 194; 162, 191; 166, 187; 168, 185; 176, 177
11     368            159, 209; 169, 199; 172, 196; 177, 191; 183, 185
11     377            168, 209; 178, 199; 183, 194
11     386            162, 224; 177, 209; 181, 205; 190, 196; 191, 195; 192, 194
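As a quick arithmetic check on the order-11 list (an illustrative sketch, with the pairs transcribed here by matching each pair of sides to the semiperimeter it sums to), the code below confirms that every pair adds up to its semiperimeter and picks out the pairs whose sides are both even.

```python
ORDER_11 = {
    336: [(127, 209), (151, 185)],
    353: [(144, 209), (159, 194), (162, 191), (166, 187), (168, 185), (176, 177)],
    368: [(159, 209), (169, 199), (172, 196), (177, 191), (183, 185)],
    377: [(168, 209), (178, 199), (183, 194)],
    386: [(162, 224), (177, 209), (181, 205), (190, 196), (191, 195), (192, 194)],
}

# Flatten to (semiperimeter, side, side) triples.
pairs = [(semi, a, b) for semi, row in ORDER_11.items() for a, b in row]
consistent = all(a + b == semi for semi, a, b in pairs)
both_even = [(a, b) for _, a, b in pairs if a % 2 == 0 and b % 2 == 0]

if __name__ == "__main__":
    print(len(pairs), consistent)
    print(both_even)
```

The pairs with both sides even are the candidates for reduction 2 mentioned in the text that follows.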

Four of these are reducible, with reduction = 2; these are the rectangles whose sides are both even. Of the 67 simple perfect rectangles of the 12-th order, eleven have reduction 2, eight have reduction 3, and one has reduction 4.

6. Theorems on reduction

In perfect rectangles of higher orders, much larger reductions occur; for example, a 19-th order rectangle with reduced sides 144 and 155 has $\rho = 80$. Its reduced elements are: ab = 46, ad = 40, af = 28, ag = 41, bc = 10, bi = 36, ci = 26, dc = 16, de = 3, dh = 21, eh = 18, fe = 15, fg = 13, gk = 54, hl = 39, ij = 62, kj = 49, kl = 5, lj = 44.

6.1. The following theorems on reduction are of interest.

(6.11) THEOREM. If one of the currents in a p-net is zero, the net is reducible.

Let the poles be $P_r$, $P_s$, and the zero current be in a wire joining $P_x$, $P_y$. Then the transpedance $[rs, xy]$ is zero. On removing the wire in question (use (2.37) with c = -1, and (2.33)), the new value for $[rs, tu]$ is

$$[rs, tu]' = [rs, tu] - \frac{[rs, tu]\cdot V(xy) - [rs, xy]\cdot[tu, xy]}{C} = [rs, tu]\cdot\frac{C - V(xy)}{C}.$$

Now, $C > C - V(xy) = C'$ (by (2.35)) $> 0$ (by (3.14)). Hence the H.C.F. of the $[rs, tu]$'s must be at least $C/(C - V(xy)) > 1$.

DEFINITION. Let a positive integer $n = m\cdot k^2$, where m is square-free. Then k is called the lower square root of n, and mk is the upper square root.

(6.12) THEOREM. Let the full sides of a p-net be H, V. Then the reduction $\rho$ is a multiple of the upper square root of the H.C.F. of H and V.

By (2.34), remembering that V 2 (rs) is an integer, we have

$$C \text{ divides } V(rs)\cdot V(xy) - [rs, xy]^2.$$

Since C = H, and V(rs) = V (taking $P_r$, $P_s$ as poles), it follows that the H.C.F. of H, V divides $[rs, xy]^2$; whence the result.
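The lower and upper square roots are easy to compute. The sketch below is illustrative only (the function name `square_roots` is an assumption, not the paper's notation): it strips square factors from n by trial division, leaving n = m·k² with m square-free.

```python
def square_roots(n):
    """Return (lower, upper) square roots of n = m * k**2, m square-free."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    k = 1   # accumulates the lower square root
    m = n   # becomes the square-free part
    d = 2
    while d * d <= m:
        # Remove every factor d**2; each removal contributes d to k.
        while m % (d * d) == 0:
            m //= d * d
            k *= d
        d += 1
    return k, m * k


print(square_roots(12))   # 12 = 3 * 2**2
print(square_roots(72))   # 72 = 2 * 6**2
```

For instance, the rectangle 96 × 99 of §7.1 has reduction 3, hence full sides 288 and 297 with H.C.F. 9. The upper square root of 9 is 3, which agrees with (6.12).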


R. L. BROOKS, C. A. B. SMITH, A. H. STONE AND W. T. TUTTE

(6.13) COROLLARY. If the reduced sides of a squared rectangle have H.C.F. $\alpha$, then the reduction of any corresponding p-net is divisible by $\alpha$.

(For, by (6.12), $\alpha$ is a factor of the lower square root of the H.C.F. of the full sides and hence, since the lower square root divides the upper, of $\rho$.) (An example is the rectangle 96 × 99 given in §7.1.)

(6.14) COROLLARY. Any p-net of a squared square has for reduction a multiple of its reduced side.

(6.15) COROLLARY. A necessary and sufficient condition that a p-net be irreducible is that its two full sides be coprime.

(6.16) THEOREM. All non-trivially imperfect p-nets are reducible.

(6.17) LEMMA. If H, V, k are positive integers such that, for each positive integer n, $(H + nV, k) > 1$, then H, V, k all have a common factor greater than 1.

Proof. Let $N_0$ be the product of all the primes which divide k but not H. (Empty product = 1.) Let $p_0$ be a prime factor of $(H + N_0V, k)$. Suppose $p_0 \nmid H$. Then $p_0 \mid N_0$. Hence, since $p_0 \mid (H + N_0V)$, we have $p_0 \mid H$, and this is a contradiction. So $p_0 \mid H$. Therefore $p_0$ divides $N_0V$ but not $N_0$; so that $p_0$ divides V as well as H and k.

Proof of (6.16). Now let $\mathcal{P}$ be a p-net with full sides H (= C) and V (= V(1N)); and let a non-trivial imperfection be $[1N, ab] = [1N, pq] = k$, say. Thus k > 0, and we do not have both a = p and b = q. (Else the imperfection is trivial.) Join $P_1$, $P_N$ to produce the completed net $\mathcal{C}$. Let $\mathcal{Q}$ be the p-net formed from $\mathcal{C}$ by taking $P_a$, $P_b$ as poles, and omitting one wire joining $P_a$, $P_b$. (Of course, there is such a wire; there may be several. It is easy to see, from considerations of "triviality", that $\mathcal{Q}$ is connected, and therefore a p-net.) Applying (2.33) to $\mathcal{C}$, and using (2.35), (2.36), (2.37), we have

(6.18) $(H + V) \mid k\cdot(V(ab) - [ab, pq])$,

where $V(ab)$, $[ab, pq]$ refer to the p-net formed by $\mathcal{C}$ with $P_a$, $P_b$ as poles, and hence (by (2.36)) refer equally well to $\mathcal{Q}$. Now, we have $0 < V(ab) - [ab, pq] \le$ semiperimeter of $\mathcal{Q}$, with equality only if the current $[ab, qp]$ equals the total current of $\mathcal{Q}$. In this case, $\mathcal{C}$ must consist of two parts, joined only by the two wires $P_aP_b$ and $P_pP_q$. Further, $P_1$, $P_N$, being joined in $\mathcal{C}$ by a wire not $P_aP_b$ or $P_pP_q$, must lie in the same part. Hence the imperfection in $\mathcal{P}$ with which we started was trivial. Hence (6.18) gives (since semiperimeter of $\mathcal{Q}$ = complexity of $\mathcal{C}$ = H + V)

(6.19) $(H + V, k) > 1$.

Now let n be any positive integer. Join $P_1$, $P_N$ by n - 1 extra wires (of unit conductance). The new p-net will have the same non-trivial imperfection (by (2.36)), so, applying (6.19) to the new net, and using (2.35), (2.32) repeatedly, we have $(H + nV, k) > 1$.

The lemma (6.17) now shows that $(H, V) > 1$. Hence, by (6.15), $\mathcal{P}$ is reducible.
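Lemma (6.17) lends itself to a brute-force check on small integers. The sketch below is illustrative (all function names are assumptions): it tests the hypothesis for n = 1, ..., k, which suffices because $(H + nV, k)$ depends only on n modulo k, and it also checks the step in the proof that every prime factor of $(H + N_0V, k)$ divides H, V and k.

```python
from math import gcd


def prime_factors(n):
    """Return the set of primes dividing the positive integer n."""
    primes, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            primes.add(p)
            n //= p
        p += 1
    if n > 1:
        primes.add(n)
    return primes


def hypothesis_holds(H, V, k):
    """True if gcd(H + n*V, k) > 1 for every positive integer n."""
    return all(gcd(H + n * V, k) > 1 for n in range(1, k + 1))


def witness(H, k):
    """N_0 of the proof: the product of the primes dividing k but not H."""
    n0 = 1
    for p in prime_factors(k):
        if H % p != 0:
            n0 *= p
    return n0


def check_lemma(limit):
    """Check the lemma for all 1 <= H, V, k <= limit; return the number of hits."""
    hits = 0
    for H in range(1, limit + 1):
        for V in range(1, limit + 1):
            for k in range(1, limit + 1):
                common = gcd(gcd(H, V), k)
                if hypothesis_holds(H, V, k):
                    assert common > 1
                    hits += 1
                # Proof step: primes of gcd(H + N_0*V, k) divide H, V and k.
                g = gcd(H + witness(H, k) * V, k)
                assert all(common % p == 0 for p in prime_factors(g))
    return hits


print(check_lemma(12))
```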

(6.20) COROLLARY. All irreducible p-nets having no trivial imperfections give perfect squared rectangles.

(6.21) COROLLARY. If the complexity of a c-net is prime, all the squared rectangles derived from it (as in §5.1) will be perfect.

These results are sometimes useful as tests for perfection. For the reduced elements, we can prove (using the Euler polyhedron formula, and some consideration of the various cases)

(6.22) At least three of the reduced elements of any perfect rectangle are even. (Three is the best number possible.)

7. Construction of some special squared rectangles

7.1. Conformal rectangles. Two squared rectangles (or p-nets in this plane) which have the same shape (that is, have proportional sides) but are not merely rigid displacements of each other (in the case of p-nets, have not the same normal form) are called conformal. An example of a conformal pair is provided by the 9-th order rectangle 64 × 66 and a 12-th order rectangle of reduced sides 96, 99, whose (reduced) net is specified by: ga = 31, ge = 21, gc = 44, ea = 10, ed = 11, ad = 1, dc = 12, ac = 13, ab = 27, cb = 14, bf = 41, cf = 55.

Two conformal rectangles need not have the same full sides or reduction; for example, the rectangle 96 × 99 has reduction 3 (cf. (6.21)).

We now show how to construct conformal pairs having the same reduced elements (but differently arranged). Suppose that a p-net $\mathcal{P}$ has a part $\mathcal{Q}$ joined to the rest only at vertices $A_1, \dots, A_m$, say, and containing no pole different from an $A_i$. If $\mathcal{Q}$ has rotational symmetry about a vertex P, in which the A's are a set of corresponding points, then a simple symmetry argument shows that the potential of P (in $\mathcal{P}$) will be the mean of the potentials of $A_1, \dots, A_m$. Hence if this is also true for another vertex P', P and P' will have equal potentials. Coalesce P and P', forming (if this can be done in the plane) the p-net $\mathcal{P}_2$. If C is the complexity of $\mathcal{P}$, we see from (2.33) that, if $[ab, 1N]$ and $[ab, 1N]_2$ are corresponding elements in $\mathcal{P}$ and $\mathcal{P}_2$ (with 1, N referring to the poles),

$$[ab, 1N]_2 = \frac{V(PP')}{C}\,[ab, 1N].$$

Hence the elements of $\mathcal{P}_2$ are proportional to those of $\mathcal{P}$; $\mathcal{P}$ and $\mathcal{P}_2$ have the same reduced elements and sides. Their reductions are clearly in the ratio $C : V(PP')$. This construction enables conformal p-nets with the same elements to be written down. A simple example is shown in Figure 2. Here $A_1$ and $A_2$ are poles. The rectangles are perfect and simple, and have reductions 5 and 6, and reduced sides 75 and 112.
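A net specification of this kind can be checked against Kirchhoff's laws. The sketch below is illustrative only (`WIRES` and `check_net` are assumed names). It treats each entry such as ga = 31 as a unit-resistance wire carrying current 31 from g to a, takes cf = 55 (the value that current conservation at c requires), and recovers the sides of the 12-th order rectangle of §7.1 as the total current and the total potential drop.

```python
# Reduced net of the 12-th order rectangle of section 7.1; poles g and f.
WIRES = {
    ("g", "a"): 31, ("g", "e"): 21, ("g", "c"): 44, ("e", "a"): 10,
    ("e", "d"): 11, ("a", "d"): 1, ("d", "c"): 12, ("a", "c"): 13,
    ("a", "b"): 27, ("c", "b"): 14, ("b", "f"): 41, ("c", "f"): 55,
}


def check_net(wires, source, sink):
    """Verify both Kirchhoff laws; return (total current, total potential drop)."""
    # First law: net outflow vanishes at every node except the poles.
    net = {}
    for (u, v), current in wires.items():
        net[u] = net.get(u, 0) + current
        net[v] = net.get(v, 0) - current
    total = net[source]
    assert net[sink] == -total
    assert all(x == 0 for node, x in net.items() if node not in (source, sink))

    # Second law: with unit resistances the drop along a wire equals its
    # current, so the currents must define consistent potentials.
    drop = {source: 0}
    pending = dict(wires)
    while pending:
        progressed = False
        for (u, v), current in list(pending.items()):
            if u in drop and v in drop:
                assert drop[v] - drop[u] == current
            elif u in drop:
                drop[v] = drop[u] + current
            elif v in drop:
                drop[u] = drop[v] - current
            else:
                continue
            del pending[(u, v)]
            progressed = True
        assert progressed, "net is not connected"
    return total, drop[sink]


sides = check_net(WIRES, "g", "f")
perfect = len(set(WIRES.values())) == len(WIRES)
print(sides, perfect)
```

The same function can be applied to the other specifications in this section by changing the wire table and the two poles.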


FIG. 2

In a more complicated example, illustrating a variation on the method, we make the potentials of three points $P_1$, $P_2$, $P_3$ equal. Although the network we start with is not planar, it becomes so when either $P_1P_2$ or $P_1P_3$ coincide. Such a network is specified below. It gives conformal simple perfect rectangles of the 28-th order, with reductions 96 and 120, reduced sides 6384 and 14065, and reduced elements: $A_1a$ = 3288, $A_1P_1$ = 3480, $A_1b$ = 2512, $A_1d$ = 2247, $A_1i$ = 2538, $aP_3$ = 192, $aA_3$ = 3096, $bP_3$ = 968, $bA_2$ = 1544, $P_1A_2$ = 576, $P_1A_3$ = 2904, $P_3c$ = 1160, $A_2c$ = 584, $cA_3$ = 1744, $de$ = 1014, $dP_2$ = 1233, $eA_2$ = 795, $eP_2$ = 219, $iP_2$ = 942, $ih$ = 1596, $P_2h$ = 654, $P_2j$ = 579, $P_2g$ = 1161, $hA_3$ = 2250, $A_2j$ = 3, $jg$ = 582, $gA_3$ = 1743, $A_2A_3$ = 2328. (The poles are $A_1$ and $A_3$.)

These examples show that, even when the sides and elements of a simple perfect rectangle are given, the configuration is far from uniquely determined.

We now turn to the opposite problem of constructing conformal pairs of squared rectangles having different sets of elements. Again, symmetry considerations enable us to do this. We are led to pairs of rectangles (and p-nets) which are not merely conformal but have the same full sides. Such pairs are said to be equivalent.

7.2. Symmetry method. Let a p-net $\mathcal{P}$ have a part $\mathcal{Q}$ joined to the rest only at vertices $A_1, \dots, A_m$, and containing no pole different from an $A_i$. Suppose that $\mathcal{Q}$ has rotational symmetry in which the A's are a set of corresponding points, and that $\mathcal{Q}$ is not identical with its mirror-image. $\mathcal{Q}$ is the rotor, and the wires of $\mathcal{P} - \mathcal{Q}$ form the stator. In $\mathcal{P}$, replace $\mathcal{Q}$ by its mirror-image. It is easy to see that the full currents in the stator will be entirely unaffected, though (in general) the rotor currents will change. (This can be proved, e.g., by induction over the number of wires in the stator, if we use §2.) So we have, in general, a pair of equivalent rectangles, with different (though overlapping) sets of elements. One of the simplest examples of this method is shown in Figure 3. This gives equivalent simple perfect rectangles of order 16, reduction 5, and reduced sides 671 and 504.¹³

FIG. 3

We may generalize this method by noting that it remains effective when some of the A's are made to coincide (corresponding to the introduction of "wires of infinite conductance" in the stator). Or, again, we may take the stator to be itself a rotor, with $A_1, \dots, A_m$ as its set of corresponding points (with possible coincidences). By reflecting both parts we can get pairs of equivalent rectangles having no elements in common.

7.3. Special methods. The preceding methods (and similar ones based on duality instead of symmetry) are useful for existence theorems, as in the next section; but other devices are more suitable for producing equivalent rectangles of small orders. If, in a c-net $\mathcal{C}$, we can find two wires whose end-points, say $P_a$, $P_b$ and $P_x$, $P_y$, respectively, satisfy

(7.31) $V(ab) = V(xy)$ (in $\mathcal{C}$),

13 The rotor of Figure 3 has a remarkable property. If currents II, I., I, (summing to zero) enter the rotor (considered as a net) at AI, A., A 3, then the currents in B,G I , BIG" B.G, will be IJ/7, 1,/7, 1,/7, respectively. This explains the "extra" equalities of the currents in Figure 3. Other rotors of 15 wires (having the same type of symmetry) behave in a similar way. This phenomenon is not yet fully explained.

then the corresponding p-nets (obtained from $\mathcal{C}$ by omitting each of the two wires in turn, and taking its ends as poles) will be equivalent, if not identical. For they have the same semiperimeter in any case, viz., the complexity of $\mathcal{C}$. By using the properties of symmetrical or self-dual networks, we can often demonstrate an equality like (7.31). For example, in Figure 4, it is clear that

(7.32) $V(gh) = V(cb)$

and

(7.33) $[da, gh] = 0$.

Hence (by (7.33) and (2.23) and symmetry)

(7.34) $[de, gh] = [ae, gh] = [de, cb]$.

FIG. 4

Now, (7.32) and (7.34) imply (if we use (2.37) and (2.33)) that the impedances of gh and cb remain equal when we add a wire joining de. Hence this new c-net satisfies (7.31), and so we get a pair of equivalent squared rectangles of the 12-th order. These rectangles are perfect, and provide the simplest example of equivalence among perfect rectangles. They both have reduction 2 and reduced sides 142 and 162. Their (reduced) specifications are respectively:

gf = 57, gd = 85, dh = 77, de = 12, ad = 4, fe = 40, be = 13, eh = 65, ab = 3, ca = 7, cb = 10, fc = 17;

and

cf = 59, ca = 83, fe = 40, fg = 19, gh = 10, he = 11, gd = 9, dh = 1, ad = 4, de = 12, eb = 63, ab = 79.

8. Construction of perfect squares

8.1. Definition. Two conformal rectangles are said to be totally different if $C_2$ times an element of the first is never equal to $C_1$ times an element of the second, where $C_1$, $C_2$ are their respective (corresponding) horizontal sides. For equivalent rectangles this is equivalent to: No element of the first equals an element of the second.

A pair of totally different simple perfect squared rectangles gives us a perfect square at once; we have only to place them as in Figure 5, and add two corner squares. This idea, though often in modified form, underlies all the constructions for perfect squares in this paper.

(8.11) It is easy to show (by the use of determinants) that if H, V and H', V' are the full sides of the rectangles used in this construction, then the resulting square will have full side $(H + V)\cdot(H' + V')$. In particular, if the rectangles are equivalent, the full side of the square is the square of an integer.

FIG. 5

FIG. 6

FIG. 7

FIG. 8

8.2. Symmetry method. Equivalent perfect rectangles constructed as in §7.2 can be used to give us a perfect square. The stator is taken to be a single wire $A_iA_j$ (drawn outside the rotor), one of whose end-points is a pole. The equivalent rectangles so obtained will have, in general,¹⁴ just one element in common, the element corresponding to this stator. As this element is placed at a corner in both rectangles, we may "overlap" the rectangles as in Figure 6 to get a square. One of the simplest perfect squares formed in this way is based on the rotor and stator shown in Figure 7. The square is of the 39-th order.

14 The "exceptional case", in which two elements from the following set: the rotor, its reflection, and the stator-element, are equal, seems in practice to be rare. It does occur, however, if the rotor has trivial imperfections, or if it has too much symmetry, or if it has triad symmetry and only 15 wires (cf. the previous footnote).

(8.21) It can be shown that, if H, V are the full sides of the equivalent rectangles used in this construction (§8.2), and E is the common element, then the full side of the resulting squared square is $(H + V - E)^2$, the square of an integer. In the case of triad symmetry (m = 3 in §7.2), we can show that $E\cdot(2H + 2V - E) = HV$, so that the full side of the squared square is, in this case, $H^2 + HV + V^2$.
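The last step of (8.21) is a one-line expansion. As a worked check, using only the triad-symmetry identity $E\cdot(2H + 2V - E) = HV$ quoted in (8.21):

```latex
\begin{align*}
(H + V - E)^2 &= (H + V)^2 - 2E(H + V) + E^2 \\
              &= (H + V)^2 - E\,(2H + 2V - E) \\
              &= H^2 + 2HV + V^2 - HV \\
              &= H^2 + HV + V^2 .
\end{align*}
```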

8.3. Perfect squares of smaller orders. A perfect square of much smaller order is given by an elaboration of §7.3. We can show by an argument similar to that in §7.3, but longer, that in the net shown in Figure 8, $V(ef) = V(ge)$. (We use the facts that, if g and f are coalesced in Figure 8, the net becomes symmetrical and self-dual, and that Figure 8 results from Figure 4 by joining de.) Hence the two p-nets obtained by taking respectively e, f and g, e as poles in Figure 8 are equivalent (for their horizontal sides both equal the complexity). They are in fact perfect and totally different; and, though not both simple (the e, f one being obviously compound), the method of §8.1 is easily modified to give a perfect square, which is drawn in Figure 9. It is of the 26-th order. (The least possible order of a perfect square is unknown.) We have also constructed, in a similar way, two perfect squares of the 28-th order, each of full side $(1015)^2$ and reduced side 1015.¹⁵

8.4. Simple perfect squares. The perfect squares constructed so far have all been compound. By generalizing the method of §8.2 to certain "squared polygons", we can obtain "simple" perfect squares. First, let $\mathfrak{M}$ be a net with $A_1, \dots, A_m$ as the vertices of its "outside" polygon, in order. Consider an electric flow in $\mathfrak{M}$ in which all of $A_1, \dots, A_m$ are poles, i.e., in which currents $I_i$ (not all zero) enter $\mathfrak{M}$ at $A_i$ ($\sum I_i = 0$). Suppose that $I_i \ge 0$ if $i > 1$. (This could be weakened; but some restriction on the order of the ingoing and outgoing currents is necessary.) Then the flow in $\mathfrak{M}$ corresponds to a squared polygon, of angles $\tfrac12\pi$ and $\tfrac32\pi$.

Proof. We reduce the number of poles of $\mathfrak{M}$ as follows: Suppose $A_i$ is at potential $V_i$. Suppose there is more than one i for which $I_i > 0$; let 1', 2' be the least and second least such i's. If $V_{1'} = V_{2'}$, coalesce $A_{1'}$ and $A_{2'}$ (by joining them by a line outside the polygon $A_1 \dots A_m$ and shrinking the line to a point); and let current $I_{1'} + I_{2'}$ enter there, the other currents being as before. The currents in $\mathfrak{M}$ will be unaltered, and there is now one fewer positive current entering the network. If $V_{1'} \ne V_{2'}$, we can suppose $V_{1'} > V_{2'}$. Join $A_{1'}$, $A_{2'}$ by a wire of conductance $I_{2'}/(V_{1'} - V_{2'})$ (passing outside the polygon $A_1 \dots A_m$) and take currents $(I_{1'} + I_{2'})$ at $A_{1'}$, 0 at $A_{2'}$, and $I_i$ at $A_i$ for the other i's. Again, the currents in $\mathfrak{M}$ will be unaltered, and one fewer positive current enters the system. Repeating this process till there is only one positive external current left, we have the flow in $\mathfrak{M}$ "imbedded" in a flow with only two poles; in fact, in a p-net flow (except that some of the extra wires may have conductances different from 1). This corresponds to a "rectangled rectangle" R.

15 See [16].

Stripping off the elements of R which correspond to the extra wires, we are left with a squared polygon, corresponding to $\mathfrak{M}$. Since the currents $I_i$ are (apart from sign) at our disposal, the shape of the squared polygon can be controlled. (It has m - 2 degrees of freedom.)

Now take for $\mathfrak{M}$ a pure rotor, i.e., a network having skew symmetry; and suppose that the points $A_1, \dots, A_m$ are a set of corresponding points in $\mathfrak{M}$.

FIG. 9

If $\mathfrak{M}$ is replaced by its reflection (leaving the currents $I_i$ invariant), the new squared polygon will have the same shape as the old; in fact, the two squared polygons will be "equivalent". For, as in §7.2, the rectangled rectangle R will be replaced by an equivalent one, in which the "extra" elements are the same as before. By combining such a pair of equivalent polygons, as in Figure 10, and arranging their shape so that the overlapped portions coincide with elements (which are then removed), and inserting three extra squares (in the center and at the corners), we can obtain a "simple" perfect square. For instance, the rotor shown in Figure 11 gives rise to a simple "uncrossed" perfect square of order 55, which, when drawn out, disguises its symmetrical origin very skillfully.

FIG. 10

FIG. 11

9. Perfect subdivision of the general rectangle

9.1. We begin by proving:

(9.11) There exist infinitely many totally different perfect squares.

We construct such an aggregate of squares by the method of §8.2, taking for our equivalent rectangles those furnished by the "rotor-stator" diagram (cf. §7.2) of Figure 13. In this diagram, $A_1$, $A_2$ are the poles, and the wire $A_1A_3$ is the stator. The three "resistances" $A_1B_2$, etc., denote three copies of the p-net of some perfect rectangle. We shall select a sequence $\mathfrak{R}_n$ of suitable p-nets, and, for each $\mathfrak{R}_n$, form the corresponding square $\mathfrak{D}_n$. The sequence $\mathfrak{D}_n$ will then (as follows from (9.39)) have a subsequence of perfect squares, every two of which are totally different. This will prove (9.11).

9.2. The perfect rectangles $\mathfrak{R}_n$. Let $\mathfrak{R}_n$ be the p-net shown in Figure 12, with $P_0$, $Q_0$ as poles. Write $\varphi_r = [(2 + \sqrt3)^r - (2 - \sqrt3)^r]/2\sqrt3$. Thus

(9.21) $\varphi_0 = 0$; $\varphi_r$ is an integer; and $\varphi_{r+1} - 4\varphi_r + \varphi_{r-1} = 0$.

It will readily be verified that a solution of Kirchhoff's equations is given by:

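(9.22) Current in $P_0P_r$ (from $P_0$ to $P_r$) is $a_r$, where

$$a_r = \tfrac12\,[5\varphi_n + \varphi_{n-1} + 3\varphi_r - 3\varphi_{r-1}] \quad\text{if } 0 < r < n, \qquad a_{n+1} = 3\varphi_n .$$

Current in $P_rQ_0$ is $b_r$, where

$$b_r = \tfrac12\,[5\varphi_n + \varphi_{n-1} - 3\varphi_r + 3\varphi_{r-1}] \quad\text{if } 0 < r < n + 1, \qquad b_{n+1} = 2\varphi_n + \varphi_{n-1} .$$

Current in $P_rP_{r+1}$ is $c_r$, where

$$c_r = 3\varphi_r \quad\text{if } 0 < r < n, \qquad -c_n = \varphi_n - \varphi_{n-1} .$$

(This solution is in fact the full flow.)

FIG. 12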

Also the total current $p_n$ (the horizontal side of $\mathfrak{R}_n$), and the total P.D. $q_n$ (the vertical side) are given by:

(9.23) $p_n = \tfrac12\,[(5n + 1)\varphi_n + (n + 2)\varphi_{n-1}]$; $q_n = 5\varphi_n + \varphi_{n-1}$.

Now, if $n > 2$, we see that

$$0 < c_1 < c_2 < \dots < c_{n-2} < (-c_n) < c_{n-1} < b_n < b_{n+1} < b_{n-1} < b_{n-2} < \dots < b_1 < a_1 < a_2 < \dots < a_{n-1} < a_{n+1}.$$

Hence

(9.24) If $n > 2$, $\mathfrak{R}_n$ is perfect.

From (9.23), we have

(9.25) $q_n$ and $p_n/q_n \to \infty$ with n.

For later use, we note that

(9.26) $(p_n, q_n) \mid 9$.
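The flow (9.22) can be verified numerically for small n. The sketch below rests on assumptions that should be checked against Figure 12, which is not reproduced here: the wires are taken to be $P_0P_r$ for r = 1, ..., n-1 and r = n+1, $P_rQ_0$ for r = 1, ..., n+1, and $P_rP_{r+1}$ for r = 1, ..., n (the index ranges of (9.22)), and the total P.D. is taken as $q_n = 5\varphi_n + \varphi_{n-1}$. All function names are illustrative.

```python
from math import gcd


def phi(m):
    """Return [phi_0, ..., phi_m] with phi_{r+1} = 4*phi_r - phi_{r-1}."""
    seq = [0, 1]
    while len(seq) <= m:
        seq.append(4 * seq[-1] - seq[-2])
    return seq[: m + 1]


def flow(n):
    """Currents a, b, c of (9.22) and the sides p, q of (9.23)."""
    f = phi(n)
    q = 5 * f[n] + f[n - 1]
    a = {r: (q + 3 * (f[r] - f[r - 1])) // 2 for r in range(1, n)}
    a[n + 1] = 3 * f[n]
    b = {r: (q - 3 * (f[r] - f[r - 1])) // 2 for r in range(1, n + 1)}
    b[n + 1] = 2 * f[n] + f[n - 1]
    c = {r: 3 * f[r] for r in range(1, n)}
    c[n] = -(f[n] - f[n - 1])
    p = ((5 * n + 1) * f[n] + (n + 2) * f[n - 1]) // 2
    return a, b, c, p, q


def check(n):
    """Verify Kirchhoff's laws for the assumed net; return (p, q, perfect)."""
    a, b, c, p, q = flow(n)
    # First law at P_1, ..., P_{n+1}.
    for r in range(1, n + 2):
        inflow = a.get(r, 0) + (c[r - 1] if r >= 2 else 0)
        outflow = b[r] + (c[r] if r <= n else 0)
        assert inflow == outflow
    assert sum(a.values()) == p == sum(b.values())
    # Second law: potentials measured as drops from P_0.
    pot = dict(a)
    pot[n] = pot[n - 1] + c[n - 1]
    for r in range(1, n + 1):
        assert pot[r + 1] - pot[r] == c[r]
    assert all(pot[r] + b[r] == q for r in b)
    sizes = list(a.values()) + list(b.values()) + [abs(x) for x in c.values()]
    return p, q, len(set(sizes)) == len(sizes)


for n in range(2, 8):
    p, q, perfect = check(n)
    print(n, p, q, perfect, gcd(p, q))
```

For n = 3 this gives the 130 × 79 rectangle with ten distinct elements, while n = 2 is imperfect, in line with the restriction n > 2 in (9.24).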

Proof. From (9.23),

$$(n + 2)q_n - 2p_n = 9\varphi_n \quad\text{and}\quad 10p_n - (5n + 1)q_n = 9\varphi_{n-1} .$$

Now, we can prove by induction (using (9.21)) that $(\varphi_n, \varphi_{n-1}) = 1$. Thus (9.26) follows.

[(9.24) can be generalized: If in Figure 12 the wire $P_0P_n$ is inserted and the wire $P_0P_r$ removed, where $1 < r \le \tfrac12 n$, the resulting p-net $\mathfrak{R}_{nr}$ is perfect. $\mathfrak{R}_{n2}$ is essentially the same as $\mathfrak{R}_n$. The reduction $\rho_r$ of $\mathfrak{R}_{nr}$ can be calculated; for instance, it can be shown that $\rho_r$ is a factor of $(\varphi_r - \varphi_{r-1})$; and that $\rho_r = (\varphi_r - \varphi_{r-1})$ if and only if $n \equiv 0 \pmod{2r - 1}$.]

9.3. We next prove

(9.31) THEOREM. For all large n, the squared square $\mathfrak{D}_n$ is perfect.

Consider the equivalent p-nets of Figure 13, where each "resistance" denotes a certain p-net $\mathfrak{R}$, of horizontal side p and vertical side q. (The other wires have conductance 1, as usual. Later, $\mathfrak{R}_n$ will be taken as $\mathfrak{R}$.)

FIG. 13

Setting c = p/q = effective conductance of these resistors, we find that the flows are as indicated in the diagram. (The quantities shown are currents.) Hence, multiplying through by $q^2$, and adjoining the extra elements required in forming the squared square $\mathfrak{D}$ as in §8.2, we find that the elements of $\mathfrak{D}$ (some integral multiple of the reduced elements) are:

(9.32)

(A) $14p^2 + 21pq + 7q^2$, $5p^2 + 14pq + 11q^2$, $5p^2 + 13pq + 7q^2$, $5p^2 + 4pq - q^2$, $5p^2 + 3pq - 5q^2$, $3p^2 + 17pq + 20q^2$, $3p^2 + 13pq + 11q^2$, $3p^2 + 7pq + q^2$, $3p^2 + 6pq + 4q^2$, $3p^2 - 6q^2$, $2p^2 + 9pq + 4q^2$, $2p^2 + 8pq + 7q^2$, $2p^2 + 4pq + 5q^2$, $2p^2 - 4q^2$, $2p^2 - 4pq - 6q^2$.

(B) Multiples of the elements of $\mathfrak{R}$, the multipliers being respectively $13p + 17q$, $12p + 13q$, $11p + 16q$, $9p + 8q$, $4p + 9q$, $p - 3q$.

We also find that

(9.33) The side of $\mathfrak{D}$ is $19p^2 + 47pq + 31q^2$.
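The list (9.32) and the side (9.33) can be cross-checked by comparing areas. The sketch below is illustrative (names such as `A_FORMS` are assumptions). It takes the eighth form of (A) as $3p^2 + 7pq + q^2$, the reading under which the areas balance, and it assumes that each group of elements (B) is a copy of the p × q rectangle $\mathfrak{R}$ magnified by its multiplier, so that it contributes area (multiplier)²·pq.

```python
# Coefficients (x, y, z) of x*p^2 + y*p*q + z*q^2, in the order of list (A).
A_FORMS = [
    (14, 21, 7), (5, 14, 11), (5, 13, 7), (5, 4, -1), (5, 3, -5),
    (3, 17, 20), (3, 13, 11), (3, 7, 1), (3, 6, 4), (3, 0, -6),
    (2, 9, 4), (2, 8, 7), (2, 4, 5), (2, 0, -4), (2, -4, -6),
]
# Multipliers (alpha, beta) of alpha*p + beta*q for the elements (B).
B_MULTIPLIERS = [(13, 17), (12, 13), (11, 16), (9, 8), (4, 9), (1, -3)]
SIDE = (19, 47, 31)


def quad(form, p, q):
    """Evaluate x*p^2 + y*p*q + z*q^2."""
    x, y, z = form
    return x * p * p + y * p * q + z * q * q


def area_defect(p, q):
    """Side squared minus the total area of all elements."""
    area_a = sum(quad(f, p, q) ** 2 for f in A_FORMS)
    area_b = sum((al * p + be * q) ** 2 * p * q for al, be in B_MULTIPLIERS)
    return quad(SIDE, p, q) ** 2 - area_a - area_b


def strictly_decreasing(p, q):
    """True if the elements (A), in the listed order, strictly decrease."""
    values = [quad(f, p, q) for f in A_FORMS]
    return all(u > v for u, v in zip(values, values[1:]))


print(area_defect(1, 1), area_defect(1000, 7), strictly_decreasing(181, 1))
```

Since the defect is a homogeneous polynomial of degree 4 in p and q, its vanishing at five or more distinct ratios p : q shows that it vanishes identically.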

Now take $\mathfrak{R}$ to be $\mathfrak{R}_n$, so that $p = p_n$ and $q = q_n$; and let n be so large that (in virtue of (9.25))

(9.34) $p_n > 180q_n$.

We prove that, under this condition, $\mathfrak{D} = \mathfrak{D}_n$ is perfect.

(9.35) The elements (A) are all different, and no element (A) equals an element (B).

For the elements (A) in the above list are in strictly decreasing order; so no two of them are equal. Also the least element (A) is $2p^2 - 4pq - 6q^2$, which is greater than $(13p + 17q)q$, which is greater than any element (B). Thus (9.35) follows.

(9.36) No two elements (B) are equal.

For suppose that two such elements are equal:

(9.37) $\xi(\alpha p + \beta q) = \eta(\gamma p + \delta q)$,

where $\xi$, $\eta$ are elements of $\mathfrak{R}_n$, and $\alpha p + \beta q$, $\gamma p + \delta q$ are two multipliers of (9.32). They are different multipliers; for $\mathfrak{R}_n$ is perfect by (9.24). Hence, by inspection of (9.32), $\alpha\delta - \beta\gamma \ne 0$. But, from (9.37), $(\alpha\xi - \gamma\eta)p = (\delta\eta - \beta\xi)q$. Hence

(9.38) $p \mid (\delta\eta - \beta\xi)\cdot(p, q)$.

Now, if $\delta\eta - \beta\xi = 0$, we have (since $p \ne 0$) $\alpha\xi - \gamma\eta = 0$, and hence, if we eliminate $\xi$, $\eta$ (which are not zero), it follows that $\alpha\delta - \beta\gamma = 0$. So we have $0 < |\delta\eta - \beta\xi| < 20q$ (by inspection of (9.32), since $0 < \xi, \eta < q$). Hence, if we use (9.26), (9.38) gives $p < 180q$. This contradicts (9.34).

And (9.31) now follows from (9.35) and (9.36); the squares $\mathfrak{D}_n$ are perfect, for large enough n.

(9.39) THEOREM. Given any large enough n, then for all large enough N, $\mathfrak{D}_n$ and $\mathfrak{D}_N$ are totally different.

Write $p_n = p$, $q_n = q$, $p_N = P$, $q_N = Q$. We bring $\mathfrak{D}_n$ and $\mathfrak{D}_N$ to the same size by multiplying the elements of $\mathfrak{D}_n$ (as given by (9.32)) by $19P^2 + 47PQ + 31Q^2$ and those of $\mathfrak{D}_N$ by $19p^2 + 47pq + 31q^2$. (This follows from (9.33).)

(9.40) Each element (B) of $\mathfrak{D}_N$ is less than every element of $\mathfrak{D}_n$.

For a typical element (B) of $\mathfrak{D}_N$ is

$$e = (\alpha P + \beta Q)\cdot(19p^2 + 47pq + 31q^2),$$

where $|\alpha|, |\beta| \le 17$. If n and N are large, this gives $e < 360Pp^2$. (This follows from (9.25).) But each element of $\mathfrak{D}_n$ is at least as large as $P^2p$ (times some non-zero constant). Hence if n > some $n_0$, and if then N > some $N_0(n)$, so that P is large compared with p (see (9.25)), we have e < each element of $\mathfrak{D}_n$.

(9.41) Each element (A) of $\mathfrak{D}_N$ is greater than every element (B) of $\mathfrak{D}_n$.

For any element (A) of $\mathfrak{D}_N$ is at least as large as $P^2p^2$ (times some non-zero constant), whereas an element (B) of $\mathfrak{D}_n$ is less than $360P^2p$.

(9.42) No element (A) of $\mathfrak{D}_N$ can equal any element (A) of $\mathfrak{D}_n$.

Otherwise we have

$$(aP^2 + bPQ + cQ^2)\cdot(19p^2 + 47pq + 31q^2) = (a'p^2 + b'pq + c'q^2)\cdot(19P^2 + 47PQ + 31Q^2),$$

where by (9.32) a, a', etc., are integers numerically less than 22. Hence

(9.43) $P^2\cdot[(19a - 19a')p^2 + (47a - 19b')pq + (31a - 19c')q^2]$ = similar terms in PQ and $Q^2$.

Now, $47a - 19b' \ne 0$; for otherwise $19 \mid a$, whereas $0 < a < 19$ (from (9.32)). Hence the left side of (9.43) is numerically at least as large as $P^2pq$ (times some non-zero constant); in fact, if $a \ne a'$, it is as large as $P^2p^2$. But the right side of (9.43) is at most $PQp^2$ (times a constant). Hence, if N is taken large enough, so that P dominates both p and Q (this is possible, by (9.25)), (9.43) is impossible.

(9.40), (9.41), and (9.42) imply (9.39).

(9.44) COROLLARY. There is a sequence $\{\mathfrak{T}_r\}$ of perfect squares, every two of which are totally different.

This is immediate from (9.31) and (9.39) and proves (9.11). A rough calculation shows that we may take $\mathfrak{T}_r = \mathfrak{D}_{10^{3(r+1)}}$. This could probably be greatly improved.

(9.45) THEOREM. Any rectangle whose sides are commensurable can be squared perfectly in an infinity of totally different ways.

Magnifying the rectangle suitably, we may suppose that its sides are integers h, k. Divide it into hk squares of side 1, by lines parallel to its sides. Take any positive integer n, and replace the i-th of these unit squares by $\mathfrak{T}_{nhk+i}$ (suitably contracted). By (9.44), this gives, for each n, a perfect subdivision of the given rectangle; and these subdivisions for any two values of n are "totally different".

Using the theorem of (2.14), we see that a rectangle can be squared perfectly if it can be squared at all. It is plausible that any commensurable-sided rectangle can be squared perfectly and simply; possibly this can be proved in a similar way if we use some extension of §8.4; but this seems to involve laborious calculations.

10. Some generalizations

We mention briefly some of the extensions of the methods and results of this paper. A fuller discussion may perhaps appear later.

10.1. Rectangled rectangles. An immediate and natural generalization (as pointed out in §1.2) is to the problem of a rectangle dissected into a finite number of rectangles. The wires of the p-net merely have general (not necessarily equal) conductances. There is also (cf. §8.4) a rather trivial extension in which the dissection is of a polygon (of angles $\tfrac12\pi$ and $\tfrac32\pi$). A more natural generalization, however, is given in the following section.

10.2. Squared cylinders and tori. We may regard a squared rectangle, after identification of its left and right sides, as a "trivial" example of a squared cylinder. The squared cylinders are found to correspond exactly to the relaxation of the condition (1.12) that no circuit of the p-net may enclose a pole. A second step brings us to the "squared torus". Using the existence theorem of (9.45), we can easily construct such figures. It is also possible to construct a simple non-trivial perfect torus; but this is not so easy. Of course, the word "squared" may be replaced by "rectangled".

10.3. Triangulations of a triangle. In a rather different direction, we may consider dissections of a triangle into a finite number of triangles; particularly when all the triangles considered are equilateral. It is easily proved that there is no perfect equilateral triangle; i.e., that in any such dissection of an equilateral triangle into equilateral triangles, two of the latter are equal. Apart from this, the theory extends fairly completely. Duality relations, for example, are replaced by "triality" relations. We could also consider dissections into a mixture of equilateral triangles and regular hexagons, no two of these elements having equal sides; essentially this amounts to agglomerating the imperfections of an "equilateral triangled triangle" together by sixes. There is no difficulty in constructing such figures empirically, or in finding "perfect isosceles right-angled triangles"; however, it can be done by using the theory.

10.4. Three dimensions. We have seen that the "p-net" and its generalizations are satisfactory for plane dissections. As yet, however, there is no satisfactory analogue in three dimensions. The problem is less urgent, because there is no perfect cube (or parallelopiped). That is, in any dissection of a rectangular parallelopiped into a finite number of cubes ("elements"), two of the latter are equal.

Proof. It is easily seen that in any perfect rectangle, the smallest element is not on the boundary of the rectangle. Suppose we have a "perfect" cubed parallelopiped P. Let $R_1$ be its base. The elements of P which rest on $R_1$ "induce" a dissection of $R_1$ into a perfect rectangle. (We can clearly assume that more than one cube rests on $R_1$.) Let $S_1$ be the smallest element of $R_1$. Let $C_1$ be the corresponding element of P. Then $C_1$ is surrounded by larger, and therefore higher, cubes on all four sides; for, as remarked above, $S_1$ is surrounded by larger squares. Hence the upper face of $C_1$ is divided into a perfect rectangle $R_2$ by the elements of P which rest on it; let $S_2$ be the smallest element of $R_2$; and so on. In this way, we get an infinite sequence of elements $C_n$ of P, all different (for $C_{n+1} < C_n$). This is a contradiction.

This proof excludes generalizations of "perfect cylinders" to three (or more, a fortiori) dimensions; but it does not exclude the possibility of a perfect three-dimensional torus (product of three circles). It is not known whether such a thing can exist.

BIBLIOGRAPHY

1. M. ABE, On the problem to cover simply and without gap the inside of a square with a finite number of squares which are all different from one another, Proceedings of the Physico-Mathematical Society of Japan, (3), vol. 14(1932), pp. 385-387.
2. W. W. ROUSE BALL, Mathematical Recreations, 11th ed., New York, 1939.
3. C. W. BORCHARDT, Ueber eine der Interpolation entsprechende Darstellung der Eliminations-Resultante, Journal für Mathematik, vol. 57(1860), pp. 111-121.
4. S. CHOWLA, Division of a rectangle into unequal squares, Mathematics Student, vol. 7(1939), p. 69.
5. S. CHOWLA, ibid., Question 1779.
6. M. DEHN, Zerlegung von Rechtecke in Rechtecken, Mathematische Annalen, vol. 57(1903), pp. 314-332.
7. JARENKIEWYCZ, Zeitschrift für die mathematische und naturwissenschaftliche Unterricht, vol. 66(1935), p. 251, Aufgabe 1242; and solution, op. cit., vol. 68(1937), p. 43. (Also solutions by Mahrenholz and Sprague.)
8. J. H. JEANS, The Mathematical Theory of Electricity and Magnetism, Cambridge, 1908.
9. G. KIRCHHOFF, Ueber die Auflösung der Gleichungen, auf welche man bei der Untersuchung der linearen Vertheilung galvanischer Ströme geführt wird, Annalen d. Physik und Chemie, vol. 72(1847), p. 497.
10. M. KRAITCHIK, La Mathématique des Jeux, Brussels, 1930.
11. Z. MOROŃ, Przegląd Mat. Fiz., vol. 3(1925), pp. 152, 153.
12. A. SCHOENFLIES-M. DEHN, Einfuehrung in die analytische Geometrie der Ebene und des Raumes, 2d ed., Berlin, 1931.
13. R. SPRAGUE, Mathematische Zeitschrift, vol. 45(1939), p. 607.
14. H. STEINHAUS, Mathematical Snapshots, New York, 1938.
15. A. STÖHR, Zerlegung von Rechtecken in inkongruente Quadrate, Thesis, Berlin.
16. A. H. STONE, Question E. 401 and solution, American Mathematical Monthly, vol. 47(1940).
17. H. TOEPKEN, Aufgabe 242, Jahresberichte der deutschen Mathematiker-Vereinigung, vol. 47(1937), p. 2.
18. H. TOEPKEN, Aufgabe 271, op. cit., vol. 48(1938), p. 73.
19. H. W. TURNBULL, Theory of Matrices, Determinants, and Invariants, London, 1929.

TRINITY COLLEGE, CAMBRIDGE, ENGLAND, AND PRINCETON UNIVERSITY.

[Extracted from the Proceedings of the Cambridge Philosophical Society, Vol. XXXVII. Pt. II.] PRINTED IN GREAT BRITAIN

ON COLOURING THE NODES OF A NETWORK

By R. L. BROOKS

Communicated by W. T. TUTTE

Received 15 November 1940

The purpose of this note is to prove the following theorem. Let N be a network (or linear graph) such that at each node not more than n lines meet (where n > 2), and no line has both ends at the same node. Suppose also that no connected component of N is an n-simplex. Then it is possible to colour the nodes of N with n colours so that no two nodes of the same colour are joined.

An n-simplex is a network with n + 1 nodes, every pair of which are joined by one line. N may be infinite, and need not lie in a plane. A network in which not more than n lines meet at any node is said to be of degree not greater than n. The colouring of its nodes with n colours so that no two nodes of the same colour are joined is called an "n-colouring".

Without loss of generality we may suppose that N is connected, for otherwise the theorem can be proved for each connected component; and that it is not a simplex. With these suppositions, N is finite or enumerable, both as regards nodes and lines.

Now we can (n + 1)-colour N with colours $c_0, c_1, \dots, c_n$, by giving to each node in turn a colour different from all those already assigned to nodes to which it is directly joined. We can then apply the following operations, in which the colours of directly joined nodes remain distinct.

(1) A node directly joined to not more than n - 1 colours can be recoloured not-$c_0$. (In the term "recolouring" we include for convenience the case in which no colour is altered.) In particular, a node directly joined to two nodes of the same colour may be recoloured not-$c_0$.

(2) If P and Q are directly joined they can be recoloured without altering any other nodes, so that P is not-$c_0$. For neglecting the join PQ, we may recolour P not-$c_0$, by (1); and Q can then be recoloured (possibly $c_0$).

(3) Let P, P', P'', ..., Q be a path, i.e. suppose every consecutive pair of nodes directly joined. Then we can recolour P, P', ..., Q successively, without altering any other nodes, so that at most Q has finally the colour $c_0$.


Corollary 1. If N is finite, choose Q arbitrarily in N. Since there is a path joining Q to every node P in N, we can recolour N with at most the node Q coloured $c_0$.

Corollary 2. If N is infinite, let F be any connected finite part, and Q a node directly joined to, but not in, F. Then we can recolour F and Q so that no node of F is coloured $c_0$.

PROOF OF THE MAIN THEOREM FOR A FINITE NETWORK

Case 1. If any node X meets fewer than n lines, we can n-colour N. For let N be (n + 1)-coloured, with at most the node X coloured $c_0$. Then by (1), X also can be recoloured not-$c_0$.

Case 2. Suppose that if P, Q, A, B are any four distinct nodes, there is a path from P to Q not including A or B. Since N is not a simplex, we can find nodes P, Q not directly joined. Let N be (n + 1)-coloured so that only Q is coloured $c_0$. Then P and all nodes directly joined to it are not-$c_0$. Either P meets fewer than n lines, when N may be n-coloured by case 1, or there are two nodes A, B, directly joined to P, which have the same colour. But there is a path joining P to Q, not including A or B. Hence, by (3), N can be recoloured, without altering A or B, so that at most P is coloured $c_0$. Since A and B have the same colour, P can be recoloured not-$c_0$, by (1). Thus N is n-coloured.

Case 3. Suppose there exist distinct nodes P, Q, A, B, such that every path from P to Q passes through A or B. Then consider the networks, contained in N, with the following specifications:

$N_1$. Nodes: P, and all nodes joined to P by some path not passing through A or B as an intermediate point. Lines: all lines connecting the above nodes in N.

$N_2$. Nodes: A, B, and all nodes of N not in $N_1$. Lines: all lines connecting the above nodes in N.

Thus $N_1$ and $N_2$ are connected non-null networks, together making up the whole network N, and having in common at least one of the nodes A, B, and at most A, B, and any lines AB. Therefore if $m_i$ is the number of lines in $N_i$ containing A, and $m_0$ is the number of lines AB,

$m_1 + m_2 \le m_0 + n$. (X)

Clearly $N_1$ and $N_2$ have degree not exceeding n. There are three subcases, 3·1, 3·2 and 3·3.

Case 3·1. Suppose $N_1$ and $N_2$ have only one node, say A, in common. Then in each of them, the node A meets fewer than n lines. Thus by case 1, $N_1$ and $N_2$ may be n-coloured; and if we permute the colours of $N_2$ so that the colours of A in $N_1$ and $N_2$ become the same, the whole network N is n-coloured.

Case 3·2. One of $N_1$ and $N_2$ (say $N_1$) is such that when the line AB is added it becomes an n-simplex. $N_1$ can be n-coloured by assigning arbitrary colours to the n - 1 nodes other than A and B, and the remaining colour to A and B. By (X) there is just one line in $N_2$ meeting A, and just one meeting B, or case 3·1 holds. Thus if A and B were identified in $N_2$, there would still (since n > 2) be fewer than n lines meeting A (= B). Hence by case 1 the resulting network can be n-coloured; i.e. $N_2$ can be n-coloured with A and B the same colour. The colours can be chosen so that A and B have the same colour as in $N_1$. N is therefore n-coloured.

Case 3·3. Neither $N_1$ nor $N_2$ becomes an n-simplex on adding a join AB. Suppose they become $M_1$ and $M_2$ respectively by this addition. Then $M_1$ and $M_2$ each contain fewer nodes than N, and by (X) they are of degree not greater than n. (If not we should have case 3·1.) If $M_1$ and $M_2$ are n-colourable so is N. For since both contain a line AB, in any n-colouring A must have a different colour from B in each network. We can permute the colours of $M_1$ so that the colours of A and B are the same as in $M_2$, and then by combination obtain an n-colouring of N.

Thus if N is a finite connected network of degree not exceeding n, and is not an n-simplex, either it is n-colourable, or it is n-colourable if two networks which satisfy the same conditions, but have fewer nodes, are n-colourable. Now it is obvious that the theorem is true for a network with less than four nodes. Therefore, by induction over the number of nodes, N is always n-colourable.

INFINITE NETWORKS

If F is a network or set of nodes in N, we denote by N - F the network composed of the nodes of N not in F, and the lines of N neither end of which is in F.

LEMMA. For each positive r we can find a connected finite network $F_r$ such that (i) the connected components of $N - F_1 - F_2 - \cdots - F_r$ are infinite for all r, (ii) every node of N lies in one and only one $F_r$.

Let the nodes of N be enumerated as $P_1, P_2, P_3, \ldots$, and let $F_r$ be defined inductively as follows. When $F_s$ has been chosen, for s < r (or without any such choice when r = 1), let $R_r$ be the first $P_m$ not in any $F_s$. Suppose further that the connected components of $N - F_1 - F_2 - \cdots - F_{r-1}$ are infinite. Then all the finite connected components of $N - F_1 - F_2 - \cdots - F_{r-1} - R_r$ must contain a node directly joined to $R_r$ in N. Therefore the number of these components cannot exceed n, and at least one infinite component of $N - F_1 - F_2 - \cdots - F_{r-1} - R_r$ contains a node joined by a line to $R_r$. Take $F_r$ to be the "logical sum" of $R_r$, all finite connected components of $N - F_1 - \cdots - F_{r-1} - R_r$, and all lines joining them in N. Thus $F_r$ is a finite connected network, and has no node in common with $F_s$, for s < r. Further, the connected components of $N - F_1 - \cdots - F_r$ are simply the infinite connected components of $N - F_1 - \cdots - F_{r-1} - R_r$. The inductive construction is therefore complete. By the method of choosing $R_r$, $P_m$ must lie in some $F_s$ ($s \le m$).

A node $Q_r$ can be chosen in an infinite connected component of $N - F_1 - \cdots - F_{r-1} - R_r$ so that $Q_r$ is directly joined in N to $R_r$, which lies in $F_r$. Thus $Q_r$ does not lie in $F_s$ if $s \le r$. Now N can be (n + 1)-coloured; and by (3), corollary 2, we can recolour $F_r$ in n colours, altering only $F_r$ and $Q_r$, i.e. not altering $F_s$ for s < r. Thus we can recolour $F_1, F_2, \ldots$ in turn in n colours, each recolouring not affecting the nodes already recoloured: that is, we can n-colour N.
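The statement of the theorem can be tried out on a small finite network. The sketch below is only an illustration, not Brooks's recolouring procedure: `greedy_colouring` mirrors the initial (n + 1)-colouring described in the note, while `find_colouring` is a plain backtracking search swapped in for the recolouring operations (1) to (3). The Petersen graph and all function names are assumptions chosen for the example.

```python
def petersen():
    """Adjacency sets of the Petersen graph: every node meets exactly 3 lines."""
    adj = {v: set() for v in range(10)}

    def join(a, b):
        adj[a].add(b)
        adj[b].add(a)

    for i in range(5):
        join(i, (i + 1) % 5)            # outer pentagon
        join(i, i + 5)                  # spokes
        join(i + 5, (i + 2) % 5 + 5)    # inner pentagram
    return adj


def is_proper(adj, colour):
    """True if no two directly joined nodes share a colour."""
    return all(colour[v] != colour[w] for v in adj for w in adj[v])


def greedy_colouring(adj):
    """Give each node in turn the smallest colour unused by its coloured neighbours."""
    colour = {}
    for v in adj:
        used = {colour[w] for w in adj[v] if w in colour}
        colour[v] = next(c for c in range(len(adj) + 1) if c not in used)
    return colour


def find_colouring(adj, n):
    """Backtracking search for a proper colouring with colours 0..n-1, or None."""
    nodes = list(adj)
    colour = {}

    def extend(i):
        if i == len(nodes):
            return True
        v = nodes[i]
        for c in range(n):
            if all(colour.get(w) != c for w in adj[v]):
                colour[v] = c
                if extend(i + 1):
                    return True
                del colour[v]
        return False

    return dict(colour) if extend(0) else None


network = petersen()
n = max(len(nbrs) for nbrs in network.values())   # degree bound of the network
first = greedy_colouring(network)                 # never needs more than n + 1 colours
better = find_colouring(network, n)               # an n-colouring, as the theorem asserts
# The excluded case: an n-simplex (n + 1 mutually joined nodes) has no n-colouring.
simplex = {v: {w for w in range(n + 1) if w != v} for v in range(n + 1)}
print(len(set(first.values())), better is not None, find_colouring(simplex, n))
```

The search is exponential in the worst case, so it is suitable only for small checks of this kind.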

TRINITY COLLEGE CAMBRIDGE


SOLUTION OF THE "PROBLEME DES MENAGES" IRVING KAPLANSKY

The problème des ménages asks for the number of ways of seating n husbands and n wives at a circular table, men alternating with women, so that no husband sits next to his wife. Despite the considerable literature devoted to this problem (cf. the appended bibliography), the following simple solution seems to have been missed. It is convenient first to solve two preliminary problems, perhaps of some interest in themselves.

LEMMA 1. The number of ways of selecting k objects, no two consecutive, from n objects arrayed in a row is ${}_{n-k+1}C_k$.

Let $f(n, k)$ be the desired number. We split the selections into two subsets: those which include the last of the n objects and those which do not. The former are $f(n-2, k-1)$ in number (since further selection of the second last object is forbidden); the latter are $f(n-1, k)$ in number. Hence

$f(n, k) = f(n-1, k) + f(n-2, k-1)$,

and, combining this with $f(n, 1) = n$, we readily prove by induction that $f(n, k) = {}_{n-k+1}C_k$.

LEMMA 2. The number of ways of selecting k objects, no two consecutive, from n objects arrayed in a circle is ${}_{n-k}C_k \, n/(n-k)$.

This differs from the preceding problem only in the imposition of the further restriction that no selection is to include both the first and last objects; and the number of such selections which are otherwise acceptable is $f(n-4, k-2)$. Hence the desired result is $f(n, k) - f(n-4, k-2) = {}_{n-k}C_k \, n/(n-k)$.

Presented to the Society, September 13, 1943; received by the editors May 4, 1943.
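Both lemmas are easy to confirm by exhaustive enumeration for small n. The following is a minimal sketch; the function names are hypothetical, objects are labelled 0 to n - 1, and the circular formula is compared after clearing the denominator n - k so that only integers are involved.

```python
from itertools import combinations
from math import comb


def no_two_adjacent(s):
    """True if the sorted selection s contains no two consecutive labels."""
    return all(b - a > 1 for a, b in zip(s, s[1:]))


def count_row(n, k):
    """Lemma 1 by brute force: k objects, no two consecutive, from a row of n."""
    return sum(no_two_adjacent(s) for s in combinations(range(n), k))


def count_circle(n, k):
    """Lemma 2 by brute force: as above, but objects 0 and n-1 are also neighbours."""
    return sum(
        no_two_adjacent(s) and not (k >= 2 and s[0] == 0 and s[-1] == n - 1)
        for s in combinations(range(n), k)
    )


for n in range(4, 10):
    for k in range(1, n):
        row, circle = count_row(n, k), count_circle(n, k)
        assert row == comb(n - k + 1, k)                   # Lemma 1
        assert (n - k) * circle == n * comb(n - k, k)      # Lemma 2
        if k >= 2:
            # the subtraction used in the proof of Lemma 2
            assert circle == row - comb(n - k - 1, k - 2)
print("Lemmas 1 and 2 agree with enumeration for n = 4, ..., 9")
```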


We now restate the problème des ménages in the usual fashion by observing that the answer is $2\,n!\,u_n$, where $u_n$ is the number of permutations of 1, ..., n which do not satisfy any of the following 2n conditions: 1 is 1st or 2nd, 2 is 2nd or 3rd, ..., n is nth or 1st. Now let us select a subset of k conditions from the above 2n and inquire how many permutations of 1, ..., n there are which satisfy all k; the answer is $(n-k)!$ or 0 according as the k conditions are compatible or not. If we further denote by $v_k$ the number of ways of selecting k compatible conditions from the 2n, we have, by the familiar argument of inclusion and exclusion, $u_n = \sum_k (-1)^k v_k (n-k)!$. It remains to evaluate $v_k$, for which purpose we note that the 2n conditions, when arrayed in a circle, have the property that only consecutive ones are not compatible. It follows from Lemma 2 that $v_k = {}_{2n-k}C_k \, 2n/(2n-k)$, and hence

$u_n = n! - \frac{2n}{2n-1}\,{}_{2n-1}C_1\,(n-1)! + \frac{2n}{2n-2}\,{}_{2n-2}C_2\,(n-2)! - \cdots$.

From this result it follows without difficulty that $u_n/n! \to e^{-2}$ as $n \to \infty$.
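The closed formula can be compared with a direct count of the permutations that violate all 2n conditions. The sketch below assumes 0-based labels (so "i is in place i or i + 1" is read modulo n); `menage_formula` and `menage_brute` are hypothetical names introduced for the illustration.

```python
from itertools import permutations
from math import comb, exp, factorial


def menage_formula(n):
    """u_n = sum over k of (-1)^k * v_k * (n-k)!, with v_k taken from Lemma 2."""
    total = 0
    for k in range(n + 1):
        # v_k = C(2n-k, k) * 2n / (2n-k); the division is exact because v_k is a count
        v_k = 2 * n * comb(2 * n - k, k) // (2 * n - k)
        total += (-1) ** k * v_k * factorial(n - k)
    return total


def menage_brute(n):
    """Count permutations p (p[j] = object in place j) with p[j] not in {j, j-1 mod n}."""
    return sum(
        all(p[j] != j and p[j] != (j - 1) % n for j in range(n))
        for p in permutations(range(n))
    )


for n in range(3, 8):
    assert menage_formula(n) == menage_brute(n)
    print(n, menage_formula(n), 2 * factorial(n) * menage_formula(n))

# The ratio u_n / n! drifts towards e^{-2} as n grows.
print(menage_formula(200) / factorial(200), exp(-2))
```

The second column printed is $u_n$ and the third is the full seating count $2\,n!\,u_n$.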

BIBLIOGRAPHY

A. Cayley, A problem of arrangements, Proceedings of the Royal Society of Edinburgh vol. 9 (1878) pp. 338-341.
E. Lucas, Théorie des nombres, Paris, 1891, pp. 491-495.
P. A. MacMahon, Combinatory analysis, vol. 1, Cambridge, 1915, pp. 253-254.
E. Netto, Lehrbuch der Combinatorik, Berlin, 1927, pp. 75-80.
J. Touchard, Sur un problème de permutations, C. R. Acad. Sci. Paris vol. 198 (1934) pp. 631-633.

HARVARD UNIVERSITY

Reprinted from Bull. Amer. Math. Soc. 49 (1943), 784-785

A RING IN GRAPH THEORY
By W. T. TUTTE

Received 10 April 1946

1. INTRODUCTION

We call a point set in a complex K a 0-cell if it contains just one point of K, and a 1-cell if it is an open arc. A set L of 0-cells and 1-cells of K is called a linear graph on K if (i) no two members of L intersect, (ii) the union of all the members of L is K, (iii) each end-point of a 1-cell of L is a 0-cell of L and (iv) the number of 0-cells and 1-cells of L is finite and not 0. Clearly if L is a linear graph on K, then K is either a 0-complex or a 1-complex, and L contains at least one 0-cell. A 1-cell of L is called a loop if its two end-points coincide and a link otherwise. We say that L is connected if K is connected. If not then the subset of L consisting of the 0-cells and 1-cells of L which are in a component $K_1$ of K constitute a component of L. A component of a linear graph is itself a linear graph.

Let the numbers of 0-cells and 1-cells of a linear graph L on a complex K be $\alpha_0(L)$ and $\alpha_1(L)$ respectively. Then if $p_i(L) = p_i(K)$ is the Betti number of dimension i of K we have by elementary homology theory

$\alpha_1(L) - \alpha_0(L) = p_1(L) - p_0(L)$. (1)

Let $L_1$, $L_2$ be linear graphs on $K_1$, $K_2$ respectively. Then if there is a homoeomorphism of $K_1$ on to $K_2$ which maps each i-cell of $L_1$ on to an i-cell of $L_2$ (i = 0, 1) we say that $L_1$ and $L_2$ are isomorphic and write

$L_1 \cong L_2$. (2)

If $L_1$ and $L_2$ are two linear graphs whose complexes $K_1$ and $K_2$ do not meet, then together they constitute a linear graph L on the union of $K_1$ and $K_2$. We call it the product of $L_1$ and $L_2$ and write

$L = L_1 L_2$. (3)

The set of all the 0-cells of a linear graph L, together with an arbitrary subset of the 1-cells, constitutes a linear graph S which we call a subgraph of L. We call S a subtree of L if $p_0(S) = 1$ and $p_1(S) = 0$.

Let A be a link in a linear graph L on a complex K. By suppressing A we derive from L a linear graph $L'_A$ on a complex $K'_A$. By identifying all the points of the closure of A in K and taking the resulting point as a 0-cell of the new linear graph we derive from L a linear graph $L''_A$ on a complex $K''_A$.

Now there exist single-valued functions W(L) on the set of all linear graphs to the ring I of rational integers which obey the general laws

$W(L_1) = W(L_2)$ if $L_1 \cong L_2$, (4)

and

$W(L) = W(L'_A) + W(L''_A)$, (5)

where A is any link of L. Some of these functions also satisfy

$W(L_1 L_2) = W(L_1)\,W(L_2)$, (6)

whenever the product $L_1 L_2$ exists. We give here three examples; all three satisfy (4) and (5) and the last two satisfy (6). Proofs of these statements will emerge later, but the reader may easily verify them at once.

(I) W(L) is the number of subtrees of L. This function is connected with the theory of Kirchhoff's Laws. A summary of its properties and an application of it to dissection problems is given in a paper entitled 'The dissection of rectangles into squares' by Brooks, Smith, Stone and Tutte (Duke Math. J. 7 (1940), 312-40). These authors call it the complexity of L.

(II) $(-1)^{\alpha_0(L)} W(L)$ vanishes whenever L contains a loop, and is otherwise equal to the number of single-valued functions on the set of 0-cells of L to some fixed set H of a finite number n of elements such that for each 1-cell of L the two end-points are associated with different elements of H. Important papers dealing with such 'colourings of the 0-cells of L in n colours' are 'The coloring of graphs' by Hassler Whitney* (Ann. Math. 33 (1932), 688-718) and 'On colouring the nodes of a network' by R. L. Brooks (Proc. Cambridge Phil. Soc. 37 (1941), 194-97).

* The $p_1(L)$ of this paper is Whitney's 'nullity', and $p_0(L)$ is Whitney's P. The 'components' in this paper are Whitney's 'pieces'; he uses the word 'component' with a different meaning. A footnote to Whitney's paper, dealing with some work of R. M. Foster, is particularly interesting with respect to the subject of the present paper.

(III) If we orient the 1-cells of L and adopt the convention that the boundary of an oriented loop vanishes, we can define 1-cycles on L with coefficients in a fixed additive Abelian group G of finite order $\lambda$. $(-1)^{\alpha_0(L)+\alpha_1(L)} W(L)$ is the number of such 1-cycles on L in which no 1-cell has for coefficient the zero element of G.

These examples suggest that a general theory of functions satisfying the laws (4) and (5) should be constructed, and this paper represents an attempt to develop such a theory. For this purpose it is convenient to have the following definitions. A W-function (V-function) is a single-valued function on the set of all linear graphs to an additive Abelian group G (commutative ring H) which satisfies equations (4) and (5) (equations (4), (5) and (6)).

In the second section of this paper a ring R is defined such that each linear graph L is associated with a unique element f(L) of R, and it is shown that every W-function to G (V-function to H) can be expressed in the form hf(L) where h is a homomorphism of R considered as an additive group (considered as a ring) into the group G (ring H), and that every such homomorphism is a W-function to G (V-function to H). In the third section a V-function Z(L) defined in terms of the subgraphs of L is studied; it is used in the next section in the proof of the following theorem.

THEOREM. Let $(x_0, x_1, x_2, \ldots)$ be an infinite sequence of independent indeterminates over the ring I of rational integers. Then R is isomorphic with the ring of all polynomials over I in the $x_i$ having no constant term.

Further there is a particular isomorphism in which the element of R corresponding to $x_r$ is the element associated with a linear graph having just one 0-cell and just r 1-cells.

In the fifth section those W- and V-functions which are topological invariants of the complexes K are considered, and in the sixth section a particular V-function is applied to some colouring problems. In the seventh section those 1-complexes K which admit of a simplicial dissection in which each 0-simplex is an end-point of not less than two and not more than three 1-simplexes are studied. A class of topologically invariant functions of these 1-complexes, one member of which is associated with a well-known colouring problem, is investigated, and it is shown that each of these functions has a unique extension as a topologically invariant W-function to all linear graphs.

2. THE RING R

From the definitions of $L'_A$ and $L''_A$ it is evident that

$\alpha_0(L) = \alpha_0(L'_A) = \alpha_0(L''_A) + 1$ (7)

and

$\alpha_1(L) = \alpha_1(L'_A) + 1 = \alpha_1(L''_A) + 1$. (8)

We say that the link A is an isthmus if its suppression increases the number of components of a linear graph. Evidently

$p_0(L) = p_0(L''_A) = p_0(L'_A)$ or $p_0(L'_A) - 1$, (9)

according as A is not or is an isthmus. Hence by (1)

$p_1(L) = p_1(L''_A) = p_1(L'_A)$ or $p_1(L'_A) + 1$, (10)

according as A is or is not an isthmus.

We call the class of all linear graphs isomorphic with L the isomorphism class $L^*$ of L. We also use clarendon type for isomorphism classes. If $\mathbf{L}_1$ and $\mathbf{L}_2$ are any two isomorphism classes not necessarily distinct we can find $L_1$ in $\mathbf{L}_1$ and $L_2$ in $\mathbf{L}_2$ such that the product $L_1 L_2$ exists. All products formed in this way from $\mathbf{L}_1$ and $\mathbf{L}_2$ are clearly isomorphic. We call their isomorphism class $\mathbf{L}$ the product of $\mathbf{L}_1$ and $\mathbf{L}_2$ and write

$\mathbf{L} = \mathbf{L}_1 \mathbf{L}_2$. (11)

A graphic form is a linear form in the isomorphism classes $\mathbf{L}$ with integer coefficients of which only a finite number may be non-zero. We do not distinguish between an isomorphism class $\mathbf{L}$ and the graphic form in which the coefficient of $\mathbf{L}$ is unity and all the other coefficients are zero.

We define addition and multiplication for graphic forms by

$\sum_i \lambda_i \mathbf{L}_i + \sum_i \mu_i \mathbf{L}_i = \sum_i (\lambda_i + \mu_i)\,\mathbf{L}_i$ (12)

and

$\bigl(\sum_i \lambda_i \mathbf{L}_i\bigr)\bigl(\sum_j \mu_j \mathbf{L}_j\bigr) = \sum_{i,j} (\lambda_i \mu_j)\,\mathbf{L}_i \mathbf{L}_j$, (13)

where the $\mathbf{L}_i$ are isomorphism classes and the $\lambda_i$ are rational integers. With these definitions the graphic forms are the elements of a commutative ring B. For the commutative, associative and distributive laws are evidently satisfied; and if $X = \sum_i \lambda_i \mathbf{L}_i$ and $Y = \sum_i \mu_i \mathbf{L}_i$ are any two graphic forms there is a unique graphic form $Z = \sum_i (\lambda_i - \mu_i)\,\mathbf{L}_i$ such that Y + Z = X. We write Z = X - Y.

If $X = \sum_i \lambda_i \mathbf{L}_i$ is any graphic form and $\lambda$ an integer, we denote by $\lambda X$ the graphic form $\sum_i \lambda\lambda_i \mathbf{L}_i$. We also denote by 0 the graphic form whose coefficients are all zero.

If A is a link in a linear graph L we say that the graphic form $L^* - (L'_A)^* - (L''_A)^*$ is a W-form. Let W denote the set of all linear combinations of a finite number of W-forms taken with integer coefficients. Then W is a modul of B, for with X and Y it contains also X - Y. Now if $L_0$ is any linear graph such that the product $L_0 L$ exists we have

$(L_0 L)'_A = L_0 L'_A$ and $(L_0 L)''_A = L_0 L''_A$.

Therefore if X is any W-form and $\mathbf{L}$ any isomorphism class, then $\mathbf{L}X$ is also a W-form. Hence by (12) and (13) for any $Y \in W$ and any $Z \in B$, we have $YZ \in W$. That is, W is an ideal of the commutative ring B.

We denote the difference ring B - W by R. The elements of R are the cosets mod W in B. If we denote the coset of X mod W by [X], addition and multiplication in R satisfy

$[X] + [Y] = [X + Y]$, (14)

$[X][Y] = [XY]$. (15)

THEOREM I. A single-valued function W(L) on the set of all linear graphs L to an additive Abelian group G (commutative ring H) is a W-function (V-function) if and only if it is of the form $h[L^*]$ where h is a homomorphism of the additive group R (ring R) into G (H).

Now the functions W(L) which satisfy (4) depend only on the isomorphism classes. For such functions we write $W(\mathbf{L}) = W(L)$ where $\mathbf{L}$ is the isomorphism class of the linear graph L. $W(\mathbf{L})$ can now be extended to all graphic forms by writing

$W\bigl(\sum_i \lambda_i \mathbf{L}_i\bigr) = \sum_i \lambda_i W(\mathbf{L}_i)$. (16)

If W(L) satisfies (6) we have also $W(\mathbf{L}_1 \mathbf{L}_2) = W(\mathbf{L}_1)\,W(\mathbf{L}_2)$ for any two isomorphism classes $\mathbf{L}_1$ and $\mathbf{L}_2$, and therefore, by (13) and (16),

$W(X_1 X_2) = W(X_1)\,W(X_2)$, (17)

where $X_1$ and $X_2$ are any two graphic forms. By (16) and (17) any single-valued function W(L) satisfying equation (4) (equations (4) and (6)) is of the form $h_0 L^*$ where $h_0$ is a homomorphism of B considered as an additive group (considered as a ring) into G (H); and conversely it is evident that if $h_0$ is any such homomorphism, the function $h_0 L^*$ satisfies equation (4) (equations (4) and (6)). W(L) then satisfies (5) if and only if $h_0$ maps all W-forms and therefore all elements of W on to the zero element of G (H). This is equivalent to the condition that $h_0 L^*$ shall depend only on the coset $[L^*]$. The theorem now follows from (14) and (15).

Let $y_r$ denote any linear graph having just one 0-cell and just r 1-cells (necessarily loops). Clearly all such linear graphs (for a fixed r) are isomorphic. We denote their isomorphism class by $\mathbf{y}_r$. We call the members of the $\mathbf{y}_r$ elementary graphs. Clearly

$p_0(y_r) = 1$, (18)

$p_1(y_r) = r$. (19)

THEOREM II. If L is any linear graph then $[L^*]$ can be expressed as a polynomial $P[L^*] = P([L^*]; [\mathbf{y}_0], [\mathbf{y}_1], [\mathbf{y}_2], \ldots)$ in the $[\mathbf{y}_i]$ such that (i) $P[L^*]$ has no constant term, (ii) the coefficients of $P[L^*]$ are non-negative rational integers, (iii) the degree of $P[L^*]$ is $\alpha_0(L)$, (iv) $P[L^*]$ involves no suffix i greater than $p_1(L)$, and (v) if L is connected and has no isthmus A such that for some component $L_0$ of $L'_A$, $p_1(L_0) = 0$, then $P[L^*]$ is of the form $[\mathbf{y}_p] + [Q]$ where $p = p_1(L)$ and [Q] is a polynomial in those $[\mathbf{y}_i]$ for which i is less than p.

The proof is by induction. We first observe that if $\alpha_1(L)$ is zero, then L is the product of $\alpha_0(L)$ elementary graphs each isomorphic with $y_0$. Hence by (15) $[L^*] = [\mathbf{y}_0]^{\alpha_0(L)}$, and so the theorem is true for L.

Assume that the theorem is true for all connected linear graphs having fewer than some finite number n of 1-cells. Let L be any linear graph having just n 1-cells. If L is connected, then either $\alpha_0(L) = 1$, in which case $[L^*] = [\mathbf{y}_n]$, and so the theorem is true for L, or else L contains a link A. In the second case we have $(L^* - (L'_A)^* - (L''_A)^*) \in W$, and therefore

$[L^*] = [(L'_A)^*] + [(L''_A)^*]$. (20)

By (8) $L'_A$ and $L''_A$ have each fewer 1-cells than L and so by the inductive hypothesis the theorem is true for them. The propositions (i) to (iv) follow immediately for L from (20) with the help of (7) and (10). Now suppose that L satisfies the conditions of (v). Then $L''_A$ satisfies them also, and so

$[(L''_A)^*] = [\mathbf{y}_p] + [Q_p]$, (21)

where p is $p_1(L)$, and $[Q_p]$ denotes any polynomial (not always the same polynomial) in those $[\mathbf{y}_i]$ for which i < p. If A is not an isthmus of L, then $p_1(L'_A) < p$ by (10), and so by (iv)

$[(L'_A)^*] = [Q_p]$. (22)

If A is an isthmus of L, then $L'_A$ is the product of two components $L_0$ and $L_1$ for which, by the conditions of (v), $p_1(L_0), p_1(L_1) > 0$, and therefore since $p_1(L_0) + p_1(L_1) = p_1(L'_A)$ we have $p_1(L_0), p_1(L_1) < p_1(L'_A)$. Consequently $[(L'_A)^*] = [(L_0)^*][(L_1)^*] = [Q_p]$ and (22) is still valid.

By (20), (21) and (22)

$[L^*] = [\mathbf{y}_p] + [Q_p]$.

This completes the proof that the theorem for connected linear graphs is true when $\alpha_1(L) = n$ if it is true for $\alpha_1(L) < n$. We have proved it for $\alpha_1(L) = 0$ and therefore it is true in general. If L is not connected we can obtain $P[L^*]$ satisfying the theorem by multiplying together the polynomials of its components.

COROLLARY. Any element [X] of R can be expressed as a polynomial in the $[\mathbf{y}_i]$ with rational integer coefficients and no constant term.

For X is a finite linear form in the $\mathbf{L}_i$ with integer coefficients.
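The induction in Theorem II is effectively a procedure: apply $[L^*] = [(L'_A)^*] + [(L''_A)^*]$ until only loops remain, at which point the graph is a product of elementary graphs. The sketch below follows that reading. The encoding (0-cells numbered from 0, 1-cells as pairs, a monomial stored as the sorted tuple of suffixes of its factors $[\mathbf{y}_i]$) and the function name are assumptions made for the illustration.

```python
from collections import Counter


def elementary_polynomial(num_cells, edges):
    """Reduce [L*] to a polynomial in the [y_i] by suppressing and contracting links.

    num_cells: number of 0-cells, labelled 0..num_cells-1.
    edges:     list of (u, v) pairs, one per 1-cell; u == v marks a loop.
    Returns a Counter {monomial: coefficient}, where a monomial is the sorted
    tuple of suffixes i of its factors [y_i].
    """
    link = next((e for e in edges if e[0] != e[1]), None)
    if link is None:
        # Only loops remain: L is a product of elementary graphs, one per 0-cell.
        loops = Counter(u for u, _ in edges)
        return Counter({tuple(sorted(loops[v] for v in range(num_cells))): 1})

    rest = list(edges)
    rest.remove(link)                      # the 1-cells other than the chosen link A
    u, v = link

    def relabel(w):
        """Identify v with u, then close the gap left in the numbering."""
        w = u if w == v else w
        return w - 1 if w > v else w

    suppressed = elementary_polynomial(num_cells, rest)
    contracted = elementary_polynomial(
        num_cells - 1, [(relabel(a), relabel(b)) for a, b in rest]
    )
    return suppressed + contracted


triangle = elementary_polynomial(3, [(0, 1), (1, 2), (0, 2)])
for monomial, coefficient in sorted(triangle.items()):
    print(coefficient, " ".join(f"[y_{i}]" for i in monomial))
```

For the triangle the monomial of highest suffix is a single $[\mathbf{y}_1]$, in line with part (v), and the longest monomial has $\alpha_0(L) = 3$ factors, in line with part (iii). The recursion doubles at every link, so it is only practical for very small graphs.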

3. SUBGRAPHS

Let S denote any subgraph of a linear graph L. Let the number of components T of S such that $p_1(T) = r$ be $i_r(S)$. We define a function Z(L) of L by

$Z(L) = \sum_S \prod_r z_r^{i_r(S)}$, (23)

where the $z_r$ are independent indeterminates over the ring I of rational integers. Although (23) involves a formal infinite product, yet for a given S only a finite number of the $i_r(S)$ can be non-zero and so, for each L, Z(L) is a polynomial in the $z_i$.

THEOREM III. Z(L) is a V-function.

For first it is obvious that Z(L) satisfies (4). Secondly, if A is any link of L, then the subgraphs of L which do not contain A are simply the subgraphs of $L'_A$, and the subgraphs S of L which do contain A are in 1-1 correspondence with the subgraphs of $L''_A$. For, for such an S, $S''_A$ is a subgraph of $L''_A$; and if $S_1$ is any subgraph of $L''_A$ there is one and only one subgraph S of L having the same 1-cells as $S_1$ with the addition of A and therefore satisfying $S''_A = S_1$. Further $S''_A$ differs from S only in that a component T of S is replaced by $T''_A$; and, by (9) and (10), $T''_A$ is connected and $p_1(T''_A) = p_1(T)$. Hence $i_r(S''_A) = i_r(S)$ for all r. Hence by (23)

$Z(L) = \sum_{S(L'_A)} \prod_r z_r^{i_r(S)} + \sum_{S(L''_A)} \prod_r z_r^{i_r(S)}$,

where $S(L'_A)$ for example denotes a subgraph S of $L'_A$. Therefore

$Z(L) = Z(L'_A) + Z(L''_A)$, (24)

so that Z(L) satisfies (5). Thirdly, for any product $L_1 L_2$ the subgraphs of $L_1 L_2$ are simply the products of the subgraphs $S_1$ of $L_1$ with the subgraphs $S_2$ of $L_2$. It is evident that

$i_r(S_1 S_2) = i_r(S_1) + i_r(S_2)$,

and therefore

$Z(L_1 L_2) = \sum_{S_1, S_2} \prod_r z_r^{i_r(S_1) + i_r(S_2)} = \bigl(\sum_{S_1} \prod_r z_r^{i_r(S_1)}\bigr)\bigl(\sum_{S_2} \prod_r z_r^{i_r(S_2)}\bigr) = Z(L_1)\,Z(L_2)$. (25)

Thus Z(L) satisfies (4), (5) and (6). That is, it is a V-function.

THEOREM IV.

$Z(y_r) = \sum_{k=0}^{r} \binom{r}{k} z_k$. (26)

For each subgraph of $y_r$ has just one 0-cell (§2), and therefore just one component. Hence $Z(y_r)$ is a linear form in the $z_r$. The number of subgraphs S such that $p_1(S) = k$ is the number with $\alpha_1(S) = k$, by (19), and this is the number of ways of choosing k 1-cells out of r.

4. STRUCTURE OF THE RING R

LEMMA.

$\sum_{i=j}^{r} (-1)^{i+j} \binom{r}{i} \binom{i}{j} = 1$ if j = r, and 0 otherwise. (27)

This equality can be obtained by expanding $x^r = ((x - 1) + 1)^r$ in powers of (x - 1), expanding each of the terms in the resulting series in powers of x, and then equating coefficients.
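The expansion described in the lemma can be carried out numerically. The sketch below assumes the identity in the form $\sum_{i=j}^{r} (-1)^{i+j}\binom{r}{i}\binom{i}{j} = \delta_{rj}$, which is what equating the coefficients of $x^j$ yields; the function names are hypothetical. The second function checks the use the lemma is put to: combining the linear forms $\sum_s \binom{j}{s} z_s$ of Theorem IV with the alternating weights $(-1)^{i+j}\binom{i}{j}$ isolates the single indeterminate $z_i$.

```python
from math import comb


def inversion_sum(r, j):
    """Coefficient of x^j after expanding ((x - 1) + 1)^r as the lemma describes."""
    return sum((-1) ** (i + j) * comb(r, i) * comb(i, j) for i in range(j, r + 1))


def image_coefficients(i):
    """Coefficients of z_0..z_i in sum_j (-1)^(i+j) C(i,j) * sum_s C(j,s) z_s."""
    return [
        sum((-1) ** (i + j) * comb(i, j) * comb(j, s) for j in range(s, i + 1))
        for s in range(i + 1)
    ]


for r in range(8):
    assert all(inversion_sum(r, j) == (1 if j == r else 0) for j in range(r + 1))
    assert image_coefficients(r) == [0] * r + [1]
print("binomial inversion verified for r = 0, ..., 7")
```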

THEOREM V. R is isomorphic with the ring $R_0$ of all polynomials in the $z_i$ with integer coefficients and no constant term.

For by Theorem III Z(L) is a V-function with values in $R_0$. Hence by Theorem I

$Z(L) = h[L^*]$, (28)

where h is a homomorphism of R into $R_0$. Let $[t_i]$ be the element of R defined by

$[t_i] = \sum_{j=0}^{i} (-1)^{i+j} \binom{i}{j} [\mathbf{y}_j]$. (29)

Then, by Theorem IV and the lemma,

$h[t_i] = \sum_{j=0}^{i} \sum_{s=0}^{j} (-1)^{i+j} \binom{i}{j} \binom{j}{s} z_s = z_i$. (30)

If we multiply (29) by $\binom{r}{i}$, sum from i = 0 to i = r, and use the lemma we find

$[\mathbf{y}_r] = \sum_{i=0}^{r} \binom{r}{i} [t_i]$. (31)

Hence by Theorem II, Corollary, any element [X] of R can be expressed as a polynomial in the $[t_i]$ with integer coefficients and no constant term. Moreover this expression is unique; otherwise there would be a polynomial relationship between the $[t_i]$, and therefore by (30) between the $z_i$, with integer coefficients, and this would contradict the definition of the $z_i$. It follows that h is an isomorphism of R on to $R_0$ (for every integer polynomial in the $[t_i]$ is in R).

THEOREM VI. Let $x_0, x_1, x_2, \ldots$ be an infinite sequence of connected linear graphs, and $\mathbf{x}_0, \mathbf{x}_1, \mathbf{x}_2, \ldots$ the corresponding isomorphism classes, such that (i) $x_0 \cong y_0$, (ii) $p_1(x_r) = r$, and (iii) $x_r$ contains no isthmus A such that for some component $L_0$ of $(x_r)'_A$, $p_1(L_0) = 0$. Then any element [X] of R has a unique expression as a polynomial in the $[\mathbf{x}_i]$ with integer coefficients and no constant term.

By Theorem II (v) and equation (31) we have, for r > 0,

$[\mathbf{x}_r] = [t_r] + [S_r]$, (32)

where $[S_r]$ is a polynomial in those $[t_i]$ for which i < r. Hence

$[t_r] = [\mathbf{x}_r] + [V_r]$, (33)

where $[V_r]$ is a polynomial in those $[\mathbf{x}_i]$ for which i < r. (If we assume this for r < n it follows for r = n by substitution in (32). Since $[\mathbf{x}_0] = [\mathbf{y}_0] = [t_0]$ it is true for r = 0, and therefore it is true in general.) Clearly $[S_r]$ and $[V_r]$ have no constant terms.

By Theorem II, Corollary, and equations (31) and (33), [X] can be expressed as a polynomial without a constant term in the $[\mathbf{x}_i]$. Suppose this expression not unique. Then there will be a polynomial relationship

$P([\mathbf{x}_i]) = 0$ (34)

between the $[\mathbf{x}_i]$. Of the terms of non-zero coefficient in $P([\mathbf{x}_i])$ pick out the subset $M_1$ of those which involve the greatest suffix occurring in them raised to the highest power to which it occurs. Of this subset $M_1$ pick out the subset $M_2$ of terms involving the second greatest suffix appearing in $M_1$ raised to the highest power to which it occurs in $M_1$, and so on. This process must terminate in a subset $M_k$ consisting of a single term $\lambda\,[\mathbf{x}_i]^{a(i)} [\mathbf{x}_j]^{a(j)} \cdots$. It is evident that if we substitute from (32) in (34), we shall obtain a polynomial relationship

$Q([t_i]) = 0$

between the $[t_i]$, in which the coefficient of $[t_i]^{a(i)} [t_j]^{a(j)} \cdots$ is $\lambda \ne 0$. But it was shown in the proof of Theorem V that there is no polynomial relationship between the $[t_i]$. This contradiction proves uniqueness and so completes the proof of the theorem.

5. TOPOLOGICALLY INVARIANT W-FUNCTIONS

Let A be a 1-cell of a linear graph L on a complex K. Let p be any point of A. We can obtain a new linear graph M on K from L by replacing A by the point p, taken as a 0-cell of M, and the two components of A - p taken as 1-cells of M. We call this operation a subdivision of A by p.

Given any two linear graphs $L_1$, $L_2$ on the same K we can find a linear graph $L_3$ which can be obtained from either by suitable subdivisions. Such a linear graph is evidently obtained by taking as the set V of 0-cells the set of all points of K which are 0-cells either of $L_1$ or of $L_2$, and by taking as 1-cells the components of K - V.

We seek the condition that a W-function W(L) shall be topologically invariant, i.e. depend only on K. By the above considerations a necessary and sufficient condition for this is that W(L) shall be invariant under subdivision operations. (For then $W(L_1) = W(L_3) = W(L_2)$.)

Suppose therefore that A is any 1-cell of L, possibly a loop, and let M be obtained from L by subdividing A by a point p. Let us denote the new 1-cells by B and C. Then by (5) for any W-function W(L)

$W(M) = W(M'_B) + W(M''_B)$
$= W((M'_B)'_C) + W((M'_B)''_C) + W(M''_B)$
$= W(p \cdot (M'_B)''_C) + W((M'_B)''_C) + W(M''_B)$.

Here p is used to denote the linear graph which consists solely of the 0-cell p. It is isomorphic to $y_0$. By making use of the obvious isomorphisms $M''_B \cong L$ and $(M'_B)''_C \cong L_0$, where $L_0$ is the linear graph derived from L by suppressing A, we obtain

$W(M) - W(L) = W(y_0 \cdot L_0) + W(L_0)$.

Therefore

$W(M) - W(L) = h([\mathbf{y}_0][L_0^*] + [L_0^*])$, (35)

where h is a homomorphism of R, regarded as an additive group, into an additive Abelian group G (Theorem I).

Let N denote the set of all elements of R which are of the form $[\mathbf{y}_0][X] + [X]$. Clearly N is an ideal of R. Let {X} denote that element of the difference ring R - N which contains [X].

THEOREM VII. A function W(L) on the set of all linear graphs L to the additive Abelian group G (commutative ring H) is a topologically invariant W-function (V-function) if and only if it is of the form $k\{L^*\}$, where k is a homomorphism of the additive group R - N (ring R - N) into G (H).

For in (35), by a proper choice of L, we can have any linear graph we please as $L_0$. It follows that the necessary and sufficient condition for the W-function W(L) to be topologically invariant is that h shall map all elements of R of the form $[\mathbf{y}_0][L^*] + [L^*]$ and therefore all elements of N on to the zero of G. This proves the theorem for W-functions. The same argument applies to V-functions, except that h in (35) is then a homomorphism of R (as a ring) into the ring H.

THEOREM VIII. Let $x_0, x_1, x_2, \ldots$ be as in the enunciation of Theorem VI. Then any element {X} of R - N has a unique expression as a polynomial in the $\{\mathbf{x}_i\}$ (i > 0) with integer coefficients.

For we can obtain such an expression for {X} by replacing each $[\mathbf{x}_i]$ by the corresponding $\{\mathbf{x}_i\}$ in the expression for [X] in terms of the $[\mathbf{x}_i]$ whose existence is asserted in Theorem VI. Now for all {X}, $\{X\} + \{\mathbf{y}_0\}\{X\} = \{0\}$, and so R - N has a unity element $-\{\mathbf{y}_0\} = -\{\mathbf{x}_0\}$ which we may denote by 1. Hence $\{\mathbf{x}_0\}$ is not an indeterminate over I, and we can regard our polynomial for {X} as a polynomial in those $\{\mathbf{x}_i\}$ for which i > 0 (with perhaps a constant term). If this expression for {X} is not unique then there will be a polynomial {P} in the $\{\mathbf{x}_i\}$ (i > 0) without a constant term such that

$\lambda\{\mathbf{x}_0\} + \{P\} = \{0\}$,

where $\lambda$ is some integer. Hence if [P] is the polynomial of the same form in the $[\mathbf{x}_i]$ we must have

$\lambda[\mathbf{x}_0] + [P] + [X_0] + [\mathbf{x}_0][X_0] = [0]$ (36)

for some $[X_0]$. Equating coefficients of like powers of $[\mathbf{x}_0]$, as is permissible by Theorem VI, we see that $[X_0]$ cannot involve $[\mathbf{x}_0]$, and hence that $\lambda = -[X_0] = [P]$. Consequently {P} is a constant and therefore, by its definition, the zero polynomial in the $\{\mathbf{x}_i\}$. The theorem follows.

6. SOME COLOURING PROBLEMS

The homomorphism of the ring Ro (see Theorem V) into the ring of polynomials in two independent indeterminates t and z by~he correspondence Zi -+tzi transforms Z(LJ into Q(L; t,z) = "L,tPo(S)zP,(S) (3.) s by (23). Since Z(L) is of the form h[L*] where h is a homomorphism of R into lio (Theorems I and III). (/(L; t, z) can be defined by a homomorphism of R into the ring of polynomials in t and z and is therefore a V-function (Theorem I). The coefficient of tazb, for fixed a, b, therefore satisfies (4) and (5) and so is a Jrfunction. Writing a = l,b = Oweobtainthe function of Example I of the Introduction. This function satisfies W(Ll L 2 ) = 0 (by (37) since Po(l:J) is always positive) and so it can be regarded as a V-function with values in the ring constructed from the additiYe group of the rational integers by defining the' product' of any hro elements as O. Q(L; t, z) has an interesting property which we call

THEOREM IX. If $L_1$ and $L_2$ are connected dual linear graphs on the sphere then

\[ \frac{1}{t}\, Q(L_1; t, z) = \frac{1}{z}\, Q(L_2; z, t). \tag{38} \]

This follows from (37) as a consequence of the fact that there is a 1-1 correspondence $S \to S'$ between the subgraphs $S$ of $L_1$ and the subgraphs $S'$ of $L_2$ such that $p_0(S) = p_1(S') + 1$ and $p_1(S) = p_0(S') - 1$. ($S'$ is that subgraph of $L_2$ whose 1-cells are precisely those not dual to 1-cells of $S$.)

For a proof of this proposition reference may be made to the paper 'Non-separable and planar graphs' by Hassler Whitney (Trans. American Math. Soc. 34 (1932), 339-362).
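As an illustration of (37) and (38), the following Python sketch computes $Q(L; t, z)$ by brute-force enumeration of edge subsets and compares a triangle with its planar dual (two vertices joined by three parallel edges). The function names and the encoding of a graph as a vertex count plus an edge list are assumptions made for this example only; they do not come from the paper.

```python
from collections import Counter
from itertools import combinations


def count_components(n_vertices, edges):
    """Return p0, the number of connected components, using union-find."""
    parent = list(range(n_vertices))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    count = n_vertices
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            count -= 1
    return count


def q_polynomial(n_vertices, edges):
    """Q(L; t, z) as a Counter mapping (p0, p1) to the coefficient of t^p0 z^p1.

    Every subset of the edge list is taken on the full vertex set,
    so p1 = (number of edges) - (number of vertices) + p0.
    """
    poly = Counter()
    for r in range(len(edges) + 1):
        for subset in combinations(edges, r):
            p0 = count_components(n_vertices, subset)
            p1 = len(subset) - n_vertices + p0
            poly[(p0, p1)] += 1
    return poly


def dual_side(poly):
    """Re-index the terms of Q(L1) as (38) predicts for Q(L2)."""
    return Counter({(p1 + 1, p0 - 1): c for (p0, p1), c in poly.items()})


triangle = q_polynomial(3, [(0, 1), (1, 2), (0, 2)])
theta = q_polynomial(2, [(0, 1), (0, 1), (0, 1)])  # dual of the triangle
print(dual_side(triangle) == theta)
```

For the triangle this gives $Q = t^3 + 3t^2 + 3t + tz$, and the re-indexed terms agree with those of the dual graph, as (38) requires. The enumeration is exponential in the number of 1-cells, so it is only meant for very small examples.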

We go on to consider two kinds of colourings of a linear graph, which we distinguish as $\alpha$-colourings and $\beta$-colourings. An $\alpha$-colouring of $L$ of degree $\lambda$ is a single-valued function on the set of 0-cells of $L$ to a fixed set $H$ the number of whose elements is $\lambda$. If $f$ is an $\alpha$-colouring let $\phi(f)$ denote the number of 1-cells $A$ of $L$ such that $f$ associates all the end-points of $A$ with the same element of $H$ (e.g. every loop has this property). We say that any subgraph of $L$ all of whose 1-cells have this property for $f$ is associated with $f$. We use the symbol $S(f)$ to denote a subgraph associated with a given $f$, and $f(S)$ to denote any $\alpha$-colouring with which a given $S$ is associated.

THEOREM X. Let $J(L; \lambda, \phi)$ be the number of $\alpha$-colourings $f$ of $L$ of degree $\lambda$ for which $\phi(f)$ has the value $\phi$. Then the following identity is true:

\[ \sum_{\phi} J(L; \lambda, \phi)\, x^{\phi} = (x - 1)^{\alpha_0(L)}\, Q\!\left(L; \frac{\lambda}{x - 1}, x - 1\right), \tag{39} \]

where $x$ is an indeterminate over $I$.

For, by (37) and (1), the right-hand side is

\[ (x - 1)^{\alpha_0(L)} \sum_S (x - 1)^{p_1(S) - p_0(S)} \lambda^{p_0(S)} = \sum_S (x - 1)^{\alpha_1(S)} \lambda^{p_0(S)} = \sum_S (x - 1)^{\alpha_1(S)} \sum_{f(S)} 1; \]

for the $\alpha$-colourings associated with $S$ are precisely those which map all the 0-cells in the same component of $S$ on to the same element of $H$. This last expression equals

\[ \sum_f \sum_{S(f)} (x - 1)^{\alpha_1(S)} = \sum_f x^{\phi(f)}, \]

since the number of subgraphs associated with $f$ and having just $\alpha_1(S)$ 1-cells is the number of ways of choosing $\alpha_1(S)$ 1-cells out of $\phi(f)$. This completes the proof of the theorem.

If we write $x = 0$ in (39) we find that $(-1)^{\alpha_0(L)} J(L; \lambda, 0)$, which is Example II of the Introduction, is the V-function $Q(L; -\lambda, -1)$. We thus obtain the well-known result*

\[ J(L; \lambda, 0) = \sum_S (-1)^{\alpha_1(S)} \lambda^{p_0(S)}. \]

* Hassler Whitney, 'A logical expansion in mathematics', Bull. American Math. Soc. 38 (1932), 572-9.
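The identity of Theorem X can be checked numerically on a small graph. The sketch below is an illustration only: it counts the $\alpha$-colourings of a triangle by exhaustive enumeration and compares $\sum_\phi J(L; \lambda, \phi) x^\phi$ with the subgraph sum $\sum_S (x-1)^{\alpha_1(S)} \lambda^{p_0(S)}$ obtained in the proof. The function names and the graph encoding are assumptions of this example.

```python
from itertools import combinations, product


def count_components(n_vertices, edges):
    """Return p0, the number of connected components, using union-find."""
    parent = list(range(n_vertices))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    count = n_vertices
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            count -= 1
    return count


def colouring_counts(n_vertices, edges, lam):
    """counts[phi] = number of alpha-colourings of degree lam with exactly
    phi 1-cells whose end-points receive the same colour."""
    counts = [0] * (len(edges) + 1)
    for f in product(range(lam), repeat=n_vertices):
        phi = sum(1 for u, v in edges if f[u] == f[v])
        counts[phi] += 1
    return counts


def subgraph_sum(n_vertices, edges, lam, x):
    """Sum over all edge subsets S of (x - 1)^alpha1(S) * lam^p0(S)."""
    return sum((x - 1) ** r * lam ** count_components(n_vertices, subset)
               for r in range(len(edges) + 1)
               for subset in combinations(edges, r))


edges = [(0, 1), (1, 2), (0, 2)]          # a triangle
J = colouring_counts(3, edges, lam=3)
lhs = sum(j * 2 ** phi for phi, j in enumerate(J))   # left side of (39) at x = 2
print(J, lhs, subgraph_sum(3, edges, 3, 2))
```

Setting $x = 0$ in `subgraph_sum` gives the alternating sum of the final display, which for the triangle with three colours equals the number of proper colourings, namely 6.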


If we orient the 1-cells of $L$ and adopt the convention that the boundary of an oriented loop vanishes, we can define 1-cycles on $L$ with coefficients in some fixed additive Abelian group $G$ of finite order $\lambda$. The number* of such 1-cycles on $L$ will be $\lambda^{p_1(L)}$. We call them $\beta$-colourings of $L$ with respect to $G$. Let $E(L; G, \psi)$ be the number of such 1-cycles for which just $\psi$ of the 1-cells have coefficient zero. Let $g_G$ be any $\beta$-colouring with respect to $G$ of $L$ and let $\psi(g_G)$ be the number of its zero coefficients†. We say that a subgraph $S$ of $L$ is associated with $g_G$ if every 1-cell of $L$ not in $S$ is assigned the zero element of $G$ as its coefficient in $g_G$. We use the symbol $S(g_G)$ to denote a subgraph of $L$ associated with $g_G$ and $g_G(S)$ to denote a $\beta$-colouring with which a given subgraph $S$ is associated. Clearly the number of $\beta$-colourings associated with a given $S$ is the number of $\beta$-colourings of $S$, which is $\lambda^{p_1(S)}$.

THEOREM XI. If $x$ is an indeterminate over $I$ then

\[ \sum_{\psi} E(L; G, \psi)\, x^{\psi} = (x - 1)^{\alpha_1(L) - \alpha_0(L)}\, Q\!\left(L; x - 1, \frac{\lambda}{x - 1}\right). \tag{40} \]

For, by (37) and (1), the right-hand side is

\[ (x - 1)^{\alpha_1(L) - \alpha_0(L)} \sum_S (x - 1)^{p_0(S) - p_1(S)} \lambda^{p_1(S)} = \sum_S (x - 1)^{\alpha_1(L) - \alpha_1(S)} \sum_{g_G(S)} 1 = \sum_{g_G} \sum_{S(g_G)} (x - 1)^{\alpha_1(L) - \alpha_1(S)} = \sum_{g_G} x^{\psi(g_G)}; \]

for the number of subgraphs of $L$ associated with $g_G$ and having just $\alpha_1(L) - \psi(g_G) + r$ 1-cells is the number of ways of choosing $r$ 1-cells out of the $\psi(g_G)$ which have zero coefficient in $g_G$.

COROLLARY. $E(L; G, \psi)$ is the same for all additive Abelian groups $G$ of the same order $\lambda$.

If we write $x = 0$ in (40) we find that $(-1)^{\alpha_1(L) - \alpha_0(L)} E(L; G, 0)$, which is Example III of the Introduction, is the V-function $Q(L; -1, -\lambda)$. It takes the value $-1$ when $L$ is $y_0$ and therefore corresponds to a homomorphism of $R$ into the ring of rational integers which maps $N$ into 0. It is therefore, by the preceding section, topologically invariant.

If $L_1$ and $L_2$ are dual linear graphs on the sphere, the $\beta$-colourings of $L_1$ are closely connected with the $\alpha$-colourings of $L_2$. In fact a 1-cycle $g_G$ bounds on the sphere and any 2-chain which it bounds on the map defined by $L_1$ has a dual 0-chain which is an $\alpha$-colouring $f_G$ of $L_2$ such that $\phi(f_G) = \psi(g_G)$.

There is also a relationship between the $\alpha$-colourings and the $\beta$-colourings of the same linear graph $L$ expressed by the following identity in $x$:

\[ (x - 1)^{\alpha_1(L)} \sum_{\psi} E(L; G, \psi) \left( \frac{x + \lambda - 1}{x - 1} \right)^{\psi} = \lambda^{\alpha_1(L) - \alpha_0(L)} \sum_{\phi} J(L; \lambda, \phi)\, x^{\phi}. \tag{41} \]

This is obtained by writing $\lambda/(x - 1)$ for $(x - 1)$ in (40) and then eliminating the function $Q$ by means of (39).

* See Lefschetz, Algebraic Topology (Amer. Math. Soc. Colloquium Publications, vol. 27), p. 106.

† It may be mentioned that for graphs on the sphere a $\beta$-colouring is essentially equivalent to a colouring of the regions of the map defined by a graph in $\lambda$ colours. The colours can be represented by elements of $G$ and so the colouring can be represented by a 2-chain on the map with coefficients in $G$. A $\beta$-colouring is simply the boundary of such a 2-chain. The number of 1-cells incident with two regions of the same colour (or incident with only one region) in a given colouring is given by the number $\psi(g_G)$ where $g_G$ is the corresponding $\beta$-colouring.
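The Corollary to Theorem XI can be illustrated by direct enumeration. The sketch below is a hypothetical helper, not anything from the paper: it represents a finite Abelian group as a product of cyclic groups given by a tuple of moduli, lists all 1-cycles on a small graph, and tabulates them by the number of zero coefficients. It then compares the two groups of order 4 on the graph with two 0-cells joined by three 1-cells.

```python
from itertools import product


def cycle_counts(n_vertices, edges, moduli):
    """counts[psi] = number of 1-cycles with exactly psi zero coefficients.

    The coefficient group is Z_m1 x Z_m2 x ... for moduli = (m1, m2, ...).
    Each edge (u, v) is oriented from u to v; a loop has zero boundary.
    """
    elements = list(product(*(range(m) for m in moduli)))
    zero = tuple(0 for _ in moduli)
    counts = [0] * (len(edges) + 1)
    for g in product(elements, repeat=len(edges)):
        # Accumulate the boundary of the 1-chain g at every vertex.
        boundary = [[0] * len(moduli) for _ in range(n_vertices)]
        for (u, v), coeff in zip(edges, g):
            for i, c in enumerate(coeff):
                boundary[v][i] += c
                boundary[u][i] -= c
        is_cycle = all(b[i] % moduli[i] == 0
                       for b in boundary for i in range(len(moduli)))
        if is_cycle:
            counts[sum(1 for coeff in g if coeff == zero)] += 1
    return counts


theta = [(0, 1), (0, 1), (0, 1)]           # p1 = 2, so 4**2 = 16 cycles
cyclic = cycle_counts(2, theta, (4,))      # G = Z_4
klein = cycle_counts(2, theta, (2, 2))     # G = Z_2 x Z_2
print(cyclic, klein)
```

Both groups give the same table, and in each case the entries sum to $\lambda^{p_1(L)} = 16$, in agreement with the count of 1-cycles stated in the text.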

7. CUBICAL NETWORKS

We define a cubical network as a 1-complex for which there exists a finite simplicial dissection in which each 0-simplex is incident with not less than two, and not more than three 1-simplexes. Clearly any other simplicial dissection of such a complex will have the same property. The 0-simplexes which are each incident with three 1-simplexes we call nodes. The set of nodes is evidently independent of the particular simplicial dissection taken. A component of a cubical network which does not contain a node is evidently a simple closed curve, and if a component does contain nodes then the remainder of it must consist of a number of non-intersecting open arcs whose end-points are nodes of the component. We call these open arcs the arcs of the cubical network. The number of nodes in a cubical network $N$ is clearly two-thirds of the number of arcs of $N$. It is therefore even.

[Fig. 1]

Let $X$ be an arc having distinct end-points $P$ and $Q$ in a cubical network $N$. In a simplicial dissection of $N$ let $A_1$, $A_2$ be those 1-simplexes incident with $P$, and $B_1$, $B_2$ those 1-simplexes incident with $Q$, which are not in $X$. Let $a_1$, $a_2$, $b_1$, $b_2$ be the other end-points of $A_1$, $A_2$, $B_1$, $B_2$ respectively. By suitable subdivisions of a given simplicial dissection we can always arrange that $a_1$, $b_1$, $a_2$, $b_2$ are distinct points and not nodes of $N$.

Other cubical networks can be obtained from $N$ by replacing $X$, $A_1$, $A_2$, $B_1$, $B_2$, $P$ and $Q$ by other systems of simplexes (see Fig. 1). If for example we suppress $A_1$ and $B_1$, introduce a new arc $Y$ joining $a_1$ to $b_1$ and then introduce an arc $Z$ joining a point in $Y$ to a point in $X$, we obtain $N_1$. We call this process a A-operation on $N$. If $N_1$ can be obtained from $N$ by a finite sequence of A-operations we say that $N$ and $N_1$ are A-equivalent. In such a case it is clear that $N_1$ has the same number of nodes as $N$ and that if $N$ is connected, so is $N_1$.

By suppressing $X$ in $N$ we obtain $N'_X$, and by suppressing $Z$ in $N_1$ we obtain $N'_Z$. We define an F-function as a single-valued topologically invariant function on the set of all cubical networks to an additive Abelian group $G$ or commutative ring $H$ which satisfies the general law

\[ F(N) - F(N'_X) = F(N_1) - F(N'_Z). \tag{42} \]

THEOREM XII. If $W(L)$ is a topologically invariant W-function, and $F(N)$ is the value of $W(L)$ for any linear graph on the cubical network $N$, then $F(N)$ is an F-function.

For let $N_0$ be the 1-complex obtained from $N$ by identifying all the points of the closure of $X$, and let $L_0$ be any linear graph on $N_0$ (clearly such exist). $N_0$ is evidently homoeomorphic to the 1-complex obtained from $N_1$ by identifying all the points of the closure of $Z$. Since $W(L)$ is topologically invariant it follows from (5) that

\[ F(N) - F(N'_X) = W(L_0) = F(N_1) - F(N'_Z), \]

which proves the theorem.

A trivial example of an F-function is $F(N) = x^{n(N)}$ where $x$ is an arbitrary real or complex number and $n(N)$ is one-half of the number of nodes of $N$. This function also satisfies

\[ F(N_1 \cup N_2) = F(N_1)\, F(N_2), \tag{43} \]

where $N_1$ and $N_2$ are any two disjoint cubical networks and $N_1 \cup N_2$ is their union.

Other F-functions may be obtained as follows. We define a subnetwork of $N$ as a 1-complex which is the union of all the nodes of $N$ and some subset of the arcs and nodeless components of $N$, such that each node of $N$ is an end-point of at least one arc of the subset. If the number of arcs of a subnetwork $T$ which have a given node $v$ of $N$ as an end-point (arcs which are loops being counted twice) is odd, we say that $v$ is an odd node of $T$. The number of odd nodes of $T$ is even, for it is congruent mod. 2 to the number of end-points of arcs of $T$ (a loop being regarded as having two end-points, though they happen to coincide). Let $k(T)$ be one-half the number of odd nodes of $T$. Let $\pi_k(N)$ be the number of subnetworks of $N$ for which $k(T) = k$. As an example a cubical network $J$ which consists of a single simple closed curve has just two subnetworks, $J$ itself and the null complex, and so $\pi_0(J) = 2$ and $\pi_i(J) = 0$ $(i > 0)$.

Let $M$ be the 1-complex obtained from the cubical network $N$ of Fig. 1 by suppressing $X$, $A_1$, $A_2$, $B_1$ and $B_2$. Let $T$ be any subnetwork of $N$, $N'_X$, $N_1$ or $N'_Z$, and let $T_0$ be its intersection with $M$ (which is contained in each of these four complexes). If we are told which of $a_1$, $a_2$, $b_1$, $b_2$ are contained in $T_0$ it is easy to determine for each of the four cubical networks how many subnetworks there are which agree with $T_0$ in $M$, and how many of these have 0 (or 1, or 2) odd nodes outside $T_0$. A consideration of the possible cases will show

\[ \pi_k(N) + \pi_k(N'_X) = \pi_k(N_1) + \pi_k(N'_Z), \tag{44} \]

whence $(-1)^{n(N)} \pi_k(N)$ satisfies (42) and is thus an F-function. If therefore we define a polynomial $D(N; x)$ by

\[ D(N; x) = \sum_k \pi_k(N)\, x^k \]


then $(-1)^{n(N)} D(N; x)$ will be an F-function. Further, by an argument analogous to the proof of (25) this F-function satisfies (43). If $N$ has no nodeless component, $\pi_0(N) = D(N; 0)$ is by its definition the number of solutions of Petersen's problem* for $N$.

We define a Hamiltonian circuit of $N$ as a subnetwork of $N$ which is connected and has no odd nodes. It is easily verified that the residue mod. 2 of the number of Hamiltonian circuits of $N$ satisfies (42), so this also is an F-function.

Let $\gamma_{i+1}$ $(i \geq 1)$ be a cubical network with just $2i$ nodes $a_1, a_2, a_3, \ldots, a_{2i}$, having just one arc linking each pair of nodes $a_r$, $a_{r+1}$ for which $r$ is odd, having just two arcs linking each pair of nodes $a_r$, $a_{r+1}$ for which $r$ is even, and having two arcs which are loops, the end-points of one coinciding in $a_1$ and those of the other in $a_{2i}$. The nodes and arcs define a linear graph which we also denote by $\gamma_{i+1}$.

THEOREM XIII. Any connected cubical network $N$ of $2n$ nodes $(n > 0)$ is A-equivalent to a homoeomorph of $\gamma_{n+1}$.

For first, if $N$, not being homoeomorphic to $\gamma_{n+1}$, contains a simple closed curve $K$ of $k > 1$ arcs, then $N$ is A-equivalent to a cubical network $N_1$ containing a simple closed curve of $k - 1$ arcs. For we can suppose that $K$ contains the arc $X$ (Fig. 1) and also $a_1$ and $b_1$. Then $N_1$ clearly has the property desired. It follows that by a sequence of A-operations we can convert $N$ into a cubical network having a loop.

Let $\theta_r$ be the 1-complex derived from $\gamma_{r+1}$ $(r > 0)$ by suppressing the loop on $a_{2r}$. If part of a cubical network $M$ meeting the rest of $M$ only in a single node is homoeomorphic with $\theta_r$, we call it a frond of $M$ of degree $r$, and say that the node corresponding to $a_{2r}$ is the base of the frond. The above argument showed that $N$ is A-equivalent to a cubical network $N_2$ having a frond $f$ (of degree $r$ say).

Secondly either $N_2$ contains a simple closed curve passing through the base of $f$, or it is A-equivalent to a cubical network having a frond of degree at least $r$ with a simple closed curve through its base. For if the base $c_0$ of $f$ is not on such a curve there will be a sequence $c_0, c_1, c_2, c_3, \ldots, c_s$ of minimum length such that consecutive nodes $c_i$, $c_{i+1}$ are linked by an arc $C_i$, and such that $c_s$ is on a simple closed curve $K_1$ in $N_2$. Otherwise we could extend the sequence $c_0, c_1, c_2, \ldots$ indefinitely in such a way that $C_i$ differed from $C_{i+1}$ for each $i$ without repetitions, which is absurd since $N_2$ has only a finite number of nodes. By A-operations on $C_0, C_1, \ldots$ in turn it is possible to transfer the frond to a base on a simple closed curve without altering its degree.

Now at this stage the simple closed curve through the base of the frond may be a loop, in which case $N$ has been transformed into a $\gamma_i$-homoeomorph, and $i = n + 1$ since connexion and number of nodes are invariant under A-operations; or it may contain just two arcs, in which case $N_2$ has been transformed into a cubical network having a frond of degree exceeding $r$; or it can be reduced to a curve of just two arcs by a sequence of A-operations on those of its arcs not meeting the base of the frond. Hence if $N_2$ is not homoeomorphic with $\gamma_{n+1}$ it can be transformed into a cubical network with a frond of degree greater than $r$. A finite number of such transformations will therefore change it into a homoeomorph of $\gamma_{n+1}$.

* Dénes König, Theorie der endlichen und unendlichen Graphen (Leipzig, 1936), p. 186.

THEOREM XIV. Let $F(N)$ be any F-function. Then there is a unique topologically invariant W-function $W(L)$ such that $W(L) = F(N)$ whenever $L$ is a linear graph on $N$.

For the linear graphs $\gamma_{i+1}$ may be taken as the linear graphs $x_{i+1}$ of Theorem VI. If we make the definitions $\gamma_0 = y_0$ and $\gamma_1 = y_1$ then the $\gamma_i$ clearly satisfy the conditions of Theorem VI, and so by Theorem VIII $\{L^*\}$ has a unique expression as a polynomial in the $\{\gamma_i\}$. Hence there is a unique topologically invariant W-function $W(L)$ which is equal to $F(N)$ whenever $N$ is a product of $\gamma_i$ and $L$ is on $N$. By Theorem XII there is a unique F-function $F_1(N)$ such that $W(L) = F_1(N)$ whenever $L$ is on $N$. But if the value of an F-function is given for every product of $\gamma_i$, then it is determined for all $N$. For by (42) if it is known for all $N$ such that $n(N) = p$ and for one cubical network $M$ such that $n(M) = p + 1$, then it is determined for any cubical network $M_1$ A-equivalent to a homoeomorph of $M$. By applying Theorem XIII to each component having a node we see that every cubical network is A-equivalent to a homoeomorph of a product of $\gamma_i$ and so the required result follows by induction. Since $F(N) = F_1(N)$ whenever $N$ is a product of $\gamma_i$ it follows that $F(N) = F_1(N)$ for every cubical network $N$. This proves the theorem.

COROLLARY. For an F-function satisfying (43) 'W-function' can be replaced by 'V-function' in the above argument.

As an example we mention an application of the above theory to the problem of functions obeying the law

\[ f(N) = f(N'_X) + f(M_0) \tag{45} \]

(see Fig. 1). By eliminating $f(M_0)$ from two equations of the form (45) it is easy to show that $f(N)$ is an F-function multiplied by $(-1)^{n(N)}$. Hence it is fixed when its values for the products of the $\gamma_i$ are given. But by applying (45) to these products we can show that for them $f(N) = 2^{n(N)} A$ where $A$ is a constant. Since $2^{n(N)} A$ is obviously a solution of (45) it follows that it is the general solution.

TRINITY COLLEGE CAMBRIDGE

Reprinted from Proc. Cambridge Phil. Soc. 43 (1947), 26-40

ANNALS OF MATHEMATICS

Vol. 51, No. 1, January, 1950

A DECOMPOSITION THEOREM FOR PARTIALLY ORDERED SETS

By R. P. DILWORTH

(Received August 23, 1948)

1. Introduction

Let $P$ be a partially ordered set. Two elements $a$ and $b$ of $P$ are comparable if either $a \geq b$ or $b \geq a$. Otherwise $a$ and $b$ are non-comparable. A subset $S$ of $P$ is independent if every two distinct elements of $S$ are non-comparable. $S$ is dependent if it contains two distinct elements which are comparable. A subset $C$ of $P$ is a chain if every two of its elements are comparable.

This paper will be devoted to the proof of the following theorem and some of its applications.

THEOREM 1.1. Let every set of $k + 1$ elements of a partially ordered set $P$ be dependent while at least one set of $k$ elements is independent. Then $P$ is a set sum of $k$ disjoint chains.¹

It should be noted that the first part of the hypothesis of the theorem is also necessary. For if $P$ is a set sum of $k$ chains and $S$ is any subset containing $k + 1$ elements, then at least one pair must belong to the same chain and hence be comparable.

Theorem 1.1 contains as a very special case the Radó-Hall theorem on representatives of sets (Hall [1]). Indeed, we shall derive from Theorem 1.1 a general theorem on representatives of subsets which contains the Kreweras (Kreweras [2]) generalization of the Radó-Hall theorem.

As a further application, Theorem 1.1 is used to prove the following imbedding theorem for distributive lattices.

THEOREM 1.2. Let $D$ be a finite distributive lattice. Let $k(a)$ be the number of distinct elements in $D$ which cover $a$ and let $k$ be the largest of the numbers $k(a)$. Then $D$ is a sublattice of a direct union of $k$ chains and $k$ is the smallest number for which such an imbedding holds.

¹ This theorem has a certain formal resemblance to a theorem of Menger on graphs (D. König, Theorie der endlichen und unendlichen Graphen, Leipzig, (1936)). Menger's theorem, however, is concerned with the characterization of the maximal number of disjoint, complete chains. Another type of representation of partially ordered sets in terms of chains has been considered by Dushnik and Miller [3] (see also Komm [4]). It can be shown that if $n$ is the maximal number of non-comparable elements, then the dimension of $P$ in the sense of Dushnik and Miller is at most $n$. Except for this fact, there seems to be little connection between the two representations.

2. Proof of Theorem 1.1.

We shall prove the theorem first for the case where $P$ is finite. The theorem in the general case will then follow by a transfinite argument. Hence let $P$ be a finite partially ordered set and let $k$ be the maximal number of independent elements. If $k = 1$, then every two elements of $P$ are comparable and $P$ is thus a chain. Hence the theorem is trivial in this case and we may make an argument by induction. Let us assume, then, that the theorem holds for all finite partially ordered sets for which the maximal number of independent elements is less than $k$. Now it will be sufficient to show that if $C_1, \ldots, C_k$ are $k$ disjoint chains of $P$ and if $a$ is an element belonging to none of the $C_i$, then $C_1 + \cdots + C_k + a$ is a set sum of $k$ disjoint chains. For beginning with a set $a_1, \ldots, a_k$ of independent elements (which exist by hypothesis) we may add one new element at a time and be sure that at each stage we have a set sum of $k$ disjoint chains. Since $P$ is finite, we finally have $P$ itself represented as a set sum of $k$ chains.

Let, then, $C_1, \ldots, C_k$ be $k$ disjoint chains and let $a$ be an element not belonging to $C_1 + \cdots + C_k$. Let $U_i$ be the set of all elements of $C_i$ which contain $a$, let $L_i$ be the set of all elements of $C_i$ which are contained in $a$, and let $N_i$ be the set of all elements of $C_i$ which are non-comparable with $a$. Finally let

\[ U = U_1 + \cdots + U_k, \qquad L = L_1 + \cdots + L_k, \]
\[ N = N_1 + \cdots + N_k, \qquad C = C_1 + \cdots + C_k. \]

Clearly $U_i + N_i + L_i = C_i$ and $U + N + L = C$. We show now that for some $m$ the maximal number of independent elements in $N + U - U_m$ is less than $k$. For suppose that for each $j$ there exists a set $S_j$ consisting of $k$ independent elements of $N + U - U_j$. Since there are $k$ elements in $S_j$ and they belong to $C = C_1 + \cdots + C_k$, there is exactly one element of $S_j$ in each of the chains $C_i$. Since $S_j$ contains no elements of $U_j$ it follows that $S_j$ contains exactly one element of $N_j$. Thus $S = S_1 + \cdots + S_k$ contains at least one element of $N_i$ for each $i$.

Now let $s_i$ be the minimal element of $S$ which belongs to $C_i$. $s_i$ exists since the intersection of $S$ and $C_i$ is a finite chain which we have proved to be non-empty. Furthermore, $s_i \in N_i$ since there is at least one element of $N_i$ which belongs to $S$ and all of the elements of $U_i$ properly contain all of the elements of $N_i$. Hence $s_1, \ldots, s_k \in N$. Now if $s_i \geq s_j$ for $i \neq j$, let $s_j \in S_r$. Since $S_r$ contains an element $t_i$ belonging to $C_i$, we have from the definition of $s_i$ that $t_i \geq s_i \geq s_j$ and $t_i \neq s_j$ since $t_i \in C_i$ and $s_j \in C_j$. But this contradicts our assumption that the elements of $S_r$ are independent. Hence we must have $s_i \not\geq s_j$ for $i \neq j$ and $s_1, \ldots, s_k$ form an independent set. But since $s_i$ belongs to $N$, $s_i$ is non-comparable with $a$ and hence $a, s_1, \ldots, s_k$ is an independent set containing $k + 1$ elements. But this contradicts the hypothesis of the theorem and hence we conclude that for some $m$, the maximal number of independent elements in $N + U - U_m$ is less than $k$. In an exactly dual manner it follows that for some $l$, the maximal number of independent elements in $N + L - L_l$ is less than $k$.

Now let $T$ be an independent subset of $C - U_m - L_l$. If $T$ contains an element $x$ belonging to $U - U_m$ and an element $y$ belonging to $L - L_l$, then $x \geq a \geq y$ contrary to the independence of $T$. Since

\[ (N + U - U_m) + (N + L - L_l) = C - U_m - L_l \]

it follows that $T$ is either a subset of $N + U - U_m$ or of $N + L - L_l$. Hence the number of elements in $T$ is less than $k$ and thus the maximal number of independent elements in $C - U_m - L_l$ is less than $k$. Since $U_m + L_l$ is a chain there is at least one independent set of $k - 1$ elements in $C - U_m - L_l$. Hence by the induction hypothesis $C - U_m - L_l = C'_1 + \cdots + C'_{k-1}$ where $C'_1, \ldots, C'_{k-1}$ are disjoint chains. Let $C'_k$ be the chain $U_m + a + L_l$. Then

\[ C + a = C'_1 + \cdots + C'_k \]

and our assertion is proved.

We turn now to the proof of the general case. Again when $k = 1$ the theorem is trivial and we may proceed by induction. Hence let the theorem hold for all partially ordered sets having at most $k - 1$ independent elements and let $P$ satisfy the hypotheses of the theorem. A subset $C$ of $P$ is said to be strongly dependent if for every finite subset $S$ of $P$, there is a representation of $S$ as a set sum of $k$ disjoint chains such that all of the elements of $C$ which belong to $S$ are members of the same chain. Clearly any strongly dependent subset is a chain. Also from the theorem in the finite case it follows that a set consisting of a single element is always strongly dependent. Since strong dependence is a finiteness property it follows from the Maximal Principle that $P$ contains a maximal strongly dependent subset $C_1$.

Suppose that $P - C_1$ contains $k$ independent elements $a_1, \ldots, a_k$. Then from the maximal property of $C_1$ we conclude that $C_1 + a_i$ is not strongly dependent for each $i$. Hence there exists a finite subset $S_i$ such that in any representation as a set sum of $k$ chains there are at least two chains which contain elements of $C_1 + a_i$. $S_i$ must clearly contain $a_i$ since $C_1$ is strongly dependent. Let $S = S_1 + \cdots + S_k$. By the strong dependence of $C_1$, $S = K_1 + \cdots + K_k$ where $K_1, \ldots, K_k$ are disjoint chains such that for some $n \leq k$ we have $S \cdot C_1 \subset K_n$. Since $S$ contains $a_1, \ldots, a_k$ which are independent, for some $m \leq k$ we have $a_m \in K_n$. Let $K'_i$ be the chain $S_m \cdot K_i$. Then $S_m = K'_1 + \cdots + K'_k$ and $S_m \cdot C_1 \subset S_m \cdot S \cdot C_1 \subset S_m \cdot K_n = K'_n$. But by definition $a_m \in S_m$ and $a_m \in K_n$. Hence $S_m \cdot (C_1 + a_m) \subset K'_n$ which contradicts the definition of $S_m$. We conclude that $P - C_1$ contains at most $k - 1$ independent elements. But since $C_1$ is a chain and $P$ contains a set of $k$ independent elements, it follows that $P - C_1$ contains a set of $k - 1$ independent elements. Thus by the induction hypothesis we have $P - C_1 = C_2 + \cdots + C_k$. Hence

\[ P = C_1 + \cdots + C_k \]

and the proof of the theorem is complete.

3. Application to representatives of sets.

G. Kreweras has proved the following extension of the Radó-Hall theorem on representatives of sets:

Let $\mathfrak{A}$ and $\mathfrak{B}$ be two partitions of a set into $n$ parts and let $h$ be the smallest number such that for any $r$, $r$ parts of $\mathfrak{A}$ contain at most $r + h$ parts of $\mathfrak{B}$. Let $k$ be the smallest number such that $n + k$ elements serve to represent both partitions. Then $h = k$.
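Theorem 1.1 can be checked on a small finite example. The following Python sketch is an illustration under stated assumptions, not Dilworth's inductive construction: it finds the largest independent set by brute force and builds a chain decomposition by a different, standard route, namely a maximum bipartite matching on the pairs $a < b$ found with augmenting paths. The function names and the choice of divisibility on $\{1, \ldots, 12\}$ as the test order are ours.

```python
from itertools import combinations


def width(elements, leq):
    """Size of the largest independent set (antichain), by brute force."""
    def independent(subset):
        return all(not leq(a, b) and not leq(b, a)
                   for a, b in combinations(subset, 2))

    for r in range(len(elements), 0, -1):
        if any(independent(s) for s in combinations(elements, r)):
            return r
    return 0


def chain_cover(elements, leq):
    """Split a finite poset into disjoint chains via maximum matching.

    succ[a] = b means b immediately follows a in its chain. Because the
    order is transitive, following succ links always yields a chain.
    """
    succ, pred = {}, {}

    def try_augment(a, seen):
        for b in elements:
            if a != b and leq(a, b) and b not in seen:
                seen.add(b)
                if b not in pred or try_augment(pred[b], seen):
                    succ[a] = b
                    pred[b] = a
                    return True
        return False

    for a in elements:
        try_augment(a, set())

    chains = []
    for a in elements:
        if a not in pred:              # a is the bottom of a chain
            chain = [a]
            while chain[-1] in succ:
                chain.append(succ[chain[-1]])
            chains.append(chain)
    return chains


def divides(a, b):
    return b % a == 0


poset = list(range(1, 13))             # 1..12 ordered by divisibility
chains = chain_cover(poset, divides)
print(width(poset, divides), len(chains), chains)
```

The number of chains produced equals the size of the largest independent set, as the theorem asserts. The brute-force `width` is exponential and only suitable for tiny posets.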

To show the power of Theorem 1.1 we shall prove an even more general theorem in which the partition requirement is dropped. Now if $\mathfrak{A}$ is any finite collection of subsets of a set $S$ we shall say that a set of $n$ elements (repetitions being counted) represents $\mathfrak{A}$ if there exists a one-to-one correspondence of the sets of $\mathfrak{A}$ onto a subset of the $n$ elements such that each set contains its corresponding element. For example, the set $\{1, 1, 1\}$ represents the three sets $\{1, 2\}$, $\{1, 3\}$, and $\{1, 4\}$. The theorem can then be stated as follows:

THEOREM 3.1. Let $\mathfrak{A}$ and $\mathfrak{B}$ be two finite collections of subsets of some set. Let $\mathfrak{A}$ and $\mathfrak{B}$ contain $m$ and $n$ sets respectively. Let $h$ be the smallest number such that for every $r$, the union of any $r + h$ sets of $\mathfrak{A}$ intersects at least $r$ sets of $\mathfrak{B}$. Let $k$ be the smallest number such that $n + k$ elements serve to represent both collections $\mathfrak{A}$ and $\mathfrak{B}$. Then $h = k$.

It can be easily verified that if $\mathfrak{A}$ and $\mathfrak{B}$ are partitions of a set, then $h$ as defined in Theorem 3.1 is equivalent to the definition given in the theorem of Kreweras.

For the proof let $\mathfrak{A}$ consist of sets $A_1, \ldots, A_m$ and $\mathfrak{B}$ consist of sets $B_1, \ldots, B_n$. We make the sets $A_1, \ldots, A_m, B_1, \ldots, B_n$ into a partially ordered set $P$ as follows:

\[ A_i \geq A_i, \qquad i = 1, \ldots, m, \]
\[ B_j \geq B_j, \qquad j = 1, \ldots, n, \]
\[ A_i \geq B_j \text{ if and only if } A_i \text{ and } B_j \text{ intersect.} \]

It is obvious that $P$ is a partially ordered set under this ordering. Now let $w$ be the maximal number of independent elements of $P$. Since the union of any $r + h$ sets of $\mathfrak{A}$ intersects at least $r$ sets of $\mathfrak{B}$, it follows that any independent subset of $P$ can have at most $r + h + (n - r) = n + h$ elements. Hence $w \leq n + h$. On the other hand for some $r$ there are $r + h$ sets of $\mathfrak{A}$ whose union intersects precisely $r$ sets of $\mathfrak{B}$. Hence these $r + h$ sets of $\mathfrak{A}$ and the remaining $n - r$ sets of $\mathfrak{B}$ form an independent subset of $P$ containing $n + h$ elements. Thus $w = n + h$.

By Theorem 1.1, $P$ is the set sum of $w$ chains $C_1, \ldots, C_w$. Now if a chain $C_i$ contains two sets they have a non-null intersection by definition. Hence for each $C_i$ there is an element $a_i$ common to the sets of $C_i$. But since $A_1, \ldots, A_m$ are independent in $P$ it follows that they belong to different chains and hence the $w$ elements $a_1, \ldots, a_w$ represent $\mathfrak{A}$. Similarly, $a_1, \ldots, a_w$ represent $\mathfrak{B}$ and thus $n + k \leq w$. But since $P$ cannot be represented as a set sum of less than $w$ chains, it follows that $n + k = w = n + h$. Hence $h = k$ and the theorem is proved.

4. Proof of Theorem 1.2.

Let us recall that an element $q$ of a finite distributive lattice $D$ is (union) irreducible if $q = x \cup y$ implies $q = x$ or $q = y$. It can be easily verified that if $q$ is irreducible, then $q \leq x \cup y$ implies $q \leq x$ or $q \leq y$. From the finiteness² of $D$ it follows that every element of $D$ can be expressed as a union of irreducible elements. From this fact we conclude that if $x > y$, there exists at least one irreducible $q$ such that $x \geq q$ and $y \not\geq q$.

² $D$ is assumed to be finite for sake of simplicity. The theorem holds without this restriction. In the proof, "elements covered by $a$" must be replaced by "maximal ideals in $a$" and "irreducible elements" must be replaced by "prime ideals."

Now let $P$ be the partially ordered set of union irreducible elements of $D$. Let $a$ be such that $k = k(a)$. Then there are $k$ elements $a_1, \ldots, a_k$ which cover $a$. Let $q_i$ be an irreducible such that $a_i \geq q_i$ and $a \not\geq q_i$. Then if $q_i \geq q_j$ where $i \neq j$ we have $a = a_i \cap a_j \geq q_i \cap q_j \geq q_j$ which contradicts $a \not\geq q_j$. Hence $q_1, \ldots, q_k$ are an independent set of elements of $P$.

Next let $q'_1, \ldots, q'_l$ be an arbitrary independent subset of $P$. Let $a' = q'_1 \cup \cdots \cup q'_l$ and for each $i$ let $p'_i = q'_1 \cup \cdots \cup q'_{i-1} \cup q'_{i+1} \cup \cdots \cup q'_l$. Now if $p'_i = a'$ for some $i$, then

\[ q'_i = q'_i \cap a' = q'_i \cap p'_i = (q'_i \cap q'_1) \cup \cdots \cup (q'_i \cap q'_{i-1}) \cup (q'_i \cap q'_{i+1}) \cup \cdots \cup (q'_i \cap q'_l) \]

and hence $q'_i = q'_i \cap q'_j$ for some $j \neq i$. But then $q'_j \geq q'_i$ contrary to independence. Thus $a' > p'_i$ for each $i$ and $p'_i \cup p'_j = a'$ for $i \neq j$. Let $a = p'_1 \cap \cdots \cap p'_l$ and for each $i$ let $p_i = p'_1 \cap \cdots \cap p'_{i-1} \cap p'_{i+1} \cap \cdots \cap p'_l$. If $p_i = a$, then $p'_i = p'_i \cup a = p'_i \cup p_i = (p'_i \cup p'_1) \cap \cdots \cap (p'_i \cup p'_{i-1}) \cap (p'_i \cup p'_{i+1}) \cap \cdots \cap (p'_i \cup p'_l) = a'$ which contradicts $p'_i < a'$. Hence $p_i > a$ and $p_i \cap p_j = a$ for $i \neq j$. Let $p_i \geq a_i$ where $a_i$ covers $a$. Then $a \leq a_i \cap a_j \leq p_i \cap p_j = a$ for $i \neq j$ and hence $a_i \cap a_j = a$, $i \neq j$. Thus $a_1, \ldots, a_l$ are distinct elements of $D$ covering $a$. It follows that $l \leq k$ and hence $k$ is the maximal number of independent elements of $P$.

Now by Theorem 1.1 $P$ is the set sum of $k$ disjoint chains $C_1, \ldots, C_k$. We adjoin the null element $z$ of $D$ to each of the chains $C_i$. Then for each $x \in D$, there is a unique maximal element $x_i$ in $C_i$ which is contained in $x$. Now suppose $x > x_1 \cup \cdots \cup x_k$ in $D$. Then there exists an irreducible $q$ such that $x \geq q$ and $x_1 \cup \cdots \cup x_k \not\geq q$. But $q \in C_i$ for some $i$ and hence $x_1 \cup \cdots \cup x_k \geq x_i \geq q$ contrary to the definition of $q$. Hence $x = x_1 \cup \cdots \cup x_k$. Consider the mapping of $D$ into the direct union of $C_1, \ldots, C_k$ given by

\[ x \to \{x_1, \ldots, x_k\}. \]

Now if $x_i = y_i$ for $i = 1, \ldots, k$, then $x = x_1 \cup \cdots \cup x_k = y_1 \cup \cdots \cup y_k = y$ and the mapping is thus one-to-one. Since $x \cup y \geq x_i \cup y_i$ we have $(x \cup y)_i \geq x_i \cup y_i$. But since $(x \cup y)_i$ is union irreducible we get $x \cup y \geq (x \cup y)_i \to x \geq (x \cup y)_i$ or $y \geq (x \cup y)_i \to x_i \geq (x \cup y)_i$ or $y_i \geq (x \cup y)_i \to x_i \cup y_i \geq (x \cup y)_i$. Thus $(x \cup y)_i = x_i \cup y_i$ and we have

\[ x \cup y \to \{x_1 \cup y_1, \ldots, x_k \cup y_k\}. \]

Similarly $x \cap y \geq x_i \cap y_i \to (x \cap y)_i \geq x_i \cap y_i$. But $x \geq x \cap y \to x_i \geq (x \cap y)_i$ and $y \geq x \cap y \to y_i \geq (x \cap y)_i$. Hence $x_i \cap y_i \geq (x \cap y)_i$. Thus $(x \cap y)_i = x_i \cap y_i$ and we have

\[ x \cap y \to \{x_1 \cap y_1, \ldots, x_k \cap y_k\}. \]

This completes the proof that D is isomorphic to a sublattice of a direct union of k chains.

Now suppose that $D$ is a sublattice of the direct union of $l$ chains $C'_1, \ldots, C'_l$ where $l < k$. Again let $a$ be such that $k(a) = k$ and let $a_1, \ldots, a_k$ be the $k$ distinct elements covering $a$. Define $a' = a_1 \cup \cdots \cup a_k$ and let $a'_i = a_1 \cup \cdots \cup a_{i-1} \cup a_{i+1} \cup \cdots \cup a_k$ for each $i$. Now $a' = q'_1 \cup \cdots \cup q'_l$ where $q'_i \in C'_i$. And if $q'_i = x' \cup y'$, then $q'_i = x'_i \cup y'_i$ where $x'_i, y'_i \in C'_i$. But then either $q'_i = x'_i \cup y'_i = x'_i$ or $q'_i = x'_i \cup y'_i = y'_i$ and hence either $q'_i = x'$ or $q'_i = y'$. Thus each $q'_i$ is union irreducible. But $a_1 \cup \cdots \cup a_k = a' \geq q'_i$ for $i = 1, \ldots, l$. Thus for each $i \leq l$ there is a $j$ such that $a_j \geq q'_i$. Since $l < k$ there is some $r$ such that $a'_r \geq q'_1 \cup \cdots \cup q'_l = a' \geq a_r$. But then $a_r = a'_r \cap a_r = a$ which contradicts the fact that $a_r$ covers $a$. Hence $l \geq k$ and we conclude that $k$ is the least number of chains whose direct union contains $D$ as a sublattice. This completes the proof of Theorem 1.2.
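The two quantities compared in the proof of Theorem 1.2 can be computed directly for a concrete lattice. The sketch below is an illustration only and takes $D$ to be the divisors of an integer ordered by divisibility, with union as least common multiple; this choice of example, the function names, and the exclusion of the null element 1 from the set of irreducibles are all assumptions made here. It checks that the largest value of $k(a)$ equals the maximal number of independent union irreducible elements.

```python
from itertools import combinations
from math import gcd


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def covers(lattice, a):
    """Elements of the lattice that cover a under divisibility."""
    above = [b for b in lattice if b != a and b % a == 0]
    return [b for b in above
            if not any(c != b and b % c == 0 for c in above)]


def union_irreducibles(lattice):
    """Elements q != 1 for which q = lcm(x, y) forces q = x or q = y."""
    def lcm(x, y):
        return x * y // gcd(x, y)

    return [q for q in lattice if q != 1 and
            all(q in (x, y)
                for x in lattice for y in lattice if lcm(x, y) == q)]


def width(elements):
    """Largest set of pairwise non-comparable elements, by brute force."""
    def independent(subset):
        return all(a % b != 0 and b % a != 0
                   for a, b in combinations(subset, 2))

    for r in range(len(elements), 0, -1):
        if any(independent(s) for s in combinations(elements, r)):
            return r
    return 0


for n in (12, 30, 360):
    D = divisors(n)
    k = max(len(covers(D, a)) for a in D)
    print(n, k, width(union_irreducibles(D)))
```

In this family of lattices the irreducibles are the prime powers dividing $n$, so both numbers equal the number of distinct primes of $n$, and the chains of Theorem 1.1 are the chains of powers of each prime.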

YALE UNIVERSITY CALIFORNIA INSTITUTE OF TECHNOLOGY REFERENCES

1. P. HALL. On representatives oj subsets. J. London Math. Soc. 10 (1935),26-30. 2. G. KREWERAS. Extension d'un theoreme sur les repartitions en classes. C. R. Acad. Sci. Paris 222 (1946), 431-432. 3. B. DUSHNIK AND E. W. MILLER. Partially ordered sets. Amer. J. of Math. vol. 63 (1941), 600-610. 4. H. KOMM. On the dimension of partially ordered sets. Amer. J. of Math. vol. 20 (1948), 507-520.


THE MARRIAGE PROBLEM.*

By PAUL R. HALMOS and HERBERT E. VAUGHAN.

In a recent issue of this journal Weyl¹ proved a combinatorial lemma which was apparently considered first by P. Hall.² Subsequently Everett and Whaples³ published another proof and a generalization of the same lemma. Their proof of the generalization appears to duplicate the usual proof of Tychonoff's theorem.⁴ The purpose of this note is to simplify the presentation by employing the statement rather than the proof of that result. At the same time we present a somewhat simpler proof of the original Hall lemma.

Suppose that each of a (possibly infinite) set of boys is acquainted with a finite set of girls. Under what conditions is it possible for each boy to marry one of his acquaintances? It is clearly necessary that every finite set of k boys be, collectively, acquainted with at least k girls; the Everett-Whaples result is that this condition is also sufficient.

We treat first the case (considered by Hall) in which the number of boys is finite, say n, and proceed by induction. For n = 1 the result is trivial. If n > 1 and if it happens that every set of k boys, 1 ≤ k < n, has at least k + 1 acquaintances, then an arbitrary one of the boys may marry any one of his acquaintances and refer the others to the induction hypothesis. If, on the other hand, some group of k boys, 1 ≤ k < n, has exactly k acquaintances, then this set of k may be married off by induction and, we assert, the remaining n - k boys satisfy the necessary condition with respect to the as yet unmarried girls. Indeed if 1 ≤ h ≤ n - k and if some set of h bachelors were to know fewer than h spinsters, then this set of h bachelors together with the k married men would have known fewer than k + h girls. An application of the induction hypothesis to the n - k bachelors concludes the proof in the finite case.

If the set B of boys is infinite, consider for each b in B the set G(b) of his acquaintances, topologized by the discrete topology, so that G(b) is a compact Hausdorff space. Write G for the topological Cartesian product of all G(b); by Tychonoff's theorem G is compact. If {b_1, ..., b_n} is any finite set of boys, consider the set H of all those elements g = g(b) of G for which g(b_i) ≠ g(b_j) whenever b_i ≠ b_j, i, j = 1, ..., n. The set H is a closed subset of G and, by the result for the finite case, H is not empty. Since a finite union of finite sets is finite, it follows that the class of all sets such as H has the finite intersection property and, consequently, has a non empty intersection. Since an element g = g(b) in this intersection is such that g(b') ≠ g(b'') whenever b' ≠ b'', the proof is complete.

It is perhaps worth remarking that this theorem furnishes the solution of the celebrated problem of the monks.⁵ Without entering into the history of this well-known problem, we state it and its solution in the language of the preceding discussion. A necessary and sufficient condition that each boy b may establish a harem consisting of n(b) of his acquaintances, n(b) = 1, 2, 3, ..., is that, for every finite subset B_0 of B, the total number of acquaintances of the members of B_0 be at least equal to Σ n(b), where the summation runs over every b in B_0. The proof of this seemingly more general assertion may be based on the device of replacing each b in B by n(b) replicas seeking conventional marriages, with the understanding that each replica of b is acquainted with exactly the same girls as b. Since the stated restriction on the function n implies that the replicas satisfy the Hall condition, an application of the Everett-Whaples theorem yields the desired result.

UNIVERSITY OF CHICAGO AND UNIVERSITY OF ILLINOIS.

* Received June 6, 1949.
1 H. Weyl, "Almost periodic invariant vector sets in a metric vector space," American Journal of Mathematics, vol. 71 (1949), pp. 178-205.
2 P. Hall, "On representation of subsets," Journal of the London Mathematical Society, vol. 10 (1935), pp. 26-30.
3 C. J. Everett and G. Whaples, "Representations of sequences of sets," American Journal of Mathematics, vol. 71 (1949), pp. 287-293. Cf. also M. Hall, "Distinct representatives of subsets," Bulletin of the American Mathematical Society, vol. 54 (1948), pp. 922-926.
4 C. Chevalley and O. Frink, Jr., "Bicompactness of Cartesian products," Bulletin of the American Mathematical Society, vol. 47 (1941), pp. 612-614.
5 H. Balzac, Les Cent Contes Drôlatiques, IV, 9: Des moines et novices, Paris (1849).
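The induction used for the finite case can be followed step by step in code. The following is only a sketch under stated assumptions: the names `marry`, `acquaintances` and `knows` are hypothetical, the acquaintance relation is assumed to be given as a dictionary from boys to finite sets of girls, and the exhaustive search for a critical group is chosen for fidelity to the argument, not for efficiency.

```python
from itertools import combinations


def acquaintances(group, knows):
    """All girls known to at least one boy of the group."""
    girls = set()
    for boy in group:
        girls |= knows[boy]
    return girls


def marry(knows):
    """Return a dict boy -> girl with distinct girls, or None if the
    necessary condition (k boys know at least k girls) fails.

    `knows` maps each boy to the set of girls he is acquainted with.
    """
    boys = sorted(knows)
    n = len(boys)
    if n == 0:
        return {}
    if n == 1:
        options = knows[boys[0]]
        return {boys[0]: min(options)} if options else None
    for k in range(1, n):
        for group in combinations(boys, k):
            girls = acquaintances(group, knows)
            if len(girls) < k:
                return None  # k boys know fewer than k girls
            if len(girls) == k:
                # Some k boys know exactly k girls: marry them off first,
                # then treat the bachelors with respect to the spinsters.
                inside = marry({b: knows[b] for b in group})
                rest = marry({b: knows[b] - girls
                              for b in boys if b not in group})
                if inside is None or rest is None:
                    return None
                return {**inside, **rest}
    # Every k boys (k < n) know at least k + 1 girls, so an arbitrary
    # boy may marry any one of his acquaintances.
    first = boys[0]
    girl = min(knows[first])
    rest = marry({b: knows[b] - {girl} for b in boys[1:]})
    if rest is None:
        return None
    return {first: girl, **rest}
```

For instance, `marry({"a": {"x", "y"}, "b": {"x"}, "c": {"y", "z"}})` first marries the critical boy `b` and then the other two, while two boys who both know only the same girl yield `None`.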

Reprinted from Amer. J. Math. 72 (1950), 214-215


CIRCUITS AND TREES IN ORIENTED LINEAR GRAPHS by T. van Aardenne-Ehrenfest (Dordrecht) and N. G. de Bruijn (Delft)

§ 1. $P_n^{(\sigma)}$-cycles. In this § we state the problem which gave rise to our investigations about graphs. The further contents of the paper are independent of this § 1.

Consider a set of $\sigma$ figures $1, 2, \dots, \sigma$, and let $n$ be a natural number. A sequence of $n$ figures will be called an $n$-tuple. Clearly, there are $\sigma^n$ different $n$-tuples. An oriented circular array, consisting of $\sigma^n$ figures, will be called a $P_n^{(\sigma)}$-cycle, whenever it has the property that each $n$-tuple occurs exactly once as a set of $n$ consecutive figures of the cycle. An example, with $\sigma = 3$, $n = 2$ is the cycle 1 1 2 2 3 3 1 3 2. 1)

The existence of $P_n^{(\sigma)}$-cycles, for arbitrary values of $\sigma$ and $n$, was proved by M. H. MARTIN [3], I. J. GOOD [2] and D. REES [4]. One of us showed ([1]) that, for $\sigma = 2$, the number of different $P_n^{(2)}$-cycles equals $2^{f(n)}$, $f(n) = 2^{n-1} - n$. This result was derived as follows. The number of $P_n^{(2)}$-cycles can be interpreted as the number of circuits in a certain graph $N_{n+1}$ (compare also [2]). The graph $N_{n+1}$ can be obtained by a certain operation from $N_n$, and by a general theorem on circuits in oriented graphs the number of circuits of $N_{n+1}$ could be expressed in the number of circuits of $N_n$. This theorem on graphs was proved in [1] only for the case that at any vertex 2 edges point outward and 2 inward.

In the present paper we shall deal, among other things, with the general case (theorem 4). This result immediately enables us to determine the number of $P_n^{(\sigma)}$-cycles for arbitrary $\sigma$. Referring to [1] for details, we only state the result: The number of different $P_n^{(\sigma)}$-cycles is $\sigma^{-n} (\sigma!)^q$, where $q = \sigma^{n-1}$. For example, there are 24 $P_2^{(3)}$-cycles. Six of them are

1 1 2 3 3 2 2 1 3     1 1 2 3 2 2 1 3 3     1 1 2 2 3 3 2 1 3
1 1 2 2 3 2 1 3 3     1 1 2 2 3 3 1 3 2     1 1 2 2 3 1 3 3 2

1) It has to be understood that these figures have to be placed around an oriented circle. Therefore, 21 is one of the 2-tuples occurring in the cycle. Naturally, 112233132 and 331321122 are considered as one and the same cycle, but 112313322, which has the reversed order, is a different one.
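The count stated in § 1 can be checked by exhaustive search for very small values of the two parameters. The sketch below is an illustration only; the function names `count_cycles` and `formula` are hypothetical, and the search enumerates all $\sigma^{\sigma^n}$ sequences, so it is usable only for tiny cases.

```python
from itertools import product
from math import factorial


def count_cycles(sigma, n):
    """Count oriented circular arrays of sigma**n figures in which every
    n-tuple occurs exactly once as n consecutive figures."""
    length = sigma ** n
    good = 0
    for seq in product(range(1, sigma + 1), repeat=length):
        windows = {tuple(seq[(i + j) % length] for j in range(n))
                   for i in range(length)}
        if len(windows) == length:
            good += 1
    # Every cycle is found once per rotation, and its `length` rotations
    # are distinct sequences because the n-tuples at different positions
    # differ.
    return good // length


def formula(sigma, n):
    """The closed form sigma**(-n) * (sigma!)**q with q = sigma**(n-1)."""
    return factorial(sigma) ** (sigma ** (n - 1)) // sigma ** n
```

With these definitions `count_cycles(3, 2)` and `formula(3, 2)` both give 24, in agreement with the example in the text.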

Another six are obtained from these by interchanging the figures 2 and 3 everywhere. By reversing the orientation, 12 new cycles arise.

§ 2. Preliminaries about permutation groups.

Let $\mathfrak{S}_m$ be the symmetric group of degree $m$, that is, the group of all $m!$ permutations of a set $E_m$ of $m$ objects. If $\mathfrak{A}$ is a subset of $\mathfrak{S}_m$ then the number of cyclic permutations in $\mathfrak{A}$ will be denoted by $|\mathfrak{A}|$, and the total number of elements in $\mathfrak{A}$ by $n(\mathfrak{A})$.

A subset $\mathfrak{A}$ of $\mathfrak{S}_m$ will be called a D-set (in $\mathfrak{S}_m$), whenever it has the property that $|S\mathfrak{A}|$ has the same value for all $S \in \mathfrak{S}_m$. It is easily seen that, in that case, we have $|S\mathfrak{A}| = m^{-1}\, n(\mathfrak{A})$. For, if $C$ is any cyclic permutation, then there are exactly $n(\mathfrak{A})$ possibilities for $S$ such that $S\mathfrak{A}$ contains $C$, and it follows that $m!\,|S\mathfrak{A}| = (m-1)!\,n(\mathfrak{A})$. Furthermore, it may be remarked that $|S\mathfrak{A}| = |\mathfrak{A}S|$, since $SBS^{-1}$ is cyclic whenever $B$ is cyclic. Therefore, if $\mathfrak{A}$ is a D-set and if $P$ is an arbitrary element of $\mathfrak{S}_m$, then $\mathfrak{A}P$ is also a D-set. $\mathfrak{S}_m$ itself clearly is a D-set in $\mathfrak{S}_m$, but theorem 1 will show that non-trivial D-sets exist.

Let $E_l$ be a sub-set of the set of objects $E_m$, containing $l$ objects. Consider the sub-group $\mathfrak{G} \subset \mathfrak{S}_m$ of all permutations which only permute the elements of $E_l$, leaving the remaining elements of $E_m$ invariant. If $G$ is any permutation of $\mathfrak{G}$, then $\bar{G}$ denotes the corresponding permutation of the objects of $E_l$, that is to say, we disregard the objects belonging to $E_m - E_l$ (which are invariant under $G$). $\bar{G}$ is defined uniquely by $G$, and vice versa. The same notation will be used for sets: if $\mathfrak{A} \subset \mathfrak{G}$, then $\bar{\mathfrak{A}}$ denotes the set of all $\bar{G}$, where $G \in \mathfrak{A}$.

Lemma 1. Let $\mathfrak{B}$ be a sub-set of $\mathfrak{G}$ such that $\bar{\mathfrak{B}}$ is a D-set in $\bar{\mathfrak{G}}$, and let $C \in \mathfrak{S}_m$ be a cyclic permutation. Then we have

$$|\mathfrak{B}C| = l^{-1} \cdot n(\mathfrak{B}).$$

Proof. We shall deal with the cyclic representations of the permutations involved. Let $\bar{G}$ be the element of $\bar{\mathfrak{G}}$ whose cyclic representation is obtained by cancelling the objects of $E_m - E_l$ from the cyclic representation of $C$. Further, let $G_1$ be an arbitrary permutation of $\mathfrak{G}$. Then it is easily verified that $\bar{G}_1\bar{G}$ (of degree $l$) shows the same number of cycles as $G_1C$ (of degree $m$). Hence $G_1C$ is cyclic whenever $\bar{G}_1\bar{G}$ is cyclic. Therefore

$$|\mathfrak{B}C| = |\bar{\mathfrak{B}}\bar{G}| = l^{-1} \cdot n(\bar{\mathfrak{B}}) = l^{-1} \cdot n(\mathfrak{B}).$$

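Lemma 2. Let $\mathfrak{B}$ be a subset of $\mathfrak{G}$ such that $\bar{\mathfrak{B}}$ is a D-set in $\bar{\mathfrak{G}}$. Let $Q$ be any arbitrary permutation of $\mathfrak{S}_m$. Then we have

$$\frac{|\mathfrak{B}Q|}{n(\mathfrak{B})} = \frac{|\mathfrak{G}Q|}{n(\mathfrak{G})}.$$

Proof. If there is no $G \in \mathfrak{G}$ such that $GQ$ is cyclic, then both sides are equal to zero. Now assume that $G \in \mathfrak{G}$ is such that $GQ$ is cyclic; put $GQ = C$. We have $\mathfrak{G}Q = \mathfrak{G}C$, and $\mathfrak{B}Q = (\mathfrak{B}G^{-1})C$. The set $\bar{\mathfrak{B}}_1 = \bar{\mathfrak{B}}\bar{G}^{-1}$ is a D-set in $\bar{\mathfrak{G}}$, since $\bar{\mathfrak{B}}$ was a D-set in $\bar{\mathfrak{G}}$. Now, by lemma 1,

$$|\mathfrak{B}Q| = |\mathfrak{B}_1C| = l^{-1} \cdot n(\mathfrak{B}_1) = l^{-1} \cdot n(\mathfrak{B}).$$

Analogously

$$|\mathfrak{G}Q| = |\mathfrak{G}C| = l^{-1} \cdot n(\mathfrak{G}),$$

and lemma 2 has been proved.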

Let $k$ and $n$ be natural numbers, and take $m = kn$. We consider a set $E_m$ of $m$ objects, divided into $k$ systems, each of them containing $n$ objects. We shall again denote by $\mathfrak{S}_m$ the group of all permutations of $E_m$. $\mathfrak{H}$ denotes the group consisting of all $k!\,(n!)^k$ permutations $H$ with the property that $Ha$ and $Hb$ belong to the same system whenever $a$ and $b$ belong to the same system. Or, shortly, $H$ transforms systems into systems.

Theorem 1. $\mathfrak{H}$ is a D-set in $\mathfrak{S}_m$.

Proof. If either $k = 1$ or $n = 1$, then we have $\mathfrak{H} = \mathfrak{S}_m$, and the theorem is trivial. Next we shall deal with the case $k = 2$, $m = 2n$. It has to be shown that $|S\mathfrak{H}|$ does not depend on $S$ ($S \in \mathfrak{S}_{2n}$). Let $\mathfrak{H}_1$ be the set of all permutations mapping the first system onto itself, and let $\mathfrak{H}_2$ be the set of those mapping the first system onto the second. Thus $\mathfrak{H} = \mathfrak{H}_1 + \mathfrak{H}_2$. Let $p$ be the number of objects of the first system mapped into the first system by $S$, then there are $q = n - p$ objects of the first system which are mapped into the second system. Then we have

$$|S\mathfrak{H}_1| = q\,\{(n-1)!\}^2.$$

This can be seen, for instance, by interpreting $|S\mathfrak{H}_1|$ as the number of circuits (see § 3) in the following graph. Take two vertices, $A$ and $B$, and $2n$ oriented edges: $p$ of them from $A$ to $A$, $p$ from $B$ to $B$, $q$ from $A$ to $B$ and $q$ from $B$ to $A$. The number of circuits can be shown to be $q\,\{(n-1)!\}^2$. It can be very rapidly determined by theorem 6, for there are exactly $q$ trees with root $A$.

$\mathfrak{H}_2$ can be written as $S_0\mathfrak{H}_1$, where $S_0$ is an arbitrary element of $\mathfrak{H}_2$. Now $|S\mathfrak{H}_2| = |S_1\mathfrak{H}_1|$, where $S_1 = SS_0$. $S_1$ has the same nature as $S$, apart from the fact that $p$ and $q$ changed their roles. Hence $|S\mathfrak{H}_2| = p\,\{(n-1)!\}^2$, and so

$$|S\mathfrak{H}| = |S\mathfrak{H}_1| + |S\mathfrak{H}_2| = (p + q)\{(n-1)!\}^2 = (n!)^2/n.$$

This does not depend on $S$, and so our theorem has been proved in the case $k = 2$.

Next we consider the general case $k > 2$. We have to show that $|S_1\mathfrak{H}| = |S_2\mathfrak{H}|$ for any pair $S_1$, $S_2$ ($S_1 \in \mathfrak{S}_m$, $S_2 \in \mathfrak{S}_m$). Since any $S \in \mathfrak{S}_m$ can be written as a product of transpositions, it is sufficient to prove that $|S\mathfrak{H}| = |ST\mathfrak{H}|$ for all $S$ and for any transposition $T$. Or, what is the same thing, that

$$|\mathfrak{H}Q| = |T\mathfrak{H}Q| \tag{2.1}$$

for any $Q \in \mathfrak{S}_m$ and any transposition $T \in \mathfrak{S}_m$. We may assume that $T$ interchanges two symbols belonging to the first and to the second system, respectively (if $T$ interchanges two symbols of the same system, then we have $\mathfrak{H} = T\mathfrak{H}$, and (2.1) is trivial).

Let $\mathfrak{H}^*$ be the sub-group of $\mathfrak{H}$ consisting of all permutations of $\mathfrak{H}$ which leave all individual elements of the 3rd, 4th, ..., $k$-th system invariant, and let $\mathfrak{G}$ be the group arising from $\mathfrak{S}_m$ in the same manner. We now apply lemma 2, with $l = 2n$ and $\mathfrak{B} = \mathfrak{H}^*$. Since the theorem has been proved for the case $k = 2$, we know that $\bar{\mathfrak{H}}^*$ is a D-set in $\bar{\mathfrak{G}}$. Therefore

$$\frac{|\mathfrak{H}^*Q|}{n(\mathfrak{H}^*)} = \frac{|\mathfrak{G}Q|}{n(\mathfrak{G})}, \qquad \frac{|T\mathfrak{H}^*Q|}{n(\mathfrak{H}^*)} = \frac{|T\mathfrak{G}Q|}{n(\mathfrak{G})}.$$

Evidently $T\mathfrak{G} = \mathfrak{G}$, and so $|\mathfrak{H}^*Q| = |T\mathfrak{H}^*Q|$ for all $Q \in \mathfrak{S}_m$. Since $\mathfrak{H}^*$ is a sub-group of $\mathfrak{H}$, we can split $\mathfrak{H}$ into classes, $\mathfrak{H} = \sum \mathfrak{H}^*Q_i$, and now (2.1) follows immediately.

The order of the group $\mathfrak{H}$ is $k!\,(n!)^k$, and therefore

$$|\mathfrak{H}| = m^{-1}\, n(\mathfrak{H}) = m^{-1}\, k!\,(n!)^k = n^{-1}\,(k-1)!\,(n!)^k. \tag{2.2}$$

Let $\mathfrak{K}$ be the set of all permutations $K$ with the property that the $n$ objects of each system are transformed into objects of $n$ different systems. In other words, $K$ is such that, if $a$ and $b$ belong to the same system, then $Ka$ and $Kb$ belong to different systems. Clearly $\mathfrak{K}$ is empty if $k < n$. It is not difficult to show that $\mathfrak{K}$ is a D-set. For, if $H$ is an arbitrary permutation of $\mathfrak{H}$, we have $\mathfrak{K} = \mathfrak{K}H$. It follows that $\mathfrak{K}$ is the sum of a number of left-classes mod $\mathfrak{H}$: $\mathfrak{K} = \sum K_i\mathfrak{H}$. Each component $K_i\mathfrak{H}$ is a D-set, by theorem 1. Hence $\mathfrak{K}$ is a D-set. It is easily seen that in the special case $k = n$ the number of elements in $\mathfrak{K}$ is $(n!)^{2n}$, and so we have

$$|\mathfrak{K}| = (n!)^{2n}\, n^{-2} \qquad (k = n). \tag{2.3}$$
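The D-set property of the system-preserving group and of the system-separating set can be checked by brute force for very small $k$ and $n$. The sketch below is an illustration under assumptions: objects are numbered $0, \dots, kn-1$ with object $x$ placed in system `x // n`, permutations are tuples, the product is taken as "apply the right factor first" (the count of cyclic elements is the same for the other convention, by the remark $|S\mathfrak{A}| = |\mathfrak{A}S|$), and all function names are hypothetical.

```python
from itertools import permutations


def compose(p, q):
    """The permutation 'first q, then p' (tuples mapping index -> image)."""
    return tuple(p[q[i]] for i in range(len(q)))


def is_cyclic(p):
    """True if p is a single cycle through all objects."""
    i, steps = p[0], 1
    while i != 0:
        i = p[i]
        steps += 1
    return steps == len(p)


def system_sets(k, n):
    """All permutations of k*n objects, the system-preserving ones (H)
    and the system-separating ones (K)."""
    everything = list(permutations(range(k * n)))

    def images(h, s):
        # Systems hit by the images of the n objects of system s.
        return {h[x] // n for x in range(s * n, (s + 1) * n)}

    H = [h for h in everything
         if all(len(images(h, s)) == 1 for s in range(k))]
    K = [h for h in everything
         if all(len(images(h, s)) == n for s in range(k))]
    return everything, H, K


def cyclic_counts(everything, subset):
    """The set of values taken by the number of cyclic elements of S*subset."""
    return {sum(is_cyclic(compose(s, a)) for a in subset)
            for s in everything}
```

For $k = n = 2$ the group has 8 elements and every translate contains 2 cyclic permutations, in agreement with (2.2); the separating set has 16 elements and every translate contains 4, in agreement with (2.3).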

§ 3. T-Graphs.

In §§ 3, 4, 5, 6 we shall be mainly concerned with a special type of finite oriented linear graphs, called T-graphs 1). These have the property that, at each vertex $P_i$, the number $\sigma_i$ of oriented edges pointing to $P_i$ equals the number of edges pointing away from $P_i$. For simplicity we assume $\sigma_i > 0$ for all $i$. If this number happens to be the same for all vertices ($\sigma_i = \sigma$ for all $i$) then we shall call the graph a $T^{(\sigma)}$ 2). We do not exclude the possibility that, in a T-graph, several different edges point from $P_i$ to $P_j$, and we neither exclude edges pointing from $P_i$ to $P_i$ itself (closed loops).

Therefore, a T-graph can be interpreted as a pair of mappings of a finite set of edges $\{e_1, \dots, e_m\}$ onto a finite set of vertices $\{P_1, \dots, P_N\}$ such that each vertex is the image of the same number of edges in both mappings. The first mapping maps every edge onto the point where it starts from, and the second one onto the point where it terminates. We shall call these vertices the tail and the head of the edge, respectively. If the head of $e_i$ coincides with the tail of $e_j$, then $e_i$ and $e_j$ will be called consecutive (which does not imply that $e_j$ and $e_i$ are consecutive).

By a complete circuit (a circuit for short) is meant any cyclic arrangement of the set of edges in such a manner that the head of each edge coincides with the tail of the next one in the circuit. Or, in other words, such that consecutive edges in the circuit are consecutive in the graph. Naturally, two circuits are considered as identical whenever the first one is a cyclic permutation of the second. It has to be understood that the order of the edges counts, and not only the order of the heads. So, for instance, if $m = 3$, $N = 1$, then $P_1$ is the head as well as the tail of all edges. There are two different circuits, viz. $(e_1, e_2, e_3)$ and $(e_1, e_3, e_2)$.

1) Tutte [5] calls them simple oriented networks.
2) In the paper [1] the name "T-net" denoted the same thing as $T^{(2)}$ does in our present notation.

The number of circuits of a graph $T$ will be denoted by $|T|$ 1). A permutation $P$ of the set of edges $e_1, \dots, e_m$ will be called conservative (with respect to $T$), whenever $Pe_i = e_j$ always implies that the head of $e_i$ coincides with the tail of $e_j$. We choose one special conservative permutation $A_0$, arbitrary, but fixed in the sequel. The set of all conservative permutations of $T$ can be represented as $\mathfrak{G}A_0$, where $\mathfrak{G}$ is the group of all permutations which leave the tails of all edges invariant. Evidently, any circuit determines a cyclic conservative permutation, and vice versa. Therefore,

$$|T| = |\mathfrak{G}A_0|.$$

This simple relation between the number of circuits in a graph and the number of cyclic permutations in a set explains why we choose the same notation $|\ |$ for both.

Consider a vertex $P_i$ where $\sigma_i$ edges start and $\sigma_i$ edges terminate. By the local symmetric group $\mathfrak{G}_i$ we shall denote the group of all $\sigma_i!$ permutations which permute the $\sigma_i$ edges whose tail is $P_i$, but which leave invariant all edges whose tail is not $P_i$. Clearly $\mathfrak{G}$ is the direct product of $\mathfrak{G}_1, \dots, \mathfrak{G}_N$.

§ 4. Traffic regulations.

We shall also consider circuits described under certain restrictive conditions, called traffic regulations. Let $\mathfrak{B}_1, \dots, \mathfrak{B}_N$ be sub-sets of $\mathfrak{G}_1, \dots, \mathfrak{G}_N$, respectively, and construct the set

$$\mathfrak{B} = \mathfrak{B}_1 \times \mathfrak{B}_2 \times \dots \times \mathfrak{B}_N, \tag{4.1}$$

defined in the same way as the direct product

$$\mathfrak{G} = \mathfrak{G}_1 \times \mathfrak{G}_2 \times \dots \times \mathfrak{G}_N. \tag{4.2}$$

Now a circuit described under the traffic regulation $\mathfrak{B}$ is defined as a circuit corresponding to a permutation $BA_0$, where $B \in \mathfrak{B}$, and $A_0$ is the fixed permutation chosen in § 3. Denoting the number of circuits described under the traffic regulation $\mathfrak{B}$ by $|T|_{\mathfrak{B}}$, we have

$$|T| = |T|_{\mathfrak{G}}, \qquad |T|_{\mathfrak{B}} = |\mathfrak{B}A_0|. \tag{4.3}$$

1) We have $|T| > 0$ if and only if $T$ is connected (see [2]). For non-connected graphs our theorems are trivial. Nevertheless, all our proofs are valid for that case also.

The traffic regulation (4.1) will be called regular if, for each $i$, $\bar{\mathfrak{B}}_i$ is a D-set in $\bar{\mathfrak{G}}_i$ 1).

Theorem 2. 2) If $\mathfrak{B}$ is regular, then we have

$$\frac{1}{n(\mathfrak{B})}\,|T|_{\mathfrak{B}} = \frac{1}{n(\mathfrak{G})}\,|T|_{\mathfrak{G}}, \tag{4.4}$$

where $n(\mathfrak{B})$ and $n(\mathfrak{G})$ denote the number of elements of $\mathfrak{B}$ and $\mathfrak{G}$, respectively.

Proof. Since $\mathfrak{G}_i$ itself satisfies the condition imposed on $\mathfrak{B}_i$, it is sufficient to show that the value of the left-hand-side of (4.4) does not change if some $\mathfrak{B}_i$ is replaced by the corresponding $\mathfrak{G}_i$. If this has been proved, we can replace all $\mathfrak{B}_i$'s by $\mathfrak{G}_i$'s one after the other, and (4.4) follows. To this end we consider $\mathfrak{B}$, defined by (4.1) and $\mathfrak{B}^*$, defined by

$$\mathfrak{B}^* = \mathfrak{G}_1 \times \mathfrak{B}_2 \times \mathfrak{B}_3 \times \dots \times \mathfrak{B}_N, \tag{4.5}$$

and we have to show that

$$|T|_{\mathfrak{B}} : |T|_{\mathfrak{B}^*} = n(\mathfrak{B}_1) : n(\mathfrak{G}_1), \tag{4.6}$$

for the latter ratio equals $n(\mathfrak{B}) : n(\mathfrak{B}^*)$. Referring to (4.3) we write

$$|T|_{\mathfrak{B}} = \sum |\mathfrak{B}_1 B_2 \dots B_N A_0|, \qquad |T|_{\mathfrak{B}^*} = \sum |\mathfrak{G}_1 B_2 \dots B_N A_0|, \tag{4.7}$$

where, in both sums, $B_2, \dots, B_N$ run independently through the elements of $\mathfrak{B}_2, \dots, \mathfrak{B}_N$, respectively. If we put $B_2 \dots B_N A_0 = Q$, then we have, by lemma 2,

$$|\mathfrak{B}_1 Q| : |\mathfrak{G}_1 Q| = n(\mathfrak{B}_1) : n(\mathfrak{G}_1).$$

Applying this to each pair of corresponding terms of the sums in (4.7), we obtain (4.6).

§ 5. Special traffic regulations.

Let $T$ be a T-graph with $N$ vertices and $m$ edges. Again, the numbers of edges pointing towards $P_1, \dots, P_N$ are denoted by $\sigma_1, \dots, \sigma_N$, respectively, and so $m = \sigma_1 + \dots + \sigma_N$. Let $\lambda$ be a positive integer. Then by $T_\lambda$ 3) we denote the graph which arises from $T$ if we replace any edge $P_iP_j$ of $T$ by $\lambda$ edges $P_iP_j$, with the same orientation. Hence $T_\lambda$ has $N$ vertices and $\lambda m$ edges. The edges of $T_\lambda$ arising from one and the same edge of $T$ are said to form a bundle.

1) As in § 2, the bar indicates that the permutations are considered as permutations of the $\sigma_i$ edges whose tail is $P_i$, whereas the other edges are disregarded.
2) This theorem was used implicitly in [1].
3) The notations $T_\sigma$ and $T^{(\sigma)}$ (for the latter see § 3) must not be confused.

In $T_\lambda$ we shall consider several possible traffic regulations. We first choose a fixed conservative permutation $A_0$ which transforms bundles into bundles. A traffic regulation will be obtained by choosing, at each vertex $P_i$, a set $\mathfrak{B}_i$ of permutations of the $\lambda\sigma_i$ edges starting from that vertex. We shall consider three possibilities, all regular in the sense of § 4.

1°. $\mathfrak{B}_i = \mathfrak{G}_i$, where $\mathfrak{G}_i$ is the local symmetric group (of order $(\lambda\sigma_i)!$).

2°. $\mathfrak{B}_i = \mathfrak{H}_i$. Here $\mathfrak{H}_i$ is the sub-set of $\mathfrak{G}_i$ which transforms bundles into bundles. In other words, as to the edges whose tail is $P_i$ it acts like the group $\mathfrak{H}$ of theorem 1, where the systems are given by the bundles. Thus $n = \lambda$, $k = \sigma_i$.

3°. $\mathfrak{B}_i = \mathfrak{K}_i$. Here $\mathfrak{K}_i$ is the sub-set of $\mathfrak{G}_i$ consisting of the permutations which transform the edges of each outgoing bundle at $P_i$ into sets of edges belonging to $\lambda$ different bundles (see the end of § 2).

We have, by theorem 2,

$$\frac{|T_\lambda|_{\mathfrak{G}}}{\prod_{i=1}^{N} (\lambda\sigma_i)!} = \frac{|T_\lambda|_{\mathfrak{H}}}{\prod_{i=1}^{N} \sigma_i!\,(\lambda!)^{\sigma_i}} = \frac{|T_\lambda|_{\mathfrak{K}}}{\prod_{i=1}^{N} \varphi(\lambda, \sigma_i)}, \tag{5.1}$$

where $\varphi(\lambda, \sigma_i)$ is the number of elements of $\mathfrak{K}_i$.

As stated in § 4, we have $|T_\lambda|_{\mathfrak{G}} = |T_\lambda|$. The number $|T_\lambda|_{\mathfrak{H}}$ can be connected with the number of circuits in $T$ itself. To this end we consider a circuit of $T_\lambda$ described according to the traffic regulation $\mathfrak{H}$. At any stage, the bundle to which an edge belongs only depends on the bundle containing the preceding edge. Therefore, the sequence of bundles described by the circuit is periodic mod $m$, and any bundle is used exactly $\lambda$ times. It follows that each circuit under consideration defines a circuit of $T$. Conversely, it is easily seen that each circuit of $T$ arises from $\lambda^{-1}(\lambda!)^m$ different circuits of $T_\lambda$ in this manner. Hence

$$|T_\lambda|_{\mathfrak{H}} = \lambda^{-1}\,(\lambda!)^m \cdot |T|, \tag{5.2}$$

and so we obtain from (5.1)

Theorem 3.

$$|T_\lambda| = \lambda^{-1} \cdot |T| \cdot \prod_{i=1}^{N} \frac{(\lambda\sigma_i)!}{\sigma_i!}.$$

We shall now make the restriction that $\lambda = \sigma$, and that $T$ is a $T^{(\sigma)}$, that is to say $\sigma_1 = \dots = \sigma_N = \sigma = \lambda$, $m = N\sigma$. Then we have (see (2.3))

$$\varphi(\lambda, \sigma_i) = (\sigma!)^{2\sigma}$$

and now (5.1) and (5.2) lead to

$$|T_\sigma|_{\mathfrak{K}} = |T| \cdot \sigma^{-1}\,(\sigma!)^m \cdot \left( \frac{(\sigma!)^{2\sigma}}{\sigma!\,(\sigma!)^{\sigma}} \right)^{N} = |T| \cdot \sigma^{-1}\,(\sigma!)^{N(2\sigma-1)}. \tag{5.3}$$

If $T$ is a $T^{(\sigma)}$, with $N$ vertices and $m = \sigma N$ edges, then by $T^*$ we denote the graph defined as follows. $T^*$ has $m$ vertices $E_1, \dots, E_m$. Two vertices $E_i$, $E_j$ are connected in $T^*$ by an edge from $E_i$ to $E_j$ if and only if $e_i$, $e_j$ are consecutive in $T$. This process was considered in [1] for the case $\sigma = 2$ only.

Theorem 4.

$$|T^*| = \sigma^{-1}\,(\sigma!)^{N(\sigma-1)} \cdot |T|.$$

Proof. By "$\sigma$-cycle in $T$" is meant a circular array containing each edge of $T$ exactly $\sigma$ times, such that two edges are consecutive in the array if and only if they are consecutive in $T$. A $\sigma$-cycle will be called restricted if it is such that any pair of consecutive edges of $T$ occurs just once as a pair of consecutive elements in the array. It will be clear that any restricted $\sigma$-cycle in $T$ defines uniquely a circuit in $T^*$, and vice versa.

The restricted $\sigma$-cycles in $T$ are closely related to the circuits in $T_\sigma$ described under the traffic regulation $\mathfrak{K}$. Actually, if we identify the edges of each bundle in $T_\sigma$, a $\mathfrak{K}$-circuit in $T_\sigma$ becomes a restricted $\sigma$-cycle, owing to the definition of $\mathfrak{K}$. Conversely, any restricted $\sigma$-cycle gives rise to a large number of $\mathfrak{K}$-circuits. Any bundle occurs $\sigma$ times in the cycle, and each time an arbitrary edge of the bundle can be chosen. So we see that $(\sigma!)^m$ different $\mathfrak{K}$-circuits arise from one restricted $\sigma$-cycle. Now the theorem follows from (5.3), since $m = N\sigma$.

In a T-graph which is not necessarily a $T^{(\sigma)}$ we can still consider (unrestricted) $\sigma$-cycles 1). The number of different $\sigma$-cycles can be determined from theorem 3. A difficulty lies in the fact that a $\sigma$-cycle may be periodical with a period $md$, where $d$ is a proper divisor of $\sigma$, which could not happen with a restricted $\sigma$-cycle. If $c(\rho)$ denotes the number of those $\rho$-cycles in $T$ whose period is exactly $m\rho$, then we have obviously

$$|T_\rho| = \sum_{d \mid \rho} \frac{d}{\rho}\, c(d)\, (\rho!)^m.$$

1) And, if $T$ is a $T^{(\sigma)}$, we can consider (unrestricted) $\rho$-cycles, for arbitrary values of $\rho$.

Hence we obtain from Möbius' inversion formula,

$$c(\rho) = \sum_{d \mid \rho} \frac{d}{\rho}\, \mu\!\left(\frac{\rho}{d}\right) (d!)^{-m} \cdot |T_d|,$$

and so the number of unrestricted $\rho$-cycles equals

$$\sum_{d \mid \rho} c(d) = \frac{1}{\rho} \sum_{d \mid \rho} \varphi\!\left(\frac{\rho}{d}\right) (d!)^{-m}\, d \cdot |T_d|,$$

where $\varphi$ is Euler's indicator. $|T_d|$ can be evaluated by theorem 3. Especially, if $T$ is a $T^{(\sigma)}$, then the number of unrestricted $\rho$-cycles is

$$\frac{1}{\rho} \sum_{d \mid \rho} \varphi\!\left(\frac{\rho}{d}\right) \left( \frac{(\sigma d)!}{(d!)^{\sigma}\, \sigma!} \right)^{N} \cdot |T|.$$
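Theorems 3 and 4 can be compared with a direct enumeration of circuits on very small T-graphs. The sketch below is an illustration only: a graph is assumed to be given as a list of (tail, head) pairs, the names `count_circuits`, `multiply`, `line_graph`, `theorem3` and `theorem4` are hypothetical, and the enumeration is exponential, so it is meant for a handful of edges.

```python
from math import factorial, prod


def count_circuits(edges):
    """Number of circuits: cyclic arrangements of all edges in which the
    head of each edge is the tail of the next. Edge 0 is fixed as the
    first edge, so every circuit is counted once."""
    m = len(edges)
    used = [False] * m
    used[0] = True
    start = edges[0][0]

    def extend(vertex, remaining):
        if remaining == 0:
            return 1 if vertex == start else 0
        total = 0
        for i, (tail, head) in enumerate(edges):
            if not used[i] and tail == vertex:
                used[i] = True
                total += extend(head, remaining - 1)
                used[i] = False
        return total

    return extend(edges[0][1], m - 1)


def multiply(edges, lam):
    """Replace every edge by a bundle of lam parallel edges."""
    return [edge for edge in edges for _ in range(lam)]


def line_graph(edges):
    """The graph T*: one vertex per edge of T, an edge i -> j whenever
    e_i and e_j are consecutive in T."""
    return [(i, j)
            for i, (_, head) in enumerate(edges)
            for j, (tail, _) in enumerate(edges) if head == tail]


def out_degrees(edges):
    degree = {}
    for tail, _ in edges:
        degree[tail] = degree.get(tail, 0) + 1
    return degree


def theorem3(edges, lam):
    """Right-hand side of theorem 3 for the graph with bundles of size lam."""
    degree = out_degrees(edges)
    product = prod(factorial(lam * s) // factorial(s) for s in degree.values())
    return count_circuits(edges) * product // lam


def theorem4(edges, sigma):
    """Right-hand side of theorem 4; edges must describe a graph in which
    every vertex has sigma outgoing and sigma incoming edges."""
    n_vertices = len(out_degrees(edges))
    return factorial(sigma) ** (n_vertices * (sigma - 1)) * count_circuits(edges) // sigma
```

As a usage example, take two vertices A and B with the four edges AA, AB, BB, BA. This graph has one circuit, doubling every edge gives 72 circuits, and the derived graph $T^*$ has 2 circuits; both values agree with the two formulas.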

§ 6. Trees in T-graphs.

Let $T$ be a T-graph with $N$ vertices and $m$ edges. The number of edges whose tail is $P_i$ is again denoted by $\sigma_i$. Choose an arbitrary vertex; for convenience of notations we take it to be $P_1$. We shall define the notion: (oriented) tree with root $P_1$. A tree with root $P_1$ is a sub-set $A$ of the set of edges of $T$, with the following properties.

a. Any vertex $\ne P_1$ is the tail of just one element of $A$.
b. No element of $A$ has its tail in $P_1$.
c. Any vertex can be connected with $P_1$ by a set of consecutive edges, all belonging to $A$.

It is easily seen that c can be replaced by

c*. $A$ contains no closed oriented cycles.

There is a striking relation between trees and circuits. Choose a fixed edge $e_1$ whose tail is $P_1$, and consider an arbitrary circuit of $T$. We traverse it, starting with $e_1$. Running through the circuit, each vertex $P_i$ will be visited $\sigma_i$ times. The edge by which we leave $P_i$ after having visited it for the $\sigma_i$-th time will be called the last exit of $P_i$.

Theorem 5a. The set $A$ consisting of the last exits of $P_2, \dots, P_N$ is a tree with root $P_1$.

Proof. The properties a and b are trivial. We shall verify c*. We can number the edges of $T$ according to the order in the circuit, with indices $1, \dots, m$; $e_1$ gets the index 1. If $e_i$ and $e_j$ both belong to $A$, and if $e_i$ and $e_j$ are consecutive in $T$, then we have $i < j$. For, $e_{i+1}$ has the same tail as $e_j$, and $j$ is the maximal value of the indices of all the edges with this tail. Consequently, $A$ does not contain any closed cycle; the indices in such a cycle would increase indefinitely.

Theorem 5b. If a tree $A$ with root $P_1$ is given, and if $e_1$ is given, then there are exactly

$$\prod_{i=1}^{N} (\sigma_i - 1)! \tag{6.1}$$

circuits of $T$ whose set of last exits coincides with $A$.

Proof. At any vertex we number the outgoing edges 1), with the following restrictions: At $P_1$ the edge $e_1$ gets the number 1; at $P_i$ ($i > 1$) the edge belonging to $A$ gets the highest possible number, that is $\sigma_i$. The number of ways in which this can be arranged is expressed by (6.1). It remains to be shown that, for each numbering of this type, there exists a circuit (and not more than one circuit) corresponding with this numbering.

First thing it will be clear that we have no choice at all if we try to traverse a circuit according to this numbering. Starting with $e_1$, we arrive at a vertex $P_2$, say. It is prescribed which outgoing edge we have to take first, etc. If we meet a vertex for a second time, we are forced to leave it by the edge bearing the number 2, and so on. The process has to stop somewhere, the graph being finite. The only reason why it should stop is, that we arrive at a vertex where all outgoing edges have already been taken before. This must be $P_1$, for all other vertices have been entered at least as often as they have been left.

We can show that at this moment all edges of $T$ have been used, each exactly once of course, which means that a circuit has been described. Assume that a certain edge is vacant, that means that it has not yet been used. Considering its head, there is a vacant entry and hence there is a vacant exit. Especially, the exit belonging to $A$ has to be vacant, since it has the highest number. This vacant edge of $A$ leads into another vacant edge of $A$, and so on. By c, we eventually arrive at $P_1$, and we find that there is a vacant outgoing edge. This contradicts the fact that the process stopped.

From Theorem 5a and 5b we immediately obtain

Theorem 6. The number of trees in $T$ with a given root is

$$|T| \cdot \left\{ \prod_{i=1}^{N} (\sigma_i - 1)! \right\}^{-1},$$

which does not depend on the vertex chosen as the root. As before, $|T|$ denotes the number of circuits of $T$.

1) This way of numbering is different from the one considered in the proof of theorem 5a.
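The correspondence between circuits and trees described in theorems 5a, 5b and 6 can be observed directly on a small example. The following sketch is an illustration under assumptions: a T-graph is assumed to be given as a list of (tail, head) pairs whose first entry plays the role of the fixed edge $e_1$, and the names `circuits`, `last_exits` and `is_tree` are hypothetical.

```python
from collections import Counter


def circuits(edges):
    """Yield every circuit as a list of edge indices, starting with edge 0."""
    m = len(edges)
    used = [False] * m
    used[0] = True
    path = [0]
    root = edges[0][0]

    def extend(vertex):
        if len(path) == m:
            if vertex == root:
                yield list(path)
            return
        for i, (tail, head) in enumerate(edges):
            if not used[i] and tail == vertex:
                used[i] = True
                path.append(i)
                yield from extend(head)
                path.pop()
                used[i] = False

    yield from extend(edges[0][1])


def last_exits(edges, circuit):
    """The set of last exits of all vertices other than the root."""
    root = edges[circuit[0]][0]
    last = {}
    for i in circuit:
        last[edges[i][0]] = i  # later exits overwrite earlier ones
    del last[root]
    return frozenset(last.values())


def is_tree(edges, chosen, root):
    """Check properties a, b and c* for the chosen set of edge indices."""
    step = {edges[i][0]: edges[i][1] for i in chosen}
    vertices = {v for edge in edges for v in edge}
    if len(step) != len(chosen) or set(step) != vertices - {root}:
        return False
    for start in step:
        seen, v = set(), start
        while v != root:
            if v in seen:
                return False  # closed oriented cycle
            seen.add(v)
            v = step[v]
    return True
```

For the graph on three vertices with one edge for every ordered pair (loops included), grouping the 24 circuits by their sets of last exits with `Counter(last_exits(edges, c) for c in circuits(edges))` gives 3 trees, each arising from $2! \cdot 2! \cdot 2! = 8$ circuits, as (6.1) predicts.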

Theorem 6 furnishes a new proof of theorem 3. For, there is a simple relation between the number of trees in $T$ and in $T_\lambda$. Any tree in $T_\lambda$ gives rise to a tree in $T$, by the mapping $T_\lambda \to T$ which maps entire bundles of $T_\lambda$ into the corresponding edges of $T$. Conversely, in any bundle of $T_\lambda$ an edge can be chosen in $\lambda$ ways. Any tree in $T$ contains $N - 1$ edges, and so we have

$$t(T_\lambda) = t(T) \cdot \lambda^{N-1}, \tag{6.2}$$

where $t(T)$ and $t(T_\lambda)$ denote the number of trees with a given root, in $T$ and $T_\lambda$, respectively. By theorem 6 we have

$$|T_\lambda| = t(T_\lambda) \cdot \prod_{i=1}^{N} (\lambda\sigma_i - 1)!, \tag{6.3}$$

$$|T| = t(T) \cdot \prod_{i=1}^{N} (\sigma_i - 1)!. \tag{6.4}$$

Theorem 3 follows from (6.2), (6.3) and (6.4).

§ 7. Trees in arbitrary oriented graphs.

We consider an oriented graph $G$, with $N$ vertices $P_1, \dots, P_N$. We no longer require that it is a T-graph, that is to say, the number of edges starting from $P_i$ need not be the same as the number of edges pointing towards $P_i$. Again, we can consider (oriented) trees, with a given root. Tutte [5] showed, that the number of trees in $T$ with a given root can be interpreted as the value of a certain determinant. Since his result is in several ways connected with the results of the present paper, we give a full account of his theorem, with a new proof.

Let $(a_{ij})$ ($i, j = 1, \dots, N$) be the following matrix. If $i \ne j$, then $a_{ij} = -b_{ij}$, where $b_{ij}$ denotes the number of oriented edges from $P_i$ to $P_j$ ($P_i$ is the tail and $P_j$ is the head of these edges). Further $a_{ii}$ is such that

$$\sum_{j=1}^{N} a_{ij} = 0.$$

Theorem 7 (Tutte). The number of trees with the given root $P_i$ equals the minor of $a_{ii}$ in the matrix $(a_{ij})$.

Proof. For simplicity of notation we take $i = 1$. We first consider a special graph, where each vertex $\ne P_1$ is the tail of just one edge, and where no edge leaves $P_1$. This graph is either a tree or it is not; the possibility of constructing more than one tree in this graph does not exist. We shall show that the minor of $a_{11}$ is 1 or 0 according to whether the graph is or is not a tree.

First assume that the graph is a tree. We shall apply induction with respect to $N$; for $N = 2$ the result is trivial. Take $N > 2$. There is at least one vertex which is not the head of an edge. This is the case with $P_2$, say. Then the second column of the matrix reads $0, 1, 0, \dots, 0$. Hence the value of the minor of $a_{11}$ is not altered if the second row and the second column are both cancelled. The new matrix corresponds to the graph which results by cancelling $P_2$ and the edge starting from $P_2$. This new graph is still a tree, and the induction is completed.

Next assume that the graph is not a tree. Then it shows somewhere a cycle of edges not containing $P_1$. For example, let the cycle consist of the edges $P_2P_3$, $P_3P_4$, $P_4P_2$. Then, in the matrix, the 2nd, 3rd, and 4th row are linearly dependent, for their sum vanishes. It follows that the minor of $a_{11}$ equals zero. This completes the proof of the theorem for our special graph.

The general case is easily reduced to this one by repeated application of the following operation. Divide the set of edges starting from a certain vertex, $P_2$, say, into two groups. Now construct two graphs; the first one arises from the original graph by cancelling the edges of the first group, the second one by cancelling the edges of the second group. The matrices of the graphs are such that the second row of the original matrix equals the sum of the corresponding rows in the new matrices; all other rows are identical in the three matrices. Therefore, the minor of $a_{11}$ in the original matrix is the sum of the minors of $a_{11}$ in the new matrices. On the other hand, the number of trees in the original graph is the sum of the numbers of trees in both graphs. This proves the theorem.
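Theorem 7 can be compared with a direct count on a small graph that is not a T-graph. The sketch below is an illustration under assumptions: vertices are numbered $0, \dots, N-1$, edges are (tail, head) pairs, the determinant is computed by exact elimination over fractions, and the names `determinant`, `tutte_minor` and `count_trees` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def determinant(matrix):
    """Exact determinant of a square integer matrix by elimination."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return 0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return int(result)


def tutte_minor(n, edges, root):
    """Minor of the diagonal element belonging to the root in the matrix
    with a_ij = -b_ij (i != j) and vanishing row sums."""
    b = [[0] * n for _ in range(n)]
    for tail, head in edges:
        b[tail][head] += 1
    a = [[-b[i][j] if i != j else sum(b[i]) - b[i][i] for j in range(n)]
         for i in range(n)]
    keep = [i for i in range(n) if i != root]
    return determinant([[a[i][j] for j in keep] for i in keep])


def count_trees(n, edges, root):
    """Count trees directly: one outgoing edge per non-root vertex, such
    that every vertex is connected with the root."""
    others = [v for v in range(n) if v != root]
    choices = [[head for tail, head in edges if tail == v] for v in others]
    total = 0
    for heads in product(*choices):
        step = dict(zip(others, heads))
        ok = True
        for start in others:
            seen, v = set(), start
            while v != root and v not in seen:
                seen.add(v)
                v = step[v]
            if v != root:
                ok = False
                break
        total += ok
    return total
```

For the edges `[(1, 0), (1, 0), (2, 1), (2, 0), (3, 2), (3, 1), (0, 3), (1, 2)]` on four vertices, both functions give 10 trees with root 0 and 5 trees with root 3.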

Theorem 6 shows that in a T-graph the number of trees does not depend on the choice of the root. T u t t e deduced the same fact from theorem 7. We repeat his argument. Assume that the graph considered in theorem 7 is a T-graph. Then we have a;i = a i - (li where (li is the number of edges from Pi to Pi' Therefore, we also find that the sum of the elements in each column of the matrix is equal to zero. It is a well-known fact that if in a square matrix the sum of the elements in each row and, in each column vanishes, then the cofactors of all elements have the same value. Especially, the minor of a;i does not depend on i.


We again consider an arbitrary graph $G$, which need not be a T-graph. Let $P_1, \ldots, P_N$ be its vertices, and let $\sigma_i$ be the number of edges starting from $P_i$, and $\tau_i$ the number of edges pointing towards $P_i$. Furthermore, $b_{ij}$ denotes the number of oriented edges from $P_i$ to $P_j$. Hence

$$\sigma_i = \sum_j b_{ij}, \qquad \tau_j = \sum_i b_{ij}.$$

Next we consider a permutation $S$ of the $N$ objects $1, 2, \ldots, N$. Let $G_S$ be the graph arising from $G$ in the following manner: replace each edge $P_iP_j$ of $G$ by an edge $P_iP_{Sj}$, where $Sj$ is the result of $S$ applied to the object $j$. Therefore, if the analogues of $\sigma_i$, $\tau_i$, $b_{ij}$ for the graph $G_S$ are denoted by $\sigma_i^{(S)}$, $\tau_i^{(S)}$, $b_{ij}^{(S)}$, respectively, then we have

$$\sigma_i^{(S)} = \sigma_i, \qquad \tau_{Sj}^{(S)} = \tau_j, \qquad b_{i,Sj}^{(S)} = b_{ij}. \tag{7.1}$$

Let $t_i(G_S)$ denote the number of oriented trees in $G_S$ whose root is $P_i$, and let $\mathfrak{S}_N$ denote the group of all $N!$ permutations of the objects $1, \ldots, N$. Then we have

Theorem 8.

$$\sum_{S \in \mathfrak{S}_N} t_i(G_S) = (N-1)! \prod_{k \neq i} \sigma_k.$$

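Proof. We may and do assume $i = 1$. We shall apply theorem 7. To this end, we consider the matrix

$$M_S = \left(\lambda_i \delta_{ij} - b_{ij}^{(S)}\right) \qquad (i, j = 1, \ldots, N),$$

where $\delta_{ij}$ is Kronecker's symbol. Its determinant $\det M_S$ is a multilinear polynomial in the variables $\lambda_1, \ldots, \lambda_N$: $\det M_S = f_S(\lambda_1, \ldots, \lambda_N)$, and it will be clear from theorem 7 that

$$t_1(G_S) = \frac{\partial}{\partial \lambda_1} f_S(\lambda_1, \sigma_2, \sigma_3, \ldots, \sigma_N). \tag{7.2}$$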

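We put

$$P(\lambda_1, \ldots, \lambda_N) = \sum_{S \in \mathfrak{S}_N} f_S(\lambda_1, \ldots, \lambda_N). \tag{7.3}$$

In the first place we can show that $P(\lambda_1, \ldots, \lambda_N)$ does not contain terms of degree $< N - 1$. For instance, consider the term with $\lambda_3 \lambda_4 \cdots \lambda_N$, which does not contain either $\lambda_1$ or $\lambda_2$. Let $T$ be the transposition of the objects 1 and 2. Then the coefficient of $\lambda_3 \cdots \lambda_N$ in $\det M_{TS}$ is easily seen to be the opposite of the coefficient of $\lambda_3 \cdots \lambda_N$ in $\det M_S$. If $S$ runs through $\mathfrak{S}_N$, then $TS$ does the same, and so the coefficient of $\lambda_3 \cdots \lambda_N$ in $P(\lambda_1, \ldots, \lambda_N)$ turns out to be zero.

We next deal with the terms of degree $N - 1$, and therefore we consider $\lambda_1 \lambda_2 \cdots \lambda_{i-1} \lambda_{i+1} \cdots \lambda_N$. Its coefficient in $f_S$ equals $-b_{ii}^{(S)}$. Consequently, its coefficient in $P(\lambda_1, \ldots, \lambda_N)$ is

$$-\sum_{S \in \mathfrak{S}_N} b_{ii}^{(S)} = -\sum_{S \in \mathfrak{S}_N} b_{i,S^{-1}i} = -(N-1)! \sum_j b_{ij} = -(N-1)!\,\sigma_i.$$

Finally, the coefficient of $\lambda_1 \cdots \lambda_N$ in $f_S$ equals 1, and in $P(\lambda_1, \ldots, \lambda_N)$ it is $N!$. Thus we have proved that

$$P(\lambda_1, \ldots, \lambda_N) = \lambda_1 \cdots \lambda_N \cdot (N-1)! \left\{ N - \sum_i \sigma_i / \lambda_i \right\}.$$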

From (7.2) and (7.3) we now deduce

$$\sum_{S \in \mathfrak{S}_N} t_1(G_S) = \frac{\partial}{\partial \lambda_1} P(\lambda_1, \sigma_2, \ldots, \sigma_N) = \sigma_2 \cdots \sigma_N \cdot (N-1)!.$$
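Theorem 8 can be confirmed by direct enumeration on a very small graph. The following Python sketch is only an illustration under stated assumptions: trees are counted by brute force (one outgoing edge per non-root vertex, all leading to the root), the graph $G_S$ is built from the rule $b_{i,Sj}^{(S)} = b_{ij}$ of (7.1), and the helper names and the sample matrix are hypothetical. Vertices are numbered from 0.

```python
from itertools import permutations, product
from math import factorial, prod


def count_trees(b, root):
    """Brute-force count of oriented trees with the given root."""
    n = len(b)
    others = [i for i in range(n) if i != root]
    total = 0
    for heads in product(range(n), repeat=len(others)):
        choice = dict(zip(others, heads))
        ways = 1
        for i, j in choice.items():
            ways *= b[i][j]
        if ways == 0:
            continue
        ok = True
        for start in others:
            v, steps = start, 0
            while v != root and steps < n:
                v, steps = choice[v], steps + 1
            if v != root:  # the walk from `start` is trapped in a cycle
                ok = False
                break
        if ok:
            total += ways
    return total


def permuted_graph(b, S):
    """Edge counts of G_S: every edge i -> j of G becomes an edge i -> S[j]."""
    n = len(b)
    bs = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            bs[i][S[j]] = b[i][j]
    return bs


def theorem8_sides(b, root=0):
    """Return (sum over all S of t_root(G_S), (N-1)! times the product of sigma_k, k != root)."""
    n = len(b)
    lhs = sum(count_trees(permuted_graph(b, S), root)
              for S in permutations(range(n)))
    rhs = factorial(n - 1) * prod(sum(b[k]) for k in range(n) if k != root)
    return lhs, rhs


# Hypothetical sample graph with every sigma_k equal to 2.
sample = [[0, 2, 0],
          [1, 0, 1],
          [1, 1, 0]]
sides = theorem8_sides(sample, root=0)
```

The cost grows like $N! \cdot N^{N-1}$, so this is a sanity check for $N \le 5$ or so rather than a practical method.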

Theorem 8 is, in some sense, a generalization of theorem 1. For, if we apply theorem 8 to a graph which is a $T(\sigma)$, then we have, by theorem 6,

$$\sum_{S \in \mathfrak{S}_N} |G_S| = (N-1)!\,\sigma^{N-1} \{(\sigma-1)!\}^N = (N-1)! \cdot \sigma^{-1} (\sigma!)^N. \tag{7.4}$$

It is not difficult to see that (7.4) is equivalent with theorem 1 (take $n = \sigma$, $k = N$).

Note added in proof. By theorems 6 and 7 the number of circuits in a T-graph can be expressed as a determinant. For the special case that $T$ is a $T(2)$, this result was announced by W. T. TUTTE and C. A. B. SMITH (On unicursal paths in a network of degree 4, Amer. Math. Monthly 48, 233-237 (1941)).

REFERENCES

1. N. G. DE BRUIJN. A combinatorial problem. Nederl. Akad. Wetensch., Proc. 49, 758-764 (1946) = Indagationes Math. 8, 461-467 (1946).
2. I. J. GOOD. Normal recurring decimals. J. London Math. Soc. 21, 167-169 (1947).
3. M. H. MARTIN. A problem in arrangements. Bull. Amer. Math. Soc. 40, 859-864 (1934).
4. D. REES. Note on a paper by I. J. GOOD. J. London Math. Soc. 21, 169-172 (1947).
5. W. T. TUTTE. The dissection of equilateral triangles into equilateral triangles. Proc. Cambridge Phil. Soc. 44, 463-482 (1948).

Reprinted from Simon Stevin 28 (1951), 203-217


THE FACTORS OF GRAPHS

W. T. TUTTE

1. Introduction. A graph $G$ consists of a non-null set $V$ of objects called vertices together with a set $E$ of objects called edges, the two sets having no common element. With each edge there are associated just two vertices, called its ends. Two or more edges may have the same pair of ends. $G$ is finite if both $V$ and $E$ are finite, and infinite otherwise. The degree $d_G(a)$ of a vertex $a$ of $G$ is the number of edges of $G$ which have $a$ as an end. $G$ is locally finite if the degree of each vertex of $G$ is finite. Thus the locally finite graphs include the finite graphs as special cases.

A subgraph $H$ of $G$ is a graph contained in $G$. That is, the vertices and edges of $H$ are vertices and edges of $G$, and an edge of $H$ has the same ends in $H$ as in $G$. A restriction of $G$ is a subgraph of $G$ which includes all the vertices of $G$. A graph is said to be regular of order $n$ if the degree of each of its vertices is $n$. An $n$-factor of a graph $G$ is a restriction of $G$ which is regular of order $n$.

The problem of finding conditions for the existence of an $n$-factor of a given graph has been studied by various authors [3; 4; 5]. It has been solved, in part, by Petersen for the case in which the given graph is regular. The author has given a necessary and sufficient condition that a given locally finite graph shall have a 1-factor [6; 7]. In this paper we establish a necessary and sufficient condition that a given locally finite graph shall have an $n$-factor, where $n$ is any positive integer. Actually we obtain a more general result. We suppose given a function $f$ which associates with each vertex $a$ of a given locally finite graph $G$ a positive integer $f(a)$, and obtain a necessary and sufficient condition that $G$ shall have a restriction $H$ such that $d_H(a) = f(a)$ for each vertex $a$ of $G$. The discussion is based on the method of alternating paths introduced by Petersen [4].
We also consider the problem of associating a non-negative integer with each edge of $G$ so that for each vertex $c$ of $G$ the numbers assigned to the edges having $c$ as an end sum to $f(c)$. We obtain a necessary and sufficient condition for the solubility of this problem.

My attention has been drawn to two other papers in which similar theories of factorization have been put forward. In one of these papers, Gallai [2] gives a valuable unified theory of factors and gives some new results on the factorization of regular graphs. He also claims to have obtained a necessary and sufficient condition for the existence of a 2-factor in a general locally finite graph, but leaves the discussion of this for another occasion. In the other paper Belck [1] establishes a necessary and sufficient condition for the existence of an $n$-factor in a general finite graph, where $n$ is any positive integer. Prominent in his theory is the hyper-$n$-prime graph, a generalization of the hyperprime graph introduced in [6].

Received February 20, 1951.

2. Recalcitrance. A path in a graph $G$ is a finite sequence

$$P = (a_1, A_1, a_2, A_2, \ldots, A_{r-1}, a_r) \tag{1}$$

satisfying the following conditions: (i) The members of $P$ are alternately vertices and edges of $G$, the terms $a_1, a_2, \ldots, a_r$ being vertices. (ii) If $1 \le i < r$, then $a_i$ and $a_{i+1}$ are the two ends of $A_i$. We say that $P$ is a path from $a_1$ to $a_r$, and that its length is $r - 1$. We note that the terms of $P$ need not be all distinct. We admit the case in which $P$ has length 0. Then $P$ has just one term, a vertex of $G$.

The vertices $x$ and $y$ of $G$ are connected in $G$ if a path from $x$ to $y$ in $G$ exists. If this is so for each pair $\{x, y\}$ of vertices of $G$, then $G$ is connected. The relation of being connected in $G$ is evidently an equivalence relation. It therefore partitions $G$ into a set $\{G_\alpha\}$ of connected graphs such that each edge or vertex of $G$ belongs to some $G_\alpha$ and no two of the $G_\alpha$ have any edge or vertex in common. We call the graphs $G_\alpha$ the components of $G$.

If $S$ is any proper subset of the set of vertices of a given graph $G$, we denote by $G(S)$ the subgraph of $G$ obtained by suppressing the members of $S$ and all edges of $G$ having one or both ends in $S$. Suppose now that $G$ is locally finite and that $S$ is a finite set of vertices of $G$. If $S$ does not include all the vertices of $G$ the graph $G(S)$ is defined. Then if $H$ is any finite component of $G(S)$ we denote the number of edges which have one end in $S$ and the other a vertex of $H$ by $v(H)$. We have

$$v(H) + \sum_{c \in H} d_G(c) \equiv 0 \pmod 2, \tag{2}$$

for the expression on the left is equal to twice the number of edges of $G$ having an end which is a vertex of $H$. (We have used the symbol $c \in H$ to denote that $c$ is a vertex of $H$.) We denote by $K(G, S)$ the set of all finite components $H$ of $G(S)$ which satisfy

$$v(H) + \sum_{c \in H} f(c) \equiv 1 \pmod 2. \tag{3}$$

If $K(G, S)$ is finite we denote the number of its elements by $k(G, S)$. If $S$ includes all the vertices of $G$ we write $k(G, S) = 0$. In either case we write

$$r(G, S) = k(G, S) + \sum_{c \in S} (f(c) - d_G(c)). \tag{4}$$

We call $r(G, S)$ the recalcitrance of $G$ with respect to $S$. If $K(G, S)$ is infinite we say that $r(G, S)$ is infinite.

THEOREM I. If $G$ is finite, $r(G, S)$ is even or odd according as $\sum_{c \in G} f(c)$ is even or odd.


Proof. By (2), (3), and (4),

$$r(G, S) \equiv \sum_{c \in G} d_G(c) + \sum_{c \in G} f(c) \pmod 2.$$

But the sum of the degrees of the vertices of $G$ is even, since it is twice the number of edges of $G$. The theorem follows.

The locally finite graph $G$ is constricted with respect to $f$ if there exist disjoint finite sets $S$ and $T$ of vertices of $G$ such that

$$\sum_{c \in T} f(c) < r(G(T), S). \tag{5}$$
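For a small finite graph, the recalcitrance (4) and the condition (5) can be evaluated by exhaustive search. The Python sketch below is an illustration only, not part of the paper: it assumes a loop-free graph given as a list of vertex labels and a list of edges (pairs of ends), a dictionary `f` of required degrees, and it tries every pair of disjoint sets $S$, $T$. All function names and sample graphs are hypothetical. It also searches directly for a restriction with $d_H(c) = f(c)$ at every vertex, so that the two notions can be compared on examples.

```python
from itertools import combinations


def subsets(items):
    """All subsets of `items`, as sets."""
    items = list(items)
    for size in range(len(items) + 1):
        for chosen in combinations(items, size):
            yield set(chosen)


def degree(edges, v):
    """Number of edges having v as an end (loops are not supported)."""
    return sum(1 for a, b in edges if v in (a, b))


def suppress(vertices, edges, removed):
    """G(S): drop the vertices in `removed` and every edge with an end there."""
    keep = [v for v in vertices if v not in removed]
    kept = [(a, b) for a, b in edges if a not in removed and b not in removed]
    return keep, kept


def components(vertices, edges):
    """Vertex sets of the connected components."""
    seen, result = set(), []
    for start in vertices:
        if start in seen:
            continue
        block, stack = {start}, [start]
        while stack:
            v = stack.pop()
            for a, b in edges:
                if v in (a, b):
                    w = b if a == v else a
                    if w not in block:
                        block.add(w)
                        stack.append(w)
        seen |= block
        result.append(block)
    return result


def recalcitrance(vertices, edges, f, S):
    """r(G, S) as in (4); k counts the components H satisfying (3)."""
    S = set(S)
    rest, rest_edges = suppress(vertices, edges, S)
    k = 0
    for H in components(rest, rest_edges):
        v_H = sum(1 for a, b in edges
                  if (a in S and b in H) or (b in S and a in H))
        if (v_H + sum(f[c] for c in H)) % 2 == 1:
            k += 1
    return k + sum(f[c] - degree(edges, c) for c in S)


def is_constricted(vertices, edges, f):
    """Search all disjoint S, T for an instance of (5)."""
    for T in subsets(vertices):
        if len(T) == len(vertices):
            continue  # G(T) is only defined for a proper subset T
        rest, rest_edges = suppress(vertices, edges, T)
        bound = sum(f[c] for c in T)
        if any(bound < recalcitrance(rest, rest_edges, f, S)
               for S in subsets(rest)):
            return True
    return False


def has_f_factor(vertices, edges, f):
    """Try every set of edges as a candidate restriction."""
    for size in range(len(edges) + 1):
        for chosen in combinations(edges, size):
            if all(degree(chosen, v) == f[v] for v in vertices):
                return True
    return False


# Hypothetical samples with f(c) = 1 everywhere (perfect matchings).
path3 = ([0, 1, 2], [(0, 1), (1, 2)])
path4 = ([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3)])
star = ([0, 1, 2, 3], [(0, 1), (0, 2), (0, 3)])
ones = {v: 1 for v in range(4)}
```

On these samples the path with three vertices and the star are constricted and have no 1-factor, while the path with four vertices is not constricted and has one. The search is exponential in the number of vertices and edges, so it serves only to illustrate the definitions.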

As an example, $G$ is constricted if it has a vertex $a$ such that $d_G(a) < f(a)$. In this case (5) is satisfied if $T$ is null and $S$ has the single element $a$. Again, $G$ is constricted if $r(G, S) > 0$ for any set $S$ of vertices of $G$, for then (5) is satisfied with $T$ null. So by Theorem I a finite graph $G$ is constricted if the sum of the numbers $f(c)$, for all the vertices $c$ of $G$, is odd. In this case (5) is satisfied if $S$ and $T$ are both null.

We define an $f$-factor of the given locally finite graph $G$ as a restriction $F$ of $G$ such that $d_F(c) = f(c)$ for each vertex $c$ of $G$. Similarly, a restriction $F$ of a subgraph $X$ of $G$ is an $f$-factor of $X$ if $d_F(c) = f(c)$ for each vertex $c$ of $X$. A restriction $F$ of a subgraph $X$ of $G$ is an incomplete $f$-factor of $X$ if $d_F(c) \le f(c)$ for each vertex $c$ of $X$, and $d_F(c) = f(c)$ for all but a finite number of the vertices of $X$. The deficiency of such an incomplete $f$-factor is the sum

$$\sum_{c} (f(c) - d_F(c))$$

taken over all vertices $c$ of $X$ for which $d_F(c) < f(c)$.

Our object in this paper is to show that $G$ has no $f$-factor if and only if $G$ is constricted with respect to $f$.

THEOREM II. Let $F$ be an incomplete $f$-factor of $G$, and let $S$ be any finite set of vertices of $G$. Then the deficiency of $F$ is not less than $r(G, S)$.

Proof. If $H$ is any member of $K(G, S)$, let $w(H)$ be the number of edges of $F$ which have one end in $S$ and the other a vertex of $H$. Analogously with (2) we have

$$\sum_{c \in H} d_F(c) \equiv w(H) \pmod 2. \tag{6}$$

Let $P$ be the set of all elements $H$ of $K(G, S)$ such that $d_F(c) = f(c)$ for each vertex $c$ of $H$. Let $Q$ be the set of all other members of $K(G, S)$. Let the numbers of members of $P$ and $Q$ be $p$ and $q$ respectively; $q$ must be finite. The sum of the numbers $f(c) - d_F(c)$ taken over all vertices of $G$ not in $S$ which satisfy $d_F(c) < f(c)$ is at least $q$. If $H \in P$, then by (3) and (6), $v(H) \neq w(H)$. Hence at least $p$ of the edges of $G$ having just one end in $S$ are not edges of $F$. It follows that

$$\sum_{c \in S} (f(c) - d_F(c)) \ge p + \sum_{c \in S} (f(c) - d_G(c)).$$

Hence if $D$ is the deficiency of $F$ we have

$$D \ge p + q + \sum_{c \in S} (f(c) - d_G(c)) = r(G, S).$$

THEOREM III. If $G$ is constricted with respect to $f$, it has no $f$-factor.

Proof. Suppose $G$ is constricted. Then there are disjoint finite subsets $S$ and $T$ of the set of vertices of $G$ such that (5) is satisfied. Assume $G$ has an $f$-factor $F$. Then $F(T)$ is an incomplete $f$-factor of $G(T)$. Its deficiency $D$ is equal to the number $n$ of edges of $F$ having one end in $T$ and the other not in $T$. Hence, by Theorem II,

$$\sum_{c \in T} f(c) \ge n = D \ge r(G(T), S).$$

This contradicts the definition of $S$ and $T$.

3. Alternating paths. An $f$-subgraph of $G$ is a restriction $J$ of $G$ having the following properties: (i) The number of edges of $J$ is finite. (ii) $d_J(c) \le f(c)$ for each vertex $c$ of $G$. A vertex $c$ of $G$ is deficient in $J$ if $d_J(c) < f(c)$.

Let us suppose that we are given an $f$-subgraph $J$ of $G$ and that $a$ is a vertex of $G$ which is deficient in $J$. Following a long-established tradition we refer to an edge of $G$ as blue or red according as it is or is not an edge of $J$.

An alternating path based on $a$ is a path $P$ in $G$ which satisfies the following conditions: (i) The first term of $P$ is $a$. (ii) No edge of $G$ occurs twice as a term of $P$. (iii) If $P$ has more than one term the edges of $G$ which occur in $P$ are alternately red and blue, the first one being red.

If $P$ includes the subsequence $(c, C, d)$ where $C$ is an edge of $G$, we say that $P$ passes through $c$ and then $C$, or $P$ passes through $C$ and then $d$. Let $\Pi(a)$ be the set of alternating paths based on $a$; $\Pi(a)$ is not null since it has one member whose only term is $a$.

Let $C$ be an edge of $G$, with ends $c$ and $d$. If no member of $\Pi(a)$ has $C$ as a term, $C$ is acursal. If some member of $\Pi(a)$ passes through $C$ and then $d$, $C$ is describable to $d$ or from $c$. If $C$ is describable to $d$ but not to $c$, $C$ is unicursal to $d$ or from $c$. If $C$ is describable both to $c$ and to $d$, $C$ is bicursal. A vertex of $G$ is accessible from $a$ if it is a term of some member of $\Pi(a)$. The vertex $a$ is singular if no deficient vertex of $G$, other than $a$ itself, is accessible from $a$.

THEOREM IV. Only a finite number of vertices of $G$ are accessible from $a$.


If $b$ is a vertex of $G$ accessible from $a$, then either $b = a$, or $b$ is an end of a blue edge, or $b$ is an end of an edge $B$ whose other end is either $a$ or an end of a blue edge. Since the number of blue edges and the degree of each vertex of $G$ are finite, the theorem follows.

THEOREM V. Let $A$ and $B$ be edges of $G$ which are of different colours and have a common end $x$. Suppose $A$ is unicursal to $x$. Then $B$ is describable from $x$.

There is a member $P$ of $\Pi(a)$ which passes through $A$ and then $x$. If $B$ is not a term of $P$ preceding $A$ there is evidently a member of $\Pi(a)$ which agrees with $P$ as far as $A$ and continues $(x, B, \ldots)$. Then $B$ is describable from $x$. If $B$ precedes $A$ in $P$, either the theorem is satisfied or $P$ passes through $B$ and then $x$. In the latter case there is a member of $\Pi(a)$ which agrees with $P$ as far as $B$ and continues $(x, A, \ldots)$. Then $A$ is not unicursal to $x$, contrary to hypothesis.

4. Bicursal components. Let us suppose that $G$ has at least one bicursal edge. The bicursal edges of $G$, with their ends, define a subgraph of $G$. We refer to the components of this subgraph as the bicursal components.

THEOREM VI. The bicursal components are finite graphs.

This follows from Theorem IV, since the vertices of a bicursal component are all accessible from $a$ and $G$ is locally finite.

Let $L$ denote any bicursal component. An entrant of $L$ is any member of $\Pi(a)$ which has a vertex of $L$ as a term. If $P$ is an entrant of $L$ we denote by $e(P)$ the vertex of $L$ which occurs first as a term of $P$. We then say that $P$ enters $L$ at $e(P)$. A vertex of $L$ at which some entrant of $L$ enters $L$ is an entrance of $L$.

Let $P$ be an entrant of $L$. Let $A$ be the first edge of $G$ in $P$ after the first occurrence of $e(P)$ which is not in $L$, if such an edge exists. The section of $P$ by $L$ is defined as follows. If the edge $A$ exists, the section is the part of $P$ extending from the first occurrence of $e(P)$ to the term immediately preceding $A$. Otherwise, the section is the part of $P$ extending from the first occurrence of $e(P)$ to the last term of $P$. In either case the section is an alternating path based on $e(P)$ and having only edges and vertices of $L$ as terms (except that its first edge may be blue).

If $e$ is any entrance of $L$ we denote by $\Delta(e)$ the set of sections by $L$ of those members of $\Pi(a)$ which enter $L$ at $e$. Since the edges of $L$ are not acursal, $L$ has at least one entrance. If $a$ is a vertex of $L$ then $a$ is an entrance of $L$. In the following series of theorems (VII-XI) we suppose that some entrance $e$ of $L$ is specified, with the proviso that $e$ is $a$ if $a$ is a vertex of $L$.

THEOREM VII. There exists an edge of $L$ which is a term of some member of $\Delta(e)$.

Proof. Suppose first that $e$ is $a$. Any red edge of $L$ having $a$ as an end is clearly a term of a member of $\Delta(a)$. Suppose therefore that the edges of $L$ having $a$ as an end are all blue. Each of these is describable from $a$, and no one is the first edge of a member of $\Pi(a)$. Hence some red edge $C$ having $a$ as an end is describable to $a$. But all red edges having $a$ as an end are describable from $a$. Hence $C$ is bicursal and therefore an edge of $L$, contrary to supposition.

Now consider the case in which $a$ is not a vertex of $L$. Let $P$ be an entrant of $L$ such that $e(P) = e$. Let $C$ be the edge of $G$ which immediately precedes the first occurrence of $e$ in $P$. Then $C$ is unicursal to $e$. Any edge of $L$ having $e$ as an end and differing in colour from $C$ is clearly a term of a member of $\Delta(e)$. Suppose therefore that the edges of $L$ having $e$ as an end all have the same colour as $C$. Since they are all describable from $e$, some edge $E$ of $G$ differing in colour from $C$ is describable to $e$. But $E$ is describable from $e$, by Theorem V. Hence $E$ is bicursal and therefore an edge of $L$, contrary to supposition.

THEOREM VIII. If $A$ is an edge of $L$ with ends $x$ and $y$, and if some member $P'$ of $\Delta(e)$ passes through $x$ and then $A$, then some other member of $\Delta(e)$ passes through $y$ and then $A$.

Proof. Since $A$ is bicursal there exists a member $Q$ of $\Pi(a)$ which passes through $y$ and then $A$. It may happen that every term of $Q$ which precedes $A$ is an edge or vertex of $L$. Then $a$ is a vertex of $L$ and therefore $e = a$ by the definition of $e$. Hence the section of $Q$ by $L$ is a member of $\Delta(e)$ which passes through $y$ and then $A$.

In the remaining case, let $B$ be the last term of $Q$ preceding $A$ which is an edge of $G$ but not an edge of $L$. Let $b$ be the immediately succeeding term of $Q$. Then $b$ is a vertex of $L$. Let $C$ be the first edge of $G$ in $P'$ which succeeds $B$ in $Q$ but does not succeed $A$ in $Q$. Such an edge exists since $A$ is an edge both of $P'$ and of $Q$. Let the ends of $C$ be $r$ and $s$. We may suppose that $P'$ passes through $r$ and then $C$.

Suppose $Q$ passes through $r$ and then $C$. Then there is a member of $\Delta(e)$ which agrees with $P'$ as far as $C$ and then continues with the terms of $Q$ from $C$ to $A$. This member of $\Delta(e)$ passes through $y$ and then $A$.

Alternatively, suppose $Q$ passes through $s$ and then $C$. There is a member $Q_1$ of $\Pi(a)$ which enters $L$ at $e$, then agrees with $P'$ as far as $C$, and continues with the terms of $Q$ in reverse order from $C$ to $b$. Let $D$ be the edge of $Q_1$ immediately preceding the first occurrence of $e$. If $B \neq D$ it follows that $B$ is describable from $b$. But $Q$ passes through $B$ and then $b$. So $B$ is bicursal and therefore an edge of $L$, contrary to its definition. We conclude that $B = D$ and therefore $b = e$. Hence there is a member of $\Pi(a)$ which agrees with $Q_1$ as far as $B$ and agrees with $Q$ from $B$ to $A$. The section of this path by $L$ is a member of $\Delta(e)$ which passes through $y$ and then $A$.

THEOREM IX. Let $A$ be an edge of $L$ which is a term of some member of $\Delta(e)$. Let $x$ be an end of $A$ distinct from $e$. Then there is an edge $B$ of $L$ which differs in colour from $A$, which has $x$ as an end, and which is a term of some member of $\Delta(e)$.


Proof. By Theorem VIII there is a member of $\Delta(e)$ which passes through $x$ and then $A$. The last edge preceding $A$ in this member of $\Delta(e)$ has the required properties.

THEOREM X. If $A$ is any edge of $L$ and $x$ is any end of $A$, then there is a member of $\Delta(e)$ which passes through $x$ and then $A$.

Proof. Let $U$ be the set of all edges of $L$ occurring as terms in the members of $\Delta(e)$; $U$ is non-null, by Theorem VII. Let $V$ be the set of all other edges of $L$.

Assume that $V$ is non-null. Since $L$ is connected there is a vertex $z$ of $L$ which is an end of a member $B$ of $U$ and a member $C$ of $V$. If $z$ is not $e$ we may suppose that $B$ and $C$ differ in colour, by Theorem IX. By Theorem VIII there is a member of $\Delta(e)$ which passes through $B$ and then $z$. $C$ is not a term of this member of $\Delta(e)$. Hence there is a member of $\Delta(e)$ which agrees with this one as far as $B$ and then continues with $z$ and $C$. This contradicts the definition of $C$.

Suppose now that $z$ is $e$. If $B$ and $C$ differ in colour we obtain a contradiction as before. We deduce that all the edges of $L$ having $e$ as an end have the same colour. If $e = a$ it follows from Theorem VII that $e$ is an end of some red edge of $L$. Then $C$ is red. Hence there is a member of $\Delta(e)$ which has $C$ as its first edge, contrary to assumption. If $e$ is not $a$ it follows from Theorem VII that there is a member $P$ of $\Pi(a)$ entering $L$ at $e$ in which the first occurrence of $e$ is immediately succeeded by an edge of $L$. We may take this edge to be $B$. Since $B$ and $C$ have the same colour there is a member of $\Pi(a)$ which agrees with $P$ as far as the first occurrence of $e$ and then continues with $C$. Hence $C$ is a member of $U$, contrary to assumption.

We conclude that $V$ is null. The theorem now follows from Theorem VIII.

Let $G_1$ denote any subgraph of $G$. An edge $A$ of $G$ is said to touch $G_1$ if $A$ is not an edge of $G_1$ and just one end, say $x$, of $A$ is a vertex of $G_1$. Such an edge $A$ is unicursal to or from $G_1$ if it is unicursal to or from $x$ respectively.

THEOREM XI. If $a$ is a vertex of $L$ then all edges of $G$ which touch $L$ are unicursal from $L$. If $a$ is not a vertex of $L$ then there exists just one edge of $G$ which touches $L$ and is unicursal to $L$, and all other edges of $G$ which touch $L$ are unicursal from $L$.

Proof. Let $A$ be an edge of $G$ which touches $L$. Let $x$ be the end of $A$ which is a vertex of $L$. Assume that $A$ is not unicursal from $x$. We recall that $a = e$ if $a$ is a vertex of $L$.

If $x$ is not $e$ there is an edge $C$ of $L$ differing in colour from $A$ and having $x$ as an end, by Theorem IX. This is true also if $x = e = a$. For then $A$ is blue since it is not unicursal from $a$ and not bicursal, and Theorem VII shows that some red edge of $L$ has $a$ as an end. In either of these cases it follows from Theorem X that there is a member of $\Pi(a)$ which enters $L$ at $e$, whose section by $L$ passes through $C$ and then $x$, and which continues from $C$ with the terms $x$ and $A$. But $A$ is not bicursal since it is not an edge of $L$. Hence $A$ is unicursal from $x$, contrary to assumption.

Now suppose that $x = e$ and $e$ is not $a$. By Theorem VII, there is a member $P$ of $\Pi(a)$ which enters $L$ at $e$ and in which the first occurrence of $e$ is immediately succeeded by an edge $C$ of $L$. Let the edge of $G$ which immediately precedes the first occurrence of $e$ in $P$ be $B$. Clearly $B$ touches $L$ and is unicursal to $L$. Suppose that $A$ and $B$ are distinct. If $A$ differs in colour from $B$ it is describable from $x = e$, by Theorem V. If $A$ and $B$ have the same colour this differs from that of $C$. By Theorem X there is a member $Q$ of $\Pi(a)$ which enters $L$ at $e$, and whose section by $L$ passes through $C$ and then $e$. It is clear that $A$ and $B$ cannot both precede $C$ in $Q$. Hence there is a member $Q'$ of $\Pi(a)$ which agrees with $Q$ as far as $C$ and then continues with $e$ and one of the edges $A$ and $B$. Actually, it continues with $e$ and $A$ since $B$ is unicursal to $e$. Hence if $A$ and $B$ are distinct, $A$ is unicursal from $x$. This completes the proof of the theorem.
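The classification of edges as acursal, unicursal or bicursal can be computed by brute force for a small finite graph, which may help in following Theorems VII to XI. The Python sketch below is illustrative only: it assumes a loop-free graph whose edges are identified by their position in a list, a set `J` of blue edge indices, and it enumerates every alternating path based on `a` recursively. The function names and the sample graph are hypothetical.

```python
def alternating_paths(edges, J, a):
    """Yield (vertices, edge indices) for every alternating path based on a."""
    def extend(verts, used, want_red):
        yield verts, used
        v = verts[-1]
        for idx, (x, y) in enumerate(edges):
            if idx in used or v not in (x, y):
                continue  # no edge twice, and the edge must have v as an end
            if (idx not in J) != want_red:
                continue  # colours alternate, the first edge being red
            w = y if x == v else x
            yield from extend(verts + [w], used + [idx], not want_red)
    yield from extend([a], [], True)


def classify(edges, J, a):
    """Label each edge acursal, bicursal, or unicursal to one of its ends."""
    described_to = {idx: set() for idx in range(len(edges))}
    for verts, used in alternating_paths(edges, J, a):
        for pos, idx in enumerate(used):
            # The path passes through edge idx and then verts[pos + 1].
            described_to[idx].add(verts[pos + 1])
    labels = {}
    for idx, ends in described_to.items():
        if not ends:
            labels[idx] = "acursal"
        elif len(ends) == 2:
            labels[idx] = "bicursal"
        else:
            labels[idx] = ("unicursal to", next(iter(ends)))
    return labels


# Hypothetical sample: a triangle 0-1-2 with a pendant path 1-3-4.
# Edge 1 = (1, 2) is blue; the path is based on the deficient vertex 0.
sample_edges = [(0, 1), (1, 2), (2, 0), (1, 3), (3, 4)]
labels = classify(sample_edges, J={1}, a=0)
```

In this sample the three triangle edges form a bicursal component containing $a$, the edge touching it is unicursal from it, in agreement with Theorem XI, and the last edge is never reached. The enumeration is exponential and only intended for graphs with a handful of edges.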

5. Bicursal units. Let $T$ be the set of all vertices of $G$ which are ends of bicursal edges. Let $T'$ be the set of all edges of $G$ having both ends in $T$. Then $T$ and $T'$ define a subgraph $G'$ of $G$. We refer to the components of $G'$ as bicursal units. Evidently a bicursal component having a given vertex $b$ is a subgraph of the bicursal unit having the vertex $b$. By Theorem IV the bicursal units are finite graphs.

THEOREM XII. Let $M$ be any bicursal unit. If $a$ is a vertex of $M$ then all edges of $G$ which touch $M$ are unicursal from $M$. If $a$ is not a vertex of $M$ then there exists just one edge of $G$ which touches $M$ and is unicursal to $M$, and all other edges of $G$ which touch $M$ are unicursal from $M$.

Proof. Since some edges of $M$ are bicursal there exists a member $P$ of $\Pi(a)$ having a vertex of $M$ as a term. Let $e$ be the first vertex of $M$ to occur in $P$. If $a$ is not a vertex of $M$ there is an edge $E$ of $G$ which immediately precedes the first occurrence of $e$ in $P$. Then $E$ touches $M$ and is unicursal to $e$ and $M$. We denote the bicursal component of which $e$ is a vertex by $L$. If instead $a$ is a vertex of $M$ we denote the bicursal component of which $a$ is a vertex by $L$.

A subgraph $L'$ of $M$ which is a bicursal component distinct from $L$ is supplied from $L$ if there exists a sequence $(L_1, L_2, \ldots, L_t)$ of bicursal components and a sequence $(A_1, A_2, \ldots, A_{t-1})$ of edges of $M$ such that (i) $L_1 = L$ and $L_t = L'$, (ii) the $L_i$ are subgraphs of $M$, (iii) for each integer $i$ in the range $1 \le i < t$, $A_i$ is unicursal from $L_i$ and to $L_{i+1}$.

We can show that any subgraph of $M$ which is a bicursal component distinct from $L$ is supplied from $L$. For suppose it is not. Then since $M$ is connected there is an edge $B$ of $M$ with ends $b$ and $c$ belonging to bicursal components $L'$ and $L''$, where $L'$ is $L$ or is supplied from $L$, and $L''$ is not $L$ and is not supplied from $L$. Now $B$ is not bicursal by the definition of a bicursal component, and is not acursal, by Theorem XI. It is not unicursal to $L''$, since $L''$ is not supplied from $L$. Hence $B$ is unicursal to $L'$. But this is contrary to Theorem XI since $L'$ is either $L$ or is supplied from $L$.

The Theorem now follows by the application of Theorem XI to each of the bicursal components which are subgraphs of $M$.

If $a$ is not a vertex of the bicursal unit $M$, we call the edge of $G$ which touches $M$ and is unicursal to $M$ the entrance-edge of $M$. We classify such bicursal units as red-entrant and blue-entrant according as their entrance-edges are red or blue. A bicursal unit having $a$ as a vertex is $a$-entrant.

6. Singular vertices. In this section we suppose that $a$ is a singular vertex. We denote the numbers of red-entrant and blue-entrant bicursal units by $k_r$ and $k_b$ respectively. These numbers are finite, by Theorem IV.

Let $C$ denote the set of all vertices of $G$ which are not vertices of $G'$. Thus no bicursal edge has an end in $C$. Let $V$ be the set of all members of $C$ to which some red edge is unicursal. Let $W$ be the set of all members of $C$ from which some red edge is unicursal or to which some blue edge is unicursal. Clearly,

$$a \notin V. \tag{9}$$

Suppose $c \in V$. Any blue edge of $G$ having $c$ as an end is unicursal from $c$, by Theorem V. Hence, by (9), no red edge of $G$ can be unicursal from $c$. There are just $f(c)$ blue edges of $G$ which have $c$ as an end and are therefore unicursal from $c$ since $c$ is accessible from, but distinct from, the singular vertex $a$.

Now suppose $i \in W$. If some red edge is unicursal from $i$ then either $a = i$ or there is a blue edge unicursal to $i$. If $a = i$ or there is a blue edge unicursal to $i$, then each red edge having $i$ as an end is unicursal from $i$, by Theorem V and the definition of $\Pi(a)$. Hence any red edge having $i$ as an end is unicursal from $i$. Consequently no blue edge of $G$ can be unicursal from $i$.

It is clear from these results that $V$ and $W$ are disjoint sets. By Theorem IV they are finite sets.

If $i \in W$, let $y(i)$ be the number of red edges of $G$ unicursal from $i$ which are entrance-edges of red-entrant bicursal units. Let $z(i)$ be the number of blue edges of $G$ which are unicursal to $i$ from members of $V$. Let $H$ denote the graph $G(V)$. If $i \in W$, any red edge unicursal from $i$ is unicursal to a vertex $p$ distinct from $a$. For no red edge is unicursal to $a$. So by Theorem V, $p$ is either a vertex of $G'$ or a member of $V$. Hence in the graph $H$, the number of edges having $i$ as an end is $y(i) + (d_J(i) - z(i))$. Thus we have

$$z(i) = y(i) + (d_J(i) - d_H(i)). \tag{10}$$

By the definition of a bicursal unit the entrance-edge of any bicursal unit which is not $a$-entrant is unicursal either from a member of $V$ or from a member of $W$.

Let $\lambda$ be the number of blue edges of $G$ unicursal from a member of $V$ to a member of $W$. It is equal to the total number of blue edges unicursal from members of $V$ less the number of the entrance-edges of the blue-entrant bicursal units. The latter number is $k_b$, by Theorem XII. But $\lambda$ is also equal to the sum of the numbers $z(i)$ taken over all $i \in W$. The corresponding sum of the $y(i)$ is $k_r$, by Theorem XII. Hence we have

$$\sum_{c \in V} f(c) = k_b + k_r + \sum_{i \in W} (d_J(i) - d_H(i)). \tag{11}$$

The bicursal units, if any, are connected finite graphs. By Theorem XII, they are components of $(G(V))(W) = H(W)$. Let $M$ be any bicursal unit. Write $q(M) = 0$ or 1 according as $M$ is or is not $a$-entrant. Let $u(M)$ be the number of blue edges of $G$ which touch $M$ and let $v(M)$ be the number of edges of $G$ which touch $M$ and have an end in $W$. Using Theorem XII we readily obtain the following results: if $M$ is blue-entrant $q(M) = 1$ and $u(M) = v(M) + 1$, if $M$ is red-entrant $q(M) = 1$ and $u(M) = v(M) - 1$, and if $M$ is $a$-entrant $q(M) = 0$ and $u(M) = v(M)$. In each case we have

$$u(M) \equiv v(M) + q(M) \pmod 2. \tag{12}$$

The sum of $u(M)$ and the degrees in $J$ of the vertices of $M$ is even, since it is twice the number of blue edges of $G$ having vertices of $M$ as ends. Moreover, if $c$ is a vertex of $M$, we have $d_J(c) = f(c)$ unless $c = a$; and $a$ is a vertex of $M$ if and only if $q(M) = 0$. It follows from (12) that

$$v(M) + \sum_{c \in M} f(c) + (q(M) + 1)(d_J(a) - f(a)) + q(M) \equiv 0 \pmod 2. \tag{13}$$

Referring to the definitions of §2 we see that $M$ is a member of $K(G(V), W)$ if and only if $(q(M) + 1)(d_J(a) - f(a)) + q(M) \equiv 1 \pmod 2$. Hence $M$ is not a member of $K(G(V), W)$ if and only if $M$ is $a$-entrant ($q(M) = 0$) and the deficiency $f(a) - d_J(a)$ of $a$ in $J$ is even.

THEOREM XIII. If $G$ is not constricted there exists an $a$-entrant bicursal unit, and the deficiency of $a$ in $J$ is even.

Proof. Suppose, first, that $a$ is a member of $C$ but not of $W$. Then no red edge of $G$ has $a$ as an end. Hence $d_G(a) = d_J(a) < f(a)$, so that $G$ is constricted, contrary to hypothesis.

In the remaining cases we have, by (4),

$$r(G(V), W) = k(G(V), W) + \sum_{i \in W} (f(i) - d_H(i)).$$

If $a \in W$ then $k(G(V), W) \ge k_b + k_r$ and $f(a) > d_J(a)$. If there exists an $a$-entrant bicursal unit and the deficiency of $a$ in $J$ is odd we have

$$k(G(V), W) \ge k_b + k_r + 1.$$

Here we have used the results proved above concerning the membership of bicursal units in $K(G(V), W)$. In each of these cases it follows that the expression on the right of (11) is less than $r(G(V), W)$. Then $G$ is constricted, contrary to hypothesis. The theorem follows.


7. Augmentation. In this section we no longer assume that the deficient vertex $a$ is singular.

Suppose $P$ is a member of $\Pi(a)$ which has more than one term, and whose last term is a vertex $i$ of $G$ deficient in $J$. To transform $J$ by $P$ is to replace $J$ by a restriction $K$ of $G$, defined as follows. The edges of $K$ consist of the blue edges of $G$ which are not terms of $P$, together with the red edges of $G$ which are terms of $P$.

We say the $f$-subgraph $J$ is augmentable at the deficient vertex $a$ if there is an $f$-subgraph $K$ of $G$ satisfying the following conditions: (i) $d_K(a) > d_J(a)$. (ii) If $d_J(c) = f(c)$, then $d_K(c) = f(c)$.

Suppose $a$ is not singular. Then there is a member $P$ of $\Pi(a)$ whose last term is a deficient vertex $i$ of $G$ distinct from $a$. Let $K$ be the restriction of $G$ obtained by transforming $J$ by $P$. By the definition of $\Pi(a)$ we have

$$d_K(a) = d_J(a) + 1, \qquad d_K(i) = d_J(i) \pm 1,$$

and $d_K(c) = d_J(c)$ if $c$ is not $a$ or $i$. Hence $K$ is an $f$-subgraph of $G$, and $J$ is augmentable at $a$.

Suppose next that $a$ is singular and that $G$ is not constricted. The deficiency of $a$ in $J$ is at least 2, by Theorem XIII. Also by Theorem XIII, $a$ is the entrance of a bicursal unit $M_0$. By Theorem VII there is a red edge $A$ of $M_0$ having $a$ as an end. Since $A$ is bicursal there is a member $P$ of $\Pi(a)$ including at least two edges, whose last term is $a$ and whose last edge is $A$. Let $K$ be the restriction of $G$ obtained by transforming $J$ by $P$. By the definition of $\Pi(a)$ we have

$$d_K(a) = d_J(a) + 2$$

and $d_K(c) = d_J(c)$ if $c$ is not $a$. Hence $K$ is an $f$-subgraph of $G$, and $J$ is augmentable at $a$. Thus we have the following

THEOREM XIV. Let $J$ be any $f$-subgraph of $G$, and let $a$ be any vertex of $G$ which is deficient in $J$. Then either $G$ is constricted with respect to $f$ or $J$ is augmentable at $a$.
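The transformation of $J$ by an alternating path amounts to a symmetric difference of edge sets, and for a small finite graph an augmenting path can be found by exhaustive search. The Python sketch below is an illustration under stated assumptions, not the paper's procedure: vertices are numbered from 0, edges are identified by their index in a list, `J` is the set of blue edge indices, and every alternating path based on `a` is tried until one yields an $f$-subgraph $K$ satisfying conditions (i) and (ii). All names and the sample data are hypothetical.

```python
def degrees(n_vertices, edges, chosen):
    """Degree of every vertex in the restriction with edge indices `chosen`."""
    d = [0] * n_vertices
    for idx in chosen:
        x, y = edges[idx]
        d[x] += 1
        d[y] += 1
    return d


def alternating_paths(edges, J, a):
    """Yield the edge-index list of every alternating path based on a."""
    def extend(v, used, want_red):
        yield used
        for idx, (x, y) in enumerate(edges):
            if idx in used or v not in (x, y):
                continue
            if (idx not in J) != want_red:
                continue  # red first, then alternately blue and red
            w = y if x == v else x
            yield from extend(w, used + [idx], not want_red)
    yield from extend(a, [], True)


def transform(J, path_edges):
    """Blue edges not on the path together with red edges on the path."""
    return set(J) ^ set(path_edges)


def augment(n_vertices, edges, f, J, a):
    """Return an f-subgraph K meeting (i) and (ii), or None if the search fails."""
    before = degrees(n_vertices, edges, J)
    for path_edges in alternating_paths(edges, J, a):
        K = transform(J, path_edges)
        after = degrees(n_vertices, edges, K)
        is_f_subgraph = all(after[c] <= f[c] for c in range(n_vertices))
        keeps_full = all(after[c] == f[c]
                         for c in range(n_vertices) if before[c] == f[c])
        if is_f_subgraph and after[a] > before[a] and keeps_full:
            return K
    return None


# Hypothetical sample: the path 0-1-2-3 with f = 1 and only the middle edge blue.
sample_edges = [(0, 1), (1, 2), (2, 3)]
sample_f = [1, 1, 1, 1]
K = augment(4, sample_edges, sample_f, J={1}, a=0)
```

Here the path from vertex 0 uses all three edges in the order red, blue, red, and transforming by it replaces the middle edge with the two outer ones. A `None` result only means that no single alternating-path transformation in this exhaustive search met the two conditions.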

8. Condition for an f-factor.

THEOREM XV. $G$ has no f-factor if and only if it is constricted with respect to $f$.

Proof. Suppose first that the locally finite graph $G$ is constricted with respect to $f$. Then $G$ has no f-factor, by Theorem III. Suppose next that $G$ is not constricted with respect to $f$. Let $J_0$ be the restriction of $G$ which has no edges. Then $J_0$ is an f-subgraph of $G$. If a vertex $a$ of $G$ is deficient in a given f-subgraph $J$ of $G$, we can, by Theorem XIV, replace $J$ by an f-subgraph $K$ in which the degree of $a$ is increased and no vertex of $G$ which is not deficient in $J$ is deficient in $K$. By repeating this process sufficiently often we can obtain an f-subgraph $K'$ of $G$ in which $a$ and those vertices of $G$ not deficient in $J$ are not deficient.


It follows that if $S$ is any finite set of vertices of $G$, we can, by the above process, build from $J_0$ an f-subgraph $J$ of $G$ in which no member of $S$ is deficient. The theorem follows at once in the case in which $G$ is finite. Then we can take $S$ to be the set of all vertices of $G$, and the corresponding f-subgraph $J$ must be an f-factor of $G$.

If $G$ is infinite and connected we use the following non-constructive argument. (I have replaced my original proof by a shorter one for which I am indebted to the referee.) Let $x$ be any vertex of $G$. The number of paths in $G$ whose first term is $x$ and which have just $2n + 1$ terms, where $n$ is any given non-negative integer, is finite since $G$ is locally finite. Hence the set of paths in $G$ having $x$ as first term is denumerable. Since $G$ is connected it follows that the set of vertices of $G$ is denumerable, say $\{a_1, a_2, \ldots\}$. By the foregoing argument, to every positive integer $n$ there is an f-subgraph $J_n$ such that

$$d_{J_n}(a_r) = f(a_r), \qquad r \le n.$$

The set of edges of $G$ is at most denumerable, say equal to $\{A_1, A_2, \ldots\}$. Put $F_n(s) = 1$ if $A_s$ is an edge of $J_n$ and $F_n(s) = 0$ otherwise. Then by the diagonal process, there is an increasing sequence $n_1, n_2, \ldots$ such that

$$\lim_{k \to \infty} F_{n_k}(s) = F(s)$$

exists for all $s$. Let $J$ be the restriction of $G$ whose edges are those $A_s$ for which $F(s) = 1$. Then $d_J(a_r) = f(a_r)$ for all $r$, and $J$ is an f-factor of $G$.

Lastly, we must consider the case in which $G$ is infinite and not connected. We can show that no component of $G$ is constricted with respect to $f$. For if this is not so, there is a component $G_0$ of $G$ such that for some disjoint finite subsets $S$ and $T$ of the set of vertices of $G$,

$$\sum_{c \in T} f(c) < k(G_0(T), S) + \sum_{c \in S} \bigl(f(c) - d_{G_0(T)}(c)\bigr). \tag{15}$$

Clearly each component of $G_0(T)$ is a component of $G(T)$. Hence (15) holds with $G_0(T)$ replaced by $G(T)$, so that

$$\sum_{c \in T} f(c) < r(G(T), S).$$

Thus $G$ is constricted with respect to $f$, contrary to hypothesis.

Since the theorem has been proved for connected graphs it follows that each component of $G$ has an f-factor. Hence (assuming the multiplicative axiom) there is a set $Z$ of f-factors of components of $G$ which contains just one f-factor of each component of $G$. The restriction of $G$ whose edges are the edges of the members of $Z$ is an f-factor of $G$.
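For a small finite graph the statement "G has an f-factor" can be checked directly by exhaustive search, which is a convenient way to experiment with Theorem XV. The sketch below is only an illustration and is not the augmentation procedure of the proof: it tries every set of edges and keeps the first one in which each vertex c has degree f(c). The function name `find_f_factor` and the representation of the graph (a vertex list, a list of vertex pairs, and a dictionary for f) are assumptions made for this example, and loops are assumed to be absent.

```python
from itertools import combinations


def find_f_factor(vertices, edges, f):
    """Return a list of edges forming an f-factor, or None if there is none.

    Brute force over all subsets of the edge list (illustrative only;
    exponential in the number of edges).
    """
    for size in range(len(edges) + 1):
        for chosen in combinations(range(len(edges)), size):
            degree = {c: 0 for c in vertices}
            for i in chosen:
                u, v = edges[i]
                degree[u] += 1
                degree[v] += 1
            # An f-factor has degree exactly f(c) at every vertex c.
            if all(degree[c] == f[c] for c in vertices):
                return [edges[i] for i in chosen]
    return None


square = [(0, 1), (1, 2), (2, 3), (3, 0)]
triangle = [(0, 1), (1, 2), (0, 2)]

one_factor_of_square = find_f_factor(range(4), square, {c: 1 for c in range(4)})
one_factor_of_triangle = find_f_factor(range(3), triangle, {c: 1 for c in range(3)})
two_factor_of_triangle = find_f_factor(range(3), triangle, {c: 2 for c in range(3)})
```

The triangle has no 1-factor because it has an odd number of vertices, while with f(c) = 2 the whole triangle is its own f-factor.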

9. n-factors. A necessary and sufficient condition for the existence of an n-factor of $G$, where $n$ is a given positive integer, can be obtained by applying Theorem XV to the special case in which the value of $f(c)$ is $n$ for each vertex $c$ of $G$.


It is convenient to denote the number of elements of a finite set $U$ by $\alpha(U)$. We then obtain the following

THEOREM XVI. $G$ has no n-factor if and only if there exist disjoint finite sets $S$ and $T$ of vertices of $G$ such that

$$n\alpha(T) < k(G(T), S) - \sum_{c \in S} \bigl(d_{G(T)}(c) - n\bigr). \tag{16}$$

Here $k(G(T), S)$ is the number of finite components $H$ of $(G(T))(S) = G(S \cup T)$ for which $n$ times the number of vertices differs in parity from the number of edges of $G$ which have one end in $S$ and the other end a vertex of $H$.

A necessary and sufficient condition for the existence of a 1-factor of a given locally finite graph $G$ has been given in previous papers [6; 7]. It is simpler in form than the expression obtained by writing $n = 1$ in (16). In the next section, this simpler formula is deduced from Theorem XV. The argument suggests no analogous simplification in the case $n > 1$.

10. An allied problem. Suppose that we are given a locally finite graph $G$, and a function $f$ which associates with each vertex $c$ of $G$ a positive integer $f(c)$. We consider the problem of associating with each edge $A$ of $G$ a non-negative integer $h(A)$ so that for each vertex $c$ of $G$ the sum of the numbers $h(A)$, taken over all edges $A$ of $G$ having $c$ as an end, is $f(c)$. If such a set of non-negative integers $h(A)$ exists we say that $G$ is f-soluble. We note that if $f(c) = 1$ for each vertex of $G$, then $G$ is f-soluble if and only if it has a 1-factor.

Let $T$ be any finite set of vertices of $G$. We denote by $S(T)$ the set of all vertices $c$ of $G$ having the following properties: (i) $c$ is not an element of $T$. (ii) Each edge of $G$ having $c$ as an end has its other end in $T$. If $T$ does not include every vertex of $G$ we denote by $k(T)$ the number of finite components $H$ of $G(T)$ having the following properties: (i) $H$ has more than one vertex. (ii) The sum of the numbers $f(a)$, taken over all vertices of $H$, is odd. If $T$ is the set of all vertices of $G$ we write $k(T) = 0$.

THEOREM XVII. $G$ is not f-soluble if and only if there exists a finite set $T$ of vertices of $G$ such that

$$\sum_{c \in T} f(c) < k(T) + \sum_{c \in S(T)} f(c). \tag{17}$$

Proof. By adjoining new edges to $G$ we can obtain a graph $G'$ having the following properties: (i) The vertices of $G'$ are the vertices of $G$. (ii) Two vertices are joined by an edge in $G'$ if and only if they are joined by an edge in $G$. (iii) If two vertices $a$ and $b$ are joined by an edge in $G'$, the number of distinct edges of $G'$ which join them is finite and not less than $d_G(a) + f(a)$.


Clearly $G'$ is locally finite. If $S$ and $T$ are disjoint finite sets of vertices of $G$ such that $S$ is contained in $S(T)$ it follows from the definition of $G'$ that

$$k(G'(T), S) = k(G(T), S). \tag{18}$$

It is clear that $G$ is f-soluble if and only if $G'$ has an f-factor. Hence, by Theorem XV, $G$ is not f-soluble if and only if there exist disjoint finite sets $S$ and $T$ of vertices of $G$ such that

$$\sum_{c \in T} f(c) < k(G'(T), S) - \sum_{c \in S} \bigl(d_{G'(T)}(c) - f(c)\bigr). \tag{19}$$

Suppose first that (17) is satisfied for some finite $T$. If $S(T)$ is not finite then all but a finite number of its elements have degree 0 in $G$, since $G$ is locally finite. Hence $G$ is not f-soluble since it has a vertex of degree 0. If $S(T)$ is finite it follows from (17) and (18) that (19) will be satisfied if we put $S = S(T)$. Hence $G$ is not f-soluble.

Conversely, suppose that $G$ is not f-soluble. Then (19) is satisfied for some disjoint finite sets $S$ and $T$. If possible let $a$ be any member of $S$ not in $S(T)$. Consider the effect of replacing $S$ by $S' = S - \{a\}$. Clearly the replacement diminishes

$$\sum_{c \in S} \bigl(d_{G'(T)}(c) - f(c)\bigr)$$

by $d_{G'(T)}(a) - f(a)$, that is, by at least $d_G(a)$, from (iii). The replacement diminishes $k(G'(T), S)$ by not more than $d_{G(T)}(a)$, the maximum number of finite components of $G'(S \cup T)$ joined to $a$ by an edge of $G'$. But $d_{G(T)}(a) \le d_G(a)$. Hence, if $a$ is not an element of $S(T)$, formula (19) remains valid when $S$ is replaced by $S'$. If $S'$ has an element not in $S(T)$ we repeat the argument with $S'$ replacing $S$, and so on. Since $S$ is finite we find, eventually,

$$\sum_{c \in T} f(c) < k(G'(T), U) + \sum_{c \in U} f(c), \tag{20}$$

where $U$ is the intersection of $S$ and $S(T)$. But, by (18), $k(G'(T), U)$ is equal to $k(T)$ plus the number of components of $G(T \cup U)$ which consist of a single vertex, the value of $f$ for this vertex being odd. Hence

$$k(G'(T), U) \le k(T) + \sum_{c \in S(T) - U} f(c). \tag{21}$$

Now (20) and (21) imply (17). This completes the proof of the theorem.

If $f(c) = 1$ for each vertex $c$ of $G$ it is clear that $G$ is f-soluble if and only if it has a 1-factor. Applying Theorem XVII to this case we find that $G$ has no 1-factor if and only if there exists a finite set $T$ of vertices of $G$ such that

$$\alpha(T) < h_u(T),$$

where $h_u(T)$ is the number of finite components of $G(T)$ having an odd number of vertices. This is the simple criterion for the existence of a 1-factor mentioned in §9.
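Criterion (17) is easy to test on small finite graphs. The sketch below, offered as an illustration rather than as part of the paper, decides f-solubility by exhaustive search over the values h(A) and, separately, looks for a set T satisfying (17). It assumes a finite graph without loops, given as a vertex list and a list of vertex pairs, and it reads G(T) as the graph obtained from G by deleting the vertices of T. The names `is_f_soluble`, `components` and `witness_17` are hypothetical.

```python
from itertools import combinations, product


def is_f_soluble(vertices, edges, f):
    """Search for non-negative integers h(A) whose sum at each vertex c is f(c)."""
    ranges = [range(min(f[u], f[v]) + 1) for u, v in edges]
    for h in product(*ranges):
        total = {c: 0 for c in vertices}
        for (u, v), value in zip(edges, h):
            total[u] += value
            total[v] += value
        if all(total[c] == f[c] for c in vertices):
            return True
    return False


def components(vertices, edges):
    """Connected components of the graph induced on `vertices`."""
    adjacent = {c: set() for c in vertices}
    for u, v in edges:
        if u in adjacent and v in adjacent:
            adjacent[u].add(v)
            adjacent[v].add(u)
    seen, result = set(), []
    for start in vertices:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            c = stack.pop()
            component.add(c)
            for d in adjacent[c] - seen:
                seen.add(d)
                stack.append(d)
        result.append(component)
    return result


def witness_17(vertices, edges, f):
    """Return a set T satisfying inequality (17), or None if no such T exists."""
    vertices = list(vertices)
    for size in range(len(vertices) + 1):
        for chosen in combinations(vertices, size):
            T = set(chosen)
            rest = [c for c in vertices if c not in T]
            # S(T): vertices outside T all of whose edges end in T.
            S_T = [c for c in rest
                   if all((v if u == c else u) in T
                          for u, v in edges if c in (u, v))]
            # k(T): components of G(T) with more than one vertex and odd f-sum.
            k_T = sum(1 for comp in components(rest, edges)
                      if len(comp) > 1 and sum(f[c] for c in comp) % 2 == 1)
            if sum(f[c] for c in T) < k_T + sum(f[c] for c in S_T):
                return T
    return None


triangle = [(0, 1), (1, 2), (0, 2)]
path = [(0, 1), (1, 2)]
```

For the triangle with f(c) = 1 the empty set T already satisfies (17), since the whole graph is a component with odd f-sum; with f(c) = 2 no set T works and h(A) = 1 on every edge is a solution.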

REFERENCES

1. H. B. Belck, Reguläre Faktoren von Graphen, J. Reine Angew. Math., vol. 188 (1950), 228-252.
2. T. Gallai, On factorization of graphs, Acta Mathematica Academiae Scientiarum Hungaricae, vol. 1 (1950), 133-153.
3. P. Hall, On representation of subsets, J. London Math. Soc., vol. 10 (1934), 26-30.
4. J. Petersen, Die Theorie der regulären Graphs, Acta Math., vol. 15 (1891), 193-220.
5. R. Rado, Factorization of even graphs, Quarterly J. Math., vol. 20 (1949), 95-104.
6. W. T. Tutte, The factorization of linear graphs, J. London Math. Soc., vol. 22 (1947), 107-111.
7. W. T. Tutte, The factorization of locally finite graphs, Can. J. Math., vol. 1 (1950), 44-49.

The University of Toronto

Reprinted from Canad. J. Math. 4 (1952), 314-328

A PARTITION CALCULUS IN SET THEORY

P. ERDOS AND R. RADO

1. Introduction. Dedekind's pigeon-hole principle, also known as the box argument or the chest of drawers argument (Schubfachprinzip) can be described, rather vaguely, as follows. If sufficiently many objects are distributed over not too many classes, then at least one class contains many of these objects. In 1930 F. P. Ramsey [12] discovered a remarkable extension of this principle which, in its simplest form, can be stated as follows. Let S be the set of all positive integers and suppose that all unordered pairs of distinct elements of S are distributed over two classes. Then there exists an infinite subset A of S such that all pairs of elements of A belong to the same class. As is well known, Dedekind's principle is the central step in many investigations. Similarly, Ramsey's theorem has proved itself a useful and versatile tool in mathematical arguments of most diverse character. The object of the present paper is to investigate a number of analogues and extensions of Ramsey's theorem. We shall replace the sets S and A by sets of a more general kind and the unordered pairs, as is the case already in the theorem proved by Ramsey, by systems of any fixed number r of elements of S. Instead of an unordered set S we consider an ordered set of a given order type, and we stipulate that the set A is to be of a prescribed order type. Instead of two classes we admit any finite or infinite number of classes. Further extension will be explained in §§2, 8 and 9. The investigation centres round what we call partition relations connecting given cardinal numbers or order types and in each given case the problem arises of deciding whether a particular partition relation is true or false. It appears that a large number of seemingly unrelated arguments in set theory are, in fact, concerned with just such a problem.
It might therefore be of interest to study such relations for their own sake and to build up a partition calculus which might serve as a new and unifying principle in set theory. In some cases we have been able to find best possible partition relations, in one sense or another. In other cases the methods available to the authors do not seem to lead anywhere near the ultimate truth. The actual description of results must be deferred until the notation and terminology have been given in detail. The most concrete results are perhaps those given in Theorems 25, 31, 39 and 43. Of the unsolved problems in this field we only mention the following question. Is the relation $\lambda \to (\omega_0 2, \omega_0 2)^2$ true or false? Here, $\lambda$ denotes the order type of the linear continuum.

The classical, Cantorian, set theory will be employed throughout. In some arguments it will be advantageous to assume the continuum hypothesis $2^{\aleph_0} = \aleph_1$ or to make some even more general assumption. In every such case these assumptions will be stated explicitly.

The authors wish to thank the referee for many valuable suggestions and for having pointed out some inaccuracies.

Part of this paper was material from an address delivered by P. Erdos under the title Combinatorial problems in set theory before the New York meeting of the Society on October 24, 1953, by invitation of the Committee to Select Hour Speakers for Eastern Sectional Meetings; received by the editors May 17, 1955.

2. Notation and definitions. Capital letters, except $\Delta$, denote sets, small Greek letters, except possibly $\pi$, order types, briefly: types, and $k, l, m, n, \kappa, \lambda, \mu, \nu$ denote ordinal numbers (ordinals). The letters $r, s$ denote non-negative integers, and $a, b, d$ cardinal numbers (cardinals). No distinction will be made between finite ordinals and the corresponding finite cardinals. Union and intersection of $A$ and $B$ are $A + B$ and $AB$ respectively, and $A \subset B$ denotes inclusion, in the wide sense. For any $A$ and $B$, $A - B$ is the set of all $x \in A$ such that $x \notin B$. No confusion will arise from our using 0 to denote both zero and the empty set. If $p(x)$ is a proposition involving the general element $x$ of a set $A$ then $\{x : p(x)\}$ is the set of all $x \in A$ such that $p(x)$ is true.

$\eta$ and $\lambda$ are the types, under order by magnitude, of the set of all rational and of all real numbers respectively. $\lambda$ will also be used freely as a variable ordinal in places where no confusion can arise. The relation $\alpha \le \beta$ means that every set, ordered according to $\beta$, contains a subset of type $\alpha$, and $\alpha \not\le \beta$ is the negation of $\alpha \le \beta$. To every type $\alpha$ there belongs the converse type $\alpha^*$ obtained from $\alpha$ by replacing every order relation $x < y$ by the corresponding relation $x > y$. We put

$$[m, n] = \{\nu : m \le \nu < n\} \qquad \text{for } m \le n.$$

The symbol

$$\{x_0, x_1, \ldots\}_<$$

denotes the set $\{x_0, x_1, \ldots\}$ and, at the same time, expresses the fact that $x_0 < x_1 < \cdots$. Brackets { } are only used in order to define sets by means of a list of their elements. For typographical convenience we write

$$\sum [x \in A] f(x)$$


instead of $\sum_{x \in A} f(x)$, and we proceed similarly in the case of products etc. or when the condition $x \in A$ is replaced by some other type of condition. The cardinal of $S$ is $|S|$ and the cardinal of $\alpha$ is $|\alpha|$. For every cardinal $a$, the symbol $a^+$ denotes the next larger cardinal. If $a = b^+$ for some $b$, then we put $a^- = b$, and if $a$ is not of the form $b^+$, i.e. if $a$ is zero or a limit cardinal, then we put $a^- = a$. Similarly, we put $k^- = l$, if $k = l + 1$, and $k^- = k$, if $k = 0$ or if $k$ is a limit ordinal. If S is ordered by means of the order relation x 1 we denote by $a'$ the least cardinal $n$ such that $a$ can be represented in the form $\sum [\nu < n] a_\nu$ where $a_\nu < a$ for all $\nu < n$. This cardinal $a'$, the cofinality cardinal belonging to $a$, is closely related to the cofinality ordinal $\mathrm{cf}(\beta)$ of an ordinal $\beta$ introduced by Tarski [17]. A regular cardinal is a cardinal $a$ such that $a' = a$. The least ordinal of a given cardinal $a$ is the initial ordinal belonging to $a$. Initial ordinals are the finite ordinals and the infinite ordinals $\omega_\nu$ of cardinal $\aleph_\nu$. We put

I I,

I I.

II

lx:xcs; Ixl =a}. In particular, [s]a=o if Isl
(f.£

AI'A. = 0

< v < k).

Fundamental throughout this paper is the partition relation a~

(b, d)2

introduced in [6]. More generally, for any a, b" k, r the relation (1)

is said to hold if, and only if, the following statement is true. The cardinals b. are defined for v < k. Whenever

lsi

=

a;

[s]r =

L

[v < k]K.,

then there are BCS; v
I BI

= b.;

For k <wo we also write (1) in the form

a ~ (b o, bl ,

... ,


bk_l)r,


and if k is arbitrary, and b,=b for all v
We also introduce partition relations between types. By definition, the relation (2)

a

~ «(30, (31, ••• )~

holds if, and only if, (3, is defined for v
[s]r

S = a;

=

1:[1/ < k]K..

there are BCS; v
11

= (3.;

If k <wo, or if all (3, are equal to each other, we use an alternative notation for (2) analogous to that relating to (1). The negation of (1), and similarly in the case of (2), is denoted by a "* (b o, b1,

•••

)~.

We mention in passing that the gulf between (1) and (2) can be bridged by the introduction of more general partition relations referring to partial orders. These will, however, not be considered here. If a ~No then, clearly, a' is the least cardinal N" such that a"*(a)~... Also, Nm is regular if, and only if, Nm~(Nm)~.. for all n < m. Finally, the relation a~(bri, bi, ... )! is equivalent to ':E [1/
{y: {y, x}< C S};

R(x) = {y: {x, y}< C S}.

If, in addition, [S]r= ':E[v
In the special case r=2, we put, for xES; v
=

Inx E A ]W(x).

Also, W(O) = S. If n <wo, then we write W(xo, ... , X,,-I) instead of


W( {Xo, ... , xn-d). It will always be clear from the context to which ordered set S and to which partition of [S]r these functions refer. We shall occasionally make use of the notation and the calculus of partitions (distributions) summarized in [5, p. 419]. The meaning of canonical partition relations

* ({3)r

a _

and that of polarized partition relations ao at-l

-

boo

b~

. • .•... bt- 1 •o

r··"-'

••• JA,

will be given in §§8 and 9 respectively. The relation defined in §4.

a-Om will

be

3. Previous results.

A].

THEOREM

1. If k <We then No-(No)~ [12, Theorem

THEOREM

2. If k, n <wo, then, for some f=f(k, n, r)
f - (n)~ [12, Theorem B]. THEOREM

(ii)

3. (i) If a ;?;N o, then a-(No, a)2. NW o)2.

NWo~(Nl'

(i) is proved in [2, 5.22]. This formula will be restated and proved as Theorem 44. (ii) is in [3, p. 366] and will follow from Theorem 36 (iv).

4. (i) If a ;?;N o, then (a 4)+_(a+)!. (ii) If a;?;No, then a4~(3)!. (iii) If 2Nn = Nn+l, then N n+2-(N,,+1, N n+2)2.

THEOREM

(i) is given in [3] and will be deduced as a corollary of Theorem 39. (ii) is in [3, p. 364], and (iii) is [3, Theorem II] and follows from Theorem 7(i).1 5. If4>~A; 14>1 >No, then, for a<wo2; {j<w~; 'Y<Wl, (i) 4>-(wo, 'Y)2. (ii) 4>-(a, {J)2.

THEOREM

1 The partition relations occurring in (i) and (ii) are to be interpreted in the obvious way. Their formal definition is given in §4.

183

432

P. ERDOS AND R. RADO

[September

(i) is [5, Theorem 5 J, and (ii) is [5, Theorem 7]. Both results will follow from Theorem 31. THEOREM

6. 17~(No, 17)2.

This relation, a cross between (1) and (2), has, by definition, the following meaning. If 5="1; [S]2=Ko+Kt. then there is ACS such that either or

A

= "1;

[A]! C K1•

This result is [5, Theorem 4].

7. If a~No, and if b is minimal such that ab>a, then a+)2 ab~(b+, a+)2.

THEOREM

(i) (ii)

a+~(b,

These results are contained in [6, p. 437]. (i) will follow from Theorem 34. 2 THEOREM 8. If 2tt • =N.+dor all v, and if a is a regular limit number then, for every b
This result is [6, Lemma 3], and will follow from Theorem 34. THEOREM

9. If q,~x;

14>1 = lxi, then X~(4), 4»1.

This result is due to Sierpinski who kindly communicated it to one of us. It will follow from Theorem 29. Our proof of Theorem 29 uses some of Sierpinski's ideas. THEOREM

10. For any a, a~(No, No)tt..

This is in [5, p. 434]. The last result justifies our restriction to the case of finite "exponents" r.
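The finite relation of Theorem 2 can be verified by exhaustive search in its smallest interesting instance, r = 2 with two classes. The following sketch is an illustration added here, not part of the paper: it checks a relation $n \to (b_0, b_1, \ldots)^2$ for a small finite n by running through every distribution of the pairs over the classes. The function name `arrows` and its argument convention (a tuple of required sizes, one per class) are assumptions of the example, and the search is exponential, so it is only usable for very small n.

```python
from itertools import combinations, product


def arrows(n, sizes):
    """Decide whether n -> (sizes[0], sizes[1], ...)^2 holds, by brute force.

    The relation holds if, however the pairs of {0, ..., n-1} are distributed
    over len(sizes) classes, some class c contains all pairs of a set of
    sizes[c] elements.
    """
    pairs = list(combinations(range(n), 2))
    index = {pair: i for i, pair in enumerate(pairs)}
    # For each class c: every candidate set A, stored as the indices of its pairs.
    candidates = [
        [[index[pair] for pair in combinations(A, 2)]
         for A in combinations(range(n), size)]
        for size in sizes
    ]
    for colouring in product(range(len(sizes)), repeat=len(pairs)):
        homogeneous = any(
            all(colouring[i] == c for i in pair_indices)
            for c, sets in enumerate(candidates)
            for pair_indices in sets
        )
        if not homogeneous:
            return False  # this distribution is a counterexample
    return True


six_arrows = arrows(6, (3, 3))
five_arrows = arrows(5, (3, 3))
```

Here `six_arrows` is true and `five_arrows` is false, so for k = n = 2... more precisely for two classes and sets of three elements, the least admissible value of f in Theorem 2 is 6.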

4. Simple properties of partition relations. THEOREM

11. The two relations

.. ) a (11

(i) a ~ (/30, /31, ••• );

.

~

( . • .• )~" /30,./31,

are equivalent. S By methods similar to those used in [17] one can show that (i) b:;ia' for all a>l, (ii) b=a' for those a>l for which tltto exist or not. Cf. [13, p. 224].


PROOF. Let (i) hold; 3'<=a*; [S]r=: E[II=a. Hence, by hypothesis, there are ACS; lI=fl.; [A ]rCK•. Then A< = fJ~. This proves (ii), and the theorem follows by reasons of symmetry. THEOREM 12. Let a-(fJo, fll' ...

)~;

a ~ a(1); k i6; k(1);

< kIll), ~ II < k).

(JI

(k(l) Then (1)

(1)

r

(1)

a - (flo ,fll , •.• )k(l). A n analogous result holds when the types a, fl. are replaced by cardinals.

PROOF. It suffices to consider the case of types. Let 3'(1)

= a (1) ;

Then there is SCS(1) such that

[s]r

3'

=a. Then

=E

[II

< k]K"

where K, = K~l) [S]r for II < k(t), and K, = 0 otherwise. By hypothesis, there are ACS; JI
A

=

"fl.;

[A]r C K •.

Ihi6;k(1),then IAI = Ifl. I i6;r;O¢[A]rCK.which is a contradiction. Hence JI
)~

then

I a 1- (I flo I, I fl11, ... );. lsi = lal; [S]r= E[II
PROOF. Let order S so that 3'=a. Then there are ACS; JI
I I I I'

THEOREM 14. If fl. is an initial ordinal, for all JI
(4)

(flo, fll' •.. );, I m1- (I flo I, Ifld, ... ); m-

are equivalent.

PROOF. By Theorem 13, (3) implies (4). Now suppose that (4)


lSI Iml,

holds. Let S=mj [S]r= ~[v
IAI

THEOREM 15. If 1 +a~(1 +iJo, 1 +iJlt

...

)i+ r , then

a ~ (flo, fll' ... )~.

In this proposition, 1+a and l+iJ. may be replaced by a+1 and iJ,+l respectively. Also, the types a, iJ, may be replaced by cardinals.

PROOF. Let S=a. Let Xo be an object which is not an element of S, and put So = S + {Xo }. The order of S is extended to an order of So by stipulating that xoEL(S). Then So = 1 +a. Now let [S]r = ~[v
A = fl.;

This proves the first assertion. Next, if a+1~(iJo+1, ... )~+\ then, by Theorem 11 and the result just obtained, we conclude that

* 1 + a * ~ (1 + flo,* 1 + fll, * •.. a * ~ (flo,

r h;

a

•.•

hl+r,

~ (flo, . . . )~.

Finally, let 1 +a~(1 +b o, 1+bl, ... )i+ r • Let a and iJ, be the initial ordinals belonging to a and b, respectively. Then, by Theorems 14 and 13, 1 +a~(l +iJo, ... )i+ r , a ~ (b o, ... )~.

In this proposition the types a,

iJ", 'Y, may be replaced by cardinals.

In formulating the last theorem we use an obvious extension of the symbol (2). PROOF. We consider the case of types. Let S=a,

[s]r

Put Ko =

= ~[X

L [}..
OA •

< l]Ko). + ~[O < v < 1 + k]K,. Then, by hypothesis, there are A CS; v < 1 +k


such that A ={3.; [A ]rCK•. 1f '11>0, then this is the desired conclusion. If '11=0, then A ={3o; [A ]rc:2: [X . and so, by hypothesis, there are BCA; X. =(3p). (X . is a one-one mapping of [0, l] into [0, k] such that {3."?:.r for 'liE [0, k] - {p>.:X
ex -4 ('Yo, 'YI, ••• )~.

In particular, the condition on the mapping X-4P>. IS satisfied whenever this mapping is on [0, k]. The types a, {3. may be replaced by cardinals. PROOF. Let N= {p,,:X.. Now let S=a; [S]r= :2: [X .. Then

[s]r =

:2:[" E N]K". + :2:[" E

[0, k] - N]O. By hypothesis, there are A CS; v
"EN;

[A]r C K".

or (ii)

"EE N;

In case (ii), di A = {3. which contradicts the hypothesis. Hence (i) holds, if ='Y,,; [A]rCK", where X=O".
If a-4({3)~;

Ikl = Ill, then a-4({3)~.

This shows that, as far as k is concerned, the truth of the relation Ikl. We are therefore able to introduce the relation

a-4({3)~ depends only on

ex -4 (~)~

which, by definition, holds if, and only if,


a

-+


(,3);

Ikl =d. A similar remark

for some, and hence for all, k such that applies to the relation a -+ (b)~.

THEOREM 18. Let k<wo; a-+({jo, f31, ...

)~;

+ ... + K"_l. Then there are sets M, NC [0, k] such that IMI + INI >k, S

[s]r

= a;

= Ko

,3,.E[K.]

for",EM;IIEN.

In the special case k = 2 we have either

(i)

,30

E

[Ko] [Kd

(ii) fh

or

E

[Ko][Kd

or (iii)

or

PROOF. Let

P. = {",:",

< k;,3,. E

[K.]},

Q. = [0, k] - p.

(II

I

< k).

NI

We have to find a set NC [0, k] such that ll[IIEN]P.1 >k-I or, what is equivalent, 1 E[IIEN]Q.I < I NI. If no such N exists, i.e. if [IIEN]Q.I E;; NI for all NC [0, k], then, by a theorem of P. Hall [8], it is possible to choose numbers p.EQ. such that p,.¢p, (",
IE

1

THEOREM 19. Let a-+({3, 'Y)2, and suppose that m is the initial ordinal belonging to a Then at least one of the following four statements holds. 4

I I.

(i) ,3

< ColO

(ii) 'Y

< ColO

(iii) ,3, 'Y ~ a, m

(iv) ,3, 'Y

~

a, m*.

PROOF. Let S be a set ordered by means of the relation x
(iii) means that ,9:;ia;

fI~m; "Y~a; "Y~m.


an ordinal. If ~~wo, then the contradiction wo*~~*=B«~S«=m follows. Hence ~ <woo Case 2. 'YE [Ko]dK1k Then, by symmetry, 'Y<wo. Case 3.~, 'YE [Kok Then, for some sets A, RCS, A<=A«=~i B< = B« ='Y, and ~, 'Y ~a, m. Case 4.~, 'YE [K1k Then, similarly, A<=A»=~i B<=B»='Yi ~, 'Y ~a, m*. This proves the theorem. COROLLARY. For every a, (5)

$$(r - 2) + \alpha \not\to (\omega_0, (r - 2) + \omega_0^*)^r \qquad (r \ge 2).$$

For none of the relations (i)-(iv) of Theorem 19 holds if $\beta = \omega_0$; $\gamma = \omega_0^*$. Hence $\alpha \not\to (\omega_0, \omega_0^*)^2$, and Theorem 15 yields (5). The method employed in the proof of Theorem 19, i.e. the definition of a partition of $[S]^2$ from two given orders of $S$, seems to have been first used by Sierpiński [15]. In that note Sierpiński proves $\aleph_1 \not\to (\aleph_1, \aleph_1)^2$. Cf. Theorem 30.
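The device of defining a partition of $[S]^2$ from two orders of $S$ has a simple finite analogue, sketched below as an illustration only; it is not the transfinite argument of the paper. Take $S = \{0, \ldots, n-1\}$ with its natural order and a second order given by a permutation, and put a pair into class 0 if the two orders agree on it and into class 1 otherwise. A set all of whose pairs lie in class 0 is then an increasing subsequence of the permutation, and one whose pairs lie in class 1 is a decreasing subsequence. The names `pair_class`, `largest_homogeneous` and `every_order_has_homogeneous` are hypothetical.

```python
from itertools import combinations, permutations


def pair_class(perm, x, y):
    """Class of the pair {x, y}: 0 if both orders agree on it, 1 otherwise."""
    lo, hi = min(x, y), max(x, y)
    return 0 if perm[lo] < perm[hi] else 1


def largest_homogeneous(perm, cls):
    """Size of the largest set A all of whose pairs lie in class `cls`."""
    n = len(perm)
    for size in range(n, 0, -1):
        for A in combinations(range(n), size):
            if all(pair_class(perm, x, y) == cls for x, y in combinations(A, 2)):
                return size
    return 0


def every_order_has_homogeneous(n, size):
    """True if every second order on n points yields a homogeneous set of `size` points."""
    return all(
        max(largest_homogeneous(p, 0), largest_homogeneous(p, 1)) >= size
        for p in permutations(range(n))
    )


example = [1, 0, 3, 2]  # second order on four points
```

For `example` neither class contains all pairs of a three-element set, whereas on five points every second order produces such a set in one of the two classes.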

THEOREM 20. (i) If ~o ~ a; ~o < r, then a--"(~o, ~11 any k, ~h ~2' • • • • (ii) If~. = r for II < k, then the two relations

•••

)~ holds for

,

(6)

a --" (flo, fl1' . . . , 'Yo, 'Y1, . . . h+I.

(7)

a --" ('Yo, 'Y1, .•. );

are equivalent. PROOF OF (i). If S=a; [S]'= :E[II .'<1 such that B='Yx; [B]rCKk+X. This proves (6). THEOREM 21. Let (8)

a --" (flo, fl1' •.. )~.

Then either (i) there is IIo

studied in the.case in which /3, ~a for all p
IBI

THEOREM 22. The following two tables give information about a number of cases in which the truth or otherwise oj any of the relations

(Po, fll .. .. )~,

(9)

a

(10)

a ---I' (ho, b" ... )~

-+

can be decided trivially.

k

~

0:

r ~ Ia l ,.

~

+

a

r> I a l r> a

k>O:

- --,-0

13. ;:;'cx

fJ. - ct

b. ~ a

b.""G

f3.~a;f3o:$a ~, ;j;. fJo;ta b. ~ a; bo>a b.>a ho
-

O
,-1 · 1>0

- ---

,>1· 1

,Bo ~ a:

(Jol r

bo';;:;a

bo>a

60<'

+

0<,<1·1 r=a>O

{JO~ Oi

+

±

- -- - -- +

- - - - - -- -

+

-

---

-

,>a


- -- ----- +


The proofs may be omitted. When a row or column is headed by two lines of conditions the first line refers to (9) and the second line to (10). Every condition involving the suffix JI is meant to hold for every JI
5. Denumerable order types. THEOREM 23.

If n <"'0;

0:

<"'02,

then

(11)

""on ~ (n, a)!,

(12)

""on~ (n

+ 1, ""0 + 1)2.

We may assume n>O. (a) In order to prove (12), consider the set S= {(JI, }.):JI
PAI'(Bo, Bl, ... , B,,_l) = (Co. Cl , ... , C,,-l), where CA=BB.,.; CI'=BI'-B, and C.=B. for JI¢}., J.I.. Then C'="'o; CALI (x)
I

I

(;)


operators PAp., corresponding to all choices of A, IJ, to the system (Eo, ... , E n - l ), applying each one of the operators, from the second onwards, to the system obtained by the preceding operator, and obtain, as end product, the system (Do, ... , D .._ l ). Then D.CA.; 15.=wo(v
< n).

Then, putting D = {x.:v
THEOREM 24. If $\alpha < \omega_0 4$, then

$$\alpha \not\to (3, \omega_0 2)^2, \tag{13}$$

$$\omega_0 4 \to (3, \omega_0 2)^2. \tag{14}$$

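This theorem is a special case of the following theorem.

THEOREM 25. Let $2 \le m, n < \omega_0$, and denote by $l_0 = l_0(m, n)$ the least finite number $l$ possessing the following property.⁵

Property $P_{mn}$. Whenever $p(\lambda, \mu) < 2$ for $\{\lambda, \mu\}_{\ne} \subset [0, l]$, then there is either $\{\lambda_0, \ldots, \lambda_{m-1}\}_{\ne} \subset [0, l]$ such that

$$p(\lambda_\alpha, \lambda_\beta) = 0 \qquad \text{for } \alpha < \beta < m,$$

or there is $\{\lambda_0, \ldots, \lambda_{n-1}\}_{\ne} \subset [0, l]$ such that

$$p(\lambda_\alpha, \lambda_\beta) = 1 \qquad \text{for } \{\alpha, \beta\}_{\ne} \subset [0, n].$$

Then

$$\omega_0 l_0 \to (m, \omega_0 n)^2, \tag{15}$$

$$\gamma \not\to (m, \omega_0 n)^2 \qquad \text{for } \gamma < \omega_0 l_0. \tag{16}$$

Moreover, if $l_1 \to (m, m, n)^2$, then $l_0 \le l_1$.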
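Property $P_{mn}$ is a statement about finitely many functions $p$, so for small $m$, $n$ the number $l_0(m, n)$ can be found by exhaustive search. The sketch below is an illustration under that reading of the property, with $p$ ranging over all 0-1 valued functions on ordered pairs of distinct elements of $[0, l]$; the names `has_property` and `l0` and the cut-off parameter `limit` are assumptions of the example, and the search grows far too quickly to be used beyond very small cases.

```python
from itertools import combinations, permutations, product


def has_property(l, m, n):
    """Check Property P_mn for the number l by trying every function p."""
    ordered_pairs = list(permutations(range(l), 2))
    for values in product((0, 1), repeat=len(ordered_pairs)):
        p = dict(zip(ordered_pairs, values))
        # First alternative: distinct lambda_0, ..., lambda_{m-1} with
        # p(lambda_a, lambda_b) = 0 whenever a < b.
        first = any(
            all(p[seq[a], seq[b]] == 0
                for a in range(m) for b in range(a + 1, m))
            for seq in permutations(range(l), m)
        )
        # Second alternative: n distinct elements with p = 1 on every
        # ordered pair of them.
        second = any(
            all(p[x, y] == 1 for x in A for y in A if x != y)
            for A in combinations(range(l), n)
        )
        if not (first or second):
            return False
    return True


def l0(m, n, limit=4):
    """Least l <= limit possessing Property P_mn, or None if there is none."""
    for l in range(limit + 1):
        if has_property(l, m, n):
            return l
    return None


least = l0(3, 2)
```

With these conventions `least` equals 4, in agreement with the value $l_0(3, 2) = 4$ used to deduce Theorem 24.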

Deduction of Theorem 24 from Theorem 25. We have to prove that $l_0(3, 2) = 4$. (i) By considering the function $p$ defined by

$$p(0, 1) = p(1, 2) = p(2, 0) = 0; \qquad p(2, 1) = p(1, 0) = p(0, 2) = 1,$$

we deduce that 3 does not possess the property $P_{32}$. (ii) Let us assume that 4 does not possess $P_{32}$. Then there is $p(\lambda, \mu)$ such that the condition stipulated for $P_{32}$ does not hold, with $l = 4$. If

$$\{\alpha, \beta, \gamma\}_{\ne} \subset [0, 4]; \qquad p(\alpha, \beta) = p(\alpha, \gamma) = 0,$$

then the assumption $p(\beta, \gamma) = 0$ would lead to

$$p(\alpha, \beta) = p(\beta, \gamma) = p(\alpha, \gamma) = 0,$$

i.e. to a contradiction. Hence $p(\beta, \gamma) = 1$ and, by symmetry, $p(\gamma, \beta) = 1$. This, again, is a contradiction. This argument proves that

$$\text{if } \{\alpha, \beta, \gamma\}_{\ne} \subset [0, 4];\ p(\alpha, \beta) = 0, \text{ then } p(\alpha, \gamma) = 1. \tag{17}$$

Since at least one of the numbers $p(0, 1)$, $p(1, 0)$ is zero, there is a permutation $\alpha, \beta, \gamma, \delta$ of 0, 1, 2, 3 such that $p(\alpha, \beta) = 0$. Then repeated application of (17) yields $p(\alpha, \gamma) = p(\alpha, \delta) = 1$; $p(\gamma, \alpha) = 0$; $p(\gamma, \delta) = 1$; $p(\delta, \alpha) = p(\delta, \gamma) = 0$, which contradicts (17). This proves $l_0(3, 2) \le 4$ and, in conjunction with (i), $l_0(3, 2) = 4$.

⁵ The existence of such a number $l$ follows from Theorem 2. It will follow from Theorem 39 that we may take $l = (1 + 3^{2m+n-5})/2$.

PROOF OF THEOREM 25. 1. We begin by proving the last clause. Let

$$l_1 \to (m, m, n)^2. \tag{18}$$

L. c [0, ld. Then Ko + K1 + K 2,

Suppose that p(X, fJ.) <2 for {X, fJ. [S]2

where S= [0,

=

ld, and K. is the set of all p(X, p.) = p(X, p.)

>

p(X, p.)

=

°

{X, fJ.}
ld such that (v = 0),

p(p., X)

(v = 1),

p(p., X) = 1

(v = 2).

By (18), there is Sl = {Xo, ... ,Xk-l}
k = mi

(20)

k

= mi

(21)

k

= ni

[sd 2 C K o, [sd 2 C KI, [sd 2 C K 2•

°

(19) implies that p(Xa, X{J) = for a < ,8 < mi (20) implies that p(Xm- 1- a, Xm- 1-{J) = for a < ,8 < mi (21) implies thatp(Xa, X{J) = 1 for {a, ,8} .. C [0, n]. This shows that [oem, n) ~ [1. 2. We now prove (15). Let l=lo(m, n)i A=[O, woll; N=[O, wo]i

[A]2

°

=

Ko

+ 'K1

(partition .6).

We use the notation of the partition calculus given in detail in [4, p. 419] which can be summarized as follows. If a is an equivalence


IAI

relation on a set M or a partition of M into disjoint classes then denotes the cardinal of the set of nonempty classes, and the relation

x == yeA) expresses the fact that x and y belong to M and lie in the same class of A. If, for pER, Ap is a partition of M, and if t---'l-f,(t) is a mapping of a set T into M, then the formula (t

E

T)

defines that partition A' of T for which S

== tCA')

if, and only if, fp(s)

== f,(t) (. A)

for pER.

We continue the proof of (15) by putting

+ 0', WoJA + r}) By Theorem 1 there is N'E [N]No such that IA'I =1 A'({O', r})

=

II [X, JA <

l]A({woX

by definition of A', there is p(X, f.t) {woX

+ 0', WofJ. + r} E

< 2 such that

(0' < r < wo).

in [N']2. Then,

for X, fJ. < I; {O',

K p (x.l')

r}< eN'.

By definition of 1 this implies that there is a set {Xo, ... , [0, l] such that either

c

(22)

for a

k = m;

xk-d

pi

< fJ < m

or (23)

for {a, fJ} .. C [0, n].

k = n;

If (22) holds, then we put

A'

=

{woX«

+ O'«:a < m},

where (1'a is chosen such that {(1'O, ( 1 ' l J " ' , (1'm-d
p(X«, X~)

¢


°

xm-d ..


for some {a, then

13}
(25) for some {a, 13 }.. C [0, n]. Then, if A = [0, 'Y], we have [A ]2=Ko+' K I , where Ko is the set of all {woX+q, wOJL+r} such that {X, JL }.. C [0, l]; q
{Xo, ... , xm-d .. c [0, l]; p(X", X~) =

+ O",,:a < m};

0"0

°

< ... < O"m-l < wo; for a

< fJ < m,

which contradicts (24). If, on the other hand, A" C A;

A" = won;

[A"]2 C K I ,

then there is {Xo, .. " x,,-d
x-

(ao, aI, . . . )~

and their negatives. It turns outl that every positive relation we were able to prove holds not only for the particular type X of the set of all real numbers but for every type 4> such that (26) This fact seems to suggest that, given any type 4> satisfying (26), there always exists Xl such that i.e., that every nondenumerable type which does not "contain" WI or contains a nondenumerable type which is embeddable in the real continuum. This conjecture has, as far as the authors are aware, neither been proved nor disproved. 7 Throughout this section S denotes the set of all real numbers x such that O<x
WI·

• Cf. Theorems 31, 32. 7 Since this paper was submitted E. Specker has disproved this conjecture.


THEOREM 26. (i)

X-t-t (Wl)~

(ii)

X -t-t (r

for r

+ 0:0

~

0; k

for r

> o. ~

2.

PROOF. (i) is trivial, in view of Wlj;X. In order to prove (ii) it suffices, by Theorem 15, to consider the case r=2. Let {xv:v<wo} be the set of all rational numbers in S, and denote, for n <wo, by Kn the set of all {x, y} < such that the least v satisfying x <xv
r~3.

PROOF. By Theorem 15, we need only consider the case $r = 3$. We have $[S]^3 = K_0 +' K_1$, where $K_0 = \{\{x, y, z\}_< : y - x$

$x_m - x_0 < x_{m+1} - x_m$.

If $m \to \infty$, then the contradiction $u - x_0 \le u - u$ follows. ASSUMPTION 2. Let $A \subset S$; $\bar{A} = \omega_0 + 2$; $[A]^3 \subset K_1$. Then $A = B + \{y, z\}_<$; $B = \{x_0, x_1, \ldots\}$

THEOREM 28. $\lambda \nrightarrow (r + 1, \omega_0 + 2)^r$ for $r \ge 4$.
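The first half of the proof that follows rests on a finite fact: no five reals can have all of their four-element subsets in the class $K_0$ of quadruples whose middle gap is smaller than both outer gaps. The Python sketch below checks this exhaustively on a small integer grid. The function names and the grid size are illustrative assumptions and are not part of the paper.

```python
from itertools import combinations

def in_K0(quad):
    """True if the middle gap of the sorted quadruple is smaller than both outer gaps."""
    x0, x1, x2, x3 = sorted(quad)
    middle = x2 - x1
    return middle < x3 - x2 and middle < x1 - x0

def all_quads_in_K0(points):
    """True if every 4-element subset of `points` lies in K0."""
    return all(in_K0(q) for q in combinations(points, 4))

def find_homogeneous_five(grid):
    """Return a 5-element subset of `grid` all of whose quadruples lie in K0, or None."""
    for five in combinations(grid, 5):
        if all_quads_in_K0(five):
            return five
    return None

# Exhaustive search over all 5-subsets of {0, ..., 11} (hypothetical grid size).
result = find_homogeneous_five(range(12))
```

The search comes back empty because the quadruples $\{x_0, x_1, x_2, x_3\}$ and $\{x_1, x_2, x_3, x_4\}$ impose the opposite inequalities on $x_2 - x_1$ and $x_3 - x_2$, which is exactly the contradiction drawn under Assumption 1.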

PROOF. It suffices to consider the case $r = 4$. We have $[S]^4 = K_0 +' K_1$, where $K_0 = \{\{x_0, x_1, x_2, x_3\}_< : x_2 - x_1 < x_3 - x_2,\ x_1 - x_0\}$. ASSUMPTION 1. Let $[\{x_0, x_1, x_2, x_3, x_4\}_<]^4 \subset K_0$. Then $\{x_0, x_1, x_2, x_3\} \in K_0$, and hence $x_2 - x_1 < x_3 - x_2$. Also, $\{x_1, x_2, x_3, x_4\} \in K_0$, and hence $x_3 - x_2 < x_2 - x_1$. This is a contradiction. ASSUMPTION 2. Let $A \subset S$; $\bar{A} = \omega_0 + 2$; $[A]^4 \subset K_1$. We define $B, y, z, x_\nu, u$ as in the proof of Theorem 27. Then there is $m_0 < \omega_0$ such that, for $m_0 \le m < \omega_0$, $u - x_m < x_m - x_0$. Then, for $m_0 \le m < \omega_0$, $\{x_0, x_m, x_{m+1}, z\} \in K_1$; $x_{m+1} - x_m$
Ikl

~

Ixl;

X-t-t

la.1

~

(a(), alJ •.•

Sierpinski proved that X-t-t(a,

a)!


Ixi

(v
then

1

h.

if a~X;

lal = Ixi

(Theorem 9).


PROOF.

Case 1. There is /-I < k such that a" $ X. We consider the partition S=:E' [vfA(X) is a mapping of Ao on A. We extend the definition offA by putting fA (x) = 0 for xEL(Ao) and for x EE L(Ao). Then fA (x) is nondecreasing in S. The set A is uniquely determined by the function fA and the set A o. Let D (A) be the set of those Xo for which fA (x) is discontinuous at x =Xo. Then ID(A) I ~~o. The functionfA is uniquely determined by (i) the set D(A) and (ii) the values of fA (x) for xED(A) and (iii) the values of fA (x) for all rational x. Therefore

I :E {A} I = I :E {fA} I ~ I X13~o = I XI ~ I :E {A} I. I :E{A} 1= Ixi =~n' say. Now we can write :E{A} =

{Aop: p<w n }. By symmetry, we have, for every v
and

Xvp E A. p - {X"a: (tL, 0-)

<

(II, p) }.

I {(/-I, 0'):(/-1, 0') «v, p)}1
THEOREM 30. $|\lambda| \nrightarrow (\aleph_1, \aleph_1)^r$ for $r \ge 2$.

PROOF. The substance of this theorem is due to Sierpinski [15]. By Theorem 15 we need only consider the case r = 2. Let x ~S>=X*; A>=A«~S«=wn; A> I A <~l'

Ixi

<Wt;


This proves Theorem 30. We note that this theorem is, in fact, an easy corollary of [5, Example 4A].

7. The general case. We shall consider relations involving certain types of cardinal $\aleph_1$ as well as relations between types of any cardinal. We begin by proving a lemma. We establish this lemma in a form which is more general than will later be required, but in this form it seems to possess some interest of its own. We recall that $a'$ denotes the cofinality cardinal belonging to $a$ which was defined in §2.

LEMMA 1. Let $S$ be an ordered set, and $|S|' = \aleph_n$; $\omega_n, \omega_n^* \not\le \bar{S}$. Then, corresponding to every rational number $t$, there is $S_t \subset S$ such that $|S_t| = |S|$; $S_t \subset L(S_u)$ for $t < u$.

Sierpiński, in a letter to one of us, had already noted the weaker result that, if $|S| = \aleph_1$; $\omega_1, \omega_1^* \not\le \bar{S}$, then $\eta \le \bar{S}$.

PROOF.

Case 1. There is $A \subset S$ such that

$|AL(x)| < |A| = |S|$ $(x \in A)$.

Then we define x. for I' <w" inductively as follows. Let Po <w .. ; x.EA(p<po). Then, by definition of n, and hence

I L:[I' < I'o](AL(x.) + {x.D I < I A I, there is x.oEA - E [I' <po](L(xo) + {x.}).

(p.
Then XjI<x,

Case 2. There is $A \subset S$ such that

$|AR(x)| < |A| = |S|$ $(x \in A)$.

Then, by symmetry, the contradiction $\omega_n^* \le \bar{S}$ follows. Case 3. There is $A \subset S$ such that

$\min(|AL(x)|, |AR(x)|) < |A| = |S|$ $(x \in A)$.

Then we put

$A_0 = \{x : x \in A;\ |AL(x)| < |A|\}$, $A_1 = \{x : x \in A;\ |AR(x)| < |A|\}$.

Then $A = A_0 + A_1$. Case 3.1. $|A_0| = |S|$. Then $|A_0L(x)| \le |AL(x)|$ $(x \in A_0)$, and hence, by Case 1, we find a contradiction. Case 3.2. $|A_0| \ne |S|$. Then $|A_1| = |S|$ and, by symmetry, a contradiction follows. We have so far proved that, if $A \subset S$; $|A| = |S|$, there is $z \in A$ such that $|AL(z)| = |AR(z)| = |S|$. Then $A = A' + A''$, where $A' = AL(z)$;


$|A'| = |A''| = |S|$; $A' \subset L(A'')$. By applying this result to $A'$ we find a partition $A = A(0) + A(1) + A(2)$ such that

$|A(\nu)| = |S|$ $(\nu < 3)$; $A(\nu) \subset L(A(\nu + 1))$ $(\nu < 2)$.

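Repeated application leads to sets $A(\lambda_0, \lambda_1, \ldots, \lambda_{k-1})$ $(k < \omega_0;\ \lambda_\nu < 3)$ such that

$|A(\lambda_0, \ldots, \lambda_{k-1})| = |S|$;

$A(\lambda_0, \ldots, \lambda_{k-1}) = \sum[\nu < 3]\, A(\lambda_0, \ldots, \lambda_{k-1}, \nu)$;

$A(\lambda_0, \ldots, \lambda_{k-1}, \nu) \subset L(A(\lambda_0, \ldots, \lambda_{k-1}, \nu + 1))$ $(\nu < 2)$.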

Let $N$ be the set of all systems $(\lambda_0, \ldots, \lambda_k)$ such that $k < \omega_0$; $\lambda_\nu \in \{0, 2\}$ $(\nu < k)$; $\lambda_k = 1$, ordered alphabetically. More accurately, if $p = (\lambda_0, \ldots, \lambda_k)$ and $q = (\mu_0, \ldots, \mu_l)$ are elements of $N$, then we put

$p$
$< (\mu_0, \ldots, \mu_{l-1}, 0, 2, 2, \ldots, 2, 1) < (\mu_0, \ldots, \mu_l)$,

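provided only that the inner bracket contains a sufficiently large number of two's. Lemma 1 is proved.

THEOREM 31. Suppose that $\varphi$ is a type such that

$|\varphi| > \aleph_0$; $\omega_1, \omega_1^* \not\le \varphi$.

Let $\alpha < \omega_0 2$; $\beta < \omega_0^2$; $\gamma < \omega_1$. Then

(27) $\varphi \to (\alpha, \alpha, \alpha)^2$,

(28) $\varphi \to (\alpha, \beta)^2$,

(29) $\varphi \to (\omega_0, \gamma)^2$,

(30) $\varphi \to (4, \alpha)^3$.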

THEOREM 32. Let $\varphi, \alpha, \gamma$ be as in Theorem 31. Let $S$ be an ordered set, $\bar{S} = \varphi$, and $[S]^2 = K_0 + K_1$. Then (a) there is $V \subset S$ such that either

(i) $\bar{V} = \alpha$; $[V]^2 \subset K_0$,

or


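or (iii) $\bar{V} = \omega_0\gamma^*$; $[V]^2 \subset K_1$,

and

(b) there is $W \subset S$ such that either

(i) $\bar{W} = \omega_0 + \omega_0^*$; $[W]^2 \subset K_0$,

or

(ii) $\bar{W} = \gamma$; $[W]^2 \subset K_1$,

or

(iii) $\bar{W} = \gamma^*$; $[W]^2 \subset K_1$.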

In proving Theorems 31 and 32 we may assume that $|\varphi| = \aleph_1$. There is $m$ such that

$4 \le m < \omega_0$; $\alpha \le \omega_0 + m$; $\beta \le \omega_0 m$.

Let $\bar{S} = \varphi$. The letters $A, B, P, Q$ denote subsets of $S$, and we shall always suppose, in the proofs of the last two theorems, that

$\bar{P} = \bar{Q} = \omega_0$.

PROOF OF THEOREM 31, (29). Let $[S]^2 = K_0 + K_1$, and

(31) $\omega_0 \notin [K_0]$.

We want to deduce that

(32) $\gamma \in [K_1]$.

There is $B$ such that

(33) $|BR_0(x)| \le \aleph_0$ $(x \in B)$.

For otherwise, there would be elements $x_\nu$ such that

$x_0 \in S$; $|R_0(x_0)| = \aleph_1$,
$x_1 \in R_0(x_0)$; $|R_0(x_0, x_1)| = \aleph_1$,

generally, $x_\nu \in R_0(x_0, \ldots, x_{\nu-1})$, $|R_0(x_0, \ldots, x_\nu)| = \aleph_1$ $(\nu < \omega_0)$.

Then $[\{x_0, x_1, \ldots\}$

L: [t rational]B(t) C B for some sets B(t) such that B(t)CL(B(u» (t
IL

[v

< vo]BRo(x.) I ~

~o

 ~o. Then there are x.,

A.(v~m)

such that

Xo E Ao = S; and so on, up to

Am = Am_ILo(xm_l) = AoLo(Xo,

Xl, • • . ,

Xm-l).

Then, by (29), $\bar{A}_m \to (\omega_0, \gamma)^2$; $\gamma \notin F_1(A_m)$, and hence $\omega_0 \in F_0(A_m)$. There is $P \subset A_m$ such that $[P]^2 \subset K_0$. Then $[P + \{x_0, \ldots, x_{m-1}\}]^2 \subset K_0$ which contradicts (34). Hence our assumption is false, and there is $A$ such that

(35) $|AL_0(x)| \le \aleph_0$ $(x \in A)$.

By Lemma 1, there is $B(t) \subset A$, for rational $t$, such that $B(t) \subset L(B(u))$ $(t < u)$. There are rational numbers $t_\nu$ $(\nu < \gamma)$ such that $t_\mu > t_\nu$ $(\mu$
< Vo ]P.).

Then, by (29), $\bar{B}' \to (\alpha, \omega_0)^2$; $\alpha \notin F_0(B')$; $\omega_0 \in F_1(B')$, and there is $P_{\nu_0} \subset B'$ such that $[P_{\nu_0}]^2 \subset K_1$. This defines $P_\nu$ for $\nu < \gamma$. Put $\sum[\nu < \gamma] P_\nu = X$. Then $\bar{X} = \omega_0\gamma^*$; $[X]^2 \subset K_1$. But this contradicts (34), and so (a) is proved. PROOF OF THEOREM 32 (b). Let the hypotheses be satisfied but (b) be false. Then


(36)


'Y

Choose any A.

I

* EE [Kd·

I

ARo(x) ~No (xEA). Then, by Lemma 1, there are sets B(t)CA, for rational t, such that B(t) CL(B(u)) (t
:E [v < Vo ]Ro(x.).

X'o E B(t. o) -

Then the set $X = \{x_\nu : \nu < \gamma\}$ satisfies $\bar{X} = \gamma$; $[X]^2 \subset K_1$ which is a contradiction against (36). Hence our assumption is false, i.e., given any $A$, there is $x \in A$ such that $|AR_0(x)| = \aleph_1$. By symmetry, it follows that there also is $y \in A$ such that $|AL_0(y)| = \aleph_1$. By alternate applications of these two results we obtain elements $x_\nu, y_\nu$ and sets $A_\nu, B_\nu$ $(\nu < \omega_0)$ such that the following conditions are satisfied.

$x_0 \in S$; $y_0 \in R_0(x_0) = B_0$;

$x_1 \in B_0L_0(y_0) = A_1$; $y_1 \in A_1R_0(x_1) = B_1$;

generally, for $\nu < \omega_0$,

Then the set $\sum[\nu < \omega_0]\{x_\nu, y_\nu\} = D$ satisfies $\bar{D} = \omega_0 + \omega_0^*$; $[D]^2 \subset K_0$. This contradiction against (36) completes the proof of Theorem 32. PROOF OF THEOREM 31, (27). Let $[S]^2 = K_0 + K_1 + K_2$,

(37) $\alpha \notin [K_\nu]$ $(\nu < 3)$.

Our aim is to deduce a contradiction. We shall reduce the general case to more and more special cases. For the sake of convenience of notation we shall use the same notation for the sets in question at each stage. We put $K_{12} = K_1 + K_2$. The functions $F_{12}, L_{12}, R_{12}$ refer to $K_{12}$ in the same way as the functions $F_\nu, L_\nu, R_\nu$ refer to $K_\nu$. Let $A \subset S$. By Lemma 1, there are sets $A_0, A_1 \subset A$ such that $A_0 \subset L(A_1)$. Let $x_0 \in A_1$. Then $|AL(x_0)| = \aleph_1$, and there is $\nu_0 < 3$ such that $|AL_{\nu_0}(x_0)| = \aleph_1$. By repeating this argument we find numbers $\nu_\rho < 3$ and elements $x_\rho$ $(\rho < \omega_0)$ such that

$x_\rho \in SL_{\nu_0}(x_0)L_{\nu_1}(x_1) \cdots L_{\nu_{\rho-1}}(x_{\rho-1})$,

$|SL_{\nu_0}(x_0) \cdots L_{\nu_\rho}(x_\rho)| = \aleph_1$ $(\rho < \omega_0)$.

There are $\rho_0$

Then there is $P \subset A_0$ such that $[P]^2 \subset K_0$. Then $\alpha \le \bar{C}$; $[C]^2 \subset K_0$, where $C = P + \{x_{\rho_\nu} : \nu < m\}$, which contradicts (37). Hence the Assumption 1 is false, and we have $\omega_0 \notin F_0(A_0)$. We may assume that

(38) $\omega_0 \notin [K_0]$.

For a later application we remark that in what follows we may replace $S$ by any nondenumerable subset of $S$ without any of the conclusions becoming invalid. Now let $A \subset S$. Then, by (29), $\bar{A} \to (\omega_0, \alpha)^2$. Also, $\omega_0 \to (\omega_0, \omega_0)^2$. Therefore, by Theorem 16, $\bar{A} \to (\omega_0, \omega_0, \alpha)^2$. Hence at least one of the following three relations holds.

(i) $\omega_0 \in F_0(A)$, (ii) $\omega_0 \in F_1(A)$, (iii) $\alpha \in F_2(A)$.

Since (i) and (iii) are false, it follows that

(39) $\omega_0 \in F_1(A)$ $(A \subset S)$.

By symmetry,

(40) $\omega_0 \in F_2(A)$ $(A \subset S)$.

ASSUMPTION 2. There are $x_\nu, A_\nu$ $(\nu < \omega_0)$ such that $x_0 \in A_0$; $A_0R_0(x_0) = A_1$; $x_1 \in A_1$; $A_1R_0(x_1) = A_2$; $x_\nu \in A_\nu$, etc. Then $[\{x_0, x_1, \ldots\}_<]^2 \subset K_0$ which contradicts (38). Hence the Assumption 2 is false, and there are $\nu_0 < \omega_0$; $x_\nu \in S$ $(\nu < \nu_0)$ such that we may put $A = R_0(x_0, \ldots, x_{\nu_0 - 1})$ and we then have $|AR_0(x)| \le \aleph_0$ $(x \in A)$. We may assume that

(41) $|R_0(x)| \le \aleph_0$ $(x \in S)$.

By Lemma 1, there are sets $A, B$ such that $A \subset L(B)$. By (39), there is $P \subset A$ such that $[P]^2 \subset K_1$. For a later application we remark that at this stage we might have applied (40) in place of (39) and in this way could have interchanged the roles of $K_1$ and $K_2$. By (41), $|\sum[x \in P] R_0(x)| \le \aleph_0$, and hence $|BR_{12}(P)| = \aleph_1$. Therefore we may assume

(42) $P \subset L_{12}(S - P)$.

ASSUMPTION 3. If $Q \subset P$; $A \subset S$, then there is $x \in A$ such that

$|QL_1(x)| = \aleph_0$.

Now we argue as follows. By Lemma 1, there are sets A.CS-P such that A"CL(A.) (IL
$x_\nu \in A_\nu$; $P_\nu \subset P$ $(\nu < \nu_0)$,

$|P_\nu - P_\mu| < \aleph_0$ $(\mu < \nu < \nu_0)$.

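Then we can write $[0, \nu_0] = \{\rho_\lambda : \lambda < \omega_0\}$. We can choose $y_\lambda$ such that

$y_\lambda \in P_{\rho_0}P_{\rho_1} \cdots P_{\rho_\lambda} - \{y_0, \ldots, y_{\lambda-1}\}$ $(\lambda < \omega_0)$.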

By (41) and Assumption 3, there is x.oEA. o- L [v
I

I p' - LI(xl'r) I ~ I pI - Pl'r I + I Pl'r - LI(xl') I < ~o

+ o.

By summing over r we obtain I PI-LI(D) I <~o. Hence we may put P'LI(D) = Q, and we then have Q+D ~O'; [Q+D ]2CKI which contradicts (37). Case 2. There is DCX such that 15=0'; [D]2CK2. This, again, contradicts (37). Hence the Assumption 3 is false, i.e., there are P'CP; A'CS such that (x

E A').

Then there is A"CA' such that the set PILI(x) is constant for xEA". Then there is pI! such that P IL 2(x) =pl! (xEA"). We have therefore proved that there is pI!, A" such that (43) The whole argument from (38) onwards remains valid if S is replaced by any set A. Hence it follows from (43) that if A CS, then there are P, A'CA such that (44)

By Lemma 1, there are A o, Bo such that AoCL(Bo). By repeated application of (44) we obtain sets P., A: (p<wo) such that

+ At C Ao; PI + At CAt;

Po

[p O]2 C K I; [pd 2 C K I;

C L2(Arf), PI C L2(A{) ,

Po

generally, P.+A:CA._ I ; [P.]2CKI; P.CL 2 (A:) (O

(p.

< v < wo).

We put Bl=BoRuCPo+P1 + ... ). Then we have the result that there are sets P., Bl (v<wo) such that

{

(45)

[P.]2 C K I ; PI'

< wo), < v < wo). (v

C L 2(P.)

(p.

Now let $\nu_0 < \omega_0$; $B_2 \subset B_1$; $P' \subset P_{\nu_0}$. ASSUMPTION 4. $|P'L_2(x)| < \aleph_0$ $(x \in B_2)$. Then there is $B_3 \subset B_2$ such that the set $D = P'L_2(x)$ is constant for $x \in B_3$. By (39), there is $Q \subset B_3$ such that $[Q]^2 \subset K_1$. Then $[(P' - D) + Q]^2 \subset K_1$; $\omega_0 2 \in [K_1]$ which contradicts (37). Hence the Assumption 4 is false, i.e.

(46) if $\nu_0 < \omega_0$; $P' \subset P_{\nu_0}$, then $|\{x : x \in B_1;\ |P'L_2(x)| < \aleph_0\}| \le \aleph_0$.

To $B_1$ the same argument applies as to $S$, from (38) onwards. The only change we make is that, after (41), we apply (40) instead of (39), so that now the roles of $K_1$ and $K_2$ are interchanged. We find sets $Q_\nu, B_2 \subset B_1$ such that, in analogy to (45), (46), the following statements are true.

(47) $Q_\nu \subset L_{12}(B_2)$ $(\nu < \omega_0)$, $Q_\mu \subset L_1(Q_\nu)$ $(\mu < \nu < \omega_0)$.

(48) If $\nu_0 < \omega_0$; $Q' \subset Q_{\nu_0}$, then $|\{x : x \in B_2;\ |Q'L_1(x)| < \aleph_0\}| \le \aleph_0$.

By Lemma 1, there is B: CB2 (v <wo) such that B: CL(B:) (f.L
are at most ~o elements xEB2 such that at least one of the relations

I Q: LI(X) I < ~o holds. By using this result repeatedly we find elements XA (A <wo) such that, for all v <wo,

Xo E Brf; Xl

E

B{;

I P.L2(xo) I = I Q.LI(xo) I = ~o, I P.L2(xo, Xl) I = I Q.LI(xo, Xl) I = ~o,

generally, xAEB{;

I P.L2(xo,

... , XA)

I = I Q.LI(xo, ... , XA) I =

~o

(v, X < wo).

Since wo-t(wo)~, there is a number v <3 and a sequence >'0 <>'1 <


... ;


Xp<Wo, such that [{x>.e' X>.p··· 1<]2cK•. By (38), v;;060. We can choose y,., z,. such that, for p. <wo, y,. E P,.L2(xo,

Xl, • • • ,

z,. E Q,.L1(xo, ... , XA"'_l)'

X>''''_l);

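Put $X = \{x_{\lambda_\rho} : \rho < m\}$; $Y = \{y_\mu : \mu < \omega_0\}$; $Z = \{z_\mu : \mu < \omega_0\}$. Case 1. $\nu = 1$. Then $[Z + X]^2 \subset K_1$; $\alpha \in [K_1]$. Case 2. $\nu = 2$. Then $[Y + X]^2 \subset K_2$; $\alpha \in [K_2]$. In either case, a contradiction against (37) follows. This proves (27). PROOF OF THEOREM 31, (28). If $[S]^2 = K_0 + K_1$ and if we put $\gamma = \omega_0 m$ then we have, by Theorem 32 (a), either (i) $\alpha \in [K_0]$ or (ii) $\beta \le \gamma \in [K_1]$ or (iii) $\beta \le \omega_0 m \le \omega_0\gamma^* \in [K_1]$. This proves (28). PROOF OF THEOREM 31, (30). Let $[S]^3 = K_0 +' K_1$,

(49) $4 \notin [K_0]$; $\alpha \notin [K_1]$.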

We shall deduce a contradiction. By Theorem 2, there is n <wo such that n~(m, m)S, and p such that (50)

(n -

1)(1

+ m + m(m -

1)/2)

 ~o

and then there is CCR(zo) such that C = l7. The following diagram shows the relative position in S and the inclusion relations between the various sets to be considered in the argument that follows. It might be of help to the reader. S

{zd

J

M

ASSUMPTION. If DE [C]p, then' II [Xl, x,ED]{xo:xo
(51)

if DE [C]p,

then

{Zl, Xl, X2} E Ko

Xl,

for some Xl, X, ED.

Then [C]2=Kl +K{, where

K: = {{Xl, x,} : Xl, X2 E C;

{Zl'

Xl, X2} E K.}

(JI

< 2).

By (11), C; = 17 ~ wop-t (wo + m, p) 2. Hence there are two cases. Case 1. There is ECC such that E=wo+m; [E]2CKl. Then, since, by (49), E=wo+mEE [Kd, there are xl, x{, x, EE such that {xl, x{, x, } EKo. Then [{ZI, xl, x{, xl },,]3CKo which contradicts (49). Case 2. There is GE [C]p such that [G]2CK{ . Then {Zl' XI, x,} EEKo for all Xl, x2EG, which is a contradiction against (51). It follows that our assumption is false, and that there are HE [C]p and A CL(zo) such that

{xo, Xli X2}

EE Ko

for Xo E A ; Xl, X2 E H.

Put

V(XO,Xl) = {X2:X2EH; {XO,XI,X2} EKd

for xo, Xl E A.

Then [A]2 =Kl' +'K{' , where Kl' is the set of all {xo, Xl}
, V(Xo, Xl)' ~ n

for {xo, xd< C P,

[p]2 = L[W C H]K~, where KW = { {xo, Xl} <: Xo, Xl EP; V(xo, Xl) = W}. The number k of sets W is finite, and wo-t(wo)~, by Theorem 1. Hence there are P'CP; JCH such that [P']2CK)3l, for {xu, xd< C P'. Then for {xo, xd
Since [p,]aCKo+K I and, by Theorem 1, wo-t(wo, WO)3, there are QCP'; v<2 such that [Q]3CK•. By (49), woEE[Ko]. Hence 1'=1; [Q]'CK I • Furthermore, [J]3CKo+Kl; I=n-t(m, m)3. Hence there are


ME [J]"'; p<2 such that [M]8CKp. Since m~4EE[Ko], we have p=1. Then, in view of QCP'CPCA; MCJCH,

[Q

+ M]8 C K

"'0

l ;

+m =

Q + M E [Kd

which contradicts (49). Case 2. There is NCA such that N=wo+m; [N]ICK{'. Then 1

V(Xo,

Xl) 1

~ n- 1

for Xu,

Xl

E N.

Then

Ql C L(T);

TI

=

K!2)

¢

1

m.

We have [Qd l = L' [K
Q2 =

L' [K < k2]K!2),

where

0

and two elements Xoo and XOI of Qa belong to the same K~a) if, and only if, for every xIET; xaEH, the two sets {xoo, Xl, Xa} and {XOlt Xl, Xa} belong to the same class K,. Then k2 <wo and, by wo-+(wo)~, there are Q3CQa; K3
for Xo E Qs;

Xl

E T; X2 E H.

Put U=Qa+T, and choose {xo", xl' }
E V(xo", xl')

+ L[x E

T)V(xo", x)

+ LUx, y}< C T]V(x, y)

and therefore, in view of the definition of Xa and the relations 1 and (50), 1

L[xo,

~

Xl

E U]{X2:X2 E H; {xo,

(n - 1)

+ (n -

1) ( : )

Xl,

X2}< E

+ (n -


Kd

1) ( ; )

TI =m

I

< p = 1 HI.


We deduce the existence of xl' EH such that

{XO' Xl, xl'

I EE Kl

for all Xo,

Xl

E U.

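Since $\bar{U} = \omega_0 + m \notin [K_1]$, there are $y_0, y_1, y_2 \in U$ such that $\{y_0, y_1, y_2\} \in K_0$. But then $[\{y_0, y_1, y_2, x_2'\}_<]^3 \subset K_0$ which contradicts (49). This proves (30) and thus completes the proof of Theorems 31 and 32. THEOREM 33. Let $\alpha < \omega_0 2$. Then $\omega_1 \to (\alpha, \alpha)^2$. PROOF. Let $\bar{S} = \omega_1$; $[S]^2 = K_0 +' K_1$; $2 \le m < \omega_0$; $\alpha \le \omega_0 + m$,

(52) $\alpha \notin [K_\nu]$ $(\nu < 2)$.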

We have to deduce a contradiction. Let the conventions concerning the use of the letters $A, B, P, Q$ be the same as in the proofs of Theorems 31 and 32. Choose any $P$. ASSUMPTION. Let $[P]^2 \subset K_0$. Suppose that, if $P' \subset P$, then there is $A$ such that

$|P'L_0(x)| = \aleph_0$ $(x \in A)$.

Then we define $x_\nu, P_\nu$ $(\nu < \omega_1)$ as follows. There is $x_0$ such that $|PL_0(x_0)| = \aleph_0$. Put $P_0 = PL_0(x_0)$. Now let $0 < \nu_0 < \omega_1$, and suppose that $x_\nu \in S$; $P_\nu \subset PL_0(x_\nu)$ $(\nu < \nu_0)$;

$|P_\nu - P_\mu| < \aleph_0$ $(\mu < \nu < \nu_0)$.

Then we can write $[0, \nu_0] = \{\mu_\lambda : \lambda < \omega_0\}$. We can choose elements $y_\lambda$ $(\lambda < \omega_0)$ such that $y_\lambda \in P_{\mu_0}P_{\mu_1} \cdots P_{\mu_\lambda} - \{y_\rho : \rho < \lambda\}$ $(\lambda < \omega_0)$. Put $P' = \{y_\lambda : \lambda < \omega_0\}$. Then, by our assumption, there is $A$ such that $|P'L_0(x)| = \aleph_0$ $(x \in A)$. We can choose

$x_{\nu_0} \in A - \sum[\nu < \nu_0](\{x_\nu\} + L(x_\nu))$.

We put $P_{\nu_0} = P'L_0(x_{\nu_0})$. If, now, $\nu_1 < \nu_0$, then there is $\lambda < \omega_0$ such that $\nu_1 = \mu_\lambda$. Then

$|P_{\nu_0} - P_{\nu_1}| \le |\{y_0, y_1, \ldots\} - P_{\mu_\lambda}| < \aleph_0$.

Also, $P_{\nu_0} \subset PL_0(x_{\nu_0})$. This completes the inductive definition of $x_\nu, P_\nu$ $(\nu < \omega_1)$ such that

$P_\mu \subset PL_0(x_\mu)$; $|P_\nu - P_\mu| < \aleph_0$

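Put $X = \{x_\nu : \nu < \omega_1\}$. Then, by Theorem 23, $\bar{X} = \omega_1 > \omega_0 m \to (m, \omega_0 + m)^2$. Since, by (52), $\omega_0 + m \notin F_1(X)$, we have $m \in F_0(X)$, and there is $D \in [X]^m$ such that $[D]^2 \subset K_0$. Let $x_\rho = \max[x \in D]\, x$. Then, for any $x_\nu \in D$, $|P_\rho - L_0(x_\nu)| \le |P_\rho - P_\nu| + |P_\nu - L_0(x_\nu)| < \aleph_0$. Hence we may put $Q = P_\rho L_0(D)$, and then we have $\bar{Q} + \bar{D} = \omega_0 + m \ge \alpha$; $[Q + D]^2$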


$\subset K_0$. This is a contradiction against (52). Therefore our assumption is false. Now let $A \subset S$. Then, by Theorem 3, $|A| = \aleph_1 \to (\aleph_0, \aleph_1)^2$ and hence, by Theorem 14, $\bar{A} = \omega_1 \to (\omega_0, \omega_1)^2$. Since $\omega_1 \notin F_1(A)$, we conclude that $\omega_0 \in F_0(A)$, so that there is $P \subset A$ such that $[P]^2 \subset K_0$. As the assumption made above is false, there is $P' \subset P$ such that there are at most $\aleph_0$ elements $x$ such that $|P'L_0(x)| = \aleph_0$. Then there is $A' \subset A$ such that $|P'L_0(x)|$
(x

E A").

Since IEI
x E Alii CA"i

y

EE E

= P'Lo(x);

y

EE Lo(x).

Also,

x E Alii C R(P") C R(y); x> y. Hence

So far we have proved that, given any A, there are sets P"', A"CA such that P" CL 1(A"'); [P"]2CKo, and, moreover, there are at most No elements x such that IP"Lo(x) I =No. By applying the last result repeatedly, starting with A =S, we obtain sets P .(v
[P,.]2 C Ko;

PI' C L1(P.)

(p

< II < wo).

There is Q. such that

I P.Lo(x) I < No

(II

< Wo; xES - Q.).

We can choose BCS- E[IICdOm-+(Cdo+m, m)2;

Wo

+ m EE Fo(B);

and there is DE [B]m such that [D ]2CK1. Then, for every v

and complicated nature, is of interest in that it implies Theorem 7 (i) and Theorem 8. It may well be capable of further worthwhile applications. THEOREM 34. Let $\alpha, \beta, \gamma$ be ordinals, and $\alpha \nrightarrow (\beta, \gamma)^2$. Then there are ordinals $\alpha_\lambda$ $(\lambda < \beta^-)$ such that, if

(I'

then

< tr),

We begin by deducing (i) of Theorem 7 or, rather, a slightly stronger proposition, from Theorem 34. COROLLARY

"',,+I-("'m

1. Let m and n be such that N:~N" (d
+1, ",,,+1)2.

This implies, a fortiori, "',,+I-("'m, ", ..+1)2 which, in its turn, by Theorem 14, is equivalent to Theorem 7 (i). Deduction of Corollary 1 from Theorem 34. Let us suppose that "'n+l-++("'m+1, "'''+1)2. Then, by Theorem 34, there are ordinals aA, kA such that I k,,1 = II [>. <,,1 IaAI (" <"'m); 1

(53)

",,,+1 -++ (ao + 1, al + 1, ... ).... ,

(54) Then, for>. <"'m, (55) For, let" <"'m, and suppose that (55) holds for>. <". Then, using 1,,1 '<"'m. Now, by (53) and the obvious relation m ~n, I ",..+d ~ L[A < "'m11 aAI ~ N..Nm = N.. which is the required contradiction. COROLLARY

2. Let N': =N .. ; 2N.
"'.. _ (P, ",..)2 By Theorem 14, this proposition implies Theorem 8. Deduction of Corollary 2 from Theorem 34. Let {3 <"'n. Suppose that ", .. -++({3, ",..)2. Then, by Theorem 34, there are ordinals aA, kA such that Ik,,1 =II[>.<"llaAI (Jt<{3-);


P. Wn -t7

(ao


1 + 1, ... )~-;


a,. -t7

1

(Wn)lI,.

(p.

< ~-).

Let us assume that, for some p. <{3-, we have a~ <W n (X
I I

LEMMA 2. Let T be a well ordered set, and [T]2=Ko+KI • Then there isS a set B=B(T)CT which has the following properties. We have [B]2CK I • If xET-B, then there is yEB such that {y, x}<EKo. PROOF. We may assume T~O. Choose l such that Ill> I TI. We define, inductively, yx (X0. We put y,.=Yo. Let B = {yx: X< l}. Then there is m < l such that

B = {y).:X

< m};

{Yx, y,.}< E KI (X < p. < m).
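On a finite well ordered set the inductive choice of the elements of $B$ can be mirrored by a greedy pass. The Python sketch below is one natural reading of that construction, stated as an assumption: an element joins $B$ exactly when it forms a $K_1$ pair with every element already chosen. The names `greedy_B`, `check_lemma` and the sample partition `sample_K0` are hypothetical.

```python
def greedy_B(T, in_K0):
    """Scan T in increasing order; keep x if no earlier kept y has {y, x} in K0."""
    B = []
    for x in T:  # T must be listed in increasing order
        if not any(in_K0(y, x) for y in B):
            B.append(x)
    return B

def check_lemma(T, in_K0):
    """Return B together with the truth values of the two properties in Lemma 2."""
    B = greedy_B(T, in_K0)
    members = set(B)
    # Property 1: every pair inside B lies in K1 (that is, not in K0).
    pairs_ok = all(not in_K0(y, x) for i, y in enumerate(B) for x in B[i + 1:])
    # Property 2: every x outside B has some smaller y in B with {y, x} in K0.
    witness_ok = all(
        any(y < x and in_K0(y, x) for y in B) for x in T if x not in members
    )
    return B, pairs_ok, witness_ok

def sample_K0(y, x):
    """An arbitrary illustrative partition of pairs y < x (an assumption, not from the paper)."""
    return (y * x + y + x) % 3 == 0

B, pairs_ok, witness_ok = check_lemma(list(range(30)), sample_K0)
```

Both properties hold for any choice of `in_K0`, since an element is rejected only when an earlier member of `B` witnesses a $K_0$ pair with it.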

For, m is the least p. such that Oy,.. This proves Lemma 2. PROOF OF THEOREM 34. There is an ordered set S such that

EE [Ko); 1pi > 1al. Let

S = a;

~

We choose an ordinal p such that xES. We define f,.(x) (p.
< v), if p.

< v;

f,.(x) ~ x.

Then we define f.(x) by the following rule. If f,,(x) =x for some p.

{jl'(x),i.(x)}< E Ko

(J.I

i.(x) ~ x

< v < p;il'(x) ¢ x); (v < p; xES).

If, for some x,i.(x) <x (v
I pi

I li.(x): v < p} I ~ I S I = I a I

=

follows. Hence, given xES, there is u(x)

i.(x)

<

x

(v

< u(x));

i.(x)(x) = x

(x E S).

Then, for fixed x, [li.(x):II~u(x)} ]2CKO'

u(x)

+ 1 < /3;

u(x)

< fr.

Put M.= {j.(x):U(X)~II} (lI
a

-H

_

(Mo

1

+ 1, Ml + 1, ... )p-.

Let 0
M. =

L

[YI' E MI' for p.

< v] li.(x) :u(x)

~ lI;ix(x) = yx for X < v}.

Now, for every choice of YI'EMI' (P.
x E B(T);

i.(x)

= x E B(T)

or

x

EE

i.(x) = z E B(T).

B(T);

In either case, f.(x)EB(T). In fact, the set T does not depend on x since T is the set of all yES such that

iYll' y}< E

Ko

(p.

< v).

All this proves that, given YI'EMI' (p.
xES;

u(x)

~

v;

fix) = YI'

(p.

< v),

then

i.(x) E B(T). By definition of B(T), we have [B(T) ]2CKl and therefore B(T) <'Y.


II

1


Hence M. is a sum of Ut
II

1

k.

k.1

THEOREM 35. Suppose that $\beta \ge r \ge 3$; $\beta, \beta^* \not\le \alpha$; $s > (r-1)^2$. Then, for any type $\varphi$ such that $|\varphi| = |\alpha|$,

(56) $\varphi \nrightarrow (s, \beta)^r$.

COROLLARY. If $r \ge 3$; $s > (r-1)^2$, then

(57) $\eta \nrightarrow (s, \omega_0 + 1)^r$,

(58) $\varphi \nrightarrow (s, \omega_1)^r$,

where $\varphi$ is any type such that $|\varphi| = |\lambda|$. The negative results (57) and (58) are not too far from the ultimate truth as is seen by comparing them with the following positive results. By Theorem 1,

(59) $\omega_0 \to (\omega_0, \omega_0, \ldots, \omega_0)^r_k$ $(k < \omega_0)$.

By Theorem 31,

(60)

where $\varphi$ is any type such that $|\varphi| > \aleph_0$; $\omega_1, \omega_1^* \not\le \varphi$. PROOF OF THEOREM 35. The corollary follows by applying the theorem to the following two cases. (i) $\beta = \omega_0 + 1$; $\alpha = \omega_0$; $\varphi = \eta$, (ii) $\beta = \omega_1$; $\alpha = \lambda$. The proof of the theorem depends on the following lemma due to Erdős and Szekeres [7]. Throughout, we put

$s = (r - 1)^2 + 1$.
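The role of this value of $s$ can be checked by brute force for small $r$. The Python sketch below tests the classical Erdős–Szekeres statement for sequences of distinct elements: every arrangement of $s = (r-1)^2 + 1$ distinct numbers contains a monotone subsequence of length $r$, while $(r-1)^2$ numbers need not. The function names are illustrative assumptions, and only the case of distinct elements is covered.

```python
from itertools import permutations

def longest_monotone(seq, increasing=True):
    """Length of the longest strictly increasing (or decreasing) subsequence, O(n^2)."""
    best = []
    for i, x in enumerate(seq):
        if increasing:
            cands = [best[j] for j in range(i) if seq[j] < x]
        else:
            cands = [best[j] for j in range(i) if seq[j] > x]
        best.append(1 + max(cands, default=0))
    return max(best, default=0)

def has_monotone(seq, r):
    """True if seq has a strictly monotone subsequence of length at least r."""
    return longest_monotone(seq, True) >= r or longest_monotone(seq, False) >= r

def check(r):
    """Exhaustively verify the statement for all orderings of s distinct numbers."""
    s = (r - 1) ** 2 + 1
    return all(has_monotone(p, r) for p in permutations(range(s)))

holds_for_3 = check(3)                              # s = 5, 120 orderings
sharp_example = has_monotone([1, 0, 3, 2], 3)       # length (3 - 1)^2 = 4
```

The exhaustive check is only practical for $r \le 3$; for $r = 4$ the number of orderings of ten elements is already in the millions.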

LEMMA 3. If $S$ is an ordered set, $r > 0$, and if $z(u) \in S$ $(u < s)$, then there is $\{u_0, u_1, \ldots, u_{r-1}\}_< \subset [0, s]$ such that either

(p

+ 1 < r)

(p

+ 1 < r).

or We now prove the theorem. Let S<=cp; S«=a. Then


where

= {{XO' ... ,x,-d<: {xo, ... ,x,-d« c s}, Ku = {{xo, ... ,xr-d<: {xo, ... ,x,-d» C s}.

K IO

Case 1. There is A E [S]· such that [A ]'CKo. Then, if A = {s(O"): 0" <s}, an application of Lemma 3 shows the existence of BE [A]' such that B EKlo which is a contradiction. Case 2. There is A CS such that A<=~; [A ]'CKI. We shall prove that one of the two relations

[A]' C KIO, [A]' C Ku

(61) (62)

holds. If both (61) and (62) are false, then there are sets X, YCA such that

= {xo, ... ,x,-d<

(63)

X

(64)

Y =

{xo, ... ,x,-d«, {Yo, ... , Yr-d< = {Yo, ... , y,-d»· =

Then there is O"0, YO«Yr-2 which contradicts (64). Therefore O"+l=r, and xO=YO»YI=XI. But this contradicts (63). This shows that at least one of the relations (61), (62) holds. Now (61) implies ~=A<=A«~S«=a, and (62) implies ~*=A> = A« ~ S« =a. Both conclusions contradict the hypothesis. We have proved that neither Case 1 nor Case 2 is possible, so that (56) is established. This proves Theorem 35. 36. (i) Ifa. 1. (iii) ab-++(a+, b+)2 for any a, b. (iv) Nn )2, if n =n->O.

THEOREM

E[JI
Nn-++(I nl +,

IA.I =a.(JI
(i). Let

I ;

proves (i).

PROOF OF

KI

(ii). By definition of a', the hypothesis of (i) holds for


some a" n, with In[ =a'; d=a= La •. Hence (ii) follows from (i). PROOF OF (iii). Let =a; a,=b(v
Inl

ab =

r: a, .... (a+,

b+)·.

PROOF OF (iv). Since ~.= r:[v
Let a
(i) Let a~~ol and let b be minimal such that ab>a.

;;;a~t~ab.

Then

(65)

A possible value for Nk is a+, (ii) ~""'+l-++(Nm+l. ~"''' +1)2 for all m. (iii) If Noa. Then, if k=m+l, we have a
If s<:;2; {30. {3~ ;j;ao; al -+-+ (j3o. (31. S

laol = la,l. then

+ 1, s + 1, ... , s + i)',!.

PROOF. Let S< = al; S «= ao. Then to every set XE [5J' there belongs a permutation 7I" ( X) :X---+q(X) defined by

x Let

71">.

=

{xo.

Xl • • . • , X'_l} < =

{x.. (O).

X.. (l) • . • . , X.. (, _ l ) } «

.

(X<s!) be all permutations of [0. s] and, in particular, 11"{\:~

---+ X;

1I"1:X ---+ s - 1 - >..

(X

< s).

Then [Sl' = r: [v <s!lK" where K , = {X:XE [Sl'; 1I'(X) =11',). Now suppose that A C5; v <s!; [A ]'CK •. We shall deduce a contradiction in each of the three cases that follow and so establish the lemma. Case 1. v = O; A<=f3o. Then the contradiction f3o =A«~S«=ao follows. Case 2. 11=1; A<=f31' Then the contradiction Jjt=A«~S« =ao follows. Case 3. 2~v<s!; A<=s+1. Let 1I',:A~,,(A). Then A = {xo. X!, .. . , x,} < and therefore. putting Y>. =Xl+}, (X <s), we have

(66)

{XO, Xl, ... , x.-d < = {X,(O)' X,(lh •.. , X,(S-l>! «, {Xl, XI, . . . ,

X,} < = {YO,

(67)

YI, ..• , y.-l} <

= {y,(O), Y,(lh ... , Y,(,-l)

}«

= {XI+,(O), XI+,(l), ... , Xl+,(.-l) f «.

If xo«xt, then alternate applications of (66) and (67) lead to XO«Xl<<X2<< ... «x. and so to the contradiction 11". =11"0, while, similarly, the assumption XO»Xl leads to the contradiction 11". =11"1. This proves the lemma. PROOF OF THEOREM 37, (i). Let a=N m ; b=NI, and let Fbe the set of all mappings X---+h(X) of [0, wzl into [0, wm ]. We order F by putting, for ho, hlEF, ho«hl if, and only if, there is Xo<w, such that Then I FI =a', and we have, by' Lemma (68)

*

-F«

Wm+l, W'+1 ~

-

= F,

2 of [6],

if a =Nm; b =N"

say.

We can choose a set XE [F]"k' Let x---+f(x) be a one-one mapping of X on [0, Wk], and S= {(x, v):xEX; v <WI("'}' WeorderSalphabetically, by means of a relation u. < wk]N ... = N..k. On the other hand, if d n. Then I sl ~N/(.,o»d. Hence I cp I = I s I = N ....

(69)

1. Let Sl CS, and suppose that Sl is an ordinal. Put Xl = L [v <Wk] Jx:(x, v)Esd. Then Xl is an ordinal, and 'Jtl~F. Hence, by (68), Xl<Wm+l; Ixd ~Nm=a
I < N.;

111

I sll = ~[x E xd I

(70)

= ~[x

{II: (x,

E Xdf(x) < WI;;

II) E sd I ;;;i! ~[x E XdN/(,,)

~ N.II XII ;;;i! N.INm < N.. k ; W"'. ~ cp.

2. Let S2CS, and suppose that (S2)* is an ordinal. Put X 2 = L[V<Wk]{X:(X, v)ESz}. Then (Xz)* is an ordinal, and (X!)* ;;;i!F. Hence, by (68), (X2) * <W'+l; IX2 1;;;i!N,. Put, forxEX 2, N(x) ={v:(x, V)ES2}' Then N(x) is an ordinal. On the other hand, I The authors are indebted to G. Kurepa for pointing out that the result of this lemma had already been obtained by F. Hausdorff, [9, Satz 14].


(N{x»* is an ordinal, since (N{x»*~ (S2)*' Hence N{x) <wo;

(71) 3. We now apply Lemma 4 to the case s = 2;

aD

= t/J;

Its hypotheses are satisfied, by (70), (71), (69). We obtain w... -++{w",., WI+l)2. This implies (65), by Theorem 14, and cOIl)pletes the proof of Theorem 37. REMARK. If a~No and N/c=a+, then (i) of Theorem 37 yields a stronger result then (ii) of Theorem 36. For, first of all, we note that the hypothesis of (i) of Theorem 37 holds, since a
N",. -++ (b+, N...)2.

(72)

On the other hand, (ii) of Theorem 36 gives (73) It is known that, for any m,

,

,

N.... = N....

(74)

Hence N~! >N~. =Nt =a+' =a+~b+, and (72) is stronger than (73). Since we were not able to find a reference for (74) we give, for the sake of completeness, a proof now. Case 1. Let N~=N .. .,<Wm • Put >.= :E [v <w.. ]>.,. Then, since Iw,,1 ..1
which is the desired contradiction. Case 2. Let N~.. >N,,=N:". Then m>O, and Nm = :E [v <w,,]NA • for some >'.<m. Then :E[v<w .. ]N"'}.,=N , for some l<wm; NA IWA.I ~ Ill; Nm = :E[V<W,,]NA,~ IIIN,,; m~n; N", .. = L:[~<wm]NI'; N~.. ~ Iwml ~N" which, again, is a contradiction. This proves (74).

.=

THEOREM 38. If $\aleph_m \nrightarrow (|\beta_0|, |\beta_1|, \ldots)^r_k$, then

(75) $\omega_{m+1} \nrightarrow (\beta_0 + 1, \beta_1 + 1, \ldots)^{r+1}_k$.

We give some applications of this theorem. (i) If liSl = =Nm+h then

I'YI

(76)

"'mH ~

(fJ

+ 1, 'Y + 1)1.

For, let a=Nm' and let b be minimal such that ab>a. Then, by Theorem 7, ab~(a+, b+)2 and therefore Nm+l~(liSl, )2. Now (76) follows from Theorem 3S. (ii) If N': =N",; liSl =Nm+1 ; =N..ft , then

I'YI

I'YI

"'''ft+1 ~ (fJ

(77)

+ 1, 'Y + 1)3.

In order to prove (77), we apply Theorem 36 (ii) to a=N.. ft . We note that, by (74), a' =N': =Nm; a'+=Nm+1' Hence, by Theorem 36, N"'ft~(liSl, )1, and (77) follows from Theorem 3S. (iii) If is! = Nr.+l; = N"'r'+1' then

!

I'YI

I'Y I

(7S)

"''''Hl+1 ~

(fJ

+ 1, 'Y + 1)8.

This follows immediately from Theorem 37 (ii) and an application of Theorem 3S. We note that on putting n=k+1 in (ii) above one obtains a result which is weaker than (7S). For, (ii) becomes: if N k +1 =N m ; iSl =N k +2 ; =N"'iW then (7S) holds. The proof of Theorem 3S depends on a lemma.

!

I'YI

LEMMA 5. Let a be an ordinal. Suppose that is. (v
a

~

(fJo

+ 1, fJl + 1, .. ')1:r+1.

PROOF. Let S =a; xES. Then L(x)
S'

C S;

[S']r+1

C K.;

+

S'

=

fJ.

+ 1,

then $S' = S'' + \{x'\}$; $S'' \subset L(x')$; $\bar{S}'' = \beta_\nu$; $[S'']^r \subset K_\nu(x')$ which contradicts the definition of $K_\nu(x')$. Hence (80) is impossible, and (79) follows. PROOF OF THEOREM 38. If $\beta < \omega_{m+1}$, then $|\beta| \le \aleph_m$, $|\beta| \nrightarrow (|\beta_0|, |\beta_1|, \ldots)^r_k$. By Theorem 13, this implies $\beta \nrightarrow (\beta_0, \beta_1, \ldots)^r_k$. Now (75) follows from Lemma 5. THEOREM 39. (i) If (81)


Iml >

(82)

L:[X

kI IA1 ",

then (83)

m -+ (ao

(ii) If r>O; w..-+(ao,

+ 1, a1 + 1, ... h0'+1.

alt ••. )~,

and

2M• ~ N..

(84)

then Wn+1 -+ (ao

for

+ 1, a1 + 1, .. ')10

"",1

II

< n,

•

Ikl
"",1

some." < k. Then, by Theorem 17, we may apply a suitable permutation to the system ao, aI, . . . so that for the new system, again denoted by ao, aI, ... , a., ... ('II
a. = r (II

< ko);

(k o ~ II

< k).

Here ko is some ordinal, O
Wn+l -+

(alo o

+ 1, alo o+1 + 1, ... )101""'1•

This shows that we may assume, without loss of generality, that Ia.1 >r for." r; 1 a. If n=O, then aO, then a ~ L: [X

for some

VA

< w.. ]21101IA( ~

<no Hence, by (84),

L: [X

< w.. ]2M'A,

a~ }:[}.<w.. ]N.. =N...


Deduction of (iii) from (ii). By definition of N:", "''''-('''m)~. Hence, by r applications of (ii), (85) follows. Now (86) follows from Theorem 15. PROOF OF (i). Let S=m; [S]O+I= E' [II 1m I. Throughout this proof the letters K, ~, p, u denote ordinals less than n, and x, y and z elements of S. The relation

{xo, . . . , x,} == {Yo, ... , y,} expresses, by definition, the fact that, for some II
{xo, ... , x,}, {yo, ... , y,} E K•. We define f.(x) ES as follows. Let x be fixed, and suppose that, for some fixed ~, the elements f. = f.(x) have already been defined for all K<~. Then we putf~(x) =x, if f.=x for some K<~. If, on the other hand, f.~x for K<~, then we define f~ to be the first element y of S- {j.:K<~} such that (87)

{j.o,···, f'r-I' y} == {j.o, ... , f'r-l' x} for KO < ... < K,_1 < X.

This defines f. for all

K.

We now prove that

fA
(88)

if

x < "';

(89)

f>.

~

x.

First of all, (87) holds for y =x. Hence, by (89) and the definition of (88) in the case whenf,,=x. Now suppose that f,,~x. Then (87) holds for y =f" and, again, (88) follows. By (88) and 1 > 1 there is p(x) such that

fA, we havef>.<x. This proves

nl

ml,

f.(x) Let,forKo<'"

< f>.(x)

= x,

if K < p(x)

~

X.

{j.o(x), ... ,f'r-I(X), x} E Kg('o"""r-I,,,,j = K'(KO, ... , x). We now show that if x and z are such that

p(x) = pes),

(90)

(91)

K'(KO, •.• I Kr-1, x)

= K'(KO, ... , /C,_1, s) for KO < ...
then $x = z$. Let $\lambda \le \rho(x)$, and suppose that

(92) $f_\kappa(x) = f_\kappa(z)$ for $\kappa < \lambda$.

Then $f_\lambda(x)$ is the first element $y$ of $S - \{f_\kappa(x) : \kappa < \lambda\}$ such that (87) holds, i.e.

{j.o(x) , ... ,j'r-I(X), y}EK'(KQ, " ' , Kr-l, x)

for KO< ...
Now, $f_\lambda(z)$ is defined by the same property, with $z$ in place of $x$, and (90) and (91) show that $f_\lambda(x) = f_\lambda(z)$. We have thus proved, by induction, that $f_\kappa(x) = f_\kappa(z)$ for all $\kappa \le \rho(x)$. In particular, by (90),

$$x = f_{\rho(x)}(x) = f_{\rho(z)}(z) = z.$$

We next prove that $\rho(x_0) \ge l$ for at least one $x_0$. Let us suppose, on the contrary, that $\rho(x)$
a(O') ~

~

I kl 'v{;

I ml = lsi L[O' < l] I k I'v(,

=

I L[u < l]{x:p(x)

= IT}

I

which contradicts (82). This proves that p(xo)?;,l for some suitable Xo. Put So= {j.(XO):K
a ~ ({30, (31, . • • )~.

I

\.8.\ ,

For, if r~ 1, then any a can be taken such that a\ > L[v
.8.,

.8•.

k\

k\


the least number $n$ such that $n \to (a_0, \dots, a_{k-1})^r$.

Without loss of generality, we restrict ourselves to the case $k \ge 2$. In [5, Theorem 1], an explicit upper estimate was given for the number $P_k(r; a, a, \dots, a)$, which, in that paper, was denoted by $R(k, r, a)$. Clearly, $P_k(1; a_0, \dots, a_{k-1}) = 1 + a_0 + \dots + a_{k-1} - k$. By Theorem 39,

(93) $P_k(r+1; a_0+1, a_1+1, \dots, a_{k-1}+1) \le 1 + \sum[\lambda < P_k(r; a_0, \dots, a_{k-1})]\, k^{\lambda^r}$.

It is easily proved that, for $l < \omega_0$,

(94) $1 + \sum[\lambda < l]\, k^{\lambda^r} \le k^{l^r}$.

For, (94) holds for $l = 0$, and if $0 < m < \omega_0$ and (94) holds for $l = m - 1$, then

$$1 + \sum[\lambda < m]\, k^{\lambda^r} \le k^{(m-1)^r} + k^{(m-1)^r} \le k^{1+(m-1)^r} \le k^{m^r},$$

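so that (94) holds for $l = m$. We have thus proved the following recurrence relation.

THEOREM 40. If $2 \le k < \omega_0$; $0 < r \le a_\nu < \omega_0$ $(\nu < k)$, then

$$P_k(r+1; a_0+1, \dots, a_{k-1}+1) \le k^{P_k(r; a_0, \dots, a_{k-1})^r}.$$

In particular, we have, using the notation of [5],

$$R(k, r+1, a+1) \le k^{R(k, r, a)^r} \qquad (k \ge 2;\ 0 < r \le a).$$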
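The inequality (94) and the recurrence (93) can be checked numerically for small finite values. The following is only a sketch: the function names are hypothetical, and `upper_bound` assumes $a \ge r \ge 1$ and simply iterates (93) starting from the pigeonhole value for exponent 1, so it illustrates the recurrence rather than reproducing the estimate of [5].

```python
def sum_bound(k, r, l):
    """Left-hand side of (94): 1 + sum over lambda < l of k**(lambda**r)."""
    return 1 + sum(k ** (lam ** r) for lam in range(l))


def pigeonhole_number(sizes):
    """P_k(1; a_0, ..., a_{k-1}) = 1 + a_0 + ... + a_{k-1} - k."""
    return 1 + sum(sizes) - len(sizes)


def upper_bound(k, r, a):
    """Bound for R(k, r, a) obtained by applying (93) r - 1 times.

    Assumes a >= r >= 1. Start from exponent 1 with all entries a - r + 1;
    each application of (93) raises the exponent and every entry by one.
    """
    bound = pigeonhole_number([a - r + 1] * k)
    for exponent in range(1, r):
        bound = sum_bound(k, exponent, bound)
    return bound


# (94) holds on a small grid of parameters (k >= 2, r >= 1).
grid_ok = all(
    sum_bound(k, r, l) <= k ** (l ** r)
    for k in range(2, 5)
    for r in range(1, 4)
    for l in range(0, 6)
)
```

For instance, `upper_bound(2, 2, 3)` evaluates $1 + 2^0 + 2^1 + 2^2$, a valid but not sharp bound for the number of points needed to force a monochromatic triangle in a 2-colouring of pairs.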

This is precisely the recurrence relation established in [5], from which the explicit estimate is deduced at once. This is no coincidence, as the method of proof of the present Theorem 39 is related to that used for proving Theorem 1 of [5].

Theorem 39 implies Theorem 4 (i), i.e.

(95) $(2^{\aleph_n})^+ \to (\aleph_{n+1})^2_{\aleph_n}$.

For, clearly, $\aleph_{n+1} \to (\aleph_{n+1})^1_{\aleph_n}$, and therefore $\omega_{n+1} \to (\omega_{n+1})^1_{\omega_n}$. Also,

$$\sum[\lambda < \omega_{n+1}]\, |\omega_n|^{|\lambda|} \le \aleph_n^{\aleph_n} \aleph_{n+1} = 2^{\aleph_n} = \aleph_{m_0},$$

say. Hence, by Theorem 39 (i), $\omega_{m_0+1} \to (\omega_{n+1}+1)^2_{\omega_n}$, and (95) follows.


THEOREM 41. If $r \ge 3$, then, for all $n$,

(96) $\omega_{n+1} \nrightarrow (\omega_n + 2, \omega_0 + 1, r+1, r+1, \dots, r+1)^r_{(r-1)!}$.

As an application, consider the case $r = 3$; $n = 0$:

(97) $\omega_1 \nrightarrow (\omega_0 + 2, \omega_0 + 1)^3$.

This should be compared with:

$$\omega_1 \to (\omega_0 + 1)^r_k \qquad (k < \omega_0;\ r \ge 0)$$

which follows from Theorem 39 (ii) and Theorem 1. PROOF OF THEOREM 41. Let w~~(3<Wn+l' We apply Lemma 4 to

s = r - 1;

ao = w,,;

(30=w,,+1;

and obtain (3 ~ (w"

r-l + 1, Wo, r, ... , r)(r-l)l.

This holds, a fortiori, if (3 <w". Now Lemma 5 proves (96). A type (3 is called indecomposable if the equation (3 ='Y+8 implies that either 'Y?;,(3 or 8 ?;,(3. It is known 10 that the indecomposable ordinals are those of the form w~. The types 11 and}" are indecomposable. The next theorem asserts that in Lemma 4 the s! - 2 classes corresponding to the entries s+1 in the partition relation may be suppressed in the special case when both (30 and (31 are indecomposable, at the cost, however, of raising the remaining entries slightly. THEOREM 42. Let s?;,3; laol = lad; (30, .aT;t3ao, and suppose that (30 and (31 are indecomposable. Then (98)

PROOF. Case 1. s=3. Consider a set S with two orders such that S<=al; S«=ao. Then [s]a=K o+'Kl, where Ko is the set of all sets {xo, Xl, X2}<= {Yo, Yl, Y2}«CS for which XA~YA is an even permutation of [0,3], i.e. one of the permutations 012, 120, 201. Now let us assume that (99) It suffices to deduce a contradiction in each of the two cases that follow. Case 1.1. There is A CS such that A< =(30; [A JaCKo. Let X, y, 13 denote elements of A. Then {x, y, z}<= {Xl, Yl, implies that Xl, Yl, 131 is a cyclic permutation of x, y, z. Put B = {x:y«x, whenever

zd«

10

[13. U7S-78].


y<x}; C=A -B. We shall prove three propositions about the two orders of A showing their effects on the partition A =B+C. 1. Let xyEC implies xEC. 2. Let xEB; yE C; x«y. Then x
which is a contradiction. Case 1.2. There is ACS such that A<={31; [A]aCKI. Then {x, y, z} < = {Xl, Y1, ZI} «CA implies that Xl, yt, Z1 is an odd permutation of x, y, z. This is equivalent to saying that {x, y, z} < = {X2' Y2, Zll»CA implies that X2, Y2, Z2 is an even permutation of x, y, z. Hence the result of Case 1.1 holds if {3o is replaced by {3t, and "«" by "»". We note that.8~ is indecomposable. Hence, in place of (100) we have

fh ~ A» ~ :5»

= ao•

which is a contradiction. This shows that the assumption (99) was false, i.e. that (98) holds. Case 2. s>3. Then, by the result of Case 1, we have a1-t+({3o, .81)8. By Theorem 15, this implies (98). This completes the proof of Theorem 42. REMARK. If, in particular, {3o and .81 are ordinals, not zero, then (s-3) +.8.=.8., so that (98) can be replaced by

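$$(s - 3) + \alpha_1 \nrightarrow (\beta_0, \beta_1)^s.$$

We may also mention here the following corollary of two of our lemmas, in which $\lambda$ is the type of the continuum.

(101) If $2^{\aleph_0} = \aleph_n$, then $\omega_{n+1} \nrightarrow (\omega_1 + 1, \omega_1 + 1)^3$.

PROOF. Let $\omega_n \le \alpha_1 < \omega_{n+1}$. Then, by Lemma 4, with $\alpha_0 = \lambda$, we have $\alpha_1 \nrightarrow (\omega_1, \omega_1)^2$. By Lemma 5 this leads to (101).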

THEOREM 43. If

$$r < s \le \beta_0; \quad \alpha \nrightarrow (\beta_0)^r_k; \quad \beta_1 \to (s)^r_k,$$

then

$$\alpha \nrightarrow (\beta_0, \beta_1)^s.$$

This proposition remains valid if the types $\alpha$, $\beta_0$, $\beta_1$ are replaced by cardinals.

PROOF. Let r<s:ifjo; a-+({jo, fjl)'; fjl-+(S);. We have to deduce that (102) Let 3'=a; [S]r= K/ =

E'

[JI
1:[11 < k]{A:A E

[sh

[A]r

C K,}.

Then there are BCS; X<2 such that [B]'CK1; B =fj.,... If X= 1, then B-+(s);, and therefore there are AE[Bl'; JI
x,. =

{X,., .•• , x-+r-d

(p :i m).

IBI I

Now let I' <m. Then, since = fjol ~s> r, there is Y,.E [B]' such that X,,+X"+IC Y,.. But Y,.EK/, so that X,., X,.+IE [y,.]rCK.,., for some JI,. and that [B]rCK,o. This proves (102). The analogous theorem, with cardinals in place of types, is proved by means of the obvious modifications of the above argument. Applications of Theorem 43. (a) Let X be the type of the continuum and =N ... Then, by Theorem 30, N..-++(N 1):. Also, as is easily verified,

(103) $6 \to (3)^2_2$.

Hence, by Theorem 43, $\aleph_n \nrightarrow (\aleph_1, 6)^3$, and therefore $\omega_n \nrightarrow (\omega_1, 6)^3$. Now, by Theorem 15, $\omega_n \nrightarrow (\omega_1, r+3)^r$ $(r \ge 3)$ follows and therefore, finally, $(r \ge 3)$.

(104)

(b) By (97),

(105) $\omega_1 \nrightarrow (\omega_0 + 2)^3_2$.

By (103) and Theorem 39, we have

(106) $m \to (4)^3_2$,

where $m = 1 + \sum[\lambda < 6]\, 2^{\lambda^2} < 2^{26}$. It now follows from (105) and (106), by Theorem 43, that $\omega_1 \nrightarrow (\omega_0 + 2, 2^{26})^4$ and therefore, by Theorem 15, that

(107) $\omega_1 \nrightarrow (\omega_0 + 2, 2^{26} + r - 4)^r$ $(r \ge 4)$.
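The two finite facts used in application (b) can be confirmed by direct computation. The sketch below (function names are illustrative, not from the paper) checks by exhaustive search that every 2-colouring of the pairs of 6 points contains a monochromatic triple, which is the relation $6 \to (3)^2_2$ of (103), and evaluates the sum $1 + \sum[\lambda < 6]\, 2^{\lambda^2}$ that bounds $m$.

```python
from itertools import combinations


def has_mono_triple(n, colour):
    """True if some 3 of the n points have all three pairs in one class."""
    return any(
        colour[(x, y)] == colour[(x, z)] == colour[(y, z)]
        for x, y, z in combinations(range(n), 3)
    )


def arrows_triple(n):
    """True if every 2-colouring of the pairs of n points has a
    monochromatic triple, i.e. n -> (3)^2_2 for this finite n."""
    pairs = list(combinations(range(n), 2))
    for mask in range(2 ** len(pairs)):
        # Bit i of mask is the class (0 or 1) of the i-th pair.
        colour = {p: (mask >> i) & 1 for i, p in enumerate(pairs)}
        if not has_mono_triple(n, colour):
            return False
    return True


# The quantity m appearing with (106).
m = 1 + sum(2 ** (lam ** 2) for lam in range(6))
```

The search also shows that 5 points do not suffice, so 6 is the least number with this property; the sum evaluates to 33620500, which is indeed below $2^{26} = 67108864$.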

We now give a new proof of the theorem of Dushnik and Miller [2].¹¹ Our proof bears some resemblance to the original proof but can, we think, be followed more easily.

THEOREM 44. If $a \ge \aleph_0$, then $a \to (\aleph_0, a)^2$.

PROOF. We use induction with respect to a. By Theorem 1, the assertion is true for a=~o. We assume that n>O and that the assertion is true for a <~n' and we let

lsi

= b =~" > ~oj

We suppose that if

X E [s]~o,

then

[X]2

and we want to find YE [S]b such that [Y]2CKl' There is a maximal set A = {xv:v
x. E Uo(xo, ... , X,_l),

(108)

I Uo(xo, ... , x,) I =

b

(v

< l).

For, the relations (108) imply that [A ]2CKo. Put B = Uo(A). Then

I BUo(x) I < I B I =

(109)

(x E B).

b

Case 1. b' = b. Then we define Xv (v <wn ) as follows. Let v <W,,' and suppose that x"EB (Jl
I L: [IL < v]({x,,} + BUo(x,,» I < I BI,

and therefore there is x.EB- L:~
I

I

B(T) = {x:x E Bj p(x) = We define, by induction, V<W m , 11 12

T",

X"

r}

(T

< w",).

CJl <wm ) as follows. Let, for some

Theorem 3, (i). The symbol Uo was defined in 12.


(I-'

< v).

Then, by definition of m,

I 2:

~

< v; x E

+ BUo(x» I ~ 2: ~ < v; x E 2:[1-' < v](1 + b•.)b,. < b.

XI']({x} =

IDI

X,.](1

+b

T ,.)

Hence =b, where D=B- L~
I

b

=

I

I D I = 2: ~ < w I DB (I-') I ~ M~m < b. m]

Now we can choose X.E[DB(r.)]b •. Then X.CUI(X,.) (I-'0, and consider any types fJ. (v x.EB. defined for v
=

II X [v < k ]jS•.

[10]. THEOREM 45. If k>O, then llx [v (fJo, fJI, ... )i.

This multiplication has been considered by Hausdorff

PROOF. In spite of its somewhat complicated appearance the proof is, in fact, very simple, as can be seen by following it in the case k=2 or k=3. Let P= L[v
Xl,' •• ,

Xk}:X. E B. for v < k}.

Case 1. There is v < k such that the following condition is satisfied. There is a system of elements x,.EB,. (,."

function f,(xo,

Xlo ••• ,

i.) EB. such that, for any choice of x). EBA

(lI<X
(Xo, Xl, ••• , i" f,(xo, ••• , i,), X0+1, ••• , if;)

EE K •.

In particular, the function fo(io) is constant. Then we define, inductively, elements Y.(lI
E K,

which contradicts the definition of $f_\nu$. The theorem is proved.

8. Canonical partition relations. Let $S$ be an ordered set, and consider a partition

(110) $[S]^r = \sum{}'[\nu < k]\, K_\nu$.

To every such disjoint partition there belongs an equivalence relation .,:l on [S]· defined by the rule that elements X, Yof [S]· are equivalent for.,:l, in symbols:

x ==

y(.~)

if, and only if, there is 11
IX -

•

(/W

has, by definition, the following meaning. Whenever $\bar{S} = \alpha$, and (110) is any disjoint partition, with any arbitrary $k$, then there is $B \subset S$ such that $\bar{B} = \beta$, and such that the equivalence relation $\Delta$ belonging to (110), if restricted to $[B]^r$, coincides with some canonical equivalence relation $\Delta_{\epsilon_0 \cdots \epsilon_{r-1}}$. The main result of [4] is expressible in the form $\omega_0 \to^* (\omega_0)^r$. The problem arises of finding canonical partition relations between types other than $\omega_0$. The main difference between canonical and noncanonical relations derives from the fact that if the canonical relation (111) holds then a certain choice of a subset of $S$ can be made irrespective of the number $|k|$ of classes of (110). The relation $\omega_0 \to^* (\omega_0)^1$ is equivalent to the statement that, if a denumerable set $S$ is arbitrarily split into nonoverlapping subsets $S_\nu$, then there are either infinitely many nonempty subsets $S_\nu$, or else at least one of the subsets $S_\nu$ is infinite. The following theorem establishes a connection between canonical and noncanonical partition relations.

¹³ [4; 5]. The notation used in the present note differs slightly from that used in the earlier papers.

I

THEOREM 46. (i) Let $q_s$ denote the number of distinct equivalence relations which can be defined on the set $[0, s]$. Let

(112) $s = \binom{2r}{r}$; $|\beta| > 2r$; $\alpha \to (\beta)^{2r}_{q_s}$.

Then a-+. (~)'. If I~I >4; a-+(~):oa, then a-+* (~)2. (ii) If m, r~O, and 2"'~N" for m~n<m+2r+l; v
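The numbers $q_s$ of Theorem 46 count the equivalence relations on a set of $s$ elements; these are the Bell numbers. As a hedged illustration (the function name is hypothetical, and the Bell triangle is a standard technique that the paper itself does not use), they can be generated and compared with the two estimates $q_s \le 2^{\binom{s}{2}}$ and $q_s \le s!$ quoted in the text.

```python
from math import comb, factorial


def equivalence_relation_counts(n):
    """Return [q_0, q_1, ..., q_n], the numbers of equivalence relations
    on sets of 0, 1, ..., n elements, computed with the Bell triangle."""
    row = [1]
    counts = [1]                      # q_0 = 1 (the empty set)
    for _ in range(n):
        new_row = [row[-1]]           # each row starts with the last entry of the previous row
        for entry in row:
            new_row.append(new_row[-1] + entry)
        row = new_row
        counts.append(row[0])
    return counts


q = equivalence_relation_counts(6)

# Compare with the two rough estimates for 0 < s <= 6.
rough_ok = all(q[s] <= 2 ** comb(s, 2) for s in range(1, 7))
factorial_ok = all(q[s] <= factorial(s) for s in range(1, 7))
```

In particular $q_6 = 203$, the subscript that appears in the special case $r = 2$ of part (i), where $s = \binom{4}{2} = 6$.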
$q_0 = 1$; $q_1 = 1$; $q_2 = 2$; $q_3 = 5$; $q_4 = 15$; $q_5 = 52$; $q_6 = 203$. A rough estimate for all $s$ is

$$q_s \le 2^{\binom{s}{2}},$$

obtained by observing that an equivalence relation is fixed if for p.
s>O,

and hence q. ~s!. Deduction of (ii) from (i). By Theorem 39 (iii), we have """+2r+1 -+ (Cdm

+ 2r + lh2.+2

(k

< Cdo),

and the conclusion follows from Theorem 46 (i). PROOF OF (i). Suppose that (112) holds. Let S=a, and consider any disjoint partition (110). Let .6 be the equivalence relation on [S]r which belongs to (110). Our first aim is to define a certain equivalence relation .6 * on [S]2r. Let [[0, 2r llr= {Po, Ph ... , Then

p.-d ...


Let X= {xo, ... ,X2r-.}
if, and only if, {x;>.:XEP,.}

== pC· a(X»

== {x;>.:XEP.}C·a). Put, for X == y(·a*)

X, YE[S]2r,

if, and only if, ..1.(X) =..1.( Y). Now, by definition of q.. A* has at most q. nonempty classes. By (112), there is BCS such that B=p, and any two elements of [B]2r are equivalent for ..1.*. This means that, in the terminology of [4], A is invariant in [B]r (d. [4, p. 253 J). Choose any A CB such that 2,. < IA I 2r. Then A is invariant in [A]r and hence, by [4, Theorem 2], canonical in [A Jr. Thus there is a canonical equivalence relation ..1.(A) on [B]r such that A=A(A) on [A Jr. It only remains to show that ..1.(A) is independent of A. Let A o, A1CB; 2r< IAol, IAd
P = {Y2;>.: X < r} ;

Q = {Y2).+1-<).: X < r}.

By definition of A.~ .. '''-1' we have P =QC ·A.~o ......-1). Hence P

== Q( . ..1.);

Similarly, by considering the sets P and Q' = {Y2)'+I--<).: X< r}, we find that P =Q'( ·A.:o.. .•r-'); P =Q'(-A.); Ep ;;;; Kp(p

< r).

Hence Ep = Kp for all p. For reasons of symmetry, 'YIP = Kp, and so, finally Ep ="1P (p

positions of the "complete" even graph of cardinal-pair ao, ai, i.e. the graph obtained by joining every "point" of a set of cardinal ao to every point of a disjoint set of cardinal al. More generally, we introduce the notation

for the cartesian product of the $t$ sets $[S_\lambda]^{r_\lambda}$, i.e. we put

$$[S_0, \dots, S_{t-1}]^{r_0, \dots, r_{t-1}} = \{(X_0, \dots, X_{t-1}) : X_\lambda \in [S_\lambda]^{r_\lambda} \text{ for } \lambda < t\}.$$

We shall always have 0
ao

boo

bOl b11

(113)

has, by definition, the following meaning. Whenever

}..
[So, ••• , sl_d ro ..... rl-l =

Is,,1 =a).

for

1: [p < k]K.,

I

then there are sets B).CS).. and an ordinal JI
47. If a' =b', then

(114)

In particular, (114) holds if 1
holds if, and only if, either b=O or b'>No. In particular,

(N

(116)

0) --+ {No

2ND

\No

No 2No

)1.1.

PROOF OF THEOREM 47. If 1
Ko = {(K, }.):K

< a;). < b; K +}. even}.

Then, for any K
a=

L

[II < ",.. ]a,;

b=

L

[v < ",.. ]b.,

where a.
IA.I

IB,I

Ko= {(x,y):xEA"j

yEB,;

= {(x, y):x E

y E B,;

Kl

A,,;

L'

[v <w .. ]B,;

< v < "'.. }, II;:;:; p. < ",.. }.

p.

IA"xl Ixl.

Let XE[A]a; yoEB. Then yoEB. for some v <w... We have ~ A"I
I

]1.1ct

L


Hence, a fortiori, (115) is false. It remains to prove (115) under the assumption $b' > \aleph_0$. Let $|A| = \aleph_0$; $|B| = b$; $[A, B]^{1,1} = K_0 + K_1$. We may suppose that

(117) $(X, Y) \in [A, B]^{\aleph_0, b}$ implies $[X, Y]^{1,1} \not\subset K_1$.

Let (X, Y)E [A, B]Mo,b. 1. Put Yo= E[xEX]{y:(x, Y)EK o}. Then [X, Y= YO ]l,lCKl , and hence, by (117), I Y - Yol
E[x E

x]1

{y:y E Y; (x, y) E Ko}

I ~ I YYol

= h.

Since $b' > \aleph_0 = |X|$, this implies the existence of $x_0 \in X$ such that

$$|\{y : y \in Y;\ (x_0, y) \in K_0\}| = b.$$

Put

$$\phi(X, Y) = x_0; \qquad \psi(X, Y) = \{y : y \in Y;\ (x_0, y) \in K_0\}.$$

Then

$$\phi(X, Y) \in X; \quad \psi(X, Y) \in [Y]^b; \quad [\{\phi(X, Y)\}, \psi(X, Y)]^{1,1} \subset K_0.$$

2. Putf(y) = {x:xEX; (x, Y)EKo} (yE Y). If

yE Y

(118)

If(y) I < No,

implies

yl

thenb=1 =E[pcx; Ipi No, there is PI CX such that Ipil
[1/tI(X, Y), {t/Jl(X, Y)} ]1,1 C Ko. 3. We define sequences x., y .. X., Y. (II<Wo) as follows.

xo=t/J(A, B); Yo=1/t(A, B); yo=t/Jl(A- {xu}, Yo); X o=1/tl(A- {xo}, Yo). For O
x. = t/J(Xp-l, Y p-l - {yl'-d);

Y. = 1/t(Xl'-l, Y 1'-1

y. = t/Jl(Xl'-l - {x.}, Y.);

X, = 1/tl(Xl'-l -

-

{x.},

Then

x. E X,-l C X._ 2

-

{x__d C X.-3 - {XI'-2' x.-d C ...

C X 0 - {Xl, . . . , x..-d C A -

IXo,


. . . , x.-d ;

{Yl'-d); Y.).


y. E Y. C Y __1

-


{y--d C ... C Yo - {YO, .•• , y.-d

C B - {YO, • • • , y.-1} ; [{X.}, Y.]l.l C Ko; [{X.}. {Y., Y'-H, ... } ]1.1 C Ko; [X., {y.} ]1.1 C Ko; [{XO+l' X.+2, ... }. {y.} ]1.1 C Ko; (X", y.) E Ko (j.I, II

< "'0).

This proves (115). Finally, as is well known [13, p. 135], $(2^{\aleph_0})' > \aleph_0$, so that (116) is a special case of (115). This proves Theorem 48. We introduce the notation

I I

where A is a set such that A =a. If a, b
is the ordinary binomial coefficient

The following lemma is probably well known.

LEMMA 6. If $a \ge \aleph_0$, then $\left\{{a \atop b}\right\} = a^b$ for $b \le a$ and $\left\{{a \atop b}\right\} = 0$ for $b > a$.

PROOF. The result is obvious for b = 0 and for b > a. Now let O
On the other hand, if y.EA. for JI
and the lemma follows.


THEOREM 49. Suppose that O<sO;

III

{a1} I k Itao} '0 '1

=

ao (119)

a

'.

(120)

al-l

a~

.

(121)

~ b: 1'0'"

I

Then

•••

I

k

b

'" ... ,Ft-l

b'-l

I

[b~

.

~

b'_l

al-l

'._1 '

','.-1

b.-l

.

~

{a,_l}

'0, ... ,'1-1

k

PROOF. We use the notation of the partition calculus explained in the proof of Theorem 25. In addition, if $\Delta$ is a partition of $M$, and $M' \subset M$, then the relation

$$|\Delta| \le a \text{ in } M'$$

expresses the fact that the number of classes of A containing at least one element of M' is at most a. (119) and (120) imply that bA~aA for)'
(122)

< t.

Let IAAI =aA for)'
IAI I

(123)

IA I ~ 1

in

[Bo,··· B,_d ro , .. ·,rl-l. t

Put, for XAE [AA]"- (s~).
=

II A(Xo, ••• , X ,- l),

where the last product is extended over all systems (X o, ••• , X ,_l ) E [Ao, ... , A._d ro , ... ".-1. By (122), this product has at least one ~ Hence, by (120), there is BAE [AA]~ factor. It follows that (s~).
lAd Ill.

I All

~

1 in

[B.,···, B,-d'., .. ·"1-1.


By (122), we can choose (s ;;; ).

a2(XO,

••• ,

< t).

XI-I) = a(Xo, ... , X,-I, Y .. ... , Y,_I).

;;;Ikl,

Then 1.121 and therefore, by (119), thereisB'AE[A'A]b'A (X<s) such that 1.121 ;;; 1 in [Bo, ... , B._d ro •...• r'-I. By (122), we can choose Y'AE [B'A]"A (X <s). Then, for any X'AE [B'A]"A (X
= (Xo, ... , X.-l , Y" ... , Y 1-1)

=(Yo, ... , Y.-

1,

Y" ... , Y 1-1)( ·a).

This proves (123) and so establishes Theorem 49. We note the following special case of Theorem 49. COROLLARY.

If

then

( ao) -+ al

(bO)I.I. bl

,.

We give some applications of this last result. (a) If O
This is best possible in the sense that, if 2d -1 is replaced by 2d - 2, the last relation becomes false. We even have, as is easily seen,

(b) If al >

Ikl >0; at > Ikl ao , then

In particular, if we assume that 2Mo=N" then


(

~n

)

\NnH

(~n


)1.1

\NnH I .

-+

This is best possible in the following strong sense. (c) If a'~ kl ; b>O, then

I

To prove (c), choose n~k such that Inl =a'. Then a= L:[v
I

THEOREM 50. If

21'0 =~1,

then

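PROOF. Let $A = [0, \omega_0]$; $B = [0, \omega_1]$. According to Sierpiński¹⁴ the assumption $2^{\aleph_0} = \aleph_1$ implies, and is, in fact, equivalent to, the existence of a sequence of functions $f_\lambda(y) \in B$ $(\lambda \in A)$, defined for $y \in B$, such that, given any $Y \in [B]^{\aleph_1}$, there is $\lambda_0 \in A$ such that $\{f_\lambda(y) : y \in Y\} = B$ $(\lambda_0 \le \lambda < \omega_0)$. Then $[A, B]^{1,1} = K_0 +' K_1$, where $K_0 = \{(\lambda, y) : \lambda \in A;\ y \in B;\ f_\lambda(y) = 0\}$. If, now, $(X, Y) \in [A, B]^{\aleph_0, \aleph_1}$, then, by the property of the functions $f_\lambda$, there is $\lambda \in X$; $y_0, y_1 \in Y$ such that $f_\lambda(y_\nu) = \nu$ $(\nu < 2)$. Then $(\lambda, y_\nu) \in [X, Y]^{1,1} K_\nu$ $(\nu < 2)$. This proves the assertion.

THEOREM 51. If a, b> 1; ( : ) -+ ( :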

Then ( ::) -+ (

X·

1 •

::X'l . I

PROOF. Let 1 and m be such that III =a'; ml =b'. Then a= L:[X
IAAI

14

[14], French translation in [16]. See also [1].


A = E' [X
=

< I; A"K ¢ o};

{~:~

B"

= {I':I' < m; B"Y ¢

IAAI

OJ.

Then a= Ixi = E[XEA"]IA"XI; IA"xl ~
I

I

COROLLARY,

For, if

If

III

I

a> 1, then

(:') -H (:/X,l . ( :') _ (:/X'1 ,

then, by Theorem 51 and the known equation a" =a' , we conclude that

which contradicts Theorem 47. We may mention that there is an obvious extension of Theorem 51 to relations for any $t$. In conclusion, we collect some polarised partition relations involving the first three infinite cardinals. They follow from Theorems 47-50. We put $\aleph_0 = a$; $\aleph_1 = b$; $\aleph_2 = d$.

(:) -HG :Y'\ (:) -HG ~Y'\ (~) -H (~ :y'l

(:)_(: :Y'\ (:)_(: :y'l 239

(Theorem 47); (Theorem 48).

If $2^a = b$, then

(:)-+(: :y.l

and

(:)~(:

:y.l


(Theorem 49)

(Theorem 50).

It seems curious that the continuum hypothesis should enable us both to strengthen

to

and to show that

cannot be strengthened to

REFERENCES

1. F. Bagemihl and H. D. Sprinkle, On a proposition of Sierpiński, Proc. Amer. Math. Soc. vol. 5 (1954) pp. 726-728.
2. B. Dushnik and E. W. Miller, Partially ordered sets, Amer. J. Math. vol. 63 (1941) p. 605.
3. P. Erdős, Some set-theoretical properties of graphs, Revista Universidad Nacional de Tucumán, Serie A vol. 3 (1942) pp. 363-367.
4. P. Erdős and R. Rado, A combinatorial theorem, J. London Math. Soc. vol. 25 (1950) pp. 249-255.
5. --, Combinatorial theorems on classifications of subsets of a given set, Proc. London Math. Soc. (3) vol. 2 (1952) pp. 417-439.
6. --, A problem on ordered sets, J. London Math. Soc. vol. 28 (1953) pp. 426-438.
7. P. Erdős and G. Szekeres, A combinatorial problem in geometry, Compositio Math. vol. 2 (1935) pp. 463-470.
8. P. Hall, On representations of sub-sets, J. London Math. Soc. vol. 10 (1934) pp. 26-30.
9. F. Hausdorff, Grundzüge einer Theorie der geordneten Mengen, Math. Ann. vol. 65 (1908) pp. 435-506.
10. --, Mengenlehre, 3d ed., 1944, §16.
11. R. Rado, Direct decompositions of partitions, J. London Math. Soc. vol. 29 (1954) pp. 71-83.
12. F. P. Ramsey, On a problem of formal logic, Proc. London Math. Soc. (2) vol. 30 (1930) pp. 264-286.
13. W. Sierpiński, Leçons sur les nombres transfinis, Paris, 1928.
14. --, O jednom problemu G RusjeviCa koji se odnosi na hipotesu kontinuuma, Glas Srpske Kraljevske Akademije vol. 152 (1932) pp. 163-169.
15. --, Sur un problème de la théorie des relations, Annali R. Scuola Normale Superiore de Pisa Ser. 2 vol. 2 (1933) pp. 285-287.
16. --, Concernant l'hypothèse du continu, Académie Royale Serbe. Bulletin de l'Académie des Sciences Mathématiques et Naturelles. A. Sciences Mathématiques et Physiques vol. 1 (1933) pp. 67-73.
17. A. Tarski, Quelques théorèmes sur les alephs, Fund. Math. vol. 7 (1925) p. 2.

HEBREW UNIVERSITY OF JERUSALEM AND UNIVERSITY OF READING

Reprinted from Bull. Amer. Math. Soc. 62 (1956), 427-489


MAXIMAL FLOW THROUGH A NETWORK

L. R. FORD, JR. AND D. R. FULKERSON

Introduction. The problem discussed in this paper was formulated by T. Harris as follows: "Consider a rail network connecting two cities by way of a number of intermediate cities, where each link of the network has a number assigned to it representing its capacity. Assuming a steady state condition, find a maximal flow from one given city to the other." While this can be set up as a linear programming problem with as many equations as there are cities in the network, and hence can be solved by the simplex method (1), it turns out that in the cases of most practical interest, where the network is planar in a certain restricted sense, a much simpler and more efficient hand computing procedure can be described. In §1 we prove the minimal cut theorem, which establishes that an obvious upper bound for flows over an arbitrary network can always be achieved. The proof is non-constructive. However, by specializing the network (§2), we obtain as a consequence of the minimal cut theorem an effective computational scheme. Finally, we observe in §3 the duality between the capacity problem and that of finding the shortest path, via a network, between two given points.

1. The minimal cut theorem. A graph G is a finite, 1-dimensional complex, composed of vertices a, b, c, ..., e, and arcs $\alpha(ab)$, $\beta(ac)$, ..., $\delta(ce)$. An arc $\alpha(ab)$ joins its end vertices a, b; it passes through no other vertices of G and intersects other arcs only in vertices. A chain is a set of distinct arcs of G which can be arranged as $\alpha(ab)$, $\beta(bc)$, $\gamma(cd)$, ..., $\delta(gh)$, where the vertices a, b, c, ..., h are distinct, i.e., a chain does not intersect itself; a chain joins its end vertices a and h. We distinguish two vertices of G: a, the source, and b, the sink.¹ A chain flow from a to b is a couple (C; k) composed of a chain C joining a and b, and a non-negative number k representing the flow along C from source to sink. Each arc in G has associated with it a positive number called its capacity.
We call the graph G, together with the capacities of its individual arcs, a network. A flow in a network is a collection of chain flows which has the property that the sum of the numbers of all chain flows that contain any arc is no greater than the capacity of that arc. If equality holds, we say the arc is saturated by the flow. A chain is saturated with respect to a flow if it contains a saturated arc. The value of a flow is the sum of the numbers of all the chain flows which compose it.

Received September 20, 1955.

¹The case in which there are many sources and sinks with shipment permitted from any source to any sink is obviously reducible to this.

It is clear that the above definition of flow is not broad enough to include everything that one intuitively wishes to think of as a flow, for example, sending trains out a dead end and back or around a circuit, but as far as effective transportation is concerned, the definition given suffices.

A disconnecting set is a collection of arcs which has the property that every chain joining a and b meets the collection. A disconnecting set, no proper subset of which is disconnecting, is a cut. The value of a disconnecting set D (written v(D)) is the sum of the capacities of its individual members. Thus a disconnecting set of minimal value is automatically a cut.

THEOREM 1. (Minimal cut theorem). The maximal flow value obtainable in a network N is the minimum of v(D) taken over all disconnecting sets D.

Proof. There are only finitely many chains joining a and b, say n of them. If we associate with each one a coordinate in n-space, then a flow can be represented by a point whose jth coordinate is the number attached to the chain flow along the jth chain. With this representation, the class of all flows is a closed, convex polytope in n-space, and the value of a flow is a linear functional on this polytope. Hence, there is a maximal flow, and the set of all maximal flows is convex. Now let S be the class of all arcs which are saturated in every maximal flow.

LEMMA 1. S is a disconnecting set.

Suppose not. Then there exists a chain $\alpha_1, \alpha_2, \dots, \alpha_m$ joining a and b with $\alpha_i \notin S$ for each i. Hence, corresponding to each $\alpha_i$, there is a maximal flow $f_i$ in which $\alpha_i$ is unsaturated. But the average of these flows,

$$f = \frac{1}{m} \sum f_i,$$

is maximal and $\alpha_i$ is unsaturated by f for each i. Thus the value of f may be increased by imposing a larger chain flow on $\alpha_1, \alpha_2, \dots, \alpha_m$, contradicting maximality.

Notice that the orientation assigned to an arc of S by a positive chain flow of a maximal flow is the same for all such chain flows. For suppose first that $(C_1; k_1)$, $(C_2; k_2)$ are two chain flows occurring in a maximal flow f, $k_1 \ge k_2 > 0$, where

$$C_1 = \alpha_1(a a_1), \alpha_2(a_1 a_2), \dots, \alpha_j(a_{j-1} a_j), \dots, \alpha_r(a_{r-1} b),$$
$$C_2 = \beta_1(a b_1), \beta_2(b_1 b_2), \dots, \beta_k(b_{k-1} b_k), \dots, \beta_s(b_{s-1} b),$$

and $\alpha_j(a_{j-1} a_j) = \beta_k(b_{k-1} b_k) \in S$, $a_{j-1} = b_k$, $a_j = b_{k-1}$. Then

$$C_1' = \alpha_1, \dots, \alpha_{j-1}, \beta_{k+1}, \dots, \beta_s,$$
$$C_2' = \beta_1, \dots, \beta_{k-1}, \alpha_{j+1}, \dots, \alpha_r$$

contain chains $C_1''$, $C_2''$ joining a and b, and another maximal flow can be obtained from f as follows. Reduce the $C_1$ and $C_2$ components of f each by $k_2$, and increase each of the $C_1''$ and $C_2''$ components by $k_2$. This unsaturates the arc $\alpha_j$, contradicting its definition as an element of S. On the other hand, if $(C_1; k_1)$, $(C_2; k_2)$ were members of distinct maximal flows $f_1$, $f_2$, consideration of $f = \tfrac{1}{2}(f_1 + f_2)$ brings us back to the former case. Hence, the arcs of S have a definite orientation assigned to them by maximal flows. We refer to that vertex of an arc $\alpha \in S$ which occurs first in a positive chain flow of a maximal flow as the left vertex of $\alpha$.

Now define a left arc of S as follows: an arc $\alpha$ of S is a left arc if and only if there is a maximal flow f and a chain $\alpha_1, \alpha_2, \dots, \alpha_k$ (possibly null) joining a and the left vertex of $\alpha$ with no $\alpha_i$ saturated by f. Let L be the set of left arcs of S.

LEMMA 2. L is a disconnecting set.

Given an arbitrary chain $\alpha_1(a a_1), \alpha_2(a_1 a_2), \dots, \alpha_m(a_{m-1} b)$ joining a and b, it must intersect S by Lemma 1. Let $\alpha_t(a_{t-1} a_t)$ be the first $\alpha_i \in S$. Then for each $\alpha_i$, $i < t$, there is a maximal flow $f_i$ in which $\alpha_i$ is unsaturated. The average of these flows provides a maximal flow f in which $\alpha_1, \alpha_2, \dots, \alpha_{t-1}$ are unsaturated. It remains to show that this chain joins a to the left vertex of $\alpha_t$, i.e., $a_{t-1}$ is the left vertex of $\alpha_t$. Suppose not. Then the maximal flow f contains a chain flow

$$[\beta_1(a b_1), \beta_2(b_1 b_2), \dots, \beta_r(b_{r-1} b);\ k], \quad k > 0, \quad \beta_s = \alpha_t, \quad b_{s-1} = a_t, \quad b_s = a_{t-1}.$$

Let the amount of unsaturation in f of $\alpha_i$ $(i = 1, \dots, t-1)$ be $k_i > 0$. Now alter f as follows: decrease the flow along the chain $\beta_1, \beta_2, \dots, \beta_r$ by $\min[k, k_i] > 0$ and increase the flow along the chain contained in $\alpha_1, \dots, \alpha_{t-1}, \beta_{s+1}, \dots, \beta_r$ by this amount. The result is a maximal flow in which $\alpha_t$ is unsaturated, a contradiction. Hence $\alpha_t \in L$.

LEMMA 3. No positive chain flow of a maximal flow can contain more than one arc of L.

Assume the contrary, that is, there is a maximal flow $f_1$ containing a chain flow $[\beta_1(a b_1), \beta_2(b_1 b_2), \dots, \beta_r(b_{r-1} b);\ k]$, $k > 0$, with arcs $\beta_i, \beta_j \in L$, $\beta_i$ occurring before $\beta_j$, say, in the chain. Let $f_2$ be that maximal flow for which there is an unsaturated chain $\alpha_1(a a_1), \alpha_2(a_1 a_2), \dots, \alpha_s(a_{s-1} b_{j-1})$ from a to the left vertex of $\beta_j$. Consider $f = \tfrac{1}{2}(f_1 + f_2)$. This maximal flow contains the chain flow $[\beta_1, \beta_2, \dots, \beta_r;\ k']$ with $k' \ge \tfrac{1}{2}k$, and each $\alpha_i$ $(i = 1, \dots, s)$ is unsaturated by $k_i > 0$ in f. Again alter f: decrease the flow along $\beta_1, \beta_2, \dots, \beta_r$ by $\min[k', k_i] > 0$ and increase the flow along the chain contained in $\alpha_1, \alpha_2, \dots, \alpha_s, \beta_j, \dots, \beta_r$ by the same amount, obtaining a maximal flow in which $\beta_i$ is unsaturated, a contradiction.

Now to prove the theorem it suffices only to remark that the value of every flow is no greater than v(D) where D is any disconnecting set; and on the other hand we see from Lemma 3 and the definition of S that in adding the capacities of arcs of L we have counted each chain flow of a maximal flow just once. Since by Lemma 2 L is a disconnecting set, we have the reverse inequality. Thus L is a minimal cut and the value of a maximal flow is v(L).

We shall refer to the value of a maximal flow through a network N as the capacity of N (cap (N)). Then note the following corollary of the minimal cut theorem.

COROLLARY. Let A be a collection of arcs of a network N which meets each cut of N in just one arc. If N' is a network obtained from N by adding k to the capacity of each arc of A, then cap (N') = cap (N) + k.
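The statement of Theorem 1 can be checked on a small example. The sketch below is not the method of the paper, whose proof is non-constructive: it computes the maximal flow value with a breadth-first augmenting-path routine (a later, standard technique), treating each undirected arc as usable in either direction up to its capacity, and compares it with the least value of a disconnecting set found by exhaustive search. The network `ARCS` and all function names are assumptions made for illustration.

```python
from collections import deque
from itertools import combinations

# Hypothetical network: arcs (end, end, capacity); source "a", sink "b".
ARCS = [("a", "c", 3), ("a", "d", 2), ("c", "d", 1), ("c", "b", 1), ("d", "b", 3)]


def max_flow(arcs, source, sink):
    """Maximal flow value via shortest augmenting paths on residual capacities."""
    cap = {}
    for u, v, c in arcs:
        for x, y in ((u, v), (v, u)):
            cap.setdefault(x, {})
            cap[x][y] = cap[x].get(y, 0) + c
    value = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return value
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[x][y] for x, y in path)
        for x, y in path:
            cap[x][y] -= push
            cap[y][x] += push
        value += push


def connected(arcs, source, sink):
    """True if some chain of the given arcs joins source and sink."""
    seen, stack = {source}, [source]
    while stack:
        u = stack.pop()
        for x, y, _ in arcs:
            for p, q in ((x, y), (y, x)):
                if p == u and q not in seen:
                    seen.add(q)
                    stack.append(q)
    return sink in seen


def min_disconnecting_value(arcs, source, sink):
    """Least v(D) over all disconnecting sets D, by brute force."""
    best = None
    for size in range(len(arcs) + 1):
        for chosen in combinations(range(len(arcs)), size):
            rest = [arc for i, arc in enumerate(arcs) if i not in chosen]
            if not connected(rest, source, sink):
                value = sum(arcs[i][2] for i in chosen)
                best = value if best is None else min(best, value)
    return best
```

On this network both quantities equal 4, attained for example by the cut consisting of the two arcs at b.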

It is worth pointing out that the minimal cut theorem is not true for networks with several sources and corresponding sinks, where shipment is restricted to be from a source to its sink. For example, in the network (Fig. 1) with shipment from $a_i$ to $b_i$ and capacities as indicated, the value of a minimal disconnecting set (i.e., a set of arcs meeting all chains joining sources and corresponding sinks) is 4, but the value of a maximal flow is 3.

[Fig. 1]

2. A computing procedure for source-sink planar networks.² We say that a network N is planar with respect to its source and sink, or briefly, N is ab-planar, provided the graph G of N, together with arc ab, is a planar graph (2; 3). (For convenience, we suppose there is no arc in G joining a and b.) The importance of ab-planar networks lies in the following theorem.

²It was conjectured by G. Dantzig, before a proof of the minimal cut theorem was obtained, that the computing procedure described in this section would lead to a maximal flow for planar networks.

THEOREM 2. If N is ab-planar, there exists a chain joining a and b which meets each cut of N precisely once.

Proof. We may assume, without loss of generality, that the arc ab is part of the boundary of the outside region, and that G lies in a vertical strip with a located on the left bounding line of the strip, b on the right. Let T be the chain joining a and b which is top-most in N. T has the desired property, as we now show. Suppose not. Then there is a cut D, at least two arcs of which are in T. Let these be $\alpha_1$ and $\alpha_2$, with $\alpha_1$ occurring before $\alpha_2$ in following T from a to b. Since D is a cut, there is a chain $C_1$ joining a and b which meets D in $\alpha_1$ only. Similarly there is a chain $C_2$ meeting D in $\alpha_2$ only. Let $C_2'$ be that part of $C_2$ joining a to an end point of $\alpha_2$. It follows from the definition of T that $C_1$ and $C_2'$ must intersect. But now, starting at a, follow $C_2'$ to its last intersection with $C_1$, then $C_1$ to b. We thus have a chain from a to b not meeting D, contradicting the fact that D is a cut. Symmetrically, of course, the bottom-most chain of N has the same property. Notice that this theorem is not valid for networks which are not ab-planar. A simple example showing this is provided by the "gas, water, electricity" graph (Fig. 2), in which every chain joining a and b meets some cut in three arcs.

Fig. 2

Theorem 2 and the corollary to Theorem 1 provide an easy computational procedure for determining a maximal flow in a network of the kind here considered. Simply locate a chain having the property of Theorem 2; this can be done at a glance by finding the two regions separated by arc ab, and taking the rest of the boundary of either region (throwing out portions of the boundary where it has looped back and intersected itself, so as to get a chain). Impose as large a chain flow (T; k) as possible on this chain, thereby saturating one or more of its arcs. By the corollary, subtracting k from each capacity in T reduces the capacity of N by k. Delete the saturated arcs, and proceed as before. Eventually, the graph disconnects, and a maximal flow has been constructed.

3. A minimal path problem. For source-sink planar networks, there is an interesting duality between the problem of finding a chain of minimal capacity-sum joining source and sink and the network capacity problem, which lies in the fact that chains of N joining source and sink correspond to cuts (relative to two particular vertices) of the dual³ of N and vice versa. More precisely, suppose one has a network N, planar relative to two vertices a and b, and wishes to find a chain joining a and b such that the sum of the numbers assigned to the arcs of the chain is minimal. An easy way to solve this problem is as follows. Add the arc ab, and construct the dual of the resulting graph G. Let a' and b' be the vertices of the dual which lie in the regions of G separated by ab. Assign each number of the original network to the corresponding arc in the dual. Then solve the capacity problem relative to a' and b' for the dual network by the procedure of §2. A minimal cut thus constructed corresponds to a minimal chain in the original network.

³The dual of a planar graph G is formed by taking a vertex inside each region of G and connecting vertices which lie in adjacent regions by arcs. See (1; 3).

REFERENCES

1. G. B. Dantzig, Maximization of a linear function of variables subject to linear inequalities: Activity analysis of production and allocation (Cowles Commission, 1951).
2. H. Whitney, Non-separable and planar graphs, Trans. Amer. Math. Soc., 34 (1932), 339-362.
3. H. Whitney, Planar graphs, Fundamenta Mathematicae, 21 (1933), 73-84.

Rand Corporation, Santa Monica, California

Reprinted from Canad. J. Math. 8 (1956), 399-404

ON PICTURE-WRITING* G. POLYA, Stanford University

To write "sun", "moon" and "tree" in picture-writing, one draws simply a circle, a crescent and some simplified, conventionalized picture of a tree, respectively. Picture-writing was used by some tribes of red Indians and it may well be that more advanced systems of writing evolved everywhere from this primitive system. And so picture-writing may be the ultimate source of the Greek, Latin and Gothic alphabets, the letters of which we currently use as mathematical symbols. I wish to observe that also the primitive picture-writing may be of some use in mathematics. In what follows, I wish to show how the method of generating functions, important in Combinatory Analysis, can be quite intuitively evolved from "figurate series" the terms of which are pictures (or, more precisely, variables represented by pictures).

Picture-writing is easy to use on paper or blackboard, but it is clumsy and expensive to print. Although I have presented several times the contents of the following pages orally, I hesitated to print it.† I am indebted to the editor of the MONTHLY who encouraged me to publish this article. I shall try to explain the general idea by discussing three particular examples the first of which, although the easiest, will be very broadly treated.

1.1. In how many ways can you change one dollar? Let us generalize the proposed question. Let $P_n$ denote the number of ways of paying the amount of n cents with five kinds of coins: cents, nickels, dimes, quarters and half-dollars. The "way of paying" is determined if, and only if, it is known how many coins of each kind are used. Thus, $P_4 = 1$, $P_5 = 2$, $P_{10} = 4$. It is appropriate to set $P_0 = 1$. The problem stated at the outset requires us to compute $P_{100}$. More generally, we wish to understand the nature of $P_n$ and eventually devise a procedure for computing $P_n$.

It may help to visualize the various possibilities. We may use no cent, or just 1 cent, or 2 cents, or 3 cents, or .... These alternatives are schematically pictured in the first line of Figure 1;** "no cent" is represented by a square which may remind us of an empty desk. The second line pictures the alternatives: using no nickel, 1 nickel, 2 nickels, .... The following three lines represent in the same way the possibilities regarding dimes, quarters and half-dollars. We have to choose one picture from the first line, then one picture from the second line, and so on, choosing just one picture from each line; combining (juxtaposing) the five pictures so selected, we obtain a manner of paying. Thus, Figure 1 exhibits directly the alternatives regarding each kind of coin and, indirectly, all manners of paying we are concerned with.

* Address presented at the meeting of the Association in Athens, Ga., March 16, 1956.
† I used it, however, in research. See 2, especially p. 156, where the "figurate series" are introduced in a closely related, but somewhat different, form. (Numerals in boldface indicate the references at the end of the paper.)
** A photo of actual coins would be more effective here but too clumsy in the following figures.


FIG. 1. A complete survey of alternatives.

FIG. 2. Genesis of the figurate series.

The main discovery consists in observing that, in fact, we combine the pictures in Figure 1 according to certain rules of algebra: if we conceive each line of Figure 1 as the sum of the pictures contained in it and we consider the product of these five (infinite) sums, in short, if we pass from Figure 1 to Figure 2, and we develop the product, the terms of this development will represent the various manners of paying we are concerned with. The one term of the product exhibited in the last line of Figure 2 as an example represents one manner of paying one dollar (putting down no cents, three nickels, one dime, one quarter and one half-dollar). The sum of all such terms is an infinite series of pictures; each picture exhibits one manner of paying, different terms represent different manners of paying, and the whole series of pictures, appropriately called the figurate series, displays all manners of paying that we have to consider when we wish to compute the numbers $P_n$.

1.2. Yet this way of conceiving Figure 2 raises various difficulties. First, there is a theoretical difficulty: in which sense can we add and multiply pictures? Then, there is a practical difficulty: how can we pick out conveniently from the whole figurate series the terms counted by $P_n$, that is, those cases in which the sum paid amounts to just n cents?

We avoid the theoretical difficulty if we employ the pictures, these symbols of a primitive writing, as we are used to employing the letters of more civilized alphabets: we regard each picture as the symbol for a variable or indeterminate.† To master the other difficulty, we need one more essential idea: we substitute for each "pictorial" variable (that is, variable represented by a picture) a power of a new variable x, the exponent of which is the joint value of the coins represented by the picture, as it is shown in detail by Figure 3. The third line of Figure 3 shows a lucky coincidence: we have conceived the three juxtaposed nickels as one picture, as the symbol of one variable (corresponding to the use of precisely three nickels). For this variable we have to substitute $x^{15}$ according to our general rule; yet even if we substitute for each of the juxtaposed coins the correct power of x and consider the product of these juxtaposed powers, we arrive at the same final result $x^{15}$.

FIG. 3. Powers of one variable substituted for variables represented by pictures.

The last line of Figure 3 is very important. It shows by an example (see the last line of Fig. 2) how the described substitution affects the general term of the figurate series. Such a term is the product of 5 pictures (pictorial variables). For each factor a power of x is substituted whose exponent is the value in cents of that factor; the exponent of the product, obtained as a sum of 5 exponents, will be the joint value of the factors. And so the substitution indicated by Figure 3 changes each term of the figurate series into a power $x^n$. As the figurate series represents each manner of paying just once, the exponent n arises precisely $P_n$ times so that (after suitable rearrangement of the terms) the whole figurate series goes over into

(1)  $P_0 + P_1 x + P_2 x^2 + \cdots + P_n x^n + \cdots$.

In this series the coefficient of $x^n$ enumerates the different manners of paying the amount of n cents, and so (1) is suitably called the enumerating series. The substitution indicated by Figure 3 changes the first line of Figure 2 into a geometric series:

(2)  $1 + x + x^2 + x^3 + \cdots = (1 - x)^{-1}$.

In fact, this substitution changes each of the first five lines of Figure 2 into some geometric series and the equation indicated by Figure 2 goes over into

(3)  $(1 - x)^{-1}(1 - x^5)^{-1}(1 - x^{10})^{-1}(1 - x^{25})^{-1}(1 - x^{50})^{-1} = P_0 + P_1 x + P_2 x^2 + \cdots + P_n x^n + \cdots$.

† In a formal presentation it may be advisable to restrict the term "picture" to denote a (visible, written or printed) symbol that stands for an indeterminate; in the present introductory, rather informal, address the word is now and then more loosely used. Let us pass over two somewhat touchy points: the infinity of variables and the convergence of the series in which they arise. Both are considered in certain advanced theories and both are momentary. They will be eliminated by the next step.

We have succeeded in expressing the sum of the enumerating series. This sum is usually termed the generating function; in fact, this function, expanded in powers of x, generates the numbers $P_0, P_1, \dots, P_n, \dots$, the combinatorial meaning of which was our starting point.

1.3. We have reduced a combinatorial problem to a problem of a different kind: expanding a given function of x in powers of x. In particular, we have reduced our initial problem about changing a dollar to the problem of computing the coefficient of $x^{100}$ in the expansion of the left hand side of (3). Our main goal was to show how picture-writing can be used for this reduction. Yet let us add a brief indication about the numerical computation. The left hand side of (3) is a product of five factors. The well known expansion of the first factor is shown by (2). We proceed by adjoining successive factors, one at a time. Assume, for example, that we have already obtained the expansion of the product of the first two factors:

$(1 - x)^{-1}(1 - x^5)^{-1} = a_0 + a_1 x + a_2 x^2 + \cdots$,

and we wish to go on hence to three factors:

$(1 - x)^{-1}(1 - x^5)^{-1}(1 - x^{10})^{-1} = b_0 + b_1 x + b_2 x^2 + \cdots$.

It follows that

$(b_0 + b_1 x + b_2 x^2 + \cdots)(1 - x^{10}) = a_0 + a_1 x + a_2 x^2 + \cdots$.

Comparing the coefficient of $x^n$ on both sides, we find that

(4)  $b_n = b_{n-10} + a_n$

(set $b_m = 0$ if $m < 0$).

each column shows the value of n, the beginning of each row the last factor taken into account; the bottom row would show $P_n$ for $n = 0, 5, 10, \dots, 50$ if we had computed it. Yet the table registers only the steps needed for computing the answer to our initial question and yields $P_{50} = 50$; that is, one can pay 50 cents in exactly 50 different ways. We leave it to the reader to continue the computation and verify that $P_{100} = 292$; he can also try to justify the procedure of computation directly without resorting to the enumerating series.*

Table to compute $P_{50}$

                  n = 0    5   10   15   20   25   30   35   40   45   50
(1-x)^{-1}            1    1    1    1    1    1    1    1    1    1    1
(1-x^5)^{-1}          1    2    3    4    5    6    7    8    9   10   11
(1-x^{10})^{-1}       1    2    4    6    9   12   16        25        36
(1-x^{25})^{-1}       1                        13                       49
(1-x^{50})^{-1}       1                                                 50
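The procedure of adjoining one factor at a time can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `count_ways` and the in-place list update are choices made here, not part of the paper. The inner loop applies recursion (4) with the step 10 replaced by the value of the coin being adjoined.

```python
def count_ways(n, coins=(1, 5, 10, 25, 50)):
    """Number of ways to pay n cents (P_n in the text); name is hypothetical."""
    # ways[m] is the coefficient of x^m in the product of the factors adjoined so far.
    ways = [1] + [0] * n  # empty product: only x^0 has coefficient 1
    for c in coins:  # adjoin the factor (1 - x^c)^(-1)
        for m in range(c, n + 1):
            # Recursion (4): new coefficient = new coefficient at m - c + old coefficient at m.
            ways[m] += ways[m - c]
    return ways[n]


if __name__ == "__main__":
    print(count_ways(50), count_ways(100))
```

Because `m` runs upward, `ways[m - c]` already holds the updated coefficient when `ways[m]` is revised, which is exactly what (4) requires.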

2.1. Dissect a convex polygon with n sides into n-2 triangles by n-3 diagonals and compute $D_n$, the number of different dissections of this kind. Examining first the simplest particular cases helps to understand the problem. We easily see that $D_4 = 2$, $D_5 = 5$; of course $D_3 = 1$. The solution is indicated by the parts (I), (II), and (III) of Figure 4. After the broad discussion of the foregoing solution it should not be difficult to understand the indications of Figure 4.

Part (I) of Figure 4 hints the key idea: we build up the dissections of any polygon that is not a triangle from the dissections of other polygons which have fewer sides. For this purpose, we emphasize one of the sides of the polygon, place it horizontally at the bottom and call it the base. One of the triangles into which the polygon is dissected has the base as side; we call this triangle $\Delta$. In the given polygon there are two smaller polygons, one to the left, the other to the right, of $\Delta$. For example, the top line of Figure 4 (I) shows an octagon in which there is a quadrilateral to the left, and a pentagon to the right, of $\Delta$, both suitably dissected. As the figure suggests, we can generate this dissection of the octagon by starting from $\Delta$ and placing on it, from both sides, the two other appropriately pre-dissected polygons. We may hope that building up the dissections in this manner will be useful.

In exploring the prospects of this idea, we may run into an objection: there are cases, such as the one displayed in the second line of Figure 4 (I), in which the partial polygon on a certain side of $\Delta$ does not exist. Yet we can parry this objection: yes, the partial polygon on that side of $\Delta$ (the left side in the case of the figure) does exist, but it is degenerate; it is reduced to a mere segment.

* For the usual method of deriving the generating function, cf. 1, Vol. 1, p. 1, Problem 1.

FIG. 4. Key idea, figurate series, transition.

Part (II) of Figure 4 shows the genesis of the figurate series. This series, which occupies the first line, is the sum of all possible dissections of polygons with 3, 4, 5, ... sides. According to Part (I) (as the next line reminds us) each term of the figurate series can be generated by placing two pre-dissected polygons on a triangle $\Delta$, one from the left and one from the right (one or the other of which, or possibly both, may be degenerate). Therefore, as the next line (the last of Figure 4 (II)) indicates, the terms of the figurate series are in one-one correspondence with the expansion of a product of three factors: the middle factor is just a triangle, the other two factors are equal to the figurate series augmented by the segment.

2.2. Part (III) of Figure 4 hints the transition from the figurate series to the enumerating series. Following the pattern set by Figure 3 and Section 1.2, we substitute for each dissection (more precisely, for the variable represented by that dissection) a power of x the exponent of which is the number of triangles in that dissection. This substitution, indicated by Figure 4 (III), changes the figurate series into

(5)  $D_3 x + D_4 x^2 + D_5 x^3 + \cdots + D_n x^{n-2} + \cdots = E(x)$,

where E(x) stands for enumerating series. The relation displayed by Figure 4 (II) goes over into

(6)  $E(x) = x[1 + E(x)]^2$.

This is a quadratic equation for E(x) the solution of which is

(7)  $E(x) = D_3 x + D_4 x^2 + D_5 x^3 + \cdots + D_n x^{n-2} + \cdots = \dfrac{1 - 2x - [1 - 4x]^{1/2}}{2x} = x + 2x^2 + \cdots$.

In fact, to arrive at (7), we have to discard the other solution of the quadratic equation (6) which becomes $\infty$ for $x = 0$.

2.3. We have reduced our original problem which was to compute $D_n$ to a problem of a different kind: to find the coefficient of $x^{n-2}$ in the expansion of the function (7) in powers of x.* This latter is a routine problem which we need not discuss broadly. We obtain from (7), using the binomial formula and straightforward transformations, that for $n \ge 3$

$D_n = -\dfrac{1}{2}\dbinom{1/2}{n-1}(-4)^{n-1} = \dfrac{2}{2}\cdot\dfrac{6}{3}\cdot\dfrac{10}{4}\cdots\dfrac{4n-10}{n-1}$.
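As a check on the product formula, one may compare it with the coefficients obtained directly from relation (6). The following Python sketch does so; the function names are hypothetical and introduced only for this illustration. It reads (6) as a recursion: the coefficient of $x^k$ in $E(x)$ equals the coefficient of $x^{k-1}$ in $[1 + E(x)]^2$.

```python
from fractions import Fraction


def dissections_by_recursion(n_max):
    """Return {n: D_n} for 3 <= n <= n_max, using E(x) = x [1 + E(x)]^2."""
    # g[k] = coefficient of x^k in 1 + E(x); g[0] = 1 stands for the degenerate segment.
    g = [1]
    for k in range(1, n_max - 1):  # k = number of triangles = n - 2
        g.append(sum(g[i] * g[k - 1 - i] for i in range(k)))
    return {k + 2: g[k] for k in range(1, n_max - 1)}


def dissections_by_product(n):
    """D_n = (2/2)(6/3)(10/4)...((4n-10)/(n-1)) for n >= 3."""
    d = Fraction(1)
    for m in range(2, n):
        d *= Fraction(4 * m - 6, m)
    return int(d)


if __name__ == "__main__":
    table = dissections_by_recursion(8)
    for n, value in table.items():
        print(n, value, dissections_by_product(n))
```

Exact rational arithmetic (`Fraction`) is used so that the intermediate quotients in the product introduce no rounding.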

3.1. A (topological) tree is a connected system of two kinds of objects, lines and points, that contains no closed path. A certain point of the tree in which just one line ends is called the root of the tree, the line starting from the root the trunk, any point different from the root a knot. In Figure 5 the root is indicated by an arrow, and each knot by a small circle. Our problem is: compute $T_n$, the number of different trees with n knots. It makes no difference whether the lines are long or short, straight or curved, drawn on the paper to the left or to the right: only the difference in (topological) connection is relevant. Examining the simplest cases may help the reader to understand the intended meaning of the problem; it is easily seen that $T_1 = 1$, $T_2 = 1$, $T_3 = 2$, $T_4 = 4$, $T_5 = 9$.†

* For a more usual method cf. 3, Vol. 1, p. 102, Problems 7, 8, and 9.

† The trees here considered should be called more specifically root-trees; see 4, Vol. 11, p. 365. Their definition which is merely hinted here is elaborated in 2, pp. 181-191; see also the passages there quoted of 5. It may be, however, sufficient and in some respects even advantageous if, at a first reading, the reader takes the definition "intuitively" and supplements it by examples. Observe that in Cayley's first paper on the subject, 4, Vol. 3, pp. 242-246, the definition of a tree is not even attempted. Chemistry is one of the sources of the notion "tree": if the points stand for atoms and the connecting lines for valencies, the tree represents a chemical compound.

FIG. 5. Key idea, figurate series, transition.

The solution is indicated by the three parts of Figure 5 the general arrangement of which is closely similar to that of Figure 4. The reader should try to understand the solution by merely looking at Figure 5 and observing relevant analogies with all the foregoing figures. He may, however, fall back upon the following brief comments. The simplest tree consists of root, trunk and just one knot. The key idea is to build up any tree different from the simplest tree from other trees which have fewer knots. For this purpose we conceive, as Figure 5 (I) shows, the "main branches" of any tree as trees (with fewer knots) inserted into the upper endpoint (the only knot) of the trunk. Therefore, as Figure 5 (I) further shows, we can conceive of any tree as the juxtaposition of the simplest tree and of several pictures, each of which consists of one, or two, or more identical trees; observe the analogy with the last line of Figure 2. Part (II) of Figure 5 displays the figurate series: the infinite sum of all different trees. Its genesis is similar to, but more complex than, that of the figurate series of Figure 2. In Figure 2 we see a product of five "virtually geometric" series; in Figure 5 we see a product of an infinity of "virtually geometric" series, multiplied by an initial one term factor (the simplest tree, the common trunk of all trees). 3.2. Part (III) of Figure 5 displays the substitution that changes the figurate series into the enumerating series. By this substitution, each "virtually geometric" series arising in Figure 5 (II) goes over into a proper geometric series the sum of which is known, and the whole relation displayed by Figure 5 (II) goes over into the remarkable relation due to Cayley* (8)

$T_1 x + T_2 x^2 + T_3 x^3 + \cdots + T_n x^n + \cdots = x(1 - x)^{-T_1}(1 - x^2)^{-T_2}(1 - x^3)^{-T_3} \cdots (1 - x^n)^{-T_n} \cdots$.
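Relation (8) determines the numbers $T_n$ one after another, since the factors $(1 - x^k)^{-T_k}$ with $k \ge n$ do not affect the coefficients below $x^n$ in the product. A minimal Python sketch of this computation follows; the function name `rooted_trees` and the repeated in-place update are assumptions made for this illustration, not the paper's procedure.

```python
def rooted_trees(n_max):
    """Return {n: T_n} for 1 <= n <= n_max, read off from Cayley's relation (8)."""
    # prod[m] = coefficient of x^m in the product of the factors adjoined so far.
    prod = [1] + [0] * (n_max - 1)
    T = {}
    for k in range(1, n_max + 1):
        # The leading factor x in (8) shifts exponents by one,
        # so T_k is the coefficient of x^(k-1) in the product.
        T[k] = prod[k - 1]
        # Adjoin (1 - x^k)^(-T_k) as T_k successive geometric-series factors.
        for _ in range(T[k]):
            for m in range(k, n_max):
                prod[m] += prod[m - k]
    return T


if __name__ == "__main__":
    print(rooted_trees(6))
```

Each pass of the innermost loop multiplies the truncated series by one factor $(1 - x^k)^{-1}$, in the same way as the coin computation of Section 1.3.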

3.3. By expanding the right hand side of Equation (8) in powers of x and comparing the coefficient of $x^n$ on both sides, we obtain a recursion formula, that is, an expression for $T_n$ in terms of $T_1, T_2, \dots, T_{n-1}$ for $n \ge 2$. The reader should work out the first cases and verify by analytical computation the values $T_n$ for $n \le 5$ which he found before by geometrical experimentation.

* This form is slightly different from that given in 4, Vol. 3, pp. 242-246. For other forms see 2, p. 149.

References

1. G. Pólya and G. Szegő, Aufgaben und Lehrsätze aus der Analysis, 2 volumes, Berlin, 1925.
2. G. Pólya, Acta Mathematica, vol. 68 (1937), pp. 145-254.
3. G. Pólya, Mathematics and Plausible Reasoning, 2 volumes, Princeton, 1954.
4. A. Cayley, Collected Mathematical Papers, 13 volumes, Cambridge, 1889-1898.
5. D. Kőnig, Theorie der endlichen und unendlichen Graphen, Leipzig, 1936.

Reprinted from Amer. Math. Monthly 63 (1956), 689-697


A THEOREM ON FLOWS IN NETWORKS DAVID GALE

1. Introduction. The theorem to be proved in this note is a generalization of a well-known combinatorial theorem of P. Hall, [4].

HALL'S THEOREM. Let $S_1, S_2, \dots, S_n$ be subsets of a set X. Then a necessary and sufficient condition that there exist distinct elements $x_1, \dots, x_n$ such that $x_i \in S_i$ is that the union of every k sets from among the $S_i$ contain at least k elements.
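For small families of sets, Hall's condition and the existence of distinct representatives can both be tested by brute force. The following Python sketch is only an illustration; the function names are hypothetical and the exhaustive search is practical only for very small inputs.

```python
from itertools import combinations


def hall_condition(sets):
    """True if the union of every k of the sets contains at least k elements."""
    n = len(sets)
    return all(
        len(set().union(*combo)) >= k
        for k in range(1, n + 1)
        for combo in combinations(sets, k)
    )


def has_distinct_representatives(sets):
    """True if distinct x_1, ..., x_n with x_i in S_i exist (exhaustive search)."""

    def extend(i, used):
        if i == len(sets):
            return True
        return any(extend(i + 1, used | {x}) for x in sets[i] if x not in used)

    return extend(0, frozenset())


if __name__ == "__main__":
    good = [{1, 2}, {2, 3}, {1, 3}]
    bad = [{1}, {1}, {1, 2}]  # the first two sets together contain only one element
    print(hall_condition(good), has_distinct_representatives(good))
    print(hall_condition(bad), has_distinct_representatives(bad))
```

On both examples the two functions agree, as the theorem asserts they must.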

The result has a simple interpretation in terms of transportation networks. A certain article is produced at a set X of origins, and is demanded at n destinations $y_1, \dots, y_n$. Certain of the origins x are "connected" to certain of the destinations y making it possible to ship one article from x to y.

PROBLEM. Under what conditions is it possible to ship articles to all the destinations y?

An obvious reinterpretation of Hall's theorem shows that this is possible if and only if every k of the destinations are connected to at least k origins.

We shall now give a verbal statement of the generalization to be proved. A more formal statement will be given in the next section. Let N be an arbitrary network or graph. To each node x of N corresponds a real number d(x), where $|d(x)|$ is to be thought of as the demand for or the supply of some good at x according as d(x) is positive or negative. To each edge (x, y) corresponds a nonnegative real number c(x, y), the capacity of this edge, which assigns an upper bound to the possible flow from x to y. The demands d(x) are called feasible if there exists a flow in the network such that the flow along each edge is no greater than its capacity, and the net flow into (out of) each node is at least (at most) equal to the demand (supply) at that node.

An obviously necessary condition for the demands d(x) to be feasible is the following. For every collection S of nodes the sum of the demands at the nodes of S must not exceed the sum of the capacities of the edges leading into S. If this condition were not satisfied it would clearly be impossible to satisfy the aggregate demand of the subset S. The principal theorem of this paper shows that conversely, if the above condition is satisfied, then the demands d(x) are feasible.

Hall's theorem drops out as a special case of this result if one applies it to the particular network described in the paragraph above and makes use of the known fact (see [1]) that transportation problems of this type with integral constraints have integral solutions. However, the simple inductive argument which works in [4] does not seem to generalize to yield a proof of our theorem. Our approach is in fact quite different and is based on the "minimum cut" theorem of Ford and Fulkerson, [2], [1].

In the next section we give a formal statement of the problem and prove the principal theorem. The final section is devoted to the treatment of a special case for which the "feasibility criterion" yields a very simple method for computing solutions.

Received September 24, 1956. The results of this paper were discovered while the author was working as a consultant for the RAND Corporation. A later revision was partially supported by an O. N. R. contract.

2. The principal theorem. We proceed to define in a more formal manner the objects to be discussed.

DEFINITIONS. A network [N, c] consists of a finite set of nodes N and a capacity function c on $N \times N$ where c(x, y) is a nonnegative real number or plus infinity.

A flow f on [N, c] is a function f on $N \times N$ such that

(1)  $f(x, y) + f(y, x) = 0$,

(2)  $f(x, y) \le c(x, y)$

for all $x, y \in N$.

A demand d on [N, c] is simply a real valued function on N. Note that we do not require the function c to be symmetric, thus the maximum allowable flow from x to y need not be the same as that from y to x. Condition (1) above corresponds to the usual convention that the net flow from x to y is the negative of the net flow from y to x. We shall save writing many summation symbols in what follows by adopting the following convenient notation.

NOTATION. If S is a subset of N and d a function on N, we write

$d(S) = \sum_{x \in S} d(x)$.


If S and T are subsets of N and f a function on $N \times N$ we write

$f(S, T) = \sum_{x \in S,\, y \in T} f(x, y)$.

From these definitions it follows at once that if U and V are disjoint subsets of N then

(3)  $d(U \cup V) = d(U) + d(V)$,  $f(S, U \cup V) = f(S, U) + f(S, V)$.

In particular, denoting the complement of S by S' we have

$f(N, T) = f(S, T) + f(S', T)$  for all $S \subset N$.

In this notation (1) and (2) are clearly equivalent to

(1')  $f(A, A) = 0$;

and

(2')  $f(A, B) \le c(A, B)$

for all $A, B \subset N$.

The above notation is natural to our problem, for if d is a demand function then d(S) is simply the aggregate demand of the set S, and if f is a flow then f(S, T) represents the net flow from S into T.

DEFINITION. A demand d is called feasible if there exists a flow f such that

(4)  $f(N, x) \ge d(x)$  for $x \in N$.

This condition states that the flow into each node must be at least equal to the demand at that node. However (1) and (4) together imply

$f(x, N) \le -d(x)$

so that we are also requiring the flow out of each node to be at most equal to the supply at that node (recalling that a negative demand represents a supply). Finally we note that from (3) it follows that (4) is equivalent to

(4')  $f(N, S) \ge d(S)$  for all $S \subset N$.

We can now give a simple statement of our main result.

FEASIBILITY THEOREM. The demand d is feasible if and only if for every subset $S \subset N$

(5)  $d(S') \le c(S, S')$.
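On a small network, condition (5) can be checked directly by running through all subsets. The Python sketch below is only an illustration under stated assumptions: the function name `cut_condition_holds` and the dictionary encoding of c and d are choices made here, and missing pairs are taken to have capacity zero.

```python
from itertools import combinations


def cut_condition_holds(nodes, cap, demand):
    """Check d(S') <= c(S, S') for every subset S of the node set, S' its complement."""
    nodes = list(nodes)
    for r in range(len(nodes) + 1):
        for S in combinations(nodes, r):
            in_S = set(S)
            S_comp = [y for y in nodes if y not in in_S]
            d_total = sum(demand[y] for y in S_comp)  # d(S')
            c_total = sum(cap.get((x, y), 0) for x in S for y in S_comp)  # c(S, S')
            if d_total > c_total:
                return False
    return True


if __name__ == "__main__":
    nodes = ["a", "b", "c"]
    demand = {"a": -3, "b": 0, "c": 2}  # a supplies 3 units, c demands 2
    print(cut_condition_holds(nodes, {("a", "b"): 2, ("b", "c"): 2}, demand))
    print(cut_condition_holds(nodes, {("a", "b"): 2, ("b", "c"): 1}, demand))
```

In the second network the edge into c has capacity 1 while c demands 2, so (5) fails for S' consisting of c alone. The number of subsets grows as $2^{|N|}$, so this check is meant for illustration only.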


Proof. The necessity of (5) is obvious, for if d is feasible then there is a flow f such that

$d(S') \le f(N, S') = f(S, S') + f(S', S') = f(S, S') \le c(S, S')$.

The proof of sufficiency depends on the "minimum cut theorem" of Ford and Fulkerson, which we shall now state and prove in our own formulation. While our proof is little more than a translation of the above authors' second proof [3] into our notation, we record it here, nevertheless, both for the sake of completeness and because it is substantially shorter than any proof published heretofore.

DEFINITION. Let [N, c] be a network and let s and s' be two distinguished nodes (s = source, s' = sink). A flow from s to s' is a flow such that

(6)  $f(N, x) = 0$  for $x \ne s$, $x \ne s'$.

Let F denote the set of all flows from s to s'. A cut (S, S') of N with respect to s and s' is a partition of N into sets S and S' such that $s \in S$, $s' \in S'$. Let Q denote the set of all such cuts.

MINIMUM CUT THEOREM. For any network [N, c]

$\max_{F} f(s, N) = \min_{Q} c(S, S')$.

Proof. First note that for any flow $f \in F$ and cut $(S, S') \in Q$ we have

(7)  $f(s, N) = f(s, N) + \sum_{x \in S - s} f(x, N) = f(s, N) + f(S - s, N) = f(S, N) = f(S, S) + f(S, S') = f(S, S') \le c(S, S')$.

Hence, it remains only to show that equality is attained in (7) for some flow and cut. Let $f \in F$ be a flow such that $f(s, N)$ is a maximum. Let S consist of s and all nodes x such that there exists a chain $\sigma = (x_0, x_1, \dots, x_n)$ of distinct nodes with $x_0 = s$, $x_n = x$ and $c(x_{i-1}, x_i) - f(x_{i-1}, x_i) > 0$, $i = 1, \dots, n$. Now s' is not in S, for, if it were, there would be a chain $\sigma$ as above with $x = s'$. But then letting

$p = \min_i \, [c(x_{i-1}, x_i) - f(x_{i-1}, x_i)]$,

one could superimpose a flow of p along the chain $\sigma$ on top of the flow f, contradicting the maximality of f.

The above argument shows that (S, S') is a cut, and we conclude the proof by observing that $f(s, N) = c(S, S')$, for if not, then from (7), $f(S, S') < c(S, S')$, hence for some $x \in S$ and $y \in S'$ we would have $c(x, y) - f(x, y) > 0$, but since $x \in S$ there is a chain $\sigma = (s, x_1, \dots, x)$ which could be extended to a chain $\sigma' = (s, x_1, \dots, x, y)$, contrary to the fact that $y \in S'$. This completes the proof.
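The proof starts from a maximal flow and reads off the cut. A computational counterpart, sketched below in Python, builds a flow by repeatedly superimposing a flow of p along a chain with positive residual capacity, and then takes S to be the set of nodes still reachable by such chains. Choosing the chains by breadth-first search is a later refinement and an assumption of this sketch, not part of the paper; the function names and the dictionary encoding are likewise hypothetical.

```python
from collections import deque


def max_flow_min_cut(nodes, cap, s, t):
    """Return (flow value, S), S being the nodes reachable from s by unsaturated chains."""
    flow = {}  # antisymmetric as in (1): flow[(x, y)] == -flow[(y, x)]

    def residual(x, y):
        return cap.get((x, y), 0) - flow.get((x, y), 0)

    def reachable():
        parent = {s: None}
        queue = deque([s])
        while queue:
            x = queue.popleft()
            for y in nodes:
                if y not in parent and residual(x, y) > 0:
                    parent[y] = x
                    queue.append(y)
        return parent

    while True:
        parent = reachable()
        if t not in parent:
            break  # no chain to the sink remains: the flow is maximal
        chain, y = [], t
        while parent[y] is not None:
            chain.append((parent[y], y))
            y = parent[y]
        p = min(residual(x, y) for x, y in chain)
        for x, y in chain:  # superimpose a flow of p along the chain
            flow[(x, y)] = flow.get((x, y), 0) + p
            flow[(y, x)] = flow.get((y, x), 0) - p

    value = sum(flow.get((s, y), 0) for y in nodes)  # f(s, N)
    return value, set(parent)


def cut_capacity(nodes, cap, S):
    """c(S, S') for a set S and its complement S'."""
    return sum(cap.get((x, y), 0) for x in S for y in nodes if y not in S)


if __name__ == "__main__":
    nodes = ["s", "a", "b", "t"]
    cap = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}
    value, S = max_flow_min_cut(nodes, cap, "s", "t")
    print(value, S, cut_capacity(nodes, cap, S))
```

On termination the flow value and the capacity of the cut (S, S') coincide, which is the equality asserted by the theorem.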

Proof of feasibility theorem. Consider a new network $[\bar N, \bar c]$ where $\bar N$ consists of N plus two additional nodes s and s'. Let $U \subset N$ be all nodes x such that $d(x) \le 0$. Then $\bar c$ is defined by the rules

$\bar c(x, y) = c(x, y)$  for $x, y \in N$,

$\bar c(s, x) = -d(x)$  for $x \in U$,

$\bar c(x, s') = d(x)$  for $x \in U'$,

$\bar c(x, y) = 0$  otherwise.

We now assert that the cut $(\bar N - s', s')$ is a minimal cut of $[\bar N, \bar c]$, for let $(\bar S, \bar S')$ be any cut of $[\bar N, \bar c]$ and let $S = \bar S - s$, $S' = \bar S' - s'$. From the definition above we have

$\bar c(\bar S, \bar S') = c(S, S') + \bar c(s, S') + \bar c(S, s') = c(S, S') - d(S' \cap U) + d(S \cap U')$,

$\bar c(\bar N - s', s') = d(U') = d(S' \cap U') + d(S \cap U')$;

and subtracting we get

$\bar c(\bar N - s', s') - \bar c(\bar S, \bar S') = d(S' \cap U') + d(S' \cap U) - c(S, S') = d(S') - c(S, S') \le 0$,

the last inequality being the hypothesis (5), and the assertion is proved. Now, from the Minimum Cut Theorem, there is a flow $\bar f$ from s to s' on $[\bar N, \bar c]$ such that

$\bar f(\bar N - s', s') = \bar c(\bar N - s', s') = d(U')$,

hence

(8)  $\bar f(x, s') = d(x)$  for all $x \in U'$.

Let f be $\bar f$ restricted to $N \times N$. Then f is clearly a flow and it remains to show that f satisfies (4). If $x \in U'$ then

$0 = \bar f(x, \bar N) = f(x, N) + \bar f(x, s') = f(x, N) + d(x)$,

hence

(9)  $f(N, x) = d(x)$.

If $x \in U$ then

$0 = \bar f(\bar N, x) = f(N, x) + \bar f(s, x) \le f(N, x) + \bar c(s, x) = f(N, x) - d(x)$,

so

(10)  $f(N, x) \ge d(x)$,

and (9) and (10) together show that f satisfies (4), completing the proof.

REMARK. We wish to call attention to the following important fact. We have at no point in what has been said thus far made use of the assumption that the functions d, c and f were real valued. In fact, all definitions and proofs go through verbatim if the real numbers are replaced by any ordered Abelian group, in particular, the group of integers. One useful consequence of this remark is the fact that if a network with integer valued demand and capacity functions admits a feasible flow then this flow may also be chosen to be integer valued. We shall make use of this fact in the next section.

There is a second formulation of the Feasibility Theorem which is sometimes convenient. In the network [N, c] let U be as above the set of nodes x such that $d(x) \le 0$.

THEOREM. The demand d is feasible if and only if for every set $Y \subset U'$ there exists a flow $f_Y$ such that

(11)  $f_Y(N, x) \ge d(x)$  for $x \in U$,

(12)  $f_Y(N, Y) \ge d(Y)$.

Proof. The necessity is obvious. To prove sufficiency we show that (11) and (12) imply (5). Let (S, S') be a partition of N and let $X = U \cap S$, $X' = U \cap S'$, $Y = U' \cap S$, $Y' = U' \cap S'$. Then from (11) there exists $f_{Y'}$ such that

$d(X') \le f_{Y'}(N, X') = f_{Y'}(X \cup Y, X') + f_{Y'}(Y', X')$,

and from (12),

$d(Y') \le f_{Y'}(N, Y') = f_{Y'}(X \cup Y, Y') + f_{Y'}(X', Y')$.

Adding these inequalities we get


$d(S') = d(X') + d(Y') \le f_{Y'}(X \cup Y, X') + f_{Y'}(X \cup Y, Y') = f_{Y'}(X \cup Y, X' \cup Y') = f_{Y'}(S, S') \le c(S, S')$,

which is exactly (5).

3. An example. As an illustration of the feasibility theorem, consider the following problem.

(I). Let $a_1, \dots, a_m$ and $b_1, \dots, b_n$ be two sets of positive integers. Under what conditions can one find integers $a_{ij} = 0$ or 1, such that

and

for all i and j ? As a concrete illustration, suppose n families are going on a PICnIC in m busses, where the jth family has bj members and the ith bus has a j seats. When is it possible to seat all passengers in such a way that no two members of the same family are in the same bus? In the case Sal = Sb, the problem becomes that of filling an m x n matrix M with zeros and ones so that the rows and columns shall have prescribed sums. The feasibility theorem gives a simple necessary and sufficient condition for the problem to have a solution. In order to state if we need the following. DEFINITION.

integers a" a.J , Let

Let raj} be a nonincreasing sequence of nonnegative such that all but a finite number of the a l are zero.

••• ,

where j is a positive integer and let Sj be the number of elements in 8 j• The sequence of numbers {Sj} clearly satisfies the same conditions as the sequence {a,}; it is called the dual sequence of the sequence {all and is denoted by {al } *. It is clear that {al} * determines {al} since the integer a, occurs exactly sa,-sa,+l times in {a,}. Actually the correspondence between {a,l and {a,l * is completely dual in the following sense. THEOREM.

This result will not be needed in the sequel and its proof is left as an exercise. However, its validity can be made quite obvious by means of a simple pictorial representation. Let each number $a_i$ be represented by a row of dots, and write these rows in a vertical array so that $a_{i+1}$ lies under $a_i$, thus:

[Figure: dot diagram with one row of dots for each of $a_1, \dots, a_5$.]

It is then clear that the dual number $s_j$ is simply the number of dots in the $j$th column of the array.

We can now give the criterion for the feasibility of Problem I. Henceforth for convenience we shall assume the numbers $a_i$ and $b_j$ are indexed in decreasing order, and shall define $a_i = 0$ for $i > m$, $b_j = 0$ for $j > n$.
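The dual sequence can be computed directly from its definition. The following is a minimal sketch, assuming the sequence is given as a Python list; the function name `dual_sequence` is illustrative and not taken from the paper.

```python
def dual_sequence(a):
    """Return the dual sequence {s_j}: s_j counts the terms a_i with a_i >= j."""
    a = [x for x in a if x > 0]            # zero terms do not affect any s_j
    top = max(a, default=0)                # s_j = 0 for every j > max(a)
    return [sum(1 for x in a if x >= j) for j in range(1, top + 1)]


a = [5, 3, 3, 1]
s = dual_sequence(a)                       # column heights of the dot diagram
print(s)
print(dual_sequence(s) == a)               # taking the dual twice recovers a
```

Reading `s` as the column counts of the dot diagram makes it plausible that applying the function twice returns the original nonincreasing sequence.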

THEOREM. Let $\{s_j\} = \{a_i\}^*$. Then Problem I is feasible if and only if

$$\sum_{j=1}^{k} b_j \le \sum_{j=1}^{k} s_j$$

for all integers $k$.

Proof. We may interpret (I) as a flow problem. Let $N$ be a network consisting of $m + n$ nodes $x_1, \dots, x_m$ and $y_1, \dots, y_n$, and let $c(x_i, y_j) = 1$ for all $i$ and $j$, $c = 0$ otherwise. Let $d(x_i) = -a_i$ and $d(y_j) = b_j$. One easily verifies that the feasibility of (I) is equivalent to the feasibility of the demand $d$. We shall show that $d$ is feasible by applying the second theorem of the previous section. Let $Y$ be a subset of $k$ nodes $y_j$, say $Y = \{y_{j_1}, \dots, y_{j_k}\}$. We now compute the maximum possible flow into $Y$. Because all capacities are unity this maximal flow $f_Y$ is achieved by shipping as much as possible from each node $x_i$ into the set $Y$. Thus, the flow from $x_i$ to $Y$ is $\min[a_i, k]$ and the total flow into $Y$ is

$$\sum_{i=1}^{m} \min[a_i, k].$$

We now assert

(13) $\sum_{i=1}^{m} \min[a_i, k] = \sum_{j=1}^{k} s_j$,

which is proved by induction on $k$. It is clear from the definition that

$$\sum_{i=1}^{m} \min[a_i, 1] = m = s_1.$$

Now

$$\min[a_i, k+1] = \begin{cases} \min[a_i, k] & \text{for } a_i \le k, \\ \min[a_i, k] + 1 & \text{for } a_i > k, \end{cases}$$

hence,

$$\sum_{i=1}^{m} \min[a_i, k+1] = \sum_{i=1}^{m} \min[a_i, k] + s_{k+1},$$

and (13) follows from the induction hypothesis. The second feasibility theorem now states that the problem is feasible if and only if

$$\sum_{y_j \in Y} b_j \le \sum_{j=1}^{k} s_j \quad \text{for every set } Y \text{ of } k \text{ nodes } y_j,$$

and since the $b_j$ are indexed in decreasing order, the conclusion of the theorem follows.

It is interesting that for this particular problem there is a simple "$n$-step" method for actually filling out the matrix of $a_{ij}$'s. Such procedures are sufficiently rare in programming theory so that it seems worth while to present it here. The procedure is the following: If the problem is feasible then $b_1 \le s_1$ and hence $a_1, \dots, a_{b_1} \ge 1$ (recall that the $a_i$'s are indexed in descending order). Let $a_{i1} = 1$ for $i \le b_1$, $a_{i1} = 0$ for $i > b_1$. Now consider the new problem, (I)', with the matrix $M'$ having $m$ rows and $n - 1$ columns, $j = 2, \dots, n$, with $a_i' = a_i - a_{i1}$ and $b_j' = b_j$. We assert that (I)' is again feasible so that by repeating the process we will eventually fill out the whole matrix. To show that (I)' is feasible we must prove, for any $k$,

$$\sum_{j=2}^{k+1} b_j \le \sum_{j=1}^{k} s_j' = \sum_{i=1}^{m} \min[a_i', k],$$

where $\{s_j'\}$ is the dual sequence to $\{a_i'\}$. The expression on the right can be rewritten

$$\sum_{i=1}^{m} \min[a_i', k] = \sum_{i=1}^{b_1} \min[a_i - 1, k] + \sum_{i=b_1+1}^{m} \min[a_i, k].$$

We must now consider two cases.

Case 1. $s_{k+1} \ge b_1$. Then $a_i - 1 \ge k$ for $i \le b_1$ and hence $\min[a_i - 1, k] = k = \min[a_i, k]$, so that we get

$$\sum_{i=1}^{m} \min[a_i', k] = \sum_{i=1}^{m} \min[a_i, k] = \sum_{j=1}^{k} s_j \ge \sum_{j=1}^{k} b_j \ge \sum_{j=2}^{k+1} b_j.$$

Case 2. $s_{k+1} < b_1$. Then for $i \le s_{k+1}$, $a_i \ge k + 1$ so $a_i - 1 \ge k$ and $\min[a_i - 1, k] = k = \min[a_i, k]$. For $s_{k+1} < i \le b_1$, $a_i \le k$, so $\min[a_i - 1, k] = \min[a_i, k] - 1$, hence,

$$\sum_{i=1}^{m} \min[a_i', k] = \sum_{i=1}^{m} \min[a_i, k] - (b_1 - s_{k+1}) = \sum_{j=1}^{k+1} s_j - b_1 \ge \sum_{j=2}^{k+1} b_j,$$

since

$$\sum_{j=1}^{k+1} b_j \le \sum_{j=1}^{k+1} s_j$$

by the feasibility condition. The proof is now complete.

In terms of the picnic problem, the $n$ families should be seated in $n$ stages according to the following simple rule: at each stage distribute the largest unseated family among those busses having the greatest number of vacant seats.

REFERENCES

1. G. B. Dantzig and D. R. Fulkerson, On the max-flow min-cut theorem of networks, Ann. of Math. Study No. 38, Contributions to linear inequalities and related topics, edited by H. W. Kuhn and A. W. Tucker, 215-221.
2. L. R. Ford, Jr., and D. R. Fulkerson, Maximal flow through a network, Canad. J. Math. 8 (1956), 399-404.
3. ----, A simple algorithm for finding maximal network flows and an application to the Hitchcock problem, Canad. J. Math. 9 (1957), 210-218.
4. P. Hall, On representatives of subsets, J. London Math. Soc., 10 (1935), 26-30.

THE RAND CORPORATION AND BROWN UNIVERSITY

Reprinted from Pacific J. Math. 7 (1957), 1073-1082

COMBINATORIAL PROPERTIES OF MATRICES OF ZEROS AND ONES

H. J. RYSER

1. Introduction. This paper is concerned with a matrix $A$ of $m$ rows and $n$ columns, all of whose entries are 0's and 1's. Let the sum of row $i$ of $A$ be denoted by $r_i$ ($i = 1, \dots, m$) and let the sum of column $j$ of $A$ be denoted by $s_j$ ($j = 1, \dots, n$). It is clear that if $T$ denotes the total number of 1's in $A$

$$T = \sum_{i=1}^{m} r_i = \sum_{j=1}^{n} s_j.$$

With the matrix $A$ we associate the row sum vector

$$R = (r_1, \dots, r_m),$$

where the $i$th component gives the sum of row $i$ of $A$. Similarly, the column sum vector $S$ is denoted by

$$S = (s_1, \dots, s_n).$$

We begin by determining simple arithmetic conditions for the construction of a (0, 1)-matrix $A$ having a given row sum vector $R$ and a given column sum vector $S$. This requires the concept of majorization, introduced by Muirhead. Then we apply to the elements of $A$ an elementary operation called an interchange, which preserves the row sum vector $R$ and column sum vector $S$, and prove that any two (0, 1)-matrices with the same $R$ and $S$ are transformable into each other by a finite sequence of such interchanges. The results may be rephrased in the terminology of finite graphs or in the purely combinatorial terms of set and element. Applications to Latin rectangles and to systems of distinct representatives are studied.

2. Maximal matrices and majorization. Let

$$\delta_i = (1, \dots, 1, 0, \dots, 0)$$

be a vector of $n$ components with 1's in the first $r_i$ positions, and 0's elsewhere. A matrix of the form

$$\bar A = \begin{bmatrix} \delta_1 \\ \vdots \\ \delta_m \end{bmatrix}$$

is called maximal, and we refer to $\bar A$ as the maximal form of $A$. The maximal $\bar A$ may be obtained from $A$ by a rearrangement of the 1's in the rows of $A$. Also by inverse row rearrangements one may construct the given $A$ from $\bar A$.

Received July 1, 1956. This work was sponsored in part by the Office of Ordnance Research.

Let $\bar R = (\bar r_1, \dots, \bar r_m)$ and $\bar S = (\bar s_1, \dots, \bar s_n)$ be the row sum and column sum vectors of $\bar A$. Evidently

$$\bar R = R.$$

Moreover, it is clear that the row sum vector $R$ uniquely determines $\bar A$, and hence $\bar S$. Indeed, $T = \sum r_i = \sum \bar s_j$ constitute conjugate partitions of $T$.

Consider two vectors $S = (s_1, \dots, s_n)$ and $S^* = (s_1^*, \dots, s_n^*)$, where the $s_j$ and $s_j^*$ are nonnegative integers. The vector $S$ is majorized by $S^*$,

$$S \prec S^*,$$

provided that with the subscripts renumbered (5; 3):

(1) $s_1 \ge \dots \ge s_n$, $\quad s_1^* \ge \dots \ge s_n^*$;

(2) $s_1 + \dots + s_i \le s_1^* + \dots + s_i^*$, $\quad i = 1, \dots, n - 1$; and

(3) $s_1 + \dots + s_n = s_1^* + \dots + s_n^*$.

For the vectors $S$ and $\bar S$ associated with the matrices $A$ and $\bar A$, respectively, we prove that

$$S \prec \bar S.$$

We renumber the subscripts of the $s_j$ of $A$ so that

$$s_1 \ge s_2 \ge \dots \ge s_n.$$

For $\bar A$, we already have

$$\bar s_1 \ge \bar s_2 \ge \dots \ge \bar s_n.$$

Now $A$ must be formed from $\bar A$ by a shifting of 1's in the rows of $\bar A$. But for each $i = 1, \dots, n - 1$, the total number of 1's in the first $i$ columns of $\bar A$ cannot be increased by a shifting of 1's in the rows of $\bar A$. Hence

$$s_1 + \dots + s_i \le \bar s_1 + \dots + \bar s_i, \qquad i = 1, \dots, n - 1.$$

Moreover,

$$s_1 + \dots + s_n = \bar s_1 + \dots + \bar s_n,$$

whence we conclude that

$$S \prec \bar S.$$
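The relation just proved can be checked numerically on a small matrix. The sketch below assumes matrices are lists of 0/1 rows; the helper names `conjugate` and `is_majorized` are illustrative and do not come from the paper.

```python
def conjugate(r, n):
    """Column sums of the maximal matrix: column j holds one 1 per row with r_i > j."""
    return [sum(1 for x in r if x > j) for j in range(n)]


def is_majorized(s, s_star):
    """True if s is majorized by s_star in the sense of conditions (1)-(3)."""
    s = sorted(s, reverse=True)                 # condition (1): renumber subscripts
    s_star = sorted(s_star, reverse=True)
    if len(s) != len(s_star) or sum(s) != sum(s_star):
        return False                            # condition (3) fails
    partial = partial_star = 0
    for x, y in zip(s, s_star):
        partial += x
        partial_star += y
        if partial > partial_star:              # condition (2) fails
            return False
    return True


A = [[0, 1, 1, 0],
     [1, 0, 0, 1],
     [0, 0, 1, 0]]
R = [sum(row) for row in A]                     # row sum vector
S = [sum(col) for col in zip(*A)]               # column sum vector
S_bar = conjugate(R, len(A[0]))                 # column sums of the maximal form
print(R, S, S_bar, is_majorized(S, S_bar))
```

The function sorts both vectors first, which corresponds to the renumbering of subscripts allowed in the definition.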

THEOREM 2.1$^{1}$. Let the matrix $\bar A$ be maximal and have column sum vector $\bar S$. Let $S$ be majorized by $\bar S$. Then by rearranging 1's in the rows of $\bar A$, one may construct a matrix $A$ having column sum vector $S$.

Without loss of generality, we may assume that the column sums of $A$ satisfy $s_1 \ge s_2 \ge \dots \ge s_n$. We construct the desired $A$ inductively by columns by a rearrangement of the 1's in the rows of $\bar A$.

$^{1}$Added in proof. The author has been informed recently that Theorem 2.1 was obtained independently by Professor David Gale. His investigations concerning this theorem and certain generalizations are to appear in the Pacific Journal of Mathematics.

By hypothesis, $S \prec \bar S$, whence $s_1 \le \bar s_1$. If $s_1 = \bar s_1$, we leave the first column of $\bar A$ unchanged. Suppose that $s_1 < \bar s_1$. We may rearrange 1's in the rows of $\bar A$ to obtain $s_1$ 1's in the first column, unless

$$\bar s_j > s_1 \qquad (j = 1, \dots, n).$$

But if these inequalities hold, then

$$\bar s_1 + \dots + \bar s_n > n s_1 \ge s_1 + \dots + s_n = \bar s_1 + \dots + \bar s_n,$$

which is a contradiction.

Let us suppose then that the first $t$ columns of $A$ have been constructed, and let us proceed to the construction of column $t + 1$. We have then given an $m$ by $n$ matrix with columns $\eta_1, \dots, \eta_n$, where the number of 1's in column $\eta_i$ is $s_i$ ($i = 1, \dots, t$). Let the number of 1's in column $\eta_j$ be $s_j'$ ($j = t + 1, \dots, n$). We may suppose that

$$s_{t+1}' \ge s_{t+2}' \ge \dots \ge s_n'.$$

Two cases arise.

Case 1. $s_{t+1} < s_{t+1}'$.

In this case, remove 1's from column $\eta_{t+1}$ by row rearrangements, and place the 1's in columns $\eta_{t+2}, \dots, \eta_n$. If sufficiently many 1's may be removed from $\eta_{t+1}$ in this manner, then we are finished. Suppose then that there remain $e$ 1's in column $t + 1$, with

$$s_{t+1} < e \le s_{t+1}',$$

and that no further 1's may be removed by this procedure. Then there must exist an integer $w \ge 0$ such that

$$s_{t+1} + \dots + s_n = (n - t)e + w.$$

But

$$s_{t+1} < e, \quad s_{t+2} \le s_{t+1} < e, \quad \dots, \quad s_n < e.$$

Therefore

$$(n - t)e + w = s_{t+1} + \dots + s_n < (n - t)e,$$

which is a contradiction.

Case 2. $s_{t+1} > s_{t+1}'$.

By row rearrangements, insert 1's into column $\eta_{t+1}$ from columns $\eta_{t+2}, \dots, \eta_n$. If sufficiently many 1's may be inserted in this manner, then we are finished. Suppose then that there remain $e$ 1's in column $t + 1$ with $s_{t+1}' \le e < s_{t+1}$, and that no further 1's may be inserted by our procedure. Let the matrix at this stage of the construction process be denoted by $[e_{rs}]$. If now $e_{r,t+1} = 0$, then

$$e_{rj} = 0 \qquad (j = t + 1, \dots, n).$$

Suppose that some

$$e_{\mu j} = 1, \qquad j \ge t + 2.$$

Then either

$$e_{\mu k} = 1 \qquad (k = 1, \dots, t + 1),$$

or else for some $k$, $1 \le k \le t$,

$$e_{\mu k} = 0.$$

Consider the case in which $e_{\mu k} = 0$. Since $s_k \ge s_{t+1} > e$, there must exist

$$e_{\nu k} = 1, \qquad e_{\nu, t+1} = 0.$$

Interchanging $e_{\mu j} = 1$ and $e_{\mu k} = 0$, and interchanging $e_{\nu k} = 1$ and $e_{\nu, t+1} = 0$, we see that $s_1, \dots, s_t$ are left unaltered, and that $e$ is increased by 1. Continue to increase $e$ by transformations of this variety. Suppose that all such transformations have been applied and that $e$ still satisfies $s_{t+1}' \le e < s_{t+1}$. But now it is no longer possible to move a 1 from columns $t + 2, \dots, n$ into columns $1, 2, \dots, t + 1$. This means that

$$s_1 + \dots + s_t + e = \bar s_1 + \dots + \bar s_t + \bar s_{t+1}.$$

But then

$$s_1 + \dots + s_{t+1} \le \bar s_1 + \dots + \bar s_{t+1} = s_1 + \dots + s_t + e,$$

whence $s_{t+1} \le e$, which is a contradiction. This completes the proof.
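For experimentation, a matrix with prescribed row and column sums can also be produced by a simple greedy rule: fill the largest column first, using the rows that still have the most 1's to place. The sketch below is an assumption-laden illustration of that rule, not the column-by-column rearrangement argument of the proof above; the function name `construct` is hypothetical.

```python
def construct(R, S):
    """Try to build a (0, 1)-matrix with row sums R and column sums S.

    Returns the matrix as a list of rows, or None if the greedy rule fails.
    """
    m, n = len(R), len(S)
    if sum(R) != sum(S):
        return None
    remaining = list(R)                       # 1's still to be placed in each row
    A = [[0] * n for _ in range(m)]
    for j in sorted(range(n), key=lambda c: -S[c]):          # largest column first
        rows = sorted(range(m), key=lambda i: -remaining[i])[:S[j]]
        if len(rows) < S[j] or any(remaining[i] == 0 for i in rows):
            return None                       # not enough rows with capacity left
        for i in rows:
            A[i][j] = 1
            remaining[i] -= 1
    return A


A = construct([2, 2, 1], [1, 1, 2, 1])
for row in A:
    print(row)
```

A returned matrix can be checked directly by summing its rows and columns; a `None` result in this sketch signals that the greedy rule could not complete the matrix.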

The preceding theorem has a variety of applications. For example, let the (0, 1)-matrix $A$ of $m$ rows and $n$ columns contain exactly $T = km$ 1's, where $k$ is a positive integer. Let the column sum vector of $A$ be $S = (s_1, \dots, s_n)$. Then there exists an $m$ by $n$ matrix $A^*$ composed of 0's and 1's with exactly $k$ 1's in each row, and column sum vector $S$. For let $\bar A$ be $m$ by $n$, with all 1's in the first $k$ columns and 0's elsewhere. If $\bar S$ denotes the column sum vector of $\bar A$, then $S \prec \bar S$, and the desired $A^*$ may be constructed from $\bar A$.

In this connection we mention the following result arising in the study of the completion of Latin rectangles (1; 7). Let $A$ be a (0, 1)-matrix of $r$ rows and $n$ columns, $1 \le r < n$. Let there be $k$ 1's in each row of $A$, and let the column sums of $A$ satisfy $k - (n - r) \le s_j \le k$. Then $n - r$ rows of 0's and 1's may be adjoined to $A$ to obtain a square matrix with exactly $k$ 1's in each row and column (7). To prove this it suffices to construct an $n - r$ by $n$ matrix $A^*$ of 0's and 1's with exactly $k$ 1's in each row, and column sum vector $(k - s_1, \dots, k - s_n)$. By the remarks of the preceding paragraph, such a construction is always possible.

3. Interchanges. We return now to the $m$ by $n$ matrix $A$ composed of 0's and 1's, with row sum vector $R$ and column sum vector $S$. We are concerned with the 2 by 2 submatrices of $A$ of the types

$$A_1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad\text{and}\quad A_2 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}.$$

An interchange is a transformation of the elements of $A$ that changes a specified minor of type $A_1$ into type $A_2$, or else a minor of type $A_2$ into type $A_1$, and leaves all other elements of $A$ unaltered. Suppose that we apply to $A$ a finite number of interchanges. Then by the nature of the interchange operation, the resulting matrix $A^*$ has row sum vector $R$ and column sum vector $S$.

THEOREM 3.1. Let $A$ and $A^*$ be two $m$ by $n$ matrices composed of 0's and 1's, possessing equal row sum vectors and equal column sum vectors. Then $A$ is transformable into $A^*$ by a finite number of interchanges.
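The interchange operation is easy to state in code. The following minimal sketch assumes 0-based row and column indices and a matrix stored as a list of lists; the function name `interchange` is illustrative.

```python
def interchange(A, i1, i2, j1, j2):
    """Apply an interchange to the 2 by 2 minor in rows i1, i2 and columns j1, j2.

    The minor must be the identity pattern or its mirror image;
    a new matrix is returned and A itself is left untouched.
    """
    minor = (A[i1][j1], A[i1][j2], A[i2][j1], A[i2][j2])
    if minor not in {(1, 0, 0, 1), (0, 1, 1, 0)}:
        raise ValueError("selected minor is not of the two admissible types")
    B = [row[:] for row in A]
    for i in (i1, i2):
        for j in (j1, j2):
            B[i][j] = 1 - B[i][j]      # flipping all four entries swaps the two types
    return B


A = [[1, 0, 1],
     [0, 1, 0]]
B = interchange(A, 0, 1, 0, 1)
print(B)
print([sum(r) for r in A] == [sum(r) for r in B])              # row sums preserved
print([sum(c) for c in zip(*A)] == [sum(c) for c in zip(*B)])  # column sums preserved
```

Each row and each column of the minor contains exactly one 1 before and after the flip, which is why the row and column sum vectors do not change.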

The proof is by induction on $m$. For $m = 1$ and 2, the theorem is trivial. The induction hypothesis asserts the validity of the theorem for two (0, 1)-matrices of size $m - 1$ by $n$. We attempt to transform the first row of $A$ into the first row of $A^*$ by interchanges. If we are successful, the theorem follows at once from the induction hypothesis. Suppose that we are not successful and that we denote the transformed matrix by $A'$. For notational convenience, we simultaneously permute the columns of $A'$ and $A^*$ and designate the first row of $A'$ by

$$(\delta_r, \eta_s, \delta_t, \eta_t)$$

and the first row of $A^*$ by

$$(\delta_r, \eta_s, \eta_t, \delta_t).$$

Here $\delta_r$ and $\delta_t$ are vectors of all 1's with $r$ and $t$ components, respectively, and $\eta_s$ and $\eta_t$ are 0 vectors with $s$ and $t$ components, respectively. Thus we have been successful in obtaining agreement between the two rows in the positions labelled $\delta_r$ and $\eta_s$, but have been unable to obtain agreement in the positions labelled $\delta_t$ and $\eta_t$. We may suppose, moreover, that these $2t$ positions of disagreement are the minimal number of disagreements obtainable among all attempts to transform the first row of $A$ into the first row of $A^*$ by interchanges.

Let $A'_{m-1}$ and $A^*_{m-1}$ denote the matrices composed of the last $m - 1$ rows of $A'$ and $A^*$, respectively. The row sum vectors of $A'_{m-1}$ and $A^*_{m-1}$ are equal. Also corresponding columns of $A'_{m-1}$ and $A^*_{m-1}$ below the positions labelled $\delta_r$ and $\eta_s$ have equal sums. Let $\alpha_i$ denote the $(r + s + i)$th column of $A'_{m-1}$, and let $\beta_i$ denote the $(r + s + t + i)$th column of $A'_{m-1}$, where $i = 1, \dots, t$. Let $\alpha_1^*, \dots, \alpha_t^*$ and $\beta_1^*, \dots, \beta_t^*$ denote the corresponding columns of $A^*_{m-1}$. Let $a_i, b_i, a_i^*, b_i^*$ denote the column sums of $\alpha_i, \beta_i, \alpha_i^*, \beta_i^*$, respectively. Now in $A'_{m-1}$ we cannot have simultaneously a 0 in the position determined by row $j$ and column $\alpha_i$ and a 1 in the position determined by row $j$ and column $\beta_i$. For if this were the case, we could perform an interchange and reduce the $2t$ disagreements in the first row of $A'$. Hence $a_i \ge b_i$. Moreover, $a_i^* = a_i + 1$ and $b_i^* = b_i - 1$, whence $a_i^* - b_i^* = a_i - b_i + 2 \ge 2$. In $A^*_{m-1}$, consider columns $\alpha_i^*$ and $\beta_i^*$. There exists a row of $A^*_{m-1}$ that has a 1 in column $\alpha_i^*$ and a 0 in column $\beta_i^*$. Replace the 1 by 0 and the 0 by 1, and let such a replacement be made for each $i = 1, \dots, t$. We obtain in this way a new matrix $A''_{m-1}$ whose row and column sum vectors are equal to those of $A'_{m-1}$. By the induction hypothesis, we may transform $A'_{m-1}$ into $A''_{m-1}$ by interchanges. However, these interchanges applied to $A'$ will allow us to perform further interchanges and make the first rows of the transformed $A'$ and $A^*$ coincide. Hence the theorem follows.

Let $\mathfrak{A}$ denote the class of all (0, 1)-matrices of $m$ rows and $n$ columns, with row sum vector $R$ and column sum vector $S$. The term rank $\rho$ of $A$ in $\mathfrak{A}$ is the order of the greatest minor of $A$ with a nonzero term in its determinant expansion (6). This integer is also equal to the minimal number of rows and columns that contain collectively all of the nonzero elements of $A$ (4).
A (0, 1)-matrix $A = [a_{ij}]$ may be considered an incidence matrix distributing $n$ elements $x_1, \dots, x_n$ into $m$ sets $S_1, \dots, S_m$. Here $a_{ij} = 1$ or 0 according as $x_j$ is or is not in $S_i$. From this point of view the term rank of a matrix is a generalization of the concept of a system of distinct representatives for subsets $S_1, \dots, S_m$ of a finite set $N$ (2). Indeed, the subsets $S_1, \dots, S_m$ possess a system of distinct representatives if and only if $\rho = m$.

THEOREM 3.2. Let $\tilde\rho$ be the minimal and $\bar\rho$ the maximal term rank for the matrices in $\mathfrak{A}$. Then there exists a matrix in $\mathfrak{A}$ possessing term rank $\rho$, where $\rho$ is an arbitrary integer on the range $\tilde\rho \le \rho \le \bar\rho$.

For an interchange applied to a matrix in $\mathfrak{A}$ either changes the term rank by 1 or else leaves it unaltered. But by Theorem 3.1, we may transform the matrix of term rank $\tilde\rho$ into the matrix of term rank $\bar\rho$. This implies that there exists a matrix in $\mathfrak{A}$ of term rank $\rho$.
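Theorem 3.2 can be observed on a small class by brute force. The sketch below is only an illustration under stated assumptions: the term rank is computed as a maximum bipartite matching between rows and columns (a standard augmenting-path search, chosen here for convenience), the class is enumerated exhaustively, and the names `term_rank` and `matrix_class` are hypothetical.

```python
from itertools import combinations, product


def term_rank(A):
    """Largest number of 1's of A with no two in a common row or column."""
    m, n = len(A), len(A[0])
    match_col = [-1] * n                      # row currently matched to each column

    def augment(i, seen):
        for j in range(n):
            if A[i][j] and not seen[j]:
                seen[j] = True
                if match_col[j] == -1 or augment(match_col[j], seen):
                    match_col[j] = i
                    return True
        return False

    return sum(augment(i, [False] * n) for i in range(m))


def matrix_class(R, S):
    """All (0, 1)-matrices with row sum vector R and column sum vector S."""
    n = len(S)
    row_choices = [[tuple(1 if j in c else 0 for j in range(n))
                    for c in combinations(range(n), r)] for r in R]
    return [rows for rows in product(*row_choices)
            if [sum(col) for col in zip(*rows)] == list(S)]


ranks = sorted({term_rank(A) for A in matrix_class([2, 2, 1, 1], [2, 2, 1, 1])})
print(ranks)
print(ranks == list(range(ranks[0], ranks[-1] + 1)))   # no gaps between min and max
```

The exhaustive enumeration is only practical for very small $m$ and $n$; it is meant as a check of the statement, not as an efficient method.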

REFERENCES

1. Marshall Hall, An existence theorem for Latin squares, Bull. Amer. Math. Soc., 51 (1945), 387-388.
2. P. Hall, On representatives of subsets, J. Lond. Math. Soc., 10 (1935), 26-30.
3. G. H. Hardy, J. E. Littlewood, and G. Pólya, Inequalities (Cambridge, 1952).
4. Dénes König, Theorie der endlichen und unendlichen Graphen (New York, 1950).
5. R. F. Muirhead, Some methods applicable to identities and inequalities of symmetric algebraic functions of n letters, Proc. Edinburgh Math. Soc., 21 (1903), 144-157.
6. Oystein Ore, Graphs and matching theorems, Duke Math. J., 22 (1955), 625-639.
7. H. J. Ryser, A combinatorial theorem with an application to Latin rectangles, Proc. Amer. Math. Soc., 2 (1951), 550-552.

Ohio State University

Reprinted from Canad. J. Math. 9 (1957), 371-377

GRAPH THEORY AND PROBABILITY

P. ERDÖS

A well-known theorem of Ramsay (8; 9) states that to every $n$ there exists a smallest integer $g(n)$ so that every graph of $g(n)$ vertices contains either a set of $n$ independent points or a complete graph of order $n$, but there exists a graph of $g(n) - 1$ vertices which does not contain a complete subgraph of $n$ vertices and also does not contain a set of $n$ independent points. (A graph is called complete if every two of its vertices are connected by an edge; a set of points is called independent if no two of its points are connected by an edge.) The determination of $g(n)$ seems a very difficult problem; the best inequalities for $g(n)$ are (3)

(1) $2^{n/2} < g(n) \le \binom{2n-2}{n-1}$.
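For the smallest interesting case, $n = 3$, the definition of $g(n)$ and the inequalities (1) can be checked by exhaustive search. The sketch below assumes graphs on vertices $0, \dots, N-1$ represented by a set of vertex pairs; the function names are illustrative, and the search is only feasible for very small graphs.

```python
from itertools import combinations
from math import comb


def has_clique_or_independent_set(n_vertices, edges, n):
    """True if some n vertices are pairwise joined or pairwise non-joined."""
    for subset in combinations(range(n_vertices), n):
        pairs = list(combinations(subset, 2))
        joined = sum(1 for p in pairs if p in edges)
        if joined == len(pairs) or joined == 0:
            return True
    return False


def every_graph_has(n_vertices, n):
    """Check the Ramsey property for all graphs on n_vertices labelled vertices."""
    all_pairs = list(combinations(range(n_vertices), 2))
    for mask in range(1 << len(all_pairs)):
        edges = {p for k, p in enumerate(all_pairs) if mask >> k & 1}
        if not has_clique_or_independent_set(n_vertices, edges, n):
            return False
    return True


print(every_graph_has(6, 3))                 # every graph on 6 vertices qualifies
print(every_graph_has(5, 3))                 # some graph on 5 vertices does not
print(2 ** (3 / 2) < 6 <= comb(2 * 3 - 2, 3 - 1))
```

Together the first two checks say that $g(3) = 6$, and the last line confirms that this value lies within the bounds of (1).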

It is not even known that $g(n)^{1/n}$ tends to a limit. The lower bound in (1) has been obtained by combinatorial and probabilistic arguments without an explicit construction.

In our paper (5) with Szekeres $f(k, l)$ is defined as the least integer so that every graph having $f(k, l)$ vertices contains either a complete graph of order $k$ or a set of $l$ independent points ($f(k, k) = g(k)$). Szekeres proved

(2) $f(k, l) \le \binom{k + l - 2}{k - 1}$.

Thus for $k = 3$, $f(3, l) \le \binom{l + 1}{2}$.

I recently proved by an explicit construction that 1(3, I) probabilistic arguments I can prove that for k > 3 (3)

l(k, I)

>

1 (k

>

11+e,

(4). By

t ~ ~ Y', 2

which shows that (2) is not very far from being best possible.

Define now $h(k, l)$ as the least integer so that every graph of $h(k, l)$ vertices contains either a closed circuit of $k$ or fewer lines, or that the graph contains a set of $l$ independent points. Clearly $h(3, l) = f(3, l)$. By probabilistic arguments we are going to prove that for fixed $k$ and sufficiently large $l$

(4) $h(k, l) > l^{1 + 1/2k}$.

Further we shall prove that

(5) $h(2k + 1, l) < c_3\, l^{1 + 1/k}, \qquad h(2k + 2, l) < c_3\, l^{1 + 1/k}$.

Received December 13, 1957.

A graph is called $r$ chromatic if its vertices can be coloured by $r$ colours so that no two vertices of the same colour are connected; also its vertices cannot be coloured in this way by $r - 1$ colours. Tutte (1, 2) first showed that for every $r$ there exists an $r$ chromatic graph which contains no triangle and Kelly (6) showed that for every $r$ there exists an $r$ chromatic graph which contains no $k$-gon for $k \le 5$. (Tutte's result was rediscovered several times, for instance, by Mycielski (7). It was asked if such graphs exist for every $k$.) Now (4) clearly shows that this holds for every $k$ and in fact that there exists a graph of $n$ vertices of chromatic number $> n^{\varepsilon}$ which contains no closed circuit of fewer than $k$ edges.

Now we prove (4). Let $n$ be a large number, $0 < \varepsilon < 1/k$ is arbitrary. Put $m = [n^{1+\varepsilon}]$ ($[x]$ denotes the integral part of $x$, that is, the greatest integer not exceeding $x$), $p = [n^{1-\eta}]$ where $0 < \eta < \varepsilon/2$ is arbitrary. Let $\mathfrak{G}^{(n)}$ be the complete graph of $n$ vertices $x_1, x_2, \dots, x_n$ and $\mathfrak{G}^{(p)}$ any of its complete subgraphs having $p$ vertices. Clearly we can choose $\mathfrak{G}^{(p)}$ in $\binom{n}{p}$ ways. Let

$$\mathfrak{G}_\alpha^{(n)}, \qquad 1 \le \alpha \le \binom{\binom{n}{2}}{m},$$

be an arbitrary subgraph of $\mathfrak{G}^{(n)}$ having $m$ edges (the number of possible choices of $\alpha$ is clearly as indicated). First of all we show that for almost all $\alpha$ $\mathfrak{G}_\alpha^{(n)}$ has the property that it has more than $n$ common edges with every $\mathfrak{G}^{(p)}$. Almost all here means: for all $\alpha$'s except for

$$o\left(\binom{\binom{n}{2}}{m}\right).$$

Let the vertices of ~)
to ((;)) ((~=~~)) + ((!)) ((;):(~))
<

1)

<

<

(~)p.exp( -'i).

36

P. ERDOS

Now the number of possible choices for

®(P)

is

(;) < nP < pn. Thus the number of a's for which there exists a has not more than n'edges is less than (7] < E/2)

®(P)

so that

®(P)

f\

®,,(n)

as stated. Unfortunately almost all of these graphs $\mathfrak{G}_\alpha^{(n)}$ contain closed circuits of length not exceeding $k$ (in fact almost all of them contain triangles). But we shall now prove that almost all $\mathfrak{G}_\alpha^{(n)}$ contain fewer than $n/k$ closed circuits of length not exceeding $k$.

The number of graphs $\mathfrak{G}_\alpha^{(n)}$ which contain a given closed circuit $(x_1, x_2), (x_2, x_3), \dots, (x_l, x_1)$ clearly equals

$$\binom{\binom{n}{2} - l}{m - l}.$$

The circuit is determined by its vertices and their order; thus there are $n(n - 1) \cdots (n - l + 1)$ such circuits. Therefore the expected number of closed circuits of length not exceeding $k$ equals

$$\binom{\binom{n}{2}}{m}^{-1} \sum_{l=3}^{k} n(n-1)\cdots(n-l+1) \binom{\binom{n}{2} - l}{m - l} < \sum_{l=3}^{k} \left(\frac{nm}{\binom{n}{2}}\right)^{l} < (1 + o(1))\,(2 n^{\varepsilon})^{k} = o(n),$$

since $\varepsilon < 1/k$. Therefore, by a simple and well-known argument, the number of the $\alpha$'s for which $\mathfrak{G}_\alpha^{(n)}$ contains $n/k$ or more closed paths of length not exceeding $k$ is

$$o\left(\binom{\binom{n}{2}}{m}\right),$$

as stated.

Thus we see that for almost all $\alpha$ $\mathfrak{G}_\alpha^{(n)}$ has the following properties: in every $\mathfrak{G}^{(p)}$ it has more than $n$ edges and the number of its closed circuits having $k$ or fewer edges is less than $n/k$. Omit from $\mathfrak{G}_\alpha^{(n)}$ all the edges contained in a closed circuit of $k$ or fewer edges. By what has just been said we omit fewer than $n$ edges. Thus we obtain a new graph $\mathfrak{G}_\alpha'^{(n)}$ which by construction does not contain a closed circuit of $k$ or fewer edges. Also clearly $\mathfrak{G}_\alpha'^{(n)} \cap \mathfrak{G}^{(p)}$ is not empty for every $\mathfrak{G}^{(p)}$. Thus the maximum number of independent points in $\mathfrak{G}_\alpha'^{(n)}$ is less than $p = [n^{1-\eta}]$, or

$$h(k, [n^{1-\eta}]) > n,$$

which proves (4).

By more complicated arguments one can improve (4) considerably; thus for $k = 3$ I can show that for every $\varepsilon > 0$ and sufficiently large $l$

$$f(3, l) = h(3, l) > l^{2 - \varepsilon},$$

which by (2) is very close to the right order of magnitude.

At the moment I am unable to replace the above "existence proof" by a direct construction.

By using a little more care I can prove by the above method the following result: there exists a (sufficiently small) constant $c_4$ so that for every $k$ and $l$

(6) $h(k, l) > c_4\, l^{1 + 1/2k}$.

(If $k > c \log l$ (6) is trivial since $h(k, l) \ge l$.) From (6) it is easy to deduce that to every $r$ there exists a $c_5$ so that for $n > n_0(r, c_5)$ there exists an $r$ chromatic graph of $n$ vertices which does not contain a closed circuit of fewer than $[c_5 \log n]$ edges. I am not sure if this result is best possible.

We do not give the details of the proof of (3) since it is simpler than that of (4). For $k = 3$ (3) follows from (4). If $k > 3$, put

$$m = c_6\,[n^{2 - 2/(k-1)}]$$

and denote by $\mathfrak{G}_\alpha^{(n)}$ the "random" graph of $m$ edges. By a simple computation it follows that for sufficiently small $c_6$, $\mathfrak{G}_\alpha^{(n)}$ does not contain a complete graph of order $k$ for more than

$$\frac{1}{2}\binom{\binom{n}{2}}{m}$$

values of $\alpha$, and that for more than this number of values of $\alpha$ $\mathfrak{G}_\alpha^{(n)}$ does not contain a set of $c_7 n^{2/(k-1)} \log n$ independent points ($c_7 = c_7(c_6)$ is sufficiently large). Thus

$$f(k, c_7 n^{2/(k-1)} \log n) > n,$$

which implies (3) by a simple computation.

Now we prove (5). It will clearly suffice to prove the first inequality of (5). We use induction on $l$. Let there be given a graph $\mathfrak{G}$ having $h(2k + 1, l) - 1$ vertices which does not contain a closed circuit of $2k + 1$ or fewer edges and for which the maximum number of independent points is less than $l$. If every point of $\mathfrak{G}$ has order at least $[l^{1/k}] + 2$ (the order of a vertex is the number of edges emanating from it) then, starting from an arbitrary point, we reach in $k$ steps at least $l$ points, which must be all distinct since otherwise $\mathfrak{G}$ would have to contain a closed circuit of at most $2k$ edges. The endpoints thus obtained must be independent, for if two were connected by an edge $\mathfrak{G}$ would contain a closed circuit of $2k + 1$ edges. Thus $\mathfrak{G}$ would have a set of at least $l$ independent points, which is false. Thus $\mathfrak{G}$ must have a vertex $x_1$ of order at most $[l^{1/k}] + 1$. Omit the vertex $x_1$ and all the vertices connected with it. Thus we obtain the graph $\mathfrak{G}'$ and $x_1$ is not connected with any point of $\mathfrak{G}'$, thus the maximum number of independent points of $\mathfrak{G}'$ is $l - 1$, or $\mathfrak{G}'$ has at most $h(2k + 1, l - 1) - 1$ vertices, hence

$$h(2k + 1, l) \le h(2k + 1, l - 1) + [l^{1/k}] + 2,$$

which proves (5).

REFERENCES

1. Blanche Descartes, A three colour problem, Eureka (April, 1947). (Solution March, 1948.)
2. ---- Solution to Advanced Problem no. 4526, Amer. Math. Monthly, 61 (1954), 352.
3. P. Erdös, Some remarks on the theory of graphs, B.A.M.S. 53 (1947), 292-4.
4. ---- Remarks on a theorem of Ramsey, Bull. Research Council of Israel, Section F, 7 (1957).
5. P. Erdös and G. Szekeres, A combinatorial problem in geometry, Compositio Math. 2 (1935), 463-70.
6. J. B. Kelly and L. M. Kelly, Paths and circuits in critical graphs, Amer. J. Math., 76 (1954), 786-92.
7. J. Mycielski, Sur le coloriage des graphs, Colloquium Math. 3 (1955), 161-2.
8. F. P. Ramsay, Collected papers, 82-111.
9. T. Skolem, Ein kombinatorischer Satz mit Anwendung auf ein logisches Entscheidungsproblem, Fund. Math. 20 (1933), 254-61.

University of Toronto and Technion, Haifa

Reprinted from Canad. J. Math. 11 (1959), 34-38

Kasteleyn, P. W., Physica 27 (1961), 1209-1225

THE STATISTICS OF DIMERS ON A LATTICE

I. THE NUMBER OF DIMER ARRANGEMENTS ON A QUADRATIC LATTICE

by P. W. KASTELEYN

Koninklijke/Shell-Laboratorium, Amsterdam, Nederland (Shell Internationale Research Maatschappij N.V.)

Synopsis

The number of ways in which a finite quadratic lattice (with edges or with periodic boundary conditions) can be fully covered with given numbers of "horizontal" and "vertical" dimers is rigorously calculated by a combinatorial method involving Pfaffians. For lattices infinite in one or two dimensions asymptotic expressions for this number of dimer configurations are derived, and as an application the entropy of a mixture of dimers of two different lengths on an infinite rectangular lattice is calculated. The relation of this combinatorial problem to the Ising problem is briefly discussed.

§ 1. Introduction. Combinatorial problems relating to a regular space lattice arise in the theory of various physical phenomena. One of these problems is the "arrangement problem", which plays a role in the explanation of the non-ideal thermodynamic behaviour of liquids consisting of molecules of different size with zero energy of mixing (athermal mixtures). In the investigations devoted to this problem (most of which have been discussed critically by Guggenheim 1)) much attention has been paid to the so-called quasi-crystalline model. One considers a regular lattice consisting of points (sites, vertices) connected by bonds. This lattice is fully covered with monomers (molecules occupying one site) and rigid or flexible polymers (molecules occupying several sites connected by bonds); the latter may be dimers, trimers etc., but also "high polymers". If the energy of mixing is zero the thermodynamic properties of this system can be calculated from the combinatorial factor, i.e. the number of ways of arranging given numbers of monomers and polymers on the lattice.

The same combinatorial problem arises in the cell-cluster theory of the liquid state 2). There one divides the volume of a liquid into a set of cells of which the centres form a regular lattice, and one considers situations in which, by the removal of certain interfaces between cells, a number of double cells, triple cells etc. have been formed. For the calculation of the free energy of the liquid one then has to determine the number of ways in which a given volume can be divided into given numbers of single, double, triple cells etc.; the equivalence of this combinatorial factor with the former one is obvious.

A two-dimensional form of the arrangement problem is encountered in the theory of adsorption of diatomic, triatomic etc. molecules on a regular surface. The empty sites of the surface then play the role of "monomers".

As in many problems of this sort, it is easy to find the most general solution for a one-dimensional lattice 2), it appears very difficult to find a more or less general solution for two-dimensional lattices, whereas for three dimensions any exact solution seems extremely remote. Therefore one generally uses approximation methods; we refer to the work of Fowler and Rushbrooke 3) (who also made some rigorous calculations on two- and three-dimensional infinite strips of finite width), Chang, Flory, Huggins, Miller, Guggenheim (for detailed references see ref. 1)), Orr 4), Rushbrooke, Scoins and Wakefield 5), and Cohen, De Boer and Salsburg 2). Recently, Green and Leipnik 6) claimed to have found a rigorous solution for the case of monomer-dimer mixtures on a two-dimensional lattice, but Fisher and Temperley 7), and Katsura and Inawashiro 7) proved that their results were not correct.

In this paper we present a rigorous solution to the above-mentioned combinatorial problem for a very special case, viz. that of a two-dimensional quadratic lattice, completely covered with dimers (in terms of graph theory we ask for the number of "perfect matchings" of the lattice 8)). Both the absence of monomers and the dimension of the lattice form serious restrictions, but it is hoped that the present investigation may be useful as a first step. The situation has some resemblance to that of the combinatorial problem connected with the Ising model of cooperative phenomena 9), for which an exact solution has been given for an equally special case 10-13).
It will be shown that the two problems are to a certain extent analogous. In § 2 and § 3 we shall develop the method of solution for the case of a finite lattice imbedded in a plane (i.e. a rectangle with edges), and in § 4 for that of a lattice imbedded in a torus (i.e. with periodic boundary conditions). In § 5 an alternative method will be sketched. As an application, the entropy of a certain mixture of dimers is calculated in § 6. It is intended to treat in a subsequent paper the statistics of dimers on other two-dimensional lattices, to discuss boundary effects and to make some remarks on three-dimensional lattices.

§ 2. The planar quadratic lattice. Consider a planar quadratic $m \times n$ lattice $Q_{mn}$ to which one can attach dimers (figures consisting of two linked vertices) in such a way that every dimer occupies two lattice points connected by a bond. We indicate the lattice points by $(i, j)$ or $p$ ($i = 1, \ldots, m$; $j = 1, \ldots, n$; $p = 1, \ldots, mn$), the number of "horizontal" dimers (occupying two points $(i, j)$ and $(i + 1, j)$) by $N_2$ and the number of "vertical" dimers (occupying two points $(i, j)$ and $(i, j + 1)$) by $N_2'$. If $g(N_2, N_2')$ is the combinatorial factor, i.e. the number of ways of covering the lattice with dimers so that every site is covered by one and only one dimer vertex, we ask for the configuration generating function

$$Z_{mn}(z, z') = \sum_{N_2, N_2'} g(N_2, N_2')\, z^{N_2} z'^{N_2'}, \tag{1}$$

where the sum runs over all combinations $N_2$, $N_2'$ satisfying $2(N_2 + N_2') = mn$; if desired, the counting variables $z$ and $z'$ may be viewed as activities and $Z_{mn}$ as the configurational partition function. At least one of the two numbers $m$ and $n$ has to be even; let $m$ be even. We shall refer to the arrangement of dimers occupying the pairs of sites $p_1$ and $p_2$, $p_3$ and $p_4$, $p_5$ and $p_6$, etc. as the configuration $C = |p_1; p_2|p_3; p_4|p_5; p_6| \cdots |p_{mn-1}; p_{mn}|$. A simple but important configuration is $C_0 = |1,1; 2,1|3,1; 4,1| \cdots |m-1,1; m,1|1,2; 2,2| \cdots |m-1,n; m,n|$, which we shall call the standard configuration (fig. 1a). We could, however, represent this arrangement of dimers equally well by $|2,1; 1,1|3,1; 4,1| \cdots$ or by $|4,1; 3,1|1,1; 2,1| \cdots$ etc. To make the representation unique we order the points of the lattice row after row by choosing the $p$-numbering as follows:

$$(i, j) \leftrightarrow p = (j - 1)m + i, \tag{2}$$

and we introduce the convention that the points of a configuration shall be indicated in the following ("canonical") order:

$$p_1 < p_2;\quad p_3 < p_4;\quad \ldots;\quad p_{mn-1} < p_{mn}; \tag{3a}$$

$$p_1 < p_3 < \cdots < p_{mn-1}. \tag{3b}$$
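The numbering (2) and the canonical order (3a), (3b) can be sketched in a few lines of Python. This is only an illustration; the function names `site_number`, `canonical_form` and `standard_configuration` are chosen here and do not occur in the paper.

```python
def site_number(i, j, m):
    """Row-after-row numbering (2): (i, j) -> p = (j - 1) * m + i."""
    return (j - 1) * m + i


def canonical_form(dimers, m):
    """Write a configuration as pairs of site numbers in canonical order.

    Each dimer is a pair of lattice points ((i, j), (i', j')).
    Sorting inside a pair enforces (3a); sorting the pairs by their
    first member enforces (3b).
    """
    pairs = [tuple(sorted(site_number(i, j, m) for i, j in dimer))
             for dimer in dimers]
    return sorted(pairs)


def standard_configuration(m, n):
    """The standard configuration C_0: horizontal dimers only (m even)."""
    return [((i, j), (i + 1, j))
            for j in range(1, n + 1)
            for i in range(1, m, 2)]


c0 = canonical_form(standard_configuration(4, 2), 4)
print(c0)

# The same arrangement written with points and dimers in another order
# is mapped onto the same canonical representation.
shuffled = [((4, 1), (3, 1)), ((2, 2), (1, 2)),
            ((2, 1), (1, 1)), ((4, 2), (3, 2))]
print(canonical_form(shuffled, 4) == c0)
```

The canonical form is what makes the correspondence between configurations and terms of the Pfaffian introduced next one-to-one.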

By analogy to the determinantal approach to the Ising problem developed by Kac and Ward 11) we shall try to construct a mathematical form consisting of a series of terms each of which corresponds uniquely to one configuration and has the "weight" $z^{N_2} z'^{N_2'}$ of this configuration. The conditions (3) strongly suggest that this form should be a Pfaffian rather than a determinant. A Pfaffian is a number attributed to a triangular array of coefficients $a(k; k')$ ($k = 1, \ldots, N$; $k' = 1, \ldots, N$; $k < k'$; $N$ even) in the following way 14):

$$\mathrm{Pf}\{a(k; k')\} = {\sum_{P}}'\, \delta_P\, a(k_1; k_2)\, a(k_3; k_4) \cdots a(k_{N-1}; k_N), \tag{4}$$

where the sum runs over those permutations $k_1, k_2, \ldots, k_N$ of the numbers $1, 2, \ldots, N$ which obey

$$k_1 < k_2;\quad k_3 < k_4;\quad \ldots;\quad k_{N-1} < k_N;\quad k_1 < k_3 < \cdots < k_{N-1}, \tag{3'}$$

and where $\delta_P$ is the parity of the permutation $P$, i.e. $-1$ or $+1$ according as $P$ is an odd or an even permutation. Pfaffians have been introduced into physics by Caianello and Fubini 15) and in lattice-combinatorial problems by Hurst and Green 13). We shall now show that it is indeed possible to define a triangular array of elements $D(p; p')$ so that

$$Z_{mn}(z, z') = \mathrm{Pf}\{D(p; p')\}. \tag{5}$$
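The definition (4) with the restriction (3') can be transcribed literally for small $N$. The sketch below enumerates the admissible permutations and computes their parity from the cycle structure; it is meant only to make the definition concrete (the number of terms grows as $1 \cdot 3 \cdot 5 \cdots (N-1)$), and the function names are introduced here for illustration.

```python
def permutation_parity(perm):
    """Return +1 for an even and -1 for an odd permutation of 0..N-1."""
    seen = [False] * len(perm)
    sign = 1
    for start in range(len(perm)):
        length = 0
        k = start
        while not seen[k]:
            seen[k] = True
            k = perm[k]
            length += 1
        if length and length % 2 == 0:   # a cycle of even length is odd
            sign = -sign
    return sign


def canonical_pairings(indices):
    """Yield all pairings obeying (3'): the smaller index comes first in
    each pair and the first members of the pairs increase."""
    if not indices:
        yield []
        return
    first, rest = indices[0], indices[1:]
    for pos, partner in enumerate(rest):
        remaining = rest[:pos] + rest[pos + 1:]
        for tail in canonical_pairings(remaining):
            yield [(first, partner)] + tail


def pfaffian_by_definition(a):
    """Pfaffian of the triangular array a[k][kp], k < kp (0-based), N even."""
    total = 0
    for pairing in canonical_pairings(list(range(len(a)))):
        perm = [k for pair in pairing for k in pair]
        term = permutation_parity(perm)
        for k, kp in pairing:
            term *= a[k][kp]
        total += term
    return total


# N = 4: Pf = a12*a34 - a13*a24 + a14*a23
a = [[0, 2, 3, 5],
     [0, 0, 7, 11],
     [0, 0, 0, 13],
     [0, 0, 0, 0]]
print(pfaffian_by_definition(a))   # 2*13 - 3*11 + 5*7
```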

We begin by noting that if we define $D(p; p') = 0$ for all pairs of sites $(p; p')$ that are not connected by a bond, all terms in the Pfaffian that would not correspond to a dimer configuration will vanish. Next we put the coefficients $D(p; p')$ corresponding to pairs of sites that are connected by a horizontal or a vertical bond equal, in absolute magnitude, to $z$ and $z'$, respectively. In this way we get all configurations represented by a term of the proper weight; the conditions (3) and (3') ensure that the correspondence is one-to-one. Finally, in order that all configurations are counted positively we have to choose the signs of the non-zero elements such that the product $D(p_1; p_2)\, D(p_3; p_4) \cdots$ has the same sign as the parity $\delta_P$. It is evident that the product corresponding to the standard configuration $C_0$ has to be positive.

Fig. 1. (a) The standard configuration $C_0$ of the planar lattice $Q_{64}$. (b) The construction of a new configuration $C$ (full lines) from $C_0$ (dotted lines).

Now, from $C_0$ one can obtain any arbitrary configuration $C$ in the following way. We draw a picture of the lattice in which every dimer of the configuration $C_0$ is represented by a dotted line and every dimer of the configuration $C$ by a full line (fig. 1b). Since any lattice point is the endpoint of just one line of each type the resulting figure consists of: a) pairs of sites connected both by a dotted line and by a full line; b) closed polygons consisting of alternating dotted and full lines (to be called $C_0$-bonds and non-$C_0$-bonds). If we then take the configuration $C_0$ and we shift in each of the "alternating" polygons all dimers clockwise or counter-clockwise by one step, $C_0$ goes over into $C$. By analogy to the Ising problem 9) one might expect that each of the polygons (representing a cyclic permutation of an even number of lattice sites) would contribute a factor $-1$ to the parity $\delta_P$ of the permutation $P$ corresponding to $C$. In fact, this is true, but owing to the restrictions on

the permutations occurring in a Pfaffian, the proof is not so simple as in the Ising problem. Consider e.g. the small square in fig. 1b. According to (3') its vertices occur in the term representing $C_0$ as $\ldots D(p_1; p_2) \ldots D(p_3; p_4) \ldots$ and in the term representing $C$ as $\ldots D(p_1; p_3)\, D(p_2; p_4) \ldots$. Obviously the change from $p_1 p_2 p_3 p_4$ to $p_1 p_3 p_2 p_4$ is not a cyclic permutation of the four points along the square. It can, however, be considered as the product of the following permutations: $p_1 p_2 p_3 p_4 \to p_1 p_2 p_4 p_3$ (putting the points into a cyclic order corresponding to the square) $\to p_2 p_4 p_3 p_1$ (permuting the four points cyclically) $\to p_2 p_4 p_1 p_3$ (ordering the points $p_1$ and $p_3$ according to (3a)) $\to p_1 p_3 p_2 p_4$ (ordering the pairs $(p_1; p_3)$ and $(p_2; p_4)$ according to (3b)). The resulting permutation is odd; apparently the parities of the re-ordering permutations just compensate each other. To show that this is true in general we first take a configuration which differs from $C_0$ only in the position of the dimers on one polygon. Consider a column of "$C_0$-bonds" which is crossed by this polygon. If we describe a closed path along the polygon, we cross this column as many times in the "forward direction" (i.e., in the direction of increasing $p$) as in the "backward direction". This is true for any column of $C_0$-bonds, and therefore, if the path contains $r$ forward steps along $C_0$-bonds it must also contain $r$ backward steps along $C_0$-bonds. By a similar argument combined with the alternation of the two types of bonds, we can show that it also contains $r$ forward and $r$ backward steps along non-$C_0$-bonds. In the terms of the Pfaffian which correspond to $C_0$ and $C$, on the other hand, all points of the polygon occur in the order of increasing $p$.
Consequently the permutation which changes the "$C_0$-term" into the "$C$-term" can be considered as the product of: 1) the reversal of the $r$ pairs of sites ($C_0$-bonds) for which the canonical order is opposite to the required cyclic order; 2) the rearrangement of the $2r$ $C_0$-bonds which is needed to get all polygon vertices in the cyclic order; 3) the cyclic permutation of these $4r$ vertices; 4) the reversal of the $r$ pairs of sites which now violate (3a); 5) the rearrangement of the $2r$ pairs which is needed to satisfy (3b). Each reversal within a pair contributes a factor $-1$ to the parity of the resulting permutation, a reshuffling of the pairs a factor $+1$, and the cyclic permutation a factor $(-1)^{4r-1}$. We thus find that the total parity of the permutation is $(-1)^r (+1) (-1)^{4r-1} (+1) (-1)^r = -1$. If there is more than one polygon, we can perform the corresponding permutations consecutively; each polygon then contributes a factor $-1$. It shall now be indicated how these factors can be compensated for. First we remark that since no two $C_0$-bonds have a point in common, an alternating polygon can neither intersect itself nor cross or touch other polygons. Therefore we shall not encounter such difficulties as arose in the corresponding step of Kac and Ward's method 11) 12). Any alternating polygon can be considered as built up from horizontal strips of connected unit squares. From the requirement that during the

cyclic shift the opposite sides of a connected figure are shifted in opposite directions, combined with the alternation of $C_0$-bonds and non-$C_0$-bonds we conclude that both the number of unit squares in a strip and the number of strips are odd (cf. fig. 1b). Consequently, each alternating polygon encircles an odd number of unit squares, and it will be sufficient to choose the signs of the $D(p; p')$ such that among the four bonds bounding a unit square there is an odd number having a negative $D(p; p')$; we have further seen that the standard configuration has to appear with a positive sign. This can be realized e.g. by attributing minus signs to the coefficients of the vertical bonds between lattice sites of odd $i$. We thus get the following set of coefficients:

$$\begin{aligned}
D(i, j; i + 1, j) &= z && \text{for } 1 \le i \le m - 1,\ 1 \le j \le n, \\
D(i, j; i, j + 1) &= (-1)^i z' && \text{for } 1 \le i \le m,\ 1 \le j \le n - 1, \\
D(i, j; i', j') &= 0 && \text{otherwise.}
\end{aligned} \tag{6}$$
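As a check on the sign rule (6), the following sketch builds the triangular array $D(p; p')$ for a small lattice, evaluates its Pfaffian by expansion along the lowest remaining index, and compares the result for $z = z' = 1$ with a direct enumeration of dimer coverings. The expansion formula $\mathrm{Pf} = \sum_{j} (-1)^{j} a(1; j)\, \mathrm{Pf}(\text{array without } 1 \text{ and } j)$ is a standard property of Pfaffians and is not taken from the paper; all function names are illustrative.

```python
from functools import lru_cache


def kasteleyn_array(m, n, z=1, zp=1):
    """Non-zero elements D(p; p'), p < p', of (6), keyed by site numbers (2)."""
    d = {}
    for j in range(1, n + 1):
        for i in range(1, m + 1):
            p = (j - 1) * m + i
            if i < m:
                d[(p, p + 1)] = z                 # horizontal bond
            if j < n:
                d[(p, p + m)] = (-1) ** i * zp    # vertical bond, minus for odd i
    return d


def pfaffian(d, size):
    """Pfaffian by expansion along the smallest remaining index."""
    @lru_cache(maxsize=None)
    def expand(remaining):
        if not remaining:
            return 1
        first, rest = remaining[0], remaining[1:]
        total = 0
        for pos, partner in enumerate(rest):
            weight = d.get((first, partner), 0)
            if weight:
                sign = -1 if pos % 2 else 1
                total += sign * weight * expand(rest[:pos] + rest[pos + 1:])
        return total
    return expand(tuple(range(1, size + 1)))


def count_coverings(m, n):
    """Count dimer coverings directly: always cover the first free site."""
    def fill(occupied):
        free = next((p for p in range(m * n) if p not in occupied), None)
        if free is None:
            return 1
        i, j = free % m, free // m
        total = 0
        if i + 1 < m and free + 1 not in occupied:
            total += fill(occupied | {free, free + 1})
        if j + 1 < n and free + m not in occupied:
            total += fill(occupied | {free, free + m})
        return total
    return fill(frozenset())


for m, n in [(2, 2), (4, 3), (4, 4)]:
    print(m, n, pfaffian(kasteleyn_array(m, n), m * n), count_coverings(m, n))
```

With unequal activities the Pfaffian returns the value of the generating function itself; for the $2 \times 2$ lattice, for instance, it gives $z^2 + z'^2$.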

The equations (5) and (6) are sufficient to derive the configuration generating function $Z_{mn}(z, z')$. It should be remarked that throughout this paper we assume that the dimers are symmetric. If they were asymmetric all elements $D(p; p')$ would have to be multiplied by 2, since any pair of sites might then be occupied in two distinguishable ways.

§ 3. The evaluation of the Pfaffian. For the evaluation of $\mathrm{Pf}\{D(p; p')\}$ we make use of the property of a Pfaffian that its square is equal to the determinant of the skew-symmetric matrix to which the given triangular array of coefficients can be extended 14). That is, in our case,

$$Z_{mn}^2(z, z') = [\mathrm{Pf}\, D]^2 = \det D, \tag{7}$$

where $D$ is the matrix given by (6) together with the requirement of skew symmetry:

$$D(i, j; i', j') = -D(i', j'; i, j), \tag{8}$$

and $\mathrm{Pf}\, D$ stands for $\mathrm{Pf}\{D(p; p')\}$. If $D$ were a completely periodic matrix, it could easily be brought into a diagonal form, viz. by a Fourier-type similarity transformation 9), and the calculation of the determinant would be straightforward. However, the truncated edges of the lattice $Q_{mn}$ disturb the periodicity of the matrix. Fortunately, it is still possible to bring it into a "nearly diagonal" form. We write $D$ as the sum of two direct products of an $m \times m$ matrix and an $n \times n$ matrix:

$$D = z\,(Q_m \times E_n) + z'\,(F_m \times Q_n), \tag{9}$$

where $E$ is the unit matrix,

$$Q = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0 & 0 \\
-1 & 0 & 1 & \cdots & 0 & 0 \\
0 & -1 & 0 & \cdots & 0 & 0 \\
\vdots & & & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & 0 & 1 \\
0 & 0 & 0 & \cdots & -1 & 0
\end{pmatrix}, \qquad
F = \begin{pmatrix}
-1 & 0 & 0 & \cdots & 0 & 0 \\
0 & 1 & 0 & \cdots & 0 & 0 \\
0 & 0 & -1 & \cdots & 0 & 0 \\
\vdots & & & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & -1 & 0 \\
0 & 0 & 0 & \cdots & 0 & 1
\end{pmatrix}, \tag{10}$$

and the indices indicate the order of the matrices. It can be verified that $Q_n$ can be diagonalized by a similarity transformation $\tilde{Q}_n = U_n^{-1} Q_n U_n$ with the matrix $U_n$ given by

$$U_n(l; l') = \{2/(n + 1)\}^{1/2}\, i^{\,l} \sin\{l l' \pi/(n + 1)\}, \qquad U_n^{-1}(l; l') = \{2/(n + 1)\}^{1/2}\, (-i)^{l'} \sin\{l l' \pi/(n + 1)\}; \tag{11}$$

the diagonal elements of $\tilde{Q}_n$ are the eigenvalues $2i \cos\{l\pi/(n + 1)\}$ of $Q_n$ ($l = 1, \ldots, n$). On the other hand, this transformation obviously leaves $E_n$ invariant. $Q_m$ can be diagonalized analogously by a transformation with $U_m$, but this transformation disturbs the diagonal form of $F_m$, although not seriously. Transforming $D$ with the direct product $U = U_m \times U_n$ we find that $\tilde{D} = U^{-1} D U$ has the following elements:

$$\tilde{D}(k, l; k', l') = 2iz\, \delta_{k,k'}\, \delta_{l,l'} \cos\{k\pi/(m + 1)\} - 2iz'\, \delta_{k+k',m+1}\, \delta_{l,l'} \cos\{l\pi/(n + 1)\};$$

i.e. the only non-zero elements are grouped in $2 \times 2$ blocks along the diagonal. Thus the determinant is readily found:

$$\det D = \det \tilde{D} = \prod_{k=1}^{m/2} \prod_{l=1}^{n}
\begin{vmatrix}
2iz \cos\dfrac{k\pi}{m+1} & -2iz' \cos\dfrac{l\pi}{n+1} \\[2ex]
-2iz' \cos\dfrac{l\pi}{n+1} & -2iz \cos\dfrac{k\pi}{m+1}
\end{vmatrix} \tag{12}$$

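and, from (7) and (12), we find the following expression for the configuration generating function of the lattice $Q_{mn}$:

$$Z_{mn}(z, z') = \prod_{k=1}^{m/2} \prod_{l=1}^{n} 2\left[z^2 \cos^2\frac{k\pi}{m+1} + z'^2 \cos^2\frac{l\pi}{n+1}\right]^{1/2}
= \begin{cases}
2^{mn/2} \displaystyle\prod_{k=1}^{m/2} \prod_{l=1}^{n/2} \left[z^2 \cos^2\frac{k\pi}{m+1} + z'^2 \cos^2\frac{l\pi}{n+1}\right] & (n \text{ even}), \\[3ex]
2^{m(n-1)/2}\, z^{m/2} \displaystyle\prod_{k=1}^{m/2} \prod_{l=1}^{(n-1)/2} \left[z^2 \cos^2\frac{k\pi}{m+1} + z'^2 \cos^2\frac{l\pi}{n+1}\right] & (n \text{ odd}).
\end{cases} \tag{13}$$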
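The double product in the first member of (13) is easy to evaluate numerically. The following sketch (the function name `z_planar` is chosen here) does so in floating-point arithmetic; for $z = z' = 1$ the result should be an integer up to rounding errors, namely the total number of dimer arrangements.

```python
import math


def z_planar(m, n, z=1.0, zp=1.0):
    """Evaluate the first member of (13) for even m."""
    if m % 2:
        raise ValueError("m has to be even")
    product = 1.0
    for k in range(1, m // 2 + 1):
        ck = math.cos(k * math.pi / (m + 1))
        for l in range(1, n + 1):
            cl = math.cos(l * math.pi / (n + 1))
            product *= 2.0 * math.sqrt(z * z * ck * ck + zp * zp * cl * cl)
    return product


for m, n in [(2, 2), (4, 3), (4, 4), (8, 8)]:
    print(m, n, round(z_planar(m, n)))
```

For the $8 \times 8$ lattice (the chessboard) this gives 12 988 816 arrangements.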

In writing down the expression valid for odd $n$ use has been made of the relation

$$\prod_{k=1}^{m/2} 2 \cos\{k\pi/(m + 1)\} = 1,$$

which is a particular case of the identity

$$\prod_{k=1}^{m/2} 4\left[u^2 + \cos^2\frac{k\pi}{m+1}\right] = \frac{[u + (1 + u^2)^{1/2}]^{m+1} - [u - (1 + u^2)^{1/2}]^{m+1}}{2(1 + u^2)^{1/2}}; \tag{14}$$

this identity holds (for even $m$) because the two members represent two polynomials in $u$ of the same degree $m$, with the same zeros, and with equal coefficients of the leading term. For numerical calculations it is sometimes useful to perform, with the aid of (14), the product over $k$ in (13). In this way we find

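$$Z_{mn}(z, z') = z^{mn/2} \prod_{l=1}^{[\frac12 n]} \frac{\left[\zeta \cos\dfrac{l\pi}{n+1} + \left(1 + \zeta^2 \cos^2\dfrac{l\pi}{n+1}\right)^{1/2}\right]^{m+1} - \left[\zeta \cos\dfrac{l\pi}{n+1} - \left(1 + \zeta^2 \cos^2\dfrac{l\pi}{n+1}\right)^{1/2}\right]^{m+1}}{2\left(1 + \zeta^2 \cos^2\dfrac{l\pi}{n+1}\right)^{1/2}}, \tag{15}$$

where $\zeta = z'/z$ and $[\frac12 n] = \frac12 n$ or $\frac12 (n - 1)$ according as $n$ is even or odd. In the limit $m \to \infty$, i.e. for infinitely long strips of finite width $n$, we get

$$Z_n(z, z') = \lim_{m \to \infty} \{Z_{mn}(z, z')\}^{1/m} = z^{n/2} \prod_{l=1}^{[\frac12 n]} \left[\zeta \cos\frac{l\pi}{n+1} + \left(1 + \zeta^2 \cos^2\frac{l\pi}{n+1}\right)^{1/2}\right]. \tag{16}$$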
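A short numerical sketch of the strip formula (16) follows. The function names are chosen here; `molecular_freedom_strip` takes the number of arrangements per dimer of an infinitely long strip of width $n$ to be $\{Z_n(1, 1)\}^{2/n}$, which is the $m \to \infty$ limit of $\{Z_{mn}(1,1)\}^{2/mn}$.

```python
import math


def strip_growth(n, z=1.0, zp=1.0):
    """Z_n(z, z') of (16): growth factor per unit length of a planar strip."""
    zeta = zp / z
    product = z ** (n / 2)
    for l in range(1, n // 2 + 1):          # l = 1, ..., [n/2]
        c = zeta * math.cos(l * math.pi / (n + 1))
        product *= c + math.sqrt(1.0 + c * c)
    return product


def molecular_freedom_strip(n):
    """Arrangements per dimer for an infinitely long planar strip of width n."""
    return strip_growth(n) ** (2.0 / n)


for n in range(1, 9):
    print(n, round(molecular_freedom_strip(n), 3))
```

For $n = 2$ the growth factor is the golden ratio $\frac12 (1 + \sqrt5)$, in agreement with the fact that the number of arrangements on a $2 \times m$ strip follows the Fibonacci sequence.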

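Finally we have, in the limit of an infinitely large lattice:

$$\begin{aligned}
Z(z, z') = \lim_{m,n \to \infty} \{Z_{mn}(z, z')\}^{1/mn}
&= \exp\left\{\pi^{-2} \int_0^{\pi/2} d\omega \int_0^{\pi/2} d\omega' \ln 4[z^2 \cos^2\omega + z'^2 \cos^2\omega']\right\} \\
&= z^{1/2} \exp\left\{\pi^{-1} \int_0^{\pi/2} d\omega \ln[\zeta \cos\omega + (1 + \zeta^2 \cos^2\omega)^{1/2}]\right\}.
\end{aligned} \tag{17}$$

For $|z'| \le |z|$ we may expand the latter integrand in terms of $\zeta$, integrate term by term, and sum the resulting series, which gives:

$$\begin{aligned}
\ln Z(z, z') &= \tfrac12 \ln z + \pi^{-1} \sum_{j=0}^{\infty} (-1)^j (2j + 1)^{-2} \zeta^{2j+1} \\
&= \tfrac12 \ln z + \pi^{-1} \int_0^{\zeta} dx\, x^{-1} \arctan x \\
&= \tfrac12 \ln z' + \pi^{-1} \int_0^{1/\zeta} dx\, x^{-1} \arctan x.
\end{aligned} \tag{18}$$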

From the equivalence of the last two expressions (which is easily proved) and the analogous derivation in the case $|z'| \ge |z|$ it follows that either of them may be used for all values of $z$ and $z'$. Using the relation $\arctan x = (2i)^{-1}[\ln(1 + ix) - \ln(1 - ix)]$ and introducing the function $A_2(x) = (2i)^{-1}[L_2(ix) - L_2(-ix)]$, where $L_2(u) = -\int_0^u dx\, x^{-1} \ln(1 - x)$ is Euler's dilogarithm 16), we finally arrive at the following expressions for the limit of the configurational partition function per site:

$$\ln Z(z, z') = \tfrac12 \ln z + \pi^{-1} A_2(z'/z) = \tfrac12 \ln z' + \pi^{-1} A_2(z/z'). \tag{19}$$

By substituting $z = z' = 1$ in equations (13) or (15), (16) and (19) we immediately find the total number of dimer arrangements, $g(\frac12 mn)$, and its asymptotic behaviour. One is sometimes interested in the "molecular freedom" $\varphi_2$ of the dimers, defined 3) as the number of arrangements per dimer:

$$\varphi_2 = \{g(\tfrac12 mn)\}^{2/mn} = \Big\{\sum_{N_2, N_2'} g(N_2, N_2')\Big\}^{2/mn} = \{Z_{mn}(1, 1)\}^{2/mn}. \tag{20}$$

In particular, for the infinite lattice we find

$$\varphi_2^{(\infty)} = Z^2(1, 1) = \exp\{2\pi^{-1} A_2(1)\} = \exp\{2G/\pi\} = 1.791\,622\,812\ldots, \tag{21}$$

where $G = 1^{-2} - 3^{-2} + 5^{-2} - 7^{-2} + \cdots = 0.915\,965\,594\ldots$ (Catalan's constant).
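The numbers in (21) can be reproduced with a few lines of Python. The sketch sums the alternating series for $G$ (averaging two consecutive partial sums to speed up convergence, a device chosen here and not taken from the paper) and, as an independent check, evaluates the single integral in (17) for $z = z' = 1$ by the midpoint rule; by (18) and (19) its value should be $G/\pi$.

```python
import math


def catalan_constant(terms=200000):
    """G = 1 - 1/9 + 1/25 - ..., from two consecutive partial sums."""
    partial = math.fsum((-1) ** j / (2 * j + 1) ** 2 for j in range(terms))
    following = partial + (-1) ** terms / (2 * terms + 1) ** 2
    return 0.5 * (partial + following)


def log_z_infinite(steps=20000):
    """ln Z(1, 1) from the second member of (17), by the midpoint rule."""
    h = (math.pi / 2) / steps
    total = 0.0
    for s in range(steps):
        c = math.cos((s + 0.5) * h)
        total += math.log(c + math.sqrt(1.0 + c * c))
    return total * h / math.pi


G = catalan_constant()
phi_infinite = math.exp(2 * G / math.pi)
print(G, phi_infinite, math.exp(2 * log_z_infinite()))
```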

Several approximate values for $\varphi_2^{(\infty)}$ have, in more or less explicit form, been given in the literature. From Flory's theory of polymers 17) one can derive a value which corresponds to a "Bragg-Williams" or "random mixing" approximation, Chang 18) and Cohen e.a. 2) used a "1st Bethe-Kikuchi" or "quasi-chemical" approximation, Orr 4) worked out a "2nd Bethe approximation", Miller 19) calculated a lower bound, and Fowler and Rushbrooke 3) obtained a very close estimate by extrapolating their exact results for infinite quadratic strips of widths up to 8 *) (which are, of course, included as special cases in our expression (16)). The various results are summarized in table II (p. 1220).

*) Miller's criticism 19) of the calculations of Fowler and Rushbrooke rests on a wrong interpretation of the method and is therefore not valid.

§ 4. The toroidal quadratic lattice. In this section we shall investigate the changes brought about by introducing periodic boundary conditions, i.e. by winding the lattice on a torus. We shall call the toroidal $m \times n$ lattice $Q^{(t)}_{mn}$, and the corresponding generating functions $Z^{(t)}_{mn}(z, z')$, $Z^{(t)}_{n}(z, z')$ and $Z^{(t)}(z, z')$. One difference with the case of a planar lattice is that $D(m, j; 1, j)$ and $D(i, n; i, 1)$ should no longer be taken to be zero but equal, in absolute magnitude, to $z$ and $z'$, respectively; we can still choose the signs of these elements. Further we have now to distinguish four classes of configurations. The first class comprises those configurations that can be derived from the standard configuration (which we take identical to that of § 2) by cyclic shifts along polygons not looping the torus either in horizontal or in vertical direction, or, more generally, looping the torus an even number of times in both directions; let us call them (e, e) configurations. In an analogous way we define (o, e), (e, o) and (o, o) configurations (cf. fig. 2; e = even, o = odd, first symbol refers to horizontal loops).

Fig. 2. Configurations from the four configuration classes of the toroidal lattice: (e, e), (o, e), (e, o) and (o, o) configuration.

If we define, for $1 \le i \le m$, $1 \le j \le n$, [...] $= -z$; $D(i, n; i, 1) = (-1)^{i+1} z'$, (22.3) (22.4)

and we remark that $\mathrm{Pf}\, D_2$ counts all configurations correctly except those from the class (o, e), $\mathrm{Pf}\, D_3$ all configurations except the class (e, o) and $\mathrm{Pf}\, D_4$ all configurations except the class (o, o). These counting rules are summarized in table I:

TABLE I
Counting of configurations on a toroidal lattice

Class of            Sign of corresponding terms in
configurations      Pf D_1    Pf D_2    Pf D_3    Pf D_4
(e, e)                +         +         +         +
(o, e)                -         -         +         +
(e, o)                -         +         -         +
(o, o)                -         +         +         -

It is evident from this table that if we put

$$Z^{(t)}_{mn}(z, z') = \tfrac12 (-\mathrm{Pf}\, D_1 + \mathrm{Pf}\, D_2 + \mathrm{Pf}\, D_3 + \mathrm{Pf}\, D_4), \tag{23}$$

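all configurations are counted with the right sign so that we have obtained the analogue of eq. (5) for a toroidal quadratic lattice. The evaluation of eq. (23) runs parallel to that of eq. (5), the only difference being the occurrence of the matrices

$$\begin{pmatrix}
0 & 1 & 0 & \cdots & 0 & -1 \\
-1 & 0 & 1 & \cdots & 0 & 0 \\
0 & -1 & 0 & \cdots & 0 & 0 \\
\vdots & & & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & 0 & 1 \\
1 & 0 & 0 & \cdots & -1 & 0
\end{pmatrix}
\quad \text{and} \quad
\begin{pmatrix}
0 & 1 & 0 & \cdots & 0 & 1 \\
-1 & 0 & 1 & \cdots & 0 & 0 \\
0 & -1 & 0 & \cdots & 0 & 0 \\
\vdots & & & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & 0 & 1 \\
-1 & 0 & 0 & \cdots & -1 & 0
\end{pmatrix}$$

instead of $Q$ (cf. eq. (9) and (10)). They can be diagonalized by a transformation with the matrices

$$V(l, l') = (1/n)^{1/2} \exp\{2 l l' \pi i/n\}, \qquad V^{-}(l, l') = (1/n)^{1/2} \exp\{l (2l' - 1) \pi i/n\}, \tag{24}$$

respectively. Proceeding as in § 3 we find

$$\begin{aligned}
Z^{(t)}_{mn}(z, z') = {} & -\tfrac12 \prod_{k=1}^{m/2} \prod_{l=1}^{n} 2\left[z^2 \sin^2\{2k\pi/m\} + z'^2 \sin^2\{2l\pi/n\}\right]^{1/2} \\
& + \tfrac12 \prod_{k=1}^{m/2} \prod_{l=1}^{n} 2\left[z^2 \sin^2\{2k\pi/m\} + z'^2 \sin^2\{(2l - 1)\pi/n\}\right]^{1/2} \\
& + \tfrac12 \prod_{k=1}^{m/2} \prod_{l=1}^{n} 2\left[z^2 \sin^2\{(2k - 1)\pi/m\} + z'^2 \sin^2\{2l\pi/n\}\right]^{1/2} \\
& + \tfrac12 \prod_{k=1}^{m/2} \prod_{l=1}^{n} 2\left[z^2 \sin^2\{(2k - 1)\pi/m\} + z'^2 \sin^2\{(2l - 1)\pi/n\}\right]^{1/2}.
\end{aligned} \tag{25}$$

The first term of the right-hand member is easily seen to be equal to zero.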
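The four-term expression (25) for the toroidal lattice can be evaluated in the same way as (13). In the sketch below the helper `double_product` and its two shift arguments are introduced for illustration only: a shift of 0 selects the arguments $2k\pi/m$ or $2l\pi/n$, a shift of 1 the arguments $(2k-1)\pi/m$ or $(2l-1)\pi/n$.

```python
import math


def z_torus(m, n, z=1.0, zp=1.0):
    """Evaluate (25) for even m in floating-point arithmetic."""
    if m % 2:
        raise ValueError("m has to be even")

    def double_product(shift_k, shift_l):
        product = 1.0
        for k in range(1, m // 2 + 1):
            sk = math.sin((2 * k - shift_k) * math.pi / m)
            for l in range(1, n + 1):
                sl = math.sin((2 * l - shift_l) * math.pi / n)
                product *= 2.0 * math.sqrt(z * z * sk * sk + zp * zp * sl * sl)
        return product

    return 0.5 * (-double_product(0, 0) + double_product(0, 1)
                  + double_product(1, 0) + double_product(1, 1))


print(round(z_torus(4, 4)))
```

For the $4 \times 4$ torus and $z = z' = 1$ the three non-vanishing terms are 72, 72 and 128, so that the total number of arrangements is 272.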


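If desired, this equation can be put into a form analogous to (15) with the aid of the following identities, valid for even $m$ and non-negative values of $u$:

$$\begin{aligned}
\prod_{k=1}^{m/2} 2\left[u^2 + \sin^2\{2k\pi/m\}\right]^{1/2} &= [u + (1 + u^2)^{1/2}]^{m/2} - [-u + (1 + u^2)^{1/2}]^{m/2}, \\
\prod_{k=1}^{m/2} 2\left[u^2 + \sin^2\{(2k - 1)\pi/m\}\right]^{1/2} &= [u + (1 + u^2)^{1/2}]^{m/2} + [-u + (1 + u^2)^{1/2}]^{m/2}.
\end{aligned} \tag{26}$$

For $m \to \infty$, i.e. for infinite cylindrical strips, the second and fourth term of eq. (25) can be shown to be dominant and equal, and we find

$$Z^{(t)}_{n}(z, z') = z^{n/2} \prod_{l=1}^{[\frac12 n]} \left[\zeta \sin\frac{(2l - 1)\pi}{n} + \left(1 + \zeta^2 \sin^2\frac{(2l - 1)\pi}{n}\right)^{1/2}\right], \tag{27}$$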

which is to be compared with eq. (16), valid for planar strips. In the limit $n \to \infty$ we finally obtain eq. (17) again. The values for the molecular freedom $\varphi_2$ in cylindrical strips of widths up to 8 calculated by Fowler and Rushbrooke 3) can be found as special cases from eq. (27). In table II we list the various values of $\varphi_2$ (exact and approximate) calculated for strips and for the infinite lattice.

TABLE II
The molecular freedom φ_2 for planar and toroidal quadratic ∞ × n lattices

  n     planar lattice    toroidal lattice    method
  1     1.000
  2     1.618             2.414
  3     1.551             1.686
  4     1.685             1.932
  5     1.658             1.754
  6     1.716             1.849
  7     1.701             1.772
  8     1.732             1.823
  ∞     1.471                                 Flory 17)
  ∞     1.687                                 Chang 18)
  ∞     1.736                                 Orr 4)
  ∞     1.63                                  Miller 19)
  ∞     1.8                                   Fowler and Rushbrooke 3)
  ∞     1.791 623 ...                         this paper

The values for n = 1, ..., 8 are from Fowler and Rushbrooke 3) and this paper (in those cases where ref. 3 gave no or less accurate results, those of the present method have been recorded). The values for n = ∞ other than that of this paper are approximate.
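The toroidal column of table II can be checked against eq. (27). The sketch below is restricted to even $n$, for which the derivation leading to (27) applies without further qualification (cf. the errata to this paper); the function names are illustrative.

```python
import math


def torus_strip_growth(n, z=1.0, zp=1.0):
    """Z_n^(t)(z, z') of (27) for even n (assumption made in this sketch)."""
    if n % 2:
        raise ValueError("this sketch covers even n only")
    zeta = zp / z
    product = z ** (n / 2)
    for l in range(1, n // 2 + 1):
        s = zeta * math.sin((2 * l - 1) * math.pi / n)
        product *= s + math.sqrt(1.0 + s * s)
    return product


def molecular_freedom_torus(n):
    """Arrangements per dimer for an infinite cylindrical strip of width n."""
    return torus_strip_growth(n) ** (2.0 / n)


for n in (2, 4, 6, 8):
    print(n, round(molecular_freedom_torus(n), 3))
```

For $n = 2$ the value is $1 + \sqrt2$ and for $n = 4$ it is $(2 + \sqrt3)^{1/2}$, which round to the entries 2.414 and 1.932 of the table.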

§ 5. Alternative approaches. We saw in § 2 that the number of dimer configurations on a lattice is equal to the number of alternating polygons (defined with respect to a standard configuration) on that lattice. The strong analogy with the Ising problem suggests that a solution of the present problem is possible which follows more closely the method of Kac and Ward referred to above. This can indeed be developed, and again one obtains the configuration generating function Zmn as the square root of a determinant.


We shall not go into details but instead mention another closely related method for calculating $Z_{mn}$. It follows from the considerations of § 2 that there is a one-to-one correspondence between alternating polygons on $Q_{mn}$ and closed paths on a corresponding oriented lattice $Q^{o}_{mn}$, sketched in fig. 3. In this lattice any bond may be traversed in only one direction: the $C_0$-bonds in the direction of increasing (decreasing) $i$ for odd (even) values of $j$, the non-$C_0$-bonds in the direction from the "head" of a $C_0$-bond to the "tail" of another $C_0$-bond.

Fig. 3. The oriented lattice.

We now form an $mn \times mn$ matrix $d$ whose rows and columns correspond to the sites of $Q^{o}_{mn}$. We define, for all combinations of indices which represent lattice points,

$$\begin{aligned}
d(i, j; i, j) &= 1, \\
d(i, j; i + 1, j) &= y && \text{for } j \text{ odd}, \\
d(i, j; i - 1, j) &= y && \text{for } j \text{ even}, \\
d(i, j; i, j + 1) = d(i, j; i, j - 1) &= +y' && \text{for } i \text{ even},\ j \text{ odd}, \\
d(i, j; i, j + 1) = d(i, j; i, j - 1) &= -y' && \text{for } i \text{ odd},\ j \text{ even}, \\
d(i, j; i', j') &= 0 && \text{otherwise,}
\end{aligned} \tag{28}$$

i.e. we attach weight factors $y$ and $\pm y'$ to horizontal and vertical oriented bonds, respectively, and a factor 1 to each lattice site on its own. Thus in $\det d$ each term will correspond to a configuration of closed paths on $Q^{o}_{mn}$ (cf. ref. 9). A term representing a permutation consisting of $\nu$ cycles (each permuting an even number of points) occurs with a factor $(-1)^{\nu}$; since in a determinant, as contrasted with a Pfaffian, there is no restriction on the order of the indices of the elements, the difficulty mentioned in § 2 does not arise here. The argument of § 2 shows again that the difference in sign between permutations consisting of odd and even numbers of cycles is compensated for by the negative signs attributed to the vertical bonds between sites with odd values of $i$. So $\det d$ is just the path generating function $H_{mn}(y, y')$ for the lattice $Q^{o}_{mn}$:

$$\det d = \sum_{M} \sum_{M'} h(M, M')\, y^{M} y'^{M'} = H_{mn}(y, y'),$$

$h(M, M')$ being the number of ways of combining $M$ horizontal and $M'$ vertical steps to closed paths. According to § 2 such a combination may be considered as representing a configuration of $M'$ vertical dimers, and hence $\frac12 mn - M'$ horizontal dimers. It follows that

$$Z_{mn}(z, z') = \sum_{M'} g(\tfrac12 mn - M', M')\, z^{mn/2 - M'} z'^{M'} = z^{mn/2} \sum_{M'} \sum_{M} h(M, M')\, \zeta^{M'} = z^{mn/2}\, [\det d]_{y=1,\ y'=\zeta}. \tag{29}$$
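Relation (29) can be tried out on small lattices. The sketch below builds the matrix $d$ for the oriented lattice, with the orientation and sign rules read from the description of fig. 3 and from (28) (horizontal bonds directed towards increasing $i$ in odd rows and decreasing $i$ in even rows; vertical bonds leaving sites with $i$ even, $j$ odd with weight $+y'$ and sites with $i$ odd, $j$ even with weight $-y'$), and evaluates the determinant exactly by Gaussian elimination over the rationals. Both the reading of (28) and the function names are assumptions of this sketch.

```python
from fractions import Fraction


def oriented_matrix(m, n, y=1, yp=1):
    """Matrix d of the oriented lattice, as read from (28) (m even)."""
    size = m * n
    d = [[0] * size for _ in range(size)]

    def index(i, j):
        return (j - 1) * m + (i - 1)

    for j in range(1, n + 1):
        for i in range(1, m + 1):
            p = index(i, j)
            d[p][p] = 1                              # factor 1 for the site itself
            if j % 2 == 1 and i < m:
                d[p][index(i + 1, j)] = y            # odd rows: increasing i
            if j % 2 == 0 and i > 1:
                d[p][index(i - 1, j)] = y            # even rows: decreasing i
            if i % 2 == 0 and j % 2 == 1:
                weight = yp
            elif i % 2 == 1 and j % 2 == 0:
                weight = -yp
            else:
                weight = 0
            if weight:
                if j < n:
                    d[p][index(i, j + 1)] = weight
                if j > 1:
                    d[p][index(i, j - 1)] = weight
    return d


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    size = len(a)
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            if factor:
                for c in range(col, size):
                    a[r][c] -= factor * a[col][c]
    return det


for m, n in [(2, 2), (2, 3), (4, 2), (4, 4)]:
    print(m, n, determinant(oriented_matrix(m, n)))
```

For $y = 1$, $y' = \zeta = 1$ the determinant itself, and not its square root, reproduces the number of dimer arrangements, e.g. $1 + y^2 y'^2 = 2$ for the $2 \times 2$ lattice.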

This is confirmed by an evaluation of $\det d$. For an infinite lattice e.g. one finds

$$H(y, y') = \lim_{m,n \to \infty} \{H_{mn}(y, y')\}^{1/mn} = \exp\left\{\pi^{-2} \int_0^{\pi/2} d\omega \int_0^{\pi/2} d\omega' \ln\left[(1 - y^2)^2 + 4y^2 \cos^2\omega + 4y'^2 \cos^2\omega'\right]\right\}; \tag{30}$$

multiplication by $z^{1/2}$ and substitution of $y = 1$, $y' = \zeta$ immediately lead to (17). This result is noteworthy in that $Z_{mn}$ itself is expressed as a determinant rather than its square. The origin of this possibility lies in the fact that the quadratic lattice can, in the well-known way, be divided into two sublattices, that of the "odd" and that of the "even" sites (or, in terms of graph theory, that it is "dichromatic" 8)); this ensures the possibility of working with the oriented lattice $Q^{o}_{mn}$. The algebraic root lies in the possibility of writing certain Pfaffians as a determinant (cf. Muir 14), vol. IV p. 263). For the triangular lattice, on the other hand, the method of this section cannot be used, whereas that of § 2 still works, as we hope to show in the envisaged sequel to this paper.

§ 6. The entropy of a system of dimers on a rectangular lattice. Consider a planar rectangular $m \times n$ lattice, i.e. a lattice whose horizontal and vertical bonds differ in length. Let the lattice be covered entirely by two sorts of dimers: $N_2$ dimers which fit only into horizontal positions (i.e. which can occupy two sites $(i, j)$ and $(i + 1, j)$), and $N_2'$ dimers fitting only into vertical positions. If the energy of mixing of these dimers is zero, the configurational free energy of the mixture is completely determined by the entropy of mixing, i.e. by the combinatorial factor $g(N_2, N_2')$. This quantity can be calculated from the configuration generating function with the aid of Cauchy's formula (31), where the path of integration encircles the origin but excludes the singularities of $Z_{mn}(1, \zeta)$. We shall introduce $x = N_2/\frac12 mn$ and $x' = N_2'/\frac12 mn = 1 - x$. For large $m$ and $n$ we can evaluate (31) by the saddle-point method. We find

$$\lim_{m,n \to \infty} \{g(\tfrac12 mn x, \tfrac12 mn x')\}^{1/mn} = \gamma(x), \tag{32}$$

where $\gamma(x)$ is given by the following two equivalent expressions:

$$\ln \gamma(x) = \pi^{-1} A_2(\tan \tfrac12 \pi x) - \tfrac12 x \ln(\tan \tfrac12 \pi x) = \pi^{-1} A_2(\tan \tfrac12 \pi x') - \tfrac12 x' \ln(\tan \tfrac12 \pi x'). \tag{33}$$
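The two members of (33) can be evaluated numerically. In the sketch below $A_2(x)$ is computed as $\int_0^x dt\, t^{-1} \arctan t$, which is the form in which this function enters (18) and (19); the midpoint rule and the function names are choices made for this illustration.

```python
import math


def a2(x, steps=20000):
    """A_2(x) = integral from 0 to x of arctan(t)/t dt, by the midpoint rule."""
    h = x / steps
    return h * math.fsum(math.atan((s + 0.5) * h) / ((s + 0.5) * h)
                         for s in range(steps))


def log_gamma_x(x):
    """ln gamma(x) from the first member of (33), for 0 < x < 1."""
    t = math.tan(0.5 * math.pi * x)
    return a2(t) / math.pi - 0.5 * x * math.log(t)


def reduced_entropy(x):
    """Reduced entropy per dimer of the interlocking mixture, 2 ln gamma(x)."""
    return 2.0 * log_gamma_x(x)


def random_mixing_entropy(x):
    """Entropy of mixing of an ideal mixture, -x ln x - x' ln x'."""
    return -x * math.log(x) - (1.0 - x) * math.log(1.0 - x)


for x in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(x, round(reduced_entropy(x), 4), round(random_mixing_entropy(x), 4))
```

At $x = \frac12$ the argument of $A_2$ is 1 and the reduced entropy equals $2G/\pi$, to be compared with $\ln 2$ for the random mixture; the equality of the values at $x$ and $1 - x$ is a numerical check on the equivalence of the two members of (33).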

In fig. 4 the reduced entropy per dimer of this "interlocking mixture", $\sigma = S/\frac12 mnk = 2 \ln \gamma(x)$, which corresponds to Flory's "entropy of disorientation" 17), is plotted against $x$. For comparison we have also plotted the entropy of mixing for an "ideal" or "random mixture", i.e. of a system where to each single lattice site a horizontal dimer (available fraction: $x$) or a vertical dimer (fraction: $x' = 1 - x$) is attached in a random manner, without paying attention to possible hindrances. This quantity is equal to $-x \ln x - x' \ln x'$. The difference between the two entropies is a measure of what might be called the order of interlocking.

Fig. 4. The reduced entropy per dimer of a mixture of horizontal and vertical dimers as a function of $x$ (fraction of horizontal dimers). Dashed curve: random mixture; full curve: interlocking mixture.

§ 7. Concluding remarks. We have seen that the generating function $Z_{mn}(z, z')$ for dimer configurations on a planar quadratic lattice can be written in the form of a Pfaffian. The corresponding skew-symmetric matrix could by a similarity transformation be brought into a nearly diagonal form, and its determinant, which is the square of the Pfaffian, evaluated. The asymptotic behaviour of $Z_{mn}(z, z')$ for large lattices was found to be described by equation (17), which can also be written as

$$Z(z, z') = \exp\left\{(2\pi)^{-2} \int_0^{\pi} d\omega \int_0^{\pi} d\omega' \ln 2[z^2 + z'^2 + z^2 \cos\omega + z'^2 \cos\omega']\right\}. \tag{34}$$

The same result was found when periodic boundary conditions were introduced. The effect of boundary conditions is, however, not entirely trivial and will be discussed in more detail in a subsequent paper. The right-hand member of (34) has a remarkable resemblance to Onsager's expression for the partition function per spin of a rectangular Ising system 9) 10) *). A more detailed examination reveals that $Z(z, z')$ as a function of $\zeta = z'/z$ has no singular points on the real positive axis; it corresponds to Onsager's partition function at the critical point (or critical line, if the strengths of the horizontal and vertical interactions vary with respect to each other). This fact might tempt one to conjecture that the more general problem of monomer-dimer mixtures would be the analogue of the Ising problem at arbitrary temperatures, and hence rigorously solvable. However, this is not the case. It is easy to see that the true analogue of the partition function of an Ising system is the generating function $H_{mn}(y, y') = \det d$ for closed paths on the lattice $Q^{o}_{mn}$; for an infinite lattice, the function $H(y, y')$ given by (30) has a singularity at $y = 1$, which is just the value of interest for the dimer problem.

Fig. 5. The construction of a monomer-dimer configuration from the standard configuration by (a) the omission of bonds and (b) the shift of dimers along chains of bonds.

A monomer-dimer mixture, on the other hand, has more resemblance to an Ising ferromagnet in an external field 9): the various configurations can be derived from the standard configuration $C_0$ by the omission of a number of bonds (fig. 5a) followed by the shift of dimers along certain chains of bonds, which need no longer be closed (fig. 5b). The contribution to the combinatorial factor from open chains increases with the ratio of the activity of two monomers to that of a dimer, whereas in the Ising case it increases with the ratio of the activity of +spins to that of -spins, i.e. with the magnetic field. Since this general Ising problem has as yet resisted all attempts at a rigorous solution one suspects that the monomer-dimer problem will also be very hard to solve. On the other hand, a better insight into the latter problem might throw some new light on the former.

*) In this connection it may be remarked that the algebraic introduction which Hurst and Green 13) need for re-deriving Onsager's results by the Pfaffian method can be avoided by deriving them along the lines of the present paper.

Note added in proof. After the submission of this paper the author received preprints of a short communication by H. N. V. Temperley and M. E. Fisher (to be published in Phil. Mag.) and of an article by M. E. Fisher (to be published in Phys. Rev.), both on the statistics of dimers on a quadratic lattice. Following the lines used by Hurst and Green 13) in the discussion of the Ising problem, the authors obtain results identical to those of the present paper. They discuss in more detail the asymptotic behaviour of these results for large lattices, and in addition make some remarks on monomer-dimer mixtures. On the other hand, they restrict themselves to planar quadratic lattices, and their method, although formally equivalent to that developed above, seems less suited to generalization to other two-dimensional lattices. We hope to comment upon these papers in more detail later on.

Received 28-6-61

REFERENCES

1) Guggenheim, E. A., Mixtures, Clarendon Press, Oxford (1952) Chapter X.
2) Cohen, E. G. D., De Boer, J. and Salsburg, Z. W., Physica 21 (1955) 137.
3) Fowler, R. H. and Rushbrooke, G. S., Trans. Faraday Soc. 33 (1937) 1272.
4) Orr, W. J. C., Trans. Faraday Soc. 40 (1944) 306.
5) Rushbrooke, G. S., Scoins, H. I. and Wakefield, A. J., Discussions Faraday Soc. 15 (1953) 57.
6) Green, H. S. and Leipnik, R., Rev. mod. Phys. 32 (1960) 129.
7) Fisher, M. E. and Temperley, H. N. V., Rev. mod. Phys. 32 (1960) 1029.
   Katsura, S. and Inawashiro, S., Rev. mod. Phys. 32 (1960) 1031.
8) Berge, C., Théorie des graphes et ses applications, Dunod, Paris (1958) 175, 30.
9) Newell, G. F. and Montroll, E. W., Rev. mod. Phys. 25 (1953) 352.
   Domb, C., Adv. in Phys. 9 (1960) 149, in particular § 3.
10) Onsager, L., Phys. Rev. 65 (1944) 117.
11) Kac, M. and Ward, J. C., Phys. Rev. 88 (1952) 1332.
12) Sherman, S., J. math. Phys. 1 (1960) 202.
13) Hurst, C. A. and Green, H. S., J. chem. Phys. 33 (1960) 1059.
14) Muir, T., Contributions to the History of Determinants, London (1930).
    Scott, R. F. and Mathews, G. B., Theory of Determinants, Cambridge University Press, New York (1904) 93.
15) Caianiello, E. R. and Fubini, S., Nuovo Cimento 9 (1952) 1218.
16) Gröbner, W. and Hofreiter, N., Integraltafel II, Springer Verlag, Wien & Innsbruck (1950) 72.
17) Flory, P. J., J. chem. Phys. 10 (1942) 51.
18) Chang, T. S., Proc. roy. Soc., London, A 169 (1939) 512.
19) Miller, A. R., Proc. Camb. phil. Soc. 38 (1942) 109.
20) Potts, R. B. and Ward, J. C., Progr. theor. Phys. 13 (1955) 38.


Errata to "The statistics of dimers on a lattice. I. The number of dimer arrangements on a quadratic lattice" by P. W. Kasteleyn, Physica 27 (1961) 1209-1225.

P. 1220, line 6: after "dominant and equal" a footnote sign should be added, referring to the following footnote, to be placed at the bottom of the page: provided n is even. For odd n the second term vanishes while the third and fourth terms are equal, and a factor 2 should be inserted into the right-hand side of eq. (27).

P. 1222, eq. (30): the last term in the right-hand side should read 4y2y,2cos 2w' and not 4y,2cos2w'.


LONGEST INCREASING AND DECREASING SUBSEQUENCES

C. SCHENSTED

This paper deals with finite sequences of integers. Typical of the problems we shall treat is the determination of the number of sequences of length n, consisting of the integers 1, 2, ..., m, which have a longest increasing subsequence of length α. Throughout the first part of the paper we will deal only with sequences in which no numbers are repeated. In the second part we will extend the results to include the possibility of repetition. Our results will be stated in terms of standard Young tableaux.

PART I

Definition. A standard Young tableau of order n is an arrangement of n distinct natural numbers in rows and columns so that the numbers in each row and in each column form increasing sequences, and so that there is an element of each row (column) in the first column (row) and there are no gaps between numbers.

Example.

    2 4 7
    3 8        (order = 7)
    5 9

Definition. The shape of a standard tableau is an arrangement of squares with one square replacing each number in the standard tableau.

Example. The shape of

    2 4 7
    3 8
    5 9

is as shown in Figure 1.

FIG. 1.

Received June 23, 1959; in revised form August 29, 1960. This work was conducted by Project MICHIGAN under Department of the Army Contract (DA-36-069-SC-78801), administered by the U.S. Army Signal Corps. The author would like to thank W. Richardson, G. Rabson, T. Curtz, I. Schensted, R. Thrall, and J. Riordan for illuminating discussions concerning this problem, and E. Graves for calculations which contributed to the solution. The problem originated as one aspect of a paper on sorting theory by R. Baer and P. Brock, Natural sorting, The University of Michigan, Willow Run Laboratories, Project MICHIGAN Report 2144-278-T, submitted for publication in Soc. Ind. App. Math.


One reason that standard tableaux are so useful to us is that it is easy to compute the number of standard tableaux of a given shape either by means of a simple recurrence relation, or by means of the following elegant result of Frame, Robinson, and Thrall (1).

THEOREM. The number of standard tableaux of a given shape containing the integers 1, 2, ..., n is

\[ \frac{n!}{\prod_{j=1}^{n} h_j}. \tag{1} \]

Here the $h_j$ are the hook lengths, that is, the number of elements counting from the bottom of a column to a given element and then to the right end of the row.

Example. To compute the number of standard tableaux of the shape shown in Figure 2(a), we first find the hook lengths, which are shown in Figure 2(b). Then we find that the number of standard tableaux of this shape is

\[ \frac{9!}{6 \cdot 5 \cdot 3 \cdot 1 \cdot 4 \cdot 3 \cdot 1 \cdot 2 \cdot 1} = 168. \]

FIG. 2(a).
FIG. 2(b).
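The hook-length computation can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the shape (4, 3, 2) is an assumption inferred from the nine hook lengths 6, 5, 3, 1, 4, 3, 1, 2, 1 in the example, since the figure itself is not reproduced here.

```python
from math import factorial

def hook_lengths(shape):
    """Hook length of every cell; `shape` lists row lengths from top to bottom."""
    if not shape:
        return []
    # Height of each column of the diagram.
    col_heights = [sum(1 for r in shape if r > j) for j in range(shape[0])]
    hooks = []
    for i, row_len in enumerate(shape):
        # Cells to the right + cells below + the cell itself.
        hooks.append([(row_len - j - 1) + (col_heights[j] - i - 1) + 1
                      for j in range(row_len)])
    return hooks

def count_standard_tableaux(shape):
    """Expression (1): n! divided by the product of all hook lengths."""
    product = 1
    for row in hook_lengths(shape):
        for h in row:
            product *= h
    return factorial(sum(shape)) // product

print(hook_lengths((4, 3, 2)))
print(count_standard_tableaux((4, 3, 2)))
```

For the assumed shape this reproduces the hook lengths of the example row by row and the count 168.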

Definition. S ← x is defined as the array obtained from the standard tableau, S, by means of the following steps:
(i) Insert x in the first row of S either by displacing the smallest number which is larger than x, or if no number is larger than x, by adding x at the end of the first row.
(ii) If x displaced a number from the first row, then insert this number in the second row either by displacing the smallest number which is larger than it or by adding it at the end of the second row.
(iii) Repeat this process row by row until some number is added at the end of a row.
In the above steps "adding at the end of the row" is interpreted as putting in the first column in the given row if the row does not yet have any entries in it. We define x → S similarly except that we replace the word "row" by the word "column" throughout.
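Steps (i) to (iii) can be sketched as follows. The representation of a tableau as a list of rows and the names `row_insert`, `column_insert` and `transpose` are assumptions made for illustration. Column insertion is obtained by transposing, inserting by rows, and transposing back, which is one way of replacing "row" by "column" throughout.

```python
from bisect import bisect_right

def row_insert(tableau, x):
    """Return S <- x for a tableau S of distinct numbers, given as a list of rows."""
    rows = [list(r) for r in tableau]
    for row in rows:
        k = bisect_right(row, x)      # index of the smallest entry larger than x
        if k == len(row):             # nothing larger: add x at the end of the row
            row.append(x)
            return rows
        row[k], x = x, row[k]         # displace that entry and carry it downwards
    rows.append([x])                  # start a new row in the first column
    return rows

def transpose(tableau):
    """Interchange rows and columns."""
    if not tableau:
        return []
    return [[row[j] for row in tableau if len(row) > j]
            for j in range(len(tableau[0]))]

def column_insert(x, tableau):
    """Return x -> S: the same procedure with columns in place of rows."""
    return transpose(row_insert(transpose(tableau), x))

S = [[2, 4, 7], [3, 8], [5, 9]]
print(row_insert(S, 6))
print(column_insert(6, S))
```

With S as above, inserting 6 by rows displaces 7, then 8, then 9, while inserting 6 by columns simply adds 6 at the bottom of the first column.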


Example. If

    S = 2 4 7
        3 8
        5 9

then

    S ← 6 = 2 4 6        and        6 → S = 2 4 7
            3 7                             3 8
            5 8                             5 9
            9                               6

LEMMA 1. S ← x and x → S are standard tableaux.

Proof. Since the proofs for S ← x and x → S are similar we consider only S ← x. First we note that if two consecutive rows of S have the same length, and if a number is displaced from the first of these two rows, then it will either displace the number which was standing under it or else some number to its left, and thus will not be added at the end of the row. Thus a row cannot be made longer than the row above it and S ← x cannot fail to be a standard tableau on account of its shape. Thus we have only to prove that the numbers in each row and column still form increasing sequences. A number is inserted into a row in such a place that the number to its left (if any) is smaller, and the number to its right (if any) is larger. Thus the numbers in c
Definition. The P-symbol corresponding to a sequence $x_1 x_2 \ldots x_n$ of distinct integers is the standard tableau $(\ldots((x_1 \leftarrow x_2) \leftarrow x_3) \ldots \leftarrow x_n)$. The Q-symbol corresponding to the same sequence is the array which is obtained by putting k in the square which is added to the shape of the P-symbol when $x_k$ is inserted in the P-symbol.


Examples.

    Sequence   3    35    354    3549    35498    354982    3549827

    P-symbol   3    3 5   3 4    3 4 9   3 4 8    2 4 8     2 4 7
                          5      5       5 9      3 9       3 8
                                                  5         5 9

    Q-symbol   1    1 2   1 2    1 2 4   1 2 4    1 2 4     1 2 4
                          3      3       3 5      3 5       3 5
                                                  6         6 7

LEMMA 2. The Q-symbol corresponding to an arbitrary sequence is a standard tableau.

Proof. Since the Q-symbol has the same shape as the P-symbol, and since the P-symbol is a standard tableau, the shape of the Q-symbol is legitimate. Each digit added to the Q-symbol is larger than all of the previous digits, and in particular is larger than the digits above it and to its left. Hence the numbers in each row and column form increasing sequences, and the lemma is established.

LEMMA 3. There is a one-to-one correspondence between sequences made with the n distinct integers $x_1, x_2, \ldots, x_n$ and ordered pairs of standard tableaux of the same shape, the first containing $x_1, x_2, \ldots, x_n$ and the second containing 1, 2, ..., n.

Proof. Given a sequence, the P-symbol and Q-symbol are uniquely determined standard tableaux of the type mentioned in the lemma. Given a pair of standard tableaux of the appropriate types we can find the unique sequence which could have them for a P-symbol and Q-symbol as follows: The position of the largest number in the second tells us which number was added on to a row of the first without displacing another number when the last digit was inserted. This must have been displaced from the previous row by the largest number which is smaller than it (there always will be at least one number smaller than it in the preceding row since the one directly above it is smaller). This in turn must have been displaced from the next row up. Finally we get to the first row and discover what number was inserted into it. This is the last digit of the sequence. We now also know what the P-symbol and Q-symbol were before the last digit was inserted. Thus we can repeat the procedure to find the next to the last digit of the sequence. This proves the lemma.

Note. Since there are n! possible sequences of $x_1, x_2, \ldots, x_n$, Lemma 3 shows that there are n! ordered pairs of standard tableaux of order n such that the shapes of tableaux in each pair are the same, but the shapes of tableaux in different pairs are not necessarily the same. This fact is already known (2). Of course, the number of ordered pairs of standard tableaux of a given shape is equal to the square of the number of standard tableaux of that shape, which is given in turn by Expression (1).
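The correspondence of Lemma 3 can be sketched in both directions. The functions below are an illustrative reading of the definitions and of the proof, not code from the paper; the names `symbols`, `sequence_from_symbols` and `check_bijection` are hypothetical, and tableaux are assumed to be stored as lists of rows.

```python
from bisect import bisect_left, bisect_right
from itertools import permutations

def symbols(seq):
    """Return the (P-symbol, Q-symbol) of a sequence of distinct integers."""
    P, Q = [], []
    for k, x in enumerate(seq, start=1):
        for i, row in enumerate(P):
            j = bisect_right(row, x)
            if j == len(row):          # x is added at the end of row i
                row.append(x)
                Q[i].append(k)         # record when this square was created
                break
            row[j], x = x, row[j]      # displace and continue with the next row
        else:                          # every row displaced something: new row
            P.append([x])
            Q.append([k])
    return P, Q

def sequence_from_symbols(P, Q):
    """Undo the insertions one at a time, as in the proof of Lemma 3."""
    P = [list(r) for r in P]
    Q = [list(r) for r in Q]
    out = []
    for k in range(sum(len(r) for r in Q), 0, -1):
        i = next(i for i, row in enumerate(Q) if row and row[-1] == k)
        Q[i].pop()
        x = P[i].pop()                 # the number added without displacing
        if not P[i]:                   # an emptied row is always the last row
            P.pop()
            Q.pop()
        for r in range(i - 1, -1, -1):
            j = bisect_left(P[r], x) - 1   # largest entry smaller than x
            P[r][j], x = x, P[r][j]
        out.append(x)
    return out[::-1]

def check_bijection(n):
    """True if every permutation of 1..n is recovered from its pair of symbols."""
    return all(sequence_from_symbols(*symbols(p)) == list(p)
               for p in permutations(range(1, n + 1)))

print(symbols([3, 5, 4, 9, 8, 2, 7]))
print(check_bijection(5))
```

Applied to the sequence 3549827 this gives the last column of the table of examples, and the round trip succeeds for all 120 permutations of 1, ..., 5.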


Definition. The jth basic subsequence of a given sequence consists of the digits which are inserted into the jth place in the first row of the P-symbol.

LEMMA 4. Each basic subsequence is a decreasing subsequence.

Proof. Each number in the jth basic subsequence, on insertion in the first row, displaces the previous member of the jth basic subsequence, which must therefore be larger than the present member.

LEMMA 5. Given any member of the jth basic subsequence, we can find a member of the (j - 1)st basic subsequence which is smaller and which occurs further to the left in the given sequence.

Proof. The number in the (j - 1)st place in the first row, when the given member of the jth basic subsequence is inserted, is such a member of the (j - 1)st basic subsequence.

THEOREM 1. The number of columns in the P-symbol (or the Q-symbol) is equal to the length of the longest increasing subsequence of the corresponding sequence.

Proof. The number of columns is the same as the number of basic subsequences. By Lemma 4 there can be at most one member of each basic subsequence in any increasing subsequence. By Lemma 5 we can construct an increasing subsequence with one element from each basic subsequence, Q.E.D.

Note. The proof shows us how to actually obtain an increasing subsequence of maximal length.
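The construction in the proof of Theorem 1 can be sketched as follows. Only the first row of the P-symbol is tracked; for each inserted number the code remembers the member of the previous basic subsequence supplied by Lemma 5. The function names are hypothetical, and the brute-force comparison is included only as a sanity check for short sequences of distinct integers.

```python
from bisect import bisect_right
from itertools import combinations, permutations

def longest_increasing_subsequence(seq):
    """Build an increasing subsequence with one member per basic subsequence."""
    first_row = []              # current first row of the P-symbol
    first_row_pos = []          # position in seq of each first-row entry
    prev = [None] * len(seq)    # Lemma 5: smaller, earlier member of class j-1
    for t, x in enumerate(seq):
        j = bisect_right(first_row, x)          # x joins basic subsequence j
        prev[t] = first_row_pos[j - 1] if j > 0 else None
        if j == len(first_row):
            first_row.append(x)
            first_row_pos.append(t)
        else:
            first_row[j] = x
            first_row_pos[j] = t
    result = []
    t = first_row_pos[-1] if first_row_pos else None
    while t is not None:        # walk back through the remembered members
        result.append(seq[t])
        t = prev[t]
    return result[::-1]

def brute_force_length(seq):
    """Length of the longest increasing subsequence by exhaustive search."""
    return max(len(c) for r in range(len(seq) + 1)
               for c in combinations(seq, r)
               if all(a < b for a, b in zip(c, c[1:])))

def check_theorem_1(n):
    return all(len(longest_increasing_subsequence(p)) == brute_force_length(p)
               for p in permutations(range(1, n + 1)))

print(longest_increasing_subsequence([3, 5, 4, 9, 8, 2, 7]))
```

For the sequence 3549827 the sketch returns 3 4 7, whose length equals the number of columns of the P-symbol computed earlier.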

LEMMA 6. (x → S) ← y = x → (S ← y).

Proof. Suppose first that, of all the digits in x, y, and S, the largest is y. We represent S schematically by Figure 3. There are two cases of interest. The square added to the shape of S in x → S is in the first row, or it is not. We represent x → S schematically in these two cases by Figure 4(a) and 4(b) respectively, where x' is the number added to the end of some column without displacing another number when we form x → S. It is easily verified that in the first case the final result is as shown in Figure 5(a) and in the second case the result is that of Figure 5(b).

FIG. 3.
FIG. 4(a).
FIG. 4(b).
FIG. 5(a). (x → S) ← y = x → (S ← y), first case.
FIG. 5(b). (x → S) ← y = x → (S ← y), second case.

This proves the lemma if y is the largest number involved, and the proof is similar if x is the largest number involved.

Suppose now that, of all the digits in x, y, and S, the largest is N, and that N is in S. In this case we use induction. The lemma can be easily verified by direct calculation if S is of order 0, 1, or 2. We assume the lemma true for S of order n, and prove that it is then true for S of order n + 1. Let us suppose, then, that S is of order n + 1. Now, since N is the largest number in S, we see that N is at the end of whatever row it is in, and also at the end of its column. Thus, if we remove N from S we will obtain a new standard tableau, S', of order n. Now since N is larger than any of the other numbers, it can never displace any of them, and hence the presence or absence of N cannot have any influence on the position of the other numbers. Thus (x → S) ← y will be the same as (x → S') ← y except that N is added somewhere, and x → (S ← y) will be the same as x → (S' ← y) except for the addition of N. However, since S' is of order n, we have by assumption

    (x → S') ← y = x → (S' ← y).

Thus we have only to prove that N occupies the same position in (x → S) ← y and x → (S ← y) to prove the lemma. The truth of this can be easily verified for each of the possible cases which can arise as to the relative locations of N, x', and y'. Here x' (y') is the number which is added to some column (row) without displacing another number when we form x → S' (S' ← y). In making these verifications it is necessary to keep the following facts in mind.

If x' and y' do not fall into the same square, then we represent S', x → S', and S' ← y schematically by Figure 6(a), 6(b), and 6(c) respectively. The shape of (x → S') ← y must have a square added to the shape of x → S', and the shape of x → (S' ← y) must have a square added to the shape of S' ← y. By assumption (x → S') ← y = x → (S' ← y) so that the shape of (x → S') ← y and x → (S' ← y) must be Figure 7.

FIG. 6(a).
FIG. 6(b).
FIG. 6(c).
FIG. 7.

If x' (in x → S') and y' (in S' ← y) occupy the same position then we schematically represent S', x → S', and S' ← y by Figure 8(a), 8(b), and 8(c) respectively. Here the shaded parts of x → S' and S' ← y are the regions where numbers could have been displaced. Now let us suppose that y' > x'. Then when we insert y into x → S' the same numbers will be displaced in each row as were displaced when we inserted y into S', until we displace y'.

FIG. 8(a).
FIG. 8(b).
FIG. 8(c).


In S' ← y we would have put y' where x' is, but y' > x', thus y' will be added at the end of the row containing x', and the shape of (x → S') ← y (and hence of x → (S' ← y)) will be Figure 9. If we had had x' > y', then the shape of (x → S') ← y and x → (S' ← y) would have been Figure 10. Thus, if we know the shapes of x → S' and S' ← y, and if we know whether x' > y' or x' < y', then we know the shape of (x → S') ← y and x → (S' ← y).

FIG. 9.
FIG. 10.

Now we can return to the problem of showing that N has the same position in (x → S) ← y and x → (S ← y). As we mentioned there are several special cases. We will consider only three of these as the others go in the same way. First suppose that the position of N in S does not coincide with either the position of x' in x → S' or the position of y' in S' ← y. Then N will never be displaced and it will have the same position in (x → S) ← y and x → (S ← y) as it does in S. Next suppose that the position of N in S coincides with the position of x' in x → S', and that the position of y' in S' ← y lies to the left of this. Then we have schematically Figure 11. Finally suppose that the position of N in S coincides with the position of x' in x → S', and that the position of y' in S' ← y lies one column to the right of this. Then schematically we have Figure 12. Proceeding similarly we can verify all of the other special cases, and hence the validity of Lemma 6.

LEMMA 7. If one sequence is a second sequence written backwards, then the P-symbol of the first is obtained from the P-symbol of the second by interchanging rows and columns.

FIG. 11. (Schematic forms of S, x → S', S' ← y, x → S, S ← y, and (x → S) ← y = x → (S ← y).)
FIG. 12. (Schematic forms of x → S', S' ← y, x → S, S ← y, and (x → S) ← y.)

Proof. First we note that x → y = x ← y, since if x < y they are both the row x y, and if x > y they are both the column with y above x. Now we define

\[ P(x_1, x_2, \ldots, x_n) \equiv (\ldots((x_1 \leftarrow x_2) \leftarrow x_3) \ldots \leftarrow x_n) \]

and

\[ \bar{P}(x_1, x_2, \ldots, x_n) \equiv (x_1 \rightarrow \ldots (x_{n-2} \rightarrow (x_{n-1} \rightarrow x_n)) \ldots). \]

Next we assume that $P(x_1, x_2, \ldots, x_{n-1}) = \bar{P}(x_1, x_2, \ldots, x_{n-1})$ and that $P(x_1, x_2, \ldots, x_n) = \bar{P}(x_1, x_2, \ldots, x_n)$ and prove that $P(x_1, x_2, \ldots, x_n, x_{n+1}) = \bar{P}(x_1, x_2, \ldots, x_n, x_{n+1})$. (We have just shown that $P(x_1, x_2) = x_1 \leftarrow x_2 = x_1 \rightarrow x_2 = \bar{P}(x_1, x_2)$; furthermore $P(x_1) = x_1 = \bar{P}(x_1)$.) We have

\begin{align*}
P(x_1, x_2, \ldots, x_n, x_{n+1}) &= P(x_1, x_2, \ldots, x_n) \leftarrow x_{n+1} \\
&= \bar{P}(x_1, x_2, \ldots, x_n) \leftarrow x_{n+1} \\
&= [x_1 \rightarrow \bar{P}(x_2, \ldots, x_n)] \leftarrow x_{n+1} \\
&= x_1 \rightarrow [\bar{P}(x_2, \ldots, x_n) \leftarrow x_{n+1}] \\
&= x_1 \rightarrow [P(x_2, \ldots, x_n) \leftarrow x_{n+1}] \\
&= x_1 \rightarrow P(x_2, \ldots, x_n, x_{n+1}) \\
&= x_1 \rightarrow \bar{P}(x_2, \ldots, x_n, x_{n+1}) \\
&= \bar{P}(x_1, x_2, \ldots, x_n, x_{n+1}).
\end{align*}

Of these lines, the second, fifth, and seventh follow by assumption, and the fourth from Lemma 6. Now $P(x_1, \ldots, x_n)$ is the P-symbol for the sequence $x_1, x_2, \ldots, x_n$, while $\bar{P}(x_1, x_2, \ldots, x_n)$ is the P-symbol for the sequence $x_n, \ldots, x_2, x_1$ with rows and columns interchanged. Hence the lemma follows.

Note. It must not be assumed that Lemma 7 holds for Q-symbols.

THEOREM 2. The number of rows in the P-symbol (or the Q-symbol) is equal to the length of the longest decreasing subsequence of the corresponding sequence.

Proof. This follows immediately from Theorem 1 and Lemma 7, since writing a sequence backwards changes increasing subsequences into decreasing subsequences.

THEOREM 3. The number of sequences consisting of the distinct numbers $x_1, x_2, \ldots, x_n$, and having a longest increasing subsequence of length α and a longest decreasing subsequence of length β, is the sum of the squares of the numbers of standard tableaux with shapes having α columns and β rows.


Proof. Follows immediately from Lemma 3 and Theorems 1 and 2 (see also the note to Lemma 3). Example. To find the number of permutations of 1,2,3, ... ,25 having a longest decreasing subsequence of length three and a longest increasing subsequence of length 21 we note that the only allowed shapes with 25 squares, 21 columns, and 3 rows are those of Figure 13.

FIG. 13.

By the Frame-Robinson-Thrall theorem, the corresponding numbers of standard tableaux are 21,000 and 31,350 respectively. Thus the desired number of permutations is

\[ 21{,}000^2 + 31{,}350^2 = 1{,}423{,}822{,}500. \]
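Theorem 3 turns the count into a sum over shapes, which can be sketched directly. The helper names below are hypothetical; shapes are represented as tuples of row lengths, and the number of tableaux of each shape is taken from Expression (1).

```python
from math import factorial

def count_standard_tableaux(shape):
    """Expression (1): n! divided by the product of the hook lengths."""
    col_heights = [sum(1 for r in shape if r > j) for j in range(shape[0])]
    product = 1
    for i, row_len in enumerate(shape):
        for j in range(row_len):
            product *= (row_len - j - 1) + (col_heights[j] - i - 1) + 1
    return factorial(sum(shape)) // product

def shapes(n, columns, rows):
    """Yield the shapes with n squares, `columns` columns and `rows` rows."""
    def rest(remaining, max_part, parts_left):
        if parts_left == 0:
            if remaining == 0:
                yield ()
            return
        for part in range(min(max_part, remaining), 0, -1):
            for tail in rest(remaining - part, part, parts_left - 1):
                yield (part,) + tail
    if rows >= 1 and columns <= n:
        for tail in rest(n - columns, columns, rows - 1):
            yield (columns,) + tail

def count_permutations(n, alpha, beta):
    """Theorem 3: sum of squares over shapes with alpha columns and beta rows."""
    return sum(count_standard_tableaux(s) ** 2 for s in shapes(n, alpha, beta))

print(list(shapes(25, 21, 3)))
print(count_permutations(25, 21, 3))
```

For n = 25, α = 21 and β = 3 the sketch finds the two shapes (21, 3, 1) and (21, 2, 2), with 31,350 and 21,000 standard tableaux respectively, and the total 1,423,822,500.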

PART II

We now want to consider sequences in which some of the numbers are repeated. We can obtain the properties of such sequences in terms of sequences without repetitions by a simple artifice. Suppose the smallest number appears p times in the sequence, the next smallest q times, etc. We replace the p occurrences of the smallest number by the numbers 1, 2, ..., p (in this order), the q occurrences of the next number by p + 1, p + 2, ..., p + q, etc. Then the decreasing subsequences of the two sequences will be in one-to-one correspondence, while the increasing subsequences of the new sequence will be in one-to-one correspondence with the non-decreasing subsequences of the original sequence.
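The relabelling artifice can be sketched with a stable sort. The function name `standardize` is hypothetical. The sketch follows the rule exactly as stated, so the new labels are the consecutive integers 1, 2, ..., n; any other labelling that preserves the same relative order has the same increasing and decreasing subsequences.

```python
def standardize(seq):
    """Replace repeated numbers by distinct labels, equal values left to right."""
    # sorted() is stable, so equal values keep their left-to-right order.
    order = sorted(range(len(seq)), key=lambda i: seq[i])
    labels = [0] * len(seq)
    for rank, i in enumerate(order, start=1):
        labels[i] = rank
    return labels

print(standardize([3, 3, 2, 3, 4, 1]))
```

For the sequence 3 3 2 3 4 1 the three 3's receive increasing labels from left to right, so a non-decreasing subsequence such as 3 3 3 4 becomes a strictly increasing one.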

Example. Given the sequence 3 3 2 3 4 1, we replace 1 by 1, 2 by 2, the three 3's by 4, 5, 6, and 4 by 7. The result is 4 5 2 6 7 1. The latter sequence has a decreasing subsequence 5 2 1 which corresponds to a decreasing subsequence 3 2 1 in the original and an increasing subsequence 4 5 6 7 which corresponds to a non-decreasing subsequence 3 3 3 4 in the original. If we construct the P-symbol for the derived sequence, and map the numbers in it back to the numbers in the original sequence, then we get a modified standard tableau in which repeated numbers are allowed, the numbers in each column form an increasing sequence, and the numbers in each row form a non-decreasing sequence. Since the numbers in the Q-symbol refer to


the order of addition of spaces to the P-symbol, the Q-symbols of the two sequences will be identical. We can get modified forms of each of the results in Part I. The main result, Theorem 3, now takes the form:

THEOREM 4. The number of sequences of $x_1, x_2, \ldots, x_n$ having a longest non-decreasing sequence of length α and a longest decreasing sequence of length β is the sum of the products of the number of modified standard tableaux of a given shape with the number of standard tableaux of the same shape, the shapes each having α columns and β rows.

Example. To find the number of sequences of seven numbers consisting entirely of 1's, 2's, and 3's having a longest non-decreasing sequence of length four and a longest decreasing sequence of length three, we proceed as follows. The possible tableaux must have the shape of Figure 14.

FIG. 14.

The possible modified standard tableaux are

    1 1 1 1    1 1 1 2    1 1 1 3    1 1 2 2    1 1 2 3
    2 2        2 2        2 2        2 2        2 2
    3          3          3          3          3

    1 1 3 3    1 1 1 1    1 1 1 2    1 1 1 3    1 1 2 2
    2 2        2 3        2 3        2 3        2 3
    3          3          3          3          3

    1 1 2 3    1 1 3 3    1 2 2 2    1 2 2 3    1 2 3 3
    2 3        2 3        2 3        2 3        2 3
    3          3          3          3          3

They are 15 in number.
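The list of modified standard tableaux can be generated mechanically. The sketch below assumes that the shape of Figure 14 has rows of 4, 2 and 1 squares, which is the only shape with seven squares, four columns and three rows; the function name `modified_tableaux` is hypothetical.

```python
def modified_tableaux(shape, largest):
    """Yield fillings with non-decreasing rows and strictly increasing columns."""
    cells = [(i, j) for i, row_len in enumerate(shape) for j in range(row_len)]
    filling = {}

    def fill(k):
        if k == len(cells):
            yield [[filling[(i, j)] for j in range(row_len)]
                   for i, row_len in enumerate(shape)]
            return
        i, j = cells[k]
        low = 1
        if j > 0:
            low = max(low, filling[(i, j - 1)])        # rows may repeat a value
        if i > 0:
            low = max(low, filling[(i - 1, j)] + 1)    # columns must increase
        for value in range(low, largest + 1):
            filling[(i, j)] = value
            yield from fill(k + 1)

    yield from fill(0)

tableaux = list(modified_tableaux((4, 2, 1), 3))
print(len(tableaux))
```

With entries restricted to 1, 2 and 3 the enumeration produces 15 tableaux, each with first column 1, 2, 3.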

By the Frame-Robinson-Thrall theorem the number of standard tableaux of this shape is 35. Hence the number of sequences of the desired type is 15 × 35 = 525.

As a further example we will work out explicit formulae for binary sequences (sequences consisting of 0's and 1's). In this case the modified standard tableaux have the general form of Figure 15, where the bracketed region can have any division of 0's and 1's (the 0's preceding the 1's, of course).

FIG. 15.


Let n be the number of digits in the sequence. Let m be the length of the longest non-decreasing subsequence. Then there are no sequences for which m < n/2. If m = n the longest decreasing subsequence is of length 1. If n/2 ≤ m < n, the longest decreasing subsequence is of length 2. The number of possible modified tableaux is 2m - n + 1. The number of standard tableaux is

\[ \frac{n!\,(2m - n + 1)}{(m + 1)!\,(n - m)!}. \]

Thus the number of binary sequences of n digits with a longest non-decreasing subsequence of length m is

\[ \frac{n!\,(2m - n + 1)^2}{(m + 1)!\,(n - m)!}. \]

Note. Since the total number of binary sequences is $2^n$ we have

\[ 2^n = \sum_{m \ge n/2} \frac{n!\,(2m - n + 1)^2}{(m + 1)!\,(n - m)!}. \]

In the above derivation we allowed all possible binary sequences. Theorem 4 also readily solves the problem if the number of 0's and 1's in the sequence is fixed. In this case there is at most one modified tableau and thus the number of sequences of n digits with a longest non-decreasing subsequence of length m is

\[ \frac{n!\,(2m - n + 1)}{(m + 1)!\,(n - m)!} \]

with the additional restriction that the number, p, of 0's must satisfy n - m ≤ p ≤ m.

Note. This shows that

\[ \binom{n}{p} = \sum_{m = \max(p,\, n - p)}^{n} \frac{n!\,(2m - n + 1)}{(m + 1)!\,(n - m)!}. \]

Throughout Part II we could have dealt equally well with increasing and non-increasing subsequences rather than decreasing and non-decreasing subsequences.
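The two worked examples of Part II can be checked by exhaustive enumeration for small cases. The sketch below is only a sanity check, not part of the paper; the function names are hypothetical, and subsequence lengths are computed directly with a quadratic dynamic programme rather than through tableaux.

```python
from itertools import product
from math import factorial

def longest(seq, in_order):
    """Longest subsequence whose consecutive terms all satisfy in_order(a, b)."""
    best = []
    for i, x in enumerate(seq):
        best.append(1 + max((best[j] for j in range(i) if in_order(seq[j], x)),
                            default=0))
    return max(best, default=0)

def count_by_brute_force(alphabet, length, nondecreasing, decreasing=None):
    """Count sequences with the given longest subsequence lengths."""
    total = 0
    for seq in product(alphabet, repeat=length):
        if longest(seq, lambda a, b: a <= b) != nondecreasing:
            continue
        if decreasing is not None and longest(seq, lambda a, b: a > b) != decreasing:
            continue
        total += 1
    return total

def binary_formula(n, m):
    """n! (2m - n + 1)^2 / ((m + 1)! (n - m)!), for n/2 <= m <= n."""
    return factorial(n) * (2 * m - n + 1) ** 2 // (factorial(m + 1) * factorial(n - m))

print(count_by_brute_force((1, 2, 3), 7, 4, 3))
print([binary_formula(4, m) for m in range(2, 5)])
```

The first line counts the sequences of seven 1's, 2's and 3's from the example of Theorem 4, and the second lists the binary counts for n = 4, which add up to 16.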

REFERENCES

1. J. S. Frame, G. de B. Robinson, and R. M. Thrall, The hook graphs of the symmetric group, Can. J. Math., 6 (1954), 316.
2. D. E. Rutherford, Substitutional analysis (Edinburgh University Press, 1948), p. 26.

Institute for Defence Analysis
Princeton

Reprinted from Canad. J. Math. 13 (1961), 179-191

ON A THEOREM OF R. JUNGEN

M. P. SCHÜTZENBERGER

Let us recall the following elementary result in the theory of analytic functions in one variable.

THEOREM (R. JUNGEN [7]). If a is rational and b algebraic their Hadamard product c is algebraic; if, further, b is rational, c also is rational.

For several variables, Jungen's proof shows that the theorem is still true for the Bochner-Martin [2] Hadamard product. It does not hold for the Cameron-Martin [3] and for the Haslam-Jones [6] Hadamard products. In this note we give a version of Jungen's theorem which is valid for a restricted interpretation of the notions involved when a and b are formal power series in a finite number of noncommuting variables.

1. Notations. Let R be a fixed not necessarily commutative ring with unit 1. For any finite set Z, F(Z) is the free monoid generated by Z and $R_{pol}(Z)$ is the free module on F(Z) over R. An element a of $R_{pol}(Z)$ will usually be written in the form $a = \sum \{(a, f) \cdot f : f \in F(Z)\}$ where the coefficients (a, f) are in R; $R_{pol}(Z)$ is graded in the usual manner and $\pi_n a = \sum \{(a, f) \cdot f : f \in F(Z),\ \deg f \le n\}$. We identify R with $\pi_0 R_{pol}(Z)$. $R_{pol}(Z)$ is also a ring with product $aa' = \sum \{(a, f')(a', f'') \cdot f : f, f', f'' \in F(Z),\ f = f'f''\}$. It is well known (cf., e.g., [4; 3]) that these notions extend to the ring R(Z) of the formal power series (with coefficients in R) in the noncommuting variables $z \in Z$; R(Z) is topologized in the same manner as a ring of commutative formal power-series and $aa' = \lim_{n, n' \to \infty} (\pi_n a)(\pi_{n'} a')$. Any $b \in R^*(Z) = \{a \in R(Z) : \pi_0 a = 0\}$ has a quasi-inverse (-b)* = lim_{n→∞} Σ_{n'

Received by the editors December 6, 1961.


of the R-module R(X V Y) (resp. R:',(X V Y»). For each q = (ql, ... , qm) E R-V(X V Y), 7r n q = (7r n ql, .. " 7r n Qm). If qER*·1f (XV I') (i.e., if 7roq = 0) let Aq be the homomorphism of the monoid F(XV Y) into the multiplicative monoid structure of R(XVY) that is induced by AqX=X if xEX and Aq)'j=qj if )'iEI'. Since 7roq = 0, Aq can be extended to an endomorphism of the Rmodule R(XVY) by A
L!

+

lIence, p(x)=limm_"p(m) exists and it satisfIes P(x)El<*.lf(X), 7rup( x) = 0, p( :r.;) = Ap(-.c)P. In fact, p(:r.;) is the only clement to s,ltisfy these equations because if 7roP' = 0 and p' =\p'p, any rebtilJll 7r m P(:r.;) = 7r mP' im plies 7r m+lp' = 7r m+l\TmP' P = 7r m+IATmP(x;)p = 7r m-i!p( x). For this reason we call p( x) the solution of p. DEFI~ITIO~ 2. R:,g(X) is the least subset (of R*(X» that contain:> every coordinate of the solution of any proper system having its coonJinates in R;",(XV Y). (RE\\.-\RK. It can easily be sho\\"n that R:,g(X) is rationally closed and that it contains every coordinate of the solution of any p[I'per system having its coordinates in R:,g(XV 1').) DEFI~ITIO~ 3. For any

a, b E R(X),

a0 b

=

L I (a,f)(b,j) I f E

F(X)}.

2. Main result. Property 2.l. The elemellt a of R*(X) belongs to N;at(X) if and

only if there exists a tlnite integer X~2 and a homomorphism M of F(X) into the multiplicative monoid of RSX'V (the ring of the XXS matrices with entries ill R) sllch that a = IMit, \ I fe::: F(X) : (abbreviated as L}Jju' -f). PROOF. (1) The condition is necessary. This is tri\'ial if a=7rI(/. IIence it suffices to show that for any r, r'ER, a= LM!U--f and a' = LM'h,-", -f one can constru,:t suitable homomorphisms giving ra+a'r', (la' and a*. This is dOlle below, defining the homomorphisms by their restriction to X.

L


Addition. Let N"=N+N'+2 and p."xERh"XN" defined for each xEX by J.L"X;.1 = J.L"X.V".i = 0 and

for 1

~

i

~

N" j

J.L " X;+I.N" = J.LXi.N

for 1

~

i

~

for 1

J.L"Xi,i' = the direct sum of J.LX and J.L'X

i\' j ~

i

~

.\"

j

for 2 ~ i, i' ~ X" - 1 j

The verification is trivial. Product. Let N"=N+N' and define IIfERN"X.V" ioreachfEF(X) by ~1;,;' = p.f; ..v if fr! 1, 1 ~i ~ N, i' = N + 1; IIf;.;· = 0, otherwise. Then, if p."x=ilx+llx where ilX is the direct sum of J.LX and J.!'x, one has for eachf = X(l l X(2) . • . x(n),J.!"f = iii Lliii'lIxtiiilf":j'xlj'f" = fl. Since IIfx(j)=iiiIlX(i) and (~1"'iii"h,s"=0 when f"=1, one has J.L"fl.,V" = L\(uf{,N)(p.'f{:N·):j'f"=fl· Hence, LJ.!"fl,y .. -j=aa'. QlIasi-im'erse. Let 1V" = N and define IIfERsXS for each IE F(X) by IIfi,i'=J.!fi ..v if fr!1, 1~i~N, i'=1; IIfi,;'=O, otherwise. Then JJ."x = fJ.X + IIX and since JJ.fllx = IIfx identically one has p."f = Lllf(1)1IJ<2) ... IIPk'J-tJo an=a* and the first part of the proof is completed. (2) The condition is sufficient. \Ye say tha t the proper system p is linear if for each j ~ M, pj = qj.O + Lj· qj .j'')'j' where all the q's belong to R:"t(X) and we verify that all coordinates of the solution of sllch a system belong to R:at(X). This is tri vial if .11 = 1 because p( 'X. ) = (1 - ql ,l)-lql.o( = (1 + qi.l)ql.O). If it is true for J[' < AI it is still true for JI. Indeed, because p( 'X. )11 = (l-q.\f ..\f)-I(q.\f,O+ Lj<.\f q.lf.j·P( 'X. )1'), the proper linear system P' defllled by p; = pj - qj,.IfYII + qi ..IfP.lI for j <.1[ and P:lf = (1-q.U ..If)-I(plf - q.If.IfY.If) is such that p( 'X.) = p'( 'X.). Since its first JI -1 coordinates do not involve YIf the result follows from the inc! uction hypothesis. Kow, given a homomorphism p. of F(X) into R.lfXM, the JI elements aj= LIfJ.li.wj:jEF(X), J-F11 are such that (aj, xf) = Lj' ,UXj.i,(aj' , j). Hence (aI, .. ,a.lf) is the solution of the linear proper system such that qj.O = LlfJ.xi ..v·x: x E xl, qi.i· = L!PXj'.j·x: xExl for each j, j' and 2.1 is proved.

+


We now consider two subrings R' and R" of R that commute element-wise. Property 2.2. If a = LJ.L'!t,o'l·jER;:t(X) where J.L' is a homomorphism into R'NXX and if b = p( 00 hER~l:(X) where the proper system P has its coordinates in R~j(XU Y), then a 0 bER:1g(X). If, further, bER:~~(X) then a 0 bERiat(X). PROOF. \Ve verify first the case of bER;~~()(), i.e., of b = L,fJ."Il.-v" oj for some N" and J.L". Then a 0 b = 2:(J.L' ®J.L")!J,-','.\"' -J where the kroneckerian product J.L'@J.L" is a homomorphism of F(X) into RNN"XNN" because R' and R" commute and the result is proved. For the general case we denote by K(Z) for any set Z the ring of the NXN matrices with entries in R(Z). We shall have to consider several homomorphisms of moduleer: R_11(Z')-4KM(Z") where Z' and ZIt are two finite sets. In each case er is defined by a mapping Z'-4K(Z") which is extended in a natural fashion to a homomorphism of the monoid F(Z') into the multiplicative structure of K(Z"). Then for each a

= (aI, .. "

a.lI)

E R-'>f(Z'),

uai

=

2:! (aj,g)·ug: g E F(Z')}

and era = (eral, ... ,eraJI). l\Iore specifically, J.L: R_If(X)-4K-If(X) is induced by a mapping J.L: X-4K(X) such that the entries of each J.LX belong to R'*(X). For each qER"HI(X), }.I'q: R(XU Y)-4KM(X) is induced by }.,qf = J.Lj if jE F(X) and }.l'qYi = Mi if yiE Y. I-Icnce. since R' and R" commute element-wise, J.L}.qg = }.~qg for each gE F(XU (with Aq as previously defined). Consequently, J.L}.qp = ApqP for any

n

PER"_'1(XU Y). Let no\v Z~ -I-··l(l<·
•

n


ther, PER~ot·,,(XU Y) all the entries appearing in lIP belong to R~ol (XUZ) and then finally (J.lp( 00 Lk,· 0:R:1g (X). This completes the proof because

L (b, /)P.'/I.S}: / E F(X) I = L I (b, /)41.\': / E F(X) I = p.bl,Y

a0 b=

where for each xEX, J.I is defined by /lX.,i· =P.'Xi,.· ·x. RE~IARK 1. Definitions 1, 2, and 3 and the computations of this section used only the structure of monoid of the additi\'e groups considered. Hence, the results are still valid when an arbitrary semiring S is taken in place of R. For S consisting of t\\·o Boolean clements, Jungen 's theorem and its special case for b rational have been obtained in a different form by Y. Bat'-Hillel, :\1. Perles and E. Shamir [1] (also by S. Ginsburg and G. F. Rose [5]) and by S. Kleene [8] respectively as by-products of more sophisticated theories. RDfARK 2. Let R = C, the fIeld of complex numbers; and P a proper system of dimension JI. Introducing 4JI new symbols Zj and replacing each Yj by z~d- iZ 4i +1 - Z4j, 2 - iZ4j+3 in the PiS we can ded uce from P a new systpm of dimel:sion 4..11 in which all the coefficients are non-negative real numbers and whose solution is simply related to p(

y; ).

Assume now that pE C~~ (XU Y) has only real non-neg;lti\,e coeftlrients and denote by a a homomorphism of CpQI(XU Y) into C. Because of the assumption that COj, Yj') = (Pit 1) =0, identically, \\"l' caa find an E>O such that lapj! <E for all j when !ax~ ~E and :ay\ ~2E for all xEX and yE Y. Since the sequence ap(O), apd) . . . . ,ap(n), ... is monotonically increasing it converges to a tillite solution (d., e.g., [10]). Hence, the canonical epimorphism of Cp.,I(XU onto the ring of the ordinary (commutative) polynomids can be extended to an epimorphism of Calg(X) onto the ring of the Taylor series oi the algebraic functions.

n

Acknowledgment. Acknowledgment is made to the Commonwealth Fund for the grant in support of the visiting professorship of biomathematics in the Department of Preventive Medicine at Harvard Medical School.

REFERENCES

1. Y. Bar-Hillel, M. Perles and E. Shamir, On formal properties of simple phrase structure grammars, Technical Report No. 4, Information System Branch, Office of Naval Research, 1960.


2. S. Bochner and W. T. Martin, Singularities of composite functions in several variables, Ann. of Math. 38 (1938), 293-302.
3. R. H. Cameron and W. T. Martin, Analytic continuation of diagonals, Trans. Amer. Math. Soc. 44 (1938), 1-7.
4. K. T. Chen, R. H. Fox and R. C. Lyndon, Free differential calculus. IV, Ann. of Math. (2) 68 (1958), 81-95.
5. S. Ginsburg and G. F. Rose, Operations which preserve definability, System Development Corporation, Santa Monica, Calif., SP-511, October, 1961.
6. U. S. Haslam-Jones, An extension of Hadamard's multiplication theorem, Proc. London Math. Soc. II. Ser. 27 (1928), 223-232.
7. R. Jungen, Sur les séries de Taylor n'ayant que des singularités algébrico-logarithmiques sur leur cercle de convergence, Comment. Math. Helv. 3 (1931), 226-306.
8. S. Kleene, Representation of events in nerve nets and finite automata, Automata Studies, Princeton Univ. Press, Princeton, N. J., 1956.
9. M. Lazard, Lois de groupes et analyseurs, Ann. Sci. École Norm. Sup. (4) 72 (1955), 299-400.
10. A. M. Ostrowski, Solutions of equations and systems of equations, Academic Press, New York, 1960.

HARVARD MEDICAL SCHOOL

Reprinted from Proc. Amer. Math. Soc. 13 (1962), 885-890


REGULARITY AND POSITIONAL GAMES

BY

A. W. HALES AND R. I. JEWETT

Reprinted from the TRANSACTIONS OF THE AMERICAN MATHEMATICAL SOCIETY
Vol. 106, No. 2, pp. 222-229, February, 1963


1. Introduction. Suppose $X$ is a set, $\mathcal{S}$ a collection of sets (usually subsets of $X$), and $N$ is a cardinal number. Following the terminology of Rado [1], we say $\mathcal{S}$ is $N$-regular in $X$ if, for any partition of $X$ into $N$ parts, some part has as a subset a member of $\mathcal{S}$. If $\mathcal{S}$ is $n$-regular in $X$ for each integer $n$, we say $\mathcal{S}$ is regular in $X$. For example, let $X = \{1, 2, \ldots, mn - n + 1\}$ and $\mathcal{S}$ be all $m$-element subsets of $X$ (hereafter designated $X^{(m)}$). Then $\mathcal{S}$ is $n$-regular in $X$, but not $(n+1)$-regular. Another example is the famous theorem of Ramsey which states that given integers $k$, $m$, $n$, there exists an integer $p$ such that, if $A = \{1, 2, \ldots, p\}$, then $\{B^{(k)} : B \in A^{(m)}\}$ is $n$-regular in $A^{(k)}$.

The concept of regularity is useful in analyzing certain types of games, as we shall see in §3. In §2, we shall give some general results and discuss related problems.

2. Regularity. One of the first problems in this area was proposed at Göttingen in 1927. The problem was as follows: If the positive integers are split into two parts, does one part contain arithmetic progressions of arbitrary length? B. L. van der Waerden solved this and a more general problem. He proved that, given integers $m$ and $n$, there exists an integer $p$ such that the set of all arithmetic progressions of length $m$ is $n$-regular in $\{1, 2, \ldots, p\}$ [2]. This will be a consequence of Theorem 1. First we shall give some preliminaries.

DEFINITION. If $\mathcal{S}$ and $\mathcal{T}$ are collections of sets, let $\mathcal{S} \otimes \mathcal{T}$ be the collection of all sets $A \times B$, where $A$ is in $\mathcal{S}$ and $B$ is in $\mathcal{T}$.

LEMMA 1. Let $M$ and $N$ be cardinal numbers. Let $\mathcal{S}$ be $N$-regular in $X$, a set of cardinality $M$, and let $\mathcal{T}$ be $N^M$-regular in $Y$. Then $\mathcal{S} \otimes \mathcal{T}$ is $N$-regular in $X \times Y$.

Proof. Let $P$ be a set of cardinality $N$. Then a partition of $X \times Y$ into $N$ parts can be represented by a function $f$ from $X \times Y$ into $P$. For each $y \in Y$, $f$ defines a function $f_y$ from $X$ into $P$ given by

$$f_y(x) = f(x, y).$$

Since there are $N^M$ such functions, the mapping $y \to f_y$ induces a partition of $Y$ into $N^M$ parts. One of these parts contains as a subset a member $T$ of $\mathcal{T}$. That is, for all $y, y' \in T$,

$$f_y = f_{y'}, \qquad f(x, y) = f(x, y') \quad (x \in X).$$

Choose $y_0 \in T$. Then $f_{y_0}$ partitions $X$ into $N$ parts and hence there exist $S \in \mathcal{S}$ and $p \in P$ such that

$$f_{y_0}(x) = p \quad (x \in S).$$

But then

$$f_y(x) = p \quad (x \in S,\ y \in T).$$

That is,

$$f(x, y) = p \quad (x \in S,\ y \in T),$$

which was to be shown.

Received by the editors July 5, 1961 and, in revised form, December 26, 1961.

DEFINITION. Let $X$ and $Y$ be sets, $\mathcal{S}$ and $\mathcal{T}$ collections of sets. Then a mapping $f: X \to Y$ is called provincial with respect to $\mathcal{S}$ and $\mathcal{T}$ in case when $A \in \mathcal{S}$, $A \subseteq X$ there exists a set $B \in \mathcal{T}$, $B \subseteq Y$ such that $B \subseteq f(A)$.

LEMMA 2. Let $f: X \to Y$ be provincial with respect to $\mathcal{S}$ and $\mathcal{T}$. Then if $\mathcal{S}$ is $N$-regular in $X$, $\mathcal{T}$ is $N$-regular in $Y$.

Proof. Let $P$ be a set of cardinality $N$ and $g: Y \to P$. Then $g(f): X \to P$ and there exist $p \in P$ and $A \in \mathcal{S}$ such that $A \subseteq X$ and $g(f(A)) = \{p\}$. But there is a $B \in \mathcal{T}$, $B \subseteq Y$ such that $B \subseteq f(A)$. So $g(B) = \{p\}$.

LEMMA 3. Let $X$ be a semigroup, $\mathcal{S} \subseteq 2^X$. Suppose for each positive integer $k$, $\mathcal{S}$ is $k$-regular in a finite subset of $X$. Then for each $n$,

$$\mathcal{S}_n = \{S_1 S_2 \cdots S_n : S_i \in \mathcal{S}\}$$

is regular in $X$.

Proof. We induct on $n$. Suppose $\mathcal{S}_{n-1}$ is regular in $X$ and $k \ge 1$. Then there is an integer $m$ and $B \in X^{(m)}$ such that $\mathcal{S}$ is $k$-regular in $B$. Since $\mathcal{S}_{n-1}$ is $k^m$-regular in $X$, $\mathcal{S} \otimes \mathcal{S}_{n-1}$ is $k$-regular in $B \times X \subseteq X \times X$. But the mapping $(x, y) \to xy$ of $X \times X$ into $X$ is clearly provincial with respect to $\mathcal{S} \otimes \mathcal{S}_{n-1}$ and $\mathcal{S}_n$, and thus $\mathcal{S}_n$ is $k$-regular in $X$.

Let $W$ be a fixed set and $t \notin W$. Let $X$ be the free semigroup on the set $W$. A functional $f$ is a mapping of $W$ into $X$ which can be described as follows. For some positive integer $n$ there is an $n$-tuple $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_n)$ of elements of $W \cup \{t\}$ in which $t$ appears at least once, such that, for $w \in W$, $f(w)$ is the result of replacing each $t$ by $w$ and multiplying (in $X$) the $n$ components of the new $n$-tuple. For example, if $W = \{1, 2, 3, 4\}$ and $\alpha = (1, t, 3, t, t, 2, 1, t)$ then the corresponding functional $f$ would satisfy

$$f(4) = 14344214.$$
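The definition of a functional can be made concrete with a small sketch. It assumes words of the free semigroup are represented as Python tuples and that the placeholder $t$ is the string `"t"`; the names `apply_functional` and `image` are illustrative and not part of the paper.

```python
T = "t"  # placeholder symbol, assumed not to be an element of W


def apply_functional(alpha, w):
    """Replace every t in the tuple alpha by w; the result is a word over W."""
    if T not in alpha:
        raise ValueError("a functional must contain t at least once")
    return tuple(w if entry == T else entry for entry in alpha)


def image(alpha, A):
    """The set f(A) for the functional f described by alpha."""
    return {apply_functional(alpha, w) for w in A}


alpha = (1, T, 3, T, T, 2, 1, T)
word = apply_functional(alpha, 4)
print("".join(str(letter) for letter in word))  # 14344214
```

The set `image(alpha, W)` is the kind of set $f(W)$ whose regularity is the subject of Theorem 1.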


Suppose that $f_1, f_2, \ldots, f_n$ are functionals. Let $\phi: W^n \to f_1(W) f_2(W) \cdots f_n(W)$ be defined by $\phi(w_1 w_2 \cdots w_n) = f_1(w_1) f_2(w_2) \cdots f_n(w_n)$. We see that if $g$ is a functional of "length" $n$ then there is a functional $h$ such that $\phi(g(w)) = h(w)$ $(w \in W)$. Loosely, we have $\phi(g(t)) = h(t)$. If $A \subseteq W$, $R \subseteq W^n$ and $R$ is a member of $\{f(A) : f \text{ is a functional}\}$ then $\phi(R)$ is also a member of this collection. Thus, relative to this collection, $\phi$ is a provincial map of $A^n$ onto $f_1(A) f_2(A) \cdots f_n(A)$.

THEOREM 1. If $A$ is a finite subset of $W$ the collection $\{f(A) : f \text{ is a functional}\}$ is regular in $X$.

Proof. Let $(I_{m,j})$ be the statement: If $B \in W^{(m)}$ there exists an integer $p$ such that $\{f(B) : f \text{ is a functional}\}$ is $j$-regular in $B^p$. We will prove these statements by induction and thus prove the theorem. It is clear that $(I_{m,1})$ and $(I_{1,j})$ are true. Assume further that for $n > 1$ and $k \ge 1$, $(I_{n,k})$ and $(I_{n-1,j})$ are true for all $j$. Let $A \in W^{(n)}$. Pick $a \in A$ and let $B = A - \{a\}$. By assumption, there is an integer $r$ such that $\{f(A) : f \text{ is a functional}\}$ is $k$-regular in $A^r$. By $(I_{n-1,j})$, the fact that $B^s$ is finite for each $s$, and Lemma 3, we see that

$$\{f_0(B) f_1(B) \cdots f_r(B) : f_i \text{ is a functional}\}$$

is $(k+1)$-regular in the subsemigroup of $X$ generated by $B$, namely, $B^1 \cup B^2 \cup \cdots \cup B^s \cup \cdots$. The $B^s$ are disjoint, and if $f_0, f_1, \ldots, f_r$ are functionals, then the set $f_0(B) f_1(B) \cdots f_r(B)$, all of whose elements have the same "length," meets at most one of the $B^s$. In such a situation, $(k+1)$-regularity in the union implies $(k+1)$-regularity in one of the parts. Thus there is an integer $q$ such that

$$\{f_0(B) f_1(B) \cdots f_r(B)\}$$

is $(k+1)$-regular in $B^q$. We will use the integer $q$ to verify $(I_{n,k+1})$.

Let $A^q = P_0 \cup P_1 \cup \cdots \cup P_k$. This defines a partition of $B^q$ and so there are functionals $g_0, g_1, \ldots, g_r$ such that $g_0(B) g_1(B) \cdots g_r(B)$ is contained in one of the parts, say $P_0$, and also in $B^q$. Thus each "entry" in a $g_i$ is an element of $B \cup \{t\}$. Since $B \subseteq A$, we can conclude that $g_0(A) g_1(A) \cdots g_r(A) \subseteq A^q$. Now the mapping $\phi: A^r \to g_0(a) g_1(A) \cdots g_r(A)$ defined by $\phi(w_1 w_2 \cdots w_r) = g_0(a) g_1(w_1) \cdots g_r(w_r)$ is provincial with respect to $\{f(A) : f \text{ is a functional}\}$. Thus, by $(I_{n,k})$, $\{f(A)\}$ is $k$-regular in $g_0(a) g_1(A) \cdots g_r(A)$. If $g_0(a) g_1(A) \cdots g_r(A)$ is disjoint from $P_0$ we are done. If not, there are elements $a_1, a_2, \ldots, a_r$ of $A$ such that

$$x = g_0(a) g_1(a_1) \cdots g_r(a_r) \in P_0.$$

Suppose $x = v_1 v_2 \cdots v_q \in A^q$. Define $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_q)$ by

$$\alpha_i = \begin{cases} v_i & \text{if } v_i \in B, \\ t & \text{if } v_i = a. \end{cases}$$

Note that $t$ appears in $g_0$, so $a$ appears in $g_0(a)$, and hence $t$ appears in $\alpha$. Then $\alpha$ represents a functional $g$ for which $g(a) = x$ and $g(B) \subseteq g_0(B) g_1(B) \cdots g_r(B) \subseteq P_0$. Since $g(A) = g(B) \cup \{g(a)\}$, we have $g(A) \subseteq P_0$, and the theorem is proved.
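For very small cases the statement $(I_{m,j})$ used in this proof (for an $m$-element set $B$ there is a $p$ such that the sets $f(B)$ are $j$-regular in $B^p$) can be checked by exhaustive search. The following is a minimal sketch under the assumptions that words are tuples, that $t$ is the string `"t"`, and that a partition into $j$ parts is encoded as a colouring with values in `range(j)`; the function names are illustrative only, and the search is exponential, so it is usable only for toy sizes.

```python
from itertools import product

T = "t"  # placeholder symbol, assumed not to be an element of B


def lines(B, p):
    """All sets f(B) for functionals f of length p with entries in B or t."""
    result = []
    for alpha in product(list(B) + [T], repeat=p):
        if T in alpha:
            result.append(
                frozenset(tuple(w if e == T else e for e in alpha) for w in B)
            )
    return result


def is_regular(B, p, j):
    """True if every colouring of B^p with j colours makes some f(B) monochromatic."""
    words = list(product(B, repeat=p))
    all_lines = lines(B, p)
    for colours in product(range(j), repeat=len(words)):
        colouring = dict(zip(words, colours))
        if not any(len({colouring[x] for x in line}) == 1 for line in all_lines):
            return False  # this colouring avoids every set f(B)
    return True


B = (1, 2)
print(is_regular(B, 1, 2), is_regular(B, 2, 2))  # False True
```

For the two-letter alphabet the smallest admissible $p$ in this search equals the number of colours, so `is_regular(B, 3, 3)` also succeeds while `is_regular(B, 2, 3)` does not.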

COROLLARY. Let $S$ be a finite subset of a commutative semigroup $H$. Then the collection of all sets

$$\{a + nx : x \in S\},$$

where $a \in H$ and $n$ is a positive integer, is regular in $H$.

Proof. Let $X$ be the free semigroup on $S$. Then the mapping

$$x_1 x_2 \cdots x_n \to x_1 + x_2 + \cdots + x_n$$

of $X$ into $H$ is provincial.

COROLLARY (VAN DER WAERDEN). For any partition of the positive integers into a finite number of parts, one of the parts contains arithmetic progressions of arbitrary length.
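The finite form of van der Waerden's theorem can be tested directly for tiny parameters. The sketch below is a brute-force check, not the argument of the paper: it asks whether the $m$-term arithmetic progressions are $n$-regular in $\{1, \ldots, p\}$ by trying every colouring. The function names are hypothetical.

```python
from itertools import product


def progressions(p, m):
    """All m-term arithmetic progressions contained in {1, ..., p}."""
    return [
        tuple(a + i * d for i in range(m))
        for a in range(1, p + 1)
        for d in range(1, p)
        if a + (m - 1) * d <= p
    ]


def aps_are_regular(p, m, n):
    """True if every partition of {1..p} into n parts has a part containing an m-term AP."""
    aps = progressions(p, m)
    for colours in product(range(n), repeat=p):
        if not any(len({colours[x - 1] for x in ap}) == 1 for ap in aps):
            return False
    return True


print(aps_are_regular(8, 3, 2), aps_are_regular(9, 3, 2))  # False True
```

So for $m = 3$ and $n = 2$ the integer $p = 9$ works and $p = 8$ does not.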

The stronger statement proved by van der Waerden and mentioned above is clear from the proof of Theorem 1. This suggests a general result which we have given as a corollary to Theorem 2. Theorem 2 is proved in Rado [3]. The proof is essentially an application of Tychonoff's Theorem, as shown by Gottschalk [4].

THEOREM 2. Let $X$ and $\Gamma$ be sets, and for every finite subset $A$ of $X$ let $f_A$ be a function from $A$ into $\Gamma$. Suppose that for each $x \in X$, $\Gamma_x = \{f_A(x) : A \text{ is a finite subset of } X \text{ containing } x\}$ is finite. Then there exists a function $F$ from $X$ into $\Gamma$ with the property that, given any finite subset $A$ of $X$, there exists a finite subset $B$ of $X$ such that $F$ and $f_B$ agree on $A$.

COROLLARY. Let $X$ be a set and $\mathcal{S}$ a collection of finite sets. Then if $\mathcal{S}$ is $n$-regular in $X$ for some positive integer $n$, $\mathcal{S}$ is $n$-regular in a finite subset of $X$.

Proof. Let $P$ be a set with $n$ elements. Suppose $\mathcal{S}$ is not $n$-regular on any finite subset of $X$. Then for each finite subset $A$ of $X$, there is a function $f_A$ from $A$ into $P$ that is not constant on any member of $\mathcal{S}$. By Theorem 2 there is a function $F$ from $X$ into $P$ that is not constant on any member of $\mathcal{S}$, a contradiction.

The above corollary suggests a general problem. Let $M$ and $N$ be cardinals. Does there exist a cardinal $P$ having the following property? If $X$ is a set and $\mathcal{S}$ is a collection of sets each of cardinality less than $M$, and $\mathcal{S}$ is $N$-regular in $X$, then $\mathcal{S}$ is $N$-regular in a subset of $X$ of cardinality less than $P$. A simple example shows that if $M > 1$ and $N > 1$ are integers, $P = \aleph_0$ is best possible. The corollary says that if $M = \aleph_0$, and $N > 1$ is finite, then $P = \aleph_0$ is sufficient. No further results in this area are known.

Another problem is the "rectangle" problem. Let $M$, $N$, and $P$ be cardinals. For what pairs $(R, S)$ of cardinals (if any) is the following true? If $X$ has cardinality $R$ and $Y$ has cardinality $S$, then $X^{(M)} \otimes Y^{(N)}$ is $P$-regular in $X \times Y$. That is, if an $R \times S$ rectangle is partitioned into $P$ parts, one part contains an $M \times N$ rectangle. From Lemma 1, such pairs always exist. For example, if $M = 2$, $N = 5$, $P = 2$, then $R = 3$ and $S = 40$ is sufficient. The "minimal" pairs $(R, S)$ for given $M$, $N$, and $P$ are not known in general. A particularly interesting case occurs when $M = N = \aleph_0$ and $P = 2$. It is easily seen that $(\aleph_0, \aleph_0)$ does not work and $(\aleph_0, 2^{2^{\aleph_0}})$ does. The sufficiency of $(\aleph_0, 2^{\aleph_0})$ or, for that matter, $(2^{\aleph_0}, 2^{\aleph_0})$ is an open question.
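For finite $M$, $N$, $P$, one way to obtain a sufficient pair $(R, S)$ is to combine Lemma 1 with the pigeonhole example from the Introduction ($X^{(m)}$ is $n$-regular in a set of $mn - n + 1$ elements). The sketch below carries out that arithmetic; the function names are illustrative, and the pair it returns is only one sufficient choice, not a claim about minimal pairs.

```python
def pigeonhole_size(m, n):
    """Smallest |X| for which the m-element subsets of X are n-regular in X."""
    return m * n - n + 1


def sufficient_rectangle(M, N, P):
    """A pair (R, S) for which Lemma 1 guarantees a one-part M x N rectangle."""
    R = pigeonhole_size(M, P)       # X^(M) is P-regular in X when |X| = R
    S = pigeonhole_size(N, P ** R)  # Y^(N) is P^R-regular in Y when |Y| = S
    return R, S


print(sufficient_rectangle(2, 5, 2))  # (3, 33)
```

Under these assumptions the computation gives $S = 33$ for $M = 2$, $N = 5$, $P = 2$, which is consistent with the sufficiency of $S = 40$ stated above, since enlarging $Y$ cannot destroy regularity.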

3. Positional games. By a positional game we shall mean a game played by $n$ players on a "board" (finite set) $X$ with which is associated a collection $\mathcal{S}$ of subsets of $X$. The rules are that each player, in turn, claims as his own a previously unclaimed "square" (element) of $X$. The game proceeds either until one player has claimed every element of some $S \in \mathcal{S}$, in which case he wins, or until every element has been claimed, but no one has yet won, in which case the game is a tie. The most familiar example of such a game is "Tick-Tack-Toe." Another is the Oriental game "Go Moku." It is known from game theory that, in a finite two-player perfect information game, either one player has a forced win or each player can force a tie [5].

LEMMA 4. In a positional game involving 2 players, where $\mathcal{S}$ is 2-regular in $X$, the first player has a forced win.

Proof. Since no tie can occur, one player has a forced win. Assume the second player has a forced win. But then the first player can force a win by (1) making his first move at random, and (2) thereafter following the optimum strategy for the second player, ignoring the last random move, and playing again at random if this is impossible. Since having made an extra move cannot possibly hurt, this will give the first player a win, a contradiction. Therefore, the first player has a forced win.

The following result in combinatorial analysis is due to Philip Hall [6].

LEMMA 5. Let $S_1, S_2, \ldots, S_n$ be an indexed collection of finite sets. Then (A) and (B) are equivalent.
(A) There exist $s_1, s_2, \ldots, s_n$ such that each $s_i \in S_i$ and $s_i \ne s_j$ if $i \ne j$.
(B) For each $F \subseteq \{1, \ldots, n\}$, the set $\bigcup_{i \in F} S_i$ has at least as many elements as $F$.
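Conditions (A) and (B) of Lemma 5 can both be tested mechanically on small families. The sketch below is an illustration, not Hall's proof: `distinct_representatives` searches for the elements in (A) with the standard augmenting-path method for bipartite matching, and `hall_condition` checks (B) by enumerating every index set $F$. Both names are hypothetical.

```python
from itertools import combinations


def distinct_representatives(sets):
    """Return [s_1, ..., s_n] with s_i in S_i and all s_i distinct, or None."""
    owner = {}  # element -> index of the set it currently represents

    def assign(i, seen):
        for s in sets[i]:
            if s in seen:
                continue
            seen.add(s)
            # take s if it is free, or if its current owner can move elsewhere
            if s not in owner or assign(owner[s], seen):
                owner[s] = i
                return True
        return False

    for i in range(len(sets)):
        if not assign(i, set()):
            return None
    reps = [None] * len(sets)
    for s, i in owner.items():
        reps[i] = s
    return reps


def hall_condition(sets):
    """Condition (B): every subfamily F has a union with at least |F| elements."""
    n = len(sets)
    for r in range(1, n + 1):
        for F in combinations(range(n), r):
            if len(set().union(*(sets[i] for i in F))) < r:
                return False
    return True


family = [{1, 2}, {2, 3}, {1, 3}]
print(distinct_representatives(family) is not None, hall_condition(family))  # True True
```

On any finite family the two functions agree, which is exactly the equivalence asserted by the lemma.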


If condition (A) is satisfied, we say $S_1, \ldots, S_n$ have distinct representatives. We will use this lemma to exhibit a tying strategy for the second player in certain positional games.

LEMMA 6. Let $X$ be the board of a 2-player positional game with winning sets $\mathcal{S} = \{S_1, \ldots, S_n\}$. For $k = 1, 2, \ldots, n$ let $T_{2k-1} = T_{2k} = S_k$. Then if $T_1, \ldots, T_{2n}$ have distinct representatives, the second player can force a tie.

Proof. Let the representatives be $t_1, t_2, \ldots, t_{2n}$. Consider the sets $\{t_1, t_2\}, \{t_3, t_4\}, \ldots, \{t_{2n-1}, t_{2n}\}$. Observe that in order to win the first player must have both elements of at least one of these sets. Since the second player can easily prevent this, he can force a tie.

LEMMA 7. Let $\mathcal{S} \subseteq 2^X$, where $X$ is finite. Let $n$ be the size of the smallest member of $\mathcal{S}$. Let $m$ be the size of the largest set of the form $\{S \in \mathcal{S} : x \in S\}$ where $x \in X$. If $n \ge 2m$, then in the corresponding 2-player positional game, the second player can force a tie.

Proof. By a simple counting argument, Lemmas 5 and 6 can be applied to obtain the desired result.

The rest of this paper will be concerned with a particular class of 2-player positional games, namely generalizations of Tick-Tack-Toe. The traditional Tick-Tack-Toe game is played on a $3 \times 3$ array of points in the plane. For positive integers $k$ and $n$, the "$k^n$-game" is played on a $k \times k \times \cdots \times k$ ($n$ times) array of points in $n$-space. If we choose as a board the set

$$X = \{(a_1, a_2, \ldots, a_n) : 1 \le a_i \le k \text{ for all } i\},$$

then $S$ is in $\mathcal{S}$, the collection of winning sets (paths), in case $S$ consists of $k$ points in a straight line. An equivalent characterization of $S \in \mathcal{S}$ would be that the elements of $S$, in some order, are $\alpha_1, \alpha_2, \ldots, \alpha_k$ where $\alpha_i = (a_{i1}, \ldots, a_{in})$ and, for each $j$, the sequence $(a_{1j}, a_{2j}, \ldots, a_{kj})$ is one of the following:

(1, 1, ..., 1)
(2, 2, ..., 2)
...
(k, k, ..., k)
(1, 2, ..., k)
(k, k-1, ..., 1).

In this case we say $\alpha_1, \alpha_2, \ldots, \alpha_k$ are in a natural order (there are two such orders). In traditional Tick-Tack-Toe, the second player can achieve a tie. In the $3^3$-game, however, the first player has a forced win (in fact, no tie position exists).


Thus, in the 3-dimensional games sold on the market, $k$ is usually 4. Our previous results enable us to draw some conclusions about the existence of winning and tying strategies in the general case.

THEOREM 4. (a) If $k \ge 3^n - 1$ ($k$ odd) or if $k \ge 2^{n+1} - 2$ ($k$ even), then the second player can force a tie in the $k^n$-game. (b) For each $k$, there exists $n_k$ such that the first player can force a win in the $k^n$-game if $n \ge n_k$.

Proof. (a) If $k$ is odd, there are at most $(3^n - 1)/2$ paths through any point and this bound is achieved only at the center point. If $k$ is even, the bound is $2^n - 1$. The result follows readily from Lemma 7. This suggests that the center point is the optimum move for the first player if $k$ is odd. (b) In Theorem 1, let $W = \{1, 2, \ldots, k\}$. Note that if $f$ is a functional then $f(W)$ is a path, but the converse is not true. Now $(I_{k,2})$ and Lemma 4 yield the result.

We conjecture that the bounds in Theorem 4(a) can be improved by a direct application of Lemmas 5 and 6. It seems possible that $k \ge 2(2^{1/n} - 1)^{-1}$, i.e., that the total number of points be greater than the total number of paths, can be shown to be sufficient in this way.

Even though, in some $k^n$-games, the second player cannot force a tie, a tie position may still exist, i.e., $\mathcal{S}$ (the collection of paths) may not be 2-regular in $X$ (the board). The bounds of Theorem 4(a) apply, but much more can be said.

THEOREM 5. If $k \ge n + 1$, then in the $k^n$-game the collection of paths is not 2-regular in the board.

Let $k$ be fixed. For each $n$ let the $k^n$-game board $X_n$ be the set of $n$-tuples on $\{1, 2, \ldots, k\}$. Designate the elements of $GF(2)$ by $\{0, 1\}$. Any partition of $X_n$ into two parts can be represented (in two ways) as a function from $X_n$ into $GF(2)$. Let $f: X_m \to GF(2)$ and $g: X_n \to GF(2)$ represent partitions. Then we define $f \oplus g: X_{m+n} \to GF(2)$ by

$$(f \oplus g)(a_1, \ldots, a_{m+n}) = f(a_1, \ldots, a_m) + g(a_{m+1}, \ldots, a_{m+n}),$$

where addition on the right takes place in $GF(2)$. Thus $f \oplus g$ represents a partition of $X_{m+n}$ into two parts. Note that "$\oplus$" is an associative operation on functions from the $X_i$ into $GF(2)$.

Proof of Theorem 5. Let $V_1, V_2, \ldots, V_m$ be $k$-dimensional vectors over $GF(2)$, that is, functions from $X_1$ into $GF(2)$. Define

$$F = V_1 \oplus V_2 \oplus \cdots \oplus V_m.$$

Suppose that for each choice of $U_i \in \{0, V_i, V_i^*\}$ $(i = 1, \ldots, m)$, not all zero, where $V_i^*$ denotes $V_i$ with its components in reverse order, the vector

$$U_1 + U_2 + \cdots + U_m$$

is neither all zeros nor all ones. Then from the above discussion it can be seen that $F$ represents a partition of $X_m$ no part of which contains a path. The theorem will be proved if for each $k$, $k - 1$ such vectors can be found. The desired constructions are obvious extensions of the following two examples for odd and even $k$.

For $k = 5$:
(1, 0, 0, 0, 1)
(0, 1, 0, 1, 0)
(1, 0, 0, 0, 0)
(0, 1, 0, 0, 0)

For $k = 6$:
(1, 0, 0, 0, 0, 1)
(0, 1, 0, 0, 1, 0)
(1, 0, 0, 0, 0, 0)
(0, 1, 0, 0, 0, 0)
(0, 0, 1, 0, 0, 0).
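The two example families can be checked by machine. The sketch below assumes that the partition in the proof is the function obtained by adding, in $GF(2)$, the component of $V_i$ selected by the $i$-th coordinate of a point; it then enumerates every path of the $k^n$-game from the characterization by constant, increasing, and decreasing coordinates and confirms that none is monochromatic. The function names are illustrative.

```python
from itertools import product


def paths(k, n):
    """All winning paths of the k^n-game, each as a frozenset of k points."""
    found = set()
    # each coordinate is a constant c in 1..k, increasing ("+") or decreasing ("-")
    for pattern in product(list(range(1, k + 1)) + ["+", "-"], repeat=n):
        if "+" not in pattern and "-" not in pattern:
            continue  # all coordinates constant: a single point, not a path
        line = tuple(
            tuple(i if c == "+" else k + 1 - i if c == "-" else c for c in pattern)
            for i in range(1, k + 1)
        )
        found.add(frozenset(line))  # a path and its reversal are the same set
    return found


def has_monochromatic_path(vectors):
    """True if the two-part partition defined by the vectors contains a path."""
    k, n = len(vectors[0]), len(vectors)

    def colour(point):
        return sum(v[a - 1] for v, a in zip(vectors, point)) % 2

    return any(len({colour(pt) for pt in path}) == 1 for path in paths(k, n))


V5 = [(1, 0, 0, 0, 1), (0, 1, 0, 1, 0), (1, 0, 0, 0, 0), (0, 1, 0, 0, 0)]
V6 = [(1, 0, 0, 0, 0, 1), (0, 1, 0, 0, 1, 0), (1, 0, 0, 0, 0, 0),
      (0, 1, 0, 0, 0, 0), (0, 0, 1, 0, 0, 0)]
print(has_monochromatic_path(V5), has_monochromatic_path(V6))  # False False
```

Both boards ($5^4$ and $6^5$) therefore admit a tie position, in agreement with Theorem 5.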

REFERENCES

1. R. Rado, Note on combinatorial analysis, Proc. London Math. Soc. (2) 48 (1943-45), 122-160.
2. A. Y. Khinchin, Three pearls of number theory, Graylock Press, Rochester, 1952, pp. 11-12.
3. R. Rado, Axiomatic treatment of rank in infinite sets, Canad. J. Math. 1 (1949), 338.
4. W. H. Gottschalk, Choice functions and Tychonoff's Theorem, Proc. Amer. Math. Soc. 2 (1951), 172.
5. D. Blackwell and M. A. Girshick, Theory of games and statistical decisions, Wiley, New York, 1954, p. 21.
6. P. Hall, On representatives of subsets, J. London Math. Soc. 10 (1935), 26-30.

CALIFORNIA INSTITUTE OF TECHNOLOGY, PASADENA, CALIFORNIA
UNIVERSITY OF OREGON, EUGENE, OREGON


Research Notes


On well-quasi-ordering finite trees

By C. ST. J. A. NASH-WILLIAMS
King's College, Aberdeen

(Received 9 March 1963)

Abstract. A new and simple proof is given of the known theorem that, if $T_1, T_2, \ldots$ is an infinite sequence of finite trees, then there exist $i$ and $j$ such that $i < j$ and $T_i$ is homeomorphic to a subtree of $T_j$.

A quasi-ordered set is a set $Q$ on which a reflexive and transitive relation $\le$ is defined. $Q$ and $Q'$ will denote quasi-ordered sets. An infinite sequence $q_1, q_2, \ldots$ of elements of $Q$ will be called good if there exist positive integers $i, j$ such that $i < j$ and $q_i \le q_j$; if not, the sequence will be called bad. A quasi-ordered set $Q$ is well-quasi-ordered (wqo) if every infinite sequence of elements of $Q$ is good.

A graph $G$ consists (for our purposes) of a finite set $V(G)$ of elements called vertices of $G$ and a subset $E(G)$ of the Cartesian product $V(G) \times V(G)$. The elements of $E(G)$ are called edges of $G$. If $(\xi, \eta) \in E(G)$, we call $\eta$ a successor of $\xi$. If $\xi, \eta \in V(G)$, a $\xi\eta$-path is a sequence $\xi_0, \ldots, \xi_n$ of vertices of $G$ such that $\xi_0 = \xi$, $\xi_n = \eta$ and $(\xi_{i-1}, \xi_i) \in E(G)$ for $i = 1, \ldots, n$. The sequence with sole term $\xi$ is accepted as a $\xi\xi$-path. If there exists a $\xi\eta$-path, we say that $\eta$ follows $\xi$.

For the purposes of this paper, a tree is a graph $T$ possessing a vertex $\rho(T)$ (called its root) such that, for every $\xi \in V(T)$, there exists a unique $\rho(T)\xi$-path in $T$. The letter $T$ (with or without dashes or suffixes) will always denote a tree. For the purposes of this paper, a homeomorphism of $T$ into $T'$ is a function $\phi: V(T) \to V(T')$ such that, for every $\xi \in V(T)$, the images under $\phi$ of the successors of $\xi$ follow distinct successors of $\phi(\xi)$. The set of all trees will be quasi-ordered by the rule that $T \le T'$ if and only if there exists a homeomorphism of $T$ into $T'$. This paper presents a new and shorter proof of the following theorem of Kruskal (2).

THEOREM 1. The set of all trees is wqo.
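The quasi-order $T \le T'$ can be decided recursively for small trees. The sketch below is an illustration of the definition only, under the assumption that a tree is encoded as a nested tuple whose entries are the branches at the successors of its root (so a single vertex is `()`); the names `embeds`, `_branches` and `_embeds_at` are hypothetical. A homeomorphism sending the root of `T` to the root of `U` exists exactly when the branches of `T` can be matched to distinct branches of `U` into which they embed.

```python
from functools import lru_cache


def embeds(T, U):
    """True if some homeomorphism of T into U exists (the root may go to any vertex)."""
    return any(_embeds_at(T, V) for V in _branches(U))


def _branches(U):
    """U itself followed by the branch of U at every other vertex."""
    yield U
    for child in U:
        yield from _branches(child)


@lru_cache(maxsize=None)
def _embeds_at(T, U):
    """True if some homeomorphism of T into U sends the root of T to the root of U."""
    # options[i]: successors of U's root whose branch can receive the i-th branch of T
    options = [[j for j, c in enumerate(U) if embeds(s, c)] for s in T]
    match = {}  # successor index in U -> branch index in T

    def assign(i, seen):
        for j in options[i]:
            if j not in seen:
                seen.add(j)
                if j not in match or assign(match[j], seen):
                    match[j] = i
                    return True
        return False

    return all(assign(i, set()) for i in range(len(T)))


cherry = ((), ())                    # a root with two successors
long_path = ((((),),),)              # a path with four vertices
subdivided = ((((),), ((),)),)       # a cherry hanging below an extra vertex
print(embeds(cherry, long_path), embeds(cherry, subdivided))  # False True
```

A finite sequence of trees is then bad, in the sense used here, when `embeds(Ti, Tj)` fails for every pair $i < j$.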

If $A$, $B$ are subsets of $Q$, a mapping $f: A \to B$ is non-descending if $a \le f(a)$ for every $a \in A$. The class of finite subsets of $Q$ will be denoted by $SQ$, and will be quasi-ordered by the rule that $A \le B$ if and only if there exists a one-to-one non-descending mapping of $A$ into $B$, where $A$, $B$ denote members of $SQ$. The Cartesian product $Q \times Q'$ will be quasi-ordered by the rule that $(q_1, q_1') \le (q_2, q_2')$ if and only if $q_1 \le q_2$ and $q_1' \le q_2'$. The cardinal number of a set $A$ will be denoted by $|A|$. The following two lemmas are well known (see (1)), but for the reader's convenience their proofs are given here.

LEMMA 1. If $Q$, $Q'$ are wqo, then $Q \times Q'$ is wqo.

Proof. We must prove an arbitrary infinite sequence $(q_1, q_1'), (q_2, q_2'), \ldots$ of elements of $Q \times Q'$ to be good. Call $q_m$ terminal if there is no $n > m$ such that $q_m \le q_n$. The number of $q_m$ which are terminal must be finite, since otherwise they would form a bad subsequence of $q_1, q_2, \ldots$. Therefore there is an $N$ such that $q_r$ is not terminal if $r > N$. We can therefore select a positive integer $f(1) > N$, then an $f(2) > f(1)$ such that $q_{f(1)} \le q_{f(2)}$, then an $f(3) > f(2)$ such that $q_{f(2)} \le q_{f(3)}$ and so on. Since $Q'$ is wqo, there exist $i, j$ such that $i < j$ and $q'_{f(i)} \le q'_{f(j)}$, whence $(q_{f(i)}, q'_{f(i)}) \le (q_{f(j)}, q'_{f(j)})$ and therefore our original sequence is good.

LEMMA 2. If $Q$ is wqo, then $SQ$ is wqo.

Proof. Assume that the lemma is false. Select an $A_1 \in SQ$ such that $A_1$ is the first term of a bad sequence of members of $SQ$ and $|A_1|$ is as small as possible. Then select an $A_2$ such that $A_1, A_2$ (in that order) are the first two terms of a bad sequence of members of $SQ$ and $|A_2|$ is as small as possible. Then select an $A_3$ such that $A_1, A_2, A_3$ (in that order) are the first three terms of a bad sequence of members of $SQ$ and $|A_3|$ is as small as possible. Assuming the Axiom of Choice, this process yields a bad sequence $A_1, A_2, A_3, \ldots$. Since this sequence is bad, no $A_i$ is empty: therefore we can select an element $a_i$ from each $A_i$. Let $B_i = A_i - \{a_i\}$. If there existed a bad sequence $B_{f(1)}, B_{f(2)}, \ldots$ such that $f(1) \le f(i)$ for all $i$, the sequence

$$A_1, A_2, \ldots, A_{f(1)-1}, B_{f(1)}, B_{f(2)}, \ldots$$

would be bad (since $A_i \le B_j$ entails $A_i \le A_j$ and is therefore impossible if $i < j$). Since this would contradict the definition of $A_{f(1)}$, there can be no bad sequence $B_{f(1)}, B_{f(2)}, \ldots$ such that $f(1) \le f(i)$ for all $i$. It follows that the class ($\mathfrak{B}$, say) of sets $B_i$ is wqo, since any bad sequence of sets $B_i$ would have a (bad) infinite subsequence in which no suffix was less than the first. Therefore, by Lemma 1, $Q \times \mathfrak{B}$ is wqo. Therefore there exist $i, j$ such that $i < j$ and $(a_i, B_i) \le (a_j, B_j)$, which implies that $A_i \le A_j$ and thus contradicts the badness of $A_1, A_2, \ldots$. This contradiction proves the lemma.

The branch of $T$ at a vertex $\xi$ is the tree $R$ such that $V(R)$ is the set of those vertices of $T$ which follow $\xi$ and $E(R) = E(T) \cap (V(R) \times V(R))$.

Proof of Theorem 1. Assume that the theorem is false. Select a tree $T_1$ such that $T_1$ is the first term of a bad sequence of trees and $|V(T_1)|$ is as small as possible. Then select a $T_2$ such that $T_1, T_2$ (in that order) are the first two terms of a bad sequence of trees and $|V(T_2)|$ is as small as possible, and so on. Assuming the Axiom of Choice, this process yields a bad sequence $T_1, T_2, \ldots$. Let $B_i$ be the set of branches of $T_i$ at the successors of its root, and let $B = B_1 \cup B_2 \cup \cdots$. If there existed a bad sequence $R_1, R_2, \ldots$ such that $R_i \in B_{f(i)}$ and $f(1) \le f(i)$ for every $i$, the sequence

$$T_1, T_2, \ldots, T_{f(1)-1}, R_1, R_2, \ldots$$

would be bad (since $T_i \le R \in B_j$ entails $T_i \le T_j$ and is therefore impossible if $i < j$). Since this would contradict the definition of $T_{f(1)}$, there can be no bad sequence $R_1, R_2, \ldots$ such that $R_i \in B_{f(i)}$ and $f(1) \le f(i)$ for every $i$. Since any bad sequence of elements of $B$ would have a bad subsequence of this form, it follows that no sequence of elements of $B$ is bad. Therefore $B$ is wqo and hence, by Lemma 2, $SB$ is wqo. Therefore $B_i \le B_j$ for some pair $i, j$ such that $i < j$. Therefore there is a one-to-one non-descending mapping $\phi: B_i \to B_j$. For each $R \in B_i$, $R \le \phi(R)$ and so there exists a homeomorphism $h_R$ of $R$ into $\phi(R)$. A homeomorphism $h$ of $T_i$ into $T_j$ may thus be defined by writing $h(\rho(T_i)) = \rho(T_j)$ and making $h$ coincide with $h_R$ on the vertices of each $R \in B_i$. Therefore $T_i \le T_j$, which contradicts the badness of $T_1, T_2, \ldots$ and thus proves Theorem 1.

The Tree Theorem of (2) is stronger than Theorem 1 of the present paper, but the above proof of Theorem 1 can easily be adapted to prove the Tree Theorem by considering $X \times F(B)$ in place of $SB$ (where $X$, $F$ have the meanings stated in (2)). Because the necessary changes are easy to make, I have sacrificed this much generality in the interests of readability.

Note added 10 August 1963. It has been brought to the author's notice that Kruskal's proof of the Tree Theorem (2) anticipated a somewhat similar proof obtained independently by S. Tarkowski (Bull. Acad. Polon. Sci. Sér. Sci. Math. Astr. Phys. 8 (1960), 39-41).

REFERENCES

(1) HIGMAN, G. Ordering by divisibility in abstract algebras. Proc. London Math. Soc. (3), 2 (1952), 326-336.
(2) KRUSKAL, J. B. Well-
Reprinted from Proc. Cambridge Phil. Soc. 59 (1963),833-835


T(lljI8.

Z. Wahrscheinlichkeitstheorie 2, 340-368 (1964)

On the Foundations of Combinatorial Theory
I. Theory of Möbius Functions

By GIAN-CARLO ROTA

Contents

1. Introduction
2. Preliminaries
3. The incidence algebra
4. Main results
5. Applications
6. The Euler characteristic
7. Geometric lattices
8. Representations
9. Application: the coloring of graphs
10. Application: flows in networks

1. Introduction

One of the most useful principles of enumeration in discrete probability and combinatorial theory is the celebrated principle of inclusion-exclusion (cf. FELLER*, FRÉCHET, RIORDAN, RYSER). When skillfully applied, this principle has yielded the solution to many a combinatorial problem. Its mathematical foundations were thoroughly investigated not long ago in a monograph by FRÉCHET, and it might at first appear that, after such exhaustive work, little else could be said on the subject.

One frequently notices, however, a wide gap between the bare statement of the principle and the skill required in recognizing that it applies to a particular combinatorial problem. It has often taken the combined efforts of many a combinatorial analyst over long periods to recognize an inclusion-exclusion pattern. For example, for the ménage problem it took fifty-five years, since CAYLEY's attempts, before JACQUES TOUCHARD in 1934 could recognize a pattern, and thence readily obtain the solution as an explicit binomial formula. The situation becomes bewildering in problems requiring an enumeration of any of the numerous collections of combinatorial objects which are nowadays coming to the fore. The counting of trees, graphs, partially ordered sets, complexes, finite sets on which groups act, not to mention more difficult problems relating to permutations with restricted position, such as Latin squares and the coloring of maps, seem to lie beyond present-day methods of enumeration. The lack of a systematic theory is hardly matched by the consummate skill of a few individuals with a natural gift for enumeration.

This work begins the study of a very general principle of enumeration, of which the inclusion-exclusion principle is the simplest, but also the typical case. It often happens that a set of objects to be counted possesses a natural ordering, in general only a partial order. It may be unnatural to fit the enumeration of such a set into a linear order such as the integers: instead, it turns out in a great many cases that a more effective technique is to work with the natural order of the set. One is led in this way to set up a "difference calculus" relative to an arbitrary partially ordered set.

Looked at in this way, a surprising variety of problems of enumeration reveal themselves to be instances of the general problem of inverting an "indefinite sum" ranging over a partially ordered set. The inversion can be carried out by defining an analog of the "difference operator" relative to a partial ordering. Such an operator is the Möbius function, and the analog of the "fundamental theorem of the calculus" thus obtained is the Möbius inversion formula on a partially ordered set. This formula is here expressed in a language close to that of number theory, where it appears as the well-known inverse relation between the Riemann zeta function and the Dirichlet generating function of the classical Möbius function. In fact, the algebra of formal Dirichlet series turns out to be the simplest non-trivial instance of such a "difference calculus", relative to the order relation of divisibility.

Once the importance of the Möbius function in enumeration problems is realized, interest will naturally center upon relating the properties of this function to the structure of the ordering. This is the subject of the first paper of this series; we hope to have at least begun the systematic study of the remarkable properties of this most natural invariant of an order relation.

We begin in Section 3 with a brief study of the incidence algebra of a locally finite partially ordered set and of the invariants associated with it: the zeta function, Möbius function, incidence function, and Euler characteristic. The language of number theory is kept, rather than that of the calculus of finite differences, and the results here are quite simple.

The next section contains the main theorems: Theorem 1 relates the Möbius functions of two sets related by a Galois connection. By suitably varying one of the sets while keeping the other fixed one can derive much information. Theorem 2 of this section is suggested by a technique that apparently goes back to RAMANUJAN. These two basic results are applied in the next section to a variety of special cases; although a number of applications and special cases have been left out, we hope thereby to have given an idea of the techniques involved.

The results of Section 6 stem from an "Ideenkreis" that can be traced back to Whitney's early work on linear graphs. Theorem 3 relates the Möbius function to certain very simple invariants of "cross-cuts" of a finite lattice, and the analogy with the Euler characteristic of combinatorial topology is inevitable. Pursuing this analogy, we were led to set up a series of homology theories, whose Euler characteristic does indeed coincide with the Euler characteristic which we had introduced by purely combinatorial devices.
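The number-theoretic special case mentioned above, inversion of an "indefinite sum" over the divisors of an integer, can be illustrated in a few lines. This sketch uses the classical facts that $\sum_{d \mid n} \mu(d)$ equals 1 for $n = 1$ and 0 otherwise, and that $g(n) = \sum_{d \mid n} f(d)$ implies $f(n) = \sum_{d \mid n} \mu(n/d)\, g(d)$; the function names are illustrative and the general poset version is not shown.

```python
def divisors(n):
    """The divisors of n, i.e. the elements below n in the divisibility order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def mobius(n):
    """Classical Möbius function, from the recursion sum_{d | n} mu(d) = [n == 1]."""
    if n == 1:
        return 1
    return -sum(mobius(d) for d in divisors(n) if d < n)


def summed(f, n):
    """The indefinite sum g(n) = sum of f(d) over the divisors d of n."""
    return sum(f(d) for d in divisors(n))


def inverted(g, n):
    """Möbius inversion: f(n) = sum of mu(n // d) * g(d) over the divisors d of n."""
    return sum(mobius(n // d) * g(d) for d in divisors(n))


f = lambda n: n * n + 1
g = lambda n: summed(f, n)
print(all(inverted(g, n) == f(n) for n in range(1, 40)))  # True
```

Applying `inverted` to the identity function recovers Euler's totient, for instance `inverted(lambda d: d, 12)` evaluates to 4.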


Some of the work in lattice theory that was carried out in the thirties is useful in this investigation; it turns out, however, that modular lattices are not combinatorially as interesting as a type of structure first studied by WHITNEY, which we have called geometric lattices following BIRKHOFF and the French school. The remarkable property of such lattices is that their Möbius function alternates in sign (Section 7).

To prevent the length of this paper from growing beyond bounds, we have omitted applications of the theory. Some elementary but typical applications will be found in the author's expository paper in the American Mathematical Monthly. Towards the end, however, the temptation to give some typical examples became irresistible, and Sections 9 and 10 were added. These by no means exhaust the range of applications; it is our conviction that the Möbius inversion formula on a partially ordered set is a fundamental principle of enumeration, and we hope to implement this conviction in the successive papers of this series. One of them will deal with structures in which the Möbius function is multiplicative (that is, has the analog of the number-theoretic property $\mu(mn) = \mu(m)\mu(n)$ if $m$ and $n$ are coprime) and another will give a systematic development of the Ideenkreis centering around POLYA's Hauptsatz, which can be significantly extended by a suitable Möbius inversion.

A few words about the history of the subject. The statement of the Möbius inversion formula does not appear here for the first time: the first coherent version (with some redundant assumptions) is due to WEISNER, and was independently rediscovered shortly afterwards by PHILIP HALL. Ward gave the statement in full generality. Strangely enough, however, these authors did not pursue the combinatorial implications of their work; nor was an attempt made to systematically investigate the properties of Möbius functions. Aside from HALL's applications to $p$-groups, and from some applications to statistical mechanics by M. S. GREEN and NETTLETON, little has been done; we give a hopefully complete bibliography at the end.

It is a pleasure to acknowledge the encouragement of G. BIRKHOFF and A. GLEASON, who spotted an error in the definition of a cross-cut, as well as of SEYMOUR SHERMAN and KAI-LAI CHUNG. My colleagues D. KAN, G. WHITEHEAD, and especially F. PETERSON gave me essential help in setting up the homological interpretation of the cross-cut theorem.

2. Preliminaries

Little knowledge is required to read this work. The two notions we shall not define are those of a partially ordered set (whose order relation is denoted by $\le$) and a lattice, which is a partially ordered set where max and min of two elements (we call them join and meet, as usual, and write them $\vee$ and $\wedge$) are defined. We shall use instead the symbols $\cup$ and $\cap$ to denote union and intersection of sets only. A segment $[x, y]$, for $x$ and $y$ in a partially ordered set $P$, is the set of all elements $z$ between $x$ and $y$, that is, such that $x \le z \le y$. We shall occasionally use open or half-open segments such as $[x, y)$, where one of the endpoints is to be omitted. A segment is endowed with the induced order structure; thus, a segment of a lattice is again a lattice. A partially ordered set is locally finite if every segment is finite. We shall only deal with locally finite partially ordered sets.


The product $P \times Q$ of partially ordered sets $P$ and $Q$ is the set of all ordered pairs $(p, q)$, where $p \in P$ and $q \in Q$, endowed with the order $(p, q) \ge (r, s)$ whenever $p \ge r$ and $q \ge s$. The product of any number of partially ordered sets is defined similarly. The cardinal power $\mathrm{Hom}(P, Q)$ is the set of all monotonic functions from $P$ to $Q$, endowed with the partial order structure $f \ge g$ whenever $f(p) \ge g(p)$ for every $p$ in $P$.

In a partially ordered set, an element $p$ covers an element $q$ when the segment $[q, p]$ contains two elements. An atom in $P$ is an element that covers a minimal element, and a dual atom is an element that is covered by a maximal element. If $P$ is a partially ordered set, we shall denote by $P^*$ the partially ordered set obtained from $P$ by inverting the order relation.

A closure relation in a partially ordered set $P$ is a function $p \to \bar{p}$ of $P$ into itself with the properties (1) $\bar{p} \ge p$; (2) $\bar{\bar{p}} = \bar{p}$; (3) $p \ge q$ implies $\bar{p} \ge \bar{q}$. An element is closed if $p = \bar{p}$. If $P$ is a finite Boolean algebra of sets, then a closure relation on $P$ defines a lattice structure on the closed elements by the rules $p \wedge q = p \cap q$ and $p \vee q = \overline{p \cup q}$, and it is easy to see that every finite lattice is isomorphic to one that is obtained in this way.

A Galois connection (cf. ORE, p. 182ff.) between two partially ordered sets $P$ and $Q$ is a pair of functions $\zeta: P \to Q$ and $\pi: Q \to P$ with the properties: (1) both $\zeta$ and $\pi$ are order-inverting; (2) for $p$ in $P$, $\pi(\zeta(p)) \ge p$, and for $q$ in $Q$, $\zeta(\pi(q)) \ge q$. Under these circumstances the mappings $p \to \pi(\zeta(p))$ and $q \to \zeta(\pi(q))$ are closure relations, and the two partially ordered sets formed by the closed sets are isomorphic.

In Section 7, the notion of a closure relation with the MacLane-Steinitz exchange property will be used. Such a closure relation is defined on the Boolean algebra $P$ of subsets of a finite set $E$ and satisfies the following property: if $p$ and $q$ are points of $E$, and $S$ a subset of $E$, and if $p \notin \bar{S}$ but $p \in \overline{S \cup q}$, then $q \in \overline{S \cup p}$. Such a closure relation can be made the basis of WHITNEY's theory of independence, as well as of the theory of geometric lattices. The closed sets of a closure relation satisfying the MACLANE-STEINITZ exchange property where every point is a closed set form a geometric (= matroid) lattice in the sense of BIRKHOFF (Lattice Theory, Chapter IX).

A partially ordered set $P$ is said to have a 0 or a 1 if it has a unique minimal or maximal element. We shall always assume $0 \ne 1$. A partially ordered set $P$ having a 0 and a 1 satisfies the chain condition (also called the JORDAN-DEDEKIND chain condition) when all totally ordered subsets of $P$ having a maximal number of elements have the same number of elements. Under these circumstances one introduces the rank $r(p)$ of an element $p$ of $P$ as the length of a maximal chain in the segment $[0, p]$, minus one. The rank of 0 is 0, and the rank of an atom is 1. The height of $P$ is the rank of any maximal element, plus one.

Let $P$ be a finite partially ordered set satisfying the chain condition and of height $n + 1$. The characteristic polynomial of $P$ is the polynomial $\sum_{x \in P} \mu(0, x)\, \lambda^{n - r(x)}$, where $r$ is the rank function (see the definition of $\mu$ below).

If A is a finite set, we shall write n(A) for the number of elements of A.


3. The incidence algebra

Let $P$ be a locally finite partially ordered set. The incidence algebra of $P$ is defined as follows. Consider the set of all real-valued functions of two variables $f(x, y)$, defined for $x$ and $y$ ranging over $P$, and with the property that $f(x, y) = 0$ if $x \not\le y$. The sum of two such functions $f$ and $g$, as well as multiplication by scalars, are defined as usual. The product $h = fg$ is defined as follows:

$$h(x, y) = \sum_{x \le z \le y} f(x, z)\, g(z, y).$$

In view of the assumption that $P$ is locally finite, the sum on the right is well-defined. It is immediately verified that this is an associative algebra over the real field (any other associative ring could do). The incidence algebra has an identity element which we write $\delta(x, y)$, the Kronecker delta. The zeta function $\zeta(x, y)$ of the partially ordered set $P$ is the element of the incidence algebra of $P$ such that $\zeta(x, y) = 1$ if $x \le y$ and $\zeta(x, y) = 0$ otherwise. The function $n(x, y) = \zeta(x, y) - \delta(x, y)$ is called the incidence function.

The idea of the incidence algebra is not new. The incidence algebra is a special case of a semigroup algebra relative to a semigroup which is easily associated with the partially ordered set. The idea of taking "interval functions" goes back to DEDEKIND and E. T. BELL; see also WARD.

Proposition 1. The zeta function of a locally finite partially ordered set is invertible in the incidence algebra.

Proof. We define the inverse $\mu(x, y)$ of the zeta function by induction over the number of elements in the segment $[x, y]$. First, set $\mu(x, x) = 1$ for all $x$ in $P$. Suppose now that $\mu(x, z)$ has been defined for all $z$ in the open segment $[x, y)$. Then set

$$\mu(x, y) = - \sum_{x \le z < y} \mu(x, z).$$

Clearly $\mu$ is an inverse of $\zeta$.

The function $\mu$, inverse to $\zeta$, is called the Möbius function of the partially ordered set $P$. The following result, simple though it is, is fundamental:
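The inductive definition in the proof of Proposition 1 translates directly into a recursion. The following is a minimal sketch, assuming as an illustrative poset the divisors of 12 ordered by divisibility; the names `P`, `leq`, `product`, and `mu` are choices made here, not notation from the paper. It builds the incidence-algebra product and checks that the recursively defined function is a two-sided inverse of the zeta function.

```python
from functools import lru_cache

N = 12
P = [d for d in range(1, N + 1) if N % d == 0]   # illustrative poset: divisors of 12

def leq(x, y):
    return y % x == 0                            # x <= y means "x divides y"

def zeta(x, y):
    return 1 if leq(x, y) else 0

def delta(x, y):
    return 1 if x == y else 0

def product(f, g):
    """Incidence-algebra product: h(x, y) = sum over x <= z <= y of f(x, z) g(z, y)."""
    def h(x, y):
        return sum(f(x, z) * g(z, y) for z in P if leq(x, z) and leq(z, y))
    return h

@lru_cache(maxsize=None)
def mu(x, y):
    """mu(x, x) = 1 and mu(x, y) = -(sum of mu(x, z) over x <= z < y)."""
    if x == y:
        return 1
    if not leq(x, y):
        return 0
    return -sum(mu(x, z) for z in P if z != y and leq(x, z) and leq(z, y))

mu_zeta = product(mu, zeta)
zeta_mu = product(zeta, mu)
two_sided = all(mu_zeta(x, y) == delta(x, y) == zeta_mu(x, y)
                for x in P for y in P)
```

The recursion only ever calls `mu` on strictly smaller segments, so it terminates on any finite poset; memoisation keeps the cost polynomial in the number of elements.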

Proposition 2 (Möbius inversion formula). Let $f(x)$ be a real-valued function, defined for $x$ ranging in a locally finite partially ordered set $P$. Let an element $p$ exist with the property that $f(x) = 0$ unless $x \ge p$. Suppose that

$$(*) \qquad g(x) = \sum_{y \le x} f(y).$$

Then

$$(**) \qquad f(x) = \sum_{y \le x} g(y)\, \mu(y, x).$$

Proof. The function $g$ is well-defined. Indeed, the sum on the right can be written as $\sum_{p \le y \le x} f(y)$, which is finite for a locally finite ordered set. Substituting the right side of (*) into the right side of (**) and simplifying, we get

$$\sum_{y \le x} g(y)\, \mu(y, x) = \sum_{y \le x} \sum_{z \le y} f(z)\, \mu(y, x) = \sum_{y \le x} \sum_{z} f(z)\, \zeta(z, y)\, \mu(y, x).$$

Interchanging the order of summation, this becomes

$$\sum_{z} f(z) \sum_{y \le x} \zeta(z, y)\, \mu(y, x) = \sum_{z} f(z)\, \delta(z, x) = f(x),$$

q.e.d.
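The inversion formula can be checked numerically on a small example. The sketch below assumes the divisors of 60 under divisibility as the poset and an arbitrary test function `f`; both are illustrative choices. It forms $g$ by (*) and recovers $f$ by (**).

```python
from functools import lru_cache

N = 60
P = [d for d in range(1, N + 1) if N % d == 0]   # illustrative poset: divisors of 60

def leq(x, y):
    return y % x == 0

@lru_cache(maxsize=None)
def mu(x, y):
    if x == y:
        return 1
    if not leq(x, y):
        return 0
    return -sum(mu(x, z) for z in P if z != y and leq(x, z) and leq(z, y))

# An arbitrary function f on P (any values would do).
f = {x: x * x + 1 for x in P}

# (*)  g(x) = sum of f(y) over y <= x
g = {x: sum(f[y] for y in P if leq(y, x)) for x in P}

# (**) f(x) = sum of g(y) mu(y, x) over y <= x
recovered = {x: sum(g[y] * mu(y, x) for y in P if leq(y, x)) for x in P}
```

Here the element $p$ of the proposition is the divisor 1, below which nothing lies, so the finiteness hypothesis holds trivially.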

Corollary 1. Let $r(x)$ be a function defined for $x$ in $P$. Suppose there is an element $q$ such that $r(x)$ vanishes unless $x \le q$. Suppose that

$$s(x) = \sum_{y \ge x} r(y).$$

Then

$$r(x) = \sum_{y \ge x} \mu(x, y)\, s(y).$$

The proof is analogous to the above and is omitted.

Proposition 3 (Duality). Let $P^*$ be the partially ordered set obtained by inverting the order of a locally finite partially ordered set $P$, and let $\mu^*$ and $\mu$ be the Möbius functions of $P^*$ and $P$. Then $\mu^*(x, y) = \mu(y, x)$.

Proof. We have, in virtue of Proposition 2 and Corollary 1,

$$\sum_{x \le^* y \le^* z} \mu^*(x, y) = \delta(x, z).$$

Letting $q(x, y) = \mu^*(y, x)$, it follows that $q$ is an inverse of $\zeta$ in the incidence algebra of $P$. Since the inverse is unique, $q = \mu$, q.e.d.

Proposition 4. The Möbius function of any segment $[x, y]$ of $P$ equals the restriction to $[x, y]$ of the Möbius function of $P$.

The proof is omitted.

Proposition 5. Let $P \times Q$ be the direct product of locally finite partially ordered sets $P$ and $Q$. The Möbius function of $P \times Q$ is given by

$$\mu((x, y), (u, v)) = \mu(x, u)\, \mu(y, v), \qquad x, u \in P; \quad y, v \in Q.$$

The proof is immediate and is omitted. The same letter $\mu$ has been used for the Möbius functions of three partially ordered sets, and we shall take this liberty whenever it will not cause confusion.

Corollary (Principle of Inclusion-Exclusion). Let $P$ be the Boolean algebra of all subsets of a finite set of $n$ elements. Then, for $x$ and $y$ in $P$,

$$\mu(x, y) = (-1)^{n(y) - n(x)}, \qquad y \ge x,$$

where $n(x)$ denotes the number of elements of the set $x$.

Indeed, a Boolean algebra is isomorphic to the product of $n$ chains of two elements, and every segment $[x, y]$ in a Boolean algebra is isomorphic to a Boolean algebra.

Aside from the simple result of Proposition 5, little can be said in general about how the Möbius function varies by taking subsets and homomorphic images of a partially ordered set. We shall see that more sophisticated notions will be required to relate the Möbius functions of two partially ordered sets.
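The sign rule of the Corollary is easy to confirm by brute force. The following sketch assumes a four-element ground set (an arbitrary choice) and represents the Boolean algebra by Python `frozenset` objects ordered by inclusion; the Möbius function is computed from the recursion of Proposition 1 and compared with $(-1)^{n(y) - n(x)}$.

```python
from functools import lru_cache
from itertools import combinations

ground = (0, 1, 2, 3)                            # illustrative 4-element set
P = [frozenset(c) for k in range(len(ground) + 1)
     for c in combinations(ground, k)]           # all 16 subsets

@lru_cache(maxsize=None)
def mu(x, y):
    if x == y:
        return 1
    if not x <= y:                               # x <= y is set inclusion
        return 0
    # sum over x <= z < y (z < y is proper inclusion for frozensets)
    return -sum(mu(x, z) for z in P if x <= z and z < y)

sign_rule_holds = all(mu(x, y) == (-1) ** (len(y) - len(x))
                      for x in P for y in P if x <= y)
```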


Let $P$ be a finite partially ordered set with 0 and 1. The Euler characteristic of $P$ is defined as

$$E = 1 + \mu(0, 1).$$

The simplest result relating to the computation of the Euler characteristic was proved by PHILIP HALL by combinatorial methods. We reprove it below with a very simple proof which shows one of the uses of the incidence algebra:

Proposition 6. Let $P$ be a finite partially ordered set with 0 and 1. For every $k$, let $C_k$ be the number of chains with $k$ elements stretched between 0 and 1. Then

$$E = 1 - C_2 + C_3 - C_4 + \cdots.$$

Proof. $\mu = \zeta^{-1} = (\delta + n)^{-1} = \delta - n + n^2 - \cdots$. It is easily verified that $n^{k-1}(x, y)$ equals the number of chains of $k$ elements stretched between $x$ and $y$. Letting $x = 0$ and $y = 1$, the result follows at once.

It will be seen in Section 6 that the Euler characteristic of a partially ordered set can be related to the classical Euler characteristic in suitable homology theories built on the partially ordered set.

Proposition 6 is a typical application of the incidence algebra. Several other results relating the number of chains and subsets with specified properties can often be expressed in terms of identities for functions in the incidence algebra. In this way, one obtains generalizations to an arbitrary partially ordered set of some classical identities for binomial coefficients. We shall not pursue this line here further, since it lies out of the track of the present work.
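Proposition 6 can be illustrated by counting chains directly. The sketch below again assumes the divisors of 12 under divisibility (so that 0 is the divisor 1 and 1 is the divisor 12); the function `chains` is a helper introduced here to count chains with exactly $k$ elements between two points.

```python
from functools import lru_cache

N = 12
P = [d for d in range(1, N + 1) if N % d == 0]   # bottom is 1, top is 12

def leq(x, y):
    return y % x == 0

@lru_cache(maxsize=None)
def mu(x, y):
    if x == y:
        return 1
    if not leq(x, y):
        return 0
    return -sum(mu(x, z) for z in P if z != y and leq(x, z) and leq(z, y))

@lru_cache(maxsize=None)
def chains(x, y, k):
    """Number of chains x = z_1 < z_2 < ... < z_k = y with exactly k elements."""
    if k == 1:
        return 1 if x == y else 0
    return sum(chains(z, y, k - 1)
               for z in P if z != x and leq(x, z) and leq(z, y))

C = {k: chains(1, N, k) for k in range(2, len(P) + 1)}
euler = 1 + mu(1, N)                             # E = 1 + mu(0, 1)
alternating = 1 + sum((-1) ** (k + 1) * C[k] for k in C)   # 1 - C_2 + C_3 - ...
```

For this poset the chain counts are $C_2 = 1$, $C_3 = 4$, $C_4 = 3$, and both sides equal 1.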

Example 1. The classical Möbius function $\mu(n)$ is defined as $(-1)^k$ if $n$ is the product of $k$ distinct primes, and 0 otherwise. The classical inversion formula first derived by Möbius in 1832 is:

$$g(m) = \sum_{n \mid m} f(n); \qquad f(m) = \sum_{n \mid m} g(n)\, \mu\!\left(\frac{m}{n}\right).$$

It is easy to see (and will follow trivially from later results) that $\mu\!\left(\frac{m}{n}\right)$ is the Möbius function of the set of positive integers, with divisibility as the partial order. In this case the incidence algebra has a distinguished subalgebra, formed by all functions $f(n, m)$ of the form $f(n, m) = G\!\left(\frac{m}{n}\right)$. The product $H = FG$ of two functions in this subalgebra can be written in the simpler form

$$(*) \qquad H(m) = \sum_{kn = m} F(k)\, G(n).$$

If we associate with the element $F$ of this subalgebra the formal Dirichlet series $\hat{F}(s) = \sum_{n=1}^{\infty} F(n)/n^s$, then the product (*) corresponds to the product of two formal Dirichlet series considered as functions of $s$, $\hat{H}(s) = \hat{F}(s)\, \hat{G}(s)$. Under this representation, the zeta function of the partially ordered set is the classical Riemann zeta function $\zeta(s) = \sum_{n=1}^{\infty} 1/n^s$, and the statement that the Möbius function is the inverse of the zeta function reduces to the classical identity

$$\frac{1}{\zeta(s)} = \sum_{n=1}^{\infty} \frac{\mu(n)}{n^s}.$$

It is hoped this example justifies much of the terminology introduced above.
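The identification in Example 1 can be tested on a finite piece of the divisibility order. The sketch below is illustrative only: `classical_mu` implements the number-theoretic definition by trial division, `poset_mu` implements the recursion of Proposition 1 on the divisors of 60, and `dirichlet` is the product (*). All three names are choices made for this example.

```python
from functools import lru_cache

def classical_mu(n):
    """(-1)^k if n is a product of k distinct primes, 0 otherwise."""
    k, p = 0, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:       # p^2 divided the original n
                return 0
            k += 1
        p += 1
    if n > 1:                    # one remaining prime factor
        k += 1
    return (-1) ** k

N = 60
P = [d for d in range(1, N + 1) if N % d == 0]

@lru_cache(maxsize=None)
def poset_mu(x, y):
    if x == y:
        return 1
    if y % x != 0:
        return 0
    return -sum(poset_mu(x, z) for z in P
                if z != y and z % x == 0 and y % z == 0)

def dirichlet(F, G, m):
    """The product (*): H(m) = sum over k n = m of F(k) G(n)."""
    return sum(F(k) * G(m // k) for k in range(1, m + 1) if m % k == 0)

def one(n):
    return 1                     # coefficients of the Riemann zeta function

poset_matches_classical = all(poset_mu(n, m) == classical_mu(m // n)
                              for n in P for m in P if m % n == 0)
mu_inverts_zeta = all(dirichlet(classical_mu, one, m) == (1 if m == 1 else 0)
                      for m in range(1, 51))
```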

Example 2. If $P$ is the set of ordinary integers, then $\mu(m, n) = -1$ if $m = n - 1$, $\mu(m, m) = 1$, and $\mu(m, n) = 0$ otherwise. The Möbius inversion formula reduces to a well known formula of the calculus of finite differences, which is the discrete analog of the fundamental theorem of calculus. The Möbius function of a partially ordered set can be viewed as the analog of the classical difference operator $\Delta f(n) = f(n + 1) - f(n)$, and the incidence algebra serves as a calculus of finite differences on an arbitrary partially ordered set.
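As a small check of Example 2, the sketch below restricts attention to the finite chain $0 < 1 < \cdots < 9$ (an assumption made so that everything is computable) and uses an arbitrary list of values for `f`. Inverting partial sums with the Möbius function amounts to taking successive differences.

```python
from functools import lru_cache

P = list(range(10))              # a finite piece 0 < 1 < ... < 9 of the integers

@lru_cache(maxsize=None)
def mu(m, n):
    if m == n:
        return 1
    if m > n:
        return 0
    return -sum(mu(m, z) for z in P if m <= z < n)

f = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]                       # arbitrary values f(0..9)
g = [sum(f[: n + 1]) for n in P]                         # g(n) = sum of f(k), k <= n
recovered = [sum(g[k] * mu(k, n) for k in P if k <= n) for n in P]

chain_rule_holds = all(
    mu(m, n) == (1 if m == n else -1 if n == m + 1 else 0)
    for m in P for n in P)
```

Since only $\mu(n, n)$ and $\mu(n - 1, n)$ are nonzero, the inversion sum collapses to $f(n) = g(n) - g(n - 1)$.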

4. Main results

It turns out that the Möbius functions of two partially ordered sets can be compared, when the sets are related by a Galois connection. By keeping one of the sets fixed, and varying the other from among sets with a simpler structure, such as Boolean algebras, subspaces of a finite vector space, partitions, etc., one can derive much information about a Möbius function. This is the program we shall develop. The basic result is the following:

Theorem 1. Let $P$ and $Q$ be finite partially ordered sets, where $P$ has a 0 and $Q$ has a 0 and a 1. Let $\mu_P$ and $\mu$ be their Möbius functions. Let

$$\pi: Q \to P; \qquad \varrho: P \to Q$$

be a Galois connection such that

(1) $\pi(x) = 0$ if and only if $x = 1$.

(2) $\varrho(0) = 1$.

Then

$$\mu(0, 1) = \sum_{a > 0} \mu_P(0, a)\, \zeta(\varrho(a), 0) = \sum_{[a:\, \varrho(a) = 0]} \mu_P(0, a).$$

One gets a significant summand on the right for every $a > 0$ in $P$ which is mapped into 0 by $\varrho$. One therefore expects the right side to contain "few" terms. In general, $\mu_P$ is a known function and $\mu$ is the function to be determined.

Proof. We shall first establish the identity

$$(*) \qquad \sum_{a \ge b} \delta(\pi(x), a) = \zeta(x, \varrho(b))$$

for every $b$ in $P$. Here $\zeta$ on the right stands for the zeta function of $Q$. Equation (*) is equivalent to the following statement: $\pi(x) \ge b$ if and only if $x \le \varrho(b)$. But this latter statement is immediate from the properties of a Galois connection. Indeed, if $\pi(x) \ge b$, then $\varrho(\pi(x)) \le \varrho(b)$, but $x \le \varrho(\pi(x))$, hence $x \le \varrho(b)$, and similarly for the converse implication.

To identity (*) we apply the Möbius inversion formula relative to $P$, thereby obtaining the identity

$$(**) \qquad \delta(\pi(x), 0) = \sum_{a \ge 0} \mu_P(0, a)\, \zeta(x, \varrho(a)).$$

Now, $\delta(\pi(x), 0)$ takes the value 1 if and only if $\pi(x) = 0$, that is, in view of assumption (1), if and only if $x = 1$. For all other values of $x$, we have $\delta(\pi(x), 0) = 0$. Therefore,

$$\delta(\pi(x), 0) = 1 - n(x, 1).$$

We can now rewrite equation (**) in the form

$$1 - n(x, 1) = \zeta(x, \varrho(0)) + \sum_{a > 0} \mu_P(0, a)\, \zeta(x, \varrho(a)).$$

However, in view of assumption (2), $\zeta(x, \varrho(0)) = \zeta(x, 1)$, and this is identically one for all $x$ in $Q$. Therefore, simplifying,

$$- n(x, 1) = \sum_{a > 0} \mu_P(0, a)\, \zeta(x, \varrho(a)).$$

Now, since $\zeta = \delta + n$, we have $\mu = \delta - \mu n$, hence, recalling that $0 \ne 1$,

$$\mu(0, 1) = - \sum_{0 \le x \le 1} \mu(0, x)\, n(x, 1) = \sum_{0 \le x \le 1} \sum_{a > 0} \mu_P(0, a)\, \mu(0, x)\, \zeta(x, \varrho(a)).$$

Interchanging the order of summation, we get

$$\mu(0, 1) = \sum_{a > 0} \mu_P(0, a) \sum_{0 \le x \le 1} \mu(0, x)\, \zeta(x, \varrho(a)).$$

The last sum on the right equals $\delta(0, \varrho(a))$, and this equals $\zeta(\varrho(a), 0)$. The proof is therefore complete.

For simplicity of application, we restate Theorem 1 inverting the order of $P$.

Corollary. Let $p: Q \to P$; $q: P \to Q$ be order preserving functions between $P$ and $Q$ such that

(1) If $p(x) = 1$ then $x = 1$, and conversely.

(2) $q(1) = 1$.

(3) $p(q(x)) \le x$ and $q(p(x)) \ge x$.

Then

$$\mu(0, 1) = \sum_{a < 1} \mu_P(a, 1)\, \zeta(q(a), 0) = \sum_{[a:\, q(a) = 0]} \mu_P(a, 1),$$

where $\mu$ is the Möbius function of $Q$.

The second result is suggested by a technique which apparently goes back to RAMANUJAN (cf. HARDY, RAMANUJAN, page 139).

Theorem 2. Let $Q$ be a finite partially ordered set with 0, and let $P$ be a partially ordered set with 0. Let $p: Q \to P$ be a monotonic function of $Q$ onto $P$. Assume that the inverse image of every interval $[0, a]$ in $P$ is an interval $[0, x]$ in $Q$, and that the inverse image of 0 contains at least two points. Then

$$\sum_{[x:\, p(x) = a]} \mu(0, x) = 0$$

for every $a$ in $P$.

The proof is by induction over the set $P$. Since $[0, 0]$ is an interval and its inverse image is an interval $[0, q]$ with $q > 0$, we have

$$\sum_{[x:\, p(x) = 0]} \mu(0, x) = \sum_{0 \le x \le q} \mu(0, x) = 0.$$

Suppose now the statement is true for all $b$ such that $b < a$ in $P$. Then

$$\sum_{b < a} \sum_{[x:\, p(x) = b]} \mu(0, x) = 0.$$

It follows that

$$\sum_{[x:\, p(x) = a]} \mu(0, x) = \sum_{b \le a} \sum_{[x:\, p(x) = b]} \mu(0, x).$$

The last sum equals the sum over some interval $[0, r]$ which is the inverse image of the segment $[0, a]$, that is

$$\sum_{b \le a} \sum_{[x:\, p(x) = b]} \mu(0, x) = \sum_{0 \le x \le r} \mu(0, x) = \delta(0, r).$$

But $r > 0$ because $a$ is strictly greater than 0. Hence $\delta(r, 0) = 0$, and this concludes the proof.

5. Applications

The simplest (and typical) application of Theorem 1 is the following:

Proposition 1. Let $R$ be a subset of a finite lattice $L$ with the following properties: $1 \notin R$, and for every $x$ of $L$, except $x = 1$, there is an element $y$ of $R$ such that $y \ge x$. For $k \ge 2$, let $q_k$ be the number of subsets of $R$ containing $k$ elements whose meet is 0. Then $\mu(0, 1) = q_2 - q_3 + q_4 - \cdots$.

Proof. Let $B(R)$ be the Boolean algebra of subsets of $R$. We take $P = B(R)$ and $Q = L$ in Theorem 1, and establish a Galois connection as follows. For $x$ in $L$, let $\pi(x)$ be the set of elements of $R$ which dominate $x$. In particular, $\pi(1)$ is the empty set. For $A$ in $B(R)$, set $\varrho(A) = \bigwedge A$, namely, the meet of all elements of $A$, an empty meet giving as usual the element 1. This is evidently a Galois connection. Conditions (1) and (2) of the Theorem are obviously satisfied. The function $\mu_P$ is given by the Corollary of Proposition 5 of Section 3, and hence the conclusion is immediate.

Two noteworthy special cases are obtained by taking $R$ to be the set of dual atoms of $Q$, or the set of all elements $< 1$ (cf. also WEISNER).

Closure relations. A useful application of Theorem 1 is the following:

Proposition 2. Let $x \to \bar{x}$ be a closure relation on a partially ordered set $Q$ having 1, with the property that $\bar{x} = 1$ only if $x = 1$. Let $P$ be the partially ordered subset of all closed elements of $Q$. Then: (a) If $\bar{x} > x$, then $\mu(x, 1) = 0$; (b) If $\bar{x} = x$, then $\mu(x, 1) = \mu_P(x, 1)$, where $\mu_P$ is the Möbius function of $P$.

Proof. Considering $[x, 1]$, it may be assumed that $P$ has a 0 and $x = 0$. We apply Corollary 1 of Theorem 1, setting $p(x) = \bar{x}$ and letting $q$ be the injection map of $P$ into $Q$. It is then clear that the assumptions of the Corollary are satisfied, and the set of all $a$ in $P$ such that $q(a) = 0$ is either the empty set or the single element 0, q.e.d.

Corollary (Ph. Hall).
If 0 is not the meet of dual atoms of a finite lattice $L$, or if 1 is not the join of atoms, then $\mu(0, 1) = 0$.

Proof. Set $\bar{x} = \bigwedge A(x)$, where $A(x)$ is the set of dual atoms of $Q$ dominating $x$, and apply the preceding result. The second assertion is obtained by inverting the order.

Example 1. Distributive lattices. Let $L$ be a locally finite distributive lattice. Using Proposition 2, we can easily compute its Möbius function. Taking an interval $[x, y]$ and applying Proposition 4 of Section 3, we can assume that $L$ is finite. For $a \in L$, define $\bar{a}$ to be the join of all atoms which $a$ dominates. Then $a \to \bar{a}$ is a closure relation in the inverted lattice $L^*$. Furthermore, the subset of closed elements is easily seen to be isomorphic to a finite Boolean algebra (cf. BIRKHOFF, Lattice Theory, Ch. IX). Applying Proposition 5 of Section 3, we find: $\mu(x, y) = 0$ if $y$ is not the join of elements covering $x$, and $\mu(x, y) = (-1)^n$ if $y$ is the join of $n$ distinct elements covering $x$.

In the special case of the integers ordered by divisibility, we find the formula for the classical Möbius function (cf. Example 1 of Section 3).

The Möbius function of cardinal products. Let $P$ and $Q$ be finite partially ordered sets. We shall determine the Möbius function of the partially ordered set $\mathrm{Hom}(P, Q)$ of monotonic functions from $P$ to $Q$, in terms of the Möbius function of $Q$. It turns out that very little information is needed about $P$.

A few preliminaries are required for the statement. Let $R$ be a subset of a partially ordered set $Q$ with 0, and let $\bar{R}$ be the ideal generated by $R$, that is, the set of all elements $x$ in $Q$ which are below ($\le$) some element of $R$. We denote by $Q/R$ the partially ordered set obtained by removing all the elements of $\bar{R}$, and leaving the rest of the order relation unchanged. There is a natural order-preserving transformation of $Q$ onto $Q/R$ which is one-to-one for elements of $Q$ not in $\bar{R}$. We shall call $Q/R$ the quotient of $Q$ by the ideal generated by $R$.

Lemma. Let $f: P \to Q$ be monotonic with range $R \subset Q$. Then the segment $[f, 1]$ in $\mathrm{Hom}(P, Q)$ is isomorphic with $\mathrm{Hom}(P, Q/R)$.

Proof. For $g$ in $[f, 1]$, set $g'(x) = g(x)$ to obtain a mapping $g \to g'$ of $[f, 1]$ to $\mathrm{Hom}(P, Q/R)$. Since $g \ge f$, the range of $g$ lies above $R$, so the map is an isomorphism.

Proposition 3.
The Möbius function $\mu$ of the cardinal product $\mathrm{Hom}(P, Q)$ of the finite partially ordered set $P$ with the partially ordered set $Q$ with 0 and 1 is determined as follows:

(a) If $f(p) \ne 0$ for some element $p$ of $P$ which is not maximal, then $\mu(0, f) = 0$.

(b) In all other cases,

$$\mu(0, f) = \prod_{m} \mu(0, f(m)), \qquad m \in P,$$

where the product ranges over all maximal elements of $P$, and where $\mu$ on the right stands for the Möbius function of $Q$.

(c) For $f \le g$, $\mu(f, g) = \mu(0, g')$, where $g'$ is the image of $g$ under the canonical map of $[f, 1]$ onto $\mathrm{Hom}(P, Q/R)$, provided $Q/R$ has a 0.

Proof. Define a closure relation in $[0, f]^*$, namely the segment $[0, f]$ with the inverted order relation, as follows. Set $\bar{g}(m) = g(m)$ if $m$ is a maximal element of $P$, and $\bar{g}(a) = 0$ if $a$ is not a maximal element of $P$. If $\bar{g} = 0$, then $g(m) = 0$ for all maximal elements $m$, hence $g(a) = 0$ for all $a <$ some maximal element, since $g$ is monotonic. Hence $g = 0$, and the assumption of Proposition 2 is satisfied. The set of closed elements is isomorphic to $\mathrm{Hom}(M, P)$, where $M$ is a set of as many elements as there are maximal elements in $P$. Conclusion (a) now follows from Proposition 2, and conclusion (b) from Proposition 5 of Section 3. Conclusion (c) follows at once from the Lemma.
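Proposition 1 of this section, with $R$ taken to be the set of dual atoms, lends itself to a quick numerical check. The sketch below assumes the lattice of divisors of $N$ under divisibility, where the meet is the greatest common divisor, the element 0 is the divisor 1, and the element 1 is $N$ itself; the helper name `mu_and_alternating_sum` and the sample values of $N$ are illustrative choices.

```python
from functools import lru_cache, reduce
from itertools import combinations
from math import gcd

def mu_and_alternating_sum(N):
    """Return (mu(0, 1), q_2 - q_3 + q_4 - ...) for the divisor lattice of N."""
    L = [d for d in range(1, N + 1) if N % d == 0]

    def leq(x, y):
        return y % x == 0                        # x <= y means "x divides y"

    @lru_cache(maxsize=None)
    def mu(x, y):
        if x == y:
            return 1
        if not leq(x, y):
            return 0
        return -sum(mu(x, z) for z in L if z != y and leq(x, z) and leq(z, y))

    # R = dual atoms: elements d < N with nothing strictly between d and N.
    R = [d for d in L if d != N
         and not any(e not in (d, N) and leq(d, e) for e in L)]

    alternating = 0
    for k in range(2, len(R) + 1):
        # q_k = number of k-subsets of R whose meet (gcd) is the bottom element 1
        q_k = sum(1 for A in combinations(R, k) if reduce(gcd, A) == 1)
        alternating += (-1) ** k * q_k
    return mu(1, N), alternating

results = {N: mu_and_alternating_sum(N) for N in (12, 30, 36, 60, 210)}
```

For $N = 12$ and $N = 36$ the dual atoms have a common factor, so every $q_k$ vanishes and $\mu(0, 1) = 0$, in agreement with the Corollary of Ph. Hall.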


We pass now to some applications of Theorem 2.

Proposition 4. Let $a \to \bar{a}$ be a closure relation on a finite lattice $Q$, with the property that $\overline{a \vee b} = \bar{a} \vee \bar{b}$ and $\bar{0} > 0$. Then for all $a \in Q$,

$$\sum_{[x:\, \bar{x} = a]} \mu(0, x) = 0.$$

Proof. Let $P$ be a partially ordered set isomorphic to the set of closed elements of $Q$. We define $p(x)$, for $x$ in $Q$, to be the element of $P$ corresponding to the closed element $\bar{x}$. Since $\bar{0} > 0$, any $x$ between 0 and $\bar{0}$ is mapped into 0. Hence the inverse image of 0 in $P$ under the homomorphism $p$ is the nontrivial interval $[0, \bar{0}]$. Now consider an interval $[0, a]$ in $P$. Then $p^{-1}([0, a]) = [0, \bar{x}]$, where $\bar{x}$ is the closed element of $Q$ corresponding to $a$. Indeed, if $0 \le y \le \bar{x}$ then $\bar{y} \le \bar{\bar{x}} = \bar{x}$, hence $p(y) \le a$. Conversely, if $p(y) \le a$, then $\bar{y} \le \bar{x}$ but $y \le \bar{y}$, hence $y \le \bar{x}$. Therefore the condition of Theorem 2 is satisfied, and the conclusion follows at once.
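Proposition 4 can be tried out on a concrete closure relation. The sketch below assumes the lattice of divisors of 60 (join is the least common multiple, 0 is the divisor 1) and the closure $\bar{x} = x \vee a_0$ with $a_0 = 6$; these choices, and the names `closure` and `fibre_sums`, are illustrative. It first checks the hypotheses and then sums $\mu(0, x)$ over each set of elements with a common closure.

```python
from functools import lru_cache
from math import gcd

N, a0 = 60, 6                    # a0 > 0, i.e. a0 is not the bottom element 1
L = [d for d in range(1, N + 1) if N % d == 0]

def leq(x, y):
    return y % x == 0

def join(x, y):
    return x * y // gcd(x, y)    # least common multiple

def closure(x):
    return join(x, a0)           # the closure x -> x v a0

@lru_cache(maxsize=None)
def mu(x, y):
    if x == y:
        return 1
    if not leq(x, y):
        return 0
    return -sum(mu(x, z) for z in L if z != y and leq(x, z) and leq(z, y))

# Hypotheses: x <= closure(x), idempotence, and compatibility with joins.
closure_axioms = all(
    leq(x, closure(x))
    and closure(closure(x)) == closure(x)
    and closure(join(x, y)) == join(closure(x), closure(y))
    for x in L for y in L)

# For each candidate value c, sum mu(0, x) over all x whose closure is c.
fibre_sums = {c: sum(mu(1, x) for x in L if closure(x) == c) for c in L}
```

For example, the elements with closure 30 are 5, 10, 15, and 30, whose Möbius values $-1, 1, 1, -1$ cancel.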

Corollary (Weisner). (a) Let $a > 0$ in a finite lattice $L$. Then, for any $b$ in $L$,

$$\sum_{x \vee a = b} \mu(0, x) = 0.$$

(b) Let $a < 1$ in $L$. Then, for any $b$ in $L$,

$$\sum_{x \wedge a = b} \mu(x, 1) = 0.$$

Proof. Take $\bar{x} = x \vee a$. Part (b) is obtained by inverting the order.

Example 2. Let $V$ be a finite-dimensional vector space of dimension $n$ over a finite field with $q$ elements. We denote by $L(V)$ the lattice of subspaces of $V$. We shall use Proposition 4 to compute the Möbius function of $L(V)$.

In the lattice $L(V)$, every segment $[x, y]$, for $x \le y$, is isomorphic to the lattice $L(W)$, where $W$ is the quotient space of the subspace $y$ by the subspace $x$. If we denote by $\mu_n = \mu_n(q)$ the value of $\mu(0, 1)$ for $L(V)$, it follows that $\mu(x, y) = \mu_j$, when $j$ is the dimension of the quotient space $W$. Therefore once $\mu_n$ is known for every $n$, the entire Möbius function is known.

To determine $\mu_n$, consider a subspace $a$ of dimension $n - 1$. In view of the preceding Corollary, we have for all $a < 1$ (where 1 stands for the entire space $V$):

$$\sum_{x \wedge a = 0} \mu(x, 1) = 0,$$

where 0 stands of course for the 0-subspace. Let $a$ be a dual atom of $L(V)$, that is, a subspace of dimension $n - 1$. Which subspaces $x$ have the property that $x \wedge a = 0$? $x$ must be a line in $V$, and such a line must be disjoint except for 0 from $a$. A subspace of dimension $n - 1$ contains $q^{n-1}$ distinct points, so there will be $q^n - q^{n-1}$ points outside of $a$. However, every line contains exactly $q - 1$ points. Therefore, for each subspace $a$ of dimension $n - 1$ there are

$$\frac{q^n - q^{n-1}}{q - 1} = q^{n-1}$$

distinct lines $x$ such that $x \wedge a = 0$. Since each interval $[x, 1]$ is isomorphic to a space of dimension $n - 1$, we obtain

$$\mu_n = \mu(0, 1) = - \sum_{\substack{x \wedge a = 0 \\ x \ne 0}} \mu(x, 1) = - q^{n-1} \mu_{n-1}.$$

This is a difference equation for $\mu_n$ which is easily solved by iteration. We obtain the result first established by PHILIP HALL (see also WEISNER and S. DELSARTE):

$$\mu_n(q) = (-1)^n q^{n(n-1)/2} = (-1)^n q^{\binom{n}{2}}.$$
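The closed formula of Example 2 can be confirmed by brute force for small cases. The sketch below assumes $q = 2$ and $n \le 4$; vectors of $\mathbf{F}_2^n$ are encoded as integers with XOR as addition, and a subspace is stored as the `frozenset` of its vectors. The encoding and the function names are choices made for this illustration.

```python
from functools import lru_cache

def subspaces(n):
    """All subspaces of F_2^n; vectors are ints and vector addition is XOR."""
    zero = frozenset([0])
    found = {zero}
    frontier = [zero]
    while frontier:
        S = frontier.pop()
        for v in range(1, 2 ** n):
            if v not in S:
                # span of S and v: S together with its translate S + v
                T = frozenset(S | {s ^ v for s in S})
                if T not in found:
                    found.add(T)
                    frontier.append(T)
    return list(found)

def mu_bottom_top(n):
    """mu(0, 1) in the lattice of subspaces of F_2^n, by the defining recursion."""
    L = subspaces(n)
    bottom, top = frozenset([0]), frozenset(range(2 ** n))

    @lru_cache(maxsize=None)
    def mu(x, y):
        if x == y:
            return 1
        if not x <= y:
            return 0
        return -sum(mu(x, z) for z in L if x <= z and z < y)

    return mu(bottom, top)

computed = [mu_bottom_top(n) for n in range(1, 5)]
formula = [(-1) ** n * 2 ** (n * (n - 1) // 2) for n in range(1, 5)]
```

For $n = 3$ the lattice has 16 elements (the zero space, 7 lines, 7 planes, and the whole space), and the recursion gives $-(1 - 7 + 14) = -8$.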

6. The Euler characteristic

Sharper results relating $\mu(0, 1)$ to combinatorial invariants of a finite lattice can be obtained by application of Theorem 1, when the "comparison set" $P$ remains a Boolean algebra.

A cross-cut $C$ of a finite lattice $L$ is a subset of $L$ with the following properties:

(a) $C$ does not contain 0 or 1.

(b) No two elements of $C$ are comparable (that is, if $x$ and $y$ belong to $C$, then neither $x < y$ nor $x > y$ holds).

(c) Any maximal chain stretched between 0 and 1 meets the set $C$.

A spanning subset $S$ of $L$ is a subset such that $\bigvee S = 1$ and $\bigwedge S = 0$.

The main result is the following Cross-cut Theorem:

Theorem 3. Let $\mu$ be the Möbius function and $E$ the Euler characteristic of a nontrivial finite lattice $L$, and let $C$ be a cross-cut of $L$. For every integer $k \ge 2$, let $q_k$ denote the number of spanning subsets of $C$ containing $k$ distinct elements. Then

$$E - 1 = \mu(0, 1) = q_2 - q_3 + q_4 - q_5 + \cdots.$$

The proof is by induction over the distance of a cross-cut $C$ from the element 1. Define the distance $d(x)$ of an element $x$ from the element 1 as the maximum length of a chain stretched between $x$ and 1. For example, the distance of a dual atom is two. If $C$ is a cross-cut of $L$, define the distance $d(C)$ as $\max d(x)$ as $x$ ranges over $C$. Thus, the distance of the cross-cut consisting of all dual atoms is two, and conversely, this is the only cross-cut having distance two. It follows from Proposition 1 of Section 5 that the result holds when $d(C) = 2$ (take $R = C$ in the assertion of the Proposition).

Thus, we shall assume the truth of the statement for all cross-cuts whose distance is less than $n$, and prove it for a cross-cut $C$ with $d(C) = n$. We write $x > C$ or $x \ge C$ to mean that there is an element $y$ of $C$ such that $x > y$, or that there is an element $y$ of $C$ such that $x \ge y$, and similarly for $x < C$ and $x \le C$. For a general $C$, these possibilities may not be mutually exclusive; they are mutually exclusive when $C$ is a cross-cut. We shall repeatedly make use of this remark below.

Define a modified lattice $L'$ as follows. Let $L'$ contain all the elements $x$ such that $x \le C$ in the same order. On top of $C$, add an element 1 covering all the elements of $C$, but no others; this defines $L'$.

In $L'$, consider the cross-cut $C$ and apply Proposition 1 of Section 5 again. If $\mu'$ is the Möbius function of $L'$, then

$$\mu'(0, 1) = p_2 - p_3 + p_4 - \cdots,$$

where $p_k$ is the number of all subsets $A \subset C \subset L'$ of $k$ elements, such that $\bigwedge A = 0$.

Comparing the lattices $L$ and $L'$, we have

$$0 = \sum_{x \le C} \mu(0, x) + \sum_{x > C} \mu(0, x) = \sum_{x \le C} \mu'(0, x) + \mu'(0, 1).$$

However, for $x \le C$, we have $\mu'(0, x) = \mu(0, x)$ by construction of $L'$. Hence

$$\sum_{x \le C} \mu(0, x) = - p_2 + p_3 - p_4 + \cdots.$$

Since the sets $(x \mid x \le C)$ and $(x \mid x > C)$ are disjoint, we can write

$$\mu(0, 1) = - \sum_{x < 1} \mu(0, x) = - \Big[ \sum_{x \le C} \mu(0, x) + \sum_{1 > x > C} \mu(0, x) \Big].$$

We now simplify the first summation on the right:

$$(*) \qquad \mu(0, 1) = p_2 - p_3 + p_4 - \cdots - \sum_{1 > x > C} \mu(0, x).$$

Now let $q_k(x)$ be the number of subsets of $C$ having $k$ elements, whose meet is 0 and whose join is $x$. In particular, $q_k(1) = q_k$. Then clearly

$$p_k = \sum_{x > C} q_k(x), \qquad k \ge 2,$$

so the summation in (*) can be simplified to

$$(**) \qquad \mu(0, 1) = (q_2 - q_3 + q_4 - \cdots) - \sum_{1 > x > C} \big[ - q_2(x) + q_3(x) - q_4(x) + \cdots + \mu(0, x) \big].$$

For $x$ above $C$ and unequal to 1, consider the segment $[0, x]$. We prove that $C(x) = C \cap [0, x]$ is a cross-cut of the lattice $[0, x]$ such that $d(C(x)) < d(C)$. Once this is done, it follows by the induction hypothesis that every term in brackets on the right of (**) vanishes, and the proof will be complete.

Conditions (a) and (b) in the definition of a cross-cut are trivially satisfied by $C(x)$, and condition (c) is verified as follows. Suppose $Q$ is a maximal chain in $[0, x]$ which does not meet $C(x)$. Choose a maximal chain $R$ in the segment $[x, 1]$; then the chain $Q \cup R$ is maximal in $L$, and does not intersect $C$.

It remains to verify that $d(C(x)) < d(C)$, and this is quite simple. There is a chain $Q$ stretched between $C$ and $x$ whose length is $d(C(x))$. Then $d(C)$ exceeds the length of the chain $Q \cup R$, and since $x < 1$, $R$ has length at least 2, hence the length of $Q \cup R$ exceeds that of $Q$ by at least one. The proof is therefore complete.

Theorem 3 gives a relation between the value $\mu(0, 1)$ and the width of narrow cross-cuts or bottlenecks of a lattice. The proof of the following statement is immediate.

Corollary 1. (a) If $L$ has a cross-cut with one element, then $\mu(0, 1) = 0$.

(b) If $L$ has a cross-cut with two elements, then the only two possible values of $\mu(0, 1)$ are 0 and 1.

(c) If $L$ has a cross-cut having three elements, then the only possible values of $\mu(0, 1)$ are 2, 1, 0, and $-1$.

In this connection, an interesting combinatorial problem is to determine all possible values of $\mu(0, 1)$, given that $L$ has a cross-cut with $n$ elements.
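The Cross-cut Theorem can be exercised on small lattices. The sketch below assumes divisor lattices (meet is the greatest common divisor, join the least common multiple) and a handful of hand-picked cross-cuts: the atoms or the dual atoms of the divisors of 12 and 30, and the middle rank $\{4, 6, 9\}$ of the divisors of 36. That these sets are cross-cuts is taken as given rather than checked by the code, and the function names are illustrative.

```python
from functools import lru_cache, reduce
from itertools import combinations
from math import gcd

def lcm(a, b):
    return a * b // gcd(a, b)

def mobius_bottom_top(N):
    """mu(0, 1) in the divisor lattice of N, by the defining recursion."""
    L = [d for d in range(1, N + 1) if N % d == 0]

    @lru_cache(maxsize=None)
    def mu(x, y):
        if x == y:
            return 1
        if y % x != 0:
            return 0
        return -sum(mu(x, z) for z in L
                    if z != y and z % x == 0 and y % z == 0)

    return mu(1, N)

def cross_cut_sum(N, C):
    """q_2 - q_3 + q_4 - ..., where q_k counts spanning k-subsets of C."""
    total = 0
    for k in range(2, len(C) + 1):
        q_k = sum(1 for A in combinations(C, k)
                  if reduce(gcd, A) == 1 and reduce(lcm, A) == N)
        total += (-1) ** k * q_k
    return total

# (N, cross-cut) pairs; each C is assumed, not verified, to be a cross-cut.
cases = [(12, [2, 3]), (12, [4, 6]), (30, [2, 3, 5]),
         (30, [6, 10, 15]), (36, [4, 6, 9])]
comparison = [(mobius_bottom_top(N), cross_cut_sum(N, C)) for N, C in cases]
```

In the last case the spanning subsets are $\{4, 9\}$ and $\{4, 6, 9\}$, so $q_2 - q_3 = 1 - 1 = 0$, one of the values allowed by part (c) of Corollary 1.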

Z. Wahrscheinlichkeitstheorie, Bd. 2


Reduction of the main formula. In several applications of the cross-cut theorem, the computation of the number $q_k$ of spanning sets may be long, and systematic procedures have to be devised. One such procedure is the following:

Proposition 1. Let C be a cross-cut of a finite lattice L. For every integer $k \ge 0$, and for every subset $A \subset C$, let q(A) be the number of spanning sets containing A, and let $S_k = \sum_A q(A)$, where A ranges over all subsets of C having k elements. Set $S_0$ to be the number of elements of C. Then

$$\mu(0, 1) = S_0 - 2 S_1 + 2^2 S_2 - 2^3 S_3 + \cdots.$$

Proof. For every subset $B \subset C$, set p(B) = 1 if B is a spanning set, and p(B) = 0 otherwise. Then

$$q(A) = \sum_{B \supseteq A} p(B).$$

Applying the Möbius inversion formula on the Boolean algebra of subsets of C, we get

$$p(A) = \sum_{B \supseteq A} q(B)\, \mu(A, B),$$

where $\mu$ is the Möbius function of the Boolean algebra. Summing over all subsets $A \subset C$ having exactly k elements,

$$q_k = \sum_{n(A) = k} p(A) = \sum_{n(A) = k} \; \sum_{B \supseteq A} q(B)\, \mu(A, B).$$

Interchanging the order of summation on the right, recalling Proposition 5 of Section 3 and the fact that a set of $k + l$ elements possesses $\binom{k+l}{k}$ subsets of k elements, we obtain

$$(*) \qquad q_k = S_k - \binom{k+1}{k} S_{k+1} + \binom{k+2}{k} S_{k+2} - \cdots.$$

A convenient way of recasting this expression in a form suitable for computation is the following. Let V be the vector space of all polynomials in the variable x, over the real field. The polynomials $1, x, x^2, \ldots$ are linearly independent in V. Hence there exists a linear functional L in V such that

$$L(x^k) = S_k, \qquad k = 0, 1, 2, \ldots.$$

Formula (*) can now be rewritten in the concise form

$$q_k = L\!\left( \frac{x^k}{(1+x)^{k+1}} \right).$$

Upon applying the cross-cut theorem, we find the expression (where $q_0$ and $q_1$ are also given by (*), but turn out to be 0)

$$\mu(0, 1) = L\!\left( \frac{1}{1+x} - \frac{x}{(1+x)^2} + \frac{x^2}{(1+x)^3} - \cdots \right) = L\!\left( \frac{1}{1+2x} \right) = L(1 - 2x + 4x^2 - 8x^3 + \cdots) = S_0 - 2 S_1 + 4 S_2 - \cdots,$$

q.e.d.
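The inversion step (*) in the proof above is a statement about any family of subsets of C, so it can be checked directly. The sketch below is illustrative only: it takes for C the three atoms of the lattice of partitions of a 3-element set, where (as an assumption of this example) the spanning sets are the three pairs of atoms and the full triple, and compares the counts $q_k$ obtained from (*) with a direct count for $k \ge 1$. The names `q_containing`, `S` and `q_by_inversion` are hypothetical helpers.

```python
from itertools import combinations
from math import comb


def q_containing(A, spanning):
    """q(A): number of spanning sets that contain the set A."""
    return sum(1 for B in spanning if A <= B)


def S(k, C, spanning):
    """S_k: sum of q(A) over all k-element subsets A of C."""
    return sum(q_containing(frozenset(A), spanning)
               for A in combinations(C, k))


def q_by_inversion(k, C, spanning):
    """Right-hand side of (*): sum over l of (-1)^l * binom(k+l, k) * S_{k+l}."""
    return sum((-1) ** l * comb(k + l, k) * S(k + l, C, spanning)
               for l in range(len(C) - k + 1))


# Atoms of the partition lattice of {1, 2, 3}, labelled by the pair they merge.
C = ["12", "13", "23"]
# Assumed spanning sets: every pair of atoms, and all three atoms together.
spanning = [frozenset(s) for s in
            (("12", "13"), ("12", "23"), ("13", "23"), ("12", "13", "23"))]

direct = {k: sum(1 for B in spanning if len(B) == k) for k in (1, 2, 3)}
by_formula = {k: q_by_inversion(k, C, spanning) for k in (1, 2, 3)}

# The cross-cut theorem then gives mu(0, 1) = q_2 - q_3.
mu = by_formula[2] - by_formula[3]

if __name__ == "__main__":
    print(direct, by_formula, mu)
```

Here both counts give $q_1 = 0$, $q_2 = 3$, $q_3 = 1$, so that $\mu(0, 1) = 2$ for this lattice.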

On the Foundations of Combinatorial Theory. I

The cross-cut theorem can be applied to study which alterations of the order relation of a lattice preserve the Euler characteristic. Every alteration which preserves meets and joins of the spanning subsets of some cross-cut will preserve the Euler characteristic. There is a great variety of such changes, and we shall not develop a systematic theory here. The following is a simple case.

Following BIRKHOFF and JÓNSSON and TARSKI we define the ordinal sum of lattices as follows. Given a lattice L and a function assigning to every element x of L a lattice L(x) (all the L(x) are distinct), the ordinal sum $P = \sum_L L(x)$ of the lattices L(x) over the lattice L is the partially ordered set P consisting of the set $\bigcup_{x \in L} L(x)$, where $u \le v$ if $u \in L(x)$ and $v \in L(x)$ and $u \le v$ in L(x), or if $u \in L(x)$ and $v \in L(y)$ and x < y. It is clear that P is a lattice if all the L(x) are finite lattices.

Proposition 2. If the finite lattice P is the ordinal sum of the lattices L(x) over the non-trivial lattice L, and $\mu_P$, $\mu_x$ and $\mu_L$ are the corresponding Möbius functions, then: If L(0) is the one element lattice, then $\mu_P(0, 1) = \mu_L(0, 1)$.

Proof. The atoms of P are in one-to-one correspondence with the atoms of L and the spanning subsets are the same. Hence the result follows by applying the cross-cut theorem to the atoms.

In virtue of a theorem of JÓNSSON and TARSKI, every lattice P has a unique maximal decomposition into an ordinal sum over a "skeleton" L. This can be used in connection with the preceding Corollary to further simplify the computation of $\mu(0, \pi)$ as $\pi$ ranges through P.

Homological interpretation. The alternating sums in the Cross-Cut Theorem suggest that the Euler characteristic of a lattice be interpreted as the Euler characteristic in a suitable homology theory. This is indeed the case. We now define* a homology theory H(C) relative to an arbitrary cross-cut C of a finite lattice L. For the homological notions, we refer to Eilenberg-Steenrod.

Order the elements of C, say $a_1, a_2, \ldots, a_n$. For $k \ge 0$, let a k-simplex $\sigma$ be any subset of C of k + 1 elements which does not span. Let $C_k$ be the free abelian group generated by the k-simplices. We let $C_{-1} = 0$; for a given simplex $\sigma$, let $\sigma_i$ be the set obtained by omitting the (i + 1)-st element of $\sigma$, when the elements of $\sigma$ are ordered according to the given ordering of C. The boundary of a k-simplex is defined as usual as

$$\partial_k \sigma = \sum_{i=0}^{k} (-1)^i \sigma_i,$$

and is extended by linearity to all of $C_k$, giving a linear mapping of $C_k$ into $C_{k-1}$. The k-th homology group $H_k$ is defined as the abelian group obtained by taking the quotient of the kernel of $\partial_k$ by the image of $\partial_{k+1}$. The rank $b_k$ of the abelian group $H_k$, that is, the number of independent generators of infinite cyclic subgroups of $H_k$, is the k-th Betti number. Let $\alpha_k$ be the rank of $C_k$, that is, the number of k-simplices. The Euler characteristic of the homology H(C) is defined in homology theory as

$$E(C) = \sum_{k=0}^{\infty} (-1)^k \alpha_k.$$

* This definition was obtained jointly with D. KAN, F. PETERSON and G. WHITEHEAD, whom I now wish to thank.


It follows from well-known results in homology theory that

$$E(C) = \sum_{k=0}^{\infty} (-1)^k b_k.$$

Let $q_k$ be the number of spanning subsets with k elements as in Theorem 3. Then $q_{k+1} + \alpha_k$ is the total number of subsets of C having k + 1 elements; if C has N elements, then $\alpha_k = \binom{N}{k+1} - q_{k+1}$. It follows from the Cross-Cut Theorem that

$$E(C) = \sum_{k=0}^{\infty} (-1)^k \binom{N}{k+1} + \mu(0, 1).$$

We have however

$$\sum_{k=0}^{\infty} (-1)^k \binom{N}{k+1} = -\sum_{i=1}^{N} (-1)^i \binom{N}{i} = 1 - \sum_{i=0}^{N} (-1)^i \binom{N}{i} = 1 - (1 - 1)^N = 1,$$

and hence $E(C) = 1 + \mu(0, 1) = E$; in other words:

Proposition 3. In a finite lattice, the Euler characteristic of the homology of any cross-cut C equals the Euler characteristic of the lattice.
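Proposition 3 can be illustrated by counting simplices directly. The sketch below is a hedged example, not part of the paper: it again uses the lattice of divisors of n (join = lcm, meet = gcd), assumes that a subset spans when its join is n and its meet is 1, takes $\alpha_k$ to be the number of non-spanning subsets of the cross-cut with k + 1 elements, and compares $\sum (-1)^k \alpha_k$ with $1 + \mu(0, 1)$.

```python
from functools import reduce
from itertools import combinations
from math import gcd


def lcm(a, b):
    return a * b // gcd(a, b)


def is_spanning(subset, n):
    """Join (lcm) equals the top n and meet (gcd) equals the bottom 1."""
    return reduce(lcm, subset) == n and reduce(gcd, subset) == 1


def euler_characteristic_of_cut(n, cut):
    """E(C) = sum over k of (-1)^k alpha_k, where alpha_k is the number of
    k-simplices, i.e. non-spanning subsets of `cut` with k + 1 elements."""
    total = 0
    for k in range(len(cut)):
        alpha_k = sum(1 for s in combinations(cut, k + 1)
                      if not is_spanning(s, n))
        total += (-1) ** k * alpha_k
    return total


def mobius_bottom_top(n):
    """mu(0, 1) of the divisor lattice of n via the defining recursion."""
    divs = [d for d in range(1, n + 1) if n % d == 0]
    mu = {1: 1}
    for m in divs[1:]:
        mu[m] = -sum(mu[d] for d in divs if d != m and m % d == 0)
    return mu[n]


if __name__ == "__main__":
    # Cross-cut of atoms {2, 3, 5} and cross-cut of dual atoms {6, 10, 15} of n = 30.
    print(euler_characteristic_of_cut(30, [2, 3, 5]),
          euler_characteristic_of_cut(30, [6, 10, 15]),
          1 + mobius_bottom_top(30))
```

Both cross-cuts of the divisor lattice of 30 give the same value as $1 + \mu(0, 1)$, as the proposition predicts.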

This result can sometimes be used to compute the Möbius functions of "large" lattices. In general, the numbers $q_k$ are rather redundant, since any spanning subset of k elements gives rise to several spanning subsets with more than k elements. A method for eliminating redundant spanning sets is then called for. One such method consists precisely in the determination of the Betti numbers $b_k$. We conjecture that the Betti numbers of H(C) are themselves independent of the cross-cut C, and are also "invariants" of the lattice L, like the Euler characteristic E(C). In the special case of lattices of height 4 satisfying the chain condition, this conjecture has been proved (in a different language) by DOWKER.

Example 1. The Betti numbers of a Boolean algebra. We take the cross-cut C of all atoms. If the height of the Boolean algebra is n + 1, then every k-cycle, for k < n - 2, bounds, so that $b_0 = 1$ and $b_k = 0$ for 0 < k < n - 2. On the other hand, there is only one cycle in dimension n - 2. Hence $b_{n-2} = 1$ and we find $E = 1 + (-1)^{n-2}$, which agrees with Proposition 5 of Section 3.

A notion of Euler characteristic for distributive lattices has been recently introduced by HADWIGER and KLEE. For finite distributive lattices, KLEE's Euler characteristic is related to the one introduced in this work. We refer to KLEE's paper for details.

7. Geometric lattices

An ordered structure of very frequent occurrence in combinatorial theory is the one that has been variously called matroid (WHITNEY), matroid lattice (BIRKHOFF), closure relation with the exchange property (MACLANE), geometric lattice (BIRKHOFF), abstract linear dependence relation (BLEICHER and PRESTON). Roughly speaking, these structures arise in the study of combinatorial objects that are obtained by piecing together smaller objects with a particularly simple structure. The typical such case is a linear graph, which is obtained by piecing together edges. Several counting problems associated with such structures can often be attacked by Möbius inversion, and one finds that the Möbius functions involved have particularly simple properties.

We briefly summarize the needed facts out of the theory of such structures, referring to any of the works of the above authors for the proofs. A finite lattice L is a geometric lattice when every element of L is the join of atoms, and whenever a and b in L cover $a \wedge b$, then $a \vee b$ covers both a and b. Equivalently, a geometric lattice is characterized by the existence of a rank function satisfying $r(a \wedge b) + r(a \vee b) \le r(a) + r(b)$. Notice that this implies the chain condition. In particular if a is an atom, then $r(a \vee c) = r(c)$ or $r(c) + 1$. If M is a semimodular lattice, then the partially ordered subset of all elements which are joins of atoms is a geometric sublattice.

Geometric lattices are most often obtained from a closure relation on a finite set which satisfies the MACLANE-STEINITZ exchange property. The lattice L of closed sets in such a closure relation is a geometric lattice whenever every one-element set is closed. Conversely, every geometric lattice can be obtained in this way by defining one such closure relation on the set of its atoms.

The fundamental property of the Möbius function of geometric lattices is the following:

Theorem 4. Let $\mu$ be the Möbius function of a finite geometric lattice L. Then: (a) $\mu(x, y) \ne 0$ for any pair x, y in L, provided $x \le y$. (b) If y covers z, then $\mu(x, y)$ and $\mu(x, z)$ have opposite signs.

Proof. Any segment [x, y] of a geometric lattice is also a geometric lattice. It will therefore suffice to assume that x = 0, y = 1 and that z is a dual atom of L. We proceed by induction [...]

$$\mu(0, 1) = -\sum_{x \vee a = 1,\; x \ne 1} \mu(0, x).$$

Now from the subadditive inequality

$$r(x \wedge a) + r(x \vee a) \le r(x) + r(a)$$

we infer that if $x \vee a = 1$, then $n \le \dim x + \dim a$, hence $\dim x \ge n - 1$. The element x must therefore be a dual atom. It follows from the induction assumption and from the fact that L satisfies the chain condition, that all the $\mu(0, x)$ in the sum on the right have the same sign, and none of them is zero. Therefore, $\mu(0, 1)$ is not zero, and its sign is the opposite of that of $\mu(0, x)$ for any dual atom x. This concludes the proof.


Corollary. The coefficients of the characteristic polynomial of a geometric lattice alternate in sign.

We next derive a combinatorial interpretation of the Euler characteristic of a geometric lattice, which generalizes a technique first used by WHITNEY in the study of linear graphs. A subset {a, b, ..., c} of a geometric lattice L is independent when

$$r(a \vee b \vee \cdots \vee c) = r(a) + r(b) + \cdots + r(c).$$

Let $C_k$ be the cross-cut of L of all elements of rank k > 0. A maximal independent subset $\{a, b, \ldots, c\} \subset C_k$ is a basis of $C_k$. All bases of $C_k$ have the same number of elements, namely, n - k if the lattice has height n. A subset $A \subset C_k$ is a circuit (WHITNEY) when it is not independent but every proper subset is independent. A set is independent if and only if it contains no circuits. Order the elements of L of rank k in a linear order, say $a_1, a_2, \ldots, a_l$. This ordering induces a lexicographic ordering of the circuits of $C_k$. If the subset $\{a_{i_1}, a_{i_2}, \ldots, a_{i_j}\}$ ($i_1 < i_2 < \cdots < i_j$) is a circuit, the subset $a_{i_1}, a_{i_2}, \ldots, a_{i_{j-1}}$ will be called a broken circuit.

Proposition 1. Let L be a geometric lattice of height n + 1, and let $C_k$ be the cross-cut of all elements of rank k. Then $\mu(0, 1) = (-1)^n m_k$, where $m_k$ is the number of subsets of $C_k$ whose meet is 0, containing n - k + 1 elements each, and not containing all the arcs of any broken circuit.

Again, the assertion implies that $m_1 = m_2 = m_3 = \cdots$.

Proof. Let the lexicographically ordered broken circuits be $P_1, P_2, \ldots, P_\sigma$, and let $S_i$ be the family of all spanning subsets of $C_k$ containing $P_i$ but not $P_1, P_2, \ldots,$ or $P_{i-1}$. In particular, $S_{\sigma+1}$ is the family of all those spanning subsets not containing all the arcs of any broken circuit. Let $q_j^i$ be the number of spanning subsets of j elements and not belonging to $S_i$. We shall prove that for each $i \ge 1$

$$(*) \qquad \mu(0, 1) = q_2^i - q_3^i + q_4^i - \cdots.$$

First, set i = 1. The set $S_1$ contains all spanning subsets containing the broken circuit $P_1$. Let $\bar{P}_1$ be the circuit obtained by completing the broken circuit $P_1$. A spanning set contained in $S_1$ contains either $\bar{P}_1$ or else $P_1$ but not $\bar{P}_1$; call these two families of spanning subsets A and B, and let $q_j^A$ and $q_j^B$ be defined accordingly. Then $q_j = q_j^1 + q_j^A + q_j^B$, and

$$\mu(0, 1) = q_2 - q_3 + q_4 - \cdots = q_2^1 - q_3^1 + q_4^1 - \cdots + q_2^A + (q_2^B - q_3^A) - (q_3^B - q_4^A) + \cdots.$$

Now, $q_2^A = 0$, because no circuit can contain two elements; there is a one-to-one correspondence between the elements of A and those of B, obtained by completing the broken circuit $P_1$. Thus, all terms in parentheses cancel and the identity (*) holds for i = 1. To prove (*) for i > 1, remark that the element $c_i$ of $C_k$, which is dropped from a circuit to obtain the broken circuit $P_i$, does not occur in any of the previous circuits, because of the lexicographic ordering of the circuits. Hence the induction can be continued up to $i = \sigma + 1$.


Any set belonging to $S_{\sigma+1}$ does not contain any circuit. Hence, it is an independent set. Since it is a spanning set, it must contain n - k + 1 elements. Thus, all the integers $q_j^{\sigma+1}$ vanish except $q_{n-k+1}^{\sigma+1}$, and the statement follows from (*), q.e.d.

Corollary 1. Let $q(\lambda) = \lambda^n + m_1 \lambda^{n-1} + m_2 \lambda^{n-2} + \cdots + m_n$ be the characteristic polynomial of a geometric lattice of height n + 1. Then $(-1)^k m_k$ is a positive integer for $1 \le k \le n$, equal to the number of independent subsets of k atoms not containing any broken circuit.

The proof is immediate: take k = 1 in the preceding Proposition.

The homology of a geometric lattice is simpler than that of a general lattice:

Proposition 2. In the homology relative to the cross-cut $C_k$ of all elements of rank k = 1, the Betti numbers $b_1, b_2, \ldots, b_{k-2}$ vanish.

The proof is not difficult.

Example 1. Partitions of a set. Let S be a finite set of n elements. A partition $\pi$ of S is a family of disjoint subsets $B_1, B_2, \ldots, B_k$, called blocks, whose union is S. There is a (well-known) natural ordering of partitions, which is defined as follows: $\pi \le \sigma$ whenever every block of $\pi$ is contained in a block of partition $\sigma$. In particular, 0 is the partition having n blocks, and I is the partition having one block. In this ordering, the partially ordered set of partitions is a geometric lattice (cf. BIRKHOFF). The Möbius function for the lattice of partitions was first determined by SCHÜTZENBERGER and independently by ROBERTO FRUCHT and the author. We give a new proof which uses a recursion. If $\pi$ is a partition, the class of $\pi$ is the (finite) sequence $(k_1, k_2, \ldots)$, where $k_i$ is the number of blocks with i elements.

Lemma. Let $L_n$ be the lattice of partitions of a set with n elements. If $\pi \in L_n$ is of rank k, then the segment $[\pi, I]$ is isomorphic to $L_{n-k}$. If $\pi$ is of class $(k_1, k_2, \ldots)$, then the segment $[0, \pi]$ is isomorphic to the direct product of $k_1$ lattices isomorphic to $L_1$, $k_2$ lattices isomorphic to $L_2$, etc.

The proof is immediate.

It follows from the Lemma that if [x, y] is a segment of $L_n$, then it is isomorphic to a product of $k_i$ lattices isomorphic to $L_i$, i = 1, 2, .... We call the sequence $(k_1, k_2, \ldots)$ the class of the segment [x, y].

Proposition 3. Let $\mu_n = \mu(0, I)$ for the lattice of partitions of a set with n elements. Then

$$\mu_n = (-1)^{n-1} (n - 1)!.$$

Proof. By the Corollary to Proposition 4 of Section 5,

$$\sum_{x \wedge a = 0} \mu(x, I) = 0.$$

Let a be the dual atom consisting of a block $C_1$ containing n - 1 points, and a second block $C_2$ containing one point. Which non-zero partitions x have the property that $x \wedge a = 0$? Let the blocks of such a partition x be $B_1, \ldots, B_k$. None of the blocks $B_i$ can contain two distinct points of the block $C_1$, otherwise the two points would still belong to the same block in the intersection. Furthermore, only one of the $B_i$ can contain the block $C_2$. Hence, all the $B_i$ contain one point, except one, which contains $C_2$ and an extra point. We conclude that x must be an atom, and there are n - 1 such atoms. Hence,

$$\mu_n = \mu(0, I) = -\sum_x \mu(x, I),$$

where x ranges over a set of n - 1 atoms. By the Lemma, the segment [x, I] is isomorphic to the lattice of partitions of a set with n - 1 elements, hence $\mu_n = -(n - 1)\, \mu_{n-1}$. Since $\mu_2 = -1$, the conclusion follows.

Corollary. If the segment [x, y] is of class $(k_1, k_2, \ldots, k_n)$, then

$$\mu(x, y) = \mu_1^{k_1} \mu_2^{k_2} \cdots \mu_n^{k_n} = (-1)^{k_2 + 2 k_3 + \cdots + (n-1) k_n} (2!)^{k_3} (3!)^{k_4} \cdots ((n-1)!)^{k_n}.$$
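Proposition 3 lends itself to a brute-force check for small n. The following is a minimal sketch, not taken from the paper: it enumerates all partitions of an n-element set, orders them by refinement, computes $\mu(0, x)$ from the defining recursion $\mu(0, x) = -\sum_{y < x} \mu(0, y)$, and compares $\mu(0, I)$ with $(-1)^{n-1}(n-1)!$. The helper names are illustrative.

```python
from math import factorial


def set_partitions(elements):
    """Yield every partition of the list `elements` as a list of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        # `first` forms a block of its own ...
        yield [[first]] + partition
        # ... or joins one of the existing blocks.
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]


def refines(p, q):
    """True when every block of p is contained in a block of q (p <= q)."""
    return all(any(block <= other for other in q) for block in p)


def mobius_partition_lattice(n):
    """mu(0, I) in the lattice of partitions of an n-element set."""
    parts = [frozenset(frozenset(b) for b in p)
             for p in set_partitions(list(range(n)))]
    parts.sort(key=len, reverse=True)      # finest partitions first
    mu = {}
    for x in parts:
        if len(x) == n:                    # the partition 0 into singletons
            mu[x] = 1
        else:                              # strictly finer y have more blocks
            mu[x] = -sum(mu[y] for y in parts
                         if len(y) > len(x) and refines(y, x))
    top = frozenset([frozenset(range(n))])
    return mu[top]


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, mobius_partition_lattice(n), (-1) ** (n - 1) * factorial(n - 1))
```

The enumeration grows with the Bell numbers, so this is only practical for very small n; the recursion $\mu_n = -(n-1)\mu_{n-1}$ of the proof is the efficient route.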

The Möbius inversion formula on the partitions of a set has several combinatorial applications; see the author's expository paper on the subject.

8. Representations

There is, as is well known, a close analogy between combinatorial results relating to Boolean algebras and those relating to the lattice of subspaces of a vector space. This analogy is displayed for example in the theory of q-difference equations developed by F. H. JACKSON, and can be noticed in many number-theoretic investigations. In view of it, we are led to surmise that a result analogous to Proposition 1 of Section 5 exists, in which the Boolean algebra of subsets of R is replaced by a lattice of subspaces of a vector space over a finite field. Such a result does indeed exist; in order to establish it a preliminary definition is needed.

Let L be a finite lattice, and let V be a finite-dimensional vector space over a finite field with q elements. A representation of L over V is a monotonic map p of L into the lattice M of subspaces of V, having the following properties:

(1) p(0) = 0.

(2) $p(a \vee b) = p(a) \vee p(b)$.

(3) Each atom of L is mapped to a line of the vector space V, and the set of lines thus obtained spans the entire space V.

A representation is faithful when the mapping p is one-to-one. We shall see in Section 9 that a great many ordered structures arising in combinatorial problems admit faithful representations.

Given a representation $p: L \to M$, one defines the conjugate map $q: M \to L$ as follows. Let K be the set of atoms of M (namely, lines of V), and let A be the image under p of the set of atoms of L. For $s \in M$, let K(s) be the set of atoms of M dominated by s, and let B(s) be a minimal subset of A which spans (in the vector space sense) every element of K(s). Let A(s) be the subset of A which is spanned by B(s). A simple vector-space argument, which is here omitted, shows that the set A(s) is well defined, that is, that it does not depend upon the choice of B(s), but only upon the choice of s. Let C(s) be the set of atoms of L which are mapped by p onto A(s). Set $q(s) = \bigvee C(s)$ in the lattice L; this defines the map q. It is obviously a monotonic function.

Lemma. Let $p: L \to M$ be a faithful representation and let $q: M \to L$ be the conjugate map. Assume that every element of L is a join of atoms. Then $p(q(s)) \ge s$ and $q(p(x)) \le x$.

Proof. By definition, $q(s) = \bigvee C(s)$, where C(s) is the inverse image of A(s) under p. By property (2) of a representation,

$$p(q(s)) = p\Big(\bigvee C(s)\Big) = \bigvee p(C(s)) = \bigvee A(s).$$


But this join of the set of lines A(s) in the lattice M is the same as their span in the vector space V. Hence $\bigvee A(s) \ge s$, and we conclude that $p(q(s)) \ge s$. To prove that $q(p(x)) \le x$, it suffices to show that A(p(x)) = B, where B is the set of atoms in A dominated by p(x). Clearly $B \subset A(p(x))$, and it will suffice to establish the converse implication. By (2), and by the fact that x is a join of atoms, we have $p(x) = \bigvee B$. Therefore every line l dominated by p(x) is spanned by a subset of B. If in addition $l \in A$, then $l \le \bigvee C$ for some subset $C \subset B$, hence $l \in B$. This shows $B \supset A(p(x))$, q.e.d.

Theorem 5. Let L be a finite lattice, where every element is a join of atoms, let $p: L \to M$ be a faithful representation of L into the lattice M of subspaces of a vector space V over a finite field with q elements, and let $q: M \to L$ be the conjugate map. For every $k \ge 2$, let $m_k$ be the number of k-dimensional subspaces s of V such that q(s) = I. Then

$$(*) \qquad \mu(0, I) = q^{\binom{2}{2}} m_2 - q^{\binom{3}{2}} m_3 + q^{\binom{4}{2}} m_4 - \cdots,$$

where $\mu$ is the Möbius function of L.

Proof. Let Q = L*, let $c: L \to Q$ and $c^*: Q \to L$ be the canonical isomorphisms between L and Q. Define $\pi: Q \to M$ as $\pi = p c^*$, and $\rho: M \to Q$ as $\rho = c q$. We verify that $\pi$ and $\rho$ give a Galois connection between Q and M satisfying the hypothesis of Theorem 1. If $\pi(x) = 0$, then there is a $y \in L$ such that $y = c^*(x)$ and p(y) = 0. It follows from the definition of a representation that y = 0. Hence x = c(y) = 1. Furthermore, $\rho(0) = c(q(0)) = 1$. It follows from the preceding Lemma that $\pi$ and $\rho$ are a Galois connection. Applying Theorem 1 and the result of Example 2 of Section 5, formula (*) follows at once.

Remark. It is easy to see that every lattice having a faithful representation is a geometric lattice. The converse is however not true, as an example of T. LAZARSON shows.

A reduction similar to that of Proposition 1 of Section 7 can be carried out with Theorem 5 and representations, and another combinatorial property of the Euler characteristic is obtained.

9. The coloring of graphs

By way of illustration of the preceding theory, we give some applications to the classic problem of coloring of graphs, and to the problem of constructing flows in networks with specified properties. Our results extend previous work of G. D. BIRKHOFF, D. C. LEWIS, W. T. TUTTE and H. WHITNEY.

A linear graph G = (V, E) is a structure consisting of a finite set V, whose elements are called vertices, together with a family E of two-element subsets of V, called edges. Two vertices a and b are adjacent when the set (a, b) is an edge; the vertices a and b are called the endpoints of (a, b). Alternately, one calls the vertices regions and calls the graph a map, and we use the two terms interchangeably, considering them as two words for the same object. If S is a set of edges, the vertex set V(S) consists of all vertices which are incident to some edge in S. A set of edges S is connected when in any partition $S = A \cup B$ into disjoint non-empty sets A and B, the vertex sets V(A) and V(B) are not disjoint. Every set of edges is the union of disjoint connected blocks.


The bond closure on a graph G = (V, E) is a closure relation defined on the set E of edges as follows. If $S \subset E$, let $\bar{S}$ be the set of all edges both of whose endpoints belong to one and the same block of S. Every set consisting of a single edge is closed, and these are the only minimal non-empty closed sets.

Lemma 1. The bond closure $S \to \bar{S}$ has the exchange property.

Proof. Suppose e and f are edges, $S \subset E$, and $e \in \overline{S \cup f}$ but $e \notin \bar{S}$. Then every endpoint of e which is not in V(S) is an endpoint of f; on the other hand, S and f have at least one point in common, otherwise $e \in \bar{S}$. Thus both e and f either connect the same two blocks of S, or else they have one endpoint in S and one common endpoint; hence $f \in \overline{S \cup e}$, q.e.d.

The lattice L = L(G) of bond-closed subsets of E is called the bond lattice of the graph G. Suppose that E has n blocks and $p(\lambda)$ is the characteristic polynomial of L, then the polynomial $\lambda^n p(\lambda)$ is the chromatic polynomial of the graph G, first studied by G. D. BIRKHOFF. From Theorem 4 we infer at once the theorem of WHITNEY that the coefficients of the chromatic polynomial alternate in sign.

The chromatic polynomial has the following combinatorial interpretation. Let C be a set of $\lambda$ elements, called colors. A function $f: V \to C$ is a proper coloring of the graph, when no two adjacent vertices are assigned the same color. To every coloring f, not necessarily proper, there corresponds a subset of E, the bond of f, defined as the set of all edges whose endpoints are assigned the same color by f. The bond of f is a closed set of edges. For every closed set S, let $p(\lambda, S)$ be the number of colorings whose bond is S. Then we shall prove that $p(\lambda, S) = \lambda^n q(\lambda, S)$, where $q(\lambda, S)$ is the characteristic polynomial of the segment [S, 1] in the lattice L. Since every coloring has a bond, $\sum_{T \ge S} p(\lambda, T)$ equals the total number of colorings having some bond $T \ge S$. But this number is evidently $\lambda^{k - r(S)}$, where k is the number of vertices of the graph and r(S) is the rank of S in L. Applying the Möbius inversion formula, we obtain

$$(*) \qquad p(\lambda) = p(\lambda, 0) = \sum_{T \in L} \lambda^{k - r(T)} \mu(0, T).$$

But the number of colorings whose bond is the null set 0 is exactly the number of proper colorings.

WHITNEY's evaluation (cf. A logical expansion in Mathematics) of the chromatic polynomials of a graph in terms of the number of subgraphs of s edges and p connected components is an immediate consequence of the cross-cut theorem applied to the atoms of the bond-lattice of G. This result of WHITNEY's can now be sharpened in two directions: first, a cross-cut other than that of the atoms can be taken; secondly, the computation of the coefficients of the chromatic polynomial can be simplified by Proposition 1 of Section 8. The cross-cut of all elements of rank 2 is particularly suited for computation, and can be programmed. The interested reader may wish to explicitly translate the cross-cut theorem and the results of Section 8 into the geometric language of graphs.

Example 1. For a complete graph on n vertices, where every two-element subset is an edge, the bond-lattice is isomorphic to the lattice of partitions of a set with n elements. The chromatic polynomial is evidently $(\lambda)_n = \lambda (\lambda - 1) \cdots (\lambda - n + 1)$, and the coefficients s(n, k) are the Stirling numbers of the first kind.
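Formula (*) for the number of proper colorings can be tested on a small graph. The sketch below is illustrative and makes the following assumptions: vertices are labelled 0, ..., k - 1; the bond closure of an edge set is computed from connected components; and the exponent $k - r(T)$ is taken to be the number of connected components of the graph (V, T), isolated vertices included. The result is compared with a direct enumeration of proper colorings. All function names are hypothetical.

```python
from itertools import combinations, product


def components(num_vertices, edge_set):
    """Component label of every vertex in the graph (V, edge_set)."""
    label = list(range(num_vertices))

    def find(v):
        while label[v] != v:
            label[v] = label[label[v]]
            v = label[v]
        return v

    for a, b in edge_set:
        label[find(a)] = find(b)
    return [find(v) for v in range(num_vertices)]


def bond_closure(num_vertices, edges, subset):
    """All edges whose endpoints lie in one and the same block of `subset`."""
    comp = components(num_vertices, subset)
    return frozenset(e for e in edges if comp[e[0]] == comp[e[1]])


def colorings_by_mobius(num_vertices, edges, lam):
    """Sum over closed sets T of lam^(k - r(T)) * mu(0, T)."""
    closed = {bond_closure(num_vertices, edges, s)
              for r in range(len(edges) + 1)
              for s in combinations(edges, r)}
    closed = sorted(closed, key=len)       # smaller closed sets first
    mu, total = {}, 0
    for T in closed:
        mu[T] = 1 if not T else -sum(mu[U] for U in closed if U < T)
        blocks = len(set(components(num_vertices, T)))
        total += lam ** blocks * mu[T]
    return total


def colorings_by_enumeration(num_vertices, edges, lam):
    """Number of maps V -> {0, ..., lam - 1} with no monochromatic edge."""
    return sum(1 for f in product(range(lam), repeat=num_vertices)
               if all(f[a] != f[b] for a, b in edges))


if __name__ == "__main__":
    # A 4-cycle with one chord (an example graph chosen for this sketch).
    edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
    for lam in range(1, 5):
        print(lam, colorings_by_mobius(4, edges, lam),
              colorings_by_enumeration(4, edges, lam))
```

Enumerating all edge subsets to find the closed sets is exponential in the number of edges; it serves only to make the structure of the formula visible on small examples.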

Thus,

$$\sum_{r(\pi) = k} \mu(0, \pi) = s(n, k).$$

This gives a combinatorial interpretation to the Stirling numbers of the first kind.

For a map m embedded in the plane, where regions and boundaries have their natural meaning and no region bounds with itself, one obtains an interesting geometric result by applying the cross-cut theorem to the dual atoms of the bond lattice L(m). Let m be a connected map in the plane; without loss of generality we can assume: (a) that all the regions of m, except one which is unbounded, lie inside a convex polygon, the outer boundary of m; (b) that all boundaries are segments of straight lines. The dual graph of m is the linear graph made up of the boundaries of m. A circuit in a linear graph is defined as a simple closed curve contained in the graph. We give an expression of the polynomial $P(\lambda, m)$ in terms of the circuits of the dual graph. The outer boundary is always a circuit. A set of circuits of a map m in the plane spans, when their union, in the set-theoretic sense, is the entire boundary of m.

Proposition 1. For every integer $k \ge 1$, let $C_k$ be the number of spanning sets of k distinct circuits of a map m in the plane. Then [...]

Proof. If the map has two regions, then $C_1 = 1$ and all other $C_i = 0$, so the result is trivial. Assume now that m has at least 3 regions. Then $C_1 = 0$. All we have to prove is that the integers $C_k$ are the integers $q_k$ of Theorem 3, relative to the cross-cut of L(m) consisting of all the dual atoms. By the Jordan curve theorem, every circuit divides the plane into two regions: this gives a one-to-one correspondence of the circuits with the dual atoms of L(m). Conversely, because we can assume that the map is of the special type described above, every dual atom in L(m) is a map with two connected regions, and so must have as a boundary a simple closed curve, q.e.d.

It has been shown by RICHARD RADO (p. 312) that the bond-lattice L(G) of any linear graph G has a faithful representation. Accordingly, Theorem 5 can also be applied to obtain expressions for $\mu(0, I)$. These expressions usually give sharper bounds than similar expressions based upon the cross-cut of atoms.

Farther-reaching techniques for the computation of the Möbius function of L(G) are obtained by applying Theorem 1 to situations where P and Q are both bond-lattices of graphs. This we shall now do.

A monomorphism of a graph G into a graph H is a one-to-one function f of the vertices of G onto the vertices of H, which induces a map $\hat{f}$ of the edges of G into the edges of H. Every monomorphism $f: G \to H$ induces a monotonic map $p: L(G) \to L(H)$, where p(S) is defined as the closure of the image f(S) in H. It also induces a monotonic map $q: L(H) \to L(G)$, where q(T) is defined as the set of edges of G whose image is in T.

Lemma 2. $q(p(S)) = S$ for S in L(G) and $p(q(T)) \le T$ for T in L(H).

Proof. Intuitively, p(S) is obtained by "adding edges" to S, and q(p(S)) simply removes the added edges. Thus, the first statement is graphically clear. The second one can be seen as follows. q(T) is obtained from T by removing a number of edges. Taking p(q(T)), some of the edges may be replaced, but in general not all. Thus, $p(q(T)) \le T$.

Taking M = L(H)* and $c: L(H) \to M$ to be the canonical order-inverting map, we see that $\pi = cp$ and $\rho = qc$ give a Galois connection between L(G) and M. Now, $\pi(x) = 0$ is equivalent to p(x) = 1 for $x \in L(G)$. This can happen only if x has only one component, that is, since x is closed, only if x = 1 in L(G). Thus $\pi(x) = 0$ if and only if x = 1. Secondly, $\rho(0) = q(1) = 1$, evidently. We have verified all the hypotheses of Theorem 1, and we therefore obtain:

Proposition 2. Let $f: G \to H$ be a monomorphism of a linear graph G into a linear graph H, and let $\mu_G$ and $\mu_H$ be the Möbius functions of the bond-lattices. Then

$$\mu_G(0, 1) = \sum_{a \in L(H);\; q(a) = 0} \mu_H(a, 1),$$

where q is the map of L(H) into L(G) naturally associated with f, as above.

Proposition 1 can be used to derive a great many of the reductions of G. D. BIRKHOFF and D. C. LEWIS, and provides a systematic way of investigating the changes of Möbius functions, and hence of the chromatic polynomial, when edges of a graph are removed. It has a simple geometric interpretation.

An interesting application is obtained by taking H to be the complete lattice on n elements. We then obtain a formula for $\mu$ which completes the statements of Theorems 3 and 5. Let G be a linear graph on n vertices. Let C be the family of two-element subsets of G which are not edges of G. Let F be the family of all subsets of C which are closed sets in the bond-lattice of the complete graph on n vertices built on the vertices of G. Then,

Corollary.

$$\mu_G(0, 1) = \sum_{a \in F} \mu(a, I),$$

where $\mu$ is the Möbius function of the lattice of partitions (cf. Example 1) of a set of n elements.

Stronger results can be obtained by considering "epimorphisms" rather than "monomorphisms" of graphs, relating $\mu_G$ to the Möbius function obtained from G by "coalescing" points. In this way, one makes contact with G. A. DIRAC's theory of critical graphs. We leave the development of this topic to a later work.

10. Flows in networks A network N = (V, E) is a finite set V of vertices, together with a srt of ordered pairs of vertices, called edges. We shall adopt for networks the same language as for linear graphs. A circuit is a sequence of edges 8 such that every vertex in V (8) belongs to exactly two edges of 8. Every edge has a positive and a negative endpoint. Given a function (/) from E to the integers from 0 to A - 1, let for each vert.ex I', (])(v) be defined as (]) (v) = 1) (e, v) (/) (e) ,

2: e

where the sum ranges over all edges incident t.o v, and t.he function

356

1) (1', v)

takes

On tll(,

l~oundations

of Comhinatorial Theory. I

365

the value $+1$ or $-1$ according as the positive or negative end of the edge $e$ abuts at the vertex $v$, and the value zero otherwise. The function $\phi$ is a flow (mod. $\lambda$) when $\bar\phi(v) \equiv 0$ (mod. $\lambda$) for every vertex $v$. The value $\phi(e)$ for an edge $e$ is called the capacity of the flow through $e$. The mod. $\lambda$ [...] Möbius inversion on a lattice associated with the network. This will give an expression for the number of proper flows as a polynomial in $\lambda$, whose coefficients are the values of a Möbius function.

Every flow through $N$ is a proper flow of a suitable subnetwork of $N$, obtained by removing those edges which are assigned capacity 0. However, the converse of this assertion is not true: given a subnetwork $S$ of $N$, it may not be possible to find a flow which is proper on the complement of $S$. This happens because every flow which assigns capacity zero to each edge of $S$ may assign capacity zero to some further edges. We are therefore led to define a closure relation on the set of all subgraphs as follows: $\bar S$ shall be the set of all edges which necessarily are assigned capacity zero, in any flow of $N$ which assigns capacity zero to every edge of $S$. In other words, if $e \notin \bar S$, then there is a flow in $N$ which assigns capacity $\neq 0$ to the edge $e$, but which assigns capacity zero to all the edges of $S$.

It is immediately verified that $S \to \bar S$ is a closure relation. We call it the circuit closure of $S$. The circuit closure has the exchange property: if $e \in \overline{S \cup p}$ but $e \notin \bar S$, then $p \in \overline{S \cup e}$. Before verifying it we first derive a geometric characterization of the circuit closure. A set $S$ is circuit closed ($S = \bar S$) if and only if through every edge $e$ not in $S$ there passes a circuit which is disjoint from $S$. For if $S$ is closed and $e \notin S$, then there is a flow through $e$ and disjoint from $S$. But this can happen only if there is a circuit through $e$.

If there is a circuit through the edge $p$ disjoint from $S \cup e$, and a circuit through $e$ disjoint from $S$ and containing $p$, then there is, as has been observed by WHITNEY, also a circuit through $e$ not containing $S \cup p$. This implies that $e$ is not in the closure of $S \cup p$ and verifies the exchange property.

The lattice $C(N)$ of closed subsets of edges of the network $N$ is the circuit lattice of $N$. An atom in this lattice is not necessarily a single edge.


Proposition 1. The number of proper flows (mod. $\lambda$) on a network $N$ with $v$ vertices, $e$ edges and $p$ connected components is a polynomial $P(\lambda)$ of degree $e - v + p$. This polynomial is the characteristic polynomial of the circuit lattice of $N$. The coefficients alternate in sign.

Proof. The last statement is an immediate consequence of Theorem 4 of Section 8. The total number of flows on $N$ (not necessarily proper) is determined as follows. Assume for simplicity that $N$ is connected. Remove a set $D$ of $v - 1$ edges from $N$, one adjacent to each but one of the vertices. Every flow on $N$ can be obtained by first assigning to each of the edges not in $D$ an arbitrary capacity, between $0$ and $\lambda - 1$, and then filling in capacities for the edges in $D$ to match the requirement of zero capacity through each vertex. There are $\lambda^{e - v + 1}$ ways of doing this, and this is therefore the total number of flows mod. $\lambda$. If the network is in $p$ connected components, the same argument gives $\lambda^{e - v + p}$.

Now, every flow on $G$ is a proper flow on a unique closed subset $S$, obtained by removing all edges having capacity zero. Hence

$$\lambda^{e - v + p} = \sum_{S \in C(G)} p(S, \lambda),$$

where $p(S, \lambda)$ is the characteristic polynomial of the closed subgraph $S$. Setting $n(S) = e(S) - v(S) + p(S)$, where $e(S)$, $v(S)$ and $p(S)$ are the numbers of edges, vertices and components of $S$, and applying the inversion formula, we get

$$p(G, \lambda) = \sum_{S \in C(G)} \lambda^{n(S)} \mu(S, G).$$

q.e.d.
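The counts in Proposition 1 can be checked by brute force on very small networks. The sketch below is only an illustration under stated assumptions: the function name `count_flows`, the representation of a network as a list of `(tail, head)` pairs, and the two test networks are all hypothetical choices, not taken from the paper.

```python
from itertools import product


def count_flows(num_vertices, edges, lam, proper=False):
    """Count flows mod lam on an oriented network by exhaustive search.

    edges is a list of (tail, head) pairs. A flow assigns each edge a
    capacity in {0, ..., lam - 1} so that the signed sum of capacities
    at every vertex is 0 mod lam. With proper=True only flows with no
    zero capacity are counted.
    """
    values = range(1, lam) if proper else range(lam)
    count = 0
    for capacities in product(values, repeat=len(edges)):
        net = [0] * num_vertices
        for (tail, head), c in zip(edges, capacities):
            net[tail] -= c  # negative end of the edge
            net[head] += c  # positive end of the edge
        if all(x % lam == 0 for x in net):
            count += 1
    return count


# Assumed examples: a triangle (v = 3, e = 3, p = 1) and two vertices
# joined by three parallel edges (v = 2, e = 3, p = 1).
triangle = [(0, 1), (1, 2), (2, 0)]
theta = [(0, 1), (0, 1), (0, 1)]
```

For the triangle the total count is $\lambda^{e-v+p} = \lambda$ and the proper count is $\lambda - 1$; for the three parallel edges the total is $\lambda^2$ and the proper count is $(\lambda - 1)(\lambda - 2)$, a polynomial of degree $e - v + p = 2$ with alternating signs.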

In the course of the proof we have also shown that $n(S)$ is the rank of $S$ in the circuit lattice of $G$. The rank of the null subgraph is one.

The four-color problem is equivalent to the statement that every planar network without an isthmus has a proper flow mod 5. (An isthmus is an edge that disconnects a component of the network when removed.)

Most of the results of the preceding section extend to circuit lattices of a network, and give techniques for computation of the flow polynomials of networks. We shall not write down their translation into the geometric language of networks.

References

AUSLANDER, L., and H. M. TRENT: Incidence matrices and linear graphs. J. Math. Mech. 8, 827-835 (1959).
BELL, E. T.: Algebraic Arithmetic. New York: Amer. Math. Soc. (1927).
-- Exponential polynomials. Ann. of Math., II. Ser. 35, 258-277 (1934).
BERGE, C.: Théorie des graphes et ses applications. Paris: Dunod 1958.
BIRKHOFF, GARRETT: Lattice Theory, third preliminary edition. Harvard University, 1963.
-- Lattice Theory, revised edition. American Mathematical Society, 1948.
BIRKHOFF, G. D.: A determinant formula for the number of ways of coloring a map. Ann. of Math., II. Ser. 14, 42-46 (1913).
--, and D. C. LEWIS: Chromatic polynomials. Trans. Amer. math. Soc. 60, 355-451 (1946).
BLEICHER, M. N., and G. B. PRESTON: Abstract linear dependence relations. Publ. Math., Debrecen 8, 55-63 (1961).
BOUGAYEV, N. V.: Theory of numerical derivatives. Moscow, 1870-1873, pp. 1-222.
BRUIJN, N. G. DE: Generalization of Pólya's fundamental theorem in enumerative combinatorial analysis. Indagationes math. 21, 59-69 (1959).
CHUNG, K.-L., and L. T. C. HSU: A combinatorial formula with its application to the theory of probability of arbitrary events. Ann. math. Statistics 16, 91-95 (1945).
DEDEKIND, R.: Gesammelte Mathematische Werke, volls. I-II-III. Hamburg: Deutsche Math. Verein. (1930).
DELSARTE, S.: Fonctions de Möbius sur les groupes abéliens finis. Ann. of Math., II. Ser. 49, 600-609 (1948).
DILWORTH, R. P.: Proof of a conjecture on finite modular lattices. Ann. of Math., II. Ser. 60, 359-364 (1954).
DIRAC, G. A.: On the four-color conjecture. Proc. London math. Soc., III. Ser. 13, 193 to 218 (1963).
DOWKER, C. H.: Homology groups of relations. Ann. of Math., II. Ser. 56, 84-95 (1952).


DUBREIL-JACOTIN, M.-L., L. LESIEUR et R. CROISOT: Leçons sur la théorie des treillis des structures algébriques ordonnées et des treillis géométriques. Paris: Gauthier-Villars 1953.
EILENBERG, S., and N. STEENROD: Foundations of algebraic topology. Princeton: University Press 1952.
FARY, I.: On straight-line representation of planar graphs. Acta Sci. math. Szeged 11, 229-233 (1948).
FELLER, W.: An introduction to probability theory and its applications, second edition. New York: Wiley 1960.
FRANKLIN, P.: The four-color problem. Amer. J. Math. 44, 225-236 (1922).
FRÉCHET, M.: Les probabilités associées à un système d'événements compatibles et dépendants. Actualités scientifiques et industrielles, nos. 859 et 942. Paris: Hermann 1940 et 1943.
FRONTERA MARQUÉS, B.: Una función numérica en los retículos finitos que se anula para los retículos reducibles. Actas de la 2a. Reunión de matemáticos españoles. Zaragoza 103-111 (1962).
FRUCHT, R., and G.-C. ROTA: La función de Möbius para el retículo di particiones de un conjunto finito. To appear in Scientia (Chile).
GOLDBERG, K., M. S. GREEN and R. E. NETTLETON: Dense subgraphs and connectivity. Canadian J. Math. 11 (1959).
GOLOMB, S. W.: A mathematical theory of discrete classification. Fourth Symposium in Information Theory, London, 1961.
GREEN, M. S., and R. E. NETTLETON: Möbius function on the lattice of dense subgraphs. J. Res. nat. Bur. Standards 64B, 41-47 (1962).
-- -- Expression in terms of modular distribution functions for the entropy density in an infinite system. J. Chemical Physics 29, 1365-1370 (1958).
HADWIGER, H.: Eulers Charakteristik und kombinatorische Geometrie. J. reine angew. Math. 194, 101-110 (1955).
HALL, PHILIP: A contribution to the theory of groups of prime power order. Proc. London math. Soc., II. Ser. 36, 39-95 (1932).
-- The Eulerian functions of a group. Quart. J. Math. Oxford Ser. 134-151, 1936.
HARARY, F.: Unsolved problems in the enumeration of graphs. Publ. math. Inst. Hungar. Acad. Sci. 5, 63-95 (1960).
HARDY, G. H.: Ramanujan. Cambridge: University Press 1940.
--, and E. M. WRIGHT: An introduction to the theory of numbers. Oxford: University Press 1954.
HARTMANIS, J.: Lattice theory of generalized partitions. Canadian J. Math. 11, 97-106 (1959).
HILLE, E.: The inversion problems of Möbius. Duke math. J. 3, 549-568 (1937).
HSU, L. T. C.: Abstract theory of inversion of iterated summation. Duke math. J. 14, 465 to 473 (1947).
-- On Romanov's device of orthogonalization. Sci. Rep. Nat. Tsing Hua Univ. 5, 1-12 (1948).
-- Note on an abstract inversion principle. Proc. Edinburgh math. Soc. (2) 9, 71-73 (1954).
JACKSON, F. H.: Series connected with the enumeration of partitions. Proc. London math. Soc., II. Ser. 1, 63-88 (1904).
-- The q-form of Taylor's theorem. Messenger of Mathematics 38, 57-61 (1909).
JÓNSSON, B.: Lattice-theoretic approach to projective and affine geometry. Symposium on the Axiomatic Method. Amsterdam: North-Holland Publishing Company, 1959, 188-205.
--, and A. TARSKI: Direct decomposition of finite algebraic systems. Notre Dame Mathematical Lectures, no. 5. Indiana: Notre Dame 1947.
KAC, M., and J. C. WARD: A combinatorial solution of the two-dimensional Ising model. Phys. Review 88, 1332-1337 (1952).
KAPLANSKY, I., and J. RIORDAN: The problème des ménages. Scripta math. 12, 113-124 (1946).
KLEE, V.: The Euler characteristic in combinatorial geometry. Amer. math. Monthly 70, 119-127 (1963).


LAZARSON, T.: The representation problem for independence functions. J. London math. Soc. 33, 21-25 (1958).
MACLANE, S.: A lattice formulation of transcendence degrees and p-bases. Duke math. J. 4, 455-468 (1938).
MACMILLAN, B.: Absolutely monotone functions. Ann. of Math., II. Ser. 60, 467-501 (1954).
MÖBIUS, A. F.: Über eine besondere Art von Umkehrung der Reihen. J. reine angew. Math. 9, 105-123 (1832).
ORE, O.: Theory of graphs. Providence: American Mathematical Society 1962.
PÓLYA, G.: Kombinatorische Anzahlbestimmungen für Gruppen, Graphen und chemische Verbindungen. Acta math. 68, 145-253 (1937).
RADO, R.: Note on independence functions. Proc. London math. Soc., III. Ser. 7, 300-320 (1957).
READ, R. C.: The enumeration of locally restricted graphs, I. J. London math. Soc. 34, 417 to 436 (1959).
REDFIELD, J. H.: The theory of group-reduced distributions. Amer. J. Math. 49, 433-455 (1927).
REVUZ, ANDRÉ: Fonctions croissantes et mesures sur les espaces topologiques ordonnés. Ann. Inst. Fourier 6, 187-268 (1955).
RIORDAN, J.: An introduction to combinatorial analysis. New York: Wiley 1958.
ROMANOV, N. P.: On a special orthonormal system and its connection with the theory of primes. Math. Sbornik, N.S. 16, 353-364 (1945).
ROTA, G.-C.: Combinatorial theory and Möbius functions. To appear in Amer. math. Monthly.
-- The number of partitions of a set. To appear in Amer. math. Monthly.
RYSER, H. J.: Combinatorial Mathematics. Buffalo: Mathematical Association of America 1963.
SCHÜTZENBERGER, M. P.: Contribution aux applications statistiques de la théorie de l'information. Publ. Inst. Stat. Univ. Paris, 3, 5-117 (1954).
TARSKI, A.: Ordinal algebras. Amsterdam: North-Holland Publishing Company 1956.
TOUCHARD, J.: Sur un problème de permutations. C. r. Acad. Sci., Paris, 198, 631-633 (1934).
TUTTE, W. T.: A contribution to the theory of chromatic polynomials. Canadian J. Math. 6, 80-91 (1953).
-- A class of Abelian groups. Canadian J. Math. 8, 13-28 (1956).
-- A homotopy theorem for matroi [...]
WHITNEY, H.: Characteristic functions and the algebra of logic. Ann. of Math., II. Ser. 34, 405-414 (1933).
-- The abstract properties of linear dependence. Amer. J. Math. 57, 507-533 (1935).
WIELANDT, H.: Beziehungen zwischen den Fixpunktzahlen von Automorphismengruppen einer endlichen Gruppe. Math. Z. 73, 146-158 (1960).
WINTNER, A.: Eratosthenian Averages. Baltimore (privately printed) 1943.

Department of Mathematics
Massachusetts Institute of Technology
Cambridge 39, Massachusetts

(Received September 2, 1963)

Reprinted by TRUEXpress Oxford England


PATHS, TREES, AND FLOWERS JACK EDMONDS

1. Introduction. A graph G for purposes here is a finite set of elements called vertices and a finite set of elements called edges such that each edge meets exactly two vertices, called the end-points of the edge. An edge is said to join its end-points. A matching in G is a subset of its edges such that no two meet the same vertex. We describe an efficient algorithm for finding in a given graph a matching of maximum cardinality. This problem was posed and partly solved by C. Berge; see Sections 3.7 and 3.8.

Maximum matching is an aspect of a topic, treated in books on graph theory, which has developed during the last 75 years through the work of about a dozen authors. In particular, W. T. Tutte (8) characterized graphs which do not contain a perfect matching, or 1-factor as he calls it, that is, a set of edges with exactly one member meeting each vertex. His theorem prompted attempts at finding an efficient construction for perfect matchings.

This and our two subsequent papers will be closely related to other work on the topic. Most of the known theorems follow nicely from our treatment, though for the most part they are not treated explicitly. Our treatment is independent and so no background reading is necessary.

Section 2 is a philosophical digression on the meaning of "efficient algorithm." Section 3 discusses ideas of Berge, Norman, and Rabin with a new proof of Berge's theorem. Section 4 presents the bulk of the matching algorithm. Section 7 discusses some refinements of it.

There is an extensive combinatorial-linear theory related on the one hand to matchings in bipartite graphs and on the other hand to linear programming. It is surveyed, from different viewpoints, by Ford and Fulkerson in (5) and by A. J. Hoffman in (6). They mention the problem of extending this relationship to non-bipartite graphs. Section 5 does this, or at least begins to do it. There, the König theorem is generalized to a matching-duality theorem for arbitrary graphs. This theorem immediately suggests a polyhedron which in a subsequent paper (4) is shown to be the convex hull of the vectors associated with the matchings in a graph.

Maximum matching in non-bipartite graphs is at present unusual among combinatorial extremum problems in that it is very tractable and yet not of the "unimodular" type described in (5 and 6).

Received November 22, 1963. Supported by the O.N.R. Logistics Project at Princeton University and the A.R.O.D. Combinatorial Mathematics Project at N.B.S.


Section 6 presents a certain invariance property of the dual to maximum matching. In paper (4), the algorithm is extended from maximizing the cardinality of a matching to maximizing for matchings the sum of weights attached to the edges. At another time, the algorithm will be extended from a capacity of one edge at each vertex to a capacity of d_i edges at vertex v_i.

This paper is based on investigations begun with G. B. Dantzig while at the RAND Combinatorial Symposium during the summer of 1961. I am indebted to many people, at the Symposium and at the National Bureau of Standards, who have taken an interest in the matching problem. There has been much animated discussion on possible versions of an algorithm.

2. Digression. An explanation is due on the use of the words "efficient algorithm." First, what I present is a conceptual description of an algorithm and not a particular formalized algorithm or "code." For practical purposes computational details are vital. However, my purpose is only to show as attractively as I can that there is an efficient algorithm. According to the dictionary, "efficient" means "adequate in operation or performance." This is roughly the meaning I want, in the sense that it is conceivable for maximum matching to have no efficient algorithm. Perhaps a better word is "good."

I am claiming, as a mathematical result, the existence of a good algorithm for finding a maximum cardinality matching in a graph.

There is an obvious finite algorithm, but that algorithm increases in difficulty exponentially with the size of the graph. It is by no means obvious whether or not there exists an algorithm whose difficulty increases only algebraically with the size of the graph.

The mathematical significance of this paper rests largely on the assumption that the two preceding sentences have mathematical meaning. I am not prepared to set up the machinery necessary to give them formal meaning, nor is the present context appropriate for doing this, but I should like to explain the idea a little further informally. It may be that since one is customarily concerned with existence, convergence, finiteness, and so forth, one is not inclined to take seriously the question of the existence of a better-than-finite algorithm.

The relative cost, in time or whatever, of the various applications of a particular algorithm is a fairly clear notion, at least as a natural phenomenon. Presumably, the notion can be formalized. Here "algorithm" is used in the strict sense to mean the idealization of some physical machinery which gives a definite output, consisting of cost plus the desired result, for each member of a specified domain of inputs, the individual problems.
The problem-domain of applicability for an algorithm often suggests for itself possible measures of size for the individual problems; for maximum matching, for example, the number of edges or the number of vertices in the
graph. Once a measure of problem-size is chosen, we can define F_A(N) to be the least upper bound on the cost of applying algorithm A to problems of size N.

When the measure of problem-size is reasonable and when the sizes assume values arbitrarily large, an asymptotic estimate of F_A(N) (let us call it the order of difficulty of algorithm A) is theoretically important. It cannot be rigged by making the algorithm artificially difficult for smaller sizes. It is one criterion showing how good the algorithm is, not merely in comparison with other given algorithms for the same class of problems, but also on the whole how good in comparison with itself. There are, of course, other equally valuable criteria. And in practice this one is rough, one reason being that the size of a problem which would ever be considered is bounded.

It is plausible to assume that any algorithm is equivalent, both in the problems to which it applies and in the costs of its applications, to a "normal algorithm" which decomposes into elemental steps of certain prescribed types, so that the costs of the steps of all normal algorithms are comparable. That is, we may use something like Church's thesis in logic. Then, it is possible to ask: Does there or does there not exist an algorithm of given order of difficulty for a given class of problems?

One can find many classes of problems, besides maximum matching and its generalizations, which have algorithms of exponential order but seemingly none better. An example known to organic chemists is that of deciding whether two given graphs are isomorphic. For practical purposes the difference between algebraic and exponential order is often more crucial than the difference between finite and non-finite.

It would be unfortunate for any rigid criterion to inhibit the practical development of algorithms which are either not known or known not to conform nicely to the criterion. Many of the best algorithmic ideas known today would suffer by such theoretical pedantry. In fact, an outstanding open question is, essentially: "how good" is a particular algorithm for linear programming, the simplex method? And, on the other hand, many important algorithmic ideas in electrical switching theory are obviously not "good" in our sense.

However, if only to motivate the search for good, practical algorithms, it is important to realize that it is mathematically sensible even to question their existence. For one thing the task can then be described in terms of concrete conjectures.

Fortunately, in the case of maximum matching the results are positive. But possibly this favourable position is very seldom the case. Perhaps the twoness of edges makes the algebraic order for matching rather special in comparison with the order of difficulty for more general combinatorial extremum problems (cf. 3).

An upper bound on the order of difficulty of the matching algorithm is n^4, where n is the number of vertices in the graph. The algorithm consists of "growing" a number of trees in the graph, at most n, until they augment or
become Hungarian. A tree is grown by branching from a vertex in the tree to an edge-vertex pair not yet in the tree, at most n times. Such a branching may give rise to a back-tracing through at most n edge-vertex pairs in the tree in order to relabel some of them as forming a blossom or an augmenting path. At each of these three levels there may be other labelling work involved, but it is majorized by the work already cited. The work of identifying and labelling the vertex at the other end of some edge to a given vertex need not increase more than linearly with n.

An upper bound on the order of magnitude of memory needed for the algorithm is n^2, the same order of magnitude of memory used to store the graph itself.
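Read as a product of nested bounds, the estimate of the order of difficulty can be written out as follows. This is only one way of bookkeeping the counts given in the text, with each factor labelled by the statement it is taken from:

$$
\underbrace{n}_{\text{trees grown}} \cdot \underbrace{n}_{\text{branchings per tree}} \cdot \underbrace{n}_{\text{pairs back-traced per branching}} \cdot \underbrace{O(n)}_{\text{labelling work per pair}} = O(n^4).
$$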

3. Alternating paths. 3.0. A subgraph of graph G is a graph consisting of a subset of vertices in G and a subset of edges in G under the same incidences which hold for them in G. A non-empty graph G is called connected if there is no pair of non-empty subgraphs of G such that each vertex of G and each edge of G is contained in exactly one of the subgraphs. The vertices and edges of any graph partition uniquely into zero or more connected subgraphs, called its components. Maximum, minimum, and odd will refer to cardinality unless otherwise stated.

3.1. The graph E, formed from a set E of edges in G, is the subgraph of G consisting of edges E and their end-points. Any graph H, unless it has a single-vertex component, is formed by its edges. Thus in some contexts it causes no confusion to make no explicit distinction between a graph and its edge-set. In particular, a matching in G may be thought of as a subgraph of G whose components are distinct edges. The sum of two sets D and E is commonly defined as D + E = (D - E) ∪ (E - D). The sum D + E of two graphs D and E, formed by edge-sets D and E, is defined to be the graph formed by the edge-set D + E.
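The sum of two edge-sets defined in 3.1 is the symmetric difference. A minimal sketch, assuming edges are stored as `frozenset` pairs (the helper names `edge` and `graph_sum` are illustrative, not from the paper):

```python
def edge(u, v):
    """An undirected edge, stored so that (u, v) and (v, u) coincide."""
    return frozenset((u, v))


def graph_sum(D, E):
    """D + E = (D - E) U (E - D): edges in exactly one of the two sets."""
    return (D - E) | (E - D)


D = {edge(1, 2), edge(3, 4)}
E = {edge(2, 3), edge(3, 4)}
S = graph_sum(D, E)  # the shared edge {3, 4} cancels
```

Python's built-in `D ^ E` computes the same set; the explicit form is kept here only to mirror the formula in the text.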

3.2. There are two other kinds of subtraction for graphs besides the set-theoretic difference used above. With these we must distinguish between a subgraph and the edges which form it. Where G is a graph and E is a set of edges, G - E is the subgraph of G consisting of all the vertices of G and the edges of G not in E. For two graphs G and H, G - H is the subgraph of G consisting of the vertices of G not in H and the edges of G not meeting vertices of H. Graph G ∪ H (graph G ∩ H) consists of the union (intersection) of the vertex-sets and the edge-sets of graphs G and H, with incidences in G ∪ H (graph G ∩ H) the same as in G and H. We may also take the intersection or union of a graph with a set of edges to get, respectively, a set of edges or a graph. In the latter case the end-points of the edges being adjoined to the
graph must be specified. We shall have occasion to give the same edge different end-points in different graphs.

3.3. A circuit B in graph G is a connected subgraph in which each vertex of B meets exactly two edges of B. A (simple) path P in G is either a single vertex (joining itself to itself) or else a connected subgraph whose two end-points each meet one edge, an end-edge, of P and whose other vertices each meet two edges of P. A path is said to join its end-points.

3.4. For the pair (G, M), where M is a matching in G, a vertex is called exposed if it meets no edge of M. Let M̄ denote the edges of G not in M. Define an alternating path or alternating circuit, P, in (G, M) to be such that one edge in M ∩ P and one edge in M̄ ∩ P meets each vertex of P, except the end-points in the case of a path. Several authors, beginning with J. Petersen in 1891, have used alternating paths to prove the existence of "factors" in certain kinds of graphs.

3.5. For any two matchings M_1 and M_2 in G, the components of the subgraph formed by M_1 + M_2 are paths and circuits which are alternating for (G, M_1) and for (G, M_2). Each path end-point is exposed for either M_1 or M_2.

A vertex of G meets no more than one edge, each, of M_1 and M_2, and thus no more than two edges of M_1 + M_2, one in M_1 ∩ M̄_2 and one in M_2 ∩ M̄_1. An end-point v of a path in graph M_1 + M_2, meeting an end-edge in M_1 ∩ M̄_2, say, meets no other edge of M_1. Hence, if an edge of M_2 meets v, it does not belong to M_1 and so it does belong to M_1 + M_2. But then v is not an end-point. Therefore v is exposed for M_2. This completes the proof.

3.6. An alternating path A in (G, M) joining two exposed vertices contains one more edge of M̄ than of M. M + A is a matching of G larger than M by one. Such a path is called augmenting. Thus matching M is not maximum if (G, M) contains an augmenting path. The converse also holds:

3.7 (Berge, 1). A matching M in G is not of maximum cardinality if and only if (G, M) contains an alternating path joining two exposed vertices of M.

If M_2 is a larger matching than M, some component of graph M + M_2 must contain more M_2-edges than M-edges. By 3.5, such a component is an augmenting path for (G, M).
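The augmentation step of 3.6 can be sketched in a few lines. The representation (edges as `frozenset` pairs, a path given as a sequence of vertices) and the function names are assumptions made for illustration; the code does not check that the supplied path is in fact alternating.

```python
def edge(u, v):
    """An undirected edge, stored so that (u, v) and (v, u) coincide."""
    return frozenset((u, v))


def is_matching(edges):
    """True if no two edges of the set meet the same vertex."""
    seen = set()
    for e in edges:
        if seen & e:
            return False
        seen |= e
    return True


def exposed(vertices, matching):
    """Vertices which meet no edge of the matching."""
    return set(vertices) - set().union(*matching)


def augment(matching, path_vertices):
    """Return M + A, where A is the edge-set of the given path."""
    A = {edge(a, b) for a, b in zip(path_vertices, path_vertices[1:])}
    return matching ^ A


# Path graph 1 - 2 - 3 - 4 with M = {2-3}: vertices 1 and 4 are exposed,
# and the whole path is an augmenting path.
M = {edge(2, 3)}
larger = augment(M, [1, 2, 3, 4])
```

Here `larger` is the matching {1-2, 3-4}, which has one more edge than M and leaves no vertex exposed.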

3.8. Berge proposed searching for augmenting paths as an algorithm for maximum matching. In fact, he proposed to trace out an alternating path from an exposed vertex until it must stop and, then, if it is not augmenting, to back up a little and try again, thereby exhausting possibilities. His idea is an important improvement over the completely naive algorithm. However, depending on what further directions are given, the task can still be one of exponential order, requiring an equally large memory to know when it is done.


Norman and Rabin (7) present a similar method for finding in G a minimum cover-by-edges, C, a minimum cardinality set of edges in G which meets every vertex in G. The Berge-Norman-Rabin theorem (2) is generalized in (3), but a corresponding generalization of the algorithm presented here in Section 4 is unknown. 3.9. Norman and Rabin also show that the maximum matching problem and the minimum cover-by-edges problem are equivalent. Assuming every vertex meets an edge, the minimum cardinality of a cover of the vertices in G by a set of edges equals the minimum cardinality of a cover of the vertices in G by a set of edges and vertices, where a vertex is regarded as covering itself. By replacing edges by vertices or vice versa, one can go back or forth between a minimum cover by a set of edges and a minimum cover by a set of edges and vertices, where the latter set consists of a maximum matching together with its exposed vertices.
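One direction of the correspondence in 3.9 can be sketched as follows: starting from a maximum matching, each exposed vertex is replaced by any one edge that meets it. The function name and the adjacency-list representation are assumptions for illustration, and the code takes on trust that the matching passed in is maximum and that every vertex meets an edge.

```python
def edge(u, v):
    """An undirected edge, stored so that (u, v) and (v, u) coincide."""
    return frozenset((u, v))


def edge_cover_from_matching(adj, maximum_matching):
    """Extend a maximum matching to a cover of the vertices by edges.

    adj maps each vertex to a non-empty list of its neighbours.
    """
    cover = set(maximum_matching)
    covered = set().union(*maximum_matching)
    for v in adj:
        if v not in covered:
            # Replace the exposed vertex v by an edge meeting it.
            cover.add(edge(v, next(iter(adj[v]))))
    return cover


# A star with centre 0: the maximum matching has one edge.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
star_cover = edge_cover_from_matching(star, {edge(0, 1)})

# A path on five vertices: the maximum matching has two edges.
path = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
path_cover = edge_cover_from_matching(path, {edge(1, 2), edge(3, 4)})
```

In both examples the cover has as many edges as the matching has edges and exposed vertices together, which is the count described in the text.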

4. Trees and flowers. 4.0. A tree may be defined as (1) a graph T every pair of whose vertices is joined by exactly one path in T; (2) inductively, as either a single vertex or else the union of two disjoint trees together with an edge which has one end-point in each; (3) as a connected graph with one more vertex than edges; and so on.

4.1. An alternating tree J is a tree each of whose edges joins an inner vertex to an outer vertex so that each inner vertex of J meets exactly two edges of J.

An alternating tree contains one more outer vertex than inner vertices. This follows from the third definition of tree by regarding each inner vertex with its two edges as a single edge joining two outer vertices.

4.2. For each outer vertex v of an alternating tree J there is a unique maximum matching of J which leaves v exposed and the only exposed vertex in J. Every maximum matching of J is one of these.

Definition (2) of tree can be strengthened to the statement that a tree minus any one of its edges is two trees. Thus J minus any one of its inner vertices, say u, is two alternating trees. One of these, J_1, contains v as an outer vertex. Assume inductively that J_1 can be matched uniquely so only v is exposed and that J_2, the other subtree, can be matched uniquely so only the vertex v_2, joined in J to u by edge e_2, is exposed. Then the union of e_2 and these two matchings is a matching of J which leaves only v exposed. Since every edge of J has one inner and one outer end-point, every maximum matching leaves only an outer vertex exposed.

4.3. A planted tree, J = J(M), of G for matching M is an alternating tree in G such that M ∩ J is a maximum matching of J and such that the vertex r in J which is exposed for M ∩ J is also exposed for M. That is, all matching edges which meet J are in J. Vertex r is called the root of J(M).

In planted tree J(M) every alternating path P(M), which has outer vertex v and the matching edge to v at one of its ends, is a subpath of the alternating path P_v(M) in J(M) which joins v to the root r.

For k ≥ 1, assume that P_{2k-1} is the unique path P(M) which contains 2k - 1 edges and assume that at its non-v end it has an inner vertex u_k and a matching edge. Then P_{2k}, consisting of P_{2k-1} together with the unique non-matching edge in J which meets u_k, is the unique path P(M) with 2k edges. It has outer vertices v_k and v at its two ends. If v_k ≠ r, then P_{2k+1}, consisting of P_{2k} together with the unique matching edge which meets v_k, is the unique path P(M) with 2k + 1 edges. It has an inner vertex u_{k+1} and a matching edge at its non-v end. Since our assumption is true for k = 1 and since k cannot become infinite, the theorem follows by induction.

We define a stem in (G, M) as either an exposed vertex or an alternating path with an exposed vertex at one end and a matching edge at the other end. The exposed vertex and the vertex at the other end are, respectively, the root and the tip of the stem.

The preceding theorem tells us that (1) no trial-and-error search is required to find the path in J from any of its vertices back to the root and (2) the path P_v in J joining any outer vertex v to the root of J is a stem.

4.4. An augmenting tree, J_A = J_A(M), in (G, M) is a planted tree J(M) plus an edge e of G such that one end-point of e is an outer vertex v_1 of J and the other end-point v_2 is exposed and not in J.

The path in J_A which joins v_2 to the root of J is an augmenting path. This follows immediately from (4.3).

4.5. For each vertex b of an odd circuit B there is a unique maximum matching of B which leaves b exposed.

A blossom, B = B(M), in (G, M) is an odd circuit in G for which M ∩ B is a maximum matching in B with say vertex b exposed for M ∩ B. A flower, F = F(M), consists of a blossom and a stem which intersect only at the tip of the stem (the vertex b). A flowered tree, J_F, in (G, M) is a planted tree J plus an edge e of G which joins a pair of outer vertices of J.

The union of e and the two paths which join its outer-vertex end-points to the root of J is a flower, F.

Let v_1 and v_2 be these outer vertices, and P_1 and P_2 be the paths in J joining them to r. We have seen that P_1 and P_2 are stems (which are easily recovered from J). Since they intersect in at least r and since the path in J joining r to any other vertex is unique, P_b = P_1 ∩ P_2 is an alternating path with an end at r. If its other end-point, say b, were inner, it would be distinct from r, v_1, and v_2. Thus r would be distinct from v_1 and v_2, and b would meet three different edges of J, one in P_b, one in P_1 not in P_b, and one in P_2 not in P_b. But an inner vertex meets only two edges in the tree. Therefore b is outer and P_b is a stem. Thus P_1' = P_1 - (P_b - b) and P_2' = P_2 - (P_b - b), unless one is a vertex v_1 = b or v_2 = b, have non-matching edges at their b-ends and matching edges at their outer ends. It follows in any case that B = P_1' ∪ P_2' ∪ e is a circuit with only b exposed for M ∩ B, and thus B is a blossom with P_b as its stem.

4.6. A Hungarian tree H in a graph G is an alternating tree whose outer vertices are joined by edges of G only to its inner vertices.

4.7. For a matching M in a graph G, an exposed vertex is a planted tree. Any planted tree J(M) in G can be extended either to an augmenting tree, or to a flowered tree, or to a Hungarian tree (merely by looking at most once at each of the edges in G which join vertices of the final tree).

An exposed vertex satisfies the definition of planted tree. Suppose we are given a planted tree J and a set D (perhaps empty) of edges in G which are not in J but which join outer to inner vertices of J. (1) If no outer vertex of J meets an edge not in D ∪ J, then J is Hungarian. Suppose outer vertex v_1 meets an edge e not in D ∪ J, whose other end-point is, say, v_2. (2) If v_2 is an inner vertex of J, we can enlarge D by adjoining e. (3) If v_2 is an outer vertex of J, then e ∪ J is a flowered tree. (4) If v_2 is exposed and not in J, then e ∪ J is an augmenting tree. (5) Finally, if v_2 is not exposed and not in J, then the M-edge e_2 which meets v_2 is not in J, and thus v_3, the other end-point of e_2, is not in J by the definition of planted tree. Therefore, in this case we can extend J to a larger planted tree with new inner vertex v_2 and new outer vertex v_3 by adjoining edges e and e_2.

For any J and D, one of the five cases holds. Therefore by looking at any edge in G at most once, we can reach one of the three cases described in the theorem, because the other two cases, (2) and (5), consume edges and G is finite.

4.8. The algorithm which is being constructed is efficient because it does not require tracing many various combinations of the same edges in order to find an augmenting path or to determine that there are none. In fact we accomplish one or the other without ever looking again at the edges encountered in process (4.7), except to pick out from the tree the blossom or the augmenting path when case (3) or (4) occurs. We see from (4.3) and (4.5) how easy it is to retrieve the blossom or the path.

When flowers arise we "shrink" the blossoms, and so if an augmenting path arises later, it will be in a "reduced" graph. However, only one other very simple kind of task translates the augmentation to (G, M) itself. That task is to expand a shrunken blossom to an odd circuit and find the maximum matching of the odd circuit which leaves a certain vertex exposed.
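The five cases in the proof of 4.7 can be sketched as a single loop over the outer vertices of the tree. This is an illustrative reading under stated assumptions, not the paper's own formulation: the graph is assumed simple and given as an adjacency dictionary, the matching as a `mate` dictionary, and the names `grow_planted_tree` and `path_to_root` are hypothetical. The set D of the proof is not stored; edges to inner vertices are simply skipped.

```python
from collections import deque


def grow_planted_tree(adj, mate, root):
    """Grow a planted tree from the exposed vertex root.

    adj  : dict mapping each vertex to a list of its neighbours
    mate : dict mapping each matched vertex to its partner
           (exposed vertices are absent)
    Returns (kind, e, parent): kind is "augmenting", "flowered" or
    "hungarian"; e is the edge (v1, v2) that ended the growth (None
    for a Hungarian tree); parent maps each tree vertex to the next
    vertex on its path to the root.
    """
    label = {root: "outer"}
    parent = {root: None}
    queue = deque([root])
    while queue:
        v1 = queue.popleft()  # an outer vertex of J
        for v2 in adj[v1]:
            if label.get(v2) == "inner":
                continue  # case (2), or an edge of J itself
            if label.get(v2) == "outer":
                return "flowered", (v1, v2), parent  # case (3)
            if v2 not in mate:
                return "augmenting", (v1, v2), parent  # case (4)
            v3 = mate[v2]  # case (5): adjoin e and e_2
            label[v2], parent[v2] = "inner", v1
            label[v3], parent[v3] = "outer", v2
            queue.append(v3)
    return "hungarian", None, parent  # case (1)


def path_to_root(parent, v):
    """Follow parent pointers from v back to the root, as in 4.3."""
    path = []
    while v is not None:
        path.append(v)
        v = parent[v]
    return path
```

Each outer vertex is taken from the queue once and each of its edges is examined once, which matches the remark that no edge needs to be looked at again; `path_to_root` recovers the stem of any outer vertex without search.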
Actually, we shall find in (7.3) that it is desirable to leave odd circuits shrunk while looking in the reduced graph for as many successive augmentations as possible since they are all reflected in augmentations of (G, M).

4.9. For H, a subgraph of G, G is the disjoint union (G − H) ∪ δH ∪ H^+, where δH is the set of the edges with one end-point in H and one end-point in


PATHS, TREES, AND FLOWERS


G − H, and where H^+ = G − (G − H) is the subgraph consisting of H and all edges of G with both end-points in H. When H is connected, shrinking H means constructing the new graph G/H = (G − H) ∪ δH ∪ h by regarding H^+ as a single new vertex, h = H/H, which meets the edges δH = δh. The end-points in G − H of the edges δH do not change.

4.10. If B is an odd circuit in G, then b' = B/B is called a pseudovertex of G/B. To expand b' means to recover G from G/B. The algorithm, after it expands a pseudovertex b', will make use of the circuit B. In general, finding a "Hamiltonian" circuit in a graph B^+ is difficult. Therefore, when the algorithm shrinks B to form G/B, it should remember circuit B as having effected the shrinking. Thus we call circuit B in G (rather than B^+) the expansion of b'. In formal calculation shrinking B in G is an easy operation. Essentially, just assign all the vertices and edges of B a label, b', and then, until b' is expanded by erasing these labels, ignore any distinction between vertices labelled b' and ignore edges joining them to each other. Where M is a matching set of edges in G, M/B is defined as M ∩ (G/B). Clearly, if B is a blossom for (G, M), then M/B is a matching of G/B.

4.11. Let G_0 = G, G_i = G_{i−1}/B_i, and b_i = B_i/B_i for i = 1, ..., n, where B_i is an odd circuit in graph G_{i−1}. We inductively define the pseudovertices (with respect to G) of G_k (k = 1, ..., n) to be b_k together with the pseudovertices in G_{k−1} − B_k = G_k − b_k. Of course not every b_i, i < k, will be a pseudovertex of G_k because some will have been absorbed into others. The order in which the pseudovertices of a G_k arise is immaterial. That is, the order in which the odd circuits B_i are shrunk is immaterial except in so far as one shrunken B_i is a vertex in another B_j. Thus we can expand any pseudovertex b_j of G_k to obtain a graph G_kj for which G_kj/B_j = G_k. The pseudovertices of G_kj (with respect to G) are the pseudovertices in B_j together with the pseudovertices in G_k − b_j; that is, graph G_kj can be obtained from G by shrinking in a proper order the odd circuits which were absorbed into these pseudovertices. On the other hand, we do not expand a vertex b_h in B_k until vertex b_k is expanded.

There is a partial order on the b_i's defined by the transitive completion of the relation b_h < b_k where b_h is a vertex of B_k. (It is a special kind of partial ordering because each b_h is a vertex of at most one B_k.) There is a partial order on the sets, S_α, S_β, ..., of mutually incomparable b_i's, where S_α ≤ S_β when every member of S_α is less than or equal to some member of S_β. Evidently there is a unique family of graphs, G_α, G_β, ..., which include the G_i's and G. They correspond 1-1 to the sets, S_α, S_β, ..., so that the pseudovertices of G_α are S_α, etc. We have S_α ≤ S_β if and only if G_β can be obtained from G_α by shrinking certain B_i, those for which b_i is less than or equal to some member of S_β and not less than or equal to any member of S_α. Graph G corresponds to the empty set and G_n corresponds to the set of b_i's which are maximal with respect to their partial order.
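The labelling view of shrinking described in (4.10) can be illustrated with a short Python sketch. The function name `shrink` and the edge-list representation are assumptions made for this example. Unlike the graph G/B of the text, the sketch merges parallel edges that arise at the pseudovertex, which is harmless for matching purposes but is a simplification.

```python
def shrink(edges, circuit, label):
    """Return the edge set of G/B as a set of 2-element frozensets.

    edges   : iterable of (u, v) pairs, the edges of G
    circuit : the vertices of the odd circuit B
    label   : the name given to the pseudovertex b' = B/B
    """
    in_b = set(circuit)
    shrunk = set()
    for u, v in edges:
        a = label if u in in_b else u   # relabel vertices of B
        b = label if v in in_b else v
        if a != b:                      # ignore edges joining B to itself
            shrunk.add(frozenset((a, b)))
    return shrunk
```

Applying the same function to the edges of a matching M gives M/B = M ∩ (G/B), since the matching edges inside B are exactly the ones that are dropped.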


JACK EDMONDS

The complete expansion of a pseudovertex b_i is the subgraph U^+ = G − (G − U) ⊂ G, where U consists of all vertices of G absorbed into b_i by shrinking.

4.12. Where B is the blossom of a flower F for (G, M), M is a maximum matching of G if and only if M/B is a maximum matching of G/B.

4.13. Where blossom B is in J_F, a planted flowered tree for (G, M), J_F/B is a planted tree for (G/B, M/B). It contains B/B as an outer vertex. Its other outer and inner vertices are respectively those of J_F which are not in B.

Theorem (4.13) follows easily from (4.5). We separate the two converse statements of Theorem (4.12) into slightly stronger statements, (4.14) and (4.15).

4.14. Where B is any odd circuit in G, for every matching M_1 of G/B there exists a maximum matching M_B of B such that M = M_1 ∪ M_B is a matching for G.

Since any matching M_1 of G/B contains at most one edge meeting B/B, the edges M_1 in G meet at most one vertex, say b_1, of B. Therefore the desired M_B is the maximum matching of B which leaves b_1 exposed. Since the cardinality |M_B| of M_B is constant, any augmentation of M_1 yields a corresponding augmentation of M. Therefore, the "only if" part of (4.12) is proved. Applying the above matching operation to successive expansions of pseudovertices into odd circuits we have:

Where P is the complete expansion of a pseudovertex p in G_2, where G_1 is the graph obtained from G_2 by completely expanding p, and where M_2 is any matching of G_2, there exists a matching M_p of P leaving exactly one exposed vertex in P such that M_p ∪ M_2 is a matching of G_1. Thus, since |M_p| is constant, any augmentation in G_2 yields a corresponding augmentation in G_1.

4.15. For (G, M), let P be a subgraph such that (1) M ∩ P leaves exactly one exposed vertex in P, (2) M/P is a maximum matching of G/P, and (3) p = P/P is the tip of a stem S_p for (G/P, M/P). Then M is a maximum matching of G.

The edges of S_p form in G a stem, S, for (G, M). (In case S_p has no edges, take S to be the vertex in P exposed for M.) Compare M' = M + S and M'/P with M and M/P. The definition of stem implies that M' is a matching of G with |M'| = |M| and that the exposure of the root of S is changed to the exposure of the tip of S. Similarly M'/P = M/P + S_p is a matching of G/P with |M'/P| = |M/P| and with vertex p exposed. Because the cardinalities do not change, it is sufficient to show that M' is maximum in G if M'/P is maximum in G/P. Using (3.7), if M' is not maximum, G contains an augmenting path A = A(M'). If A contains no vertices of P, then it is also an augmenting


path for M'/P in G/P. Otherwise, because P contains only one exposed vertex for M', at least one of the ends of A is at an exposed vertex u_1 not in P. There is a unique subpath A_1 of A with one end-point at u_1 and containing only one vertex p_1 of P, at its other end. The only difference between A_1 and A_1/P = (A_1 ∪ P)/P is that p_1 is replaced by p, which is exposed for M'/P. Thus A_1/P is an augmenting path for M'/P and so M'/P is not maximum. The theorem is proved.

The theorem extends as follows: For (G, M), let P_1, ..., P_n be a family of disjoint subgraphs in G such that (1) M ∩ P_i leaves exactly one exposed vertex in P_i, (2) M_n = M ∩ G_n is a maximum matching of G_n = G/P_1/.../P_n, and (3) vertices P_i/P_i of G_n are outer vertices in a planted tree J_n for (G_n, M_n). Then M is a maximum matching of G.

We may assume that the indices order the P_i/P_i's so that (for k = 1, ..., n − 1) those from 1 through k are contained in a planted subtree J_k of J_n not containing those from k + 1 through n. Hence the theorem follows by induction after proving that M_{n−1} = M ∩ G_{n−1} is a maximum matching of G_{n−1} = G/P_1/.../P_{n−1}. Since every outer vertex of J_n is the tip of a stem in G_n, this follows from the last theorem.

4.16. Theorems (4.7) and (4.13) show how by branching a planted tree out from an exposed vertex of (G, M) and shrinking blossoms B_i when they are encountered, we eventually obtain in a graph G_k = G/B_1/.../B_k either a tree with an augmenting path or a Hungarian tree. An augmenting path admits an augmentation of matching M_k = M ∩ G_k according to (3.7), and (4.14) shows how this induces an augmentation of matching M_{k−1} = M ∩ G_{k−1} and so on back through M. On the other hand, when a Hungarian tree J is obtained, submatching (J ∪ B_k ∪ ... ∪ B_1) ∩ M of (G, M) cannot be improved and so this part of G is freed from further consideration. This follows immediately from (4.15) and the next theorem, (4.17), where G_k is denoted simply as G.
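The expansion step of (4.14), which (4.16) uses to carry an augmentation back from G_k to G_{k−1}, amounts to pairing off the vertices of the odd circuit around the one vertex that must stay exposed. A minimal Python sketch follows; the function name `circuit_matching_exposing` and the representation of B as a list of vertices in cyclic order are assumptions made for this example.

```python
def circuit_matching_exposing(circuit, exposed):
    """Maximum matching of an odd circuit that leaves `exposed` unmatched.

    circuit : list of the vertices of B in cyclic order (odd length)
    exposed : the vertex b_1 of B that must remain exposed
    Returns a list of (u, v) edges of the circuit.
    """
    n = len(circuit)
    if n % 2 == 0:
        raise ValueError("the circuit must have an odd number of vertices")
    i = circuit.index(exposed)
    # The other n - 1 vertices, in cyclic order starting just after `exposed`.
    rest = circuit[i + 1:] + circuit[:i]
    # Pair them off consecutively; each pair is an edge of the circuit.
    return [(rest[j], rest[j + 1]) for j in range(0, n - 1, 2)]
```

The result always has (n − 1)/2 edges, whichever vertex is exposed, which is the constancy of |M_B| used in the argument.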

4.17. Let J be a Hungarian tree in a graph G. A matching M_1 of G − J is maximum in G − J if and only if M_1 together with any maximum matching M_J of J is a maximum matching of G.

Since J and G − J are disjoint, if there exists a matching M_1' of G − J which is larger than M_1, then M_1' ∪ M_J is a larger matching of G than M_1 ∪ M_J. Conversely, suppose M_1 is maximum for G − J. Let M' = M_1' ∪ M_I ∪ M_J' be an arbitrary matching of G where M_1' ⊂ G − J, where M_J' ⊂ J, and where M_I ∩ ((G − J) ∪ J) is empty. Then |M_1'| ≤ |M_1|. Every edge in M_I meets at least one inner vertex of J; that is, where I' ⊂ I(J) is the set of the inner vertices met by M_I, |M_I| ≤ |I'|. The graph J − I' consists of |I'| + 1


disjoint alternating trees whose inner vertices together are I(J) − I'. Therefore, since the maximum matching cardinality of an alternating tree equals the number of its inner vertices, |M_J'| ≤ |I(J) − I'|. Adding the three inequalities gives |M'| ≤ |M_1| + |I(J)| = |M_1 ∪ M_J|. So the theorem is proved.

4.18. The matching M of G = G^0, to begin with, may be empty. If it leaves any exposed vertices, then the process (4.16) operates with respect to one of them. Either it produces an augmentation of M by one edge, thus disposing of two exposed vertices, or it reduces the possible domain for augmenting M to a subgraph G^1 = G_k − J of G, containing one less exposed vertex and containing only edges and vertices not previously considered. Successive application of (4.16) may reduce the consideration of M to a subgraph G^i of G and reveal there an augmentation of M. After augmenting in G^i, obtaining a larger M for G with two less exposed vertices in G^i, (4.16) operates again in G^i, never returning to the matching in the rest of G.

4.19. Repeated application of (4.18) reduces the domain in question to a G^n containing no exposed vertices. Then we know that we have a maximum matching; let us still call it M, with n exposed vertices in G. Thus the construction of an algorithm for finding a maximum cardinality matching in a graph is complete. Often the last application of (4.18) is unnecessary. For verifying maximality, the algorithm may as well stop when it reduces the domain to a G^{n−1} containing one exposed vertex, since two exposed vertices are necessary in order to augment. However, for theoretical purposes it is convenient to have the algorithm grow a tree from each exposed vertex of the final, maximum matching.

4.20. We may define an alternating forest to be a family of disjoint alternating trees and a planted forest in (G, M) to be a family of disjoint planted trees in (G, M). A dense planted forest is one which contains all the exposed vertices of (G, M). The family of exposed vertices, itself, is a dense planted forest. The algorithm works as well by growing a dense planted forest all at once, rather than one tree at a time. It is appropriate then to define augmenting forest (flowered forest) to be a planted forest plus an edge e of G whose end-points are outer vertices of different trees (of the same tree) of the planted forest. A Hungarian forest in G is defined similarly to Hungarian tree, replacing the word "tree" by "forest." Notice that the trees of a Hungarian forest are not necessarily Hungarian trees: an outer vertex of one tree may be joined by an edge of G to an inner vertex of another tree in the forest. The theorems on trees presented in this section are essentially the same for forests.

5. The dual to matching.

5.0. A bipartite graph K is one in which every circuit contains an even


number of edges. This condition, that K contains no odd circuits, is equivalent to being able to partition the vertices of K into two parts so that each edge of K meets exactly one vertex in each part. The well-known König theorem states: For a bipartite graph K, the maximum cardinality of a matching in K equals the minimum number of vertices which together meet all the edges of K.

5.1. The linear programming duality theorem states: If (1) x ≥ 0, Ax ≤ c and (2) y ≥ 0, A^T y ≥ b, for given real vectors b and c and real matrix A, then for real vectors x and y, max_x (b, x) = min_y (c, y) when such extrema exist. The problems of finding a maximizing vector x and a minimizing vector y are called linear programmes, dual to each other.

5.2. The König theorem is now widely recognized as the instance of (5.1) where b and c consist of all ones and A = A_K is the zero-one incidence matrix of edges (columns) versus vertices (rows) in a bipartite graph K. In view of Theorem (5.1) the König theorem is equivalent to the remarkable fact that, with b, c, and A as just described, the two linear programmes of (5.1) have solutions x and y whose components are zeros and ones whether or not this condition is imposed. An elegant theory centres on this phenomenon. Graph-theoretic algorithms are well known for so-called assignment, transportation, and network flow problems (5). These are linear programmes which have constraint matrices A that are essentially A_K.

5.3. For a linear programme with an arbitrary matrix A of integers, or even of zeros and ones, we cannot say that the extreme values will be assumed, as when A = A_K, by vectors with integer components. Therefore, in general when we impose the condition of integrality on x, the equality of the two extrema no longer holds. In particular, when the maximum matching problem is extended from bipartite to general graphs G, a genuine integrality difficulty is introduced. Our matching algorithm met it by the device of shrinking blossoms.
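The gap described in (5.3) already shows up on a triangle. The following Python sketch compares, by brute force, the maximum matching size with the minimum number of vertices meeting all edges. The function names and the two example graphs are assumptions made for this illustration, and the exhaustive search is only meant for very small graphs.

```python
from itertools import combinations

def max_matching_size(edges):
    """Largest number of pairwise disjoint edges, by exhaustive search."""
    for k in range(len(edges), 0, -1):
        for subset in combinations(edges, k):
            ends = [v for e in subset for v in e]
            if len(ends) == len(set(ends)):   # no vertex is used twice
                return k
    return 0

def min_vertex_cover_size(edges):
    """Smallest number of vertices that together meet all the edges."""
    vertices = sorted({v for e in edges for v in e})
    for k in range(len(vertices) + 1):
        for cover in combinations(vertices, k):
            chosen = set(cover)
            if all(u in chosen or v in chosen for u, v in edges):
                return k

# A small bipartite graph: the two numbers agree, as König's theorem says.
bipartite = [("a", "x"), ("a", "y"), ("b", "y")]
# An odd circuit: the two numbers differ.
triangle = [(0, 1), (1, 2), (0, 2)]
```

On `bipartite` both functions return 2, while on `triangle` the matching number is 1 and the cover number is 2, so the König equality fails as soon as an odd circuit is present.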
5.4. The matching algorithm yields a generalization of the König theorem to maximum matchings in G. The new matching duality theorem, in the form "maximum cardinality of a matching in G equals minimum of something else," is also an instance of linear programming duality. It is reasonable to hope for a theorem of this kind because any problem which involves maximizing a linear form by one of a discrete set of non-negative vectors has associated with it a dual problem in the following sense. The discrete


set of vectors has a convex hull which is the intersection of a discrete set of half-spaces. The value of the linear form is as large for some vector of the discrete set as it is for any other vector in the convex hull. Therefore, the discrete problem is equivalent to an ordinary linear programme whose constraints, together with non-negativity, are given by the half-spaces. The dual (more precisely, a dual) of the discrete problem is the dual of this ordinary linear programme. For a class of discrete problems, formulated in a natural way, one may hope then that equivalent linear constraints are pleasant even though they are not explicit in the discrete formulation.

5.5. Arising from the definition of a matching (no more than one matching edge to each vertex) are the obvious linear constraints that for each vertex v ∈ G the sum of the x's corresponding to edges which meet v is no greater than one. To obtain a maximum cardinality matching, we want to maximize the sum of all the x's, corresponding to edges of G, subject to the additional condition that each x is zero or one. It turns out that maximum matching can be turned into linear programming by substituting for the zero-one condition the additional constraints that the x's are non-negative and that for any set R of 2k + 1 vertices in G (k = 1, 2, ...) the sum of the x's which correspond to edges with both end-points in R is no greater than k. The former condition on the x's obviously implies the latter since for no matching in G do more than k matching edges have both ends in R. The converse, that subject only to the linear constraints Σ x_i can be maximized by zeros and ones, is not so obvious, but in view of (5.1) it follows from (5.6), the generalized König theorem. Actually the stronger converse holds, that subject only to these same linear constraints, Σ c_i x_i, for any real numbers c_i, can be maximized by zeros and ones. In other words, the polyhedron described by the constraints is, indeed, the convex hull of the zero-one vectors which correspond to matchings in G. We shall not prove this until we take up maximum weight-sum matching in paper (4). Although the convex-hull notion suggested trying to generalize the König theorem, and although the generalization found does suggest the true convex hull, the success of the first suggestion does not necessarily validate the second.

5.6. A set consisting of one vertex in G is said to cover an edge e in G if e meets the vertex. The capacity of this set is one. A set consisting of 2k + 1 vertices in G (k = 1, 2, ...) is said to cover an edge e in G if both end-points of e are in the set. The capacity of this set is k. An odd-set cover of a graph G is a family of odd sets of vertices such that each edge in G is covered by a member of the family.

MATCHING-DUALITY THEOREM. The maximum cardinality of a matching in G equals the minimum capacity-sum of an odd-set cover in G.


It is obvious that the capacity-sum of any odd-set cover in G is at least as large as the cardinality of any matching in G, so we have only to prove the existence in G of an odd-set cover and a matching for which the numbers are equal.
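The definitions of (5.6) are easy to check mechanically on small examples. The Python sketch below is an illustration under assumed names (`capacity`, `covers`, `is_odd_set_cover`, `capacity_sum`), with odd sets represented as Python sets of vertices; it verifies a given cover rather than finding a minimum one.

```python
def capacity(odd_set):
    """Capacity of a member: 1 for a single vertex, k for 2k + 1 vertices."""
    size = len(odd_set)
    if size % 2 == 0:
        raise ValueError("members of an odd-set cover have odd size")
    return 1 if size == 1 else (size - 1) // 2

def covers(odd_set, edge):
    """A one-vertex set covers the edges meeting it; a larger set covers
    the edges with both end-points in the set."""
    u, v = edge
    if len(odd_set) == 1:
        return u in odd_set or v in odd_set
    return u in odd_set and v in odd_set

def is_odd_set_cover(family, edges):
    return all(any(covers(s, e) for s in family) for e in edges)

def capacity_sum(family):
    return sum(capacity(s) for s in family)

# A triangle with a pendant edge; a matching of size 2 is (0, 1), (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
matching = [(0, 1), (2, 3)]
family = [{0, 1, 2}, {3}]
```

Here `family` is an odd-set cover whose capacity-sum equals the size of `matching`, so by the inequality just noted both are optimal. On the triangle alone, the single set {0, 1, 2} has capacity 1, matching the maximum matching size where no choice of single vertices could.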

5.7. The theorem holds for a graph which has a perfect matching M (that is, with no exposed vertices), since the odd-set cover consisting of two sets, one set containing one of the vertices and the other set containing all the other vertices, has capacity-sum equal to |M|. It also holds for a graph which has a matching with one exposed vertex. Here the odd-set cover may be taken as consisting of one member, the set containing all vertices of the graph. For the case of one exposed vertex, an odd-set cover may also be constructed as in (5.8) by applying the algorithm to construct a Hungarian tree even though it obviously will not result in augmentation.

5.8. Applying the algorithm to (G, M), where |M| is maximum, using some exposed vertex as root, we obtain a graph G' containing a maximally matched Hungarian tree J, a number of whose outer vertices are pseudo. Let S_J consist of all odd sets of the following two types: sets each consisting of one inner vertex in J, and sets each consisting of the vertices in the complete expansion of one pseudovertex of J. The number of edges of M which a member of S_J covers is equal to the capacity of the member. Every edge of M not in G' − J is covered by exactly one member of S_J. An edge of G is covered by a member of S_J if and only if it is not in G' − J. Matching M ∩ (G' − J) is a maximum matching of G' − J with one less exposed vertex than (G, M). Assuming that |M ∩ (G' − J)| equals the capacity of an odd-set cover, say S'_J, of G' − J, we have that |M| equals the capacity of S_J ∪ S'_J, an odd-set cover of G. Theorem (5.6) follows by induction on the number of exposed vertices.

5.9. It is evident from the proof that we may require the minimum odd-set cover to have certain other structure; in particular, that each member with more than one vertex contain the vertices of at least one odd circuit in G. With the latter restriction the theorem becomes a strict generalization of the König theorem.

6. Invariance of the dual.

6.0. For any particular application of the algorithm (4) to G, yielding, say, the maximum matching M, we may skip the augmentation steps in (4.16) by regarding the augmented matching as being the one already at hand. This gives a particular application of (4) to G starting with maximum matching M. In the application of the algorithm to (G, M), we can regard all the branchings and blossom shrinkings as taking place without subtracting the trees J_i as they arise. Thus we obtain from (G, M) a graph G* with a number of pseudovertices which are outer vertices in a sequence {J_i} (i = 1, ..., n) of disjoint planted trees in G*, one corresponding to each exposed vertex of (G, M). By expanding all the pseudovertices of G* completely, we recover the graph G.

6.1. The tree J_i is Hungarian in G* − J_1 − ... − J_{i−1}, but usually not Hungarian in G* because an outer vertex of J_i might be joined to an inner vertex of any other tree with a lower index. Hence the partition of the outer and inner vertices into trees J_i depends on the order of their construction. Also non-matching edges which can occur in each tree are not unique. In general, joining outer to inner vertices of a J_i are many other non-matching edges which would do as well. The particular blossoms which led to the pseudovertices are also fairly arbitrary. And, finally, the maximum matching is far from unique. However, (6.2) will show that the graph G* is uniquely determined by G alone.

6.2. For a (G, M) where M is any maximum matching, let G* and {J_i} be obtained from (G, M) by (6.0). (a) The non-pseudo outer vertices of the J_i's and the vertices of the pseudovertex complete expansions, all called the outer vertices O(G) of G, are precisely the vertices of G which are left exposed by some maximum matching of G. (b) The inner vertices of the J_i's, called the inner vertices I(G) of G, are precisely those vertices of G not in O(G) but joined to vertices in O(G). (c) G* is obtained from G by shrinking the connected components of O(G)^+, the subgraph of G consisting of vertices O(G) and all edges of G joining them.

6.3. We have defined vertex families O(G) and I(G) in terms of particular J_i. The theorem yields definitions dependent only on G itself. Clearly O(G*) and I(G*), defined in terms of the J_i in G*, are respectively the outer and the inner vertices of the J_i. Notice that the early definitions of inner and outer, for vertices in an alternating tree, are consistent with the definitions for a general graph.

6.4. Proof of (6.2), (b) and (c). Let the vertex v* of G* be joined in G* to some outer vertex u* of J_i. Then v* is a vertex in some J_h (h ≤ i), since J_i is Hungarian in G* − J_1 − ... − J_{i−1}. But v* cannot be an outer vertex of J_h since u* is not inner and since J_h is Hungarian in G* − J_1 − ... − J_{h−1}. Therefore v* is inner. It follows that each outer vertex u of G is joined only to inner vertices and to other vertices in the complete expansion of its image u*. By construction, each inner vertex is joined to an outer vertex of G. Hence, (b) is true. Since by construction the complete expansion of each outer vertex of G* is connected, it also follows that the connected components of O(G)^+ correspond precisely to outer vertices of G*. Hence, (c) is true.

6.5. An outer vertex u of G, by definition, either is identical with or is contained in the complete expansion of some outer vertex u* of, say, J_i. For any maximum matching M_i of alternating tree J_i, M_i ∪ [M ∩ (G* − J_i)] is


a maximum matching of G*, which by (4.14) induces a maximum matching M' of G. Let M_i be the one which leaves u* exposed. If u* is pseudo, then by (4.14) M' can be chosen so that u is exposed in the expansion. This proves half of (6.2), (a).

6.6. C. Witzgall suggested the following simplified proof of the converse, viz. that only the outer vertices are ever exposed for a maximum matching. A non-outer vertex v meets an edge e of M. Deleting v and its adjoining edges, ∪ J_i − v is a Hungarian forest in G* − v. If v is inner, then the forest is dense in G* − v. Otherwise it is dense in G* − v except for one exposed vertex, the other end of e. In either case it follows that M − e is a maximum matching of G − v. Assume that M' is a maximum matching of G which leaves v exposed. Then M' is also a matching of G − v. Since M' is larger than M − e, we have a contradiction. This completes the proof of (6.2).

6.7. The definition of odd-set cover may be expanded (more than necessary for Theorem (5.6)) to include the possibility of members which are even sets of vertices in G. A set of 2k vertices has capacity k and covers the edges which have both end-points in the set. Then, clearly, Theorem (5.6) still holds for this kind of cover. With this definition of cover, it follows from the uniqueness of G* that there is a unique preferred minimum cover, S*, for any graph G. The one-vertex members of S* are the inner vertices of G*, the other odd members of S* correspond to the pseudovertices of G*, and the one even member of S* consists of the non-inner, non-outer vertices of G*.

7. Refinement of the algorithm.

7.0. Several possibilities for refining the algorithm suggest themselves. We could remember an old tree, uprooted by an augmentation, so that when a new rooted tree takes on a vertex in it, we can immediately adjoin a piece of it to the new tree. This appears not worth doing. A tree is easy to grow, easier than selecting from an old tree the piece which may be grafted.

7.1. A quite useful refinement is to leave the pseudovertices of the old tree shrunk until their expansion is necessary. We see from (4.14) that any further augmentation of a matching M' in a graph G' with pseudovertices yields a further augmentation in G just as easily as the first. On the other hand, a maximum matching in G', reached after one or more augmentations, does not necessarily yield a maximum matching of G. The sufficiency part of (4.12) depends on the blossom being part of a flower, whereas the first augmentation in G' uproots the stem.

7.2. However, we may easily observe the circumstance arising in the application of the algorithm to (G', M') where the shrinkage might hide a


possible augmentation in G. It is where a pseudovertex, say b', becomes an inner vertex of the planted tree, say J' = J'(M'). In this case, we obtain a graph G'' from G' by expanding b' to an odd circuit B. The edges of J' form in G'' a subgraph which we still call J'. The set M' is also a matching in G''. One edge of J' ∩ M' has an end-point, say b_1, in B. One edge of J' not in M' has an end-point, say b_2, in B. The maximum matching M_B of B which is compatible with M' in G'' leaves b_1 exposed. The vertices b_1 and b_2 partition B into two paths, P_2 even and P_1 odd, which join b_1 and b_2. The graph J'' = P_2 ∪ J' is a planted tree in G'' for the matching M_B ∪ M'. Unless b_1 and b_2 coincide, P_2 will contain outer vertices of J''. These may be joined to vertices not in J'' which admit an extension of J'', not possible for J''/B = J' ⊂ G', to a planted tree with an augmenting path.

7.3. If J' ⊂ G' can be extended in G' to a tree with an augmenting path, it does not matter that some of the inner vertices are pseudo because a further augmentation for G is thus determined. If J' with pseudo inner vertex b' can be extended in (G', M') to a flowered tree whose blossom B' contains b', then b' loses its distinction as an inner vertex. It might as well stay shrunk and be absorbed into the new pseudovertex B'/B' of G'/B'. In fact, Theorems (4.15) and (4.17), together, tell us that any pseudo outer vertex might as well be left pseudo during the algorithm. Therefore a pseudo inner vertex should be retained until a planted Hungarian tree J_H is obtained. If no inner vertices of J_H are pseudo, then (4.17) is applicable. Otherwise, at this point, a pseudo inner vertex should be expanded according to (7.2).

7.4. One of the main operations of the algorithm is described in (4.3). That is back-tracing along paths in a tree already constructed, either to obtain an augmentation as in (4.4) or to delineate a new blossom as in (4.5). The back-tracing takes place in an alternating tree only because blossoms have been shrunk to pseudovertices. A pseudovertex may be compounded from many earlier blossom shrinkings and may thus encompass a complicated subgraph of G. After shrinking, back-tracing entirely bypasses the internal structure of a pseudovertex. A possible alternative to actually shrinking is some method for tracing through the internal structure of a pseudovertex. Witzgall and Zahn (9) have designed a variation of the algorithm which does that. Their result is attractive and deceptively non-trivial.

REFERENCES

1. C. Berge, Two theorems in graph theory, Proc. Natl. Acad. Sci. U.S., 43 (1957), 842-4.
2. C. Berge, The theory of graphs and its applications (London, 1962).
3. J. Edmonds, Covers and packings in a family of sets, Bull. Amer. Math. Soc., 68 (1962), 494-9.
4. J. Edmonds, Maximum matching and a polyhedron with (0, 1) vertices, appearing in J. Res. Natl. Bureau Standards 69B (1965).
5. L. R. Ford, Jr. and D. R. Fulkerson, Flows in networks (Princeton, 1962).
6. A. J. Hoffman, Some recent applications of the theory of linear inequalities to extremal combinatorial analysis, Proc. Symp. on Appl. Math., 10 (1960), 113-27.
7. R. Z. Norman and M. O. Rabin, An algorithm for a minimum cover of a graph, Proc. Amer. Math. Soc., 10 (1959), 315-19.
8. W. T. Tutte, The factorization of linear graphs, J. London Math. Soc., 22 (1947), 107-11.
9. C. Witzgall and C. T. Zahn, Jr., Modification of Edmonds' algorithm for maximum matching of graphs, appearing in J. Res. Natl. Bureau Standards 69B (1965).

National Bureau of Standards and Princeton University

Reprinted from Canad. J. Math. 17 (1965), 449-467


A THEOREM OF FINITE SETS by G. KATONA Mathematical Institute of the Hungarian Academy of Sciences Budapest, Hungary

§ 1. Introduction

Let $A_1, \dots, A_n$ be a system of different subsets of a finite set $H$, where $|H| = h$ and $|A_i| = l$ $(1 \le i \le n)$ ($|A|$ denotes the number of elements of $A$). We ask for a system $A_1, \dots, A_n$ (for given $h$, $l$, $n$) for which the number of sets $B$ satisfying $|B| = l - 1$ and $B \subset A_i$ for some $i$ is minimum. The first lower estimation for this minimum is given by SPERNER ([1], Hilfssatz). His estimation is

\[ \frac{n \cdot l}{h - l + 1}. \]

This depends on $h$. However, if $n = \binom{N}{l}$, it is expected that the minimizing system is the system of all $l$-tuples chosen from a subset of $N$ elements of $H$. In this case the number of $B$'s is $\binom{N}{l-1}$, which does not depend on $h$.

A. HAJNAL proved this statement in the case of $l = 3$ (unpublished). In this paper I prove for all cases that this is, indeed, the minimum, and find the (more complicated) minimum also for arbitrary $n$. The theorem is probably useful in proofs by induction over the maximal number of elements of the subsets in a system, as was SPERNER's lemma in his paper [1].

KLEITMAN told me in Tihany (Hungary) that he thought I could solve the following problem of ERDŐS by the aid of the above theorem and the "marriage problem": Let $A_1, \dots, A_n$ be subsets of $H$, where $|H| = 2h$ and $|A_i| = h$. For what $n$'s is it always possible to construct a system $B_1, \dots, B_n$ with the properties $B_i \subset A_i$, $|B_i| = h - 1$ $(1 \le i \le n)$. § 3 contains the solution of this problem in a more general form.

§ 2. The main result

Before the exact formulation of the theorem we need the following simple but interesting

LEMMA 1. If $n$ and $l$ are natural numbers, we can write the number $n$ uniquely in the form

\[ n = \binom{a_l(n,l)}{l} + \binom{a_{l-1}(n,l)}{l-1} + \dots + \binom{a_{t(n,l)}(n,l)}{t(n,l)}, \tag{1} \]

where $t(n,l) \ge 1$, $a_l > a_{l-1} > \dots > a_{t(n,l)}$ are natural numbers and $a_i(n,l) \ge i$ $(i = t(n,l), t(n,l) + 1, \dots, l)$.

381

188

G. KATO:l!A

PROOF. The existence of form (1) is proved by induction over $l$. For $l = 1$ the statement is trivial. Assume that for $l = k - 1$ it is true also and prove for $l = k$. Let $a_k$ be the maximal integer satisfying the inequality

\[ \binom{a_k}{k} \le n. \]

If here equality holds, we are ready. If it does not, using the induction hypothesis we have for the number $n - \binom{a_k}{k}$ the following expression:

\[ n - \binom{a_k}{k} = \binom{a_{k-1}}{k-1} + \dots + \binom{a_t}{t}, \tag{2} \]

where $t \ge 1$, $a_{k-1} > \dots > a_t$, $a_i \ge i$ $(i = t, t + 1, \dots, k - 1)$. (2) gives an expression for $n$; we have to verify only $a_k > a_{k-1}$ and $a_k \ge k$. If $a_k \le a_{k-1}$ held, then

\[ \binom{a_k + 1}{k} = \binom{a_k}{k} + \binom{a_k}{k-1} \le \binom{a_k}{k} + \binom{a_{k-1}}{k-1} \le n \]

would hold also, which contradicts choosing of $a_k$. On the other hand, $a_k \ge k$ follows from $a_k > a_{k-1}$ and $a_{k-1} \ge k - 1$.

The unicity of form (1) is proved also by induction over $l$. For $l = 1$ the statement is trivial. Assume that for $l = k - 1$ it is also true and prove for $l = k$. If, on the contrary, there exist two forms:

\[ n = \binom{a_k}{k} + \dots + \binom{a_t}{t} = \binom{a_k'}{k} + \dots + \binom{a_{t'}'}{t'}, \]

we may separate two different cases. If $a_k = a_k'$, we can obtain two different forms of $n - \binom{a_k}{k}$, which contradict our induction hypothesis. If $a_k < a_k'$, the contradiction follows from

\[ n \le \binom{a_k}{k} + \binom{a_k - 1}{k-1} + \dots + \binom{a_k - k + 1}{1} = \binom{a_k + 1}{k} - 1 < \binom{a_k + 1}{k} \le \binom{a_k'}{k} \le \binom{a_k'}{k} + \dots + \binom{a_{t'}'}{t'} = n. \]

Thus we proved the lemma. In the future we will use the following two notations:

$E_l(n) = \binom{a_l(n,l) - 1}{l-1} + \binom{a_{l-1}(n,l) - 1}{l-2} + \dots + \binom{a_{t(n,l)}(n,l) - 1}{t(n,l) - 1}$

and

$F_l(n) = \binom{a_l(n,l)}{l-1} + \binom{a_{l-1}(n,l)}{l-2} + \dots + \binom{a_{t(n,l)}(n,l)}{t(n,l) - 1}$.

These numbers are uniquely determined by Lemma 1.
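The representation of Lemma 1 and the two quantities built from it can be computed directly. The following is a minimal sketch, assuming the greedy choice used in the existence proof (take the largest $a_i$ with $\binom{a_i}{i}$ not exceeding what is left of n); the function names `cascade`, `F` and `E` are illustrative and do not come from the paper.

```python
from math import comb

def cascade(n, l):
    """Return [(a_l, l), (a_{l-1}, l-1), ..., (a_t, t)] with
    n = sum of comb(a_i, i), chosen greedily as in the proof of Lemma 1."""
    if n < 1 or l < 1:
        raise ValueError("n and l must be natural numbers")
    terms = []
    i = l
    while n > 0:
        a = i                       # comb(i, i) = 1 <= n, so a = i always qualifies
        while comb(a + 1, i) <= n:  # a_i is the maximal integer with comb(a_i, i) <= n
            a += 1
        terms.append((a, i))
        n -= comb(a, i)
        i -= 1                      # at i = 1 the remainder is used up exactly
    return terms

def F(n, l):
    """F_l(n): lower index of every term of the representation reduced by one."""
    return sum(comb(a, i - 1) for a, i in cascade(n, l))

def E(n, l):
    """E_l(n): upper and lower index of every term reduced by one."""
    return sum(comb(a - 1, i - 1) for a, i in cascade(n, l))

print(cascade(7, 3), F(7, 3), E(7, 3))
```

For n = 7 and l = 3 the representation is $\binom{4}{3} + \binom{3}{2}$, so $F_3(7) = \binom{4}{2} + \binom{3}{1} = 9$ and $E_3(7) = \binom{3}{2} + \binom{2}{1} = 5$.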

Let us consider now the problem. Let H be a finite set with h elements, and $\mathcal{A} = \{A_1, \dots, A_n\}$ a system of different subsets of H, where the number of elements of $A_i$ is $|A_i| = l$ $(1 \le i \le n)$.

Obviously, l is a fixed integer between 1 and h. Let $c(\mathcal{A})$ denote the following system: $c(\mathcal{A}) = \{B : |B| = l - 1 \text{ and } B \subset A_i \text{ for at least one } i\}$. The problem is to determine the minimum of $|c(\mathcal{A})|$, if h, n and l are given. Theorem 1 gives the exact solution of this problem.

THEOREM 1. Let h, n and l be given integers with the properties $h \ge 1$, $1 \le l \le h$ and $1 \le n \le \binom{h}{l}$. If H is a set of h elements, and $\mathcal{A} = \{A_1, \dots, A_n\}$, $|A_i| = l$ $(i = 1, \dots, n)$ a system of different subsets of H, then

$\min |c(\mathcal{A})| = F_l(n)$,

where the minimum runs over all such systems $\mathcal{A}$.
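For very small parameters the statement of Theorem 1 can be checked by exhaustive search. The sketch below assumes the ground set {0, ..., h-1} and tries every system of n distinct l-subsets; the names `shadow` and `min_shadow` are illustrative, and the search is only feasible for tiny h.

```python
from itertools import combinations
from math import comb

def F(n, l):
    """F_l(n), computed from the greedy representation of Lemma 1."""
    total, i = 0, l
    while n > 0:
        a = i
        while comb(a + 1, i) <= n:
            a += 1
        total += comb(a, i - 1)
        n -= comb(a, i)
        i -= 1
    return total

def shadow(family):
    """c(family): all (l-1)-subsets contained in at least one member."""
    return {B for A in family for B in combinations(A, len(A) - 1)}

def min_shadow(h, l, n):
    """Minimum of |c(family)| over all systems of n distinct l-subsets of an h-set."""
    sets = list(combinations(range(h), l))
    return min(len(shadow(family)) for family in combinations(sets, n))

for n in range(1, 11):
    print(n, min_shadow(5, 3, n), F(n, 3))
```

Running the search with h = 5 and with h = 6 for the same n and l illustrates the remark that the minimum does not depend on h.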

REMARK. It is interesting that $\min |c(\mathcal{A})|$ does not depend on h. For example, SPERNER'S estimation [1]:

$|c(\mathcal{A})| \ge \frac{n \cdot l}{h - l + 1}$

depends on h. Before the proof we shall give another theorem. We will prove them together.

THEOREM 2. Let h, n and l be given integers with the properties $h \ge 1$, $1 \le l \le h$ and $\binom{h}{l} \le n \le 2\binom{h}{l}$. Further, G and H are disjoint sets of h elements. If $\mathcal{A} = \{A_1, \dots, A_n\}$ is a system of $A_i$'s, where

$A_i \subset G$ or $A_i \subset H$ $(1 \le i \le n)$

and

$|A_i| = l$ $(1 \le i \le n)$,

then

$\min |c(\mathcal{A})| = \binom{h}{l-1} + F_l\left(n - \binom{h}{l}\right)$.
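The proof that follows begins by constructing extremal systems for both theorems. A sketch of that recursive construction is given here, under the assumption that the ground set is {0, 1, 2, ...} and that the extra element e is taken to be the next unused integer; the names `katona_system` and `theorem2_system` are illustrative only.

```python
from itertools import combinations
from math import comb

def F(n, l):
    """F_l(n) from the greedy representation of Lemma 1."""
    total, i = 0, l
    while n > 0:
        a = i
        while comb(a + 1, i) <= n:
            a += 1
        total += comb(a, i - 1)
        n -= comb(a, i)
        i -= 1
    return total

def shadow(family):
    return {A - {x} for A in family for x in A}

def katona_system(n, l):
    """n distinct l-subsets: every l-subset of {0..a-1}, plus sets N | {e}
    where N runs over the system built recursively for l - 1."""
    a = l
    while comb(a + 1, l) <= n:
        a += 1
    full = [frozenset(c) for c in combinations(range(a), l)]
    rest = n - comb(a, l)
    if rest == 0:
        return full
    e = a  # new element; the recursive system only uses elements below a
    return full + [N | {e} for N in katona_system(rest, l - 1)]

def theorem2_system(h, n, l):
    """Complete system on G = {0..h-1} plus a shifted katona_system on H = {h..2h-1}.
    Assumes comb(h, l) < n <= 2 * comb(h, l)."""
    G = [frozenset(c) for c in combinations(range(h), l)]
    H = [frozenset(x + h for x in A) for A in katona_system(n - comb(h, l), l)]
    return G + H

print(sorted(map(sorted, katona_system(7, 3))))
print(len(shadow(katona_system(7, 3))), F(7, 3))
```

For n = 7, l = 3 the construction returns the four triples of {0, 1, 2, 3} together with three triples containing the element 4, and its shadow has $F_3(7) = 9$ members.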


PROOF. 1. First we construct the minimizing system of Theorem 1. Denote this system by $\mathcal{M}(h, n, l)$. Obviously, it is sufficient to construct the system $\mathcal{M}(a_l^*(n), n, l)$, where $a_l^*(n)$ is the least integer satisfying

$\binom{a_l^*(n)}{l} \ge n$.

The construction will be carried out by induction over l. If l = 1, $a_1^*(n) = n$ and $\mathcal{M}(a_1^*(n), n, 1)$ consists of all the sets of one element. Assume we constructed already the system $\mathcal{M}(a_{l-1}^*(n), n, l-1)$ for all n. Construct now $\mathcal{M}(a_l^*(n), n, l)$. If $n = \binom{a_l(n,l)}{l}$, then the minimizing system consists of all the subsets having l elements. If $n > \binom{a_l(n,l)}{l}$, let H be a set of $a_l^*(n) = a_l(n,l) + 1$ elements, and e an element of H. Since $a_l > a_{l-1}$, we can construct the system $\mathcal{M}\left(a_l(n,l), n - \binom{a_l(n,l)}{l}, l-1\right)$ on $H - \{e\}$ by the induction hypothesis. Define the system $\mathcal{N}$ in the following manner:

$\mathcal{N} = \left\{ N \cup \{e\} : N \in \mathcal{M}\left(a_l(n,l), n - \binom{a_l(n,l)}{l}, l-1\right) \right\}$.

If $\mathcal{B}$ denotes the system of all subsets of $H - \{e\}$ having l elements, then $\mathcal{B}$ and $\mathcal{N}$ form together the system $\mathcal{M}(a_l^*(n), n, l)$. Indeed, the number of sets is $\binom{a_l(n,l)}{l} + n - \binom{a_l(n,l)}{l} = n$ and we have only to verify

(4) $|c(\mathcal{M}(a_l^*(n), n, l))| = \binom{a_l(n,l)}{l-1} + \dots + \binom{a_{t(n,l)}(n,l)}{t(n,l) - 1} = F_l(n)$.

However, it is easy to see that

$|c(\mathcal{M}(a_l^*(n), n, l))| = \binom{a_l(n,l)}{l-1} + \left| c\left( \mathcal{M}\left(a_l(n,l), n - \binom{a_l(n,l)}{l}, l-1\right) \right) \right|$

and by the induction hypothesis

$\left| c\left( \mathcal{M}\left(a_l(n,l), n - \binom{a_l(n,l)}{l}, l-1\right) \right) \right| = \binom{a_{l-1}(n,l)}{l-2} + \dots + \binom{a_{t(n,l)}(n,l)}{t(n,l) - 1}$,

which proves (4).

2. The minimizing system of Theorem 2 consists of a complete system in G, and $\mathcal{M}\left(h, n - \binom{h}{l}, l\right)$ in H.

3. In the previous two points we showed that in the case of Theorem 1

$\min |c(\mathcal{A})| \le F_l(n)$,

and in the case of Theorem 2

$\min |c(\mathcal{A})| \le \binom{h}{l-1} + F_l\left(n - \binom{h}{l}\right)$.

Thus, it is sufficient to verify

(5) $|c(\mathcal{A})| \ge F_l(n)$

and

(6) $|c(\mathcal{A})| \ge \binom{h}{l-1} + F_l\left(n - \binom{h}{l}\right)$,

respectively. These statements will be proved by induction over l. If l = 1, both statements are trivial. Assume we have proved for all numbers < l and prove for l.

4. First we prove the inequality (7)

if (8)

are integers, and (9)

The statement will be proved for fixed l and for every $n, n_1, n_2$ using the induction hypothesis for l - 1. For the sake of simplicity we use the following notations:

$t = t(n, l)$, $a_i = a_i(n, l)$ $(t \le i \le l)$,

$r = t(n_1, l)$, $b_i = a_i(n_1, l)$ $(r \le i \le l)$,

$s = t(n_2, l-1)$, $c_i = a_i(n_2, l-1)$ $(s \le i \le l-1)$,

$a_l^* = a_l^*(n)$, $b_l^* = a_l^*(n_1)$, $c_{l-1}^* = a_{l-1}^*(n_2)$.

It follows from (8) and (9) that

(10) $n_1 \ge n - E_l(n) = \binom{a_l - 1}{l} + \dots + \binom{a_t - 1}{t}$.

Because of (10)

(11) $b_l \ge a_l - 1$

must hold, since in the contrary case it would be $n_1 < \binom{b_l + 1}{l} \le \binom{a_l - 1}{l}$, which contradicts (10). On the other hand

(12) $b_l \le a_l$

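because of (8). Applying (11) and (12) we can distinguish two different cases: (a) $b_l = a_l$ and (b) $b_l = a_l - 1$.

(a) In this case (7) has the form

$\binom{a_l}{l-1} + \binom{a_{l-1}}{l-2} + \dots + \binom{a_t}{t-1} \le \binom{b_l}{l-1} + \dots + \binom{b_r}{r-1} + \binom{c_{l-1}}{l-2} + \dots + \binom{c_s}{s-1}$.

Decreasing both sides by $\binom{a_l}{l-1}$ we have

(13) $F_{l-1}\left(n - \binom{a_l}{l}\right) \le F_{l-1}\left(n_1 - \binom{a_l}{l}\right) + F_{l-1}(n_2)$.

Let $H_1$ and $H_2$ be disjoint sets. Construct the system $\mathcal{M}\left(b_{l-1} + 1, n_1 - \binom{a_l}{l}, l-1\right)$ on $H_1$ and the system $\mathcal{M}(c_{l-1}^*, n_2, l-1)$ on $H_2$. In this manner we obtain a system $\mathcal{J}$ on $H_1 \cup H_2$. Applying the induction hypothesis (Point 3. (5)) for $\mathcal{J}$ and l - 1 we have

(14) $F_{l-1}\left(n - \binom{a_l}{l}\right) \le |c(\mathcal{J})| = \left| c\left( \mathcal{M}\left(b_{l-1} + 1, n_1 - \binom{a_l}{l}, l-1\right) \right) \right| + |c(\mathcal{M}(c_{l-1}^*, n_2, l-1))|$.

However, we know (Point 1. (4)) that

(15) $\left| c\left( \mathcal{M}\left(b_{l-1} + 1, n_1 - \binom{a_l}{l}, l-1\right) \right) \right| = F_{l-1}\left(n_1 - \binom{a_l}{l}\right)$

and

(16) $|c(\mathcal{M}(c_{l-1}^*, n_2, l-1))| = F_{l-1}(n_2)$.

Finally, (13) follows from (14), (15), and (16).

(b) $b_l = a_l - 1$. We separate this case into two subcases:

(ba) $n_2 \ge \binom{a_l - 1}{l-1}$,

(bb) $n_2 < \binom{a_l - 1}{l-1}$.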


(ba) In this case (7) has the form

(1 all) + (t-12)+ ... + (t at 1) ~ (~l-=: )+(lb~12)+'" since

el- 1= al -

+ (r b, 1) +

+ (~l-=- 21) + (/1-23)+ ... + (8 ~ 1) , 1, because of (9) and the supposition (ba). Decreasing

l- 1) + (ai-I) al ) (al-1 1-2

both sides by ( = 1-1

we have

We can prove (17) by using of the induction hypothesis if (18) holds. However (9) gives

- 1) + (el-2 ) + ... + (e ~ (a l- 1) + tal1-1 1-2 1-1 + (a l-1- 1) + ... + (at - 1) . 1-2 t-l s)

(19)

8

Decreasing both sides by

(a1-11) we l -

obtain

V1-22) + ... + (C;) ~ (al~~~I) + .. , + (:t-=-ll)

(20)

and (20) is equivalent to (18). (bb) In this case (7) has the form

l- 1) + al ) + (1-2 al-1) + ... + (t at- l ) ~ (a1-1 (1-1 Decreasing both sides by

13

1-11) we have

(a

l -


Let G and H be two disjoint sets of al

...4' (a l - 1, nl ::<:::::

(a l -

(a l ~ 1J ' I -

-

1 elements. Construct the system

1)- We can it construct if n 1 _

(a l ~ 1 ) ::<:::::

1) . But this follows from al - l = bl > bl-v since nl - (a ~ 1) = (/1-11 )+ ... + (;).

I-I

l

Construct further the system ...4'(CT-l' n2 , 1- 1) on H. The possibility of this construction follows from the assumption (bb). In this manner we obtain a system f on G U H. Applying the induction hypothesis (Point 3. (6)) for fand 1- 1 we have

(22)

+ Ic(...4'(cT_l> n

2,

1- 1)

I.

However, we know (Point I. (4)) that (23)

and (24) further, (2l) follows from (22), (23) and (24). Thus we proved the inequality for I. 5. However, we need (7) under the condition

n·l

(25)

n2::<:::::-

aT

instead of (9). Thus we are going now to prove the inequality

n·l

(26)

-::<:::::EI(n).

aT

We prove (26) by induction over l, but we should like to mention that the proof of (26) is independent from the whole proof of the theorems. For 1 = l the statement is trivial. Assume we proved it for the integers < I,

aT = al and EI(n) = (~I-=-II) , thus holds with equality. We may assume aT = al + l. Obviously

and prove for I. If n=

(a;)

then

(27)


(26)


and by the induction hypothesis (28)

I- I

al _ 1 + 1

a'_1 )+ ... + (a')] :5: (a 1 r H

[( I - 1

-

2

I)

+ ... + (a, -

I) .

r- 1

I 1- I . . , summarlzmg (27) and (28) we obtain (26). In the If - --,:5: a/+ 1 al_1 + 1 contrary case I I-I

-- > - -

(29)

a/+ 1

holds because of al

~ al_ l

a/

+ 1. Let us set out from the identity

a, ) (I 1- I) (a,- I) I (a,) (1-1 a/+l ---;;;- = 1-1 - a/+l 1 .

The expression in the bracket is positive because of (29), thus we can write

H )+ + (a,)] (_I _£=..!.) < (a, - I) __ a'-1 )+ (a1-2 I [a,) [( 1-1 .. . r a/+l a/ 1-1 a/+l l '

. Since

al

I - I instead of -I -- I , an d reord er the ine> a/- I. W·r~te --,--.,.

quality

a'_l

+1

_I [(a,)+ ( a, +1

I

aH )

I- I

+ a,~~: I

al

+ ... + (a,)] < r

(a, - I) + I-I

[It-ill + .. . + (:]].

Finally, from the above inequality (26) follows by (28).

6. Now let us prove statement (5) for l by induction over h. If h = l, it is trivial. Assume we have proved (5) for all sets $|H| < h$, and prove for h. There exists an element e of H, contained by at most $\frac{n \cdot l}{h}$ sets $A_i$. We define the following systems:

$\mathcal{D} = \{A : A \in \mathcal{A},\ e \notin A\}$

and

$\mathcal{E} = \{A - \{e\} : A \in \mathcal{A},\ e \in A\}$,

where

(30) $n_2 = |\mathcal{E}| \le \frac{n \cdot l}{h} \le \frac{n \cdot l}{a_l^*(n)}$.

Naturally,

$c(\mathcal{D}) \subset c(\mathcal{A})$ and $c(\mathcal{E})(\cup)e \subset c(\mathcal{A})$,

where $\mathcal{D}(\cup)a$ denotes in general the system $\{D \cup \{a\} : D \in \mathcal{D}\}$. Thus the inequality

(31) $|c(\mathcal{A})| \ge |c(\mathcal{D})| + |c(\mathcal{E})|$

holds. However, $\mathcal{D}$ is a system in $H - \{e\}$, so we may apply the induction hypothesis for h - 1:

(32) $|c(\mathcal{D})| \ge F_l(n - n_2)$.

Further, applying the induction hypothesis for l - 1 we obtain

(33) $|c(\mathcal{E})| \ge F_{l-1}(n_2)$.

It follows from (31), (32) and (33) that

(34) $F_l(n - n_2) + F_{l-1}(n_2) \le |c(\mathcal{A})|$.

Using the result of Point 5, inequality (5) follows from (34) and (7) by (30), since (7) is proved already for I. 7. Now prove statement (6) for I by induction over h. If h = 1, it is trivial. Assume we have proved (6) for all sets 101 = IHI < h, and prove for h. The proof will be similar to the proof of the previous point. Let d l and d 2 be given by and

d

l

= {A :AEd, ACO},

d

2

= {A : A Ed, A

If Idll = rand Id21=

8,

c

H} .

there are two elements e E 0 and f E H, such

that e is contained by at most'!...:.!. , and f is contained by at most ~ sets h h Ai. Define the following systems:

9J and where

=

{A :AEd, e~A, f~A},

@l= {A - {e}:

@2

AEdl , eEA}

= {A - {f} : A Ed

2,

f E A} ,

(35)

and (36)

Naturally,

and

82

=

8

·1

1@21::;;:-· h

c(&1)

c

c(d) ,

C(@l) (U) eCc(d)

C(@2)(U)/Cc(vtj.


Thus the inequality

+ IC(@1)1 + IC(@2)1 =

Ic(vE) 1~ Ic(~) 1

(37)

+ IC(@1U@2)1

Ic(~) 1

holds. However f11 is a system in GUll - {e} - {f}, we may apply our induction hypothesis for h - 1:

"

Further, applying the induction hypothesis for 1 - 1 we obtain

IC(@1U@2)1~ (~=;) + Ft-1(r2 +82 - (~=:)).

(39)

It follows from (37), (38) and (39) that

(l ~ 1) + Ft (n - r

(40)

2 -

82 -

(h ~ 1J) +

(~= :)) ~

+ F t- 1 (r2 + 8 2 -

1

c(d) I·

Now we should like to use inequality (7) which is valid under condition (25) (Point 5). For this reason we have to verify only

r2

+

82 -

1)

h( I- 1

(41)

[n - r2

-_ 8 2 _

~

(h - 1) + r + 2

1

• (

l-

h )

1)] 1

1

at n - ( )

In- (':}]-l aT (n -

However

1

(~J)

[n - (~)ll

(42)

h

is an immediate consequence of (35) and (36). Since n < 2 of Theorem 2,

(h -

82 _

(~) is a condition

aT (n - (~)) ~ h holds and (42) results (41). Thus we can use


(7) for this case:

Finally, (40) and (43) give the desired inequality, and the whole proof is finished.

Now we consider a natural generalization of the problem of Theorem 1. The problem is to determine the minimum of $|c^k(\mathcal{A})|$, where $1 \le k \le l$, $c^k(\mathcal{A}) = c(c^{k-1}(\mathcal{A}))$ and $c^1(\mathcal{A}) = c(\mathcal{A})$. It is not difficult to conjecture what the result is. For the theorem we need the following notation:

$F_l^k(n) = \binom{a_l(n,l)}{l-k} + \binom{a_{l-1}(n,l)}{l-1-k} + \dots + \binom{a_{t(n,l)}(n,l)}{t(n,l)-k}$ $(1 \le k \le l)$,

where $\binom{a}{b} = 0$ if $b < 0$.

THEOREM 3. Let h, n, l and k be given integers with the properties $h \ge 1$, $1 \le k \le l \le h$ and $1 \le n \le \binom{h}{l}$. If H is a set of h elements and $\mathcal{A} = \{A_1, \dots, A_n\}$, $|A_i| = l$ $(i = 1, \dots, n)$ a system of different subsets of H, then

$\min |c^k(\mathcal{A})| = F_l^k(n)$,

where the minimum runs over all such systems $\mathcal{A}$.

PROOF. It is easy to see by induction over l that $|c^k(\mathcal{M}(h, n, l))| = F_l^k(n)$. Thus, we have to prove only

(44) $|c^k(\mathcal{A})| \ge F_l^k(n)$.

This will be proved by induction over k. For k = 1 Theorem 3 gives Theorem 1. Assume now (44) is true for values smaller than k, and prove for k. Obviously, $c^k(\mathcal{A}) = c(c^{k-1}(\mathcal{A}))$ holds and using the induction hypothesis and Theorem 1 we obtain (45)

(a) If t(n, 1) - (k - 1)

>

0, then

F~-l(n) = ( a/(n, 1)

1- (k - 1)

)

+ ... + (

at(n,l) (n, 1) ) t(n,l) - (k -1)


is an expression oftype (1). That is (46)

+ 1) = t(n, 1) - Ie + 1 al+k-l(n, 1) (t(n,l) - Ie + 1::;;: i::;;: 1- Ie + 1)

t(F~-l(n), 1- Ie

at (F¥-l(n) , 1- Ie + 1) =

and

'~+I

F ,_ k+1(Ff-1(n»=

(al(Ff-l~n),1-le+l1 =

l=t(n,I)-k+1

(47)

']+1

=

(al+~_l(n,I»)= ~

I=t(n,I)-k+ 1

1

-

~

i

1

-

I

(a~(n,l»)=Ff(n), 1 - Ie

j=t(n,l)

which proves (44) and (45). (b) If t(n, 1) - Ie

+l

::;;: 0, then (46) does not hold. However in this case

F~-l(n) _ 1 = ( a/(n, 1) ) 1-le+l

and

t(F~'-l(n)

+ ... + (ak(n,l») 1

- 1, 1- Ie + 1» = 1

a/(Ff-1 (n) - 1,1- Ie + 1) = al+k-l(n, 1) hold. Further, the equation

':i

F Z_ k+1(Ff- 1 (n) _ 1) = (48)

l

(a l (Ff- 1(n) .-1, 1- Ie + 1») = ~

1=1

=

']"1 (al+~_l(n,l») = 1=1 t - 1

i

j=k

(1::;;: i::;;: 1- Ie + 1)

(a!(n,l») 1 - Ie

=

i

1

-

(~j(n»)

j=t(n,l)

1 - Ie

= Ff(n)

is true in this case instead of (47). If we prove

F/-k+l. F" -l(n» = F,-k+l (Ff-l(n) - 1),

49)

then (44) foHows from (45), (49) and (48). (49) will be proved by the following simple lemma. LEMMA 2.

It t(m, r)

= 1, then

Fr(m + 1) = Fr(m) . PROOF.

(2 ::;;:

8 ::;;:

Let

8

m = (ar(m, r») r

+

be the least index such. that as(m, r) > as_1(m, r) 1 1. Thus, we can write

r). If there is not such 8, let 8 be equal to r

+

+ ... + (as-1 (m, r») + (a s- 1(m, r) ~ I} + ... + 8-1


8-2


and

m+ 1= (aA:' r)) + ... + (as-l~~r~ + 1) .

Now it is not difficult to see, that

F,(m) = (a'(m, r)) r-l

+ ... +

+ ... +

(a s- 1 (m, r)) s-2

2)) =

(a s - 1 (m, r) - (s -

o

+ (a s-

(aAm, r)) r-l

1

(m, r) s-3

+ ... + (a s-

1) + ... + r) + s-2

1 (m,

I) =

= F,(m+ 1), which proves the lemma and Theorem 3.
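The quantity $F_l^k(n)$ and the statements of Theorem 3 and Lemma 2 can be explored numerically. The sketch below assumes the convention that binomial coefficients with negative lower index are zero, and checks the minimum of the k-fold shadow by exhaustive search on a 5-element ground set; all function names are illustrative.

```python
from itertools import combinations
from math import comb

def cascade(n, l):
    """Greedy representation of Lemma 1 as a list of (a_i, i)."""
    terms, i = [], l
    while n > 0:
        a = i
        while comb(a + 1, i) <= n:
            a += 1
        terms.append((a, i))
        n -= comb(a, i)
        i -= 1
    return terms

def F_k(n, l, k=1):
    """F_l^k(n); terms with i - k < 0 contribute zero."""
    return sum(comb(a, i - k) for a, i in cascade(n, l) if i >= k)

def iterated_shadow(family, k):
    """c^k(family): apply the shadow operator k times."""
    current = set(family)
    for _ in range(k):
        current = {A - {x} for A in current for x in A}
    return current

def min_iterated_shadow(h, l, k, n):
    sets = [frozenset(c) for c in combinations(range(h), l)]
    return min(len(iterated_shadow(f, k)) for f in combinations(sets, n))

# Theorem 3 for h = 5, l = 3, k = 2
print([(n, min_iterated_shadow(5, 3, 2, n), F_k(n, 3, 2)) for n in range(1, 11)])

# Lemma 2: if the last term of the representation of m has lower index 1,
# then F_r(m + 1) = F_r(m)
print(all(F_k(m + 1, r) == F_k(m, r)
          for r in range(1, 5) for m in range(1, 200)
          if cascade(m, r)[-1][1] == 1))
```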

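§ 3. Solution of an Erdős problem

Let H be a finite set of h elements, and $\mathcal{A} = \{A_1, \dots, A_n\}$, $|A_i| = l$ $(1 \le i \le n)$ a system of subsets of H. ERDŐS proposed the following problem. For which numbers n can we construct a system $\mathcal{B} = \{B_1, \dots, B_n\}$ of different sets with the properties $B_i \subset A_i$, $|B_i| = l - k$ $(1 \le i \le n)$?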

In the solution we use the well-known marriage problem. It is clear in this connection that it is a very important question in which cases $F_l^k(n) < n$, $F_l^k(n) = n$ or $F_l^k(n) > n$ holds. The following sequence of lemmas deals with this problem.

LEMMA 3. If $1 \le k \le l$ and x are positive integers, then

$f(x) = \binom{x}{l-k} - \binom{x}{l}$

is a monotone increasing function between l and 2l - k - 2, but it is a monotone decreasing function from 2l - k - 1. The values f(2l - k - 2) and f(2l - k - 1) are equal.

PROOF. Let $0 \le a < b$ be integers. Then $\binom{x}{a} - \binom{x}{b}$ is < 0, = 0, > 0, respectively, if a + b < x, a + b = x, a + b > x, respectively. Consider the difference

$f(x+1) - f(x) = \binom{x}{l-k-1} - \binom{x}{l-1}$.

Using the above remark we obtain that

$f(x+1) - f(x) < 0$ if $2l - k - 2 < x$,

$f(x+1) - f(x) = 0$ if $2l - k - 2 = x$,

and finally,

$f(x+1) - f(x) > 0$ if $2l - k - 2 > x$.

This completes the proof. The following two lemmas are immediate consequences of Lemma 3. LEMMA
This completes the proof. The following two lemmas are immediate consequences of Lemma 3. LEMMA

3a. III :s:: k :s:: 1 and x are positive intf!gers, then

LEMMA

3b. II 1

< k < 1 and x > 21 -

+ 1 are positive integers,

k

1< k :s:: m, then "i [(2i --: k - 1) _ (2i - ~ - I)]:s:: (2m -

LEMMA 4.

II

~ -

i=k

PROOF.

then

k

k-

m - k

~

Let a and b be positive integers, where

1) _ (2m !!.- < b < a 2

km

I}. I

- 1. Then

and similarly

2)

2)

2) [1- a ~~] = b+2

(a + _(a + = (a + b+I b+2 b+l. Further (51)

(52)

2)

(a + [2b-a+ 1]. b+1 b+2

(a+2)[2b-a+I]=(a)[2b-a+I].[ (a+2)(a+l) ], b+ I b+ 2 b b+ I (b + 2) (a - b + I)

where and

a+2 a+2 a-b+l-a 1

----->--=2.

-+ 2


That is (53)

a

follows from (50), (51) and (52). Applying (53) for = 2i = obtain

or

k- 1, and b= i-I,

(l:S;:

k:s;: i),

we

~ 1) -k-l) _(2(i +.1) -k-l) ~2[(2i~k -1) _(2i-~ -1)]. (2(i '+I-k z-k , ~+1

(54) Prove now the lemma by induction over m. H m = k, the statement is trivial. Let the lemma be true for m and prove it for m + 1.

i[(2i~k-l) i=k

, -

k

_(2i-.k-l)]= ~1[(2i~k-l)_(2i-~-I)]+ ~

i=k

~

-

k

and by induction hypothesis and (54)

i[(2i~k-l) i=k

" -

k

_ (2i-~-I)]~2[(2m-k-l) _(2m-k-l)]~ mm [2(m + 1) - k- 1) _ (2(m + 1) - k-I} l m+l-k m+l "

~

k

holds, which proves Lemma 4. LEMMA

(55) PROOF.

"

< a/(n, I) then FNn) < n.

5. If l :s;: k :s;: I and 21 - k We may use Lemma 3b:

On the other hand, by Lemma 3a


holds and summarizing it we obtain

:i l-(a./(n, I)) _ (a/(~, 1))] ~ :i [(2i ~ k- 1) _(2i - ~ - 1)]. k

" -

/=k

"

" -

/=k

Applying now Lemma 4 and (54):

1)) _ (a/(~, 1))] ~ (21- k-

~[(a!(n,

<

(2(1

Obviously.

(~7)

!i l·(a!(n, 1

/=1

" -

1- k

".

k

" -

i=k

+ 1) -

k

"

1) _(21- k- I)' < 1

1) _ (2(1 + 1) - k - 1) .

kl+l-k

1+1

+ 1) - k - 1) _ (2(1 + 1) - k - 1"1 +1- k 1+ 1

1)) _ (ai(~' 1))] < (2(1 k" 1

I

also holds, since we added a nonpositive number to the left side. If we sum (56) and (57) the obtained inequality Ff(n) _ n

<

(21- k+ 1-k

1) _ (21- k + 1) + (2(1 + 1) - k - 1\_ 1

_ (2(1

6. If l

~

> a/(n, Ff(n) > n.

k ~ 1 and 21 - k

(58) PROOF.

k -1)

=

0

_1+ 1

results (55). LEMMA

+ 1) -

l+l-k

1) then

We know that

(59) If 1 > i ~ a/(n, 1) - (1- k), then a/(n,l) - (1- i) a/(n,l)

~

~

2i - k and by (59)

2i - k

holds. In this case, obviously

(:/(n, 2) - (a/(:, 1)) ~ 0

(60)

follows. If k

~

i

<

a/(n, 1) -

(1 - k), then by (59) and Lemma 3

holds, but it is trivially true for i

<

k, too.


Sum (60) and (61)

:ir(~;(n'l») _ (a;(~'l»)]:za,(n'I~+k-I[(al(n'~)-(l-i»)_ ;=1

l ~-k

~

_ (al(n, 1) - (1- i»)]

=

i

k

+ k) _ (2a/(n, 1) -

(2a/(n, 1) - 21 a/(n,l) - 1 - 1

That is (62)

~-

;=1

F~(n) _ n:z (2a/(n, 1) - 21 +

a/(n, 1) -1

k) _ ( 2a/(n,1) -

a/(n, 1) -1

a/(n,l) - 1 - 1

21

21 + k ) + 1. +k - 1

+ k) + 1 +

+k-

1

s true. Here (63)

(

+ k) _ ( 2a/(n, 1) -

2a/(n, 1) - 21 a/(n,l) - 1 - 1

a/(n,l) - 1 (

21

+ k) :z (

+k -

a/(n,l) - 1 a/(n,l) - 1

+k -

a/(n,l) - 1 )_ \ a/(n, 1) - 1 - 1

1 ) 1

because of Lemma 3. However we can write the right hand side of (63) in the form (64)

Here

and since 21 - k - 1 (65)

(a/(n,

>

a/(n, 1) -

1 by supposition of the lemma, thus

i-I) _(a/(;,~;- 1) :z (a/(~,

Finally, (65), (64) (63) and (62) give which proves our lemma.

F~(n) - n


>1

1») -

(a/n,

2) .


7.

LEMMA

If

(66)

n> (21~k)

+ (2(1~~)I-k) + ... + (~),

then

F~(n)

On the other hand, if (67)

n

s;:

< n.

(21 ~ k) + (2(1 ~~\- k) + ... + (~),

then with equality only if

(68) for 80me

n 8

(k

=

(21 ~ k) + (2(1 ~~)1- k) + ... + (28 ~ k)

s;: 8 s;: I).

PROOF. Consider first the case of (66). If ai(n, 1) = then t(n, I) < k and n=

2i -

k (k

s;: i s;: I),

(21 - k) + ... + (k) + (k - 1) + ... + (t(n,I)J' .

Obviously,

I

Tc - 1

k

k Fdn) =

thus FNn) < n holds. In the contrary case

(21-I k) + ... + (k)k ' >

a,(n, I)

ai(n, I) =

hold for some r (k

s;: r <

t(n, I)

2r - k

2i -

k

I). Since

the statement follows by Lemma 5:

The case (67) may occur in two different ways. 1. If (68) holds, then obviously FNn) = n For some r (k < r s;: I), a,(n, I) < 2r - k,

2.


(r

< is;: I)


and (r

Since

< i:::;;' 1) .

the statement follows by Lemma 6:

THEOREM 4. Let $1 \le k \le l \le h$ be positive integers, H a set of h elements and $\mathcal{A} = \{A_1, \dots, A_n\}$, $|A_i| = l$ a system of subsets of H. If

(69) $n \le \binom{2l-k}{l} + \binom{2(l-1)-k}{l-1} + \dots + \binom{k}{k}$,

there exists a system

(70) $\mathcal{B} = \{B_1, \dots, B_n\}$ of different sets with $B_i \subset A_i$, $|B_i| = l - k$ $(1 \le i \le n)$,

but in the case of

(71) $n > \binom{2l-k}{l} + \binom{2(l-1)-k}{l-1} + \dots + \binom{k}{k}$

not necessarily.

PROOF. First we prove the latter case. If (71) holds then by Lemma 7 $F_l^k(n) < n$. We know (Theorem 1) that there exists a system $\mathcal{A}$ such that $|c^k(\mathcal{A})| = F_l^k(n)$. Thus, a system $\mathcal{B}$ satisfying (70) does not exist. In the proof of the existence of $\mathcal{B}$ in the case of (69) we use the well-known marriage problem [2]:

THEOREM OF ORE. Let E and F be disjoint sets and G a graph on $E \cup F$. Assume G has the property that for arbitrary $D \subset E$ there is a set $H \subset F$ such that every element of H is connected with at least one element of D and $|H| \ge |D|$. Then there exists a one-to-one mapping between E and a subset K of F, such that the associated vertices are connected in G.

In our case $E = \mathcal{A}$, $F = c^k(\mathcal{A})$, and $A \in \mathcal{A}$, $B \in c^k(\mathcal{A})$ are connected if and only if $A \supset B$. Thus, it is sufficient to verify that for every subsystem

$\mathcal{E} = \{A_{i_1}, \dots, A_{i_m}\} \subset \mathcal{A}$

there are at least m sets in $c^k(\mathcal{A})$ which are contained in one of the $A_{i_j}$ $(1 \le j \le m)$. However, $m \le n$, thus by (69)

$m \le \binom{2l-k}{l} + \binom{2(l-1)-k}{l-1} + \dots + \binom{k}{k}$

and Lemma 7 gives

(72) $F_l^k(m) \ge m$.

Use now Theorem 1: $|c^k(\mathcal{E})| \ge F_l^k(m)$. This and (72) result in $|c^k(\mathcal{E})| \ge m$, which means that our graph has the property prescribed in the used theorem. Applying the theorem, the obtained one-to-one mapping gives just the desired system $\mathcal{B}$.

COROLLARY. If $2l - k \ge h$, then (69) always holds and a system $\mathcal{B}$ satisfying (70) always exists. This is an immediate consequence of the inequality

$\binom{h}{l} \le \binom{2l-k}{l} + \binom{2(l-1)-k}{l-1} + \dots + \binom{k}{k}$

and the fact that $\mathcal{A}$ has at most $\binom{h}{l}$ elements.
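The threshold in (69) and the matching argument of the proof can both be tried out on small cases. The sketch below assumes the formula for $F_l^k(n)$ given before Theorem 3, and finds the sets $B_i$ with a standard augmenting-path bipartite matching (a technique swapped in here for illustration; the paper only invokes Ore's theorem for existence). The names `threshold` and `distinct_subsets` are hypothetical.

```python
from itertools import combinations
from math import comb

def F_k(n, l, k):
    """F_l^k(n) from the greedy representation of Lemma 1."""
    total, i = 0, l
    while n > 0:
        a = i
        while comb(a + 1, i) <= n:
            a += 1
        if i >= k:
            total += comb(a, i - k)
        n -= comb(a, i)
        i -= 1
    return total

def threshold(l, k):
    """Right-hand side of (69)."""
    return sum(comb(2 * i - k, i) for i in range(k, l + 1))

def distinct_subsets(family, k):
    """Try to choose distinct B_i inside A_i with |B_i| = |A_i| - k.
    Returns the list of B_i (as sorted tuples) or None if impossible."""
    options = [list(combinations(sorted(A), len(A) - k)) for A in family]
    owner = {}  # chosen subset -> index i of the A_i it is assigned to

    def try_assign(i, seen):
        for B in options[i]:
            if B in seen:
                continue
            seen.add(B)
            if B not in owner or try_assign(owner[B], seen):
                owner[B] = i
                return True
        return False

    for i in range(len(family)):
        if not try_assign(i, set()):
            return None
    result = [None] * len(family)
    for B, i in owner.items():
        result[i] = B
    return result

triples = [frozenset(c) for c in combinations(range(4), 3)]
print(threshold(3, 1), distinct_subsets(triples, 1))
```

With l = 3 and k = 1 the threshold is 14, and since 2l - k = 5 is at least h = 4 the Corollary applies to every system of triples on four points.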

REFERENCES

[1] SPERNER, E.: Ein Satz über Untermengen einer endlichen Menge, Math. Z. 27 (1928) 544-548.

[2] ORE, O.: Graphs and matching theorems, Duke Math. J. 22 (1955) 625-639.

Reprinted from Proceeding of the Colloquium held at Tihany. Hungary. Sept. 1966 Academic Press and Akademiai Kiado, Budapest, 1968, pp. 187-207


Reprinted from JOURNAL OF COMBINATORIAL THEORY All Rights Reserved by Academic Press, New York and London

Vol. 1, No. 2, September 1966 Printed in Italy

A Short Proof of Sperner's Lemma

Let S denote a set of N objects. By a Sperner collection on S we mean a collection of subsets of S such that no one contains another. In [1], Sperner showed that no such collection could have more than $\binom{N}{[N/2]}$ members. This follows immediately from the somewhat stronger

THEOREM. Let $\Gamma$ be a Sperner collection on S. Then

$\sum_{A \in \Gamma} \binom{N}{|A|}^{-1} \le 1$,

where $|A|$ denotes the cardinality of A.

PROOF. For each $A \subset S$, exactly $|A|!\,(N - |A|)!$ maximal chains of S (as a lattice under set inclusion) contain A. Since none of the N! maximal chains of S meet $\Gamma$ more than once, we have

$\sum_{A \in \Gamma} |A|!\,(N - |A|)! \le N!$,

proving the theorem.
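Both steps of the argument can be checked by brute force for small N. The sketch below assumes S = {0, ..., N-1} and identifies each maximal chain with a permutation of S (the chain consists of the sets formed by the prefixes of the permutation); the function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import comb, factorial

def is_sperner(family):
    """True if no member of the family properly contains another."""
    return not any(A < B or B < A for A, B in combinations(family, 2))

def lym_sum(family, N):
    """Sum over the family of 1 / comb(N, |A|), as an exact fraction."""
    return sum(Fraction(1, comb(N, len(A))) for A in family)

def chains_through(A, N):
    """Number of maximal chains of subsets of {0..N-1} that contain A."""
    return sum(1 for p in permutations(range(N)) if set(p[:len(A)]) == set(A))

def max_sperner_size(N):
    """Largest Sperner collection on an N-set, checking the theorem on the way."""
    subsets = [frozenset(c) for r in range(N + 1) for c in combinations(range(N), r)]
    best = 0
    for mask in range(1 << len(subsets)):
        family = [S for j, S in enumerate(subsets) if (mask >> j) & 1]
        if is_sperner(family):
            assert lym_sum(family, N) <= 1
            best = max(best, len(family))
    return best

print(chains_through({0, 2}, 4), factorial(2) * factorial(2))
print(max_sperner_size(3), comb(3, 1))
```

The exhaustive search is exponential in $2^N$, so it is only meant for N up to about 4.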

REFERENCE

1. E. SPERNER, Ein Satz über Untermengen einer endlichen Menge, Math. Z. 27 (1928), 544-548.

D. LUBELL Systems Research Group, Inc. 1501 Franklin Avenue Mineola, New York 11501


Sonderabdru
Fasc. 6

BIRKHÄUSER VERLAG, BASEL UND STUTTGART

Möbius Inversion in Lattices By HENRY H. CRAPO 1)

1. Introduction. In the development of computational techniques for combinatorial theory, attention has lately centered on ROTA'S theory of Möbius inversion [6]. The main theorem of ROTA'S paper, concerning the computation of the Möbius invariant across a Galois connection, is a prerequisite to the use of lattice-theoretic methods in combinatorics. By suitably combining ROTA'S main theorem with a discrete analogue of integration-by-parts, we here obtain a perfectly general formulation of Möbius inversion across a Galois connection (theorem 3, below). As immediate applications of this theory, we obtain a number of interesting computational results concerning finite lattices (sections 3, 4) and combinatorial geometries (section 5).

2. Möbius Inversion across a Galois Connection. We begin with a restatement and a simplified proof of ROTA'S main theorem. The proof turns on the essential fact that for any (locally finite) ordered set Q with least element 0, the recursion

$\sum_{y \in Q} a(y)\, \zeta(y, z) = 0$ for $z \ne 0$

has the unique solution $a(y) = 0$ with initial condition $a(0) = 0$, and has the unique solution $a(y) = \mu_Q(0, y)$ with initial condition $a(0) = 1$. Recall that the zeta function $\zeta(y, z)$ has value 1 if $y \le z$, and has value 0 otherwise.

Theorem 1. If J is a closure operator on a finite lattice P, and Q = P/J is the quotient lattice, consisting of the J-closed elements of P, then for all elements $x \in P$, and elements y closed in P, $x \le y$, the sum

$\sum_{t;\ x \le t \le J(t) = y} \mu(x, t)$

has value $\mu_Q(x, y)$ if x is closed, and has value 0 otherwise.
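Theorem 1 can be tested on a small example. The sketch below assumes P is the Boolean lattice of subsets of {0, ..., n-1} and uses a hypothetical closure operator J that fills in the gaps between the smallest and largest element of a set, so that the closed sets are the intervals; the Möbius functions are computed from the defining recursion.

```python
from functools import lru_cache
from itertools import combinations

def mobius(elements, leq):
    """Moebius function of a finite poset, from mu(x, x) = 1 and
    the sum of mu(x, z) over x <= z <= y being 0 when x < y."""
    @lru_cache(maxsize=None)
    def mu(x, y):
        if x == y:
            return 1
        if not leq(x, y):
            return 0
        return -sum(mu(x, z) for z in elements
                    if z != y and leq(x, z) and leq(z, y))
    return mu

def interval_closure(t):
    # Hypothetical closure operator: fill the gaps between min(t) and max(t).
    return frozenset(range(min(t), max(t) + 1)) if t else frozenset()

def check_closure_theorem(n, J=interval_closure):
    P = [frozenset(c) for r in range(n + 1) for c in combinations(range(n), r)]
    Q = [t for t in P if J(t) == t]          # the J-closed elements
    leq = lambda a, b: a <= b                # set inclusion
    mu_P, mu_Q = mobius(P, leq), mobius(Q, leq)
    for x in P:
        for y in Q:
            if not x <= y:
                continue
            lhs = sum(mu_P(x, t) for t in P if x <= t and J(t) == y)
            rhs = mu_Q(x, y) if J(x) == x else 0
            if lhs != rhs:
                return False
    return True

print(check_closure_theorem(4))
```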

1) We wish to express our gratitude to the National Research Council, Canada, for their support of this research (grant A-2994), to K. JACOBS, for his organization of the extraordinary conference "Kombinatorik" at Oberwolfach, and to D. KLEITMAN and J. GOLDMAN, for their organization of the combinatorics seminar at M.I.T., for which this material was prepared.


H. H.

ARCH. MATH.

CRAPO

Proof. Note that the theorem may be rewritten in the form

(1) $\delta(x, J(x))\, \mu_Q(J(x), y) = \sum_{t \in P} \mu_P(x, t)\, \delta(J(t), y)$.

Without loss of generality, we assume x = 0 in P. For each element $y \in Q$, let

$a(y) = \sum_{t \in P} \mu_P(0, t)\, \delta(J(t), y)$.

Then

$\sum_{y \in Q} a(y)\, \zeta(y, z) = \sum_{t \in P,\ y \in Q} \mu_P(0, t)\, \delta(J(t), y)\, \zeta(y, z) = \sum_{t \in P} \mu_P(0, t)\, \zeta(t, z) = \delta_P(0, z)$.

If $0 < J(0)$, $\delta_P(0, z) = 0$ for all $z \in Q$, and $a(y) = 0$ for all $y \in Q$. If $0 = J(0)$, $\delta_P(0, z) = 1$ for z = 0, and $a(y) = \mu_Q(0, y)$. ∎

Given a function f from a finite lattice P into a ring with unit, associate the difference operators D, E:

lower difference $Df(x) = \sum_{y;\ y \le x} f(y)\, \mu(y, x)$,

upper difference $Ef(x) = \sum_{y;\ x \le y} \mu(x, y)\, f(y)$.

Theorem 2 (Analogue of integration by parts). If f, g are functions from a finite lattice P into a ring, then

$\sum_{x \in P} Df(x)\, g(x) = \sum_{x \in P} f(x)\, Eg(x)$.

Proof. Both are equal to $\sum_{x, y} f(x)\, \mu(x, y)\, g(y)$. ∎
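The difference operators and the identity of Theorem 2 can be tried on a concrete lattice. The following sketch assumes the lattice of divisors of 12 ordered by divisibility, with two arbitrary integer-valued test functions; the summation operator S (sum over all elements below x) is included to show that D undoes it. All names are illustrative.

```python
from functools import lru_cache

def mobius(elements, leq):
    """Moebius function of a finite poset via the defining recursion."""
    @lru_cache(maxsize=None)
    def mu(x, y):
        if x == y:
            return 1
        if not leq(x, y):
            return 0
        return -sum(mu(x, z) for z in elements
                    if z != y and leq(x, z) and leq(z, y))
    return mu

divisors = (1, 2, 3, 4, 6, 12)
leq = lambda x, y: y % x == 0           # x <= y means x divides y
mu = mobius(divisors, leq)

def D(f):
    """Lower difference: Df(x) = sum of f(y) mu(y, x) over y <= x."""
    return lambda x: sum(f(y) * mu(y, x) for y in divisors if leq(y, x))

def E(g):
    """Upper difference: Eg(x) = sum of mu(x, y) g(y) over y >= x."""
    return lambda x: sum(mu(x, y) * g(y) for y in divisors if leq(x, y))

def S(f):
    """Summation operator: Sf(x) = sum of f(y) over y <= x."""
    return lambda x: sum(f(y) for y in divisors if leq(y, x))

def parts_identity(f, g):
    lhs = sum(D(f)(x) * g(x) for x in divisors)
    rhs = sum(f(x) * E(g)(x) for x in divisors)
    return lhs, rhs

f = lambda x: x * x + 1                 # arbitrary test functions
g = lambda x: 3 * x - 2
print(parts_identity(f, g))
```

On this lattice mu(1, d) agrees with the classical number-theoretic Möbius function of d.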

It is interesting to compare the proof of theorem 2 with the argument that cycles and coboundaries in a graph are orthogonal to one another. For each vertex p and edge x, let

$\varepsilon(p, x) = +1$ if p is the head of x, $-1$ if p is the tail of x, and 0 otherwise.

Boundary and coboundary operators are defined by

$\partial f(p) = \sum_x \varepsilon(p, x)\, f(x)$ for any 1-chain f,

$\delta g(x) = \sum_p g(p)\, \varepsilon(p, x)$ for any 0-chain g.

If f is a 1-cycle ($\partial f = 0$) and h is a 1-coboundary ($h = \delta g$), then

$\sum_x f(x)\, h(x) = \sum_{x, p} f(x)\, \varepsilon(p, x)\, g(p) = \sum_p \partial f(p)\, g(p) = \sum_p 0 \cdot g(p) = 0$.

If $\sigma: P \to L$ is a supremum-homomorphism from a complete lattice P into a complete lattice L, then $\sigma^{\Delta}: L \to P$ is an infimum-homomorphism, defined by $\sigma^{\Delta}(y) = \sup\{x;\ \sigma(x) \le y\}$.

The pair $\sigma, \sigma^{\Delta}$ is a Galois connection, in the sense that $P/\sigma^{\Delta}(\sigma)$ is isomorphic to $L/\sigma(\sigma^{\Delta})$, where $\sigma^{\Delta}(\sigma)$ is a closure operator on P and $\sigma(\sigma^{\Delta})$ is a coclosure operator on L. All Galois connections between complete lattices arise in this fashion. In the special case where $\sigma$ is onto L, $P/\sigma^{\Delta}(\sigma) \cong L$.

Vol. XIX, 1968

We now combine theorems 1 and 2 to establish a general theorem on Möbius inversion across Galois connections. This theorem is the discrete analogue of the change-of-variables formula for integration, and of Stokes' Theorem, $\int_{\partial S} \omega = \int_S d\omega$.

Theorem 3. If $\sigma: P \to L$ is a sup-homomorphism from a finite lattice P into a finite lattice L, if f is a function from P and g is a function from L taking values in a ring, then

$\sum_{x \in P} Df(x)\, g(\sigma(x)) = \sum_{y \in L} f(\sigma^{\Delta}(y))\, Eg(y)$.

Proof. Let $Q = P/\sigma^{\Delta}\sigma \cong L/\sigma\sigma^{\Delta}$ be the common quotient lattice, and regard both f and g, restricted to closed elements in P and L, as functions on Q. Then

$\sum_{x \in P} Df(x)\, g(\sigma(x)) = \sum_{t, x \in P} f(t)\, \mu_P(t, x)\, g(\sigma(x)) = \sum_{t, x \in P,\ s \in Q} f(t)\, \mu_P(t, x)\, \delta(\sigma(x), s)\, g(s)$

(2) $= \sum_{r, s \in Q} f(r)\, \mu_Q(r, s)\, g(s) = \sum_{r \in Q,\ y, z \in L} f(r)\, \delta(r, \sigma^{\Delta}(y))\, \mu_L(y, z)\, g(z)$

$= \sum_{y, z \in L} f(\sigma^{\Delta}(y))\, \mu_L(y, z)\, g(z) = \sum_{y \in L} f(\sigma^{\Delta}(y))\, Eg(y)$. ∎

A number of related forms of Theorem 3 may be more convenient in applications. For instance, using the fact that the difference operators D and E are inverses to the summation operators S and T,

$Sf(x) = \sum_{y;\ y \le x} f(y)$; $Tf(x) = \sum_{y;\ x \le y} f(y)$,

we obtain corollary 1 by substituting Sf for f, Tg for g.

Corollary 1. If $\sigma: P \to L$ is a sup-homomorphism from a finite lattice P into a finite lattice L, if f is a function on P and g is a function on L, then

$\sum_{x \in P} f(x)\, g(\sigma(x)) = \sum_{y \in L} Sf(\sigma^{\Delta}(y))\, Eg(y)$,

$\sum_{x \in P} Df(x)\, Tg(\sigma(x)) = \sum_{y \in L} f(\sigma^{\Delta}(y))\, g(y)$,

$\sum_{x \in P} f(x)\, Tg(\sigma(x)) = \sum_{y \in L} Sf(\sigma^{\Delta}(y))\, g(y)$. ∎

The symmetric intermediate form (2) appearing in the proof of Theorem 3 deserves special note: Corollary 2. If 0': P --+ L is a sup-homomorphism from a finite lattice P into a finite lattice L, if Q is the quotient lattice P/O'-1 (0') ~ L/O'(~), and if functions f on P and g on L are defined on Q by restriction to closed elements of P, coclosed elements of L, respectively, then '2,Df(x)g(O'(x» = '2,f(r) f-lQ(r, s)g(s) = XEP

r,seQ

'2, f(O'-1 (y)) Eg(y) . I

1/EL

For Galois connections given directly as a pair of order-inverting maps 0', T whose composites O'T and TO' are increasing, it is more convenient to have Theorem 3 in the form obtained by inverting the lattice L, as follows.


Corollary 3. If rJ: P -+ L, i: L -+ P is a Galois connection between finite lattices P and L, if f is a function on P, and g is a function on L, then LDf(x)g(rJ(x)) = Lf(i(y))Dg(y).1 lIeL

",eP

Theorem 1, above, lacks the full symmetry of Galois connections because it operates between a lattice P and its quotient Q, rather than between two lattices P, L, with a common quotient Q. The symmetric form for Theorem 1, and thus for ROTA'S main theorems, is recoverable from Theorem 3 as follows.

Corollary 4. If a is a sup-homomorphism from a finite lattice P into a finite lattice L, then for any elements t E P, Z E L Lit (t, x) b (a(x), z) = L b (t, aLi (y)) It (y, z) . veL

"'EP

This common value is clearly equal to 0 unless t is closed in P (ie: t < x => a (t) < a (x)) and Z is coclosed in L (ie: 3x E P; a(x) = z). If t is closed in P and z is closed in L, both t and z correspond to elements of the common quotient lattice Q, and the common value of the summations is equal to ItQ(t, z).

Proof. Setf(x) = b(t,x), g(y) = b(y,z). Then Df(x) = ItP(t,x) and Eg(y) = ltL(y,z). The intermediate symmetric form (2) arising in the proof of Theorem 3 is in this case equal to L b(t, r) ItQ(r, s) b(s, z). I " .. Q

The theory of Mobius inversion across a composite of sup-homomorphisms develops directly from Theorem 3.

Corollary 5. If a,: P'-l -+ PI, i = 1, ... , k, is a sequence of sup-homomorphisms between finite lattices, if f is a function on Po and g is a function on Pk, then the sums (3)

L f (at (... af (x)

)) It (x, y) g (ak (... at+l (y)

))

f£,lIEPi

are equal, fori = 0, ... , k. (Note thatfori = k the evaluation of g is at y.) I For computations involving composites such as aLi (ill) it should be borne in mind that Lf is a contravariant functor, ie: aLi (i.1) = (i(a))LI. The applicability of Corollary 5 is appreciably extended by the observation that composites of sup-homomorphisms give rise to commutative diagrams involving the intermediate quotients. For two sup-homomorphisms, diagram 1 applies.

Diagram 1


For any sup-homomorphism p, the symbol PI indicates the map of each element to its closure, regarded as an element of the quotient lattice, while the symbol P2 indicates the map of each element of the quotient, regarded as a closed element of the domain, to its image under p. Note that (a Th

= al(a2 Tlh

(a T)2 = (a2 Tllz T2.

and

A result not obvious from previous forms of Theorem 3 derives from such consid· eration of quotients. Note that the lattices Land Q in the following corollary are not related by a Galois connection. Corollary 6. If a: P -+ L and T: L -+ Mare sup·homorrwrphisms between finite lattices, if f is a function on P and g is a function on M, and if Q is the common quotient lattice 0/ P and M relative to the composite -r(a), then L f(a 4 (x))PL(x,y)g(T(Y)) = Lf(r)PQ(r,s)g(s).

$.lIeL

r.8eQ

I

The expressions f(r), g(s) in Corollary 6 refer as usual to f((T(a))f (T)) and g((T(a))2 (s)), the values of f and g at elements closed in L, coclosed in M, relative to T(a). 3. Enumerative Lattice Theory. To each binary relation !?: X -+ Y between finite sets X and Y there corresponds a Galois connection (a, T) between the Boolean algebras B(X), B(Y). For all A ~ X, B ~ Y, a(A) = {YE Y; xEA => X!?Y}, T(B)={XEX; YEB=>x!?y}.

(These definitions are simply $\sigma(A) = \bigcap A$, $\tau(B) = \bigcap B$, if elements of $X$ are viewed as subsets of $Y$, and elements of $Y$ are viewed as subsets of $X$.)

Theorem 4. If $\varrho: X \to Y$ is a relation between finite sets, if $f$ is a function defined on subsets of $X$, and if $g$ is a function defined on subsets of $Y$, then

$\sum_{A \subseteq X} Df(A) \, g(\bigcap A) = \sum_{B \subseteq Y} f(\bigcap B) \, Dg(B)$.

Proof. Apply Corollary 3 of Theorem 3 to the Galois connection $B(X) \rightleftarrows B(Y)$ defined by $\sigma(A) = \bigcap A \subseteq Y$ for $A \subseteq X$, $\tau(B) = \bigcap B \subseteq X$ for $B \subseteq Y$. ∎

The elements of the common quotient lattice $Q$ are precisely those pairs $(A, B)$, $A \subseteq X$, $B \subseteq Y$, which are

1) totally related: $x \in A$, $y \in B \Rightarrow x \varrho y$,

and maximal, in the sense that

2) $x \notin A \Rightarrow \exists\, y \in B$, $x \not\varrho\, y$,
3) $y \notin B \Rightarrow \exists\, x \in A$, $x \not\varrho\, y$,

where $\not\varrho$ denotes negation of the relation $\varrho$. Each element $s \in Q$ thus has a cardinality $|s|_1$ as a subset of $X$ and a cardinality $|s|_2$ as a subset of $Y$.


Corollary 1. If $(\sigma, \tau)$ is the Galois connection defined by a finite relation $\varrho: X \to Y$, then

$\sum_{A \subseteq X} (\varphi - 1)^{|A|} \, \nu^{|\sigma(A)|} = \sum_{B \subseteq Y} \varphi^{|\tau(B)|} \, (\nu - 1)^{|B|}$.

This sum may in turn be calculated on the common quotient lattice $Q$, and is equal to

$\sum_{r,s \in Q} \varphi^{|r|_1} \, \mu_Q(r, s) \, \nu^{|s|_2}$.

Proof. Let $f(A) = \varphi^{|A|}$ for all $A \subseteq X$, and $g(B) = \nu^{|B|}$ for all $B \subseteq Y$. Noting that

$Df(A) = \sum_{C \subseteq A} \varphi^{|C|} (-1)^{|A - C|} = (\varphi - 1)^{|A|}$,

and similarly for $Dg$, the result follows directly from Theorem 4. ∎
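The three sums in Corollary 1 can be compared numerically on a small relation. The following is a minimal sketch, not the author's procedure: the helper names (`subsets`, `sigma`, `tau`, `boolean_sums`, `closed_pairs`, `quotient_sum`) and the sample relation are assumptions made for illustration, and the quotient lattice $Q$ is assumed to be ordered by inclusion of the first component $A$ of each closed pair $(A, B)$.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = list(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def sigma(A, Y, rel):
    """sigma(A): the elements of Y related to every element of A."""
    return frozenset(y for y in Y if all((x, y) in rel for x in A))


def tau(B, X, rel):
    """tau(B): the elements of X related to every element of B."""
    return frozenset(x for x in X if all((x, y) in rel for y in B))


def boolean_sums(X, Y, rel, phi, nu):
    """Both Boolean-algebra sums of Corollary 1 (over subsets of X and of Y)."""
    left = sum((phi - 1) ** len(A) * nu ** len(sigma(A, Y, rel)) for A in subsets(X))
    right = sum(phi ** len(tau(B, X, rel)) * (nu - 1) ** len(B) for B in subsets(Y))
    return left, right


def closed_pairs(X, Y, rel):
    """Closed pairs (A, B) forming Q, sorted by |A| (a linear extension of Q)."""
    pairs = set()
    for A in subsets(X):
        B = sigma(A, Y, rel)
        pairs.add((tau(B, X, rel), B))
    return sorted(pairs, key=lambda pair: len(pair[0]))


def quotient_sum(X, Y, rel, phi, nu):
    """Sum of phi^|r|_1 * mu_Q(r, s) * nu^|s|_2 over all r <= s in Q."""
    Q = closed_pairs(X, Y, rel)
    total = 0
    for i, r in enumerate(Q):
        mu = {r: 1}  # mu[s] holds mu_Q(r, s) for the elements s >= r seen so far
        for s in Q[i + 1:]:
            if r[0] <= s[0]:
                # Defining recursion: mu(r, s) = -sum of mu(r, t) over r <= t < s.
                mu[s] = -sum(value for t, value in mu.items() if t[0] < s[0])
        total += sum(phi ** len(r[0]) * value * nu ** len(s[1]) for s, value in mu.items())
    return total


# Hypothetical sample relation between X = {1, 2, 3} and Y = {a, b, c}.
X, Y = [1, 2, 3], ["a", "b", "c"]
rel = {(1, "a"), (1, "b"), (2, "b"), (3, "b"), (3, "c")}
left, right = boolean_sums(X, Y, rel, phi=3, nu=5)
print(left, right, quotient_sum(X, Y, rel, phi=3, nu=5))
```

The sum over $Q$ typically ranges over far fewer terms than the $2^{|X|}$ or $2^{|Y|}$ subsets, which is the practical point of passing to the quotient lattice.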

Modulo a few redundancies, Corollary 1 to Theorem 4 is also the fundamental enumerative structure theorem for finite lattices. The redundancies arise, causing nonisomorphic relations to have isomorphic lattices, when $\sigma(x) = \sigma(A)$ for some subset $A \subseteq X$ and some element $x \notin A$ (also when this situation occurs for some element and subset of $Y$). When such redundancies do not occur in the relation $\varrho$, the supremum-irreducible elements of the lattice $Q$ are precisely the pairs $(x, \sigma(x))$ for $x \in X$, the infimum-irreducible elements of $Q$ are precisely the pairs $(\tau(y), y)$ for $y \in Y$, and the relation $\varrho$ may be recovered from the lattice $Q$ by

(4) $x \varrho y \Leftrightarrow x \leq y$ in $Q$.

Corollary 2. Let $Q$ be a finite lattice, with set $X$ of supremum-irreducible elements and set $Y$ of infimum-irreducible elements. For each element $z \in Q$, let $\alpha(z)$, $\beta(z)$ be the numbers of sup-irreducibles beneath $z$ and inf-irreducibles above $z$, respectively. Then

$\sum_{A \subseteq X} (\varphi - 1)^{|A|} \, \nu^{\beta(\sup A)} = \sum_{B \subseteq Y} \varphi^{\alpha(\inf B)} \, (\nu - 1)^{|B|}$,

and both sums are equal to

$\sum_{r,s \in Q} \varphi^{\alpha(r)} \, \mu(r, s) \, \nu^{\beta(s)}$. ∎

Redundancies can be reintroduced on the other side of the relation-lattice correspondence, with interesting results. Given a finite lattice $L$, and functions $j$ and $k$ from finite sets $X$ and $Y$, respectively, into $L$, a binary relation $\varrho$ is defined by

(5) $x \varrho y \Leftrightarrow j(x) \leq k(y)$.

Let $Q$ be the quotient lattice of the Galois connection determined by the relation $\varrho$. Then $Q$ has two order-embeddings in $L$, neither of which is associated with a closure operator on $L$. Corollary 6 to Theorem 3 applies.

Theorem 5. Let $j$ and $k$ be functions from finite sets $X$ and $Y$ into a finite lattice $L$, and let $\varrho$ and $Q$ be the relation and quotient lattice described above. Each element $z \in L$ has cardinalities $|z|_1$ and $|z|_2$ given by

$|z|_1 = |\{e \in X;\ j(e) \leq z\}|$, $|z|_2 = |\{e \in Y;\ z \leq k(e)\}|$.


Each element $s \in Q$ is realized as a pair $(A, B)$ of subsets of $X$, $Y$, and thus has cardinalities $|s|_1 = |A|$, $|s|_2 = |B|$. Then

$\sum_{x,y \in L} \varphi^{|x|_1} \, \mu_L(x, y) \, \nu^{|y|_2} = \sum_{r,s \in Q} \varphi^{|r|_1} \, \mu_Q(r, s) \, \nu^{|s|_2}$.

In particular,

$\mu_Q(0, 1) = \sum_{x,y \in L} \delta(0, |x|_1) \, \mu(x, y) \, \delta(|y|_2, 0)$.

Proof. The function $j$ extends to a sup-homomorphism $\sigma$ from the Boolean algebra $B(X)$ into $L$ by $\sigma(A) = \sup_L \{j(e);\ e \in A\}$. Similarly, $k$ extends to an inf-homomorphism from the inverted Boolean algebra $\tilde{B}(Y)$ into $L$ (with opposite $\tau: L \to \tilde{B}(Y)$), defined by $\tau^{\Delta}(B) = \inf_L \{k(e);\ e \in B\}$. Then

$\tau(\sigma(A)) = \{b \in Y;\ a \in A \Rightarrow j(a) \leq k(b)\}$,

and the quotient lattice with respect to the composite $\tau(\sigma)$ is equal to $Q$. The formula follows from Corollary 6 to Theorem 3, and the special case results from setting $\varphi = \nu = 0$, realizing that $|r|_1 = 0 \Rightarrow r = 0 \in Q$ and $|s|_2 = 0 \Rightarrow s = 1 \in Q$. ∎

If, in the situation described above, $X = Y = L$, and if $L$ is assumed to be an ordered set, not necessarily a lattice, then the resulting lattice $Q$ is the MacNeille completion of the ordered set $L$.

4. Cross-cuts and Complementation. Rota's cross-cut theorem [6] and this author's complementation theorem [1] have in common a double application of Möbius inversion. Interesting sidelights on these theorems are obtainable by consideration of a lattice of the intervals of a finite lattice.

Theorem 6. Given a finite lattice $L$, let $I(L)$ be the set consisting of the empty interval $\emptyset$, together with all intervals $[x, y]$, for $x \leq y$ in $L$, ordered by containment. Then

$\mu_{I(L)}([x, y], [w, z]) = \mu_L(w, x) \, \mu_L(y, z)$ if $w \leq x \leq y \leq z$,

and

$\mu_{I(L)}(\emptyset, [x, y]) = -\mu_L(x, y)$.

Proof. If $w \leq x \leq y \leq z$, the interval from $[x, y]$ to $[w, z]$ is isomorphic to the cartesian product of the inverted interval $[w, x]$ in $L$ with the interval $[y, z]$ in $L$. But the Möbius function of the inverted lattice $\tilde{L}$ satisfies $\mu_{\tilde{L}}(x, w) = \mu_L(w, x)$, and the Möbius invariant is multiplicative on cartesian products. This establishes the product formula. Then

$\mu_{I(L)}(\emptyset, [w, z]) = -\sum_{x,y;\ w \leq x \leq y \leq z} \mu_{I(L)}([x, y], [w, z]) = -\sum_{x,y;\ w \leq x \leq y \leq z} \mu_L(w, x) \, \mu_L(y, z) = -\sum_{x;\ w \leq x \leq z} \mu_L(w, x) \, \delta(x, z) = -\mu_L(w, z)$. ∎
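Both formulas of Theorem 6 can be checked by brute force on a small lattice. The sketch below is an illustration under stated assumptions: it uses the lattice of divisors of 12 as a sample $L$, represents the empty interval by `None`, and the helper names `mobius` and `interval_leq` are hypothetical.

```python
def mobius(elements, leq):
    """Moebius function of a finite poset, as a dict {(x, y): mu(x, y)} for x <= y."""
    mu = {}
    for x in elements:
        above = [y for y in elements if leq(x, y)]
        # Order the elements above x so that smaller elements come first.
        ordered = sorted(above, key=lambda y: sum(1 for t in above if leq(t, y)))
        for y in ordered:
            if y == x:
                mu[(x, y)] = 1
            else:
                # Defining recursion: the values mu(x, t), x <= t <= y, sum to zero.
                mu[(x, y)] = -sum(mu[(x, t)] for t in ordered if leq(t, y) and t != y)
    return mu


def interval_leq(leq):
    """Containment order on intervals; None stands for the empty interval."""
    def contained(a, b):
        if a is None:
            return True
        if b is None:
            return False
        (x, y), (w, z) = a, b
        return leq(w, x) and leq(y, z)
    return contained


def divides(a, b):
    return b % a == 0


L = [1, 2, 3, 4, 6, 12]  # sample lattice: divisors of 12 under divisibility
mu_L = mobius(L, divides)
intervals = [None] + [(x, y) for x in L for y in L if divides(x, y)]
mu_I = mobius(intervals, interval_leq(divides))

product_formula_holds = all(
    mu_I[((x, y), (w, z))] == mu_L[(w, x)] * mu_L[(y, z)]
    for (x, y) in intervals[1:]
    for (w, z) in intervals[1:]
    if divides(w, x) and divides(y, z)
)
empty_formula_holds = all(
    mu_I[(None, (x, y))] == -mu_L[(x, y)] for (x, y) in intervals[1:]
)
print(product_formula_holds, empty_formula_holds)
```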

Theorem 7. Given a finite lattice $L$, an arbitrary subset $X \subseteq L$, a function $f$ defined on subsets of $X$, and a function $g$ defined on intervals of $L$, then

$\sum_{A \subseteq X} Df(A) \, g([\inf A, \sup A]) = f(\emptyset) g(\emptyset) - f(\emptyset) \sum_{w,z \in L} \mu(w, z) \, g([w, z]) + \sum_{w,x,y,z \in L} f(X \cap [x, y]) \, \mu(w, x) \, \zeta(x, y) \, \mu(y, z) \, g([w, z])$.


Proof. The map $\sigma$ defined by $\sigma(A) = [\inf A, \sup A]$ is a sup-homomorphism from the Boolean algebra $B(X)$ into the interval lattice $I(L)$, because

$\sigma(A \cup B) = [\inf(A \cup B), \sup(A \cup B)] = [\inf A, \sup A] \vee [\inf B, \sup B]$.

Note that $\sigma^{\Delta}([x, y]) = X \cap [x, y]$. By Theorem 3,

$\sum_{A \subseteq X} Df(A) \, g([\inf A, \sup A]) = \sum_{\emptyset \leq [x,y] \leq [w,z] \leq [0,1]} f(X \cap [x, y]) \, \mu_{I(L)}([x, y], [w, z]) \, g([w, z])$,

which reduces to the required form, by Theorem 6, once the summation is separated into three parts: $\emptyset = [x, y] = [w, z]$, $\emptyset = [x, y] < [w, z]$, and $\emptyset < [x, y] \leq [w, z]$. ∎

Corollary 1. If $X$ and $Y$ are arbitrary subsets of a finite lattice $L$, let $q_k$ be the number of $k$-element subsets $A$ of $X$ disjoint from $Y$ and spanning $L$ (i.e. $\inf A = 0$, $\sup A = 1$). Then

$q_0 - q_1 + q_2 - \dots = \delta_L(0, 1) - \mu_L(0, 1) + \sum_{x,y \in L} \zeta(X \cap [x, y], Y) \, \mu(0, x) \, \zeta(x, y) \, \mu(y, 1)$.

Proof. Set $f(A) = \zeta(A, Y)$, so that

$Df(A) = \sum_{B \subseteq A} (-1)^{|A| - |B|} \zeta(B, Y) = (-1)^{|A|} \delta(\emptyset, A \cap Y)$.

Set $g([w, z]) = \delta(0, w) \, \delta(z, 1)$. The sinister of the equation in Theorem 7 becomes

$\sum_{A \subseteq X} (-1)^{|A|} \delta(\emptyset, A \cap Y) \, \delta(0, \inf A) \, \delta(\sup A, 1) = \sum_{k=0}^{\infty} (-1)^k q_k$,

and the simplification of the dexter is obvious. ∎

The cross-cut theorem, the complementation theorem, and, one may conjecture, other interesting facts about Möbius invariants of lattices are evaluations of Corollary 1 at particular sets $X$, $Y$.

Corollary 2 (The Cross-cut Theorem). If $X$ is a cross-cut of a finite lattice $L$, and if $q_k$ is the number of $k$-element subsets of $X$ which span $L$, then

$q_0 - q_1 + q_2 - \dots = \mu_L(0, 1)$.

Proof. In Corollary 1 to Theorem 7, let $X$ be the cross-cut, and let $Y = \emptyset$. The condition $X \cap [x, y] = \emptyset$ is satisfied if and only if $x \leq y < z$ for some $z \in X$, or $z < x \leq y$ for some $z \in X$. These possibilities are mutually exclusive, and are indicated $y < X$ and $X < x$, respectively. Thus

$\sum_{x,y \in L;\ X \cap [x,y] = \emptyset} \mu(0, x) \, \zeta(x, y) \, \mu(y, 1) = \sum_{x,y;\ y < X} \mu(0, x) \, \zeta(x, y) \, \mu(y, 1) + \sum_{x,y;\ X < x} \mu(0, x) \, \zeta(x, y) \, \mu(y, 1) = \sum_{y;\ y < X} \delta(0, y) \, \mu(y, 1) + \sum_{x;\ X < x} \mu(0, x) \, \delta(x, 1) = 2\mu(0, 1)$.

Substitution of this formula into that of Corollary 1 completes the proof. ∎
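The cross-cut theorem is easy to test on divisor lattices, where $\inf$ is the greatest common divisor and $\sup$ the least common multiple. The following is a small sketch under those assumptions. The function names are hypothetical, and the caller is assumed to pass a genuine cross-cut (for instance the set of atoms or of coatoms) of the divisor lattice of $n > 1$, so that $q_0 = 0$ and the empty subset may be skipped.

```python
from functools import reduce
from itertools import combinations
from math import gcd


def divisors(n):
    """Divisors of n in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def mobius_bottom_to_top(n):
    """mu(0, 1) of the divisor lattice of n, i.e. mu(1, n), by the defining recursion."""
    mu = {}
    for d in divisors(n):  # increasing order, so proper divisors are handled first
        mu[d] = 1 if d == 1 else -sum(mu[t] for t in divisors(d) if t != d)
    return mu[n]


def lcm(a, b):
    return a * b // gcd(a, b)


def alternating_spanning_count(n, cross_cut):
    """q_0 - q_1 + q_2 - ... for subsets of `cross_cut` with gcd 1 and lcm n (n > 1)."""
    total = 0
    for k in range(1, len(cross_cut) + 1):
        for A in combinations(cross_cut, k):
            if reduce(gcd, A) == 1 and reduce(lcm, A) == n:
                total += (-1) ** k
    return total


# Atoms of the divisor lattice of 30 form a cross-cut.
print(alternating_spanning_count(30, [2, 3, 5]), mobius_bottom_to_top(30))
```

Only subsets of the cross-cut are enumerated, which is usually far cheaper than running the Möbius recursion over the whole lattice.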


Corollary 3 (The Complementation Theorem). If $s$ is any fixed element in a finite lattice $L$, then

$\mu(0, 1) = \sum_{x,y \in s^{\perp}} \mu(0, x) \, \zeta(x, y) \, \mu(y, 1)$,

where $s^{\perp}$ is the set of complements of $s$ in $L$.

Proof. In Corollary 1 to Theorem 7, let $X = L$ and $Y = s^{\perp}$. Note that

$\zeta(X \cap [x, y], Y) = \zeta([x, y], s^{\perp}) = 1$

if and only if both $x$ and $y$ are complements of $s$. If $0 = 1$, $q_0 - q_1 + \dots = q_0 = 1$. If $0 \neq 1$, then at most one of $0$, $1$ is in $s^{\perp}$. Assume w.l.o.g. that $0 \notin s^{\perp}$. Then a subset $A$ disjoint from $s^{\perp} \cup \{0\}$ spans if and only if $A \cup \{0\}$ spans. $A$ and $A \cup \{0\}$ have cardinalities of opposite parity, so $q_0 - q_1 + \dots = 0$. Thus $q_0 - q_1 + \dots = \delta(0, 1)$, and the corollary follows. ∎

5. Combinatorial Geometry. A combinatorial geometry (or simply, a geometry) (e.g. [4]) is most easily defined in terms of the lattice structure of its flats (closed subgeometries). Such lattices, which are called geometric lattices, have the distinguishing characteristic that, for all $x, y \in L$,

$y$ covers $x \Leftrightarrow \exists$ atom $p$ complementary to $x$ in $[0, y]$.

In this definition, "$y$ covers $x$" means $x < y$ and $x < t \leq y \Rightarrow t = y$. The complementarity condition requires $x \wedge p = 0$, $x \vee p = y$. We shall consider only finite geometric lattices here, so this single property will suffice for a definition. Geometric lattices are consequently relatively-complemented semimodular lattices, generated by atoms, generated by coatoms, and possessed of a well-defined rank $\lambda(x)$ = length of all maximal chains from $0$ to $x$. (Note $\lambda(0) = 0$.) The points of the associated geometry are the atoms of the geometric lattice. The lines, planes, ..., of the geometry are the sets of points beneath elements of rank 2, 3, ..., respectively, in the lattice.

Linear graphs give rise to geometries. If $G$ is a linear graph with edge set $X$ and vertex set $H$, the equivalence relation of path-connection along edges in a subset $A \subseteq X$ yields a partition $\pi_A$ of the vertex set $H$ into $A$-path connected components. The map $\sigma: A \to \pi_A$ is a sup-homomorphism from the Boolean algebra $B(X)$ into the partition lattice $P(H)$. The Galois-closed edge sets, i.e. the maximal sets $A$ of each rank $\lambda(\sigma(A))$, form a geometric lattice $L(G)$.

The coboundary operator, defined parenthetically in section 2, maps $\nu^q$ 0-chains $f: H \to \{0, 1, \dots, \nu - 1\}$ to each coboundary $\delta f$, where $q$ is the number of connected components of $G$. Colorings, those 0-chains which have unequal values on the ends of any edge, correspond to coboundaries which take non-zero values on each edge. The kernel $\ker g$ of a coboundary is the set $g^{-1}(0)$, which is necessarily closed. We wish to calculate $p(x; \nu)$, the number of coboundaries with kernel $x$ and values in the ring $\{0, 1, \dots, \nu - 1\}$. The number of $\nu$-colorings of the graph is $\nu^q p(0; \nu)$. A coboundary is freely determined by its values on any basis (spanning tree) for the graph, so there are $\nu^{\lambda(1) - \lambda(x)}$ $\nu$-coboundaries with kernel $\geq x$, for any $x \in L(G)$.


By Mobius inversion on L, there are p(x; v) = L ,u(x,Y)V.!(l)-.!(y) = Ep( ; v)(x) u;z~"

$\nu$-coboundaries with kernel $x$. The polynomial $p(0; \nu)$ is clearly well-defined for any finite Dedekind lattice $L$, i.e. a lattice satisfying the chain condition, and thus having a rank function $\lambda$. $p(0; \nu)$ has been called the Poincaré polynomial of the lattice $L$. Recent unpublished work tends to establish a relation between Poincaré polynomials and general "coloring problems" on such lattices.

Poincaré polynomials may be considered as polynomial-valued elements of the incidence algebra of the lattice $L$. Let

$p(x, z; \nu) = \sum_{y \in L} \mu(x, y) \, \zeta(y, z) \, \nu^{\lambda(z) - \lambda(y)}$,

so that $p(x; \nu) = p(x, 1; \nu)$. Such polynomial-valued matrices have easily calculable inverses in the incidence algebra.

Theorem 8. Let functions $a$, $b$, $c$ on a finite lattice $L$ take values which are invertible elements of a ring. Then

$q(x, z) = \sum_{y \in L} a(x) \, \mu(x, y) \, b(y) \, \zeta(y, z) \, c(z)$

has an inverse in the incidence algebra of $L$. In particular, the Poincaré polynomial $p(x, z; \nu)$ has inverse

$p^{-1}(x, z; \nu) = \sum_{y \in L} \nu^{\lambda(y) - \lambda(x)} \, \mu(x, y) \, \zeta(y, z)$. ∎
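The count $\nu^q p(0; \nu)$ of $\nu$-colorings of a graph can be compared with direct enumeration. The sketch below is illustrative only and assumes a loopless graph given as a vertex list and a list of edge pairs; all function names are hypothetical. It builds the lattice $L(G)$ of closed edge sets, computes $\mu(0, y)$ by the defining recursion, and evaluates $p(0; \nu) = \sum_y \mu(0, y) \, \nu^{\lambda(1) - \lambda(y)}$.

```python
from itertools import combinations, product


def components(vertices, edges):
    """Map each vertex to a representative of its connected component (union-find)."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, w in edges:
        parent[find(u)] = find(w)
    return {v: find(v) for v in vertices}


def closure(vertices, all_edges, A):
    """Galois closure of an edge set A: every edge joining two A-connected vertices."""
    root = components(vertices, A)
    return frozenset(e for e in all_edges if root[e[0]] == root[e[1]])


def rank(vertices, A):
    """lambda(A) = number of vertices minus number of A-connected components."""
    root = components(vertices, A)
    return len(vertices) - len(set(root.values()))


def colorings_via_mobius(vertices, all_edges, nu):
    """nu^q * p(0; nu), with p(0; nu) obtained by Moebius inversion on L(G)."""
    flats = {closure(vertices, all_edges, A)
             for k in range(len(all_edges) + 1)
             for A in combinations(all_edges, k)}
    ordered = sorted(flats, key=len)  # the empty flat (bottom element) comes first
    mu = {}
    for y in ordered:
        # mu(0, 0) = 1; otherwise minus the sum of mu(0, t) over flats t strictly below y.
        mu[y] = 1 if not mu else -sum(mu[t] for t in mu if t < y)
    top_rank = rank(vertices, all_edges)
    q = len(vertices) - top_rank  # number of connected components of G
    p0 = sum(mu[y] * nu ** (top_rank - rank(vertices, y)) for y in ordered)
    return nu ** q * p0


def colorings_brute_force(vertices, all_edges, nu):
    """Count 0-chains with unequal values on the two ends of every edge."""
    count = 0
    for values in product(range(nu), repeat=len(vertices)):
        color = dict(zip(vertices, values))
        if all(color[u] != color[w] for u, w in all_edges):
            count += 1
    return count


triangle_vertices = [0, 1, 2]
triangle_edges = [(0, 1), (1, 2), (0, 2)]
print(colorings_via_mobius(triangle_vertices, triangle_edges, 3),
      colorings_brute_force(triangle_vertices, triangle_edges, 3))
```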

Every geometric lattice $L$ may be realized in a number of ways as a quotient of other geometric lattices with respect to sup-homomorphisms which also preserve the relation covers-or-equals. Such maps are called strong maps, and map atoms either to atoms or to $0$. There is a notion of orthogonality [3] with respect to any such realization $\sigma: M \to L$, giving rise to a strong map $\sigma^*: M \to L^*$, whenever the domain lattice $M$ is also modular. The relation $\sigma^{**} = \sigma$ holds.

If the Poincaré polynomial is modified so as to have two numerical variables, it becomes possible to obtain from the polynomial for a Boolean representation, by simple substitution of variables, the corresponding polynomial for the orthogonal geometry. For graphs, this process converts coboundary enumeration to cycle enumeration 2). The appropriate two-variable polynomial is the coboundary polynomial for any strong map $\sigma: P \to L$, defined by

(6) $\tau(\sigma; \varphi, \nu) = \sum_{x \in L} \varphi^{\lambda(\sigma^{\Delta}(x))} \, p(x; \nu)$.

2) Cf. [2], [7], [8]. "A Ring in Graph Theory" is an important work, in which Tutte calculates what has come to be known as the Grothendieck group, for a category of graphs.


Theorem 9. If $\sigma: P \to L$ is a strong map between geometric lattices $P$, $L$, then

(7) $\tau(\sigma; \varphi, \nu) = \sum_{x,y \in P} \varphi^{\lambda(x)} \, \mu(x, y) \, \nu^{\lambda(1) - \lambda(\sigma(y))}$.

Proof. Directly from Theorem 3. ∎

The application of Theorem 9 to Boolean representations of a geometry, such as the map from subsets $A$ of the set of atoms to $\sup A$ in $L$, is particularly useful.

Corollary 1. If $B = B(X)$ is a finite Boolean algebra and $\sigma: B \to L$ is a strong map into a geometric lattice $L$, then

$\tau(\sigma; \varphi, \nu) = \sum_{x \in L} \varphi^{|x|} \, p(x; \nu) = \sum_{A \subseteq X} (\varphi - 1)^{|A|} \, \nu^{\lambda(1) - \lambda(\sigma(A))}$. ∎

The definition of orthogonality relative to a strong map $\sigma: M \to L$ of a modular geometry onto a geometry $L$ is as follows. The strong map $\sigma: M \to L$ determines a closure $J = \sigma^{\Delta}(\sigma)$ on $M$. There is a unique coclosure $J^*$ on $M$ satisfying, for all $x$, $y$ in $M$ such that $y$ covers $x$,

(8) $y \leq J(x) \Leftrightarrow J^*(y) \nleq x$.

The coclosure $J^*$ determines a quotient lattice $P$ (with order induced by that on $M$), and a map $\tau: M \to P$ which is an inf-homomorphism preserving the relation covers-or-equals. The inverted lattice $\tilde{P}$ is a geometric lattice, so we set $L^* = \tilde{P}$. Let $\sigma^*$ be the associated strong map from $\tilde{M}$ onto $L^*$.

The rank generating function $\varrho$ of a strong map $\sigma: P \to L$, defined by

(9) $\varrho(\sigma; \xi, \eta) = \sum_{x \in P} \xi^{\lambda_L(1) - \lambda_L(\sigma(x))} \, \eta^{\lambda_P(x) - \lambda_L(\sigma(x))}$,

has symmetry [2] relative to orthogonality, whenever $\sigma$ maps a modular geometry onto $L$.

Theorem 10. $\varrho(\sigma^*; \xi, \eta) = \varrho(\sigma; \eta, \xi)$ for any pair $\sigma: M \to L$, $\sigma^*: M \to L^*$ of orthogonal maps of a modular geometry.

Proof. The measurement $\lambda_L(1) - \lambda_L(\sigma(x))$, which provides the exponent of $\xi$ in $\varrho(\sigma)$, enumerates the number of intervals $[y, z]$ of length 1 ("steps") in any maximal chain ("path") from $x$ to $1$ in $M$, for which $z \nleq J(y)$, i.e. $J^*(z) \leq y$. But this is precisely the measurement $\lambda_M(x) - \lambda_{L^*}(\sigma^*(x))$, which provides the exponent of $\eta$ in $\varrho(\sigma^*)$. Similarly, $\lambda_M(x) - \lambda_L(\sigma(x))$ is the number of steps $[y, z]$ in any path from $0$ to $x$ in $M$ for which $z \leq J(y)$, i.e. $J^*(z) \nleq y$. ∎

Corollary 2. If $B = B(X)$ is a finite Boolean algebra and $\sigma: B \to L$ is a strong map into a geometric lattice $L$, then

$\tau(\sigma; \varphi, \nu) = (\varphi - 1)^{\lambda_L(1)} \, \varrho\!\left(\sigma; \frac{\nu}{\varphi - 1}, \varphi - 1\right)$.


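Proof. By Corollary 1,

$\tau(\sigma; \varphi, \nu) = \sum_{A \subseteq X} (\varphi - 1)^{|A|} \, \nu^{\lambda_L(1) - \lambda(\sigma(A))} = (\varphi - 1)^{\lambda_L(1)} \sum_{A \subseteq X} \left(\frac{\nu}{\varphi - 1}\right)^{\lambda_L(1) - \lambda(\sigma(A))} (\varphi - 1)^{|A| - \lambda(\sigma(A))}$. ∎

We may now complete the calculation of the cycle polynomial $\tau(\sigma^*)$ from the coboundary polynomial $\tau(\sigma)$.

Corollary 3. If $B = B(X)$ is a finite Boolean algebra and $\sigma: B \to L$ is a strong map onto a geometric lattice $L$, with orthogonal $\sigma^*: B \to L^*$, then

$\tau(\sigma^*; \varphi, \nu) = (\varphi - 1)^{|1|} \, \nu^{-\lambda(1)} \, \tau\!\left(\sigma; \frac{\nu + \varphi - 1}{\varphi - 1}, \nu\right)$.

Proof. From Corollary 2 we have $\varrho(\xi, \eta) = \eta^{-\lambda(1)} \, \tau(\sigma; \eta + 1, \xi\eta)$. Also $\lambda^*(1) = |1| - \lambda(1)$. Thus

$\tau(\sigma^*; \varphi, \nu) = (\varphi - 1)^{|1| - \lambda(1)} \, \varrho^*\!\left(\frac{\nu}{\varphi - 1}, \varphi - 1\right) = (\varphi - 1)^{|1| - \lambda(1)} \, \varrho\!\left(\varphi - 1, \frac{\nu}{\varphi - 1}\right) = (\varphi - 1)^{|1| - \lambda(1)} \, \nu^{-\lambda(1)} \, (\varphi - 1)^{\lambda(1)} \, \tau\!\left(\sigma; \frac{\nu + \varphi - 1}{\varphi - 1}, \nu\right)$. ∎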

So far we have dealt with representations of geometries as quotients of simpler geometries of higher rank. A few parallel results are available for embeddings of a given geometry as a subgeometry of various larger geometries, usually of equal rank.

Corollary 4. If $\sigma: P \to L$ is a strong map between geometric lattices and if $\iota: L \to N$ is a 1-1 strong map from $L$ into a geometric lattice $N$, then

$\tau(\iota(\sigma); \varphi, \nu) = \nu^{\lambda_N(1) - \lambda_N(\iota(1))} \, \tau(\sigma; \varphi, \nu)$.

Proof.

$\tau(\iota(\sigma); \varphi, \nu) = \sum_{x,y \in L} \varphi^{\lambda_P(\sigma^{\Delta}(x))} \, \mu_L(x, y) \, \nu^{\lambda_N(1) - \lambda_N(\iota(y))} = \nu^{\lambda_N(1) - \lambda_N(\iota(1))} \, \tau(\sigma; \varphi, \nu)$

because $\lambda_N(\iota(y)) = \lambda_L(y)$ for all $y \in L$. ∎

Corollary 5. If $\iota: L \to N$ is a 1-1 strong map from a geometric lattice $L$ into a geometric lattice $N$, then the relation

$\nu^{\lambda_N(1) - \lambda_L(1)} \, p_L(0; \nu) = \sum_{x \in N;\ \iota^{\Delta}(x) = 0} p_N(x; \nu)$

holds. In particular, if $N$ is the lattice of all partitions of the set $H$ of vertices of a graph $G$ and $L$ is the lattice $L(G)$ of closed subsets of the edge set $X$ of $G$, then

$\nu^{|H| - 1 - \lambda(\iota(X))} \, p_L(0; \nu) = \sum_{k=1}^{\infty} n_k (\nu - 1)(\nu - 2) \cdots (\nu - k + 1)$,

where $n_k$ is the number of $k$-part color-partitions of the vertex set $H$ of $G$.

Proof. Evaluate $\tau(\iota; \varphi, \nu)$, simplify by using Corollary 4, and set $\varphi = 0$. ∎


Bibliography

[1] H. H. CRAPO, The Möbius Function of a Lattice. J. Combinatorial Theory 1, 126-131 (1966).
[2] H. H. CRAPO, The Tutte Polynomial. Aequationes Math. (to appear).
[3] H. H. CRAPO, Geometric Duality. Rend. Sem. Mat. Univ. Padova 38, 23-26 (1967).
[4] D. A. HIGGS, Strong Maps of Geometries. J. Combinatorial Theory 6 (1968) (to appear).
[5] O. ORE, Galois Connexions. Trans. Amer. Math. Soc. 55, 493-513 (1944).
[6] G.-C. ROTA, On the Foundations of Combinatorial Theory I. Z. Wahrscheinlichkeitstheorie und verw. Gebiete 2, 340-368 (1964).
[7] W. T. TUTTE, A Ring in Graph Theory. Proc. Cambridge Philos. Soc. 43, 26-40 (1947).
[8] W. T. TUTTE, A Contribution to the Theory of Chromatic Polynomials. Canad. J. Math. 6, 80-91 (1954).

Received 9 November 1967

Author's address: Henry H. Crapo, Department of Mathematics, University of Waterloo, Waterloo, Ontario, Canada


Reprinted from JOURNAL OF COMBINATORIAL THEORY, Vol. 7, No. 3, November 1969. All Rights Reserved by Academic Press, New York and London. Printed in Belgium.

A Generalization of a Combinatorial Theorem of Macaulay

G. F. CLEMENTS AND B. LINDSTRÖM

University of Colorado, Boulder, Colorado 80302, and University of Stockholm, Sweden

Communicated by D. H. Younger

Received May 26, 1968

ABSTRACT

Let $E$ denote the set of all vectors of dimension $n$ ($n \geq 2$) with non-negative integral components. $E$ is ordered in the lexicographic order. Let $E_v$ denote the subset of all vectors in $E$ with component sum $v$. If $H_v$ denotes any subset of $E_v$ let $LH_v$ denote the set of the $|H_v|$ last elements in $E_v$, where $|H_v|$ is the number of elements of $H_v$. Let $PH_v$ denote the set of all vectors of $E_{v+1}$ which are obtained by the addition of 1 to a component of a vector in $H_v$. In [3] Macaulay proved the inclusion $P(LH_v) \subset L(PH_v)$. Sperner gave a shorter proof in [4]. Let $k_1 \leq k_2 \leq \dots \leq k_n$ be given positive integers and let $F$ denote the set of all vectors $(a_1, \dots, a_n)$ with integer components and $0 \leq a_i \leq k_i$, $i = 1, \dots, n$. We shall prove Macaulay's inclusion for subsets $H_v$ of $F_v$ even if the operators $P$ and $L$ are restricted to operate in $F$. This will follow from our theorem. As another application we prove a generalization of the main result in [2]. By a different method Katona proved the theorem when $k_1 = k_2 = \dots = k_n = 1$ (see [1, Theorem 1]).

1. INTRODUCTION AND STATEMENT OF THE RESULTS

Let the integers $k_1, k_2, \dots, k_n$ be given such that $1 \leq k_1 \leq k_2 \leq \dots \leq k_n$. The set of all vectors $(a_1, \dots, a_n)$ of dimension $n$ with integer components $a_i$ for which $0 \leq a_i \leq k_i$, $i = 1, \dots, n$, will be denoted by $F$. Write $(a_1, \dots, a_n) < (b_1, \dots, b_n)$ if $a_1 < b_1$ or if $a_1 = b_1, \dots, a_{i-1} = b_{i-1}$, $a_i < b_i$ for some $i$, $2 \leq i \leq n$ (lexicographic order). Put $k_1 + \dots + k_n = k$. Define $F_v$ as the set of all vectors in $F$ with component sum $v$, $v = 0, 1, \dots, k$. If $H$ is a subset of $F$, put $H_v = H \cap F_v$ and let $|H_v|$ denote the number of elements of $H_v$. The set of the $|H_v|$ first elements of $F_v$ in the lexicographic order will be denoted by $CH_v$ and is called the compression of $H_v$. The set of the last $|H_v|$ elements of $F_v$ is denoted by $LH_v$. If $H$ is any subset of $F$ let

$CH = \bigcup_{v=0}^{k} CH_v$. (1.1)

Let $\Gamma$ be the multivalued function from $F$ into $F$ which associates with $(a_1, \dots, a_n)$ the set

$\{(a_1 - 1, a_2, \dots, a_n), (a_1, a_2 - 1, a_3, \dots, a_n), \dots, (a_1, \dots, a_{n-1}, a_n - 1)\} \cap F$.

If $\Gamma(H) \subset H$ we say that $H$ is closed. Let $P$ be the multivalued function from $F$ to $F$ which associates with $(a_1, \dots, a_n)$ the set

$\{(a_1 + 1, a_2, \dots, a_n), (a_1, a_2 + 1, a_3, \dots, a_n), \dots, (a_1, \dots, a_{n-1}, a_n + 1)\} \cap F$.

Let $S_m$ denote the set of the $m$ first vectors of $F$. Put

$\alpha(a_1, \dots, a_n) = \sum_{i=1}^{n} a_i$ and $\alpha(H) = \sum_{a \in H} \alpha(a)$.

We can now state our results.

THEOREM. If $H_v \subset F_v$ then $\Gamma(CH_v) \subset C(\Gamma H_v)$, $v = 0, 1, \dots, k$.

COROLLARY 1. If $H_v \subset F_v$ then $P(LH_v) \subset L(PH_v)$, $v = 0, 1, \dots, k$.

COROLLARY 2. If $H \subset F$ and $H$ is closed then $CH$ is closed.

COROLLARY 3. If $H \subset F$, $|H| = m$ and $H$ is closed then $\alpha(H) \leq \alpha(S_m)$.

The notions in Corollary 2 were introduced by Lindström and Zetterström in solving a problem of $k$-adic integers [2]. They proved Corollary 2 in the special case $n = 2$ and all $k_i$ equal (see [2, Lemma 1, p. 167]), but convinced themselves by an incorrect example that the corresponding result for $n = 3$ was wrong (they overlooked 102 in their example on page 169). Theorem 1 in [2] is a special case of Corollary 3 when all $k_i$ are equal. The theorem of Macaulay follows from Corollary 1 for fixed $v$ when $k_1 \geq vn$.

The reader will perhaps find the following picture helpful to grasp the theorem. Let $n$, $k_i$ for $i = 1, \dots, n$ and $v$ be fixed positive integers satisfying

$1 \leq v \leq k = \sum_{i=1}^{n} k_i$.

Let the points $(a_1, \dots, a_n)$ with integral coordinates $a_i$ such that $0 \leq a_i \leq k_i$ be locations of buttons which turn on lights at those positions which have non-negative coordinates. Suppose one is required to press $m$ of the buttons in the hyperplane $a_1 + a_2 + \dots + a_n = v$. Which ones should be pressed so as to minimize the number of lights that go on? The minimum will be realized by using the first $m$ buttons in the lexicographic order which lie in the hyperplane. For instance, if $n = 3$, $k_1 = k_2 = k_3 = 4$, and $v = 7$, the relation between the buttons, indicated by small solid circles, and lights, indicated by large open circles, are shown in Figure 1. A button turns on the lights connected to it by a straight line.
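The statement of the theorem, and the buttons-and-lights picture, can be explored by exhaustive search for small parameters. The sketch below is an illustration, not part of the paper: the function names are hypothetical, tuples are compared with Python's built-in lexicographic order, and the search is only feasible when every level $F_v$ is small.

```python
from itertools import combinations, product


def level(k, v):
    """F_v in lexicographic order: vectors with 0 <= a_i <= k_i and coordinate sum v."""
    box = product(*(range(ki + 1) for ki in k))
    return sorted(a for a in box if sum(a) == v)


def gamma(H):
    """Gamma(H): all vectors obtained by lowering one positive coordinate by 1."""
    shadow = set()
    for a in H:
        for i, ai in enumerate(a):
            if ai > 0:
                shadow.add(a[:i] + (ai - 1,) + a[i + 1:])
    return shadow


def compression(H, k, v):
    """C(H): the first |H| vectors of F_v in lexicographic order."""
    return set(level(k, v)[:len(H)])


def theorem_holds(k):
    """Check Gamma(C H_v) within C(Gamma H_v) for every nonempty H_v in F_v, v >= 1."""
    for v in range(1, sum(k) + 1):
        F_v = level(k, v)
        for m in range(1, len(F_v) + 1):
            for H in combinations(F_v, m):
                lhs = gamma(compression(H, k, v))
                rhs = compression(gamma(H), k, v - 1)
                if not lhs <= rhs:
                    return False
    return True


# Buttons and lights for n = 3, k = (4, 4, 4), v = 7: press the first m = 3 buttons.
first_three = level((4, 4, 4), 7)[:3]
print(first_three, len(gamma(first_three)))
print(theorem_holds((1, 2, 2)), theorem_holds((2, 1)))
```

The last call uses bounds that violate $k_1 \leq k_2$, which gives a quick way to see that the ordering hypothesis on the $k_i$ cannot simply be dropped.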

FIGURE 1. $n = 3$, $k_1 = k_2 = k_3 = 4$, $v = 7$.

2. PROOF OF THE COROLLARIES

We shall first prove the corollaries with the aid of the theorem.

PROOF OF COROLLARY 1: Apply the mapping of $F$ onto $F$ which maps $(a_1, \dots, a_n)$ on $(k_1 - a_1, \dots, k_n - a_n)$. $F_v$ is then mapped on $F_{k-v}$. If $H_v$ is mapped on $H'_{k-v}$, then $CH_v$ is mapped on $LH'_{k-v}$ and $\Gamma H_v$ is mapped on $PH'_{k-v}$. The rest of the proof is now obvious.

PROOF OF COROLLARY 2: If $\Gamma(H) \subset H$ then $\Gamma(H_v) \subset H_{v-1}$ for $v = 1, \dots, k$. From the theorem it follows $\Gamma(CH_v) \subset C(H_{v-1})$ and then

$\Gamma(CH) = \Gamma\left(\bigcup_{v=0}^{k} CH_v\right) = \bigcup_{v=1}^{k} \Gamma(CH_v) \subset \bigcup_{v=1}^{k} C(H_{v-1}) \subset CH$.

PROOF OF COROLLARY 3: Assume $CH \neq S_m$ and let $a = (a_1, \dots, a_n)$ be the first element of $F$ which is not an element of $CH$. Let $b = (b_1, \dots, b_n)$ be the last element of $CH$. It follows $a < b$, for $CH \neq S_m$. We shall next prove $\alpha(a) > \alpha(b)$. If $\alpha(a) = \alpha(b)$, $a \in CH$ follows from the definition (1.1) since $a < b$, and $a \notin CH$ is contradicted. We now prove that $\alpha(a) < \alpha(b)$ implies the same contradiction. Since $a < b$ let $a_1 = b_1, \dots, a_{i-1} = b_{i-1}$, $a_i < b_i$ … $> 0$ for some $j > i$ put $d = \alpha(b) - \alpha(a)$ and $d' = d - \min\{d, b_i - a_i - 1\}$. It follows $d' \leq b_{i+1} + \dots + b_n$. We can now subtract $d'$ ones from the integers $b_{i+1}, \dots, b_n$ so that we obtain non-negative integers $c_{i+1}, \dots, c_n$. Put $c_i = b_i - d + d'$ and $c_j = a_j$ for $j = 1, 2, \dots, i - 1$. We now have $c = (c_1, \dots, c_n) \in CH$, for $b \in CH$ and $CH$ is closed by Corollary 2. From $a_i < c_i$ it follows $a < c$. Then we get $a \in CH$, for $c \in CH$ and $\alpha(c) = \alpha(a)$. Thus $\alpha(a) > \alpha(b)$ follows. If we delete $b$ from $CH$ and adjoin $a$ to $CH$, we obtain a set $H'$ such that $\Gamma H' \subset H'$, $CH' = H'$ and $\alpha(H') > \alpha(CH)$. If $H' \neq S_m$ we can repeat the operation $'$ and after $m$ steps, at most, we have $H'^{\cdots\prime} = S_m$ and $\alpha(H'^{\cdots\prime}) > \alpha(CH) = \alpha(H)$, which is the result.

3. PROOF OF THE THEOREM

We shall first define some auxiliary notions. If $H \subset F$, put

$H_{i:d} = \{(a_1, \dots, a_n) \in H;\ a_i = d\}$, $i = 1, 2, \dots, n$; $d = 0, 1, \dots, k_i$.

For subsets $H_v$ of $F_v$ let $(CH_v)_{i:d}$ denote the set of the $|(H_v)_{i:d}|$ first elements of $(F_v)_{i:d}$. We shall say that $H_v$ is $i$-compressed if $(CH_v)_{i:d} = (H_v)_{i:d}$, $d = 0, 1, \dots, k_i$. If $CH_v = H_v$ we say that $H_v$ is compressed.

For any subset $H_v$ of $F_v$ we can define the sequence of sets $H_v^1, H_v^2, \dots, H_v^j, \dots$ by putting $H_v = H_v^1$ and

$H_v^{j+1} = \bigcup_{d=0}^{k_i} (CH_v^j)_{i:d}$, where $i \equiv j \pmod{n}$, $1 \leq i \leq n$. (3.1)

We shall prove five lemmas.

LEMMA 1. One can find $p$ such that $H_v^p$ is $i$-compressed for $i = 1, 2, \dots, n$.

PROOF: Enumerate the elements of $F_v$ in the lexicographic order. If $a \in F_v$ let $n(a)$ be $a$'s number. For any subset $H_v$ of $F_v$ define $n(H_v)$ as the sum of numbers of its elements. It is evident that

$n(H_v^1) \geq n(H_v^2) \geq \dots \geq n(H_v^j) \geq \dots$,

with equality $n(H_v^j) = n(H_v^{j+1})$ only when $H_v^j = H_v^{j+1}$. Since the sequence cannot decrease indefinitely, there must exist a $p$ such that $H_v^p = H_v^{p+1} = \dots$, i.e., $H_v^p$ is $i$-compressed for $i = 1, 2, \dots, n$.

LEMMA 2. If the theorem is true in $n - 1$ dimensions and if $\Gamma H_v \subset H_{v-1}$ ($n$ dimensions), it follows $\Gamma H_v^j \subset H_{v-1}^j$ for $j = 2, 3, \dots$.

PROOF:

The proof is by induction from j to j r(H'/)i:a

+ 1. From

n (F"-I);:
it follows (in n - 1 dimensions)

From rH,,; C H~-1 , we obtain

d

~

1,

and then

d

~

1,

(3.3)

for the left side is the first I(CH'/)i:a I elements of (F,,-I)i:
If we take the union for $d = 0, 1, \dots, k_i$, we obtain by (3.1) $\Gamma H_v^{j+1} \subset H_{v-1}^{j+1}$, and the lemma follows by induction since $\Gamma H_v^1 \subset H_{v-1}^1$.

LEMMA 3. If $H_v$ is compressed, then $\Gamma H_v$ is compressed.

PROOF: Assume $a = (a_1, \dots, a_n)$, $b = (b_1, \dots, b_n)$ are elements in $F_{v-1}$ and $a < b$. Then, if

$(b_1, \dots, b_j + 1, \dots, b_n) \in H_v$, $1 \leq j \leq n$,

we shall prove $a \in \Gamma H_v$. Let $a_1 = b_1, \dots, a_{i-1} = b_{i-1}$, $a_i < b_i$. Then if $j \leq i$, we have

$(a_1, \dots, a_j + 1, \dots, a_n) < (b_1, \dots, b_j + 1, \dots, b_n) \in H_v$.

It follows $(a_1, \dots, a_j + 1, \dots, a_n) \in H_v$ since $H_v$ is compressed. Hence $a \in \Gamma H_v$. If $j > i$, we disregard the first $i - 1$ components and assume $a_1 < b_1$. If $a_\nu < k_\nu$ for some $\nu > 1$, or if $\nu = 1$ and $a_\nu < b_\nu - 1$, we have $(a_1, \dots, a_\nu + 1, \dots, a_n) < (b_1, \dots, b_j + 1, \dots, b_n) \in H_v$ and then $a \in \Gamma H_v$. If $a = (b_1 - 1, k_2, \dots, k_n)$, we find that $b = (b_1, k_2, \dots, k_\nu - 1, \dots, k_n)$ for some $\nu > 1$ and then $j = 1$ or $\nu$. Hence $(b_1 + 1, k_2, \dots, k_\nu - 1, \dots, k_n) \in H_v$ or $(b_1, k_2, \dots, k_n) \in H_v$. It follows $(b_1, k_2, \dots, k_n) \in H_v$ and $a \in \Gamma H_v$.

LEMMA 4. Let n ~ 3. Assume that g = (gl , ... , gn) and h = (hi , ... , hn) are elements in F", g < hand hn = 0 or gn = k n . Then ifh E S, and Sis i-compressed for i = 1, 2, n, it follows g E S. PROOF: The conclusion follows if we can find an increasing sequence of vectors in Fv beginning in g and ending in h such that any two consecutive vectors have an i-th component equal (i = 1,2 or n). First assume gn = k n • Consider three cases:

(lo) gl

=

(20 ) If gl

hi is trivial.

<

hi and gi

> 0 for

some i, 2 ~ i ~ n - I, we have

where and is as small as possible in the lexicographic order. We find then

(gl

+ 1, g~ ,... , g~-l , gn)

~ (hi,

h2 ,... , hn ),

if gl

The sequence follows from (3.4) and (3.5) if gl

421

+1=

+1=

hI .

hi.

(3.5)

If $g_1 + 1 < h_1$ and $g_i' > 0$ for some $i \geq 2$ proceed as in (2°). If $g_1 + 1 < h_1$ and $g_2' = \dots = g_{n-1}' = 0$ proceed to (3°).

(3°) If $g_1 < h_1$ and $g_2 = g_3 = \dots = g_{n-1} = 0$, we obtain the inequalities

$(g_1, g_2, \dots, g_n) < (h_1, g_2, \dots, g_{n-1}, g_n - h_1 + g_1) \leq (h_1, \dots, h_n)$. (3.6)

The second vector belongs to $F_v$ since $h_1 - g_1 \leq k_1 \leq k_n = g_n$. We have proved that if $g_n = k_n$ one can find the desired sequence of vectors. Then assume $h_n = 0$. Apply the preceding result to the vectors $(k_1 - h_1, \dots, k_n - h_n) < (k_1 - g_1, \dots, k_n - g_n)$.

LEMMA 5. The theorem is true when $n = 2$.

PROOF: A subset $B = \{(a_1, a_2), (a_1 + 1, a_2 - 1), \dots, (a_1 + c, a_2 - c)\}$ of $F_v$ is called a block. A block which contains the first element of $F_v$ is called an initial block. If $B$ is any block and $B_0$ is an initial block, we easily obtain in all cases, since $k_1 \leq k_2$,

$|\Gamma B| - |B| \geq |\Gamma B_0| - |B_0|$.

(This is not true when $k_1 \geq v > k_2$.) If $B_1, \dots, B_r$ are all the maximal blocks which are subsets of $H_v$, it follows $\Gamma B_i \cap \Gamma B_j = \emptyset$ if $i \neq j$ and then

$|\Gamma H_v| - |H_v| = \sum_{i=1}^{r} (|\Gamma B_i| - |B_i|) \geq |\Gamma B_0| - |B_0|$

for any initial block $B_0$. In particular when $B_0 = CH_v$, we have $|\Gamma H_v| \geq |\Gamma(CH_v)|$ and $\Gamma(CH_v) \subset C(\Gamma H_v)$, since $\Gamma(CH_v)$ is compressed.

PROOF OF THE THEOREM: The theorem will be proved by induction from $n - 1$ to $n$. It is true for $n = 2$ by Lemma 5. Assume that the theorem is true in $n - 1$ dimensions. Let $H_v$ be any subset of $F_v$ and consider $H_v^j$ and $(\Gamma H_v)^j$ for $j = 2, 3, \dots$. By Lemma 1 we can determine $p$ such that $S = H_v^p$ is $i$-compressed for $i = 1, \dots, n$. Put $(\Gamma H_v)^p = T$ for abbreviation. From Lemma 2 it follows $\Gamma S \subset T$.

(3.7)

To complete the proof we show how to alter $S$ to $CH_v$ and $T$ to a subset of $C(\Gamma H_v)$ in such a way that $\Gamma(CH_v) \subset C(\Gamma H_v)$ is obtained. First, if $S = F_v$ then $|S| = |CH_v|$ shows that $CH_v = F_v$. Also $\Gamma(S) = \Gamma(F_v) = F_{v-1} \subset T$ implies $T = F_{v-1}$. Then since $|T| = |C(\Gamma H_v)|$, it follows $C(\Gamma H_v) = F_{v-1}$, and hence $\Gamma(CH_v) = \Gamma(F_v) = F_{v-1} = C(\Gamma H_v)$, and we are done.

If $S \neq F_v$ there is a first vector $g = (g_1, \dots, g_n)$ of $F_v$ which is not in $S$. Let $h = (h_1, \dots, h_n)$ denote the last vector of $S$. If $h < g$, then $S = CH_v$, and so it is no loss of generality to assume $h > g$. It follows from Lemma 4 that $h_n > 0$, since $S$ is $i$-compressed for $i = 1, 2, \dots, n$. Define

$h^* = (h_1, \dots, h_{n-1}, h_n - 1)$ and $g^* = (g_1, \dots, g_{n-1}, g_n - 1)$ if $g_n > 0$.

Note that $h^*, g^* \in F_{v-1}$. From (3.7) it follows $h^* \in T$. If $x \in S - \{h\}$, then $\Gamma(x) < x < h$ (where $\Gamma(x)$ denotes any image of $x$). This shows that $\Gamma(x) = h^*$ for no $x \in S - \{h\}$. Now let

$S' = (S - \{h\}) \cup \{g\}$,

$T' = (T - \{h^*\}) \cup \{g^*\}$ if $g_n > 0$, $T' = T - \{h^*\}$ if $g_n = 0$.

We now show that $\Gamma S' \subset T'$. Since $\Gamma(x) = h^*$ for no $x \in S - \{h\}$, it suffices to show that $\Gamma(g) \subset T'$. If $g_n > 0$ then $g^*$ is an image of $g$ under $\Gamma$, which is in $T'$ by construction. Observe that $g_n < k_n$. This follows from Lemma 4 since $S$ is $i$-compressed for $i = 1, 2, \dots, n$. If $g_i > 0$ for some $i$, $1 \leq i \leq n - 1$, then $(g_1, \dots, g_i - 1, \dots, g_n)$ is an image of $g$ under $\Gamma$. But it is also an image of $(g_1, \dots, g_i - 1, \dots, g_n + 1)$, which is in $S$ because it precedes $g$ and $g$ was the smallest element of $F_v$ not in $S$. Since $\Gamma(S) \subset T$, it follows $(g_1, \dots, g_i - 1, \dots, g_n) \in T$. Also, because $h_1 > g_1$ ($S$ is 1-compressed), we find $h^* \neq (g_1, \dots, g_i - 1, \dots, g_n)$ so $(g_1, \dots, g_i - 1, \dots, g_n)$ is (still) in $T'$. Thus all images of $g$ are in $T'$, so $\Gamma(S') \subset T'$.

Obviously, $S'$ is $i$-compressed for $i = 1, 2, \dots, n$. After a finite number of applications of $'$, we have $S'^{\cdots\prime} = CH_v$ and $\Gamma(CH_v) \subset T'^{\cdots\prime} = U$. Now $C(\Gamma H_v)$ is the first $|\Gamma H_v|$ elements of $F_{v-1}$ while $\Gamma(CH_v)$ is the first $|\Gamma(CH_v)|$ elements of $F_{v-1}$ by Lemma 3. But $|\Gamma(CH_v)| \leq |U| = |T| = |\Gamma H_v|$. It follows $\Gamma(CH_v) \subset C(\Gamma H_v)$, and the theorem is proved.

4.

CONCLUDING REMARKS

We can show that p = 4 suffices in Lemma 1 when n = 3. The proof is rather long since the number of cases which one must consider is large. We recently noticed that Katona [1] has proved our theorem when kl = k2 = ... = k n = 1. Katona puts his result in the language of set theory. He observes that "the theorem is probably useful in proofs by induction over the maximal number of elements of the subsets in a system, as was Sperner's lemma in his paper" [5].


REFERENCES

1. G. KATONA, A Theorem of Finite Sets, Theory of Graphs (Proceedings of the colloquium held at Tihany, Hungary, September 1966), ed. by P. Erdős and G. Katona, Academic Press, New York and London, 1968.
2. B. LINDSTRÖM AND H.-O. ZETTERSTRÖM, A Combinatorial Problem in the k-adic Number System, Proc. Amer. Math. Soc. 18 (1967), 166-170.
3. F. S. MACAULAY, Some Properties of Enumeration in the Theory of Modular Systems, Proc. London Math. Soc. 26 (1927), 531-555.
4. E. SPERNER, Über einen kombinatorischen Satz von Macaulay und seine Anwendung auf die Theorie der Polynomideale, Abh. Math. Sem. Univ. Hamburg 7 (1930), 149-163.
5. E. SPERNER, Ein Satz über Untermengen einer endlichen Menge, Math. Z. 27 (1928), 544-548.

PRINTED IN BRUGES, BELGIUM, BY THE ST. CATHERINE PRESS, LTD.


Reprinted from: JOURNAL OF MATHEMATICAL PHYSICS, VOLUME 11, NUMBER 6, JUNE 1970

Short Proof of a Conjecture by Dyson

I. J. GOOD
Department of Statistics, Virginia Polytechnic Institute, Blacksburg, Virginia

(Received 26 December 1969)

Dyson made a mathematical conjecture in his work on the distribution of energy levels in complex systems. A proof is given which is much shorter than two that have been published before.

Let $G(\mathbf{a})$ denote the constant term in the expansion of

\[ F(\mathbf{x}; \mathbf{a}) = \prod_{i \ne j} \left(1 - \frac{x_j}{x_i}\right)^{a_j}, \qquad i, j = 1, 2, \ldots, n, \]

where $a_1, a_2, \ldots, a_n$ are nonnegative integers and where $F(\mathbf{x}; \mathbf{a})$ is expanded in positive and negative powers of $x_1, x_2, \ldots, x_n$. Dyson$^1$ conjectured that $G(\mathbf{a}) = M(\mathbf{a})$, where $M(\mathbf{a})$ is the multinomial coefficient $(a_1 + \cdots + a_n)!/(a_1! \cdots a_n!)$. This was proved by Gunson$^2$ and by Wilson.$^3$ A much shorter proof is given here.

By applying Lagrange's interpolation formula (see, for example, Kopal$^4$) to the function of $x$ that is identically equal to 1 and then putting $x = 0$, we see that

\[ \sum_{j} \prod_{i \ne j} \left(1 - \frac{x_j}{x_i}\right)^{-1} = 1. \]

By multiplying $F(\mathbf{x}; \mathbf{a})$ by this function we see that, if $a_j \ne 0$, $j = 1, \ldots, n$, then

\[ F(\mathbf{x}; \mathbf{a}) = \sum_{j} F(\mathbf{x}; a_1, a_2, \ldots, a_{j-1}, a_j - 1, a_{j+1}, \ldots, a_n), \]

so that

\[ G(\mathbf{a}) = \sum_{j} G(a_1, \ldots, a_{j-1}, a_j - 1, a_{j+1}, \ldots, a_n). \tag{1} \]

If $a_j = 0$, then $x_j$ occurs only to negative powers in $F(\mathbf{x}; \mathbf{a})$, so that $G(\mathbf{a})$ is then equal to the constant term in

\[ F(x_1, \ldots, x_{j-1}, x_{j+1}, \ldots, x_n;\; a_1, \ldots, a_{j-1}, a_{j+1}, \ldots, a_n); \]

that is,

\[ G(\mathbf{a}) = G(a_1, \ldots, a_{j-1}, a_{j+1}, \ldots, a_n), \quad \text{if } a_j = 0. \tag{2} \]

Also, of course,

\[ G(\mathbf{0}) = 1. \tag{3} \]

Equations (1)-(3) clearly uniquely define $G(\mathbf{a})$ recursively. Moreover, they are satisfied by putting $G(\mathbf{a}) = M(\mathbf{a})$. Therefore $G(\mathbf{a}) = M(\mathbf{a})$, as conjectured by Dyson.

$^1$ F. J. Dyson, J. Math. Phys. 3, 140, 157, 166 (1962).
$^2$ J. Gunson, J. Math. Phys. 3, 752 (1962).
$^3$ K. G. Wilson, J. Math. Phys. 3, 1040 (1962).
$^4$ Z. Kopal, Numerical Analysis (Chapman and Hall, London, 1955), p. 21.
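The argument can be checked numerically for small exponents. The sketch below is an illustration only, with hypothetical function names: it expands the product of factors $(1 - x_j/x_i)^{a_j}$ as a Laurent polynomial stored in a dictionary, reads off the constant term, and compares it with the multinomial coefficient and with the recursion defined by (1)-(3).

```python
from functools import lru_cache
from math import factorial


def dyson_constant_term(a):
    """Constant term of prod_{i != j} (1 - x_j / x_i)^{a_j} by direct expansion."""
    n = len(a)
    zero = (0,) * n
    poly = {zero: 1}  # exponent vector -> integer coefficient
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            for _ in range(a[j]):
                new = {}
                for exp, c in poly.items():
                    # term from the "1" in (1 - x_j / x_i)
                    new[exp] = new.get(exp, 0) + c
                    # term from "- x_j / x_i"
                    e = list(exp)
                    e[j] += 1
                    e[i] -= 1
                    e = tuple(e)
                    new[e] = new.get(e, 0) - c
                poly = {e: c for e, c in new.items() if c != 0}
    return poly.get(zero, 0)


def multinomial(a):
    """(a_1 + ... + a_n)! / (a_1! ... a_n!)."""
    result = factorial(sum(a))
    for ai in a:
        result //= factorial(ai)
    return result


@lru_cache(maxsize=None)
def good_recursion(a):
    """G(a) computed only from relations (1), (2) and (3); `a` must be a tuple."""
    if not any(a):                      # relation (3), also covers the empty tuple
        return 1
    if 0 in a:                          # relation (2): drop a zero entry
        j = a.index(0)
        return good_recursion(a[:j] + a[j + 1:])
    # relation (1): all entries are nonzero
    return sum(good_recursion(a[:j] + (a[j] - 1,) + a[j + 1:]) for j in range(len(a)))
```

A call such as `dyson_constant_term((2, 1, 1))` can then be compared with `multinomial((2, 1, 1))` and `good_recursion((2, 1, 1))`.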

Reprinted from ADVANCES IN MATHEMATICS, Vol. 5, No. 1, August 1970
All Rights Reserved by Academic Press, New York and London
Printed in Belgium

On a Lemma of Littlewood and Offord on the Distributions of Linear Combinations of Vectors*

DANIEL J. KLEITMAN

Department of Mathematics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139

In this paper we prove the following result:

THEOREM I. Let $a_1, \ldots, a_N$ be vectors in a Hilbert space $S$, each with length at least unity. The number of their linear combinations with coefficients 0 or 1 that can lie in the union of any $k$ regions $R_1, \ldots, R_k$ in $S$, each of diameter less than ($<$) unity, is not more than the sum of the $k$ largest binomial coefficients on $N$.
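As a sanity check of the statement in the simplest setting, the sketch below restricts to real numbers (a one-dimensional Hilbert space) and a single region ($k = 1$). Both restrictions, and the function name, are assumptions made for illustration. In one dimension a region of diameter less than 1 contains only sums whose largest and smallest values differ by less than 1, so a sliding window over the sorted sums finds the best region.

```python
from itertools import product
from math import comb


def max_sums_in_small_region(a):
    """Largest number of 0/1 combinations of the reals in `a` that fit in a
    set of diameter strictly less than 1."""
    sums = sorted(
        sum(c * x for c, x in zip(coeffs, a))
        for coeffs in product((0, 1), repeat=len(a))
    )
    best = 0
    lo = 0
    for hi in range(len(sums)):
        # shrink the window until its diameter is strictly below 1
        while sums[hi] - sums[lo] >= 1:
            lo += 1
        best = max(best, hi - lo + 1)
    return best
```

With all entries equal, for example `max_sums_in_small_region([1, 1, 1, 1, 1])`, the count reaches the central binomial coefficient `comb(5, 2)`.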

This result is related to a problem of Littlewood and Offord [1] on the distribution of roots of algebraic equations. Various special cases have been obtained by several authors [2-4]. Theorem I settles a long-standing conjecture of P. Erdős [2]. The method of proof can be extended straightforwardly to prove the following generalization.

THEOREM II. Let $a_1, \ldots, a_N$ be vectors in a Hilbert space $S$, each of length at least unity, and let $m_1, M_1, \ldots, m_N, M_N$ be integers. Then the number of linear combinations $\sum_{i=1}^{N} c_i a_i$ with coefficients $c_i$ integral and in $[m_i, M_i]$ that lie in the union of any $k$ regions in $S$, each of diameter less than ($<$) one, is no more than the number of such linear combinations whose "weights" ($\sum c_i$) are among the $k$ "most populous" weights. This is the number of linear combinations satisfying

* This work was supported in part by NSF GP-13778.

As this bound can be achieved by choosing all a's identical, the result is best possible.

We present the proof in detail for Theorem I only. Parallel steps for Theorem II are easily obtained. We can, without loss of generality, assume that our regions $R_j$ are mutually disjoint. We do so below.

We first note a well-known property of binomial coefficients. The sum of the $k$ largest binomial coefficients on $N$ is equal to the sum of the $k + 1$ and the $k - 1$ largest binomial coefficients on $N - 1$. To prove this we note that the $k$ largest binomial coefficients on $N$ are $\binom{N}{r}, \binom{N}{r+1}, \ldots, \binom{N}{s}$, where $r = [(N - k + 1)/2]$ and $s = r + k - 1 = [(N + k - 1)/2]$. Applying the recursion $\binom{N}{j} = \binom{N-1}{j} + \binom{N-1}{j-1}$ to each of these coefficients yields the desired relation, viz.,

\[ \sum_{j=r}^{s} \binom{N}{j} = \sum_{j=r-1}^{s} \binom{N-1}{j} + \sum_{j=r}^{s-1} \binom{N-1}{j}. \]
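The binomial property just stated is easy to confirm numerically. The following sketch (function name hypothetical) sorts the coefficients on $N$ and compares the two sides for a range of $N$ and $k$.

```python
from math import comb


def sum_k_largest(n, k):
    """Sum of the k largest binomial coefficients C(n, 0), ..., C(n, n).
    If k exceeds n + 1, all coefficients are summed."""
    coeffs = sorted((comb(n, j) for j in range(n + 1)), reverse=True)
    return sum(coeffs[:k])


def identity_holds(n, k):
    """Check: k largest on n  ==  (k + 1 largest) + (k - 1 largest) on n - 1."""
    return sum_k_largest(n, k) == sum_k_largest(n - 1, k + 1) + sum_k_largest(n - 1, k - 1)
```

For example, with $N = 5$ and $k = 2$ the left side is $10 + 10 = 20$ and the right side is $(6 + 4 + 4) + 6 = 20$.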

We prove our theorem by induction on $N$. In light of the property just described we need only show that (0, 1)-linear combinations of $(a_1, \ldots, a_N)$ lying in $k$ disjoint regions of diameter $< 1$ can be put in 1:1 correspondence with (0, 1)-linear combinations of $(a_1, \ldots, a_{N-1})$ lying either in $k + 1$ or $k - 1$ such regions.

Now (0, 1)-linear combinations of $(a_1, \ldots, a_N)$ have coefficient of $a_N$ either zero or one. In the former case they may be considered as (0, 1)-linear combinations of $(a_1, \ldots, a_{N-1})$ as they stand. They must lie in our $k$ disjoint regions as sums of $(a_1, \ldots, a_{N-1})$ if they are to do so as sums of $(a_1, \ldots, a_N)$. In the latter case, involving linear combinations explicitly containing $a_N$, the condition that they lie in our $k$ disjoint regions is that their $(a_1, \ldots, a_{N-1})$ parts lie in the translation of these regions by $-a_N$.

If we show that the translation by $-a_N$ of at least one of our regions is disjoint from each of our $k$ original regions we are done, since sums lying in our original $k$ regions with $a_N$ correspond to sums without $a_N$ lying in these regions plus our translated region ($k + 1$ disjoint regions) or lying in the $k - 1$ other translated regions, which are themselves mutually disjoint and of diameter $< 1$.

If we consider a hyperplane normal to $a_N$ placed so that all regions $R_1, \ldots, R_k$ lie on one side of it (the side on which $x \cdot a_N \ge 0$) and so that it just touches the closure of some $R_j$, the translation of $R_j$ by $-a_N$ must lie on the other side of the hyperplane and hence must be disjoint from the regions $R_1, \ldots, R_k$. This remark completes the proof.


If we were concerned with linear combinations as in Theorem II we could proceed in the same manner. If the coefficient of $a_N$ can take on $M_N - m_N$ different values we obtain $(M_N - m_N)k$ regions in terms of $(a_1, \ldots, a_{N-1})$ to correspond to the $k$ regions $R_1, \ldots, R_k$. The desired recursion relation is obtained if these can be divided into disjoint families of regions of sizes $k + (M_N - m_N)$, $k + (M_N - m_N) - 2$, $k + (M_N - m_N) - 4, \ldots$. The hyperplane construction above permits us to find such families. That is, if $m_N = 0$ the original $k$ regions along with the translations of $R_j$ defined by $-a_N, -2a_N, \ldots, -M_N a_N$ form $k + M_N$ disjoint regions. Another disjoint family can be obtained by throwing away these and repeating the procedure just described. Iteration of this procedure produces disjoint families of regions which by induction yield the recursion satisfied by the $k$ most populous weights.

ACKNOWLEDGMENT

The author thanks R. Graham who suggested the geometric hyperplane interpretation of the argument described above.

REFERENCES

1. J. E. LITTLEWOOD AND C. OFFORD, On the number of real roots of a random algebraic equation (III), Mat. USSR Sb. 12 (1943), 277-285.
2. P. ERDŐS, On a lemma of Littlewood and Offord, Bull. Amer. Math. Soc. 51 (1945), 898-902.
3. D. KLEITMAN, On a lemma of Littlewood and Offord on the distribution of certain sums, Math. Z. 90 (1965), 251-259.
4. G. KATONA, On a conjecture of Erdős and a stronger form of Sperner's theorem, Studia Sci. Math. Hungar. 1 (1966), 59-63.

PRINTED IN BELGIUM BY THE ST. CATHERINE PRESS, LTD., TEMPELHOF 37, BRUGES

Ramsey's Theorem for a Class of Categories

R. L. GRAHAM
Bell Telephone Laboratories, Incorporated, Murray Hill, New Jersey

K. LEEB
Universität Erlangen, Erlangen, Germany

AND B. L. ROTHSCHILD
University of California, Los Angeles, California 90024

DEDICATED TO RICHARD RADO ON THE OCCASION OF HIS 65TH BIRTHDAY

1. INTRODUCTION AND BASIC TERMINOLOGY

In this paper we present a Ramsey theorem for certain categories which is sufficiently general to include as special cases the finite vector space analog to Ramsey's theorem (conjectured by Gian-Carlo Rota), the Ramsey theorem for n-parameter sets [2], as well as Ramsey's theorem itself [4, 6]. The Ramsey theorem for finite affine spaces is obtained here simultaneously with that for vector spaces. That these two are equivalent was already known [5, 1], and the arguments previously used to show that the affine theorem implies the projective theorem are also special cases of the results of this paper.

The argument used here to establish the main result is essentially the same as that used for n-parameter sets [2]. What we do here is to abstract the properties of n-parameter sets which suffice to allow the induction argument. In particular, the properties described for n-parameter sets in Remarks 1-3 of [2] are essential.

In order to state the Ramsey property for a category $C$ we must have a notion of rank with which to index the objects and subobjects of the category. To this end, it is convenient to consider henceforth only categories $C$ with the following property:

(a) The objects of $C$ are the nonnegative integers 0, 1, 2, ..., and if $l > k$, $C(l, k) = \emptyset$, where $C(l, k)$ is the set of all morphisms from $l$ to $k$ in $C$.

Using this property, we define a rank on subobjects of an object $l$ in $C$. Namely, if $k \xrightarrow{f} l$ and $k' \xrightarrow{f'} l$ are representatives of the same subobject of $l$, then there must be isomorphisms $k \to k'$ and $k' \to k$. But by (a), this means that $k = k'$. We define the rank of this subobject to be $k$, and we refer to it as a $k$-subobject of $l$. We denote by $C{l \brack k}$ the class of subobjects of $l$ in $C$ of rank $k$. We make the convention that for $k < 0$, or $l < 0$, $C{l \brack k} = \emptyset$.

In order to make our induction argument work, we need a finiteness condition. We assume in addition to (a) that all categories considered here satisfy:

(b) For each pair of integers $k, l$ there is an integer $y_{k,l}$ such that $C{l \brack k}$ is a finite set with $y_{k,l}$ elements. In particular, $y_{0,0} = 1$.

For convenience, all categories we consider are assumed to satisfy

(c) All morphisms of $C$ are monomorphisms.

If $k \xrightarrow{f} l$ is a morphism of $C$, we let $\bar f$ denote the induced mapping from subobjects of $k$ to subobjects of $l$. That is, if $s \xrightarrow{g} k$ represents a subobject of $k$, then $\bar f$ takes this subobject into the subobject of $l$ represented by the composition $fg$. This is clearly well defined, and $\bar f : C{k \brack s} \to C{l \brack s}$.

An $r$-coloring of $C{l \brack s}$ is a function $c : C{l \brack s} \to \{1, \ldots, r\}$. We say that a subobject has color $i$ if its image under $c$ is $i$. An $r$-coloring $c$ of $C{l \brack s}$ induces an $r$-coloring on $C{k \brack s}$ by the composition $c\bar f$, where $k \xrightarrow{f} l$ is in $C$. If the image of $c\bar f$ is only a single element, we say that $c$ has a monochromatic $k$-subobject, namely, the $k$-subobject represented by $k \xrightarrow{f} l$.

We can now state the Ramsey property for a category $C$ satisfying (a)-(c):

Given integers $k, l, r$, there exists a number $n$, depending only on $k, l, r$, so that for all $m \ge n$, every $r$-coloring of $C{m \brack k}$ has a monochromatic $l$-subobject.

When $C$ has morphisms $k \to l$ which are all the monomorphic functions from $\{1, \ldots, k\}$ to $\{1, \ldots, l\}$, then this is just the statement of Ramsey's Theorem. If $C$ has morphisms $k \to l$ which are the linear monomorphisms from $V_k = \langle v_1, \ldots, v_k \rangle$ to $V_l = \langle v_1, \ldots, v_l \rangle$, where $v_1, v_2, \ldots$ form a basis for a vector space $V$ over $GF(q)$, then this is the statement of Rota's conjecture. In this case, the $k$-subobjects of $l$ correspond to the subspaces of $V_l$ of dimension $k$. Other examples of special cases of the Ramsey property will be given later.

2. STATEMENT OF THE MAIN RESULT

In order to establish the Ramsey property for certain categories $C$, we consider a somewhat stronger version of it which makes the induction argument easier.

$C(k; l_1, \ldots, l_r)$: There is a number $N = N_C(k; r; l_1, \ldots, l_r)$ depending only on $k, r, l_1, \ldots, l_r$, such that for any $m \ge N$ and any $r$-coloring $c$ of $C{m \brack k}$, there is an $i$, $1 \le i \le r$, and a morphism $l_i \xrightarrow{\phi} m$ such that

\[
\begin{array}{ccc}
C{l_i \brack k} & \xrightarrow{\;\bar\phi\;} & C{m \brack k} \\
\downarrow & & \downarrow {\scriptstyle c} \\
\{i\} & \xrightarrow{\;\mathrm{incl}\;} & \{1, \ldots, r\}
\end{array}
\]

commutes, where $\mathrm{incl}(i) = i$. This statement always holds for $k < 0$, since $C{m \brack k} = \emptyset$, by convention.

If all the $l_i$ are equal, this becomes the Ramsey property stated above. Theorem 1 below provides the induction step in establishing $C(k; l_1, \ldots, l_r)$ for certain categories. It establishes $B(k + 1; l_1, \ldots, l_r)$ if we know $A(k; l_1, \ldots, l_r)$ for all $l_1, \ldots, l_r$ and $r$, provided the categories $A$ and $B$ are related in a special way. This relation is given by the conditions below. For a functor $M$ from $A$ to $B$ with $M(x) = y$ for integers $x$ and $y$, we denote by $\bar M$ the induced function from subobjects of $x$ to subobjects of $y$. This is given by letting $\bar M$ take the subobject represented by $s \xrightarrow{f} x$ in $A$ into the subobject represented by $M(s) \xrightarrow{M(f)} y$ in $B$.

Conditions on Categories A and B

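There is a functor $M$ from $A$ to $B$ with $M(l) = l + 1$, $l = 0, 1, \ldots$, a functor $P$ from $B$ to $A$ with $P(l) = l$, $l = 0, 1, \ldots$, an integer $t \ge 0$, and for each $l = 0, 1, \ldots$, $t$ morphisms $l \xrightarrow{\phi_{l,j}} l + 1$, $1 \le j \le t$, satisfying the following:

I. For each $k + 1 = 0, 1, 2, \ldots$, the diagonal $d$ in the following diagram is epic, where $\coprod$ (together with the indicated injections) is coproduct, and $d$ is the unique map determined by the coproduct to make the diagram commute:

\[
d : A{l \brack k} \;\amalg\; \coprod_{j=1}^{t} B{l \brack k+1} \longrightarrow B{l+1 \brack k+1},
\qquad d = \bar M \text{ on } A{l \brack k}, \quad d = \bar\phi_{l,j} \text{ on the } j\text{th copy of } B{l \brack k+1}.
\]

II. For each $s \xrightarrow{g} l$ in $B$ and each $j = 1, \ldots, t$ the following diagram commutes:

\[
\begin{array}{ccc}
s & \xrightarrow{\;\phi_{s,j}\;} & s + 1 \\
{\scriptstyle g} \downarrow & & \downarrow {\scriptstyle M(P(g))} \\
l & \xrightarrow{\;\phi_{l,j}\;} & l + 1
\end{array}
\]

III. For some $l \xrightarrow{e} l + 1$ in $A$, the following diagram commutes for all $j = 1, \ldots, t$; that is, the two composites from $l$ to $l + 2$ agree:

\[ \phi_{l+1,j}\,\phi_{l,j} = M(e)\,\phi_{l,j}. \]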

Remark. Let $s + 1 \xrightarrow{h} l$ in $B$. Then by III there is some $s \xrightarrow{e} s + 1$ in $A$ such that

\[ \phi_{s+1,j}\,\phi_{s,j} = M(e)\,\phi_{s,j} \]

in $B$ for each $j$. By II,

\[ \phi_{l,j}\,h = M(P(h))\,\phi_{s+1,j} \]

for each $j$. Thus

\[ \phi_{l,j}\,h\,\phi_{s,j} = M(P(h)e)\,\phi_{s,j} \]

for each $j$.

THEOREM 1. Let $A$ and $B$ be two categories satisfying the conditions above. Assume $A(k; l_1, \ldots, l_r)$ holds for all $l_1, \ldots, l_r$ and $r > 0$. Then $B(k + 1; l_1, \ldots, l_r)$ holds for all $l_1, \ldots, l_r$ and $r > 0$.

3. PROOF OF MAIN RESULT

We will eventually need a lemma about n-dimensional arrays of points. We state it now without proof. Proofs can be found in [3] and [2]. (It is a special case of Corollary 4 below, in fact.) We denote by $A^n$ the set of n-tuples $(x_1, \ldots, x_n)$ of elements $x_i$ of a set $A$.

LEMMA 1. Given integers $r > 0$, $t \ge 0$, there exists an integer $N = N(r, t)$, depending only on $r$ and $t$, such that if $n \ge N$, $A$ is a set of $t$ elements, and $A^n$ is $r$-colored in any way, then there exists a set of $t$ n-tuples $(x_1(j), \ldots, x_n(j))$, $1 \le j \le t$, all the same color with the property that for each $i$, $1 \le i \le n$, either $x_i(j) = j$ for all $j$, or $x_i(j) = a_i$ for all $j$ and some $a_i \in A$.
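For very small parameters Lemma 1 can be verified exhaustively. The sketch below is illustrative only: it takes $A = \{1, \ldots, t\}$, requires at least one coordinate with $x_i(j) = j$ (so that the $t$ tuples are distinct), and tries every coloring, which is feasible only for tiny $r$, $t$, $n$. The function names are hypothetical.

```python
from itertools import product


def combinatorial_lines(t, n):
    """Each line is a list of t n-tuples: a coordinate marked None moves with j,
    every other coordinate is held at a constant from {1..t}."""
    lines = []
    for template in product([None] + list(range(1, t + 1)), repeat=n):
        if None not in template:
            continue  # at least one moving coordinate is required here
        lines.append(
            [tuple(j if c is None else c for c in template) for j in range(1, t + 1)]
        )
    return lines


def every_coloring_has_line(r, t, n):
    """True if every r-coloring of {1..t}^n contains a monochromatic line."""
    points = list(product(range(1, t + 1), repeat=n))
    lines = combinatorial_lines(t, n)
    for colors in product(range(r), repeat=len(points)):
        coloring = dict(zip(points, colors))
        if not any(len({coloring[p] for p in line}) == 1 for line in lines):
            return False
    return True
```

With $r = 2$ and $t = 2$, dimension $n = 1$ is not enough but $n = 2$ is, so under these assumptions $N(2, 2) = 2$.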

Proof

of

Theorem

1. We use induction on L -II + ... +1,. holds vacuously if I, < k + 1 for any i or if k + 1 < 0 and trivially if 1 - O. SO we assume Ii ~ k + 1 ~ 0 and 1 > o. If any I, - 0, then k + 1 - 0, and B(k + 1;/ 1 , . . . , I,) holds trivially, since Yo.o - I. So we may assume all I, > 0, and, in particular, that L > O. Assume, then, that B(k + 1;/ 1 , . . . , I,) holds for L - I, and let II + ... +1, - L, I, > o. B(k + 1;/ 1 , ••• , I,)

DEFINITION. For I";; h ..;; m, suppose k + 1 --' I + h is in B, and f - M for some k --,' I + h - 1 in A. For any fixed choice of jh, jHI , ... ,jm -I, 1 ..;; j; ..;; I, let "', - "'1+, .j' Then the (k + I)-subobject of I + m represented by the composition

«)

.m-

I f .h k + 1 --71 + h --71 + h + 1 --7 ... --71 + m - 1 --71 + m

is said to have signature (h ;jm - 1 , ... ,jh) with respect to I and m. (The signature need not be unique for a given subobject, nor must every subobject have a signature.) An ,-coloring of B [t~~] such that all (k + 1)subobjects with the same signature have the same color is called an {/ , m)-c%,ing.

For integers / and prove Lemma 2 below.

m

we define recursively some numbers needed to

VI - NA (k;"m- I ; /, ... , J)

Vm -

N A (k; ,,0;

Vm -

1 + 1 , ... ,

Vm -

1+

I) .

The existence of these numbers is guaranteed by the hypothesis of Theorem 1.

With the same assumptions as in Theorem 1, let be integers; let x ~ Vm + I; and let B [k~I]-+c 11, ... ,,} be an

LEMMA 2. /

~

0, m

~

1

r-coloring. Then there exists / + m coloring of B ~~].

[i

-+6


x

in

B

such that egiS an (/, m)-


Proof We use induction on m. For m Assume for some m ~ 2 that it holds for m -

the choice of the (VI

v;, there is some

VI

+

1

I.

the lemma is trivially true. Then by induction, and by

m.....' x in B such that B [vk!~]

is

+ I. m - I)-colored bycg. We now color B [v~:/] as follows:

.....r'

k + 1 - ' VI + 1

and

each choice of compositions

jm - 1 •...• j I, 1 ..

k + 1

f

k + 1 ~VI +1

VI

Two subobjects, represented by

have the same color if and only if for h .. t, the subobjects represented by the

+ 1

~I ~VI +2~

... _ _

~m-I

VI

+m-I _ _ vl +m

and ~I

/

k+l~vl +1~vl +I~

have the same color, where

... _ _ VI

I]

~m-I +m-l~vl

I .. i .. m - I.

+m

This is an

"m_l_

· 0 f B [VIk+1 + ; ca II'It e ,. coIonng

Next, we color subobject in B

[v~:/].

A

A

[vJ 1 by the coloring induced by

M.

That is, a

[vJ] is assigned the same color as its image under M in

In other words,

there is some i. 1 .. i diagram commutes:

..

e' M is the coloring we use.

"m_ I ,

and some I

.....W

VI

By the choice of VI>

in A such that the following

h ..... "m-I}

Thus all the subobjects in MU bye' M(w). k

Suppose

k

.....r' I + h -

1

[ij) have the same color in B [i~\j

colored

+ 1 - ' 1+ h is in B, I .. h .. m, with /- M(j') for some in A. Consider the following diagram:


M(W~ •

k+1

-


vl+h+1

VI +h

q,'+h.}h

I+h

I

vl+m-1

l+h+1

q,.1 +m-I.}m-I

..

Um-I q,l+m-I.}m-1

I+m-I

I

VI+m Um

...

I+m

where

ul-M(w), u,-M(P(U'_I», i-2,3, ... ,m, 2,3, ... ,m. By condition II this commutes of ih, ih+l , ... , im - 1. Consider any subobject of I + m (!;im - 1, ... ,h) with respect to I and m. Let it be

and WI-W, for each choice with signature represented by k + 1 .....' I + m, where e is the bottom row of the diagram above with h - 1. Then ume represents a subobject of VI + m. By the definition of c' and the choice of w, all such subobjects with the same signature (!;im -I, ... ,h)

w, - P (U;_I), i -

have the same color in B [v~!;nl' since the diagram above commutes. On the

other hand, consider a subobject of I + m with signature 1, ... , ih), h ~ 2, and let it be represented by k + 1 .....' I + m, where e is the bottom row of the diagram. By the commutativity of the diagram, Um e - bM (whf), where VI + h .....b VI + m is the top row of the diagram. This means that Um e has signature (h - I;im -I, ... ,ih) with respect to (h; im -

VI

+ 1 and m -

1.

Since cg was a

(VI

+ I, m - O-coloring of B [v~!;nl' the

color of this subobject is determined only by the i,. Thus the color of any subobject with signature (h;im-I, ... , ih) with respect to I and m, h ~ I, has its color under the coloring c gUm determined only by the i" So c gUm is an (I, m)-coloring, and the lemma is proved. We may now proceed with the proof of Theorem I. Let 1- max

1<1<,

NB(k

+ 1;,;1 1 , ... ,1;-10 I, - I, 1;+1,"" I,),

a number which must exist by the induction hypothesis. Let y _ ,YI. HI, where Y'.HI is the number given by property (b). Let m - N{y, 0, where N {y ,0 is the number given by Lemma 1. Let Vm be the number used in the hypothesis of Lemma 2 (depending on I and m), and let x ~ Vm + 1. Finally, let B [k~,l""'c Ii, ... ,,) be an r-coloring. By Lemma 2 there is some

B

1+ m .....6 x in such that cg is an (I,m)-coloring of B[~~,;,l. We now color the m-tuples (h, ...• im), 1 :EO; i, ,.. I, by letting (h • ... ,im) and (k I •...• k m) have the same color if and only if for each k + 1 .....h I in B the subobjects represented by the compositions


_

_'"


II

.I,il

.I+m-l,im --'.l>- 1

h

4>1.k

4>l+m-I./cm

k + 1 --'.l>- 1 --'.l>- 1 + 1 --'.l>- ... --'.l>- 1 + m - 1

+m

and k + 1 --'.l>- 1 --'.l>- 1 + 1 --'.l>- ... --'.l>- 1 + m - 1

--'.l>-

1+ m

both have the same color in B [~~,;,]. This is a y-coloring of the m-tuples. By Lemma

and the choice of m, we can find t m-tuples z .s; t, all having the same color such that for each i either 1, (z) - z for all z or 11 (z) -it for all z and some fixed it. Let iI, ... , id be the i for which it (z) - z (there must be at least one of these since there are t m-tuples here). For 0 .s; a .s; d, let h. denote the composition (h

(z), ...

,1m (z», 1 .s;

1 + i.

-----?> 1 + i. +

... --'.l>- 1 + i. + I - 2

where we let

io - 0

and id+l

- m

• '+ia+1-2,jia+l-l

>

I

+ i.+ 1 - 1,

+ 1. Consider the following diagram:

-! ho

1 --'.l>- ...

/+;1- 1

..

4>1+11-I.j

..

hI l+il

4>1+i1-I,j

4>1+12-l.j

...

M(el)

--

-

l+i l

4>1+id-2- I ,j

1 + id-2-1

!

l+i2- 1

!

..

1+id-2

hd-2

1 +;2

{

4>1+ld-2-I,j M(ed-2)

l+id-2

4>1+ld-I-I.j

where the 1 + id - S - 1 _'d- s 1 + id-s+1 - 1 in A are those guaranteed by the Remark (following Condition III) to make this diagram commute for each 1 - 1,2, ... , t.


......

......

RAMSEYS THEOREM FOR A CLASS OF CATEGORIES By the choice of the represented by A

h.

we have for any

AO

.1+1t-1,}

k+I~/~/+il-1

.. .

M(Cd-Icd-2 ... c2cI) ., 1 + id


k

+ 1 _A 1 that the t subobjects

.,

I+il~

...

Ad

~ 1 + min B [~~ij,

I .. j .. t,

all have the same color, By Condition II, the following diagram commutes for all j: ""+II-I,}

I+il-I

r

I+i l

~

1

ho

..

"",j

1+1

Then letting a - hd M(ed_1 ... el P (ho» we see that for subobjects represented by the t compositions k+

.',j

A a I~ I~ I+I~

all have the same color, Thus

ega"',,}

I+m,

M(P(h o»

k

+ 1 _A 1 in

B

the

I " j " t,

are equal for all

j - 1,2, ...

,t, on

B[k~lj· Now consider any subobject of AI' (A [ij> in B [i~\j.

Let it be

represented by k + 1 - ' 1 + 1 in B, where f- M{f'>, k -to 1 in A, Then the subobject represented by af has signature (id;jm' ... ,itd+l> with respect to 1 and m, since af is just hd M (ed-I ... e I P (h o>1>, Since 1 + m is (f, m>-colored by c g, all subobjects of 1 + m with this signature have the same color. Thus ega gives the same color to any subobject of AI' (A [ij>, since the signature was independent of the choice of f. That is, I ..

q" r.

Consider the coloring ega"'" is some Ip

....IP 1 in B

on

B [k ~ Ij.

[ij-

{q}

for some

q,

By the choice of I, either there

such that

I]

B [k~1

or there is some

I

egaM (A

c'~'.lfp ----~.,

Iq - 1 ....Iq 1 in B

\P),

p '" q,

such that

I] ----~., Cf~'.lfq {q}.

I B [ 1~1

In the former case, we have the desired monochromatic subobject, and the theorem is proved. Hence we may assume that


B

We recall that ega4>I, I

-

[Il:1I]

ega4>I,j

ega4>I,jJq

By Condition II, 4>I,jJq -

<6""',llq

----~> {q}.

on B [k~l] for all j. In particular,

Hi::]]-

Hi::]] -

Now consider any subobject in if (A

the coloring

B,

where

J- M(f'),

M(P{Jq»J - M(P{Jq)j')

ega.

forallj.

M(P{Jq»4>lq-I,j' j - 1, ... ,I.

egaM (p{Jq» 4>1q-I,j

k + I - ' Iq in represented by

{q}

[Iq;

{q}.

Thus

j - I ..... 1 .

I]), and let it be represented by

k -J' Iq - I in A. The subobject is in if (A [L]), and thus has color q by

So cgaM (P{Jq» colors all subobjects in

We also saw above

thatcgaM(P{Jq»

I]) color q,

- [I-I] l+1 ) B[kl+ I]'

colors all subobjects

color q. But by Condition I, this accounts for all of Iq -4aM (P(fq)) x

M (A [Iq;

in4>lq _l.j(B

and hence

is the desired morphism, and the theorem is proved. 4. CONSEQUENCES

PROPOSITION 1. Let $\mathcal{C}$ be a class of categories such that for each category $B$ in $\mathcal{C}$ there is a category $A$ in $\mathcal{C}$ such that $A$ and $B$ satisfy the conditions of Theorem 1. Then $B(k; l_1, \ldots, l_r)$ holds for all $k, l_1, \ldots, l_r$ and all $B$ in $\mathcal{C}$.

Proof. $B(-1; l_1, \ldots, l_r)$ holds vacuously for all $l_1, \ldots, l_r$, as observed at the beginning of the proof of Theorem 1. This holds for all $B$ in $\mathcal{C}$. Thus for each $B$ we can find a suitable $A$ and apply Theorem 1 to obtain $B(0; l_1, \ldots, l_r)$ for all $l_1, \ldots, l_r$. Proceeding in this fashion from 0 to 1 to 2, etc., we obtain $B(k; l_1, \ldots, l_r)$ for all $k, l_1, \ldots, l_r$ and $B$ in $\mathcal{C}$.

COROLLARY 1 (Ramsey). Let $C$ be the category with objects the nonnegative integers and morphisms $k \xrightarrow{f} l$ all the monomorphic functions from $\{1, \ldots, k\}$ into $\{1, \ldots, l\}$, where composition is just composition of functions. Then $C(k; l_1, \ldots, l_r)$ holds in general.
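For this category a $k$-subobject of $m$ is a $k$-element subset of $\{1, \ldots, m\}$, so the smallest nontrivial instance of the corollary ($k = 2$, $r = 2$, $l_1 = l_2 = 3$) can be tested by exhaustion. The sketch below is illustrative only, uses hypothetical function names, and checks a single value of $m$ at a time, not all $m \ge N$.

```python
from itertools import combinations, product


def has_monochromatic_subset(m, k, l, coloring):
    """True if some l-subset of range(m) has all of its k-subsets the same color.
    `coloring` maps sorted k-tuples to colors."""
    for big in combinations(range(m), l):
        if len({coloring[s] for s in combinations(big, k)}) == 1:
            return True
    return False


def ramsey_property_holds(m, k, l, r):
    """True if every r-coloring of the k-subsets of an m-set has a
    monochromatic l-subset (brute force over all colorings)."""
    subsets = list(combinations(range(m), k))
    for colors in product(range(r), repeat=len(subsets)):
        coloring = dict(zip(subsets, colors))
        if not has_monochromatic_subset(m, k, l, coloring):
            return False
    return True
```

Here `ramsey_property_holds(6, 2, 3, 2)` examines all $2^{15}$ edge colorings of the complete graph on six points.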

Proof. We must find a class $\mathcal{C}$ containing $C$ which satisfies the conditions of Proposition 1. For $\mathcal{C}$ choose the single category $C$ itself. This clearly satisfies (a)-(c). So for $A$ and $B$ both equal to $C$, we must show that they satisfy the conditions of Theorem 1. Let $P$ be the identity functor on $C$. For any $k \xrightarrow{f} l$ in $C$, let $M(f)$ be the function $k + 1 \xrightarrow{f'} l + 1$ in $C$ given by letting $f'(x) = f(x)$, $x \le k$, and $f'(k + 1) = l + 1$. Let $\phi_l$ be the function from $\{1, \ldots, l\}$ to $\{1, \ldots, l + 1\}$ which acts identically on $\{1, \ldots, l\}$. That is, $\phi_l(x) = x$ for $x \le l$. Then we claim these choices, together with choosing $t = 1$, satisfy I-III.
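The choices of $M$, $P$ and $\phi_l$ just described can be tested mechanically for small $k$ and $l$. In the sketch below an injection $f$ from $\{1, \ldots, k\}$ to $\{1, \ldots, l\}$ is stored as a tuple with `f[x - 1]` equal to $f(x)$, and a subobject is identified with the image set. This encoding and the function names are assumptions made for illustration.

```python
from itertools import combinations, permutations


def injections(k, l):
    """All injective maps {1..k} -> {1..l} as tuples of values."""
    return list(permutations(range(1, l + 1), k))


def compose(g, f):
    """The map x -> g(f(x))."""
    return tuple(g[y - 1] for y in f)


def M(f, l):
    """Extend f: k -> l to k+1 -> l+1 by sending k+1 to l+1."""
    return f + (l + 1,)


def phi(l):
    """Inclusion of {1..l} into {1..l+1}."""
    return tuple(range(1, l + 1))


def check_conditions(k, l):
    """Return (I, II, III) as booleans for the category of injections, with t = 1."""
    # I: every (k+1)-subset of {1..l+1} is an image under M-bar or phi-bar.
    from_M = {frozenset(M(f, l)) for f in injections(k, l)}
    from_phi = {frozenset(compose(phi(l), f)) for f in injections(k + 1, l)}
    everything = {frozenset(s) for s in combinations(range(1, l + 2), k + 1)}
    cond1 = (from_M | from_phi) == everything
    # II (P is the identity): M(g) phi_s == phi_l g for every g: s -> l.
    cond2 = all(
        compose(M(g, l), phi(s)) == compose(phi(l), g)
        for s in range(l + 1)
        for g in injections(s, l)
    )
    # III with e = phi_l: phi_{l+1} phi_l == M(phi_l) phi_l.
    cond3 = compose(phi(l + 1), phi(l)) == compose(M(phi(l), l + 1), phi(l))
    return cond1, cond2, cond3
```

For example, `check_conditions(2, 4)` covers all ten 3-subsets of a 5-element set: six containing the new point and four avoiding it.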


Consider a subobject in $C{l+1 \brack k+1}$ represented by some $k + 1 \xrightarrow{f} l + 1$. First suppose $f(s) = l + 1$ for some $s$. Then $f$ represents the same subobject as $f\pi_{s,k+1}$, where $\pi_{s,k+1}$ is the permutation of $\{1, \ldots, k + 1\}$ fixing everything except $s$ and $k + 1$, which it interchanges. $\pi_{s,k+1}$ is an isomorphism and is its own inverse. Let $k \xrightarrow{f'} l$ be defined by letting $f'(x) = f\pi_{s,k+1}(x)$, $1 \le x \le k$. Then clearly $M(f') = f\pi_{s,k+1}$. Thus the subobject we chose is in $\bar M(C{l \brack k})$. The only other subobjects are represented by some $k + 1 \xrightarrow{f} l + 1$ where $f(\{1, \ldots, k + 1\}) \subset \{1, \ldots, l\}$. Then letting $k + 1 \xrightarrow{f'} l$ be defined by $f'(x) = f(x)$, $1 \le x \le k + 1$, we have $f = \phi_l f'$, and the subobject is in $\bar\phi_l(C{l \brack k+1})$. This establishes I. II is clear from the definitions. III follows by taking $e$ to be $\phi_l$, since $M(\phi_l)(x) = x$ for $1 \le x \le l$. This establishes Corollary 1.

We note that if one examines the argument used in the proof of Theorem 1 for this special case, the usual proof of Ramsey's Theorem emerges.

Let $V$ be an infinite-dimensional vector space over $GF(q)$ with basis $v_1, v_2, \ldots$. For each $k = 0, 1, \ldots$, let $V_k = \langle v_1, \ldots, v_k \rangle$, $V_0 = \langle 0 \rangle$. Let $C$ be the category which has objects 0, 1, ..., and morphisms $k \xrightarrow{\phi} l$, where $\phi$ is a linear monomorphism from $V_k$ to $V_l$. Composition is ordinary composition of mappings. $C$ clearly satisfies (a)-(c).

COROLLARY 2 (Vector Space Analog). For the category $C$ described above, $C(k; l_1, \ldots, l_r)$ holds in general.
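In this category the number $y_{k,l}$ of property (b) counts the $k$-dimensional subspaces of $V_l$. The sketch below is an illustration with hypothetical names, restricted to $q = 2$ as an assumption: it enumerates subspaces with vectors encoded as bitmasks and compares the count with the Gaussian binomial coefficient.

```python
from itertools import combinations


def span_gf2(vectors):
    """Span over GF(2) of vectors encoded as integer bitmasks."""
    span = {0}
    for v in vectors:
        span |= {u ^ v for u in span}
    return frozenset(span)


def count_subspaces_gf2(l, k):
    """Number of k-dimensional subspaces of GF(2)^l, by enumeration."""
    nonzero = range(1, 1 << l)
    spans = {span_gf2(vs) for vs in combinations(nonzero, k)}
    # keep only spans of independent k-tuples, which have exactly 2^k elements
    return sum(1 for s in spans if len(s) == 1 << k)


def gaussian_binomial(l, k, q):
    """Product over i < k of (q^(l-i) - 1) / (q^(k-i) - 1)."""
    num = den = 1
    for i in range(k):
        num *= q ** (l - i) - 1
        den *= q ** (k - i) - 1
    return num // den
```

For instance, $GF(2)^4$ has 35 two-dimensional subspaces, and `count_subspaces_gf2(0, 0)` returns 1, in line with $y_{0,0} = 1$.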

Proof. We apply Proposition 1 to a class containing $C$. Let $A$ be an infinite-dimensional vector space over $GF(q)$ with basis $a_1, a_2, \ldots$, and let $A_m = \langle a_1, \ldots, a_m \rangle$, $A_0 = \langle 0 \rangle$. For $m = 0, 1, 2, \ldots$, the category $C_m$ is defined as follows: The objects of $C_m$ are 0, 1, 2, ..., and the morphisms $k \xrightarrow{(w, \phi)} l$ are all pairs $(w, \phi)$ where $w \in A_m \otimes V_l$ and $\phi$ is a linear monomorphism from $V_k$ to $V_l$. Let $k \xrightarrow{(w, \phi)} l$, where $w = \sum_{i=1}^{m} a_i \otimes w_i$, $w_i \in V_l$, and $l \xrightarrow{(x, \psi)} n$ be morphisms in $C_m$. Then their composition is defined to be $k \xrightarrow{(y, \psi\phi)} n$, where $y = x + \sum_{i=1}^{m} a_i \otimes \psi(w_i)$. Thus we can think of these morphisms as certain special affine transformations from $A_m \otimes V_k$ into $A_m \otimes V_l$. (a)-(c) are satisfied for the $C_m$. We choose for our class $\mathcal{C}$ all the $C_m$. When $m = 0$, we get the category $C$ of Corollary 2.

For each $m$, let $B = C_m$ and $A = C_{m+1}$. We show that these satisfy Theorem 1. To define $M$, consider a morphism $k \xrightarrow{(w, \phi)} l$ in $C_{m+1}$. Then $w \in A_{m+1} \otimes V_l$ can be written uniquely as $w = w' + a_{m+1} \otimes w_{m+1}$, where $w' \in A_m \otimes V_l$. Let $\phi' : V_{k+1} \to V_{l+1}$ be determined by letting $\phi'(v_{k+1}) = v_{l+1} + w_{m+1}$, and $\phi' = \phi$ on $V_k$. Then define $M((w, \phi)) = (w', \phi')$, where $k + 1 \xrightarrow{(w', \phi')} l + 1$ is in $C_m$. One can verify by a direct check that $M$ preserves composition.

We next define $P$. Let $k \xrightarrow{(w, \phi)} l$ be in $C_m$. Then $P((w, \phi)) = (w'', \phi'')$, where $w'' = w + a_{m+1} \otimes 0$, and $\phi'' = \phi$. Clearly $P$ preserves composition. Also, since the identity morphism for $k$ in $C_m$ is $(0, 1_k)$, where $1_k$ is the identity transformation on $V_k$, and similarly for $C_{m+1}$, we see that $M(l) = l + 1$ and $P(l) = l$ for each $l$. Finally, let $t = |A_m| = q^m$, and for each element $a \in A_m$ and each $l$ let $\phi_{l,a} = (a \otimes v_{l+1}, e_l)$ in $C_m$, where $e_l$ is the map from $V_l$ to $V_{l+1}$ acting identically on $V_l$. Then these choices are sufficient to satisfy I-III.

To check I, let k + I -+(0" •• ') I + I represent a (k + I)-subobject of I + I First suppose q,' (VUI ) S/; VI' Then we can choose some isomorphism VUI -+ VUI such that q,' I/;(Vk ) c VI and q,' I/;(VUI) - VI+I + v' for some cm.


Furthermore, for a suitable choice of v E Am ® Vk+1 we have with w'EAm®V/. Of course (w',cj,') and (w'.cj,'''') represent the same subobject since (v. "') is an isomorphism. Now let k -+(w.~) I be in cm +1> where q, - q,,,,' on Vb and w - w' + am+1 ® v. Then we have M «w, q,)) - (w'. q,' "'). Thus all subobjects represented by a (w'. q,') with q,'(Vk+I) '1. V/ are in M(Cm + 1 On the other hand, ifq,'(vk+l) C VI> then (w' ,q,') - (w" + a ® V/+I> q,') for some a E Am and some w" E Am ® VI' But

V

E VI'

(w',cj,') (v.",)-(w'.cj,''''),

[Ljl.

(w"

+a

® v/+I>q,') - (a ® vl+l>e/) (w" .q,") - q,/,a (w", q,").

where q," - q,' on vk+1> This establishes I.

q,": Vk+1 -

VI'

Thus the subobject is in

q,/a (Cm

[k~l]>.

To check II, let s _(w,~) I in Cm. Then M(p«w.q,m - (w'.q,'), _(w'.l) I + I, where w' - wand q,' is the mapping determined by letting q,' - q, on V" and q,' (Vk+I) - VI+I' Clearly s

+1

This establishes II. Finally, for III, consider in cm +1 the morphism (am+1 9vJ+l.er)

- - - - - l » /+1.

M«am+1 ® v/+I.e/» ",' (V/+I) - V/+2

- (0. ",'),

where

",'

acts

+ V/+I' Now we have for each

identically

on

VI>

and

a E Am,

(a ® V/+2.el+l) (a ® VI+I.e/) - (0. ",') (a ® VI+I.e/).

This establishes III. Thus $C_m(k; l_1, \ldots, l_r)$ holds in general for all $m$ by Proposition 1. In particular, as noted above, if $m = 0$, this establishes Corollary 2.

We note also that for $m = 1$ the subobjects of an object $l$ can be considered to be affine subspaces of $V_l$. Thus we have also proved the affine version of Ramsey's Theorem, which we state below.

COROLLARY 3 (Affine Analog). For $C = C_1$ as described above, $C(k; l_1, \ldots, l_r)$ is true in general.

The application of Theorem 1 to the case $A = C_1$, $B = C_0$ is just the statement that the affine analog for $k$ and all $l_1, \dots, l_r$ implies the vector space analog for $k + 1$ and all $l_1, \dots, l_r$. This result was already known [1, 5], and the previous proof is the same as the proof of Theorem 1 specialized to this case.

There was another way given in [5] to show that Corollary 3 implies Corollary 2. Namely, it shows that $C_1(k; l_1, \dots, l_r)$ implies $C_0(k; l_1, \dots, l_r)$. This argument is also a special case of Theorem 1, and we can describe it here. Actually, we replace $C_0$ with the equivalent $C_0'$, defined by letting $k \xrightarrow{\varphi} l$ be in $C_0'$ if and only if $k - 1 \xrightarrow{\varphi} l - 1$ is in $C_0$. We also must adjoin an identity $1_0$ to $C_0'$. If $k \xrightarrow{(w,\varphi)} l$ is in $C_1$, then $M((w,\varphi)) = (0,\varphi)$ in $C_0'$, where we recall that $k + 1 \xrightarrow{(0,\varphi)} l + 1$ in $C_0'$. We let $t = 0$, thus making the choices of $P$ and $\varphi_{lj}$ unnecessary. Clearly

$$C_0'\begin{bmatrix} l+1 \\ k+1 \end{bmatrix} = M\left(C_1\begin{bmatrix} l \\ k \end{bmatrix}\right),$$

and I is satisfied. II is vacuously true as is III, since $t = 0$. Hence by Theorem 1, if $C_1(k; l_1, \dots, l_r)$ holds for all $l_1, \dots, l_r$, then $C_0'(k + 1; l_1, \dots, l_r)$ holds and this is just $C_0(k; l_1, \dots, l_r)$, as desired.

Finally we obtain the Ramsey theorem for n-parameter sets. We refer the reader to [2] to see that the definitions used there are essentially the same as those we will use here. In particular, the categories corresponding to the notions in [2] are the quotient categories described in the last paragraph in this paper. That is, the partially ordered sets of subobjects are isomorphic.

Let $G$ be a finite group, and let $A = \{a_1, \dots, a_{t_0}\}$ be a finite set. Let $C(A, G)$ be the category with objects $0, 1, 2, \dots$, and morphisms described as follows: For each $k$ and $l$, the morphisms $k \xrightarrow{(f,s)} l$ are diagrams

$$G \xleftarrow{\;s\;} \{1, \dots, l\} \cup A \xrightarrow{\;f\;} \{1, \dots, k\} \cup A,$$

where $f$ is any epimorphic function which acts identically on $A$, and $s$ is any function such that $s(a) = 1 \in G$ for $a \in A$. Composition of the morphisms $k \xrightarrow{(f,s)} l$ and $l \xrightarrow{(g,t)} m$ is given by $k \xrightarrow{(fg,\, sg \cdot t)} m$, where $fg$ is ordinary composition of functions, and $sg \cdot t$ is defined by $s(g(x)) \cdot t(x) = (sg \cdot t)(x)$ in $G$ for $x \in \{1, \dots, m\} \cup A$.

We note several things about this choice for $C(A, G)$. First there is no mention of the relationship of $G$ to $A$. $G$ need not be a permutation group on $A$, nor even act on it at all. This was a necessary assumption for part of the proof in [2]. Second, we allow $|A| < 2$ here, where in [2], $|A| \ge 2$ was required. Actually, in the situation in [2] where the n-parameter sets under consideration had constant set $B \subseteq A$, we did not need $|B| \ge 2$. But this took a separate argument. What we have here is the general result for n-parameter sets for arbitrary sets of constants $B$.

COROLLARY 4 (n-Parameter Sets). If $C = C(A, G)$, then $C(k; l_1, \dots, l_r)$ holds in general.

Proof. Again, we consider a class $\mathcal{C}$ containing $C(A, G)$ for which Proposition 1 holds. There is more than one possibility here. We will give the proof in detail for one class $\mathcal{C}$. Then we will describe another class but omit the detailed verification of I-III. It is this second class $\mathcal{C}$ which provides a more direct translation of the proof in [2]. The first $\mathcal{C}$ we describe now is somewhat different.

Let $\{a_1, a_2, a_3, \dots\}$ be an infinite set. For each $t = 1, 2, 3, \dots$, let $A_t = \{a_1, \dots, a_t\}$, and let $C_t = C(A_t, G)$. Thus $C(A, G)$ above is $C_{t_0}$ here. We claim that $A = C_{m+1}$ and $B = C_m$ satisfy Theorem 1, for all $m \ge 1$.

To see this we first define M. Let k __V.s) I be in C.,+l' Then where k - I __V· ... ) I + I in COl is defined as follows. For xEA.,U{I ..... /), /(x)-f(x) if f(x)EA.,U{I ..... k}, /(x)-k+1 if f (x) - am +I> and r (1+1) - k + 1. For x E A., U II ..... n, $ (x) - s (x), and $ (I + 1) - 1. One can check that M does preserve composition. For the identity map (e/. I), I in Cm +I> where e/ acts identically on I and I (x) - lEG, x E A m +1 U II ..... n, we see that M «e/. 1)) - (e/+I • 1) in COl' so M (I) - I + 1. M«J.s)) - (/.$),

Next we define P. Let k __ !It.,) I be in Col' Then P«h.,» - (h".,"), where k __(h" .... ) I in C.,+I is defined by letting h" (x) - h (x) and u" (x) - u (x)


for x E Am U (I •...• II, and h" (am+l) - am+h preserves composition, and p (J) - I for alii.

r" (am+l) - lEG.

P

clearly

Finally, for each I and any g E G and any j, I';; j .;; m, let or just (j.g)1 for short, where djl(x)-x for x E {I •...• II U Am, djl (I + I) - ai' and 1,1 (x) - lEG for x E {I •...• II U Am, Igi (/ + J) - g. These 4>'s are indexed by the pairs (j. g). We let 1 - IAml IGI - m IGI, and for the choices above we verify I-III. 4>1.(j.g)-(djl.l gl ),

Let

k

+I

I+I

_(f,,)

represent a subobject in Cm

[7::].

Suppose first

that f (/ + il ¢ Am. Let,.. be a permutation on {I •...• k + 11 U Am fixing all a E Am and such that ,..1 takes I + I onto k + 1. Let u - <s (/ + 1))-1, and let «(. s') - ([. s) (.... I. k ... ), where as above, I.k maps {I •...• kl U Am onto lEG, and k + I onto u. Then ( -,..1 and s' - I.kl· s. In particular, since (r. I.k ...) is an isomorphism in cm {its inverse is (,..-I.I.-I k », we see that ([.s) and s') represent the same subobject of 1+1. Now let k _V" ,,") I be defined in c m+! as follows. For x E Am U {J •...• II, we let (' (x) - ( x ) if ( x ) "'" k + I, and (' (x) - am+1 if ( x ) - k + 1. We let (' (am+l) - am+I' For x E Am U (I •...• II, we let s" (x) - s' (x), and s" (am+l) - 1. Then M«([".s"»-«.s'). SO the subobject represented by ([.s) is in M(Cm+1 This is the case, then, for any ([,s) with 1(/ + I) ~ Am. On

«.

[L]>.

the other hand, suppose I (I + J) - aj E Am. Let k + I _if .,') I in c m be defined by (X) - I (x) and s' (x) - s (x) for x E {I •...• II U Am. Then ([. s) - (j. s (/ + 1)) I s') and ([. s) represents a su bobject in (j. s (/ + 1))1 (cm [k~l]>. This establishes I.

«.

For II, we note that for k _V,,) I in cm' M(P«([,s))) is the morphism I _V'.,') I + I in cm, where ( x ) - I (x) and s' (x) - s (x) for x E {I •...• II U Am, and ((/ + I) - k + I, s' (/ + I) - 1. Then for each j and g we see that (j.g)I([.s) - «.s') (j.g)t, establishing II. k

+

To verify III, we consider (m + I. ill in cm+!' Then M«m + I, \)1) is the morphism I + I -(',]) I + 2 in cm where I (x) - I for all x in {I •... , I + 21 U Am and e' (x) - x for x E {I, ... ,II U Am, and e' (/ + il - I + I, e' (I + 2) - I + 1. Then clearly (j,g)1+1 (j,g)l- (e', \) (j,g)l- M«m + I. \)1) (j,g)I' This establishes III and completes the proof of Corollary 4. The alternate choice for the class qj to prove Corollary 4 is as follows. For each m - 0, 1.2... .• let A;" - A U «(I, ... , ml x G), and let c;,. - C(A;". G). Then Co - c. Let ~ be the class of all c;". For each m. C;"+I and c;,. satisfy Theorem 1. x

For E A;" U

k _ V ,') I {I •...• I}

( (x) -

for

in C;"+h we let

we

let

M«([.s))-«.'),

where

for

f (x) and s' (x) - s (x) if I (x) E A;" U (I •...• k);

we let ( x ) - k + I, s' (x) - g's(x); and ( I + \) - k + I, I in c;,. we define P«([.s))-«.') in C;"+I by letting ( x ) - I (x), s' (x) - s (x) if x E A;" U {I •...• II, and ((m + I. g)) (m+l.g), s'«m+l.g»-1. For aEA;" and gEG, as before, we let 4>I,(a.g) - (dal.l gl ). Then I, II and III can be verified, with t - IA;"IIGI. I(x) - (m + I.g),

,(/+\)-1.

For

k_(f,,)

Now we still do not have an exact translation of the proof in [2]. In particular, we have taken no account of any action of $G$ on $A$. To handle this we consider a set $A$ and a group $G$ acting on $A$, $a \mapsto a^g \in A$ for $g \in G$. We consider the category $C(A, G)$ and obtain from it the category $\bar{C}(A, G)$ by identifying any two morphisms $k \xrightarrow{(f,s)} l$ and $k \xrightarrow{(g,u)} l$ for which $f(x) = g(x)$ and $s(x) = u(x)$ if $f(x) \in \{1, \dots, k\}$, and $f(x)^{s(x)} = g(x)^{u(x)}$ otherwise. By considering $G$ to act on $\{1, \dots, m\} \times G$ by $(i, g)^h = (i, gh)$ for all $h \in G$, we obtain the categories $\bar{C}'_m = \bar{C}(A'_m, G)$. The categories $\bar{C}'_{m+1}$ and $\bar{C}'_m$ satisfy Theorem 1, where we take for $M$ and $P$ the functors determined by the $M$ and $P$ for $C'_{m+1}$ and $C'_m$ above by their action on classes of identified morphisms. For the $\varphi$'s we use classes of identified $\varphi_{l,(a,g)}$ from above. There are $|A'_m|$ of these, represented by the $\varphi_{l,(a,1)}$. Thus we let $t = |A'_m|$ here. Letting $\mathcal{C}$ be the class consisting of all $\bar{C}'_m$, we can apply Proposition 1. This is the exact translation of the proof in [2].

REFERENCES

1. R. L. GRAHAM AND B. L. ROTHSCHILD, Rota's geometric analogue to Ramsey's theorem, Proc. AMS Symp. in Pure Mathematics XIX, Combinatorics, AMS, Providence (1971), 101-104.
2. R. L. GRAHAM AND B. L. ROTHSCHILD, Ramsey's Theorem for n-parameter Sets, Trans. Amer. Math. Soc. 159 (1971), 257-292.
3. A. HALES AND R. I. JEWETT, Regularity and Positional games, Trans. Amer. Math. Soc. 106 (1963), 222-229.
4. F. P. RAMSEY, On a problem of formal logic, Proc. London Math. Soc. 2nd Ser. 30 (1930), 264-286.
5. B. L. ROTHSCHILD, A generalization of Ramsey's theorem and a conjecture of Rota, doctoral dissertation, Yale University, New Haven, CT, 1967.
6. H. J. RYSER, "Combinatorial Mathematics," Wiley, New York, 1963.

Reprinted from Advances in Math. 8 (1972), 417-433


Reprinted from JOURNAL OF COMBINATORIAL THEORY, Vol. 13, No. 2, October 1972
All Rights Reserved by Academic Press, New York and London. Printed in Belgium

A Characterization of Perfect Graphs

L. LOVÁSZ

Eötvös L. University, Budapest, VIII. Múzeum krt. 6-8, Hungary

Communicated by W. T. Tutte

Received December 3, 1971

It is shown that a graph is perfect iff the inequality (maximum clique size) · (stability number) ≥ (number of vertices) holds for each induced subgraph. It follows immediately that the complement of a perfect graph is perfect, a fact conjectured by Berge and proved by the author.

Throughout this note, graph means finite, undirected graph without loops and multiple edges. $\bar{G}$ and $|G|$ denote the complement and the number of vertices of $G$, respectively. Let $\mu(G)$ denote the maximum cardinality of a clique in the graph $G$, and let $\chi(G)$ be the chromatic number of $G$. Obviously $\chi(G) \ge \mu(G)$.

A graph $G$ is called perfect if $\chi(G') = \mu(G')$ for every induced subgraph $G'$ of $G$. Berge [1] formulated two conjectures in connection with this notion:

(A) A graph is perfect iff neither it nor its complement contains an odd circuit without diagonals.

(B) The complement of a perfect graph is perfect.

Obviously, (A) is stronger than (B). In [3] (B) was proved. This result also follows from the theory of anti-blocking polyhedra, developed by Fulkerson [2]. In the present paper a theorem stronger than (B) but weaker than (A) is proved. This possibility of sharpening of (B) was raised by A. Hajnal. Copyright © 1972 by Academic Press, Inc. All rights of reproduction in any form reserved.


THEOREM. A graph $G$ is perfect if and only if

$$\mu(G')\,\mu(\bar{G}') \ge |G'|$$

for every induced subgraph $G'$ of $G$.

Proof. Part "only if" is trivial. To prove part "if" we use induction on $|G|$. Thus we may assume that any proper induced subgraph of $G$, as well as its complement, is perfect. Let multiplication of a vertex $x$ by $h$ ($h \ge 0$) mean substituting for it $h$ independent vertices, joined to the same set of vertices as $x$. This notion is closely related to the notion of pluperfection, introduced by D. R. Fulkerson.
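The inequality in the theorem can be tested by brute force on small graphs. The following is only an illustrative sketch, not part of the paper: the function names (`clique_number`, `satisfies_criterion`, `cycle`) are hypothetical, graphs are assumed to be given as a vertex collection plus a set of two-element frozensets, and the search is exponential, so it is meant for a handful of vertices only.

```python
from itertools import combinations

def clique_number(vertices, edges):
    """mu(G): size of a largest set of pairwise adjacent vertices (brute force)."""
    vs = list(vertices)
    for size in range(len(vs), 0, -1):
        for subset in combinations(vs, size):
            if all(frozenset(pair) in edges for pair in combinations(subset, 2)):
                return size
    return 0

def complement_edges(vertices, edges):
    """Edge set of the complement graph on the same vertices."""
    return {frozenset(p) for p in combinations(vertices, 2)} - edges

def satisfies_criterion(vertices, edges):
    """True if mu(G') * mu(complement of G') >= |G'| for every induced subgraph G'."""
    vs = list(vertices)
    for size in range(1, len(vs) + 1):
        for sub in combinations(vs, size):
            sub_edges = {e for e in edges if e <= set(sub)}
            mu = clique_number(sub, sub_edges)
            mu_bar = clique_number(sub, complement_edges(sub, sub_edges))
            if mu * mu_bar < size:
                return False
    return True

def cycle(n):
    """The circuit on n vertices, as (vertices, edges)."""
    return set(range(n)), {frozenset((i, (i + 1) % n)) for i in range(n)}

print(satisfies_criterion(*cycle(4)))  # even circuit: criterion holds
print(satisfies_criterion(*cycle(5)))  # odd circuit without diagonals: 2 * 2 < 5
```

For the circuit on five vertices both the largest clique and the largest stable set have size 2, so the product 4 falls short of 5, in line with conjecture (A) excluding odd circuits without diagonals.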

(I) As a first step of the proof we show that if $G_0$ arises from $G$ by multiplication of its vertices then $G_0$ satisfies

$$\mu(G_0)\,\mu(\bar{G}_0) \ge |G_0|.$$

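Assume this is not the case and consider a $G_0$ failing to have this property and with minimum number of vertices. Obviously, there is a vertex $y$ of $G$ which is multiplied by $h \ge 2$; let $y_1, \dots, y_h$ be the corresponding vertices of $G_0$. Then

$$|G_0| - 1 = |G_0 - y_1| \le \mu(G_0 - y_1)\,\mu(\overline{G_0 - y_1}) \le \mu(G_0)\,\mu(\bar{G}_0) \le |G_0| - 1$$

by the minimality of $G_0$; hence

$$\mu(G_0) = \mu(G_0 - y_1) = p, \qquad \mu(\bar{G}_0) = \mu(\overline{G_0 - y_1}) = r,$$

and $|G_0| = pr + 1$.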

Put $G_1 = G_0 - \{y_1, \dots, y_h\}$. Then $G_1$ arises from $G - y$ by multiplication of its vertices, hence by [1, Theorem 1], $\bar{G}_1$ is perfect. Thus, $G_1$ can be covered by $\mu(\bar{G}_1) \le \mu(\bar{G}_0) = r$ disjoint cliques of $G_1$; let $C_1, \dots, C_r$ be these cliques, $|C_1| \ge |C_2| \ge \dots \ge |C_r|$. Obviously, $h \le r$. Since $|G_1| = |G_0| - h = pr + 1 - h$,

$$|C_1| = \dots = |C_{r-h+1}| = p.$$

Let $G_2$ be the subgraph of $G_0$ induced by $C_1 \cup \dots \cup C_{r-h+1} \cup \{y_1\}$, then

$$|G_2| = (r - h + 1)p + 1 < |G_0|;$$

thus, by the minimality of $G_0$,

$$\mu(G_2)\,\mu(\bar{G}_2) \ge |G_2|.$$

Since $\mu(G_2) \le \mu(G_0) = p$, this implies

$$\mu(\bar{G}_2) \ge r - h + 2.$$

Let $F$ be a stable set of $r - h + 2$ vertices of $G_2$; then $|F \cap C_i| \le 1$ ($1 \le i \le r - h + 1$), hence $y_1 \in F$. This implies that $F \cup \{y_2, \dots, y_h\}$ is stable in $G_0$. On the other hand

$$|F \cup \{y_2, \dots, y_h\}| = r + 1 > \mu(\bar{G}_0),$$

a contradiction.

(II) We show that $\chi(G) = \mu(G)$. It is enough to find a stable set $F$ such that $\mu(G - F) < \mu(G)$ since then, by the induction hypothesis, $G - F$ can be colored by $\mu(G) - 1$ colors and, adding $F$ as a further one, we obtain a $\mu(G)$-coloring of $G$. Assume indirectly that $G - F$ contains a $\mu(G)$-clique $C_F$ for any stable set $F$ in $G$. Let, for $x \in G$, $h(x)$ denote the number of $C_F$'s containing $x$. Let $G_0$ arise from $G$ by multiplying each $x$ by $h(x)$. Then, by Part I above,

$$\mu(G_0)\,\mu(\bar{G}_0) \ge |G_0|.$$

On the other hand, obviously

$$|G_0| = \sum_x h(x) = \sum_F |C_F| = pf,$$

where $f$ denotes the number of all stable sets in $G$, and

$$\mu(G_0) \le \mu(G) = p,$$

$$\mu(\bar{G}_0) = \max_F \sum_{x \in F} h(x) = \max_F \sum_{F'} |F \cap C_{F'}| \le \max_F \sum_{F' \ne F} 1 = f - 1,$$

a contradiction.

REMARK. The condition given in the theorem is strictly related to the max-max inequality given by Fulkerson [2]. Multiplication of a vertex is the same as what he calls pluperfection.

REFERENCES

1. C. BERGE, Färbung von Graphen, deren sämtliche bzw. deren ungerade Kreise starr sind, Wiss. Z. Martin-Luther-Univ. Halle-Wittenberg Math.-Natur. Reihe (1961), 114.
2. D. R. FULKERSON, Blocking and anti-blocking pairs of polyhedra, 7th International Programming Symposium, The Hague, 1970.
3. L. LOVÁSZ, Normal hypergraphs and the perfect graph conjecture, Discrete Math., in press.

Printed by the St Catherine Press Ltd., Tempelhof 37, Bruges, Belgium.


Reprinted from JOURNAL OF COMBINATORIAL THEORY All Rights Reserved by Academic Press, New York and London

Vol. 13, No.3, December 1972 Printed in Belgium

Note

A Note on the Line Reconstruction Problem

L. LOVÁSZ

Eötvös L. University, Budapest, Hungary

Communicated by W. T. Tutte

Received May 29, 1972

It is shown that if a graph has more lines than its complement does, then it can be reconstructed from its line-deleted subgraphs.

As in Harary's book [4], graph means finite, undirected graph without loops or multiple lines. $V(G)$ and $E(G)$ denote the sets of points and lines of $G$, respectively. Ulam [6] conjectured that, if two graphs $G_1$ and $G_2$ are such that $V(G_1) = \{v_1, \dots, v_n\}$, $V(G_2) = \{w_1, \dots, w_n\}$, $n \ge 3$, and $G_1 - v_i \cong G_2 - w_i$ for each $i$, then $G_1 \cong G_2$. In other words, every graph with at least three points can be uniquely reconstructed from its maximal induced subgraphs. It seems that this conjecture is particularly difficult, and it is solved for special cases only; see, e.g., [5]. An analogous conjecture, formulated by Harary [3], replaces "maximal induced subgraphs" by "maximal subgraphs". This conjecture is actually weaker than Ulam's conjecture (see [1]). In this note we prove it for graphs with "many" lines.

THEOREM. Let $G_1$, $G_2$ be two graphs, $E(G_1) = \{e_1, \dots, e_m\}$, $E(G_2) = \{f_1, \dots, f_m\}$, and $|V(G_1)| = |V(G_2)| = n$. Assume that $G_1 - e_i \cong G_2 - f_i$ for each $1 \le i \le m$, and $m > \tfrac{1}{2}\binom{n}{2}$. Then $G_1 \cong G_2$.
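The proof of this theorem rests on a sieve (inclusion-exclusion) identity: the number of monomorphisms of $G$ into $H$ equals the alternating sum, over all spanning subgraphs $X$ of $G$, of the number of monomorphisms of $X$ into the complement of $H$. The sketch below checks that identity by brute force on small examples. It is illustrative only: the names `monomorphisms` and `sieve_count` are hypothetical, both graphs are assumed to live on the same point set $\{0, \dots, n-1\}$ (so monomorphisms are bijections), and lines are given as pairs.

```python
from itertools import combinations, permutations

def monomorphisms(n, lines_g, lines_h):
    """Number of bijections of {0..n-1} sending every line of G to a line of H."""
    target = {frozenset(e) for e in lines_h}
    return sum(
        all(frozenset((pi[u], pi[v])) in target for u, v in lines_g)
        for pi in permutations(range(n))
    )

def complement(n, lines):
    """Lines of the complement graph on {0..n-1}."""
    present = {frozenset(e) for e in lines}
    return [p for p in combinations(range(n), 2) if frozenset(p) not in present]

def sieve_count(n, lines_g, lines_h):
    """Alternating sum over spanning subgraphs X of G of |X -> complement of H|."""
    comp_h = complement(n, lines_h)
    lines_g = list(lines_g)
    total = 0
    for k in range(len(lines_g) + 1):
        for x_lines in combinations(lines_g, k):
            total += (-1) ** k * monomorphisms(n, x_lines, comp_h)
    return total

path4 = [(0, 1), (1, 2), (2, 3)]
cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]

print(monomorphisms(4, path4, cycle4), sieve_count(4, path4, cycle4))
# A graph with more lines than its complement has no monomorphism into it:
print(monomorphisms(4, cycle4, complement(4, cycle4)))
```

The last line illustrates the counting step used at the end of the proof: when $m > \tfrac{1}{2}\binom{n}{2}$, the complement has too few lines to receive all $m$ lines of the graph.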

Proof. Let $G \to H$ denote the set of all monomorphisms of $G$ into $H$. Then, by the sieve formula,

$$|G \to H| = \sum_{X \subseteq G} (-1)^{|E(X)|}\, |X \to \bar{H}|, \tag{1}$$

where $\bar{H}$ is the complement of $H$ and $X$ runs over all graphs with $V(X) = V(G)$, $E(X) \subseteq E(G)$. In effect, the right-hand side of (1) just counts all maps from the points of $G$ to the points of $H$, then takes away those maps sending (at least) one line of $G$ to a line of $\bar{H}$, then adds those sending (at least) two lines to lines of $\bar{H}$, etc. Thus it counts exactly those maps which send no lines of $G$ to lines of $\bar{H}$, so every line of $G$ goes to a line of $H$. Applying (1) to $G_1$ and $G_2$ we have

$$|G_1 \to G_2| = \sum_{X \subseteq G_1} (-1)^{|E(X)|}\, |X \to \bar{G}_2|, \tag{2}$$

and for $G_2$ and $G_2$ we have

$$|G_2 \to G_2| = \sum_{X \subseteq G_2} (-1)^{|E(X)|}\, |X \to \bar{G}_2|. \tag{3}$$

Since the hypothesis on maximal subgraphs assures that $G_1$ and $G_2$ have the same proper subgraphs (see [2, p. 92]), the terms in (2) and (3), with $X \ne G_1$ and $X \ne G_2$, are equal. Also, since $m > \tfrac{1}{2}\binom{n}{2}$, $|G_1 \to \bar{G}_2| = |G_2 \to \bar{G}_2| = 0$. Hence $|G_1 \to G_2| = |G_2 \to G_2| > 0$, which proves the theorem.

REFERENCES

1. D. L. GREENWELL, Reconstructing graphs, Proc. Amer. Math. Soc. 30 (1971), 431-433.
2. D. L. GREENWELL AND R. L. HEMMINGER, Reconstructing graphs, "The Many Facets of Graph Theory" (G. T. Chartrand and S. F. Kapoor, eds.), Springer-Verlag, New York, 1969.
3. F. HARARY, On the reconstruction of a graph from a collection of subgraphs, "Theory of Graphs and Its Applications" (M. Fiedler, ed.), Czechoslovak Academy of Sciences, Prague/Academic Press, New York, 1965, pp. 47-52.
4. F. HARARY, "Graph Theory," Addison-Wesley, Reading, Mass., 1969.
5. F. HARARY AND B. MANVEL, The reconstruction conjecture for labeled graphs, "Combinatorial Structures and Their Applications" (R. K. Guy, ed.), Gordon & Breach, New York, 1969.
6. S. M. ULAM, "A Collection of Mathematical Problems," Wiley (Interscience), New York, 1960, p. 29.

Printed by the St Catherine Press Ltd., Tempelhof 37. Bruges. Belgium.


© DISCRETE MATHEMATICS 5 (1973) 171-178. North-Holland Publishing Company

ACYCLIC ORIENTATIONS OF GRAPHS*

Richard P. STANLEY

Department of Mathematics, University of California, Berkeley, Calif. 94720, USA

Received 1 June 1972

Abstract. Let $G$ be a finite graph with $p$ vertices and $\chi$ its chromatic polynomial. A combinatorial interpretation is given to the positive integer $(-1)^p \chi(-\lambda)$, where $\lambda$ is a positive integer, in terms of acyclic orientations of $G$. In particular, $(-1)^p \chi(-1)$ is the number of acyclic orientations of $G$. An application is given to the enumeration of labeled acyclic digraphs. An algebra of full binomial type, in the sense of Doubilet-Rota-Stanley, is constructed which yields the generating functions which occur in the above context.

1. The chromatic polynomial with negative arguments

Let $G$ be a finite graph, which we assume to be without loops or multiple edges. Let $V = V(G)$ denote the set of vertices of $G$ and $X = X(G)$ the set of edges. An edge $e \in X$ is thought of as an unordered pair $\{u, v\}$ of two distinct vertices. The integers $p$ and $q$ denote the cardinalities of $V$ and $X$, respectively. An orientation of $G$ is an assignment of a direction to each edge $\{u, v\}$, denoted by $u \to v$ or $v \to u$, as the case may be. An orientation of $G$ is said to be acyclic if it has no directed cycles.

Let $\chi(\lambda) = \chi(G, \lambda)$ denote the chromatic polynomial of $G$ evaluated at $\lambda \in \mathbb{C}$. If $\lambda$ is a non-negative integer, then $\chi(\lambda)$ has the following rather unorthodox interpretation.

Proposition 1.1. $\chi(\lambda)$ is equal to the number of pairs $(\sigma, \mathcal{O})$, where $\sigma$ is any map $\sigma: V \to \{1, 2, \dots, \lambda\}$ and $\mathcal{O}$ is an orientation of $G$, subject to the two conditions:
(a) The orientation $\mathcal{O}$ is acyclic.
(b) If $u \to v$ in the orientation $\mathcal{O}$, then $\sigma(u) > \sigma(v)$.

* The research was supported by a Miller Research Fellowship.


Proof. Condition (b) forces the map $\sigma$ to be a proper coloring (i.e., if $\{u, v\} \in X$, then $\sigma(u) \ne \sigma(v)$). From (b), condition (a) follows automatically. Conversely, if $\sigma$ is proper, then (b) defines a unique acyclic orientation of $G$. Hence, the number of allowed $\sigma$ is just the number of proper colorings of $G$ with the colors $1, 2, \dots, \lambda$, which by definition is $\chi(\lambda)$.

Proposition 1.1 suggests the following modification of $\chi(\lambda)$. If $\lambda$ is a non-negative integer, define $\bar{\chi}(\lambda)$ to be the number of pairs $(\sigma, \mathcal{O})$, where $\sigma$ is any map $\sigma: V \to \{1, 2, \dots, \lambda\}$ and $\mathcal{O}$ is an orientation of $G$, subject to the two conditions:
(a') The orientation $\mathcal{O}$ is acyclic,
(b') If $u \to v$ in the orientation $\mathcal{O}$, then $\sigma(u) \ge \sigma(v)$.
We then say that $\sigma$ is compatible with $\mathcal{O}$. The relationship between $\chi$ and $\bar{\chi}$ is somewhat analogous to the relationship between combinations of $n$ things taken $k$ at a time without repetition, enumerated by $\binom{n}{k}$, and with repetition, enumerated by $\binom{n+k-1}{k} = (-1)^k \binom{-n}{k}$.

Theorem 1.2. For all non-negative integers $\lambda$, $\bar{\chi}(\lambda) = (-1)^p \chi(-\lambda)$.
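Theorem 1.2 can be checked numerically on small graphs before reading the proof. The sketch below is an illustration under stated assumptions, not the paper's method: `chromatic` evaluates the chromatic polynomial at any integer by the standard deletion-contraction recursion, and `chi_bar` counts compatible pairs directly from conditions (a') and (b'). All function names are hypothetical, and both routines are exponential in the number of edges.

```python
from itertools import product

def chromatic(vertices, edges, lam):
    """chi(G, lam) by deletion-contraction; lam may be any integer."""
    edges = {frozenset(e) for e in edges}
    if not edges:
        return lam ** len(vertices)
    e = next(iter(edges))
    u, v = tuple(e)
    deleted = edges - {e}
    # Contract e by merging v into u; parallel edges collapse in the set.
    contracted = set()
    for f in deleted:
        g = frozenset(u if w == v else w for w in f)
        if len(g) == 2:
            contracted.add(g)
    rest = [w for w in vertices if w != v]
    return chromatic(vertices, deleted, lam) - chromatic(rest, contracted, lam)

def is_acyclic(vertices, arcs):
    """True if the arcs (tail, head) contain no directed cycle."""
    remaining = set(vertices)
    arcs = set(arcs)
    while remaining:
        sources = {w for w in remaining if all(head != w for _, head in arcs)}
        if not sources:
            return False
        remaining -= sources
        arcs = {(tail, head) for tail, head in arcs if tail in remaining}
    return True

def chi_bar(vertices, edges, lam):
    """Number of pairs (sigma, O): O acyclic, and u -> v implies sigma(u) >= sigma(v)."""
    vertices = list(vertices)
    edges = [tuple(e) for e in edges]
    count = 0
    for flips in product((False, True), repeat=len(edges)):
        arcs = [(v, u) if flip else (u, v) for (u, v), flip in zip(edges, flips)]
        if not is_acyclic(vertices, arcs):
            continue
        for values in product(range(1, lam + 1), repeat=len(vertices)):
            sigma = dict(zip(vertices, values))
            if all(sigma[a] >= sigma[b] for a, b in arcs):
                count += 1
    return count

triangle = ([0, 1, 2], [(0, 1), (1, 2), (0, 2)])
p = len(triangle[0])
for lam in range(4):
    print(lam, chi_bar(*triangle, lam), (-1) ** p * chromatic(*triangle, -lam))
```

For the triangle, $\chi(\lambda) = \lambda(\lambda-1)(\lambda-2)$, so the two printed columns should both equal $\lambda(\lambda+1)(\lambda+2)$.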

Proof. Recall the well-known fact that the chromatic polynomial $\chi(G, \lambda)$ is uniquely determined by the three conditions:
(i) $\chi(G_0, \lambda) = \lambda$, where $G_0$ is the one-vertex graph,
(ii) $\chi(G + H, \lambda) = \chi(G, \lambda)\,\chi(H, \lambda)$, where $G + H$ is the disjoint union of $G$ and $H$,
(iii) for all $e \in X$, $\chi(G, \lambda) = \chi(G \setminus e, \lambda) - \chi(G/e, \lambda)$, where $G \setminus e$ denotes $G$ with the edge $e$ deleted and $G/e$ denotes $G$ with the edge $e$ contracted to a point.

Hence, it suffices to prove the following three properties of $\bar{\chi}$:
(i') $\bar{\chi}(G_0, \lambda) = \lambda$, where $G_0$ is the one-vertex graph,
(ii') $\bar{\chi}(G + H, \lambda) = \bar{\chi}(G, \lambda)\,\bar{\chi}(H, \lambda)$,
(iii') $\bar{\chi}(G, \lambda) = \bar{\chi}(G \setminus e, \lambda) + \bar{\chi}(G/e, \lambda)$.

Properties (i') and (ii') are obvious, so we need only prove (iii'). Let $\sigma: V(G \setminus e) \to \{1, 2, \dots, \lambda\}$ and let $\mathcal{O}$ be an acyclic orientation of $G \setminus e$ compatible with $\sigma$, where $e = \{u, v\} \in X$. Let $\mathcal{O}_1$ be the orientation of $G$ obtained by adjoining $u \to v$ to $\mathcal{O}$, and $\mathcal{O}_2$ that obtained by adjoining $v \to u$. Observe that $\sigma$ is defined on $V(G)$ since $V(G) = V(G \setminus e)$. We will show that for each pair $(\sigma, \mathcal{O})$, exactly one of $\mathcal{O}_1$ and $\mathcal{O}_2$ is an acyclic orientation compatible with $\sigma$, except for $\bar{\chi}(G/e, \lambda)$ of these pairs, in which case both $\mathcal{O}_1$ and $\mathcal{O}_2$ are acyclic orientations compatible with $\sigma$. It then follows that $\bar{\chi}(G, \lambda) = \bar{\chi}(G \setminus e, \lambda) + \bar{\chi}(G/e, \lambda)$, so proving the theorem.

For each pair $(\sigma, \mathcal{O})$, where $\sigma: G \setminus e \to \{1, 2, \dots, \lambda\}$ and $\mathcal{O}$ is an acyclic orientation of $G \setminus e$ compatible with $\sigma$, one of the following three possibilities must hold.

Case 1: $\sigma(u) > \sigma(v)$. Clearly $\mathcal{O}_2$ is not compatible with $\sigma$ while $\mathcal{O}_1$ is compatible. Moreover, $\mathcal{O}_1$ is acyclic, since if $u \to v \to w_1 \to w_2 \to \dots \to u$ were a directed cycle in $\mathcal{O}_1$, we would have $\sigma(u) > \sigma(v) \ge \sigma(w_1) \ge \sigma(w_2) \ge \dots \ge \sigma(u)$, which is impossible.

Case 2: $\sigma(u) < \sigma(v)$. Then symmetrically to Case 1, $\mathcal{O}_2$ is acyclic and compatible with $\sigma$, while $\mathcal{O}_1$ is not compatible.

Case 3: $\sigma(u) = \sigma(v)$. Both $\mathcal{O}_1$ and $\mathcal{O}_2$ are compatible with $\sigma$. We claim that at least one of them is acyclic. Suppose not. Then $\mathcal{O}_1$ contains a directed cycle $u \to v \to w_1 \to w_2 \to \dots \to u$ while $\mathcal{O}_2$ contains a directed cycle $v \to u \to w_1' \to w_2' \to \dots \to v$. Hence, $\mathcal{O}$ contains the directed cycle

$$u \to w_1' \to w_2' \to \dots \to v \to w_1 \to w_2 \to \dots \to u,$$

contradicting the assumption that $\mathcal{O}$ is acyclic.

It remains to prove that both $\mathcal{O}_1$ and $\mathcal{O}_2$ are acyclic for exactly $\bar{\chi}(G/e, \lambda)$ pairs $(\sigma, \mathcal{O})$, with $\sigma(u) = \sigma(v)$. To do this we define a bijection $\Phi(\sigma, \mathcal{O}) = (\sigma', \mathcal{O}')$ between those pairs $(\sigma, \mathcal{O})$ such that both $\mathcal{O}_1$ and $\mathcal{O}_2$ are acyclic (with $\sigma(u) = \sigma(v)$) and those pairs $(\sigma', \mathcal{O}')$ such that $\sigma': G/e \to \{1, 2, \dots, \lambda\}$ and $\mathcal{O}'$ is an acyclic orientation of $G/e$ compatible with $\sigma'$. Let $z$ be the vertex of $G/e$ obtained by identifying $u$ and $v$, so

$$V(G/e) = V(G \setminus e) - \{u, v\} \cup \{z\}$$

and $X(G/e) = X(G \setminus e)$. Given $(\sigma, \mathcal{O})$, define $\sigma'$ by $\sigma'(w) = \sigma(w)$ for all $w \in V(G/e) - \{z\}$ and $\sigma'(z) = \sigma(u) = \sigma(v)$. Define $\mathcal{O}'$ by $w_1 \to w_2$ in $\mathcal{O}'$ if and only if $w_1 \to w_2$ in $\mathcal{O}$. It is easily seen that the map $\Phi(\sigma, \mathcal{O}) = (\sigma', \mathcal{O}')$ establishes the desired bijection, and we are through.

Theorem 1.2 provides a combinatorial interpretation of the positive integer $(-1)^p \chi(G, -\lambda)$, where $\lambda$ is a positive integer. In particular, when $\lambda = 1$ every orientation of $G$ is automatically compatible with every map $\sigma: G \to \{1\}$. We thus obtain the following corollary.

Corollary 1.3. If $G$ is a graph with $p$ vertices, then $(-1)^p \chi(G, -1)$ is equal to the number of acyclic orientations of $G$.

In [5], the following question was raised (for a special class of graphs). Let $G$ be a $p$-vertex graph and let $\omega$ be a labeling of $G$, i.e., a bijection $\omega: V(G) \to \{1, 2, \dots, p\}$. Define an equivalence relation $\sim$ on the set of all $p!$ labelings $\omega$ of $G$ by the condition that $\omega \sim \omega'$ if whenever $\{u, v\} \in X(G)$, then $\omega(u) < \omega(v) \Leftrightarrow \omega'(u) < \omega'(v)$. How many equivalence classes of labelings of $G$ are there? Clearly two labelings $\omega$ and $\omega'$ are equivalent if and only if the unique orientations $\mathcal{O}$ and $\mathcal{O}'$ compatible with $\omega$ and $\omega'$, respectively, are equal. Moreover, the orientations $\mathcal{O}$ which arise in this way are precisely the acyclic ones. Hence, by Corollary 1.3, the number of equivalence classes is $(-1)^p \chi(G, -1)$.

We conclude this section by discussing the relationship between the chromatic polynomial of a graph and the order polynomial [4;5;6] of a partially ordered set. If $P$ is a $p$-element partially ordered set, define the order polynomial $\Omega(P, \lambda)$ (evaluated at the non-negative integer $\lambda$) to be the number of order-preserving maps $\sigma: P \to \{1, 2, \dots, \lambda\}$. Define the strict order polynomial $\bar{\Omega}(P, \lambda)$ to be the number of strict order-preserving maps $\sigma: P \to \{1, 2, \dots, \lambda\}$, i.e., if $x < y$ in $P$, then $\sigma(x) < \sigma(y)$. In [5], it was shown that $\Omega$ and $\bar{\Omega}$ are polynomials in $\lambda$ related by $\bar{\Omega}(P, \lambda) = (-1)^p\, \Omega(P, -\lambda)$. This is the precise analogue of Theorem 1.2. We shall now clarify this analogy.

If $\mathcal{O}$ is an orientation of a graph $G$, regard $\mathcal{O}$ as a binary relation $\ge$ on $V(G)$ defined by $u \ge v$ if $u \to v$. If $\mathcal{O}$ is acyclic, then the transitive and reflexive closure $\bar{\mathcal{O}}$ of $\mathcal{O}$ is a partial ordering of $V(G)$. Moreover, a map $\sigma: V(G) \to \{1, 2, \dots, \lambda\}$ is compatible with $\mathcal{O}$ if and only if $\sigma$ is order-preserving when considered as a map from $\bar{\mathcal{O}}$. Hence the number of $\sigma$ compatible with $\mathcal{O}$ is just $\Omega(\bar{\mathcal{O}}, \lambda)$ and we conclude that

$$\bar{\chi}(G, \lambda) = \sum_{\mathcal{O}} \Omega(\bar{\mathcal{O}}, \lambda),$$

where the sum is over all acyclic orientations $\mathcal{O}$ of $G$. In the same way, using Proposition 1.1, we deduce

$$\chi(G, \lambda) = \sum_{\mathcal{O}} \bar{\Omega}(\bar{\mathcal{O}}, \lambda). \tag{1}$$

Hence, Theorem 1.2 follows from the known result $\bar{\Omega}(P, \lambda) = (-1)^p\, \Omega(P, -\lambda)$, but we thought a direct proof to be more illuminating. Equation (1) strengthens the claim made in [4] that the strict order polynomial $\bar{\Omega}$ is a partially-ordered set analogue of the chromatic polynomial $\chi$.
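The labeling question raised in this section lends itself to a direct experiment: send each labeling to the orientation it determines and count the distinct orientations obtained. The sketch below does this by enumerating all $p!$ labelings; the function names are hypothetical. The expected counts in the comments use standard closed forms that are not taken from this paper: a complete graph on $p$ vertices has $p!$ acyclic orientations, a tree on $p$ vertices has $2^{p-1}$, and a circuit on $n$ vertices has $2^n - 2$, the last because its chromatic polynomial is $(\lambda-1)^n + (-1)^n(\lambda-1)$.

```python
from itertools import combinations, permutations

def orientation_from_labeling(edges, omega):
    """Orient each edge from the larger label to the smaller one."""
    return frozenset((u, v) if omega[u] > omega[v] else (v, u) for u, v in edges)

def labeling_classes(vertices, edges):
    """Distinct orientations induced by the p! labelings; one per equivalence class."""
    vertices = list(vertices)
    seen = set()
    for labels in permutations(range(1, len(vertices) + 1)):
        omega = dict(zip(vertices, labels))
        seen.add(orientation_from_labeling(edges, omega))
    return seen

k4 = (range(4), list(combinations(range(4), 2)))
c5 = (range(5), [(i, (i + 1) % 5) for i in range(5)])
p4 = (range(4), [(0, 1), (1, 2), (2, 3)])

print(len(labeling_classes(*k4)))  # complete graph: 4! = 24
print(len(labeling_classes(*c5)))  # circuit: 2**5 - 2 = 30
print(len(labeling_classes(*p4)))  # path (a tree): 2**3 = 8
```

Every orientation produced this way is acyclic, since labels strictly decrease along each arc, which is the observation the text uses to reduce the question to Corollary 1.3.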

2. Enumeration of labeled acyclic digraphs

Corollary 1.3, when combined with a result of Read (also obtained by Bender and Goldman), yields an immediate solution to the problem of enumerating labeled acyclic digraphs with $n$ vertices. The same result was obtained by R.W. Robinson (to be published), who applies it to the unlabeled case.

Proposition 2.1. Let $f(n)$ be the number of labeled acyclic digraphs with $n$ vertices. Then

$$\sum_{n=0}^{\infty} f(n)\, \frac{x^n}{n!\, 2^{\binom{n}{2}}} = \left( \sum_{n=0}^{\infty} (-1)^n \frac{x^n}{n!\, 2^{\binom{n}{2}}} \right)^{-1}.$$

Proof. By Corollary 1.3,

$$f(n) = (-1)^n \sum_G \chi(G, -1), \tag{2}$$

where the sum is over all labeled graphs $G$ with $n$ vertices. Now, Read [3] (see also [1]) has shown that if

$$M_n(k) = \sum_G \chi(G, k)$$

(where the sum has the same range as in (2)), then

$$\sum_{n=0}^{\infty} M_n(k)\, \frac{x^n}{n!\, 2^{\binom{n}{2}}} = \left( \sum_{n=0}^{\infty} \frac{x^n}{n!\, 2^{\binom{n}{2}}} \right)^{k}. \tag{3}$$

Actually, the above papers have $2^{n^2/2}$ where we have $2^{\binom{n}{2}}$; this amounts to the transformation $x' = 2^{1/2} x$. One advantage of our 'normalization' is that the numbers $n!\, 2^{\binom{n}{2}}$ are integers; a second is that the function

$$F(x) = \sum_{n=0}^{\infty} \frac{x^n}{n!\, 2^{\binom{n}{2}}}$$

satisfies the functional relation $F'(x) = F(\tfrac{1}{2}x)$. A third advantage is mentioned in the next section. Thus setting $k = -1$ and changing $x$ to $-x$ in (3) yields the desired result.
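Proposition 2.1 determines $f(n)$ recursively: writing $w(n) = n!\,2^{\binom{n}{2}}$, the coefficients $b_n = f(n)/w(n)$ and $a_n = (-1)^n/w(n)$ satisfy $\sum_{k=0}^{n} a_k b_{n-k} = 0$ for $n \ge 1$, with $b_0 = 1$. The sketch below uses this to compute the first few values with exact rational arithmetic and compares them with a direct enumeration of all digraphs on a few labeled vertices. The function names are hypothetical, and the brute-force count is only feasible for very small $n$.

```python
from fractions import Fraction
from itertools import product
from math import comb, factorial

def weight(n):
    """The normalizing integer n! * 2^C(n,2)."""
    return factorial(n) * 2 ** comb(n, 2)

def acyclic_digraph_counts(n_max):
    """f(0), ..., f(n_max) from the reciprocal of sum_n (-1)^n x^n / weight(n)."""
    a = [Fraction((-1) ** n, weight(n)) for n in range(n_max + 1)]
    b = [Fraction(1)]
    for n in range(1, n_max + 1):
        b.append(-sum(a[k] * b[n - k] for k in range(1, n + 1)))
    return [int(b[n] * weight(n)) for n in range(n_max + 1)]

def is_acyclic(n, arcs):
    """True if the digraph on vertices 0..n-1 has no directed cycle."""
    remaining = set(range(n))
    arcs = set(arcs)
    while remaining:
        sources = {v for v in remaining if all(head != v for _, head in arcs)}
        if not sources:
            return False
        remaining -= sources
        arcs = {(tail, head) for tail, head in arcs if tail in remaining}
    return True

def brute_force(n):
    """Count labeled acyclic digraphs on n vertices by checking every arc set."""
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    count = 0
    for mask in product((0, 1), repeat=len(pairs)):
        arcs = [p for p, bit in zip(pairs, mask) if bit]
        if is_acyclic(n, arcs):
            count += 1
    return count

print(acyclic_digraph_counts(5))
print([brute_force(n) for n in range(5)])
```

Exact fractions are used so that the division by $w(n)$ and the final multiplication cancel without rounding.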

By analyzing the behavior of the function $F(x) = \sum_{n=0}^{\infty} x^n / n!\, 2^{\binom{n}{2}}$, we obtain estimates for $f(n)$. For instance, Rouché's theorem can be used to show that $F(x)$ has a unique zero $\alpha \approx -1.488$ satisfying $|\alpha| \le 2$. Standard techniques yield the asymptotic formula

$$f(n) \sim C\, 2^{\binom{n}{2}}\, n!\, (-\alpha)^{-n},$$

where $\alpha$ is as above and $1.741 \approx C = 1/\alpha F(\tfrac{1}{2}\alpha)$. A more careful analysis of $F(x)$ will yield more precise estimates for $f(n)$.

3. An algebra of binomial type

The existence of a combinatorial interpretation of the coefficients $M_n(k)$ in the expansion

$$\left( \sum_{n=0}^{\infty} \frac{x^n}{n!\, 2^{\binom{n}{2}}} \right)^{k} = \sum_{n=0}^{\infty} M_n(k)\, \frac{x^n}{n!\, 2^{\binom{n}{2}}}$$

suggests the existence of an algebra of full binomial type with structure constants $B(n) = 2^{\binom{n}{2}}\, n!$ in the sense of [2]. This is equivalent to finding a locally finite partially ordered set $P$ (said to be of full binomial type), satisfying the following conditions:
(a) In any segment $[x, y] = \{z \mid x \le z \le y\}$ of $P$ (where $x \le y$ in $P$), every maximal chain has the same length $n$. We call $[x, y]$ an n-segment.
(b) There exists an n-segment for every integer $n \ge 0$ and the number of maximal chains in any n-segment is $B(n) = 2^{\binom{n}{2}}\, n!$. (In particular, $B(1)$ must equal 1, further explaining the normalization $x' = 2^{1/2} x$ of Section 2.)

If such a partially ordered set $P$ exists, then by [2] the value of $\zeta^k(x, y)$, where $\zeta$ is the zeta function of $P$, $k$ is any integer and $[x, y]$ is any n-segment, depends only on $k$ and $n$. We write $\zeta^k(x, y) = \zeta^k(n)$. Then again from [2],

$$\sum_{n=0}^{\infty} \zeta^k(n)\, \frac{x^n}{B(n)} = \left( \sum_{n=0}^{\infty} \frac{x^n}{B(n)} \right)^{k}.$$

Hence $\zeta^k(n) = M_n(k)$. In particular, the cardinality of any n-segment $[x, y]$ is $M_n(2)$, the number of labeled two-colored graphs with $n$ vertices; while $\mu(x, y) = (-1)^n f(n)$, where $\mu$ is the Möbius function of $P$ and $f(n)$ is the number of labeled acyclic digraphs with $n$ vertices. The general theory developed in [2] provides a combinatorial interpretation of the coefficients of various other generating functions, such as $\left(\sum_{n=1}^{\infty} x^n/B(n)\right)^k$ and $\left(2 - \sum_{n=0}^{\infty} x^n/B(n)\right)^{-1}$.

Since $M_n(2)$ is the cardinality of an n-segment, this suggests taking elements of $P$ to be properly two-colored graphs. We consider a somewhat more general situation.

Proposition 3.1. Let $V$ be an infinite vertex set, let $q$ be a positive integer and let $P_q$ be the set of all pairs $(G, \sigma)$, where $G$ is a function from all 2-sets $\{u, v\} \subseteq V$ ($u \ne v$) into $\{0, 1, \dots, q - 1\}$ such that all but finitely many values of $G$ are 0, and where $\sigma: V \to \{0, 1\}$ is a map satisfying the condition that if $G(\{u, v\}) > 0$ then $\sigma(u) \ne \sigma(v)$ and that $\sum_{u \in V} \sigma(u) < \infty$. If $(G, \sigma)$ and $(H, \tau)$ are in $P_q$, define $(G, \sigma) \le (H, \tau)$ if:
(a) $\sigma(u) \le \tau(u)$ for all $u \in V$, and
(b) If $\sigma(u) = \tau(u)$ and $\sigma(v) = \tau(v)$, then $G(\{u, v\}) = H(\{u, v\})$.
Then $P_q$ is a partially ordered set of full binomial type with structure constants $B(n) = n!\, q^{\binom{n}{2}}$.

Proof. If $(H, \tau)$ covers $(G, \sigma)$ in $P$ (i.e., if $(H, \tau) > (G, \sigma)$ and no $(G', \sigma')$ satisfies $(H, \tau) > (G', \sigma') > (G, \sigma)$), then

$$\sum_{u \in V} \tau(u) = 1 + \sum_{u \in V} \sigma(u).$$

From this it follows that in every segment of $P$, all maximal chains have the same length.

In order to prove that an n-segment $S = [(G, \sigma), (H, \tau)]$ has $n!\, q^{\binom{n}{2}}$ maximal chains, it suffices to prove that $(H, \tau)$ covers exactly $n q^{n-1}$ elements of $S$, for then the number of maximal chains in $S$ will be $(n q^{n-1})((n-1) q^{n-2}) \cdots (2 q^1) \cdot 1 = n!\, q^{\binom{n}{2}}$. Since $S$ is an n-segment, there are precisely $n$ vertices $v_1, v_2, \dots, v_n \in V$ such that $\sigma(v_i) = 0 < 1 = \tau(v_i)$. Suppose $(H, \tau)$ covers $(H', \tau') \in S$. Then $\tau'$ agrees with $\tau$ except at one of the $v_i$, say $v_1$, where $\tau'(v_1) = 0$. Now suppose $H'(\{u, v\}) > 0$, where we can assume $\tau'(u) = 0$, $\tau'(v) = 1$. If $v$ is not some $v_i$, then $\sigma(u) = 0$, $\sigma(v) = 1$, so $H'(\{u, v\}) = G(\{u, v\})$. If $v = v_i$ ($2 \le i \le n$) and $u$ is not $v_1$, then $\tau(u) = 0$, $\tau(v) = 1$, so $H'(\{u, v\}) = H(\{u, v\})$. Hence $H'(\{u, v\})$ is completely determined unless $u = v_1$ and $v = v_i$, $2 \le i \le n$. In this case, each $H'(\{v_1, v_i\})$ can have any one of $q$ values. Thus, there are $n$ choices of $v_1$ and $q$ choices for each $H'(\{v_1, v_i\})$, $2 \le i \le n$, giving a total of $n q^{n-1}$ elements $(H', \tau') \in S$ covered by $(H, \tau)$.

Observe that when $q = 1$, condition (b) is vacuous, so $P_1$ is isomorphic to the lattice of finite subsets of $V$. When $q = 2$, we may think of $G(\{u, v\}) = 0$ or 1 depending on whether $\{u, v\}$ is not or is an edge of a graph on the vertex set $V$. Then $\sigma$ is just a proper two-coloring of $V$ with the colors 0 and 1, and the elements of $P_2$ consist of all properly two-colored graphs with vertex set $V$, finitely many edges and finitely many vertices colored 1. We remark that $P_q$ is not a lattice unless $q = 1$.

References

[1] E.A. Bender and J. Goldman, Enumerative uses of generating functions, Indiana Univ. Math. J. 20 (1971) 753-765.
[2] P. Doubilet, G.-C. Rota and R. Stanley, On the foundations of combinatorial theory: The idea of generating function, in: Sixth Berkeley symposium on mathematical statistics and probability (1972) 267-318.
[3] R. Read, The number of k-colored graphs on labelled nodes, Canad. J. Math. 12 (1960) 410-414.
[4] R. Stanley, A chromatic-like polynomial for ordered sets, in: Proc. second Chapel Hill conference on combinatorial mathematics and its applications (1970) 421-427.
[5] R. Stanley, Ordered structures and partitions, Mem. Am. Math. Soc. 119 (1972).
[6] R. Stanley, A Brylawski decomposition for finite ordered sets, Discrete Math. 4 (1973) 77-82.


Reprinted from ARCHIV DER MATHEMATIK, Vol. XXIV, 1973, Fasc. 3
BIRKHÄUSER VERLAG, BASEL UND STUTTGART

Valuations on Distributive Lattices I

By LADNOR GEISSINGER

Introduction. We continue the study, begun by G.-C. Rota, of the valuation ring of a distributive lattice. This ring is the representing object for all valuations on the lattice. In the locally finite case Rota established a connection with the incidence algebra of the set of join-irreducible elements, from which he derived interesting results about the Euler characteristic and Möbius function associated with some geometric objects. In this paper we give new proofs of some of his results, and extend others. In part I we discuss general properties of the valuation module and ring of a lattice, and determine their structure for a finite geometric lattice. We then describe the duality between maps of finite distributive lattices and of finite posets. This makes it easy to characterize finite projective distributive lattices, construct the free distributive lattice on a finite poset, and determine what properties of lattice homomorphisms correspond to strict and residuated maps of posets. We also use the valuation ring to give a construction for the coproduct of distributive lattices. In part II we will determine the structure and mapping properties of valuation rings and Möbius algebras. We use these to prove some theorems of Rota on Möbius functions, an identity due to Klee, and theorems on extending finitely additive measures.

1. The Valuation Module of a Lattice. A function $f$ from a lattice $L$ into an abelian group is modular if $f(a \vee b) + f(a \wedge b) = f(a) + f(b)$ for all $a, b \in L$. Following Rota [19], we call any such modular function a valuation. (Birkhoff [2] reserves the term valuation for real-valued modular functions.) In the free abelian group $Z(L)$ on $L$, let $M(L)$ be the subgroup generated by all elements of the form $a \vee b + a \wedge b - a - b$ with $a, b \in L$. Then $V(L) = Z(L)/M(L)$ is the valuation module of $L$ introduced by Rota [19]. Let $i: L \to V(L)$ be the natural induced map. The following characteristic property of $(V(L), i)$ is an immediate consequence of its construction.
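As an illustration of the modular law that defines a valuation, the following minimal sketch checks it by brute force on a small lattice. The choice of lattice (the divisors of 60 under gcd and lcm) and the helper names `is_valuation` and `big_omega` are assumptions made for this example only; they do not come from the paper.

```python
from math import gcd
from itertools import product

def lcm(a, b):
    return a * b // gcd(a, b)

def is_valuation(elements, join, meet, f):
    """True if f(a v b) + f(a ^ b) == f(a) + f(b) for all pairs a, b."""
    return all(f(join(a, b)) + f(meet(a, b)) == f(a) + f(b)
               for a, b in product(elements, repeat=2))

def big_omega(n):
    """Number of prime factors of n, counted with multiplicity."""
    count, d = 0, 2
    while n > 1:
        while n % d == 0:
            n //= d
            count += 1
        d += 1
    return count

# Hypothetical example lattice: divisors of 60, join = lcm, meet = gcd.
divisors = [d for d in range(1, 61) if 60 % d == 0]

print(is_valuation(divisors, lcm, gcd, big_omega))         # True
print(is_valuation(divisors, lcm, gcd, lambda n: n * n))   # False
```

The first function is modular because lcm(a, b) · gcd(a, b) = a · b; the second fails already at a = 2, b = 3.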

Proposition 1. The function $i: L \to V(L)$ is the universal valuation on $L$, that is, $i$ is a valuation and every valuation on $L$ into an abelian group $A$ factors uniquely as $i$ followed by a group homomorphism from $V(L)$ into $A$. Thus the additive group of valuations on $L$ into $A$ can be identified with $\mathrm{Hom}(V(L), A)$.

The functorial properties of $V(L)$ also follow easily from the construction.

Proposition 2. A lattice homomorphism $\varphi: L_1 \to L_2$ induces a unique group homomorphism $\varphi': V(L_1) \to V(L_2)$ such that $\varphi' i_1 = i_2 \varphi$.

The existence of simple types of valuations implies certain structural properties of $V(L)$ and $i(L)$. For example, since every constant function from $L$ into any abelian group is a valuation, by Proposition 1 the elements of $i(L)$ must be non-zero and of infinite order. More useful information about the linear independence of subsets of $i(L)$ can be derived from consideration of 2-valued valuations, or equivalently, of prime ideals of $L$ and their complements, prime filters.

Proposition 3. For any prime ideal or prime filter $F$ of a lattice $L$, its characteristic function $c_F: L \to Z$, which is 1 on $F$ and 0 on $L \setminus F$, is a valuation. Each element of $i(F)$ is linearly independent of the elements of $i(L \setminus F)$ and vice versa, though neither of these sets is necessarily independent.

Proof. It is easy to verify directly the first statement. The second statement follows from the first by Proposition 1.

Proposition 4. The map $i$ is an injection iff $L$ is distributive. When $L$ is distributive, if $\{a_1, \dots, a_r, b\} \subset L$ and if $b$ is not in the interval $[\bigwedge a_i, \bigvee a_i]$, then $b$ is linearly independent of $\{a_1, \dots, a_r\}$ in $V(L)$.

Proof. A well-known theorem of Stone states that any two elements of a distributive lattice can be separated by a prime filter [2, 17], hence they are independent in $V(L)$ by Proposition 3. If $L$ is not distributive it contains distinct elements $c, x, y$ with $c \wedge x = c \wedge y$ and $c \vee x = c \vee y$, from which $i(x) = i(y)$ (since $i(x) = i(c \vee x) + i(c \wedge x) - i(c)$, and likewise for $y$). The condition on $b$ holds iff there is a prime filter separating $b$ from $\{a_1, \dots, a_r\}$.

Later we will give an elementary proof of this proposition which does not depend on Stone's theorem. Whenever we deal with distributive lattices we identify $L$ and $i(L)$. Now $L$ with either of the operations $\vee$ or $\wedge$ is a semigroup, so $Z(L)$ may be considered a semigroup algebra using either $\vee$ or $\wedge$ as multiplication.

Proposition 5. If $L$ is a distributive lattice, $M(L)$ is an ideal of the semigroup algebra $Z(L)$ for both $\vee$ and $\wedge$ and so $V(L)$ is a commutative ring with either (the induced) $\vee$ or $\wedge$ as product. Moreover, for any homomorphism $\varphi: L_1 \to L_2$ of distributive lattices, the extended map $\varphi: V(L_1) \to V(L_2)$ is a homomorphism for both $\vee$ and $\wedge$.

Proof. $(x \vee y + x \wedge y - x - y) \wedge t = (x \wedge t) \vee (y \wedge t) + (x \wedge t) \wedge (y \wedge t) - x \wedge t - y \wedge t$, and similarly for $\vee$.

Corollary. The ring $(V(L), \wedge)$ and the map $i$ are characterized by the following universal property. For any commutative ring $A$ and any map $\beta: L \to A$ for which $\beta(x \wedge y) = \beta(x) \cdot \beta(y)$ and $\beta(x \vee y) = \beta(x) + \beta(y) - \beta(x \wedge y)$, there is a unique ring homomorphism $\alpha: (V(L), \wedge) \to (A, \cdot)$ such that $\alpha \circ i = \beta$.

Suppose $L$ is a distributive lattice. If $L$ does not have a unit (maximal element) $u$ or a zero (minimal element) $z$ we can adjoin such elements to $L$ and the enlarged lattice is still distributive. Since this merely adds onto $V(L)$ one or two copies of $Z$ as direct summands, whenever it is convenient we assume $u, z \in L$. Then $u$ and $z$ are the identities for $\wedge$ and $\vee$ in $V(L)$. The usual augmentation map $\varepsilon: Z(L) \to Z$ given by $\varepsilon(\sum c_i x_i) = \sum c_i$ is a homomorphism for both $\wedge$ and $\vee$ and its kernel $I$, which is generated by all $x - y$ for all $x, y \in L$, contains $M(L)$. Thus $V(L)$ with the induced homomorphism $\varepsilon: V(L) \to Z$ is an augmented algebra relative to both multiplications $\vee$ and $\wedge$. Moreover, in Proposition 5 the ring homomorphism $\varphi$ commutes with the augmentation homomorphisms $\varepsilon$ and carries $I_1$ into $I_2$.

In a sense the augmentation ideal $I$ is the most important part of $V(L)$. Namely, for any $x \in L$, $V(L) = I \oplus Zx$ so that a homomorphism $f: I \to A$ extends uniquely to a valuation on $L$ when an arbitrary value $f(x)$ in $A$ is assigned to $x$. It is often convenient to take $x = z$ the zero of $L$ so that $I \cong V(L)/Zz = \tilde{V}(L)$ naturally represents valuations normalized to take value 0 on $z$ and at the same time $\tilde{V}(L)$ is again a ring for the $\wedge$-multiplication.

In the theory of Boolean algebras there is a duality principle which comes from complementation. For a general distributive lattice $L$ with $u$ and $z$ there is no complementation process in $L$, however there is an endomorphism of $V(L)$ which can be used in much the same way. Namely, the map $\tau(x) = u + z - x$ for all $x \in L$ when extended to an endomorphism of $Z(L)$ carries $M(L)$ into itself and so induces a homomorphism $\tau: V(L) \to V(L)$. $\tau$ could also be described by saying that $\tau(c) = -c$ for all $c$ in the augmentation ideal and that $\tau(z) = u$ or $\tau(u) = z$. The following proposition is the substitute for De Morgan's laws and justifies our subsequent practice of usually ignoring $\vee$ and considering $V(L)$ only with multiplication $\wedge$.

Proposition 6. $\tau: (V(L), \wedge) \to (V(L), \vee)$ is an isomorphism of augmented algebras, and $\tau^2 = \mathrm{id}$. Hence $\tau: (V(L), \vee) \to (V(L), \wedge)$ is also an isomorphism.

Proof. $z + u - x \wedge y = z + u + x \vee y - x - y = (z + u - x) \vee (z + u - y)$ so that $\tau(x \wedge y) = \tau(x) \vee \tau(y)$. Clearly $\tau^2 = \mathrm{id}$ and $\varepsilon\tau = \varepsilon$.

As a further check that $\tau$ is the correct algebraic analogue of complementation, note that if $x \in L$ has a complement $x'$ then $x \vee x' + x \wedge x' - x - x' = 0 = u + z - x - x'$ so that $\tau(x) = x'$.

In any case, in $V(L)$ we always have $x \vee \tau(x) = u$ and $x \wedge \tau(x) = z$ for every $x \in L$. More generally, for any element $x$ in an interval $[v, w]$ of $L$, the element $v + w - x$ in $V(L)$ acts as the relative complement of $x$. A more useful and more familiar form of Proposition 6, again for distributive lattices, is the following.

Proposition 7. For all $x_1, \dots, x_n$ in $L$, $u - \bigvee x_i = \bigwedge (u - x_i)$ in $V(L)$, that is,

$\bigvee x_i = \sum x_i - \sum (x_i \wedge x_j) + \sum (x_i \wedge x_j \wedge x_k) - \cdots \quad (i < j < k < \cdots).$

Proof. $\tau(\bigvee x_i) = z + u - \bigvee x_i = \bigwedge \tau(x_i) = \bigwedge (z + u - x_i) = z + \bigwedge (u - x_i)$ since $z \wedge (u - x_i) = 0$. For a direct proof, use induction beginning with either $u - x \vee y = u + x \wedge y - x - y = (u - x) \wedge (u - y)$ or $x \vee y = x + y - x \wedge y$. Note that in the direct proof $u$ can be replaced by any $v \geq \bigvee x_i$.

Corollary. For any valuation $f$ on $L$,

$f(\bigvee x_i) = \sum f(x_i) - \sum f(x_i \wedge x_j) + \sum f(x_i \wedge x_j \wedge x_k) - \cdots \quad (i < j < k < \cdots).$
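The Corollary to Proposition 7 can be tried out numerically in the simplest setting, where $L$ is a lattice of subsets and the valuation is an additive set function. The sketch below is only an illustration: the function name `inclusion_exclusion`, the sample sets, and the weights are hypothetical choices, not part of the paper.

```python
from itertools import combinations

def inclusion_exclusion(sets, f=len):
    """Alternating sum of f over all nonempty intersections of the given sets."""
    total = 0
    for k in range(1, len(sets) + 1):
        sign = (-1) ** (k + 1)
        for combo in combinations(sets, k):
            total += sign * f(frozenset.intersection(*combo))
    return total

# Hypothetical sample family x_1, x_2, x_3 in the lattice of subsets of {1..6}.
xs = [frozenset({1, 2, 3}), frozenset({3, 4}), frozenset({1, 4, 5, 6})]
union = frozenset.union(*xs)

# Valuation 1: cardinality.
print(len(union), inclusion_exclusion(xs))          # 6 6

# Valuation 2: an additive set function given by (assumed) point weights.
weights = {1: 2, 2: -1, 3: 5, 4: 0, 5: 7, 6: 3}
def weight(s):
    return sum(weights[x] for x in s)

print(weight(union), inclusion_exclusion(xs, weight))  # 16 16
```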

This is well known; in particular, when $f$ is an additive set function on $L = 2^X$ this is the classical inclusion-exclusion formula.

It is somewhat unusual to have to consider two natural ring structures on the same abelian group $V(L)$. The relation between them, derived from $x \vee y = x + y - x \wedge y$ for all $x, y \in L$, is given by $a \vee b = \varepsilon(a) b + \varepsilon(b) a - a \wedge b$ for all $a, b \in V(L)$. But now it is easily checked that if $\varepsilon: A \to k$ is an augmentation of any $k$-algebra $(A, \cdot)$ there is another multiplication on $A$ given by $a * b = \varepsilon(b) a + \varepsilon(a) b - a \cdot b$ for which $(A, *, \varepsilon)$ is an augmented algebra. Moreover, it is clear that the endomorphism $\tau(a) = -a$ of the augmentation ideal carries $a \cdot b$ into $a * b = \tau(a) * \tau(b)$. To see if $\tau$ extends to all of $A$, note the following. If $(A, \cdot)$ has a unit $u$, then $b * u = u * b = \varepsilon(b) u$, and conversely if $z$ is a unit for $(A, *)$ and if $\varepsilon(z) = 1$ then $a \cdot z = z \cdot a = \varepsilon(a) z$. An element $z$ with this property in an augmented algebra $(A, \cdot, \varepsilon)$ has been called an (invariant) integral [13]. If $(A, \cdot, \varepsilon)$ is an augmented algebra with unit $u$ then $\tau$ can be extended to an isomorphism $(A, \cdot) \to (A, *)$ iff there is such an integral in $A$, by letting $\tau(u) = z$ and $\tau(z) = u$.

For any commutative ring $A$ with unit, the set $E$ of all idempotents forms a Boolean algebra with $a \wedge b = a \cdot b$ and $a \vee b = a + b - a \cdot b$ for all $a, b \in E$. Thus Proposition 7 holds for idempotents in $A$. What we have just shown is essentially that when $A$ has an augmentation $\varepsilon$, the set $\{a \in E \mid \varepsilon(a) = 1\}$ (which is always a sublattice of $E$) is a Boolean algebra iff $A$ has an integral $z$. Finally note that if $K$ is any multiplicatively closed subset of $E$, then the sublattice generated by $K$ consists of all elements of the form $\sum a_i - \sum (a_i \wedge a_j) + \sum (a_i \wedge a_j \wedge a_k) - \cdots$ for all finite indexed families $(a_i)$ of elements of $K$.

A generalized Boolean algebra, that is, a distributive lattice $L$ which contains $z$ and is relatively complemented, along with the usual symmetric difference operation $\Delta$ is a group. Moreover, $M(L)$ is then an ideal in the group algebra $(Z(L), \Delta)$ so $V(L)$ with the induced operation $\Delta$ is again a ring. In $V(L)$, $x \,\Delta\, y = x \vee y - x \wedge y + z$ for all $x, y \in L$, and more generally for all $b, c \in V(L)$, $b \,\Delta\, c = b \vee c - b \wedge c + \varepsilon(b)\varepsilon(c) z$. But it is easily checked that this formula for the operation $\Delta$, which makes sense for any distributive lattice $L$, always yields a third ring structure on $V(L)$, even when $\Delta$ is not defined on $L$ or $Z(L)$. Of course if $u \in L$ then $V(L)$ is also a ring with product the operation complementary to $\Delta$, that is, $x \,\Delta'\, y = x \wedge y - x \vee y + u$. We shall not pursue these other operations further; instead we turn to nondistributive lattices.

For a nondistributive lattice $L$, $M(L)$ need not be an ideal in $(Z(L), \wedge)$. The condition for $M(L)$ to be a $\wedge$-ideal is that in $V(L)$ for all $t, x, y \in L$,

$0 = t \wedge (x \vee y) + t \wedge (x \wedge y) - t \wedge x - t \wedge y = t \wedge (x \vee y) - (t \wedge x) \vee (t \wedge y).$

But in any case,

$t \wedge (x \vee y) - (t \wedge x) \vee (t \wedge y) = t + x \vee y - t \vee x \vee y + t \wedge x \wedge y - t \wedge x - t \wedge y$
$= t + x + y - x \wedge y - t \wedge x - t \wedge y - t \vee x \vee y + t \wedge x \wedge y$
$= -t - x - y + x \vee y + t \vee x + t \vee y - t \vee x \vee y + t \wedge x \wedge y$
$= (t \vee x) \wedge (t \vee y) - t \vee (x \wedge y).$

(We have suppressed the $i$'s in $i(t)$, etc.) Thus $M(L)$ is a $\wedge$-ideal iff it is a $\vee$-ideal. Birkhoff [2] calls a valuation $f$ on any lattice distributive if

$f(t \vee x \vee y) - f(t \wedge x \wedge y) = f(x \vee y) + f(t \vee x) + f(t \vee y) - f(t) - f(x) - f(y)$
$= f(x) + f(y) + f(t) - f(t \wedge x) - f(t \wedge y) - f(x \wedge y).$

Thus $M(L)$ is a $\wedge$-ideal iff $i$ is a distributive valuation. By Proposition 1, $i$ is distributive iff every valuation on $L$ is distributive.

For any lattice $L$, let $\bar{M}(L)$ be the subgroup of $Z(L)$ generated by $M(L)$ and all elements of the form $t \wedge [x \vee y + x \wedge y - x - y]$ for all $t, x, y \in L$. Then $\bar{M}(L)$ is the $\wedge$-ideal generated by $M(L)$ and from our computation above it follows that $\bar{M}(L)$ is also the $\vee$-ideal generated by $M(L)$. Thus $\bar{V}(L) = Z(L)/\bar{M}(L)$ is a ring for each of the products $\wedge$ and $\vee$ and since $\bar{M}(L)$ is contained in the augmentation ideal, $\bar{V}(L)$ is an augmented algebra. The induced map $\bar{i}: L \to \bar{V}(L)$ is a homomorphism for both $\wedge$ and $\vee$; thus $\bar{i}(L)$ is closed under both these operations. Hence $\bar{i}(L)$ is a lattice and $\bar{i}$ is a lattice homomorphism. Moreover, by construction $\bar{i}(t \wedge (x \vee y)) = \bar{i}((t \wedge x) \vee (t \wedge y))$ so $\bar{i}(L)$ is a distributive lattice. Also $\bar{i}$ as a function into $\bar{V}(L)$ is a distributive valuation, and $\bar{V}(L)$ is the valuation ring of $\bar{i}(L)$.

Proposition 8. The function $\bar{i}: L \to \bar{V}(L)$ is the universal distributive valuation on $L$ and $\bar{i}: L \to \bar{i}(L)$ is the universal lattice homomorphism from $L$ into distributive lattices.

Proof. The first part is a consequence of the previously mentioned properties of $\bar{M}(L)$. For the second, suppose $\varphi$ is a lattice homomorphism of $L$ into a distributive lattice $L'$, and $i': L' \to V(L')$ is the natural injection. Then $i' \circ \varphi$ is a distributive valuation and so factors uniquely as $\alpha \circ \bar{i}$ where $\alpha: \bar{V}(L) \to V(L')$ is a group homomorphism. But since $i'$, $\varphi$ and $\bar{i}$ preserve $\wedge$ and $\vee$ and $i' \circ \varphi = \alpha \circ \bar{i}$, then $\alpha$ preserves $\wedge$ and $\vee$ on $\bar{i}(L)$, and hence on all of $\bar{V}(L)$. Thus $\alpha$ is both a lattice and ring homomorphism.

The result stated in the Corollary to Proposition 7 is now seen to hold more generally for distributive valuations on any lattice.

Examples. For the modular 5-element lattice $M_5$ it is easily checked that $V(M_5)$ is free of rank 2 while $\bar{V}(M_5)$ is free of rank 1. For the nonmodular 5-element lattice $N_5$, $V(N_5) = \bar{V}(N_5)$ and the rank is 3.

If $L_1, L_2$ are lattices with minimal elements $z_1, z_2$ respectively, a valuation $f$ on $L_1 \times L_2$ is determined by $f(z_1, z_2)$ and the normalized valuations $f_1(x) = f(x, z_2) - f(z_1, z_2)$ on $L_1$ and $f_2(y) = f(z_1, y) - f(z_1, z_2)$ on $L_2$. Conversely, given normalized valuations $f_i$ on $L_i$ into an abelian group $A$ and an element $c$ in $A$, the function $f(x, y) = c + f_1(x) + f_2(y)$ is a valuation on $L_1 \times L_2$ into $A$. Thus the augmentation subgroup $I(L_1 \times L_2)$ is the direct sum $I(L_1) \oplus I(L_2)$ and

$V(L_1 \times L_2) \cong I(L_1) \oplus I(L_2) \oplus Z(z_1, z_2) \cong [V(L_1) \oplus V(L_2)]/Z(z_2 - z_1).$

Similarly,

$\bar{V}(L_1 \times L_2) \cong [\bar{V}(L_1) \oplus \bar{V}(L_2)]/Z(z_2 - z_1).$
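The ranks quoted in the Examples for the two 5-element lattices can be recomputed mechanically: the rank of $Z(L)$ modulo a subgroup is $|L|$ minus the rank of a generating set of that subgroup. The following is a minimal sketch under assumptions of my own: the lattices are encoded by hypothetical cover-pair lists on the integers 0..4, rank is computed over the rationals, and the helper names are invented for the example. It reports ranks only and says nothing about freeness.

```python
from fractions import Fraction
from itertools import product

def make_lattice(n, covers):
    """Meet and join for a lattice on 0..n-1 described by pairs (a, b) with a < b."""
    leq = [[i == j for j in range(n)] for i in range(n)]
    for a, b in covers:
        leq[a][b] = True
    for k, i, j in product(range(n), repeat=3):      # transitive closure, k outermost
        if leq[i][k] and leq[k][j]:
            leq[i][j] = True
    def meet(a, b):
        lower = [c for c in range(n) if leq[c][a] and leq[c][b]]
        return next(c for c in lower if all(leq[d][c] for d in lower))
    def join(a, b):
        upper = [c for c in range(n) if leq[a][c] and leq[b][c]]
        return next(c for c in upper if all(leq[c][d] for d in upper))
    return meet, join

def rank(rows):
    """Rank of an integer matrix over Q, by Gaussian elimination."""
    rows = [[Fraction(v) for v in r] for r in rows]
    rk = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][col] != 0:
                factor = rows[r][col] / rows[rk][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk

def valuation_ranks(n, covers):
    """Return (rank of Z(L)/M(L), rank of Z(L) modulo the ideal generated by M(L))."""
    meet, join = make_lattice(n, covers)
    def vec(terms):
        v = [0] * n
        for coeff, elem in terms:
            v[elem] += coeff
        return v
    # Generators a v b + a ^ b - a - b of M(L).
    modular = [vec([(1, join(x, y)), (1, meet(x, y)), (-1, x), (-1, y)])
               for x, y in product(range(n), repeat=2)]
    # Extra generators t ^ [x v y + x ^ y - x - y].
    extra = [vec([(1, meet(t, join(x, y))), (1, meet(t, meet(x, y))),
                  (-1, meet(t, x)), (-1, meet(t, y))])
             for t, x, y in product(range(n), repeat=3)]
    return n - rank(modular), n - rank(modular + extra)

# Encoding assumed here: 0 = zero, 4 = unit.
M5 = [(0, 1), (0, 2), (0, 3), (1, 4), (2, 4), (3, 4)]   # three atoms
N5 = [(0, 1), (1, 2), (2, 4), (0, 3), (3, 4)]           # chain 0<1<2<4 plus 3

print(valuation_ranks(5, M5))   # (2, 1)
print(valuation_ranks(5, N5))   # (3, 3)
```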

Application. Let $L$ be a finite geometric lattice which is connected [3], that is, which is not isomorphic to a direct product of two geometric lattices. Then for any two copoints (= coatoms) $x, y$ there is by the path theorem [3] a connected path $x = x_0, x_1, \dots, x_r = y$ of copoints from $x$ to $y$, where connected path means that, for each $i$, $x_{i-1} \wedge x_i$ is a coline above which is at least one copoint $t_i$ different from $x_{i-1}$ and $x_i$. Thus $x_{i-1}, t_i, x_i$ generate a copy of $M_5$ and so any valuation $f$ on $L$ must take the same value on all $x_i$ and $t_i$, hence on all copoints. Now for any element $y$ of rank $k$ there are elements $x, t$ with $x$ of rank $k + 1$ and $t$ a copoint such that $y = x \wedge t$ and $x \vee t = u$. Thus if all flats of rank $k + 1$ have the same $f$ value, then the same is true of all flats of rank $k$ and so by induction downward $f$ is constant on the flats of any given rank, in particular, on the points. Since a flat $x$ of rank $k$ is the join of $k$ independent points it is easy to see that $f(x) - f(z) = k(f(p) - f(z))$ for any point $p$. That is, the valuation $g(x) = f(x) - f(z)$ is given by $g(x) = [\mathrm{rk}(x)]\, g(p)$.

The lattice $L$ is modular iff the rank function is a valuation. Hence if $L$ is modular geometric then $V(L)$ is free abelian on two generators $p, z$ and the universal valuation $i: L \to V(L)$ is given by $i(x) = [\mathrm{rk}(x)](p - z) + z$. When $L$ is nonmodular there are flats $x, y, t$ such that $x \vee t = y \vee t$, $x \wedge t = y \wedge t$, $x < y$ and $\mathrm{rk}(y) = \mathrm{rk}(x) + 1$. Hence for the valuation $g$ above $[\mathrm{rk}(x)]\, g(p) = g(x) = g(y) = [\mathrm{rk}(x) + 1]\, g(p)$ and so $g(p) = 0$. Thus for nonmodular $L$ all valuations are constant so that $V(L)$ is free abelian on one generator $z$ and $i(x) = z$ for all $x$ in $L$. Clearly then $V(L) = \bar{V}(L)$ in the nonmodular case.

For modular $L$, since $L$ is connected and atomic it cannot be distributive unless it contains just one point, in which case $L$ is the two element lattice and $V(L) = \bar{V}(L)$. So when $L$ is modular and not distributive there are at least two points (atoms) $\{p, q\}$ and since $i(p) = i(q)$ then $\bar{i}(p) = \bar{i}(q) = \bar{i}(p \wedge q) = \bar{i}(p \vee q)$. That is, in the group homomorphism from $V(L)$ to $\bar{V}(L)$ the point $p$ must be identified with $z$ so that $\bar{V}(L)$ is free on the single generator $z$.

Finally, for any finite geometric lattice $L$, if $L$ is expressed as the product of connected geometric lattices, then $V(L)$ and $\bar{V}(L)$ are free abelian groups and rank $V(L) = 1 + (\#$ of modular connected components$)$ while rank $\bar{V}(L) = 1 + (\#$ of 1-point components [isthmuses]$)$.

2. Finite Posets and Distributive Lattices. We collect together here some facts we shall need about maps of finite posets and distributive lattices. With only slight modification most of the results hold also for infinite posets and distributive lattices which have a zero and are locally finite, that is, in which every interval is a finite set.

A subset $J$ (possibly empty) of an ordered set (poset) $(P, \leq)$ is called an order ideal (order filter) if $y \in J$ and $x \leq y$ ($y \leq x$) imply $x \in J$. The set $J(P)$ of all order ideals of $P$ is a sublattice of $2^P$ with unit $u = P$ and zero $z = \emptyset$. An element $x$ of a join-semilattice $L$ is join-irreducible if $x = r \vee s$ implies $x = r$ or $x = s$. The set $P(L)$ of all (including $z$) join-irreducible elements of $L$ is a poset with the induced order. If a poset $P$ has a zero $z$, the poset $P \setminus z$ will be denoted by $\hat{P}$.

For any finite distributive lattice $L$, the map $x \to \sigma(x) = \{p \in \hat{P}(L) \mid p \leq x\}$ is a lattice isomorphism of $L$ onto $J(\hat{P}(L))$ [2, 7]. Dually, for any finite poset $M$, the principal order ideals $(m] = \{k \in M \mid k \leq m\}$ are precisely the nonzero elements of $P(J(M))$ so that $m \to (m]$ is an order isomorphism of $M$ onto $\hat{P}(J(M))$. In a distributive lattice $L$ which is finite or just satisfies DCC, a filter (order filter closed under $\wedge$) is prime (its complement is closed under $\vee$, i.e. is an ideal) iff it is principal and its minimal element is join-irreducible. Hence $\sigma(x)$ may be identified with the set of all prime filters containing $x$ or all prime ideals not containing $x$, and $\sigma$ is then the Stone representation of the distributive lattice $L$ by a lattice of sets [2, 7].

By a $(u, z)$-homomorphism between lattices we mean a map which preserves $(u, z, \vee, \wedge)$. The following equivalences of pairs of categories will be much used.
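The representation of a finite distributive lattice by the order ideals of its nonzero join-irreducibles can be checked on a small case. The sketch below is an illustration under assumptions: the lattice is taken to be the divisors of 12 (join = lcm, meet = gcd, zero = 1), and the names `sigma`, `P_hat` and `order_ideals` are chosen for this example rather than taken from the paper.

```python
from math import gcd
from itertools import combinations

N = 12
L = [d for d in range(1, N + 1) if N % d == 0]   # divisors of 12; the zero is 1

def lcm(a, b):
    return a * b // gcd(a, b)

def is_join_irreducible(x):
    """x = r v s implies x = r or x = s (so the zero also counts)."""
    return all(x in (r, s) for r in L for s in L if lcm(r, s) == x)

P = [x for x in L if is_join_irreducible(x)]      # includes the zero
P_hat = [p for p in P if p != 1]                  # zero removed

def sigma(x):
    """Nonzero join-irreducibles lying below x."""
    return frozenset(p for p in P_hat if x % p == 0)

def order_ideals(poset, leq):
    """All down-closed subsets of a finite poset, as a set of frozensets."""
    ideals = set()
    for k in range(len(poset) + 1):
        for subset in combinations(poset, k):
            s = frozenset(subset)
            if all(q in s for p in s for q in poset if leq(q, p)):
                ideals.add(s)
    return ideals

J = order_ideals(P_hat, lambda q, p: p % q == 0)

print(sorted(P_hat))                               # [2, 3, 4]
print({sigma(x) for x in L} == J)                  # True: sigma is onto J
print(all(sigma(lcm(a, b)) == sigma(a) | sigma(b) and
          sigma(gcd(a, b)) == sigma(a) & sigma(b)
          for a in L for b in L))                  # True: sigma is a lattice map
```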


Proposition 9. The category of finite distributive lattices and $(u, z)$-homomorphisms ($u$-homomorphisms) is equivalent to the dual of the category of finite posets (with $z$) and order homomorphisms (preserving $z$).

Proof. We prove both equivalences simultaneously. If $\beta: M_1 \to M_2$ is an order homomorphism of finite posets then taking inverse images we get a map $\beta': J(M_2) \to J(M_1)$ which is a $(u, z)$-homomorphism of finite distributive lattices. Moreover, if the $M_i$ both have zeros then the $J(\hat{M}_i)$ are still distributive lattices and $\beta'$ maps $J(\hat{M}_2)$ into $J(\hat{M}_1)$ iff $\beta(z_1) = z_2$. In this case $\beta': J(\hat{M}_2) \to J(\hat{M}_1)$ is a $u$-homomorphism. The association $M \to J(M)$ (or $M \to J(\hat{M})$) and $\beta \to \beta'$ is a contravariant functor from the poset category to the distributive lattice category. Note that for $A \in J(M_2)$, $\beta'(A) = \bigvee \{(t] \mid (\beta(t)] \subseteq A\}$. In the opposite direction we have the contravariant functor $P$ which associates to each $u$-homomorphism $\lambda: L_1 \to L_2$ of finite distributive lattices the order homomorphism $\lambda^*: P(L_2) \to P(L_1)$ given by $\lambda^*(p_2) = \bigwedge \{x \in L_1 \mid \lambda(x) \geq p_2\}$. Clearly $\lambda(z_1) = z_2$ iff $\lambda^*$ maps $\hat{P}(L_2)$ into $\hat{P}(L_1)$. The conclusion now follows from easy computations involving composites and the isomorphisms $L \cong J(\hat{P}(L))$ and $M \cong \hat{P}(J(M))$ mentioned above.

For generalizations and related results see [6].

Using the concrete correspondences above it is easy to investigate the properties of special $(u, z)$-homomorphisms of finite distributive lattices. For example, $\lambda$ is a monomorphism iff $\lambda^*$ is an epimorphism and it is easy to check that an epi in the category of finite posets is just a surjective order homomorphism. But it is also obvious that $\lambda$ is a monomorphism iff it is injective. For each $q \in \hat{P}(L)$ we let $q^0$ denote the unique maximal element in $L$ which lies below $q$. Then for $\lambda: L_1 \to L_2$ as above, we have $\lambda^*(p) = q$ iff $p \leq \lambda(q)$ and $p \not\leq \lambda(q^0)$. That is, $q \in \lambda^*(\hat{P}(L_2))$ iff $\lambda(q) > \lambda(q^0)$, and $\lambda^*$ is surjective iff $\lambda(q) > \lambda(q^0)$ for all $q \in \hat{P}(L_1)$.

Dually, $\lambda$ is an epimorphism iff $\lambda^*$ is a monomorphism and clearly a monomorphism in the category of posets is just an injective order homomorphism. Among injective order homomorphisms $\beta: M_1 \to M_2$ are the strict maps, those for which $p \leq q$ iff $\beta(p) \leq \beta(q)$, i.e. $M_1$ is isomorphic to $\beta(M_1)$ with the order inherited from $M_2$. For any map $\beta$, if $\beta(p) \leq \beta(q)$ then for any $A \in J(M_2)$ such that $q \in \beta'(A)$ we have also $p \in \beta'(A)$. Consequently, if $p \not\leq q$ then $(q]$ is not in the image of $\beta'$. Thus $\beta'$ is a surjective epimorphism iff $\beta$ is a strict map. We will later derive two other conditions which are equivalent to $\lambda$ or $\beta'$ being an epimorphism.

It is now easy to prove a theorem of Balbes which characterizes "projective" distributive lattices [1, 7]. In the category of finite distributive lattices and $(u, z)$-homomorphisms $L$ is weakly projective if for any $\alpha: L_1 \to L_2$ which is surjective and any $\beta: L \to L_2$, there is a $\gamma: L \to L_1$ such that $\alpha\gamma = \beta$. In the category of finite posets and order homomorphisms $M$ is weakly injective if for any $\alpha: M_1 \to M_2$ which is strict and any $\beta: M_1 \to M$, there is a $\gamma: M_2 \to M$ such that $\gamma\alpha = \beta$.

Proposition 10 (Balbes). For a finite distributive lattice $L$ the following are equivalent:
(i) $L$ is weakly projective,
(ii) $\hat{P}(L)$ is weakly injective,
(iii) $\hat{P}(L)$ is a lattice.

Proof. From our remarks above (i) and (ii) are equivalent. If $M$ is weakly injective and if in the defining property above we take $M_1 = M$, $M_2$ a lattice, $\alpha$ a strict embedding, and $\beta$ the identity, then a retraction $\gamma$ of $M_2$ onto $M$ must exist. It follows that $M$ must be a lattice, provided that there is such a map $\alpha$. But the natural map $a \to (a]$ is a strict embedding of $M$ into the lattice $J(M)$. Conversely, if $M$ is a lattice, $\beta: M_1 \to M$ any order homomorphism, and $\alpha: M_1 \to M_2$ a strict map, define $\gamma$ by $\gamma(m_2) = \bigvee \{\beta(m_1) \mid \alpha(m_1) \leq m_2\}$. It is easy to see that $\gamma$ preserves order and because $\alpha$ is strict also $\gamma\alpha = \beta$.

The referee has pointed out that the equivalence of (ii) and (iii) for an arbitrary poset in place of $\hat{P}(L)$ (and with "complete" inserted in (iii)) is due to Banaschewski and Bruns: Arch. Math. 18, 369-377 (1967).

Another useful result comes from the observation that for an order homomorphism $\beta: M_1 \to M_2$ the induced map $\beta': J(M_2) \to J(M_1)$ takes $\hat{P}(J(M_2))$ into $\hat{P}(J(M_1))$ iff for each $m_2 \in M_2$ there is an $\alpha(m_2) \in M_1$ such that $\beta'((m_2]) = (\alpha(m_2)]$. In this case $\alpha, \beta$ constitute a Galois connection [18] between $M_1$ and $M_2$. (Each of $\alpha$ and $\beta$ is said to be residuated.) Namely, $\alpha(m_2) = \sup\{m_1 \mid \beta(m_1) \leq m_2\}$ and $\beta(m_1) = \inf\{m_2 \mid \alpha(m_2) \geq m_1\}$, and $\alpha\beta$ is a closure operator on $M_1$ and $\beta\alpha$ is an interior operator on $M_2$. In terms of distributive lattices $L_1$ and $L_2$, the result above states that a map $\lambda: \hat{P}(L_2) \to \hat{P}(L_1)$ can be extended to a lattice $(u, z)$-homomorphism of $L_2$ into $L_1$ iff $\lambda$ has the same properties as $\alpha$ above, that is, the $\lambda$-preimage of every principal order filter in $\hat{P}(L_1)$ is a principal order filter in $\hat{P}(L_2)$.

It is clear that the product $L_1 \times L_2$ of finite distributive lattices is the categorical product in both of the lattice categories under consideration and $\hat{P}(L_1 \times L_2)$ is isomorphic to the disjoint union $\hat{P}(L_1) \cup \hat{P}(L_2)$ and $P(L_1 \times L_2)$ to the one point join $P(L_1) \cup P(L_2)/\{z_1, z_2\}$, which are the coproducts of the $\hat{P}(L_i)$ and $P(L_i)$ in the poset categories. Dually the categorical product is the Cartesian product in both poset categories so the coproduct (free distributive product) of $L_1$ and $L_2$ must be isomorphic to $J(\hat{P}(L_1) \times \hat{P}(L_2))$ and $J(\widehat{P(L_1) \times P(L_2)})$ respectively in the lattice categories. We will see shortly that $V(L_i)$ is free with rank $|P(L_i)|$ and so $V(J(\widehat{P(L_1) \times P(L_2)}))$ has rank $|P(L_1)| \cdot |P(L_2)|$. This suggests that $V(L_1) \otimes V(L_2)$ might be used as a model for $V(J(\widehat{P(L_1) \times P(L_2)}))$.

Suppose $L_1, L_2$ are any distributive lattices with units (not necessarily finite), then considering the $V(L_i)$ as augmented algebras with $\wedge$-multiplication, their coproduct is $V(L_1) \otimes V(L_2)$ with the natural embeddings $\alpha_i$ given by $\alpha_1(y_1) = y_1 \otimes u_2$ and $\alpha_2(y_2) = u_1 \otimes y_2$. The multiplicative semigroup generated by $\alpha_1(L_1) \cup \alpha_2(L_2)$ is $\{y_1 \otimes y_2 \mid y_i \in L_i\}$ and these are idempotents in $\varepsilon^{-1}(1)$. The distributive lattice $L$ generated by this semigroup consists, as we saw before, of all

$\sum (y_i \otimes t_i) - \sum (y_i \wedge y_j \otimes t_i \wedge t_j) + \sum (y_i \wedge y_j \wedge y_k \otimes t_i \wedge t_j \wedge t_k) - \cdots$

with $i < j < k < \cdots$ for all finite indexed families $(y_i)$ in $L_1$ and $(t_i)$ in $L_2$. Fortunately, we never need to use this expression for elements of $L$.

Proposition 11. The coproduct of $L_1$ and $L_2$ in the category of $u$-homomorphisms of distributive lattices with unit is $(L, \alpha_1, \alpha_2)$, and $V(L) \cong V(L_1) \otimes V(L_2)$.

Proof. For any $L_3$, if $\beta_i: L_i \to L_3$ are $u$-homomorphisms, then the induced $\beta_i: V(L_i) \to V(L_3)$ are algebra homomorphisms, so there is a unique algebra map $\gamma: V(L_1) \otimes V(L_2) \to V(L_3)$ given by $\gamma(y \otimes t) = \gamma(y \otimes u_2)\,\gamma(u_1 \otimes t) = \beta_1(y) \wedge \beta_2(t)$ such that $\gamma\alpha_i = \beta_i$. For any $s, t \in L$ we know $\varepsilon(s) = \varepsilon(t) = 1 = \varepsilon(\gamma(s)) = \varepsilon(\gamma(t))$ so that $s \vee t = s + t - s \wedge t$ and $\gamma(s \wedge t) = \gamma(s \cdot t) = \gamma(s)\gamma(t) = \gamma(s) \wedge \gamma(t)$ and $\gamma(s \vee t) = \gamma(s) + \gamma(t) - \gamma(s \wedge t) = \gamma(s) \vee \gamma(t)$. Thus $\gamma$ is a $u$-homomorphism of $L$ and $\gamma(L) \subseteq L_3$ because $L_3$ contains the image $\gamma(\alpha_i(L_i)) = \beta_i(L_i)$ of the generators $\alpha_i(L_i)$ of $L$. By the Corollary to Proposition 5, the inclusion of $L$ into $V(L_1) \otimes V(L_2)$ extends uniquely to a ring homomorphism of $V(L)$ onto $V(L_1) \otimes V(L_2)$. On the other hand, the ring homomorphisms $\alpha_i: V(L_i) \to V(L)$ yield a ring homomorphism of $V(L_1) \otimes V(L_2)$ onto $V(L)$ which is the identity on $L$. Hence $V(L) \cong V(L_1) \otimes V(L_2)$.

Since $J$ takes the category of finite posets into a subcategory of itself we may apply $J$ twice to get a covariant functor which associates to each order homomorphism $\beta: M_1 \to M_2$ the $(u, z)$-homomorphism $\beta'': J^2(M_1) \to J^2(M_2)$ given, for principal ideals $(A]$, by $\beta''((A]) = \{B \in J(M_2) \mid \beta'(B) \subseteq A\}$ for every $A \in J(M_1)$. For each finite poset $M$ let $\gamma: M \to J^2(M)$ be given by $\gamma(m) = \{A \in J(M) \mid m \notin A\}$. Then $\gamma$ provides a natural transformation from the identity to the functor $J^2$ since for $\beta: M_1 \to M_2$ we have

$\beta''\gamma(m) = \beta''(\{A \in J(M_1) \mid m \notin A\}) = \bigcup \{\beta''((A]) \mid m \notin A\}$
$= \{B \in J(M_2) \mid m \notin \beta'(B)\} = \{B \in J(M_2) \mid \beta(m) \notin B\} = \gamma\beta(m).$

Lemma. The sublattice $F(M)$ generated by $\gamma(M)$ contains all elements of $J^2(M)$ except for the unit and zero. If $L$ is a distributive lattice, there is a unique lattice homomorphism $\pi: J^2(L) \to L$ such that $\pi\gamma = \mathrm{id}$.

Proof. If $C \in J(M)$ and $C \neq M$, then

$\bigcap \{\gamma(m) \mid m \notin C\} = \{A \in J(M) \mid m \notin A \text{ for all } m \notin C\} = \{A \in J(M) \mid A \subseteq C\} = (C].$

Thus $F(M)$ contains all principal ideals in $J^2(M)$ except $(M]$, and doesn't contain $\emptyset$ or $(M]$ since $\emptyset \in \gamma(m)$ and $M \notin \gamma(m)$ for all $m \in M$. If $L$ is a distributive lattice, the map $\lambda: \hat{P}(L) \to J(L)$ given by $\lambda(p) = \{x \in L \mid p \not\leq x\}$ is order preserving. The induced lattice homomorphism $\lambda': J^2(L) \to J(\hat{P}(L))$ composed with the isomorphism $J(\hat{P}(L)) \cong L$ yields a lattice homomorphism $\pi: J^2(L) \to L$ given by $\pi((A]) = \bigvee \{p \in \hat{P}(L) \mid \lambda(p) \subseteq A\}$ for all $A \in J(L)$. Thus

$\pi\gamma(y) = \bigvee \{p \in \hat{P}(L) \mid \lambda(p) \subseteq \lambda(y)\} = y$ for all $y \in L$.

Proposition 12. For any finite poset $M$, $(F(M), \gamma)$ is the free distributive lattice on $M$. That is, every order homomorphism $\beta$ of $M$ into a distributive lattice factors uniquely through $F(M)$ as $\gamma$ followed by a lattice homomorphism, namely the homomorphism $\pi\beta''$ in case the lattice is finite.

Proof. For finite $L$, $\pi\beta''\gamma = \pi\gamma\beta = \beta$ by the lemma and the fact that $\gamma$ is a natural transformation. $\pi\beta''$ is unique since $\gamma(M)$ generates $F(M)$. If $L$ is infinite just replace $L$ by a finite sublattice of $L$ containing $\beta(M)$.

The preceding is a natural generalization of the construction due to Skolem of the free distributive lattice on a finite number of generators [2].
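By the Lemma, $F(M)$ is $J^2(M)$ with its unit and zero removed, so the size of the free distributive lattice on a small poset can be obtained by counting order ideals twice. The following sketch does this by brute force; the function names and the sample posets are assumptions for the example, and the enumeration is exponential, so it is only meant for very small $M$.

```python
from itertools import combinations

def order_ideals(elements, leq):
    """All order ideals (down-closed subsets) of a finite poset, as frozensets."""
    elements = list(elements)
    ideals = []
    for k in range(len(elements) + 1):
        for subset in combinations(elements, k):
            s = frozenset(subset)
            if all(q in s for p in s for q in elements if leq(q, p)):
                ideals.append(s)
    return ideals

def free_distributive_size(elements, leq):
    """|F(M)| = |J(J(M))| - 2, where J(M) is ordered by inclusion."""
    J1 = order_ideals(elements, leq)
    J2 = order_ideals(J1, lambda a, b: a <= b)
    return len(J2) - 2

def antichain(a, b):
    return a == b            # no two distinct elements are comparable

def chain(a, b):
    return a <= b            # usual order on integers

print(free_distributive_size([0, 1], antichain))      # 4
print(free_distributive_size([0, 1, 2], antichain))   # 18
print(free_distributive_size([0, 1], chain))          # 2
```

For an antichain this recovers the familiar counts for the free distributive lattice on two and three generators (without adjoined unit and zero); for a two-element chain the free distributive lattice is the chain itself.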


References

[1] R. BALBES, Projective and injective distributive lattices. Pacific J. Math. 21, 405-420 (1967).
[2] G. BIRKHOFF, Lattice Theory. Third ed., Providence 1967.
[3] H. CRAPO and G.-C. ROTA, Combinatorial Geometries. Cambridge (Mass.) 1970.
[4] R. L. DAVIS, Order Algebras. Bull. Amer. Math. Soc. 76, 83-87 (1970).
[5] J. FOLKMAN, The homology groups of a lattice. J. Math. Mech. 16, 631-636 (1966).
[6] L. GEISSINGER and W. GRAVES, The category of complete algebraic lattices. J. Combinatorial Theory (A) 13, 332-338 (1972).
[7] G. GRÄTZER, Lattice Theory. San Francisco 1971.
[8] C. GREENE, On the Möbius Algebra of a Partially Ordered Set. Proc. Conf. on Möbius Algebras, University of Waterloo 1971.
[9] P. R. HALMOS, Measure Theory. Princeton 1950.
[10] P. HILTON and S. WYLIE, Homology Theory. Cambridge 1960.
[11] V. KLEE, The Euler characteristic in combinatorial geometry. Amer. Math. Monthly 70, 119-127 (1963).
[12] A. HORN and A. TARSKI, Measures in Boolean algebras. Trans. Amer. Math. Soc. 64, 467-497 (1948).
[13] R. LARSON and M. SWEEDLER, An associative orthogonal bilinear form for Hopf algebras. Amer. J. Math. 91, 75-94 (1969).
[14] H. M. MACNEILLE, Partially ordered sets. Trans. Amer. Math. Soc. 42, 416-460 (1937).
[15] B. PETTIS, Remarks on the extension of lattice functionals. Bull. Amer. Math. Soc. 64, 471 (1948).
[16] B. PETTIS, On the extension of measures. Ann. of Math. 64, 186-197 (1951).
[17] H. RASIOWA and R. SIKORSKI, The Mathematics of Metamathematics. Warsaw 1963.
[18] G.-C. ROTA, On the foundations of combinatorial theory. I. Z. Wahrscheinlichkeitstheorie Verw. Gebiete 2, 340-368 (1964).
[19] G.-C. ROTA, On the combinatorics of the Euler characteristic. In: Studies in Pure Mathematics, pp. 221-233. London 1971.
[20] L. SOLOMON, The Burnside algebra of a finite group. J. Combinatorial Theory 2, 603-615 (1967).
[21] E. SPANIER, Algebraic Topology. New York 1966.

Received August 24, 1970 *)

Author's address: Ladnor Geissinger, Mathematics Department, University of North Carolina, Chapel Hill, North Carolina 27514, USA

*) A revised version was received September 1, 1972.

Offprint from ARCHIV DER MATHEMATIK, Vol. XXIV, 1973, Fasc. 4

BIRKHÄUSER VERLAG, BASEL UND STUTTGART

Valuations on Distributive Lattices II By LADNOR GEISSINGER

Introduction. We continue the study begun in part I [Arch. Math. 24, 230-239 (1973)] of the valuation ring of a finite distributive lattice. We show that it is the Mobius algebra of the set of join.irreducible elements and we derive Solomon's formula for idempotents. We use the duality between posets and distributive lattices given in part I to derive mapping properties of Mobius algebras. From this we get theorems on extending finitely additive measures, theorems of Rota concerning Mobius functions, an identity due to Klee, a factorization theorem of Stanley and Greene, and results on the characteristic valuation of a distributive lattice. 3. Finite Valuation Rings and Mobius Algebras. If L is a distributive lattice with DCC (minimum condition) and P(L) is the set of all (including the zero z) joinirreducible elements of L, then every element of L can be uniquely expressed as a finite irredundant join of elements of P(L). For each x E L the set a(x) = {pEP(L)\p ~x}

is a finitely generated order ideal in P(L) and x = Va(x). Also the prime dual ideals (filters) in L are precisely the principal filters generated by elements of P(L). Theorem 1. For every distributive lattice L with DCC, the valuation ring V(L) is a free abelian group with P(L) as basis. That is, every valuation on L is determined by its values on P(L) and these values can be assigned arbitrarily. Proof. For every xEL there are {PI, ... ,Pr}c P(L) such that x = VPt. Then by Prop. 7, in V(L) we have x=Vpt=LPt-LPtIlPJ+LPtIlPJIIPTe .... If x ¢ P (L) then each of the summands PI II .•• II PTe is strictly below x in L. Hence by induction upward on L (DCC) we conclude ~hat every x in L is a linear combination in V(L) of a finite number of elements in a(x). Thus P(L) generates V(L). For any {PI, ... , PTe} c P(L), if Pr is maximal among them then the remaining Pt are in the complement of the prime filter generated by Pr and so Pr is independent of the rest by Prop. 3. Thus P(L) is an independent set in V(L). Since this theorem is the principal result upon which the remainder of our discussion is based, we sketch alternative proofs. Perhaps the most natural procedure is to attempt to prove the statement about valuations directly without using the valuation ring. The chief difficulty is to show that any function v from P(L) into 22

Archiv der Mathematik XXIV. L. GEISSINGER, ARCH. MATH.

an abelian group A can be extended to a valuation on L. One can easily define by induction on L an extension of the function v to L by using the unique representation of an element x as an irredundant join $\bigvee p_i$ of elements of P(L) and then setting $v(x) = \sum v(p_i) - \sum v(p_i \wedge p_j) + \cdots$. However, to show that this extension is a valuation requires a rather delicate induction argument.

For another proof, when L is locally finite, we can use instead the valuations $v_p$ of Proposition 3 for each $p \in P(L)$, which take the value 1 for all $x \ge p$ and otherwise the value 0. Every function $f: P(L) \to A$ then determines a valuation $v_f$ on L by $v_f(x) = \sum_p f(p)\, v_p(x) = \sum_{p \le x} f(p)$. By Möbius inversion over P(L) we can choose f so that $v_f$ and any given v agree on P(L). One can also easily show that the $v_p$ are independent. Hence if L is finite we get yet another proof by observing that V(L) is generated by P(L) and its dual has $|P(L)|$ independent functionals and so again V(L) must be free on P(L). See also the proof by Greene [8].

Corollary. If P(L) is a $\wedge$-semilattice, then it is a $\wedge$-subsemilattice of L and $(V(L), \wedge)$ is the semigroup algebra of $(P(L), \wedge)$ over Z.

For a distributive lattice L with DCC, the elements of L correspond to the finitely generated order ideals in P(L) and these are closed under finite intersection. L is then locally finite iff P(L) is locally finite, and iff for each p in P(L) there is a unique maximal element $p^0$ in L lying below p. When L is locally finite, for each p in P(L) let $e_p = p - p^0$ (in V(L)) and for the zero z let $e_z = z$. Parts of the following theorem appear in a paper by Davis [4] in which he showed that the valuation ring V(L) is isomorphic to the Möbius algebra of P(L) as defined by Solomon [20].

Theorem 2. Let L be a locally finite distributive lattice with zero, and let $\mu$ be the Möbius function of P(L). Then $\{e_p \mid p \in P(L)\}$ is a basis of V(L) consisting of orthogonal idempotents, $x = \sum e_p$ ($p \le x$) for each x in L, $e_p = \sum \mu(r, p)\, r$ ($r \le p$ and r in P(L)) for each p in P(L), and $x = \sum \mu(r, p)\, r$ (p, r in P(L) and $p \le x$) for each x in L.

Proof. For $p \ne q$ in P(L) it is easy to check that $e_p$ and $e_q$ are idempotent and $e_p \wedge e_q = 0$. For p in P(L), $p = e_p + p^0$ and for any x in L which is not in P(L), $x = b \vee c$ where b, c are less than x and in L. Also if $b = \sum e_q$ ($q \le b$) and $c = \sum e_q$ ($q \le c$) then $b \wedge c = \sum e_q$ ($q \le b \wedge c$) since the $e_q$ are orthogonal idempotents. Hence $b \vee c = b + c - b \wedge c = \sum e_q$ ($q \le b \vee c$). Thus by induction upward, $x = \sum e_q$ ($q \le x$) for all x in L. Hence the $\{e_q\}$ form a basis of V(L). The embedding $i: L \to V(L)$ restricted to P(L) and the map e from P(L) into V(L) thus satisfy $i(p) = \sum e_q$ ($q \le p$) and so by Möbius inversion $e_p = \sum \mu(r, p)\, i(r) = \sum \mu(r, p)\, r$ ($r \le p$). It follows that for every x in L, $x = \sum e_p = \sum \mu(r, p)\, r$ ($r \le p \le x$ and r, p in P(L)).

The expression for any x in terms of P(L) was discovered before the other formulas above. A direct proof of this follows. By Theorem 1 for any x in L, $x = \sum d_p\, p$ where $d_p$ is integral and $d_p = 0$ if $p \notin a(x)$. For any q in a(x),

$q = x \wedge q = \Bigl(\sum_{p \ge q} d_p\Bigr)\, q + \sum_{p \not\ge q} d_p\, (p \wedge q)$, and since q is independent of the $p \wedge q < q$ in the second summand, $\sum_{p \ge q} d_p = 1$. This

Vol. XXIV, 1973. Valuations on Distributive Lattices II

is true for each q in a(x), so by Möbius inversion over the finite subset a(x) of P(L) we get $d_r = \sum_{p \le x} \mu(r, p)$.

Corollary. For every q in P(L), $q^0 = -\sum \mu(r, q)\, r$ ($r < q$).

Solomon [20] defined the Möbius algebra M(Q), for any poset Q in which every principal ideal (q] is finite, as the free abelian group on Q with multiplication given by $q \cdot r = \sum \mu(s, t)\, s$ ($s \le t$, $t \le q$ and $t \le r$). If Q has a zero z, then Q is isomorphic to the poset of join-irreducible elements of the lattice J(Q) of nonempty order ideals of Q, and by Theorem 2 then $M(Q) \cong V(J(Q))$. If Q does not have a zero, letting z denote the zero (= empty set) of J(Q), it follows that z generates a 1-dimensional ideal in V(J(Q)) and $M(Q) \cong V(J(Q))/Zz$. In both cases, the spanning orthogonal idempotents in M(Q) are given as above by $e_p = \sum \mu(r, p)\, r$ ($r \le p$) and $p = \sum e_q$ ($q \le p$) for each p in Q.
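Solomon's multiplication and the idempotents $e_p$ can be checked numerically on a small example. The following is a minimal sketch, not taken from the paper: the poset (divisors of 12 under divisibility) and all function names are illustrative assumptions. Elements of M(Q) are stored as dictionaries mapping poset elements to integer coefficients.

```python
from functools import lru_cache
from itertools import product

# Hypothetical example poset Q: the divisors of 12 ordered by divisibility (zero = 1).
Q = [1, 2, 3, 4, 6, 12]

def leq(a, b):
    """Order relation of the example poset: a <= b iff a divides b."""
    return b % a == 0

@lru_cache(maxsize=None)
def mobius(r, p):
    """Moebius function mu(r, p) of Q, by the standard recursion."""
    if r == p:
        return 1
    if not leq(r, p):
        return 0
    return -sum(mobius(r, s) for s in Q if leq(r, s) and leq(s, p) and s != p)

def clean(v):
    """Drop zero coefficients so that equal elements compare equal."""
    return {k: c for k, c in v.items() if c != 0}

def add(x, y):
    """Sum of two elements of M(Q)."""
    out = dict(x)
    for k, c in y.items():
        out[k] = out.get(k, 0) + c
    return clean(out)

def basis_product(q, r):
    """q . r = sum of mu(s, t) s over s <= t, t <= q, t <= r."""
    out = {}
    for t in Q:
        if leq(t, q) and leq(t, r):
            for s in Q:
                if leq(s, t):
                    out[s] = out.get(s, 0) + mobius(s, t)
    return clean(out)

def multiply(x, y):
    """Bilinear extension of basis_product to all of M(Q)."""
    out = {}
    for (q, a), (r, b) in product(x.items(), y.items()):
        for s, c in basis_product(q, r).items():
            out[s] = out.get(s, 0) + a * b * c
    return clean(out)

def idempotent(p):
    """e_p = sum of mu(r, p) r over r <= p."""
    return clean({r: mobius(r, p) for r in Q if leq(r, p)})
```

Because this example poset is a $\wedge$-semilattice, `basis_product(q, r)` reduces to the single element gcd(q, r), in line with the Corollary to Theorem 1.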

Proposition 13. An order morphism $\beta: P \to Q$ of finite posets induces ring homomorphisms $\beta': V(J(Q)) \to V(J(P))$ and $\beta': M(Q) \to M(P)$ given by $\beta'(e_q) = \sum e_p$ ($\beta(p) = q$), where the sum is taken to be 0 if the index set is empty.

Proof. We are identifying Q, P with the nonzero join-irreducible elements of J(Q), J(P). Define a map $\lambda: V(J(Q)) \to V(J(P))$ by $\lambda(e_q) = \sum e_p$ ($\beta(p) = q$). Then for each $A \in J(Q)$, $A = \sum e_q$ ($q \in A$) and so $\lambda(A) = \sum e_p$ ($\beta(p) \in A$) $= \beta'(A)$. Thus $\lambda$ is the map induced on V(J(Q)) by the (u, z)-lattice homomorphism $\beta': J(Q) \to J(P)$. Factoring out $z = \emptyset$ merely deletes $e_z = z$ from the expressions for $e_p$, $e_q$ and $\beta'$, so that in the quotient $\beta': M(Q) \to M(P)$ is given by the same formula.

The conclusion of Prop. 13 still holds if we suppose $\beta$ is only defined on an order ideal in P, or equivalently if $\beta$ maps P into Q with a unit u' adjoined. If moreover P and Q have zeros and if $\beta$ preserves zero, then $\beta': J(Q) \to J(P)$ is a lattice homomorphism and by Prop. 9 every lattice homomorphism $J(Q) \to J(P)$ comes from exactly one such $\beta: P \to Q \cup \{u'\}$. From Prop. 13 it follows that $\beta': V(J(Q)) \to V(J(P))$ is surjective iff $\beta$ is injective and $\beta(P) \subseteq Q$ (nothing maps onto u') while $\beta'$ is injective iff $\beta$ maps onto Q (some elements could still go onto u').

Finally, if $\beta$ maps onto Q, let $\alpha: Q \to P$ be any function such that $\beta\alpha = \mathrm{id}$ (i.e. $\alpha(q) \in \{p \mid \beta(p) = q\}$), and let T be the subgroup of V(J(P)) spanned by $\{e_p \mid p \notin \alpha(Q)\}$. The subring $\beta'(V(J(Q)))$ is spanned by $\{\beta'(e_q)\}$ and clearly $V(J(P)) = \beta'(V(J(Q))) \oplus T$. Furthermore, it is easy to see that when any y in V(J(P)) is expressed as a linear combination of the $\{\beta'(e_q)\}$ and the basis described for T, then every coefficient is 0 or $\pm 1$. Hence if $y \notin \beta'(V(J(Q)))$, we may replace some $e_p$ in the basis for T by y to get a basis for another direct complement of $\beta'(V(J(Q)))$. These results combined with the duality in Prop. 9 and our subsequent discussion of mapping properties yield the following.

Proposition 14. Let $\lambda: K \to L$ be a homomorphism of finite distributive lattices and $\lambda_e: V(K) \to V(L)$ the induced ring homomorphism. Then $\lambda$ is an epimorphism iff $\lambda_e$ is surjective, and $\lambda$ is an injection iff $\lambda_e$ is an injection. In any case, $\lambda_e(V(K))$ is a direct summand of V(L). Moreover for any $y \in L$ with $y \notin \lambda_e(V(K))$ there is a direct complement of $\lambda_e(V(K))$ with a basis which contains y.


Corollary. Let K be a sublattice of a finite distributive lattice L. Then every valuation v of K into an abelian group A can be extended to a valuation of L into A. Moreover, for any element $y \in L$ such that $y \notin V(K)$, there is an extension of v to L for which v(y) is any prescribed value in A.

We will see later precisely what the condition that $y \in L$ and $y \notin V(K)$ means in terms of the lattices K and L. The following interesting result will be used to extend part of Proposition 14 and its Corollary to infinite distributive lattices.

Proposition 15. Let $B_1, \dots, B_r$ be elements of a distributive lattice L. Every linear relation among the $B_k$ which holds in V(L) also holds in V(M) where M is the finite sublattice generated by the $B_k$.

Proof. Suppose that in V(L), $\sum a_k B_k = 0$ where the $a_k$ are integers. From our construction of V(L) this means that $\sum a_k B_k$, when considered back in Z(L), is a linear combination of a finite number of elements of the form

$C_j \vee D_j + C_j \wedge D_j - C_j - D_j$.

Let N be the finite sublattice generated by all the $B_k$, $C_j$, $D_j$. Then in V(N) the relation $\sum a_k B_k = 0$ holds. But N is finite and $M \subseteq N$ so by Proposition 14 the induced map $j: V(M) \to V(N)$ is an injection. Thus any relation among elements of M which holds in V(N) also holds in V(M).

Remark. This immediately yields a proof of Proposition 4 which does not depend on the existence of enough prime ideals, that is, does not make use of Zorn's Lemma.

Proposition 16. For every distributive lattice L and sublattice M, the natural map $j: V(M) \to V(L)$ induced by inclusion is an injection.

Proof. If $B_1, \dots, B_r$ are elements of M, any linear relation among them which holds in V(L) also holds in the valuation ring of the sublattice of L generated by the $B_k$ by Proposition 15. But since this sublattice is contained in M the relation also holds in V(M).

Corollary. Any valuation of M into any rational vector space A (or divisible abelian group) can be extended to a valuation of L into A.

Stronger versions of this were proved by Horn and Tarski [6] and Pettis [9] for bounded real-valued modular functions.

4. Combinatorial Applications. The Characteristic Valuation. For a finite distributive lattice L it is easily shown that the rank function r is given by $r(y) = |\{p \in P(L) \mid z < p \le y\}|$. Using the representation $y = \sum e_p$ ($p \le y$) in V(L) we see that r is the unique valuation on L for which $r(p) - r(p^0) = r(e_p) = 1$ for all $p \in P(L)$ and $r(z) = 0$. The valuation $\chi$ on L for which $\chi(p) = 1$ for all $p \in P(L)$ and $\chi(z) = 0$ is called by Rota the characteristic valuation of L.


Proposition 17. For every $y \in L$, $\chi(y) = -\sum \mu(z, q)$ ($q \in P(L)$ and $z < q \le y$). For the element $p^0$ covered (in L) by some $p \in P(L)$, $\chi(p^0) = 1 + \mu(z, p)$.

Proof. Just apply the homomorphism $\chi$ on V(L) to the expression $y = \sum \mu(p, q)\, p$ ($q \le y$) given in Theorem 2 to get $\chi(y) = \sum \mu(p, q)$ ($z < p \le q \le y$). Then $\chi(y) = -\sum \mu(z, q)$ ($z < q \le y$) and when $y = p^0$,

$\chi(p^0) = -\sum \mu(z, q)$ ($z < q < p$) $= \mu(z, z) + \mu(z, p)$.

Now suppose P(L) is a lattice and y is the join in L of $p_1, \dots, p_r \in P(L)$, then

$(\chi - 1)(y) = (\chi - 1)(\bigvee p_j) = \sum (\chi - 1)(p_j) - \sum (\chi - 1)(p_i \wedge p_j) + \cdots$.

Since $\chi - 1$ is the valuation which takes the value 0 on P(L) and -1 on z, we get $\chi(y) - 1 = c_2 - c_3 + c_4 - c_5 + \cdots$ where $c_k$ is the number of k-element subsets of $p_1, \dots, p_r$ whose meet is z. When $y = q^0$ for some $q \in P(L)$ this yields a result due to Rota [18].

Corollary. Suppose P is a finite lattice and $\{q, p_1, \dots, p_r\} \subseteq P$ is such that all $p_i < q$ and every element covered by q is among the $p_i$. Then $\mu(z, q) = c_2 - c_3 + \cdots$ where $c_k$ is the number of k-subsets of $\{p_i\}$ whose meet is z.

Proof. In J(P), $q^0$ is the join of the $p_i$, hence by Proposition 17, $\chi(q^0) - 1 = \mu(z, q)$.

Following Rota, we indicate how the characteristic valuation is related to the Euler characteristic of combinatorial topology [19]. For any finite totally unordered set S, the nonzero elements of $J(S) = 2^S$ are called simplexes and the nonzero elements of $J^2(S)$ are the finite simplicial complexes with vertices in S. In this setting we usually treat $2^S$ simply as a poset and ignore its lattice structure, but joins and intersections of subcomplexes are important. For any finite distributive lattice L = J(P) we would like to consider the elements of P as simplexes and the elements of J(P) as simplicial complexes. We can approximate to this by constructing an analogue of the barycentric subdivision operator.

The (first) barycentric subdivision of a simplicial complex $K \in J(2^S)$ may be described as the simplicial complex B(K) whose k-simplexes are all chains $A_0 \subset A_1 \subset \cdots \subset A_k$ of k + 1 (non-empty) simplexes $A_i$ of K. By analogy, for any finite poset M let B(M) be the set of all (non-empty) finite chains of elements of M ordered by inclusion. B(M) is an ordered simplicial complex [10] and J(B(M)) is the lattice of all finite subcomplexes of B(M). To each $A \in J(M)$ we associate the subcomplex B(A) whose simplexes are those chains in B(M) all of whose elements are in A, that is, the full subcomplex with vertex set A. Then B is a lattice monomorphism of J(M) into J(B(M)). Note that for $m \in M$, B((m]) is usually not a simplex or even a subdivided simplex but it is topologically as simple since it is the cone with vertex m and base $B(\{m' \in M \mid m' < m\})$ and is thus contractible.

When M consists of all elements of a finite lattice L except for the unit and zero, then the homology of the complex B(M) is called the order homology of L [5, 18, 19]. Now if $\lambda: L_1 \to L_2$ is a (u, z)-homomorphism of finite distributive lattices, $\lambda^*$ carries each chain in $P(L_2)$ into a possibly smaller chain in $P(L_1)$ so that $\lambda^*$ may also be considered as an order homomorphism of $BP(L_2)$ into $BP(L_1)$. Taking preimages we get a (u, z)-homomorphism $B(\lambda)$ of $JBP(L_1)$ into $JBP(L_2)$.


The association of JBP(L) to L and of $B(\lambda)$ to $\lambda$ is a functor from the lattice category into itself, and the lattice monomorphism $B: L \to JBP(L)$ given by $B(y) = B(\{p \in P(L) \mid p \le y\})$ provides a natural transformation from the identity to this functor. It is well-known [10, 21] that the Euler characteristic E in combinatorial topology is a modular function from the lattice of subcomplexes of a simplicial complex into the integers, which is 0 on the empty subcomplex and takes the value 1 on any contractible subcomplex. Thus for any finite distributive lattice L, the composite $EB: L \to JBP(L) \to Z$ is a valuation on L which takes the value 0 on z and the value 1 on P(L), that is, $EB = \chi$ the characteristic valuation on L. From the classical formula for the Euler characteristic, for $y \in L$, $\chi(y) = EB(y) = a_0 - a_1 + a_2 - \cdots$ where $a_k$ is the number of chains of k + 1 elements in P(L) which are less than or equal to y.
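The two descriptions of the characteristic valuation, the Möbius-function formula of Proposition 17 and the alternating chain count, can be compared on a small example. This is a sketch under assumptions that are not in the paper: P is taken to be the divisors of 12 under divisibility with zero z = 1, an element y of L = J(P) is stored as the order ideal of elements below it, and all function names are illustrative.

```python
from functools import lru_cache
from itertools import combinations

# Hypothetical example: P = divisors of 12 under divisibility, with zero Z = 1.
P = [1, 2, 3, 4, 6, 12]
Z = 1

def leq(a, b):
    """a <= b in P iff a divides b."""
    return b % a == 0

@lru_cache(maxsize=None)
def mobius(r, p):
    """Moebius function mu(r, p) of P, by the standard recursion."""
    if r == p:
        return 1
    if not leq(r, p):
        return 0
    return -sum(mobius(r, s) for s in P if leq(r, s) and leq(s, p) and s != p)

def down_set(generators):
    """Order ideal of P generated by the given elements (an element y of J(P))."""
    return {p for p in P if any(leq(p, g) for g in generators)}

def chi_mobius(ideal):
    """chi(y) = - sum of mu(z, q) over nonzero q in the ideal (Proposition 17)."""
    return -sum(mobius(Z, q) for q in ideal if q != Z)

def chi_chains(ideal):
    """chi(y) = a_0 - a_1 + a_2 - ..., a_k = number of (k+1)-element chains of nonzero elements."""
    nonzero = sorted(q for q in ideal if q != Z)
    total = 0
    for size in range(1, len(nonzero) + 1):
        # Divisibility implies numeric order, so a sorted tuple is a chain
        # exactly when consecutive entries are comparable.
        chains = sum(
            1
            for c in combinations(nonzero, size)
            if all(leq(c[i], c[i + 1]) for i in range(size - 1))
        )
        total += (-1) ** (size - 1) * chains
    return total
```

For instance, the ideal generated by 2 and 3 consists of two incomparable nonzero points, and both functions return 2 for it, while every principal ideal of a nonzero element gives 1.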

Klee's Identity and Extensions of Valuations. In a paper on the Euler characteristic [11], Klee proves the following identity.

Proposition 18. Suppose S is a $\wedge$-semilattice, $(a_i)$ and $(b_j)$ are finite indexed families of elements in S, and $u \in S$ such that $u \ge a_i, b_j$ for all i, j. Then in the semigroup ring $Z[S, \wedge]$,

$\prod (u - a_i) + \prod (u - b_j) - \prod (u - a_i \wedge b_j) = \prod (u - a_i) \prod (u - b_j)$.

Proof. We may assume S is finite. From the Corollary to Theorem 1 we see that the identification of an element of S with the ideal it generates yields an isomorphism of $Z[S, \wedge]$ with V(J(S)). But in V(J(S)), the identity above reduces by Proposition 7 to the simple statement

$(u - \bigvee a_i) + (u - \bigvee b_j) - (u - \bigvee \{a_i \wedge b_j\}) = u - (\bigvee a_i) \vee (\bigvee b_j)$

where $\vee$ means join in J(S). And this holds because $\bigvee \{a_i \wedge b_j\} = (\bigvee a_i) \wedge (\bigvee b_j)$ in J(S).
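Klee's identity can be tested directly in a concrete semigroup ring. The sketch below assumes, for illustration only, that S is the family of subsets of a small finite set with intersection as meet and the full set as u; elements of $Z[S, \wedge]$ are dictionaries from frozensets to integer coefficients, and the function names are hypothetical.

```python
from itertools import product

def clean(v):
    """Drop zero coefficients so that equal ring elements compare equal."""
    return {k: c for k, c in v.items() if c != 0}

def add(x, y, sign=1):
    """x + sign * y in the semigroup ring."""
    out = dict(x)
    for k, c in y.items():
        out[k] = out.get(k, 0) + sign * c
    return clean(out)

def multiply(x, y):
    """Product in Z[S, meet], where the meet of two subsets is their intersection."""
    out = {}
    for (a, ca), (b, cb) in product(x.items(), y.items()):
        m = a & b
        out[m] = out.get(m, 0) + ca * cb
    return clean(out)

def product_u_minus(u, family):
    """The product of (u - a) over a in family; the empty product is u."""
    result = {u: 1}
    for a in family:
        result = multiply(result, add({u: 1}, {a: 1}, -1))
    return result

def klee_sides(u, A, B):
    """Return the left and right sides of the identity in Proposition 18."""
    meets = [a & b for a in A for b in B]
    pa, pb = product_u_minus(u, A), product_u_minus(u, B)
    lhs = add(add(pa, pb), product_u_minus(u, meets), -1)
    rhs = multiply(pa, pb)
    return lhs, rhs
```

Calling `klee_sides(u, A, B)` with any families of subsets of u returns two equal dictionaries, which is the identity of Proposition 18 in this special case.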

Klee derived the relation above using the counting result below. For positive integers c, m, n let $P_c(m, n)$ denote the number of relations of cardinality c in $\{1, 2, \dots, m\} \times \{1, 2, \dots, n\}$ with domain $\{1, 2, \dots, m\}$ and range $\{1, 2, \dots, n\}$ (subsets of size c projecting onto all of $\{1, \dots, m\}$ and $\{1, \dots, n\}$ respectively).

Corollary. $-\sum_c (-1)^c P_c(m, n) = (-1)^{m+n}$.
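For small m and n the counting corollary can be confirmed by brute force. This sketch simply enumerates all subsets of the m-by-n grid; the function names are illustrative and indices run from 0 rather than 1.

```python
from itertools import combinations, product

def count_relations(c, m, n):
    """P_c(m, n): subsets of size c of an m-by-n grid meeting every row and every column."""
    cells = list(product(range(m), range(n)))
    rows, cols = set(range(m)), set(range(n))
    return sum(
        1
        for rel in combinations(cells, c)
        if {i for i, _ in rel} == rows and {j for _, j in rel} == cols
    )

def signed_sum(m, n):
    """The left side of the corollary: minus the alternating sum of the P_c(m, n)."""
    return -sum((-1) ** c * count_relations(c, m, n) for c in range(1, m * n + 1))
```

For m = n = 2 the counts are $P_2 = 2$, $P_3 = 4$, $P_4 = 1$, so the signed sum is $-(2 - 4 + 1) = 1 = (-1)^4$.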

Proof. Choose a semilattice S and elements $a_i$, $b_j$ such that all of the meets $a_{i_1} \wedge \cdots \wedge a_{i_s} \wedge b_{j_1} \wedge \cdots \wedge b_{j_r}$ are distinct. Then in the identity in Proposition 18, the expressions above are the coefficients on each side of the identity for any term which is a product of m of the $a_i$ and n of the $b_j$.

The following is a generalization of a theorem by Klee in the same paper [11]. Let L be a lattice and K a $\wedge$-subsemilattice such that every element of L is a finite join of elements of K. For any function f from K into an abelian group, and for any finite family $(a_1, \dots, a_n)$ of elements of K, let

$f(a_1, \dots, a_n) = \sum f(a_i) - \sum f(a_i \wedge a_j) + \sum f(a_i \wedge a_j \wedge a_k) - \cdots$ ($i < j < k \cdots$).
Proposition 19. The function f on K can be extended to a distributive valuation on L iff for all families $(a_i)$, $(b_j)$ in K, if $\bigvee a_i = \bigvee b_j$ then $f(a_1, \dots, a_n) = f(b_1, \dots, b_r)$.

Proof. By our earlier remarks this condition holds if f extends to a distributive valuation on L since then $f(\bigvee a_i) = f(a_1, \dots, a_n)$. So assume f satisfies the condition. Then for any $a \in L$, define f(a) to be $f(a_1, \dots, a_n)$ where the $a_i \in K$ are chosen so that $\bigvee a_i = a$. Extend this function on L linearly to $Z[L, \wedge]$, then

$f(a) = f(a_1, \dots, a_n) = f(\sum a_i - \sum a_i \wedge a_j + \cdots)$.

For $(a_i)$ and $(b_j)$ in K by Klee's identity,

$f(\bigvee a_i) + f(\bigvee b_j) - f(\bigvee \{a_i \wedge b_j\}) = f((\bigvee a_i) \vee (\bigvee b_j))$.

Finally, given a, b in L we can find $(a_i)$, $(b_j)$ in K such that $\bigvee a_i = a$, $\bigvee b_j = b$ and $a \wedge b = \bigvee \{a_i \wedge b_j\}$, hence f is a valuation on L and it is obviously distributive.

Corollary. If L is distributive, f can be extended to a valuation on L iff for all $(a_i)$ in K, if $\bigvee a_i = a$ is again in K then $f(a) = f(a_1, \dots, a_n)$.

Proof. One can easily show for any $(a_i)$ in K, if $a_{n+1} \le \bigvee a_i$ then $f(a_1, \dots, a_n) = f(a_1, \dots, a_{n+1})$ and from this that if $\bigvee a_i = \bigvee b_j$ then $f(a_1, \dots, a_n) = f(b_1, \dots, b_r)$.

See Klee [11] for this and further results.

Factorization in Möbius Algebras. Suppose P is a finite poset with zero and unit u, and K is the set of all elements covered by u. Let $t \in P$, $t \ne u$ and $C = \{k \in K : k \ge t\}$, and identify as usual P with the join-irreducible elements of J(P) = L. In V(L) or M(P), $e_u = u - u^0 = u - \bigvee K = \prod_{k \in K} (u - k)$ and $e_u = \sum \mu(r, u)\, r$. For all c in C, $u - c$ is idempotent and $(u - c) \wedge r = 0$ for all $r \le c$. Thus

$e_u = (u - c) \wedge e_u = (u - c) \wedge \sum_{r \not\le c} \mu(r, u)\, r$

and continuing for all c in C,

$e_u = \prod_{c \in C} (u - c) \wedge e_u = \prod_{c \in C} (u - c) \wedge \bigl(\sum \mu(r, u)\, r\bigr)$

where the sum is over only those $r \in P$ for which $r \not\le \bigvee C$. But $r \not\le \bigvee C$ in J(P) iff in P, $\sup\{r, t\} = u$. Now in the Möbius algebra of the interval [t, u] as poset, we have $\prod_{c \in C} (u - c) = \sum_{r \ge t} \mu(r, u)\, r$, but this may not hold in M(P). In M(P) we have

$\prod_{c \in C} (u - c) = u - \bigvee C = \sum e_p$ ($p \not\le \bigvee C$) and $\sum_{r \ge t} \mu(r, u)\, r = \sum_{r \ge t} \mu(r, u) \sum_{p \le r} e_p$.

The coefficient of $e_p$ in the latter expression is 1 when $p \not\le \bigvee C$, 0 when t and p are comparable and $p \ne u$, and $\sum \mu(r, u)$ ($t \le r$ and $p \le r$) when p and t are incomparable and $p \le \bigvee C$. It follows that, if for every $p \in P$, $\sup\{p, t\}$ exists in P, then in M(P) we have $\sum_{r \ge t} \mu(r, u)\, r = \prod_{c \in C} (u - c)$. This yields the factorization theorem due to Greene [8].

Proposition 20. Suppose P is a finite poset with unit and zero and t in P has the property that for every p in P, the join $p \vee t$ exists in P. Then in the Möbius algebra of P,

$\sum_{r \in P} \mu(r, u)\, r = \Bigl(\sum_{r \ge t} \mu(r, u)\, r\Bigr) \wedge \Bigl(\sum_{r \vee t = u} \mu(r, u)\, r\Bigr)$.
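The factorization can be verified on a small lattice. The sketch below assumes, purely for illustration, that P is the lattice of divisors of 12 (meet = gcd, join = lcm, unit u = 12); since P is then a $\wedge$-semilattice, multiplication in its Möbius algebra is taken to be the meet on basis elements, as in the Corollary to Theorem 1. All names are hypothetical.

```python
from functools import lru_cache
from itertools import product
from math import gcd

# Hypothetical example lattice: divisors of 12, zero 1, unit U = 12.
P = [1, 2, 3, 4, 6, 12]
U = 12

def leq(a, b):
    """a <= b iff a divides b."""
    return b % a == 0

def lcm(a, b):
    """Join in the divisor lattice."""
    return a * b // gcd(a, b)

@lru_cache(maxsize=None)
def mobius(r, p):
    """Moebius function mu(r, p) of P, by the standard recursion."""
    if r == p:
        return 1
    if not leq(r, p):
        return 0
    return -sum(mobius(r, s) for s in P if leq(r, s) and leq(s, p) and s != p)

def clean(v):
    return {k: c for k, c in v.items() if c != 0}

def multiply(x, y):
    """Product in the semigroup algebra of (P, gcd)."""
    out = {}
    for (a, ca), (b, cb) in product(x.items(), y.items()):
        m = gcd(a, b)
        out[m] = out.get(m, 0) + ca * cb
    return clean(out)

def mobius_sum(condition):
    """Sum of mu(r, U) r over those r in P satisfying the condition."""
    return clean({r: mobius(r, U) for r in P if condition(r)})

def check_factorization(t):
    """Compare the full sum with the product of the two partial sums of Proposition 20."""
    whole = mobius_sum(lambda r: True)
    above_t = mobius_sum(lambda r: leq(t, r))
    joins_to_u = mobius_sum(lambda r: lcm(r, t) == U)
    return whole == multiply(above_t, joins_to_u)
```

With t = 3, for example, the two factors are 12 - 6 and 12 - 4, and their product 12 - 6 - 4 + 2 is the full sum.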

Note that the condition on t is equivalent to requiring the injection $[t, u] \to P$ to be residuated, that is, the preimage of every principal filter in P is a principal filter in [t, u]. When P is a lattice, if Q is the $\wedge$-subsemilattice generated by C and u, then the inclusion $Q \to P$ induces a homomorphism $M(Q) \to M(P)$. Thus $\prod_{c \in C} (u - c)$ can be computed in M(P) as $\sum \mu_P(r, u)\, r$ ($r \in P$, $r \ge t$) and in the sub-algebra M(Q) as $\sum \mu_Q(r, u)\, r$ ($r \in Q$). Greene and Stanley [8] use the latter expression in applications of the factorization theorem to geometric lattices.

Residuated Maps. For a finite poset P, even if P has a zero, adjoin a new element z to get a poset $\bar{P} = P \cup \{z\}$ with z as zero. Enlarge the Möbius algebra M(P) to $M(P) \oplus Zz$ and extend multiplication by defining $z^2 = z$ and $M(P) \wedge z = 0$. In this algebra it is easily checked that the elements $\{p + z \mid p \in P\} \cup \{z\}$ multiply precisely as do the elements of $\bar{P}$ in $M(\bar{P})$ so that we may identify $M(\bar{P})$ and $M(P) \oplus Zz$. Now if $\varphi: P \to Q$ is an order morphism of finite posets and each of P and Q is enlarged by adding a zero z, then obviously $\varphi: M(P) \to M(Q)$ is a $\wedge$-homomorphism iff $\bar{\varphi}: M(\bar{P}) \to M(\bar{Q})$ is a $\wedge$-homomorphism. If $\bar{\varphi}$ is a $\wedge$-homomorphism, since $x \vee y = x + y - x \wedge y$ for x, y in J(P) or in J(Q), then $\bar{\varphi}$ is also a lattice homomorphism of J(P) into J(Q) taking only z in $\bar{P}$ onto z in $\bar{Q}$. But following Prop. 10 we saw that $\varphi: P \to Q$ extends to a lattice z-homomorphism $\bar{\varphi}: J(P) \to J(Q)$ iff $\varphi$ has the property that the preimage of each principal filter in Q is a principal filter in P or is empty. This completes the proof of the following proposition.

Proposition 21. A map $\varphi: P \to Q$ of finite posets extends to a homomorphism $\varphi: M(P) \to M(Q)$ of the Möbius algebras iff the preimage of every principal filter in Q is a principal filter in P or is empty.

Note that this condition is satisfied if $\varphi$ is inclusion and P is either an ideal in Q or P is a $\wedge$-subsemilattice if Q is a $\wedge$-semilattice. If $\varphi$ satisfies the condition in Prop. 21, $\varphi$ is half of a Galois connection and the other part is the map $\beta: Q \to P \cup \{u\}$ given by $\beta(q) = \min\{p : \varphi(p) \ge q\}$. From Prop. 13 it follows that the map $\varphi = \beta': M(P \cup \{u\}) \to M(Q)$ is given by $\varphi(e_p) = \sum e_q$ ($\beta(q) = p$). But $\varphi(e_p) = \sum \mu_P(r, p)\, \varphi(r)$ and $\sum_q e_q$ ($\beta(q) = p$) $= \sum_{t, q} \mu_Q(t, q)\, t$ ($\beta(q) = p$). Comparing coefficients of any t in Q yields Rota's principal theorem on Galois connections [8, 18].


Corollary. Suppose $\varphi: P \to Q$ and $\beta: Q \to P$ are order morphisms of finite posets such that $\beta(q) = \min\{p : \varphi(p) \ge q\}$ for each q in Q. Then for each t in Q and p in P,

$\sum_{\beta(q) = p} \mu_Q(t, q) = \sum_{\varphi(r) = t} \mu_P(r, p)$

where a sum is taken to be 0 if the index set is empty.

References

[1] R. BALBES, Projective and injective distributive lattices. Pacific J. Math. 21, 405-420 (1967).
[2] G. BIRKHOFF, Lattice Theory. Third ed., Providence 1967.
[3] H. CRAPO and G.-C. ROTA, Combinatorial Geometries. M.I.T. Press 1970.
[4] R. L. DAVIS, Order Algebras. Bull. Amer. Math. Soc. 76, 83-87 (1970).
[5] J. FOLKMAN, The homology groups of a lattice. J. Math. and Mech. 15, 631-636 (1966).
[6] L. GEISSINGER and W. GRAVES, The category of complete algebraic lattices. J. Combinatorial Theory (A) 13, 332-338 (1972).
[7] G. GRÄTZER, Lattice Theory. San Francisco 1971.
[8] C. GREENE, On the Möbius Algebra of a Partially Ordered Set. Proc. Conf. on Möbius Algebras, University of Waterloo 1971.
[9] P. R. HALMOS, Measure Theory. Princeton 1950.
[10] P. HILTON and S. WYLIE, Homology Theory. Cambridge 1960.
[11] V. KLEE, The Euler characteristic in combinatorial geometry. Amer. Math. Monthly 70, 119-127 (1963).
[12] A. HORN and A. TARSKI, Measures in Boolean algebras. Trans. Amer. Math. Soc. 64, 467-497 (1948).
[13] R. LARSON and M. SWEEDLER, An associative orthogonal bilinear form for Hopf algebras. Amer. J. Math. 91, 75-94 (1969).
[14] H. M. MACNEILLE, Partially ordered sets. Trans. Amer. Math. Soc. 42, 416-460 (1937).
[15] B. PETTIS, Remarks on the extension of lattice functionals. Bull. Amer. Math. Soc. 54, 471 (1948).
[16] B. PETTIS, On the extension of measures. Ann. of Math. 54, 186-197 (1951).
[17] H. RASIOWA and R. SIKORSKI, The Mathematics of Metamathematics. Warsaw 1963.
[18] G.-C. ROTA, On the foundations of combinatorial theory. I. Z. Wahrscheinlichkeitstheorie Verw. Gebiete 2, 340-368 (1964).
[19] G.-C. ROTA, On the combinatorics of the Euler characteristic. In: Studies in Pure Mathematics, pp. 221-223. London 1971.
[20] L. SOLOMON, The Burnside algebra of a finite group. J. Combinatorial Theory 2, 603-615 (1967).
[21] E. SPANIER, Algebraic Topology. New York 1966.

Received October 2, 1972.

Author's address: Ladnor Geissinger, Mathematics Department, University of North Carolina, Chapel Hill, North Carolina 27514, USA


Offprint from ARCHIV DER MATHEMATIK, Vol. XXIV, 1973, Fasc. 5. BIRKHÄUSER VERLAG, BASEL UND STUTTGART

Valuations on Distributive Lattices III

By LADNOR GEISSINGER

Introduction. In Parts I and II [Arch. Math. 24, 230-239, 337-345 (1973)] we were principally interested in combinatorial applications of the valuation ring of a distributive lattice. We now show how this ring provides a natural setting for some elementary results in measure theory as well as some classical results on representations of distributive lattices. Specifically, in the valuation ring V(L) of a distributive lattice L it is easy to identify various extensions of L as well as prime ideals of L and so arrive at some theorems of Pettis, Birkhoff, and Stone. For any faithful representation of L as a lattice of sets the extension of V(L) by real (or complex) scalars is naturally isomorphic to the algebra of simple functions, and the sup norm on the functions comes from an intrinsic norm on $V_R(L)$. The Stone space of L corresponds to the spectrum of $V_R(L)$ with the Zariski topology.

5. Extensions of a Distributive Lattice. It is well-known [9] that the ring (family closed under union and difference) generated by a lattice L of sets consists of all the finite disjoint unions of differences E - F of elements of L. The existence and categorical properties of this and other minimal extensions of a distributive lattice L can be easily deduced using the valuation ring V(L). We noted before that if $w \le x \le y$ in L then the element $w + y - x$ in V(L) acts like the relative complement of x in the interval [w, y] even if such a relative complement does not exist in L. If $w, x_i, y_i$ are elements of L and $w \le x_i \le y_i$ then $(y_1 - x_1) \wedge (y_2 - x_2) = y_1 \wedge y_2 - (x_1 \wedge y_2) \vee (y_1 \wedge x_2)$ and $w \wedge (y_i - x_i) = 0$ in V(L) so that both $y_i - x_i$ and $w + y_i - x_i$ are idempotent elements, $y_1 - x_1$ is orthogonal to $y_2 - x_2$ (the analogue of disjoint sets) iff $y_1 \wedge y_2 \le x_1 \vee x_2$, and

$(w + y_1 - x_1) \wedge (w + y_2 - x_2) = w + y_1 \wedge y_2 - (x_1 \wedge y_2) \vee (y_1 \wedge x_2)$.
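For a lattice of sets, elements of V(L) can be read as integer-valued functions (the identification made precise in Theorem 3 of Section 6), and then $w + y - x$ is the indicator of an honest set. A minimal sketch, with a hypothetical ground set and hypothetical nested sets chosen only for illustration:

```python
# Hypothetical ground set and nested sets w <= x <= y (as subsets of X).
X = range(6)
w = frozenset({0})
x = frozenset({0, 1, 2})
y = frozenset({0, 1, 2, 3, 4})

def indicator(s):
    """Characteristic function of s, stored as a tuple of 0/1 values over X."""
    return tuple(1 if point in s else 0 for point in X)

def combine(*terms):
    """Integer linear combination of indicators; each term is (coefficient, set)."""
    return tuple(
        sum(c * indicator(s)[i] for c, s in terms) for i in range(len(X))
    )

# The element w + y - x, read as a function on X.
element = combine((1, w), (1, y), (-1, x))

# The set-theoretic relative complement of x in the interval [w, y].
relative_complement = w | (y - x)
```

Here `element` coincides with `indicator(relative_complement)`, and the set `relative_complement` meets x in w and joins with x to y, which is what being the complement of x in [w, y] means.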

Let R(L) be the set of all elements of V(L) of the form $s = w + \sum (y_i - x_i)$ where $w \le x_i \le y_i$ in L for $1 \le i \le n$ and the $y_i - x_i$ are mutually orthogonal. If $v \le w$ then also $s = v + (w - v) + \sum (y_i - x_i)$ and $w - v$ is orthogonal to all the $y_i - x_i$. For another element $r = w + \sum (q_j - p_j)$ in R(L) expressed as above, $r \wedge s = w + \sum (y_i - x_i) \wedge (q_j - p_j)$ is again in R(L) since the $(y_i - x_i) \wedge (q_j - p_j) = y_i \wedge q_j - (y_i \wedge p_j) \vee (x_i \wedge q_j)$ are orthogonal and all elements are above w in L. Thus R(L) is closed under the idempotent operation $\wedge$. To show that R(L) is closed under $\vee$, first note that since the $y_i - x_i$ are orthogonal $w + \sum (y_i - x_i) = \bigvee (w + y_i - x_i)$

and dually if $t \ge y_i \ge x_i$ for all i then $t - \sum (y_i - x_i) = \bigwedge (t - y_i + x_i)$ and this element is also in R(L). Now if $q \ge p \ge w$, and s is as above,

$(w + q - p) \vee (w + \sum (y_i - x_i)) = w + \sum (y_i - x_i) + (q - p) - (q - p) \wedge \sum (y_i - x_i) = \sum (y_i - x_i) + w + (q - p) \wedge (t - \sum (y_i - x_i))$

for any t in L such that $t \ge q$ and $t \ge \bigvee y_i$, and since

$w + (q - p) \wedge (t - \sum (y_i - x_i)) = (w + q - p) \wedge (t - \sum (y_i - x_i))$

is in R(L) and $(q - p) \wedge (t - \sum (y_i - x_i))$ is orthogonal to all the $y_i - x_i$ then $(w + q - p) \vee s$ is in R(L). It then follows that for any r, s in R(L) above w in L, $r \vee s$ and $r \vee s - s + w = v$ are both again in R(L) and above w. Thus R(L) is a distributive lattice. In fact, R(L) is "relatively complemented" because if $t \le s \le r$ in R(L) there is a w in L such that $w \le t$ and $v = r - s + w$ is in R(L) so $t \vee v = r - s + t$ is in R(L) and is the complement of s in [t, r].

Proposition 22. For every distributive lattice L, R(L) is the unique minimal relatively complemented distributive extension of L. Every lattice homomorphism of L into a relatively complemented distributive lattice L' extends uniquely to a lattice homomorphism of R(L) into L'.

Proof. If $\varphi: L \to L'$ is a lattice homomorphism it extends uniquely to a ring homomorphism $\varphi: V(L) \to V(L')$. So $\varphi$ extends to a lattice homomorphism of R(L) into R(L') = L' and the extension is unique since R(L) is generated by relative complements $w + y - x$ with $w \le x \le y$ in L and $\varphi(x)$ has a unique relative complement in the interval $[\varphi(w), \varphi(y)]$ in L'.

Note that for the augmentation homomorphism $\varepsilon: V(L) \to Z$, $\varepsilon(R(L)) = 1$. Also, if L has a zero and unit (or just a zero) then R(L) is a Boolean (generalized Boolean) algebra with the same zero and unit. If L does not have a zero or unit and we adjoin them to L to get L' then R(L') will be the minimal (generalized) Boolean algebra generated by L, and V(L') will have rank one or two more than V(L). By our previous discussion of the universal properties of the map $L \to V(L)$ or by Propositions 16 and 22 it follows that the embedding $L \to R(L)$ induces an isomorphism $V(L) \cong V(R(L))$. This gives part of a result due to Pettis [16] (for real valuations see also Smiley, Trans. Amer. Math. Soc. 48 (1944)).

Corollary. If $\lambda$ is a valuation on L into an abelian group A it has a unique extension to a valuation on R(L) into A. If A is a partially ordered abelian group and $\lambda$ is monotone on L then its extension to R(L) is also monotone.

Proof. $\lambda: L \to A$ extends uniquely to a group homomorphism $\lambda: V(L) \to A$ which restricts to a valuation on R(L). For $s \le t$ in R(L) there is a w in L such that $w \le s$ and so $r = t - s + w = w + \sum (y_i - x_i)$ with the $y_i - x_i$ orthogonal and $y_i \ge x_i \ge w$ in L. Thus $\lambda(t) - \lambda(s) = \sum (\lambda(y_i) - \lambda(x_i))$ so if $\lambda$ is monotone on L then it is also on R(L).


6. Representations and Prime Ideals. A form of the following statement seems to be a folk theorem of measure theory.

Theorem 3. If L is a lattice of nonempty subsets of a set X then V(L) is naturally isomorphic to the ring of simple functions S(L) generated over Z by the characteristic functions of the sets in L.

Proof. For each $A \in L$ let $c_A: X \to Z$ be the characteristic function of the subset A of X. Then $A \to c_A$ is a modular function from the lattice L into the ring S(L) which is multiplicative and hence extends uniquely to a ring homomorphism of V(L) onto S(L). To show it is a monomorphism it is necessary to show that if a relation $\sum d_i c_{A_i} = 0$ holds in S(L) then also $\sum d_i A_i = 0$ holds in V(L). If M is a finite sublattice of L containing $A_1, \dots, A_n$, it will be enough by Proposition 15 to show that V(M) is isomorphic to S(M). If B is a join-irreducible element of M and $B^0$ is the maximal element of M properly contained in B, then an element $x \in B \setminus B^0$ is not in any $A \in M$ for which $A \not\supseteq B$. So $c_B$ is independent of $c_A$ for all $A \in M$, $A \not\supseteq B$. It follows as in the proof of Theorem 1 that the $c_B$ for all $B \in P(M)$ are independent and so V(M) and S(M) have the same rank. Since V(M) maps onto S(M) it follows that the map must be an isomorphism.

Another version of this result states that a valuation on such a lattice of sets L extends uniquely to a group homomorphism on S(L). The algebra of simple functions $S_R(L)$ generated over the real numbers R by S(L) is a subalgebra of the Banach algebra B(X) of all bounded real-valued functions on X with the sup-norm. The norm on $S_R(L)$ then yields a norm on $V_R(L) = V(L) \otimes R$ which we shall show is intrinsic, that is, can be defined using only the embedding of L into $V_R(L)$. We defer the definition until after a discussion of prime ideals.
If T is a proper prime ideal in the ring $V_R(L)$ then $T \not\supseteq L$ since L generates $V_R(L)$ as an R-algebra, and if T and L were disjoint then for all $A \le B$ in L, $A \wedge (B - A) = 0$ so $B - A$ is in T which means that T = I the augmentation ideal. Thus if T is not the augmentation ideal then $T \cap L$ is a proper (i.e. not L and nonempty) prime ideal of L. Also for all $A \le B$ in L if A is not in T then $B - A$ is in T so that in $V_R(L)/T$ all elements of the prime filter $L \setminus (T \cap L)$ are identified to the unit of the quotient algebra, which is then isomorphic to R. Thus all prime ideals of $V_R(L)$ are maximal and are the kernels of multiplicative linear functionals $V_R(L) \to R$. If P is a prime ideal of L, the valuation $v_P$ which takes the value 0 on P and 1 on $L \setminus P$ extends to a multiplicative linear functional on $V_R(L)$. Thus the prime ideals of $V_R(L)$ other than I correspond bijectively to the proper prime ideals (or filters) of L, and hence also of R(L). The next proposition implies the existence of enough prime ideals to separate elements of L or R(L) and even stronger separation properties.

Proposition 23. If M is a sublattice of L then every prime ideal of $V_R(M)$ (or M) lifts to a prime ideal of $V_R(L)$ (or L).

Proof. Let T be a prime ideal of $V_R(M)$ not the augmentation ideal and $A \in M \setminus (T \cap M)$ and U the ideal in $V_R(L)$ generated by $T \cap M$. If A were in U then A would be a linear combination of elements $B_i \wedge E_i$, $i = 1, \dots, n$ where $B_i \in T \cap M$ and $E_i \in L$. But $\bigvee (B_i \wedge E_i) \le \bigvee B_i$ and $\bigvee B_i \not\ge A$, so from our earlier results about independence, A must be independent of the $B_i \wedge E_i$. Thus no element of $M \setminus (T \cap M)$ is contained in the ideal U. Also $M \setminus (T \cap M)$ is closed under $\wedge$ since it is a filter in M. It is well-known, and easily proven, that an ideal in a commutative ring which is disjoint from a multiplicatively closed system is contained in a prime ideal with the same property. Thus there is a prime ideal W of $V_R(L)$ which contains U and is disjoint from $M \setminus (T \cap M)$. Since $W \cap V_R(M)$ is a prime ideal in $V_R(M)$ which contains $T \cap M$ and is disjoint from $M \setminus (T \cap M)$, it is clear that $W \cap V_R(M) = T$. Of course if T is the augmentation ideal of $V_R(M)$ it is contained in the augmentation ideal of $V_R(L)$.

Corollary (Birkhoff-Stone [2, 22]). If A is an ideal and B a filter in a distributive lattice L, and if A and B are disjoint, there is a prime ideal T such that $T \supseteq A$ and $(L \setminus T) \supseteq B$.

Proof. Apply the theorem with $M = A \cup B$ and T the prime ideal of $V_R(M)$ corresponding to the prime ideal A in the lattice M, that is, T is generated by A and all $b' - b$ with $b, b' \in B$.

From this corollary with A, B single elements, the non-topological part of Stone's representation theorem [17, 22] follows immediately. That is, if we associate with each element b of L the set of proper prime ideals of L not containing b then this is a faithful representation of L as a lattice of subsets of the set of all prime ideals.

We shall now compare the topology introduced by Stone [22, 17] on the set of prime ideals of L with the usual Zariski topology on the prime spectrum of the ring $V_R(L)$. For any subset $A \subseteq V_R(L)$ let $\mathcal{O}(A)$ be the set of all prime ideals of $V_R(L)$ which do not contain A. Then $\mathcal{O}(A) = \mathcal{O}(B)$ if either B is the ideal in $V_R(L)$ generated by A, or, in case $A \subseteq L$, B is the ideal in L generated by A. The $\mathcal{O}(A)$ are precisely the open sets in the Zariski topology on the set X of prime ideals of $V_R(L)$. A base for this topology consists of the sets $\mathcal{O}(\alpha)$ for all $\alpha \in V_R(L)$.
Suppose $M$ is a finite sublattice of $L$ and let $e_p$ for all $p \in P(M)$ denote the orthogonal idempotents $z_M$ and $p - p^0$ introduced earlier. It follows from Theorem 2 and Prop. 22 that $R(M)$ consists of all elements $e_z + \sum e_p$ $(p \in A)$ for all subsets $A \subseteq P(M)$. For $\alpha = \sum d_p e_p$ in $V_R(M)$ let $e_\alpha = e_z + \sum e_p$ $(p \neq z,\ d_p \neq 0)$, then $e_\alpha \in R(M) \subseteq R(L)$. Moreover, if $\alpha \notin I$ then $\alpha$ and $e_\alpha$ generate the same ideal in $V_R(M)$ and hence also in $V_R(L)$, whereas if $\alpha \in I$ the same is true of $\alpha$ and $e_\alpha - e_z$. Thus for $\alpha \in V_R(M)$, if $\alpha \notin I$ then $\mathcal{O}(\alpha) = \mathcal{O}(e_\alpha)$, and if $\alpha \in I$ then $\mathcal{O}(\alpha) = \{T \in X : z_M \in T \text{ and } e_\alpha \notin T\}$. In the latter case if $L$ has a zero we may assume $z_M$ is the zero of $L$, then

$$\mathcal{O}(\alpha) = \mathcal{O}(e_\alpha) \setminus \{I\}.$$

Theorem 4. Let $L$ be a distributive lattice with zero and let $Y$ be the set of all prime ideals of $V_R(L)$ except for the augmentation ideal, that is, $Y$ is the prime spectrum of the ring $V_R(L)/(z)$. For each nonempty ideal $A$ of the lattice $R(L)$ let $U(A) = \{T \in Y : T \nsupseteq A\}$. Then the map $A \mapsto U(A)$ is an isomorphism of the lattice of ideals of $R(L)$ onto the lattice of all open sets in the Zariski topology on $Y$. The sets $U(a)$ for all $a \in R(L)$ are precisely the open compact subsets of $Y$, and the topology of $Y$ is Hausdorff, locally compact, and totally disconnected. $Y$ is compact iff $L$ has a unit.

Proof. Our computation above shows that for any $\alpha \in V_R(L)$ there is an $e_\alpha \in R(L)$ such that $U(\alpha) = U(e_\alpha)$. Now for any ideal $A$ in $R(L)$, $A$ is the union of principal ideals generated by the elements $b \in A$ and $U(A) = \bigcup U(b)$. It is easy to see that every open set is of the form $U(A)$ for some ideal $A$ in $R(L)$ since for a finite collection $b_1, \ldots, b_n$ of elements of $R(L)$, $U(\bigvee b_j) = \bigcup U(b_j)$. The corollary to Proposition 23 implies that the correspondence is one-to-one. The fact that finitely generated ideals are principal translates into the statement that the only open compact sets are the $U(b)$ for $b$ in $R(L)$. The topology is Hausdorff because $R(L)$ is a generalized Boolean algebra, and the remaining assertions are easily checked.

This representation of $L$ or $R(L)$ by the subsets $U(a)$ of $Y$ differs slightly from the situation described in Theorem 3 since $U(z)$ is the empty set. Here $V_R(L)/(z)$ is isomorphic to the ring $S_R(L)$ of simple functions on $Y$ generated over $R$ by the characteristic functions of the $U(a)$ for all $a \in L$. For $\alpha \in V_R(L)$ or $V_R(L)/(z)$ let $c_\alpha$ denote the corresponding function in $S_R(L)$. Then for any prime ideal $y \in Y$, $c_\alpha(y) = \lambda_y(\alpha)$ where $\lambda_y : V_R(L)/(z) \to R$ is the algebra homomorphism with kernel $y$. Thus the sup-norm on $S_R(L)$ yields a norm on $V_R(L)/(z)$ given by

$$\|\alpha\| = \max\{|\lambda_y(\alpha)| : y \in Y\}.$$

Clearly the functions in $S_R(L)$ are continuous on $Y$, separate points, have bounded support, and for every point of $Y$ there is a function which does not vanish there. Thus by the Stone-Weierstrass theorem, the completion of $S_R(L)$ is the space $C_0(Y)$ of all continuous functions which vanish at $\infty$. Note that if the characteristic function of a set $A \subseteq Y$ is in $C_0(Y)$ then $A$ must be open and compact and so $A = U(a)$ for some $a \in R(L)$, and the function is already in $S_R(L)$. This shows that $R(L)$, but usually not $L$, can be recovered from $V_R(L)$, namely as those elements $\alpha \in V_R(L)$ for which $\lambda_y(\alpha) = 0$ or $1$ for all $y \in Y$ and $\varepsilon(\alpha) = 1$.

Now it is well known that the continuous linear functionals on $C_0(Y)$ correspond to bounded regular Borel measures on $Y$, but we can describe them more simply as follows. Any element $\alpha \in V_R(L)$ can be expressed as

$$\alpha = \sum_{i=1}^{n} d_i b_i + d z$$

where $b_i \in R(L)$, $b_i \neq z$, and $b_i \wedge b_k = z$ for $i \neq k$. Then for any prime ideal $y \in Y$, at most one of the $b_i$ is not in $y$, in which case $\lambda_y(\alpha) = d_i$, and for each $b_i$ there is a prime ideal which does not contain it. Thus $\|\alpha\| = \max\{|d_i|\}$ and so the unit ball in $V_R(L)/(z)$ consists of elements $\alpha = \sum_{i=1}^{n} d_i b_i$ with the $b_i$ as above and $|d_i| \le 1$.

Proposition 24. The linear extension of a bounded $R$-valued modular function $v$ on $R(L)$ with $v(z) = 0$ is continuous on $V_R(L)/(z)$, hence extends uniquely to a continuous linear functional on the completion of $V_R(L)/(z)$.

Proof. Now $v$ is the difference of two bounded nonnegative finitely additive measures, thus we may assume $v$ is nonnegative, finitely additive, and $v(z) = 0$. If $\|\alpha\| \le 1$ with $\alpha = \sum d_i b_i$ as above, then

$$|v(\alpha)| = \Bigl|\sum d_i\, v(b_i)\Bigr| \le \sum |d_i|\, v(b_i) \le \sum v(b_i) = v\Bigl(\bigvee b_i\Bigr).$$

So a bound for $v$ on $R(L)$ is also a bound for the linear extension of $v$ on $V_R(L)/(z)$.

Theorem 5 (Tarski-Pettis [12,15]). Let $M$ be a relatively complemented sublattice of a distributive lattice $L$ and suppose both contain $z$. Then any bounded finitely additive function $v : M \to R$ with $v(z) = 0$ can be extended to a function on $L$ with the same properties. Moreover, for any $b \in L \setminus M$, the value $v(b)$ may be prescribed arbitrarily.

Proof. The injection $V(M) \to V(L)$ induces an isometry $V_R(M)/(z) \to V_R(L)/(z)$ by Prop. 23. Apply the Hahn-Banach theorem to extend the bounded (by Prop. 24) functional $v$ on $V_R(M)/(z)$ to a bounded functional on the completion of $V_R(L)/(z)$. Finally, since $M$ is a generalized Boolean algebra, $M = R(M)$ and, as we saw above, no other element of $R(L)$ except those already in $R(M)$ can be in the completion (closure) of $V_R(M)/(z)$.

Corollary. Even if $M$ is not relatively complemented the conclusion holds provided $v$ is nonnegative, monotone and bounded.

Proof. From the Corollary to Prop. 22 the unique extension of $v$ to $R(L)$ is nonnegative, and it is easily seen that it is bounded.

Finally, if we return to the situation in Theorem 3 where $L$ is (or is represented by) a lattice of nonempty subsets of a set, then $V_R(L) \cong S_R(L)$. For any elements $A_1, \ldots, A_n$ of $L$, any finite sublattice $M$ of $L$ containing all $A_i$, and any $x \in B \setminus B^0$ where $B \in P(M)$ and $B^0$ is maximal in $M$ less than $B$ (or any $x \in B$ if $B = z_M$),

$$\sum d_i\, c_{A_i}(x) = \sum d_i\, (x \in A_i) = \sum d_i\, (B \subseteq A_i) = v_B\Bigl(\sum d_i A_i\Bigr),$$

where a parenthesized condition stands for 1 if it holds and 0 otherwise, and where, as before, $v_B(A)$ is 1 if $A \supseteq B$ and 0 otherwise, that is, the $v_B$ are precisely all multiplicative linear functionals on $V_R(M)$. Furthermore, for any $x$ which is in some $A_i$, the minimal element of $M$ which contains $x$ is such a join-irreducible element $B$. Thus the values of the function $\sum d_i c_{A_i}$ are the numbers $v_B(\sum d_i A_i)$ for all $B \in P(M)$. Since by Prop. 23 each of these extends to all of $V_R(L)$, the sup-norm of $S_R(L)$ can be defined intrinsically in $V_R(L)$ by

$$\Bigl\|\sum d_i A_i\Bigr\| = \max\Bigl\{\Bigl|v_B\Bigl(\sum d_i A_i\Bigr)\Bigr| : B \in P(M)\Bigr\}$$

for any finite sublattice $M$ of $L$ containing the $A_i$. If $\varphi : L \to L'$ is any lattice homomorphism of distributive lattices then it is easy to see that the induced map $\varphi : V_R(L) \to V_R(L')$ is norm-decreasing, and by Prop. 23 it is an isometry iff $\varphi$ is an injection. Completing $V_R(L)$ for each $L$ yields a functor from the category of homomorphisms of distributive lattices to the category of (norm-decreasing) homomorphisms of commutative Banach algebras.
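The identity between pointwise values of $\sum d_i c_{A_i}$ and the numbers $v_B(\sum d_i A_i)$ can be verified numerically on a small lattice of sets. The sketch below is only an illustration under stated assumptions: the sets `A`, the coefficients `d`, and the helper names are hypothetical, the lattice $M$ is generated by closing the $A_i$ under union and intersection, and the least element of $M$ is counted as join-irreducible, matching the convention that $z_M \in P(M)$.

```python
# Assumed example data: three subsets with a common point, real coefficients.
A = [frozenset({0, 1, 2}), frozenset({0, 2, 3}), frozenset({0, 4})]
d = [2.0, -3.0, 0.5]

def generate_lattice(gens):
    """Close a family of sets under pairwise union and intersection."""
    M = set(gens)
    while True:
        new = {X | Y for X in M for Y in M} | {X & Y for X in M for Y in M}
        if new <= M:
            return M
        M |= new

M = generate_lattice(A)

def is_join_irreducible(B):
    """B is not the union of the elements of M strictly below it."""
    below = [C for C in M if C < B]
    return frozenset().union(*below) != B

P = [B for B in M if is_join_irreducible(B)]

def f(x):
    """Value at the point x of the simple function sum_i d_i * c_{A_i}."""
    return sum(di for di, Ai in zip(d, A) if x in Ai)

def v(B):
    """v_B applied to sum_i d_i A_i, that is, sum of d_i over A_i containing B."""
    return sum(di for di, Ai in zip(d, A) if B <= Ai)

points = frozenset().union(*A)
values_pointwise = {f(x) for x in points}
values_functional = {v(B) for B in P}

# Sup-norm computed intrinsically from the join-irreducibles of M.
norm = max(abs(v(B)) for B in P)
```

With these inputs the two value sets coincide, so the maximum over $P(M)$ agrees with the sup-norm of the simple function on the union of the $A_i$.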


References

[1] R. BALBES, Projective and injective distributive lattices. Pacific J. Math. 21, 405-420 (1967).
[2] G. BIRKHOFF, Lattice Theory. Providence 1967.
[3] H. CRAPO and G.-C. ROTA, Combinatorial Geometries. Cambridge 1970.
[4] R. L. DAVIS, Order Algebras. Bull. Amer. Math. Soc. 78, 83-87 (1970).
[5] J. FOLKMAN, The homology groups of a lattice. J. Math. and Mech. 10, 631-636 (1966).
[6] L. GEISSINGER and W. GRAVES, The category of complete algebraic lattices. J. Combinatorial Th. (A) 13, 332-338 (1972).
[7] G. GRÄTZER, Lattice Theory. San Francisco 1971.
[8] C. GREENE, On the Möbius Algebra of a Partially Ordered Set. Proc. Conf. on Möbius Algebras, University of Waterloo 1971.
[9] P. R. HALMOS, Measure Theory. Princeton 1950.
[10] P. HILTON and S. WYLIE, Homology Theory. Cambridge 1960.
[11] V. KLEE, The Euler characteristic in combinatorial geometry. Amer. Math. Monthly 70, 119-127 (1963).
[12] A. HORN and A. TARSKI, Measures in Boolean algebras. Trans. Amer. Math. Soc. 84, 467-497 (1948).
[13] R. LARSON and M. SWEEDLER, An associative orthogonal bilinear form for Hopf algebras. Amer. J. Math. 91, 75-94 (1969).
[14] H. M. MACNEILLE, Partially ordered sets. Trans. Amer. Math. Soc. 42, 416-460 (1937).
[15] B. PETTIS, Remarks on the extension of lattice functionals. Bull. Amer. Math. Soc. 54, 471 (1948).
[16] B. PETTIS, On the extension of measures. Ann. of Math. 64, 186-197 (1951).
[17] H. RASIOWA and R. SIKORSKI, The Mathematics of Metamathematics. Warsaw 1963.
[18] G.-C. ROTA, On the foundations of combinatorial theory. I. Z. Wahrscheinlichkeitstheorie Verw. Gebiete 2, 340-368 (1964).
[19] G.-C. ROTA, On the combinatorics of the Euler characteristic. In: Studies in Pure Math., pp. 221-233. London 1971.
[20] L. SOLOMON, The Burnside algebra of a finite group. J. Combinatorial Th. 2, 603-615 (1967).
[21] E. SPANIER, Algebraic Topology. New York 1966.
[22] M. H. STONE, Topological representations of distributive lattices and Brouwerian logics. Casopis Math. Fys. 87, 1-25 (1937).

Received January 29, 1973

Author's address:
Ladnor Geissinger
Mathematics Department
University of North Carolina
Chapel Hill, North Carolina 27514, USA

Archiv der Mathematik XXIV
