QUO VADIS, GRAPH THEORY?
ANNALS OF DISCRETE MATHEMATICS
General Editor: Peter L. HAMMER Rutgers University, New Brunswick, NJ, USA
Advisory Editors:
C. BERGE, Université de Paris, France
R.L. GRAHAM, AT&T Bell Laboratories, NJ, USA
M.A. HARRISON, University of California, Berkeley, CA, USA
V. KLEE, University of Washington, Seattle, WA, USA
J.H. VAN LINT, California Institute of Technology, Pasadena, CA, USA
G.C. ROTA, Massachusetts Institute of Technology, Cambridge, MA, USA
W.T. TROTTER, Arizona State University, Tempe, AZ, USA
55
QUO VADIS, GRAPH THEORY? A Source Book for Challenges and Directions
Edited by
John GIMBEL University of Alaska Fairbanks, AK, USA
John W. KENNEDY and Louis V. QUINTAS Pace University New York, NY USA
1993
NORTH-HOLLAND
AMSTERDAM - LONDON - NEW YORK - TOKYO
ELSEVIER SCIENCE PUBLISHERS B.V., Sara Burgerhartstraat 25, P.O. Box 211, 1000 AE Amsterdam, The Netherlands
Library of Congress Cataloging-in-Publication Data
Quo vadis, graph theory? : a source book for challenges and directions / edited by John G. Gimbel, John W. Kennedy, and Louis V. Quintas.
p. cm. -- (Annals of discrete mathematics ; 55)
Includes bibliographical references and index.
ISBN 0-444-89441-1 (alk. paper)
1. Graph theory. I. Gimbel, John Gordon. II. Kennedy, J. W. (John W.) III. Quintas, Louis V. IV. Series.
QA166.Q6 1993
511'.5--dc20 93-9334
CIP
Typescript for this volume was prepared in a Macintosh™ environment using FrameMaker™ by KzQ, Pace University, New York, NY 10038, U.S.A.
ISBN: 0 444 89441 1
© 1993 Elsevier Science Publishers B.V. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of the publisher, Elsevier Science Publishers B.V., Copyright & Permissions Department, P.O. Box 521, 1000 AM Amsterdam, The Netherlands.
Special regulations for readers in the U.S.A. - This publication has been registered with the Copyright Clearance Center Inc. (CCC), Salem, Massachusetts. Information can be obtained from the CCC about conditions under which photocopies of parts of this publication may be made in the U.S.A. All other copyright questions, including photocopying outside of the U.S.A., should be referred to the copyright owner, Elsevier Science Publishers B.V., unless otherwise specified.
No responsibility is assumed by the publisher for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions or ideas contained in the material herein.
This book is printed on acid-free paper.
Printed in The Netherlands
FOREWORD
In the spectrum of mathematics, graph theory, as a recognized discipline, is a relative newcomer. The first formal paper is found in the work of Leonhard Euler in 1736. In recent years the subject has grown rapidly so that, in today’s literature, mathematical and scientific, graph theory papers abound with new mathematical developments and significant applications. Three factors, perhaps, account for this explosive growth of the subject:
1) Graph theory provides the natural structures from which to construct mathematical models that are appropriate to almost all fields of scientific (natural and social) enquiry. The underlying subject of study in these fields is some set of “objects” and one or more “relations” between the objects.
2) Graph theory has developed a rich language of terms to render concise the expression of intricate concepts associated with object-relation structures. This facilitates, indeed encourages, interdisciplinary communication of ideas and techniques to the benefit of all fields that use graph theory.
3) Graph theory offers a huge selection of intellectual challenges that range in level from simple exercises for the novice, to deep open questions for the mathematical sophisticate. Many fascinating and compelling questions in graph theory are easy to comprehend, but their complete solutions are elusive. Nevertheless, in pursuit of these solutions, graph theorists are frequently rewarded by achieving results that contribute to further development of the subject.
As with any academic field, it is beneficial periodically to step back and ask: “Where is all this activity taking us?” “What are the outstanding fundamental problems?” “What are the next important steps we should take?” In short, “Quo Vadis, Graph Theory?” Thanks to our contributors, this volume offers a comprehensive reference source for future directions and open questions in graph theory.
The idea for this volume originated together with that for an international discussion meeting, also under the title “Quo Vadis, Graph Theory?” held at the University of Alaska, Fairbanks in August of 1990. By means of discussion, rather than by formal presentation of results, participants considered significant avenues for further exploration in graph theory. This volume is not a proceedings of that meeting; rather, it is a collection of papers written with the discussions of that meeting as background.
The first three papers in the volume are special in that they provide the reader with complementary perspectives on the future of graph theory in general. “Whither Graph Theory?” by William T. Tutte and “The Future of Graph Theory” by Béla Bollobás each take a philosophical approach.
“New Directions in Graph Theory” by Fred S. Roberts offers a comprehensive overview of questions and developments in the subject with an emphasis on applications. It is with these three papers that we recommend that the reader start. The remaining papers are arranged by topic, in the order used in the paper by Roberts. These papers elaborate on the potential for future developments in specific topics of graph theory. Among them the reader will find a rich source of worthwhile and challenging questions that await resolution.
The editors express their thanks to the contributors to this volume; their efforts especially have made this a worthwhile task. Our thanks are also due to the referees for their thorough efforts and useful suggestions. We gratefully acknowledge support for this volume and for the Quo Vadis, Graph Theory? meeting in Alaska provided by The Air Force Office of Scientific Research, The ARCO Foundation, The National Security Agency, The Office of Naval Research and The University of Alaska Fairbanks. Our special thanks are due to Michael Kazlow, Mathematics Department, Pace University for his expertise and dedication while working with us on the many technical and editorial aspects of the preparation of this volume. We also thank Peter L. Hammer and Elsevier Science Publishers for their encouragement in the publication of this work. Finally we thank the University of Alaska Fairbanks and Pace University for their general support of this project.
John Gimbel, University of Alaska Fairbanks John W. Kennedy, Pace University, New York Louis V. Quintas, Pace University, New York August, 1992
CONTENTS
Foreword
Whither graph theory?, W.T. TUTTE
The future of graph theory, B. BOLLOBÁS
New directions in graph theory (with an emphasis on the role of applications), F.S. ROBERTS
A survey of (m, k)-colorings, M. FRICK
Numerical decks of trees, F. GAVRIL, I. KRASIKOV and J. SCHÖNHEIM
The complexity of colouring by infinite vertex transitive graphs, B. BAUSLAUGH
Rainbow subgraphs in edge-colorings of complete graphs, P. ERDŐS and Z. TUZA
Graphs with special distance properties, M. LEWINTER
Probability models for random multigraphs with applications in cluster analysis, E.A.J. GODEHARDT
Solved and unsolved problems in chemical graph theory, A.T. BALABAN
Detour distance in graphs, G. CHARTRAND, G.L. JOHNS and S. TIAN
Integer-distance graphs, R.P. GRIMALDI
Toughness and the cycle structure of graphs, D. BAUER and E. SCHMEICHEL
The Birkhoff-Lewis equations for graph-colorings, W.T. TUTTE
The complexity of knots, D.J.A. WELSH
The impact of F-polynomials in graph theory, E.J. FARRELL
A note on well-covered graphs, V. CHVÁTAL and P.J. SLATER
Cycle covers and cycle decompositions of graphs, C.-Q. ZHANG
Matching extensions and products of graphs, J. LIU and Q. YU
Prospects for graph theory algorithms, R.C. READ
The state of the three color problem, R. STEINBERG
Ranking planar embeddings using PQ-trees, A. KARABEG
Some problems and results in cochromatic theory, P. ERDŐS and J. GIMBEL
From random graphs to graph theory, A. RUCIŃSKI
Matching and vertex packing: How “hard” are they?, M.D. PLUMMER
The competition number and its variants, S.-R. KIM
Which double starlike trees span ladders?, M. LEWINTER and W.F. WIDULSKI
The random f-graph process, K.T. BALIŃSKA and L.V. QUINTAS
Quo vadis, random graph theory?, E.M. PALMER
Exploratory statistical analysis of networks, O. FRANK and K. NOWICKI
The Hamiltonian decomposition of certain circulant graphs, J. LIU
Discovery-method teaching in graph theory, P.Z. CHINN
Index of Key Terms
Quo Vadis, Graph Theory? was also the title used for An International Conference on the Future of Graph Theory held at University of Alaska Fairbanks, August 1990.
Sponsors
The Air Force Office of Scientific Research
The ARCO Foundation
The College of Liberal Arts, UAF
The Department of Mathematical Sciences, UAF
The National Security Agency
The Office of Naval Research
The Vice Chancellor for Academic Affairs, UAF
Organizing Committee
Phyllis Chinn, Humboldt State University, California
John Gimbel, University of Alaska Fairbanks, Alaska
John W. Kennedy, Pace University, New York
Louis V. Quintas, Pace University, New York
Fred S. Roberts, Rutgers University and Rutcor, New Jersey
Local Organizing Committee
Ron Gatterdam
Hannibal Grubis
Dushan Jetvic
Pete Knoke
Laura Lee Potrikus
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 1-4 (1993)
© 1993 Elsevier Science Publishers B.V. All rights reserved.
WHITHER GRAPH THEORY?
William T. TUTTE
Department of Combinatorics and Optimization, University of Waterloo, Waterloo, Ontario, CANADA
Abstract
This is the text of an oration delivered at the conference Quo Vadis, Graph Theory?, held at Fairbanks, Alaska, on August 16, 1990. It enlarges upon the image of a well introduced by R.C. Read at the same Conference. He envisaged graph theorists as situated at the bottom of a well among the graphs of simplest structure, with the more interesting graphs extending upward along the well-shaft and out to the stars.
Friends, Romans, fellow-citizens of the Graphic Republic, mark me well, for it is of a well that I would speak. One of the things that impressed me in the lectures we have heard was the metaphor that as graph theorists we live at the bottom of a well. That, I recall, was the fate of three little girls in a work we all revere [1]. Their names, if I remember rightly, were Elsie, Tillie and Lacie, and they lived in a well. Well in, as the narrator insisted. It was a treacle-well, and they became very ill through consuming nothing but treacle. We do not think our well is a treacle-well; we would rather call it a nectar-well or an ambrosia-well. We subsist upon its product and the unenlightened remark that it makes us mentally very ill. For it fires our imaginations and we sing right merrily of graphs and matroids. Well has it been written: “Their young men shall see visions and their old men shall dream dreams” (Ladies, feel free to read “women” for “men” in that quotation).
A recurring vision and dream has the well well-walled with graphs. Down at the bottom is the null graph. Careful not to step on it! There are small graphs around us and big ones higher up. There are mighty ones miles high. Graphs growing wider still and wider through the leagues and the light years. For it is a deep well. We want to explore that grand array of graphs, and reduce it to order, the order of theorems and algorithms.
There are ways of contacting those graphs. It can be done through the lore of large numbers, as in so many of the theorems of Erdős. Or we can look at the graphs nearby, note regularities, state those regularities as conjectural theorems, and then try to prove those theorems for all graphs, even for those soaring out of sight. It works sometimes, usually by the grace of the principle of mathematical induction. Some of the proved theorems give algorithms, and we can carry through those algorithms step by step for graphs not far away.
But not for the graphs up there in the starry immensities. Even for them we like to assert that the algorithms exist. Moreover, we like to affirm that some of them can be carried out in polynomial time, even though we cannot imagine them being carried out at all. We have paid special attention to algorithms of practical utility, applying to low-lying graphs. In my graph-theoretical dreams I envision someone coming upon me and speaking thus: “Avert thine eyes from the heavens and see the graphs that may bring thee treasures on Earth. Be thou not like Thales of old who, gazing fixedly at the stars, fell into a well”. One can only reply “Thou warn’st me too late. I am in a well already. Well in”. But he is a prophet of a possible future for Graph Theory.
Mind you, in some moods I have much sympathy for him. I do find it hard to believe in all those graphs up there getting bigger and bigger as they recede into the distance. No doubt almost all of them are so big that there is not room enough to record them within the confines of the observable physical Universe. It is with a twinge of self-doubt that I assert that every one of them has either a 1-factor or a 1-block. I admire all the theorems that say that almost all those graphs having Property A have also Property B, but I do wonder what they mean. Why postulate an unobservable? Or if someone insists on postulating one how can any statement about it make sense? Yet I still feel that those theorems are telling us something. Perhaps Graph Theory needs a philosophical branch to tell us what we mean by what we say. Oh well, let me quarrel no further with the conventions.
One of the latest theorems to arise in the well is that of the well-quasi-ordering of graphs by minors. We have noted it and we have remarked upon some of its curious corollaries. We see it mainly as a theorem controlling the great graphs above. We have paid due respect to the Four Color Theorem and related coloring problems. All these, I would say, have their chief interest among the high graphs. Brooks’ Theorem is a shining example, being a genuine theorem and not just a conjecture, and one with a simple proof at that. But I will not think of coloring theory as well-developed until it has learned how to cope with Hadwiger’s Conjecture. Go to it, Graph Theory!
We have touched upon many conjectures that challenge us. Take the one about reconstruction as an example. I worked on that once. I even settled what some described as outstanding problems by proving that some of the polynomials associated with graphs are reconstructible. I looked again at my results and was quite appalled by their superficiality. “Vanity of vanities”, I cried, “all is triviality!” Go on, O graph theorists, and delve below that surface!
He thought he saw a coach and four
That stood beside his bed.
He looked and saw it was
A bear without a head.
“Poor thing”, he said, “poor silly thing”.
“It’s waiting to be fed”.
- Lewis Carroll [2]

Even in Graph Theory things are not always what they seem. Let us return to the Four Color Theorem. We have discussed the semi-philosophical problem “Why 4?” Wherefore 4? What is so special about that number? I suppose Haken and Appel would have a probability argument based on Euler’s Theorem. The simple-minded would say “Well, the Five Color Theorem is, almost trivially, true and the Three Color Theorem is trivially false. Four is the intermediate integer.” But we have seen that there can be an answer on a deeper level. I think I should also mention Beraha’s answer. The so-called Beraha numbers $B(n)$, or real zeros of the Beraha polynomials, (see this volume pp. 153-158 - eds.) are of evident but not well-understood significance in the theory of plane chromatic polynomials, and their limit as $n$ tends to infinity is four.
I suppose the question arises out of our yearning for an elegant proof of the Four Color Theorem. Perhaps Graph Theory needs an artistic branch concerned not with getting new theorems but with finding the most elegant proofs of known ones. We have learned that it has already developed a probabilistic branch, and I have told you my dream of a twig in the algebra of partitions. There is indeed much to be done in the development of our subject, and the graph theorists of today are active in doing it. And so, should some Power demand of our discipline “Quo vadis?” we can reply, in confident metaphor, “Per ardua ad astra”. Which, being interpreted, saith “The sky’s the limit”.
References
[1] Lewis Carroll; Alice’s Adventures in Wonderland, Macmillan (1865).
[2] Lewis Carroll; Sylvie and Bruno, Macmillan (1889).
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 5-12 (1993)
© 1993 Elsevier Science Publishers B.V. All rights reserved.
THE FUTURE OF GRAPH THEORY
Béla BOLLOBÁS
Department of Pure Mathematics and Mathematical Statistics, University of Cambridge, Cambridge, ENGLAND
Abstract Graph theory has grown very rapidly in the past few decades. In this brief essay we try to forecast how it might develop in the years to come.
“Which of us would not be glad to lift the veil behind which the future lies hidden; to cast a glance at the next advances of our science and at the secrets of its development during future centuries? What particular goals will there be towards which the leading mathematical spirits of coming generations will strive? What new methods and new facts in the wide and rich field of mathematical thought will the new centuries disclose?”
With these poetic words, David Hilbert embarked on his momentous lecture delivered before the International Congress of Mathematicians at Paris in 1900 (see [1] [2]). It is difficult to overestimate the importance of the problems Hilbert presented in his lecture: they profoundly influenced the course of mathematics in this century.
The organizers of the Quo Vadis, Graph Theory? meeting (Fairbanks, Alaska, 1990) set themselves a very ambitious task: to fathom the future of graph theory. Undoubtedly, this is much easier than attempting to predict the future direction of the whole of mathematics; but it would still be presumptuous of me to make ex cathedra statements about the future of our subject. However, as I have been put on the spot, I will try to assess graph theory and its relationship to other fields.
Graph theory is often under attack, and so are its practitioners. We are accused of being shallow, knowing and using no real mathematics, and tackling problems of little interest, whose solutions are easy if not trivial. Although these criticisms are usually made by people unsympathetic to everything combinatorial, there is a grain of truth in these accusations, perhaps even more than a grain. In graph theory we do write too many papers, sometimes we do tackle problems that are too easy, and we have a tendency to become wrapped up in our circle of ideas and problems, unconcerned about the rest of mathematics. However, I am convinced that these are mostly teething problems.
Graph theory is young, very young indeed, and it is still highly underdeveloped. Occasionally we pretend that our subject started in 1736 with Euler and the bridges of Königsberg, and that Dénes König’s book 200 years later established graph theory as a major area, but the truth is that the field really started to take off only in the fifties and it acquired a large following only in the seventies.
Perhaps the greatest strength of graph theory is the abundance of natural and beautiful problems waiting to be solved. There is no doubt that Hilbert was correct when he emphasized the importance of problems to a branch of mathematics. “As long as a branch of science offers an abundance of problems, so long it is alive; a lack of problems foreshadows extinction or the cessation of independent development. Just as every human undertaking pursues certain objects, so also mathematical research requires its problems. It is by the solution of
problems that the investigator tests the temper of his steel; he finds new methods and new outlooks, and gains a wider and freer horizon.” We should rejoice that graph theory has a tremendous supply of exciting problems that beg to be solved.
Paradoxically, much of what is wrong with graph theory is due to this richness of problems. It is all too easy to find new problems based on no theory whatsoever, and to solve the first few cases by straightforward methods. Unfortunately, in some instances the problems are unlikely to lead anywhere, and we must agree with Dieudonné that we do publish embryonic solutions of “problems without issue”. We all know embarrassing examples of these, and it is not clear that we are making enough effort to rid our journals of “papers without issue”.
There are many beautiful results in graph theory whose proofs do not make use of sophisticated concepts and tools, but rather rely on great ingenuity. It is important to emphasize that this happens because there are no suitable tools available and not because a graph theorist should use as little mathematics as possible. We would be delighted to use any tools suitable to tackle the natural questions arising in the field. In fact, there are signs that in graph theory we can make use of more and more results from other branches of mathematics: the theorems of Brouwer and Borsuk have found many applications, the Riemann Hypothesis for curves over finite fields has been used many times, Ramanujan’s Conjecture has been applied with great success by Lubotzky, Phillips, Sarnak and Margulis, and, recently, cohomology theory was the driving force in the work of Chung and Graham on pseudo-random hypergraphs. Encouraged by these signs, we should learn more mathematics outside combinatorics so that we are ready to wield powerful tools when the opportunity arises.
We should not be disheartened by the fact that, due to the great variety of natural and difficult problems, most methods brought into graph theory are unlikely to apply to a wide selection of questions.
The last two decades have seen some outstanding achievements in graph theory: Appel and Haken proved the Four Color Theorem (see [3] [4]), and Robertson and Seymour proved Wagner’s Conjecture (see [5]-[7]) and created a rich and wonderful theory of graph minors. Other major results close enough to graph theory to justify their mention are Szemerédi’s theorem [8] on arithmetic progressions, and the more recent result of Laczkovich [9] (see also [10]) on squaring the circle.
Nevertheless, the striking feature of graph theory in the last two decades is that probabilistic methods have developed into a cohesive theory. There is no doubt that the theory of random methods is frequently used in most branches of graph theory. The theory was founded by Erdős and Rényi in the late 50s and early 60s, and for over twenty years the theory got along very well without much probability theory beyond the use of moments, Chebyshev’s Inequality and the Inclusion-Exclusion Principle. However, the theory really started to blossom when a number of other tools from probability theory were found to be useful, like random walks, martingales, branching processes, Markov chains and so on. The arrival of these methods rather changed the nature of probabilistic combinatorics: there is less “pure combinatorics” and more “combinatorial probability” and even “pure probability”. This change is not to be lamented but rather to be applauded: it is not that graph theory is losing its hold on an area but rather that it is becoming stronger with the influx of new tools. I very much hope that this success of the theory of random graphs will be repeated by other branches of graph theory, and that by acquiring powerful tools from the more established branches of mathematics they will become much stronger.
What are the really big problems in our field? There are two that clearly stand out: the question whether P is equal to NP, and Hadwiger’s Conjecture. The first is well-known in all of mathematics and is recognized as one of the most important in mathematics; the latter is hardly known outside combinatorics but is familiar to all in combinatorics: every k-chromatic graph has a subcontraction to a complete graph of order k. My view is that Hadwiger's Conjecture is considerably harder than the P-NP question; in fact, my hunch is that P = NP, contrary to general belief.
Let me turn to some more reasonable problems, illustrating the types of problems I believe will be studied in the future. These too are unlikely to be easy, but they may not be entirely out of reach. I would like to emphasize that these problems strongly reflect my taste in graph theory.
In recent years more and more attention has been paid to discrete isoperimetric inequalities; in particular, Imre Leader and I have studied them on various graphs (see [11] [12]). Given a graph $G$ and a set $A$ of its vertices, for $t \ge 1$ denote by $A_{(t)}$ the $t$-neighbourhood of $A$: the set of vertices within distance $t$ of $A$. If
$$|A_{(t)}| \ge f(a) \tag{1}$$
for every set $A \subset V(G)$ with $a$ vertices then (1) is said to be an isoperimetric inequality. One is especially interested in best possible isoperimetric inequalities. The classical example of a discrete isoperimetric inequality is Harper's inequality [13] in the discrete cube: if $A$ is a set of
$$a = \sum_{i=0}^{r} \binom{n}{i}$$
vertices of the (graph of the) $n$-dimensional discrete cube then
$$|A_{(t)}| \ge \sum_{i=0}^{r+t} \binom{n}{i}.$$
As it happens, there are very few important families of graphs for which we know the best isoperimetric inequalities. For most natural graphs we are far from knowing the answer. Perhaps the most striking of these is the slice of the cube. Given $0 < r < n$, let $S(r, n)$ be the graph with vertex set $[n]^{(r)}$, the set of all $r$-subsets of $[n] = \{1, 2, \ldots, n\}$, in which two vertices (sets) are joined by an edge if they have $r - 1$ elements in common. What then is the best isoperimetric inequality in the slice $S(r, n)$? This question may not be as easy as it looks, since a very special case of it implies a solution to the last unsolved problem in Erdős, Ko and Rado [14], for which Erdős is currently offering \$500 (see [15], p.471).
Here is another fascinating problem for which Erdős offers \$500, due to Erdős, Faber and Lovász; the reformulation below is due to Erdős [15] (see p.471). Let $G_1, G_2, \ldots, G_n$ be complete graphs of order $n$ such that no two of them have an edge in common. Is it then true that $\bigcup_{i=1}^{n} G_i$ has chromatic number $n$? The beautiful results in Kahn and Seymour [16] and Kahn [17] constitute the most recent progress towards a proof of this conjecture.
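For very small cubes, an isoperimetric inequality of the kind discussed above can be checked by exhaustive search. The following is only a sketch under stated assumptions: it takes Harper's inequality in its standard form (a set of $\sum_{i \le r} \binom{n}{i}$ vertices of the $n$-cube has $t$-neighbourhood of size at least $\sum_{i \le r+t} \binom{n}{i}$), encodes vertices as bit masks, and uses the hypothetical helper names `cube_neighbourhood`, `harper_bound` and `min_neighbourhood_size`, none of which come from the text.

```python
from itertools import combinations
from math import comb


def cube_neighbourhood(A, n, t):
    """Vertices of the n-cube within Hamming distance t of the set A.

    Vertices are the integers 0 .. 2**n - 1 (assumed encoding); flipping
    bit i corresponds to moving along one edge of the cube.
    """
    current = set(A)
    for _ in range(t):
        current |= {v ^ (1 << i) for v in current for i in range(n)}
    return current


def harper_bound(n, r, t):
    """Size of a Hamming ball of radius r + t (capped at n) in the n-cube."""
    return sum(comb(n, i) for i in range(min(r + t, n) + 1))


def min_neighbourhood_size(n, a, t):
    """Brute-force minimum of |A_(t)| over all a-subsets A (small n only)."""
    vertices = range(2 ** n)
    return min(len(cube_neighbourhood(A, n, t))
               for A in combinations(vertices, a))


if __name__ == "__main__":
    n, t = 4, 1
    for r in range(3):
        a = sum(comb(n, i) for i in range(r + 1))  # size of a radius-r ball
        print(n, r, a, min_neighbourhood_size(n, a, t), harper_bound(n, r, t))
```

The search visits every $a$-subset, so it is feasible only for $n \le 4$ or so; it illustrates what "best possible" means here (the minimum is attained by a Hamming ball) rather than offering a method for the slice $S(r, n)$, where the answer is open.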
In graph theory as a whole, there seems to be a shift towards global problems. A good example of this is the recent theory of pseudo-random and quasi-random graphs and hypergraphs due, among others, to Thomason [18] [19], Chung, Graham and Wilson [20] [21], and Chung and Graham [22] [23]. There are many exciting questions in the area: here we shall mention only one of them. A sequence of graphs $(G_n)_1^\infty$, with $G_n$ having $n$ vertices, is said to be quasi-random if there is some function $\alpha(n) = o(n)$ such that
for all subsets $W$ of $V(G_n)$. Here, as usual, $e(G_n[W])$ stands for the number of edges of the subgraph of $G_n$ spanned by $W$. For a family $\mathcal{F}$ of graphs, call $(G_n)_1^\infty$ an $\mathcal{F}$-sequence if for every $F \in \mathcal{F}$ the members of the sequence have asymptotically as many induced subgraphs isomorphic to $F$ as a random graph with probability 1/2 for the edges. Finally, call a family $\mathcal{F}$ forcing if every $\mathcal{F}$-sequence is quasi-random. Chung, Graham and Wilson [21] show the existence of several forcing families (for example, $\mathcal{F} = \{K_2, C_{2t}\}$ for any fixed $t \ge 2$). Chung and Graham [22] ask whether one can characterize forcing families. Also, what is the situation for hypergraphs?
There are numerous other problems concerning the number of induced subgraphs. Denote by $i(G)$ the number of pairwise non-isomorphic induced subgraphs of a graph $G$. Proving a conjecture of András Hajnal, it was shown by Erdős and Hajnal [24] and Alon and Bollobás [25] that if $G_n$ is a graph of order $n$ and $i(G_n) = o(n^2)$ then we can omit $o(n)$ vertices of $G_n$ in such a way that the remaining graph is either complete or empty. Call an induced subgraph trivial if it is either complete or empty, and write $t(G)$ for the maximal order of a trivial subgraph of $G$. Thus the result above says that if $i(G_n) = o(n^2)$ then $t(G_n) = n - o(n)$.
What happens if we use only certain graph invariants (like order, size, maximal degree, etc.) to distinguish non-isomorphic subgraphs? Given a set $\Pi$ of graph parameters and a graph $G_n$ of order $n$ with $t(G_n) = t$, at least how many isomorphism classes of induced subgraphs are there in $G_n$ that can be distinguished by the parameters in $\Pi$? Writing $f(n, t, \Pi)$ for the minimum as $G_n$ ranges over all graphs of order $n$ with $t(G_n) = t$, we obtain a rather large family of problems whose solutions would tell us much about the structure of graphs. (Of course, the condition $t(G_n) = t$ can be replaced by any other condition.) An interesting simple case is when $\Pi$ consists of the parameters order and size.
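The two invariants just introduced, the number $i(G)$ of pairwise non-isomorphic induced subgraphs and the maximal order $t(G)$ of a trivial induced subgraph, can be computed by brute force on very small graphs. The sketch below is illustrative only and rests on assumptions not in the text: vertices are labelled $0, \ldots, n-1$, the empty subgraph is not counted in $i(G)$, isomorphism is decided by trying every relabelling, and the function names are hypothetical.

```python
from itertools import combinations, permutations


def induced_edges(edges, S):
    """Edges of the subgraph induced by the vertex set S."""
    S = set(S)
    return {frozenset(e) for e in edges if set(e) <= S}


def canonical_form(vertices, edges):
    """Isomorphism-invariant label: the smallest sorted edge list over all
    relabellings of the vertices by 0 .. k-1 (factorial time, tiny graphs only)."""
    vertices = list(vertices)
    k = len(vertices)
    best = None
    for perm in permutations(range(k)):
        relabel = dict(zip(vertices, perm))
        form = tuple(sorted(tuple(sorted((relabel[u], relabel[v])))
                            for u, v in edges))
        if best is None or form < best:
            best = form
    return (k, best)


def count_induced_subgraphs(n, edges):
    """i(G): number of isomorphism classes of induced subgraphs of order >= 1."""
    forms = set()
    for k in range(1, n + 1):
        for S in combinations(range(n), k):
            forms.add(canonical_form(S, induced_edges(edges, S)))
    return len(forms)


def max_trivial_order(n, edges):
    """t(G): largest order of an induced subgraph that is complete or empty."""
    for k in range(n, 0, -1):
        for S in combinations(range(n), k):
            m = len(induced_edges(edges, S))
            if m == 0 or m == k * (k - 1) // 2:
                return k
    return 0


if __name__ == "__main__":
    c5 = [(i, (i + 1) % 5) for i in range(5)]  # the 5-cycle
    print(count_induced_subgraphs(5, c5), max_trivial_order(5, c5))
```

Replacing `canonical_form` by a coarser key, for instance the pair (order, size), gives a count of the classes distinguishable by a parameter set $\Pi$ in the sense of the question above.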
An old and fascinating problem of Erdős and Rényi can also be formulated in terms of $t(G)$ and $i(G)$: given $c > 0$, is there a constant $d = d(c) > 0$ such that if $t(G_n) \le c \log n$ for a graph $G_n$ of order $n$, then $i(G_n) \ge e^{dn}$? A much more recent related problem of Erdős asks for the determination of $r(n) = \min_{G_n} r(G_n)$, where $r(G_n)$ is the maximal order of an induced subgraph of $G_n$ which is regular. Ramsey's theorem shows that $r(n) \ge c \log n$ for some absolute constant $c > 0$, and one can also show that $r(n) \le n^{1/2}$. The two bounds are very far from each other, but it is not clear which one is closer to the true magnitude of $r(n)$.
Another problem that arose by adding a large family of conditions to a classical problem, thereby transforming it into a much more significant problem, is the list-coloring problem for graphs. Given a graph $G$ and a function $\Lambda$ mapping the edges of $G$ into the finite subsets of some set (of colors), a $\Lambda$-coloring of $G$ is a proper edge-coloring $\phi$ such that $\phi(e) \in \Lambda(e)$ for every $e \in E(G)$. Thus $\Lambda(e)$ is the list assigned to the edge $e$, and in a $\Lambda$-coloring the color of $e$ has to be chosen from this list. The list-chromatic number of $G$ is
$$\chi'_\ell(G) = \min\{k : \text{if } |\Lambda(e)| = k \text{ for every } e \text{ then } G \text{ has a } \Lambda\text{-coloring}\},$$
that is, the minimal length of the lists guaranteeing the existence of a list-coloring.
Writing $\chi'(G)$ for the edge-chromatic number of a graph $G$, we have, trivially, that $\chi'_\ell(G) \ge \chi'(G)$. Dinitz (see [26]) conjectured that, in fact,
$$\chi'_\ell(G) = \chi'(G) \tag{2}$$
for every graph $G$, so, in particular, $\chi'_\ell(G) \le \Delta(G) + 1$, where $\Delta(G)$ is the maximal degree of $G$. Harris and I proved [27] that if $c > 11/6$ and $\Delta = \Delta(G)$ is sufficiently large then $\chi'_\ell(G) \le c\Delta$, and later various improvements were obtained by Chetwynd and Häggkvist and by Bollobás and Hind. However, we are still very far from proving (2).
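The definition of a list-coloring of the edges can be made concrete with a small backtracking search. This is a minimal sketch, not a method from the text: the names `has_list_edge_colouring` and `every_k_assignment_works` are hypothetical, and the second function only tries lists drawn from a fixed finite palette, so a `True` answer is evidence for, not a proof of, a bound on the list-chromatic number.

```python
from itertools import combinations, product


def has_list_edge_colouring(edges, lists):
    """Decide whether the edges can be properly coloured with each edge e
    receiving a colour from lists[e] (edges sharing a vertex must differ)."""
    edges = [tuple(e) for e in edges]
    colour = {}

    def conflicts(e, c):
        # e clashes with an already coloured edge f that meets it and has colour c
        return any(colour.get(f) == c and set(e) & set(f)
                   for f in edges if f != e)

    def extend(i):
        if i == len(edges):
            return True
        e = edges[i]
        for c in lists[e]:
            if not conflicts(e, c):
                colour[e] = c
                if extend(i + 1):
                    return True
                del colour[e]
        return False

    return extend(0)


def every_k_assignment_works(edges, k, palette):
    """Check all assignments of k-element lists taken from `palette`
    (an assumed finite colour set); exponential, tiny graphs only."""
    edges = [tuple(e) for e in edges]
    options = list(combinations(palette, k))
    return all(has_list_edge_colouring(edges, dict(zip(edges, choice)))
               for choice in product(options, repeat=len(edges)))


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]
    c4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
    print(every_k_assignment_works(triangle, 2, range(3)))
    print(every_k_assignment_works(triangle, 3, range(5)))
    print(every_k_assignment_works(c4, 2, range(4)))
```

For the triangle, identical lists of length 2 already fail while lists of length 3 always succeed, matching its edge-chromatic number 3; for the 4-cycle, lists of length 2 suffice over this palette, in line with equation (2) for that graph.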
Finally, let me mention one of my favorite problems, a conjecture of mine with Catlin and Eldridge. Extending a result of Corrádi and Hajnal [28], Hajnal and Szemerédi [29] proved the following deep result. If $G$ is a graph of order $n = s(r + 1)$, with maximal degree $r$, then it has an $(r + 1)$-colouring with equal color classes. Putting it another way, the complement of $G$ contains $(r + 1)K_s$, that is, the union of $r + 1$ vertex-disjoint complete graphs, each of them having $s$ vertices. Note that $(r + 1)K_s$ has maximal degree $s - 1$. Now the Bollobás-Catlin-Eldridge Conjecture (see [30], p.426) states that the complement of $G$ not only contains $(r + 1)K_s$ but (a subgraph isomorphic to) any graph of order $n$ with maximal degree $s - 1$. Once again, a single graph, $(r + 1)K_s$, has been replaced by a large family: the family of all graphs of order $n = s(r + 1)$ with maximal degree $s - 1$. In fact, a little more is conjectured to be true: if $G_1$ and $G_2$ are graphs of order $n$ with maximal degrees $\Delta_1$ and $\Delta_2$, and $(\Delta_1 + 1)(\Delta_2 + 1) < n + 1$, then the complement $\overline{G_1}$ of $G_1$ contains (a subgraph isomorphic to) $G_2$. As the Hajnal-Szemerédi Theorem is a very special case of this conjecture, it is unlikely to be easy.
The problems above are fairly ad hoc examples of the kind of problems I believe we shall be looking at in certain parts of graph theory, and it should be emphasized again that they strongly reflect my own taste in graph theory.
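On very small graphs the containment asked for in this conjecture can be tested directly by trying every placement of $G_2$ on the vertices of $G_1$. The sketch below assumes both graphs live on the vertex set $0, \ldots, n-1$ and uses the hypothetical names `max_degree`, `degree_condition` and `complement_contains`; the degree condition is the inequality as quoted above.

```python
from itertools import permutations


def max_degree(n, edges):
    """Maximal degree of a graph on vertices 0 .. n-1."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return max(deg, default=0)


def degree_condition(n, edges1, edges2):
    """The hypothesis (D1 + 1)(D2 + 1) < n + 1 on the two maximal degrees."""
    d1, d2 = max_degree(n, edges1), max_degree(n, edges2)
    return (d1 + 1) * (d2 + 1) < n + 1


def complement_contains(n, edges1, edges2):
    """True if some relabelling of G2 uses no edge of G1, i.e. the complement
    of G1 contains a subgraph isomorphic to G2 (n! placements, small n only)."""
    e1 = {frozenset(e) for e in edges1}
    for perm in permutations(range(n)):
        if all(frozenset((perm[u], perm[v])) not in e1 for u, v in edges2):
            return True
    return False


if __name__ == "__main__":
    n = 6
    c6 = [(i, (i + 1) % n) for i in range(n)]   # r = 2, s = 2, n = s(r + 1)
    three_k2 = [(0, 1), (2, 3), (4, 5)]         # (r + 1)K_s with s = 2
    print(degree_condition(n, c6, three_k2), complement_contains(n, c6, three_k2))
```

With $G_1$ the 6-cycle and $G_2 = 3K_2$ this is the Hajnal-Szemerédi situation for $r = 2$, $s = 2$: the three disjoint edges found in the complement correspond to the three colour classes of an equitable 3-colouring of the cycle.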
Let me return to the theme of Quo Vadis, Graph Theory? Will there be Graph Theory in twenty or fifty years' time? Will it change and, if so, in what way? I believe that the future of Graph Theory is rosy since there are too many good things going for it. It has a fantastic supply of beautiful and natural problems and it is also a branch of mathematics very close to Computer Science. We have hardly started to develop the tools to solve our problems, and we have hardly made use of our proximity to Computer Science. When both of these happen, we shall really take off. But we must never lose sight of the sublime beauty of Graph Theory. We must remember the words of G. H. Hardy: “Beauty is the first test: there is no permanent place in the world for ugly mathematics.”

Finally, a word of warning. In the next few decades Graph Theory is bound to become much more difficult. We shall try much more difficult problems, and in order to have a chance of cracking them, we shall have to be far better prepared than we are today. Learning vast amounts of combinatorics and general, main-line mathematics will be essential for everybody wishing to succeed in Graph Theory. I am looking forward to a vigorous development of Graph Theory in the coming years.
B. Bollobás
References

[1] D. Hilbert; Mathematical problems, Bulletin of the American Mathematical Society, 8, 437-479 (1902).
[2] F.E. Browder (editor); Mathematical Developments Arising from Hilbert Problems, Proceedings of Symposia in Pure Mathematics, 28 (Part 1), American Mathematical Society, Providence (1976).
[3] K. Appel and W. Haken; Every planar map is four colorable: Part I. Discharging, Illinois Journal of Mathematics, 21, 429-490 (1977).
[4] K. Appel, W. Haken and J. Koch; Every planar map is four colorable: Part II. Reducibility, Illinois Journal of Mathematics, 21, 491-567 (1977).
[5] N. Robertson and P.D. Seymour; Generalizing Kuratowski's theorem, in Proceedings of the Fifteenth Southeastern Conference on Combinatorics, Graph Theory and Computing (Baton Rouge, 1984), Congressus Numerantium, 45, 129-138 (1985).
[6] N. Robertson and P.D. Seymour; Graph minors: XV. Wagner's conjecture - to appear.
[7] N. Robertson and P.D. Seymour; Graph minors - a survey, in Surveys in Combinatorics 1985, I. Anderson (editor), London Mathematical Society Lecture Note Series, 103, Cambridge University Press, 153-171 (1985).
[8] E. Szemerédi; On a set containing no k elements in arithmetic progression, Acta Arithmetica, 27, 199-245 (1975).
[9] M. Laczkovich; Equidecomposability and discrepancy: a solution of Tarski's circle squaring problem, J. für die Reine und Angewandte Mathematik, 404, 77-117 (1990).
[10] M. Laczkovich; Uniformly spread discrete sets in $\mathbb{R}^d$ - to appear.
[11] B. Bollobás and I. Leader; Compressions and isoperimetric inequalities, J. Combinatorial Theory (A), 56, 47-62 (1991).
[12] B. Bollobás and I. Leader; Isoperimetric inequalities and fractional set systems, J. Combinatorial Theory (A), 56, 63-74 (1991).
[13] L.H. Harper; Optimal numberings and isoperimetric problems on graphs, J. Combinatorial Theory, 1, 385-394 (1966).
[14] P. Erdős, C. Ko and R. Rado; Intersection theorems for systems of finite sets, Quart. J. Math. Oxford (2), 12, 313-320 (1961).
[15] P. Erdős; Some of my favourite unsolved problems, in A Tribute to Paul Erdős, A. Baker, B. Bollobás and A. Hajnal (editors), Cambridge University Press, 467-478 (1990).
[16] J. Kahn and P. Seymour; A fractional version of the Erdős-Faber-Lovász conjecture - to appear.
[17] J. Kahn; Coloring nearly-disjoint hypergraphs with n + o(n) colors - to appear.
[18] A. Thomason; Random graphs, strongly regular graphs and pseudo-random graphs, in Surveys in Combinatorics 1987, C. Whitehead (editor), London Math. Soc. Lecture Notes Series, 123, Cambridge University Press, Cambridge, 173-195 (1987).
[19] A. Thomason; Pseudo-random graphs, in Random Graphs '85 (M. Karoński and Z. Palka, eds.), Annals of Discrete Mathematics, 33, North-Holland, 307-331 (1987).
[20] F.R.K. Chung, R.L. Graham and R.M. Wilson; Quasi-random graphs, Proc. Nat. Acad. USA, 85, 969-970 (1988).
[21] F.R.K. Chung, R.L. Graham and R.M. Wilson; Quasi-random graphs, Combinatorica, 9, 345-362 (1989).
[22] F.R.K. Chung and R.L. Graham; Quasi-random hypergraphs, Random Structures and Algorithms, 1, 105-124 (1990).
[23] F.R.K. Chung and R.L. Graham; Quasi-random set systems, Journal of the American Mathematical Society, 4, 151-196 (1991).
[24] P. Erdős and A. Hajnal; On the number of distinct induced subgraphs of a graph, Discrete Mathematics, 75, 145-154 (1989).
[25] N. Alon and B. Bollobás; Graphs with a small number of distinct induced subgraphs, Discrete Mathematics, 75, 23-30 (1989).
[26] R. Häggkvist; Towards a solution of the Dinitz problem?, Discrete Mathematics, 75, 247-251 (1989).
[27] B. Bollobás and A.J. Harris; List-colourings of graphs, Graphs and Combinatorics, 1, 115-127 (1985).
[28] K. Corrádi and A. Hajnal; On the number of independent circuits in a graph, Acta Math. Acad. Sci. Hungar., 14, 423-439 (1963).
[29] A. Hajnal and E. Szemerédi; Proof of a conjecture of Erdős, in Combinatorial Theory and Its Applications, vol. II, P. Erdős, A. Rényi and V.T. Sós (editors), Colloq. Math. Soc. J. Bolyai, 4, North-Holland, Amsterdam, 601-623 (1970).
[30] B. Bollobás; Extremal Graph Theory, London Mathematical Society Monographs, No. 11, Academic Press, London (1978).
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 13-44 (1993) © 1993 Elsevier Science Publishers B.V. All rights reserved.
NEW DIRECTIONS IN GRAPH THEORY (WITH AN EMPHASIS ON THE ROLE OF APPLICATIONS)
Fred S. ROBERTS
Department of Mathematics and Center for Operations Research, Rutgers University, New Brunswick, New Jersey, U.S.A.
Abstract
We summarize some general themes which we saw as coming out of Quo Vadis, Graph Theory? - an international conference on the future of graph theory, held at the University of Alaska, Fairbanks, in August 1990. We expand on these themes with specific examples and emphasize the role of applications.
1. Introduction
In August 1990, an international conference on the future of graph theory, Quo Vadis, Graph Theory?, was held at the University of Alaska in Fairbanks. The purpose of this paper is to describe some general themes which I saw coming out of the meeting, with an emphasis on the role of applications. I will expand on each theme, giving examples due to the participants, interspersing them with examples of my own. Inevitably, the reader will see my own prejudices here, and in particular an emphasis on topics which I like. For this I make no apologies. In the following all undefined graph-theoretical terms can be found in Bondy and Murty [1] or Roberts [2] [3].

The general themes which I saw coming out of the meeting, and which are summarized in the following sections, are these:

I. New variants of old concepts need to be explored.
II. New approaches to algorithms will be developed.
III. Applied problems will continue to stimulate the development of graph theory.
IV. Randomness is a widespread theme with many aspects.
V. Some of the BIG OLD problems remain.
VI. Really new concepts need to be developed.
VII. Graph theory is a wonderful vehicle for education in the mathematical sciences and its educational role will influence its scientific development.
2. New Variants of Old Concepts Need to be Explored

Many traditional concepts of graph theory are still interesting, especially in new and modified forms. Among such concepts are coloring, distance, and intersection graph, to name just a few. There has been a great deal of work in recent years on developing such variations of traditional concepts. Much of this work is motivated by practical applications. It is an important direction in modern graph theory.
2.1 Graph Coloring

Many variations of traditional graph colorings are of interest in graph theory today. Many of these have arisen from practical applications to such areas as traffic phasing, fleet maintenance, garbage pickup scheduling, scheduling of meetings of legislative committees, channel assignment, mobile radio telephone scheduling, and task assignment. Roberts [4] gives a recent survey. Here, I mention several such variations.
2.1.1 Defective Colorings

A graph G is called (m,k)-colorable if it can be vertex colored with m colors with each vertex adjacent to at most k vertices of the same color as itself. This concept was discussed at the conference by Marietjie Frick (this volume pp.45-57) and is studied for instance in the papers by Andrews and Jacobson [5] and by Frick and Henning [6]. The k-defective chromatic number $\chi_k(G)$ is the minimum m so that G is (m,k)-colorable. From the point of view of applications, this is a natural number in which to be interested. Users of graph coloring are often satisfied with a number of violations of the restrictions of ordinary graph coloring. Other interesting variants of ordinary graph coloring also arise if we accept different kinds of violations of the ordinary graph coloring restrictions.
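The definition of an (m,k)-colorable graph can be checked directly on small examples. The following is a minimal brute-force sketch, not an efficient algorithm; the function names and the adjacency-dictionary representation are illustrative assumptions, not taken from the papers cited above.

```python
from itertools import product


def is_defective_coloring(adj, coloring, k):
    """True if every vertex has at most k neighbours of its own color."""
    return all(
        sum(coloring[u] == coloring[v] for u in adj[v]) <= k
        for v in adj
    )


def defective_chromatic_number(adj, k):
    """Smallest m such that the graph is (m, k)-colorable (brute force)."""
    vertices = sorted(adj)
    for m in range(1, len(vertices) + 1):
        # Try every assignment of m colors to the vertices.
        for colors in product(range(m), repeat=len(vertices)):
            if is_defective_coloring(adj, dict(zip(vertices, colors)), k):
                return m
    return len(vertices)


# The 5-cycle, stored as vertex -> list of neighbours.
c5 = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
# Two colors suffice once one same-colored neighbour is tolerated.
example = {0: 0, 1: 0, 2: 1, 3: 0, 4: 1}
```

With k = 0 the function reduces to the ordinary chromatic number, so the 5-cycle needs three colors, while allowing one violation per vertex brings this down to two.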
2.1.2 List Colorings

In many practical applications, a choice of color to assign to vertex x is restricted. Suppose R(x) is a list of colors allowed to be assigned to x. We then seek an R-list coloring, an ordinary coloring so that each vertex x gets a color from its list R(x). Such list colorings were introduced by Erdős, Rubin, and Taylor [7] and were mentioned in this conference by Johanan Schönheim (this volume pp.59-69) (cf. Brown, et al. [8]). A similar concept arises for edge colorings and is due to Bollobás and Harris [9]. This was mentioned at the conference by Béla Bollobás (this volume pp.5-11). A related concept involves R-amenable colorings, colorings where each vertex x gets a color from a specified set S of colors and which is not in R(x). See Brown, et al. [8] and Mahadev and Roberts [10] for some recent results about R-amenable colorings.

A sample idea about list colorings which is especially interesting is the following idea of Erdős, Rubin, and Taylor [7]. We say that G is k-choosable if G can be R-list colored for any list assignment R(x) so that all sets R(x) have k elements. For instance, the 5-cycle $C_5$ is not 2-choosable, for otherwise we could find an R-list coloring for the assignment in which every R(x) = {1,2}, and this would give us a coloring of $C_5$ in two colors, which is impossible for an odd cycle. Erdős, Rubin, and Taylor characterize the 2-choosable graphs, but a characterization of 3-choosable graphs remains elusive. Mahadev, Roberts, and Santhanakrishnan [11] obtain some results about 3-choosable graphs. However, not even the complete bipartite graphs which are 3-choosable have been completely characterized. The choice number ch(G) of G is the smallest k so that G is k-choosable. Erdős, Rubin and Taylor show that ch(G) can be greater than $\chi(G)$ and Tesman [12] has shown that the two numbers are equal if G is chordal. It is an open problem to characterize the graphs for which ch(G) = $\chi(G)$.
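The two facts just mentioned, that $C_5$ is not 2-choosable and that ch(G) can exceed $\chi(G)$, can be checked by exhaustive search on small graphs. The sketch below is illustrative only: the function name and data layout are assumptions, and the list assignment on the complete bipartite graph $K_{2,4}$ is a standard textbook example rather than one taken from this paper.

```python
from itertools import product


def list_coloring(adj, lists):
    """Return a proper coloring with each color taken from lists[v], or None."""
    vertices = sorted(adj)
    for choice in product(*(sorted(lists[v]) for v in vertices)):
        coloring = dict(zip(vertices, choice))
        # Proper means adjacent vertices always receive different colors.
        if all(coloring[u] != coloring[v] for v in adj for u in adj[v]):
            return coloring
    return None


# The 5-cycle with the same two-element list at every vertex.
c5 = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
c5_lists = {v: {1, 2} for v in c5}

# K_{2,4}: bipartite (so 2-colorable), yet one 2-list assignment fails.
side_a, side_b = ["a1", "a2"], ["b1", "b2", "b3", "b4"]
k24 = {a: list(side_b) for a in side_a}
k24.update({b: list(side_a) for b in side_b})
bad_lists = {"a1": {1, 2}, "a2": {3, 4},
             "b1": {1, 3}, "b2": {1, 4}, "b3": {2, 3}, "b4": {2, 4}}
equal_lists = {v: {1, 2} for v in k24}
```

Whatever colors a1 and a2 pick from their lists, some b-vertex has exactly those two colors as its list, so `bad_lists` admits no list coloring even though identical lists do.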
It is also an open problem to settle two intriguing conjectures of Erdős, Rubin and Taylor: Every planar graph is 5-choosable (and so $\mathrm{ch}(G) \le 5$); there is a planar graph which is not 4-choosable. (Compare the four color theorem, which says that every planar graph is 4-colorable, and so $\chi(G) \le 4$.)

2.1.3 T-Colorings
One variation of graph coloring which has been of special interest to me is the T-coloring. Suppose that T is a set of nonnegative integers with 0 in T. A T-coloring of a graph G is an assignment of a positive integer f(x) to each vertex x of G so that if two vertices x and y are adjacent, then $|f(x) - f(y)| \notin T$. T-colorings were motivated by the channel assignment problem. Here, the vertices of G are transmitters and an edge represents conflict. Then we wish to assign channels to transmitters in such a way that conflicting transmitters receive channels whose separation is not in the disallowed set T. T-colorings were introduced by Hale [13] and later studied by Cozzens and Roberts [14]. This early work has led to a large literature which includes five Ph.D. theses. Recent surveys of this literature can be found in the theses by Bonias [15], Liu [16] and Tesman [12] and in the papers by Roberts [4] [17]. A common goal in T-coloring is to minimize the span or separation between the smallest and largest channels used. We return to T-colorings in §4.2.4 and §4.4.

2.1.4 Other Colorings
Other interesting variations of graph coloring deserve further attention. Among those are:

H-colorings. These were discussed at the conference by Bruce Bauslaugh (this volume pp.71-79) and are studied for example by Bang-Jensen and Hell [18], Hell and Nešetřil [19], and Häggkvist, et al. [20].

I-colorings. These are so-called set colorings in which we color with real intervals instead of individual colors. They are studied, for instance, by Opsut and Roberts [21]-[23], and have connections to a variety of practical problems, such as fleet maintenance, traffic phasing, mobile radio frequency assignment, and task assignment.

J-colorings. These are set colorings in which we color with unions of real intervals. They are studied by Raychaudhuri [24] [25].

D-colorings. These are set colorings in which we color with unions of two real intervals. They have been studied by Trotter and Harary [26].

n-tuple colorings. These are set colorings in which we color with a discrete set of n elements. They were introduced by Gilbert [27] in connection with the mobile radio frequency assignment problem. An early reference on n-tuple colorings is the paper by Stahl [28]. A major open problem about these colorings is to determine whether or not Stahl's conjectured formula for the n-tuple chromatic number of the Kneser graphs is correct. A special case of this conjecture is the Kneser conjecture, settled by Lovász [29]. Stahl's conjecture is discussed in recent papers by Frankl and Füredi [30] and by Roberts [31].

2.2 Distance Concepts
Distance concepts have played an important role in graph theory. Among other things, these can be notions defined through distances on graphs in the graph-theoretical sense, notions defined from metric distances in a space in which a graph is embedded, and notions of distance between graphs.
2.2.1 Distance Concepts on Graphs

The talk by Martin Lewinter (this volume pp.89-92) emphasized a variety of distance concepts on graphs which deserve further study. For instance, the eccentricity of a vertex x in a graph G is the maximum distance from x to another vertex of G, the center of G consists of the set of vertices of minimum eccentricity, the periphery of G consists of vertices of maximum eccentricity, the cep of G consists of vertices whose distance from the center is equal to the eccentricity of the center, the diameter of G is the maximum distance between two vertices of G, and a diametrical path is a shortest path between two vertices whose distance is equal to the diameter. Many of these concepts are developed in detail in the book by Buckley and Harary [32]. A sample problem pointed out by Lewinter is to try to understand graphs where
the periphery and the cep are disjoint and graphs where any diametrical path must go through the center. There are many other fascinating variants of these ideas. Concepts of centrality are studied in the paper by Freeman [33] in the context of social networks, a topic we discuss in §4.2.5. Concepts of distance are also important in chemistry (cf. §4.3).

The ideas of distance arise in many practical problems of communication and transportation. For instance, motivated by problems of transportation, I have been interested in orientations of undirected graphs which result in a strongly connected digraph of minimum diameter, and in orientations which result in strongly connected digraphs which minimize other distance-related parameters. For more on this problem, see for instance Chvátal and Thomassen [34] and Roberts and Xu [35]-[38]. Jean-Claude Bermond (personal communication), motivated by problems of communications, has been interested in finding strongly connected orientations in which the diameter is the same as the diameter of the original undirected graph. This problem has been studied by McCanna [39] for hypercubes (these graphs are important in computer communication networks; see the discussion in §3.5) and Bermond has some results about it for toroidal graphs. The problem of characterizing graphs for which there is a strongly connected orientation whose diameter is the same as that of the original graph is an intriguing open problem.

2.2.2 Facility Location Problems

As Lewinter noted, distance concepts are especially important in facility location problems. Location problems arise whenever a large set of potential sites for placing certain units is available and selection must be made of the sites to be utilized. Such problems arise naturally in situations like placing warehouses, communication centers, or emergency services. A typical problem is to locate a facility at that point of a graph or network which minimizes the sum of distances to users.
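The minimum-sum objective just described, and the center defined through eccentricities in §2.2.1, can be compared on a small example. The sketch below rests on simplifying assumptions that are not in the text: the graph is connected and unweighted, facilities may be placed only at vertices, and by default every vertex is a user. All function names are illustrative.

```python
from collections import deque


def graph_from_edges(edges):
    """Build an undirected adjacency dictionary from a list of edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj


def bfs_distances(adj, source):
    """Graph-theoretical distances from source, by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist


def median_vertices(adj, users=None):
    """Vertices minimizing the sum of distances to the users."""
    users = list(adj) if users is None else list(users)
    total = {}
    for v in adj:
        d = bfs_distances(adj, v)
        total[v] = sum(d[u] for u in users)
    best = min(total.values())
    return sorted(v for v in adj if total[v] == best)


def center(adj):
    """Vertices of minimum eccentricity."""
    ecc = {v: max(bfs_distances(adj, v).values()) for v in adj}
    best = min(ecc.values())
    return sorted(v for v in adj if ecc[v] == best)


# A path 0-1-2-3-4 with two extra leaves (5 and 6) attached to vertex 0.
tree = graph_from_edges([(0, 1), (1, 2), (2, 3), (3, 4), (0, 5), (0, 6)])
```

On this tree the two objectives disagree: the sum of distances is smallest at vertex 1 alone, while vertices 1 and 2 both have minimum eccentricity.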
See Hansen, et al. [40] for a survey of results in location theory. The main body of facility location theory concentrates on the location of facilities under the control of a single decision maker. In contrast, recent developments have introduced a theory of locating facilities as the result of a collective action in which “clients” pursue their own interests within the mutual dependency imposed by a voting rule. (The notion of voting rule and social choice theory has interesting implications for graph theory, and will be mentioned again later, in §§3.5, 4.1, and 4.2.3.) For recent work on location theory under collective action, see Hansen and Labbé [41] and Hansen, Thisse, and Wendell [42]. Traditionally in location theory (and elsewhere in graph theory), the objective function (for instance, find a location which minimizes the sum of the distances to the users) is assumed a priori. Recently, some authors have attempted to put these objectives on a firm axiomatic foundation. Such results, another example of the interplay between social choice theory and location theory, can be found in the papers by Holzman [43], Vohra [44], Foster and Vohra [45], and Hansen and Roberts [46].

2.2.3 Clustering Problems

In many practical problems of detection, decision making, or pattern recognition, we seek methods for clustering alternatives into groups. Clustering problems are important in medicine, as pointed out by Erhard Godehardt (this volume pp.93-108), who dealt with randomness of clusters (see Godehardt [47]). Clustering is also important in genetics, as I shall point out in §4.1, where I mention a result of Arratia and Lander which also deals with random clustering. In §4.2.5, I shall mention the importance of clustering in the theory of social networks. The reader might be particularly interested in the recent survey on clustering in Russia by Mirkin and Muchnik [48]. Let me make some general remarks here and point out some recent theoretical results in cluster analysis.

Clustering frequently starts with information about the distances between (the dissimilarities between) elements of a given set of entities. Clustering methods aim at finding, within the given set of entities, subsets called clusters which are both homogeneous and well-separated. For instance, there has been considerable interest lately in the development of algorithms for minimum diameter clustering, that is, where the diameter or largest dissimilarity between a pair of entities in a cluster is minimized. Two such algorithms are described in the paper by Guénoche, Hansen, and Jaumard [49]. Hansen and Jaumard [50] solved an open problem of Brucker by deriving an $O(N^3 \log N)$ algorithm for the minimum sum of diameters bipartitioning problem, where N is the number of entities to be clustered. Hansen, Frank, and Jaumard [51] give an efficient algorithm for determining the maximum sum of splits partitions into M clusters for all M between N - 1 and 2. Here, the split of a cluster is the smallest dissimilarity between an element in the cluster and one outside it. As we observe in §4.2.5, a potentially intriguing direction for cluster analysis is to develop concepts of clustering which are invariant under various transformations of data, and to derive conditions under which different clustering algorithms lead to conclusions which are invariant in this sense.

2.2.4 Chemical Applications
Chemistry has played a central role in the history of graph theory, with the work on organic chemistry of Cayley, Sylvester, and others, and graph-theoretical aspects of chemistry are a major area of research today. A recent overall reference for the area is the two-volume series edited by Bonchev and Rouvray [52] [53]. I shall have more to say about graph theory and chemistry in §4.3.

In his talk, Alexandru Balaban (this volume pp.109-126) emphasized the importance of distance concepts for chemical applications of graph theory. In particular, he emphasized the importance of centrality concepts for nomenclature and classification problems. On the order of 10 million compounds are discussed in Chemical Abstracts. The retrieval of structural information about these compounds is an important problem. A chemist would like to find out if someone else has already studied a compound in which he or she is interested. One of the problems is to develop a system of nomenclature. The IUPAC (International Union of Pure and Applied Chemistry) classical nomenclature is based (for acyclic graphs) on a method for coding and retrieval as well as canonical vertex numbering which is due to Ron Read and which starts from the graph center. Attempts to generalize this for any graph have not yet succeeded and progress on a generalized graph center approach would seem to be a useful direction in which to work. See for instance the work of Balaban, Kennedy and Quintas [54] and Bonchev, Balaban, and Randić [55].

2.2.5 Geometric Graphs
In their talks at the conference, Marc Lipman and Milan Randić distinguished two ideas: The graph-theoretical distance between two vertices as measured by the length of the shortest path between them; and the metric distance between the two vertices when the graph is embedded in some metric space. Randić talked about the importance of this distinction in graphs representing chemical structure. Lipman talked about the significance of this distinction in problems having to do with communications and transportation.
A recent important result about the difference between minimum spanning trees and Steiner minimum trees illustrates the distinction made by Lipman and Randić. Designers of computer circuits, long-distance telephone lines or mail routings seek to find a minimum total length collection of routes which will connect up all desired locations. In solving such a problem, one starts with a network embedded in the Euclidean plane and one first needs to determine whether or not to allow extra points in the network to use as interconnection points. If interconnection points are not allowed, one is seeking the minimum spanning tree. If interconnection points are allowed, one is seeking a Steiner minimum tree. In the former case, the problem can be solved purely from graph-theoretical distance (with weights on edges), although the embedding makes graph-theoretical distance equal to physical distance. In the latter case, we need to use the embedding and to measure physical distances to new points. Du and Hwang [56] have recently solved an important old problem by proving a 22-year-old conjecture of Gilbert and Pollak which says roughly that adding extra interconnection points cannot reduce the length of the minimum solution by more than about 13 percent. That is, the minimum ratio between the length of a Steiner minimum tree and a minimum spanning tree is $\sqrt{3}/2 \approx 0.866$. Garey, Graham, and Johnson [57] proved that the Steiner minimum tree problem is NP-hard. Therefore, it is important to find fast heuristics with good performance. By contrast, there are efficient algorithms (like Prim's and Kruskal's) for finding minimum spanning trees. The Du-Hwang result suggests that minimum spanning trees are viable heuristics for Steiner minimum trees.

In her talk, Margaret Cozzens mentioned the graphs defined from points in the plane or another metric space by joining two such points by an edge if and only if the distance between them is at most some amount $\delta$.
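The construction just described, joining two points whenever they lie within a fixed distance of each other, can be sketched in a few lines. This is a minimal illustration under stated assumptions: points are given as coordinate tuples in Euclidean space, and the function name `proximity_graph` is hypothetical.

```python
from itertools import combinations
from math import dist  # Euclidean distance, available from Python 3.8


def proximity_graph(points, delta):
    """Join points i and j by an edge iff their distance is at most delta."""
    adj = {i: [] for i in range(len(points))}
    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= delta:
            adj[i].append(j)
            adj[j].append(i)
    return adj


# Points on the real line, written as 1-tuples.
line_points = [(0,), (2,), (4,), (7,)]
line_graph = proximity_graph(line_points, 2)

# The four corners of a unit square in the plane.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
square_graph = proximity_graph(square, 1)
```

With threshold 1 the corners of the unit square yield a 4-cycle, since the sides have length 1 and the diagonals are longer.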
If the points are on the real line, we get the indifference graphs which have been studied by Roberts [58] and others and which have a large number of applications, many of which are surveyed in Roberts [59]. If the points are in the plane, the problem of characterizing the resulting graphs is still an open problem, which is discussed further in §2.3. Cozzens discussed problems involving multiple indifference graphs. These are nested families of indifference graphs on the same set. They are important in connection with the channel assignment problem in communications (see for instance Raychaudhuri [24] and Wang [60]). They also arise in connection with the applications of graphs to genetics, in particular in connection with the human genome project, which is also discussed in §4.1.

2.2.6 Metrics between Graphs
There are other ways in which distance concepts arise in graph theory. In particular, one can talk about the distance between two graphs. In his talk, Lipman mentioned the importance of this topic for chemistry (in measuring similarity of two structures) and also its importance in object identification problems of interest to the United States Navy. A large body of recent graph-theoretical work has involved changing one graph into another by edge rotation, removing an edge {u,v} and adding an edge {u,w}. The rotational distance between graphs G and H having the same number of vertices and edges is the smallest number of edge rotations required to change G into H. An intriguing open question about this distance concept is the following. Let S be a set of graphs each of which has the same number of vertices and the same number of edges. Build a new graph $D_r(S)$, the edge rotation distance graph, by taking S as the vertex set and taking two graphs G and H of S adjacent if and only if one can be obtained from the other by an edge rotation. Similar notions arise from the concept of edge slide, which is an edge rotation but only applied in the situation where {v,w} is in G; the analogous notions are edge slide distance and edge slide distance graph. An
interesting conjecture is that every graph is the edge rotation distance graph $D_r(S)$ for some S. This conjecture is known to be true for special kinds of graphs such as complete graphs, cycles, trees, line graphs, and complete bipartite graphs. Moreover, it is known that every graph is the edge slide distance graph for some set of graphs S. Sample references on these and related graph distance concepts are by Jarrett [61], Chartrand, et al. [62], and Chartrand, et al. [63] (this volume pp.127-136). These references in turn contain many references to the literature.

Distance concepts have been of interest in a variety of decision making problems. For instance, suppose each of a panel of experts or voters gives us his or her preferences, the preferences are used to define the arcs of a digraph, and we would like to find a consensus of the preferences or a consensus digraph. We could measure the distance between two digraphs and then use some measure of central tendency such as median or mean to find a consensus. This idea was introduced by Kemeny and Snell [64] to measure distances between linear (actually weak) orders. It was generalized to distances between partial orders and between asymmetric digraphs by Bogart [65] [66]. Bogart and Weeks [67] talk about a distance between signed digraphs. A similar problem arises in numerical taxonomy. Suppose that a variety of classification procedures provide different trees. How does one find a consensus tree? Distance measures between trees are used to find a consensus in applications in numerical taxonomy, evolution, and other areas by Barthélemy and McMorris [68], Day and McMorris [69], Margush and McMorris [70], McMorris and Neumann [71] and elsewhere. I shall return to issues of consensus in §§3.5, 4.1 and 4.2.3.

2.3 Intersection Graphs
Suppose that F is a family of sets. We can build a graph, the intersection graph of F, by letting the sets in F be the vertices and joining two such vertices by an edge if and only if they have a nonempty intersection. Intersection graphs have had a large number of important applications, both practical and graph-theoretical (see Fishburn [72], Golumbic [73], Roberts [2] [59] [74]) and promise to continue to do so. Among the important families which have been studied are subfamilies of the families of real intervals (here the intersection graphs are called interval graphs), disks in the plane and higher-dimensional space, boxes in the plane and higher-dimensional space, cubes in the plane and higher-dimensional space, convex sets, circular arcs, unions of intervals, and so on. See Cozzens and Roberts for recent results on some of these families.

Here I shall emphasize the intersection graphs of disks in the plane or in higher dimensional Euclidean space. If all the disks are in r-dimensional space and all have the same radius, the resulting intersection graph is called an r-unit sphere graph. The 1-unit sphere graphs are the same as the indifference graphs mentioned in §2.2.5. The problem of characterizing the 2-unit sphere graphs remains open, as does the problem of characterizing the intersection graphs of disks of differing radii in the plane. The problem also remains open if the radii vary, but are dependent on each other (to a ‘closest’ neighbor). This is a variation suggested in the talk by Lipman who pointed out that 2-unit sphere graphs arise in problems of communication among ships in the ocean. The ships can see each other but wish to remain silent. They communicate by line of sight. We can represent a ship by a circle whose radius is the distance from the ship to the horizon, and two ships can communicate if and only if their circles intersect.
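The intersection graph construction is easy to sketch once a test for nonempty intersection is available for the family in question. The example below is a hedged illustration: the function names are hypothetical, intervals are assumed closed and given as (low, high) pairs, and disks are assumed closed and given as (x, y, radius) triples.

```python
from itertools import combinations


def intersection_graph(family, intersects):
    """Vertices are indices into family; edges join intersecting members."""
    adj = {i: set() for i in range(len(family))}
    for i, j in combinations(range(len(family)), 2):
        if intersects(family[i], family[j]):
            adj[i].add(j)
            adj[j].add(i)
    return adj


def intervals_meet(a, b):
    """Closed real intervals (low, high) share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def disks_meet(a, b):
    """Closed disks (x, y, radius) in the plane share at least one point."""
    (x1, y1, r1), (x2, y2, r2) = a, b
    return (x1 - x2) ** 2 + (y1 - y2) ** 2 <= (r1 + r2) ** 2


intervals = [(0, 2), (1, 3), (4, 5)]
interval_graph = intersection_graph(intervals, intervals_meet)

# Three "ships", each with horizon radius 1, as in the line-of-sight example.
ships = [(0, 0, 1), (1.5, 0, 1), (5, 0, 1)]
ship_graph = intersection_graph(ships, disks_meet)
```

Passing the intersection test as a parameter keeps the construction itself independent of the particular family of sets.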
The r-unit sphere graphs also arise in biochemistry in the work of Havel, Kuntz, and Crippen [76] and Havel, Kuntz, Crippen, and Blaney [77] and in the channel assignment problem (Cozzens and Roberts [14]). In the latter case, we are interested in finding good ways to color 2-unit sphere graphs. Unfortunately, as Orlin (personal communication) has shown, this problem is already NP-hard. Other work on r-unit sphere graphs is in the papers by Fishburn [78] and Maehara [79]-[81].

2.4 Other Variants Briefly Mentioned
Other traditional graph-theoretical concepts whose modern variants are being studied and need to be studied further are:

Vulnerability Concepts. These were mentioned in the talk by Henda Swart, and include such ideas as integrity, toughness, and the like, almost all hard to calculate. They were also mentioned in Clyde Monma's talk, which dealt with the design of survivable communication networks, and are related to the ideas in the talks on reliability and survivability by Jean-Claude Bermond and by Olga Salizky. See the survey papers by Bagga, et al. [82] and Barefoot, Entringer, and Swart [83].

Domination Concepts. These were also mentioned in the talk by Swart.

Tournament Concepts. These were mentioned in the talk by Jørgen Bang-Jensen.

Graph Polynomials (Generalized Chromatic Polynomials). These were mentioned in the talks by William T. Tutte (this volume pp.153-158) and by Dominic Welsh (this volume pp.159-171).

Covering Concepts.

Matching Concepts.

Planarity Concepts. Planar graphs were discussed in the talks by Ronald Read (this volume pp.201-210), Karen Seyffarth, Richard Steinberg (this volume pp.211-247), Tutte, and Welsh.

Crossing Numbers.

Extremal Concepts. These were mentioned in the talks by Béla Bollobás (this volume pp.5-11), Paul Erdős, and Vera Sós.

Ramsey Concepts. These were mentioned in the talks by Bollobás, Erdős, Michał Karoński, Andrzej Ruciński (this volume pp.265-273), and Sós.

3. New Approaches to Algorithms Will be Developed
Perhaps the biggest change in graph theory in the past two to three decades has been the increasing emphasis on algorithms for solving graph-theoretical problems. The algorithmic approach to graph theory has had a dramatic effect on the growth and development of the subject and the algorithmic developments continue in new and exciting directions. It is safe to predict that more such new directions will be explored.
3.1 On-Line Algorithms
There is increasing emphasis in practical problems on finding solution algorithms which are on-line in the sense that one is forced to make choices at the time data becomes available, rather than after having the entire problem spelled out. A general approach to on-line problems is to think of them as sequential decision making problems. There are two points of view: (a) formulate a probabilistic model of the future and minimize the expected cost of future decisions; (b) compare an on-line decision strategy to the optimal off-line algorithm, one that works with complete knowledge of the future. The first is the approach taken for instance in the theory of Markov decision models. However, in graph theory, the second approach is starting to lead to a very fascinating new branch of algorithm development. For a recent overview of the field, see McGeoch and Sleator [84].
The talks by William T. Trotter and Marc Lipman discussed on-line algorithms. Lipman emphasized their importance in practice. Trotter argued that many practical problems, such as investment decisions, are inherently on-line, that on-line algorithms are a natural setting for approximation, and that on-line methods already have led to the development of new, clever algorithmic tricks. Let me briefly discuss the on-line approach to graph coloring. We have a graph builder B and a graph colorer C. B builds the graph one vertex at a time, presenting the vertex and its adjacencies. C responds with a choice of color as each new vertex with its adjacencies is presented by B. How do we measure how well C does? Let $\chi_{OL}(G)$ be the least number of colors t so that C has a strategy which colors G with t colors regardless of how B builds G. Let F be a family of graphs and $\chi_{OL}(F)$ be the least number of colors t so that C has a strategy which colors a graph built by B in t colors, regardless of which graph from F B builds, or how it is built. The simplest on-line algorithm for graph coloring is the ‘greedy’ algorithm sometimes known as ‘first fit’. On-line algorithms for coloring interval graphs have applications in the study of the channel assignment problem and in dynamic storage problems (see references in Trotter [85]). Kierstead and Trotter [86] have shown that if F is the family of interval graphs with chromatic number k, then $\chi_{OL}(F) = 3k - 2$; that is, there is an on-line algorithm (not first fit) which always colors the graph in at most two fewer than three times as many colors as the optimal coloring obtained when the entire graph is known in advance, but there can be no on-line algorithm which is guaranteed to do better. Gyárfás and Lehel [87] have obtained a similar result for the indifference graphs. It is a ‘folklore’ theorem that if F is the family of forests on n vertices, then $\chi_{OL}(F)$ is […]. It remains an open question to compute $\chi_{OL}(F)$ if F is the family of bipartite graphs.
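The builder/colorer game can be made concrete with a minimal sketch of first fit. The representation is an assumption made for the example: the builder's presentation is a list of pairs, each giving a new vertex together with its neighbors among the vertices already presented, and colors are the positive integers.

```python
def first_fit_online(presentation):
    """Color vertices on-line with first fit.

    `presentation` is a list of (vertex, earlier_neighbors) pairs in the
    order in which the builder reveals them. Each vertex receives the
    smallest color not used on its earlier neighbors and is never recolored.
    """
    color = {}
    for vertex, earlier_neighbors in presentation:
        used = {color[u] for u in earlier_neighbors}
        c = 1
        while c in used:
            c += 1
        color[vertex] = c
    return color

# The path a-b-c-d, revealed in the order a, d, b, c.
bad_order = [("a", []), ("d", []), ("b", ["a"]), ("c", ["b", "d"])]
# The same path revealed from one end to the other.
good_order = [("a", []), ("b", ["a"]), ("c", ["b"]), ("d", ["c"])]

print(first_fit_online(bad_order))
print(first_fit_online(good_order))
```

On this path, which has chromatic number 2, the first order of presentation forces first fit to use three colors while the second uses two, which is the kind of dependence on the builder that the on-line chromatic number is meant to capture.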
3.2 Existence of Algorithms
As Read pointed out (this volume pp.201-210), in the 1960's, when the major developments of the algorithmic approach to graph theory began, there was an emphasis on finding effective algorithms, ones which work. Gradually there developed the theory of efficient algorithms and the search for algorithms which are polynomial of low degree. Until recently, however, the only method for showing that there was a polynomial algorithm for solving a problem was to exhibit such an algorithm. As Read said, this may not be the easiest way to show this and it limits us to algorithms which we are capable of constructing. In an important series of papers, Robertson and Seymour have shown the existence of polynomial algorithms without indicating how to construct the algorithms. For instance, they showed (Robertson and Seymour [88]) that for any fixed integer w, there is a polynomial algorithm to decide if an input graph has tree-width at most w. Robertson and Seymour were later able to construct a polynomial algorithm for this problem, and indeed Arnborg, Corneil, and Proskurowski [89] constructed an algorithm of running time $O(n^{w+2})$ to test if a graph on n vertices has tree-width less than or equal to w, for every fixed w. However, Robertson and Seymour [90] prove that there exists an algorithm of running time $O(n^2)$, for every fixed w, and as of this date, no one knows how to construct such an algorithm. To paraphrase Read: The Robertson-Seymour Theorems point to a coming theory of graph algorithms which transcends our ability to construct explicit algorithms. We are left with several questions: Are we happy with the knowledge that an algorithm exists? What is the practical significance of this conclusion? Does the knowledge that an algorithm exists ever help us in devising other algorithms?
3.3 Algorithms Based on Lies
Some approaches to the development of algorithms are highly unusual. Let me describe an algorithm developed to deal with a garbage problem posed by the New York City Department of Sanitation. (See Beltrami and Bodin [91] and Tucker [92].) In this problem, we need to assign garbage trucks to pick up garbage. A particular garbage truck is assigned a tour or schedule of sites that it visits on a given day. We wish to assign each tour to a day of the week (Monday through Saturday) so that each site is visited a specified number of times a week, no site is visited twice in one day, and no day is assigned more tours than there are trucks. We wish to find a set of tours which has such an assignment and minimizes the total amount of time taken by all trucks. The entire problem is solved using a heuristic algorithm. At the heart of the algorithm is the subroutine designed to decide, given a set of tours, whether or not each tour can be assigned to one of the six days of the week so that if two tours visit a common site, they get a different day. This problem is equivalent to the difficult question of determining if a graph defined from the tours is 6-colorable. The subroutine must be used over and over again. Tucker has observed that if the famous Berge Conjecture (strong perfect graph conjecture) is true, then there is a much-improved algorithm to use in the subroutine. What is wrong with using this algorithm? In some sense, using it is “lying” since it is based on a statement which may not be true. However, should the algorithm ever give rise to a set of tours which cannot be assigned to the six days of the week in the desired way, then we would have found a counterexample to the strong perfect graph conjecture! So, we have nothing to lose; lying pays off! The main point I wish to make is that some algorithms for practical problems can use rather unusual strategies.
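To make the subroutine's question concrete, here is a minimal sketch of the graph defined from the tours and of a plain backtracking search for an assignment of days. This is not the improved algorithm based on the Berge Conjecture, nor the heuristic of the papers cited; it is an exhaustive search that is exponential in the worst case, and the tour names, site names, and function names are hypothetical.

```python
from itertools import combinations

def tour_conflict_graph(tours):
    """Adjacency sets: two tours are adjacent when they visit a common site."""
    adj = {t: set() for t in tours}
    for s, t in combinations(tours, 2):
        if set(tours[s]) & set(tours[t]):
            adj[s].add(t)
            adj[t].add(s)
    return adj

def assign_days(adj, days):
    """Backtracking search for a proper coloring of the tours by `days`.

    Returns a dict tour -> day, or None if no assignment exists.
    """
    order = sorted(adj, key=lambda t: -len(adj[t]))  # most constrained first
    assignment = {}

    def extend(i):
        if i == len(order):
            return True
        tour = order[i]
        for day in days:
            if all(assignment.get(u) != day for u in adj[tour]):
                assignment[tour] = day
                if extend(i + 1):
                    return True
                del assignment[tour]
        return False

    return assignment if extend(0) else None

# Hypothetical tours, each a list of sites.
tours = {"T1": ["s1", "s2"], "T2": ["s2", "s3"], "T3": ["s3", "s1"], "T4": ["s4"]}
week = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
adj = tour_conflict_graph(tours)
schedule = assign_days(adj, week)
print(schedule)
```

The sketch checks only the requirement that tours sharing a site get different days; the visit-frequency and truck-count constraints of the full problem are left out.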
As an aside, let me note that the Perfect Graph Conjecture, proved in its weak case by Lovász [93] [94], continues to be one of the sources of a great deal of graph-theoretical work, see for example Berge and Chvátal [95]. I should mention here the linear programming approach to this problem and to combinatorial problems in general, which is developed in detail in Grötschel, Lovász, and Schrijver [96]-[98]. Of special note is the strong new general method for constructing higher-dimensional polyhedra whose projection approximates the convex hull of 0-1 valued solutions of a system of linear inequalities (see Lovász and Schrijver [99]). This new general method is especially relevant to odd holes, odd antiholes, and other concepts related to perfect graphs. It is also relevant to orthogonality constraints such as those developed by Lovász [100] in solving the Shannon capacity problem for the 5-cycle and also discussed by Lovász [101] and more recently by Narasimhan and Manber [102].
3.4 Approximation Algorithms and Algorithms that May Work on Special Classes of Graphs
Often when there is no good algorithm for solving a problem in complete generality, we might look to solve it approximately. In many practical problems, an approximate solution is all that we really need. In graph theory today, there is increasing emphasis on approximation algorithms. After several decades of proving that problems are NP-complete when looked at in complete generality, we are finding that the conclusion of NP-completeness for the problem of finding the optimal solution might not be very relevant if we can live with a near-optimal solution. An example of an approximation algorithm was given in the talk by Kim Hefner. Suppose D = (V,A) is a digraph. Its conflict graph is the graph G = (V,E) where $\{x,y\} \in E$ if and only if there is $a \in V$ so that (x,a) and (y,a) are in A. The conflict graph arises in communication applications where V is a set of transmitters and an arc from x to a means that a signal sent at x can be received at a. Then $\{x,y\} \in E$ means that x and y conflict in the sense that signals sent at x and y can be received at the same place. (The same construction arises in ecology. Here, V is a set of species in an ecosystem and an arc from x to a means species x preys on species a. Then $\{x,y\} \in E$ means that x and y compete in the sense of having a common prey, and (V,E) is the competition graph of (V,A). See Lundgren [103] for a recent survey article about competition graphs. Competition graphs were discussed in the talk by Suh-ryung Kim (this volume pp.313-326). See Kim [104] and Wang [105] for many additional references on competition graphs.) Hefner’s talk was based on a question arising from large naval communication networks. The question was the following: Given a graph G, what is the maximum number of arcs in a digraph D which has G as its conflict graph? The essential issue is that given a network, we wish to determine how many links we can add to the transmission system and not change conflicts. This problem is NP-complete. However, it can be solved efficiently by an approximation algorithm discussed by Hefner; see Hefner and Hintze [106]. (As an aside, we note that Hefner’s question has given rise to another: What graphs arise as competition graphs of strongly connected digraphs? See Jones, Lundgren, Maybee, and Pullman.)
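The conflict (or competition) graph is easy to compute directly from the definition. The following is a minimal sketch; the function name and the small digraph are hypothetical, and edges are returned as two-element frozensets.

```python
from itertools import combinations

def conflict_graph(vertices, arcs):
    """Edges {x, y}, x != y, such that some a has both (x, a) and (y, a) as arcs."""
    senders_to = {v: set() for v in vertices}
    for x, a in arcs:
        senders_to[a].add(x)
    edges = set()
    for senders in senders_to.values():
        for x, y in combinations(sorted(senders), 2):
            edges.add(frozenset((x, y)))
    return edges

V = [1, 2, 3, 4]
A = [(1, 3), (2, 3), (2, 4), (3, 4)]
conflicts = conflict_graph(V, A)
print(sorted(sorted(e) for e in conflicts))

# Adding the arc (4, 1) creates no new conflict, since 1 has no other sender.
same = conflict_graph(V, A + [(4, 1)]) == conflicts
```

The last line illustrates the flavor of Hefner's question: an arc can sometimes be added to the digraph without changing the conflict graph. Deciding the maximum number of such arcs is the hard problem, and this sketch does not attempt it.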
Of increasing interest is the development of random approximation algorithms. I shall discuss this subject in §5.4. As an alternative to approximation, there is another approach which is sometimes useful, not just for Hefner’s problem but in general: Analyze the problem for special classes of graphs relevant to the application we have in mind. In the case of conflict graphs, it makes sense to study conflict graphs of digraphs which are relevant to the communication application which motivated our interest in them. These are digraphs which describe highly reliable communication network topologies such as the double loop networks studied in Hu, Hwang, and Li [108] and the chordal ring topologies studied by Hu and Hwang [109]. Recently, Roberts and Wang (unpublished) have been studying the conflict graphs of such networks and it seems reasonable to try to analyze Hefner’s problem for the special classes of conflict graphs which result. Incidentally, highly reliable network topologies received a great deal of attention at this conference, and were mentioned in talks by Bermond, Monma, Claudine Peyrat, and Salizky. A recent comprehensive reference on network reliability is the book edited by Hwang, Monma, and Roberts [110].
3.5 Parallel and Distributed Algorithms
The design, analysis, and management of computing systems that consist of many processors is a central part of computer science, and its study has led to much important work in graph theory. Such systems divide into two general (and not entirely distinct) categories, distributed systems and parallel computers. A distributed system consists of autonomous, physically separated computers linked together in a network, and the main theoretical issues center around problems of communication and synchronization of such systems. A parallel computer is usually a single machine composed of many distinct processing units which work together to perform the same kinds of tasks performed by a standard sequential machine. Over the past few years, the catalogue of problems in graph theory for which there are efficient parallel algorithms has expanded greatly. An example of such a problem is the problem of determining whether or not a graph has a strongly connected orientation. (I previously discussed, see §2.2.1, the problem of choosing such an orientation which minimizes some objective defined using a distance measure.) A good parallel algorithm for finding a strongly connected orientation, such as those developed by Atallah [111] or Vishkin [112] (see survey
in Karp and Ramachandran [113]) is very different from a good sequential algorithm for the same question, for instance those of Boesch and Tindell [114], Chung, Garey and Tarjan [115] or Roberts [2]. On the other hand, insights gained from analyzing problems from a parallel/distributed point of view have, for some problems such as the network flow problem, led directly to new efficient sequential algorithms. The design and analysis of distributed computing networks involves many of the same fundamental ideas and problems that have been previously studied in relation to other kinds of communication networks; for example, connectivity, reliability, routing, spanning trees. In addition, understanding and tracking how information flows in the network and how the individual processors respond to this information requires the development of new logical and combinatorial models. Some of the same issues arise in parallel computers, where the architecture of a specific machine is such that the processors are linked in some network structure, such as a hypercube, and thus information transfer becomes an important theoretical consideration. Hypercubes were discussed at the conference in the talks by Claudine Peyrat and by William Widulski (this volume pp.327-331). The effective utilization of such parallel machines requires a thorough understanding of their underlying graph-theoretical structure. In a distributed algorithm, each processor has a piece of relevant information. By exchanging messages, the sum total of knowledge can be obtained. There was one talk at this conference, that by Bermond, which discussed the development of consensus protocols for finding this sum total of knowledge. (I have previously discussed consensus issues in graph theory in §2.2.6 and will discuss them further in §§4.1 and 4.2.3.) It seems natural to investigate the relevance of the social choice literature.
Especially relevant is the recent trend to apply techniques of combinatorial optimization to social choice problems. Here I should mention a paper by Bartholdi, Tovey, and Trick [116]. In this paper it is shown that computing the social welfare function called the Dodgson winner is an NP-complete problem, and it is therefore difficult to determine the winner of an election! Thus, social choice theory shows that consensus protocols based on very natural consensus procedures might lead to algorithms which are inefficient.
3.6 Probabilistic Algorithms
I shall discuss probabilistic algorithms in §5.4.
4. Applied Problems will Continue to Stimulate the Development of Graph Theory
The history of graph theory has been closely linked to applications, witness for example the importance of computer science, chemistry and electrical networks in the development of the subject. It seems reasonable to expect that applied problems will continue to play an important role, both in stimulating new graph-theoretical work and as areas where graph theory can be of practical use. Here I shall concentrate on a few areas of special interest, namely, genetics, the social sciences, chemistry, and communication networks/information management. This is not to diminish the past and future importance of transportation problems, location problems, ecology, manufacturing, computer science, and other areas. These just happen to be a few of the topics discussed by speakers during Quo Vadis, Graph Theory? and they are also a few of my favorite topics. I will also mention, lest we forget, the impact of graph theory on and benefits of graph theory from areas of pure mathematics.
4.1 Genetics
Margaret Cozzens’ talk emphasized the important role of graph theory in the human genome project. It is now well-known that information storage within a cell is by means of long nucleic molecules, which can be thought of as long strings of smaller units called nucleotides. For instance, in ribonucleic acids - RNA - each nucleotide for simplicity is one of four bases. Nowadays, by the use of radioactive marking and high-speed computer analysis, it is possible to sequence long RNA chains rather quickly, and it has become feasible to think of sequencing the entire 3-billion base long human genome. The human genome is the total genetic complement of the cell, all the genes on all the chromosomes. There are approximately 100,000 genes distributed in 23 chromosomes. Mapping the human genome would require localizing each of its genes; sequencing it would require determining the exact order of the thousand or more nucleotides which make up each gene. Sequencing the entire human genome would ultimately make it possible to devise ways to treat such genetic disorders as Alzheimer’s disease and cystic fibrosis. For more on this topic from a non-technical point of view, see Congress of the United States [117] or DeLisi [118]. Other general references are Bell and Marr [119] and Waterman [120]. Graph theory has played an important historical role in genetics, and in particular in the sequencing problem. I should mention here the discovery by Benzer that the gene structure is linear, which led to the theory of interval graphs (see Benzer [121] [122]). I should also mention the use of Eulerian chains in graphs by Hutchinson [123] to sequence an RNA chain from fragments obtained by a complete enzyme digest. It was this method which was used to improve on the fragmentation stratagem which was used by R.W. Holley and his co-workers at Cornell University (Holley, et al.
[124]) to determine the first nucleic acid sequence, and which, at one small point in time, played a vital role in the development of genetics. A problem of considerable importance in the human genome project and which was mentioned by Cozzens in her talk is the problem of detecting matches. Detecting the similarity between two RNA, DNA, or protein sequences has led to the discovery of important shared phenomena. For instance, it was discovered that the sequence for platelet derived factor, which causes growth in the body, is 87% identical to the sequence for v-sis, a cancer-causing gene. This led to the discovery that v-sis works by stimulating growth. More generally, now we are seeking matches among a cluster of sequences. This problem can be approached graph-theoretically by defining a graph G on a vertex set consisting of a set of RNA, DNA, or protein sequences, where two sequences are adjacent if and only if they match “pretty well”. The problem is then to find a set of k vertices that generate a subgraph of $\alpha\binom{k}{2}$ edges. If $\alpha = 1$, we are looking for a clique. In a sample result, Arratia and Lander [125] estimate the size of a “significant cluster” in a random graph. There is much more to be done here. A general discussion of alignment and matching problems in connection with the human genome project can be found in Chapters 3 and 4 of the book edited by Waterman [120], and many other graph-theoretical problems related to the human genome project are also summarized in that book.
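As a rough illustration of the cluster problem, here is a brute-force sketch. It rests on an assumption about the intended condition: that the k chosen vertices should induce at least a fraction alpha of the k(k-1)/2 possible edges, so that alpha = 1 asks for a clique. The sequence labels and the function name are hypothetical, and the enumeration is exponential in k, so this is a statement of the problem rather than a practical method.

```python
from itertools import combinations

def dense_clusters(adj, k, alpha):
    """All k-subsets of vertices inducing at least alpha * k*(k-1)/2 edges."""
    needed = alpha * k * (k - 1) / 2
    found = []
    for subset in combinations(sorted(adj), k):
        induced = sum(1 for u, v in combinations(subset, 2) if v in adj[u])
        if induced >= needed:
            found.append(subset)
    return found

# Hypothetical match graph: an edge means two sequences match well.
matches = {
    "s1": {"s2", "s3"},
    "s2": {"s1", "s3"},
    "s3": {"s1", "s2", "s4"},
    "s4": {"s3"},
}
cliques = dense_clusters(matches, k=3, alpha=1.0)
loose = dense_clusters(matches, k=3, alpha=0.5)
print(cliques, loose)
```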
Consensus methods, which we have already mentioned in various places in this paper, are also becoming of interest in connection with the human genome project and with molecular biology in general. Day and McMorris [126] survey nine different consensus methods of use in molecular biology and Day [127] has compiled an annotated bibliography of over 115 papers on this subject. A typical application of consensus methods is to find a consensus “pattern” given a collection of molecular sequences; for example, DNA sequences. This problem is studied, for example, by Waterman [128] and Mirkin and Roberts [129]. In the latter paper, it is suggested that some of the algorithmic methods of graph theory and combinatorial optimization for computing consensus functions such as medians should be relevant to the consensus problems of molecular biology.
4.2 Social Sciences
The social sciences, such as economics, psychology, sociology, anthropology, and political science, have been a major source of interesting graph-theoretical problems and in turn have used many graph-theoretical ideas. Here, I summarize some of these applications of graph theory in the social sciences, with an attempt to suggest future directions of interest. The book by Harary, Norman, and Cartwright [130] has had great influence on a generation of social scientists and popularized many graph-theoretical ideas motivated by social science applications. An early survey article on graph theory in the social sciences, which can serve as background for many of the topics in this section, is the article by Roberts [131].
4.2.1 Balance
Motivated by the theory of small group behavior, Harary [132] and Cartwright and Harary [133] introduced the notion of balance of a signed graph, which corresponds somehow to the imprecise idea of absence of tension. The large amount of work in balance theory is summarized in the papers by Johnsen [134] and Roberts [74]. A signed graph is a graph with a sign, + or -, on each edge. Such a signed graph is called balanced if every cycle has an even number of - signs. In her talk, Cozzens observed that signs can change over time, and then asked what happens to balance. For instance, suppose $r_{ij}$ is the friendship between i and j (positive or negative) and the $r_{ij}$ change over time according to some rule. What if we only know the $r_{ij}$ up to sign? Does the corresponding signed graph eventually stabilize over time?
If so, does it stabilize to a balanced signed graph? Is this balanced signed graph determined by the initial sign pattern alone? A simple example of a model for the change of the $r_{ij}$ over time is given by Hubbell, Johnsen, and Marcus [135]. Incidentally, I should mention that ideas of balance theory, originally developed because of their importance in sociology, have also had applications in the simplification and analysis of the structure of mathematical models for large, complex systems, such as those used to analyze economic or energy systems. (See for instance Greenberg, Lundgren, and Maybee [136].) They have also had applications in economics where they are closely related to the so-called Morishima matrices. (See Morishima [137] and Roberts [74].) They have had surprising applications in the theory of maximizing quadratic polynomials in 0,1-variables. See the paper by Hansen and Simeone [138]. A talk by Andrea Colbourn and John Kennedy emphasized a notion related to balance which is called groupthink. This is a phenomenon which can be modelled by a digraph in which each member of a group is represented by a vertex and an arc goes from a dominating individual to a dominated individual. The question is to what extent the members of a group are dominated by a single individual, since this leads to in-group pressures. As a preliminary model, Colbourn and Kennedy defined a groupthink digraph as a signed digraph in which there is a connected, induced subgraph, each arc of which is +, and so that every vertex not in the subgraph is dominated by more of the vertices in the subgraph than it dominates in the subgraph. (See Colbourn and Kennedy [139].) A characterization of groupthink digraphs is needed. Also, results similar to those of balance theory are needed: How does one measure
‘degree’ of groupthink (how close one is to a groupthink digraph); does the smallest number of sign changes to achieve groupthink equal the smallest number of arc deletions needed to achieve groupthink, at least in certain situations?
4.2.2 Sign Stability
Suppose that A is a square real matrix. We say that A is stable if every eigenvalue of A has negative real part. This property of a matrix is important in the analysis of the stability properties of dynamical systems and has had important applications in biology, chemistry, economics, etc. In his famous book, Samuelson [140] observed that sometimes we only know the entries of a matrix like A up to their signs, and asked when the stability properties of A could be derived just from its signs. If every matrix with the same sign pattern as A is stable, we call A sign stable. The theory of sign stability, and its related theory of sign solvability, are surveyed in the papers by Klee [141], Maybee [142], and Roberts [74], and were mentioned in the talk by Cozzens. Sign stable matrices were first characterized using graph theory by Jeffries, Klee, and van den Driessche [143]. One of the most important future areas for research about sign stability has to do with the generalization to the situation which prevails if we know more than just the sign pattern. For instance, as Victor Klee has pointed out (see Greenberg and Maybee [144]), sometimes we know whether an entry of the matrix A is large positive, small positive, zero, small negative, or large negative. Is there a theory analogous to sign stability (or sign solvability) which can be developed here?
4.2.3 Pulse Processes
In her talk Cozzens mentioned the pulse process model, which is one of a variety of structural models based on weighted digraphs and related matrices which have been developed to understand large scale decision making problems. The pulse process model was developed by Roberts [145] and is summarized in Roberts [2] [59]. It and related models have been applied to decision making problems having to do with such topics as food production, energy use, air pollution, transportation systems, coastal resources, health care delivery, manpower, water policy, inland waterway traffic, ecosystems, and the analysis of historical events. A pulse process works this way. Each vertex of a weighted digraph corresponds to a variable relevant to a decision making problem being modelled. The vertices attain values (positive or negative real numbers) at discrete times. If a vertex x increases in value by an amount c at time t, and if there is an arc from x to y with weight u, then as a result vertex y increases in value by an amount cu at time t + 1. A signed or weighted digraph is called value stable under a pulse process if the sequence of values at each vertex is bounded (in absolute value) over time. Value stability can be characterized in terms of the eigenvalues of the signed or weighted digraph (see Roberts [2]). However, as Roberts [2] observes, the conclusions of value stability are sensitive to changes in weights in the digraph. Recently, Tanny and Zuker [146] [147] have obtained results on the sensitivity of eigenvalues under elementary matrix perturbations, and applied them to stability conclusions from pulse processes. More work is needed along these lines. Another direction of future work has to do with consensus problems. I have already talked about consensus issues in §§2.2.6, 3.5, and 4.1.
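The pulse rule just described can be simulated directly. The sketch below makes several assumptions for the sake of the example: all starting values are 0, a single external pulse is introduced at time 0, and every vertex appears as a key of the weight dictionary. The function name and the two small digraphs are hypothetical.

```python
def pulse_process(weights, initial_pulse, steps):
    """Simulate a pulse process on a weighted digraph.

    weights[x][y] is the weight u on the arc from x to y. A change of c at
    x at time t causes a change of c * u at y at time t + 1.
    Returns the list of value dictionaries for times 0, 1, ..., steps.
    """
    vertices = list(weights)
    pulse = {v: initial_pulse.get(v, 0.0) for v in vertices}
    value = dict(pulse)                      # values right after the time-0 pulse
    history = [dict(value)]
    for _ in range(steps):
        next_pulse = {v: 0.0 for v in vertices}
        for x in vertices:
            for y, u in weights[x].items():
                next_pulse[y] += pulse[x] * u
        pulse = next_pulse
        for v in vertices:
            value[v] += pulse[v]
        history.append(dict(value))
    return history

# A two-vertex cycle with one negative arc, and the same cycle with both arcs positive.
negative_cycle = {"a": {"b": -1.0}, "b": {"a": 1.0}}
positive_cycle = {"a": {"b": 1.0}, "b": {"a": 1.0}}
damped = pulse_process(negative_cycle, {"a": 1.0}, steps=12)
growing = pulse_process(positive_cycle, {"a": 1.0}, steps=12)
print(damped[-1], growing[-1])
```

In the first digraph the values cycle with period 4 and stay bounded for this starting pulse, while in the second they grow without bound. A finite simulation of one starting pulse can only suggest, not establish, value stability; the eigenvalue characterization mentioned above is what settles it.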
Bogart and Weeks [67] introduce a distance measure between two signed or weighted digraphs and use the distances to define a consensus signed or weighted digraph given different digraphs constructed by different experts. Roberts
[148] [149] uses a different consensus method to build the weighted digraphs used in pulse process analysis. All of these consensus methods are sensitive to changes in scales used to define the signed and weighted digraphs, and analysis of this sensitivity is needed.
4.2.4 Meaningfulness of Conclusions
If a conclusion such as one about stability or one about a consensus weighted digraph, as discussed in the previous section, can change when scales used to measure elements such as the weights on the digraph change (in allowable ways), we say that these conclusions are meaningless - they are not invariant under scale transformations. Meaningless conclusions are accidents of the particular scale parameters we choose, such as parameters defining units or zero points, and so don’t tell us anything inherently important. (For instance, it is meaningless to say that the temperature in New York is twice the temperature in Fairbanks, since this statement might be true in Fahrenheit and false in Centigrade.) The theory of meaningfulness of scales of measurement is an important area in the theory of measurement, and is summarized in such books as Roberts [150] and Luce, et al. [151]. Its many applications are summarized in Roberts [152] [153]. Some of its uses in graph theory are summarized in the paper by Roberts [154]. Recently, in [155], I have suggested analyzing many conclusions in graph theory and, more generally, combinatorial optimization, from the point of view of meaningfulness. An early result along these lines is the result of Roberts [156] that conclusions about value stability in pulse processes (see §4.2.3) are meaningful if all of the variables are measured on interval scales, scales which like temperature are unique up to a choice of unit and a choice of zero point. I would like to mention several other similar results. The problem of finding the shortest path from x to y in a weighted graph leads to conclusions which are meaningless if the weights are measured on interval scales. However, the conclusions are meaningful if the weights are measured on ratio scales, scales which, like mass, are unique up to choice of unit (see Roberts [155]).
The conclusion that a particular tree in a weighted graph is a minimum spanning tree is meaningful for both ratio and interval scales, indeed, even for ordinal scales in which the numbers are known only up to order. This is not a priori obvious, but follows from the fact that the greedy algorithm, applied to an ordering of edges from lowest weight to highest, gives us the minimum spanning tree (see Roberts [155]).
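These two observations can be checked on a small example. The sketch below is only an illustration on one hypothetical weighted graph, not a proof: it compares the minimum spanning tree and the shortest path before and after a change of unit (ratio scale), a change of unit and zero point (interval scale), and an arbitrary increasing transformation (ordinal scale). The helper names are assumptions, and the greedy algorithm is implemented in Kruskal's form.

```python
import heapq

def kruskal(n, edges):
    """Greedy minimum spanning tree of a connected graph on vertices 0..n-1."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = set()
    for w, u, v in sorted(edges):            # lowest weight first
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.add((u, v))
    return tree

def shortest_path(n, edges, s, t):
    """Dijkstra's algorithm; returns one shortest s-t path as a vertex list."""
    adj = {v: [] for v in range(n)}
    for w, u, v in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    best, prev, heap = {s: 0}, {}, [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > best.get(u, float("inf")):
            continue
        for v, w in adj[u]:
            if d + w < best.get(v, float("inf")):
                best[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    path = [t]
    while path[-1] != s:
        path.append(prev[path[-1]])
    return path[::-1]

def rescale(edges, f):
    """Apply the scale transformation f to every weight."""
    return [(f(w), u, v) for w, u, v in edges]

edges = [(1, 0, 1), (2, 1, 2), (3, 2, 3), (7, 0, 3)]   # (weight, u, v)
ratio = rescale(edges, lambda w: 10 * w)               # change of unit
interval = rescale(edges, lambda w: w + 1)             # shift of zero point
ordinal = rescale(edges, lambda w: w ** 3 + 5)         # any increasing map

print(shortest_path(4, edges, 0, 3), shortest_path(4, interval, 0, 3))
print(kruskal(4, edges) == kruskal(4, ordinal))
```

Here the shortest path from 0 to 3 changes when the zero point is shifted but not when the unit is changed, while the minimum spanning tree survives even the increasing cubic transformation.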
A much more subtle conclusion is the recent result of Cozzens and Roberts [157] about T-coloring, a concept which was defined in §2.1.3. They prove that if separations (elements of T) are measured on a ratio scale, then it is meaningful to conclude that the greedy algorithm finds an optimal T-coloring for every complete graph; however, it is meaningless to conclude that the greedy algorithm finds an optimal T-coloring for a particular complete graph. (Optimality means that the span or separation between smallest and largest channels used is minimized. Greedy algorithms on complete graphs are an important area of research in T-colorings; a major open question in the field is to specify for which values of n and sets T the greedy algorithm obtains the span of $K_n$.) Meaningfulness methods have been applied to analyze consensus methods such as those discussed in §2.2.6. For instance, they have been used to pinpoint conditions under which an arithmetic mean is the appropriate consensus function. See Aczél, Roberts, and Rosenbaum [158] and Aczél and Roberts [159].
4.2.5 Social Networks
In sociology, there is a great deal of interest in studying social networks, graphs whose vertices are people in some group of interest and whose edges correspond to friendship or
some other relation between individuals. Sometimes we add weights, representing strength of the relationship, to the edges. The large literature of social network theory is summarized in such papers as Johnsen [134] and sample papers can be found in the journal Social Networks and in the book by Freeman, Romney, and White [160]. Numerous structural properties of graphs and digraphs are or should be of interest for the study of social networks. In §2.2.1, I have already mentioned centrality (cf. Freeman [33]). Other concepts of interest are structural equivalence (see for example Boyd [161]) and cohesiveness (see for example Borgatti, Everett, and Shirey [162]). Here, I mention just one interesting direction of research. One of the goals of social network theory is to cluster the vertices in a social network into cliques. The clustering depends on the weights. Suppose that $r_{ij}$ is the rating by individual i of his or her friendship for individual j. One well-known and widely used clustering algorithm is the CONCOR algorithm of Breiger, Boorman, and Arabie [163] and Arabie, Boorman, and Levitt [164]. Batchelder [165] asked: What if the $r_{ij}$ are measured on a scale and a transformation of scale takes place? Specifically, what if the $r_{ij}$ are measured on a ratio scale? If the scale transformation can be performed independently on each row, Batchelder showed examples where CONCOR gives totally different clusters after transformation of scale. In this case, as I have discussed in §4.2.4, we say that the clustering is not meaningful. It would be very helpful to derive conditions under which different clustering procedures lead to meaningful conclusions and to develop new concepts of clustering which are invariant under change of scale.
4.3 Chemistry
I have already mentioned in §2.2.4 the importance of chemistry in the history of graph theory, and have in that section referred the reader to recent general references on the subject. Alexandru Balaban (this volume pp.109-126) mentioned in his talk some of the areas of intensive application of graph theory and chemistry. One includes QSAR (Quantitative Structure Activity Relationships) and QSPR (Quantitative Structure Property Relationships), which involve ways to translate the discrete nature of chemical structures into topological indices and which are important in drug design (the importance of the relation between chemical properties and structure was also mentioned in the talk by Randić). A second is computer-aided design of organic synthesis, which dissects a graph of a complicated molecule into simple pieces. A third is retrieval of structural information and chemical documentation and nomenclature (cf. our discussion in §2.2.4). A fourth is the development of molecular/structural formulas in terms of graphs, not words, to describe chemical structures. Balaban also pointed to some of the topics in graph theory whose development is important to chemistry. One of these topics is spectral graph theory, which has also arisen in our discussion of eigenvalues and pulse processes in §4.2.3. Spectral graph theory, according to Balaban, has already led to a flourishing industry for predicted structures which chemists have tried to synthesize. Important classes of graphs which need to be studied, according to Balaban, are those with an excess of negative eigenvalues over positive ones and those with an excess of positive eigenvalues over negative ones. John Kennedy also emphasized spectral graph theory in one of his talks, and pointed out that there are many problems in statistical mechanics where one could make important progress if one could calculate the characteristic polynomial of a special kind of graph.
He posed a variety of problems relating the characteristic polynomial of a graph G to the characteristic polynomials of graphs arising from a partitioning of the vertex set of G into two parts of equal cardinality. A second topic in graph theory which is of importance to chemistry, according to Balaban, is the area of distance concepts, and such concepts as the generalized center of a graph, which
I have already discussed in §2.2.4. A third important area, according to Balaban, is the area of factorization of graphs. For instance, many chemists work on decomposing graphs into 1-factors (perfect matchings) without knowing what graph-theorists have done. (It is interesting that decomposition into 1-factors also plays a role in the eigenvalue analysis of stability of weighted digraphs under pulse processes; cf. Roberts [59] [166].) The 1-factors or perfect matchings are called Kekulé structures in organic chemistry. I would like to point out some interesting recent work on this subject. A hexagonal system is a connected plane graph with no cut vertex in which every interior region is a regular hexagon of side length 1. The topological properties of hexagonal systems are extensively studied because of their important role in the chemistry of benzenoid hydrocarbons (see Cyvin and Gutman [167]). Gutman and Cyvin have developed a widely used peeling algorithm for determining if the benzenoid hydrocarbon corresponding to a given hexagonal system has a Kekulé structure. Hansen and Zheng [168] have recently shown that this algorithm, even as modified by Gutman and Cyvin, does not always work, and have given a revised peeling algorithm which does. Hansen and Zheng [169] have given a linear algorithm for solving the problem. See Zheng [170] for many references on hexagonal systems. According to Balaban, it remains an open problem to determine, for a given number h of hexagons in a polyhex, which structure has the largest number K of Kekulé structures. It also remains open to characterize the general structural features for in-plane or out-of-plane polyhexes which maximize K for given h.
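To make the quantity K concrete, the number of Kekulé structures of a small hexagonal system can be counted by brute force. The sketch below is only an illustration under stated assumptions: it is an exponential-time recursion, not the peeling algorithm of Gutman and Cyvin nor the linear algorithm of Hansen and Zheng, and the function name `count_perfect_matchings` and the vertex labelings of benzene and naphthalene are choices made here.

```python
def count_perfect_matchings(vertices, edges):
    """Count perfect matchings (Kekulé structures) by exhaustive recursion.
    Exponential time; intended only for very small graphs."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    def count(unmatched):
        if not unmatched:
            return 1                      # every vertex has been matched
        v = min(unmatched)                # v must be matched to some neighbor
        rest = unmatched - {v}
        return sum(count(rest - {u}) for u in adj[v] if u in rest)

    return count(frozenset(vertices))

# Benzene: a single hexagon (h = 1), i.e. a 6-cycle.
benzene_edges = [(i, (i + 1) % 6) for i in range(6)]
benzene_K = count_perfect_matchings(range(6), benzene_edges)

# Naphthalene: two fused hexagons (h = 2), a 10-cycle plus the shared bond 0-5.
naphthalene_edges = [(i, (i + 1) % 10) for i in range(10)] + [(0, 5)]
naphthalene_K = count_perfect_matchings(range(10), naphthalene_edges)
```

For these two molecules the counts are the familiar values K = 2 and K = 3. The open problem mentioned above asks which arrangement of h hexagons maximizes this count.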
Cages are a special type of graph mentioned as important for chemistry by Balaban. A cage is a trivalent, girth 3 graph (see Wong [171]). Cages are related to reaction graphs; in some simple cases, these are cubic and have high symmetry. Some of the problems here simply have to do with enumeration of these graphs. In his talk, Louis Quintas (this volume pp.333-339) mentioned the importance of random graph models in chemistry. I shall have more to say about this topic in §5.1. He pointed out that in chemistry and physics, a vast number of problems require degree bounds, because of bounded valence. The probability of an edge being chosen is not necessarily independent, as it is assumed to be in many random graph models. Hence, we need a different model. This is the case, for instance, when we study phase transitions from liquid to solid. There is much work to be done on the theory of random graphs under degree constraints.
4.4 Communication Networks/Information Management
One of the major factors underlying the explosive growth in the field of graph theory has been the rapid onset of the information age and the need to manage, circulate, disseminate, store, and access huge amounts of information. From its beginning the development of the theory of communication networks has been closely tied to graph theory. We can only expect the influence of the information age on graph theory to continue to stimulate its development, and for graph theory to continue to be vitally important in the understanding of communication networks and the theory of information management. These topics were discussed in the talks by Cozzens, Monma, and Lipman. I could not begin to do justice to the vast literature surrounding these areas. I will limit myself to referencing other sections of this paper which discuss ideas relevant to communication networks and information management. I have already discussed in §3.4 the importance of the theory of reliability and survivability of computer and communication networks; new notions of reliability and survivability are increasingly important in the field today. I have also mentioned, in §3.4, some of the new network topologies which are being studied, topologies such as the chordal ring topologies and the
double loop networks. I have mentioned in §2.4 the need for new measures or concepts of vulnerability. I have mentioned in §2.1.3 the theory of T-colorings. These were motivated by problems of channel assignments in communications.
4.5 Graph Theory and Pure Mathematics
With the emphasis on practical applications, I have so far neglected to emphasize that methods of pure mathematics are becoming increasingly useful to graph theorists and that graph-theoretical methods have played and will continue to play an important role in solving some problems of pure mathematics. These points were made in the talks by Bollobás and Sós. More and more, modern methods of pure mathematics are being used by graph theorists. For instance, Karoński and Bollobás, in their talks, discussed the use of rapidly mixing Markov chains, martingales and other ideas from modern probability theory in the theory of random graphs. The theory of graph enumeration, discussed in the talk by Edgar Palmer (this volume pp.341-348), has in many ways become a subset of group theory. In the other direction, Dominic Welsh, in his talk (this volume pp.159-171), said that in his opinion, the most striking application of graph theory to pure mathematics is in the resolution of (almost all of) the Tait Conjecture which gives conditions under which two knots have the same number of crossings. (See the recent paper by Schrijver [172].) I should also mention a recent application by Schramm [173] of the Andreev-Thurston Theorem (Thurston [174] [175]). This theorem states that every planar graph arises from a circle packing on the sphere whose nerve is the given graph. Schramm uses this to prove a variety of results, for example that if U is an open, connected subset of the sphere $S^2$ and d is a Riemannian metric on U, then the collection of all (proper) balls of d is packable on U.
4.6 Other Areas of Application
There is no space to go into detail on the importance of graph theory vis-a-vis other areas of application. I simply mention here such areas as: clustering (see §2.2.3), location problems (see §2.2.2), transportation (the one-way street (strongly connected orientation) problem, discussed in §§2.2.1 and 3.5, and the traffic phasing problem mentioned in §2.1.4, are just two small examples); ecology (the competition graphs discussed in §3.4 and the pulse processes discussed in §4.2.3 are just two cases of graph-theoretical ideas important for ecology); manufacturing (mentioned in the talk by Lipman); VLSI design; artificial intelligence; program verification and other problems in computer science.
5. Randomness is a Widespread Theme with Many Aspects
Randomness has been an important theme in graph theory and should continue to be so. I will organize my comments about the important role of randomness into four sections, in which I will deal with random graphs, with deterministic graph problems arising from random graphs, with the probabilistic method, and with probability and algorithms.
5.1 Random Graphs
The theory of random graphs is by now very widely known and is summarized in a very fine way in the book by Bollobás [176] and in the new journal Random Structures and Algorithms. The talks by Palmer (this volume pp.341-348) and Karoński identified a number of major trends in the theory of random graphs. One of these is the idea of evolution, of how a random graph undergoes a phase transition when a certain parameter reaches a certain level. Joel Spencer calls this the “Big Bang” (see Spencer [177]). Examples of such results are by now well known. One example, which has its origins in the work of Erdős and Rényi [178] [179] and is worked out in greater detail in Bollobás [176] [180], is the following. Suppose a graph G(n, N) has n vertices and N edges and is chosen randomly among all such graphs. If $N = \lfloor cn \rfloor$, for 0 < c < 1/2, then in almost every G(n, N), the order of a largest component is of order log n. However, if $N = \lfloor cn \rfloor$ and c > 1/2, then there is a phase transition and we get almost everywhere a giant component of size εn, where ε > 0 depends only on c. A second major trend in the theory of random graphs is the emphasis on limit theorems. Here, one is interested in asymptotic distributions, for instance of the number of triangles, cliques of certain kinds, or of other structures. To illustrate this idea, consider a different model for a random graph, the model in which we pick n vertices and each possible edge is present with probability p, independently. We get a graph G(n, p). Suppose that $X_n$ is the number of triangles in this random graph. It is straightforward that the expectation is given by
$$E(X_n) = \binom{n}{3} p^3;$$
a similar result holds for the number of cliques of size r (see p.252 in [176]). Karoński noted that one can show that if p goes to 0 relative to n, then $E(X_n)$ approaches 0 as n gets large; if p grows slowly relative to n, then $E(X_n)$ approaches a constant as n gets large; and if p grows quickly relative to n, then $E(X_n)$ approaches infinity as n gets large. Still a third major trend in the theory of random graphs is the study of Ramsey properties. On the one hand, random graph tools are used in proofs of Ramsey theorems for deterministic graphs. Many examples of such proofs are given in Chapter XII of Bollobás [176]. On the other hand, one studies in their own right Ramsey properties of random graphs. This is an important new direction. A sample paper on this subject is that of Łuczak, Ruciński, and Voigt [181]. Karoński and Quintas discussed future needs in the theory of random graphs. One of these is the need to develop revised models of randomness. For instance, according to Karoński, we need to develop the theory of evolution/phase transition for random n-cubes, something which is important with respect to computer architectures, and for random hypergraphs. According to Quintas (this volume pp.333-339), we also need to develop evolution theorems, limit theorems, and other results for random graphs whose degrees are restricted, as I pointed out in §4.3. Another area of present and potential future import is the theory of random interval graphs. I have already mentioned interval graphs and their significance in §§2.3 and 4.1. Problems of genetics and ecology have led to an interest in random interval graphs. It has been known for a long time (Cohen, Komlós, and Mueller [182]) that if an arbitrary graph is chosen at random, then the probability that it is an interval graph approaches 0 as the number of vertices approaches infinity.
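Returning briefly to the triangle count in G(n, p) discussed earlier in this section, the expectation formula can be checked numerically. The following is a minimal simulation sketch; the function names, the parameter values n = 12 and p = 0.3, the fixed seed, and the number of trials are all assumptions made for illustration.

```python
import random
from itertools import combinations
from math import comb

def expected_triangles(n, p):
    """Expected number of triangles in G(n, p): C(n, 3) * p^3."""
    return comb(n, 3) * p ** 3

def count_triangles(n, p, rng):
    """Sample one graph from G(n, p) and count its triangles directly."""
    adj = [[False] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        if rng.random() < p:              # each edge present independently
            adj[i][j] = adj[j][i] = True
    return sum(
        1
        for i, j, k in combinations(range(n), 3)
        if adj[i][j] and adj[j][k] and adj[i][k]
    )

rng = random.Random(0)                    # fixed seed for repeatability
n, p, trials = 12, 0.3, 2000
average = sum(count_triangles(n, p, rng) for _ in range(trials)) / trials
exact = expected_triangles(n, p)          # C(12, 3) * 0.027 = 5.94
```

The sample average should lie close to the exact value. The three regimes noted by Karoński can be read off the same formula: for instance, with p = c/n the expectation tends to the constant c^3/6 as n grows.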
Recently, Scheinerman [183] [184], Justicz, Scheinerman, and Winkler [185], and others have developed a theory of random interval graphs in which an interval
graph is generated by randomly choosing n intervals on the line. Some sample results of Scheinerman are that almost all such graphs are Hamiltonian and that almost all such graphs have chromatic number n/2 + o(n).
5.2 Deterministic Graph Problems Arising from Random Graphs
As Ruciński mentioned in his talk (this volume pp.265-273), sometimes random graphs pose purely deterministic graph-theoretical questions. Ruciński mentioned several examples of such problems in his talk. These examples come from the study of the number of subgraphs of a given type in a random graph of the type G(n, p) defined in the previous section. I just give one such problem as an example. The density d(G) of a graph is the number of edges divided by the number of vertices, the global density m(G) is max d(H) over all subgraphs H, and a graph G is called balanced if d(G) = m(G). An old theorem of Erdős and Rényi [178] says that if G is balanced, then the probability that G(n, p) contains a copy of G approaches 0 if $n p^{d(G)}$ approaches 0 as n gets large and 1 if $n p^{d(G)}$ approaches infinity as n gets large. Bollobás [186] showed that the same result holds if we replace d(G) by m(G), and Barbour, et al. [187] gave a purely deterministic proof of Bollobás' result given that of Erdős and Rényi. We say that F is a balanced extension of G if F is a supergraph of G and m(F) = d(F) = m(G). Győri, Rothschild, and Ruciński [188] showed that every graph has a balanced extension. However, it remains open to determine the smallest number of vertices that a sparsest possible balanced supergraph of an n-vertex graph can have.
5.3 The Probabilistic Method
The probabilistic method has played an important role in graph theory. It is designed to help us prove the existence of things in a deterministic problem, without actually constructing them. The probabilistic method is summarized in the books by Erdős and Spencer [189] and Spencer [177] and was mentioned at the conference in the talks by Bollobás and Erdős. Certainly this powerful tool will continue to play a vital role in the development of graph theory.
5.4 Probability and Algorithms
It has long been known that many algorithms which can be bad in their worst cases are very good in an “average” case. This has led to increased interest in analysis of algorithms over random instances of problems. Here, the inputs are drawn from a known distribution and we seek algorithms with good average case behavior. Recent studies of the average case behavior of the simplex algorithm, though not about graph algorithms, are important examples of what I have in mind. See Borgwardt [190] and Shamir [191] [192] for surveys of this topic. For work on the average case behavior of graph algorithms, see for instance Karp, et al. [193], Kemp [194], and Steele [195]. Probabilistic ideas enter into the development of efficient algorithms in another way as well. Namely, sometimes if we allow a machine to make some random choices, we obtain an algorithm - a random algorithm - which is very effective at solving a problem. A typical example of the second kind of situation was given in the talk by Karoński. It involves the greedy matching algorithm. Here, we choose an edge uniformly at random, add it to our matching, and delete it, along with the edges that share an endpoint with it, from the remaining graph. Randomization considerably improves the size of the matching which can be produced (see Dyer and Frieze [196]). A problem of some importance is to try to eliminate the random elements of good algorithms for combinatorial problems. For instance, the best parallel algorithms for matching and depth first search tree are random: The matching algorithm uses random matrices and the
depth first search algorithm uses matching (see for example Karp, Upfal, and Wigderson [197]). It is not hard to show the existence of universal matrices which could replace the randomization, but so far no one has been able to construct them. To give another example, searching a graph in logspace is easy if you search at random (that is, via the standard random walk). The randomness seems superfluous, yet so far nobody has seen how to eliminate it, for instance by finding an adequate bit string to substitute for a truly random sequence in directing the random walk.
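The randomized greedy matching algorithm described earlier in this section can be sketched in a few lines. This is an illustrative sketch only, not the analysis of Dyer and Frieze: the function name `greedy_matching`, the four-vertex path used as input, and the comparison against a fixed (adversarial) edge order are assumptions made here.

```python
import random

def greedy_matching(edges, rng=None):
    """Greedy matching. With rng given, each step picks an edge uniformly at
    random from the remaining edges; without it, the first remaining edge is
    taken (a fixed, possibly adversarial, order)."""
    remaining = list(edges)
    matching = []
    while remaining:
        u, v = rng.choice(remaining) if rng is not None else remaining[0]
        matching.append((u, v))
        # Delete the chosen edge and every edge sharing an endpoint with it.
        remaining = [
            (a, b) for (a, b) in remaining
            if a not in (u, v) and b not in (u, v)
        ]
    return matching

# Path 0-1-2-3, listed with the middle edge first (a bad order for greedy).
path = [(1, 2), (0, 1), (2, 3)]

fixed_size = len(greedy_matching(path))   # middle edge first blocks both others

rng = random.Random(1)                    # fixed seed for repeatability
trials = 3000
average_random_size = sum(
    len(greedy_matching(path, rng)) for _ in range(trials)
) / trials
```

On this path the fixed order yields a matching of size 1, while the random choice picks an end edge with probability 2/3 and then reaches the maximum size 2, so the expected size is 5/3.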
As discussed in §3.4, there is considerable current interest in approximation algorithms. The expected algorithmic behavior of random approximation algorithms has been attracting considerable attention. A typical question here (mentioned by Karp and Steele [198]) is the following: If costs are uniform on [0,1], can one give a traveling salesman problem algorithm with polynomial expected running time? See Frieze [199] for some progress on this problem.
I have mentioned parallel algorithms above and in more detail in §3.5. Karoński in his talk emphasized that parallel algorithms will be important in the future of random graph theory.
6. Some of the BIG OLD Problems Remain
As Sós pointed out, some fascinating and difficult problems have greatly influenced the development of the field of graph theory. Researchers working on them have developed new theories and new concepts which have been cornerstones of the subject. We can certainly expect continued interest in these old problems, because of their importance and because they are so interesting. When I think of the big, old problems of graph theory, I mean such problems as the following:
• Is P = NP?
• The Reconstruction Problem - Schonheim (this volume pp.59-69) mentioned interesting variations in his talk, by speaking of the number deck, the edge number deck, and the total number deck; the number deck, for instance, gives the multiset $\{p_1, p_2, \ldots, p_n\}$, where $p_i$ is the multiset of the orders of the connected components of $G \setminus v_i$ and $V(G) = \{v_1, v_2, \ldots, v_n\}$; see for instance Gavril and Schonheim [200], Krasikov, Ellingham, and Myrvold [201], and Krasikov and Schonheim [202].
• Finding Hamiltonian Cycles - and the Traveling Salesman Problem.
• The Isomorphism Problem.
• The Berge Conjecture - this was mentioned in §3.3.
• The Four Color Problem - which is still being studied; in his talk, Daniel Cohen posed the provocative question “why four?”.
• The Shannon Capacity - this was mentioned in §3.3.
• Graceful Numbering.
• Hadwiger's Conjecture - this was mentioned by Bollobás.
7. Really New Concepts Need to be Developed
It almost goes without saying that a subject will grow stale if no new concepts are developed. On the other hand, it is very difficult to predict in advance what new ideas will become important in the future. I choose here four of my favorite ideas from talks presented at this conference.
Sós talked about avoidance theorems (or unavoidance theorems), theorems which say that every graph of a certain kind has a structure of a certain kind. She mentioned these as an area for important new developments in the future. An example of such a theorem is the theorem
of Rédei [203] that every tournament has a Hamiltonian path. One can ask what other unavoidable subgraphs there are in tournaments. Another example of an avoidance theorem is the fundamental theorem in extremal graph theory due to Turán [204], which says in its simplest case that every graph of n vertices and more than $\lfloor n^2/4 \rfloor$ edges contains a copy of $K_3$. Still another fundamental avoidance theorem is due to Erdős and Stone [205] and was mentioned by Bollobás. This theorem concerns the graph $K_r(t)$, the complete r-partite graph with t vertices in each class. For every ε > 0 and integers r > 1, t ≥ 1, the theorem says that there is an $n_0$ so that if $n \ge n_0$ and G is a graph of n vertices and at least $\left(\frac{r-2}{2(r-1)} + \varepsilon\right) n^2$ edges, then G contains $K_r(t)$. Speaking more generally, one can ask about the maximum number f(n, L) of edges in a graph of n vertices which avoids a subgraph L. Erdős and Simonovits [206] show that
$$\lim_{n \to \infty} \frac{f(n, L)}{\binom{n}{2}} = 1 - \frac{1}{\chi(L) - 1},$$
where χ(L) is the chromatic number of L.
I agree with Sós that the discovery of similar theorems might form an important theme in the coming years.
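The simplest case of Turán's theorem quoted above can be verified exhaustively for very small n. The sketch below is a brute-force check under stated assumptions: the function names are invented here, and the check only covers n up to 6 because the number of graphs grows very quickly. It is enough to test graphs with exactly ⌊n²/4⌋ + 1 edges, since adding further edges cannot destroy a triangle.

```python
from itertools import combinations

def has_triangle(n, edge_set):
    """True if the graph on vertices 0..n-1 with the given edges contains K3.
    Edges are stored as pairs (a, b) with a < b."""
    return any(
        {(a, b), (a, c), (b, c)} <= edge_set
        for a, b, c in combinations(range(n), 3)
    )

def turan_bound_holds(n):
    """Check that every n-vertex graph with floor(n^2/4) + 1 edges has a triangle."""
    all_edges = list(combinations(range(n), 2))
    m = n * n // 4 + 1
    if m > len(all_edges):
        return True                        # no such graph exists; vacuously true
    return all(
        has_triangle(n, set(chosen))
        for chosen in combinations(all_edges, m)
    )

def balanced_bipartite(n):
    """Complete bipartite graph with parts of sizes floor(n/2) and ceil(n/2)."""
    left, right = range(n // 2), range(n // 2, n)
    return {(a, b) for a in left for b in right}

small_cases_ok = all(turan_bound_holds(n) for n in range(3, 7))
extremal = balanced_bipartite(6)           # 9 = floor(36/4) edges, no triangle
```

The balanced complete bipartite graph shows that the bound cannot be lowered: it has exactly ⌊n²/4⌋ edges and is triangle-free.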
A second idea of Sós' also bears repeating. While not as easy to make explicit as the idea of an avoidance theorem, this is the idea of a metatheorem which would cover different fields of graph theory. I think that our field is ready for such theorems. However, it is hard to say more specifically what I have in mind here. A third idea comes from the talk by Ronald Read. He said that methods are needed for dealing with LARGE graphs. I think he is quite right. The fourth idea came from my own presentation. One hesitates to predict that one's own ideas will become interesting to others. Rather than make a prediction, let me simply say that I wish the idea of meaningfulness, which I have described briefly in §4.2.4, would become an important theme in future graph-theoretical research. This is an idea which has been totally neglected heretofore. Yet, it is closely related to the ideas of invariance which have played such a central role in the history of mathematics, for instance in the Erlanger Program of Felix Klein (cf. Narens [207]), in geometry, in physics, and so on.
8. Graph Theory is a Wonderful Vehicle for Education in the Mathematical Sciences and its Educational Role Will Influence its Scientific Direction
Because problems of graph theory are so simple to describe, because graph theory is so closely tied to real-world applications, because graph theoretical problems are readily attacked by computer and graph-theoretical concepts are readily illustrated on the computer, and because it is not necessary to learn a great deal of sophisticated mathematics to work on graph-theoretical questions, graph theory is a natural subject to be introduced to students of all ages and all abilities. This was the point made in the talk by Joseph Malkevitch. Graph theory can be taught from kindergarten on up through postdoctoral education, and is a natural topic for public education about the value of mathematics. We can expect that more and more graph theory will enter the schools, especially since the recent Standards of the National Council of Teachers of Mathematics have encouraged the inclusion of discrete mathematics, specifically graph theory, at various places in the curriculum (see National Council of Teachers of Mathematics [208]). Nathaniel Dean and Clyde Monma, in their talks, made the point that computers can and
will play an important role in the teaching of graph theory. In her talk, Phyllis Chinn (this volume pp.375-384) pointed out that new and innovative methods of instruction can be used with graph theory. She talked about how graph theory is a natural subject for the discovery method and for math labs. It is safe to say that graph theory will be used increasingly in education at all levels. It is also reasonable to speculate that this development will have some impact on the development of the field as a science. Computer methods developed for teaching and discovery should inevitably be useful for researchers as well. New ways of presenting ideas should help to organize them for all graph theorists. We will also presumably lure bright young minds into the field, which will have its inevitable happy implications for the future of graph theory!
9. Conclusion
The field of graph theory has grown tremendously since its ‘invention’ by Euler in the 18th century, and it has grown explosively in the last 20 years. The future of the field looks very bright. It will be very interesting to look back on the next 20 years, as new variations of old concepts, and really new concepts, are developed, as new approaches to algorithms and to randomness take hold, as applications (new and old) stimulate the development of new graph-theoretical methods, as perhaps some of the BIG OLD problems are solved, and as a new generation of graph theorists, some of whom got interested in graph theory in kindergarten, come to be the leaders of the field!
Acknowledgements
The author gratefully acknowledges the support of grants AFOSR-89-0512 and AFOSR-90-0008 to Rutgers University. He also gratefully acknowledges the help of the following people, who provided references and other comments: Denise Sakai, Jean-Claude Bermond, Ravi Boppana, Midge Cozzens, Guoli Ding, Pierre Hansen, Pavol Hell, Jeff Kahn, Richard Karp, John W. Kennedy, Ed Scheinerman, Ron Shamir, Joel Spencer, Chi Wang, and Xianghuan Yuan.
References
[1] J.A. Bondy and U.S.R. Murty; Graph Theory with Applications, Elsevier, New York (1977).
[2] F.S. Roberts; Discrete Mathematical Models, with Applications to Social, Biological, and Environmental Problems, Prentice-Hall, Englewood Cliffs, New Jersey (1976).
[3] F.S. Roberts; Applied Combinatorics, Prentice-Hall, Englewood Cliffs, NJ (1984).
[4] F.S. Roberts; From garbage to rainbows: generalizations of graph coloring and their applications, in Graph Theory, Combinatorics, and Applications, Vol. 2, Y. Alavi, G. Chartrand, O.R. Oellermann, and A.J. Schwenk (editors), Wiley, New York, 1031-1052 (1991).
[5] J.A. Andrews and M.S. Jacobson; On a generalization of chromatic number, Congressus Numerantium, 47, 33-43 (1985).
[6] M. Frick and M.A. Henning; Various results on defective colorings of graphs, Research Report 93/90 (3), Department of Mathematics, Applied Mathematics and Astronomy, University of South Africa, Pretoria (1990).
[7] P. Erdős, A. Rubin and H. Taylor; Choosability in graphs, Congressus Numerantium, 26, 125-157 (1979).
[8] J.I. Brown, D. Kelly, J. Schonheim, and R.E. Woodrow; Graph coloring satisfying restraints, Discrete Math., 80, 123-143 (1990).
[9] B. Bollobás and A.J. Harris; List-colourings of graphs, Graphs and Combinatorics, 1, 115-127 (1985).
[10] N.V.R. Mahadev and F.S. Roberts; Amenable colorings, Technical Report 92-26, DIMACS, Rutgers University, New Brunswick, New Jersey (1992).
[11] N.V.R. Mahadev, F.S. Roberts and P. Santhanakrishnan; 3-Choosable complete bipartite graphs, Technical Report 91-62, DIMACS, Rutgers University, New Brunswick, New Jersey (1991).
[12] B. Tesman; T-Colorings, List T-Colorings, and Set T-Colorings of Graphs, Ph.D. Thesis, Department of Mathematics, Rutgers University, New Brunswick, New Jersey (1989).
[13] W.K. Hale; Frequency assignment: theory and applications, Proc. IEEE, 68, 1497-1514 (1980).
[14] M.B. Cozzens and F.S. Roberts; T-Colorings of graphs and the channel assignment problem, Congressus Numerantium, 35, 191-208 (1982).
[15] I. Bonias; T-Colorings of Complete Graphs, Ph.D. Thesis, Department of Mathematics, Northeastern University, Boston, Massachusetts (1991).
[16] D.D. Liu; Graph Homomorphisms and the Channel Assignment Problem, Ph.D. Thesis, Department of Mathematics, University of South Carolina, Columbia, South Carolina (1991).
[17] F.S. Roberts; T-Colorings of graphs: Recent results and open problems, Discrete Math., 93, 229-245 (1991).
[18] J. Bang-Jensen and P. Hell; On the effect of two cycles on the complexity of colouring, Discrete Applied Math., 26, 1-23 (1990).
[19] P. Hell and J. Nešetřil; On the complexity of H-coloring, J. Comb. Th., B48, 92-110 (1990).
[20] R. Häggkvist, P. Hell, D.J. Miller and V. Neumann-Lara; On multiplicative graphs and the product conjecture, Combinatorica, 8, 7141 (1988).
[21] R.J. Opsut and F.S. Roberts; On the fleet maintenance, mobile radio frequency, task assignment, and traffic phasing problems, in The Theory and Applications of Graphs, G. Chartrand, Y. Alavi, D.L. Goldsmith, L. Lesniak-Foster, and D.R. Lick (editors), Wiley, New York, 479-492 (1981).
[22] R.J. Opsut and F.S. Roberts; I-colorings, I-phasings, and I-intersection assignments for graphs, and their applications, Networks, 13, 327-345 (1983).
[23] R.J. Opsut and F.S. Roberts; Optimal I-intersection assignments for graphs: A linear programming approach, Networks, 13, 317-326 (1983).
[24] A. Raychaudhuri; Intersection Assignments, T-Coloring, and Powers of Graphs, Ph.D. Thesis, Department of Mathematics, Rutgers University, New Brunswick, New Jersey (1985).
[25] A. Raychaudhuri; Optimal multiple interval assignments in frequency assignment and traffic phasing, Discrete Appl. Math. (to appear).
[26] W.T. Trotter and F. Harary; On double and multiple interval graphs, J. Graph Theory, 3, 205-211 (1979).
[27] E.N. Gilbert; Unpublished Technical Memorandum, Bell Telephone Laboratories, Murray Hill, New Jersey (1972).
[28] S. Stahl; n-tuple colorings and associated graphs, J. Comb. Th., B20, 185-203 (1976).
[29] L. Lovász; Kneser's conjecture, chromatic number and homotopy, J. Comb. Th., A25, 319-324 (1978).
[30] P. Frankl and Z. Füredi; Extremal problems concerning Kneser graphs, J. Comb. Th., B40, 270-284 (1986).
[31] F.S. Roberts; Set, T-, and list colorings, in The First Workshop on Combinatorial Optimization in Science and Technology (COST), E. Boros and P.L. Hammer (editors), DIMACS/RUTCOR Tech. Rep. 3-91, Rutgers University, New Brunswick, New Jersey, 290-297 (1991).
[32] F. Buckley and F. Harary; Distance in Graphs, Addison-Wesley, Reading, Massachusetts (1990).
[33] L.C. Freeman; Centrality in social networks: I. Conceptual clarification, Social Networks, 1, 215-239 (1979).
[34] V. Chvátal and C. Thomassen; Distances in orientations of graphs, J. Comb. Theory, B24, 61-75 (1978).
[35] F.S. Roberts and Y. Xu; On the optimal orientations of city street graphs I: Large grids, SIAM J. Discrete Math., 1, 199-222 (1988).
[36] F.S. Roberts and Y. Xu; On the optimal orientations of city street graphs II: Two East-West avenues or North-South streets, Networks, 19, 221-233 (1989).
[37] F.S. Roberts and Y. Xu; On the optimal orientations of city street graphs III: Three East-West avenues or North-South streets, Networks, 22, 109-143 (1992).
[38] F.S. Roberts and Y. Xu; On the optimal orientations of city street graphs IV: Four East-West avenues or North-South streets, Discrete Appl. Math. (to appear).
[39] J.E. McCanna; Orientations of the n-cube with minimum diameter, Discrete Math., 68, 309-310 (1988).
[40] P. Hansen, M. Labbé, D. Peeters, J.-F. Thisse and J.V. Henderson; Systems of cities and facility location, in Fundamentals of Pure and Applied Economics, 22, 1-70 (1987).
38
F.S. Roberts
P. Hansen and M. LabM; Algorithms for voting and competitive location on a network, Transportation Science, 22,278-288 (1988). P. Hansen, J.-F. Thisse and R.E. Wendell; Location by competitive and voting processes, in Discrete Location Theory, R.L. Francis and P. Mirchandani (editors),Wiley, New York. 479-501 (1990). R. Holman; An axiomatic approach to location on networks, Math. ofoper. Res., 15,553-563 (1990). R.V. Vohra; An axiomatic characterization of some locations in trees, mimeographed, Faculty of Management Sciences,Ohio State University, Columbus,Ohio (1990). D.P. Foster and R.V. Vohra; An axiomatic characterizationof a class of locations in trees, Workmg Paper Series WPS 90-14, College of Business Administration,Ohio State University, Columbus, Ohio (1990). P. Hansen and F.S. Ro'oerts; An impossibility result in axiomatic location theory, Tech. Rep. 92-2, DIMACS. Rutgers University, New Brunswick,New Jersey (1992). [47l E. Godehardt; Graphs m Structural Models (2nd edition), Friedr. Vieweg 62 Sohn, Braunschweig, Germany (1990). 1481 B.G. Mirkin and I.B. Muchnik; Clustering andmultidimensionalscaling in Russia (196&1990):Review, mimeographed,Department of Informatics and Applied Statistics, Central Economics-Mathematics Institute, Krasikova, 32, Moscow, 117418 (1991). A. Gu&oche P. Hansen and B. Jaumard; Efficient algorithms for divisive hierarchical clustering with the diameter criterion, J . ofClussification,8.5-30 (1991). P. Hansen and B. Jaumard; Minimum sum of diameters clustering, J . of Classification, 4,215-226 (1987). P. Hansen. 0.Frank and B. Jaumard; Maximum sum of splits clustering, J. of Classification, 6, 177-193 (1989). D. Bonchev and D.H. Rouvray (editors); Chemical Graph Theory, Vol. 1 (Introduction and Fundamentals), Gordon and Breach, New York (1990). D. Bonchev and D.H. Rouvray (editors); Chemical Graph Theory, Vol. 2 (Reactivity and Kinetics),Gordon and Breach, New York (1991). A.T. Balaban, J.W. Kennedy and L.V. 
Quintas; The number of alkanes having n carbons and a longest chain of length d: An application of a theorem of Polyl, J. Chemical Education, 65,303-3 13 (1988). D. Bonchev, A.T. Balaban and M. RandiC; The graph center concept for polycyclic graphs, Int. J. Quantum Chem.. 19.61-82 (1981). D.Z. Du and F. Hwang; A proof of Gilbert-Pollak's conjectureon the Steiner ratio; Afgorifhmica,7,121136 (1992). M.R. Garey, R.L. Graham and D.S. Johnson; The complexity of computing Steiner minimal trees, SIAM J. Appl. Math., 32,835859 (1977). F.S. Roberts; InMference graphs, in Proof Techniques in Graph Theory, F. Harary (editor), Academic Press, New York, 139-146 (1%9). F.S. Roberts; Graph Theory and i 8 Applications lo Problems of Society. CBMS-NSFMonograph No. 29, SIAM(1978). D. Wang; The Channel Assignment Problem und Closed Neighborhood Containment Graphs, Ph.D. Thesis, Department of Mathematics, NortheasternUniversity, Boston, Massachusetts ( 1985). E. Jarrett; Trunsformationsof Graphs and Digraph, Ph.D. Thesis, Department of Mathematics, Western Michigan University, Kalamaz~o,Michigan (1991). G. chartrand. W.D. Goddard, M. Henning, L. Lesniak,H.C. Swart and C.E. Wall; Which graphs are distance graphs?,Ars Combinatoriu, 29A, 225-232 (1990). G. Chartrand, K.S. Novotny, G.L. Jones and O.R. OeUermann; Subgraph distance in graphs, J. ofCombinatorics, Information & System Science, 16.67-85 (1991). 1641 J.G. Kemeny and J.L. Snell; Mathematical Models in the Social Sciences, Blaisdell, New York (1%2). (Reprinted by MIT Press, Cambridge, Massachusetts(1972).) K. Bogart; Preference structures I: Distances between transitive preference relations, J . Math. Sociology, 3 , 4 9 4 7 (1973). K. Bogart; Preference structures 11, SIAM J . Appl. Math., 29,254-262 (1975). K. Bogart and J.R. Weeks; Consensus signed digraphs, S I A M J . Appl. Math., 36.1-14 (1979). J.P. Barthklemy and F.R. McMorris; The median procedure for n-trees. J . Classification, 3, 329-334 (1986).
New directions in graph theory (emphasizingapplications)
39
W.H.E. Day and F.R. McMoms; A formalizationof consensus index methods, Bull. Math. Biol., 47,215229 (1985). POI T. Margush and F.R. McMoms; Consensus n-trees, Bull. Math. Biol., 43,239-244 (1981). PI1 F.R. McMoms and D. Neumann; Consensus functions defined on trees, Math. Soc. Sci., 4, 131-136 (1983). P21 P.C. Fishbum; Interval Graphs and Interval Orders, Wiley, New York (1985). l731 M.C. Golumbic; Algorithmic Graph Theory and Perfect Graphs, Academic Press. New York (1980). I741 F.S. Roberts; Seven fundamental ideas in the applicationof combinatoricsand graph theory in the biological and social sciences, in Applications of Combinatorics and Graph Theory in the Biological and Social Sciences, F.S. Roberts (editor), Springer-Verlag,Mew York, 1-37 (1989). M.B. Cozzens and F.S. Roberts; On dmensiond properties of graphs, Graphs and Combinatorics, 5.2946 (1989). I761 T.F. Havel, I.D. Kuntz and G.M. Crippen; The combinatorid distance geometry approach to the calculation of molecular conformation I. A new approach to an old problem, J . Theor. Biol., 104,359-381 (1983). m T.F. Havel, I.D. Kuntz, G.M. Crippen and J.M. Blaney;The combinatorialdistance geometry approach to the calculation of molecular conformation 11. Sample problems and computational statistics, J . Theor. Biol., 104,383-400 (1983). P.C. Fishbum; On the sphericity and cubicity of graphs, J. Comb. Theory, B35.309-318 (1983). H. Maehara; A digraph represented by a family of boxes or spheres, J. Graph Theory, 8,431400 (19%). H. Maehara; Space graphs and sphericity; Discrete Appl. Math., 7.55-64 (1984). H.Maehara; Sphericity exceeds cubicity for almost all complete bipartite graphs, J . Comb. Theory, B40, 231-235(1986). K.S. Bagga, L.W. Beineke, M.J. Lipman and R.E. Pippert; A classification scheme for vulnerability and reliability parameters of a graph, Mathematical and Computer Modelling (to appear). 1831 C.A.Barefoot, R. Entringer and H.Swart; Vulnerability in graphs - a comparative survey, J . Comb. Math. 
Comb. Comput., 1 , 1S22 (1987). L.A. McGeoch and D.D. Sleator (editors); On-Line Algorithms, DIMACS Series, Vol. 7, American Mathematical Society and Association for Computing Machinery,Providence, Rhode Island (1992). W.T. Trotter; Interval graphs, interval orders and their generalizations,in Applications of Discrete Mathematics, R.D. Ringeisen and F.S. Roberts (editors), SIAM, 45-58 (1988). H.A.Kierstead and W.T. Trotter; An extremal problem in recursive combinatorics, Congressus Nwnerantium, 33, 143-153 (1981). A. G y M h and J. Lehel; On-line and first lit colorings of graphs, J. of Graph Theory. 12.217-227 (1988). N. Robertson and P.D. Seymour; Graph minors. 11. Algorithmic aspects of tree-width, 1. of Algorithms, 7, 309-322 (1986). S. Amborg, D. Comeil and A. Proskurowski; Complexity of finding embedding in a k-tree, SIAMJ. Discrete Math., 8,277-284 (1987). N. Robertson and P.D. Seymour; Graph minors. XIII. The disjoint path problem, J. Comb. Theory, B (to appear). E. Beltrami and L. Bodin; Networks and vehicle routing for municipal waste collection. Networks, 4.6594(1973). A.C. Tucker; Perfect graphs and an applicationto optimizingmunicipal services, SIAMRev., 15,585-590 (1973). L. Lovisz; Normal hypergraphs and the perfect graph conjecture, Discrete Math., 2,253-267 (1972). L. Lovisz; A characterization of perfect graphs, J. Comb. Th., 13,9598 (1972). C. Berge and V. Chvital (editors);Topics on Perfecl Graphs, Discrete Math., 21, North-Holland,Amsterdam (1984). M. Grotschel, L. Lovisz and A. Schrijver;The ellipsoid method and its consequences in combinatorial optimization,Combimtorica, 1, 169-197 (1981). M. Grotschel, L. Lovisz and A. Schrijver; Relaxations of vertex packing, J . Comb. Th., B40,330-343 (1%). 1981 M. Grotschel, L. Lovhsz and A. Schrijver; Geometric Algorithms and Combinatorial Optimization, Springer, Berlin (1988). 1691
F.S. Roberts
40
1991 [loo]
[loll [lo21
[lo31
[lo41 [lo3 [lo61 [lo7
[lo81 [lo91 [110]
[lll] [112] [113] (1141
1113 11161 [117 [118] [119]
L. Lovslsz and A. Schrijver; Matrix cones, projection representations,and stable set polyhedra, in Polyhedral Combinatorics, W. Cook and P.D. Seymour (editors), DIMACS Series, 1, American Mathematical Society and Association for Computing Machinery,Providence, Rhode Island, 1-17 (1990). L. LovQsz;On the Shannon capacity of a graph, IEEE Trans. Inform. Theory, IT-25.1-7 (1979). L. Lovbz; An Algorithmic Theory of Numbers, Graphs. and Convexity, NSF-CBMS Monograph No. 50. SIAM, Philadelphia (1986). G. Narasimhan and R. Manber; A generalization of LovLz’s Ofunction, in Polyhedral Combinatorics, W. Cook and P.D. Seymour (editors), DIMACS Series, 1, American Mathematical Society and Association for Computing Machinery, Providence, Rhode Island, 19-27 (1991). J.R. Lundgren; Food webs, competition graphs, competition-commonenemy graphs, and niche graphs, in Applications of Combinatorics and Graph Theory in the Biological and Social Sciences, F.S. Roberts (editor), Springer-Verlag. New York, 221-243 (1989). S.-R Kim; Competition Graphs and Scientific Laws for Food Webs and Other Systems. Ph.D. Thesis, Department of Mathematics, Rutgers University, New Brunswick, New Jersey (1988). C Wang; Competition Graphs, Threshold Graphs, and Threshold Boolean Functions, Ph.D. Thesis, Rutgers Center for Operations Research, Rutgers University,New Brunswick, New Jersey (1991). K.A.S. Hefner and D.W. Hintze; Maximizing arcs in a network for a gven conflict graph, mimeographed, Naval Postgraduate School, Monterey, California (1990). K.F. Jones, J.R. Lundgren. J.S. Maybee and N.J. Pullman; Competition graphs of strongly connected and Hamiltonian digraphs, mimeographed, Department of Mathematics, University of Colorado, Denver, Colorado (1991). X.D. Hu, F.K. Hwang and W.-C.W. Li; Most reliable double loop networks in survival reliability, Networks (to appear). X.D. Hu and F.K. Hwang; Reliability of chordal rings, Networks (to appear). F.K. Hwang, C. Monma and F.S. 
Roberts (editors); Reliability of Computer and Communicalion Networks, DIMACS Series, 5, American Mathematical Society and Association for Computing Machnery, l’ruvideuce,Rhode Island (1991). M.J. Atallah; Parallel strong orientation of an undirected graph, Info. Proc. Lelters, 18.37-39 (1%). U. Vishkm On efficient parallel strong orientation, Info. Proc. Letters, 20,235-240 (1985). R.M. Karp and V. Ramachandran;Parallel algorithms for shared-memorymachines, Handbook of Theoretical Computer Science, Volume A, North-Holland,Amsterdam,871-894 (1990). F. Boesch and R. Tindell; Robbins’ theorem for mixed mnltigraphs, Amer. Math. Monthly, 87,716-719 ( 1 980). F.R.K. Chung, M.R. Garey and R.E. Tarjan; Strongly connected orientations of mixed multigraphs, Networks, 15,477-484 (1985). J.J. Bartholdi 111, C.A. Tovey and M.A. Trick; Voting schemes for which it can be difficult to tell who won the election, Social Choice and Welfare,6,157-165 (1989). Congress of the United States (Office of Technology Assessment); Mapping Our Genes (Genome Projects: How Big, How Fast?),The Johns Hopkins University Press, Baltimore, Maryland (1990). C. DeLisi; Computers in molecular biology: Current applications and emerging trends, Science, 240.4752 (1988). (3.1. Bell and T.G. Man (editors); Computers and DNA, Addison-Wesley, Reading, Massachusetts (19%
[120] M.S. Waterman (editor); Mathematical Methods for DNA Sequences, CRC Press, Boca Raton, Florida (1989). [121] S. Benzer; On the topology of the genetic fine shucture, Proc. Nut. Acad. Sci. U.S.A.,45, 1607-1620 ( 1959). 11221 S. Benzer; The fine structure of the gene, Sci. Amer.. 206,7044 (1%2). [123] G. Hutchinson; Evaluation of polymer sequence fragment data using graph theory, Bull. Math. Biophys., 31,541-562 (1969). 11241 R.W. Holley, G.A. Everett, J.T. Madison, M. Marquisee and A. Zamir; Structure of a ribonucleic acid, Science, 147, 162-1465 (1%5). [123 R. Arratia and E.S. Lander; The distribution of clusters in random graphs, Adv. Appl. Math., 11,3643 (1990).
New directions in graph theory (emphasizingapplications)
41
[126] W.H.E. Day and F.R. McMoms; Critical comparison of consensus methods for molecular sequences, Nucleic Acids Research, 20, 1093-1099 (1992). [127] W.H.E. Day; Alignment and consensus (an annotated bibliography), mimeographed, Department of Computer Science, Memorial University of Newfoundland,St. John's, Newfoundland (1992). 11281 M.S. Waterman; Consensus patterns in sequences, in Mathematical Methodr for DNA Sequences, M.S. Waterman (editor), CRC Press, Boca Raton, Florida. 93-1 15 (1989). [129] B. Mirkin and F.S. Roberts; Consensus functions and patterns in molecular sequences, mimeographed, Department of Mathematics,Rutgers University, New Brunswick,New Jersey ( 1992). [130] F. Harary, R.Z. Norman and D. Cartwright; Structural Modets: An Introduction to the Theory of Directed Graphs, Wiley, New York (1965). [131] F.S. Roberts; Graph theory and the social sciences, in Applications of Graph Theory, R. Wilson and L. Beineke (editors), Academic Press, London, 255-291 (1979). [132] F. Harary; On the notion of balance of a signed graph, Michigan Math. J., 2, 143-146 (1954). [133] D. Cartwright and F. Harary; Structural balance: A generalization of Heider's theory, Psych. Rev., 63, 277-293 (1956). [134] E.C. Johnsen; The micro-macro connection: exact structure and process, in Applications of Combinatorics and Graph Theory in the Biological and Social Sciences, F.S. Roberts (editor), Springer-Verlag.New York, 169-201 (1989). [135] C.H. Hubbell, E.C. Johnsen and M. Marcus; structural balance in group networks,in Handbook of Social Science Methods. B. Anderson and R.B. Smith (editors), Irvington Publishers, distributed by Halsted Press, New York (1978). [136] H.J. Greenberg,J.R. Lundgren and J.S. Maybee; Inverting signed graphs, SZAMJ. Alg. & Discrete Meth., 5,216223 (1984). [ I 3 7 M. Morishima; On the laws of change of the price system in an economy which contains complementary commodities, Osaka Economic Papers, 1,101-113 (1952). 11381 P. Hansen and B. 
Simeone; Unimodular functions, Discrete Appl. Math., 14,269-281 (1986). [139] A.M. Colbonm and J.W. Kennedy; Graphs for groupthink in social task groups, Graph Theory Notes of New YorkXX, New York Academy of Sciences, 34-39 (1991). [140] P. Samuelson; Foundations of Economic Analysis, Harvard University Press, Cambridge, Massachusetts (1947). [141] V. Klee; Sign-patterns and stability, in Applications of Combinatorics and Graph Theory in the Biological and Social Sciences, F.S. Roberts (editor), Springer-Verlag.New York, 203-219 (1989). [142] J.S. Maybee; Qualitatively stable matrices and convergent matrices, in Applications of Combinatorics and Graph Theory in the Biological and Social Sciences, F.S. Roberts (editor), Springer-Verlag, New York, 245258 (1989). [143] C. Jeffries, V. Klee and P. van den Driessche; When is a matrix sign stable?, Canad. J . Math., 2 9 , 3 1 5 326 (19n). [144] H.J. Greenberg and J.S. Maybee (editors); Computer-Assisted Analysis and Model Simplification, Academic Press, New York, (1981). [145] F.S. Roberts; Signed digraphs and the growing demand for energy, Envir. & Planning, 3 , 3 9 5 4 1 0 (1971). I1461 S.M. Tanny and M. Zuker; The sensitivity of eigenvalues under elementary matrix perturbations, Lin. Alg. & Appl., 86, 123-143 (1987). [ I 4 7 S.M. Tanny and M. Zuker; A further look at stability notions for pulse processes on weighted digraphs, mimeographed, Department of Mathematics,University of Toronto, Toronto, Ontario (1988). [la]F.S. Roberts; Building and analyzing an energy demand signed digraph, Envir. & Planning, 5,199-221 (1973). [I491 F.S. Roberts; Weighted digraph models for the assessment of energy use and air pollution in transportation systems, Envir. & Planning, 7,7(M-724 (1975). [150] F.S. Roberts; Measurement Theory, with Applications to Decisionmaking. Utility, and the Social Sciences, Addison Wesley, Reading, Massachusetts (1979). 11511 R.D. Luce, D.H. Krantz, P. Suppes and A. 
Tversky; Foundations of Measurement, Volume 111, Academic Press, San Diego (1990). [152] F.S. Roberts; Applications of the theory of meaningfulness to psychology, J . Math. Psychol., 29.31 1-332 (1985).
42
F.S. Robem
[I531 F.S. Roberts; Limitations on conclusions using scales of measurement, in Opprations Research and Public Systems, A. Barnett, S. M. Pollock, and M.H. Rothkopf (editors), Elsevier (to appear). [154] F.S. Roberts; Meaningless statements, matching experiments, and colored digraphs (applications of graphs and combinatorics to the theory of measurement), in Applications o j Combinatorics and Graph Theory in the Biological and Social Sciences, F.S. Roberts (editor), Springer-Verlag.New York, 277-294 (1989). [155] F.S. Roberts; Meaningfulness of conclusions from combinatorialoptimization, Discrete Appl. Math., 29, 221-241 (1990). [156] F.S. Roberts; Structural modeling and measurement theory, Tech. Forecasting and Social Change, 14, 353-365 (1979). [157] M.B. Cozzens and F.S. Roberts; Greedy algorithmsfor T-colorings of complete graphs and the meaningfulness of conclusionsabout them, J . Comb. Inj & Syst. Sci., 16, 1 6 2 9 (1992). [158] J. Acz61, F.S. Roberts and Z. Rosenbaum; On scientific laws without dimensional constants, J. Math. Anal. & Appl., 119,389416 (1986). [159] J. Acz6l and F.S. Roberts; On the possible merging functions, Math. SOC.Sci., 17,205-243 (1989). [I601 L.C. Freeman, A.K. Romney and D.R. White (editors); Methods in Social Networks Analysis, George Mason University Press, Fairfax. Viginia ( 1988). [161] J.P. Boyd; Social Semigroups: A Unified Theory of Scaling and Blockmodelling as Applied to Social Networks, George Mason University Press, Fairfax, Virginia (1991). [162] S.P.Borgatti, M.G. Everett and P.R. Shirey;LS sets, lambda sets, and other cohesive subsets, Social Nefworks, 12,337-357 (1990). [163] R.L. Breiger, S.A. Boorman and P. Arabie; An algorithm for clustering relational data, with applications to social network analysis and comparison with multidimensional scaling, J . M a h . Psychol., 12.32%383 (1975). I1641 P. Arabie, S.A. Boorman and P.R. Levitt; Constructing blockmodels: How and why, J. Math. Psychol., 17.21-63 (1978). 
[165] W .H. Batchelder; Inferring meaningful global network properties from individual actors‘ measurement scales, in Research Methods in Social Nefwork Analysis, L.C. Freeman, D.R. White, and A.K. Romney (editors), George Mason University Press, Fairfax, Virginia, 88-134 (1989). [166] F.S. Roberts; Structure and stability in weighted digraph models, Annals New York Acad. Sci., 321,6477 (1979). [167] S.J. Cyvin and 1. Gutman; KekulC structures in benzenoid hydrocarbons,Lecture Notes in Chemistry, 46, Springer, Berlin (1988). [I681 P. Hansen and M. Zheng; A revised peeling algorithmfor determiningif a hexagonal system is K e k u l h , J . ojMolecular Structure, 235.293-309 (1991). [169] P. Hansen and M. Zheng; A linear algorithm for peIfect matching in hexagonal systems, Discrete Math., (to appear). [I701 M. Zheng; Perfect Matchings in Benzenoid Systems, Ph.D. Thesis, Rutgers Center for Operations Research, Rutgers University, New Brunswick, New Jersey (1992). [171] P.K. Wong; Cages - a survey, J . Graph Theory, 6, 1-22 (1982). [I721 A. Schrijver;Tait’s Flyping Conjecture for well-connected links, J. Comb. Th. B. (to appear). [173] 0. Schramm; Existence and uniqueness of packings with specified combinatorics, Israel J . Math., 73, 321-341 (1991). 11241 W.P. Thurston; The geometry and topology of 3-manifolds, Princeton University Notes, Department of Mathematics, Princeton University, Princeton, New Jersey (undated). [175] W.P. Thurston; The finite Riemann mapping theorem, invited talk at the International Symposium in Celebration of the Proof of the Bieberbach Conjecture,F’urdue University (1985). [176] B. BollobL; Random G r a p h , AademicPress, New York (1985). [177] J. Spencer; Ten Lecfures on the Probabilistic Method, CBMS-NSF Monograph, SIAM Publications (1987).
[178] P. Erdb and A. R6nyi; On the evolution of random graphs, Publ. Math. Inst. Hungar. Acad. Sci., 5, 1761 (1960). [I791 P. ErdBs and A. RCnyi; On the evolution of random graphs, Bull. Inst. Int. Statist. Tokyo, 38,343-347 (l%l). [180] B. BollobL; The evolution of random graphs, Trans. Amer. Math. SOC.,286,257-274 (1984).
New directions in graph theory (emphasizingapplications)
43
[MI] R. tuczak. A. Ruariski and B. Voigt; Ramsey properties of random graphs,J . Com b. Th.,€56 (to appear). [I821 J.E. Cohen, J. Komlb and T. Mueller; The probability of an interval graph, and why it matters, in Proc. Symp. on Relations between Combinatorics and other Parts of Mathematics, D.K. Ray-Chaudhuri (editor), Amer. Math. Soc.,Providence, Rhode Island (1979). [183] E.R. Scheinerman; Random interval graphs, Combinatorica, 8,357-371 (1988). [184] E.R. Scheinerman; An evolution of interval graphs, Discrete Math., 82,281-302 (1990). [183 J. Justicz, E.R. Scheinerman and P.M. Winkler; Random intervals, Amer. Math. Monthly, 97,8814389 (1990). [186] B. Bollobiis;Threshold functions for s m a l l subgraphs, Math. Proc. Camb. Phil. Soc., 90, 197-206 (1981). [187J A.D. Barbour. S. Janson, M. Karonski and A . Rucinsh; Small cliques in random graphs, Random Structures and Algorithms, 1,403434 (1990). [188] E. Gyori. B.L. Rothschild and A . Rucinski; Every graph is contained in a sparsest possible balanced graph. Proc. Cambridge Philos. Soc., 98.397401 (1985). [189] P. E'rdds and J. Spencer; Probabilistic Methods in Combinatorics, Academic Press, New York (1974). [I901 K.H. Borgwardt; Probabilistic analysis of the simplex method, Contemporary Math.. 114,21-34 (1990). [191] R. Shamir; The efficiency of the simplex m e t h d a survey, Man. Sci., 33,301-334 (1987). [I921 R. Shamir; Probabilistic analysis in linear programming,in Probability and Algorithms, National Academy Press, WasbingtonD.C., 131-148 (1992). [193] R.M. Karp, J.K. Lenstra, C.J.H. McDiarmid and A.H.G. Rinnooy Kan; Probabilistic analysis of combinatorial algorithms: An annotated bibliography, in Combinatorial Optimization: Annotated Bibliographies, Wiley, New York (1984). [194] A.R. Kemp; Fundamentals of the Average Case Analysis of Particular Algorithms. W h y , B.G. Teubner, Stuttgart (1984). [I951 J.M. 
Steele; Probabilistic and worst-case analysis of classical problems of combinatorid optimization in Euclidean space, Math. of Oper. Res., 15,749-770 (1990). [1%J M. Dyer and A. Frieze; Randomized greedy matching, Random Structures and Algorithms, 2, 29-46 (1991). [197] R.M. Karp, E. Upfal and A. Wigderson; Constructing a perfect matching is in N C , Proc. 17th ACM STOC, 22-32 (1985). (1981 R.M. Karp and J.M. Steele; Probabilistic analysis of heuristics, in The Traveling Salesman Problem, E.L. Lawler, J.K. Lenstra, A.H.G. Rinnooy Kan, and D.B. Shmoys (editors), Wiley, Chichester, 181-205 (1985). [199] A. Frieze; On the exact solution of random travelling salesman problems with medium size integer coefficients, SIAMJ. Comp.. 16,1052-1072 (1981). [200] F. Gavril and J. Schonheim; Constructing trees with prescribed cardinalities for the components of their vertex deleted subgraphs, J. of Algorithms, 6,239-252 (1985). [201] I. Krasikov, M.N. Ellingham and W.J. Myrvold; Legitimate number decks for trees, Ars. Combinatoria, 21, 15-17 (1986). [202] I. Krasikov and J. Schonheim; The reconstruction of a tree from its number deck, Discrete Math., 53, 137-145 (1985). [203] L. Rbdei; Ein kombinatorischer satz, Acta Litterarum ac. Scientiarum (Sectio Scientarum Mathematicarum). Szeged, 7 , 3 9 4 3 (1934). [204] P. T u r h ; Eine extremalaufgabeaus der Graphentheone, Mat. Fit. Lapok, 48,436452 (1941). [205l P. Erdds and A.M. Stone; On the structure of linear graphs, Bull. Amer. Math. Soc.. 52. 1087-1091 (1%). [206] P. Erdds and M. Simonovits; A limit theorem in graph theory, Studia Sci. Math. Hungar., 1, 51-57 (1966). [2071 L. Narens; Meaningfulness and the Erlanger Program of Felix Klein, Math. Znj Sci. Hum., 101.61-71 (1988). [208] National Council of Teachers of Mathematics, Curriculum and Evaluation Standards for School Mathemdics. National Council of Teachers of Mathematics,Reston, Virginia (1989).
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 45-58 (1993) © 1993 Elsevier Science Publishers B.V. All rights reserved.
A SURVEY OF (m, k)-COLORINGS
Marietjie FRICK
Department of Mathematics, Applied Mathematics, and Astronomy, University of South Africa, Pretoria, SOUTH AFRICA
Abstract
For a given graph invariant $\gamma$, an $(m,k)^{\gamma}$-coloring of a graph $G$ is a partition of the vertex set of $G$ into $m$ subsets $V_1, \ldots, V_m$ such that $\gamma(\langle V_i \rangle) \le k$ for $i = 1, \ldots, m$. Various aspects of $(m,k)^{\gamma}$-colorings are compared for the cases where $\gamma$ is taken to be, in turn, the clique number, the maximum degree, the degeneracy and the path number.
1. Introduction
All graphs considered in this paper are finite and simple. For undefined concepts we refer the reader to [1]. A coloring of a graph $G$ is an assignment of colors to the vertices of the graph, one color to each vertex. Thus a coloring of $G$ in $m$ colors corresponds to a partition of $V(G)$ into $m$ subsets $V_1, \ldots, V_m$ (called the color classes of the coloring). A coloring is proper if each of its color classes induces an edgeless graph in $G$. The minimum number of colors required for a proper coloring of $G$ is called the chromatic number of $G$, and is denoted by $\chi(G)$. A graph $G$ is called $m$-colorable if $\chi(G) \le m$, and $m$-chromatic if $\chi(G) = m$.
The chromatic number is probably the most intensively studied graph invariant, and it has been generalized in several different directions, usually by generalizing the concept of a proper coloring. A popular way of doing so is to generalize some property of edgeless graphs and then to define a generalized coloring as a coloring in which each color class induces a subgraph with this generalized property. In this paper we shall consider the property that some graph invariant does not exceed a given integer.
For a graph invariant $\gamma$ and integers $m$ and $k$ we define an $(m,k)^{\gamma}$-coloring of $G$ as a partition of $V(G)$ into $m$ subsets $V_1, \ldots, V_m$ such that $\gamma(\langle V_i \rangle) \le k$ for $i = 1, \ldots, m$, where $\langle V_i \rangle$ denotes the subgraph of $G$ induced by $V_i$. The minimum $m$ for which $G$ has an $(m,k)^{\gamma}$-coloring is called the $k$-th chromatic number of $G$, and will be denoted by $\chi_k^{\gamma}(G)$. A graph $G$ is called $(m,k)^{\gamma}$-colorable if $\chi_k^{\gamma}(G) \le m$, and $(m,k)^{\gamma}$-chromatic if $\chi_k^{\gamma}(G) = m$. If we do not wish to specify the invariant to be considered, the $\gamma$ in the notation will be omitted. On the other hand, if the integers $m$ and $k$ are of no importance, we shall simply speak of $\gamma$-colorings.
Since we want $\chi_k^{\gamma}$ to be a true generalization of $\chi$, and we wish our theory of $\gamma$-colorings to closely resemble that of proper colorings, we shall only consider graph invariants $\gamma$ which satisfy the following requirements.
I. It must be possible to characterize edgeless graphs by their $\gamma$-values, i.e. there must be an integer $k_0$ such that $H$ is an edgeless graph if and only if $\gamma(H) = k_0$.
II. If $H$ is a subgraph of $G$ then $\gamma(H) \le \gamma(G)$.
III. If $G_1$ and $G_2$ are two disjoint graphs with $\gamma(G_1) \le k$ and $\gamma(G_2) \le k$ then $\gamma(G_1 \cup G_2) \le k$.
Requirement I ensures that there exists an integer $k_0$ such that $\chi_{k_0}^{\gamma} = \chi$.
Requirement II ensures that, for every positive integer $m$, there exists a graph $G$ with $\chi_k^{\gamma}(G) = m$ (see [2], Theorem 1.2).
Requirement III ensures that if $\chi_k^{\gamma}(G) = m$, then every $(m,k)^{\gamma}$-coloring of $G$ is complete, i.e., any two distinct color classes of an $(m,k)^{\gamma}$-coloring are joined by an edge.
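The definition of an $(m,k)^{\gamma}$-coloring can be made concrete with a short check. The following is only a sketch under stated assumptions: graphs are stored as dictionaries of neighbour sets, the invariant is passed in as a function, and the maximum degree is used as one illustrative choice of $\gamma$. The names `induced`, `max_degree` and `is_mk_coloring` are hypothetical and do not come from the paper.

```python
def induced(adj, part):
    """Subgraph of `adj` induced by the vertex set `part` (dict of neighbour sets)."""
    keep = set(part)
    return {v: adj[v] & keep for v in keep}


def max_degree(adj):
    """Maximum degree; 0 for an edgeless or empty graph."""
    return max((len(nbrs) for nbrs in adj.values()), default=0)


def is_mk_coloring(adj, classes, k, invariant):
    """True if `classes` partitions V(G) and invariant(<V_i>) <= k for every class."""
    seen = [v for part in classes for v in part]
    if sorted(seen) != sorted(adj):
        return False  # not a partition of the vertex set
    return all(invariant(induced(adj, part)) <= k for part in classes)


# The 5-cycle 0-1-2-3-4-0.
c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}

proper = [[0, 2], [1, 3], [4]]      # three edgeless classes
defective = [[0, 1, 3], [2, 4]]     # one class contains the single edge 0-1
```

With the maximum degree as invariant, `is_mk_coloring(c5, proper, 0, max_degree)` describes a proper coloring, while `defective` only passes once `k` is raised to 1.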
Examples of graph invariants which satisfy these three requirements are
(a) The clique number $\omega$. (For a given graph $G$, $\omega(G)$ is the order of a largest complete subgraph of $G$.)
(b) The maximum degree $\Delta$.
(c) The degeneracy $\rho$. (For a given graph $G$, $\rho(G) = \max_{H \subseteq G} \delta(H)$.)
(d) The path number $\tau$. (For a given graph $G$, $\tau(G)$ is the order of a longest path in $G$.)
It is not the object of this paper to present a unifying theory of $(m,k)$-colorings, but rather to compare known results on the colorings corresponding to the four invariants listed above. On the one hand, such a comparison might lead to some results which hold for $(m,k)$-colorings in general and, on the other hand, it would indicate open problems for specific invariants. In [3]-[8] $\omega$-colorings are investigated; $\Delta$-colorings are also called defective colorings, and these are investigated in [9]-[15]; $\rho$-colorings are also called Lick-White vertex partition numbers, and they are investigated in [16]-[20]; and $\tau$-colorings are investigated in [15] and [21].
I shall not attempt to cover all the work done on the colorings in question, but will rather concentrate on those areas where there has been a good deal of common activity.
2. Various Areas of Investigation
A. Relations between $\chi_k$ and $\chi$
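Note that
$$\chi = \chi_1^{\omega} = \chi_0^{\Delta} = \chi_0^{\rho} = \chi_1^{\tau}.$$
(a) It is easy to see that, for any graph $G$ and any integer $k \ge 1$,
$$\chi_k^{\omega}(G) \le \lceil \chi(G)/k \rceil.$$
Moreover, it is proved in [6] that, if $k$, $m$ and $n$ are positive integers, there exists a graph $G$ with $\chi_k^{\omega}(G) = m$ and $\chi(G) = n$ if and only if $m \le \lceil n/k \rceil$. Thus the difference between $\chi$ and $\chi_k^{\omega}$ can be made arbitrarily large.
(b) It is well-known that $\chi(G) \le 1 + \Delta(G)$ for any graph $G$. It follows that, for every $k \ge 0$,
$$\chi_k^{\Delta}(G) \le \chi(G) \le (k + 1)\,\chi_k^{\Delta}(G).$$
(c) It is also well-known that $\chi(G) \le 1 + \rho(G)$ (see [1], Theorem 11.3). Hence, for every $k \ge 0$,
$$\chi_k^{\rho}(G) \le \chi(G) \le (k + 1)\,\chi_k^{\rho}(G).$$
(d) Further, $\chi(G) \le \tau(G)$ (see [1], Theorem 11.5). Hence, for every $k \ge 1$,
$$\chi_k^{\tau}(G) \le \chi(G) \le k\,\chi_k^{\tau}(G).$$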
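The relations in (a) to (d) can be checked by exhaustive search on very small graphs. The sketch below is an illustration only, assuming graphs given as dictionaries of neighbour sets; the helper names (`chi_k`, `clique_number`, `degeneracy`, `path_number`, and so on) are hypothetical, and the search is exponential, so it is meant for graphs with a handful of vertices.

```python
from itertools import combinations, product


def induced(adj, verts):
    """Subgraph induced by `verts`, as a dict of neighbour sets."""
    keep = set(verts)
    return {v: adj[v] & keep for v in keep}


def clique_number(adj):
    """Order of a largest complete subgraph (0 for the empty graph)."""
    vs = list(adj)
    for r in range(len(vs), 0, -1):
        for sub in combinations(vs, r):
            if all(b in adj[a] for a, b in combinations(sub, 2)):
                return r
    return 0


def max_degree(adj):
    return max((len(nbrs) for nbrs in adj.values()), default=0)


def degeneracy(adj):
    """max over subgraphs H of the minimum degree of H, via min-degree deletion."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}
    best = 0
    while adj:
        v = min(adj, key=lambda u: len(adj[u]))
        best = max(best, len(adj[v]))
        for u in adj[v]:
            adj[u].discard(v)
        del adj[v]
    return best


def path_number(adj):
    """Order (number of vertices) of a longest path, by exhaustive extension."""
    def extend(v, visited):
        return 1 + max((extend(u, visited | {u})
                        for u in adj[v] if u not in visited), default=0)
    return max((extend(v, {v}) for v in adj), default=0)


def chi_k(adj, k, invariant):
    """Smallest m such that V(G) splits into m classes with invariant <= k."""
    vs = list(adj)
    for m in range(1, len(vs) + 1):
        for colors in product(range(m), repeat=len(vs)):
            classes = [[v for v, c in zip(vs, colors) if c == i] for i in range(m)]
            if all(invariant(induced(adj, part)) <= k for part in classes):
                return m
    return None  # only reached when k lies below the edgeless value k_0


def complete_graph(n):
    return {v: set(range(n)) - {v} for v in range(n)}


def cycle(n):
    return {v: {(v - 1) % n, (v + 1) % n} for v in range(n)}
```

For instance, `chi_k(complete_graph(4), 0, max_degree)` is the ordinary chromatic number of $K_4$, and `chi_k(complete_graph(4), 1, max_degree)` is its first defective chromatic number.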
B. Relations between $\chi_k$ and $\omega$
It is well-known that, for every graph $G$,
$$\chi(G) \ge \omega(G).$$
We also have the following result, which has been proved by means of several different constructions (see [22] and [23]).
Theorem 1: For every integer $m \ge 2$ there exists a graph $G$ with $\chi(G) = m$ and $\omega(G) = 2$.
Remark: By a slight modification of Mycielski's construction in [22], we can prove that for a given integer $m \ge 2$ there exists a graph $G$ with $\chi(G) = m$ and $\omega(G) = c$ if and only if $2 \le c \le m$.
It is easy to prove the following relations.
From [24], Theorem 2 we obtain the following generalization of Theorem 1 above, which holds for $k$-th chromatic numbers in general.
Theorem 2: (Folkman) If $\gamma$ is any graph invariant and $H$ is a graph such that $\gamma(H) > k$ then, for every integer $m \ge 1$, there exists a graph $G$ with $\chi_k^{\gamma}(G) \ge m$ and $\omega(G) = \omega(H)$.
Thus, for any graph invariant $\gamma$ the difference between $\chi_k^{\gamma}$ and $\omega$ can be made arbitrarily large.
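For the classical case of Theorem 1, the standard form of Mycielski's construction can be sketched in a few lines. This is an illustration under assumptions, not the modified construction mentioned in the Remark: vertices are assumed to be labelled $0, \ldots, n-1$, and the function names are hypothetical. Each vertex $v$ receives a shadow vertex joined to the neighbours of $v$, and one apex vertex is joined to all shadows; the result stays triangle-free while the chromatic number grows by one.

```python
from itertools import combinations


def mycielskian(adj):
    """Standard Mycielski construction on a graph with vertices 0..n-1."""
    n = len(adj)
    new = {v: set(adj[v]) for v in range(n)}
    for v in range(n):
        new[n + v] = set()          # shadow of v
    new[2 * n] = set()              # apex
    for v in range(n):
        for u in adj[v]:            # shadow of v sees the neighbours of v
            new[n + v].add(u)
            new[u].add(n + v)
        new[n + v].add(2 * n)       # apex sees every shadow
        new[2 * n].add(n + v)
    return new


def has_triangle(adj):
    return any(b in adj[a] and c in adj[a] and c in adj[b]
               for a, b, c in combinations(adj, 3))


def is_colorable(adj, m):
    """Backtracking test for a proper coloring with m colors."""
    order = sorted(adj)
    color = {}

    def place(i):
        if i == len(order):
            return True
        v = order[i]
        for c in range(m):
            if all(color.get(u) != c for u in adj[v]):
                color[v] = c
                if place(i + 1):
                    return True
                del color[v]
        return False

    return place(0)


def chromatic_number(adj):
    m = 1
    while not is_colorable(adj, m):
        m += 1
    return m


k2 = {0: {1}, 1: {0}}
c5 = mycielskian(k2)          # a 5-cycle
grotzsch = mycielskian(c5)    # 11 vertices, 20 edges
```

Starting from $K_2$, two applications give the 5-cycle and then an 11-vertex graph, both with clique number 2.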
For $\gamma = \Delta$, $\rho$, or $\tau$ it follows directly from Theorem 1 and the relations given in (b), (c) and (d) of Section A, that there exist graphs with clique number 2 and arbitrarily large $k$-th $\gamma$-chromatic number. This is, however, not the case for $\gamma = \omega$, but Theorem 2 asserts that there exist graphs with clique number $k + 1$ and arbitrarily large $k$-th $\omega$-chromatic number. Sachs [8] constructed, for all positive integers $m$ and $k$, a graph $G$ with $\chi_k^{\omega}(G) = m$ and $\omega(G) = k + 1$. Using a different construction, Broere and Frick [4] proved
Theorem 3: Given integers $k \ge 1$ and $m \ge 2$ there exists a graph $G$ with $\chi_k^{\omega}(G) = m$ and $\omega(G) = c$ if and only if $k + 1 \le c \le km$.
The following result was proved constructively in [20].
Theorem 4: (Simões-Pereira) Given integers $k \ge 0$ and $m \ge 2$ there exists a graph $G$ with $\chi_k^{\rho}(G) = m$ and $\omega(G) = c$ if and only if $2 \le c \le m(k + 1)$.
It is easy to prove constructively that Theorem 4 also holds if $\chi_k^{\rho}$ is replaced with $\chi_k^{\Delta}$.
C. Relations between $\chi_k$ and $\Delta$
It is well-known that, for any graph $G$,
$$\chi(G) \le \Delta(G) + 1.$$
In 1941 Brooks [25] proved the following famous theorem. Theorem 5: (Brooks) If G is a connected graph then x(G) s A(@ unless G is a complete graph or an odd cycle.
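(a) From the inequalities $\chi_k^\omega(G) \le \lceil \chi(G)/k \rceil$ and $\chi(G) \le \Delta(G) + 1$, we obtain $\chi_k^\omega(G) \le \lceil (\Delta(G) + 1)/k \rceil$.
(b) It follows from a beautiful result proved by Lovász ([26], Theorem 1) that $\chi_k^\Delta(G) \le \lceil (\Delta(G) + 1)/(k + 1) \rceil$.
(c) Lick and White [18] proved that $\chi_k^\rho(G) \le \lceil (\Delta(G) + 1)/(k + 1) \rceil$.
(d) I do not know of a bound for $\chi_k^\tau$ in terms of $\Delta$.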
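The bound in (b) can be illustrated computationally. The following is a minimal sketch of a standard local-search argument (not necessarily the argument used in [26]): a vertex with more than $k$ neighbours of its own colour is moved to the colour class in which it has the fewest neighbours. The function names and the adjacency-dictionary input format are assumptions made for illustration only.

```python
def defective_colouring(adj, k):
    """Colour a graph so that every vertex has at most k neighbours of its own colour.

    adj maps each vertex to the set of its neighbours (hypothetical input format).
    Uses m = ceil((Delta + 1) / (k + 1)) colours, numbered 0..m-1.
    """
    delta = max((len(nbrs) for nbrs in adj.values()), default=0)
    m = (delta + k + 1) // (k + 1)          # integer ceiling of (delta + 1) / (k + 1)
    colour = {v: 0 for v in adj}            # start with every vertex in one class
    changed = True
    while changed:
        changed = False
        for v in adj:
            counts = [0] * m                # neighbours of v in each colour class
            for u in adj[v]:
                counts[colour[u]] += 1
            if counts[colour[v]] > k:
                # Some class holds at most floor(deg(v) / m) <= k neighbours of v;
                # moving v there strictly reduces the number of monochromatic edges,
                # so the loop terminates.
                colour[v] = counts.index(min(counts))
                changed = True
    return colour, m


def max_defect(adj, colour):
    """Largest number of same-coloured neighbours over all vertices."""
    return max((sum(colour[u] == colour[v] for u in adj[v]) for v in adj), default=0)
```

For $k = 0$ the sketch produces a proper colouring with $\Delta + 1$ colours, in line with the inequality $\chi(G) \le \Delta(G) + 1$ quoted above.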
Mitchem [19] generalized Brooks' Theorem for $\chi_k^\rho$ as follows.
Theorem 6: (Mitchem) If $G$ is a connected graph then $\chi_k^\rho(G) \le \lceil \Delta(G)/(k + 1) \rceil$ unless $G$ is
(i) A complete graph of order $t(k + 1) + 1$, $k \ge 0$, $t \ge 0$.
(ii) A $(k + 1)$-regular graph, $k \ge 0$.
(iii) An odd cycle, $k = 0$.
One could now attempt to find similar generalizations of Brooks' Theorem for $\chi_k^\omega$ and $\chi_k^\Delta$.
D. Bounds for $\chi_k^\gamma$ in terms of $k$ and $\gamma$
(a) As mentioned in Section B, $\chi_k^\omega(G) \ge \lceil \omega(G)/k \rceil$ for all $k \ge 1$, and there is no upper bound for $\chi_k^\omega$ in terms of $\omega$ and $k$ alone.
(b) As mentioned in Section C, it follows from [26], Theorem 1 that $\chi_k^\Delta(G) \le \lceil (\Delta(G) + 1)/(k + 1) \rceil$ for all $k \ge 0$.
(c) Lick and White [18] proved $\chi_k^\rho(G) \le \lceil (\rho(G) + 1)/(k + 1) \rceil$ for all $k \ge 0$.
(d) Chartrand, Geller and Hedetniemi [21] proved that $\chi_k^\tau(G) \le \lfloor (\tau(G) - k)/2 \rfloor + 2$ for all $k \ge 1$.
E. Chromatic number sequences
The $\gamma$-chromatic number sequence of a graph $G$ is the sequence
$\chi_{k_0}^\gamma(G), \chi_{k_0+1}^\gamma(G), \chi_{k_0+2}^\gamma(G), \ldots$
where $k_0$ is the integer such that $\chi_{k_0}^\gamma = \chi$.
This sequence is clearly nonincreasing and, if $\gamma(G) = n$ then $\chi_k^\gamma(G) = 1$ for all $k \ge n$. Thus the sequence has an infinite tail of ones.
(a) In [6] (Theorem 2) $\omega$-chromatic number sequences are characterized as follows:
Theorem 7: A given sequence of positive integers $m_1, m_2, \ldots$ is the $\omega$-chromatic number sequence of some graph $G$ if and only if the following two conditions are satisfied:
(i) $m_r = 1$ for some integer $r$
(ii) [ m i + l/m,l > j / i for all $i$ and $j$ with $1 \le i < j \le r$.
Concerning the length of a constant subsequence of an $\omega$-chromatic number sequence, we have
Theorem 8: If $i$, $j$ and $m$ are positive integers with $i < j$ and $m \ge 2$ then there exists a graph $G$ with $\chi_i^\omega(G) = \chi_{i+1}^\omega(G) = \ldots = \chi_j^\omega(G) = m$ if and only if $j \le 2i - 1$.
(For a proof see [6], Theorem 3.)
We also have a result concerning consecutive terms of an $\omega$-chromatic number sequence (see [6], Theorem 4):
Theorem 9:
(i) Given positive integers $i$, $m$ and $n$ with $i \ge 2$ and $m \ge n$, there exists a graph $G$ such that $\chi_i^\omega(G) = m$ and $\chi_{i+1}^\omega(G) = n$.
(ii) Given positive integers $m$ and $n$ there exists a graph $G$ with $\chi_1^\omega(G) = m$ and $\chi_2^\omega(G) = n$ if and only if $n \le \lceil m/2 \rceil$.
(b) The following theorem gives necessary conditions for a sequence of positive integers to be the $\Delta$-chromatic number sequence of some graph (see [13]).
Theorem 10: If $m_0, m_1, m_2, \ldots$ is the $\Delta$-chromatic number sequence of some graph, then the following two conditions are satisfied:
(i) $m_r = 1$ for some integer $r$
(ii) $m_j \le m_i \le m_j \lceil (j + 1)/(i + 1) \rceil$ if $0 \le i < j$.
For sequences with at most twelve terms greater than 1 it can be shown that these two conditions are also sufficient, but for sequences with more than twelve terms greater than 1 it is not known whether these conditions are sufficient. For example, we have not succeeded in finding a graph with $\Delta$-chromatic number sequence 4,4,4,4,2,2,2,2,2,2,2,2,2,1,... although this sequence satisfies the two conditions of Theorem 10.
The behavior of $\Delta$-chromatic number sequences is very different from that of $\omega$-chromatic number sequences. For example, Theorem 8 above shows that the length of a constant subsequence in the "greater-than-one part" of an $\omega$-chromatic sequence is restricted, whereas it follows from [13], Theorem 3 that constant subsequences of arbitrary length can occur in any part of a $\Delta$-chromatic sequence. Also, Theorem 9 above implies that the difference between any two terms of an $\omega$-chromatic number sequence can be arbitrarily large, but this is not the case for $\Delta$-chromatic number sequences (see Theorem 10 above). In particular, we have for any graph $G$, $\chi_i^\Delta(G) \le 2\chi_{i+1}^\Delta(G)$ for $i \ge 0$.
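The two conditions of Theorem 10 are mechanical to check. The sketch below tests them on the finite head $m_0, \ldots, m_r$ of a sequence, up to and including the first 1; the function name and the list-based input are assumptions made for illustration. Since the conditions are only known to be necessary in general, a result of True does not by itself show that a graph with the given sequence exists.

```python
def satisfies_theorem_10(head):
    """Check conditions (i) and (ii) of Theorem 10.

    head lists m_0, m_1, ..., m_r, ending at the first term equal to 1
    (hypothetical input format; the infinite tail of ones is implied).
    """
    if not head or head[-1] != 1:
        return False                                  # condition (i) fails
    for i in range(len(head)):
        for j in range(i + 1, len(head)):
            ceiling = -(-(j + 1) // (i + 1))          # ceil((j + 1) / (i + 1))
            if not (head[j] <= head[i] <= head[j] * ceiling):
                return False                          # condition (ii) fails
    return True
```

Applied to the head 4,4,4,4,2,2,2,2,2,2,2,2,2,1 the check succeeds, in agreement with the remark above that this sequence satisfies both conditions.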
(c) As far as I know no work has been done on $\rho$-chromatic number sequences. However, comparing the bounds given in (b) and (c) of Section D, we could expect $\rho$-chromatic number sequences to behave in much the same way as $\Delta$-chromatic number sequences. For example, $\rho$-chromatic sequences also satisfy the conditions (i) and (ii) of Theorem 10.
(d) Chartrand, Geller and Hedetniemi [21] proved that, for any graph $G$ and integers $i$ and $j$ with $2 \le i < j$,
1
xT(G) I (j - i - 1) /2]x,(G)
and
$\chi_1^\tau(G) \le j\chi_j^\tau(G)$.
Thus we have
Theorem 11: If $m_1, m_2, \ldots$ is the $\tau$-chromatic number sequence of some graph $G$, then
(i) $m_r = 1$ for some $r$
(ii) $m_1 \le jm_j$ for $j \ge 1$
(iii) mi I L (j- i - 1) /2J for 2 Ii <j.
It remains to find sufficient conditions for a sequence of integers to be the $\tau$-chromatic number sequence of a graph. It follows from [21], Theorem 5 that for any positive integer $n$, there exists a graph such that the first $n$ terms of its $\tau$-chromatic number sequence all equal 4. Thus a $\tau$-chromatic number sequence can have constant subsequences of arbitrary length in its "greater-than-one part". As far as consecutive terms are concerned, we have that, for any graph $G$ and any integer $i \ge 1$,
$\chi_i^\tau(G) \le 2\chi_{i+1}^\tau(G)$ (see [21], Corollary 3c).
F. Critical graphs
A graph $G$ is called critical with respect to $\chi_k^\gamma$ ($\chi_k^\gamma$-critical for short) if $\chi_k^\gamma(H) < \chi_k^\gamma(G)$ for every proper subgraph $H$ of $G$. If $G$ is $\chi_k^\gamma$-critical with $\chi_k^\gamma(G) = m$, we shall say that $G$ is $(m,k)^\gamma$-critical. Every $(m,k)^\gamma$-chromatic graph obviously contains an $(m,k)^\gamma$-critical graph as subgraph and in the study of $(m,k)^\gamma$-chromatic graphs it is often sufficient to consider only those which are $(m,k)^\gamma$-critical. If a graph is $m$-chromatic and critical with respect to $\chi$ then it is called $m$-critical. We know the following about the order of $m$-critical graphs.
Theorem 12:
(i) The smallest $m$-critical graph is the complete graph $K_m$.
(ii) There does not exist an $m$-critical graph of order $m + 1$.
(iii) The only 2-critical graph is $K_2$ and the only 3-critical graphs are the odd cycles.
(iv) For $m > 3$ the graph $C_5 + K_{m-3}$ is the only $m$-critical graph of order $m + 2$.
(v) For $m \ge 4$ there exists an $m$-critical graph of order $n$ for every $n \ge m + 2$.
Results (ii) and (iv) above are due to Dirac [27], and (v) is proved in [28], Theorem 11.7.5. The following results about the structure of $m$-critical graphs are proved in [28], Chapter 11.
Theorem 13: If $G$ is an $m$-critical graph then
(i) $G$ is 2-vertex connected.
(ii) $G$ is $(m - 1)$-edge connected.
(a) Theorem 12 can be generalized as follows for $(m,k)^\omega$-critical graphs (see [3]).
Theorem 14:
(i) The smallest $(m,k)^\omega$-critical graph is the complete graph $K_{k(m-1)+1}$.
(ii) There does not exist an $(m,k)^\omega$-critical graph of order $k(m - 1) + r$ for $2 \le r \le k + 1$.
(iii) The graph $\overline{C}_{2k+3} + K_{k(m-3)+k-1}$ is the only $(m,k)^\omega$-critical graph of order $mk + 2$.
(This is the graph obtained from the complete graph of order $mk + 2$ by the removal of the edges of a $(2k + 3)$-cycle, and is the smallest non-complete $(m,k)^\omega$-critical graph.) Various constructions of larger $(m,k)^\omega$-critical graphs appear in [7], but it has not yet been established whether $(m,k)^\omega$-critical graphs of arbitrarily large order exist. Theorem 13 is generalized as follows in [7].
Theorem 15: If $G$ is an $(m,k)^\omega$-critical graph then
(i) $G$ is 2-vertex connected.
(ii) $G$ is $k(m - 1)$-edge connected for $m \ge 2$.
The results of Theorem 15 are best possible.
(b) In [13] it is proved that every $(m,k)^\Delta$-critical graph has order at least $(m - 1)(k + 1) + 1$. Now, for any integer $m$ and any odd integer $k \ge 1$ let $H'_{m,k}$ be the graph obtained from the complete graph $K_{(m-1)(k+1)}$ by the removal of a 1-factor, and let
$H_{m,k} = K_1 + H'_{m,k}$.
Also, for integers $m \ge 2$ and $k \ge 2$ let
$G_{m,k} = K_{(m-2)(k+1)+1} + \overline{K}_{k+1}$.
The $(m,k)^\Delta$-critical graphs of order $(m - 1)(k + 1) + 1$ can be characterized as follows (see [13]).
Theorem 16: If $G$ is an $(m,k)^\Delta$-critical graph of smallest order then one of the following holds:
(i) $G \cong K_m$ and $k = 0$.
(ii) $G \cong G_{m,k}$ and $k \ge 2$.
(iii) $G \cong H_{m,k}$ and $k$ is odd and $m \ge 3$.
Concerning the structure of $(m,k)^\Delta$-critical graphs we have
Theorem 17: If $G$ is an $(m,k)^\Delta$-critical graph with $m \ge 2$ then $\delta(G) \ge m - 1$.
The above result is best possible, i.e., for every $k \ge 0$ and $m \ge 2$ there exists an $(m,k)^\Delta$-critical graph $G$ with $\delta(G) = m - 1$. This is in strong contrast with the fact that, for $m \ge 2$, every $(m,k)^\omega$-critical graph has minimum degree at least $k(m - 1)$ (see Theorem 15 above).
(c) Lick and White [18] proved the following two results on $(m,k)^\rho$-critical graphs.
Theorem 18: The complete graph $K_{(m-1)(k+1)+1}$ is an $(m,k)^\rho$-critical graph.
Theorem 19: If $G$ is an $(m,k)^\rho$-critical graph then $\delta(G) \ge (m - 1)(k + 1)$.
In many aspects the nature of $\Delta$-colorings and that of $\rho$-colorings are very similar, so I find the difference in the lower bounds for $\delta(G)$ in Theorem 17 and Theorem 19 rather surprising. (Both bounds are best possible.)
(d) I do not know of any results on $(m,k)^\tau$-critical graphs.
G. Uniquely colorable graphs
A graph $G$ is uniquely $(m,k)^\gamma$-colorable if it is $(m,k)^\gamma$-chromatic and any two $(m,k)^\gamma$-colorings of $G$ produce the same color classes. For proper colorings the following result is proved in [29].
Theorem 20: (Harary) For every $m \ge 3$ there exists a uniquely $m$-colorable graph with clique number less than $m$.
In [30] this result is considerably improved to
Theorem 21: (Bollobás and Sauer) For all integers $m \ge 2$ and $g \ge 3$ there exists a uniquely $m$-colorable graph with girth $g$.
Clearly, the smallest uniquely $m$-colorable graph is the complete graph $K_m$, and Theorem 21 implies that, for every $m \ge 2$, there exist uniquely $m$-colorable graphs of arbitrarily large order.
The following result is proved in [31].
Theorem 22: (Chartrand and Geller) Every uniquely $m$-colorable graph is $(m - 1)$-vertex connected.
(a) It is shown in [5] that, for every $k \ge 1$ and $m \ge 3$ there exist uniquely $(m,k)^\omega$-colorable graphs of order $n$ if and only if $n \ge mk + (m - 1)(k + 1)$. Uniquely $(m,k)^\omega$-colorable graphs of smallest order (i.e., of order $mk + (m - 1)(k + 1)$) are characterized as follows in [5].
Theorem 23: Let $k \ge 2$. Then
(i) There is only one uniquely $(2,k)^\omega$-colorable graph of smallest order (i.e., of order $3k + 1$), namely the graph $\overline{C}_{2k+1} + K_k$.
(ii) For $m \ge 3$ every uniquely $(m,k)^\omega$-colorable graph of order $mk + (m - 1)(k + 1)$ is a spanning subgraph of the graph $(m - 1)\overline{C}_{2k+1} + K_k$, and one of its color classes induces a graph isomorphic to $K_k$, while each of the others induces a graph isomorphic to $\overline{C}_{2k+1}$.
In [7] Theorem 23 is generalized to
Theorem 24: If $G$ is a uniquely $(m,k)^\omega$-colorable graph, then
(i) $G$ is $(m - 1)$-vertex connected.
(ii) $G$ is $k(m - 1)$-edge connected.
(b) Concerning the order of uniquely $(m,k)^\Delta$-colorable graphs we have
Theorem 25:
(i) If $m \ge 2$ and $k \ge 1$ every uniquely $(m,k)^\Delta$-colorable graph has order at least $m(k + 2) - 1$, and there exists a uniquely $(m,k)^\Delta$-colorable graph of order $m(k + 2) - 1$.
(ii) For $m \ge 2$ and $k \ge 0$ there exist uniquely $(m,k)^\Delta$-colorable graphs of arbitrarily large order.
The result (i) above is proved in [13], and (ii) follows from the fact that the $m$-partite graph $K(p_1, \ldots, p_m)$ is uniquely $(m,k)^\Delta$-colorable if $p_i \ge 2k + 1$ for $i = 1, \ldots, m$ (see the proof of [13], Theorem 3). It is also interesting to note that the graph $K(p_1, \ldots, p_m)$ with $p_i = 2k + 1$ for $i = 1, \ldots, m$ is uniquely $(m,\ell)^\Delta$-colorable for every $\ell$ with $0 \le \ell \le k$. I find it rather intriguing that the only $(m,k)^\Delta$-coloring that this graph has is a proper $m$-coloring.
All the uniquely $(m,k)^\omega$-colorable graphs and the uniquely $(m,k)^\Delta$-colorable graphs that I know of have relatively large clique numbers, so it is still a challenging open problem to generalize Theorems 20 and 21 for these cases.
(c) The following generalization of Theorem 21 was proved by Bollobás and Thomason [16] and, independently, by Cook [17].
Theorem 26: For every $k \ge 0$ and $m \ge 1$ there exists a uniquely $(m,k)^\rho$-colorable graph of arbitrarily large girth.
Theorem 26 implies that there also exist uniquely $(m,k)^\rho$-colorable graphs of arbitrarily large order. However, uniquely $(m,k)^\rho$-colorable graphs of smallest order have not yet been characterized.
(d) I do not know of any results concerning uniquely $(m,k)^\tau$-colorable graphs. In fact, it has not even been established that uniquely $(m,k)^\tau$-colorable graphs exist for $k \ge 2$ and $m \ge 2$.
I. Planar graphs
Since the theory of graph colorings had its origin in the Four Color Conjecture, it seems natural to ask the following.
Question: For a given integer $k$, what is the smallest number $m$ such that every planar graph can be $(m,k)^\gamma$-colored?
(a) In the case of $\omega$-colorings we know from the Four Color Theorem that the answer is $m = 4$ when $k = 1$ (since $\chi_1^\omega = \chi$), and it now follows directly that $m = 2$ when $k = 2$ or 3, and $m = 1$ when $k = 4$.
(b) For $\Delta$-colorings Cowen, Cowen and Woodall [12] proved, without using the Four Color Theorem, that $m = 4$ when $k = 1$; $m = 3$ when $k = 2$, and $m \ge 2$ for all $k \ge 0$.
(c) For $\rho$-colorings Lick and White [18] proved that $m \le 3$ when $k = 1$; $m \le 2$ when $k = 2$, 3 or 4, and $m = 1$ for all $k \ge 5$.
(d) For $\tau$-colorings we have a complete answer (and a rather surprising one at that), namely that $m = 4$ for all $k \ge 1$. This follows from the Four Color Theorem and [21], Theorem 4.
J. Hadwiger's conjecture
An elementary contraction of a graph $G$ is obtained from $G$ by identifying two adjacent vertices; the result of a sequence of elementary contractions is a contraction of $G$. A graph $H$ is a subcontraction of $G$, denoted by $G > H$, if $H$ is a contraction of a subgraph of $G$. A very deep and long-standing conjecture in graph theory, stated in [32], is
Hadwiger's Conjecture: If $\chi(G) = m$ then $G > K_m$.
The conjecture is true for $m \le 5$. For $m \le 3$ it is easy to prove; the case $m = 4$ was proved by Dirac [33], and Wagner [34] showed that the case $m = 5$ follows from the Four Color Theorem. At the conference Quo Vadis, Graph Theory? (Alaska, 1990) Bollobás appealed to graph theorists to make a gargantuan effort to settle Hadwiger's conjecture, so I feel obliged to at least have a glance at the conjecture from the viewpoint of $(m,k)$-colorings. Consider the following conjecture:
Conjecture $H(m,k)^\gamma$: If $\chi_k^\gamma(G) = m$ then $G > K_m$.
If the conjecture $H(m,k)^\gamma$ is true for some $m$ and $k$, then $H(m,\ell)^\gamma$ is also true for all $\ell \ge k$. The following two results, proved in [35] and [36] respectively, will be useful.
Theorem 27: (Mader) If $0 < m \le 7$ and $|E(G)| \ge (m - 2)|V(G)|$ then $G > K_m$.
Theorem 28: (Wagner) For every $m \ge 2$ there exists an integer $q(m)$ such that $\chi(G) \ge q(m)$ implies that $G > K_m$. Moreover, if $q(m)$ is the smallest number with this property, then $q(m + 1) \le 2q(m) - 1$.
(a) The conjecture $H(m,k)^\omega$ is weaker than Hadwiger's conjecture, because $\chi(G) > \chi_k^\omega(G)$ if $k > 1$ and $\chi(G) > 1$. We now prove the following.
Theorem 29: If $k \ge 1$ and $\chi_k^\omega(G) \ge 3$ then $G > K_{k+2}$.
Proof: We may assume that $G$ is $(m,k)^\omega$-critical with $m \ge 3$. Then, clearly, every vertex of $G$ is contained in a $(k + 1)$-clique of $G$ (i.e., a subgraph of $G$ isomorphic to $K_{k+1}$). By considering the intersection graph of the family of $(k + 1)$-cliques and using the fact that every graph has a vertex which is not a cut vertex, it is easy to see that $G$ has a $(k + 1)$-clique $A$ which is not a cutset of $G$. Since $G$ is $(m,k)^\omega$-critical, it follows from Theorem 15 that $\delta(G) \ge k(m - 1) > k$. Every vertex of $A$ is therefore adjacent to a vertex of $G - A$. Since $G - A$ is connected, it can be contracted to a single vertex which is then adjacent to every vertex of $A$. Thus $G$ can be contracted to $K_{k+2}$.
Corollary: For each $k \ge 1$ the conjecture $H(m,k)^\omega$ is true for all $m \le k + 2$.
Using the theorems of Mader and Wagner above, we now also prove
Theorem 30: (i)
For $k = 1$ the conjecture $H(m,k)^\omega$ is true for $m \le 5$.
(ii) For $k \ge 2$ the conjecture $H(m,k)^\omega$ is true for $m \le 7$.
(iii) For $k \ge 5$ the conjecture $H(m,k)^\omega$ is true for $m \le 8$.
Proof:
(i) This follows from the fact that $\chi_1^\omega = \chi$.
(ii) Let $k \ge 2$, $m \le 7$ and $G$ be a graph with $\chi_k^\omega(G) = m$. We may assume that $G$ is $(m,k)^\omega$-critical. Then $\delta(G) \ge k(m - 1)$, so that $|E(G)| > (m - 2)|V(G)|$, and hence by Mader's theorem, $G > K_m$.
(iii) We know now that $q(5) \le 5$ and, applying Wagner's recursive formula, we obtain $q(8) \le 33$. But if $\chi_5^\omega(G) = 8$ then $\chi(G) \ge 40$ (see Section A) and hence, by Wagner's theorem, $G > K_8$.
(b) Woodall [15] showed that, for each $k \ge 0$ and $m \ge 1$ there exists an $(m,k)^\Delta$-chromatic graph $G$ which does not have a subcontraction to $K_{m+1}$. Thus, if the conjecture $H(m,k)^\Delta$ is true, it will be a best possible result.
(c) Using Theorem 19 and the same method of proof as in Theorem 30(ii), we can prove that, for $k \ge 1$ the conjecture $H(m,k)^\rho$ is true for $m \le 7$.
(d) I do not know of any results in this area for $(m,k)^\tau$-colorings with $k > 1$.
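The arithmetic behind the bound $q(8) \le 33$ used in the proof of Theorem 30(iii) can be reproduced by iterating Wagner's recursion $q(m + 1) \le 2q(m) - 1$ from the starting value $q(5) \le 5$. The sketch below does only this bookkeeping; the function name and its parameters are hypothetical.

```python
def wagner_upper_bounds(m_start, q_start, m_end):
    """Iterate q(m + 1) <= 2 q(m) - 1 starting from q(m_start) <= q_start.

    Returns a dict mapping each m in m_start..m_end to the resulting
    upper bound on q(m).
    """
    bounds = {m_start: q_start}
    for m in range(m_start, m_end):
        bounds[m + 1] = 2 * bounds[m] - 1
    return bounds


# Upper bounds on q(5), ..., q(8) derived from q(5) <= 5.
print(wagner_upper_bounds(5, 5, 8))
```

The intermediate values are $q(6) \le 9$ and $q(7) \le 17$, leading to $q(8) \le 33$.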
References
[1] G. Chartrand and L. Lesniak; Graphs and Digraphs, 2nd Edition. Wadsworth & Brooks/Cole, Monterey, California (1986).
[2] J. Brown and D.G. Corneil; On generalized colorings, J. Graph Theory, 11, 87-99 (1987).
[3] I. Broere and M. Frick; On the order of color critical graphs, Proc. Sixteenth Southeastern International Conference on Combinatorics, Graph Theory and Computing, Congr. Numer., 47, 125-130 (1985).
[4] I. Broere and M. Frick; Two results on generalized chromatic numbers, Quaestiones Mathematicae, 13, 183-190 (1990).
[5] I. Broere and M. Frick; On the order of uniquely (k,m)-colourable graphs, Discrete Math., 82, 225-232 (1990).
[6] I. Broere and M. Frick; A characterization of the sequence of generalized chromatic numbers of a graph, Graph Theory, Combinatorics and Applications: Proceedings of the Sixth Quadrennial International Conference on the Theory and Applications of Graphs, Y. Alavi et al. (editors), John Wiley and Sons, Inc., New York, 179-186 (1991).
[7] M. Frick; Generalised Colourings of Graphs, Ph.D. thesis, Rand Afrikaans University, Johannesburg (1986).
[8] H. Sachs; Finite graphs, Recent Progress in Combinatorics, Proc. Third Waterloo Conf. on Combinatorics, Academic Press, New York, 175-184 (1969).
[9] J.A. Andrews and M.S. Jacobson; On a generalization of chromatic number, Proc. Sixteenth Southeastern International Conference on Combinatorics, Graph Theory and Computing, Congr. Numer., 47, 33-48 (1985).
[10] J.A. Andrews and M.S. Jacobson; On a generalization of chromatic number and two kinds of Ramsey numbers, Ars Combinatoria, 23, 97-102 (1987).
[11] D. Archdeacon; A note on defective colorings of graphs in surfaces, J. Graph Theory, 517-519 (1987).
[12] L.J. Cowen, R.H. Cowen and D.R. Woodall; Defective colorings of graphs in surfaces: partitions into subgraphs of bounded valency, J. Graph Theory, 10, 187-195 (1986).
[13] M. Frick and M.A. Henning; Various results on defective colorings of graphs, submitted.
[14] F. Harary and K.F. Jones; Conditional Colorability II: Bipartite Variations, Proc. Sundance Conf. Combinatorics and related topics, Congr. Numer., 50, 205-218 (1985).
[15] D. Woodall; Improper colourings of graphs, Graph Colourings, Pitman Research Notes in Mathematics Series, R. Nelson and R.J. Wilson (editors), Longman Scientific and Technical (1990).
[16] B. Bollobás and A.G. Thomason; Uniquely partitionable graphs, J. London Math. Soc., 16, 403-410 (1977).
[17] R.J. Cook; Point partition numbers and girth, Proc. Amer. Math. Soc., 49, 510-514 (1975).
[18] D.R. Lick and A.T. White; k-degenerate graphs, Canad. J. Math., 22, 1082-1096 (1970).
[19] J. Mitchem; An extension of Brooks' Theorem to n-degenerate graphs, Discrete Math., 17, 291-298 (1977).
[20] J.M.S. Simões-Pereira; A note on graphs with prescribed clique and point-partition numbers, J. Combinatorial Theory (B), 14, 256-258 (1973).
[21] G. Chartrand, D.P. Geller and S. Hedetniemi; A generalization of the chromatic number, Proc. Camb. Phil. Soc., 64, 265-271 (1968).
[22] J. Mycielski; Sur le coloriage des graphs, Colloq. Math., 3, 161-162 (1955).
[23] B. Descartes; Solution to advanced problem 4526, Amer. Math. Monthly, 61, 352-353 (1954).
[24] J.H. Folkman; Graphs with monochromatic complete subgraphs in every edge coloring, SIAM J. Appl. Math., 18, 19-24 (1970).
[25] R.L. Brooks; On coloring the nodes of a network, Proc. Cambridge Phil. Soc., 37, 194-197 (1941).
[26] L. Lovász; On decompositions of graphs, Studia Sci. Math. Hungar., 1, 237-238 (1966).
[27] G.A. Dirac; Graph union and chromatic number, J. London Math. Soc., 39, 451-454 (1964).
[28] O. Ore; The Four-colour Problem, Academic Press, New York (1967).
[29] F. Harary, S.T. Hedetniemi, and R.W. Robinson; Uniquely colorable graphs, J. Combinatorial Theory, 6, 264-270 (1969).
[30] B. Bollobás and N. Sauer; Uniquely colorable graphs with large girth, Can. J. Math., 28, 1340-1344 (1976).
[31] G. Chartrand and D.P. Geller; On uniquely colorable planar graphs, J. Combinatorial Theory, 6, 271-278 (1969).
[32] H. Hadwiger; Über eine Klassifikation der Streckenkomplexe, Vierteljahresschr. Naturforsch. Ges. Zürich, 88, 133-142 (1943).
[33] G.A. Dirac; A property of 4-chromatic graphs and some remarks on critical graphs, J. London Math. Soc., 27, 85-92 (1952).
[34] K. Wagner; Über eine Eigenschaft der ebenen Komplexe, Math. Ann., 114, 570-590 (1937).
[35] W. Mader; Homomorphiesätze für Graphen, Math. Ann., 178, 154-168 (1968).
[36] K. Wagner; Beweis einer Abschwächung der Hadwiger-Vermutung, Math. Ann., 153, 139-141 (1964).
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 59-70 (1993) © 1993 Elsevier Science Publishers B.V. All rights reserved.
NUMERICAL DECKS OF TREES
Fanica GAVRIL
Center for Military Analysis, Haifa, ISRAEL
Ilia KRASIKOV and Johanan SCHONHEIM
School of Mathematical Sciences, Sackler Faculty of Exact Sciences, Tel Aviv University, Ramat-Aviv, Tel Aviv, ISRAEL
Abstract
Let the vertices, respectively the edges, of a tree $T$ be $\{v_1, v_2, \ldots, v_n\}$ and $\{e_1, e_2, \ldots, e_{n-1}\}$. The Deck, respectively the Edgedeck, are then the collections $D(T) = \{T - v_i\}_{i=1}^{n}$ and $ED(T) = \{T - e_i\}_{i=1}^{n-1}$. Each of these decks is redundantly sufficient for reconstructing $T$. In a numerical approach define the number deck of the tree $T$ as the collection $ND(T) = \{p_1, p_2, \ldots, p_n\}$, where $p_i = \{a_1, a_2, \ldots, a_s\}$ is the collection of integers corresponding to the cardinalities of the connected components of $T - v_i$. Similarly define the edge number deck $END(T)$ and other numerical decks. First the recognition problem is considered, and conditions are established for collections of multisets of numbers to be the $ND(T)$ of a tree, the $END(T)$ of a tree, or some other numerical decks considered. It turns out that except for $ND(T)$ the problems are NP-complete. Then, the $ND(T)$ reconstruction problem is considered and a characterization is given.
1. Introduction
1.1 Decks of Trees and the Reconstruction Problem
We shall use the term multiset for an unordered collection of not necessarily distinct elements of a set. Let $G$ be a graph with vertex set $V = \{v_i\}_{i=1}^{n}$. Following Harary, we shall call the multiset $D(G)$ of the unlabeled subgraphs $\{G - v_i\}_{i=1}^{n}$ the Deck of $G$.
A conjecture standing since 1942 is that for $n > 2$ the Deck $D(G)$ determines $G$ uniquely up to isomorphism. This is expressed by saying that $G$ is $D(G)$-reconstructible. One of the cases for which the conjecture has been confirmed, a long time ago, is the case when $G$ is a tree $T$ [1]. More recently [2] it turned out that a tree $T$ is reconstructible even from many submultisets of $D(T)$. Similarly when the edge set of $T$ is $E = \{e_i\}_{i=1}^{n-1}$, then the multiset of the unlabeled subgraphs $\{T - e_i\}_{i=1}^{n-1}$ is called the Edgedeck of $T$ and denoted $ED(T)$. It is also known that a tree is $ED(T)$-reconstructible. Notice that while a member of $D(T)$ can have many connected components, each member of $ED(T)$ has exactly two connected components.
1.2 Numerical Decks of a Tree
A first impression upon considering the tree reconstruction topic could be that it is closed. But asking the "Quo vadis" question leads us to assert that a wider approach can open new ways. Such a possibility is to consider partial information contained in the Deck rather than a submultiset of its members. Of particular interest is the use of numerical information leading to various algorithms. First steps in this direction arose in 1980 [3], when the concept of a Number Deck of a tree was introduced. We shall now give the exact definitions of the above and further related concepts.
Definition 1:
Let $p_i$ be the multiset of positive integers determined by the cardinalities of the connected components of $T - v_i$. The multiset $\{p_1, p_2, \ldots, p_n\}$ is called the Number Deck of the tree $T$ and is denoted $ND(T)$. The largest element of $p_i$ will be denoted $w(p_i)$. The set of distinct elements in $ND(T)$ will be called the Reduced Number Deck of $T$ and denoted $RND(T)$.
Observation 1:
The sum of the members in each $p_i$ is $n - 1$.
Definition 2:
Let $p_i$ be the unordered pair of not necessarily distinct positive integers determined by the cardinalities of the two connected components of $T - e_i$. The multiset $\{p_1, p_2, \ldots, p_{n-1}\}$ is called the Edge Number Deck of the tree $T$ and is denoted $END(T)$.
Observation 2:
The sum of the numbers in each $p_i$ is $n$ and this implies that $p_i = \{a_i, n - a_i\}$. It is a matter of notation to have $a_i \le n - a_i$.
Definition 3:
Let $p_i$ be as in Definition 1. The multiset of integers defined by $p_1 \cup p_2 \cup \ldots \cup p_n$ is called the Total Number Deck of $T$ and denoted $TND(T)$. It is understood that the union takes into account the multiplicities in each multiset.
Observation 3:
It is easy to see that $p_1 \cup p_2 \cup \ldots \cup p_n = p_1 \cup p_2 \cup \ldots \cup p_{n-1}$, where on the left-hand side $p_i$ is as in Definition 1 and on the right-hand side $p_i$ is as in Definition 2.
Definition 4:
In a rooted tree $T$ with root $v_n$, each $v_i$ determines a tree $T_i$ rooted at $v_i$. Namely, for $i < n$, let $e$ be the edge adjacent to $v_i$ in the path $v_i - v_n$, a subgraph of $T$; then $T_i$ is the connected component of $T - e$ not containing $v_n$. If the cardinality of $T_i$ is $a_i$ then the multiset $\{a_1, a_2, \ldots, a_{n-1}, a_n = n\}$ is called the Endtree number deck of $T$, denoted $ETND(T)$.
Recall that a centroid of a tree is a vertex $v$ such that the maximal connected component of $T - v$ is minimal.
Observation 4:
If the root of $T$ is a centroid then $a_i \le n - a_i$ for $1 \le i \le n - 1$.
The purpose of the present paper is to discuss the relationship between these five types of numerical decks of a tree, to give characterizations of them, and to analyse the complexity of the related recognition problems, i.e. of recognizing whether a given multiset of multisets is one of the numerical decks of a tree. We shall also consider the $ND(T)$-reconstruction problem, the only one (as shown below) for which the recognition problem is not NP-complete. If a tree is not $ND(T)$-reconstructible, the problem of constructing all trees having the same ND is considered.
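Definitions 1 to 3 can be made concrete with a short computation. The sketch below, written under the assumption that the tree is given as a vertex count `n` (vertices labelled 0 to n-1) and a list of edges, derives $ND(T)$, $END(T)$ and $TND(T)$ from subtree sizes; the function name and input format are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def number_decks(n, edges):
    """Return (ND, END, TND) of a tree on vertices 0..n-1.

    ND is a sorted list of sorted lists, END a sorted list of sorted pairs,
    TND a sorted list of integers (multisets represented as sorted lists).
    """
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    parent = {0: None}
    order = [0]
    for u in order:                      # breadth-first order from vertex 0
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                order.append(v)
    sub = [1] * n                        # sub[u] = size of the subtree below u
    for u in reversed(order[1:]):
        sub[parent[u]] += sub[u]
    nd = [[] for _ in range(n)]
    end = []
    for u in order[1:]:
        p = parent[u]
        nd[p].append(sub[u])             # component of T - p containing u
        nd[u].append(n - sub[u])         # component of T - u containing p
        end.append(tuple(sorted((sub[u], n - sub[u]))))
    nd = sorted(sorted(p) for p in nd)
    tnd = sorted(a for p in nd for a in p)
    return nd, sorted(end), tnd
```

For the path on three vertices, `number_decks(3, [(0, 1), (1, 2)])` gives the number deck {{1,1},{2},{2}}, the edge number deck {{1,2},{1,2}} and the total number deck {1,1,2,2}, which agrees with Observations 1 to 3.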
2. Characterization and Recognition of Numerical Decks
The following characterization of a total number deck of a tree leads to characterizations of number decks and edge number decks:
Theorem 1: A multiset $S$ of $2(n - 1)$ positive integers is the total number deck of a tree if and only if there exists an $(n - 1) \times n$ matrix whose nonzero entries are the members of $S$, each row having exactly two nonzero entries summing up to $n$ while the sum in each column is $n - 1$.
Proof: Let $S$ be the total number deck of a tree $T$. A matrix $M$ as required can be obtained by replacing each nonzero entry $a_{ij}$ of the edge versus vertex incidence matrix of $T$ by the cardinality $n_1$ of the component of $T - e_i$ not containing $v_j$. Clearly $n_1$ is a member of $S$. Notice that if the second nonzero entry of row $i$ is $a_{ik}$ and is replaced as described by $n_2$ then $n_1 + n_2 = n$. The column sum condition is also satisfied.
Conversely, assume that a matrix $N$, as required, exists. Replacing all nonzero entries by ones, the obtained matrix $N^*$ is an incidence matrix of a graph $T$. First, let us prove that $T$ is a tree. $T$ having $n - 1$ edges and $n$ vertices, it is enough to show that $T$ is acyclic. Assuming the contrary, let $C$ be a cycle of length $k$ contained in $T$. Let $N'$ be the $k \times k$ submatrix of $N$ defined by $C$. Every row of $N'$ sums up to $n$, while the sum in every column of $N'$ is at most $n - 1$. Thus the sum of all the entries is $kn$ if counted by rows and at most $k(n - 1)$ if counted by columns, a contradiction. It remains to show that $TND(T)$ is $S$, the multiset of the nonzero entries of $N$.
In fact, we will show that the matrix $M$ obtained from $T$ by the method of the "only if" part of the proof is the given matrix $N$. Let $e_i$ be one of the edges of $T$. Let $p_i$ be the pair $\{n_1, n_2\}$ appearing in the row of $e_i$ of $M$; then $n_1 + n_2 = n$. On the other hand, let $\{a_1, a_2\}$ be the pair of nonzero entries occurring in the row of $e_i$ in $N$. Now $T - e_i$ consists of two trees $T_1$, $T_2$, to each corresponding a submatrix of $N$, $N_1$ respectively $N_2$. Then $a_1 \in N_1$ and $a_2 \in N_2$. Summing up the columns of $N_1$ one has $n_1(n - 1) - a_1$ while the sum of the rows of $N_1$ is $(n_1 - 1)n$. It follows that $n = n_1 + a_1$, therefore $a_1 = n_2$; similarly $a_2 = n_1$.
The above theorem leads to the following characterization of the vertex and edge number decks of trees.
Corollary 1: [4]
A collection $N = \{p_1, p_2, \ldots, p_n\}$ of $n$ multisets of positive integers summing up in each multiset to $n - 1$ is the number deck of a tree if and only if the union $\bigcup_{i=1}^{n} p_i$ can be partitioned into pairs, each pair summing up to $n$. We shall refer below to such a partition as a pairing.
Corollary 2:
A collection $P = \{p_1, p_2, \ldots, p_{n-1}\}$ of $n - 1$ pairs of positive integers, each pair summing up to $n$, is the edge number deck of a tree if and only if the union $\bigcup_{i=1}^{n-1} p_i$ can be partitioned into $n$ multisets, each summing up to $n - 1$.
We have already mentioned that the complexities of the recognition problems based on Corollary 1 and Corollary 2 are very different, namely very easy for Corollary 1 and NP-complete for Corollary 2. See section 4.
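The remark that recognition via Corollary 1 is easy can be illustrated directly: a pairing exists exactly when every value $a$ occurs as often as $n - a$ in the union, with the value $n/2$ (if present) occurring an even number of times. The following is a minimal sketch of that test; the function name and the list-of-lists input are assumptions for illustration, not part of the original text.

```python
from collections import Counter


def is_number_deck(deck):
    """Corollary 1 test for a candidate number deck.

    deck is a list of n multisets (given as lists) of positive integers.
    """
    n = len(deck)
    # Every multiset must consist of positive integers summing to n - 1.
    if any(sum(p) != n - 1 or any(a <= 0 for a in p) for p in deck):
        return False
    counts = Counter(a for p in deck for a in p)
    for a, c in counts.items():
        if 2 * a == n:
            if c % 2:                     # n/2 must be paired with itself
                return False
        elif counts.get(n - a, 0) != c:   # a must be matched by n - a
            return False
    return True
```

The test only counts values, so it runs in time linear in the total size of the deck (up to hashing), in contrast with the partition problem of Corollary 2.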
Corollary 3:
a) All trees having the same TND can be obtained by the construction in the proof of Theorem 1, using a matrix $M$.
b) All trees having the same ND can be obtained from the different possible pairings in Corollary 1.
c) All trees having the same END can be obtained from the different possible partitions in Corollary 2.
A relationship between various numerical decks is formulated in the following theorem, an immediate consequence of the definitions and Observations 2, 3, 4.
Theorem 2: Consider a multiset $R = \{a_1, a_2, \ldots, a_{n-1}, a_n = n\}$ of positive integers. The following three statements are equivalent:
(i) $R$ is the endtree number deck of a tree $T$ rooted at a centroid.
(ii) $a_i \le n - a_i$ for $i \in \{1, 2, \ldots, n - 1\}$ and the collection of pairs $P = \{\{a_1, n - a_1\}, \{a_2, n - a_2\}, \ldots, \{a_{n-1}, n - a_{n-1}\}\}$ is the edge number deck of the tree $T$.
(iii) $a_i \le n - a_i$ for $i \in \{1, 2, \ldots, n - 1\}$ and $S = \{a_1, a_2, \ldots, a_{n-1}, n - a_1, n - a_2, \ldots, n - a_{n-1}\}$ is the total number deck of the tree $T$.
The following characterization of an endtree number deck involves subsums. By virtue of Theorem 2, similar characterizations can be given for END and TND.
Theorem 3: The multiset of positive integers $R = \{a_1, a_2, \ldots, a_n\}$, with $a_1 \le a_2 \le \ldots \le a_n = n$, is the endtree number deck of a tree if and only if $R - \{n\}$ can be partitioned into multisets $S_i$ such that, for $a_i > 1$ and $i \in \{1, 2, \ldots, n\}$,
$a_i = 1 + \sum_{a \in S_i} a$.
Proof:
For the only if part let $S_i$ be the multiset of elements of $R$ corresponding to the endtrees determined by the sons of $v_i$, when $a_i > 1$. Then $a_i = 1 + \sum_{a \in S_i} a$. Therefore $S_1 \cup S_2 \cup \ldots \cup S_n$ is a partition of $R - \{n\}$ as requested.
For the if part construct a graph $T$ whose vertices are the elements of $R$, two vertices $a_i$, $a_j$ being connected by an edge directed from $a_i$ to $a_j$ if and only if $a_j$ is in $S_i$. One can see that $T$ is a rooted tree having the desired ETND.
Alternatively, in the matricial approach of Corollary 2, consider
$P = \{(a_1, n - a_1), \ldots, (a_{n-1}, n - a_{n-1})\}$.
The condition of the corollary is satisfied if and only if there is a partition of $R - \{n\}$ as in Theorem 3. Indeed each term $n - a_i$ must be completed in its column to $n - 1$ and this is done by $\sum_{a \in S_i} a$ for $a_i > 1$ and by 0 for $a_i = 1$. An additional such column is determined by $i = n$. By Theorem 2 this also proves Theorem 3.
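The "only if" direction of Theorem 3 can be observed on a concrete rooted tree. The sketch below assumes the tree is given by a hypothetical parent array (`parent[v]` is the parent of vertex `v`, and `None` marks the root); it returns the endtree number deck together with the blocks $S_i$ formed by the endtree sizes of the sons of each non-leaf vertex.

```python
def endtree_number_deck(parent):
    """Return (ETND, blocks) for a rooted tree given by a parent array.

    ETND is the sorted list of endtree sizes; blocks maps each non-leaf
    vertex i to the sorted sizes of the endtrees of its sons (the multiset S_i).
    """
    n = len(parent)
    children = [[] for _ in range(n)]
    root = parent.index(None)
    for v, p in enumerate(parent):
        if p is not None:
            children[p].append(v)
    order = [root]
    for u in order:                  # breadth-first order: parents precede sons
        order.extend(children[u])
    size = [1] * n
    for u in reversed(order):        # sons are finished before their parent
        size[u] += sum(size[c] for c in children[u])
    blocks = {u: sorted(size[c] for c in children[u])
              for u in range(n) if children[u]}
    return sorted(size), blocks
```

For `parent = [None, 0, 0, 1]` the deck is {1,1,2,4} and the blocks are {1,2} for the root and {1} for vertex 1; together they partition $R - \{n\}$ = {1,1,2}, and each block sums to the corresponding $a_i - 1$.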
3. Reconstruction of a Tree from its ND
We shall see in section 4 that all numerical recognition problems except the ND-problem are NP-complete. Therefore we restrict ourselves to the reconstruction problem in this case only. We shall see further that essentially the problem reduces to the RND case. For ETND-reconstruction see [5].
3.1 Necessary and Sufficient Conditions for a Tree to be Nonreconstructible from Its ND
Such conditions have been established in [6]. We reproduce them here in Theorem 4.

Theorem 4: A tree $T$ with $ETND(T) = \{a_1, a_2, \ldots, a_{n-1}, n\}$ and $ND(T) = \{p_1, p_2, \ldots, p_n\}$ is not ND-reconstructible if and only if there exists a pair of indices $i, j$ such that the following three conditions hold:
(i) $a_i = a_j$
(ii) The endtrees $T_i$, $T_j$ are not isomorphic as rooted trees with roots $v_i$, $v_j$ respectively
(iii) $v_i$, $v_j$ are dissimilar in $T_{ij} = T - T_i - T_j \cup v_i \cup v_j$
Observe that (i) can be replaced by $\omega(p_i) = \omega(p_j)$. Moreover, (ii) can be replaced by
(ii)' $p_i \ne p_j$
Proof: Clearly (ii)' implies (ii). The proof of the converse implication is more involved. We have to show that if there is a pair $i, j$ satisfying (i), (ii), (iii) then there is also a pair satisfying (i), (ii)', (iii). Choose $i, j$ satisfying (i), (ii), (iii) for which $a_i$ is minimal. We shall show that such a pair also satisfies (ii)'. Assume, on the contrary, that $p_i = p_j$. Then there are two nonisomorphic branches $B_i$, $B_j$ of the same size at $v_i$ and $v_j$ respectively. Let $w_i$, $w_j$ be the neighbours of $v_i$, $v_j$ in $B_i$ and $B_j$ respectively. Now (i), (ii), (iii) hold for $w_i$, $w_j$: (i) by the choice of $w_i$, $w_j$; (ii) by the choice of $B_i$, $B_j$; and (iii) since any isomorphism mapping $w_i$ onto $w_j$ must also map $v_i$ onto $v_j$, contradicting the nonsimilarity of $v_i$, $v_j$. Finally, the minimality of $a_i$ is contradicted by the order of the endtree defined by $w_i$.

The use of (ii)' instead of (ii) is more suitable for producing algorithms deciding whether a tree is nonreconstructible from its ND, since checking the isomorphism of trees is replaced by checking the equality of multisets. In order to decide whether a tree is ND-reconstructible by the above conditions we need the knowledge of $ND(T)$. That knowledge is equivalent to the knowledge of $RND(T)$ together with the multiplicity of every member of $RND(T)$ in $ND(T)$. We shall show in the next paragraph that, for deciding whether a tree is ND-reconstructible, it is enough to know $RND(T)$ and the subset of its members having multiplicity exactly one in $ND(T)$.
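To make the remark about multiset comparison concrete, here is a minimal Python sketch that computes the number deck of a tree and lists the pairs of vertices meeting the multiset form of conditions (i) and (ii)', namely $\omega(p_i) = \omega(p_j)$ and $p_i \ne p_j$. The adjacency-dictionary input and the function names are assumptions of this sketch; condition (iii), the dissimilarity of $v_i$ and $v_j$, is not tested, so the output is only a list of candidates.

```python
def number_deck(adj):
    """Map each vertex v of a tree to the sorted tuple of component orders of T - v."""
    n = len(adj)
    root = next(iter(adj))
    parent = {root: None}
    order = [root]
    for v in order:                        # BFS; the list grows while we walk it
        for w in adj[v]:
            if w not in parent:
                parent[w] = v
                order.append(w)
    size = {v: 1 for v in adj}
    for v in reversed(order[1:]):          # accumulate subtree orders bottom-up
        size[parent[v]] += size[v]
    deck = {}
    for v in adj:
        parts = [size[w] for w in adj[v] if parent[w] == v]
        if parent[v] is not None:
            parts.append(n - size[v])      # the component containing the root
        deck[v] = tuple(sorted(parts))
    return deck

def candidate_pairs(deck):
    """Pairs with equal maximal element but different multisets: (i) and (ii)' only."""
    vs = [v for v in deck if deck[v]]
    return [(u, v) for i, u in enumerate(vs) for v in vs[i + 1:]
            if deck[u][-1] == deck[v][-1] and deck[u] != deck[v]]
```

On the six-vertex tree with edges 0-1, 0-2, 0-3, 3-4, 4-5 the vertices 0 and 3 have multisets (1, 1, 3) and (2, 3), so they form the only candidate pair.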
F. Gavril, I. Krasikov and J. Schönheim
3.2 The Role of RND in ND Reconstructibility
Let $T$ be a tree having $n$ vertices with $RND(T) = \{p_1, p_2, \ldots, p_s\}$. An important tool in the above topic will be a certain digraph $G(T)$ defined as follows:

Definition 5:
The vertex set of $G(T)$ is defined to be $RND(T)$. Denote by $v_i$ the vertex determined by $p_i$. Two vertices $v_i$, $v_j$ are joined by an edge $v_i \to v_j$ if and only if there is an element $x$ of $p_i$ and an element $y$ of $p_j$ such that
$$x + y = n \qquad (1)$$
and [...] and $n > y$. Observe that $x = \omega(p_i)$, the maximal element in $p_i$, while $y = \omega(p_j)$ is possible only when $v_j$ is a centroid of $T$. Notice also that the set of sinks of $G(T)$ is just the set of centroids of $T$.
Definition 6:
A vertex $v_i$ of $G(T)$ is called homogeneous if and only if there is no $j \ne i$ such that $\omega(p_i) = \omega(p_j)$, i.e., when the maximal element of its defining multiset is not maximal in any other multiset in $RND(T)$.

Definition 7:
The number of members of $ND(T)$ having the same multiset $p$ is called the multiplicity of $p$.

Theorem 5:
A tree is not ND-reconstructible if there is a nonhomogeneous vertex $p_i$ in $G(T)$ such that there are two distinct paths from $p_i$ to the set of sinks of $G(T)$.
Proof:
Let $v_i$ and $v_j$ be two nonhomogeneous vertices in $G(T)$ such that $\omega(p_i) = \omega(p_j)$ but $p_i \ne p_j$. The vertices $v_i$ and $v_j$ in $T$ satisfy the conditions (i) and (ii) of Theorem 4. To see that they also satisfy condition (iii), observe that if $P$ is a path from $v_i$ to a sink $v$ then $P' = (P - v_i) \cup v_j$ is also a path from $v_j$ to $v$. Let $P_1$, $P_2$ be two distinct paths from $v_i$ to $v$ and $P_1'$, $P_2'$ the corresponding paths, as above, from $v_j$ to $v$. The edges of $G(T)$ which are members of $P_1 \cup P_2'$ satisfy relation (1) and they can be extended to a pairing in $ND(T)$. This determines a tree $T'$ with $ND(T') = ND(T)$. Moreover the vertices $v_i$ and $v_j$ are dissimilar in $T_{ij} = T - T_i - T_j \cup v_i \cup v_j$, since any automorphism of $T_{ij}$ mapping $v_i$ into $v_j$ must map $P_1$ onto $P_2'$. But this is impossible, the corresponding multisets along $P_1$ and $P_2'$ being different. Thus by Theorem 4 the tree $T'$, and consequently $T$, is not reconstructible.
The conditions in Theorem 5 are not complete since there is no "only if" part. The following theorem remedies this situation.

Theorem 6:
Suppose that for any inhomogeneous $p$ of $G(T)$ the path $p - v$ is unique. Then $T$ is not ND-reconstructible if and only if there are two pairs of nonhomogeneous vertices $v_i$, $v_j$ and $v_h$, $v_k$ in $G(T)$ such that the following two conditions hold:
(i) $\omega(v_i) = \omega(v_j)$ and $\omega(v_h) = \omega(v_k)$
(ii) all paths from $v_i$, $v_j$, $v_h$, $v_k$ to the sink $v$ meet in a common vertex $z \ne v$ of multiplicity at least two.
Proof: Suppose that two such pairs do exist. Fix one required pair $v_i$, $v_j$ and consider the sets $U_h$ and $U_k$ consisting of all $v_h \in U_h$ and $v_k \in U_k$ such that any of them can be taken as the second pair. Fixing the second pair $v_h \in U_h$ and $v_k \in U_k$, consider the paths $(v_i - v)$, $(v_j - v)$, $(v_h - v)$, $(v_k - v)$ in $G(T)$. Notice that each of them is the unique path from the corresponding vertex to the sink. Extend the union of the four paths to a pairing in ND. In the obtained tree $T'$ the vertices $v_i$ and $v_j$ satisfy the conditions of Theorem 4; the first two by the assumptions. For the third, let $z_1$, $z_2$ be two vertices of $T'$ corresponding to $z$ in $G(T)$ and choose the pairing in ND in such a way that the paths $(v_i - z_1)$, $(v_h - z_1)$, $(v_j - z_2)$ and $(v_k - z_2)$ occur in $T'$. But then $v_i$ and $v_j$ are dissimilar in $T_{ij} = T - T_i - T_j \cup v_i \cup v_j$, since any automorphism of $T_{ij}$ mapping $v_i$ onto $v_j$ also maps $U_h$ onto $U_k$. This is impossible. This proves the sufficiency.
Suppose now that there are two nonisomorphic trees having the same ND. Let $v_i$, $v_j$ be two vertices satisfying the assumptions of Theorem 4. By assumption the paths $(v_i - v)$ and $(v_j - v)$ are unique. Let
$$(v_i - v) = (v_i, v_{i1}, v_{i2}, \ldots, v).$$
Extend their union to a pairing of ND and consider the corresponding tree $T'$.
For each $p \in (v_i - v) \cup (v_j - v)$ define a rooted tree $T'(p)$ consisting of all branches of $p$ except those belonging to the above union. Since there is no automorphism of $T_{ij}$ mapping $v_i$ onto $v_j$, there is a minimal $\ell$ such that $T'(v_{i\ell})$ is not isomorphic to $T'(v_{j\ell})$ as rooted trees. Hence, by the argument used in proving the equivalence of (ii) and (ii)' in Theorem 4, there exists a second pair $v_h$, $v_k$, $v_h \in T'(v_{i\ell})$, $v_k \in T'(v_{j\ell})$, as required, and $v_{i\ell}$ coinciding with $v_{j\ell}$ in $G(T)$ is the required common vertex $z$.

4. Complexity of Numerical Deck Recognition Problems

We now consider the computational complexity of the recognition problems of numerical decks of trees.

4.1 The ND-Recognition Problem

Reference [7] contains a polynomial time algorithm for the recognition problem of a number deck of a tree. Clearly Corollary 1 can be directly used to devise a very simple and efficient algorithm for this problem.

4.2 NP-Completeness of Some Numerical Deck Recognition Problems

Theorem 2 states that the recognition problems of an edge number deck of a tree, a total number deck of a tree and an endtree number deck of a tree are reducible to each other in polynomial time. An attempt to give a polynomial time algorithm for recognizing whether a multiset $R$ is an endtree number deck of a tree based on Theorem 3 is as follows: Let $j$ be the first index in $R$ such that $a_j > 1$. Then for every $i = j, j+1, \ldots, n$ we try to find among the unused elements of $\{a_1, \ldots, a_{i-1}\}$ a subset which sums up to $a_i - 1$. Unfortunately
this does not give a polynomial time algorithm since there may be many such subsets summing up to $a_i - 1$. Yet this gives a non-polynomial branch and bound algorithm. Indeed it turns out (see Theorem 7 below) that the problem is NP-complete. The known NP-complete problem which will be shown to be reducible to our problem is the so-called 3-partition problem. It is formulated as follows: Consider a multiset $A$ of $3n$ positive integers and an integer bound $b$ such that $\sum_{a \in A} a = nb$ and for every element $a$ of $A$ one has $b/4 < a < b/2$. The question is whether there is a partition $S_1 \cup S_2 \cup \ldots \cup S_n$ of $A$ such that $\sum_{a \in S_i} a = b$ for $1 \le i \le n$. Restricted to $n > 2$, the problem is NP-complete in the strong sense. To define NP-completeness in the strong sense [8], let $I$ be an instance of a numerical problem, $\ell(I)$ its length and $m(I)$ the maximal number appearing in $I$. A problem is said to be NP-complete in the strong sense if there exists a polynomial $f$ such that the restriction of the problem to instances $I$ having $m(I) \le f(\ell(I))$ is also NP-complete.

Theorem 7:
The 3-partition problem restricted to instances $I$ having $m(I) \le f(\ell(I))$ is reducible in polynomial time to the ETND-recognition problem, $T$ rooted at the centroid.
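The branch and bound search described before Theorem 7 can be sketched in Python as follows. This is an illustration under assumptions made here, not the authors' algorithm: the function name is hypothetical, the test is for an arbitrary rooted tree (the centroid condition of Theorem 2 is not checked), and the running time is exponential in the worst case, as the NP-completeness result suggests it must be for any straightforward search.

```python
from collections import Counter

def is_etnd(R):
    """Backtracking test: is R the endtree number deck of some rooted tree?"""
    R = sorted(R)
    n = len(R)
    if n == 0 or R[-1] != n:
        return False
    unused = Counter()        # elements already placed that still lack a father

    def choose(target, values, start):
        # Yield once for every sub-multiset of `unused` summing to `target`.
        # The chosen elements stay removed from `unused` while suspended.
        if target == 0:
            yield
            return
        for k in range(start, len(values)):
            v = values[k]
            if v > target:
                break
            if unused[v] > 0:
                unused[v] -= 1
                yield from choose(target - v, values, k)
                unused[v] += 1

    def place(i):
        if i == n:
            return sum(unused.values()) == 0      # every non-root has a father
        a = R[i]
        values = sorted(v for v in unused if unused[v] > 0)
        for _ in choose(a - 1, values, 0):        # sons of a_i must sum to a_i - 1
            if i < n - 1:
                unused[a] += 1                    # a_i now waits for its own father
            found = place(i + 1)
            if i < n - 1:
                unused[a] -= 1
            if found:
                return True
        return False

    return place(0)
```

For instance, [1, 1, 2, 4] is accepted (the root has sons with endtree numbers 2 and 1), whereas [1, 2, 2, 4] is rejected because only one of the two vertices with endtree number 2 can receive the single leaf.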
Proof: Consider an instance $A = \{a_1, \ldots, a_{3n}\}$, $b$ to the 3-partition problem. Let $R$ be the multiset obtained from $A$ by adding to it $nb - 3n$ occurrences of one, $n$ occurrences of $b + 1$ and one occurrence of $n(b + 1) + 1$; the multiset $R$ has $(nb - 3n) + 3n + n + 1 = n(b + 1) + 1$ elements. Let $R$ be an instance to the recognition problem of an endtree number deck of a tree rooted at a centroid. This reduction can be done in polynomial time since $b \le m(I) \le f(\ell(I))$. By Theorem 3, $R$ is a solution to the second problem if and only if $b + 1 < n(b + 1) - (b + 1)$, which is true for $n > 2$, and there is a partition $s_1, \ldots, s_{nb+n+1}$ of $R - \{n(b + 1) + 1\}$ fulfilling
$$s_i = \emptyset \quad \text{for } 1 \le i \le nb - 3n,$$
$$\sum_{a \in s_i} a + 1 = a_i \quad \text{for } nb - 3n + 1 \le i \le nb,$$
$$\sum_{a \in s_i} a + 1 = b + 1 \quad \text{for } nb + 1 \le i \le nb + n,$$
$$\sum_{a \in s_i} a + 1 = n(b + 1) + 1 \quad \text{for } i = nb + n + 1.$$
Therefore $s_{nb+1}, \ldots, s_{nb+n}$ is a solution to the instance of the 3-partition problem. Conversely, consider a solution $s_{nb+1}, \ldots, s_{nb+n}$ to the instance of the 3-partition problem. Define $s_i$ as $\emptyset$ for $1 \le i \le nb - 3n$, $s_i$ as a multiset of $a_i - 1$ ones for $nb - 3n + 1 \le i \le nb$, and $s_{nb+n+1}$ as a multiset of $n$ occurrences of $b + 1$. Then $s_1, \ldots, s_{nb+n+1}$ is a solution to the instance of the recognition problem of an endtree number deck of a tree rooted at a centroid.
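The multiset $R$ used in this reduction is easy to write down explicitly. The following Python sketch builds it from a 3-partition instance; the function name is hypothetical, and the root entry is taken to be $n(b+1)+1$, the total number of elements, which is what the element count and the last displayed condition of the proof require.

```python
def three_partition_to_etnd(A, b):
    """Build the multiset R of the reduction from a 3-partition instance (A, b)."""
    n, remainder = divmod(len(A), 3)
    if remainder != 0 or sum(A) != n * b:
        raise ValueError("A must contain 3n integers summing to n*b")
    ones = [1] * (n * b - 3 * n)       # leaves hanging below the elements of A
    blocks = [b + 1] * n               # one vertex per part of the 3-partition
    root = [n * (b + 1) + 1]           # the root's endtree is the whole tree
    return ones + sorted(A) + blocks + root
```

For $A = \{2, 2, 2, 2, 2, 2\}$ and $b = 6$ (so $n = 2$) the result has 6 ones, six 2's, two 7's and the root entry 15, that is $n(b+1)+1 = 15$ elements in total.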
Corollary 4 The recognition problems for a total number deck of a tree, an edge number deck of a tree and an endtree number deck of a tree rooted at a centroid are NP-complete.
4.3 Reduced Versions
Since the recognition problem of an endtree number deck of a tree rooted at a centroid is NP-complete, it is natural to look for reduced versions of this problem which might be less difficult. The depth of the trees used in the proof of Theorem 7 is at most four; therefore, the above problem remains NP-complete when restricted to rooted trees of depth at most four. Similarly, the recognition problems for edge number deck and total number deck remain NP-complete when restricted to trees of diameter at most seven. We now prove that the endtree number deck problem remains NP-complete when restricted to trees with at most two sons per vertex. For proving this we use the numerical matching with target sums problem, known to be NP-complete in the strong sense [8]. The problem is defined as follows: consider two multisets $X$, $Y$, each containing $m$ positive integers and a target vector [...]
Proof: Consider an instance $X = \{x_1, \ldots, x_m\}$, $Y = \{y_1, \ldots, y_m\}$, $b_1, \ldots, b_m$ to the first problem and let $q = 2 \max_{1 \le i \le m} \{x_i, y_i, b_i\}$. Define $Z_1 = \{x + 3q \mid x \in X\}$, $Z_2 = \{y + 4q \mid y \in Y\}$, $Z_3 = \{b_i + 7q + 1 \mid 1 \le i \le m\}$, $Z_4 = \{z_j = \sum_{i=1}^{j} (b_i + 7q + 1) + j - 1 \mid 2 \le j \le m\}$, $Z_5 = \{x + q \mid x \in X\}$, $Z_6 = \{y + 2q \mid y \in Y\}$. Let $Z_7$ be a multiset containing $2m$ occurrences of $2q - 1$. For every $r \in Z_5 \cup Z_6 \cup Z_7$ let $Z_r = \{r - 1, r - 2, \ldots, 1\}$. Define $R = (\bigcup_{i=1}^{7} Z_i) \cup (\bigcup_{r \in Z_5 \cup Z_6 \cup Z_7} Z_r)$; $R$ has $n = \sum_{x \in X} x + \sum_{y \in Y} y + 7qm + 2m - 1$ elements, the maximal element being $n$. Let $R$ be an instance for the second problem.

Consider a solution $s_1, \ldots, s_m$ to the first problem. We construct a directed tree $T$ whose vertices are the elements of $R$ as follows: we set a row with the vertices $b_1 + 7q + 1, \ldots, b_m + 7q + 1$, and below a row with the vertices of $Z_1 \cup Z_2$. For every $1 \le i \le m$, we have $s_i = \{x_{i_1}, y_{i_2}\}$, $x_{i_1} + y_{i_2} = b_i$ and we insert edges from $b_i + 7q + 1$ to $x_{i_1} + 3q$ and $y_{i_2} + 4q$. We make $z_2 \in Z_4$ the father of $b_1 + 7q + 1$ and $b_2 + 7q + 1$, we make $z_3 \in Z_4$ the father of $z_2$ and $b_3 + 7q + 1$, and so on until we make $z_m$ the father of $z_{m-1}$ and $b_m + 7q + 1$. For every $1 \le i \le m$ we make $x_i + 3q$ the father of $x_i + q$ and $2q - 1$ and we make $y_i + 4q$ the father of $y_i + 2q$ and $2q - 1$. Below every vertex $r$, $r = x_i + q$ or $r = y_i + 2q$ or $r = 2q - 1$, we insert a directed path of vertices $r - 1, \ldots, 2, 1$. It is easy to see that $R$ is the endtree number deck of $T$ and $T$ is a tree rooted at a centroid having at most two sons per vertex.

Conversely, consider a solution $T$ to the instance of the second problem. Denote every vertex of $T$ by the cardinality of the subtree rooted at the vertex. A vertex in $Z_4$ is greater than $14q$, thus it can have as sons only vertices of $Z_4 \cup Z_3$. A vertex in $Z_3$ is greater than $7q$, thus it can have as sons only vertices of $Z_1 \cup Z_2$. A vertex $b_i + 7q + 1$ of $Z_3$ can have as sons exactly one vertex $x_{i_1} + 3q$ from $Z_1$ and one vertex $y_{i_2} + 4q$ from $Z_2$. For every $b_i$, let $s_i = \{x_{i_1}, y_{i_2}\}$ where $x_{i_1} + 3q$ and $y_{i_2} + 4q$ are the sons of $b_i + 7q + 1$. It is easy to see that $s_1, \ldots, s_m$ is a solution to the instance of the first problem.
Corollary 5: The recognition problem of an endtree number deck of a tree rooted at a centroid remains NP-complete when restricted to trees in which every vertex has at most two sons. The recognition problems for a total number deck of a tree and an edge number deck of a tree remain NP-complete when restricted to trees with vertices of maximum degree three.
4.4 NP-Completeness of the RND Problem

The Minimum Cover problem, known to be NP-complete [8], is defined as follows: consider a family $C$ of subsets of a finite set $S$ and a positive integer $k$, $k \le |C|$. Is there a subfamily $C'$ of $C$, with $|C'| \le k$, such that every element of $S$ belongs to a subset in $C'$?

Theorem 9: The Minimum Cover problem is reducible in polynomial time to the RND problem.

Proof:
Consider an instance $C = \{S_1, \ldots, S_n\}$, $S = \{s_1, \ldots, s_p\}$, $k$ to the Minimum Cover problem. We replace every element of $S$ by its index in $S$ plus $2n$ and restate the input as follows: $S$ is the set of integers from $2n + 1$ to $2n + p$ and $C$ is a family of multisets of $S$. We construct the corresponding instance to the RND problem as follows. To make things clearer, we construct the instance for RND together with a candidate tree. Let $m_i = \sum_{s \in S_i} s$, $M = 1 + \max_i m_i$, $N = (n + k + 1)M + 1$; $N$ is the number of vertices in the tree. The tree has a root with a multiset of $n + k + 1$ $M$'s. The root has a set $F_1$ of $n$ children corresponding to the elements of $C$, a set $F_2$ of $k$ children corresponding to the elements of $C'$ and an additional child $X$. To the children in $F_1$ correspond the multisets $S_1', \ldots, S_n'$, each $S_i'$ obtained from $S_i$ by appending to it $M - m_i - 1$ ones. To the children of $F_2$ correspond multisets which are copies of $S_1', \ldots, S_n'$. Consider a vertex of $F_1$, corresponding to some $S_i'$. Its children are $|S_i|$ vertices, corresponding to the elements of $S_i$ with multisets $(N - s_j, i + n, s_j - n - i - 1)$, and $M - m_i - 1$ terminal vertices. Let $F_4$ denote the set of children of the vertices in $F_1$, corresponding to the elements of $S$. A child of a vertex $S_i$ in $F_1$, corresponding to some $s_j$, has attached to it a path with $s_i - n$ vertices and a path with $n - 1$ vertices. The instance to the RND problem is the family of the above multisets with multiples deleted.

If the instance $C$, $S$, $k$ has a solution $C'$ for the Minimum Cover problem, then we assign the vertices of $F_2$ to the elements of $C'$. For a vertex in $F_2$ corresponding to some $S_i \in C'$, we join it by edges to the vertices of $F_3$ corresponding to the elements of $S_i$ and to $M - m_i - 1$ terminal vertices. If $|C'| < k$ we take $k - |C'|$ vertices of $F_2$ as copies of vertex $X$. It is easy to see that this tree is a solution to the corresponding instance of RND.

Conversely, consider a tree $T$ which is a solution to the instance of the RND problem. By the construction, the multisets corresponding to the vertices in $F_3$ and those corresponding to the vertices of $F_4$ are distinct, thus they are represented by distinct vertices in $T$. Also, the multisets corresponding to two vertices $s_j, s_j' \in F_4$ are distinct by their identity or by the identity of their fathers, thus they are represented by distinct vertices in $T$. Therefore in $T$ appear distinct vertices for the multisets corresponding to $F_3 \cup F_4$. If in $T$ a vertex of $F_3$, corresponding to some $s_j \in S$, is a son of a vertex in $F_1$, then a vertex of $F_4$ corresponding to $s_j$ is a son of a vertex in $F_2$ and they can be interchanged. In this way we obtain a tree in which the vertices in $F_3$ are children only to the vertices of $F_2$. Thus, the vertices in $F_2$ define a cover $C'$ of $S$.
Corollary 6:
The RND problem is NP-complete even for trees with vertices of maximum degree three. In this case the reduction is also from the Minimum Cover problem, but in the candidate tree the root is replaced by a path with $n + k + 1$ vertices to which the vertices of $F_1 \cup F_2 \cup \{X\}$ are attached, and every vertex $S_i$ in $F_1$ is replaced by a path with $|S_i|$ vertices to which the vertices in $F_4$ corresponding to the elements of $S_i$ are attached. Some minor changes complete the reduction.
5. Concluding Remarks

We have seen new aspects of the tree recognition and reconstruction problem. Being aware of the successes in recognizing and reconstructing trees from partial decks, we attempted to obtain recognition and reconstruction by using rather partial information from the deck, namely some numerical information such as the cardinalities of connected components in the cards of the deck. The numerical nature of the data led to algorithms whose computational complexity turns out to differ surprisingly for the ND and END problems: the first is polynomial while the second is NP-complete, even if reduced to trees with some particular structure. In [9] and recently in [5] more general numerical decks are used, based on cards like $T - \{v_1, \ldots, v_s\}$, called $s$-decks $D_s$; thus the usual decks are 1-decks. The larger $s$ is, the less the recognition and reconstruction from $D_s = \{D_1, D_2, \ldots, D_s\}$ depend on structural properties like similar vertices or isomorphic branches. It is also conjectured that every tree is reconstructible from $D_3$. It is very likely that the corresponding algorithms will turn out to be of low complexity.
References
[1] P.J. Kelly; A congruence theorem for trees, Pacific J. Math., 7, 961-968 (1957).
[2] F. Harary, E. Palmer; The reconstruction of a tree from its maximal subtrees, Canad. J. Math., 18, 803-810 (1966).
[3] Y. Caro, J. Schönheim; Decomposition of a tree into isomorphic subtrees, Ars Combinatoria, 9, 119-130 (1980).
[4] I. Krasikov, M.N. Ellingham and W.J. Myrvold; Legitimate number decks for trees, Ars Combinatoria, 21, 15-17 (1986).
[5] I. Krasikov; Interchanging branches and similarity in a tree, Graphs and Combinatorics, 7, 165-175 (1991).
[6] I. Krasikov and J. Schönheim; The reconstruction of a tree from its number deck, Discrete Math., 53, 137-145 (1985).
[7] F. Gavril and J. Schönheim; Constructing trees with prescribed cardinalities for the components of their vertex deleted subgraphs, J. of Algorithms, 6, 239-252 (1985).
[9] W.B. Giles; Reconstructing trees from two point deleted subtrees, Discrete Mathematics, 15, 325-332 (1976).
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 71-80 (1993)
© 1993 Elsevier Science Publishers B.V. All rights reserved.
THE COMPLEXITY OF COLOURING BY INFINITE VERTEX TRANSITIVE GRAPHS Bruce BAUSLAUGH Department of Mathematics and Statistics, Simon Fraser University Burnaby, British Columbia, CANADA
Abstract
For a fixed graph $H$, the homomorphism problem for $H$ is the problem of determining whether or not there is a homomorphism of a finite input graph $G$ into $H$. We show that there exist locally finite vertex-transitive graphs with unsolvable homomorphism problems, but that all recursive locally finite vertex-transitive graphs have solvable homomorphism problems.
1. Introduction
Given two graphs $G$ and $H$, a homomorphism from $G$ into $H$ is a mapping $F: V(G) \to V(H)$ such that $uv \in E(G)$ implies that $F(u)F(v) \in E(H)$. If such a homomorphism exists, we will write $G \to H$; otherwise we write $G \not\to H$. If $H$ is a subgraph of $G$, and $F(v) = v$ for all $v \in V(H)$, then $F$ is called a retraction of $G$, and $H$ is a retract of $G$. If $G$ has no retraction onto a proper subgraph then $G$ is retract-free. If $G \not\to H$ and $H \not\to G$ then $G$ and $H$ are homomorphism incompatible or simply incompatible. If a collection $\mathcal{G}$ of graphs is pairwise incompatible then $\mathcal{G}$ is called an incompatible family of graphs. If $H$ is a fixed graph, then the homomorphism problem for $H$, or the $H$-colouring problem, is the problem of determining whether or not there exists a homomorphism from a finite input graph $G$ into $H$. Note that if $H = K_n$, then $H$-colouring is equivalent to $n$-colouring. We define $og(G)$, the odd girth of the graph $G$, to be the size of the smallest odd cycle in $G$. A graph is locally finite if every vertex has finite degree, and is vertex-transitive if for every pair of vertices $u$ and $v$, there is an automorphism of the graph which maps $u$ to $v$. If $A$ is a set of words over a fixed alphabet, then the membership problem for $A$ is the problem of determining whether or not a given input is in the set $A$. If this problem is solvable, then $A$ is said to be a recursive set. A graph $G$ is recursive if the vertex-set $V$ and the edge-set $E$ of $G$ are both recursive.
When $H$ is restricted to be finite and undirected, $H$-colouring is known to have polynomial complexity when $H$ is bipartite, and is NP-complete otherwise [1]. When $H$ is a finite digraph, no such general result exists. Some partial results are derived in [2]-[5]. In [6], the case of countably infinite $H$-colouring is first examined. Therein, a general reduction from the Word Problem for Groups is given which yields recursive graphs with unsolvable homomorphism problems (indeed, with arbitrary recursively enumerable degrees of unsolvability) as well as recursive graphs with solvable homomorphism problems of arbitrarily high complexity. In addition, it is shown that all of the graphs obtained by the reduction have edge-decision problems whose complexities are bounded by the same recursive function. In this paper we show that all recursive vertex-transitive graphs have solvable homomorphism problems, but there are non-recursive vertex-transitive graphs with unsolvable homomorphism problems.

2. Preliminary Results

The following simple results may be easily proved by the reader.
Lemma 2.1: If $G_1 \to G_2$ and $G_2 \to G_3$, then $G_1 \to G_3$.

Lemma 2.2: If $G \to H$, then $\chi(G) \le \chi(H)$.

Lemma 2.3: Let $G$ and $H$ be non-bipartite. If $G \to H$, then $og(G) \ge og(H)$.

Lemma 2.4: If $G \to H$ and $H \to G$, then for any graph $X$, $X \to G$ if and only if $X \to H$.

Corollary 2.5: Let $H_1$ be a retract of $H$. Then $G \to H$ if and only if $G \to H_1$.
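For small finite graphs the relation $G \to H$ used in these lemmas can be tested by exhaustive search. The Python sketch below is purely illustrative: the function name and the edge-list input format are choices made here, and the search tries all $|V(H)|^{|V(G)|}$ mappings, so it is only meant for toy examples.

```python
from itertools import product

def has_homomorphism(G_vertices, G_edges, H_vertices, H_edges):
    """Return True if some map V(G) -> V(H) sends every edge of G to an edge of H."""
    # Store both orientations so that undirected edges can be looked up directly.
    H_adj = set(H_edges) | {(v, u) for u, v in H_edges}
    for image in product(H_vertices, repeat=len(G_vertices)):
        f = dict(zip(G_vertices, image))
        if all((f[u], f[v]) in H_adj for u, v in G_edges):
            return True
    return False
```

With the 5-cycle and the triangle, for example, the search finds $C_5 \to K_3$ but no homomorphism $K_3 \to C_5$, in agreement with Lemma 2.3, since $og(K_3) = 3 < 5 = og(C_5)$.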
The constructions in this paper will rely upon the existence of infinite incompatible families of graphs. That such exist is well known; for example, see [7]. However, for our purposes it will be useful to deal with a specific infinite incompatible family, defined as follows. Let $G_1 = K_5$, and let $G_i$, $i > 1$, be the smallest graph with $og(G_i) > og(G_{i-1})$ and $\chi(G_i) > \chi(G_{i-1})$. The existence of such graphs is implied by the results in [8]-[10] (the references [9] and [10] give constructive proofs). Our next lemma shows that $\{G_i : i \ge 1\}$ is an infinite incompatible family of graphs.
Lemma 2.6: If $i \ne j$, then $G_i$ and $G_j$ are incompatible.

Proof: If $G_i \to G_j$ then $\chi(G_i) \le \chi(G_j)$ by Lemma 2.2, so $i \le j$. Also by Lemma 2.3, $og(G_i) \ge og(G_j)$, so $i \ge j$. Thus, $i = j$.

Lemma 2.7: All $G_i$ are retract-free.

Proof: Suppose $G$ is a proper retract of $G_i$. Then $G \to G_i$ and $G_i \to G$. By applying Lemmas 2.2 and 2.3, we see that $\chi(G) = \chi(G_i)$ and $og(G) = og(G_i)$. However, $G_i$ is a minimal graph with these properties, and so $G = G_i$.
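The odd girth, one of the two invariants driving Lemmas 2.6 and 2.7, can be computed for a finite graph by breadth-first search. The following Python sketch is an illustration with a hypothetical function name; it relies on the standard observation that the shortest odd cycle has length $2d + 1$, where $d$ is the smallest level at which some breadth-first search finds an edge joining two vertices of the same level.

```python
from collections import deque

def odd_girth(adj):
    """Length of a shortest odd cycle of a finite graph, or None if it is bipartite."""
    best = None
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                elif dist[w] == dist[u]:
                    # An edge inside one BFS level closes an odd walk through s.
                    length = 2 * dist[u] + 1
                    if best is None or length < best:
                        best = length
    return best
```

The input is a dictionary mapping each vertex to the list of its neighbours; the function returns 3 for a triangle, 5 for a 5-cycle, and `None` for any bipartite graph such as a 4-cycle.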
Lemma 2.8: All Gi are connected.
Proof: If $G_i$ has two components $A$ and $B$ containing vertices $u$ and $v$, respectively, then we may color each of $A$ and $B$ with $\chi(G_i)$ colors so that $u$ and $v$ have the same color. Therefore, if we identify $u$ and $v$ (Figure 1) obtaining $H$, $\chi(H) = \chi(G_i)$. Clearly, $og(H) = og(G_i)$, and so $H$ has fewer vertices than $G_i$, the same odd girth and chromatic number, a contradiction.

This last result will be useful because the homomorphic image of a connected graph must also be connected.
Figure 1:

3. The Constructions
In this section we show that all recursive locally finite vertex-transitive graphs have solvable homomorphism problems, and give a method for constructing non-recursive locally finite vertex-transitive graphs with unsolvable homomorphism problems. To prove the former claim, we first note that given a vertex $v$ in a recursive locally finite vertex-transitive graph, we may determine its neighbourhood $N[v]$. This allows us to find the set of all vertices at distance no more than two from $v$, that is, $N[N[v]]$; the set of all vertices of distance no more than three from $v$, and so on. Given a vertex-transitive graph $G$, define $R(G)$ as follows. Choose some arbitrary vertex $v$ in $G$ which is called the center of $R(G)$. Let $R_i$, $i \ge 0$, be the subgraph induced by all vertices of distance no more than $i$ from $v$. Let $R(G)$ be the disjoint union of the $R_i$.

Lemma 3.1: Let $G$ be recursive, locally finite and vertex-transitive. Then $R(G)$ is recursive, and for all finite graphs $H$, $H \to G$ if and only if $H \to R(G)$.

Proof: Since we may explicitly construct each component of $R(G)$, it is recursive. Now, let $v$ be the center of $R(G)$. If $f: H \to G$ is a homomorphism, then let $k$ be the maximum distance from $v$ of any $f(x)$, $x \in H$. Then $H \to R_k$, so $H \to R(G)$. Also, $R(G) \to G$, so if $H \to R(G)$, then $H \to G$.

Theorem 3.2: All locally finite recursive vertex-transitive graphs have solvable homomorphism problems.
Proof: Let $G$ be recursive, locally finite and vertex-transitive, so $R(G)$ is defined. Let $v$ be the center of $R(G)$. Given an input graph $H$, let $u$ be any vertex of $H$. If $f: H \to G$ is a homomorphism, then let $w = f(u)$. Now let $g$ be an automorphism of $G$ which maps $w$ to $v$. The composition $g \circ f$ is a homomorphism from $H$ to $G$ which maps $u$ to $v$. Thus, we may arbitrarily choose some vertex $u$ of $H$ and we know that there is a homomorphism from $H$ to $G$ which maps $u$ to $v$ if and only if there is a homomorphism from $H$ to $G$. Assume without loss of generality that $H$ is connected and let $d = \mathrm{diameter}(H)$. Then $H \to G$ if and only if $H \to R_d$, and $R_d$ is finite.

It should be obvious that this result may be generalized to graphs with finitely many orbits in their automorphism groups. If we allow our graph to be non-recursive, however, $H$-colouring may be unsolvable, as the following construction shows. We begin with three definitions. A Directed Cayley Colour-Graph for a pair $(W, S)$, where $W$ is a group and $S$ is a set of elements from the group, is the graph whose set of vertices is the set of elements of $W$, and which has a directed edge of colour $s \in S$ from $u$ to $v$ if and only if $us = v$ in $W$. A finitely presented (or f.p.) group is a pair $(A; R)$, where $A$ is a finite set of generators $\{a_1, \ldots, a_n\}$, and $R$ is a finite set of equivalences of the form $w = e$, with $w$ an element of $(A \cup A^{-1})^*$, $A^{-1} = \{a^{-1} : a \in A\}$, and $e$ is the empty word. This pair defines a unique group $W$ whose elements are equivalence classes of $(A \cup A^{-1})^*$, with concatenation modulo $R$ as the group operation, and $e$ as the identity. The word problem for a finitely presented or finitely generated group $W$ is the problem of determining whether a given string $w$ is equivalent to the identity. There exist finitely presented groups for which this problem is unsolvable [11].

Let $G = (A; R)$ be a finitely presented group with an unsolvable word problem, and let $B$ be the Directed Cayley Colour-Graph $(G; A)$. We define a modified homomorphism problem for $B$, where edges in the input graph are directed and coloured and must be mapped to edges of the same direction and colour. This problem is unsolvable, since a word $w = a_1 \ldots a_k$ in $W$ is equal to the identity if and only if a directed $k$-cycle with edges coloured $a_1, \ldots, a_k$ maps into $B$. For the same reason, this homomorphism problem is unsolvable even when we restrict our input graphs to directed cycles. The general method of this proof is to reduce this modified homomorphism problem to a homomorphism problem in an undirected, uncoloured graph which is still vertex-transitive (even though it may no longer be a Cayley graph).
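The decision procedure behind Theorem 3.2 can be sketched in Python. The sketch assumes that the recursive graph $G$ is presented by a function `neighbours(x)` returning the finite list of neighbours of a vertex, that this relation is symmetric, and that the input graph $H$ is connected; the names `maps_into`, `line` and `band` are hypothetical. The search pins a vertex $u$ of $H$ to the centre $v$ and always maps a new vertex to a neighbour of an image already chosen, so it never leaves the finite ball $R_d$ and therefore terminates.

```python
def maps_into(H_adj, neighbours, v):
    """Decide H -> G for a finite connected H and a vertex-transitive G centred at v."""
    u = next(iter(H_adj))
    order, seen = [u], {u}
    for x in order:                      # BFS order: each later vertex of H
        for y in H_adj[x]:               # has a neighbour earlier in the order
            if y not in seen:
                seen.add(y)
                order.append(y)
    f = {u: v}                           # by vertex-transitivity u may be sent to v

    def extend(k):
        if k == len(order):
            return True
        x = order[k]
        placed = [y for y in H_adj[x] if y in f]
        for c in neighbours(f[placed[0]]):
            if all(c in neighbours(f[y]) for y in placed):
                f[x] = c
                if extend(k + 1):
                    return True
                del f[x]
        return False

    return extend(1)

def line(x):
    """Two-way infinite path on the integers."""
    return [x - 1, x + 1]

def band(x):
    """Cayley graph of the integers with generators 1 and 2."""
    return [x - 2, x - 1, x + 1, x + 2]
```

On these two example graphs the procedure reports that a 4-cycle maps into the infinite path while a 5-cycle does not, and that a triangle maps into the second graph while $K_4$ does not.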
Given a Directed Cayley Colour-Graph $G$, we define $U(G)$ as follows. We will base our construction on the graphs $G_i$ constructed in Section 2. If $G$ has $n$ different edge colours then we will need to use $G_1, \ldots, G_{3n}$. In each $G_i$ fix some arbitrary pair of adjacent vertices $u_i$ and $v_i$. Let $m$ be one more than the number of vertices of the largest $G_i$, and let $D$ be the graph obtained by taking two $m$-paths and connecting them so as to create a string of $K_4$'s (Figure 2).

Figure 2:

We now construct graphs $E_i$, $i = 0, \ldots, n - 1$, by joining $G_{3i+1}$, $G_{3i+2}$, and $G_{3i+3}$ by identifying the end vertices of four copies of $D$ to their respective $u_i$ and $v_i$, as indicated in Figure 3. Label the loose ends $a_1$, $a_2$, $b_1$, and $b_2$. Next, we replace each vertex $v$ in the Directed Cayley Colour-Graph $G$ by a copy of $K_2$ with vertices $v_1$ and $v_2$ and replace each directed edge $e = xy$ of colour $i$ with $E_i$, by identifying $a_t$ with $x_t$ and $b_t$ with $y_t$ for $t = 1, 2$.
Lemma 3.3: If $G$ and $H$ are Directed Cayley Colour-Graphs, then $G \to H$ if and only if $U(G) \to U(H)$.

Proof: Consider first the image of some $G_j$ in $U(G)$ under a homomorphism $f$. Because of the length of the copies of $D$ connecting different $G_k$, $f(G_j)$ may only intersect one $G_k$ in $U(H)$, call it $G_a$. Note that $G_j$ has chromatic number greater than four, $D$ has chromatic number equal to four, and $f(G_j)$ contains some portion of $G_a$ plus possibly some subgraphs of one or two copies of $D$ which intersect with $G_a$ in a $K_2$. Thus, the intersection of $f(G_j)$ and $G_a$ must have chromatic number at least as large as $\chi(G_j)$, so $a \ge j$. By definition of the $G_i$, $og(G_a) \ge og(G_j)$, and so the odd girth of the intersection of $f(G_j)$ and $G_a$ is also at least $og(G_j)$. Therefore, $f$ must be a one-to-one mapping of $G_j$ into $G_a$, or we would have a graph smaller than $G_j$ with equal or larger odd girth and chromatic number, a contradiction. It now easily follows by definition of the $G_i$ that $j = a$, and so $G_j$ and $G_a$ are isomorphic, and $f$ is an isomorphism. Also, since a copy of $G_{3i+2}$ is only attached by a copy of $D$ to one $G_{3i+1}$ and a $G_{3i+2}$ is only attached by a copy of $D$ to one $G_{3i+3}$, the subgraph $E_i$ must map onto another copy of $E_i$, and the $G_j$ within $E_i$ must map isomorphically to their copies. This forces the copies of $D$ to also map isomorphically, and so the $E_i$ will act exactly as the coloured edges in the original graph.

Figure 3:
Corollary 3.4: $U(B)$ has an unsolvable homomorphism problem.

The problem, of course, is that $U(B)$ is not vertex-transitive. However, it is clear that its automorphism group has only finitely many orbits. We can automorphically map any copy of $E_i$ onto any other copy of $E_i$, since any edge in a Cayley colour graph may be mapped to any other edge of the same colour by an automorphism. Thus, there are no more orbits than there are vertices in all of the different $E_i$. We will now construct a new graph $B^*$ which is vertex-transitive and also has an unsolvable homomorphism problem (see Figure 4). Let $k$ be the number of orbits in the automorphism group of $U(B)$. Note that each orbit contains an infinite number of vertices. Assume that the orbits are numbered $1, \ldots, k$ and the vertices within each orbit are numbered $1, 2, 3, \ldots$. Label the $j$th vertex in the $i$th orbit with [...]

Now, in a graph $B_p$, we will label the vertex [...]
Figure 4:

[...] equivalent under the substitutions given above ($(a,b)a = \emptyset$ and $(x,y)z(z,y) = (x,y)$), and labelling the resultant vertex with the shortest label of any of the original vertices. It will sometimes be convenient to refer to vertices by labels other than this one, and so when we say a vertex has label $p(i,j)$, we mean the shortest string equivalent to $p(i,j)$. The resulting graph is shown (in part) in Figure 4. Each vertex is incident to $k$ copies of $U(B)$, and occurs in a different orbit of the automorphism group of $U(B)$ in each one. For example, the vertex $(a,j)$ occurs in orbit $a$ in $B_\emptyset$, and for each $q \ne a$ it is the $j$th vertex of orbit $q$ in $B_{(a,j)q}$. The overall structure of $B^*$ is that of an infinite tree-structure with copies of $U(B)$ as nodes, and $B_\emptyset$ as the root. The labels of the vertices represent paths from the root to the vertex in question, that is, the label $(a_1,b_1)c_1(a_2,b_2) \ldots c_{n-1}(a_n,b_n)$ is interpreted as: begin with vertex $v_1 = (a_1,b_1)$ in $B_\emptyset$, move to the copy of $U(B)$ incident to $v_1$ in which $v_1$ is in orbit $c_1$, that is, $B_{(a_1,b_1)c_1}$, move to vertex $v_2 =$ [...]
effect. Note that B* is locally finite.
Lemma 3.5:
B* is vertex-transitive.
Proof: We claim that the following are automorphisms of B*.
(i) F_{(x,y)z}(p(a,b)) = (x,y)z p(a,b) is an automorphism for all x, y, z. That is, if we concatenate any fixed string (x,y)z to the left end of the label of every vertex in B* we obtain an automorphism of B*.
(ii) If f is an automorphism of U(B), then
G_f((a_1,b_1)c_1(a_2,b_2)c_2...c_{n-1}(a_n,b_n)) = (f(a_1,b_1))c_1(f(a_2,b_2))c_2...c_{n-1}(f(a_n,b_n))
is an automorphism of B*.
That these mappings preserve edges follows easily from the definition of B*. Furthermore, each mapping of form (i) has an inverse, specifically (F_{(x,y)z})^{-1} = F_{(z,y)x}, since
(z,y)x(x,y)z p(a,b) = (z,y)z p(a,b) = p(a,b)
for all vertices p(a,b). Each mapping of form (ii) also has an inverse, as (G_f)^{-1} = G_{f^{-1}}. Thus, all of these mappings are automorphisms.
Now we show that by means of these automorphisms, the vertex (1,1) may be mapped to any other vertex in B*, and so the automorphism group has only one orbit. To map (1,1) to (a_1,b_1)c_1...c_{n-1}(a_n,b_n), first find an automorphism f of U(B) that maps (1,1) to
Lemma 3.6:
B* has an unsolvable homomorphism problem.
Proof:
We claim that, if P is a directed coloured cycle, then U(P) → U(B) if and only if U(P) → B*. This will suffice because, as was stated earlier, the homomorphism problem for the Directed Cayley Colour-Graph B is unsolvable even when input is restricted to cycles. Thus, for a cycle P, P → B if and only if U(P) → U(B) if and only if U(P) → B*, and therefore if we could solve the homomorphism problem for B* we could solve the homomorphism problem for B with input restricted to cycles. We will now prove the above claim. Let C = U(P) for some directed coloured cycle P. Obviously C → U(B) implies C → B*. Suppose C → B*. If the image of C is contained in only one copy of U(B), then C → U(B). On the other hand, if the image intersects with more than one copy of U(B), then the image contains a cut point, since copies of U(B) intersect only in cut points of B*. We will show that this cannot happen. Let us first look at the image of some copy of G_i in C under a homomorphism f. If this f(G_i) contains a cut point, then all blocks in the image must have fewer vertices than G_i and each block must be contained in only one copy of U(B). Since the chromatic number of f(G_i) is at least as large as that of G_i, some block Z of f(G_i) must have chromatic number at least as large
as χ(G_i), which is at least five. Hence, Z cannot be entirely within a copy of D, which has chromatic number 4. Also, by the fact that the distance between the ends of D is larger than the size of any G_i, we know that Z cannot intersect with more than one G_j in U(B). Thus, it intersects with exactly one of them; call it G_j. It may of course also contain part of the copies of D incident with G_j. However, since D has chromatic number 4 and intersects with G_j in a K_2, we know that the chromatic number of Z is the same as the chromatic number of the intersection of Z and G_j. Thus, χ(G_j) ≥ χ(Z) ≥ χ(G_i), and so by the definition og(G_j) ≥ og(G_i). Now the intersection of Z and G_j has chromatic number at least that of G_i, odd girth at least that of G_i, and fewer vertices than G_j, a contradiction. We now know that f(G_i) cannot contain a cut point, and so must be contained in one copy of U(B), and so in fact G_i must map isomorphically to some copy of G_i, by our previous argument. Note also that the image of a K_4 under a homomorphism is always a K_4, which contains no cut point. We will now show that if we “glue” graphs G_i and K_4 together in a certain way, the new graphs also contain no cut point in their image in B*. If X and Y are two nonempty graphs, define X!Y to be the graph obtained by identifying some K_2 in X with one in Y (Figure 5).
Figure 5:
Suppose f is a homomorphism from X!Y into some fixed graph W, and suppose also that neither f(X) nor f(Y) contains a cut point. It is a simple exercise to show that f(X!Y) does not contain a cut point. We can “almost” construct C just using copies of various G_i and copies of K_4 joined using the ! operation in the obvious way. Specifically, we may construct the graph C' in Figure 6, which is the same as C except that it is missing the four dashed edges. However, C' is a spanning subgraph of C, so whenever C has an image in B* with connectivity k, C' has an image in B* with connectivity k' ≤ k. This is because every image of C contains an image of C' as a spanning subgraph. Thus, since any image in B* of the graph in Figure 6 (without the dashed edges) has connectivity at least two, so does any image of C. Therefore, if C → B* then the image of C is contained in one copy of U(B), so C → U(B).
Corollary 3.7: There exist locally finite vertex-transitive graphs with unsolvable homomorphism problems.
We may well ask at this point whether there are any interesting families of graphs for
which the homomorphism problem is easily solved. It appears that a high degree of symmetry is of no help, since any class containing all vertex-transitive graphs contains unsolvable instances.
Figure 6: (figure not reproduced; one region of the drawing is labelled "Remainder of C")
Acknowledgements
The author wishes to thank Pavol Hell for his excellent supervision throughout the course of this research. This work was supported in part by the Natural Sciences and Engineering Research Council of Canada.
References
P. Hell and J. Nešetřil; On the complexity of H-coloring, J. Comb. Theory Ser. B, 48, 92-110 (1990).
J. Bang-Jensen and P. Hell; The effect of two cycles on the complexity of colorings by directed graphs, Discrete Applied Math., 26, 1-23 (1990).
J. Bang-Jensen, P. Hell and G. MacGillivray; The complexity of coloring by semicomplete digraphs, SIAM J. Discrete Math., 1, 281-298 (1988).
J. Bang-Jensen, P. Hell and G. MacGillivray; On the complexity of coloring by supergraphs of bipartite graphs, Simon Fraser University, CSS/LCCR TR90-01 (1990).
G. MacGillivray; The Complexity of Generalized Colorings, Ph.D. Thesis, Simon Fraser University (1990).
B. Bauslaugh; The complexity of infinite H-coloring, J. Comb. Theory Ser. B (submitted).
P. Hell; On some strongly rigid families of graphs and the full embeddings they induce, Algebra Universalis, 4, 108-126 (1974).
P. Erdős; Graph theory and probability, Canad. J. Math., 11, 34-38 (1959).
L. Lovász; On chromatic number of finite set systems, Acta Math. Acad. Sci. Hung., 19, 59-67 (1968).
J. Nešetřil and V. Rödl; A short proof of the existence of highly chromatic hypergraphs without short cycles, J. Comb. Theory Ser. B, 27, 225-227 (1979).
J.R. Shoenfield; Mathematical Logic, Addison-Wesley, Reading, Massachusetts (1967).
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 81-88 (1993) © 1993 Elsevier Science Publishers B.V. All rights reserved.
RAINBOW SUBGRAPHS IN EDGE-COLORINGS OF COMPLETE GRAPHS
Paul ERDŐS and Zsolt TUZA
Hungarian Academy of Sciences, Budapest, HUNGARY
Abstract
We raise the following problem. Let F be a given graph with e edges. Consider the edge colorings of K_n (n large) with e colors, such that every vertex has degree at least d in each color (d < n/e). For which values of d does every such edge coloring contain a subgraph isomorphic to F, all of whose edges have distinct colors? The case when F is the triangle K_3 is well understood, but for other graphs F many interesting questions remain open, even for d-regular colorings when n = de + 1.
1. Problems
Let F and G be two graphs and f an edge coloring of G. Then F is said to be a rainbow subgraph (or totally multicolored subgraph) of G if G contains a subgraph isomorphic to F, all of whose edges are assigned distinct colors. The existence of rainbow subgraphs has been investigated under various conditions in several papers; see for example [1]-[8]. In this note we are interested in rainbow subgraphs of edge-colored complete graphs when some degree conditions are imposed on f. Let e, d, and n be natural numbers, n > de. Call f an (e,d)-coloring of K_n if it assigns precisely e colors to the edges of the complete graph of order n (one color to each edge) in such a way that every vertex is incident to at least d edges of each color.
Notation
Let F be a graph with e edges. Let d(n, F) = ∞ if K_n has an (e, ⌊(n-1)/e⌋)-coloring without a rainbow F, where ⌊x⌋ denotes the largest integer not exceeding x; otherwise we define d(n, F) as the smallest integer d such that every (e,d)-coloring of K_n contains a rainbow copy of F. The basic problem raised here is to determine how 'uniform' a coloring should be in order to insure the existence of a rainbow subgraph of a given type. Certainly, the first question is to settle whether a rainbow F can be forced at all.
Problem 1:
Is d(n, F) finite for every graph F and every sufficiently large n ≡ 1 (mod e)?
It is necessary to assume the condition n ≡ 1 (mod e) in Problem 1. We shall show that there are infinite classes of graphs F for which d(n, F) = ∞ for every positive n ≡ 0 (mod e).
Problem 2:
For which graphs F is d(n, F) finite for every sufficiently large n?
Although these questions seem to be innocent, so far the trees, the triangle K_3, and the 4-cycle C_4 are the only graphs for which we can prove that they satisfy the requirements of Problems 1 and 2. The simplest candidates for counterexamples to Problem 1 might be K_4, C_6, and 2K_3; we could not prove or disprove that every t-regular 6-coloring of K_{6t+1} contains them as rainbow subgraphs (here t has to be even). Similarly, simple examples would be the
5-cycle C_5, or the graph with 5 vertices and 6 edges that form two triangles sharing a vertex. Assuming that d(n, F) is finite for all large n, it becomes an interesting question to determine the growth rate of d(n, F). We put this question in the following asymptotic form.
Problem 3:
Find the smallest (or infimum) value of the constant c = c(F) such that d(n, F) ≤ cn holds for every large n.
Here the most interesting problem is to decide whether c < 1/e holds, where e denotes the number of edges in F. More generally, we can ask
Problem 4:
Given a graph F and natural numbers n and k, k ≥ |E(F)|, determine the smallest d such that every (k,d)-coloring of K_n contains a rainbow F.
For the triangle we give a complete solution in Section 2, but for other graphs only poor estimates are known, and the problem for k > e has not yet been investigated extensively. Perhaps the case k = e + 1 has a considerably different nature than that of k = e, and the answer to the following question will be affirmative.
Problem 5:
Does every (e+1, ⌊(n-1)/(e+1)⌋)-coloring of K_n contain every graph F of e edges as a rainbow subgraph?
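For small cases, the notions of (k,d)-coloring and rainbow subgraph used in Problems 1 through 5 can be explored by brute force. The following is only an illustrative sketch, not a method from the paper: the function names, the dictionary representation of a coloring, and the example coloring of K_7 by circular distance are all assumptions made for this example.

```python
from itertools import combinations, permutations

def color_degrees(n, coloring):
    """Return deg, where deg[v][c] is the number of edges of color c at vertex v."""
    deg = [dict() for _ in range(n)]
    for (u, v), c in coloring.items():
        deg[u][c] = deg[u].get(c, 0) + 1
        deg[v][c] = deg[v].get(c, 0) + 1
    return deg

def is_kd_coloring(n, coloring, k, d):
    """True if exactly k colors occur on K_n and every vertex is incident
    to at least d edges of each color."""
    colors = set(coloring.values())
    if len(coloring) != n * (n - 1) // 2 or len(colors) != k:
        return False
    deg = color_degrees(n, coloring)
    return all(deg[v].get(c, 0) >= d for v in range(n) for c in colors)

def has_rainbow_copy(n, coloring, f_edges):
    """Brute force: try every injection of V(F) into V(K_n)."""
    f_vertices = sorted({x for edge in f_edges for x in edge})
    for image in permutations(range(n), len(f_vertices)):
        phi = dict(zip(f_vertices, image))
        used = set()
        for a, b in f_edges:
            u, v = sorted((phi[a], phi[b]))
            c = coloring[(u, v)]
            if c in used:
                break          # two edges share a color: not rainbow
            used.add(c)
        else:
            return True
    return False

def circular_coloring(n):
    """Hypothetical example: color edge ij by the circular distance of i and j."""
    return {(i, j): min(j - i, n - (j - i)) for i, j in combinations(range(n), 2)}

coloring7 = circular_coloring(7)            # 3 colors, each vertex has degree 2 per color
triangle = [(0, 1), (1, 2), (0, 2)]
four_cycle = [(0, 1), (1, 2), (2, 3), (0, 3)]
print(is_kd_coloring(7, coloring7, 3, 2))
print(has_rainbow_copy(7, coloring7, triangle))
```

Here n = de + 1 with d = 2 and e = 3, which is the regular situation singled out in the abstract. The search is exponential in the order of F, so it is only practical for very small n.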
Problems 1 through 5 are just a few typical examples of the numerous questions that can be asked which, as far as we know, have not been explored. For instance, determine the smallest number k = k(F) of colors such that every k-coloring of K_n with minimum degree at least (1 - c)n/k in each color (for some positive constant c) contains a rainbow F. It would be interesting to prove (if true) that k(F) = |E(F)| + 1 holds for every F, but perhaps k(F) is larger for some graphs F. In the latter case the next question would be how rapidly k(F) can grow as a function of the order or size of F.
2. Results
We begin this section with a general result that can also be viewed as a sufficient condition insuring a rainbow triangle.
Theorem 1:
Let f be an edge coloring of K_n, with an arbitrary number of colors, and denote by k(i) the number of colors occurring on the edges incident to the i-th vertex (1 ≤ i ≤ n). If f contains no rainbow K_3, then
∑_{1 ≤ i ≤ n} 2^{-k(i)} ≥ 1.    (1)
Theorem 2:
For k ≥ 3, every (k, 2⌊(⌊n/2^{k-2}⌋ - 1)/4⌋ + 1)-coloring of K_n contains a rainbow K_3 (n ≥ 2^{k-2}). Moreover, replacing 2⌊(⌊n/2^{k-2}⌋ - 1)/4⌋ + 1 by 2⌊(⌊n/2^{k-2}⌋ - 1)/4⌋ the conclusion does not hold anymore. In particular, d(n, K_3) = 2⌊(⌊n/2⌋ - 1)/4⌋ + 1 = 2⌊(n - 2)/8⌋ + 1.
For the particular case k = 3, a slightly weaker form of Theorem 2 was also proved by Kostochka (private communication).
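Inequality (1) can be checked numerically on small colorings. The sketch below is an illustration under stated assumptions, not part of the paper: the two example colorings (edge ij colored max(i, j), and the XOR coloring of K_4) and all function names are chosen here for convenience. The first coloring has no rainbow triangle and attains equality in (1); the second has rainbow triangles and its sum falls below 1.

```python
from fractions import Fraction
from itertools import combinations

def has_rainbow_triangle(n, color):
    """color(i, j) is a symmetric function giving the color of edge ij."""
    return any(len({color(a, b), color(a, c), color(b, c)}) == 3
               for a, b, c in combinations(range(n), 3))

def theorem1_sum(n, color):
    """Exact value of the sum of 2^(-k(i)) over all vertices i."""
    total = Fraction(0)
    for i in range(n):
        k_i = len({color(i, j) for j in range(n) if j != i})
        total += Fraction(1, 2 ** k_i)
    return total

def max_color(i, j):
    # Assumed example: in any triangle a < b < c, edges ac and bc both get color c.
    return max(i, j)

def xor_color(i, j):
    # Assumed example: proper 3-coloring of K_4 on vertices 0..3.
    return i ^ j

print(has_rainbow_triangle(6, max_color), theorem1_sum(6, max_color))
print(has_rainbow_triangle(4, xor_color), theorem1_sum(4, xor_color))
```

Exact rational arithmetic is used so that the boundary case, where the sum equals 1, is not blurred by floating-point rounding.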
Theorem 3:
⌊n/6⌋ ≤ d(n, C_4) ≤ (1/4 - c)n for some positive constant c.
The largest possible value of c, for which the upper bound on d(n, C_4) remains valid, is not known.
Proposition 1:
If F is a tree with e edges, then d(n, F) ≤ e - 1. If F is a forest with e edges, then d(n, F) ≤ 2e - 2. Moreover, for n large, those upper bounds can be improved to e - 2 and 2e - 3, respectively.
Most probably, the bounds given in Proposition 1 are very far from being best possible. It may be the case that if n is sufficiently large and every vertex is incident to at least one edge in each of the e colors, then a rainbow F always occurs. This strong property may hold for all forests; for instance, we can prove it for matchings of arbitrary size. The next observation shows, however, that such a sharp result fails to be valid whenever F contains a cycle; in fact, K_3 turns out to be extremal among those graphs, for any number of colors. In order to formulate the statement, denote by d(n, F; k) the smallest integer d such that every (k,d)-coloring of K_n contains a rainbow F. Recall that Theorem 2 gives the exact value of d(n, K_3; k) for every k.
Theorem 4:
If a graph F is not a forest, then d(n, F; k) ≥ d(n, K_3; k) for every n and k.
Let us say that a graph F with e edges is of type I if for infinitely many values of n = de, K_n admits an (e, d-1)-coloring with no rainbow F.
Theorem 5:
(i) If F is a graph with e edges, such that every vertex of F has an even degree and e ≡ 2 (mod 4), then F is of type I.
(ii) If each edge of a graph F with e edges is contained in a triangle, and e is even, then F is of type I.
(iii) If each edge of a graph F with e edges is contained in a triangle, e is odd, and K_e has a proper e-coloring (that is, an edge coloring with e colors in which no vertex is incident to a pair of monochromatic edges) without a rainbow homomorphic image of F, then F is of type I.
3. Proofs
3.1 Triangles
Proof of Theorem 1:
Let E_1, ..., E_k be the monochromatic classes of edges in an edge coloring f of K_n with an arbitrary number k of colors, without a rainbow triangle. For k ≤ 2, each term in (1) is at least 1/4 and therefore (1) trivially holds for n ≥ 4; also, it is easy to check the inequality for 1 ≤ n ≤ 3. For k ≥ 3 we are going to apply induction on n (and in some cases on k as well), assuming that (1) holds for every smaller complete graph. It follows from the results of Gallai [9] (see also [10]) that at most two of the E_j can be connected spanning subgraphs of K_n. Assuming that these are E_1 and/or E_2, and that k ≥ 3, let K' be a connected component of E_3. The connectivity of K' implies that for each vertex v ∉ K', all edges joining v with K' have the same color. Indeed, if f(vx) ≠ f(vy) for some vertices x, y ∈ K', then consider an x-y path P of color 3 in K'. This P contains two consecutive vertices x' and y' such that f(vx') ≠ f(vy') holds. Moreover, since v does not belong to the component K', we also have f(vx') ≠ 3 ≠ f(vy'), so that {v, x', y'} induces a rainbow triangle, contradicting the assumption on f.
Let v_1, ..., v_{n'} be the vertices not belonging to K', and v_{n'+1}, ..., v_n the vertices of K'. Here 1 ≤ n' ≤ n - 2 holds. Denote by k'(i) the number of colors incident to v_i in the subgraph induced by K' (n'+1 ≤ i ≤ n), and by k'(v) the number of colors assigned to edges between {v_1, ..., v_{n'}} and K'. On one hand, K' induces no rainbow triangle, so that the induction hypothesis yields
∑_{i=n'+1}^{n} 2^{-k'(i)} ≥ 1.    (2)
On the other hand, contracting K' to a single vertex v, we obtain a coloring of a complete graph K_{n'+1} without rainbow triangles, in which the i-th vertex is incident to edges of k(i) colors (1 ≤ i ≤ n') and v is incident to k'(v) colors (since the star formed by the v'-K' edges is monochromatic for each v' ∉ K'). Thus, for K_{n'+1} the induction hypothesis yields
∑_{i=1}^{n'} 2^{-k(i)} + 2^{-k'(v)} ≥ 1.    (3)
Taking into account that k(i) ≤ k'(i) + k'(v) for n' < i ≤ n, and applying (2), we obtain
∑_{i=n'+1}^{n} 2^{-k(i)} ≥ 2^{-k'(v)} ∑_{i=n'+1}^{n} 2^{-k'(i)} ≥ 2^{-k'(v)}.    (4)
Then (3) and (4) imply (1).
Proof of Theorem 2:
First we show that there exists an edge coloring of K_n with k colors (k ≥ 2) with minimum degree at least 2⌊(⌊n/2^{k-2}⌋ - 1)/4⌋ in each color, such that no rainbow triangle occurs. Partition the n vertices into m = 2^{k-2} sets V_1, ..., V_m, each of cardinality ⌊n/2^{k-2}⌋ or ⌊n/2^{k-2}⌋ + 1. If k ≥ 3, then assigning color k to all edges between V_1 ∪ ... ∪ V_{m/2} and V_{m/2+1} ∪ ... ∪ V_m, the problem is reduced to finding a coloring with k - 1 colors on vertex sets of sizes ⌊n/2⌋ and ⌊(n+1)/2⌋, with the same degree condition. Hence, we may assume that k = 2 and m = 1. Representing K_n by the vertices of a regular n-gon, a suitable coloring is obtained when we join two vertices in color 1 if and only if their distance on the periphery of the n-gon is less than n/4.
To prove that larger degrees force a rainbow triangle, let E_1, ..., E_k be the monochromatic classes of edges in an edge coloring f of K_n with k ≥ 2 colors. As in the previous proof, we apply the fact that the deletion of E_1 or E_1 ∪ E_2 makes K_n disconnected. In the first case the smallest connected component has at most ⌊n/2⌋ vertices, so that an inductive proof works from k - 1 to k. For a similar reason, in the second case it will be enough to prove that K_n \ (E_1 ∪ E_2) has at least four components.
Let K' be a component of K_n \ (E_1 ∪ E_2). We show that K' has a connected spanning subgraph in some color c ≥ 3. Let K'' be a monochromatic component of K' with a maximum number of vertices, the color of which is different from 1 and 2. We prove that K'' contains all vertices of K'. Otherwise, since the colors 1 and 2 do not disconnect K', there is an edge joining K'' with some vertex v ∈ K' \ K'' in some color c ≥ 3. In this case, however, all edges from v to K'' have color c, providing a larger connected component and contradicting the choice of K''.
Choose a monochromatic connected spanning subgraph, with color other than 1 and 2, in each component of K_n \ (E_1 ∪ E_2). Those subgraphs insure that all edges joining two components have the same color. Thus, if the number of components is t, then replacing each of those components by a single vertex we obtain an edge coloring of K_t with two colors, without a monochromatic cut. This property trivially implies t ≥ 4.
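The recursive coloring described at the start of the proof of Theorem 2 is easy to build and test on small cases. The sketch below is one possible reading of that construction, with assumptions made for illustration: vertices are halved recursively (so part sizes differ by at most one), the base case colors a part of size s by circular distance on an s-gon, and all function names are hypothetical.

```python
from itertools import combinations

def extremal_coloring(vertices, k, coloring=None):
    """Color the complete graph on `vertices` with colors 1..k (k >= 2).
    Color k joins the two halves; each half is colored recursively with k - 1 colors.
    Base case k = 2: color 1 iff the circular distance is less than (part size)/4."""
    if coloring is None:
        coloring = {}
    n = len(vertices)
    if k == 2:
        for a, b in combinations(range(n), 2):
            dist = min(b - a, n - (b - a))
            coloring[frozenset((vertices[a], vertices[b]))] = 1 if 4 * dist < n else 2
        return coloring
    half = n // 2
    left, right = vertices[:half], vertices[half:]
    for u in left:
        for v in right:
            coloring[frozenset((u, v))] = k
    extremal_coloring(left, k - 1, coloring)
    extremal_coloring(right, k - 1, coloring)
    return coloring

def has_rainbow_triangle(n, coloring):
    return any(len({coloring[frozenset(p)] for p in combinations(t, 2)}) == 3
               for t in combinations(range(n), 3))

def min_color_degree(n, k, coloring):
    """Smallest number of edges of a single color at a single vertex."""
    return min(sum(1 for u in range(n) if u != v and coloring[frozenset((u, v))] == c)
               for v in range(n) for c in range(1, k + 1))

def degree_bound(n, k):
    """The value 2*floor((floor(n / 2^(k-2)) - 1) / 4) from Theorem 2."""
    return 2 * ((n // 2 ** (k - 2) - 1) // 4)

col = extremal_coloring(list(range(16)), 3)
print(has_rainbow_triangle(16, col), min_color_degree(16, 3, col), degree_bound(16, 3))
```

Any triangle that meets both halves at some level of the recursion has two edges of that level's color, and the base case uses only two colors, which is why no rainbow triangle can appear.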
Proof of Theorem 4:
Consider the coloring, extremal for K_3, defined recursively at the beginning of the previous proof. Assuming that the colors have been assigned in a decreasing order, for every i (2 ≤ i ≤ k) the edges of colors 1, ..., i together form a graph, say G_i, all of whose connected components are complete graphs. It is enough to show that none of those G_i contain any rainbow cycle. Suppose to the contrary that C is a rainbow cycle, and let i be the smallest subscript such that some component G' of G_i contains C as a subgraph. Certainly, i ≥ |C| ≥ 3, and C has an edge f of color i. This f joins the two components of G_{i-1} contained in G', however, so that C should have one more edge of color i, a contradiction.
3.2 Cycles of Length Four
Proof of Theorem 3:
The lower bound is easy to see by splitting the vertex set into two parts V_1 and V_2 of cardinalities ⌊n/2⌋ and ⌊(n+1)/2⌋ respectively, coloring all V_1-V_2 edges with color 1, and coloring the edges inside each V_j almost regularly with colors 2, 3, and 4. The latter can be done, for example, by viewing V_j as the vertex set of a regular |V_j|-gon, assigning colors to the edges according to their lengths mod 3 (and modifying the coloring, if necessary, within the classes belonging to lengths 1 and 2).
We give a proof of the upper bound by contradiction, assuming that for every ε and every n ≥ n(ε) there is an edge coloring f of K_n in which every vertex has degree at least (1/4 - ε)n in each of the four colors, and no rainbow 4-cycle occurs in f. Let ε be very small. Since every monochromatic degree is between (1/4 - ε)n and (1/4 + 3ε)n, each vertex is incident to at most ((1/4 + 3ε)^2 + 3(1/4 - ε)^2) n^2/2 monochromatic pairs of edges, and this number is just slightly more than 1/4 of the total number of pairs. Hence, viewing the situation from the side of pairs, the average number of vertices adjacent to a vertex pair xy by edges of the same color is less than (1/4 + ε_1)n for some small ε_1. We shall prove that almost every pair has nearly n/4 common neighbors in some color.
For each pair xy of vertices define a 4 by 4 matrix M_xy in which the entry a_ij (1 ≤ i, j ≤ 4) is the number of vertices adjacent to x in color i and to y in color j. We have to show that ∑ a_ii is close to n/4 for most of the xy. Since 4-colored 4-cycles are excluded, a_ij ≠ 0 ≠ a_kl implies that {i, j, k, l} ≠ {1, 2, 3, 4}. Consequently, there is an m (1 ≤ m ≤ 4) such that either all non-zero non-diagonal entries are in the m-th row or column (Case A), or no non-zero non-diagonal entry occurs in the m-th row and in the m-th column (Case B). Recall that all row sums and column sums in M_xy are between (1/4 - ε)n and (1/4 + 3ε)n. Thus, in Case A the entries not in the diagonal can sum up to at most (1/2 + ε_2)n, so that the ratio of diagonal entries is at least 1/2 - ε_2, and therefore at most ε_3 n^2 of the M_xy can be of this type. On the other hand, in Case B the diagonal entry a_mm already has a value at least (1/4 - ε)n, so that in all but at most ε_3 n^2 of the M_xy the other a_ii (i ≠ m) must be small and, moreover, in color m the vertices x and y have the same neighborhood. We conclude that for almost every pair xy there is a color m (1 ≤ m ≤ 4) such that x and y have
the same neighbors in color m, at least (1/4 - ε)n of them (since a_mm is the unique non-zero entry in the m-th row and column of M_xy), and in every other color x and y have less than ε_4 n common neighbors. Applying this fact we define another edge coloring φ of K_n with 5 colors as follows. If M_xy is of type A, or x and y have more than ε_4 n common neighbors in at least two colors, then let φ(xy) = 5. Otherwise, let φ(xy) = m, where m is the color in which x and y have the same neighborhood. By the previous argument, very few pairs are assigned to color 5. Observe further that if a triangle T has two edges of color m ≤ 4 in φ, then T either is monochromatic or its third edge is of color 5. Indeed, if each of xy and yz has an identical neighborhood in color m, then the neighborhoods of x and z are the same as well, hence φ(xz) ≠ m only if xz is of type A; that is, when φ(xz) = 5, by definition. If some vertex x had degree larger than (1/4 + 3ε)n in some color m ≤ 4 in φ, then that many vertices had a common neighbor z in color m in the coloring f as well, contradicting the degree assumption on f. Moreover, since the number of color-5 edges is small, the set V' of vertices incident to more than ε_5 n edges of color 5 is small, too. Choose a vertex x ∉ V'. It has degree at least (1/4 - ε_6)n in color 1 in the coloring φ. Since the set V'' of color-1 neighbors of x which do not belong to V' induces very few edges of color 5 and no edge of color i for 2 ≤ i ≤ 4, by Turán's theorem there is a K_4 of color 1 in V''. Denote the vertices of this K_4 by v_1, ..., v_4. Since the v_i are in V'', they have degrees at least (1/4 - ε_6)n in color 2 in the coloring φ, and those color-2 neighbors do not belong to the color-1 neighborhood of x. Hence, if 5(1/4 - ε_6) > 1 (and this inequality always holds when the initially chosen ε is sufficiently small), then two of those v_i, say v_1 and v_2, have a common neighbor v_0 in color 2.
In this case, however, the set {v_0, v_1, v_2} induces a triangle T that contains two edges of color 2 and one edge of color 1 in φ. This final contradiction proves the theorem.
3.3 Forests
Proof of Proposition 1:
Let us begin with the remark that the proposition can be strengthened in two ways simultaneously. On one hand, all edge-colorings f of K_n with an arbitrary number of colors contain a rainbow F whenever each color satisfies the corresponding minimum-degree condition e - 1, 2e - 2, etc. On the other hand, one vertex v of F can be chosen arbitrarily in K_n; v can be either any vertex of F (when the degrees are at least e - 1 and 2e - 2, respectively) or any vertex with a degree 1 neighbor v' (when the degrees are one smaller). It will be convenient to prove Proposition 1 in this stronger form.
In either case, we take a sequence F_0, F_1, F_2, ..., F_e of trees (forests) such that F_0 = v, F_e = F, and the tree F_j of j edges is obtained from F_{j+1} by deleting a degree 1 vertex distinct from v or, if F is a forest, then possibly an isolated edge of F_{j+1} which is not incident to v. (Such a sequence exists, for any tree of at least 2 vertices has at least two vertices of degree 1.) A rainbow F can be built up recursively, as follows. First we map F_0 onto the prescribed vertex of K_n. Having mapped F_j (0 ≤ j < e) onto a rainbow subgraph of K_n, consider the vertex v'' ∈ F_j incident to the edge F_{j+1} \ F_j. (If the edge in question is isolated in F_{j+1}, then one can take an arbitrary v'' ∈ K_n \ F_j.) Assuming that v'' is incident to edges of k ≥ e distinct colors, choose one of the k - j ≥ 1 colors, say c, not contained in F_j. If the stronger degree condition (e - 1 or 2e - 2) is satisfied, or if the edges starting at v are assigned to more than e colors, then c can be chosen arbitrarily. Otherwise, for n large, there is a color, say color 1, whose degree at v is at least e - 1. In this case we let c ≠ 1 in the first e - 1 steps, and let c = 1 when building F_e from F_{e-1}.
Since v'' can be adjacent to at most j - 1 < e - 1 vertices of the tree F_j in the color chosen for F_{j+1} \ F_j (or to 2j - 2 vertices when F is a forest), we always find a rainbow F_{e-1} rooted at v, and also a rainbow F_e if the stronger degree conditions are fulfilled, as well as in the case when v'' has a large degree in color 1. (In forests, either v'' is non-isolated in F_{e-1}, yielding that the procedure can be completed to a rainbow F_e, or else we have a free choice for v'' in which case we take any vertex adjacent to F_{e-1} with a color that already appears in F_{e-1}.) On the other hand, if v'' is incident to more than e colors, then there are at least two of them not occurring in F_{e-1}, and those colors provide at least 2e - 4 ≥ e - 1 (respectively, 4e - 6 ≥ 2e) possible ways to extend F_{e-1} to F_e in the non-trivial cases e ≥ 3.
3.4 Colorings of K_{de}
Proof of Theorem 5:
(i) Suppose that e = 4t + 2. We consider the following coloring of K_n (constructed by Brightwell and Trotter [11] for the case when F is a cycle). Let n = (4t + 2)d, d > 0. Partition the vertex set of K_n into two equal parts V' and V''. One can always color the edges within V', as well as within V'', with colors 1, 2, ..., 2t + 1 in such a way that each vertex has degree d or d - 1 in each color. Moreover, the complete bipartite graph formed by the V'-V'' edges has a 1-factorization, the d(2t + 1) classes of which can be used to define a d-regular coloring with colors 2t + 2, 2t + 3, ..., 4t + 2.
To see that the edge coloring of K_n obtained in this way contains no rainbow F, recall that the assumption of even degrees at all vertices implies that every connected component of F has an Eulerian cycle. Clearly, each of those cycles contains an even number of V'-V'' edges. On the other hand, the number of colors joining V' with V'' is odd, so that F cannot contain precisely one of each of them.
(ii) Since e is even, K_e has a 1-factorization into e - 1 classes of edges. For d ≥ 2, replace each vertex v of K_e by a set S(v) of d vertices; color the edges joining distinct sets with their colors in K_e, and color all edges within a set with color e. A copy of F in K_{de} could be rainbow only if it contained an edge f of color e. However, this edge belongs to some triangle T in F, and the third vertex of T should belong to a set S(v) distinct from the set S(v') containing f. It follows that the two edges of T, other than f, must have the same color and therefore F cannot be a rainbow subgraph.
(iii) Start with a coloring f of K_e with no rainbow subgraph homomorphic to F, and apply the substitution described in (ii) with the following modification. Within an S(v), color all edges with the (unique) color that does not occur on the edges incident to v. If F contains some edge of an S(v), then F is not rainbow for the reason explained in (ii). On the other hand, if F has no edge in any S(v), then contracting each S(v) to v we obtain a color-preserving homomorphic embedding of F into K_e, so that F cannot be rainbow by the assumption on f.
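The blow-up construction in part (ii) can be carried out explicitly for a small case. The sketch below takes F = K_4 (e = 6 edges, each in a triangle) and d = 2; these choices, the function names, and the use of the standard round-robin ("circle") method to obtain a 1-factorization of K_e are assumptions of this illustration rather than details given in the proof.

```python
from itertools import combinations

def one_factorization(m):
    """Round-robin (circle) method: m - 1 perfect matchings of K_m, m even."""
    assert m % 2 == 0
    rounds = []
    for r in range(m - 1):
        matching = [(r, m - 1)]
        for i in range(1, m // 2):
            matching.append(((r + i) % (m - 1), (r - i) % (m - 1)))
        rounds.append(matching)
    return rounds

def blow_up_coloring(e, d):
    """Coloring of K_{ed}: vertex x belongs to S(x // d); edges between distinct
    sets inherit the color from K_e, edges inside a set get color e."""
    base = {}
    for c, matching in enumerate(one_factorization(e), start=1):
        for u, v in matching:
            base[frozenset((u, v))] = c
    n = e * d
    coloring = {}
    for x, y in combinations(range(n), 2):
        u, v = x // d, y // d
        coloring[frozenset((x, y))] = e if u == v else base[frozenset((u, v))]
    return n, coloring

def has_rainbow_k4(n, coloring):
    """A K_4 is rainbow when its 6 edges carry 6 distinct colors."""
    return any(len({coloring[frozenset(p)] for p in combinations(q, 2)}) == 6
               for q in combinations(range(n), 4))

def color_degree(n, coloring, v, c):
    return sum(1 for u in range(n) if u != v and coloring[frozenset((u, v))] == c)

n, col = blow_up_coloring(6, 2)
print(n, has_rainbow_k4(n, col))
print([color_degree(n, col, 0, c) for c in range(1, 7)])
```

In this instance every vertex has degree d in colors 1, ..., e - 1 and degree d - 1 in color e, so the coloring is an (e, d-1)-coloring of K_{de} in the sense used in the definition of type I.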
Acknowledgement
We are indebted to L. Pyber for several fruitful discussions and valuable remarks on the problems investigated here, to L. Lovász for calling our attention to Gallai's work [9], and to R.J. Faudree for discussions that led to an improvement of a previous version of Proposition 1.
References
[1] L.D. Andersen; Hamilton circuits with many colours in properly edge-coloured complete graphs, Math. Scand., 64, 5-14 (1989).
[2] P. Erdős, M. Simonovits and V.T. Sós; Anti-Ramsey theorems, in Infinite and Finite Sets II (A. Hajnal, R. Rado and V.T. Sós, editors), Proc. Colloq. Math. Soc. J. Bolyai, Keszthely (Hungary) 1973, North-Holland, 633-643 (1974).
[3] P. Erdős and Zs. Tuza; Rainbow Hamiltonian paths and canonically colored subgraphs in infinite complete graphs, Math. Pannonica, 1, 5-13 (1990).
[4] G. Hahn and C. Thomassen; Path and cycle sub-Ramsey numbers and an edge-colouring conjecture, Discrete Math., 62, 29-33 (1986).
[5] V. Rödl and Zs. Tuza; Rainbow subgraphs in properly edge-colored graphs, Random Structures and Algorithms, 3, 175-182 (1992).
[6] M. Simonovits and V.T. Sós; On restricted colourings of K_n, Combinatorica, 4, 101-110 (1984).
[7] Zs. Tuza; Representations of relation algebras and patterns of colored triplets, in Algebraic Logic (H. Andréka, J.D. Monk and I. Németi, editors), Proc. Colloq. Math. Soc. J. Bolyai, Budapest (Hungary) 1988, North-Holland, 671-693 (1991).
[8] K. Walker; Fully-coloured Hamiltonian cycles in edge-colourings of K_n (n odd) when there are no fully-coloured C_4's, Ars Combinatoria, 24A, 97-105 (1987).
[9] T. Gallai; Transitiv orientierbare Graphen, Acta Math. Acad. Sci. Hungar., 18, 25-66 (1967).
[10] T.A. McKee; Generalized complementation, J. Combinatorial Theory Ser. B, 42, 378-383 (1987).
[11] G. Brightwell and W.T. Trotter; private communication.
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 89-92 (1993)
© 1993 Elsevier Science Publishers B.V. All rights reserved.
GRAPHS WITH SPECIAL DISTANCE PROPERTIES
Martin LEWINTER Mathematics Department, State University of New York Purchase, New York, U.S.A.
Abstract
A graph G is called an F-graph if its center contains at least two vertices, and if x and y are in the center, then the distance between x and y is equal to the radius of G. The existence of such graphs is surprising and is in marked contrast to trees, whose center vertices must be adjacent. Some properties of F-graphs are presented, along with open questions. Recent results on other graphs with unusual distance properties are included.
1. Introduction
The distance between vertices x and z of a connected graph, denoted d(x, z), is the least number of edges in an x-z path. The eccentricity of vertex x, e(x), is max{d(x, w)} for all w ∈ V(G). A vertex y is an eccentric vertex of x if d(x, y) = e(x). The radius and diameter of graph G are the minimum and maximum eccentricities respectively. The center of G, C(G), consists of those vertices of minimum eccentricity. The periphery of G, Peri(G), is the set of all vertices of maximum eccentricity. A vertex x is a center eccentric point of G if it is an eccentric vertex of a center vertex of G. The collection of such vertices is denoted CEP(G). It is well known that for a non-trivial graph, |Peri(G)| ≥ 2. Analogously, it is shown in [1] that |CEP(G)| ≥ 2.
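The quantities defined above are straightforward to compute by breadth-first search. The following sketch is illustrative only; the adjacency-dictionary representation, the function names, and the small test graphs are assumptions of this example. It also includes a direct check of the F-graph condition from the abstract (read here as applying to distinct center vertices), for which the triangle K_3 is a trivial instance.

```python
from collections import deque

def bfs_distances(adj, source):
    """Distances from source in a connected graph given as {vertex: neighbors}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def distance_summary(adj):
    """Return (dist, radius, diameter, center, periphery, CEP)."""
    dist = {v: bfs_distances(adj, v) for v in adj}
    ecc = {v: max(dist[v].values()) for v in adj}
    radius, diameter = min(ecc.values()), max(ecc.values())
    center = {v for v in adj if ecc[v] == radius}
    periphery = {v for v in adj if ecc[v] == diameter}
    # Eccentric vertices of a center vertex lie at distance exactly `radius` from it.
    cep = {w for c in center for w in adj if dist[c][w] == radius}
    return dist, radius, diameter, center, periphery, cep

def is_f_graph(adj):
    """At least two center vertices, pairwise at distance equal to the radius."""
    dist, radius, _, center, _, _ = distance_summary(adj)
    return len(center) >= 2 and all(
        dist[x][y] == radius for x in center for y in center if x != y)

path5 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
k3 = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
c4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
_, rad, diam, center, peri, cep = distance_summary(path5)
print(rad, diam, center, peri, cep)
print(is_f_graph(k3), is_f_graph(c4))
```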
If d(x, z) = diam(G), we call x and z a diametral pair. An x-z path of length equal to the diameter is called a diametral path. See [2] and its references for more information on distances and centrality.
2. S, D, L, and L'-Graphs
The graph of Figure 1a is somewhat surprising as Peri(G) ∩ CEP(G) = ∅. The graph H of Figure 1b behaves as one might expect: Peri(H) = CEP(H).
Figure 1:
These graphs motivate the following definitions in [1] and [3].
Definition 2.1: A graph G is a D-graph, if Peri(G) ∩ CEP(G) = ∅.
Definition 2.2: A graph G is an S-graph, if Peri(G) = CEP(G).
It should be noted that while D-graphs and S-graphs are mutually exclusive, they are not jointly exhaustive. In [3] the following theorems are proven.
Theorem 2.1: If C(G) consists of two adjacent vertices x and y such that edge xy is a bridge, then G is an S-graph.
Theorem 2.2: If C(G) = {x} and x does not lie on a cycle, then G is an S-graph.
It follows, in light of Jordan's Theorem, that trees are S-graphs.
A 2-dimensional mesh M(r, s) is the Cartesian product of paths P_r and P_s. Let x and y be a diametral pair of a 2-dimensional mesh. Then if one considers the x-y diametral paths, some contain center vertices while others do not. We examine the extreme situations.
Definition 2.3: A graph G such that all diametral paths contain center vertices is called an L-graph.
Definition 2.4: A graph G such that no diametral path contains center vertices is called an L'-graph. Figures 2a and 2b exhibit L-graphs and L'-graphs respectively.
Figure 2:
In [1] the following theorems are proven.
Theorem 2.3: If C(G) consists of a pair of adjacent vertices x and y, such that edge xy is a bridge, then G is an L-graph.
Theorem 2.4: If C(G) = {x} and x does not lie on a cycle, then G is an L-graph.
It follows, as before, that trees are L-graphs.
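The four definitions of this section can be checked mechanically on small graphs. The sketch below is one possible way to do so, not a method taken from [1] or [3]: the function names classify and mesh are assumptions, graphs are assumed connected and non-trivial, and the test for a diametral path avoiding the center uses a breadth-first search in the graph with the center vertices removed.

```python
from collections import deque


def bfs(adj, source, removed=frozenset()):
    """Distances from source, never entering vertices listed in removed."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist and v not in removed:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def classify(adj):
    """Report whether a connected graph is an S-, D-, L- or L'-graph."""
    dist = {v: bfs(adj, v) for v in adj}
    ecc = {v: max(dist[v].values()) for v in adj}
    rad, diam = min(ecc.values()), max(ecc.values())
    center = frozenset(v for v in adj if ecc[v] == rad)
    peri = {v for v in adj if ecc[v] == diam}
    cep = {w for c in center for w in adj if dist[c][w] == rad}
    pairs = [(x, z) for x in adj for z in adj if dist[x][z] == diam]
    # A center vertex c lies on some diametral x-z path iff d(x,c) + d(c,z) = diam.
    some_hit = any(dist[x][c] + dist[c][z] == diam
                   for x, z in pairs for c in center)
    # A diametral x-z path avoids the center iff x and z are still at
    # distance diam once all center vertices are deleted.
    some_miss = any(x not in center and z not in center
                    and bfs(adj, x, removed=center).get(z) == diam
                    for x, z in pairs)
    return {"S": peri == cep, "D": not (peri & cep),
            "L": not some_miss, "L'": not some_hit}


def mesh(r, s):
    """Grid graph on r x s vertices (a small 2-dimensional mesh)."""
    adj = {(i, j): [] for i in range(r) for j in range(s)}
    for (i, j) in adj:
        for nb in ((i + 1, j), (i, j + 1)):
            if nb in adj:
                adj[(i, j)].append(nb)
                adj[nb].append((i, j))
    return adj


path5 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
```

On the five-vertex path (a tree) the sketch reports an S-graph and an L-graph, in line with the remarks following Theorems 2.2 and 2.4. On the grid with 3 × 3 vertices it reports neither an L-graph nor an L'-graph, since some diametral paths between opposite corners pass through the central vertex and others run along the boundary.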
Theorems 2.1 and 2.2 are similar to Theorems 2.3 and 2.4. However, note that L-graphs are not necessarily S-graphs. In fact, the graph of Figure 1a is an L-graph and a D-graph (a DL-graph for short).
3. F-Graphs
Jordan's Theorem states that the center of a tree is either a single vertex or two adjacent vertices. Thus, if a graph has several center vertices, one expects them to be close to one another. The graph of Figure 3 has radius four and two center vertices whose distance is four, that is, the center vertices are as far apart as possible! It is an example of a graph with the following property.
Definition 3.1: A graph G is an F-graph, if |C(G)| ≥ 2 and if d(x, y) = rad(G) for all distinct x, y ∈ C(G).
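Definition 3.1 translates directly into a short test. The sketch below is only an illustration: the function names are assumptions, the graph is assumed connected, and the six-vertex example was constructed for this sketch (it is not the graph of Figure 3). In that example the vertices a and b both have eccentricity 2, every other vertex has larger eccentricity, and d(a, b) = 2.

```python
from collections import deque


def all_pairs_distances(adj):
    """Run a BFS from every vertex of a connected, unweighted graph."""
    dist = {}
    for source in adj:
        d = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in d:
                    d[v] = d[u] + 1
                    queue.append(v)
        dist[source] = d
    return dist


def is_f_graph(adj):
    """True if the center has at least two vertices, pairwise at distance rad(G)."""
    dist = all_pairs_distances(adj)
    ecc = {v: max(dist[v].values()) for v in adj}
    rad = min(ecc.values())
    center = [v for v in adj if ecc[v] == rad]
    return len(center) >= 2 and all(
        dist[x][y] == rad for x in center for y in center if x != y
    )


# Hypothetical example: a 4-cycle a-m1-b-m2 with pendant vertices p1 and p2.
example = {
    "a": ["m1", "m2"], "b": ["m1", "m2"],
    "m1": ["a", "b", "p1"], "m2": ["a", "b", "p2"],
    "p1": ["m1"], "p2": ["m2"],
}
k3 = {0: [1, 2], 1: [0, 2], 2: [0, 1]}           # complete graph, radius 1
path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}   # tree with two adjacent centers
```

The example graph and the complete graph K_3 pass the test, while the path on four vertices fails it, because its two center vertices are adjacent although its radius is 2.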
Figure 3:
It is well known that if G is any graph, then C(G) is contained in a single block. We show in [1] that if G is an F-graph, no center vertex can be a cut-point. Furthermore, unlike the case for arbitrary graphs, F-graphs satisfy the following theorem proven in [1]:
Theorem 3.1: Any diametral path of an F-graph has a subpath of length at least equal to its diameter minus its radius contained in the block containing the center.
In order to shed some light on the nature of F-graphs, we introduce the concept of central distance sets. Let G be a graph with C(G) = {c_1, c_2, ..., c_s}. We define the central distance of vertex x, denoted d(x, C), by
d(x, C) = min{d(x, c_i) | i = 1, 2, ..., s}.
For each non-negative integer j, define the j-th central distance set N_j by N_j = {x | d(x, C) = j}. Obviously N_0 = C(G). If k > rad(G), then N_k = ∅. Furthermore, if N_k ≠ ∅ and j < k, clearly N_j ≠ ∅. We obtain the following lemma.
Lemma 3.1: Let G be a connected graph with radius a and diameter b. Then N_{b-a} ≠ ∅.
Proof: Let x ∈ Peri(G), i.e., e(x) = diam(G). Let x ∈ N_j. Then e(x) = diam(G) ≤ j + rad(G), yielding j ≥ diam(G) - rad(G). Since N_j ≠ ∅ and b - a ≤ j, the set N_{b-a} is non-empty as well, establishing the lemma.
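The central distance sets are easy to compute once all distances are known. The following sketch is an illustration under stated assumptions (connected graph, adjacency dictionaries, invented function names); it returns the sets N_j and lets one check on small examples that the set with index diam(G) - rad(G) is non-empty, as the lemma asserts.

```python
from collections import deque


def all_pairs_distances(adj):
    """Run a BFS from every vertex of a connected, unweighted graph."""
    dist = {}
    for source in adj:
        d = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in d:
                    d[v] = d[u] + 1
                    queue.append(v)
        dist[source] = d
    return dist


def central_distance_sets(adj):
    """Return (rad, diam, sets) where sets[j] is the central distance set N_j."""
    dist = all_pairs_distances(adj)
    ecc = {v: max(dist[v].values()) for v in adj}
    rad, diam = min(ecc.values()), max(ecc.values())
    center = [v for v in adj if ecc[v] == rad]
    sets = {}
    for x in adj:
        j = min(dist[x][c] for c in center)  # central distance d(x, C)
        sets.setdefault(j, set()).add(x)
    return rad, diam, sets


# Hypothetical examples: a path on five vertices and a small F-graph
# (a 4-cycle a-m1-b-m2 with pendant vertices p1 and p2).
path5 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
example = {
    "a": ["m1", "m2"], "b": ["m1", "m2"],
    "m1": ["a", "b", "p1"], "m2": ["a", "b", "p2"],
    "p1": ["m1"], "p2": ["m2"],
}
```

For both examples the radius is 2 and the diameter is 4, and the set N_2 consists of the two end vertices of the path and of the two pendant vertices, respectively.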
If G is an F-graph, we can say more, as the next theorem shows.
Theorem 3.2: Let G be an F-graph with radius r, and let j = ⌊r/2⌋. Then N_j ≠ ∅.
Proof: Let C(G) = {c_1, c_2, ..., c_s}. Let z be a vertex of a c_1-c_2 shortest path other than c_1 or c_2, such that d(c_1, z) = j. We claim that z ∈ N_j. Otherwise, d(z, c_i) < j for some i ≠ 1, 2, in which case d(c_1, c_i) ≤ d(c_1, z) + d(z, c_i) < 2j ≤ r, yielding d(c_1, c_i) < r, contradicting the assumption that G is an F-graph.
4. Concluding Remarks
We seek additional properties of S, D, L, L' and F-graphs (and their combinations). Are there elegant characterizations? Are there easy ways of determining whether a given graph has any of these properties? For example, Cartesian product graphs are not F-graphs. Do the central distance sets of any of the above-mentioned graphs have interesting properties?
References
[1] F. Buckley and M. Lewinter; Graphs with all diametral paths through distant central nodes, Comput. Math. Appl., (to appear).
[2] F. Buckley and F. Harary; Distance in Graphs, Addison-Wesley, New York (1990).
[3] F. Buckley and M. Lewinter; Minimal graph embeddings, eccentric vertices and the peripherian, Proc. 5th Carib. Conf. Comb. & Comp., 72-84 (1988).
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 93-108 (1993)
© 1993 Elsevier Science Publishers B.V. All rights reserved.
PROBABILITY MODELS FOR RANDOM MULTIGRAPHS WITH APPLICATIONS IN CLUSTER ANALYSIS
Erhard A.J. GODEHARDT Department of Thoracic and Cardiovascular Surgery, Heinrich-Heine-University, Düsseldorf, GERMANY
Abstract
The main difficulty in deriving test statistics for testing hypotheses of the structure of a data set lies in finding a suitable mathematical definition of the term “homogeneity” or, vice versa, in defining a mathematical model which “fits” a real, but homogeneous, world. This model should be both realistic and mathematically tractable. Graph-theoretic cluster analysis provides the analyst with probability models from which tests for the hypothesis of homogeneity within a data set can be derived for many environments. Because of variations of the scale levels between the different attributes of the objects of a sample, it is better not to compute one single similarity between any pair of vertices but more, say t, similarities. The structure of a set of mixed data then can more appropriately be described by a superposition of t graphs, a so-called “completely labeled multigraph”. This multigraph model also provides researchers with more sophisticated and flexible probability models to formulate and test different hypotheses of homogeneity within sets of mixed data. Different probability models for completely labeled random multigraphs are developed, their asymptotic equivalence is shown, and their advantages when applied to testing the “randomness” of clusters found by single-linkage classification algorithms are discussed. It is also shown how the multigraph models can be used to derive nonparametric test statistics to test the independence of the different attributes which have been measured.
1. Introduction, Graph-Theoretical Concepts
A cluster is a maximal collection of suitably similar objects drawn from a larger collection or sample C of objects. Thus, classification procedures usually are based on similarities, or dissimilarities (distances), respectively, which must be defined or calculated between every pair of objects. Only a few classification algorithms can uncover irregular or sickle-shaped clusters correctly even if the number of clusters is known. Here, graph theoretical concepts based on similarities - or, more generally, on binary relations - are helpful.
The n objects of a data set to be clustered can be interpreted as vertices 1, ..., n of a graph. Two vertices are connected by an edge if and only if the related objects are similar enough, that is, if their mutual distance is not greater than a user-defined threshold d. The components of such a graph $\Gamma = \Gamma(d)$ (the maximal subsets of vertices where any two vertices are connected by a sequence of edges) are known as single-linkage clusters, and the cliques (the maximal subsets of vertices where any two vertices are adjacent) become the complete-linkage clusters. Here, the notation $\Gamma(d)$ means that the graph consists of the vertices 1, ..., n representing the objects and those edges (i, j) connecting every two vertices i and j for which the distances $d_{ij}$ between the pairs of objects satisfy $d_{ij} \le d$. The advantage of graph-theoretic cluster procedures is that a cluster is a priori defined by the choice of a threshold d. (For most of the cluster-detecting procedures, clusters are not defined a priori by certain properties but are just the result of that procedure which the analyst has chosen, see [1], [2].) Some weak points of this way of defining clusters are well known, like the chaining effect for single-linkage clusters ([1]-[3]). Some of them can be by-passed by modifying the cluster definition; we can use weak k-linkage clusters (or k-clusters) for disjoint classifications, and strong k-linkage clusters for overlapping clustering. For k = 1, we get the single-linkage clusters; with k, we can determine the degree of compactness within the groups ([1]-[3]). For the remainder of this paper, we consider single-linkage clusters only.
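The construction of single-linkage clusters as components of the threshold graph can be sketched in a few lines. The code below is an illustration only: the function name, the union-find bookkeeping and the one-dimensional example data are assumptions made for this sketch and are not taken from [1]-[3].

```python
def single_linkage_clusters(dist, d):
    """Components of the threshold graph: i and j are joined if dist[i][j] <= d."""
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the representative, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= d:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical data: five objects on a line, with absolute differences as distances.
points = [0.0, 1.0, 2.0, 10.0, 11.0]
dist = [[abs(p - q) for q in points] for p in points]
clusters = single_linkage_clusters(dist, 1.5)
```

With the threshold 1.5 the objects 0, 1 and 2 form one cluster although objects 0 and 2 are at distance 2, which is a small instance of the chaining effect mentioned above.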
Often, the scale levels vary considerably between the different items, that is, between the dimensions of the data vectors. It then is nearly impossible to compute overall (or global) similarities $s_{ij}$ or distances $d_{ij}$, respectively, between the elements of a data set. The structure of a data set consisting of n multidimensional vectors can be described better by a multigraph than by a graph. We combine some of the dimensions to so-called “blocks” (for example, we can combine all binary components of the data vectors to a block and all continuous items form another block etc.). That gives t blocks. For each such block, we calculate “local” similarities or distances (using, for example, the matching coefficient or Tanimoto’s distance for the block of binary data, and the Euclidean distance for the block of continuous data). Thus, we get t local distances for every pair of objects (if the data vectors consist of either binary or continuous items then we get two local distances between each pair of objects by this procedure). We can define a multigraph $\Gamma_t$ with the n objects as vertices as follows: For every block l, we define a threshold $d_l$; this gives a vector $d^T = (d_1, \ldots, d_t)$ of t “local” thresholds. We superpose the n vertices in t layers so that every block of dimensions of the data is represented by a layer. For every block number l, we compute the distance $d_{ijl}$ between any pair of objects and draw the edge $(i,j)_l$ in the l-th layer between the vertices i and j if $d_{ijl} \le d_l$ (two vertices are adjacent in the l-th layer if the corresponding objects are similar enough in the l-th block of variables). A total of t different labeled edges therefore can connect two vertices directly, and we get an (undirected, completely labeled) multigraph $\Gamma_{t,n,N}$ with $N = N(d^T)$ edges (completely labeled since we distinguish between the edges joining the same pair of vertices).
For $1 \le s \le t$, the s-projection of a multigraph $\Gamma_t$ is the graph with the same vertices as $\Gamma_t$, where exactly those pairs of vertices are connected by an edge which are connected by at least s edges in $\Gamma_t$ (s-fold connected in $\Gamma_t$). With this definition, we can generalize properties of simple graphs to multigraphs by “mapping” them: An s-component in $\Gamma_t$ is defined as a component in the s-projection of $\Gamma_t$; an s-isolated vertex in $\Gamma_t$ is defined by being isolated in the s-projection, that is, being not connected by an edge to another vertex there, and so on [2] [4].
For a given integer s ($1 \le s \le t$), a single-linkage cluster of level $d^T$ is an s-component of the multigraph $\Gamma_{t,n,N}(d^T)$ defined by the data, or a component of the s-projection. In practical classification problems, it is often acceptable for objects to differ in some dimensions of their data vectors; they will be put in the same cluster if they are similar enough in a number of other dimensions or blocks. This is the reason why we define single-linkage clusters not as t-components but as s-components of $\Gamma_{t,n,N}$ ($1 \le s \le t$). If there is only one block we get the previous definition of single-linkage clusters as the special case t = s = 1. In the same way we can generalize all graph-theoretic cluster definitions to multigraphs. In [2], we described an algorithm for uncovering $(k, d^T, s)$-clusters which uses either the original graph $\Gamma_t(d^T)$ or its s-projection.
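The step from t local distance matrices to s-components can be sketched as follows. This is an illustration under stated assumptions and not the algorithm of [2]: the function names, the data layout (one distance matrix per layer) and the toy data are invented for this example.

```python
def s_projection_edges(local_dists, thresholds, s):
    """Pairs (i, j) that are adjacent in at least s of the t layers."""
    t = len(local_dists)
    n = len(local_dists[0])
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            layers = sum(1 for l in range(t)
                         if local_dists[l][i][j] <= thresholds[l])
            if layers >= s:
                edges.add((i, j))
    return edges


def components(n, edges):
    """Connected components of the graph on vertices 0..n-1 with the given edges."""
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in edges:
        parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical data: four objects measured in two blocks (t = 2 layers).
block1 = [0.0, 1.0, 5.0, 6.0]
block2 = [0.0, 0.5, 5.0, 9.0]
local_dists = [[[abs(p - q) for q in blk] for p in blk] for blk in (block1, block2)]
thresholds = [1.5, 1.5]

one_components = components(4, s_projection_edges(local_dists, thresholds, 1))
two_components = components(4, s_projection_edges(local_dists, thresholds, 2))
```

Here objects 2 and 3 are close in the first block only, so they share a cluster for s = 1 but are separated for s = 2, whereas objects 0 and 1 are close in both blocks.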
2. A Probability Model Based on Random Graphs
The result of every clustering procedure will be a number of clusters. This holds true even if the sample has been drawn from a homogeneous population. Therefore we need statistical tests to decide whether the clusters found are “real” and reflect a heterogeneous structure within the population or are “random”. The main difficulty in deriving test statistics for testing hypotheses of the structure of a data set lies in finding a suitable mathematical definition of the term “homogeneity” or, vice versa, in defining a mathematical model which “fits” a real, but homogeneous, world. This model should be both realistic and mathematically tractable [5] [6]. Graph-theoretic cluster analysis provides the analyst with simple probability models from which tests for the hypothesis of homogeneity within a data set can be derived for many environments.
R.F. Ling assumed a uniform distribution of distances as the null hypothesis of homogeneity. A random attachment of the $N = N(d)$ distances smaller than a threshold d to pairs of objects then can be interpreted as a random choice of the corresponding $N(d)$ edges in a graph $\Gamma(d)$ [3]. Thus R.F. Ling could use the analogy between single-linkage clusters or k-clusters and certain subgraphs of the graph $\Gamma(d)$ to derive conditional exact or asymptotic test statistics for testing the hypothesis that a sample C has been drawn from one single homogeneous population (and thus has been partitioned randomly into different clusters) using results from a probability model of random graphs (r.g.’s). This has been discussed in [7] (see also [3]): A cluster structure formed by the first $N(d)$ distances is said to be “real” if the probability to get a r.g. $\Gamma_{n,N}$ with n vertices, labeled 1, ..., n, N edges and the same properties as found in the sample C is lower than a given level α of significance, if, for example, the value of the random variable (r.v.) $X_{.1}$, the “number of isolated vertices” in $\Gamma_{n,N}$, differs too much from the one we would expect under random conditions. (An exact formula of the distribution of $X_{.1}$ is given in [2]; we also can use other r.v.’s to construct test statistics, see [2]-[4].)
The following probability model for r.g.’s, corresponding to a uniform distribution of distances, is assumed [2] [3] [8]: $\Gamma_{n,N}$ is a r.g. with given n vertices where N of the $\binom{n}{2}$ possible edges have been drawn at random and without replacement (uniform model, hypergeometric model or urn model without replacement).
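The urn model without replacement is straightforward to simulate. The following sketch draws N of the possible vertex pairs uniformly at random and counts the isolated vertices; the function names and the use of Python's random module are assumptions made for this illustration.

```python
import itertools
import random


def random_graph_uniform(n, N, rng):
    """Draw N of the n*(n-1)/2 possible edges at random, without replacement."""
    pairs = list(itertools.combinations(range(n), 2))
    return rng.sample(pairs, N)


def count_isolated(n, edges):
    """Number of vertices 0..n-1 that are not an endpoint of any edge."""
    touched = {v for edge in edges for v in edge}
    return n - len(touched)


rng = random.Random(1)            # fixed seed so that repeated runs agree
graph = random_graph_uniform(50, 60, rng)
isolated = count_isolated(50, graph)
```

Repeating the draw many times gives an empirical distribution of the number of isolated vertices that can be compared with the exact or asymptotic distribution used in the test.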
If the sample size n and the number $N(d)$ of edges drawn at a threshold d are not very small then it becomes cumbersome to calculate exact probabilities. In this case, one can use asymptotic results like those from [7]. If N is of order of magnitude n log n then it is well known that a r.g. $\Gamma_{n,N}$ consists almost surely of one single large component and some isolated vertices besides it. The following theorem holds (with $\lfloor x \rfloor$ as the integer part of x, and o(1) as a null sequence; in this context, o and O denote the Landau symbols).
Theorem 1: (Erdős-Rényi 1960) In sequences $(\Gamma_{n,N})_{n\to\infty}$ of r.g.'s with n vertices and
\[ N = N(n) = \left\lfloor \tfrac{1}{2}\, n\, (\log n + c + o(1)) \right\rfloor \tag{1} \]
edges, the expected numbers of isolated vertices tend to a positive limit for $n \to \infty$: $E_{n,N} X_{.1} \to \lambda = e^{-c}$. The number of isolated vertices tends to a Poisson distribution: $P_{n,N}(X_{.1} = k) \to e^{-\lambda}\lambda^k/k!$ (k = 0, 1, 2, ...). The limit distribution for the number of components with at least two vertices is degenerate: $P_{n,N}(Z = 1) \to 1$.
If we draw edges so that N fulfills condition (1) for a constant c (or if we choose the threshold d such that the expected number $N(d)$ is large enough), and if then in the graph $\Gamma_{n,N}$ or $\Gamma_{n,N(d)}$ with k isolated vertices, which we get from the data, this number k is larger than the expected number $e^{-c}$, and $P_{n,N}(X_{.1} \ge k) = 1 - e^{-\lambda}(1 + \lambda + \lambda^2/2 + \ldots + \lambda^{k-1}/(k-1)!) < \alpha$ holds, then we can reject the null hypothesis of homogeneous data. The components of that graph then are interpreted as real clusters. (We can restrict the test to being one-sided, since we can choose N or d so that (1) holds with c > 0, which means that we can expect less than one isolated vertex. However, we can also construct two-sided tests.) In (1), the function o(1) is unknown. For calculating c and λ from the data, we have to insert a function here. Putting o(1) = 0 is admissible for n > 200. For n ≤ 200, however, this choice is rather poor (see [2] [8]). Here,
\[ N(n) = \tfrac{1}{2}(n-1)(\log n + c)\left(1 - \frac{(n-1)(\log n + c) - 2}{(n-1)\,(2(n-1) + \log n + c)}\right) \]
should be used for calculating c and λ. With this edge function, we can use the asymptotic results from Theorem 1 for sample sizes of about 50 or 60 (see [8]).
Some authors prefer another model for r.g.'s, which fits better to the idea of determining the graph $\Gamma(d)$ from a threshold d [9]: A r.g. $G_{n,p}$ arises by making a random choice for every pair (i, j) of vertices from 1, ..., n, independent of each other and with the same probability p, whether to include the edge (i, j) or not (binomial model). For $p = N/\binom{n}{2}$, both probability models are asymptotically equivalent for many graph properties; we get the same asymptotic results under both assumptions if the properties are the same and the number N of edges is not too large [7]. For example, with $p = p(n) = \frac{1}{n}(\log n + c + o(1))$, the results of Theorem 1 also hold for r.g.'s $G_{n,p}$. In the classification model, a graph $G(d) = G_{n,p(d)}$ is the same as a graph $\Gamma_{n,N(d)}$ with the exception that the number of edges included now is the realization of a r.v. for a probability p; N and p both depend on the threshold d: $N = N(d)$ and $p = p(d)$. The application of both models to cluster analysis can be justified: Looking for the first N smallest distances we get a r.g. $\Gamma_{n,N}$ with a fixed number N of edges; defining a threshold d for the distances we get a r.g. $G_{n,p(d)} = \Gamma_{n,N(d)}$ where the number $N(d)$ of edges is the result of a random experiment.
3. A Probability Model Based on Random Multigraphs
This test procedure as well as other ones, based on [7], can be generalized to random multigraphs. The matrices $D_l = (d_{ijl})$ of local distances for every block (l = 1, ..., t) are arranged to a distance tensor D. Clusters are defined now by a threshold vector $d^T = (d_1, \ldots, d_t)$ of thresholds for every block and by an integer s with $1 \le s \le t$. We now assume that homogeneity of a sample can be described by random order of the N smallest distances in a total of $t\binom{n}{2}$ local distances. This corresponds to the assumption that in the corresponding multigraph $\Gamma_{t,n,N}(d^T)$, the $N(d^T)$ edges are drawn at random. Hence the assumption of homogeneity of the data leads to the following probability model:
(A1) Let t colors (corresponding to the layers), n vertices, labeled 1, ..., n, and $\binom{n}{2}$ edges of each color be given; let the edges be labeled $(1,2)_1, (1,3)_1, \ldots, (n-1,n)_1$ for color 1, $(1,2)_2, (1,3)_2, \ldots, (n-1,n)_2$ for color 2, and so on. Put all $t\binom{n}{2}$ edges into an urn and choose N edges without replacement. This constitutes a random multigraph $\Gamma_{t,n,N}$ with at most t distinguishable edges linking two vertices i and j together. The probability for a certain multigraph with N given edges is
\[ P(\Gamma_{t,n,N}) = \binom{t\binom{n}{2}}{N}^{-1}. \tag{2} \]
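Model (A1) can be simulated directly as an urn of colored edges. The sketch below is an illustration only; the triple (i, j, l) used to represent the edge between i and j in layer l and all function names are assumptions of this example.

```python
import itertools
import random


def random_multigraph_a1(t, n, N, rng):
    """Draw N of the t*n*(n-1)/2 colored edges (i, j, l) without replacement."""
    urn = [(i, j, l)
           for l in range(t)
           for i, j in itertools.combinations(range(n), 2)]
    return rng.sample(urn, N)


def s_projection(edges, s):
    """Vertex pairs joined by at least s of the drawn colored edges."""
    counts = {}
    for i, j, _layer in edges:
        counts[(i, j)] = counts.get((i, j), 0) + 1
    return {pair for pair, c in counts.items() if c >= s}


def count_s_isolated(n, edges, s):
    """Number of vertices that are isolated in the s-projection."""
    touched = {v for pair in s_projection(edges, s) for v in pair}
    return n - len(touched)


rng = random.Random(0)            # fixed seed so that repeated runs agree
multigraph = random_multigraph_a1(3, 6, 10, rng)
full = random_multigraph_a1(2, 5, 20, rng)   # all 2 * 10 edges are drawn
```

The number of edges of the s-projection and the number of s-isolated vertices computed in this way are realizations of the random variables studied in the remainder of this section.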
A justification for (2) is that in the case of homogeneous data the computed distances are considered as realizations of r.v.'s $D_{ijl}$, for which the condition $D_{ijl} = a + \varepsilon_{ijl}$ holds true with a as a positive constant and the $\varepsilon_{ijl}$ as independent, continuous, identically distributed r.v.'s with $E\,\varepsilon_{ijl} = 0$ ($1 \le i < j \le n$, $1 \le l \le t$). In this case there are $\left(t\binom{n}{2}\right)!$ different symmetric rank tensors with the same probability to be chosen, namely
\[ \frac{1}{\left(t\binom{n}{2}\right)!}. \tag{3} \]
The probability that the N lowest ranks will take N given places regardless of their order then is the same as in (2). In practice, the global constant a for all blocks is not very realistic. However, we can get rid of this condition through a simple transformation of the elements of every distance matrix $D_l$. We then only need the assumption that $D_{ijl} = a_l + \varepsilon_{ijl}$ holds true for the original data with positive constants $a_1, \ldots, a_t$ and independent, continuous r.v.'s $\varepsilon_{ijl}$, which are from the same family of distributions. The assumption of continuous r.v.'s $D_{ijl}$, or $\varepsilon_{ijl}$, respectively, is needed to get $t\binom{n}{2}$ different distances $d_{ijl}$ with probability 1 and thus to keep (3) true. We can drop this assumption if equal distances are arranged in an ascending order by randomization.
Now, let $\Omega_{t,n,N}$ be the set of all multigraphs $\Gamma_{t,n,N}$ with n given vertices, labeled 1, ..., n, and N of the $t\binom{n}{2}$ possible edges. Let the probability $P(\Gamma_{t,n,N})$ that an element $\Gamma_{t,n,N}$ is chosen randomly from $\Omega_{t,n,N}$ be given by (2). Every element $\Gamma_{t,n,N}$ chosen at random is called a random multigraph (r.m.). For t = 1 this is the model of Ling or of Theorem 1.
For a given integer s, let the following r.v.'s be defined on probability spaces $(\Omega_{t,n,N}, \mathfrak{P}(\Omega_{t,n,N}), P_{t,n,N})$. Let $T_{ijl}$ be a 0-1-variable with $T_{ijl}(\Gamma_{t,n,N}) = 1$ if i and j are linked together by an edge $(i,j)_l$ in the l-th layer. Let $U_{sij}$ be another 0-1-variable with $U_{sij}(\Gamma_{t,n,N}) = 1$ if $T_{ij.}(\Gamma_{t,n,N}) \ge s$, where $T_{ij.} = \sum_{l} T_{ijl}$, that is, if i and j are connected by at least s edges in $\Gamma_{t,n,N}$ (they then are called s-fold connected). By $V_s = \sum_{i<j} U_{sij}$ we get the number of s-connections, that is, the number v of edges in the s-projection $\Gamma_{n,v}$ of $\Gamma_{t,n,N}$. Defining another 0-1-variable $X_{si1}$ by $X_{si1} = 1$ if the vertex i is s-isolated, we get the number of s-isolated vertices by $X_{s.1}$. By $Z_s$ we count the number of s-components (of any size), and $Z = Z_s - X_{s.1}$ gives the number of s-components with at least two vertices in r.m.'s $\Gamma_{t,n,N}$. (For t = 1 and s = 1, we get simple r.g.'s; in this case we omit the leading indices t = 1 and s = 1.) From our probability model,
\[ P_{t,n,N}(X_{s.1} = k \mid V_s = v) = P_{n,v}(X_{.1} = k) \tag{4} \]
follows. Therefore, the conditional distribution of the number of s-isolated vertices in r.m.'s $\Gamma_{t,n,N}$, under the condition that the $\Gamma_{t,n,N}$ have exactly v s-connections, is the same as the distribution of the number of isolated vertices in r.g.'s $\Gamma_{n,v}$ with v edges. Moreover, let $p_s^*$ be the probability that two vertices in a random multigraph are s-fold connected (that is, the corresponding edge in the s-projection $G_{n,p_s^*}$ is present), then the probability model (A1) gives
\[ p_s^* = P_{t,n,N}(U_{sij} = 1) = P_{t,n,N}(T_{ij.} \ge s) \tag{5} \]
as the probability of having an edge in the s-projection and
\[ E_{t,n,N} V_s = \binom{n}{2} p_s^* = \binom{t}{s}\,\frac{2^{s-1} N^s}{t^s\, n^{2s-2}} \left\{ 1 + O\!\left( \frac{N^2 + Nn + n^2}{N n^2} \right) \right\} \tag{6} \]
as the expected number of edges. (The last part of (6) holds if N is small enough to give an expected number of edges of $o(n^{3/2})$ in the s-projection.) Thus, the s-projections are realisations of r.g.'s $G_{n,p_s^*}$ with $p_s^*$ according to (5) or of r.g.'s $\Gamma_{n,\lfloor EV \rfloor}$ with $\lfloor EV \rfloor = \lfloor E_{t,n,N} V_s \rfloor$ edges according to (6). Using this idea of attaching to every r.m. $\Gamma_{t,n,N}$ its s-projection $\Gamma_{n,\lfloor EV \rfloor}$ or $G_{n,p_s^*}$, we can prove
\[ E_{t,n,N} X_{s.1} \sim E_{n,\lfloor EV \rfloor} X_{.1} \sim n \exp(-2 E_{t,n,N} V_s / n), \qquad P_{t,n,N}(X_{s.1} = k) \sim P_{n,\lfloor EV \rfloor}(X_{.1} = k) \]
and similar results for other properties of r.m.'s ([2], [4]). Using these facts, we get the following result by applying Theorem 1 (which has been proved in 1980, see [2] [4] [10] [11]).
Theorem 2: In sequences $(\Gamma_{t,n,N})_{n\to\infty}$ of r.m.'s with t layers, n vertices, and
\[ N = N(n) = \left\lfloor \frac{t}{2} \binom{t}{s}^{-1/s} n^{2-1/s} (\log n + c + o(1))^{1/s} \right\rfloor \tag{7} \]
edges, the expected number of edges in the s-projection of $\Gamma_{t,n,N}$, $E_{t,n,N} V_s$, is given by (1), and the corresponding sequences of s-projections behave as sequences of r.g.'s according to Theorem 1. In $(\Gamma_{t,n,N})_{n\to\infty}$, the expected numbers of s-isolated vertices tend to a positive limit for $n \to \infty$: $E_{t,n,N} X_{s.1} \to \lambda = e^{-c}$. The number of s-isolated vertices tends to a Poisson distribution: $P_{t,n,N}(X_{s.1} = k) \to e^{-\lambda}\lambda^k/k!$ (k = 0, 1, 2, ...). The limit distribution for the number of s-components with at least two vertices is degenerate: $P_{t,n,N}(Z = 1) \to 1$.
Proof (Main Idea): For N(n) given by (7), the corresponding s-projections of r.m.'s $\Gamma_{t,n,N}$ behave as r.g.'s $\Gamma_{n,v}$ or $G_{n,p_s^*}$, where the expected numbers of edges are given by
\[ E_{t,n,N} V_s \sim \tfrac{1}{2}\, n\, (\log n + c + o(1)) \]
and the probabilities of edges to be chosen are given by
\[ p_s^* = p(n) = \frac{1}{n}(\log n + c + o(1)). \]
Thus, Theorem 1 of P. Erdős and A. Rényi on isolated vertices can be applied to the sequences of s-projections since the expected numbers of edges satisfy condition (1), with v or $\lfloor EV \rfloor$ instead of N. With (4), we can prove that in the original sequence of r.m.'s the number of s-isolated vertices then also tends to a Poisson distribution.
Remark 1: As in the theory of simple r.g.'s, the asymptotic behavior and structure of sequences $(\Gamma_{t,n,N})_{n\to\infty}$ of r.m.'s can be described for various types of edge sequences N = N(n) ([2] [4]). The proof technique for these generalizations of the results of P. Erdős and A. Rényi and others for r.g.'s to r.m.'s is in most cases the same: We look for the conditions under which the sequences of s-projections have the desired properties and then use formulas similar to (4) together with (5) or (6) to determine the corresponding edge sequences for the original r.m.'s.
The results of Theorem 2 can be used to test the homogeneity of a data set using the limit distribution of the r.v. $X_{s.1}$. We have to choose $d^T$ so that the number of edges $N(d^T)$ is large enough to satisfy (7) (as in the previous section, this number should be chosen so that the expected number of s-isolated vertices is small to get a one-sided test). Inserting t, n, and $N(d^T)$, which we obtain from our data, into (7) we compute the value for c and the expected number $\lambda = e^{-c}$ of s-isolated vertices under the null hypothesis of drawing edges at random. Hence the probability to get k or more s-isolated vertices in a multigraph with t layers, n vertices and N edges is $P_{t,n,N}(X_{s.1} \ge k) = 1 - e^{-\lambda}(1 + \lambda + \lambda^2/2 + \ldots + \lambda^{k-1}/(k-1)!)$. For a given level of significance α, we can reject the null hypothesis that the edges have been drawn at random if $P_{t,n,N}(X_{s.1} \ge k) < \alpha$. Then, we accept the alternative of inhomogeneous data and consider the detected clusters not as being found "at random" but as real ones. Some medical examples where this procedure has been applied have been published in [2] and [4].
As in (1), the function o(1) is unknown in (7): This function now depends not only on n but also on t and s. Putting o(1) = 0 for calculating c and λ from the data is admissible for n > 200 as it was for the model of simple r.g.'s. For n ≤ 200, however, this choice again is rather bad (see [2] [10]).
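The test described above can be sketched numerically. The code below is a rough illustration and rests on explicit assumptions: the null sequence o(1) is set to 0 (which the text regards as admissible only for n > 200), and the expected number of edges in the s-projection is approximated by its leading term, binom(t, s) 2^(s-1) N^s / (t^s n^(2s-2)), which is then equated with n (log n + c) / 2 and solved for c. The function names are invented for this example. The Poisson tail P(X ≥ k) is computed as one minus the sum of the terms with index 0, ..., k-1.

```python
import math


def poisson_tail(lam, k):
    """P(X >= k) for a Poisson variable X with mean lam."""
    return 1.0 - sum(math.exp(-lam) * lam ** i / math.factorial(i)
                     for i in range(k))


def isolated_vertex_test(n, N, k, t=1, s=1):
    """Return (c, lam, p_value) for k observed s-isolated vertices.

    Assumption of this sketch: o(1) = 0 and only the leading term of the
    expected number of s-projection edges is used.
    """
    expected_edges = (math.comb(t, s) * 2 ** (s - 1) * N ** s
                      / (t ** s * n ** (2 * s - 2)))
    c = 2.0 * expected_edges / n - math.log(n)
    lam = math.exp(-c)
    return c, lam, poisson_tail(lam, k)


# Hypothetical simple-graph example (t = s = 1): 300 objects, 1100 edges,
# 4 isolated vertices observed.
c, lam, p_value = isolated_vertex_test(n=300, N=1100, k=4)
reject = p_value < 0.05
```

For t = s = 1 the computation reduces to c = 2N/n - log n, the case of simple random graphs. For small samples the refined edge functions discussed in the text should replace the crude choice o(1) = 0.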
Fairly good choices for o(1) have been derived for the cases s = 1 and s = t in [2] and [10]. For s = 1,
\[ N(n) = \tfrac{1}{2}(n-1)(\log n + c)\left(1 - \frac{(n-1)(\log n + c) - 2}{(n-1)\,(2t(n-1) + \log n + c)}\right) \]
should be used for calculating c and λ, and for s = t,
\[ N(n) = \frac{t}{2}\, n\, (n-1)^{1-1/t} (\log n + c)^{1/t} \left(1 - \frac{n(\log n + c) - 2}{n\,(2(n-1) + \log n + c)}\right) \]
is much better than using (7) with $o(1) \equiv 0$. With these edge functions, the asymptotic results of Theorem 2 can be used as test statistics for testing the homogeneity within a data set for sample sizes larger than 60.
Obviously, for 1 < s < t the formulas for r.m.'s are more cumbersome to use than those for r.g.'s. Theorem 2, however, shows how to reduce most problems for random multigraphs $\Gamma_{t,n,N}$ to problems in random graphs $\Gamma_{n,N}$: For any $1 \le s \le t$, we can use the s-projection instead of the original multigraph $\Gamma_{t,n,N}$. Thus, we use the formulas for ordinary r.g.'s instead of those for r.m.'s, but we take full advantage of the original multigraph model and its greater flexibility for defining distance thresholds. Theorem 1 then can be used to compute, for example, the probability of having k or more isolated vertices in the s-projection of $\Gamma_{t,n,N}(d^T)$ (see Section 5 for advantages of this procedure).
We can also use a generalization of the r.g.'s $G_{n,p}$ to completely labeled r.m.'s with t layers as a model for classification. This leads to the following definition.
(A2) Making for every edge a decision with probability p whether to take it from the urn gives another probability model, $G_{t,n,p}$. As the probability to choose a certain multigraph with N edges at random, we get
\[ P(G_{t,n,p}) = p^N (1-p)^{t\binom{n}{2} - N}. \tag{8} \]
From (8), the probability $p_s^*$ that two vertices are s-fold connected is given by
\[ p_s^* = \sum_{k=s}^{t} \binom{t}{k} p^k (1-p)^{t-k}, \]
which is the tail of a binomial distribution while (5) is the tail of a hypergeometric distribution. For $p = N / \left(t\binom{n}{2}\right)$, both probability models (the $\Gamma_{t,n,N}$-model and the $G_{t,n,p}$-model) are asymptotically equivalent if N is not too large. Again, the s-projections are realisations of r.g.'s $G_{n,p_s^*}$ or $\Gamma_{n,v}$ with
\[ E V_s = \binom{n}{2} p_s^* \]
as the expected numbers of edges if N is small enough to give an expected number of edges of order $o(n^{3/2})$ in the s-projection. That means that, as with r.g.'s $\Gamma_{n,N}$ and $G_{n,p}$, we get the same asymptotic results under both assumptions if the properties are the same.
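How close the two models are for a single pair of vertices can be checked numerically. The sketch below compares the hypergeometric tail arising under model (A1) with the binomial tail arising under model (A2) for p = N / (t n(n-1)/2). The explicit hypergeometric sum and the function names are assumptions of this example: under (A1), the number of drawn edges among the t edges joining a fixed pair is taken to be hypergeometric, with t marked edges in an urn of t n(n-1)/2 edges and N draws.

```python
from math import comb


def p_s_hypergeometric(t, n, N, s):
    """P(at least s of the t edges joining a fixed pair are drawn), model (A1)."""
    M = t * comb(n, 2)            # total number of colored edges in the urn
    total = sum(comb(t, k) * comb(M - t, N - k)
                for k in range(s, min(t, N) + 1))
    return total / comb(M, N)


def p_s_binomial(t, p, s):
    """P(at least s of t independent layers contain the edge), model (A2)."""
    return sum(comb(t, k) * p ** k * (1 - p) ** (t - k)
               for k in range(s, t + 1))


# Hypothetical parameters: t = 3 layers, n = 200 vertices, N = 3000 edges, s = 2.
t, n, N, s = 3, 200, 3000, 2
p = N / (t * comb(n, 2))
hyper = p_s_hypergeometric(t, n, N, s)
binom = p_s_binomial(t, p, s)
```

For these values the two probabilities differ only by a small relative amount, which illustrates the asymptotic equivalence claimed for edge numbers that are not too large.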
4. New Probability Models Based on Random Multigraphs
Several ways to generalize the probability models for r.g.'s $\Gamma_{n,N}$ or $G_{n,p}$ from Section 2 to undirected, completely labeled multigraphs exist. The following two models are of special interest for applications to classification theory.
(B1) Let t r.g.'s $\Gamma_{n,N_1}, \ldots, \Gamma_{n,N_t}$ with the same vertices 1, ..., n and $N_l$ edges per graph be chosen independently of each other. Superposition of these r.g.'s defines a r.m. $\Gamma_{t,n,(N_1,\ldots,N_t)}$ with $N_1 + \ldots + N_t = N$ edges altogether. The probability to draw a certain multigraph with $(N_1, \ldots, N_t)$ edges is
\[ P(\Gamma_{t,n,(N_1,\ldots,N_t)}) = \prod_{l=1}^{t} \binom{\binom{n}{2}}{N_l}^{-1}. \tag{10} \]
(B2) Let t random graphs $G_{n,p_1}, \ldots, G_{n,p_t}$, with probability $p_l$ per graph that vertices are adjacent, be chosen independently of each other. In each random graph $G_{n,p_l}$, we expect $E N_l = \binom{n}{2} p_l$ edges. Superposition of these random graphs defines a random multigraph $G_{t,n,(p_1,\ldots,p_t)}$ with $EN = EN_1 + \ldots + EN_t$ expected edges altogether. The probability for a certain multigraph with $(N_1, \ldots, N_t)$ edges is
\[ P(G_{t,n,(p_1,\ldots,p_t)}) = \prod_{l=1}^{t} p_l^{N_l} (1 - p_l)^{\binom{n}{2} - N_l}. \tag{11} \]
With I = {1, 2, ..., t} and $S \subseteq I$, we get from (B1)
\[ p_s^* = \sum_{S \subseteq I,\ |S| \ge s}\ \prod_{l \in S} \frac{N_l}{\binom{n}{2}} \prod_{l \in I \setminus S} \left(1 - \frac{N_l}{\binom{n}{2}}\right) \tag{12} \]
as the probability that an edge is in the s-projection $G_{n,p_s^*}$, while (B2) gives
\[ p_s^* = \sum_{S \subseteq I,\ |S| \ge s}\ \prod_{l \in S} p_l \prod_{l \in I \setminus S} (1 - p_l) \tag{13} \]
as the probability that two vertices i and j are s-fold connected. For s = t, $p_s^* = \prod_{l=1}^{t} p_l$ follows from (13), and for s = 1 and $\max p_l \to 0$, $p_s^* \sim \sum_{l=1}^{t} p_l$ holds, and similar results follow for model (B1) if we substitute $p_l$ by $N_l / \binom{n}{2}$.
For $p_1 = \cdots = p_t = p$, model (B2) is exactly the same as model (A2); in this case, (13) gives the tail of a binomial distribution with parameters $t$ and $p$.

In this section, we show that both multigraph models, (B1) and (B2), give the same asymptotic result for the distribution of the number of s-isolated vertices. (Generally, we can expect that both models are asymptotically equivalent for those edge sequences where the graph models are asymptotically equivalent for each layer in the superposition. But this is not proved here.) For model (B1), $(\Omega_{t,n,(N_1,\ldots,N_t)}, P(\Omega_{t,n,(N_1,\ldots,N_t)}))$ is the probability space. $\Omega_{t,n,(N_1,\ldots,N_t)}$ contains all those multigraphs $\Gamma_{t,n,(N_1,\ldots,N_t)}$ which are superpositions of graphs $\Gamma_{l,n} = (\mathcal{V}, \mathcal{E}_l)$ with vertex set $\mathcal{V} = \{1, \ldots, n\}$ and $N_l$-element edge sets $\mathcal{E}_l$ ($N_1 + \cdots + N_t = N$). The set $\Omega_{t,n,(N_1,\ldots,N_t)}$ obviously is a subset of $\Omega_{t,n,N}$ from Section 3, since we added a condition. Now let $N_1 = \cdots = N_t = N/t$ be another additional constraint. This new constraint leads to a set $\Omega_{t,n,(N/t,\ldots,N/t)}$ with
\[ \binom{\binom{n}{2}}{N/t}^{t} \]
multigraphs as elements which shall be equiprobable. This gives
\[ P(\Gamma_{t,n,(N/t,\ldots,N/t)}) = \binom{\binom{n}{2}}{N/t}^{-t} \tag{14} \]
instead of (10). Furthermore,
\[ p = \frac{N/t}{\binom{n}{2}} = \frac{N}{t \binom{n}{2}} \tag{15} \]
holds for the probability of drawing an edge in the $l$-th layer (see the last paragraph of the preceding section, where we already considered $p = N / (t \binom{n}{2})$). This implies
\[ \binom{t}{k} p^k (1 - p)^{t-k} \]
for the probability that exactly $k$ edges connect the two vertices $i$ and $j$ (we use the same notations $P_{t,n,N}$ and $E_{t,n,N}$ as in Section 3 instead of $P_{t,n,(N/t,\ldots,N/t)}$ and $E_{t,n,(N/t,\ldots,N/t)}$, which more correctly should be preferred for the probabilities and expectations for model (B1); the r.v.’s are the same for the different models). From this,
\[ p_s^* = P_{t,n,N}(U_{s;ij} = 1) = \sum_{k=s}^{t} \binom{t}{k} p^k (1 - p)^{t-k} \tag{16} \]
follows immediately. Thus, under (14), like under (8), the probability that two vertices are s-fold connected is given by the tail of a binomial distribution $B(t, p)$, while under (2), this probability is given by the tail of a hypergeometric distribution. The expected number of s-connections, that is, the expected number of edges in the s-projection, in model (B1) is
\[ E_{t,n,N} V_s = \binom{n}{2} p_s^* \sim \binom{n}{2} \binom{t}{s} p^s \tag{17} \]
if $p \to 0$. Combining (16) with (5), we have $p_s^* / p_s \to 1$ for $N_l = N/t$. Formulas (17) and (6) are asymptotically equivalent in that case, too. Similar facts hold for $\mathrm{Var}_{t,n,N} V_s$, which are not hard to prove. Thus, the following theorem, which is the analog of Theorem 2, can be proved using virtually the same techniques (see [11]).

Theorem 3: In sequences $(\Gamma_{t,n,(N/t,\ldots,N/t)})_{n \to \infty}$ of r.m.’s with $t$ layers, $n$ vertices, and
\[ N_l = N_l(n) = N(n)/t = \frac{1}{2} \binom{t}{s}^{-1/s} n^{2 - 1/s} (\log n + c + o(1))^{1/s} \tag{18} \]
edges per layer ($N_1 + \cdots + N_t = N$), the expected number of edges in the s-projection of $\Gamma_{t,n,N}$, $E_{t,n,N} V_s$, is given by (1), and the corresponding sequences of s-projections behave as r.g.’s according to Theorem 1. In $(\Gamma_{t,n,N})_{n \to \infty}$, the expected numbers of s-isolated vertices tend to a positive limit for $n \to \infty$: $E_{t,n,N} X_{s,1} \to \lambda = e^{-c}$. The number of s-isolated vertices tends to a Poisson distribution: $P_{t,n,N}(X_{s,1} = k) \to \lambda^k e^{-\lambda} / k!$ ($k = 0, 1, 2, \ldots$). The limit distribution for the number of s-components with at least two vertices is degenerate: the probability that this number equals 1 tends to 1.

Obviously, we need not care whether $N/t$ is an integer: the null sequence in (18) allows us to round this value up or down to the next integer. Thus, we use $N/t$ without the floor or ceiling operator. Because of (2), we expect $N/t$ edges per layer in a r.m. $\Gamma_{t,n,N}$. That means that under model (A1), the probability of drawing a multigraph with very different numbers of edges in the layers tends to 0 as $n \to \infty$. Therefore, it is not very surprising that under the additional condition $N_l = N/t$ for $l = 1, \ldots, t$, we get the same asymptotic results for both models (B1) and (A1). (Remember that the hypergeometric distribution tends to a binomial distribution under “moderate” conditions as $n \to \infty$.) Like Theorem 2, we can use Theorem 3 to test the homogeneity of a data set by using the
limit distribution of the r.v. $X_{s,1}$. This theorem has an additional advantage: In Section 1, we defined what we may call a “threshold model” for classification by including edges, depending on a threshold vector $\vec{d}$. For this threshold model, both (B1) and (B2) are more suitable than (A1) or (A2) if we want to test the “randomness” of clusters. In fact, we draw $N_l$ edges per layer: either the edges belonging to the smallest local distances according to (B1), or the $N_l(d_l)$ edges per layer for which $d_{ijl} < d_l$ holds. The assumption that the edges are drawn equiprobably and independently of the different layers is somewhat artificial and less suitable than the assumption of drawing a fixed number per layer.

5. Discussion
Graph-theoretical models have advantages when they are used in classification theory: the clusters have defined properties; the probability model for testing the null hypothesis of randomness of clusters is simple; the results are invariant under monotonic transformations of the distance measure; no a priori information about the classes is required. The calculation of t local distances is much easier than the calculation of one global distance between every pair of objects for mixed data, as is proposed in [1]. Further advantages of the multigraph model over the simple graph model are its greater flexibility and the fact that in many cluster problems it is tolerated that objects of the same cluster are dissimilar in some variables if they are similar enough in at least, say, s blocks of variables. The question of how to choose a good value for s must be left open to discussion between the biometrician or statistician and the researcher who wants to perform a cluster analysis. By varying s, the homogeneity of clusters can be controlled. The case s = 1 allows objects to belong to the same cluster if they are similar in just one block. The case s = t, on the other hand, requires that two objects be similar in all blocks before they can belong to the same group.

The multigraph model gives a deeper insight into the structure of the data to be clustered. We see exactly in which layers two objects are similar and thus are adjacent. Part of this information is lost when we use the s-projection for classification, where we only count the number of edges connecting two vertices. This, however, is still more informative than using only a single distance measure for high-dimensional data: in that case we do not know whether two objects are in two disjoint clusters because they are slightly different in all dimensions or because they differ significantly in only one dimension and are similar in the remaining t - 1 dimensions.
The significant disadvantage of graph-theoretically based test procedures is that the triangle inequality holds for distances. That means that no completely random choice of the edges, with the same probability for each edge, is possible even under the null hypothesis [12]. This holds for Ling’s graph model as well as for each dimension or block in the multigraph model. In the s-projection, however, the triangle inequality will not hold for s < t [2] [4]. Thus, for the s-projection of the original multigraph, the randomness of drawing edges can be adopted more easily as a null hypothesis of randomness of clusters. This allows us to use the s-projection of a multigraph $\Gamma_{t,n,N}$ and the related results for ordinary r.g.’s to test the randomness of clusters, and at the same time to take full advantage of the original multigraph model in describing the structure of the data set. We get the same probability model and asymptotic behavior for the s-projections under all multigraph models discussed here. Thus, we suggest using this idea to test the randomness of clusters with formulas from the theory of simple r.g.’s while at the same time taking advantage of the better information from t local distances.
Statistics for testing the randomness of structures found within a data set are more plausible and more easily accepted if they are based on model (B1) or (B2), like Theorem 3. In (A1) and (A2), we expect the same number of edges in each layer of the multigraph, namely $N/t$. This is also the number of edges per layer in (B1) and the expected number of edges per layer in (B2) if we assume $N_1 = \cdots = N_t = N/t$ or $p_1 = \cdots = p_t = p$. In this case the probability models (A2) and (B2) are equal, too. Thus, under these constraints and with $p = N / (t \binom{n}{2})$, we can show
\[ G_{t,n,p} = G_{t,n,(p,\ldots,p)}, \qquad G_{t,n,p} \sim \Gamma_{t,n,N}, \qquad G_{t,n,(p,\ldots,p)} \sim \Gamma_{t,n,(N/t,\ldots,N/t)} . \]
Under the constraint that we expect the same number of edges in every layer, the differences between all probability models are negligible if $n$ is large and $E_{t,n,N} V_s \approx n \log n / 2$, as the comparison of Theorems 2 and 3 shows. It then makes no difference whether we assume the $N$ edges to be drawn completely randomly from all $t \binom{n}{2}$ possible ones or with fixed numbers $N_l = N/t$ for each of the $t$ layers, if $n$ is large and if we choose a threshold vector $\vec{d}$ so that we can expect about the same number of edges in every layer.
Remark 2: It is of interest to know the edge functions $N(n)$ and $N_1(n), \ldots, N_t(n)$ and the properties of r.m.’s for which
\[ \Gamma_{t,n,N} \sim \Gamma_{t,n,(N_1,\ldots,N_t)} \]
is true. We only proved that it is true for $N$ given by (7), $N_l = N/t$, and the distribution of the number of s-isolated vertices. However, under certain constraints that are not too heavy, for example under the assumption that we expect the same number of edges in each of the $t$ layers and that the total number of edges included is not too large, all probability models should be asymptotically equivalent. That means that they should give the same asymptotic results for many other properties of random multigraphs.
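The way Theorem 3 is used in practice can be illustrated numerically. The sketch below assumes that the leading term of the edge sequence (18) is $\frac{1}{2} \binom{t}{s}^{-1/s} n^{2-1/s} (\log n + c)^{1/s}$, which is the choice that makes $n \binom{t}{s} p^s \approx \log n + c$ for $p = N_l / \binom{n}{2}$; the $o(1)$ term is dropped, and the function names are hypothetical.

```python
from math import comb, exp, factorial, log

def edges_per_layer(n, t, s, c):
    """Leading term of the assumed edge sequence per layer (o(1) term dropped)."""
    return 0.5 * comb(t, s) ** (-1.0 / s) * n ** (2.0 - 1.0 / s) * (log(n) + c) ** (1.0 / s)

def isolated_vertex_limit_pmf(k, c):
    """Limiting Poisson probability of exactly k s-isolated vertices, lambda = exp(-c)."""
    lam = exp(-c)
    return lam ** k * exp(-lam) / factorial(k)
```

For $t = s = 1$ the edge count reduces to $\frac{1}{2} n (\log n + c)$, the classical edge sequence for isolated vertices in ordinary random graphs.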
For the null sequences in (18), we can use the same functions as those found for (7), which again allows us to use the asymptotic results already for fairly small values of n ( n 2 0). Theorem 3 also shows a simple way to determine the number of edges to get a powerful test statistic if we can assume mutually independent layers: for every layer, we choose $N_l$ edges with $N_l$ given by (18). This means that we choose the $100 \times N_l / \binom{n}{2}$ per cent quantile of the local distances as threshold $d_l$ for every layer $l$ ($l = 1, \ldots, t$). In this case, we know that we can expect $e^{-c}$ s-isolated vertices and can choose an optimal parameter c for the asymptotic Poisson distribution of the number of s-isolated vertices (or c = 0, to get a one-sided test). This technique guarantees the choice of a good test procedure, since we get the best possible number of edges for using the asymptotic results of Theorem 3 to test the randomness of clusters. Another advantage of using (18) for the number of edges to be included is that for such a large number of edges, we do not expect more than one proper component or s-component (component with more than a single vertex), as Theorems 1-3 indicate. As a consequence of these theorems, the limit distribution of the number of s-components in r.m.’s, diminished by one, is that of the number of s-isolated vertices. Thus, for n not too small and N given by (1), (7), or (18), respectively, the presence of more than one proper component indicates “real clusters” in a data set.

The procedures which are suggested here as test statistics to test the hypothesis of randomness of clusters are based on the idea that clusters can be defined as subgraphs, and on the assumption that the distances are independent r.v.’s and are interpreted as weights of edges which can be drawn at random. No formal optimality properties are known. We also do not know the sort of alternative hypotheses in classification theory against which the tests suggested here have high power.

We discussed that the partition of variables into blocks on the basis of their “type” (combining all binary variables into a block, all continuous variables into another block, and so on) has advantages in terms of the choice of the local similarity measures. Quite often, however, this partition is not good for a data set. Then the variables divide into natural groups like social, economic and political ones. In this case, the variables within one block are not all of the same type. The researcher then either can divide every such block into “sub-blocks” of variables of the same type (which may overemphasize groups of variables against others since they get more layers) or has to calculate distances from “mixed data”. Here, scaling methods can be used to transform the variables of each block to the same scale level.

The concept of the s-projection as we used it here implicitly weights all blocks equally. In some situations, however, it is necessary to attach greater weight to some variables than to others. One possible way is to choose the layers according to the importance of the variables.
1. For “not so important” variables, the objects must be similar in all those layers or variables.
2. For “very important” variables, the objects must be similar in at least one of these variables (here, each variable possibly defines a separate layer).
3. For “rather important” variables, the objects must be similar in some (at least s) of these variables.
Let $t_1$ very important, $t_2$ rather important and $t_3$ unimportant variables be given. This gives three multigraphs, with $t_1$, $t_2$ and $t_3$ layers, and corresponding s-projections with $s_1 = 1$, $s_2 = s$ and $s_3 = t_3$; let $p_1^*$, $p_2^*$ and $p_3^*$ denote the probabilities of an edge in these projections. Usually, we get $p_1^* > p_2^* > p_3^*$. The new multigraph $G_{3,n,(p_1^*,p_2^*,p_3^*)}$ and its 3-projection, respectively, define the clusters in the sample (for example, as components of this 3-projection). By this procedure, the more important variables get greater weight in the construction of the clusters.

Instead of simply counting the numbers of edges which connect two vertices $i$ and $j$, we can give to every layer in a completely labeled multigraph a weight $1/t$. Now, a pair $(i, j)$ of vertices is s-fold connected if the sum of the weights of those layers where $i$ and $j$ are connected by an edge is at least $s/t$. This approach can be generalized by attaching different weights $g_l$ to different layers $l$, with $\sum_{l=1}^{t} g_l = 1$, but leaving the threshold $s/t$ unchanged. This allows us to prefer certain layers with important variables by giving them greater weights, and thus increases their contribution to the sum of weights for every pair of vertices. This idea seems to be especially useful for the concept of a general classification model; we are doing some research work in that direction.
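The weighted notion of an s-fold connection described above can be made concrete as follows. This is a minimal sketch, not the author's procedure: a multigraph is assumed to be given as a list of $t$ edge lists over the vertices $0, \ldots, n-1$, and the names `s_projection`, `s_isolated` and `weights` are illustrative.

```python
from itertools import combinations

def s_projection(n, layers, s, weights=None):
    """Pairs (i, j), i < j, whose total layer weight reaches the threshold s / t."""
    t = len(layers)
    if weights is None:
        weights = [1.0 / t] * t          # equal weights reproduce plain edge counting
    edge_sets = [{frozenset(e) for e in layer} for layer in layers]
    projection = set()
    for i, j in combinations(range(n), 2):
        pair = frozenset((i, j))
        total = sum(w for w, edges in zip(weights, edge_sets) if pair in edges)
        if total >= s / t - 1e-12:       # small tolerance for floating-point sums
            projection.add((i, j))
    return projection

def s_isolated(n, projection):
    """Vertices that have no edge in the s-projection."""
    touched = {v for edge in projection for v in edge}
    return [v for v in range(n) if v not in touched]
```

With equal weights the function returns the ordinary s-projection; unequal weights summing to 1 let a single important layer reach the threshold on its own.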
Testing the randomness of clusters is possible with different probability models for random graphs and multigraphs. The number of isolated vertices found in a sample can be taken as test criterion if the number of edges is not too small; other criteria like the number of cycles are also possible. However, some important questions remain:
(a) How large must n be to use asymptotic results like the Poisson law for the number of isolated vertices?
(b) How can we model possible dependencies between different layers?
(c) How must $(p_1^*, p_2^*, p_3^*)$ be chosen to provide us with powerful test statistics?
(d) What does the null hypothesis of randomness of edges mean in practice?
Remark 3: Problem (c) is connected to the following problem. For which choice of $p_1, \ldots, p_t$ or $N_1, \ldots, N_t$ does $p_s^*$ take its maximum, given that either the sum $p_1 + \cdots + p_t = tp$ (that means $EN$) or the sum $N_1 + \cdots + N_t = N$ is kept fixed or is a function of $n$? It follows from straightforward calculations that for $s = 1$, we have to choose all edges from one layer, or choose $p_1 = tp$, $p_2 = \cdots = p_t = 0$, if we want to maximize $p_s^*$. For $s = t$, similar calculations show that we must choose an equal number of edges per layer, or put $p_1 = \cdots = p_t = p$. For the case $1 < s < t$, we must consider two subcases. For $\max p_l < (s - 1)/(t - 1)$, using the concept of Schur convex functions as in [13], we can show that we must choose an equal number of edges per layer ($N_1 = \cdots = N_t = N/t$ or $p_1 = \cdots = p_t = p$) if we want to maximize $p_s^*$. For $\min p_l > (s - 1)/(t - 1)$, however, the same concept shows that an equal number of edges per layer is the worst choice: this choice minimizes $p_s^*$. Is the best choice p1 = ... = ps = ps+l = p, for this subcase? For limit theorems for the probability models (B1) or (B2), we have in most cases $\max p_l \to 0$ or $N_l = o(n^2)$. That means that we consider the cases $p_1 = \cdots = p_t = p$ or $N_1 = \cdots = N_t = N/t$ if we want to get a maximal probability $p_s^*$.

6. More Applications of Random Multigraphs
The distribution of the number of s-connections can be used to derive a nonparametric test for the independence of the different variables or attributes of a sample C (if single variables define the layers in the multigraph model) or of the blocks of variables. Let ρ = 0 be the null hypothesis (uncorrelated blocks), and let ρ > 0 (blocks with positive correlations) be the alternative. Under the null hypothesis, we expect average numbers of s-connections in the multigraph for any s and for any threshold vector. Under the alternative, more than the average number of edges should connect the same pairs of vertices. From this it follows that for s > 1 more than the average number of s-connections can be expected, while the number of 1-connections should be below the average (compare Remark 3, where we also got different results for different values of s).

In Paragraphs 5.4 and 6.2 of [2], we derived asymptotic and exact formulas for the distribution of $V_s$, the number of s-connections, for model (A1). Meanwhile, we have been able to prove similar formulas for model (B1), which is more adequate for testing the correlation between different blocks. While the exact distribution is mostly of academic interest, the “translation” of Theorem 5-16 in [2] to the framework of model (B1) may be of interest. Thus, it is stated here.
Theorem 4: In sequences $(\Gamma_{t,n,(N/t,\ldots,N/t)})_{n \to \infty}$ of r.m.’s with $t$ layers, $n$ vertices, and
\[ N_l = \frac{1}{2} \binom{t}{s}^{-1/s} n^{2 - 2/s} (c + o(1))^{1/s} \qquad (c > 0) \tag{19} \]
edges per layer, the expected numbers of s-connections satisfy $E_{t,n,N} V_s \to c/2$. For edge sequences as above, this sequence of expectations remains bounded and tends to a positive limit, and $V_s$ tends to a $P(c/2)$-distribution,
\[ P_{t,n,N}(V_s = k) \to \frac{(c/2)^k}{k!} e^{-c/2} \qquad (k = 0, 1, 2, \ldots). \tag{20} \]
Furthermore, $E_{t,n,N} U_{s;i\cdot} = (c + o(1))/n$ and $E_{t,n,N} X_{s,1} = n \exp(-(c + o(1))/n)$. For edge sequences following (19), the sequence of expectations of $X_{s,2}^{(T)}$, the number of s-trees of size 2 (isolated pairs of s-connected vertices), remains bounded and tends to the same positive limit, and $X_{s,2}^{(T)}$ tends to a $P(c/2)$-distribution,
\[ P_{t,n,N}(X_{s,2}^{(T)} = k) \to \frac{(c/2)^k}{k!} e^{-c/2} \qquad (k = 0, 1, 2, \ldots). \tag{21} \]
Both results of Theorem 4, (20) or (21), can be used to test for positive correlations in the same way as the results of Theorems 1-3 have been used to test the homogeneity: if for s > 1 we get too many s-connections or s-trees of size 2 in a sample C, then we can reject the null hypothesis ρ = 0. The same holds if we get fewer 1-connections or 1-trees of size 2 than expected.

At the moment, we study the asymptotic behavior of r.m.’s for both models (B1) and (B2) under different constraints for the numbers $N_l$ (or probabilities $p_l$). We hope to find other criteria which allow us to derive nonparametric test statistics for certain properties in data sets which can be modeled by random multigraphs.
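A test based on Theorem 4 amounts to comparing an observed count with the limiting $P(c/2)$ law. The sketch below computes one-sided tail probabilities under that limit; it assumes that the asymptotic distribution is an adequate approximation for the sample at hand, which the paper leaves as an open question, and the function names are hypothetical.

```python
from math import exp

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    term, total = exp(-lam), 0.0
    for i in range(k + 1):
        total += term
        term *= lam / (i + 1)
    return total

def upper_p_value(observed, c):
    """P(V >= observed) under the limiting P(c/2) law (too many s-connections, s > 1)."""
    if observed <= 0:
        return 1.0
    return 1.0 - poisson_cdf(observed - 1, c / 2.0)

def lower_p_value(observed, c):
    """P(V <= observed) under the limiting P(c/2) law (too few 1-connections)."""
    return poisson_cdf(observed, c / 2.0)
```

For example, with c = 2 the limit is a Poisson distribution with mean 1, and six or more s-connections have probability below 0.001 under the null hypothesis.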
Acknowledgement

Investigations concerning Problems (b) and (c) and Remark 3 have been stimulated by J. Jaworski from the Adam Mickiewicz University in Poznań (Poland). The results on these topics arose from our joint work on multigraphs during one of his visits to Düsseldorf and Bielefeld in May 1990.
References
[1] H.H. Bock; Clusteranalyse - Überblick und neuere Entwicklungen, OR Spektrum, 1, 211-232 (1980).
[2] E. Godehardt; Graphs as Structural Models: The Application of Graphs and Multigraphs in Cluster Analysis, 2nd edition, Vieweg, Braunschweig - Wiesbaden (1990).
[3] R.F. Ling; A probability theory of cluster analysis, J. Amer. Statist. Assoc., 68, 159-164 (1973).
[4] E. Godehardt and H. Herrmann; Multigraphs as a tool for numerical classification, Classification and related methods of data analysis, Proc. 1st Conf. of the International Federation of Classification Societies, H.H. Bock (editor), North-Holland, Amsterdam - New York, 219-228 (1987).
[5] H.H. Bock; On some significance tests in cluster analysis, J. Classification, 2, 77-108 (1985).
[6] J.A. Hartigan; Statistical theory in clustering, J. Classification, 2, 63-76 (1985).
[7] P. Erdős and A. Rényi; On the evolution of random graphs, Publ. Math. Inst. Hung. Acad. Sci., 5, 17-61 (1960).
[8] E. Godehardt; The connectivity of random graphs of small order and statistical testing, Random Graphs ’87. Proceedings of the 3rd International Seminar on Random Graphs, M. Karoński, J. Jaworski, and A. Ruciński (editors), Wiley, New York, 61-72 (1990).
[9] E.N. Gilbert; Random graphs, Ann. Math. Statist., 30, 1141-1144 (1959).
[10] E. Godehardt; Limit theorems applied to random multigraphs of small order, Graph Theory Notes of New York XVII, New York Academy of Sciences, 36-45 (1989).
[11] E. Godehardt; Multigraphs for the uncovering and testing of structures, Classification, Data Analysis, and Knowledge Organization: Models and Methods with Applications, Proceedings 14th Annual Conference of the Gesellschaft für Klassifikation e.V., H.H. Bock and P. Ihm (editors), Springer, Berlin - Heidelberg - New York, 43-52 (1991).
[12] M. Eigener; Konstruktion von 2-Stichproben-Tests mit Hilfe clusteranalytischer Methoden, Diplomarbeit, Institut für Mathematische Stochastik der Universität, Hamburg (1976).
[13] S. Ross; A random graph, J. Appl. Prob., 18, 309-315 (1981).
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 109-126 (1993) © 1993 Elsevier Science Publishers B.V. All rights reserved.
SOLVED AND UNSOLVED PROBLEMS IN CHEMICAL GRAPH THEORY
Alexandru T. BALABAN Department of Organic Chemistry, Polytechnic Institute Splaiul Independentei, Bucharest, ROUMANIA
Abstract Chemistry and graph theory meet in several areas which are briefly reviewed. A few solved and unsolved problems are discussed: generalized centers in cyclic graphs; irreducible sequences in polymers; cages; spectral graph theoretical problems; k-factorable graphs with k > 1, and perfect matchings with k = 1.
1. Introduction
Points of contact between graph theory and chemistry have existed from the very beginning of graph theory. It is an established fact that graph theory was born from three independent areas: mathematics, via Euler’s famous problem of the seven bridges of Königsberg; electricity, via Kirchhoff’s electrical network theory; and chemistry, via Cayley’s enumeration of alkane isomers. The latter problem continued to attract both mathematicians and chemists, and led to Pólya’s celebrated enumeration theorem. Even Sylvester, who coined the name “graph”, was fascinated by the theory of chemical structure in organic chemistry, due mainly to Kekulé. A brief review entitled “Early History of the Interplay between Graph Theory and Chemistry” was published as the first chapter of a monograph Chemical Applications of Graph Theory [1].

At present, chemistry and graph theory are expanding very rapidly because both are faced with challenging problems, and in both cases the unknown peaks lie close at hand, needing fewer approaching expeditions and base camps than other scientific disciplines. By cross fertilization, during the last 30 years, the interdisciplinary area of graph-theoretical applications in chemistry has become a recognized research field with its own journals [2] [3], monographs and symposia [4]-[11].

Describing my own experience in this field, after organic chemistry, I registered as a student in mathematics. However, I could not finish the latter studies because a third opportunity (of “once only” type) arose for a one-year training program in nuclear physics and radiochemistry. Thus, as a fresh Ph.D., in 1959 I published my first chemistry papers. In my first article I tried to solve a graph-theoretical problem, connected with the enumeration of all possible monocyclic aromatic systems [12]. It was actually the “necklace problem” with restrictions as to adjacencies, and it had several sequels [13] [14].
From the outset, two facts became clear: (i) in order to work in a borderline area, one has to be familiar with all relevant disciplines; (ii) the best results are obtained by cooperation between specialists, provided they bridge the barriers of terminology and publication style characteristics peculiar to each discipline.

At present, among the areas of intensive activity in graph-theoretical applications in chemistry, one may cite:
(a) Quantitative structure-property (or activity) relationships (QSPR or QSAR), especially for drug design, using topological indices;
(b) Reaction networks, including retro-fragmentations for the design of organic syntheses;
(c) Coding and nomenclature of chemical structures, including input and retrieval of chemical information for documentation purposes.
All of these areas have reached the stage of providing successful commercial services. Some of the computer programs are commercially available.

The main idea behind most of the graph-theoretical applications in chemistry is the one-to-one correspondence between chemical structures and their constitutional graphs, wherein atoms are represented by vertices, and covalent bonds by edges. It is customary in organic chemistry to use hydrogen-depleted graphs, where only non-hydrogen atoms are indicated by vertices. Chemical structures are discrete entities, whereas their properties vary continuously. Graph theory aims at unique representation (alphanumerically), coding, ordering, and enumeration of all possible chemical structures.

As of 1990, more than ten million compounds have been reported in the chemical literature, and at the present rate of growth, this number will double in about 20 years. Molecular formulas (such as C12H22O11 for sugar) can be easily manipulated, ordered, and retrieved. However, the huge number of isomers (that is, substances having the same molecular formula and differing in their structure due to differences in the topological or geometrical mode of atom bonding) complicates the situation and requires help from graph theorists. In retrieving information, chemistry can manipulate structures which are represented by graphs without the need of using words, since words are often ambiguous or imprecise. Direct access to Chemical Abstracts, Beilstein, or Gmelin databases allows a chemist to learn in a few minutes whether a given structure or substructure has been described in the last few decades. In the near future, all 10^7 compounds will be included in these databases. Thus for most purposes, namely those where structural formulas rather than words are involved, chemists can safely assert that chemistry is the best documented science.
Physical chemistry, or chemical engineering, which need words or keywords, are only as well documented as mathematics, physics, medicine, or law.

The present paper will present a few solved or unsolved problems involving chemical graphs, in the hope that the latter problems will serve as challenge and incentive for graph theorists. It should be made clear that the choice of topics is subjective and is linked to personal interests rather than to the intrinsic importance of the problems.

2. Graph Centers, Chemical Nomenclature and Documentation
Traditional chemical nomenclature and documentation is based upon the system adopted by the International Union of Pure and Applied Chemistry (IUPAC), which seeks in acyclic structures the longest linear chain. When several such chains are present, there exist hierarchical rules for making a unique choice. This chain is numbered starting from one end; again, the choice of the canonical end is governed by elaborate rules. It is easy to see that this system is cumbersome. Still more intricate are the rules for the IUPAC nomenclature of polycyclic systems. Significant progress was achieved on a graph-theoretical basis in the proposed nodane nomenclature which, however, is still based on the longest chain [15] [16]. The enumeration of 4-trees (trees with vertex degree at most four) in terms of both the number of vertices (carbon atoms in hydrocarbons) and the number of vertices in the longest chain was effected by means of Pólya's theorem, using a specially devised computer program [17]. The unique and simple graph center or centroid of any tree has led Read to devise a centric representation for acyclic chemical compounds [18]. He has also proposed an extension of these ideas to cyclic chemical systems [19]. In cooperation with Bonchev, Mekenyan and Randić [20] [21] we have developed an algorithm for finding the generalized center for any graph. Briefly, the algorithm consists in finding sequentially:
A. (i) minimum vertex eccentricity; (ii) minimum vertex distance sum; (iii) minimum number of occurrences of the largest distance (or, when this is the same for two or more vertices, the next largest distance, etc.);
B. The same parameters as in A.(i)-(iii) but for edges.
Thus, vertices and edges are ranked into equivalence classes (the most central vertices/edges have the smallest ranks). When the same rank sum is obtained from different summands, priority is given to the partition containing the smallest rank. Then the above steps are iterated, replacing the word distance by rank, until the ranking of vertices and edges undergoes no further modification. The center consists of the vertex/vertices with lowest rank. Although this idea substantially reduces the number of central vertices in polycyclic graphs, allowing in principle the simplification of chemical coding or naming by analogy with Read's approach, we feel that the last word in this respect has not yet been said.
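The first pass of criteria A.(i)-(iii) for vertices can be sketched as below. This is a simplified illustration only: it applies the three criteria once, to vertices of a connected graph, and does not implement the edge criteria of step B or the iterative re-ranking described above. The adjacency-dictionary representation and the names `vertex_center` and `from_edges` are assumptions of this sketch.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def vertex_center(adj):
    """Vertices minimizing (eccentricity, distance sum, counts of the largest distances)."""
    keys = {}
    for v in adj:
        d = [x for u, x in bfs_distances(adj, v).items() if u != v]
        ecc = max(d)
        # occurrences of distance ecc, ecc - 1, ..., 1, compared in that order
        counts = tuple(d.count(k) for k in range(ecc, 0, -1))
        keys[v] = (ecc, sum(d), counts)
    best = min(keys.values())
    return sorted(v for v, key in keys.items() if key == best)

def from_edges(edges):
    """Build an adjacency dictionary from a list of undirected edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj
```

Lexicographic comparison of the key tuples applies the criteria in the stated order, so a later criterion only matters among vertices that tie on all earlier ones.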
3. Irreducible Sequences in Polymers
Polymers (natural or synthetic) are essential for life (for example, proteins, nucleic acids) and for civilized life (for example, plastics, composites, elastomers). In stereoregular polymers, the three-dimensional configuration of chiral atoms can lead to various types of sequences which may be detected experimentally by nuclear magnetic resonance or by the thermal/mechanical properties. In isotactic polypropylene all configurations are of the same kind (...RRR...) and this polymer has higher strength and melting point than irregular (atactic) polypropylene. Alternating configurations (...RSRSRS...) are encountered in syndiotactic polymers. In binary copolymers the two comonomers can also give rise to various sequences. Let the comonomers in higher copolymers (ternary, quaternary, etc.) be denoted by R, S, T, U. All irreducible sequences of these comonomers, whose infinite repetition leads to a polymer chain, have been enumerated [22]-[24]. The basic idea is to start from the necklace problem and to eliminate those necklaces (with m beads of n colors) which on opening and linking into an infinite chain are reducible to smaller necklaces. Table 1 and Table 2 present the numbers of irreducible sequences as well as the sequences themselves for the simplest cases.

Table 1: Numbers of necklaces (NK) and of irreducible sequences (IS) for each partition R^r S^s ... U^u (r + s + ... + u = m), and total numbers N(m,n) of irreducible sequences for given numbers n of comonomers and m of mers in the repeating irreducible sequence.

n   m   Partition   NK   IS   N(m,n)
2   2   RS           1    1    1
2   3   R2S          1    1    1
2   4   R3S          1    1    2
        R2S2         2    1
2   5   R4S          1    1    3
        R3S2         2    2
2   6   R5S          1    1    5
        R4S2         3    2
        R3S3         3    2
2   7   R6S          1    1    8
        R5S2         3    3
        R4S3         4    4
3   3   RST          1    1    1
3   4   R2ST         2    2    2
3   5   R3ST         2    2    5
        R2S2T        4    3
3   6   R4ST         3    3   13
        R3S2T        6    6
        R2S2T2      11    4
3   7   R5ST         3    3   31
        R4S2T        9    9
        R3S3T       10    7
        R3S2T2      18   12
4   4   RSTU         3    1    1
4   5   R2STU        6    2    2
4   6   R3STU       10    3   11
        R2S2TU      16    8
4   7   R4STU       15    4   33
        R3S2TU      30   17
        R2S2T2U     48   12
Table 2: Irreducible sequences with n = 2, 3 or 4 (binary, ternary or quaternary copolymers) and sequence lengths m = 2 through 7.

n = 2
m = 2: RS
m = 3: RRS
m = 4: RRRS, RRSS
m = 5: RRRRS, RRRSS, RRSRS
m = 6: RRRRRS, RRRRSS, RRRSSS, RRRSRS, RRSRSS
m = 7: RRRRRRS, RRRRRSS, RRRRSSS, RRRRSRS, RRRSRSS, RRRSRRS, RRSRRSS, RRSRSRS

n = 3
m = 3: RST
m = 4: RRST, RSRT
m = 5: RRRST, RRSRT, RRSST, RRSTS, RSRST
m = 6: RRRRST, RRRSST, RRSSTT, RRRSRT, RRRSTS, RRSTST, RRSRRT, RRSRST, RRSTTS, RRSRTS, RSRTST, RRSRTT, RSRSRT
m = 7: RRRRRST, RRRRSST, RRRRSRT, RRRRSTS, RRRSRRT, RRRSRST, RRRSRTS, RRRSRTT, RRSRRST, RRSRRTT, RRSRSRT, RRSRTRS, RRRSSST, RRRSSTT, RRRSSTS, RRRSTST, RRSRSST, RRSRSTS, RRSRTSS, RRSSRTS, RSRSRST, RRRSTTS, RRSRSTT, RRSRTST, RRSRTTS, RRSSRTT, RRSTRST, RRSTRTS, RRSTSTS, RSRSTRT, RSRTRST

n = 4
m = 4: RSTU
m = 5: RRSTU, RSRTU
m = 6: RRRSTU, RRSRTU, RSRTRU, RRSSTU, RRSTSU, RRSTTU, RRSTUS, RSRSTU, RSRTSU, RSRTUT, RSTRSU
m = 7: RRRRSTU, RRRSSTU, RRSSTTU, RRRSRTU, RRRSTSU, RRSSTU?, RRSRRTU, RRSRTRU, RRRSTTU, RRRSTUS, RRSRSTU, RRSRTSU, RRSRTTU, RRSRTUS, RRSRTUT, RRSRTUU, RRSSRTU, RRSTRSU, RRSTRTU, RRSTRUS, RSRSRTU, RSRSTRU, RSRTRSU, RRSTSTU, RRSTSUS, RRSTSU?, RRSTTSU, RRSTUST, RRSTUTS, RSRSTUT, RSRTSTU, RSRTSUT, RSTRSTU
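The enumeration idea (start from necklaces and discard those that reduce to a shorter repeating block) can be checked by brute force for small m and n. The sketch below is not the method of [22]-[24]; it assumes that two sequences count as the same when they differ only by cyclic rotation, reading direction, or renaming of the comonomers, an interpretation chosen here because it reproduces the small N(m,n) values tested below. All function names are illustrative.

```python
from itertools import permutations, product

def is_reducible(seq):
    """True if the cyclic sequence is a repetition of a shorter block."""
    m = len(seq)
    return any(m % d == 0 and seq == seq[d:] + seq[:d] for d in range(1, m))

def canonical(seq, n):
    """Smallest image of seq under rotation, reversal and renaming of the n symbols."""
    m = len(seq)
    best = None
    for perm in permutations(range(n)):
        renamed = tuple(perm[x] for x in seq)
        for s in (renamed, renamed[::-1]):
            for r in range(m):
                cand = s[r:] + s[:r]
                if best is None or cand < best:
                    best = cand
    return best

def count_irreducible(m, n):
    """Number of irreducible sequences of length m that use all n comonomers."""
    classes = set()
    for seq in product(range(n), repeat=m):
        if len(set(seq)) == n and not is_reducible(seq):
            classes.add(canonical(seq, n))
    return len(classes)
```

The search is exponential in m and is meant only as a cross-check for the short sequences tabulated above, not as a practical enumeration method.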
4. Cages and Reaction Graphs

Unlike the constitutional (molecular) graphs discussed so far, in which vertices symbolize atoms and edges symbolize covalent bonds, in the graphs about to be discussed a vertex represents a molecule or a reactive intermediate, and an edge represents an elementary reaction step. Such graphs are termed reaction graphs. Two isomorphic graphs can result from quite different chemical contexts:
(i) Rearrangements of carbocations (Scheme 1). The scheme depicts the two (unordered) substituents linked to the positively charged carbon atom, and this can be either at the left or right of the C-C bond symbolized by a period [25]-[27]. The reaction step involves the shift of a substituent from the vertex of degree four to that of degree three.
Scheme 1: Portion of the reaction graph for rearrangements of carbocations.

(ii) Pseudorotation of pentacoordinated compounds. This is exemplified by phosphoranes with pentavalent phosphorus at the center of the trigonal bipyramid [28] [29] (Scheme 2). This scheme indicates the two (unordered) apical substituents, and the period discriminates between the two resulting enantiomers. The reaction step involves the conversion of the two apical substituents, mutually situated at an angle of 180°, into equatorial substituents at an angle of 120°; the third equatorial substituent of the new configuration stays fixed during the rearrangement (pivot substituent); the remaining two formerly equatorial substituents become the new apical substituents by increasing their angle from 120° to 180°.

Scheme 2: Portion of the reaction graph for pseudorotation of trigonal-bipyramidal compounds.

The resulting graph is bipartite, regular of degree three (cubic graph), has 20 vertices, and is known as the Desargues-Levi graph (see Figure 1).
Figure 1: Reaction graph corresponding to Scheme 1 and Scheme 2 (Desargues-Levi graph with 20 vertices), and the Petersen graph with 10 vertices (5-cage).

If in (i) the two carbon atoms are indistinguishable (no isotopic label), or if in (ii) enantiomerism is ignored, the period in the above notation vanishes and, by pairwise identification of the antipodes, the above graph reduces to the 5-cage (Petersen graph) with 10 vertices [25] [30].

A g-cage or (3,g)-cage is defined to be a cubic (trivalent) graph with girth g having the smallest number of vertices [31]-[33]. On examining the known cages it is evident that they are related to each other (see Figure 2). For the cages with odd girth only one representation with a (g + 1)-circuit is shown, but for those with even girth two representations with g- and (g + 2)-circuits are presented [34]. It is easy to see how, on excising trees (shaded area in Figure 2) from the even-g cages in the representation with (g + 2)-circuits, one obtains the cage with girth equal to g - 1. Table 3 shows the conjectured excised trees. One exception is the 9-cage. At this time, Biggs [35], Evans [36], and McKay [37] have found eighteen such 9-cages having low symmetries with 58 vertices. The excision procedure leads to (3,9)-graphs with 60 vertices, starting with one of the three known 10-cages [38] [39] with 70 vertices, as is shown in Figure 3 [38]. The same procedure, applied to the unique known 12-cage (or Benson graph) [40], leads to the conjectured (uniquely so at this time) 11-cage with 112 vertices [34], shown in Figure 4 and Figure 5. Nothing is known about cages with girth higher than 12; the low symmetry of the (3,9)-graphs with 58 vertices raises the question whether lower numbers of vertices might perhaps lead to higher symmetries in this case. A challenge for programmers would be to devise a computer game which would highlight the high symmetry of most cages. Table 4 shows the girth g and order n of the known trivalent cages.
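The relation between the two reaction graphs can be checked with a small Python sketch. It rests on two modeling assumptions that go beyond the text: a vertex of the reduced graph is labeled by its unordered pair of apical substituents out of {1, ..., 5}, with two vertices adjacent when the pairs are disjoint (the pivot is the fifth substituent), and the 20-vertex graph is obtained by attaching an enantiomer marker that flips at every step. The function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def petersen():
    """Vertices are 2-subsets of {1..5}; adjacent when the subsets are disjoint."""
    pairs = [frozenset(c) for c in combinations(range(1, 6), 2)]
    return {a: {b for b in pairs if not (a & b)} for a in pairs}


def bipartite_double_cover(adj):
    """Two copies of each vertex; every edge joins copies with opposite markers."""
    cover = {}
    for a in adj:
        for side in (0, 1):
            cover[(a, side)] = {(b, 1 - side) for b in adj[a]}
    return cover


def girth(adj):
    """Length of a shortest cycle, found by breadth-first search from every vertex."""
    best = float("inf")
    for root in adj:
        dist = {root: 0}
        parent = {root: None}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    parent[w] = u
                    queue.append(w)
                elif parent[u] != w:
                    # A non-tree edge closes a cycle through the BFS tree.
                    best = min(best, dist[u] + dist[w] + 1)
    return best


small = petersen()
large = bipartite_double_cover(small)
print(len(small), girth(small))
print(len(large), girth(large))
```

Both graphs come out cubic; the smaller one has 10 vertices and girth 5, the larger one 20 vertices and girth 6, in line with the orders quoted above.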
On comparing the numbers of automorphisms of the tetrahedron (namely, 12 automorphisms) and the 3-cage (24 automorphisms), both having n = 4 vertices, it is evident that the symmetry operations for the graph are much more numerous than for the corresponding polyhedron. It would be interesting to see on the screen (in the game), by various edge colorings, all edge and s-path automorphisms of the Petersen graph (5-cage) or the Tutte graph (8-cage); the former graph has 120 automorphisms and is 3-regular (3-unitransitive), while the latter has 1440 automorphisms and is 5-regular.

Figure 2: Representation of cages by bridging opposite vertices in circuits with paths. Excision of shaded trees converts a g-cage with even g into a (g - 1)-cage.

Table 3: Excised trees from g-cages with even g, for converting them into (g - 1)-cages.

5. Spectral Graph Theory

In the well-known Hückel molecular orbital (HMO) theory, the eigenvalues of graphs whose vertex degrees are at most three are of three types: negative, representing bonding π-molecular orbitals (BMO's); positive, representing anti-bonding π-MO's (ABMO's); and zero, corresponding to non-bonding π-MO's (NBMO's). Normally, for most cyclic or acyclic molecules having an even number of carbon atoms in conjugated systems, the number of BMO's equals that of ABMO's, and there is no NBMO. The homodiatomic triple-bonded nitrogen molecule (N2) has a very high stability because all BMO's are filled with electrons, there is no NBMO, and all ABMO's are vacant. Exactly the same situation and hence stability occurs in aromatic molecules such as benzene with 4k + 2 π-electrons, where k = 0, 1, 2, ... .

In polycyclic molecules having delocalized π-electrons, various situations may occur. One of the most peculiar and challenging is to have no NBMO's, and to have more positive than negative eigenvalues, or vice versa, as was first pointed out by Bochvar and Stankevich. Several examples are gathered below, but the general rules (that is, necessary and sufficient conditions) are not yet clear. Two classes exist [41]:

Class A: Graphs with an excess of negative eigenvalues over positive ones: 2j pairs (j = 1, 2, ...) of (4k + 1)-membered rings condensed (that is, sharing one edge) directly, or via one (4k + 2)-membered ring, or via two 4k-membered rings, in a centro-symmetrical arrangement. Examples are shown in Figure 6.

Class B: Graphs with an excess of positive eigenvalues over negative ones. Similar to class A, but with 2j pairs of (4k + 3)-membered rings. Examples are presented in Figure 7.
Figure 3: Two representations for one of the three 10-cages [38].

Figure 4: Derivation of the conjectured 11-cage from Benson's 12-cage by excising a tree with 14 vertices.

6. k-Factorable Graphs with k > 1

Decomposition of graphs into congruent factors has interesting chemical implications, the most important of which will be described in the next section. Here we shall discuss a less studied application, namely decomposition into factors with at least three vertices. Terpenoids are naturally occurring compounds having a polyisoprenic skeleton, corresponding to factors with five vertices in a branched chain having a vertex of degree three. Living cells synthesize terpenoids via the reaction of acetyl-coenzyme A (1) with acetoacetyl-coenzyme A (2), which affords mevalonic acid (3). This is phosphorylated and decarboxylated, yielding geraniol pyrophosphate 4 (the pyrophosphate unit is symbolized by OPP), which by sequences of reactions leads to acyclic compounds such as farnesol 5 or rubber 6, while by cyclization squalene 7 yields cholesterol 8 and its derivatives (see Scheme 3). Other isoprenoid graphs are shown in Scheme 4: monoterpenes such as pinene 9, para-cymene 10, camphor 11; sesquiterpenes: guajazulene 12, vetivazulene 13; diterpenes such as retinol (vitamin A) 14. In all the above cases the isoprene unit (factor) is shown with full lines, and dotted lines link these units. In all terpenoids, the molecular graph is decomposable into congruent factors. The problem in chemistry is to detect whether a given graph is factorable into similar isoprenoid factors, and vice versa to generate such polyisoprenoid graphs. In collaboration with Professor S. Marcus from the Faculty of Mathematics of Bucharest University, by using picture grammars and push-down automata, several computer programs were devised for this purpose [42]-[44]. It would be interesting to apply other methods to this problem, and to generalize the problem for other k-factors.
Figure 5: The conjectured 11-cage. Inner vertices, having other vertices at distance eight, belong to a different orbit from the outer vertices [34].

Table 4: Girth g and order n of the known trivalent cages.

g    3    4    5    6    7    8    9    10   11   12
n    4    6    10   14   24   30   58   70   112  126

Figure 6: Examples of class A graphs.

Figure 7: Examples of class B graphs.

Scheme 3: Examples of terpenoids (polyisoprenoid graphs) and their biosynthesis.

Scheme 4: Examples of polyisoprenoid graphs.

7. Perfect Matchings (Factorable Graphs)

In chemistry, molecular graphs that can be decomposed into 1-factors (K2 graphs) have a special significance, especially for polyhex graphs. Such graphs represent polycyclic aromatic hydrocarbons (PAH's) and they have higher stability when they are 1-factorable than when they are not, or when they have more such factorizations (also called perfect matchings, or Kekulé structure counts) [45] [46]. A necessary but not sufficient condition for the graph to have at least one Kekulé structure is that it has an even number of vertices. Recently, a set of necessary and sufficient rules for polyhexes to be 1-factorable was published [43]. Examples of even-numbered polyhexes which have no 1-factorization (called concealed non-Kekuléan), 15 and 16, are presented in Scheme 5.

Polyhexes (PAH's or benzenoids) are of three types: catafusenes, perifusenes, and coronafusenes (coronoids). As indicated by Balaban and Harary [47], the dualist (characteristic) graph is a useful criterion for discriminating among these three types. Its vertices are the centers of hexagons and its edges connect condensed hexagons (that is, hexagons sharing two adjacent vertices, representing two carbon atoms). Unlike graphs, the angle between edges of dualist graphs is important. Unlike dual graphs, in dualist graphs there is no vertex corresponding to the outer region. The dualist graphs of catafusenes are trees; those of perifusenes have 3-membered rings; those of coronoids have larger rings which are not the periphery of assemblies of 3-membered rings. Several coding systems have been devised on the basis of dualist graphs for polyhexes. Cata-condensed appendages may be present in peri- and coronafusenes, and peri-fused subgraphs may be present in coronoids [48] [49]. Examples are presented in Scheme 6.
Scheme 5: Examples of non-Kekuléan perifusenes with even number of vertices.
Scheme 6: Examples of polyhexes with their dualist graphs: chrysene (catafusene), the carcinogenic benzopyrene (perifusene), and Kekulene (coronafusene).

One may consider most polyhexes as portions of the graphite lattice; however, this is not always true. Indeed, polyhexes may or may not be embeddable in a plane without vertices coinciding. An example of the latter is 7-helicene, shown as the last polyhex in Scheme 6, which is an out-of-plane catafusene. In work with Tomescu we defined isoarithmic polyhexes as PAH's which have the same numbers of hexagons and the same K values (see next paragraph), but differ in their topology. Examples are 17 and 18 (see Scheme 7). Also, we applied algebraic methods for enumerating K values of catafusenes obeying certain composition rules [50]-[53].
Scheme 7: Isoarithmic catafusenes (17 and 18), an acene (19) and their dualist graphs.

The number K of perfect matchings (1-factorizations, or Kekulé structures) plays an important part in the so-called valence bond theory of PAH's, and an appreciable part of theoretical chemical papers is devoted to such topics. For example, the n-acenes 19 having K = n + 1 are less stable than n-helicenes 17 or other isoarithmic systems like the zig-zag catafusenes 18; in general, such fibonacenes as 17 or 18 have K = F_n, where F_n is the n-th Fibonacci number.
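The K values quoted above can be reproduced by direct counting. The following Python sketch is only an illustration under stated assumptions: hexagons are placed on a "brick-wall" drawing of the hexagonal lattice (a hypothetical coordinate convention, with a hexagon addressed by its lower-left corner (x, y), x + y even), and perfect matchings are counted by exhaustive recursion, which is adequate only for small polyhexes. All function names are hypothetical.

```python
from functools import lru_cache


def polyhex_graph(hexagons):
    """Molecular graph of a polyhex given as brick-wall cells (x, y) with x + y even."""
    adj = {}

    def add_edge(a, b):
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    for x, y in hexagons:
        assert (x + y) % 2 == 0, "cell corner must have even coordinate sum"
        for row in (y, y + 1):
            add_edge((x, row), (x + 1, row))
            add_edge((x + 1, row), (x + 2, row))
        add_edge((x, y), (x, y + 1))
        add_edge((x + 2, y), (x + 2, y + 1))
    return adj


def kekule_count(adj):
    """Number K of perfect matchings, by matching the smallest unmatched vertex."""

    @lru_cache(maxsize=None)
    def count(remaining):
        if not remaining:
            return 1
        v = min(remaining)
        return sum(count(remaining - {v, w}) for w in adj[v] if w in remaining)

    return count(frozenset(adj))


def acene(h):
    """Linear chain of h hexagons."""
    return [(2 * i, 0) for i in range(h)]


def zigzag(h):
    """Zig-zag chain of h hexagons (alternating kinks)."""
    cells, x, y = [], 0, 0
    for i in range(h):
        cells.append((x, y))
        if i % 2 == 0:
            x += 2
        else:
            x, y = x + 1, y + 1
    return cells


print([kekule_count(polyhex_graph(acene(h))) for h in range(1, 6)])
print([kekule_count(polyhex_graph(zigzag(h))) for h in range(1, 6)])
```

For the acenes the counts grow linearly (h + 1 for h hexagons), while the zig-zag chains give the Fibonacci-type values 2, 3, 5, 8, 13; which index convention for F_n matches these values is left open here.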
In Erich Hückel's MO theory of aromatic character (which refers to electronic delocalization and stability, and not to smell) the molecular orbitals (MO's) for the π-electrons (one for each carbon atom in benzenoids) are found with the help of the adjacency matrix. From it one obtains the characteristic polynomial whose roots x_i = (α - E_i)/β afford the orbital energies E_i in β units relative to a value α (Coulomb integral). Thus for benzene (CH)6, also called [6]annulene, all bonding levels (BMO's) are occupied, there is no NBMO, and all ABMO's are vacant, resulting in a closed π-electron shell, or π-electron sextet (the arrows in Scheme 8 indicate π-electrons with their spin). In a far-reaching generalization, Hückel showed that molecules with 4k + 2 π-electrons in a delocalized system have aromatic character [56] [57]. Examples are shown in Scheme 9: benzene 20, naphthalene 21, phenanthrene 22, anthracene 23, azulene 24, cyclopentadiene anion 25, tropylium cation 26, thiophene 27, [18]annulene 28, and tetra-t-butyl-bis-dehydro-[14]annulene 29.
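For monocyclic systems the 4k + 2 rule can be illustrated numerically. The sketch below is a minimal illustration restricted to a single ring of n carbon atoms: it uses the standard closed form for the adjacency eigenvalues of a cycle, 2 cos(2πj/n), and the sign convention of the text, x = (α - E)/β, so that negative x means bonding. The function names and the rounding tolerance are assumptions made for the example.

```python
import math


def annulene_levels(n):
    """Hueckel levels x_j = (alpha - E_j)/beta = -2 cos(2 pi j / n) of an n-ring, sorted."""
    return sorted(round(-2.0 * math.cos(2.0 * math.pi * j / n), 9) + 0.0
                  for j in range(n))


def classify(levels):
    """Numbers of bonding (x < 0), non-bonding (x = 0) and anti-bonding (x > 0) MO's."""
    bonding = sum(1 for x in levels if x < 0)
    nonbonding = sum(1 for x in levels if x == 0)
    antibonding = sum(1 for x in levels if x > 0)
    return bonding, nonbonding, antibonding


def has_closed_shell(n, electrons=None):
    """True if filling the levels from the bottom leaves no partly filled degenerate set."""
    levels = annulene_levels(n)
    if electrons is None:
        electrons = n  # neutral ring: one pi-electron per carbon atom
    for x in sorted(set(levels)):
        capacity = 2 * levels.count(x)
        if electrons >= capacity:
            electrons -= capacity
        else:
            return electrons == 0
    return electrons == 0


print(classify(annulene_levels(6)), has_closed_shell(6))
print(classify(annulene_levels(4)), has_closed_shell(4))
print(has_closed_shell(5, 6), has_closed_shell(7, 6))
```

Benzene comes out with three bonding and three anti-bonding levels and a closed shell, a four-membered ring with two non-bonding levels and an open shell, and the five- and seven-membered rings with six π-electrons (as in the cyclopentadiene anion and the tropylium cation) again with closed shells.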
Scheme 8: Molecular orbitals of benzene ([6]annulene).

An unsolved problem is the following: what are the general structural patterns for in-plane (and separately for out-of-plane) polyhexes with maximal K values for any given number h of hexagons in the polyhex? A brute-force approach led to the following conjecture: the polyhex is a branched catafusene; for out-of-plane catafusenes and for certain h values (4, 10, 22, 46, ...), Gutman [58] pointed out the most branched structures. For in-plane catafusenes, Table 5 presents the dualist graphs with h ≤ 13 and the corresponding K values. The two cases with asterisks (h = 11 and 12) have two isoarithmic solutions each, and give higher K values for corresponding out-of-plane catafusenes (305 and 510, respectively). The cases with h = 9 and 13 have isoarithmic out-of-plane catafusenes.
Note Added in Proof: The last problem was recently solved for out-of-plane benzenoids. The corresponding problem for in-plane benzenoids is still unsolved.
Scheme 9: Examples of aromatic molecules obeying Hückel's 4k + 2 π-electron rule (each double bond or heteroatom contributes two π-electrons); the numbers of π-electrons are inscribed in the formulas.

Table 5: Dualist graphs of the in-plane catafusenes (with h hexagons) possessing the highest numbers K of Kekulé structures.

h     K
1     2
2     3
3     5
4     9
5     14
6     24
7     41
8     66
9     110
10    189
11    302*
12    504*
13    863
References

[1] A.T. Balaban and F. Harary; Early history of the interplay between graph theory and chemistry, in Chemical Applications of Graph Theory, A.T. Balaban (editor), Academic Press, London-New York, 1-4 (1976).
[2] P.G. Mezey and N. Trinajstić (editors); Journal of Mathematical Chemistry, Balzer Publ., Basel, 1 (1987).
[3] A.T. Balaban, A. Dreiding, A. Kerber and O.E. Polansky (editors); Mathematical Chemistry, Mülheim/Ruhr, 1 (1975).
[4] N. Trinajstić; Chemical Graph Theory, 2nd edition, CRC Press, Boca Raton, Florida (1992).
[5] R.B. King (editor); Chemical Applications of Topology and Graph Theory, Elsevier, Amsterdam (1983).
[6] R.B. King and D.H. Rouvray (editors); Graph Theory and Topology in Chemistry, Elsevier, Amsterdam (1987).
[7] D.H. Rouvray (editor); Computational Chemical Graph Theory, Proceedings of the 1988 American Chemical Society Meeting in Los Angeles, Nova Science Publ. Inc., New York (1989).
[8] J.W. Kennedy and L.V. Quintas (editors); Applications of Graphs in Chemistry and Physics, North-Holland, Amsterdam (1988).
[9] R.C. Lacher (editor); MATH/CHEM/COMP 1987, Elsevier, Amsterdam (1988).
[10] D.H. Rouvray and A.T. Balaban; Chemical applications of graph theory, in Applications of Graph Theory (R.J. Wilson and L.W. Beineke, editors), Academic Press, London, 177-221 (1979).
[11] A.T. Balaban; Applications of graph theory in chemistry, J. Chem. Inf. Comput. Sci., 25, 334-343 (1985).
[12] A.T. Balaban; An attempt towards the systematics of monocyclic aromatic compounds, Studii Cercet. Chim. Acad., Romania, 6, 257-295 (1959). (Roumanian).
[13] A.T. Balaban and F. Harary; Chemical graphs: IV. Dihedral groups and monocyclic aromatic compounds, Rev. Roumaine Chim., 12, 1511-1515 (1967).
[14] K. Lloyd; The footballers of Croam, London Math. Soc. Lecture Notes Series, 13, 97-102 (1974).
[15] A.L. Goodson; Graph-based chemical nomenclature. I. Historical background and discussion, J. Chem. Inf. Comput. Sci., 20, 167-172 (1980).
[16] A.L. Goodson; Graph-based chemical nomenclature. II. Incorporation of graph-theoretical principles into Taylor's nomenclature, J. Chem. Inf. Comput. Sci., 20, 172-176 (1980).
[17] A.T. Balaban, J.W. Kennedy and L.V. Quintas; The number of alkanes having n carbons and a longest chain of length d: An application of a theorem of Pólya, J. Chem. Educ., 65, 304-313 (1988).
[18] R.C. Read; The coding of trees and tree-like graphs, University of the West Indies, Jamaica (1968), preprint.
[19] R.C. Read and R.S. Milner; A new system for the designation of chemical compounds for the purpose of data retrieval. II. Cyclic compounds, Report to the University of West Indies, Jamaica (1968).
[20] D. Bonchev, A.T. Balaban and M. Randić; The graph center concept for polycyclic graphs, Int. J. Quantum Chem., 19, 61-82 (1981).
[21] D. Bonchev, O. Mekenyan and A.T. Balaban; Iterative procedure for the generalized graph center in polycyclic graphs, J. Chem. Inf. Comput. Sci., 29, 91-97 (1989).
[22] A.T. Balaban and C. Artemi; Mathematical modeling of polymers. I. Enumeration of non-redundant (irreducible) repeating sequences in stereoregular polymers, elastomers, or in binary copolymers, Math. Chem., 22, 3-32 (1987).
[23] C. Artemi and A.T. Balaban; Mathematical modeling of polymers. II. Irreducible sequences in n-ary copolymers, Math. Chem., 22, 77-100 (1987).
[24] A.T. Balaban and C. Artemi; Mathematical modelling of polymers. III. Enumeration and generation of repeating irreducible sequences in linear bi-, ter-, quater-, and quinquenary copolymers and in stereoregular homopolymers, Makromol. Chem., 189, 863-870 (1988).
[25] A.T. Balaban, D. Farcasiu and R. Banica; Chemical graphs: Part 2. Graphs of multiple 1,2-shifts in carbonium ions and related systems, Rev. Roumaine Chim., 11, 1205-1227 (1966).
[26] A.T. Balaban; Chemical graphs: Part 16. Intramolecular isomerization of octahedral complexes with six different ligands, Rev. Roumaine Chim., 18, 841-854 (1973).
[27] A.T. Balaban; Chemical graphs: Part 19. Intramolecular isomerization of trigonal-bipyramidal structures with five different ligands, Rev. Roumaine Chim., 18, 855-862 (1973).
[28] P.C. Lauterbur and F. Ramirez; Pseudorotation in trigonal-bipyramidal molecules, J. Amer. Chem. Soc., 90, 6722-6726 (1968).
[29] K.E. DeBruin, K. Naumann, G. Zon and K. Mislow; Topological representation of the stereochemistry of displacement reactions at phosphorus in phosphonium salts and cognate systems, J. Amer. Chem. Soc., 91, 7031-7040 (1969).
[30] J.D. Dunitz and V. Prelog; Ligand reorganization in the trigonal bipyramid, Angew. Chem. Internat. Ed. Engl., 7, 725-726 (1968).
[31] W.T. Tutte; Connectivity in Graphs, University of Toronto Press (1966).
[32] P.K. Wong; Cages - a survey, J. Graph Theory, 6, 1-22 (1982).
[33] F. Harary; Graph Theory, Addison-Wesley, Reading, Mass., 174 (1969).
[34] A.T. Balaban; Trivalent graphs of girth nine and eleven and relationships between cages, Rev. Roum. Math. Pures Appl., 18, 1033-1043 (1973).
[35] N.L. Biggs and M.J. Hoare; A trivalent graph with 58 vertices and girth 9, Discrete Math., 30, 299-301 (1980).
[36] C.W. Evans; A second graph with 58 vertices and girth 9, J. Graph Theory, 8, 97-99 (1984).
[37] B. McKay; Personal communication.
[38] A.T. Balaban; A trivalent graph of girth ten, J. Comb. Theory, Ser. B, 12, 1-5 (1972).
[39] M. O'Keefe and P.K. Wong; A smallest graph of girth 10 and valency 3, J. Graph Theory, 5, 79-85 (1981).
[40] C.T. Benson; Minimal regular graphs of girth eight and twelve, Canad. J. Math., 18, 1091-1094 (1966).
[41] A.T. Balaban; Chemical graphs: Part 17. cata-Condensed polycyclic hydrocarbons which fulfil Hückel's rule but lack closed electronic shells, Rev. Roumaine Chim., 17, 1531-1543 (1972).
[42] A.T. Balaban, M. Barasch and S. Marcus; Computer program for the recognition of acyclic regular isoprenoid structures, Math. Chem., 5, 239-261 (1979).
[43] A.T. Balaban, M. Barasch and S. Marcus; Picture grammars in chemistry. Generation of acyclic isoprenoid structures, Math. Chem., 8, 193-213 (1980).
[44] A.T. Balaban, M. Barasch and S. Marcus; Computer program for the recognition of standard isoprenoid structures, Math. Chem., 8, 215-268 (1980).
[45] S.J. Cyvin and I. Gutman; Kekulé Structures in Benzenoid Hydrocarbons, Lecture Notes in Chemistry, #46, Springer, Berlin (1988).
[46] J.R. Dias; Handbook of Polycyclic Hydrocarbons, Elsevier, Amsterdam (1987).
[47] A.T. Balaban and F. Harary; Chemical graphs: Part 5. Enumeration and proposed nomenclature of benzenoid cata-condensed polycyclic aromatic hydrocarbons, Tetrahedron, 24, 2505-2516 (1968).
[48] A.T. Balaban; Chemical graphs: Part 7. Proposed nomenclature of branched cata-condensed benzenoid hydrocarbons, Tetrahedron, 25, 2949-2956 (1969).
[49] A.T. Balaban; Challenging problems involving benzenoid polycyclics and related systems, Pure Appl. Chem., 54, 1075-1096 (1982).
[50] A.T. Balaban and I. Tomescu; Chemical graphs: Part 41. Numbers of conjugated circuits and Kekulé structures for zigzag catafusenes and (j,k)-hexes; generalized Fibonacci numbers, Math. Chem., 17, 91-120 (1985).
[51] A.T. Balaban, C. Artemi and I. Tomescu; Algebraic expressions for Kekulé structure counts in non-branched regularly cata-condensed benzenoid hydrocarbons, Math. Chem., 22, 77-100 (1987).
[52] A.T. Balaban and I. Tomescu; Alternating 6-cycles in perfect matchings of graphs representing condensed benzenoid hydrocarbons, Discrete Appl. Math., 19, 6-16 (1988). (Reprinted in [8]).
[53] I. Tomescu and A.T. Balaban; Decomposition theorems for calculating the number of Kekulé structures in coronoids fused via perinaphthenyl units, Math. Chem., 24, 289-309 (1989).
[54] A. Streitwieser; Molecular Orbital Theory for Organic Chemists, Wiley, New York (1961).
[55] C.A. Coulson, B. O'Leary and R.B. Mallion; Hückel Theory for Organic Chemists, Academic Press, London (1978).
[56] E. Heilbronner and H. Bock; Das HMO-Modell und seine Anwendung, Verlag Chemie, Weinheim (1968).
[57] A.T. Balaban, M. Banciu and V. Ciorba; Annulenes, Benzo-, Hetero-, Homo-Derivatives and Their Valence Isomers, CRC Press, Boca Raton, Florida (1987).
[58] I. Gutman; A class of benzenoid systems with large number of Kekulé structures, J. Serb. Chem. Soc., 53, 607-612 (1988).
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 127-136 (1993)
© 1993 Elsevier Science Publishers B.V. All rights reserved.
DETOUR DISTANCE IN GRAPHS
Gary CHARTRAND Department of Mathematics, Western Michigan University Kalamazoo, Michigan, U.S.A.
Garry L. JOHNS Department of Mathematics, Saginaw Valley State University University Center, Michigan, U.S.A.
Songlin TIAN Department of Mathematics, Central Missouri State University Warrensburg, Missouri, U.S.A.
Abstract

For vertices u and v in a connected graph G, the detour distance $d^*(u, v)$ between u and v is the length of a longest u-v path P for which the subgraph induced by the vertices of P is P itself. A graph G is called a detour graph if $d^*(u, v)$ equals the standard distance between u and v in G for every pair u, v of vertices of G. Several results concerning detour distance and detour graphs are presented.
1. Introduction
Ordinarily, when we wish to proceed from point A to point B we take a route which involves the least distance. We have all been faced with detour signs which require us to take a route from A to B that involves a greater distance. In any such detour route from A to B we assume that there is no possible shortcut along the route, for otherwise this should have been part of the route initially. When one is driving along such a detour, it sometimes seems that one is using the longest route possible from A to B (again subject to the "no shortcut" condition). In this paper we investigate longest detour routes in graphs and present analogues of known results concerning the standard distance. The distance $d(u, v)$ between two vertices u and v in a connected graph G is the length of a shortest u-v path in G (that is, a shortest path in G connecting u and v). For a nonempty set S of vertices of G, the subgraph $\langle S \rangle$ of G induced by S has S as its vertex set, while an edge of G belongs to $\langle S \rangle$ if it joins two vertices of S. Buckley and Harary [1] have written a book devoted to the topic of distance in graphs. Terms not defined here may be found in this book.
If P is a u-v path of length $d(u, v)$, then the subgraph $\langle V(P) \rangle$ induced by the vertices of P is P itself. This observation suggests the following concept. The detour distance $d^*(u, v)$ between u and v in G is the length of a longest induced u-v path, that is, a longest u-v path P for which $\langle V(P) \rangle = P$. An induced u-v path of length $d^*(u, v)$ is called a detour path. In the graph G of Figure 1, there are paths P of lengths 2, 3, and 4 connecting the nonadjacent vertices a and b such that $\langle V(P) \rangle = P$ and no such paths of greater length; therefore, $d_G(a, b) = 2$ and $d^*_G(a, b) = 4$. This graph G has the added property that for every two vertices x and y and every integer n such that $d_G(x, y) \le n \le d^*_G(x, y)$, there exists an x-y path P of length n for which $\langle V(P) \rangle = P$. By adding more paths of length 2 between a and b and joining the internal vertices of these paths to each other as well as to all other vertices of G, a graph H with this property and having arbitrarily large order can be produced. Whether such a graph H exists containing vertices x and y with $d^*_H(x, y) > 4$ is not known.

Observe that $d^*(u, v) \ge d(u, v)$ for all vertices u and v of G and that $d^*(u, v) = d(u, v) = 1$ if u and v are adjacent. Also, note that $d^*(u, v) = d^*(v, u)$ for all vertices u and v of G. Therefore the detour distance is symmetric. However, the triangle inequality does not hold in general. Consider the wheel $W_n$ of order n + 1, where $n \ge 5$ (see Figure 2). Then $d^*(u, v) = n - 2 > 2 = d^*(u, w) + d^*(w, v)$. Therefore, in general, the detour distance is not a metric on the vertex set of G.
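The definition of detour distance can be made concrete with a short brute-force sketch in Python. It is only meant for small graphs (it enumerates induced paths by depth-first search), and the graph representation (a dictionary of adjacency sets) and the function names are assumptions made for the example. In the wheel, u and v are taken to be two rim vertices at distance 2 along the rim and w is the hub, which is one reading of Figure 2.

```python
def detour_distance(adj, u, v):
    """Length of a longest induced u-v path in a graph given by adjacency sets."""
    best = -1

    def extend(path):
        nonlocal best
        last = path[-1]
        if last == v:
            best = max(best, len(path) - 1)
            return
        for w in adj[last]:
            # w may be appended only if it creates no chord with earlier vertices.
            if w not in path and not any(x in adj[w] for x in path[:-1]):
                extend(path + [w])

    extend([u])
    return best


def wheel(n):
    """Wheel W_n: rim vertices 0..n-1 on a cycle, hub n joined to every rim vertex."""
    adj = {i: {(i - 1) % n, (i + 1) % n, n} for i in range(n)}
    adj[n] = set(range(n))
    return adj


W = wheel(6)
u, v, hub = 0, 2, 6
print(detour_distance(W, u, v))                                # n - 2
print(detour_distance(W, u, hub) + detour_distance(W, hub, v))  # 1 + 1
```

For n = 6 the first value is 4 and the second is 2, so the triangle inequality fails exactly as stated.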
Figure 1: The graph G (vertices a and b labeled).

Let G be a connected graph and let F be an induced connected subgraph of G. Then $d_F(u, v) \ge d_G(u, v)$ for $u, v \in V(F)$. However, for the detour distance, we have the opposite inequality, that is, $d^*_F(u, v) \le d^*_G(u, v)$ for $u, v \in V(F)$.
Figure 2: The wheel $W_n$.

2. Detour Eccentricities
The detour eccentricity $e^*(v)$ of a vertex v is defined by $e^*(v) = \max \{ d^*(v, w) \mid w \in V(G) \}$. The detour eccentricity set $e^*(G)$ of a connected graph G is the set consisting of all detour eccentricities of G, that is, $e^*(G) = \{ e^*(v) \mid v \in V(G) \}$. The difference between the eccentricities of two adjacent vertices is at most 1. However, the difference between the detour eccentricities of two adjacent vertices can be arbitrarily large. In fact, our next result gives the even stronger result that every set of positive integers is the detour eccentricity set of some connected graph.
Theorem 1: Let $S = \{s_1, s_2, \ldots, s_k\}$ be a set of positive integers with $s_1 < s_2 < \cdots < s_k$. Then there exists a connected graph G such that $e^*(G) = S$.
Proof:

If $|S| = 1$ then let $G = C_{s_1+2}$, the cycle of order $s_1 + 2$. It follows that $e^*(G) = \{s_1\} = S$. Assume that the theorem holds for all sets T of positive integers with $1 \le |T| < k$, where $k \ge 2$. Let S be a set of cardinality k, with $s_1 \in S$, and define $S' = S - \{s_1\}$. By the inductive hypothesis, there exists a connected graph F such that $e^*(F) = S'$. We construct the connected graph G by first replacing the vertex $v_1$ of $C_{s_1+2}: v_1, v_2, \ldots, v_{s_1+2}, v_1$ by F. Then we join $v_2$ and $v_{s_1+2}$ with all the vertices in F (see Figure 3). Then, for $2 \le i \le s_1 + 2$ and $v \in V(F)$ it follows that $d^*(v_i, v) \le s_1$, and $d^*(v_i, v) = s_1$ if and only if $i = 3$ or $i = s_1 + 1$. Clearly, $d^*(v_i, v_j) \le s_1$ for $1 \le i < j \le s_1 + 2$. Furthermore, it follows that $d^*(v_2, v_{s_1+2}) = s_1$ and $d^*(v_i, v_{i-2}) = s_1$ for $4 \le i \le s_1 + 1$. Therefore, $e^*(v_i) = s_1$ for $2 \le i \le s_1 + 2$. Suppose $v \in V(F)$. Then $d^*(v, v_i) \le s_1 \le e^*_F(v)$ for $2 \le i \le s_1 + 2$. It follows that $d^*_G(v, w) = d^*_F(v, w)$ for all $v, w \in V(F)$. Therefore, $e^*_G(v) = e^*_F(v)$ for $v \in V(F)$. Hence, $e^*(G) = e^*(F) \cup \{s_1\} = S' \cup \{s_1\} = S$.
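The inductive construction in this proof can be tried out on small sets S. The Python sketch below is an illustrative reading of the construction, not the authors' code: it starts from the cycle for the largest element of S and wraps it successively with the rings for the smaller elements (taking $s_1$ to be the smallest element at each stage), then computes all detour eccentricities by brute force. The names `build` and `detour_eccentricity_set` are hypothetical.

```python
def cycle(n):
    """Cycle C_n on vertices 0..n-1 as a dictionary of adjacency sets."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def build(S):
    """Graph obtained by applying the construction of Theorem 1 to the set S."""
    s = sorted(S)
    adj = cycle(s[-1] + 2)
    for s1 in reversed(s[:-1]):
        old = list(adj)                           # vertices of F
        k = len(adj)
        ring = list(range(k, k + s1 + 1))         # v_2, ..., v_{s1+2}
        for r in ring:
            adj[r] = set()
        for a, b in zip(ring, ring[1:]):          # path v_2 - v_3 - ... - v_{s1+2}
            adj[a].add(b)
            adj[b].add(a)
        for end in (ring[0], ring[-1]):           # join v_2 and v_{s1+2} to all of F
            for x in old:
                adj[end].add(x)
                adj[x].add(end)
    return adj


def detour_eccentricity_set(adj):
    """Set of detour eccentricities, by enumerating all induced paths."""
    ecc = {v: 0 for v in adj}

    def extend(path):
        start, last = path[0], path[-1]
        ecc[start] = max(ecc[start], len(path) - 1)
        for w in adj[last]:
            if w not in path and not any(x in adj[w] for x in path[:-1]):
                extend(path + [w])

    for v in adj:
        extend([v])
    return set(ecc.values())


print(detour_eccentricity_set(build({2, 3})))
print(detour_eccentricity_set(build({1, 3, 4})))
```

For the sets tried here the computed detour eccentricity set coincides with S.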
Figure 3: The graph G constructed in the proof of Theorem 1.

Lesniak [2] defined a nondecreasing sequence $S: a_1, a_2, \ldots, a_p$ of nonnegative integers to be an eccentric sequence if there exists a connected graph G whose vertices can be labeled $v_1, v_2, \ldots, v_p$ so that $e(v_i) = a_i$ for $1 \le i \le p$. In this case, S is said to be the eccentricity sequence of G. We call a nondecreasing sequence $S: a_1, a_2, \ldots, a_p$ of nonnegative integers a detour eccentric sequence if there exists a connected graph G whose vertices can be labeled $v_1, v_2, \ldots, v_p$ so that $e^*(v_i) = a_i$ for $1 \le i \le p$. The sequence S is said to be the detour eccentricity sequence of G. Lesniak [2] showed that a nondecreasing sequence $S: a_1, a_2, \ldots, a_p$ with m distinct values is eccentric if and only if some subsequence with m distinct values is eccentric. We show now that a detour eccentricity sequence may be characterized in an analogous fashion.

Theorem 2: A nondecreasing sequence $S: a_1, a_2, \ldots, a_p$ with m distinct values is the detour eccentricity sequence of a graph if and only if some subsequence of S with m distinct values is the detour eccentricity sequence of some graph.

Proof:

If S is a sequence with m distinct values that is the detour eccentricity sequence of some graph, then, since S is a subsequence of itself, some subsequence of S with m distinct values is the detour eccentricity sequence of a graph.
For the converse, suppose that S' is a subsequence of S that has the same m distinct values as S and suppose that S' is the detour eccentricity sequence of some graph G. Let $t_1, t_2, \ldots, t_m$ be the distinct values of S'. For each $t_i$, $1 \le i \le m$, select a vertex $v_i$ of G whose detour eccentricity in G is $t_i$. Let $n_i$ ($1 \le i \le m$) be one more than the number of occurrences of $t_i$ in S less the number of occurrences of $t_i$ in S'. In G replace $v_1$ with a copy of $K_{n_1}$ and join each vertex of $K_{n_1}$ to all the vertices adjacent to $v_1$ in G. Denote this graph by $G_1$. In $G_1$, replace $v_2$ with a copy of $K_{n_2}$ and join each vertex of $K_{n_2}$ to all the vertices adjacent to $v_2$ in $G_1$. We continue in this fashion to obtain the graph $G_m$. Then S is the detour eccentricity sequence of $G_m$.

The detour radius $\mathrm{rad}^*(G)$ of G is the minimum detour eccentricity, while the detour diameter $\mathrm{diam}^*(G)$ of G is the maximum detour eccentricity. Our next result gives upper bounds for the detour radius and detour diameter of a connected graph. The maximum degree and minimum degree of the vertices of a graph G are denoted by $\Delta(G)$ and $\delta(G)$, respectively.
Theorem 3: For every connected graph G of order p, $\mathrm{rad}^*(G) \le p - \Delta(G)$ and $\mathrm{diam}^*(G) \le p - \delta(G)$.
Proof: Let v be a vertex in G of maximum degree and let w be a vertex such that $e^*(v) = d^*(v, w)$. Let P be a v-w path of length $d^*(v, w)$ such that $\langle V(P) \rangle = P$. Then $|V(P) \cap N(v)| = 1$, so $|V(P)| \le p - \deg v + 1 = p - \Delta(G) + 1$. Therefore, $\mathrm{rad}^*(G) \le e^*(v) = d^*(v, w) = |V(P)| - 1 \le p - \Delta(G)$. Similarly, $\mathrm{diam}^*(G) \le p - \delta(G)$.

The bounds given for $\mathrm{rad}^*(G)$ and $\mathrm{diam}^*(G)$ in Theorem 3 are sharp. If G is the graph of Figure 2, then $\mathrm{rad}^*(G) = p - \Delta(G)$ and $\mathrm{diam}^*(G) = p - \delta(G)$. Our next result gives an upper bound for the size of a graph in terms of the detour diameter.
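Theorem 4: If $G$ is a graph of order $p$ for which $\mathrm{diam}^*(G) = n \ge 1$, then $q(G) \le \binom{p}{2} - \binom{n}{2}$.
Proof: Let $u$ and $v$ be two vertices of $G$ for which $d^*(u, v) = n$, and let $P$ be a $u$-$v$ path in $G$ such that $\langle V(P) \rangle = P$. Since the path $P$ has order $n + 1$ and size $n$, the size of $\langle V(P) \rangle$ is $n$; that is, $q(G) \le \binom{p}{2} - \left[\binom{n+1}{2} - n\right] = \binom{p}{2} - \binom{n}{2}$. Since the graph $G = K_{p-n-1} + P_{n+1}$ ($1 \le n < p$) has detour diameter $n$ and size $\binom{p}{2} - \binom{n}{2}$, the bound given in Theorem 4 is sharp.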
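The quantities of this section can be explored by brute force on small graphs. The following is a minimal sketch, not an efficient algorithm: it grows induced paths vertex by vertex (exponential time in general). The function names and the adjacency-dictionary representation are assumptions made for illustration; the example graph is the join $K_{p-n-1} + P_{n+1}$ from the sharpness argument, whose size is counted directly from the construction.

```python
from math import comb


def longest_induced_paths(adj, source):
    """Map each reachable vertex to its detour distance from `source`.

    `adj` maps every vertex to the set of its neighbours (an assumed
    representation). Only suitable for small graphs.
    """
    best = {source: 0}

    def extend(end, edges, blocked):
        # `blocked` holds the path's vertices and all neighbours of its
        # earlier vertices; stepping there would repeat a vertex or add a chord.
        for nxt in adj[end]:
            if nxt not in blocked:
                best[nxt] = max(best.get(nxt, 0), edges + 1)
                extend(nxt, edges + 1, blocked | adj[end] | {nxt})

    extend(source, 0, {source})
    return best


def detour_eccentricities(adj):
    """Detour eccentricity of every vertex of a connected graph."""
    return {v: max(longest_induced_paths(adj, v).values()) for v in adj}


def clique_join_path(p, n):
    """Build K_{p-n-1} + P_{n+1}: a path on 0..n joined to a clique on n+1..p-1."""
    adj = {v: set() for v in range(p)}
    for i in range(n):                      # path edges
        adj[i].add(i + 1)
        adj[i + 1].add(i)
    for a in range(n + 1, p):               # clique vertices see every other vertex
        for b in range(p):
            if a != b:
                adj[a].add(b)
                adj[b].add(a)
    return adj


if __name__ == "__main__":
    p, n = 7, 3
    adj = clique_join_path(p, n)
    ecc = detour_eccentricities(adj)
    degrees = [len(nb) for nb in adj.values()]
    print("rad*  =", min(ecc.values()), "  p - max degree =", p - max(degrees))
    print("diam* =", max(ecc.values()), "  p - min degree =", p - min(degrees))
    print("size  =", sum(degrees) // 2, "  C(p,2) - C(n,2) =", comb(p, 2) - comb(n, 2))
```

For this particular graph the two bounds of Theorem 3 and the size count are all attained with equality, which is what the sharpness remarks describe.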
3. Detour Centers
The subgraph induced by those vertices with minimum detour eccentricity is called the detour center of $G$ and is denoted by $C^*(G)$. Harary and Norman [3] proved that the center of every connected graph $G$ lies in a single block of $G$. We now prove an analogue of this result involving the detour center.
Theorem 5: The detour center of every connected graph $G$ lies in a single block of $G$.
Proof: Suppose $G$ is a connected graph whose detour center $C^*(G)$ does not lie within a single block of $G$. Then $G$ has a cut-vertex $v$ such that $G \setminus v$ contains components $G_1$ and $G_2$, each of which contains elements of $C^*(G)$. Let $u$ be a vertex such that $d^*(u, v) = e^*(v)$, and let $P_1$ be a $v$-$u$ path of $G$ having length $e^*(v)$ and $\langle V(P_1) \rangle = P_1$. At least one of $G_1$ and $G_2$, say $G_2$, contains no vertices of $P_1$. Let $w$ be an element of $C^*(G)$ belonging to $G_2$, and let $P_2$ be a $w$-$v$ path of minimum length. The paths $P_1$ and $P_2$ together form a $u$-$w$ path $P_3$ with $\langle V(P_3) \rangle = P_3$. Therefore, $e^*(w) \ge |V(P_3)| - 1 > |V(P_1)| - 1 = e^*(v)$, which contradicts the fact that $w$ belongs to $C^*(G)$. Therefore, $C^*(G)$ lies in a single block of $G$.
The center of a graph was introduced in an attempt to define the "middle" of a graph. Another interpretation of the middle of the graph is the median. The distance $d(u)$ of a vertex $u$ is defined by
$$d(u) = \sum_{v \in V(G)} d(u, v).$$
The median $M(G)$ of $G$ is the subgraph of $G$ induced by those vertices of minimum distance. The detour median can be defined in a similar manner. The detour distance $d^*(u)$ of a vertex $u$ in a connected graph $G$ is defined by
$$d^*(u) = \sum_{v \in V(G)} d^*(u, v).$$
The detour median $M^*(G)$ of $G$ is the subgraph of $G$ induced by those vertices of minimum detour distance.
Figure 4: A graph $G$ whose vertices are labeled with their detour distances (labels in the figure: 42, 34, 34, 42, 46, 46, 42, 34, 34, 42).
Although the detour center of a connected graph lies in a single block of $G$, this is not the case with the detour median. The vertices of the graph $G$ of Figure 4 are labeled with their detour distances. Therefore, $M^*(G)$ consists of the two isolated vertices $u$ and $v$, which belong to distinct blocks.
In [4] and [5], it was proved that every graph is the center of some connected graph. We now prove that this is also true with respect to the detour center.
Theorem 6: Let $G$ be a graph. Then there exists a connected graph $H$ such that $C^*(H) = G$.
Proof: Let $k = p(G)$. We define the graph $H$ by adding $2k$ new vertices $u_i, v_i$ ($1 \le i \le k$) to $G$, and the edges $u_i u_{i+1}$, $v_i v_{i+1}$ ($1 \le i \le k - 1$) together with the edges joining $u_1$ and $v_1$ with all vertices of $G$ (see Figure 5). Then $e^*(u_i) \ge d^*(u_i, v_k) = i + k$ and $e^*(v_i) \ge d^*(v_i, u_k) = i + k$ for $1 \le i \le k$. Clearly, $d^*(u, v) \le k$ for all $u, v \in V(G)$. Furthermore, since $d^*(w, u_i) \le d^*(w, u_k) = k$ and $d^*(w, v_i) \le d^*(w, v_k) = k$ for all $w \in V(G)$ and $1 \le i \le k$, it follows that $C^*(H) = G$.
Figure 5: The graph $H$.
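The construction in the proof of Theorem 6 is easy to try on a small example. The sketch below builds $H$ from a given $G$ and recomputes the detour center by exhaustive search. The function names, the adjacency-dictionary format and the labels `("u", i)`, `("v", i)` for the new vertices are assumptions made for illustration (the vertices of $G$ must not already use such labels); the search is exponential and only meant for small graphs.

```python
def longest_induced_paths(adj, source):
    """Map each reachable vertex to its detour distance from `source`."""
    best = {source: 0}

    def extend(end, edges, blocked):
        # `blocked`: path vertices plus neighbours of earlier path vertices.
        for nxt in adj[end]:
            if nxt not in blocked:
                best[nxt] = max(best.get(nxt, 0), edges + 1)
                extend(nxt, edges + 1, blocked | adj[end] | {nxt})

    extend(source, 0, {source})
    return best


def add_edge(adj, a, b):
    adj[a].add(b)
    adj[b].add(a)


def center_extension(g_adj):
    """Build H from G: two pendant paths u_1..u_k and v_1..v_k, with u_1 and
    v_1 joined to every vertex of G, where k is the order of G."""
    k = len(g_adj)
    h = {v: set(nb) for v, nb in g_adj.items()}
    for name in ("u", "v"):
        for i in range(1, k + 1):
            h[(name, i)] = set()
        for i in range(1, k):
            add_edge(h, (name, i), (name, i + 1))
        for w in g_adj:
            add_edge(h, (name, 1), w)
    return h


def detour_center(adj):
    """Vertices of minimum detour eccentricity in a connected graph."""
    ecc = {v: max(longest_induced_paths(adj, v).values()) for v in adj}
    radius = min(ecc.values())
    return {v for v, e in ecc.items() if e == radius}


if __name__ == "__main__":
    # G: one edge plus an isolated vertex (G itself need not be connected).
    g = {0: {1}, 1: {0}, 2: set()}
    h = center_extension(g)
    print(sorted(detour_center(h)))
```

Because `center_extension` copies the adjacency sets of $G$ unchanged, the subgraph of $H$ induced by the returned vertex set is $G$ itself whenever that set equals $V(G)$.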
Slater [6] showed that for every graph $G$ there exists a graph $H$ such that $M(H) = G$. Whether such a result exists for the detour median is unknown.
4. Detour Peripheries
The periphery $P(G)$ of a connected graph $G$ is the subgraph induced by those vertices with maximum eccentricity. In 1983 Bielak and Sysło [7] proved that a graph $G$ of order $p$ is isomorphic to the periphery of some graph if and only if either $\Delta(G) \le p - 2$ or $G = K_p$. In this section we consider the analogue of this concept for detour distance. The detour periphery $P^*(G)$ of a connected graph $G$ is the subgraph induced by those vertices with maximum detour eccentricity. We next present a characterization of those graphs that are isomorphic to the detour periphery of some graph. Perhaps surprisingly, this result parallels the theorem of Bielak and Sysło.
Theorem 7: A graph $G$ of order $p$ is isomorphic to the detour periphery of some graph if and only if $\Delta(G) \le p - 2$ or $G = K_p$.
Proof: Assume $G$ is a noncomplete graph of order $p$ that is isomorphic to the detour periphery of some graph and suppose, to the contrary, that $\Delta(G) = p - 1$. Let $v$ be a vertex of $G$ such that $\deg_G v = p - 1$. Suppose $H$ is a graph for which $P^*(H) = G$. Since $G \ne K_p$, it follows that $\mathrm{diam}^*(H) \ge 2$. Let $w$ be a vertex of $H$ for which $d^*(v, w) = \mathrm{diam}^*(H)$. Then, necessarily, $w \notin V(G)$. However, since $e_H^*(w) = d^*(v, w) = \mathrm{diam}^*(H)$,
the vertex $w$ is in $P^*(H)$, contradicting the fact that $w \notin V(G)$.
Conversely, assume that $G$ is a graph of order $p$ such that $\Delta(G) \le p - 2$ or $G = K_p$. If $G = K_p$, then the detour periphery of $G$ is $G$ itself. Suppose, then, that $\Delta(G) \le p - 2$. We construct a graph $H$ such that $P^*(H) = G$. Let $\mathrm{diam}^*(G) = n$. If $n = 2$, then $H = G + K_1$ has the desired property. So we assume that $n \ge 3$. For each vertex $u$ of $G$, there is a vertex $v$ of $G$ that is not adjacent to $u$. Suppose there are $k$ pairs $u_i, v_i$ of nonadjacent vertices ($1 \le i \le k$) such that $\bigcup_{i=1}^{k} \{u_i, v_i\} = V(G)$. (Note that for $i \ne j$, the sets $\{u_i, v_i\}$ and $\{u_j, v_j\}$ may not be disjoint even if the minimum such $k$ is chosen.)
For $i = 1, 2, \ldots, k$, we add the path $w_{i,1}, w_{i,2}, \ldots, w_{i,n-1}$ to $G$ and join $w_{i,1}$ to every vertex of $G$ except $v_i$ and join $w_{i,n-1}$ to every vertex of $G$ except $u_i$. We also join $w_{i,2}$ to $w_{j,2}$ for all $i, j$ with $1 \le i, j \le k$ and $i \ne j$. (See Figure 6.) Denote the resulting graph by $H$.
Figure 6: The graph $H$.
We claim that $e_H^*(v) = n$ if $v \in V(G)$ and $e_H^*(v) < n$ if $v \in V(H) \setminus V(G)$. Suppose that $v \in V(G)$. Then we may assume that $v = u_i$ for some $i$ ($1 \le i \le k$). Since $P_i: u_i, w_{i,1}, w_{i,2}, \ldots, w_{i,n-1}, v_i$ has the property that $\langle V(P_i) \rangle = P_i$, it follows that $e_H^*(v) \ge n$. By the consideration of a number of cases, it can be shown that there is no induced path of length exceeding $n$ that begins at $u_i$. Thus $e_H^*(v) = n$ if $v \in V(G)$.
If $v \in V(H) \setminus V(G)$, then $v = w_{j,l}$ for $1 \le j \le k$ and $1 \le l \le n - 1$. Again, by consideration of cases, there is no induced path of length $n$ beginning at $v$. Thus, $e_H^*(v) < n$. Consequently, $P^*(H) = G$.
5. Detour Graphs
A connected graph G is called a detour graph if d* (u,v) = d(u, v) for all vertices u and v of G . Therefore, neither of the graphs of Figure 1 or Figure 2 is a detour graph. Also, no cycle
of length 5 or more is a detour graph. On the other hand, all trees and all complete graphs are detour graphs. If $u$ and $v$ are vertices of a graph $G$ such that $d^*(u, v) = 1$ or $2$, then $d^*(u, v) = d(u, v)$. Thus, we have the following result.
Theorem 8: If $G$ is a graph for which $\mathrm{diam}^*(G) \le 2$, then $G$ is a detour graph.
Consequently, if $G$ is not a detour graph, then $\mathrm{diam}^*(G) \ge 3$. By Theorem 4, the maximum size of a graph of order $p$ having detour diameter $n$ is $\binom{p}{2} - \binom{n}{2}$. Combining these two facts, we have the following corollary. (The condition $p \ge 5$ in the corollary is to guarantee that $G$ is not a tree and so $\mathrm{diam}(G) \le 2$.)
Corollary: The maximum size of a graph $G$ of order $p \ge 5$ that is not a detour graph is $\binom{p}{2} - 3$.
Another corollary of Theorem 8 is stated next.
Corollary: If $G$ is a detour graph with $\mathrm{diam}(G) \le 2$, then $G + K_n$ is a detour graph for every positive integer $n$.
We now present a characterization of detour graphs.
Theorem 9: Let $G$ be a connected graph. Then $G$ is a detour graph if and only if every induced connected subgraph of $G$ is a detour graph.
Proof: Suppose $G$ is a detour graph and $F$ is an induced connected subgraph of $G$. For vertices $u$ and $v$ of $F$, it follows that
$$d_F(u, v) \ge d_G(u, v) = d_G^*(u, v) \ge d_F^*(u, v) \ge d_F(u, v).$$
Therefore, $d_F(u, v) = d_F^*(u, v)$ for all vertices $u$ and $v$ of $F$; so $F$ is a detour graph. If every induced connected subgraph of $G$ is a detour graph, then, in particular, $G$ is an induced connected subgraph of itself; so $G$ is a detour graph.
Since every block is an induced connected subgraph, the following is a stronger characterization of detour graphs.
Theorem 10: Let $G$ be a connected graph. Then $G$ is a detour graph if and only if each block of $G$ is a detour graph.
Proof: Let $u$ and $v$ be vertices in some block $B$ of a detour graph $G$. Then every induced $u$-$v$ path lies entirely in $B$. Therefore,
$$d_B^*(u, v) = d_G^*(u, v) = d_G(u, v) = d_B(u, v).$$
Since the block $B$ is arbitrary and the vertices $u$ and $v$ are arbitrary in $B$, it follows that every
block of $G$ is a detour graph.
We now prove the sufficiency. Assume that every block of $G$ is a detour graph, and consider two arbitrary vertices $u$ and $v$ of $G$. If $u$ and $v$ lie in the same block $B$ of $G$, then $d_G^*(u, v) = d_B^*(u, v) = d_B(u, v) = d_G(u, v)$. Suppose then that $u$ and $v$ lie in different blocks of $G$. Let $P$ be a longest $u$-$v$ path with $\langle V(P) \rangle = P$. Without loss of generality, we assume that
$$P: u = v_{1,1}, \ldots, v_{1,i_1} = v_{2,1}, \ldots, v_{2,i_2} = v_{3,1}, \ldots, v_{m-1,i_{m-1}} = v_{m,1}, \ldots, v_{m,i_m} = v,$$
where $v_{k,j} \in V(B_k)$, $1 \le j \le i_k$, $1 \le k \le m$, and the blocks $B_k$ ($1 \le k \le m$) are distinct in $G$. Therefore,
$$d_G^*(u, v) = \sum_{k=1}^{m} d_{B_k}^*(v_{k,1}, v_{k,i_k}) = \sum_{k=1}^{m} d_{B_k}(v_{k,1}, v_{k,i_k}) = d_G(u, v).$$
Hence, $G$ is a detour graph.
We mentioned earlier that if $d^*(u, v) = 2$ for vertices $u$ and $v$ in a graph $G$, then $d(u, v) = 2$. If the converse of this statement holds for all such vertices $u$ and $v$, then we have the following result.
Theorem 11: A graph $G$ is a detour graph if and only if for every pair $u, v$ of vertices of $G$, whenever $d(u, v) = 2$, then $d^*(u, v) = 2$.
Proof: Suppose $G$ is not a detour graph. Then there exist vertices $u$ and $v$ such that $d(u, v) < d^*(u, v)$. Among all pairs of vertices with this property, choose $u$ and $v$ such that
(i) there exist internally disjoint $u$-$v$ paths $P$ and $P^*$ such that $P: u = u_0, u_1, \ldots, u_s = v$ is a shortest $u$-$v$ path and $P^*: u = v_0, v_1, \ldots, v_t = v$ is a detour path; and
(ii) $d(u, v)$ is as small as possible.
Since $s = d(u, v) < d^*(u, v) = t$, the vertices $u$ and $v$ are not adjacent. Thus $s \ge 2$. We now prove that $s = 2$. Suppose, to the contrary, that $s \ge 3$. We first claim that $u_2 v_1 \notin E(G)$; for otherwise $d^*(v_1, v) \ge t - 1 > s - 1 \ge d(v_1, v)$. Since the path $P_1: v_1, v_2, \ldots, v_t = v$ is an induced path, every shortest $v_1$-$v$ path must be internally disjoint from $P_1$. This contradicts the choice of vertices $u$ and $v$ since $d(v_1, v) < d(u, v)$. Therefore, $u_2 v_1 \notin E(G)$. However, this implies that $d^*(u, u_2) \ge 3 > 2 = d(u, u_2)$, which contradicts property (ii). Therefore, $s = 2$.
By a fan of order $p$ ($p \ge 5$), we mean a graph obtained by joining some specified vertex in a cycle $C$ of length $p$ to other vertices of $C$. Figure 7 shows three examples of fans. By employing basically the same proof as given for Theorem 11, we have the following characterization of detour graphs, given in terms of forbidden subgraphs.
Theorem 12: A graph $G$ is a detour graph if and only if no induced subgraph of $G$ is a fan.
Figure 7: Three examples of fans.
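The definition of a detour graph, together with the distance-two criterion of Theorem 11, can be tested directly on small graphs. The following is a minimal brute-force sketch; the function names and the adjacency-dictionary representation are assumptions made for illustration, and the induced-path search is exponential in general.

```python
from collections import deque


def longest_induced_paths(adj, source):
    """Map each reachable vertex to its detour distance from `source`."""
    best = {source: 0}

    def extend(end, edges, blocked):
        # `blocked`: path vertices plus neighbours of earlier path vertices.
        for nxt in adj[end]:
            if nxt not in blocked:
                best[nxt] = max(best.get(nxt, 0), edges + 1)
                extend(nxt, edges + 1, blocked | adj[end] | {nxt})

    extend(source, 0, {source})
    return best


def distances(adj, source):
    """Ordinary distances from `source` by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def is_detour_graph(adj):
    """True when d*(u, v) = d(u, v) for every pair of vertices."""
    return all(distances(adj, u) == longest_induced_paths(adj, u) for u in adj)


def distance_two_condition(adj):
    """True when every pair at distance 2 also has detour distance 2."""
    for u in adj:
        dist = distances(adj, u)
        detour = longest_induced_paths(adj, u)
        if any(detour[v] != 2 for v, d in dist.items() if d == 2):
            return False
    return True


def cycle(n):
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def path(n):
    return {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}


def complete(n):
    return {i: set(range(n)) - {i} for i in range(n)}


if __name__ == "__main__":
    for name, g in [("C4", cycle(4)), ("C5", cycle(5)), ("P6", path(6)), ("K5", complete(5))]:
        print(name, is_detour_graph(g), distance_two_condition(g))
```

On these examples the two predicates agree, in line with Theorem 11; cycles of length 5 or more fail both, while paths (as trees) and complete graphs pass both.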
Acknowledgement
G. Chartrand, research supported in part by Office of Naval Research Contract N00014-915-1060.
References
[1] F. Buckley and F. Harary; Distance in Graphs, Addison-Wesley, Redwood City, California (1990).
[2] L. Lesniak; Eccentricity sequences in graphs, Period. Math. Hungar., 6, 287-293 (1975).
[3] F. Harary and R.Z. Norman; The dissimilarity characteristic of Husimi trees, Ann. of Math., 58, 134-141 (1953).
[4] F. Buckley, Z. Miller and P.J. Slater; On graphs containing a given graph as center, J. Graph Theory, 5, 427-434 (1981).
[5] G.N. Kopylov and E.A. Timofeev; Centers and radii of graphs, Usp. Mat. Nauk, 32, 226 (1977).
[6] P.J. Slater; Medians of arbitrary graphs, J. Graph Theory, 4, 389-392 (1980).
[7] H. Bielak and M.M. Sysło; Peripheral vertices in graphs, Studia Sci. Math. Hungar., 18, 269-275 (1983).
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 137-144 (1993)
© 1993 Elsevier Science Publishers B.V. All rights reserved.
INTEGER-DISTANCE GRAPHS Ralph P. GRIMALDI Department of Mathematics, Rose-Hulman Institute of Technology Terre Haute, Indiana, U.S.A.
Abstract
This paper presents an extended discussion on a family of graphs and their complements. The discussion includes the derivations of several numeric parameters of the graphs, such as the sizes of their edge sets, their independence numbers, and their clique numbers. Furthermore, it is shown that these complements are unit interval graphs - hence chordal, and perfect - while the original family comprises graphs that are comparability graphs of semiorders, and these are perfect though not generally chordal.
1. Introduction
For $n, k \in \mathbb{Z}^+$, where $k$ is fixed, we define the (undirected) integer-distance graph $G(n,k) = (V, E)$ as follows: $V = \{1, 2, 3, \ldots, n\}$; $E = \{\{i, j\} \mid 1 \le i < j \le n,\ j - i \ge k\}$. In particular, the complete graph $K_n$ is (isomorphic to) the graph $G(n, 1)$.
In this paper we shall investigate many of the properties that are satisfied by the graph $G(n,k)$ and its complement $\overline{G}(n,k)$. For example, when $n > k$ we find that there are $(1/2)[(n-k)^2 + (n-k)]$ edges in $G(n,k)$, and when $n \ge k + 3$ the graph $G(n,k)$ is perfect although not chordal.
The definitions that are given in this paper are not original. They are provided as a reminder for the reader and to make the presentation as self-contained as is reasonably possible. For any definitions that are not given the reader is referred to the texts by G. Chartrand and L. Lesniak [1] and M.C. Golumbic [2].
2. Connectedness and Domination
In general, when $n \le k$ the graph $G(n,k)$ consists of $n$ isolated vertices. For $k < n < 2k$, the vertex (integer) $\lceil n/2 \rceil$ is such that $|x - \lceil n/2 \rceil| < k$ for all $x \in V$, so $\lceil n/2 \rceil$ is isolated. In fact there are $2(k - \lceil n/2 \rceil) = 2k - n$ isolated vertices when $n$ is even, and $2(k - \lceil n/2 \rceil) + 1 = 2k - n$ such vertices when $n$ is odd. In either case these $2k - n$ isolated vertices are at $n - k + 1, n - k + 2, \ldots, \lceil n/2 \rceil - 1, \lceil n/2 \rceil, \lceil n/2 \rceil + 1, \ldots, k$.
When $n \ge 2k$ we find that $G(n,k)$ not only has no isolated vertices but the graph is also connected. In fact, if $n = 2k$, the vertex sequence
$$k+1 \to 1 \to k+2 \to 2 \to k+3 \to 3 \to \cdots \to k+i \to i \to \cdots \to k+k\ (= n) \to k$$
provides a Hamilton path. For $n > 2k$ the graph $G(n,k)$ has a Hamilton cycle:
(1) The vertex sequence
$$1 \to \lceil n/2 \rceil + 1 \to 2 \to \lceil n/2 \rceil + 2 \to 3 \to \cdots \to i \to \lceil n/2 \rceil + i \to \cdots \to \lceil n/2 \rceil \to 2\lceil n/2 \rceil\ (= n) \to 1$$
provides such a cycle when $n$ is even;
(2) In the case where $n$ is odd such a cycle is given by the vertex sequence
$$1 \to \lceil n/2 \rceil + 1 \to 2 \to \lceil n/2 \rceil + 2 \to 3 \to \cdots \to i \to \lceil n/2 \rceil + i \to \cdots \to \lceil n/2 \rceil - 1 \to 2\lceil n/2 \rceil - 1\ (= n) \to \lceil n/2 \rceil \to 1.$$
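The vertex sequences above are explicit enough to be generated and checked mechanically. The following minimal sketch does so; the function names are hypothetical, and the only fact used about $G(n,k)$ is that two distinct vertices are adjacent exactly when they differ by at least $k$.

```python
def hamilton_path_n_equals_2k(k):
    """The sequence k+1, 1, k+2, 2, ..., 2k, k for the case n = 2k."""
    return [x for i in range(1, k + 1) for x in (k + i, i)]


def hamilton_cycle(n):
    """The cyclic sequence for n > 2k; the last vertex is followed by the first."""
    m = (n + 1) // 2                      # ceil(n / 2)
    if n % 2 == 0:
        return [x for i in range(1, m + 1) for x in (i, m + i)]
    return [x for i in range(1, m) for x in (i, m + i)] + [m]


def is_hamiltonian(seq, n, k, closed):
    """Check that `seq` visits 1..n exactly once along edges of G(n, k)."""
    if sorted(seq) != list(range(1, n + 1)):
        return False
    steps = list(zip(seq, seq[1:]))
    if closed:
        steps.append((seq[-1], seq[0]))   # closing edge of the cycle
    return all(abs(a - b) >= k for a, b in steps)


if __name__ == "__main__":
    print(hamilton_path_n_equals_2k(3))   # path in G(6, 3)
    print(hamilton_cycle(7))              # cycle in G(7, k) for any k with 2k < 7
```

Note that the cycle sequence depends only on $n$; the parameter $k$ enters only through the requirement $n > 2k$, which guarantees that every step has length at least $k$.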
As we mentioned earlier, when $n \le k$ each vertex in $G(n,k)$ is isolated - that is, has degree 0. For $n > k$ the following results arise for the degrees of the vertices:
(1) When $k < n < 2k$ we have $\deg(1) = \deg(n) = n - k$; $\deg(2) = \deg(n-1) = n - k - 1$; ...; $\deg(n-k) = \deg(k+1) = 1$, and for any $n - k + 1 \le i \le k$, $\deg(i) = 0$;
(2) If $n \ge 2k$, then $\deg(1) = \deg(n) = n - k$; $\deg(2) = \deg(n-1) = n - k - 1$; $\deg(3) = \deg(n-2) = n - k - 2$; and, in general, for any $1 \le i \le k$, $\deg(i) = \deg(n-i+1) = n - k - i + 1$. Furthermore, for any $k < x < n - k + 1$, $\deg(x) = (x - k) + [n - (x + k)] + 1 = n - 2k + 1$, so $\deg(k) = \deg(k+1) = \deg(k+2) = \cdots = \deg(n-k) = \deg(n-k+1) = n - 2k + 1$;
(3) $\Delta(G(n,k)) = \max\{\deg(v) \mid 1 \le v \le n\} = n - k$; and,
(4) $\delta(G(n,k)) = \min\{\deg(v) \mid 1 \le v \le n\} = \max\{0, n - 2k + 1\}$.
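The degree formulas in (1)-(4) can be compared with a direct count. In the sketch below, `degree_formula` is a hypothetical helper that encodes cases (1) and (2) using the symmetry $\deg(i) = \deg(n - i + 1)$; `degree_brute` simply counts the vertices at distance at least $k$.

```python
def degree_brute(n, k, v):
    """Number of vertices w in 1..n with |v - w| >= k."""
    return sum(1 for w in range(1, n + 1) if abs(v - w) >= k)


def degree_formula(n, k, v):
    """Degree of v in G(n, k) according to cases (1) and (2)."""
    j = min(v, n - v + 1)                 # use deg(v) = deg(n - v + 1)
    if n <= k:
        return 0                          # every vertex is isolated
    if n < 2 * k:
        return n - k - j + 1 if j <= n - k else 0
    return n - k - j + 1 if j <= k else n - 2 * k + 1


if __name__ == "__main__":
    n, k = 11, 3
    print([degree_formula(n, k, v) for v in range(1, n + 1)])
    print([degree_brute(n, k, v) for v in range(1, n + 1)])
```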
Results (1) and (2), along with our observations on isolated vertices, provide us with one more property for this section. Here $\gamma(G)$ denotes the domination number of the undirected graph $G$.
(1) For $n \ge 2k$ the vertices 1 and $n$ provide a smallest minimal dominating set for $G(n,k)$, so here $\gamma(G(n,k)) = 2$.
(2) When $k < n < 2k$, $\gamma(G(n,k)) = (2k - n) + 2$, since the isolated vertices together with 1 and $n$ constitute a smallest minimal dominating set.
(3) If $1 \le n \le k$, we have $\gamma(G(n,k)) = n$.
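The three cases can be compared with an exhaustive search on small instances. The sketch below uses hypothetical function names and tries vertex subsets in order of increasing size, so it is only practical for small $n$. It deliberately restricts the comparison to $k \ge 2$ and to $n \le k$ or $n \ge k + 2$: for $k = 1$ the graph is complete, and for $n = k + 1$ it has a single edge, so in both boundary cases one vertex already dominates all non-isolated vertices and the brute-force value comes out one smaller than the stated formula.

```python
from itertools import combinations


def domination_number(n, k):
    """Smallest size of a dominating set of G(n, k), by exhaustive search."""
    vertices = range(1, n + 1)
    # closed[v]: v together with all vertices at distance >= k from v
    closed = {v: {w for w in vertices if w == v or abs(v - w) >= k} for v in vertices}
    everything = set(vertices)
    for size in range(1, n + 1):
        for cand in combinations(vertices, size):
            if set().union(*(closed[v] for v in cand)) == everything:
                return size


def domination_formula(n, k):
    """Cases (1)-(3) of the text."""
    if n <= k:
        return n
    if n < 2 * k:
        return (2 * k - n) + 2
    return 2


if __name__ == "__main__":
    for n, k in [(5, 3), (8, 3), (3, 4)]:
        print(n, k, domination_number(n, k), domination_formula(n, k))
```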
3. Complete Subgraphs and Independent Subsets of $G(n,k)$
In this section we determine the number of complete subgraphs $K_r$, $2 \le r \le \lceil n/k \rceil$, contained in $G(n,k)$. For $r = 2$ this is the number of edges in the graph. The complementary notion of independence is also examined here as we count the number of independent subsets (in $G(n,k)$) of size $s$, for $0 \le s \le k$. Once again $k$ is fixed, but now we shall concentrate on the case where $n > k$ - the graph $G(n,k) = (V, E)$ then has at least one edge and not every subset of $V$ is independent.
For any $x, y \in V$, suppose that $1 \le x < y \le n$. Then the edge $\{x, y\}$ is in $E$ if and only if $y - x \ge k$. Consequently, we can determine $|E|$ by considering the number of ways to arrange $n$ marbles (identical except for color) in a line, where two marbles are red, the other $n - 2$ blue, and there are at least $k - 1$ blue marbles between the two red ones. Hence the two red marbles determine three possible locations for each of the remaining $(n - 2) - (k - 1)$ blue marbles: (i) the first location is on the left of the first red marble; (ii) the second between the two red marbles; and, (iii) the third to the right of the second red marble. When we select one of these three positions for each of the remaining $n - k - 1$ blue marbles we count selections of size $n - k - 1$ from a set of size 3, where repetitions are allowed. This gives us
$$\binom{3 + (n-k-1) - 1}{n-k-1} = \binom{n-k+1}{n-k-1} = \binom{n-k+1}{2} = (1/2)[(n-k)^2 + (n-k)]$$
selections, so there are $(1/2)[(n-k)^2 + (n-k)]$ edges in $G(n,k)$.
Remark: If we let $e_n$ denote the number of edges in $G(n,k)$ for $k$ fixed and $n > k$, then the preceding result may also be derived from the recurrence relation $e_{n+1} = e_n + (n + 1 - k)$, where $e_{k+1} = 1$.
Turning now to the triangles (subgraphs isomorphic to $K_3$) in $G(n,k)$ we adjust the preceding argument. Here we need to position (in a line) three red and $n - 3$ blue marbles so that there are at least $k - 1$ blue marbles between each consecutive pair of red ones. Now four possible locations arise for each of the remaining $(n - 3) - 2(k - 1) = n - 2k - 1$ blue marbles. So we find the number of triangles in $G(n,k)$ by counting the number of selections of size $n - 2k - 1$ we can make - with repetitions allowed - from a set of size 4. This results in
$$\binom{4 + (n-2k-1) - 1}{n-2k-1} = \binom{n-2k+2}{3}$$
triangles in the graph $G(n,k)$.
We generalize the previous arguments as follows. For any $4 \le r \le \lfloor n/k \rfloor$, the number of subgraphs (in $G(n,k)$) isomorphic to $K_r$ is determined by counting the number of selections of size $(n - r) - (r - 1)(k - 1)$ we can make - with repetitions allowed - from a set of size $r + 1$. This number is
$$\binom{r + 1 + [(n-r) - (r-1)(k-1)] - 1}{(n-r) - (r-1)(k-1)} = \binom{n - (r-1)k + (r-1)}{r}.$$
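The marble argument can be cross-checked by enumerating the $r$-subsets of $\{1, \ldots, n\}$ directly. The following is a minimal sketch with hypothetical function names; it also evaluates the edge recurrence from the Remark. A sorted $r$-tuple forms a clique exactly when consecutive entries differ by at least $k$.

```python
from itertools import combinations
from math import comb


def count_cliques(n, k, r):
    """Number of r-subsets of 1..n whose elements pairwise differ by >= k."""
    return sum(
        1
        for c in combinations(range(1, n + 1), r)   # tuples come out sorted
        if all(b - a >= k for a, b in zip(c, c[1:]))
    )


def clique_formula(n, k, r):
    """The binomial coefficient C(n - (r-1)k + (r-1), r)."""
    return comb(n - (r - 1) * k + (r - 1), r)


def edges_by_recurrence(n, k):
    """e_{k+1} = 1 and e_{m+1} = e_m + (m + 1 - k), evaluated up to m + 1 = n."""
    e = 1
    for m in range(k + 1, n):
        e += m + 1 - k
    return e


if __name__ == "__main__":
    n, k = 10, 3
    for r in range(2, n // k + 1):
        print(r, count_cliques(n, k, r), clique_formula(n, k, r))
    print("edges via recurrence:", edges_by_recurrence(n, k))
```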
We summarize our prior computations in the following.
Theorem 1: For $n > k$ and $2 \le r \le \lfloor n/k \rfloor$, the number of complete subgraphs (isomorphic to) $K_r$ in $G(n,k)$ is $\binom{n - (r-1)k + (r-1)}{r}$.
For $n > k$, the clique number of $G(n,k)$ is $\omega(G(n,k)) = \lfloor n/k \rfloor$.
Turning our attention now to independent subsets in $G(n,k)$, consider $2 \le s \le k$. [For $s = 0$ the empty set is the only independent subset; there are $n$ independent subsets for $s = 1$ - namely, $\{j\}$, where $1 \le j \le n$.]
Fix $s$ and let $i(n,s)$ denote the number of independent subsets of size $s$ in $G(n,k)$. [In order to facilitate the solution of the following recurrence relation, at this point we shall replace the condition "$n > k$" by "$n \ge k$".] Then
$$i(n+1, s) = i(n, s) + \binom{k-1}{s-1}, \quad n \ge k, \qquad (*)$$
where $\binom{k-1}{s-1}$ is the number of independent subsets (in $G(n,k)$) of size $s$ that contain $n + 1$ along with $s - 1$ of the vertices $n - k + 2, n - k + 3, \ldots, n - 1, n$.
The solution to the recurrence relation (*) has the form $i(n,s) = i(n,s)^{(h)} + i(n,s)^{(p)}$, where
$i(n,s)^{(h)}$ = the homogeneous part of the solution = $A$, a constant, and
$i(n,s)^{(p)}$ = the particular part of the solution = $Bn$, where $B$ is a constant.
Upon substituting $i(n,s)^{(p)} = Bn$ into (*) we find that $B = \binom{k-1}{s-1}$, so $i(n,s) = A + \binom{k-1}{s-1} n$. Since $i(k,s) = \binom{k}{s}$ we find that $A = \binom{k}{s} - \binom{k-1}{s-1} k$, so
$$i(n,s) = \binom{k-1}{s-1} n + \left[\binom{k}{s} - \binom{k-1}{s-1} k\right], \quad n \ge k.$$
The preceding is now summarized as follows.
Theorem 2: For $n \ge k$ and $2 \le s \le k$, the number of independent subsets of size $s$ in $G(n,k)$ is $\binom{k-1}{s-1} n + \left[\binom{k}{s} - \binom{k-1}{s-1} k\right]$.
Corollary 2: For fixed $k$ in $\mathbb{Z}^+$ and any $n$ in $\mathbb{Z}^+$, the independence number of $G(n,k)$ is $\beta(G(n,k)) = k$.
Corollary 3: For $n \ge k$ the Fibonacci number of $G(n,k)$ - that is, the total number of independent subsets (including $\emptyset$) of $G(n,k)$ - is $(n-k)2^{k-1} + 2^k$.
Proof: The total number of independent subsets in $G(n,k)$ is
$$1 + n + n(2^{k-1} - 1) + (2^k - 1 - k) - k(2^{k-1} - 1) = (n-k)2^{k-1} + 2^k.$$
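Both counts are easy to confirm by enumeration for small parameters. The sketch below (hypothetical function names) uses the observation that a subset of $\{1, \ldots, n\}$ is independent in $G(n,k)$ exactly when its largest and smallest elements differ by less than $k$.

```python
from itertools import combinations
from math import comb


def count_independent(n, k, s):
    """Number of independent s-subsets of G(n, k), by enumeration."""
    return sum(
        1
        for c in combinations(range(1, n + 1), s)   # sorted tuples
        if s == 0 or c[-1] - c[0] < k
    )


def independent_formula(n, k, s):
    """C(k-1, s-1) n + [C(k, s) - C(k-1, s-1) k], stated for n >= k, 2 <= s <= k."""
    return comb(k - 1, s - 1) * n + comb(k, s) - comb(k - 1, s - 1) * k


def fibonacci_number(n, k):
    """Total number of independent subsets of G(n, k), including the empty set."""
    return sum(count_independent(n, k, s) for s in range(n + 1))


if __name__ == "__main__":
    n, k = 9, 4
    print([count_independent(n, k, s) for s in range(2, k + 1)])
    print([independent_formula(n, k, s) for s in range(2, k + 1)])
    print(fibonacci_number(n, k), (n - k) * 2 ** (k - 1) + 2 ** k)
```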
Remark: The adjective "Fibonacci" is used in this graph-theoretic context for the following reason. For $n \in \mathbb{Z}^+$, the number of subsets of $\{1, 2, 3, \ldots, n\}$ which do not contain any consecutive integers is $F_{n+2}$, the $(n+2)$-nd Fibonacci number (where $F_1 = F_2 = 1$, and $F_n = F_{n-1} + F_{n-2}$ for $n \ge 3$). And this set - namely, $\{1, 2, 3, \ldots, n\}$ - can be considered as the vertex set for the graph $P_n$, the path on $n$ vertices. Consequently, $F_{n+2}$ is the total number of independent subsets (including $\emptyset$) of the vertices for the path $P_n$. This definition of the Fibonacci number of a graph is given by H. Prodinger and R.F. Tichy in reference [3].
4. $G(n,k)$ - Perfect Though Not Always Chordal
As we continue to investigate the structure of the graph G(n, k ) , we turn now to the notion of a perfect graph. This is defined as follows.
Definition 1: An undirected graph $G = (V, E)$ is called perfect if for each induced subgraph $H$ of $G$, we have $\omega(H) = \chi(H)$. (Once again $\omega(H)$ denotes the clique number of $H$, while $\chi(H)$ is the chromatic number of $H$.)
In order to prove that $G(n,k)$ is perfect we shall prove in two ways that its complement $\overline{G}(n,k)$ is perfect. This will be accomplished in both cases by proving that $\overline{G}(n,k)$ is a chordal graph, a notion we shall define after we make some observations about $\overline{G}(n,k)$.
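(i) For any $n \in \mathbb{Z}^+$, $\overline{G}(n,1)$ consists of $n$ isolated vertices while $\overline{G}(n,2)$ is a path of length $n - 1$. If $n, k \in \mathbb{Z}^+$ and $n \le k$, then $\overline{G}(n,k)$ is (isomorphic to) $K_n$. Finally, for $k \ge 3$ and $n > k$, the following vertex sequences provide Hamilton cycles in $\overline{G}(n,k)$:
($n$ even) $1 \to 3 \to 5 \to \cdots \to n-1 \to n \to n-2 \to n-4 \to \cdots \to 2 \to 1$; and,
($n$ odd) $1 \to 3 \to 5 \to \cdots \to n-2 \to n \to n-1 \to n-3 \to \cdots \to 2 \to 1$.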
(ii) If $n > k$ and $2 \le s \le k$, then it follows from Theorem 2 that $\overline{G}(n,k)$ has $\binom{k-1}{s-1} n + \left[\binom{k}{s} - \binom{k-1}{s-1} k\right]$ subgraphs isomorphic to $K_s$ - in particular, the number of edges in $\overline{G}(n,k)$ is $(n - (k/2))(k - 1)$, and $\omega(\overline{G}(n,k)) = k$.
(iii) For $n > k$ and $2 \le r \le \lfloor n/k \rfloor$, we learn from Theorem 1 that $\overline{G}(n,k)$ has $\binom{n - (r-1)k + (r-1)}{r}$ independent subsets of size $r$, and $\beta(\overline{G}(n,k)) = \lfloor n/k \rfloor$.
Returning to the major theme of the section, we now need the following ideas.
Definition 2: The undirected graph $G = (V, E)$ is called chordal if every cycle in $G$ of length greater than 3 possesses a chord - that is, an edge joining two nonconsecutive vertices in the cycle.
Definition 3: For an undirected graph $G = (V, E)$ let $\sigma = [v_1, v_2, \ldots, v_n]$ be a linear order for the vertices in $V$. For $1 \le i \le n$ let $N(v_i)$ denote the set of all vertices in $G$ that are adjacent to $v_i$ - that is, $N(v_i) = \{w \mid w \in V, \{v_i, w\} \in E\}$. We call $\sigma$ a perfect vertex elimination scheme if for each $1 \le i \le n - 1$ the subgraph induced by $N(v_i) \cap \{v_{i+1}, v_{i+2}, \ldots, v_n\}$ is a complete graph.
In reference [4] it is shown that the existence of a perfect vertex elimination scheme characterizes chordal graphs. Now for any $\overline{G}(n,k)$ - in particular, for those cases where $n, k > 1$ - we find that by ordering the vertices as $1, 2, 3, \ldots, n-1, n$, we have:
(1) For $n \le k$, $\overline{G}(n,k)$ is (isomorphic to) the complete graph $K_n$ and for all $1 \le i \le n - 1$, $N(i) \cap \{i+1, i+2, \ldots, n\}$ induces the complete subgraph $K_{n-i}$.
(2) When $n > k$, $N(i) \cap \{i+1, i+2, \ldots, n\}$ induces (i) the complete subgraph $K_{k-1}$ for $i = 1, 2, \ldots, n - k + 1$; and, (ii) the complete subgraph $K_{n-i}$ for $i = n - k + 2, \ldots, n - 1$.
Consequently, $\overline{G}(n,k)$ is chordal for all $n, k \in \mathbb{Z}^+$. And from the results of C. Berge [5] and A. Hajnal and J. Surányi [6], every chordal graph is perfect, so $\overline{G}(n,k)$ is now perfect for all $n, k \in \mathbb{Z}^+$. Furthermore, in [7] L. Lovász shows that an undirected graph $G$ is perfect if and only if $\overline{G}$ is perfect. Therefore the integer-distance graph $G(n,k)$ is also perfect for all $n, k \in \mathbb{Z}^+$.
However, for a fixed $k > 1$, when $n \ge k + 3$, the cycle $1 \to k+2 \to 2 \to k+3 \to 1$ is a subgraph of $G(n,k)$, but neither of the edges $\{1, 2\}$ and $\{k+2, k+3\}$ is in $G(n,k)$ - so in these cases $G(n,k)$ is not chordal. The preceding is summarized in the following.
Theorem 3: (1) For all $n, k \in \mathbb{Z}^+$, the graph $\overline{G}(n,k)$ is chordal (and perfect).
(2) For $k \in \mathbb{Z}^+$ and any $1 \le n \le k + 2$, $G(n,k)$ is chordal (and perfect).
(3) For $k \in \mathbb{Z}^+$ and any $n \ge k + 3$, $G(n,k)$ is perfect but not chordal.
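Two ingredients of the argument lend themselves to a mechanical check: that the natural order $1, 2, \ldots, n$ is a perfect vertex elimination scheme for the complement (where $i$ and $j$ are adjacent exactly when $0 < |i - j| < k$), and that the 4-cycle $1, k+2, 2, k+3$ has no chord in $G(n,k)$ when $k > 1$ and $n \ge k + 3$. The following is a minimal sketch with hypothetical function names; it checks only these two facts and does not test perfection itself.

```python
from itertools import combinations


def later_neighbours_in_complement(n, k, i):
    """Vertices j > i adjacent to i in the complement of G(n, k)."""
    return [j for j in range(i + 1, n + 1) if j - i < k]


def natural_order_is_elimination_scheme(n, k):
    """Check that each later neighbourhood induces a complete subgraph."""
    for i in range(1, n):
        later = later_neighbours_in_complement(n, k, i)
        # a pair at distance >= k would be non-adjacent in the complement
        if any(b - a >= k for a, b in combinations(later, 2)):
            return False
    return True


def four_cycle_is_chordless(n, k):
    """Is 1, k+2, 2, k+3 a cycle of G(n, k) without a chord?"""
    cyc = [1, k + 2, 2, k + 3]
    if max(cyc) > n:
        return False

    def edge(a, b):
        return abs(a - b) >= k

    sides = all(edge(cyc[i], cyc[(i + 1) % 4]) for i in range(4))
    chords = edge(cyc[0], cyc[2]) or edge(cyc[1], cyc[3])
    return sides and not chords


if __name__ == "__main__":
    print(natural_order_is_elimination_scheme(10, 4))
    print(four_cycle_is_chordless(10, 4))
    print([len(later_neighbours_in_complement(10, 4, i)) for i in range(1, 10)])
```

The last line of the demo lists the sizes of the later neighbourhoods, which should be $k - 1$ for $i \le n - k + 1$ and $n - i$ afterwards.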
The next result follows from Theorem 3 and the definition of a perfect graph.
Corollary 4: For $n, k \in \mathbb{Z}^+$ with $n > k$, we find that $\chi(G(n,k)) = \lfloor n/k \rfloor$ and $\chi(\overline{G}(n,k)) = k$.
The results in Theorem 3 and Corollary 4 can also be obtained (with additional information on the structures of $G(n,k)$ and $\overline{G}(n,k)$) by showing that $\overline{G}(n,k)$ is a unit interval graph.
In general if $F$ is a family of nonempty sets, then the intersection graph of $F$ is the graph whose vertices correspond to the sets in $F$, and whose edges connect two vertices when the corresponding sets (from $F$) have a nonempty intersection. When $F$ is a family of intervals for a totally ordered set (like the real line) then the intersection graph is called an interval graph. If the intervals all have unit length then the term unit interval graph is used.
For $n, k \in \mathbb{Z}^+$ and $\overline{G}(n,k) = (V, \overline{E})$, define the real-valued function $u: V \to \mathbb{R}$ as follows. If $m \in V$, then $m \in \mathbb{Z}^+$ with $1 \le m \le n$, and we define $u(m) = m/k$. For $1 \le m_1 < m_2 \le n$, where $m_1, m_2 \in \mathbb{Z}^+$, we know that
$$\{m_1, m_2\} \in \overline{E} \iff (m_2 - m_1) < k \iff \frac{m_2}{k} - \frac{m_1}{k} < 1 \iff |u(m_1) - u(m_2)| < 1.$$
And in [8] F. Roberts shows that the existence of such a function $u$ characterizes unit interval graphs. So $\overline{G}(n,k)$ is a unit interval graph - and G. Hajós [9] shows that any interval graph is chordal. Hence, as we mentioned earlier, $\overline{G}(n,k)$ is perfect, as is $G(n,k)$. But, in addition, the results by F. Roberts in [8] show that $G(n,k)$ is a comparability graph where every transitive orientation is a semiorder. (For more on this we refer the reader to pp. 15-16 and pp. 186-187 of the text by M.C. Golumbic [2].)
We close with one more result for the graph $\overline{G}(n,k)$. When $n \le k$, $\overline{G}(n,k)$ is (isomorphic to) $K_n$ and the chromatic polynomial of $\overline{G}(n,k)$ is
$$P(\overline{G}(n,k), \lambda) = \lambda(\lambda - 1)(\lambda - 2) \cdots (\lambda - n + 1).$$
For the case where $n > k$ recall that if $G$ is an undirected graph with subgraphs $G_1$, $G_2$, $G_3$, and $G = G_1 \cup G_2$, with $G_1 \cap G_2 = G_3$, a complete graph, then the chromatic polynomial of $G$ is
$$P(G, \lambda) = \frac{P(G_1, \lambda)\, P(G_2, \lambda)}{P(G_3, \lambda)}.$$
By now applying this technique - perhaps, several times - to $\overline{G}(n,k)$, we find that
$$P(\overline{G}(n,k), \lambda) = \lambda(\lambda - 1)(\lambda - 2) \cdots (\lambda - k + 2)(\lambda - k + 1)^{n-k+1},$$
when $n > k$.
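For small parameters the chromatic polynomial of the complement can be compared with a direct count of proper colourings. The sketch below uses hypothetical function names. Its closed form for $n > k$, namely $\lambda(\lambda-1)\cdots(\lambda-k+2)\,(\lambda-k+1)^{n-k+1}$, is the product read off the elimination order $1, 2, \ldots, n$ (each vertex $i$ has $\min(k-1, n-i)$ later neighbours); treating this as the intended formula is an assumption of the sketch.

```python
from itertools import product
from math import prod


def count_colourings(n, k, colours):
    """Proper colourings of the complement of G(n, k) with `colours` colours.

    Vertices are 0..n-1 here; i and j are adjacent when 0 < |i - j| < k.
    """
    total = 0
    for colouring in product(range(colours), repeat=n):
        if all(
            colouring[i] != colouring[j]
            for i in range(n)
            for j in range(i + 1, min(i + k, n))
        ):
            total += 1
    return total


def closed_form(n, k, lam):
    """Falling factorial for n <= k; elimination-order product for n > k."""
    if n <= k:
        return prod(lam - j for j in range(n))
    return prod(lam - j for j in range(k - 1)) * (lam - k + 1) ** (n - k + 1)


if __name__ == "__main__":
    n, k = 5, 3
    for lam in range(0, 6):
        print(lam, count_colourings(n, k, lam), closed_form(n, k, lam))
```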
5. Some Further Properties of $G(n,k)$ and $\overline{G}(n,k)$
We close with the following observations.
(1) Given an undirected graph $G = (V, E)$, a clique cover (of $G$) of size $m$ is a partition of $V = V_1 \cup V_2 \cup \cdots \cup V_m$, where the subgraph of $G$ induced by each $V_i$, $1 \le i \le m$, is a complete subgraph. The size of a smallest possible clique cover of $G$ is called the clique cover number of $G$ and is denoted by $\kappa(G)$.
In general, $\beta(G) \le \kappa(G)$, where $\beta(G)$ is the independence number of $G$, and for any subgraph $H$ of $G$ we have $\beta(H) = \omega(\overline{H})$ and $\kappa(H) = \chi(\overline{H})$. When $G$ is perfect we know that $\overline{G}$ is also perfect, and so for any induced subgraph $H$ of $G$, it follows that for $\overline{H}$ in $\overline{G}$ we have $\omega(\overline{H}) = \chi(\overline{H})$. Therefore, for any induced subgraph $H$ of $G$ one finds that $\beta(H) = \kappa(H)$, since $\beta(H) = \omega(\overline{H}) = \chi(\overline{H}) = \kappa(H)$. Consequently, for any fixed $k$ in $\mathbb{Z}^+$ and any $n$ in $\mathbb{Z}^+$, we find that:
(i) the clique cover number of $G(n,k)$ is $k$; and,
(ii) for $\overline{G}(n,k)$ the clique cover number is 1 when $n \le k$, and $\lfloor n/k \rfloor$ for $n > k$.
(2) For an undirected graph $G = (V, E)$, the vertex covering number of $G$, denoted $\alpha(G)$, is the size of a smallest subset $S$ of $V$ where each edge of $G$ is incident with at least one vertex in $S$. The following result, due to T. Gallai [10], now determines $\alpha(G(n,k))$ and $\alpha(\overline{G}(n,k))$. If $G = (V, E)$ is an undirected graph with no isolated vertices, then $\alpha(G) + \beta(G) = |V|$. Consequently, for $n, k \in \mathbb{Z}^+$, with $k$ fixed, it follows that:
(i) $\alpha(G(n,k)) = n - k$, for $n \ge 2k$;
Related to the invariants in ( 2 ) one finds that for an undirected graph G = ( V , E ) , a set E' of edges in G is called edge independent if for any e l . e2 E E' there is no common vertex. The size of a largest such set E' is called the edge independence number of G denoted P1(G). When G has no isolated vertices we define an edge cover of G as a subset E" of E such that for all v E V, v is a vertex on at least one edge in E ". The size of a smallest edge cover is the edge covering number of G - denoted al(c). A second result due to T. Gallai [lo] yields a.,(G)+ P,(G) =
lu.
For n,k E Z+ with k fixed, when n 2 2k the graph G(n, k ) has a Hamiltonian path (when n = 2 k ) or a Hamilton cycle (when n > 2k). Consequently,'
144
R.P. Grimaldi
al(G(n, k ) ) = r n / 2 1 , and Pl(c(n, k ) ) = n - r n / 2 1 , for n 2 2k. A similar argument gives us
$\alpha_1(\overline{G}(n,k)) = \lceil n/2 \rceil$, and
6. Acknowledgement
The author wishes to thank the referees for their comments on improving this article especially with regard to the second way to obtain Theorem 3 by considering results about unit interval graphs.
References
[1] G. Chartrand and L. Lesniak; Graphs and Digraphs, Second Edition, Wadsworth & Brooks/Cole, Monterey, California (1986).
[2] M.C. Golumbic; Algorithmic Graph Theory and Perfect Graphs, Academic Press, New York (1980).
[3] H. Prodinger and R.F. Tichy; Fibonacci numbers of graphs, The Fibonacci Quarterly, 20, 16-21 (1982).
[4] D.R. Fulkerson and O.A. Gross; Incidence matrices and interval graphs, Pacific Journal of Mathematics, 15, 835-855 (1965).
[5] C. Berge; Les problèmes de coloration en théorie des graphes, Publ. Inst. Statist. Univ. Paris, 9, 123-160 (1960).
[6] A. Hajnal and J. Surányi; Über die Auflösung von Graphen in vollständige Teilgraphen, Ann. Univ. Sci. Budapest, Eötvös Sect. Math., 1, 113-121 (1958).
[7] L. Lovász; Normal hypergraphs and the perfect graph conjecture, Discrete Math., 2, 253-267 (1972).
[8] F.S. Roberts; Indifference graphs, Proof Techniques in Graph Theory, Frank Harary (editor), Academic Press, New York, 139-146 (1969).
[9] G. Hajós; Über eine Art von Graphen, Intern. Math. Nachr., 11, Problem 65 (1957).
[10] T. Gallai; Über extreme Punkt- und Kantenmengen, Ann. Univ. Sci. Budapest, Eötvös Sect. Math., 2, 133-138 (1959).
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 145-152 (1993)
© 1993 Elsevier Science Publishers B.V. All rights reserved.
TOUGHNESS AND THE CYCLE STRUCTURE OF GRAPHS
Douglas BAUER Department of Pure and Applied Mathematics Stevens Institute of Technology, Hoboken, New Jersey, U.S.A.
Edward SCHMEICHEL Department of Mathematics and Computer Science San Jose State University, San Jose, California, U.S.A.
Abstract We discuss some old and new problems concerning the relationship between the toughness of a graph and its cycle structure.
1. Introduction
Since ChvAtal introduced the notion of toughness in [l] significant progress has been made toward understanding the relationship between this parameter and the cycle structure of a graph. Much of this progress is surveyed in [2]. However many vexing problems remain. Some of these problems were raised in [l] but others are relatively new. The purpose of this note is to discuss recent progress in this area and indicate some directions for future research. Before proceeding further we present a few definitions and some notation. Additional definitions will be given later as needed. A good reference for any undefined terms is [3]. We consider only finite undirected graphs without loops or multiple edges. Let w(G) denote the number of components of a graph G. A graph G is t-tough if IS12 to(G - S) for every subset S of the vertex set Vof G with o(G - 4 > 1. The toughness of G , denoted t(G), is the maximum value o f t for which G is t- tough (t(K,) = = for all n 2 1). We let a(G) denote the cardinality of a maximum set of independent vertices of G. The length of a longest cycle in G is called the circumference of G and is denoted c(G). We also letp( G) denote the length of a longest path in G. A k-factor is a k-regular spanning subgraph. For k 22, we let k
ok = min {
I
d ( v i ) {vl, v2, ..., vk } is an independent set cf vertices } .
i= 1
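The definition of toughness can be made concrete with a small brute-force computation. The following is a minimal sketch, not part of the original note: it assumes a graph is given as a dictionary of adjacency sets, and the helper names `components` and `toughness` are hypothetical. It tries every vertex subset, so it is only usable on very small graphs.

```python
from fractions import Fraction
from itertools import combinations

def components(vertices, adj):
    """Number of connected components of the subgraph induced on `vertices`."""
    seen, count = set(), 0
    for v in vertices:
        if v in seen:
            continue
        count += 1
        seen.add(v)
        stack = [v]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w in vertices and w not in seen:
                    seen.add(w)
                    stack.append(w)
    return count

def toughness(adj):
    """Minimum of |S| / w(G - S) over all S with w(G - S) > 1.

    Returns None when no such S exists (complete graphs), where the
    toughness is taken to be infinite.
    """
    nodes = list(adj)
    best = None
    for size in range(0, len(nodes)):
        for S in combinations(nodes, size):
            rest = set(nodes) - set(S)
            w = components(rest, adj)
            if w > 1:
                ratio = Fraction(size, w)
                if best is None or ratio < best:
                    best = ratio
    return best

# The 5-cycle is 1-tough; the star K_{1,3} has toughness 1/3.
c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
print(toughness(c5), toughness(star))
```

The exhaustive search over subsets is deliberate: it mirrors the definition directly, at exponential cost.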
2. A Direction for Future Research

Does there exist a constant $t_0$ such that every $t_0$-tough graph is Hamiltonian? In [1] Chvátal conjectured that such a $t_0$ does exist and noted that $t_0 = 2$ would imply a theorem of Fleischner [4], stating that the square of every 2-connected graph is Hamiltonian. Chvátal also conjectured that every 3/2-tough graph has a 2-factor and that every k-tough graph on n vertices with kn even has a k-factor. Only the latter conjecture is correct, as shown by Enomoto et al. [5] in Theorems 1 and 2 below.
Theorem 1: Let G be a k-tough graph on n vertices with $n \ge k + 1$ and kn even. Then G has a k-factor.
Theorem 2: Let $k \ge 1$. For any positive real number $\varepsilon$, there exists a $(k - \varepsilon)$-tough graph G on n vertices with kn even and $n \ge k + 1$ which has no k-factor.

Since we are interested in the cycle structure of graphs we focus for now on k = 2. Theorems 1 and 2 state, in essence, that 2-tough graphs have 2-factors and that there exist $(2 - \varepsilon)$-tough graphs without 2-factors. The infinite family of graphs in [5] that demonstrate the latter all have vertices of degree 4. Can such graphs be found with minimum degree at least 5? What if we require $\delta(G) \ge a|V(G)|$ for some constant $a > 0$? More generally, we raise the following questions. Let G be a t-tough graph on n vertices, where $1 \le t \le 2$. Find the smallest nonnegative constants $\beta(t)$ and $\gamma(t)$ such that for sufficiently large n

(1) $\delta(G) \ge \beta(t)n$ implies G is Hamiltonian,

(2) $\delta(G) \ge \gamma(t)n$ implies G has a 2-factor.

Clearly $\beta(t) \ge \gamma(t)$, and if the conjecture that 2-tough graphs are Hamiltonian is correct then $\beta(2) = \gamma(2) = 0$. We first outline recent progress on question (2). By Theorem 1, $\gamma(2) = 0$. The following theorem is established in [6].

Theorem 3: For $1 \le t < 2$ let G be a t-tough graph on $n \ge 3$ vertices. If $\delta(G) \ge \left(\frac{2-t}{1+t}\right) n$, then G contains a 2-factor.

It is demonstrated in [6] that Theorem 3 is best possible for $1 \le t < 3/2$; consequently $\gamma(t) = (2 - t)/(1 + t)$ in that range. Furthermore, if $3/2 \le t < 2$, then

$$\gamma(t) \le f(t) = \frac{(2-t)(t-1)}{7t - 7 - t^2},$$

and in fact $\gamma(t) = f(t)$ for every t of the form $(2r - 1)/r$, where $r \ge 2$. The difference between f(t) and $(2 - t)/(1 + t)$ is quite small since $.94 \le (t^2 - 1)/(7t - 7 - t^2) \le 1$ for $t \in [3/2, 2]$. It is also shown in [6] that if $3/2 \le t < 2$, then $\gamma(t) \ge g(t)$, where g(t) is defined as follows. Let $r = r(t) \ge 2$ denote the integer such that $(2r - 1)/r \le t \le (2r + 1)/(r + 1)$ and set

$$g_1(t) = \frac{(r-1)(2-t)}{(r+1)(1+t) - 5} \quad\text{and}\quad g_2(t) = \frac{t-1}{3(r+1)(2t-3) + t + 1}.$$

Then

$$g(t) = \begin{cases} g_1(t) & \text{if } (2r-1)/r \le t < b(r), \\ g_2(t) & \text{if } b(r) \le t < (2r+1)/(r+1), \end{cases}$$

where $b(r) = (6r^2 - r - 4)/(3r^2 + r - 3)$. Note that g(t) is continuous at $t = b(r)$ and that $g(t) = f(t)$ if $t = (2r - 1)/r$ for $r \ge 2$. For other values of t the difference between f(t) and g(t) is quite small, e.g., $g(11/7) = 3/19 \approx .158$ and $f(11/7) = 4/25 = .16$. We conjecture that in fact $\gamma(t) = g(t)$ if $3/2 \le t < 2$.
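The numerical remarks about f and g can be checked with exact rational arithmetic. The sketch below is an illustration only: the closed forms used for f, g_1 and g_2 are assumptions, reconstructed so that they agree with every value quoted in the text (f(11/7) = 4/25, g(11/7) = 3/19, continuity at b(r), and g = f at t = (2r - 1)/r), and the function names are hypothetical.

```python
from fractions import Fraction as Fr

def f(t):
    # Assumed form: (2 - t)(t - 1) / (7t - 7 - t^2)
    return (2 - t) * (t - 1) / (7 * t - 7 - t * t)

def b(r):
    return Fr(6 * r * r - r - 4, 3 * r * r + r - 3)

def r_of(t):
    """Smallest integer r >= 2 with (2r-1)/r <= t <= (2r+1)/(r+1)."""
    if not (Fr(3, 2) <= t < 2):
        raise ValueError("t must satisfy 3/2 <= t < 2")
    r = 2
    while not (Fr(2 * r - 1, r) <= t <= Fr(2 * r + 1, r + 1)):
        r += 1
    return r

def g1(t, r):
    # Assumed form: (r - 1)(2 - t) / ((r + 1)(1 + t) - 5)
    return (r - 1) * (2 - t) / ((r + 1) * (1 + t) - 5)

def g2(t, r):
    # Assumed form: (t - 1) / (3(r + 1)(2t - 3) + t + 1)
    return (t - 1) / (3 * (r + 1) * (2 * t - 3) + t + 1)

def g(t):
    r = r_of(t)
    return g1(t, r) if t < b(r) else g2(t, r)

t = Fr(11, 7)
print(g(t), f(t))   # the two bounds on gamma(11/7) quoted in the text
```

Using `Fraction` rather than floats keeps the comparison with 3/19 and 4/25 exact.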
In contrast to our knowledge of $\gamma(t)$, we know almost nothing about $\beta(t)$. We do know that $\beta(1) = 1/2$ since the following theorem of Jung [7] is best possible.

Theorem 4: Let G be a 1-tough graph on $n \ge 11$ vertices with $\sigma_2 \ge n - 4$. Then G is Hamiltonian.

In [I%] it is shown that there exists an infinite collection of non-Hamiltonian 1-tough graphs on n vertices with $\sigma_2 = n - 5$ and thus $\beta(1) = 1/2$. If we assume $t(G) > 1$ the bound on $\sigma_2$ in Theorem 4 can be lowered, but not by very much [9].

Theorem 5: Let G be a graph on $n \ge 30$ vertices with $t(G) > 1$. If $\sigma_2 \ge n - 7$, then G is Hamiltonian.

In [9] it is also shown that there exists an infinite collection of non-Hamiltonian graphs whose toughness is larger than 1 and with $\sigma_2 = n - 8$. Recently, the non-Hamiltonian 1-tough graphs for which $\sigma_3 \ge (3n - 24)/2$ have been characterized [10]. Theorem 5 also follows from this characterization.

Let us now assume that t is a fixed rational number such that $1 < t \le 2$. For such t, determining $\beta(t)$ is an open problem. We can obtain a (possibly crude) upper bound on $\beta(t)$ from a result in [11], given below.
Theorem 6: Let G be a 1-tough graph on $n \ge 3$ vertices with $\sigma_3 \ge n$. Then $c(G) \ge \min(n,\; n + \sigma_3/3 - \alpha)$.

Clearly $\alpha \le n/(t + 1)$ and $\sigma_3 \ge 3\delta$, and so the next corollary follows.
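The passage from Theorem 6 to the corollary below can be written out in one chain. This is a sketch of the routine step, using only the two inequalities just quoted and the fact that a t-tough graph with $t \ge 1$ is in particular 1-tough; the restriction $t \le 2$ is what guarantees the hypothesis $\sigma_3 \ge n$.

```latex
\begin{align*}
\delta \ge \frac{n}{t+1},\quad 1 \le t \le 2
  &\;\Longrightarrow\; \sigma_3 \;\ge\; 3\delta \;\ge\; \frac{3n}{t+1} \;\ge\; n, \\
  &\;\Longrightarrow\; \frac{\sigma_3}{3} - \alpha \;\ge\; \delta - \frac{n}{t+1} \;\ge\; 0, \\
  &\;\Longrightarrow\; c(G) \;\ge\; \min\Bigl(n,\; n + \frac{\sigma_3}{3} - \alpha\Bigr) \;=\; n .
\end{align*}
```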
Corollary 7: For $1 \le t \le 2$ let G be a t-tough graph on n vertices with $\delta \ge n/(t + 1)$. Then G is Hamiltonian.

Thus we conclude that $\beta(t) \le 1/(t + 1)$ for $1 \le t \le 2$. We now suggest the intriguing possibility that $\gamma(t) = \beta(t)$ for $1 \le t \le 2$. Since Theorem 4 is also best possible for the existence of 2-factors, we know $\gamma(1) = \beta(1) = 1/2$. If the conjecture that 2-tough graphs are Hamiltonian is correct then $\gamma(2) = \beta(2) = 0$. However, for an intermediate value of t, say t = 3/2, all we know is that $1/5 = \gamma(3/2) \le \beta(3/2) \le 2/5$. Clearly there is much work to be done.

Some evidence that $\gamma(t) = \beta(t)$ for $1 \le t \le 2$ is that a similar phenomenon occurs with respect to the binding number. For $v \in V(G)$ let N(v) denote the set of vertices in G which are adjacent to v, and for $S \subseteq V(G)$ let $N(S) = \bigcup_{v \in S} N(v)$. The binding number of G, denoted b(G), is the minimum of $|N(S)|/|S|$ taken over all nonempty subsets S of V(G) such that $N(S) \ne V(G)$. In [12] Woodall proved that if $b(G) \ge 3/2$ then G is Hamiltonian. He also showed that the constant 3/2 is best possible, both for the existence of a Hamiltonian cycle and for the existence of a 2-factor. While this might lead one to believe that $\gamma(t) = \beta(t)$, it is important to realize that binding number and toughness have different characteristics. For instance, it is NP-hard to determine if a graph G is t-tough for any fixed rational number t [13], while b(G) can be computed in polynomial time [14].

3. More Open Problems
Another question regarding the relationship between toughness and cycle structure concerns what we will refer to as the Dominating Cycle Property (DCP). A cycle C in a graph G is a dominating cycle if every edge of G has at least one of its vertices on C. A graph G has the DCP if every longest cycle in G is a dominating cycle. Dominating cycles were introduced by Nash-Williams in [15] and discussed extensively in a later paper by Veldman [16]. Recently, the DCP has proved to be a useful idea in the study of cycle structure in graphs (e.g., [2], [11], [17]-[19]). The relationship between toughness and the DCP is not known for t > 1. For t = 1 we have the following results, each of which is best possible. The first theorem is a result due to Bigalke and Jung [8].

Theorem 8: Let G be a 1-tough graph on n vertices with $\delta \ge n/3$. Then G has the DCP.

This was later generalized in [11].

Theorem 9: Let G be a 1-tough graph on n vertices with $\sigma_3 \ge n$. Then G has the DCP.

Since every Hamiltonian graph has the DCP it is natural to add a third question to the two raised in the previous section. Let G be a t-tough graph on n vertices, where $1 \le t \le 2$. Find the smallest nonnegative constant $\eta(t)$ such that for sufficiently large n

(3) $\delta(G) \ge \eta(t)n$ implies G has the DCP.
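The definition of a dominating cycle used in this section is easy to test for a given cycle. The following is a minimal sketch under stated assumptions: the graph is an edge list, the cycle is a list of vertices in order, and `is_dominating_cycle` is a hypothetical helper name. It checks a single cycle only; deciding the DCP would additionally require examining every longest cycle, which this sketch does not attempt.

```python
def is_dominating_cycle(edges, cycle):
    """True if `cycle` is a cycle of the graph and every edge has an end on it."""
    edge_set = {frozenset(e) for e in edges}
    on_cycle = set(cycle)
    k = len(cycle)
    # A cycle needs at least three distinct vertices.
    if k < 3 or len(on_cycle) != k:
        return False
    # Consecutive vertices (cyclically) must be adjacent in the graph.
    for i in range(k):
        if frozenset((cycle[i], cycle[(i + 1) % k])) not in edge_set:
            return False
    # Domination: no edge may avoid the cycle entirely.
    return all(u in on_cycle or v in on_cycle for u, v in edges)

# A 4-cycle 0-1-2-3 with an extra vertex 4 joined to 0 and 2.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 4), (2, 4)]
print(is_dominating_cycle(edges, [0, 1, 2, 3]))             # every edge touches the cycle
print(is_dominating_cycle(edges + [(4, 5)], [0, 1, 2, 3]))  # edge 4-5 misses the cycle
```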
Clearly $\eta(t) \le \beta(t)$, and if the conjecture that 2-tough graphs are Hamiltonian is true, $\eta(2) = \beta(2) = 0$. But how does $\eta(t)$ compare with $\gamma(t)$? Since Theorem 8 is best possible, $1/3 = \eta(1) < \gamma(1) = 1/2$. Also, since $\eta(t)$ and $\gamma(t)$ are nonincreasing functions of t and $\gamma(5/4) = 1/3$, we have $\eta(t) \le \gamma(t)$ for $1 \le t \le 5/4$. Is it the case that $\eta(t) \le \gamma(t)$ for $5/4 < t \le 2$?
Another interesting problem is to find c(G), given t(G) and $\delta(G)$. Here we know very little, even for 1-tough graphs. The following theorem appears in [20].

Theorem 10: Let G be a 1-tough graph on $n \ge 3$ vertices with $\delta \ge n/3$. Then $c(G) \ge \min(n,\; n + \delta - \alpha + 1)$.
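The corollary that follows can be read off from Theorem 10 by bounding the two parameters separately. The sketch below uses the bound $\alpha \le n/(t+1)$ noted earlier with t = 1; it shows that the second argument of the minimum is at least 5n/6 + 1, and that argument is the smaller one as soon as $n \ge 6$.

```latex
\begin{align*}
\alpha \le \frac{n}{2},\quad \delta \ge \frac{n}{3}
  \;\Longrightarrow\;
  n + \delta - \alpha + 1 \;\ge\; n + \frac{n}{3} - \frac{n}{2} + 1 \;=\; \frac{5n}{6} + 1 .
\end{align*}
```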
Corollary 11: Let G be a 1-tough graph on $n \ge 3$ vertices with $\delta \ge n/3$. Then $c(G) \ge 5n/6 + 1$.

However, we do not believe that Corollary 11 is best possible. In fact we conjecture [11] that under the hypothesis in Corollary 11, $c(G) \ge (11n + 3)/12$. We have stated our questions in terms of $\delta$ rather than $\sigma_k$ ($k \ge 2$). Of course they can be considered in terms of $\sigma_k$, and in fact Theorem 10 has recently been generalized in this direction [21], as shown below (compare with Theorem 6).

Theorem 12: Let G be a 1-tough graph on $n \ge 3$ vertices with $\sigma_3 \ge n$. Then $c(G) \ge \min(n,\; n + \sigma_3/3 - \alpha + 1)$.

With regard to cycle structure problems it is often difficult to generalize a theorem involving a lower bound on $\delta$ to a similar theorem involving a lower bound on $\sigma_k$. Surprisingly this is not the case with respect to $\gamma(t)$. With little additional effort we established the following in [6].

Theorem 13: Let G be a t-tough graph on $n \ge 3$ vertices with $1 \le t < 2$. If [displayed condition missing in the source], then G has a 2-factor.

Another open problem concerns the relationship between the toughness of a graph and whether the graph is pancyclic. A graph G on n vertices is pancyclic if G contains a cycle of length l for every l such that $3 \le l \le n$. In [1] Chvátal conjectured that there exists a constant $t_0$ such that every $t_0$-tough graph is pancyclic. Of course this question is still open. An easier (and hence more frustrating) question was recently raised by Jackson and Katerinis [22]; namely, does there exist a constant $t_0$ such that every $t_0$-tough graph has a triangle?

A problem that has received some recent interest concerns the growth of c(G) as a function of n for fixed t. More specifically, let $G_2(t, n)$ denote the class of all 2-connected t-tough graphs on n vertices and let $C(t, n) = \min\{c(G) \mid G \in G_2(t, n)\}$. As $n \to \infty$, will $C(t, n) \to \infty$ for fixed t? The answer is yes, although if t is not fixed and $G_2(t, n)$ is replaced by the class of all k-connected graphs, the answer is no; e.g., $c(K_{k,n-k}) = 2k$ for all $n \ge 2k$. The following appears in [23].
Theorem 14: For a fixed constant A depending only on t, $C(t, n) \log C(t, n) \ge A \log n$.

Conjecture 15: For a fixed constant A depending only on t, $C(t, n) \ge A \log n$.

It is shown in [23] that Conjecture 15 is true for 3-connected graphs. Additional evidence for the conjecture is that a similar result holds for paths. Let $G_1(t, n)$ denote the class of all connected t-tough graphs on n vertices and let $P(t, n) = \min\{p(G) \mid G \in G_1(t, n)\}$.

Theorem 16: For a fixed constant A depending only on t, $P(t, n) \ge A \log n$.

It is shown in [23] that Conjecture 15, if true, and Theorem 16 are essentially best possible for $t \le 1$. It is an open problem to determine best possible lower bounds on C(t, n) and P(t, n) for t > 1.

We conclude with some comments on the relationship between toughness, minimum degree, and existence of k-factors. The following result [24] generalizes both Theorem 1 and Theorem 3.

Theorem 17: Let G be a t-tough graph on n vertices and $k \ge 2$ an integer such that $n \ge k + 1$ and kn is even. If $\delta(G) \ge (k - 1)(k - t)\,n/(1 + t) + (k - 2)$, then G contains a k-factor.

Theorem 17 is meaningful only if $t > k - 1 - 1/k$. It remains an open problem to determine if the degree bound in Theorem 17 is best possible in this range. If $t > (2k^2 - 2k - 1)/(2k - 1)$, Theorem 17 strengthens a result proved by Katerinis [25] and independently by Egawa and Enomoto [26].

Theorem 18: Let G be a graph on n vertices and $k \ge 1$ an integer such that $n \ge 4k - 5$ and kn is even. If $\delta(G) \ge n/2$ then G contains a k-factor.
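The two thresholds quoted for Theorem 17 can be recovered by comparing the coefficient of n in its degree bound with 1 (a degree condition $\delta \ge n$ can never hold) and with 1/2 (the coefficient in Theorem 18). The computation below is a sketch of where the thresholds come from, ignoring the additive term k - 2; it is not necessarily the authors' own argument.

```latex
\begin{align*}
\frac{(k-1)(k-t)}{1+t} < 1
  &\iff k^2 - k - 1 < kt
  \iff t > k - 1 - \frac{1}{k}, \\[4pt]
\frac{(k-1)(k-t)}{1+t} < \frac{1}{2}
  &\iff 2k^2 - 2k - 1 < (2k-1)\,t
  \iff t > \frac{2k^2 - 2k - 1}{2k - 1}.
\end{align*}
```

For k = 2 the first threshold is t > 1/2 and the bound reduces to the one in Theorem 3.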
Acknowledgements

Douglas Bauer was supported in part by the National Security Agency under Grant MDA 904-H-89-2008. Edward Schmeichel was supported in part by the National Science Foundation under Grant DMSS904520.
References
[1] V. Chvátal; Tough graphs and Hamiltonian circuits, Discrete Math., 5, 215-228 (1973).
[2] D. Bauer, E.F. Schmeichel, and H.J. Veldman; Some recent results on long cycles in tough graphs, Proc. 6th Int. Conf. on the Theory and Applications of Graphs, Kalamazoo, 1988, Y. Alavi, G. Chartrand, O.R. Oellermann and A.J. Schwenk (editors), 113-123 (1991).
[3] G. Chartrand and L. Lesniak; Graphs and Digraphs, Wadsworth, Inc., Belmont, Calif. (1986).
[4] H. Fleischner; The square of every 2-connected graph is Hamiltonian, J. Combinatorial Theory Ser. B, 16, 29-34 (1974).
[5] H. Enomoto, B. Jackson, P. Katerinis, and A. Saito; Toughness and the existence of k-factors, J. Graph Theory, 9, 87-95 (1985).
[6] D. Bauer and E.F. Schmeichel; Toughness, minimum degree and the existence of 2-factors, preprint (1991).
[7] H.A. Jung; On maximal circuits in finite graphs, Annals of Discrete Math., 3, 129-144 (1978).
[8] A. Bigalke and H.A. Jung; Über Hamiltonsche Kreise und unabhängige Ecken in Graphen, Monatsh. Math., 88, 195-210 (1979).
[9] D. Bauer, G. Chen, and L. Lasser; A degree condition for Hamiltonian cycles in t-tough graphs with t > 1, preprint (1991).
[10] H.A. Jung, Shwe Kyaw, and Wei Bing; private communication.
[11] D. Bauer, A. Morgana, E.F. Schmeichel, and H.J. Veldman; Long cycles in graphs with large degree sums, Discrete Math., 79, 59-70 (1989/90).
[12] D.R. Woodall; The binding number of a graph and its Anderson number, J. Combinatorial Theory Ser. B, 29, 27-46 (1973).
[13] D. Bauer, S.L. Hakimi, and E.F. Schmeichel; Recognizing tough graphs is NP-hard, Discrete Appl. Math., 28, 191-195 (1990).
[14] W.H. Cunningham; Computing the binding number of a graph, Discrete Appl. Math., 27, 283-285 (1990).
[15] C.St.J.A. Nash-Williams; Edge-disjoint Hamiltonian circuits in graphs with vertices of large valency, Studies in Pure Mathematics, Academic Press, London, 157-183 (1971).
[16] H.J. Veldman; Existence of dominating cycles and paths, Discrete Math., 43, 281-296 (1983).
[17] D. Bauer, G. Fan, and H.J. Veldman; Hamiltonian properties of graphs with large neighborhood unions, Discrete Math., 96, 33-49 (1991).
[18] D. Bauer, H.J. Broersma, and H.J. Veldman; Around three lemmas in Hamiltonian graph theory, Topics in Combinatorics and Graph Theory, Physica-Verlag, Heidelberg, 101-110 (1990).
[19] Bert Fassbender; A sufficient condition on degree sums of independent triples for Hamiltonian cycles in 1-tough graphs, preprint (1989).
[20] D. Bauer, E.F. Schmeichel, and H.J. Veldman; A generalization of a theorem of Bigalke and Jung, Ars Combinatoria, 26, 53-58 (1988).
[21] Vu-Dinh-Hoa; Note on a theorem of Bauer, Morgana, Veldman and Schmeichel, preprint (1990).
[22] B. Jackson and P. Katerinis; A characterization of 3/2-tough cubic graphs, preprint (1990).
[23] H.J. Broersma, J. van den Heuvel, H.J. Veldman, and H.A. Jung; Long paths and cycles in tough graphs, preprint (1991).
[24] D. Bauer and E.F. Schmeichel; Toughness, minimum degree, and the existence of k-factors, preprint (1991).
[25] P. Katerinis; Minimum degree of a graph and the existence of k-factors, Proc. Indian Acad. Sci. (Math. Sci.), 94, 123-127 (1985).
[26] Y. Egawa and H. Enomoto; Sufficient conditions for the existence of k-factors, Recent Studies in Graph Theory, V.R. Kulli (editor), Vishwa International Publications, 96-105 (1989).
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 153-158 (1993) © 1993 Elsevier Science Publishers B.V. All rights reserved.
THE BIRKHOFF-LEWIS EQUATIONS FOR GRAPH-COLORINGS
William T. TUTTE Department of Combinatorics and Optimization University of Waterloo, Waterloo, Ontario, CANADA
Abstract The paper Chromatic Polynomials by G.D. Birkhoff and D.C. Lewis has long served as a textbook for students of map colorings. Much of it is concerned with equations relating to two kinds of chromatic polynomial (or chromial), the free and the constrained. In a paper in Discrete Mathematics the author simplified the theory of these equations and obtained what can charitably be described as a general solution. (It requires the inversion of a very large matrix, denoted in this paper by M.) He reported on this work at the conference Quo Vadis, Graph Theory? The present paper is based on his lecture notes. Essentially it is a shortened version of the paper in Discrete Mathematics but there are differences of approach. This paper keeps closer to the spirit of the grand original by relating constrained chromials directly to free, whereas the paper in Discrete Mathematics proceeds by relating non-planar free chromials to planar ones.
The equations of the title arose out of the problem of coloring graphs. To get a coloring of a graph G from $\lambda$ given colors we assign a color to each vertex so that each edge joins vertices of two different colors. (This is impossible if G has a loop.) We then have a $\lambda$-coloring of G. We denote the number of such $\lambda$-colorings by $P(G;\lambda)$. It is well known that $P(G;\lambda)$ has the form of a polynomial in $\lambda$ with integer coefficients. It is identically zero if and only if G has a loop. Otherwise its degree is the number of vertices of G and its leading coefficient is 1. It is the chromatic polynomial or chromial of G. We note that if G is the edgeless graph of m vertices then

(1) $P(G;\lambda) = \lambda^m$,

but if G is the complete graph of m vertices then

(2) $P(G;\lambda) = \lambda(\lambda - 1)(\lambda - 2) \cdots (\lambda - m + 1)$.

Chromials were introduced by G.D. Birkhoff in the hope that they would help solve the Four Color Problem. But they have acquired an interest of their own, largely because of identities found to hold between chromials of related graphs. For example let G have an edge A that is not a loop. Let graphs $G'_A$ and $G''_A$ be derived from G, the first by deleting A and the second by contracting A, with its two ends, into a single new vertex. Then it is easy to show that

(3) $P(G;\lambda) = P(G'_A;\lambda) - P(G''_A;\lambda)$.

The Birkhoff-Lewis equations are identities of the same general kind. In naming the Birkhoff-Lewis equations I refer to the great paper of G.D. Birkhoff and D.C. Lewis called Chromatic Polynomials, published in 1946 [1]. Much of that paper is concerned with relations holding between their free and constrained chromatic polynomials. In a paper of my own, which is to appear soon in Discrete Mathematics [2], these relations are both generalized and simplified but are still essentially the same Birkhoff-Lewis equations. The present paper, based on a lecture given at the Conference Quo Vadis, Graph Theory?, covers the same ground. But the theoretical development is rather different.
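The deletion-contraction identity for chromials lends itself to a short recursive evaluation. The following is a minimal sketch, assuming a graph is given as a set of vertex labels plus an edge list (parallel edges and loops allowed); the names `chromial` and `brute_count` are hypothetical. It evaluates the chromial at a numeric value of the number of colors and compares it with direct enumeration.

```python
from itertools import product

def chromial(vertices, edges, lam):
    """Evaluate the chromatic polynomial at `lam` by deletion-contraction."""
    edges = list(edges)
    if any(u == v for u, v in edges):
        return 0                            # a loop admits no coloring
    if not edges:
        return lam ** len(vertices)         # edgeless graph: every assignment works
    (u, v), rest = edges[0], edges[1:]
    deleted = chromial(vertices, rest, lam)
    # Contract the edge: every occurrence of v is renamed u.
    merged = [(u if a == v else a, u if b == v else b) for a, b in rest]
    contracted = chromial(vertices - {v}, merged, lam)
    return deleted - contracted

def brute_count(vertices, edges, lam):
    """Count proper colorings directly, for comparison."""
    order = sorted(vertices)
    total = 0
    for colors in product(range(lam), repeat=len(order)):
        col = dict(zip(order, colors))
        if all(col[a] != col[b] for a, b in edges):
            total += 1
    return total

triangle = [(0, 1), (1, 2), (0, 2)]
print(chromial({0, 1, 2}, triangle, 3), brute_count({0, 1, 2}, triangle, 3))
```

The recursion is exponential in the number of edges; it is meant to mirror the identity, not to be efficient.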
Consider a circle C in the Euclidean plane. Let a planar graph G be drawn, planely, in that plane so that one or more of its vertices are on the circle and the rest of the graph lies entirely inside C. Let V(C) be the set of vertices on C, and let their number be n. We enumerate them as $v_1, v_2, \ldots, v_n$, in their order on C, and we write $S_n$ for their cyclic sequence.

A partition X of $S_n$ is a collection of disjoint non-null subsets of V(C) whose union is V(C). The subsets are the parts of X. We write their number as h(X). For example if n = 5 there is a partition U of V(C) with parts $\{v_1, v_3\}$, $\{v_2, v_4\}$ and $\{v_5\}$, and with h(U) = 3. We naturally write this as

(4) $(\{v_1, v_3\}, \{v_2, v_4\}, \{v_5\})$.

To identify a part of a partition it suffices to give its suffixes, without separating commas. Commas can be used to separate groups of suffixes determining different parts. Thus the above symbol of the partition U can be abbreviated as (13,24,5).

A partition X of $S_n$ is called non-planar if two vertices of one part separate two vertices of another on C, and planar if there is no such separation. As examples of planar partitions with n = 6 we may have (12,34,56), (13,2,46,5), (123,456), (123456). The partition U = (13,24,5) mentioned above is non-planar, as is (135,246).

Let X and Y be partitions of $S_n$. We say that X refines Y if each part of X is contained in some part of Y, or equivalently that each part of Y is a union of one or more parts of X.
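The planarity condition on partitions can be tested mechanically. Below is a minimal sketch, assuming partitions are written in the abbreviated suffix notation of the text (so it handles n up to 9); the names `parse` and `is_planar` are hypothetical. Two vertices a < c of one part separate another part exactly when that part has members both strictly between a and c and outside that range.

```python
from itertools import combinations

def parse(symbol):
    """Turn an abbreviated symbol such as '13,24,5' into a list of parts."""
    return [tuple(int(ch) for ch in group) for group in symbol.split(",")]

def is_planar(partition):
    """True if no two vertices of one part separate two vertices of another."""
    for P, Q in combinations(partition, 2):
        for a, c in combinations(sorted(P), 2):
            # Members of Q strictly between a and c on the circle.
            inside = sum(a < q < c for q in Q)
            if 0 < inside < len(Q):
                return False
    return True

for symbol in ["12,34,56", "13,2,46,5", "123,456", "123456", "13,24,5", "135,246"]:
    print(symbol, is_planar(parse(symbol)))
```

Checking each unordered pair of parts once is enough, because the separation relation is symmetric in the two parts.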
Given a partition X of $S_n$, we can define two new polynomials, each depending on both X and G. Following Birkhoff and Lewis we call them the free and constrained chromials of G, with respect to X. Actually they both satisfy constraints. Initially we define the free chromial $F(G, X;\lambda)$ as the number of $\lambda$-colorings of G such that any two vertices in the same part of X have the same color. The constrained chromial $K(G, X;\lambda)$ satisfies the same condition with the additional requirement that vertices of $S_n$ in different parts of X must have different colors.

To justify the terminology we note that each of $F(G, X;\lambda)$ and $K(G, X;\lambda)$ can be interpreted as the ordinary chromial of a graph, possibly non-planar, obtained from G by adjoining appropriate new edges and making suitable identifications of vertices. A free or constrained chromial is called planar or non-planar according as the corresponding partition X is planar or non-planar. It can be verified that each of the new polynomials satisfies an analogue of (3). Thus,

(5) $F(G, X;\lambda) = F(G'_A, X;\lambda) - F(G''_A, X;\lambda)$,

(6) $K(G, X;\lambda) = K(G'_A, X;\lambda) - K(G''_A, X;\lambda)$.

To prove (3), (5) or (6) we observe that, of the permissible $\lambda$-colorings, the ones of $G'_A$ in which the two ends of A have different colors are those of G, and the ones of $G'_A$ in which the two ends have the same color are in 1:1 correspondence with those of $G''_A$. But there is an awkwardness here: if both ends of A are on C then when we form $G''_A$ by contracting A we spoil the circle C.
We can circumvent this difficulty by using a device recommended to the author by F. Bernhart. We change the definition of $G''_A$. Instead of contracting A we replace it by a new kind of edge called a contractive edge. For a graph with contractive edges we extend the definition of a $\lambda$-coloring by requiring that the two ends of a contractive edge must have the same color, whereas the two ends of an ordinary edge must still have different colors. From now on we allow G to have contractive edges. But we do not relax the requirements of planarity and relationship to C, which treat both kinds of edges alike. The definitions of chromials, ordinary, free and constrained, remain as before. The two definitions of $G''_A$ are chromatically equivalent. The chromial of a graph is not altered by contracting its contractive edges. Equations (3), (5) and (6) are valid with the new definitions, which we propose to use from now on. A graph H in which all the edges are contractive is called a contractive graph. If such a graph has k components then clearly

(7) $P(H;\lambda) = \lambda^k$.

The theories of [1] and [2] discuss free and constrained chromials with respect to both planar and non-planar partitions. But here we take them only with planar ones. Our Birkhoff-Lewis equations express planar constrained chromials as linear combinations of planar free ones. More precisely we seek equations of the following form.

(8) $B(X)\,K(G, X;\lambda) = \sum_{Y} b(X, Y)\,F(G, Y;\lambda)$,

where X is any planar partition of $S_n$ and Y runs through the set of all planar partitions of $S_n$. B(X) and b(X, Y) are to be polynomials in $\lambda$ whose coefficients are integers. We may suppose the equation reduced to its lowest terms, i.e., that any common factor of all these coefficients, whether polynomial or integral, has been divided out.

Complete sets of such equations, for any specified G, can be got by various devices for some small values of n. Birkhoff and Lewis did this, in effect, for the two simplest non-trivial cases, n = 4 and n = 5. In a later paper D.C. Lewis and D.W. Hall solved the problem for n = 6 [3].

The solutions that have been reported show some interesting regularities. First, the coefficients B(X) and b(X, Y) depend on $S_n$ but not otherwise on the structure of G. Let us express this property by saying that the equations are invariant. A second regularity concerns B(X), which is found to be a product of a power of $\lambda$ and a product of what combinatorialists call Beraha polynomials. These are closely related to the Chebyshev polynomials. Each Beraha polynomial is a polynomial in $\lambda$. There is one of them, denoted here by c(m), for each integer m, positive, negative or zero. They can be defined by the initial equations

(9) $c(0) = 0, \quad c(1) = 1$

and the recursion formula

(10) $c(m + 1) + c(m - 1) = J(m)\,c(m)$,

where J(m) is $\lambda$ if m is even, and 1 if m is odd. As an immediate consequence of this definition we have the rule that $c(-m) = -c(m)$ for each m.
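The recursion for the Beraha polynomials is easy to run by machine. The following is a minimal sketch, assuming polynomials in the color variable are stored as coefficient lists with the constant term first; the function name `beraha` is hypothetical. It applies the recursion in the form c(m + 1) = J(m) c(m) - c(m - 1), starting from c(0) = 0 and c(1) = 1.

```python
def beraha(m_max):
    """Return [c(0), ..., c(m_max)] as coefficient lists, constant term first."""
    def subtract(p, q):
        size = max(len(p), len(q))
        p = p + [0] * (size - len(p))
        q = q + [0] * (size - len(q))
        out = [a - b for a, b in zip(p, q)]
        while len(out) > 1 and out[-1] == 0:
            out.pop()                      # drop trailing zero coefficients
        return out

    c = [[0], [1]]
    for m in range(1, m_max):
        # J(m) is the variable when m is even, and 1 when m is odd.
        scaled = [0] + c[m] if m % 2 == 0 else list(c[m])
        c.append(subtract(scaled, c[m - 1]))
    return c[: m_max + 1]

for m, poly in enumerate(beraha(8)):
    print(m, poly)
```

For instance, the list printed for m = 6 is `[3, -4, 1]`, that is, the square of the variable minus four times the variable plus 3.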
Table 1: c(m) for some small values of m.

m    c(m)
0    $0$
1    $1$
2    $1$
3    $\lambda - 1$
4    $\lambda - 2$
5    $\lambda^2 - 3\lambda + 1$
6    $\lambda^2 - 4\lambda + 3$
7    $\lambda^3 - 5\lambda^2 + 6\lambda - 1$
8    $\lambda^3 - 6\lambda^2 + 10\lambda - 4$

At Waterloo R. Dahab and D. Younger verified the second regularity up to n = 6. In [1], with a rather different definition of free chromials, the regularity is not quite so well marked. But the factor c(n + 1) is evident in the cases n = 4 and n = 6.

It was the appearance of Beraha polynomials that aroused my own interest in the problem. I wanted to know if they continued to appear for higher values of n and if so how they managed to do it. Work at Waterloo, done after the Conference, has answered these questions. A paper about it has been submitted to the Journal of Combinatorial Theory.

There is a far-reaching theory about equations of the form of equation (8). Consider a hypothetical one in which the coefficients B(X) and b(X, Y) have been fixed arbitrarily. Using (5) and (6) we find that if the equation holds for every relevant G of fewer than m ordinary edges, then it holds also for every relevant G of m ordinary edges (m > 0). Relevant here means properly related to C and $S_n$. It follows by induction that if the equation is true for every relevant contractive graph G then it is true for every relevant G.

Now if G is contractive each vertex of $S_n$ belongs to some component of G. Those components of G that meet $S_n$ define a planar partition Z(G) of it. Each part of Z(G) is the set of vertices of $S_n$ contained in some one component of G. In any $\lambda$-coloring of G all the vertices of any one component must have the same color. So the chromial of G is also its free chromial with respect to the planar partition Z(G).

Let us say that a contractive G conforms to $S_n$ if each of its components includes one or more vertices of $S_n$. In the general case let G have exactly k components that do not meet $S_n$. Then it is evident that for each partition X of $S_n$ we have the following equations, where H is the subgraph of G, conforming to $S_n$, that is the union of the components that do meet $S_n$.

(11) $F(G, X;\lambda) = \lambda^k F(H, X;\lambda)$,

(12) $K(G, X;\lambda) = \lambda^k K(H, X;\lambda)$.
We deduce that if equation (8) holds for all relevant contractive G conforming to $S_n$, then it holds for all relevant contractive G. Combining this with our other results about (8) we see that if that equation holds for all relevant conforming contractive G then it holds for all relevant G.

The free and constrained chromials for a conforming contractive G can be evaluated in terms of the planar partition Z = Z(G). Let us first make the trivial observation that we can always find a conforming contractive G corresponding to a given Z.

Consider $F(G, X;\lambda)$ for a conforming contractive G. It is the number of $\lambda$-colorings of G that satisfy the restrictions of both X and Z. That is, any two vertices of $S_n$ belonging to the same part, either of X or of Z, must have the same color. The combined restriction is that of a partition $X \vee Z$ of $S_n$, called the chromatic join of X and Z (see [4]). Its defining properties are first that both X and Z refine it, and second that $h(X \vee Z)$ has the greatest value consistent with that condition. It can be shown that the chromatic join, so defined, is unique. Accordingly we have

(13) $F(G, X;\lambda) = \lambda^{h(X \vee Z)}$.

I have found the following graph-theoretical definition of $X \vee Z$ to be of interest. From X and Z we construct a bipartite graph H(X, Z) with a bipartition {U, V}. The vertices in U are the parts of X and the vertices in V are the parts of Z. If some part is common to X and Z then it appears twice as a vertex of H(X, Z), once in U and once in V, and the two appearances are counted as distinct vertices. Each member of $S_n$ belongs to one part of X and one part of Z, and it is represented in H(X, Z) by an edge joining the two corresponding vertices. Then the parts of $X \vee Z$ correspond to the components of H(X, Z): each part consists of the members of $S_n$ represented by the edges of one component.

The chromatic join of two planar partitions X and Z is not necessarily planar. Thus if X = (13,2,4) and Z = (1,24,3) we have $X \vee Z$ = (13,24), and this is a non-planar partition.

Let us now supplement (13) with an equation for $K(G, X;\lambda)$. It is clear that this constrained chromial is identically zero unless Z refines X. But if Z does refine X we have

(14) $K(G, X;\lambda) = \lambda(\lambda - 1)(\lambda - 2) \cdots (\lambda - h(X) + 1)$.
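Formulas (13) and (14) can be checked by enumeration for small cases. The sketch below is an illustration under stated assumptions: a conforming contractive G is modelled only through its partition Z, so a coloring of G is taken to be a coloring of the n circle vertices that is constant on every part of Z; partitions are lists of tuples over 1..n, and all function names are hypothetical. The chromatic join is computed by merging parts that share a vertex.

```python
from itertools import product

def join(X, Z, n):
    """Chromatic join: the finest partition of 1..n refined by both X and Z."""
    parent = list(range(n + 1))
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a
    for part in list(X) + list(Z):
        for v in part[1:]:
            parent[find(v)] = find(part[0])
    blocks = {}
    for v in range(1, n + 1):
        blocks.setdefault(find(v), []).append(v)
    return sorted(tuple(b) for b in blocks.values())

def refines(Z, X):
    """True if every part of Z lies inside some part of X."""
    return all(any(set(z) <= set(x) for x in X) for z in Z)

def free_formula(X, Z, n, lam):
    return lam ** len(join(X, Z, n))            # formula (13)

def constrained_formula(X, Z, lam):
    if not refines(Z, X):
        return 0
    value = 1
    for i in range(len(X)):                     # falling factorial, formula (14)
        value *= lam - i
    return value

def brute(X, Z, n, lam, constrained):
    """Count colorings constant on parts of X and Z (and, if constrained,
    with distinct colors on distinct parts of X)."""
    count = 0
    for col in product(range(lam), repeat=n):
        same = all(len({col[v - 1] for v in p}) == 1 for p in list(X) + list(Z))
        distinct = len({col[p[0] - 1] for p in X}) == len(X)
        if same and (distinct or not constrained):
            count += 1
    return count

X = [(1, 3), (2,), (4,)]
Z = [(1,), (2, 4), (3,)]
print(join(X, Z, 4))    # the non-planar join (13,24) from the text
```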
Let us write $\zeta(X, Y) = 1$ if Y refines X, and $\zeta(X, Y) = 0$ otherwise, where X and Y are any partitions of $S_n$. Then we can rewrite (14) as

(15) $K(G, X;\lambda) = \zeta(X, Z)\,\lambda(\lambda - 1)(\lambda - 2) \cdots (\lambda - h(X) + 1)$.

Let us substitute from (13) and (15) in our (8). We get

(16) $B(X)\,\zeta(X, Z)\,\lambda(\lambda - 1)(\lambda - 2) \cdots (\lambda - h(X) + 1) = \sum_{Y} b(X, Y)\,\lambda^{h(Y \vee Z)}$.

The significance of this equation is as follows. If and only if the coefficients B(X), b(X, Y) are such that (16) holds for each pair of planar partitions {X, Z} of $S_n$, then (8) is an invariant Birkhoff-Lewis equation, valid for every G satisfying the initial conditions.

Having reached this point at the meeting in Fairbanks I offered the chilling reflection that now all the graph theory had been squeezed out of the Birkhoff-Lewis problem, and that it was now but an exercise in the theory of partitions of a cyclic sequence. But the recent work at Waterloo has depended heavily on the properties of the graph H(X, Y) that can be used to define the chromatic join $X \vee Y$. So I must now withdraw that observation. However hard you work to expel graph theory from a mathematical investigation it always returns.
We can now regard (16) as a set of equations, one for each pair {X, Z}, for the quotients b(X, Y)/B(X) that make (8) a valid invariant Birkhoff-Lewis equation. If the equations of the set are independent there must be a unique solution for the set of quotients.

We get shorter formulae by writing (16) as a single equation between matrices. Enumerate the planar partitions of $S_n$ as $X_1, X_2, \ldots, X_t$. Define four square matrices Z, D, E and M of order t as follows. For Z, E and M the entry in the ith row and jth column is $\zeta(X_i, X_j)$, $b(X_i, X_j)/B(X_i)$ and $\lambda^{h(X_i \vee X_j)}$ respectively. D is a diagonal matrix whose entry in the jth diagonal place is

$\lambda(\lambda - 1)(\lambda - 2) \cdots (\lambda - h(X_j) + 1)$.

Using these definitions we rewrite equation (16) as

(17) $DZ = EM$.

Now D is clearly non-singular. We can prove that Z is non-singular as follows. When we enumerate the planar partitions $X_i$ we can take first the one with one part, then those with two parts, then those with three, and so on. The matrix Z will then have a 1 in every diagonal position, and zeros everywhere below the diagonal. Hence the determinant of Z is 1.

The non-singularity of M is proved in [2]. The products of t elements of M are considered, one element from each row and one from each column. Each product is equal to a power of $\lambda$, and it is shown that the diagonal product gives a higher power of $\lambda$ than any of the others. Consequently det M is not identically zero as a polynomial in $\lambda$.

It follows from the non-singularity of Z, D and M that (17) can be solved uniquely for E when the other three matrices are known, as they are. Then E gives us the coefficients in our Birkhoff-Lewis equations. The solution is

(18) $E = DZM^{-1}$.
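The matrix solution can be tried out numerically for a small n. The sketch below is illustrative only and makes several assumptions: the number of colors is fixed at a numeric value larger than n rather than kept symbolic, the refinement-indicator matrix is built directly from the definition (entry 1 when the column partition refines the row partition), exact `Fraction` arithmetic is used, and all function names are hypothetical. It enumerates the planar partitions of 1..n, forms the matrices, and solves for the coefficient matrix E by Gauss-Jordan elimination.

```python
from fractions import Fraction

def set_partitions(items):
    """Yield every partition of the list `items` as a list of lists."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for p in set_partitions(rest):
        yield [[first]] + p
        for i in range(len(p)):
            yield p[:i] + [[first] + p[i]] + p[i + 1:]

def is_planar(p):
    for P in p:
        for Q in p:
            if P is not Q:
                for a in P:
                    for c in P:
                        inside = sum(min(a, c) < q < max(a, c) for q in Q)
                        if 0 < inside < len(Q):
                            return False
    return True

def join_size(X, Y, n):
    """Number of parts of the chromatic join of X and Y."""
    label = list(range(n + 1))
    for part in X + Y:
        target = label[part[0]]
        olds = {label[v] for v in part}
        label = [target if l in olds else l for l in label]
    return len(set(label[1:]))

def refines(Y, X):
    return all(any(set(y) <= set(x) for x in X) for y in Y)

def solve(A, B):
    """Gauss-Jordan: return S with A S = B; A must be non-singular."""
    t = len(A)
    aug = [list(A[i]) + list(B[i]) for i in range(t)]
    for col in range(t):
        piv = next(r for r in range(col, t) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        inv = 1 / aug[col][col]
        aug[col] = [x * inv for x in aug[col]]
        for r in range(t):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[t:] for row in aug]

def coefficients(n, lam):
    parts = sorted((p for p in set_partitions(list(range(1, n + 1)))
                    if is_planar(p)), key=len)
    DZ = []
    for X in parts:
        d = Fraction(1)
        for i in range(len(X)):
            d *= lam - i                      # falling factorial on the diagonal of D
        DZ.append([d * int(refines(Y, X)) for Y in parts])
    M = [[Fraction(lam) ** join_size(X, Y, n) for Y in parts] for X in parts]
    # E M = D Z and M is symmetric, so the transpose of E solves M S = (D Z)^T.
    E_T = solve(M, [list(col) for col in zip(*DZ)])
    E = [list(col) for col in zip(*E_T)]
    return parts, E, M, DZ

parts, E, M, DZ = coefficients(4, 5)
print(len(parts))     # number of planar partitions of a 4-cycle
```

Working at a fixed numeric value avoids polynomial arithmetic; recovering the polynomials B(X) and b(X, Y) themselves would need a symbolic version, which this sketch does not provide.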
The recent work at Waterloo already mentioned has dealt with the elements of $M^{-1}$ and with the determinant of M. The determinant has been evaluated for every n as a product of powers of Beraha polynomials multiplied by a power of $\lambda$. The result verifies a conjecture given in [4].
References
[1] G.D. Birkhoff and D.C. Lewis; Chromatic polynomials, Trans. Amer. Math. Soc., 60, 355-451 (1946).
[2] W.T. Tutte; On the Birkhoff-Lewis equations, Discrete Mathematics, 92, 417-425 (1991).
[3] D.C. Lewis and D.W. Hall; Coloring six-rings, Trans. Amer. Math. Soc., 64, 184-191 (1948).
[4] D.M. Jackson; The lattices of partitions and non-crossing partitions, and the Birkhoff-Lewis equations for the chromatic polynomial of a planar map, Research Report, CORR 89-34, University of Waterloo (1989).
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 159-172 (1993) © 1993 Elsevier Science Publishers B.V. All rights reserved.
THE COMPLEXITY OF KNOTS Dominic J.A. WELSH Merton College, University of Oxford Oxford, ENGLAND
Abstract The paper considers the computational complexity of classifying knots and of determining several well known knot invariants, both with and without an oracle for testing isotopy.
1. Introduction
The title of this paper is taken from Tait’s seminal work on knots. Although it is unlikely that Tait was thinking of complexity in the sense that it is used here, the underlying problems encountered by Tait are basically the same questions that we shall be considering. Two fundamental problems of knot theory are:
Is a knotted curve really knotted?
Are two knotted curves really the same knot?
These were clearly the topics in Tait’s mind when he wrote [1] (p.300): “Before taking up the question of the complexity of a knot a word or two must be said about the methods of reducing any given knot to its simplest form. I have not been able as yet to find any general method of doing this, nor have I even discovered what would probably solve this difficulty, any perfectly general method of pronouncing at once from its scheme or otherwise whether a knot is reducible or not.”
Tait might be amused to know that one hundred years later and despite a massive effort, these problems are still difficult. For example, until 1974 the knot diagrams shown in Figure 1 were wrongly thought to represent different knots.
Figure 1:
In this paper we shall discuss various algorithmic and complexity questions arising from knots. Familiarity with concepts from combinatorics will be assumed; the knot theory concepts will be defined, and more details may be found in the books of Burde and Zieschang [2] and Kauffman [3]. The complexity terminology follows Garey and Johnson [4].
The complexity classes P, NP, #P and EXPTIME have their usual meanings and we remind the reader that although it is known that
(1.1)  $P \subseteq NP \subseteq P^{\#P} \subseteq EXPTIME$,
the only inclusion which has been proved to be strict is that P is distinct from EXPTIME. The notation $A \propto B$ means that A is Turing reducible to B.
2. Isotopy of Links
A link L with c(L) components in the three-sphere $S^3$ is a smooth sub-manifold that consists of c(L) disjoint simple closed curves. A knot is a link with one component. Two links K, L are isotopic if there exists a homotopy $h_t : S^3 \to S^3$ $(0 \le t \le 1)$ such that $h_0 = 1$, each $h_t$ is a homeomorphism and $h_1(K) = L$. We restrict attention to tame links and thus we may assume that, for each link considered, the projection $\pi[L]$ of L to $\mathbb{R}^2$ is a finite 4-regular plane graph. The link diagram D(L) of L arising from $\pi[L]$ is obtained by indicating at each crossing which of the two curve segments goes over the other.
The fundamental theorem of Reidemeister [5] states:
Theorem 2.1: Two links K and L are isotopic if and only if any link diagram of K can be transformed into any link diagram of L by a finite sequence of the moves (I), (II), (III) and their inverses.
Figure 2: Reidemeister moves.
We will say that two link diagrams are equivalent if the corresponding links are isotopic. We denote this equivalence of link diagrams and isotopy of links by $D \sim D'$ and $K \sim K'$.
These moves, known as Reidemeister moves, are applied locally. In each case, away from the crossings to which the move is being applied, the diagrams remain unchanged. It is important to note that there are other notions of equivalence; for example, one could regard K1 and K2 as equivalent if there exists an autohomeomorphism of $S^3$ which maps K1 to K2. This is a weaker notion than isotopy. For example, take K1 and K2 as shown in Figure 3: they are the right hand trefoil knot and its mirror image, the left hand trefoil. These are equivalent in the above sense but are not isotopic; there is no sequence of Reidemeister moves which will transform K1 to K2. In other words, they are chiral. Henceforth we shall exclusively use equivalence in the sense of isotopic equivalence, and thus can rely entirely on the calculus of Reidemeister moves to demonstrate it.
Figure 3: Two non-isotopic knots.
A Graphic Interpretation of Link Isotopy
Given any link diagram D we note that if over/under crossings are ignored it can be regarded as a 4-regular plane graph G(D). Accordingly it is Eulerian and its dual plane graph, consisting of the faces of G(D), can be 2-coloured. We color the boundary faces black, and if two faces share a crossing we join them by a signed edge according to the convention shown in Figure 4.
Figure 4: +ve crossing, -ve crossing.
In this way, given any link diagram D we get a plane signed graph S(D) in which each edge, appropriately signed, corresponds to a crossing in D. Conversely, given any plane graph G with its edges signed + or -, we can associate with G, in a canonical way, a link diagram D(G) such that S(D(G)) = G. The construction is easy: draw the medial graph of G, call it m(G), and this will be the knot diagram, where the over/under nature of the crossings is determined by the sign of the appropriate edge in G. We leave the details to the reader. From this correspondence between signed planar graphs and link diagrams it is straightforward to verify that the Reidemeister moves (I)-(III) have the following interpretations as graph transformations:
(I*): Add or delete a loop or isthmus of any sign.
(II*(a)): Insert or delete a parallel pair of oppositely signed edges between any pair of vertices, provided this does not contravene the planar representation.
(II*(b)): Contract a pair of oppositely signed edges which are in series. Insert a pair of oppositely signed series edges at a vertex as shown in Figure 5.
(III*): The signed star triangle transformation (Figure 6).
Example: Consider the trefoil and its mirror image as shown in Figure 3. Their associated signed graphs $D$, $\bar{D}$ are shown below. How does one show that K1, K2 are not isotopic? Alternatively, how does one prove that there is no sequence of signed graph moves from $D$ to $\bar{D}$?
Figure 5:
Figure 6:
This difficulty is at the heart of the problems considered in this paper. It highlights a property of links which received much attention from Tait, who described K as amphicheiral (now often called achiral) if K is isotopic to its mirror image. Deciding whether a knot is achiral is still difficult.
Figure 7:
As another example of equivalence, take any plane graph G and let each edge be +ve, giving the signed graph $G^+$. Now take the dual plane graph $G^*$ with each edge negatively signed, giving $(G^*)^-$. Then it is not difficult to check:
(2.2) The links having diagrams $D(G^+)$ and $D((G^*)^-)$ are isotopic.
Moreover, we can show:
(2.3) If the plane graph G has n edges then $G^+$ may be transformed to $(G^*)^-$ by O(n) Reidemeister moves.
This is just a special case of a more general result:
Result 2.4: For any plane graph G, if S denotes a signed version of G, and $\bar{S}$ denotes its planar dual $G^*$ with the corresponding edges oppositely signed, then S can be transformed to $\bar{S}$ by $O(|E(G)|)$ of the moves I*-III*.
Sketch of Proof: Given a link diagram D with n crossings, which represents the link L, then if A denotes any boundary face of D, and the arc PQ represents the common internal boundary of face A as illustrated in Figure 8, then the diagram D can be transformed to the link diagram D' by a sequence of O(n) Reidemeister moves corresponding to pulling PRQ through the diagram. It is now easy to check that the graph of D' is in fact the dual graph of D. We call this move the fold.
Figure 8:
For more on the relation between graphs and knots see [6].
3. The Complexity of Knot Triviality and Isotopy
As mentioned earlier, one of the most fundamental algorithmic questions about knots is to decide whether a knot is trivial. In the now standard terminology of complexity questions we formalize this question as follows:
KNOT TRIVIALITY:
Instance: A knot diagram D.
Question: Is D topologically equivalent to the unknot?
There is an involved and difficult algorithm due to Haken [7], see also Schubert [8] and Hemion [9], which shows that this problem is decidable. However, as far as I am aware, the status of this problem in the complexity hierarchy is not known. More precisely, we pose the fundamental problem:
Problem 3.1: Find a function $f : \mathbb{Z} \to \mathbb{Z}$ such that if D is a knot diagram which is equivalent to the unknot and which has n crossings then there exists a sequence of at most f(n) Reidemeister moves which demonstrates the equivalence.
More succinctly, we are looking for a bound on the length of Reidemeister proofs of equivalence to the unknot. An example of a knot diagram which, in order to be shown equivalent to the unknot, needs an increase in the number of crossings is the diagram K shown in Figure 9. However, in a sense this highlights the inadequacy of the Reidemeister moves as a knot calculus, since if one allows the fold as a single move, illustrated by the transition from K to K', then it is not so easy to find examples of diagrams with this property.
Figure 9:
This prompts the question:
Problem 3.2: Find a sequence of link diagrams $D_i$ each representing the unknot and such that the number of crossings, $n_i$, of $D_i$ cannot be reduced to the unknot in fewer than n: Reidemeister moves.
In a discussion with L. Kauffman, we decided that the following construction may provide a class of hard examples to unravel. Take a fairly complicated diagram of the unknot, having say m crossings. Now fold it in half, then fold it in half again and continue in this way for perhaps k folds. This will give a diagram D which still represents the unknot, has approximately q m k ) crossings, and would seem to demand very many Reidemeister moves in any unraveling process. However, proving this assertion seems very difficult. A closely related, but even harder problem, is the following:
KNOT EQUIVALENCE:
Instance: Two knot diagrams D1, D2.
Question: Do D1 and D2 represent equivalent knots?
It would be a major advance to show that KNOT EQUIVALENCE belonged to the complexity class EXPTIME; though again there is no firm evidence indicating that the problem is not in polynomial time P.
4. Classical Link Invariants
Two link diagrams which have different numbers of components obviously represent non-isotopic links. This is probably the simplest example of a link invariant and is used as a first subdivision of links in the classical tables of links (see for example [10]). A less trivial invariant is the crossing number. This is defined to be the minimum number of crossings in any diagram representing the link. Determining the crossing number of a link is difficult. Expressed more formally, define the problem:
CROSSING NUMBER:
Instance: A knot K represented by a diagram D, together with an integer k.
Question: Is the crossing number of K < k?
Since a knot has zero crossing number if and only if it is trivial, we have:
(4.1) KNOT TRIVIALITY $\propto$ CROSSING NUMBER.
Tait introduced another knot invariant which is superficially related to crossing number and which he called beknottedness. It is the minimum number of changes of sign of crossings which reduce the knot to triviality. He remarks [1] (p.308): “There must be some very simple method of determining the amount of beknottedness for any given knot: but I have not hit upon it.” He might have been surprised to know that determining the beknottedness of a knot is still an immensely difficult problem, even for diagrams with as few as 9 crossings, see for example [11].
A link diagram D is alternating if and only if it has the property that in D the crossings are alternately over/under/over.... While it is trivial to verify that a given diagram is alternating, it is highly nontrivial to test whether a given link L has a representation as an alternating link diagram. Any such link is called alternating. Equivalently, a link diagram D is alternating if and only if the associated signed graph has all its edges the same sign. This is easy to see. Thus, the question of deciding whether a link is alternating reduces to:
Problem 4.2: Is there a polynomial time algorithm which will check whether a given signed graph is transformable to a monosigned graph using the moves I*-III*?
If there were such an algorithm it would settle the knot triviality question, since a recent result of Murasugi [12] and Thistlethwaite [13], which settles one of the long standing Tait conjectures, states:
Theorem 4.3: If L is an alternating link and D is an alternating diagram representing L then, provided D has no nugatory crossing, it has a minimum number of crossings over all diagrams representing L. (A crossing is nugatory in D if, as is shown in Figure 10, it corresponds exactly to the corresponding edge of the graph G(D) being an isthmus.)
Figure 10: A nugatory crossing.
It is clear from this and the fact that nugatory crossings are easily recognized and eliminated, that finding an alternating representation of an alternating knot is at least as hard (in the computational sense) as deciding whether a knot is trivial. Hence we define the problem:
ALTERNATING:
Instance: A link diagram D.
Question: Does D represent an alternating link?
We have immediately
(4.4) KNOT TRIVIALITY $\propto$ ALTERNATING.
One of the most powerful invariants of a knot is its group, or more precisely the fundamental group of the knot complement, and it is natural to examine its complexity. The origin of this invariant can be traced back to Poincaré. A series of results culminating in a recent result of Gordon and Luecke [14] show that it is close to providing a complete classification of knots. First we observe:
(4.5) From a knot diagram it is easy to obtain in polynomial time a presentation of the knot group in terms of generators and relations. The well known Wirtinger presentation of an n-crossing diagram gives a presentation of n generators and n relations and can be found in time $O(n^2)$.
(4.6) The famous theorem of Dehn and Papakyriakopoulos (1957) states that K has a group presented by a single generator if and only if K is the unknot.
In other words:
(4.7) The trivial knot is completely characterized by its group.
Unfortunately this characterization is not always easy to use and at the moment it still does not seem possible to use it to produce even an exponential time algorithm for deciding knot triviality, let alone link equivalence.
A geometric invariant of a knot is its genus. This is defined as follows: Seifert (1935) showed how to construct for any knot an orientable surface with the knot as its only edge. There are in fact many such surfaces, but of such surfaces, the one having minimum genus (that is, fewest handles) is called a minimal surface for the knot, and its genus is the genus of the knot. Deforming the knot by Reidemeister moves cannot change the genus. As with the knot group, we have a characterization of triviality in terms of genus, namely:
(4.8) The only knot having genus zero is the unknot.
This result is the basis of the Haken (1961) algorithm, refined by Hemion [9], which tests whether a given diagram represents the unknot. Starting with any Seifert surface spanning the knot, the surface is successively modified to produce a sequence of such surfaces of decreasing genus. At each decremental stage a test of minimality of genus is applied and this, in conjunction with (4.8), gives an algorithm for knot triviality.
This technique has been further developed to construct an algorithm for deciding whether two diagrams represent isotopic knots, thus demonstrating at least that the problems are decidable even if the algorithms do not appear to be practicable. For details see Waldhausen [15].
It is tempting to suggest that (4.8) could be used as the basis for a polynomial time nondeterministic algorithm for triviality; this would exist if, for any diagram D representing the trivial knot, there was some polyhedral spanning disc of D which has only polynomially many faces. If this were the case, an NP algorithm for KNOT TRIVIALITY would consist of first guessing a spanning disc, and then verifying that the faces fit together to form a disc and that D is the boundary curve. Unfortunately this very attractive idea fails. Snoeyink [16] has constructed for each integer n, a knot which is trivial but which has a representation in 3-space by a polygon of 4n + 17 line segments and for which any spanning disc has at least $2^n$ faces.
Invariants Reducible to Equivalence
Several of the invariants we have discussed are polynomial time reducible to the problem of knot equivalence. In other words, suppose we define $P^I$ (respectively $NP^I$) to be the class of problems which can be solved in polynomial time by a deterministic (respectively nondeterministic) Turing machine equipped with an oracle I to decide knot isotopy. Then it is clear that:
(4.9) KNOT TRIVIALITY $\in P^I$
(4.10) CHIRALITY $\in P^I$
Here the problem CHIRALITY is defined in the obvious way: given a link diagram, does it represent a chiral link? However, I can only show:
Result 4.11: ALTERNATING $\in NP^I$.
Proof: Given a link diagram D, on say n crossings, the nondeterministic machine ‘guesses’ a link K which is alternating and uses the oracle to verify that $K \sim D$. By Theorem (4.3), K need not have any more than n crossings.
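The oracle arguments behind (4.9), (4.10) and Result 4.11 can be sketched in code. This is only an illustration under stated assumptions: a diagram is modelled by the edge list of its signed graph (the plane embedding is omitted), the isotopy oracle is passed in as a function `isotopic`, and the nondeterministic guess is simulated by looping over a supplied list of candidates. All names below are hypothetical and not taken from the paper.

```python
from typing import Callable, Iterable, List, Tuple

# Assumed model: one signed edge (u, v, sign) per crossing, sign = +1 or -1.
SignedGraph = List[Tuple[int, int, int]]
# Hypothetical isotopy oracle I: decides whether two diagrams are isotopic.
Oracle = Callable[[SignedGraph, SignedGraph], bool]

UNKNOT: SignedGraph = []  # the crossing-free diagram has no edges


def mirror(diagram: SignedGraph) -> SignedGraph:
    """Mirror image: every crossing is switched, so every sign flips."""
    return [(u, v, -s) for (u, v, s) in diagram]


def is_trivial(diagram: SignedGraph, isotopic: Oracle) -> bool:
    """(4.9): a single oracle call against the unknot."""
    return isotopic(diagram, UNKNOT)


def is_chiral(diagram: SignedGraph, isotopic: Oracle) -> bool:
    """(4.10): a single oracle call against the mirror image."""
    return not isotopic(diagram, mirror(diagram))


def is_alternating_diagram(diagram: SignedGraph) -> bool:
    """A diagram is alternating iff all edges carry the same sign."""
    return len({s for (_, _, s) in diagram}) <= 1


def has_alternating_certificate(diagram: SignedGraph,
                                candidates: Iterable[SignedGraph],
                                isotopic: Oracle) -> bool:
    """Result 4.11, with the 'guess' replaced by a loop over candidates.

    Only candidates with at most n crossings are considered, as in the proof.
    """
    n = len(diagram)
    return any(len(k) <= n and is_alternating_diagram(k) and isotopic(k, diagram)
               for k in candidates)
```

Each of the first two functions makes one oracle call, which is why the problems lie in $P^I$; the last one needs a guessed certificate, which is why only membership in $NP^I$ follows. Any real use would need a genuine isotopy test in place of `isotopic`.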
An interesting question is:
Problem 4.12: Does ALTERNATING $\in P^I$?
A related and slightly easier question is:
Problem 4.13: Show that ALTERNATING $\in (coNP)^I$.
It is also easy to see:
Result 4.14: CROSSING NUMBER $\in NP^I$.
Proof: Given a link diagram D on n crossings and an integer t, one can verify that D has crossing number at most t by “guessing” a link diagram D' on fewer than t crossings and using I to show $D \sim D'$.
Figure 11:
However, I cannot settle:
Problem 4.15: Does CROSSING NUMBER $\in P^I$?
As far as beknottedness is concerned, it is not clear that finding the beknottedness of a link is decidable. An intriguing question concerns knot primality. If K1, K2 are two knots, their sum K1 # K2 is obtained by “tying” them together as shown in Figure 11 and then joining their ends to form a closed string. A knot K is prime if it is not the unknot and cannot be expressed as the sum of two nontrivial knots. Consider now the question:
KNOT PRIMALITY:
Instance: Knot diagram D.
Question: Does D represent a prime knot?
It is far from clear that this question is decidable, though it is. The difficulty is that there is no a priori bound on the size of the possible components of a composite knot.
Problem 4.17: Does KNOT PRIMALITY $\in EXPTIME^I$?
Note that as a problem about signed graphs, the question can be restated as follows. If K = K1 # K2 and the diagrams D1, D2 of K1, K2 are given, then the diagram of K is just a block sum D1 * D2. Thus given the right representation of K it is easy to check primality: just check there is no cut vertex. However, for each composite knot there are infinitely many possible diagrams; some of them have this separation, others do not. There appears to be no bound on the size of the smallest diagram at which the vertex separation must become apparent.
5. Link Polynomials
We close with a brief discussion of complexity questions arising from the recent surge of activity in link polynomial theory which has occurred since the discovery of the Jones polynomial [17] as a link invariant in 1984. There have been several recent excellent surveys of link polynomials and their interrelationship (see for example [18]), and therefore we shall concentrate here just on the Kauffman bracket polynomial. This is essentially the Jones polynomial, and is very close to polynomials already well understood in combinatorics. The bracket polynomial [L] of a link L is obtained from any link diagram of L by applying the equations
(5.1)  $[\times] = A[\asymp] + B[\,)(\,]$,
(5.2)  $[L \cup O] = d\,[L]$,
(5.3)  $[O] = d$,
locally. Here $\times$ stands for a crossing and $\asymp$, $)($ for the two ways of smoothing it, and we use O to denote the unknot. [L] is a polynomial in the 3 variables A, B and d, which are assumed to commute. It thus follows easily that [ ] is well defined on unoriented diagrams. As it stands it is not an invariant of isotopy. However, suppose that we consider the second Reidemeister move; applying the bracket rules we obtain the following formula.
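$[\mathrm{II}] = AB[\asymp \cup O] + B^2[\asymp] + BA[\,)(\,] + A^2[\asymp]$
$\qquad = (A^2 + B^2 + ABd)[\asymp] + AB[\,)(\,]$,
where II denotes the two-crossing diagram of the second Reidemeister move, $)($ the same pair of strands with both crossings removed, and $\asymp$ the other smoothing. But in order for the bracket to be invariant under isotopy, it certainly must satisfy
$[\mathrm{II}] = [\,)(\,]$
and this forces the relations
$AB = 1, \quad A^2 + B^2 + ABd = 0$.
Thus if we specialize the bracket by insisting that
$B = A^{-1}, \quad d = -(A^2 + A^{-2})$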
we have shown that the new one variable bracket of L, denoted by $\langle L \rangle$, is invariant under Reidemeister move (II). It is an easy exercise to check that $\langle L \rangle$ is also invariant under the third Reidemeister move and this means it is an invariant of regular isotopy. It is not an invariant under Reidemeister move (I). However, provided that the link is oriented and the bracket is then suitably normalized, Kauffman [19] showed that the resulting invariant is in fact the Jones polynomial of the link. More precisely, if we define the writhe w(L) of an oriented link to be the sum of the signs at crossings, then we have
Figure 12: +ve crossing, -ve crossing.
Theorem 5.4: The Jones polynomial $V_L(t)$ of an oriented link L is given by
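The display belonging to this statement is not legible here. As a hedged reminder of the standard form of Kauffman's result, and not a restoration of the author's own display: with $B = A^{-1}$ and $d = -(A^2 + A^{-2})$, removing a kink multiplies the bracket by $-A^{\pm 3}$, and the writhe factor is chosen to cancel exactly this. The formula below assumes the common normalization $\langle O \rangle = 1$ and the usual sign convention for kinks.

```latex
% Effect of Reidemeister move (I) on the one variable bracket:
% one smoothing of the kink leaves an extra circle (factor d), the other does not.
\begin{align*}
A\,d + A^{-1} &= -A^{3} - A^{-1} + A^{-1} = -A^{3}, \\
A + A^{-1} d  &= A - A - A^{-3} = -A^{-3}, \\[4pt]
% Standard statement, assuming <O> = 1:
V_L(t) &= \left(-A^{3}\right)^{-w(L)} \langle L \rangle \,\Big|_{A = t^{-1/4}} .
\end{align*}
```

Under the normalization (5.3), where the unknot has bracket d rather than 1, the state sum carries one extra factor of d, so the right-hand side would have to be divided by $d = -(A^2 + A^{-2})$ to give $V_O(t) = 1$.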
As far as complexity is concerned, the writhe is easy to calculate and hence computing the Jones polynomial of an oriented link is Turing equivalent to computing the one variable bracket polynomial of an unoriented version of the same link. But in turn, for the case of alternating links this is easily seen to be Turing equivalent to computing the Tutte polynomial of the underlying graph. To see this note the correspondence between the bracket rule and the delete/contract formulation of the Tutte polynomial
In general, this is a correspondence between links and signed graphs but in the special (presumably easier) case of alternating link diagrams the edges of the corresponding graph can be chosen to be positive and then we are exactly in the situation of having to determine the Tutte polynomial of a plane graph. Using this correspondence, a consequence of the results of Jaeger, Vertigan and Welsh [20] and Vertigan [21] is:
Theorem 5.7:
(i) Determining the Jones polynomial $V_L(t)$ of an alternating link is #P-hard.
(ii) Evaluating the Jones polynomial $V_L(t)$ at a point $t_0$ is #P-hard unless $t_0$ is one of the special points $\{\pm 1, \pm i, \pm e^{2\pi i/3}, \pm e^{4\pi i/3}\}$.
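The delete/contract formulation of the Tutte polynomial mentioned above can be sketched as follows. This is a minimal illustration using the classical recursion (multiply by y for a loop, by x for an isthmus, otherwise add the deleted and contracted graphs); the function names and the representation of a polynomial as a `Counter` keyed by exponent pairs are choices made here, not the paper's. The recursion branches on every ordinary edge, so its running time is exponential in general, which is consistent with the hardness stated in Theorem 5.7.

```python
from collections import Counter


def _connected(edges, u, v):
    """Return True if v can be reached from u using only the given edges."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    seen, stack = {u}, [u]
    while stack:
        x = stack.pop()
        if x == v:
            return True
        for y in adj.get(x, []):
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return False


def tutte(edges):
    """Tutte polynomial of a multigraph given as a list of edges (u, v).

    The result maps (i, j) to the coefficient of x^i * y^j.
    """
    if not edges:
        return Counter({(0, 0): 1})
    (u, v), rest = edges[0], list(edges[1:])
    if u == v:
        # Loop: T(G) = y * T(G - e)
        return Counter({(i, j + 1): c for (i, j), c in tutte(rest).items()})
    # Contract e by relabelling v as u in the remaining edges.
    contracted = [(u if a == v else a, u if b == v else b) for a, b in rest]
    if not _connected(rest, u, v):
        # Isthmus: T(G) = x * T(G / e)
        return Counter({(i + 1, j): c for (i, j), c in tutte(contracted).items()})
    # Ordinary edge: T(G) = T(G - e) + T(G / e)
    total = Counter(tutte(rest))
    for key, c in tutte(contracted).items():
        total[key] += c
    return total
```

For the triangle, `tutte([(0, 1), (1, 2), (0, 2)])` yields the three terms x^2, x and y, each with coefficient 1; the coefficients sum to 3, the number of spanning trees.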
One surprising feature of this result is that these special points are exactly the points at which knot theorists knew other evaluations of $V_L$. Working on the assumption that #P ≠ P, made even more reasonable by Toda’s Theorem [22] showing that #P is at least as hard as any problem in the polynomial hierarchy, we believe there can be no further exact evaluation of the Jones polynomial in terms of easily computable functions. Perhaps the most important open question concerning the Jones polynomial or its bracket equivalent is the following:
Problem 5.9: If $V_K(t) = 1$, is K the unknot?
The answer to this problem is probably not; it is easy to produce examples of non-isotopic knots having the same Jones polynomial, in the same way as it is relatively easy to produce nonisomorphic graphs having the same Tutte polynomial. However, settling Problem (5.9) may not be easy: a link K with $V_K(t) = 1$ and K nontrivial will need to be nonalternating and have crossing number at least 14. The number of distinct knots grows exponentially, see [23] and [24], and combined with Theorem (5.7) we see that a computer search for such a knot does not seem practicable. However, if the answer to Problem (5.9) turned out to be yes, then it would mean that testing knot triviality was no harder than the classical enumeration problems which are known to be #P-complete. It would be a major advance.
Acknowledgement I am grateful to L.H. Kauffman, W.B.R. Lickorish, D.W. Sumners and M.B. Thistlethwaite for their very helpful correspondence and discussions about various points raised in this paper.
References
[1] P.G. Tait; On knots I, II, III, Scientific Papers, Volume I, Cambridge University Press, London, 273-347 (1898).
[2] G. Burde and H. Zieschang; Knots, de Gruyter (1985).
[3] L.H. Kauffman; On Knots, Princeton University Press Ann. Math. Stud. (1987).
[4] M.R. Garey and D.S. Johnson; Computers and Intractability - A Guide to the Theory of NP-Completeness, Freeman, San Francisco (1979).
[5] K. Reidemeister; Homotopieringe und Linsenräume, Abh. Math. Sem. Hamburg, 11, 102-109 (1936).
[6] T. Yajima and S. Kinoshita; On the graphs of knots, Osaka Math. J., 155-163 (1953).
[7] W. Haken; Theorie der Normalflächen, Acta Math., 105, 245-375 (1961).
[8] H. Schubert; Knoten und Vollringe, Acta Math., 90, 131-286 (1953).
[9] G. Hemion; On the classification of homeomorphisms of 2-manifolds and the classification of 3-manifolds, Acta Math., 142, 123-155 (1979).
[10] M.B. Thistlethwaite; Knot tabulations and related topics, Aspects of Topology, I.M. James and E.H. Kronheimer (editors), Cambridge Univ. Press, 1-76 (1985).
[11] W.B.R. Lickorish; The unknotting number of a classical knot, Contemp. Math., 44, 117-121 (1985).
[12] K. Murasugi; Jones polynomials and classical conjectures in knot theory, Topology, 26, 187-194 (1987).
[13] M.B. Thistlethwaite; A spanning tree expansion of the Jones polynomial, Topology, 26, 297-309 (1987).
[14] C. McA. Gordon and J. Luecke; Knots are determined by their complements, J. Amer. Math. Soc., 2, 371-415 (1989).
[15] F. Waldhausen; Recent results on sufficiently large 3-manifolds, Proc. Symp. Pure Math. Amer. Math. Soc., 32, 21-38 (1978).
[16] J. Snoeyink; A trivial knot whose spanning disks have exponential size, Proc. 6th ACM Conf. Computational Geometry, 139-147 (1990).
[17] V.F.R. Jones; A polynomial invariant for knots via von Neumann algebras, Bull. Amer. Math. Soc., 12, 103-111 (1985).
[18] W.B.R. Lickorish; Polynomials for links, Bull. Lond. Math. Soc., 20, 558-588 (1988).
[19] L.H. Kauffman; State models and the Jones polynomial, Topology, 26, 395-407 (1987).
[20] F. Jaeger, D.L. Vertigan and D.J.A. Welsh; On the computational complexity of the Jones and Tutte polynomials, Math. Proc. Camb. Phil. Soc., 108, 35-53 (1990).
[21] D.L. Vertigan; On the computational complexity of Tutte, Homfly and Kauffman invariants - to appear.
[22] S. Toda; On the computational power of PP and ⊕P, Proc. 30th FOCS Symp., 514-519 (1989).
[23] C. Ernst and D.W. Sumners; The growth of the number of prime knots, Math. Proc. Camb. Phil. Soc., 102, 303-315 (1987).
[24] D.J.A. Welsh; On the number of knots and links, Colloq. Math. Soc. Janos Bolyai, 59, 1 4 (1991).
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 173-178 (1993) © 1993 Elsevier Science Publishers B.V. All rights reserved.
THE IMPACT OF F-POLYNOMIALS IN GRAPH THEORY Edward J. FARRELL Department of Mathematics, The University of the West Indies St. Augustine, TRINIDAD
Abstract Some of the interesting developments in the area of F-polynomials are highlighted. Connections with well-known graph polynomials are discussed. Also, possible areas for further investigations are identified.
1. Introduction
We consider only finite graphs which can have loops and multiple edges. Let G be such a graph. A cover of G is a spanning subgraph of G. Let F be a family of graphs. An F-cover of G is a cover of G in which every component is a member of F. Let us associate an indeterminate or weight $w_\alpha$ with every member $\alpha$ of F and the weight $w(C) = \prod_\alpha w_\alpha$ with every cover C of G, where the product is taken over all the elements $\alpha$ of C. Then the F-polynomial of G is $F(G; w) = \sum w(C)$, where the summation is taken over all the F-covers in G and w is a vector of indeterminates. This polynomial was introduced in Farrell [1]. The form for F(G;w) depends on the weight criterion used. We can assign to each element of F the same weight or, at the other extreme, each element of F can be assigned a weight depending on its isomorphism class. In the first case, the polynomial is called the simple F-polynomial of G. In the second case, it is called the isomorphism weighted F-polynomial. These polynomials result from extreme cases of weight assignments. Many different graph polynomials have been introduced over the last few years. Most of them are indeed F-polynomials. It is therefore desirable to develop a formal umbrella theory which will facilitate an in-depth study of all graph polynomials.
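As a concrete instance of the definition, the sketch below takes F to consist of a single node and a single edge, so that the F-covers are exactly the matchings of G together with the unmatched nodes. The weights w1 (node) and w2 (edge), the function name, and the representation of the polynomial as a `Counter` keyed by exponent pairs are illustrative assumptions made here, not notation from the paper.

```python
from collections import Counter
from itertools import combinations


def matching_f_polynomial(nodes, edges):
    """F-polynomial of a graph for F = {single node, single edge}.

    Each F-cover consists of k pairwise disjoint edges plus the n - 2k
    remaining isolated nodes, and contributes w1^(n - 2k) * w2^k.
    The result maps (a, b) to the coefficient of w1^a * w2^b.
    """
    poly = Counter()
    n = len(nodes)
    for k in range(len(edges) + 1):
        for chosen in combinations(edges, k):
            endpoints = [x for e in chosen for x in e]
            # Keep only edge sets that are pairwise disjoint and loop-free.
            if len(set(endpoints)) == 2 * k:
                poly[(n - 2 * k, k)] += 1
    return poly
```

For the path on three nodes, `matching_f_polynomial([0, 1, 2], [(0, 1), (1, 2)])` gives w1^3 + 2 w1 w2: one cover by three isolated nodes and two covers by an edge and a node. The enumeration is brute force and only meant for small graphs.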
2. The Hierarchy of Graph Polynomials
The coefficients of F-polynomials are the numbers of covers of the types defined by the monomials. Thus the coefficients count the number of certain spanning subgraphs of the graph and are therefore always non-negative. We refer to such an F-polynomial as a pure F-polynomial. We can compare F-polynomials according to the generality of their weight assignments. We therefore speak about lower and higher level F-polynomials. A higher level F-polynomial is obtained from a lower level F-polynomial by reducing the generality of the weights. For example, the circuit polynomial is a pure F-polynomial. Also, it is a lower level F-polynomial than the characteristic polynomial, which is obtained from it by replacing $w_1$ by x, $w_2$ by -1 and $w_r$ by -2, for r > 2 (Farrell [2]).
It has been shown [1] that many of the well-known graph polynomials are high level F-polynomials. These include the chromatic polynomial, the dichromatic polynomial and the characteristic polynomial. The classical rook polynomial is also a high level F-polynomial (Farrell [3]) since it is a special matching polynomial. It should be noted that it is possible for a high level F-polynomial to originate from different low level F-polynomials. For example, the chromatic polynomial originates both from the subgraph polynomial (weighted according to the number of nodes and edges in the components of the cover [1]) and also from the clique polynomials (Farrell [4]), i.e. the case in which F is the family of cliques. It is useful to be able to identify the pure F-polynomials corresponding to a given high level F-polynomial. Generally, the low level ‘parent’ F-polynomial provides a better combinatorial environment for investigating the corresponding high level polynomial. Therefore, the problem of identifying pure F-polynomials corresponding to high level polynomials is an important one. The impact of this line of investigation on graph theory will be quite significant. In what follows, we identify several areas of investigation which will have significant impact on the subject.
3. Analytical Properties
When we speak about differentiation and integration we mean formal differentiation and integration. Graph polynomials can be differentiated and integrated formally. As far as we know, the well-known high level F-polynomials have not been investigated in this regard. One reason for this is that it is difficult to understand the connection between the derivative (and integral) of a high level F-polynomial and the graph itself. For example: how are the coefficients of the derivative (or integral) of the chromatic polynomial of a graph related to the graph itself? The low level F-polynomials are useful for investigating analytical properties of their associated higher level polynomials. For example, the subgraph polynomial is the ‘parent’ polynomial of the chromatic polynomial. This relationship has provided a technique for obtaining a combinatorial interpretation for the derivative of the chromatic polynomial of a graph. This relationship is given in the following theorem, established in Farrell [5].
Theorem 1: Let P(G;h) be the chromatic polynomial of a graph G, with q edges. Then
$\frac{d}{dh} P(G;h) = \sum_{r=0}^{q} (-1)^r \sum_{H_r} P(G - H_r; h)$,
where $G - H_r$ is the graph obtained from G by removing all the nodes of a connected subgraph $H_r$ with r edges, and the summation is taken over all such subgraphs of G.
This theorem can be used to obtain new and interesting results about chromatic polynomials. The derivatives of the matching polynomial and the cycle (circuit) polynomial can also be used to obtain results about characteristic and rook polynomials respectively. Not much has been done on integration of the low level F-polynomials. Any results along these lines will be useful for finding combinatorial interpretations for the integrals of the associated higher level F-polynomials.
4. Graph Decompositions
F-polynomials have provided a new way of obtaining results on decompositions of graphs into spanning subgraphs. By definition, the coefficient of the monomial $\prod_{k=1}^{r} w_k$ is the number of covers consisting of elements of F with weights $w_1, w_2, \ldots$ and $w_r$. It follows that a lot of information about graph decompositions is stored in F-polynomials. The simple F-polynomial stores information about the number of decompositions of the graph into spanning subgraphs with a specified number of components. Many results on graph decomposition have been obtained by the use of F-polynomials (see Farrell [6], [7], [8], and [9]). The simple technique which is provided by the use of F-polynomials will continue to be useful for future investigations in this area of research.
5. Reconstruction of Graphs

Definitions:
(i) An F-polynomial of a graph G is reconstructible if it can be found from the F-polynomials of the node-deleted subgraphs, i.e., the graphs $G - v_i$ where $v_i$ is a node of G.
(ii) Let G be a graph. An F-polynomial characterizes G if and only if, for any graph H, F(H;w) = F(G;w) if and only if H ≅ G.
(iii) An F-polynomial is called characterizing if and only if it characterizes every graph.

Suppose that a graph G is characterized by a particular F-polynomial and suppose that the F-polynomial is reconstructible. Then G is reconstructible. It is therefore important to be able to establish the reconstruction of F-polynomials. The areas of graph characterization and polynomial reconstruction are worthy of further investigation. These areas will provide a wealth of information about reconstructible graphs. F-polynomials can be used to investigate the famous Reconstruction Conjecture. It is not difficult to deduce the following result.

Theorem 2: The Reconstruction Conjecture holds if there exists a reconstructible characterizing graph polynomial.

The above result essentially transfers Ulam’s Conjecture to an equivalent conjecture on F-polynomials. It is well known that none of the popular graph polynomials is a characterizing polynomial. The reconstruction of some graph polynomials has already been established (Farrell and Wahid [10] and Farrell and Grell [11]). However, there is still a lot to be done in this area. We expect that the impact of F-polynomials on this area of graph theory will be quite significant.

6. New Approaches to Old Problems
The study of F-polynomials will have a great influence on some of the classical problems in combinatorics and graph theory. For example, a connection has been established between rook theory and matching theory. The following result was established in [3]. It shows that the rook polynomial is an F-polynomial. (N.B. A connection between the rook polynomial and the acyclic polynomial, a special matching polynomial, was given earlier in Godsil and Gutman [12].)

Theorem 3: Let B be a chessboard and $G_B$ a bipartite graph associated with B as follows. If B has m rows and n columns, then $G_B$ has disjoint node sets labeled 1, 2, ..., m and 1, 2, ..., n respectively. Also, node i is joined to node j by an edge if and only if cell (i, j) belongs to B. Then the rook polynomial of B is equal to the matching polynomial of $G_B$, with $w_1$ replaced by 1 and $w_2$ replaced by x.
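Theorem 3 can be checked by brute force on small boards. The sketch below is only an illustration: the function names and the representation of a board as a set of (row, column) cells are assumptions made here, not notation from [3]. It counts non-attacking rook placements on B and matchings of $G_B$ and compares the two coefficient lists (the coefficient of $x^k$ in each polynomial).

```python
from itertools import combinations

def rook_numbers(cells):
    """counts[k] = number of ways to place k non-attacking rooks on the cells."""
    cells = sorted(cells)
    counts = [0] * (len(cells) + 1)
    for k in range(len(cells) + 1):
        for placement in combinations(cells, k):
            rows = {r for r, _ in placement}
            cols = {c for _, c in placement}
            if len(rows) == k and len(cols) == k:
                counts[k] += 1
    while len(counts) > 1 and counts[-1] == 0:  # drop trailing zeros
        counts.pop()
    return counts

def board_graph(cells):
    """Bipartite graph of Theorem 3: row node i is joined to column node j
    whenever cell (i, j) belongs to the board."""
    return [(("row", i), ("col", j)) for i, j in cells]

def matching_numbers(edges):
    """counts[k] = number of sets of k pairwise node-disjoint edges."""
    edges = sorted(edges)
    counts = [0] * (len(edges) + 1)
    for k in range(len(edges) + 1):
        for chosen in combinations(edges, k):
            nodes = {x for e in chosen for x in e}
            if len(nodes) == 2 * k:
                counts[k] += 1
    while len(counts) > 1 and counts[-1] == 0:
        counts.pop()
    return counts

# A small hypothetical board: cells (1,1), (1,2), (2,2), (3,3).
board = {(1, 1), (1, 2), (2, 2), (3, 3)}
print(rook_numbers(board), matching_numbers(board_graph(board)))
```

The enumeration is exponential in the number of cells, so it is only meant for hand-sized examples.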
This connection between the rook polynomial and the matching polynomial has led to a better way of tackling such thorny problems as equivalence of chessboards and discordant permutations. The advantage of using matching theory is that it is easier to work with a graph than with a chessboard. This facility, presented by matching theory, will have a significant effect on rook theory. It is now possible to study rook theory entirely by means of graphs. The potential effect of F-polynomials in this area of study is great. The following result is taken from Farrell and Whitehead [13]. The simple matching polynomial is the matching polynomial in which each element of a cover is given the same weight w.

Theorem 4: The chromatic polynomial of a graph G is obtained from the simple matching polynomial of $\bar{G}$ (the complement of G) by replacing $w^k$ by $(\lambda)_k = \lambda(\lambda - 1)(\lambda - 2)\cdots(\lambda - k + 1)$ and, dually, the matching polynomial of $\bar{G}$ can be obtained from the chromatic polynomial of G by replacing $(\lambda)_k$ by $w^k$, if and only if G is triangle-free (i.e., G has no subgraph which is a triangle).

The above theorem establishes a connection between matching theory and chromatic theory. Thus Theorems 3 and 4 can be used to form connections and interconnections between rook theory, matching theory, and chromatic theory. F-polynomials will also have a great impact on chromatic theory. It was shown in [1] that the subgraph polynomial is a ‘parent’ polynomial of the chromatic polynomial. Recently, it has been shown [4] that the clique polynomial is also a ‘parent’ of the chromatic polynomial. The following results give the connections.

Theorem 5: Let S(G;w) be the subgraph polynomial in which each component of a cover with r edges is assigned the weight $w_r$. Then the chromatic polynomial of G is obtained from the subgraph polynomial of G by replacing $w_r$ by $(-1)^r\lambda$.

Theorem 6: The chromatic polynomial of a graph G is obtained from the simple clique polynomial of its complement $\bar{G}$ by replacing $w^k$ by $(\lambda)_k$.

The connections between chromatic, clique and subgraph polynomials have provided a different approach to problems in chromatic theory. This new approach can lead to interesting new results in chromatic theory and provide easier proofs for some of the known results in this area.
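Theorems 5 and 6 can be compared numerically on a small graph. The following sketch is an illustration under stated assumptions: nodes are labeled 0, ..., n-1, all function names are hypothetical, and a cover by cliques of the complement is read as a partition of the nodes of G into independent sets. It evaluates the chromatic polynomial at an integer λ in both ways and compares the values with a direct count of proper colorings.

```python
from itertools import combinations, product

def components(n, edges):
    """Number of connected components of the graph on nodes 0..n-1."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(x) for x in range(n)})

def chromatic_from_subgraphs(n, edges, lam):
    """Theorem 5: a component with r edges gets weight (-1)^r * lam, so a
    spanning subgraph with k edges and c components contributes (-1)^k lam^c."""
    return sum((-1) ** k * lam ** components(n, chosen)
               for k in range(len(edges) + 1)
               for chosen in combinations(edges, k))

def independent_partitions(n, edges):
    """counts[k] = number of partitions of the nodes into k non-empty
    independent sets of G, i.e. covers of the complement by k cliques."""
    adjacent = {frozenset(e) for e in edges}
    counts = [0] * (n + 1)
    def place(v, blocks):
        if v == n:
            counts[len(blocks)] += 1
            return
        for block in list(blocks):
            if all(frozenset((u, v)) not in adjacent for u in block):
                block.append(v)
                place(v + 1, blocks)
                block.pop()
        blocks.append([v])
        place(v + 1, blocks)
        blocks.pop()
    place(0, [])
    return counts

def falling(lam, k):
    """Falling factorial (lam)_k = lam (lam - 1) ... (lam - k + 1)."""
    out = 1
    for i in range(k):
        out *= lam - i
    return out

def chromatic_from_cliques(n, edges, lam):
    """Theorem 6: replace w^k by (lam)_k in the simple clique polynomial."""
    return sum(a * falling(lam, k)
               for k, a in enumerate(independent_partitions(n, edges)))

def count_colorings(n, edges, lam):
    """Direct count of proper colorings with lam colors."""
    return sum(all(c[u] != c[v] for u, v in edges)
               for c in product(range(lam), repeat=n))

cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
print([chromatic_from_subgraphs(4, cycle4, lam) for lam in range(5)])
```

All three functions enumerate exhaustively, so the check is only practical for graphs with a handful of nodes and edges.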
7. Unifying Effect of F-Polynomials

F-polynomials have provided a unifying approach to the study of graph polynomials. Indeed, all the well-known graph polynomials are subgraph polynomials. Some of the lesser known polynomials, such as the frame polynomial of Balasubramanian and Parthasarathy [14] and the subgraph enumerating polynomial of Borzacchini [15], have also been shown to be subgraph polynomials. The connections were established in Farrell [16]. The unifying effect of F-polynomials is discussed in Farrell [17]. Perhaps the most important feature of F-polynomials is their potential for providing a single unified theory of graph polynomials. Such a theory will be useful for establishing global results which can then be applied to the various specific F-polynomials to yield specific local results.

8. Discussion
The study of F-polynomials is relatively new. There are many aspects of the polynomials which have not as yet been examined. One such aspect is the weight function. The weight of an element a of the family F is an indeterminate associated with a. This leaves a lot of room for variations in the assignment of weights. We have assigned special weights to different families in order to obtain various well-known higher level graph polynomials. Perhaps different assignments of weights might lead to other interesting polynomials. One area which is certain to be affected by different weight assignments is that of characterization of graphs. The ability of a polynomial to characterize a graph depends on the weighting criterion. One might conceivably begin with a graph G and a family F and, by a judicious choice of a weight criterion, find an F-polynomial which characterizes G. Such an approach to graph characterization will certainly advance this area of research. Many interesting articles on graph polynomials have recently been published. Among them are the work of Hosoya and Balasubramanian [18], [19] and reviews by Trinajstić [20] and Balasubramanian [21] on characteristic polynomials. It is expected that more and more attention will be focused on F-polynomials. This area of research is likely to become a very important area in graph theory.
References

[1] E.J. Farrell; On a general class of graph polynomials, J. Comb. Theory B, 26, 111-122 (1979).
[2] E.J. Farrell; On a class of polynomials obtained from the circuit in a graph and its application to characteristic polynomials of graphs, Discrete Math., 25, 121-133 (1979).
[3] E.J. Farrell; On the matching polynomial and its relation to the rook polynomial, J. Franklin Institute, 325, 527-536 (1988).
[4] E.J. Farrell; A note on the clique polynomial and its relation to other graph polynomials - to appear.
[5] E.J. Farrell; On the derivative of the chromatic polynomial, Sixth Caribbean Conference on Combinatorics and Computing, Trinidad, January 1991.
[6] E.J. Farrell; Forest decompositions of graphs with cyclomatic number 3, Internat. J. Math. & Math. Sci., 6, 535-543 (1983).
[7] E.J. Farrell; Decompositions of complete graphs and complete bipartite graphs into node-disjoint paths, Pure and Applied Math. Sci., 17, 7-14 (1983).
[8] E.J. Farrell; Decompositions of graphs with cyclomatic number 2 into node-disjoint paths, Indian J. Math., 26, 1-13 (1984).
[9] E.J. Farrell; A note on cycle decompositions of complete bipartite graphs, Caribb. J. Math., 3, 9-15 (1985).
[10] E.J. Farrell and S.A. Wahid; On the reconstruction of the matching polynomial and the reconstruction conjecture, Internat. J. Math. & Math. Sci., 10, 155-162 (1987).
[11] E.J. Farrell and J.C. Grell; Some analytical properties of the circuit polynomial of a graph, Caribb. J. Math., 2, 69-76 (1984).
[12] C. Godsil and I. Gutman; On the theory of the matching polynomial, J. Graph Theory, 5, 137-144 (1981).
[13] E.J. Farrell and E.G. Whitehead; Connections between the matching and chromatic polynomials, Internat. J. Math. & Math. Sci. - to appear.
[14] K. Balasubramanian and K.R. Parthasarathy; In search of a complete invariant for graphs, Proc. of the Symposium in Combinatorics and Graph Theory, I.S.I. Calcutta, 42-59 (1980).
[15] L. Borzacchini and C. Pulito; On subgraph enumerating polynomials and Tutte polynomials, Bollettino U.M.I. V, 6, 589-597 (1982).
[16] E.J. Farrell; The subgraph polynomial and its relation to other graph polynomials, Caribb. J. Math., 2, 39-53 (1984).
[17] E.J. Farrell; A survey of the unifying effects of F-polynomials in combinatorics and graph theory, Proc. Fourth Yugoslav Seminar in Graph Theory, Novi Sad, 159-168 (1983).
[18] H. Hosoya and K. Balasubramanian; Exact dimer statistics and characteristic polynomials of cacti lattices, Theor. Chim. Acta, 76, 315-329 (1989).
[19] H. Hosoya and K. Balasubramanian; Computational algorithms for matching polynomials of graphs from the characteristic polynomials of edge-weighted graphs, Journal of Computational Chemistry, 10, 698-710 (1989).
[20] N. Trinajstić; The characteristic polynomial of a chemical graph, J. Math. Chem., 2, 197 (1988).
[21] K. Balasubramanian; Applications of combinatorics and graph theory to spectroscopy and quantum chemistry, Chem. Rev., 85, 599-618 (1985).
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 179-182 (1993)
© 1993 Elsevier Science Publishers B.V. All rights reserved.

A NOTE ON WELL-COVERED GRAPHS

Václav CHVÁTAL Computer Science Department, Rutgers University New Brunswick, New Jersey, U.S.A.

Peter J. SLATER Mathematical Sciences Department, University of Alabama Huntsville, Alabama, U.S.A.

Abstract It is shown that determining if a graph G is not well-covered is an NP-complete problem.
Each of the authors independently proved the result presented in this note, and the strong encouragement (if not insistence) of Mike Plummer motivated the writeup.

For a graph G = (V, E) we let $\beta_0(G)$ denote the maximum cardinality of an independent vertex subset of V. In 1970 Plummer [1] defined a graph to be well-covered if every maximal independent set has cardinality $\beta_0(G)$. That is, G is well-covered if any two maximal independent sets have the same cardinality. Equivalently, G is well-covered if every minimal vertex cover is a minimum vertex cover. Papers with results on well-covered graphs include references [2]-[9]. We will show that the following problem is NP-complete.

Not Well-Covered
Instance: Graph G = (V, E).
Question: Is G not a well-covered graph?

To show that determining if G is not well-covered is NP-complete, it will be shown that there is a polynomial time reduction from 3-satisfiability.

3-Satisfiability
Instance: Collection $C = \{c_1, c_2, \ldots, c_m\}$ of clauses on a finite set $U = \{u_1, u_2, \ldots, u_n\}$ of variables such that $|c_i| = 3$ for $1 \le i \le m$.
Question: Is there a truth assignment for U that satisfies all the clauses in C?

Theorem: Not Well-Covered is an NP-complete problem.
Proof: One can easily verify the independence and maximality of two proposed vertex sets of different cardinalities, showing that the problem is in NP. Given an instance of 3-satisfiability, construct a graph G as follows. For each $u_i \in U$ form a $K_2$ on vertices $u_i$ and $\bar{u}_i$; form a $K_m$ on a vertex set $\{v_1, v_2, \ldots, v_m\}$; if $c_j = (v_{j,1} \vee v_{j,2} \vee v_{j,3})$ (where each $v_{j,k}$ is some $u_i$ or $\bar{u}_i$), make $v_j$ adjacent to $v_{j,1}$, $v_{j,2}$, and $v_{j,3}$. Graph G has 2n + m vertices and n + 3m + m(m - 1)/2 edges and can be constructed from C in time polynomial in the length of C.
To complete the proof we verify the following claim.

Claim: The collection C has a satisfying truth assignment $t : \{u_1, u_2, \ldots, u_n\} \to \{T, F\}$ if and only if G is not well-covered.

Proof:
First note that $\beta_0(G) \le n + 1$ because n + 1 cliques cover V(G) (namely, a $K_m$ and n $K_2$'s). To see that $\beta_0(G) = n + 1$, let $S = \{v_1, \bar{v}_{1,1}, \bar{v}_{1,2}, \bar{v}_{1,3}\} \cup S^*$, where $S^*$ contains exactly one of $u_i$ and $\bar{u}_i$ when $\{u_i, \bar{u}_i\} \cap \{v_{1,1}, v_{1,2}, v_{1,3}\} = \emptyset$. For example, if $c_1 = (u_1 \vee \bar{u}_2 \vee u_3)$ consider $S = \{v_1, \bar{u}_1, u_2, \bar{u}_3, u_4, u_5, \ldots, u_n\}$. Because S is an independent set with |S| = n + 1, we have $\beta_0(G) = n + 1$.

Assume that there is a satisfying truth assignment $t : \{u_1, u_2, \ldots, u_n\} \to \{T, F\}$. Then, let $R = \{u_i \mid t(u_i) = T\} \cup \{\bar{u}_i \mid t(u_i) = F\}$. Because every $c_j$ contains a true literal, each $v_j$ is adjacent to a vertex in R. It follows that R is a maximal independent set with $|R| = n < \beta_0(G)$, so G is not well-covered.

Assume there is no satisfying truth assignment, and let R be a maximal independent set. At most one $v_j$ is in R, and it easily follows that for $1 \le i \le n$ we have exactly one of $u_i$ and $\bar{u}_i$ in R. Now the truth assignment defined on $\{u_1, \bar{u}_1, u_2, \bar{u}_2, \ldots, u_n, \bar{u}_n\}$ by letting a literal be true if and only if it is in R leaves at least one clause $c_j$ unsatisfied. R being maximally independent implies that one of the $v_j$'s corresponding to an unsatisfied clause $c_j$ is in R. Thus |R| = n + 1, and G is well-covered.

Addendum. A recent manuscript of Sankaranarayana and Stewart [10] also contains a proof of this theorem.
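The reduction and the claim can be tried out on very small instances. The sketch below is illustrative only: the encoding of literals as signed integers (+i for $u_i$, -i for its negation) and all function names are assumptions made for this example. It builds G from a clause list, lists the sizes of all maximal independent sets by exhaustive search, and compares well-coveredness with satisfiability.

```python
from itertools import combinations, product

def build_graph(n, clauses):
    """Graph of the reduction: a K2 per variable, a clique on the clause
    vertices, and each clause vertex joined to its three literals."""
    vertices = [("lit", s * i) for i in range(1, n + 1) for s in (1, -1)]
    vertices += [("cl", j) for j in range(len(clauses))]
    edges = {frozenset({("lit", i), ("lit", -i)}) for i in range(1, n + 1)}
    for j, k in combinations(range(len(clauses)), 2):
        edges.add(frozenset({("cl", j), ("cl", k)}))
    for j, clause in enumerate(clauses):
        for lit in clause:
            edges.add(frozenset({("cl", j), ("lit", lit)}))
    return vertices, edges

def maximal_independent_set_sizes(vertices, edges):
    """Sizes of all maximal independent sets (exhaustive, exponential)."""
    sizes = set()
    for mask in range(1 << len(vertices)):
        chosen = [v for i, v in enumerate(vertices) if mask >> i & 1]
        if any(frozenset({a, b}) in edges for a, b in combinations(chosen, 2)):
            continue  # not independent
        inside = set(chosen)
        # maximal: every vertex outside has a neighbor inside
        if all(any(frozenset({v, c}) in edges for c in chosen)
               for v in vertices if v not in inside):
            sizes.add(len(chosen))
    return sizes

def is_well_covered(vertices, edges):
    return len(maximal_independent_set_sizes(vertices, edges)) == 1

def satisfiable(n, clauses):
    """Brute-force satisfiability test over all 2^n assignments."""
    return any(all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses)
               for bits in product([False, True], repeat=n))

# One clause (u1 or not-u2 or u3): satisfiable, so G is not well-covered.
sat_clauses = [(1, -2, 3)]
# All eight sign patterns on three variables: unsatisfiable.
unsat_clauses = [(a, 2 * b, 3 * c) for a, b, c in product((1, -1), repeat=3)]
for clauses in (sat_clauses, unsat_clauses):
    graph = build_graph(3, clauses)
    print(satisfiable(3, clauses), is_well_covered(*graph))
```

The exhaustive search over vertex subsets is used here only because the instances are tiny; it says nothing about how the problem should be solved in general, which is the point of the theorem.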
References

[1] M.D. Plummer; Some covering concepts in graphs, J. Combinatorial Theory, 8, 91-98 (1970).
[2] C. Berge; Some common properties for regularizable graphs, edge-critical graphs and B-graphs, Tohoku Univ. Tsuken Symp. on Graph Theory and Algorithms, 108-123 (1980).
[3] O. Favaron; Very well covered graphs, Discrete Math., 42, 177-187 (1982).
[4] A. Finbow and B. Hartnell; A game related to covering by stars, Ars Combinatoria, 16A, 189-198 (1983).
[5] A. Finbow, B. Hartnell, and R. Nowakowski; A characterisation of well-covered graphs of girth 5 or greater - submitted.
[6] M. Lewin; Matching-perfect and cover-perfect graphs, Israel J. Math., 18, 345-347 (1974).
[7] G. Ravindra; Well covered graphs, J. Combin. Inform. System Sci., 2, 20-21 (197).
[8] J.A. Staples; On some subclasses of well-covered graphs, J. Graph Theory, 3, 197-204 (1979).
[9] J.A. Staples; On Some Subclasses of Well-Covered Graphs, Ph.D. dissertation, Vanderbilt University (1975).
[10] R.S. Sankaranarayana and L.K. Stewart; Complexity results for well-covered graphs - manuscript.
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 183-190 (1993) © 1993 Elsevier Science Publishers B.V. All rights reserved.

CYCLE COVERS AND CYCLE DECOMPOSITIONS OF GRAPHS Cun-Quan ZHANG Department of Mathematics, West Virginia University Morgantown, West Virginia, U.S.A.
Abstract In this paper, some conjectures and recent progress on cycle cover and cycle decomposition problems are surveyed.
We follow the terminology and notation of [1]. All graphs we will consider in this paper are 2-edge-connected. Note that the cycles in this paper are closed simple paths.

1. Cycle Cover Property of Graphs
The set of all 3-connected graphs is denoted by $Z_{3c}$. A weight $w : E(G) \to \{1, 2\}$ is called Eulerian if the total weight of each edge-cut is even. A graph G with an associated weight w is denoted by (G, w). The set of all (1,2)-Eulerian weights of G is denoted by $W_G$. If (G, w) has a family F of cycles such that each edge e of G is contained in precisely w(e) cycles of F, then the family F is called a cycle w-cover and G is cycle w-coverable (see [2], [3] and [4]). A graph G is said to have the cycle cover property if G is cycle w-coverable for every $w \in W_G$. Denote the set of all such 3-connected graphs by $Z_{CCP}$.

1.1 Outside $Z_{CCP}$. (Or Which Set Contains $Z_{CCP}$?)
Let P be the Petersen graph and M be a 1-factor of P. Assign a weight $w_M$ on E(P) such that each edge in M has weight 2 and each edge not in M has weight 1. It is not very hard to see that P has no $w_M$-cycle cover. Thus $Z_{CCP}$ is a proper subset of $Z_{3c}$. Let $w_2 \in W_G$ be such that $w_2(e) = 2$ for every edge e of G. The set of all 3-connected graphs which are $w_2$-cycle coverable is denoted by $Z_{CDC}$. It is a well-known conjecture (Cycle Double Cover Conjecture, see the surveys [5] and [6]) that $Z_{CDC} = Z_{3c}$. This conjecture is still open. Note that the Petersen graph is $w_2$-cycle coverable and therefore $Z_{CCP}$ is a proper subset of $Z_{CDC}$.

A postman's tour of a graph is a closed walk that includes every edge at least once. The Chinese Postman's Problem (abbreviated to CPP) is to find a shortest postman's tour of a given graph (see [7] and [8]). The Shortest Cycle Cover Problem (abbreviated to SCC) is to find a family F of cycles of G such that each edge of G is contained in at least one cycle of F and the total length of all cycles in F is as short as possible (see [8]-[16]). It is obvious that for any graph the optimum solution of the Chinese Postman's Problem cannot be greater than the optimum solution of the Shortest Cycle Cover Problem. These two optima need not be equal; see, for example, the Petersen graph [13]. We say that the Chinese Postman's Problem and the Shortest Cycle Cover Problem are equivalent for a graph G if the optimum solution of the Chinese Postman's Problem equals the optimum solution of the Shortest Cycle Cover Problem. Finding the relations between the Shortest Cycle Cover Problem and the Chinese Postman's Problem is a very interesting problem, since the Chinese Postman's Problem can be solved by a polynomial algorithm [7] while the Shortest Cycle Cover Problem might not be. The set of all 3-connected graphs for which the Chinese Postman's Problem and the Shortest Cycle Cover Problem are equivalent is denoted by $Z_{CPP=SCC}$. It was proved by Guan and Fleischner [8] that the Chinese Postman's Problem and the Shortest Cycle Cover Problem are equivalent for all 2-connected planar graphs. This result was recently generalized by Alspach, Goddyn and Zhang [4] to all 2-connected graphs containing no subdivision of the Petersen graph and by Jackson [14] and Zhang [16] to all graphs admitting nowhere-zero 4-flows. The following lemmas are just exercises.

Lemma 1.1: If G is cycle w-coverable for some $w \in W_G$, then $G \in Z_{CDC}$.

Lemma 1.2: $G \in Z_{CPP=SCC}$ if and only if G is cycle w-coverable for some minimum element w of $W_G$.
Thus we have,

Proposition 1.3: $Z_{CCP} \subseteq Z_{CPP=SCC} \subseteq Z_{CDC} \subseteq Z_{3c}$.

The relations between $Z_{CCP}$, $Z_{CPP=SCC}$, $Z_{CDC}$ and $Z_{3c}$ are conjectured in the following problems.

Conjecture 1.4: (Cycle Double Cover Conjecture, Szekeres [17], Seymour [2]) $Z_{CDC} = Z_{3c}$.

Conjecture 1.5: (Zhang [16]) $Z_{CCP} = Z_{CPP=SCC}$.

(That is, G is cycle w-coverable for every w in $W_G$ if and only if G is $w_0$-cycle coverable for some minimum element $w_0$ of $W_G$.) Note that the conjecture is not true if the condition of 3-edge-connectivity is dropped. In the following figures we give a 2-connected graph for which the Chinese Postman's Problem and the Shortest Cycle Cover Problem are equivalent but which does not have the cycle cover property. (This is why all graphs considered in §1 are 3-connected.)
Figure 1:
In figure 1, each dashed edge
Figure 2
1.2 Inside $Z_{CCP}$. (Or Which Set is Contained in $Z_{CCP}$ and Which Graph has the Cycle Cover Property?)

Denote the set of all 3-connected graphs containing no subdivision of the Petersen graph by $Z_{NoP}$.

Theorem 1.6: (Alspach, Goddyn and Zhang [3], [4]) $Z_{NoP} \subseteq Z_{CCP}$.

Note that $Z_{NoP}$ is a proper subset of $Z_{CCP}$ since some graph containing a subdivision of the Petersen graph is still cycle w-coverable for every w of $W_G$. Let G be a 2-edge-connected graph. A nowhere-zero 4-flow (O, f) of G is an orientation O of E(G) and a weight $f : E(G) \to \{1, 2, 3\}$ such that for each vertex v of G

$$\sum \{ f(vx) : \text{for each oriented edge } vx \text{ starting at } v \} = \sum \{ f(xv) : \text{for each oriented edge } xv \text{ ending at } v \}.$$

(For more properties of integer flows, refer to [18] or [6].) Note that for cubic graphs, admitting a nowhere-zero 4-flow is equivalent to being 3-edge-colorable. The set of all 3-connected graphs admitting a nowhere-zero 4-flow is denoted by $Z_{4-f}$.
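The flow condition can be checked mechanically. The following sketch is an illustration only: the function names and the representation of an oriented edge as a (tail, head) pair are assumptions made here. It verifies the conservation condition at every vertex and, for very small graphs, searches all orientations and all values in {1, 2, 3} by exhaustion.

```python
from itertools import combinations, product

def is_nowhere_zero_flow(arcs, flow, k=4):
    """arcs: list of oriented edges (tail, head); flow: one value per arc.
    Checks that every value lies in 1..k-1 and that, at each vertex, the
    total flow leaving equals the total flow entering."""
    if any(not (1 <= f < k) for f in flow):
        return False
    balance = {}
    for (tail, head), f in zip(arcs, flow):
        balance[tail] = balance.get(tail, 0) + f
        balance[head] = balance.get(head, 0) - f
    return all(b == 0 for b in balance.values())

def find_nowhere_zero_flow(edges, k=4):
    """Exhaustive search: a signed value s on edge (u, v) stands for flow |s|
    oriented from u to v when s > 0 and from v to u when s < 0."""
    values = [s for s in range(-(k - 1), k) if s != 0]
    for signed in product(values, repeat=len(edges)):
        arcs = [(u, v) if s > 0 else (v, u) for (u, v), s in zip(edges, signed)]
        flow = [abs(s) for s in signed]
        if is_nowhere_zero_flow(arcs, flow, k):
            return arcs, flow
    return None

k4_edges = list(combinations(range(4), 2))   # complete graph on 4 vertices
print(find_nowhere_zero_flow(k4_edges))      # some (orientation, weight) pair
print(find_nowhere_zero_flow([(0, 1)]))      # a single edge is a bridge: None
```

The search visits up to $6^{|E|}$ assignments, so it is feasible for the complete graph on four vertices but not, for instance, for the fifteen edges of the Petersen graph.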
Theorem 1.7: (Zhang [16]) $Z_{4-f} \subseteq Z_{CCP}$.
Whether the inclusion $Z_{NoP} \subseteq Z_{4-f}$ holds is still unknown:

Conjecture 1.8: (Tutte [19]) $Z_{NoP} \subseteq Z_{4-f}$.
Since the Petersen graph is not planar and since a nowhere-zero 4-flow is equivalent to a 3-edge-coloring for a cubic graph, Tutte's Conjecture is stronger than the 4-Color Theorem. The following conjecture is a refinement of Tutte's Conjecture and is motivated by Theorem 1.6.

Conjecture 1.9: (Zhang [16]) $Z_{CCP} = Z_{4-f}$.
2. Small Cycle Double Covers

If F is a family of cycles of G such that each edge of G is contained in precisely two cycles of F, then F is called a cycle double cover of G. Recall the following well-known conjecture.

Conjecture 1.4: (Cycle Double Cover Conjecture) Every 2-edge-connected graph has a cycle double cover.

Similar to a conjecture of Hajós (see [20]) that a simple Eulerian graph of order n has a cycle decomposition into at most $\lfloor \frac{1}{2}(n - 1) \rfloor$ cycles, a refinement of the cycle double cover conjecture was proposed by Bondy.

Conjecture 2.1: (Small Cycle Double Cover Conjecture (SCDC conjecture), Bondy [20]) Every 2-edge-connected simple graph G of order n has a cycle double cover with fewer than n cycles.

Except for the following result due to Seyffarth, there has not been much progress on this conjecture yet.

Theorem 2.2: (Seyffarth [21]) The Small Cycle Double Cover Conjecture holds for all triangulated planar graphs and all Hamiltonian planar graphs.

For cubic graphs the upper bounds on the size of a smallest cycle double cover are expected to be much lower.

Conjecture 2.3: (Small Cycle Double Cover Conjecture For Cubic Graphs, Bondy [20]) Every 2-edge-connected simple cubic graph G of order $n \ge 6$ has a cycle double cover with at most n/2 cycles.

Certainly one cannot expect these two conjectures to be solved before the Cycle Double Cover Conjecture. However, with the assumption of the Cycle Double Cover Conjecture for a given cubic graph, Conjecture 2.3 is solved in a sense.

Theorem 2.4: (Lai, Yu and Zhang [22]) If a simple cubic graph of order $n \ge 6$ has a cycle double cover, then some cycle double cover contains at most n/2 cycles.

Actually, a stronger result is known for cubic graphs in which parallel edges are allowed.

Theorem 2.5: (Lai, Yu and Zhang [22]) If a cubic graph of order n has a cycle double cover, then it has a cycle double cover with at most n/2 + 2 cycles.

3. Cycle Decompositions of Eulerian Graphs

3.1 Compatible cycle decomposition
Let v be a vertex of a given graph G and $\mathcal{P}(v)$ be a partition of the set of edges incident with v. An element of $\mathcal{P}(v)$ is called a forbidden part at v. The set $\mathcal{P} = \bigcup_{v \in V(G)} \mathcal{P}(v)$ is called a set of forbidden parts of G. A graph G with an associated set of forbidden parts $\mathcal{P}$ is denoted by $(G, \mathcal{P})$. A cycle decomposition $\mathcal{C}$ of E(G) is compatible with a set $\mathcal{P}$ of forbidden parts if $|E(C) \cap P| \le 1$ for every $C \in \mathcal{C}$ and every $P \in \mathcal{P}$.

A cut T of G is called a bad cut of $(G, \mathcal{P})$ if $2|P \cap T| > |T|$ for some forbidden part P of $\mathcal{P}$. It is obvious that a necessary condition for $(G, \mathcal{P})$ to have a compatible cycle decomposition is that $(G, \mathcal{P})$ has no bad cut. Let G be a 2-connected graph. Construct an Eulerian graph $G''$ by replacing each edge of G by a pair of parallel edges. For each vertex v, let $\mathcal{P}(v)$ be the set of all pairs of parallel edges incident with v. It is obvious that G has a cycle double cover if and only if $G''$ has a cycle decomposition compatible with $\mathcal{P} = \bigcup_{v \in V(G)} \mathcal{P}(v)$.
The following theorem, a generalization of a prior result in [23], was proved by Fleischner and Frank.

Theorem 3.1: (Fleischner and Frank [24]) Let G be a planar Eulerian graph and let $\mathcal{P}$ be a set of forbidden parts without a bad cut. Then $(G, \mathcal{P})$ has a compatible cycle decomposition.

As we mentioned above, having no bad cuts is a necessary condition for having a compatible cycle decomposition. However, it is not sufficient, as the following example shows: Let $K_5$ be the complete graph with 5 vertices $\{v_0, v_1, \ldots, v_4\}$ and $\mathcal{P}^* = \{\text{2-path } v_i v_j v_k : \text{either } k = j + 1 \text{ and } i = j - 1, \text{ or } k = j + 2 \text{ and } i = j - 2 \pmod 5\}$. According to Kuratowski's Theorem, $K_5$ and $K_{3,3}$ are the only two forbidden minors for planar graphs. However, $K_{3,3}$ is not an exception for the problem of compatible cycle decomposition. A natural question is whether a graph containing no $K_5$ minor has a compatible cycle decomposition for any set of forbidden parts without a bad cut. This question is answered by the following theorem:

Theorem 3.2: (Zhang [25]) Let G be an Eulerian graph containing no subgraph contractible to $K_5$ and let $\mathcal{P}$ be a set of forbidden parts of G without a bad cut. Then $(G, \mathcal{P})$ has a compatible cycle decomposition.

Let $L = e_1 e_2 \cdots e_r$ be an Eulerian tour of an Eulerian graph G. The set $\mathcal{P}_L = \{e_i e_{i+1} \mid i = 1, \ldots, r \pmod r\}$ is called the set of forbidden parts induced by L. The following well-known conjecture is due to Sabidussi.

Conjecture 3.3: (Sabidussi [26]) If L is an Eulerian tour of a graph G, then $(G, \mathcal{P}_L)$ has a compatible cycle decomposition.

By Theorem 3.2, this conjecture is solved for all graphs containing no $K_5$ minor.
3.2 Even cycle decomposition

Let G be an Eulerian graph. A cycle decomposition F of E(G) is even if each cycle of F has even length. An odd block of a graph is a block containing an odd number of edges. It is obvious that if an Eulerian graph has an odd block then it cannot have an even cycle decomposition. However, this necessary condition for even cycle decomposition is not sufficient since $K_5$ does not have an even cycle decomposition (note that it has 10 edges and every even cycle has length 4). The following well-known result was proved by Seymour.

Theorem 3.4: (Seymour [27], or see [24]) Let G be a planar Eulerian graph containing no odd block. Then G has an even cycle decomposition.

Theorem 3.2 can be applied to generalize Theorem 3.4.

Theorem 3.5: (Zhang [28]) Let G be an Eulerian graph containing no subgraph contractible to $K_5$ and containing no odd block. Then G has an even cycle decomposition.

Conjecture 3.6: Let G ($G \ne K_5$) be a 3-connected Eulerian graph containing an even number of edges. Then G has an even cycle decomposition.

Note that the connectivity cannot be reduced in the above conjecture since some 2-connected (or 2-edge-connected) Eulerian graphs ($\ne K_5$) containing an even number of edges may not have an even cycle decomposition (see figure 3 and figure 4).
Figure 3:
Figure 4:
Acknowledgement

This research was partially supported by the National Science Foundation under the grant DMS8906973.
References

[1] J.A. Bondy and U.S.R. Murty; Graph Theory with Applications, Macmillan, London and Elsevier, New York.
[2] P.D. Seymour; Sum of circuits, in Graph Theory and Related Topics, J.A. Bondy and U.S.R. Murty (editors), Academic Press, New York, 341-355 (1978).
[3] B. Alspach and C.Q. Zhang; Cycle coverings of cubic multigraphs, Discrete Mathematics (to appear).
[4] B. Alspach, L. Goddyn and C.Q. Zhang; Graphs with the circuit cover property, Transactions of the American Mathematical Society (submitted).
[5] F. Jaeger; A survey of the cycle double cover conjecture, Annals of Discrete Mathematics, 27, 1-12 (1985).
[6] F. Jaeger; Nowhere-zero flow problems, Selected Topics in Graph Theory 3, L.W. Beineke and R.J. Wilson (editors), Academic Press, New York, 71-95 (1988).
[7] J. Edmonds and J. Johnson; Matching, Euler tours and the Chinese postman, Mathematical Programming, 5, 88-124 (1973).
[8] M. Guan and H. Fleischner; On the minimum weighted cycle covering problem for planar graphs, Ars Combinatoria, 20, 61-68 (1985).
[9] N. Alon and M. Tarsi; Covering multigraphs by simple circuits, SIAM J. Alg. Dis. Math., 6, 345-350 (1985).
[10] J.C. Bermond, B. Jackson and F. Jaeger; Shortest covering of graphs with cycles, J. Combinatorial Theory, B, 35, 297-308 (1983).
[11] P. Fraisse; Cycle covering in bridgeless graphs, J. Combinatorial Theory, B, 39, 146-152 (1985).
[12] G. Fan; Covering weighted graphs by even subgraphs, J. Combinatorial Theory, B (to appear).
[13] A. Itai and M. Rodeh; Covering a graph by circuits, Automata, Languages and Programming, Lecture Notes in Computer Science 62, Springer-Verlag, Berlin, 289-299 (1978).
[14] B. Jackson; Shortest circuit covers and postman tours in graphs with a nowhere zero 4-flow, SIAM J. Discrete Math. (to appear).
[15] U. Jamshy and M. Tarsi; Short cycle covers and the cycle double cover conjecture, J. Combinatorial Theory, B (to appear).
[16] C.-Q. Zhang; Minimum cycle coverings and integer flows, J. Graph Theory, Vol. 14, No. 5, 537-546 (1990).
[17] G. Szekeres; Polyhedral decompositions of cubic graphs, J. Austral. Math. Soc., 8, 367-387 (1973).
[18] D.H. Younger; Integer flows, J. Graph Theory, 7, 349-357 (1983).
[19] W.T. Tutte; On the embedding of linear graphs in surfaces, Proc. London Math. Soc., Ser. 2, 51, 474-489 (1949).
[20] J.A. Bondy; Small cycle double covers of graphs, Cycles and Rays, G. Hahn, G. Sabidussi and R. Woodrow (editors), Kluwer Academic Publishers, 21-40 (1990).
[21] K. Seyffarth; Cycle and Path Covers of Graphs, Ph.D. Thesis, University of Waterloo (1989).
[22] H.J. Lai, X. Yu and C.Q. Zhang; Small cycle double covering of cubic graphs, J. Combinatorial Theory, B (submitted).
[23] H. Fleischner; Eulersche Linien und Kreisüberdeckungen, die vorgegebene Durchgänge in den Kanten vermeiden, J. Combinatorial Theory, B, 29, 145-167 (1980).
[24] H. Fleischner and A. Frank; On cycle decomposition of Eulerian graph, J. Combinatorial Theory, B, 50, 245-253 (1990).
[25] C.-Q. Zhang; On compatible cycle decomposition of Eulerian graph - preprint.
[26] H. Fleischner; Eulerian graphs, Selected Topics in Graph Theory 2, L.W. Beineke and R.J. Wilson (editors), Academic Press, New York, 17-53 (1983).
[27] P.D. Seymour; Even circuits in planar graphs, J. Combinatorial Theory, B, 31, 178-192 (1981).
[28] C.-Q. Zhang; On even cycle decomposition of Eulerian graphs - preprint.
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 191-200 (1993) © 1993 Elsevier Science Publishers B.V. All rights reserved.
MATCHING EXTENSIONS AND PRODUCTS OF GRAPHS Jiping LIU and Qinglin YU Department of Mathematics and Statistics, Simon Fraser University Burnaby, British Columbia, CANADA
Abstract An m-matching is a set of m independent edges of a graph. For a graph G, let M be an m-matching of G and $U = \{u_1, u_2, \ldots, u_n\}$ be a set of n distinct vertices of G such that $u_i$ ($1 \le i \le n$) is not incident with any edge of M. If there exists a perfect matching $M^*$ of G such that $M \subseteq M^*$ and $u_i u_j \notin M^*$ for any $u_i, u_j \in U$, we call $M^*$ a matching extension of (M, U) or an (m,n)-extension. The graph G is called (m,n)-extendable if it has a perfect matching and there exists a matching extension of (M, U) for any M and U as given above. In this paper, we study the properties of (m,n)-extendable graphs and the relationship between matching extensions and products of graphs.
1. Introduction
All graphs in this paper are finite and have no loops or multiple edges. A perfect matching, or 1-factor, of a graph $G$ is a set of independent edges which together cover all the vertices of $G$. A graph $G$ is $n$-extendable if it contains a set of $n$ independent edges and every set of $n$ independent edges can be extended to a perfect matching. The concept of $n$-extendability was first introduced by Plummer [1]. He studied the properties of $n$-extendable graphs and also the relationships between $n$-extendability and other graph parameters (for example, degree, connectivity, genus, etc.).
In order to study the number of different perfect matchings in a graph and the structure of graphs with perfect matchings, Lovász [2] introduced a family of graphs called bicritical graphs (a graph $G$ is bicritical if $G - \{x, y\}$ has a perfect matching for any distinct vertices $x$ and $y$ of $V(G)$) and later he strengthened this notion to that of a brick (a brick is a 3-connected bicritical graph). Using bricks, Lovász [2] developed a decomposition theory, brick decomposition, to clearly describe the structure of 1-extendable graphs. This decomposition has also turned out to be very useful in the study of matching polyhedra (Lovász [3]). Plummer [1] proved that every 2-extendable graph is either bipartite or a brick. Motivated by this result, much attention today continues to focus on the properties of $n$-extendable graphs.
In this paper, we generalize $n$-extendability to what we call $(m,n)$-extendability and study the properties of $(m,n)$-extendable graphs. An $m$-matching is a set of $m$ independent edges of a graph. For a graph $G$, let $M$ be an $m$-matching of $G$ and $U = \{u_1, u_2, \ldots, u_n\}$ be a set of $n$ distinct vertices of $G$ such that no $u_i$ ($1 \le i \le n$) lies on any edge of $M$. If there exists a perfect matching $M^*$ of $G$ such that $M \subseteq M^*$ and $u_iu_j \notin M^*$ for any $u_i, u_j \in U$, we call $M^*$ a matching extension of $(M, U)$ or an $(m,n)$-extension. A graph $G$ is called $(m,n)$-extendable if it has a perfect matching and there exists a matching extension of $(M, U)$ for any given $M$ and $U$ as described above. Notice that $(m,0)$-extendability is just $m$-extendability. To illustrate, the graph given in Figure 1 is $(2,1)$-extendable but not $(2,2)$-extendable.
Most of the results in this paper involve products of graphs. The definitions are described here. The Cartesian product $G_1 \times G_2$ of $G_1$ and $G_2$ has vertex-set $V(G_1) \times V(G_2)$ with $(u_1, u_2)$ adjacent to $(v_1, v_2)$ if and only if either $u_1 = v_1$ and $u_2$ is adjacent to $v_2$ in $G_2$, or $u_2 = v_2$ and $u_1$ is adjacent to $v_1$ in $G_1$.
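The definition of $(m,n)$-extendability can be checked directly on very small graphs. The following is a minimal brute-force sketch, not a method from the paper: the function names `perfect_matchings` and `is_mn_extendable` are hypothetical, graphs are assumed to be given as a vertex list plus a list of edges (each edge listed once), and the running time is exponential.

```python
from itertools import combinations

def perfect_matchings(vertices, edges):
    """Yield every perfect matching of the graph as a frozenset of edges."""
    vertices = sorted(vertices)
    edge_set = {frozenset(e) for e in edges}

    def extend(remaining):
        if not remaining:
            yield frozenset()
            return
        v = remaining[0]                      # v must be matched to someone
        for w in remaining[1:]:
            if frozenset((v, w)) in edge_set:
                rest = [x for x in remaining if x not in (v, w)]
                for m in extend(rest):
                    yield m | {frozenset((v, w))}

    yield from extend(vertices)

def is_mn_extendable(vertices, edges, m, n):
    """Brute-force test of (m,n)-extendability as defined above (assumed input format)."""
    pms = list(perfect_matchings(vertices, edges))
    if not pms:                               # G must have a perfect matching
        return False
    edge_list = [frozenset(e) for e in edges]
    for M in combinations(edge_list, m):
        covered = set().union(*M)
        if len(covered) != 2 * m:
            continue                          # the chosen edges are not independent
        free = [v for v in vertices if v not in covered]
        for U in combinations(free, n):
            U = set(U)
            # need a perfect matching containing M with no edge inside U
            if not any(set(M) <= pm and all(not e <= U for e in pm) for pm in pms):
                return False
    return True

# The 4-cycle as a small test case
c4_vertices = [0, 1, 2, 3]
c4_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(is_mn_extendable(c4_vertices, c4_edges, 1, 0))  # True
print(is_mn_extendable(c4_vertices, c4_edges, 0, 2))  # True
print(is_mn_extendable(c4_vertices, c4_edges, 0, 3))  # False
```

On the 4-cycle, the failure for $(0,3)$ comes from the fact that both perfect matchings contain an edge joining two of any three chosen vertices.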
Figure 1: [figure not reproduced]
The wreath product $G_1 \otimes G_2$ of $G_1$ and $G_2$ (also called the lexicographic product) is the graph with vertex-set $V(G_1) \times V(G_2)$ and an edge joining $(u_1, u_2)$ to $(v_1, v_2)$ if and only if either $u_1$ is adjacent to $v_1$ in $G_1$, or $u_1 = v_1$ and $u_2$ is adjacent to $v_2$ in $G_2$.
For any set $S \subseteq V(G)$, we denote by $G - S$ the subgraph of $G$ obtained by deleting the vertices of $S$ together with their incident edges, and by $G[S]$ the subgraph of $G$ induced by $S$. Denote the maximum and the minimum degree of $G$ by $\Delta(G)$ and $\delta(G)$, respectively. The neighborhood-set of $S$ in $G$ is denoted by $N_G(S)$ and is the set of all vertices in $G$ which have a neighbor in $S$. The cycle and the path with $n$ vertices will be denoted by $C_n$ and $P_n$, respectively.
To conclude this section, we list several results which will be used later in this paper.
Theorem 1.1: (Tutte [4]) A graph $G$ has a perfect matching if and only if $o(G - S) \le |S|$ for all $S \subseteq V(G)$, where $o(G - S)$ is the number of odd components of $G - S$.
Theorem 1.2: (Plummer [1]) Let $k$ and $p$ be positive integers with $p$ even and let $G$ be a graph with $p$ vertices. If $p \ge 2k + 2$, then: (1) If $G$ is $k$-extendable, then $G$ is $(k-1)$-extendable; (2) If $p \ge 4$ and $\delta(G) \ge p/2 + k$, then $G$ is $k$-extendable; (3) If $G$ is connected and $k$-extendable, then $G$ is $(k+1)$-connected.
Theorem 1.3: (see [5]) A graph $G$ is bicritical if and only if, for every $S \subseteq V(G)$ with $|S| \ge 2$, $o(G - S) \le |S| - 2$.
2. Relationships Among (m,n)-Extendabilities
From the definition of $(m,n)$-extendable, we easily obtain the following two lemmas.
Lemma 2.1: If $|V(G)| \ge 2m + 2$, then the graph $G$ is $(m,0)$-extendable if and only if it is $(m,1)$-extendable.
Lemma 2.2: If a graph $G$ is $(m,n)$-extendable, then it is also $(m,p)$-extendable for any $0 \le p \le n$.
The following theorem gives a basic relationship among the (m,n)-extendabilities.
Theorem 2.3: If $G$ is a connected $(m,n)$-extendable graph, and $|V(G)| \ge 2m + 2n + 2$, then $G$ is $(m-1, n+1)$-extendable.
Proof: From Lemma 2.1 and Theorem 1.2 (l), we may assume that n 2 1. Case 1: n > 1. Arbitrarilychoosean(m- 1)-matchingM={vlw1,v2w 2,...,vm-lwm-l} anda set of n + 1 vertices U = {ul,u2 ,...,u , + ~ } ,where {v1,w1,v2,w2,..., vm - l , ~ m- 1} n {U1,U2,...,Un+ 1 ) = 0. First we claim that there exists a matching extension for (M,U-{u, l}). If not, then for any VE V(G)-{Vl,W1,V2'W 2,...,vm_1,wm- 1,ul,u2, ...,un l},NG(~)c{vi7wi I i = 1,2,...,m - 1). Otherwise, if VUE E(G) and u(5 {vi,wi I i = 1,2, ...,m - l}, then (Mu{vu}, U-{u,+ 1}) has a matching extension and so does (M,U-{u, 1}), which contradicts the assumption. But if = 1,2 NG(v) ~ { v ~ I ,i w ~ ,...,m - l}, then vis an isolated vertex in G-{vl,Wl~v2,w2,...,vm - 1, wm - l}. Thus M cannot be extended to a perfect matching of G. A contrdctlon. +
+
+
Let M* be a matching extension of (M,U-{u, ,}) in G. If no uiu, 1~ M*, then M* is a matching extension of (M U). Otherwise, say unun + d .Since n > 1, there exists a vertex w f u n + so that u WE M*. Let M'= Mu{ulw}. Then (MI, U-{u,}) has a matching extension d*.Clearly M** i: also a matching extension of ( M U). +
+
Case 2: $n = 1$. Let $M^*$ be a matching extension of $(M, U - \{u_1\})$ in the $(m,n)$-extendable graph $G$. If $u_1u_2 \notin M^*$, then $M^*$ is an extension of $(M, \{u_1, u_2\})$. So suppose that $u_1u_2 \in M^*$ for any matching extension $M^*$ of $(M, \{u_1\})$. Let $\{v_i, w_i \mid i = 1, 2, \ldots, m-1\} \cup \{u_1, u_2\} = W$. Then $N_G(u_1) \subseteq W$ and $N_G(u_2) \subseteq W$.
Claim 1. If one of $\{u_1, u_2\}$ is adjacent to $v_j$ ($1 \le j \le m-1$), then the other one must be adjacent to $w_j$.
Suppose that $u_1$ is adjacent to $v_j$ for some $j$. Let $U = \{u_2\}$ and $M = \{v_1w_1, v_2w_2, \ldots, v_{j-1}w_{j-1}, v_{j+1}w_{j+1}, \ldots, v_{m-1}w_{m-1}, v_ju_1\}$. Then $(M, U)$ has a matching extension $M^*$. Thus $u_2w_j \in M^*$, or $u_2$ is adjacent to $w_j$. A similar argument can be applied to $u_2$.
Claim 2. Let $W' = N_G(u_1) \cup N_G(u_2)$. Then for any $v \in W'$, $N_G(v) \subseteq W' \subseteq W$.
If not, then there exists a vertex, say $v_i$, of $W'$ such that $N_G(v_i) \not\subseteq W'$ and $v_iu_1 \in E(G)$. Let $u \in N_G(v_i) - W'$. Then $uv_i \in E(G)$. Thus, by Claim 1, $u_2w_i \in E(G)$. Denote the edges of $M$ which are in $W'$ by $M'$. Let $M'' = (M' - \{v_iw_i\}) \cup \{uv_i, u_2w_i\}$. Since $N_G(u_1) \subseteq W'$, $u_1$ is an isolated vertex of the graph which is obtained by deleting $M''$ with its end-vertices from $G$. Then $M''$ cannot be extended to a perfect matching of $G$. But $G$ is $(m,n)$-extendable; thus, by Lemma 2.2, $G$ is $(m,0)$-extendable. By Theorem 1.2 (1), $G$ is $(|M''|, 0)$-extendable as $|M''| \le m$. A contradiction. So Claim 2 is proved.
From Claim 2, we see that $G[W']$ is a component of $G$. But $|V(G)| \ge 2m + 2n + 2 > |W'|$, so $G[W']$ is a proper subgraph of $G$. This contradicts the fact that $G$ is connected. This contradiction implies that there exists a matching extension $M^*$ of $(M, \{u_1\})$ such that $u_1u_2 \notin M^*$. Hence $M^*$ is a matching extension of $(M, \{u_1, u_2\})$.
Remark 2.4: Consider the $m$-cube $Q_m$ with $2^m$ vertices. It is known (see [6]) that $Q_m$ is $(m-1, 0)$-extendable. By Lemma 2.1 and Theorem 2.3, $Q_m$ is then $(m-2, 2)$-extendable. Since $Q_m$ is an $m$-regular graph, clearly it is not $(m-2, 3)$-extendable. Thus $(m,n)$-extendability does not imply $(m-1, n+2)$-extendability. In this sense, Theorem 2.3 is the best possible.
Corollary 2.5: If $G$ is an $(m,0)$-extendable graph of order at least $2m + 2$, then $G$ is $(p,q)$-extendable for $0 \le p \le m$ and $0 \le p + q \le m + 1$.
Proof: Let $G$ be an $(m,0)$-extendable graph. Then $G$ is $(m,1)$-extendable by Lemma 2.1, and the previous theorem implies that $G$ is $(m-1, 2)$-extendable; using Theorem 2.3 again, $G$ is $(m-2, 3)$-extendable, and so on. In general, $G$ is $(p, m-p+1)$-extendable for $0 \le p \le m$. Applying Lemma 2.2, $G$ is $(p,q)$-extendable for any $0 \le p \le m$ and $0 \le p + q \le m + 1$.
Corollary 2.6: If $G$ is an $(m,n)$-extendable graph of order at least $2m + 2n + 2$, then $G$ is $(p,q)$-extendable for $0 \le p \le m$, $0 \le q \le n$.
Remark 2.7: Let $G_1$ and $G_2$ be two $(0,n)$-extendable graphs. Choose $v \in V(G_1)$ and join $v$ to every vertex of $G_2$. The resulting graph is denoted by $G$. Then $\{v\}$ is a cut set of $G$, that is, $G$ has connectivity 1. It is easy to see that $G$ is still $(0,n)$-extendable. But by Theorem 1.2 (3), $G$ is not 1-extendable. This example shows that $(0,n)$-extendability does not guarantee even $(1,0)$-extendability.
From the observations of Corollary 2.5 and Remark 2.7, we can see that a graph being $m$-edge extendable ($(m,0)$-extendable) is also $m$-vertex extendable ($(0,m)$-extendable). Therefore edge-extension is stronger than vertex-extension.
3. Properties of (0,n)-Extendability
The two special cases of $(m,n)$-extendability, $(m,0)$-extendability and $(0,n)$-extendability, are particularly interesting. The $(m,0)$-extendability was extensively studied by Plummer in [1]. Here we consider the $(0,n)$-extendability.
Lemma 3.1: If a graph $G$ is $(0,n)$-extendable, then $\delta(G) \ge n$.
Proof: Let $\deg(u) = \delta$ for some $u \in V(G)$. If $\delta < n$, let $U = \{u\} \cup N_G(u)$. Then $|U| \le n$ but $U$ cannot be extended to a perfect matching of $G$. A contradiction.
Next, we give a characterization of $(0,n)$-extendable graphs. For convenience, we write $E(S)$ for $E(G[S])$ where $S \subseteq V(G)$.
Theorem 3.2: A graph $G$ is $(0,n)$-extendable if and only if for any $U \subseteq V(G)$ with $|U| = n$ and any $W \subseteq V(G)$, $o(G - E(U) - W) \le |W|$.
Proof: Let $G$ be a $(0,n)$-extendable graph and $U$ any $n$-set of $V(G)$. Then $G - E(U)$ has a perfect matching. Hence, by Theorem 1.1, for any $W \subseteq V(G)$, $o(G - E(U) - W) \le |W|$. Conversely, for any fixed $U \subseteq V(G)$ with $|U| = n$, if $o(G - E(U) - W) \le |W|$ for any $W \subseteq V(G)$, we know that $G - E(U)$ has a perfect matching. This perfect matching is a matching extension of $(\emptyset, U)$.
Notice that if $U$ contains only one vertex, then $E(U) = \emptyset$. So we have that $G$ is $(0,1)$-extendable if and only if $o(G - W) \le |W|$ for any $W \subseteq V(G)$. Hence $G$ is $(0,1)$-extendable if and only if $G$ has a perfect matching. In this sense, Theorem 3.2 is a generalization of Tutte's theorem. Therefore $(m,n)$-extendability is also a generalization of perfect matching.
Corollary 3.3: A graph $G$ is $(0,2)$-extendable if and only if for any edge $e$ and any $W \subseteq V(G)$, $o(G - e - W) \le |W|$.
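The condition of Theorem 3.2 can be evaluated literally on small graphs by enumerating every $U$ and $W$. The sketch below is only an illustration under assumed conventions (vertex list plus edge list; the names `odd_components` and `satisfies_theorem_3_2` are hypothetical) and it takes exponential time.

```python
from itertools import combinations

def odd_components(vertices, edges):
    """Count the connected components that have an odd number of vertices."""
    adj = {v: set() for v in vertices}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, odd = set(), 0
    for s in vertices:
        if s in seen:
            continue
        stack, size = [s], 0
        seen.add(s)
        while stack:                      # depth-first search from s
            v = stack.pop()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        odd += size % 2
    return odd

def satisfies_theorem_3_2(vertices, edges, n):
    """Check o(G - E(U) - W) <= |W| for every n-set U and every W."""
    vertices = list(vertices)
    for U in combinations(vertices, n):
        U = set(U)
        # G - E(U): drop every edge with both ends in U
        kept = [(a, b) for a, b in edges if not (a in U and b in U)]
        for k in range(len(vertices) + 1):
            for W in combinations(vertices, k):
                W = set(W)
                rest = [v for v in vertices if v not in W]
                sub = [(a, b) for a, b in kept if a not in W and b not in W]
                if odd_components(rest, sub) > len(W):
                    return False
    return True

c4_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(satisfies_theorem_3_2(range(4), c4_edges, 2))  # True
print(satisfies_theorem_3_2(range(4), c4_edges, 3))  # False
```

For the 4-cycle with $n = 3$, taking $U = \{0,1,2\}$ and $W = \{3\}$ leaves three isolated vertices, so the condition fails, in agreement with Lemma 3.1.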
In light of Theorem 3.2, we can give some sufficient conditions for the existence of $(0,n)$-extendable graphs.
Theorem 3.4: Let $G$ be a graph of even order and $\delta(G) \ge \frac{1}{2}|V(G)| + m + n - 1$. Then $G$ is $(m,n)$-extendable.
Proof: Consider any $m$ edges $v_1w_1, v_2w_2, \ldots, v_mw_m$ and $n$ vertices $u_1, u_2, \ldots, u_n$ of $G$. Let $U = \{u_1, u_2, \ldots, u_n\}$ and $G' = G - E(U) - \{v_1, w_1, v_2, w_2, \ldots, v_m, w_m\}$. Then
$$\delta(G') \ge \delta(G) - 2m - (n-1) \ge \tfrac{1}{2}|V(G)| + m + n - 1 - 2m - n + 1 = \tfrac{1}{2}\left(|V(G)| - 2m\right) = \tfrac{1}{2}|V(G')|.$$
Therefore, by Dirac's theorem, $G'$ has a Hamilton cycle. Since $G'$ has even order, it has a perfect matching. This perfect matching together with $v_1w_1, v_2w_2, \ldots, v_mw_m$ is an $(m,n)$-extension of $G$.
Theorem 3.5: If a graph $G$ is bicritical, then $G$ is $(0,2)$-extendable.
Proof: For any $e \in E(G)$, let $G' = G - e$. By Corollary 3.3, we need only show that $o(G' - S) \le |S|$ for any $S \subseteq V(G)$.
When $|S| \ge 2$, let $O_1, \ldots, O_k$ be the odd components of $G' - S$. Then each odd component of $G - S$ is either some $O_i$ or some $O_j$ joined to an even component of $G' - S$ by $e$. In either case, we see that $o(G - S) \ge k - 2$. Since $G$ is bicritical then, by Theorem 1.3, $o(G - S) \le |S| - 2$. Hence $k - 2 \le o(G - S) \le |S| - 2$, or $k \le |S|$. That is, $o(G' - S) \le |S|$.
When $|S| = 0$, since $G$ is bicritical, it is $(1,0)$-extendable and 2-connected. Hence $o(G') = 0$, and so $o(G' - S) \le |S|$.
Suppose $|S| = 1$, say $S = \{v\}$. If $o(G' - \{v\}) \le 1$, we are done. If $o(G' - \{v\}) > 1$, then by parity, $o(G' - \{v\}) = 3$. Let $e = xy$ and $T = \{x, v\}$. Then $o(G - T) \ge 2$, or $G - T$ has no perfect matching. A contradiction.
Although there is a close relationship between $(m,0)$-extendability and connectivity (Theorem 1.2 (3)), unfortunately there is no similar relationship for $(0,n)$-extendability. (The graph $G$ in Remark 2.7 is $(0,n)$-extendable, but it is not even 2-connected.)
4. The Extendability of Products of Graphs
We start this section by considering Cartesian products $G_1 \times G_2$. For convenience, we denote the subgraph induced by $\{v\} \times V(G_2)$ in $G_1 \times G_2$ by $G_v$, called a layer, which is a copy of $G_2$. If $v_1$ and $v_2$ are adjacent in $G_1$, then $\{v_1, v_2\} \times G_2$ is isomorphic to $P_2 \times G_2$. Let $E(G_u, G_v)$ denote the edges between $\{u\} \times V(G_2)$ and $\{v\} \times V(G_2)$ in $G_1 \times G_2$. Thus, if $uv \in E(G_1)$, then $E(G_u, G_v)$ is a perfect matching between $G_u$ and $G_v$. Let $p_{uv}$ be a bijection from $G_u$ to $G_v$ with $p_{uv}((u, x)) = (v, x)$ for every $x \in V(G_2)$. Hence, the projection $p_{uv}$ is an isomorphism from $G_u$ to $G_v$.
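The layer structure just described can be made concrete with a short sketch. The helper names `cartesian_product` and `between_layers` are hypothetical, and graphs are assumed to be given as a vertex list and an edge list; the example checks that the edges between two adjacent layers form a perfect matching between them, while non-adjacent layers share no edges.

```python
def cartesian_product(V1, E1, V2, E2):
    """Vertices and edges of the Cartesian product G1 x G2."""
    vertices = [(u, x) for u in V1 for x in V2]
    edges = set()
    for u in V1:                       # edges inside the layer G_u (a copy of G2)
        for x, y in E2:
            edges.add(frozenset(((u, x), (u, y))))
    for u, v in E1:                    # edges between the layers G_u and G_v
        for x in V2:
            edges.add(frozenset(((u, x), (v, x))))
    return vertices, edges

def between_layers(edges, u, v):
    """E(G_u, G_v): product edges with one end in each of the two layers."""
    return {e for e in edges if {a for a, _ in e} == {u, v}}

# G1 = path on three vertices 0-1-2, G2 = 4-cycle
V1, E1 = [0, 1, 2], [(0, 1), (1, 2)]
V2, E2 = [0, 1, 2, 3], [(0, 1), (1, 2), (2, 3), (3, 0)]
vertices, edges = cartesian_product(V1, E1, V2, E2)

cross = between_layers(edges, 0, 1)
covered = set().union(*cross)
print(len(cross), len(covered))            # 4 8
print(len(between_layers(edges, 0, 2)))    # 0
```

Each of the four cross edges joins $(0, x)$ to $(1, x)$, which is exactly the pairing given by the projection $p_{uv}$.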
Lemma 4.1: If $G$ is a $(0,n)$-extendable graph, then for any $e \notin E(G)$, $G + e$ is also $(0,n)$-extendable.
The advantage of this lemma is that instead of studying the given graph we can consider the $(0,n)$-extendability of our "favorite" spanning subgraph of the graph. Sometimes this helps to simplify the description.
Theorem 4.2: If $G_2$ is a $(0,n)$-extendable graph ($n \ge 1$) and $\delta = \delta(G_1)$, then $G_1 \times G_2$ is $(0, n + \delta)$-extendable.
Proof: Since $G_2$ is $(0,n)$-extendable, $V(G_1) \times G_2$ is $(0,n)$-extendable. Hence by Lemma 4.1, $G_1 \times G_2$ is $(0,n)$-extendable. We use induction on $|V(G_1)|$. If $|V(G_1)| = 1$, then $\delta = 0$ and $G_1 \times G_2$ is $(0,n)$-extendable. Suppose that the claim holds for $|V(G_1)| < m$. When $|V(G_1)| = m$, choose any $n + \delta$ vertices $u_1, u_2, \ldots, u_{n+\delta}$ from $G_1 \times G_2$. Let $G = G_1 \times G_2$ and $U = \{u_1, u_2, \ldots, u_{n+\delta}\}$.
If there exists $v \in V(G_1)$ such that $0 < |U \cap V(G_v)| \le n$, then by the induction hypothesis $G' = (G_1 - \{v\}) \times G_2$ is $(0, n + \delta(G_1 - \{v\}))$-extendable, where $n + \delta(G_1 - \{v\}) \ge n + \delta - 1$ and $|U \cap V(G')| \le n + \delta - 1$. Therefore, there is a matching extension $M'$ of $(\emptyset, U \cap V(G'))$ in $G'$. Since $G_2$ is $(0,n)$-extendable, there is a matching extension $M''$ of $U \cap V(G_v)$ in $G_v$. Hence $M' \cup M''$ is a matching extension of $(\emptyset, U)$. That is, $G$ is $(0, n + \delta)$-extendable.
If no $v \in V(G_1)$ exists such that $0 < |U \cap V(G_v)| \le n$, then either $|U \cap V(G_v)| = 0$ or $|U \cap V(G_v)| \ge n + 1$ for every $v \in V(G_1)$. Let $U \subseteq V(G_{v_1}) \cup \cdots \cup V(G_{v_k})$ and $|U \cap V(G_{v_i})| \ge n + 1$. Then $k(n+1) \le |U \cap V(G_{v_1})| + \cdots + |U \cap V(G_{v_k})| = n + \delta$. Hence $k \le \frac{n+\delta}{n+1}$. Suppose that there are $p_i$ layers adjacent to $G_{v_i}$ and each of them intersects $U$. Then $(p_i + 1)(n + 1) \le n + \delta$, or $p_i \le \frac{n+\delta}{n+1} - 1$. Therefore there are at least $\delta - p_i \ge \delta - \frac{n+\delta}{n+1} + 1$ layers which are adjacent to $G_{v_i}$ and do not intersect $U$.
Let $X = \{v_1, v_2, \ldots, v_k\}$ and $Y = V(G_1) - X$. We construct a bipartite graph $B = (X, Y)$ as follows: for $v_i \in X$ and $w_j \in Y$, $v_iw_j \in E(X, Y)$ if and only if $v_iw_j \in E(G_1)$. By the above argument, we know that $\deg_B(v_i) \ge \delta - \frac{n+\delta}{n+1} + 1$ for $i = 1, 2, \ldots, k$. Clearly $\delta - \frac{n+\delta}{n+1} + 1 \ge \frac{n+\delta}{n+1}$ if $\delta \ge 1$. Thus, $\deg_B(v_i) \ge k = |X|$ for each $v_i \in X$. So $|N_B(S)| \ge |S|$ for any $S \subseteq X$. By Hall's theorem, there is a matching which saturates $X$, say $v_1w_1, \ldots, v_kw_k$. Let $M_i$ be the perfect matching between $G_{v_i}$ and $G_{w_i}$ ($1 \le i \le k$). For any $v \notin \{v_i, w_i \mid i = 1, 2, \ldots, k\}$, let $M_v$ be any perfect matching of $G_v$; then $M_1 \cup M_2 \cup \cdots \cup M_k \cup \left(\bigcup \{M_v \mid v \in V(G_1) - \{v_1, w_1, \ldots, v_k, w_k\}\}\right)$ is a perfect matching of $G$ which is a matching extension of $(\emptyset, U)$.
Since we can choose $G_2$ such that $\delta(G_2) = n$, for example $G_2 = K_{n,n}$, the minimum degree of $G_1 \times G_2$ is $\delta(G_1) + n$; by Lemma 3.1, Theorem 4.2 is the best possible.
Theorem 4.3: If $G_1$ is a graph without isolated vertices and $G_2$ is a connected $(m,0)$-extendable graph of order at least $2m + 2$, then $G_1 \times G_2$ is $(m+1, 0)$-extendable.
Proof: We use induction on $|V(G_1)|$. Without loss of generality, we assume that $G_1$ is connected. First consider $G = P_2 \times G_2$. Choose any $(m+1)$-matching $M$ from $G$. Let $P_2 = uv$. Then the two copies of $G_2$ in $G$ are $G_u$ and $G_v$. Let $M = \{e_1, \ldots, e_k, e'_1, \ldots, e'_s, u_1u'_1, \ldots, u_hu'_h\}$, where $k + s + h = m + 1$, $e_i \in E(G_u)$ for $1 \le i \le k$, $e'_j \in E(G_v)$ for $1 \le j \le s$, and $u_t \in V(G_u)$, $u'_t \in V(G_v)$ for $1 \le t \le h$. Project $\{e'_1, \ldots, e'_s\}$ to $G_u$ by using $p_{uv}$ such that
(1) if $p_{uv}(e'_j)$ is incident to just one edge of $e_1, e_2, \ldots, e_k$, then we delete this edge but keep the end-vertex which is not incident to any edge of $e_1, e_2, \ldots, e_k$; and
(2) if $p_{uv}(e'_j)$ is incident to two of $e_1, e_2, \ldots, e_k$, then delete $p_{uv}(e'_j)$.
Then we obtain a matching $M' = \{e_1, e_2, \ldots, e_k, e_{k+1}, \ldots, e_p\}$ and a set of vertices $U' = \{u_1, u_2, \ldots, u_h, u_{h+1}, \ldots, u_q\}$, where $p + q \le m + 1$, and $U'$ and $M'$ are independent.
Case 1: $q = 0$. Then $k + s = m + 1$. If $s = 0$ or $k = 0$, say $s = 0$, then $M \subseteq G_u$. Let $M^* = M \cup p_{uv}(M) \cup \{x\,p_{uv}(x) \mid x \in V(G_u) - V(M)\}$. Then $M^*$ is a perfect matching of $G$ containing $M$. So suppose $k > 0$ and $s > 0$; then $k \le m$ and $s \le m$. But $G_2$ is $(m,0)$-extendable, so by Theorem 1.2 (1) there exist perfect matchings $F_1$ and $F_2$ in $G_u$ and $G_v$ which contain $M_1$ and $M_2$ respectively. Hence $F_1 \cup F_2$ is a perfect matching of $G$ as required.
Case 2: $q > 0$. By Corollary 2.5, we know that $G_u$ is $(p,q)$-extendable. Let $M_1^*$ be an extension of $(M', U')$ in $G_u$. Let $u_1v_1, \ldots, u_qv_q \in M_1^*$. Then $e'_1, \ldots, e'_s, u'_1v'_1, \ldots, u'_qv'_q$ is a matching in $G_v$, where $s + q \le m$ and $u'_i = p_{uv}(u_i)$. Hence $\{e'_1, \ldots, e'_s, u'_1v'_1, \ldots, u'_qv'_q\}$ can be extended to a perfect matching $M_2^*$ in $G_v$. Now let
$$M^* = \{u_iu'_i,\ v_iv'_i \mid i = 1, \ldots, h\} \cup \left(M_1^* - \{u_iv_i \mid i = 1, \ldots, h\}\right) \cup \left(M_2^* - \{u'_iv'_i \mid i = 1, \ldots, h\}\right).$$
Then $M^*$ is a perfect matching of $G$ and $M \subseteq M^*$.
Now consider the general graph $G_1 \times G_2$. Let $M$ be any $(m+1)$-matching of $G$.
Case 2.1: $M \subseteq \bigcup \{E(G_v) \mid v \in V(G_1)\}$. If $M \subseteq E(G_v)$ for some $v$, choose $u \in V(G_1)$ such that $uv \in E(G_1)$. Consider $G' = \{u, v\} \times G_2$, so that $G' \cong P_2 \times G_2$. Hence $G'$ has a perfect matching $M'$ containing $M$. For each $w \in V(G_1) - \{u, v\}$, let $M_w$ be a perfect matching in $G_w$. Then $M^* = M' \cup \left(\bigcup \{M_w \mid w \in V(G_1) - \{u, v\}\}\right)$ is a matching extension of $M$ in $G$. If there does not exist a vertex $v$ such that $M \subseteq E(G_v)$, then for any $w \in V(G_1)$ we have $|M \cap E(G_w)| \le m$. So $M \cap E(G_w)$ can be extended to a perfect matching $M_w^*$ in $G_w$. Let $M^* = \bigcup \{M_w^* \mid w \in V(G_1)\}$. We are done.
Case 2.2: $M \not\subseteq \bigcup \{E(G_v) \mid v \in V(G_1)\}$. Let $v \in V(G_1)$ and $N_{G_1}(v) = \{v_1, \ldots, v_p\}$. Let
$$M' = \bigcup_{i=1}^{p} \left(M \cap E(G_v, G_{v_i})\right).$$
We may assume $M' \ne \emptyset$. Now project $M \cap E(G_{v_i})$ ($i = 1, 2, \ldots, p$) to $G_v$ in the following manner: if $p_{v_iv}(e)$ is incident to $E(G_v) \cap M$ or to $M'$, then delete it but keep its end-vertices; if some $p_{v_iv}(e)$ is incident to some $p_{v_jv}(e')$, then delete one of the two edges, also keeping the end-vertices.
After such projections, we obtain a matching $M_1$ and some vertices $U = \{u_1, \ldots, u_q\}$ in $G_v$ with $q \ge 1$ and $|M_1| + q \le m + 1$. But $G_v$ is $(|M_1|, q)$-extendable by Corollary 2.5. Let $M_v^*$ be an extension of $(M_1, U)$ in $G_v$. Then $M \cap E(G_v) \subseteq M_v^*$. Now for each $G_{v_i}$, let $E_i = \{e \in M_v^* \mid e \text{ is incident to } M \cap E(G_v, G_{v_i})\}$. Projecting $E_i$ to $G_{v_i}$ we obtain $p_{vv_i}(E_i)$, which is a matching, and $(M \cap E(G_{v_i})) \cup p_{vv_i}(E_i)$ is still a matching in $G_{v_i}$, for $i = 1, \ldots, p$.
If we delete $G_v$, then we obtain a graph $G' = (G_1 - \{v\}) \times G_2$ and a matching
$$M' = \left(M - E(G_v)\right) \cup \left(\bigcup_{i=1}^{p} p_{vv_i}(E_i)\right), \qquad |M'| \le m + 1.$$
By the induction hypothesis on $|V(G_1)|$, $M'$ can be extended to a perfect matching $M''$ in $G'$. Let
$$F_i = \{x\,p_{vv_i}(x),\ y\,p_{vv_i}(y) \mid xy \in E_i\}.$$
Then $M^*$ is a perfect matching and $M \subseteq M^*$.
Next we consider the extendability of wreath products of two graphs. From the definition of $G_1 \otimes G_2$, if $uv \in E(G_1)$, then $E(G_u, G_v)$ is a complete bipartite graph between $G_u$ and $G_v$. For each $v \in V(G_1)$, $G_v$ is isomorphic to $G_2$.
Theorem 4.4: For a graph $G$, $P_{2k} \otimes G$ is $(0, |V(G)|)$-extendable, but not $(0, |V(G)| + 1)$-extendable.
Proof: Let $P_{2k} = v_1v_2 \ldots v_{2k}$ and $|V(G)| = n$. Choose any $n$ vertices of $P_{2k} \otimes G$ and denote this set by $U$. For the $k$ subgraphs $E(G_{v_1}, G_{v_2}), E(G_{v_3}, G_{v_4}), \ldots, E(G_{v_{2k-1}}, G_{v_{2k}})$, we have $|U \cap E(G_{v_{2i-1}}, G_{v_{2i}})| \le n$ for $i = 1, \ldots, k$. Since $E(G_{v_{2i-1}}, G_{v_{2i}}) \cong P_2 \otimes \overline{K}_n \cong K_{n,n}$, $U \cap E(G_{v_{2i-1}}, G_{v_{2i}})$ can be extended to a perfect matching $M_i$ in $E(G_{v_{2i-1}}, G_{v_{2i}})$. Let $M = M_1 \cup \cdots \cup M_k$. Then $M$ is a matching extension of $(\emptyset, U)$ in $P_{2k} \otimes G$.
To see that $P_{2k} \otimes G$ is not $(0, |V(G)| + 1)$-extendable, choose $U = V(G_{v_1}) \cup \{u\}$ where $u \in G_{v_2}$. Obviously $U$ cannot be extended to a perfect matching in $P_{2k} \otimes G$. We are done.
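The key step of this proof, extending a vertex set $U$ inside a copy of $K_{n,n}$, can be written out explicitly. The sketch below is an illustration under assumptions, not the authors' construction: `lexicographic_product` and `extend_in_knn` are hypothetical helper names, and the pairing rule (match $U$-vertices on one side to non-$U$ vertices on the other) is one simple way to realize the extension.

```python
def lexicographic_product(V1, E1, V2, E2):
    """Vertices and edges of the wreath (lexicographic) product of G1 and G2."""
    vertices = [(u, x) for u in V1 for x in V2]
    edges = set()
    for u, v in E1:          # adjacent layers are completely joined
        for x in V2:
            for y in V2:
                edges.add(frozenset(((u, x), (v, y))))
    for u in V1:             # each layer is a copy of G2
        for x, y in E2:
            edges.add(frozenset(((u, x), (u, y))))
    return vertices, edges

def extend_in_knn(left, right, U):
    """Perfect matching of K_{n,n} on (left, right) with no edge inside U."""
    U = set(U)
    assert len(left) == len(right) and len(U) <= len(left)
    l_in = [v for v in left if v in U]
    l_out = [v for v in left if v not in U]
    r_in = [v for v in right if v in U]
    r_out = [v for v in right if v not in U]
    # pair each U-vertex with a non-U vertex on the opposite side
    pairs = list(zip(l_in, r_out)) + list(zip(l_out, r_in))
    # the leftover vertices are all outside U and are paired arbitrarily
    pairs += list(zip(l_out[len(r_in):], r_out[len(l_in):]))
    return pairs

# P_2 (vertices 0, 1) and G = path on three vertices, so n = 3
V1, E1 = [0, 1], [(0, 1)]
V2, E2 = [0, 1, 2], [(0, 1), (1, 2)]
vertices, edges = lexicographic_product(V1, E1, V2, E2)

left = [(0, x) for x in V2]
right = [(1, x) for x in V2]
U = {(0, 0), (0, 1), (1, 2)}
matching = extend_in_knn(left, right, U)
print(matching)
```

Every pair returned joins the two layers, so it is an edge of the product regardless of the edges inside $G$; this is why only the complete bipartite part between the layers is needed.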
Theorem 4.5: Let $G$ be a $(0,t)$-extendable graph ($t \ge 1$) and $s = \min\{|V(G)|,\ |V(G)|/2 + 2t\}$. Then $P_3 \otimes G$ is $(0,s)$-extendable.
Proof: Let $U$ be any set of $s$ vertices of $P_3 \otimes G$ and let $n = |V(G)|$. Let $P_3 = 123$ and $a_i = |U \cap V(G_i)|$ for $i = 1, 2, 3$. Then $a_1 + a_2 + a_3 = s$.
Case 1: $a_1$ or $a_3 \le t$, say $a_3 \le t$. In this case, $a_1 + a_2 \le s \le n$. Since $E(G_1, G_2) \cong P_2 \otimes \overline{K}_n$ is $(0,n)$-extendable, there is a matching extension $M'$ of $(\emptyset, U \cap (V(G_1) \cup V(G_2)))$ in $E(G_1, G_2)$. There is a matching extension $M''$ of $U \cap V(G_3)$ in $G_3$ since $a_3 \le t$. Hence $M = M' \cup M''$ is a matching extension as required.
Case 2: $a_1, a_3 > t$. Choose a perfect matching $M_i$ in $G_i$ for $i = 1, 2, 3$. Let $c_i$ be the number of edges in $M_i$ each of which is incident to one vertex in $U$. Let $B_i = \{\text{edges in } M_i \text{ each of which has both ends in } U\}$ and $b_i = |B_i|$. Let $D_i = \{\text{edges in } M_i \text{ which are independent of } U\}$ and $d_i = |D_i|$. Then we have $c_i + 2b_i = a_i$ and $d_i = n/2 - c_i - b_i$, for $i = 1, 2, 3$. Since $G_i$ is $(0,t)$-extendable, we may assume that $b_i + c_i \ge t$. Thus $a_i \ge b_i + t$, or $c_i \ge 2t - a_i$. The perfect matching $M_1 \cup M_2 \cup M_3$ is not the required extension, because the edges in $B_i$ do not satisfy the extension property. Notice that if $uv \in B_2$ and $xy \in D_1 \cup D_3$, then we can delete $uv, xy$ from $M_1 \cup M_2 \cup M_3$ and add $ux, vy$ to it. Thus, if $b_2 \le d_1 + d_3$ and $b_1 + b_3 \le d_2$, then we can get the desired matching extension from $M_1 \cup M_2 \cup M_3$ by the exchange described above. Therefore the rest of the proof is to verify $b_2 \le d_1 + d_3$ and $b_1 + b_3 \le d_2$.
Since $s = a_1 + a_2 + a_3 \le n$ and $a_i = c_i + 2b_i$, we have
$$b_1 + b_2 + b_3 + c_1 + c_3 \le a_1 + a_2 + a_3 \le n, \quad\text{or}\quad b_2 \le n - b_1 - b_3 - c_1 - c_3.$$
Because $d_1 + d_3 = n - b_1 - c_1 - b_3 - c_3$, we get $b_2 \le d_1 + d_3$.
Since $s = a_1 + a_2 + a_3 \le \frac{n}{2} + 2t$ and $c_i \ge 2t - a_i$, we have
$$a_1 + a_2 + a_3 + c_2 - c_1 - c_3 \le \tfrac{n}{2} + 2t + a_2 - (2t - a_1) - (2t - a_3) = \tfrac{n}{2} - 2t + a_1 + a_2 + a_3 \le \tfrac{n}{2} - 2t + \tfrac{n}{2} + 2t = n.$$
Because $a_i = c_i + 2b_i$, the inequality $a_1 + a_2 + a_3 + c_2 - c_1 - c_3 \le n$ implies $2b_1 + 2b_2 + 2b_3 + 2c_2 \le n$, or $b_1 + b_3 \le \frac{n}{2} - b_2 - c_2$. However, $d_2 = \frac{n}{2} - b_2 - c_2$, hence $b_1 + b_3 \le d_2$. This completes the proof.
Theorem 4.6: If $G_1$ has a perfect matching, then $G_1 \otimes G_2$ is $(0, |V(G_2)|)$-extendable. If $G_2$ is $(0,t)$-extendable for $t > 0$ and $G_1$ has no isolated vertices, then $G_1 \otimes G_2$ is $(0,s)$-extendable where $s = \min\{|V(G_2)|,\ \frac{1}{2}|V(G_2)| + 2t\}$.
Proof: The first statement follows from Lemma 4.1 and Theorem 4.4. The second is immediate from Lemma 4.1 and Theorems 4.4 and 4.5.
In general, the $(0,s)$-extension of Theorem 4.6 is not best possible. We may improve it a little, but it is not worth the effort.
Theorem 4.7: $C_3 \otimes G$ is $(0, \frac{3}{2}|V(G)|)$-extendable if $|V(G)|$ is even.
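Proof: Let $|V(G)| = n$. Then $C_3 \otimes \overline{K}_n \subseteq C_3 \otimes G$ and $C_3 \otimes \overline{K}_n \cong K_{n,n,n}$. It is easy to see that $K_{n,n,n}$ is $(0, \frac{3}{2}n)$-extendable if $n$ is even. By Lemma 4.1, so is $C_3 \otimes G$.
Theorem 4.8: $K_{r,r} \otimes \overline{K}_n$ is $(0, rn)$-extendable.
Proof: Obviously, $K_{r,r} \otimes \overline{K}_n \cong K_{rn,rn} \cong P_2 \otimes \overline{K}_{rn}$. By Theorem 4.4, $K_{r,r} \otimes \overline{K}_n$ is $(0, rn)$-extendable.
$K_{r,r} \otimes \overline{K}_n$ is not $(0, rn + 1)$-extendable as it has only $2rn$ vertices.
Corollary 4.9: If $G$ is a graph of order $n$, then $C_4 \otimes G$ is $(0, 2n)$-extendable.
Proof: By Theorem 4.8, $K_{2,2} \otimes \overline{K}_n$ is $(0, 2n)$-extendable. But $K_{2,2} \otimes \overline{K}_n \cong C_4 \otimes \overline{K}_n$, thus $C_4 \otimes \overline{K}_n$ is $(0, 2n)$-extendable. From Lemma 4.1, $C_4 \otimes G$ is then $(0, 2n)$-extendable.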
Acknowledgements The authors wish to thank Professors B. Alspach and K. Heinrich for their help during the preparation of this paper. Thanks are also due to the referees for a helpful suggestion to shorten the proof of Theorem 4.8.
References
[1] M. D. Plummer; On n-extendable graphs, Discrete Math., 31, 201-210 (1980).
[2] L. Lovász; On the structure of factorizable graphs, Acta Math. Acad. Sci. Hungar., 23, 179-195 (1972).
[3] L. Lovász; Matching structure and the matching lattice, J. Combin. Theory, Ser. B, 43, 187-222 (1987).
[4] W. T. Tutte; The factorization of linear graphs, J. London Math. Soc., 22, 107-111 (1947).
[5] L. Lovász and M. D. Plummer; Matching Theory, North-Holland, Amsterdam (1986).
[6] Q. L. Yu; Factors and Factor Extensions, Doctoral dissertation, Simon Fraser University (1991).
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 201-210 (1993) © 1993 Elsevier Science Publishers B.V. All rights reserved.
PROSPECTS FOR GRAPH THEORY ALGORITHMS Ronald C. READ Department of Combinatorics and Optimization University of Waterloo, Waterloo, CANADA
Abstract
The history of graph algorithms so far exhibits a process of evolution. In the early days (say up to 1960 or so) it was sufficient, given a problem, to find some algorithm that solved the problem somehow. Later, consideration of complexity, in time and space, started to become of paramount importance, and concepts like that of NP-completeness focused attention on questions of how efficient an algorithm might possibly be. This trend can be expected to continue. It hints at a growing level of abstraction in the study of graph algorithms. Theorems such as those of Robertson and Seymour, which, for example, demonstrate the existence of polynomial algorithms for certain problems without exhibiting an actual algorithm, are signposts to one direction in which the theory of graph algorithms is going.
1. Introduction
A problem that we sometimes give to our first-year algebra students is the following: show that it is possible for an irrational number raised to the power of an irrational number to be rational. Most students expect us to answer the question by producing two numbers, $\alpha$ and $\beta$, such that $\alpha^\beta$ is rational, but instead we give them the following argument. Consider the number $\sqrt{2}^{\sqrt{2}}$. This is either rational or irrational. If it is rational then we have proved the statement with $\alpha = \beta = \sqrt{2}$. On the other hand, if $\sqrt{2}^{\sqrt{2}}$ is irrational, take $\alpha = \sqrt{2}^{\sqrt{2}}$ and $\beta = \sqrt{2}$. Then $\alpha^\beta = 2$ and is rational.
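The last step rests on the identity $\left(\sqrt{2}^{\sqrt{2}}\right)^{\sqrt{2}} = \sqrt{2}^{\sqrt{2}\cdot\sqrt{2}} = \sqrt{2}^{2} = 2$. A quick floating-point check of that arithmetic is sketched below; it only illustrates the identity numerically and says nothing about which of the two cases of the argument actually holds.

```python
import math

root2 = math.sqrt(2)
alpha = root2 ** root2        # sqrt(2)^sqrt(2), roughly 1.63
beta = root2

value = alpha ** beta         # (sqrt(2)^sqrt(2))^sqrt(2)
print(value)                  # equals 2 up to floating-point rounding
print(math.isclose(value, 2.0, rel_tol=1e-12))
```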
For most students this is their first introduction to a nonconstructive, purely existential proof, and many of them are uncomfortable with it, feeling that there is something not quite right somewhere. Perhaps more germane to what follows is Euclid's proof of the infinitude of the prime numbers. Insofar as it shows the existence of primes which are not actually constructed or exhibited, this proof tends to be greeted with similar incredulity by first-year students.
Nonconstructive, existential proofs are quite common in mathematics, and are not infrequently met in graph theory; but until recently they have been rare in the study of graph algorithms. It seemed almost self-evident that the only way to show that there was a polynomial-time algorithm for a given problem was to construct one. Now all this is changed as a result of a landmark series of papers by Robertson and Seymour [1]-[8]. To see the impact of this recent work let us take a brief look at the history of graph theory algorithms.
2. Stages in the Study of Graph Algorithms
We can distinguish two stages in this study. In the first stage, lasting until around the mid-60's, the principal aim was that of just finding an algorithm for a given problem. Even then, Edmonds [9] had drawn attention to the distinction between efficient (polynomial-time) algorithms and inefficient (exponential or worse) algorithms, but detailed consideration of complexity was not much to the fore.
In the second stage the study of complexity became more important. Researchers strove to find polynomial-time algorithms of low degree, ideally, algorithms that were linear in the number of edges of the graph. The successive steps taken by Hopcroft and coworkers in achieving a linear algorithm for planar isomorphism are a good illustration of this, see [10]-[12].
Suppose we have two algorithms for the same problem. Algorithm 1 runs in time $An^2$ while algorithm 2 runs in time $Bn^3$. Is algorithm 1 better than algorithm 2? The stage 1 answer would be "not necessarily" - not if $A$ is much larger than $B$. If, for example, $An^2$ were less than $Bn^3$ only for $n > 10^9$ say, then for all practical purposes algorithm 2 would be better. The stage 2 answer would more likely be "yes" no matter what the magnitudes of $A$ and $B$. Clearly what has happened here is a shift in emphasis from algorithms as somewhat idealized computer programs for practical problems to algorithms as theoretical concepts, the objects of an abstract theory.
Along with this more abstract approach in stage 2 came the concept of NP-completeness and the riddle of whether NP = P. It became important to answer the question "Does this problem admit a polynomial-time algorithm?" How are we to answer such a question? One way, clearly, is to exhibit a polynomial-time algorithm for the problem in question; but as with the $\alpha^\beta$ problem this may not be the only way. Moreover, if we thus restrict ourselves to "proof by example" we may find that we are seriously limiting the scope of our research. We may be missing something.
What other ways are there of answering such questions? Until recently there were none; but now the work of Robertson and Seymour, depending on the properties of graph minors, provides alternative methods, and can be seen as ushering in a third stage in the study of graph algorithms. To get the flavor of their work we must first consider graph minors.
3. Graph Minors
Definition: $H$ is a minor of $G$ if it can be obtained from a subgraph of $G$ by contracting edges. We write $H \le G$.
This is illustrated in Figure 1 which shows that the wheel on five vertices is a minor of the Petersen graph.
Figure 1: Formation of a graph minor (steps shown: contraction, redrawing).
This "minor ordering" is a partial ordering. A family of (finite) graphs will have elements that are minimal under the ordering $\le$, that is, that have no minors in the family. These are the "minor-minimal elements".
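Edge contraction is easy to carry out on an edge list. The sketch below uses one particular labelling of the Petersen graph (an assumption, since the original figure is not available) and a hypothetical helper `contract`; contracting the five spokes yields $K_5$, which contains the wheel on five vertices as a subgraph. Because edge deletions and contractions can be performed in either order, this is one way to see that the wheel is a minor, though not necessarily the route drawn in Figure 1.

```python
def contract(edges, u, v):
    """Contract the edge uv: merge v into u, dropping loops and parallel edges."""
    merged = set()
    for a, b in edges:
        a = u if a == v else a
        b = u if b == v else b
        if a != b:
            merged.add(frozenset((a, b)))
    return {tuple(sorted(e)) for e in merged}

# Petersen graph: outer 5-cycle 0..4, inner pentagram 5..9, spokes i -- i+5
petersen = set()
for i in range(5):
    petersen.add(tuple(sorted((i, (i + 1) % 5))))          # outer cycle
    petersen.add(tuple(sorted((i + 5, (i + 2) % 5 + 5))))  # inner pentagram
    petersen.add((i, i + 5))                               # spoke

minor = petersen
for i in range(5):
    minor = contract(minor, i, i + 5)                      # contract every spoke

# wheel on five vertices: hub 0 joined to the rim 1-2-3-4-1
wheel = {(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (2, 3), (3, 4), (1, 4)}

print(len(petersen), len(minor))   # 15 10
print(wheel <= minor)              # True
```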
Definition: A family $\mathcal{F}$ of graphs is closed under $\le$ if any minor of a graph in $\mathcal{F}$ is in $\mathcal{F}$, that is, $H \le G$ and $G \in \mathcal{F} \Rightarrow H \in \mathcal{F}$.
Definition: The obstruction set for $\mathcal{F}$ consists of the minor-minimal elements of the complement of $\mathcal{F}$.
A well-known example of this is the family of planar graphs for which the obstruction set is $\{K_5, K_{3,3}\}$, by a variation of Kuratowski's theorem. From these definitions we get the following theorem.
Theorem 1: $G \in \mathcal{F}$ if and only if no minor of $G$ is in the obstruction set for $\mathcal{F}$.
In 1937 Wagner [13] made the following conjecture.
Conjecture: Any set of finite graphs contains only a finite number of minor-minimal elements.
This conjecture has been recently proved by Robertson and Seymour, and is now called the Combinatorial Finite Basis Theorem. Another theorem we shall need is the following.
Theorem 2: (Robertson and Seymour [1]) Given a fixed graph $H$, the problem of determining, for a given graph $G$, whether $H \le G$ can be solved in polynomial time.
Corollary: Suppose $\mathcal{F}$ is a family of graphs closed under $\le$. Then there is a polynomial algorithm to test whether a given graph $G$ is in $\mathcal{F}$.
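The shape of the argument behind the Corollary (test each of finitely many obstructions as a minor) can be sketched in code. The version below is only an illustration: `has_minor` is a hypothetical brute-force test that takes exponential time, standing in for the polynomial-time test whose existence Theorem 2 asserts, and the example family (forests, whose only obstruction is the triangle $K_3$) is chosen here for simplicity and is not taken from the text.

```python
from itertools import product

def connected(vertices, adj):
    """Is the subgraph induced by the (nonempty) vertex set connected?"""
    vertices = set(vertices)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for w in adj[v] & vertices:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == vertices

def has_minor(g_vertices, g_edges, h_vertices, h_edges):
    """Brute force: does G contain H as a minor? Tiny graphs only."""
    g_vertices, h_vertices = list(g_vertices), list(h_vertices)
    adj = {v: set() for v in g_vertices}
    for a, b in g_edges:
        adj[a].add(b)
        adj[b].add(a)
    labels = h_vertices + [None]          # None marks a deleted vertex of G
    for assignment in product(labels, repeat=len(g_vertices)):
        bags = {h: set() for h in h_vertices}
        for v, h in zip(g_vertices, assignment):
            if h is not None:
                bags[h].add(v)
        # each vertex of H needs a nonempty connected set to be contracted
        if any(not bag or not connected(bag, adj) for bag in bags.values()):
            continue
        # each edge of H needs an edge of G between the corresponding sets
        if all(any(w in adj[v] for v in bags[a] for w in bags[b])
               for a, b in h_edges):
            return True
    return False

def in_family(g_vertices, g_edges, obstructions):
    """Theorem 1: G is in the family iff no obstruction is a minor of G."""
    return not any(has_minor(g_vertices, g_edges, hv, he)
                   for hv, he in obstructions)

triangle = ([0, 1, 2], [(0, 1), (1, 2), (0, 2)])
forest_obstructions = [triangle]          # assumed example: forests

path = ([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3)])
cycle = ([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3), (3, 0)])
print(in_family(*path, forest_obstructions))   # True
print(in_family(*cycle, forest_obstructions))  # False
```

The point of the Corollary is that such a test exists with polynomial running time whenever the obstruction set is finite, even when, unlike in this toy example, the obstruction set itself is not known.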
Figure 2: An inner-planar graph.
The following are some examples of applications of these results.
(1) The family of planar graphs is closed under $\le$. Hence there is a polynomial algorithm to test whether a graph is planar. (Of course, we knew that already.)
(2) The Disk Dimension problem. An outer-planar graph is usually defined as one which can be drawn in the plane so that all its vertices lie on a circle and all its edges are chords of the circle, with no two edges intersecting. An example is shown in Figure 2. It would perhaps be better to call this an inner-planar imbedding of the graph, and reserve the term outer-planar for the opposite concept, shown in Figure 3, where the vertices are on
Figure 3 : An outer planar graph. the boundary of a disk and the edges all lie in the remainder of the plane. The latter definition is readily generalized to the case of more than one disk: each vertex has to lie on the boundary of some disk, and the edges, which must not intersect each other, are drawn in the surface obtained by removing the disks from the plane. The smallest number of disks for which this is possible is called the disk dimension of the graph. Figure 4 shows a graph with disk dimension 3.
Figure 4: A graph with disk dimension 3.
Consider the problem of determining whether a graph G has disk dimension at most k. It is readily verified that the family of all graphs with disk dimension at most k is closed under the minor ordering. Therefore, by the corollary to Theorem 2, there exists a polynomial algorithm for this problem.
(Note that this tells us only that an algorithm exists, but not what it is. In point of fact an algorithm for this problem has been found by Bienstock and Monma in [14].)
(3) Problem: Can a graph G be drawn in 3-space in such a way that no cycle of G forms a knotted curve? Note that a planar graph clearly can be so drawn, but a result of Conway and Gordon [15] shows that K_7 cannot. The family of graphs that can be so drawn is closed under the minor ordering. Hence, again, there must exist a polynomial-time algorithm for recognizing these graphs. Nevertheless, as far as I know, no polynomial-time algorithm is known for this problem, and little progress has been made towards discovering one.
4. The Bad News
The results cited in the last section may appear to be too good to be true. They are true, but not as good as one might at first think, for a variety of reasons.
The proof of the Combinatorial Finite Basis Theorem is a pure existence proof. It gives no indication of how to find minor-minimal graphs, or even how many there are. Thus, for example, it is known from this theorem that there is a “Kuratowski-like” theorem for imbeddability of graphs in surfaces of any given genus. However, in all but the simplest cases the set of forbidden minors is not known.
The degree of the polynomial-time algorithm may not be known, or may be very large.
Even if the polynomial is of low degree, the multiplicative constant may well be impossibly large.
In [2] Robertson and Seymour discussed the disjoint paths problem, namely: Given a planar graph and k pairs of vertices, do there exist k disjoint paths joining these pairs? They showed the existence of a polynomial algorithm for this problem, but stated that the degree of the polynomial was of the form
I am indebted to one of the referees for the information that, more recently, Robertson and Seymour [16] have shown that there is an algorithm of degree 3 for this problem. In the present context the degrees of the algorithms are usually small, typically 2 or 3. Thus the problem of imbedding a graph in 3-space without a knotted cycle can be shown to be testable in O(n^3) time. But since no actual algorithm is forthcoming, it might well be that the multiplicative constant is enormous. We shall return to the Robertson-Seymour results later, but first we digress to a somewhat different topic.
5. A Graph Property
Let us say that the graph G has property A_m if, for any 2m vertices of G, u_1, u_2, ..., u_m, v_1, v_2, ..., v_m, there is a vertex of G adjacent to every u_i and to no v_i. For m = 1 the circuit C_5 provides a simple example of a graph with property A_1, but even for the next case, m = 2, examples are not easy to find. Let us consider a slightly more general property. Let us say that the graph G has property A_{m,n} if, for any m + n vertices of G, u_1, u_2, ..., u_m, v_1, v_2, ..., v_n, there is a vertex of G adjacent to every u_i and to no v_j. This gives us a half-way house between A_1 and A_2, namely A_{1,2}, for which examples might be easier to find. In fact, graphs with the property A_{1,n} have been investigated by Harary and Exoo [17]. They showed that the smallest graphs with property A_{1,n} for n ≤ 6 are those shown in Table 1.
Table 1:
n   Name                      Description   Number of vertices
2   Petersen graph            (3,5)-cage    10
3   Robertson’s graph         (4,5)-cage    19
4   Wegner’s graph            (5,5)-cage    30
5   O’Keefe and Wong graph    (6,5)-cage    40
6   Hoffman-Singleton graph   (7,5)-cage    50
Of these, the Petersen graph is well-known (and is shown in Figure 1), Robertson’s graph is shown in Figure 5, and Wegner’s graph is shown in Figure 6; the other two are too dense to be easily depicted.
Figure 5: Robertson’s Graph.
On first hearing of these properties, and the sets of graphs which enjoy them, I tried to construct a small graph having property A_2. Not succeeding, I mentioned the problem to two colleagues, Bill Martin and Chris Godsil, who quickly put me on the right track by suggesting that the key lay with the Paley graphs. This was indeed the case. In fact, all the necessary information was at hand in Random Graphs, by Bollobás [18], in the section dealing with Paley graphs. If q is a prime of the form 4k + 1, the Paley graph P_q is defined as the graph with vertex set {0, 1, 2, ..., q − 1} and for which (i, j) is an edge if and only if i − j is a quadratic residue mod q. Now some results in Bollobás’ book show that any sufficiently large Paley graph will have property A_m. They also indicate roughly how large the graph will be. By means of a simple computer program I was able to determine that the smallest Paley graph having property A_2 was P_61, shown in Figure 7. This of course does not preclude the possibility of there being smaller graphs with property A_2.
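The "simple computer program" is not reproduced in the text. The following is a minimal sketch of how property A_{m,n} (and hence A_m = A_{m,m}) might be checked by exhaustive search. The function names are illustrative. The sketch assumes m ≥ 1, that the chosen vertices are distinct, and that the witnessing vertex lies outside the chosen sets; the text does not spell these points out.

```python
from itertools import combinations


def paley_graph(q):
    """Adjacency sets of the Paley graph P_q; q is assumed to be a prime 4k + 1."""
    residues = {(x * x) % q for x in range(1, q)}  # nonzero quadratic residues
    return {i: {j for j in range(q) if (i - j) % q in residues} for i in range(q)}


def petersen_graph():
    """Outer 5-cycle 0..4, inner pentagram 5..9, spokes i -- i + 5."""
    adj = {v: set() for v in range(10)}
    for i in range(5):
        for a, b in ((i, (i + 1) % 5), (i, i + 5), (5 + i, 5 + (i + 2) % 5)):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def has_property(adj, m, n):
    """Check A_{m,n} by brute force (assumes m >= 1).

    For every set U of m vertices and every disjoint set V of n vertices,
    look for a vertex outside U and V adjacent to all of U and none of V.
    """
    vertices = list(adj)
    for us in combinations(vertices, m):
        # Common neighbors of U; no member of U can appear (no loops).
        common = set.intersection(*(adj[u] for u in us))
        rest = [x for x in vertices if x not in us]
        for vs in combinations(rest, n):
            # Exclude V itself and everything adjacent to a member of V.
            blocked = set(vs).union(*(adj[v] for v in vs))
            if not common - blocked:
                return False
    return True
```

Under these assumptions, `has_property(paley_graph(5), 1, 1)` checks the C_5 example and `has_property(petersen_graph(), 1, 2)` checks the first row of Table 1. The call `has_property(paley_graph(61), 2, 2)` corresponds to the P_61 computation described above, but it enumerates roughly three million pairs (U, V) and is slow in pure Python.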
What is the relevance of this digression? It is that, here again, we have something that is contrary to what we expect from our experience of graphs. The property A_m seems to be an extremely stringent property. The known graphs with the property are large and very special, and finding other graphs with the property seems to be a difficult task. Thus it is somewhat of a surprise to come across the following theorem.
Figure 6: Wegner’s Graph.
Theorem 3: For any m, almost all graphs have property A_m, in the sense that, for graphs on n vertices, the ratio of the number with property A_m to the total number tends to 1 as n tends to infinity.
This theorem was published by Fagin in [19], but goes back in essence to some early work by Erdős. For an introduction to this theorem and many related matters see the paper by Blass and Harary [20], which presents the results in terms more familiar to graph theorists. Now those graph theorists who work with random graphs or in asymptotic enumeration may well take this theorem in their stride, but I suspect that, to the majority, this theorem will be quite counterintuitive. This emphasizes how limited is our acquaintance with graphs in general, and the extent to which our intuition is based on experience with comparatively small graphs only. In much the same way our intuition tends to be conditioned by those algorithms with which we are familiar. In the next section I explore the consequences of this by means of a series of questions (to which I do not claim to have the answers!).
6. Questions and Thoughts
There is a Chinese proverb which succinctly describes the position in which graph theorists now find themselves.
(A frog at the bottom of a well doesn’t see much of the sky and the sun.)
I believe that many present day graph theorists, like the frog in the well, have a restricted field of vision. The questions given below are intended to heighten awareness of this phenomenon, and may perhaps serve as points for discussion.
Figure 7:
Question 1: Our immediate knowledge of graphs is confined to comparatively small graphs. Is it not therefore surprising that so many conjectures, based on such imperfect evidence, turn out to be true for all graphs?
A Thought: Maybe it is not surprising. Perhaps it merely means that our knowledge of theorems is limited to those we can guess from small examples.
Horrible Thought: Maybe all the really good theorems are not guessable in this way!
Question 2: Wouldn’t it be nice if NP = P - or would it? Read on.
Question 3: Would you be happy if someone found a polynomial algorithm for the satisfiability problem?
Question 4: Would you still be happy if the degree of the polynomial turned out to be
Question 5: Suppose someone proved that an O(n^3)-time solvable problem X was NP-complete. Would you be glad?
Question 6: Suppose that, for every previously known NP-complete problem, the algorithm converting it to an instance of X had enormous degree. What then?
Question 7: (Johnson’s nightmare) How would you react to a nonconstructive proof that NP = P? (See [21].)
7. Summary
The history of graph theory algorithms has already displayed a trend towards greater abstraction, and this trend can be expected to continue at an accelerating pace. The theorems of Robertson and Seymour point to a coming theory of graph algorithms that transcends our ability to construct explicit examples. Not everyone will be happy about these developments, and that is not surprising; it is comfortable to sit at the bottom of the well surrounded by our low-degree polynomial algorithms that we can program on our computers. But that is not the way the subject is going; whether we like it or not, we must climb out of the well and look at the stars!
Acknowledgement
I am greatly indebted to two papers by M.R. Fellows and M.A. Langston [22] [23], which sparked my interest in this topic and on which much of the preceding material is based.
References
[1] N. Robertson and P.D. Seymour; Graph minors - a survey, Surveys in Combinatorics, I. Anderson (editor), Cambridge University Press, 153-171 (1985).
[2] N. Robertson and P.D. Seymour; Disjoint paths - a survey, SIAM J. Alg. Disc. Math., 6, 300-306 (1985).
[3] N. Robertson and P.D. Seymour; Graph minors I. Excluding a forest, J. Combinatorial Theory, B35, 39-61 (1983).
[4] N. Robertson and P.D. Seymour; Graph minors II. Algorithmic aspects of tree width, J. of Algorithms, 7, 309-322 (1986).
[5] N. Robertson and P.D. Seymour; Graph minors III. Planar tree width, J. Combinatorial Theory, B36, 49-64 (1984).
[6] N. Robertson and P.D. Seymour; Graph minors V. Excluding a planar graph, J. Combinatorial Theory, B41, 92-114 (1986).
[7] N. Robertson and P.D. Seymour; Graph minors VI. Disjoint paths across a disk, J. Combinatorial Theory, B41, 115-138 (1986).
[8] N. Robertson and P.D. Seymour; Graph minors VII. Disjoint paths on a surface, J. Combinatorial Theory, B45, 212-254 (1988).
[9] J. Edmonds; Paths, trees and flowers, Canad. J. Math., 17, 449-467 (1965).
[10] J.E. Hopcroft and R.E. Tarjan; A V^2 algorithm for determining isomorphism of planar graphs, Inf. Proc. Lett., 1, 32-34 (1971).
[11] J.E. Hopcroft and R.E. Tarjan; Isomorphism of planar graphs (working paper), Complexity of Computer Computations, R.E. Miller and J.W. Thatcher (editors), Plenum Press, New York, 131-152 (1972).
[12] J.E. Hopcroft and J.K. Wong; Linear time algorithm for isomorphism of planar graphs (Preliminary report), Proc. 6th Symposium on the Theory of Computing, 172-184 (1971).
[13] K. Wagner; Über eine Eigenschaft der ebenen Komplexe, Math. Ann., 114, 570-590 (1937).
[14] D. Bienstock and C.L. Monma; On the complexity of covering vertices by faces in a planar graph, SIAM J. Comput., 17, 53-76 (1988).
[15] J.H. Conway and C.McA. Gordon; Knots and links in spatial graphs, J. Graph Theory, 7, 445-453 (1983).
[16] N. Robertson and P.D. Seymour; Graph minors XIII, J. Combinatorial Theory (submitted).
[17] G. Exoo and F. Harary; The smallest graphs with certain adjacency properties, Discrete Math., 29, 25-32 (1980).
[18] B. Bollobás; Random Graphs, Academic Press, London, Toronto (1985).
[19] R. Fagin; Probabilities on finite models, J. Symbolic Logic, 41, 50-58 (1976).
[20] A. Blass and F. Harary; Properties of almost all graphs and complexes, J. Graph Theory, 3, 225-240 (1979).
[21] D.S. Johnson; The NP-completeness column: An ongoing guide, J. of Algorithms, 8, 285-303 (1987).
[22] M.R. Fellows and M.A. Langston; Nonconstructive advances in polynomial-time complexity, Inf. Proc. Lett., 26, 157-162 (1987/88).
[23] M.R. Fellows and M.A. Langston; Nonconstructive tools for proving polynomial-time decidability, J.A.C.M., 35, 727-739 (1988).
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 211-248 (1993) © 1993 Elsevier Science Publishers B.V. All rights reserved.
THE STATE OF THE THREE COLOR PROBLEM
Richard STEINBERG
AT&T Bell Laboratories, Murray Hill, New Jersey, U.S.A.
Abstract
The Three Color Problem is: Under what conditions can the regions of a planar map be colored in three colors so that no two regions with a common boundary have the same color? This paper describes the origin of the Three Color Problem and virtually all the major results and conjectures extant in the literature.
1. Introduction
Under what conditions can the regions of a planar map be colored in three colors so that no two regions with a common boundary have the same color? This is the Three Color Problem. Like its more famous sibling [1] [2], it has led to some wonderfully elegant graph theory. However, while the Four Color Problem was laid to rest fifteen years ago [3] [4] (or was perhaps thrown by the computer into an unhappy limbo¹), the Three Color Problem is very much alive, replete with an assortment of established results and an abundance of open problems. Since 1-coloring is trivial, and both planar and general 2-colorable graphs are easily characterized (no odd cycles), 3-coloring is the first significant graph coloring problem and, on the plane, the only unqualified graph coloring problem remaining.
The Three Color Problem was considered as a distinct topic as early as 1958 by Herbert Grötzsch, who provided an important new theorem which he discussed in the context of other 3-coloring results.² Branko Grünbaum spurred the development of 3-coloring in 1963 by improving on Grötzsch’s result and raising an additional important issue, both of which have led to much activity in the field. The Three Color Problem was canonized in 1967 by Oystein Ore, who devoted an entire chapter (albeit his briefest) to the topic in his book on the Four Color Problem [1] (chapter 13). V.A. Aksionov provided a rigorous treatment of Grünbaum’s Theorem in 1974. Aksionov and L.S. Mel’nikov co-authored expositions of 3-coloring in 1978 and 1980, focusing on some specific developments arising out of Grötzsch’s Theorem and its extensions. Bjarne Toft [8] devoted several sections of his 1987 booklet on graph coloring problems to 3-coloring.
This is not to say that the concept is only a few decades old. As we show in §2, the history of the Three Color Problem goes back to the earliest history of the Four Color Problem.
1. In the preface to their book on the Four Color Problem, Saaty and Kainen [2] have these cautionary words: “The sophisticated technique of Haken and Appel appears to have succeeded in proving that the 4CC [Four Color Conjecture] is true. We say ‘appears to have succeeded’ since their proof involves the computer-facilitated analysis of 1936 special cases, and will thus require several years for thorough checking. Even then, there will probably persist some lingering doubt among many scientists because of the elaborateness of the argument.” Toft has asked (see Problem 1 in [8]): “Is there a short proof to show that the four color problem is a finite problem; that is, is it possible by a short argument to exhibit a number N such that, if there is a 5-chromatic planar graph there is one of at most N vertices?”
2. Mention should also be made of the elementary book Mathematical Conversations published in Russia in 1952 [6]. The first of the book’s three sections is entitled Map Coloring Problems. This section was republished as a booklet Multicolor Problems [7] in the U.S. in 1963 and contains a chapter on coloring with three colors.
This paper describes the origin of the Three Color Problem and virtually all the major results and conjectures extant in the literature. In §2 we discuss the origin of the Three Color Problem; §3 provides several basic results; §4 covers the topic of triangle-free graphs and chromatic number, which was first investigated by Blanche Descartes. Grötzsch’s Theorem is presented in §5 and generalizations due to Grünbaum and Aksionov are presented in §6. Closely related to §§5 and 6 are §7, which considers how the distance between the triangles (3-cycles) in a planar graph affects its 3-colorability, and §8, which considers uniquely 3-colorable planar graphs. In §9 is a brief discussion of the question of whether restricting the 4-cycles and 5-cycles in a planar graph will ensure 3-colorability. Algorithms and complexity issues are covered in §10. In §11 two geometric 3-coloring problems are described, one planar, one general; §12 discusses connections between 4-coloring and 3-coloring; §13 presents the question of the number of 3-colorings a graph may have. In §14 we describe two applications: the first, a very specific use of planar 3-colorability to elegantly solve a geometric problem; the second, a novel application of 3-colorable graphs to perform logical computations. Tutte’s idea of generalizing Grötzsch’s Theorem to general graphs by appealing to the theory of nowhere-zero flows is examined in §15. The paper concludes, in §16, with other directions in 3-coloring. An appendix outlines the proof of Grünbaum’s Theorem.
Our notation follows that of Bondy and Murty [9]. We will use the term cubic to describe maps and graphs which are regular of degree 3. Unless otherwise specified, a map will be assumed to be planar. For purposes of clarification, we will occasionally use the term general graph to indicate a graph that is not necessarily planar.
2. The Origin of the Three Color Problem
Probably the first mention of coloring the regions of a map in three colors is in an 1879 paper of Arthur Cayley [10] on the Four Color Problem. In this paper, Cayley makes the exceedingly simple observation that if a circle is divided into a number of sectors, four colors are required for the resulting map (which includes the surrounding area as a region) if the number of sectors is odd, but only three colors are required if the number of sectors is even. Cayley also shows that if all cubic maps are face colorable in four colors then so are all maps.
Cayley’s paper was probably what inspired another author later that year to discover the first three color theorem. However, the theorem appeared in the most notorious paper in the history of graph theory: the 1879 work of A.B. Kempe [11] that contains the fallacious proof of the Four Color Theorem. Kempe’s three color theorem is that, for cubic maps, if “every district is in contact with an even number of others along every circuit formed by its boundaries [the map’s edges], three colours will suffice to colour [the faces of] the map.” Kempe’s language is somewhat unclear - he was a barrister by profession [12] - but his 3-coloring condition is correct if we understand it to mean that every region is to be adjacent to an even number of others. Undoubtedly Kempe was aware of the trivial converse: a cubic map is face colorable in 3 colors only if every face borders an even number of others.
In 1890, P.J. Heawood [13] disposed of Kempe’s fallacious Four Color Theorem proof, but proved the Five Color Theorem. Heawood - working as Kempe did with cubic maps - went on to say: “There is another simpler proposition not yet noticed [sic], that a map can be [face] coloured with but 3 colors if all its divisions are touched by an even number of others. The proof of this is not difficult, but it appears to shed no light on the main proposition ...”
Heawood’s Second Paper
Heawood stated the proposition again in his follow-up paper of 1898 [14], by pointing out that a “special case” of map coloring which was “before noticed is that where all the divisions are ‘even’, i.e. touched by an even number of others: such a map can always be coloured with three colours only, two occurring alternately round each division (fig. 5).” Heawood’s figure 5 (see Plate 3 in [14]) is reproduced here as Figure 1, where r, y, and b stand for red, yellow, and blue. Almost certainly, this is the first map ever published to illustrate 3-coloring.
Figure 1: The first 3-colored map.
Heawood’s 1898 paper contained other basic 3-coloring results. (Although most of these are now well-known, Heawood’s pioneering role in 3-coloring seems to have been largely forgotten.) One such result is a proof of a theorem from 1880 due to P.G. Tait [15]: a cubic map is face colorable in four colors if and only if its edges are colorable in three colors so that all three colors are represented at each vertex. Also contained in Heawood’s second paper is a demonstration that the Four Color Problem is a special case of the Three Color Problem: “If we replace the boundary segments of a map by quadrilateral divisions, as may be done by drawing lines from their extremities to meet in the centres of the divisions of the original map... it will be seen that the distinguishing of these segments by three different kinds of lines, which we have proved involves the map theorem, is really a particular case of the [face] colouring in three colours of a map with ‘multiple’ corner points [non-cubic vertices], at which none but adjacent divisions are supposed to have contact.” Heawood also says: “With but three colours, however, it is no such simple matter to deduce the conditions for the map with multiple points, from the particular case of the ordinary [cubic] one which can be done with that number.”
Yet another result: “We may first observe that for any of the possible arrangements of three colours round a multiple point, so that no two in succession are alike, there is some appropriate ‘extension’ of the point into a string of ordinary ones for which the colours will still be available.” (The procedure he describes in which non-cubic vertices are “extended” into cubic vertices is presented in dual form in Theorem 3.3 below.) Heawood realized that his observation provides necessary and sufficient conditions for a planar map to be face colorable in three colors: “If then a ‘complex’ [non-cubic] map can be completely [face] coloured with three colours, it will be possible by a due resolution of its multiple points to reduce it to an ordinary one thus coloured, and so to one with all its divisions even, and the converse necessarily holds.” Although he gave a clear and concise proof of this basic result, it soon passed into the folklore and has rarely been properly referenced. The paper also provides a formulation of the Three Color Problem as a system of linear congruences modulo 3. The reader is urged to consult this remarkable work for more details.
The Three Color Theorem
What Heawood’s 1898 paper does not contain is a proof of the theorem for even cubic maps that he had announced in 1890. Who in fact first proved the result is difficult to say. Did Kempe have a proof? Perhaps. Did Heawood? Gabriel Dirac seemed to think so. In his extended obituary of Heawood, Dirac [16] proffers: “Most of the assertions stated in [Heawood’s 1898 paper] are not actually proved, only made plausible, but they have since been proved rigorously by other writers, which indicates that Heawood was in possession of all the necessary proofs but did not choose to include them.”
The new century did not start out well for 3-coloring. In October 1900, W. Ahrens completed the manuscript of his book Mathematische Unterhaltungen und Spiele [17] which contained an incredible misstatement about face coloring. The erroneous assertion appeared in the first edition of the book in 1901 and re-appeared uncorrected in later editions: “Although four colors are required as a general rule, there are nonetheless many special cases in which fewer will do. For example, three colors suffice, as Tait observed, in all cases [sic] of regional division in which only three regions meet at each junction ...” (This is from the 1918 second edition, volume 2, p. 215.) Ahrens repeats his confusion about Tait’s theorem in a footnote a few pages later.
In 1929, the proposition was stated once again - this time correctly and with the converse explicitly included - by A. Sainte-Laguë [18]. Although he did not provide a proof, Sainte-Laguë explains that the result can be established “par récurrence”; he references an undated working paper. The year 1936 saw the appearance of Dénes König’s Theorie der endlichen und unendlichen Graphen [19]. In this classic book, König points out Ahrens’s error but neglects to give a correct statement of the proposition, much less the proposition’s proof.
Finally, in 1939 Philip Franklin [20] published a correct statement of the proposition, including its converse, together with a bona fide proof. Franklin called the result the “Three Color Theorem.”³
3. Basic Results
Three Interrelated Results
We now formally present three basic, interrelated 3-coloring results in dual (vertex coloring) form. Theorem 3.1 is a special case of a sufficiency condition for k-coloring planar graphs given by Oystein Ore⁴ [1] [Remark 3.1]. Theorem 3.2 is the Three Color Theorem [20]. Theorem 3.3 is Heawood’s [14] necessary and sufficient condition for planar 3-colorability. (This last result has been independently discovered and proved several times as recently as the 1970s; for example, [21]-[24].)
Theorem 3.1: Let G be a graph embedded in the plane where the number of edges in the boundary of each face is a multiple of three. Then G is vertex colorable in three colors if every vertex has even degree.
3. Franklin shows the Three Color Theorem follows from the “Two Color Theorem.” The Two Color Theorem was also named by Franklin in the same paper [20]. It had been stated without ambiguity by Kempe in his 1879 paper [11]: “If an even number of boundaries meet at every point of concourse, two colours will suffice. This species of map is that which is made by drawing any number of continuous lines crossing each other and themselves any number of times.”
A triangulation is a graph embedded in the plane in which the boundary of every face is a triangle (3-cycle); a triangulation is even if every vertex has even degree. (As we shall see, triangles are a recurring theme in 3-colorability.) Theorem 3.2 (Three Color Theorem): A triangulation is vertex colorable in three colors if and only if it is even.
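As a small sanity check of Theorem 3.2, the sketch below tests two triangulations by exhaustive search: the octahedron, in which every vertex has degree 4, and K_4, in which every vertex has degree 3. The function names and the adjacency-set representation are illustrative choices, not taken from the text, and the search is only intended for very small graphs.

```python
from itertools import product


def is_3_colorable(adj):
    """Brute force: try every assignment of three colors to the vertices."""
    vertices = list(adj)
    for colors in product(range(3), repeat=len(vertices)):
        coloring = dict(zip(vertices, colors))
        if all(coloring[u] != coloring[v] for u in adj for v in adj[u]):
            return True
    return False


def all_degrees_even(adj):
    """True if the graph is 'even' in the sense used in the text."""
    return all(len(neighbors) % 2 == 0 for neighbors in adj.values())


# Octahedron: vertex i is joined to every vertex except itself and its
# antipode (i + 3) mod 6.  All eight faces are triangles.
octahedron = {i: {j for j in range(6) if j != i and j != (i + 3) % 6}
              for i in range(6)}

# K_4 drawn in the plane is also a triangulation, but every degree is odd.
k4 = {i: {j for j in range(4) if j != i} for i in range(4)}

for name, graph in (("octahedron", octahedron), ("K_4", k4)):
    print(name, all_degrees_even(graph), is_3_colorable(graph))
```

In both cases the two tests agree, as the theorem requires: the even triangulation is 3-colorable and the odd one is not.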
In the early 1960s, L.I. Golovina and I.M. Yaglom [25] [26] provided an inductive proof of Theorem 3.2 in terms of face colorings. We defer the proofs of Theorems 3.1 and 3.2 until §15. Herbert Fleischner discusses various formulations of the Three Color Theorem in [27].
Theorem 3.3: A planar graph is vertex colorable in three colors if and only if it is a subgraph of an even triangulation.
Proof: Sufficiency follows from Theorem 3.2. For necessity, consider a graph embedded in the plane with a given 3-coloring. First, 1-gons and 2-gons can be triangulated by the addition of a new vertex in the interior of the face joined by one or two edges, respectively, to the one or two vertices in the boundary, and given a different color. Next, let F be a k-gon where k ≥ 4. Either the vertices of F alternate in two colors, in which case join all the vertices of F with a new vertex in the interior of F colored the third color, or there are three consecutive vertices in F colored in three different colors, in which case join the first and the third vertex by an edge; repetition of this procedure will eventually result in a triangulation that is vertex-colored in three colors. By Theorem 3.2 it must be even.
This proof of Theorem 3.3, based on the presentation in [28], is Heawood’s proof [14] dualized to vertex coloring.⁵ Unfortunately, the theorem is not very useful. As we shall see in Section 10, there is unlikely ever to be a good characterization of 3-colorability, planar or general, since both problems are NP-complete.
Outerplanar and Cubic Graphs
A graph is outerplanar if it can be embedded in the plane so that every vertex lies in the boundary of the same face. The following easily proved result was published (in dual form) in 1905 by a Hungarian student named Dénes König [31]. It was his first paper in graph theory.
4. Ore [1] also provides a few minor lemmas regarding 3-coloring, primarily in Chapter 13.
5. See also Lemma 8 of Fisk [29]. In a series of papers and a book, Fisk develops a sophisticated theory of colorings from the viewpoint of algebraic topology. See also Chapters 3 and 6 in [30] on the topic of the properties of 3-colorings.
Theorem 3.4: Every outerplanar graph is vertex colorable in three colors.
The next 3-coloring result did not appear until more than thirty-five years after König’s paper. It is a specific case of a theorem about k-coloring general graphs.
Theorem 3.5 (Brooks’ Theorem, cubic case): A connected graph is vertex colorable in three colors if its maximum vertex degree is 3, unless it is K_4.
In general, the 1941 theorem of R.L. Brooks [32] states that a connected graph with maximum degree k is vertex colorable in k colors unless it is K_{k+1} or an odd cycle. By making use of a lemma of John B. Kelly and L.M. Kelly [33], Gabriel Dirac [34] in 1957 provided a short proof for the case k = 3. László Lovász [35] gave a short proof for general k in 1975. Dirac [34] points out that Brooks’ Theorem is the first general theorem connecting the structure of a graph with its chromatic number, and that it is equivalent to the following: If a k-critical graph, k ≥ 4, on n vertices and m edges is not a complete graph, then 2m ≥ (k − 1)n + 1. Dirac strengthened this result by proving that, under the same assumptions, 2m ≥ (k − 1)n + k − 3. He showed that the result is best possible but can be further refined if there is information about the lack of smaller cliques in the graph.
4. Triangle-free Graphs and Chromatic Number
Generally speaking, a graph with a large number of triangles is likely to contain K_4 and thus not be vertex 3-colorable. Can we find a graph which is not vertex 3-colorable but which contains no 3-cycles? Blanche Descartes posed the following more difficult problem in Eureka in April 1947 [36]:
Find a network in which it is impossible to colour the points in three colors so that no two points of the same colour are joined, and which contains no circuit of less than six lines.
Her solution was published in Eureka in March 1948 [37]. Heptagonize a set of seven vertices {v_1, v_2, ..., v_7} by the addition of a 7-cycle v'_1, v'_2, ..., v'_7, and seven more edges v_1v'_1, v_2v'_2, ..., v_7v'_7. The resulting graph clearly cannot be 3-colored with v_1, v_2, ..., v_7 all receiving the same color. Now, take a set M of nineteen vertices and let G be the graph obtained from M by heptagonizing every subset of seven vertices. Clearly, G contains no cycle of length less than 6. If we color the nineteen vertices of M in three colors, at least one set of seven must all have the same color. Hence G is not 3-colorable.
Do there exist graphs without triangles with arbitrarily high chromatic number? This was indeed shown to be the case in 1949 by Zykov [38] (see Theorem 8). Let G_k be any triangle-free k-chromatic graph. Take k copies of G_k; label them G_{k1}, G_{k2}, ..., G_{ki}, ..., G_{kk}. For every possible k-tuple of vertices t = (v_1, v_2, ..., v_i, ..., v_k) where v_i ∈ G_{ki}, add a new vertex v_t as well as k edges v_tv_1, v_tv_2, ..., v_tv_k. Clearly, the resulting graph G_{k+1} is triangle-free. Since each G_{ki} uses all k colors, at least one of the new vertices v_t must be adjacent to a set of vertices using all k colors. Hence G_{k+1} is (k+1)-chromatic.
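Zykov's construction can be carried out mechanically on small examples. The following is a minimal sketch, assuming graphs are stored as dictionaries of adjacency sets. The function names and the vertex labels (copy index, original vertex) are hypothetical conveniences, and the coloring test is a plain backtracking search suitable only for small graphs.

```python
from itertools import product


def zykov_step(adj, k):
    """Take k disjoint copies of the graph and add one new vertex for every
    k-tuple containing one vertex from each copy, joined to that k-tuple."""
    new_adj = {}
    for c in range(k):
        for v, nbrs in adj.items():
            new_adj[(c, v)] = {(c, w) for w in nbrs}
    for choice in product(list(adj), repeat=k):
        apex = ("t", choice)  # the new vertex v_t for the tuple t = choice
        new_adj[apex] = {(c, v) for c, v in enumerate(choice)}
        for c, v in enumerate(choice):
            new_adj[(c, v)].add(apex)
    return new_adj


def is_triangle_free(adj):
    """No edge uv may have a common neighbor."""
    return not any(adj[u] & adj[v] for u in adj for v in adj[u])


def is_k_colorable(adj, k):
    """Backtracking search for a proper coloring with k colors."""
    vertices = list(adj)
    color = {}

    def extend(idx):
        if idx == len(vertices):
            return True
        v = vertices[idx]
        for c in range(k):
            if all(color.get(w) != c for w in adj[v]):
                color[v] = c
                if extend(idx + 1):
                    return True
                del color[v]
        return False

    return extend(0)


# Start from G_2 = K_2 (triangle-free and 2-chromatic) and build G_3.
k2 = {0: {1}, 1: {0}}
g3 = zykov_step(k2, 2)
print(len(g3), is_triangle_free(g3), is_k_colorable(g3, 2), is_k_colorable(g3, 3))
```

Starting from K_2, one step yields a graph on 8 vertices. The next step, applied with k = 3, would already produce 3 · 8 + 8^3 = 536 vertices, which illustrates how quickly the construction grows.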
In 1953, Peter Ungar [39] independently proposed Zykov’s result as a problem in the American Mathematical Monthly. Ungar’s unpublished solution was identical to Zykov’s, although Ungar obtained it independently [40]. In 1954 the Monthly published a solution given by Blanche Descartes [41]; it generalized her earlier result by constructing graphs free of 3-, 4-, and 5-cycles with arbitrarily high chromatic number. Let G_k be a k-chromatic graph on n vertices which has no 3-, 4-, or 5-cycles. Take C(kn − k + 1, n) copies of G_k, where C(a, b) denotes the binomial coefficient, and add a set M of kn − k + 1 extra vertices. Set up a 1-1 correspondence between the copies of G_k and the sets of vertices of size n in M. Join the n vertices of each copy of G_k by n new edges, in one-to-one fashion, to the members of the corresponding set of n vertices in M. The resulting graph G_{k+1} is not k-colorable and has no cycles of length 3, 4, or 5. Although G_{k+1} is not necessarily (k+1)-chromatic, edges can be removed to obtain a (k+1)-chromatic graph. Virtually the same construction was discovered independently that same year, 1954, by the team of John B. Kelly and L.M. Kelly
P31. In 1955, yet another method of generating triangle-free graphs of arbitrarily high chromatic number was provided by Jan Mycielski [42]. Mycielski's construction is more parsimonious than that of Zykov and, if we are only interested in forbidding 3-cycles, than that of Descartes as well. Let G_k be a triangle-free k-chromatic graph with vertices v_1, v_2, ..., v_n. Add n+1 new vertices v'_1, v'_2, ..., v'_n and v. For each i, join v'_i by edges to each neighbor of v_i and to v. Clearly, the resulting graph G_{k+1} is triangle-free and (k+1)-colorable. Suppose G_{k+1} were to have a k-coloring. Assuming v is colored k, then no v'_i is colored k, and we can recolor each vertex v_i of color k with the color assigned to v'_i. This results in a (k−1)-coloring of G_k, a contradiction. Hence G_{k+1} is (k+1)-chromatic. If we take G_3 to be the 5-cycle, then the graph G_4 generated by Mycielski's method is the unique smallest 4-chromatic triangle-free graph (see Chvátal [43]). This graph is usually known as the Mycielski graph (Figure 2).
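Mycielski's construction can be sketched in a few lines of Python. The numbering below (original vertices 0..n−1, the new vertex v'_i as n + i, and the extra vertex v as 2n) and the helper names are assumptions made for illustration. Applied to the 5-cycle, the sketch reproduces the parameters of the Mycielski graph.

```python
def mycielskian(n, edges):
    """Mycielski's step: 0..n-1 are the v_i, n..2n-1 the v'_i, and 2n is v."""
    new_edges = list(edges)
    for u, w in edges:
        new_edges.append((n + u, w))  # v'_u is joined to the neighbor w of v_u
        new_edges.append((n + w, u))  # v'_w is joined to the neighbor u of v_w
    new_edges += [(n + i, 2 * n) for i in range(n)]  # every v'_i is joined to v
    return 2 * n + 1, new_edges

def is_k_colorable(n, edges, k):
    """Backtracking test for a proper coloring with colors 0..k-1."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    color = [-1] * n

    def extend(v):
        if v == n:
            return True
        for c in range(k):
            if all(color[w] != c for w in adj[v]):
                color[v] = c
                if extend(v + 1):
                    return True
        color[v] = -1
        return False

    return extend(0)

def has_triangle(n, edges):
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return any(adj[u] & adj[v] for u, v in edges)

c5 = [(i, (i + 1) % 5) for i in range(5)]  # G_3: the 5-cycle
n4, e4 = mycielskian(5, c5)
print(n4, len(e4), has_triangle(n4, e4),
      is_k_colorable(n4, e4, 3), is_k_colorable(n4, e4, 4))
```

The output reports 11 vertices, 20 edges, no triangle, no 3-coloring, and a 4-coloring. Compare the growth from n to 2n + 1 vertices per step with the kn + n^k vertices of Zykov's step.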
Figure 2: The Mycielski graph.
There are many generalizations of this concept. Paul Erdős showed probabilistically [44] and Lovász constructively [45] that for any two integers g ≥ 3, k ≥ 3, there exist graphs with girth g and chromatic number k. Two other results due to Erdős [46] are: (i) for every k there is an ε > 0 and an integer N_0(ε, k) so that if n > N_0(ε, k) there exists a k-chromatic graph on n vertices for which every subgraph on ⌊εn⌋ vertices is 3-colorable; (ii) a graph on n vertices is 3-colorable if its girth is greater than 1 + 2 log_2 n.
5. Grötzsch's Theorem
Theorem 3.2 characterizes vertex 3-colorability for planar graphs in which every face is
bounded by a 3-cycle. Suppose instead we consider planar graphs containing no 3-cycles. This is a far larger class of graphs, of course, so a good characterization of 3-colorability here would be significant.
Between 1956 and 1962, Herbert Grötzsch published a series of sixteen papers under the rubric, “On the Theory of Discrete Structures”.6 Most of the papers were concerned with the combinatorial properties of some particular class of planar graphs. Grötzsch mentions that many of the results had been long known to him but did not seem to be available in the literature. One such result was the topic of his seventh paper, published November 1958, “A Three-Color Theorem for Triangle-free Networks on the Sphere” [48]. Grötzsch’s result was striking: Every planar graph without 3-cycles is vertex colorable in three colors. More formally:
Theorem 5.1: Grötzsch’s Theorem
If G is a graph without 3-cycles that is embedded in the plane, then:
i. The vertices of G can be colored in three colors. (Assignment Condition.)
ii. If G has at least one 4-gon or 5-gon, then the 3-coloring of the vertices according to the assignment condition may be specified arbitrarily in an arbitrarily selected 4-gon or 5-gon, called the distinguished face. (Additional Condition.)
(A k-gon is a face with exactly k edges in its boundary, where cut edges are counted twice.)
Grötzsch’s Proof
The proof of Grötzsch’s Theorem is by induction on the number of edges plus vertices of G. If the graph satisfies the conditions of part ii of the theorem, it is further assumed that it contains a distinguished face D. (It should be emphasized that part ii of the theorem is necessary for the induction.) An ordered list of reducible configurations is provided. Each reducible configuration is a specific description of the possible structure of G; for example, a particular subgraph that the graph G may contain. For each reducible configuration a corresponding reduction is provided, which is a procedure showing how a graph containing the reducible configuration can be 3-colored in accordance with the theorem, assuming that the theorem is valid for all smaller graphs and that the graph contains none of the previous reducible configurations. The list of reducible configurations forms a “complete system” in the sense that, if any graph satisfying the hypotheses of the theorem contains none of the reducible configurations before the final one, then by Euler’s formula the graph must in fact contain the final reducible configuration, a subgraph called the “Grötzsch configuration” [49] which we describe in the Appendix.
Grötzsch also independently constructed the graph of Figure 2 to show that the restriction to planar graphs cannot be dropped (see [48], p. 110). (For this reason, the Mycielski graph is sometimes referred to as the “Grötzsch graph.”) In fact, as pointed out in [50] and [51], Grötzsch’s 3-coloring result cannot be extended to any other surface. The Mycielski graph has girth 4 and is 4-chromatic and embeds in both the torus (Figure 3a) and the real projective plane (Figure 3b), hence in all higher surfaces, both orientable and nonorientable7 [52]. However, W.T. Tutte has suggested a way to generalize Grötzsch’s Theorem which, remarkably, makes no mention whatsoever of 3-coloring. This will be discussed in §15.
6. All of these papers appeared in the Mathematics-Natural Sciences Series of the Scientific Journal of the Martin Luther University. Sachs [47] provides an overview of these papers.
7. The embedding of Figure 3a is given by Kronk and White [53]; the embedding of Figure 3b was pointed out to the author, through W.T. Tutte, by H. Sachs [50].
Figure 3: Embeddings of the Mycielski graph.
It should be added that Grötzsch was motivated in part by the Four Color Problem, and he uses his main result to obtain a partial result for the Four Color Theorem based on the concept of a half-even normal map. The reader is referred to Grötzsch’s paper (see [48], p. 111) for further details.
Variations on Grötzsch’s Approach
Several authors have obtained graph coloring results for higher surfaces based on limiting the girth, e.g., Kronk [54], Kronk and White [53], Cook [55], and Woodburn [56], or the number of triangles, e.g., Kaiser [57]. Despite the fact that all these authors claim inspiration from Grötzsch’s Theorem, none of these approaches makes use of deep structural properties of the graph. (Moreover, not all the bounds obtained have been shown to be sharp.) Rather, all these results are based on ingenious counting arguments, for the most part making use of Dirac’s refinements to Brooks’ Theorem (see §3) in conjunction with the generalized Euler polyhedron formula [52]. Results include that a graph is 3-colorable if it is embeddable in: (i) the orientable surface of genus 1 (torus) with girth at least 6 [53]; or (ii) the nonorientable surface of genus 1 (projective plane) with girth at least 6 [53]; or (iii) the orientable surface of genus 2 with girth at least 7 [55].8
Grötzsch’s proof is rather long. As he himself admits at the end of the paper (see [48], p. 119): “The train of thought behind this proof, presented here in great detail, remains to be expressed more concisely...” Can a short proof be found? One possible approach would be to show that all planar graphs without triangles are subgraphs of some class of graphs already known to be 3-colorable. Grünbaum [49] reports on one such attempt based on the claim, alas incorrect, that every triangle-free planar graph is a subgraph of a chordal graph not containing K_4.9 Bjarne Toft (see Problem 1.6 in [8]) suggests trying to demonstrate directly that every planar triangle-free graph is a subgraph of an even triangulation. Grünbaum’s own approach had a most serendipitous result.
8. I gratefully acknowledge correspondence received from V.A. Aksionov [58] which insightfully discusses the papers [53]-[55], and [57].
6. Grünbaum’s Theorem and Aksionov’s Theorem
In an attempt to find a simpler proof of Grötzsch’s Theorem, Branko Grünbaum10 [49] discovered that as many as three 3-cycles could be allowed. Grünbaum’s Theorem, published in 1963, is of course best possible, as shown by the graph K_4. The paper contains two propositions. Proposition 1 is essentially the assertion that the theorem is true if it is true for the restricted class of graphs in which every face is a 3-gon, 4-gon or 5-gon. Proposition 2 is the theorem stated for the restricted class, including an “additional condition”:
Proposition 2: Let G be a planar graph having as faces only triangles, quadrangles, and pentagons and containing at most three 3-circuits. Then it is possible to color G with 3 colors. Moreover, if G contains at most one 3-circuit, the colors on the nodes of one arbitrarily chosen 4- or 5-face may be prescribed, unless the chosen face is a pentagon three consecutive vertices of which form a 3-circuit, in which case the prescribed coloring of the pentagon must assign different colors to those three vertices.
However, in 1972 Tibor Gallai found a simple counterexample to Grünbaum’s Proposition 2 (reported in [63][64]). Gallai’s graph is shown in Figure 4, where the 3-coloring of the 5-gon cannot be extended to the vertex x. Grünbaum gave a revised version of his Proposition 2 in which the “additional condition” was weakened.11 The revised Proposition 2 was proved by V.A. Aksionov in 1974 (see [63], Theorem 1), thus rigorously establishing Grünbaum’s Theorem. In addition, Aksionov precisely characterized conditions under which the original form of Grünbaum’s Proposition 2 is valid [63] (see Theorems 1 & 2). Despite the full rehabilitation of Grünbaum’s Theorem almost two decades ago, many graph theorists to this day are uncertain of its status, undoubtedly due to the fact that Aksionov’s paper was published in Siberia and is virtually unknown in the West.
9. A chordal graph (or rigid circuit graph or triangulated graph) is a graph for which there is no induced cycle of length greater than 3. Claude Berge had shown [60] that if a chordal graph is not k-colorable, then it contains a (k+1)-clique. Berge claimed [59] that every triangle-free planar graph is the subgraph of a chordal graph not containing K_4. However, Grünbaum [49] points out that there are simple counterexamples, such as the 3-cube. A graph G is perfect if, for every induced subgraph H of G, the chromatic number of H is equal to the size of the largest clique of H. Berge’s Strong Perfect Graph Conjecture [60] is that a graph G is perfect if and only if G contains no hole or antihole, where a hole is an induced odd cycle of length at least 5, and an antihole is the complement of a hole. Alan Tucker [61] showed in 1977 that the Strong Perfect Graph Conjecture is true for 3-chromatic graphs.
10. Expositions of Grünbaum’s 1963 paper are given in [1] and [62]. Ore (p. 238) concludes his presentation of the Grünbaum proof with the following surprising statement: “It can be considerably simplified if one restricts oneself to Grötzsch’s original theorem when there are no triangles.”
11. The revised Proposition 2 appeared in Grünbaum’s unpublished circa 1972 manuscript, “A New Proof of Grötzsch’s Theorem on 3-Colorings”. However, a gap in Grünbaum’s proof of his revised proposition was subsequently found by W.T. Tutte. (A story soon circulated that, upon hearing from Tutte, Grünbaum replied with the palindromic telegram: “Et tu Tutte?” In Waterloo in the autumn of 1975, Blanche Descartes confessed to me that she had in fact invented the story.)
Figure 4: Gallai’s graph.
We will need two definitions. In any 3-coloring of the vertices of a 5-gon, two vertices will be colored in one color, two vertices will be colored a second color, and the remaining vertex, called the special vertex, is colored the third color. The edge of the 5-gon opposite the special vertex is called the special edge. The papers of Grünbaum [49] and Aksionov [63] together establish
Theorem 6.1: Grünbaum’s Theorem
If G is a graph with at most three 3-cycles that is embedded in the plane, then:
i. The vertices of G can be colored in three colors.
ii. If G has at most one 3-cycle, and has at least one 4-gon or 5-gon, then the 3-coloring of the vertices may be specified arbitrarily in an arbitrarily selected 4-gon or 5-gon, assuming this face is not a 5-gon having its special edge contained in a 3-cycle.
The Grünbaum-Aksionov proof is an extension of the Grötzsch proof, although the generalization is far from obvious; the contributions of both Grünbaum and Aksionov are considerable. The proof of Grünbaum’s Theorem is outlined in the Appendix.12
Aksionov’s Theorem
As mentioned above, one of the innovations of Grünbaum’s proof is to show that the theorem holds generally if it holds for the restricted case of graphs having as faces only 3-gons, 4-gons, and 5-gons. (See Appendix.) Now, let us define A as the set of graphs embedded in the plane having as faces only 3-gons, 4-gons, and 5-gons, and containing a single 3-cycle and a distinguished 5-gon whose special edge is contained in the 3-cycle. If a graph G is contained in A, under what conditions can the coloring of the distinguished 5-gon be extended to a 3-coloring of the entire graph? Aksionov completely characterized this.
We need a few additional definitions. If C is a cycle in a graph G embedded in the plane, then the interior (exterior) component of C is the graph obtained from G by deleting all vertices lying in the exterior (interior) of C. A cycle in a planar graph is said to be separating if the interior and exterior components determined by the cycle each contain at least one vertex not in the cycle. Let F be any 5-gon in G other than the distinguished 5-gon D. Then a 4-cycle will be said to be an isolating 4-cycle for F if F and D ∪ C lie in different components of the 4-cycle.
12. In Steinberg and Younger [51], Grünbaum’s Theorem is called the “Grünbaum-Aksionov Extension.”
Theorem 6.2: Aksionov’s Theorem
Suppose G ∈ A where the 3-cycle is not separating. Then the 3-coloring of the distinguished 5-gon can be extended to G if and only if at least one 5-gon F (≠ D) does not have an isolating 4-cycle.
Corollary 6.3:
Suppose G ∈ A where the 3-cycle is not separating and there are no separating 4-cycles. Then the 3-coloring of the distinguished 5-gon can be extended to the entire graph if and only if G contains more than one 5-gon.
Although Aksionov’s Theorem seems somewhat specialized, we shall see in the sequel that it is a deep and useful result that complements Grünbaum’s Theorem.13
We conclude this section with two issues raised by the two parts of Theorem 6.1. First, is the theorem of Grünbaum similar to that of Brooks in that K_4 is the only exception? That is, can we allow four 3-cycles in a planar graph and be assured of vertex 3-colorability if the graph does not contain K_4? Second, if a planar graph has a large number of triangles but they are sufficiently dispersed, are the portions of the graph surrounding each triangle arbitrarily 3-colorable to the extent that the entire graph will admit a 3-coloring? Can we make this concept of “sufficiently dispersed” more precise? We address both issues in the next section.
7. The Distance of the Triangles
The question of how the relative placement of the triangles in a planar graph affects its vertex 3-colorability was first raised by Grünbaum. This topic has been covered thoroughly in the informative surveys of Aksionov and Mel’nikov [64][66], so this section will consist only of an updated overview. In his 1963 paper, after stating his generalization of Grötzsch’s theorem, Grünbaum mentions that he was unable to prove the following conjecture: If a planar graph G is not 3-colorable, then G contains two pairs of (edge or vertex) incident triangles. Ivan Havel [67] refuted this conjecture in 1969 by providing a planar non-3-colorable graph with four triangles, no two of which have a common vertex; a simplification of Havel’s graph due to Grünbaum also appeared in Havel’s paper (see Figure 5a). Havel defined d, the distance of the triangles in a graph, to be the length of the shortest path joining vertices of different triangles. He asked:
Problem 7.1:
Does there exist an integer n_0 such that any planar graph with d ≥ n_0 and an arbitrary number of triangles is vertex 3-colorable?
Havel further asked: Is, possibly, n_0 = 2? In 1970, Havel [68] himself answered the latter question in the negative by constructing a non-3-colorable planar graph with d = 2. This graph contained six triangles; Aksionov and Mel’nikov [64][66] later provided a non-3-colorable planar graph with d = 2 containing only four triangles (Figure 6a), thus answering a question of Horst Sachs [62] (p. 258).
13. Detailed presentations of Theorems 6.1 and 6.2 are given in Steinberg [65].
Figure 5: A counterexample with d = 1.
Figure 6: A counterexample with d = 2.
Quasi-Edges
We approach Problem 7.1 by analyzing the counterexamples of Figure 5a and Figure 6a. In Figure 5a, consider the subgraph with eight vertices lying between z_1 and z_2 (Figure 5b). In any 3-coloring of the subgraph, vertices z_1 and z_2 are assigned different colors. (This follows easily by contradiction.) Havel’s graph in Figure 5a can now immediately be seen to be constructed by replacing two nonadjacent edges of K_4 by copies of the graph of Figure 5b. The graph of Figure 6a is constructed in an analogous fashion with the 11-vertex subgraph lying between z_1 and z_2 (Figure 6b).
Aksionov and Mel’nikov [64][66] define a quasi-edge (z_1, z_2) to be a graph embedded in the plane containing a pair of nonadjacent vertices z_1 and z_2 lying in the outer face which are assigned different colors in any 3-coloring of the graph.14 Thus the graphs of Figure 5b and Figure 6b are two examples of quasi-edges. By devising a quasi-edge in which the triangles are more widely dispersed, Aksionov and Mel’nikov [64][66] constructed a planar non-3-colorable graph with d = 3 (Figure 7a). Thus, Havel’s n_0 must be at least 4.
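The quasi-edge property and the K_4 substitution can both be tested by exhaustive search on small graphs. The Python sketch below uses a deliberately tiny toy quasi-edge, which is an assumption of this illustration and is not one of the graphs of Figures 5b, 6b or 7b: two triangles sharing an edge, with z_2 attached to the far tip. Planarity and the outer-face condition are evident for the toy graph and are not checked by the code; the code also requires the graph to be 3-colorable, to rule out the vacuous case.

```python
from itertools import product

def proper_3_colorings(n, edges):
    """Yield every proper coloring of vertices 0..n-1 with colors 0, 1, 2."""
    for col in product(range(3), repeat=n):
        if all(col[u] != col[v] for u, v in edges):
            yield col

def is_quasi_edge(n, edges, z1, z2):
    """z1, z2 nonadjacent, graph 3-colorable, and z1, z2 differ in every 3-coloring."""
    nonadjacent = (z1, z2) not in edges and (z2, z1) not in edges
    colorings = list(proper_3_colorings(n, edges))
    return nonadjacent and bool(colorings) and all(c[z1] != c[z2] for c in colorings)

def toy_quasi_edge(z1, z2, b, c, d):
    """Triangles z1-b-c and b-c-d force d to share z1's color; z2 hangs off d."""
    return [(z1, b), (z1, c), (b, c), (b, d), (c, d), (d, z2)]

toy = toy_quasi_edge(0, 4, 1, 2, 3)
print(is_quasi_edge(5, toy, 0, 4))

# K_4 on vertices 0, 1, 2, 3 with the nonadjacent edges 01 and 23 replaced
# by copies of the toy quasi-edge (internal vertices 4-6 and 7-9).
substituted = [(0, 2), (0, 3), (1, 2), (1, 3)]
substituted += toy_quasi_edge(0, 1, 4, 5, 6) + toy_quasi_edge(2, 3, 7, 8, 9)
print(next(proper_3_colorings(10, substituted), None) is None)
```

Both lines print True: the toy graph is a quasi-edge, and the substituted graph has no 3-coloring because its four terminal vertices must receive pairwise different colors. Since the two triangles of the toy quasi-edge share an edge, this example has d = 0 and says nothing about Problem 7.1; the difficulty lies in finding quasi-edges whose triangles are far apart.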
Figure 7: A counterexample with d = 3.
We have shown that non-3-colorable planar graphs can be constructed by piecing together quasi-edges. But how do we construct the quasi-edges themselves? One technique makes use of Aksionov’s Theorem (Theorem 6.2) and can be used to construct all three quasi-edges described above. Consider, for example, the graph of Figure 7b from [MI, where shaded areas indicate regions consisting of any number of 4-gons. It is a straightforward exercise [66] to show that Corollary 6.3 implies this general configuration is a quasi-edge. As a particular case this includes the quasi-edge used in constructing the graph of Figure 7a.
A second technique for constructing quasi-edges was introduced by Steinberg in 1975 (as reported in Aksionov and Mel’nikov [64][66]) and uses multiple copies of yet smaller graphs which Aksionov and Mel’nikov dubbed “building blocks.” This technique was illustrated only briefly in Aksionov and Mel’nikov’s two surveys, so we will go into some detail here.
14. Upon being introduced to some of the problems of §7, Paul Erdős [69] remarked on the similarity of quasi-edges to signal senders introduced in the paper of Burr, Erdős, and Lovász [70]. However, Erdős immediately added, “It won’t help you.”
Building Blocks
Define a building block as a graph embedded in the plane, containing in the outer face four vertices in cyclic order, x_1, x_2, y_2, y_1, such that, in any 3-coloring of the graph where x_2 receives the same color as x_1, y_2 will receive the same color as y_1. The graph of Figure 8a is an example of a building block.
Figure 8: Example of a building block.
The graph of Figure 8b is obtained by “stacking” six building blocks, and adding a new vertex y as well as six new edges as shown. We can now show that the graph of Figure 8b is a quasi-edge (z_1, z_2). Suppose z_2 is colored the same as z_1 in some 3-coloring. Then v_2 must receive the same color as v_1, and w_2 must receive the same color as w_1. However, the six new edges ensure that the three labeled vertex pairs and the vertex y each receive a different color, which is impossible with only three colors. The quasi-edge of Figure 8b was constructed by Steinberg; the quasi-edge of Figure 9b, based on the building block of Figure 8a, was constructed by Aksionov and Mel’nikov [64][66]. All building block quasi-edges which have appeared in the literature have essentially used the six block construction of Figure 8b and Figure 9b, but other structures would be worth investigating.
Note that all the non-3-colorable planar graphs described above were constructed by replacing two edges of K_4 by two quasi-edges. Hence each has connectivity 2. Aksionov and Mel’nikov [64] (Conjecture 2) [66] (Conjecture 1) offered the following conjecture: Any 4-critical planar graph with d ≥ 1 has connectivity 2. However, Han-Kun Ma [71] used the quasi-edge of Figure 7b to construct a counterexample with d = 2 and connectivity 3 (Figure 10).
Figure 9: Another building block.
Figure 10: A counterexample with d = 2 and connectivity 3.
As reported in [8] (Problem 1.8), Michael Albertson suggested in 1986 that it might be better to look for other conditions on triangles that force G to be 3-colorable, that perhaps d is not the right parameter. We end this section with an open problem due to Paul Erdős [69]:
Problem 7.2:
If G is a non-3-colorable planar graph with exactly four triangles, is it true that G must contain either K_4, the graph of Figure 5a, or the graph of Figure 6a?
8. Uniquely 3-Colorable Planar Graphs
A graph is said to be uniquely k-colorable if there is precisely one partition of its vertex set into k independent subsets. Since every 3-colorable triangulation is uniquely 3-colorable, this may suggest that triangles play a role in determining whether a planar graph is uniquely 3-colorable. Let t denote the number of triangles in a graph. In 1969, Gary Chartrand and Dennis Geller [72] used the first part of Grünbaum’s Theorem to show that if G is a uniquely 3-colorable planar graph on n vertices with n ≥ 4, then t ≥ 2. Aksionov published two papers in 1977. In his first paper [73], he makes use of both parts of Grünbaum’s Theorem as well as his own theorem (Theorem 6.2) to show that Chartrand and Geller’s lower bound is sharp only for n = 4. Thus:
Theorem 8.1:
If G is a uniquely 3-colorable planar graph where n ≥ 5, then t ≥ 3.
Aksionov also characterized uniquely 3-colorable planar graphs with exactly three triangles:
Theorem 8.2:
If G is a uniquely 3-colorable planar graph where t = 3, then:
(i) If n ≤ 5, then G is isomorphic to G_1 or G_2 (Figure 11),
(ii) If n ≥ 6, then G contains G_1 as a subgraph and also contains a 2-vertex on the boundary of a 5-gon.
Figure 11: Uniquely 3-colorable planar graphs.
In his second 1977 paper, Aksionov [74] generalizes Chartrand and Geller’s result in another direction. Two vertices of a graph are said to be 3-chromatically connected if they are assigned the same color in any 3-coloring of the graph. In the terminology of D.L. Greenwell [75], a graph having a pair of 3-chromatically connected vertices is said to be semi-uniquely 3-colorable.15 Aksionov proved that:
Theorem 8.3:
If G is a semi-uniquely 3-colorable planar graph, then t ≥ 2.
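The definitions of this section are simple to test by enumeration on small graphs. The following Python sketch (function names are hypothetical, and the two example graphs are chosen here for illustration rather than taken from Figure 11) counts the partitions into three independent sets, counts triangles, and lists the 3-chromatically connected pairs.

```python
from itertools import product

def color_partitions(n, edges, k=3):
    """All partitions of 0..n-1 into exactly k nonempty independent classes."""
    parts = set()
    for col in product(range(k), repeat=n):
        if len(set(col)) == k and all(col[u] != col[v] for u, v in edges):
            classes = frozenset(
                frozenset(v for v in range(n) if col[v] == c) for c in range(k)
            )
            parts.add(classes)  # colorings equal up to renaming colors coincide
    return parts

def count_triangles(n, edges):
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Each triangle is seen once from each of its three edges.
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3

def chromatically_connected_pairs(n, edges):
    """Pairs receiving equal colors in every proper 3-coloring (if any exists)."""
    cols = [c for c in product(range(3), repeat=n)
            if all(c[u] != c[v] for u, v in edges)]
    return [(u, v) for u in range(n) for v in range(u + 1, n)
            if cols and all(c[u] == c[v] for c in cols)]

diamond = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]  # K_4 minus the edge 03
square = [(0, 1), (1, 2), (2, 3), (3, 0)]           # the 4-cycle, no triangles

print(len(color_partitions(4, diamond)), count_triangles(4, diamond),
      chromatically_connected_pairs(4, diamond))
print(len(color_partitions(4, square)), count_triangles(4, square),
      chromatically_connected_pairs(4, square))
```

The diamond has n = 4, t = 2, exactly one partition, and the 3-chromatically connected pair (0, 3), so it meets the bound t ≥ 2 with equality. The triangle-free 4-cycle has two partitions and no 3-chromatically connected pair, as Theorem 8.3 requires.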
15. Greenwell was primarily concerned with 4-coloring.
Aksionov ended his first 1977 paper with two conjectures. Let G be a uniquely 3-colorable planar graph where G − e is not uniquely 3-colorable for each edge e of G. Then G is said to be u-critical.
Conjecture 1: Every uniquely 3-colorable planar graph must contain two triangles having an edge in common.
Conjecture 2: If G is a u-critical planar graph with n vertices and m edges, then m = 2n − 3.
That same year, Mel’nikov and Steinberg [76] constructed a graph which refuted both of Aksionov’s conjectures (Figure 12). Their paper contained two new problems which are still open:
Figure 12: One counterexample for two conjectures on 3-coloring.
Problem 8.4: Does there exist an integer n_1 such that if a planar graph with any number of triangles has d ≥ n_1, then the graph is not uniquely 3-colorable? It is possible that n_1 = 1.
Problem 8.5: Find an exact upper bound for the number of edges m in a u-critical graph G with n vertices.
Note that L.J. Osterweil [77] provides a technique for generating a class of uniquely 3-colorable general graphs.
9. Restricting 4-Cycles and 5-Cycles
In investigating conditions on vertex 3-coloring, we may restrict ourselves to graphs having no vertices of degree less than 3. In the planar case, it follows from Euler’s formula that for such graphs:
\[ 3f_3 + 2f_4 + f_5 \ge 12 + \sum_{i \ge 7} (i - 6) f_i \]
where f_i denotes the number of i-gons in the graph. Call a cycle small if it has length 3, 4, or 5.
Then the graphs we are concerned with must contain a considerable number of small cycles. The theorems of Grötzsch and Grünbaum show that if the number of 3-cycles is sufficiently small, then a planar graph with any number of 4-cycles or 5-cycles must be vertex 3-colorable. A complementary result would be to allow any number of 3-cycles, but to limit the number of 4-cycles and 5-cycles. The following question was raised in 1975 by Steinberg in a letter to Aksionov and Mel’nikov (see Conjecture 4 in [64], or Remark 4 in [66]; also Problem 1.7 in [8] and Problem 5 in [q):
Problem 9.1: If G is a planar graph without 4-cycles and 5-cycles, must G be 3-colorable?
Toft [8] points out that Problem 9.1 can be reformulated as: If G is a planar 4-critical graph, is it then true that G contains a 4-cycle or a 5-cycle? Both 4-cycles and 5-cycles must be excluded, as K_4 and the graph of Figure 5a respectively show. On the other hand, Aksionov and Mel’nikov point out that in a certain sense either the 4-cycles or the 5-cycles can be ignored, since it is possible to insert the quasi-edge of Figure 5b or the quasi-edge of Figure 13 into non-3-colorable graphs, eliminating either the 4-cycles or 5-cycles, respectively.
Figure 13: A quasi-edge to eliminate 5-cycles.
Paul Erdős [69] has suggested the following relaxation of Problem 9.1:
Problem 9.2:
Is there an integer k ≥ 5 such that if G is a planar graph without i-cycles, 4 ≤ i ≤ k, then G must be 3-colorable?
10. Algorithms and Complexity
Brooks’ Theorem (Theorem 3.5) means of course that vertex 3-colorability of a graph with maximum vertex degree 3 can be determined in polynomial time, since this amounts only to verifying that the graph does not contain K_4. (See [78] for the complexity terminology.) Since Brooks’ Theorem applies to all graphs, nonplanar as well as planar, might we not hope that with the additional restriction of planarity a stronger result could be obtained? The bad news is provided by Larry Stockmeyer [79] and by Michael Garey, David Johnson and Stockmeyer [80]:
Theorem 10.1:
(i) Vertex 3-colorability of graphs is NP-complete,
(ii) Vertex 3-colorability of planar graphs is NP-complete,
(iii) Vertex 3-colorability of planar graphs with maximum vertex degree 4 is NP-complete.
Outline of Proof:
(i) The NP-complete problem 3-satisfiability is shown to be reducible to 3-colorability by constructing, for any set C of clauses containing 3 literals, a graph G which is 3-colorable if and only if C is satisfiable.16 A similar idea presented by Read and Wright [82] is discussed in §14.
(ii) 3-colorability is shown to be reducible to planar 3-colorability by a modification procedure which gives, for any proper drawing in the plane of a nonplanar graph G, a planar graph G' which is obtained by replacing edge crossings with copies of the graph H called a crossover (see Figure 14). It has the following two properties: (a) Any 3-coloring of H assigns x' the same color as x and assigns y' the same color as y. (Assume x and x' are colored 1 and 2. Then a contradiction will easily follow, whether the central vertex is colored 1 or 3.) (b) Any 3-coloring of x, x', y, and y' that is consistent with (a) can be extended to a 3-coloring of H. (This is easily verified.) (This crossover from [80] was suggested by M.J. Fisher and is simpler than the one used by Stockmeyer [79] in the original proof.) The construction is effected by substituting a copy of the crossover for each edge crossing, as shown in Figure 15 for the case of an edge (u, v) crossed twice by other edges. The equivalence of the 3-coloring of G (Figure 15a) and G' (Figure 15b) is straightforward, as is the generalization to any number of crossings.
(iii) Planar 3-colorability is shown to be reducible to planar 3-colorability with vertex degree at most 4 by a modification procedure which gives, for any planar graph G, a planar graph G'' with degree at most 4 which is 3-colorable if and only if G is 3-colorable. This procedure replaces each vertex in G of degree k, k ≥ 4 (see Figure 16a), with a graph (Figure 16b) called a vertex substitute. As easily seen, G'' is 3-colorable if and only if G is 3-colorable.
Results (i) and (ii) were first given by Stockmeyer [79]. Results (i), (ii), and (iii) were given by Garey, Johnson and Stockmeyer [80].
16. A.K. Dewdney [81] has shown that, not only is there a polynomial time transformation between 3-satisfiability and graph 3-colorability, these problems are in fact linear-time equivalent.
Figure 14: A crossover.
Figure 15: Using crossovers.
Gideon Ehrlich, Shimon Even and Robert Tarjan [83] provide another complexity result on 3-coloring. Let V be a finite set of arcs (i.e., Jordan arcs) in the plane. Associate a graph G(V), called the intersection graph of V, as follows: V is the vertex set of G(V) and there is a single edge connecting v_1 and v_2 if and only if the arcs v_1 and v_2 intersect (one or more times). If all the members of V are straight line segments, then G(V) is said to be an intersection graph of straight line segments. It is easily demonstrated that every planar graph is an intersection graph, although clearly not every intersection graph is planar. Ehrlich, Even and Tarjan show that the set of all graphs properly contains the set of intersection graphs, which in turn properly contains the set of intersection graphs of straight line segments. They say: “This seems to leave hope that one may find an efficient algorithm to color intersection graphs in spite of the fact that the problem of finding the chromatic number, of graphs in general, is NP-complete.” By making use of a construction from Stockmeyer’s paper [79], they dash this hope:
Figure 16: A vertex and its substitute.
Theorem 10.2: Vertex 3-colorability of intersection graphs of straight line segments is NP-complete.
The Petford-Welsh Algorithm
If a graph satisfies the conditions of Brooks’ Theorem for the case k = 3, then the proofs referenced in §3 provide polynomial time 3-coloring algorithms. Otherwise, providing an efficient 3-coloring algorithm is not as easy as it may first seem. David Johnson [84] found that for each algorithm on a list of “prominent” graph coloring algorithms he could create a sequence of 3-colorable graphs G_n with O(n) vertices, n = 3, 4, ..., for which the number of colors used by the algorithm is at least n.
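The flavor of such bad instances can be seen in a classical textbook example, which is an assumption of this sketch and is not one of Johnson's constructions: first-fit (greedy) coloring on the crown graph, the complete bipartite graph K_{n,n} with a perfect matching removed. The graph is 2-colorable, hence 3-colorable, yet an unlucky vertex order makes the greedy rule use n colors on 2n vertices.

```python
def greedy_coloring(adj, order):
    """First-fit: give each vertex the smallest color unused by colored neighbors."""
    color = {}
    for v in order:
        used = {color[w] for w in adj[v] if w in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color

def crown_graph(n):
    """Vertices u_i = 2i and v_i = 2i + 1; u_i is adjacent to v_j exactly when i != j."""
    adj = {x: set() for x in range(2 * n)}
    for i in range(n):
        for j in range(n):
            if i != j:
                adj[2 * i].add(2 * j + 1)
                adj[2 * j + 1].add(2 * i)
    return adj

results = []
for n in (3, 5, 8):
    adj = crown_graph(n)
    # Bad order u_0, v_0, u_1, v_1, ...: each matched pair is forced onto a new color.
    bad = greedy_coloring(adj, range(2 * n))
    # Good order: all u_i first, then all v_i.
    good_order = [2 * i for i in range(n)] + [2 * i + 1 for i in range(n)]
    good = greedy_coloring(adj, good_order)
    results.append((n, max(bad.values()) + 1, max(good.values()) + 1))
print(results)
```

For each n the bad order uses n colors while the good order uses 2. The example only shows that a fixed simple rule can be fooled by the input order; Johnson's result is stronger in that it defeats each of the listed algorithms.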
Given the intractability of 3-coloring, A.D. Petford and D. Welsh [85] address the problem of developing a heuristic algorithm to 3-color the vertices of a graph. They report that a randomized algorithm, called an antivoter model, “works well in a wide variety of cases.” Their heuristic is:
(1) Color the vertices arbitrarily with three colors i = 1, 2, 3.
(2) Define a vertex to be bad if it is adjacent to a vertex of the same color. Compute B, the set of bad vertices.
(3) If B is empty, a proper 3-coloring has been found. Stop. Otherwise, choose a random vertex v in B and for i = 1, 2, 3 compute s_i, the number of neighbors of v with color i.
(4) Re-color v with the random color X given by P(X = i) = p(s_1, s_2, s_3; i) for 1 ≤ i ≤ 3, where p is the transition function satisfying:
(i) p(s_1, s_2, s_3; i) ≥ 0 for 1 ≤ i ≤ 3, and
(ii) \[ \sum_{i=1}^{3} p(s_1, s_2, s_3; i) = 1. \]
(5) Continue to repeat steps 2 through 4. If a time limit is reached before the algorithm has found B 3-colorable, then the algorithm claims, perhaps erroneously, that the graph is not 3-colorable. Petford and Welsh found that the transition function
worked well in a large variety of cases. For a certain class of graphs, in fact, this choice of p appeared to achieve a 3-coloring in time that is linear in the number of vertices of the graph. Two situations they found in which the algorithm does not appear to be efficient are: (a) when the only 3-coloring of the graph decomposes the vertex set into disjoint sets where one of the sets is much larger than each of the others, and (b) when the graph is approximately regular with a low vertex degree, e.g., of order 5 or 6 for a 1000-vertex graph. With reference to the result of Theorem 10.1 (iii), Petford and Welsh make the intriguing remark that graphs of maximum vertex degree 4 could possibly be among the hardest to color. (Two geometric problems involving the 3-colorability of 4-regular graphs are discussed in the following section.)¹⁷

Avrim Blum considers in his Ph.D. thesis [88] the problem of coloring a k-colorable graph in polynomial time with as few additional colors as possible; Blum’s emphasis is on the case k = 3. He presents an algorithm to color any 3-colorable graph with $O(n^{3/8}\,\mathrm{polylog}(n))$ colors. His thesis contains a history of the problem and additional references.

11. Geometric Problems

11.1 Plane 4-Regular Graphs with Cycle-Decompositions

Grötzsch conjectured that any graph formed by the superposition of a set of simple closed curves (i.e., Jordan curves) in the plane, no two of which are tangent and no three of which meet at a point, is vertex 3-colorable. Grötzsch’s conjecture was presented by Sachs at several conferences in the early 1970s; see, for example, [89].
More formally, the set of simple closed curves determines a graph whose vertices are the points of intersection of pairs of curves, and whose edges are the arcs of curves determined by the intersection points. Equivalently, this is a 4-regular graph embedded in the plane which has a decomposition into edge-disjoint cycles, such that any pair of adjacent edges on any face lie in different cycles of the decomposition.

17. G. Hajós [a] gave a necessary and sufficient condition for a graph not to be (k − 1)-colorable, involving constructing the graph through a sequence of operations from complete graphs $K_k$. A.J. Mansfield and D. Welsh [87] define the Hajós number h(G) of a graph based on the minimal length of this sequence for the case k = 4. (If the graph is in fact 3-colorable, the Hajós number is defined to be zero.) Using this concept and the fact that 3-colorability is NP-complete, they investigate the conjecture that coNP ≠ NP. The Hajós number turns out to be exceedingly difficult to calculate in general; the authors report that the Hajós number of the Mycielski graph is still unknown.
A counterexample to Grötzsch’s conjecture consisting of five overlapping circles was given by G. Koester [90] [91] in 1984 (Figure 17a), who also provided a 4-critical example consisting of seven overlapping circles [91] [92] in 1985 (Figure 17b). To see that the graph of Figure 17a is not 3-colorable, try extending the two non-isomorphic 3-colorings of the central 4-gon. That the graph of Figure 17b is 4-critical is also straightforward to see; a formal proof is given by Koester [92].
Figure 17: Counterexamples to Grötzsch’s conjecture.

It is not difficult to show [91] [92] that if the generating cycles of the graph in Grötzsch’s conjecture can be partitioned into three classes, where the cycles in each class are pairwise vertex-disjoint, then the graph is 3-colorable. This led Koester to propose the following problem [91], published in 1985.
Problem 11.1: Consider a graph formed by the superposition of a set Γ of simple closed curves in the plane, no two of which are tangent and no three of which meet at a point. If the elements of Γ (that is, the generating cycles) can be partitioned into four classes, where the cycles of each class are pairwise vertex-disjoint, is the graph then vertex 3-colorable?

Koester has conjectured an affirmative answer to Problem 11.1.
11.2 The Circle-Triangle Conjecture

This problem has several equivalent forms and concerns general graphs. It was submitted by Michael Fellows [94] to the problems session at the Summer Research Conference Graphs and Algorithms at the University of Colorado, July 1987. Fellows reports in 1990 that he first learned of it in the form presented below from Frank Hsu, and that it had been “circulated widely for a few years, partly through the offices of Paul Erdős” [93].

Problem 11.2: The Circle-Triangle Conjecture

No matter how a collection of triangles is inscribed in a circle with all points of intersection distinct, one can always choose one vertex of each triangle so that no two of the chosen vertices are consecutive with respect to the circle.
More formally, the circle and triangles determine a graph whose vertices are the points where triangles intersect the circle, and whose edges are the sides of the triangles and the arcs of the circle determined by the circle-triangle intersection points. The conjecture states that if n triangles are inscribed in a circle, the resulting graph on 3n vertices always contains an independent set of size n. Fellows adds that a somewhat stronger statement may be true, viz., that the vertices of the graph can be partitioned into three independent sets of size n. Equivalently:

Problem 11.3: The Strong Circle-Triangle Conjecture

No matter how a collection of triangles is inscribed in a circle with all points of intersection distinct, the resulting graph is vertex 3-colorable.
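The graph described above is easy to build and test on small cases. The following Python sketch is illustrative only: the function names and the encoding of an arrangement as a list of triangle labels read around the circle are assumptions made here, not notation from the sources cited. It constructs the circle-triangle graph and searches for a 3-coloring by plain backtracking, which is practical only for small n.

```python
def circle_triangle_graph(labels):
    """Build the circle-triangle graph.

    labels[i] is the (hypothetical) name of the triangle owning the i-th
    point met when walking around the circle; every name occurs three times.
    Returns an adjacency dict on the vertices 0 .. 3n-1.
    """
    m = len(labels)
    adj = {v: set() for v in range(m)}
    for v in range(m):                      # arcs of the circle
        w = (v + 1) % m
        adj[v].add(w)
        adj[w].add(v)
    corners = {}
    for v, t in enumerate(labels):
        corners.setdefault(t, []).append(v)
    for pts in corners.values():            # sides of the triangles
        assert len(pts) == 3, "each triangle needs exactly three corners"
        for a in pts:
            for b in pts:
                if a != b:
                    adj[a].add(b)
    return adj


def three_color(adj):
    """Exhaustive backtracking; returns {vertex: color} or None."""
    order = sorted(adj)
    color = {}

    def extend(i):
        if i == len(order):
            return True
        v = order[i]
        for c in range(3):
            if all(color.get(u) != c for u in adj[v]):
                color[v] = c
                if extend(i + 1):
                    return True
                del color[v]
        return False

    return dict(color) if extend(0) else None


# Three triangles whose corners interleave around the circle.
adj = circle_triangle_graph([0, 1, 2, 0, 1, 2, 0, 1, 2])
coloring = three_color(adj)
```

In any proper 3-coloring of such a graph each triangle receives all three colors, so every color class is an independent set of size n; this is why the strong form implies the original conjecture.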
Fellows [93] attributes the strong form of the conjecture “primordially” to I. Schur and independently to J. Schönheim (1988, unpublished). Noga Alon [96] equivalently formulates the strong form of the conjecture as follows. Let G be a graph on n vertices. If k divides n, then G is said to be strongly k-colorable if, for any partition of V into sets $V_i$ each having cardinality k, G is vertex colorable in k colors such that each color class intersects each $V_i$ in exactly one vertex. If k does not divide n, then G is said to be strongly k-colorable if the graph obtained by adding $k\lceil n/k \rceil - n$ isolated vertices is strongly k-colorable. The strong chromatic number of a graph G, $s\chi(G)$, is the minimum k such that G is strongly k-colorable. In this terminology, the Strong Circle-Triangle Conjecture is that, for every cycle on 3n vertices, $C_{3n}$, $s\chi(C_{3n}) \le 3$. It has been shown by F. de la Vega, Fellows and Alon that, for every cycle on 4n vertices, $C_{4n}$, $s\chi(C_{4n}) \le 4$ (see [95] and [97]). See §16 for a dramatic development on Problem 11.3.¹⁸ ¹⁹

12. Relationship to Four Coloring
Fred Holroyd and Feodor Loupekine [28] introduced the following concept: A combinatorial statement of the form “For every X there exists a Y with property Z” is loosely coupled to the Four Color Theorem if it is equivalent to it, but either there is no natural correspondence between plane maps and X’s, or, for a fixed map and the corresponding X, there is no natural correspondence between equivalence classes of map colorings and the corresponding set of Y’s with property Z. Although this definition of “loosely coupled” begs the question of what “natural correspondence” exactly means, it does help to clarify the concept of restating the Four Color Problem in other, presumably more tractable, formulations.

18. Daniel I.A. Cohen asked in 1W whether every planar map of rectangular regions can be face colored in three colors. The following year, an easy counterexample was published consisting of a rectangle surrounded by five rectangles, which was found by David Cohen, a 9th grade high school student in Philadelphia [98] [99]. However, geometric sufficient conditions for planar 3-colorability may be worthy of further study.

19. Dynkin and Uspenskii [7] consider an arbitrary collection of circles drawn in the plane, where each circle contains a chord so that chords of two different circles have at most one point in common. They show that the resulting map is face colorable in three colors. In general, any map formed by the superposition of a set of 3-face-colorable plane maps, where the number of intersection points is finite, will be 3-face-colorable. The proof is easy; from the flow theory (§15) it is immediate.

12.1 Triangle-Colored Graphs

As discussed in §2, Heawood was the first to demonstrate that the Four Color Problem is, via Tait colorings, a special case of the Three Color Problem. A formal statement of this was provided by Ore [1] (Chapter 9). We need some definitions. The line graph, L(G), of a planar graph G is the graph having its vertex set in one-to-one correspondence with the edge set of G, where two vertices of L(G) are adjacent if and only if the corresponding edges of G are adjacent. A plane graph (i.e., a graph embedded in the plane), all of whose vertices have even degree, is said to be triangle-colored when it can be face colored in two colors α and β such that all the α-faces are 3-gons.

Theorem 12.1:
The following statements are equivalent:
(i) Every planar graph is vertex 4-colorable.
(ii) Every triangle-colored simple plane graph which is regular of degree 4 is vertex 3-colorable.

Thus, by the Four Color Theorem, Theorem 12.1 provides an additional class of 3-colorable graphs not covered by any other 3-coloring results. Alternatively, Theorem 12.1 may be a vehicle through which a short(er) and possibly non-electronic proof of the Four Color Theorem could be obtained (see §1). Holroyd and Loupekine [28] state (ii) equivalently as: The line graph G’ of any cubic map G is a subgraph of an even triangulation. They claim that although this statement is tightly coupled to the Four Color Theorem if triangular embeddings are attempted one at a time, it may be regarded as loosely coupled if stated as: The line graph of every cubic map arises as a subgraph of some member of a particular class of even triangulations. The reduction of the next subsection is an example of “loose coupling.”

12.2 Toft’s Color Reduction

Bjarne Toft raised the following question of expressing 4-colorability in terms of 3-colorability in lectures on graph coloring at the University of Regina, Canada in 1985 (see [8], Problem 5.8). It concerns general graphs.

Problem 12.2: Toft’s Color Reduction
Suppose G is a 4-colorable graph that does not contain $K_4$ as a subgraph. Does the following necessarily hold? G contains two vertices x and y and two 3-colorable subgraphs $G_1$ and $G_2$, each containing x and y but not containing the edge (x, y), such that:
(1) in any 3-coloring of $G_1$, the vertices x and y receive different colors, and
(2) in any 3-coloring of $G_2$, the vertices x and y receive the same color.

Toft points out that if the conditions above hold for a given graph G, this would explain why G is not 3-colorable. Observe that the non-3-colorable planar graphs described in §7 clearly have this structure, where $G_1$ is a quasi-edge with end vertices {$z_1$, $z_2$} = {x, y}. Toft also points out that if the condition holds for every 4-chromatic graph without $K_4$, this would characterize 4-colorable graphs in terms of 3-colorability. Further, he remarks that of course this is not expected to lead to a polynomial time characterization of 4-colorability, since then the NP-complete problem of 3-colorability would also be in coNP, hence coNP = NP, which is widely believed to be false (coNP ≠ NP is a stronger statement than P ≠ NP; see [78] and §10). The corresponding reduction for 3-colorability in terms of 2-colorability does indeed hold; for k-colorability in terms of (k − 1)-colorability, the corresponding reduction does not hold for k ≥ 6. For references on this problem and variations, see [5] (Problem 28), and [8] (Problem 5.8).

13. Counting 3-Colorings
Heawood was the first to consider the number of colorings a map can have. In a footnote in his 1898 paper [14] (see p. 281), he correctly reports that the number of ways in which three colors can properly color n regions meeting at a vertex is $\tfrac{1}{3}(2^{n-1} + (-1)^n)$. After Heawood’s paper, most of the counting focused on the case of four colors. Around the early 1970s, however, the counting of 3-colorings resumed. We will start by considering two 3-colorings of a graph distinct if they induce different partitions of the vertex set of the graph into three independent subsets.

In 1972, Ioan Tomescu [100] considered graphs on n vertices whose chromatic number is equal to 3. As usual, let $C_n$ denote the cycle on n vertices; also, let $C'_n$ denote the graph consisting of $C_{n-1}$ with the addition of one vertex adjacent to one vertex of the cycle. Tomescu proved the following:

Theorem 13.1: The maximum number of 3-colorings of a connected graph having n vertices and chromatic number 3 is $\tfrac{1}{3}(2^{n-1} - 1)$ for odd n, and $\tfrac{2}{3}(2^{n-2} - 1)$ for even n. If n is odd, the unique connected graph that achieves the maximum number of 3-colorings is $C_n$; if n is even, the unique connected graph that achieves the maximum number of 3-colorings is $C'_n$.
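For small n the counts attained by the two extremal graphs can be checked by exhaustive enumeration. The sketch below is an illustration under stated assumptions rather than anything taken from Tomescu's papers: the function names are hypothetical, 3-colorings are counted as partitions into three independent sets (labeled proper colorings divided by 3! = 6), and the closed forms compared against are $(2^{n-1} - 1)/3$ for the odd cycle and $2(2^{n-2} - 1)/3$ for an odd cycle on n − 1 vertices with a pendant vertex.

```python
from itertools import product


def count_3_partitions(n_vertices, edges):
    """Count partitions of the vertex set into three non-empty independent
    sets by enumerating labeled proper 3-colorings that use all three
    colors and dividing by 3! = 6."""
    proper = 0
    for colors in product(range(3), repeat=n_vertices):
        if len(set(colors)) == 3 and all(colors[u] != colors[v] for u, v in edges):
            proper += 1
    return proper // 6


def cycle(n):
    """Edge list of the cycle on vertices 0 .. n-1."""
    return [(i, (i + 1) % n) for i in range(n)]


def cycle_with_pendant(n):
    """Cycle on n-1 vertices plus one extra vertex joined to vertex 0."""
    return cycle(n - 1) + [(0, n - 1)]


odd_counts = {n: count_3_partitions(n, cycle(n)) for n in (3, 5, 7)}
even_counts = {n: count_3_partitions(n, cycle_with_pendant(n)) for n in (4, 6, 8)}
```

The enumeration visits all 3^n color assignments, so it is meant only as a sanity check for single-digit n.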
Tomescu points out that the theorem remains true for connected planar graphs since the extremal graphs $C_n$ and $C'_n$ are planar. Thus, Heawood’s formula for the number of ways of 3-coloring a configuration of n regions meeting at a vertex is in fact the maximum number of ways a map with n regions can be 3-colored if n is odd, and this maximum can only be achieved in such a configuration. In 1976, Tomescu supplied the following analogous result for Hamiltonian graphs [101].

Theorem 13.2: The maximum number of 3-colorings of a Hamiltonian graph having n vertices and chromatic number 3 is $\tfrac{1}{3}(2^{n-1} - 1)$ for odd n, and $\tfrac{1}{3}(2^{n-1} - 2) - \tfrac{1}{9}(2^{n-1} + 2^{2s} + 2^{n-2s} - 7)$, where $s = \lfloor n/4 \rfloor$, for even n. If n is odd, the unique Hamiltonian graph that achieves the maximum number of 3-colorings is $C_n$; if n is even, the unique Hamiltonian graph that achieves the maximum number of 3-colorings consists of $C_n$ and a chord joining two vertices x, y of $C_n$ such that their distance $d(x, y) = 2\lfloor n/4 \rfloor$ (that is, d(x, y) is even and maximal).

Again, the theorem remains true if restricted to planar graphs since the extremal graphs are both planar. In 1990, Tomescu [102] extended his 1972 result by characterizing the class of connected 3-chromatic graphs having the maximum number of p-colorings for p ≥ 3. Further, he obtained chromatic polynomials of connected 3- and 4-chromatic planar graphs that are maximal for positive integer-valued arguments. The result for 4-chromatic planar graphs is based on structural properties obtained by appealing to Grünbaum’s Theorem.

Kenneth Berman [103] considers 4-regular maps embedded in the plane, which he observes can be obtained by the union of a set of continuous closed curves in the plane which cross themselves or each other once at each point of intersection. He calls these curves cross-curves. Among his results is a formula for the number of face 3-colorings of a 4-regular map. His formula is given in terms of the numbers of cross-curves of a collection of planar 4-regular maps derived from G.

A graph on n vertices and m edges is called an (n, m)-graph. Results on numbers of colorings for general (n, m)-graphs were obtained by E.M. Wright [104] and Felix Lazebnik [105]. Both authors consider two 3-colorings of a graph to be different if they induce two different partitions of the vertex set or if they only differ in a nonidentity permutation of the colors. Wright gives an asymptotic approximation to the total number of h-colorings of all (n, m)-graphs for fixed h where n is large. The results of Lazebnik include upper and lower bounds on the greatest number of 3-colorings for an (n, m)-graph.
14. Applications of 3-Coloring

14.1 Chvátal’s Art Gallery Theorem
Vašek Chvátal [106] answered a question posed by Victor Klee in 1973 by showing: If S is a subset of the plane bounded by a (not necessarily convex) polygon with n vertices, then there is a set T of at most n/3 points of S such that for any point p of S there is a point q of T with the segment pq lying entirely in S. If S is an art gallery with n straight walls, Chvátal’s result gives a bound, easily shown to be sharp, that no more than n/3 guards are needed to survey the entire gallery. Chvátal provides a clever, inductive proof that covers two and a half pages in the Journal of Combinatorial Theory. By using 3-coloring, Steve Fisk [107] provided the following ultrashort proof. Any triangulation of the interior of S obtained by connecting its vertices with non-intersecting diagonals is outerplanar and thus 3-colorable (Theorem 3.4), and the smallest color class T has at most n/3 vertices. Every point p of S lies in some triangle, and every triangle has a point q of T on it. Since triangles are convex, pq ⊆ S.

14.2 Computing with 3-Colorable Graphs

Ronald Read and Colin Wright in their idea paper [82] suggest the “appealingly bizarre” use of 3-colorable graphs to perform logical operations and, by extension, arbitrarily complex computations. They show how to construct a “computer graph,” a 3-colorable graph where two colors represent true and false, and where the third color has no logical connotation but rather is used in what might be called the “internal workings” of the computer graph. Two sets of the vertices are distinguished as “input vertices” and “output vertices.” There is also a set of three “reference vertices” which are used to constrain the colors of other vertices. For example, all the input and output vertices are restricted to the two colors representing the two truth values. The computer graph “operates” as follows.
The computer input corresponds to a given coloring of the “input vertices.” The computation is effected by extending the coloring of the input vertices and the reference vertices to the entire graph. The resultant coloring of the “output vertices” corresponds to the computer output.

Figure 18: Computer graphs.

The authors construct computer graphs for the basic logical operations of negation (Figure 18a), AND (Figure 18b), and OR (Figure 18c). They provide an informal proof that, by cascading these three basic elements, a graph G(C) can be constructed corresponding to any given collection C of clauses such that G(C) is 3-colorable if and only if C is satisfiable, thus showing that the NP-complete problem satisfiability can be reduced to 3-coloring. (Compare with [79] and [80].)
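To make the idea of a computer graph concrete, here is a minimal Python sketch of a negation element. The gadget used is an assumption: it is the simplest construction consistent with the description above (three mutually adjacent reference vertices, with input and output both joined to the "neutral" reference vertex and to each other) and is not necessarily the graph drawn in Figure 18a. All names in the code are hypothetical.

```python
from itertools import product

TRUE, FALSE, NEUTRAL = 0, 1, 2   # the three colors


def extensions(vertices, edges, fixed):
    """All proper 3-colorings of the graph that agree with `fixed`."""
    free = [v for v in vertices if v not in fixed]
    found = []
    for choice in product(range(3), repeat=len(free)):
        coloring = dict(fixed)
        coloring.update(zip(free, choice))
        if all(coloring[u] != coloring[v] for u, v in edges):
            found.append(coloring)
    return found


# Assumed negation gadget: reference triangle t, f, n; input x; output y.
VERTICES = ["t", "f", "n", "x", "y"]
EDGES = [
    ("t", "f"), ("f", "n"), ("t", "n"),   # reference vertices get distinct colors
    ("x", "n"), ("y", "n"),               # x and y may only be TRUE or FALSE
    ("x", "y"),                           # y must differ from x
]


def negate(x_value):
    """Set of colors the output can take once the input color is fixed."""
    fixed = {"t": TRUE, "f": FALSE, "n": NEUTRAL, "x": x_value}
    return {c["y"] for c in extensions(VERTICES, EDGES, fixed)}
```

Fixing the input color leaves exactly one admissible color for the output, which is the sense in which extending a coloring "computes" the negation.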
Read and Wright report on a computer graph program which accepts two integers, constructs the corresponding computer graph to perform the multiplication, and 3-colors it via the Petford and Welsh [85] algorithm. By running the computer graph backward, i.e., coloring the output vertices and determining the color of the input vertices, they find the two factors of the number. They report with some regret that computer graphs tend to fall into both classes of graphs for which Petford and Welsh found their algorithm to be inefficient (see §10). Specifically, computer graphs tend to have a large disparity between the degree of the reference vertices and the other vertices, and there are many vertices of degree 5.
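The Petford-Welsh heuristic of §10 that Read and Wright rely on can be sketched in a few lines of Python. This is a hedged illustration, not the authors' program: the function name, the step limit, and the adjacency-dict input format are assumptions, and so is the transition function, which here recolors a bad vertex with probability proportional to 4^(-s_i), where s_i is the number of neighbors currently colored i. Any non-negative weights summing to 1 after normalization satisfy the conditions the heuristic imposes on p.

```python
import random


def petford_welsh(adj, max_steps=100_000, base=4.0, rng=None):
    """Randomized 3-coloring heuristic (antivoter-style sketch).

    adj: dict mapping each vertex to the set of its neighbors.
    Returns a proper coloring {vertex: 0|1|2}, or None if the step limit
    is reached (the graph may still be 3-colorable in that case).
    """
    rng = rng or random.Random()
    color = {v: rng.randrange(3) for v in adj}                 # step 1
    for _ in range(max_steps):
        bad = [v for v in adj
               if any(color[u] == color[v] for u in adj[v])]   # step 2
        if not bad:
            return color                                       # step 3: done
        v = rng.choice(bad)
        s = [0, 0, 0]
        for u in adj[v]:
            s[color[u]] += 1
        weights = [base ** (-s_i) for s_i in s]                # assumed p
        color[v] = rng.choices(range(3), weights=weights)[0]   # step 4
    return None                                                # step 5: give up


# Usage: a 5-cycle, which is 3-colorable.
c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
coloring = petford_welsh(c5, rng=random.Random(0))
```

A return value of None is only a claim that no coloring was found within the limit; as the text notes, that claim can be erroneous.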
15. Nowhere-Zero 3-Flows

A nowhere-zero k-flow (k ≥ 2) is an orientation of the edges of a graph together with a mapping from the edges to the set of integer weights {1, 2, 3, ..., k − 1} such that, at every vertex, the sum of the weights on the edges directed out is equal to the sum of the weights on the edges directed in. A mod k-orientation is an (unweighted) orientation of the edges of a graph such that, at every vertex, the number of edges directed out minus the number of edges directed in is congruent to zero modulo k. A useful result is: A graph has a nowhere-zero 3-flow if and only if it has a mod 3-orientation (see [108]). Mod 3-orientations are somewhat easier to work with than nowhere-zero 3-flows. (Space does not allow us to develop here the full theory of nowhere-zero flows. The interested reader may wish to consult the comprehensive survey by Jaeger [109].)
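Both definitions are local conditions that can be verified vertex by vertex. The Python sketch below is an illustration only; the function names and the representation of an orientation as a list of (tail, head) pairs are assumptions. It checks the two conditions on the complete bipartite graph K_{3,3}, using one orientation that is a mod 3-orientation and a different weighted orientation that is a nowhere-zero 3-flow.

```python
def net_outflow(vertices, arcs, weights=None):
    """Weighted out-sum minus in-sum at every vertex.

    arcs is a list of (tail, head) pairs; weights defaults to all ones.
    """
    weights = weights or [1] * len(arcs)
    net = {v: 0 for v in vertices}
    for (tail, head), w in zip(arcs, weights):
        net[tail] += w
        net[head] -= w
    return net


def is_nowhere_zero_k_flow(vertices, arcs, weights, k):
    """Weights lie in {1, ..., k-1} and flow is conserved at each vertex."""
    in_range = all(1 <= w <= k - 1 for w in weights)
    balanced = all(x == 0 for x in net_outflow(vertices, arcs, weights).values())
    return in_range and balanced


def is_mod_k_orientation(vertices, arcs, k):
    """Out-degree minus in-degree is divisible by k at each vertex."""
    return all(x % k == 0 for x in net_outflow(vertices, arcs).values())


A = ["a0", "a1", "a2"]
B = ["b0", "b1", "b2"]
V = A + B

# Mod 3-orientation of K_{3,3}: every edge directed from side A to side B.
all_a_to_b = [(a, b) for a in A for b in B]

# Nowhere-zero 3-flow: matching edges b_i -> a_i carry 2, the rest carry 1.
flow_arcs = [(f"b{i}", f"a{i}") for i in range(3)] + \
            [(f"a{i}", f"b{j}") for i in range(3) for j in range(3) if i != j]
flow_weights = [2] * 3 + [1] * 6
```

The two certificates use different orientations of the same graph, which illustrates that the equivalence stated above is about existence, not about a single orientation serving both roles.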
W.T. Tutte tied together colorings and flows in 1954 with the following result [110]: A 2-edge-connected graph embedded in the plane has a nowhere-zero k-flow if and only if it is face colorable in k colors. (An elementary proof can be found in [111] (see Lemma 1.6). A short proof in terms of mod k flows can be found in [108].) A cut of a graph G = (V, E) is the subset of E defined by a nontrivial partition ($X_1$, $X_2$) of V, where each edge in the cut has one end in $X_1$ and one end in $X_2$. A k-cut is a cut of cardinality k. Observe that no graph with a 1-cut has a nowhere-zero k-flow for any k. Grötzsch’s Theorem in dual form states that every planar map without 1-cuts and 3-cuts has a face coloring in three colors. We thus have:

Theorem 15.1: Grötzsch’s Theorem (Flow Formulation)
Every planar graph with neither 1-cuts nor 3-cuts has a nowhere-zero 3-flow.

What happens if we remove the restriction of planarity? This idea of generalizing Grötzsch’s Theorem in dual form was first considered by Tutte (1972, unpublished):²⁰

Problem 15.2: Tutte’s 3-Flow Conjecture
Every graph without 1-cuts and 3-cuts has a nowhere-zero 3-flow.

Steinberg and Younger [51] showed that Tutte’s 3-Flow Conjecture is true for graphs embeddable in the real projective plane. In fact, they showed that at least one 3-cut can be allowed for projective planar graphs. We will need two additional definitions. A distinguished vertex in a graph M is a 4- or 5-vertex d, not a cut vertex, at which a mod 3-orientation has been specified; that is, the edges at d are directed so that the net outflow from d is congruent to zero modulo 3. For a 5-vertex d, one of the incident edges, called the minority edge, opposes the other four in direction. Of course, the minority edge corresponds to the special edge in the proof of Grünbaum’s Theorem (Theorem 6.1).

Theorem 15.3:
If G is a graph without 1-cuts that is either embedded in the plane and has at most three 3-cuts, or is embedded in the projective plane and has at most one 3-cut, then:
i. G has a mod 3-orientation.
ii. If G is embedded in the plane and has at most one 3-cut, then a mod 3-orientation may be specified arbitrarily on an arbitrarily selected 4- or 5-vertex, assuming the minority edge of the 5-vertex is not contained in a 3-cycle.

20. Tutte’s 3-Flow Conjecture follows the pattern of conjectures described in Tutte’s 1954 paper [110]; however, reference is made there only to specific k-flow conjectures for k = 4 and k = 5. While working on my Master’s thesis [50], Tutte told me that he had formulated the conjecture in 1972. Apparently, the first explicit reference to Tutte’s 3-Flow Conjecture is in [50].
Discussion of Proof: The proof is along the lines of Grötzsch [a], and Grünbaum [49] & Aksionov [63], although it is dualized to work with k-cuts rather than k-cycles. The proof is, again, by induction using the method of reducible configurations, culminating in the Grötzsch configuration (see Appendix) in dual form. As in Grötzsch’s Theorem (Theorem 5.1) and Grünbaum’s Theorem (Theorem 6.1), part ii is necessary for the proof of part i. The allowance for one 3-cycle on the projective plane is also necessary for the inductive proof. The reductions are similar to those of Grötzsch and Aksionov in dual form. In addition, the Steinberg-Younger proof includes a new reduction (“a 6-cut that contains a ‘zigzag’”) that is introduced to handle the additional embeddings that are possible in the projective plane. The reader is referred to the paper [51] for details.

Clearly, a nowhere-zero k-flow of a graph is also a nowhere-zero (k + 1)-flow of the graph. Jaeger [112] [113] has shown:

Theorem 15.4: Every graph without 1-cycles and 3-cycles has a nowhere-zero 4-flow.

Tutte’s 3-Flow Conjecture can be stated equivalently in the following form (see for example [111] (Lemma 2.2) or [113] (Proposition 6)), which shows that 2-cuts can be ignored:

Problem 15.5: Tutte’s 3-Flow Conjecture (Edge-Connectivity Formulation)

Every 4-edge-connected graph has a nowhere-zero 3-flow.

Jaeger has proposed the following two variations on Tutte’s Conjecture. The first is a relaxation, the second a generalization:

Problem 15.6: The Weak 3-Flow Conjecture

There exists an integer k such that every k-edge-connected graph has a nowhere-zero 3-flow.

Problem 15.7: The Circular Flow Conjecture

For all p ≥ 1, every 4p-edge-connected graph has a mod (2p + 1)-orientation.

The Weak 3-Flow Conjecture was proposed in [114]. (Compare with Problem 9.2.) A detailed discussion of the Circular Flow Conjecture (including an explanation of the conjecture’s name) can be found in [109].
Additional results on 3-flows appear in the paper of Cun-Quan Zhang [115].

Back to Basics

Using the flow theory, we now provide easy proofs of Theorems 3.1 and 3.2. We cast the two propositions into the face coloring formulations in which they originated: A graph embedded in the plane for which every vertex degree is a multiple of 3 is face colorable in three colors if every face has an even number of sides. A cubic graph embedded in the plane is face colorable in three colors if and only if every face has an even number of sides.

The proofs are as follows. When each face is even, every cycle has even length and the graph is vertex colorable in 2 colors, say 0 and 1. Direct each edge from 0 to 1. At each vertex all edges will have the same direction; since every vertex degree is a multiple of 3, the graph will have a mod 3-orientation, hence a face coloring in three colors. In the cubic case, for the faces to be 3-colorable, the faces adjacent to any given face must alternate in color, hence every face must have an even number of sides.
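The first half of this argument is constructive and can be sketched directly. The Python code below is an illustration under assumptions (the function names and the adjacency-dict input are hypothetical, and the graph is taken to be simple): it 2-colors a bipartite graph by breadth-first search, directs every edge from color 0 to color 1, and checks the mod 3 condition. It stops at the orientation and does not go on to build the face coloring.

```python
from collections import deque


def orient_by_bipartition(adj):
    """2-color the graph by BFS and direct each edge from side 0 to side 1.

    adj: dict mapping each vertex to the set of its neighbors.
    Returns a list of arcs (tail, head), or None if the graph has an
    odd cycle and therefore no 2-coloring.
    """
    side = {}
    for start in adj:
        if start in side:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for u in adj[v]:
                if u not in side:
                    side[u] = 1 - side[v]
                    queue.append(u)
                elif side[u] == side[v]:
                    return None
    return [(v, u) for v in adj for u in adj[v] if side[v] == 0]


def is_mod3_orientation(adj, arcs):
    """Out-degree minus in-degree is divisible by 3 at every vertex."""
    net = {v: 0 for v in adj}
    for tail, head in arcs:
        net[tail] += 1
        net[head] -= 1
    return all(x % 3 == 0 for x in net.values())


# The cube graph: cubic, planar, and every face is a 4-gon.
cube = {v: {v ^ 1, v ^ 2, v ^ 4} for v in range(8)}
arcs = orient_by_bipartition(cube)
```

At a vertex of degree 3m the net outflow is +3m or −3m depending on its side, which is exactly why the degree hypothesis is needed.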
Franklin’s 1939 proof of Theorem 3.1b [20] is similar to this proof, although the flow theory per se did not exist until Tutte introduced it fifteen years later [110]. However, Tutte had already shown in the 1940s that the original map-coloring formulation of the Three Color Theorem for planar graphs is but a special case of a theorem about general graphs on orientable surfaces [116] (see Theorem V): A cubic graph G has an embedding in an orientable surface as a 3-face-colorable map if and only if G is bipartite.²¹ It is interesting that Heawood himself did not consider this question: not only was he the first to give a clear statement of the Three Color Theorem, but it was he who first systematically investigated map coloring on higher surfaces [13] (although this idea, too, was anticipated by Kempe [11]). Two open questions include:
Problem 15.8: Is it true that if G is a projective planar graph without 1-cycles and with at most two 3-cycles, or with at most three 3-cycles, then G has a nowhere-zero 3-flow?

Problem 15.9: Is Tutte’s 3-Flow Conjecture true for graphs embeddable in surfaces of sufficiently low genus other than the plane?
Problems 15.8 and 15.9 arise naturally from Theorem 15.2. Problem 15.9 is further motivated by the fact that a related problem, called Tutte’s 5-Flow Conjecture [109] [110], was first proved for the projective plane [118] and then extended to surfaces of low genus [119].

16. Conclusions
We began this paper by investigating the origin of the Three Color Problem and found that many of the basic ideas originated with P.J. Heawood. We have described the major results and open questions, and have attempted throughout to emphasize the connections among various 3-coloring problems.

The concept of graph 3-coloring has been extended in several directions. A graph homomorphism φ from a graph G to a graph H is a mapping from the vertex set of G to the vertex set of H such that if ($v_1$, $v_2$) is an edge in G, then (φ$v_1$, φ$v_2$) is an edge in H. Graph homomorphisms can be considered generalizations of colorings; in particular, a 3-coloring of a graph G is equivalently a homomorphism from G to $K_3$. H.A. Maurer, I.H. Sudborough, and E. Welzl [120] have shown that for any odd cycle, the problem of determining whether there is a homomorphism from a graph G to the cycle is NP-complete, thus generalizing Theorem 10.1(i). Michael Albertson and Karen Collins [121] have shown the existence of an infinite family of 3-chromatic graphs which are pairwise non-homomorphic. Vašek Chvátal investigated the 3-coloring of random graphs and was able to show that almost all graphs with n vertices and (1.44 + o(1))n edges are 3-colorable [122]. Three-coloring of infinite graphs has been considered by Stephen Hechler [123] and Roger Eggleton [124]. Three-coloring of hypergraphs has been considered by J. Beck [125] and by Erdős and Lovász [126]. Robert Haas [127] has proposed a coloring theory for the elements of a group, where a coloring is defined as a group “non-homomorphism.” Haas’s definition attempts to mimic graph k-coloring as closely as possible. He found group 3-coloring to be the only mathematically interesting case. Undoubtedly there are analogous 3-coloring questions for other combinatorial structures.

21. Dan Archdeacon has further extended Tutte’s result [117] (Theorem 5.1).
One of the more exciting recent developments in 3-coloring is the announcement by Herbert Fleischner and Michael Stiebitz of a proof of the Strong Circle-Triangle Conjecture (Problem 11.3). Complete details can be found in their 1991 working paper [128]. In conclusion, it was pointed out at the conference Quo Vadis, Graph Theory? (August 1990, Fairbanks, Alaska) that essentially all planar 3-coloring problems have analogues in terms of nowhere-zero 3-flows. In this more general formulation these problems have for the most part been completely unexplored and would serve as excellent starting points for future research.
Acknowledgement

I am deeply grateful to W.T. Tutte and Horst Sachs for thorough readings of an earlier version of the manuscript; each had subtle corrections and excellent suggestions. I would also like to thank Noga Alon, Dan Archdeacon, Frank R. Bernhart, Avrim Blum, Vašek Chvátal, Michael Fellows, Steve Fisk, Herbert Fleischner, Branko Grünbaum, David Johnson, Felix Lazebnik, David Petford, Arvind Rajan, Kenneth H. Rosen, Richard E. Stone (who suggested the paper’s title), Bjarne Toft, Ioan Tomescu, Craig A. Tovey, Peter Ungar, and two anonymous referees. The accuracy and presentation of the material in this paper is of course the responsibility of the author alone.
References 0.Ore; The Four-Color Problem, Academic Press, New York,Chapter 13 (1%7). T. Saaty and P. Kainen; The Four-Color Problem: Assaults and Conquest, McGraw-Hill, New York (1977). K. Appel and W. Haken; Every planar map is four colorable. Part I: Discharging, Illinois Journal of Mathematics, 21,429-490 (1977). K. Appel, W. Haken, and J. Koch; Every planar map is four colorable. Part 11: Reducibility, Illinois Journal ofMathematics, 21,491-567 (1977). B. Toft; 75 graph-colouring problems, Chapter 2 in Graph Colourings, R. Nelson and R.J. Wilson (editors), John Wiley & Sons, New York, 9-35 (1990). E.B. Dynkin and V.A. Uspenskii; Mathematical Conversations, Gosudarstv. Izdat. Techn.-Teor. Lit.. Moscow and Leningrad (1952). (In Russian.) E.B. Dynkin and V.A. Uspenskii; Multicolor Problems, D.C. Heath and Company, Boston (1963). (English banslation of “Map Coloring Problems,” Part One of [6].) B. Toft; Graph colouring problems, Part 1, Institut for Matematik og Datalogi, Odense Universitet. Reprints. No. 2, Apnl(l987). J.A. Bondy and U.S.R. Mnrty; Graph Theory with Applications, Macmillan. London and Basingstoke (1976). A. Cayley; On the coloUring of maps, Proceedings of (he Royal Geographical Society (New Series), 1, 259-261 (1879). A.B. Kempe; On the geographical problem of the four colours, American Journal of Mathematics, 2, 193-200 (1879). N.L. Biggs, E.K. Lloyd, and R.J. Wilson; Graph Theory: 17361936,Oxford University Press, London (1976). P.J. Heawood; Map-colour theorem, Quarterly Journal of Pure and Applied Mathematics. 24,332-338 (1890). P.J. Heawood;On the four-colour map theorem; Quarterly Journal of Pure and Applied Mathematics, 29. 27CLB5 (1898). P.G. Tait; Remarks on the previous communication,Proceedings of the Royal Society of Edinburgh, 10, 729 (1878-1880). G.A. Dirac; Percy John Heawood, Journal of the London Mathematical Society, 38,263-277 (1963).
[17] W. Ahrens; Mathematische Unterhaltungen und Spiele, Teubner, Leipzig (1901).
[18] A. Sainte-Laguë; Géométrie de situation et jeux, Mémorial des Sciences Mathématiques, Fasc. 41, Gauthier-Villars, Paris (1929).
[19] D. König; Theorie der endlichen und unendlichen Graphen, Akademische Verlagsgesellschaft, Leipzig (1936).
[20] P. Franklin; The four color problem, Scripta Mathematica, 6, 149-156 & 197-210 (1939).
[21] H. Król; On a sufficient and necessary condition of 3-colorableness for the planar graphs. I, Prace Naukowe Instytutu Matematyki i Fizyki Teoretycznej Politechniki Wrocławskiej, Seria Studia i Materiały, No. 6, Zagadnienia kombinatoryczne, 37-40 (1972).
[22] H. Król; On a sufficient and necessary condition of 3-colorableness for the planar graphs. II, Prace Naukowe Instytutu Matematyki i Fizyki Teoretycznej Politechniki Wrocławskiej, Seria Studia i Materiały, No. 9, Grafy i hypergrafy, 49-54 (1973).
[23] V.P. Homenko and N.V. Lysenko; Necessary and sufficient conditions for n-colorings of graphs, in Graph Theory, N.P. Homenko (editor), Izdanie Inst. Mat., Akad. Nauk Ukrain. SSR, Kiev, 107-114 (1977). (In Russian.)
[24] N.I. Martinov; 3-colorable planar graphs, Serdica, 3, 11-16 (1977). (In Russian.)
[25] L.I. Golovina and I.M. Yaglom; Induction in Geometry, second edition, revised, Gosudarstv. Izdat. Fiz.-Mat. Lit., Moscow (1961). (In Russian.)
[26] L.I. Golovina and I.M. Yaglom; Induction in Geometry, Topics in Mathematics, D.C. Heath and Company, Boston (1963). (English translation of [25].)
[27] H. Fleischner; Über endliche, ebene Eulersche und paare, kubische Graphen, Monatshefte für Mathematik, 74, 410-420 (1970).
[28] F. Holroyd and F. Loupekine; The four colour problem is not dead, Chapter 3 in Graph Colourings, R. Nelson and R.J. Wilson (editors), Longman Scientific & Technical, Essex, England and John Wiley & Sons, New York, 37-44 (1990).
[29] S. Fisk; Cobordism and functoriality of colorings, Advances in Mathematics, 37, 177-211 (1980).
[30] S. Fisk; Coloring Theories, Contemporary Mathematics, Volume 103, American Mathematical Society, Providence, Rhode Island (1989).
[31] D. König; On map coloring, Mathematikai és Physikai Lapok, 14, 193-200 (1905). (In Hungarian.)
[32] R.L. Brooks; On colouring the nodes of a network, Proceedings of the Cambridge Philosophical Society, 37, 194-197 (1941).
[33] J.B. Kelly and L.M. Kelly; Paths and circuits in critical graphs, American Journal of Mathematics, 76, 786-792 (1954).
[34] G.A. Dirac; A theorem of R.L. Brooks and a conjecture of H. Hadwiger, Proceedings of the London Mathematical Society, Series 3, 7, 161-195 (1957).
[35] L. Lovász; Three short proofs in graph theory, Journal of Combinatorial Theory, Series B, 19, 269-271 (1975).
[36] B. Descartes; A three-colour problem, Eureka, 9, April, 21 (1947).
[37] [B. Descartes]; A three-colour problem, Solutions to Problems in Eureka No. 9, Eureka, 10, March, 24 (1948).
[38] A.A. Zykov; On some properties of linear complexes, Matematičeskii Sbornik N.S., 24(66), 163-188 (1949). (In Russian.) American Mathematical Society Translation No. 79 (1952). Translation republished in: Algebraic Topology, Translations, Series 1, Vol. 7, 418-449, American Mathematical Society, Providence, Rhode Island (1962).
[39] [P. Ungar]; Problem 4526, Advanced Problems and Solutions, American Mathematical Monthly, 60, 123 & (corrected) 336 (1953).
[40] P. Ungar; private communication, January (1991).
[41] [B. Descartes]; k-chromatic graphs without triangles, Solution to Problem 4526, Advanced Problems and Solutions, American Mathematical Monthly, 61, 352-353 (1954).
[42] J. Mycielski; Sur le coloriage des graphes, Colloquium Mathematicum, 3, 161-162 (1955).
[43] V. Chvátal; The minimality of the Mycielski graph, in Graphs and Combinatorics, R.A. Bari and F. Harary (editors), Proceedings of the Capital Conference on Graph Theory and Combinatorics, George Washington University, June 1973, Lecture Notes in Mathematics, 406, Springer-Verlag, Berlin, 243-246 (1974).
[44] P. Erdős; Graph theory and probability, Canadian Journal of Mathematics, 11, 34-38 (1959).
[45] L. Lovász; On chromatic number of finite set-systems, Acta Mathematica Academiae Scientiarum Hungaricae, 19, 59-67 (1968).
[46] P. Erdős; On circuits and subgraphs of chromatic graphs, Mathematika, 9, 170-175 (1962).
[47] H. Sachs; Zur Theorie der diskreten Gebilde. Ein Beitrag zur Würdigung des graphentheoretischen Werkes des Jubilars, Wissenschaftliche Zeitschrift der Martin-Luther-Universität Halle-Wittenberg, Mathematisch-Naturwissenschaftliche Reihe, 37, 116-121 (1988).
[48] H. Grötzsch; Ein Dreifarbensatz für dreikreisfreie Netze auf der Kugel, Wissenschaftliche Zeitschrift der Martin-Luther-Universität Halle-Wittenberg, Mathematisch-Naturwissenschaftliche Reihe, 8, 109-120 (1958/1959). [Note: Page 120 is blank.]
[49] B. Grünbaum; Grötzsch's theorem on 3-colorings, Michigan Mathematical Journal, 10, 303-310 (1963).
[50] R. Steinberg; Grötzsch's Theorem Dualized, M. Math. Thesis, University of Waterloo, Ontario, Canada (1976).
[51] R. Steinberg and D.H. Younger; Grötzsch's theorem for the projective plane, Ars Combinatoria, 28, 15-31 (1989).
[52] A.T. White; Graphs, Groups and Surfaces, Revised Edition, North-Holland Mathematics Studies, 8, North-Holland, Amsterdam (1984).
[53] H.V. Kronk and A.T. White; A 4-color theorem for toroidal graphs, Proceedings of the American Mathematical Society, 34, 83-86 (1972).
[54] H. Kronk; The chromatic number of triangle-free graphs, in Graph Theory and Applications, Y. Alavi, D.R. Lick, and A.T. White (editors), Proceedings of the Conference at Western Michigan University, Kalamazoo, Michigan, May, 1972, Lecture Notes in Mathematics, 303, Springer-Verlag, Berlin, 179-181 (1972).
[55] R.J. Cook; Chromatic number and girth, Periodica Mathematica Hungarica, 6, 103-107 (1975).
[56] R.L. Woodburn; A 4-color theorem for the Klein bottle, Discrete Mathematics, 76, 271-276 (1989).
[57] E. Kaiser; Färbungssätze für Graphen auf der projektiven Ebene, dem Torus und dem Kleinschen Schlauch, Wissenschaftliche Zeitschrift der Technischen Hochschule Ilmenau, 20, 47-53 (1974).
[58] V.A. Aksionov; private communications, June 23, October 16 (1976).
[59] C. Berge; Les problèmes de coloration en théorie des graphes, Publications de l'Institut de Statistique de l'Université de Paris, 9, 123-160 (1960).
[60] C. Berge; Färbung von Graphen, deren sämtliche bzw. deren ungerade Kreise starr sind, Wissenschaftliche Zeitschrift der Martin-Luther-Universität Halle-Wittenberg, Mathematisch-Naturwissenschaftliche Reihe, 10, 114-115 (1961).
[61] A. Tucker; Critical perfect graphs and perfect 3-chromatic graphs, Journal of Combinatorial Theory, Series B, 23, 143-149 (1977).
[62] H. Sachs; Einführung in die Theorie der endlichen Graphen. II, Teubner, Leipzig (1972).
[63] V.A. Aksionov; Concerning the extension of the 3-coloring of planar graphs, Diskretnyi Analiz, 26, 3-19 (1974). (In Russian.)
[64] V.A. Aksionov and L.S. Mel'nikov; Essay on the theme: the three-color problem, Combinatorics, Colloquia Mathematica Societatis János Bolyai, 18, 23-34 (1978).
[65] R. Steinberg; The Three Color Problem, B.A. Thesis, Reed College, Portland, Oregon (1975).
[66] V.A. Aksionov and L.S. Mel'nikov; Some counterexamples associated with the three-color problem, Journal of Combinatorial Theory, Series B, 28, 1-9 (1980).
[67] I. Havel; On a conjecture of B. Grünbaum, Journal of Combinatorial Theory, 7, 184-186 (1969).
[68] I. Havel; The coloring of planar graphs by three colors, Mathematics (Geometry and Graph Theory), Univ. Karlova, Prague, 89-91 (1970). (In Czech with English summary.)
[69] P. Erdős; informal discussion during the conference "Quo Vadis, Graph Theory?" University of Alaska, Fairbanks, Alaska, August (1990).
[70] S.A. Burr, P. Erdős, and L. Lovász; On graphs of Ramsey type, Ars Combinatoria, 1, 167-190 (1976).
[71] H.-K. Ma; A counterexample to the conjecture of Aksionov and Mel'nikov on non-3-colorable planar graphs, Journal of Combinatorial Theory, Series B, 36, 218-219 (1984).
[72] G. Chartrand and D.P. Geller; On uniquely colorable planar graphs, Journal of Combinatorial Theory, 6, 271-278 (1969).
[73] V.A. Aksionov; On uniquely 3-colorable planar graphs, Discrete Mathematics, 20, 209-216 (1977).
[74] V.A. Aksionov; Chromatic connected vertices in planar graphs, Diskretnyi Analiz, 31, 5-16 (1977). (In Russian.)
[75] D.L. Greenwell; Semi-uniquely n-colorable graphs, in Proceedings of The Second Louisiana Conference on Combinatorics, Graph Theory and Computing, R.C. Mullin, K.B. Reid, D.P. Roselle, and R.S.D. Thomas (editors), (Louisiana State University, Baton Rouge, March, 1971), Louisiana State University, Baton Rouge, 253-256 (1971).
[76] L.S. Mel'nikov and R. Steinberg; One counterexample for two conjectures on three coloring, Discrete Mathematics, 20, 203-206 (1977).
[77] L.J. Osterweil; Some classes of uniquely 3-colorable graphs, Discrete Mathematics, 8, 59-69 (1974).
[78] M.R. Garey and D.S. Johnson; Computers and Intractability, W.H. Freeman, San Francisco (1979).
[79] L. Stockmeyer; Planar 3-colorability is polynomial complete, SIGACT News (ACM Publication), 5, July, 19-25 (1973).
[80] M.R. Garey, D.S. Johnson, and L. Stockmeyer; Some simplified NP-complete graph problems, Theoretical Computer Science, 1, 237-267 (1976).
[81] A.K. Dewdney; Linear time transformations between combinatorial problems, International Journal of Computer Mathematics, 11, 91-110 (1982).
[82] R.C. Read and C.D. Wright; Computing with three-colourable graphs: a survey, Ars Combinatoria, Series B, 29, 225-234 (1990).
[83] G. Ehrlich, S. Even, and R.E. Tarjan; Intersection graphs of curves in the plane, Journal of Combinatorial Theory, Series B, 21, 8-20 (1976).
[84] D.S. Johnson; Worst case behavior of graph coloring algorithms, in Proceedings of the Fifth Southeastern Conference on Combinatorics, Graph Theory, and Computing, F. Hoffman, R.B. Levow and R.S.D. Thomas (editors), Congressus Numerantium X, Utilitas Mathematica Publishing, Winnipeg, Manitoba, 513-527 (1974).
[85] A.D. Petford and D.J.A. Welsh; A randomised 3-colouring algorithm, Discrete Mathematics, 74, 253-261 (1989).
[86] G. Hajós; Über eine Konstruktion nicht n-färbbarer Graphen, Wissenschaftliche Zeitschrift der Martin-Luther-Universität Halle-Wittenberg, Mathematisch-Naturwissenschaftliche Reihe, 10, 116-117 (1961).
[87] A.J. Mansfield and D.J.A. Welsh; Some colouring problems and their complexity, Annals of Discrete Mathematics, 13, 159-170 (1982).
[88] A. Blum; Algorithms for Approximate Graph Coloring, Ph.D. Thesis, Massachusetts Institute of Technology, Cambridge, Massachusetts (1991).
[89] H. Sachs; Problem, in Mathematica Balkanica, 4, 536 (1974).
[90] G. Koester; Bemerkung zu einem Problem von H. Grötzsch, Wissenschaftliche Zeitschrift Univ. Halle, 33, 129 (1984).
[91] G. Koester; Coloring problems on a class of 4-regular graphs, in Graphs, Hypergraphs and Applications, H. Sachs (editor), Proceedings of the Conference on Graph Theory, Eyba 1984, Teubner-Texte zur Mathematik, 73, Teubner, Leipzig, 102-105 (1985).
[92] G. Koester; Note to a problem of T. Gallai and G.A. Dirac, Combinatorica, 5, 227-228 (1985).
[93] F. Jaeger; Sur les graphes couverts par leurs bicycles et la conjecture des quatre couleurs, in Problèmes combinatoires et théorie des graphes, J.-C. Bermond, J.-C. Fournier, M. Las Vergnas and D. Sotteau (editors), Edition Centre National Recherche Scientifique, Paris, 243-247 (1978). (English summary.)
[94] M.R. Fellows; Six Problems, in Graphs and Algorithms, R. Bruce Richter (editor), Contemporary Mathematics, 89, American Mathematical Society, Providence, Rhode Island, 187-190 (1989).
[95] M.R. Fellows; Transversals of vertex partitions in graphs, SIAM Journal on Discrete Mathematics, 3, 206-215 (1990).
[96] N. Alon; The strong chromatic number of a graph, Random Structures and Algorithms, 3, 1-7 (1992).
[97] N. Alon; The linear arboricity of graphs, Israel Journal of Mathematics, 62, 311-325 (1988).
[98] [D.I.A. Cohen]; Problem E 1726, Problems and Solutions, American Mathematical Monthly, 71, 912 (1964).
[99] [D. Cohen, H.M. Gehman, M.S. Klamkin and R.F. Jolly]; Planar maps of convex countries, Solution to Problem E 1726, Problems and Solutions, American Mathematical Monthly, 72, 904 (1965).
[100] I. Tomescu; Le nombre maximal de 3-colorations d'un graphe connexe, Discrete Mathematics, 1, 351-356 (1972).
[101] I. Tomescu; Le nombre maximal de colorations d'un graphe Hamiltonien, Discrete Mathematics, 16, 353-359 (1976).
[102] I. Tomescu; Maximal chromatic polynomials of connected planar graphs, J. Graph Theory, 14, 101-110 (1990).
[103] K.A. Berman; Three-colourings of planar 4-valent maps, J. Combinatorial Theory, Series B, 30, 82-88 (1981).
[104] E.M. Wright; Counting coloured graphs III, Canadian J. Mathematics, 24, 82-89 (1972).
[105] F. Lazebnik; On the greatest number of 2 and 3 colorings of a (V,E)-graph, J. Graph Theory, 13, 203-214 (1989).
[106] V. Chvátal; A combinatorial theorem in plane geometry, J. Combinatorial Theory, Series B, 18, 39-41 (1975).
[107] S. Fisk; A short proof of Chvátal's watchman theorem, J. Combinatorial Theory, Series B, 24, 374 (1978).
[108] D.H. Younger; Integer flows, J. Graph Theory, 7, 349-357 (1983).
[109] F. Jaeger; Nowhere-zero flow problems, Chapter 4 in Selected Topics in Graph Theory, 3, L.W. Beineke and R.J. Wilson (editors), Academic Press, London, 71-95 (1988).
[110] W.T. Tutte; A contribution to the theory of chromatic polynomials, Canadian J. Mathematics, 6, 80-91 (1954).
[111] R. Steinberg; Flows, Colorings and Embeddings of Graphs, Ph.D. Thesis, University of Waterloo, Ontario, Canada (1978).
[112] F. Jaeger; On nowhere-zero flows in multigraphs, in Proceedings of the Fifth British Combinatorial Conference, C. St. J. A. Nash-Williams and J. Sheehan (editors), (University of Aberdeen, Aberdeen, July 1975), Congressus Numerantium, XV, Utilitas Mathematica, Winnipeg, 373-378 (1976).
[113] F. Jaeger; Flows and generalized coloring theorems in graphs, J. Combinatorial Theory, Series B, 26, 205-216 (1979).
[114] F. Jaeger; On circular flows in graphs, Finite and Infinite Sets, Colloquia Mathematica Societatis János Bolyai, 37, 391-402 (1982).
[115] C.-Q. Zhang; Minimum cycle coverings and integer flows, J. Graph Theory, 14, 537-546 (1990).
[116] W.T. Tutte; On the imbedding of linear graphs in surfaces, Proceedings of the London Mathematical Society, Series 2, 51, 474-483 (1949).
[117] D. Archdeacon; Face colorings of embedded graphs, J. Graph Theory, 8, 387-398 (1984).
[118] R. Steinberg; Tutte's 5-flow conjecture for the projective plane, J. Graph Theory, 8, 277-285 (1984).
[119] M. Möller, H.G. Carstens, and G. Brinkmann; Nowhere-zero flows in low genus graphs, J. Graph Theory, 12, 183-190 (1988).
[120] H.A. Maurer, I.H. Sudborough, and E. Welzl; On the complexity of the general coloring problem, Information and Control, 51, 128-145 (1981).
[121] M.O. Albertson and K.L. Collins; Homomorphisms of 3-chromatic graphs, Discrete Mathematics, 54, 127-132 (1985).
[122] V. Chvátal; Almost all graphs with 1.44n edges are 3-colorable, Random Structures and Algorithms, 2, 11-28 (1991).
[123] S.H. Hechler; On infinite graphs with a specified number of colorings, Discrete Mathematics, 19, 241-255 (1977).
[124] R.B. Eggleton; New results on 3-chromatic prime distance graphs, Ars Combinatoria, Series B, 26, 153-180 (1988).
[125] J. Beck; On 3-chromatic hypergraphs, Discrete Mathematics, 24, 127-137 (1978).
[126] P. Erdős and L. Lovász; Problems and results on 3-chromatic hypergraphs and some related questions, Infinite and Finite Sets, Volume II, Colloquia Mathematica Societatis János Bolyai, 10, 609-627 (1975).
[127] R. Haas; Three-colorings of finite groups or an algebra of nonequalities, Mathematics Magazine, 63, 211-225 (1990).
[128] H. Fleischner and M. Stiebitz; A solution to a coloring problem of P. Erdős, Institut for Matematik og Datalogi, Odense Universitet, Preprints, No. 8, September (1991).
APPENDIX: Outline of Proof of Grünbaum's Theorem (Theorem 6.1)

The Theorem is proved in two propositions. Proposition 1 is essentially the assertion that the Theorem is true if it is true for the restricted class of graphs in which every face is a 3-gon, 4-gon or 5-gon. Proposition 2 is the Theorem stated for the restricted class.

The proof of Proposition 1 is simple and constructive. Given any k-gon F where k ≥ 6, a vertex inserted in the interior of F can always be joined by two edges to two vertices of F such that: (i) F is reduced to two faces each with fewer than k edges and, (ii) no new 3-cycles are created. In this way, we can construct a supergraph H of G embedded in the plane whose faces are 3-gons, 4-gons, and 5-gons, where every 3-cycle of H is a 3-cycle of G.

Proposition 2 is rather more involved. It is proved by induction on the number of edges of G. The reductions for G are as follows:

(i) Separating 3- and 4-cycles: If G contains a separating 3- or 4-cycle C, then by the induction hypothesis, we can color (in accordance with the conditions of the Theorem) the component of C containing the distinguished face D or the greater number of 3-cycles. By the induction hypothesis, the coloring of this first component can be extended to the second component, with C taken as the distinguished face if C is a 4-cycle.

(ii) Separating 5-cycles: If G contains a separating 5-cycle C, a 3-coloring of G can be obtained as in reduction (i) by extending the coloring from one component of C to the other, unless the special edge of C is contained in a 3-cycle of the second component. It was Aksionov's master stroke to handle this case by connecting two vertices of C by a new edge so as to create a new 3-cycle in the first component (see [63] for details). There are several subcases to consider, each of which resolves such that any coloring of the first component can be extended to a coloring of the second component.

(iii) 4-gons: A pair of opposite vertices of the 4-gon are identified; by previous reductions, no new 3-cycles can be produced. From a coloring of the reduced graph, a coloring of the original graph can be obtained. Special consideration must be given to cases involving the distinguished face.

(iv) Vertices of degree less than 3: A vertex of degree 2 and its incident edges are replaced by a single edge, or the two incident edges are contracted.

(v) 3-gons having one or more vertices of degree 3: By previous reductions, a 3-vertex incident with a 3-gon must be incident also with two 5-gons. Several reductions are considered to cover every possible case, some involving deleting the 3-vertex and identifying two of its neighbors contained in one of the 5-gons, others deleting the vertex and adding a new edge.

(vi) The Grötzsch configuration: a 5-gon F_0 adjacent to five 5-gons, where four of the vertices of F_0 have degree 3 and the fifth vertex of F_0 has degree 3, 4, or 5. Let a 5-gon be special if it has at least four vertices of degree 3. It follows from previous reductions that every special 5-gon is adjacent to five 5-gons. By Euler's formula it follows from the previous reductions that there are at least 8 special 5-gons in G. At least one of the special 5-gons, F_0, is situated such that the corresponding Grötzsch configuration does not include the distinguished face, and is sufficiently distanced from the distinguished face, that the Grötzsch configuration can be reduced in two different ways. At least one of these reduced graphs is contained in the class of graphs considered by Proposition 2. By the induction hypothesis, the reduced graph is colorable, hence the original graph is colorable.

This establishes Proposition 2; by Proposition 1 the Theorem follows.
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 249-260 (1993)
© 1993 Elsevier Science Publishers B.V. All rights reserved.
RANKING PLANAR EMBEDDINGS USING PQ-TREES
Almira KARABEG Department of Mathematical Sciences, Oakland University Rochester, Michigan, U.S.A.
Abstract
In this paper we describe procedures, based on a level embedding of a given planar graph G, for enumerating and linearly ordering all different embeddings of the graph. PQ-trees used in the embedding algorithm are natural structures for the solution of the above problems since they provide for a simple counting recurrence, and reduce the ordering problem to, essentially, ranking and unranking permutations.
1. Introduction
The problem that we consider in this paper is how to enumerate, linearly order, rank and unrank all planar embeddings of any fixed biconnected planar graph G. Our approach is based on the Lempel, Even and Cederbaum planarity testing algorithm [1] and the PQ-tree data structure [2]. Vo, Dick and Williamson are concerned with the same problem in [3], but they use a different approach. Their algorithm is based on the set of connected components of the segment graphs of some depth first cycle basis of G. The method presented here is simpler and more intuitive.

In Section 2 we give some basic definitions and notation. In particular, we introduce a new definition of embedding and show that this definition is equivalent to that used in [3]. In Section 3 we describe how to construct an embedding of G using a PQ-tree data structure. Our algorithm for constructing one embedding of the graph G is similar to that in [4], but we use a somewhat different approach and rely on our terminology from Section 2. In Section 4 we derive a new formula for computing the number of isotopically different planar embeddings of a given planar biconnected graph and show that the number of embeddings can be obtained by a very simple alteration of the PQ-tree planarity testing algorithm. In Section 5 we give efficient algorithms for linearly ordering, ranking and unranking planar embeddings of G.
2. Definitions and Notation

Let G = (V, E), |V| = n, be a biconnected planar graph.

Definition 2.1: An st-numbering is a numbering of the vertices of a graph G = (V, E), |V| = n, with numbers 1, 2, ..., n, such that
1. there is an edge {s, t} ∈ E such that s receives number 1 and t receives number n, and
2. every other vertex, numbered j with 1 < j < n, is adjacent to at least one lower numbered vertex and at least one higher numbered vertex.

Lemma 2.1: (Lempel, Even, Cederbaum) An st-numbering exists if and only if the graph G is biconnected.

Thus any biconnected graph G permits an st-numbering. Having assigned an st-numbering, each edge {v, w} ∈ E can be oriented from v to w if v has a smaller st-number than w. The oriented edge will be denoted by (v, w). Vertex v will be called a tail vertex and vertex w a head vertex of the edge (v, w).
Definition 2.2: A directed graph, together with an st-numbering, that is obtained from a graph G by orienting edges in the above described manner, is called an st-graph, and denoted by st-G.
A linear time algorithm for finding an st-numbering is given by Even and Tarjan in [5].
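Definition 2.1 and the orientation rule can be checked mechanically. The following is a minimal sketch, not the linear time algorithm cited above: it only verifies a proposed numbering and orients the edges. The function names and the input format (an edge list plus a dictionary of labels) are assumptions made for illustration, and the graph is assumed to be simple.

```python
def is_st_numbering(edges, number):
    """Check the two conditions of Definition 2.1.

    edges  -- iterable of pairs of vertex names (hypothetical input format)
    number -- dict mapping each vertex name to its proposed label in 1..n
    """
    n = len(number)
    if sorted(number.values()) != list(range(1, n + 1)):
        return False
    labelled = [tuple(sorted((number[u], number[v]))) for u, v in edges]
    # Condition 1: the vertices labelled 1 and n are joined by an edge {s, t}.
    if (1, n) not in labelled:
        return False
    # Condition 2: every inner label has a lower and a higher neighbour.
    has_lower = set()
    has_higher = set()
    for a, b in labelled:
        if a != b:                 # ignore loops; simple graphs are assumed
            has_higher.add(a)      # a is adjacent to the higher label b
            has_lower.add(b)       # b is adjacent to the lower label a
    return all(j in has_lower and j in has_higher for j in range(2, n))


def orient(edges, number):
    """Return the edges of st-G as sorted (tail, head) pairs of labels."""
    return sorted(tuple(sorted((number[u], number[v]))) for u, v in edges)
```

For a 4-cycle a, b, c, d the labels a=1, b=2, c=3, d=4 pass the check, while swapping the labels of b and c leaves the vertex labelled 2 without a lower numbered neighbour.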
Definition 2.3: Given a multiset U = {a_1, a_2, ..., a_m}, the class of PQ-trees over U is the class of all rooted, ordered trees whose leaves are labeled by U, and whose internal nodes are distinguished as P-nodes or Q-nodes. A PQ-tree is proper when the following holds:
1. every P-node has at least two children, and
2. every Q-node has at least three children.
Two PQ-trees are equivalent if and only if one can be transformed into the other by applying zero or more equivalence transformations. There are two types of equivalence transformations:
1. Arbitrarily permute the children of a P-node.
2. Reverse the children of a Q-node.
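The two equivalence transformations can be illustrated on a toy representation. The sketch below is not the PQ-tree implementation of [2]; it encodes a tree as nested tuples (an assumption made here for brevity, with leaf labels that are not themselves tuples) and enumerates by brute force the leaf sequences of all equivalent trees, which is only practical for very small examples.

```python
from itertools import permutations, product

# A leaf is any non-tuple label; an internal node is a pair (kind, children)
# with kind "P" or "Q".  This encoding is hypothetical, chosen for the sketch.

def is_proper(node):
    """P-nodes need at least two children, Q-nodes at least three."""
    if not isinstance(node, tuple):
        return True
    kind, children = node
    need = 2 if kind == "P" else 3
    return len(children) >= need and all(is_proper(c) for c in children)


def frontiers(node):
    """Yield the left-to-right leaf sequence of every equivalent tree."""
    if not isinstance(node, tuple):
        yield (node,)
        return
    kind, children = node
    if kind == "P":
        orders = permutations(children)            # permute arbitrarily
    else:
        orders = (tuple(children), tuple(reversed(children)))  # keep or reverse
    for order in orders:
        for parts in product(*(list(frontiers(c)) for c in order)):
            yield tuple(x for part in parts for x in part)
```

For a P-node with children 1 and a Q-node over 2, 3, 4, the call `set(frontiers(("P", [1, ("Q", [2, 3, 4])])))` contains exactly four sequences: the Q-node contributes two orders and the P-node two.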
Definition 2.4: Let S be a unit sphere and G = (V, E) be a graph. A planar (spherical) embedding of G is a pair of functions (f, g) such that f maps the vertices of the graph injectively to points on the sphere and g maps the edges E of the graph injectively to simple Jordan curves on the sphere, so that the end points of g(v, w) are f(v) and f(w) for every edge (v, w), and if e_1 and e_2 are two different edges then g(e_1) and g(e_2) do not intersect except perhaps at the end points.

This definition implies that the sphere (or the plane) is partitioned into regions or domains as the graph G is embedded into it. With these regions we associate their bounding cycles or domain boundaries; that is, we list the vertices on the boundary of the region in clockwise order viewed from the interior of the region. If two embeddings of G may be obtained from one another by continuous deformations of the sphere, they are said to be isotopic. We adopt the following definition.
Definition 2.5: Two planar or spherical embeddings of a graph G are regarded as equivalent (isotopically) if they have the same domain boundaries.

Thus an embedding is completely specified by its set of domain boundaries, or dually, by giving a cyclic order of edges around each vertex. In the following section we will show how to construct an embedding of a planar graph by using a PQ-tree data structure and the Lempel, Even and Cederbaum planarity testing algorithm. The nature of the algorithm requires a modified definition of embedding. The reader should recall ([2], [6]) how the algorithm works: a series of trees is constructed, one associated with each vertex. One step of the algorithm is finished when we group together all edges that are incident to the leaves of the tree bearing the currently minimal label in the st-numbering. The next tree is obtained by replacing those leaves by a single leaf with the same label, and its outgoing edges.
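The duality between domain boundaries and cyclic edge orders mentioned after Definition 2.5 can be made concrete. The sketch below is an illustration under stated assumptions, not part of the algorithm of this paper: it takes a cyclic order of neighbours at each vertex (a simple connected graph is assumed, and the dictionary format and function name are hypothetical) and traces the boundary cycles by repeatedly leaving each vertex along the neighbour that follows the one just arrived from.

```python
def domain_boundaries(rotation):
    """Trace the boundary cycles determined by cyclic neighbour orders.

    rotation -- dict: vertex -> list of its neighbours in cyclic order
    Returns a list of boundaries, each a list of vertices.
    """
    nxt = {}
    for v, nbrs in rotation.items():
        for i, u in enumerate(nbrs):
            # Arriving at v from u, continue to the neighbour listed after u.
            nxt[(u, v)] = (v, nbrs[(i + 1) % len(nbrs)])
    seen, faces = set(), []
    for dart in nxt:
        if dart in seen:
            continue
        face, d = [], dart
        while d not in seen:       # follow the orbit of this directed edge
            seen.add(d)
            face.append(d[0])
            d = nxt[d]
        faces.append(face)
    return faces
```

For the complete graph on four vertices with vertex 4 drawn inside the triangle 1, 2, 3, the clockwise orders `{1: [2, 4, 3], 2: [3, 4, 1], 3: [1, 4, 2], 4: [1, 2, 3]}` give four triangular boundaries, in agreement with Euler's formula for 4 vertices and 6 edges.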
To define the embedding of st-G, the reader will get the best intuitive picture by thinking of embedding st-G on the surface of the sphere. The vertices labeled 1 and n in the st-numbering are embedded as the North and South poles, respectively. Further, the reader should imagine n - 2 equidistant parallels of latitude between the poles, the one nearest to the North pole being labeled 2, the next one 3, and so on. Edge (s,t) is going to play the same role Greenwich has on the globe: it will help us distinguish East (right) and West (left). But the edge (s,t) is neither left nor right, so we adopt the convention of listing this edge as the rightmost edge in the list of outgoing edges for the vertex labeled 1.

Definition 2.6: The level embedding of a directed st-numbered graph G is specified by a left to right order of head vertices of outgoing edges for each vertex, when vertices are positioned at the parallel corresponding to their st-numbering label.

As an example of how the level embedding is specified for some graph st-G, see Figure 3. The following lemma establishes the validity of the above definition of the embedding.
Lemma 2.2: Every embedding is isotopically equivalent to a level embedding.

Proof: We will give an algorithm that, given a level embedding of a graph, produces a cyclic order of incident edges for each vertex. Then we will show how to construct a level embedding for any embedding defined by a cyclic order of edges at each vertex.

To produce a cyclic order of edges for a given level embedding, we start with the vertex labeled 1. By the definition of level embedding, the order of its outgoing edges is known and it has no incoming edges. At some vertex v, labeled k, we know the order of its outgoing edges, but not the order of its incoming edges. Assume, by induction, that we know the order of both outgoing and incoming edges (that is, the cyclic order of edges) for all vertices with labels lower than k. We claim that, under this assumption, we can determine the relative order of any two edges incoming to vertex k. To establish the claim, consider the tails (labeled x and y) of any two edges incoming to k (Figure 1).

Figure 1: Deciding on the order of any two edges incoming to the vertex labeled k. Labels on the edges indicate the order.

If x and y are adjacent, consider that one of the two vertices (say y) which has the smaller st-label. Then x and k are the heads of two outgoing edges whose tail is y. The order of these edges is known. Hence, we can infer the relative order of x and y at k from the entry for y in the level embedding. Consider now the case when x and y are not adjacent. From the property of st-numbering that every vertex is adjacent to a lower and a higher numbered vertex, we know that x and y have a common ancestor (if no other, then certainly the vertex labeled 1). Consider the highest numbered common ancestor
(call it z) and the order in which the directed paths from z to x and y leave z. Note that these paths cannot cross, since the graph is planar and z is the highest numbered common ancestor of x and y. Then the order of the incoming edges at k is the reverse of the order in which the paths leave z. Since for any two edges we can decide on their order, we can determine the order of all incoming edges for k. This completes the proof in one direction.

Given any embedding of a planar biconnected graph G and any st-labeling of that graph, the Lempel, Even and Cederbaum algorithm shows that there is a level embedding of G that has the same ordering of outgoing edges. Intuitively, this means that we can always embed the graph as follows: select the cycle containing the edge {s,t}. Embed this cycle as specified by the definition of level embedding. Then embed all the bridges of this cycle recursively.

3. Constructing an Embedding for the Graph st-G

For many practical purposes it is crucial not only to test the planarity of the graph, but to explicitly construct an embedding for a planar graph. There exist several solutions to this problem ([3], [4] for example). Here we present our algorithm, which was devised independently of [4] but is based on the same idea. Recently, a parallel algorithm for planarity testing and embedding a planar graph based on PQ-trees has been devised [7]. The starting step of the algorithm, that is, the st-labeling of the graph, can be found in [8]. Both algorithms run in O(log^2 n) time on n processors of a parallel RAM.

In what follows, we modify the planarity testing algorithm from [2] so that the output for a planar graph contains the level embedding specification (as in Definition 2.6). The difference between the algorithm in [2] and the embedding algorithm presented here is that for planarity testing we do not need to know the order of incident edges at a vertex.
All we need to know is whether it is possible to bring edges bearing the currently minimal label together. But to construct an embedding, the order of these edges is exactly what matters. Consider the following version of the planarity testing algorithm.

Algorithm PLANARITY:
(1)  procedure PLANARITY(st-G)
(2)  begin
(3)      U := set of vertices adjacent to vertex n;
(4)      T := T(U, n);
(5)      for j = n - 1 down to 1 do
(6)      begin
(7)          T := BUBBLE(T, j);
(8)          T := REDUCE(T, j);
(9)          if T = T(∅, ∅) then return FALSE;
(10)         S' := set of vertices adjacent to j;
(11)         if ROOT(T, j) is a Q-node
(12)             then replace the full children of ROOT(T, j) by T(S', j);
(13)             else replace ROOT(T, j) by T(S', j);
(14)     end;
(15)     return TRUE;
(16) end.
Remarks:
1: The input to the planarity testing algorithm is st-G (for each vertex of st-G, an adjacency list of lower numbered vertices is given).
4: The first PQ-tree T(U, n) := T(n - 1) is just a P-node whose children are U.
5: For each level j, the j-reduction is performed. Notice that the algorithm is executed in the reverse order of that in [2].
7: Procedure BUBBLE performs one pass up the tree in which pertinent nodes are marked and the count of the number of children to be processed is left at each pertinent node.
8: Procedure REDUCE queues the pertinent nodes in breadth first order and then does the template matching. Since the entire algorithm is run "backwards", reductions will be carried out for the currently maximal label.
9: If the tree returned by REDUCE is an empty tree, graph st-G is nonplanar.
11-13: To complete the process of j-reduction, we need to substitute the full children of the tree returned by REDUCE by a single P-node whose children are the adjacency list for j (this is denoted by T(S', j)).
15-16: If all the iterations have been carried out successfully, graph st-G is planar and the algorithm terminates.
In order to construct the embedding algorithm, we think of the loop steps (5)-(14)as embedding vertices labeled by n - 1 through 1. Prior to executing step (11) though, we need to record the order of full children in ROOT( Zj)(this corresponds to recording the order of outgoing edges for vertex j ] and then perform the required replacement. Here lays the reason for executing the algorithm in the reverse order. Once we embed the vertex j , the order of its outgoing edges cannot be changed any more since vertex j will not play the active role in the rest of the algorithm (i.e. from then on we look at lower numbered vertices only). But, if the ROOT( z j ) is a Q-node, at some later stage of the algorithm it may become a part of another Q-node. If this Q-node is flipped, the order of outgoing edges for vertex j is reversed. This phenomena of a node becoming a part of another Q-node can happen recursively, so we need to keep track of how many times this happens in order to be able to recover the correct orientation of outgoing edges for vertexj. A straight forward way of correcting the orientation of outgoing edges by counting the number of subsequent reversions of the list of outgoing edges for vertex j , would clearly increase the time spent by the embedding algorithm to O(n2).In order to maintain the linearity of the whole algorithm, we have devised a simple book-keeping strategy for Q-nodes based on the following idea: construct trees consisting of Q-nodes that became part of another Q-node and label edge between the two nodes by + or - depending on wether reversal was needed or not. For some Q-node at depth d of such a tree, orientation needs to be changed if the number of - signs on the path from the root to the node is odd. This, in short, describes our embedding algorithm. 
The actual algorithm and implementation details can be found in [2] and [6]. An interesting question in the spirit of [9], that we are currently investigating, is how to display and manipulate a graph on the computer screen using the above method of finding an embedding.
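The sign book-keeping for Q-nodes described above can be illustrated with a short sketch. It is not the implementation from [2] or [6]; the dictionaries `parent` and `sign` and the function name `needs_reversal` are hypothetical names chosen for this illustration, and a `-` label is assumed to mean that a reversal was needed when the node was absorbed. One top-down pass computes the parity of `-` signs for every node, which is what keeps the correction linear.

```python
from collections import deque

def needs_reversal(parent, sign):
    """Return {node: True/False}: True if the path from the root to the node
    carries an odd number of '-' labels (assumed to mean 'reversed').

    parent: dict mapping each Q-node to the Q-node that absorbed it (None for roots).
    sign:   dict mapping each non-root node to '+' or '-' (label of its parent edge).
    """
    children = {}
    roots = []
    for node, par in parent.items():
        if par is None:
            roots.append(node)
        else:
            children.setdefault(par, []).append(node)

    flip = {}
    queue = deque()
    for root in roots:
        flip[root] = False          # a root is never reversed by an ancestor
        queue.append(root)
    while queue:                    # breadth-first pass: each node is visited once
        node = queue.popleft()
        for child in children.get(node, []):
            flip[child] = flip[node] ^ (sign[child] == '-')
            queue.append(child)
    return flip

# Hypothetical example: B was absorbed into A with a reversal, C into B with a reversal.
parent = {'A': None, 'B': 'A', 'C': 'B', 'D': 'A'}
sign = {'B': '-', 'C': '-', 'D': '+'}
result = needs_reversal(parent, sign)
```

In this example `C` sits below two `-` edges, so its recorded order of outgoing edges is left unchanged, while `B` must be reversed.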
4. The Number of Embeddings
In this section we will give a way of counting isotopically different embeddings of a planar graph G. Consider the graph G shown in Figure 2 and its two representations with their level embeddings. The level embeddings seem to be different, yet the graph from Figure 2(a) is triconnected and as such it has only one embedding [10]. The difficulty here comes from the fact that, if we consider a spherical embedding of the planar graph, the two sides of any one cycle of the graph are indistinguishable; that is, we cannot tell what is inside of the cycle and what is outside of the cycle. In particular, all regions or domains of the spherical embedding of the planar graph are finite. In the planar embedding, though, one of the regions is unbounded and thus we can produce two apparently different planar embeddings of the graph by drawing a graph inside and outside of some region of the graph (Figure 2(b) and (c)). It is clear that such embeddings are equivalent in the sense of Definition 2.5.

Figure 2: (a) Triconnected graph. (b) The graph embedded inside the region bounded by cycle (1,3,4,1). (c) The graph embedded outside the region bounded by cycle (1,3,4,1). Level embeddings accompanying the figure: 1: 3 2 4; 2: 3 4; 3: 4 and 1: 2 3 4; 2: 4 3; 3: 4.

If we consider a level embedding of some planar graph embedded inside some region and then the level embedding of the same graph embedded outside the same region, we will observe the following: for any level, the order of head vertices of outgoing edges for one embedding is the reverse of the order of the vertices for the other embedding. At level 1, though, we will respect the convention to list the vertex labeled by n last, so for level 1 the lists will be the reverse of each other, except for the vertex labeled by n. Thus we have the following definition:

Definition 4.1: Two level embeddings are (isotopically) equivalent if the lists of vertices at corresponding levels are the same or the order of vertices in one list is the reverse of the order in the other list for all vertices, excluding the vertex labeled by n at level 1.

Referring to Figure 3, the reader will note that the representations of the graph shown in Figure 3(b) and Figure 3(c) belong to the same equivalence class; that is, they have the same level embedding, while the representation shown in Figure 3(d) is different. In what follows we discuss the point in the embedding algorithm at which we actually select the embedding and ask what can be done to obtain an isotopically different embedding.
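Definition 4.1 can be checked mechanically. The sketch below is only an illustration under an assumed reading of the definition: two level embeddings are taken to be equivalent if they agree level by level, or if one is obtained from the other by reversing every list while keeping vertex n last at level 1. The dictionary representation and the names `reverse_embedding` and `equivalent` are hypothetical; the data are the two level embeddings of the four-vertex graph of Figure 2.

```python
def reverse_embedding(emb, n):
    """Reverse every level list; at level 1 keep vertex n in last position."""
    out = {}
    for level, heads in emb.items():
        if level == 1 and heads and heads[-1] == n:
            out[level] = heads[:-1][::-1] + [n]
        else:
            out[level] = heads[::-1]
    return out

def equivalent(emb_a, emb_b, n):
    """Assumed reading of Definition 4.1: identical, or mutual 'reverse' embeddings."""
    return emb_a == emb_b or emb_a == reverse_embedding(emb_b, n)

# Level embeddings read off Figure 2 (levels map to ordered lists of head vertices).
fig2_inside = {1: [3, 2, 4], 2: [3, 4], 3: [4]}
fig2_outside = {1: [2, 3, 4], 2: [4, 3], 3: [4]}
same_class = equivalent(fig2_inside, fig2_outside, n=4)
```

Here `same_class` evaluates to `True`, in line with the observation that the triconnected graph of Figure 2(a) has only one embedding.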
Figure 3: (a) st-labeled graph G. (b) A level embedding of G. (c) The same level embedding as in (b). (d) A different embedding of the same graph. Level embeddings accompanying panels (b), (c) and (d), respectively:
(b) 1: 2 3 4 5 6 9; 2: 9 3; 3: 9; 4: 6 5; 5: 6; 6: 9 7 8; 7: 9 8; 8: 9
(c) 1: 6 5 4 3 2 9; 2: 3 9; 3: 9; 4: 5 6; 5: 6; 6: 8 7 9; 7: 8 9; 8: 9
(d) 1: 2 3 4 5 6 9; 2: 9 3; 3: 9; 4: 6 5; 5: 6; 6: 9 8 7; 7: 8 9; 8: 9

Suppose that we are embedding the vertex labeled by k, k < n. We perform a k-reduction on the graph $G_k$. Upon the completion of the process, we have fixed the order of outgoing edges for vertex k. This order may be reversed at some later stage of the algorithm, but it can never be changed in any other way. So, if we had some freedom in choosing the order of outgoing edges for k, it must have been reflected in the process of k-reduction. The k-reduction starts with a tree $T_k$ and terminates with a tree $T_{k-1}$, meanwhile producing a sequence of intermediate trees $T_{k,0}, \ldots, T_{k,p}$, where $T_{k,i+1}$ is obtained from $T_{k,i}$ by a single template substitution (for more details see [2]). If one of these intermediate trees has a full P-node (Q-node) it will be called a full P-tree (a full Q-tree). Templates that apply to such trees are called P1 (Q1). The number of different full P-trees arising from a single full P-node is simply the number of permutations of the children of the full P-node. The number of different full Q-trees arising from a single Q-node is just two, since we are limited to either leaving the order of its children unchanged or reversing it. If we consider, for the moment, the subgraph $G_k$ as part of the graph being embedded inside some reference cycle, then we can say that each different full P-tree or full Q-tree produces a different embedding of $G_k$. In continuing the process of k-reduction after encountering a full tree, we replace its full node by a single full leaf and proceed with the construction of intermediate trees. The choice of some arrangement of children of the full node in one full tree is independent of the selection of some arrangement for another full tree in the same sequence. Thus, the total number of possible choices for embedding vertex k is denoted by $E(G_k)$ and is given by
$$E(G_k) = \Bigl(\prod_{j=1}^{p_k} |P_{k,j}|!\Bigr)\, 2^{q_k},$$
where $p_k$ is the number of full P-trees, $|P_{k,j}|$ is the number of children of the full P-node in the j-th full P-tree in the sequence of intermediate trees and $q_k$ is the number of full Q-trees in the sequence. A special case occurs while embedding the vertex labeled by 1. The very last tree in the sequence of intermediate trees has to be a full P-tree (since graph st-G is assumed to be planar) consisting of a single full P-node. One of the edges in this full tree is the (s,t) edge and it has a fixed position; that is, it needs to remain the rightmost edge in any embedding of $G_1$. Thus, for this particular full tree, the number of different choices is $(|P_{1,p_1}| - 1)!$. The total number of choices for embedding the vertex labeled by 1 is
$$E(G_1) = \frac{1}{2\,|P_{1,p_1}|} \Bigl(\prod_{j=1}^{p_1} |P_{1,j}|!\Bigr)\, 2^{q_1}.$$
The factor of 1/2 comes in as a consequence of the fact that so far, for each level embedding, we have also counted its “reverse” embedding (as shown in Figure 2(b) and (c)). From the above discussion, we have:

Theorem 4.1: Let E(G) be the number of different embeddings of the planar biconnected graph G. Then E(G) is given by
$$E(G) = \frac{1}{2\,|P_{1,p_1}|} \prod_{k=1}^{n-1} \Bigl(\prod_{j=1}^{p_k} |P_{k,j}|!\Bigr)\, 2^{q_k}.$$
The algorithm for counting the number of embeddings is essentially the same as PLANARITY, with the following addition.

Algorithm NUMEMBEDD:
(1) procedure NUMEMBEDD (st-G, E)
(2) begin
(3)   let E(G) = 1
(4)   start executing PLANARITY (st-G)
(5)     while executing REDUCE (T, k) do
(6)       if template P1 applies then
(7)         let |P_{k,j}| = #(full children) in P1
(8)         E(G) := E(G) · |P_{k,j}|!
(9)       else if template Q1 applies
(10)        E(G) := 2 E(G)
(11)    end while;
(12)  end PLANARITY (st-G)
(13)  E(G) := E(G) / (2 |P_{1,p_1}|)
(14) end.
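The arithmetic performed by NUMEMBEDD can be sketched separately from the PQ-tree machinery. The snippet below does not run PLANARITY or REDUCE; it assumes that a log of template applications has already been collected, and the event format `('P', size)` / `('Q',)` and the function name `count_embeddings` are hypothetical. The sample log follows the worked example for the graph of Figure 3: two P1 applications of size 2 during the 6-reduction, then three of size 2 and one of size 3 during the 1-reduction.

```python
from fractions import Fraction
from math import factorial

def count_embeddings(events, last_p_size):
    """Multiply the choices recorded in `events` and apply the final division
    by 2 * |P_{1,p_1}| (step (13) of NUMEMBEDD).

    events:      list of ('P', number_of_full_children) or ('Q',) records,
                 one per application of template P1 or Q1 (assumed input format).
    last_p_size: number of children of the last full P-node, |P_{1,p_1}|.
    """
    total = 1
    for event in events:
        if event[0] == 'P':
            total *= factorial(event[1])   # any permutation of the full children
        elif event[0] == 'Q':
            total *= 2                     # keep or reverse the children
        else:
            raise ValueError(f"unknown template record: {event!r}")
    # Fraction keeps the result exact and makes a non-integer outcome visible.
    return Fraction(total, 2 * last_p_size)

# Log for the graph of Figure 3, as described in the worked example.
figure3_log = [('P', 2), ('P', 2),                      # 6-reduction
               ('P', 2), ('P', 2), ('P', 2), ('P', 3)]  # 1-reduction
figure3_count = count_embeddings(figure3_log, last_p_size=3)
```

For this log the product is 192 and the final division by 2 · 3 gives 32 embeddings.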
Lemma 4.1: Algorithm NUMEMBEDD correctly computes the number of embeddings of a planar biconnected graph in linear time.
Proof: The correctness of the algorithm follows directly from Theorem 4.1. To see that the algorithm is linear-time, observe that NUMEMBEDD does the same work as PLANARITY except for the computation of the factorial at step (8). By precomputing the factorials of all integers up to n - 1 (n - 1 is the maximal in-degree of the graph), which is done in linear time, each iteration of step (8) takes constant time, and thus the entire algorithm remains linear.

We offer an example by counting the number of embeddings of the graph from Figure 3. We start with a tree $T_6$. The reader can easily check that, in the process of k-reduction for k > 6, templates P1 or Q1 do not occur. In the process of 6-reduction we have applied template P1 twice to full trees of size 2. Thus, $E(G) := 2 \cdot 2$. In performing k-reduction, 1 < k < 6, templates P1 or Q1 do not occur and E(G) remains the same. Now we perform the 1-reduction. Template P1 was applied four times, to three full trees of size two and one of size three. Thus, $E(G) := 2^5 \cdot 3!$. Now we execute step (13) of the algorithm to obtain the final answer for the number of different embeddings:
$$E(G) = \frac{2^5 \cdot 3!}{2 \cdot 3} = 32.$$
5. Ranking Planar Embeddings
In the previous section we have given a formula for counting the number of different planar embeddings of a planar biconnected graph. Instead of simply being concerned with counting, we would like to produce the list of all embeddings by defining rank and unrank functions for the set of embeddings. This will enable us to list the embeddings canonically and to find the successor and predecessor of some embedding easily. Random generation of embeddings can also be done quickly provided we have a random number generator that generates numbers in the set {1, ..., E(G)}, where E(G) is the number of distinct planar embeddings of the graph G. The counting formula (see Theorem 4.1) suggests that we can order the embeddings of G relative to the ordering of full P-trees and orientation of full Q-trees. We will briefly review a few facts about ranking Cartesian products of sets and ranking permutations, since our method for ranking the embeddings is essentially a combination of those. The proofs, together with further details on ranking and unranking combinatorial objects, can be found in [3], [10] or [11].

Definition 5.1:
Let S be a finite set with |S| = s. A rank function for S is a bijection $r: S \to \{0, 1, \ldots, s-1\}$. We say that element $a \in S$ has rank k if r(a) = k. The inverse of the rank function is called the unrank function. Let $S_1, \ldots, S_k$ be sets with rank functions $r_1, \ldots, r_k$. Let $\Pi = S_1 \times \cdots \times S_k$. We define a function $\rho$ as follows:
$$\rho(s_1, \ldots, s_k) = \sum_{i=1}^{k} r_i(s_i) \prod_{j=i+1}^{k} |S_j|.$$
Lemma 5.1: $\rho$ is a rank function for $\Pi$.

Let $\underline{n}$ denote the set of integers {1, 2, ..., n}. Let $\pi = p_1 p_2 \ldots p_n$ be a permutation of $\underline{n}$. The inversion sequence $INV(\pi) = i_1 i_2 \ldots i_n$ is defined so that $i_k$ is the cardinality of the set $\{ j : j > k \text{ and } p_j < p_k \}$. Let the function $\alpha$ be defined as follows:
$$\alpha(\pi) = i_1 (n-1)! + i_2 (n-2)! + \cdots + i_n\, 0!.$$
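The inversion-sequence rank can be written out directly. The following is a minimal sketch, not code from the cited references; the names `inversion_sequence` and `alpha` are chosen here, and permutations are represented as tuples of the integers 1, ..., n.

```python
from itertools import permutations
from math import factorial

def inversion_sequence(perm):
    """i_k = number of positions j > k with perm[j] < perm[k]."""
    n = len(perm)
    return [sum(1 for j in range(k + 1, n) if perm[j] < perm[k]) for k in range(n)]

def alpha(perm):
    """alpha(pi) = i_1 (n-1)! + i_2 (n-2)! + ... + i_n 0!."""
    n = len(perm)
    inv = inversion_sequence(perm)
    return sum(inv[k] * factorial(n - 1 - k) for k in range(n))

# Ranks of all permutations of {1, 2, 3, 4}, listed in lexicographic order.
ranks = [alpha(p) for p in permutations(range(1, 5))]
```

Since `itertools.permutations` emits tuples in lexicographic order for sorted input, `ranks` comes out as 0, 1, ..., 23, which illustrates that this rank agrees with lexicographic order.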
Lemma 5.2: $\alpha$ is a rank function for the permutations of $\underline{n}$.

Theorem 4.1 in Section 4 asserts that, apart from the factor $1/(2|P_{1,p_1}|)$, every embedding can be specified by specifying one particular permutation or one particular orientation for each full P-tree or Q-tree, respectively. It is clear that we can rank the permutations of each individual full tree as in Lemma 5.2. Then the Cartesian product of all full trees can be ranked as in Lemma 5.1. In what follows, we take care of the factor $1/(2|P_{1,p_1}|)$.

The factor $1/|P_{1,p_1}|$ arises while embedding vertex 1, and is related to the very last tree in the sequence of intermediate trees. That tree is, as mentioned earlier, simply a full P-tree one of whose edges is (s,t). Its position in the P-tree is fixed. So, in order to rank permutations of this full tree, we rank the permutations of the rest of its $(|P_{1,p_1}| - 1)$ children in the usual way.

As for the factor 1/2, we look at the very last full tree we encounter during the execution of the embedding algorithm. Denote that tree by LFTree and let the number of the full children of its full node be m. If LFTree is a Q-tree or a P-tree with m = 2, we simply omit LFTree from the Cartesian product of all full trees. It is easy to see that any element of the new Cartesian product produces a new embedding of the graph. But, if LFTree is a P-tree with m > 2, we need a technique to rank only 1/2 of the permutations in such a way as to exclude the “reverse” embeddings of the graph. If we keep the relative order of any two full children fixed, we achieve our purpose; that is, we eliminate the 1/2 of the permutations that produce “reverse” embeddings. Thus, we are interested in ranking permutations associated with LFTree with the restriction that, say, child 1 always precedes child m. For each permutation $\pi = p_1 p_2 \ldots p_m$ of the set of full children {1, ..., m} we define the complement $\pi^c = (m + 1 - p_1) \ldots (m + 1 - p_m)$. Complementation reverses the order of child 1 and child m, as well as the lexicographic order of permutations. Let M be the set of permutations of $\underline{m}$ satisfying the condition that 1 precedes m. The following lemma holds:
Lemma 5.3: Let $\rho$ be defined on M as follows: $\rho(\pi) = \alpha(\pi)$ if $\alpha(\pi) < m!/2$; otherwise, $\rho(\pi) = \alpha(\pi^c)$. Then $\rho$ is a rank function for M.

Let $S_p$, where p is the total number of full trees, be the set of permutations of LFTree with the rank function $r_p$ computed as in Lemma 5.3. Let $S_i$, i = 1, ..., p - 1, be the set of permutations of the i-th full tree (for a Q-tree the set will consist of two elements only) ranked as in Lemma 5.2. Let $\Pi = S_1 \times \cdots \times S_p$. In conclusion, we have the following theorem:
Theorem 5.1: Each element of $\Pi$ defines a distinct embedding of G. The rank function is
$$\rho(s_1, \ldots, s_p) = \sum_{i=1}^{p} r_i(s_i) \prod_{j=i+1}^{p} |S_j|.$$
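The restricted ranking of Lemma 5.3, which Theorem 5.1 uses for the last factor $S_p$, can be checked by brute force for small m. The sketch below is an illustration only; `alpha` is the inversion-sequence rank of Lemma 5.2 re-implemented here, and `complement`, `restricted_rank` and `is_rank_function_on_M` are hypothetical names.

```python
from itertools import permutations
from math import factorial

def alpha(perm):
    """Inversion-sequence rank of Lemma 5.2 (agrees with lexicographic order)."""
    n = len(perm)
    return sum(
        sum(1 for j in range(k + 1, n) if perm[j] < perm[k]) * factorial(n - 1 - k)
        for k in range(n)
    )

def complement(perm):
    """pi^c = (m + 1 - p_1) ... (m + 1 - p_m)."""
    m = len(perm)
    return tuple(m + 1 - x for x in perm)

def restricted_rank(perm):
    """rho of Lemma 5.3, intended for permutations in which 1 precedes m."""
    a = alpha(perm)
    return a if a < factorial(len(perm)) // 2 else alpha(complement(perm))

def is_rank_function_on_M(m):
    """Check that rho maps M bijectively onto {0, ..., m!/2 - 1}."""
    M = [p for p in permutations(range(1, m + 1)) if p.index(1) < p.index(m)]
    return sorted(restricted_rank(p) for p in M) == list(range(factorial(m) // 2))

checks = [is_rank_function_on_M(m) for m in (2, 3, 4, 5)]
```

The check succeeds because complementation maps rank a to m! - 1 - a and swaps the relative order of 1 and m, so two members of M can never collide.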
In Section 4, we have computed the number (E(G) = 32) of embeddings of the sample graph from Figure 3. We would now like to use Theorem 5.1 to find the rank of the embedding shown in Figure 3(d). Let us consider the full trees that we have obtained during the execution of EMBEDDING. For each tree we will specify the set $S_i$ and the rank function of its elements.
Let $S_1$ be the set of permutations of two elements 7 and 8; that is, $S_1$ = {78, 87}. Let $r_1(78) = 0$ and $r_1(87) = 1$. Let $S_2$ be the set of permutations of two elements 9 and [78], where [78] is considered as one element; that is, $S_2$ = {9[78], [78]9}, $r_2(9[78]) = 0$ and $r_2([78]9) = 1$. Sets $S_3$, $S_4$, $S_5$ and $S_6$ are all obtained in the process of 1-reduction, and they are all sets of permutations of two elements given as follows:
Since $S_6$ contains only two elements, it will be omitted from the product (as discussed above). Now, it is easy to see that $\Pi = S_1 \times \cdots \times S_5$. The embedding is specified by an element $(s_1, s_2, s_3, s_4, s_5)$ of $\Pi$, and the rank of the embedding is
$$\rho(s_1, s_2, s_3, s_4, s_5) = r_1(s_1)\,|S_2| \cdots |S_5| + \cdots + r_4(s_4)\,|S_5| + r_5(s_5) = 16.$$
Although we have not defined the inverses of the rank functions from Lemmas 5.1-5.3, they can be constructed fairly easily. In a nutshell, the computation of the unrank function in our case is simply the greedy algorithm for finding a permutation with a given rank in lexicographical order. The details are omitted here; the reader is referred to [11] for examples of computation of unrank functions. The above method can also be used efficiently to verify whether a given adjacency list of the graph st-G represents a planar embedding of the graph.
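The rank function of Lemma 5.1 and the greedy unranking mentioned above can be sketched as follows. This is an illustration rather than the construction from [11]; the names `product_rank`, `product_unrank` and `unrank_permutation` are hypothetical, and elements of each factor set are represented by their individual ranks. The sample input assumes five two-element sets, with only the first coordinate at rank 1 (the choice 87 in $S_1$).

```python
from math import factorial

def product_rank(ranks, sizes):
    """rho(s_1..s_k) = sum_i r_i(s_i) * prod_{j>i} |S_j|, evaluated by Horner's rule."""
    total = 0
    for r, s in zip(ranks, sizes):
        if not 0 <= r < s:
            raise ValueError("component rank out of range")
        total = total * s + r
    return total

def product_unrank(value, sizes):
    """Inverse of product_rank: peel off components from the last set backwards."""
    ranks = []
    for s in reversed(sizes):
        value, r = divmod(value, s)
        ranks.append(r)
    if value:
        raise ValueError("rank too large for the given sets")
    return tuple(reversed(ranks))

def unrank_permutation(rank, n):
    """Greedy construction of the permutation of 1..n with the given lexicographic rank."""
    items = list(range(1, n + 1))
    perm = []
    for k in range(n, 0, -1):
        index, rank = divmod(rank, factorial(k - 1))
        perm.append(items.pop(index))   # choose the index-th smallest unused element
    return tuple(perm)

sizes = [2, 2, 2, 2, 2]
example_rank = product_rank((1, 0, 0, 0, 0), sizes)
```

With these inputs `example_rank` is 16, and `product_unrank(16, sizes)` returns the component ranks `(1, 0, 0, 0, 0)` again.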
References
[1] A. Lempel, S. Even and I. Cederbaum; An algorithm for planarity testing of graphs; in Theory of Graphs: International Symposium: Rome, July, 1966, P. Rosenstiehl (editor), Gordon and Breach, New York, 215-232 (1967).
[2] K.S. Booth and G.S. Lueker; Testing for the consecutive ones property, interval graphs, and graph planarity using PQ-tree algorithms, Journal of Computer and System Sciences, 13, 335-379 (1976).
[3] K.P. Vo, E.W. Dick and S.G. Williamson; Ranking and unranking planar embeddings, Linear and Multilinear Algebra, 18, 35-65 (1985).
[4] N. Chiba, T. Nishizeki, S. Abe and T. Ozawa; A linear algorithm for embedding planar graphs using PQ-trees, J. of Computer and System Science, 30, 54-76 (1985).
[5] S. Even and R.E. Tarjan; Computing an st-numbering, Theoretical Computer Sci., 13, 339-344 (1976).
[6] A. Karabeg; PQ-tree Data Structure and Some Graph Embedding Problems, Ph.D. Thesis, University of California at San Diego (1988).
[7] P.N. Klein and J.F. Reif; An efficient parallel algorithm for planarity, Proceedings of FOCS, 45-47 (1986).
[8] Y. Maon, B. Schieber and U. Vishkin; Parallel ear decomposition search and st-numbering in graphs, Tel Aviv University Technical Report 46/86 (1986).
[9] R.C. Read; Methods for computer display and manipulation of graphs, and the corresponding algorithms, Lectures at the Graph Theory Conference, Fort Wayne, Indiana, March (1986).
[10] S.G. Williamson; Combinatorics for Computer Science, Computer Science Press, 681-693 (1985).
[11] E.A. Bender and S.G. Williamson; Foundations of Applied Combinatorics, Addison Wesley, 69-80 (1991).
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 261-264 (1993)
© 1993 Elsevier Science Publishers B.V. All rights reserved.
SOME PROBLEMS AND RESULTS IN COCHROMATIC THEORY
Paul ERDŐS Mathematical Institute, Hungarian Academy of Sciences Budapest, HUNGARY
John GIMBEL Department of Mathematical Sciences, University of Alaska Fairbanks, Alaska, U.S.A.
Abstract
Given a graph G, the cochromatic number of G, denoted by z(G), is the minimum number of parts into which V(G) can be partitioned so that each part induces in G an empty or a complete graph. We present some background, recent results and open problems in the field of cochromatic theory.
The subject of cochromatic theory grew out of work by Foldes and Hammer [1] on split graphs. It was originally introduced in 1973 by Lesniak and Straight [2]. Attesting to its natural characteristics, the topic was rediscovered by several people. It was studied by computer scientists (see [3]) who are interested in decomposing posets into a minimum number of chains and anti-chains. This is equivalent to computing the cochromatic number of the graph underlying the poset. Also, cochromatic theory provides graph theorists with a nexus between coloring problems and Ramsey theory. The parameter has been studied in some detail, producing beautiful results. Yet, there are still interesting open problems, which we discuss here. Let $\chi(G)$ denote the chromatic number of G. Let f and g be two functions defined on the integers. Then we write $f(n) = \Theta(g(n))$ provided $f(n) = O(g(n))$ and $g(n) = O(f(n))$. Given an integer n, let z(n) be the largest cochromatic number of all graphs on n vertices.
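For very small graphs the two parameters can be computed by exhaustive search, which may help when experimenting with the questions discussed in this paper. The sketch below is a plain backtracking search and is not taken from any of the cited papers; the adjacency-dictionary representation and the names `min_parts`, `cycle` and `complement` are assumptions made for the illustration. It is exponential in the number of vertices and only meant for graphs of roughly a dozen vertices.

```python
def min_parts(adj, allow_cliques):
    """Smallest k such that the vertices split into k parts, each independent
    or (if allow_cliques) complete.  allow_cliques=True gives z(G), False gives chi(G)."""
    vertices = sorted(adj)

    def feasible(k):
        parts = []  # each part is [members, kind]; kind is None, "indep" or "clique"

        def place(i):
            if i == len(vertices):
                return True
            v = vertices[i]
            for part in parts:
                members, kind = part
                hits = sum(1 for u in members if u in adj[v])
                if hits == 0 and kind in (None, "indep"):
                    new_kind = "indep"
                elif hits == len(members) and allow_cliques and kind in (None, "clique"):
                    new_kind = "clique"
                else:
                    continue
                members.append(v)
                part[1] = new_kind
                if place(i + 1):
                    return True
                members.pop()          # undo and try the next part
                part[1] = kind
            if len(parts) < k:         # open a new part for v
                parts.append([[v], None])
                if place(i + 1):
                    return True
                parts.pop()
            return False

        return place(0)

    return next(k for k in range(len(vertices) + 1) if feasible(k))

def cycle(vertices):
    """Adjacency dictionary of the cycle through `vertices` in the given order."""
    adj = {v: set() for v in vertices}
    for a, b in zip(vertices, vertices[1:] + vertices[:1]):
        adj[a].add(b)
        adj[b].add(a)
    return adj

def complement(adj):
    return {v: set(adj) - adj[v] - {v} for v in adj}

c5 = cycle([0, 1, 2, 3, 4])
two_c5_complement = complement({**cycle([0, 1, 2, 3, 4]), **cycle([5, 6, 7, 8, 9])})
z_c5, chi_c5 = min_parts(c5, True), min_parts(c5, False)
z_ex, chi_ex = min_parts(two_c5_complement, True), min_parts(two_c5_complement, False)
```

For the five-cycle both values are 3. For the complement of two disjoint five-cycles the search gives z = 3 and χ = 6, so the chromatic number exceeds the cochromatic number by 3.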
From [4] we know the following:

Remark 1: With the above notation, $z(n) = \Theta\left(\frac{n}{\ln(n)}\right)$.

However, we do not know the value of z(n) even for some small values of n. For example, we know $4 \le z(12) \le 5$ and $5 \le z(15) \le 6$. Exact values of z(12) and z(15) are unknown at present. Determination of z(12) might be made after obtaining an answer to the question: Is there a graph G of order twelve where G and its complement are $K_4$-free, yet both graphs have a chromatic number of at least five? It seems unlikely that such a graph exists. Thus, we conjecture that z(12) = 4.

Considering the size of a graph, it is shown in [5] that if z(G) = n then G contains at least $c\,n^2 \ln^2(n)$ edges, for some positive constant c. However, little is known concerning specific values of n.

In [6] and [7] the topic of criticality is discussed. A graph G is m-critical if z(G) = m and z(G\v) = m - 1 for every vertex v in G. Also, G is critical if it is m-critical for some m. It was noted that there are no critical graphs of orders 1, 3, and 4. It was recently shown [8] that there is no critical graph of order 8. With this and a result of [7] we know that for all other n there is a critical graph of order n. Further, it was shown in [7] that if $m \ge 4$ and n is such that $2n \ge m^2 + m + 4$ then there is an m-critical graph of order n. It was shown in [8] that the result does not hold for m = 3.

Suppose $S_n$ represents the orientable surface of genus n. Further, let $z(S_n)$ be the maximum cochromatic number of all graphs which embed on $S_n$. One of the authors showed in [4] the existence of constants $c_1$ and $c_2$ where
but the growth rate of $z(S_n)$ is still unknown. Given a large clique, we see that it has a large chromatic number and the cochromatic number of each induced subgraph is one. So, let us instead consider the case of non-induced subgraphs. Suppose we have a graph G with $\chi(G) = m$. How large can z(H) be, where H is a subgraph of G? We have a solution, but it does not seem to be best possible.
Remark 2: If G is a graph with chromatic number m, then G contains a subgraph H such that
$$z(H) \ge c\,\frac{\sqrt{m}}{\ln(m)},$$
where c is some positive constant.

Proof:
It can easily be shown that if t is at least as large as the clique number of G then $\chi(G) = \chi(tG) = z(tG) \le t\,z(G)$, where tG represents t disjoint copies of G. So consider $\omega$, the clique number of G.

Case 1: Suppose $\omega \ge \sqrt{m}$. Let H' be a clique in G having $\omega$ vertices. From Remark 1 we know that H' has a subgraph H where
$$z(H) \ge c\,\frac{\omega}{\ln(\omega)} \ge c\,\frac{\sqrt{m}}{\ln(m)}.$$

Case 2: Suppose $\omega \le \sqrt{m}$. Let $t = \omega$. Note, $m = \chi(tG) \le t\,z(G) \le \sqrt{m}\,z(G)$. Hence,
$$z(G) \ge \sqrt{m}.$$
We believe that the square root can be omitted. However, by considering the complete graph we can show that the bound cannot be further improved. Given D, an oriented graph (a digraph with no cycle of length one or two), the dichromatic number of D is the smallest number of colors needed to color V(D) so that each color class induces an acyclic digraph. For a graph G, we define the dichromatic number of G to be the maximum dichromatic number taken over all orientations of G. The dichromatic number was originally defined in [9]. We do not know if a graph with large cochromatic number must contain a graph with large dichromatic number. But if this is true, then Remark 2 would show that a graph with large chromatic number must have large dichromatic number. This would answer a question raised by Erdős and Neumann-Lara in [10].
Suppose $G_n$ is a random graph on n labeled vertices with edge probability 1/2. From [11] and [12] we know $G_n$ almost surely (a.s.) has clique number and independence number less than $2\log_2(n)$. Hence, a.s.
$$z(G_n) \ge \frac{n}{2\log_2(n)}.$$
But $z(G_n) \le \chi(G_n)$. Further, from [13] we know the chromatic number of $G_n$ is a.s. bounded above by $(1 + o(1))\,\frac{n}{2\log_2(n)}$. Hence, a.s.
$$\frac{n}{2\log_2(n)} \le z(G_n) \le \chi(G_n) \le (1 + o(1))\,\frac{n}{2\log_2(n)}.$$
However, it isn't known if $\chi(G_n) - z(G_n)$ a.s. goes to infinity.

Upper bounds on the chromatic number have been studied for graphs with known order, cochromatic number and clique number. For example, Lesniak and Straight showed that if G contains no triangle and G has at least three vertices then $z(G) = \chi(G)$. It was further shown in [14] that if G contains no clique on m vertices then $\chi(G) \le z(G) + 4^m$. Also, if G has at least four vertices and no clique of order four then $\chi(G) \le z(G) + 1$. It is not true that if G has order at least five and no clique on five vertices then $\chi(G) \le z(G) + 2$. To see this, consider the complement of $C_5 \cup C_5$, the union of two five-cycles. This is a counterexample, along with several hundred others that we have constructed. However, we believe there may be only a finite number of counterexamples. This leads to a more general question: suppose t is the maximum difference between the chromatic number and the cochromatic number over all graphs with clique number no more than m and order at least m. Are there an infinite number of graphs G with clique number no more than m where $t = \chi(G) - z(G)$? In [14] it was conjectured that if G is a $K_5$-free graph and $z(G) \ge 4$ then $\chi(G) \le z(G) + 2$. This remains open.

In [14] the existence was established of an $\varepsilon > 0$ such that for any m, there is a $K_m$-free graph G with the property that $z(G) + (1 + \varepsilon)^m \le \chi(G)$. We close with a slight improvement on this. To find a better result may require a breakthrough in Ramsey theory.

Remark 3: For each m there is a $K_m$-free graph G where z(G) + $ x m xI( G ) .

Proof: Select $\varepsilon$ a small positive constant. Consider G, the random graph with edge probability p = on vertices. Let a = 1/p.
LJCT"''
,&
We know from [ll] and [12] that the clique number of G is as. less than 210ga(firn)
- 210g.log.(firn)
+ O(1)
and this is less than m. We note that $\overline{G}$, the complement of G, has edge probability 1 - p. Let b = 1/(1 - p). Now, choose $\varepsilon_1 > 0$ so that $(1 - \varepsilon_1)\ln(b) - (1 + \varepsilon_1)\ln(a) > 0$. Since $\ln(b) > \ln(a)$, this is possible. From [13] we know that a.s.
We note that the expression in parenthesis is positive. Hence, x(G) - z(G) 2 ,k2 - 2~ sufficiently large m and the desired result is established.
...
for
References
[1] S. Foldes and P. Hammer; Split graphs, Proceedings of the 8th Southeastern Conference on Combinatorics, Graph Theory and Computing, Utilitas Math., Winnipeg, 311-315 (1977).
[2] L. Lesniak and H.J. Straight; The cochromatic number of a graph, Ars Combin., 3, 39-46 (1977).
[3] A. Brandstädt and D. Kratsch; On partitions of permutations into increasing and decreasing subsequences, Elektron. Informationsverarb. Kybernet., 22, 263-273 (1986).
[4] J. Gimbel; Three extremal problems in cochromatic theory, Rostock. Math. Kolloq., 30, 73-78 (1986).
[5] P. Erdős, J. Gimbel and D. Kratsch; Some extremal results in cochromatic and dichromatic theory, J. Graph Theory, 15, 579-585 (1992).
[6] I. Broere and M. Burger; Critically cochromatic graphs, J. Graph Theory, 13, 23-28 (1989).
[7] J. Gimbel and H.J. Straight; Some topics in cochromatic theory, Graphs and Combinatorics, 3, 255-265 (1987).
[8] L. Jørgensen; private communication.
[9] V. Neumann-Lara; The dichromatic number of a digraph, J. Comb. Theory (B), 30, 265-270 (1982).
[10] P. Erdős; Problems and results in number theory, Proc. Ninth Manitoba Conference on Numerical Mathematics and Computing, 3-21 (1979).
[11] B. Bollobás and P. Erdős; Cliques in random graphs, Math. Proc. Camb. Phil. Soc., 80, 419-427 (1976).
[12] D. Matula; The largest clique size in a random graph, Proc. 2nd Chapel Hill Conference on Combinatorial Mathematics and Its Applications, University of North Carolina, 356-369 (1970).
[13] B. Bollobás; The chromatic number of random graphs, Combinatorica, 8, 49-55 (1988).
[14] P. Erdős, J. Gimbel and H.J. Straight; Chromatic number versus cochromatic number in graphs with bounded clique size, Europ. J. Combinatorics, 11, 235-240 (1990).
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 265-274 (1993)
© 1993 Elsevier Science Publishers B.V. All rights reserved.
FROM RANDOM GRAPHS TO GRAPH THEORY
Andrzej RUCIŃSKI Department of Discrete Mathematics Adam Mickiewicz University, Poznań, POLAND
Abstract
Four graph theoretic problems, all born in the theory of random graphs, are presented. They have been stimulated by investigations of the existence and distribution of subgraphs of a given type contained in the binomial random graph. The deterministic problems involve only the notions of a subgraph, subgraph count, and subgraph density and deal with balanced extensions of graphs, convex hulls, globally sparsest Ramsey graphs (in the vertex-coloring case) and so-called p-proportional graphs.
1. Introduction
The theory of random graphs can be located between probability and graph theory. Usually its methods and tools are probabilistic while the inspiration comes from graph theory. The underlying idea is to pick a graph property and ask what can be said about almost all graphs in that respect. The meaning of “almost all” depends on the type of probability space of graphs we choose. Then, of course, the whole knowledge of graph theory is taken for granted. But sometimes random graphs pay back by raising interesting deterministic questions which enrich the ever growing theory of graphs. In this survey paper four such graph theoretic problems, all inspired by investigations of small subgraphs of a binomial random graph, will be presented (sections 3-4). Their presentation will be preceded by a brief overview of the random graph results which led to raising the questions (section 2).
2. Small Subgraphs of Random Graphs
We consider the binomial random graph K(n,p) created on the vertex set [n] = {1, ..., n} by independent, random insertion of each edge with probability p. Thus it results from $\binom{n}{2}$ Bernoulli trials with probability of success p. The parameter p usually depends on n and $n \to \infty$.
The small subgraph problem can be defined as follows. Given a graph G with at least one edge, what is the limit of the probability of the event “$G \subset K(n,p)$” and what is the limit distribution of the sequence of random variables $X_{n,p}(G) = X(G)$, counting subgraphs of K(n,p) isomorphic to G? Such subgraphs will be referred to as copies of G. Both questions can also be formulated in the induced version, when we ask about the existence and distribution of induced copies of G, but only the latter case and only for p constant (not depending on n) differs essentially from the original problem. Below we briefly survey those results and proofs which are related to the forthcoming sections. (For an extensive survey on small subgraphs of random graphs see [1].)

Let us introduce some notation. For a graph G, let $v_G$ and $e_G$ denote the number of vertices and edges, respectively. The ratio $d_G = e_G / v_G$ is called the density of G and we set $m_G = \max_{H \subseteq G} d_H$. For a given G with at least one edge, let $G_i$, $i = 1, \ldots, f = \binom{n}{v_G} v_G! / \mathrm{aut}(G)$, be all copies of G one can construct on [n], where aut(G) is the number of automorphisms of G. These are the potential copies of G which may or may not occur in K(n,p). Let us associate to each such potential copy an indicator random variable $I_i$ equal to 1 if $G_i \subset K(n,p)$ and 0 otherwise, i = 1, ..., f. Then $X(G) = \sum_{i=1}^{f} I_i$ and
$$EX(G) = f p^{e_G} \asymp n^{v_G} p^{e_G} = (n p^{d_G})^{v_G}$$
(here and below $a_n \asymp b_n$ means that the two sequences differ asymptotically by at most a constant factor) and, since $P(X(G) > 0) \le EX(G)$, we obtain the implication that if p = p(n) is such that $n p^{d_G} \to 0$ then $P(X(G) > 0) \to 0$ as $n \to \infty$. However, this can be immediately strengthened by observing that, for every subgraph H of G, $P(X(G) > 0) \le P(X(H) > 0)$, and repeating the argument for that H which maximizes $d_H$. Hence, $P(X(G) > 0) \to 0$ if $n p^{m_G} \to 0$. In 1981 Bollobás [2] completed this observation, proving that
$$P(X(G) > 0) \to 1 \quad \text{if} \quad n p^{m_G} \to \infty. \qquad (1)$$
f f
=zx
i l i=l
e(G,nGj )>o
=CCcov(~,zj)
U(EX(G))2zn-"Hp-eH n
[E(l,,zj)-p2"1
=O((EX(G))2)
H
and so, P(X(G) > 0) IVar X(G)/ (EX(G))
+ 0.
Remark: For the above it is enough to run $\sum_H$ over all induced subgraphs of G with at least one edge. But if we were to be more precise, we should determine which subgraphs of G can be expressed as an intersection of two copies of G. So, we jump into pure graph theory, just for warming up.

Once we know when $P(X(G) > 0) \to 1$, we would like to learn about the rate of this convergence. By the FKG-inequality one gets
$$P(X(G) = 0) \ge (1 - p^{e_G})^{f} > \exp[-EX(G)/(1 - p)],$$
which can be immediately strengthened to
$$P(X(G) = 0) > \exp\Bigl\{-\min_{H \subseteq G,\ e_H > 0} EX(H)/(1 - p)\Bigr\}.$$
However, there was a popular demand for an upper bound of similar form, and indeed, it was established in 1987 by Janson, Łuczak and Ruciński [4]. Let me present here its asymptotic reformulation. Call a subgraph H of G a leading overlap of G if $e_H > 0$ and for all $H' \subseteq G$ with $e_{H'} > 0$, $EX(H) = O(EX(H'))$. (The name comes from the asymptotic expression (2), where the terms with the highest order of magnitude correspond to the leading overlaps H.) If H is a leading overlap of G then for some constant c = c(G) and for all n
$$P(X(G) = 0) \le \exp(-c\,EX(H)). \qquad (3)$$
We shall soon see how powerful this theorem is. It agrees with our intuition that when p gets larger and larger, there are more and more copies of G in K(n,p).But when, i.e., for what range of p = p(n), they become so crowded
;
that the Ramsey property K(n,p ) + (C) begins to hold almost surely? The property should read “for every coloring of the vertices of K(n,p) by at most r colors there is a copy of G whose vertices all have the same color.” For a pure random graphist it may seem a peculiar measure of the saturation of K(n,p) with copies of G, but we are on the right track (cf.the title of the paper). Obviously, if the property does not hold then there must be a subset of at least nl r vertices which induces a subgraph of K(n,p) with no single copy of G in it (take the largest color class of the coloring which does not create any monochromatic copy of G ) .Thus, and here (3) comes in handy, P( K(n,p ) k G : ) < 2”exp (-CEX~,,,,.,,~(H)),where H is a leading overlap of G. (The leading overlaps do not change when switching from n to n I T . ) Now set d i = e G / (vG- 1) if eG > 0 and 0 otherwise and set m> = rnaxHGGdA.Observe if
npmG>C then for each subgraph H of G with at least one edge, EX(H) > C l n , where C1 grows at least linearly with C. Thus, for C large enough P(K(n,p)+ (G)“) + 1 i f
npmG>C. In fact, this Ramsey property appears during the evolution of K ( n , p )very rapidly, since it was recently shown by Luczak, Rucinski and Voigt [5] that, for c small enough, (4)
~ ( ~ ( n , p )( G+ ) ; ) + O i f n p m c < c .
(For $r = 2$ one has to assume that $G$ contains a path on three vertices.) Finally, let us turn to the problem of the limit distribution of the number $Y_{n,p}(G) = Y(G)$ of induced copies of $G$ in $K(n,p)$, when this time $p$ is a fixed real number between 0 and 1. It was proved in 1989 by Barbour, Karoński and Ruciński [6] (see also [7]) that $Y(G)$ satisfies the Central Limit Theorem as long as its variance is of order at least $n^{2v_G - 3}$. To see when it happens, let us set $v = v_G$ and $e = e_G$. Associate to each $v$-element subset $S$ of $[n]$ an indicator random variable $J_S$ equal to 1 if $S$ induces a copy of $G$ in $K(n,p)$ and 0 otherwise. Then

$$Y(G) = \sum_S J_S \quad \text{and} \quad EY(G) = \binom{n}{v}\beta,$$

where $\beta = EJ_S = c_G\,p^e q^{\binom{v}{2} - e}$, $c_G = v!/\mathrm{aut}(G)$ and $q = 1 - p$. To proceed, it is convenient to define edge indicator random variables $L_{ij}$ equal to 1 if there is an edge between $i$ and $j$ and 0 otherwise. Then
and
3. Balanced Extensions
The story began in 1960 when Erdős and Rényi [8] proved (1) but only for balanced graphs, i.e., for graphs $G$ that satisfy $m_G = d_G$. They gave basically the same proof as presented in Section 2, but did not notice (like many later authors) that the argument could be extended to arbitrary graphs. This was a fortunate oversight, since it gave birth to further purely deterministic developments. In 1980 Karoński and Ruciński, when trying to prove (1), came up with the following idea. For a given graph $G$ call a graph $F$ a balanced extension of $G$ if $G \subseteq F$ and $m_F = d_F = m_G$, i.e., if $F$ is a theoretically sparsest balanced supergraph of $G$. We observed that if every graph has a balanced extension then (1) follows in full generality as soon as it is true for balanced graphs. Indeed, one can then apply (1) to $F$, since $m_G = d_F$ and $P(X(G) > 0) \ge P(X(F) > 0)$. Our conjecture was proved in 1985 by Győri, Rothschild and Ruciński [9], and independently by Payan [10]. In the former paper the question about $\mathrm{ext}(G) = \min\{v_F : F \text{ is a balanced extension of } G\}$ was raised. Accidentally, the latter paper provided a nice interpretation of $\mathrm{ext}(G)$ by linking the whole subject to Edmonds' matroidal generalization of Nash-Williams' arboricity theorem. To describe it briefly, let us recall that the edge sets of all spanning trees of a given connected graph form the bases of a matroid called circular. But the spanning subgraphs whose every connected component is unicyclic also define a matroid (called bicircular; see [11]). Specifying Edmonds' theorem for the circular matroid of the multigraph obtained from a graph $G$ with $m'_G = s/t$ by replacing each edge by $t$ parallel edges, we see (cf. [12]) that $\mathrm{ext}'(G) = \min\{v_F : G \subseteq F,\ m'_F = d'_F = m'_G\}$ is the smallest number of vertices in a supergraph of $G$ which can be covered by spanning trees in such a way that each edge belongs to precisely $t$ of them. An analogous interpretation of $\mathrm{ext}(G)$ is obtained by applying Edmonds' result to the bicircular matroid.
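To make the density notions used here concrete, the following brute-force sketch computes $m_G = \max_{H \subseteq G} e_H/v_H$ and the shifted variant $m'_G = \max_{H \subseteq G} e_H/(v_H - 1)$ for small graphs and tests balancedness ($m_G = d_G$). It is only an illustration: the function names are ours, the search is exponential in the number of vertices, and it relies on the observation that both maxima are attained on induced subgraphs.

```python
from fractions import Fraction
from itertools import combinations


def edges_within(edges, vertex_subset):
    """Number of edges with both endpoints in vertex_subset."""
    s = set(vertex_subset)
    return sum(1 for u, v in edges if u in s and v in s)


def max_density(vertices, edges, shift=0):
    """Maximum of e_H / (v_H - shift) over subgraphs H with at least one edge.

    shift=0 gives m_G, shift=1 gives m'_G.  Only induced subgraphs are
    inspected, because adding edges on a fixed vertex set never lowers the ratio.
    """
    vertices = list(vertices)
    best = Fraction(0)
    for k in range(shift + 1, len(vertices) + 1):
        for subset in combinations(vertices, k):
            e = edges_within(edges, subset)
            if e > 0:
                best = max(best, Fraction(e, k - shift))
    return best


def is_balanced(vertices, edges):
    """G is balanced when its own density d_G equals m_G."""
    vertices = list(vertices)
    return max_density(vertices, edges) == Fraction(len(edges), len(vertices))


# K4 is balanced; K4 with a pendant vertex is not (K4 itself is denser).
k4 = list(combinations(range(4), 2))
k4_pendant = k4 + [(3, 4)]
# The forest from the text below with r = 2: two single edges and a path of length 2.
forest = [(0, 1), (2, 3), (4, 5), (5, 6)]
```

For the forest, `max_density` returns 2/3 while the density of the whole graph is 4/7, so any balanced extension has to raise the density of the single-edge components.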
After finding some motivation we can now examine $\mathrm{ext}(G)$ in more detail. A naive approach would be to express $m_G = s/t$, where $s$ and $t$ are relatively prime integers, and hope to be lucky enough to prove that $\mathrm{ext}(G) \le v_G + t - 1$. Unfortunately, after a good start for $t = 1, 2$, we soon realize that our “first thought” conjecture fails for $t = 3$ (take a forest in which $r$ components are single edges and one component is a path of length 2; you then need $r$ extra vertices to make the graph balanced without exceeding $m_G = 2/3$). Hence, the problem deserves a more serious treatment. Let $\mathrm{ext}(n) = \max\{\mathrm{ext}(G) : v_G = n\}$. The constructive proof of existence of a balanced extension brings the bound $\mathrm{ext}(n) < (\frac{1}{4} + o(1))\,n^2$. Isn't it too generous? No, because there are $n$-vertex graphs with $\mathrm{ext}(G) \ge \frac{1}{8} n^2$. For instance, for $n$ even, the graph obtained from an $n/2$-vertex cycle with one diagonal by attaching to one of its vertices $n/2$ pendant vertices is such. Let us prove it. Assume that $m_G > 1$ and that $F$ is a balanced extension of $G$. Then the minimum degree in $F$ must be at least 2 and so

$$2m_G v_F = 2d_F v_F = 2e_F \ge \Delta_G + 2v_F - 2, \quad \text{hence} \quad v_F \ge \frac{\Delta_G - 2}{2(m_G - 1)},$$

where $\Delta_G$ is the maximum vertex degree in the graph $G$. In the example, $\Delta_G - 2 \ge n/2$ and $m_G = 1 + 2/n$, which gives $v_F \ge n^2/8$. Only recently, Ruciński and Vince proved in [13] that, for $n$ sufficiently large, $\mathrm{ext}(n) = \lfloor \frac{1}{8}(n+3)^2 \rfloor$. On the other hand, for most $n$-vertex graphs the $n^2$ bound is much too high. Already in 1985, Erdős conjectured that $\mathrm{ext}(G)$ should be linear in $n$ if only $G$ is sufficiently dense. It was proved by Ruciński and Vince [14] that the conjecture is true whenever $e_G \ge \varepsilon n^2$. Recently, Łuczak and Ruciński [15] proved that, setting $m_G = 1 + \varepsilon$, $\varepsilon = \varepsilon(n)$, if $\varepsilon < 1/9$ or $\varepsilon > 3.25$ then $\mathrm{ext}(G) < cn/\varepsilon$ for an absolute constant $c$.

Problems:
1. The irritating gap that appears in the last result is caused by some technical difficulties. We have applied two different “random constructions” for sparse and dense cases and were not able to extend the range of $\varepsilon$ for which they work any further. We hope, however, to close the gap in the near future by some refinement of our techniques.
2. The proofs in [15] involve random graphs and therefore are nonconstructive. Thus a constructive proof would be welcome.
3. Determine $\mathrm{ext}(n)$ for small $n$.
4. For a hypergraph, it is natural to define its density as one half of the average vertex degree. Then one can ask the same questions as for graphs. The problem of existence of a balanced extension of a hypergraph was examined by Ruciński and Vince [14].

4. Convex Hulls of Graphs
In view of inequality (3) as well as expression (2), we have an unquestionable right to ask which subgraphs of a given graph $G$ become its leading overlaps and when, i.e., for what range of $p = p(n)$. Surprisingly, the answer can be provided in simple geometric terms. Let $\Omega = \{(v_H, e_H) : H \subseteq G,\ e_H > 0\}$ be the set of points of the Cartesian $xy$-plane corresponding to all subgraphs of $G$ with at least one edge (we assume that $G$ itself has at least 2 edges and no isolated points). Let $\Gamma$ be the upper boundary of the convex hull of $\Omega$. Then $H$ is a leading overlap of $G$ if and only if the point $(v_H, e_H)$ lies on $\Gamma$. Moreover, the range of $p$ for which $H$ is a leading overlap of $G$ can be read out from the slopes of the straight line segments of $\Gamma$ which meet at this point. For examples see [1], [4] and [16]. Hence, (2) and (3) change their form as many times as the number of extreme points that $\Gamma$ has. Let us denote this number by $J_G$. Clearly, $J_G \ge 2$ for every graph $G$, but how large can $J_G$ be? Let us define $\gamma_n(\mathcal{F}) = \max\{J_G : v_G = n,\ G \in \mathcal{F}\}$, where $\mathcal{F}$ is a family of graphs. For graphs and bipartite graphs the question was answered in [17], with corresponding values of $\gamma_n(\mathcal{F})$ equal, asymptotically, to $2n/5$ and $2n/7$. The graphs $G$ with $J_G = 2$ are precisely those for which every subgraph $H$ with at least 3 vertices satisfies the inequality $(e_H - 1)/(v_H - 2) \le (e_G - 1)/(v_G - 2)$. This condition resembles the definition of balanced graphs and, indeed, every such graph is balanced and even, except for disjoint unions of edges, strictly balanced. ($G$ is strictly balanced if for every proper subgraph $H$ the inequality $e_H/v_H < e_G/v_G$ holds. All trees and regular connected graphs are strictly balanced. Strictly balanced graphs play an important role in the theory of random graphs as they are the only graphs for which $X(G)$ enjoys the Poisson convergence.) The converse is not true, as the graph obtained from $K_5$ by deleting two incident edges is strictly balanced and has 3 extreme points. But, at least intuitively, balanced graphs should not have too many extreme points. This intuition is confirmed (or, maybe, undermined) by the following result recently proved by Łuczak and Ruciński [18]. Let $\mathcal{B}$ and $\mathcal{S}$ be the families of all balanced and strictly balanced graphs, respectively, and let $\gamma_{n,N}(\mathcal{F}) = \max\{J_G : v_G = n,\ e_G = N,\ G \in \mathcal{F}\}$.
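Then, for $N = N(n) > (\log\log n)^3 n^{3/2}$,

$$\gamma_{n,N}(\mathcal{B}) \asymp \gamma_{n,N}(\mathcal{S}) \asymp N^{1/3}$$

and, in particular,

$$\gamma_n(\mathcal{B}) \asymp \gamma_n(\mathcal{S}) \asymp n^{2/3}.$$

The proof again makes use of random graphs.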
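The geometric description of leading overlaps can be tried out on small graphs. The sketch below (our own helper names, exponential-time enumeration, intended only as an illustration) records, for each number of vertices $k \ge 2$, the largest edge count of a $k$-vertex subgraph, which is all that matters for the upper boundary, and then extracts the extreme points of that boundary with a standard monotone-chain scan.

```python
from itertools import combinations


def top_points(vertices, edges):
    """Points (k, max number of edges among k-vertex subgraphs), for k >= 2."""
    vertices = list(vertices)
    points = []
    for k in range(2, len(vertices) + 1):
        best = max(
            sum(1 for u, v in edges if u in s and v in s)
            for s in map(set, combinations(vertices, k))
        )
        if best > 0:
            points.append((k, best))
    return points


def cross(o, a, b):
    """Orientation of the turn o -> a -> b (positive means counterclockwise)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def upper_hull(points):
    """Extreme points of the upper boundary of the convex hull, left to right."""
    hull = []
    for p in sorted(points):
        # Drop the middle point unless the chain makes a strict right turn.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


# K5 with two incident edges removed, the example mentioned in the text.
k5_minus_two = [e for e in combinations(range(5), 2) if e not in {(0, 1), (0, 2)}]
# The 4-cycle, a graph whose upper boundary is a single segment.
c4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
```

For $K_5$ minus two incident edges the scan keeps the three points $(2,1)$, $(4,6)$ and $(5,8)$, in agreement with the count of 3 extreme points quoted above; the point $(3,3)$ lies strictly below the segment joining $(2,1)$ and $(4,6)$.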
Problems:
1. We conjecture that our last result is also true for sparse graphs, i.e., when $N \le (\log\log n)^3 n^{3/2}$.
2. Determine, at least asymptotically, $\gamma_n(\mathcal{R})$ for the family $\mathcal{R}$ of regular graphs. Since $\mathcal{R} \subset \mathcal{B}$, we know (and this is all we know) that $\gamma_n(\mathcal{R}) = O(n^{2/3})$.
3. What graphs maximize the area of their convex hulls?
4. It seems to be interesting to study a multi-dimensional version of the problem by adding other parameters, like minimum degree or chromatic number, to the pair $(e_H, v_H)$.
5. Sparse Ramsey Graphs

The subject of sparse Ramsey graphs is a well established offspring of Ramsey theory. It originated from the forbidden subgraph problem and turned to the search for local density restrictions. Let us illustrate this kind of result by the following observation whose short proof based on (3) can be found in [5]. For all graphs $G$ and all positive integers $r$ and $k$ there exists a graph $F$ such that $F \to (G)^v_r$ and, for each subgraph $H$ of $F$ with $1 < v_H < k$, the inequality $d'_H \le m'_G$ holds. But in the course of proving (4) we came up with another idea of how to measure the density of Ramsey graphs. Let us first sketch the proof of (4). When $G$ is a forest then (4) easily follows from what is known about the structure of $K(n, c/n)$, $c < 1$. Assuming $G$ is not a forest, we define a cluster as a subgraph of $K(n,p)$, maximal with respect to containment, which is a union of copies of $G$ such that, for every bipartition of the copies, there are two of them, not in the same class, which share an edge. We then show that, almost surely, all clusters have a bounded size, intersect each other in at most one vertex, and that the hypergraph with edges being vertex sets of the clusters consists of isolated trees and unicyclic components, the latter built exclusively of pure clusters, where pure clusters are those which are just single copies of $G$. Therefore we realize, and this is the crucial point, that $K(n,p) \not\to (G)^v_r$ almost surely if only every cluster alone satisfies this property. But the clusters $F$, as small subgraphs of $K(n,p)$, have the maximum subgraph density $m_F \le m'_G$. Thus we have arrived in the proof of (4) at the final stage where we are to show that, for a graph $F$ with $m_F \le m'_G$, $F \not\to (G)^v_r$. This is, of course, a purely deterministic problem which led us to raise the following question: Given a graph $G$ and an integer $r$, find $m_{cr}(G, r) = \inf\{m_F : F \to (G)^v_r\}$. Thus, we ask about globally sparsest Ramsey graphs as opposed to the previously mentioned problem of local density. As every graph has a balanced extension (cf. Section 3), the question is equivalent to asking about $\inf\{d_F : F \to (G)^v_r,\ F \text{ is balanced}\}$. The following lemma, which can be found in [5], completes the proof of (4) (since for nonforest $G$, $\max_{H \subseteq G} \delta(H) > m'_G$) and also provides a lower bound for $m_{cr}(G, r)$.
Lemma: Let $r$ be a positive integer and let $G$ and $F$ be graphs satisfying $m_F < \frac{r}{2} \max_{H \subseteq G} \delta(H)$, where $\delta$ is the minimum degree. Then $F \not\to (G)^v_r$.
For complete graphs $K_s$ it follows from the lemma that $m_{cr}(K_s, r) \ge r(s-1)/2$, but on the other hand $d_{K_{(s-1)r+1}} = r(s-1)/2$ and, by the pigeon-hole principle, $K_{(s-1)r+1} \to (K_s)^v_r$. Hence $m_{cr}(K_s, r) = r(s-1)/2$. At the moment, these are the only known values of $m_{cr}(G, r)$, and the next to come is $m_{cr}(P_2, 2)$, where $P_2$ is a path of length 2. We only know that it lies between 4/3 and 7/5, inclusively. The upper bound follows by constructing an infinite sequence of graphs $G_n$, $n = 1, 2, \ldots$. Graph $G_n$ is obtained from the cycle on the vertex set $\{1, 2, \ldots, 2n+1\}$ by attaching at each odd $i$ a copy of $K_4 - \{\text{edge}\}$, identifying with $i$ a vertex of degree 3. It can be easily verified that $G_n \to (P_2)^v_2$ and that $m_{G_n} = d_{G_n} \searrow 7/5$. The lower bound of 4/3 is proved in a recent paper [19]. Also, there is a general upper bound twice as large as the one established in the lemma. It is asymptotically achieved by sufficiently large stars.
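The claim about the graphs $G_n$ can be checked by exhaustive search for small $n$. In the sketch below (function names and vertex labels are ours), a 2-coloring avoids a monochromatic $P_2$ exactly when no vertex has two neighbours of its own color, so it suffices to run through all colorings and look for one with that property. The enumeration is exponential and is meant only as a sanity check for $n = 1, 2$.

```python
from fractions import Fraction
from itertools import product


def build_G(n):
    """Edge list of G_n: the cycle 1, 2, ..., 2n+1 with a copy of K4 minus an
    edge attached at every odd cycle vertex i, where i plays the role of one
    of the two degree-3 vertices of the attached copy."""
    length = 2 * n + 1
    edges = [(i, i % length + 1) for i in range(1, length + 1)]
    for i in range(1, length + 1, 2):
        a, b, c = (i, "a"), (i, "b"), (i, "c")
        # K4 on {i, a, b, c} without the edge {b, c}
        edges += [(i, a), (i, b), (i, c), (a, b), (a, c)]
    return edges


def density(edges):
    """e_G / v_G of the whole graph."""
    vertices = {x for e in edges for x in e}
    return Fraction(len(edges), len(vertices))


def arrows_P2(edges, r=2):
    """True if every r-coloring of the vertices contains a monochromatic
    path of length 2 (three vertices, not necessarily induced)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    vertices = list(adj)
    for colors in product(range(r), repeat=len(vertices)):
        col = dict(zip(vertices, colors))
        # Good coloring: every vertex has at most one neighbour of its own color.
        if all(sum(col[w] == col[u] for w in adj[u]) <= 1 for u in vertices):
            return False
    return True


k4_minus_edge = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3)]
```

The whole-graph density of $G_n$ is $(7n+6)/(5n+4)$, for instance 13/9 for $n = 1$ and 10/7 for $n = 2$, which decreases towards 7/5. A single copy of $K_4$ minus an edge does not have the Ramsey property, so the cycle is essential to the construction.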
6. Proportional Graphs
At the end of Section 2 we pointed out that for fixed $p$, $Y(G)$, the number of induced copies of $G$ in $K(n,p)$, does not obey the Central Limit Theorem if and only if (5) and (6) hold. Let us resolve the two equations in terms of conditions they impose on the structure of $G$. For (5),

$$E(J_{[v]} \mid L_{12} = 1) = c'_G\, p^{e-1} q^{\binom{v}{2} - e},$$

where $c'_G$ stands for the number of copies of $G$ on $[v]$ with the edge $\{1,2\}$. (Let us recall that $e = e_G$, $v = v_G$, and $q = 1 - p$.) Counting, in two ways, the copies of $G$ with a rooted edge we have

$$c'_G \binom{v}{2} = c_G\, e, \quad \text{so that} \quad E(J_{[v]} \mid L_{12} = 1) = \beta \iff e = p\binom{v}{2}.$$

A similar argument gives the same value of $p$ as a solution in the case when $L_{12} = 0$. As far as (6) is concerned,

$$E(J_{[v]} \mid L_{12} = L_{13} = L_{23} = 1) = c''_G\, p^{e-3} q^{\binom{v}{2} - e},$$

where $c''_G$ is the number of copies of $G$ on $[v]$ containing the edges $\{1,2\}$, $\{1,3\}$, $\{2,3\}$ and, denoting by $t_3(G)$ the number of triangles in $G$, $c''_G \binom{v}{3} = c_G\, t_3(G)$. Hence,

$$E(J_{[v]} \mid L_{12} = L_{13} = L_{23} = 1) = \beta \iff t_3(G) = p^3 \binom{v}{3}.$$

Among the seven cases left there is essentially only one which deserves presentation. Let $t_2(G)$ be the number of induced paths of length 2 in $G$, and let $c'''_G$ be the number of copies of $G$ on $[v]$ containing the edges $\{1,2\}$, $\{2,3\}$ but not containing $\{1,3\}$. Then

$$E(J_{[v]} \mid L_{12} = L_{23} = 1,\ L_{13} = 0) = \beta \iff t_2(G) = 3p^2 q \binom{v}{3},$$

since $3\binom{v}{3} c'''_G = c_G\, t_2(G)$. Setting $t_1(G) = t_2(G^c)$ and $t_0(G) = t_3(G^c)$, where $G^c$ is the complement of $G$, we conclude that a graph $G$ satisfies (6) if and only if

$$t_3(G) : t_2(G) : t_1(G) : t_0(G) = p^3 : 3p^2 q : 3pq^2 : q^3. \qquad (7)$$

That is, the counts of the four triad types are in the binomial proportion. Let us call graphs with $e_G = p\binom{v}{2}$ and satisfying (7) $p$-proportional. Do there exist such graphs? At the time
of writing the papers [6] and [7] only a few examples of 1/2-proportional 8-vertex graphs were known, among them the wheel. But even for such a well structured graph as a wheel it is not so easy to verify (7). Fortunately, using identities known already to Goodman in 1959 [20], one can prove that (7) is implied by $t_3(G) = p^3 \binom{n}{3}$ and $\sum_x d_x^2 = pn(n-1)(1 + p(n-2))$, where $n$ is the number of vertices and $d_x$ is the degree of vertex $x$ in the graph $G$. This was actually shown by Janson and Kratochvíl in their very recent paper [21] concerned with proving the existence of $p$-proportional graphs. They solved the problem only partially. Below I state their two main results and the conjectures they pose.

Theorem 1: [21] There is an $r_0$ such that for all $r > r_0$, $r \not\equiv 2 \pmod 3$, there is a 1/2-proportional graph on $8r$ vertices.

Conjecture 1: There are $n$-vertex 1/2-proportional graphs for all $n$ satisfying $n \equiv 0, 1, 8 \pmod{16}$.

The condition on $n$ given in Conjecture 1 is necessary.

Theorem 2: [21] If $0 < p < 1$, $p = r/s$, where $r$ and $s$ are integers such that $s$ is odd and the largest power of 2 which divides $r(s - r)$ is even, then there exist infinitely many $p$-proportional graphs.

The set of rationals for which Theorem 2 holds is dense in [0,1].

Conjecture 2: For every rational number $p$ between 0 and 1 there is a $p$-proportional graph.

Comment: During the Oberwolfach meeting on Random Graphs in Fall 1990 Janson revealed that a student of his, Kärrman, had refined the construction from [21] and proved Conjecture 2. Moreover, Janson and Spencer seem to have a probabilistic proof of it. So, once again, random graphs come back through the back door to solve deterministic problems which originated in the theory of random graphs.
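The wheel example mentioned in this section can be verified mechanically. The sketch below (our own function names, plain enumeration of all vertex triples) counts the triples inducing 0, 1, 2 and 3 edges, which are exactly $t_0$, $t_1$, $t_2$, $t_3$, and compares them with the binomial proportion together with the edge condition $e_G = p\binom{n}{2}$. It assumes the vertices are labelled $0, \ldots, n-1$ and that $p$ is given as an exact fraction.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def triad_counts(n, edges):
    """Return [t0, t1, t2, t3]: the numbers of 3-subsets of {0, ..., n-1}
    that induce exactly 0, 1, 2 and 3 edges."""
    edge_set = {frozenset(e) for e in edges}
    counts = [0, 0, 0, 0]
    for triple in combinations(range(n), 3):
        k = sum(frozenset(pair) in edge_set for pair in combinations(triple, 2))
        counts[k] += 1
    return counts


def is_proportional(n, edges, p):
    """Check e_G = p * C(n,2) and t3 : t2 : t1 : t0 = p^3 : 3p^2q : 3pq^2 : q^3."""
    q = 1 - p
    triples = comb(n, 3)
    expected = [q**3 * triples, 3 * p * q**2 * triples,
                3 * p**2 * q * triples, p**3 * triples]
    return len(edges) == p * comb(n, 2) and triad_counts(n, edges) == expected


# The wheel on 8 vertices: hub 0 joined to the 7-cycle 1, 2, ..., 7.
rim = list(range(1, 8))
wheel = [(0, x) for x in rim] + [(x, x % 7 + 1) for x in rim]
# The 8-cycle, which has too few edges to be 1/2-proportional.
cycle8 = [(i, (i + 1) % 8) for i in range(8)]
```

For the wheel the four counts are 7, 21, 21, 7, i.e., $\binom{8}{3} = 56$ triples split in the proportion 1 : 3 : 3 : 1, and the graph has $14 = \frac{1}{2}\binom{8}{2}$ edges.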
Acknowledgements

An earlier version of this paper was written during the author's visit to Fakultät für Mathematik, Universität Bielefeld, in Spring 1990, supported by SFB 343. The final version was prepared when the author visited Department of Mathematics, University of Gainesville in Fall 1990.
References

[1] A. Ruciński; Small subgraphs of random graphs - a survey, Proceedings of Random Graphs '87, Wiley, Chichester, 283-303 (1990).
[2] B. Bollobás; Threshold functions for small subgraphs, Math. Proc. Cambr. Phil. Soc., 90, 197-206 (1981).
[3] A. Ruciński and A. Vince; Balanced graphs and the problem of subgraphs of random graphs, Congress. Numerantium, 49, 181-190 (1985).
[4] S. Janson, T. Łuczak and A. Ruciński; An exponential bound for the probability of nonexistence of a specified subgraph of a random graph, Proceedings of Random Graphs '87, Wiley, Chichester, 73-87 (1990).
[5] T. Łuczak, A. Ruciński and B. Voigt; Ramsey properties of random graphs, J. Comb. Theory Ser. B (to appear).
[6] A. Barbour, M. Karoński and A. Ruciński; A central limit theorem for decomposable random variables with applications to random graphs, J. Comb. Theory Ser. B, 47, 125-145 (1989).
[7] S. Janson; A functional limit theorem for random graphs with applications to subgraph count statistics, Random Structures Algorithms, 1, 15-37 (1990).
[8] P. Erdős and A. Rényi; On the evolution of random graphs, Publ. Math. Inst. Hung. Acad. Sci., 5, 17-61 (1960).
[9] E. Győri, B. Rothschild and A. Ruciński; Every graph is contained in a sparsest possible balanced graph, Math. Proc. Cambr. Phil. Soc., 98, 397-401 (1985).
[10] C. Payan; Graphes équilibrés et arboricité rationnelle, Europ. J. Combin., 7, 263-270 (1986).
[11] J.M.S. Simões-Pereira; On subgraphs as matroid cells, Math. Z., 127, 315-322 (1972).
[12] P.A. Catlin, J.W. Grossman, A.M. Hobbs and H.-J. Lai; Fractional arboricity, strength, and principal partitions in graphs and matroids, Research Report CORR 89-13, Faculty of Mathematics, University of Waterloo (1989).
[13] A. Ruciński and A. Vince; The solution to an extremal problem on balanced extensions of graphs - submitted.
[14] A. Ruciński and A. Vince; Balanced extensions of graphs and hypergraphs, Combinatorica, 8, 279-291 (1988).
[15] T. Łuczak and A. Ruciński; Balanced extensions of sparse graphs - to appear.
[16] A. Ruciński; When are small subgraphs of a random graph normally distributed?, Prob. Th. Rel. Fields, 78, 1-10 (1988).
[17] A. Ruciński; On convex hulls of graphs, Ars Combinatoria (to appear).
[18] T. Łuczak and A. Ruciński; Convex hulls of dense balanced graphs - to appear.
[19] A. Kurek and A. Ruciński; Globally sparse vertex-Ramsey graphs - submitted.
[20] A.W. Goodman; On sets of acquaintances and strangers at any party, Amer. Math. Mon., 66, 778-783 (1959).
[21] S. Janson and J. Kratochvíl; Proportional graphs, Random Structures Algorithms, 2, 209-224 (1991).
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 275-312 (1993) © 1993 Elsevier Science Publishers B.V. All rights reserved.
MATCHING AND VERTEX PACKING: HOW “HARD” ARE THEY? Michael D. PLUMMER Department of Mathematics, Vanderbilt University Nashville, Tennessee, U.S.A.
Abstract

Two of the most well-known problems in graph theory are: (a) Find a maximum matching (or perfect matching, if one exists), and (b) Find a maximum independent set of vertices. The first problem - usually called the matching problem - is known to have a polynomial time algorithm; the second - often called the vertex packing problem - is known to be NP-complete. However, many graph theorists - especially those who do not deal much with complexity of algorithms - know little more about the complexity issues associated with these two problems than these two basic facts. What is not so widely known within the graph theory community is that these two problems have motivated a great deal of recent activity in the area of algorithms and their complexity. Of course it is not known whether or not P = NP, but most workers in the area currently believe that equality is unlikely to hold. Motivated by this belief, a number of people have studied variations of both matching and vertex packing with the general theme being two-fold. On the one hand, one can add various side conditions to the matching problem and study the complexity - both sequential and parallel - of the resulting problems. On the other hand, one can investigate certain large and interesting classes of graphs trying to prove that for these classes the vertex packing problem has a polynomial solution. Each branch of this two-pronged attack has yielded both interesting theorems and perplexing unsolved problems. This paper will survey this work.
1. Introduction and Background: The Two Fundamental Problems
In this paper, graphs will be assumed to be connected and undirected, and will have no multiple edges or loops. A matching is any set of independent edges, i.e., no two have a vertex in common. A maximal matching is a matching not properly contained in any other matching. A maximum matching is one of largest cardinality. A perfect matching (sometimes called a 1-factor) in a graph $G$ is a matching which covers all vertices of $G$. A set of vertices $S \subseteq V(G)$ is independent if no two vertices of $S$ are adjacent. An independent set $S$ is maximal if it is not a proper subset of any other independent set and maximum if it is an independent set of largest cardinality. So far, then, maximal and maximum matchings and independent sets are quite analogous; matchings corresponding to sets of edges and independent sets corresponding to sets of vertices. To be sure, they are related. A (maximal, maximum) matching in a graph $G$ corresponds to a (maximal, maximum) independent set in the line graph $L(G)$. But we shall soon see that the concepts quickly diverge in difficulty.
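The correspondence between matchings in $G$ and independent sets in $L(G)$ can be seen on small examples. The following sketch (helper names are ours; both searches are brute force and exponential, so this is an illustration rather than an algorithm one would use) builds the line graph and compares the size of a maximum matching of $G$ with the size of a maximum independent set of $L(G)$.

```python
from itertools import combinations


def line_graph(edges):
    """Vertices of L(G) are the edges of G; two of them are adjacent
    exactly when they share an endpoint in G."""
    nodes = [frozenset(e) for e in edges]
    links = [(e, f) for e, f in combinations(nodes, 2) if e & f]
    return nodes, links


def max_matching_size(edges):
    """Largest number of pairwise disjoint edges (brute force)."""
    nodes = [frozenset(e) for e in edges]
    for k in range(len(nodes), 0, -1):
        for subset in combinations(nodes, k):
            if all(not (e & f) for e, f in combinations(subset, 2)):
                return k
    return 0


def max_independent_set_size(vertices, edges):
    """Largest number of pairwise non-adjacent vertices (brute force)."""
    edge_set = {frozenset(e) for e in edges}
    vertices = list(vertices)
    for k in range(len(vertices), 0, -1):
        for subset in combinations(vertices, k):
            if all(frozenset((u, v)) not in edge_set
                   for u, v in combinations(subset, 2)):
                return k
    return 0


c5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
k4 = list(combinations(range(4), 2))
```

For the 5-cycle and for $K_4$ both quantities equal 2: a set of edges of $G$ is a matching precisely when the corresponding vertices of $L(G)$ are pairwise non-adjacent.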
First, however, let us note that matching and vertex packing remain closely related, at least in the computational sense, if one considers the class of bipartite graphs. Historically speaking, it was this class of graphs which was first studied with the task in mind of finding maximum matchings and independent sets. Implicit in the early work of König and Egerváry were the roots of the first bipartite matching algorithms. (For much more complete histories of matching algorithms see [1] and [2].) Using classical alternating path (or slightly more efficiently, alternating tree) arguments, one can find an algorithm to find a maximum matching in a bipartite graph. Moreover, the algorithm provides a maximum matching in a number of steps polynomial in the size of the input encoding of the graph. (See the next section for more information on polynomial and other types of complexity.) As an important bonus, however, these classical bipartite matching algorithms also yield a minimum vertex cover of the graph, i.e., a smallest set of vertices which collectively contain at least one endvertex of every edge in the graph. One of the classical results of bipartite matching and covering due to König [3], [4] says that the size of any maximum matching equals the size of any minimum vertex cover. At this point, one need only observe the simple, but important, fact that the complement of a (minimum) vertex cover must be a (maximum) independent set of vertices (cf. Gallai [5]) and presto! We have a polynomial algorithm for vertex packing.
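The chain of observations above (alternating paths give a maximum matching, the matching gives a minimum cover, the complement of the cover is a maximum independent set) can be sketched as follows. This is a minimal illustration, not one of the historical algorithms: it uses the simple augmenting-path search, then the standard construction from the proof of König's theorem, in which the cover consists of the left vertices not reachable by alternating paths from unmatched left vertices together with the right vertices that are reachable. All names are ours, and the two sides are assumed to carry distinct labels.

```python
def max_bipartite_matching(left, adj):
    """adj[u] lists the right-neighbours of left vertex u.
    Returns a dict mapping matched right vertices to their left partners."""
    match_right = {}

    def try_augment(u, seen):
        # Look for an alternating path from u ending at a free right vertex.
        for v in adj.get(u, ()):
            if v in seen:
                continue
            seen.add(v)
            if v not in match_right or try_augment(match_right[v], seen):
                match_right[v] = u
                return True
        return False

    for u in left:
        try_augment(u, set())
    return match_right


def konig_cover(left, right, adj):
    """Return (matching, minimum vertex cover, maximum independent set)."""
    match_right = max_bipartite_matching(left, adj)
    match_left = {u: v for v, u in match_right.items()}
    # Alternating search started from the unmatched left vertices.
    reached_left = {u for u in left if u not in match_left}
    reached_right = set()
    stack = list(reached_left)
    while stack:
        u = stack.pop()
        for v in adj.get(u, ()):
            if v not in reached_right:
                reached_right.add(v)
                w = match_right.get(v)
                if w is not None and w not in reached_left:
                    reached_left.add(w)
                    stack.append(w)
    cover = (set(left) - reached_left) | reached_right
    independent = (set(left) | set(right)) - cover
    return match_right, cover, independent


left, right = ["a", "b", "c"], [1, 2]
adj = {"a": [1, 2], "b": [1], "c": [1]}
matching, cover, independent = konig_cover(left, right, adj)
```

In this small example the maximum matching has two edges, the cover returned has two vertices, and the remaining three vertices form an independent set, as König's theorem and the complement observation predict.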
For graphs which are not bipartite, however, the situation changes dramatically. Computationally the two roads of matching and vertex packing diverge quickly. We will follow the route of matching first. Although the (polynomial) roots of matching algorithms for bipartite graphs date to the 1930's with König-Egerváry, it was not until 1965 that the first polynomial algorithm for graphs in general was found by Edmonds [6]. Edmonds himself gave an implementation in $O(n^4)$ time. (As we write this paper, Blum [7]-[9] claims that Edmonds' implementation can be shortened to $O(n^3)$ without complications.) Since Edmonds' original paper, there have been a number of papers successively reducing the time bound.
For the past ten years or so, Micali and V.V. Vazirani [10] have held the record of $O(\sqrt{n}\,m)$ time. Interestingly, the first proof of correctness of this algorithm did not appear until 1989 [11] and is some forty pages in length! It is interesting to note that, again until very recently, the bound for bipartite graphs has been no better than that for general graphs, although Hopcroft and Karp [12], [13] attained the $O(\sqrt{n}\,m)$ bound some years earlier than did Micali and Vazirani. This bound stood until earlier this year when new breakthroughs in bipartite matching were made by the quartet of Alt, Blum, Mehlhorn and Paul, and, independently, by Feder and Motwani [14], [15]. The first four authors improved the time bound to $O(n^{1.5}\sqrt{m/\log n})$ using a new “fast adjacency matrix scanning technique” due to Cheriyan, Hagerup and Mehlhorn [16]. This bound improves the old time bound by a factor of $\sqrt{\log n}$ when the graph is dense, i.e., when the number of edges is of order $n^2$. Feder and Motwani, on the other hand, developed an algorithm which is even faster. Their algorithm finds a maximum matching in a bipartite graph in time $O(\sqrt{n}\,m/\kappa(n,m))$ where $\kappa(n,m) = (\log n)/\log(n^2/m)$.
But now let us return to the fork in the road and discuss vertex packing for graphs in general. The best deterministic sequential algorithm known to the author is $O(2^{n/3})$ due to Tarjan and Trojanowski [17], which clearly shows the large gap presently extant between matching and vertex packing. Consider the special case when $G$ is the line graph of another graph, a situation which was observed above to guarantee a polynomial algorithm for vertex packing in $G$. It is well known that line graphs have a characterization in terms of a list of nine forbidden induced subgraphs (cf. Harary [18]). Of these nine subgraphs, perhaps the most widely studied is the claw $K_{1,3}$. A graph containing no induced claw is said to be claw-free. In 1980, Minty [19] and Sbihi [20] independently proved that if a graph is claw-free, then the vertex packing problem can be solved in polynomial time.

The remainder of this paper will be organized as follows. In Section 2 we present a rather intuitive survey of the complexity classes in which we will be primarily interested, together with an illustration showing the known containment relationships among the classes. Section 3 contains a number of variations on matching which have been formulated and the status of their complexity, if known. Section 4 deals with vertex packing variants and relations in much the same way that Section 3 deals with matching. Sections 5 and 6 deal with matching and vertex packing in parallel, respectively. Section 7 deals with the status of the matching and vertex packing problems involving counting, both exact and approximate. Finally, Section 8 is a short introduction to the work on lower bounds for the complexity of matching and vertex packing, especially that of Razborov. We conclude with a list of over 260 references.

2. Complexity Classes
So far we have spoken only about algorithms which are polynomial. But what does this mean precisely? What kind of computational devices are being used? Does it matter? What is “NP” anyway? In order not to get bogged down, we will, for the most part, unashamedly sidestep these important issues, trying instead to present our complexity results at a more intuitive level. Fortunately, we have excellent resources to fall back on. The “bible” of complexity theory remains the book of Garey and Johnson [21], together with Johnson's ongoing guide in the Journal of Algorithms. This author, himself a neophyte in the jungles of complexity theory, feels compelled to admit that after giving his talk at the Quo Vadis, Graph Theory? meeting, which corresponded to a rough first draft of this paper, he soon found himself spending nearly a year in the subsequent ferreting out of such things as what a “Monte Carlo” algorithm truly is (not everyone quite agrees on the definition, it seems!) and, most of all, trying to compile the table of complexity classes and the corresponding lattice of containments shown in Figure 1. With immense feelings of relief (mixed with not a little irony) after we had suffered through most of this, we discovered the newly published Handbook of Theoretical Computer Science, Volume A: Algorithms and Complexity. Most of our lattice can be gleaned from the collection of lattices provided by Johnson in Chapter 2 of this book [22]. In general, four chapters in this book are excellent sources for other facets of our survey. We refer our readers to Chapter 1 [23] by van Emde Boas for machine models, Chapter 2 [22] by Johnson on complexity classes and their interrelationships, Chapter 14 by Boppana and Sipser [24] for the Boolean circuit approach to complexity and to Chapter 17 [25], [26] by Karp and Ramachandran for parallel computation. Other good references on various aspects of complexity that were especially helpful to the author include [27]-[56].
The types of problems of concern to us come in different varieties: (1) decision problems, often called simply “yes-no” problems (“Does graph $G$ have a perfect matching?”); (2) search problems (“Given graph $G$, find a maximum independent set of vertices in $G$”); and (3) counting problems (“Given a graph $G$, compute $\Phi(G)$, the number of perfect matchings in $G$”). This would seem to be an appropriate place to be more precise about decision versions of our two fundamental problems. In particular the “yes-no” variants of matching and vertex packing which we shall most often refer to are:
(1) Given graph $G$ and $k > 0$, does $G$ have a matching of size $\ge k$?
(2) Given graph $G$ and $k > 0$, does $G$ have an independent set of size $\ge k$?
We will call problem (1) the matching problem and denote it by MATCH. Problem ( 2 ) will be called the vertex packing problem and will be referred to as VP. Sometimes others have been known to call the perfect matching problem by our term MATCH, but this should cause us no problems in this paper.
Now as to complexity classes. In Section 1, we began our discussion with the matching problem, a problem known to have a polynomial solution. A problem is in class P if there is an algorithm to solve it which runs in time polynomial in the size of the input. In other words, the algorithm always terminates in a number of steps polynomial in input size. Historically, it is fair to say, one of the most important roots of computational complexity theory is the concept of the class NP. These are the problems which can be solved in nondeterministic polynomial time. We will mostly avoid the treatment of non-deterministic machines and circuits by instead explaining NP via the “certificate” method. The NP machine is allowed to “guess” a solution to a given problem (or “consult an oracle”), but then must provide a certificate for the solution which can be checked in polynomial time. For example, the problem “Does graph $G$ contain a Hamilton cycle?” is in class NP, for given an H-cycle in $G$, one can convince himself/herself in polynomial time that it is indeed a Hamilton cycle. (Just draw it! That is, output it as an alternating sequence of vertices and edges and then check the sequence to be sure that all the vertices of $V(G)$, except exactly one, appear once, and this one exception (the “beginning” and “end” vertex) appears exactly twice. Finally, check that each edge in the sequence joins the vertex preceding it in the sequence to the vertex succeeding it.) It is crucial to understanding the concept of NP that we realize that how the Hamilton cycle was found in the first place is irrelevant; we are only interested in certifying in polynomial time that it is indeed a Hamilton cycle. Now what if the input graph $G$ does not have a Hamilton cycle? No “certificate” of this fact in the above sense is known.
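The certificate check just described is easy to write down. In the sketch below (the function name and the input conventions are ours), the certificate is given as a closed sequence of vertices only; since the graphs here have no multiple edges, the edge between two consecutive vertices is determined, so listing the edges separately is not needed. The check runs in time polynomial in the size of the graph and says nothing about how the cycle was found.

```python
def is_hamilton_cycle(vertices, edges, sequence):
    """Verify a Hamilton cycle certificate.

    sequence is a closed walk v0, v1, ..., v_{n-1}, v0 given as a list of
    vertices; the first vertex is repeated at the end."""
    vertices = set(vertices)
    edge_set = {frozenset(e) for e in edges}
    if len(vertices) < 3:
        return False
    # The start vertex appears twice, every other vertex exactly once.
    if len(sequence) != len(vertices) + 1 or sequence[0] != sequence[-1]:
        return False
    if set(sequence[:-1]) != vertices:
        return False
    # Consecutive vertices of the sequence must be joined by an edge.
    return all(frozenset((a, b)) in edge_set
               for a, b in zip(sequence, sequence[1:]))


# A 5-cycle with one chord.
vertices = range(5)
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)]
```

With this graph, the sequence 0, 1, 2, 3, 4, 0 is accepted, while 0, 2, 1, 3, 4, 0 is rejected because 1 and 3 are not adjacent.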
The opposite “yes-no” situation occurs for the following question: “Is every maximal independent set in graph $G$ of the same size?” (Such graphs are called well-covered or w-c for short and were introduced in [57].) This time if the answer is “no”, it is easy to certify. Just exhibit two maximal independent sets of different sizes. Note that maximality is easy to check. But if the answer is “yes”, there is no known certification. Define the class co-NP to be the class of all “yes-no” problems such that if the answer is “no”, it can be certified in polynomial time. So the Hamilton cycle problem is in NP and the well-covered graph problem is in co-NP. Clearly, P $\subseteq$ NP $\cap$ co-NP. Does equality hold? Is P = NP? Is NP = co-NP? All are open questions.
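The “no” certificate for the well-covered question can be checked in the same spirit. The sketch below uses names of our own choosing and assumes the graph is given as a dictionary of neighbour sets; it verifies that each of two given sets is independent and maximal (every outside vertex has a neighbour inside) and that their sizes differ.

```python
def is_maximal_independent(adj, candidate):
    """adj maps each vertex to the set of its neighbours (no loops)."""
    s = set(candidate)
    independent = all(adj[u].isdisjoint(s) for u in s)
    # Maximal: no outside vertex could be added without creating an edge.
    maximal = all(not adj[v].isdisjoint(s) for v in adj if v not in s)
    return independent and maximal


def refutes_well_covered(adj, first, second):
    """True if the two sets certify that the graph is not well-covered."""
    return (is_maximal_independent(adj, first)
            and is_maximal_independent(adj, second)
            and len(set(first)) != len(set(second)))


# Path on three vertices 0 - 1 - 2: {1} and {0, 2} are both maximal.
p3 = {0: {1}, 1: {0, 2}, 2: {1}}
# The 4-cycle: its maximal independent sets {0, 2} and {1, 3} have equal size.
c4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
```

For the path, the pair {1}, {0, 2} is a valid certificate that the graph is not well-covered. For the 4-cycle, the pair {0, 2}, {1, 3} certifies nothing, and indeed no refuting pair exists there.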
Next let us try to formalize the idea of a “hardest” problem in a complexity class. We will be content to illustrate with class NP. We begin with the notion of a polynomial transformation. If π and π' are decision problems, a polynomial transformation from π to π' is a function f from inputs x of π to inputs f(x) of π' such that x yields the answer “yes” for problem π if and only if f(x) yields the answer “yes” for problem π'. Moreover, the (deterministically computable) function f(x) can be computed in polynomial time. The complexity theorist would hasten to add that the concept of a (polynomial) transformation should be contrasted with the more general notion of a polynomial (Turing) reduction which allows multiple calls of a subroutine. However, we shall not refer to the latter type of reduction here. A problem π is NP-hard if, given any problem π' in NP, there is a polynomial transformation of every instance of π' to an instance of problem π. That is, if we can solve π in polynomial time, then we can solve π' polynomially as well. If, in addition, problem π belongs to NP, we say that π is NP-complete. That is to say, it is a “hardest” problem in NP.
The first NP-complete problem was found by Cook [58] in 1971. It is the satisfiability problem of logic (or “SAT” for short). Suppose formula F is the conjunction of a set of disjunctions of propositional variables. Is there an assignment of trues and falses to the literals making F true? Levin [59] (see also [60] in which a corrected translation of Levin’s article appears as an appendix) independently and essentially simultaneously arrived at the notion of NP-completeness. Shortly after Cook’s ground-breaking result, Karp [61] published a list of 21 additional NP-complete problems. One of these was VERTEX COVER defined as follows. “Given a graph G and an integer k > 0, does G have a set of k or fewer vertices which collectively touch all the edges of G?” Since the complement of a vertex cover is an independent set, VERTEX COVER is clearly equivalent to the vertex packing problem VP defined earlier.
It makes sense to ask if there are “complete” problems for other complexity classes as well, for example, for class P. One must take care, though, in defining the allowable types of transformations. For example, with polynomial transformations allowed, every problem in P would be “complete”! This, then, hardly captures a useful meaning for a “hardest” problem in P. In the case of class P, a different kind of transformation is employed: so-called log-space transformations. In terms of a worktape of a Turing machine, a log-space transformation is one that is computable in logarithmic space on the worktape. Note that these transformations are also polynomial transformations since a logarithmic workspace bound means that only a polynomial number of distinct states on the worktape are possible. In this more restricted sense of log-space transformation, class P still has complete problems; probably the most famous of these is the linear programming problem [62]. There is, however, another P-complete problem closer in spirit to our paper.
This is LFMAXLVP, the lexicographically first maximal independent set problem [63] [64]. Label the vertices in the graph with the integers 1, 2, ..., n and begin to build a maximal independent set I by putting vertex 1 into I and deleting $\{1\} \cup N(1)$, where N(1) denotes the set of neighbors of vertex 1. Select the vertex with smallest label remaining and put it into set I. Delete it and all its neighbors and continue in this manner until all vertices have either been included in I or discarded. The “yes-no” question is then: “Is vertex n in set I?” (Throughout this paper, n will refer to the number of vertices in a graph and m to the number of edges, unless otherwise stated.) Another question close to home for us in this paper is: “Is the perfect matching problem P-complete?” The evidence so far leads most workers active in the area to conjecture that the answer is “no”, but the problem remains open (cf. [63] [64]).
With these basic locations on our lattice behind us, let us continue to discuss some additional classes. In fact, let us consider parallel computation classes next. A problem is said to be in parallel class $NC^i$ if it can be solved in $O((\log n)^i)$ time with a polynomial number of processors. Thus i is the degree of the time-bounding polynomial in log n. The absence of an index on the processor bound in naming this class would seem to support the contention that “time is money” and “processors are cheap”! To be fair, it should be emphasized that many complexity theorists concern themselves with the time versus hardware tradeoff by considering the product of the time and processor bounds as the “true” measure of efficiency.
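The sequential procedure for the lexicographically first maximal independent set described above is short enough to write out. The sketch below assumes vertices labeled 1, ..., n and an edge list; the function name `lex_first_mis` is a hypothetical one introduced only for this illustration.

```python
def lex_first_mis(n, edges):
    """Build the lexicographically first maximal independent set greedily."""
    adj = {v: set() for v in range(1, n + 1)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    chosen, discarded = [], set()
    for v in range(1, n + 1):      # always take the smallest remaining label
        if v not in discarded:
            chosen.append(v)
            discarded.add(v)       # delete v itself ...
            discarded |= adj[v]    # ... and all of its neighbors
    return chosen


# Decision question: is vertex n in the set I?
path4 = [(1, 2), (2, 3), (3, 4)]
result = lex_first_mis(4, path4)
print(result, 4 in result)
```

The loop is inherently sequential, since whether vertex v is taken depends on all earlier decisions; this is the intuition behind the problem being a natural candidate for P-completeness.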
The class $NC = \bigcup_{i=1}^{\infty} NC^i$ is called “Nick’s class” in honor of its first proponent, Nick Pippenger [65]. We shall consider our parallel processing to be done on a CREW PRAM, a Concurrent Read Exclusive Write Parallel Random Access Machine. In such a device, two processors may read the contents of the same memory cell at the same time, but only one at a time may write in such a cell. Some of the results mentioned later in the sections devoted to parallel problems may refer to the more restricted family of EREW PRAMs, where the acronym is self-explanatory. However, the differences which arise between these two models usually mean only a difference in the “i” part of $NC^i$ and hence no difference in whether or not a given problem is in class NC. For this reason the class NC is often referred to as being “robust.”
Figure 1: A complexity class framework.
Although we will have much less to say about it, we would be remiss if we did not at least mention the Boolean circuit model of parallel computing. In this model the class $NC^i$ is defined as a set of languages (i.e., subsets of {0,1}*) recognizable (i.e., “accepted”) by classes of Boolean circuits having a polynomial number of “and”, “or” and “not” gates and having depth $O((\log n)^i)$. (These circuits can be thought of as directed acyclic graphs with certain vertices designated as inputs (indegree 0) and a special output vertex (outdegree 0). The depth of the circuit is the length of any longest path from an input to the output and is the circuit analog of “parallel time.”) In addition, the class of Boolean circuits is usually assumed to be log-space uniform. (A class of Boolean circuits $\{B_n\}$ is said to be log-space uniform if there is a deterministic Turing machine which, for each n, can construct circuit $B_n$ in space $O(\log n)$; see [22] and [24].)
In summary, then, we have $NC^1 \subseteq NC^2 \subseteq \cdots \subseteq NC \subseteq P$. Whether or not equality holds at any stage of this chain is unknown. We have here at the outset still another unsettled question about matching. Does the perfect matching problem lie in NC?
Let us next give a brief introduction to randomized complexity classes. The main motivation here is the following. It may be the case that there is an algorithm for a certain problem which gives the correct answer most, but not all, of the time. However, its execution time is faster than some deterministic alternative. In other words, it may be possible (and often desirable) to trade some accuracy for speed. Let us begin with the random complexity class RP (for “random” polynomial time). These are the problems for which there exist algorithms which behave as follows. If the correct answer to an input is “yes,” the algorithm returns the answer “yes” with probability at least p, where p is some fixed probability bounded away from 0.
(Frequently this probability is given the value 1/2 in this definition, but any fixed positive value will do.) However, if the correct answer is “no,” the algorithm returns the incorrect answer “yes” with probability 0. Moreover, it does this in polynomial time. Such algorithms are said to have “one-sided error.” In other words, if the algorithm ever answers “yes,” one can be certain that “yes” is indeed the correct answer! The reader is warned that some authors call all randomized algorithms Monte Carlo, while others reserve this term for those algorithms with one-sided error, like those in class RP.
Let us also briefly mention a class containing RP, the class BPP. (The “B” stands for error bounded away from 1/2.) Members of this class are those problems with polynomial algorithms which return the correct answer (be it “yes” or “no”) with probability at least 1/2 + ε, for some ε > 0. For this reason, BPP is called a “two-sided error” class. For a BPP problem one can rerun a given input a number of times and take the majority answer. The more runs, the more probable it is that the majority answer is correct.
Finally, we mention the randomized complexity class ZPP. (The “Z” stands for “zero error.”) Here the algorithm either gives the correct answer (be it “yes” or “no”) in polynomial time or it refuses to answer at all! Moreover, the probability of no answer at all is strictly less than 1/2. Thus, while BPP and RP algorithms may “lie”, a ZPP algorithm never does! Randomized algorithms which do not lie are called Las Vegas algorithms, a term due to L. Babai.
There are, of course, parallel analogs to all of these randomized classes. Of these we shall be concerned in this paper only with RNC (and one lonely example in UVC!). Clearly, $NC \subseteq RNC$ by definition, but whether or not equality holds is another important open question.
Now let us turn to a problem of a different nature. To set the stage, let us backtrack a bit.
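Before moving on, the remark that rerunning a BPP algorithm and taking the majority answer improves reliability can be made quantitative. The sketch below computes the exact error probability of a majority vote over independent runs; it assumes each run is correct with the same probability, and the function name `majority_error` is hypothetical.

```python
from math import comb


def majority_error(p_correct, runs):
    """Exact probability that the majority answer of `runs` independent
    trials is wrong, when each trial is correct with probability p_correct.
    An odd number of runs is required so that there are no ties."""
    if runs % 2 == 0:
        raise ValueError("use an odd number of runs to avoid ties")
    needed = runs // 2 + 1  # number of correct runs that forms a majority
    p_majority_correct = sum(
        comb(runs, k) * p_correct**k * (1 - p_correct) ** (runs - k)
        for k in range(needed, runs + 1)
    )
    return 1 - p_majority_correct


for runs in (1, 3, 11, 101):
    print(runs, majority_error(0.6, runs))
```

With success probability 3/4 per run, a single run errs with probability 0.25, while a vote over three runs errs with probability 0.15625; the error keeps shrinking as the number of runs grows.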
In defining all the complexity classes up to this point, we have dealt exclusively with decision problems, i.e., problems which have answers which are either “yes” or “no”. Of course there are other types of problems which we would like to consider. Search problems, for example, are especially central to this paper. Examples central to our theme are: given a graph, “find the size of a maximum matching” or “find the size of a maximum independent set”. In more extensive treatments of complexity (cf. again [22]) such distinctions are discussed at length and even separate complexity classes are defined to reflect these differences. For example, class FP (“F” is for “function”) is defined as the search analog of decision class P. But even more often in the existing literature, such distinctions are glossed over. In the interests of efficiency in attaining the goals of this survey paper while the reader is still awake, we shall also ignore such distinctions for the most part.
Having said that, however, we next treat yet a third type of problem of considerable interest to graph theorists. These are the so-called counting problems. To be sure, such problems are “functional,” but the output in this case is a non-negative integer. A nondeterministic Turing machine which, given an input string x, outputs a value f(x) equal to the number of accepting computations for this input string is called a counting Turing machine. The class #P is defined to be the set of all functions that are computable by polynomial time counting Turing machines. It is perhaps most instructive to think of a #P function as one with the “magical” property that it instantly prints out the number of accepting computations of an associated polynomial time nondeterministic Turing machine. Some familiar examples of #P problems are:
(a) Given graph G, how many Hamilton cycles does it contain?
(b) Given graph G, how many 3-colorings of its vertices are there?
(c) Given graph G, how many perfect matchings does it contain?
A problem is #P-hard if there are polynomial time Turing reductions from all problems in #P to the given problem. If, in addition, the problem belongs to the class #P as well, then we say that the problem is #P-complete.
The main new idea is that for #P, the polynomial transformations from problem to problem are required to be parsimonious, that is, they must preserve the number of solutions. Valiant [66] [67] showed that the problem of counting the perfect matchings in a graph (we shall call it #PM) is #P-complete and therefore NP-hard, even when the graph is bipartite. This problem is the main reason for introducing the class #P in the present paper. Strictly speaking, it does not make sense to ask if NP is a subclass of #P, since the former is a set of languages (a language in turn is just a set of strings), but #P is a set of functions from strings to the non-negative integers. Nevertheless, in our Framework (see illustration) we have drawn a dashed arrow from $NP \cup \text{co-}NP$ to #P to indicate that problems in #P are harder by their very nature to compute than are those in $NP \cup \text{co-}NP$. For example, a #P-machine will compute the number of Hamilton cycles in a graph G. If that number equals 0, then G has no Hamilton cycle, but if that number is greater than 0, it does. Thus we have an instant answer to a proven NP-complete problem!
Here we bring a halt to our discussion of complexity classes. The reader likely will want to refer to our “Framework” illustration to see the containment relations known to exist (at this time anyway) among the various classes (see [22] for those classes we do not discuss here).
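To make the counting problem #PM concrete in the bipartite case, one can count perfect matchings directly from a 0/1 biadjacency matrix: each perfect matching corresponds to a permutation all of whose selected entries are 1, so the count equals the permanent of the matrix. The sketch below is a brute-force illustration only (it takes time proportional to n!); the function name and the matrix encoding are assumptions introduced here.

```python
from itertools import permutations


def count_perfect_matchings_bipartite(biadj):
    """Count perfect matchings of a bipartite graph with sides of size n.

    biadj[i][j] is 1 if left vertex i is adjacent to right vertex j.
    This evaluates the permanent by brute force over all n! permutations.
    """
    n = len(biadj)
    return sum(
        all(biadj[i][perm[i]] for i in range(n))
        for perm in permutations(range(n))
    )


k33 = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]          # complete bipartite graph
hexagon = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]      # a 6-cycle
print(count_perfect_matchings_bipartite(k33))
print(count_perfect_matchings_bipartite(hexagon))
```

A positive count answers the decision question “does G have a perfect matching?” for free, which mirrors the remark above that counting is at least as hard as deciding.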
Finally, we will defer definitions and discussion of the two classes @ and CC until Sections 4 and 5, respectively, where we encounter them for the first and only time in this paper. The reader will note that we have drawn a horizontal dashed line across the Framework diagram. Those classes below the line are those known to be polynomial time computable (where in some of these classes recall that randomization is allowed).
3. Matching Variations and Their Complexity
We now give a list of some variants on the standard matching problem and give their complexity, if known. Recall that the matching problems, both “perfect” and “maximum” varieties, are known to be in P. An even simpler problem (at least sequentially) is the problem of finding a maximal matching. The greedy algorithm will solve this nicely. Choose an edge and delete its endvertices. Choose an edge in the remaining graph and delete its endvertices. Continuing in this manner we have a trivially polynomial (sequential) algorithm for maximal matching. It is perhaps surprising then that the following problem (MINIMUM MAXIMAL MATCHING) is NP-complete.
1. Given graph G and integer k > 0, does G have a maximal matching of size at most k?
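The contrast between the easy greedy procedure and problem 1 shows up already on a path with three edges. The sketch below pairs the greedy algorithm with an exhaustive search for a smallest maximal matching; the exhaustive search is exponential and is meant only as an illustration on tiny graphs. All function names are hypothetical.

```python
from itertools import combinations


def greedy_maximal_matching(edges):
    """Scan the edges once, keeping any edge whose endvertices are both free."""
    matched, matching = set(), []
    for u, v in edges:
        if u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching


def is_maximal_matching(edges, matching):
    """A matching is maximal if every edge of G meets a matched vertex."""
    ends = [x for e in matching for x in e]
    if len(ends) != len(set(ends)):      # two chosen edges share a vertex
        return False
    covered = set(ends)
    return all(u in covered or v in covered for u, v in edges)


def min_maximal_matching_size(edges):
    """Brute force over edge subsets by increasing size (tiny graphs only)."""
    for k in range(len(edges) + 1):
        if any(is_maximal_matching(edges, m) for m in combinations(edges, k)):
            return k


path = [("a", "b"), ("b", "c"), ("c", "d")]
print(greedy_maximal_matching(path))     # greedy takes the two end edges
print(min_maximal_matching_size(path))   # the middle edge alone is maximal
```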
NP-completeness was first shown by Yannakakis and Gavril [68]. In fact, they proved NP-completeness even in the cases where G has maximum degree at most 3 and is either planar or bipartite. The same two authors found an $O(n)$ algorithm for the problem in the case when the graph G is a tree.
Chronologically, one of the first generalizations of matching to be shown to be NP-complete was the 3-dimensional matching problem (3DM). This is most easily described as follows. Let H be a 3-uniform hypergraph (i.e., each “edge” contains three vertices, not two).
2. Given H, is there a perfect matching of hyperedges, i.e., a set of hyperedges such that each vertex of H lies in exactly one hyperedge?
This problem was one of Karp’s original 21 [61] and the proof is by reduction from 3SAT. The NP-completeness of 3DM easily implies that the following two matching problems are NP-complete (see [69]).
3. Given a bipartite graph G and a partition $E(G) = E_1 \cup \cdots \cup E_k$ of its edges, does G contain a perfect matching F such that either $E_i \subseteq F$ or $E_i \cap F = \emptyset$, for all i = 1, ..., k?
4. Given a bipartite graph G and a coloring of its edge set E(G), does G contain a perfect matching with exactly one edge of each color?
Motivated by the fact that certain timetable and image analysis problems can be modeled as matching problems with certain additional restrictions, Itai, Rodeh and Tanimoto [70] introduced the RESTRICTED MATCHING problem.
5. Given a bipartite graph G, a collection $R_1, \ldots, R_k$ of subsets of E(G) and a collection of non-negative integers $r_1, \ldots, r_k$, does G have a perfect matching F such that $|F \cap R_i| \le r_i$ for i = 1, ..., k?
If k is set equal to 1 in problem 5, the resulting problem is equivalent to finding a perfect matching using as few edges of $R_1$ as possible and this is a very special case of the minimum weight perfect matching problem which was shown to be polynomially solvable by Edmonds in 1965 [71]. Note that since k is unrestricted in problem 5, problem 4 is a special case of problem 5 and hence the latter is NP-complete also. However, if k is restricted to a fixed value, we have yet another problem, this one introduced by Papadimitriou and Yannakakis [72] (see also [73]).
6. Let the positive integer k have a fixed value (say, 10, for example). Given a bipartite graph G with its edges colored in k colors and a set of non-negative integers $c_1, \ldots, c_k$, does G contain a perfect matching F such that F contains at most $c_i$ edges of color i, for all i?
The difference between problem 6 and problem 5 is important to see, although it is somewhat subtle. In problem 6, the number of edge classes (i.e., “colors”) is fixed or bounded; in problem 5, k is not bounded, but may assume values as large as one likes as part of the input string to the problem. Papadimitriou and Yannakakis [72] showed problem 6 to be polynomially equivalent to five other problems, including the next problem in our list, EXACT MATCHING.
7. Given a graph G and a set of distinguished edges $R \subseteq E(G)$ (call these edges “red”) and an integer k > 0, is there a perfect matching of G which contains exactly k red edges?
Note that EXACT MATCHING can also be thought of as the special case of the RESTRICTED MATCHING problem in which k = 2 and $R_1 \cap R_2 = \emptyset$. Although the complexity of problems 6 and 7 remains unknown, even for bipartite graphs, Barahona and Pulleyblank [74] have shown EXACT MATCHING to be polynomially solvable when G is Pfaffian. The class of Pfaffian graphs is discussed in [1] and we will meet them again in Section 7 of the present paper. Suffice it to say here that they include all $K_{3,3}$-free graphs and hence all planar graphs as well. More recently, V.V. Vazirani [75] [76] has shown EXACT MATCHING to be in parallel class NC for $K_{3,3}$-free graphs. We shall return to EXACT MATCHING in Section 5.
Given a matching M in a graph G, C is an alternating cycle with respect to M if C is a (necessarily even) cycle in which every second edge belongs to M. Matching M is alternating cycle free if no such C exists. Testing whether a graph G has an alternating cycle free perfect matching can be done in polynomial time, for clearly G has an alternating cycle free perfect matching if and only if G has a unique perfect matching. To test for the latter property, let $M = \{e_1, \ldots, e_{n/2}\}$ be a perfect matching for G and test $G - e_i$ for each i in turn to see if it contains a perfect matching, using the polynomial algorithm of Edmonds or any of its variants. On the other hand, consider the following two related problems.
8. Does a bipartite graph G have a perfect matching which has no alternating 4-cycle?
9. Given a bipartite graph G and an integer k > 0, does G have an alternating cycle free matching of size at least k?
Pulleyblank [77] has proved both these problems NP-complete by reducing 3SAT to each. This work was done in connection with minimizing setups in precedence constrained scheduling. Varying our demands somewhat yet again, let us call a matching M in graph G induced if no two edges of M are joined by an edge in E(G) - M. Consider now the induced matching problem.
10. Given graph G and an integer k > 0, is there an induced matching in G of size at least k?
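Membership of problem 10 in NP rests on the fact that a proposed induced matching is easy to verify. A minimal sketch of such a check follows, assuming a simple graph given as an edge list; the function name `is_induced_matching` is hypothetical.

```python
def is_induced_matching(edges, matching):
    """Verify that `matching` is a matching of G no two of whose edges
    are joined by an edge of E(G) - M (G is assumed to be simple)."""
    all_edges = {frozenset(e) for e in edges}
    chosen = {frozenset(e) for e in matching}
    if not chosen <= all_edges:          # every chosen edge must belong to G
        return False
    ends = [x for e in chosen for x in e]
    if len(ends) != len(set(ends)):      # chosen edges must be vertex-disjoint
        return False
    covered = set(ends)
    # An edge outside M with both ends covered would join two edges of M.
    return not any(e <= covered for e in all_edges - chosen)


path6 = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "f")]
print(is_induced_matching(path6, [("a", "b"), ("d", "e")]))
print(is_induced_matching(path6, [("a", "b"), ("c", "d")]))
```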
This problem has been proved NP-complete even for bipartite graphs by Stockmeyer and V.V. Vazirani [78] and later, independently, by Cameron [79]. (The first two authors actually showed more. Let the distance between two edges e and e' be the length of a shortest path from an endvertex of e to an endvertex of e'. A δ-separated matching M is a matching in which the distance between any two edges is at least δ. Stockmeyer and Vazirani showed that for each δ ≥ 2 the problem “Given a graph G and an integer k > 0, does G have a δ-separated matching of size at least k?” is NP-complete, even for bipartite graphs regular of degree 4.) On the other hand, Cameron showed that maximum induced matching (that is, maximum 2-separated matching) is polynomial, in fact in NC, for all chordal graphs. (A graph is chordal if every cycle of length at least 4 has a chord, that is, an edge not in the cycle, but which joins two vertices of the cycle.)
In the Stockmeyer and Vazirani paper, two other interesting variations are shown to be NP-complete. These are maximum TR-matching and maximum star matching. Both are relevant to network testing of various sorts. A TR-matching in graph G is a pair (M, h) where M is a matching and h is a labeling function which assigns to each vertex of G one of the three labels from {T, R, A}. Here “T” stands for “transmitter,” “R” for “receiver” and “A” for “neither.” In addition, the labeling h is subject to the following conditions:
(a) h(v) = A whenever M does not cover vertex v,
(b) if edge $uv \in M$, then precisely one of u and v has label T, the other R, and
(c) if $uv \in E(G) - M$, then $\{h(u), h(v)\} \ne \{R, T\}$.
Condition (c) says that no transmitter is connected to a receiver other than the one to which it is matched by M. The size of a TR-matching is the cardinality of M.
11. Given graph G and integer k > 0, is there a TR-matching in G of size at least k?
The motivation for this problem is reasonably clear. We want to test the network by sending a signal simultaneously from all transmitters (T’s) to their receivers (R’s). Condition (c) precludes the possibility of “jamming”; that is, receivers receiving test signals from two different transmitters. Star-matching arises from a different testing procedure. A star-matching is simply a labeling function h from the set of vertices V(G) into the set {T, R} such that for each vertex u with h(u) = R there exists another vertex v adjacent to u such that h(v) = T. The size of a star-matching is the size of the set of vertices having R labels.
12. Given graph G and integer k > 0, does G have a star-matching of size at least k?
Problems 11 and 12 were shown to be NP-complete for bipartite graphs by Even, Goldreich and Tong [80]. Stockmeyer and Vazirani showed them to be NP-complete for all cubic graphs.
Returning to induced matching for a moment, we point out that Stockmeyer and Vazirani refer to it as “risk-free marriage”! (A moment’s reflection on the part of the reader will undoubtedly reveal why.) This leads us to another matching variation which has been widely studied over the past twenty years or so: the Stable Marriage Problem. Fortunately, for this problem we have a quite recent and comprehensive survey in the form of the book of Gusfield and Irving [81].
As in the case with matching in general, studies here naturally divide into the treatment of the bipartite case (the Stable Marriage Problem) and the general, i.e., not necessarily bipartite, case (the Stable Roommates Problem). We treat the bipartite version first. Suppose we have a bipartite graph G with vertex bipartition $V(G) = A \cup B$ and suppose that |A| = |B|. Each man (member of A) and each woman (member of B) has a complete list of preferences for the members of the opposite sex which is a strict order relation. A matching M is said to be unstable if there exist a man and a woman who are not matched by M, but each prefers the other over his/her partner in the matching M. A matching is said to be stable if it is not unstable. The problem then is:
13. Given a bipartite graph as above with ranked preferences, does G have a perfect matching which is stable?
Of course as usual, there are really two fundamental related problems here. First, does a stable matching exist and second, if so, can it be found efficiently? The answer, fortunately, is “yes” to both questions. In 1962, Gale and Shapley [82] showed that a stable matching always exists and provided an $O(n^2)$ algorithm to find one.
The more general problem of Stable Roommates is defined in a manner similar to the Stable Marriage Problem, except that the underlying graph G need not be bipartite (although it should have an even number of vertices) and each vertex (student) has a strict ordering of all other vertices (possible roommates).
14. Given a graph G on an even number of vertices with preference lists for each vertex, does G have a stable matching?
The fundamental difference between the Marriage and Roommates problems turns out to be that the latter need not have a stable matching. The complexity of the Stable Roommates Problem was settled in 1985 by Irving [83] who found an $O(n^2)$ algorithm which either outputs a stable matching or indicates that none exists. These two problems are unique in this paper in the sense that it is known that the algorithms are asymptotically optimal. That is, it has been proved that an algorithm for finding a stable matching (even in the bipartite case) must require at least $cn^2$ time, for some constant c > 0. This was proved by Ng and Hirschberg in 1988 [84].
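For the bipartite case, the proposal procedure usually attributed to Gale and Shapley can be sketched as follows. This is a textbook-style sketch rather than a transcription of [82]; the dictionary encoding of the preference lists and the function names `gale_shapley` and `is_stable` are assumptions made for this illustration.

```python
def gale_shapley(men_prefs, women_prefs):
    """Men propose in order of preference; each woman keeps her best offer.

    Both arguments map a person to a complete, strictly ordered list of
    the members of the other side. Returns a dict man -> woman.
    """
    rank = {w: {m: i for i, m in enumerate(prefs)}
            for w, prefs in women_prefs.items()}
    next_choice = {m: 0 for m in men_prefs}   # index of next woman to ask
    fiance = {}                               # woman -> current partner
    free = list(men_prefs)
    while free:
        m = free.pop()
        w = men_prefs[m][next_choice[m]]
        next_choice[m] += 1
        if w not in fiance:
            fiance[w] = m
        elif rank[w][m] < rank[w][fiance[w]]:  # w prefers the new proposer
            free.append(fiance[w])
            fiance[w] = m
        else:
            free.append(m)                     # rejected, m stays free
    return {m: w for w, m in fiance.items()}


def is_stable(matching, men_prefs, women_prefs):
    """Check that no man and woman prefer each other to their partners."""
    husband = {w: m for m, w in matching.items()}
    for m, prefs in men_prefs.items():
        for w in prefs:
            if w == matching[m]:
                break                          # remaining women rank lower
            wp = women_prefs[w]
            if wp.index(m) < wp.index(husband[w]):
                return False                   # (m, w) is a blocking pair
    return True


men = {"a": ["x", "y"], "b": ["x", "y"]}
women = {"x": ["b", "a"], "y": ["a", "b"]}
match = gale_shapley(men, women)
print(match, is_stable(match, men, women))
```

Each man proposes to each woman at most once, so the loop runs at most n² times, which matches the quadratic bound quoted above.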
If one relaxes the demand on strict preference lists to allow the possibility of ties, one must agree on a suitable redefining of stability. However, for two of the “most natural” such definitions (called super-stable and strongly stable by Gusfield and Irving), the Gale-Shapley algorithm can be extended to produce a stable matching (or report the existence of none) in polynomial time in the bipartite case. However, for the non-bipartite Roommates Problem, if ties are allowed, the problem becomes NP-complete. This was proved by Ronn [85] [86].
There are a host of variations on Stable Marriage and Stable Roommates, some in P, some NP-complete and some the complexity of which is unknown. (See the Gusfield and Irving book [81].) We will finish our treatment with one more such variation. Note that we now return to the original requirement that the preferences be strict. For a roommates matching M, let the value of M be
$$v(M) = \sum_{uv \in M} \bigl( r(u,v) + r(v,u) \bigr)$$
where r(u,v) denotes the ranking of v by u. The Optimal Roommates Problem is:
15. Given a graph G, compute the minimum value of v(M) taken over all stable matchings M.
Feder [87] has recently proved that this problem is NP-complete, but the bipartite (or Marriage) version is, in fact, polynomial. We conclude by reporting that the problem of counting stable matchings, even in the bipartite case, is #P-complete [%I.
Now we veer in a different direction. Maximum matching can be thought of as a “packing problem” in which one wants to find the largest number of edges which are mutually vertex-disjoint. Suppose now we attempt to generalize the notion of “edge” in the packing to another type of subgraph. Let us call the following the H-matching problem and denote it by HMATCH. Given graphs G and H, is there a spanning subgraph of G consisting of vertex-disjoint copies of H? Of course if $H = K_2$, we just have the perfect matching problem and the solution is polynomial. But if H has any component having 3 or more vertices, Kirkpatrick and Hell [88] [89] proved that the problem is NP-complete (see also [90] [91]). In all other cases, the problem is polynomial.
We can modify this problem in yet another way. Instead of one fixed graph H as our subgraph to be packed, let us allow a choice of graphs from a certain specified family. Let $\mathcal{H} = \{H_1, H_2, \ldots\}$ be a finite or infinite family of graphs. An $\mathcal{H}$-factor for G is then a vertex-disjoint collection of subgraphs of G which together cover V(G) and each member of the collection comes from $\mathcal{H}$. For example, if $\mathcal{H} = \{C_3, C_4, \ldots\}$, the family of all cycles, then an $\mathcal{H}$-factor for G just becomes a 2-factor. Hell and Kirkpatrick study a variety of possibilities for $\mathcal{H}$. For example, if $\mathcal{H}_1$ is the set of all cycles just mentioned, the problem is polynomial. If $\mathcal{H}_2 = \{K_2\} \cup \mathcal{H}_1$, the packing sought is called a perfect 2-matching and again the problem is polynomial. If $\mathcal{H}_3 = \{K_{1,n} \mid n \ge 3\}$ (i.e., the family of “stars”), the problem is NP-complete. On the other hand, if we add to $\mathcal{H}_3$ either $K_1$ or $K_2$ (or both), and call the resulting class $\mathcal{H}_4$, the problem clearly becomes polynomial.
A close relative to this problem is obtained as follows. Let $\mathcal{H}_5$ be any subset of $\{K_{1,n} \mid n \ge 1\}$ with the property that for some t, $K_{1,t} \notin \mathcal{H}_5$, but $K_{1,t+1} \in \mathcal{H}_5$. Then the $\mathcal{H}_5$ problem is NP-complete. One final example is obtained by letting $\mathcal{H}_6$ be any family of complete graphs. Then the $\mathcal{H}_6$ packing problem is polynomial if $K_1$ or $K_2$ is in $\mathcal{H}_6$ and NP-complete in all other cases. Proofs of all these results, as well as other related results, may be found in [90]-[9;1.
More recently, the H-matching problem has been investigated for planar graphs [%I. (Here again, let us note, H is a single fixed graph.) If $H = K_3$ or $K_{1,3}$ (the “claw”), Dyer and Frieze [9!4 showed HMATCH to be NP-complete in the plane. Even more recently, Berman et al. [%I have shown that if H has at least 3 vertices, then the maximum planar matching version of HMATCH is NP-complete. Surprisingly, the perfect planar matching version of HMATCH is another story! If H has at least 3 vertices and is connected and outerplanar, they show that the problem is NP-complete. On the other hand, if H is a triangulated graph with at least 4 vertices, there is an $O(n)$ algorithm for the problem. A characterization of those H's for which perfect planar HMATCH is polynomial remains unknown. An approximation algorithm for planar H-matching is obtained in [100].
Now let us look at a quite different problem associated with matching. Edmonds showed that matching can be formulated as a linear program. There now exist several polynomial LP algorithms such as the ellipsoid method and Karmarkar's method. The problem with using these directly is that there exist an exponentially large number of inequalities which must be furnished in order to formulate the matching problem as an LP. However, the ellipsoid method has the property that if the inequalities can be fed to it as needed, it can solve the LP in polynomial time. Padberg and Rao [101] described an algorithm to do this for the matching polyhedron. Thus their algorithm, when used in conjunction with the ellipsoid method, provides another method for solving matching problems in polynomial time. But can one somehow solve the matching problem in polynomial time using a more standard LP algorithm like the aforementioned simplex method or Karmarkar's algorithm? This remains an open question. Barahona [102] has made some progress on this question by showing that one can solve matching via a polynomial number of LP's, each polynomial in size. He has also shown that in the case when G is planar, this polynomial number of polynomial size LP's can indeed be reduced to only one [103]. It should be mentioned in this connection that a planar graph can indeed have an exponential number of facets in its matching polytope (cf. Gamble [104]). On the negative side, however, Yannakakis [105] has shown that no symmetric LP formulation of polynomial size for matching on the complete graph $K_{2n}$ is possible. (A formulation of the matching problem is symmetric if any extra variables and the roles they play are independent (up to permutation) of the order in which the graph is examined.)
To close this section of our paper, we make a fleeting visit to the land of matroids. For details the reader may refer to [1] or [106], or to many other reference sources for matroid theory. Suppose $M = (E, \mathcal{I})$ is a matroid where E is the ground set and $\mathcal{I}$ is the family of independent subsets of E. Suppose F is a pairing of all the elements of the ground set E.
A set A contained in E is a parity set if for every element $e \in A$, its mate under F is also in A. Then the MATROID PARITY problem is: Given $M = (E, \mathcal{I})$, a pairing F and an integer k > 0, is there a parity set A in M of size at least k? MATROID PARITY is provably exponential and this result does not depend upon the assumption that P ≠ NP! This was shown by Lovász in 1978 [107]. If the matroid is linear, however, he gave a polynomial algorithm, albeit relatively slow. More recently, faster algorithms have been found [108].
4. Vertex Packing Variations and Their Complexity
As mentioned earlier in this paper, the vertex cover problem is one of the original NP-complete problems in the list of Karp [61]. Later it was proved NP-complete even when restricted to cubic planar graphs [21], to triangle-free graphs [109] and several other families listed in [21] [110]. Although vertex packing is polynomial for $K_{1,3}$-free graphs (see Minty [19] and Sbihi [20] mentioned earlier), it is still NP-complete for $K_{1,4}$-free graphs [19] (see also [111]). For even more classes for which vertex cover is NP-complete, see also Mahadev [112].
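The equivalence between vertex cover and vertex packing used throughout this section rests on a simple complement rule: a vertex set C is a cover exactly when V(G) - C is independent. The sketch below checks this rule exhaustively on a small graph; the function names are hypothetical and the exhaustive loop is meant only for tiny examples.

```python
from itertools import combinations


def is_vertex_cover(edges, cover):
    """Every edge must have at least one endvertex in `cover`."""
    return all(u in cover or v in cover for u, v in edges)


def is_independent_set(edges, chosen):
    """No edge may have both endvertices in `chosen`."""
    return not any(u in chosen and v in chosen for u, v in edges)


def complement_rule_holds(vertices, edges):
    """Check: C is a vertex cover if and only if V - C is independent."""
    vertex_list = list(vertices)
    everything = set(vertex_list)
    for k in range(len(vertex_list) + 1):
        for subset in combinations(vertex_list, k):
            cover = set(subset)
            if is_vertex_cover(edges, cover) != is_independent_set(
                    edges, everything - cover):
                return False
    return True


five_cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(complement_rule_holds(range(5), five_cycle))
```

Consequently a minimum vertex cover and a maximum independent set have sizes summing to the number of vertices, so an algorithm for either problem answers the other.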
Recall that earlier, along with claw-free graphs, we also mentioned line graphs and bipartite graphs as classes for which VP is polynomial. There are several other classes for which polynomiality for vertex packing has been shown, but too recently to appear in the Garey and Johnson book.
The most famous of these is surely the family of perfect graphs. Let ω(G) denote the size of any largest complete subgraph in G. This is called the clique number of G. Clearly, ω(G) ≤ χ(G) for any graph G, where χ(G) denotes the chromatic number of G. Graph G is said to be perfect if ω(G′) = χ(G′) for every induced subgraph G′ of G. The concept of perfection is due to Berge [113] in the early 1960's and the class of perfect graphs is now known to include many other well-known families such as bipartite graphs, line graphs of bipartite graphs, interval graphs, comparability graphs and triangulated graphs - as well as the complements of all such graphs. In fact, Lovász, in a celebrated 1972 result [114], proved that for any perfect graph the complement must also be perfect. (For general references on perfect graphs, we recommend [115] [116] as well as [117]-[119].)
It is a highly non-trivial fact, proved by Grötschel, Lovász and Schrijver, that vertex packing is polynomial for the class of all perfect graphs. The proof uses the ellipsoid method, the first polynomial LP algorithm to be discovered. For a thorough account of this see [117] [118] [120]-[122]. In fact, they prove polynomiality for a larger class, namely, the so-called h-perfect graphs. The h-perfect graphs are defined as those graphs the stable set polytopes of which are defined by certain families of linear inequalities - in this case the so-called non-negativity constraints, clique constraints and odd cycle constraints. Unfortunately, no purely graph-theoretical characterization of h-perfect graphs is yet known.
One last remark about perfect graphs is in order.
Although the complexity of showing a graph to be perfect is not known, showing imperfection is in class NP. (According to Berge and Chvátal [116], this result is attributable to Edmonds and Cameron.) For additional work along these lines, see references [123] [124].
Now let us return to bipartite graphs and recall the fundamental result of König which says that ν(G) = τ(G), where ν(G) is the size of any maximum matching and τ(G) is the size of any minimum vertex cover [3] [4]. This is an archetypal example of a “minimax” theorem in graph theory. Such theorems have grown in importance since the discoveries of the last twenty years or so indicating that graph theory and linear programming can profitably be brought together (see [1] for just one of many references on this subject, also [2]). It is important to realize that the above minimax equation holds for some non-bipartite graphs as well; for example, consider the 4-vertex graph obtained by attaching a pendant edge to a triangle. Graphs satisfying the minimax equation are said to have the König Property.
First of all, there are several characterizations of these graphs which lead to polynomial recognition algorithms for the graphs in the class (see [1] [123]-[127]). Trivially, these graphs have polynomial algorithms to find the size of a maximum independent set via the matching algorithm. However, none of the above references explicitly gives a polynomial algorithm for finding a maximum independent set - i.e., the search problem. In [127], though, there is a polynomial algorithm for the search problem implicit in Lemma 3.3. This result depends upon the so-called Gallai-Edmonds decomposition of a graph, a canonical decomposition of graphs in terms of their maximum matchings. This decomposition can be found in polynomial time via Edmonds' algorithm. The details may be found in [1]. Also see the thesis of Korach [128].
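The pendant-edge example above can be checked by brute force. The following is only a minimal sketch for very small graphs: the function names are illustrative, and the exhaustive search is an assumption of convenience rather than any of the polynomial methods cited in the text.

```python
from itertools import combinations

def matching_number(edges):
    """nu(G): size of a maximum matching, by exhaustive search (tiny graphs only)."""
    for k in range(len(edges), 0, -1):
        for subset in combinations(edges, k):
            endpoints = [v for edge in subset for v in edge]
            if len(endpoints) == len(set(endpoints)):  # no two edges share a vertex
                return k
    return 0

def cover_number(vertices, edges):
    """tau(G): size of a minimum vertex cover, by exhaustive search."""
    for k in range(len(vertices) + 1):
        for subset in combinations(vertices, k):
            chosen = set(subset)
            if all(u in chosen or v in chosen for u, v in edges):
                return k

def has_konig_property(vertices, edges):
    """True when the minimax equation nu(G) = tau(G) holds."""
    return matching_number(edges) == cover_number(vertices, edges)

triangle = [(0, 1), (1, 2), (0, 2)]
paw = triangle + [(2, 3)]  # triangle with a pendant edge attached at vertex 2

print(has_konig_property(range(4), paw))       # True:  nu = tau = 2
print(has_konig_property(range(3), triangle))  # False: nu = 1, tau = 2
```

The triangle is included as a contrast: it is the smallest graph for which the equation fails.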
It turns out, however, that a polynomial algorithm for finding a maximum independent set in a graph with the König Property has been around quite a bit longer than the method referred
to in [1]. Define a 2-cover of a graph G to be an assignment of weights 0, 1 and 2 to the vertices of G such that the sum of weights of the two endvertices of any edge is at least 2. The sum of all weights is called the size of the 2-cover. The minimum size of any 2-cover of G is denoted by τ₂(G). It can be shown (see Corollary 6.3.4 of [1]) that a graph G has the König Property if and only if it satisfies τ₂(G) = 2τ(G). In a 1975 paper, Nemhauser and Trotter [129] gave a polynomial algorithm which, when applied to an arbitrary graph, either produces a maximum independent set or shows that τ₂(G) < 2τ(G) and thus that the graph does not have the König Property. (This was before the name “König Property” had yet been coined, however.) For further discussion on this approach, the reader is referred to [130] and [131].
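The 2-cover criterion can be illustrated on small graphs. The sketch below is an exhaustive search, not the Nemhauser-Trotter algorithm; it relies on the observation that a vertex cover is the same thing as a 2-cover using only the weights 0 and 2, so 2τ(G) is obtained by restricting the allowed weights. All names are illustrative.

```python
from itertools import product

def min_cover_weight(vertices, edges, allowed):
    """Minimum total weight of a vertex weighting drawn from `allowed`
    in which the two endvertices of every edge have weights summing to >= 2."""
    vertices = list(vertices)
    best = None
    for weights in product(allowed, repeat=len(vertices)):
        w = dict(zip(vertices, weights))
        if all(w[u] + w[v] >= 2 for u, v in edges):
            total = sum(weights)
            if best is None or total < best:
                best = total
    return best

def konig_by_2_cover(vertices, edges):
    """Test the criterion tau_2(G) = 2 * tau(G) by brute force."""
    tau_2 = min_cover_weight(vertices, edges, (0, 1, 2))
    twice_tau = min_cover_weight(vertices, edges, (0, 2))  # weights 0/2 = vertex covers
    return tau_2 == twice_tau

triangle = [(0, 1), (1, 2), (0, 2)]
paw = triangle + [(2, 3)]

print(konig_by_2_cover(range(3), triangle))  # False: tau_2 = 3 < 4 = 2*tau
print(konig_by_2_cover(range(4), paw))       # True:  tau_2 = 4 = 2*tau
```

For the triangle the all-ones weighting is a 2-cover of size 3, which is what separates τ₂ from 2τ.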
Another class of graphs with polynomial algorithms for VP are the well-covered graphs mentioned earlier. Recall that a graph is well-covered if every maximal independent set is in fact maximum. In other words, these are the graphs for which the greedy algorithm for a maximal independent set always results in a maximum independent set. The polynomiality for VP for these graphs is thus trivial. What is not trivial about well-covered graphs is recognizing them! A moment's reflection will tell the reader that well-covered graphs are in co-NP. In fact, very recently it has been shown that well-covered graph recognition is co-NP-complete (see [132] [133]). But membership in NP is unsettled. In fact, it is thought by most that membership in NP is highly unlikely, for such a result would imply that NP = co-NP, a situation considered almost as unlikely as P = NP! The concept of a well-covered graph was introduced in [57]. In the past five years or so, there has been a surge of interest in this graph class and the interested reader may want to consult the forthcoming survey [134]. See also [132] [133] [135].
Next we would like to briefly discuss the so-called α-critical graphs. Let the independence number of graph G (i.e., the size of any largest independent set in G) be denoted by α(G). A graph G is said to be α-critical if α(G - e) > α(G), for all edges e ∈ E(G). It may well be that the recognition problem for these graphs belongs to neither NP nor to co-NP! At this point, no one knows. If these graphs were in NP, then one could give a “good characterization” of α(G) for any graph. (In other words, one could show that determining α(G) belongs to NP ∩ co-NP [123].) There has been quite a lot of interest in obtaining structural properties of this family of graphs. For an overview of the structural results presently known for these graphs, see [1]. In 1965, Hajnal [136] proved that in any α-critical graph G, max deg v ≤ |V(G)| - 2α(G) + 1.
Denote the quantity |V(G)| - 2α(G) by δ(G). Gallai suggested that studying connected α-critical graphs G by means of the value of δ(G) might prove profitable. The parameter δ(G) has come to be called the Gallai class number of G for this reason. Clearly the only connected α-critical graph with δ = 0 is K₂. It is also easy to see that the only connected α-critical graphs having δ(G) = 1 are the odd cycles. From this point on, the situation rapidly becomes more complex. Andrásfai [137] showed that the only connected α-critical graphs with δ(G) = 2 are the even subdivisions of K₄. A deep result due to Lovász some eleven years later [121] showed that for any value of δ, the class of α-critical graphs G having δ = δ(G) must arise from a finite class of “basis” graphs via even subdivisions. For δ = 3, it has been shown that there are precisely four basis graphs, while for δ ≥ 4, it is known only that the number of basis graphs is bounded above by a rather complicated function exponential in δ.
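The definitions of α-criticality and of the Gallai class number can be tried out on the smallest cases mentioned above. This is a minimal brute-force sketch with illustrative function names; it is exponential in the number of vertices and says nothing about the complexity questions discussed here.

```python
from itertools import combinations

def independence_number(vertices, edges):
    """alpha(G): size of a largest independent set, by exhaustive search."""
    vertices = list(vertices)
    edge_set = {frozenset(e) for e in edges}
    for k in range(len(vertices), 0, -1):
        for subset in combinations(vertices, k):
            if not any(frozenset(p) in edge_set for p in combinations(subset, 2)):
                return k
    return 0

def is_alpha_critical(vertices, edges):
    """True if deleting any single edge increases alpha(G)."""
    alpha = independence_number(vertices, edges)
    return all(
        independence_number(vertices, [f for f in edges if f != e]) > alpha
        for e in edges
    )

def gallai_class_number(vertices, edges):
    """delta(G) = |V(G)| - 2 * alpha(G)."""
    return len(list(vertices)) - 2 * independence_number(vertices, edges)

k2 = [(0, 1)]
c5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
k4 = list(combinations(range(4), 2))

for name, n, edges in [("K2", 2, k2), ("C5", 5, c5), ("K4", 4, k4)]:
    print(name, is_alpha_critical(range(n), edges), gallai_class_number(range(n), edges))
# K2 True 0
# C5 True 1
# K4 True 2
```

In each of the three examples the maximum degree equals δ(G) + 1, so Hajnal's inequality holds with equality.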
Returning to Andrásfai's result for a moment, let us compare it with a result of Chvátal [138] which says that every α-critical graph G with δ(G) ≥ 2 must contain a subdivision of K₄. Moreover, Chvátal also showed that any graph which does not contain a subdivision of K₄ (i.e., a so-called series-parallel graph), has a polynomial algorithm for VP. Chvátal also conjectured that all α-critical graphs with δ ≥ 2 must in fact contain an even subdivision of K₄. This conjecture was settled in the affirmative by Sewell [139] [140] and Sewell and Trotter [141] in 1990. Moreover, in [139] [140], Sewell also showed that if graph G does not contain an even subdivision of K₄, then the VP problem for G is polynomial. Thus we have a kind of “separation” result in that we have a class of graphs which are not α-critical, but which have a polynomial VP algorithm. It is interesting that the polynomial VP algorithm of Sewell uses the ellipsoid method also. It is an open question as to whether use of the ellipsoid method can be avoided in this case.
One final remark is in order regarding α-critical graphs. We stated earlier that it may well be that recognizing these graphs is neither in NP nor in co-NP. What then, if anything, can be said about the complexity of this problem? In 1982, Papadimitriou and Yannakakis [142], in an attempt to classify the complexity of facets of the polytope which arises in the LP formulation of the traveling salesman problem, defined a new complexity class, DP (see also [50] [73] [143]). In terms of languages (i.e., sets of binary strings), DP is defined as the set of all languages L₁ ∩ L₂ where L₁ is in NP and L₂ is in co-NP. In terms of problems, we perhaps can best illustrate this class with the example of EXACT VERTEX PACKING: Given a graph G and an integer k > 0, is α(G) = k? This can be thought of as the intersection of two problems.
The first is our old friend VP which says “Given G and k > 0, does G have an independent set of size at least k?” This is in NP as we have seen earlier. As the second problem, consider: “Given G and k > 0, does G not have an independent set of size at least k + 1?” In the second problem, if the answer is “no”, it is easily certifiable by giving an independent set which has size at least k + 1. It follows by definition that NP ∪ co-NP ⊆ DP. It should be emphasized that there is an important distinction between a class defined in terms of two problems, one in NP, the other in co-NP, and a class defined in terms of a single problem simultaneously belonging to NP and to co-NP. DP is an example of the former; NP ∩ co-NP is an example of the latter. It turns out that not only does the problem of determining the facets of the VP polytope lie in this new class, but so does EXACT VERTEX PACKING. In fact, both are DP-complete (see [73] [142]). Somewhat later in 1985, Papadimitriou and Wolfe [143] announced that V.V. Vazirani has shown that the recognition problem for α-critical graphs is also DP-complete, although we are unaware of a published proof to date.
We conclude this section with one brief remark on approximating α(G). Some time ago, Garey and Johnson [21] proved that in a sense this is a hopeless task. More specifically, they showed that if one could find a polynomial time algorithm which, given an ε > 0, outputs an approximate value α*(G) in the sense that |α(G) - α*(G)| < ε, then in fact there must be a polynomial algorithm for MAXMVP and hence P = NP! There are, of course, other measures of “approximation”. For example, suppose α*(G) is an estimate for α(G). One could consider the ratio α(G)/α*(G) as a measure of “goodness” of approximation. But for this case as well, the news is not good. There is no known algorithm
for FINDVP which guarantees a ratio any better than O(n^ε), where as usual n is the input size and ε > 0. Some recent results by Feige, Lovász, Goldwasser, Safra and Szegedy [144]-[146] are of special interest here. These authors show that if one could approximate α(G) in polynomial time to within a factor of exp₂((log n)^(1-ε)), then NP ⊆ QP, where QP denotes quasi-polynomial time; that is, O(exp₂(log^c n)). This set inclusion would then imply NEXPTIME = EXPTIME. (This equality would also be implied by P = NP.) It is also shown that if one can approximate the independence number within a constant factor in polynomial time, then in fact every NP problem can be solved in n^O(log log n) time. The proofs employ techniques of interactive proof systems (see the “IP” at the top of our complexity class diagram).
But lest we leave the reader on too negative a note, it has been shown by Boppana and Halldórsson [147] that there is a polynomial time algorithm which will approximate α(G) within a factor of n/log² n.
In view of the difficulties encountered in approximating the size of a maximum independent set, one may find the following surprising, due to the complementary connection between independent sets and vertex covers. Gavril [21] first observed in 1974 that there is a straightforward polynomial approximation algorithm which for any graph G supplies a number τ*(G) such that τ*(G)/τ(G) ≤ 2! Remember that by Gallai's result [q], τ(G) + α(G) = |V(G)|. Now simply construct any maximal matching M, say of size k. Then the set of endvertices C of M is a set of 2k vertices and by the maximality of M, C is a vertex cover. Now every cover of G must in particular cover M, so τ(G) ≥ k. Thus we have 2τ(G) ≥ 2k = |C| and hence |C|/τ(G) ≤ 2 as claimed. This ratio of 2 was improved to a factor of 2 - Ω((log log n)/log n) by Bar-Yehuda and Even in 1983 [148] and independently by Monien and Speckenmeyer in 1985 [149].
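The matching-based argument for the factor-2 vertex cover bound translates almost directly into code. The sketch below assumes the graph is given as a list of vertex pairs and scans the edges in the order supplied; the function names are illustrative, and the brute-force `cover_number` is included only to check the ratio on small examples.

```python
from itertools import combinations

def maximal_matching(edges):
    """Greedy maximal matching: keep an edge whenever both endvertices are free."""
    matched, matching = set(), []
    for u, v in edges:
        if u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching

def matching_vertex_cover(edges):
    """The set C of endvertices of a maximal matching M, so |C| = 2|M|."""
    return {v for edge in maximal_matching(edges) for v in edge}

def is_vertex_cover(cover, edges):
    return all(u in cover or v in cover for u, v in edges)

def cover_number(vertices, edges):
    """tau(G) by exhaustive search, used here only for verification."""
    for k in range(len(vertices) + 1):
        for subset in combinations(vertices, k):
            if is_vertex_cover(set(subset), edges):
                return k

path = [(0, 1), (1, 2), (2, 3)]
cover = matching_vertex_cover(path)
print(sorted(cover), cover_number(range(4), path))  # [0, 1, 2, 3] 2
```

On this path the greedy scan picks the two outer edges, so the ratio |C|/τ(G) = 4/2 meets the bound of 2 exactly.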
5. Matching in Parallel
It is a good starting point for this section to recall that without any of the various additional “bell and whistle” conditions discussed in Section 3, the problem of finding a maximum matching for general graphs can be done in polynomial time via Edmonds' algorithm or any of its descendants. Of course these algorithms are all sequential. Note also that (a) finding a perfect matching (or showing none exists) and (b) finding a maximal matching are both sequentially polynomial as well. Polynomiality of (a) follows from Edmonds' algorithm and polynomiality of (b) is essentially trivial by the greedy algorithm.
The situation is quite different when one moves to a parallel setting. It is not at all clear how to efficiently parallelize even greedy matching, let alone Edmonds' algorithm. In fact, perhaps the outstanding open question regarding matching in parallel is whether or not maximum matching (or perfect matching, for that matter) belongs to NC. (There are certainly parallel algorithms for maximum matching, however; see [150]-[154]. They simply are not polylog time algorithms.)
But let us begin with the (apparently) even simpler problem of finding a maximal matching. Let us denote this problem by MAXLMATCH. So that the reader won't be kept in suspense, let us announce that MAXLMATCH is now known to be in NC (in fact, in NC²). But just how it got there is an interesting story.
In 1980, Lev [155] showed that for bipartite graphs, MAXLMATCH was in NC?. Actually, she gave three different algorithms for this problem. In addition - and this seems not to be widely realized - she also gave an NC? algorithm for finding a maximum matching (henceforth MAXMMATCH) in any regular bipartite graph by means of an NC? edge coloring algorithm for this family. The bound was improved to NC² in [156]. The complexity for general regular graphs remains open, although see [157]. No processor count was mentioned by Lev, although Israeli and Shiloach [158] assess her processor bound as 0 ( d l l o g m).
Four years later, Karp and Wigderson [159] [160] extended the result of Lev by showing MAXLMATCH to be in NC⁴ for all graphs. Their algorithm employed m + n processors. The Karp-Wigderson algorithm was actually a special case of the first NC maximal independent set (hereafter MAXLVP) algorithm. We shall return to their algorithm in the next section. In [158], Israeli and Shiloach give another NC⁴ algorithm for MAXLMATCH using m + n processors which, implemented on a CRCW-PRAM, reduces to NC³. Shortly thereafter, Israeli and Itai [161] found a randomized algorithm for MAXLMATCH which lowers the polylog time exponent by 2, thus placing the algorithm in RNC¹. (The reader should note that this is the first use of a randomized algorithm in this paper, but not the last!)
A side remark is in order at this point. Luby [162] [163] proved MAXLMATCH to be in NC? with O(n) processors. Although this bound is not as good as that of Israeli and Shiloach, it deserves mention because it introduced a new and sophisticated technique the full potential of which has probably not yet been realized. The procedure we refer to formulates a randomized algorithm (in this application to both MAXLMATCH and to MAXLVP) and then removes the randomness to obtain a deterministic algorithm.
Such techniques were used in Karp and Wigderson [159] [160] and previously by Luby himself [163] primarily in connection with MAXLVP (see the next section). But these older procedures have the unpleasant side effect of producing a rather large blow up in the number of processors required. In [162] [163], Luby develops a new approach making clever use of a new probabilistic space which ultimately permits removal of randomization with no increase in the number of processors required. However, it was in [164] [165] that Luby reduced the complexity of the problem to NC², again proceeding as did Karp and Wigderson to obtain his bound as a special case of an algorithm for MAXLVP. Again, we will have more to say about this in the next section.
Now let us turn to parallel algorithms for perfect and maximum matching. To further muddy the waters, in the parallel case, we shall have to differentiate between the problems of deciding if a perfect matching exists (hereafter ?PM) and the search version of the problem (FINDPM). Remember: none of these three problems is known to be in NC. There are some intriguing new questions which arise in parallel computing which are not really significant in the sequential situation. One of these asks: “What is the difference in complexity between decision problems and search problems?” The interested reader may consult [166] for a diversion into the land of rank and independence oracles and their relative power.
In 1979, Lovász [167] proposed a test for ?PM by giving a randomized procedure for testing if a certain matrix is non-singular. Borodin, von zur Gathen and Hopcroft [168] [169] then combined this with an NC² algorithm already developed by Csanky [170] for testing matrix non-singularity to give an RNC² algorithm for ?PM.
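A sequential sketch may help convey the flavor of the randomized test for ?PM. It assumes that the matrix in question is the Tutte matrix of the graph (a skew-symmetric matrix with one indeterminate per edge, whose determinant is not identically zero exactly when a perfect matching exists) and that random residues modulo a large prime are substituted for the indeterminates. The prime, the number of trials and the function names are illustrative choices, and the elimination below is ordinary sequential Gaussian elimination, not the NC² procedure referred to in the text.

```python
import random

PRIME = (1 << 61) - 1  # a Mersenne prime; an arbitrary choice for this sketch

def det_mod(matrix, p):
    """Determinant modulo the prime p by Gaussian elimination (needs Python 3.8+)."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] % p), None)
        if pivot is None:
            return 0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det = det * m[col][col] % p
        inv = pow(m[col][col], -1, p)  # modular inverse of the pivot
        for r in range(col + 1, n):
            factor = m[r][col] * inv % p
            for c in range(col, n):
                m[r][c] = (m[r][c] - factor * m[col][c]) % p
    return det % p

def probably_has_perfect_matching(n, edges, trials=3, rng=None):
    """One-sided test: True is always correct; False is wrong only with
    small probability (a nonzero polynomial vanished at every random point)."""
    rng = rng or random.Random()
    for _ in range(trials):
        t = [[0] * n for _ in range(n)]
        for u, v in edges:
            x = rng.randrange(1, PRIME)
            t[u][v] = x
            t[v][u] = -x % PRIME  # skew-symmetry
        if det_mod(t, PRIME) != 0:
            return True
    return False

cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
star = [(0, 1), (0, 2), (0, 3)]
print(probably_has_perfect_matching(4, cycle4))  # True, except with negligible probability
print(probably_has_perfect_matching(4, star))    # False: the star has no perfect matching
```

The error is one-sided: a nonzero determinant certifies that a perfect matching exists, while a zero determinant only makes its absence very likely.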
For the search problem FINDPM, a breakthrough was obtained in 1985 by Karp, Upfal and Wigderson [171] [172] who found the first RNC algorithm. Their procedure actually placed the problem in RNC³. Shortly thereafter, Galil and Pan [173] [174] considerably improved the processor bound. In 1987, Mulmuley, Vazirani and Vazirani [175] [176] put the problem in RNC² with a faster algorithm with processor bound O(n^3.5 m).
Now what about MAXMMATCH? Actually, Mulmuley, Vazirani and Vazirani [175] [176] showed that this problem is also in RNC² as well as the following related search problems:
Problem (a): Find a maximum weight perfect matching in a graph the edge weights of which are given in unary, and
Problem (b): Find a matching covering a set of vertices of maximum weight in a graph where the weights on the vertices are given in binary.
Moreover, they show that if any of these three search problems is in NC, they all are. (The Galil-Pan improved processor bounds apply here too.) A clever observation by Karloff [177] can be used to turn all these algorithms of Karp, Upfal and Wigderson and Mulmuley, Vazirani and Vazirani from Monte Carlo to Las Vegas, so all are now known to actually lie in the “zero-error” class ZNC². It is interesting that a close relative to the above three problems has unknown complexity:
Problem (c): Find a minimum weight perfect matching in a graph with edge weights given in binary.
Before leaving these two papers, we should also inform the reader that Mulmuley, Vazirani and Vazirani were also able to show that EXACTMATCH lies in RNC² as well. Recall from Section 3 that it is not known if this problem can be solved deterministically in polynomial time.
Much of the information given in this section down to this point can be found (and in considerably greater detail) in the two excellent surveys by Galil [178] [179] which appeared nearly simultaneously in 1986.
We now return to the basic unsolved problems motivating this section; that is: Are any of ?PM, FINDPM or MAXMMATCH in NC? Although these problems remain open in general, membership has been affirmed for various graph classes. In addition to regular bipartite graphs, as discussed earlier (see [155] [156]), FINDPM has been shown to be in NC for the following graph families:
1. Claw-free graphs (and hence for line graphs as well). Chrobak and Naor [180] give an NC² algorithm. See also Naor [181].
2. Planar bipartite graphs. This was done by Miller and Naor in 1989 [182]. Note that FINDPM remains open for planar graphs in general. Here is an instance where, on the other hand, ?PM is known to be in NC! This is an immediate corollary of the fact that counting perfect matchings (henceforth #PM) is in NC² when the graph is planar. We shall deal with this again in Section 7 below.
3. Dense graphs. A graph is said to be dense if mindeg G ≥ n/2. Dahlhaus and Karpinski [183] found an NC? algorithm requiring O(n⁸) processors and somewhat later, Dahlhaus, Hajnal and Karpinski [184] [185] found an NC⁴ procedure using a linear number of processors. Note that a dense graph (on an even number of vertices) always has a perfect matching. This is an immediate corollary to the classical theorem of Dirac which says more, namely that such a graph must have a Hamilton cycle. A very interesting “negative” result is also proved in these three papers. Namely, if one drops the degree bound to n/2 - ε for any ε > 0, then the problem is just as hard (under NC-reductions) as FINDPM for general graphs!
4. Strongly chordal graphs. A graph G is chordal if every cycle of length greater than 3 has a chord. This is equivalent to having a perfect elimination scheme on the vertex set; namely, an ordering of the vertices such that if u < v, u < w and (u,v), (u,w) ∈ E(G), then (v,w) ∈ E(G). Strongly chordal graphs can be defined by imposing the additional requirement that the perfect elimination scheme also satisfies: if x < u and y < v and if (x,y), (x,v), (y,u) ∈ E(G), then (u,v) ∈ E(G). FINDPM is in NC² for strongly chordal graphs by Dahlhaus and Karpinski [186]. For the wider class of chordal graphs, however, it is as hard as the general bipartite case, the parallel complexity of which, as we have remarked several times already, is unknown.
5. Co-comparability graphs. A precedence graph is an acyclic transitively closed directed graph. A comparability graph is an undirected graph which can be oriented so as to become a precedence graph. Co-comparability graphs are the complements of comparability graphs. Helmbold and Mayr [187] [188] have produced an NC² algorithm for these graphs. There is a close relationship between co-comparability graphs and the so-called two processor scheduling problem. As corollaries, one obtains here NC² algorithms for FINDPM for the class of permutation graphs and partial orders of dimension 2, as well as interval graphs. An NC algorithm for co-comparability graphs was independently found by Kozen, Vazirani and Vazirani [189], although no polylog time exponent is given. This paper is also interesting from another point of view. Although the membership of ?PM in NC is a famous unsolved problem, here these three authors give an NC algorithm for the closely related problem: “Does graph G have a unique perfect matching?” Moreover, Rabin and Vazirani [190] give an NC² algorithm for FINDPM, in the case when G has a unique perfect matching.
6. Bipartite graphs with a polynomially bounded number of perfect matchings. Grigoriev and Karpinski [191] give an NC³ algorithm for this class. In fact, the authors do more. For such graphs, they show that finding all perfect matchings is in NC³. Moreover, if Φ(G) is bounded by a constant, they find all perfect matchings with a faster - NC² - algorithm. An interesting sidelight here is the problem of recognizing such graphs! This is leading us perilously close to yet another important problem: counting the number Φ(G) of perfect matchings in a graph G. This has been shown to be #P-complete, so hope for a polynomial algorithm, much less an NC algorithm, to compute Φ(G) is dim indeed. However, Grigoriev and Karpinski do give an RNC³ algorithm which, given any polynomial cn^k, will decide if Φ(G) ≤ cn^k.
For the problem of ?PM in bipartite graphs with a polynomially bounded number of perfect matchings, these two authors also give an NC² algorithm in their paper. They also state
that their results can be extended from bipartite graphs to graphs in general, but no proofs are given.
Let us insert at this point a few remarks on the problem ?PM. In the paper of Grigoriev and Karpinski discussed in the preceding three paragraphs, one also finds the result that ?PM belongs to NC² for graphs with a polynomially bounded number of perfect matchings. We will see in the last section of our paper that exact counting of perfect matchings - #PM - is in NC for certain classes of graphs; e.g., planar graphs. Thus by default, ?PM is in NC for all such classes. (Recall that #PM is #P-complete and therefore NP-hard in general.)
Now let us return to the maximal matching problem - MAXLMATCH - discussed at the beginning of this section. As we noted above, it is now known to be in NC, although not without a struggle! Motivated still by the apparent difficulty of doing greedy procedures in parallel, researchers have posed and investigated the following variant of MAXLMATCH. Suppose the edges of a graph G are numbered 1 through m. The lexicographically first maximal matching problem - LFMAXLMATCH - is stated as follows. Does the lexicographically first maximal matching in G contain edge m? Is the problem in NC or perhaps even P-complete? These are open questions (see [190] [192]-[194]). In the latter three references, the problem is one of several used to motivate the creation of a new complexity class - CC. (This will be the last class in our lattice diagram to be discussed.) This class is discussed at length in [192]-[194] with perhaps the most succinct treatment in [22].
First we must define the circuit value problem - or CV: “Given a Boolean circuit with n inputs and a single output, together with a binary input value for each input gate, is the output equal to 1?” CV is known to be P-complete under log-space reductions (see [195]). Now we modify the CV problem as follows.
Suppose we restrict all circuit elements to have precisely two inputs and two outputs, one output yielding x ∧ y, the other yielding x ∨ y for binary inputs x and y. These circuit elements are called comparators and we have the comparator circuit value problem - or CCV: “Given a Boolean comparator circuit and binary inputs, is the output equal to 1?” The class CC is defined as the class of problems log-space reducible to the CCV problem. Subramanian [192] has shown LFMAXLMATCH to be CC-complete. In addition, he also shows another old friend from earlier in this paper to be CC-complete: the Stable Roommates Problem. For a proof of this, as well as other CC-complete variants, see [196].
There are several interesting open problems surrounding class CC. Is there a machine characterization of CC? Are CC and NC comparable? Feder [196] has shown that CC contains the class NL (non-deterministic log-space) (see once again [22] for a description of this class). It seems that there are no results on the parallel complexity or indeed any known relationships to classes inside P (e.g., CC) for lexicographically first maximum matching.
Let us insert here a short remark on the parallel complexity of MATROID PARITY introduced and discussed in Section 3. Very recently, Narayanan, Saran and V.V. Vazirani [197] have shown that if the matroids are linear, there is an RNC² algorithm to solve the problem which uses O(n"5) processors.
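Returning to the comparator circuits defined above, a sequential evaluator is only a few lines long. The sketch assumes one common way of presenting such a circuit: a list of wires carrying the input bits, and a sequence of comparators, each naming the wire that receives x ∧ y and the wire that receives x ∨ y. This encoding and the function name are illustrative assumptions.

```python
def evaluate_comparator_circuit(inputs, comparators, output_wire):
    """Evaluate a Boolean comparator circuit.

    inputs      -- list of 0/1 values, one per wire
    comparators -- sequence of (i, j): wire i receives x AND y, wire j receives x OR y,
                   where x and y are the current values on wires i and j
    output_wire -- index of the wire whose final value is the circuit output
    """
    wires = list(inputs)
    for i, j in comparators:
        x, y = wires[i], wires[j]
        wires[i] = x & y  # the "minimum" output
        wires[j] = x | y  # the "maximum" output
    return wires[output_wire]

inputs = [1, 0, 1]
comparators = [(0, 1), (1, 2)]
print(evaluate_comparator_circuit(inputs, comparators, 2))  # 1
print(evaluate_comparator_circuit(inputs, comparators, 0))  # 0
```

Because every gate has fan-out exactly one per output, values cannot be copied freely; this restriction is what separates CCV from the general circuit value problem CV.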
To close this section, let us mention a very recent result on approximating a maximum matching in parallel. Fischer, Goldberg and Plotkin [198] have developed an NC³ algorithm which, given a constant k > 0, computes a matching of cardinality at least 1 - 1/(k+1) times the maximum. The algorithm requires O(n^(2k+2)) processors.
6. Vertex Packing in Parallel
The problems considered here have matching analogs which were discussed in the preceding section. In particular, we will treat the maximal independent set problem (MAXLVP) and its lex-first cousin LFMAXLVP. Let us dispose of the latter problem first. LFMAXLVP was shown to be P-complete under log-space reduction by Cook in 1985 [a] (see also [25] [26]).
On the other hand, MAXLVP was first shown to be in NC by Karp and Wigderson [159] [160] in 1984. They gave an NC⁴ algorithm using O(n³/(log n)³) processors. In the same paper, they also provided a faster randomized version, thus showing the problem to be in RNC³. (This result automatically placed MAXLMATCH in the same two classes; this was discussed in Section 5.) It should be noted that the Karp-Wigderson paper was a deep piece of work. In particular, in order to go from the randomized version of their algorithm to the deterministic version, they appealed to the theory of block designs, no less!
Two independent improvements to NC² are due to Luby [164] [165] and to Alon, Babai and Itai [199]. Both papers first present a randomized version of their respective algorithms and then cleverly remove randomness.
In the past five years or so, several papers have appeared all with their main theme being improvement of the Karp-Wigderson time-processor product bound. Goldberg and Spencer first found an NC? algorithm using O(n) processors [200] [201] and then produced an NC³ algorithm requiring O((n + m)/log n) processors [202]. Preceding both these results, Goldberg had produced a parallel algorithm whose time-processor product was better than Luby or Karp and Wigderson, but the algorithm was not polylog time [203].
The Goldberg-Spencer papers also asked an interesting new question. Turán [204] proved in 1954 that every graph with n vertices and m edges must contain an independent set of size at least n²/(2m + n). Can such a set be found via an NC algorithm?
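Sequentially, an independent set of the size guaranteed by Turán's bound is easy to obtain; what the question above asks for is a parallel (NC) construction. The sketch below assumes the standard minimum-degree greedy heuristic, which is known to reach at least n/(d + 1) vertices for average degree d = 2m/n, that is, n²/(2m + n). It is not the method of any of the papers cited here, and the names are illustrative.

```python
def min_degree_greedy_independent_set(vertices, edges):
    """Repeatedly pick a vertex of minimum degree in what remains,
    then delete it together with all of its neighbours."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    chosen = []
    while adj:
        v = min(adj, key=lambda x: len(adj[x]))
        chosen.append(v)
        removed = adj[v] | {v}
        for r in removed:
            adj.pop(r)
        for neighbours in adj.values():
            neighbours -= removed  # in-place update of each remaining adjacency set
    return chosen

c5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
n, m = 5, len(c5)
independent = min_degree_greedy_independent_set(range(n), c5)
print(independent, n * n / (2 * m + n))  # [0, 2] and the bound 25/15
```

The loop is inherently sequential, since each choice depends on the degrees left after the previous deletion; this is exactly the kind of greedy dependence that makes the parallel version non-trivial.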
In [205], they answered their own question in the affirmative, giving an NC³ algorithm.
Returning now to MAXLVP, let us note that a more efficient NC algorithm for the special case when G is planar was found by Xin He [206] who found an NC² routine requiring only O(n) processors. In 1988, this result was extended by Khuller [207] to K₃,₃-free graphs. Khuller made use of a special decomposition technique developed by V.V. Vazirani [75] [76] in his work on Φ(G) for K₃,₃-free graphs which we will meet again in the final section of this paper. Last year, Dadoun and Kirkpatrick [208] reduced the time bound for planar graphs to O(log n log* n), while preserving the O(n) processor bound. (Here log* n denotes the number of applications of the log function required to reduce n to a constant value.) They also gave an RNC¹ algorithm for MAXLVP.
Now what about the MAXMVP problem? In other words, what can be said about the (apparently much harder) task of computing a maximum independent set in parallel?
NC algorithms have been found for interval graphs by Helmbold and Mayr [209] and Bertossi and Bonuccelli [210] and more recently for the more general class of chordal graphs by Naor, Naor and Schäffer [211] [212]. To date, these are the only special classes of graphs known to the author for which MAXMVP has been shown to be in NC.
Finally, we also note that an NC algorithm for listing all maximal independent sets in a chordal graph is also given in [211] [212]. Dahlhaus and Karpinski [213] have now developed NC algorithms for listing all maximal independent sets for other classes of graphs, all of which have polynomially bounded numbers of maximal independent sets (chordal graphs have this property, for example).
7. Enumeration
In this section, we will give a short summary of three types of problems dealing with enumeration of matchings and independent sets:
1. Exact counting
2. Approximate counting
3. Complete listing.
Let us begin with counting perfect matchings (#PM) and a sobering result. At this point, the reader will recall the complexity class #P from Section 2. Valiant [66] [67] proved the important (but somewhat deflating) result that counting the number of perfect matchings in a bipartite graph (i.e., determining $\Phi(G)$) is #P-complete and therefore NP-hard. It is important to realize that here we have a problem in P (?PM or FINDPM) the counting version of which is #P-hard! It is therefore not surprising that counting maximum independent sets (#MAXMVP) is also #P-complete.

So the best we can reasonably hope for in the way of polynomial algorithms for exact counting is to be able to count these objects for certain special classes of graphs. For matchings, Kasteleyn was the pioneer. Motivated by a question from crystal physics (counting so-called “dimers” on a crystal lattice), he showed [214] [215] that there is a polynomial algorithm to count the number of perfect matchings in any planar graph. His method involves showing that any undirected planar graph can have its edges oriented in such a way that a certain matrix associated with the resulting directed graph has the property that its determinant is the square of $\Phi(G)$. Such an orientation has come to be called Pfaffian after the classical Pfaffian function of a matrix. Since determinants can be evaluated in polynomial time, we have our desired result: a polynomial algorithm for #PM.

Subsequently, Kasteleyn's method was extended to a wider class of graphs by Little [216], who showed that every $K_{3,3}$-free graph has a Pfaffian orientation, although he did not deal with the problem of finding a polynomial algorithm to construct one. In 1988, V.V. Vazirani [75] [76] showed how to construct such a polynomial algorithm. In fact, finding a Pfaffian orientation (and hence determining $\Phi(G)$) in a $K_{3,3}$-free graph is even in NC! In the same paper, Vazirani showed that testing to see if a graph is $K_{3,3}$-free is also in NC.
Note that a trivial corollary of Vazirani's work now shows that ?PM for $K_{3,3}$-free graphs is in NC. But the nagging problem of FINDPM remains open, even for $K_{3,3}$-free graphs. Finally, Vazirani showed that the decision version of the problem EXACTMATCH, introduced in Section 3, also lies in NC for the family of $K_{3,3}$-free graphs. Again, we emphasize that the search problem remains open.
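Kasteleyn's determinant method can be illustrated on rectangular grid graphs, which are planar. The following Python sketch is ours and rests on stated assumptions: the helper names `det` and `grid_perfect_matchings` are hypothetical, and the orientation used (horizontal edges left to right, vertical edges alternating in direction from column to column) is one standard choice for which every bounded face has an odd number of clockwise edges. The determinant of the resulting skew-symmetric matrix is then the square of the number of perfect matchings.

```python
from fractions import Fraction
from math import isqrt


def det(matrix):
    """Exact determinant of an integer matrix by Gaussian elimination
    over the rationals (polynomial time)."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for k in range(n):
        pivot = next((r for r in range(k, n) if a[r][k] != 0), None)
        if pivot is None:
            return 0
        if pivot != k:
            a[k], a[pivot] = a[pivot], a[k]
            result = -result
        result *= a[k][k]
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
    return int(result)


def grid_perfect_matchings(rows, cols):
    """Number of perfect matchings of the rows x cols grid graph,
    computed as the square root of a skew-symmetric determinant."""
    n = rows * cols
    index = lambda r, c: r * cols + c
    a = [[0] * n for _ in range(n)]
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                # Horizontal edge, oriented left to right.
                u, v = index(r, c), index(r, c + 1)
                a[u][v], a[v][u] = 1, -1
            if r + 1 < rows:
                # Vertical edge: downward in even columns, upward in odd ones.
                u, v = index(r, c), index(r + 1, c)
                s = 1 if c % 2 == 0 else -1
                a[u][v], a[v][u] = s, -s
    return isqrt(det(a))


if __name__ == "__main__":
    for shape in [(2, 2), (2, 3), (4, 4)]:
        print(shape, grid_perfect_matchings(*shape))
```

The sketch only counts; it says nothing about how to find a Pfaffian orientation for a general planar or $K_{3,3}$-free graph, which is the harder part of the results discussed here.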
But let us return for a moment to the problem of finding a Pfaffian orientation for a graph in general. One can pose three (at least formally) different questions here:
1. Does a given graph G have a Pfaffian orientation?
2. Given an orientation of graph G, is it Pfaffian?
3. Given a graph G, find a Pfaffian orientation.
Observe that the first two questions are decision problems, while the third is a search problem. The complexity of all three questions is currently unknown. That question 2 is in co-NP follows from the work of Kasteleyn [214] [215]. More recently, Lovász (implicitly) [217] and V.V. Vazirani and Yannakakis [218] [219] (explicitly, using Lovász's polynomial time algorithm for computing the GF(2) rank of the set of perfect matchings of a graph) have shown that problems 1 and 2 are polynomially equivalent and hence problem 1 is also in co-NP. It has been pointed out to us by Pulleyblank [220] that it also follows from the work of Lovász on the matching lattice [217] that problem 3 is equivalent to problems 1 and 2.

In the case when graph G is bipartite, there is an interesting connection between these three problems and a fourth problem which has been studied by Seymour and Thomassen [221], among others.
4. Given a directed graph D, does it fail to contain a directed cycle of even length?
Vazirani and Yannakakis have shown [218] [219] that, for bipartite graphs, problem 4 is polynomially equivalent to problems 1 and 2 (and hence to problem 3 as well, by Pulleyblank's remark).

Incidentally, although #PM is in P for planar graphs by Kasteleyn's work, it has been shown more recently by Jerrum [222] that counting all matchings in a planar graph G is #P-complete. Going back to crystal lattices for a moment, Kasteleyn's method of course applies only to those which are planar. What about counting dimers on 3-dimensional lattices? Even this is unknown (cf. Sinclair [223]). Let us also mention at this point that Irving and Leather [224] proved in 1986 that counting the number of stable marriages (i.e., induced bipartite matchings) is #P-complete.

We now proceed from exact counting to approximate counting. The fact that exact counting is #P-complete has lent impetus to trying for a less ambitious goal. Can we find a “good approximation” for $\Phi(G)$ “efficiently”? Here many, but not all, of the interesting results have been obtained for the bipartite case. The reason is that evaluating $\Phi(G)$ for a bipartite graph G is the same problem as evaluating the permanent of a square (0-1) matrix A.

Let A be any $n \times n$ matrix. The permanent of A, denoted per A, is just the sum of all the n! terms of its determinant, except that all terms are taken with a plus sign. Although the permanent is therefore even slightly easier to define than the determinant, it is a much more badly behaved function! (See Minc [225] for a general reference on permanents.) In particular, although polynomial algorithms for evaluating the determinant are well known to every undergraduate mathematics student, the best known algorithm for evaluating the permanent has the ugly time bound of $O(n2^n)$ (see Ryser [226]).
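The definition of the permanent, and the inclusion-exclusion formula usually attributed to Ryser, can be sketched in a few lines of Python. The function names `permanent_naive` and `permanent_ryser` are our own. The second routine is a plain version of the formula and takes $O(n^2 2^n)$ arithmetic operations as written; we understand the $O(n2^n)$ bound to require visiting the subsets in Gray-code order, which this sketch does not attempt.

```python
from itertools import permutations


def permanent_naive(a):
    """Sum over all n! permutations, every term taken with a plus sign."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        term = 1
        for i in range(n):
            term *= a[i][p[i]]
        total += term
    return total


def permanent_ryser(a):
    """per A = sum over nonempty column subsets S of
    (-1)^(n - |S|) * prod_i (sum of a[i][j] for j in S)."""
    n = len(a)
    total = 0
    for mask in range(1, 1 << n):
        term = 1
        for row in a:
            term *= sum(row[j] for j in range(n) if (mask >> j) & 1)
        sign = -1 if (n - bin(mask).count("1")) % 2 else 1
        total += sign * term
    return total


if __name__ == "__main__":
    # Biadjacency matrix of a 6-cycle viewed as a bipartite graph:
    # its permanent is the number of perfect matchings of the cycle.
    a = [[1, 1, 0],
         [0, 1, 1],
         [1, 0, 1]]
    print(permanent_naive(a), permanent_ryser(a))
```

For a bipartite graph with both colour classes of size n, the matrix passed in is the $n \times n$ biadjacency matrix, and the value returned is $\Phi(G)$.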
Fortunately, there has very recently appeared an excellent survey by Luby [227] dealing with approximation algorithms for the permanent, and we heartily recommend that the interested reader consult it. Our remarks on approximation will therefore be brief.

An $(\epsilon,\delta)$ approximation algorithm for per A is a randomized (Monte Carlo) algorithm which accepts an $n \times n$ matrix A and two positive real numbers, $\epsilon$ and $\delta$. The algorithm then outputs a number Y as an estimate for per A in the sense that:

$$\mathrm{Prob}\,[(1-\epsilon)\,\mathrm{per}\,A \le Y \le (1+\epsilon)\,\mathrm{per}\,A] \ge 1-\delta.$$
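To give the flavour of a Monte Carlo estimate for per A, here is a small Python sketch of a classical unbiased estimator due to Godsil and Gutman: attach independent random signs to the entries of a (0-1) matrix and square the determinant. We offer it only as an illustration under our own naming (`det`, `sample`, `estimate_permanent`, `exact_mean` are hypothetical helpers); it is not claimed to be the algorithm of any paper cited in this section, and its variance can be large, so by itself it does not meet the accuracy demand above in polynomial time.

```python
import random
from fractions import Fraction
from itertools import product


def det(matrix):
    """Exact determinant of an integer matrix (rational elimination)."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for k in range(n):
        pivot = next((r for r in range(k, n) if a[r][k] != 0), None)
        if pivot is None:
            return 0
        if pivot != k:
            a[k], a[pivot] = a[pivot], a[k]
            result = -result
        result *= a[k][k]
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
    return int(result)


def sample(a, rng):
    """One sample: random +1/-1 signs on the entries of the 0-1 matrix a,
    then the squared determinant."""
    b = [[x * rng.choice((-1, 1)) for x in row] for row in a]
    return det(b) ** 2


def estimate_permanent(a, samples, seed=0):
    """Average of independent samples; the expectation equals per a."""
    rng = random.Random(seed)
    return sum(sample(a, rng) for _ in range(samples)) / samples


def exact_mean(a):
    """Average of the estimator over ALL sign patterns (tiny matrices
    only); used to check unbiasedness exactly."""
    n = len(a)
    ones = [(i, j) for i in range(n) for j in range(n) if a[i][j]]
    total = 0
    for signs in product((-1, 1), repeat=len(ones)):
        b = [[0] * n for _ in range(n)]
        for (i, j), s in zip(ones, signs):
            b[i][j] = s
        total += det(b) ** 2
    return Fraction(total, 2 ** len(ones))


if __name__ == "__main__":
    a = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]   # per a = 2
    print(exact_mean(a), estimate_permanent(a, 2000))
```

The exact average over all sign patterns equals the permanent because, in the expansion of the squared determinant, every cross term contains some random sign exactly once and so has mean zero.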
An $(\epsilon,\delta)$ approximation algorithm is said to be a fully polynomial randomized approximation scheme (fpras) if its running time is polynomial in n, $1/\epsilon$ and $1/\delta$. It is an open question as to whether or not there exists an fpras for the permanent function, and therefore for $\Phi(G)$, for G bipartite.

Very recently, two major lines of research on this question have been undertaken. The first of these has resulted in an approximation algorithm which meets the accuracy demand of an fpras, but in superpolynomial time. More specifically, Karmarkar, Karp, Lipton, Lovász and Luby [228] have designed a Monte Carlo algorithm which yields the desired output in time $2^{n/2}(1/\epsilon^2)\log(1/\delta)\,p(n)$, where p(n) is a polynomial in n. For fixed $\epsilon$ and $\delta$, this is about the square root of the time bound for Ryser's algorithm. (Even more recently, Jerrum and U. Vazirani [229] have designed a different algorithm with worst-case time complexity $\exp(O(n^{1/2}\log^2 n))$, which improves that of [228].) The five authors of [228] also pose the following open question: Is there a deterministic algorithm with running time $o(2^n)$ which accepts as input a matrix A and a positive real $\epsilon$ and outputs Y such that
$$(1-\epsilon)\,\mathrm{per}\,A \le Y \le (1+\epsilon)\,\mathrm{per}\,A\,?$$

The second approach seems to have originated with an idea of Broder [230] and has been continued by Jerrum and Sinclair [231] [232]. The latter two authors succeeded in finding an fpras for dense permanents, that is, for dense bipartite graphs. The Jerrum-Sinclair papers (and several other companion papers) are not only important for this result, but perhaps even more so for introducing novel approaches using such esoteric concepts from probability theory as rapidly mixing Markov chains and conductance. (Broder too deals with slightly modified versions of the same.) In general, their common approach deals with reducing the problem of approximately counting perfect matchings to that of generating them at random from an almost uniform distribution.

Using these ideas, Dagum, Luby, Mihail and U.V. Vazirani [233] [234] (see also Dagum and Luby [235]) have achieved a polynomial speed-up of the algorithm of Jerrum and Sinclair and used it to show that there is also an fpras for bipartite graphs with large factor size. The factor size of a bipartite graph $G = A \cup B$ (where $|A| = |B| = n$) is the maximum number of edge-disjoint perfect matchings in G. (Observe that a graph with an $\alpha n$-factor must have minimum degree at least $\alpha n$, but not necessarily vice-versa.) In this area, Dagum and Luby showed that there is an fpras for bipartite graphs with factor size at least $\alpha n$ for any constant $\alpha > 0$.

These results are even more interesting when compared to some related new completeness results. Broder [230] has shown that exact counting in dense graphs is as hard as exact counting in general and so is #P-complete. Dagum and Luby [235] and Dagum, Luby, Mihail and U.V. Vazirani [233] [234] have shown that exact counting in f(n)-regular bipartite graphs is #P-complete for any f(n) such that $3 \le f(n) \le n - 3$.
In fact, they show that for any $\epsilon > 0$ and for any function f(n) such that $3 \le f(n) \le n^{1-\epsilon}$, the existence of an fpras for f(n)-regular bipartite graphs would imply the existence of an fpras for all bipartite graphs!

And finally a word or two about finding (i.e., listing) all matchings and independent sets. Indeed the literature is quite sparse for these problems.
For bipartite graphs, Grigoriev and Karpinski [191] have developed algorithms for constructing all maximum matchings in the case when $\Phi(G)$ is bounded. More particularly, they show #PM is in $NC^2$ when $\Phi(G)$ is bounded by a constant and in $NC^3$ when $\Phi(G)$ is polynomially bounded (that is, $\Phi(G) \le cn^k$). They claim their results extend to non-bipartite graphs as well, although they give no proofs. For bipartite graphs in general, it appears that the best algorithm for finding all perfect matchings is due to Fukuda and Matsui [236], who found an $O((\Phi(G) + 1)(n + m + n^{2.5}))$ (sequential) algorithm for the problem. We are unaware of any published parallel algorithms in this case.

For independent set listing problems, even less seems to be known. The sole result we will mention is the following. Let M denote the number of maximal independent sets in a graph. The fastest sequential algorithm for constructing all maximal independent sets in the graph has $O(nmM)$ time and $O(n + m)$ space. It is due independently to Chiba and Nishizeki [237] and Tsukiyama, Ide, Ariyoshi and Shirakawa [238]. Of course, this is a polynomial algorithm if M is polynomially bounded. More recently, Dahlhaus and Karpinski [213] have obtained a parallel algorithm for this problem which runs in $O(\log^3(nM))$ time and uses $O(M^6n^2)$ processors. Note that if M is polynomially bounded, this becomes an NC algorithm. There are many interesting classes of graphs with bounded M and the authors give a list of eight such families.

8. Lower Bounds
Note that all of the considerable work discussed and/or referred to above in this paper possesses one common thread: Can we find a faster algorithm to solve the problem at hand? Much less progress has been made on approaches from the opposite direction: Can we prove that no polynomial time algorithm is possible for a given problem? Or, more generally, can we prove that no algorithm is possible in a given time or space for a given problem?

Though results of this kind (that is, lower bounds for the computational complexity of certain problems) are much more sparse, we will close this paper with a brief overview of the work of A.A. Razborov in this area. His work has caused considerable excitement among complexity theorists world-wide and won him the Nevanlinna Prize at the International Congress of Mathematicians held in Kyoto in 1990. For our brief synopsis, we borrow heavily from the articles of Lovász [239] and Sipser [45].

This work is cast in the terminology of Boolean circuit theory. (We mentioned Boolean circuits briefly in our treatment of the complexity class CC in Section 5.) A Boolean circuit for a computation is an acyclic directed graph the nodes of which represent the elementary steps in the computation. These nodes are also called gates. Gates having indegree 0 are called input gates and those with outdegree 0 are output gates. The simplest kinds of gates are AND, OR and NOT gates. The size of a Boolean circuit is the number of gates in it.

Every polynomial time computable function can be computed by a polynomial size Boolean circuit. More precisely, this means that if L is a language in P and $L_n$ denotes the set of strings of length n in L, then there exists a family $B_n$ of Boolean circuits such that $B_n$ takes n input bits and recognizes $L_n$ (i.e., outputs 1 if and only if the n-bit input string is in $L_n$), and the size of $B_n$ is bounded by $n^c$ for some fixed constant c.
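The circuit vocabulary just introduced can be made concrete with a small Python sketch. The representation (a topologically ordered list of `(name, operation, operands)` triples) and the names `PM_2X2`, `evaluate` and `size` are our own assumptions, chosen only for illustration. The example circuit decides whether a bipartite graph with two vertices on each side has a perfect matching; input $x_{ij}$ is 1 exactly when left vertex i is joined to right vertex j. It happens to use only AND and OR gates.

```python
# Gates are listed in topological order: every operand appears earlier.
PM_2X2 = [
    ("x11", "IN", ()),
    ("x12", "IN", ()),
    ("x21", "IN", ()),
    ("x22", "IN", ()),
    ("g1", "AND", ("x11", "x22")),   # matching {11, 22}
    ("g2", "AND", ("x12", "x21")),   # matching {12, 21}
    ("out", "OR", ("g1", "g2")),     # output gate (outdegree 0)
]


def evaluate(circuit, assignment):
    """Evaluate every gate once, in the given order, and return the
    dictionary of gate values."""
    value = {}
    for name, op, args in circuit:
        if op == "IN":
            value[name] = bool(assignment[name])
        elif op == "AND":
            value[name] = all(value[a] for a in args)
        elif op == "OR":
            value[name] = any(value[a] for a in args)
        elif op == "NOT":
            value[name] = not value[args[0]]
        else:
            raise ValueError(f"unknown gate type: {op}")
    return value


def size(circuit):
    """Size of a circuit: the number of gates, input gates included."""
    return len(circuit)


if __name__ == "__main__":
    edges = {"x11": 1, "x12": 0, "x21": 0, "x22": 1}
    print(evaluate(PM_2X2, edges)["out"], size(PM_2X2))
```

For n vertices on each side, the obvious generalization of this circuit has one AND gate per permutation and therefore exponential size; the interest of the lower-bound results lies in how much smaller a circuit can or cannot be.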
So if we could prove that a superpolynomial lower bound exists on the circuit size required by some NP problem, such as the existence of a clique of given size, then this would suffice to separate NP from P.

Now let us define a Boolean circuit to be monotone if it contains no NOT gates. Razborov, in two 1985 papers, showed that neither (a) [240] deciding the existence of a clique of given size in a graph nor (b) [241] deciding the existence of a perfect matching in a bipartite graph (a special case of (a)) can be done using polynomial-size monotone circuits. In particular, he showed that at least $n^{c \log n}$ gates are necessary for a monotone circuit solution of perfect matching. So ?PM, which is well known to have polynomial non-monotone complexity, has super-polynomial monotone complexity. Alon and Boppana [242] have since strengthened Razborov's clique result to show that in fact there is an exponential lower bound for the clique problem and not just a superpolynomial bound.

The clique problem is clearly equivalent to the independent set problem via graph complementation. Hence Razborov's results deal precisely with matching and vertex packing, our two paradigm problems. Will monotone circuits somehow prove to yield the key to the apparent separation in the complexity of these two problems and perhaps even provide the answer to the P = NP question? Only time will tell.

This brings us to the end of our survey. As is always the case, more could have been included. There are several important topics related to the theme of this paper which we have elected not to include. Among these are weighted and capacitated matching. For useful references, see [243]-[248]. For probabilistic analysis of graph algorithms, see [249]-[253]. Finally, for matching and vertex packing in random graphs, see [254]-[262].
Note Added in Proof: Since the completion of this paper, reference [263] has come to the attention of the author and should be mentioned in Section 5 along with references [150]-[154]. Similarly, reference [264] should be added to references [237] [238] in Section 7. In [264], the authors present an algorithm which outputs all maximal independent sets in lexicographic order.
Acknowledgements

It is a pleasure to acknowledge the kind assistance of a number of colleagues. Copious thanks are due to R. Anstee, L. Babai, G. Brassard, E. Eschen, A. Frieze, Z. Galil, A.B. Gamble, P. Hell, M. Jerrum, D.S. Johnson, R.M. Karp, L. Lovász, M. Luby, J. Naor, W.R. Pulleyblank, R. Motwani, E.C. Sewell, M. Sipser, L.J. Stockmeyer, É. Tardos, C. Thomassen, L. Trotter and V.V. Vazirani. We are very grateful for their help. This work was supported by ONR Contracts #N00014-85-K-0488, #N00014-91-J-1142 and the Laboratoire de Recherche en Informatique, CNRS, Univ. Paris Sud.
References

[1] L. Lovász and M.D. Plummer; Matching Theory, Ann. Discrete Math., 29, North-Holland, Amsterdam (1986).
[2] M.D. Plummer; Matching theory - a sampler: from Dénes König to the present, preprint (1991).
[3] D. König; Graphs and matrices, Mat. Fiz. Lapok, 38, 116-119 (1931). (Hungarian)
[4] D. König; Über trennende Knotenpunkte in Graphen (nebst Anwendungen auf Determinanten und Matrizen), Acta Sci. Math. (Szeged), 6, 155-179 (1933).
[5] T. Gallai; Über extreme Punkt- und Kantenmengen, Ann. Univ. Sci. Budapest. Eötvös Sect. Math., 2, 133-138 (1959).
[6] J. Edmonds; Paths, trees and flowers, Canad. J. Math., 17, 449-467 (1965).
[7] N. Blum; A new approach to maximum matching in general graphs, Automata, Languages and Programming, M.S. Paterson (editor), Lecture Notes in Computer Science, 443, Springer-Verlag, Berlin, 586-597 (1990).
[8] N. Blum; A new approach to maximum matching in general graphs, Univ. Bonn Inst. für Informatik Report No. 8546-CS (1990).
[9] N. Blum; Jack Edmonds' original maximum matching algorithm needs only O(i?) time, Univ. Bonn Inst. für Informatik Report No. 8568-CS (1991).
[10] S. Micali and V.V. Vazirani; An $O(V^{1/2}E)$ algorithm for finding maximum matching in general graphs, Proc. 21st Annual Symposium on Foundations of Computer Science, IEEE, New York, 17-27 (1980).
[11] V.V. Vazirani; A theory of alternating paths and blossoms for proving correctness of the $O(\sqrt{V}E)$ general graph matching algorithm, Cornell University, Dept. of Computer Science Technical Report TR 89-1035 (1989).
[12] J.E. Hopcroft and R.M. Karp; An $n^{5/2}$ algorithm for maximum matchings in bipartite graphs, Proc. 12th Annual Symposium on Switching and Automata Theory (East Lansing, 1971), IEEE, New York, 122-125 (1971).
[13] J.E. Hopcroft and R.M. Karp; An $n^{5/2}$ algorithm for maximum matchings in bipartite graphs, SIAM J. Comput., 2, 225-231 (1973).
[14] T. Feder and R. Motwani; Clique partitions, graph compression and speeding-up algorithms, Proc. 23rd Annual ACM Symposium on Theory of Computing, ACM, New York, 123-133 (1991).
[15] H. Alt, N. Blum, K. Mehlhorn and M. Paul; Computing a maximum cardinality matching in a bipartite graph in time $O(n^{1.5}\sqrt{m/\log n})$, Inform. Process. Letters, 37, 237-240 (1991).
[16] J. Cheriyan, T. Hagerup and K. Mehlhorn; Fast and simple network algorithms (extended abstract), preprint (1991).
[17] R.E. Tarjan and A.E. Trojanowski; Finding a maximum independent set, SIAM J. Comput., 6, 537-546 (1977).
[18] F. Harary; Graph Theory, Addison-Wesley, Reading, Massachusetts (1969).
[19] G.J. Minty; On maximal independent sets of vertices in claw-free graphs, J. Combin. Theory Ser. B, 28, 284-304 (1980).
[20] N. Sbihi; Algorithme de recherche d'un stable de cardinalité maximum dans un graphe sans étoile, Discrete Math., 29, 53-76 (1980).
[21] M.R. Garey and D.S. Johnson; Computers and Intractability: A Guide to the Theory of NP-Completeness, W.H. Freeman and Co., San Francisco (1979).
[22] D.S. Johnson; A catalog of complexity classes, Chapter 2 in Handbook of Theoretical Computer Science, Volume A: Algorithms and Complexity, J. van Leeuwen (editor), Elsevier/MIT, Amsterdam/Cambridge, 69-161 (1990).
[23] P. van Emde Boas; Machine models and simulations, Chapter 1 in Handbook of Theoretical Computer Science, Volume A: Algorithms and Complexity, J. van Leeuwen (editor), Elsevier/MIT, Amsterdam/Cambridge, 1-66 (1990).
[24] R.B. Boppana and M. Sipser; The complexity of finite functions, Chapter 14 in Handbook of Theoretical Computer Science, Volume A: Algorithms and Complexity, J. van Leeuwen (editor), Elsevier/MIT, Amsterdam/Cambridge, 759-804 (1990).
[25] R.M. Karp and V. Ramachandran; A survey of parallel algorithms for shared-memory machines, University of California at Berkeley, Computer Sci. Div. (EECS) Report No. UCB/CSD 88/408 (1988). (Has appeared in Handbook of Theoretical Computer Science, North-Holland (1990). See reference [26].)
[26] R.M. Karp and V. Ramachandran; Parallel algorithms for shared-memory machines, Chapter 17 in Handbook of Theoretical Computer Science, Volume A: Algorithms and Complexity, J. van Leeuwen (editor), Elsevier/MIT Press, Amsterdam/Cambridge, 869-942 (1990).
[27] A.V. Aho, J.E. Hopcroft and J.D. Ullman; The Design and Analysis of Computer Algorithms, Addison-Wesley, Reading, Massachusetts (1974).
[28] D. Angluin; On counting problems and the polynomial-time hierarchy, Theoret. Comput. Sci., 12, 161-173 (1980).
[29] R. Anderson; The Complexity of Parallel Algorithms, Stanford University, Dept. of Computer Science, Ph.D. Thesis (1984).
[30] J.L. Balcázar, J. Díaz and J. Gabarró; Structural Complexity, Springer-Verlag, Berlin (1988).
[31] S.A. Cook; An observation on time-storage trade-off, J. Comput. System Sci., 9, 308-316 (1974).
[32] S.A. Cook; An overview of computational complexity, Comm. ACM, 26, 401-408 (1983).
[33] S. Fortune and J. Wyllie; Parallelism in random access machines, Proc. 10th Annual ACM Symposium on Theory of Computing (San Diego, May 1978), ACM, New York, 114-118 (1978).
[34] J. Gill; Computational complexity of probabilistic Turing machines, SIAM J. Comput., 6, 675-695 (1977).
[35] R.M. Karp; The probabilistic analysis of some combinatorial search problems, in Algorithms and Complexity, J.F. Traub (editor), Academic Press, New York, 1-19 (1976).
[36] R.M. Karp; An introduction to randomized algorithms, Internat. Comput. Sci. Inst. Tech. Report TR-90-024 (1990), Discrete Appl. Math. (to appear).
[37] E.M. Eschen; Synchronous parallel computation complexity: an overview, preprint (1985).
[38] R.M. Karp and M. Luby; Monte-Carlo algorithms for enumeration and reliability problems, Proc. 24th Annual IEEE Symposium on Foundations of Computer Science, IEEE Computer Society Press, 56-64 (1983).
[39] C. Lautemann; BPP and the polynomial hierarchy, Inform. Process. Letters, 17, 215-217 (1983).
[40] C. Lund, L. Fortnow, H. Karloff and N. Nisan; Algebraic methods for interactive proof systems, Proc. 31st Annual IEEE Symposium on Foundations of Computer Science, IEEE Computer Society Press, 2-10 (1990).
[41] N. Nisan and A. Wigderson; Hardness vs. randomness (Extended Abstract), Proc. 29th Annual IEEE Symposium on Foundations of Computer Science, IEEE Computer Society Press, 2-11 (1988).
[42] A. Panconesi and D. Ranjan; Quantifiers and approximation (extended abstract), Proc. 22nd Annual ACM Symposium on Theory of Computing, ACM, New York, 446-456 (1990).
[43] W.J. Savitch; Relationships between non-deterministic and deterministic tape complexities, J. Comput. and System Sci., 4, 177-192 (1970).
[44] M. Sipser; A complexity theoretic approach to randomness, Proc. 15th Annual ACM Symposium on Theory of Computing (Boston, 1983), ACM, New York, 330-335 (1983).
[45] M. Sipser; Alexander Razborov, Notices of the Amer. Math. Soc., 37, 1215-1216 (1990).
[46] D.B. Shmoys and É. Tardos; Computational complexity, The Handbook of Combinatorics, R.L. Graham, M. Grötschel and L. Lovász (editors), North-Holland, Amsterdam, to appear.
[47] L.J. Stockmeyer; The polynomial-time hierarchy, Theoret. Comput. Sci., 3, 1-22 (1977).
[48] L.J. Stockmeyer; The complexity of approximate counting, Proc. 15th Annual ACM Symposium on Theory of Computing (Boston, 1983), ACM, New York, 118-126 (1983).
[49] L.J. Stockmeyer; On approximation algorithms for #P, SIAM J. Comput., 14, 849-861 (1985).
[50] L.J. Stockmeyer; Classifying the computational complexity of problems, J. Symbolic Logic, 52, 1-43 (1987).
[51] U.V. Vazirani and V.V. Vazirani; Random polynomial time is equal to slightly-random polynomial time, 26th Annual IEEE Symposium on Foundations of Computer Science, IEEE Computer Society Press, 417-428 (1985).
[52] D.J.A. Welsh; Problems in computational complexity, in Applications of Combinatorics, R.J. Wilson (editor), Shiva Mathematics Series, Shiva Publ. Ltd., Nantwich, 75-85 (1982).
[53] D.J.A. Welsh; Randomised algorithms, Discrete Appl. Math., 5, 133-145 (1983).
[54] C. Wrathall; Complete sets and the polynomial-time hierarchy, Theoret. Comput. Sci., 3, 23-33 (1977).
[55] S. Zachos; Robustness of probabilistic computational complexity classes under definitional perturbations, Inform. and Control, 54, 143-152 (1982).
[56] S. Zachos; Probabilistic quantifiers, adversaries and complexity classes: an overview, Structure in Complexity Theory (Berkeley, 1986), A.L. Selman (editor), Lecture Notes in Computer Science, 223, Springer-Verlag, Berlin, 383-400 (1986).
[57] M.D. Plummer; Some covering concepts in graphs, J. Combin. Theory, 8, 91-98 (1970).
[58] S.A. Cook; The complexity of theorem-proving procedures, Proc. 3rd Annual ACM Symposium on Theory of Computing (Shaker Heights), 151-158 (1971).
[59] L.A. Levin; Universal search problems, Problemy Peredaci Informacii, 9, 115-116 (1973) (Russian). (Translation: Problems of Information Transmission, 9, 265-266 (1973).)
[60] B.A. Trakhtenbrot; A survey of the Russian approach to perebor (brute-force search) algorithms, Ann. Hist. Comput., 6, 384-400 (1984).
[61] R.M. Karp; Reducibility among combinatorial problems, in Complexity of Computer Computations, R.E. Miller and J.W. Thatcher (editors), Plenum Press, New York, 85-103 (1972).
[62] D. Dobkin, R.J. Lipton and S. Reiss; Linear programming is log-space hard for P, Inform. Process. Letters, 8, 96-97 (1979).
[63] S.A. Cook; The classification of problems which have fast parallel algorithms, Foundations of Computation Theory (Proc. 1983 Internat. FCT Conference, Borgholm, Sweden, August 1983), M. Karpinski (editor), Lecture Notes in Computer Science, 158, Springer-Verlag, Berlin, 78-93 (1983).
[64] S.A. Cook; A taxonomy of problems with fast parallel algorithms, Inform. and Control, 64, 2-22 (1985).
[65] N. Pippenger; On simultaneous resource bounds, Proc. 20th Annual Symposium on Foundations of Computer Science, IEEE, New York, 307-311 (1979).
[66] L.G. Valiant; The complexity of computing the permanent, Theoret. Comput. Sci., 8, 189-201 (1979).
[67] L.G. Valiant; The complexity of enumeration and reliability problems, SIAM J. Comput., 8, 410-421 (1979).
[68] M. Yannakakis and F. Gavril; Edge dominating sets in graphs, SIAM J. Appl. Math., 38, 364-372 (1980).
[69] W.R. Pulleyblank; Matchings and extensions, Handbook of Combinatorics, R.L. Graham, M. Grötschel and L. Lovász (editors), North-Holland, Amsterdam, to appear.
[70] A. Itai, M. Rodeh and S.L. Tanimoto; Some matching problems for bipartite graphs, J. Assoc. Comput. Mach., 25, 517-525 (1978).
[71] J. Edmonds; Maximum matching and a polyhedron with (0,1)-vertices, J. Res. Nat. Bur. Standards Sect. B, 69, 125-130 (1965).
[72] C. Papadimitriou and M. Yannakakis; The complexity of restricted spanning tree problems, J. Assoc. Comput. Mach., 29, 285-309 (1982).
[73] C.H. Papadimitriou; Polytopes and complexity, in Progress in Combinatorial Optimization, W.R. Pulleyblank (editor), Academic Press, Toronto, 295-304 (1984).
[74] F. Barahona and W.R. Pulleyblank; Exact arborescences, matchings and cycles, Discrete Appl. Math., 16, 91-99 (1987).
[75] V.V. Vazirani; NC algorithms for computing the number of perfect matchings in $K_{3,3}$-free graphs and related problems, SWAT 88: Proc. First Scandinavian Workshop on Algorithm Theory (Halmstad, July 1988), R. Karlsson and A. Lingas (editors), Lecture Notes in Computer Science, 318, Springer-Verlag, Berlin, 233-242 (1988).
[76] V.V. Vazirani; NC algorithms for computing the number of perfect matchings in $K_{3,3}$-free graphs and related problems, Inform. and Comput., 80, 152-164 (1989).
[77] W.R. Pulleyblank; Alternating cycle free matchings, preprint (1982).
[78] L.J. Stockmeyer and V.V. Vazirani; NP-completeness of some generalizations of the maximum matching problem, Inform. Process. Letters, 15, 14-19 (1982).
[79] K. Cameron; Induced matchings, Discrete Appl. Math., 24, 97-102 (1989).
[80] S. Even, O. Goldreich and P. Tong; On the NP-completeness of certain network-testing problems, Technion, Haifa, Dept. Comput. Sci. Tech. Rpt. #230 (1981).
[81] D. Gusfield and R.W. Irving; The Stable Marriage Problem: Structure and Algorithms, MIT Press, Cambridge (1989).
[82] D. Gale and L.S. Shapley; College admissions and the stability of marriage, Amer. Math. Monthly, 69, 9-15 (1962).
[83] R.W. Irving; An efficient algorithm for the stable roommates problem, J. Algorithms, 6, 577-595 (1985).
[84] C. Ng and D.S. Hirschberg; Lower bounds for the stable marriage problem and its variants, SIAM J. Comput., 19, 71-77 (1990).
[85] E. Ronn; On the Complexity of Stable Matchings With and Without Ties, Yale University, Dept. of Computer Sci., Ph.D. Thesis (1986).
[86] E. Ronn; NP-complete stable matching problems, J. Algorithms, 11, 285-304 (1990).
[87] T. Feder; A new fixed point approach for stable networks and stable marriages, Proc. 21st Annual ACM Symposium on Theory of Computing, ACM, New York, 513-522 (1989).
[88] D.G. Kirkpatrick and P. Hell; On the completeness of a generalized matching problem, Proc. 10th Annual ACM Symposium on Theory of Computing (San Diego, May 1978), ACM, New York, 240-245 (1978).
[89] D.G. Kirkpatrick and P. Hell; On the complexity of general graph factor problems, SIAM J. Comput., 12, 601-609 (1983).
[90] P. Hell and D.G. Kirkpatrick; Scheduling, matching, and coloring, Algebraic Methods in Graph Theory, Szeged (Hungary), 1978, Colloq. Math. Soc. János Bolyai, 25, 273-279 (1978).
[91] P. Hell and D.G. Kirkpatrick; Packing by cliques and by finite families of graphs, Discrete Math., 49, 45-59 (1984).
[92] G. Cornuéjols, D. Hartvigsen and W.R. Pulleyblank; Packing subgraphs in a graph, O.R. Letters, 1, 139-143 (1982).
[93] G. Cornuéjols and W.R. Pulleyblank; Perfect triangle-free 2-matchings, Combinatorial Optimization II (Proc. Conf. Univ. East Anglia, Norwich, 1979), Math. Programming Stud. No. 13, North-Holland, Amsterdam, 1-7 (1980).
[94] G. Cornuéjols and W.R. Pulleyblank; A matching problem with side conditions, Discrete Math., 29, 135-159 (1980).
[95] P. Hell and D.G. Kirkpatrick; On generalized matching problems, Inform. Process. Letters, 12, 33-35 (1981).
[96] P. Hell and D.G. Kirkpatrick; Packings by complete bipartite graphs, SIAM J. Alg. Disc. Meth., 7, 199-209 (1986).
[97] P. Hell, D. Kirkpatrick, J. Kratochvíl and I. Kříž; On restricted two-factors, SIAM J. Disc. Math., 1, 472-484 (1988).
[98] F. Berman, D. Johnson, T. Leighton, P.W. Shor and L. Snyder; Generalized planar matching, J. Algorithms, 11, 153-184 (1990).
[99] M.E. Dyer and A.M. Frieze; Planar 3DM is NP-complete, J. Algorithms, 7, 174-184 (1986).
[100] B.S. Baker; Approximation algorithms for NP-complete problems on planar graphs, Proc. 24th Annual IEEE Symposium on Foundations of Computer Science, IEEE Computer Society Press, 265-273 (1983).
[101] M.W. Padberg and M.R. Rao; Odd minimum cut-sets and b-matchings, Math. Oper. Res., 7, 67-80 (1982).
[102] F. Barahona; Reducing matching to polynomial size linear programming, University of Waterloo, Dept. of Combinatorics and Optimization Res. Report CORR 88-51 (1988).
[103] F. Barahona; On cuts and matchings in planar graphs, Univ. Bonn Inst. für Ökonometrie und Oper. Res. Report 88503-OR (1988).
[104] A.B. Gamble; Polyhedral Extensions of Matching Theory, University of Waterloo, Dept. of Combinatorics and Optimization, Ph.D. Thesis (1989).
[105] M. Yannakakis; Expressing combinatorial optimization problems by linear programs, Proc. 20th Annual ACM Symposium on Theory of Computing, ACM, New York, 223-228 (1988).
[106] E.L. Lawler; Combinatorial Optimization: Networks and Matroids, Holt, Rinehart and Winston, New York (1976).
[107] L. Lovász; The matroid matching problem, Algebraic Methods in Graph Theory II, L. Lovász and V.T. Sós (editors), Colloq. Math. Soc. János Bolyai, 25, 495-517 (1981).
[108] H.N. Gabow and M. Stallmann; An augmenting path algorithm for the parity problem on linear matroids, Proc. 25th Annual IEEE Symposium on Foundations of Computer Science, IEEE Computer Society Press, 217-228 (1984).
[109] S. Poljak; A note on stable sets and coloring of graphs, Comment. Math. Univ. Carolin., 15, 307-309 (1974).
[110] M.R. Garey, D.S. Johnson and L. Stockmeyer; Some simplified NP-complete graph problems, Theoret. Comput. Sci., 1, 237-267 (1976).
[111] F.B. Shepherd; Near-Perfection and Stable Set Polyhedra, University of Waterloo, Dept. of Combinatorics and Optimization, Ph.D. Thesis (1990).
[112] N.V.R. Mahadev; Stability Numbers in Structured Graphs, University of Waterloo, Dept. of Combinatorics and Optimization, Ph.D. Thesis (1984).
[113] C. Berge; Färbung von Graphen, deren sämtliche bzw. deren ungerade Kreise starr sind, Wiss. Zeitung, Martin Luther Univ. Halle-Wittenberg, 114 (1961).
[114] L. Lovász; Normal hypergraphs and the weak perfect graph conjecture, Discrete Math., 2, 253-267 (1972).
[115] M. Golumbic; Algorithmic Graph Theory and Perfect Graphs, Academic Press, New York (1980).
[116] C. Berge and V. Chvátal (editors); Topics on Perfect Graphs, Ann. Discrete Math., 21, North-Holland, Amsterdam (1984).
[117] M. Grötschel, L. Lovász and A. Schrijver; The ellipsoid method and its consequences in combinatorial optimization, Combinatorica, 1, 169-197 (1981).
[118] M. Grötschel, L. Lovász and A. Schrijver; Polynomial algorithms for perfect graphs, Topics on Perfect Graphs, C. Berge and V. Chvátal (editors), Ann. Discrete Math., 21, 325-356 (1984).
[119] M. Grötschel, L. Lovász and A. Schrijver; Geometric Algorithms and Combinatorial Optimization, Springer-Verlag, Berlin (1988).
[120] M. Grötschel, L. Lovász and A. Schrijver; Relaxations of vertex packing, J. Combin. Theory Ser. B, 40, 330-343 (1986).
Matching and vertex packing: How “hard” are they?
307
[121] L. Lovhz; Some finite basis theorems in graph theory, Combinatorics, 11, A. Hajnal and V.T. S6s (editors), Colloq. Math. SOC.Jdnos 3olyai. 18, North-Holland.Amsterdam,717-729 (1978). [122] L. Lovhz; Vertex packing algorithms, Automata, Languages and Programming (Naflion. Greece, 1985), Ed. W. Brauer, Lecture Notes in Computer Science, 194, Springer-Verlag.Berlin, 1-14 (1985). (1231 L. Lovisz, Stable sets and polynomials, Princeton University, Dept. of Comput. Sci., preprint (1990). [124] L. Lovhz and A.Schrijver; Matrix cones, projection representations, and stable set polyhedra, DIUACS Series in Discrete Mathematics and Theoretical Computer Science, 1, Amer. Math. Soc.. Providence, 117 (1990). [123 F. Sterboul; A characterization of the graphs in which the transversal number equals the matching number,J . Combinatorial Theory Ser. B, 27.22S229 (1979). [126] R.W. Deming; Independence numbers of graphs - an extension of the Konig-Egerv6ry theorem, Discrete Math., 27.23-33 (1979). [127] L. Lovhz; Ear-decompositionsof matching-coveredgraphs, Combinaforica.3,228.229 (1983). [128] E. Korach; On Dual Integrality, Min-Mar Equalities and Algorithms in Combinatorial Programming, University of Waterloo, Dept. of Combinatonesand Optimization, Ph.D. Thesis (1982). [I291 G.L. Nemhauser and L.E. Trotter; Vertex packings: structural properties and algorithms, Math. Programming, 8,232-248 (1975). [130] W.R. Pulleyblank; Mnimum node covers and 2-bicritical graphs, Math Programming, 17, 91-103 (1979). [131] J.-M. Boujolly and W.R. Pulleyblank; Konig-Egerv’ky graphs, 2-bicriticalgraphs and fractional matchings, Discrete Appl. Math.. 24,63432 (1989). [132] V. Chvhtal and P.J. Slater; A note on well-covered graphs, preprint, January 1991. Quo Vadis, Graph Theory?. J. Gimbel, J.W. Kennedy & L.V. Quintas (editors), Ann. Discrete Math..,55, (1992). [133] R.S. Sankaranarayana and L.K. Stewart; Complexity results for well covered graphs, University of Alberta, Dept. 
of Computing Science Tech. Report TR 90-21 (1990). [134] M.D. Plummer; On well-covered graphs - a survey, preprint (1991). [135] N. Dean and J. Zito; Well-covered graphs and extendability,preprint ( 1990). [136] A. Hajnal; A theorem on k-saturated graphs, Canad. J . Math., 17,720-724 (1965). [137] B: Andrhfai; On critical graphs, Theory of Graphs (International Symposium, Rome. 1966), P. Rosensaehl (&tor), Gordon and Breach, New York, 9-19 (1%7). [I381 V. ChvAtal; On certain polytopes associated with graphs, J . Combin. Theory Ser. 3,18,138-154 (19775). [139] E.C. Sewell; Stability critical graphs and the stable set polytope, Cornell Computational Optimization Project, Comell University, Tech. Report 90-1 1 (1990). [140] E.C. Sewell; Stability critical graphs and the stable set polytope, Cornell University, School of Oper. Res. and Indust. Eng., Tech. Report 905 (1990). [141] E.C. Sewell and L.E. Trotter, Jr.; Stability critical graphs and even subdivisionsof &.Comell University, School of ORIE, preprint (1990). [142] C.H. Papadimitriou and M. Yannakakis; The complexityof facets (and some facets of complexity), Proc. 14th Annual ACM Symposium on Theory of Computing, ACM, New York. 255-260 (1982). [143] C.H. Papadimitriou and D. Wolfe; The complexity of facets resolved, Proc. 26th Annual IEEE Symposium on Foundations of Computer Science, IEEE Computer Society Press, 74-78 (1985). [ l a ] U. Feige, S. Goldwasser, L. Lovlsz and S. Safra;On the complexity of approximating the maximum size of a clique, preprint (1990). [143 U. Feige, S. Goldwasser, L. Lovhz, S. Safra and M. Szegedy; Approximating clique is almost NP-complete (extended abstract), preprint (1991). [146] U. Feige, S. Goldwasser,L. Lovhz, S. Safra and M. Szegedy; Approximating clique is almost N P c o m plete. Proc. 21st Annual IEEE Symposium on Foundations of Computer Science, IEEE Computer Society Press (1991) (to appear). [147] R. Boppana and M.M. 
Halld6rsson; Approximatingmaximum independent sets by excluding subgraphs, SWAT 90 (Bergen. Sweden), J .R. Gilbert and R. Karlsson (editors), Lecture Notes in Computer Science, 447,13-25 (1990). [148] R. Bar-Yehuda and S. Even; A 2-(loglog nilog n) performanceratio for the weighted vertex cover problem, Technion, %fa, Tech. Report 260 (1983). [149] B. Monien and E. Speckemeyer; Ramsey numbers and an approximation algorithm for the vertex cover problem, Acta Infbrm., 22. 115-123 (1985).
308
M.D. Plummer
[lSO] H.N. Gabow and R.E. Tarjan; Almost-optimum speed-ups of algorithms for bipartite matching and related problems, Proc. 20th Annual ACM Symposium on Theory of Computing, ACM, New York, 514527 (1988). [I511 A.V. Goldberg, S.A. Plotkin and P.M. Vaidya; Sublinear-time parallel algorithms for matching and related problems, Proc. 29th Annual IEEE Symposium on Foundations of Computer Science, IEEE Computer Society Press, 176185 (1988). [152] H.N. Gabow and R.E. Tarjan; Almost-optimum parallel speed-ups of algorithms for bipartite matching and related problems, Princeton University, Dept. of Computer Sci. Report CS-TR-223-89 (1989). [I531 Y. Shiloach and U. Vishkin; An O(n210g n) parallel MAX-FLOW algorithm, J . Algorithm, 3, 12LL146 (1982). [I541 T. Kim and K.-Y. Chwa; An O ( n log n loglog n) parallel maximum matching algorithm for bipartite graphs, Inform. Process. Letters, 24,1517 (1987). [155] G. Lev; Size Bounds and Parallel Algori6hmsfor Neiworks, University of Edinburgh, Dept. of Computer Sci. Report CST-8-80, Ph.D. Thesis (1980). [156] G. Lev, N. Pippenger and L.G. Valiant; A fast parallel algorithm for routing in permutation networks, IEEE Trans. on Computers, C-30,93-100 (1981). [157] E. Dahlhaus and M. Karpinski;Perfect matching for regular graphs is ACO-hard for the general matching problem, preprint (1990). [158] A. Israeli and Y. Shiloach An improved parallel algorithm for maximal matching, Inform. Process. Letters ,22,57-60 (1986). [159] R.M. Karp and A. Wigderson; A fast parallel algorithm for the maximal independent set problem, Proc. 16th Annual ACMSymposium on Theory of Computing, ACM, New York, 2 6 2 7 2 (1984). [I601 R.M. Karp and A. Wigderson; A fast parallel algorithm for the maximal independent set problem. J . Assoc. Comput. Mach., 32,762-773 (1985). [I611 A. Israeli and A. Itai;A fast and simple randomized parallel algorithm for maximal matching, Inform. Process. Letters, 22.77-80 (1986). [162] M. 
Luby; Removing randomness in parallel computation without a processor penalty (preliminary version), Proc. 29th Annual IEEE Symposium on Foundations of Computer Science, IEEE Computer Soc. Press, 162-173 (1988). [ l a ] M. Luby; Removing randomness in parallel computation without a processor penalty, Internat. Computer Sci. Institute Tech. Report TR-89-044 (1989). [I641 M. Luby; A simple parallel algorithm for the maximal independent set problem, Proc. 17th Annual ACM Symposium on Theory of Computing, ACM, New York. 1-10 (1985). [I63 M. Luby; A simple parallel algorithm for the maximal independent set problem, SlAM J. Comput., 15, 1036-1053 (1986). [166] R.M. Karp, E. Upfal and A. Wigderson; Are search and decision problems computationallyequivalent?, Proc. 17th Annual Symposium on Theory of Computing, ACM, New York, 4.64-475 (1985). [167] L. LovBsz; On determinants,matchings and random algorithms, Fundamentals of Computation Theory, FCT '79 (Proc. Conf. Algebraic, Arithmetic and Categorical Methods in Computation Theory, Berlin/ Wendisch-Rietz 1979).L. Budach (editor), Math. Research 2, Akademie-Verlag.Berlin, 565-574 (1979). [I681 A. Borodin, J. von zur Gathen and J. Hopcroft; Fast parallel matrix and GCD computations, Proc. 23rd Symposium on Theory of Computing, ACM, New York, 65-71 (1982). 11691 A. Borodin, J. von zur Gathen and J. Hopcroft; Fast parallel matrix and GCD computations, Inform. and Control, 52,241-256 (1982). [170] L. Cs5nky; Fast parallel matrix inversion algorithms, SZAMJ. Comput.,5,618-623 (1976). (1711 R.M. Karp.E. Upfal and A. Wigderson; Constructing a perfect matching is in random NC, Proc. 17th Annual Symposium on Theory of Computing (Providence, Rhode Island), ACM, New York, 22-32 (1985). [172] R.M. Karp,E. Upfal and A. Wigderson; Constructing a perfect matching is in random NC, Combinaoricu, 6 , 3 5 4(1%). [173] 2. Galil and V. Pan; Improved processor bounds for combinatorial problems in RNC, Proc. 
26th Annual IEEE Symposium on Foundations of Computer Science, IEEE Computer Society Press, 490-495 (1985). [174] 2. Galil and V. Pan; Improved processor bounds for combinatorial problems in RNC, Combinutorica, 8, 189-200 (1988). [17q K. Mulmuley, U.V. Vazirani and V.V. VaZirani; Matching is as easy as matrix inversion, Proc. 19th Annual ACM Symposium on Theory of Computing, ACM, New York, 345354 (1987). [176] K. Mulmnley, U.V. Vazirani and V.V. Vazirani; Matching is as easy as matrix inversion, Combinatorica, 7,105-113 (1987).
Matching and vertex packing: How “hard” are they?
309
[177] H.J. Karioff; A Las Vegas RiVC algorithm for maximum matching, Combinatorica, 6,387-391 (1986). [178] 2. Galil; Sequential and parallel algorithms for finding maximum matchings in graphs. Ann. Rev. Comput. Sci., 1, 197-224 (1986). [179] 2. Galil; Efficient algorithms for finding maximum matching in graphs, Computing Surveys, 18.23-38 (1986). M. Chrobak and J. Naor; Computing a perfect matching in claw-free graphs, prqrint (1990). J. Naor; Computing a perfect matching in a line graph, VLSI Algorithms and Architectures, 3rd Aegean Workshop on Computing, AWOC 88, J .H. Reif (editor), Lecture Notes in Computer Sci., 319,SpringerVerlag, Berlin, 139-148 (1989). G.L. Miller and J. Naor; Flow in planar graphs with multiple sources and sinks, Proc. 30th Annual IEEE Symposium on Foundations of Computer Science, IEEE Computer Society Press, 112-1 17 (1989). E. Dahlhaus and M. Karpinski; Parallel construction of perfect matchings and Hamiltonian cycles on dense graphs, Theoret. Comput. Sci., 61,121-136 (1988). E. Dahlhaus, P. Hajnal and M. Karpinski; Optimal parallel algorithm for the Hamiltonian cycle problem on dense graphs, Proc. 29th Annual IEEE Symposium on Foundations of Computer Science (White Plains), 18C193 (1988). E. Dahlhaus, P. Hajnal and M. Karpinski; On the parallel complexity of Hamiltonian cycle and matching problems on dense graphs, preprint (1990). E. Dahlhaus and M. Kaqinski; The matching problem for strongly chordal graphs is in NC, Univ. Bonn Inst. fiir Informatik Research Report No. 8 5 5 4 3 (1986). D. Helmbold and E. Mayr; Applications of parallel scheduling to perfect graphs, Proc. Internat. Workshop WG ’86,Lecture Notes in Computer Science, 246,188-203 (1987). D. Helmbold and W. Mayr; Applications of parallel scheduling algorithms to families of perfect graphs, Computing Suppl.. 7.93-107 (1990). D. Kozen, U.V. Vazirani and V.V. 
Vazirani; NC Algorithms for comparability gra hs interval graphs, and testing for Unique perfect matching, Foundations of S o f i a r e Technology and TLo;etical Computer Science (New Delhi, 1985). S.N. Maheshwati (editor), Lecture Notes in Computer Science, 206, SpringerVerlag, Berlin, 4 5 5 0 3 (1985). [190] M.O. Rabin and V.V. Vazirani; Maximum matchings in general graphs through randomization,J. Algorithms, 10,557-567 (1989). [191] D. Yu. Grigoriev and M. Karpinski; The matching problem for bipartite graphs with polynomially bounded permanents is in NC (extended abstract), Proc. 28th Annual IEEE Symposium on Foundations of Computer Science, IEEE Computer Society Press, 1 6 J 7 2 (1987). I1921 A. Subramanian; The Computational Complexity of the Circuit Value and Network Stability Problems, Stanford University, Dept. of Computer Sci. Report No. STAN-CS-90-1311,Ph.D. Thesis (1990). [193] E.W. Mayr and A. Subramanian; The complexity of circuit value and network stability, Stanford University, Dept. of Computer Sci. Report No. STAN-CS-89-1278(1989). [194] E.W. Mayr and A. Subramanian;The complexity of circuit value and network stability,Proc. Structure in Complexity Theory (4th Ann. IEEE Conf.), 114-123 (1989). [ I 9 3 R E. Ladner; The circuit value problem is log space complete for P, SIGACT News, 7, 18-20 (1975). [1%] A. Subramanian; A new approach to stable matching problems, Stanford University, Dept. of Computer Science Tech. Report STAN-CS-89-1275(1989). [19fl H. Narayanan, H. Saran and V.V. Vazirani; Fast parallel algorithmsfor matroid union, arborescences, and edge-disjointspanning trees, preprint (1991). [198] T. Fischer, A.V. Goldberg and S. Plotkin; Approximating matchings in parallel, Stanford University, Dept. of Computer Sci. Report No. STAN-CS-91-1369(1991). [199] N. Alon, L. Babai and A. 1 6 ; A fast and simple randomized parallel algorithm for the maximal independent set problem, J. Algorithms, 7.567-583 (1986). [200] M. Goldberg and T. 
Spencer; A new parallel algorithm for the maximal independent set problem, Proc. 28th Annual IEEE Symposium on Foundations of Computer Science, IEEE Computer Society Press, 161165 (1987). [201] M. Goldberg and T. Spencer; A new parallel algorithmfor the maximal independent set problem, SIAM J. Comput., 18.419427 (1989). [202] M. Goldberg and T. Spencer; Constructing a maximal independent set in parallel, SIAM J. Discrefe Math., 322-328 (1989). [203] M.K. Goldberg; Parallel algorithms for three graph problems, Proc. Seventeenth Southeustern Conf. on Combinatorics. Graph Theory and Computing, F. Hoffman et al. (editors), Congress. Numer., 54, 111121 (1986).
3 10
M.D. Plummer
12041 P. Turh; On the theory of graphs, Colloq. Math., 3, 19-30 (1954). [205l M. Goldberg and T. Spencer; An efficient algorithm that finds independent sets of guaranteed size, Proc. First Annual ACM-SIAM Symposium on Discrete Algorithms, SIAM,Philadelphia,219-225 (1990). [206] X. He; A nearly optimal parallel algorithm for constructing maximal independent sets in planar graphs, preprint (1987). [207 S. Khdler; Extending planar graph algorithms to K3,yfree graphs, Foundations of Sofware Technology and Theoretical Computer Science (Pune. India, 1988).K.V. Nori and S. Kumar (editors), Lecture Noies in Computer Science, 338, Springer-Verlag,Berlin, 67-79 (1988). [208] N. Dadoun and D.G. Kirkpatrick;Parallel algorithms for fractional and maximal independent sets in planar graphs, Discrete Appl. Math., 27.69-83 (1990). [209] D. Helmbold and E. Mayr; Perfect graphs and parallel algorithms, Proc. IEEE 1986 lnternational Conference on Parallel Processing, IEEE, New York, 853-860 (1986). [210] A.A. Bertossi and M.A. Bonuccelli; Some parallel algorithms on interval graphs, Disc. Appl. Math., 16, 101-111 (1987). [211] J. Naor, M. Naor and A.A. Sch5ffer;Fast parallel algorithms for chordal graphs, Proc. 19th Annual ACM Symposium on Theory of Computing, ACM, New York, 355-364 (1987). [212] J. Naor, M. Naor and A.A. ScWfer; Fast parallel algorithms for chordal graphs, SIAM J . Comput., 18, 327-349 (1989). [213] E. Dahlhaus and M. Karpinski; A fast parallel algorithm for computing all maximal cliques in a graph and the related problems (extended abstract), SWAT 88 (Halmstad, Sweden, 1988), Lecture Notes in Computer Science, 318, 139-144(1988). [214] P. Kasteleyn; h e r statistics and phase transitions, J . Maih. Phys., 4,287-293 (1963). [2151 P. Kasteleyn; Graph theory and crystal physics. Graph Theory and Theoretical Physics, F. Harary (editor), Academic Press, New Yak, 43-1 10 (1%7). [216] C.H.C. 
Little; An extension of Kasteleyn’s method of enumerating the 1-factors of planar graphs, Combinatorial Mathematics, Proc. Second Australian Conference, D. Holton (editor), Lecture Notes in Mafh., 403,Springer-Verlag, Bergn. 63-72 (1974). [217 L. LovLz; Matching structure and the matching lattice, J. Combin. Theory Ser. B , 43, 187-222 (1987). [218] V.V. Vazirani and M. Yannakakis; Pfaffian orientations, 011 permanents and even cycles in directed graphs, Automata, Languages and Programming (Tampere,Finland, 1988), T. Lepisto and A. Salomaa (editors), Lecture Notes in Computer Science, 317, Springer-Verlag,Berlin, 667481 (1988). [219] V.V. Vazirani and M. Y d a k i s ; Pfaffian orientations, 0-1 permanents, and even cycles in directed graphs, Discrete Appl. Math., 25, 179-190 (1989). [220] W.R. Pulleyblank;personal communication (December 1991). [221] P. Seymour and C. Thomassen; Characterization of even directed graphs, J. Combin. Theory Ser. B , 42, 36-45 (1987) [222] M. Jemun; Two-dimensional monomer-&mer systems are computationally intractable, J. Stat. Phys., 48, 121-134 (1987). (Erratum: J. Stat. Phys, 59, 1087-1088 (1990). [223] A. SincIair; Randomised Algorithms for Counting and Generating Combinatoriai Siruciures, University of Edinburgh, Dept. of Computer Science Report CST-58-88, Ph.D. Thesis (1988). [224] R.W. Irving and P. Leather; The complexity of counting stable marriages, SIAM J . Comput., 15,655667 (1986). [225l H. Minc; “Permanents,” Encyclopedia of Mothemalics and its Applicaiions ,6,Addison-Wesley,Reading, Massachusetts (1978). [226] H. Ryser; Combinatorial Mathematics, Carus Mathematical Monograph No. 14, Math. Assoc. Amer., Washington (1963). [227 M. Luby; A survey of approximation algorithms for the permanent, Sequences (Naples/Positano. 1988). R.M. CapoceUi (editor), Springer-Verlag.New Yo&, 7.5-91 (1990). [228] N. Karmarkar, R. Karp, R. Lipton, L. LovAsz and M. Luby; A Monte-Carlo algorithm for estimating the permanent, preprint (1988). 
[229] M. Jermm and U.V. Vazirani; A mildly exponential approximation algorithm for the permanent, preprint (1991). [230] A.Z. Broder; How hard is it to marry at random? (On the approximation of the permanent) (extended abstract), Proc. 18th Annual ACM Symposium on Theory of Computing (Berkeley, CaliJ). ACM, New York, 50-58 (1986). Also see errata: Proc. 20th Annual ACMSympsium on Theory of Computing (Chicago, 1IZ.J.ACM, New York, 551 (1988).
Matching and vertex packing: How “hard” are they?
311
[231] M. Jemun and A. Sinclair; Conductance and the rapid mixing property for Markov chains: the approximation of the permanent resolved (preliminary version), Proc. 20th Annual ACM Symposium on Theory of Computing, ACM, New York, 235-244 (1988). [232] M. Jemun and A. Sinclair; Approximating the permanent, SIAMJ. Comput., 18, 1149-1 178 (1989). [233] P. Dagum, M. Luby, M. Mihail and U.V. Vazirani; Polytopes, permanents and graphs with large factors, Proc. 29th IEEE Symposium on Foundations of Computer Science (White Plains), IEEE, Computer Society Press, 412421 (1988). [234] P. Dagum, M. Luby, M. Mihail and U.V. Vazirani; Polytopes, permanents and graphs with large factors, Theoret. Comput. Sci. (to appear). [235] P. Dagum and M. Luby; Approximating the permanent of graphs with large factors, preprint (1991). [236] K. Fukuda and T. Matsui; Finding all the perfect matchings in bipartite graphs, Tokyo Inst. Tech. Dept. of Information Sciences,Res. Rep. Inform. Sci. Ser. B: Operations Research, Report B-225 (1989). [237] N. Chiba and T. Nishizeki; Arboricity and snbgraph listing algorithms, SIAM J. Comput., 14,210-223 (1985). [238] S. Tsnkiyama, M. Ide, H. Ariyoshi and I. Shirakawa; A new algorithm for generating all maximal independent sets, SIAMJ. Comput., 6,505-517 (1977). [239] L. Lovisz; The work of A.A. Razborov, InternationalCongress of Mathematicians,Kyoto (1990). [240] A.A. Razborov; Lower bounds on the monotone circuit complexity of some Boolean functions, Doklady Akad. Nauk SSSR, 281,798401 (1985) (Russian). English translation, Soviet Math. Dokl., 31,354-357 (1985). [241] A.A. Razborov; Lower bounds on monotone circuit complexity of the logical permanent, Mat. Zametki, 37,887-900 (1985) (Russian). English translation Math. Notes ofthe Acad. Sciences oflJSSR, 3 7 , 4 8 5 493 (1985). [242] N. Alon and R.B. Boppana; The monotone circuit complexityof Boolean functions, Combinatorica, 7.122 (1987). [243] H.N. 
Gabow; Scaling algorithms for network problems, Proc. 24th Annual IEEE Symposium on Foundations ofcomputer Science, IEEE Computer Society Press, 248-257 (1983). [244] H.N. Gabow; A scaling algorithm for weighted matching on general graphs, Proc. 26th Annual IEEE Symposium on Foundations of Computer Science, IEEE Computer Society Press, 90-100 (1985). [24q H.N. Gabow; Data structures for weighted matching and nearest common ancestors with linking, Proc. First Annual ACM-SIAM Symposium on Discrete Algorithms, SIAM, Philadelphia, 434-443 (1990). [MI H.N. Gabow, Z. Galil and T.H. Spencer;Efficient implementationof graph algorithms using contraction, Proc. 25th Annual Symposium on Foundations ofcomputer Science, IEEE Computer Society Press, 347357 (1984). [247] H.N. Gabow. Z. Galil and T.H. Spencer; Efficient implementationof graph algorithms using contraction, J. Assoc. Comput. Mach., 36,540-572 (1989). [ M I R. Anstee; A polynomial algorithm for b-matchings: an alternative approach, University of British Columbia, Dept. Math., preprint (1986). N. Calkin and A. Frieze; Probabilistic analysis of a parallel algorithm for finding maximal independent “91 sets, Camegie Mellon University, Dept. of Math. Research Report No. 8%38 (1990). [250] D. Coppersmith, P. Raghavan and M. Tompa; Parallel graph algorithms that are efficient on average, Proc. 28th Annual IEEE Symposium on Foundations of Computer Science, IEEE Computer Society Press, 2-269 (1987). [251] A.M. Frieze; Probabilisticanalysis of graph algorithms, Carnegie Mellon Univ. Research Report No. 8% 42 (1 989). [252] A.M. Frieze; Probabilistic analysis of graph algorithms, Computing. Supp., 7,209-233 (1990). [253] M. Jermm; An analysis of a Monte Car10 algorithm for estimating the permanent, preprint (June 1991) [2.54] A.M. Frieze; Maximum matchings in a class of random graphs, J. Combin. Theory Ser. B . 40,196-212 (1%) [255] A.M. Frieze; On matchings and Hamilton cycles in random graphs, Camegie Mellon University, Dept. 
of Math. Research Report No. 8%36 (1988). [Us] A.M. Frieze; On perfect matchings in random bipartite graphs with minimum degree at least two, Carnegie Mellon Univ. Dept. Math. Res. Rpt. No. M 6 (1990). [U7] A.M. Frieze and D. Tygar; Deterministic parallel algorithms for matchings in random graphs, preprint (1990) - in preparation. [258] 0. Goldschmidtand D.S. Hochbaum; A fast perfect matching algorithm in random graphs, SIAMJ. Discrete Math., 3.48-57 (19%).
312
M.D. Hummer
[259] G.R. Grimmett; Random near-regular graphs and the node packing problem, Oper. Res. Letters, 4, 169174(1985). 12601 G.R. Grimmett and W.R.Pulleyblank; An exact threshold theorem for random graphs and the nodepacking problem, J . Combin. Theory Ser. B , 40,187-195 (1986). [261] M. Jerrum; The elusiveness of large cliques in a random graph, University of Edinburgh, Dept. of Computer Sci. I n t d Report CSR-9-90(1990). [262] R. Motwani; Expanding graphs and the average-case analysis of algorithms for matchings and related problems, Proc. Zlst Annual ACM Symposium on Theory of Computing, ACM, New York, 550-561 (1989). [263] A.V. Goldberg, S.A. Plotkin, D.B. Shmoys and 8.Tardos, Using interior point methods for fast parallel algorithms for bipartite matching and related problems, preprint (1991). [264] D.S. Johnson,C. Papadimitriou and M. Yannakakis, On generating all maximal independent sets, Inform. Process. Letters. 27, 119-123 (1988).
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 313-326 (1993)
© 1993 Elsevier Science Publishers B.V. All rights reserved.
THE COMPETITION NUMBER AND ITS VARIANTS
Suh-Ryung KIM
Division of Mathematics & Science, St. John's University, Staten Island, New York, U.S.A.
Abstract
If D is an acyclic digraph, its competition graph has the same vertex set as D and an edge between vertices u and v if and only if for some vertex w of D, there are arcs (u, w) and (v, w) in D. The competition number of a graph G is the smallest number of isolated vertices whose addition makes G into a competition graph. Competition graphs were introduced by Cohen in 1968 as a means of determining the smallest dimension of ecological phase space. Competition graphs and various analogous notions have applications not only in ecology, but also in studying communication over a noisy channel, in assigning frequencies to radio transmitters, and in modeling complex economic and energy systems. In the study of the competition graph of an acyclic digraph, there are two fundamental questions which were proposed by Roberts in 1978: one is to characterize the acyclic digraphs which have interval competition graphs and the other is to characterize the graphs which are competition graphs of acyclic digraphs. In this paper, we focus on the second question as we survey the results about the competition number and some of its variants, namely, the p-competition number, the double competition number, and the niche number. Open questions related to the topic are discussed as well.
1. Introduction
Suppose D is a digraph. Its competition graph G has vertex set V(D) and has an edge between u and v in V(D) if and only if for some w in V(D), there are arcs (u, w) and (v, w) in the arc set A(D). The notion of competition graph is due to Cohen [1] and arose from ecology. A food web of an ecosystem is a digraph whose vertices are the species of the system and which has an arc from a vertex u to a vertex v if and only if u preys on v. Given a food web F, species u and v are said to compete if and only if they have a common prey. An example of a food web and the corresponding competition graph is given in Figure 1. We say a graph G is an interval graph if it is the intersection graph of some family of intervals on the real line. Cohen [1]-[3] observed empirically that most competition graphs of acyclic digraphs representing food webs are interval graphs. Roberts [4] asked whether or not Cohen's observation was just an artifact of the construction and concluded that it was not by showing that if G is an arbitrary graph, then G together with as many isolated vertices as G has edges is the competition graph of an acyclic digraph D. (Add a vertex $i_\alpha$ corresponding to each edge $\alpha = \{a, b\}$ of G, and draw arcs from a and b to $i_\alpha$.) Based on this observation, Roberts [4] introduced the notion of the competition number k(G) as the smallest k so that G together with k isolated vertices is the competition graph of an acyclic digraph. In his paper Roberts [4] proposed two fundamental questions on the competition graph of an acyclic digraph:
1. Which acyclic digraphs have competition graphs that are interval graphs?
2. What graphs are the competition graphs of acyclic digraphs?
In [5] Scott defined the double competition number and Cable et al. [6] introduced the notion of niche number. Recently Kim et al. [7] defined the p-competition number, which generalizes the competition number (the special case p = 1). In this paper we mainly survey the results on the competition numbers of graphs and their analogues which have resulted from efforts to answer the second question, and present some open questions.
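The definition of the competition graph and the construction attributed to Roberts [4] above can be illustrated with a minimal Python sketch. The function names and the three-vertex example graph are hypothetical choices made only for this illustration; the sketch assumes vertices are hashable labels and arcs are (predator, prey) pairs.

```python
from itertools import combinations


def competition_graph(vertices, arcs):
    """Return the edge set {u, v} of pairs that share at least one prey w."""
    predators = {w: set() for w in vertices}
    for u, w in arcs:
        predators[w].add(u)
    edges = set()
    for preds in predators.values():
        # every two predators of the same prey compete
        for u, v in combinations(preds, 2):
            edges.add(frozenset((u, v)))
    return edges


def roberts_digraph(vertices, edges):
    """Add one new vertex per edge {a, b} and draw arcs from a and b to it.

    All arcs go from an original vertex to a new vertex, so the digraph
    is acyclic.
    """
    new_vertices = list(vertices)
    arcs = []
    for a, b in edges:
        i_ab = ("i", a, b)  # hypothetical label for the added vertex
        new_vertices.append(i_ab)
        arcs.append((a, i_ab))
        arcs.append((b, i_ab))
    return new_vertices, arcs


# Hypothetical example: the path a - b - c.
G_vertices = ["a", "b", "c"]
G_edges = [("a", "b"), ("b", "c")]
D_vertices, D_arcs = roberts_digraph(G_vertices, G_edges)
C = competition_graph(D_vertices, D_arcs)
# C has exactly the edges of G; the two added vertices are isolated in C.
print(C == {frozenset(e) for e in G_edges})
```

In this example G has two edges, so the construction adds two isolated vertices, which is only an upper bound on k(G).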
[Figure 1: two panels, "Food web D" and "The competition graph of D".]
Key:
1. Canopy: leaves, fruits, flowers
2. Canopy animals: birds, fruit-bats, and other mammals
3. Upper air animals: birds and bats, insectivorous
4. Insects
5. Large ground animals: large mammals and birds
6. Trunk, fruit, flowers
7. Middle-zone scansorial animals: mammals in both canopy and ground zones
8. Middle-zone flying animals: birds and insectivorous bats
9. Ground roots, fallen fruit, leaves and trunks
10. Small ground animals: birds and small mammals
11. Fungi
Figure 1: A food web and corresponding competition graph for the Malaysian Rain Forest, from data of Harrison [23], as adapted by Cohen [3] and Roberts [8].
Besides an application to ecology, the concept of competition graph can be applied to the study of communication over a noisy channel (see Roberts [8] or Shannon [9]) and to the problem of assigning channels to radio or television transmitters (see Cozzens and Roberts [10], Hale [11], or Opsut and Roberts [12]). In Section 2, we investigate various results on the competition number. Section 3 presents and discusses numerous results on the p-competition number. Section 4 and Section 5 study several main results on the double competition number and the niche number, respectively. Each section also discusses some important open questions. Table 1 compares the definitions of the competition graph of an acyclic digraph D and its variants; Figure 2 illustrates those definitions. Table 2 summarizes the competition numbers, the p-competition numbers, the double competition numbers, and the niche numbers for some classes of graphs whose competition numbers are known.
Table 1
Graph | Vertex Set | Edge Set
The Competition Graph C(D) | V(D) | {{u, v} : there exists a vertex w in V(D) such that arcs (u, w), (v, w) belong to A(D)}
The p-competition Graph $C_p(D)$ | V(D) | {{u, v} : there exist p distinct vertices $a_1, \ldots, a_p$ in V(D) such that arcs $(u, a_1), (v, a_1), \ldots, (u, a_p), (v, a_p)$ belong to A(D)}
The Competition-Common Enemy Graph CC(D) | V(D) | {{u, v} : there exist vertices w and x in V(D) such that arcs (u, w), (v, w) and arcs (x, u), (x, v) belong to A(D)}
The Niche Graph N(D) | V(D) | {{u, v} : there exists a vertex w in V(D) such that arcs (u, w), (v, w) or arcs (w, u), (w, v) belong to A(D)}
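The four edge rules of Table 1 differ only in how many common prey and common enemies a pair of vertices must have, which the following Python sketch tries to make concrete. The function names, the dictionary keys, and the small digraph are assumptions introduced for illustration and do not come from the source.

```python
from itertools import combinations


def out_in_sets(vertices, arcs):
    """Return (prey, enemies): out-neighbours and in-neighbours of each vertex."""
    prey = {v: set() for v in vertices}
    enemies = {v: set() for v in vertices}
    for u, w in arcs:
        prey[u].add(w)
        enemies[w].add(u)
    return prey, enemies


def variant_graphs(vertices, arcs, p=2):
    """Edge sets of C(D), C_p(D), CC(D) and N(D) under the rules of Table 1."""
    prey, enemies = out_in_sets(vertices, arcs)
    result = {"C": set(), "Cp": set(), "CC": set(), "N": set()}
    for u, v in combinations(vertices, 2):
        common_prey = len(prey[u] & prey[v])
        common_enemies = len(enemies[u] & enemies[v])
        e = frozenset((u, v))
        if common_prey >= 1:                          # competition graph
            result["C"].add(e)
        if common_prey >= p:                          # p-competition graph
            result["Cp"].add(e)
        if common_prey >= 1 and common_enemies >= 1:  # competition-common enemy graph
            result["CC"].add(e)
        if common_prey >= 1 or common_enemies >= 1:   # niche graph
            result["N"].add(e)
    return result


# Hypothetical acyclic digraph: x preys on u and v; u and v prey on w1 and w2;
# y preys on w1.
V = ["x", "u", "v", "y", "w1", "w2"]
A = [("x", "u"), ("x", "v"), ("u", "w1"), ("v", "w1"),
     ("u", "w2"), ("v", "w2"), ("y", "w1")]
graphs = variant_graphs(V, A, p=2)
for name, edges in graphs.items():
    print(name, sorted(tuple(sorted(e)) for e in edges))
```

In this example the pair {w1, w2} is adjacent only in the niche graph, because the two vertices share enemies but no prey.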
2. The Competition Number
An edge clique covering is a collection of cliques which cover all the edges of G and a vertex clique covering is a collection of cliques which cover all the vertices of G. We shall denote by $\theta_e(G)$ and $\theta_v(G)$ the size of the smallest edge clique covering and vertex clique covering, respectively. The following results indicate that edge clique coverings are closely related to the competition numbers.
Theorem 1: (Opsut [1982]) For any graph G, $\theta_e(G) - |V(G)| + 2 \le k(G) \le \theta_e(G)$.
Lundgren and Maybee [13] characterized graphs of a given competition number in terms of edge clique coverings and Kim [14] later found a slight error in the statement of their theorem and corrected it.
Theorem 2: (Lundgren and Maybee [1983], Kim [1988]) If G is a graph with n vertices, and $m \le n$, then $k(G) \le m$ if and only if G has an edge clique covering $\{C_1, \ldots, C_{n+m-2}\}$ and a vertex labeling $v_1, \ldots, v_n$ such that if $v_i \in C_j$, then $i > j - m + 1$.
As shown in the previous section, every graph has a finite competition number. This raises the question of finding the maximum competition number of a graph on n vertices. Harary, Kim, and Roberts [15] answered the question by giving the following theorem from which Turán's theorem follows as a special case.
Theorem 3: (Harary, Kim, and Roberts [1989]) For all graphs G on n vertices, $k(G) \le \lfloor n^2/4 \rfloor - n + 2$ and this bound is achieved only by the complete bipartite graph $K_{\lfloor n/2 \rfloor, \lceil n/2 \rceil}$ and $K_3$.
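To see how the bounds of Theorem 1 and Theorem 3 interact on a small case, the following Python sketch computes the edge clique covering number by exhaustive search and evaluates both bounds. It is a brute-force illustration suitable only for very small graphs; the function names are hypothetical and the search is not an algorithm taken from the cited papers.

```python
from itertools import combinations


def is_clique(vertex_subset, edge_set):
    """True if every two vertices of the subset are adjacent."""
    return all(frozenset(pair) in edge_set
               for pair in combinations(vertex_subset, 2))


def edge_clique_cover_number(vertices, edges):
    """Smallest number of cliques covering all edges (exhaustive search)."""
    edge_set = {frozenset(e) for e in edges}
    if not edge_set:
        return 0
    cliques = [frozenset(s)
               for r in range(2, len(vertices) + 1)
               for s in combinations(vertices, r)
               if is_clique(s, edge_set)]
    # Single edges are cliques, so a cover of size |E| always exists.
    for size in range(1, len(edge_set) + 1):
        for family in combinations(cliques, size):
            covered = {frozenset(pair)
                       for c in family
                       for pair in combinations(c, 2)}
            if covered == edge_set:
                return size


def opsut_bounds(vertices, edges):
    """Lower and upper bounds on k(G) as stated in Theorem 1."""
    theta_e = edge_clique_cover_number(vertices, edges)
    return theta_e - len(vertices) + 2, theta_e


def max_competition_number(n):
    """The bound floor(n^2 / 4) - n + 2 of Theorem 3."""
    return n * n // 4 - n + 2


# The 4-cycle, which is also the complete bipartite graph K_{2,2}.
C4_vertices = [0, 1, 2, 3]
C4_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(opsut_bounds(C4_vertices, C4_edges))
print(max_competition_number(4))
```

For the 4-cycle the lower bound of Theorem 1 and the upper bound of Theorem 3 coincide, which pins down the competition number of this graph without constructing any digraph.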
[Figure 2: The competition graph, C(D), the 2-competition graph, $C_2(D)$, the competition-common enemy graph, CC(D), and the niche graph, N(D), corresponding to an acyclic digraph D.]
The following result shows that the problem of computing k(G) for a graph G in general is not simple.
Theorem 4: (Opsut [1982]) The computation of k(G) for arbitrary G is NP-hard.
However, it is easy to calculate k(G) for some classes of graphs. Roberts [4] developed a heuristic algorithm which gives an upper bound m on k(G) by constructing a food web whose competition graph is $G \cup I_m$, where $I_m$ is a set of m isolated vertices and $\cup$ means disjoint union, i.e., no edges inserted. The algorithm leads to the following two results: Theorem 5 gives a simple formula for the competition number of a connected, triangle-free graph and Theorem 6 states the competition numbers for some classes of graphs.
Theorem 5: (Roberts [1978]) Suppose G is connected. Then if G is a triangle-free graph, $k(G) = |E(G)| - |V(G)| + 2$.
Wang [personal communication] proved that the converse of Theorem 5 is also true for a connected graph G.
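The formula of Theorem 5 is easy to evaluate once its hypotheses are checked. The sketch below, in Python, verifies connectivity and triangle-freeness before applying the formula. The function names are hypothetical, and the requirement of more than one vertex is an added precaution borrowed from the condition |V(G)| > 1 in Theorem 6(3), not part of the statement of Theorem 5.

```python
def adjacency(vertices, edges):
    """Adjacency sets of a simple undirected graph."""
    adj = {v: set() for v in vertices}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def is_connected(vertices, adj):
    """Depth-first search from the first vertex."""
    vertices = list(vertices)
    if not vertices:
        return True
    seen = {vertices[0]}
    stack = [vertices[0]]
    while stack:
        v = stack.pop()
        for w in adj[v] - seen:
            seen.add(w)
            stack.append(w)
    return len(seen) == len(vertices)


def is_triangle_free(adj):
    """No two adjacent vertices have a common neighbour."""
    return all(not (adj[a] & adj[b]) for a in adj for b in adj[a])


def competition_number_triangle_free(vertices, edges):
    """Evaluate |E| - |V| + 2 for a connected, triangle-free graph."""
    vertices = list(vertices)
    adj = adjacency(vertices, edges)
    # Assumption of this sketch: exclude the one-vertex graph.
    if len(vertices) < 2:
        raise ValueError("expected a graph with more than one vertex")
    if not is_connected(vertices, adj) or not is_triangle_free(adj):
        raise ValueError("graph must be connected and triangle-free")
    distinct_edges = {frozenset(e) for e in edges}
    return len(distinct_edges) - len(vertices) + 2


# Hypothetical examples: a 5-cycle, a path on 4 vertices, and K_{3,3}.
cycle5 = ([0, 1, 2, 3, 4], [(i, (i + 1) % 5) for i in range(5)])
path4 = ([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3)])
k33 = (list(range(6)), [(a, b) for a in range(3) for b in range(3, 6)])
for name, (V, E) in [("C5", cycle5), ("P4", path4), ("K33", k33)]:
    print(name, competition_number_triangle_free(V, E))
```

The value obtained for $K_{3,3}$ equals $\lfloor 6^2/4 \rfloor - 6 + 2$, in agreement with the extremal statement of Theorem 3.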
Table 2: $k(G)$, $dk(G)$, $q(G)$, and $k_p(G)$ of a graph $G$.
Triangulated graphs: $k(G) \le 1$*; $q(G)$ (Section 5) unknown except $q(P_2) = 1$; $q(P_n) = 0$, $n \ge 3$; $q(G) = \infty$ if $G$ is a nova; $q(K_n) = 1$, $n \ge 2$.
$C_n$, $n \ge 4$: $k(C_n) = 2$; $k_p(C_n) = p + 1$ (Section 3); $dk(C_n) = 2$; $q(C_n) = 2$ for $n = 4, 5, 6$; 1 for $n = 3, 8$; 0 for $n = 7$, $n \ge 9$.
Line graphs: $k(G) \le 2$†; $k_p(G) \le p + 1$†; $dk(G) \le 3$‡; $q(G)$ unknown.
Complete bipartite graphs: $q(G)$ unknown except $q(K_{m,n}) = \infty$ if $\min\{m, n\} \ge 3$.
Key: The competition number $k(G)$, the $p$-competition number $k_p(G)$, and the double competition number $dk(G)$ of a graph $G$ are the smallest $k$ so that $G$ together with $k$ isolated vertices is the competition graph, the $p$-competition graph, and the competition-common enemy graph, respectively, of an acyclic digraph. If $G$ can be made into a niche graph of an acyclic digraph by adding isolated vertices, then the niche number $q(G)$ of $G$ is the smallest number of isolated vertices needed; otherwise $q(G) = \infty$.
* with equality iff $G$ has no isolated vertex.
† with equality iff the neighborhood of each vertex has vertex clique covering number 2.
‡ the bounds may be improved.
§ with equality if $G$ is connected.
Theorem 6: (Roberts [1978]) (1) Every triangulated graph has $k(G) \le 1$ with equality if and only if $G$ has no isolated vertex. (2) Every interval graph has $k(G) \le 1$ with equality if and only if $G$ has no isolated vertex. (3) If $G$ is connected, $|V(G)| > 1$, and $G$ has no triangle, then $k(G) = 1$ if and only if $G$ is a tree. (4) If $n > 3$, then $k(C_n) = 2$, where $C_n$ is a cycle of length $n$.
If $v$ is a vertex of a graph $G$, the open neighborhood of $v$, $N(v)$, consists of all vertices adjacent to $v$ in $G$, and the closed neighborhood of $v$, $N[v]$, is $N(v) \cup \{v\}$. $N(v)$ and $N[v]$ also stand for the subgraphs induced by their respective vertices. Opsut found the competition numbers for the family of line graphs.
Theorem 7: (Opsut [1982]) If $G$ is a line graph, then $k(G) \le 2$, with equality if and only if for any vertex $v$ of $G$, one has $\theta_v(N(v)) = 2$.
A line graph satisfies the property that $\theta_v(N(v)) \le 2$ for all $v$ in $G$, and Opsut [16] conjectured that this is the only hypothesis needed to derive the conclusion of Theorem 7.
Conjecture 1: (Opsut [1982]) If $G$ is any graph with $\theta_v(N(v)) \le 2$ for all $v$ in $G$, then $k(G) \le 2$ with equality if and only if $\theta_v(N(v)) = 2$ for all $v$ in $G$.
Let us say that $\theta^*(N(v)) \le 2$ if (a) $\theta_v(N(v)) < 2$ or (b) $\theta_v(N(v)) = 2$ and there are two cliques $C_1$ and $C_2$ vertex covering $N[v]$, both containing $v$, so that for all $w \in C_1$, $N(w) - C_1$ is empty or a clique of $G$. We also say that $\theta^*(N(v)) = 2$ if $\theta^*(N(v)) \le 2$ and $\theta_v(N(v)) = 2$, i.e., if (b) holds. Kim and Roberts [17] partially answered Opsut's question as follows:
Theorem 8: (Kim and Roberts [1990]) Suppose that for all vertices $v$ in a graph $G$, $\theta^*(N(v)) \le 2$. Then $k(G) \le 2$, with equality if and only if for every vertex $v$ of $G$, $\theta^*(N(v)) = 2$.
Wang [18] improved Theorem 8 by weakening its hypothesis: only one vertex $v_0$ need satisfy $\theta^*(N(v_0)) \le 2$, while the other vertices $v$ satisfy the weaker property $\theta_v(N(v)) \le 2$.
Theorem 9: (Wang [1990]) Suppose that for all vertices $v$ in a graph $G$, $\theta_v(N(v)) \le 2$ and there is a vertex $v_0$ with $\theta^*(N(v_0)) \le 2$. Then $k(G) \le 2$, with equality if and only if for every vertex $v$ of $G$, $\theta_v(N(v)) = 2$.
Though Wang's hypothesis has quite closely approached Opsut's, the property that $\theta^*(N(v_0)) \le 2$ for a vertex $v_0$, which plays a key role in proving Theorem 9, is much stronger than the property that $\theta_v(N(v_0)) \le 2$. In an effort to answer Opsut's conjecture, which appears to be difficult to settle, we present the following conjecture, which is weaker than Opsut's conjecture and whose falseness implies falseness of Opsut's.
Conjecture 2: If $G$ is any graph with $\theta_v(N(v)) \le 2$ for all $v$ in $G$, then $\theta_e(G) \le n$ where $n = |V(G)|$.
The above conjecture is true with the added condition that $G$ has a clique of size at least $\lfloor (n+2)/2 \rfloor$. In order to show that, we take a clique $K_m$ with $m \ge \lfloor (n+2)/2 \rfloor$. For each vertex $v$ in $V(G)$, there are two cliques $C_{v1}$, $C_{v2}$ which cover $N[v]$, both containing $v$, by the hypothesis of the conjecture. Now consider
$$\mathcal{C} = \bigcup_{v \in V(G) - V(K_m)} \{C_{v1}, C_{v2}\} \cup \{K_m\}.$$
The cardinality of $\mathcal{C}$ is $2(n - m) + 1$. We claim that $\mathcal{C}$ is an edge clique covering of $G$ in the following way. For an edge $\{u, v\}$ in $G$, one of the following is true:
(i) both $u$ and $v$ are in $V(G) - V(K_m)$;
(ii) one of $u$ and $v$, say $u$, is in $V(G) - V(K_m)$ and $v$ is in $V(K_m)$;
(iii) both $u$ and $v$ are in $V(K_m)$.
In the case (i), $\{u, v\}$ is covered by both $C_{ui}$ and $C_{vj}$ for some $i, j = 1$ or $2$; in the case (ii), $\{u, v\}$ is covered by $C_{ui}$ for some $i = 1$ or $2$; in the case (iii), $\{u, v\}$ is covered by $K_m$. Hence
$$\theta_e(G) \le 2(n - m) + 1 \le 2(n - \lfloor (n+2)/2 \rfloor) + 1 \le n$$
and the claim follows. We know from Theorem 1 that if Conjecture 2 is false, then so is Conjecture 1. However, in the case where Conjecture 2 is true, Conjecture 1 is still open, since Conjecture 2 does not guarantee the existence of an ordering among the cliques of an edge clique covering $C_1, \ldots, C_n$ satisfying the hypothesis of Theorem 2, that is, for some vertex labeling $v_1, v_2, v_3, \ldots, v_n$, if $v_i \in C_j$ then $i \ge j - 1$.
3. The p-Competition Number
The concept of the $p$-competition graph of an acyclic digraph was introduced by Kim et al. [7] as a generalization of the special case $p = 1$, which is the competition graph of the digraph. Suppose $D$ is an acyclic digraph. If $p$ is a positive integer, the $p$-competition graph corresponding to $D$ is defined to have vertex set $V(D)$ and to have an edge between $u$ and $v$ if and only if for some distinct $a_1, \ldots, a_p$ in $V(D)$, $(u, a_1), (v, a_1), \ldots, (u, a_p), (v, a_p)$ are in $A(D)$. The $p$-competition graph can be thought of as a special case of a more general notion of tolerance intersection graph which has been developed by Jacobson, McMorris, and Mulder [19] and Jacobson, McMorris, and Scheinerman [20]. Kim et al. showed that for any positive integer $p$, any graph $G$ can be made into the $p$-competition graph of an acyclic digraph by adding sufficiently many isolated vertices. Based on this observation, they introduced the notion of $p$-competition number $k_p(G)$, which is the smallest $k$ so that $G \cup I_k$ is the $p$-competition graph of some acyclic digraph. This notion generalizes the competition number, and indeed we may obtain results which generalize the results on the competition number. The following theorem gives a lower bound on the $p$-competition number of a graph with no isolated vertex. It generalizes one of the results on the competition number.
Theorem 10: (Kim et al. [1990]) If $G$ has no isolated vertex, then $k_p(G) \ge p$.
The following theorem gives an upper bound for the $p$-competition number in terms of the competition number.
Theorem 11: (Kim et al. [1990]) For any graph $G$, $k_p(G) \le k(G) + p - 1$.
The following theorems give the $p$-competition numbers for some classes of graphs. From Theorem 10 and Theorem 11, the following theorem is straightforward.
Theorem 12: (Kim et al. [1990]) If $k(G) = 1$ and $G$ has no isolated vertex, then $k_p(G) = p$.
Since a triangulated graph or an interval graph with no isolated vertex has competition number $k(G) = 1$, the following corollary is an immediate consequence of Theorem 12.
Corollary 12.1: (Kim et al. [1990]) If $G$ has no isolated vertex and is a triangulated graph or interval graph, then $k_p(G) = p$.
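The definition of the $p$-competition graph given at the start of this section amounts to requiring at least $p$ common prey. A minimal sketch follows; the function name and the five-vertex digraph are hypothetical and serve only to show how the edge set shrinks as $p$ grows.

```python
from itertools import combinations


def p_competition_graph(vertices, arcs, p):
    """Edge set of the p-competition graph: u ~ v iff they share >= p prey."""
    prey = {v: set() for v in vertices}
    for tail, head in arcs:
        prey[tail].add(head)
    return {
        frozenset((u, v))
        for u, v in combinations(vertices, 2)
        if len(prey[u] & prey[v]) >= p
    }


# Hypothetical digraph: u and v share two prey, w shares one prey with each.
vertices = ["u", "v", "w", "a1", "a2"]
arcs = [("u", "a1"), ("u", "a2"), ("v", "a1"), ("v", "a2"), ("w", "a1")]

edges_p1 = p_competition_graph(vertices, arcs, 1)  # ordinary competition graph
edges_p2 = p_competition_graph(vertices, arcs, 2)
```

With $p = 1$ the three predators form a triangle, while with $p = 2$ only the edge between `u` and `v` survives.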
Let $\delta(G)$ denote the smallest degree of a vertex of $G$, and $\omega(G)$ the size of the largest clique of $G$.
Theorem 13: (Kim et al. [1990]) If $G$ has no isolated vertex and $\delta(G) \ge \omega(G)$, then $k_p(G) \ge p + 1$.
Theorem 13 together with Theorem 11 yields the following corollary:
Corollary 13.1: (Kim et al. [1990]) If $G$ has no isolated vertex, $k(G) = 2$, and $\delta(G) \ge \omega(G)$, then $k_p(G) = p + 1$.
Since a cycle $C_n$ with $n \ge 4$ has $k(C_n) = 2$, $\delta(C_n) = 2$, and $\omega(C_n) = 2$, and so satisfies the hypothesis of Corollary 13.1, the following corollary immediately follows.
Corollary 13.2: (Kim et al. [1990]) If $n \ge 4$, $k_p(C_n) = p + 1$.
For given positive integers $p$ and $t$, define $F(t, p)$ to be the smallest $a$ so that
$$\binom{a}{p} \ge t.$$
Theorem 14: (Kim et al. [1990]) If $N(v)$ has at least $t$ pairwise independent vertices for every $v$ in $V(G)$, then $k_p(G) \ge F(t, p)$.
The above theorem implies that the 2-competition number can be arbitrarily large: For given $m$, take a complete bipartite graph $K_{n,n}$ with $n \ge (m^2 - m)/2$. Since $N(v)$ has $n$ pairwise independent vertices for each vertex $v$ of $K_{n,n}$, one obtains $k_2(K_{n,n}) \ge F(n, 2)$ by Theorem 14. By the definition of $F(n, 2)$, $F(n, 2)$ is the smallest integer $\ge (1 + \sqrt{1 + 8n})/2$. This implies
$$k_2(K_{n,n}) \ge \frac{1 + \sqrt{1 + 8n}}{2} \ge \frac{1 + \sqrt{1 + 8(m^2 - m)/2}}{2} = m.$$
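The quantity $F(t, p)$ is easy to evaluate by direct search. The sketch below is only an illustration (the function name `F` mirrors the notation of the text; the search loop is an assumed implementation, not taken from the source). It reproduces the computation above: with $t = (m^2 - m)/2$ and $p = 2$ the search returns $m$.

```python
from math import comb


def F(t, p):
    """Smallest integer a with C(a, p) >= t, for positive integers t and p."""
    a = p  # C(a, p) = 0 for a < p, so the search can start at a = p
    while comb(a, p) < t:
        a += 1
    return a


# For t = (m^2 - m) / 2 = C(m, 2) and p = 2, the smallest such a is m itself.
values = [F((m * m - m) // 2, 2) for m in range(2, 12)]
```

The same function gives $F(2, p) = p + 1$ and $F(t, 1) = t$ for all positive integers $p$ and $t$.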
However, Theorem 14 does not appear to give good lower bounds for certain cases. For example, consider the case where $p = 1$. Then the lower bound on $k(K_{m,n})$ obtained by the theorem is $\min\{m, n\}$, which is far lower than the actual competition number $k(K_{m,n}) = mn - m - n + 2$ when $m$ and $n$ are sufficiently large. Since $p + 1$ is the smallest integer $a$ satisfying $\binom{a}{p} \ge 2$, the following corollary follows as a special case of Theorem 14 where $t = 2$.
Corollary 14.1: (Kim et al. [1990]) If $N(v)$ has at least two pairwise independent vertices for every $v$ in $V(G)$, then $k_p(G) \ge p + 1$.
The following is obtained from Theorem 11 and Corollary 14.1:
Corollary 14.2: (Kim et al. [1990]) If $k(G) = 2$ and $N(v)$ has at least two pairwise independent vertices for every $v$, then $k_p(G) = p + 1$.
The $p$-competition number of a line graph can be obtained as follows:
Corollary 14.3: If $G$ is a line graph with no isolated vertex, then $k_p(G) \le p + 1$ with equality if and only if for any vertex $v$ of $G$, $\theta_v(N(v)) = 2$.
Proof: It follows from Theorem 7 and Theorem 11 that $k_p(G) \le p + 1$. It remains to prove the equality. Suppose there is a vertex $v_0 \in V(G)$ with $\theta_v(N(v_0)) = 1$. Then $k(G) = 1$ by Theorem 7 and $k_p(G) = p < p + 1$ by Theorem 12. Now suppose for any vertex $v$ of $G$, $\theta_v(N(v)) = 2$. Then $k(G) = 2$ by Theorem 7. The assumption also implies that $N(v)$ has two pairwise independent vertices for every $v$ in $V(G)$. Then $G$ satisfies the hypothesis of Corollary 14.2 and $k_p(G) = p + 1$ follows.
The results on $p$-competition numbers mentioned thus far generalize those on ordinary competition numbers and suggest that the $p$-competition number of a graph with $p \ge 2$ is larger than its competition number. Surprisingly, despite those results, $k_p(G)$ can be smaller than $k(G)$ for some graphs $G$, indeed arbitrarily smaller.
Theorem 15: (Kim et al. [1990]) For any natural number $m$, there exists a graph $G$ such that $k_p(G) \le k(G) - m$.
From Theorem 5 and Theorem 11, we know that for a connected bipartite graph $G$,
$$k_p(G) \le |E(G)| - |V(G)| + p + 1.$$
When $p = 1$, the inequality is replaced by equality by Theorem 5. However, when $p \ge 2$, the upper bound can be significantly lowered for certain bipartite graphs. This can be shown by the bipartite graph given in Figure 3 which has 22 vertices and 32 edges, but 2-competition number 27. Hence we may ask whether or not the upper bound can be lowered in the case $p \ge 2$, i.e., whether or not there exists a bipartite graph satisfying the equality in the case $p \ge 2$.
Figure 3: A graph from Kim et al. [7] which has the $p$-competition number less than the competition number for $p \ge 2$.
As the $p$-competition number of a graph is well-defined, it makes sense to ask about the extreme $p$-competition number of a graph on $n$ vertices. Based on the fact that the maximum competition number is achieved by a complete bipartite graph, we may first attempt to study the $p$-competition number of a complete bipartite graph.
4. The Double Competition Number
Another variation of the notion of competition graph is the notion of competition-common enemy graph introduced by Scott [5]. The competition-common enemy graph of an acyclic digraph $D$ has the same set of vertices as $D$ and an edge between vertices $u$ and $v$ if and only if for some $w, x \in V(D)$, $(u, w) \in A(D)$ and $(v, w) \in A(D)$, and $(x, u) \in A(D)$ and $(x, v) \in A(D)$. As a notion analogous to the competition number, Scott [5] defined the double competition number, $dk(G)$, to be the smallest number $k$ such that $G \cup I_k$ is the competition-common enemy graph of some acyclic digraph. Scott [5] showed that this notion of double competition number is well-defined. She also claimed that for every $G$ with no isolated vertex,
$$2 \le dk(G) \le k(G) + 1. \qquad (4.1)$$
The lower bound follows from the fact that any acyclic digraph has a vertex with only incoming arcs and a vertex with only outgoing arcs. The upper bound can be obtained as follows: Choose a graph $G$ and construct an acyclic digraph $D$ whose corresponding competition graph is $G \cup I_{k(G)}$. Then create a new acyclic digraph $D'$ by adding to $D$ a vertex $v$ and arcs from $v$ to each vertex of $V(G)$. Then $G \cup I_{k(G)} \cup \{v\}$ is the competition-common enemy graph of $D'$. As a consequence of (4.1) and Theorem 6, Scott [5] gave the double competition numbers for some classes of graphs.
Theorem 16: (Scott [1987]) (1) $dk(K_n) = 2$, where $K_n$ is a complete graph with $n$ vertices. (2) $dk(C_n) = 2$ if $n \ge 3$. (3) Every interval graph $G$ has $dk(G) \le 2$. (4) If $G$ is a nontrivial tree, then $dk(G) = 2$.
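The construction behind the upper bound in (4.1) can be traced on a small case. In the sketch below, the function name, the vertex names, and the base food web (whose competition graph is the path a-b-c plus one isolated vertex x) are hypothetical; the new vertex z plays the role of the added vertex $v$ with arcs to every vertex of $G$.

```python
from itertools import combinations


def cce_graph(vertices, arcs):
    """Competition-common enemy graph: u ~ v iff they share a prey AND a predator."""
    prey = {v: set() for v in vertices}
    predators = {v: set() for v in vertices}
    for tail, head in arcs:
        prey[tail].add(head)
        predators[head].add(tail)
    return {
        frozenset((u, v))
        for u, v in combinations(vertices, 2)
        if prey[u] & prey[v] and predators[u] & predators[v]
    }


# Base digraph D: its competition graph is the path a-b-c with x isolated.
base_arcs = [("a", "c"), ("b", "c"), ("b", "x"), ("c", "x")]
# D': add a new vertex z with an arc to every vertex of the path.
extended_arcs = base_arcs + [("z", t) for t in ("a", "b", "c")]

before = cce_graph(["a", "b", "c", "x"], base_arcs)
after = cce_graph(["a", "b", "c", "x", "z"], extended_arcs)
```

Before the extra vertex is added no pair has a common predator, so the competition-common enemy graph has no edges; afterwards it is the path together with the two isolated vertices x and z, consistent with Theorem 16 (4).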
From (4.1) and Theorem 7, we can say that $dk(G) \le 3$ if $G$ is a line graph. We may ask whether or not the bound is sharp, i.e., whether there exists a line graph $G$ with $dk(G) = 3$. The following theorem shows that the double competition number can be arbitrarily large:
Theorem 17: (Jones et al. [1987])
For a complete 3-partite graph $K(n, n, n)$, $dk(K(n, n, n)) \ge \sqrt{n}$.
Jones et al. [21] found only one triangle-free graph with $dk(G) > 2$ (see Figure 4) and left open the question of finding an interesting family of triangle-free graphs with double competition number greater than 2. Based upon Theorem 3 and the fact that bipartite graphs are triangle-free, bipartite graphs have been studied to answer the question. We say that a (0,1) matrix $M$ is 1 0 1-clear if $M$ can be transformed by column and row permutations into a matrix with no 1 0 1 on a diagonal. Given a bipartite graph $G = (V_1, V_2)$, let the vertices in $V_1$ be labelled $(a, 1), \ldots, (a, |V_1|)$ and the vertices in $V_2$ be labelled $(b, 1), \ldots, (b, |V_2|)$. Define a $|V_1| \times |V_2|$ matrix $M = (m_{ij})$ by
$$m_{ij} = \begin{cases} 1, & \text{if } \{(a, i), (b, j)\} \in E(G), \\ 0, & \text{otherwise.} \end{cases}$$
Figure 4: $C(5,2)$.
We say that $G$ is 1 0 1-clear if $M$ is 1 0 1-clear.
Theorem 18: (Kim, Roberts, and Seager [1989])
If a bipartite graph $G = (V_1, V_2)$ is 1 0 1-clear, then $dk(G) \le 2$.
Kim, Roberts, and Seager [22] showed that any bipartite graph in which one part of the bipartition of the vertex set has size at most 4 is 1 0 1-clear. Then the following corollary results from Theorem 18.
Corollary 18.1: (Kim, Roberts, and Seager [1989]) For any bipartite graph $G = (V_1, V_2)$ with $|V_1| = n$ for any $n$ and $|V_2| \le 4$, $dk(G) \le 2$.
Scott [5] proved that $dk(K_{m,n}) = 2$, which also follows as a corollary of Theorem 18. This is in sharp contrast with the ordinary competition number $k(K_{m,n}) = mn - m - n + 2$ for sufficiently large $m$, $n$. We still do not know whether there exists a bipartite graph with double competition number greater than 2. If such a bipartite graph exists, then we may ask whether or not the double competition number of a bipartite graph can be arbitrarily large. In fact, Seager [23] found a family of triangle-free graphs with double competition number greater than 2. Let $C(5, n)$ be a 5-partite graph whose $5n$ vertices are partitioned into 5 stable sets $V_1, V_2, V_3, V_4, V_5$, with $|V_i| = n$, and in which vertices $u$ and $v$ are adjacent if and only if $u \in V_i$ and $v \in V_j$ where $|i - j| \equiv 1 \pmod 3$. We note that $C(5, n)$ is not a bipartite graph.
Theorem 19: (Seager [1989]) $dk(C(5, n)) > 2$ for any positive integer. Moreover, $dk(C(5, n)) > 3$ for $n \ge 10$.
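The adjacency rule for $C(5, n)$ can be checked mechanically. The sketch below builds the graph from the condition $|i - j| \equiv 1 \pmod 3$ and confirms by brute force that small cases are triangle-free; the function names and the vertex encoding (part index, position within the part) are assumptions made for illustration.

```python
from itertools import combinations


def c5(n):
    """Vertices and edges of C(5, n): five stable sets of size n arranged in a 5-cycle."""
    vertices = [(i, s) for i in range(1, 6) for s in range(n)]
    edges = {
        frozenset((u, v))
        for u, v in combinations(vertices, 2)
        if abs(u[0] - v[0]) % 3 == 1  # parts i and j are adjacent iff |i - j| = 1 or 4
    }
    return vertices, edges


def is_triangle_free(vertices, edges):
    """Brute-force check over all vertex triples; fine for small graphs only."""
    return not any(
        frozenset((a, b)) in edges
        and frozenset((b, c)) in edges
        and frozenset((a, c)) in edges
        for a, b, c in combinations(vertices, 3)
    )


vertices, edges = c5(2)  # the graph C(5, 2) of Figure 4
```

Each part is joined completely to its two neighbouring parts, so $C(5, n)$ has $5n$ vertices and $5n^2$ edges, and $C(5, 1)$ is the 5-cycle.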
We note that the triangle-free graph $G$ with $dk(G) > 2$ found by Jones et al. [21] is $C(5,2)$. In fact, they showed that $dk(C(5,2)) = 3$ by an exhaustive computer search. Seager [23] similarly proved that $dk(C(5,3)) = 3$. Seager [23] conjectured that the double competition number for this class is arbitrarily large:
Conjecture 3: (Seager [1989]) $dk(C(m, n)) \to \infty$ as $n \to \infty$ for all odd $m \ge 3$.
Finally, since any graph can be made into a competition common-enemy graph by adding sufficiently many isolated vertices, we may ask about the maximum double competition number of a graph of n vertices.
5. The Niche Number
Cable et al. [6] defined the niche graph of an acyclic digraph $D$ to have the same set of vertices as $D$ and an edge between vertices $u$ and $v$ if and only if for some $w \in V(D)$, $(u, w) \in A(D)$ and $(v, w) \in A(D)$, or $(w, u) \in A(D)$ and $(w, v) \in A(D)$. Cable et al. [6] defined the niche number $q(G)$ to be the smallest number $k$ such that $G \cup I_k$ is the niche graph of some acyclic digraph. However, there are graphs that cannot be made into niche graphs by adding isolated vertices, as Theorem 20 shows. For such graphs, Cable et al. [6] defined the niche number to be infinite. They also showed that the niche number can be 0 by taking an example $K_{1,3} \cup K_{1,3}$. These two facts differentiate $q(G)$ from $k(G)$, $k_p(G)$ or $dk(G)$ for a graph $G$. A nova is a graph obtained by replacing each edge of the star $K_{1,n}$, where $n \ge 3$, by a clique on at least 2 vertices.
Theorem 20: (Cable et al. [1987]) If $G$ is a nova, then $q(G) = \infty$.
However, a nova is not a forbidden subgraph for a niche graph. This can be shown using the graph $K_{1,3} \cup K_{1,3}$, which has a nova $K_{1,3}$ as a generated subgraph, but $q(K_{1,3} \cup K_{1,3}) = 0$. Cable et al. [6] calculated $q(G)$ for some classes of graphs:
Theorem 21: (Cable et al. [1987]) (1) $q(K_n) = 1$ for $n \ge 2$. (2) $q(P_2) = 1$, $q(P_n) = 0$ for $n \ge 3$, where $P_n$ is a path on $n$ vertices. (3) $q(C_n) = 0$ for $n = 7$, $n \ge 9$; $q(C_n) = 1$ for $n = 3$ and 8; $q(C_n) = 2$ for $n = 4, 5$, and 6.
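The niche graph definition (a common prey or a common predator) can be sketched in the same style. The function name and the three-vertex digraph below are hypothetical; the example is one acyclic digraph whose niche graph is the path $P_3$ with no added isolated vertices, which is consistent with $q(P_3) = 0$ in Theorem 21.

```python
from itertools import combinations


def niche_graph(vertices, arcs):
    """Niche graph: u ~ v iff they share a prey OR share a predator."""
    prey = {v: set() for v in vertices}
    predators = {v: set() for v in vertices}
    for tail, head in arcs:
        prey[tail].add(head)
        predators[head].add(tail)
    return {
        frozenset((u, v))
        for u, v in combinations(vertices, 2)
        if prey[u] & prey[v] or predators[u] & predators[v]
    }


# Hypothetical acyclic digraph on {a, b, c}: a and b share the prey c,
# while b and c share the predator a; a and c share neither.
arcs = [("a", "c"), ("b", "c"), ("a", "b")]
p3_edges = niche_graph(["a", "b", "c"], arcs)
```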
In addition to the above results, Sakai [personal communication] has shown that the niche number of a complete bipartite graph $K_{m,n}$ with $\min\{m, n\} \ge 3$ is $\infty$. In the same paper, Cable et al. [6] asked whether there exists a graph $G$ with $2 < q(G) < \infty$. This was one of many interesting open questions related to competition graphs, and it was finally answered by Fishburn and Gehrlein [24]:
Theorem 22: (Fishburn and Gehrlein [1991]) Suppose $m \ge 2$ and $G$ is a $K_{m+1}$-free graph with a finite niche number. Then (1) $q(G) \ge 2m$ if $d(v) \ge m^2 - 1$ for every vertex $v$ of $G$; (2) $q(G) \ge 2m - 1$ if $(m - 1)^2 < d(v_0) < m^2 - 1$ for one vertex $v_0$ of $G$ and $d(v) \ge m^2 - 1$ for every other vertex $v$, where $d(v)$ denotes the degree of a vertex $v$.
Fishburn and Gehrlein [24] took various graphs satisfying the hypothesis of Theorem 22 and constructed corresponding acyclic digraphs to give finite upper bounds for their niche numbers. Then, they used Theorem 22 to show that the lower bounds for the niche numbers of their examples are positive integers $l$, $l \ge 3$. In certain cases, the lower bounds and the upper bounds for the niche numbers of graphs coincided. These gave the exact niche numbers 3 and 4. In addition, Fishburn and Gehrlein [24] proved that graphs can have arbitrarily large finite niche numbers by showing that for any positive integer $m$, they can construct a graph having finite niche number greater than $m$.
In their paper, they asked for the smallest $n$ for which $2 < q(G) < \infty$ for a graph $G$ with $|V(G)| = n$. They suggested that the number should be between 8 and 11 since 12 was the smallest number among the cardinalities of the vertex sets of their examples and computer enumeration showed that any graph having at most 7 vertices has niche number 0, 1, 2, or $\infty$. The fact that their smallest example with niche number 3 has 14 vertices, while one of their examples has niche number 4 and 11 vertices, led them to ask whether $\min\{|V(G)| : q(G) = 3\} > \min\{|V(G)| : q(G) = 4\}$. Finally, they asked whether there exists a graph with niche number $k$ for each positive integer $k$. In addition to the questions asked by Fishburn and Gehrlein [24], it would be interesting to find a good characterization of graphs with $q(G) < \infty$.
References
[1] J.E. Cohen; Interval graphs and food webs. A finding and a problem, RAND Corporation Document 17696-PR, Santa Monica, California (1968).
[2] J.E. Cohen; Food webs and the dimensionality of trophic niche space, Proc. Nat. Acad. Sci., 74, 4533-4536 (1977).
[3] J.E. Cohen; Food Webs and Niche Space, Princeton University Press, Princeton, N.J. (1978).
[4] F.S. Roberts; Food webs, competition graphs, and the boxicity of ecological phase space, Theory and Application of Graphs, Y. Alavi and D. Lick (editors), Springer-Verlag, New York, 477-490 (1978).
[5] D. Scott; The competition-common enemy graph of a digraph, Discrete Appl. Math., 17, 269-280 (1987).
[6] C.A. Cable, K.F. Jones, J.R. Lundgren, and S. Seager; Niche graphs, Discrete Appl. Math., 23, 231-241 (1989).
[7] S.-R. Kim, A.T. McKee, F.R. McMorris, and F.S. Roberts; p-competition graphs, RUTCOR Research Report RRR # 36-89, Rutgers Center for Operational Research, New Brunswick, New Jersey (1989).
[8] F.S. Roberts; Graph Theory and Its Applications to Problems of Society, CBMS-NSF Monograph Number 29, SIAM Publication, Philadelphia, Pennsylvania (1978).
[9] C.E. Shannon; The zero capacity of a noisy channel, IRE Trans. Inform. Theory, IT-2, 8-19 (1956).
[10] M.B. Cozzens and F.S. Roberts; T-colorings of graphs and the channel assignment problem, Congressus Numerantium, 35, 191-208 (1982).
[11] W.K. Hale; Frequency assignment: theory and application, Proc. IEEE, 68, 1497-1514 (1980).
[12] R.J. Opsut and F.S. Roberts; On the fleet maintenance, mobile radio frequency, task assignment and traffic phasing problems, The Theory and Applications of Graphs, G. Chartrand, Y. Alavi, D.L. Goldsmith, L. Lesniak-Foster, D.R. Lick (editors), Wiley, New York, 479-492 (1981).
[13] J.R. Lundgren and J.S. Maybee; A characterization of graphs of competition number m, Discrete Appl. Math., 6, 319-322 (1983).
[14] S.-R. Kim; Competition Graphs and Scientific Laws for Food Webs and Other Systems, Ph.D. Thesis, Rutgers University, New Brunswick, New Jersey (1988).
[15] F. Harary, S.-R. Kim, and F.S. Roberts; Extremal competition numbers as a generalization of Turán's theorem, J. Ramanujan Math. Soc., 5, 33-43 (1990).
[16] R.J. Opsut; On the computation of the competition number of a graph, SIAM J. Alg. Discr. Meth., 3, 420-428 (1982).
[17] S.-R. Kim and F.S. Roberts; On Opsut's conjecture about the competition number, Congressus Numerantium, 71, 173-176 (1990).
[18] C. Wang; On critical graphs for Opsut's conjecture, RUTCOR Research Report RRR # 15-90, Rutgers Center for Operations Research, New Brunswick, New Jersey (1990).
[19] M.S. Jacobson, F.R. McMorris, and H.M. Mulder; Tolerance intersection graph, mimeographed, Department of Mathematics, University of Louisville, Louisville, Kentucky (1988).
[20] M.S. Jacobson, F.R. McMorris, and E.R. Scheinerman; General results on tolerance intersection graphs, mimeographed, Department of Mathematics, University of Louisville, Louisville, Kentucky (1989).
[21] K.F. Jones, J.R. Lundgren, F.S. Roberts, and M.S. Seager; Some remarks on the double competition number of a graph, Congressus Numerantium, 60, 17-24 (1987).
[22] S.-R. Kim, F.S. Roberts, and S. Seager; On 1 0 1-clear (0,1) matrices and the double competition number of bipartite graphs, RUTCOR Research Report RRR # 19-89, Rutgers Center for Operational Research, New Brunswick, New Jersey (1989).
[23] S. Seager; The double competition number of some triangle-free graphs, Discrete Appl. Math., 29, 265-269 (1990).
[24] P.C. Fishburn and W.V. Gehrlein; Niche numbers, manuscript, AT&T Bell Laboratories, Murray Hill, New Jersey (1991).
[25] J.L. Harrison; The distribution of feeding habits among animals in a tropical rain forest, J. Animal Ecology, 31, 53-64 (1962).
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 327-332 (1993) © 1993 Elsevier Science Publishers B.V. All rights reserved.
WHICH DOUBLE STARLIKE TREES SPAN LADDERS?
Martin LEWINTER Mathematics Department, State University of New York Purchase, New York, U.S.A.
William F. WIDULSKI Mathematics Department, Westchester Community College Valhalla, New York, U.S.A.
Abstract
The hypercube $Q_n$ is defined recursively by $Q_1 = K_2$ and $Q_n = Q_{n-1} \times K_2$, so that $|V(Q_n)| = 2^n$. A ladder on $2k$ vertices is $K_2 \times P_k$. Ladders on $2^n$ vertices span $Q_n$. A double starlike tree is a tree with exactly two vertices of degree greater than two. We show which double starlike trees span ladders, implying that such double starlike trees on $2^n$ vertices span $Q_n$.
1. Introduction
Let $K_n$ and $P_n$ denote the complete graph and the path, respectively, on $n$ vertices, and let $\Delta$ denote the maximum degree of a graph. The hypercube $Q_n$ is defined recursively by $Q_1 = K_2$ and $Q_n = Q_{n-1} \times K_2$. The vertex set $V(Q_n)$ contains $2^n$ vertices. A ladder $L_{2m}$ on $2m$ vertices is defined by $L_{2m} = P_m \times K_2$. It is shown in [1] that binary ladders, that is, ladders on $2^n$ vertices, span hypercubes. A double star is a tree with exactly two vertices (called junctions) of degree greater than two while the remaining vertices have degree one. A double starlike tree is a subdivision of a double star; that is, it admits vertices of degree two. In [2], it is shown that equitable double starlike trees with $\Delta = 3$ and adjacent junctions span ladders. It follows that if such trees have $2^n$ vertices, they span hypercubes. Spanning trees of $Q_n$ are of current interest in computer science as hypercubes are the underlying architecture of massively parallel processors (see [1]-[10]). In this paper, we relax the condition that the junctions of a double starlike tree be adjacent and we show which of them span ladders. Obviously, we still require $\Delta = 3$. In one case we exhibit a double starlike tree which spans a hypercube even though it fails to span a ladder.
2. Ladder-Spanning Double Starlike Trees
Let $S(a_1, a_2; b_1, b_2; d)$ be a double starlike tree with junctions $u$ and $v$ of degree 3, such that $d(u, v) = d > 1$ and the branches at $u$ and $v$ have lengths $a_1, a_2$ and $b_1, b_2$, respectively. Figure 1 depicts $S(5,2;1,4;3)$. A bipartite graph is 2-colorable. If both color sets have the same cardinality, it is called equitable.
Figure 1: S(5,2;1,4;3).
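As an illustration of the notation, the following sketch builds $S(a_1, a_2; b_1, b_2; d)$ as an adjacency structure and counts its two color classes. The function names and the vertex labels are hypothetical; the tree of Figure 1 is used as the example.

```python
from collections import deque


def double_starlike_tree(a1, a2, b1, b2, d):
    """Adjacency sets of S(a1, a2; b1, b2; d) with junctions 'u' and 'v'."""
    adj = {"u": set(), "v": set()}

    def link(x, y):
        adj.setdefault(x, set()).add(y)
        adj.setdefault(y, set()).add(x)

    def grow(start, length, tag):
        """Attach a path of `length` new vertices to `start`; return its last vertex."""
        prev = start
        for i in range(1, length + 1):
            link(prev, (tag, i))
            prev = (tag, i)
        return prev

    link(grow("u", d - 1, "uv"), "v")  # the u-v path has d edges
    grow("u", a1, "a1")
    grow("u", a2, "a2")
    grow("v", b1, "b1")
    grow("v", b2, "b2")
    return adj


def color_classes(adj, root="u"):
    """Sizes of the two color classes of the (unique) 2-coloring of a tree."""
    color = {root: 0}
    queue = deque([root])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in color:
                color[y] = 1 - color[x]
                queue.append(y)
    same_as_root = sum(1 for c in color.values() if c == 0)
    return same_as_root, len(color) - same_as_root


tree = double_starlike_tree(5, 2, 1, 4, 3)  # the tree of Figure 1
```

The tree of Figure 1 has 16 vertices, 8 in each color class, so it is equitable.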
We present the following lemma without proof.
Lemma 1: The double starlike tree $S(a_1, a_2; b_1, b_2; d)$ on $2k$ vertices is equitable if and only if either (a) $u$ and $v$ have the same color, say white, and $S$ has exactly one black end vertex, or (b) $u$ and $v$ are oppositely colored, and there are exactly two black (and therefore two white) end vertices.
Since ladders are equitable, it follows that only equitable double starlike trees can possibly span ladders. We consider cases (a) and (b) separately.
Case (a): Without loss of generality, let $v$ be the junction whose branches are oppositely colored. Let $a_1 \le a_2$. We have the following theorem.
Theorem 1: An equitable double starlike tree $S(a_1, a_2; b_1, b_2; d)$ on $2k$ vertices such that both junctions have the same color spans the ladder $L_{2k}$ if and only if $a_1 \le d$.
Proof: Suppose initially that $a_1 = d$; then embed $S$ in $L_{2k}$ as shown in Figure 2. Observe that the branch of length $a_1$ is directly opposite the $u$-$v$ path.
Figure 2: $S$ embedded in $L_{2k}$.
If $a_1 < d$, follow the scheme of Figure 3, in which the $u$-$v$ path 'wiggles' in the ladder.
Figure 3: An embedding with a 'wiggling' $u$-$v$ path.
On the other hand, if $a_1 > d$, both branches at $u$ must proceed to the left and include the rung at $u$. However, $a_1 + a_2 + 1$ is odd and hence cannot span a subladder, and the theorem follows.
Case (b): Let the branches at $u$ be oppositely colored, in which case the branches at $v$ must also be oppositely colored.
Theorem 2: The double starlike tree $S(a_1, a_2; b_1, b_2; d)$ on $2k$ vertices such that the junctions are oppositely colored and the end vertices of the branches at each junction are oppositely colored spans $L_{2k}$.
Proof: The embedding is accomplished using one of the schemes of Figure 4 in which, without loss of generality, $u$ is black.
Figure 4: Embedding schemes, panels (a) and (b).
Case (b) admits two other possibilities. Assume, first, that the branches at the black junction $u$ both have black end vertices. It follows that $v$ is white and so are the end vertices of its branches. Assume that $a_1 \le a_2$ and $b_1 \le b_2$. We have the following theorem.
Theorem 3: The double starlike tree $S(a_1, a_2; b_1, b_2; d)$ on $2k$ vertices such that the junctions are oppositely colored, and the end vertices of the branches at each junction agree with the color of that junction, spans $L_{2k}$ if and only if $a_1 + b_1 \le d + 1$.
Proof: Assume initially that $a_1 + b_1 = d + 1$. Then the embedding of Figure 5 applies.
Figure 5: $a_1 + b_1 = d + 1$.
When $a_1 + b_1 < d + 1$, one of the schemes of Figure 6 accomplishes the embedding. (The 'wiggling' has been shortened for convenience.) Clearly, if $a_1 + b_1 > d + 1$, no embedding is possible. If the end vertices of the branches at $u$ are white and those at $v$ are black, no embedding is possible since one branch at $u$ and one branch at $v$ must proceed toward one another while the remaining branches, having an odd number of vertices, cannot span the subladders to the left of $u$ and the right of $v$.
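The case analysis of Lemma 1 and Theorems 1 to 3 can be collected into a single test on the parameters. The sketch below is one reading of those statements, not a verified algorithm: the function name `spans_ladder` is hypothetical, and it relies on the observation that the end vertex of a branch has the color of its junction exactly when the branch length is even, and that $u$ and $v$ have the same color exactly when $d$ is even. It assumes $d > 1$ and branch lengths of at least 1.

```python
def spans_ladder(a1, a2, b1, b2, d):
    """Decide, following Lemma 1 and Theorems 1-3 as stated above, whether
    S(a1, a2; b1, b2; d) spans a ladder (assumes d > 1, all lengths >= 1)."""
    # Number of branches at each junction whose end vertex has the color
    # opposite to that junction (these are the odd-length branches).
    a_odd = (a1 % 2) + (a2 % 2)
    b_odd = (b1 % 2) + (b2 % 2)

    if d % 2 == 0:
        # Junctions share a color: Lemma 1(a) needs exactly one end vertex
        # of the other color, and Theorem 1 compares the shorter branch at
        # the junction whose two end vertices agree with the distance d.
        if a_odd + b_odd != 1:
            return False
        shorter = min(a1, a2) if a_odd == 0 else min(b1, b2)
        return shorter <= d

    # Junctions are oppositely colored (Lemma 1(b)).
    if a_odd == 1 and b_odd == 1:
        return True  # Theorem 2
    if a_odd == 0 and b_odd == 0:
        return min(a1, a2) + min(b1, b2) <= d + 1  # Theorem 3
    # Either every end vertex disagrees with its junction (no embedding),
    # or the tree is not equitable.
    return False
```

Under this reading, `spans_ladder(5, 2, 1, 4, 3)` holds for the tree of Figure 1 by Theorem 2, while $S(3,3;1,5;3)$, whose end vertices all disagree with their junctions, does not span a ladder.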
Figure 6: $a_1 + b_1 < d + 1$.
Remark 1: The ladders of Theorems 1, 2 and 3 on $2^n$ vertices span $Q_n$.
It should be noted that there are double starlike trees with $\Delta = 3$ which, while failing to span ladders, span hypercubes. The following lemma will enable us to produce such examples. We omit the proof.
Lemma 2: $C_{2^{n-1}} \times K_2$ spans $Q_n$.
Using Lemma 2, one sees that the graph of Figure 7 spans $Q_4$, though Remark 1 does not apply.
Figure 7: $S(3,3;1,5;3)$ spans $Q_4$.
3. Open Problems
The 2-dimensional mesh $M(m, n)$ on $mn$ vertices is defined by $M(m, n) = P_m \times P_n$. Note that the ladder on $2k$ vertices is $M(2, k)$. It is shown in [1] that if $mn = 2^k$, then $M(m, n)$ spans $Q_k$. We seek, therefore, a characterization of those double starlike trees of maximum degree three which span 2-dimensional meshes. Unlike the case of ladders, we may now pose the same question with maximum degree four.
References
[1] F. Harary and M. Lewinter; Spanning subgraph of a hypercube III: Meshes, International J. Computer Math., 25, 1-4 (1988).
[2] F. Harary and M. Lewinter; Spanning subgraph of a hypercube II: Double starlike trees, Math. Comput. Modeling, 11, 216-217 (1988).
[3] F. Harary and M. Lewinter; Hypercubes and other recursively defined Hamilton laceable graphs, Congressus Numerantium, 60, 81-84 (1988).
[4] F. Harary and M. Lewinter; Spanning subgraph of a hypercube IV: Rooted trees, Comput. Math. Appl. (to appear).
[5] F. Harary and M. Lewinter; Spanning subgraph of a hypercube V: Spanned subcubes, Proc. First China-U.S.A. Conf. on Graph Theory, Ann. New York Acad. Sci., 576, 219-225 (1989).
[6] F. Harary and M. Lewinter; Spanning subgraph of a hypercube VI: Survey and unsolved problems, Graph Theory, Combinatorics, and Applications, 2, Wiley-Interscience, New York, 633-637 (1991).
[7] F. Harary, M. Lewinter and W. Widulski; On two-legged caterpillars which span hypercubes, Congressus Numerantium, 66, 103-108 (1988).
[8] M. Lewinter and W. Widulski; Minimal hyperhamilton laceable graphs, Comput. Math. Appl. (to appear).
[9] M. Lewinter and W. Widulski; Equipartition sets of hypercubes, J. of Comb. Info. and Sys. Sci., 16, 19-24 (1991).
[10] I. Havel and P. Liebl; One-legged caterpillars span hypercubes, J. of Graph Theory, 10, 69-77 (1986).
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 333-340 (1993) © 1993 Elsevier Science Publishers B.V. All rights reserved.
THE RANDOM f-GRAPH PROCESS
Krystyna T. BALIŃSKA Computer Science Center, The Technical University of Poznań, Poznań, POLAND
Louis V. QUINTAS Mathematics Department, Pace University New York, New York, U.S.A.
Abstract
Starting with $n$ vertices and no edges, sequentially introduce edges so as to obtain a sequence of graphs each having no vertex of degree greater than $f$. The latter are called $f$-graphs. At each step the edge to be added is selected with equal probability from among those edges whose addition would not violate the $f$-degree restriction. A terminal graph of this procedure is called a sequentially generated random edge maximal $f$-graph and the procedure the random $f$-graph process of order $n$. This simple generalization of the classic Erdős-Rényi random graph process leads to some challenging mathematical problems and is a process related to a variety of physical applications.
1. Introduction
We consider the following procedure. Starting with n labeled vertices and no edges, sequentially introduce edges so as to obtain a sequence of graphs each having no vertex of degree greater than f. The latter are called f-graphs. At each step the edge to be added is selected with equal probability from among those edges whose addition would not violate the f-degree restriction. A terminal graph of this procedure is called a sequentially generated random edge maximal f-graph and the procedure the random f-graph process (RfGP) of order n. A graph in a RfGP obtained after t steps has t edges and is said to be on level t.
This procedure with f = n - 1 is the classic Erdős-Rényi random graph process (RGP) [1]-[3], which has been extensively studied [4]-[6]. Many of the questions asked about the RGP can be posed for the RfGP; however, a bounded degree restriction, f < n - 1, introduces different types of difficulties in the resolution of these questions. The basic causes of these are as follows.
(i) It is clear that the RGP terminates in a unique graph K_n, whereas for f < n - 1 a terminal graph in a RfGP is one graph from a set of edge maximal f-graphs.
(ii) The RfGP can terminate with an edge maximal f-graph having fewer than the maximum possible fn/2 edges, that is, the terminal level is not the same for all processes. For example, if f = 3 and n is even, a terminal edge maximal 3-graph can appear at level 3n/2, 3n/2 - 1, or 3n/2 - 2; for n odd a terminal edge maximal 3-graph can appear at level 3(n - 1)/2 or 3(n - 1)/2 + 1.
(iii) A RfGP is the same as the RGP up to the level t at which the first vertex of degree f appears. For any RGP graph on level t there are C(n;2) - t edges from which to choose uniformly to obtain a graph on level t + 1, whereas in the RfGP the number of edges available is C(n;2) - t - (n - f) + 1 for an f-graph with exactly one vertex of degree f. For the case on a level t where it is possible to have up to two vertices of degree f, an f-graph with exactly two vertices of degree f will have either C(n;2) - t - 2(n - f) + 2 or C(n;2) - t - 2(n - f) + 3
edges available to choose from depending on whether these two vertices are adjacent or not. Thus, the probability of choosing an edge is not the same for each graph on a given level. A detailed discussion of this distinction in the asymptotic case is given in [7].
Graph theoretical models for real life situations most often have a bounded degree restriction, due either to cost considerations or simply to the fact that in most cases, for example in communications networks, material pipelines, or transportation networks, not all connections between vertices are necessary (see [8]). It is clear that algorithms that deal with graphs often mimic the RfGP, inputting edges until some stopping criterion occurs based on the edges seen thus far. An obvious algorithm feature in this situation is stopping when a certain type of subgraph is formed.
In models associated with chemistry and physics the bounded degree condition can be a consequence of either the environment, as in percolation models in lattices, or the natural restriction on the number of bonds that can be incident to a chemical species. Here random bounded graph models have played a central role for many years [9]-[11]. One such area is that of polymer statistics. In particular, we note that the RfGP models a non-reversible process which is analogous to one of interest in chemistry (see for example [12]). In chemistry such a process is referred to as a kinetic model, in contrast to an equilibrium model in which each of the edges of the random graph model is present, independently, with some given probability.
In Section 2 we comment on some probabilistic problems concerning the RfGP. In Section 3 we introduce a digraph associated with the RfGP and propose a number of nonprobabilistic problems concerning this digraph and its underlying graph.

2. Probabilistic Problems
As noted in Section 1, the terminal graphs in a RfGP form a set of edge maximal f-graphs. The structure of edge maximal f-graphs is studied in [11] [13]-[15]. One approach in such studies is to partition these graphs in accordance with some structural property and seek the distribution of the resultant classes. For example, if e is the number of edges in an edge maximal f-graph, let P(e;n;f) denote the probability that a terminal edge maximal f-graph in the RfGP of order n has e edges. The role of this random variable was noted in Section 1.

Problem 1:
Determine the properties of P(e;n;f) for finite n and for n going to infinity.

An example of an analogous random variable is to be found in a problem posed (verbally) in 1985 by P. Erdős. Here we let m denote the number of vertices of degree less than f in an edge maximal f-graph obtained from a RfGP of order n and P(m;n;f) the probability distribution of m (for an alternate formulation see Problem 5 in [11]).

Problem 2: Determine the properties of P(m;n;f) for finite n and for n going to infinity.

Initially, not even a qualitative description of the solution of Problem 2 was available [16]. Subsequently, using computer algorithm realizations of the RfGP, considerable insight into this problem and the process itself has been obtained. In [17] [18] the shape of the solution as a function of n for f = 2, 3, and 4 is given. In [19] this is explored further for f > 4. These studies provide a detailed description of P(m;n;f) for finite n. The description of this distribution of the terminal graphs as a function of f is studied in [20] [21]. No theoretical results concerning the asymptotic analysis of P(m;n;f) as a function of n are known at this time. In contrast to this, the asymptotic distribution of m for the equiprobable distribution of edge maximal f-graphs is known [13] [17] [18]. From [19] we have Problems 3 and 4 and the following conclusion concerning the RfGP.

Result: (see [19])
Let T = T(n, f) be such that P(T;n;f) > P(m;n;f) for all m ≠ T. If f is constant, then the limit of T(n, f) when n goes to infinity is 0 if fn is even and is 1 if fn is odd.

Problem 3:
For the distribution P(m;n;f) with f a constant, determine whether P(0;n;f) (fn even) and P(1;n;f) (fn odd) converge to 1 or to a limit strictly less than 1 when n goes to infinity.

Problem 4:
For the distribution P(m;n;f) with f = f(n), given that f(n)/(n - 2) converges to a constant c (0 ≤ c ≤ 1), what is the limit of T(n, f(n))/n as n goes to infinity? In particular, determine whether or not T(n, n - 2)/n converges to 1/6, thereby showing that T(n, n - 2) is asymptotic to n/6.

In 1987, Ruciński, whose mutual interest in Problem 2 dates back to 1985, introduced an algorithm specifically designed to study this problem for f = 2 (see [22]). This algorithm requires O(n^3) time. Subsequently, an algorithm requiring O(n^2) time was developed by Szmanda and Balińska (see [23] [24]). The focus on f = 2 has been helpful in developing ideas and insights in preparation for further assaults on f > 2 problems.
In 1990 the algorithm IMAGE was designed for the R2GP (see [15]). We note that this algorithm is not restricted to the study of the distribution P(m;n;f) since it gives the complete description of the evolving 2-graphs at every level in the R2GP. In addition it has been modified to handle problems with f > 2. Some of the random variables for f = 2 studied in [15] are the following: m, the number of vertices of degree less than two; C(s), the number of cycles of length s; K, the number of components; and M, the order of the largest component. For each of these random variables exact numerical distribution results are given. In [25] we obtain further results on these random variables for both the uniform distribution and the R2GP.
It is known that the distributions of the random variable m for the RfGP and for the uniform distribution are quite different (see [13] [19]). However, in [25], when P(M > n/2;n) for the R2GP and the uniform distribution were compared, it was seen that these probabilities tended to be asymptotically equivalent, and it was proved that the limit in the uniform case was ln(√2 + 1). Thus, we are led to propose:

Problem 5:
Prove or disprove that for the R2GP, P(M > n/2;n) is asymptotically equivalent to the corresponding probability in the uniform distribution and that

\[ \lim_{n \to \infty} P(M > n/2;\, n) = \ln(\sqrt{2} + 1) \approx 0.8814. \]
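The quantities in Problems 2 and 5 can be explored numerically for small orders. The following is a minimal simulation sketch of the RfGP as defined in Section 1, not one of the algorithms of [15] or [22]-[24]; the function names and the plain list-filtering strategy are illustrative assumptions, chosen for readability rather than speed.

```python
import random
from collections import Counter

def random_f_graph_process(n, f, rng):
    """Run one RfGP of order n and return the edges of the terminal graph."""
    if f <= 0:
        return []
    deg = [0] * n
    # Pairs that may still be added: non-edges whose two ends have degree < f.
    available = [(u, v) for u in range(n) for v in range(u + 1, n)]
    edges = []
    while available:
        # Uniform choice among the currently admissible edges.
        u, v = available[rng.randrange(len(available))]
        edges.append((u, v))
        deg[u] += 1
        deg[v] += 1
        # Drop the chosen pair and every pair touching a saturated vertex.
        available = [(a, b) for (a, b) in available
                     if (a, b) != (u, v) and deg[a] < f and deg[b] < f]
    return edges

def terminal_statistics(n, f, edges):
    """Return (m, M): vertices of degree < f, and order of the largest component."""
    deg = [0] * n
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
        parent[find(u)] = find(v)
    m = sum(1 for d in deg if d < f)
    sizes = Counter(find(x) for x in range(n))
    return m, max(sizes.values())

def estimate(n, f, trials, seed=0):
    """Empirical distribution of m and empirical frequency of M > n/2."""
    rng = random.Random(seed)
    m_counts = Counter()
    big = 0
    for _ in range(trials):
        m, M = terminal_statistics(n, f, random_f_graph_process(n, f, rng))
        m_counts[m] += 1
        big += (2 * M > n)
    return {k: c / trials for k, c in sorted(m_counts.items())}, big / trials
```

For instance, the second value returned by `estimate(60, 2, 2000)` is an empirical counterpart of P(M > n/2;n) for the R2GP and can be set beside ln(√2 + 1); each step rescans the list of admissible pairs, so this sketch is only practical for modest n.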
Also studied in [15] are some random variables of interest in the evolution of the R2GP of finite order, namely the hitting time to the first cycle and the length of the first cycle. These results were presented by the authors at the Second Polish Conference on Graph Theory, Niedzica, Poland, 1990. Questions concerning the evolving structure of graphs in the RfGP (f > 2) await resolution. Pioneering work in this direction was done by Erdős and Kennedy [26]. (for example see
m).
It is proposed by Galina that certain methods of mathematical chemistry will be useful in this area [27]. Perhaps the powerful asymptotic techniques used by Flajolet, Knuth and Pittel [28] as applied to the RGP will play a role in the asymptotic analysis of corresponding problems in the RfGP. However, the latter is not known at this time.
In [14] the nonprobabilistic study of f-graphs with f large relative to n (for example f greater than n/2) sheds some light on the behavior of the functions studied in [20] [21]. In particular, the distinction between the distribution of the vertices of degree less than f when f < n/2 and when f > n/2 appears to be related to some of the difficulties of the problems associated with the RfGP.
Further simulation studies have played a role in obtaining a better insight into the RfGP. Here P(m;n;f) has been studied both as a function of n and of f. The qualitative shapes of these functions are well established by these simulations. However, exact formulas remain elusive. In [20] [21] [29] we conjecture that P(m;n;f) in the domain 2 ≤ f ≤ n/2 as a function of f is of the form A f^B exp(-Cf), where A = A(m, n), B = B(m, n), and C = C(m, n). We pose the following problem.

Problem 6: Determine analytic expressions for A(m, n), B(m, n), and C(m, n) as functions of m and n and/or determine the limits of these coefficients as n goes to infinity.

3. The Transition Digraph D(n)
In this section we turn to some purely graph theoretic problems which arose during the study of the random graph problems of the preceding section.
Let D(n) denote the digraph whose nodes are in 1:1 correspondence with the unlabeled f-graphs of order n and in which (H, K) is an arc if and only if K can be obtained from H by the addition of a single edge. Furthermore, the arc (H, K) has weight equal to the probability of going from a given labeled graph isomorphic to H to a labeled graph isomorphic to K in the RfGP. The digraph D(n) is called the transition digraph of the RfGP of order n.
The RfGP is a Markov chain and, if considered as such, then to be precise the transition digraph should have loops at the nodes with outdegree 0. This would then yield a transition matrix which is stochastic, as is required in this context. However, for the study of the structure of D(n) that we propose it is more convenient to omit these 1-cycles. Thus, in what follows it should be understood that we are referring to the loopless digraph D(n) as defined in the preceding paragraph.
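For f = 2 the definition above can be made concrete, because an unlabeled 2-graph is a disjoint union of cycles and paths (an isolated vertex being a path of order 1). The sketch below builds D(n) with exact arc weights and pushes probability mass through it to obtain the exact distribution of m for small n. The node encoding as a pair (cycle lengths, path orders) and all function names are illustrative assumptions, not the representation used in [15] or [30].

```python
from collections import Counter
from fractions import Fraction

def ends(k):
    """Number of end vertices of a path of order k that can accept an edge."""
    return 1 if k == 1 else 2

def successors(node):
    """Weighted out-arcs of a node (cycles, paths), both sorted tuples."""
    cycles, paths = node
    counts = Counter()
    for i in range(len(paths)):
        if paths[i] >= 3:
            # Joining the two ends of a path of order >= 3 closes it into a cycle.
            rest = paths[:i] + paths[i + 1:]
            counts[(tuple(sorted(cycles + (paths[i],))), rest)] += 1
        for j in range(i + 1, len(paths)):
            # Joining end vertices of two different paths merges them.
            merged = paths[:i] + paths[i + 1:j] + paths[j + 1:] + (paths[i] + paths[j],)
            counts[(cycles, tuple(sorted(merged)))] += ends(paths[i]) * ends(paths[j])
    total = sum(counts.values())      # number of admissible labeled edges
    return {k: Fraction(c, total) for k, c in counts.items()}

def transition_digraph(n):
    """Return (arcs, prob): weighted arcs of D(n) and the visit probability of each node."""
    start = ((), (1,) * n)
    arcs, prob = {}, {start: Fraction(1)}
    frontier = [start]
    # Every arc raises the edge count by one, so nodes can be handled level by level.
    while frontier:
        next_level = []
        for node in frontier:
            arcs[node] = successors(node)
            for child, w in arcs[node].items():
                if child not in prob:
                    prob[child] = Fraction(0)
                    next_level.append(child)
                prob[child] += prob[node] * w
        frontier = next_level
    return arcs, prob

def terminal_distribution(n):
    """Exact P(m;n;2): m is the number of vertices of degree < 2 in the terminal graph."""
    arcs, prob = transition_digraph(n)
    dist = Counter()
    for node, out in arcs.items():
        if not out:                   # outdegree 0: an edge maximal 2-graph
            dist[sum(node[1])] += prob[node]
    return dict(dist)
```

With these definitions, `len(arcs)` is the order of D(n) and `sum(len(out) for out in arcs.values())` its size, so small cases can be compared against values computed by other means.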
By definition the order of D(n) is equal to the number of unlabeled f-graphs of order n. We have obtained the following results for f = 2 (see [30]).

Theorem 3.1: Let N(t, n) denote the number of unlabeled 2-graphs of size t and order n. Then,

\[ N(t, n) = N(t, n-1) + N(2t-n, t), \qquad t = 1, 2, \ldots, n-1, \quad n \ge 2, \]

with N(x, y) = 0 if x < 0 or x > y, N(0, n) = 1 if n ≥ 1, and N(n, n) = r(n, 3), where r(n, 3) is the number of partitions of n with no part less than 3.

Alternatively, the number N(t, n) can be expressed in terms of partitions.
Theorem 3.2: Let N(t, n) denote the number of unlabeled 2-graphs of size t and order n. Then,

\[ N(t, n) = \sum_{k=n-t}^{n} r(n-k, 3)\, s(k, n-t) \]
where r(n - k, 3) is the number of partitions of n - k with no part less than 3 and s(k, n - t) is the number of partitions of k into n - t parts, with the conventions r(0, 3) = 1, s(0, 0) = 1, and s(k, 0) = 0 if k ≠ 0.
The order N(n) of D(n) is obtained by summing N(t, n) over all levels t = 0 to t = n. Using this and either Theorem 3.1 or Theorem 3.2 we can compute exact values of N(n) and of the number N(t, n) of nodes on any specified level of D(n).
The determination of the size of D(n) is not as straightforward as that for order. Here we have the following:
Theorem 3.3: Let A(t, n ) denote the sum of the indegrees of the nodes on level t of D(n). Then,
where p_3(n - k) is a partition of n - k with no part less than 3, q^{n-t}(k) is a partition of k into n - t parts, a[p_3(n - k)] is the number of distinct parts of p_3(n - k), and b{q^{n-t}(k)} is the set of distinct parts of q^{n-t}(k).
Note that in Theorem 3.3, A(t, n) is computed using the following conventions: a[∅] = 0, b{∅} = ∅, q^{n-t}(0) = ∅, p_3(0) = ∅, q^0(k) = ∅, and if t = 0 and k ≠ n, then (p_3(n - k), q^n(k)) = ∅. Using this theorem or variations of it (see [30]) one can compute exact values for the size A(n) of D(n) and the number A(t, n) of arcs incident to the nodes on level t of D(n).
Some initial computations of N(n) and A(n) are obtained in [15], where, in addition, D(n) is displayed explicitly for up to n = 8 showing all of the 2-graphs associated with the nodes. Some open questions concerning the order and size of D(n) are the following.
Problem 7: Find exact closed form expressions for N(t, n), A(t, n), N(n), and A(n).
Problem 8: What are asymptotic expressions for N(t, n), A(t, n), N(n), and A(n)?

Obviously, all of the preceding can be considered for f > 2. In addition to investigating other properties of D(n) we have also started a study of U(n), the underlying graph of D(n). For example, although D(n) is clearly (weakly) connected and acyclic, the graph U(n), for n ≥ 4, has girth 4 and contains even cycles of various lengths. The cycle structure and other properties of U(n) for f ≥ 2 have been minimally explored at this time.
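Pending the closed forms asked for in Problem 7, the node counts N(t, n) of Theorems 3.1 and 3.2 are easy to tabulate. The sketch below implements both statements and lets them be checked against each other; the helper names `parts_at_least` (for r) and `parts_exact` (for s) are illustrative assumptions, and the standard partition recurrences used inside them are not taken from [30].

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def parts_at_least(n, least):
    """Number of partitions of n with every part >= least (1 when n == 0)."""
    if n == 0:
        return 1
    # Choose the smallest part p, then partition the remainder with parts >= p.
    return sum(parts_at_least(n - p, p) for p in range(least, n + 1))

@lru_cache(maxsize=None)
def parts_exact(k, j):
    """Number of partitions of k into exactly j parts."""
    if k == 0 and j == 0:
        return 1
    if k <= 0 or j <= 0:
        return 0
    # Either some part equals 1 (remove it) or all parts are >= 2 (shrink each by 1).
    return parts_exact(k - 1, j - 1) + parts_exact(k - j, j)

def N_partition(t, n):
    """Theorem 3.2: k counts the vertices lying on the n - t path components."""
    if t < 0 or t > n:
        return 0
    return sum(parts_at_least(n - k, 3) * parts_exact(k, n - t)
               for k in range(n - t, n + 1))

@lru_cache(maxsize=None)
def N_recurrence(t, n):
    """Theorem 3.1 together with its boundary conventions."""
    if t < 0 or t > n:
        return 0
    if t == 0:
        return 1
    if t == n:
        return parts_at_least(n, 3)
    return N_recurrence(t, n - 1) + N_recurrence(2 * t - n, t)

def order_of_D(n):
    """N(n): the sum of N(t, n) over all levels t = 0, ..., n."""
    return sum(N_partition(t, n) for t in range(n + 1))
```

For example, `[N_partition(t, 4) for t in range(5)]` lists the number of nodes on each level of D(4), and `order_of_D(n)` gives the order N(n).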
4. Conclusions

Our objective has been to list what is known about the RfGP and to suggest that problems about the RfGP for both finite n and for n going to infinity are equally challenging. As our closing problem consider the following:
Let G(n, t) denote the graph of order n and size t obtained via the RGP and G(n, p) the graph obtained by selecting each of the edges of K_n with independent probability p = t/C(n;2). There is a very useful equivalence between these graphs as described in [31]. In particular, under certain conditions the asymptotic properties of these two random graphs are very similar. In [11] (see p. 249) a general probability model is defined which we use here to define a random f-graph G_f(n, p) as follows.
Let M_f denote the set of all edge maximal f-graphs on n labeled vertices and P_f the probability distribution on M_f determined by the RfGP. (i) Let H be a graph selected from M_f with probability P_f(H), and then (ii) let G_f(n, p) have the same vertex set as H and edge set obtained by selecting each edge of H independently with probability p.

Problem 9: Does G_f(n, p) play the same role with respect to G_f(n, t) as G(n, p) does to G(n, t)?
The RfGP, the transition digraph D(n) and its underlying graph U(n) provide both probabilistic and graph theoretical contexts which are relatively unexplored considering that they are the source of challenging problems and applications.
Acknowledgement
Support of this work was provided in part by The Technical University of Poznań and Pace University research grants.
References
[1] P. Erdős and A. Rényi; On random graphs I, Publ. Math. Debrecen, 6, 290-297 (1959).
[2] P. Erdős and A. Rényi; On the evolution of random graphs, Publ. Math. Inst. Hungarian Acad. Sci., 5, 17-61 (1960).
[3] P. Erdős and A. Rényi; On the evolution of random graphs, Bull. Inter. Inst. Statistics, Tokyo, 38, 343-347 (1960).
[4] B. Bollobás; Random Graphs, Academic Press, New York (1985).
[5] E. Godehardt; Graphs as structural models (2nd Edition), Advances in System Analysis, 4, Vieweg & Sohn Verlagsgesellschaft mbH, Braunschweig (1990).
[6] E.M. Palmer; Graphical Evolution: An Introduction to the Theory of Random Graphs, Wiley Inter-Science Series in Discrete Mathematics, New York (1985).
[7] K.T. Balińska, E. Godehardt, and L.V. Quintas; When is the random f-graph process the random graph process? - to appear.
[8] F.S. Roberts; Discrete Mathematical Models, Prentice-Hall, Englewood Cliffs, New Jersey (1976).
[9] J.W. Kennedy; The random graph-like state of matter, Computer Applications in Chemistry, S.R. Heller and R. Potenzone, Jr. (editors), Analytical Chemistry Symposia Series, 15, Elsevier, Amsterdam, 151-178 (1983).
[10] K.T. Balińska and L.V. Quintas; Random graph models for physical systems, Graph Theory and Topology in Chemistry, Studies in Physical and Theoretical Chemistry, 51, Elsevier, Amsterdam, 349-361 (1987).
[11] J.W. Kennedy and L.V. Quintas; Probability models for random f-graphs, Combinatorial Mathematics (New York, 1985), Ann. N.Y. Acad. Sci., 555, 248-261 (1989).
[12] H. Galina and A. Szustalewicz; A kinetic approach to the network formation in an alternating stepwise copolymerization, Macromolecules, 23, 3833-3838 (1990).
[13] K.T. Balińska and L.V. Quintas; Generating random f-graphs: The equiprobable limit, Proceedings of the Fifth Caribbean Conference on Combinatorics and Computing (Barbados, 1988), University of the West Indies, 127-157 (1988).
[14] K.T. Balińska and L.V. Quintas; Edge maximal graphs with large bounded degree, Advances in Graph Theory, Vishwa International Publications, Gulbarga, India, 12-19 (1991).
[15] K.T. Balińska and L.V. Quintas; The algorithm IMAGE for the random 2-graph process, Computer Science Center Report No. 334, The Technical University of Poznań (1990).
[16] K.T. Balińska and L.V. Quintas; On generating random f-graphs, Graph Theory Notes of New York, XII, New York Academy of Sciences, 22-24 (1986).
[17] K.T. Balińska and L.V. Quintas; The sequential generation of random f-graphs: Preliminaries, an algorithm, and line maximal 2-, 3-, and 4-graphs (presented in Poznań, April 1988), Mathematics Department, Pace University, New York (1988).
[18] K.T. Balińska and L.V. Quintas; The sequential generation of random f-graphs: Line maximal 2-, 3-, and 4-graphs, Computers & Chemistry, 14, 323-328 (1990).
[19] K.T. Balińska and L.V. Quintas; The sequential generation of random f-graphs: Distributions and predominant types of line maximal f-graphs for f > 4, Combinatorics and Algorithms (Jerusalem, November 1988), Discrete Math. (in press).
[20] K.T. Balińska and L.V. Quintas; The sequential generation of random f-graphs: Distributions of edge maximal f-graphs as functions of f, Computer Science Center Report No. 316, The Technical University of Poznań (1989); Revised (1990).
[21] K.T. Balińska and L.V. Quintas; The sequential generation of random edge maximal f-graphs as a function of f, J. Math. Chem., 8, 39-51 (1991).
[22] A. Ruciński; Maximal graphs with bounded maximum degree: Structure, asymptotic enumeration, randomness, Proc. III of the 7th Fischland Colloquium, Rostock Math. Kolloq., 41, 47-58 (1990).
[23] P. Szmanda; The algorithm DOV for the random 2-graph process, Computer Science Center Report No. 335, The Technical University of Poznań (1990).
[24] K.T. Balińska and P. Szmanda; An algorithm concerning the distribution of orexic vertices in the random 2-graph process, Graph Theory Notes of New York, XX, New York Academy of Sciences, 28-33 (1991).
[25] K.T. Balińska and L.V. Quintas; Big cycles in random edge maximal 2-graphs - to appear.
[26] P. Erdős and J.W. Kennedy; k-connectivity in random graphs, Europ. J. Combinatorics, 8, 281-286 (1987).
[27] K.T. Balińska, H. Galina, and L.V. Quintas; A kinetic approach to the random 2-graph process - to appear.
[28] P. Flajolet, D.E. Knuth and B. Pittel; The first cycles in an evolving graph, Discrete Math., 75, 167-215 (1989).
[29] K.T. Balińska and L.V. Quintas; Problem: The sequential generation of random edge maximal f-graphs, Random Graphs '89, John Wiley & Sons, New York (in press).
[30] K.T. Balińska and L.V. Quintas; Properties of the transition digraph of the random 2-graph process - to appear.
[31] T. Łuczak; On the equivalence of two basic models of random graphs, Random Graphs '87, John Wiley & Sons, New York, 151-157 (1990).
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 341-348 (1993) © 1993 Elsevier Science Publishers B.V. All rights reserved.
QUO VADIS, RANDOM GRAPH THEORY? Edgar M. PALMER Department of Mathematics, Michigan State University East Lansing, Michigan, U.S.A.
Abstract We trace the growth of the random graph theory from its inception with the fundamental papers of Erdős and Rényi in 1959-60 to the present, covering a thirty year period that has just seen the appearance of the first issue of a journal devoted to this subject, namely the Journal of Random Structures & Algorithms. Necessarily our treatment is sketchy but we mention some of the outstanding results, applications and methods discovered during this time. Although the area has been nicely developed, it is still only a beginning for the theory of random graphs. We conclude with several unsolved problems including some that involve the Reconstruction Conjecture, the Isomorphism Problem, Chromatic Number, and Hamiltonian Cycles.
1. A Bit of History
Here is a list of dates that indicate some of the important steps in the development of the theory of random graphs.
1959 The first two papers on the theory of random graphs appeared. They were both concerned with the probability of connectedness. Erdős and Rényi [1] found their first threshold and Gilbert [2] established a recurrence relation.
1960 The block-buster paper of the founding fathers on the evolution of random graphs [3]. During the sixties some more nice work was performed by Erdős and Rényi on connectivity [4] and matchings [5] but only a few other papers appeared.
1970 Lift-off! The subject takes off and we have the beginning of two decades of discovery and several hundred research papers.
1983 The first of the Poznań Conferences is held in the summer at the Mathematical Institute of Adam Mickiewicz University. These conferences in odd numbered years provided an opportunity for random graph theorists to collaborate and get the latest news on fast breaking research developments. Also, Mathematical Reviews gave the field its own subject classification! The theory of random graphs is known as 05C80.
1985 The first books appeared that were devoted entirely to the subject. The introductory text [6] is for advanced undergraduates and beginning graduate students and the definitive treatise of Béla Bollobás [7] provides a superlative cornerstone for every researcher. For a treatment of the probabilistic method, the Durango lectures of Joel Spencer are unmatched [8].
1990 The first issue of the first journal devoted to random graphs came off the press. The Journal of Random Structures & Algorithms is published by John Wiley & Sons. It is edited by Michał Karoński and Joel Spencer with managing editor Andrzej Ruciński and contains wonderful articles of great ingenuity, originality and depth.
2. Introduction

Our sample space consists of all labeled graphs G with n vertices. The vertex set of G is V(G) = {1, 2, ..., n} and the edge set is E(G). Given the edge probability 0 < p < 1, the probability of a graph G with M edges is defined by

\[ P(G) = p^{M} (1-p)^{N-M}, \]

where $N = \binom{n}{2}$ is the number of slots available for edges. Thus the sample space consists of Bernoulli trials and the edges are selected independently with probability p. Suppose A is a set of graphs of order n with some specified property Q. If the probability P(A) approaches 1 as n goes to infinity, then we say that almost all graphs have property Q or the random graph has property Q a.s. (almost surely).
At Michigan State University two of our computer scientists, Abdol Esfahanian and Guy Zimmerman, have constructed a Graph Manipulation Package, or GMP, which operates on the Sun™ stations in the computer laboratory of the Mathematics Department. One can create graphs from specified families, including random graphs with given edge probability. The graphs can be manipulated in a number of ways such as by vertex or edge addition/deletion. Certain algorithms can be called on to test for planarity and to produce spanning trees, matchings and Hamiltonian cycles. Customized algorithms may be incorporated with the programming language C.

3. A Few Milestones
3.1 Thresholds
If one creates a random graph from the GMP with n = 50 vertices and edge probability p = .50, the resulting structure will have so many edges that it hardly seems necessary to run the spanning tree algorithm to test for connectedness. As the edge probability drops to .4, .3, .2, however, the edges thin out and when p is just below .2, we cannot be so certain that the graph generated will be connected. At p = .1, there always seems to be at least one isolated vertex. How can the probability of connectedness be calculated? Is there a sharp drop in its value as p decreases? For what values of p, as a function of n, does this drop occur? These were some of the questions answered in the first papers of 1959 mentioned above.
Let C be the set of connected graphs of order n and denote by P_n = P(C) the probability of connectedness for graphs of order n. For example

\[ P_3 = 3p^2(1-p) + p^3, \tag{3.1} \]

because there are 3 labeled connected graphs with three vertices and two edges and 1 with three vertices and three edges. One could compute P_n = P(C) from enumeration formulas for the number of connected graphs with a given number of vertices and edges. Indeed, Gilbert and Hamming used the generating functions of Riddell and Uhlenbeck [9] to calculate values for n ≤ 6 and p = .9, .7, .5, .3, .1, but it seems much easier to use Gilbert's recurrence relation.

Theorem 3.1: (Gilbert, 1959)

\[ 1 = \sum_{k=1}^{n} \binom{n-1}{k-1} P_k\, (1-p)^{k(n-k)} \tag{3.2} \]
The proof is made by observing that each summand is the probability that the vertex with label 1 belongs to a component of order k. The binomial coefficient is the number of ways to select the other k - 1 labels for that component, the term P_k ensures that it is indeed connected, while the remaining factor (1 - p)^{k(n-k)} is the probability that there are no edges from vertices of the component to any other vertices. Note that we rely heavily on the multiplication principle which says roughly that when edges are present we multiply by p and for those that are absent we multiply by (1 - p). For example if vertex number 1776 is adjacent to vertex 1984 but 1984 is not adjacent to 1990, then the probability of all the graphs of order n which satisfy these conditions is p(1 - p). Of course this follows from the independence of adjacency in our probability model.
The relation (3.2) can be used to calculate P_n for small values of n. The double precision Fortran program on my office PC is effective for n up to about 200. For n = 200 and p = .05, we find that P_n = .99 but when p = .02 the probability of connectedness drops all the way down to just P_n = .026. It is a good exercise for students to try this experiment because they will learn how to deal with fairly large binomial coefficients and may discover the serious consequences of round-off error. One should begin by comparing one's results with those of Hamming and Gilbert! And then see if P_200 = .383 267 5554 ..., when p = .026 4916 ... . For the ultimate tabulation of P_n see [10].
Gilbert also estimated P_n = P(C) asymptotically for fixed p.

Theorem 3.2: (Gilbert, 1959)
For fixed edge probability p,

\[ P_n = P(C) \to 1, \tag{3.3} \]
i.e. almost all graphs are connected.
Here are a couple of simple hints for the proof. We just need to show that the sum on the right side of (3.2) for k = 1 to n - 1 goes to zero as n approaches infinity. Replace the binomial coefficient by $\binom{n}{k}$, replace the probability P_k by 1 and observe that the new sum is end symmetric so that we need only show that it goes to zero for k = 1 to n/2. Finish the job by using the fact that n - k ≥ n/2 and approximate the sum with a geometric series.
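The recurrence (3.2) translates directly into a short program. The following is a minimal Python sketch, not the Fortran program mentioned in the text; the function name is an illustrative assumption. Python integers are exact, so the binomial coefficients themselves do not overflow, although the rest of the arithmetic is ordinary double precision and the conversion of a binomial coefficient to floating point will eventually fail for very large n.

```python
from math import comb, log

def connectedness_probabilities(n_max, p):
    """Return [P_1, ..., P_{n_max}] computed from Gilbert's recurrence (3.2)."""
    q = 1.0 - p
    P = [0.0, 1.0]            # P[0] is unused padding; P_1 = 1
    for n in range(2, n_max + 1):
        # Probability that vertex 1 lies in a component of order k < n.
        smaller = sum(
            comb(n - 1, k - 1) * P[k] * q ** (k * (n - k))
            for k in range(1, n)
        )
        # The summand with k = n is P_n itself, so P_n = 1 - (all other summands).
        P.append(1.0 - smaller)
    return P[1:]

if __name__ == "__main__":
    for p in (0.05, log(200) / 200, 0.02):
        print(p, connectedness_probabilities(200, p)[-1])
```

The three edge probabilities in the usage lines are the ones discussed in the text for n = 200, so the printed values can be set beside the figures quoted there.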
As for the steep drop in the value of P_n = P(C) and the value of p where it occurs, the investigation is much more involved. Using probabilistic methods and a slightly different probability model, Erdős and Rényi found this drop was quite predictable.

Theorem 3.3: (Erdős and Rényi, 1959)
For constant c and p = c log n / n, if 0 < c < 1, almost all graphs have isolated vertices; if c > 1, almost all graphs are connected. For p = (log n + x + o(1))/n,

\[ P(C) \to e^{-e^{-x}}. \tag{3.4} \]
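The double exponential in (3.4) can be motivated by counting isolated vertices. What follows is a heuristic sketch, not the argument of [1]; the symbol X, the number of isolated vertices, is introduced here only for illustration.

```latex
\begin{align*}
\mathbb{E}[X] &= n(1-p)^{n-1}, \qquad p = \frac{\log n + x + o(1)}{n},\\
\log \mathbb{E}[X] &= \log n + (n-1)\log(1-p)
                    = \log n - np + O\!\left(p + np^{2}\right)
                    = -x + o(1),\\
\mathbb{E}[X] &\to e^{-x}.
\end{align*}
```

If X is taken to be asymptotically Poisson with mean e^{-x}, then P(X = 0) tends to exp(-e^{-x}). The remaining step, which is where the real work lies, is to show that at this edge probability a graph with no isolated vertices is a.s. connected.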
Thus for the property of connectedness there is a critical level, called the threshold function, such that if p is bigger than the threshold, almost all graphs are connected, while if p is below the threshold, almost all graphs are not connected. This was the greatest discovery of the founding fathers. And they observed many other important instances of this phenomenon in their long paper [3] and in the sequels [4] and [5]. Later Bollobás and Thomason [11] found that thresholds existed for all monotone graph properties. A property is monotone if whenever a graph G has the property, so does any graph obtained by adding an edge to G.

3.2 Hamiltonian Cycles
How large should the edge probability be so that almost all graphs have a Hamiltonian or spanning cycle? This was an important question raised but not answered by Erdős and Rényi. It generated much important work by many researchers including Angluin and Valiant, Bollobás, Fenner, Frieze, Komlós and Szemerédi, Korshunov, Pósa and Wright. The big breakthrough came in 1976 from Korshunov [12] and Pósa [13].

Theorem 3.4: (Korshunov and Pósa, 1976)
For sufficiently large constant c > 1 and p = (c log n)/n, almost all graphs are Hamiltonian.
In the proof, there is an algorithm that tries to extend a long path. When the path cannot be extended because all the neighbors u of the end vertex v have already been visited, there is a clever way in which an edge of the path incident with u can be deleted and the edge uv can be added to keep the search going for a long path. Variations of this algorithm are extremely effective for finding Hamiltonian cycles in graphs. Many improvements have been found for Pósa's theorem and algorithm, and one should consult the book [7] for more details.

3.3 Hitting Times
Also in 1976 Richard Karp reported the outcome of some remarkable experiments undertaken by MacGregor [141. With n = 500 vertices, edges were added one at a time so that each of the remaining empty slots had the same chance to receive an edge, until the minimum degree 6 was exactly 2. Then the graph so constructed was tested for a spanning cycle using Wsa's algorithm. In a total of 60 trials, each graph had a spanning cycle! In 57 of the graphs, the algorithm used found the spanning cycle immediately. In the other 3 cases, after randomizing the vertices, the algorithm found the cycle quickly. Thus was born the notion of a random graph process and the hitting time of a graph property. A random graph process, denoted is a sequence of graphs Go c G, c ... c GN where Go has no edges and G N is the complete graph and each Gi has exactly one more edge than its predecessor Gi - for i = 1 to N . A probability space is made from these by letting them be equally likely, i.e. the probability of any one is U N ! .If Gi has a spanning cycle but Gi - 1 does not, then the hitting time for a spanning cycle (property Ham) is i and we denote this by z(g; Ham). Similarly if Gi has minimum degree 2 but G i - 1 does not, then i is the hitting time for minimum degree 2 and this is also denoted by z( 5;6 2 2). Naturally z( Ham) 2 z( 6 2 2), because a Hamiltonian graph must have minimum degree 2. Several years later the proof amved which showed that these times were almost always identical!
Theorem 3.5: (Bollobás, 1984)
For the random graph process $\tilde{G}$,
$$\tau(\tilde{G}; \mathrm{Ham}) = \tau(\tilde{G}; \delta \ge 2) \quad \text{a.s.}$$
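The two hitting times can be observed directly on very small graphs. The following is a minimal sketch, not the procedure used in the experiments cited above: it assumes a brute-force backtracking test for a spanning cycle (feasible only for small $n$), and the function names `has_hamiltonian_cycle` and `hitting_times` are illustrative choices.

```python
import itertools
import random


def has_hamiltonian_cycle(n, adj):
    """Backtracking search for a spanning cycle; only sensible for small n."""
    if n < 3:
        return False
    path = [0]
    visited = [False] * n
    visited[0] = True

    def extend():
        if len(path) == n:
            # Close the cycle back to the start vertex.
            return 0 in adj[path[-1]]
        for w in adj[path[-1]]:
            if not visited[w]:
                visited[w] = True
                path.append(w)
                if extend():
                    return True
                path.pop()
                visited[w] = False
        return False

    return extend()


def hitting_times(n, rng):
    """Run one random graph process on n vertices.

    Returns (tau_deg2, tau_ham): the number of edges present when the
    minimum degree first reaches 2 and when a spanning cycle first appears.
    """
    edges = list(itertools.combinations(range(n), 2))
    rng.shuffle(edges)  # a uniformly random ordering of all N edges
    adj = [set() for _ in range(n)]
    tau_deg2 = tau_ham = None
    for i, (u, v) in enumerate(edges, start=1):
        adj[u].add(v)
        adj[v].add(u)
        if tau_deg2 is None and min(len(a) for a in adj) >= 2:
            tau_deg2 = i
        # A spanning cycle is impossible before the minimum degree reaches 2.
        if tau_deg2 is not None and has_hamiltonian_cycle(n, adj):
            tau_ham = i
            break
    return tau_deg2, tau_ham
```

Calling `hitting_times(8, random.Random(0))` repeatedly and counting how often the two values agree gives a rough feeling for the theorem, although for such small $n$ the two times need not coincide.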
The details of the proof are available in the book [7], where one can also find similar results by Bollobás and Thomason for matchings and connectivity. To see how nicely behaved these processes are, look at [15] for even more surprising results.

3.4 Chromatic Number
The determination of the chromatic number of a random graph remained a puzzle until
fairly recently. In the seventies Grimmett and McDiarmid [16] proved the following theorem, which was subsequently improved by Bollobás and Erdős [17].
Theorem 3.6: (Grimmett and McDiarmid, 1975)
For $p$ fixed and $b = 1/(1-p)$, almost all $G$ have
$$(1/2 - \varepsilon)\, n / \log_b n \le \chi(G) \le (1 + \varepsilon)\, n / \log_b n. \qquad (3.5)$$
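The upper bound in (3.5) rests on the greedy coloring algorithm. As a hedged illustration, the sketch below colors a random graph greedily in the fixed vertex order $0, 1, \ldots, n-1$ and prints the reference value $n / \log_b n$; the helper names `bernoulli_graph` and `greedy_coloring` and the choice of vertex order are assumptions made for this example only.

```python
import math
import random


def bernoulli_graph(n, p, rng):
    """Adjacency sets of a random graph in which each edge appears with probability p."""
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def greedy_coloring(adj):
    """Give each vertex, in order, the smallest color unused by its earlier neighbors."""
    colors = []
    for v in range(len(adj)):
        used = {colors[u] for u in adj[v] if u < v}
        c = 0
        while c in used:
            c += 1
        colors.append(c)
    return colors


if __name__ == "__main__":
    n, p = 500, 0.5
    graph = bernoulli_graph(n, p, random.Random(0))
    used_colors = max(greedy_coloring(graph)) + 1
    b = 1.0 / (1.0 - p)
    print("greedy colors:", used_colors)
    print("n / log_b n  :", n / math.log(n, b))
```

The number of colors used by the greedy procedure can then be compared with the two bounds of the theorem for moderate values of $n$.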
For the lower bound one uses the fact that the chromatic number is at least as large as $n/\beta$, where $\beta$ denotes the independence number. To establish the upper bound, Grimmett and McDiarmid used the greedy coloring algorithm. See [6] for more details. The feeling in the random graph community was that the lower bound in (3.5) was the right order of magnitude, but it seemed hopeless to expect that the greedy algorithm could be improved to color in half as many colors. The breakthrough came in the paper of Shamir and Spencer [18] with the application of the Doob martingale process to random graph coloring, and in the subsequent paper of Bollobás [19], which showed that the factor $(1 + \varepsilon)$ in the upper bound of (3.5) could be replaced by $(1/2 + \varepsilon)$ for a wide range of values of $p$. Alan Frieze made good use of this new method when he improved the bounds on the independence number of random graphs [20]. I would like to thank the referee for pointing out that Łuczak has recently established that for $p \ge c/n$, $c \ge c_0$, the random graph has chromatic number roughly $n/\beta$, as expected. And Frieze and Łuczak have extended the results on $\beta$ and $\chi$ to random regular graphs of degree $r \ge r_0$.

3.5 Asymptotic Analysis
Finally I would like to call attention to a beautiful paper [21] of Bender, Canfield and McKay from which Theorem 3.3 can be derived. It is motivated by the question: what is a good estimate of the probability that a graph is connected when the edge probability is so low that the graph is a.s. not connected? Let $c(n,q)$ be the number of connected labeled graphs of order $n$ with exactly $q$ edges. Let $x = q/n$ and $k = q - n$. The function $y = y(x) > 0$ is defined implicitly by
$$2xy = \log\frac{1+y}{1-y} \quad \text{for } 1 < x < \infty.$$
Theorem 3.7: (Bender, Canfield and McKay, 1990)
The probability of connectedness for graphs of order $n$ and size $q$ is
$$c(n,q) \Big/ \binom{N}{q} = e^{a(x)} f(x,y)^n \left[ 1 + O(1/k) + O(k^{1/16}/n^{9/50}) \right] \qquad (3.6)$$
as $n \to \infty$, uniformly for ... (3.7) and
$$a(x) = x(x+1)(1-y) + \log(1 - x + xy) - \tfrac{1}{2}\log(1 - x + xy^2).$$
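The implicit definition of $y(x)$ is easy to evaluate numerically. The sketch below is only an illustration under stated assumptions: it uses plain bisection on $g(y) = \log\frac{1+y}{1-y} - 2xy$, which is negative just above $0$ and positive near $1$ when $x > 1$, and the function name `implicit_y` and the bracketing interval are choices made here, not taken from [21].

```python
import math


def implicit_y(x, tol=1e-12):
    """Positive solution y of 2*x*y = log((1 + y) / (1 - y)) for x > 1, by bisection."""
    if x <= 1:
        raise ValueError("x must exceed 1")

    def g(y):
        # log((1 + y) / (1 - y)) equals 2 * atanh(y)
        return 2.0 * math.atanh(y) - 2.0 * x * y

    lo, hi = 1e-9, 1.0 - 1e-15
    if not (g(lo) < 0.0 < g(hi)):
        # Happens when x is extremely close to 1 or too large for double precision.
        raise ValueError("root not bracketed in floating point for this x")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, `implicit_y(2.0)` returns the value of $y$ corresponding to graphs with twice as many edges as vertices.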
The proof uses the well-known recurrence relation for graphs rooted at an edge and requires 40 pages of careful analysis!

4. Selected Unsolved Problems
4.1 Reconstruction Problem
Pyber [22] has just shown that if $n$ is sufficiently large, about 240, a Hamiltonian graph of order $n$ is edge-reconstructible. Since random graphs with minimum degree 2 are a.s. Hamiltonian, these are also a.s. edge-reconstructible. What about random graphs with minimum degree 1? Or random graphs that consist of a giant component and perhaps just some isolated vertices? More specifically, are almost all graphs edge-reconstructible if
(a) $p = (\log n + \omega_n)/n$, or
(b) $p = (\log n - \omega_n)/n$, or
(c) $p = (\log n)/n$,
where $\omega_n \to \infty$ and is much smaller than $\log n$?
As for vertex reconstruction, Bollobás has shown [23] that if $p = c \log n / n$ with constant $c > 1$, then almost all graphs can be reconstructed from any three vertex-deleted subgraphs! But what if $p$ is given by equation (a) above? A tree is starlike if it can be obtained from $K_{1,h}$, $h \ge 3$, by subdividing each edge at most once. Harary and Lauri [24] have conjectured that if a tree is not starlike, then its class reconstruction number is 2, i.e. among all trees it is uniquely determined by just 2 vertex-deleted subtrees. Is this true for almost all trees?
4.2 Isomorphism Problem
Is there a polynomial time algorithm that will compare any two graphs and decide whether or not they are isomorphic? No one knows. Babai, Erdős and Selkow [25] found a beautiful, simple, fast algorithm that works for random graphs. But it only works if the degrees of the vertices are nicely spread out. What if
(d) $p = c \log n / n$ with constant $c > 0$, or
(e) $p = \omega_n \log n / n$ with $\omega_n \to \infty$ arbitrarily slowly?
4.3 Chromatic Number Problem
There are still chromatic number problems for random graphs. For example, what is the edge-chromatic number $\chi'(G)$ for $p = c \log n / n$? For larger values of $p$, say (e) above, it is known (see [7], p. 64) that a random graph has a unique vertex of maximum degree, and so by a theorem of Vizing the edge-chromatic number of such a graph is equal to its maximum degree. Now consider a random $r$-regular graph $G$. It follows from Brooks' theorem that when $r = 3$, $\chi(G)$ is at most 3. But the independence ratio $\beta/n$ is a.s. at most 6/13 (see [7] again, p. 277), so $\chi(G) \ge n/\beta > 2$ and hence $\chi(G) = 3$ a.s. But what is $\chi(G)$ for a random $r$-regular graph when $r \ge 4$? From [7] (p. 277) we only learn that $r/(2 \log r) \le \chi(G) \le r$ a.s. In the Poznań Proceedings of 1985, Alan Frieze asked for $\chi'(G)$ for random $r$-regular graphs. In another variation, suppose we fix $m \ge 3$, and let the sample space consist of all $G$ of order $n$ and maximum degree $m$. What is the behavior of $P(\chi'(G) = m)$?
There is also the problem of finding the exact threshold for 3-colorability. The best result so far is by Chvátal [26], who has shown that almost all graphs with at most $1.44n$ edges are 3-colorable.

4.4 Hamiltonian Cycle Problem
The probability model $r$-out is defined as follows. Each of the $n$ vertices $v$ has exactly $r$ arcs from $v$ to the other vertices, selected at random without replacement. Hence there are $\binom{n-1}{r}^n$ digraphs in the sample space so far. Each is regular with out-degree $r$. Now make each one into a graph by simply ignoring the orientation of the arcs and converting each symmetric pair to a single edge. Frieze and Łuczak [27] have shown that when $r$ is at least 5, a graph in $r$-out is a.s. Hamiltonian. The consensus is that this is also true for $r = 3$ and 4, but there is no proof yet. Robinson and Wormald have shown that almost all $r$-regular graphs have a spanning cycle for fixed $r \ge 3$. But they used a non-constructive method and hence there is no efficient algorithm known for these.

5. Conclusions
This is a rich subject with a respectable and fascinating history. There are many beautiful theorems and a substantial body of knowledge has been built. With the research text of Béla Bollobás and the new journal to support the field, and plenty of unsolved problems, we should expect more giant strides in the future. Here are some comments by leading practitioners:
Chung and Graham (on quasi-random graphs): “the surface of this interesting topic has thus far only been scratched.”
Joel Spencer: “Much detailed study has been made of the evolution at the Big Bang but much more needs to be done.”
Dick Karp: “random digraphs have been studied very little.”
Béla Bollobás: “this is only a beginning ... from which we can learn a variety of techniques and find out what kind of results we should try to prove about more complicated random structures.”
References
[1] P. Erdős and A. Rényi; On random graphs I, Publ. Math. Debrecen, 6, 290-297 (1959).
[2] E.N. Gilbert; Random graphs, Ann. Math. Stat., 30, 1141-1144 (1959).
[3] P. Erdős and A. Rényi; On the evolution of random graphs, Magyar Tud. Akad. Mat. Kutató Int. Közl., 5, 17-61 (1960).
[4] P. Erdős and A. Rényi; On the strength of connectedness of a random graph, Acta Math. Acad. Sci. Hungar., 12, 261-267 (1961).
[5] P. Erdős and A. Rényi; On the existence of a factor of degree one of a connected random graph, Acta Math. Acad. Sci. Hungar., 17, 359-368 (1966).
[6] E.M. Palmer; Graphical Evolution: An Introduction to the Theory of Random Graphs, Wiley Inter-Science Series in Discrete Mathematics, New York (1985).
[7] B. Bollobás; Random Graphs, Academic, London (1985).
[8] J. Spencer; Ten lectures on the probabilistic method, SIAM, Philadelphia (1987).
[9] R.J. Riddell, Jr. and G.E. Uhlenbeck; On the virial development of the equation of state of monoatomic gases, J. Chem. Phys., 21, 2056-2064 (1953).
[10] B. Bollobás and A.G. Thomason; Random graphs of small order, Annals Discrete Math., 28, 47-97 (1985).
[11] B. Bollobás and A.G. Thomason; Threshold functions, Combinatorica, 7, 35-38 (1986).
[12] A.D. Korshunov; Solution of a problem of Erdős and Rényi about Hamilton cycles in non-oriented graphs, Soviet Mat. Dokl., 17, 760-764 (1976).
[13] L. Pósa; Hamiltonian circuits in random graphs, Discrete Math., 14, 359-364 (1976).
[14] R.M. Karp; The probabilistic analysis of some combinatorial search algorithms, Algorithms and Complexity, J.F. Traub (editor), Academic Press, New York, 1-19 (1976).
[15] B. Bollobás and A.M. Frieze; On matchings and Hamiltonian cycles in random graphs, Annals of Discrete Math., 28, 23-46 (1985).
[16] G.R. Grimmett and C.J.H. McDiarmid; On colouring random graphs, Math. Proc. Cambridge Philos. Soc., 77, 313-324 (1975).
[17] B. Bollobás and P. Erdős; Cliques in random graphs, Math. Proc. Cambridge Philos. Soc., 80, 419-427 (1976).
[18] E. Shamir and J. Spencer; Sharp concentration of the chromatic number on random graphs $G_{n,p}$, Combinatorica, 7, 121-129 (1987).
[19] B. Bollobás; The chromatic number of random graphs, Combinatorica, 8, 49-55 (1988).
[20] A.M. Frieze; On the independence number of random graphs, Discrete Math., 81, 171-175 (1990).
[21] E.A. Bender, E.R. Canfield, and B.D. McKay; The asymptotic number of labeled connected graphs with a given number of vertices and edges, J. Random Structures & Algorithms, 1, 127-169 (1990).
[22] L. Pyber; The edge reconstruction of Hamiltonian graphs, J. Graph Theory, 14, 173-179 (1990).
[23] B. Bollobás; Almost every graph has reconstruction number three, J. Graph Theory, 14, 1-4 (1990).
[24] F. Harary and J. Lauri; On the class reconstruction number of trees, Quart. J. Math., 39, 47-60 (1988).
[25] L. Babai, P. Erdős and S.M. Selkow; Random graph isomorphism, SIAM J. Comput., 9, 628-635 (1980).
[26] V. Chvátal; Almost all graphs with 1.44n edges are 3-colorable, J. Random Structures & Algorithms, 2, 11-28 (1991).
[27] A.M. Frieze and T. Łuczak; Hamiltonian cycles in a class of random graphs: one step further - to appear.
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 349-366 (1993) © 1993 Elsevier Science Publishers B.V. All rights reserved.
EXPLORATORY STATISTICAL ANALYSIS OF NETWORKS Ove FRANK Department of Statistics, Stockholm University Stockholm, SWEDEN
Krzysztof NOWICKI Department of Statistics, University of Lund Lund, SWEDEN
Abstract
We review standard multivariate statistical methods useful for exploring network data and discuss various problems related to statistical analysis and modeling. General methods are suggested for three main problem areas, namely whether there is a need for block models, whether there is dependence between dyads, and whether there is dependence between different networks. In particular, we illustrate the use of logit regression analysis in order to fit log-linear models. We comment on various themes in the literature that are important for future research on statistical graph modeling.
1. Exploring Network Data
Network data consist of attribute and relationship data on a set of individuals. Typically, we observe many different attributes and kinds of pairwise relationships, and thus we have multivariate data referring to individuals as well as to pairs of individuals. Essential aspects of such multivariate data can be described by graphs and multigraphs. Random graph theory owes much of its development to the attempts being made to model uncertainty in networks. Uncertainty due to sampling variation, measurement errors and other inaccuracies necessitates the use of families of random graphs that depend on parameters that can be interpreted as quantities governing or controlling the outcomes of the random graphs. For example, in the exponential family of directed graphs introduced by Holland and Leinhardt [1], each vertex is characterized by two parameters governing the outcomes of out- and in-degree, while two overall parameters govern reciprocity and density. In order to find an appropriate family of random graphs for a certain application, there is a need for exploratory and confirmatory statistical methods by which empirical network data can be analyzed and fitted models evaluated. Such work might require special statistical methods but can also benefit from standard multivariate statistical program packages. Special computer software has been developed to analyze particular network models like, for instance, the Holland-Leinhardt model [1]. The approach of using log-linear analysis of multiway frequency tables with network data applied by Fienberg and Wasserman [2] and Fienberg, Meyer and Wasserman [3] is an example of the usefulness of standard multivariate computer packages for the exploratory analysis of network data. A few other references to exploratory network analysis are given by [4]-[8]. The possibility of using easily available statistical software for network analysis has great potential.
Simple tools are suggested in the following for exploring and modeling the statistical structure of network data encountered in various applications. We emphasize and illustrate these ideas by discussing in a fairly general way the choice of appropriate variables and units of analysis and the application of standard multivariate statistical techniques in order to find
useful models for describing networks and explaining their structure. To that end we focus on variation or change of various kinds in a network. Variation in the outcomes of individual statistics can be caused by some inhomogeneity that should be explicit in the model, or it can be caused just by random variation according to a model with individual homogeneity. Variation in the outcomes of dyad statistics, i.e. statistics referring to pairs of individuals, can be caused by some structural dependencies that should be expressed by the model, or it can be caused merely by random variation according to a model with dyad independence. More generally, we could consider variation in the outcomes of triad statistics or other statistics referring to more than two individuals, but such variation is more difficult to relate to a plausible model, unless prior information is available that suggests specific model assumptions.
If a network changes with time, then either there is a need for a non-stationary model or the changes are considered as random fluctuations in a stationary model. Long series of networks are usually required to obtain statistical information about changes. Short series might suffice if attention is restricted to simple summary statistics of the networks. Special attention is due to the frequency distributions of vertex statistics, dyad statistics, triad statistics, and other statistics referring to only a few individuals.
The next section introduces some terminology and notations and, in particular, systematizes the kind of statistics we need for an exploratory analysis. Section 3 reviews some basic random graph models that are later used for illustrative purposes. Section 4 discusses how cluster analysis can be used to study the effects of individual heterogeneity. The analysis of dyad statistics is discussed in Section 5. A general method for using logistic regression analysis in graph modeling is described. Section 6 considers time series analysis of graph statistics. For large graphs, the practical problems involved in finding good ways of plotting a graph can sometimes be hard, and some suggestions are given in Section 7. Section 7 also describes a real data set consisting of a sequence of social networks, and Section 8 applies some of the suggested exploratory techniques to these data. A few concluding comments on exploratory analysis and modeling of network structures are given in Section 9.

2. Preliminaries
A network on $n$ individuals is specified by a matrix $z = (z_{ij})$, where the diagonal entries $z_{ii}$ are attribute vectors characterizing the individuals $i$, and the off-diagonal entries $z_{ij}$ are attribute or relationship vectors characterizing pairs of distinct individuals $(i, j)$.
For instance, $z_{ii}$ can be a two-vector giving the gender and age of individual $i$, and $z_{ij}$ can be a two-vector giving the duration and strength of a certain type of contact from individual $i$ to individual $j$. In the simplest case with no individual attributes and just one single symmetric relationship, the matrix $z$ can be taken as the adjacency matrix of an undirected graph; that is, $z_{ii} = 0$ and $z_{ij} = z_{ji}$ is an indicator of the occurrence of the relationship between individuals $i$ and $j$. Any combination of characteristics of individual $i$ that can be derived from the matrix $z$ will be referred to as individual statistics and denoted by $x_i$. Thus, provided the components of the vectors $z_{ij}$ are numerical and can be added, $x_i$ can, for instance, be defined according to
$$x_i = \Big( z_{ii},\ \sum_{j \ne i} z_{ij},\ \sum_{j \ne i} z_{ji} \Big).$$
In the case of an adjacency matrix $z$ of an undirected graph, the two-vector
$$x_i = \Big( \sum_{j} z_{ij},\ \sum_{j \ne i} (1 - z_{ji}) \max_k z_{ik} z_{jk} \Big)$$
characterizes individual $i$ by its numbers of neighbors at distance 1 and 2.
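For an adjacency matrix, the individual statistics just described can be computed directly. The following is a minimal sketch, assuming $z$ is given as a symmetric 0/1 list of lists with zero diagonal; the function name `individual_statistics` is an illustrative choice.

```python
def individual_statistics(z):
    """For each vertex i return (neighbors at distance 1, vertices at distance 2)."""
    n = len(z)
    stats = []
    for i in range(n):
        degree = sum(z[i])
        # j is at distance 2 from i if it is not adjacent to i
        # but shares at least one neighbor k with i.
        at_distance_two = sum(
            1
            for j in range(n)
            if j != i
            and z[i][j] == 0
            and any(z[i][k] and z[j][k] for k in range(n))
        )
        stats.append((degree, at_distance_two))
    return stats
```

On the path with edges 0-1, 1-2 and 2-3, every vertex has exactly one vertex at distance 2, while the degrees are 1, 2, 2 and 1.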
Any combination of characteristics of the two individuals $i$ and $j$ that can be derived from the matrix $z$ will be referred to as dyad statistics and will be denoted by $x_{ij}$. For instance, $x_{ij}$ can consist of $z_{ij}$, $z_{ji}$, $z_{ii}$, $z_{jj}$, $\sum_k z_{ik}$, $\sum_k z_{jk}$, $\sum_k z_{ki}$, $\sum_k z_{kj}$, provided the components of the vectors $z_{ij}$ can be added. In the case of an adjacency matrix $z$ of an undirected graph, the four-vector
$$x_{ij} = \Big( z_{ij},\ \sum_k z_{ik},\ \sum_k z_{jk},\ \sum_k z_{ik} z_{jk} \Big)$$
characterizes dyad $(i,j)$ by its edge indicator, its initial and final degrees, and its number of vertices adjacent to both $i$ and $j$.
For a network given by matrix $z$, any function of $z$ will be referred to as network statistics and denoted by $x$. For instance, $x$ can consist of the mean vectors
$$\frac{1}{n} \sum_i z_{ii} \quad \text{and} \quad \frac{1}{n(n-1)} \sum_{i \ne j} z_{ij}$$
and the corresponding covariance matrices of the components of the diagonal entries and of the components of the off-diagonal entries, provided these are all numerical. In the case of an adjacency matrix $z$ of an undirected graph, the two-vector
$$x = \Big( \sum_{i<j} z_{ij},\ \sum_{i<j<k} z_{ij} z_{jk} z_{ik} \Big)$$
characterizes the graph by its numbers of edges and triangles.

3. Some Random Graph Models
A random subset of a set $S$ is called a Bernoulli ($p$) subset if its elements are chosen by selecting independently each element of $S$ with a common probability $p$.
An undirected random graph $Z$ on $N = \{1, \ldots, n\}$ is a random subset of $\binom{N}{2}$, i.e. of the set of all 2-element subsets of $N$. A Bernoulli ($p$) subset of $\binom{N}{2}$ is called an undirected Bernoulli ($p$) graph on $N$. Its adjacency matrix $(Z_{ij})$ has $Z_{ii} = 0$ and $Z_{ij} = Z_{ji}$ for all $i, j \in N$, and the $\binom{n}{2}$ edge indicators $Z_{ij}$ are independent Bernoulli ($p$) variables for $i < j$.
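A Bernoulli ($p$) graph is straightforward to simulate from this definition. The sketch below is an illustration only; the function name `bernoulli_graph` and the list-of-lists representation of the adjacency matrix are assumptions of this example.

```python
import random


def bernoulli_graph(n, p, rng=None):
    """Adjacency matrix of an undirected Bernoulli (p) graph on n vertices."""
    if rng is None:
        rng = random.Random()
    z = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # One independent Bernoulli (p) trial per dyad {i, j}.
            if rng.random() < p:
                z[i][j] = z[j][i] = 1
    return z
```

Passing a seeded generator such as `random.Random(0)` makes the simulated graph reproducible.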
If $\{N_1, N_2\}$ is a partition of $N$ into two disjoint non-empty subsets, and if $Z_1$ and $Z_2$ are independent undirected Bernoulli ($p_1$) and Bernoulli ($p_2$) graphs on $N_1$ and $N_2$, respectively, then the union $Z = Z_1 \cup Z_2$ is called an undirected Bernoulli block model with two blocks $N_1$ and $N_2$. If $Z_{12}$ is a Bernoulli ($p_{12}$) subset of $N_1 \times N_2$, then the union $Z = Z_1 \cup Z_2 \cup Z_{12}$ is a general undirected Bernoulli block model in which edges are allowed also between the blocks. This definition is readily extended to more than two blocks. Bernoulli block models, like Bernoulli models, have independent edges. A simple model exhibiting dependent edges can be introduced as follows. Let $H$ be a Bernoulli ($p$) subset of $\binom{N}{3}$, i.e. of the set of all 3-element subsets of $N$. $H$ is a random hypergraph. For each hyperedge $\{i,j,k\} \in H$, consider the complete undirected graph $K_{ijk} = \{\{i,j\}, \{j,k\}, \{k,i\}\}$
defined on $\{i,j,k\}$. Define the graph $Z$ as the union of all these complete graphs $K_{ijk}$ for $\{i,j,k\} \in H$. This random graph $Z$ is called a Bernoulli ($p$) triangle graph on $N$. Various generalizations are obtained by specifying other graphs than complete ones on the hyperedges. Frank and Strauss [9] have investigated graph models with dependent edges called Markov graphs. A simple undirected Markov graph is defined by the probability function
$$P(Z = z) = \exp(\lambda_0 + \lambda_1 x_1 + \lambda_2 x_2 + \lambda_3 x_3),$$
where $\lambda_0$ is a normalizing constant, $\lambda_1, \lambda_2, \lambda_3$ are three parameters governing the density, clustering and transitivity properties of the graph, and $x_1, x_2, x_3$ are three statistics given by the numbers of edges, two-paths and triangles in $z$; that is,
$$x_1 = \sum_{i<j} z_{ij}, \quad x_2 = \sum_i \sum_{j<k} z_{ij} z_{ik}, \quad x_3 = \sum_{i<j<k} z_{ij} z_{jk} z_{ik}.$$
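The three statistics of the Markov graph model can be computed from an adjacency matrix as follows. This is a sketch under the assumption that $z$ is a symmetric 0/1 list of lists with zero diagonal; the helper names are illustrative. The last function reports how the statistics change when a single dyad is switched from 0 to 1, which makes the dependence between edges sharing a vertex visible: the change depends on the edges already present around the dyad.

```python
from itertools import combinations


def markov_statistics(z):
    """Return (edges, two-paths, triangles) of the graph with adjacency matrix z."""
    n = len(z)
    deg = [sum(row) for row in z]
    edges = sum(deg) // 2
    # A two-path is a pair of edges meeting at a common vertex.
    two_paths = sum(d * (d - 1) // 2 for d in deg)
    triangles = sum(
        1
        for i, j, k in combinations(range(n), 3)
        if z[i][j] and z[j][k] and z[i][k]
    )
    return edges, two_paths, triangles


def log_weight(z, lambdas):
    """Unnormalized log-probability lambda_1*x_1 + lambda_2*x_2 + lambda_3*x_3."""
    return sum(lam * x for lam, x in zip(lambdas, markov_statistics(z)))


def change_statistics(z, i, j):
    """Statistics with edge {i, j} present minus statistics with it absent."""
    saved = z[i][j]
    z[i][j] = z[j][i] = 1
    with_edge = markov_statistics(z)
    z[i][j] = z[j][i] = 0
    without_edge = markov_statistics(z)
    z[i][j] = z[j][i] = saved  # restore the input matrix
    return tuple(a - b for a, b in zip(with_edge, without_edge))
```

The normalizing constant $\lambda_0$ is deliberately left out, since computing it requires a sum over all graphs on the vertex set.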
4. Clustering Individual Statistics
In order to decide whether or not there is a need for a model with individual heterogeneity, it is helpful to separate the individuals into clusters, that is subsets of individuals, such that individuals in the same subset are more similar than individuals in different subsets. After the clusters have been identified the approach is to specify distinct models within each cluster and between different clusters. The underlying idea is that it should be easier to find a useful network model under homogeneity assumptions. One very simple example of a model with individual heterogeneity is a Bernoulli block model with the blocks defined by the clusters. Even though this model might be unrealistic, it can be used to illustrate some of the problems involved in the search for clusters. In order to illustrate the clustering of individual statistics we start with an undirected Bernoulli ($p$) graph $Z$ on $N = \{1, \ldots, n\}$. For the individual statistics we choose the vertex degrees $X_i = \sum_j Z_{ij}$, which are binomial $(n-1, p)$. The $n$ vertices are clustered by similarity in degree. We denote by $F_k$ the frequency of vertices of degree $k$ for $k = 0, 1, \ldots, n-1$. The expected frequency of vertices of degree $k$ is equal to
$$EF_k = n \binom{n-1}{k} p^k (1-p)^{n-1-k}.$$
Since this expected frequency is a unimodal function of $k$, we expect to find only one cluster with any reasonable decision rule based on the frequencies $F_0, \ldots, F_{n-1}$, i.e. we expect to correctly accept homogeneity between individuals under this model. If we use a block model with a Bernoulli ($p_1$) block of size $n_1$ and a Bernoulli ($p_2$) block of size $n_2$, then the degrees are binomial $(n_1 - 1, p_1)$ for $n_1$ vertices and binomial $(n_2 - 1, p_2)$ for $n_2$ vertices. It follows that the expected frequency of vertices of degree $k$ is equal to
$$EF_k = \sum_{i=1}^{2} n_i \binom{n_i - 1}{k} p_i^k (1 - p_i)^{n_i - 1 - k}.$$
If the value of $(n_1 - 1)p_1$ is far from the value of $(n_2 - 1)p_2$, we can expect to find two clusters corresponding fairly well to the two blocks, but if $(n_1 - 1)p_1$ is close to $(n_2 - 1)p_2$, the identification of two clusters would require other methods. As a numerical illustration we consider the case of a two-block model with $n_1 = 10$, $n_2 = 20$, $p_1 = 0.3$, $p_2 = 0.5$. Here the expected degree distribution has a bimodal form, as shown in Figure 1. From this model we simulated 1000 networks. The smoothed average degree distribution did not deviate very much from the expected distribution, but for 38 of the networks, that is for about 4 per cent, the degree distribution after smoothing turned out to be unimodal. This percentage can be considered as an optimistic estimate of the risk of not identifying the need for a block model in this case.
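The expected degree frequencies of the numerical illustration can be reproduced in a few lines. The sketch below assumes a block model without edges between the blocks, so that degrees in block $i$ are binomial $(n_i - 1, p_i)$; the function names and the simple rule for locating interior local maxima are choices made for this example.

```python
from math import comb


def expected_degree_frequencies(blocks):
    """Expected number of vertices of each degree k.

    blocks is a list of (n_i, p_i) pairs, one per block.
    """
    max_degree = max(n_i for n_i, _ in blocks) - 1
    freq = []
    for k in range(max_degree + 1):
        total = 0.0
        for n_i, p_i in blocks:
            if k <= n_i - 1:
                total += n_i * comb(n_i - 1, k) * p_i**k * (1 - p_i) ** (n_i - 1 - k)
        freq.append(total)
    return freq


def interior_modes(freq):
    """Degrees k whose expected frequency strictly exceeds both neighbors."""
    return [
        k
        for k in range(1, len(freq) - 1)
        if freq[k - 1] < freq[k] > freq[k + 1]
    ]


if __name__ == "__main__":
    ef = expected_degree_frequencies([(10, 0.3), (20, 0.5)])
    for k, value in enumerate(ef):
        print(k, round(value, 3))
    print("modes:", interior_modes(ef))
```

With the parameters of the illustration, the expected frequencies have one local maximum among the low degrees and a second one near the mean degree of the larger block, with a dip between them.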
[Figure 1: Expected degree distribution in a two-block model. Horizontal axis: degree; vertical axis: frequency.]

In practice, we cannot very often expect to be content with Bernoulli block models, and the main advantage of the present discussion is that it can be easily extended to more interesting cases with more complicated data patterns. With several characteristics of the individuals available, the search for clusters can be based on various standard methods of cluster analysis. The efforts to separate the vertex set into clusters have of course to be balanced against the efforts needed to find appropriate models within and between the clusters. Should it, for instance, be possible to find $k$ clusters described by $k - 1$ parameters and a set of simple two-parameter models within and between the clusters, then the total number of parameters would be equal to
$$2\binom{k+1}{2} + k - 1,$$
and this should be compared to the possibility of finding an overall model with this number of parameters.
5. Cross-Classification of Dyads

In order to decide whether or not there is a need for a model with edge dependencies, we can cross-classify the dyads according to various statistics and count and analyze the numbers of dyads in different categories. Log-linear analysis can be applied to detect interesting interaction effects between the statistics used for the classification. Logit analysis can be used to analyze the edge proportions among the dyads in different categories. If the categories are defined in terms of statistics that measure the “local edge density”, then the discovery of different edge proportions among the dyads in different categories indicates a need for a model with dependent edges. To a great extent the success of such approaches depends on the choice of appropriate statistics for the cross-classification of dyads. Data for a simple model can be used to illustrate the difficulties involved. More realistic illustrations are provided in Section 8. Consider first a Bernoulli ($p$) model on $n$ vertices with the dyads classified according to a single statistic, say the number of two-paths between the two vertices in the dyad. Let $F_{kl}$ be the number of dyads having $k$ two-paths and $l$ edges for $k = 0, \ldots, n-2$ and $l = 0, 1$. Set $F_k = F_{k0} + F_{k1}$ for the number of dyads having $k$ two-paths. The expected value of $F_k$ is
$$EF_k = \binom{n}{2} \binom{n-2}{k} p^{2k} (1 - p^2)^{n-2-k},$$
and the expected value of $F_{k1}$ is $m_{k1} = p\,EF_k$. The proportion $F_{k1}/F_k$ of edges among the dyads with $k$ two-paths is roughly constant, which correctly indicates no need for a model with edge dependence. Assume now instead that the model is a Bernoulli ($p$) triangle graph. Then for $k \ne 1$ the proportion $F_{k1}/F_k$ varies with $k$, and this strongly suggests the presence of edge dependence. In fact, this graph has very peculiar properties: it is transitive and it has no isolated edges; there are no end vertices, and every vertex of degree 2 is a corner of a triangle. If the model is modified so that instead of triangles we enter two-paths on the Bernoulli ($p$) selected triads, then the peculiarities of the graph will not be quite so revealing, but the method of examining conditional edge proportions $F_{k1}/F_k$ will still work. The idea of considering the probability of an edge conditional on “local” properties can be modified and used to estimate graph models of exponential type. Consider an exponential model given by
$$P(Z = z) = \exp(\lambda_0 + \lambda_1 x_1 + \lambda_2 x_2 + \cdots + \lambda_m x_m),$$
where $\lambda_0$ is a normalizing constant and $\lambda_1, \ldots, \lambda_m$ are parameters corresponding to graph statistics $x_1, \ldots, x_m$ evaluated at $z$. The probability of an edge at dyad $(i,j)$ conditional on all the rest of the graph $z$ can be calculated as
$$P(Z_{ij} = 1 \mid \text{rest of } z) = \frac{\exp\big(\sum_k \lambda_k x_{ijk}\big)}{1 + \exp\big(\sum_k \lambda_k x_{ijk}\big)},$$
and it follows that
$$\log \frac{P(Z_{ij} = 1 \mid \text{rest of } z)}{P(Z_{ij} = 0 \mid \text{rest of } z)} = \lambda_1 x_{ij1} + \cdots + \lambda_m x_{ijm},$$
where $x_{ijk}$ is the difference in statistic $x_k$ evaluated when $z$ has $z_{ij}$ substituted by 1 and 0, respectively. Thus, all the parameters $\lambda_1, \ldots, \lambda_m$ appear as coefficients in the logistic regression and can be estimated by standard methods. See [9] and the application in Section 8 below.

6. Time Series of Graph Statistics
When a sequence of networks is available, the main question concerns how the networks are related. If the purpose of the analysis is to fit a non-stationary graph process, then a first approximation can be given by a sequence of independent random graphs governed by time dependent parameters. Frank [10] has elaborated on this idea. Any graph changes with time are considered as the effects of certain changes in the parameters governing the properties of the networks. Previous or present outcomes of the graph have no direct influence on the future outcomes. Should such influence be required, stochastic dependencies have to be introduced and a possible model is a Markov process with graph states. Exploratory analysis of a sequence of networks can be based on various summary statistics that reflect time changes. For instance, it is natural to look for time changes in the frequency distributions of various vertex statistics, dyad statistics, triad statistics, and so forth. Time series analysis of such low order statistics can also be helpful to detect interesting patterns. As a simple example, consider a random graph process for which the dyads are independent, identically distributed Markov chains with homogeneous transition probabilities
$$P(Z_{ij}(t+1) = l \mid Z_{ij}(t) = k) = P_{kl}$$
for $k = 0, 1$ and $l = 0, 1$. If the evolution of the dyad processes is observable, then the transition probabilities can readily be estimated. If, however, only the “global” graph properties are available, say the total number of edges
$$R_t = \sum_{i<j} Z_{ij}(t),$$
then other methods are needed. Here $R_t$ is a Markov chain with $R_{t+1}$ conditional on $R_t = r$ given by the sum of two independent binomial variables with parameters $(r, P_{11})$ and $\big(\binom{n}{2} - r, P_{01}\big)$, respectively. It follows that the conditional edge density has an expected value
$$E\Big( R_{t+1} \Big/ \tbinom{n}{2} \,\Big|\, R_t = r \Big) = P_{01} + (P_{11} - P_{01})\, r \Big/ \tbinom{n}{2},$$
and the transition probabilities can be estimated by regression methods. A numerical experiment with $n = 20$, $P_{01} = 0.05$ and $P_{10} = 0.20$ was simulated, and from a stationary sequence of 100 graphs the regression estimates with their standard deviations turned out to be $P_{01} = 0.05 \pm 0.01$ and $P_{10} = 0.20 \pm 0.08$. A test of independence could be given as a test of whether the regression slope is zero. We omit the details.
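A numerical experiment of this kind can be sketched as follows. This is not the authors' program: it assumes ordinary least squares of the edge density at time $t+1$ on the edge density at time $t$, so that the intercept estimates $P_{01}$ and the slope estimates $P_{11} - P_{01} = 1 - P_{10} - P_{01}$; all function names are illustrative.

```python
import random


def simulate_edge_counts(n, p01, p10, steps, rng):
    """Edge counts R_t of independent two-state dyad chains, started in stationarity."""
    n_dyads = n * (n - 1) // 2
    stationary_density = p01 / (p01 + p10)
    state = [1 if rng.random() < stationary_density else 0 for _ in range(n_dyads)]
    counts = [sum(state)]
    for _ in range(steps - 1):
        state = [
            (0 if rng.random() < p10 else 1) if s else (1 if rng.random() < p01 else 0)
            for s in state
        ]
        counts.append(sum(state))
    return counts


def least_squares_line(xs, ys):
    """Intercept and slope of the ordinary least squares line through (xs, ys)."""
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def estimate_transitions(counts, n):
    """Regression estimates of (P01, P10) from a sequence of edge counts."""
    n_dyads = n * (n - 1) // 2
    density = [r / n_dyads for r in counts]
    intercept, slope = least_squares_line(density[:-1], density[1:])
    return intercept, 1.0 - intercept - slope


if __name__ == "__main__":
    counts = simulate_edge_counts(20, 0.05, 0.20, 100, random.Random(0))
    print(estimate_transitions(counts, 20))
```

Repeating the simulation with different seeds gives an impression of the sampling variability of the two estimates.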
7. A Sequence of Social Networks

To get some flavour of real network data analysis we have reanalyzed sociometric network data provided by Professor Wolfgang Sodeur. The data were collected at the University of Wuppertal as part of a research project on social network analysis. Various research reports from this project are included in a general bibliography on social networks edited by Sodeur et al. [11]. The data set can briefly be described as consisting of an ordered sequence of six networks of social contacts between 208 freshmen at the University of Wuppertal. The students were interviewed about their preferences for contacts with their fellow students on six different occasions, namely at 2, 3, 4, 5, 7, and 9 weeks after the start of the semester. On each occasion, each student was asked to name the peers with whom he or she preferred to have contact. There was no restriction on the number of contacts, but most students named two or three peers. There were missing data due to the fact that some students refused to participate at all, some refused to reveal more than nicknames of their peers, and some did not report on all occasions. The present discussion will be confined to the $n = 179$ students for whom complete preference information was obtained on all six occasions, and no problems of missing data will be considered here. To simplify matters and to improve the reliability of the contact information, we restrict attention to reciprocal preferences only. Thus, data consist of an ordered sequence of six undirected networks of reciprocal contacts that can be represented by symmetric matrices $z(t) = (z_{ij}(t))$ for $t = 2, 3, 4, 5, 7, 9$, where $z_{ii}(t) = 0$ and $z_{ij}(t) = z_{ji}(t)$ is 1 or 0 depending on whether or not students $i$ and $j$ report a reciprocal contact at time $t$. The students are labeled by integers $1, 2, \ldots, 179$, in the same manner on each occasion.
Table 1 reports some basic facts concerning number of contacts, number of triangles, number of connected components, and other characteristics of the networks on each occasion.

Table 1: Summary statistics of a social network of order 179 on six different occasions.
Statistic (one column per time point):
Number of edges
Number of triangles
Number of isolated vertices
Number of isolated edges
Number of connected components of order 3 or more
Order of the largest connected component

With as many as 179 vertices, it is not straightforward to draw a nice graphical representation of the networks. In order to draw a graph we found it convenient to consider the connected components of the union of the six networks, that is the components of the network defined by the matrix with elements
$$z_{ij} = \max_t z_{ij}(t).$$
This amounts to saying that two students i and j belong to the same component if and only if there is an ordered sequence $i_0, \ldots, i_m$ of students such that $i_0 = i$ and $i_m = j$, with $i_{k-1}$ and $i_k$ having a reciprocal contact at least once for every $k = 1, \ldots, m$. According to the network given by the matrix $z$,
there are 54 isolated vertices, 12 isolated edges, 2 connected components of order 3, 2 of order 4, and 4 of larger orders, namely 5, 10, 17 and 45. Now we can restrict attention to the vertex sets of these connected components separately, and consider the corresponding subsets of the original networks on each occasion. In this way we do not have to consider more than 45 vertices simultaneously, and only 3 components have more than 5 vertices. Figure 2 shows the connected component of order 17 in the network given by z. Figure 3 shows the corresponding networks on each occasion. We notice that the sequence of networks consists of many small components and few large ones, and there is no obvious pattern in the way components on one occasion are related to components on another occasion.
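The union construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three 5-vertex networks are invented toy data (not the Wuppertal data), and the helper names `from_edges`, `union_network` and `connected_components` are hypothetical.

```python
def from_edges(n, edges):
    """Symmetric 0/1 adjacency matrix of an undirected graph on vertices 0..n-1."""
    z = [[0] * n for _ in range(n)]
    for i, j in edges:
        z[i][j] = z[j][i] = 1
    return z


def union_network(networks):
    """Elementwise maximum over occasions: z_ij = max_t z_ij(t)."""
    n = len(networks[0])
    return [[max(z[i][j] for z in networks) for j in range(n)] for i in range(n)]


def connected_components(adj):
    """Connected components (as sorted vertex lists) found by depth-first search."""
    n = len(adj)
    seen = set()
    components = []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            v = stack.pop()
            component.append(v)
            for w in range(n):
                if adj[v][w] and w not in seen:
                    seen.add(w)
                    stack.append(w)
        components.append(sorted(component))
    return components


# Toy data (assumed): vertices 0 and 1 are in contact on the first occasion,
# 1 and 2 on the second, and nobody on the third.
toy = [from_edges(5, [(0, 1)]), from_edges(5, [(1, 2)]), from_edges(5, [])]
components = connected_components(union_network(toy))
```

Vertices 0 and 2 never report a contact with each other, yet they fall in the same component of the union because each has a contact with vertex 1 on some occasion.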
Figure 2: A connected component of order 17 in the network obtained by taking the union over six occasions.

Some of the exploratory methods discussed above have been applied for a more detailed analysis of the networks. We illustrate these methods in the next section.
8. Statistical Analysis of the Social Networks
A natural way of summarizing the six social networks is by their distributions of vertex degrees. The degree distributions are given in Table 2, and it is evident that there is no significant difference between the distributions on the six different occasions. From the assumption of stationarity and independence between the six networks, the expected degree distribution can be estimated by the average frequencies in Table 2. The expected proportion of edges is 0.0037. For a Bernoulli graph model, the expected number of isolated vertices should be close to 93 and the expected number of end vertices should be close to 61; both these numbers are far from the values given in Table 2. These facts and many others provide clear evidence that a Bernoulli model is not adequate here.

In order to get some guidance as to what specific features are present here, we can look at the distributions of various dyad statistics. Table 3 shows the distributions of the dyads according to three statistics, namely the edge indicator, the number of edges adjacent to only one of the two vertices in the dyad, and the number of vertices adjacent to both the vertices in the dyad. For brevity, we refer to these statistics as the edge indicator, the degree, and the number of two-paths of the dyad, and they are denoted by $z_{ij}$, $d_{ij}$ and $s_{ij}$, respectively.

Figure 3: The network on each occasion (Times 2, 3, 4, 5, 7 and 9) corresponding to the connected component given in Figure 2.
Table 2: Degree distributions of a social network of order 179 on six different occasions.

Degree   Time 2   Time 3   Time 4   Time 5
0        95       102      109      104
1        55       39       32       44
2        19       26       23       20
3        9        9        11       10
4        1        3        3        1
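The Bernoulli comparison made in the text can be checked numerically. The sketch below assumes the usual Bernoulli graph model, in which each of the possible edges is present independently with probability p, so that a vertex is isolated with probability $(1-p)^{n-1}$ and is an end vertex with probability $(n-1)p(1-p)^{n-2}$; the function name is hypothetical.

```python
def bernoulli_expected_counts(n, p):
    """Expected numbers of isolated vertices and end vertices (degree 1)
    in a Bernoulli graph of order n with edge probability p."""
    isolated = n * (1 - p) ** (n - 1)
    end_vertices = n * (n - 1) * p * (1 - p) ** (n - 2)
    return isolated, end_vertices


# Order and edge proportion quoted in the text.
isolated, end_vertices = bernoulli_expected_counts(179, 0.0037)
print(round(isolated), round(end_vertices))
```

With n = 179 and p = 0.0037 the two expectations round to the values 93 and 61 quoted in the text, which can then be compared with the rows for degree 0 and degree 1 in Table 2.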
Table 4 yields the proportions of dyads with edges among the dyads with fixed degree and fixed number of two-paths, that is, the estimated edge probabilities conditional on these numbers. There is strong evidence that edges occur more frequently among vertices with many neighbors than among vertices with few neighbors. An interesting correspondence between these findings and a Markov graph model can be noticed. Consider an exponential graph model whose statistics are the numbers of edges, two-paths and triangles of the graph. Frank and Strauss [9] and Frank [10] have shown how logistic regression can be used to estimate this model. The interesting connection between the cross-classification in Table 4 and the parameters of this model is that the parameters can be estimated as coefficients in a logistic regression of the edge indicator on the degree and the number of two-paths of the dyad.
If $N_z(d, s)$ is the number of dyads $(i, j)$ with $i < j$ and $z_{ij} = z$, $d_{ij} = d$, $s_{ij} = s$, then the sum of squares or the weighted sum of squares can be minimized with respect to $\lambda_1$, $\lambda_2$, $\lambda_3$ by using logistic regression options in standard statistical packages. Applying this approach to the present sequence of social networks leads to the results reported in Table 5. Thus we find a fairly good fit with a sequence of Markov graphs having time dependent parameters for density, clustering and transitivity.

Table 3: Dyad distributions of a social network on six different occasions according to degree (d), number of two-paths (s), and edge occurrence (z).
Table 4: Edge proportions of a social network on six different occasions according to degree (d) and number of two-paths (s).

Table 5: Fitting Markov graph parameters for density ($\lambda_1$), clustering ($\lambda_2$) and transitivity ($\lambda_3$), and goodness of fit measures, on six different occasions.
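The link between the dyad statistics and the Markov graph parameters can be made concrete with a small check. The sketch below assumes the model has the usual exponential form in the numbers of edges, two-paths and triangles; under that assumption the conditional log-odds of an edge in dyad (i, j) is a linear function of the changes in these three counts when the edge is switched on, and those changes are 1, the degree of the dyad and the number of two-paths of the dyad. The toy graph and all function names are hypothetical.

```python
from itertools import combinations


def graph_statistics(adj):
    """Numbers of edges, two-paths and triangles of an undirected graph."""
    n = len(adj)
    degree = [sum(row) for row in adj]
    edges = sum(degree) // 2
    two_paths = sum(k * (k - 1) // 2 for k in degree)
    triangles = sum(
        1
        for i, j, k in combinations(range(n), 3)
        if adj[i][j] and adj[j][k] and adj[i][k]
    )
    return edges, two_paths, triangles


def dyad_statistics(adj, i, j):
    """Edge indicator z, degree d and number of two-paths s of the dyad (i, j)."""
    z = adj[i][j]
    d = sum(adj[i]) + sum(adj[j]) - 2 * z  # edges meeting exactly one of i, j
    s = sum(1 for k in range(len(adj)) if k not in (i, j) and adj[i][k] and adj[j][k])
    return z, d, s


def change_statistics(adj, i, j):
    """Change in (edges, two-paths, triangles) when edge (i, j) is switched on."""
    with_edge = [row[:] for row in adj]
    without_edge = [row[:] for row in adj]
    with_edge[i][j] = with_edge[j][i] = 1
    without_edge[i][j] = without_edge[j][i] = 0
    on, off = graph_statistics(with_edge), graph_statistics(without_edge)
    return tuple(a - b for a, b in zip(on, off))


# Toy graph (assumed): a triangle 0-1-2 with a pendant vertex 3 attached to 2.
adj = [[0] * 4 for _ in range(4)]
for i, j in [(0, 1), (1, 2), (0, 2), (2, 3)]:
    adj[i][j] = adj[j][i] = 1

consistent = all(
    change_statistics(adj, i, j) == (1,) + dyad_statistics(adj, i, j)[1:]
    for i, j in combinations(range(4), 2)
)
```

Because the change statistics coincide with (1, d, s) for every dyad, a logistic regression of the edge indicator on d and s has an intercept and two slopes that play the roles of the density, clustering and transitivity parameters.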
9. Comments on Modeling
The lack of special software for the statistical analysis of network data should not be considered as an obstacle for exploratory graph analysis. Most standard multivariate statistical packages contain several procedures that are sufficient to undertake an exploratory analysis of network data. Cluster analysis of vertex statistics, cross-classification analysis of dyad statistics, logistic regression analysis and time series analysis of various graph statistics have been discussed here and have been intended as illustrations of the possibility of applying standard methods in order to gain insight and support for graph modeling. There are a few basic approaches that can be varied according to the needs of the application considered and the intuition and imagination of the statistician. These approaches comprise the use of cluster analysis of vertex statistics, dyad statistics and other low order statistics in order to detect inhomogeneities, the use of cross-classifications of low order statistics and loglinear or logit analysis to detect dependencies within a graph, and the use of time series analysis of low or high order statistics in order to detect dependencies between distinct graphs. The main obstacle to network modeling lies in the difficulties involved in the interpretation of the various trends and tendencies that become evident from the exploratory analysis, and in the lack of knowledge about how such properties can be modeled by specific multiparametric random graphs. There should be more research devoted to the synthesis of random graph models with specific properties. For instance, in order to gain insight into the formation and development of human contact networks, dynamic models should be specified by setting up transition equations and deriving equilibrium solutions. Good references for this type of work are [12]-[15].

Statistical graph theory developed during the 1970s and 1980s, and the emphasis in the literature is on different themes. Probably the main contributions have come from the development of log-linear models in the analysis of categorical data. Research work related to Markov fields and conditional independence in families of random variables has also entered this development. A selection of modern references is [16]-[20]. Other approaches are related to developments in cluster analysis: [21]-[25]. There are also several contributions to random graph theory and graph sampling designs that are not directly related to exploratory statistical modeling but of relevance to the development of probability models and statistical inference in graphs. A few examples are contained in [26]-[31]. Statistical graph modeling is increasingly recognized as a field of importance for statistical applications. It also contributes to theoretical statistics and is a source of much enjoyable research.
References
[1] P. Holland and S. Leinhardt; An exponential family of probability densities for directed graphs, Journal of the American Statistical Association, 76, 33-65 (1981).
[2] S. Fienberg and S. Wasserman; Categorical data analysis of simple sociometric relations, Sociological Methodology, S. Leinhardt (editor), Jossey-Bass, San Francisco, 156-182 (1981).
[3] S. Fienberg, M. Meyer and S. Wasserman; Statistical analysis of multiple sociometric relations, Journal of the American Statistical Association, 80, 51-67 (1985).
[4] O. Frank, M. Hallinan and K. Nowicki; Clustering of dyad distributions as a tool in network modeling, Journal of Mathematical Sociology, 11, 47-64 (1985).
[5] O. Frank, H. Komanska and K. Widaman; Cluster analysis of dyad distributions in networks, Journal of Classification, 2, 219-238 (1985).
[6] O. Frank; Growing classification and regression trees on network data, Classification as a Tool of Research, W. Gaul and M. Schader (editors), North-Holland, Amsterdam, 137-143 (1986).
[7] O. Frank; Multiple relation data analyses, Operations Research Proceedings 1986, H. Isermann et al. (editors), Springer-Verlag, Berlin Heidelberg, 455-460 (1987).
[8] D. Knoke and J. Kuklinski; Network Analysis, Sage Publications, Beverly Hills (1982).
[9] O. Frank and D. Strauss; Markov graphs, Journal of the American Statistical Association, 81, 832-842 (1986).
[10] O. Frank; Statistical analysis of change in networks, Statistica Neerlandica, 45 (1991).
[11] W. Sodeur et al.; Bibliographie zum Projekt Analyse sozialer Netzwerke, Gesamthochschule Wuppertal (1978).
[12] U. Grenander; Pattern Synthesis, Lectures in Pattern Theory, Vol. I, Springer Verlag, New York (1976).
[13] U. Grenander; Pattern Analysis, Lectures in Pattern Theory, Vol. II, Springer Verlag, New York (1978).
[14] U. Grenander; Regular Structures, Lectures in Pattern Theory, Vol. III, Springer Verlag, New York (1981).
[15] P. Whittle; Systems in Stochastic Equilibrium, John Wiley & Sons, Chichester (1986).
[16] Y. Wang and G. Wong; Stochastic block models for directed graphs, Journal of the American Statistical Association, 82, 8-19 (1987).
[17] G. Wong; Bayesian models for directed graphs, Journal of the American Statistical Association, 82, 140-148 (1987).
[18] S. Wasserman and D. Iacobucci; Sequential social network data, Psychometrika, 53, 261-285 (1988).
[19] T. Snijders; Testing for change in a digraph at two time points, Social Networks (to appear).
[20] J. Whittaker; Graphical Models in Applied Multivariate Statistics, John Wiley & Sons, Chichester (1990).
[21] J. Hartigan; Asymptotic distributions for clustering criteria, The Annals of Statistics, 6, 117-131 (1978).
[22] O. Frank and F. Harary; Cluster inference by using transitivity indices in empirical graphs, Journal of the American Statistical Association, 77, 835-840 (1982).
[23] J. Hartigan; Statistical theory in clustering, Journal of Classification, 2, 63-76 (1985).
[24] E. Godehardt; Explorative mathematische Modelle in der Medizin, Nichtlineare Regression und Numerische Klassifikation, Institut für Medizinische Dokumentation und Statistik der Universität, Köln (1%).
[25] E. Godehardt; Graphs as Structural Models, F. Vieweg und Sohn Verlagsgesellschaft, Braunschweig (1990).
[26] T. Snijders and F. Stokman; Extensions of triad counts to networks with different subsets of points and testing underlying random graph distributions, Social Networks, 9, 249-275 (1987).
[27] O. Frank; Random sampling and social networks: A survey of various approaches, Mathématiques, Informatique et Sciences humaines, 26:104, 19-33 (1988).
[28] K. Nowicki; Asymptotic Poisson distributions with applications to statistical analysis of graphs, Advances in Applied Probability, 20, 315-330 (1988).
[29] S. Berg and L. Mutafchiev; Random mappings with an attracting center: Lagrangian distributions and a regression function, Journal of Applied Probability, 27, 622-626 (1990).
[30] K. Nowicki; Asymptotic distributions of subgraph counts in colored Bernoulli graphs, Random Graphs '87, M. Karoński, A. Ruciński and J. Jaworski (editors), John Wiley & Sons, New York, 203-221 (1990).
[31] S. Janson and K. Nowicki; The asymptotic distributions of generalized U-statistics with applications to random graphs, Probability Theory and Related Fields (to appear).
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 367-374 (1993) © 1993 Elsevier Science Publishers B.V. All rights reserved.
THE HAMILTONIAN DECOMPOSITION OF CERTAIN CIRCULANT GRAPHS
Jiping LIU Department of Mathematics and Statistics, Simon Fraser University Burnaby, British Columbia, CANADA
Abstract
In this paper, we show that the circulant graphs of order 2p, for p a prime, are Hamiltonian decomposable.
1. Introduction

Let $Z_n$ denote the additive group of integers modulo n.

Definition 1.1: The circulant graph C(n, S) is a graph with vertex set $V = Z_n$ and edge set $E = \{(i, j) : j - i \in S\}$, where $S \subseteq Z_n \setminus \{0\}$ satisfies $S = -S$, that is, if $a \in S$, then $n - a \in S$.

Let $S^+ = \{a_i : a_i \in S, a_i \le n/2\}$. Then $S = S^+ \cup -S^+$. For $a \in S$, let $E_a = \{(i, j) : |i - j| = a\}$. We say that $E_a$ is a set of edges with symbol a.

Definition 1.2: Let G be a regular graph with edge set E(G). It is said to have a Hamiltonian decomposition (or to be Hamiltonian decomposable) if either
(i) deg(G) = 2d and E(G) can be partitioned into d Hamilton cycles, or
(ii) deg(G) = 2d + 1 and E(G) can be partitioned into d Hamilton cycles and a perfect matching.
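As an illustration of Definition 1.1, the following sketch builds the edge set of a circulant graph and its symbol classes. The representation of edges as two-element frozensets and the function names are assumptions made for this example only.

```python
def circulant_edges(n, S):
    """Edge set of C(n, S): unordered pairs {i, j} with (j - i) mod n in S."""
    S = {a % n for a in S}
    if 0 in S:
        raise ValueError("S must not contain 0")
    if any((-a) % n not in S for a in S):
        raise ValueError("S must satisfy S = -S")
    return {frozenset((i, (i + a) % n)) for i in range(n) for a in S}


def symbol_class(n, a):
    """E_a: the edges whose endpoints differ by a (or n - a) modulo n."""
    return {frozenset((i, (i + a) % n)) for i in range(n)}


def degrees(n, edges):
    """Degree of every vertex 0..n-1 in the graph with the given edge set."""
    deg = [0] * n
    for edge in edges:
        for v in edge:
            deg[v] += 1
    return deg


# C(12, S) with S+ = {1, 2, 4}, so S = {1, 2, 4, 8, 10, 11}.
edges = circulant_edges(12, {1, 2, 4, 8, 10, 11})
degree_values = set(degrees(12, edges))
classes = [symbol_class(12, a) for a in (1, 2, 4)]
```

Here the graph is 6-regular, and the three symbol classes $E_1$, $E_2$, $E_4$ together make up the whole edge set. A symbol a = n/2, when present, contributes a perfect matching rather than a 2-regular class.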
Many known Cayley graphs on abelian groups are Hamiltonian decomposable. This led Alspach [1] to ask the question: “Does every connected Cayley graph on an abelian group have a Hamiltonian decomposition?” If the degree of the graph is 2, then the answer is obviously yes. If the degree is 3, the answer is again yes since such a graph has a Hamilton cycle. The case of degree 4 has been solved by Bermond et al. [2] and the answer is again yes. The answer is also yes for degree 5 [3]. Here I give these results in a theorem applying to the case that G is a circulant.

Theorem 1.1: If C(n, S) is a connected circulant graph of degree at most 5, then C(n, S) is Hamiltonian decomposable.
The Hamiltonian decomposition of a graph sometimes depends on the Hamiltonian decomposition of the Cartesian product of two graphs.

Definition 1.3: The Cartesian product $G_1 \times G_2$ of $G_1$ and $G_2$ has vertex set $V(G_1) \times V(G_2)$ with $(u_1, u_2)$ adjacent to $(v_1, v_2)$ if and only if either $u_1 = v_1$ and $u_2$ is adjacent to $v_2$ in $G_2$, or $u_2 = v_2$ and $u_1$ is adjacent to $v_1$ in $G_1$.
The strongest result about the Hamiltonian decomposition of Cartesian products was obtained recently by Stong.

Theorem 1.2: (Stong [4]) If $G_1$ has a decomposition into $n_1$ Hamilton cycles and $G_2$ has a decomposition into $n_2$ Hamilton cycles, $n_1 \le n_2$, then $G_1 \times G_2$ has a Hamiltonian decomposition if any one of the following is true:
(i) $n_2 \le 3n_1$,
(ii) $n_1 \ge 3$,
(iii) $G_1$ has an even number of vertices, or
(iv) $|V(G_2)| \ge 6\lceil n_2/n_1 \rceil - 3$.
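The four sufficient conditions of Theorem 1.2 are easy to test mechanically. The sketch below is a hypothetical helper, not part of the paper; in particular it assumes that the last condition reads $|V(G_2)| \ge 6\lceil n_2/n_1 \rceil - 3$, which is how the damaged formula is interpreted here.

```python
def stong_condition_holds(n1, n2, order1, order2):
    """Check the sufficient conditions of Theorem 1.2.

    n1, n2: numbers of Hamilton cycles in the decompositions of G1 and G2,
            with 1 <= n1 <= n2.
    order1, order2: numbers of vertices of G1 and G2.
    """
    if not 1 <= n1 <= n2:
        raise ValueError("expected 1 <= n1 <= n2")
    ceil_ratio = -(-n2 // n1)  # integer ceiling of n2 / n1
    return (
        n2 <= 3 * n1                      # condition (i)
        or n1 >= 3                        # condition (ii)
        or order1 % 2 == 0                # condition (iii)
        or order2 >= 6 * ceil_ratio - 3   # condition (iv), as assumed above
    )


# A 5-cycle (one Hamilton cycle) times a graph with three Hamilton cycles.
covered = stong_condition_holds(1, 3, 5, 7)
# With four Hamilton cycles and only 9 vertices, none of the conditions applies.
not_covered = stong_condition_holds(1, 4, 5, 9)
```

A result of False only means that the theorem gives no information for those parameters; it does not show that the product fails to be Hamiltonian decomposable.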
For a more general class of vertex transitive graphs, Alspach [5] obtained the following result: every connected vertex transitive graph of order 2p, where p is a prime and $p \equiv 3 \pmod 4$, has a Hamiltonian decomposition. It is expected that the same result holds for $p \equiv 1 \pmod 4$, except for some special cases. In §2, we show that this is true for circulant graphs. In §3, we obtain a partial result for the case n = pq, where p and q are primes, and show that for $n \le 15$, C(n, S) is Hamiltonian decomposable. The 4-regular circulants are called double loop graphs in [6]; in fact, they have two edge disjoint Hamilton cycles if they are connected. Our result says that connected 2k-regular circulants C(2p, S) are k-loop graphs (k edge disjoint Hamilton cycles). This property is important in routing applications because routing along a Hamilton cycle is easy; these graphs are good fault-tolerant networks, since one can always find a simple routing in case of edge failure. In a survey [7], Bermond et al. posed the problem of constructing graphs admitting a decomposition into k Hamilton cycles and having the smallest diameter. This problem arises in the context of computer loop networks and it is still open. From our result, we see that the connected circulants C(2p, S) provide a class of graphs useful for this problem.
In the following, we always assume that C(n, S) is connected. Boesch and Tindell [8] have shown that this is equivalent to $\gcd(S^+, n) = \gcd(a_1, a_2, \ldots, a_k, n) = 1$, where $S^+ = \{a_1, a_2, \ldots, a_k\}$. We will use $G_1 \oplus G_2$ to denote the edge disjoint sum of graphs $G_1$ and $G_2$ ($G_1$ and $G_2$ may have the same vertex set or different vertex sets), and by $\langle a \rangle$ we mean the subgroup of $Z_n$ generated by $a \in Z_n$. For definitions and notation not included here see Bondy and Murty [9].

2. n = 2p
Let n = pq, where p and q are primes. Let $S_p = \{mp : mp \in S\}$, $S_q = \{mq : mq \in S\}$ and $S_u = \{s : s \in S, s \text{ is a unit of } Z_n\}$. Then $S = S_p \cup S_q \cup S_u$. Let $S_p/p = \{m : mp \in S_p\}$ and $S_q/q = \{m : mq \in S_q\}$. We have

Lemma 2.1: $C(pq, S) \cong (C(p, S_q/q) \times C(q, S_p/p)) \oplus C(pq, S_u)$.
Proof: Let X = C(pq, S). We partition $Z_{pq}$ into the left cosets of $\langle p \rangle$, that is,

$Z_{pq} = \langle p \rangle \cup (1 + \langle p \rangle) \cup \ldots \cup ((p - 1) + \langle p \rangle)$.

On each set $i + \langle p \rangle$, the induced subgraph $X[i + \langle p \rangle]$ with edges which have symbols in $S_p$ is isomorphic to $C(q, S_p/p)$. If there is an edge between $i + \langle p \rangle$ and $j + \langle p \rangle$ with symbol in $S_q$, then there is a perfect matching between $i + \langle p \rangle$ and $j + \langle p \rangle$ with the same symbol. The edges (with the same symbol) between the cosets constitute p-cycles. There is at most one symbol which belongs to $S_q$ between $i + \langle p \rangle$ and $j + \langle p \rangle$. Otherwise, we have $j - i + m_0 p,\ j - i + m_1 p \in S_q$ for some $m_0$ and $m_1$. This implies that $j - i + m_0 p = k_0 q$ and $j - i + m_1 p = k_1 q$ for some $k_0$ and $k_1$. Therefore, $(m_1 - m_0)p = (k_1 - k_0)q$. This is a contradiction.

If we let $\{\langle p \rangle, 1 + \langle p \rangle, \ldots, (p - 1) + \langle p \rangle\}$ be a vertex set and $S_q/q$ be a symbol set, we obtain a circulant $C(p, S_q/q)$. Clearly,

$C(pq, S_p \cup S_q) \cong C(p, S_q/q) \times C(q, S_p/p)$.

Therefore,

$C(pq, S) \cong (C(p, S_q/q) \times C(q, S_p/p)) \oplus C(pq, S_u)$.

For example, $C(15, \{3, 6, 5, 12, 9, 10\}) \cong C(3, \{1, 2\}) \times C(5, \{1, 2, 3, 4\})$.

Now we can prove our main theorem.
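The example following Lemma 2.1 can be checked directly. The sketch below assumes one particular isomorphism, namely the map sending a vertex v of $Z_{15}$ to the pair (v mod 3, v mod 5); the paper does not name the map, and the helper functions are hypothetical.

```python
def circulant_edges(n, S):
    """Edge set of C(n, S) as two-element frozensets."""
    return {frozenset((i, (i + a) % n)) for i in range(n) for a in S}


def cartesian_product_edges(V1, E1, V2, E2):
    """Edge set of the Cartesian product, following Definition 1.3."""
    edges = set()
    for u1 in V1:  # same first coordinate, adjacent second coordinates
        for edge in E2:
            u2, v2 = tuple(edge)
            edges.add(frozenset(((u1, u2), (u1, v2))))
    for u2 in V2:  # same second coordinate, adjacent first coordinates
        for edge in E1:
            u1, v1 = tuple(edge)
            edges.add(frozenset(((u1, u2), (v1, u2))))
    return edges


lhs = circulant_edges(15, {3, 6, 5, 12, 9, 10})
rhs = cartesian_product_edges(
    range(3), circulant_edges(3, {1, 2}),
    range(5), circulant_edges(5, {1, 2, 3, 4}),
)

# Assumed isomorphism: v -> (v mod 3, v mod 5), a bijection from Z_15
# onto Z_3 x Z_5 by the Chinese remainder theorem.
image = {frozenset((v % 3, v % 5) for v in edge) for edge in lhs}
is_isomorphism = image == rhs
```

Edges with symbols 3, 6, 9, 12 keep the residue modulo 3 fixed and change the residue modulo 5, while edges with symbols 5 and 10 do the opposite, which is exactly the adjacency rule of the Cartesian product.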
Theorem 2.2: C(2p, S) is Hamiltonian decomposable.

Proof: Recall that $S_p = \{mp : mp \in S\}$ and $S_2 = \{2m : 2m \in S\}$. There are two cases to consider.

Case 1: $S_p \ne \emptyset$. In this case, we have that $S_p = \{p\}$, and $C(2, S_p/p) \cong K_2$. Therefore,

$C(2p, S) = (K_2 \times C(p, S_2/2)) \oplus C(2p, S_u)$

by Lemma 2.1. If $S_2 \ne \emptyset$, let $S_2/2 = \{a_1, a_2, \ldots, a_m, p - a_1, \ldots, p - a_m\}$. Take an m-matching of $C(p, S_2/2)$, say $\{(x_1, y_1), \ldots, (x_m, y_m)\}$, such that $y_i - x_i = a_i$ or $p - a_i$. There are two parts in $K_2 \times C(p, S_2/2)$, each of which is isomorphic to $C(p, S_2/2)$, and there is a perfect matching between the two parts. We can label the vertices of one part with $\{x_1, x_2, \ldots, x_p\}$ and the other one with $\{x'_1, x'_2, \ldots, x'_p\}$ such that $(x_i, x'_i)$ is an edge for $i = 1, 2, \ldots, p$. Let $E_{a_i}$ be the edge set with symbol $a_i$ in $C(p, S_2/2)$. Then $E_{a_i}$ is a Hamilton cycle in $C(p, S_2/2)$. Now we can give the Hamiltonian decomposition as follows. Let

$H_i = (E_{a_i} - (x_i, y_i)) \cup (E'_{a_i} - (x'_i, y'_i)) \cup \{(x_i, x'_i), (y_i, y'_i)\}$

for $i = 1, 2, \ldots, m$, where $E'_{a_i}$ is the image of $E_{a_i}$ under the map prime ($'$). Then each $H_i$ is a Hamilton cycle of $K_2 \times C(p, S_2/2)$, and $H_i \cap H_j = \emptyset$. What remains is a perfect matching

$\{(x_1, y_1), \ldots, (x_m, y_m)\} \cup \{(x'_1, y'_1), \ldots, (x'_m, y'_m)\} \cup \{(x_i, x'_i), (y_i, y'_i) : i \ne 1, \ldots, m\}$.
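The splicing step used in Case 1 can be seen on the smallest example. The sketch below takes p = 5 and the single symbol a = 1 (both chosen only for illustration), labels the two parts of $K_2 \times C(5, \{1, 4\})$ by pairs (part, vertex), and checks that removing one matched edge from each copy of $E_a$ and adding the two cross edges gives a Hamilton cycle. The helper `is_hamilton_cycle` is a hypothetical utility.

```python
def is_hamilton_cycle(vertices, edges):
    """True if the edge set forms one cycle through every vertex exactly once."""
    vertices = set(vertices)
    neighbours = {v: set() for v in vertices}
    for edge in edges:
        u, v = tuple(edge)
        neighbours[u].add(v)
        neighbours[v].add(u)
    if len(edges) != len(vertices):
        return False
    if any(len(neighbours[v]) != 2 for v in vertices):
        return False
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:  # a 2-regular graph is a single cycle iff it is connected
        v = stack.pop()
        for w in neighbours[v] - seen:
            seen.add(w)
            stack.append(w)
    return seen == vertices


p, a = 5, 1
E_a = {frozenset(((0, i), (0, (i + a) % p))) for i in range(p)}        # first part
E_a_prime = {frozenset(((1, i), (1, (i + a) % p))) for i in range(p)}  # second part

x, y = 0, 1  # matched edge (x, y) with y - x = a
H = (
    (E_a - {frozenset(((0, x), (0, y)))})
    | (E_a_prime - {frozenset(((1, x), (1, y)))})
    | {frozenset(((0, x), (1, x))), frozenset(((0, y), (1, y)))}
)

vertices = {(part, i) for part in (0, 1) for i in range(p)}
spliced_is_hamilton = is_hamilton_cycle(vertices, H)
```

Before splicing, the two copies of $E_a$ form two disjoint 5-cycles; the two cross edges join them into a single cycle through all ten vertices.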
If $S_2 = \emptyset$, then $C(2p, S) \cong E_p \oplus C(2p, S_u)$, where $C(2p, S_u)$ is Hamiltonian decomposable and $E_p$ is a perfect matching of C(2p, S). Therefore, C(2p, S) is Hamiltonian decomposable.

Case 2: $S_p = \emptyset$. Since C(2p, S) is connected, there is at least one $a \in S_u$. Since $\gcd(a, 2p) = 1$, the map $a^{-1} : C(2p, S) \to C(2p, a^{-1}S)$ defined by $a^{-1}(s) = a^{-1}s$ for any $s \in Z_{2p}$ is an isomorphism. So we can assume that $1 \in S$.
Now let $S' = S_u \setminus \{1, -1\}$; then we have that $C(2p, S')$ is Hamiltonian decomposable if $S'$ is nonempty, and $C(2p, S) \cong C(2p, S_2 \cup \{-1, 1\}) \oplus C(2p, S')$. Let $Y = C(2p, S_2 \cup \{-1, 1\})$. We partition $Z_{2p}$ into $\langle 2 \rangle$ and $1 + \langle 2 \rangle$. The edges with symbols in $S_2$ induce subcirculants on $\langle 2 \rangle$ and on $1 + \langle 2 \rangle$, both of which are isomorphic to $C(p, S_2/2)$. The edges with symbol 1 form two "parallel" classes between $\langle 2 \rangle$ and $1 + \langle 2 \rangle$. One is $\{(0, 1), (2, 3), \ldots, (2p - 2, 2p - 1)\}$, denoted by $M_1$, and the other, $\{(2, 1), (4, 3), \ldots, (2p - 1, 0)\}$, is denoted by $M_{-1}$. Let $S_2/2 = \{b_1, b_2, \ldots, b_m, p - b_1, \ldots, p - b_m\}$, where $b_1 > b_2 > \ldots > b_m$. To decompose $C(2p, S_2 \cup \{-1, 1\})$, we need to find a special matching.

Claim: There exists a near perfect matching $M_0 = \{(x_1, y_1), \ldots, (x_m, y_m)\}$ in $X[\langle 2 \rangle]$ such that
(1) $y_i - x_i = b_i$, $i = 1, 2, \ldots, m$, and
(2) $0 < x_1 < x_2 < \ldots < x_m < y_m < y_{m-1} < \ldots < y_1$.

To prove the claim, let $K_p$ be a complete graph with vertex set $Z_p$. Then

$M(p) = \{(1, p - 1), (2, p - 2), \ldots, ((p - 1)/2, (p + 1)/2)\}$

is a near perfect matching of $K_p$ if p is odd. Let

$2M(p) = \{(2, 2(p - 1)), (4, 2(p - 2)), \ldots, (p - 1, p + 1)\}$,

and let $M_0 = 2M(p) \cap E(C(p, S_2/2))$. Then $M_0$ has the required properties. This completes the proof of the claim.

Let $H'_i = E_{b_i} \cap E(X[\langle 2 \rangle])$. We know that $H'_1, \ldots, H'_m$ is a Hamiltonian decomposition of $X[\langle 2 \rangle]$, and $(x_i, y_i) \in H'_i$. Let $H''_1, \ldots, H''_m$ be the corresponding Hamiltonian decomposition of $X[1 + \langle 2 \rangle]$. Now let

$H_i = (H'_i - (x_i, y_i)) \cup (H''_i - (1 + x_i, 1 + y_i)) \cup \{(x_i, 1 + x_i), (y_i, 1 + y_i)\}$

for $i = 1, 2, \ldots, m$. Then each $H_i$ is a Hamilton cycle of $C(2p, S_2 \cup \{-1, 1\})$, and $H_i \cap H_j = \emptyset$. The remaining edges are

$H_{m+1} = M_0 \cup (1 + M_0) \cup (M_1 - \{(x_i, 1 + x_i), (y_i, 1 + y_i) : i = 1, \ldots, m\}) \cup M_{-1}$,

where $1 + M_0 = \{(1 + x, 1 + y) : (x, y) \in M_0\}$. To show that $H_{m+1}$ is a Hamilton cycle of $C(2p, S_2 \cup \{-1, 1\})$, we find a cycle C with $E(C) = H_{m+1}$ (see Fig. 1). Therefore $H_{m+1}$ is a Hamilton cycle and hence C(2p, S) is Hamiltonian decomposable. This completes the proof of the theorem.
Figure 1:
3. n ≤ 15

First we give some simple results.

Theorem 3.1: If for each $a \in S$, $\gcd(n, a) = 1$, then C(n, S) has a Hamiltonian decomposition.

Proof: For each $a \in S$, $E_a$ is a Hamilton cycle of C(n, S). But $E(C(n, S)) = \bigcup \{E_a : a \in S^+\}$; therefore, C(n, S) has a Hamiltonian decomposition.

Corollary 3.2: Let p be an odd prime. Then C(p, S) has a Hamiltonian decomposition.

Theorem 3.3: If $S^+$ can be partitioned into $S_1, S_2, \ldots, S_k$ such that, for each i, $\gcd(S_i, n) = 1$ and $|S_i| \le 3$, where $n/2 \in S_i$ whenever $|S_i| = 3$, then C(n, S) has a Hamiltonian decomposition.

Proof: Each $C(n, S_i)$ is connected and has degree at most 5, and hence has a Hamiltonian decomposition by Theorem 1.1. But $C(n, S) = \bigoplus_{i=1}^{k} C(n, S_i)$, and therefore C(n, S) has a Hamiltonian decomposition.

Theorem 3.4: If p and q are odd primes, and $0 < |S_p| \le |S_q| \le 3|S_p|$ or $|S_p| \ge 6$, then C(pq, S) has a Hamiltonian decomposition.

Proof: By Lemma 2.1, we have $C(pq, S) \cong (C(p, S_q/q) \times C(q, S_p/p)) \oplus C(pq, S_u)$. By Theorem 1.2, we know that $C(p, S_q/q) \times C(q, S_p/p)$ has a Hamiltonian decomposition, and $C(pq, S_u)$ has a Hamiltonian decomposition if $S_u$ is nonempty. Therefore, C(pq, S) is Hamiltonian decomposable.
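Theorem 3.1 rests on the fact that a symbol class $E_a$ is a Hamilton cycle when gcd(n, a) = 1. The following sketch checks this for n = 7 and, for contrast, shows a symbol class that is not a Hamilton cycle. The parameters and helper names are chosen for illustration only.

```python
from math import gcd


def symbol_class(n, a):
    """E_a: the edges {i, i + a} of a circulant on Z_n."""
    return {frozenset((i, (i + a) % n)) for i in range(n)}


def is_hamilton_cycle(vertices, edges):
    """True if the edge set forms one cycle through every vertex exactly once."""
    vertices = set(vertices)
    neighbours = {v: set() for v in vertices}
    for edge in edges:
        u, v = tuple(edge)
        neighbours[u].add(v)
        neighbours[v].add(u)
    if len(edges) != len(vertices):
        return False
    if any(len(neighbours[v]) != 2 for v in vertices):
        return False
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for w in neighbours[v] - seen:
            seen.add(w)
            stack.append(w)
    return seen == vertices


# C(7, S) with S+ = {1, 2, 3}: every symbol is coprime to 7.
n, S_plus = 7, [1, 2, 3]
classes = [symbol_class(n, a) for a in S_plus]
all_hamilton = all(is_hamilton_cycle(range(n), E) for E in classes)
all_coprime = all(gcd(n, a) == 1 for a in S_plus)

# Contrast: in Z_12 the symbol 4 has gcd(12, 4) = 4, and E_4 splits into triangles.
e4_is_hamilton = is_hamilton_cycle(range(12), symbol_class(12, 4))
```

For n = 7 the three classes are pairwise disjoint and cover all 21 edges of the complete graph on 7 vertices, so they form a Hamiltonian decomposition, in line with Corollary 3.2.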
Theorem 3.5: If $n \le 15$, then C(n, S) is Hamiltonian decomposable.

Proof: If $n \le 6$, the degree of C(n, S) is at most 5. Therefore, C(n, S) has a Hamiltonian decomposition by Theorem 1.1. For n = 7, 11 and 13, C(n, S) has a Hamiltonian decomposition by Corollary 3.2. For n = 8 and 9, we have $S^+ \subseteq \{1, 2, 3, 4\}$. It is easy to see that we can partition $S^+$ into subsets which satisfy Theorem 3.3 or Theorem 1.1. Thus, C(8, S) and C(9, S) are Hamiltonian decomposable. For n = 12, $S^+ \subseteq \{1, 2, 3, 4, 5, 6\}$. We need consider only $S^+ = \{1, 2, 4\}$, $\{2, 3, 4\}$ and $\{5, 2, 4\}$, since all the other cases are covered by Theorems 1.1 and 3.3. But $\{5, 2, 4\} = 5\{1, 2, 4\}$, so we need only consider the cases $S^+ = \{1, 2, 4\}$ and $\{2, 3, 4\}$. If $S^+ = \{1, 2, 4\}$, the three edge disjoint Hamilton cycles of C(12, S) are: 0 10 2 6 7 3 11 1 9 5 4 8 0; 0 1 3 5 7 9 11 10 8 6 4 2 0; and 0 4 3 2 1 5 6 10 9 8 7 11 0.
If $S^+ = \{2, 3, 4\}$, the three edge disjoint Hamilton cycles of C(12, S) are: 0 2 6 10 8 4 1 5 9 11 7 3 0; 0 4 6 8 11 2 10 7 5 3 1 9 0; and 0 8 5 2 4 7 9 6 3 11 1 10 0.
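The explicit cycles for n = 12 can be verified mechanically. In the sketch below the vertex sequences are the ones listed in the proof, with run-together digits regrouped; reading the last cycle for $S^+ = \{2, 3, 4\}$ as starting at vertex 0 is an assumption, and the function names are hypothetical.

```python
def cycle_edges(sequence):
    """Edges of a closed walk given as a vertex sequence returning to its start."""
    return [frozenset((sequence[k], sequence[k + 1])) for k in range(len(sequence) - 1)]


def is_hamilton_decomposition(n, S_plus, cycles):
    """True if the cycles are Hamilton cycles that partition the edges of C(n, S)."""
    target = {frozenset((i, (i + a) % n)) for i in range(n) for a in S_plus}
    used = []
    for sequence in cycles:
        closes = sequence[0] == sequence[-1]
        visits_all_once = sorted(sequence[:-1]) == list(range(n))
        if not (closes and visits_all_once):
            return False
        used.extend(cycle_edges(sequence))
    return len(used) == len(set(used)) and set(used) == target


cycles_124 = [
    [0, 10, 2, 6, 7, 3, 11, 1, 9, 5, 4, 8, 0],
    [0, 1, 3, 5, 7, 9, 11, 10, 8, 6, 4, 2, 0],
    [0, 4, 3, 2, 1, 5, 6, 10, 9, 8, 7, 11, 0],
]
cycles_234 = [
    [0, 2, 6, 10, 8, 4, 1, 5, 9, 11, 7, 3, 0],
    [0, 4, 6, 8, 11, 2, 10, 7, 5, 3, 1, 9, 0],
    [0, 8, 5, 2, 4, 7, 9, 6, 3, 11, 1, 10, 0],
]

ok_124 = is_hamilton_decomposition(12, [1, 2, 4], cycles_124)
ok_234 = is_hamilton_decomposition(12, [2, 3, 4], cycles_234)
```

Both checks succeed: in each case the three sequences are Hamilton cycles, they share no edge, and together they use all 36 edges of the circulant.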
For n = 10 = 2 × 5 and n = 14 = 2 × 7, we know that both C(10, S) and C(14, S) are Hamiltonian decomposable by Theorem 2.2. For n = 15, $S^+ \subseteq \{1, 2, 3, 4, 5, 6, 7\}$. We need consider only $S^+ = \{1, 3, 6\}$, $\{2, 3, 6\}$ and $\{4, 3, 6\}$, since all the other cases are covered by Theorems 1.1, 3.3, and 3.4. But $2\{1, 3, 6\} = \{2, 3, 6\}$ and $4\{1, 3, 6\} = \{4, 3, 6\}$, so we need only consider $S^+ = \{1, 3, 6\}$. But $C(15, S^+ \cup -S^+) \cong C_3 \times C(5, \{1, 2, 3, 4\}) \cong C_3 \times K_5$. Therefore, $C(15, S^+ \cup -S^+)$ is Hamiltonian decomposable.
Acknowledgement The author wishes to thank Professor B. Alspach for his help and to thank the referees for their helpful comments.
References
[1] B. Alspach; Research Problem 59, Discrete Math., 50, 115 (1984).
[2] J.-C. Bermond, O. Favaron and M. Maheo; Hamiltonian decomposition of Cayley graphs of degree 4, J. Combinatorial Theory Ser. B, 46, 142-153 (1989).
[3] B. Alspach, K. Heinrich and G. Liu; Orthogonal factorization of graphs, preprint.
[4] R. Stong; On Kotzig's conjecture, preprint.
[5] B. Alspach; Hamiltonian partitions of vertex-transitive graphs of order 2p, Congressus Numerantium, 28, 217-221 (1980).
[6] J.-C. Bermond, G. Illiades and C. Peyrat; An optimization problem in distributed loop computer networks, in Combinatorial Mathematics (Proc. 3rd Int. Conf.), New York, 1985, Annals New York Acad. Sci., 555, 45-55 (1989).
[7] J.-C. Bermond, F. Comellas and F. Hsu; Distributed loop computer networks, a survey, IEEE Trans. on Computers (submitted).
[8] F. Boesch and R. Tindell; Circulants and their connectivities, J. of Graph Theory, 8, 487-499 (1984).
[9] J.A. Bondy and U.S.R. Murty; Graph Theory with Applications, Macmillan Press (1976).
Quo Vadis, Graph Theory? J. Gimbel, J.W. Kennedy & L.V. Quintas (eds.) Annals of Discrete Mathematics, 55, 375-384 (1993) © 1993 Elsevier Science Publishers B.V. All rights reserved.
DISCOVERY-METHOD TEACHING IN GRAPH THEORY

Phyllis Zweig CHINN Mathematics Department, Humboldt State University Arcata, California, U.S.A.
Abstract A variety of recent reports have recommended sweeping changes in content and method of teaching mathematics. While many of these reports have been geared to precollege mathematics, their implementation will require changes in the way teachers are prepared in college as well as adaptation to new abilities and attitudes that should be evident in students entering colleges in the future. Graph Theory is an area of mathematics that is particularly well-suited to non-traditional methods of teaching. It is also one of the topics recommended for inclusion in high school mathematics courses. This presentation will include information from two reports from the National Council of Teachers of Mathematics and the Mathematical Sciences Education Board, along with ideas for incorporating discovery-method learning into Graph Theory classes.
In the last few years, several different mathematical organizations have conducted studies of the state of mathematics learning in the United States, and comparative studies regarding mathematics education in a variety of other countries. The results of these studies have received great attention in the press and in educational circles: for example, Everybody Counts. A Report to the Nation on the Future of Mathematics Education [1], Curriculum and Evaluation Standards for School Mathematics [2], A Nation at Risk: The Imperative for Educational Reform [3], Report of the NCTM/MAA Joint Task Force on Curriculum for Grades 11-13 [4], and the Mathematics Model Curriculum Guide [5]. In fact, so much attention has been given to these reports that Solomon Garfunkel wrote the following [6]:

“In a recent issue of Business Week devoted to education I found the following quote: ‘No more awards for predicting rain, only for building arks!’ Basically I agree with the underlying sentiment. This is not to suggest an anti-research bias. I recognize that we don’t have the solutions to educational reform and that time (and money) must be spent on thinking about problems before implementing ‘solutions.’ Having said that, we do know what the problems are and we have raised consciousness to the point where they know it too.

“We have convinced the general public that our nation is at risk, that everybody should count, that we need to invoke new standards by incorporating important mathematical strands in our curriculum frameworks. Enough already - we won. We have convinced a remarkable number of people that there is a fire that needs to be put out...”

Virtually all of the reports mentioned above discuss the need for a change in both the content and the methods of teaching mathematics at all levels from kindergarten through graduate school. Many of these reports urge the teaching of more discrete mathematics, including graph theory.
In this paper I would like first to discuss briefly some of the changes in method and emphasis of teaching mathematics that are common to many of the recent reports and then to describe a style of teaching an undergraduate graph theory class that seems to fit well with current educational recommendations. Among the items on which the NCTM/MAA joint task force felt the mathematical community has reached consensus is:

“13. Teachers of mathematics in grades 11-13 should employ strategies that encourage student reading, writing and reflection. Assignments ... should be designed to help students become more independent learners of mathematics and to increase their abilities to discuss both orally and in writing the mathematical ideas they are learning.” (See [4] p.3).
According to the section of Everybody Counts entitled “Teaching ... Learning Through Involvement,” “Evidence from many sources shows that the least effective mode for mathematics learning is the one that prevails in most of America’s classrooms: lecturing and listening. Despite daily homework, for most students and most teachers, mathematics continues to be primarily a passive activity: teachers prescribe; students transcribe. Students simply do not retain for long what they learn by imitation from lectures, worksheets, or routine homework. Presentation and repetition help students do well on standardized tests and lower-order skills, but they are generally ineffective as teaching strategies for long-term learning, for higher-order thinking, and for versatile problemsolving. “Teachers, however, almost always present mathematics as an established doctrine to be learned just as it was taught. This “broadcast” metaphor for learning leads students to expect that mathematics is about right answers rather than about clear creative thinking ... [l] p.57. “In reality, no one can teach mathematics. Effective teachers are those who can stimulate students to learn mathematics. Educational research offers compelling evidence that students learn mathematics well only when they construct their own mathematical understanding. To understand what they learn, they must enact for themselves verbs that permeate the mathematics curriculum: examine, represent, transform, solve, apply, prove, communicate. This happens most readily when students work in groups, engage in discussion, make presentations, and in other ways take charge of their own learning ... [11 p.58. “No teaching can be effective if it does not respond to students’ prior ideas. Teachers need to listen as much as they need to speak. They need to resist the temptation to control classroom ideas so that students can gain a sense of ownership over what they are learning. 
Doing this requires genuine give-and-take in the mathematics classroom, both among students and between students and teachers. The best way to develop effective logical thinking is to encourage open discussion and honest criticism of ideas ... [1] p.59.
“When students explore mathematics on their own, they construct strategies that bear little resemblance to the canonical examples presented in standard textbooks. Just as children need the opportunity to learn from mistakes, so students need an environment for learning mathematics that provides generous room for trial and error. In the long run, it is not the memorization of mathematical skills that is particularly important - without constant use, skills fade rapidly - but the confidence that one knows how to find and use mathematical tools whenever they become necessary. There is no way to build this confidence except through the process of creating, constructing, and discovering mathematics ... [1] p.60.
Discovery-method teaching in graph theory
“Teachers’ roles should include those of consultant, moderator, and interlocutor, not just presenter and authority. Classroom activities must encourage students to express their approaches, both orally and in writing. Students must engage mathematics as a human activity; they must learn to work cooperatively in small teams to solve problems as well as to argue convincingly for their approach amid conflicting ideas and strategies.
“There is a price to pay for less directive strategies of teaching. In many cases, greater instructional effort may be required. In those parts of the curriculum where mathematics directly serves another discipline (for example, engineering), students may not march through the required curriculum at the expected rate. In the long run, however, less teaching will yield more learning. As students begin to take responsibility for their own work, they will learn how to learn as well as what to learn.” [1] p.61.
Most of the reports mentioned above call for increased attention to real world applications of mathematics. Graph theory is certainly well-suited to an applications approach, even for fairly young students. An example is described by Hart et al. [7]. However, I wish to discuss a teaching approach that begins at an abstract level and involves students in the type of active learning described in the quotations from Everybody Counts. My description will include some history of experiences teaching in a discovery mode, along with organizational details that might make it easier for other instructors to experiment with this style of teaching.
I have been teaching a discovery-method graph theory class since Spring 1970. Initially the only formal prerequisite for the course was permission of the instructor. At present, I require abstract algebra or my permission. There are two reasons for this requirement. One is that students who have completed abstract algebra usually can write a reasonable mathematical proof, which I expect them to do in this class. The second is that the notion of isomorphism of graphs is easier for students to grasp if they have explored other types of isomorphism in abstract algebra. Perhaps less important is that a student who has studied abstract algebra is in a position to think of and explore the idea of an automorphism group of a graph.
I usually interview each student prior to the first class meeting and explain that the course is designed as an experience in active learning, oriented toward discovery method; that the students will be working without a textbook or lectures; that I will provide relatively little guidance and expect them to do quite a bit of work independently or in small groups. I also warn them that the word “graph” is used in two different ways in mathematics and that this course has nothing to do with graphing as they are familiar with the term, and that an informal prerequisite to the class is not knowing what graph theory is. I request they not look at any graph theory books prior to the class. All those who still express an interest in the class after this vague explanation of what to expect are permitted to enroll. Class enrollment is usually 15 or fewer students. Some of the students are mathematics majors, some teach secondary school mathematics; others are science or engineering majors. Once an English major (mathematics minor) took the class, and one psychology graduate student has taken it.
The class meets once a week for three hours, with a short break (around 15 minutes) halfway through that period. The first week I bring refreshments for the break. In subsequent weeks the students take turns bringing something for a snack. The break is an important part of the class process, especially since three hours is too long a time for anyone to concentrate without a break. The break gives students a short time to stand up and stretch and to think about ideas that have been discussed, and to formulate new ideas that grow from the work
P.Z. Chinn
thus far. Usually students continue talking informally during this time, although some continue working on a problem and others go outside for fresh air. This informal break time leads to a camaraderie among the class members that allows them to take more risks in suggesting conjectures or sharing newly formulated ideas and in general leads to more comfortable interactions between students and less reluctance by students to make presentations to the class.
During class we sit around a large table or move desks into a circle, to facilitate student discussion and add to an informal atmosphere.
At the first class meeting I explain again that graph has two different mathematical meanings and that the graphs with coordinates and equations are not those meant by graph theory. I then give students three pages with many examples of graphs in geometric, set theoretic and matrix representations. Each page is labeled in three sections: “Each of these is a graph,” “None of these are graphs,” and “Which of these are graphs?” A partial sample of one such representation is shown in the figure below.
[Figure: partial sample of one such representation from the handout, with points labeled p1 through p6.]
It was only after the second time I taught this class that I realized how much easier it is for us to discover what something is when we can also see examples of what it is not. Once students can identify the graphs correctly, I ask them to find a definition that includes and excludes exactly the proper examples. I encourage them to discuss their ideas in groups of 3-4 students. Usually the room is very silent for 15 minutes while individuals think. Then they begin conversations with a group of neighbors. Once a group of students has made guesses as to which of the unknown examples in one of the representations are graphs, I will look at their
answers and tell them either that all are correct, or that a certain number are incorrect, without telling them which answers are incorrect. Groups of students repeat this process until everyone knows which examples are graphs. The mathematical background of these students is such that they have some difficulty relating the matrices to the two other representations. Sometimes I will ask members of a group questions about what entries in a matrix represent, as a hint that there are labels associated with rows and columns. This usually suffices for one or two students to discover the correspondence between a matrix and a set theoretic or geometric version of a graph. Eventually each class formulates a definition of a graph similar to a standard one; for example, a graph is a finite non-empty set of elements called points and a set of elements called lines such that
(1) two distinct points are associated with each line, called its endpoints;
(2) no two lines share more than one endpoint.
The next problem for the class is to determine what it should mean for two graphs to be essentially the same or isomorphic. I am able to motivate this question by asking several students to draw a geometric representation of a graph associated with one of the set theoretic or matrix representations from their initial handouts. Once the class sees that two graphs may look very different yet be the same, they are more willing to explore the concept of isomorphism for graphs. For many students this seems to be the first experience they have of isomorphism in the abstract. They can state what it means for two groups to be isomorphic or two rings, etc., but have not considered what it means for two systems to be isomorphic, independent of what particular type of system is being discussed. In the case of graphs, understanding of isomorphism may also be complicated by the fact that there is no easy known test for graphs to be isomorphic, and the students tend to confuse the abstract definition with a means of testing whether the definition holds for particular graphs. Throughout the semester discussion returns to the notion of isomorphism, as students find new necessary conditions for graphs to be isomorphic. Eventually, the students begin to understand that a graph property may be defined as any property which holds for all isomorphic copies of a graph.
Once the class as a whole has discussed and agreed upon definitions for graphs and isomorphism, the students are essentially in the position of any mathematician investigating a new abstract system. From this point, students set out to seek properties that apply to all graphs, or perhaps to a subset of graphs which they can define or characterize in some fashion. They are free to use any representation of graphs they prefer. They create their own definitions, questions, hypotheses, theorems, proofs, examples or counterexamples.
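The sample definition of a graph and the notion of isomorphism discussed above can be illustrated with a short sketch. It is only an illustration under stated assumptions, not part of the course materials: the function names (is_graph, adjacency_matrix, degree_sequence, are_isomorphic) are hypothetical, and the isomorphism check simply tries every relabeling of the points, which is feasible only for very small graphs. It also separates the definition of isomorphism from a necessary condition such as equal degree sequences.

```python
from itertools import permutations

def is_graph(points, lines):
    """Check the sample definition: a non-empty point set, each line joining
    two distinct points, and no two lines sharing both endpoints."""
    if not points:
        return False
    seen = set()
    for line in lines:
        ends = frozenset(line)
        if len(ends) != 2 or not ends <= set(points):
            return False      # a line needs two distinct endpoints in the graph
        if ends in seen:
            return False      # two lines would share both endpoints
        seen.add(ends)
    return True

def adjacency_matrix(points, lines):
    """Matrix representation: rows and columns are labeled by the points."""
    index = {p: i for i, p in enumerate(points)}
    matrix = [[0] * len(points) for _ in points]
    for a, b in lines:
        matrix[index[a]][index[b]] = 1
        matrix[index[b]][index[a]] = 1
    return matrix

def degree_sequence(points, lines):
    """Sorted list of the number of lines at each point."""
    return sorted(sum(1 for line in lines if p in line) for p in points)

def are_isomorphic(points1, lines1, points2, lines2):
    """Brute-force test: search for a one-to-one relabeling preserving lines."""
    if len(points1) != len(points2) or len(lines1) != len(lines2):
        return False
    if degree_sequence(points1, lines1) != degree_sequence(points2, lines2):
        return False          # a necessary condition fails
    target = {frozenset(line) for line in lines2}
    for image in permutations(points2):
        relabel = dict(zip(points1, image))
        if {frozenset((relabel[a], relabel[b])) for a, b in lines1} == target:
            return True
    return False

# Two drawings of the same four-point cycle, labeled differently.
square = ([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4), (4, 1)])
crossed = (["a", "b", "c", "d"], [("a", "c"), ("c", "b"), ("b", "d"), ("d", "a")])
print(are_isomorphic(*square, *crossed))

# Same degree sequence, yet not isomorphic: a six-point cycle and two triangles.
hexagon = (list(range(6)), [(i, (i + 1) % 6) for i in range(6)])
triangles = (list(range(6)), [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)])
print(are_isomorphic(*hexagon, *triangles))
```

The second pair shows why a necessary condition is not a test: both graphs have every point of degree 2, but no relabeling carries one onto the other.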
Students are allowed, even encouraged, to work with one another during the week, as long as they do not consult written materials or former students. In the weekly class meetings students discuss ideas they have had the previous week: problems they thought up and perhaps solved, hypotheses, examples, counterexamples, proofs. During these class meetings I take notes. I do some rewriting and organizing of the presentations in these notes and then run them off for students to pick up the following day. The students are thus freed to think about and participate in the discussions without fear of forgetting what went on during class. Every week each student is expected to turn in something written - any results they have completed, plus an indication of work in progress. I read these, keeping a set of notes on what each student has done, and write comments or suggestions on the work before returning it.
The first problem that confronts the students, once a framework for graph theory has been established, is where to begin. Despite the fact that all of the students have seen some finite axiomatic systems before, it takes some encouragement for them to bother defining the most elementary concepts - for example, giving a name (usually order) to the number of points in a graph, or the number of lines adjacent to a point (usually index or degree). The second class meeting is usually half over before students finish their first discussion of isomorphism, so these earliest concepts are usually named during class after some prompting from me encouraging students to consider some of the most basic properties of graphs. Thus, for example, I ask what simple, even obvious, things they can say about all graphs. A typical beginning might be a student suggesting that graphs all have points. What can you say about the points of a graph? You can say how many points there are in the graph. This suggests the number of lines, and the number of lines at each point. With some prompting, this leads to discoveries of relations among these concepts. I often have to encourage students to explore concepts at so elementary a level, since they tend to move into more complex ideas quickly, omitting basic vocabulary that is needed to state more complicated concepts.
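One well-known relation among these elementary concepts is that the degrees of the points add up to twice the number of lines, because each line is counted once at each of its two endpoints. The sketch below checks this on a small example; the helper names order, size and degree and the example graph are assumptions made for illustration only.

```python
def order(points):
    """Number of points in the graph."""
    return len(points)

def size(lines):
    """Number of lines in the graph."""
    return len(lines)

def degree(point, lines):
    """Number of lines having the given point as an endpoint."""
    return sum(1 for line in lines if point in line)

# A small example graph with one isolated point (p5).
points = ["p1", "p2", "p3", "p4", "p5"]
lines = [("p1", "p2"), ("p1", "p3"), ("p2", "p3"), ("p3", "p4")]

degrees = {p: degree(p, lines) for p in points}

# Each line is counted once at each of its two endpoints.
assert sum(degrees.values()) == 2 * size(lines)

# Consequently the number of points of odd degree is even.
assert sum(1 for d in degrees.values() if d % 2 == 1) % 2 == 0
```

The second assertion is an immediate consequence of the first, which is the kind of small step from one observed relation to another that the paragraph above describes.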
Later in the course when students are beginning new broad topics, it is all right to let them discover the need for basic definitions and relations in order to be able to state more complex ones, but for their first experience with free-form thinking about a mathematical topic, I prefer giving enough guidance for them to be able to ask and answer a few simple questions, rather than letting them rush immediately into the difficult questions that seem to occur more naturally to them. Choosing a suitable name for a concept proves to be an unexpected challenge for students. Some names are rejected by a class for a definition being discussed, but are reserved for later use, often prompting a search for some property that could be suitably called by that name. For example, the number of lines adjacent to a point could be called the index, degree or dimension of that point. One class rejected dimension as being “too powerful” a word for such a simple concept, and then set about deciding how one could assign a meaning to the dimension of a graph. Thus, most definitions begin with a concept and proceed to a name, but a few develop in the opposite order. The class meetings are often noisy and charged with excitement. Sometimes several people speak simultaneously. Small groups of students working on a particular problem often talk with one another, temporarily ignoring the rest of the class. Sometimes someone presents a proof to the whole class, while the others attempt to tear it apart. Some proofs stand up under the scrutiny; in others flaws are found. This often results in more work on the part of students trying to find a valid proof. In every class, one or more students discover results which have only appeared in print within the last 40 years, and a few students have proven minor results which are genuinely new.
I do not tell students when they are covering known results, nor when they are venturing off into new areas of graph theory. In this way all the material is equally new and potentially solvable for them. I do occasionally help them formulate definitions once the class has decided on the essential content of the definition. I also offer possible nomenclature for properties, although my suggestions are often rejected. Often making up suitable names for properties takes up much time and energy. In some subject areas it could be important for students to know the usual name for a particular concept. Since graph theory nomenclature is not yet standardized, it does not seem to be a great disadvantage for students to select either one of the usual names for a property or a related, suggestive name.
One skill in which students improve greatly during the semester is an ability to decide what they are trying to say. The class as a whole often helps a student by asking good questions. In particular, when formulating a definition, students learn to consider which cases they want to exclude as well as which should be included. Many graph properties apply only to connected graphs, or to sufficiently large graphs. Students learn to consider what special conditions have to be met for a hypothesis to make sense. They also increase in their effectiveness in communicating their ideas. As the course progresses, certain patterns for asking questions begin to emerge. For example, often a student forms a hypothesis about graphs and then finds a counterexample. One which occurs every year is that a graph can be traced without lifting your pencil and without repeating any lines. This is an instance where clearly only connected graphs are meant. Even with this refinement, the hypothesis is easily shown to be false. Immediately, this false conjecture can be used to formulate a class of traceable graphs, and students seek conditions for a graph to be traceable. This process can always be applied. Whenever a property fails to hold for all graphs, one can seek to characterize those graphs for which that property does hold. Another pattern for asking questions emerges involving numbers (invariants) associated with a graph: the order of a graph, degree of a point, number of lines in a graph, number of colors necessary to color the points or lines under various rules, etc. Whenever a number is defined, one can immediately ask what are the extremal values which this number can assume? How can one characterize graphs for which the number under consideration takes on its extremal values? This type of question occurs in almost every area of mathematics, but students often do not notice this until they begin to ask many similar questions by themselves. 
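The search for conditions under which a graph is traceable can be made concrete with a small sketch. It assumes the classical characterization usually credited to Euler, which the text above does not state: a connected graph can be traced without lifting the pencil and without repeating a line exactly when it has zero or two points of odd degree. The function names and example graphs are hypothetical.

```python
def is_connected(points, lines):
    """Search outward from one point and check that every point is reached."""
    points = list(points)
    reached, frontier = {points[0]}, [points[0]]
    while frontier:
        current = frontier.pop()
        for a, b in lines:
            if current in (a, b):
                other = b if current == a else a
                if other not in reached:
                    reached.add(other)
                    frontier.append(other)
    return len(reached) == len(points)

def is_traceable(points, lines):
    """Assumed criterion: connected, with zero or two points of odd degree."""
    if not is_connected(points, lines):
        return False
    odd = sum(1 for p in points
              if sum(1 for line in lines if p in line) % 2 == 1)
    return odd in (0, 2)

square_lines = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
square = (list("abcd"), square_lines)
# A square with a "roof" point e joined to c and d.
house = (list("abcde"), square_lines + [("c", "e"), ("d", "e")])
# Every pair of the four points joined: all four points have odd degree.
complete4 = (list("abcd"), square_lines + [("a", "c"), ("b", "d")])

for name, graph in [("square", square), ("house", house), ("complete4", complete4)]:
    print(name, is_traceable(*graph))
```

Under this criterion the square and the house are traceable while the complete graph on four points is not, so the latter is a counterexample to the hypothesis that every connected graph can be traced.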
Another class of questions consists of counting questions. Given any graphical property, one can ask how many different (non-isomorphic) graphs are there which have this property? Many graph theory counting problems are difficult, perhaps beyond the ability of undergraduate mathematics majors in a setting like this class, but every year some students do enjoy attempting these types of questions.
Several more patterns of questions arise in connection with finding ways to map one type of graph onto another. For example, given any graph, one can form its line graph - where each line of the original graph is a point of the line graph, and two points of the line graph are joined by a line if the corresponding lines of the original graph have an endpoint in common. Students can seek other ways to form new graphs. Alternatively they can explore dual concepts, in a finite or projective geometry sense of interchanging the roles of point and lines in theorems. For example, dual to a traceable graph is one containing a circuit which includes each point exactly once. A third direction which can be taken following the introduction of graphs derived from others, is to categorize graphs where properties are preserved under the mapping.
Students do not discover these patterns of questions spontaneously. I have only noticed these categories after offering this type of discovery-learning class many times. Now I point out to the students when a question they have asked follows a particular pattern, and challenge them to generate other similar questions.
Graph Theory is particularly well-suited to this minimally directed, discovery-learning approach, because the background needed is slight and many of the problems are independent of one another. In fact, few articles in the current literature show any broad interrelations among the questions of graph theory. Encouraging students to view questions as being particular instances of general types of questions is one way to give some continuity to the wide
variety of concepts dealt with in this course. An introductory course in any easily axiomatized system (for example, matroids, point-set topology or group theory), might be handled in a similar manner. The important thing is for a mathematics student to do independent mathematical thinking, to learn where mathematical problems originate, and to improve in their ability to communicate their mathematical ideas.
Students find this course exciting and challenging, and produce a surprising quantity and quality of work. One student wrote a note which describes the attitudes of many students: “Our class time went by entirely too fast. I really don’t know how much time I devoted to the course but every second was worth it. I had to think for myself but no one told me what to think or how to think it. This was great. I had freedom to do what I wanted and consequently I was willing to work harder and feel that I learned much more ...”.
By the end of a semester of studying graph theory in the discovery method format, students have usually explored all the topics I might cover in a typical lecture course, using a text like [8] or [9]. Some of the topics are covered much more superficially than would occur in a lecture course, while a few are covered in more depth. Some topics are brought up spontaneously by students, others grow out of suggestions I make to students that they pursue a particular extension or variation of something they turn in on one of their written weekly reports or that they present to the class. In general I feel that their greater independence and their willingness to work to understand concepts that they have chosen for themselves or that their classmates have discovered, is a good trade-off for missing some “traditional” tools or concepts or applications.
In addition to covering a substantial portion of Graph Theory, students have often increased their tolerance for working at a problem whose solution is not immediately evident. They report that the “answer in back of book becomes less important” and they have become “more interested in what the answer means than what it is”. Several students have returned to tell me that they feel this class has made a great deal of difference in how they study for and think about their later math classes. While the specific content of the graph theory class may not be directly useful for later courses, the methodology seems to enhance the learning skills of students.
To assign grades for this course I usually ask the students what grade they feel they deserve. I do suggest some guidelines, for example, that an A grade should indicate a student has successfully proven some results and has offered some original ideas during the semester. Almost always the students’ evaluations of the grade deserved matches my own and the students are content. If we disagree, a student and I discuss our differences and reach a compromise. This usually involves a student underestimating the grade deserved.
Occasionally students do well in their mathematics courses all through an undergraduate major, and sometimes even through graduate school. Suddenly, when they need to write a dissertation, these students find they need a new set of skills, calling for originality and an ability to ask questions and refine those questions until they are answerable. A discovery method class may be one of the few instances where students can improve these skills before deciding whether or not to pursue a doctorate. In fact, several students who have taken this class from me decided to pursue a graduate degree in mathematics when they had not previously planned further mathematical study!
Many mathematics professors are currently teaching using the Moore method, based on teaching techniques of Professor R.L.
Moore of the University of Texas, Austin. Students in Moore-method courses and those in my course are all actively involved in doing mathematics
with no use of textbooks or other reading material. In both types of classes, students must find out for themselves what techniques will work to solve a particular problem. In both types of discovery courses, the students are able to make use of much of their prior mathematical learning. They have to decide for themselves what might be relevant to the question at hand. The major difference between the two styles of teaching is in the source of questions and problems for the students to consider. Moore-method teaching relies on the instructor to pose problems and stimulate students to produce elegant solutions. In Moore-method classes the definitions and theorems are given and students must discover the proofs. In this graph theory course, discovering the content of the course is part of the learning experience. The Moore method is an especially good method for outstanding students with strong mathematical backgrounds. Few of my students meet these requirements, yet in the Graph Theory course many of them do high quality mathematics, working on problems which they or their classmates suggest. In this class, asking the questions is as important as learning to answer them.
Further information regarding discovery teaching in mathematics can be found in references [10]-[12]. In addition, Rowe [13] contains an extensive bibliography on this topic.
In summary, the discovery-method class described above is particularly effective in addressing some of the concerns quoted earlier from Everybody Counts [1] and at enhancing some of the mathematical skills espoused in the NCTM Standards [2], which I have listed below:
Standard 4: Mathematical Power
ability to apply their knowledge to solve problems within mathematics and in other disciplines;
ability to use mathematical language to communicate ideas;
ability to reason and analyze.
Standard 5: Problem Solving
formulate problems;
apply a variety of strategies to solve problems;
solve problems;
verify and interpret results;
generalize solutions.
Standard 6: Communication
express mathematical ideas by speaking, writing, demonstrating and depicting them visually;
understand, interpret, and evaluate mathematical ideas that are presented in written, oral, or visual forms;
use mathematical vocabulary, notation, and structure to represent ideas, describe relationships, and model situations.
Standard 7: Reasoning
use inductive reasoning to recognize patterns and form conjectures;
use reasoning to develop plausible arguments for mathematical statements;
use deductive reasoning to verify conclusions, judge the validity of arguments, and construct valid arguments;
analyze situations to determine common properties and structures;
appreciate the axiomatic nature of mathematics.
Standard 8: Mathematical Concepts
label, verbalize, and define concepts;
identify and generate examples and nonexamples;
use models, diagrams, and symbols to represent concepts;
translate from one mode of representation to another;
recognize the various meanings and interpretations of concepts;
identify properties of a given concept and recognize conditions that determine a particular concept;
compare and contrast concepts.
Standard 10: Mathematical Disposition
confidence in using mathematics to solve problems, to communicate ideas, and to reason;
flexibility in exploring mathematical ideas and trying alternative methods in solving problems;
willingness to persevere in mathematical tasks;
interest, curiosity, and inventiveness in doing mathematics;
inclination to monitor and reflect on their own thinking and performance;
valuing of the application of mathematics to situations arising in other disciplines and everyday experiences;
appreciation of the role of mathematics in our culture and its value as a tool and as a language.
References
[1] National Research Council; Everybody Counts: A Report to the Nation on the Future of Mathematics Education, Board on Mathematical Sciences and Mathematical Sciences Education Board, Washington, D.C., National Academy Press (1989).
[2] National Council of Teachers of Mathematics; Curriculum and Evaluation Standards for School Mathematics, Reston, Virginia, National Council of Teachers of Mathematics (1989).
[3] National Commission on Excellence in Education; A Nation at Risk: The Imperative for Educational Reform, Washington, D.C., U.S. Government Printing Office (1983).
[4] MAA-NCTM Task Force on Math Curriculum for Grades 11-13; Report of the NCTM/MAA Joint Task Force, Reston, Virginia, National Council of Teachers of Mathematics, Preliminary Report (1987).
[5] California State Department of Education; Mathematics Model Curriculum Guide, Sacramento, California (1987).
[6] S. Garfunkel; Building arks, UME Trends, 2, 2-5 (1990).
[7] E.W. Hart, J. Maltas and B. Rich; Teaching discrete mathematics in grades 7-12, Mathematics Teacher, 362-367 (1990).
[8] F. Harary; Graph Theory, Addison-Wesley, Reading, Massachusetts (1969).
[9] N. Hartsfield and G. Ringel; Pearls in Graph Theory, Academic Press, Boston (1990).
[10] K.B. Henderson; Anent the discovery method, The Mathematics Teacher, 50, 287-291 (1970).
[11] P.S. Jones; Discovery teaching - from Socrates to modernity, The Mathematics Teacher, 63, 501-511 (1970).
[12] D. Resek and D. Fendel; A case for teaching exploration, Mathematically Speaking, Addison-Wesley, Reading, Massachusetts (1990).
[13] M.B. Rowe; Teaching Science as a Continuous Inquiry: A Basic, McGraw-Hill, New York (1978).
Index of Key Terms Symbols
#P 282 -complete 282 - h d 282 Numerics 1-factor 30, 118, 191,275 2-dimensional mesh 90 3-coloring 211 A abelian group Cayley graph of 367 achiralliak 162 acyclic digraph 313 nichegraph 324 polynomial 175 airpollution 27 algorithm analysis of 33 approximation 22 complexity 202,277 development in graph theory 201 distributed 23 efficiency 279 efficient 201 ellipsoid 289 existence of 21 exponential 201 graph 20 inefficient 201 LasVegas 281 lying 22 MonteCarl0 281 nor-deterministic polynomial time 278 hF' 278 number of embeddings 256 off-line 20 on-line 20 padel 23,292 planargraph 203 planarity 249,252 polynomial 275 polynomialhime 201 PQ-trw planarity testing 249 probabilistic 24 probabilisticanalysis 33 random approximation 23 sequential 292 st-numbering 250 almost all 265,342 almost surely (as.) 342 altemating cycle 284
alternating (cont.) knot 165 link 165 linkdiagram 165 amenable coloring 14 amphicheiral knot 162 analysis asymptotic random graph 345 cluster 93 F-polynomial 174 anthropology 26 antihole 220 applications of graph theory 24 role in graph theory 13 approximate counting 298 approximationalgorithm 22 random 23 aromatic character of chemical graph 123 artificial intelligence 31 asymptotic analysis random graph 345 atactic polymer 111 automorphism 77 avoidance theorem 34
B badcut 187 balanced 26,268 extension 33,269 strictly 270 beknottedness 165 Benson graph 114 Beraha number 2 poIynomial 2, 155 Berge conjecture 22.34 Bernoulli blockmodel 351 graph 351 subset 351 lriangle graph 352 bicircularmatroid 268 biconnected graph planar 257 bicritical graph 191 Big Bang in random graph 32 binary ladder 327 binding number 147 bipartite graph 275,327 complete 315 equitable 327 factor size 300 Birkhoff-Lewisequations 153, 155 block model Bernoulli 351
386
block,odd 187 Boolean circuit approach to complexity 277 monotone 302 size 301 bracket polynomial 168 brick decomposition 191 bridge 90 Brooks theorem 2,48,216 buildingblock 225
C cage graph 30,114 capacitated matching 302 Cartesian product 191.367 extendability 1% rank 258 Cayley color-graph directed 73 graphabelian 367 center 15.73.89 chemical structure 111 detour 130 eccentricpoint 89 generalized 111 central distance set 91 vertex 91 certificatemethod 278 chainingin cluster 93 channel assignment 14 characteristicpolynomial 174 Characterizing F-pol~m~mial 175 chemical applications 17,109 documentation 29 graphtheory 109 information retrival 110 nomenclature 29 Chinese postman problem 183 chiralknot 160 choicenumber 14 choosable 14 chordal 14 graph 141,220,285,295 strong 295 chromatic critical 50 join 157 number 45,261,289 defective 14 sequence 48 strong 235 polynomial 142,153, 174
Index of Key Terms
chromatic (cont.) polynomial derivative 174 generalizedpolynomial 20 theory F-polynomial 176 chromatic number 74 random graph 344.346 chromaticdly connected 227 chromial 153 constrained 154 free 154 non-planar 154 planar 154 circle paclang 31 circuit Boolean 302 Boolean size 301 rigid 220 value problem 296 circulant 367 circulant graph 367 circumference graph 145 classification 93 claw 276 claw-free 276 clique 25.55 clustergraph 93 cover 143 edge covering 315 number 141,262,289 randomgraph 263 polynomial 174 vertex covering 3 15 closed neighborhood 317 cluster 93 chaining 93 complete linkage 93 multigraph 94 significant 25 single linkage 93 statistical network 352 stronglinkage 93 weaklinkage 93 cluster analysis graph 93 graphtheory 103 heterogeneity 94 homogeneity 94 probability model 94 randomgraph 95 cluster graph 93 clique 93 component 93 edge 93 vertex 93
Index of Key Terms clustering 16,31 cochromatic number 261 theory 261 co-comparabilitygraph 295 coding chemical structure 110 cohomology theory 6 color class 45 coloring 8,13,45,81, 153,211,261 amenable 14 complete 46 complexity 71 defective 14.46 generalized 45 hypergraph 242 list 8, 14 planargraph 54 problem 71 proper 45 Ramsey theory 261 strongly 235 unique 52,226 vertex 214 combinatonal finite basis theorem 205 common enemy graph 322 communication in noisy channel 314 comparability graph 295 comparator circuit value problem 296 competition common enemy graph 322 graph 23,313 number 313 double 314,322 generalized 313 complete bipartite graph 3 15 coloring 46 linkage cluster 93 listing 298 complexity algorithm 202 Boolean circuit approach 277 class 160 algorithm 277 randomized 281 coloring 71 counting problem 277 decision problem 277 knot 159 knot mviality 163 recognition 65 search problem 277 component cluster graph 93
component (cont.) exterior 221 giant 32 interior 221 multigraph projection 94 computer graph 238 computing network 24 conflict graph 22 conjecture Hadwiger 54 Ramanujan 6 Tait 31 Tutte 185 Ulam 175 connectivity 183 chromatic 227 multigraph 94 random graph 342 random multigraph 97 WhP 290 consensus protocol 24 constrained chromial 154 contraction 54 elementary 54 sub- 54 contractive edge 155 coronoid graph 121 counting approximate 298 exact 298 problem complexity 277 Turing machine 282 cover 290 clique 143 cycle 183 double cycle 186 F 173 graph 173 vertex 276 weight 173 covering 20 edge clique 315 number edge 143 vertex 143 vertex clique 315 critical chromatic 50 graph 261, 290 crossing nugatory 165 number 20, 31 link 164 crossover 230
crystal lattice 298 cubic map 212 cut 240 bad 187 cycle alternating 284 cover 183 decomposition 187 even 187 dominating 147 double cover 1% Hamilton 137, 195, 367 isolating 222 rainbow 85 spanning random graph 347 structure 145
D data structure PQ-tree 249 decision problem complexity 277 deck edge number 60 endtree number 60 number 60 reduced number 60 total number 60 decomposition brick 191 cycle 187 F-polynomial 174 Hamiltonian 367, 371 defective chromatic number 14 coloring 14, 46 degree distribution in network 353 maximum 130 minimum 130 dense graph 295 density 33 global 33 dependency stochastic 355 Desargues-Levi graph 113 deterministic Turing machine 281 detour center 130 diameter 130 distance 127 vertex 131 eccentric sequence 129 eccentricity 128 set 128 graph 133 median 131
detour (cont.) path 127 periphery 132 radius 130 D-graph 89 diagram link 160 diameter 15,s clustering 17 detour 130 diametral pair of vertices 89 path 89 diametrical path 15 dichromatic number 262 digraph acyclic 313 weighted 27 dimer 298 directed Cayley color-graph 73 discovery method teaching 377 discrete mathematics teaching 375 disjoint path problem 205 disk dimension problem 204 distance 89, 111, 127 central set 91 central vertex 91 concepts 15 detour 127 vertex 131 graph edge rotation 18 edge slide 18 rotation 18 vertex 131 distinguished vertex 240 distributed algorithm 23 DL-graph 91 DNA 26 Dodgson winner 24 dominating cycle 147 set 138 domination 20 number 138 Doob Martingale process 345 double competition number 314, 322 loop graph 368 star 327 starlike tree 327 drug design 29, 109 dual plane graph link diagram 161 dual polyhex graph 120 dyad statistics 351
E eccentric center point 89 sequence 129 vertex 89 eccentricity 89, 111 detour 128 of vertex 15 ecology 23, 31 economics 26 ecosystem 27, 313 edge chromatic number of random graph 346 clique covering 315 cluster graph 93 contractive 155 covering number 143 deck 59 density 354 dependency in random network 354 independence number 143 independent 143, 191 maximal f-graph 333 number deck 60 quasi- 224 reconstructibility of random graph 346 rotation distance graph 18 slide 18 distance 18 distance graph 18 special 221 education and graph theory 35 efficiency of algorithm 279 efficient algorithm 201 eigenvalue 114 elementary contraction 54 ellipsoid algorithm 289 embedding level 251 number 253 planar 250 polyhex graph 122 random 258 rank 258 spherical 250 st-graph 250 unrank 258 endtree number deck 60 energy use 27 enumeration 31 planar embeddings 249 equilibrium model for random graph 334 equitable bipartite graph 327 equivalence isotopic 250
equivalence link diagram 160 Erdős-Rényi random graph 333 Eulerian 183, 187 weight 183 evolution of random graph 32 exact counting 298 existence of algorithm 21 exponential algorithm 201 extendability 191 Cartesian product 196 hypercube 194 spanning subgraph 196 extension balanced 33, 269 matching 191 exterior 221 exterior component 221 extremal graph theory 20
F facility location 16 factor 30, 118, 145, 191, 275, 287 graph 116 size of bipartite graph 300 factorization 30 fan 135 F-graph 91 f-graph 333 f-graph edge maximal 333 Fibonacci number of graph 141 finitely presented group 74 five color theorem 2 fleet maintenance 13 flow nowhere-zero 240 food production 27 food web 313 forbidden part 187 four color problem 153, 211 theorem 2, 6, 54 four color problem 34 F-polynomial analysis 174 characterizing 175 decomposition 174 higher level 173 impact in graph theory 173 isomorphism weighted 173 lower level 173 pure 173 reconstruction 175 free chromial 154 function threshold 343 future of graph theory 1, 5
G Gallai class number 290 graph 220 garbage pickup 14 generalized chromatic polynomial 20 coloring 45 genetics 18, 25 genome project 18, 25 genus of knot 166 giant component 32 girth 71 global density 33 globally sparse 271 graceful numbering 34 graph algorithm 20 Benson 114 Bernoulli 351 cage 30, 114 Cayley 73, 367 chordal 141, 220 circulant 367 coloring 13 competition 23, 313 computer 238 coronoid 121 Desargues-Levi 113 DL 91 dual plane 161 enumeration 31 F 91 f 333 factorization 116 Gallai 220 Grötzsch 218 homomorphism 242 integer-distance 137 intersection 142, 231, 313 interval 19, 142, 313, 317 isoprenoid 117
L 90 line 236, 317, 381 Markov 352 medial 161 minor 202 modeling statistical 364 Mycielski 218 niche 324 Paley 206 p-competition 319 perfect 141 Petersen 183, 206
graph (cont.) planar 154, 215, 249, 287 algorithm 203 projective 242 polyhex 119 polynomial 20 Ramsey 270 reaction 112 Robertson 206 S 90 signed 168, 170 terpenoid 118 unit interval 142 Wegner 206 well of 1, 207 graph theory algorithm development 201 applications 13, 24 cluster analysis 93, 103 education 35 extremal 20 future 1, 5, 36 mathematics 5, 31 new directions 13 teaching 375 graphite lattice 122 Grötzsch graph 218 theorem 212, 218 group behavior 26 finitely presented 74 knot 165 groupthink 26
H Hadwiger conjecture 6, 34, 54 Hamilton cycle 137, 195, 344, 367 path 137 Hamiltonian decomposition 367, 371 Hamiltonicity 34 toughness 145 head vertex 249 health care delivery 27 Heawood 213 heterogeneity cluster analysis 94 statistical network 352 hexagonal system 30 Hilbert 5 hitting time in random graph 344 hole 220 homogeneity in cluster analysis 94
homogeneity in statistical networks 352 homogeneous vertex 64 homomorphism 71, 242 incompatible 71 problem 71, 74 Hückel molecular orbital theory 114 human genome project 18, 25 hypercube 24, 194, 327 extendability 194 hypergraph 283 coloring 242 random 7
I incompatible graphs 71 independence number 290 edge 143 random graph 263 independent edge 143, 191 subset number 140 subsets 138 vertex set 275 index topological 109 indifference graph 18 induced matching 284 subgraph 127 random graph 267 trivial 8 inefficient algorithm 201 inequality isoperimetric 7 information chemical 110 network management 30 inner-planar 204 input vertex 281 integer-distance graph 137 integrity 20 interior component 221 intersection graph 19, 142, 231, 313 interval graph 19, 142, 313, 317 random 32 unit 142 interval scale 28 irreducible polymer sequence 111 isoarithmic polyhex graph 122 isolated vertices in random graph 343 isolating cycle 222 isomorphism problem 34 random graph 346 weighted F-polynomial 173 isoperimetric inequality 7
isoprenoid graph 117 isotactic polymer 111 isotopic equivalence 250,254 graphs 250 links 160 number of embeddings 253 isotopy regular 169
J jamming network 285 join chromatic 157 Jones polynomial 168, 169
K Kauffman bracket polynomial 168 Kekulé structure 30, 119 kinetic model random graph 334 knot 31, 160 alternating 165 amphicheiral 162 calculus 163 chiral 160 complexity 159 equivalence 164 genus 166 group 165 prime 168 triviality 163 knotted curve 205 König property 289
L ladder 327 binary 327 Las Vegas algorithm 281 layer 1% leading overlap 267 left hand trefoil 160 level embedding 251 isotopic equivalence 254 lexicographic first maximal matching 296 product 192 L-graph 90 line graph 236, 275, 381 competition number 317 linear matroid 288 order planar embedding 249 programming problem 279 link 160 achiral 162 alternating 165
link (cont.) crossing number 164 diagram 160 alternating 165 dual graph 161 equivalence 160 oriented writhe 169 polynomial 168 list chromatic number 8 coloring 8, 14 local edge density 354 locally finite graph 71 location problem 31 log-space 279 transformation 279 uniform 281 lying algorithm 22
M manufacturing 31 map 212 Markov chain 31, 336 decision model 20 graph 352 Martingale 31, 345 matching 20, 191, 275 capacitated 302 extension 191 induced 284 lexicographically first 296 maximum 275 near perfect 370 perfect 30, 118, 191, 275, 367 planar 287 polynomial 175 problem 277, 283 3-dimensional 283 perfect 277 separated 285 stable 286 star 285 value 286 weighted 302 mathematics and graph theory 5, 31 learning 375 matrix Morishima 26 stable 27 matroid 288 bicircular 268 linear 288
maximum degree 130 matching 275 meaningfulness 35 of conclusions 28 medial graph 161 median 131 detour 131 membership problem 71 mesh 2-dimensional 90 metatheorem 35 minimal dominating set 138 minimum cover problem 68 degree 130 spanning tree 18 vertex cover 276 minor ordering 202 modified homomorphism problem 74 molecular biology 25 monotone 343 Boolean circuit 302 Monte Carlo algorithm 281 Morishima matrix 26 multigraph cluster analysis 94 connectivity 94 projection 94 component 94 random 97 multipartite 53 Mycielski graph 218
N near perfect matching 370 neighborhood closed 317 open 317 network 350 computing 24 flow problem 24 jamming 285 management 30 social 28 sociometric 356 statistical analysis 349 topology reliable 23 new concepts 34 niche acyclic digraph 324 number 314, 324 noisy channel communication 314 nomenclature of chemical structure 110 non-deterministic polynomial time algorithm 278
non-deterministic Turing machine 282 non-planar chromial 154 nova 324 nowhere-zero flow 240 NP algorithm 278 -complete 179, 202, 278 -hard 278 nugatory crossing 165 number Beraha 2 choice 14 chromatic 45, 261, 289 defective 14 random graph 344, 346 clique 141, 262, 289 random graph 263 cochromatic 261 competition 313 crossing 20, 31 deck 60 dichromatic 262 domination 138 edge covering 143 edge independence 143 embeddings algorithm 256 Fibonacci - of graph 141 Gallai class 290 independence 290 random graph 263 independent subset recurrence relation 140 isotopic embeddings 253 link crossing 164 list chromatic 8 niche 314, 324 p-competition 319 sequence chromatic 48 strong chromatic 235 vertex covering 143 vertex partition 46 numbering graceful 34 st 249
O
obstruction set 203 odd block 187 girth 71 off-line algorithm 20 on-line algorithm 20 open neighborhood 317 optimal roommate problem 287
ordering minor 202 ordinal scale 28 orientation strongly connected 23 outerplanar 204, 215, 287 output vertex 281
P
P = NP? 6, 34, 237 packing problem vertex 276, 277 pairing partition 61 Paley graph 206 pancyclic toughness 149 parallel algorithm 23, 292 -series graph 291 parity set of matroid 288 parsimonious transformation 282 partition pairing 61 planar 154 path detour 127 diametral 15, 89 Hamilton 137 pattern recognition 16 p-competition graph 319 number 319 perfect 220 elimination scheme 295 graph 141, 289 matching 30, 118, 191, 275, 367 problem 277 vertex elimination scheme 141 periphery 15, 89, 132 detour 132 permanent 299 permutation rank 258 Petersen graph 183, 206 Pfaffian function of matrix 298 orientation of planar graph 298 phase transition 32 planar biconnected graph 257 chromial 154 embedding 250 enumeration 249 linear order 249 rank 249 unrank 249 graph 14, 154, 215, 249, 287 algorithm 203 coloring 54, 242
planar (cont.) matching 287 partition 154 planarity 20 algorithm 252 testing algorithm 249 political science 26 polyhex graph 119 embedding in plane 122 isoarithmic 122 polymer atactic 111 isotactic 111 sequence 111 stereoregular 111 syndiotactic 112 polynomial acyclic 175 algorithm 275 Beraha 2, 155 bracket 168 characteristic 174 chromatic 142, 153, 174 clique 174 F 173 generalized chromatic 20 graph 20 Jones 168, 169 Kauffman bracket 168 link 168 matching 175 reduction 278 rook 174, 175 simple F 173 simple matching 176 subgraph 174 time algorithm 201 transformation 278 Tutte 169 postman tour 183 PQ-tree 250 data structure 249 planarity testing algorithm 249 precedence graph 295 prime knot 168 probabilistic algorithm 24 probabilistic method 6, 33 probability model in cluster analysis 94 problem chemical graph theory 109 Chinese postman 183 chromatic number 346 circuit value 296 coloring 71
disjoint path 205 disk dimension 204 four color 34, 153, 211 homeomorphism 71 isomorphism 34 linear programming 279 location 31 matching 277, 283 membership 71 minimum cover 68 network flow 24 optimal roommate 287 perfect matching 277 reconstruction 34 shortest cycle cover 183 stable marriage 285 roommate 286 three color 211 traveling salesman 34, 291 vertex cover 288 packing 277, 288 word - for groups 71 process random f-graph 333 random graph 333, 344 product Cartesian 191, 196, 367 lexicographic 192 wreath 192 program verification 31 projection multigraph 94 projective planar graph 242 proper coloring 45 protocol, consensus 24 psychology 26 pulse process model 27
Q QSAR (quantitative structure activity relationship) 29, 109 QSPR (quantitative structure property relationship) 29, 109 quasi-edge 224 quasi-random graph 7
R radius 89 detour 130 rainbow 4-cycle 85 subgraph 81 triangle 82
Ramanujan conjecture 6 Ramsey graph sparse 270 property 32, 267 theory 20, 261 random approximation algorithm 23 embedding 258 f-graph process 333 hypergraph 7 interval graph 32 random graph 32, 265, 341 asymptotic analysis 345 Big Bang 32 chromatic number 344, 346 clique number 263 cluster analysis 95 connectedness 342 edge chromatic number 346 reconstructibility 346 equilibrium model 334 Erdős-Rényi 333 evolution 32 Hamilton cycle 344 history 341 hitting time 344 independence number 263 induced subgraph 267 isomorphism 346 kinetic model 334 model in chemistry 30 process 333, 344 quasi- 7 regular 346 spanning cycle 344, 347 sparse 270 statistical exploration of network 349 subgraph 265 threshold 341, 342 random multigraph 97 randomized complexity class 281 randomness 31 rank Cartesian product 258 function 258 of embedding 258 permutation 258 planar embedding 249 sum vertex 111 ratio scale 28 reaction graph 112 network (in chemistry) 109
reconstruction 59 F-polynomial 175 problem 34 tree 63 recurrence relation for independent subset 140 recursive set 71 reduced number deck 60 reduction polynomial 278 Turing 278 regular isotopy 169 random graph 346 Reidemeister move 160, 169 reliability 20 reliable network topology 23 retract 71 Riemann hypothesis 6 right hand trefoil 160 rigid circuit 220 Robertson graph 206 rook polynomial 174, 175 rotation edge 18 relational distance 18
S scheduling 14 search problem complexity 277 separated matching 285 sequence detour eccentric 129 eccentric 129 genome 25 polymer 111 sequential algorithm 292 series-parallel graph 291 set, distance central 91 S-graph 90 Shannon capacity 34 shortest cycle cover problem 183 sign solvability 27 stable 27 signed graph 26, 168, 170 significant cluster 25 similarity in classification 93 simple F-polynomial 173 matching polynomial 176 single linkage cluster 93 size Boolean circuit 301 of star-matching 285 m-matching 285
social network 28 science 26 sociology 26, 28 sociometric network 356 span 15 spanning cycle in random graph 344, 347 subgraph extendability 196 tree 18, 327 sparse Ramsey graph 270 random graph 270 special edge 221 vertex 221 spherical embedding 250 split 17 stable marriage problem 285 matching 286 matrix 27 roommate problem 286
star double 327 matching maximum 285 size 285 starlike tree 346 statistical analysis of network 349 graph modeling 364 network heterogeneity 352 homogeneity 352 statistics dyad 351 time dependent 355 Steiner minimum tree 18 stereoregular polymer 111 st-numbering 249 algorithm 250
stochastic dependency 355 strictly balanced 270
strong chromatic number 235 linkage cluster 93 perfect graph conjecture 22
strongly chordal graph 295 colorable 235 connected orientation 23 structural information retrieval 29
subcontraction 54 subgraph induced 127 induced, in random graph 267 polynomial 174 rainbow 81 random graph 265 totally multicolored 81 subset Bernoulli 351 independent 138 survivability 20 syndiotactic polymer 112 synthesis design 29
T tail vertex 249 Tait conjecture 31 task assignment 14 T-coloring 14 teaching discovery method 377 discrete mathematics 375 graph theory 375 terpenoid graph 118 theorem Brooks 2, 48, 216 five color 2 four color 2, 6, 54 Grötzsch 212, 218 three color 2 three color problem 211 theorem 2 three-connected 183 three dimensional matching problem 283 three-uniform hypergraph 283 threshold function 343 random graph 341, 342 time series graph statistics 355 topological index 109 total number deck 60 totally multicolored subgraph 81 toughness 20, 145 Hamiltonicity 145 pancyclicity 149 tour 22 postman 183 tournament 20 traffic phasing 13 transformation 279 parsimonious 282 polynomial 278
transitive vertex 368 transportation 31 system 27 traveling salesman problem 34, 291 tree double starlike 327 minimum spanning 18 PQ 249, 250 reconstruction 63 spanning 327 starlike 346 Steiner minimum 18 trefoil left hand 160 right hand 160 triangle graph Bernoulli 352 rainbow 82 triangulated graph 220, 317 triangulation 215 trivial induced subgraph 8 7R-matching 285 maximum 285 size 285 Turing equivalent 169 machine counting 282 deterministic 281 non-deterministic 282 reduction 278 Tutte conjecture 185 polynomial 169 two processor scheduling problem 295
U Ulam conjecture 175 unavoidance theorem 34 unique coloring 52, 226 unit interval graph 142 sphere graph 19 unrank embedding 258 function 258 planar embedding 249
V valence bond theory 122 value matching 286 stable 27
verification of program 31 vertex clique covering 315 cluster graph 93 coloring 214 cover minimum 276 problem 288 covering number 143 detour distance 131 distance 131 distance central 91 distinguished 240 eccentric 89 eccentricity 15, 111 elimination scheme 141 head 249 homogeneous 64 input 281 output 281 packing 276, 277, 288 partition number 46 set independent 275 special 221 tail 249 transitive 71, 368 VLSI design 31 voting 16 vulnerability 20
W weak linkage cluster 93 web food 313 Wegner graph 206 weight cover 173 Eulerian 183 weighted digraph 27 matching 302 well of graphs 1, 207 well-covered 179, 278, 290 word problem for groups 71 wreath product 192 writhe oriented link 169