SUBMODULAR FUNCTIONS AN D 0PTIMlZATl0N
ANNALS OF DISCRETE MATHEMATICS
47
General Editor: Peter L. HAMMER Rutgers University, New Brunswick, NJ, U.S.A.
Advisory Editors: C. BERGE, Universite de Paris, France R.L. GRAHAM, AT&T Bell Laboratories, NJ, U.S.A. M.A. HARRISON, University of California, Berkeley, CA, U.S.A. V. KLEE, University of Washington, Seattle, WA, U.S.A. J.H. VAN LINT California Institute of Technology, Pasadena, CA, U.S.A. G.C. ROTA, Massachusetts Institute of Technology, Cambridge, M.A., U.S.A. 7: TROTTER, Arizona State University, Tempe, AZ, U.S.A.
SUBMODULAR FUNCTIONS AND OPTIMIZATION
Satoru FUJlSHlGE Institute of Socio-Economic Planning University of Tsukuba Tsukuba, lbaraki 305 Japan
1991
NORTH-HOLLAND -AMSTERDAM
NEW YORK
OXFORD TOKYO
ELSEVIER SCIENCE PUBLISHERS B.V. Sara Burgerhartstraat 25 P.O. Box 211, 1000 AE Amsterdam, The Netherlands Distributors for the U.S.A. and Canada:
ELSEVIER SCIENCE PUBLISHING COMPANY, INC. 655 Avenue of the Americas New York, N.Y. 10010, U.S.A.
L i b r a r y of C o n g r e s s Cataloging-in-Publication
Data
Fujlshige, Satoru. S u b m o d u l a r f u n c t i o n s and o p t i m i z a t i o n I S a t o r u Fujlshige. p. cn. -- ( A n n a l s of d i s c r e t e m a t h e m a t i c s ; 47) I n c l u d e s bibliographical r e f e r e n c e s and index. ISBN 0-444-88656-0 1. S u b m o d u l a r functions. 2. C o m b l n a t o r l a l optimization. I. Title. 11. S e r i e s . PA166.6.FB5 1991 511'.6--dc20 90-21566 CIP
ISBN: 0 444 88556 0
0 ELSEVIER SCIENCE PUBLISHERS B.V., 1991 All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of the publisher, Elsevier Science Publishers B.V. / Physical Sciences and Engineering Division, PO. Box 103, 7000 AC Amsterdam, The Netherlands. Special regulations for readers in the U.S.A. - This publication has been registered with the Copyright Clearance Center Inc. (CCCI, Salem, Massachusetts. Information can be obtained from the CCC about conditions under which photocopies of parts of this publication may be made in the U.S.A. All other copyright questions, including photocopying outside of the U.S.A., should be referred to the publisher. No responsibility is assumed by the publisher for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions or ideas contained in the material herein.
PRINTED IN THE NETHERLANDS
Preface Submodular functions frequently appear in the analysis of combinatorial systems such as graphs, networks, and algebraic systems, and reveal combinatorially nice and deep structures of the systems. The importance of submodular functions has widely been recognized in recent years in combinatorial optimization and other fields of combinatorial analysis. The present book provides the readers with an exposition of the theory of submodular functions from an elementary technical level to an advanced one. The theory of submodular functions was developed in the earliest stage till 1950’s by H. Whitney and W. T. Tutte for matroids, by G. Choquet for the capacity theory and by 0. Ore for graphs. A flourishing stage of the theory came with J. Edmonds’ work on matroids and polymatroids in 1960’s. Related studies were also made in the theory of characteristic function games by L. S. Shapley and others. Since 1970, applications of (poly-)matroids to practical engineering problems have been extensively made by M. h i , A. Recski and others, and also further theoretical developments in submodular functions by a lot of great scholars in discrete mathematics and combinatorial optimization. The theory of submodular functions is now becoming mature, but a large number of fundamental and useful results on submodular functions are scattered in the literature. The main purpose of the present book is to put these materials together and to show the author’s unifying view of the theory of submodular functions by means of base polyhedra and duality for submodular and supermodular systems. Special emphasis is placed on the constructive aspects of the theory, which will lead us to practical efficient algorithms. No comprehensive survey of submodular functions is aimed at here. I had to omit important results on submodular functions such as a strongly polynomial time algorithm for minimizing submodular functions due to M. Grotschel, L. Lovdsz and A. Schrijver. This is mainly because the precise description and validation of the results would require further technical developments outside the mainstream of this book. A sketch of the author’s view of submodular functions was given in a survey paper [Fuji84c],which was written while I was visiting Professor Bernhard Korte’s Institute in Bonn as an Alexander von Humboldt fellow in 1982-83, and laid a basis of the project of writing this book, which I gratefully acknowledge. I also acknowledge that part of my work, upon which the present book is primarily based, has been supported by grants-in-aid of the Ministry of Education, Science and Culture of Japan. I would like to express my deep sincere thanks to Professor Masao Iri of the University of Tokyo who first drew my attention to the theory of matroids, a promising and enjoyable research field of combinatorics, in 1975 and has since then been keeping giving me invaluable advice and stimulating discussions on V
submodular functions and other related discrete systems. Without his advice most of my work would not have been accomplished. Thanks are also due to Professor Nobuaki Tomizawa, now at Niigata University, with whom I enjoyed inspiring discussions and joint work. I am also very much grateful to Professor Peter L. Hammer for his encouragement to write this book. I used a preliminary version of this book for a lecture of the Doctoral Program in Socio-Economic Planning at the University of Tsukuba and I thank the students who attended the lecture for their useful comments. I have also benefited from comments and communications received from Bill Cunningham, Tetsuo Ichimori, Naoki Katoh, Kazuo Murota, Masataka Nakamura, Hans Rock and Uwe Zimmermann, to name a few, in the course of my research on submodular functions and writing this book.
S.F.
July 1990
vi
Contents
Preface
..................................................................
v
Chapter I. Introduction
............................................... 1 1 . Introduction ......................................................... 1 1.1. Introduction .................................................... 1 1.2. Mathematical Preliminaries ..................................... 2 (a) Sets ......................................................... 2 (b) Algebraic structures ......................................... 3 (c) Graphs ...................................................... 6 (d) Network flows .............................................. 10 (e) Elements of convex analysis and linear inequalities .......... 12
.
C h a p t e r I1 S u b m o d u l a r S y s t e m s a n d Base P o l y h e d r a
...........
2 . From Matroids to Submodular Systems ............................. 2.1. Matroids ....................................................... 2.2. Polymatroids ................................................... 2.3. Submodular Systems ...........................................
17 17 17 20 26
3 . Submodular Systems and Base polyhedra ........................... 36 3.1. Fundamental Operations on Submodular Systems ............... 36 (a) Reductions and contractions by sets ........................ 36 (b) Reductions and contractions by vectors ..................... 37 (c) Translations and sums ...................................... 41 (d) Other operations ........................................... 43 44 3.2. Greedy Algorithm .............................................. (a) Distributive lattices and posets ............................. 44 (b) Greedy algorithm .......................................... 48 3.3. Structures of Base Polyhedra ................................... 55 (a) Extreme points and rays .................................... 55 (b) Elementary transformations of bases ........................ 58 (c) Tangent cones .............................................. 60 (d) Faces, dimensions and connected components ............... 63 3.4. Intersecting- and Crossing-Submodular Functions ............... 73 (a) Tree representations of cross-free families ................... 73 (b) Crossing-submodular functions ............................. 77 (c) Intersecting-submodular functions ........................... 86 3.5. Related Polyhedra ............................................. 87 (a) Generalized polymatroids ................................... 88 vii
(b) Polypseudomatroids ........................................ (c) Ternary semimodular polyhedra ............................ 3.6. Submodular Systems of Network Type ........................
.
Chapter I11 Neoflows ...............................................
91 96 105 109
4 . The Intersection Problem .......................................... 109 4.1. The Intersection Theorem ..................................... 109 (a) Preliminaries .............................................. 110 (b) An algorithm and the intersection theorem ................ 112 (c) A refinement of the algorithm ............................. 118 4.2. The Discrete Separation Theorem ............................. 121 4.3. The Common Base Problem ................................... 123
5. Neoflows .......................................................... 125 126 5.1. Neoflows ...................................................... (a) Submodular flows ......................................... 126 126 (b) Independent flows ......................................... (c) Polymatroidal flows ....................................... 127 5.2. The Equivalence of the Neoflow Problems ..................... 128 (a) From submodular flows to independent flows ............... 128 (b) From independent flows to polymatroidal flows ............ 129 (c) From polymatroidal flows to submodular flows ............. 131 133 5.3. Feasibility for Submodular Flows .............................. 136 5.4. Optimality for Submodular Flows ............................. 5.5. Algorithms for Neoflows ....................................... 146 (a) Maximum independent flows ............................... 146 (b) Maximum submodular flows ............................... 151 (c) Minimum-cost submodular flows ........................... 153 5.6. Matroid Optimization ......................................... 164 (a) Maximum independent matchings ......................... 164 (b) Optimal independent assignments ......................... 170
.
Chapter IV Submodular Analysis
.................................
6. Submodular Functions and Convexity .............................. 6.1. Conjugate Functions and a Fenchel-Type Min-Max Theorem for Submodular and Supermodular Functions .................. (a) Conjugate functions ....................................... (b) A Fenchel-type min-max theorem .......................... 6.2. Subgradients of Submodular Functions ........................ (a) Subgradients and subdifferentials .......................... (b) Structures of subdifferentials .............................. 6.3. The Lovdsz Extensions of Submodular Functions .............. viii
175 175 175 175 177 179 179 184 186
7 . Submodular Programs ............................................. 191 7.1. Submodular Programs - Unconstrained Optimization ......... 191 (a) Minimizing submodular functions .......................... 191 (b) Minimizing modular functions ............................. 197 7.2. Submodular Programs - Constrained Optimization ........... 202 (a) Lagrangian functions and optimality conditions ............ 202 (b) Related problems .......................................... 207 (b.1) The principal partition ............................... 207 (b.2) The principal structures of submodular systems ....... 217 (b.3) The minimum-ratio problem ......................... 219
Chapter V . Nonlinear Optimization with Submodular Constraints
.................... 8 . Separable Convex Optimization .................................... 8.1. Optimality Conditions ........................................
223
8.2. A Decomposition Algorithm ................................... 8.3. Discrete Optimization .........................................
223 223 227 229
9 . The Lexicographically Optimal Base Problem ...................... 9.1. Nonlinear Weight Functions ................................... 9.2. Linear Weight Functions ......................................
231 231 233
10. The Weighted Max-Min and Min-Max Problems ................... 238 238 10.1, Continuous Variables ........................................ 10.2. Discrete Variables ............................................ 240 11 . The Fair Resource Allocation Problem ............................. 11.1. Continuous Variables ........................................ 11.2. Discrete Variables ............................................ 12. The Neoflow Problem with a Separable Convex Cost Function
. . . . . . 248
............................................................. Index .................................................................. References
ix
241 241 243
251 265
This Page Intentionally Left Blank
Chapter I.
Introduction
In this chapter we give a short introduction of the structure of this book and briefly review elements of algebra, graphs, networks and linear inequalities for readers’ convenience.
1. Introduction
1.1. Introduction In 1935 H. Whitney [Whit351 introduced the concept of matroid as an abstraction of the linear dependence structure of a set of vectors. Several systems of axioms for defining a matroid are now known, each of which is simple but substantial enough to yield a deep theory in Combinatorial Optimization and to have a lot of applications in practical engineering problems (see [Iri83], [Iri+Fuji8l], [Murota87], [Recski]). Matroidal structures are closely related to a class of efficiently solvable combinatoria! optimization problems; a careful examination of an efficiently solvable problem often reveals a matroidal structure which underlies the problem. In 1970 J. Edmonds [Edm70] combined the matroid theory with polyhedral combinatorics and lead us to the concept of polymatroid. A polymatroid polyhedron, called an independence polyhedron, is expressed by a system of linear inequalities with (0,l)-coefficients and the right-hand sides given by a submodular function which is the rank function of the polymatroid. The relation between matroids and polymatroids is similar to that between matchings in bipartite graphs and flows in networks. The rank function of any polymatroid is a monotone nondecreasing submodular function on a Boolean lattice 2 E for a finite set E . The monotonicity of the rank function does not play any essential r61e in characterizing the combinatorial structure of the polymatroid polyhedron since the monotonicity is not invariant with respect to translations of the polyhedron. The concepts of submodular and supermodular systems [Fuji78b,84c] naturally come up with this observation. The rank function of a submodular (or supermodular) system is a submodular (or supermodular) function on a distributive lattice, a sublattice of a Boolean lattice. The duality is defined between a submodular system and a supermodular system, which dissolves the clumsy definition of polymatroid duality [McDiarmid75]. Submodular systems are not only theoretical generalizations of matroids and polymatroids but also significantly extend the applicability in practical problems. 1
I. INTRODUCTION In Chapter I1 we first introduce the concepts of submodular and supermodular systems and their associated base polyhedra by following the historical generalization sequence of matroids, polymatroids and submodular systems. We then examine algorithmic aspects of submodular systems and basic structures of base polyhedra. In Chapter I11 we consider a class of network flow problems with submodular boundary constraints, which we call the neoflow problem. It includes the (poly-)matroid intersection problem of J. Edmonds [Edrn’lO],the submodular flow problem of J. Edmonds and R. Giles (Edm+Giles77], the independent flow problem of the author [Fuji78a] and the polymatroidal flow problem of R. Hassin [Hassin78,82] and E. L. Lawler and C. U. Martel [Lawler+Martel82b]. Submodular functions are discrete analogues of convex functions. In Chapter IV we develop a theory of submodular functions from the point of view of convex analysis [Rockafellar70], which we call the submodular analysis. We will make clear the close relationship between the submodular analysis and the results obtained in Chapter 111. Finally we consider nonlinear optimization problems with submodular constraints in Chapter V. A decomposition algorithm is shown for a separable convex optimization problem over a base polyhedron and it lays a basis for the algorithms of the other problems such as the lexicographically optimal base problem, the weighted max-min (min-max) problem and the fair resource allocation problem. We also consider a neoflow problem (the submodular flow problem) with a separable convex cost function.
1.2. Mathematical Preliminaries
(a) Sets We denote the set of reals by R, the set of rationals by Q and the set of integers by Z. We also denote the set of nonnegative elements of R (Q, Z) by R+ ( Q + , Z+)* For any finite set X we denote its cardinality by When X is a subset of a set Y ,we write X 5 Y , and when X is a proper subset of Y (i.e., X C Y and X # Y), we write X c Y . It should be noted that this is different from the conventional notation. For subsets X and Y of a set E , when X n Y = 0, we say X and Y are disjoint, and when X U Y = E (or ( E - X)n ( E - Y)= 0), we say X and Y are codisjoint. A set of disjoint nonempty subsets Bi (i E I) of a set E is called a partition of E if UiGrBi = E . If {Bi I i E I } is a partition of E , we call { E - B, I i E I } a copartition of E . For sets A and B we call ( ( 1 , ~I )a E A } u {(2,6) I 6 E B }
1x1.
2
1.2. MATHEMATICAL PRELIMINARIES
I
the direct sum of A and B and denote it by A @ B. Also, we call { ( a ,b) a E A , b E B } the direct product of A and B and denote it by A x B . For a mapping f : A -+ B we often express f as ( f ( a ) I a E A ) . For example, a family 3 of sets Xi (i E I ) is written & 3 = (XiI i E I ) and a matrix M with a row index set I , a column index set J and an (i,j)-element Mij (i E I,j E J ) is expressed as M = (Mij I i E I , j E J ) . The set of all the mappings from A to B is denoted by B A . The characteristic vector of a subset A of an underlying set E is the mapping X A : E -+ (0,l} such that XA(e) = 1 for e E A and x ~ ( e=) 0 for e E E - A .
(b) Algebraic structures For a set A we call a binary relation, denoted by simply an order on A if it satisfies
<, on A a partial
order or
< a, (ii) (antisymmetry) a 5 b, b < a --r' a = b, (iii) (transitivity) a < b, b 5 c ==+ a 5 c . (i) (reflexivity) V a E A: a
We call the pair ( A , < ) a partially ordered set or a poset for short, Also, we is implicitly often say that A is a poset, when the underlying partial order assumed. If a 5 b and a # b, we write a < b. Moreover, if we have a 5 b or b 5 a for every a , b E A , we call the partial order 5 a total order or a linear order. The ordinary orders on R , Q and Z are total orders. A total order is usually denoted by 5. For a poset P = ( A , 5 ) define a poset P* = ( A ,<*) by
<
for any a , b E A . We call
P* = ( A ,<*)
the dual poset of
P. We often use
for
5*. For a poset P = ( A , 5 ) a subset B of A is called an ideal (or a lower ideal) of P if z y E B implies x E B . Also, B C A is called a dual ideal (or an upper ideal) of P if x 2 y E B implies x E B. For two elements z, y of a poset P = ( A ,i),if z < y and there exists no element z such that x < z < y , we say that y covers x. Let P = ( A , < ) be a poset. An element a E A is an upper bound of a subset B C A if for each b E B we have 6 a. Similarly, an element a E A is a lower bound of B C A if for each b E B a b. If a is an upper bound (lower bound) of B and a belongs to B , then a is called the maximum element (minimum element) of B . Note that the maximum (or minimum) element of B , if any exists, is unique. If the set of upper bounds (lower bounds) of B has the minimum element 6, (the maximum element b * ) , we call b, ( V )the
<
<
3
I. INTRODUCTION supremum (infimum) of B and denote it by sup B (inf B ) . If for each x,y E A there exist sup{x,y} and inf{z, y}, then we call poset P = ( A ,5 ) a lattice, and we write x V y = sup{z, y} and z A y = inf{z, y}. These two binary operations, V and A, are called the join and the m e e t , respectively, and satisfy
(i) (idempotency) V x E A : x V z = 5, z A x = z,
(ii) (commutativity) vx,y E A : z V y = y V z, x A y = y A x,
(iii) (associativity) Vx, y,z E A : x V (y V z ) = (z V y) V z , z A (y A Z ) = (z A y) A Z , (iv) (absorption) Vz,y E A : z A (z V y) = z, 5 V (z A y) = x. Conversely, if a set A is given two binary operations V and A which satisfy the above (i)-(iv), then, defining a binary relation 5 by “5 5 y xvy =y (or z A y = z),” we have a lattice ( A ,5 ) whose lattice operations are the given V and A . A lattice ( A , 5 ) with lattice operations V and A is also expressed as (A, V, A). We often call A itself a lattice, assuming the underlying partial order or lattice operations. A lattice f! = (A,V,A) is called a distributive lattice if it satisfies (v) (distributivity) Vz, y,z E A : z A (y V z ) = (z A y) V (z A z ) , x V (y A z ) = (x V y) A (z V z ) . Let D be the set of (lower) ideals of a finite poset P = ( E ,5 ) . Then D is a distributive lattice with set union u and intersection n as the lattice operations v and A . Conversely, any finite distributive lattice is isomorphic with the one represented by a collection of sets as above (see [Birkhoff37,67]). Any sublattice of a distributive lattice is a distributive lattice. For posets P; = ( A ; , < ; ) (i E I ) , define a binary relation 5 on the direct product XicrA; of A; (i E I) as follows: for a = (ai I i E I ) and b = (6i I i E I ) we have a 5 b if and only if V i E I : ai
<
z
Y e E E : z ( e ) 5 y(e).
(1.4
Since (RE,5 ) is the direct product of [El copies of (R, 5 ) and a direct product of distributive lattices is a distributive lattice, (RE,5 ) is a distributive lattice. Also, note that for any z,y,z E RE (i) x < y = = = + . + z i y + z , (ii) z 5 y,X E R + ==+ A s 5 Xy.
Therefore, RE is a vector lattice. Following the convention, we also use 5 for the partial order 5 on R E . 4
1.2. MATHEMATICAL PRELIMINARIES Consider a lattice f! = ( L , v , A ) with the minimum element 0 and the maximum element I. If z, y E L satisfy x V y = I and x A y = 0 , x (or y) is a complement of y (or z). Lattice f! is said to be complemented if each x E L has its complement. A Boolean lattice is a lattice which is distributive and complemented. A set G with an element e E G, a binary operation : G x G -+G and a unary operation ( )-I: G + G is a group if it satisfies
-
(1) VX,Y,ZEG: X-(Y.Z) = (x-y)'z, (2) Vx E G: e . x = x - e = z, (3) Vx E G: z-' . x = x . x-' = e . Here, the element e E G satisfying (2) uniquely exists and is called the unit element (or the neutral element). Also, z-l is called the inverse element of x. When G satisfies (1) but not necessarily (2) and (3), G is called a semi-group. When a group G is commutative, i.e., Vx,y E G: x .y = y x, G is called a commutative group or an Abelian group. For a commutative group G we often use instead of for the binary operation, and denote the inverse of x by -x and the unit (or neutral) element by 0. A commutative group G expressed in this way is called an additive group. G is called a totally ordered additive group if G is an additive group with a total order 5 defined on it and if the following (1) and (2) hold:
+
-
(1) x 5 Y ===+ -x L -Y, (2) 2 1 5 Y1, 5 2 F Y2 ==+XI + % F Yl +Y2.
The set R of reals, Q of rationals and Z of integers are typical examples of a totally ordered additive group. A ring A is an algebraic system with two operations (addition) and (multiplication) such that
+
(i) A is an additive group with operation
+,
(ii) A - ( 0 ) is a semi-group with operation . , (iii) Vx,y,z E A : (x+ y) - z = z . z y ' z , 3:. (y+ z ) = z . y
+
+ z az.
The set Z of integers is an example of a ring. For a ring A with a unit element 1,an additive group G, and an operation *: A x G + G, we call G an A-additive group or an A-module if for each a,,B E A and 5,y E G we have
a * .( + y) =
* z + a * y,
( a+ P I
a*(p*x)=(a.p)*x, 2 is a Z-additive group.
*
=
a*
+ p * z,
l*z=x.
(1.3a)
(1.3b)
I. INTRODUCTION A ring A is commutative if A - ( 0 ) is a commutative semi-group with respect to operation . A commutative ring K is called a field if K - ( 0 ) is a commutative group with respect to operation . . The set Q of rationals and R of reals are fields with ordinary operations of the addition and the multiplication. For two fields H and K such that H 2 K we say H is a subfield of K and K is an extension of H .
-
(c) Graphs
Let V and A be finite sets, where V is called a vertex set and A an arc set. Each v E V is called a vertex and each a E A an arc. We are also given two functions d + , d - : A -+ V. For each arc a E A d+a is the initial end-vertex (or the tail) of a and d-a is the terminal end-vertex (or the head) of a . We call G = ( V , A ; d + , d - )a graph. When there is no possibility of confusion, we also denote the graph by G = ( V , A ) . We often express an arc a by the ordered pair ( d + a , d - a ) of the end-vertices when such a pair uniquely determines the arc. A graph G = ( V , A )with arc set A = { ( u , v ) I u , v E V , u # v } is sometimes called a complete directed graph. See Fig. 1.1 for a geometric representation of a graph. For graphs see, e.g., [Berge73] and [HararyGg].
d+ a
U
d- a
Figure 1.1. An arc a such that d + a = d - a is called a selfloop and arcs a1,a2 such that { d + a l , d - a l } = { d + a z , d - a 2 } is called a parallel arcs. We call a graph H = (W,B ; a&,a;) a subgraph of a graph G = (V,A ; d + , and d j j are, respectively, the restrictions of d+ d - ) if W C: V , B C A and and 6’- to B . We often use the original d+ and d- for a& and 8,. Moreover, if B is the set of all the arcs a E A such that { d + a , d - a } W , then we call 6
1.2. MATHEMATICAL PRELIMINARIES
H = (W,B)a subgraph induced by the vertex subset W ;and if W is the set of all the end-vertices of arcs in B ,then we call H = (W,B ) a subgraph induced by the arc subset B and denote it by G . B . For any B A let V ( B )be the set of end-vertices of arcs of B. Consider a graph H = (W,A - B ;a&, aE) such that W = (V - V ( B ) )u {VB} (vg:a new vertex), aj!ja = a*a (if a*a $ V ( B ) ) and a$a = ~g (if a*a E V ( B ) ) .We denote this graph H by G / B and call it a contraction of G by B . Informally, contraction G / B is obtained by shrinking B together with the end-vertices into a single vertex V B . When we do not distinguish d+a from a-a for each a in A or we are not concerned with the orientations of the arcs, we call the graph G = ( V , A ) an undirected graph and call each a E A an edge instead of an arc. When we emphasize the fact that a graph is not an undirected one, we call it a directed graph. A graph is called simple if it does not contain any selfloops or parallel arcs. We define for each vertex v E V
6-v = { a I a E A , a-a = u } .
(1.5)
An arc a in 6+v u 6-v is said to be incident to v. For any B A and U C V we define d*B = {@:a I a E B } and 6*U = Uv,,6*v. A vertex u in (a-6+v) u (a+6-v) is said to be adjacent to v. Also define for each subset U of v A + U = { a I a E A , a+a E u, E v - u}, (1.6)
A-u
= {a
I a E A,
E
u, asa E v - u>.
(1.7)
If A + U U A-U # 0, we call the set A + U U A-U of arcs a cutset. A single-element cutset is called a bridge. For a graph G = ( V , A ) define a matrix D = (D," I v E V,a E A ) with a row index set V and a column index set A by 1
if v = P a and u if v = d - a and u
# %a, # d+a,
otherwise. We call D the incidence matrix of G. A path in G is an alternating sequence ( v ~ , a ~ , v l , a ~ ; ~ ~ , v ~ - l , aofk , ~ k ) vertices vi ( z = 0,1,. . . ,k) and arcs ai (i = 1 , 2 , * . k) such that { a + a i , 8-a;) = { v i d 1 , v i } (i = 1 , 2 , - . - , k ) . vo is the initial vertexof the path and v k is the terminal vertex. We also say that the path is from V o to V k . For an arc a; such that d+ai = vi-l and d-ai = vi we say a, is positively oriented in the path. Also, if d+ai = u i , %a; = ui-1 and u, # vi-1, we say a, is negatively oriented a
7
,
I. INTRODUCTION in the path. If all the arcs in the path are positively oriented, we call the path a directed path. A (directed) path is called a (directed) cycle if it contains at least one arc and its initial and terminal vertices coincide with each other. A path or cycle is called elementary if no vertices in it are repeated (except for the initial and terminal vertices when we consider a cycle). A graph G = (V,A ) is called connected if for every two vertices u , v E V there exists a path from u to v . Connectedness is a property of G as an undirected graph. A maximal connected subgraph of G is called a connected component of G. G is decomposed into its connected components Hi = (K, A i ) (i E I ) , where {& I i E I } is a partition of V and { A i I i E I } is a partition of A . The rank of a graph G = ( V , A ) is the number of its vertices minus the number of its connected components, which is equal to the matrix rank of the incidence matrix D of G. The nullity of G is equal to [ A ]minus the rank of G. A graph G = ( V , A ) is called strongly connected if for every two vertices u , v E V there exists a directed path from u to v . A maximal strongly-connected subgraph of G is called a strongly connected component of G. G is decomposed into its strongly connected components = (fi, (i E f), where {fi I i E i} is a partition of V but { & I i E f} is not a partition of A though ( i E i) are disjoint. For two strongly connected components l?il and H i 2 we write kil 5 Hi3 if and only if there exists a directed path from a vertex of H i , to a vertex of kil. Note that if 5 H i 2 , there exists a directed path from any vertex of Hi2 to any vertex of H i , . We can easily see that this binary relation j on the set of strongly connected components of G is a partial order. We say that the partial order is naturally induced by the decomposition of G into strongly connected components. (See Fig. 1.2.)
A,)
,&
Figure 1.2. The decomposition into strongly connected components. 8
1.2. MATHEMATICAL PRELIMINARIES
A graph G = ( V , A ) is called a tree if it is connected and does not contain any cycle. For a graph G = ( V , A ) and a subgraph H = ( W , B ) of G, H is called a (spanning) tree of G if W = V and H is a tree. A tree T = ( V , A ) is called a directed tree if IS-vl 5 1 for each v E V . For a directed tree T = (V,A ) there uniquely exists a vertex vo E V such that 16-voI = 0. We call vo the root of T . A vertex v of T with 16+vl = 0 is called a leaf of T. For vertices u , v of T such that for some arc a of T we have d+a = u and d - a = v, we call u the parent of v and v a child of u. Children of a common parent are called siblings. A tree !I? = ( V , A ) is called a directed tree toward the root if 16+vl 5 1 for each v E V . The root of f' is a vertex vo such that 16+voI = 0. A bipartite graph G = ( V + ,V - ; A ) is a graph with two disjoint vertex sets V + and V - and with an arc set A consisting of arcs a such that d+a E V + and d - a E V - alone. We often call V + the left vertex set and V - the right wertez s e t . A matching M in the bipartite graph G is a subset of the arc set A such that for any distinct arcs a,a' in M we have d + a # d+a' and 8 - a # %a'. Also, a cover ( U + , U - ) of G is the ordered pair of U+ C V + and U - C V such that for any arc a E A we have d+a E U+ or d - a E U - .
Theorem 1.1 (Konig-Egervky): For a bipartite graph G = ( V + ,V - ; A ) we have max{ IMI
I
M : a matching of G}
=rnin{IU+I+IU-I
I
( U + , U - ) : acoverofG}.
(1.9)
For a simple graph G = (V,A ) , not necessarily a bipartite graph, a subset M of the arc set A is called a matching of G if for each distinct a,a' E M we have { d + a , d - a } n {d+a',d-a'} = 0. For a poset P = (V,5 ) on a finite set V consider a graph G( P) = (V,A ( P ) ) with the vertex set V and the arc set A defined by
A ( P ) = { ( u , v ) I u , v E V, u covers v in
P}.
(1.10)
The Hasse diagram of the poset P = (V, 5 ) is a planar drawing of the graph G ( P ) = ( V , A ( P ) )in such a way that for each arc ( u , v ) of G ( P ) u is located vertically higher than v . Since the orientation of each arc of the Hasse diagram is uniquely determined by the planar drawing, the Hasse diagrams are usually drawn as if they were undirected graphs (see Fig. 1.3). For a finite set E and a family 3 of subsets of E we call H = ( E ,3) a hypergraph, where each element of E is called a vertex of H and each element of 3 an edge of H . A hypergraph H = ( E , 3 ) is said to be connected if for each distinct e , e' E E there exists a sequence F I , F2, . . ., Fk of edges of H 9
I. INTRODUCTION
Figure 1.3. The Hasse diagram for the decomposition structure in Fig. 1.2. such that e E F I , e' E Fk and Fi n F,+l # 0 ( i = l , - . . , k - 1 ) . A hypergraph H' = ( E f , 3 ' )is a subhypergraph of H = ( E , 3 ) i f E' C E and 3' 5 7. A connected component of H = ( E , 3) is a maximal connected subhypergraph of H.
(d) Network flows Consider a network with an underlying graph G = (V,A)having two distinguished vertices s+, s- E V and with a nonnegative capacity function c: A + R+ on the arc set A. Vertices s+ and s- are, respectively, the entrance and the exit of the network. We call this network a two-terminal network and denote it by N = (G = (V,A),s+,s-,c).A function p: A + R is a feasible flow in if p satisfies the network N = (G = (V,A),s+,s-,c) b'u E A : 0 5 p(a) 5 c ( u ) ,
vv E
v
-
{s+,s-}:
ap(v) f
c ll.Ed+
The value of a feasible flow p is given by
10
TI
p(a) -
c oE6-v
(1.11) p(u) = 0.
(1.12)
1.2. MATHEMATICAL PRELIMINARIES The function dp: V -+ R is called the boundary of p. Note that regarding p and d p as column vectors, we have d p = Dp for the incidence matrix D of G. A feasible flow of maximum value is called a maximum flow in N. We call a vertex subset U with 5+ E U and 5- $ U a cut of N. The capacity of a cut U is defined by .C(W
=
c
(1.14)
C(.).
aEA+U
Theorem 1.2 (Ford-Fulkerson): For a two-terminal network (V,A ) , s+ ,s-, c) max{dp(5+) I p: a feasible flow in = min{nc(U) I U : a cut of A}.
N
= (G =
N} (1.15)
Moreover, if the capacity function c is integer-valued, there exists an integral maximum flow in N. Suppose that we are given a graph G = (V,A ) and lower and upper capacity functions c, Z: A -+ R with c 5 3. Denote this network by NO = (G = ( V , A ) , c , F ) .A feasible circulation in network NO is a function p: A --f R such that (1.16) Ya E A: c(a) 5 p(a) 5 Z ( a ) ,
vv E
v : dp(v) = 0 .
Theorem 1.3 (Hoffman): There exists a feasible circulation in network and only if for each U C V we have
(1.17)
NO if (1.18)
Moreover, if the lower and upper capacity functions c and C are integer-valued and there exists a feasible circulation in NO,then there exists an integral feasible circulation in No. From Theorem 1.3 we can show
Theorem 1.4 (Gale): Consider a bipartite graph G = ( V + , V - ; A ) and s u p pose we are given nonnegative (supply and demand) vectors s € RY+ and d E Ry- . Then, there exists a nonnegative flow p: A -+ R+ in the bipartite network such that ( 1.19a) v v E v+:dp(v) 5 5 ( v ) , 11
I. INTRODUCTION
( 1.19b)
if and only if for each U -
CV-
we have (1.20)
Ea+
s - IT-
VEW-
(e) Elements of convex analysis and linear inequalities
+
For more details of the subjects of this subsection see [Rockafellar70], [Stoer Witzgall701, [Bachem Grotsche182] and [Schrijver86]. Let E be a finite set. For any x , y E RE and nonnegative X,p E R such that X p = 1 we call Ax p y a conuex Combination of x and y . A set A C RE is said to be conuex if all the convex combinations of arbitrary two points in A belong to A . A typical example of a convex set is a solution set of a system of linear equations and linear inequalities:
+
+
+
af ( e ) x ( e ) = bf
(i E I ) ,
(1.2la)
CEE
where a r ( e ) , u ; ( e ) , bf, 63” (i E I , j E J , e E E ) are real constants. Such a convex set is called a polyhedral conuex set or a conuex polyhedron or simply a polyhedron. We are mainly concerned with polyhedral convex sets. Note that different systems of linear equations and linear inequalities may give the same polyhedron. A set C C: RE is called a conuex cone if all the nonnegative combinations, Xz p y (A 2 0 , p 2 0 ; x,y E C ) , of points in C belong to C. If C is the set of all the nonnegative combinations of points a< E R E (i E I ) , we say the set of points ai (i E I ) generates the cone C . An a f i n e set or (linear variety) in RE is a translation of a subspace of vector space RE. The dimension of an affine set A is the dimension of the subspace obtained by a translation of A . The dimension of a (polyhedral) convex set A in RE is the dimension of the unique minimal affine set containing A . If the dimension of A is equal to [ E l ,A is said to be full-dimensional. A line is a onedimensional affine set. A halfline (or ray) is a translation of a set given by {Ax I X 2 0 ) for some nonzero vector 2: E RE.We usually represent a halfline (or ray) from the origin by a nonzero vector contained in it. A (polyhedral) convex set A is bounded if and only if it does not contain any halfline. A point of a (polyhedral) convex set A is called an extreme point of A if it can not be expressed as a convex combination of the other two points
+
12
1.2. MATHEMATICAL PRELIMINARIES
in A. A convex polyhedron (polyhedral convex set) is said to be pointed if it has at least one extreme point. For a convex polyhedron A described by (1.21) the characteristic c o n e (or recession c o n e ) of A is the solution set of
Denote the characteristic cone of A by C(A). The characteristic cone C(A) does not depend on the choice of a representation (1.21) of A , as the following lemma shows.
Lemma 1.5: For a convex polyhedron A, z E C ( A ) if and only if for each y t? A and nonnegative X E R we have y Ax A. Moreover, A is bounded if and only if C(A) = (0).
+
Lemma 1.6: A convex polyhedron A described by (1.21) is pointed if and only if A does not contain any line, or equivalently, the system of (1.22a) and
has the unique solution x = 0.
For any JO & J denote by A ( J 0 ) the convex polyhedron described by (1.21a) and (1.24) a;(e)x(e)
I b;
( j E J - Jo).
(1.25)
eEl3
Define 3 ( A ) = {A(Jo)
I Jo C J } .
(1.26)
Here, it should be noted that A ( J ) may be empty. Each F E 3 ( A ) is called a f a c e of A. A face which is a proper subset of A is called a proper f a c e . Faces of A are determined by A and are independent of the choice of a representation (1.21) of A . For a nonzero vector a E RE and a scalar b 6 R , the set of points 5 6 RE satisfying (a,s)E a ( e ) s ( e )= b (1.27) I!€
E
13
I. INTRODUCTION is called a hyperplane, and the set of points x E RE satisfying
is a halfspace. A convex polyhedron is the intersection of a finite number of halfspaces. We denote by H ( a , b ) the hyperplane described by (1.27), and by H+(a,b) the halfspace (1.28). Note that H + ( a , b ) nH+(-a,-b) = H ( a , b ) . A hyperplane H(a, b) is called a supporting hyperplane of A if H(a, b) n A # 0 and if A H+(a,b) or A C H+(-a,-b).
Lemma 1.7: Suppose A is a full-dimensional convex polyhedron. A subset F of A is a nonempty proper face of A if and only if F is the intersection of A and a supporting hyperplane of A . A zero-dimensional face is called a vertex and a vector in a vertex is an extreme point. A one-dimensional face is called an edge. An edge of a convex cone is called an extreme ray. An extreme ray of a polyhedron is an extreme ray of the characteristic cone. An extreme ray is represented by a nonzero vector contained in it, which is called an extreme vector.
Lemma 1.8: The set 3 ( A ) of faces of A is a lattice with respect to set inclusion as the partial order. For each F1, F2 E ? ( A )
F1 A F2 = Fi I- F2.
(1.30)
Lattice 3 ( A ) is called the face lattice of A . A maximal proper face of A is called a facet of A . For a convex cone C RE define C" = {x
I z E R E ,'dy E C : (s,y) 5 0 } ,
(1.31)
which is called the dual cone of C . If C is represented by the system of inequalities (Ci,Z) 5 0 (i € I), (1.32) then the dual cone C" is exactly the cone generated by the vectors c; E R E (i E I ) . A function f: RE --+ R U {+m} is called a convex function if for each z,y E R E with f(x),f ( y ) < +oo and for each X,p 2 0 with X p = 1 we have
+
14
1.2. MATHEMATICAL PRELIMINARIES
A vector
a E
RE is called a subgradient of f at a point x E RE if for every
yER"
f (Y) - f ( 4 1 (a, Y - 4.
(1.34)
The set of all the subgradients of f at x is the subdifferential of f at x and is denoted by af (z). Note that x E RE is a minimizer of f if and only if 0 E af (z). We use symbol a for subdifferentials and for boundaries of flows as in (1.13), since there seems to be no possibility of confusion. For a convex function f: RE -+ R U {+m} the convex conjugate function f*: R E -+ R U {+m} is defined by f * b )= SUP{(X,Y) - f ( Y ) I Y E RE}. (1.35) Let us consider a linear programming problem: n
(P)
Maximize x c j x j
(1.36a)
j=l n
subject to c a i j x j j= 1
5 b; (i = 1 , 2 , . - - , m ) ,
(1.36b)
where aij, b i , c j E R (i = 1 , 2 , . m;j = 1,2,. . . ,n) are given constants. The dual linear programming problem of ( P ) is given by -
a
,
m
(D)
C b,X,
Minimize
(1.37a)
i=1
c m
subject to
aijXi = c j
(j= 1,2,. . * , n ) ,
(1.37b)
i= 1
> o ( i = 1 , 2 , . . . , m ). (1.37~) We say Problem Problem ( P ) is called a primal problem and is the dual of (D). Xi
( P ) (or (D))is unbounded if the problem has feasible solutions and the objective function can be made arbitrarily large (or small). Lemma 1.9 (The weak duality for linear programming): For any feasible solutions x of ( P ) and X of (D), n
m
j= 1
i=l
C c j ~ 5j C
biXi.
(1.38)
Theorem 1.10 (The duality for linear programming): If one of the dual problems ( P ) and ( D ) has an optimal solution, then its dual problem also has an optimal solution and for any optimal solutions x* of ( P ) and A * of (D) we have n
m
j=l
i=l
(1.39)
15
I. INTRODUCTION Moreover, if one of the dual problems is unbounded, then its dual problem has no feasible solutions. Concerning the feasibility of a system of linear inequalities we have the following. Lemma 1.11 (Farkas): There exists a feasible solution for the system of linear inequalities (1.36b) if and only if, for each nonnegative X i E R (i = 1,2, ,rn) such that Xia;j = 0 ( j = 1 , 2 , . . . , n ) , we have CzlXi6i 2 0.
--
Elml
The “if” part is the essential part of the above lemma. Suppose that a i j , b ; , c j (i = 1 , 2 , - . - , r n ; j= 1 , 2 , . . . , n ) aregivenrationals. The system of linear inequalities (1.36b) is called totally dual integral if for each integral vector c = (c1, c 2 , . - .,c,) such that ( P ) has an optimal solution there exists an integral optimal solution of the dual problem (D)(see [Hoffman741 and [Edm Giles771).
+
Theorem 1.12 (Hoffman and Edmonds-Giles): If the system of linear inis integral, equalities (1.36b) is totally dual integral and 6 = ( b l , 6 2 ; . . , 6 , ) then each nonempty face of the polyhedron described by (1.36b) contains at least one integral point and, in particular, each vertex of the polyhedron, if any, is integral.
A version of the separation theorem for convex sets is given as follows. Theorem 1.13: For a convex cone C 5 RE and a vector x E R E not belonging t o C there exists a weight vector w E RE such that
(1.40)
(1.41)
If C is pointed, the inequalities in (1.40) c m be made strict inequalities.
16
Chapter 11.
Submodular Systems and Base Polyhedra
In this chapter we give basic concepts on matroids, polymatroids and submodular systems and show the natural generalization sequences of these concepts from matroids to submodular systems. We also examine the fundamental combinatorial structures of submodular systems and associated polyhedra. For basic properties of matroids and polymatroids shown without proofs in this chapter, readers should be referred to [Tutte65,71], [Crapo Rota’lO], [Lawler’lG],[ Welsh761. The knowledge about matroids and polymatroids is not prerequisite to understanding submodular systems, though it certainly helps. For general information on submodular and supermodular functions see, e.g., [Choquet55], [Edm’lO],[Faigle87], [Frank+Tardos88], [Fuji84c], [Lov&z83], (Tomi Fuji821 and [Welsh76].
+
+
2. From Matroids to Submodular Systems 2.1. Matroids
The concept of matroid was introduced in 1935 by H. Whitney [Whit351 and independently by B. L. van der Waerden [WaerdenS’I]. The term, matroid, is due to Whitney. As the term indicates, a matroid is an abstraction of linear independence and dependence structure of the set of columns of a matrix. Let E be a finite set. Suppose that a family I of subsets of E satisfies the following (10)-(12): (10) 0 E I . (11) I1 12 E I =+- 11 E I. (12) 11, Iz E I, [Ill < 1121 ==+ 3e E 12 - 11:11u { e } E
c
I.
We call the pair ( E ,I ) a matroid. Each I E I is called an independent set of matroid ( E ,I ) and I the family of independent sets of matroid ( E ,I ) . An independent set which is maximal in I with respect to set inclusion is called a base. The family B of bases satisfies the following [BO)and (Bl): (BO) 8 # 0. (Bl) V B l , BZ E 8, Ve1 E B I - B Z , 3ez E B 2 - B l : ( B 1 - { e 1 . } ) U { e z } E 8 . A subset of E which is not an independent set is called a dependent set. A dependent set which is minimal with respect to set inclusion is called a circuit. The family C of circuits satisfies the following (C0)-(CZ): 17
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA
(CO) c #
(01. vc1, c2 E c: c1 c c2 ===+ c1 = c2.
(Cl) (C2) VCl, C2 E C with C1 # C2, Ve E C1 n C2, 3C E C : C
G (C1 U C2) -
{elWe define the rank function p : 2E p ( x ) = max{ 1 for each X
+Z
of matroid ( E ,I ) by
I I c X, I E I }
(2.1)
c E . The rank function p satisfies the following (pO)-(p2):
(PO) VX C E : 0
1x1.
5 p(X) 5
xc
(PI) y 5 E ===+ P(X) I P(Y). (p2) V X , Y C E : p ( X ) p ( Y ) 2 p ( X U Y )
+
+p(X nY ) .
Any function p satisfying (p2) is called a submodular function on 2E. From (pO)-(p2) we see that p has the unit-increase property, i.e., for any X , Y 2 E with X Y and 1x1 1 = IYI we have p(Y) = p(X) or p(Y) = p(X) 1. Also define the closure function cl: 2“ + 2E of matroid ( E ,I ) by
c
+
+
Cl(X) = {e
I
e E E , P(X u {el) = P W ) }
(2.2)
C E . The closure function cl satisfies the following: VX C E : X 2 cl(X). VX, Y C E : X C cl(Y) ===+ cl(X) C cl(Y). VX C E , Ve E E : e’ E cl(XU{e}) -cl(X) ===+e E cl(XU{e’}) -cl(X).
for each X (c10) (cll) (c12)
For any independent set I E I and any element e E cl(I) - I, there exists a unique circuit contained in I U {e}. Such a circuit is called the fundamental circuit with respect to I and e, and is denoted by C(I1e). For any e’ E C(Ile), ( I U {e}) - {e’} is an independent set. It is well known (see [Welsh76]) that each of the family I of independent sets, the family €3 of bases, the family C of circuits, the rank function p and the closure function cl uniquely determines the matroid which defines it. Giving a system of axioms for each of the family B of bases, the family C of circuits, the rank function p and the closure function cl, we can define a matroid. We denote such a matroid by ( E , B ) , ( E , C ) , ( E , p ) and (E,cl), respectively. In fact, (BO) and (Bl), (C0)-(C2), (pO)-(p2), and (clO)-(c12), respectively, give the systems of axioms for the family B of bases, the family C of circuits, the rank function p , and the closure function cl of a matroid on E . For example, any integer-valued function p : 2E -+ Z satisfying (pO)-(p2) defines a matroid ( E , p )with the family I of independent sets given by
I
=
{II I 2 E , p ( I ) = [ I [ } . 18
(2-3)
2.1. MATROIDS
For a matroid M = ( E , B ) with a family B of bases, the family B* of the complements of bases of the matroid is also a family of bases of a matroid. The matroid M* = ( E ,8:) is called the dual matroid of M = ( E ,8). The dual of M* is equal to M, i.e., (M*)*= M. A matroidal term, say, “ X ” with respect to M* is referred to as “co-X” with respect to M. For example, bases and circuits of M* are called cobases and cocircuits of M. A self-dual system of axioms for the family of bases is given by N. Tomizawa [Tomi771as follows.
(BO’) B # 0. (Bl‘) VB1, B2
E
8, 3el E I31 - 232, 3e2 E B2 - B1: ( & - {el)) u { e 2 } E 8, (B2- (4)u {el}
E
8.
Examples of a Matroid (1) Graphs: For a graph G = (V,E ) with a vertex set V and an edge set E let I ( G ) be the set of those edge subsets each of which does not contain any cycle of G. Then M(G) = ( E ,I ( G ) ) is a matroid with I ( G ) being the family of independent sets. A matroid which can be obtained in this way is called a graphic matroid. Given an independence oracle for the membership in the family of independent sets, we can efficiently discern whether a given matroid is graphic and, if graphic, construct a graph representing it, by combining the algorithms of P. D. Seymour [Seymour811 and the author [Fuji80a] (also see [Bixby+Wagner881).
(2) Matrices: For a matrix A = [ a l , . . . , a , ] over some field with column vectors al, . ., a, let E = { 1,. . . ,n} be the column index set and I (A) be the set of those subsets of E each of which forms independent column vectors. Then M(A) = ( E ,I ( A ) ) is a matroid with I (A) being the family of independent sets. We say the matrix A represents (or is a representation of) the matroid M(A). A matroid which can be obtained in this way is called a matric matroid or a linear matroid. A graphic matroid is a matric matroid represented by the incidence matrix of the associated graph. A binary matroid is a matroid which can be represented by a matrix over the field of {O,l}. A regular matroid is a matroid which can be represented by a matrix over any field. A graphic matroid is a regular matroid. The dual of a binary (regular) matroid is also binary (regular). A characterization of regular matroids by decompositions is given in [Seymour801 which leads us to an efficient algorithm for discerning the regularity of a matroid. Extensive studies on matroid decompositions have been made by K. Truemper (Truemper].
-
(3) Bipartite matchings: For a bipartite graph G = ( V + , V - ; A )with left and right end-vertex sets V s and V - and an arc set A , define 19
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA
I+
= {d+M
I M:
a matching of G},
(2.4)
where d+M is the set of left end-vertices of matching M . Then M+ = ( V + , I + ) is a matroid with I + being the family of independent sets. M+ is called a transversal matroid. Transversal matroids are matric. Consider the matrix P = ( p i j I i E V - , j E V + )with the row index set V - and the column index set V + such that (i) p i , # 0 (ii) p ; , = 0
if i E V - and j E V + are adjacent in G and otherwise.
Also, suppose that nonzero real elements p i , are algebraically independent, i.e., they do not satisfy any nontrivial multi-variable polynomial equation with rational coefficients. Then matrix P represents the transversal matroid M+. (4) General matchings: For a simple graph G = ( V , E ) a subset W of the vertex set V is said to be matchable if W is the set of the end-vertices of a matching in G. Let I be the set of subsets W of the vertex set V such that W is a subset of some matchable vertex set. Then (V,I) is a matroid, called a matching matroid.
(5) Algebraic matroids: Consider a field K and its extension H and let E be a set of a finite number of elements of H . Define I as the family of subsets X of E such that the elements of X do not satisfy any nontrivial multi-variable polynomial equation with coefficients chosen from K . Then ( E ,I ) is a matroid and is called an algebraic matroid. (6) Uniform matroids: Let E be a finite set with cardinality [El = n > 0. For any nonnegative integer k 5 n define
Then ( E , I k ) is a matroid. It is called a uniform matroid of rank k and is usually denoted by u k , n . In particular, Un,+= ( E ,2 E ) is called a free matroid and U O ,= ~ ( E ,(0)) a trivial matroid.
2.2. Polymat roids
Let E be a finite set and p be a function from 2E to R. Here, R is the set of reals but throughout this book R can be any totally ordered additive group such as the set Z of integers and the set Q of rationals unless otherwise stated. Suppose that the function p: 2 E + R satisfies
(3)P ( 0 ) = 0. 20
2.2. POLYMATROIDS
(3)
Figure 2.1.
(a)X C Y C E ==+ ~ ( xF )~ ( y ) .
(2) V X , Y C E : p ( X ) + p ( Y ) 2 p ( X U Y )+ p ( X n Y ) .
The pair ( E ,p ) is called a polymatroid and p the rank function of the polymatroid ((Edm7Ol). The rank function p is a monotone nondecreasing submodular function on 2 E with p ( 0 ) = 0 and does not necessarily have the unit-increase property as the rank function of a matroid. When p is the rank function of a matroid, polymatroid ( E ,p ) is called matroidal. Define
P(+)(P)= {x I x E R E ,x 2 0 , v.X 2 E : z ( X ) I ~ ( x ) } , (2.6) where for each X 5 E and x E RE
.(XI =
C z(e).
(2.7)
eEX
We define z(0) = 0. P(+)( p ) is called the independence polyhedron (or polymatroid polyhedron) associated with polymatroid ( E ,p ) . Also define
B b ) = {.
I x E P(+)b)l4 E ) = p(E)).
(2.8)
We call B(p) the base polyhedron associated with polymatroid ( E , p ) .The base polyhedron B(p) is always nonempty, which will be shown in Theorem 2.3. (See Fig. 2.1.) 21
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA It can be shown (see [Edm7O]) that the convex hull in RE of the characteristic vectors of the independent sets (or bases) of a matroid on E with the rank function p , where R is the set of reals, is the independence polyhedron (or the base polyhedron) associated with the matroidal polymatroid ( E ,p ) (see Corollary 3.25). Each z E P(+)(p) is called an independent vector and each z E B(p) a base of polymatroid ( E ,p ) . For an independent vector z E P(+)(p)define
D(z) = {X I
x 2 E , z(X) = p(X)}.
(2.9)
P(z) is closed with respect to set union and intersection, i.e., X, Y E P(z) ==+
X U Y, X n Y E P ( z ) , For, if X, Y E D(z), we have
o = ~(x)--s(x)+~(Y)-z(Y) 2 p(xuY)-z(XuY)+p(XnY)-z(XnY) 1 0, (2.10)
wherep(XUY)-z(XUY) 2 O a n d p ( X n Y ) - z ( X n Y ) 2 O s i n c e z E P(+)(p). It follows that X u Y, X n Y E D(z). So, D(z) is a distributive lattice with set union and intersection as the lattice operations, join and meet. Denote the unique maximal element of P(z) by sat(z), i.e., sat(z) =
U{X I x G E , Z(X)= p(x)}.
(2.11)
The function, sat: P(+)(p)+ 2E, is called the saturution function ([Fuji78a]). Informally, sat(z) is the set of the saturated components of x. More precisely,
I
sat(z) = { e e E E , VCX> 0:2
+ a x e $! P(+)( p ) } ,
(2.12)
where x e is the unit vector with x e ( e ) = 1 and x,(e’) = 0 (e’ E E - { e } ) . The saturation function is a generalization of the closure function of a matroid. For an independent vector z E P(+)(p)and an element e E sat(%)define
I
P(z,e) = {X e E X
c E , z(X) = p(X)}.
(2.13)
We have P ( z , e ) C P ( z ) and P ( z , e ) is a sublattice of D(z). Denote the unique minimal element of P ( z , e ) by dep(z, e ) , i.e., dep(z,e) = n { X
1 e E X C E,
z ( X )= p(X)}.
(2.14)
For each e E E - sat(z) we define dep(z, e) = 0. The function, dep: P(+)( p ) x E -+2E, is called the dependence function (see [Fuji78a]). For z E P ( + ) ( p ) and e E sat(z), dep(z, e ) = {e’
1 e’ E E ,
3n > O:Z 22
+
Q(Xe -
x e ~E) P ( + ) ( p ) } ,
(2.15)
2.2. P 0LY MATROID S
Figure 2.2. where note that for e' E dep(x,e) - {e} we have x(e') > 0. We call an ordered pair (e,e') such that e' E dep(x,e) - {e} an exchangeable pair associated with x. The dependence function is a generalization of a fundamental circuit of a matroid. For x E P(+)(p)and e E E - sat(x) define i.(x,e) = max{a
1
QI
E R,
X+ axe
E P ( + ) ( p ) } (> O ) ,
(2.16)
which is called the saturation capacity associated with x and e. The saturation capacity is also expressed as t(x, e) = min{p(X) - x(X) I e E X
c E}.
(2.17)
+
For any CY such that 0 5 a 5 2(x, e) we have x axe E P(+)( p ) . For x E P(+)(p), e E sat(x) and e' E dep(x,e) - {e}, define c"(x,e,e') = max{a
I
QI
E R, x
+a(Xe -
xet)
E P(+)(p)}(> 0 ) ,
which is called the ezchange capacity associated with x, e and e'. 2.2.) The exchange capacity is also expressed as c"(x,e, e') = min{p(X) - x(X) 1 e E X C E , e' $ X}, where note that x(e') 2 t(x, e,e'). For any have 2 + a ( X e - x e l ) E P(+)(P).
CY
such that 0 5 cr
(2.18) (See Fig. (2.19)
5 C(x,e,e') we
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA A vector v E RE such that x 5 v for any x E P(+)( p ) is called a dominating vector of ( E , p ) . Note that v E RE is a dominating vector of ( E , p ) if and only if v ( e ) 2 p ( { e } ) for all e E E . For a dominating vector v of the polymatroid P = ( E , p ) define p i u , : 2E -+ R by Pi,) ( X )= . ( X )
+ P(E - X ) - P(E)
(X
c El.
(2.20)
Then Piu = (E,pi,,))is a polymatroid and is called the dual polymatroid of P = ( E , p )with respect to v ([McDiarmid75]). (See Fig. 2.3.)
0'
O1
Figure 2.3. For most polymatroids there is no reasonable and physically meaningful way of choosing a dominating vector v and the arbitrariness of dual polymatroids remains, though the choice of v = ( p ( { e } ) I e E E ) may be reasonable. For matroidal polymatroids, the matroid duality corresponds to the polymatroid duality with respect to the vector 1 of all the components being equal to 1. Examples of a P o l y m a t r o i d (1) Multiterminal flows: Consider a capacitated flow network N = (G = (V,A ) , c , S+, S - ) with the underlying graph G = (V,A ) , a nonnegative capacity function c: A + R + , the set S+ of entrances and the set S- of exits, where 24
2 . 2 . P 0 LYMAT RO I D S S+, S- C V and S+ n S- = 0 . We assume that there is no arc entering S+ or leaving S - . A function p: A + R is a feasible flow in M if
Va E A: 0 5 p(a) 5 c(a),
(2.21)
v - (S+ u s-):dp(v) = 0,
vv E
(2.22)
where dp: V -+ R is the boundary of p defined by (2.23) aE6+v
aE6-u
It is shown (see [Megiddo74]) that the set {(dp)” I p: a feasible flow in M} is the independence polyhedron of a polymatroid on the set S+ of entrances, where ( 8 ~ ) is ~ the ’ restriction of dp on S+. Similarly, we have a polymatroid on the set S- of exits. (2) Linear polymatroids: Let A be a matrix with the column index set E . Suppose that E is partitioned into nonempty disjoint subsets Fl, F2, ..., F,, and define 6 , = { 1 , 2 , - .,n}. Also define for each X & ,6
p ( X ) = rankAX,
(2.24)
where A X is the submatrix of A formed by the columns of A with the indices in U{F, I i E X } . We see that ( , 6 , p ) is a polymatroid, which is called a linear polymatroid. (3) Entropy functions: Let 21, 2 2 , ..., z, be random variables taking on values in a finite set { 1 , 2 , . - ,N } . For the set E = {XI,- . ,z}, of the random variables, define for each subset X of E
-
3
the entropy of X in the Shannon sense 0 if X = 0.
h(X)=
-
For example, if X = {z1,22,
h(X)= -
N
N
il=l
ik=l
C C
p(z1
9
,z k }
(1
if X #
0,
(2.25)
5 k 5 n) ,
= i1,***,zk = ik) log,p(zl = i 1 , . * - , z k = ik),
(2.26)
where p ( z 1 = 21,. ,z k = ik) is the probability of the event that x1 = il, . .-, xk = ik. The function h : 2 E -+ R+ is called an entropy function. Entropy function h is a monotone nondecreasing submodular function with h(0) = 0, 25
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA i.e., ( E , h ) is a polymatroid. See [Fuji78c] for polymatroidal problems in the Shannon information theory. (4) Convex games: Consider a characteristic-function game (see, e.g., [Shubik821). Let E = {1,2, n} be a set of n persons, called players. A characteristic function v is a nonnegative function defined on the set of coalitions which are subsets of E , where we assume v ( 0 ) = 0. A characteristic-function game ( E ,v ) is called a convez game ([Shapley’ll]) if the characteristic function v is supermodular, i.e., e . ,
VX,Y
C E : v ( X ) + v ( Y ) 5 v ( X u Y )+ v ( X n Y ) .
(2.27)
The core of the game ( E , v ) is the set of payoff vectors defined by {z
I z E RE,V X C E : z ( X ) 2 v ( X ) , z ( E )= v ( E ) } .
Define the function v # : 2E
-+
(2.28)
R by
v # ( E - X ) = v ( E )- v ( X ) ( X
CE).
(2.29)
Then we can show that v # is a polymatroid rank function and that the core given by (2.28) coincides with the base polyhedron B(v#) of the polymatroid (see Lemma 2.4).
2.3. Submodular Systems
Let E be a nonempty finite set and D be a collection of subsets of E which forms a distributive lattice with set union and intersection as the lattice operations, join and meet, i.e., for each X , Y E D we have X U Y , X n Y E D . Let f : D --+ R be a submodular function on the distributive lattice D, i.e.,
We have the following fundamental lemma concerning submodular functions.
Lemma 2.1 ([Ore56]): Let f : D a distributive lattice D
C 2E.
--+ R be an arbitrarysubmodular function on Then the set of all the minimizers o f f given by
(2.31) forms a sublattice of D , i.e., for any X , Y E 26
DO we have X U
Y, X n Y E
Do.
2.3. SUBMODULAR SYSTEMS
Figure 2.4. (Proof) For any X, Y E PO,
u Y),f ( X n Y ) }2 f(X) = f(Y).Hence we must have f(xU Y ) = f ( X n Y )= f(X)(= f ( y ) ) i.e., , X UY , X n Y E PO. Q.E.D.
where min{f(X
For a submodular function f on a distributive lattice D 5 2E with 8, E E D and f(0) = 0, we call the pair (P, f ) a submodular system on E, and f the rank function of the submodular system (see [Fuji78b, 84~1).We call f(E) the rank of ( D l f ) . Define a polyhedron in RE by
P(f) = {Z I x E RE,VX E
D: z(X) 5 f(X)}.
(2.33)
We call P(f) the submodular polyhedron associated with submodular system ( D , f ). Also define
B(f) = {.
I Z E P(f), 4 E ) = f ( E b
(2.34)
We call B(f) the base polyhedron associated with submodular system (P, f ) . (See Fig. 2.4.) 27
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA A vector in the base polyhedron B(f) is called a base of (D,f ) and a vector in the submodular polyhedron P ( f ) is called a subbase of (D,f ). The saturation function, the dependence function, the saturation capacity and the exchange capacity introduced for polymatroids can easily be extended for submodular systems.
Lemma 2.2: For any subbase x E P(f) define D(x) = {X I X E D, x(X) = f(x)}. Then D(x) is a sublattice of
(2.35)
D.
(Proof) This follows from Lemma 2.1 since D(x) is the set of minimizers of the nonnegative submodular function f - x: D + R . Q.E.D. The unique maximal element of D ( x ) is denoted by sat(x). sat: P(f) + 2E is the saturation function. For any subbase x E P(f) and e E sat(x), define D(x,e) = {X I e E X E
D, x ( X ) = f(X)}.
(2.36)
Then D ( x , e ) is a sublattice of D. (Note that D(x,e) is the set of minimizers of the nonnegative submodular function f - x on the distributive lattice D(e) { X I e E X E D}.) The unique minimal element of D(x,e) is denoted by 2E dep(z,e). For any e E E -sat(s) we define dep(x,e) = 8. dep: P(f) x E is the dependence function. For any x E P(f) and e E E - sat(x) the saturation capacity Z(x,e) is defined by (2.37) cI(x,e) = min{f(X) - x(X) I e E X E D}. --f
+
For a nonnegative a,we have x axe E P(f) if and only if 0 5 a 5 Z(x, e ) . Moreover, for any x E P(f), e E sat(x) and e' E dep(z,e) - {e} the exchange capacity Z(x, e, e') is defined by Z(x, e, e') = min{f(X)
-
x(X) I e E X E
D, e' $ X } .
(2.38)
For a nonnegative a , we have z + a(xe- x e # )E P(f) if and only if 0 5 a 5 C"(x,e,e'). (Note that if 2: E B ( f ) , then 3: a(xe- xet) E P(f) implies + a ( x e - E B(f)*)
+
x e l )
Theorem 2.3: The base polyhedron B(f) is the set of all the maximal subbases of (D,f).In particular, B ( f ) # 0. Here, the partial order 5 among vectors in RE is the one defined for vector lattice RE (i.e., for x, y E RE we have x 5 y i f and only i f x ( e ) 5 y(e) for all e E E ) . 28
2.3. SUBMODULAR SYSTEMS
(Proof) Denote by B'(f) the set of all the maximal subbases of (D,f). Any x E B(f) is maximal in P(f) since z ( E ) = f ( E ) ,so that we have B(f) E B'(f). Conversely, for any x E B'(f) we have sat(z) = E due to the maximality of 5. It follows that z ( E ) = s(sat(z)) = f(sat(z)) = f ( E ) ,i.e., x E B(f). Therefore, B'(f) C B(f), and hence we have B'(f) = B(f). Q.E.D. From Theorem 2.3 we see that for any subbase z of (D, f ) there exists a base y of ( 0 ,f ) such that x 5 y. A function g: D 4 R on the distributive lattice D is called a supermodular function if -g is a submodular function, i.e.,
If 0, E E
D
and g(0) = 0, the pair (D,g) is called a supermodular system on E .
Define P(g) = {x I x E
B(g) = {x
R E ,VX E D: x(X) 2 g(X)},
I 2 E P(g),
x ( E ) = g(E)}.
(2.40) (2.41)
P(g) is called the supermodular polyhedron and B(g) the base polyhedron associated with the supermodular system (D,g). A vector in P(g) is called a superbase and a vector in B(g) a base of (D,g).
A function which is submodular and at the same time supermodular is called a modular function. For a modular function x: D -+ R, P(z) should be considered as either a submodular polyhedron or a supermodular polyhedron according as we consider x as a submodular function or a supermodular function. There will be no confusion by the use of this notation. If we consider the dual order I* of the ordinary order I among R (where the dual order I* is the one such that a I* /3 if and only if p 5 a, for each a,@ E R ) , then a supermodular function g: D 3 R with respect to (R, 5 ) is a submodular function with respect to (R, I*). In the same way the supermodular polyhedron P(g) and a superbase x E P(g) with respect to (R, 5 ) is a submodular polyhedron and a subbase with respect to (R, <*). -+ R as For a submodular system (D,f)on E define a function f # : follows.
-
D ={E-
f#(E- X)
x I x E D},
(2.42)
f ( E )- f ( X ) (X E D).
a
(2.43)
of D. We call the pair ( D , f # ) the dual supermodular system of ( D , f ) . (See Fig. 2.5.) Similarly, we
f_# is a supermodular function on the dual lattice
29
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA 2
0
Figure 2.5. define the dual submodular system (D,g#) of a supermodular system ( D 7 g ) , where is defined by (2.42) and g# by (2.43) with f replaced by g .
D
Lemma 2.4: We have B(f) = B ( f # ) and ( f # ) # = f . (Proof) Consider the system of inequalities and a n equation for base polyhedron
B(f):
.(X)
I f ( X ) (XE 017
(2.44a)
4E)= f(E).
(2.44b)
This is equivalent to the following:
which defines B ( f # ) . Furthermore, for X E
7= D
(2.46) 30
2.3. SUBMODULAR SYSTEMS Q.E.D.
It follows that (f#)# = f.
It should be noted that in the above proof we do not use the submodularity o f f and the lattice properties of D. That is, Lemma 2.4 holds for any function f on any family D without the submodularity of f , where we define and f # by (2.42) and (2.43). The submodular/supermodular polyhedron and the base polyhedron of a submodular system also arise from more general functions. A family 3 2E is called an intersecting family if for each intersecting X , Y E 3 (i.e., X , Y E 3 and X n Y # 0) we have X u Y , X n Y E 3. A function f: 3 + R on the intersecting family 3 is called intersecting-submodular if for each intersecting X , Y E 3 we have the submodularity inequality
f ( X I + f ( Y )2 f ( X u Y ) + f ( X n Y ) .
(2.47)
Moreover, a family 3 C 2E is called a crossing family if for each crossing X , Y E 3 (i.e., X , Y E 3, X n Y # 0, X - Y # 0, Y - X # 0, and X u Y # E ) we have X u Y, X n Y E 3. A function f: 3 -+ R on the crossing family 3 is called crossing-submodular if for each crossing X , Y E 3 we have the submodularity inequality (2.47) (see [Edm+Giles77]). Note that the term, submodular, without any prefix is used for such a function f: D + R on a distributive lattice D that satisfies (2.47) for all X , Y E D. The following theorems, Theorems 2.5 and 2.6, concerning intersectingand crossing-submodular functions on intersecting and crossing families, play a very important r81e in the combinatorial optimization problems described by intersecting- or crossing-submodular functions on intersecting or crossing families, and reveal the essential combinatorial structures of the problems (which will also be seen in Chapter 111). The proofs of Theorems 2.5 and 2.6 will be given in Section 3.4.
Theorem 2.5 [Fuji84b]: (i) Let f be an intersecting-submodular function on an intersecting family 3 5 2E with 0, E E 3 and f(0) = 0. Define
P(f) = {z I z E R E , V X E 7 : x ( X ) I f(X)}.
(2.48)
Then there exists a unique submodular system (PI, f1) on E such that
P ( f ) = P(f1).
(2.49)
Moreover, i f f is integer-valued, then fi is also integer-valued. (ii) Let f be a crossing-submodular function on a crossing family ? & 2E with 0, E E 3 and f(0) = 0. Define
B(f) = { X
I x E RE,V X E ?:
x ( X )5 f ( X ) ,x ( E ) = f ( E ) } 31
(2.50)
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA and suppose B ( f ) on E such that
# 0. Then there exists a
unique submodular system
(P2,fz)
B ( f ) = B(f2). Moreover, i f f is integer-valued, so is
(2.51)
f2.
Theorem 2.6 [Fuji84b]: (i) Let f be an intersecting-submodular function on an intersecting family 3 2 2E with 0, E E 3 and f(0) = 0. The rank function f~ of the submodular system ( P I ,f l ) in (i) of Theorem 2.5 is given as follows. For each Y C E define
I {xi I i E I } : a partition of Y,
fP(y) = m i n { C f(xi) iEI
Vi E I :
xi E 7 } ,
(2.52)
where fp(Y) = +oo if the minimum in (2.52) is taken over the empty set. Then, fl is the restriction of fp on D1 = {Y I Y E , f , ( Y ) < +m}. (ii) Let f be a crossing-submodular function on a crossing family 3 2E with 0, E E 3 and f(0) = 0. Define f p ( ~ )= min{
C f(xi)I
{Xi I i E I } : a partition of E ,
i€I
ViE I:
(f#)p(E)= m a x { x f#(Xi)
I
xi E 3 } ,
(2.53)
{Xi I i E I } : a partition of E ,
iEI
Vi E
I: E
- Xi E
3},
(2.54)
where
f#(X) = f ( E ) - f ( E - X) ( E - X E 3).
(2.55)
Then B(f) defined by (2.50) is nonempty if and only if
or equivalently
Cf # ( ~I ) f ( ~I)C f(xi) j € .I
(2.57)
iEI
for any partitions {Xi I i E I} and {Yj I j E J } of E with Xi E 3 (i E I) and E - Yj E 3 ( j E J ) . 32
2.3. SUBMODULAR SYSTEMS Moreover, if B(f) # 0, the rank function f 2 of ( D z , fz) in (ii) of Theorem 2.5 is given its follows. For each Y C E define fz in terms of its dual by
where
fP(y) =min{C
f(x;)I {x;I i E I>:a partition of Y,
;€I
'di E I : 3p
= {X I
x s E, f,(X)
X;E 3) (Y E ) ,
(2.59)
< +m>,
(2.60)
(fP)#(X) = f ( E ) - fP(E - X) (E - x E 7P)
(2.61)
and the minimum (or maximum) taken over the empty set is equal to +m (or -m). The rank function f2 is the restriction of fz on DZ = {X I X 2 E, fZ(X) < +4. Function fp in (2.52) is called the Dilworth truncation of the intersectingsubmodular function f : 3 --f R (see [Dilworth44], [Edm7O], [Lov&sz77]). We also call f, in (2.59) the Dilworth truncation of the crossing-submodular function f : 3 -+ R. Note that f2# in (2.58) is the Dilworth truncation of the -+ R. f 2 is also called the biintersecting-supermodular function ( f p ) # : truncation o f f in [Frank Tardos881. Since B ( f ) = B(f#), we can also obtain a dual formula for f 2 , beginning with f # instead of f . It should be noted that for a crossing-submodular function f on a crossing family 3 the polyhedron P(f) defined by (2.48) may not be a submodular polyhedron but that the intersection of P ( f ) with any hyperplane s ( E ) = k (const.), if not empty, is a base polyhedron (see Fig. 2.6).
+
Examples of a Submodular S y s t e m Matroids and polymatroids are examples of a submodular system. We show some non-polymatroidal submodular systems. (1) C u t functions: A typical non-polymatroidal submodular system arises from network flows. Let N = (G = (V,A),g,E) be a capacitated network with the underlying graph G = (V,A) and the lower and upper capacity functionsc:A -+ RU{-m}
33
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA 43)
A
/
0
0
I I
I I
41)
I I
Figure 2.6. P ( f ) for a crossing-submodular function f .
5
and i?A + R U {+m} such that c ( a ) - 2' -+ R U {+m} by function n,,~:
%,a(U) =
c
qa) -
Z(a) for each arc a E A . Define a
c
C(4
(U
c V),
(2.62)
a€ A - U
aEA+ U
where A + U is the set of arcs leaving U in G and A-U is the set of arcs entering U in G. For each U,W E 2' such that K ~ , ~ (
+ .c,a(W) -
- &,F(U
u W )- 'C.,a(U n W )
I
= c { ' Z ( a )- ~ ( a )a E A , a s a E U - W , d - a E W
+ x { Z ( a )- c ( a ) I a E A , 8 - a
-
U}
E U - W , d'a E W - U }
2 0.
(2.63)
Therefore, P ( c , C ) 2 2E defined by
P(,.)
= {U
I u c v, Kc,,(U) < +m}
(2.64)
is a distributive lattice with 0, V E P ( c , C ) . Denoting the restriction of tccc3z to P(c,'Z) by K,,T - again, we have a submodular function 'cg.zon the distributive 34
2.3. SUBMODULAR SYSTEMS the cut function lattice P(c,Z), where s,,C(@)= s:,?(V) = 0. We call s c , ~ associated with network N = (G = (V,A),c,Z). The cut function sc,?is not monotone nondecreasing for nontrivial networks. A feasible $ow p in = (G = (V,A),c,Z) is a function p: A -+ R such that c ( a ) 5 p(a) 5 Z(a) for any a E A,’ The set of the boundaries dp of feasible flows p in .A( = (G = (V,A),c,Z) is given by
which is the base polyhedron associated with the submodular system ( D ( c , Z ) , n,,?). (See (2.23) for the definition of the boundary dp.) The fact that each base in B(n,,z) is expressed as the boundary d p of a feasible flow p in N can be shown by the use of the feasible circulation theorem (Theorem 1.3) of A. J. Hoffman [Hoffman601as follows. For any x E B(sc,?), - consider a new vertex s 4 V and new arcs ( s , ~ ) (v E V ) ,and definec(s, v) = E(s,v) = z(v) (v E V ) . Denote the augmented network by 1’= (G’ = (V U { s } , A U { ( s , v ) I v E V}),c,E). There exists a feasible circulation (a feasible flow with the zero boundary) in .A(’ if (and only if) for every U C V U {s} we have (2.66) where As and A- are with respect to GI. (2.66) is equivalent to z E B(sG,z). Therefore, there exists a feasible circulation p in N’. Restricting p to A , we obtain a required feasible flow in U whose boundary is equal to x . The converse, d @ B(sC,z), is immediate. (2) Cross-free families: For a finite set E let 3 2E be a cross-free family, i.e., for each X , Y E 3 X and Y do not cross, where we assume 0, E E 3. Then for any function f : 3 R with f ( 0 ) = 0, if B(f) = { x I x E RE,VX E 3 : 5 ( X ) 5 f ( X ) , z ( E ) = f(E)} # 0, B(f) is a base polyhedron due to Theorem 2.5, since 3 is a crossing family and f is a crossing-submodular function on 3. Moreover, if 3 is laminar, i.e., for each X , Y E 3 X n Y # 0 implies X Y or X 2 Y , then for any function f on 3 P(f) = { z 1 x E R E ,V X E 3 : x ( X ) 5 f ( X ) } is a submodular polyhedron due to Theorem 2.5. Note that laminar 3 is an intersecting family and that f is an intersecting-submodular function on 3. Also, if all the members of 3 form a chain Xo c XIc . . c X,,3 is called nested. A nested family is laminar and any function f on a nested family 3 with 0, E E 7 and f(0) = 0 gives the submodular polyhedron P ( f ) . It should --f
35
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA also be noted that a nested family trivially forms a distributive lattice. See [Tamir80] for an application to a production-sales planning model.
3. Submodular Systems
In this section we show basic properties of submodular systems. 3.1. Fundamental Operations on Submodular Systems
We introduce several fundamental operations on a submodular system on E .
(D,f )
(a) Reductions and contractions by sets For any A 6 D define
DA
={X
1A2X
E
D},
f A ( X )= f ( X ) ( X E P A ) .
(3.1) (34
Then ( P A ,f A ) is a submodular system on A and is called the reduction or the restrictionof ( D , f ) to A . Wedenote(DA,fA)by ( D , f ) . A o r( D , f ) - ( E - A ) . Also define for A E D
DA
I A C X C D},
={X -A
fA(x)=
f(xu A )
-
!(A)
(xE PA).
(3.3)
(3.4)
Then we have a submodular system ( D A , ~ A )on E - A , which is called the contraction of ( D , f ) b y A and is denoted by ( D , f ) / Aor ( D , f ) x ( E - A ) . We call a submodular system obtained by repeated reductions and/or contractions of (D,f ) by sets a set minor of (D,f).
Lemma 3.1: For any A E D let x A be a base of the reduction (D,f ) A of submodular system ( D , f ) to A and X A be a base of the contraction ( D , f ) / A of (D,f ) by A . Then the direct sum P = xA @ X A of x A and X A defined by
is a base of (D, f ). Conversely, for any base P of (D, f ) satisfying 2(A ) = f ( A ) , restricting P on A (or on E - A ) yields a base of (D,f) A (or (D,f)/ A ) . 36
3.1. FUNDAMENTAL OPERATIONS ON SUBMODULAR SYSTEMS
(Proof) If x A is a base of any X E D
(D,f ) - A and X A
is a base of
(D,f ) / A ,we have for
Since P(E) = f ( E ) , it follows that P is a base of ( D , f ) . Conversely, if P is a base of ( D , f ) and satisfies ? ( A ) = f ( A ) ,then for any X E D A and Y E DA
."(X)
=
W) 5 f(W,
(3.7)
zA(Y)= P(Y U A) - ?(A)I f ( Y U A ) - f ( A ) .
(3.8)
Q.E.D. Lemma 3.2: For any X E D there exists a subbase x E P(f) such that x ( X ) = f ( X ) . Furthermore, for any X 2 E - D and any K > 0 there exists a subbase x E P(f) such that z ( X ) 2. K . (Proof) The first half follows from Lemma 3.1. The second half is shown as follows. For any X E 2 E - D there exist e E X and e' E E - X such that for each Y E D with e E Y we have el E Y . (For, if for each e E X and el E E - X there were Y E D such that e E Y and er 4 Y ,denoting one such Y by Ye,,!,we would have X = UeEX n e ' E E - X Ye,,!E P.) Hence for any y E P(f), yd y d ( x , - x e t ) belongs to P(f) for any d > 0. Therefore, the Q.E.D. value, y d ( X ) = y ( X ) d , can be made arbitrarily large.
+
+
(b) Reductions and contractions by vectors For any vector
3:
E RE define a function f": 2 E + R by
f " ( X ) = min(f(2)
+z(X-
2) I X 2 Z E
D}
(3.9)
for each X C E. Then the function f x : 2 E + R is a submodular function on the Boolean lattice 2E. For, denoting by ZX a minimizer Z of the right-hand side of (3.9), we have for each X , Y C E
+
+
f"(X) + f+(Y)= f ( Z x )+ z ( X - Z X ) f ( Z y ) x ( Y - 2,) 2 f ( Z X u Z Y ) + .((X u Y )- ( Z X u ZY)) + f ( zn~ZY)+ .((x n Y ) - (zXn zY)) (3.10) 2 f " ( X u Y ) f X (nxY ) .
+
37
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA
Figure 3.1. We call the submodular system (2E,fZ) the reduction of (P,f)by vector x. A base of the reduction ( z E ,f") is called a base of x in (P, f). Define (3.11) W)"= {Y Y E P ( f ) , Y 5 4 ,
I
which is the set of subbases of
(P, f ) smaller than or equal to
z (see Fig. 3.1).
Theorem 3.3: The submodular polyhedron associated with the reduction ( 2 E , f") of ( 0 ,f) by vector z is given by
Also, if z is a superbase of (P, f), i.e. z E P ( f # ) , then
where B(fIZ = {Y
I Y E B ( f ) , Y 5 .I.
(Proof) For any y E P ( f Z ) ,we see from (3.9) with X = 2 E Z = 0 that
D
or X = {e}
2
vx E D: Y(X) I f(X),
(3.14)
b'e E E: y(e) 5 z(e).
(3.15)
38
3.1. FUNDAMENTAL OPERATIONS ON SUBMODULAR SYSTEMS Hence y E P(f).. Conversely, for any y E P(f). we have (3.14) and (3.15). For any 2, X C E with X 2 Z E D,
Hence y E P(fz). Moreover, (3.13) follows from (3.12) and Theorem 2.3 since there exists a base y E B(f) such that y 5 x. Q.E.D.
For a supermodular system ( P , g )on E we define the reduction of ( 0 , g ) by a vector 5 E RE in a dual manner. Define gr: 2E t R by
g,(X) = max{g(Z) P(gL = {Y
+ x(X - 2 ) I X 2 Z E P},
(3.17)
I Y E P(g), Y 1 4
(3.18)
.
Then ( 2 E ,gz) is the reduction of (P, g) by x and its associated supermodular polyhedron is given by P(9.) = P(dZ (3.19) due to Theorem 3.3. From Lemma 3.2,Theorem 3.3 and (3.9) we have the following min-max relation.
Corollary 3.4: min(f(2)
+x(E
-
2 ) I Z E D} = max{y(E)
I y E P ( f ) , y 5 x}.
(3.20)
(Proof) It is easy to see that
for any Z E D and any y E P ( f ) with y 5 x. On the other hand, from (3.9) with X = E and Lemma 3.2 there exists a subbase y of (2”,f“) such that y ( E ) = fz(E). Since P(fz) = P(f)= due to Theorem 3.3, this shows the existence of Z E D and y E P ( f ) with y 5 x such that (3.21) holds with equality. Q.E.D.
In particular, for x = 0 (3.20) becomes
Corollary 3.5:
39
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA
Figure 3.2. Next, for any subbase x E P(f) define a function fi: 2E
fi(X) = min(f(2)
-x(2-X
) IX
ZE
--f
P}
R by (3.23)
for each X 2 E . Similarly as in (3.10), we see that fi: 2E -+ R is a submodular function on the Boolean lattice 2E. Here, it should be noted that fi(0) = 0 due to the fact that x E P ( f ) . We call the submodular system (2E,fi) the contraction of (D,f ) by vector 2 E P (1).
Theorem 3.6: For any subbase x E P ( f ) we have (3.24) (3.25)
(Proof) Since x E P ( f ) , we have fi(E) = f ( E ) . (3.24) easily follows from (3.23) and the definition of the dual function. Furthermore, from (3.20), Theorem 3.3 and the duality shown in Lemma 2.4 together with the remarks given after it, we have (3.26) B ( f J = B((fZ)#) = B((f#-)z) = B ( f # ) i = B ( f ) i
Q.E.D. The contraction of ( D , f ) by x E P ( f ) corresponds to the reduction of its f # ) by x (see Fig. 3.2). dual supermodular system
(a,
40
3.1. FUNDAMENTAL OPERATIONS ON SUBMODULAR SYSTEMS
For any subbase x E P ( f ) define [P(f)l” = {Y
I Y E R E ,x v Y E P(f)),
(3.27)
where x v y is the vector in RE defined by (xVy)(e) = max{x(e),y(e)} (e E E ) , i.e., the join of x and y in the vector lattice RE.
Lemma 3.7: For any subbase x E P ( f ) ,
(Proof) For any y E P(fz),we have from (3.23)
for any X and 2 such that X C 2 E D. This implies z V y E P ( f ) . Conversely, for any y E [P(f)lz, we have (3.29) for any X and 2 such that X C 2 E D, from which follows the fact that y E P(fz). Q.E.D. In a dual manner we define the contraction (zE,g”) of a supermodular system ( D , g ) by a superbase x E P(g), where g” = ((g#)”)#. A submodular system obtained by repeated reductions and/or contractions of submodular system (D, f ) by vectors in RE is called a vector minor of (D,f ) .
Theorem 3.8: If vectors x, y E RE satisfy (i) x 5 y , (ii) B(f)z # 0 (i.e., z is a subbase) and (iii) B ( f ) y # 0 (i.e., y is a superbase), then we have B ( f ) i (= ( B ( f W = (B(f)Y)”) # 0. (Proof) Since z E P(f), y E P(f#) and z 5 y, we have y E P ( f # ) z = P((fz)#). Q.E.D. Hence, B(f)g = B(f3C)y= B((fz)#)y # 0. For any x E RE the rank of the reduction of (D, f ) by x is denoted by r f ( x ) ( = f”(E)). The function r f : RE -+ R is called the vector rank function of ( D , f ) . From the definition, r j is a concave function (see (3.9)). (c)
Translations and sums
For any vector x E RE the translation of a submodular system ( D , f) by x is the submodular system whose rank function is given by f x: D -+ R , where x should be considered as a set function (a modular function) on 2” by x),we have x ( X ) = CeEX x(e) ( X E ) . For the translation ( D , f
+
+
(3.30) (3.31) 41
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA
Figure 3.3. where x in the right-hand sides is in R E and the sums in the right-hand sides denote the vector sum (see Fig. 3.3). It should be noted that a contraction of ( 0 , f ) by x E P ( f ) followed by a translation by -x corresponds to an ordinary contraction of a (poly-)matroid (see [Fuji8Ob]). The combinatorial structure of the submodular polyhedron and the base polyhedron is invariant with respect to translations and the rank function can be made monotone nondecreasing by an appropriate translation (see Lemma 3.23 in Section 3.3.a). Therefore, the monotonicity of the rank function plays no essential r81e in the theory of submodular systems but sometimes makes it easier for us to find an initial feasible solution in algorithms. Also, any results obtained in (poly-)matroid theory which are invariant with respect to translations can easily be extended to submodular systems. For example, we will generalize the polymatroid intersection theorem of Edmonds [Edm70] to submodular systems in Section 4.1. For two submodular systems (01, f l ) and (02,f2) on E the sum of the two submodular systems is defined as the submodular system (01 fl 02, f l f2) on E . We have
+
42
3.1. FUNDAMENTAL OPERATIONS ON SUBMODULAR SYSTEMS
and (3.33) will be shown in Section 4.2.) Note that a translation is a special case of a sum.
(d) Other operations For a submodular system (D, f ) on E let r be an arbitrary nonnegative element in R.Define
Then (D,f-r) is also a submodular system on E and is called the r-truncation of ( D , f ) . Similarly, we define the r-truncation, (D,g+r), of a supermodular system (0,s)by
The dual of the r-truncation of the dual supermodular system of the submodular system (D,f ) is called the r- elongation of (D,f ) ([Tomi8la]) (see Fig. 3.4).
Figure 3.4. 43
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA Denoting the r-elongation of
(D,f ) by (D,f + r ) ,
we have (3.38)
Similarly, in a dual manner we define the r-elongation of a supermodular system. For a submodular system (D,f ) on E and any a E R define (3.39) Suppose (Y 5 0. Then f ( O ) : D --t R is a crossing-submodular function on D. If B(f(")) # 0, then by Theorem 2.5 there uniquely exists a submodular system (D,f"(")) on E such that B(f(a)) = B(f(")). We call ( D , f " ( a ) ) the la[abridgment of ( D , f ) . Also, suppose (Y 0. Then (D,f'")) is a submodular system and is called the a-enlargement of ( D , f ) . (For the abridgment and enlargement also see [Tomi8lb,8lc].) The concepts of abridgment and enlarge0 and € 0);in a convex ment also appear in the game theory as €-core ( € game an €-core for E 2 0 (or 5 0) is exactly the base polyhedron of the Eabridgment (or 161-enlargement) of the associated submodular system defining the convex game (see [Shubik82]). For any partition II = { A l , A z , - . - ,ofAE ~ )a subset X of E is said to be compatible with ll if, for each Ai E II, Ai n X # 0 implies Ai g X. Define
>
>
<
(3.40) D(n)= {X I X E D, X is compatible with ll} and denote the restriction of f to D(n)by fn. The pair (D(II),fn)is a sub-
modular system on E and is called the aggregation of ( D , f ) by II. Aggregations play a fundamental r61e in a decomposition theory for submodular systems ([Fuji83], [Cunningham83b]) which generalizes the decomposition theory of graphs by Tutte [Tutte66].
3.2. Greedy Algorithm
In this section we consider a linear optimization problem over the base polyhedron and give an algorithm, called a greedy algorithm, for solving the problem. Before getting into the problem, we first examine the structure of the distributive lattice D C 2E and show the one-to-one correspondence between the set of distributive lattices D 2 E with 0, E E D and the set of partially ordered sets (posets) on partitions of E .
(a) Distributive lattices and posets 44
3.2. GREEDY ALGORITHM
For a distributive lattice D C zE the cardinality [Dlof D can be as large as 2IEI and listing all the elements of D to represent it is not practical even for medium-sized E . We shall show how to efficiently express a distributive lattice as a structured system, a poset, on E . Let D C 2E be a distributive lattice with 0, E E D. A sequence of monotone increasing elements of D
is called a chain of D and Ic is the length of the chain C. If there exists no chain which contains chain C as a proper subsequence, C is called a maximal chain of D. If C given by (3.41) is a maximal chain, we have SO= 8 and s k = E . For each e E E define
D ( e ) is the unique minimal element of and e' E D(e) we have D(e')
D containing e . Note that for any e E E
c D(e).
(3.43)
Also let G ( D ) = ( E , A ( D ) )be a (directed) graph with a vertex set E and an arc set A(D) given by
A(D)= {(e,e')
IeE
E , e' E D(e)}.
(3.44)
Decompose the graph G(D) into strongly connected components Gi = (Fi,Ai) (i E I ) . Let 5~ be the partial order on the set of the strongly connected components {Gi I i E I } naturally induced by the decomposition (i.e., G i , d ~ G i , for il, 22 E I if and only if there exists a directed path from a vertex of Gi, to a vertex of G i , ) . Note that from (3.43) G ( D ) is transitively closed (i.e., if there is a directed path from a vertex v l to a vertex v 2 , then there is an arc (v1, v2) in G ( D ) ) .Therefore, if Gi, $Gd2, there exists an arc from any vertex of G;, to any vertex of G i , . Denote the set of the vertex sets F; (i E I ) of the strongly connected components Gi (i E I ) by
n(D) = {Fi I 2 E
I}.
(3.45)
I T ( D ) is a partition of E . In the following we regard < D as a partial order on I T ( D ) by identifying Gi with Fd for each i E I . Now, we have obtained a poset P(D)= ( I I ( D ) , d o ) which , is called the poset derived from distributive lattice D. An example of a distributive lattice D and the poset P(D) derived from it is shown in Fig. 3.5. 45
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA
0
Figure 3.5. For a (general) poset P = (P,d), a set J e, el E P we have
C P is called
a (lower) ideal of
P if for each
e<e'EJ=+eEJ.
(3.46)
c
Theorem 3.9 [Birkhoff37]:Let D zE be a distributive lattice with 0, E E D. Then, for the poset P(D)= (II(D),dD) derived from D the following (i) and (ii) hold. (i) For each ideal J of P (D),
U{F I F E J} E (ii) For each X E
D, there exists an ideal J X=U{F
u{
D. of P(D) such
I FE
J}.
(3.47)
that (3.48)
(Proof) (i) : Put X = F I F E J } . Since J is an ideal of P (D), it follows from the definition of P(D) that D ( e ) C X for each e E X . Then, X = U { D ( e ) I e E X } and we have X E D since D ( e ) E D from (3.42). (ii): For a given X E D, X and any F E T-I(D) do not cross, i.e., either F C X or F C E - X , due to the definition of P(D).Therefore, J = {F I F E 46
3.2.GREEDY ALGORITHM
r I ( D ) , F C X } is a partition of X . Moreover, because of the definition of P ( D ) “ F i d ~ FC z X” implies “FI c X ” . Consequently, J = { F I F E I I ( D ) , F C: X } is a desired ideal of P ( D ) . Q.E.D. From Theorem 3.9, (3.47)(or (3.48))determines a one-to-one correspondence between D and the set of all the ideals of P ( D ) . It should be noted that for any poset P = ( P , i ) on a partition P of E the set D ( P ) defined by
D(P)
= {J”
I J: an ideal of P},
(3.49) (3.50)
forms a distributive lattice with 0, E E D ( P ) . We can also see that the mapping which assigns each D to its derived posets P ( D ) is a one-to-one correspondence between the set of distributive lattices D g 2E with 0, E E D and that of posets P = ( P , d ) on partitions P of E . From Theorem 3.9 we can easily show
Corollary 3.10: Given a distributive lattice D
C 2E
with
0, E
E
D , let
be an arbitrary maximal chain of D . Then we have (3.52) In particular, the length of any maximal chain of D is independent of the choice of a maximal chain and is equal to III(D)I. We call D simple if the partition n ( D ) is composed of singletons of E alone, i.e., n ( D ) = {{e} I e E E } . For a simple distributive lattice D we regard P ( D ) as a poset on E and write P ( D ) = ( E ,$). Conversely, the set of all the (lower) ideals of a poset P = ( E , d ) on E forms a simple distributive lattice D 2E and we denote such a simple D by Z P . A submodular system ( D , f) with simple D is called simple. For a non-simple submodular system ( D , f) on E , define
c
k = {F I F E rI(D), F c X } B = {k I X E D } , f(k)= f ( X )
(X E D).
Then we have a simple submodular system simplification of ( D , f ) . 47
(X
E
P),
(3.53) (3.54) (3.55)
(B, i) on II( D ) , which we call the
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA
(b) Greedy algorithm For a submodular system ( 0 ,f) on E we consider a linear optimization problem described as follows.
P,: Minimize
w(e)z(e)
(3.56a)
eE E
subject to x E B ( f ) ,
(3.56b)
where w : E + R is a given weight function. An optimal solution of P, is called a minimum-weight base of ( D , f) with respect to the weight function w . Similarly, a maximum-weight base of ( D , f) with respect to the weight function w is an optimal solution of Problem P-, with the weight function - w . Fundamental structural properties of the base polyhedron B(f) are given by the following theorems.
Theorem 3.11: The base polyhedron B ( f ) is pointed (or has at least one extreme point) if and only if D is simple, i.e., D = 2 p for some poset P = ( E ,5). (Proof) A polyhedron is pointed if and only if its characteristic cone (or re cession cone) does not contain any line ([Stoer+Witzgall70] , [Rockafellar70]). The characteristic cone of B(f) is the solution set of the following system of inequalities and an equation: (3.57) (3.58) Therefore, B ( f ) is pointed if and only if the system of equations
z ( X )= O
(XED)
(3.59)
has the unique solution x = 0 , where note that E E D . We see from Corollary 3.10 that (3.59) is equivalent to
x(F)= 0
(F E II(D)).
(3.60)
System (3.60) has the unique solution x = 0 if and only if IF( = 1 for each Q.E.D. F E I I ( D ) , i.e., D is simple. It should be noted that the rank of the coefficient matrix of the left-hand side of (3.59) is equal to the height of D , the length of a maximal chain of D .
Theorem 3.12: The base polyhedron B(f) is bounded if and only if D is the Boolean lattice 2 E , i.e., D is simple and complemented. 48
3.2. GREEDY ALGORITHM (Proof) If of
D
= 2E, then
B(f) is included in the following bounded solution set
On the other hand, if D # 2E, there exist distinct elements e , e’ E E such that for each X E D e E X implies e‘ E X . Then for any base z E B(f) the ray or halfline
is contained in B(f). Hence, B(f) is not bounded.
Q.E.D.
When (D,f ) is not simple, Problem P, is unbounded if w ( e ) # w(e‘) for any e, e‘ E F E II(0). Therefore, if P, has an optimal solution, w : E + R is constant on each F E r I ( D ) , and hence it suffices to consider the simplification of ( D , f ) . We suppose without loss of generality that in the minimum-weight base problem P, described by (3.56) B(f) is pointed, i.e., D = 2 p with P = ( E ,5 ) .
Theorem 3.13 ([Fuji+Tomi83]): Problem P, in (3.56) has a finite optimal solution if and only if w : E --t R is a monotone nondecreasing function from P = ( E , < ) to ( R , i ) , i.e., V e , e ’ E E : e 5 e‘==+ w ( e ) 5 w ( e ‘ ) . (Proof) The “if” part: Suppose that w is a monotone nondecreasing function from P = ( E ,5 ) to (R,2 ) and that the distinct values of weights w ( e ) ( e E E ) are given by w1
< w2 <
< wp.
(3.63)
Define
A; = {e
I e E E,
w(e) 5 w;}
(i = 1 , 2 , . . . , p ) ,
( 3.64)
where note that A , = E. The sets A; (i = 1 , 2 , - . ., p ) form a chain A1 c A2 C A,, of D. From Lemma 3.1 there exists a base z E B ( f ) such that
49
c
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA Then for any base y E B(f) we have from (3.63)-(3.65)
i=l e E A , - A , - 1 P
= C { w i ( y ( ~ i )- z ( ~ i )-) Wi(Y(Ai-1) - z ( ~ i - 1 ) ) ) i=l P- 1
4%)) + Wp(Y(Ap) - Z(Ap))
= C ( W i - Wi+l)(Y(Ai) i= 1 P- 1
=C(wi+l-
~ i ) ( f ( ~-iY) ( A ~ ) )
i= 1
2 0,
(3.66)
where we define A0 = 0, and recall A, = E . This shows the optimality of x. The "only if" part: Suppose that w is not a monotone nondecreasing function from P = ( E ,5 ) to (R, I),i.e., for some k (1 I k < p) A k defined by (3.64) does not belong to D. Then there exist elements e E A k and e' E E - Ak such that for any X E D with e E X we have e' E X . Hence, for any base z E B(f) we have c"(z,e,e') = +oo and x a ( X e - x e l ) E B(f) for any a > 0. Since w (e) < w (e') , Problem P, is unbounded. Q.E.D.
+
Corollary 3.14: Let P,' be the problem given by Pw where B(f) is replaced by P(f ) . Problem P,' has a finite optimal solution if and only if w: E + R is a nonpositive monotone nondecreasing function from P = ( E ,5 ) to (R, 5 ) . (Proof) The nonpositiveness is imposed on w since for any z E P(f) and e E E we have z - axe E P(f) for an arbitrary CY E R. For any nonpositive w , P,' has a finite optimal solution if and only if P, has a finite optimal solution., Therefore, the present corollary follows from Theorem 3.13. Q.E.D. Furthermore, we have the following
Theorem 3.15: Suppose that w is a monotone nondecreasing function from P = ( E ,5 ) to (R, I),i.e., the sets A; (i = 1,2, - - - , p ) defined by (3.64) form , p let fi be the rank function of the set a chain of D. For each a = 1, 2, minor (D,f ) 'AiIAi-1, where A0 = 0. Then the set of all the optimal solutions of Problem P, is given by B(f1) @ "2) = {xi @ 5 2 @
@
.
* *
@W P )
@
z p
1 xi
E B(f;)
50
(i = 1 , 2 , . . * , ~ ) } ,
(3.67)
3.2. GREEDY ALGORITHM where the direct sum @ is defined by (3.5). That is, x E B(f) is an optimal solution ofP,,, ifandonlyifxrestrictedon Ai-Ai-1 i s a baseof (D,f).Ai/A;-1 foreachi = 1, 2, p. - . a ,
(Proof) It follows from the proof of the “if” part of Theorem 3.13 that x E B(fl) @ . . - @ B(fp) is an optimal solution of Problem P,,,. On the other hand, if z is an optimal solution, then we must have dep(s, e) = 1, 2, p and e E Ai. This implies z(Ai) = f(A;) (i = 1 , 2 , . - . , p ) since A; = U{dep(z,e) I e E A;} (use Lemma 2.2). Therefore, we have z E B(fl) $ - . . @ B ( f p ) due to Lemma 3.1. Q.E.D.
A; for each i
Theorem 3.16: A base x E B(f) is an optimal solution of Problem P,,, if and only if for each e, e’ E E such that e’ E dep(x,e) we have w(e) 2 w(e‘).
(3.68)
(Proof) The “only if” part is trivial. The “if” part follows from Theorem 3.15. For, if (3.68) holds for each e, e’ E E such that e’ E dep(s,e), then we have (3.65) for Ai (i = 1 , 2 , . - . , p ) defined by (3.64). (Note that Ai = , p , and use Lemma 2.2.) Q.E.D. U{dep(x,e) e E A;} for i = 1, 2,
I
Theorem 3.16 says that the local optimality with respect to (elementary) transformations fromx to z + a ( x e - x e t ) (e E E , e’ E dep(x,e), a 2 0) implies the global optimality. The proof of Theorem 3.13 provides us with an algorithm, called a greedy algorithm, for solving Problem P,,, (see [Fuji+Tomi83]). The greedy algorithm was given by Edmonds [Edm7O]for polymatroids and by Lovbsz [Lovbsz83]for submodular systems with D = 2E. A sequence ( e l , e2,. . ,en) of all the elements of E (a linear or total ordering of E ) is called a linear eztension of P = ( E ,<) if ei 5 e j implies i 5 j (i, j = 1, 2, ..., p ) . Furthermore, a linear extension ( e l , e 2 , - . . , e , ) of P = ( E ,5 ) is called monotone nondecreasing with respect to w: E + R if w(e1) 5 w(e2) 5 . 5 w(e,). Such a monotone nondecreasing linear extension of P = ( E ,5 ) exists if and only if w is a monotone nondecreasing function from P = ( E , i ) to ( R , S ) . Suppose we are given a monotone nondecreasing weight function w from P = ( E , 5 ) to (R,I).
.
-
Greedy algorithm I Step 1: Find a monotone nondecreasing linear extension ( e l , e 2 , . * ., e n ) of P = ( E , 5 ) with respect to w. 51
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA Step 2: Compute a vector
E R E by
where for each i = 1, 2, . . . , n S; is the set of the first i elements of (el, e2, . ,en) and SO = 0. Then z is a minimum-weight base of (D, f ) with respect to weight w . (End)
...
It should be noted that 0 = SO c S1 c c S, = E is a maximal chain of D containing Ai (i = 1,2, ... , p ) defined by (3.64)and that conversely such a maximal chain of D gives a monotone nondecreasing linear extension (e1,e2,--.,en) of P by { e i } = S; - Si-1 (i = 1 , 2 , . - - , n ) . Also note that due to Theorem 3.15 every extreme minimum-weight base can be obtained by the greedy algorithm by appropriately choosing a monotone nondecreasing linear extension ( e 1 , e 2 , - . . , e n ) of D in Step 1. Corollary 3.17: Problem P, has a unique optimal solution if and only if w : E + R is a one-to-one monotone increasing function from P = ( E ,5 ) to
(R,I ) . Theorem 3.18: Let f : D + R be a function on a simple distributive lattice D 2E such that 0, E E D and f(0) = 0. Define B ( f ) = {z I z E R E ,V X E
D: z ( X ) 5 f ( X ) , z ( E ) = f ( E ) } .
(3.70)
Then, the greedy algorithm described above works for B(f) defined by (3.70) if and only i f f is a submodular function on D.
(Proof) It suffices to show the “only if” part. Suppose that the greedy algorithm works for B(f). Then, for each maximal chain
C : 0 = So of
c S1 c
* * .
c S,
=E
(3.71)
D the vector z E R E defined by (3.69) belongs to B(f). For any incomparable D choose a maximal chain C of (3.71) containing X n Y and X U Y
X, Y E
and define z by (3.69). Since z E B ( f ) , by the definition of x we have
52
3.2. GREEDY ALGORITHM
It follows that f is a submodular function on D.
Q.E.D.
The greediness of the algorithm may not be seen from the above description of the algorithm. We can also get the same optimal solution x as follows. Let xo be a vector in R“ with sufficiently small (possibly negative) components x o ( e ) ( e E E ) such that f - 5 0 :D + R is monotone nondecreasing. (This is satisfied if and only if xo 5 s,where g will be defined by (3.95).)
Greedy algorithm I1
Step 1: Find a monotone nondecreasing linear extension ( e l , e 2 , . P = ( E , d ) with respect to w . Put x t 50. Step 2: For each i = 1,2, - ,n put x
t
x
+ 2(x,ei)xe,.
- ,en) of (3.74)
Then x is a minimum-weight base of (D, f ) with respect to weight w . (End) This algorithm is a coordinate-wise “steepest descent” method, which shows the greediness of the algorithm. The following theorem shows the validity of the algorithm.
Theorem 3.19: Suppose that the same linear extension is chosen in Step 1 of greedy algorithms I and 11. Then, starting from xo such that f - 5 0 : D -+ R is monotone nondecreasing, greedy algorithm I1 finds the same minimum-weight base of ( D , f ) with respect to w that is obtained by greedy algorithm I. (Proof) It suffices to show that the finally obtained x satisfies (3.69). We show this by induction. Since f - xo is monotone nondecreasing, the minimum of min{f(X) - x o ( X ) I el E X E
D}(=
(3.75)
2(xo,el))
is attained by { e l } (E D). It follows that (3.69) is satisfied for i = 1. Suppose that we have just finished the ioth cycle in Step 2 of greedy algorithm I1 for some io with 1 5 io 5 n - 1 and that (3.69) holds for i = 1 , 2 , - . ,io. Let I be the current x . Since f(Si,) = I ( S i o ) ,we have for any X E D
-
I ( X ) - I(X) = f(X) - .(X) + f(Si0)- I(Si0) 2. f ( x u si,) - i(xu si,) + f(xn sio) - I ( X n Si,) 2 f ( X u Sill) - q x u Si,). 53
(3.76)
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA Therefore, the minimum of min{f(X) - Z(X) I ei0+1 E X E
D}(=
2(Z,eio+1))
(3.77)
is attained by some X E D such that Si, 2 X. For X E D such that Si, C X,
f(X) - Z(X) = f(X) - ZO(X) + ZO(Sio)- f ( S i 0 ) .
(3.78)
Because of the monotonicity of f - 20, the minimum of (3.78) for X E D such that Si, G X and eio+l E X is attained by X = Si, u {eio+l} = Sio+l. Consequently, (3.69) also holds for i = io 1. Q.E.D.
+
The linear programming dual of Problem P,,, in (3.56) is given by
PW*:Maximize
c
X(X)f(X)
(3.79a)
XED
subject to Ve E E : c { X ( X ) I
VX E D
- {E}:
X E D, e E X} 2 w(e),
X(X) 5 0.
(3.79b) (3.79c)
For an optimal solution x obtained by the greedy algorithm we have
where w;, Ai (i = 1,2,. Define
- - , p ) are defined by
(3.63) and (3.64) and A0 =
(i = 1 , 2 , - * * , p -l),
X(Ai) = wi - wi+l
X(E) = wP
8.
(3.81) (3.82)
and X(X) = 0 for other X E D. Then X is a dual feasible solution and it follows from (3.80) that the values of the objective functions of the dual problems coincide with each other. Hence X given above is an optimal solution of the dual problem Pu,*. From (3.81) and (3.82), for any integral w such that the primal problem P, has an optimal solution there exists an integral optimal solution of the dual problem P,*. That is, we have proved the following. 54
3.3. STRUCTURES OF BASE POLYHEDRA
Theorem 3.20: The system of inequalities and an equation given by
is totally dual integral for any submodular system ( D , f ) on E Since the system of (3.83) and (3.84) is totally dual integral, if the rank function f is integer-valued, each face of B(f) contains at least one integral vector and in particular, each vertex is integral ([Hoffman74],[Edm+Giles77]), which can also be seen from the greedy algorithm (cf. Theorem 3.22 below).
Corollary 3.21: The system of inequalities
is totally dual integral for any submodular system (D, f ) . (Proof) The proof is similar to that of Theorem 3.20. Use Corollary 3.14. Q.E.D.
3.3. Structures of Base Polyhedra Suppose that
(D,f ) is a simple submodular system on E .
(a) Extreme points and rays From the greedy algorithm and Corollary 3.17 we have
Theorem 3.22 (The extreme point theorem) [Fuji+Tomi83]: A base z E B(f) is an extreme point of the base polyhedron B(f) if and only if for a maximal chain c: 0 = s, c s1 c * * . c s, = E (3.86) of D we have
z ( e i ) = f(Si) - f ( S i - 1 )
(i = 1 , 2 , . . - , n ) ,
(3.87)
where { e i } = S, Theorem 3.22, when D = 2 E , has been shown in [Edm7O],[Shapley71] and [LovBsz83]. Define a vector iZ E R E by
D(e) = n { x I e E 55
x E P},
(3.88)
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA
for any e E E and X E D such that e E X and X - {e} E D. Hence, from Theorem 3.22, for any e E E E(e) is the maximum value of the eth component of bases of (D,f ) , i.e., E is the least upper bound (or the join) of all the extreme points of B(f) in the vector lattice RE.In particular, for each X E D,
f ( X ) I V),
(3.91)
since there is an extreme base z E B ( f ) such that z(X) = f ( X ) . Because of (3.91), i5 can be used for estimating an upper bound of f, i.e., max{f(x)
I X E D} I:max{i5(X) I X E D} 5 I e E E , z ( e ) > 0).
z{~(e)
(3.92)
Also note that, given any base z E B(f), a lower bound of f is computed as
I x E D} 2 min{z(x) I x E P I 2 C { z ( e ) I e E E ,
s(e) < 0 ) . (3.93) Moreover, denote by cy the greatest lower bound (or the meet) of all the extreme points of B(f) in the vector lattice R E .g is given similarly as (3.88) and (3.89) in a dual form: min{f(x)
Lemma 3.23: The rank function f of a simple submodular system (0, f) is monotone nondecreasing if and only if q 2 0. In other words, f is monotone nondecreasing if and only if every extreme point of the base polyhedron B(f ) belongs to the nonnegative orthant RT. (Proof) If f is monotone nondecreasing, then we have g 2 0 by definition (3.95). Conversely, if g 2 0, we have for any e E E and X E D with e f X and X u {e} E D,
f ( X u {e})
-
f ( X ) 2 f(D*(e)u {e})
- f(D*(e))= cy(e)
2 0,
(3.96)
from which follows the monotonicity o f f . We see from (3.96) that the minimum value of the eth component of bases of (D, f ) . i s equal to cy(e). Q.E.D. 56
3.3. STRUCTURES OF BASE POLYHEDRA
Lemma 3.24: The rank function f of a simple submodular system ( D , f ) is a modular function if and only if for an extreme base x of (D, f ) we have x = E(= cy).
(Proof) Since ( D , f ) is simple, the rank function f is modular if and only if
B(f) has one and only one extreme base. The present lemma follows from this Q.E.D.
fact and the definitions of E and Q.
It follows from Theorem 3.22 (also from Theorem 3.20) that if the rank function f of (D,f ) is integer-valued, every extreme point of B(f) is integral. In particular, we have the following polyhedral characterization of matroids due to Edmonds (Edm7Ol. Corollary 3.25 [Edm’lO]: For a rnatroidal submodular system (2E,p), where p is the rank function of a matroid M on E , the base polyhedron B(f) is the convex hull of the characteristic vectors of all the bases of the matroid M. From Theorem 2.5 we can easily see that if f is a nonnegative integervalued crossing-submodular function on a crossing family 3 and if the polyhedron given by B ( f t ) = { z I x E B(f), Ve E E:O I x(e) 5 1)
(3.97)
is nonempty, then B ( f i ) is the integral base polyhedron obtained by the reduction of B(f) by vector 1 = ( l ( e ) = 1 I e E E ) and the contraction by zero vector 0 and is a base polyhedron of a matroid. Hence, from Corollary 3.25 we see that
is a family of bases of a matroid. This is a result by A. Frank and 8. Tardos [Frank+TardosSl]. If the rank function f of a simple submodular system (D,f)is integervalued and has the unit-increase property (i,e., for any X , Y € D with X & Y and + 1 = IYJ we have f(Y)= f ( X ) or f ( Y ) = f(X) I), then all the extreme points of B(f) are {O,l}-vectors. Therefore, we can develop a matroid-like theory for such a submodular system (D,f ) , which corresponds to Faigle’s geometry on the poset P = ( E , < ) with D = 2’ [Faigle79,80]. In fact, Theorem 3.22 gives a polyhedral characterization of the family of bases of Faigle’s geometry. Next, denote the characteristic cone of the base polyhedron B(f) by C ( f ) , which is given by
+
1x1
C(f) = {z
I 2 E R E ,VX E D: z ( X ) 5 0, 57
x(E) = O}.
( 3.99)
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA Let G = ( E , A )be the graph with vertex set E and arc set A which represents the Hasse diagram of the poset P = ( E , < ) , i.e., ( e y e ' ) E A if and only if e covers e' (or e' 4 e and there is no element e'' E E such that e' 4 e'' < e ) . Also define a capacity function c on A by ~ ( a= ) +co
(a E
(3.100)
A).
Then we easily see that C ( f ) in (3.99) coincides with the set of the boundaries M = (G = ( E , A ) , c ) . (Note that the cut function IC, associated with network M = (G = ( E , A ) , c )satisfies (i) tcC(X)= 0 if X is an ideal of P and (ii) I C ~ ( X = )+oo otherwise. Hence, C(f) in (3.99) is the set of the boundaries ap of feasible flows (i.e., nonnegative flows) p in N.) Consequently, we have
dp of nonnegative flows p in the network
T h e o r e m 3.26 (The extreme ray theorem) [Tomi83]: The extreme rays of the characteristic cone C(f) of the base polyhedron B(f) are exactly those represented by the vectors
(3.101)
X e - Xe'
for all e, e' E E such that e covers e' in P = ( E , < ) .
(Proof) Since the characteristic cone C(f) is the set of the boundaries of nonnegative flows in N = ( G = ( E , A ) , c ) defined above, C(f) is generated by vectors xe - Xer such that (eye') E A . Moreover, vector xe - Xer for an arc (eye') E A can not be expressed as a nonnegative linear combination of the other vectors xel - x,; with ( e l , e:) E A - { (e, el)}. Q.E.D. Theorem 3.13 characterizes the dual cone C*(f) of C(f), where C * ( f ) = {Y
I Y E R E , Vx E C(f):
x(e)y(e)
5 0).
(3.102)
eEE
Theorem 3.13 says that -C* ( f ) consists of monotone nondecreasing functions from P = ( E ,4)to (R, 5 ) or that C*(f) consists of monotone nonincreasing functions from P = ( E , < ) to (R,<). Therefore, C*(f) is generated by the characteristic vectors of the (lower) ideals of P.
(b) Elementary transformations of bases For any base z of submodular system e' E dep(z,e) - {e} we have z+a(~c-Xer)
(P,f ) on E
EB(f) 58
and for e, e' E E such that
(05aLc"(z,e,e')).
(3.103)
3.3. STRUCTURES OF BASE POLYHEDRA
+
The transformation of base x E B(f) into such a base x a ( x e- x , ~ E ) B(f) is called an elementary transformation of base x E B(f). The following theorem is important from an algorithmic point of view. For any real number a, La] denotes the maximum integer less than or equal to a.
Theorem 3.27: For any two bases x, y E B(f) base x can be transformed into base y by at most 11El2/4J repeated elementary transformations such that each component z(e) with x(e) < y(e) monotonically increases and each component x(e) with x(e) > y(e) monotonically decreases. (Proof) Consider the following algorithm. 1' If x = y, then stop.
2' Choose any element e E E such that x(e)
< y(e).
3' Choose any element e' E dep(x,e) such that %(el) > y(e'). Put a min{y(e) - x(e), x(e') - y(e'), E(x,e,e')} and x t x Q(x, - x,r) .
+
t
a! < y(e) - x(e), then go to Step 3'. Otherwise (a= y(e) - x(e)) go to Step 1'.
4' If
Note that if x # y, there is an element e such that x(e) < y(e) and that if x(e) < y(e), there is an element e' E dep(x,e) such that x(e') > y(e'), since otherwise, putting X = dep(x,e), we would have f(X) = x(X) < y(X), a contradiction. Also note that if a < y(e) - x(e) in Step 4 O , we have a = min{x(e') - y(e'), E(x,e,e')} and the number of elements in {e'' I e" E dep(x,e), x(e") > y(e")} decreases by at least one after the elementary transformation x t x a(xe- x , ~ ) Therefore, . the case where a < y(e) - x(e) is repeated at most IS-1 times, where S- = { e I e E E , x(e) > y(e)} for the initial base x. Defining S+ = {e 1 e E E , x(e) < y(e)} for the initial x, the total number of the elementary transformations is at most IS+ I x IS- 1, which is bounded by [IEl2/4J. Q.E.D.
+
Consider a capacitated network N = (G = (V,A),c,Z)with an underlying graph G and lower and upper capacity functions c, C: A + R with -c 5 t. Let 'c.,~: 2v + R be the cut function associated with the network N = (G = ( V , A ) , c , Z )(see (2.62) in Section 2.3). The set of the boundaries of feasible flows in 4 is the base polyhedron associated with the submodular system (2V,nc,c). Note that there exists a feasible circulation (a feasible flow 'p such that 8'p = 0 ) in N if and only if 0 E B('cc,z). Since dc E B('cC,?), there exists a feasible circulation in N if and only-if the base dc E B(t&) is transformed into 0 by repeated elementary transformations as in Theorem 3.27. A standard algorithm for finding a feasible circulation by the use of a max-flow algorithm consists of such repeated elementary transformations (cf. [HoffmanGO], [Ford+Fulkerson62]). 59
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA For a (directed) graph G = (V,A) and a {O,l}-valued function p: A + {0,1} define the graph G, as the one obtained from G by reorienting arcs a E A such that p(a) = 1. We say 'p defines the reorientation G,. A graph G = (V,A ) is strongly k-connected (k: a positive integer) if for each nonempty proper subset U of vertex set V there exist at least k arcs from U to V - U. We call p: A + {0,1} a strongly k-connected reorientation of G if G, is strongly k-connected. Define a capacity function c : A + R by C(U)
( a E A),
=1
(3.104)
and let K : 2E + R be the associated cut function, i.e., K ( U ) = c ( A + U ) = IA+UI for U C V . Also define
Then ~ ( ' 1 : 2v + R is a crossing-submodular function on 2v and defines a base polyhedron B(rc(')), if B(K(')) # 0 , due to Theorem 2.5. B(K(')) is called the k-abridgment of B(K) in [Tomi8lb, 81~1.It can easily be seen that B(K(')) = {dp I p defines a strongly k-connected reorientation of G}. (3.106) Here, we assume that the underlying totally ordered additive group R is the set Z of integers. Also note that the strong k-connectedness of a reorientation of G depends only on the boundary d p of p which defines-the reorientation. Because of this fact and Theorem 3.27 we obtain a theorem of F'rank [F'rank82a]: 'Let GI and G" be strongly k-connected reorientations of G. Then there exists a sequence of strongly k-connected reorientations GI = Go, GI, * * -, G, = G" of G such that for each i = 1 , 2 , - - - , mGi is obtained by reorienting arcs in a directed path or a directed cycle in Gi-1." Note that an elementary transformation of a base (or a boundary dp) in
B(K(')) corresponds to a transformation of the reorientation of G (defined by by reversing the arcs in a directed path and possibly directed cycles and that reversing arcs in a directed cycle does not change the boundary d'p.
'p)
(c) Tangent cones
For any base z E B(f) associated with submodular system (0,f) on E the tangent cone of B(f) at z,denoted by TC(B(f),z), is defined by
60
3.3. STRUCTURES OF BASE POLYHEDRA
Here, the underlying totally ordered additive group R is assumed to be the set of reals (or rationals). Given a base z E B(f), we call an ordered pair (e, e l ) of elements of E an ezchungeubie pair associated with z if e' E dep(z,e) - {e}.
Theorem 3.28: The tangent coneTC(B(f),z)ofB(f) at a basexisgenerated by the set of the following vectors:
In other words, for any vector y E TC(B(f),z) there exist some nonnegative coefficients A(e, e l ) for exchangeable pairs ( e , e l ) such that Y=
C{A(e,e')(xe - x e , ) I ( e ,
el):
an exchangeable pair associated with z}. (3.109)
(Proof) Let C be the cone generated by the vectors in (3.108). It follows from the definition of dependence function that
Suppose that there exists avector y E TC(B(f),z)-C. Then, by the separation theorem (Theorem 1.13) there exists a vector w E RE such that (3.111)
C w(e)y(e)
< 0.
(3.112)
eEE
From Theorem 3.16 and (3.111) the base z is a minimum-weight base with respect to the weight function w but (3.112) implies that for a sufficiently small a > 0 2 cwy is a base and the weight of the base z ay is smaller than that of z. This is a contradiction, so that we must have C = TC(B(f),z). Q.E.D.
+
+
A constructive proof of this theorem for polymatroids was given in [Fuji 78a, Lemma 91. It should also be noted that Theorems 3.26 and 3.28 are closely related. We can show Theorem 3.28 by using Theorem 3.26 and vice versa. Suppose that (D, f ) is a simple submodular system on E and that z is an extreme point of B(f). Then,
D(z) = { X I X E
D, z ( X )= f ( X ) }
61
(3.113)
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA is also a simple distributive lattice since the length of a maximal chain of D(z) is equal to [El due to Theorem 3.22. This fact is crucial in the following argument. Let P ( z ) = ( E , - Q be the poset representing D(z). An efficient method for finding dep(z,e) for all e E E is shown in [Bixby+Cunningham+Topkis85] (for polymatroids). Their algorithm also discerns whether a given vector is an extreme base. The algorithm adapted for submodular systems is given as follows. Let z be any vector in R E .
Algorithm Step 1: Put S t 0. Step 2: F o r e a c h i = 1 , 2 , . - . , I E I dothefollowing(2-1) and(2-2). (2-1) If there exists no element e E E - S such that S U {e} E D and z( S U { e}) = f (S U { e}) , then stop (zis not an extreme base). Otherwise find one such element e E E - S and put S t S u {e} and ei t e.
(2-2) Put T t S. For each j = 1,2,
- - - ,i - 1 do the following (*) .
(*) If z(T - { e i - j } ) = f ( T - { e i - j } ) , Put dep(z,ei) t T .
then put T
t
T
- {ei-j}.
(End) If z is an extreme base, then D(z) = 2p(5)with P(z) = ( E ,-&). Repeating C S, (n = IEI) of D(z), Step (2-1), we have a makimal chain SO c S, c where Si = {el,e2,.-.,e;}. Note that a maximal chain of D(z) is obtained by augmenting elements of an ideal S of P(z)one by one in Step (2-1). Conversely, if we find a chain 0 = So c S1 c c S, = E such that z(S;) = f(Si) (i = 1,2, ,n),then z is an extreme base due to Theorem 3.22. This validates Step (2-1). We now show the validity of Step (2-2). Because of the above argument we can assume that the given z is an extreme base. For the current i at the beginning of an execution of Step (2-2) we have a chain So C SI C ... C Si. Note that (3.114) deP(z,ei) si,
-
c
dep(z,e;)USi-j ED(^)
(j=O,l,.+.,i).
(3.115)
Since distinct members of (3.115) forms a maximal chain from dep(z,ei) to Si, the final T obtained in the current Step (2-2) is dep(z,ei). The Hasse diagram G ( s ) = ( E , A ( z ) )of P ( z ) = (E,<=) can be constructed by the use of dep(z,ei) (i = 1 , 2 , . . . , n ) . We can also modify the above algorithm so as to construct the Hasse diagram of P ( z ) = (E,5z)while 62
3.3. STRUCTURES OF BASE POLYHEDRA executing the algorithm. In Step 1 we put G(z) = (E,0). Modify Step (2-2) as follows. (2-2') Put T t S and W t 0. F o r e a c h j = 1 , 2 , . - . , i - l do thefollowing (*1)and(*2). (*1)If ei-j $ Wl and z(T - {ei-j}) = f(T- {ei-j}), then put T t T - {ei-j}. (*2) If ei-j W and x(T - {ei-j}) < f(T- { e i - j } ) , then add to G(z) an arc from ei to e i - j and put W t W u dep(z, ei-j). Put dep(s, ei) t T.
4
It should be noted that ei is adjacent to ei-j in the Hasse diagram if and only if e i - j E dep(s,ei) and ei-j is a maximal element of dep(z,ei) - { e i } in P ( x ) . Set W in Step (2-2') is introduced due to the fact that ei-j E dep(z,ei) implies dep(z,ei-j) 5 dep(s,ei). Set W can also be incorporated into Step (2-2), which may reduce the number of evaluations of f .
(d) Faces, dimensions and connected components [Fuji84d] Consider a simple submodular system set of reals (or rationals). For any 3 G D define
(D,f) on E.
In the following, R is the
F(?) = {z I x E R E ,VX E
3: x ( X ) = f ( X ) , vx E D - 3: .(X) 5 f ( X ) } ,
(3.116)
F"(?) = {x I x 6 R E ,V X E
7:x ( X ) = f ( X ) , vx E D - 3: .(X) < f(X)}.
(3.117)
Also, define
D = {Do
I Do
is a sublattice of D with O,E E Do, F"(Do) # 8).
(3.118)
Lemma 3.29: The collection D of sublattices of D defined by (3.118) is given bY D= I 5 E B(f)}, (3.119) where for each
2: E
B(f) D(z) is defined by (3.113).
(Proof) If DO E D, then for any z E F"(Do) we must have Do = D ( x ) by definition (3.117). Conversely, for any z E B(f) D ( x ) is a sublattice of D with O,E E P(z) and 2 E F"(D(2:)). Hence we have D ( 2 : ) E D. Q.E.D. 63
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA It should be noted that for each DO E D F(D0) is a face of B(f) and F"(D0) is the relative interior of the face F(D0). We see from Lemma 3.29 that D is the collection of equality sets, each given by D (z),for B( f ) expressed by
z ( E )= f ( E ) .
(3.121)
.
From Lemma 1.8 we can easily show
Theorem 3.30: The collection F of all the nonempty faces of B(f) is given by {F(Do) I Do E D}. Also, (i) If PI,& E D and DI # Dz,then F(D1) # F(D2). (ii) For any DI,D2 E D, D 1 DZ if and only if F(D2) C F(D1). In other words, F in (3.116) determines an anti-isomorphism from D onto F, where D and F are considered as posets relative to set inclusion.
c
F (or Fu{0} when F does not have a unique minimal element) is the face lattice of B(f). For two posets Pi = (Ei,Si) (i = 1,2) we say PZ is a homomorphic image of PI if there exists a mapping $ from El onto EZ such that e 51 e' implies $(e) d2 $ ( e l ) for all e, e' E E l . Also, a partition Pz = { y j I j E J} of E is a refinement of a partition PI = {X;I i E I} of E if for each y j E PZ there exists an Xi E PI such that Yj X;. Lemma 3.31: For any distributive lattices Di (i = 1,2) with 0, E E D;,we have D1 C Dz if and only ifII(D2) is a refinement of I I ( D 1 ) and P(&) = (.TI(&), is a homomorphic image of P(&) = (II(Dz),5 D 2 ) under the natural mapping (i.e., Tz E ll(Dz) is made correspond to TI if TZ C T I ) . (Proof) For each i = 1,2, distinct elements e and e' of E belong to different components of II(Di) if and only if there exists an X E Di such that I{e,e'} n XI = 1. Therefore, since D1 02, n(&)is a refinement of I l ( D 1 ) . Also, since we have Tz-(D,T{if and only if Ti X E D2 implies TZ X and since D1 C D2, Ti X E D1 implies TZ C X if T z d ~ , T i .Consequently, for T I , Ti E I I ( D 1 ) Q.E.D. such that Tz TI and Ti E T i , we have T ~ ~ D ~ T : .
c
c
c
c
c
Theorem 3.32: For
DO E D
we have
where dimF ( DO) is the dimension of the face F ( DO).
64
3.3. STRUCTURES OF BASE POLYHEDRA
(Proof) The dimension of the face F(D0) is equal to that of the affine set
M(D0) = {Z I z E RE,V X E
Do: .(X)
= f(X)}.
(3.123)
Since the rank of the coefficient matrix in the right-hand side of (3.123) is equal Q.E.D. to IIl(Do)l, we have (3.122). It may be noted that the extreme point theorem (Theorem 3.22) and the extreme ray theorem (Theorem 3.26) easily follow from Theorems 3.30 and 3.32 and Lemma 3.31.
Lemma 3.33: Suppose Do E D and let
CO: 8 = SO C S1 C
C sk =E
(3.124)
be a maximal chain of DO. Then,
(3.125)
F(C0) = F(D0).
(Proof) Since lI(C0) = Il(Do), we have z E F(C0) if and only if z E F(D0). Q.E.D.
Theorem 3.34: Suppose DO E D. Then a base z E B(f) is an extreme point of the face F(D0) if and only if, for a maximal chain
C: 0 = So C S1 C of
c S,
=E
(3.126)
D which contains a maximal chain of Do as a subchain, x is given by z(Si-si-1) = f ( S J --f(S+l)
( i = 1,2,-,n).
(3.127)
(Proof) The “if” part: Let CO be a maximal chain of DO contained in chain C of (3.126). Then, from (3.127) z E F(C0). Hence, from Lemma 3.33, x E F(D0). Since x is an extreme point of B(f) due to Theorem 3.22, z must be an extreme point of F(DO). The “only if” part: For any extreme point z of F(D0) we have DO C D(z) E D. Since z is also an extreme point of B ( f ) , i.e., {z} is a zero-dimentional face F(D(z)) of B ( f ) , a maximal chain of D(z) is a maximal chain of D due to Theorem 3.32. It follows that a maximal chain of D(z) which contains a maximal chain of Do as a subchain is a desired maximal chain of D. Q.E.D. Also, extreme rays of a face of B(f) are characterized by the following. 65
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA Theorem 3.35: Suppose DO E D and I I ( D 0 ) = (Ti I i E I } . Then the set of all the extreme rays of the characteristic cone of the face F ( D o ) is given by
(xe -
xel
I e = dsa,
e’ = d-a, a E A*(D),3i E I : {e,e‘}
Ti},
(3.128)
where A*(D)is the arc set of the Hasse diagram H(P(D)) = ( E , A * ( D ) )repre senting the poset P ( D ) = ( E , ~ D ) . (Proof) The characteristic cone C(D0) of the face F ( D 0 ) is given by C(D0) = (Z
I x E RE,VX E Do:
z ( X ) = 0, VX E D - Do: z ( X ) 5 0).
(3.129)
It follows from (3.129) and the extreme ray theorem (Theorem 3.26) that the set of the extreme rays of C ( f ) ( = C ( D ) )which belong to C(D0) is exactly given by (3.128). Q.E.D. We say that a submodular system ( D , f ) on E is connected if there is no nonempty proper subset X of E such that X , E - X E D and f ( X ) + f ( E - X ) =
f (El. Theorem 3.36: A submodular system ( D , f ) on E is connected if and only if (0, E } E D, i.e., there exists a base x E B(f) such that
4x1 < f ( X ) for any X E
(3.130)
D with 0 # X # E .
(Proof) The “if” part: If there exists a base x E B(f) satisfying (3.130) for each X E D with 0 # X # E , then
for each X E D with E - X E D and 0 # X # E . The “only if” part: Suppose that (P,f) is connected. We have B(f) = F ( D 0 ) for some DO E D. We show PO = (0,E } . Note that we have ( 0 , E ) E Do. Suppose that there exists some Y E DO such that 0 # Y # E . Then, for any base x we must have Ve E E - Y : dep(x,e) 5 E - Y
(3.132)
since otherwise there would exist a base x’ such that x‘(Y) < f ( Y ) ,which would contradict the fact that Y E Po. It follows from (3.132) that
E -Y E 0,
z(E - Y ) 66
=f
( E- Y ) .
(3.133)
3.3. STRUCTURES OF BASE POLYHEDRA
Since z ( Y ) = f ( Y ) ,we have from (3.133)
+
+
f ( Y ) f ( E - Y ) = z ( Y ) z ( E - Y )= z ( E ) = f(E).
(3.134)
This implies that (D,f) is not connected, which contradicts the assumption. Q.E.D. Therefore, we must have (8,E } = DO. The proof of the "only if" part shows the following.
Lemma 3.37: Suppose that B(f) = F(Do) for some Boolean sublattice of D.
DO
E D. Then
DO is a
Theorem 3.38: For a submodular system (D,f ) on E there uniquely exists a partition P* = {TI,7'2, - - .,Tk} (k 2 1) of E such that (i) Ti E D ( i = 1,2,...,k), ( D , f ) .Ti (i = 1 , 2 , . . . , k ) is connected, and (ii) each reduction ( D T i 7 f T i ) (iii) f(x)= f T l ( ~ l n ~ ) + . . . + f T k ( ~ k n ~ ) ( X E D). (3.135) (Proof) Suppose that Do E D satisfies F(D0) = B(f). From Lemma 3.37, DO is a Boolean lattice. Let Tl,. . . ,Tk be all the minimal nonempty elements (as subsets of E) of Do, i.e., II(D0) = {Tl,--.,Tk}. Then,
T;E D
(i=1,2,--*,k).
(3.136)
Since
we have
For any X E
I ( E ) = f(T1) + . - . +f(T&).
D
we have from (3.138)
It follows from (3.139) that
(3.138)
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA Also,foreachT; ( i ~ { 1 , 2 , . - - , k } ) a n d X C T ; s u c h t h a t X EDandT;-XE D, we have from (3.140) with X replaced by E - X
where (3.141) holds with equality if and only if
If (3.141) holds with equality or (3.142) holds, X must be a member of DO and hence either X = 0 or X = T; by the definition of Ti. Therefore, (D, f ) T; is connected. It follows that the partition {Ti I i = 1 , 2 , . . . , k } derived from Do satisfies (i)-(iii) of the present theorem. Moreover, if a partition {Wj I j E J} satisfies (i)-(iii) with Ti's replaced by Wj's,then from (iii) we have Wj E DO for each j E J. If {Wj I j E J } # {T;I i E I},there exists Wj such that Wj = T;, Us. .UTiPfor some il, ,.ip E I with p 2 2. Then, we have
-
---
(3.143)
-
from which follows that (D, f ) Wj is not connected. Consequently, we have {Wj I j E J } = {Ti I i E I},which shows the uniqueness of the partition. Q.E.D.
Theorem 3.39: I'I(Do) is the partition of E given by the decomposition of (D,f ) into connected components. In particular, the number of connected components is equal to IrI(D0)l. (Proof) The present theorem follows from the proof of Theorem 3.38. Q.E.D. From Theorems 3.32 and 3.39 we have
Corollary 3.40: Let n* be the number of the connected components of (D,f ) . Then, (3.144) dimB(f) = IEl - n*.
+
The dimension of B(f) was also given in [Shapley71], [Giles75], [Bkby Cunningham + Topkis851 and [Tomi831 (when D = 2E). The following lemma is useful for finding the partition rI(P0) (see [Bixby + Cunningham Topkis851).
+
68
3.3. STRUCTURES OF BASE POLYHEDRA
Lemma 3.41: For any base x E B(f) let G(z) = ( E , A ( x ) )be the graph with vertexset E and arc set A ( z ) = {(e,e') I e' E dep(z,e)}. Then, II(D0) is the set of the vertex sets of the connected components of G(z). (Proof) Let {Wj I j E J} be the set of the vertex sets of the connected components of G ( x ) . Since DO 2 D(z) = { X I X E D , x ( X ) = f ( X ) } ,partition {Wj I j E J } is a refinement of I I ( D 0 ) . However, for any j E J we have W j , E - Wj E D(z) by the definition of G(z) and Wj, and
f(E)= x ( E ) = .(Wj)
+ x(E - Wj) = f (Wj) + f(E- Wj).
(3.145)
I
Hence, W j E Do for any j E J. Therefore, II( DO)is a refinement of {Wj j E J}. Q.E.D. I I ( D 0 ) = {Wj I j E J}.
It follows that
Since, in particular, Lemma 3.41 holds for an extreme base x E B(f), the connected components of ( D , f ) can efficiently be found by the use of the algorithm shown in Section 3.3.c [Bixby Cunningham Topkis851.
+
+
Theorem 3.42: For any D1, DZ E D we have D1 n D2 E D and F ( D 1 n D2) is the unique minimal face which contains both F ( D 1 ) and F ( D 2 ) . Moreover, if F ( D 1 U 4) f 0, then F ( D 1 U D2) is the unique maximal face which is contained in both F ( D 1 ) and F ( D z ) . D3 E D satisfying F ( D 3 ) = F ( D 1 U Dz) is the unique minimal sublattice of D in D which contains both D1 and Dz. (Proof) The present theorem follows from the general theory of convex polyhedra (see Lemma 1.8). Q.E.D. It should be noted here that the unique minimal sublattice of D which contains both D1 and D2 does not necessarily belong to D. The following theorem gives a necessary and sufficient condition for a sublattice DO of D to be a member of D.
Theorem 3.43: Let DO be a sublattice of D with O,E E DO and fo be the restriction off to Do. Then Do E D if and only if the following three statements hold: (i) fo is a modular function. (ii) For any maximal chain co:
0 = so c s1 c * * - c S& = E
(3.146)
.
of Do each minor (P,f ) . Si/Si-, (i = 1,2, ,k) is connected. (iii) Let Co be any maximal chain of DO as in (3.146) and P be any base of (PO,fo) such that
qsi) = f O ( S i ) 69
(i= 1,2,*.*,k)
(3.147)
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA (i.e., CO C DO(?)). Then for any X E D such that (a) X is compatible with and (b) 2 ( X ) = f ( X ) , we have X E DO.
Il(D0)
(Proof) The “if” part: Suppose (i)-(iii) hold. For each i = 1 , 2 , . - . ,k let xT be (P, f).Si/Si-, such that for any X E D with Si-1 C X C Si a base of (Di,fi) we have xT(X - si-1) < f(X) - f(Si-1) = f i ( X - si-1). (3.148)
=
Such a base xr exists due to Theorem 3.36. Let x* xT (i = 1,2,. ,k),i.e., x*(e) = x;(e)
-.
R E be the direct sum of (3.149)
foreache€ E w i t h e E Si-Si-1 a n d i E { 1 , 2 , . . - , k } . Thenwehavex* E B(f) due to Lemma 3.1. We show x* E F”(Do),which will complete the proof of the “if” part. Let X be any member of Do. Since f is modular on Do and x* satisfies
.*pi)= f(Si)
( i = 1,2,-,k)
DO given by
(3.146), we have
on the maximal chain of
.*(X)= f(X)
(XE Do),
(3.150)
(3.151)
where recall that a simple submodular system with a modular rank function has a unique base. Now, suppose X E D - DO.If X is compatible with I I ( D o ) , then from (iii) we have .*(XI < f(X). (3.152) On the other hand, if X is not compatible with II(Do), i.e., for some io E {1,2,...,k} X - S, # 0 and Sio+l - X # 0, then defining Ti = Si - Si-1 ( i = 1 , 2 , . . - , k ) , we have
(3.153) 70
3.3. STRUCTURES OF BASE POLYHEDRA
due to (3.148) and (3.149). Therefore,
vx E D - Do: .*(X)< f(X).
(3.154)
From (3.151) and (3.154) we have x* E F O ( D 0 ) . The "only if" part: Suppose Do E D. Then there exists a vector x' E F o ( D o ) , from which follows (i). Also (ii) follows from Theorem 3.36. We show (iii). Let i be any base of (Do, fo) such that (3.147) holds. Suppose that Xo E D is compatible with I I ( D 0 ) and that
Since x' E F"(D0) also satisfies
.'pi) = f&)
(i=l,2,*..,k),
(3.156)
we have from (3.147) and (3.156)
?(T)= ~ ' ( 2 ' ) (7' E II(D0)).
(3.357)
Since Xo is compatible with II(Do), it follows from (3.155) and (3.157) that Z'(X0)
=
qxo) = f(X0).
Since x' E F" (DO),we must have X O E
(3.158)
DO.
Q.E.D.
From Theorems 3.32 and 3.43 we have the following.
Corollary 3.44: Suppose that B(f) is full-dimensional, i.e., dimB(f) = [El-1, and let Do be a sublattice of D with 0 , E E DO. Then, F(D0) is a facet of B(f) if and only if Do forms a chain 0 = SO C S1 C Sz' = E of D (oflength 2) and ( D , f ) Si/S,-l (i = 1,2) are connected. We also have
Theorem 3.45: Let
D1
be a sublattice of D with 0, E E
D1.
Suppose that f is
modular on D1. Also, let C1:
~ = S ~ C S ~ C * . * C S ~ = E
(3.159)
be any maximal chain of D1 and for each i = 1 , 2 , . . . , k let Pi = { T ) , T f , . * . , T:i} be the partition ofS; -S,-1 given by the decomposition of ( D , f).Si/Si-1 into connected components. Define P* = Pi and let 2 be a base of (D, f )
&
such that +a)
= f(Si)
(i = 1,2, * * * , k). 71
(3.160)
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA Then Dz given by
Dz
= {X
I X E D, ?(X)= f(X), X is compatible with P*}
(3.161)
(3.162) (Proof) We can easily see that D2 given by (3.161) satisfies (i)-(iii) of Theorem 3.43, where note that the set of minors ( D , f ) . S i / S i - , (i = 1 , 2 , - - . , k ) does not depend on the choice of a maximal chain since f is modular on D1 (see Theorem 7.17). Hence we have D2 E D due to Theorem 3.43. Moreover, for any x ' E F ( D 1 ) we have
qT;)= X'(Tj) It follows that for any
( j =-1,2,
* *
,ri;
i = 1,2,
- ,k).
(3.163)
X E DZ x'(X)= &(X)= f(X),
(3.164)
' E F ( D 2 ) . We thus have F ( D 1 ) 2 F ( D 2 ) . On the other hand, since f is i.e., x modular on D1, from (3.160) and (3.161) we have D1 G 0 2 and F ( D 2 ) C F ( D 1 ) . Q.E.D. Hence F ( Dz) = F ( D 1 ) . Note that the operation of getting Dz from D1 in Theorem 3.45 is a closure operation on sublattices of D , containing 0 and E , on which f is modular and that D is the collection of closed sublattices of D . Theorem 3.45 is closely related to the concept of f -skeleton introduced in [Nakamura Irisl]. An f-skeleton is a sublattice of D on which f is modular. A sublattice D1 of D in Theorem 3.45 is an f-skeleton and there exists a unique maximal f-skeleton D; (maximal with respect to set inclusion) such that II(D;) = I I ( D l ) , which is called the mazirnal f-skeleton corresponding to D1 ([Nakamura Irigl]). We can easily see that D; may not be closed and that D; C Dz, where Dz is the closure of D1 given by (3.161). In fact, D; is the aggregation of DZ by II( D1). Therefore, DZ is a finer structure than the maximal f-skeleton 0;. Theorem 3.45 gives a polyhedral interpretation of skeletons.
+
+
Theorem 3.46: Let x be an extreme point of B(f) and K be an edge of B(f). Suppose {x} = F ( D o ) and K = F ( D 1 ) for Do, D1 E D. Then, we have x E K (i.e., 2 is an end point of K) if and onlyif for { e y e ' } E I I ( D 1 ) (i) vertices e and e' in the Hasse diagram H ( P ( D 0 ) ) = (E,A*(Do))of P ( D 0 ) are adjacent and (ii) the Hasse diagram H(P(D1)) of P ( D 1 ) is obtained by identifying e with e' in H(P(D0)) (i.e., contracting the arc which connects e and e l ) . 72
3.4. INTERSECTING- AND CROSSING-SUBMODULAR FUNCTIONS (For the definition of
P(Di) (i = 0 , l ) see Section 3.2.a.)
(Proof) From Theorem 3.32, we have JIl(D0)J= J E Jand III(D1)I = \El - 1, so that TI(&) consists of a two-element set and singletons. From Theorem 3.43 and Lemma 3.31, we have z E K if and only if (1) D1 C 00 and (2) for T E II(D1) with IT1 = 2 and for S 1 , S 2 E I I ( P 1 ) with S1 c S 2 and T = S2 - S 1 ( P , f ) . S 2 / S 1 is connected. The present theorem follows from this fact. Q.E.D.
Theorem 3.47: Let x i (i = 1,2) be distinct extreme points of B(f) and suppose {z,} = F(D,) for D, E D (i = 1 , 2 ) . Then z1 and z 2 are adjacent in B(f) if and only if there exist vertices e and e' in E connected by an arc both in H(P(D1)) and in H ( P ( D 2 ) ) such that the Hasse diagrams obtained by identifying e with e' in H ( P ( D 1 ) ) and H ( P ( & ) ) , respectively, coincide with each other. (Proof) From Theorems 3.32 and 3.43 and Lemma 3.31 we see that extreme bases z1 and x 2 are adjacent in B(f) if and only if the following two statements hold: (i) There exists a common sublattice 4 of D1 and D2 such that O,E E D3 and (n(D3)1 = IEl - 1. (ii) For T E I l ( D 3 ) with )TI = 2 and for S l , S 2 E 03 with S 1 C S2 and T = S 2 - S1 (D,f ) * S2/S1 is connected, Q.E.D. from which follows the present theorem. It should be noted that for extreme bases x i (i = 1,2) we have P(x;) E D (i.e., D, = D ( q ) in Theorem 3.47) and that if z1 and x 2 are adjacent, we have D(z1)n D ( x 2 ) E D and F ( D ( z 1 ) n P ( x 2 ) ) is the edge of B(f) incident to z1 and 52*
3.4. Intersecting- and Crossing-Submodular Functions
In this section we prove Theorems 2.5 and 2.6 presented in Section 2.3 which relate intersecting- and crossing-submodular functions to submodular systems ([Fuji84b]).
(a) Tree representations of cross-free families We first introduce the tree representation of cross-free families due to Edmonds and Giles IEdmonds Giles77). We call a set { X ; I i E I } of subsets X i (i E I) of E a copartitionof E if { E - Xi I i E I} is a partition of E . Let T = (V,A ) be a tree with a vertex set V and an arc set A = {ui I i E I}. Also let P = (Pv,I 21 E V ) be a family of subsets of E such that the nonempty
+
73
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA P,,’s form a partition of E . Possible repeated members in P are the empty set only. Note that each vertex v E V is given a subset P,,of E. Deleting an arc ai (i E I) from tree T yields a graph with two connected components. Denote the vertex set of the connected component which contains the tail d+a, of ai by V ( a , ), and define
xi = U { PI ~v E V(ai)}.
(3.165)
3=(XaliEI)
(3.166)
We thus have a family of subsets of E . From this construction we can easily see that the family 3 is cross-free. We call the representation of 3 by the pair of the tree T = ( V , A ) and the family P = (PuI v E V ) a tree representation of 3. We show the converse that every cross-free family has one such tree representation. First, consider a laminar family 3 = ( X i I i E I) of subsets of E , i.e., for each X i , X j ( i , j E I ) we have at least one of the following three: (1) Xi n X j = 8, ( 2 ) Xi G X j and (3) X i 2 X j . For simplicity suppose that X i (i E I ) are distinct. Let io be a new index not in I , and define a vertex set V by (3.167) v = {v, i E Iu{i,}}
1
and an arc set A = {u,
I i E I}.
(3.168)
The incidence relation is defined as follows. For each i E I such that X i is a nonempty and non-maximal member of 3, let Xb be the unique minimal member of 3 which properly includes X,. (Note that the members of 3 which includes Xi form a chain due to the fact that 3 is a laminar family (without repeated members).) Then we define d+ui = v, and d-ai = vk. For each maximal member X , of 3 we define d+a, = vi and d-ai = v,,. Also, if 3 contains X i = 0, then for some nonempty minimal member Xl of 3 we define d+ai = vi and d-a, = vl. Consequently, the graph T = (V,A) is a directed tree toward the root vi,. We define X i , = E and for each i E I U {io}
We can see that the pair of tree T = (V,A = {a, 1 i E I } ) and P = (PuI v E V ) gives a tree representation of the laminar family 3 = ( X i I i E I). If we replace an arc ail by series arcs a,,! and ail#!with a vertex v,,‘ between them such that d + ~ , = , d + I~1 ., d - a , , = d - a i , ! and d + ~=~d - ~u i l#~ l , and if we define P,,il, = 0, then the pair of the augmented tree T’ = (V u {vi,#},(A- {ail}) U { U ; ~ ! ~ U ~ ~and ! ! } P’ ) = (P,,j I i E ( I - {il}) U {i”lii’,Zi’’}) II
74
3.4. INTERSECTING- AND CROSSING-SUBMODULAR FUNCTIONS represents the laminar family 3' = (Xi I i 6 ( I - {il}) U {t'o,"',ii''}) with ~. members can be treated in this way. repeated members Xil, = X j L ~Repeated We thus have
Lemma 3.48: A family of subsets of a finite set E is laminar if and only if it has a tree representation such that the tree is a directed tree toward the root. Next, let 3 = (Xi I i E I ) be a cross-free family of subsets of E , i.e., for each Xi, Xj ( i , j E I) we have a t least one of the following four: (1) XinXj = 0, (2) Xi C Xj, (3) Xi 2 Xj and (4) Xi u Xj = E. Choose an arbitrary element eo E E and define (3.170) where, putting I0 = {i
I i E I, eo E Xi}, xt =
{ Ext
-
Xi
if i E Io, ifiE1-1"
(3.171)
for each i E I . We can easily see that ? is a laminar family, since ~?is cross-free and 2,U 2 j E - {eo} for any z,j E 1. Therefore, ? can be represented by a pair of a tree 2'= ( V , i )and afamily P = (PI,I v E V ) ,where 2 = {a, I i E I } . Construct a tree T = ( V , A ) by reorienting the arcs 6, (t' E 1 0 ) of tree T = ( V , a ) . Then, the pair of the tree T and the family P represents the original cross-free family 3. Now, we have
Lemma 3.49 [Edm+Giles77]: A family of subsets of E is cross-free if and only if it has a tree representation. The following lemma, concerning the decomposition of uniform cross-free families, is fundamental.
Lemma 3.50: Let $ = (Xi I i E I ) be a cross-free family of subsets of E which uniformly covers each element of E , i.e., 'de E E :
I{; I i E I,
e
E Xi}\ = const. > 0.
(3.172)
Then, $ is the direct sum of families each of which forms a partition or a copartition of E . (Proof) Since { E } is a partition and (0) is a copartition of E , we suppose $ and that that 0 , E # 0. From Lemma 3.49, the cross-free family
4
75
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA
I
5 = (Xi i E I) can be represented by a pair of a tree T = ( V , A ) ,with a vertex set V and an arc set A = {ui I i E I},and a family P
= (P,,
12,
(3.173)
EV)
of subsets of E , where nonempty Pu’s form a partition of E . Note that by the assumption we have pu # 0 (3.174) for any leaf v of T . Now, from the assumption and (3.174) there exists at least one Xi such that Xi 4 {0,E } . Therefore, there exist distinct vertices v1 and v2 of T satisfying the conditions that pu, # 0, pu, # 0 (3.175) and that for any vertex u 4 {v1,v2} lying on the unique path from v1 to denoted by Q(v1,v2), in T we have
P,
v2,
(3.176)
= 0.
It follows from (3.172) that the number of positively oriented arcs in Q ( v l , v 2 ) is equal to the number of negatively oriented arcs in Q(v1,v2). (If this is not the case, the value of (3.172) for any el E P,,, can not be equal to that for any e2 E P,,,.) In particular, there exists at least one positively oriented arc and at least one negatively oriented arc in Q(v1,vZ). Hence, there is at least one vertex u $i!{v1,v2} on Q ( q , v 2 ) . Let 0 be the vertex on Q(v1,v2) adjacent to v1 and let ( v 2 , v3,. . - ,v k } (k 2 2) be the maximal set of vertices of T such that for each 1 = 2 , 3 , . . - ,k, (i) P,,, # 0, (ii) the vertex G lies on the unique path Q(vl,wl) from vl to vl in T and (iii) any vertex u { q , v l } lying on Q ( v l , v l ) satisfies (3.176). Moreover, let u j ( l ) E A be the arc connecting v1 with 0 and for each 1 = 2 , 3 , . - . , k let u j ( l ) E A be the arc in Q(v1,vl) such that the orientation of the arc u j ( q is opposite to the orientation of the arc uj(l)and any arc ui (# u j ( q ) between uj(l)and u j ( l ) in Q(v1,vl) has the same orientation as ajil).(Note that arcs u j ( q (1 = 2,3, ,k) are not necessarily distinct.) (See Fig. 3.6.) Let us define
---
= {Xj(l)
I Xjjl) E
$7
1
5 .i 5 k}.
(3.177)
Then, (i) if d + u j ( l ) = 211, II is a partition of E and (ii) if d - ~ j ( =~ vl, ) II is a copartition of E . Therefore, the family 5 contains a subfamily which forms a partition or copartition of E . Define
5’ = (Xi I i E I -
{ j ( l ) 11 = 1,2,...,k}) . 76
(3.178)
3.4. INTERSECTING- AND CROSSING-SUBMODULAR FUNCTIONS
.
L
i
Figure 3.6. If $’ is not empty, then $’ also satisfies (3.172) and is cross-free. Repeating the above-mentioned argument, we see that $ is the direct sum of families each Q.E.D. of which forms a partition or a copartition of E .
(b) Crossing-submodular functions Let 3 be a crossing family of subsets of a finite set E with 0, E E 3 and f a crossing-submodular function on 3 (see Section 2.3 for the terminology). We assume R is the set of reals (or rationals). Define (3.179) P ( f ) = { z I z E R E ,V X E 3 : z ( X ) 5 f(X)}. Note that P ( f ) is nonempty. Now, we would like to find the tightest system of inequalities with (0,l)coefficients which represents the polyhedron P(f ). Therefore, define
i ( y )= max{z(Y)
I z E P(f)}
(3.180)
for any Y E , where we put f ( Y )= $00 if the value of x(Y) can be made arbitrarily large for z E P ( f ) . Note that by definition (3.180) we have i ( 0 ) = 0. By the linear programming duality theorem (Theorem 1.10) we have
f(X)c(X)
f ( Y ) = min{ XEjF
77
I
(3.182),(3.183)},
(3.181)
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA where (3.182)
c(X)2 0
( X E 7).
(3.183)
We have f ( Y )= +oo if and only if there is no such c ( X ) ( X E 7)that (3.182) and (3.183) are satisfied. We thus have a set function f: 2E -+ R u {+oo}. Since the minimum value of (3.181) can be attained by rational c ( X ) ( X E ?), (3.181)-(3.183) can be rewritten as follows. 1
f(X,)
I
(3.185) - (3.187)},
(3.184)
where
5 = (Xi 1 2 E I ) with
x; E ?, xi G Y
(3.185)
(ZE I )
(3.186)
and
I{; I i E I ,
e E
X;}l = const. = p ( $ , Y ) > 0
(e E Y).
(3.187)
Note that 5 = (Xi I i E I ) is a family of subsets Xi of E , so that we may have X ; = X j for distinct i , j E I. (3.187) means that the family 5 uniformly covers each element of Y . By the definition of the set function f: 2E -+ R U {+oo} we have
P ( f ) = { x I x E RE,V X
C E: x ( X ) 5 f ( X ) } . f ( X ) (X C E ) , (3.189) does not
(3.189)
If we decrease the value of any hold. Since we have not used the crossing-submodularity of f in the above argument, (3.179)-(3.189) are valid for any set function on any family of subsets of E . In the following, we simplify (3.184) -(3.187) by use of the crossing-submodularity off. From the crossing-submodularity of f and (3.184)-(3.187), we can restrict admissible families 5 in (3.184) -(3.187) to those which satisfy (3.190) (i) $ is a cross-free family,
(4 0 4 5,
(3.191) 78
3.4. INTERSECTING- AND CROSSING-SUBMODULAR FUNCTIONS
to attain the minimum of (3.184). (For, let 5 be a family which satisfies (3.185)-(3.187) and attains the minimum of (3.184). If there exists a crossing pair of Xi and X j in 5, then put
x, +. xi u x j , xj + x, n x j . this replacement 5 = (Xi 1 i E I ) remains to satisfy
(3.192)
After (3.185)-(3.187), while the sequence of the values of lX,l (i E I ) arranged in order of nonincreasing magnitude becomes lexicographically greater than the previous one. Because of the finiteness character we can repeat the replacement (3.192) for crossing pairs in $ only finitely many times and eventually we obtain a cross-free family 5 which satisfies (3.185)-(3.187) and attains the minimum of (3.184). Furthermore, if 5 contains the empty set, it is redundant and can be removed.)
Theorem 3.51: Let fp(E)and fq(E)be defined by f p ( ~ )= min{
C f(x,) I
{xi I i E I>:
a partition of E ,
iEI
xi
3},
( 3.193)
V j E J : X j E 3, IJI 2 3 ) .
(3.194)
ViE I:
E
(3.195)
(Proof) We can restrict admissible families 5 = ( X i I i 6 I) in (3.185)-(3.187) with Y = E to those further satisfying (3.190) and (3.191). For such a family 5,from Lemma 3.50, 5 is the direct sum of partitions and copartitions of E . Since every subfamily of 5 forming a partition or a copartition of E satisfies (3.185)-(3.187) with Y = E , (3.190) and (3.191), it follows that we can restrict admissible families 5 = (Xi I i E I) to those which form partitions and copartitions of E with X , E 7 (i E I).Note that the copartition {0} of cardinality 1 is excluded by (3.191) and that copartitions of cardinality 2 are also partitions. Q.E.D. We thus have (3.193)-(3.195). From Theorem 3.51 we can show
Theorem 3.52: The following four statements are equivalent: 79
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA (i) The polyhedron defined as the intersection of P(f) and the hyperplane x ( E ) = f(E),i.e.,
B(f) = {x I x E R”, V X E 3: .(X) 5 f ( X ) , z ( E ) = f(E)}, (3.196)
(3.197) (3.198)
(iv) For any partitions { X ; I i E I} and and E - Y, E 3 ( j E J) we have
{Y, I j
E
J} of E with X ; E 3 (i E I )
(3.200) iEI
j€J
(Proof) The equivalence between (i) and (ii) follows from the definition of f*(E).To show the equivalence between (ii) and (iii), recall (3.193)-(3.195). If f(E) = f ( E ) we , have fp(E)2 f ( E ) while , from (3.193) we have fp(E)5 f(E) since E E 7. We thus have fp(E)= f ( E ) . Moreover, since 0 E 7,we have from (3.199) (3.201) ( f # ) P P ) 2 f#(E) = f(E). If j ( E ) = f ( E ) ,we have from (3.194) and (3.195) f(E)F f q ( E ) ,i.e., (3.202)
for any copartition { X j 1 j E J} of E such that X j E 7 ( j E J) and IJI 2 3. Note that (3..202) holds for IJI = 1 and 2, due to the assumption that f(E) 5 fp(E). (3.202) is rewritten as (3.203)
for any partition {q 1 j E J } of E such that E - Yj E 7 ( j E J ) . Hence, (f#),(E)5 f ( E ) . Consequently, from (3.201) we have ( f # ) , ( E ) = f(E). 80
3.4. INTERSECTING- AND CROSSING-SUBMODULAR FUNCTIONS
Conversely, suppose that f ( E ) = fp(E)= (f#),(E). Then we have (3.203) and hence f ( E ) 5 fq(E).Therefore, f ( E ) = f ( E ) . Finally, we show the equivalence between (iii) and (iv). Since we always have (3.204) f,(E) 5 f(E) 5 (f#)P(E), Q.E.D. (iv) implies (iii). The converse, (iii) -*- (iv), is immediate. Theorem 2.6 follows from Theorem 3.52. For proper subsets of E the value of f is expressed as follows. Theorem 3.53: For each nonempty proper subset Y of E ,
I
j ( ~=)min{ C f(x,) {x;I i E I ) : a partition of Y , iEI
vi E I : X ,
E
3}.
(3.205)
(Proof) By the same argument as in the case of Y = E in the proofs of Lemma 3.50 and Theorem 3.51, we see that we can restrict admissible families $ in (3.185)-(3.187) to those which form partitions and copartitions of Y . Let P* = { X i I i E I } be a copartition of Y with Xi E 3 (i E I ) and 111 2 3. (Note that if 111 = 2, P* is also a partition of Y . ) Then, for any distinct X , , X j E P*,Xi and X j cross. It follows from (3.190) that we need Q.E.D. not consider copartitions of Y . Note that f is the Dilworth truncation of f when B(f) is nonempty. We also have Theorem 3.54: For any X , Y 2 E with X U Y # E , if f ( X ) ,j*(Y)< +oo, then f ( x ) f ( y )2 f(xu Y ) f ( xn Y ) . (3.206)
+
+
(Proof) If X n Y = 0, (3.206) is immediate. Therefore, suppose that X , Y C E satisfy X n Y # 0,X U Y # E and f ( X ) ,f ( Y )< +m. Then, for some partition { X i I i E I } of X and some partition {Yj I j E J } of Y such that Xi E 3 (i E I ) and YJ E 7 ( j E J ) , we have (3.207) iEI
j€ J
due to Theorem 3.53. Let $ = (Zk I k E I @ J ) be the direct sum of families ( X , I i E I ) and (Yj 1 j E J ) . Since X U Y # E , one of the following three statements holds for any Zi,Zj E $: 81
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA
(i) Zi and Zj are disjoint,
(3.208)
(ii) Zi C Zj or Zj Zi, (iii) Zi and Zj cross. If 2, and Zj cross, replace 2,and Zj as
(3.209) (3.210)
zj+ zi n zj.
zi c zi u zj,
(3.211)
Repeat this replacement until the current family 5 = (Zk I k E I @J ) becomes a cross-free family and satisfies (3.208) and (3.209) for any Z,, Zj E 5. By the same argument made after (3.192), this replacement process terminates in a finite number of steps. The finally obtained 5 is the direct sum of two families which form a partition {Xi I i E i} (kjE 3 (i E i))of X U Y and a partition {$, 1 j E (?j E 7 ( j E of X n Y . Since the replacement (3.211) reduces the value of (3.207), we get
s}
s))
Q.E.D. It follows from Theorem 3.54 that the family
? defined by
(3.213) f = {X I x c E , i ( X ) < +m} is a cointersecting family (i.e., X , Y E f, x u Y # E ==+ x u Y , X n Y E f) and that f restricted to 3 is a cointersecting-submodular function on ? (i.e., f(X) + f ( Y ) 2 ? ( X u Y ) + f ( X n Y ) for any X , Y E 3 with X U Y # E ) .
Now, let us consider the polyhedron
which is the intersection of P ( f ) and the hyperplane s ( E ) = f ( E ) ,and suppose that B ( f ) is not empty (see Theorem 3.52). We examine the structure of B ( f ) . For each proper subset Y c E , define
Then, by the linear programming duality theorem,
82
3.4. INTERSECTING- AND CROSSING-SUBMODULAR FUNCTIONS
c
where
C ( X )= X Y ( 4
(e E
El,
(3.217)
eEXE3
c(X)2 0
( X E ?,
x # E).
(3.218)
Similarly as (3.184), (3.216) can be rewritten as
(3.220) - (3.224)),
(3.219)
where
5 = ( X ; I i E I), I{i I
eE
(3.220)
X ; E 7, X ; # E
(i E I),
X ; , i E I}[ = const.
=p(5,Y)
I{i 1 e E X ; , i E I } / = const. = p ( 5 , E - Y ) P(5,Y
) > 4 5 ,E
(e
(3.221)
Y) E E -Y), (e E
- Y).
(3.222) (3.223) (3.224)
It should be noted that (3.219) is valid for any set function f , By use of the set function f: 2E + R u {+m} defined by (3.184) (or (3.195) and (3.205)), the polyhedron B ( f ) of (3.214) can also be expressed as B ( f ) = {Z I x E RE,V X E
f : z ( X ) 5 f * ( X ) ,z ( E ) = f ( E ) ( =f(E))}, (3.225)
where 9 is defined by (3.213). Therefore, we can replace f and 7 in (3.219) and (3.221) by f and ?, respectively. For Y c E , if { E - X ; 1 i E I} is a partition of E - Y , we call {Xi I i E I} a copartition of E - Y augmented by Y .
Theorem 3.55: For each nonernpty Y C E ,
{ E - Xi I i E I } : a partition of E ViE
I: x,E f ) .
- Y,
(3.226)
(Proof) If f and ? in (3.219) and (3.221) are replaced by f and f , we can restrict admissible families 5 in (3.220) -(3.224) to those which satisfy (3.220)(3.224) (with ? replaced by ? in (3.221)) and the following (i)-(iv): 83
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA (i)
(ii) (iii) (i.1
5 is a cross-free family, for any X , , X j E 5 we have Xi n X j # 0, 5 does not contain a subfamily which forms a copartition of E , E 4 5.
(3.227) (3.228) (3.229) (3.230)
(Here, (i)-(iii) follow from Theorems 3.51, 3.53 and 3.54. (iv) follows from the form of (3.219).) From (i), the family 5 = ( X i I i E I) can be represented by a pair of a tree T = ( V ,A ) and a family
P
= (P,,
12,
(3.231)
E V),
where A = { a , I i E I} and nonempty PU’s form a partition of E as in Lemma 3.49. From (ii), T is a directed tree. (For, if there were distinct arcs ai and aj in T such that d - a , = a - a j , we would have Xi n X j = 0.) Let vo be the root of T. If P,,, = 0,then $ contains a subfamily which form a copartition of E. Therefore, P,,, # 0 due to (iii). Since for each e E E the number of i’s for which e E Xi should be taken from the fixed set of two distinct values of (3.222) and (3.223), for any leaf u of T every vertex w 6 { u , v o } lying on the unique path Q ( v 0 , u ) connecting vo with u in T gives
P, = 0, where note that P,
# 0 due to (iv). It follows from
(3.232) (3.224) that
P,,, = Y and that
{E -Xi
I Z E I,
d + ~= i VO}
(3.233)
(3.234)
is a partition of E - Y. Let 5’ be the family obtained from 5 by deleting Xi’s such that d + a , = vo. If 5’ # 0, then 5’ also satisfies (3.221)-(3.224) (with 3 replaced by f ) and (3.227)-(3.230). (The tree representation of 5’ with root vo is obtained by contracting or shortcircuiting arcs a, such that d+a, = vo in T.) It follows that 5 is a direct sum of families which form copartitions of E - Y augmented by Y.Since any family which forms a copartition of E - Y augmented by Y also satisfies the conditions required for 5, the minimum of (3.219) is attained by a family 5 which forms a copartition of E - Y augmented by Y. Q.E.D. We define
m)E(W=f(E)). =
We now have a set function
f2
from 2“ to R u {+oo}. 84
(3.235)
3.4. INTERSECTING- AND CROSSING-SUBMODULAR FUNCTIONS
Theorem 3.56: For any X , Y & E such that . f ~ ( X ) , . f 2 ( < Y )+oo, we have
(Proof) If X E { 0 , E } or Y E { 8 , E } , then (3.236) is trivial. So, suppose 4 E } , Y $! {O,E} and X , Y E 9. Then, from Theorem 3.55, for some partition { E - X i I i E I } of E - X and some partition { E - y3 I j E J}gf E - Y with X ; E 9 (i E I ) and y3 E 9 ( j E J),we have
X
{a,
5 = (Zk I k E I CB J ) be the direct sum of families ( X i I i E I ) and (y3 I j E J ) . If, for Z;,Zj E 5, we have ( E - Z i ) n ( E - z j ) # 0, z , n ( E - Z j ) # 0 and ( E - 2;)n Zj $. 0 , then replace Let
zi
zi u zj,
zj t Z, n zj.
(3.238)
Repeat such a replacement until there is no such pair of 2, and Zj in 5. This process terminates in finitely many steps. We can easily see that the finally obtained 5 is the direct sum of families 51 = (X,t I i E 1') and G2 = (YjI j E J ' ) such that { E - Xs* I i E I * } is a partition of E - ( X u Y ) and { E - Yj' I j E J ' } is a partition of E - ( X n Y ) ,where, if X n Y = 0 , the family g2 may be composed of the empty set alone. For any pair of 2, and Z j to be replaced by (3.238) we have 2, u Zj # E . It follows from Theorem 3.54 and (3.237) that
j€J *
> f 2 ( X u Y )+ f 2 ( x n Y ).
-
(3.239)
Q.E.D. From Theorem 3.56,
&
=
{X I X C E , . f 2 ( X )< +a} 85
(3.240)
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA is a distributive lattice with set union and intersection as the lattice operations, join and meet. Denote by f2 the function obtained by restricting the domain 2E of f2 to D2. Then, ( D 2 , f 2 ) is a submodular system on E and the polyhedron B(f) defined by (3.225) is also expressed as
B ( f ) = {x
I
E R E ,VX E D2:
x(X) I f2(X), x ( E ) = f 2 ( E ) ( = f ( E ) ) } (3.241)
which is the base polyhedron
B(f2)
associated with the submodular system
(D2, f2).
Since different submodular systems give different nonempty base polyhedra, nonempty B ( f ) uniquely determines a submodular system (D2, f 2 ) such that B(f) = B(f2). Moreover, we see from (3.235) and Theorems 3.53 and 3.55 that if B(f) is nonempty, each f2(X) (X E P2) is expressed M a linear combination of f ( Y ) ’ s (Y E 3) with integer coefficients. Therefore, if f is integer-valued, so is f 2 . This completes the proof of (ii) of Theorem 2.5. It should be noted that (3.226) can be rewritten in a dual form as
f$(z)= m a x { C f#(yj) 1
{ y j I j E J): a partition of Z,
j€ J
V j E J: E - Y , E f )
(3.242)
for each nonempty proper subset Z c E . Here, f,f is the Dilworth truncation of the intersecting-supermodular function f# on .
f
(c) Intersecting-submodular functions
Let f : 3 + R be an intersecting-submodular function on an intersecting family 3. We assume that f(8) = 0 if 8 E 3. Consider the polyhedron
P(f) = {x I z E RE, VX E 3: x(X) 5 f(X)}.
(3.243)
Since intersecting-submodular functions are crossing-submodular, P(f) is expressed as (3.244) P ( f ) = {x I x E RE,VX E : z ( X ) 5 f(X)}, where f is defined by (3.193)-(3.195) and (3.205). Since 3 is an intersecting family and f : 3 -+ R is intersecting-submodular, we can restrict 5 in (3.184)-(3.187) to those which satisfy (3.186), (3.187), (3.190), (3.191) and the following: 86
3.5. RELATED POLYHEDRA
(iii) for any X,, X i E
5 such that Xi n X j # 0 , we have Xi C X j
or X j
C Xi, (3.245)
i.e., 9 is a laminar family. In particular, fp(E) and fq(E) defined by (3.193) and (3.194) satisfy f d E ) 5 f,(E)* (3.246) It follows from (3.246) and Theorems 3.51 and 3.53 that for any nonempty
Y C E
j ( ~=)m i n { C f(xi) I {x,I i E I}: a partition of Y, iEI
vi E I : xi E 3 } , and we have cation of f .
j(0) = 0 by
definition (3.180). Function
(3.24 7)
f is the Dilworth trun-
Theorem 3.57: Let f: 3 --+ R be an intersecting-submodular function. For a n y X , Y C E andf: 2E -+ RU{+m} defined by(3.247), i f f ( X ) , j ( Y )< +co, then i ( X ) i ( Y )2 j ( X u Y ) j ( X n Y ) . (3.248)
+
+
(Proof) Since f is an intersecting-submodular function on 3, Theorem 3.54 holds for X , Y E with X U Y = E as well. Q.E.D. Define
Di
=
{X I X
C E,
f ( X ) < +m}
(3.249)
f
and let f l be the restriction of to D 1 . Then, from Theorem 3.57, D1 is a distributive lattice and f l is a submodular function on D1. Since polyhedron P(f) in (3.243) is expressed in terms of f l as P ( f ) = {X
I z E R E ,b'X
E
Di:
~ ( xI )f i ( X ) }
(3.250)
which is the submodular polyhedron P(f1) associated with the submodular system (01, f l ) on E . Since different submodular systems give different associated submodular polyhedra, the intersecting-submodular function f : 3 + R uniquely determines the submodular system ( & , f l ) such that P ( f ) = P(f1). Moreover, from (3.247), f l is integer-valued if f is. This completes the proof of (i) of Theorem 2.5.
3.5. Related Polyhedra
87
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA
We show some polyhedra which are closely related to base polyhedra and submodular/supermodular polyhedra.
(a) Generalized polymatroids A. Frank [FrankBlb] introduced the concept of generalized polymatroid. Suppose that a submodular system ( P I , f') and a supermodular system ( D 2 , g ' ) on E' satisfy
Then the polyhedron P(f',g') defined by
is called a generalized polymatroid or g-polymatroid. (The same polyhedron is E' also considered by R. Hassin [Hassin78,82] for the case where D1 = D2 = 2 .) Generalized polymatroids are characterized by the following theorem, which is implicit in [Frank8lb] (also see [Schrijver84a]) (see Fig. 3.7).
88
3.5. RELATED POLYHEDRA
Theorem 3.58 [Fuji84a]: For the base polyhedron B(f) associated with a submodular system ( D , f ) on E , the projection of B(f) along an axis e E E on the hyperplane z(e) = 0 is a generalized polymatroid P(f',g') in RE' with E' = E - {e}, where
DI D2
{ X I e $! X E D } , = {E - X I e E X E D}, =
(3.253) (3.254)
f' is the restriction off to D1 and g' is the restriction o f f # to D2. Conversely, every generalized polymatroid in RE'is obtained in this way, For each generalized polymatroid P(f', 9') with Di C 2E' (i = 1,2) such a base polyhedron B(f) in RE with E = E ' U { e } is unique u p to translation along the new axis e, and the two polyhedra P(f', 9') and B ( f ) are isomorphic with each other under the projection of the hyperplane z ( E ) = f ( E )onto the hyperplane z(e) = 0 along the axis e. (Proof) The base polyhedron B(f) is the solution set of (3.255)
z ( E ) = f(E). Choose an element
eE
(3.256)
E . From (3.256) we have s(e) =
f ( E )- z ( E - {e}).
(3.257)
Substituting (3.257) into (3.255), we have
VX E D with e $! X : z ( X ) 5 f(X), V X E D with
eE
X : z ( E - X)2 f ( E )- f(X).
(3.258) (3.259)
(3.258) and (3.259) is rewritten as
VXE
D1:
z ( X ) 5 f'(X),
(3.260)
WE
D2:
s ( Y )2 g'(Y),
(3.261)
where D1 and D2 are, respectively, defined by (3.253) and (3.254), j' is the restriction o f f to D1 and g' is the restriction of f# to D2. It follows from (3.260) and (3.261) that the projection of B ( f ) along the axis e on the hyperplane s(e) = 0 is the generalized polymatroid P(f',g') in RE'with E' = E - { e } . Note that (3.251) follows from the submodularity of j. 89
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA Now, we show the converse. For an arbitrary generalized polymatroid
P( f', 9')in RE'with a submodular system ( D 1 , f') and a supermodular system ( D 2 , g ' ) on E' let e be a new element not in E' and define E D2" =
= E' U {e},
{E-X I X E
D
(3.262) (3.263)
D2},
(3.264)
= D1 U K " .
Then, D is a distributive lattice with 0,E E D due to (3.251). Also, define a function f : D -+ R by
f(X)= f'(X) (XE
(3.265)
Dl),
f(X)= f ( E )- g'(E - X) (XE f(E) = c ,
E"),
(3.266) (3.267)
where c E R is arbitrary but fixed. From (3.251), (3.265) and (3.266) we have for any X E D1 and Y E K o
Therefore, f : D t R is a submodular function on the distributive lattice D . Moreover, for some a E R, ( z , ~ E) B(f) if and only if
VX E
vxE
D1:
z(x - { e l )
KO:
.(X) 5 f ( X ) = f'(X),
+ a 5 f(x) =f
+
z(E)
Q
( ~ -) g'(E - x ) ,
(3.269)
(3.270) (3.271)
=f(E).
Eliminating a in (3.270) by using (3.271), we have
VX E KO:z ( E - X)2 g'(E - X)
(3.272)
or WE
D2:
~ ( y2)g'(Y).
(3.273)
Therefore, for the submodular system ( D , f ) defined by (3.262)-(3.267) P(f',g') = {z
I z E RE,3a E R: 90
( z , a )E B(f)}.
(3.274)
3.5. RELATED POLYHEDRA Moreover, it is clear that P(f',g') and B(f) are isomorphic with each other under the projection of the hyperplane z ( E ) = f ( E )onto the hyperplane z(e) = 0 along the axis e. Q.E.D. It follows from Theorem 3.58 that extreme points, extreme rays, faces etc. of P(f',g') are characterized by the corresponding results for B(f) (cf. Section 3.3). Moreover, since the greedy algorithm works for B(f), it also works for P(f',g') mutatis mutandis (cf. [Hassin82]). When f': 2E' + Z+ is the rank function of a matroid M1 on E', 9': 2E' --t Z+ is the dual supermodular function of the rank function of a matroid M2 on E' and the pair of f ' and g' defines a generalized polymatroid, we call the ordered pair (M1,Mz) a strong map. This definition of strong map is equivalent to the ordinary one (see, e.g., [Welsh76]). If (M1,M2) is a strong map and the rank of M1 is equal to the rank of Ma plus one, the submodular system (P, f ) on E = E' U {e} appearing in Theorem 3.58 is a matroid M3 such that M3 - {e} = M1 and M3/{e} = Ma, if we define f(E) = f ' ( E ' ) . Frank [Frank8lb] originally defined generalized polymatroids in terms of intersecting families. This corresponds to the fact that a crossing-submodular function on a crossing family determines the base polyhedron associated with a submodular system (see Theorem 2.5). Note that if P C 2E is a crossing family, then Pi (i = 1,2) defined by (3.253) and (3.254) are intersecting families. (For more details on generalized polymatroids, see [Frank Tardos881.)
+
(b) Polypseudomatroids The concept of pseudomatroid was introduced by R. Chandrasekaran and S. N. Kabadi [Chandrasekaran+Kabadi88]. The same or similar concepts were independently considered by A. Bouchet [Bouchet87] as A-matroid, A. Dress and T. F. Have1 [Dress Have1861 as metroid, M. Nakamura [Nakamura88b] as universal polymatroid and L. Qi [Qi88] as ditroid. We shall consider (poly-) pseudomatroids of Chandrasekaran and Kabadi from the point of view of submodular systems. Denote by 3E the set of all the ordered pairs (X,Y ) of disjoint subsets of E . Let f: 3E -+ R be a function with f(0,0) = 0 such that for each ( X , , K ) E 3E (i = 1,2)
+
Define a polyhedron
P,(f) = { z
I 5 E RE,V(X,Y) E 3": 91
x(X) - z ( Y ) 5 f(X,Y)}.
(3.276)
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA The polyhedron P, ( f ) is called a polypseudomatroid ([Chandrasekaran+Kabadi 881, [Kabadi ChandrasekaranSO]) and f its rank function (see Fig. 3.8). Polyhedral studies are also made in [Nakamura88b] and [Qi88,89]. A pseudomatroid is a set-theoretical version of a polypseudomatroid.
+
We call a pair (S,T ) E 3E such that S U T = E an orthant of RE.For each orthant ( S , T ) denote by 2(spT) the set of all the pairs ( X , Y ) such that X C S and Y C T . Now, choose an orthant ( S , T ) . Then for each ( X ; , Y , ) E 2(S>T1(i = 1,2) we have from (3.275)
This means that for the orthant ( S , T ) the function f ‘ : 2E + R defined by
f‘(x)= f ( S n X , T n X ) is a submodular function on 2E. Define
(XcE)
(3.278)
3.5. RELATED POLYHEDRA The polyhedron P(S,T)( f ) is expressed by the submodular polyhedron P(f') associated with the submodular system (2E, f') as follows. P(S,T)(f)= {x I y E P(f'), Ve E S:z(e) = y ( e ) , V e E T : z ( e )= - y ( e ) } . (3.280) ) the same as those of Therefore, the combinatorial properties of P ( s , ~ ) ( fare P(f') and the greedy algorithm described in Section 3.2.b works for P ( s , ~ ) ( f ) mutatis rnutandis as for P(f'). Define B(S,T)(f) = {.
I x E P(S,T)(f),4 s)- 4 T ) = f ( S J ) ) .
(3.281)
the base polyhedron in the orthant ( S , T ) of the polypseuWe call B(S,T)(f) domatroid P,(f). When f ( S , T ) = 0 and B ( s , T ) ( ~C) RE+,B ( s , T ) ( ~is) the set of linked vectors of a polylinking system (or a polybimatroid); a set theoretical version of the system is called a linking system (or a bimatroid) (see [Schrijver78,79]and [Kung78]). From Theorem 3.22 we have
Corollary 3.59: A vector x E RE is an extreme point of B(s,T) ( f ) in (3.281) if and only if for a maximal chain
C: 8 = A o C A l c . * * C A , = E
(3.282)
of 2E we have
f ( S n Ai,T n A,) - f ( S n A i - l , T n A,-1) if A, - A,-1 E: S, z(A, - A,-l) - z ( A , - Ai-1) if A, - A,-1 C T . for each i = 1,2,.
(3.283)
- - ,n .
We also have
Lemma 3.60: For each orthant ( S , T ) of RE we have
Then, for any (X,Y) E 3E we have from (Proof) Suppose z E B(S,T)(f). (3.275)
4x1 - .(Y)
f(X,Y) = x(X)- .(Y) - j(X,Y) x ( S ) - s ( T ) - f(S,T) 5 5 ( S - Y )- Z(T - X)- f ( S - Y,T - X) + ~ ( s n X ) - s ( T n -j(SnX,TnY) Y) < 0. -
+
93
(3.285)
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA Hence z E P, ( f ).
Q.E.D.
We see from Lemma 3.60 that B(S,T)(f)for each orthant ( S , T ) is a face of P, ( f ) . Since
for any w: E
-+
R we have from Lemma 3.60
For an orthant (S,T ) such that w(e) 2 0 (e E S) and w(e) 5 0 (e E T ) ,(3.287) holds with equality. Therefore, the problem of maximizing the linear function CeEE w(e)z(e) over the polypseudomatroid P, ( f ) is solved by maximizing the same linear function over P(S,T)(f) or B ( s , ~ ) ( f )Hence . the greedy algorithm as in Corollary 3.59 works for the polypseudomatroid P,(f) and the union of all the extreme points of B(S,T)( f ) for all the orthants ( S ,T ) is exactly the set of all the extreme points of P, ( f ) . Also, we see from Lemma 3.60 (and Lemma 3.2) that for each ( X , Y ) E 3E
so that f is uniquely determined by P,(f).
The greedy algorithm is given as follows. Here, note that we consider a maximization problem rather than a minimization problem (cf. Section 3.2.b). Greedy algorithm f o r polypseudomatroids
Step 1: Find a sequence ( e l , e 2 , - . * , e n )of all the elements of E such that Iw(el)l 2 lw(e2)1 2 . .. 2 Iw(e,)I. Choose an orthant ( S ,7’)such that w(e) 2 0 (e E S) and w ( e ) 5 0 ( e E 7’). Step 2: Let Ai be the set of the first i elements of the sequence ( e l , e 2 , . . . ,en) for each i = 1 , 2 , .. - ,n and compute a vector z E R E by (3.283). (z is a maximizer of the linear function CeEE w ( e ) z ( e ) over the polypseudomatroid
P*(f)-) (End) 94
3.5. RELATED POLYHEDRA
Theorem 3.61 [Chandrasekaran+Kabadi88], [Nakamura88b]: For any function f : 3E -+ R define P,(f) = {x I x E RE,V(X,Y) E 3 E : x ( X )- ~ ( y2)f(X,y)}. (3.289) The greedy algorithm of the type described above works for P,(f) if and only i f f is the rank function of a polypseudomatroid.
(Proof) It suffices to show the “only if” part. Suppose that the greedy algorithm described above works for P,(f). Then, for any (Xi,&)E 3E (i = 1,2) choose a maximal chain C of (3.282) containing (XI n X2)U (Y1n Y2)and ((XlUX2)-(Y1UY2))U((Y1UY2)-(X1UX2)) in it and define thevector x E RE by (3.283), where (S,T) is an orthant such that (XI U X2)- (Y1U Y2)C S and (Y1U Y2)- (XI U X2)C T . From the assumption we have z E P,(f), It follows from the definition of z that
x(X1n X2)- z(Yln Y2)= f(Xl n X2,Y1 n Y2).
(3.292)
From (3.290) -(3.292),
It follows that f is the rank function of a polypseudomatroid.
Q.E.D.
Theorem 3.62 [Chandrasekaran+Kabadi88] : The system of inequalities
.(X) - .(Y) 5 f(X,Y) ((X,Y) E 3
9
(3.294)
is totally dual integral.
(Proof) The present theorem follows from Lemma 3.60 and Corollary 3.21. Q.E.D. 95
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA The concept of polypseudomatroid explained above can be generalized as follows. Let 3 be a subset of 3" such that for each ( X i ,y ) E 3 (i = 1,2) the two pairs ( ( X IU X2) - (Yl U Y2),(Yl U Y2)- ( X I U X 2 ) ) and ( X I f l X 2 , Y1 fl Y2) belong to 7.Also let f : 3 --f R be a function such that for each ( X , , K ) E 3 (i = 1,2) f satisfies (3.275). The class of polypseudomatroids in this generalized sense includes as special cases submodular and supermodular polyhedra, base polyhedra and generalized polymatroids. Further generalization was considered by Qi [Qi88,89].
(c) Ternary semimodular polyhedra In the preceding subsection 3.5.b we considered the set 3" of all the ordered pairs ( X ,Y ) of disjoint subsets X and Y of E . 3E is a distributive lattice with lattice operations V and A defined as follows. For any ( X i , y t ) E 3" (i = 1,2) define ( X 1 , Y l ) v (X2,Y2) = ( X l u X2,Yl nY2), (3.295)
Wl,Y l )A ( X 2 ,Y2) = ( X l fl x2,Yl u Y2).
(3.296)
Identifying each ( X , Y ) E 3" with a ( 0 , +l}-vector x ( X , Y ) defined by
(eEX) (eEY) (e E E - ( X U Y ) ) ,
1 -1 0
(3.297)
we see that operations V and A in (3.295) and (3.296) are ordinary ones in vector lattice R" and that 3" is a finite sublattice of R". Let 5 be the partial order on 3" associated with the distributive lattice 3E. We call a pair of ( X i , Y , ) E 3" (z = 1,2) comparable if we have ( X 1 , Y l ) 5 (X2,Y2) or ( X 2 , Y z ) 5 ( X 1 , Y l ) . We also use the partial order 5 for { O , f l } vectors under the correspondence (3.297). b 4 R be a Let b be a sublattice of the distributive lattice 3 E and submodular function on b, i.e., for each ( X i , & ) E b (i = 1 , 2 )
f:
f ( X 1 ,Y l ) + f ( X 2 3 2 ) 2 f ( X 1 u x2,Yl n Y2) + Wl flx2,Yl u Y2), (3.298) where we write f ( X ,Y ) instead of f ( ( X ,Y ) ) .Also, define a polyhedron by
$(f) = {z I z E R E , V ( X , Y )E 8: z ( X ) - z ( Y ) 5 f ( X , Y ) } .
@(f)
(3.299)
Without any additional condition the polyhedron F(f) defined by (3.299) may be empty. The following theorem shows that a trivial necessary condition for P(f) to be nonempty is also sufficient. 96
3.5. RELATED POLYHEDRA
Theorem 3.63 [Fuji84e]: e ( j )is nonempty if and only if
i ( 0 ,X ) + j ( X ,0 ) 2 0 for each
(3.300)
x g E such that ( @ , X I(x,@) , E 5.
(Proof) The “only if” part is trivial. We show the “if” part. Suppose (3.300) E such that (0,X ) , ( X ,8 ) E $. It follows from the Farkas holds for each X lemma or the linear programming duality theorem that @ ( j#) 0 if (and only if) for all positive a, E R (i E I ) and ( X , , Y , ) E b (i E I ) with a finite index set I such that (3.301) crix(Xi,K ) = 0
C iEt
we have (3.302) Here, note that x ( X , , y i ) is the coefficient vector of the inequality .(Xi) z(Y,) 5 f ( X , , Y , ) . Since the vectors x ( X , Y ) ( ( X ,Y ) E b ) are integral, we can restrict the positive real coefficients a; (i E I ) in (3.301) to positive integers. It follows that @ ( f )# 8 if (and only if) for all ( X i ,Y , ) E b (i E I ) with a finite index set I such that CX(Xa,Y,)= 0 (3.303) iEI
we have (3.304) *El
where possibly ( X , , Y , ) = ( X j , y 3 ) for distinct i , j E I . If the family of ( X , , Y , ) (i E I ) in (3.303) contains an incomparable pair of ( X l , Y l ) and ( X 2 , Y 2 ) ,then Put (XI,Yl) ( X l u x2,Yl n Y2), (3.305a) +
(X2,Y2)
+-
( X l n X2,Yl u Y2).
(3.305b)
After the replacement (3.305) relation (3.303) remains valid and the value ,of the left-hand side of (3.304) does not increase, due to the submodularity of f . Since we get a family of ( X l ,K ) (i E I ) composed of pairwise corngarable elements of b after a finite number of such replacements, it suffices to show that we have (3.304) for all (X,,Y,) E b (i E I ) such that (X,,Y,) (i E I ) are pairwise comparable and (3.303) holds. Therefore, suppose that (3.303) holds for (X,,Y,) (i E I = {1,2,...,m}) and
97
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA We see from (3.303) and (3.306) that
x1=0, Suppose 111 2 2. If X,
-
Y, = O .
(3.307)
Y1 # 0, then for any e E X,
- Yl
we have (3.308)
which contradicts (3.303). Similarly, Y1 - X, Hence, we must have
Y1=
# 0 leads to
x,.
a
contradiction. (3.309)
Putting 21 = Yl(= Xm), we have from (3.307) and (3.309)
From (3.310) and assumption (3.300),
Put I + I - { l , m } . For the new index set I (3.303) still holds. Repeat the above argument until (11 = 0 or 1. When (11= 1, suppose I = {io). Then, from (3.303), X(Xi,,r,,) = 0, (3.312)
x;, = 0,
r,, = 0.
(3.313)
Hence, from (3.300),
~ ( ~ a , , =~ f(0,0) , ) 2 0.
(3.314)
It follows from (3.311) and (3.314) that (3.315)
where I = {1,2,
. ,m } , the original index set.
Corollary 3.64: Define a sublattice
DO of b
Q.E.D.
by
Let fo be the restriction of f^ to DO, and P ( f 0 ) be the polyhedron given by (3.299) with b and replaced by bo and fo, respectively. Then, @(f)# 0 if and only ifF(j0) # 0. 98
3.5. RELATED POLYHEDRA Corollary 3.64 means that when
X # 0 and Y # 0 do not cut off
f’(f0)
# 0, the inequalities in (3.299) with
$(fa) too much to give empty f’(f).
Moreover, it should be noted that if (0,0) E each ( X ,Y ) E 8
f(X,Y) = i(X, Y)
b
and f(0,0) = 0, then for
+ f(0,0) 2 f(X,0) + f(0, Y) ,
(3.317)
so that the inequalities appearing in the right-hand side of (3.299) for (X,Y) E 8 with X # 0 and Y # 0 are redundant, i.e., f’(f) = f’(fo), where fo is defined in Corollary 3.64. Now, let us consider the following linear programming problem ( P ) with CERE.
( P ) : Maximize
(3.318a)
c(e)z(e) eEE
subject to V(X,Y) E
8: z(X) - z(Y) 5 f ( X , Y ) .
(3.318b)
In general, the system of linear inequalities (3.318b) is not totally dual integral, and even i f f is integer-valued, f’(f) may have non-integral extreme points. For example, consider E = {1,2), (3.319a)
8 = {({1,2},0),
(3.3 19b)
({1},{2))},
f(W, (2)) = 0. Then @(f ) has the only one vertex given by (i,i). i({L 2), 0) = 1,
(3.319~)
The linear programming dual of Problem ( P ) is described as follows.
( P * ) : Minimize
A(X,Y)f(X,Y) (X,Y)ED
subject to
c
A(X,Y) -
(X,Y)ED eE X
c
(3.320a) A(X,Y) = c(e)
(e E
E),
(X,Y)ED eEY
(3.320b)
~ ( X , Y2) 0 ( ( X , Y ) E P).
(3.320~)
Though the system of linear inequalities (3.318b) is not totally dual integral, we have the following.
Lemma 3.65: Let c E R E be an integral vector and suppose that Problem (P’) has an optimal solution. Then there exists a rational optimal solution A*(X,Y) ( ( X , Y ) E 8) of ( P * )such that & = {(X,Y) I A * ( X , Y ) > 0 ) 99
(3.321)
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA A
consists of pairwise comparable elements of D. (Proof) By the assumption there exists a rational optimal solution A * ( X , Y ) ( ( X , Y )E 6 ) of (P*). For some positive rational do each A * ( X , Y ) ( ( X , Y )E 6 ) is a nonnegative integral multiple of do. If & given by (3.321) contains an incomparable pair of elements ( X i , Y , ) (i = 1,2), then define CY
= min{A*(X,,Y,)
I i = 1,2} > 0
(3.322)
and change the values of A* ( X i ,Y,) (i = 1,2), A* ( X IU X2, Y1 nY2) and A* ( X In u Y2) as
X2,Yl
A*(X,,yi)
+-
A * ( X ; , y i )- CY
(i = 1,2),
(3.323)
A*(X1u X 2 , Y 1 n Y2)t A*(X1u X 2 , Y 1 nY2)+ C Y ,
(3.324)
A*(Xl n X 2 , Y 1 u Y2)t A*(X1n X 2 , Y 1 u Y2)+ C Y .
(3.325)
Then, new A* satisfies (3.320b) and (3.320~)and is an optimal solution of ( P * ) since the objective function is not made increase by the above replacement (3.323)-(3.325) due to the submodularity o f f . Repeat (3.322)-(3.325) so long as & given by (3.321) contains an incomparable pair. Only a finite number of repetitions of (3.322)-(3.325) are possible since (1) all the generated A*’s are distinct, (2) each A * ( X , Y ) ( ( X , Y ) E 6 ) is nonnegative integral multiple of do > 0 and (3) the sum of all A * ( X , Y ) ( ( X , Y )E 0)is constant. Q.E.D. The above proof is a direct adaptation of a standard proof technique developed by N. Robertson, L. Lovdsz [Lovdsz76], J. Edmonds and R. Giles [Edm+Giles77] and A. J. Hoffman [Hoffman74]. Now, consider an additional structure of 6 and f as follows.
(t) If for (xi,Y,)E 6
(i = 1,2)
x1
(3.327)
n y2 # 0,
(3.328) then ( X I
-
Y2,Y1), (X2,Y2 -
j(Xl,YJ
X I )E
0 and
+ j ( X 2 , Y 2 ) 2 j ( X 1 - Y2,Yl)+ j ( X 2 , Y 2
-
XI).
(3.329)
Note that (3.326)-(3.328) implies that for some e,e‘ E E we have x ( X 1 , Y d e ) = x(X2,Y2)(e)# 0 and x(X1,Y1)(e’) = - X ( X 2 , Y 2 ) ( 4 # 0. 100
3.5. RELATED POLYHEDRA
6 and satisfy property (t) described by (3.326) - (3.329). Then system (3.318b) of linear inequalities is totally dual integral.
f
Theorem 3.66 [Fuji84e]: Suppose
To prove Theorem 3.66 we need some lemmas. The proof of the following lemma is similar to the proof technique developed by Robertson et al.
Lemma 3.67: Suppose that c E RE is an integral vector, that Problem (P*) has an optimal solution, and that 6 and f: 6 + R satisfy the above additional property (t). Then there exists a rational optimal solution A * ( X , Y ) ( ( X , Y )E 6 ) of (P*) such that & given by (3.321) consists of pairwise comparable elements of 6 and does not contain any ( X j , x ) (i = 1,2) which satisfy (3.326) - (3.328). (Proof) By Lemma 3.65, let A * ( X , Y ) ( ( X , Y ) E 6 ) be a rational optimal solution of (P*) such that & given by (3.321) consists of pairwise comparable elements of 6. Also let do be a positive rational number such that each A * ( X ,Y ) Y )E is a nonnegative integral multiple of do. If & contains ( X , , x ) (i = 1,2) which satisfy (3.326)-(3.328), then let ( X , , y t ) (i = 1,2) be elements of & for which
((x,
6)
(t) (3.326)-(3.328)
hold and for each (X3,Y3) E & such that ( X l , Y l ) ? ( X 3 ,Y3) ? (X2,Y2) we have
x1n y2 c E - (x,u y3).
(3.330)
(Note that there always exists such a pair of ( X i ,y t ) (i = 1,2).) Define a!
= min{A*(X,,Y,)
I i = 1,2} > 0.
a!
by
(3.331)
Change thevaluesofA*(X;,yt) (i = 1,2), A*(Xl-Y2,Y1) and A*(X2,Y2-X1) as A*(X,,X) + A*(X,,yt) - a! (i = 1,2), (3.332)
A'(X1X*(X,,Y,
Y2,YI) + A*&
-
X,)
t
A'(X,,Y,
+
a!,
(3.333)
- X , ) +a!.
(3.334)
- Y2,Yl)
The new A* satisfies (3.320b) and (3.320~)and is an optimal solution of (P*) because of (3.329). It follows from (3.330) that the new A* gives & in (3.321) which consists of pairwise comparable elements of 5. Repeat (3.331)-(3.334) so long as & in (3.321) contains ( X , , Y , ) (i = 1 , 2 ) which satisfy the above (t). Then, all the generated A * % are distinct, because each changing of A * by (3.332)-(3.334) increases the total sum of 101
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA
IE - ( X U Y ) I A * ( X , Y )( ( X , Y )E 5 ) by 21x2 nY+ > 0. Also each A * ( X , Y ) ( ( X , Y ) E 6) is a nonnegative integral multiple of do > 0, and the sum of
A * ( X , Y )( ( X , Y )E 6) is constant. Therefore, after a finite number of steps we get a desired optimal solution A* of (P*). Q.E.D.
Let A = ( a ( i , j ) I i E I , j E J ) be a p x q matrix with a row index set I = { 1 , 2 , . . . , p } and a column index set J = { 1 , 2 , . . . , q }. For an integer t , we say A has the upper consecutive t ’s property (or the lower consecutive t ’s property) if a(i0,jo) = t for a pair of io E I and j o E J implies a ( i , j o ) = t for all i E I with i 5 io (or i 2 io). Also, we say A has the consecutiue t’s property if a(i1,jo) = a(i2,jo) = t for some i l , i 2 E I and j o E J with i~ < i2 implies u ( i , j ~= ) t for all i E I with il 5 i 5 i2. Denote the ith row and the j t h column of A = ( a ( i , j )I i E I , j E J) by a ( i , . ) and a ( . , j ) ,respectively. A matrix A is called totally unimodular if the determinant of every square submatrix of A is equal to 0, 1 or - 1 .
Lemma 3.68: Let I = { 1 , 2 , . . . , p } and J = { 1 , 2 , , q } . Suppose A = ( a ( i , j ) I i E I , j E J) is a (0, fl}-matrix such that (i) anypairofrowsa(i1,a) a n d a ( i a , . ) o f A ( i 1 , i a E I ) iscomparable and (ii) A does not contain, as a submatrix, any of the following 2 x 2 matrices
(; -;),
(-;
1:)
and the ones obtained from these matrices by row and column permutations. Then A is totally unimodular. (Proof) Suppose that the rows of A are indexed so that a(p,
a)
5 u(p - 1 , .) 5
. 5 u(1,-).
(3.335)
Also suppose q = p . We show that the possible values of the determinant of the (square) matrix A are 0, 1 and - 1 . Since the following argument is valid for any square submatrix of the original matrix A, this implies that A is totally unimodular. Define (3.336) I(+) = {i I i E I , 0 4 u ( i , e)},
I ( - ) = {i 1 i E I , a ( ; , . ) 5 O } ,
(3.337)
I ( + , - ) = { 1 , 2 , * * . , P-} ( I ( + )u I ( - ) ) ,
(3.338)
where 0 is the zero row vector of dimension IJI(= Ill). From (3.335), A has the upper consecutive 1’s property and the lower consecutive -1’s property. 102
3.5. RELATED POLYHEDRA If I ( + ) U I ( + , - ) = 8, then A is totally unimodular and detA = 0, 1 or -1, since A is a {O,-1} matrix and has the consecutive -1’s property[Hoffman+Kruskal56]. Therefore, suppose I(+) U I ( + , -) # 8. Let a(.,jo) be a column of A such that a ( i , j o ) = 1 for all i E I(+) U I(+, -). (Such a column exists because of (3.336), (3.338) and the upper consecutive 1’s property of A.) If a(io,jo)= -1 for some io E I ( - ) , then I ( + , -) = 8, since otherwise for an arbitrary il E I(+,-)we have a(i1,jo) = 1 and there exists a column u(.,jl) such that a ( i l , j l ) = a(io,jl) = -1, which contradicts the assumption. When I(+,-) = 0, A is totally unimodular and detA = 0, 1 or -1, since by multiplying each row a ( i , . ) (i E I(-))by -1, A becomes a matrix which has the consecutive 1’s property if the rows are appropriately reordered. Therefore, further suppose a(i,jo) = 0 for all i E I ( - ) . Now, transform the matrix A by fundamental row operations with pivot a(1,jo) = 1 in such a way that the joth column a ( . , j o )becomes a unit vector. Let A’ be the matrix obtained from the resultant matrix by further removing the first row and the joth column, and let A’, and A; be the submatrices of A’ composed, respectively, of row vectors corresponding to ( I ( + )U I(+, -)) - (1) and I(-). Then, since A has the upper consecutive 1’s property and the lower consecutive -1’s property and since by the assumption for any j E J such that u ( 1 , j ) = 1 we have u ( i , j ) = 0 or 1 for all i E I ( + ) U I ( + , - ) , both A‘, and A; have the lower consecutive -1’s property. Hence, A‘ has the consecutive -1’s property by appropriately reordering the rows (i.e., by reversing the row ordering of A;) and is totally unimodular. We thus have detA = +detA‘ = 0, 1, or -1. Q.E.D. We are now ready to prove Theorem 3.66. (Proof of Theorem 3.66) Let c E RE be an integral vector, and suppose Problem (P*)has an optimal solution. By Lemma 3.67 there exists an optimal solution A* of ( P * )such that & given by (3.321) consists of pairwise comparable elements of b and does not contain any (Xi,yi) (i = 1 , 2 ) which satisfy (3.326)-(3.328). Then it follows from Lemma 3.68 that the set of row vectors x ( X ,Y )( ( X ,Y )E &) forms a totally unimodular matrix, which implies the existence of an integral optimal solution of (P*). Q.E.D. We call the polyhedron $(f)the ternary semimodular polyhedron associated with (8,f ) satisfying property (t). A general framework for proving total dual integrality of systems of linear inequalities related to submodular functions is proposed by A. Schrijver [Schrijver84a,84b],which includes total dual integrality of the Edmonds-Giles submodular flow model [Edm Giles771 and many other models. However, it is not clear whether the total dual integrality of (3.318b) with property (t) of (4.326)-(3.329) follows from Schrijver’s framework.
+
103
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA For a submodular system ( D ! , f') and a supermodular system ( D " , 9") on
E , define b 2 3E by
and for each ( X , Y ) E
b (3.340)
Then, f: b + R is a submodular function on is expressed as
b , and e(f)defined by (3.299)
W) = P ( Y ) r-7 P ( 9 " ) ,
(3.341)
where P ( f r ) is the submodular polyhedron associated with the submodular system ( D t , f') and P(g") is the supermodular polyhedron associated with the supermodular system ( D t ' , g t 1 ) . Theorem 3.63 implies that P ( f ' ) n P ( g " ) is nonempty if and only if g"(X) 5 f ' ( X ) for each X E D' n D". Moreover, it should be noted that there exists no pair of ( X i ,y i ) in the present b which satisfies (3.326)-(3.328). Therefore, if f' and g" are integer-valued and P ( f t ) n P ( g " ) # 0, each face of P ( f r ) n P(g") contains integral points due to Theorem 3.66. This leads to the discrete separation theorem (Theorem 4.12) due to [Frank82b]. It should be noted that a submodular polyhedron, a supermodular polyhedron, a base polyhedron, and the intersection of two base polyhedra are special cases of the intersection of submodular and supermodular polyhedra and that all of these polyhedra are ternary semimodular polyhedra. Hence, in general the greedy algorithm does not work for ternary semimodular polyhedra. Let ( D ' , f') be a submodular system and (D", 9") a supermodular system, both on E , such that P ( f r , g r r )= {x 1 x E R E ,VX E p': x(X)5 f ' ( X ) , W E D": z ( Y ) 2 g " ( Y ) }
is a generalized polymatroid, i.e., for each
x-YED',
X E D' and Y
E
Dtr we have
Y-XED",
f ' ( X ) - g"(Y) 2 f'(X-- Y ) - g"(Y - X ) .
f'(0)
b
1: b
and + R by (3.339) and (3.340), respectively. Since = g"(0) = 0 by definition, we have from (3.344)
Define
f(0, X ) + f ( X ,0) 2 2j(0,0) = 0 104
(3.342)
(3.343) (3.344)
1(0,0) = (3.345)
3.6. SUBMODULAR SYSTEMS OF NETWORK T YP E for each X E D' n D". It follows from Theorem 3.63 that generalized polymatroids are always nonempty (cf. Theorems 2.3 and 3.58). Also, the total dual integrality of (3.342) follows from Theorem 3.66. Suppose that a distributive lattice 6 2 3" and a submodular function f: b -+ R satisfy the property that for any ( X , , x ) E b (i = 1,2) satisfying (3.326) and (3.327) we have ( X l -Y2, Y l ) , ( X 2 ,Yz - X I ) E D and (3.329) holds. Then, ?(f) defined by (3.299) is called a hybrid independence polyhedron by # 0 if and only if f ( 0 , 0 ) 2 0 (if Tomizawa [Tomi8la]. In this case, ( 0 , 0 ) E 6),since (3.345) holds for each X such that ( 0 , X ) , ( X , 0 ) E 5. It should be noted that the class of hybrid independence polyhedra includes the class of generalized polymatroids but not the class of intersections of submodular and supermodular polyhedra.
e(f)
Readers should be referred to a nice review [Schrijver84a] for other related polyhedra and systems such as lattice polyhedra ([Hoffman Schwartz781, Hoffman821, [Hoffman78]), Frank's kernel s y s t e m ((Frank791) and [Groflin Grishuhin's model ([Grishuhin81]).
+
+
3.6. Submodular Systems of Network Type [Tomi+Fuji8l]
Let N = (G = ( V , A ) , c )be a capacitated network with an underlying graph G = (V,A ) and a nonnegative upper capacity function c: A + R + , where the lower capacity function is regarded as the zero function. The cut function IC,: 2" + R associated with the network N is given by =
c 4.)
(U E V ) ,
(3.346)
lZEA+lJ
where A + U is the set of arcs leaving U (see Section 2.3). Without loss of generality we assume that G is a simple graph and that each arc a E A is identified with the ordered pair (d+a, % a ) of its end-vertices. (3.346) can be rewritten as
ICc(U) =
c c U E l l fJECl-[U}
c(u,v)-
cy
[I..fl)
E(
( c ( u , v )+ c ( v , u ) )
(U E V ) , (3.347)
)
('2')
where for each arc ( u , v ) E A we write c ( ( u , v ) )as c ( u , v ) , is the set of all the two-element subsets of U , and we define c(u, v) = 0 for (u,v ) $ A . For any the set of all finite set X and any integer i with 0 5 i 5 1x1 we denote by the i-element subsets of X .
(4)
105
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA For any non-zero set function f : 2v R (05 i 5 IVl) such that
-+
R there exist functions f ( i ) :
(7)
-+
(3.348)
By the Mobius inversion formula f(') (0 5 i 5 IVl) are uniquely determined from f as = (-l)~X-Y~f(Y). (3.349)
fqx)
c
YGX
+
Let k be an integer such that 0 5 k 5 IVI, f ( k ) # 0 and f ( i ) = 0 ( k 1 5 i 5 IVl). The function f is called a set function of order k (see [Tomi80b]). We see from (3.347) that any cut function is of order 2 if c # 0 (also see (3.353) and (3.354) below). A submodular system (2", f ) is said to be of network type if f is equal to the cut function IC,: 2v + R associated with a network having a nonnegative capacity function c.
Theorem 3.69 [Tomi+Fuji81]: Suppose that (2v,f ) is a submodular system f ) is of network type if and only if the following (i) - (iii) hold: (i) The order of f is less than or equal to 2. (ii) For the functions f ( i ) (0 5 i 5 IVl) in (3.348),
on V . (2',
f(0)
= 0,
f (1) > - 0,
( 3.350) (3.351)
(iii) For each X
CV ,
(Proof) The 'only if" part: If ( 2 V , f ) is of network type and f = n, for a network with a nonnegative capacity function c , then from (3.347) we have
106
3.6. SUBMODULAR SYSTEMS OF NETWORK TYPE f ( 2 ’ ( { u , v } )= - ( c ( u , v ) t c ( v , u ) ) .
(3.354)
Since the capacity function c is nonnegative, (i) - (iii) easily follow from (3.353) and (3.354). The “if” part: Suppose (i)-(iii) hold. Consider the bipartite graph G = ( W + ,W - ; with the left and right vertex sets W + and W - and with the arc set defined by
a
a)
w-= (;),
w+= v,
a = ((u,U)
I
u
EuE
(;)}.
(3.355)
(3.356)
Let fi = (2,;)be a network with the underlying graph G and a nonnegative capacity function c^ such that for each u E Z(u) is sufficiently large, where W + is the entrance vertex set and W - the exit vertex set of &. Then, it follows from the assumption that there exists a flow p: 2 -+ R in fi such that
a
a+) = fQ)({u}) ( u E W + ( =V ) ) , -ap(u) = f‘2’(u) (u E w-(=(;))I,
(3.357) (3.358)
due to the supply-demand theorem for bipartite networks (Theorem 1.4) ([Gale 571, [Ford Fulkerson621). Choose one such flow p: 2 -+ R+ . Define
+
C b ,
4 = P ( b , { u , 4))( ( u ,4 E 4.
(3.360)
Let N be the network with the underlying graph G = (V,A ) and the nonnegative capacity function c defined by (3.359) and (3.360). From (3.357)-(3.360) we have (3.353) and (3.354). Since the order of f is less than or equal to 2, f coincides with the cut function IG, associated with the network A. Q.E.D. We see from this theorem that for a submodular system ( Z V , f ) on V with the rank function of order at most 2 the problem of discerning whether the submodular system (2V, f ) is of network type or not can be solved by a max-flow computation for the network & defined in the above proof, since (ii) in Theorem 3.69 can directly be checked and (iii) together with (3.351) is a necessary and sufficient condition for the existence of a feasible flow in the bipartite network f i . Moreover, when ( 2 V , f ) is of network type, consider the minimum-cost network-realization problem defined as follows: 107
11. SUBMODULAR SYSTEMS AND BASE POLYHEDRA Find a network d = (G = (V,A ) , c ) with rc, = f such that the cost (3.361) a€ A
is as small as possible, where ~ ( ais) the realization cost per unit capacity for each arc a E A . As is seen from the above proof, this problem is reduced to a Hitchcock transdefined in the above proof (see portation problem for the bipartite network (TomiSFuji811).
k
108
Chapter 111.
Neoflows
In this chapter we consider generalizations of classical flow problems of Ford and Fulkerson to flow problems with boundary constraints described by submodular functions. The new flow problems to be treated are the submodular flow problem, the independent flow problem and the polymatroidal flow problem, and they are, in a sense, equivalent. Because of this we call the class of these flow problems and other possible equivalent ones the neoflow problem and each of them a neoflow problem (cf. [Fuji87a]). We give a theory and algorithms for the neoflow problem.
4. The Intersection Problem
In this section we consider the problem of finding a maximum common subbase of two submodular systems and some related problems. 4.1. The Intersection Theorem
Let ( D i , f i ) (i = 1,2) be two submodular systems on E and consider the following problem.
P I : Maximize z ( E ) subject to z E P(fl) n P(f2)
(4.la) (4.lb)
Equivalently, we express this problem as follows. Let E' be a copy of E and we regard ( D 2 , f i ) as a submodular system on E'. Also let G = ( E , E ' ; A )be the bipartite graph with the left and right vertex sets E and E' and with the arc set A = {(e,e') I e E E } , where e' E E' is a copy of e E E , i.e., A gives a natural bijection between E and its copy E' (see Fig. 4.1). Furthermore, we consider a capacity function c: A + R U {+m} such that ~ ( a=) +oo ( a E A ) . Then Problem PI in (4.1) is equivalent to the following problem
PI': Maximize dp(E ) ~ subject to ( ~ 3 pE) P(fI), -
(WE' E P(f21,
(4.2a) (4.2b) (4.2~)
where p: A + R is a flow, d p is the boundary of p in N and ( 8 ~and ) ( ~d ~ ) ~ ' are, respectively, the restrictions of d p to E and El. A flow p: A ---f R satisfying (4.2b) and ( 4 . 2 ~ )is called a feasible flow in A' = (G = ( E , E ' ; A ) , c , S l , S z ) , 109
111. NEOFLOWS
E
e4
el
0
. .
E'
.. Figure 4.1.
where S ; = (D,,f;) (i = 1 , 2 ) . As we will see later, Problem PI' is a special case of a neoflow problem.
(a) Preliminaries We need some preliminaries to furnish an algorithm for solving the intersection problem described by (4.1) or (4.2). Consider a submodular system (D, f) on E . In the following we give several lemmas which are obtained by a direct adaptation of the results shown in [Fuji78a] for polymatroids.
Lemma 4.1: Suppose z E P(f), u E sat(z) and v E dep(z,u) - { u } . For any a: E R such that 0 < a: 5 c"(z,u,v)define y = z a(xU- xu). Then, y E P(f) and (4.3) sat(y) = sat(z).
+
(Proof) From the definition of the exchange capacity we have y E P(f). Also, since for any X E D such that X 2 sat(s) we have y(X) = z ( X ) and sat(z) is the unique maximal tight set X such that z ( X ) = f ( X ) ,we have sat(y) = sat (x). Q.E.D. 110
4.1. THE INTERSECTION THEOREM
Lemma 4.2: Under the same assumption as in Lemma 4.1, (w E E
c^(y,w) = i3(z,w)
- sat(z)).
(4.4)
and similarly,
f(X) - .(X) 2 f ( X u XO) - z(X u XO). Hence the lemma follows from (2.37).
(4.6)
Q.E.D.
Lemma 4.3: For any z E P ( f ) let u1, u2 and 212 be three distinct elements of E such that u, E sat(z) (i = 1 , 2 ) , (4-7) 7)2
E dep(z,uz),
VZ
4 dep(z,.u1).
(4.8)
For any a E R such that 0 < CY 5 C " ( Z , U ~ , V ~ ) define Y = 2 + 4 x u , - XtJ.
(4-9)
Then we have u1 E sat(y) and deP(Y,ul) = d e p ( v 1 ) .
(4.10)
(Proof) From Lemma 4.1 we have u1 E sat(y). Also we have u2 4 dep(z,ul), since otherwise we would have dep(s,uz) dep(z,ul) by the minimality of dep(z,u2) and hence v2 E dep(z,ul). Therefore, putting XO= dep(z,ul), we have y(X0) = z ( X 0 ) = f ( X 0 ) and y ( X ) = z(X) for any X E P with X Cr XO. (4.10) follows from the definition of dependence function. Q.E.D. Lemma 4.4: For any z E P ( f ) let u1, uz, 211 and of E satisfying (4.7), (4.8) and v1 E dep(z, 4 .
be four distinct elements (4.11)
Then for the vector y defined by (4.9) for any CY E R with 0 < CY 5 C(z,uz,vz), we have C"(y,w,vd = E ( ~ , U h V l ) . (4.12) 111
111. NEOFLOWS (Proof) For any z E P(f) and
X, E D
such that z(X0) = f(X0) we have
f ( x )- Z ( X )2 f(xn X o ) - z ( X n X o ) For Xo
(XE D).
(4.13)
= dep(s, u1) we have from Lemma 4.3
and since u 2 , v2
4 dep(z,ul), we have y(X) = .(X)
(X
c xo,x E P).
(4.15)
Since (4.13) holds for z = 5, y, (4.12) follows from (4.14) and (4.15). Q.E.D. Lemma4.5: F o r a n y z c P ( f ) letui,vi ( i = 1 , 2 , - . - , q ) be2qdistinctelements of E such that
( k = 1,2,-..,q),
(4.16)
i < j 5 q). For any ( ~ (i i = 1 , 2 , - . . , q ) satisfying 0 < (Y; 5 E(z,ui,v;) (i define a vector y E RE by
(4.17)
v; E dep(z,ui)
ui E sat(z),
$! dep(s,ui)
vj
(1 5
= 1,2,..-,q)
(4.19) sat(y) = sat(z),
2(y, W ) = Z(Z, w )
(w E E
(4.20) - sat(s)).
(4.21)
(Proof) Considering the elementary transformations in the order of the pairs (ug,vq), (ug-l,vp-l), ( u l , q ) , the present lemma can be shown by repeatedly applying Lemmas 4.1-4.4. Q.E.D. .
a
*
,
It should be noted that we have (4.19)-(4.21) if (4.16) and (4.17) hold for an appropriate numbering of ui’s and v;’s. Lemma 4.5 will play a very fundamental rBle in developing algorithms for solving the intersection problem and other related problems.
(b) A n a l g o r i t h m a n d the intersection theorem 112
4.1, THE INTERSECTION THEOREM
We now consider Problem PIf described by (4.2). Given a feasible flow p in network N = (G = ( E , E ’ ; A ) , c , S 1 , S 2 )define , the auziliary network N, = (G, = ( V , A , ) , c , ) associated with p as follows. G, = ( V , A , ) is a directed graph, called the auxiliary graph associated with p, with vertex set V and arc set A , given by
v = E u E‘ u {s+,s-}, A , = s; u A; u A* u B* u A; u s;,
(4.22) (4.23)
where S: = {(s+,v)
I v E E - sat+(d+cp)},
(4.24)
A; = {(u,v) I v E s a t + ( a + p ) , u E dep+(d+p,v) - {v}}, A* = A ,
(4.25)
B’ = { ( e f , e ) 1 e E E } , A ; = {(u,v) 1 u E sat-(d-cp), v E dep-(d-p,u)
(4.27)
S; = {(v,s-)
- {u}},
I v E E - sat-(d-cp)}.
(4.26) (4.28) (4.29)
Here, d+’p = ( d ~ )d-cp ~ , = - ( d ~ ) ~and ’ , sat+ and dep+ (sat- and dep-) are, respectively, the saturation function and the dependence function defined with respect to submodular system (P,,fl) on E ( ( P 2 , f z ) on IT).Note that B* is the set of the reorientations of arcs of A . We also define the capacity function c,: A , t R U {+m} by ?+(d+cp,v) E+(d+cp,v,u) E-(d-cp,u,v)
2-(d-cp,v)
( a = (s+,v) E s;), ( a = ( u , v )E A : ) , ( a E A* U B * ) ,
(4.30)
( a = (u,v) E A ; ) , ( a = (v,s-) E s;),
where c^+ and E+ (2- and E - ) are, respectively, the saturation capacity and the exchange capacity defined with respect to submodular system (D1,f l ) on E ( ( P 2 , f z ) on E‘). Figure 4.2 shows the auxiliary network N,. Let us consider an algorithm described as follows. We call a directed path from s+ to s- of the minimum number of arcs a shortest path from s+ tq s-. We assume an oracle for exchange capacities for (Pi,fi) (i = 1 , 2 ) . Algorithm for the intersection problem I n p u t : afeasible flow p in U = (G = ( E , E f ; A ) , c , S 1 , S 2 ) . O u t p u t : a maximum flow p in 1. 113
111. NEOFLOWS
S+
S-
Figure 4.2.
Step 1: Construct the auxiliary network Alp = (G, = (V,Alp),c,,). Step 2: If there exists no directed path from s+ to s- in Alp,then the algorithm terminates and the current 'p is a maximum flow in A. Otherwise, find a shortest path P from s+ to s- in Nlp and put (Y
t
min{c,(a)
{
p(a)
I a is an arc in P } ,
+
(Y
p(a) - (Y
( a = (e,e') E A and a is in P),
( a = (e,e') E A and (e',e) is in P).
(4.31) (4.32)
Go to Step 1. (End) Theorem 4.6: The Aows 'p obtained in Step 2 of the algorithm are feasible Aows in 1.Moreover, each time we carry out (4.31) and (4.32), the Aow value of 'p increases by a! > 0 given by (4.31). (Proof) In Step 2 we find a shortest path P from s+ to s- in the auxiliary network Alp. Let ( u+l ,v1+ ), . - ., ( u z ,):v be the arcs in A: lying on P in this (u, ,v;) be the arcs in A; lying on P in this order. order and ( u : , w:), By the way of choosing path P , for each i, j such that 1 5 i < j 5 p ( q ) we have u' 4 dep+(d+(o,vf) (v; 4 dep-(d-'p,u,:)). (4.33) a
.
e ,
114
4.1. THE INTERSECTION THEOREM
Q.E.D.
Therefore, the present theorem follows from Lemma 4.5.
Theorem 4.7: If there exists no directed path from s+ to s- in the auxiliary network Nv in Step 2 of the algorithm, the current flow 'p is a maximum Aow in N. (Proof) Let U' be the set of the vertices in V which are reachable from s+ along directed paths in Nv (see Fig. 4.3).
U'
S-
Figure 4.3. Then for each v E E - U' and u E E' n U' we have dep+(a+'p,v) C E - U ' , dep-(a-'p,u) 2 E' n U * .
(4.34)
d+'p(E - U ' ) = fl(E- U * ) , a-'p(E'n U * ) = f 2 ( E ' n U ' ) .
(4.36)
(4.35)
Therefore,
Moreover, from the definition of U' and the auxiliary network
E n U'
= {e
1 e'
E E'
115
n U * } (gE ) .
(4.37)
Nv we see (4.38)
111. NEOFLOWS From (4.36)- (4.38),
+
d + ’ p ( ~= ) d + p (~ u’) d - ’ p ( ~ n ’ u*)
+
= f l ( E - U*) f2(E’ n U * ) .
(4.39)
On the other hand, for any feasible flow @ in N and for any U C E U E’ such that (1) E - U E D1,(2) E’ n U E D2 and (3) E n U = {e I e’ E E‘ n U } we have
a+@(E)= a+@(E- U ) + a-@(E’ n U )
I fi(E - u) + f2(E‘ n u).
(4.40)
It follows from (4.39) and (4.40) that the current flow y3 is a maximum flow in N. Q.E.D. Since there exists a maximum flow in N (this fact will also algorithmically be proven later), Theorems 4.6 and 4.7 together with the proof of Theorem 4.7 show the following theorem. Note that when f l and f2 are integer-valued, starting with an integral feasible flow in A, the algorithm given above terminates after a finite number of steps and then gives an integral maximum flow.
Theorem 4.8: For Problem Pl’ described by (4.2) we have max{d+’p(E) = min{fl(X)
I ’p:
N} + f 2 ( E ‘- X ‘ ) I X E D1,E’ a feasible flow in
- X‘ E
D2},
(4.41)
where X’ = {z‘ I z E X } C_ E’. Moreover, if f l and f2 are integer-valued, there exists an integral maximum flow in N. Rewriting Theorem 4.8 for Problem Pl,we also have
Theorem 4.9 (The Intersection Theorem): For Problem PI described by (4.1), max{z(E) I z E P(f1) n P(f2H = min{fl(X) f2(E- X ) I X E
+
D1,
E
-X E
Dz}.
(4.42)
Moreover, if f l and f2 are integer valued, the maximum on the left-hand side of (4.42) is attained by an integral vector E P(f1) n P(f2). Theorem 4.9 is a generalization of the intersection theorem for polymatroids due to Edmonds [Edrn’lO].The algorithm given in this section is a direct adaptation of the one given in [Fuji78a]. 116
4.1. THE INTERSECTION THEOREM
Denote by L the set of all the minimizers of the right-hand side of (4.42). L is a sublattice of D1 n &. Then, we have the following theorem. Recall that for a submodular system ( P , f ) and X , Y E D with X C Y denotes the rank function of the minor ( P , f ) - Y / X .
fz
Theorem 4.10 (cf. [Iri79], [Nakamura+Iri81]): Let
be any maximal chain of L1 and define
Then, the following two statements are equivalent:
(i) z is a maximum common subbase o f ( D ; , f ; ) (i = 1,2). (ii) For each i = 1,2,. ,k 1, zs*-’,-1 is a maximum common subbase of (Dl,fl)*S;/S;-1and ( P z , f z ) . ( E - S ; - l ) / ( E - S ; ) , where for each i = 1 , 2 , . . . , k zs*-ss-l i s a base o f ( D 1 , f l ) . S , / S ; - l a n d f o r e a c h i = 2 , 3 , . . . , k + l z s ~ - s l - l is a base of ( P 2 , fz) ( E - S ; - l ) / ( E - S;). (We disregard the case of i = 1 (or i = k 1) if S1 = 0 (or Sk = E ) . )
- +
-
+
(Proof) (i) ===+ (ii): Suppose (i). From Theorem 4.9 we have ”(S;)
+ z ( E - S;)
Since z(S;) have
= s(E)= f , ( S i )
+ f 2 ( E - S;)
(i = 1 , 2 , * * ,k). * (4.45)
I f l ( S ; )and z ( E - S;) I f 2 ( E - S;) for i = 1 , 2 , - . . , k , we must
Since z E P(f1) n P ( f 2 ) , it follows from (4.43) and (4.46) that (i = 1 , 2 , - . - , k )arebasesof ( P l , f 1 ) - S i / S ; - landthat zsi-si-l ( i = 2 , 3 , . . . , k + l ) are bases of ( P z , f2) ( E - Si- 1 ) / ( E - S;), where we disregard the case of i = 1 (or i = k 1) if S1 = 8 (or Sk = E ) . Hence, (ii) holds. (ii) ==+ (i): Suppose (ii). Then we have z E P ( f l ) n P ( f 2 ) . It follows from (ii) that
+
-
117
111. NEOFLOWS From (4.47) and Theorem 4.9 x is a maximum common subbase of ( D ; , f ; ) (i = 1 , 2 ) . Q.E.D. We can also show that the set of the minors of ( D 1 , f l ) and ( & , f 2 ) a p pearing in (ii) of Theorem 4.10 does not depend on the choice of a maximal chain C of L: ([Nakamura Iri811, [Tomi Fuji821) (see Theorem 7.17). Theorem 4.10 is a generalization of a structure theorem for bipartite graphs [Dulmage- Mendelsohn591.
+
+
+
(c) A refinement of the algorithm When fl and f2 are not integer-valued, the algorithm shown in Section 4.1.b may not terminate after finitely many steps. We shall show an algorithm due to Schonsleben [SchOnsleben80]and Lawler and Martel [Lawler+Martel82a] which modifies Step 2 of the algorithm and terminates after repeating Step 2 O( [El3) times. Let T : V ( = E U El) -+ {1,2,. .. ,2n} be a one-to-one mapping which denotes a numbering of the vertices in V , where n = (El. We also define T ( s + ) = 0 and ~ ( s - )= 2n 1. We represent any directed path P from s+ to s- in .A/,+, by the sequence (s+, v1, v2,. ,v,, s-) of vertices in P and denote by T(P)the sequence ( T ( V ~ ) , T ( V ~ ) , . . - , T (ofV the , ) ) numbering indices of the intermediate vertices. For any directed paths PI and P2 from s+ to s- in N,+,we say Pl is lexicographically smaller than Pz if 7r(P1)is lexicographically smaller than 7r(P2). We call the lexicographically smallest one in the set of all the shortest paths from s+ to s- in N,,,the lexicographically shortest path from s+ to s- in .A/,+, (with respect to n-). Now, we modify the part of finding a shortest path P in Step 2 of the algorithm as follows.
+
(*) If there exists a directed path from s+ to s- in N,+,, find the lexicographically shortest path P from s+ to s- in N,+,. Theorem 4.11 ([SchOnsleben80],[Lawler+Martel82a] for polymatroids): If we modify Step 2 of the algorithm given in Section 4.1.b for Problem PI' as above, the algorithm finds a maximum flow in .A/ after repeating Step 2 O(lE13)t imes. ' (Proof) Denote by Wk Vu{s+, s-} the set of the vertices which are reachable by a directed path from s+ having k arcs but not reachable by a directed path from s+ having less than k arcs. Suppose Wo = {s+} and s- E W,. Let P be the lexicographically shortest path in .A/,,. Also suppose that the arcs of A: lying on P appear in order as (4.48) 118
4.1. THE INTERSECTION THEOREM
from
S+
to s-. Put x = d+p and define Y =z
+4 X u ;
-
xu+
(4.49)
where Q is defined by (4.31) for the current p. Consider vertices w , z E E such that w !$ dep+(z,z), w E dep+(y,z). (4.50) Then we must have
v[ @ dep+(z,z),
u: E dep+(x,z),
(4.51) (4.52)
w E dep+(z,v[). (See Fig. 4.4.) Here we may have v;' = w or z = u t .
Figure 4.4. Now, suppose
u t E wk,
v r E wk+1
for some k (1 5 Ic 5 2n). Then we have z E with kl 5 k 1 .
+
Wk,
(4.53)
for some positive integer
(1) If vertex w is not reachable from s+ in N v , then adding arc (w,z)to does not change the set of shortest paths from s+ to s- in N v .
119
kl
N,,,
111. NEOFLOWS (2) Suppose w E Wk, for some
k2
(1
5 k2 5 272). (Note that k2 2 k.)
+
(2-1) If k2 > k or kl < k 1, then adding arc (w, z ) to the set of shortest paths from s+ to s- in 4,.
N,
does not change
(2-2) If k2 = k, kl = k+ 1 and there exists no shortest paths from s+ to swhich include vertex z, then adding arc (w,z) to N, does not change the set of shortest paths from s+ to s- in N,.
+
(2-3) If kz = k, kl = k 1 and there exists a shortest path from s+ to sin N, which includes vertex z , then we have (4.54) since P is the lexicographically shortest path. Because of Lemma 4.5 we can repeatedly apply the above argument to arcs (u:,v') (i = 2 , . - . , 1 ) in (4.48). We can also apply the argument for A: to A ; mutatis mutandis. For each vertex w E V U {s+} define
P,
(4
= min{T(v)
I v E V , (w,v)lies on a shortest path from s+
to s- in A',}. (4.55)
Also denote by p* the feasible flow obtained from 'p by transformation (4.32), Since at least one and let P* be the lexicographically shortest path in A',. arc lying on P is missing in N,., it follows from the above argument that
(i) (the length of P * ) 2 (the length of P)+1 or (ii) (the length of P*)= (the length of P) and p , 5 p,* with p,(v) < p,p.(v) for some v E V
U {s+}.
Case (ii) occurs consecutively O(lV12)times and hence Step 2 is executed Q.E.D.
o(Ivl3)(=O(lE13))times.
The above proof technique is due to R. E. Bixby (cf. [Cunningham84]). It should be noted that Theorem 4.11 together with Theorem 4.7 shows the existence of a maximum flow in 1 for any totally ordered additive group R and that the modified algorithm finds a maximum flow in N by changing feasible flows O( [ E l 3 )times. The above algorithm corresponds to the Edmonds-Karp algorithm for classical maximum flows [Edm Karp721. A complexity improvement over the above algorithm is shown in [Tardos Tovey Trick861 by generalizing the idea of layered networks due to E. A. Dinits [Dinits70].
+
+
120
+
4.2. DISCRETE SEPARATION THEOREM 4.2. Discrete Separation Theorem
A. Frank showed the following theorem. We prove this theorem by the use of the intersection theorem (Theorem 4.9). (Note that we have already shown this theorem in Section 3.5.c.)
Theorem 4.12 (Discrete Separation Theorem) [Frank82b]: Let ( D 1 , f ) be a submodular system on E and ( D 2 , g ) be a supermodular system on E . Then, g<
f ===+3xERE:g i x 5 f .
(4.56)
Moreover, i f f and g are integer-valued, there exists an integral x E RE such that g 5 x 5 f . Here, g 5 f means VX E D1 n Dz: g(X) 5 f(X) and g 5 x (x -< f ) means VX E D2: g(X) 5 x(X) (VX E D 1 : z(X) 5 f ( X ) ) . (Proof) Suppose g
< f . Then from the intersection theorem,
m={x(E) I x E P(f) n P(S#)) = min{f(X) g # ( E - X) I X E
+
= min{f(x)
+
n D2) g ( E ) - g(X) 1 X E DI n D 2 ) D1
= s(E).
(4.57)
Therefore, there exists a vector x in P ( f ) nB(g) (= P(f) nB(g#)). This vector x satisfies g 5 z 5 f . Moreover, if f and g are integer-valued, then from (4.57) and the intersection theorem there exists an integral x E P(f) n B(g), which Q.E.D. satisfies g 5 x 5 f . We can also show the intersection theorem (Theorem 4.9) by using the discrete separation theorem (Theorem 4.12) as follows. Let ( D ; , f;) (i = 1,2) be submodular systems on E . Define
k = min{fl(X) Also define f 2 :
D2
3
+
f2(E
- X) I X
D1,
E
- X E D2).
(4.58)
R by (4.59)
Note that iz: D2 3 R is a submodular function since k 5 f 2 ( E ) ; in fact, ( D 2 , f z ) is the ( f 2 ( E ) - k)-truncation of ( D 2 , f Z ) . Then we have i2#(0) = 0 = fl(0) and for each X E D1 n & - (0)
f2#(X) = f2(E) - f 2 ( E - X) = k - f 2 ( E - X)
5 fl(X). 121
(4.60)
111. NEOFLOWS It thus follows from the discrete separation theorem that there exists a vector x E RE such that f2# I x 5 f l . Since P(f1) n P(f2#) # 0, we have P(f1) fl B ( ~ J (= p(f1) n ~ ( f ~ # 0. ) )Consequently,
where the last inequality follows from the fact (the weak duality) that z ( E ) 5 f l ( X ) + f 2 ( E - X ) f o r a n y z E P ( f l ) n P ( f 2 ) andany X E D 1 w i t h E - X E Dz. This proves (4.42). Moreover, the integrality part of the intersection theorem follows from the integrality part of the discrete separation theorem and the above argument. We have thus shown the equivalence between the intersection theorem and the discrete separation theorem. Using the discrete separation theorem, we show (3.32) and (3.33) in Section 3.l.c of Chapter 11. For two submodular systems (P;,fi) (i = 1 , 2 ) on E let f be the submodular function on D1 fl Pz defined by
+
We write f as f l f2. We can easily see that the relation, P ( f l P(f1) + P(f2), holds. Conversely, for any x E P(fl f2) we have
+ f2)
2
+
which is rewritten as
VX E
D1 fl
Dz: z(X) - f2(X) L fl(X).
(4.64)
It follows from (4.64) and the discrete separation theorem that there exists a vector y E RE such that
Defining z = x - y, we have from (4.65) z E P(f2) and from (4.66) y E P(f,). W e t h u s h a v e z = y + z E P(f,)+P(fz). Therefore,P(fl+fz) g P(f,)+p(f,). This completes the proof of the fact that P(f, + f 2 ) = p(f,) + P(fz). 122
4.3. THE COMMON BASE PROBLEM
4.3. The Common Base Problem
Let (P;,f;) (i = 1,2) be two submodular systems on E . The common base problem for submodular systems (Pi,!;) (i = 1,2) is to discern whether there is a common base x E B(f1) n B(f2) and, if any, to find one such common base. Clearly, that fl(E) = f2(E) is necessary for the existence of a common base. Theorem 4.13: Let (P;,f;) (i = 1,2) be submodular systems on E with fl(E) = f2(E). Then there exists a common base x E B(fl) n B(f2) if and only iffz# 5 f l , i.e.,
vx E p1 nE: fZ#(x) 5 fl(x).
(4.67)
Moreover, i f f ; (i = 1 , 2 ) are integer-valued and a common base exists, then there exists an integral common base.
(Proof) If there exists a common base x E B(fl) n B(f2), then since B(f,) n B(f2) = p(f1) n p ( f ~ # )we , have f2# < x 5 f l . Conversely, if f 2 # 5 fly then there exists a vector x E RE such that f2# 5 x 5 fl, due to the discrete separation theorem (Theorem 4.12). Hence x E P(fl) n P(f2#) = B ( f l ) n B(f2#) = B(f1) n B(f2). Moreover, the integrality part also follows from the integrality part of the discrete separation theorem. Q.E.D. Note that, in Theorem 4.13,
fl(E)= f2(E). Also note that
5 f l if and only if 5 fly’is equivalent to:
f2#
“f2#
f1#
5 fz, since
From the intersection theorem, (4.68) means that there exists a vector x E P(f1) n P(f2) such that x ( E ) 2 f Z ( E ) ,and such a vector x is a common base since fl(E) = f z ( E ) . In this way Theorem 4.13 can also be shown by the intersection theorem. The common base problem can be solved by finding a maximum common subbase x E P(f1) n P(f2) through the algorithm shown in Section 4.1. We shall also give an algorithm for the common base problem which deals only with bases in B ( f l ) and B(f2). 123
111. NEOFLOWS Given bases bl E B(fl) and 62 E B(fi), we denote p = (b1,bz) and define the auxiliary network Np = (Gp = (V,A p ) ,TP+,Ti,c p ) as follows. G p is the underlying graph with the vertex set V = E and the arc set Ap defined by (4.69)
Ap = A; U A;,
I u, E v, = { ( u , ~ I) u , E~V ,
A; = { ( u ,V )
A;
21
{.I} , E dep2(b2,~)- {.I}, E dePl ( b l , ).
-
(4.70) (4.71)
where depl and dep2 are the dependence functions associated with (PI, f l ) and (P2, f i ) ,respectively. c p : Ap + R is the capacity function defined by (4.72)
Here, el and 22 are the exchange capacities associated with (D1, respectively. Also, T: and Ti are subsets of V defined by
: T
= {V
Ti = { V
f l ) and
I E V, bi(v) > bz(v)}, I v E V , bi(v) < b2(0)}.
( Pz,fz),
(4.73) (4.74)
Tg is the set of entrances and Ti the set of exits in Np. Now, an algorithm for finding a common base of B(f1) and B(f2) is given as follows.
An algorithm for finding a common base I n p u t : Submodular systems (D;,fi) (i = 1,2) on E with fl(E) = f2(E) and initial bases bi of (P;,fi) (i = 1,2). O u t p u t : A common base 61 (= b2) if any exists. Step 1: While bl # b 2 , do the following (a)-(c): (a) Construct the auxiliary network Np = (GP= (V,A p ) ,TJ,Ti,c p ) associated with p = (61,bZ). If there is no directed path from T$ to Ti,then stop (there is no common base in B(f1) and B(f2)). (b) Let P be a directed path from TJ to T f in G p having the smallest number of arcs and put a
+-
min{min{cp(a)
I a is an arc on P } ,
b l ( d + P ) - b2(d+P), bz(C3-P)
-
b@P)}.
( d + P is the initial vertex of P and d - P the terminal vertex of P.) (c) For each arc a E A p , 124
5.1. NEOFLOWS if a = (u,v) E A;, then put
bl(4
+
bl(4
- a,
bl(4
+ a,
bz(v)
+
bl(4
+ a,
if a = (u,v) E A;, then put b2(u) +-
Step 2: The current terminates. (End)
bl
b 2 ( ~ )
+-
b 2 ( ~)
a.
is a common base of B(fl) and B(f2) and the algorithm
Because of the way of choosing a directed path P in (b) of Step 1, the vectors bl and b2 remain bases in B ( f 1 ) and B(f2), respectively, due to Lemma 4.5. If the algorithm terminates at (a) of Step 1, then let U be the set of the vertices in GBwhich are reachable by directed paths from Ti.It follows from the definition of the auxiliary network Np that (4.75)
b l ( U ) = f l ( E )- f l ( E - U ) , b2(U) = f 2 ( U ) .
(4.76)
Since b l ( U ) > b 2 ( U ) by the assumption and the definition of U , we have from (4.75) and (4.76)
f1bW=
f 2 ( E ) ) > f l ( E - U )+ f 2 ( U ) .
(4.77)
From (4.77) (and Theorem 4.9) we see that there exists no common base in and B ( f 2 ) . When f l and f 2 are integer-valued and initial bases bl and b2 are integral, bases bl and b2 obtained during the execution of the above algorithm are integral and hence the algorithm terminates after repeating (a)-(.) of Step 1 at most b l ( T 2 ) - bZ(Ti) times. For general rank functions f l and f 2 we adopt the lexicographic ordering technique described in Section 4.l.c ([SchOnslebenSO],[Lawler Martel82al). When finding a shortest path from Ti to Ti by the breadth-first search, for each u E V search arc (u,v) in Ab (or A;) earlier than arc (u,v') in A$ (or A ; ) i f r ( v ) <~(v'),forafixednumberingx: V - + ( l 7 2 , . . - , I V I } o f V Bythis . modification the algorithm terminates after repeating Cycle (a)-(.) of Step 1 O( / E l 3 )times. B(l-1)
+
5. Neoflows
125
111. NEOFLOWS
In this section we consider the submodular flow problem, the independent flow problem and the polymatroidal flow problem, which we call neojlow problems. We discuss the equivalence among these neoflow problems and give algorithms for solving them.
5.1. Neoflows
We first give the definitions of the submodular flow problem, the independent flow problem and the polymatroidal flow problem. (a) Submodular flows
Let G = (V,A)be a graph with a vertex set V and an arc set A. Also let c: A -+ R U {+m} be an upper capacity function and c: A -+ R u {-m} be a lower capacity function. A function 7 : A t R is a cost function. Let 3 C 2v be a crossing family with 0, V E 3 and f: 3 -+ R be a crossing-submodular function on the crossing family 3 with f(0) = f(V)= 0. (See Section 2.3 for the definition of crossing-submodular function on a crossing family.) Denote this network by NS = (G = (V,A),c,E,y,(3,f)). The submodular jlow problem considered by Edmonds and Giles [Edm Giles771 is described as follows.
-
+
Ps: Minimize
(5.la)
7(a)p(a) aEA
subject to ~ ( u5) p(a) 5 Z ( a )
+
(a E
(5.lb)
A),
E B(f).
(5.1~)
Here, d p is the boundary of p with respect to G (see (2.23)) and B(f) is the base polyhedron associated with f (see Theorem 2.5), where we assume
B ( f ) # 0. A feasible p: A + R satisfying (5.lb) and ( 5 . 1 ~ )is called a submodular flow in NS and an optimal solution of the submodular flow problem PS is called an optimal submodular flow in Ns. (The term, submodular flow, was introduced by Zimmermann [Zimmermann82].)
(b) Independent flows Let G = ( V , A ; S + , S - ) be a graph with a vertex set V , an arc set A, a set S+ of entrances and a set S - of exits such that S + , S - C V and S+ n S - = 0. Also let C: A -+ R u ( $ 0 0 ) be an upper capacity function, c: A 4 R U ( - 0 0 ) be a lower capacity junction, and 7 : A R be -+
126
5.1. NEOFLOWS a cost function. Moreover, let (D+,f+) be a submodular system on S+ and (D-,f-) be a submodular system on S - . Denote this network by 11= (G =
(V,A; s+,S-),c,c,7,
(D+,f+),
(D-7
f-1).
The independent flow problem [Fuji78a] is given as follows. For a given v* E R,
PI:Minimize
(5.2a)
7(a)p(u) &A
subject to c ( a ) 5 p(a) 5 Z(u) ( u E A ) , ap(v) = 0 (v E - (S+ u s-)),
v
(5.2b) (5.2~)
( a d s +E P(f+),
(5.2d)
(adS- E ap(s+)= v*.
(5.2e)
w-1,
-
(5.2f)
Here, (dp)" (or (ap)"-)is the restriction of dp: V -+ R to S+ (or S - ) and P ( f + ) and P(f-) are, respectively, the submodular polyhedra associated with (D+,f + ) and (D-,f-). (The original independent flow problem considered in [Fuji78a] is described in terms of polymatroids and is slightly generalized here to submodular systems.) It should also be noted that if the system of (5.2b)-(5.2f) is feasible, we have min{f+(S+),f-(S-)} 2 v* and that letting ( D + , f * + ) and ( D - , f * - ) be, respectively, the (f+(S+) -v*)-truncation of (D+,f+) and the (f-(S-) -v*)truncation of (D-,f-), we can replace (5.2d)-(5.2f) by
Here, note that ap(S+)= --dp(S-) due to ( 5 . 2 ~ ) . A feasible flow p: A -+ R satisfying (5.2b)-(5.2f) is called an independent flow of value V * in NI and an optimal solution of the independent flow problem PI is called an optimal independent flow of value V * in NI. (c) Polymatroidal flows
Let G = (V,A)be a graph with a vertex set V and an arc set A. For each + vertex v E V we consider distributive lattices :D C_ 2b " and 0; C 2"" and submodular functions f: D,r' + R and fu-: D,; -+ R. Here, we assume 0 E D:, 0 E D,- but not necessarily 6+v E D:, 6-v E D,-. Also letc: A 4 R u {-m} be a lower capacity function and 7 : A + R be a cost function. Denote this network by N p = (G = (V,A),c,r,(D:,f:)(v E V),(DJ,fl)(v E V)). 127
111. NEOFLOWS The polymatroidal flow problem [Hassin78,82], [Lawler is given as follows.
P p : Minimize
+ Marte182a,82b] (5.3a)
y(u)p(a) aEA
subject to c ( u ) 5 p(u) (u E A ) , ap(.) = 0 ( v E V ) ,
(5.3b) (5.3c)
(. E V ) ,
(5.3d)
P6-uE P ( f i ) (. E V ) ,
(5.3e)
PO6+'E P ( f 3
where ap is the boundary of p with respect to G, p6+' (or p6-")is the restriction of p: A + R to 6+v (or 6 - v ) , and
P(f:)
= {z
I z E R6+', V X E D::z(X) 5 f : ( X ) } ,
P ( f l ) = {z I z E R6-', V X E Dl:z(X) 5 fi(X)},
(5.4) (5.5)
(The problem is originally defined in terms of polymatroids and is slightly generalized here.) A feasible flow p: A + R satisfying (5.3b)-(5.3e) is called a polymatroidul flow in N p and an optimal solution of Problem P p is called an optimal polymatroidal flow in .Np .
5.2. The Equivalence of the Neoflow Problems
We show the equivalence of the submodular flow problem, the independent flow problem and the polymatroidal flow problem. Here, the equivalence is with respect to the capability of modeling flow problems. Different models may require different oracles for algorithms but we will not go into this matter here. (a) From submodular flows to independent flows Consider the submodular flow problem Ps defined by (5.1). From Theorem 2.5 and the definition of the submodular flow problem we can assume that f appearing in (5.1) is a submodular function on a distributive lattice D C 2" such that 0, V E D and f(0) = f(V)= 0. Let s- be a new vertex not in G = (V,A ) and define s + = v , s- = {s - 1. (54 128
5.2. THE EQUIVALENCE OF THE NEOFLOW PROBLEMS Then the submodular flow problem Ps is rewritten as Minimize
7(a)cp(a)
(5.7a)
subject to ~ ( a5) p ( a ) I Z(a) ( a E A ) ,
(5.7b)
( a d S +E: B ( f ) , - ( a q E (0).
(5.7c)
aEA
(5.7d)
This is an independent flow problem with the underlying graph G’ = (V U { s - } , A ; S + , S - ) . (See (5.2b), (5.2c), (5.2g) and (5.2h), where constraint (5.2~) is void.) It is interesting to see that the submodular flow problem looks like a very special case of the independent flow problem (see Fig. 5.1).
S-
Figure 5.1.
(b) From independent flows to polymatroidal flows Consider the independent flow problem PI described by (5.2). Without loss of generality we assume that there is no arc entering S+ or leaving S- and that f + ( S + )= f - ( S - ) = u * . Let s+ and s - be new vertices not in G = 129
111. NEOFLOWS
(V,A; S+, S - ) . Construct a graph given by
2= (3,A) with vertex set t and arc set A
P = v u {s+,
s-},
(54
A = A u A+ u A -
u AO, AS = {(s+,v) I v E S'},
(5.9) (5.10)
A- = { ( v , s - ) I v E S - } , A' = {(s-,s+)}.
(5.11)
Also define a lower capacity function + R by
?:A
2:
--t
R
(5.12)
U (-00)
and a cost function
(a E A), ( a = (s-,s+)), (u E A+ U A - ) ,
(5.13)
(5.14)
C R6+"and P;
Moreover, for each vertex v E V define polyhedra P: by
P;=
{
4
P-=
R6-"
P(f +)
(v = s+),
(-m,+m)
(v E {s-}
I z E R6+",Vu E 6 + v : z ( u )5 i?(u)} P(f -1
(v E V - S - ) , (v = s-), (v E {s+} U S + ) , (5.16) (v E V - S+),
{z
(-m,+rn)
{z
I z E R6-", Vu E 6 - v : z ( u ) 5 C(u)}
U S - ) , (5.15)
where P ( j S ) and P(f - ) should, respectively, be regarded as polyhedra in RA+ and RA- under the natural correspondence between S+ and A+ and between S- and A - . Now, the independent flow problem PI is rewritten as Minimize
(5.17a)
+(u)p(u) a d
subject to c(a) 5 p ( u ) (u E dp(v) = 0 (v E
a),
t),
P6+"
E P,'
pos-l' E P; 130
(5.17b)
(5.17~)
(v E
ri),
(5.17d)
(v E
P).
(5.17e)
5.2. THE EQUIVALENCE OF THE NEOFLOW PROBLEMS We can easily see that for each v E V there exist distributive lattices 0,' C. 26+u and Du26-u and submodular functions f;: D', -+ R and fu-: 0,- -+R such that 8 E D,S n D;, f$(8) = f i ( 0 ) = 0 and
p,'
= P(f,+),
P,- = P ( f i ) .
(5.18)
Therefore, the independent flow problem is reduced to a polymatroidal flow problem (see Fig. 5.2).
S-
S+
S-
Figure 5.2.
(c)
From polymatroidal flows to submodular flows
Consider the polymatroidal flow problem Pp defined by (5.3). From the underlying graph G = ( V , A ) we construct a graph G' = ( W , A ) with the same arc set, where the vertex set W is given by
W = {w:
I a E A } U {w, I a E A }
(5.19)
and we define d+a = w: and d - a = w, for each a E A in G' (see Fig. 5 . 3 ) . Each arc of G' forms a connected component of G'. Define an upper capacity 131
111. NEOFLOWS function Z': A + R U ( $ 0 0 ) and a lower capacity function c': A + R U { +m} by ~ ' ( a=) +00 (a E A), .'(a) = ~ ( a )( a E A ) . (5.20) Also define
w,'= {w; I a E 6 + U } w, = {w; I a E 6 - u }
(5.21) (5.22)
where 6+ and 6- are with respect to G. Under the natural correspondences between W,,+ and 6+u and between Wv- and 6 - u for each v E V we regard P(fV+)and P(f;) as polyhedra in RwZ and RW;, respectively.
U
G
G' Figure 5.3. Define for each u E V
and let B
C RW be the direct sum of B, B=$ VEV
132
(u E V ) :
B,.
(5.24)
5.3. FEASIBILITY FOR SUBMODULAR FLOWS Then, the polymatroidal flow problem is rewritten as Minimize
(5.25a)
$u)cp(a) aEA
subject to ~ ' ( a5) p(a) 5 i?(u) acpE B.
(u E A),
(5.25b) (5.25~)
Note that y E B, if and only if (5.26) (5.27)
Y ( W 3 + Y(WJ
= 0,
(5.28)
26+u and D; 5 26-". The system of (5.26)-(5.28) is where we regard D Z equivalent to that of (5.26), (5.28) and
For each v E V let 3, be the crossing family defined by
7, = ',D u D, u {W,+u W l } , where
= {W: U (W; - X)
I X E D;},
(5.30)
and let f,: 3, + R be defined by
Then, f,, is a crossing-submodular function on the crossing family 3, and we have B, = B(f,), a base polyhedron. Therefore, it follows from (5.24) and (5.25) that the polymatroidal flow problem P p is reduced to a submodular flow problem. It is interesting to see again that the polymatroidal flow problem looks like a very special case of the submodular flow problem.
5.3. Feasibility for Submodular Flows
In the previous section we have shown the equivalence of the submodular flow problem, the independent flow problem and the polymatroidal flow problem. 133
111. NEOFLOWS Since the submodular flow problem is simple to describe, let us consider as a neoflow problem the submodular flow problem
Ps: Minimize
(5.la)
q(a)p(a) aEA
subject to :(a)
5 p ( a ) 5 C(a) ( a A ) ,
aP E B ( f ) .
(5.lb) (5.1~)
Due t o Theorem 2.5 we assume for simplicity that f is a submodular function on a distributive lattice D C 2v with 0, V E D and f(0) = f(V)= 0. Recall (2.65), where we have shown
~ ~ -2"7 :+ R U ( $ 0 0 ) is the cut function associated with the network ( G = ( V , A ) , c , C )and 8@ is the base polyhedron associated with the cut function I C , , ~ . Hence, there exists a feasible flow 'p for the submodular flow problem if and only if
Here,
U
=
i.e., there exists a common base in B(K~,:) and B(f). From Theorem 4.13 we have
Theorem 5.1 [Frank84]: There exists a feasible Aow for the submodular flow problem Ps satisfying (5.lb) and ( 5 . 1 ~ if ) and only if
vx E D : or
YX E
(KC.E)#(X)5
f( X )
D: C ( A - X ) - c ( A + X ) + f ( X ) 2 0 ,
(5.34)
(5.35)
where for each X C V A + X = { a I a E A , d+a E X , 8 - a E V - X } and A - X = { a I u E A , d - a E X , d+a E V - X } . Moreover, if C, c and f are integer-valued and P.T is feasible, there exists an integral feasible Aow. (Proof) Immediate from Theorem 4.13.
Q.E.D.
A feasible flow for the submodular flow problem can be obtained by the use of the algorithm shown in Section 4.3. 134
5.3. FEASIBILITY FOR SUBMODULAR FLOWS Frank [Frank841 showed feasibility theorems for the cases where f is an intersecting-submodular function and where f is a crossing-submodular function. We can give Frank’s result by combining Theorems 5.1 and 2.6. That is,
Corollary 5.2 [Frank84]: (i) When f is an intersecting-submodular function on an intersecting family 3 such that 0,V E 3 and f(0) = f ( V ) = 0, the submodular flow problem Ps described by (5.1) has a feasible Aow if and only if we have (5.36) for each X
CV
and disjoint X , E 3 (i E I ) such that X = UiErX i .
(ii) When f is a crossing-submodular function on a crossing family 3 such that O,V E 3 and f ( 0 ) = f(V)= 0, the submodular Aow problem PS has a feasible flow if and only if we have (5.37) for each X C V , codisjoint X i C V (i E I ) and disjoint Xij ( j E J ; ) (for each i E I ) such that X = X ; and X ; = UjEJiXij (i E I ) . Note that (ii) of Corollary 5.2 is also expressed in a dual form as follows: Problem PS has a feasible flow if and only if (5.37) holds for each X C V , disjoint X ; C V (i E I ) and codisjoint X ; j ( j E Ji) (for each i E I ) such that X = UiEIXi and X i = X ; j (i E I ) (see the remarks after Theorem 2.6). Since the description of the algorithm for the common base problem given in Section 4.3 depends on the base polyhedron B(f) but not on the submodular function f or the system of linear inequalities expressing B(f), the algorithm also works even if f is a crossing-submodular function on a crossing family, provided that an initial base in B(f) and an oracle for exchange capacities are available. For such an f , however, finding a base in B(f) together with determining the nonemptiness of B(f) is itself a nontrivial problem. For a fixed element e E E l define families 31 = { X I e 4 X E 3} U { V } , 32 = {X I e E X E 3 ) U (0) and = { E - X I X E 32)’ and let f l and f 2 , respectively, be restrictions o f f to 31and 32.Then, 31and are intersecting families and f l and are, respectively, an intersecting-submodular function and an intersecting-supermodular function. Since B ( f ) = B(f1) n B(f2) = B(fl) n B ( f f ) , a vector in B (f) is a common base in base polyhedra B(f1) and B ( f f ) , if both B(f1) and B ( f f ) are nonempty. Since P(f1) (or P(f2#)) is a
njEJi
f2#
135
111. NEOFLOWS submodular (or supermodular) polyhedron, we can find a maximal (or minimal) vector in P(f1) (or P ( f f ) ) by adapting greedy algorithm I1 in Section 3.2.b. A vector in B ( f ) , if any exists, is thus found by the algorithm for a common base problem (see [Frank84], [Fuji87b]). A more direct algorithm is given by [Frank Tardos881 (also see [Naitoh Fuji891) as a bi-truncation algorithm.
+
+
5.4. Optimality for Submodular Flows Consider the submodular flow problem Ps described by (5.1), where ( 0 , f ) is a submodular system on V with f ( V ) = 0. We show the following optimality theorem.
Theorem 5.3: A submodular Aow p: A -+ R for Problem Ps is optimal if and only if there exists a function p: V + R such that, defining rP: A -+ R bY 7*(4 = 7 b )
+ p(d+a) - p ( d - 4
(a E A),
(5.38)
we have for each a E A (5.39) (5.40) and such that the boundary dp: V + R is a maximum-weight base of B(f) with respect to the weight function p. (Proof) The “if” part: By an elementary calculation we have
The “if” part immediately follows from (5.41). We postpone the proof of the “only if” part. Q.E.D.
To prove the “only if” part of Theorem 5.3 we need some preliminaries. Any function p: V + R is called a potential. A potential p satisfying the conditions of Theorem 5.3 is called an optimal potential. Given a submodular flow p, we define the auxiliary network N, = (G, = (V,A , ) , c,, 7,), where G, is the graph with vertex set V and arc set A , given by (5.42) (5.43) A: = { a I a E A , p(a) < Z ( U ) } , B” = {Ti I a E A , c(a) < p(a)} (a: a reorientation of a ) , (5.44) (5.45) c, = { (v) u , v E v,11 E dep(dcp,v) -
A,,, = A ; U B; U C,,
,
w},
I
136
5.4. OPTIMALITY FOR SUBMODULAR FLOWS
c,: A , -+ R is the capacity function given by
-c(a)- d a )
( a E A;) p ( h ) - ~ ( h ) ( a E B;, h(E A ) : a reorientation of a) Z(dP,V,'LL) ( a = (v) E C,),
(5.46)
and 7,:A , + R is the length function given by
h(E A ) : a reorientation of a ) ( a = (u,v) E C,).
(5.47)
We call the graph Gap = (V,C,) the exchangeability graph associated with the base dp of B(f). Also, we call a directed cycle of negative length a negative cycle.
Lemma 5.4: Let p be an optimal submodular Aow for Problem Ps. Then there exists nonegative cycle, relative to the length function y,, in the auxiliary network N,. (Proof) Suppose, on the contrary, that there exists a negative cycle in N,. Let Q be a negative cycle having the smallest number of arcs in 1, and let the arcs in C, lying on Q be given by ( u i , v ; ) (i E I). We show that by an appropriate numbering of arcs (u;,vi)(i E I) the assumption of Lemma 4.5 is satisfied. Suppose, on the contrary, that the assumption of Lemma 4.5 cannot be satisfied by any numbering of arcs ( u ; , v,) (i E I ) , i.e., there are arcs ( u i k , v i k )(k = 1,2,...,p) such that i k E I (k = 1,2,...,p) and (u;,,v;,+,) E C, (k = 1,2,...,p) with i,+l = il. Then for each k = 1,2,.-.,plet Qk be the directed cycle formed by arc ( U ~ ~ , V ; ~ and the path in Q from v;,+~ to uik (see Fig. 5.4). Since y,((uik,vik)) = y , ( ( ~ i ~ , v i ~= + ~0 )(k ) = 1 , 2 , . . . , p ) , we see that I'
%(Qk)
= (P - qhp(Q)
<0
(5.48)
k=l
for some integer q such that 1 5 q < p , where y,(&k) and y , ( Q ) are the lengths of Qk and Q relative to the length function 7., It follows from (5.48) that there is at least one Qk (k = 1,2,. . . , p ) having a negative length and such a directed cycle Qk has a smaller number of arcs than Q . This contradicts the definition of Q , so that by an appropriate numbering of arcs ( u i , v;) (i E I ) the assumption of Lemma 4.5 is satisfied. Define (5.49) a = min{c,(a) I a lies on Q} 137
+ ~ )
111. NEOFLOWS
Figure 5.4. and modify the flow p as
+
~ ( a )Q - a:
(a E A; (‘si E l3;
n Ap(Q)) n A,(Q), ‘si: a reorientation of a )
(5.50)
(0therw ise)
for each a E A, where A , ( Q ) denotes the set of arcs lying on Q . Then p’ is a submodular flow due to Lemma 4.5 and
aEA
aEA
aEA
This contradicts the assumption that p is an optimal submodular flow. ConQ.E.D. sequently, there is no negative cycle in N,. The proof technique concerning (5.48) was originated by the author [Fuji 77a, 77b] for matroids and may be interesting in its own right (also see [Zimmermann821. Now, we show the “only if” part of Theorem 5.3. (Proof of the “only if” part of Theorem 5.3) Let p be an optimal submodular flow for Problem Ps. Then, from Lemma 5.4 there exists no negative cycle in 138
5.4. OPTIMALITY FOR SUBMODULAR FLOWS
the auxiliary network 4, relative to the length function 7,. This implies that there exists a potential p: V + R such that (5.52) for each a E A > . (Such a potential p can be found by a shortest path computation from a fixed origin s outside N,, where the origin s is connected with each vertex of V by a new arc of any finite length, say, zero.) We can easily see that (5.52) for a E A; u implies (5.39) and (5.40) and that (5.52) for a E C, implies P ( 4 2 P(4 .( E dep(&V) - {.I). (5.53)
~ ( f ) ) is a maximum-weight base of B(f) From (5.53) and Theorem 3.16 d p ( B with respect to the weight function p. Q.E.D. It should be emphasized here that Theorem 5.3 is independent of how the base polyhedron B ( f ) is represented by a system of linear inequalities. The proof of Lemma 5.4 suggests an algorithm for solving the submodular flow problem Ps as follows: Starting from an arbitrary submodular flow p, (i) find a negative cycle Q having the smallest number of arcs in the auxiliary network I,, (ii) modify the present flow p along the negative cycle Q as in (5.49) and (5.50), where if a = +m, the problem is unbounded, and (iii) repeat this process until there is no negative cycle in N,. This is the primal algorithm given in [Fuji78a] and [Zimmermann82] (also see [Cui Fuji881). When C, c, f and an initial p are integer-valued, this algorithm terminates after a finite number of steps and finds an integer-valued optimal submodular flow if any optimal one exists. We shall discuss more efficient algorithms in the next section. A necessary and sufficient condition for the existence of an optimal submodular flow is given as follows.
+
= Theorem 5 . 5 : For the submodular flow problem Ps define a network ( G = ( V , q ) , where 6 is a graph with vertex set V and arc set 2 given by
a), A = A* U B* U C ,
A* = { a I a 6 A , E(a) = +m}, B" = {a 1 a E A , ~ ( a=) -m} ( E : a reorientation of a ) ,
c = { ( u , v ) I u , v , E v,vx E D: (v E x =+ 139
uEX)}
(5.54) (5.55) (5.56) (5.57)
111. NEOFLOWS and fi: 2 -+ R is the length function defined by
Z(E A ) : a reorientation of u) (u E
’
(5.58)
C).
Suppose that there exists a submodular flow for Problem Ps. Then, there exists an optimal submodular flow for Problem Ps if and only if there is no negative cycle in relative to the length function fi.
fi
(Proof) The “if” part: Suppose that there is no negative cycle in -+ R such that for each B E
A
fi. Then there is a potential p : V fip(B) where
k relative to
= +(a) + p(d+B) - p ( 8 - B ) 2 0,
and d - are with respect to
&.
(5.59)
(5.59) implies
A , Z(U)= +&),
?(a) t p(d+t) - p ( a - t )
20
(U E
7 ( u ) t p(a+a) - p ( a - a )
50
( a E A , e(a) = -m),
P(4 2 P ( 4 ((w) E C).
(5.60) (5.61) (5.62)
Since (5.1~)is expressed as (5.63) (5.64) where (5.64) is void, and since
dP(X) = P(A+X) - cp(A-x),
(5.65)
) replaced by (5.63) the linear-programming dual of Problem Ps with ( 5 . 1 ~being is described as
140
5.4. OPTIMALITY FOR SUBMODULAR FLOWS
5
where [, A t R , v: D t R and we should regard (5.66a) as the objective function with the terms ( ( a ) c ( a )( a E A , c ( a ) = -00) and T ( a ) Z ( a ) ( a E A , Z(a) = +00) being suppressed. Because of (5.62) there is a minimal chain
c: 0 = s, c s1 c . * . c s,
=v
(5.67)
of D such that p is constant on each quotient Sk - s k - 1 (k = 1,2,. this chain C and the potential p , define v(sk) =Pk-Pk+l
- - ,n). Using
(k=1,2,*",n-l),
(5.68)
where p k is the value of p taken in s k - S k - 1 and note that define v ( X ) = 0 for other X E D. Moreover, define
v(&) > 0.
+ p(a+a) - p ( 8 - a ) ( a E A , Z(a) = ((4= -7b) - P P + a ) + p ( a - a) ( a E A , S(U) = - w ) ,
( ( a ) = $a) -
$00)~
Also
(5.69) (5.70)
arc a E A with ?(a) < +00 and c ( a ) > -00 define ( ( a ) and ( ( a ) such that ( ( a ) , ?(a) 2 0 and (5.66b) holds. We can easily see that thus defined [, 7, 9 satisfy (5.66b)-(5.66e), where note that for each arc a E A with Z(a7 = $00 and c(a) = -00 we have 7(a) p ( a + a ) - p ( a - a ) = 0 due t o (5.60) and (5.61). Since the dual of Problem Ps has a feasible solution and the feasibility of the primal problem Ps is assumed, ther,e exists an optimal solution of Problem
-and for each
+
Ps . The "only if" part: Suppose that there is a negative cycle in k relative to the length function 9 , and let Q be such a negative cycle in k . Then for any positive a,if we define p': A -+ R by (5.50), p' is feasible for Problem Ps because of the definition of fi and we have (5.51). Since a! (> 0) is arbitrary Q.E.D. and rP(Q)< 0, Problem Ps is unbounded. We also have
Theorem 5.6 [Edm+Giles77]: The system of linear inequalities (5.71)
aP(x)(=LP(A+X)
-
P(A-x)) 5 f ( X ) (X E
D)
(5.72)
for the submodular flow problem Ps is totally dual integral. (Proof) Consider a submodular flow problem P_s such that the coefficients ~ ( a ( )A E A ) of the objective function are integers and that there exists an 141
111. NEOFLOWS optimal submodular flow p. From the proof of the “only if” part of Theorem 5.3 there exists an integral potential p : V -+ Z such that for each a E A
Pb) I P(4. Let pl
> pa >
+
*
(5.75)
> pl be the distinct values of p ( u ) (u E V ) and define
W; = { u I u E
v,p ( u ) 2 pi)
(2
= 1,2,-,1).
(5.76)
From (5.75) we have
For the dual problem Ps* in (5.66) of
Ps define
t ( a ) = max{o,rp(a)) -
(a E A),
t(4 = max{O, -rp(a))
(a E A), (i = 1 , 2 , * * , I - l ) ,
Q(W;)= p; - p i t 1 q ( X ) = 0 for other X E D. We can easily see that these (, from (5.73), (5.74) and (5.76),-
aEA
7 and Q
-
satisfy (5.66b)-(5.66e).
(5.78) (5.79) (5.80) (5.81) Moreover,
XED
aEA 1-1
aEA
i=l 1-1
aEA
i= 1 1
aEA
i=l
aEA
UEV
(5.82) 142
5.4. OPTIMALITY FOR SUBMODULAR FLOWS
where Wo = 0, we put 0 x ( f o o ) = 0, and note that dp(W1) = d p ( V ) = 0. Therefore, (, 7 and 7 defined by (5.78)-(5.81) form an integral optimal solution Q.E.D. of the dualproblem Ps * . It follows from Theorems 5.6 and 2.6 that if we replace f in (5.72) by a crossing-submodular function on a crossing family, the system of (5.71) and (5.72) is totally dual integral.
Corollary 5.7 [Edm+Giles77]: For a crossing-submodular function f on a crossing family 3 2 2 v , the system of inequalities
is totally dual integral. The proof of Theorem 5.6 shows the way of constructing an optimal dual solution [, 7 and 7 from an optimal potential p when f is a submodular function on the distributive lattice D. When f is a crossing-submodular function on a crossing family 3 C 2v with 0, V E 3 and f(0) = f(V)= 0, suppose that B ( f ) = B(f2) for a submodular system (P,f2) (see Theorem 2.6) and that we are given an optimal submodular flow p and an optimal potential p . Also suppose that for each W; in (5.77) we have the expression of fZ(W;) in terms of f as follows (see Theorem 2.6).
where V - ( U k E K Xi .; j k ) ( j E J;) form a partition of V - Wi for each i = 1 , 2 , - - - , 1- 1 and k;jk (k E Kij) are disjoint sets (as subsets of in 3 for each i = 1,2,-.’,1- 1 and j E J;. Here, note that f i ( V ) = 0. Then define for each X E 3
v)
v ( X )= C { p ; - p ; + l
I
X = Xijk, i E {l,*-.,l-l}, j
E J;, k E K i j } , (5.86)
where the summation over the empty set is equal to zero. (Note that if all the X i j k are distinct, we have v ( X ; j k ) = p i - p ; + l (i E {1,..-,1- l}, j E J;, k E Kij) and v ( X ) = 0 for other X E 3.) E , 2 defined by (5.78) and (5.79) and this v form an optimal dual solution, since (5.82) with D replaced by 3 holds. We show the way of finding an expression in (5.85) for each i = 1, 2, . . . , 1 - 1. Put x = 8p and define
3(.)
=
{ X I X E 3, .(X) = f(X)}. 143
(5.87)
111. NEOFLOWS We assume an oracle which, for each ordered pair ( u , v ) of distinct vertices u , v E V , gives a u-’u cut X E 3(z) or answers that there is no u-v cut X E 3(z), where a u-v cut is a set X C V such that u E X and v 4 X . Let G = (V,A ( z ) ) be the exchangeability graph associated with z, i.e.,
where note that ( v , u ) E A ( z ) if and only if u # v and there is no u-v cut X E ?(z). Choose any Wi (i = 1,2, - .. , 1 - 1). Let Gi be the graph obtained from G by deleting all the vertices in W; together with the arcs incident to W i , and let U;j ( j E Ji) be the vertex sets of the connected components of G;. Note that A+Uij = 0 ( j E J i ) (5.89) in G, so that
fz(V
-
Uij)= z ( V - Uij) ( j E Ja).
(5.90)
Therefore, for each u E V - Uij and v E Uij there exists a u-v cut X E 7 ( s ) . If X n Uij # 0, choose v’ E X n Uij such that for-some w E U,j - X we have ( v ’ , w ) E A ( z ) . Such a v’ exists since Uij - X # 0 and U i j is the vertex set of a connected component of G ; . There exists a u-v’ cut X’ E ?(z) and either X and X’ cross or X’ C X , due to the way of choosing v’. (Note that u E X n X ’ , v’ E X - X’ and w 4 X U X’.) Since 3(2) is a crossing family, we have X n X’ E 3(z) and u E X n X’ c X . Put X t X n X‘. If X n U;j # 0 , then repeat this process until we have X n U,j = 0. After repeating this process at most lUijl times we have X E 3(z) such that u E X C: V - U i j . Denote this X by X u . In this way, for each u E V - Uij we can find X u such that u E X u L V - Uij. Consider the hypergraph H = (V - U i j , { X u I u E V - Uij}) and let Xijk ( k E K;j) be the vertex sets of the connected components of H . Since 3(z) is a crossing family of subsets of V , we have Xijk E 3(2) ( k &). Consequently,
(5.91)
since z ( V ) = 0 . When f is an intersecting-submodular function on an intersecting family 3C - 2” with 0, V E 3 and f(0) = f ( V ) = 0 , for each u E W; we can find X,, E 3(2) such that u E X,, & W;. (Note that for any v , v’ E V - Wi there exist a u-v cut X E 3(z) and a u-v’cut X‘ E 3(2) and that u E X n X ‘ E 3(z) 144
5.4. OPTIMALITY FOR SUBMODULAR FLOWS since 3(2)is an intersecting family.) Let X;, ( j E J;) be the vertex sets of the connected components of the hypergraph H = (W;,{Xu I u E W;}).Then Xij E 3(s) and we have (5.92) j E Ji
where f1 is the submodular function appearing in (i) of Theorem 2.6. Let G = ( V , A ) be a connected (directed) graph. In Section 3.3.b we considered strongly k-connected reorientations of G for a positive integer k. For the crossing-submodular function ~ ( ~ 2” 1 : t R defined by (3.105), the problem of finding a minimum-cost strongly k-connected reorientation ’p: A + ( 0 , l ) is described in a relaxed form as
P(”):Minimize
(5.93a)
7(u)’p(u) aEA
subject to 0 5 p(u)
I1
(u E A ) , I “ @ ) ( U ) (U V ) ,
c
(5.93b) (5.93c)
where 7: A --f R is a cost function and p: A + R is not restricted to integral vectors. Due to Corollary 5.7 the polyhedron of feasible flows ‘p is integral and there exists an integral optimal solution, if any optimal solution exists, which is a minimum-cost strongly k-connected reorientation of G. Problem (5.93) is a typical example of the submodular flow problem (see [Frank80,8lc]). Note that Problem (5.93) with k = 1 has a feasible flow if and only if G is connected and has no bridge. For a connected graph G = ( V , A ) let D(G) be the set of all the ideals of G, i.e., (5.94) D(G)= {W I W & V , A+W = 0).
For each W E D(G)- { S , V ) we call A - W ( # 0) a directed cut. A subset B of arc set A is called a directed-cut covering of G if B has nonempty intersection with each directed cut of G. C. L. Lucchesi and D. H. Younger [Lucchesi + Younger781 consider the problem of finding a minimum directed-cut covering, which is formulated in a relaxed form as
DCC: Minimize
p(u)
(5.95a)
aEA
subject to 0 5 p ( u ) I 1 (u E A ) , ~(P(W I )“ ( l ) ( W ) (W E D(G)),
(5.9513) (5.95c)
where ~ ( lis) defined by (3.105) with k = 1. It should be noted that K ( ’ ) is a crossing-submodular function on D (G). The linear-programming dual of DCC 145
111. NEOFLOWS is given by
DCC*:Maximize
c
E(u) -
-
v(W)n(’)(W)
(5.96a)
WED(G)
aEA
subject to - ( ( u )
+ c { v ( W )I W E D(G),u E A-W}
51
(u E A ) ,
(5.96b)
E,
rl
2 0.
(5.96~)
Therefore, Problem DC C* becomes Maximize
c
min(0, [l-
aEA
+
c
c { v ( W ) I W E D(G),u E A-W}]} rl(W
W E D ( G )- (@?V}
subject to 7 2 0.
(5.97a) (5.97b)
We see from (5.95)-(5.97) and Corollary 5.7 that there exists an integral optimal solution 77 of dual DCC* such that (1) v ( W ) E {0,1} (W E D(G)- { 0 , V } ) and (2) directed cuts A-W with v ( W ) = 1 are disjoint. Consequently, we have the following theorem.
Theorem 5.8 [Lucchesi+Younger78]: The minimum cardinality of a directedc u t covering of G is equal to the maximum cardinality of a set of disjoint directed cuts of G. 5.5. Algorithms for Neoflows We consider maximum neoflow and minimum-cost neoflow problems and show algorithms for these problems.
(a) Maximum independent flows The maximum flow problem for neoflows seems to be easily formulated by the independent flow problem. The man’mum independent flow problem is:
P M I :Maximize ap(S+) subject to ~ ( u 5) p ( u ) 5 C(U) ( a E A ) , ap(v) = 0 ( v E v - (S+ u s-)),
( a d s +E P ( f + ) , - (spy- E W - ) , 146
(5.98a) (5.98b) (5.98~) (5.98d) (5.98e)
5.5. ALGORITHMS FOR NEOFLOWS where we consider the same network described in Section 5.1.b. We denote (d’p)” and - ( d ~ ) ~by- d + p and d-p, respectively, in the following. We suppose there exists a feasible flow. A feasible flow, if any exists, can be found by adapting the algorithm in Section 4.3 as discussed in Section 5.3 for submodular flows. Given a feasible flow (o, we define an auxiliary network N, = (G, = (V U { s ~ , s - } , A , ) , c , , s + , s - ) with source s+ and sink s- as follows. G, is the underlying graph with vertex set V U {s+,s-} and arc set A, defined by
A , = S; US;
u ~ ufA;
S$ = { ( s + , v )
S; = { ( v , s - )
UA; u B;,
(5.99)
I v E S+ - sat+(d+(o)}, I v E S- - sat-(d-’p)},
(5.100) (5.101)
Af = { ( u , v ) I v E sat+(d+(o), u E dep+(a+(o,v) - { v } } , (5.102) A; = { ( v , u ) I v E sat-(d-(o), u E dep-(d-p,v)
-
{v}},
A; = { u I u E A, p(a) < Z ( u ) } , B:, = {z I a E A, ( o ( 4 > c(4h
(5.103) (5.104) (5.105)
where ii denotes a reorientation of a, sat’ (sat-) is the saturation function with respect to ( P + , f + ) ((P-,f-)) and dep+ (dep-) is the dependence function with respect to (P+,f+)((P-,f-)).Also, c,: A, -+ R is defined by
C,(4
=
I
c*+(d+p,v) 2- (a-p, v )
( u = (s+,v) E SL)
E+(d+p, v , u )
(U = ( u , v )
E-(a-p,v,u)
(u = ( v , u ) E A;)
Z(4 - P ( 4
( a E A;) ( u E BG, E ( E A ) : a reorientation of a ) ,
(o@)
- c(Z)
( u = ( v , s-) E )S ;
E A:)
(5.106)
where 2+ (or c*-) denotes the saturation capacity associated with (D+,f+) (or (P-,f-)) and E+ (or c“-) the exchange capacity associated with ( P + , f + ) (or
(D-,f-)). We suppose that a feasible flow (an independent flow) ‘p is given. An algorithm for finding a maximum independent flow is furnished as follows (cf. [Fuji78a], [Lawler + Martel82a1, [Schonsleben80]). We assume oracles for saturation capacities and exchange capacities. A n algorithm for finding a m a x i m u m independent flow I n p u t : an independent flow p in network N = (G = ( V , A ; S + , S - ) , c , F , (P+,f+),(P-,f-))and a fixed numbering 71: V + {1,2,...,IVI}. 147
111. NEOFLOWS
Output: a maximum independent flow p in N . Step 1: While there exists a directed path from S+ to s- in the auxiliary network N, = (G, = (V u {s+, s-}, A , ) , cv, s+, s-), do the following. (*) Find a lexicographically shortest path P from s+ to s- in N, and put (Y
+ min(c,(a)
I a:
an arc lying on P } ,
(5.107) (5.108)
a ( € A ) : a reorientation of E ) .
In this algorithm a lexicographically shortest path from s+ to s- in N p is a directed path from s+ to s- in N, which has the minimum number of arcs among directed paths from s+ to s- in A, and whose vertex sequence (s+, v 1 , ,vp, s-), say, gives the lexicographically minimum sequence ( ~ ( v l )-,.. ,7r(vp)) among directed paths from s+ to s- in N, having the minimum number of arcs. Also, A , ( P ) is the set of arcs, in A,, lying on P . The validity of the above algorithm can be shown almost in the same manner as in the proof of the validity of the algorithm for the intersection problem in Sections 4.1.b and 4.l.c. The algorithm described above finds a maximum independent flow after repeating (*) a t most IV times. The maximum independent flow problem naturally includes the intersection problem, where the underlying graph is the bipartite graph representing the bijection between S+ and S - . e
+
-
l3
Theorem 5.9 (The maximum-independent-flow minimum-cut theorem) (cf. [McDiarmid75], (Fuji78al): Suppose that the maximum independent Aow problem PMI has a feasible Aow. Then we have max{8p(S+) I p is an independent flow (satisfying (5.98b)-(5.98e))} = min{f+(S+ - U ) +Z(A+U) - c ( A - U ) f-(S- n U ) I U C V } , (5.109)
+
where A + and A- are with respect to G and we regard f+(X) = +co (f-(Y) = +a)if X 4 D+ (Y $ P-). Moreover, if c, c, f + and f - are integer-valued and Problem P M I is feasible, then there exits an integral maximum independent Aow for Problem PMI.
(Proof) Let p* be a maximum independent flow for Problem P M ~and , consider .Nv* associated with p*. Let U* be the set of vertices the auxiliary network 148
5.5. ALGORITHMS FOR NEOFLOWS
in V which are reachable by directed paths from s+ in the auxiliary network .Alp*.Then we have (5.110) S+ - U* c sat+(d+p*), S-
n U* g sat-(d-p*)
(5.111)
since there is no directed path from s+ to s- in .Alp*. Hence, by the definition of u*, (5.112) dep+(d+p*,v) g S+ - U* (v E S+ - U * ) , dep-(d-cp*,v) C S-
n U*
(v E S-
nU*),
(5.113)
(aE A+U*),
(5.114)
p*(a)= ~ ( a ) ( a E A - U * ) .
(5.115)
p*(a)= E(a)
From (5.112) and (5.113),
a+p*(S+ - U * ) = f+ (S+ - U * ),
(5.116)
a - p * ( ~n- u*)= f-(S- n U * )
(5.117)
due to Lemma 2.1. From (5.114) and (5.115),
p*(A+U*) = E(A+U*), p*(A-U*) = c ( A - U * ) .
(5.118)
It follows from (5.116)-(5.118) that
+
+
dp*(S+)= a+p*(S+- U * ) p*(A+U*) - p*(A-U*) a-p*(S- n U * ) = f+(S' - U') + C ( A + U * ) - c ( A - U * ) f-(S- n U*). (5.119)
+
On the other hand, for any independent flow p and any subset U of V we have
(5.109) follows from (5.119) and (5.120). Moreover, if ?, c, f+ and f- are integer-valued and Problem P M is ~ feasible, then there exists an integral feasible flow due to Theorem 5.1. Starting with an integral feasible flow, the algorithm finds an integral maximum independent Q.E.D. flow in finitely many steps.
+ +
We call any U V a cut and the value f+(S+- U )+?(A+U) - s ( A - U ) f-(S- n U ) the capacity of the cut U . Theorem 5.9 is a generalization of the classical max-flow min-cut theorem for ordinary capacitated networks [Ford Fulkerson621. 149
111. NEOFLOWS Theorem 5.9 (for polymatroids) was shown non-algorithmically by McDiarmid [McDiarmid75] and algorithmically by the author [Fuji78a]. Now, consider a supermodular system (D+, )'9 on S+ instead of submodular system (D+,f + ) and also consider the following system of inequalities.
a t p ( ~= )
o
(V
E
(ad"'
v - (s+n S - ) ) , P(S++),
E
-(ads-
(5.122) (5.123) (5.124)
E p(f-1,
where P(g+) is the supermodular polyhedron associated with we have the following theorem.
(D+,g+).
Then
Theorem 5.10: There exists a feasible flow p satisfying (5.121) - (5.124) if and only if we have for each U C V such that S+ n U E D+ and S- fl U E D-
g+(s+n u) - f-(S-
n U ) I c(A+U) - c(A-u)
and for each U C V such that S+ U S0
(5.125)
CU
5 C ( A + U )- c ( A - U ) .
(5.126)
Moreover, if there exists a feasible flow and C , c, g+ and f - are integer-valued, then there exists an integral feasible Aow. (Proof) Define
D & 2"
and g: D
D ={U I U CV, g(U)=
{;
+ S+
R by
-+
S+nUE
n U ) - f-(S- n U )
(
,'D
S-nU E (U E (U E
D-},
D, (S+ u S - ) D, S+ u S-
(5.127) -
U
U).
# 0) (5.128)
If there is a feasible flow, we must have
g+(S+)- f-(s-) L 0.
(5.129)
Also, (5.125) with U = V implies (5.129). Therefore, we assume (5.129). Due to (5.129), the function g: D -+ R defined by (5.128) is a supermodular function on the distributive lattice D with 0, V E D and g(0) = g(V) = 0. We have 3: E B(g) if and only if zs+ E
P(g+), -5s- E p ( f - ) , 150
,v-(s+Us-)
- 0,
(5.130)
5.5. ALGORITHMS FOR NEOFLOWS
.(V) = 0.
(5.131)
It follows from (2.65) and Theorem 4.13 that there exists a feasible flow satisfying (5.121)-(5.124) if and only if (5.132) We see that (5.132) is equivalent to (5.125) and (5.126), under condition (5.129). The integrality part of the present theorem follows from the counterpart Q.E.D. of Theorem 4.13. Theorem 5.10, where c = 0, Z 2 0, f - is a polymatroid rank function and g+ is the dual supermodular function of a polymatroid rank function, is shown in [Fuji78d]. The discrete separation theorem also follows from Theorem 5.10. The readers may deduce Theorem 5.10 from the feasibility theorem for submodular flows (Theorem 5.1).
(b) M a x i m u m s u b m o d u l a r flows For the submodular flow problem with the network J/s = (G = ( V , A ) , c , Z,7,(P,f)),disregarding the cost function 7 , let us consider a “ma-flow” problem PMS given as follows [Cunningham + Frank851. Let uo be a fixed reference arc in A .
P M S :Maximize p(u,-,)
(5.133a) (5.133b) (5.133~)
subject to ~ ( u5) p(a) 5 C(u) ( u E A ) , dP E B ( f ) .
We assume that there is a feasible flow in .Us and define the auxiliary network J/, = (G, = (V,A , ) , c,) associated with p as follows. G, is the graph with vertex set V and arc set A , defined by (5.42)-(5.45) except that
B*9 = { a I a E A and c,: A ,
-+
- {Q}, c ( u ) < p ( u ) }
(5:a reorientation of
u)
(5.134)
R is defined by (5.46).
An algorithm for finding a m a x i m u m s u b m o d u l a r flow I n p u t : a feasible flow p in J/s with reference arc Q and a vertex numbering 7r: V -t {1,2,. - ., IVl} which defines the lexicographic ordering among directed paths from d-a0 to d+ao in A’,. O u t p u t : a maximum submodular flow p in A’s with reference arc aa. 151
111. NEOFLOWS Step 1: While p(uo) < C(uo) and there exists a directed path from ~3-mto a+ao in the auxiliary network N,, do the following. (*) Find the lexicographically shortest path P from d-uo to d+ao in Np and let Q be the directed cycle formed by P and reference arc m.Put a! t
min{c,(a)
{ where if large). (End)
I a lies on Q},
+
~ ( a ) ( a E A; n A p ( Q ) ) p(a) - a! (E E B; n A,(Q), a: a reorientation of a E A),
a! =
+m, then stop (the flow value
cp(a0) can
be made arbitrarily
Here, A,(Q) is the set of the arcs in A , lying on Q and the lexicographically shortest path P is the directed path from d - m to d'ao in Np which has the minimum number of arcs among directed paths from a-ao to d+ao in Np and whose vertex sequence ( d - a o , v1, 'up, d+tao), say, gives the lexicographically minimum sequence ( m ( v l ) , .- ,m(vp)) among directed paths from d - m to d+ao in 1, having the minimum number of arcs. The algorithm terminates after repeating (*) O( IV 13) times. The analysis is by the same technique as shown in Section 4.l.c. a ,
Theorem 5.11: For the maximum submodular flow problem PMS described by (5.133), max{p(ao) I p is a feasible flow in 1 s ) = min{Z(ao), min{Z(A-X) - c ( A + X - {Q})
+ f ( X ) I X E D, uo E A+X}}.
(5.135)
Moreover, if c, Z and f are integer-valued and there exists a maximum submodular flow, then there exists an integral maximum submodular flow in Ns. (Proof) The maximum flow value is equal to the maximum value of c(u0) under the constraint that the network NS has a feasible flow, where c(ao) is regarded as a variable. Therefore, relation (5.135) is deduced from Theorem 5.1. The integrality property also follows from Theorem 5.1. Q.E.D. Theorem 5.11 can also be shown algorithmically. When the algorithm terminates with a maximum flow p* such that p*(ao) < Z ( a o ) , let U be the set of the vertices which are reachable by directed paths from a-ao in the then obtained N,.. From the assumption, %a,-, E U and d+ao $ U. Define W = V - U . It follows from the definition of N,* that p*(a) = Z ( a )
(u E
152
A-W),
(5.136)
5.5. ALGORITHMS FOR NEOFLOWS
cp*(a) = c(a) ( a E A+W - { a o } ) , dep(+*,v) G W
(. E W ) ,
(5.137) (5.138)
where A+ and A- are with respect to G. From (5.138) we have dP*( W ) = f ( W .
(5.139)
and from (5.136) and (5.137)
a'p*(W)=p*(ao) + c ( A + W
- {uo}) -E(A-W).
(5.140)
Combining (5.139) with (5.140), we get P*(Q) =C(A-W) -c(A+W - { a o } )
+ j(W).
On the other hand, we can easily see that for any feasible flow and any X E D with a0 E A + X , ~ ( a o5 ) C(A-x)
-
c(A+X
+f(X).
- {Q})
(5.141) 'p
in Ns
(5.142)
From (5.141) and (5.142) we have (5.135), where the case when p*(ao) = C(ao) is taken into account. Moreover, the integrality part of Theorem 5.11 follows from the fact that if Problem PMS is feasible and c, C and f are integer-valued, $here exists an integral feasible flow and, starting from such an integral feasible flow, we get an integral maximum submodular flow by the algorithm if a maximum submodular flow exists. When p*(ao) < E(ao),the above defined V (= V - W ) is called a minimum cut in Ns with reference arc ao. We can also consider the minimum submodular flow problem which is to minimize 'p(a0) subject to (5.133b) and (5.133~). Note that this problem is a maximum submodular flow problem when we consider the dual order <* among R. Hence an algorithm for the minimum submodular flow problem is given rnutatis rnutandis. ( c ) Minimum-cost submodular flows
As a minimum-cost flow problem for neoflows, consider the submodular flow problem P s , described by (5.1), in network Ns = (G = ( V , A ) , g , C , 7 ,
(D,f))* Ps: Minimize
7(a)p(a)
(5.la)
subject to c(u) 5 p( u) 5 C(u) ( a E A ) ,
(5.lb) (5.1~)
&A
aP E B ( f ) , 153
111. NEOFLOWS where ( 0 , f ) is a submodular system on V , the vertex set of the underlying graph G = (V,A ) . We suppose that Problem Ps has a feasible flow. We shall show an algorithm which tries to find a feasible flow p : A -+ R and a potential p : V -+ R satisfying the optimality condition of Theorem 5.3. Suppose that we are given a feasible flow p in Ns. Choose a potential p:V -+ R such that dp E B(f) is a maximum-weight base of B ( f ) with respect to the weight function p , i.e., p ( u ) 2 p ( v ) for each u , v E V with u E dep(dp,v) - { v } . For example, p = 0 (the zero function) satisfies this = (G, = (V,A , ) , c,, 7 , , p ) requirement. We define the auxiliary network as follows. G, is the graph with vertex set V and arc set A , defined by (5.42)-(5.45) and c , : A , -+ R is defined by (5.46). We define 7 , , p : A , + R by (5.143) 7,,p(a) = 7 , ( 4 + p(d+a> - p ( 8 - a ) ( a E A , ) , where 7 , is given by (5.47). Now, an algorithm based on Cunningham and Frank's [Cunningham Frank851 is given as follows. Also, compare it with an out-of-kilter method given in [Fuji87b].
+
An algorithm for finding an optimal submodular flow Input: a feasible flow p in Ns and a potential p : V -+ R such that p ( u ) 2 p ( v ) ( v E V , u E dep(dp,v) - { v } ) . Output: an optimal submodular flow p and an optimal potential p . Step 1: While y,,p(a) < 0 for some a E A , , choose an arc u0 E A , such that 7,,P(ao)< 0 and do (1-1) or (1-2) according as uo E A or a0 4 A : (1-1) If a' E A , then, while p(ao)< e(ao) and r p ( a O ) ( =7 ( a o )+ p ( d S a o ) p ( a - a o ) ) < 0, do the following: (*) Starting with p, find a maximum submodular flow po in the modified network N o = (G = ( V , A ) , ~ o , $ y ( D o , f owith ) ) the reference arc a', where the lower and upper capacity functions go, $ : A -+ R are defined by co(uo) = p(a') , $' (a') = i?(ao) and for each a E A - { a o }
(5.145) (where yP(a) = 7 ( a ) + p ( d + a )- p ( d - a ) with d+ and 8- being defined with respect to G) and, letting p l > p2 > > pt be the distinct values of p ( v ) (v E V ) and defining
---
154
5.5. ALGORITHMS FOR NEOFLOWS
so = 0,
(5.147)
( P O , f O ) is the submodular system given by the direct sum
(5.148) o f t h e s e t minors (P,f).Si/S,_l ( i = 1,2,.-.,t). If the maximum flow value is equal to +oo, then stop (Problem PS is unbounded). Otherwise put p t po. If po(ao)< i?(ao),then for the auxiliary network N$ associated with the current submodular flow poin N o let U be the set of the vertices which are reachable by directed paths from F a o in A:,, , define
I a E A-U, I a E A+U,
rP(a)< 0}, y P ( a )> O } ,
(5.149) (5.150) H2 = {a ( A + , A- in (5.149) and (5.150)are with respect to G.)
HI = { a
H3 = {(ti,.)
I
uE
U,
21
E V - U, u E dep(dcp,v) - { v } } ,
(5.151) (dep is with respect to (P,f).) p* = min{min{17P(a)I I a E H1 U H 2 } , min{p(u) - P ( 4
I
(u,u) E
Hd},
(5.152)
and put P(4
+
P ( 4 - P* .( E U ) .
(5.153)
(1-2)If a0 6 A, let $ E A be the reorientation of uo and, starting with p, find a minimum submodular flow po,with reference arc Z', in the modified network N o as defined in Step (1-1). Carry out Step (1-1) mutatis mutandis. (End) It should be noted that when we carry out (5.149)-(5.153), we have H1 U H2 U H3 # 0 since u0 E H I , and that p* in (5.152) is a finite positive number. For a feasible flow p and a potential p, if an arc a E A satisfies (i) +,(a) > 0 and c(a) < p(a) or (ii) 7p(a)< 0 and p(a) < i?(a),then arc a is said to be out of kilter and otherwise in kilter with respect to p and p. Denote the set of all the out-of-kilter arcs by Ao(p,p) and that of all the in-kilter arcs by A'(p,p). During the execution of the algorithm, (1) in-kilter arcs remain in kilter, (2) the value of 17p(~)l of each out-of-kilter arc a is monotone non-increasing, 155
111. NEOFLOWS (3) 17p(~0))l decreases every time the potential p is changed by (5.153) in Step (1-1) or (1-2), and (4) p is a submodular flow in NS and ap is a maximum-weight base of B(f) with respect to weight function p . Therefore, when the cost function 7 is integer-valued, the algorithm terminates I ~ ( u ) I maximum and minimum submodular-flow computaafter at most CaEA tions in Step (1-1) and Step (1-2) if we start with the initial potential p = 0. Moreover, for any real cost function 7 the algorithm also terminates after finitely many steps, as shown below. While the inner cycle (*) of Step (1-1) is repeated, potential p(a+ao) remains fixed and p ( d - a O ) is decreased by (5.153), where note that for each ( u , v ) E H3 we have p ( u ) > p ( v ) . Every time we change the potential p by (5.153), one of the following two cases occurs:
(i) at least one arc in HI U H2 becomes in kilter, (ii) the set of the vertices which are reachable from a-ao in the new auxiliary network N$ is augmented. Note that the total number of the occurrences of Case (i) is at most / A [and that Case (ii) occurs consecutively at most IVI - 1 times without increasing po(ao). While p(ao) < Z(ao) (when rp(a0)< 0), the potential difference p(a-ao) p ( d + a O )is decreased by (5.153) between the successive maximum submodular flow computations. Since the potential difference p ( d - a O )-p(a+ao) is equal to the length of a directed path from %ao to d+ao relative to the length function 7,, since the length of such a path relative to 7,,p is equal to zero. Since the number of distinct lengths, relative to length function 7 p , of elementary directed paths from a-ao to a+aO in any possible auxiliary networks is finite, p ( u o ) is increased only finitely many times. This argument can be applied to Step (1-2) mutatis mutandis. It follows that arc u0 becomes in kilter or we discern that Problem Ps is unbounded, after finitely many steps. If the algorithm terminates with p and p such that 7p,p(a) 2 0 for all a E A,, then the obtained 'p is an optimal submodular flow and p is an optimal potential due to Theorem 5.3. If the maximum flow value for the modified network N o in Step (1-1) is equal to +oo, then there exists a directed cycle Q, containing ao, in the auxiliary network N: = (GO, = (V,A,),c:) associated with the current p such that +(a) = +oo for any arc a in Q. By the definition of N o , Q is also a directed cycle in the auxiliary network N, = (G,,c,,7,) for the original submodular flow problem Ps and the length of Q relative to the length function 7, is negative. Moreover, p remains to be a (feasible) submodular flow if we change p along the directed cycle Q by an arbitrary positive amount of flow value. (Here, the condition in Lemma 4.5 for multiple exchanges is not required since +(a) = +oo for any arc a E C, in Q.) Consequently, Problem Ps is unbounded, i.e., the value of the objective function of Ps is made arbitrarily 156
5.5. ALGORITHMS FOR NEOFLOWS
small. Also, the same argument is valid mutatis mutandis for Step (1-2). Given an optimal submodular flow p and an optimal potential p , we can find an optimal dual solution of Ps by the procedure given in Section 5.4 when the base polyhedron is expressed in terms of a submodular function on a distributive lattice, or an intersecting- or crossing-submodular function on an intersecting or crossing family. When the cost function 7 is integer-valued and an oracle for exchange capacities is assumed, the above algorithm requires time polynomial in IVI, IA( and max{l7(a)I I a E A } , i.e., it is a pseudopolynomial algorithm. Applying the cost-scaling technique for the ordinary min-cost flows of Rock [Rock801 to submodular flows, we obtain a polynomial algorithm for the submodular flow problem ([Cunningham Frank85]), provided that an oracle for exchange capacities is available. Without loss of generality we assume c(a) > --oo for each a E A .
+
A cost-scaling algorithm
Input: a feasible flow p in .Ns, a potential p = 0 , and C s max{ I7(u) I I a E A } , where the cost function 7 is integer-valued and C > 0. Output: an optimal submodular flow p and an optimal potential p . Step 0: Discern whether Problem PS is unbounded. If it is unbounded, stop; otherwise put k t [log, Cl. Step 1: While k 2 0, do the following: (1-1) For each a E A put + ( a ) t [7(a)/2'1. (1-2) By the pseudopolynomial algorithm given above, starting from the current flow p and potential p , find an optimal submodular flow p* and an optimal potential p* in the network JS = (G = (V,A),g,i?,q,
(D,f))* (1-3) Put p
t
p*,p
t
2p*, and k
t
k - 1.
(End) Step 0 can be carried out by a shortest path computation based on Theorem 5.5. In Step (1-2) we have
for the current and p . Hence each Step (1-2) requires at most 21AI maximum and minimum submodular-flow computations. It follows that the costscaling algorithm requires O( IAl log, C ) maximum and minimum submodular 157
111. NEOFLOWS flow computations and that the algorithm runs in polynomial time if an oracle for exchange capacities is available. It should also be noted that, since in Step 1 we have (5.155) and since Problem Ps is bounded when we proceed to Step 1, the problem on network #s = (G = (V,A),c,Z,?, (D,f)) with approximated cost function is also bounded due to the assumption that c(a) > --oo ( a E A ) (recall Theorem 5.5). If we apply the cost-rounding and tree-projection technique of the author [Fuji861 (which is an improved version of Tardos’s algorithm [Tardos85]) for the ordinary minimum-cost flows, we can get a strongly polynomial algorithm for the submodular flow problem ([Fuji Rock Zimmermann89]). The first strongly polynomial algorithm for the submodular flow problem was given by Frank and Tardos [Frank Tardos851 by the use of the simultaneous approximation algorithm of [Lenstra Lenstra Lov&z82]. To give a strongly polynomial algorithm of [Fuji Rock Zimmermann891 we need a few preliminaries. For simplicity we assume that c and C take on only finite values, which guarantees the boundedness of Problem P s . For a cost function 7: A R and a positive real a,7’: A -+ R is called an a-approsirnation of 7 if [ ~ ’ ( a-) -y(a)l 5 a for all a E A . The following lemma is fundamental.
+
+
+
+
+
+
+
--f
Lemma 5.12: Let 7’ be an a-approximation o f 7 and p’ be an optimalpotential in N‘ = (G = ( V , A ) , c , Z , y ’ (P, , f)). Then there exists an optimal potential p in N = (G, c, Z, 7, (D,f ) ) such that we have for each u , v E V
(Proof) Without loss of generality we assume ~ ’ ( a 2) $a) for all a E A (by possible reorientations of arcs). If 7’ # 7 , let vo be a vertex in V such that 7‘(a) > ~ ( afor ) an arc a E 6-vo. Define (5.157) Starting from an optimal submodular flow p’ and an optimal potential p’ in W‘ = (G,s,Z,7’, (P,f ) ) , we apply a slightly modified version of the (nonscaling) algorithm for submodular flows to J? = (G,c,c,.j., (P, f ) ) . The modification consists only in postponing potential changings as long as possible, which does not affect the finiteness of the algorithm. 158
5.5. ALGORITHMS FOR NEOFLOWS
Note that initially for any out-of-kilter arc a E A we have
aE
(5.158)
6-21',
0 > + p l ( u ) 2 -a,
(5.159)
@ ( a ) < C(.).
(5.160)
In the original version of the algorithm an out-of-kilter arc a' E 6 - v o is chosen and @ ( a o ) is increased by solving a maximum submodular flow problem with reference arc uO, which is followed by a potential changing. In the modified version we repeatedly solve maximum submodular flow problems, each with an out-of-kilter arc a E 6-v' as a reference arc, without any potential changings. If there exists at least one out-of-kilter arc after this process, then let U be the set of the vertices reachable from v' in the current auxiliary network and change the potential p by (5.149)-(5.153). Repeat this process until all the out-of-kilter arcs become in kilter. (This is repeated finitely many times.) Let p be the resultant potential and let a' be one of the arcs which became in kilter at the last potential-changing. We can see from (5.159) that m={ I(P'(4 - P'b)) - (P(4 - P ( V N 5 m={ IP'(4 - P ( U ) I € v} I 1% (a') I 5 a,
I
I I
u , 21 E
v} (5.161)
where note that 0 5 p'(u) - p ( u ) for all u E V due to (5.153). Putting p' t p and 7' +- .^y and repeating the above argument for other vertices, we see that the total change of any potential difference is bounded by QlVI. Q.E.D. From this lemma we have
Theorem 5.13: Let 7' be an a-approximation of 7 and p' be an optimal potential in N' = ( G = (V,A),g,Z,7', ( D , f ) ) . Then for any optimal submodular flow P N = (G = (V,A),c,-d,7,(D,I ) ) , (1) Y a E A: 7 p ~ ( a>) alVI ==+p(a) = c ( a ) , (2) VU E A: 7 p t ( ~< ) -alVl =+-P ( U ) = Z ( U ) , (3) V u , v E V : p'(u) - p ' ( v ) > aIVI ==+v $! dep(dp,u). (Proof) The present theorem follows from Lemma 5.12 and Theorem 5.3. Q.E.D. Theorem 5.13 gives a basis for a strongly polynomial algorithm for submodular flows. 159
111. NEOFLOWS We show a strongly polynomial algorithm which consists of the repeated applications of a procedure called Fundamental Cycle.
Fundamental Cycle Input: Lower and upper capacity functions c and C; a paitition W = {W' I i E I}of V ;a representation of S = $i,=lSi as a direct sum of submodular systems S' = ( D i , on Wi (i E I ) ; a graph H = (V,D) with connected components Hi = (Wi, E') (i E I) which are strongly connected; and a set A' = { a I a E A , c(a) = i?(a)} of all tight arcs. (Comment: At the initial application of this procedure we put I = {l},W' = V and H = ( V , D ) with D = { ( u , ~ I)u , v E
fi)
v, u # 4)
Output: A nonnegative real M , and if M # 0, modified c, 5, W ,S , H and AO. (Comment: When M = 0, the set of all the submodular flows in the current network N = (G = (V,A ) ,c, E, 7, S ) is exactly the set of all the optimal submodular flows in the original network. When M # 0, c, C, W , H and A' have been modified in such a way that all the input characteristics are maintained and that at least one of the following two properties holds: (a) Ao is strictly larger than the input Ao, (b) D is strictly smaller than the input D. Step 1 [Construction of a potential]. (1-1) Shrink the vertex subsets Wi (i E I) of G = (V,A ) into (pseudo-)vertices w i (i E I) and denote the resultant graph by G / / W . Denote by = (t,A) the graph obtained from G//W by deleting all the arcs of A'. Find a spanning forest T of 6 which is composed of rooted spanning trees of the connected components of (1-2) Put $(v] t 0 for all roots v of spanning trees in T and construct the potential 5: V + R in such a way that 76(a) 7(a) j(a+a) - f i ( ~ 3 - a )= 0 for all the arcs a of T . (1-3) Define the potential p: V t R by
e
e.
+
i E I).
(5.162)
Step 2 [Rounding the cost function]. (2-1) Put M t r n a ~ { ( 7 ~ ( a ) Ia E A - A ' } .
(5.163)
(v E W',
p ( v ) = $(wi)
I
If M = 0, then stop. (2-2) Put
734
+
7'(a)
7Pb) +
'
IVI2/M
T7i(a)J 160
(aE
(aE
4,
A)*
5.5. ALGORITHMS FOR NEOFLOWS (Comment:
r-1 means rounding to the nearest
integer.)
Step 3 [Solving the approximate problem]. Find an optimal submodular flow pf and an optimal potential p' in the network N f = (G = ( V , A ) , c , Z , 7 ' , S )by the (cost-scaling) algorithm of Cunningham and Frank. Step 4 [Extracting information on optimal submodular flows]. (4-1)For each arc a E A - A', (i) if (7;)pt(a)< --ilV], then put c(a) t Z(a) and A' t A' U { a } ,
(ii) if ( 7; ) pt(u ) > iIVI, then put c(a) t c(a) and A' where ( ~ ; ) ~ t (=a7;(a) ) p'(a+a) - p'(a-a).
t
A'
U{a},
+
(4-2)For each arc ( u , v ) E D of the graph H , if p'(v) - p'(u) > then delete ( u , v ) from D and from Ei to which (u,v) belongs. (4-3)For each i E I do the following (4-3-1)-(4-3-3): (4-3-1)Find a maximal chain 0 = U i c Uf c . . c ULi = W' of upper ideals of Hi.(Comment: An upper ideal of a graph is a vertex set which no arcs enter.) (4-3-2)Define Si,(= ( D " , f d a ) ) = ( D ' , f " ) * U ~ / U ~ - ,(S = 1,2,...,ki),
W i a= Uj - Ujp1
(S =
1,2,***,ki).
-
(Comment: Wia (s = 1,2, +. ,ki) are the vertex sets of the strongly connected components of Hi.) (4-3-3)Delete from the graph H all the arcs connecting distinct subsets Wie (S = 1,2,*.*,ki). (4-4)Put
To find an optimal submodular flow in the original network we repeatedly apply the procedure, Fundamental Cycle. We show the validity and the strong polynomiality of this algorithm. At any stage of the algorithm the input to Fundamental Cycle is referred to as the current network with the current capacity functions, the current submodular systems, etc. 161
111. NEOFLOWS Theorem 5.14: At any stage of the algorithm the following statements are valid. (1) For any optimal submodular flow p in the current network the exchangeability graph Gap associated with base ap of the current submodular system is a subgraph of the current graph H = (V, D). (2) The set of all the optimal submodular flows in the original network coincides with the set of all the optimal submodular flows in the current network. (3) While M # 0, each application of Fundamental Cycle strictly enlarges the set Ao of the tight arcs or strictly reduces the set D. (Proof) We show the present theorem by induction on the number of applications of Fundamental Cycle. Before the initial application of Fundamental Cycle the current network and the original network coincide and the current graph H contains all arcs (u, v) with u,v E V and u # v. Obviously, the statements of the theorem are valid. Now, let us assume that the statements are valid before an application of Fundamental Cycle. We show their validity after this application. Suppose M # 0; otherwise the algorithm terminates without modifying the input and we are done. Let a* be an arc which maximizes (5.163) in Step (2-1). If we add the arc a* to the forest T in 6 = (see Step (1-l)), then a unique cycle Q* is created. Let Q* be represented by a sequence of arcs as Q* = (uo,ul,..~,u') with uo = a*. We assume that a* is positively oriented in the cycle Q*. Since the (pseudo-)vertices wi correspond to the shrunken strongly connected graphs Hi= (Wi, E * ) (i E I ) , we can extend Q* to a cycle C in the graph ( V , A U D) such that all the arcs from C fl D are positively oriented. We define
(?,A)
4 7
7%) = 7(4. I V I 2 / M
(a E
7 3 4 = 7 p ( 4 * IVl2/M
(aE A),
(5.165)
(. E V ) ,
(5.166)
P"4
= P b ) . IVI2/M
(5.164)
where we have
$ ( a ) =7'(~) + p ' ( a + ~ )- p ' ( ( d - ~ )
(uEA).
(5.167)
We extend the domain of 7; to A U D by defining 7;(u) = 0 ( u E D). (Note that this is equivalent to defining 7 ( a ) = 0 ( u E D).) In the following we assume that we have 7,(u*) = -M in Step (2-1). (The case when 7 , ( u * ) = M can be treated similarly.) Then the length of the cycle C relative to the length function 7; is given by
7 3 3 = -IVI2, 162
(5.168)
5.5. ALGORITHMS FOR NEOFLOWS
since 7;(a*) = -IVI2 and 7i(a) = 0 for any other arcs a in C. In Step 3 of Fundamental Cycle we construct an optimal submodular flow (of and an optimal potential p' in N' = (G = (V,A),c,Z,7',S). Since the length of C is invariant under the modification of the length function 7; by any potential, there exists some arc 6 in C such that
+
€c(iL)* (736) p'(at-6)
- p'(d-6))
I -IVI,
(5.169)
where ~ C ( i i is ) equal to 1 or -1 according a 6 is positively oriented or negatively oriented in C. Since E C ( & ) = 1 if 6 E D, either (i) l(7;)pf(6)l > iIVI if 6 E A - A', or (ii) p'(v) - p'(u) > !jIVI if 6 = ( u , v ) E D. Since 7' is a $approximation of 7; (see Step (2-2)), it follows from Theorem 5.13 that for any optimal submodular flow (o in .Va = (G,c,Z,r;,S) (1) if 6 E A - A', then
(2) if 2 = ( u , ~E) D,then p'(v) - p'(u) > i l V l and
where the dependence function is defined with respect to the current base polyhedron. Therefore, in Steps (4-1) and (4-2) A' is strictly enlarged or D is strictly reduced, which shows statement (3) in the present theorem. Moreover, we have
aEA
u EV
where wi is the (pseudo-)vertex representing the shrunken set W i and p ( v ) = p(wi) (v E WZ,i E I ) due to Step (1-2). Since a(o(W') (i E I ) are indepeni, Bi is the base polyhedron of dent of the choice of ap E B = e j E ~ Bwhere submodular system Sa for each i E I , it follows from (5.172) that the set of all the optimal submodular flows in N a = (G,c,E,7;, S) coincides with that of all the optimal submodular flows in the current N = (G,g,E,7,S). Consequently, (5.170) and (5.171) are valid for any optimal submodular flow p in the current 163
111. NEOFLOWS
M. Hence, the modifications of c and Z in Step (4-1) do not change the set of all the optimal submodular flows in M and Step (4-2) maintains Property (1) given in the present theorem. Therefore, in Step (4-3) a maximal chain 0 = U i c Uj c . c U i , = W i of upper ideals of Hi is a chain of tight sets for dp restricted to W i , i.e., # .
a p ( ~ , =i )f i ( ~ , )
(~=0,1,.**,ki).
(5.173)
Since Property (1) in the present theorem is valid before Step (4-3) is executed, (5.173) holds for any optimal submodular flow p in the current M. This validates the decomposition of ( P i , f i ) into minors ( P i , f i ) U ~ / U ~ - l (s = 1,2, , k i ) for each i E I in Step (4-3). This shows the validity of Q.E.D. (1) and (2) at the next stage. It follows from Theorem 5.14 that the algorithm terminates after at most IAl IVl(lVl - 1) applications of Fundamental Cycle. When M = 0, the set of all the submodular flows in the current network is exactly the set of all the optimal submodular flows in the original network. An optimal submodular flow is obtained by finding a (feasible) submodular flow in the current network by use of the common-base algorithm described in Section 4.3 (also see Section 5.3). All the steps of Fundamental Cycle can be carried out in strongly polynomial time, provided that an oracle for exchange capacities is available. It should be noted that an exchange capacity can be computed in strongly polynomial time by applying the strongly polynomial algorithm [Grlitschel LovQz + Schrijver881 for submodular-function minimization, where only an oracle for function evaluations is assumed, but the algorithm heavily relies on the ellipsoid method [Khachiyan’lO, 801 and is not a combinatorial one.
+
+
5.6. Matroid Optimization
We show some specializations of the results obtained in the previous sections to matroids, which is a retrospective view of matroid optimization. (a) M a x i m u m independent matchings Let G = ( V + , V - ;A ) be a bipartite graph with the left and right end-vertex sets V + and V- and the arc set A. Also let M+ = ( V s , I t ) and M- = ( V - , I - ) , respectively, be matroids on V+ and V - with families I+ V+ and I- C V of independent sets. Denote M = (G = ( V + , V - ; A ) , M + , M - ) . An independent matching M C A in N is a matching in G such that d+M E
I+, 164
a-M E
I-,
(5.174)
5.6. MATROID OPTIMIZATION where d+M ( d - M ) is the set of end-vertices in V + ( V - ) of arcs in M . (We assume that for each arc a E A we have d+a E V + and d - a E V - . ) The mazimum independent matching problem is to find a maximum independent matching (i.e., an independent matching of maximum cardinality) in U . The maximum independent matching problem can naturally be reduced to a maximum independent flow problem as follows. Consider a network & = (G = ( V + , V - ; A ) , C , E ,+( ,~p ~+ ) , ( 2 " , p - ) ) , where V + ( V - ) is the set of entrances (exits), (2'+,p+) ( ( 2 ' - , p - ) ) is the submodular system on V + ( V - ) with p + ( p - ) being the rank function of matroid M+ (M-), and
-~ -
( a= )0
~ ( a=) $00
(5.175)
(aE A), (aE A).
(5.176)
We see that any integral independent flow p in & is {O,l}-valued and that {a I a E A , p(a) = 1) is an independent matching in N. From Theorem 5.9 an integral maximum independent flow in & gives a maximum independent matching in N and we have the following max-min theorem.
Theorem 5.15 (The maximum-independent-matching minimum-coveringrank theorem) [Edm7O],(Rado421, [Welsh'lO]: For network N = (G = ( V + , V - ; A ) , M+,M-) we have
1
M is an independent matching in 1) = min{p+(U+) + p - ( U - ) I ( U + ,U - ) is a cover of G}.
max((M1
(5.177)
(Proof) The present theorem immediately follows from Theorem 5.9, where a cut U of finite capacity of corresponds to a cover ( U + ,U - ) of G such that U s = V s - U and U- = V - n U ; this gives a one-to-one correspondence between the set of cuts of finite capacity of # and that of covers of G, and the capacity of a cut U of finite capacity of 1 is equal to p+(U+) p - ( U - ) , the rank of the cover ( U s , U - ) . Q.E.D.
+
The transformation of matroids by bipartite graphs Let G = ( V + ,V - ;A ) be a bipartite graph and M+ = ( V + ,I+)be a matroid with a family I+ of independent sets. Define
I- = { d - M I M :
a matching in G, d + M E I + } ,
(5.178)
where d - M = { % a 1 a E M } . We can easily see that I- is a family of independent sets of a matroid. From Theorem 5.15 the rank function p - of the matroid M- = ( V - , I - ) is given by p - ( U - ) = min{p+(X+)
+ IY-1 I
(X+,Y- U (V-
165
-
U - ) ) is a cover of G} (5.179)
111. NEOFLOWS for each Ugraph G.
c V-.
We call M- the matroid induced from MS by the bipartite
The matroid intersection problem When V- is a copy of V+ and a bipartite graph G = ( V + , V - ; A ) represents the natural bijection between V+ and V - , the maximum independent matching problem for the network N = (G = ( V + , V - ; A ) , M + , M - ) becomes the problem of finding a maximum common independent set of the two matroid M+ and M - , where V+ is identified with V - . This problem is called the matroid intersection problem. From Theorem 5.15 we have
Theorem 5.16 (The matroid intersection theorem) [Edm'lO]: For two matroids M+ = (E,It) and M- = ( E , I - ) with I + and I- being families of independent sets, max{lIl
I
= min{p+
I E
I+ n I-}
( X )+ p - ( E
-X
) I X 2 E},
(5.180)
where p+ ( p - ) is the rank function of M+ (M-). Theorem 5.16 also follows from Theorem 4.9. The matroid intersection problem is a special case of the maximum independent matching problem in a natural way. Conversely, we can show that the maximum independent matching problem is reduced to a matroid intersection problem. Consider the maximum independent matching problem for a network .Al = (G = ( V + , V - ; A ) , M + , M - ) . From G define a bipartite graph 6= where
(@+,@-;A),
W + = {w,+ I u E A),
W - = {w;
I a E A},
A = {(w;,w;) I a E A}.
(5.181) (5.182)
(See Fig. 5.5.) Moreover, define
.T'
= {I'
I
I'
c W ,Vv E V + : [ { a I a E 62' 1, {@a
1- = { I -
I
I-
I a E A,
C W - , VV E V-: {a-a
w,' E )'1
I{a I a E 6-v,
I a E A,
w,' E I+}I 5 1, E W,
I+}, E I-}I
w, E I - } E I - } ,
(5.183)
5 1, (5.184)
where a+, a-, 6+ and 6- are with respect to G. We can easily show that M+ = (@+, I + ) and M - = (6'-, I - ) are matroids with families f+ and I 166
5.6. MATROID OPTIMIZATION
G
Figure 5.5. of independent sets. The matroid M+ (M-) is regarded as the one induced from M + (M-) by a bipartite graph, or as a composition of M+ ( M - ) and the direct sum of rank-one uniform matroids on 6+v ( v E V + ) ( 6 - v (v E V - ) ) . The maximum independent matching problem for network U = (G = (V+,V - ;A ) ,M+ ,M - ) is thus reduced to a maximum independent matching -+ problem for the new network k = (6 = (k+,lk-;6),M, M ), which is a matroid intersection problem. A
-
The matroid union Let M i = ( E i ,1;)(i = 1 , 2 ) be two matroids. Define 11v2 = {I1
u12 I 11 E 11, 12 E 1 2 ) .
(5.185)
Then Mlv2 = (El U E2,11v2) is a matroid, which is the one induced from the direct sum M1 $ M 2 of the two by the bipartite graph G = ( V + ,V - ; A ) , where V + = El $E2, V - = El UE2 and the arc set A consists of the natural bijections V - (see between El 5 V + and El C V - and between E2 C V + and E2 Fig. 5.6). Therefore, from (5.179) we have
Theorem 5.17: The rank function plV2(X) = min{pl(Y n E l )
plv2
of M I v 2 is given by
+ p2(Y n E 2 ) + IX - YI I 167
Y
c: X }
(5.186)
111. NEOFLOWS
Figure 5.6.
for each X
El
U E2.
The matroid M l V 2 is called the union (or sum) of M i (i = 1 , 2 ) . Note that if El = E z , we have p l v 2 = ( p 1 p 2 ) l , which is the rank function of the reduction of the sum (2E, p 1 + p 2 ) of submodular systems (2E, p i ) (i = 1,2) by vector 1 = ( l ( e ) = 1 I e E E ) (see (3.9)). A base of the union M l v 2 of M1 and M2 is given in the form of the union B1 u Bz of some bases Bi of matroids Mi (i = 1 , 2 ) . When El = E2, for bases Bi (i = 1,2) of matroids Mi B1 U B2 is a base of Mlv2 if and only if B1 U (E2 - B z ) is a maximum common independent set of MI and the dual M2* of M 2 . In this sense the problem of finding a base of the union of two matroids is equivalent to the matroid intersection problem and hence to the maximum independent matching problem. WhenwearegivenmatroidsMi = ( E j , l i ) ( i = 1 , 2 , . . - , k ) ( k > 2 ) , w e c a n . . (i ~= L 1,2, k) in the same way as in the define the union M I ~ ~ ~of. Mi case of k = 2. That is, an independent set of the union of Mi (i = 1,2, k) is the union of independent sets Ii E l i (i = 1 , 2 , - . . , k ) . The rank function p l v z v . . . v k of Mlv2v...vk is given by
+
--
a ,
--
a ,
I
p 1 ~ 2 ~ . . . ~= k (min{pl(YnEl)+...+pk(YnEk)+lX-YI X) Y G X } (5.187) for each X
C El U -..U Ek.
Theorem 5.18: There exist disjoint bases of Mi = (E,li) (i = 1 , 2 , . . - , k ) , one from each M i , if and only if p l ( ~ ) + - - - + p k (= ~ )rnin{pl(X)+...+pk(X)+IE-XI 168
1 x c E } . (5.188)
5.6. MATROID OPTIMIZATION (Proof) The present theorem follows from the fact that the rank of the union of Mi (i = 1,2, ,k ) is equal to the right-hand side of (5.188). Q.E.D.
- -.
The matroid partitioning
--
For matroids Mi = ( E ,Ii) with rank functions p; (i = 1,2, ,k ) , the matroid partitioning problem is to find k disjoint subsets 1i of E such that I1 U I 2 U * U 1, = E and 1; E I; (a = 1,2,. . ,k ) . We can easily see that k such subsets 1i (i = 1,2, - .,k ) exist if and only if the union of Mi (i = 1,2, . ,k ) is the free matroid on E , i.e., from (5.187) a
-
In other words,
Theorem 5 . 1 9 [Edm7O]: There exists a base Bi of M; = ( E , I i ) for each i = 1 , 2 , - .- ,k such that the union of B; (i = 1,2,. ,k ) is E if and only if (5.190)
When matroids Mi = ( E ,Ii) (i = 1 , 2 , . . . , k ) are the same matroid M = ( E , I ) with the rank function p , E is partitioned into k independent sets I; E I (i = 1,2,... ,k ) if and only if
VX C E : k p ( X ) 2
1x1.
(5.191)
From (5.191) we have
Corollary 5.20 [Edm65] (see [TutteGl], [Nash-Williams61] for graphs): The minimum number k for which E is partitioned into k disjoint independent sets ofM = ( E , I ) is equal to (5.192) where we assume M does not have any selfloop, i.e., we assume p ( { e } ) > 0 (e E
E).
Note that (5.191) is equivalent to that a uniform vector ( l / k ) l is an independent vector of the matroidal polymatroid P = ( E , p ) corresponding to M. If the matroidal polymatroid P = ( E , p ) has a uniform base ( Z / k ) l for positive integers k , 1 such that ZlEl = kp(E), then there exist k bases B; 169
111. NEOFLOWS
-
( i = 1 , 2 , . . ,k) of M such that each e E E is uniformly covered by B, (i = 1 , 2 , . . . , k) 1 times, i.e., for each e E E
I{i I i E { 1 , 2 , . . . , k } , e E B , } l = 1 ,
(5.193)
due to the fact that B(p)+..-+B(p) (k times) = B(kp) with Z as the underlying totally ordered additive group (see Section 3.l.c). Such a family of bases Bi (i = 1,2, ,k) is called a complete family of bases of M and a matroid having a complete family of bases is called irreducible [Tomi75,76]. The concept of irreducible matroid was introduced by N. Tomizawa for the analysis of principal partition of a matroid (see Sections 7.2.b.l and 9.2). The same concept was also independently introduced by H. Narayanan [Narayanan74] and was called a molecule (also see [Narayanan Vartak811). a
+
An algorithm for the maximum independent matching problem by using auxiliary graphs is given by Tomizawa and Iri [Tomi Iri741. An algorithm for the matroid intersection problem is given by Edmonds [Edm79]. For other related algorithms see [Edm Fulkerson651 for matroid partitionings and [Bruno Weinberg711 for matroid unions.
+
+
+
(b) Optimal independent assignments
Consider a bipartite graph G = ( V + , V - ; A ) , matroids M+ = ( V + , I + ) and M- = ( V- , I - ) and a weight function w: A -+R. Denote such a network by N = (G = ( V + , V - ; A ) , M + , M - , w ) . For apositive integer k, a k-independent matching M in N is an independent matching of cardinality k in N o = (G = ( V + , V - ; A ) , M + , M - ) . An optimal k-independent assignment in N is a kindependent matching M having the minimum weight w ( M ) = C e S E w ( e ) among all the k-independent matchings in N. The independent assrgnment problem is to find an optimal k-independent assignment in N for a given k. The independent assignment problem is a special case of a neoflow problem, especially of the independent flow problem in a natural way. The independent assignment problem is equivalent to the weighted matroid intersection problem, which is to find a minimum-weight common independent set, of two matroids, having cardinality k for a given positive integer k. For an independent matching M define the auxiliary network NM = ( G M = ( V * , A M ) , W M as ) follows. G M is the graph with vertex set V’ = 170
5.6. MATROID OPTIMIZATION
S,& = {(s+,v)
I v E V + - cl'(a+M)}
U {(v,s+)
I v E d+M},
(5.194)
A& = { ( u , v ) I v E cl+(a+M)- d + M , u E C+(a+Mlv)- {~}},(5.195) AMMAAM, (5.196) (5.197) A? = {Z I a E M I (a: a reorientation of a ) , A& = { ( v , u ) I v E ~ l - ( d - M )- 8 - M , u E C -(a -Ml v) - {~}},(5.198) S k = { ( v , s - ) I v E V - - ~l-(d-M)}U {(s-,v) I v E a - M } . (5.199) Here, cl+ and cl- are, respectively, the closure functions of M+ and M-, C + ( a + M I v ) for v E cl+(d+M) - d+M is the fundamental circuit associated with a+M E I + and v which is the unique circuit of M+ contained in a+M u {v}, and C -(d-Ml v ) is similarly defined for M-. Also, WM:AM-+ R is the length function defined by
( a E 6f,E ( E M ) : a reorientation of a ) ( a E S& U A& UA, USG).
(5.200)
A primal-dual algorithm for the independent assignment problem [Iri +Tomi761 Step 1: Put M t 0. Step 2: While IMI < k, do the following. (2-1) Find a shortest path P , relative to the length function W M , from s+ to s- in the auxiliary network 1~having the minimum number of arcs. If there exists no directed path from s+ to s- in JM,then stop (there is no k-independent matching). (2-2) Put M t M A A ( P ) (A: the symmetric difference), where A(P) = { a I a E A , a lies on P } . u { a I a E M , a reorientation of a lies on P}. (5.201)
If we define in Step (2-1) a potential p:V'
R in such a way that for each v E V' p ( v ) is equal to the length of a shortest path from s+ to v in N M , 171
-+
111. NEOFLOWS then using the potential p we can replace (2-1) by W M , ~defined by
WM
in the next execution of Step
Here, we disregard those vertices which are not reachable from s+ in N M , since vertices which are not reachable from s+ will not become reachable from s+ in N M for new M . We can show that thus defined W M , ~is nonnegative for those arcs reachable from s+, so that we can reduce the complexity of finding a shortest path in NM (see [Iri Tomi761). This is an adaptation of the technique for the classical min-cost flows developed by Tomizawa [Tomi711and also independently by Edmonds and Karp [Edm Karp721. It should be noted that arcs entering s+ or leaving s- in NM play no r61e in the primal-dual algorithm. These arcs are for the primal algorithm given as follows.
+
+
A primal algorithm for the independent assignment problem [Fuji77a] Step 1: Find a k-independent matching M in N . Step 2: While there exists a negative cycle, relative to the length function W M , in the auxiliary network N M , do the following. (2-1) Find a negative cycle Q in NM having the minimum number of arcs. (2-2) Put M t M A A ( Q ) , where A ( Q ) is defined by (5.201) with P replaced by Q . (End) For other related algorithms, see [Edm79], [Lawler75], [Fuji77b], [Iri78],
[Fr ank8 la]. The problem of finding a minimum weight directed spanning tree in a graph is an example of the independent assignment problem or the weighted matroid intersection problem. Consider a graph G = ( V , A ) and a weight function w : A -+ R . Let MI be the graphic matroid M(G) represented by G and Mz be the direct sum of the rank-one uniform matroids M-(v) on 6-v ( v E V ) ,i.e., M2 = @,,EvM-(v).We see that B C A with IBI = IVI - 1 is a common independent set of MI and Mz if and only if B forms a directed spanning tree of G . Hence the problem of finding a minimum weight directed spanning tree is a weighted matroid intersection problem. If we want to find a minimum weight spanning tree with a fixed root YO E V , then replace M-(vo) by the trivial matroid (rank-zero matroid) on 6 - v o . For other applications, see [Iri83], [ h i Fuji811, [Murota871 and [Recski]. Finally, it should be noted that the problem of finding a common independent set of maximum cardinality for three matroids is NP-hard. For, consider
+
172
5.6. MATROID OPTIMIZATION
a graph G = ( V , A ) , and let MI be the l-elongation of the graphic matroid M(G) and Mz be the direct sum of the rank-one uniform matroids M-(v) on 6 - v (v E V ) ,where the l-elongation of M(G) is the union of M(G) and the rank-one uniform matroid on A (see Section 3.1.d). Also, let MS be the direct sum of rank-one uniform matroids Ms(v) on 6+v ( v E V ) . We can easily see that the maximum cardinality of common independent sets of Mi (i = 1,2,3) is equal to JVIif and only if there exists a directed Hamiltonian cycle in G, and that a common independent set of Mi (i = 1,2,3) of cardinality IVI forms a directed Hamiltonian cycle in G.
173
This Page Intentionally Left Blank
Chapter IV. Submodular Analysis
Submodular (or supermodular) functions on distributive lattices share similar structures with convex (or concave) functions on convex sets. In this chapter we develop a theory of submodular and supermodular functions from the point of view of the duality in convex analysis ([Fuji84f,84g,84h]).
6. Submodular Functions and Convexity
We define the convex (concave) conjugate function of a submodular (supermodular) function and show a Fenchel-type duality theorem for submodular and supermodular functions. We also define the subgradients and subdifferentials of a submodular function and examine the relationship among these concepts and the polyhedra such as the submodular and supermodular polyhedra and the base polyhedron associated with the submodular function. The r e a o n for the analogy between a submodular function and a convex function is nicely explained by the Lovdsz extension of a submodular function.
6.1. Conjugate Functions and a Fenchel-Type Min-Max Theorem for
Submodular and Supermodular Functions (a) Conjugate functions Let f: D 4 R be a submodular function and g : D -+ R be a supermodular 2E with 8,E E D. Here, we do not function on a distributive lattice D necessarily assume that f(0) = g ( 8 ) = 0. Define the function f * : RE -+ R by f*(s) = max{s(X) - f ( X ) I X E
D}
(x E R E )
(64
(x E RE).
(6.2)
and also, in a dual form, the function g * : RE + R by g * ( x ) = min{z(X)
-
g(X) I X E
D}
By the definition f * is a convex function and g* is a concave function. We call f * the convex conjugate function of the submodular function f and g* a concave conjugate function of the supermodular function g . 175
IV. SUBMODULAR ANALYSIS
It may be noted that the terms x(X)appearing in (6.1) and (6.2) can be regarded as the inner product of x and X X , the characteristic vector of X. Notice the analogy between the definition (6.1) and that of a convex conjugthe function of an ordinary convex function (see (1.35)) ([Rockafellar70], [Stoer Witzgall'lo]) .
+
Theorem 6.1: For a submodular function f : D + R and a supermodular function g: D + R we have for any X E D
f(X)= max{x(X) - f*(x) I z E RE}, g(X)= min{x(X) - g*(z)I x E RE}.
(6.3) (6.4)
Moreover, for any X E 2E - D x(X)- f*(x) (or x(X)- g*(z)) as a function of x E RE can be made arbitrarily large (or small).
(Proof) Without loss of generality we assume that j ( 0 ) = g(0) = 0. From (6.1),
f(X)2 .(X)
-
f*b)
(6.5)
for any X E D and x E RE. We show that for any X E D there exists .a vector x E RE such that (6.5) holds with equality, from which (6.3) follows. From Lemma 3.2, for any X E D there exists a subbase 4 E P(f) such that 4(X)= f(X). Since 4 E P(f), we have from (6.1)
f*(4) = 4(X)- f(X)(= 0).
(6.6)
This implies (6.3). Relation (6.4) is the same as (6.3) by considering the dual order on R. Moreover, it follows from Lemma 3.2 that for any X E 2E - D we can make x(X)arbitrarily large (or small) subject to x E B(f) (or x E B(g)). Since f*(z)= 0 for x E B(f) and g*(z) = 0 for z E B(g), this completes the proof. Q.E.D. We see from Theorem 6.1 that the correspondence between a submodular (or supermodular) function f (or g) and its convex (or concave) conjugate function f * (or g*) is one to one. The convex conjugate function f * is closely related to the vector rank function rf of the submodular system ( D , f ) when f(0) = 0. Recall that for each z E RE rf(z) = min{f(X) z ( E - X) I X E P} (6.7)
+
(see (3.9) or (3.20)).
Lemma 6.2: For a submodular system ( P , f ) on E and a vector x E RE we
176
6.1. CONJUGATE FUNCTIONS AND A FENCHEL-TYPE THEOREM Q.E.D.
(Proof) Immediate from (6.1) and (6.7).
Note that since rf is submodular on the vector lattice RE (see [Welsh761 when D = 2E), the convex function f * is supermodular on RE, i.e., for any x, y E R E f*(Z)+f*(Y) I f*(xvy)+f*(xAy), (6.9) where (x V y)(e) = max(z(e),y(e)),(x A y)(e) = min(x(e),y(e)) (e E E ) . (See [Topkis78] for submodular functions on general lattices.) Here, it may also be noted that vector lattice RE is a distributive lattice.
(b) A Fenchel-type min-max theorem We show a Fenchel-type min-max theorem for submodular and supermodular functions, which relates the difference between a submodular function f and a supermodular function g to that between the concave conjugate function g* and the convex conjugate function f *. (For Fenchel's duality theorem for ordinary convex and concave functions, see [Rockafellar70] and [Stoer Witsgall701.)
+
Theorem 6.3 (A Fenchel-type min-max theorem) [Fuji84f]: For a submodular function f : D1 -+ R and a supermodular function g: D2 + R on distributive lattices D1, D2 C 2E with 0, E E D1 n D2 we have min{f(X) - g(X) I X E Di n D2> = max{g*(x) - f * ( x ) I x ERE}.
(6.10)
Moreover, i f f and g are integer-valued, the maximum on the right-hand side of (6.10) can be attained by an integral vector x. (Proof) Without loss of generality we assume that f(0) = g(0) = 0. Hence ( P I , f ) is a submodular system and ( D 2 , g ) is a supermodular system, both on E . The min-max relation (6.10) is equivalent to
+
min{ f ( X ) g # ( E - X) I X E = max{g*(z)
D1
nDz}
+ g ( E ) - f*(x) I x E RE},
(6.11)
where recall that g# is the dual submodular function of g. From (6.2), (6.7) and Lemma 6.2, (6.12) f* = x ( E ) - rf 7
(4
(4
g*(x)+ g ( E ) = min{x(X) = rg# (x).
+ g#(E - X) I X E
177
D2)
(6.13)
IV. SUBMODULAR ANALYSIS Substituting (6.12) and (6.13) into (6.11) yields
+ g#(E - X) I X E D1 n Dz} = max{rr(x) + rg#(x) - x(E) I z E R E } ,
min{f(X)
(6.14)
which is equivalent to (6.10). For any x E R E let y and z be bases of the reductions of ( & , f ) and (F2,g#) by x, respectively, i.e.,
Y E P(f), Y I z, y ( E ) = Tf("),
(6.15)
I z,
(6.16)
z E P(g#),
z
Z ( E )= rg#(z).
Also define
w =y A z
(f(min{y(e),z(e)} I e E E ) ) .
(6.17)
Then,
w E P(f)
n P(g#).
(6.18)
Because of (6.15)-(6.18),
It follows from (6.18) and (6.19) that (6.14) (or (6.11)) is also equivalent to
The min-max relation (6.20) is exactly the intersection theorem (Theorem 4.9), so that (6.10) holds. The integrality part of the present theorem follows from the counterpart of the intersection theorem. Q.E.D. Note that if x E R E is a maximizer of the right-hand side of (6.10), then w given by (6.15)-(6.17) is a maximizer of the right-hand side of (6.20). If f and g are integer-valued functions and x is an integral vector, then such a vector w can also be integral. Conversely, any maximizer x of the right-hand side of (6.20) is a maximizer of the right-hand side of (6.10). The above proof shows the equivalence between the Fenchel-type minmax theorem and the intersection theorem. Combining this with the results 178
6.2. SUBGRADIENTS OF SUBMODULAR FUNCTIONS
in Sections 4.1 and 4.2, we see that the three theorems, the intersection theorem (Theorem 4.9), the discrete separation theorem (Theorem 4.12) and the Fenchel-type min-max theorem (Theorem 6.3), are equivalent. The Fenchel-type min-max theorem motivates further investigation of submodular and supermodular functions from the point of view of the duality theory in convex analysis ([Rockafellar’lO],[Stoer Witzgall’lO]).
+
6.2. Subgradients of Submodular Functions
(a) Subgradients and subdifferentials Consider a submodular function f : D -+ R on a distributive lattice D with 0 , E E D . For a vector x E RE and a set X E D, if
c 2E (6.21)
holds for each Y E D, then we call x a subgradient of f at X . We denote by a f ( X )the set of all the subgradients off at X and call af(X) the subdiflerential of f at X . Previously we employed symbol a as the boundary operator for flows in networks. Here we use the same symbol for subdifferentials, following the convention in convex analysis, because there seems to be no possibility of confusion. Figure 6.1 shows a two-dimensional example of subdifferentials of f : D -+ R with D = 2I1l2). In general, RE is divided into ID1 unbounded polyhedra a f ( X ) (XE D ) and for distinct X , Y E D the subdifferentials af(X)and a f ( Y ) may have common faces but not common interior points. It may be noted that a subgradient of any set function can be defined by (6.21). Some of the arguments in the following are valid for any set function.
Lemma 6.4: For a submodular function f : D x E a f ( X ) if and only if x E RE satisfies
-+
R and a set X E D, we have
.(Y) - .(X) I f(Y)- f ( X )
(6.22)
for each Y E [ ~ , X ]uI [, X , E ] p ,where
c
[0,X]I= , {Y I y E 0, y X I , [ X ,E ] I ,= {Y I Y E D, X C Y E } .
c
(6.23) (6.24)
(Proof) It is sufficient to show the “if” part only. Suppose that (6.22) holds for each Y E [ ~ , X ]uI [, X , E ] I , .Then for any 2 E D,
x(X u 2 ) - .(X) F f(Xu 2 ) - f(X), 179
(6.25)
IV. SUBMODULAR ANALYSIS
Figure 6.1.
~ ( x 2n) - Z ( X ) I f ( x n2 ) - f ( x ) .
(6.26)
From (6.25), (6.26) and the submodularity of f,
Q.E.D. Lemma 6.5: For a submodular function f : D
c
-+
R and a set X E D we have
where a f x ( X ) 2 RX,d f x ( 0 ) R E - X , x denotes the direct product, and f X and fx are, respectively, the submodular functions on 10, X I , and [ X ,E ] D / X= {Y - X I Y E [ X , E ] p }defined by
180
6.2. SUBGRADIENTS OF SUBMODULAR FUNCTIONS (Proof) From Lemma 6.4, we have x E a f ( X ) if and only if
(6.30) means xx (= (z(e) I e E X ) ) E a f x ( X ) and (6.31) means x E - X (= Q.E.D. ( x ( e ) I e E E - X ) ) E afx(0). We thus have (6.27).
Lemma 6.6 (cf. [Rockafellar'lO, Theorem 23.51): For a (submodular) function f : D --+ R, a vector x E RE and a set X E D, the following three are equivalent: (i)
5
E
af(x),
+
(ii) min{f(Y) x ( E - Y) I Y E (iii) f ( X ) f*(z)= z ( X ) .
+
D} = f ( X ) + x ( E - X ) ,
(6.32) (6.33) (6.34)
(Proof) We can easily see that each of the above three is equivalent to min{ f(Y) Q.E.D. -x(Y)I Y E D } = f ( X ) - x ( X ) . Lemma 6.6 holds for any set function.
Lemma 6.7: For a submodular function f : D
+R
with 0 , E E D and f ( 0 ) =
0,
(6.35) (6.36) (6.37) (Proof) (a) and (b) immediately follow from the definition of subdifferential. We prove (c). For any X E D there is a base x of the submodular system (D,f) such that z ( X ) = f ( X ) (see Lemma 3.2). Since x E B( j) and z ( X ) = f ( X ) ,it easily follows from (6.21) that x E a f ( X ) . Hence, a f ( X ) n B(f) # 0. Q.E.D. When f ( 0 ) = 0, (6.27) implies that the subdifferential a f ( X ) is the direct product of the supermodular polyhedron a j x ( X ) and the submodular polyhedron afx ( 0 ) , due to Lemma 6.7. 181
IV. SUBMODULAR ANALYSIS The following theorem is a submodular analogue of [ Rockafellar'lO, Theorem 23.81 in convex analysis.
Theorem 0.8: Let f l : D1 + R and f 2 : D2 -+ R be submodular functions, where D1 and Dz are distributive lattices with 0 , E E D1 n D 2 . Then, for each X E D1 n D2 we have W
l
+ f 2 ) ( X )= a f l ( x )+ W 2 ( X ) .
Also, for each X > 0 and X E
D1,
Moreover, for X = 0 and X E
D1,
I
a(O*f l ) ( X )= {Z x E
RE,V Y E
D1:
(6.38)
z ( Y )- x ( X ) 5 0 ) (6.40)
= O+dfl(X),
where 0 + a f l ( X ) is the characteristic cone (or recession cone) of d f , ( X ) (see (1.22)).
(Proof) From Lemma 6.5, for each i = 1 , 2 and X E Di a f i ( X ) is the direct product of a f , " ( X ) and a f i x ( 0 ) . It follows from (3.32) (and its dual counterpart for supermodular functions) and from Lemma 6.7 that for each X € D1 nD2 we have
Note that we have used the intersection theorem through (3.32) to prove Theorem 6.8. For a convex conjugate function f* of a submodular function f : D -+ R and a vector z E R E ,define the set r 3 2 f * ( 5 ) of subsets of E as follows:
x E d,f*(z) if and only if X
(6.42)
5 E and y(X) - .(X)
5 f*(Y) 182
-
f*(4
(6.43)
6.2. SUBGRADIENTS OF SUBMODULAR FUNCTIONS
for each y E RE.We call d2 f * (5)the binary subdifferential of f * at x. Note that from (6.43) and Theorem 6.1 each X E &f*(x) belongs to D.
Theorem 6.9 (cf. [Rockafellar70, Theorem 23.51): Consider a submodular function f: D -+ R. For any vector x E RE and any set X E D the following two are equivalent. (i) z E af(x), (6.44) (6.45) (ii) X E d2 f*(z). Moreover, the binary subdifferential 132 f*(x) is a sublattice of D . (Proof) From Theorem 6.1, Lemma 6.6 and the definition (6.43) of binary subdifferential of f * , we see that both (i) and (ii) are equivalent to
Moreover, (6.45) or equivalently (6.46) implies
z(X) - f(X) = max{z(Y) Therefore,
f ( Y ) I Y E D}.
(6.47)
a2f*(z)is the set of the maximizers of the supermodular function
D and hence
x - f on
-
is a sublattice of D.
Q.E.D.
For two submodular functions fi: Di + R (i = 1,2) define the convolution RE -+ R of the convex conjugate functions f;* (i = 1,2) by
fl*o f2*:
for each
2
E
RE.
Theorem 6.10 (cf. [Rockafellar'lO, Theorem 16.41): For two submodular functions f,:D, -+ R (i = 1,2) with 0 E D1 n D2 we have
fl* 0 f2* (Proof) For any
5
=
(fl + f 2 ) * .
E RE,
183
(6.49)
IV. SUBMODULAR ANALYSIS On the other hand, let x be any vector in RE. There exists X E D1 n D2 such that x E a(f1 f 2 ) ( X ) . Furthermore, from Theorem 6.8 there exist x1 E a f , ( X ) and x2 E a f 2 ( X ) such that x1 x2 = x. Consequently, from Lemma 6.6
+
+
Q.E.D.
We thus have (6.49)
(b) Structures of subdifferentials Suppose that D C 2E is a simple distributive lattice, i.e., D = 2 p for a poset P = ( E ,5 ) . Consider a submodular function f : D -+ R. We first give a characterization of the extreme points of the subdifferential d f ( X ) for X E D.
Theorem 6.11: For each X 6 D,x E RE is an extreme point of a f ( X ) if and only if there exists a maximal chain
of D, including X in it, such that
x ( S i - s a - 1 ) =f(Sa)-f(Sa-,)
(i=1,2,.**,n).
(6.52)
(Proof) We assume, without loss of generality, that f ( 0 ) = 0. From Lemma 6.7 we have a f ( 0 )= P ( f ) . Note that P ( f ) and B ( f ) have the same set of extreme points. Therefore, the present theorem for X = 0 follows from Theorem 3.22 and, similarly, the present theorem for X = E follows from Theorem 3.22, since af(E) = P ( f # ) due to Lemma 6.7 and P ( f # ) and B ( f # ) (= B(f)) have the same set of extreme points. Furthermore, for any X 6 D we have a f ( X ) = a f x ( X ) x a f x ( 0 )due to Lemma 6.5. Hence the extreme points of a f ( X ) are given by the direct product of extreme points of a f x ( X ) and those of a f x ( 0 ) . Since X is the unique maximal element of the domain [ ~ , X ] of I, f X and 0 is the unique minimal element of the domain [ X ,E ] D / Xof f x , the present theorem for X E D with 0 c X c E follows from that for X = E and Q.E.D. X = 0 for f with domain D. 184
6.2. SUBGRADIENTS OF SUBMODULAR FUNCTIONS The characteristic cone (or the recession cone) of a subdifferential a f ( X ) D) is given by
(X E
C j ( X ) = {X = {X
I x E RE, VY E D: x(Y)- z ( X ) 5 0) I x E RE,V Y E [@,XI0U [ X , E ] p :x(Y)- x(X)5 O}.
(6.53)
Note that C,(X) depends only on D and X . We next give a characterization of extreme rays of C j (X). Let G ( P ) = ( E , B * ( P ) )be the directed graph with the vertex set E and the arc set B*(P) which represents the Hasse diagram of the poset P = ( E ,d), i.e., (e,e’) E B * ( P ) if and only if e covers e’ in P. Denote by E+ and E - , respectively, the set of all the maximal elements of P and the set of all the minimal elements of P. Note that E+ n E - may be nonempty. For each X E D denote by A - ( X ) the set of all the arcs entering X in G ( P ) . Define vectors tp+( p + E E + ) ,vp- ( p - E E - ) and fa (u E B * ( P ) )in RE by
(6.54) (6.55) ( e = el)
1
( u = (e’,e”) E
B*(P)).
(6.56)
( e E E - {e’,e’’})
0
Also define for each X E D ER(X) = {Ep+
I p + E E+ - X} u {vP- I p I a E B*(D) A - ( X ) } .
U {(a
-
E E-
n x} (6.57)
Now, we are ready to show a theorem characterizing extreme rays of C , ( X ) ( X E D). T h e o r e m 6.12: For each X E D, the set of all the extreme rays of the characteristic cone C,(X) of the subdifferential a f ( X ) is given by ER(X) in (6.57). (Proof) We can easily see from (6.53) that
Since no vector in ER(X) can be expressed as a nonnegative linear combination of the other vectors in ER(X), it suffices to prove that every vector in C j ( X ) can be expressed as a nonnegative linear combination of vectors in ER(X).
185
IV. SUBMODULAR ANALYSIS Let v be an arbitrary vector in C f ( X ) . From (6.53),
v(X - Y )2 0
(X 2 Y E D),
(6.59)
D).
(6.60)
v(Y - X ) 5 0 (X G Y E
Suppose that each arc of B*(P)- A-(X) has the infinite upper capacity and the zero lower capacity and that each arc of A-(X) has the zero upper and lower capacities. Then it easily follows from (6.59), (6.60) and the feasibility theorem for network flows (Theorem 1.3) [Hoffman601 ([Ford Fulkerson62I) that there exist a nonnegative flow p: B * ( P ) --+ R+ in G ( P ) with p(a) = 0 ( u E A - ( X ) ) , a nonpositive vector z E RE with x(e) = 0 (e $! E+ - X ) and a nonnegative vector y E Rf with y(e) = 0 (e E - n X ) such that
+
4
v = dp
+
X S
y,
(6.61)
where dp is the boundary of p in G ( P ) . (6.61) gives an expression of v as a Q.E.D. nonnegative linear combination of vectors in ER(X). We see from Lemmas 6.5 and 6.7 that C,(X) is the direct product of the characteristic cones of the supermodular polyhedron d f x ( X ) and the submodwhen f(0) = 0. Hence, Theorem 6.12 may also follow ular polyhedron dfx(@), from Theorem 3.26. It should be noted that if w in C , ( X ) satisfies w ( X ) = 0, then y = 0 in (6.61) and that if v satisfies v ( E - X ) = 0 , then x = 0 in (6.61). Theorem 3.26 (the extreme ray theorem for base polyhedra) also follows from this theorem.
6.3. The LovAsz Extensions of Submodular Functions Consider a submodular function f : D + R on a simple distributive lattice D = Z p with P = ( E , d ) .We assume f(0) = 0. Define the function f : R E + R U {+m} by
for each c E
RE.where (6.63)
Here, f is called the support function of P ( f ) and is a positively homogeneous function, i.e., ~ ( X C = ) Xf(c) for each X > 0 and c E RE ([Rockafellar70], [Stoer + Witzgall701). We see from Corollary 3.14 that f ( c ) < +co if and only if c: E --+ R is a nonnegative monotone nonincreasing function from P = ( E ,5 ) 186
6.3. THE LOVASZ EXTENSIONS OF SUBMODULAR FUNCTIONS
to ( R , < ) . Therefore, for any c E RE such that f(c) < +oo there uniquely exist a chain A1 C A2 C * - - C Ak (6.64) of nonempty Ai E D (i = 1 , 2 , . . - , k ) and positive numbers A; E R (i = 1 , 2 , . ,k) such that # .
k
(6.65) i=l
where k 2 0, X A E ~ RE is the characteristic vector of and if k = 0 (i.e., c = 0), the empty sum is defined to be zero vector 0 in RE. Moreover, we have k
(6.66)
since the value of f(c) defined by (6.62) can be obtained by the greedy algorithm (see Section 3.2.b) and a maximizer z of (6.62) satisfies z(Ai
-&I)
= f ( A i )- f ( A i - 1 )
(i=l,2,...,k)
(6.67)
with A0 = 0. If the right-hand side of (6.66) is the empty sum, it is defined to be zero. Formula (6.66) was introduced by L. Lovdsz [Lovdsz83] for D = 2E. The construction of f through (6.64)-(6.66) can be applied to any function f on D with j ( 0 ) = 0 and f is an extension of f . We call such an extension f the Lovcisz extension o f f .
Theorem 6.13 [Lovdsz83]: A function f: D the Loviisz extension f o f f is convex.
+R
is submodular i f and only i f
(Proof) If f is a submodular function, then its extension f^ is given by (6.62) and hence is a convex function. Conversely, suppose that the extension f of f is a convex function. By definition, for any X , Y E D
f ( x x + x u ) = f ( x x u u + xxnu) =f
Since
f
( x u Y ) +f ( x n Y ) .
(6.68)
is a positively homogeneous convex function, we also have
From (6.68) and (6.69) f is a submodular function on 187
D.
Q.E.D.
IV. SUBMODULAR ANALYSIS Theorem 6.13 shows the close relationship between the submodularity and the convexity. The results in Sections 6.1 and 6.2 can be viewed from the theory of convex functions through this theorem. However, the integrality result in Theorem 6.3 does not follow directly from the ordinary convex analysis; it is truly a combinatorial deep result. Define
P(D) = the convex hull of vectors
XA
( A E D).
Lemma 6.14 [Lov&z83]: For a submodular function f : D min{f(x)
I x E D)
= min{f(c)
t
I c E P(D)).
(6.70)
R we have (6.71)
(Proof) Immediate from Theorem 6.13 and (6.66), the positive homogeneity of Q.E.D.
i.
Lemma 6.15: For any c E P(D) there uniquely exists a nonernpty chain B1
C
B2
C
*
*
a
C
BP
(6.72)
of D such that c is expressed as a convex combination (6.73)
withpi > O ( i = 1 , 2 , . - . , p ) a n d C : = l p i = l . (Proof) Any c E P ( P ) is a nonnegative monotone nonincreasing function from = ( E , d ) to ( R , < ) . Therefore, there uniquely exists a chain (6.64) of nonempty A* E D (i = 1 , 2 , . . . , k ) and X i > 0 (i = 1 , 2 , - . . , k ) such that (6.65) holds. Since c E P(D), we have
P
k
CXi 5 1.
(6.74)
i=l
If E:=l Xi = 1, then (6.65) is the desired unique expression. Otherwise put Xo = 1 X i and A0 = 0. This yields a desired unique expression
xf=l
(6.75) i=O
188
6.3. THE LOVASZ EXTENSIONS OF SUBMODULAR FUNCTIONS
with Xi > 0 ( i = 0 , 1 , . . - , k ) and Now, let f : D {+4 by
-+
C:=oXi = 1.
Q.E.D.
R be a submodular function and define f”:RE-+ R
U
(6.76)
where
f* is the Lovbsz extension of f . Call f” the truncated Loucisz extension of
fTheorem 6.16: For each A E
D
we have
af(A) = af”(xA)
3
where df”(XA)denotes the subdifferential of the convex function ordinary sense of convex analysis (see (1.34)). (Proof) By definition, we have x E
(6.77)
f” at X A in an
a f ” ( x ~if )and only if
vc E R ~(C :- x A , z )
5 f”(c) - f”(xA).
(6.78)
Since ~ ” ( x A ) = f ( A ) ,(6.78) is rewritten as
IcER ~ } = min{f*(c) - (c,x) I c E P(P)> = min{f(X) - z(X) I X E P},
f ( ~-)X(A) I min{f”(c) - (c,z)
(6.79)
where the last equality follows from Lemma 6.14 with f replaced by f - x. Q.E.D. (6.79) is equivalent to x E d f ( A ) .
Theorem 6.17: Let c be an arbitrary vector in P(P) and suppose that c is expressed as (6.73) with (6.72). Then, we have
aj(c)= n { a f ( B i ) I i = 1 , 2 , . . . , p } . (Proof) We have z E df”(c) if and only if
From (6.72) and (6.73), (6.81) is rewritten as
189
(6.80)
IV. SUBMODULAR ANALYSIS due t o Lemma 6.14. Furthermore, since c y = l p i = 1 and p, 1,2, . . ,p ) , (6.82) is equivalent t o
f(&)
- x(&)
= min{f(X)
-
x(X) I X E
> 0 (i
=
D}
(i = 1 , 2 , . . . , p )
(6.83)
(6.84) Q.E.D.
For any maximal chain C:O = SO c S1 c ... c S, = E of D, denote by P(C) the n-simplex with vertices xsi (i = O , l , - - . , n ) .
Lemma 6.18: The collection of P(C)’s for all the maximal chains C of D forms a simplicia1 subdivision of P(P). Moreover, for two maximal chains Ci: 0 = SA c Sfc c S: = E (i = 1,2) the n-simplices P(Ci) (i = 1,2) have a common facet if and only if for some k with 1 5 k 5 ia - 1 we have S:=Sf
(Osj
(6.85)
(Proof) The first half of this lemma follows from the uniqueness property of Lemma 6.15. Moreover, any facet of the n-simplex P(Ci) corresponds t o a subchain of Ci with length n - 1. From this follows the second half of the lemma. Q.E.D.
Lemma 6.19: For any maximal chain C: 0 = SO c S1 c ... c S, = E of D and any interior point c of P(C), f” has a unique subgradient x a t c given by
x(S1 - si-1)= f(S1)- f(Si-1) (i = 1 , 2 , . . * , n).
(6.86)
(Proof) From Theorem 6.17 vector x given by (6.86) is the unique subgradient of j at c . Q.E.D. Note that the subgradient x of f given by (6.86) is a n extreme point of the base polyhedron B(f). We can see that for a submodular function f:D -+ R , the convex conjugate function ‘f and the truncated Lovbsz extension f” of f are the convex conjugate functions of each other in an ordinary sense of convex analysis (see (1.35)), since
f‘(z) = max{z(X)
-
f(X)IX E
= max{(c,x)
-
j(z)
190
P}
I z E RE}.
(6.87)
7.1. UNCONSTRAINED SUBMODULAR PROGRAMS
Consequently, the Fenchel-type min-max theorem for submodular and supermodular functions (Theorem 6.3), except for the integrality property, follows from Fenchel’s duality theorem for ordinary convex and concave functions. Finally, it should be noted that the extension of the rank function of a polypseudomatroid can be defined by generalizing the Lovksz extension of a submodular function, due to the greediness property of polypseudomatroids (see [QiSS]).
7. Submodular Programs In this section we consider optimization problems with objective functions and constraints described by submodular functions, which we call submodular programs ([Fuji84f]).
7.1. Submodular Programs - Unconstrained Optimization Let f : D -+ R be a submodular function on a simple distributive lattice D & 2E with f(0) = 0. We consider the problem of minimizing the submodular function f : D -+ R without any constraints. It should, however, be noted that the underlying distributive lattice D itself may be regarded as a constrained feasible region.
(a) Minimizing submodular functions From the definition of subdifferential of a submodular function we have the following trivial but fundamental lemma.
Lemma 7.1: A set A E D is a minimizer off: D
-+
R if and only if
where 0 is the zero vector in RE. From Lemmas 6.4 and 7.1 we get the following theorem which means that some “local” or “partial” minimality implies “global” minimality.
Theorem 7.2: A set A E D is a minimizer of f : D -+ R if and only if A 6 D minimizes f restricted to the sublattice [@,AIDu [ A ,E ] D . 191
IV. SUBMODULAR ANALYSIS
+
+
Grotschel, Lov&sz and Schrijver [Grotschel Lov&sz Schrijver881 have devised a strongly polynomial algorithm for minimizing a submodular function f : D --+ R which requires time polynomial in IEl. Their algorithm heavily depends on the so-called ellipsoid method [Khachian79, 801 for linear programs and is not a combinatorial one. A combinatorial but pseudopolynomial algorithm is proposed by Cunningham [CunninghamSFia],where the submodular function f is integer-valued and the required running time is polynomial in IEl and max I f ( X ) I but not in log(max I f ( X ) I ) . For special classes of submodular functions the problem of minimizing a submodular function can be reduced to network optimization problems such as the minimum cut problem [Picard761 and the maximum-weight stable-set problem in a bipartite graph [Billionnet MinouxSti]. We show a “practical” algorithm for minimizing submodular functions based on the following lemma (see (3.22)).
+
Lemma 7.3:
Consider a special quadratic programming problem over the base polyhedron B(f) described as Minimize
1 1 ~ 1 1(=~
x(e)2)
(7.3a)
CEE
subject to z E B(f)
(7.3b)
(see [ Fuj i80bl). In the present subsection we assume that the underlying totally ordered additive group R is the set of reals or rationals. (In Chapter V we will consider in detail a class of nonlinear optimization problems over the base polyhedron which includes Problem (7.3) .) Let x* be the optimal solution of Problem (7.3) and define y* = x* A O (= (min{z*(e),O} I e E E ) ) ,
A - = {e
I e E E,
z*(e) < 0},
(7.4) (7.5)
Lemma 7.4: y* in (7.4) is a maximizer of the right-hand side of (7.2). Also, Ais the unique minimal minimizer o f f and A“ is the unique maximal minimizer off. 192
7.1. UNCONSTRAINED SUBMODULAR PROGRAMS
(Proof) Because of the optimality of z* we have
V e E A _ : dep(z*,e) C A _ , V e E Ao: dep(z*,e) 2 Ao.
D and
From (7.4)-(7.8), we have A _ , A0 E
Since for any X E D and any y E P ( f ) with y 5 0 we have f ( X ) 2 y ( E ) , it follows from (7.9) that y* is a maximizer of the right-hand side of (7.2) and A - and A0 are minimizers of f. Moreover, for any X E D such that X c A _ ,
f ( X ) L z * ( X ) > .*(A_) = f ( A - ) and for any X E D such that A .
(7.10)
c X,
f ( X ) 2 z * ( X ) > z*(Ao)= f ( A o ) -
(7.11)
From (7.9)-(7.11), A - is the unique minimal minimizer of f and A0 is the Q.E.D. unique maximal minimizer of f. Let c1
< c2 < . . . < cp be the distinct values of B; = { e I e E E , P ( e ) 5 c;} Bo = 0.
z* (e) (e E E ) and define
(i = 1 , 2 , . - . , p ) ,
(7.12) (7.13)
By a similar argument as (7.7)-(7.9) we see that
8 = Bo c B1 c is a chain of
D
c Bp = E
(7.14)
(i = 0 , 1 , * . . ,P).
(7.15)
and that
f (23;)= 5*(B;) Consequently, we have
(7.16)
Problem (7.3) is to find the minimum-norm point in B(f). For simplicity, let us suppose B(f) is bounded, i.e., D = 2E. A solution algorithm for the minimum norm-point problem is proposed by P. Wolfe [Wolfe76] (also see 193
IV. SUBMODULAR ANALYSIS [von Hohenbalken751. We can adopt Wolfe's algorithm for Problem (7.3). We express a simplex S in RE by the set of its extreme points. A n algorithm for finding the minimum-norm point i n B ( f )
Step 1: Let z* be any extreme point of B(f). Put S + {z*}. Step 2: Using the greedy algorithm, find a minimum-weight base 2 of B(f) with respect to the weight function z*. If (z*, 2 - z*) = 0, then stop (z* is the minimum-norm point). Otherwise put S +- S U {2}. Step 3: Find the minimum-norm point xo in the affine space generated by S . If zo is in the relative interior of the simplex S, then put x* t zo and go to Step 2. Otherwise let S' c S be the unique minimal face of simplex S which has the nonempty intersection with the line segment [z*,zo]. Put z* + the intersection point of the face S' and the line segment [z*, xo], and also put S t S'. Go to the beginning of Step 3. (End) Step 3 is consecutively repeated at most IEl- 1 times, since the cardinality of S monotonically decreases. Each simplex S available when executing Step 2 uniquely determines z* lying in the relative interior of simplex S and the norm llz*II monotonically decreases every time z* is renewed. Therefore, all the simplices S which we encounter during the execution of the algorithm are different, so that the algorithm terminates after a finite number of steps.
Example: As an illustrative example, consider the minimum-cut problem for the two-terminal network 1 = (G = (V,A ) ,s+ ,s- ,c ) shown in Fig. 7.1, where V = {s+,s-, 1,2,3}. The numbers attached to the arcs denote the capacities ~ ( a (a ) E A ) . The problem is to find a cut U C V with s+ E U and s- $ U which minimizes its capacity c ( a ) , where A'U is the set of arcs leaving U . Putting V * = V - {s+,s-}, we define a submodular function f:2v* .+ R as follows. For each W C V * ,
xnEA+U
f(W) =
c
C(4
&A+ (WU{s+})
-
c
44.
(7.17)
& A + { s+ }
The second constant term is for the normalization, f(0) = 0. Now, the minimum-cut problem is reduced to the problem of minimizing the submodular function f : 2v* -+ R. Let us start with the extreme point Z* of B(f) corresponding to the sequence (1,2,3) of the vertices in V * ,i.e., Z*
(= ( ~ * ( 1 ) , ~ * ( 2 ) , ~ *=( 3(0,10,-20). ))) 194
(7.18)
7.1. UNCONSTRAINED SUBMODULAR PROGRAMS
1
3
30
S+
S-
2
Figure 7.1. Next, in Step 2 we find by the greedy algorithm the minimum-weight base 3 of B(f) with respect to weight x*, which is the extreme point of B(f) corresponding to the vertex sequence ( 3 , 1 , 2 ) , i.e.,
E = (-30,0,20).
(7.19)
The minimum-norm point xo in the line through the two points of (7.18) and (7.19) is given by XO
= -(0,10,-20) 17 26 1 = -(-270,170, 26
+ -(-30,0,20) 9 26 -160).
(7.20)
g,
9 Since 0 < ZT;, we put x* +- xo and find a new extreme point, denoted by 3, of B(f) corresponding to the vertex sequence (1,3,2) determined by x* (= zo in (7.20)). 2 is given by ? = (o,o, -10). (7.21)
The minimum-norm point xo in the two-dimensional affine space generated by the three points of (7.18), (7.19) and (7.21) is given by 1 xo = --(0,10, -20) 3
1 11 + -(-30,0,20) + -(O,O, 9 9 195
-10).
(7.22)
IV. SUBMODULAR ANALYSIS
-+
i, y ,
Since <0 < the minimum-norm intersection point of the present simplex formed by points of (7.18), (7.19) and (7.21) and the line segment between points of (7.20) and (7.22) lies in the face formed by points of (7.19) and (7.21). Hence, we discard point (7.18) from the present simplex S and find the minimum-norm point, denoted by zo again, in the line through points of (7.19) and (7.21). 1 6 = ( - 5 , 0 , -5).
zo = -(-30,0,20)
5 + -(O,O,-10) 6 (7.23)
Since the present zo is in the relative interior of the simplex formed by points of (7.19) and (7.21), we put z* +- xo and go back to Step 2. Now, we find a minimum-weight base 2 with respect to the weight given by (7.23). Such a base f is given by (7.19) (or (7.21)) and the algorithm terminates. From the minimum-norm base (7.23) we see that Uo = { s + , 1 , 2 , 3 } is the unique maximal minimum-cut and U- = {s+, 1 , 3 } is the unique minimal minimum-cut of the network, due to Lemma 7.4. When the algorithm for finding the minimum-norm point in B(f) terminates, we are given the minimum-norm point x* and a set of extreme points x; (i E I) of B ( f ) such that z* is expressed as a convex combination (7.24)
(i E I ) . For each i E 1 let to the distributive lattice
of z;
D(z,) = { X
Pi = ( E ,5;)be
the poset which corresponds
I x E D, z ; ( X ) = f ( X ) }
with D(z;)= Z p t , and define a distributive lattice
(7.25)
0 by (7.26)
Then,
B = D(z*) (= {X I x E D, z * ( X ) = f ( X ) } ) .
(7.27)
An algorithm for finding the poset Pi = ( E , s ; )for each extreme point z; E B(f) is proposed by Bixby, Cunningham and Topkis [Bixby Cunningham Topkis851 (see Section 3.3.c). The poset P = ( k ,5 ) on a partition of E which corresponds to the distributive lattice fl is easily obtained by superimposing the posets P; ( z E I ) .
+
196
+
7.1. UNCONSTRAINED SUBMODULAR PROGRAMS
For any maximal chain (7.28)
..
of D, the minimum-norm point z* satisfies (7.29)
for each e E Ai - hi-1 ( j = 1,2,...,q) (cf. (7.16)). Furthermore, all the minimizers of f are given by the interval [ A - , Ao]d of b , where A- and A0 are defined by (7.5) and (7.6).
(b) Minimizing modular functions Let p: D --f R be a modular function on a simple distributive lattice with P = (I?, 5 ) and p(0) = 0.
Lemma 7.5: There exists a unique vector P(X)=
c
Y
D
=Zp
E RE such that for each X E D (7.30)
4e).
eEX
(Proof) Choose any maximal chain C:0 = So define a vector Y E RE by
c S1 c ... c S,
= E of
v(e;)= p(S;) - P(S’;-~) (i = 1 , 2 , - . . , n ) , where {e;} = S, - S;-l (i = 1 , 2 , . . . ,n ) . For any X E 1 5 j l < . . . < jP5 n with = p such that
1x1
Since p is a modular function and Sjl
D
D and (7.31)
there exist integers
c Sj, c . . c Sit,, we have
k=2 I’
(7.33) k= 1
197
IV. SUBMODULAR ANALYSIS where use is made of the fact that X U Sjp-l = Sj,, X n Sjk-l = X n S j k - l (k = 2 , . . . , p ) and X n Sjl-l = 0. (7.30) follows from (7.31) and (7.33). The Q.E.D. uniqueness of u is clear from (7.30). We see from this lemma that the problem of minimizing the modular function p : D t R is equivalent to the problem of finding a minimum-weight (lower) ideal of the poset P = ( E ,5 ) with respect to the weight function u:E t R . The latter problem was solved by J. C. Picard [Picard761 by reducing it to a minimum-cut problem. Conversely, the reducibility of the minimum-cut problem to a problem of minimizing a modular function was shown by W. H. Cunningham [ CunningharnBCib] We first show Picard’s approach. Let G(P ) = ( E ,B * ( P ) )be the graph representing the Hasse diagramof P = ( E , < ) , i.e., ( e 1 , e z ) E B*(P)if andonly if e l covers e2 in P . (The following procedure works if we take, as G ( P ) ,any graph G’ whose transitive closure coincides with that of G ( P ) ,and a (transitively) closed set of G’ is an ideal of P.) Consider new vertices s+ and s- and sets of new arcs (7.34) S+ = { ( s + , e ) I e E E , u ( e ) < 0},
.
1 e E E,
v ( e ) > 0). of arcs a in B * ( P )u S+ U S- by
S- = { ( e , s - )
Also define the capacities C(U)
C(U)
=
{
+oo -v(e)
(a E W ) ) ((s+,e) E S+)
4e)
( ( e , s - ) E 5’-).
(7.35)
(7.36)
Denote this network by N = (& = (k,A),s+,s-,c),where E = E u {s+,s-} and = B * ( P )u S+ u S- (see Fig. 7.2). For any cut U C of the network N (i.e., s+ E U and s- 4 U ) , the capacity of the cut U is given by
A
K ~ ( u )=
I e E E U, v(e) < 01 + C { v ( e ) I e E E nU, v(e) > 0) + C { c ( a ) I a E B * ( P ) ,d+a E U , a - a
C{-v(e)
-
E E - U > (7.37)
Note that for any cut U of finite capacity z c ( U ) , the third term in (7.37) vanishes and U - {s+} is an ideal of P = ( E ,5 ) and, conversely, that for any ideal I E of P I u {s’} is a cut of finite capacity in N. Furthermore, for any ideal I of P we have K,(I
u {.+I)
=
C{-u(e) I e E E I , v(e) < 01 + C{+)I e E I , 4 e ) > 01
=Y(I)
-
+ C { - v ( e ) I e E E , v ( e ) < 01. 198
(7.38)
7.1, UNCONSTRAINED SUBMODULAR PROGRAMS
S-
3
-2
1
1
Figure 7.2. (a) A poset P with weights. (b) Network
N.
Therefore, minimizing /cC,(IU { s + } ) for cuts I U { s + } of N is equivalent to minimizing u(I) for ideals I of P . See [Picard Queyranne821 for practical applications of the problem of finding a minimum- or maximum-weight ideal of a poset or a minimum- or maximum-weight closed set of a graph. Next, we consider the problem of minimizing a modular function p: D + R from the point of view of submodular analysis. Given a submodular function f : D --f R and a vector x E R E ,consider the following problem. It includes the problem of minimizing the submodular function f .
+
(*) Find A 6 D such that
5
E a f ( A ) and then find an expression 5
= 51
+ 52
(7.39)
such that x1 is a convex combination of extreme points of a f ( A ) and 5 2 is a nonnegative linear combination of extreme vectors of C j ( A ) , the characteristic cone of a f ( A ) . Note that the problem of minimizing f is equivalent to that of finding a set A E D such that 0 E a f ( A ) . We consider the above problem (*) in the special case when f is a modular function p: D + R with p ( 0 ) = 0. 199
IV. SUBMODULAR ANALYSIS For modular function p , the subdifferential d p ( A ) for each A E D has a unique extreme point v due to Theorem 6.11 and Lemma 7.5, where Y is the vector appearing in Lemma 7.5. Hence Problem (*) is reduced to the problem of finding A E D such that x - v E C f( A ) and of finding a nonnegative linear combination, of ER(A) given by (6.57), which expresses x - v. The proof of Theorem 6.12 suggests an algorithm as follows. The graph G(P)= ( E , B * ( P ) represents ) the Hasse diagram of the poset P = ( E ,5 ) .
An algorithm for P r o b l e m (*) S t e p 1: Find a nonnegative flow p:B*(P) --+ R+ in G ( P ) and nonnegative coefficients a ( p + ) ( p + E E + ) and p ( p - ) ( p - E E - ) such that
(Here, &,+ ( p + E E+) and vp- ( p - E E - ) are vectors in RE defined by (6.54) and (6.55).) For each p + and p - such that p + = p - E E+ n E - we choose the values of a ( p + ) and , B ( p - ) so that a ( p + ) p ( p - ) = 0.
4
Step 2: Construct a network = (&(P),t ) with an underlying graph &(P) = ( E , B ( P ) )and a capacity function t : & ( P ) --+ R+ U {+m} defined as follows. The arc set &(P) of &(P) is given by
and the capacity function c^ by (7.42)
Then find a maximum flow $: &(P) + R+ in .$ from the entrance vertex set E+ - E - to the exit vertex set E - - E+ such that
d$(e) = 0
(e
EE
a $ ( ~ +I ) &(P+) -a$(p-)
-
( E +U E - ) ) ,
(P+ E E+ - E - ) ,
I P(P-)
(P- E E -
-
E+).
(Here, the boundary operator d is defined with respect to e ( P ) . ) 200
(7.44) (7.45) (7.46)
7.1. UNCONSTRAINED SUBMODULAR PROGRAMS
Step 3: Put
Then find A E D such that (i) p(a) = 0 ( a E A-(A)), (ii) a ( p + ) = 0 ( p + E E+ n A ) , (iii) P ( p - ) = 0 ( p - E E - - A). (End)
For any A E expressed as
D
(7.50) (7.51) (7.52)
satisfying (7.50)-(7.52) we have z E dp(A) and z is
using the obtained a , P and p . Step 1 can be carried out by the breadth-first search and requires linear time. Step 2 is performed by any maximum flow algorithm. Any minimum cut U C E obtained by the maximum flow algorithm in Step 2 gives a desired A D to be found in Step 3 as A = E - U . From (7.50)-(7.52), z--Y in (7.53) is a nonnegative linear combination of ER(A). When z = 0, the above algorithm solves the problem of minimizing the modular function p. In this case A E D found in Step 3 is a minimizer of p since 0 E dp(A). Let us consider the problem of minimizing a modular function p: D --+ R from a polyhedral point of view. Denote by P(P) the convex hull of all the characteristic vectors xx ( X E D). L e m m a 7.6:For any A E D , the inequalities of all the facets of the polyhedron P(D) which include the vertex X A are given by z(e) z(e)
-
z(e')
50
2 0 (e E E+ - A),
(e covers e' in
P and either e,
s(e) 5 1
e' E
(e E E - n A).
(7.54)
A or e, e'
A), (7.55) (7.56)
(Proof) The lemma follows from the fact that the dual cone (of the tangent cone) of P ( P ) at X A (regarded as the origin) is the characteristic cone C,(A) of a p ( A ) . Notice the one-to-one correspondence between the set of facets 201
IV. SUBMODULAR ANALYSIS (7.54)-(7.56) and the set ER(A) of the extreme rays of C,(A) given by (6.57) with (6.54)-(6.56). Q.E.D.
Corollary 7.7: All the facet inequalities of P( D ) are given by z(e)
2o
(e E
z(e) - z(e')
5o 51
(e
z(e)
E+),
(7.57)
covers e' in P),
(7.58)
(e E ,?3-).
The problem of minimizing the modular function p : D the problem of minimizing the linear function
(7.59) -+R
is reduced to
(7.60) over P ( D ) , where Y is the vector appearing in Lemma 7.5. Therefore, it follows from Lemma 7.6 that A E D is a minimizer of p (or X A is a minimizer of ( Y , z) over P ( D ) ) if and only if the vector -Y is expressed as a nonnegative linear combination of vectors tP+( p + E E+ - A), ,$, ( a 6 B*(P)- A-(A)) and qp( p - E E- nA) which are coefficient vectors of (7.54)-(7.56), where inequalities (7.54) should be considered in the form of -z(e) 5 0 (e E E+ - A). 7.2. Submodular Programs - Constrained Optimization
We consider the problem of minimizing a submodular function f : D + R with constraints on the domain D of f and discuss some other related problems.
(a) Lagrangian functions and optimality conditions Suppose that a sublattice DO of D is given. We say that a vector a E RE is normal to DO at A E DO if for each X E DO
u ( X ) - a ( A ) 5 0.
(7.61)
The following theorem characterizes the minimizers of f when the domain of f is restricted to a sublattice of D .
Theorem 7.8 (cf. [Rockafellar70, Theorem 27.41): Let f : D + R be a submodular function and Do be a sublattice of D . Then, for A € Do we have (7.62) 202
7.2. CONSTRAINED SUBMODULAR PROGRAMS if and only if there exists a subgradient a E a f ( A ) such that -a is normal to DO at A. (Proof) The “if” part: From the assumption, we have for any X E Do 0 F 4x1 - a(A) 5 f(X) - f ( A ) ,
(7.63)
from which (7.62)follows. The “only if” part: Define a modular function PO:DO -+ R by PO(X= ) 0
(X E Do).
Then, by the assumption A is a minimizer of fo from Theorem 6.8 and Lemma 7.1 that 0 E WO(A) = W A )
G
(7.64)
f
+ PO:DO
-+
R . It follows
+ aPo(A).
(7.65)
Hence, there exists a vector a E d f ( A ) such that -a E a p o ( A ) .
From (7.64),(7.66)implies that -a is normal to
(7.66)
DO at A.
Q.E.D.
Next, consider a constrained minimization problem for which the “feasible region” Do in (7.62)is defined by a set of equations. Suppose that we are given submodular functions fi: D -+ R (i = 0,1,. ,rn) and that the minimum value of f; for each i = 1,2, ,m is known and equal to ai. The feasible region DO is given by
-
Do = {X
I
X E
D, V i E {1,2,...,rn}:f;(X) = a;}.
(7.67)
Here, we assume Do # 0. From Lemma 2.1,Do is a sublattice of D. Let us consider the following constrained minimization problem: Minimize fo ( X ) subject to f;(X) = ai Define a function L:R’; x D
+R
(7.68a)
(i = 1,2,.-.,rn).
(7.68b)
by
for X = (X1,X2,. - - , A m ) E R’; and X E D. We call L the Lagrangian function associated with Problem (7.68). Also, we call X E RT an optimal Lagrange multiplier if (7.70) min{L(X,X) 1 X E D} 203
IV. SUBMODULAR ANALYSIS is equal t o the optimal value of the objective function of Problem (7.68).
Theorem 7.9 (cf. [Rockafellar70, Theorem 28.31): For Problem (7.68), (1)
= (il,..-,im) is an optimal Lagrange multiplier and
(2) 2 is an optimal solution of Problem (7.68) if and only if (i)
(3) (iii)
iE (il,..-,im) CRY,
2 is a feasible solution of Problem (7.68) and o E a f o ( 2 )+ ilafl(2)+ + imafm(2).
(Proof) First, note that for each i = l , . - . , m , af,(X) and a f o ( X ) has the same characteristic cone, since the characteristic cone is determined by the underlying distributive lattice D alone. Therefore,
dfo(X) =afo(x)+o+afi(x) ( 2 = 1 , 2 , . . * , m ) .
(7.71)
Hence, from Theorem 6.8 and Lemma 7.1, (iii) is equivalent to 0 E dxL(i,k),
(7.72)
where a x L ( i , 2)denotes the subdifferential of the submodular function L ( i , .): D -+ R a t X. The “if” part: From (i)-(iii) and (7.72) we have rnin{L(i,X)
I x E D) = ~ ( i ,=if)o ( 2 ) ,
(7.73)
while we have for any feasible solution Y
Hence (1) and (2) follow. The “only if” part: From (1)and (2) we have X E
min { L(i,X)
Do,
I x E D) = fo(2)= ~ ( i , i ) .
(7.75) (7.76)
Therefore, we have (7.72), or (iii). (i) and (ii) trivially follow from (1) and (2). Q.E.D. The above proof almost parallels that of the corresponding theorem for ordinary convex functions ([Rockafellar70, Theorem 28.31). It should, however, 204
7.2. CONSTRAINED SUBMODULAR PROGRAMS be noted that the above proof heavily depends on the results in Section 6.2, especially Theorem 6.8. Define a function p:RT -+ R U { t o o } by p ( u ) = min{fo(X)
1 X E D, Vi E {1,*.*,m}: f;(X) - ai L u i } ,
(7.77)
E RT. We define p(u) = +m for u for which there where u = (ul,***,um) exists no X E D such that f;(X) - ai 5 ui for all i = l , . . . ,rn. We call p the perturbation function associated with Problem (7.68).
Theorem 7.10: For the perturbation function p associated with Problem (7.68) we have for each X E RT
+
+ + Xmum 1 u E RT}
min{p(u) Xlul = min{L(X,X) I X E (Proof) Suppose that 2 E the definition of p ( u ) ,
D
D}.
is a minimizer of L ( X , X ) in X E
L(X,k) = p(u)+ XlE1 i* * * where
(7.78)
+ X,ii,,
(7.79)
(2)- a, ( z = 1, - . - ,rn) . Hence we have
= f;
On the other hand, suppose that .Ci E RT is a minimizer of p ( u ) Xmum in u E RI;". Then, there exists an Xo E D such that P ( 4 = fO(XO),
fi(X0) - ai 5 oi (i = 1,.. - ,m). Since
D. Then, from
+ Xlul+. ..+ (7.81) (7.82)
X E R;I, we have from (7.81) and .(7.82) p(c)
+XlQl+
. . . + Xm.ci, 2 L(X,X o ) .
(7.83)
Therefore,
+
+ + Amum I u E RT}
min{p(u) Xlul 2 min{L(X,X) 1 X E
D}.
The present theorem follows from (7.80) and (7.84). 205
(7.84) Q.E.D.
IV. SUBMODULAR ANALYSIS In the proof of Theorem 7.10 we have already shown the following.
Theorem 7.11: For each X E RT, (a) if 2 E D is a minimizer of L ( XX, ) in x E D , then 6 = (61, bY 6;= f;(R) - a; ( i = l , . - , m )
+
+ +
is a minimizer of p(u) Xlul .-. Xmu, in u E R;1; . X,U, (b) if 6 = (01, ,a,) is a minimizer of p(u) Xlul then there exists an 2 E D such that
---
+
+ a
+
- ,6,)
given (7.85)
in u E RT,
(7.86)
fi(2)-a; 5 6; ( i =
l,...,rn),
(7.87)
-
and 2 is a minimizer of L( A, X ) in X E D. Here, for each i = 1 , . . ,rn, (7.87) holds with equality if A, > 0, while X i = 0 if (7.87) for i holds with strict inequality. Furthermore, we have
Theorem 7.12 (cf. [Rockafellar70, Theorem 29.11): A vector optimal Lagrange multiplier of Problem (7.68) if and only if min{p(u)
E R;1 is an
+ i l u l + . . . + imu, I u E RT} = ~ ( 0 ) .
(7.88)
+
(Proof) (7.88) means that the zero vector 0 E R;1 is a minimizer of P(U) i l u l + ... + imu, in u E RT. Hence, the present theorem follows from Theorems 7.9-7.11. Q.E.D.
We see from Theorems 7.11 and 7.12 that if iE RT is an optimal Lagrange multiplier, then any X E RT such that $ 5 X is also an optimal one. Note that (7.87) for i such that 0; = 0, holds with equality since f ; ( X ) - a( 2 0 for any
XED. An algorithm for finding an optimal solution and an optimal Lagrange multiplier for Problem (7.68) is given as follows. For simplicity, we consider the case when m = 1.
An algorithm for solving Problem (7.68) (with rn = 1) Step 1: Let a0 be an upper bound of fo such that a0 > f o ( X ) for all X E D (with strict inequality) and let (YO be a lower bound of fo. Also let a1 be an upper bound of f l such that a1 > a 1 . Put X t (a0 - ao)/(al- ( ~ 1 ) . 206
7.2. CONSTRAINED SUBMODULAR PROGRAMS
k t a m i n i m i z e r o f L ( X , ~=) f o ( ~ ) + ~ ( f l ( ~ ) - ia nl x) E P . fl(k)- a1 = 0, then stop (k is an optimal solution and X is an
Step 2: Put
Step 3: If optimal Lagrange multiplier). Otherwise, put and go back to Step 2. (End)
X
+ (a0 -
fo(k))/(fl(k) - a1)
The validity of the above algorithm follows from Theorems 7.8, 7.11 and 7.12. The case when m > 1 can also be treated by the algorithm by setting f l +- f~ f2 f,. If min{fl(X) fz(X) f,(x) I X E P} f a1 a2 - . a,, then there is no feasible solution. The upper bounds of the submodular functions required in Step 1 are obtained by adapting (3.89). When fi (i = 0 , l ) are integer-valued, the above algorithm terminates after repeating the cycle of Steps 2 and 3 at most 2 min(a0 (YO, a1 - a1} times, where upper bounds ai (i = 0 , l ) and a lower bound a0 are chosen to be integral. For each X E RT denote by L(X) the set of minimizers of the Lagrangian function L(X,X) = fO(X) Xlfl(X) + * - * h&rJX) (7.89)
+ + +
+ + +
-
a
+
.
+
+ +
+
in X E D. L(X) is a sublattice of D. Because of the finiteness character of the problem there are a finite number of distinct L(X)’s (A E RT). The structure of these L(X)’s is closely related to the concept of principal partition ([Kishi Kajitani681, [Iri71], [Tomi76],“arayanan741, [Fuji78c],[Iri79], [Nakamura IriSl], [Iri84], [NakamuraSSa], [Tomi Fuji82]), which will be discussed in the subsequent subsection.
+
+
+
(b) Related problems We discuss other problems related to submodular programs.
(b.1) The principal partition [Iri79], [Nakamura+Iri81], [Tomi+Fuji82] Let us consider the Lagrangian function L(X,X) of (7.69) for the case when rn = 1 and we suppress the term a1 in L ( X , X ) . We suppose that ( P i , f i ) (i = 0 , l ) are submodular systems on E , that D1 is a Boolean lattice (i.e., D1 = (= {E - X 1 X E PI})),and that f l is monotone nonincreasing. The monotonicity of f l plays an essential r61e in the following argument. Define
D
= Do
n DI.
Consider the Lagrangian function (7.90) 207
IV. SUBMODULAR ANALYSIS for X 2 0 and
X
E
D. We also define L(X,X) for X < 0 as
L(X,X) = fO(X) + Xfl#(X),
(7.91)
f1# is the dual supermodular function of fl, i.e., fl#(X) = f l ( E )fl(E- X) (X E PI), Note that for any X E R L(X,X)is a submodular function in X E D. For each X E R let L(X) be the set of minimizers of L(X,X) in X E D. C(X) is a sublattice of D.
where
Theorem 7.13 ([Tomi
+ Fuji821): Define L'
u L(X).
=
(7.92)
AER
'L any
is a sublattice of D. More precisely, for any X and A' with X 5 A' and for X E L(X) and X' E C(X'), we have
X n x'E L(X),
X U x' E e(X').
(7.93)
(Proof) If X = A', (7.93) holds. Case I : O 5 X < A' For any X E L(X) and X' E C(X') we have
L(X,X) + L(X',X') = f O ( W + Xfl(X)+ fO(X') + Ul(X') 2 fo(X u X')+ X'fl(x u X')+ fo(Xn x')+ Xf1(x n x') + (A' - X)(fl(Xn x')- fl(X)) > L(X',XuX') +L(X,X~X') due to the submodularity of fo and from (7.94) that
fl
and the monotonicity of
L(X',XUX') = L(X',X'), L(X,XnX') = L(X,X) (and fl(X) = fl(Xn X')). We thus have (7.93). Case 2 : X < 0 5 A'
f1.
(7.94)
It follows (7.95) (7.96)
Similarly as (7.94), we have
L(X,X) + L(X',X') = fO(X) + Xfl#(X)+ fO(X')+ X'fl(X')
f O ( ux x')+ x'fl(x u x')+ f0(x n x')+ xfl#(x n x') + X'(fl(Xn x')- fl(x)) + ~(fP(x) - fl#(x n x')) 2 ~(X',xux') +~(X,xnx'). (7.97) 2
208
7.2. CONSTRAINED SUBMODULAR PROGRAMS
From this we have (7.93). Case 3: X < A' < 0 Similarly, we have L(X,X)
+ L(X',X')
+ Xfl#(X) + f o p ' ) + X'fl#(X') 2 f0(x u x') + x'fl#(xu x')+ f0(x n x') + xfl#(xn + (A' - X)(fl#(X') fl#(XU X')) 2 L(X',X u x')+ L(&X n x'). = fO(X)
XI)
-
Hence we have (7.93).
(7.98)
Q.E.D.
Denote by S+(X) the unique maximal element of f!(X) and by S-(X) the unique minimal element of f!(X) for each X E R.
Theorem 7.14: For any X and A' with X < A' we have
s-(X) _c s-(Y).
S+(X) _c S+(X'),
(Proof) For any X E C(X) and X' E f!(X')
C S+(X')
we have from Theorem 7.13
s-(X) n x' E ,!(A).
X U s+(X') E f!(X'),
This implies X we have (7.99).
(7.99)
( X E ,!(A))
and S-(X)
C: X' ( X '
(7.100) E f!(A')).
Hence Q.E.D.
We further suppose that f l is monotone decreasing. Then, for a sufficiently large X > 0 we have f!(A) = { E } and for a sufficiently small X < 0, f!(A) = (0).
Theorem 7.15: Suppose that
fi
is monotone decreasing. Then, there exists
a sequence of reals A1
< Xz <
< A,
(7.101)
with p 5 [El such that the distinct C(X) (A E R) are given by f!(X;)
(i = 1 , 2 , * - , p ) ,
{S+(Xi)>(= {S-(k+1)}) {S-(Xl)) (=
{0N,
( i = 1 , 2 , - * - , P -11,
{S+(X,)l (= {EH.
(7.102) (7.103) (7.104)
For any i E { 1 , 2 , . . . , p - 1) and any X such that A; < X < X;+1 we have
f!(X)
= {S+(k)}= {s-(Ai+1)}.
209
(7.105)
IV. SUBMODULAR ANALYSIS
Also, (7.106)
(Proof) Because of the finiteness character there exist finitely many distinct L(A) (A E R).Choose any iE R. If L ( i ) contains more than two elements, there exist some distinct X, X‘ E L ( i ) with X c X‘ and
fO(X)
+ ifl(X) = fO(X’)+ lfl(X’).
(7.107)
is Since fl(X) > fl(X’) by the monotonicity assumption, the value of uniquely determined from (7.107). If L ( i ) contains only one element, then from the finiteness character there exists an open interval (A’,A”) such that E (A’,X“) and L(A) = L ( i ) (A E (A””)). (7.108)
It follows that there exists a finite sequence of reals A1
< A2 <
* * *
(7.109)
such that distinct C(A) (A E R) are given by
L(A;) L ( A ) (A
(i = 1 , 2 , * - , p ) ,
E (Ai,&+l),
i =O,L***,P),
(7.110) (7.111)
where A0 s -00, A,+1 = +00, L(A)’s are the same in each interval ( A i , A i + l ) (i = O , l , . - . , p ) , IL(A;)l 2 2 (i = 1 , 2 , - . . , p )and IL(X)l= 1 (A E ( A ; , A i + l ) , i = 0,1, ,p ) . Moreover, for each i = 1,2, . -,p , because of the finiteness character there exists a (sufficiently small) positive number E such that
---
L(Ai - 4 _c L(Xi), L(Xi
+ c L(A;). E)
(7.112) (7.113)
Since f1 is monotone decreasing, we have from (7.112) and (7.113) S-(A;) E L(X1- €), S+(Ai) E L(Xi
+
6).
(7.114) (7.115)
From (7.114) and (7.115) we have (7.103)-(7.106). Also, since the set of the quotients S+(A;)-S+(A;-1) (i = 1 , 2 , . . . , p ) with S + ( A o ) E 0 is apartition of E into nonempty subsets of E due to Theorem 7.14 and (7.103)-(7.106), we Q.E.D. have p 5 IEl. 210
7.2. CONSTRAINED SUBMODULAR PROGRAMS The X i (i = 1 , 2 , - . - , p ) in (7.101) are called critical values for the pair of submodular systems ( D o , f ~ ) and (D1,fl). Denote SO = (D0,fo) and S1 = ( D 1 , f I ) . Submodular systems S; (i = 0 , l ) are decomposed according to the distributive lattice L* = UXERL(X) as follows. Choose any maximal chain
C : 0 = A"
c A1 c . . * c Ak
=E
(7.116)
of L* and then decompose S ; (i = 0 , l ) into their minors
S, *Aj/Ai-I ( j = 1 , 2 , * * * , k ) ,
(7.117)
where S; .Ai/Ai-lis the set minor of S ; obtained by restricting S; to Ai and contracting Aj- 1. Such a set of decompositions of S; ( 2 = 0 , l ) is called the principal partition of the pair of S ; (i = 0 , l ) . By the poset on the partition {Ai - Aj-1 I j = 1 , 2 , . . . ,k} of E which is uniquely determined by 'L (see Section 3.2.a), the corresponding poset structure is defined on the set of minors (7.117) for each i = 0 , l . We can show that the decompositions (7.117) do not depend on the choice of a maximal chain in C' ([Nakamura IriSl], [Tomi Fuji82]), due to Theorem 7.17 shown below.
+
+
Lemma 7.16: Let p: Do -+ R be a modular function on a distributive lattice Do C 2E with 0, E E D O . We have p ( X ) = 0 for all X E D if and only ifp(S,) = 0 (i = 0,1,. , k) for an arbitrary maximal chain 0 = SO c S1 c . . . C s k = E of Do. (Proof) This lemma immediately follows from Lemma 7.5 and Corollary 3.10. Q.E.D.
Theorem 7.17 [Nakamura+Iri81] (also see [Tomi+Fuji82]): For a submodular system (D,f)on E suppose that f is modular on a sublattice DO of D with 8, E E Do. For a maximal chain 8 = So c S1 c . . . c s k = E of DO consider minors (D,f) S;/S,-l (i = 1 , 2 , . . . , k ). Then, the set of these minors is independent of the choice of a maximal chain of DO. (Proof) For a vector z E R E the following two statements are equivalent (see Lemma 3.1): (i) z.'*-'1-1 i s a b a s e o f ( D , f ) . S ; / S ; - l f o r e a c h i = 1 , 2 , . . - , k . (ii) z is a base of ( D , f ) and f(St) - z(S;) = 0 for each i = 0 , 1 , . . . , k . Since f - z: Do + R is modular on Do, we see from Lemma 7.16 that (ii) is also equivalent to (iii) z is a base of ( D , f ) and f ( X ) - z ( X ) = 0 for each X E DO. It follows from the equivalence between (i) and (iii) that the set of minors ( D , f ) . S;/S,-, (i = 1,2,. . . ,k) does not depend on the choice of a maximal Q.E.D. chain of Do. 211
IV. SUBMODULAR ANALYSIS The concept of principal partition was originated by G. Kishi and Y. Kajitani [Kishi Kajitani681 for graphs, where S O is a graphic matroid and S1 = (P,-1)with uniform modular function -1(X) = -1XI (X E ) . It was generalized to a pair of graphs [Ozawa74]; to a pair of a matrix and a uniform modular function [Iri69,71]; to a pair of a matroid and a uniform modular function “arayanan741, [Tomi761 (also [Bruno Weinberg711); to a pair of a polymatroid and a positive modular function [Fuji78c], [Fuji80b]; to a pair of polymatroids [Iri79];to a pair of a submodular system and a modular function [Fuji80c]; and to a pair of submodular systems [Nakamura IriSl], [Tomi80d], [Tomi t Fuji821. The concept has effectively been applied to electrical network analysis [Iri Tomi751, “arayanan741, network flows [Fuji80b], structure and scene analyses [Sugihara82,86], the determination of the concept structure [Sugihara hi801 etc. (Also see [Iri79], [Iri83], [ h i Fuji81], [Tomi Fuji821).
+
+
+
+
+
+
+
Example b.1: The principal partition of a graph Consider a graph G = ( V , E ) shown in Fig. 7.3. Let fo: 2E + Z+ be the rank function r of the graph and let f l : 2 E + Z+ be the modular function which E ,RE. l) corresponds to the negative of the uniform vector 1 = ( l , l , - ~ ~ We have L ( X , X ) = r ( X ) - XIXI. The set (7.117) of minors S o . Aj/Ai-l ( j = 1,2, - .,lo) together with the poset structure is given in Fig. 7.4.
-
Figure 7.3. 212
7.2. CONSTRAINED SUBMODULAR PROGRAMS
The principal partition shown in Fig. 7.4 is the decomposition in the sense of Tomizawa and Narayanan. Note that each minor S O Aj/Aj-l, considered as a polymatroid with a graphic rank function, has the uniform base uxj = (Aj,Aj,...,Aj) E RAjVAj-1with the critical value X i corresponding to the minor. The direct sum of these bases uxj of minors S o Aj/Aj-r (j= 1,2, , l o ) is a base of the original graphic polymatroid and, decomposing the exchangeability graph associated with this base into strongly connected components, we have the principal partition shown in Fig. 7.4 (see [Fuji80c]).
-
-
-
---A I L3 Figure 7.4. The principal partition of the graph in Fig. 7.3. For each graph G j = (Vj, E j ) representing the minor So 1 , 2 , . . . , lo), the critical value is given by
A j = (IVil- 1)/lEil,
- Aj/Aj-1 (j= (7.118)
so that 1/Aj can be regarded as the “density” of G j = ( c , E j ) . A minor with the smallest critical value gives a subgraph of G with the maximum density (cf. [ Goldberg831). The principal partition of a graph G = (V,E ) in the sense of Kishi and Kajitani is the partition of E into three parts E+, Eo, E- such that G+ = G x E+, Go = (G - E+)/ E - , and G- = G E- have the critical values greater than equal to $, and less than respectively. Based on this tri-partition, we can determine the topological degree of freedom of the (electrical) network represented
4,
i,
-
213
IV. SUBMODULAR ANALYSIS by G. The topological degree of freedom is the minimum number of current variables and voltage variables whose values uniquely determine the value of the current variable or the voltage variable of each arc through Kirchhoff’s current and voltage laws. The topological degree of freedom of G is given by min{r(X) IE - XI - # ( E - X) I X C E } in terms of the rank function r of G . Note that IE - XI - r#(E - X ) is the nullity of the contraction G / X and that r ( X ) IE - XI - r # ( E - X) = 2r(X) - 1 x1 [El - r(E). Therefore, the x1and the topological problem is reduced to minimize 2r(X) - (XI or r ( X ) - 1 degree of freedom of G is equal to the sum of the rank of G-, the nullity of G+, and the rank (or nullity) of Go (see [Iri68,69], (Kishi Kajitani681 and [Ohtsuki Ishizaki Watanabe681). Kishi and Kajitani’s tri-partition also furnishes the solution of Shannon’s Weinberg7ll). Shannon’s switching switching game (see [Edm65b], [Bruno game is described as follows. Two players play on a graph G with a special reference edge eo. Alternately choosing one edge of G, one player, called a short player, tries to construct a cycle consisting of the reference edge eo and some of the arcs already chosen by the player, and the other player, called a cut player, tries to construct a cutset (cocircuit) containing eo. The short (or cut) player wins if he succeeds in constructing such a cycle (or cutset). If eo belongs to G- (or G+), then the short (or cut) player can win; and if eo belongs to Go, the first-move player can win (see [Bruno Weinberg7ll).
+ +
+
+
+
+
+
+
E x a m p l e b.2: The principal partition of a p o l y m a t r o i d induced by a mult i-t erminal network Consider a capacitated single-source multiple-sink network N = (G = (V,A ) ,c, s+,S-) shown in Fig. 7.5, where the label attached to an arc a E A denotes its capacity c ( a ) ,s+ is the source and S- = {s,: I i = 1,2, - . ,6} is the set of the sinks. For each feasible flow p from s+ to S- in N define a-p E RS- by
-
Then the set of vectors a-p for all feasible flows p forms the independence polyhedron of a polymatroid (see Section 2.2). Denote this polymatroid by P ( N ) . Let fo: 2’- -+ R+ be the rank function of P(J)and f l : 2’- -+ R+ be the modular function corresponding to the negative of the uniform vector 1=(1,1,...,1)E R S - . Hence,wehave
Then, associated with this pair of fo and f l , the principal partition of P(N) is given as in Fig. 7.6. A base 8-p of P(N)is shown in Fig. 7.7. Note that the principal partition of P(1)is obtained by decomposing the exchangeability 214
7.2. CONSTRAINED SUBMODULAR PROGRAMS
S+
Figure 7.5. A capacitated single-souce multiple-sink network. graph associated with the base i3-p. A characterization of such a base will be given in Section 9.2. Informally, such a base has as “uniform” components as possible.
Example b.3: The principal partition of a pair of polymatroids (IIri79, 841, [Nakamura Iri811, “akamura86bl) Let ( E , p ; ) (i = 0 , l ) be polymatroids and suppose that p 1 : 2E + R is monotone increasing. Define
+
L(X,X)= PO(X)
+ b l ( E- X )
(7.121)
for X 2 0 and X 5 E , where we may consider (7.121) as the Lagrangian function L ( X , X ) = p o ( X ) + X ( p l ( E - X ) - p l ( E ) ) with the term -Xp(E) being omitted. Let the critical values be given by
5 X I < Xz < . ‘ . < A.,
(7.122)
C : 0 = So c S1 c * * . c Sb = E
(7.123)
0
Choose a maximal chain
of L3’ in (7.92) for (7.121). Note that C is a concatenation of maximal chains of L(X;) (i = 1 , 2 , . . . , p ) , due to Theorem 7.15. It follows from Theorem 215
IV. SUBMODULAR ANALYSIS
x=5
x=2
Figure 7.6. The principal partition of polymatroid P(1 ) .
S+
Figure 7.7. A maximal flow p in
N
(each arc label denotes p(u)/c(u) (u E A ) ) .
216
7.2. CONSTRAINED SUBMODULAR PROGRAMS 4.10 that there exists a base bi of polymatroid ( E , p i ) for each i = 0 , l such that for each j = 1 , 2 , . - . , k we have b,S'-'j-l = X 3.*bsj-sj-l 1 and this vector b?-'j-' is a common base of the minors ( E , p o ) Sj/Sj-l and (E,Xj*p1) ( E - S j - l ) / ( E - Sj), where j* denotes the integer in {1,2,...,p} such that Sj-1,Sj E .L(Xj*). We can easily see from Theorem 4.10 that for any X 2 0 bo A Xbl = (min{bo(e),Xbl(e)} I e E E ) is a maximum common subbase (independent vector) of polymatroids ( E , p o )and ( E , p l ) . The pair of the bases bo and b l is called a universal pair of bases ([Nakamura81]and [Tomi80d]). Sucha universal pair is not necessarily unique. Note that
.
s-(A) <
b0
XbS-(A)
,
1
b f - s + ( A ) > Xb:-s+(A)
s + ( A ) - s - ( A )= Xbf+(x)-s-(A)
bo
Y
(7.124) and (i) b:-(A)
is a base of ( E , p l ) .S-(X) and a subbase of ( E , X p 2 ) / ( E - S-(A)), (= Xb2s+ ( A ) -s - ( A ) ) is acommon base of ( E , p o ) - S + ( X ) / S - ( X ) (ii) bo and (E,Xpl) ( E - S - ( X ) ) / ( E - S+(X)),and s+(A)-s-(A)
-
-
(iii) Xbf-S+(A) is a subbase of ( E , p o ) / S + ( X ) and a base of ( E , X p l ) ( E S+(U The collection of minors ( E , p o ) Sj/Sj-1 and ( E , p l ) . ( E - S j - l ) / ( E - Sj) ( j = 1 , 2 , . . . , p ) does not depend on the choice of a maximal chain C of L* ([Nakamura Iri811, [Tomi Fuji82]), due to Theorem 7.17.
+
+
An efficient algorithm for the principal partition of a graph is given by H. Imai [Imai83] and it requires O(IEI3log IVl) time. Also, the principal partition of a polymatroid induced by a multi-terminal network can be found in O(lV13) time (see [Gallo Grigoriadis Tarjan891 and also see [Fuji80b]). The concept of principal partition is also closely related to convex optimization problems over base polyhedra, which will be treated in Chapter V. Finally, it should also be noted that the above argument is also valid with an appropriate modification if f l is a monotone nondecreasing submodular function.
+
+
(b.2) The principal structures of submodular systems Related to the principal partition, we introduce the concept of principal structure of a submodular system ([Fuji80c]). Consider a submodular system (P, f ) on E and define for any e E E
Pf(e) = { X
I
eEX E
D, f(X) = min{f(Y) I e E Y E P}}.
(7.125)
Then, Df(e) is a distributive lattice. We denote by Df(e) the unique minimal element of Pf (e) . 217
IV. SUBMODULAR ANALYSIS
Lemma 7.18: For any
e1,e2
E E , ife2 E D j ( e l ) , then
(Proof) Put A; = D j ( e i ) (i = 1,2). Because of the definition of A1 = D j ( e l ) ,
From (7.127) and the submodularity of f we have
Since e2 E A1 n A2, it follows from (7.128) and the minimality of A2 = Dj ( e 2 ) that we have (7.126). Q.E.D. Now, let us define a directed graph G j = ( E , A f )with vertex set E and arc set A f by
It follows from Lemma 7.18 that graph G j = ( E , A j ) is transitive and that every strongly connected component of G f is a complete directed graph with a selfloop at each vertex. Decompose G j = ( E , A j ) into strongly connected components G$4 =
( E ( ; ) , A ( * )(i) = 1,2,...,p) . This induces the partition & = {E(a) I z = 1,2, - ., p } of set E and the partial order 5 on & such that E(i) 5 E ( i ) if and only if there exists at least one directed path in G f from a vertex in E ( i ) to a vertex in E(a). The principal structure of the submodular system (P, f) is the partition & together with the partial order 5 on E . We can easily see from the definitions of D j ( e ) and E(a)'sthat for any e E E ( i ) E &
-
Example b.4: The Dulmage-Mendelsohn decomposition of bipartite graphs Let G = ( V + ,V - ; A ) be a bipartite graph with left and right vertex sets V + and V - and arc set A C V + x V - . Define a set function f + : 2v+ + R by
f+(u)= p+u( ( u ( (u g v+) -
218
(7.131)
7.2. CONSTRAINED SUBMODULAR PROGRAMS
Figure 7.8. where r+U = {v- I (v+,v-) E A , v+ E U } for U (I V + . Then f+ is a submodular function, the negative of the so-called deficiency function. The principal structure of (2"+, f+) expresses a finer structure than the DulmageMendelsohn decomposition of G ([Dulmage Mendelsohn591). See Fig. 7.8 for an example of (a) the decomposition of a bipartite graph based on the principal structure of (2"', f+) and (b) the Hasse diagram representing the poset structure. Components G I , Gz, GSand G4 constitute a single component in the Dulmage-Mendelsohn decomposition. It may also be noted that for a (directed) graph G = ( V , A ) ,if we define f ( U ) = I(a-b+U) UUI - IUI (U (I V ) ,then the principal structure of (2",f) is the decomposition of G into strongly connected components with the associated poset structure. Also see [MurotaSO]for an application to certain matrices.
+
Example b.5: The principal partition of a polymatroid Let b E RE be a base of a polymatroid ( E , p ) with rank function p and define a submodular function f: 2" -+ R by
f(X)= P ( X ) - b ( X )
( X c E).
( 7.132)
We call the principal structure of submodular system (2", f) on E the principal structure of polymatroid ( E , p ) with respect to base b. The principal partition of a matroid due to [Tomi761 and [Narayanan74] is a special case of the principal 219
IV. SUBMODULAR ANALYSIS structure of the submodular system (2E,f) having the rank function f defined by (7.132) with an appropriately chosen base b.
(b.3) The minimum-ratio problem Suppose that f : D --f R is a nonnegative submodular function with f(0) = 0 and that g: D --+ R is a nonnegative supermodular function with g(0) = 0 and g(X) > 0 for some X E D. Consider the following problem: (7.133a) subject to X E
D,
g(X)
> 0.
(7.133b)
This is called a minimum-ratio problem. Special cases of the problem have been treated in [Brown791 and [Ichimori Ishii Nishida82J as a sharing problem in a network (also see [Megiddo74],[Fuji80b]),and in [Cunningham85c] as a minimum-cost problem of disconnecting a network. The minimum-ratio problem given by (7.133) is closely related to the principal partition discussed in the preceding subsection 7.2.b.l. Define a Lagrangian function for f and -g associated with Problem (7.133) by (7.134) L ( A , X ) = f(X) - Ag(X)
+
for A 2 0 and X E
+
D.
Theorem 7.19: A nonnegative fi is the minimum value of the ratio of Problem (7.133) if and only if
(Proof) Suppose that is the minimum value of the ratio of Problem (7.133). Then we have (7.137) L ( i , X ) = f ( X ) - i g ( X ) 2 0 ( X E D), where (7.137) holds with equality for some 2 E follows that L(X,X) 2 L ( i , X ) 2 0 ( 0 I x 5
L(X,2)< L ( i , X )= 0 220
D
such that g ( 2 )
A,
x E D),
(i < A).
>
0. It
(7.138) (7.139)
7.2. CONSTRAINED SUBMODULAR PROGRAMS Since L(X,0)= 0 for any X 2 0, from (7.138) and (7.139) we have (7.135) and (7.136). Conversely, suppose (7.135) and (7.136) hold for a nonnegative i. Then, there exists an k E D with g(k)> 0 which attains the minimum of L ( i , X ) in X , since otherwise we would have min{L(i E , X )I X E D , g ( X ) > 0) > 0 for a sufficiently small E > 0, which would contradict (7.136). It follows from (7.135) that
+
L ( i , X ) 2 L(i,k)= 0
(X E D),
(7.140)
i.e., for any X E D with g ( X ) > 0 we have
which holds with equality for X = k.
Q.E.D.
It should be noted that Theorem 7.19 holds for f and g without submodularity or supermodularity, though the problem would become hard without submodularity or supermodularity. A minimum-ratio problem for general set functions has also been discussed by Cunningham [CunninghamSSa]. For minimum-ratio problems also see [Gondran Minoux841. From Theorem 7.19, the minimum-ratio problem (7.133) is reduced to the problem of finding the minimum critical value X I (= I) for L ( X , X ) . The minimum critical value X I can be obtained by a binary search in R+ based on Theorem 7.19 (cf. [Cunningham85c] and [Imai83] for special cases). Also a dichotomy works for finding an element of L(X1) from among L* = U{L(X) I X 2 0} by searching a chain of L* (cf. [Fuji80b], [Nakamura Iri811, [TomiSOd], [Tomi Fuji821).
+
+
+
Example b.6: The strength of a weighted graph [Cunningham85c] Let G = (V,E ) be a connected undirected graph with a positive weight function w : E -+ R on the edge set E . For each edge e E E w ( e ) represents the strength of e . A measure of network invulnerability, called the strength of the weighted graph G, is defined by
(7.142) where for X C E k ( X ) is the number of the connected components of G - X minus one ([Cunningham85c] and also [Gusfield83]when w = 1). Let r be the rank function of G. Since G is connected, we have
k ( X ) = r ( E )- r ( E - X ) = r # ( X ) 22 1
(X 5 E).
(7.143)
IV. SUBMODULAR ANALYSIS Hence, k : 2E -+ R is a nonnegative supermodular function with k ( 0 ) = 0. Also, w: 2" -+ R is a modular function. Therefore, the problem of determining the strength o ( G ,w ) in (7.142) is a minimum-ratio problem. From Theorem 7.19,
o ( ~ , w=) max{A
I x 2 0 , vx c E :
W(X)
- A T # ( X )2 0 )
(7.144) where P ( r # ) is the supermodular polyhedron associated with the supermodular function I - # : 2E + R. Let j o be the rank function r of graph G, f l be the modular function -w, and A; (i = 1,2,...,p)be the critical values, in Theorem 7.15, for the pair of submodular systems (2E, j ; ) ( z = 0 , l ) . We can show from (7.144)that
(7.145) (see Corollary 9.6 in Section 9.2).
Example b.7: Finding a maximum density subgraph For a graph G = (V,E ) without selfloops define the density of G by (7.146) Consider the problem of finding a maximum density subgraph of G. Since a maximum density subgraph of G is connected, it suffices to find a maximumdensity connected subgraph of G. For a connected subgraph H = (W,F ) of G the density is given by
(7.147) where r is the rank function of G. Conversely, if an edge subset FO attains the maximum of (7.147), then the edge set of each connected component of the subgraph formed by Fo also attains the maximum of (7.147). Therefore, the problem is reduced to solving the following problem:
( F C E,F # 0)
(7.148)
or a minimum-ratio problem:
Minimize r(F) ( F C E , F f 0). (7.149) IF1 The minimum value of (7.149),is equal to the minimum critical value X 1 of the principal partition of G in the sense of Tomizawa and Narayanan (see Corollary 9.6). The maximum density subgraph problem is also considered by (Goldberg831, where the density of a graph G = ( V , E ) is defined by IEl/lVl. 222
Chapter V. Nonlinear Optimization with Submodular Constraints We consider a class of nonlinear optimization problems with constraints described by submodular functions which includes problems of minimizing separable convex functions over base polyhedra with and without integer constraints (see [Fuji89]). We also give efficient algorithms for solving these problems.
8. Separable Convex Optimization
Let (P, f ) be a submodular system on E with a real-valued (or rational-valued) rank function f . The underlying totally ordered additive group is assumed to be the set R of reals (or the set Q of rationals) unless otherwise stated. For each e E E let w,:R + R be a real-valued convex function on R, and consider the following problem
PI: Minimize
w,(x(e))
(8.la)
eEE
subject to
2
E B(f).
(8.lb)
Problem PI was first considered by the author (Fuji80bI for the case where for each e E E we(x(e)) is a quadratic function given by z(e)2/w(e) with a positive real weight w(e) and f is a polymatroid rank function. H. Groenevelt [Groenevelt85] also considered Problem PI where each w e is a convex function and f is a polymatroid rank function.
8.1. Optimality Conditions
It is almost straightforward to generalize the result of [Fuji80b] and [Groenevelt851 to Problem PI for a general submodular system. Optimal solutions of Problem PI are characterized as follows.
Theorem 8.1 ([Groenevelt85]; also see [Fuji80b]): A base x E B(f) is an optimal solution of Problem PI if and only if for each exchangeable pair (e, el) associated with base z (ie., e E E and e' E dep(s,e) - {e}), we have
223
V. NONLINEAR OPTIMIZATION where wei denotes the right derivative of we and
W,I-
the left derivative of
We).
(Proof) The “if” part: Denote by A, the set of all the exchangeable pairs ( e , e’) associated with x. Suppose that (8.2) holds for each ( e , e’) E A,. From Theorem 3.28, for any base z E B(f) there exist some nonnegative coefficients X ( e , e‘) ( ( e ,e‘) E As) such that
For each e E E define
We see from (8.2) and (8.4) that
2 E e l ( z ( e ’ ) ) ((e7e’) E A , ) , (8.6) where recall that e’ E dep(z, e ) implies dep(x, e‘) C dep(z, e ) . From (8.3)-(8.6) Ee(Z(e))
and the convexity of we ( e E E ) we have
where dX: E + R is the boundary of X defined by
for each e E E . (8.7) shows the optimality of x. T h e “only if” part: Suppose that for a base x E B(f) there exists an exchangeable pair ( e , e‘) associated with x such that we+(x(e))< w,l-(x(e’)).
224
(8.9)
8.1. OPTIMALITY CONDITIONS
Then for a sufficiently small a
5
> 0 we have
+ a ( X e - x e j ) E B(f).
Therefore, x is not an optimal solution.
(8.11)
Q.E.D.
Theorem 8.1 generalizes Theorem 3.16 for the minimum-weight base problem to that with a convex weight function. For each e E E and t E R define the interval
Je(E)is the subdifferential of we at define le(q) ={E
6.
Conversely, for each e E E and q E R
I t E R,
7 E Je(t))-
(8.13)
Because of the convexity of we, I e ( q ) ,if nonempty, is a closed interval in R and we express it as Ie(q) = [ i e - ( ~ ) , i e + ( ~ ) ] . (8.14) Note that q E .Ie(()if and only if
t E le(q).
Theorem 8.2: A base x E B ( f ) is an optimal solution of Problem P I if and only if there exists a chain
C: 8
A0 C A1 C
1
* * *
C Ak = E
(8.15)
of D such that (8.16)
(i) z ( A i ) = f ( A i ) ( i = O , l , . . . , k ) , (ii) f o r e a c h i = l , - . . , k ,
(iii) for each i , j = 1 , . - .,k such that i < j , we have
for any e' E Ai - A,-1 and e E A j - Aj-1. (Proof) The "if" part: Suppose there exists a chain C of (8.15) satisfying (i) - (iii). For any exchangeable pair (e', e") for base z it follows from (i) that for 225
V. NONLINEAR OPTIMIZATION somei,jE { 1 , 2 , . . . , k } w i t h i < j w e h a v e e ' E Aj-Aj-1 ande"EAi-Ai-1. If i = j , (ii) implies we,+(z(e')) 2
v 2 we,,-(z(e"))
(8.19)
for any 7 E n { J e ( z ( e ) )1 e E A, - A i - l } . From (8.19), (iii) and Theorem 8.1, z is an optimal solution of Problem P l . The "only if" part: Let x E B(f) be an optimal solution of Problem P l . The sublattice D(z)= { X I X E D, z ( X ) = f ( X ) } (8.20)
defines a poset P(D(z))= ( l I ( D ( z ) ) , d ~ ((see ~ ) ) Section 3.2.a). Suppose that n(D(z))= { E 1 , E z , . . . , E k }. For each i = 1 , 2 , . . . , k define Ji =
n{Je(+)) I
e E Ei}.
(8.21)
For each distinct e, e' E Ei ( e , e') is an exchangeable pair associated with z due to the definition of E i . Hence, because of the optimality of x we have
for each e, e' E E, due to Theorem 8.1. Therefore,
Ji # 0
(i = 1 , 2 , . . - , k ) .
(8.23)
For each i = 1 , 2 , . . . ,k we have
For any i, j such that Ej < D ( ~ )Ei we have
- 5 V+i
Vj
due to Theorem 8.1. From (8.25) and (8.26),
and we have the following monotonicity:
226
(8.26)
8.2. A DECOMPOSITION ALGORITHM
We assume without loss of generality that (8.29)
and that, defining (8.30) (8.31)
we have a (maximal) chain
C: 8 = A0
C
A1 C
* * *
C Ak = E
(8.32)
of P ( x ) . In particular, (i) holds. Also, (ii) is exactly (8.23). Moreover, from (8.27) and (8.29) we have (iii). Q.E.D. Theorem 8.2 generalizes the greedy algorithm given in Section 3.2.b.
8.2.
A Decomposition Algorithm
In the following we assume that I e ( q ) # 8 for every q E R, to simplify the following argument. It should also be noted that this assumption guarantees the existence of an optimal solution even if B(f) is unbounded. When B(f) is bounded, there is no loss of generality with this assumption. We first describe an algorithm for Problem Pl in (8.1). Here, x* is the output vector giving an optimal solution. A decomposition algorithm Step 1: Choose q E R such that (8.33) (see (8.14)). Step 2: Find a base x E B(f) such that for each e, e' E E (1) if w,+(x(e)) < q and w,l-(s(e')) > q , then we have e' dep(s,e), (2) if w e + ( z ( e ) ) < q , w e l - ( x ( e ' ) ) = q and e' E dep(z,e), then for any a > 0 we have w,~-(z(e') - a ) < q , i.e., x(e') = i e 1 - ( q ) , (3) if w , + ( x ( e ) ) = q , w,~-(x(e')) > q and e' E dep(x,e), then for any a > 0 we have w,+(x(e) a ) > q , i.e., x ( e ) = i e + ( q ) .
4
+
227
V. NONLINEAR OPTIMIZATION Put (8.34) (8.35) (8.36)
where dep# is the dual dependence function defined by
Put z * ( e ) = z(e) for each e E Eo. Step 3: If E- # 0, then apply the present algorithm recursively to the problem with E and f, respectively, replaced by E- and fE- and with the base polyhedron associated with the reduction (D, f ) E-. Also, if E+ # 0, then apply the present algorithm recursively to the problem with E and f, respectively, replaced by E+ and fE+ and with the base polyhedron associated with the contraction (D, f ) x E+ . (End)
-
The present algorithm is adapted from the algorithms in [Fuji80b] and [Groenevelt85]. Here, it is described in a self-dual form. It should be noted that for the dual dependence function dep# we have e' E dep#(z,e) if and only if e E dep(z,e'), for E B(f) (= B(f#)). Now, we show the validity of the decomposition algorithm described above. First, note that E- n E+ = 0 in Step 2 since otherwise there exist e,e' E E such that w,+(z(e)) < q , wej-(z(ef)) > q and e' E dep(z,e), which contradicts (1) of Step 2. Suppose we have E- = E in Step 2. Then, from (8.34) z(e) < i e - ( q )
Therefore,
f ( ~=)z
( ~< )
(e E E ) .
(8.38)
C ie-(q),
(8.39)
eEE
which contradicts (8.33). Hence E- # E . Similarly, we have E+ # E . Consequently, the total number of executions of Step 2 is at most [El - 1. When the algorithm terminates, then obtained vector z* is an optimal solution of Problem PI due to Theorem 8.2. Note that in Step 2 z ( E - ) = f ( E - ) and z ( E - E+) = f ( E - E+) from (8.34) and (8.35) and that E- and E - E + , if nonempty proper subsets of E, will be members of the chain C in (8.15). 228
8.3. DISCRETE OPTIMIZATION
In Step 1 a desired q may be found by a binary search but Step 1 heavily depends on the structure of the given functions we (e E E ) . A base x E B(f) satisfying (1)-(3) of Step 2 is obtained by O(lE12)elementary transformations if an oracle for exchange capacities for B(f) is available. If an oracle for saturation capacities for P(f) is available, Step 2 can be executed by calling the oracle O(lE1) times.
Remark 8.1: In Step 3 the original problem on E is decomposed into two problems, one being on E- and the other on E+, and the values of z*(e) (e E EO)are fixed. The decomposition relation obtained through the algorithm recursively defines a binary tree in such a way that E- is the left child and E , the right child of E . The decomposition algorithm described above traverses the binary tree by the depth-fist search where the search of the left child is prior to that of its sibling, the right child. The above algorithm is also valid if we modify Step 3 according to any efficient way of traversing the binary tree from the root. Remark 8.2: If we is strictly convex for each e E E , then conditions (2) and (3) in Step 2 are always satisfied, so that we have only to consider condition (1). Moreover, if we is strictly convex and differentiable for each e E E , then the above algorithm will further be simplified. This is the case to be treated in Section 9. The decomposition algorithm for the separable convex optimization problem PI lays a basis for the algorithms for the other problems to be considered in Sections 9-11. 8.3. Discrete Optimization
Suppose that the rank function f of the submodular system (D, f) is integervalued. We consider a discrete optimization problem which is Problem PI with variables being restricted to integers. For each e E E let Zir, be a real-valued function on 2 such that the piecewise linear extension, denoted by w e , of Ceon R is a convex function, where w e ( ( ) = C , ( f ) for ( E 2 and we restricted on each unit interval 16, f + 11 ( f E 2) is a linear function. Consider a discrete optimization problem described as
IPl: Minimize
Ge(x(e))
(8.40a)
eE E
subject to x E B z ( f ) ,
(8.40b)
where B Z ( f ) = {X
I x E Z E ,V X
E
D: x ( X ) 5 f ( X ) , z ( E ) = f(E)}. 229
(8.41)
V. NONLINEAR OPTIMIZATION
B z ( f ) is the base polyhedron associated with (P, f ) where the underlying totally ordered additive group is the set 2 of integers. For the same f we also denote by B R ( ~ the ) base polyhedron B(f) associated with (D,f) where the underlying totally ordered additive group is the set R of reals. Also consider the continuous version of IPl:
PI : Minimize
we(z(e))
(8.42a)
eEE
subject to z E B R ( ~ ) .
(8.42b)
Recall that we is the piecewise linear extension of 6,for each e E E.
Theorem 8.3 (cf. [Groeneveltdfi]): If there exists an optimal solution for Problem PI of (8.42), there exists an integral optimal solution for Problem PI. (Proof) Suppose that z* is an optimal solution of Problem P I . Define vectors u E RE by l ( e ) = b * ( e ) j (e E E l , (8.43)
I,
u(e) = [z*(e)l
(e E
E).
(8.44)
<
(Here, for each ( E R, I(]is the maximum integer less than or equal to and [El is the minimum integer greater than or equal to E.) Consider the following problem
Pl': Minimize
w,(z(e))
(8.45a)
eEE
subject to x E B ~ ( f ) y ,
(8-45b)
where B ~ ( f ) is r the base polyhedron of the submodular system (D, f); which is the vector minor of ( D ,f) obtained by the restriction by u and the contraction by 1. Note that an optimal solution of PI' is an optimal solution of PI. Since f is integer-valued and 1 and u are integral, B ~ ( f ) fis an integral base polyhedron. Also, the objective function in (8.45a) is linear on B ~ ( f ) y .Therefore, by the greedy algorithm given in Section 3.2.b we can find an integral optimal solution for Problem Pl',which is also optimal for Problem PI. Q.E.D. We can also prove Theorem 8.3 by using the decomposition algorithm given in Section 8.2. From the assumption, for each e E E and r] E R a,- (77) and ic+(r])are integers. We can choose an integral base 2 E B(f) as the base x required in Step 2 of the decomposition algorithm. Note that w , + ( z ( e ) ) < r] ( w e - ( z ( e ) ) > 7) is equivalent to z(e) < ie-(r]) (z(e) > i e + ( r ] ) ) . An integral optimal solution of Problem PIof (8.42) is an optimal solution of Problem IP1 of (8.40), and vice versa. 230
9.1. LEXICOGRAPHICALLY OPTIMAL BASE: NONLINEAR WEIGHTS An incremental algorithm is also given in [Federgruen
+ Groenevelt861.
9. The Lexicographically O p t i m a l Base P r o b l e m
We consider a submodular system (D,f ) on E with the set R of reals (or the set Q of rationals) as the underlying totally ordered additive group. Throughout this section R may be replaced by Q but not by the set Z of integers. 9.1. Nonlinear Weight Functions
For each e E E let he be a continuous and monotone increasing function from R onto R. For any vector x E RE we denote by T(s) the sequence of the components x(e) (e E E ) of x arranged in order of increasing magnitude, i.e., T(x) = (x(el),x(e2),.--,z(e,)) with x(e1) 5 x(e2) 5 5 x(e,), where IEl = n a n d E = {e1,e2,...,en} . * For real n-sequences Q = (a1,a2,* . . , a,) and p = (PI, - ,p,) (Y is lexicographically greater than or equal to p if and only if (1) a = p or (2) Q # P and for the minimum index i such that a; # pi we have ai > pi. Consider the following problem.
P2: Lexicographically maximize T((h,(x(e)) I e E E ) ) subject to z E B ( f ) .
(9.la) (9.lb)
We call an optimal solution of Problem P2 a lesicographically optimal base of (P, f ) with respect to nonlinear weight functions he ( e E E ) . Informally, Problem P2 is to find a base z which is as close as possible to a vector which equalizes the values of h , ( z ( e ) )(e E E ) on the basis of the lexicographic ordering in (9.la). T h e o r e m 9.1 (cf. [Fuji80b]): Let x be a base in B(f). Define a vector bY ~ ( e= ) he(z(e)) (e E E )
E
RE
(9.2)
and let the distinct values of ~ ( e (e ) E E ) be given by
Also, define
Ai = {e I e E E , v(e) 5
vi} (i = 1 , 2 , - . - , p ) .
(9.4)
Then the following four statements are equivalent: (i) z is a lexicographically optimal base of (P, f ) with respect to he (e E E ) . 231
V. NONLINEAR OPTIMIZATION (ii) Ai E D and x(Ai) = f(A;) (i = 1,2,...,p) . (iii) dep(x,e) C Ai (e E A i , i = 1 , 2 , . - . , p ) . in (8.1) where for each e E E the
(iv) x is an optimal solution of Problem derivative of w e coincides with he.
(Proof) The equivalence, (ii) (iii), immediately follows from the definition of dependence function. Also, the equivalence, (ii) (or (iii)) c (iv), follows from Theorem 8.2. Therefore, (ii)-(iv) are equivalent. We show the equivalence, (i) (ii) - (iv) . (i) --r' (iii): Suppose (i). If there exist i E {1,2,...,p} and e E Ai such that e' E dep(z, e) - Ai, then for a sufficiently small a > 0 the vector given by
Y
=5
+a(Xe -
xel)
(9.5)
is a base in B(f) and T((h,(y(e)) I e E E ) ) is lexicographically greater than T((h,(z(e)) I e E E ) ) . This contradicts (i). So, (iii) holds. (ii), (iii) ==+ (i): Suppose (ii) (and (iii)). Let Z be an arbitrary base such that T((h,(Z(e)) I e E E ) ) is lexicographically greater than or equal to T((h,(z(e)) 1 e E E ) ) . Define a vector ?j E RE by
T(e) = he(z(e))
(e E
E ).
(9.6)
Also define A0 = 0. We show by induction on i that x ( e ) = Z(e)
(e E Ai)
(9.7)
-
for i = 0,1, * * , p , from which the optimality of x follows. For i = 0 (9.7) trivially holds. So, suppose that (9.7) holds for some i = io < p . . Since T(?j) is lexicographically greater than or equal to T(r)),we have from (9.3) and (9.4) -
v(e) 2 v ( e ) = rlio+l
(e E
Ai,+l - A,).
(94
From (9.8) and the monotonicity of he (e E E ) ,
Since Z E B(f), it follows from (9.7) with i = io, (9.9) and assumption (ii) that
f(AiO+l) 2 ~ ( A i , + l )L x(Aio+l) = .f(Aio+l)From (9.9) and (9.10) we have Z(e) = x ( e ) (e E A i , + l ) .
(9.10) Q.E.D.
We see from Theorem 9.1 that the lexicographically optimal base is unique and that the problem can be solved by the decomposition algorithm given in Section 8.2. 232
9.2. LEXICOGRAPHICALLY OPTIMAL BASE: LINEAR WEIGHTS For a vector x E RE denote by T*(z)the sequence of the components x ( e ) ( e E E ) of x arranged in order of decreasing magnitude. We call a base x E B(f) which lexicographically minimizes T*( (he( x ( e ) ) I e E E ) ) a co-lesicographically optimal base of (D,f ) with respect to he ( e E E ) .
Theorem 9.2: x is a lexicographically optimal base of (D,f) with respect to he (e E E ) if and only if it is a co-lexicographically optimal base of (D,f) with respect to he ( e E E ) . *
(Proof) Using v ( e ) ( e E E ) and define
A;* = {e
v; (i = 1,2, - . . , p ) appearing in Theorem 9.1,
I e E E,
(9.11) v ( e ) 2 vp-i+l} (i = 1,2,---,p). Ao*. Since A;* = E - A,-; (i = O,l,...,p) and
Also define A0 = 0 = x ( E ) = f ( E ) , we can easily see that for a base x E B(f) x satisfies (ii) of Theorem 9.1 if and only if x satisfies (ii*) A;* and s ( A ; * ) = f#(Ai*) (i = 1 , 2 , . - . , p ) .
En
Consequently, the present theorem follows from Theorem 9.1 and the duality shown in Lemma 2.4. Q.E.D. Recall the self-dual structure of Problem PI shown in the decomposition algorithm in Section 8.2. 9.2. Linear Weight Functions
Let us consider Problem Ps in (9.1) for the case when h e ( s ( e ) )is a linear function expressed as x ( e ) / w ( e ) with w ( e ) > 0 for each e E E . Such a lexicographically optimal base is called a lexicographically optimal base with respect to the weight oector w = ( w ( e ) I e E E ) (see [Fuji80b]). This is a generalization of the concept of (lexicographically) optimal flow introduced by N. Megiddo [Megiddo74]concerning multiple-source multiple-sink networks. A polymatroid induced on the set of sources (or sinks) is considered in [Megiddo74] (see Section 2.2). From Theorem 9.1 the lexicographically optimal base problem with r e spect to the weight vector w is equivalent to the following separable quadratic optimization problem (see [Fuji80b]).
PI": Minimize
x(e)2/w(e)
(9.12a)
eEE
subject to x E B(f).
(9.12b)
This is a minimum-norm point problem for B ( f ) (see Section 7.l.a, where w ( e ) = 1 ( e E E ) ) . Problem P l w can be solved by the decomposition algorithm shown in Section 8.2. 233
V. NONLINEAR OPTIMIZATION The following procedure was also given in [Fuji80b] for polymatroids and in [Megiddo74] for multiterminal networks.
A Monotone Algorithm
Step 1: P u t i +- 1 and F t E . Step 2: Compute A* = max{X Put Ei
t
sat(X*w") and z*(e)
t
I XW"
E~(f)}.
(9.13)
X*w(e) for each e E Ei.
Step 3: If X*wF E B ( f ) (or Ei = F), then stop. Otherwise put ( D , f ) / E i and F t F - E,. P u t i t i + 1 and go to Step 2.
(D,f ) +-
(End) Note that w F appearing in Step 2 is the restriction of w td'F & E . The above monotone algorithm can also be viewed as follows. Start from a subbase z = Xw E P ( f ) for a sufficiently small A, where one such X can be given in terms of the greatest lower bound g defined by (3.94) and (3.95) when ( D , f ) is simple, since Xw 5 cy implies Xw E P ( f ) ; or take any subbase y E P ( f ) and choose X such that Xw 5 y. Increase all the components of z (= Xw) proportionally t o w as far as z belongs to P ( f ) . Then fix the saturated components of z and increase all the other components of z proportionally to w as far as z belongs t o P ( f ) . Repeat this process until z becomes a base in
B(f 1.
The above monotone algorithm can also be adapted t o Problem nonlinear weight functions h,; (e E E ) .
Pz with
Theorem 9.3 ([Fuji80b]):Let z+ be the base in B ( f ) obtained by themonotone algorithm described above. Then x* is the lexicographically optimal base with respect to the weight vector w. (Proof) We see from the monotone algorithm t h a t for any e, e' E E such that z'(e)/w(e)
< z'(el)/w(e')
(9.14)
(e,e') is not an exchangeable pair associated with z". The optimality of z* Q.E.D. follows from Theorems 8.1 and 9.1.
It should be noted that the above-described monotone algorithm traverses the leaves, from the leftmost t o the rightmost one, of the binary tree mentioned in Remark 8.1 in Section 8.2. For a general submodular system (D,f ) , computing A' in (9.13) is not easy even if we are given oracles for saturation capacities and exchange capacities for P ( f ) . One exception is the case where D is induced by a laminar family of subsets o f E expressed as a tree 234
9.2. LEXICOGRAPHICALLY OPTIMAL BASE: LINEAR WEIGHTS structure. Also, in case of multiple-source multiple-sink networks considered in [Megiddo74] and [Fuji80b], the problem can be solved in time proportional to that required for finding a maximum flow in the same network (see [Gallo Grigoriadis Tarjan891, where use is made of the distance labeling technique introduced by A. V. Goldberg and R. E. Tarjan ([Goldberg Tarjan881)). For efficient algorithms when f is a graphic rank function, see [Imai83] and [ Cunningham85bl.
+
+
+
Lemma 9.4 (see [Fuji79]): Let x* be the lexicographically optimal base with respect to the weight vector w . Then, for any X E R x* A ( X w ) = (min{z*(e), X w ( e ) } I e E E ) is a base of Xw (i.e., Xz* is a maximal vector in P(f)'"'). (Proof) Since the lexicographically optimal base z* is unique and the above monotone algorithm finds it, the present lemma follows from the algorithm. Q.E.D. In the sense of Lemma 9.4 we also call x* the universal base with respect to w (cf. [Nakamura Irigl]). Now, let us examine the relationship between the lexicographically optimal base and the subdifferentials of f .
+
Theorem 9.5: Let x* be the lexicographically optimal base of ( D , f ) with respect to w . For X E R and A E D we have Xw E d f ( A ) if and only if, defining (9.15) Bx+ = { e I e E E , z * ( e ) 5 X w ( e ) } ,
Bx- = {e I e E E , x * ( e ) < X w ( e ) } ,
(9.16)
C Bx+ and A E D(z*)(ie., z * ( A )= f ( A ) ) . (Proof) The "if" part: Suppose that Bx- C A C Bx+ and A E D(z*). Then for any X E D such that X C A or A C X , we have we have Bx-
A
Xw(X)- X w ( A ) 5 x ' ( X ) - z * ( A ) 5 f(X) - f ( A ) .
(9.17)
From Lemma 6.4 we have Xw E d f ( A ) . The "only if" part: Suppose Xw E d f ( A ) . From the monotone algorithm x* satisfies z * ( B x - )= f ( B x - ) . (9.18)
From (9.18) and the assumption we have
(9.19) 235
V. NONLINEAR OPTIMIZATION or
Xw(Bx-)
- z*(Bx-)5
Xw(A) - z*(A).
(9.20)
Since from (9.16) Xw(X) - z*(X) is maximized at X = Bx-, it follows from (9.20) that X = A also maximizes Xw(X) - z*(X). This implies Bx- C A C Bx+ because of (9.15) and (9.16). Since we have
from (9.18), (9.21) and the fact that z*(A) 5 f(A) we have
Since Xw E a f ( A ) ,we must have from (9.22) z*(A) = f ( A ) .
(9.23)
Q.E.D.
Corollary 9.6: Under the same assumption as in Theorem 9.5, let XI < X2 < < A, be the distinct values ofz*(e)/w(e) (e E E ) . Then we have X I = min{f(X)/w(X)
IX
E
D, X # 0)
1 Xw E P(f)}, = max{f#(X)/w(X) I x E D,X # 0} = min{X I xw E ~ ( f # ) } . = max{X
A,
(9.24) (9.25)
(Proof) Note that af(0) = P ( f ) and a f ( E ) = P(f#). It follows from Theorem 9.5 that Xw E P ( f ) if and only if B , = 0 or z*(e) 2 Xw(e) for all e E E . Therefore , max{X
I Xw E P(f)} = min{z*(e)/w(e) I e E E } = X I .
(9.26)
Also, since Xw E P ( f ) if and only if Xw(X) 5 f ( X ) for all X E D,(9.24) follows from (9.26). Similarly, from Theorem 9.5, we have Xw E P(f#) if and only if BX+ = E or z*(e) 5 Xw(e) for all e E E . This implies (9.25). Q.E.D.
Corollary 9.7: Under the same assumption as in Theorem 9.5, Xw belongs to the interior of a f ( A ) if and only if A = Bx- = BA+. (Proof) Since B,, BX+ E D, the present corollary immediately follows from Theorem 9.5, but we will give an alternative proof. 236
9.2. LEXICOGRAPHICALLY OPTIMAL BASE: LINEAR WEIGHTS Suppose that Xw is in the interior of a f ( A ) . If A
# B x - , then
Xw(Bx-) - Xw(A) < f ( B x - ) - f ( A ) 5 x*(Bx-) - s * ( A ) .
(9.27)
This contradicts the fact that X = Bx- maximizes Xw(X)-x*(X). Therefore, we have A = Bx-. Similarly, we have A = Bx+ and hence Bx- = A = Bx+. Conversely, suppose A = Bx- = Bx+. Then X = A (= Bx- = Bx+) is the unique maximizer of Xw(X) - x*(X),so that
XW(X)- z*(X)< XW(A)- z * ( A ) (X E D, X # A ) .
(9.28)
From (9.28),
where note that d ( A ) = ! ( A ) . Hence Xw belongs to the interior of a f ( A ) . Q.E.D. Let
a f ( A 0 ) = af(01, d f ( A l ) , " ' 1
df(Ap)= W
E )
(9.30)
be the subdifferentials of j whose interiors have nonempty intersections with the line L = {Xw 1 X E R}, where the subdifferentials in (9.30) are arranged in order of increasing X E R. Suppose that for each i = 0,1, , p (Xi, is the open interval consisting of those X for which Xw belongs to the interior = +oo. We see from Theorem 9.5 and of a f ( A i ) , where XO E --oo and Corollary 9.6 that for each i = 1,2, . ,p
--
x*(e) = Xiw(e)
(e E Ai
-
Ai-1).
(9.31)
Compare the present results with those in Sections 7.1.a and 7.2.b.l (also see [Fuji84c]). Recall that Xw E a f ( A ) if and only if X = A is a minimizer of f ( X ) Xw(X) ( X E P). Therefore, Xw belongs to the interior of a f ( A ) if and only if X = A is a unique minimizer of f ( X ) - Xw(X) (X E D) (see Corollary 9.7). Hence, X I , X2, A, are exactly the critical values associated with the principal partition of the pair of submodular systems (Pi,fi) (i = 0 , l ) with Po = D, D1 = 2E, fo = f and f l = -w. The concept of lexicographically optimal base is generalized by M. Nakamura [Nakamura81] and N. Tomizawa [Tomi80d]. Suppose that we are given two (polymatroid) base polyhedra B(fi) (i = 1,2) such that every base of B( j i ) (i = 1 , 2 ) consists of positive components alone. If bl is the lexicographically optimal base of B(f1) with respect to a weight vector b2 E B(f2) and a m . ,
237
V. NONLINEAR OPTIMIZATION bz is the lexicographically optimal base of B(f2) with respect to b l , then the pair (61, b2) is called a universal pair of bases (the original definition in [Nakamura811 and [Tomi8Od]is different from but equivalent to the present one (see Section 7.2.b.l)). Some characterizations of universal pairs are given by K. Murota [Murot a88]. 10. The Weighted Max-Min and Min-Max Problems
We consider the problem of maximizing the minimum (or minimizing the maximum) of a nonlinear objective function over the base polyhedron B ( f ) . 10.1. Continuous Variables
For each e E E let h e : R + R be a right-continuous and monotone nondecreasing function such that lime-+oo he(() = +oo and lime*-oo he(() -oo. Consider the following max-min problem with the nonlinear weight function he (e E E ) .
3
P, : Maximize min he(x(e)) eEE
subject to x E B R ( ~ ) ,
(10.1a)
(10.lb)
where B R ( ~ )is the base polyhedron associated with a submodular system (D,f)on E and the underlying totally ordered additive group is assumed to be the set R of reals. For each e E E let w e :R -+ R be a convex function whose right derivative wes is given by he.
Theorem 10.1: Consider Problem PI in (8.1) with we (e E E ) such that we+ = he. Let x be an optimal solution of Problem PI. Then x is an optimal solution of Problem P, in (10.1). (Proof) Define VI
min{he(x(e)) 1 e E E ) ,
1
S1 = {e
I e E E,
he(x(e)) = VI),
S1* = U{dep(x,e)
I e E S1).
(10.2)
(10.3) (10.4)
We have from (10.4) X(Sl*) =
f(S1').
It follows from Theorem 8.1 that
238
(10.5)
10.1. THE CONTINUOUS MAX-MIN AND MIN-MAX PROBLEMS
If there were a base y E B(f) such that
then from (10.2)- (10.7) we would have
4e)
I y(e)
(e E
s1* -
Sl),
(10.9)
since he = w e + . Hence, from (10.5), (10.8) and (10.9),
f(sl*) = x(&*) < Y ( s l * ) > which contradicts the fact that y E B R ( ~ ) .
(10.10)
Q.E.D.
We see from the above proof that the decomposition algorithm given in Section 8.2 can be simplified for solving Problem P, as follows. We may put x*(e) = x(e) for each e E Eo U E+ in Step 2 and apply the decomposition algorithm recursively to the problem on E - but not to the one on E+ in Step 3 (cf. [Ichimori Ishii Nishida82J). In other words, we only go down the leftmost path in the binary decomposition tree mentioned in Remark 8.1 in Section 8.2. Next, consider the following min-max problem
+
+
P*: Minimize maxh,(z(e)) eEE subject to x E B R ( ~ ) .
(IOS1a)
( 10.11b)
Here, we assume that he is left-continuous rather than right-continuous for each e E E. For each e E E let w , : R R be a convex function whose left derivative w e - is given by he. Then we have ---f
Corollary 10.2: Consider Problem PI in (8.1) with w e (e E E ) such that w e = he. Let x be an optimal solution of Problem PI. Then x is an optimal solution of Problem P* in (10.11). The proof of Corollary 10.2 is similar to that of Theorem 10.1 by duality. An optimal solution of Problem P' can be obtained by the decomposition algorithm given in Section 8.2, where we only go down the rightmost path of the binary decomposition tree mentioned in Remark 8.1. Moreover, suppose that for each e E E he is continuous monotone nondeh e ( ( )= +co and lim,+-, he(() = -co. creasing function such that lim,,,, From Theorem 10.1 and Corollary 10.2 we have the following. 239
V. NONLINEAR OPTIMIZATION
Corollary 10.3: Consider Problem PI in (8.1) with we (e E E ) such that the derivative of we is equal to he for each e E E . Let x be an optimal solution of Problem PI. Then x is an optimal solution of Problem P* of (10.1) and, at the same time, an optimal solution of Problem P, in (10.11).
A common optimal solution of Problems P, and P' can be obtained by the decomposition algorithm given in Section 8.2, where we only go down the leftmost and rightmost paths of the binary decomposition tree mentioned in Remark 8.1. Problems P, and P* are sometimes called sharing problems in the litIshii Nishida821). The sharing problems erature ([Brown'lg], [Ichimori with more general objective functions and feasible regions are considered by U. Zimmermann [ Zimmermann86a,86b].
+
+
10.2. Discrete Variables
For each e E E let i , : Z + R be a monotone nondecreasing function on Z such that lirnt++..i,(f) = 00 and limt-,-mie(f) = -co for each e E E . Consider
+
IP, : Maximize min ie (x(e))
(10.12a)
eEE
subject to x E B z ( f ) ,
(10.12b)
where Bz(f) is the base polyhedron associated with an integral submodular system (D,f ) on E and the underlying totally ordered additive group is the set Z of integers. For each e E E let we: R + R be a piecewise-linear convex function such that the following two hold: (10.13a) (i) its right derivative w e + satisfies we+(() = i,(f)( f E Z), (10.13b) (ii) we is linear on each unit interval [ f ,f 11 ( f E Z).
+
Theorem 10.4: Let x+ be an integral optimal solution of Problem PI in (8.1) with we (e E E ) defined as above. Then x, is an optimal solution of Problem IP, in (10.12). (Proof) For each e E E let he:R --+ R be a right-continuous piecewise-constant nondecreasing function such that h e ( q ) = i,(f) (v E ( l), f E Z). It follows from Theorem 10.1 that an integral optimal solution of Problem PI with we (e E E ) defined by (10.13) is an integral optimal solution of Problem P, in (10.1) with he (e E E ) defined as above. Therefore, x, is an optimal solution of Problem IP,. Q.E.D.
[e, +
24 0
11.1. FAIR RESOURCE ALLOCATION: CONTINUOUS VARIABLES
It should be noted that there exists an integral optimal solution x* of Problem PI in Theorem 10.4 due to Theorem 8.3. The reduction of Problem IP, to Problem Pl was also communicated by N. Katoh [Katoh85]. A direct algorithm for Problem IP* is given in [Fuji + Katoh Ichimori88). Moreover, consider the weighted min-max problem
+
IP*: Minimize rnaxh,(x(e))
(10.14a)
eEE
subject to x E Bz(f).
For each e E E let w,:R
(10.14b)
R be a piecewise-linear convex function such that (10.158) (i) its left derivative we- satisfies we-(() = ie(() (( E Z), (ii) we is linear on each unit interval [ E , [ + 11 (( E Z). (10.15b) -+
Similarly as Theorem 10.4 we have Corollary 10.5: Let x* be an integral optimal solution of Problem PI in (8.1) with w e (e E E ) defined by (10.15). Then x* is an optimal solution ofproblem IP* in (10.14). The discrete max-min and min-max problems are thus reduced to continuous ones.
11. T h e Fair Resource Allocation Problem
In this section we consider the problem of allocating resources in a fair manner which generalizes the max-min and min-max problems treated in the preceding section. The readers should also be referred to the book [Ibaraki Katoh881 by T. Ibaraki and N. Katoh for resource allocation problems and related topics.
+
11.1. Continuous Variables
Let g: R2 + R be a function such that g(u, v) is monotone nondecreasing in u and monotone nonincreasing in v . Typical examples of such a function g are the following. g ( u , v ) = u - v, g ( u , v )= u / v
(u,
v > 0)
(11.1) (11.2)
Also, for each e E E let he be a continuous monotone nondecreasing function from R onto R. 24 1
V. NONLINEAR OPTIMIZATION Consider
P3: Minimize g(rnaxh,(z(e)),minh,(z(e))) eEE
eEE
subject to x E B R ( ~ ) .
(11.3a)
(11.3b)
We call Problem P3 the continuous fair resource allocation problem with submodular constraints. This type of objective function was first considered by N. Katoh, T. Ibaraki and H. Mine [Katoh Ibaraki Mine851. Using the same functions he (e E E ) appearing in (11.3), let us consider Problems P, and P* described by (10.1) and ( l O . l l ) , respectively. Denote the optimal values of the objective functions of Problems P, and P* by v, and v*, respectively, and define vectors I , u E RE by
+
l(e) = min{a
I a E R,
u(e) = max{a
I
(Y
+
&(a) 2 v*}
(e E E ) ,
(11.4)
E R, &(a) I v*}
(e E E ) .
(11.5)
Theorem 11.1: Suppose that values v, and v* and vectors I and u are defined as above. Then we have v* 5 v* and I 5 u. Moreover, B(f),U is nonempty and any z E B(f)Y is an optimal solution of Problem P3 in (11.3), where B(f),U = {x I x E B(f), 15 z 5 u } (see Section 3.1.b). (Proof) Let x, and x*,respectively, be optimal solutions of Problems P, and P * . If v, > v*, then we have x*(e) I u(e)
< I(e) I x*(e)
(e E E ) ,
(11.6)
which contradicts the fact that z * ( E ) = f(E) = z , ( E ) . Therefore, we have v, I v*. This implies 1 5 u. Moreover, since x* E B ( f ) l , x* E B(f)" and 1 5 u, from Theorem 3.8 we have B(f)Y # 0. For any x E B(f),U and y E B(f) we have
due to the monotonicity of g. This shows that any x E B(f),U is an optimal solution of Problem P3. Q.E.D. The continuous fair resource allocation problem P3 is thus reduced to Problems P, and P* and can be solved by the decomposition algorithm given in Section 8.2. 242
11.2. FAIR RESOURCE ALLOCATION: DISCRETE VARIABLES 11.2. Discrete Variables
We consider the discrete fair resource allocation problem, which is a discrete version of the continuous fair resource allocation problem P3 treated in Section 11.1 (see [Fuji Katoh Ichimori881). Let g: R2+ R be a function such that g(u, v) is monotone nonde:reasing in u and monotone nonincreasing in v. Also, for each e E E let h,:Z -+ R be a monotone nondecreasing function. We assume for simplicity that h e ( € ) = 00 and lime-,-m A,([) = -00. For a submodular system (D,f ) on E with an integer-valued rank function f , consider the problem
+
+
+
Ips: Minimize g(rnaxi,(z(e)), minh,(z(e))) eEE
eEE
(11.8a) (11.8b)
subject to x E Bz(f).
Problem IP3 is not so easy as its continuous version P3 because of the integer constraints. Using the same functions ie (e E E), consider the weighted integral maxmin problem IP, in (10.12) and the weighted integral min-max problem IP' in (10.14). Let ij* and G', respectively, be the optimal values of the objective functions of IP, and IP'. Define vectors i, G E Z E by
i(e) = min{a G(e) = max{a
I a E Z,
2
(11.9)
Le(a)5 G*}
(11.10)
ie(a)
I a E Z,
for each e E E. We have 6, 5 6' but, unlike the continuous version of the problem, we may not have i 5 G in general. However, we have i(e) 5 G(e)
+1
(e E E ) .
(11.11)
L e m m a 11.2: If we have i 5 0 , then B z ( f ) f is nonempty and any x E B z ( f ) ? is an optimal solution of Problem IP3 in (11.8). (Proof) Since the vector minors Bz(f)', B z ( f ) i and B z ( f ) f are integral, the Q.E.D. present lemma can be shown similarly as Theorem 11.1. Now, let us suppose that we do not have f 5 0. Define
It follows from (11.9)-(11.12) that i(e) = G(e)
+1 24 3
(e E
D),
(11.13)
V. NONLINEAR OPTIMIZATION
i(e) 5 G(e) iLe(G(e))
8,
(e E E - D),
< G, 5 O* < iLe(i(e))
2 i,(f(e)) 5 i e ( G ( e ) ) 5 8,
(e E
(11.14)
(11.15)
D),
( 11.16)
(e E E - D).
Moreover, define i A G = (min{i(e), G(e)} I e E E) and i V 0 = (max{i(e), G(e)} I e E E). Then, all the four sets Bz(f)j, Bz(f)", Bz(f)jA,j and B z ( ~ ) ~ " " are nonempty since Bz(f)jhd 2 B z ( f ) j # 0 and B Z ( ~ ) 2 ~ ~Bz(f)" ' # 0. Therefore, from Theorem 3.8 Bz(f)fhcAand B z ( f ) P " are nonempty. Choose any bases P E Bz(f)fACand $ E Bz(f)Y". From (11.12) we have
Hence, from (11.13)- (11.16), $(e) = P(e)
+1
(e E
D),
iL,(e(e)) < 8, 5 6* < iie($(e))
(11.18)
(e E D),
(11.19)
8, 5 min{iie(e(e)),iLe($(e))}I max(iL,(~(e)),iL,(g(e))}5 8,
(e E E - D). (11.20)
Let the distinct values of ie(P(e)) (e E D) be given by dl where note that
dk
< d2 <
a
*
*
< dk,
(11.21)
< ij,. Also define
We consider a parametric problem IPS' with a real parameter X associated with the original problem IP3.
IP3': Minimize g(max&,(z(e)),A)
(11.23a)
subject to z E Bz(f),
(11.23b)
eEE
ie(z(e)) 2
(e E E ) ,
(11.23~)
where X 5 6,. Denote the minimum of the objective function (11.23a) of IP3' by 7 ( X ) . Then, we have
Lemma 11.3: Suppose that X = A' attains the minimum of 7 ( X ) for X 5 8,. Then any optimal solution of IP3'* is an optimal solution of the original 244
11.2. FAIR RESOURCE ALLOCATION: DISCRETE VARIABLES problem IP3 and the minimum of the objective function (11.8a)of IP3 is equal to $A*). (Proof) Let z* and zo,respectively, be any optimal solutions of Problems IPS'* and IP3. Denote by vo be the minimum of the objective function (11.8a) of Problem I p s , and define Xo = mineEE ie(xo(e)).Then vo = g(maxie(zo(e)),xo)2 eEE
~(xO)
2 Y(x*).
(11.24)
On the other hand, we have 7 ( ~ *2)g(maxiL,(x*(e)),rniniL,(z*(e))) eEE
eEE
2 vo.
(11.25)
From (11.24)and (11.25)we have vo = r ( X * ) and x* is also an optimal solution of IP3. Q.E.D. We determine the function 7 ( X ) to solve the original problem IP3 with the help of Lemma 11.3. In the following arguments, 3, $, d;, A; (i = 1,2,...,k) are those appearing in (11.17)-(11.22). We consider the two cases when X 5 d l and when d; < X 5 d;+l for some i E {1,2,...,k}, where dk+l = 8,. Case I : X 5 d l . Because of the definition (11.21)of d l , P is a feasible solution of IP3' for X 5 d l . Since P E Bz(f)fA4, P is also an optimal solution of I P * . Therefore, it follows from the monotonicity of g that 7 0 ) = g@*,X).
(11.26)
Case 11:d; < X 5 di+l for some i E {1,2,.-.,k}(dk+l 8*). From (11.18)-(11.22) and the monotonicity of g we have
r(~ 2 )s(maxLe(P(e) + 11,A). &A,
(11.27)
It follows, from (11.18)-(11.20) for two bases P and $, that by repeated elementary transformations of 2 we can have a base B E Bz(f) such that 2(e) = P ( e )
(e E
D -A;),
i ( e ) = $(e) (= P(e)
+ 1)
(e E Ai),
min{P(e),$(e)} 5 2(e) 5 max{?(e),$(e)}
(e E E - D ) .
(11.28) (11.29) (11.30)
We see from (11.18)-(11.20) and (11.28)-(11.30) that P is a feasible solution of IP3x for the present X and that
rnaxiL,(s(e)) = maxiL,(P(e) eEE
eEAi
245
+ 1).
(11.31)
V. NONLINEAR OPTIMIZATION From (11.27) and (11.31)0 is an optimal solution of IPSx and we have (11.32) The function 7(X) is thus given by (11.26) for X 5 dl and by (11.32) for d,
< X 5 d;+l ( i = 1,2,***,k).
Theorem 11.4: Suppose that we do not have f 5 Q. Then, the minimum of the objective function (11.8a) of Problem IP3 is equal to the minimum of the following k 1 values
+
g(tj*,dl), g(maxA,(fi(e) eEAi
+ l),di+l)
(i = 1,2,**.,k).
(11.33)
(Proof) The present theorem follows from Lemma 11.3, (11.17),(11.26), (11.32) Q.E.D. and the monotonicity of g. It should be noted that because of (11.17) we can employ i instead of 2 in the definitions of d; and Ai (i = 1,2,-..,k)in (11.21) and (11.22). An algorithm for solving Problem IP3 in (11.8) is given as follows.
An algorithm for the discrete fair resource allocation problem
Step 1: Solve Problems IP* and IP* given by (10.12)and (10.14),respectively. Let G, and 6*, respectively, be the optimal values of the objective functions of IP, and IP* and determine vectors i and 0 by (11.9) and (11.10). If i 5 3, then any base x E Bz(f): is an optimal solution of IP3 and the algorithm terminates. Step 2: Let D C E be defined by (11.12) and d l < dz < - - .< dk be the distinct values of Le(C(e)) (e E D ) . Also determine sets Ai (i = 1,2,-..,k)by (11.22)with 2 replaced by 3. Step 3: Find the minimum of the following k 1 values.
+
g(G*,dl),
g(maL,(ii(e) eEA,
+ l),di+l)
(i = 1,2,.-.,k).
(11.34)
(3-1) If g(tj*,dl) is minimum, then find any base 2 E Bz(f):*, and 2 is an optimal solution of IPS. (3-2) If g(max,EA,. Le(ii(e) l),d;*+l) for i* E {1,2,-..,k}is minimum, then putting w * = maxeEAi*fie(C(e) l), d e h e l o , uo E Z E by
+
+
I
Z, fi,((~) 2 di*+l), uo(e) = max{cr I a E Z , iLe(a)5 w * } iO(e)= min{cr
(Y
246
E
(11.35) (11.36)
11.2. FAIR RESOURCE ALLOCATION: DISCRETE VARIABLES for each e E E , and any base P 6 Bz(f)Yoo is an optimal solution of IPS. (End)
It should be noted that from the arguments preceding Theorem 11.4 we have Ba(f);bo # 8 in Step 3-2. Let us consider the computational complexity of the algorithm when B z ( f ) is bounded, i.e., P = 2E. Denote by T(IP,, I P * ) the time required for solving Problems IP, and IP * . Let M be an integral upper boundof If(X)l ( X C E ) . Then we have -2M
I f(E) - f ( E - {el) i
(11.37)
I f({el) I M
for each z E B z ( f ) and e E E . Therefore, given 8* and 8*, we can determine the values of f ( e ) and 6(e) in (11.9) and (11.10) by the binary search, which requires O(1ogM) time for each e E E , where we assume that the function evaluation of he requires unit time for each e E E . Also, finding a base in Bz(f)F requires not more than O (T (IP , , IP* )) time. Hence, Step 1 runs in O(T(IP,, I P * ) + /El log M ) time. Determining set D in Step 2 requires O(lE1) time. The values of d l , dz, . d k are found and sorted in O(lEllogIE1) time. Instead of having sets A, (i = 1 , 2 , . - . , k ) we compute differences A; = Ai - A;-1 (i = 1 , 2 , - - . , k ) with A. = 0, which requires O(lE1) time. Note that having the differences * Ai = Ai - A,-1 (i = 1 , 2 , - .,k) is enough to carry out Step 3. Then, Step 2 requires O(lE1log IEI) time. In Step 3 we can compute all the values of . # ,
-
max h,(.ii(e) &A,
+ 1)
(i = 1,2, - - IC)
(11.38)
a ,
I
in O(lE1)time, using the differences A; = A;-A;-1 (i = 1 , 2 , . . - , k ) . Moreover, for each e E E , l o ( e ) and u o ( e ) can be computed in O(1ogM) time, and a base P E Bz(f)FA, or 2 E Bz(f)$ can be found in O ( T ( I P , , I P * ) ) time. Consequently, Step 3 runs in O(T(IP,, I P * ) + /El logM) time. The overall running time of the algorithm is thus O(T(IP,, I P * ) (El. (logM log [ E l ) )for general (bounded) submodular constraints. When specialized to the problem of Katoh, Ibaraki and Mine [Katoh Ibaraki Mine851, where B z ( f ) = {Z I z E Z f , 15 x 5 U , x ( E ) = c}, (11.39)
+
+
+
+
f ( E ) (= c ) can be chosen as M , T ( I P , , I P * ) is O(lEllog(f(E)/IEl) by the algorithm of G. N. Frederickson and D. B. Johnson [Frederickson Johnson821, and hence the complexity becomes the same as the algorithm of [Katoh Ibaraki Mine851.
+
+
247
+
V. NONLINEAR OPTIMIZATION
+
Very recently, K. Namikawa and T. Ibaraki [Namikawa IbarakiSO] have improved the algorithm of [Fuji Katoh Ichimori881 by the use of an optimal solution of the continuous version of the original problem.
+
12.
+
A Neoflow Problem with a Separable Convex Cost Function
In this section we consider the submodular flow problem (see Section 5.1) where the cost function is given by a separable convex function. Let G = (V,A)be a graph with a vertex set V and an arc set A, ?:A ---f R u {+m} be an upper capacity function and c: A -+ R U (-00) be a lower capacity function. Also, for each arc a E A let w,: R -+ R be a convex function. Suppose that ( D ,f ) is a submodular system on V such that f ( V ) = 0. Denote this network by NSS = (G = (V,A),c,Z,w,(a E A),(D,f)). Consider the following flow problem in Nss.
P S S :Minimize
w,(p(a))
( 12.1a)
a€ A
subject to c(a) 5 p(a) 5 C(u) ( a E A), d p E BR(f)*
( 12.1b) ( 12.1c)
We call a function p: A -+ R satisfying (12.lb) and ( 1 2 . 1 ~ )a submodular flow in NSS. Optimal solutions of Problem PSS in (12.1) are characterized by the following. Theorem 12.1: A submodular flow ’p in NSS is an optimal solution of Problem Pss in (12.1) if and only if there exists a potential p:V --t R such that the following (i) and (ii) hold. Here, w,+ and w,- denote, respectively, the right derivative and the left derivative of w, for each a E A. (i) For each a E A, w,+(p(a))
w,-(p(a))
+p(a+a) +p(a+a)
-
p ( a - a ) < 0 ===+ p(a) = +),
(12.2)
-
p ( 8 - a ) > 0 ===+-p(a) = C(.).
(12.3)
(ii) dp is a maximum-weight base of BR(f) with respect to the weight vector P. (Proof) The “if” part: Suppose that a potential p satisfies (i) and (ii) for a submodular flow p in NSS.Note that (i) implies the following: (a) If p(a) < c ( a ) ,we have (12.4) 24 8
12. A NEOFLOW PROBLEM WITH SEPARABLE CONVEX COSTS
(b) If c ( a ) < p(a), we have
NSS we have
Hence, for any submodular flow @ in
aEA
VEV
(12.6)
Therefore, p is an optimal solution of Problem Pss in (12.1). The “only if” part: Let p be an optimal solution of Problem Pss in (12.1). Construct an auxiliary network N, = (G, = ( V , A v ) , y , ) associated with p as follows. The arc set A , is defined by (5.42)-(5.45) and y , : A , + R is the length function defined by -
a ( E A ) : a reorientation of a)
(12.7)
where A,*, B,* and C, are defined by (5.43)-(5.45). Since there is no directed cycle of negative length relative to the length function yp due to the optimality of p, there exists a potential p : V + R such that
where d+, d- are with respect to G,. is due to Theorem 3.16.
(12.8) is equivalent to (i) and (ii). (ii) Q.E.D.
It should be noted that Theorem 12.1 generalizes Theorem 5.3. If w, is differentiable for each a E A , then (i) in Theorem 12.1 becomes as follows. For each a E A ,
Here, w,’ denotes the derivative of w, for each a E A . Given a submodular flow p in NSS and a potential p : V + R, we say that an arc a E A is in kilter if (12.2) or (12.3) in Theorem 12.1 is satisfied and that 249
V. NONLINEAR OPTIMIZATION h
El
Figure 12.1. an ordered pair ( u , v) of vertices u , v E V ( u # v ) is in kilter if (1) p ( u ) 2 p ( v ) or (2) u $- dep(dp,v). Also, to be out of kilter is to be not in kilter. We see from Theorem 12.1 that if all the arcs and all the pairs of distinct vertices are in kilter, then the given submodular flow p is optimal. For each arc a E A we call the set of points ( p ( a ) , p ( d - a )- p ( d s a ) ) in R2such that arc a is in kilter the characteristic curve (see Fig. 12.1). Theorem 12.1 suggests an out-of-kilter method which keeps in-kilter arcs in kilter and monotonically decreases the kilter number of each out-of-kilter arc, which is a “distance” to the characteristic curve for a E A or max{O,p(v)-p(u)} for ( u ,v ) such that u E dep(dp, v). When w ,is piecewise linear for each a E A , the characteristic curve for each a E A consists of vertical and horizontal line segments and the out-of-kilter method described in Section 5.5.c can easily be adapted so as to be a finite algorithm.
250
References [Bachem+Grotschel82] A. Bachem and M. Grotschel: New aspects of polyhedral theory. In: Modern Applied Mathematics - Optimization and Operations Research (B. Korte, ed.) (North-Holland, Amsterdam, 1982), pp. 51-106. [Barahona+Cunningham84] F. Barahona and W. H. Cunningham: A submodular network simplex method. Mathematical Programming Study 22 (1984) 9-3 1. [Berge73]C. Berge: Graphs and Hypergraphs (translated by E. Minieka, NorthHolland, Amsterdam, 1973). [Billionnet+Minoux85] A. Billionnet and M. Minoux: Maximizing a supermodular pseudoboolean function - a polynomial algorithm for supermodular cubic functions. Discrete Applied Mathematics 12 (1985) 1-11. [Birkhoff37] G. Birkhoff Rings of sets. Duke Mathematical Journal 3 (1937) 443-454. [Birkhoff67]G. Birkhoff Lattice Theory (American Mathematical Colloquium Publications 25 (3rd ed.), Providence, R. I., 1967). [Bixby+Cunningham+Topkis85] R. E. Bixby, W. H. Cunningham and D. M. Topkis: Partial order of a polymatroid extreme point. Mathematics of Operations Research 10 (1985) 367-378. [Bixby+Wagner88] R. E. Bixby and D. K. Wagner: An almost linear-time algorithm for graph realization. Mathematics of Operations Research 13 (1988) 99- 123. [Bouchet87] A. Bouchet: Greedy algorithm and symmetric matroids. Mathematical Programming 38 (1987) 147-159. [Brown791 J. R. Brown: The sharing problem. Operations Research 27 (1979) 324-340. [Bruno+Weinberg71] J. Bruno and L. Weinberg: The principal minors of a matroid. Linear Algebra and Its Applications 4 (1971) 17-54. [Chandrasekaran+Kabadi88] R. Chandrasekaran and S. N. Kabadi: Pseudomatroids. Discrete Mathematics 71 (1988) 205-217. [Choquet55] G. Choquet: Theory of capacities. Annales de l’lnstitut Fourier 5 (1955) 131-295. [Crapo+Rota70] H. H. Crapo and G.-C. Rota: On the Foundation of Combinatorial Theory - Combinatorial Geometries (MIT Press, Cambridge, MA, 1970). [Cui+Fuji88] W. Cui and S. Fujishige: A primal algorithm for the submodular flow problem with minimum-mean cycle selection. Journal of the Operations Research Society of Japan 31 (1988) 431-440. 251
[Cunningham83a] W. H. Cunningham: Submodular optimization. Presented a t the Bielefeld Workshop on Matroids (Bielefeld, August 1983) (also see [Cunningham85cl). [Cunningham83b] W. H. Cunningham: Decomposition of submodular functions. Combinatorica 3 (1983) 53-68. [Cunningham841 W. H. Cunningham: Testing membership in matroid polyhedra. Journal of Combinatorial Theory B36 (1984) 161-188. [Cunningham85a] W.H. Cunningham: On submodular function minimization. Combinatorica 5 (1985) 185-192. [Cunningham85b]W. H. Cunningham: Minimum cuts, modular functions, and matroid polyhedra. Networks 15 (1985) 205-215. [Cunningham85c] W. H. Cunningham: Optimal attack and reinforcement of a network. Journal of ACM 32 (1985) 549-561. [Cunningham+Frank85] W. H. Cunningham and A. Frank: A primal-dual algorithm for submodular flows. Mathematics of Operations Research 10 (1985) 251-262. [Dilworth44] R. P. Dilworth: Dependence relations in a semimodular lattice. Duke Math. J. 11 (1944) 575-587. (Dinits701 E. A. Dinits: Algorithm for solution of a problem of maximum flow in a network with power estimation. Soviet Mathematics Doklady 11 (1970) 1277-1280. [Dress+Havel86] A. Dress and T . F. Havel: Some combinatorial properties of discriminants in metric vector spaces. Advances in Mathematics 62 (1986) 285-312. [Dulmage+Mendelsohn59] A. L. Dulmage and N. S. Mendelsohn: A structure theory of bipartite graphs of finite exterior dimension. Transactions of the Royal Society of Canada, Third Series, Section 111, 53 (1959) 1-13. [Edm65a] J. Edmonds: Minimum partition of a matroid into independent subsets. Journal of Research of the National Bureau of Standards 69B (1965) 67-72. [Edm65b] J. Edmonds: Lehman’s switching game and a theorem of Tutte and Nash-Williams. ibid. 69B (1965) 73-77. [Edm7O] J. Edmonds: Submodular functions, matroids, and certain polyhedra. Proceedings of the Calgary International Conference on Combinatorial Structures and Their Applications (R. Guy, H. Hanani, N. Sauer and 3. Schonheim, eds., Gordon and Breach, New York, 1970), pp. 69-87. [Edm79] J. Edmonds: Matroid intersection. Annals of Discrete Mathematics 4 (1979) 39-49.
252
[Edm+Fulkerson65] J. Edmonds and D. R. Fulkerson: Transversals and matroid partition. Journal of Research of the National Bureau of Standards 69B (1965) 147-157. [Edm+Giles77] J. Edmonds and R. Giles: A min-max relation for submodular functions on graphs. Annals of Discrete Mathematics 1 (1977) 185-204. [Edm+Karp72] J. Edmonds and R. M. Karp: Theoretical improvements in algorithmic efficiency for network flow problems. Journal of ACM 19 (1972) 248-264. [Faigle79] U. Faigle: The greedy algorithm for partially ordered sets. Discrete Mathematics 28 (1979) 153-159. [Faigle80] U. Faigle: Geometries on partially ordered sets. Journal of Combinatorial Theory B28 (1980) 26-51. [Faigle87] U. Faigle: Matroids in combinatorial optimization. In: Combinatorial Geometries (N. White, ed., Encyclopedia of Mathematics and Its Applications 29, Cambridge University Press, 1987) pp. 161-210.
[ Federgruen+Groenevelt86] A. Federgruen and H. Groenevelt: The greedy procedure for resource allocation problems - necessary and sufficient conditions for optimality. Operations Research 34 (1986) 909-918. [Ford+Fulkerson62] L. R. Ford, Jr., and D. R. Fulkerson: Flows in Networks (Princeton University Press, Princeton, N. J., 1962). [Frank791 A. Frank: Kernel systems of directed graphs. Szegediensis 41 (1979) 63-76.
Acta Universities
[Frank801 A. Frank: On the orientation of graphs. Journal of Combinatorial Theory B28 (1980) 251-261. [Frank8la] A. Frank: A weighted matroid intersection algorithm. Journal of Algorithms 2 (1981) 328-336. [Frank8lb] A. Frank: Generalized polymatroids. Proceedings of the Sixth Hungarian Combinatorial Colloquium (Eger, 1981). [Frank8lc] A . Frank: How to make a digraph strongly connected. Combinatorica 1 (1981) 145-153. [Frank82a] A. Frank: A note on k-strongly connected orientations of an undirected graph. Discrete Mathematics 39 (1982) 103-104. [Frank82b] A. Frank: An algorithm for submodular functions on graphs. Annals of Discrete Mathematics 16 (1982) 189-212. [Frank841 A. Frank: Finding feasible vectors of Edmonds-Giles polyhedra. Journal of Combinatorial Theory B36 (1984) 221-239. [Frank+Tardos81] A. Frank and E. Tardos: Matroids from crossing families. Proceedings of the Sixth Hungarian Combinatorial Colloquium (Eger, 1981). 253
[Frank+Tardos85] A. Frank and 8. Tardos: An Application of the simultaneous approximation in combinatorial optimization. Report No. 85375, Institut fur Okonometrie und Operations Research, Universitat Bonn (May 1985). [Frank+Tardos88] A. Frank and E. Tardos: Generalized polymatroids and submodular flows. Mathematical Programming 42 (1988) 489-563.
[Frederickson+Johnson82] G. N. Frederickson and D. B. Johnson: The complexity of selection and ranking in X Y and matrices with sorted columns. Journal of Computer and System Science 24 (1982) 197-208.
+
[Fuji77a]S. Fujishige: A primal approach to the independent assignment problem. Journal of the Operations Research Society of Japan 20 (1977) 1-15. [Fuji77b] S. Fujishige: An algorithm for finding an optimal independent linkage. Journal of the Operations Research Society of Japan 20 (1977) 59-75. [Fuji78a] S. Fujishige: Algorithms for solving the independent-flow problems. Journal of the Operations Research Society of Japan 21 (1978) 189-204. [Fuji78b] S. Fuj ishige: The independent-flow problems and submodular functions (in Japanese). Journal of the Faculty of Engineering, University of Tokyo A-16 (1978) 42-43. [Fuji78c] S. Fujishige: Polymatroidal dependence structure of a set of random variables. Information and Control 39 (1978) 55-72. [Fuji78d] S. Fujishige: Polymatroids and information theory (in Japanese). Proceedings of the First Symposium on Information Theory and Its Applications (Kobe, November 1978), pp. 61-64. [Fuji791S. Fujishige: Matroid theory and its applications to system engineering problems (in Japanese). Systems and Control 23 (1979) 11-20. [FujiSOa] S. Fujishige: An efficient PQ-graph algorithm for solving the graphrealization problem. Journal of Computer and System Sciences 21 (1980) 6386. [Fuji80b] S. Fujishige: Lexicographically optimal base of a polymatroid with respect to a weight vector. Mathematics of Operations Research 5 (1980) 186196. [Fuji80c] S. Fujishige: Principal structures of submodular systems. Discrete Applied Mathematics 2 (1980) 77-79. [Fuji831 S. Fujishige: Canonical decompositions of symmetric submodular systems. Discrete Applied Mathematics 5 (1983) 175-190. (Fuji84aI S. Fujishige: A note on Frank’s generalized polymatroids. Discrete Applied Mathematics 7 (1984) 105-109. [Fuji84b]S. Fujishige: Structures of polyhedra determined by submodular functions on crossing families. Mathematical Programming 2 9 (1984) 125-141. 254
[Fuji84c] S. Fujishige: Submodular systems and related topics. Mathematical Programming Study 22 (1984) 113-131 [Fuji84d] S. Fujishige: A characterization of faces of the base polyhedron associated with a submodular system. Journal of the Operations Research Society of Japan 27 (1984) 112-129. [Fuji84e] S. Fujishige: A system of linear inequalities with a submodular function on (0, *l} vectors. Linear Algebra and Its Applications 6 3 (1984) 235-266. [Fuji84f] S. Fujishige: Theory of submodular programs - a Fenchel-type minmax theorem and subgradients of submodular functions. Mathematical Programming 29 (1984) 142-155. [Fuji84g] S. Fujishige: On the subdifferential of a submodular function. Mathematical Programming 29 (1984) 348-360. [Fuji84h] S. Fujishige: Combinatorial optimization problems described by submodular functions. In: Operational Research ’84 (J. P. Brans, ed., Elsevier Science Publishers, 1984), pp. 379-392. [Fuji861 S. Fujishige: A capacity-rounding algorithm for the minimum cost circulation problem - A dual framework of the Tardos algorithm. Mathematical Programming 35 (1986) 298-308. [Fuji87a] S. Fujishige: From classical flow problems to the “neoflow” problem (in Japanese). Transactions of the Institute of Electronics, Information and Communication Engineers of Japan J70-A (1987) 139-145. [Fuji87b] S. Fujishige: An out-of-kilter method for submodular flows. Discrete Applied Mathematics 17 (1987) 3-16. [Fuji891S. Fujishige: Linear and nonlinear optimization problems with submodular constraints. In: Mathematical Programming - Recent Developments and Applications (M. Iri and K. Tanabe, eds., KTK Scientific Publishers, Tokyo, 1989), pp. 203-225. [Fuji+Katoh+Ichimori88] S. Fujishige, N. Katoh and T. Ichimori: The fair resource allocation problem with submodular constraints. Mathematics of Operations Research 13 (1988) 164-173. [Fuji+Rock+Zimmermann89] S. Fujishige, H. Rock and U. Zimmermann: A strongly polynomial algorithm for minimum cost submodular flow problems. Mathematics of Operations Research 14 (1989) 60-69. [Fuji+Tomi83] S. Fujishige and N . Tomizawa: A note on submodular functions on distributive lattices. Journal of the Operations Research Society of Japan 26 (1983) 309-318. (Gale571 D. Gale: A theorem of flows in networks. Pacific Journal of Mathematics 7 (1957) 1073-1082. 255
[Gallo+Grigoriadis+Tarjan89] G. Gallo, M. D. Grigoriadis and R. E. Tarjan: A fast parametric maximum flow algorithm and applications. SIAM Journal on Computing 18 (1989) 30-55. [Giles75] F. R. Giles: Submodular Functions, Graphs and Integer Polyhedra. Ph.D. Thesis, University of Waterloo, 1975. [Goldberg831A. V. Goldberg: Finding a maximum density subgraph. Research Report, Computer Science Division, University of California, Berkeley, California, July 1983. [Goldberg+Tarjan88] A. V. Goldberg and R. E. Tarjan: A new approach to the maximum-flow problem. Journal of ACM 35 (1988) 921-940. [Gondran+Minoux84] M. Gondran and M. Minoux: Graphs and Algorithms (Wiley, New York, 1984). [Grishuhin81] V. P. Grishuhin: Polyhedra related to a lattice. Mathematical Programming 21 (1981) 70-89. [Groenevelt85]H. Groenevelt: Two algorithms for maximizing a separable concave function over a polymatroid feasible region. Working Paper, The Graduate School of Management, The University of Rochester, August 1985. [Groflin+Hoffman82] H. Groflin and A. J. Hoffman: Lattice polyhedra I1 generalizations, constructions and examples. Annals of Discrete Mathematics 15 (1982) 189-203.
[Grotschel+Lov&sz+Schrijver88]M. Grotschel, L. Lovdsz and A . Schrijver: Geometric Algorithms and Combinatorial Optimization (Algorithms and Combinatorics 2) (Springer, Berlin, 1988). [Gusfield83]D. Gusfield: Connectivity and edge-disjoint spanning trees. Information Processing Letters 16 (1983) 87-89. [Harary69] F. Harary: Graph Theory (Addison-Wesley, Reading, MA, 1969). (Hassin781R. Hassin: On Network Flows, Ph.D. Thesis, Yale University, 1978 (also see [Hassin82]). [Hassin821 R. Hassin: Minimum cost flow with set-constraints. Networks 12 (1982) 1-21. [Hoffman601A. J. Hoffman: Some recent applications of the theory of linear inequalities to extremal combinatorial analysis. Proceedings of Symposia in Applied Mathematics 10 (1960) 113-127. [Hoffman741A. J. Hoffman: A generalization of max-flow min-cut. Mathematical Programming 6 (1974) 352-359. [Hoffman781 A. J. Hoffman: On lattice polyhedra I11 - blockers and antiblockers of lattice clutters. Mathematical Programming Study 8 (1978) 197207. 256
[Hoffman+Kruskal56] A. J. Hoffman and J. B. Kruskal: Integral boundary points of convex polyhedra. Annals of Mathematics Studies 38 (1956) 223241. [Hoffman+Schwartz78] A. J . Hoffman and D. E. Schwartz: On lattice polyhedra. In: Combinatorics (A. Hajnal and V. T. S ~ S ,eds., North-Holland, Amsterdam, 1978) pp. 593-598. [Ibaraki+Katoh88] T. Ibaraki and N. Katoh: Resource Allocation Problems Algorithmic Approaches, Foundations of Computing Series (MIT Press, 1988).
[Ichimori+Ishii+Nishida82] T. Ichimori, H. Ishii and T. Nishida: Optimal sharing. Mathematical Programming 23 (1982) 341-348. [Imai83]H. Imai: Network-flow algorithms for lower-truncated transversal polymatroids. Journal of the Operations Research Society of Japan 26 (1983) 186211. [hi681 M. Iri: A min-max theorem for the ranks and term-ranks of a class of matrices - an algebraic approach to the problem of the topological degrees of freedom of a network (in Japanese). Transactions of the Institute of Electrical and Communication Engineers of Japan 51A (1968) 180-187. [hi691 M. Iri: The maximum-rank minirnum-term-rank theorem for the pivotal transformations of a matrix. Linear Algebra and Its Applications 2 (1969) 427-446. [hi711 M. Iri: Combinatorial canonical form of a matrix with applications to the principal partition of a graph (in Japanese). Transactions of the Institute of Electronics and Communication Engineers of Japan 54A (1971) 30-37. [hi781 M. hi: A practical algorithm for the Menger-type generalization of the independent assignment problem. Mathematical Programming Study 8 (1978) 88-105. [hi791 M. Iri: A review of recent work in Japan on principal partitions of matroids and their applications. Annals of the New York Academy of Sciences 319 (1979) 306-319. [Iri83] M. h i : Applications of matroid theory. Mathematical Programming The State of the Art (A. Bachem, M. Grotschel and B. Korte, eds., Springer, Berlin, 1983), pp. 158-201. [Iris41 M. Iri: Structural theory for the combinatorial systems characterized by submodular functions. In: Progress in Combinatorial Optimization (W. R. Pulleyblank, ed., Academic Press, Toronto, 1984), pp. 197-219. [Iri+Fuji81] M. Iri and S.Fujishige: Use of matroid theory in operations research, circuits and system theory. International Journal of Systems Science 1 2 (1981) 27-54. [Iri+Fuji+Oyama86] M. Iri, S. Fujishige and T . Oyama: Graphs, Networks and Matroids (in Japanese) (Sangyo-Tosho, Tokyo, 1986). 25 7
[Iri+Tomi75] M. Iri and N . Tomizawa: A unifying approach to fundamental problems in network theory by means of matroids (in Japanese). Transactions of the Institute of Electronics and Communication Engineers of Japan 58A (1975) 33-40. [Iri+Tomi76] M. Iri and N. Tomizawa: An algorithm for finding an optimal “independent assignment”. Journal of the Operations Research Society of Japan 19 (1976) 32-57.
[Kabadi+Chandrasekaran90] S. N. Kabadi and R. Chandrasekaran: On totally dual integral systems. Discrete Applied Mathematics 26 (1990) 87-104. [Katoh85] N. Katoh: Private communication, 1985. [Katoh+Ibaraki+Mine85] N. Katoh, T. Ibaraki and H. Mine: An algorithm for the equipollent resource allocation problem. Mathematics of Operations Research 10 (1985) 44-53. [Khachiyan’lg] L. G. Khachiyan: A polynomial algorithm in linear programming. Soviet Mathematics Doklady 20 (1979) 191-194. [Khachiyan80] L. G. Khachiyan: Polynomial algorithms in linear programming. USSR Computational Mathematics and Mathematical Physics 20 (1980) 53-72. [Kishi+Kajitani68] G. Kishi and Y. Kajitani: Maximally distant trees in a linear graphs (in Japanese). Transactions of the Institute of Electronics and Communication Engineers of Japan 51A (1968) 196-203. [Kung78] J. P. S. Kung: Bimatroids and invariants. Advances in Mathematics 30 (1978) 238-249. [Lawler75] E. L. Lawler: Matroid intersection algorithms. Mathematical Programming 9 (1975) 31-56. [Lawler76]E. L. Lawler: Combinatorial Optimization -Networks and Matroids (Holt, Rinehart and Winston, New York, 1976). [Lawler+Martel82a] E. L. Lawler and C . U. Martel: Computing maximal polymatroidal network flows. Mathematics of Operations Research 7 (1982) 334347. [Lawler+Martel82b] E. L. Lawler and C. U. Martel: Flow network formulations of polymatroid optimization problems. Annals of Discrete Mathematics 16 (1982) 189-200. [Lenstra+Lenstra+Lovbsz82] A. K. Lenstra, H. W. Lenstra, Jr., and L. Lovbsz: Factoring polynomials with rational coefficients. Math. Ann. 261 (1982) 515534. [Lovbsz76]L. Lovbsz: On two minimax theorems in graph theory. Journal of Combinatorial Theory, Ser. B 21 (1976) 96-103. [ L o v ~ s z L. ~ ~Lovbsz: ] Flats in matroids and geometric graphs. In: Combinatorial Surveys (P. Cameron, ed., Academic Press, London, 1977), pp. 45-86. 258
[Lovbsz83] L. Lovbz: Submodular functions and convexity. In: Mathematical Programming - The State of the Art (A. Bachem, M. Griitschel and B. Korte, eds., Springer, Berlin, 1983), pp. 235-257. [Lucchesi+Younger78] C. L. Lucchesi and D. H. Younger: A minimax relation for directed graphs. Journal of the London Mathematical Society 17 (1978) 369-374. [McDiarmid75] C. J. H. McDiarmid: Rado’s theorem for polymatroids. Mathematical Proceedings of the Cambridge Philosophical Society 78 (1975) 263-281. [Megiddo74] N. Megiddo: Optimal flows in networks with multiple sources and sinks. Mathematical Programming 7 (1974) 97-107. [Murota871K. Murota: Systems Analysis by Graphs and Matroids - Structural Solvability and Controllability (Algorithms and Combinatorics 3) (Springer, Berlin, 1987). [Murota881 K. Murota: Note on the universal bases of a pair of polymatroids. Journal of the Operations Research Society of Japan 31 (1988) 565-572. [Murota901K. Murota: Principal structure of layered mixed matrices. Discrete Applied Mathematics 27 (1990) 221-234. [Naitoh+Fuji89] T. Naitoh and S. Fujishige: A note on the Frank-Tardos bitruncation algorithm for crossing-submodular functions. Working Paper, 1989 (to appear in Mathematical Programming). [Nakamura81] M. Nakamura: Mathematical Analysis of Discrete Systems and Its Applications (in Japanese). Dissertation, Department of Mathematical Engineering and Instrumentation Physics, Faculty of Engineering, University of Tokyo, 1981. [Nakamura88a] M. Nakamura: Structural theorems for submodular functions, polymatroids and polymatroid intersections. Graphs and Combinatorics 4 (1988) 257-284. [Nakamura88b] M. Nakamura: A characterization of greedy sets - universal polymatroids (I). Scientific Papers of the College of Arts and Sciences, The University of Tokyo, 38-2 (1988) 155-167. [Nakamura+Iri81] M. Nakamura and M. Iri: A structural theory for submodular functions, polymatroids and polymatroid intersections. Research Memorandum RMI 81-06, Department of Mathematical Engineering and Instrumentation Physics, Faculty of Engineering, University of Tokyo, August 1981. (Also see “akamura88aI and [Iri84]) [NamikawafIbarakiSO] K. Namikawa and T. Ibaraki: The fair resource allocation problem with a submodular constraint. Technical Report #90002, Department of Applied Mathematics and Physics, Kyoto University, April 1990. 259
[Narayanan74]H. Narayanan: Theory of Matroids and Network Analysis, Ph.D Thesis, Department of Electrical Engineering, Indian Institute of Technology, Bombay, February 1974. [Narayanan+Vartak81] H. Narayanan and M. N. Vartak: An elementary approach t o the principal partition of a matroid. Transactions of the Institute of Electronics and Communication Engineers of Japan E64 (1981) 227-234. “ash-WilliamsGl] C. St. J. A. Nash- Williams: Edge-disjoint spanning trees of finite graphs. Journal of the London Mathematical Society 36 (1961) 445-450. [Ohtsuki+Ishizaki+Watanabe68] T. Ohtsuki, Y. Ishizaki and H. Watanabe: Network analysis and topological degrees of freedom (in Japanese). Transactions of the Institute of Electrical and Communication Engineers of Japan 51A (1968) 238-245. [Ore561 0. Ore: Studies on directed graphs, I. Annals of Mathematics 63 (1956) 383-405. [Ozawa74] T. Ozawa: Common trees and partition of two-graphs (in Japanese). Transactions of the Institute of Electronics and Communication Engineers of Japan 57A (1974) 383-390. [Picard761 J. C. Picard: Maximal closure of a graph and applications to combinatorial problems. Management Science 22 (1976) 1268-1272. (PicardfQueyranne821 J. C. Picard and M. Queyranne: Selected applications on minimum cuts in networks. INFOR 20 (1982) 394-422. [&is81 L. Qi: Directed submodularity, ditroids and directed submodular flows. Mathematical Programming 42 (1988) 579-599. [&it391 L. Qi: Bisubmodular functions. CORE Discussion Paper No. 8901, CORE, Universitd Catholique de Louvain, 1989. [Rado42] R. Rado: A theorem on independence relations. Quarterly Journal of Mathematics, Oxford 13 (1942) 83-89. [Recski] A. Recski: Matroid Theory and Applications (Springer) (to appear). [Rockafellar70] R. T. Rockafellar: Convex Analysis (Princeton University Press, Princeton, N.J., 1970). [Rock80] H. Rock: Scaling techniques for minimal cost network flows. In: Discrete Structures and Algorithms (U. Pape, ed., Hanser, Munchen, 1980), pp. 181-191. [Schonsleben80] P. Schonsleben: Ganzzahlige Polymatroid - Intresektions - Algorithmen, Dissertation, Eidgenossische Technische Hochschule Ziirich, 1980. [Schrijver78] A. Schrijver: Matroids and Linking Systems. Mathematical Centre Tracts, Vol. 88 (Amsterdam, 1978). [Schrijver79] A. Schrijver: Matroids and linking systems. Journal of Cornbinatorial Theory, Ser. B 26 (1979) 349-369. 260
[Schrijver84a] A. Schrijver: Total dual integrality from directed graphs, crossing families, and sub- and supermodular functions. In: Progress in Combinatorial Optimization (W. R. Pulleyblank, ed., Academic Press, Toronto, 1984), pp. 315-361. [Schrijver84b]A. Schrijver: Proving total dual integrality with cross-free families - a general framework. Mathematical Programming 2 9 (1984) 15-27. [Schrijver86] A. Schrijver: Theory of Linear and Integer Programming (John Wiley & Sons, New York, 1986). [Seymour80] P. D. Seymour: Decomposition of regular matroids. Journal of Combinatorial Theory B28 (1980) 305-359. [Seymour81] P. D. Seymour: Recognizing graphic matroids. Combinatorica 1 (1981) 75-78. [Shapley71] L. S. Shapley: Cores of convex games. International Journal of Game Theory 1 (1971) 11-26. [Shubik82] M. Shubik: Game Theory in the Social Sciences - Concepts and Solutions (The MIT Press, Cambridge, Massachusetts, 1982). [Stoer+Witzgall70] J. Stoer and C. Witzgall: Convexity and Optimization in Finite Dimensions I (Springer, Berlin, 1970). [Sugihara82] K. Sugihara: Mathematical structures of line drawings of polyhedra: toward man-machine communication by means of line drawings. IEEE Transactions on Pattern Analysis and Machine Intelligence PAMI-4 (1982) 45 8-469. [Sugihara86] K. Sugihara: Machine Interpretation of Line Drawings (The MIT Press, Cambridge, Massachusetts, 1986). [Sugihara+Iri80] K. Sugihara and M. Iri: A mathematical approach to the determination of the structure of concepts. Matrix and Tensor Quarterly 30 (1980) 62-75. [Tamir80] A. Tamir: Efficient algorithms for a selection problem with nested constraints and its application to a production-sales planning model. SIAM Journal on Control and Optimization 18 (1980) 282-287. [Tardos85] E. Tardos: A strongly polynomial minimum cost circulation algorithm. Combinatorica 5 (1985) 247-256. [Tardos+Tovey+Trick86] E. Tardos, C. A. Tovey and M. A. Trick: Layered augmenting path algorithms. Mathematics of Operations Research 11 (1986) 362-370. [Tomi711 N. Tomizawa: On some techniques useful for solution of transportation network problems. Networks 1 (1971) 173-194. [Tomi751 N. Tomizawa: Irreducible matroids and classes of r-complete bases (in Japanese). Transactions of the Institute of Electronics and Communication Engineers of Japan 58A (1975) 793-794. 261
[Tomi761N. Tomizawa: Strongly irreducible matroids and principal partition of a matroid into strongly irreducible minors (in Japanese). Transactions of the Institute of Electronics and Communication Engineers of Japan 59A (1976) 83-91. [Tomi771N. Tomizawa: On a self-dual base axiom for matroids (in Japanese). Papers of the Technical Group on Circuits and System Theory, Institute of Electronics and Communication Engineers of Japan, CST77-110 (1977). [TomiSOa]N. Tomizawa: Theory of hyperspace (I) - supermodular functions and generalization of concept of ‘bases’ (in Japanese). Papers of the Technical Group on Circuits and Systems, Institute of Electronics and Communication Engineers of Japan, CAS80-72 (1980). [Tomi80b]N. Tomizawa: Theory of hyperspace (11) - geometry of intervals and supermodular functions of higher order. ibid., CAS80-73 (1980). [Tomi80c]N. Tomizawa: Theory of hyperspace (111) - Maximum deficiency = minimum residue theorem and its application (in Japanese). ibid., CAS8074( 1980). [Tomi80d] N. Tomizawa: Theory of hyperspace (IV) - principal partitions of hypermatroids (in Japanese). ibid., CAS80-85(1980). [TomiSla] N. Tornizawa: Theory of hyperspace (IX) - on hybrid independence system (in Japanese). ibid., CAS81-63(1981). [Tomi8lb] N. Tomizawa: Theory of hyperspaces (X) - on some properties of polymatroids (in Japanese). ibid., CAS81-84(1981). [Tomi8lc]N. Tomizawa: Theory of hypermatroids. In: Applied Combinatorial Theory and Algorithms, RIMS Kokyuroku 427, Research Institute for Mathematical Sciences, Kyoto University (1981), pp. 43-50. [Tomi82a]N. Tomizawa: Quasimatroids and orientations of graphs. Presented at the XI. International Symposium on Mathematical Programming (Bonn, 1982). [Tomi82b] N. Tomizawa: Theory of hedron space. Presented at the Szeged Conference on Matroids (Szeged, 1982). [Tomi831 N. Tomizawa: Theory of hyperspace (XVI) - on the structures of hedrons (in Japanese). Papers of the Technical Group on Circuits and Systems, Institute of Electronics and Communications Engineers of Japan, CAS82-172 (1983). (Tomi+Fuji81] N. Tomizawa and S. Fujishige: Theory of hyperspace (VIII) on the structures of hypermatroids of network type (in Japanese). ibid, CAS8162(1981). [Tomi+Fuji82] N. Tomizawa and S. Fujishige: Historical survey of extensions of the concept of principal partition and their unifying generalization to hypermatroids. Systems Science Research Report No. 5 , Department of Systems 262
Science, Tokyo Institute of Technology, April 1982 (also its abridgment appeared in Proceedings of the 1982 International Symposium on Circuits and Systems (Rome, May 10-12, 1982), pp. 142-145). [Tomi+Iri74] N. Tomizawa and M. Iri: An algorithm for determining the rank of a triple matrix product A X B with application to the problem of discerning the existence of the unique solution in a network (in Japanese). Transactions of the Institute of Electronics and Communication Engineers of Japan 57A (1974) 834-841. [Topkis78] D. M. Topkis: Minimizing a submodular function on a lattice. Operations Research 26 (1978) 305-321. [Topkist341D. M. Topkis: Adjacency on polymatroids. Mathematical Programming 30 (1984) 229-237. [Truemper] K. Truemper: Matroid Decomposition (to appear); also see: A decomposition theory for matroids I - general results, Journal of Combinatorial Theory, Ser. B 39 (1985) 43-76, and the series of subsequent papers appeared in the same journal. [TutteGl] W. T. Tutte: On the problem of decomposing a graph into n connected factors. Journal of the London Mathematical Society 36 (1961) 221-230. [Tutte65] W. T. Tutte: Lectures on matroids, Journal of Research of the National Bureau of Standards 69B (1965) 1-48. [Tutte66] W. T. Tutte: Connectivity in Graphs (University of Toronto Press, London, 1966). [Tutte71] W. T. Tutte: Introduction to the Theory of Matroids (American Elsevier, New York, 1971). [von Hohenbalken751 B. von Hohenbalken: A finite algorithm to maximize certain pseudoconcave functions on polytopes. Mathematical Programming 8 (1975) 189-206. [Waerden37] B. L. van der Waerden: Moderne Algebra (2nd ed.) (Springer, Berlin, 1937) [Welsh701D. J. A. Welsh: On matroid theorems of Edmonds and Rado. Journal of the London Mathematical Society 2 (1970) 251-256. [Welsh761D. J. A. Welsh: Matroid Theory (Academic Press, London, 1976). [Whit351H. Whitney: On the abstract properties of linear dependence. American Journal of Mathematics 57(1935) 509-533. [Wolfe76] P. Wolfe: Finding the nearest point in a polytope. Mathematical Programming 11 (1976) 128-149. [Zimmermann82] U. Zimmermann: Minimization on submodular flows. Discrete Applied Mathematics 4 (1982) 303-323. 263
[Zimmermann86a] U. Zimmermann: Linear and combinatorial sharing problems. Discrete Applied Mathematics 15 (1986) 85-105. [Zimmermann86b] U. Zimmermann: Sharing problems. Optimization 17(1986) 31-47.
264
Index A-additive group 5 A-module 5 Abelian group 5 abridgement 44, 60 additive group 5 adjacent to 7 affine set 12 aggregation 44 algebraic matroid 20 algorithms for neoflows 146 approximation 158 arc 6 arc set 6 auxiliary graph 113 auxiliary network 113, 136 base 17, 22, 28, 29, 38 base polyhedron 21, 27, 29 base polyhedron in an orthant 93 bimatroid 93 binary matroid 19 binary subdifferential 183 bipartite graph 9 bi-truncation 33 Boolean lattice 5 boundary 11, 25 bounded 12 bridge 7 capacity function 10 capacity of a cut 11, 149 chain 45 characteristic cone 13 characteristic curve 250 characteristic vector 3 child 9 circuit 1 7 circulation 11 closed sublattice 72 closure function 18
closure operation 72 cobase 19 cocircuit 19 codisjoint 2 cointersecting family 82 cointersecting-submodular 82 co-lexicographically optimal base 233 common base problem 123 commutative 5, 6 commutative group 5 comparable 96 compatible 44 complement 5 complemented 5 complete directed graph 6 complete family of bases 170 concave conjugate function 175 conjugate function 175 connected 8, 9, 66 connected component (graphs) 8 (hypergraphs) 10 consecutive t’s property 102 contraction of a graph 7 contraction by sets 36 contraction by vectors 40 convex 12 convex analysis 12 convex combination 12 convex cone 12 convex conjugate function 15, 175 convex function 1 4 convex game 26 convex polyhedron 12 convolution 183 copartition 2, 73 copartition augmented by 83 core 26, 44 cost-scaling algorithm 157 cover 3 265
cover of a bipartite graph 9 critical value 211 cross-free family 35 crossing family 31 crossing-submodular 31, 77 cut 11, 144, 149 cut function 35 cutset 7 cycle 8 A-matroid 91 decomposition algorithm 227 deficiency function 219 density 222 dependence function 22, 28 dependent set 17 D ilworth truncation 33 dimension 12 directed cut 145 directed-cut covering 145 directed cycle 8 directed graph 7 directed path 8 directed tree 9 directed tree toward the root 9 direct product 3, 4 direct sum 3 discrete separation theorem 121 disjoint 2 distributive lattice 4 ditroid 91 dominating vector 24 dual cone 14 dual dependence function 228 dual ideal 3 duality for linear programming 15 dual matroid 19 dual polymatroid 24 dual poset 3 dual problem 15 dual submodular system 30 dual supermodular system 29
Dulmage-Mendelsohn decomposition 218 edge 7, 9, 14 elementary path 8 elementary transformation of a base 59 elongation 43, 44 enlargement 44 entrance 10 entropy function 25 equality set 64 exchangeability graph 137 exchangeable pair 23, 61 exchange capacity 23, 28 exit 10 extension 6 extreme point 12, 14 extreme ray 14 extreme vector 14 face 13 face lattice 14, 64 facet 14 fair resource allocation problem 241, 242 family of independent sets 17 Farkas lemma 16 feasibility for submodular flows 133 feasible circulation 11 feasible flow 10, 25, 35, 109 Fenchel-type min-max theorem 175 field 6 free matroid 20 full-dimensional 1 2 fundamental circuit 18 generalized polymatroid 88 generate 12 graph 6 graphic matroid 19 266
reedy algorithm 44, 51, 53, 94, rishuhin’s model 105 :oup 5 dfline 12 dfspace 14 asse diagram 9 ?ad 6 3momorphic image 64 ibrid independence polyhedron 105 {pergraph 9 {perplane 14 eal 3, 46 cidence matrix 7 dependence polyhedron 21 cident to 7 dependent assignment problem 170 dependent flow problem 127 dependent matching 164 dependent set 17 dependent vector 22 duced by 7 fimum 4 itial end-vertex 6 itial vertex 7 kilter 155, 250 tersecting family 31 tersecting-submodular 31, 86 tersection problem 109 tersection theorem 116 verse element 5 reducible 170 ‘in 4 :me1 system 105 Iter number 250 agrangian function 203
laminar 35 lattice 4 lattice polyhedron 105 leaf 9 left vertex set 9 length of a chain 45 lexicographically greater 231 lexicographically optimal base 231, 233 lexicographically shortest path 118 lexicographically smaller 118 line 12 h e a r extension 51 linear matroid 19 linear order 3 linear polymatroid 25 linear programming problem 15 linear variety 12 linear weight function 233 linked vector 93 linking system 93 LovAsz extension 187 lower bound 3 lower consecutive t’s property 102 lower ideal 3, 46 matchable 20 matching 9 matching matroid 20 matric matroid 19 matroid 17 matroidal 21 matroid intersection problem 166 matroid intersection theorem 166 matroid optimization 164 matroid partitioning problem 169 matroid union 167 maximal chain 45 maximum element 3 maximum flow 11 267
maximum independent flow problem 146 maximum independent matching problem 165 maximum submodular flow problem 151 maximum-weight base 48 max-min problem 238, 240 meet 4 metroid 91 minimum-cost submodular flow 153 minimum cut 153 minimum element 3 minimum-ratio problem 220 minimum submodular flow problem 153 minimum-weight base 48 min-max problem 239, 241 minor 36, 4 1 modular function 29 molecule 170 monotone nondecreasing 51 negative cycle 137 negatively oriented 7 neoflow problem 109, 126 nested 35 network flow 10 neutral element 5 nonlinear weight function 231 normal to 202 nullity 8 optimal independent assignment 170 optimal independent flow 127 optimality conditions 202, 223 optimality for submodular flows 136 optimal Lagrange multiplier 203 optimal polymatroidal flow 128 optimal potential 136
optimal submodular flow 126 order (partial order) 3 order of a set function 105 ort hant 92 out of kilter 155, 250 parallel arc 6 parent 9 partially ordered set 3 partial order 3 partition 2 path 7 perturbation function 205 player 26 pointed 13 polybimatroid 93 polyhedral convex set 12 polyhedron 12 polylinking system 93 polymatroid 21 polymatroidal flow problem 128 polymatroid polyhedron 21 polypseudomatroid 92 poset 3 poset derived from a distributive lattice 45 positively oriented 7 potential 136 principal partition 170, 207, 211 principal structure 218 primal problem 15 proper face 13 pseudomatroid 92 rank 8, 27 rank function 18, 21, 27, 92 ray 12 recession cone 13 reduction by sets 36 reduction by vectors 38, 39 refinement 64 regular matroid 19
268
represent 19 representation 19 restriction 36 right vertex set 9 ring 5 root 9 saturation capacity 23, 28 saturation function 22, 28 selfloop 6 semi-group 5 separable convex optimization 223 set minor 36 Shannon’s switching game 214 sharing problem 240 shortest path 113 sibling 9 simple distributive lattice 47 simple graph 7 simple submodular system 47 simplification 47 skeleton 72 spanning tree 9 strength 221 strongly connected 8 strongly connected component 8 strongly k-connected 60 strongly k-connected reorientation 60 strong map 91 subbase 28 subdifferential 15, 179 subfield 6 subgradient 15, 179 subgraph 6 subhypergraph 10 submodular analysis 175 submodular constraint 242 submodular flow 126, 248 submodular flow problem 126, 248 submodular function 18 submodular polyhedron 27
submodular program 191 submodular system 27 submodular system of network type 106 sum of matroids 168 sum of submodular systems 42 superbase 29 supermodular function 29 supermodular polyhedron 29 supermodular system 29 support function 186 supporting hyperplane 14 supremum 4 tail 6 tangent cone 60 terminal end-vertex 6 terminal vertex 7 ternary semimodular polyhedron 103 topological degree of freedom 213 totally dual integral 16 totally ordered additive group 5 totally unimodular 102 total order 3 transformation of a matroid by bipartite graphs 165 translation 41 transversal matroid 20 tree 9 tree representation of a cross-free family 74 trivial matroid 20 truncated Lovksz extension 189 truncation 43 two-terminal network 10 unbounded 15 undirected graph 7 uniform matroid 20 union of matroids 168 unit element 5 269
unit-increase property 18 universal base 235 universal pair of bases 217, 238 universal polymatroid 91 upper bound 3 upper consecutive t ’ s property 102 upper ideal 3 value 10 vector lattice 4 vector minor 41 vector rank function 41 vertex 6, 9, 1 4 vertex set 6 weak duality 15 weighted max-min problem 238, 240 weighted min-max problem 239, 241
270